Twice a week, Data Points brings you the latest AI news, tools, models, and research in brief. In today’s edition, you’ll find:
- Google AI detects wildfires from satellite images
- Baidu shows off image RAG, no-code tools, and smart glasses
- ChatGPT apps link up with desktop coding tools
- New LoRA tools improve image models’ consistency and text generation
But first:
New scaling laws reveal optimal precision for language model training and inference
Researchers from Harvard, Stanford, Carnegie Mellon, and elsewhere developed new “precision-aware” scaling laws that predict how training and inference in lower precision affect language model performance. They found that post-training quantization degrades models more as they are trained on more data, eventually making additional pretraining data harmful. For pretraining, their scaling laws suggest that training larger models in lower precision (around 7-8 bits) may be compute-optimal, while very low precision (below 4 bits) requires disproportionately increasing model size to maintain performance. (arXiv)
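For readers unfamiliar with post-training quantization, the following is a minimal sketch of the general idea, not the researchers' method or their scaling laws. It assumes a simple symmetric round-to-nearest scheme applied to random stand-in "weights"; the function names and the choice of bit widths are illustrative only.

```python
import random


def quantize(weights, bits):
    """Symmetric round-to-nearest quantization (illustrative scheme, bits >= 2).

    Values are snapped to a signed integer grid with 2**(bits - 1) - 1
    positive levels, then mapped back to floats.
    """
    levels = 2 ** (bits - 1) - 1  # e.g. 127 for 8 bits, 7 for 4 bits
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0:
        return list(weights)  # nothing to scale
    scale = max_abs / levels
    return [round(w / scale) * scale for w in weights]


def mean_squared_error(a, b):
    """Average squared difference between two equal-length lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Stand-in "weights": random Gaussian values, not taken from any real model.
rng = random.Random(0)
weights = [rng.gauss(0.0, 1.0) for _ in range(1000)]

# Quantization error at a few bit widths (hypothetical choices).
errors = {
    bits: mean_squared_error(weights, quantize(weights, bits))
    for bits in (8, 4, 3)
}

for bits, err in errors.items():
    print(f"{bits}-bit mean squared error: {err:.6f}")
```

The sketch only shows that rounding error grows as the bit width shrinks. It says nothing about how that error interacts with the amount of training data, which is the question the scaling laws address.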
EU releases draft guidelines for regulating general-purpose AI
The European Union released an initial draft of its Code of Practice for providers of general-purpose and high-risk AI models, inviting feedback until November 28. The draft, prepared by independent experts, aims to guide the development of trustworthy and safe AI models, detailing transparency rules, copyright regulations, and risk assessment measures for advanced AI systems. While this draft is still provisional and short on specifics, the final version is expected to play a crucial role in shaping the future of AI development and deployment across the EU. (Europa.EU)
New satellite systems improve early wildfire detection
Google Research announced a partnership with the U.S. Forest Service to develop FireSat, a satellite constellation dedicated to detecting and tracking wildfires. The system will provide global high-resolution imagery updated every 20 minutes, enabling detection of fires as small as a classroom and using AI to analyze images for reliable fire identification. FireSat’s data will offer scientists new insights into how fire behaves and spreads, potentially improving wildfire prediction models and emergency response efforts. (Google)
Baidu unveils new AI technologies at annual conference
Baidu introduced iRAG, a technology to reduce hallucinations in image generation, and Miaoda, a no-code tool for creating AI applications, at its Baidu World 2024 conference in Shanghai. The company reported that daily API calls to its ERNIE foundation model reached 1.5 billion in early November, a 30-fold increase from the previous year. Baidu also announced Xiaodu AI Glasses, powered by ERNIE and equipped with various AI capabilities, set to launch in the first half of 2025. (Reuters)
ChatGPT desktop app adds code-reading feature for macOS
OpenAI’s ChatGPT desktop app for macOS can now read code from popular developer tools like VS Code and Xcode, eliminating the need for manual copying and pasting. The new “Work with Apps” feature, available to Plus and Team users, automatically sends code sections to ChatGPT for context alongside user prompts, but it cannot write code directly into these apps. OpenAI views this capability as a step toward building AI agents that can understand and interact with computer interfaces beyond prompts and responses. (TechCrunch)
Simple tweaks unlock powerful in-context image generation abilities
Recent research demonstrates that text-to-image diffusion transformers (DiTs) can perform in-context image generation with minimal tuning. The study proposes a simple pipeline called In-Context LoRA (IC-LoRA) that concatenates images, performs joint captioning, and applies task-specific fine-tuning on small datasets. This approach generates high-fidelity image sets across various tasks like film storyboards, portrait photography, and visual effects, while maintaining consistency in style and identity. The researchers also released models based on the FLUX text-to-image model and displayed some of the results. (GitHub)
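The data-preparation steps described above (concatenating images and writing one joint caption) can be sketched roughly as follows. This is an illustration under stated assumptions, not the researchers' code: images are represented as plain nested lists of pixel values, the function names are hypothetical, the `[IMAGE1]`-style markers are just one possible way to label panels, and the fine-tuning step itself is omitted.

```python
def concat_side_by_side(images):
    """Join same-height images left to right into one wide panel.

    Each image is a list of rows; each row is a list of pixel values.
    """
    heights = {len(img) for img in images}
    if len(heights) != 1:
        raise ValueError("all images must share the same height")
    height = heights.pop()

    panel = []
    for row_index in range(height):
        row = []
        for img in images:
            row.extend(img[row_index])
        panel.append(row)
    return panel


def joint_caption(summary, captions):
    """Merge a set-level summary and per-image captions into one prompt."""
    parts = [summary]
    for index, caption in enumerate(captions, start=1):
        parts.append(f"[IMAGE{index}] {caption}")
    return " ".join(parts)


# Two tiny stand-in "images" (2 rows each), not real data.
first = [[1, 2], [3, 4]]
second = [[5], [6]]

panel = concat_side_by_side([first, second])
caption = joint_caption(
    "A storyboard.",
    ["A door opens.", "A figure enters."],
)

print(panel)
print(caption)
```

Each resulting panel and caption pair would then serve as one training example for the task-specific fine-tuning on a small dataset.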
Still want to know more about what matters in AI right now?
Read last week’s issue of The Batch for in-depth analysis of news and research.
Last week, Andrew Ng shared his thoughts on optimizing large language models (LLMs) for agentic workflows, highlighting how advancements such as function calling and native computer use have transformed the way LLMs support complex, iterative applications.
“Following ChatGPT’s breakaway success at answering questions, a lot of LLM development focused on providing a good consumer experience. So LLMs were tuned to answer questions or follow human-provided instructions… But agentic workloads call on different behaviors. Rather than directly generating responses for consumers, AI software may use a model in part of an iterative workflow to reflect on its own output, use tools, write plans, and collaborate in a multi-agent setting. Major model makers are increasingly optimizing models to be used in AI agents as well.”
Read Andrew’s full letter here.
Other top AI news and research stories we covered in depth: OpenHands launches Free Agents, an open toolkit for advanced code generation and automation; Perplexity introduces Election Hub, an AI-powered experience providing voters with verified, real-time news and insights on U.S. politics; Meta and Anthropic explore opportunities for AI in U.S. defense and national security, pursuing major military contracts; and Hunyuan-Large surpasses other open competitors with impressive benchmark scores, showcasing the potential of Mixture of Experts models.