Finding the limits of pretraining and quantization New EU draft guidelines for regulating top models

Published

Nov 18, 2024

Reading time

3 min read

Twice a week, Data Points brings you the latest AI news, tools, models, and research in brief. In today’s edition, you’ll find:

Google AI detects wildfires from satellite images
Baidu shows off image RAG, no-code tools, and smart glasses
ChatGPT apps link up with desktop coding tools
New LoRA tools improve image models’ consistency and text generation

But first:

New scaling laws reveal optimal precision for language model training and inference

Researchers from Harvard, Stanford, Carnegie Mellon, and elsewhere developed new “precision-aware” scaling laws that predict how training and inference in lower precision affects language model performance. They found that post-training quantization degrades models more as they are trained on more data, eventually making additional pretraining data harmful. For pretraining, their scaling laws suggest that training larger models in lower precision (around 7-8 bits) may be compute-optimal, while very low precision (below 4 bits) requires disproportionately increasing model size to maintain performance. (arXiv)

EU releases draft guidelines for regulating general-purpose AI

The European Union released an initial draft of its Code of Practice for providers of general-purpose and high-risk AI models, inviting feedback until November 28. The draft, prepared by independent experts, aims to guide the development of trustworthy and safe AI models, detailing transparency rules, copyright regulations, and risk assessment measures for advanced AI systems. While this draft is still provisional and short on specifics, the final version is expected to play a crucial role in shaping the future of AI development and deployment across the EU. (Europa.EU)

New satellite systems improve early wildfire detection

Google Research announced a partnership with the U.S. Forest Service to develop FireSat, a satellite constellation dedicated to detecting and tracking wildfires. The system will provide global high-resolution imagery updated every 20 minutes, enabling detection of fires as small as a classroom and using AI to analyze images for reliable fire identification. FireSat’s data will offer scientists new insights into how fire behaves and spreads, potentially improving wildfire prediction models and emergency response efforts. (Google)

Baidu unveils new AI technologies at annual conference

Baidu introduced iRAG, a technology to reduce hallucinations in image generation, and Miaoda, a no-code tool for creating AI applications, at its Baidu World 2024 conference in Shanghai. The company reported that daily API calls to its ERNIE foundation model reached 1.5 billion in early November, a 30-fold increase from the previous year. Baidu also announced Xiaodu AI Glasses, powered by ERNIE and equipped with various AI capabilities, set to launch in the first half of 2025. (Reuters)

ChatGPT desktop app adds code-reading feature for MacOS

OpenAI’s ChatGPT desktop app for MacOS can now read code from popular developer tools like VS Code and Xcode, eliminating the need for manual copying and pasting. The new “Work with Apps” feature, available to Plus and Teams users, automatically sends code sections to ChatGPT for context alongside user prompts, but cannot write code directly into these apps. OpenAI views this capability as a step toward building AI agents that can understand and interact with computer interfaces beyond prompts and responses. (TechCrunch)

Simple tweaks unlock powerful in-context image generation abilities

Recent research demonstrates that text-to-image diffusion transformers (DiTs) can perform in-context image generation with minimal tuning. The study proposes a simple pipeline called In-Context LoRA (IC-LoRA) that concatenates images, performs joint captioning, and applies task-specific fine-tuning on small datasets. This approach generates high-fidelity image sets across various tasks like film storyboards, portrait photography, and visual effects, while maintaining consistency in style and identity. The researchers also released models based on the FLUX text-to-image model and displayed some of the results. (GitHub)

Still want to know more about what matters in AI right now?

Read last week’s issue of The Batch for in-depth analysis of news and research.

Last week, Andrew Ng shared his thoughts on optimizing large language models (LLMs) for agentic workflows, highlighting how advancements such as function calling and native computer use have transformed the way LLMs support complex, iterative applications.

“Following ChatGPT’s breakaway success at answering questions, a lot of LLM development focused on providing a good consumer experience. So LLMs were tuned to answer questions or follow human-provided instructions… But agentic workloads call on different behaviors. Rather than directly generating responses for consumers, AI software may use a model in part of an iterative workflow to reflect on its own output, use tools, write plans, and collaborate in a multi-agent setting. Major model makers are increasingly optimizing models to be used in AI agents as well.”

Read Andrew’s full letter here.

Other top AI news and research stories we covered in depth: OpenHands launches Free Agents, an open toolkit for advanced code generation and automation; Perplexity introduced Election Hub, an AI-powered experience providing voters with verified, real-time news and insights on U.S. politics; Meta and Anthropic explore opportunities for AI in U.S. defense and national security, pursuing major military contracts; and Hunyuan-Large surpasses other open competitors with impressive benchmark scores, showcasing the potential of Mixture of Experts models.

Subscribe to Data Points