AI giants’ U.S. policy proposals, Gemma 3 beats bigger open weight rivals

Published
Mar 14, 2025
Reading time
3 min read

Twice a week, Data Points brings you the latest AI news, tools, models, and research in brief. In today’s edition, you’ll find:

  • OpenAI’s new SDK and APIs for agentic workflows
  • Olympic Coder, two powerful open coding models
  • Alibaba applies RL to emotion detection
  • GPT-4.5 and Claude 3.7 Sonnet top a new agent leaderboard

But first:

U.S. AI companies weigh in on federal Action Plan

Anthropic, Google, and OpenAI published policy proposals in response to U.S. President Trump calling for a national “AI Action Plan.” Both Google and OpenAI argued that the U.S. should implement fair use and data-mining exceptions to copyright restrictions, allowing AI companies to train their models on copyrighted material. OpenAI and Anthropic’s proposals both argued that certain Chinese models like DeepSeek should be restricted, with OpenAI calling them government-funded and Anthropic warning of biosecurity concerns. Other matters addressed include U.S. export controls on chips and other AI hardware and the broad regulations of the EU’s AI Act. (Anthropic, Google, and OpenAI)

Google unveils Gemma 3 family of smaller, open weight models

Google released Gemma 3, a set of multimodal models based on its Gemini 2.0 technology. The models range from 1 billion to 27 billion parameters and are designed to run on various devices, from smartphones to workstations. Gemma 3 supports over 140 languages and includes features like visual reasoning and a 128,000-token context window. Gemma 3 27B outperforms larger models like DeepSeek-V3 and Llama 3.1 405B on Chatbot Arena while remaining small enough to run on a single GPU. (Google)

OpenAI introduces new tools for AI agent development

OpenAI released new APIs and tools to help developers build AI agents. The new Responses API combines features from OpenAI’s existing Chat Completions and Assistants APIs, allowing models to use built-in tools like web search and file search. (OpenAI plans to phase out the Assistants API by mid-2026.) The company also launched an open-source Agents SDK for orchestrating multi-agent workflows. These tools aim to help developers using OpenAI models build AI systems that can perform complex tasks independently. (OpenAI)
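As a rough illustration of the change, a single Responses API request can enable a built-in tool like web search directly, with no separate assistant or thread setup. The sketch below just builds the request payload; field names follow OpenAI’s published examples at launch, and the model name is a placeholder, not a recommendation.

```python
# Sketch of a Responses API request payload with the built-in web
# search tool enabled. Field names follow OpenAI's published launch
# examples; the model name is a placeholder.
def build_responses_request(prompt: str, model: str = "gpt-4o") -> dict:
    return {
        "model": model,
        "input": prompt,
        # Built-in tools are requested by type, replacing the
        # Assistants API's separate tool-configuration step.
        "tools": [{"type": "web_search_preview"}],
    }

request = build_responses_request("Summarize this week's AI model releases.")
# With the official SDK, this payload would be sent via
# client.responses.create(**request).
```

The payload-only framing is deliberate: it shows the API’s shape without requiring an API key or a network call.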

OlympicCoder models excel at competitive programming tasks

Researchers affiliated with Open-R1 developed OlympicCoder, a set of 7B and 32B parameter models fine-tuned on competitive programming data that outperform some closed-source frontier models on challenging coding tasks. The models were trained on CodeForces-CoTs, a new dataset of nearly 100,000 high-quality coding samples distilled from DeepSeek-R1, and evaluated on a new benchmark using problems from the International Olympiad in Informatics (IOI). OlympicCoder-32B demonstrated particularly strong performance, surpassing all open weight models tested and even the much larger Claude 3.7 Sonnet on IOI problems. The new coding models are an important step in replicating the performance of DeepSeek’s R1 reasoning model using fully open data sets. (Hugging Face)

Alibaba releases AI model that can read emotions

Alibaba’s Tongyi Lab unveiled R1-Omni, an open vision model capable of inferring emotional states from video and audio inputs. The model, a reinforcement learning-enhanced version of the earlier HumanOmni, achieves state-of-the-art performance on emotion recognition vision benchmarks. R1-Omni adds another nascent layer of understanding to vision models and is freely available on GitHub and Hugging Face. (GitHub)

Hugging Face’s smolagents evaluates top models for agents

Researchers launched a leaderboard to measure large language models’ effectiveness in powering AI agents, using smolagents’ CodeAgent on various benchmarks. GPT-4.5 topped the rankings, outperforming specialized reasoning models, with Claude 3.7 Sonnet placing second. The leaderboard shows that all models achieve significant performance gains from agentic setups compared to vanilla LLMs, providing valuable insights for AI developers working on agent-based systems. (Hugging Face)


Still want to know more about what matters in AI right now?

Read this week’s issue of The Batch for in-depth analysis of news and research.

This week, Andrew Ng defended the importance of learning to code, arguing that as AI-assisted coding makes programming easier, more people should code—not fewer. He pushed back against claims that programming will become obsolete, arguing that understanding the “language of software” empowers individuals to work effectively with AI tools and maximize their impact.

“I see tech-savvy people coordinating AI tools to move toward being 10x professionals — individuals who have 10 times the impact of the average person in their field. I am increasingly convinced that the best way for many people to accomplish this is not to be just consumers of AI applications, but to learn enough coding to use AI-assisted coding tools effectively.”

Read Andrew’s full letter here.

Other top AI news and research stories we covered in depth: QwQ-32B emerged as a strong contender against DeepSeek-R1 and other larger reasoning models, challenging the dominance of high-parameter architectures with compact reasoning; Microsoft’s Phi-4 Multimodal model offered simultaneous processing of text, images, and speech; a U.S. court ruling rejected the fair use defense in the Thomson Reuters AI lawsuit, citing Ross's attempt to use copyrighted material to build a competing product; and Perplexity launched an uncensored version of DeepSeek-R1, raising discussions about AI safety and adapting open language models.


Subscribe to Data Points


Your accelerated guide to AI news and research