Twice a week, Data Points brings you the latest AI news, tools, models, and research in brief. In today’s edition, you’ll find:
- ByteDance’s Doubao promises GPT-4o performance at cut-rate prices
- Perplexity debuts new API grounding in web search
- Hugging Face’s SmolVLM gets even smaller
- Benchmark-maker Epoch AI and OpenAI criticized for keeping funding deal under wraps
But first:
OpenAI unveils web-based AI agent for everyday online tasks
OpenAI released Operator, an AI agent that can perform simple web jobs like booking tickets or ordering groceries using a new model called Computer-Using Agent (CUA). The web app is currently available to ChatGPT Pro subscribers, with plans to expand access to paid and free users in the future. OpenAI claims Operator outperforms similar tools from Anthropic and Google DeepMind, and intends to make CUA available via API for developers to build their own agents. (OpenAI and Ars Technica)
White House shifts AI policy focus with Trump’s new executive order
U.S. President Trump signed an executive order on artificial intelligence that revokes past government policies he claims hinder American AI innovation. It calls for a review of actions taken under Biden’s 2023 AI executive order, which Trump rescinded earlier this week, and for the development of an AI action plan within 180 days. Trump’s order emphasizes developing AI systems “free from ideological bias” and aims to promote U.S. economic competitiveness and national security. (The White House and Associated Press)
ByteDance unveils powerful, low-cost AI model for Chinese market
TikTok owner ByteDance released a new version of its AI model Doubao, claiming performance comparable to leading models like GPT-4o and Claude 3.5 Sonnet. The company emphasized a “resource-efficient” training approach and introduced aggressive pricing, with the most powerful version of Doubao 1.5 costing just $1.24 per million tokens. This development signals ByteDance’s ambition to compete in the global AI race while potentially reshaping the AI market with its ultra-low pricing strategy. Warning: the signup process for users outside of China is cumbersome. (ByteDance and Reuters)
Perplexity launches Sonar Pro API for developers
Perplexity updated its Sonar API and introduced a new Sonar Pro API, allowing developers to integrate generative search features with real-time web research and citations into their applications. The new Sonar API offers lightweight, fast question-answering capabilities with customizable sources, while Sonar Pro provides advanced features for handling complex queries with an expanded context window of 200,000 tokens. Pricing for Sonar starts at $5 per 1,000 searches plus $1 per 750,000 words of input or output, while Sonar Pro costs $5 per 1,000 searches, $3 per 750,000 input words, and $15 per 750,000 output words. This product, which Perplexity says beats Google’s comparable API on benchmark tests, enables developers to incorporate sophisticated AI-powered search functionality into their products. (Perplexity)
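To make the pricing above concrete, here is a rough Python sketch of how a developer might estimate a monthly bill. It assumes the quoted figures are in U.S. dollars and that charges scale linearly with usage. The `Plan` class, the `estimate_cost` function, and the usage numbers are hypothetical and are not part of Perplexity's API.

```python
from dataclasses import dataclass

SEARCHES_PER_UNIT = 1_000   # search prices are quoted per 1,000 searches
WORDS_PER_UNIT = 750_000    # word prices are quoted per 750,000 words


@dataclass(frozen=True)
class Plan:
    """Hypothetical container for the rates quoted in the story (assumed USD)."""
    per_search_unit: float
    per_input_word_unit: float
    per_output_word_unit: float


# Rates as described in the text; Sonar charges the same for input and output.
SONAR = Plan(per_search_unit=5.0, per_input_word_unit=1.0, per_output_word_unit=1.0)
SONAR_PRO = Plan(per_search_unit=5.0, per_input_word_unit=3.0, per_output_word_unit=15.0)


def estimate_cost(plan: Plan, searches: int, input_words: int, output_words: int) -> float:
    """Estimate cost assuming simple linear scaling with usage (an assumption)."""
    return (
        searches / SEARCHES_PER_UNIT * plan.per_search_unit
        + input_words / WORDS_PER_UNIT * plan.per_input_word_unit
        + output_words / WORDS_PER_UNIT * plan.per_output_word_unit
    )


if __name__ == "__main__":
    # Illustrative usage: 2,000 searches, 750,000 words in, 750,000 words out.
    for name, plan in (("Sonar", SONAR), ("Sonar Pro", SONAR_PRO)):
        cost = estimate_cost(plan, 2_000, 750_000, 750_000)
        print(f"{name}: ${cost:.2f}")
```

At that illustrative usage level, the search charge is the same for both tiers, so the gap comes entirely from the per-word rates, especially Sonar Pro's output rate. Actual billing may differ (for example, in rounding or minimums), so treat this as a back-of-the-envelope estimate.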
Hugging Face unveils compact AI models for image and text analysis
Hugging Face released SmolVLM-256M and SmolVLM-500M, two small AI models capable of analyzing images, short videos, and text on devices with limited RAM. The models, trained on high-quality datasets, reportedly outperform larger models on various benchmarks and are available for unrestricted use under an Apache 2.0 license. These compact models are versatile and cost-effective for developers working with constrained devices or processing large amounts of data, but may have limitations compared to larger models when asked to perform complex reasoning tasks. (Hugging Face and TechCrunch)
OpenAI’s involvement in math test development raises questions about AI benchmarking
OpenAI’s early report on its o3 model included a high score on FrontierMath, a challenging AI math test developed by Epoch AI. It was later revealed that OpenAI had funded the test’s development. The revelation that OpenAI may have had prior access to the test problems and solutions raised concerns about the benchmark’s fairness and independence. This controversy highlights the complexities surrounding AI model evaluation and raises the question of whether evolving AI benchmarks can be truly unbiased. (TechCrunch and meemi’s Shortform)
Still want to know more about what matters in AI right now?
Read this week’s issue of The Batch for in-depth analysis of news and research.
This week, Andrew Ng shared insights from the World Economic Forum in Davos, Switzerland, where he discussed AI business implementations, governance, and climate solutions, including geoengineering. He highlighted the potential of Stratospheric Aerosol Injection (SAI) to combat global warming and introduced an AI-powered climate simulator at planetparasol.ai to explore these possibilities.
“The world urgently needs to reduce carbon emissions, but it hasn’t happened fast enough. Without geoengineering, there’s no longer any plausible path to keeping global warming to the 1.5 degrees Celsius goal set by the Paris agreement.”
Read Andrew’s full letter here.
Other top AI news and research stories we covered in depth: DeepSeek-R1 emerged as an affordable rival to OpenAI’s o1, sharpening its reasoning capabilities; Unitree and EngineAI showcased affordable humanoid robots, breaking price barriers; Texas introduced a landmark bill to regulate AI development and use, further opening the door for state-level AI governance; and researchers combined deep learning with an evolutionary algorithm to design chips in minutes, revealing mysterious but effective processes in generated hardware designs.