All the models we’ve been waiting for: OpenAI’s scaled-up Project Orion arrives

Published Feb 28, 2025 · 4 min read
[Illustration: Team in modern office applauding while watching a news anchor on a big screen.]

Twice a week, Data Points brings you the latest AI news, tools, models, and research in brief. In today’s edition, you’ll find:

  • Mercury debuts diffusion language models
  • Alibaba’s top video model is now free to download
  • A new model from Tencent is built for speed
  • IBM’s Granite 3.2 models are built for business

But first:

Microsoft unveils new Phi-4 models for multimodal and text-based AI

Microsoft released two new models in its open weights Phi-4 family: Phi-4-multimodal, a 5.6 billion parameter model capable of processing speech, vision, and text simultaneously, and Phi-4-mini, a 3.8 billion parameter language model optimized for text-based tasks. Phi-4-multimodal outperforms larger models on various benchmarks, including speech recognition and visual reasoning, while Phi-4-mini excels in tasks like coding and math. These compact models enable developers to create efficient AI applications for edge devices, smartphones, and vehicles. (Microsoft)

GPT-4.5 advances unsupervised learning at a premium

OpenAI released a research preview of GPT-4.5, showcasing significant improvements in pattern recognition and knowledge breadth, along with reduced hallucinations, compared to previous models. The new model is faster and interacts more naturally with users than o3, and it excels relative to GPT-4o at tasks like writing assistance, programming, and creative problem-solving, as measured by benchmarks like GPQA and MMMLU. GPT-4.5 represents a major advancement in scaling unsupervised learning, but its high computational requirements make it substantially more expensive than previous models, with OpenAI charging API users $75 per million input tokens and $150 per million output tokens. GPT-4.5’s high cost makes its future uncertain, since it does not outperform reasoning models like o3, and lighter-weight models like GPT-4o can perform many of the same tasks at a fraction of the price. (OpenAI)
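For a rough sense of what those rates mean per request, here is a minimal Python sketch that converts the per-million-token prices quoted above into a dollar cost; the request size used is hypothetical and chosen purely for illustration.

```python
# Quoted GPT-4.5 API rates from the announcement above, in USD per million tokens.
INPUT_RATE = 75.0    # $75 per 1M input tokens
OUTPUT_RATE = 150.0  # $150 per 1M output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of a single API call at the quoted GPT-4.5 rates."""
    return (input_tokens / 1_000_000) * INPUT_RATE + (output_tokens / 1_000_000) * OUTPUT_RATE

# Hypothetical example: a 2,000-token prompt with a 500-token completion.
print(f"${request_cost(2_000, 500):.3f}")  # -> $0.225
```

At these rates, even a modest prompt-and-completion pair costs a few tens of cents, which is the kind of per-call expense that makes cheaper models attractive for routine tasks.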

Diffusion models promise faster text generation than transformers

Inception Labs unveiled Mercury, a family of diffusion large language models (dLLMs) that generate text up to 10 times faster than current LLMs. Mercury Coder, the first publicly available dLLM, matches or surpasses the performance of speed-optimized autoregressive models on coding benchmarks while running at over 1,000 tokens per second on NVIDIA H100 GPUs. API pricing for the new model hasn’t been made public, but the Coder model is available through a web-based playground. New diffusion architectures could enable more efficient language-model-based applications, including improved agents, reasoning capabilities, and edge deployments on resource-constrained devices. (Inception Labs)

Alibaba expands open AI offerings with video generation models

Alibaba Cloud released four open weights models from its Wan2.1 video model series, making them freely available for download on ModelScope and Hugging Face. The models can generate video from text and image inputs, with Wan2.1-14B leading the VBench leaderboard for video models. Wan2.1-14B is the only open video generation model in the VBench top five, making it a compelling option for students, artists, and researchers to experiment with video models. (Alizila)

Hunyuan Turbo S offers faster AI responses at a low price

Tencent released Hunyuan Turbo S, a new language model designed for near-instant responses. The model doubles word output speed and reduces first-word latency by 44 percent compared to traditional models, using a novel hybrid Mamba-Transformer architecture to lower training and inference costs. Hunyuan Turbo S performs comparably to leading models like DeepSeek V3, GPT-4o, and Claude 3.5 Sonnet on public benchmarks like MMLU, AIME 2024, and ArenaHard, with especially high scores on Chinese-specific benchmarks. Tencent priced Hunyuan Turbo S competitively at 0.8 yuan per million input tokens and 2 yuan per million output tokens, an intriguing alternative to competing models from DeepSeek, Baidu, and Alibaba. (Reuters and AIbase)

IBM offers new Granite models with reasoning abilities

IBM released several new models in its Granite series, including improved language models, a multimodal vision model, and embedding models. The Granite 3.2 8B and 2B Instruct models feature experimental chain-of-thought reasoning modes that allow them to handle complex instructions more effectively, while the new Granite Vision 3.2 2B model focuses on document understanding. These open weights models are now available on IBM watsonx.ai, Hugging Face, and other platforms, showing IBM’s efforts to compete with larger language models by offering specialized capabilities. (IBM)


Still want to know more about what matters in AI right now?

Read this week’s issue of The Batch for in-depth analysis of news and research.

This week, Andrew Ng discussed advancements in voice AI, challenges in controlling voice models, and techniques to reduce latency in voice interactions. He highlighted DeepLearning.AI’s work with RealAvatar and encouraged developers to prototype voice applications.

“I think generating a pre-response followed by a full response, to quickly acknowledge the user’s query and also reduce the perceived latency, will be an important technique, and I hope many teams will find this useful.”

Read Andrew’s full letter here.

Other top AI news and research stories we covered in depth: Researchers unveiled Brain2Qwerty, a nonsurgical system that decodes thoughts using brain waves, enabling mind-to-text communication; tech giants ramped up cloud spending to meet surging demand for AI infrastructure; a viral deepfake video sparked legal debate after using AI to depict celebrities without their consent; and Meta introduced Chain of Continuous Thought (Coconut), a new approach to reasoning that uses vectors rather than text to improve next-token prediction.


Subscribe to Data Points

Your accelerated guide to AI news and research