Deep research brings PhD analysis to ChatGPT, YuE’s music model released under Apache 2.0 open license

Published Feb 3, 2025 · 4 min read
[Image: A traditional music studio where software engineers, audio engineers, and musicians collaborate.]

Twice a week, Data Points brings you the latest AI news, tools, models, and research in brief. In today’s edition, you’ll find:

  • Qwen updates its many multimodal models
  • Nvidia’s Eagle vision-language models are small but sharp
  • Tülu open post-training recipe whips Llama 3.1 405B into shape
  • Microsoft Azure is of two minds regarding DeepSeek R1

But first:

OpenAI launches deep research capability in ChatGPT

OpenAI introduced a new deep research agent in ChatGPT that conducts comprehensive internet research on complex tasks. The feature, powered by an analysis-optimized version of OpenAI’s unreleased o3 model, can analyze hundreds of online sources and produce detailed reports in a fraction of the time a human would need. Currently the agent is available only to ChatGPT Pro subscribers, who are limited to 100 queries per month. If it proves reliable, this technology could significantly boost productivity in knowledge-intensive fields like finance, science, and engineering, transforming how businesses and researchers gather and analyze information. (OpenAI)

New AI model generates full five-minute songs from lyrics

Researchers introduced YuE (pronounced “yeah” in English), an open weights model that transforms lyrics into complete songs with vocals and accompaniment. YuE can generate up to five minutes of music in various genres and languages, using techniques such as a semantically enhanced audio tokenizer and a dual-token approach to vocal-instrumental modeling. The model’s release under the Apache 2.0 license aims to advance music generation and creative AI, similar to how Stable Diffusion and LLaMA impacted their respective fields. (GitHub)
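To make the dual-token idea concrete, here is a minimal sketch of interleaving frame-aligned vocal and accompaniment tokens into a single sequence, which is the general shape of dual-track token modeling; the token values and function name below are hypothetical, not YuE’s actual code.

```python
# Illustrative sketch: interleave frame-aligned vocal and accompaniment
# tokens into one sequence, the general idea behind dual-token modeling
# of vocals and instruments. Token values and names are hypothetical.

def interleave_dual_tracks(vocal_tokens, accomp_tokens):
    """Pair each audio frame's vocal token with its accompaniment token."""
    assert len(vocal_tokens) == len(accomp_tokens), "tracks must be frame-aligned"
    sequence = []
    for v, a in zip(vocal_tokens, accomp_tokens):
        sequence.extend([v, a])  # one vocal token, then one accompaniment token
    return sequence

# Example: three frames of made-up codec token IDs per track
print(interleave_dual_tracks([101, 102, 103], [201, 202, 203]))
# [101, 201, 102, 202, 103, 203]
```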

Alibaba challenges AI leaders with trio of advanced models

Alibaba updated its Qwen series of models with Qwen2.5-Max, Qwen2.5-VL, and the Qwen2.5-1M family. Qwen2.5-Max is a Mixture-of-Experts model pretrained on over 20 trillion tokens that outperforms DeepSeek-V3 on several benchmarks. Qwen2.5-VL is a vision-language model capable of understanding long videos, localizing objects in visual input, and generating structured outputs for various applications. Qwen2.5-1M extends the Qwen2.5 language models’ context windows to 1 million tokens, improving long-context capabilities through multi-stage fine-tuning and other training methods. The models are released under a variety of licenses, ranging from quite permissive to somewhat restricted. These updates continue to position Alibaba as a formidable competitor in the AI race, challenging industry leaders like DeepSeek, OpenAI, and Anthropic. (Qwen2.5-Max, Qwen2.5-VL, and Qwen2.5-1M)
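For a sense of what the 1M-token context looks like in use, below is a hedged sketch of loading one of the long-context checkpoints with Hugging Face Transformers. The model ID is an assumption based on the release naming, the input file is a placeholder, and serving anything near a full 1-million-token prompt requires far more memory than a typical single-GPU setup.

```python
# Sketch: querying a long-context Qwen2.5 checkpoint with Hugging Face
# Transformers. The model ID is assumed from the Qwen2.5-1M release;
# prompts approaching 1M tokens need substantial inference hardware.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-7B-Instruct-1M"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

long_document = open("report.txt").read()  # placeholder long input
messages = [
    {"role": "user", "content": f"{long_document}\n\nSummarize the key findings."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```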

Nvidia’s Eagle 2 9B vision-language model matches 70B rivals

Nvidia researchers developed Eagle 2, a series of vision-language models (VLMs) that can process and understand both images and text, available under an Apache 2.0 license. The nine billion parameter version of Eagle 2 achieves state-of-the-art results on several benchmarks, outperforming some much larger models and even matching or exceeding GPT-4V on certain tasks. Eagle 2 uses a “tiled mixture of vision encoders” approach, allowing it to process high-resolution images effectively and understand diverse visual content. In their paper, the researchers emphasize that their data strategy and training techniques were crucial in achieving these capabilities, potentially offering insights to help other AI developers create more powerful open-source VLMs. (GitHub and arXiv)
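The “tiled” half of that approach is easy to picture: a high-resolution image is cut into fixed-size crops so each one can be encoded at native resolution instead of being downscaled. The sketch below shows only that generic tiling step, with an assumed tile size; it is not Nvidia’s implementation.

```python
# Generic sketch of image tiling for a vision-language model: split a
# high-resolution image into fixed-size crops, padding the edges, so
# each tile keeps full detail. Tile size is an assumption, and this is
# not Nvidia's actual Eagle 2 code.
from PIL import Image

def tile_image(image: Image.Image, tile_size: int = 448):
    """Yield tile_size x tile_size crops covering the image."""
    width, height = image.size
    for top in range(0, height, tile_size):
        for left in range(0, width, tile_size):
            box = (left, top, min(left + tile_size, width), min(top + tile_size, height))
            tile = image.crop(box)
            padded = Image.new(image.mode, (tile_size, tile_size))  # pad edge tiles
            padded.paste(tile, (0, 0))
            yield padded

# Example: a 1344x896 image yields a 3x2 grid of 448-pixel tiles
print(sum(1 for _ in tile_image(Image.new("RGB", (1344, 896)))))  # 6
```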

Tülu 3 405B model sets new benchmark for open AI

Ai2 researchers released Tülu 3 405B, which they claim is the largest open weights model trained using fully open post-training recipes. The model outperforms comparable models, including GPT-4o and DeepSeek-V3, on various benchmarks and shows particular improvement in mathematical problem-solving at larger scales. This release demonstrates the scalability and effectiveness of the team’s novel Reinforcement Learning from Verifiable Rewards (RLVR) approach, which they applied to the 405 billion parameter Llama 3.1 base model. (Ai2)
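The “verifiable” part of RLVR is the key idea: the reward signal comes from programmatically checking the model’s answer rather than from a learned reward model. Below is a minimal sketch of such a reward function for math problems, with an assumed answer format; it illustrates the concept rather than Ai2’s implementation.

```python
# Minimal sketch of a verifiable reward, the core idea behind RLVR:
# score an output by checking its final answer against ground truth
# instead of querying a learned reward model. The answer format here
# is assumed; this is not Ai2's actual implementation.
import re

def verifiable_reward(model_output: str, ground_truth: str) -> float:
    """Return 1.0 if the stated final answer matches the reference, else 0.0."""
    match = re.search(r"answer is\s*(-?\d+(?:\.\d+)?)", model_output.lower())
    if match is None:
        return 0.0
    return 1.0 if match.group(1) == ground_truth else 0.0

# Example rewards, as would be fed to the RL optimizer
print(verifiable_reward("Working through it, the answer is 42.", "42"))  # 1.0
print(verifiable_reward("I believe the answer is 41.", "42"))            # 0.0
```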

Microsoft adds DeepSeek R1 to Azure amid AI model controversy

Microsoft announced it will host DeepSeek-R1 on its Azure cloud service. DeepSeek-R1 reportedly matches OpenAI’s o1 in performance at a fraction of the cost: DeepSeek lists R1’s API price at $2.19 per million output tokens, compared to $60 per million for o1. Azure’s decision comes despite recent accusations from OpenAI that DeepSeek violated its terms of service by extracting substantial training data through OpenAI’s API. Microsoft is OpenAI’s largest investor and, until recently, was its exclusive cloud provider; it even helped identify the unusual activity on OpenAI’s servers that suggested DeepSeek had done so. That makes its quick decision to host DeepSeek-R1 noteworthy. (Ars Technica and The Verge)
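To put those prices in perspective, here’s the back-of-the-envelope arithmetic on a hypothetical workload; the token volume below is invented for illustration, while the per-token prices are the ones cited above.

```python
# Cost comparison using the prices cited above ($ per million output
# tokens). The monthly token volume is a hypothetical example.
R1_PRICE = 2.19   # DeepSeek-R1 API
O1_PRICE = 60.00  # OpenAI o1 API

output_tokens = 50_000_000  # hypothetical monthly output volume
r1_cost = output_tokens / 1_000_000 * R1_PRICE
o1_cost = output_tokens / 1_000_000 * O1_PRICE

print(f"DeepSeek-R1: ${r1_cost:,.2f}")            # DeepSeek-R1: $109.50
print(f"OpenAI o1:   ${o1_cost:,.2f}")            # OpenAI o1:   $3,000.00
print(f"R1 is {o1_cost / r1_cost:.1f}x cheaper")  # R1 is 27.4x cheaper
```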


Still want to know more about what matters in AI right now?

Read last week’s issue of The Batch for in-depth analysis of news and research.

Last week, Andrew Ng reflected on DeepSeek’s impact, highlighting China’s rapid progress in generative AI, the growing influence of open models in the AI supply chain, and the importance of algorithmic innovation beyond just scaling up.

“Scaling up isn’t the only path to AI progress. Driven in part by the U.S. AI chip embargo, the DeepSeek team had to innovate on many optimizations to run on less-capable H800 GPUs rather than H100s, leading ultimately to a model trained for under $6M of compute.”

Read Andrew’s full letter here.

Other top AI news and research stories we covered in depth: how DeepSeek-R1 and Kimi k1.5 leveraged reinforcement learning to train reasoning models, pushing the boundaries of AI capabilities; OpenAI introduced Operator, an AI agent designed to automate online tasks; the White House made a bold policy shift, rolling back AI regulations and emphasizing the need for U.S. leadership in the global market; and Cohere researchers proposed active inheritance, a novel fine-tuning approach that lets model-makers automatically select better synthetic data.


Subscribe to Data Points

Your accelerated guide to AI news and research