Twice a week, Data Points brings you the latest AI news, tools, models, and research in brief. In today’s edition, you’ll find:
- Codestral matches or beats mid-sized fill-in-the-middle models
- MiniMax’s “Lightning Attention” tries to improve on the transformer
- Co-STORM now available for AI collaboration on encyclopedia articles
- LlamaIndex uses agentic RAG to retrieve info from complex documents
But first:
DeepSeek-R1 matches top models and opens access
DeepSeek-R1 achieves performance comparable to OpenAI’s latest o1 model on reasoning tasks, including a 79.8 percent pass rate on AIME 2024 and 97.3 percent on MATH-500. The model, along with the reinforcement-learning-trained R1-Zero and smaller distilled versions, is now available under an MIT license, permitting the community to freely use the model weights and outputs. Through DeepSeek’s API, R1 costs $0.14 per million input tokens for cached inputs, $0.55 per million input tokens for standard inputs, and $2.19 per million output tokens. (GitHub)
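To put those prices in concrete terms, here is a minimal cost sketch using only the per-million-token figures quoted above ($0.14 cached input, $0.55 standard input, $2.19 output); the function and tier names are illustrative, not an official billing formula:

```python
# Per-million-token prices quoted in the story, in US dollars.
PRICE_PER_M = {"cached_input": 0.14, "input": 0.55, "output": 2.19}

def r1_cost(cached_input_tokens=0, input_tokens=0, output_tokens=0):
    """Estimate a request's cost in dollars from raw token counts."""
    return (
        cached_input_tokens / 1e6 * PRICE_PER_M["cached_input"]
        + input_tokens / 1e6 * PRICE_PER_M["input"]
        + output_tokens / 1e6 * PRICE_PER_M["output"]
    )

# Example: 200k standard input tokens plus 50k output tokens.
print(f"${r1_cost(input_tokens=200_000, output_tokens=50_000):.4f}")  # → $0.2195
```

As the example shows, output tokens dominate the bill at these rates: 50k output tokens cost nearly as much as 200k standard input tokens.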
Luma’s Ray2 model challenges top contenders in video generation
Luma Labs integrated its new Ray2 video model into the Dream Machine AI creativity platform, offering improved realism and motion compared to its predecessor. Ray2 utilizes 10 times more compute power than previous models and aims to provide better visual storytelling capabilities, though early users report some performance issues due to high demand. Early comparisons suggest Ray2 may outperform competitors like OpenAI’s Sora and Runway’s Gen-3 in motion accuracy and physics simulation, potentially setting a new benchmark for AI-generated video quality. (Luma Labs)
Mistral AI updates its best coding model
Mistral AI launched Codestral 25.01, an upgraded version of its coding model that generates and completes code twice as fast as its predecessor. The model outperforms other leading sub-100B parameter coding models on various benchmarks, particularly excelling in fill-in-the-middle tasks across multiple programming languages. Codestral 25.01 is now available through plugins for partner IDEs such as VS Code and JetBrains, with enterprise options for local deployment. It can also be accessed via the Codestral API or on cloud platforms like Google Cloud’s Vertex AI and Azure AI Foundry for $0.30 per million input tokens and $0.90 per million output tokens. (Mistral)
MiniMax builds open weight models with alternative attention mechanism
MiniMax released its 01 series of models, featuring a novel non-transformer “Lightning Attention” architecture that enables processing of up to 4 million tokens. The 01 series includes MiniMax-Text-01, a 456 billion parameter language model, and MiniMax-VL-01, a vision-language model, both of which are now available under an open weights license on GitHub. MiniMax is offering API access to these models at rates of $0.20 per million input tokens and $1.10 per million output tokens. (MiniMax)
AI system collaborates with humans to draft Wikipedia-style articles
Co-STORM, a research and summarization tool now available for a user study on the Stanford website, enables writers to work alongside language models in sourcing and drafting encyclopedia articles. The system, developed by Stanford researchers, employs a collaborative discourse protocol, featuring AI experts, a moderator, and human input to guide information gathering and knowledge curation. Co-STORM builds upon its predecessor STORM, which automates internet research and article outlining, by introducing a dynamic mind map to organize concepts and reduce cognitive load during in-depth discussions. While STORM and Co-STORM aren’t producing publication-ready content yet, experienced Wikipedia editors are interested in the system as a pre-writing aid. (Stanford)
LlamaIndex introduces new agentic RAG architecture for document processing
LlamaIndex developed Agentic Document Workflows (ADW), a new architecture that combines document processing, retrieval, and AI agents to automate complex knowledge work. ADW improves upon traditional Intelligent Document Processing and Retrieval-Augmented Generation by maintaining context across multi-step processes and coordinating between different system components. This advancement enables language models to handle sophisticated tasks like contract review, medical case summaries, and insurance claims processing while keeping humans in control of final decisions. (LlamaIndex)
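The key idea in the pattern above is that every step reads and writes a shared context, so later steps (and the human making the final call) see the full history rather than isolated retrieval results. The following is a minimal plain-Python sketch of that pattern; the step names, classes, and logic are hypothetical stand-ins, not LlamaIndex’s actual ADW API:

```python
from dataclasses import dataclass, field

@dataclass
class WorkflowContext:
    """Shared state threaded through every step of the workflow."""
    document: str
    history: list = field(default_factory=list)

    def record(self, step: str, result):
        self.history.append((step, result))
        return result

def extract_clauses(ctx):
    # Stand-in for document processing: pull "clauses" out of raw text.
    clauses = [line for line in ctx.document.splitlines() if line.strip()]
    return ctx.record("extract", clauses)

def retrieve_policy(ctx, clause):
    # Stand-in for retrieval (RAG): look up a relevant rule per clause.
    return ctx.record("retrieve", f"policy for: {clause}")

def draft_recommendation(ctx):
    # The agent reasons over the accumulated context, not a single chunk,
    # and defers the final decision to a human reviewer.
    findings = [r for step, r in ctx.history if step == "retrieve"]
    return ctx.record("draft", f"{len(findings)} findings; human review required")

ctx = WorkflowContext("Clause A: term limits\nClause B: liability cap")
for clause in extract_clauses(ctx):
    retrieve_policy(ctx, clause)
print(draft_recommendation(ctx))  # → 2 findings; human review required
```

The design point is the `WorkflowContext` object: because each step appends to it instead of returning in isolation, the drafting step can reference everything retrieved earlier, which is what distinguishes this pattern from a single-shot RAG query.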
Still want to know more about what matters in AI right now?
Read last week’s issue of The Batch for in-depth analysis of news and research.
Last week, Andrew Ng shared his thoughts on the growing demand for AI product management and how AI advancements are transforming roles within software development teams.
“The demand for good AI Product Managers will be huge. In addition to growing AI Product Management as a discipline, perhaps some engineers will also end up doing more product management work.”
Read Andrew’s full letter here.
Other top AI news and research stories we covered in depth: DeepSeek-V3 set new benchmark highs in LLM performance and cost efficiency; the U.S. announced expanded AI export restrictions, reshaping global tech markets; Nvidia unveiled Project Digits, a $3,000 home supercomputer for mid-sized AI models; and X-CLR introduced an innovative approach to contrastive learning, enhancing vision model performance.