AI models can generate code, but how well do they understand it? Falcon’s Mamba model outperforms transformers

Published Oct 14, 2024 · 3 min read
A futuristic AI system recreating pianists' hand motions for a musical score.

Twice a week, Data Points brings you the latest AI news, tools, models, and research in brief. In today’s edition, you’ll find:

  • The MI325X, AMD’s answer to NVIDIA’s H200
  • Research on new language model architectures from Microsoft
  • MathCoder2 makes open models better at mathematical reasoning
  • An AI system that recreates pianists’ hand movements

But first:

CodeMMLU benchmark shows gaps in AI models’ grasp of code

Researchers in Vietnam introduced CodeMMLU, a multiple-choice benchmark of over 10,000 questions that evaluates how well AI models understand code across multiple programming languages and software concepts. The results show that even advanced models, however fluently they generate code, struggle to comprehend complex code structures. GPT-4o posted the highest score on the new benchmark, followed by Claude 3 Sonnet and Llama 3 70B; however, the researchers did not test newer versions of these models. (arXiv)
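Benchmarks like this typically format each question as a prompt with lettered answer options and score whether the model picks the correct letter. Here's a minimal sketch of that kind of evaluation loop; the question schema and the `ask_model` stub are hypothetical stand-ins, not CodeMMLU's actual data format or harness:

```python
# Hypothetical multiple-choice code-comprehension evaluation loop.
questions = [
    {
        "prompt": "What does f([1, 2, 3]) return?\n"
                  "def f(xs): return sum(x * x for x in xs)",
        "choices": {"A": "6", "B": "14", "C": "36", "D": "9"},
        "answer": "B",
    },
]

def ask_model(prompt: str) -> str:
    """Stand-in for a real model call; should return one option letter."""
    return "B"  # replace with an actual API or local-model invocation

correct = 0
for q in questions:
    options = "\n".join(f"{k}. {v}" for k, v in sorted(q["choices"].items()))
    reply = ask_model(f"{q['prompt']}\n{options}\nAnswer with one letter.")
    correct += reply.strip().upper().startswith(q["answer"])

print(f"accuracy: {correct / len(questions):.2%}")
```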

Falcon’s new open-source model builds on Mamba

Researchers unveiled Falcon Mamba 7B, a new language model that surpasses several leading open-source AI models built on traditional transformer and hybrid architectures, including Mistral 7B, Llama 3.1 8B, and Falcon2 11B. The model uses the Mamba architecture, which processes long texts faster and with lower memory requirements than attention-based rivals. The result challenges the recent view that hybrid attention-plus-state-space designs are needed for top performance, demonstrating that pure Mamba-based models can compete with or outperform both transformer and hybrid architectures on language tasks. (arXiv)
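The efficiency advantage comes from Mamba's state-space formulation: at inference time, the model updates a fixed-size hidden state per token rather than attending over a key-value cache that grows with sequence length. Below is a toy sketch of that recurrence; real Mamba layers use learned, input-dependent ("selective") parameters and a hardware-efficient parallel scan, both omitted here:

```python
import numpy as np

# Toy linear state-space recurrence: h_t = A @ h_{t-1} + B @ x_t, y_t = C @ h_t.
# Memory stays O(d_state) no matter how long the sequence gets, unlike a
# transformer KV cache, which grows linearly with the number of tokens.
d_model, d_state = 8, 16
rng = np.random.default_rng(0)
A = rng.normal(scale=0.1, size=(d_state, d_state))
B = rng.normal(size=(d_state, d_model))
C = rng.normal(size=(d_model, d_state))

h = np.zeros(d_state)                         # fixed-size state carried across tokens
for x_t in rng.normal(size=(1000, d_model)):  # stream of 1,000 tokens
    h = A @ h + B @ x_t                       # constant-time, constant-memory update
    y_t = C @ h                               # per-token output

print(h.shape)  # (16,) -- same size after 1,000 tokens as after 1
```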

AMD releases powerful new AI chip to compete with NVIDIA

AMD announced its MI325X AI accelerator chip, claiming it outperforms NVIDIA’s H200 GPUs when used in data centers for AI applications. The chip, expected in 2025, contains 153 billion transistors and delivers up to 2.61 PFLOPs of peak eight-bit precision performance. AMD’s move aims to narrow the gap with NVIDIA in the AI processor market, though the company still trails significantly in market share; AMD projects AI chip sales of $4.5 billion for 2024, compared to NVIDIA’s $26.3 billion in a single quarter. (AMD and Ars Technica)

Transformer variation reduces noise, boosts efficiency

Microsoft researchers proposed Differential Transformers, a new architecture that improves the attention mechanism in language models by amplifying relevant context and canceling noise. Experiments show that differential transformers outperform standard transformers on language modeling, requiring only about 65 percent of the model size or training tokens to achieve comparable performance. The architecture also shows advantages in long-context modeling, key information retrieval, hallucination mitigation, and in-context learning, making it a promising foundation for large language models. (arXiv)
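The core mechanism, per the paper, computes two separate softmax attention maps and subtracts one from the other, so noise patterns common to both cancel while the relevant signal remains. A simplified single-head sketch of that idea follows; the paper's learnable λ reparameterization, multi-head structure, causal masking, and group normalization are omitted:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def differential_attention(x, Wq1, Wk1, Wq2, Wk2, Wv, lam=0.5):
    """Difference of two softmax attention maps, canceling shared noise.
    Simplified single head; the paper learns lam via a reparameterization."""
    d = Wq1.shape[1]
    a1 = softmax((x @ Wq1) @ (x @ Wk1).T / np.sqrt(d))  # first attention map
    a2 = softmax((x @ Wq2) @ (x @ Wk2).T / np.sqrt(d))  # second attention map
    return (a1 - lam * a2) @ (x @ Wv)                   # subtract, then mix values

rng = np.random.default_rng(0)
n, d_model, d_head = 6, 32, 16
x = rng.normal(size=(n, d_model))
W = [rng.normal(scale=0.1, size=(d_model, d_head)) for _ in range(5)]
out = differential_attention(x, *W)
print(out.shape)  # (6, 16)
```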

Pretraining on this dataset gives AI models a math and reasoning boost

Researchers at the Chinese University of Hong Kong created a novel method to enhance AI models’ mathematical skills. They built a high-quality pretraining dataset called MathCode-Pile, which combines math-related sources with generated code that captures mathematical reasoning. The team trained four popular AI models (Llama-3-8B, DeepSeekMath-7B, Mistral-7B, and Code-Llama-7B) on this 19.2 billion-token dataset. This significantly improved the models’ math abilities, resulting in the new MathCoder2 family of AI models. (GitHub)
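A hypothetical illustration of the dataset's central idea, pairing a natural-language reasoning step with code that executes and checks it (this is not MathCode-Pile's actual format, just the flavor of text-plus-code pretraining data):

```python
# Illustrative pairing of a reasoning step with code that verifies it,
# in the spirit of MathCode-Pile; not the dataset's actual schema.

# Reasoning: "The sum of the first n odd numbers is n^2.
#  For n = 10, that sum should be 100."
n = 10
total = sum(2 * k + 1 for k in range(n))  # 1 + 3 + ... + 19
assert total == n ** 2                    # 100 == 100
print(total)
```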

AI system recreates pianists’ hand motions for any musical score

Scientists captured 10 hours of 3D hand motion data from 15 elite pianists playing 153 classical pieces. Using this dataset, they developed an AI system combining imitation learning, reinforcement learning, and diffusion models to generate realistic hand movements for new musical scores. The ability to recreate fine motor movement has potential applications in character animation, robotics, biomechanics, and virtual reality. (GitHub)


Still want to know more about what matters in AI right now?

Read last week’s issue of The Batch for in-depth analysis of news and research.

Last week, Andrew Ng celebrated the 2024 Nobel Prizes in Physics and Chemistry being awarded to pioneers in AI, recognizing the significant contributions of Geoff Hinton, John Hopfield, Demis Hassabis, John Jumper, and David Baker. He expressed excitement about the growing recognition of AI’s impact on various fields and reflected on the importance of celebrating innovators within the AI community.

“Even as we cheer the new Nobel wins for AI, let’s continue to think about how we in AI can do more to celebrate the next generation of innovators.”

Read Andrew’s full letter here.

Other top AI news and research stories we covered in depth: Meta debuts Movie Gen for text-to-video generation; OpenAI unveils tools for speech, vision, and cost-efficiency for GPT-4o API at DevDay; a German court rules that LAION did not violate copyrights, marking a win for AI in legal disputes; and researchers expose a black market for AI-driven cybercrime services.


Subscribe to Data Points

Your accelerated guide to AI news and research