Nvidia makes new mini versions of open models
Plus, Apple trains low-power “always-on” AI models

Published Aug 26, 2024 · 3 min read

Twice a week, Data Points brings you the latest AI news, tools, models, and research in brief. In today’s edition, you’ll find:

  • Microsoft’s Phi-3.5 language family
  • OpenAI strikes licensing deal with Condé Nast
  • TableBench measures tabular data performance
  • DeepMind workers protest Google contracts

But first:

NVIDIA’s “Minitron Method” creates new pruned-and-distilled versions of Mistral’s NeMo and Meta’s Llama 3.1

NVIDIA and Mistral AI introduced Mistral-NeMo-Minitron 8B, a new language model that outperforms similarly sized models on multiple benchmarks. The model was created by width-pruning the larger Mistral NeMo 12B model and then retraining it using knowledge distillation, a technique that transfers knowledge from a larger “teacher” model to a smaller “student” model. NVIDIA, which used the same techniques to create a 4-billion-parameter version of Llama 3.1, believes its Minitron method of pruning and distillation can be used to create even smaller, mobile-device-sized models while retaining the larger models’ power and accuracy. (NVIDIA)
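
For readers who want the core mechanism: the distillation step trains the pruned student to match the teacher’s output distribution. Here is a minimal PyTorch sketch of a generic logit-distillation loss, for illustration only; real recipes like Minitron typically combine this with other objectives, and the shapes and temperature below are made up:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between softened teacher and student distributions."""
    s = F.log_softmax(student_logits / temperature, dim=-1)
    t = F.softmax(teacher_logits / temperature, dim=-1)
    # Scale by T^2 so gradient magnitudes stay comparable across temperatures.
    return F.kl_div(s, t, reduction="batchmean") * temperature**2

# Toy usage: random logits stand in for the 12B teacher and the pruned 8B student.
teacher_logits = torch.randn(2, 16, 32000)                      # (batch, seq, vocab)
student_logits = torch.randn(2, 16, 32000, requires_grad=True)
distillation_loss(student_logits, teacher_logits).backward()
```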

A new approach to “always-on” machine learning models

Researchers at Apple developed a method to train small convolutional models by first expanding them into larger multi-branched architectures, then re-parameterizing them into a single branch for efficient inference. Their wake-word detector, RepCNN, achieved 43% higher accuracy than traditional single-branch models with the same runtime, and matched the accuracy of more complex models while using less memory and running faster. This approach could significantly enhance always-on machine learning models, which must operate within tight memory and compute budgets. (Apple)
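
Apple’s exact branch design isn’t detailed here, but the underlying re-parameterization trick is standard: train with parallel branches, then fold them into a single equivalent operator before deployment. A minimal sketch, folding a parallel 1×1 convolution into a 3×3 convolution:

```python
import torch
import torch.nn.functional as F

# Train-time model: two parallel branches (3x3 and 1x1 conv) summed together.
conv3 = torch.nn.Conv2d(8, 8, kernel_size=3, padding=1, bias=True)
conv1 = torch.nn.Conv2d(8, 8, kernel_size=1, bias=True)

def multibranch(x):
    return conv3(x) + conv1(x)

# Inference-time model: embed the 1x1 kernel in the center of a 3x3 kernel
# (zero-padding it), then add the kernels. One conv, identical outputs.
w = conv3.weight.data + F.pad(conv1.weight.data, [1, 1, 1, 1])
b = conv3.bias.data + conv1.bias.data
fused = torch.nn.Conv2d(8, 8, kernel_size=3, padding=1, bias=True)
fused.weight.data, fused.bias.data = w, b

x = torch.randn(1, 8, 16, 16)
assert torch.allclose(multibranch(x), fused(x), atol=1e-5)
```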

Microsoft expands AI offerings with family of small, safe Phi-3.5 models

Microsoft introduced three new models in its Phi-3 family: Phi-3.5-mini, Phi-3.5-vision, and Phi-3.5-MoE. Phi-3.5-mini offers enhanced multilingual support and a 128K context length, while Phi-3.5-vision improves multi-frame image understanding and reasoning. Phi-3.5-MoE, a mixture-of-experts model with 16 experts and 6.6B active parameters, achieves results similar to or better than much larger models on language understanding, math, and reasoning tasks. Phi-3.5-mini, despite its compact 3.8B parameters, matches or surpasses larger models on multilingual tasks and long-context benchmarks. All three Phi-3.5 models are designed for minimal size with an emphasis on safety. (Microsoft)
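
To experiment with the smallest model, something like the following Hugging Face transformers sketch should work; the model id is taken from Microsoft’s Hugging Face releases, and the generation settings are assumptions, so check the model card before relying on them:

```python
# Sketch only: model id and settings assumed, not verified here.
from transformers import pipeline

generate = pipeline(
    "text-generation",
    model="microsoft/Phi-3.5-mini-instruct",  # assumed Hugging Face model id
    device_map="auto",
    torch_dtype="auto",
    trust_remote_code=True,  # Phi releases have shipped custom model code
)

messages = [{"role": "user", "content": "Summarize mixture-of-experts in one sentence."}]
out = generate(messages, max_new_tokens=64)
print(out[0]["generated_text"][-1]["content"])  # last message is the reply
```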

Magazine and media giant strikes content licensing deal with OpenAI

Condé Nast agreed to a multi-year deal allowing OpenAI to use content from its publications, including Wired and The New Yorker, in ChatGPT and SearchGPT. While terms were undisclosed, the partnership aims to compensate publishers for their intellectual property while helping OpenAI improve its AI models with high-quality content. This deal highlights the growing trend of media companies collaborating with AI firms, as publishers seek new revenue streams and AI companies work to address copyright concerns. (Wired)

New benchmark reveals gaps in LLMs’ ability to handle real-world tabular data

Researchers developed TableBench, a comprehensive benchmark that tests large language models on tabular data across 18 fields within four major categories of table question answering. The team also created TableLLM, a model trained on their custom dataset that performs comparably to GPT-3.5 on the new benchmark. Experiments show that even advanced models like GPT-4 struggle with complex, real-world tabular data tasks, highlighting the need for further improvements to meet practical industrial demands. (arXiv)
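
To make the task concrete, here’s a hypothetical example of the kind of item such a benchmark contains; this is illustrative only, not TableBench’s actual data or prompt format:

```python
# A table is serialized to text, and the model must reason across columns.
table = {
    "columns": ["quarter", "revenue_musd", "units_sold"],
    "rows": [["Q1", 12.4, 3100], ["Q2", 15.1, 3900], ["Q3", 11.8, 2800]],
}

header = " | ".join(table["columns"])
body = "\n".join(" | ".join(str(v) for v in row) for row in table["rows"])
question = "Which quarter had the highest revenue per unit sold?"

prompt = f"{header}\n{body}\n\nQuestion: {question}\nAnswer:"
print(prompt)  # send to any LLM; the correct answer here is Q3 (~4,214 USD/unit)
```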

Google DeepMind workers protest military contracts

About 200 Google DeepMind employees signed a letter in May 2024 urging the company to end its contracts with military organizations. The letter expresses concern that DeepMind’s AI technology is being sold to militaries engaged in warfare, potentially violating Google’s commitments to not pursue AI applications likely to cause overall harm, contribute to weapons, or violate international law and human rights. This internal conflict highlights tensions between DeepMind’s commitment to ethical AI and Google Cloud’s business practices, including contracts with Israeli and U.S. military entities. (TIME)


Still want to know more about what matters in AI right now? 

Read last week’s issue of The Batch for in-depth analysis of news and research.

Last week, Andrew Ng discussed why the DEFIANCE Act and FTC ban on fake product reviews take the right approach to regulating AI: 

“I hope DEFIANCE passes in the House and gets signed into law. Both rules guard against harmful AI applications without stifling AI technology itself (unlike California’s poorly designed SB-1047), and they offer a good model for how the U.S. and other nations can protect citizens against other potentially harmful applications.”

Read Andrew’s full letter here.

Other top AI news and research stories we covered in depth: An agentic workflow that generates novel scientific research papers, all about Google’s Imagen 3 and Alibaba’s Qwen2-Math and Qwen2-Audio, and scaling laws for data quality.


Subscribe to Data Points

Your accelerated guide to AI news and research