Mistral AI Sharpens the Edge
Mistral AI unveils Ministral 3B and 8B models, outperforming rivals in small-scale AI

[Image: comparison table of pre-trained models including Mistral, Llama, and Gemma across evaluation benchmarks]

Mistral AI launched two models that raise the bar for language models with 8 billion or fewer parameters, small enough to run on many edge devices.

What’s new: Ministral 3B and Ministral 8B, which come in base and instruction-tuned versions, outperform Google’s and Meta’s similar-sized models on several measures of knowledge retrieval, common-sense reasoning, and multilingual understanding. Ministral 8B-Instruct is free to download and use for noncommercial purposes, and commercial licenses are negotiable for this model and the others in the family. Accessed via Mistral’s APIs, Ministral 3B costs $0.04 per million tokens of input and output, and Ministral 8B costs $0.10 per million tokens of input and output.

How it works: The Ministral models process up to 131,072 tokens of input context. They also support native function calling, which lets them interact with external APIs that, for example, fetch real-time weather data or control smart-home devices (see the sketch after the list below).

  • Ministral 3B is sized for smaller devices like smartphones. In Mistral’s tests, it surpassed Gemma 2 2B and Llama 3.2 3B on MMLU, AGIEval, and TriviaQA (question answering and common-sense reasoning), GSM8K (math), HumanEval (coding), and multilingual tasks in French, German, and Spanish. Independent tests by Artificial Analysis show Ministral 3B behind Llama 3.2 3B on MMLU and MATH.
  • In Mistral’s tests, the instruction-tuned Ministral 3B-Instruct outperformed Gemma 2 2B and Llama 3.2 3B across several benchmarks including GSM8K, HumanEval, and three arena-style competitions judged by GPT-4o. 
  • Ministral 8B targets more powerful devices like laptops and requires 24GB of GPU memory to run on a single GPU. In Mistral’s tests, it outperformed its predecessor Mistral 7B and Meta’s Llama 3.1 8B on most benchmarks reported except HumanEval one-shot, where it was slightly behind Llama 3.1 8B. Independent tests by Artificial Analysis show Ministral 8B behind Llama 3.1 8B and Gemma 2 9B on MMLU and MATH. 
  • In Mistral’s tests, Ministral 8B-Instruct outperformed its peers on all benchmarks reported except WildBench, on which Gemma 2 9B Instruct achieved a higher score. WildBench tests responses to real-world requests that include digressions, vague language, idiosyncratic requirements, and the like.
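
The example below sketches what native function calling with a Ministral model might look like in application code. It assumes Mistral's `mistralai` Python SDK and uses a hypothetical get_current_weather tool and the ministral-8b-latest model ID; these names and the response fields follow Mistral's published API conventions and are not drawn from this article.

```python
import json

from mistralai import Mistral

client = Mistral(api_key="YOUR_API_KEY")  # placeholder; supply a real key

# Declare an external tool the model may call (hypothetical weather lookup).
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",  # hypothetical helper, not a real service
            "description": "Fetch the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

response = client.chat.complete(
    model="ministral-8b-latest",  # assumed model ID; check Mistral's docs
    messages=[{"role": "user", "content": "What's the weather in Paris right now?"}],
    tools=tools,
)

# When the model opts to call the tool, it returns the function name and
# JSON-encoded arguments rather than a plain text reply.
call = response.choices[0].message.tool_calls[0]
print(call.function.name, json.loads(call.function.arguments))
```

In a real application, the code would execute the returned call and pass the tool's output back to the model so it can compose a final, natural-language answer.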

Behind the news: Headquartered in France, Mistral AI competes head-to-head with U.S. tech giants in AI. It released its first model, Mistral 7B, a year ago under an Apache 2.0 open source license. Since then, it has released model weights under a range of licenses while exploring alternative architectures such as mixture-of-experts and Mamba. It also offers closed models that are larger and/or built for specialized tasks like code generation and image processing.

Why it matters: Edge devices can play a crucial role in applications that require fast response, high privacy and security, and/or operation in the absence of internet connectivity. This matters especially for autonomous systems and smart-home devices, where uninterrupted, rapid processing is critical. In addition, smaller models like Ministral 8B-Instruct enable developers and hobbyists to run advanced AI on consumer-grade hardware, lowering costs and broadening access to the technology.

We’re thinking: Mistral’s new models underscore the growing relevance of edge computing to AI’s future. They could prove to be affordable and adaptable alternatives to Apple’s and Google’s built-in models on smartphones and laptops.
