Large Language Models (LLMs)

89 Posts

Graph showing how training loss affects token prediction accuracy and hallucination elimination.

Getting the Facts Right: A memory method that reduces hallucinations in LLMs

Large language models that remember more hallucinate less.
o1 Family Benchmarks comparing pass rates across AIME, Codeforces, and GPQA.

Higher Reasoning: OpenAI debuts o1 and pro mode for $200/month

OpenAI launched not only its highly anticipated o1 model but also an operating mode that enables the model to deliver higher performance — at a hefty price.
Table comparing model performance on Mathvista, MMMU, ChartQA, DocVQA, and other tasks.

Mistral’s Vision-Language Contender: Mistral unveils Pixtral Large, a rival to top vision-language models

Mistral AI unveiled Pixtral Large, which rivals top models at processing combinations of text and images.
Flow diagram of an application using LLMs to process prompts and tools for responses.

Agents Open the Wallet: Stripe builds ecommerce agent toolkit for AI to securely spend money

One of the world’s biggest payment processors is enabling large language models to spend real money.
Illustration of a person holding a box with network nodes emerging from it.

AI Power Couple Recommits: Amazon deepens Anthropic partnership with $4 billion investment

Amazon and Anthropic expanded their partnership, potentially strengthening Amazon Web Services’ AI infrastructure and lengthening the high-flying startup’s runway.
Bar charts comparing performance of AI models across six tasks.

Reasoning Revealed: DeepSeek-R1, a transparent challenger to OpenAI o1

An up-and-coming Hangzhou AI lab unveiled a model that implements run-time reasoning similar to OpenAI o1 and delivers competitive performance. Unlike o1, it displays its reasoning steps.
Efficient Foundations animation showing layered AI model components.

More-Efficient Training for Transformers: Researchers reduce transformer training costs by 20% with minimal performance loss

Researchers cut the processing required to train transformers by around 20 percent with only a slight degradation in performance.
Graph showing test loss decreases with more tokens and larger model sizes (10^3 to 10^9 parameters).

Next-Gen Models Show Limited Gains: AI giants rethink model training strategy as scaling laws break down

Builders of large AI models have relied on the idea that bigger neural networks trained on more data and given more processing power would show steady improvements. Recent developments are challenging that idea.
User retrieves vendor contact information to fill out a request form, verifying each entry.

Claude Controls Computers: Anthropic empowers Claude Sonnet 3.5 to operate desktop apps, but cautions remain

API commands for Claude Sonnet 3.5 enable Anthropic’s large language model to operate desktop apps much like humans do. Be cautious, though: It’s a work in progress.
Comparison table of pre-trained models like Mistral, Llama, and Gemma, showcasing performance across evaluation metrics.

Mistral AI Sharpens the Edge: Mistral AI unveils Ministral 3B and 8B models, outperforming rivals in small-scale AI

Mistral AI launched two models that raise the bar for language models with 8 billion or fewer parameters, small enough to run on many edge devices.
OpenAI logo next to the Microsoft logo with their shadows visible.

AI Bromance Turns Turbulent: Microsoft and OpenAI partnership faces strain as both seek less dependence

Once hailed by OpenAI chief Sam Altman as the “best bromance in tech,” the partnership between Microsoft and OpenAI is facing challenges as both companies seek greater independence.
A GIF showcasing a dynamic spreadsheet interaction using AI, with cells being populated and analyzed automatically.

Enabling LLMs to Read Spreadsheets: A method to process large spreadsheets for accurate question answering

Large language models can process small spreadsheets, but very large spreadsheets often exceed their limits for input length. Researchers devised a method that processes large spreadsheets so LLMs can answer questions about them.
How Qwen2-Audio performs against the competitors.

Open Models for Math and Audio: Alibaba advances open-weight LLMs with Qwen2 Math and Audio variants

Alibaba followed up its open-weights Qwen2 large language models with specialized variations.
Conceptual illustration of The AI Scientist, an end-to-end LLM-driven scientific discovery process.

AI Agents for AI Research: Agentic workflow generates novel scientific research papers

While some observers argue that large language models can’t produce truly original output, new work prompted them to generate novel scientific research.

Machine Translation Goes Agentic: TransAgents, a system that boosts literary translation with a multi-agent workflow

Literary works are challenging to translate. Their relative length, cultural nuances, idiomatic expressions...