Machine Learning Research

400 Posts

X-CLR loss: training models to link text captions and image similarity.

Calibrating Contrast: X-CLR, an approach to contrastive learning for better vision models

Contrastive loss functions make it possible to produce good embeddings without labeled data. A twist on this idea makes even more useful embeddings.
GB10 Superchip architecture with Blackwell GPU and Grace CPU.

AI Supercomputer on Your Desk: Nvidia introduced Project Digits, a $3,000 home supercomputer for mid-sized AI models

Nvidia’s new desktop computer is built specifically to run large AI models.
DeepSeek-V3 accuracy across benchmarks compared to other AI models.

DeepSeek Ups the Open Weights Ante: DeepSeek-V3 redefines LLM performance and cost efficiency

A new model from Hangzhou upstart DeepSeek delivers outstanding performance and may change the equation for training costs.
Diagram of Localize-and-Stitch merging fine-tuned models by combining critical weights into one model.

Better Performance From Merged Models: Localize-and-Stitch improves methods for merging and fine-tuning multiple models

Merging multiple fine-tuned models is a less expensive alternative to hosting multiple specialized models. But while model merging can deliver higher average performance across several tasks, it often results in lower performance on specific tasks. New work addresses this issue.
A narrow library aisle filled with shelves stacked with countless books.

Massively More Training Text: Harvard unveils a million-book corpus for AI training

Harvard University amassed a huge new text corpus for training machine learning models.
Claude 3 Opus performs the Self-Exfiltration task, balancing renewable goals and corporate priorities.

Models Can Use Tools in Deceptive Ways: Researchers expose AI models' deceptive behaviors

Large language models have been shown to be capable of lying when users unintentionally give them an incentive to do so. Further research shows that LLMs with access to tools can be incentivized to use them in deceptive ways.
Top use cases for Claude.ai, with percentages for tasks like app development and content creation.

What LLM Users Want: Anthropic reveals how users interact with Claude 3.5

Anthropic analyzed 1 million anonymized conversations between users and Claude 3.5 Sonnet. The study found that most people used the model for software development and also revealed malfunctions and jailbreaks.
MUSTAFA SULEYMAN

Mustafa Suleyman: Agents of action

In 2025, AI will have learned to see, it will be way smarter and more accurate, and it will start to do things on your behalf.
ALBERT GU

Albert Gu: More learning, less data

Building a foundation model takes tremendous amounts of data. In the coming year, I hope we’ll enable models to learn more from less data.
JOSEPH GONZALEZ

Joseph Gonzalez: General intelligence

In 2025, I expect progress in training foundation models to slow down as we hit scaling limits and inference costs continue to rise.
A hand holding a snow globe with skaters and a snowman.

Smaller Is Beautiful: Compact AI models redefine efficiency, bringing advanced capabilities to everyday devices

For years, the best AI models got bigger and bigger. But in 2024, some popular large language models were small enough to run on a smartphone.
Snowman using a camera during snowfall.

Generative Video Takes Off: Generative video models revolutionize content creation with stunning realism

Video generation exploded with an abundance of powerful models.
Sleigh rides sign with pricing adjustments and hot cocoa.

Prices Tumble: AI price wars drive costs down as competition heats up

Fierce competition among model makers and cloud providers drove down the price of access to state-of-the-art models.
Santas in line with gifts and a ‘Photos with Santa’ sign.

Agents Ascendant: LLMs evolve with agentic workflows, enabling autonomous reasoning and collaboration

The AI community laid the foundation for systems that can act by prompting large language models iteratively, leading to much higher performance across a range of applications.
Animation showcasing 7 key NLP topics visually expanding on the screen.

When LLMs Propose Research Ideas: Stanford study finds AI matches human experts at writing research proposals

How do agents based on large language models compare to human experts when it comes to proposing machine learning research? Pretty well, according to one study.