Large Language Models (LLMs)

82 Posts

User retrieves vendor contact information to fill out a request form, verifying each entry.

Claude Controls Computers: Anthropic empowers Claude 3.5 Sonnet to operate desktop apps, but cautions remain

API commands for Claude 3.5 Sonnet enable Anthropic’s large language model to operate desktop apps much like humans do. Be cautious, though: It’s a work in progress.
Comparison table of pre-trained models like Mistral, Llama, and Gemma, showcasing performance across evaluation metrics.

Mistral AI Sharpens the Edge: Mistral AI unveils Ministral 3B and 8B models, outperforming rivals in small-scale AI

Mistral AI launched two models that raise the bar for language models with 8 billion or fewer parameters, small enough to run on many edge devices.
OpenAI logo next to the Microsoft logo with their shadows visible.

AI Bromance Turns Turbulent: Microsoft and OpenAI partnership faces strain as both seek less dependence

Once hailed by OpenAI chief Sam Altman as the “best bromance in tech,” the partnership between Microsoft and OpenAI is facing challenges as both companies seek greater independence.
A GIF showcasing a dynamic spreadsheet interaction using AI, with cells being populated and analyzed automatically.

Enabling LLMs to Read Spreadsheets: A method to process large spreadsheets for accurate question answering

Large language models can process small spreadsheets, but very large spreadsheets often exceed their limits for input length. Researchers devised a method that processes large spreadsheets so LLMs can answer questions about them.
How Qwen2-Audio performs against the competitors.

Open Models for Math and Audio: Alibaba advances open-weight LLMs with Qwen2 Math and Audio variants

Alibaba followed up its open-weights Qwen2 large language models with specialized variations.
Conceptual illustration of The A I Scientist, an end-to-end LLM-driven scientific discovery process.

AI Agents for AI Research: Agentic workflow generates novel scientific research papers

While some observers argue that large language models can’t produce truly original output, new work prompted them to generate novel scientific research.

Machine Translation Goes Agentic: TransAgents, a system that boosts literary translation with a multi-agent workflow

Literary works are challenging to translate. Their relative length, cultural nuances, idiomatic expressions...

AI Leadership Makes for a Difficult Balance Sheet: OpenAI faces financial growing pains, spending double its revenue

OpenAI may be spending roughly twice as much money as it’s bringing in, a sign of the financial pressures of blazing the trail in commercial applications of AI.

Higher Performance, Lower Prices: AI model prices drop as competition heats up

Prices for access to large language models are falling as providers exploit new efficiencies and compete for new customers.

Google Gets Character.AI Co-Founders: Google acquires Character.AI talent and tech in strategic move

Character.AI followed an emerging pattern for ambitious AI startups, trading its leadership to a tech giant in exchange for funds and a strategic makeover. 

Art Attack: ArtPrompt, a technique that exploits ASCII art to bypass LLM safety measures

Seemingly an innocuous form of expression, ASCII art opens a new vector for jailbreak attacks on large language models (LLMs), inducing the models to generate outputs that their developers tuned them to avoid producing.

Synthetic Data Factory: AgentInstruct, a framework for generating diverse synthetic data for LLM fine-tuning

Researchers increasingly fine-tune models on synthetic data, but generated datasets may not be sufficiently diverse. New work used agentic workflows to produce diverse synthetic datasets.

Web Data Increasingly Off Limits: Online publishers crack down on AI training data access

Online publishers are moving to stop AI developers from training models on their content.

Search Gets Conversational: OpenAI launches SearchGPT to rival Google and Microsoft

OpenAI is testing an AI-powered search engine in a bid to compete head-to-head with both Google and Bing, the search engine of its close partner Microsoft.

The State of the Art Is Open: Meta’s Llama 3.1 outperforms GPT-4 in key areas

Meta raised the bar for large language models with open weights and published details about how it built one that outperforms GPT-4o and Claude 3.5 Sonnet by some measures.