AI Agents

4 Posts

Free Agents: OpenHands launches as an open toolkit for advanced code generation and automation

An open-source package inspired by the commercial agentic code generator Devin aims to automate computer programming and more.

When Agents Train Algorithms: OpenAI’s MLE-bench tests AI coding agents

Coding agents are improving, but can they tackle machine learning tasks? 

Claude Controls Computers: Anthropic empowers Claude Sonnet 3.5 to operate desktop apps, but cautions remain

API commands for Claude Sonnet 3.5 enable Anthropic’s large language model to operate desktop apps much like humans do. Be cautious, though: It’s a work in progress.

Agentic Coding Strides Forward: Genie coding assistant outperforms competitors on SWE-bench by over 30 percent

An agentic coding assistant raised the state of the art on the SWE-bench benchmark by more than 30 percent.
