Agents Ascendant

LLMs evolve with agentic workflows, enabling autonomous reasoning and collaboration

Published Dec 25, 2024

The AI community laid the foundation for systems that can act by prompting large language models iteratively, leading to much higher performance across a range of applications.

What happened: AI gained a new buzzword — agentic — as researchers, tool vendors, and model builders equipped large language models (LLMs) to make choices and take actions to achieve goals. These developments set the stage for an upswell of agentic activity in the coming year and beyond.

Driving the story: Several tools emerged to help developers build agentic workflows.

Microsoft primed the pump for agentic development tools in late 2023 with AutoGen, an open source conversational framework that orchestrates collaboration among multiple agents. (Learn how to take advantage of it in our short course “AI Agentic Design Patterns with AutoGen.”) In late 2024, part of the AutoGen team split off to build AG2 based on a fork of the code base.
In October 2023, CrewAI released its open source Python framework for building and managing multi-agent systems. Agents can be assigned roles and goals, gain access to tools like web search, and collaborate with each other. (DeepLearning.AI’s short courses “Multi-Agent Systems with crewAI” and “Practical Multi AI-Agents and Advanced Use Cases with crewAI” can give you a fast start.)
In January, LangChain, a provider of development tools, introduced LangGraph, which orchestrates agent behaviors using cyclical graphs. The framework enables LLM-driven agents to receive inputs, reason over them, decide on actions, use tools, evaluate the results, and repeat these steps to improve results. (Our short course “AI Agents in LangGraph” offers an introduction.)
In September, Meta introduced Llama Stack for building agentic applications based on Llama models. Llama Stack provides memory, conversational skills, orchestration services, and ethical guardrails.
Throughout the year, integrated development environments implemented agentic workflows to generate code. For instance, Devin and OpenHands accept natural-language instructions to generate prototype programs. Replit Agent, Vercel’s V0, and Bolt streamline projects by automatically writing code, fixing bugs, and managing dependencies.
Meanwhile, a number of LLM makers supported agentic workflows by implementing tool use and function calling. Anthropic added computer use, enabling Claude 3.5 Sonnet to control users’ computers directly.
Late in the year, OpenAI rolled out its o1 models and the processing-intensive o1 pro mode, which use agentic loops to work through prompts step by step. DeepSeek-R1 and Google Gemini 2.0 Flash Thinking Mode followed with similar agentic reasoning. In the final days of 2024, OpenAI announced o3 and o3-mini, which further extend o1’s agentic reasoning capabilities with impressive reported results.
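
To make the loop described in the LangGraph item above more concrete (reason over an input, decide on an action, use a tool, evaluate the result, repeat), here is a minimal sketch in plain Python. It does not use LangGraph or any real framework's API. The names `fake_llm`, `calculator`, and `run_agent` are hypothetical, and the "model" is a deterministic stub standing in for an actual LLM call.

```python
from typing import Callable, Dict, List, Tuple

# One recorded step: (action name, argument, observation).
Step = Tuple[str, str, str]


def calculator(expression: str) -> str:
    """Hypothetical tool: adds two integers written as 'a+b'."""
    left, right = expression.split("+")
    return str(int(left) + int(right))


# Registry of tools the agent may call (an assumption for this sketch).
TOOLS: Dict[str, Callable[[str], str]] = {"calculator": calculator}


def fake_llm(goal: str, history: List[Step]) -> Tuple[str, str]:
    """Stub standing in for an LLM: returns (action, argument).

    With no history it asks for the calculator; once a tool result
    exists, it finishes with the latest observation.
    """
    if not history:
        return ("calculator", goal)
    return ("finish", history[-1][2])


def run_agent(goal: str, max_steps: int = 5) -> str:
    """Loop: decide on an action, run the tool, record the result, repeat."""
    history: List[Step] = []
    for _ in range(max_steps):
        action, argument = fake_llm(goal, history)
        if action == "finish":
            return argument
        observation = TOOLS[action](argument)
        history.append((action, argument, observation))
    # Step budget exhausted before the model chose to finish.
    return "gave up"


answer = run_agent("2+3")
```

In a real system the stub would be replaced by a model call that reads the history and chooses the next action. The step budget is one simple way to keep such a loop from running indefinitely.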

Behind the news: Techniques for prompting LLMs in more sophisticated ways began to take off in 2022. They coalesced in moves toward agentic AI early this year. Foundational examples of this body of work include:

Chain of Thought prompting, which asks LLMs to think step by step
Self-consistency, which prompts a model to generate several responses and pick the one that’s most consistent with the others
ReAct, which interleaves reasoning and action steps to accomplish a goal
Self-Refine, which enables an agent to reflect on its own output
Reflexion, which enables a model to act, evaluate, reflect, and repeat
Test-time compute, which increases the amount of processing power allotted to inference
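
As one illustration of the techniques listed above, self-consistency can be sketched as a majority vote over several sampled responses. This is a simplified sketch, not the method as published: `self_consistent_answer` and `make_stub_sampler` are hypothetical names, and the sampler replays canned answers rather than calling a real model.

```python
from collections import Counter
from typing import Callable, List


def self_consistent_answer(
    sample: Callable[[str], str], prompt: str, n: int = 5
) -> str:
    """Sample n responses and return the most frequent one."""
    answers = [sample(prompt) for _ in range(n)]
    # most_common(1) returns [(answer, count)] for the top answer.
    return Counter(answers).most_common(1)[0][0]


def make_stub_sampler(canned: List[str]) -> Callable[[str], str]:
    """Hypothetical stand-in for a model sampled at nonzero temperature.

    Each call returns the next canned answer, ignoring the prompt.
    """
    remaining = iter(canned)

    def sample(prompt: str) -> str:
        return next(remaining)

    return sample


sampler = make_stub_sampler(["42", "41", "42", "42", "40"])
final = self_consistent_answer(sampler, "What is 6 * 7?", n=5)
```

Generating several responses per prompt also spends more processing at inference, which is the trade-off the test-time compute item refers to.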

Where things stand: The agentic era is upon us! Regardless of how well scaling laws continue to drive improved performance of foundation models, agentic workflows are making AI systems increasingly helpful, efficient, and personalized.
