Letters from Andrew Ng

Personal messages to the AI community

Building Models That Learn From Themselves: AI developers are hungry for more high-quality training data. The combination of agentic workflows and inexpensive token generation could supply it.

Inexpensive token generation and agentic workflows for large language models (LLMs) open up intriguing new possibilities for training LLMs on synthetic data. Pretraining an LLM...

Technical Insights

Why We Need More Compute for Inference: Today, large language models produce output primarily for humans. But agentic workflows produce lots of output for the models themselves — and that will require much more compute for AI inference.

Much has been said about many companies’ desire for more compute (as well as data) to train larger foundation models.

Technical Insights

Agentic Design Patterns Part 5, Multi-Agent Collaboration: Prompting an LLM to play different roles for different parts of a complex task summons a team of AI agents that can do the job more effectively.

Multi-agent collaboration is the last of the four key AI agentic design patterns that I’ve described in recent letters.

Technical Insights

Agentic Design Patterns Part 4, Planning: Large language models can drive powerful agents to execute complex tasks if you ask them to plan the steps before they act.

Planning is a key agentic AI design pattern in which we use a large language model (LLM) to autonomously decide on what sequence of steps to execute to accomplish a larger task.

Technical Insights

Agentic Design Patterns Part 3, Tool Use: How large language models can act as agents by taking advantage of external tools for search, code execution, productivity, ad infinitum

Tool use, in which an LLM is given functions it can request to call for gathering information, taking action, or manipulating data, is a key design pattern of AI agentic workflows.

Technical Insights

Agentic Design Patterns Part 2, Reflection: Large language models can become more effective agents by reflecting on their own behavior.

Last week, I described four design patterns for AI agentic workflows that I believe will drive significant progress this year: Reflection, Tool use, Planning, and Multi-agent collaboration.

Technical Insights

Agentic Design Patterns Part 1: Four AI agent strategies that improve GPT-4 and GPT-3.5 performance

I think AI agent workflows will drive massive AI progress this year — perhaps even more than the next generation of foundation models. This is an important trend, and I urge everyone who works in AI to pay attention to it.

Technical Insights

Life in Low Data Gravity: With generative AI, data is bound less tightly to the cloud provider where it’s stored. This has big implications for developers, CIOs, and cloud platforms.

I’ve noticed a trend in how generative AI applications are built that might affect both big companies and developers: The gravity of data is decreasing.

Technical Insights

The Dawning Age of Agents: LLM-based agents that act autonomously are making rapid progress. Here's what we have to look forward to.

Progress on LLM-based agents that can autonomously plan out and execute sequences of actions has been rapid, and I continue to see month-over-month improvements.

Technical Insights

The Python Package Problem: Python packages can give your software superpowers, but managing them is a barrier to AI development.

I think the complexity of Python package management holds down AI application development more than is widely appreciated. AI faces multiple bottlenecks — we need more GPUs, better algorithms, cleaner data in large quantities.

Technical Insights

How to Think About the Privacy of Cloud-Based AI: How private is your data on cloud-based AI platforms? Here's a framework for evaluating risks.

The rise of cloud-hosted AI software has brought much discussion about the privacy implications of using it. But I find that users, including both consumers and developers building on such software...

Technical Insights

Outstanding Research Without Massive Compute: Researchers at Stanford and Chan Zuckerberg Biohub Network dramatically simplified a key algorithm for training large language models.

It is only rarely that, after reading a research paper, I feel like giving the authors a standing ovation. But I felt that way after finishing Direct Preference Optimization (DPO) by...

Technical Insights

Making Large Vision Models Work for Business: Large language models can learn what they need to know from the internet, but large vision models need training on proprietary data.

Large language models, or LLMs, have transformed how we process text. Large vision models, or LVMs, are starting to change how we process images as well. But there is an important difference between LLMs and LVMs.

Technical Insights

An Expanding Universe of Large Language Models: From ChatGPT to the open source GPT4All, the bounty of large language models means opportunities for users and developers alike.

One year since the launch of ChatGPT on November 30, 2022, it’s amazing how many large language models are available. A year ago, ChatGPT was pretty much the only game in town for consumers (using a web user interface) who wanted to use a large language model (LLM)...

Technical Insights

Everyone Can Benefit From Generative AI Skills: Announcing “Generative AI For Everyone,” a new course that requires no background in coding or AI.

I’ve always believed in democratizing access to the latest advances in artificial intelligence. As a step in this direction, we just launched “Generative AI for Everyone” on Coursera.
