Mustafa Suleyman: Agents of action

Published Jan 01, 2025

In 2025, AI will have learned to see, it will be way smarter and more accurate, and it will start to do things on your behalf.

Today AI systems struggle to understand our full context. Their perception is limited to the chat window and a fairly narrow set of interactions. They don’t have a full understanding of what we’re doing or aiming for beyond that. To really grasp our intentions, they need to see what we see.

This capability is now here. AI can sit within the software we use and work alongside us, co-browsing. If text was the first modality for interacting with AI, and voice the breakthrough feature of 2024, I think vision will occupy a similar place in 2025. At Microsoft AI, it has been a major priority of mine to create an AI that can work alongside you in your browser, so you can chat through what you’re looking at or working on and make it a true two-way interaction.

Vision is a step change, palpably different from the ways we’ve been able to use computers in the past. I can’t wait to see where it goes in the coming months.

Alongside vision, we’ll see enormous progress in reducing hallucinations. This is still a critical blocker for widespread adoption of AI. If people doubt what AI tells them, it severely limits what they’ll use it for. Trust is utterly foundational for AI. The good news is that the quality of models as well as their retrieval and grounding capabilities are still rapidly improving.

While I don’t think we’ll eliminate hallucinations entirely, by this time next year, we won’t be fussing about them as much. On most topics, talking to an AI will be at least as reliable as using a search engine and probably more so. This isn’t about a single technical advance, but the persistent accretion of gains across the spectrum. It will make a massive difference.

Lastly, we’re entering the agentic era. We’ve been dreaming of this moment for decades. In my book, The Coming Wave: Technology, Power, and the 21st Century’s Greatest Dilemma, I proposed that we start thinking about ACI, or artificially capable intelligence: the moment when AI starts taking concrete actions on behalf of users. Giving AI the ability to take actions marks the moment when AI isn’t just talking to us, it’s doing things. This is a critical change, and it’s right around the corner.

If we get it right, we’ll be able to make life easier and calmer while also supercharging businesses and personal productivity. But agentic capabilities demand the highest standards of safety, security, and responsibility. Meanwhile, creating genuinely useful agents still faces many formidable hurdles, not least integrating with myriad other systems.

The momentum is there. Actions are on their way. 2025 is going to be a big year.

Mustafa Suleyman is Chief Executive Officer of Microsoft AI. He co-founded Inflection AI and DeepMind Technologies.
