Why Science-Fiction Scenarios of AI’s Emergent Behavior Are Likely to Remain Fictional

The sudden appearance of “emergent” AI capabilities may be an artifact of the metrics you study

Comic where a robot is hiding in a closet during a game of hide-and-seek.

Dear friends,

Over the weekend, my two kids colluded in a hilariously bad attempt to mislead me into looking in the wrong place during a game of hide-and-seek. I was reminded that most capabilities — in humans or in AI — develop slowly.

Some people fear that AI someday will learn to deceive humans deliberately. If that ever happens, I’m sure we will see it coming from far away and have plenty of time to stop it.

While I was counting to 10 with my eyes closed, my daughter (age 5) recruited my son (age 3) to tell me she was hiding in the bathroom while she actually hid in the closet. But her stage whisper, interspersed with giggling, was so loud I heard her instructions clearly. And my son’s performance when he pointed to the bathroom was so hilariously overdramatic, I had to stifle a smile.

Perhaps they will learn to trick me someday, but not yet! (In his awful performance, I think my son takes after me. To this day, I have a terrible poker face — which is matched by my perfect lifetime record of losing every poker game I have ever played!)

Last year, the paper “Are Emergent Abilities of Large Language Models a Mirage?” by Rylan Schaeffer, Brando Miranda, and Sanmi Koyejo, which won a NeurIPS outstanding paper award, examined “emergent” properties of LLMs: capabilities that seem to appear suddenly as model size increases. The authors point out that scaling laws imply the per-token error rate decreases (improves) slowly with scale, and that emergent properties might be an artifact of researchers studying nonlinear or discontinuous metrics, which transform a gradually decreasing per-token error rate into something that looks more like a step function.

Consider a “combination lock” metric that requires getting many items right. Say we’re measuring the probability that an LLM will get all 10 independent digits of an answer right. If the probability of getting each digit right improves gradually from 0 to 1, then the probability of getting all 10 digits right (the per-digit probability raised to the 10th power) stays near zero until the per-digit accuracy is quite high, so it will appear to jump suddenly. But if we look at continuous metrics, such as the total number of correct digits, we will see that the underlying performance actually improves gradually. (Public perception of a technology can also shift in a discontinuous way because of social dynamics.)
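To make the arithmetic concrete, here is a minimal sketch of the combination-lock effect. The per-digit accuracy curve below (a logistic function of a made-up “scale” variable) and the function name per_digit_accuracy are illustrative assumptions, not taken from the paper; the point is only to contrast an all-or-nothing metric with a continuous one when the underlying skill improves smoothly.

```python
# Illustrative sketch (assumed numbers, not from the paper): a per-digit
# accuracy that improves smoothly with model scale looks gradual under a
# continuous metric but abrupt under an all-or-nothing exact-match metric.
import math

NUM_DIGITS = 10  # the answer has 10 independent digits

def per_digit_accuracy(scale):
    """Hypothetical smooth improvement from 0 to 1 as scale grows (logistic curve)."""
    return 1.0 / (1.0 + math.exp(-(scale - 5.0)))

print(f"{'scale':>5} {'per-digit acc':>14} {'all 10 correct':>16} {'avg digits correct':>20}")
for scale in range(1, 11):
    p = per_digit_accuracy(scale)
    exact_match = p ** NUM_DIGITS   # discontinuous-looking metric: all 10 digits right
    avg_correct = NUM_DIGITS * p    # continuous metric: expected number of correct digits
    print(f"{scale:>5} {p:>14.3f} {exact_match:>16.3f} {avg_correct:>20.2f}")
```

Running this, the exact-match column sits near zero for small scales and then seems to shoot up over a narrow range, while the average number of correct digits rises steadily the whole time, which is the paper’s point about metric choice.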

This is why many of us saw GPT-3 as a promising step in transforming text processing long before ChatGPT appeared: BERT, GPT, GPT-2, and GPT-3 represented points on a continuous spectrum of progress. Or, looking back further in AI history, even though AlphaGo’s victory over Lee Sedol in the game of Go took the public by surprise, it actually represented many years of gradual improvements in AI’s ability to play Go.

While analogies between human and machine learning can be misleading, I think that just as a person’s ability to do math, to reason — or to deceive — grows gradually, so will AI’s. This means the capabilities of AI technology will grow gradually (although I wish we could achieve AGI overnight!), and the ability of AI to be used in harmful applications, too, will grow gradually. As long as we keep performing red-teaming exercises and monitoring our systems’ capabilities as they evolve, I’m confident that we will have plenty of time to spot issues in advance, and the science-fiction fears of AI-initiated doomsday will remain science fiction.

Keep learning!

Andrew 
