Dear friends,
ChatGPT has raised fears that students will harm their learning by using it to complete assignments. Voice cloning, another generative AI technology, has fooled people into giving large sums of money to scammers, as you can read in this issue of The Batch. Why don’t we watermark AI-generated content to make it easy to distinguish from human-generated content? Wouldn’t that make ChatGPT-enabled cheating harder and voice cloning less of a threat? While watermarking can help, unfortunately financial incentives in the competitive market for generative AI make its adoption challenging.
Effective watermarking technology exists. OpenAI has talked about developing it to detect text produced by ChatGPT, and this tweet storm describes one approach. Similarly, a watermark can be applied invisibly to generated images or audio. While it may be possible to circumvent these watermarks (for instance, by erasing them), they certainly would pose a barrier to AI-generated content that masquerades as human-generated.
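To make the idea concrete, here is a toy sketch of the “green list” watermarking scheme described in the research literature (Kirchenbauer et al., 2023), not OpenAI’s actual method. At each step, the previous token pseudorandomly marks part of the vocabulary as “green,” and sampling is nudged toward green tokens. A detector that knows the hashing scheme counts green tokens and computes a z-score: human text lands on the green list about half the time, while watermarked text lands on it far more often. The function names, bias value, and toy vocabulary handling below are illustrative.

```python
import hashlib
import math
import random

GREEN_FRACTION = 0.5  # fraction of the vocabulary marked "green" at each step

def green_list(prev_token: str, vocab: list[str]) -> set[str]:
    """Pseudorandomly partition the vocabulary, seeded by the previous token."""
    seed = int(hashlib.sha256(prev_token.encode()).hexdigest(), 16)
    rng = random.Random(seed)
    shuffled = vocab[:]
    rng.shuffle(shuffled)
    return set(shuffled[: int(len(shuffled) * GREEN_FRACTION)])

def watermarked_sample(prev_token: str, probs: dict[str, float], bias: float = 2.0) -> str:
    """Sample the next token, upweighting green-list tokens by exp(bias)."""
    greens = green_list(prev_token, list(probs))
    weights = {t: p * (math.exp(bias) if t in greens else 1.0) for t, p in probs.items()}
    total = sum(weights.values())
    r, acc = random.random() * total, 0.0
    for token, w in weights.items():
        acc += w
        if acc >= r:
            return token
    return token  # fallback for floating-point rounding

def detect(tokens: list[str], vocab: list[str]) -> float:
    """Return a z-score; high values suggest watermarked text.

    Assumes a reasonably long token sequence. Note the detector needs only
    the vocabulary and the hashing scheme, not the language model itself.
    """
    hits = sum(
        tokens[i + 1] in green_list(tokens[i], vocab)
        for i in range(len(tokens) - 1)
    )
    n = len(tokens) - 1
    expected = n * GREEN_FRACTION
    std = math.sqrt(n * GREEN_FRACTION * (1 - GREEN_FRACTION))
    return (hits - expected) / std
```

Because the watermark is statistical, it degrades gracefully: a few edits leave it detectable, but heavy paraphrasing can erase it, which is exactly the circumvention risk noted above.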
Unfortunately, I’m not optimistic that this solution will gain widespread adoption. Numerous providers are racing to offer text-, image-, and voice-generation services. If one of them watermarks its output, it risks putting itself at a competitive disadvantage (even if doing so would make society as a whole better off).
For example, if search engines downranked AI-generated text, SEO marketers who wanted to produce high-ranking content would have a clear incentive to make sure their text wasn’t easily identifiable as generated. Similarly, a student who made unauthorized use of a text generator to do their homework would want it to be difficult for the teacher to find out.
Even if a particular country were to mandate watermarking of AI-generated content, the global nature of competition in this market would likely incentivize providers in other countries, beyond the reach of that law, to keep generating human-like output without watermarks.
Some companies likely will whitewash these issues by talking about developing watermarking technology without actually implementing it. An alternative to watermarking is to use machine learning to classify text as either AI- or human-generated. However, systems like GPTZero that attempt to do so have high error rates and don’t provide a robust solution.
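To see why such classifiers are error-prone, consider the perplexity-based idea that underlies many of them. The sketch below illustrates the general approach, not GPTZero’s actual implementation, and the threshold is a made-up placeholder: flag text as machine-generated when a language model finds it highly predictable.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    """Score how predictable the text is under GPT-2 (lower = more predictable)."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean cross-entropy over tokens
    return torch.exp(loss).item()

def looks_generated(text: str, threshold: float = 40.0) -> bool:
    # Hypothetical cutoff: model-written text tends to score lower than human text.
    return perplexity(text) < threshold
```

The weakness is built in: plenty of human writing (boilerplate, common phrasings) is also highly predictable, while lightly paraphrased AI output scores as unpredictable, so both false positives and false negatives are common.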
If one company were to establish a monopoly or near-monopoly, it would have the market power to implement watermarking without risking a significant loss of market share. Given the many downsides of monopolies, this is absolutely not the outcome we should hope for.
So what’s next? I think we’re entering an era when, in many circumstances, it will be practically impossible to tell whether a piece of content is human- or AI-generated. We will need to figure out how to re-architect both human systems, such as schools, and computer systems, such as biometric security, to operate in this new — and sometimes exciting — reality. Years ago, when Photoshop was new, we learned which images to trust and which not to. With generative AI, we have another set of discoveries ahead of us.
Keep learning!