A United States court delivered a major ruling that begins to answer the question whether, and under what conditions, training an AI system on copyrighted material is considered fair use that doesn’t require permission.
What’s new: A U.S. Circuit judge ruled on a claim by the legal publisher Thomson Reuters that Ross Intelligence, an AI-powered legal research service, could not claim that training its AI system on materials owned by Thomson Reuters was a so-called “fair use.” Training the system did not qualify as fair use, he decided, because its output competed with Thomson Reuters’ publications.
How it works: Thomson Reuters had sued Ross Intelligence after the defendant trained an AI model using 2,243 works produced by Thomson Reuters without the latter’s permission. This ruling reversed an earlier decision in 2023, when the same judge had allowed Ross Intelligence’s fair-use defense to proceed to trial. In the new ruling, he found that Ross Intelligence’s use failed to meet the definition of fair use in key respects. (A jury trial is scheduled to determine whether Thomson Reuters' copyright was in effect at the time of the infringement and other aspects of the case.)
- Ross Intelligence’s AI-powered service competed directly with Thomson Reuters, potentially undermining its market by offering a derivative product without licensing its works. Use in a competing commercial product undermines a key factor in fair use.
- The judge found that Ross Intelligence’s use was commercial and not transformative, meaning it did not significantly alter or add new meaning to Thomson Reuters’ works — another key factor in fair use. Instead, it simply repackaged the works.
- The ruling acknowledged that Thomson Reuters’ works were not highly creative but noted that they possessed sufficient originality for copyright protection due to the editorial creativity and judgment involved in producing it.
- Although Ross Intelligence used only small portions of Thomson Reuters’ works, this did not weigh strongly in favor of fair use because those portions represented the most important summaries produced by Ross Intelligence.
Behind the news: The ruling comes amid a wave of lawsuits over AI training and copyright in several countries. Many of these cases are in progress, but courts have weighed in on some.
- The New York Times is suing OpenAI and Microsoft, arguing that their models generate output that competes with its journalism.
- Condé Nast, McClatchy, and other major publishers recently filed a lawsuit against Cohere, accusing it of using copyrighted news articles to train its AI models.
- Sony, UMG, and Warner Music filed lawsuits against AI music companies including Suno and Udio for allegedly using copyrighted recordings without permission.
- A judge dismissed key arguments brought by software developers who claimed that GitHub Copilot was trained on software they created in violation of open source licenses. The judge ruled in favor of Microsoft and OpenAI.
- In Germany, the publisher of the LAION dataset won a case in which a court ruled that training AI models on publicly available images did not violate copyrights.
Why it matters: The question of whether training (or copying data to train) AI systems is a fair use of copyrighted works hangs over the AI industry, from academic research to commercial projects. In the wake of this ruling, courts may be more likely to reject a fair-use defense when AI companies train models on copyrighted material to create output that overlaps with or replaces traditional media, as The New York Times alleges in its lawsuit against OpenAI. However, the ruling leaves room for fair use with respect to models whose output doesn’t compete directly with copyrighted works.
We’re thinking: Current copyright laws weren’t designed with AI in mind, and rulings like this one fill in the gaps case by case. Clarifying copyright for the era of generative AI could help our field move forward faster.