Question and Answer
In the U.S., is it legal for developers to use copyrighted material to train generative AI tools?
There isn’t a clear answer yet. Some say it is unlawful: several lawsuits are underway against companies such as OpenAI, Microsoft, and Stability AI, and many artists and writers feel that AI appropriates their work without consent or compensation, threatening their creative livelihoods.
Others say that training AI models on copyrighted works is fair use. They argue that AI models learn from these works to generate transformative original content, so no infringement occurs.
Many scholars and librarians agree that training AI language models on copyrighted works is fair use and essential for research. If restricted to public domain materials, AI models would lack exposure to newer works, limiting the scope of inquiries and omitting studies of modern history, culture, and society from scholarly research.
This issue is complex, and it will likely be a long time before all the lawsuits are resolved. Some courts have dismissed parts of the lawsuits while allowing others to proceed, and some cases may be settled out of court.
In June 2025, a federal court ruled in Bartz v. Anthropic that Anthropic’s use of copyrighted books to train its LLMs was fair use.
Learn more
- Training Generative AI Models on Copyrighted Works Is Fair Use. Association of Research Libraries.
- Comment in Support of Fair Use from Creative Commons to the US Copyright Office
- AI and Copyright: Expanding Copyright Hurts Everyone—Here’s What to Do Instead - Electronic Frontier Foundation
- The AI Copyright Trap - Carys J. Craig, Associate Dean of Research & Institutional Relations at York University, Toronto
- Anthropic’s multi-billion dollar loss in Bartz v. Anthropic is really a win (for AI) - Matthew Sag