
At the core of the dispute is an escalating debate over whether using copyrighted content without compensation to train AI models constitutes fair use or infringement, with tech firms citing innovation and creators defending intellectual property rights.
A group of prominent authors has filed a lawsuit against Microsoft, accusing the tech giant of using pirated copies of their books without consent to train its artificial intelligence model, Megatron. The suit, lodged in a New York federal court on June 24, claims that Microsoft exploited thousands of copyrighted literary works to develop its generative AI technology.
Authors named in the lawsuit include Pulitzer Prize winner Kai Bird, essayist Jia Tolentino, and historian Daniel Okrent. They allege that Microsoft used a dataset of nearly 200,000 pirated books to teach its AI system to produce human-like responses to text prompts. According to the complaint, this training allowed Microsoft’s model to replicate the style, voice, and thematic elements of the original works without authorization from the copyright holders.
The lawsuit seeks a court injunction to halt any further use of the authors’ content, along with statutory damages of up to $150,000 per infringed work.
Copyright clash in AI era
This case is the latest in a series of legal challenges targeting major tech companies, including Meta, Anthropic, and OpenAI, over the use of copyrighted material in AI training. Just a day before the Microsoft lawsuit, a California judge ruled that Anthropic’s use of certain works constituted fair use, but left open the possibility of liability for using pirated materials.
Microsoft has yet to issue a public response to the allegations, while legal representatives for the authors have declined to comment further.
At the heart of the dispute is a growing debate over how AI models are trained and whether the use of copyrighted content without compensation constitutes infringement or fair use. Tech firms argue that such usage is transformative and essential for AI innovation, while authors and publishers warn it undermines creative ownership and intellectual property rights.