If AI Is Trained on Your Content Without Permission, What Rights Do You Have and Who’s Accountable? (New York Times v. Microsoft Corporation)
Big Tech just received a legal wake-up call. In a high-profile case, The New York Times took on Microsoft and OpenAI over how AI tools like ChatGPT learn from online content, and the court's response could define the future of journalism, copyright, and AI itself. Whether you are a creator, a novice tech user, or simply curious about how machines learn from what you read, this case is one you will want to follow closely! 👇
🏛️ Court: U.S. District Court, Southern District of New York
🗓️ Judgment Date: 4 April 2025
🗂️ Case Number: No. 23-CV-11195 (SHS), 2025 WL 1009179
🔍 Legal Issue in New York Times Company v. Microsoft Corporation
In New York Times Company v. Microsoft Corporation, the court considered a groundbreaking question: Can tech giants legally use news articles to train artificial intelligence (AI) without permission?
This case pits respected news organizations like The New York Times against Microsoft and OpenAI, creators of powerful AI systems such as ChatGPT.
The New York Times alleges that its original reporting, often paywalled and copyrighted, was “scraped” and fed into AI systems that now produce summaries, rewrites, and even near-verbatim reproductions of its work.
Essentially, the media groups are saying: “Hey, that’s our hard-earned content! You didn’t ask, and now you are profiting from it.”
This raises several complex legal issues, such as:
❓ Direct copyright infringement: Did AI developers unlawfully copy the content during training?
🤝 Contributory and vicarious infringement: Are Microsoft and OpenAI responsible for what users do with their AI tools?
🚫 DMCA violations: Was copyright info deliberately stripped from articles to hide the origin?
⚖️ “Hot news” misappropriation: Did the companies unfairly capitalize on time-sensitive news?
🌐 Timeliness of the claims: Did the publishers wait too long to sue, barring their claims under the statute of limitations?
This lawsuit does not just affect journalists or tech firms; it could impact how all of us use and interact with AI-powered tools in the future.
Are we heading for a world where bots can remix anything online without credit or cash? Or will copyright laws be redrawn to protect human creativity in the AI age? 🤔
Material Facts – New York Times Company v. Microsoft Corporation
The New York Times Company, along with Daily News LP and the Center for Investigative Reporting, filed lawsuits against Microsoft and OpenAI, accusing them of misusing their news content to train AI tools like ChatGPT and Copilot.
Between 2020 and 2024, OpenAI developed increasingly sophisticated large language models (LLMs), including GPT-4, which became integrated into Microsoft products such as Bing Chat and Office Copilot. These AI systems were trained on enormous datasets scraped from the internet—datasets that, according to the plaintiffs, included their copyrighted news articles, often taken from behind paywalls.
The media organizations alleged that OpenAI and Microsoft used automated scraping tools to extract and ingest their journalism into the AI training pipelines.
What made this more concerning was the claim that the tools removed or altered copyright management information (CMI), like author names and copyright notices, during the process. The result? Users could ask AI tools about news topics and receive direct answers or summaries that echoed (and in some cases nearly replicated) the original content without attribution or a link to the source.
The plaintiffs presented more than 100 examples of AI outputs that closely resembled or outright quoted their work. Some outputs, they claimed, appeared to “regurgitate” original articles nearly word-for-word, with no mention of the original publisher. This behavior was not an isolated glitch; it occurred after the tools were publicly released and widely used by millions of people around the world.