Training Data on Trial: How Courts Are Defining Copyright for the AI Era

Mar 4

By: Caroline Altschul
Edited by: William Liu and Gabriela Pesantez

Generative artificial intelligence is transforming how content is created, raising urgent legal questions about copyright, ownership, and authorship. AI systems built by companies like Stability AI and Ross Intelligence are trained on massive datasets, often containing copyrighted works, to generate text and images or to power legal research tools. [1] Recent high-profile lawsuits highlight the uncertainties that arise when established notions of intellectual property clash with the AI boom. Courts are now grappling with how to apply traditional intellectual property laws to AI, with implications for creators, businesses, and policymakers worldwide. [2]

Getty Images, one of the world’s largest stock photography companies, sued Stability AI in the United Kingdom in January of 2023, alleging that the company used over 12 million Getty-owned photographs to train its image-generation model without permission. [3] Stability acknowledged that “many” copyrighted works, including some Getty images, appeared in its training data but denied infringement, arguing that the model learned from images only in an abstracted way, extracting patterns rather than storing or reproducing the originals. [4] Getty’s initial claims focused on two forms of primary copyright infringement: the use of millions of Getty images to train the model and the appearance of Getty-derived content in the model’s outputs. [5] However, the training claim was withdrawn because Getty could not prove the training occurred in the U.K., and the output claim was abandoned after Stability blocked the prompts producing the disputed images. [6]

The court ultimately ruled on two remaining issues. First, it found partial trademark infringement, holding that earlier versions of Stability’s model had generated synthetic images containing Getty’s “Getty Images” and “iStock” watermarks. [7] Though the instances were limited, the court concluded that Stability used the Getty marks “in the course of trade,” offering synthetic images bearing those marks as part of its commercial service. [8] Second, the court rejected Getty’s secondary copyright infringement claim. The court considered whether the model’s “weights,” the numerical parameters learned by the AI during training that encode patterns from the dataset, could constitute an “article” under copyright law, meaning a tangible or intangible object capable of infringing copyright. It concluded that, while weights can legally qualify as an article, they do not store or reproduce the copyrighted images themselves, and therefore cannot be infringing copies. [9] The ruling highlights the difficulty creators face in protecting their work in an era where AI models learn from vast datasets without retaining original copies, leaving major gaps in how copyright laws apply to training AI.

The first substantive U.S. decision to evaluate fair use in the context of AI training data came in Thomson Reuters v. Ross Intelligence, a case that carries major implications for how courts may treat copyrighted material used to train AI systems. [10] Thomson Reuters, the parent company of Westlaw, an online legal research database, sued Ross Intelligence after Ross used Westlaw headnotes, which are editorial summaries and classifications of legal principles, to train its AI-powered legal research tool. [11] Ross had initially attempted to license the content, but after Thomson Reuters refused due to direct competition, Ross obtained training materials from a third-party vendor, LegalEase. Those “Bulk Memos,” created by lawyers using instructions derived from Westlaw headnotes, were ultimately used to train Ross’s system. [12] After initially appearing open to Ross’s defenses, the court reconsidered and granted partial summary judgment for Thomson Reuters. [13] It held that Ross infringed 2,243 headnotes and rejected all five of Ross’s defenses, including innocent infringement, merger, scènes à faire, and copyright misuse. [14] Most significantly, the court rejected Ross’s fair use argument. [15] The judge found that Ross’s use was commercial and not transformative, noting that the headnotes were used not out of necessity but to build a competing research tool. While the headnotes were only minimally creative and Ross did not reproduce them publicly, the court concluded that Ross’s product posed meaningful market harm, including to Thomson Reuters’s potential market for AI training data. [16] The ruling signals that U.S. courts are highly attentive to competitive impact when evaluating AI-related fair use, and it underscores a growing expectation that companies training AI models must vet copyrighted data carefully or risk infringement liability.

The Getty and Thomson Reuters rulings illustrate how courts in different jurisdictions are using distinct legal levers (trademark in the U.K., copyright and fair use in the U.S.) to address the same underlying problem of AI trained on copyrighted material. In London, the High Court narrowed Getty Images v. Stability AI to limited trademark findings while rejecting the theory that model “weights” are infringing copies, a decision legal commentators say leaves open large gaps in copyright protection for creators. [17] In Delaware, Judge Stephanos Bibas’s ruling in Thomson Reuters v. Ross Intelligence marked the first substantive U.S. decision to reject a fair-use defense for training-data copying: the court treated the use as commercial rather than transformative and emphasized potential market harm from a competing product. [18]

Together, the cases highlight three emerging trends in judicial reasoning. First, courts are split on whether “abstract learning” by a model shields developers from infringement; Getty’s reasoning accepted aspects of that argument for copyright but still found limited trademark liability. [19] Second, the transformative-use inquiry is being applied with heavy attention to competitive context: even modestly creative works can weigh against defendants when the AI product substitutes for the rights-holder’s market. [20] Third, judges are separating the analysis of training data from model outputs, creating doctrinal uncertainty about when and how liability should attach. [21]

The practical consequence is broad uncertainty for creators, tech firms, and policymakers: rights-holders face enforcement hurdles, developers face exposure if they train on proprietary material, and legislators are being pushed to clarify the rules, which explains the flurry of government working groups and policy consultations on AI and copyright law.

For businesses seeking to navigate the evolving AI-copyright landscape, clear due diligence is now indispensable. Firms should audit training data to identify and document any copyrighted materials, establish licensing protocols for third-party content, and explore alternative datasets, such as public-domain or properly licensed content, to reduce exposure. At the same time, policymakers must respond to the technology’s rapid advance by offering tailored guidance and legislative clarity. The U.S. Copyright Office has already published the third part of its report on copyright and AI, which addresses generative AI training; it calls for scalable licensing mechanisms and acknowledges that fair use cannot be presumed when vast copyrighted collections are used. [22] Looking ahead, companies should keep an eye on emerging regulatory frameworks, such as the Generative AI Copyright Disclosure Act introduced in Congress, which could reshape obligations around transparency and model-training practices. [23]

Notes:

[1] Potter Clarkson, “What Data Is Used to Train an AI, Where Does It Come from, and Who Owns It?” Potter Clarkson, accessed November 9, 2025, https://www.potterclarkson.com/news/what-data-is-used-to-train-an-ai-where-does-it-come-from-and-who-owns-it.
[2] AdminAI, “AI Content Creation: Legal Challenges for Intellectual Property Rights,” April 5, 2025, https://attorneys.media/ai-content-legal-issues/.
[3] Getty Images (US) Inc & Ors v Stability AI Ltd, [2025] EWHC 38 (Ch), Case No: IL-2023-000007.
[4] Lavinia Puder et al., “Getty Images v. Stability AI: Intellectual Property Rights in the Age of Generative AI,” Katten Muchin Rosenman LLP, November 14, 2025, https://katten.com/getty-images-v-stability-ai-intellectual-property-rights-in-the-age-of-generative-ai.
[5] Getty Images v. Stability AI, 142.
[6] Sophia Goossens and Brett Shandler, “Getty Images v. Stability AI: English High Court Rejects Secondary Copyright Claim,” November 13, 2025, https://www.lw.com/en/insights/getty-images-v-stability-ai-english-high-court-rejects-secondary-copyright-claim.
[7] Luis Rijo, “High Court Rules Stable Diffusion Training Does Not Infringe Copyright,” PPC Land, November 5, 2025, https://ppc.land/high-court-rules-stable-diffusion-training-does-not-infringe-copyright/.
[8] Getty Images v. Stability AI, 143.
[9] Louise Popple et al., “Getty Images v Stability AI: The Copyright Ruling,” November 18, 2025, https://www.taylorwessing.com/de/insights-and-events/insights/2025/11/getty-images-v-stability-ai---the-copyright-ruling.
[10] Thomson Reuters Enterprise Centre GmbH et al. v. ROSS Intelligence Inc., U.S. District Court for the District of Delaware, No. 1:20-cv-00613-SB, 2025.
[11] Monique Bhargava et al., “Court Shuts Down AI Fair Use Argument in Thomson Reuters Enterprise Centre GMBH v. Ross Intelligence Inc.,” Perspectives, Reed Smith LLP, March 3, 2025, https://www.reedsmith.com/en/perspectives/2025/03/court-ai-fair-use-thomson-reuters-enterprise-gmbh-ross-intelligence.
[12] “Breaking Point: Court Rules AI Companies Can’t Feed on Copyrighted Content,” February 18, 2025, https://www.dentons.com/en/insights/alerts/2025/february/18/court-rules-ai-companies-cant-feed-on-copyrighted-content.
[13] Frank D. D’Angelo and Erin Shields, “Thomson Reuters v. Ross Intelligence, Inc.,” Loeb & Loeb LLP, February 11, 2025, https://www.loeb.com/en/insights/publications/2025/02/thomson-reuters-v-ross-intelligence-inc.
[14] Perspectives, Reed Smith LLP.
[15] James Rosenfeld et al., “Thomson Reuters v. Ross Intelligence: Copyright, Fair Use, and AI (Round One),” Davis Wright Tremaine, February 14, 2025, https://www.dwt.com/blogs/artificial-intelligence-law-advisor/2025/02/reuters-ross-court-ruling-ai-copyright-fair-use.
[16] Davis Wright Tremaine.
[17] Katten.
[18] Thomson Reuters v. ROSS Intelligence Inc.
[19] Oliver Yaros et al., “Getty Images v Stability AI: What the High Court’s Decision Means for Rights-Holders and AI Developers,” Insights, Mayer Brown, November 13, 2025, https://www.mayerbrown.com/en/insights/publications/2025/11/getty-images-v-stability-ai-what-the-high-courts-decision-means-for-rights-holders-and-ai-developers.
[20] Monique Bhargava et al., “Court Shuts Down AI Fair Use Argument in Thomson Reuters Enterprise Centre GMBH v. Ross Intelligence Inc.,” Perspectives, Reed Smith LLP, March 3, 2025, https://www.reedsmith.com/en/perspectives/2025/03/court-ai-fair-use-thomson-reuters-enterprise-gmbh-ross-intelligence.
[21] Bristows LLP (Toby Headdon and Jeremy Blum), “The Judgment in Getty Images v Stability AI in Tables and Bullet Points,” Lexology, November 14, 2025, https://www.lexology.com/library/detail.aspx?g=c494ecd4-1d54-4a0d-ad61-ff5e17da2de5.
[22] U.S. Copyright Office, “Copyright and Artificial Intelligence Part 3: Generative AI Training,” May 2025, https://www.copyright.gov/ai/Copyright-and-Artificial-Intelligence-Part-3-Generative-AI-Training-Report-Pre-Publication-Version.pdf.
[23] Generative AI Copyright Disclosure Act of 2024, H.R. 7913, 118th Cong., 2nd sess., introduced April 9, 2024 (not yet enacted).

Bibliography:

AdminAI. “AI Content Creation: Legal Challenges for Intellectual Property Rights.” Attorneys Media, April 5, 2025. https://attorneys.media/ai-content-legal-issues/.

Bristows LLP (Toby Headdon and Jeremy Blum). “The Judgment in Getty Images v Stability AI in Tables and Bullet Points.” Lexology, November 14, 2025. https://www.lexology.com/library/detail.aspx?g=c494ecd4-1d54-4a0d-ad61-ff5e17da2de5.

Dentons. “Breaking Point: Court Rules AI Companies Can’t Feed on Copyrighted Content.” February 18, 2025. https://www.dentons.com/en/insights/alerts/2025/february/18/court-rules-ai-companies-cant-feed-on-copyrighted-content.

James Rosenfeld, Sarah Wood, Haley Zoffer, Shannon L. McNeal, Christopher W. Savage, “Thomson Reuters v. Ross Intelligence: Copyright, Fair Use, and AI (Round One),” Davis Wright Tremaine, February 14, 2025. https://www.dwt.com/blogs/artificial-intelligence-law-advisor/2025/02/reuters-ross-court-ruling-ai-copyright-fair-use.

Getty Images (US) Inc & Ors v Stability AI Ltd. [2025] EWHC 38 (Ch), Case No. IL-2023-000007.

Lavinia Puder, “Getty Images v. Stability AI: Intellectual Property Rights in the Age of Generative AI,” Katten, November 14, 2025. https://katten.com/getty-images-v-stability-ai-intellectual-property-rights-in-the-age-of-generative-ai.

Frank D. D’Angelo, Erin Shields, “Thomson Reuters v. Ross Intelligence, Inc,” Loeb & Loeb LLP, February 11, 2025. https://www.loeb.com/en/insights/publications/2025/02/thomson-reuters-v-ross-intelligence-inc.

Oliver Yaros, Alasdair Maher, Ellen Hepworth, Rebecca Keay, Shannon Balnaves, “Getty Images v Stability AI: What the High Court’s Decision Means for Rights-Holders and AI Developers,” Mayer Brown, November 13, 2025. https://www.mayerbrown.com/en/insights/publications/2025/11/getty-images-v-stability-ai-what-the-high-courts-decision-means-for-rights-holders-and-ai-developers.

Luis Rijo, “High Court Rules Stable Diffusion Training Does Not Infringe Copyright,” PPC Land, November 5, 2025. https://ppc.land/high-court-rules-stable-diffusion-training-does-not-infringe-copyright/.

Potter Clarkson. “What Data Is Used to Train an AI, Where Does It Come from, and Who Owns It?” Accessed November 9, 2025. https://www.potterclarkson.com/news/what-data-is-used-to-train-an-ai-where-does-it-come-from-and-who-owns-it.

Monique N. Bhargava, Mitesh P. Patel, Katherine Litaker, “Court Shuts Down AI Fair Use Argument in Thomson Reuters Enterprise Centre GMBH v. Ross Intelligence Inc.,” Reed Smith LLP, Perspectives, March 3, 2025. https://www.reedsmith.com/en/perspectives/2025/03/court-ai-fair-use-thomson-reuters-enterprise-gmbh-ross-intelligence.

Louise Popple, Adam Rendle, Xuyang Zhu, “Getty Images v Stability AI: The Copyright Ruling,” Taylor Wessing, November 18, 2025. https://www.taylorwessing.com/de/insights-and-events/insights/2025/11/getty-images-v-stability-ai---the-copyright-ruling.

Thomson Reuters Enterprise Centre GmbH et al. v. ROSS Intelligence Inc. U.S. District Court for the District of Delaware, No. 1:20-cv-00613-SB (2025).

U.S. Copyright Office. “Copyright and Artificial Intelligence Part 3: Generative AI Training.” May 2025. https://www.copyright.gov/ai/Copyright-and-Artificial-Intelligence-Part-3-Generative-AI-Training-Report-Pre-Publication-Version.pdf.

U.S. House. Generative AI Copyright Disclosure Act of 2024. H.R. 7913, 118th Cong., 2nd sess. Introduced April 9, 2024 (not enacted).

Hannah Cheves