In a Monday filing, the Times also took aim at OpenAI's accusation last month that the newspaper paid a hacker to get ChatGPT to produce infringing outputs, calling the allegation an "attention-grabbing claim" that "is as irrelevant as it is false."
"OpenAI's true grievance is not about how the Times conducted its investigation, but instead what that investigation exposed: that defendants built their products by copying the Times' content on an unprecedented scale — a fact that OpenAI does not, and cannot, dispute," the newspaper said.
The Times sued OpenAI and Microsoft in December, alleging the tech companies ripped off millions of copyrighted works to train ChatGPT. The newspaper said Monday that Microsoft and OpenAI "have misappropriated almost a century's worth of copyrighted content, without paying fair compensation."
The tech companies have separately moved to dismiss the bulk of the complaint. Microsoft is a financial backer of OpenAI.
OpenAI and Microsoft did not move to dismiss the newspaper's lead claim of direct infringement in its entirety, with OpenAI limiting its motion on that count to works for which the company contends the claims are barred by the copyright statute of limitations. Microsoft, meanwhile, said the direct infringement claim would be resolved later because it hinges on whether using content to train ChatGPT constitutes fair use.
The Times said in its Monday brief that OpenAI and Microsoft cannot deny they used the newspaper's material to train ChatGPT.
"OpenAI worked with Microsoft to build these training data sets, which OpenAI does not dispute are packed with Times content," the newspaper said.
The Times said "the exact number" of works that were copied to train ChatGPT is unknown, "including because defendants have not publicly disclosed the makeup of the data sets used to train GPT-3 and each subsequent model."
GPT-4, the model now powering ChatGPT and Copilot, is believed to be much more powerful than its predecessors.
In OpenAI's motion to dismiss, the company asked a judge to toss the newspaper's contributory copyright infringement claim, arguing that the Times has to allege that OpenAI knew of ChatGPT's allegedly infringing outputs cited in the newspaper's complaint — the outputs OpenAI accuses the Times of prompting.
"First, contrary to OpenAI's assertion, the Times' contributory infringement claim is not limited to 'the Times' creation of [the] outputs' cited in the complaint, but instead extends to circumstances in which 'an end-user may be liable as a direct infringer based on output of GPT-based products,'" the newspaper said.
The Times also shrugged off OpenAI's argument that the newspaper had not plausibly alleged that the tech company removed copyright management information, or CMI, from material used to train ChatGPT.
"The omission of CMI in the models' outputs suggests that CMI was removed during the training process; otherwise the models would have outputted all CMI as well," the Times said.
The Times has not yet filed a response to Microsoft's motion to dismiss.
Counsel for OpenAI did not immediately respond to a request for comment Tuesday.
The Times is represented by Elisha Barron, Ian Crosby, Davida Brook, Ellie Dupler and Tamar Lusztig of Susman Godfrey LLP, and Steven Lieberman, Jennifer B. Maisel and Kristen J. Logan of Rothwell Figg Ernst & Manbeck PC.
Microsoft is represented by Annette Louise Hurst and Christopher Cariello of Orrick Herrington & Sutcliffe LLP.
OpenAI is represented by Allison Levine Stillman, Andrew Gass, Joseph Richard Wetzel Jr., Sarang Damle and Luke Budiardjo of Latham & Watkins LLP, and Joseph C. Gratz and Rose Lee of Morrison & Foerster LLP.
The case is New York Times Co. v. Microsoft Corp. et al., case number 1:23-cv-11195, in the U.S. District Court for the Southern District of New York.
--Editing by Alyssa Miller.