In response to the New York Times’ copyright infringement lawsuit, OpenAI has addressed the copyright claims made against it in a blog post. It says that training generative AI models with existing content is fair use, but that, as a good corporate citizen, it is happy to offer publishers the choice to opt out, which the NYT did in August 2023.
It also addresses allegations made in the NYT lawsuit that OpenAI’s models will “recite large portions” of the newspaper’s articles “verbatim” with “minimal prompting.” OpenAI says “regurgitation” is “a rare bug that we are working to drive to zero” and that “we have measures in place to limit inadvertent memorisation and prevent regurgitation in model outputs”.
AI vs Copyright: Clash of the Titans
The dispute between the technology and content industries over the copyright obligations of AI businesses is growing. Copyright owners argue that AI companies must obtain authorisation from copyright holders before training generative AI models with existing content. Most tech companies claim they don’t require permission, relying on exceptions in copyright law or, under American law, the complex principle of fair use.
OpenAI rejects the NYT’s copyright claims. Its blog post claims that “training AI models using publicly available internet materials is fair use, as supported by long-standing and widely accepted precedents”. It adds: “This principle is fair to creators, necessary for innovators, and essential for US competitiveness”.
That said, OpenAI adds that legal right matters less to it than being a good citizen. It says it led the AI industry in offering a straightforward opt-out mechanism that lets publishers block its technologies from accessing their sites. In effect, OpenAI is applying more broadly the opt-out for copyright owners that is built into the EU’s data mining copyright exception.
AI & Copyright: Partnerships Bridge the Divide
OpenAI emphasises that it works with various copyright owners, including news companies. Before the NYT went legal in late December, it says, it thought talks about a partnership were going well.
In response to the NYT’s regurgitation complaints, OpenAI says the newspaper “repeatedly refused to share any examples, despite our commitment to investigate and fix any issues”. According to OpenAI, the regurgitated articles the NYT points to appear to be stories that had been widely republished on third-party sites, and even then its model presumably quoted them heavily only because of the particular prompts used.
The New York Times and other critics of the AI company’s stance on copyright are unlikely to be satisfied. The template increasingly used by AI companies on copyright matters is: “It’s all fair use, but hey, we’re collaborating with the savvy content owners, and anyway, we’re in this for the good of humanity, so let us innovate, otherwise, you know, big bad China will end up owning AI, and nobody wants that”.