Out-Law Analysis
01 Mar 2023, 3:08 pm
Artificial intelligence (AI) image generators have become increasingly popular over the last year, but their developers could be exposing themselves – and their users – to legal, ethical, and regulatory risks by training the technology using copyrighted material.
To produce accurate images from user prompts that can be as varied and unusual as ‘a pufferfish swallowing an atom bomb’ or ‘Kermit being arrested’, AI tools are trained on existing photographs. These photographs, often scraped from online sources, will almost certainly attract copyright protection.
It is easy to assume that in the age of the internet, where information is disseminated so freely and easily, intellectual property (IP) rights can be dispensed with – but that assumption is currently being challenged in the courts. In one high-profile case, stock images company Getty Images is suing Stability AI, creators of AI image generator Stable Diffusion, over alleged copyright infringement.
Getty claims that Stability AI unlawfully copied and processed millions of images protected by copyright to train its AI software. The company suspects its images are being used unlawfully because Stability AI’s generated images appear to feature a tell-tale vestige of the famous Getty Images grey watermark.
While in the US the act of scraping might fall under the country’s ‘fair use’ doctrine, analogous laws are much stricter in European jurisdictions. The UK has a specific list of exceptions to copyright, including non-commercial research or private study, teaching, reporting current events, and parody, among others.
The UK also has an exception covering “text and data mining for non-commercial research”, but this is unlikely to cover web-scraping for the purposes of training an AI tool to generate images. The researcher still needs to have lawful access to the work in the first place, which is likely not the case for mass web-scraping without permission. On top of this, most of the AI tools trained on data scraped from the web will be deployed for commercial purposes, meaning that the exception cannot be relied on.
Images generated “in the style” of a particular artist are already being put up for sale online, which could siphon royalties and commissions from the real artists themselves. This is particularly contentious since it could be argued that, unlike a human artist who must spend time labouring over their work, no real effort goes into AI-generated images – which are produced in seconds based on pre-existing work.
At the same time, proponents of AI image-generation software have compared the process to human creativity, which entails drawing inspiration from other artists. David Holz, CEO of AI developer Midjourney, told journalists in 2022 that, just as humans “look at somebody else’s picture and learn from it and make a similar picture”, AI image generators “are learning like people, it’s sort of the same thing”.
However, evidence suggests that some AI tools are going even further than producing images based on copyrighted material and are capable of creating identical copies of pre-existing images with minimal text prompts. AI developers state that models are supposed to create novel images from pre-existing works, as opposed to merely duplicating images. But the same evidence suggests that as AI systems get more sophisticated, the likelihood increases that they will generate perfect copies of pre-existing materials, because they will have more space to store more training data.
Victory for Getty Images in its legal challenge against Stability AI would send out an unequivocal legal message to companies training AI image generators that they will need a copyright licence for this activity. Such a ruling could limit the pool of images to which an AI tool has access if image providers only license out their photos on strict terms. In practice, however, it would likely be difficult to enforce.
Currently, rights holders have no freely available and inexpensive way to determine whether their work is being fed into an AI tool. Technology is likely to advance rapidly so that it becomes possible to trace what has been fed into a particular AI tool, but rights holders might still struggle to prove that their work has actually been used in the AI’s output in a way that constitutes copying.
Regulators are increasingly pushing back against the idea that AI systems are impenetrable and have called for increased transparency in how they are developed, trained, and deployed. The UK Intellectual Property Office (UKIPO) originally suggested a text and data mining exception in June 2022, which would relax copyright restrictions for AI analysing large amounts of information to identify trends and patterns. But in January the government indicated that the exception would not be taken forward.
The decision was likely informed by strong criticism of the proposal by the creative industries, concerned that the exception would undermine the UK’s copyright framework and its publishing industry. In an open letter, the Publishers Content Forum warned: “Without the ability to license and receive payment for the use of their data and content, certain businesses will have no choice but to exit the UK market or apply paywalls where access to content is currently free.”
This could be the beginning of a wider pushback against AI and its seemingly limitless power, reflecting the impact it could have on the enforcement of IP rights and the creative industries more generally if allowed to develop unchecked. As has been the case with all major technological advancements, it is important to balance the needs of any IP rights owners involved in developing that technology against the need to allow individuals to innovate. And yet, one of the main justifications for IP rights for centuries has been to give individuals the impetus to innovate.
The lack of legal clarity also threatens innovation. Last year, for example, Getty Images banned the upload and sale of AI-generated illustrations over fears of legal challenges. Questions over the legality of these nascent avenues of potential creativity certainly provide an impetus for the law to catch up and keep pace with developments in AI technology.
However, until courts provide unequivocal rulings on these matters, or government drafts clear legislation, AI developers who train their software using unlicensed copyrighted material continue to expose themselves to legal challenges and regulatory enforcement.
Co-written by Concetta Scrimshaw of Pinsent Masons.