Out-Law News

OpenAI loses copyright case in Germany


Copyright works can subsist within AI models, the Regional Court of Munich has ruled, in a case that has potentially profound implications for AI developers and rights holders.

The Munich court held that OpenAI, the company behind ChatGPT, infringed the copyright of songwriters by means of “taking over into the LLMs parameters” the lyrics that were used to train the model. As a result, users of ChatGPT could get the tool to display those lyrics by prompting it with relevant queries – such as “what is the text of song X?”. The case was brought by German collecting society GEMA.

Central to the court’s ruling was its finding that the copyrighted lyrics were memorised within ChatGPT. It further suggested that the text and data mining (TDM) exception to copyright, provided for under EU copyright law, does not provide a basis for developers to reproduce copyrighted content in their gen-AI outputs without rights holders’ consent. The exception was said to cover only preparatory measures taken in compiling the training data sets.

Dr Nils Rauer of Pinsent Masons, an expert in AI and intellectual property (IP) law, said the judgment could prove to be pivotal if it is upheld in the further course of the proceedings. He said the idea of copyright-protected works existing within the LLM is game-changing, adding that it would lead to an obligation sitting with the operator of the model to prevent users from bringing to life such works by means of targeted prompting.

Rauer said, however, that he expects the judgment to be appealed and believes it is likely that the legal questions at the centre of this case will be considered, and ruled on, by the EU’s highest court in due course. He also highlighted how some AI providers have already taken precautionary measures in response to the Munich court’s ruling. He said searching for the specific lyrics at issue in the case using ChatGPT or Copilot today results in a copyright notice stating that the text cannot be displayed for legal reasons.

“The crucial question for the Court of Justice of the EU (CJEU) to give a definitive answer to in due course is whether or not large language models (LLMs) can indeed be said to memorise content to the extent that it can be concluded that the content used for training purposes exists within that LLM,” Alexander Bibi, a copyright expert in Rauer’s team, added. “The answer to that question is critical to whether copyrighted content can be said to have been ‘reproduced’ in the model itself and subsequently in the outputs those models generate, for the purposes of interpreting EU copyright law.”

“In its ruling, the Munich court has said that the concept of ‘reproduction’ is not confined to covering identical copies of copyright works. In fact, it said reproduction can be said to take different forms. In this regard, it cited an example of copyrighted content being reflected in digital tokens within an LLM, while stressing that whether those tokens can then be said to constitute a reproduction would depend on whether the content they reflect is perceptible. This would be the case if a user could bring the underlying content to the fore by virtue of their prompts,” Bibi added.

In examining the concept of ‘reproduction’ further, the court considered that it does not matter if reproduction relies on probabilities and underlying calculations, or if copyrighted content is memorised as a puzzle within an LLM – if the entire puzzle can be brought to light upon a user prompt. The court also said it does not matter whether the individual puzzle pieces can be individualised within the LLM, as long as they are somewhere in there, and it said rights holders do not need to be able to describe how the memorising process works within an LLM to be able to rely on claims of unauthorised reproduction of their works.

In a statement, Dr. Kai Welp, general counsel of GEMA, said the ruling had helped clarify “central legal questions regarding the interaction of a new technology with European copyright law … for the first time”, adding that it is “a milestone on the way to fair remuneration for authors throughout Europe”.

Rauer said that the Munich court’s ruling is just one in what is likely to be a number of AI copyright cases to come before courts in the EU. He said the potential for divergence in case law, as different courts in different countries take a view on the legal questions arising, will present increasing uncertainty for AI developers and for rights holders too.

In this regard, notwithstanding the different facts of the case and differences in the technology and legislation at issue, he highlighted how the High Court in London had taken a different view from the Munich court – on the question of whether an AI system could be said to store and reproduce content – in its recent landmark ruling in the Getty Images v Stability AI case.

The Munich court drew a distinction between preparatory acts falling within the scope of the TDM exception and the memorising of the lyrics within the AI model. The judges took the position that the exception applies only up to the point where the analysis of the training data terminates and the shaping of the parameters of the LLM begins.

Rauer said that this ruling will trigger an intense debate on how to balance the need for AI tools that are adequately trained, and thus capable of supporting society in delivering economic, health, cultural and other benefits, against the protection of intellectual property rights. He said that if the current judgment were to shape the future understanding of memorised content within AI models, it could reach a point where legislators are called upon to reconsider the scope of the TDM exception.
