Hamburg Higher Regional Court clarifies text and data mining exception for AI training
A year on from the initial decision in the widely publicised LAION case at the Hamburg Regional Court, the Higher Regional Court has now ruled on the appeal. This decision, handed down just one month after the equally notable Munich Regional Court ruling in the GEMA v OpenAI case, brings the so-called “AI lawsuits” to the second instance.
Background
The plaintiff is a professional photographer who distributes their work via stock photo platforms. The defendant is a registered association which provides a dataset containing around 5.85 billion image-text pairs to the public free of charge under an open licence. This dataset contains only hyperlinks to images that are freely available on the internet, along with image descriptions. The images themselves are not stored in the dataset. This dataset is particularly suitable for developing and training generative image AI.
To create the dataset, the defendant used an existing US dataset. Image URLs were extracted from this dataset, the linked images were downloaded from the original websites and software was used to automatically check whether the image content and existing text descriptions matched. Only if there was sufficient correspondence were the URL and description transferred to the new dataset as metadata. The necessary downloads, including the disputed watermarked preview image from a stock photo platform, took place in the second half of 2021.
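The filtering step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the defendant's actual software: the names `build_metadata`, `fetch_image` and `similarity`, the threshold value and the toy data are all hypothetical. It shows only the structure of the process: each image is downloaded temporarily, compared with its description, and only the link and description are kept.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator


@dataclass(frozen=True)
class Entry:
    """One record of the new dataset: metadata only, no image data."""
    url: str
    caption: str


def build_metadata(
    pairs: Iterable[tuple[str, str]],
    fetch_image: Callable[[str], bytes],
    similarity: Callable[[bytes, str], float],
    threshold: float,
) -> Iterator[Entry]:
    """Yield entries whose image and caption match sufficiently."""
    for url, caption in pairs:
        image = fetch_image(url)            # temporary download of the image
        score = similarity(image, caption)  # automated image/text comparison
        if score >= threshold:
            yield Entry(url, caption)       # keep only link and description
        # the downloaded image itself is never added to the dataset


# Toy stand-ins (hypothetical) so the sketch runs without network or model.
FAKE_WEB = {
    "https://example.com/cat.jpg": b"cat",
    "https://example.com/dog.jpg": b"dog",
}


def fake_fetch(url: str) -> bytes:
    return FAKE_WEB[url]


def fake_similarity(image: bytes, caption: str) -> float:
    # 1.0 if the caption mentions the pretend image content, else 0.0
    return 1.0 if image.decode() in caption.lower() else 0.0


source_pairs = [
    ("https://example.com/cat.jpg", "A cat on a sofa"),
    ("https://example.com/dog.jpg", "Sunset over the sea"),
]
dataset = list(
    build_metadata(source_pairs, fake_fetch, fake_similarity, threshold=0.5)
)
```

In this toy run, only the first pair passes the comparison, so the resulting dataset holds a single link and description and no image bytes.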
The platform’s terms of use expressly prohibited automated scraping (“bots or the like”). The photographer considered the download of his image to be an unauthorised reproduction pursuant to Section 16 UrhG (German Copyright Act) and sought an injunction against the association. Among other things, he argued that neither Section 44b UrhG nor Section 60d UrhG covers AI training. In any case, he claimed that a valid “machine-readable” reservation of use pursuant to Section 44b(3) UrhG had been declared.
The Hamburg Regional Court dismissed the action at first instance (judgment of 27 September 2024, Ref. 310 O 227/23). Although it affirmed that reproduction had taken place, the court considered this to be justified by both Section 44b and, primarily, Section 60d UrhG, leaving the question of the reservation of use under Section 44b(3) UrhG open. However, in its obiter dictum, it was open to the view that an opt-out in text form could fulfil the “machine readability” requirement. We have already discussed this decision in detail elsewhere.
The photographer appealed against the judgment, but was unsuccessful. The Hanseatic Higher Regional Court of Hamburg (5th Civil Senate) dismissed the appeal in its judgment of 10 December 2025 and permitted an appeal to the Federal Court of Justice.
Court decision
The 5th Civil Senate confirms that a reproduction relevant under copyright law took place, but deems it justified by two provisions: Section 44b UrhG (text and data mining, or “TDM”) and Section 60d UrhG (TDM for scientific research purposes). Unlike the first instance court, which somewhat surprisingly relied on Section 60d UrhG and only addressed Section 44b UrhG in passing, the Higher Regional Court now clearly focuses on Section 44b UrhG. In particular, it clarifies the reservation of use under Section 44b(3) UrhG and the requirements for “machine readability”.
1. Reproduction yes – but applicability of the TDM exception
The defendant’s download of the image constitutes reproduction under Section 16(1) UrhG, as there was no consent from the photographer or contractual licence via the image agency. Only the freely accessible preview image with a watermark was downloaded. Therefore, the decisive question is whether this reproduction is covered by an exception or limitation provision.
The Higher Regional Court has classified the download in question as “reproduction for text and data mining” within the meaning of Sections 44b(1) and (2) UrhG. The photograph is a digital work, and the reproduction took place as part of an automated analysis. The download was part of the automated comparison of image content and descriptions. This comparison is used to obtain information, particularly about patterns, trends and correlations. The court considers the question of whether an image and text match to be a “correlation”, or at least an informative connection, as defined by the provision.
The Senate thus clarifies that text and data mining encompasses not only the statistical evaluation of large amounts of data, but also preparatory analysis activities aimed at gaining knowledge at a later stage. At the same time, the Senate draws a distinction: mere collections for the creation of “digital parallel archives” are not covered by Section 44b UrhG. That limit is not reached in the present case, because only links and metadata are stored in the dataset, not the image files themselves.
In all this, the Senate expressly relies on the will of the EU legislature and refers to the connection between the DSM Directive and Article 53(1)(c) of the AI Act. On this reading, the text and data mining exception is expressly intended to cover machine learning and the training of AI models, since Article 53(1)(c) of the AI Act refers to Article 4 of the DSM Directive, which in turn lays down the TDM exception.
2. No effective reservation of use under Section 44b(3) UrhG
At the heart of the appeal was the counter-exception in Section 44b(3) UrhG. The photographer invoked the clause in the stock photo platform’s terms and conditions that prohibits scraping by bots. The Senate considered this argument and gave the provision different contours than the Regional Court had.
Firstly, the Senate affirmed that an author can invoke a reservation of use declared by a stock photo agency, even though the agency only holds simple rights of use. The term “right holder” in Article 4 of the Directive on Copyright and Related Rights in the Digital Single Market (the “DSM Directive”) and Section 44b UrhG must be interpreted in light of the “effet utile”. Reservations declared by a platform operator with the author’s consent are therefore attributable to the author. Otherwise, the author’s decision-making rights regarding TDM uses would be impractically restricted. In this respect, the Senate confirmed the Regional Court’s tendency to favour rights holders.
Within the framework of Section 44b(3) sentence 1 UrhG, the Senate places the burden of proof on the user. The user must therefore prove that the rights holder has not reserved the use for text and data mining. Conversely, the machine readability of this reservation must be demonstrated and proven by the rights holder. This includes demonstrating that the reservation could be automatically recognised and correctly interpreted by commercially available tools at the time of use.
According to the Senate, while the legislator has not prescribed a specific form, it has required that the chosen form be machine-readable. The Senate requires more than mere machine “intelligibility”: the form must enable automated systems to interpret the reservation of use and, as a result, to refrain from evaluating the content. The requirement of “machine readability” is therefore technology-neutral, but must also meet an appropriate standard.
In this case, the Senate did not consider these requirements to have been met. Although the claimant referred to current tools, they were unable to demonstrate that corresponding, reliable technologies were available and commercially viable at the time of reproduction in 2021. However, the Senate leaves open the possibility that a reservation of use formulated in natural language could be sufficient for Section 44b(3) UrhG, provided the rights holder can demonstrate that their specific opt-out was designed in such a way at the time of use that automated systems could recognise and observe it. In this respect, the Senate defines the element of “machine readability” more strictly than the lower court indicated in its obiter dictum.
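The technical side of this distinction can be illustrated with a short Python sketch. It is an illustration only and rests on assumptions that are not part of the judgment: robots.txt-style rules are used as one example of a structured signal, the user agent name `ExampleTDMBot`, the URL and the helper `may_fetch` are hypothetical, and nothing here says which form satisfies Section 44b(3) UrhG. The sketch merely shows that a standard parser acts on a structured rule but extracts nothing from a natural-language clause.

```python
from urllib.robotparser import RobotFileParser


def may_fetch(policy_lines: list[str], user_agent: str, url: str) -> bool:
    """Return True if the parsed policy does not bar `user_agent` from `url`."""
    parser = RobotFileParser()
    parser.parse(policy_lines)  # parsed in memory; no network request is made
    return parser.can_fetch(user_agent, url)


URL = "https://stock.example/previews/123.jpg"  # hypothetical preview image

# A structured signal that an off-the-shelf parser can interpret.
structured = ["User-agent: ExampleTDMBot", "Disallow: /previews/"]

# A natural-language clause, as it might appear in terms of use.
prose = ["Automated access by bots or the like is prohibited."]

structured_allows = may_fetch(structured, "ExampleTDMBot", URL)
prose_allows = may_fetch(prose, "ExampleTDMBot", URL)
```

Here `structured_allows` is `False`, whereas `prose_allows` is `True`: this particular parser finds no rule in the prose clause and therefore does not block the download. Whether other tools available at the time of use could have recognised such a clause is exactly the kind of point the rights holder would have to demonstrate.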
3. Three-step test: No impermissible impairment
Like the Regional Court, the Senate also measures the limitation against the three-step test under Article 5(5) of the InfoSoc Directive. Although the protection of photographs under Section 72 UrhG is a national right that is not harmonised at EU level, the Senate applies the test by way of an interpretation in conformity with the directive. The test determines whether a limitation provision constitutes an acceptable interference with the author’s rights.
- In the first stage, the Senate establishes that Section 44b UrhG constitutes a clearly defined special case.
- In the second stage, the Senate determines that normal exploitation of the work or other protected object is not impaired. The disputed reproduction is a purely internal technical process. The published dataset contains only links to works that are legally accessible to the public. The Senate leaves open the question of whether indirect consequences, i.e. the generation of infringing content by AI models trained using the dataset, should be considered at this stage. In principle, the rights holder can take action against such infringements, as the Senate has stated with reference to the recent decision of the Regional Court of Munich I in the GEMA case. Furthermore, the rights holder can prevent use from the outset with an effective opt-out.
- Finally, in the third stage, the Senate also denies that the rights holder’s legitimate interests are unreasonably prejudiced. The plaintiff does refer to competitive pressure from generative AI models. However, in this case, the Senate believes that the interests of research and innovation outweigh these concerns, particularly since rights holders can opt out and the defendant is not acting commercially. The plaintiff was unable to prove any specific loss of revenue attributable to the disputed dataset.
Overall, therefore, the Senate confirms the lower court’s tendency to consider the TDM exceptions applicable and compatible with the three-step test, even for the training of generative AI and despite well-known legal policy concerns.
4. Section 60d UrhG as a second exception
Although Section 44b UrhG already applies, the Senate also examines the limitation in Section 60d UrhG, on which the Regional Court had primarily relied.
Like the lower court, the Higher Regional Court also affirms the applicability of this provision. According to the Senate, “scientific research” encompasses any systematic, methodical activity aimed at gaining new, verifiable knowledge, including applied and technological research. The boundaries with development activities are fluid, meaning development work can also be privileged if it is closely intertwined with research.
The court expressly classifies the creation of the dataset as the defendant’s own scientific research. The association’s purpose is to promote and disseminate free research and education. The statutes indicate that creating infrastructure for large AI models is part of this purpose. The creation of the dataset and its use for developing and improving AI models aims to gain knowledge and is therefore research under Section 60d UrhG.
The defendant is therefore classified as an “other institution” within the meaning of Section 60d(2) sentence 2 UrhG that conducts scientific research. However, the institution must conduct research itself; merely providing infrastructure without conducting its own research is insufficient. The defendant does not pursue commercial purposes, but reinvests profits in research in accordance with its statutes. The Senate emphasises that cooperation with companies is harmless in itself, provided they do not have decisive influence or preferential access to research results (Section 60d(2) sentence 3 UrhG). The connections to AI companies presented by the plaintiff were insufficient to prove such controlling influence or preferential access. Individual paid services or personnel links are not sufficient as long as structural control is not proven.
Outlook
In contrast to the initial decision, the Senate has shifted the focus from the more specific Section 60d UrhG to the broader Section 44b UrhG, thus confirming the growing trend that data scraping for the purpose of training AI models is generally considered to be privileged under the TDM exception. This decision helps to clarify controversial aspects of the provision, such as the machine-readability of the reservation of use, and provides legal practitioners with valuable guidance on presentation requirements and the burden of proof.
The well-reasoned decision of the Senate cannot readily be compared with the Munich GEMA proceedings. In the Hamburg proceedings, the focus is on the copyright assessment of the collection, processing and linking of training data by a non-commercial association, as well as the purely technical reproductions necessary for creating an open dataset. The Munich proceedings, on the other hand, also concern the model and output levels, specifically the memorisation of specific, copyright-protected works (in this case, song lyrics) in a commercial AI system and their identical reproduction upon entering corresponding prompts.
Nevertheless, the same legal questions arise in both cases, particularly with regard to the interpretation of Section 44b UrhG. It is regrettable that neither the Regional Court of Munich I nor the Higher Regional Court of Hamburg referred the case to the European Court of Justice. The Federal Court of Justice will have to take this step if it gets the opportunity; the Higher Regional Court of Hamburg has allowed an appeal.