Out-Law Analysis

The big data protection question that could get answered in 2026


The CJEU in Luxembourg has played a major role in defining the concept of personal data over the years. Alexandros Michailidis/Getty Images.


Businesses could find it easier to argue that information they handle is not subject to EU data protection law under legislative proposals that could be endorsed during 2026.

For years, organisations, data protection authorities and the courts have grappled with the concept of ‘personal data’ and how broadly it should be interpreted. Whether information constitutes personal data is a fundamental question for businesses, as it dictates whether the processing of that information is subject to data protection law. There are now signs that EU law will be changed to clarify the matter.

In November 2025, the European Commission outlined its proposals for a Digital Omnibus Regulation and Digital Omnibus on AI Regulation, which provide for noteworthy reforms to centrepiece EU frameworks – including the General Data Protection Regulation (GDPR). One of the Commission’s proposals is to update the GDPR to clarify that information organisations hold would not constitute personal data if those organisations lack the “means reasonably likely to be used to identify the natural person to whom the information relates”.

We highlighted at the time how the proposed change could take many businesses handling pseudonymised or aggregated datasets outside the scope of the GDPR’s obligations and support data-related innovation. Below, we look in more detail at how it aligns with the recent direction of travel in EU case law and consider what it might mean for AI development and for controller-processor arrangements.

The 2025 ruling that moves the dial

Disputes over what constitutes personal data have made their way to the EU’s highest court over the years. For a while, it seemed like the concept was being interpreted in an increasingly broad way.

For example, in 2011 the Court of Justice of the EU (CJEU) held that internet users’ IP addresses constitute personal data because they allow for the “precise” identification of those individuals. That judgment, in the so-called Scarlet case, concerned the extent to which internet service providers (ISPs) are obliged to address copyright infringement by users.

In 2016, in the so-called Breyer case, the CJEU ruled again in a similar case, but this time on whether ‘dynamic’ IP addresses constitute personal data.

While ‘static’ IP addresses are linked to individual computers – they act as a signpost for the transmission of data to the correct recipient – dynamic IP addresses serve the same purpose but change each time a new connection to the internet is established, and do not on their own enable a link to be established between a given computer and the physical connection to the network used by an ISP.

The CJEU said dynamic IP addresses will constitute personal data to an online service provider if it “has the legal means which enable it to identify the data subject” owing to additional data being held about that person by the ISP. The case law developed in the Breyer case informed guidance the UK’s Information Commissioner’s Office (ICO) produced last year on anonymisation.
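The mechanics of the Breyer reasoning can be pictured as a join between two datasets. The following minimal Python sketch – in which all records and field names are invented for illustration – shows how a dynamic IP address only becomes identifying once it is matched against the additional connection records that only the ISP holds:

    from datetime import datetime

    # Hypothetical illustration of the Breyer scenario: all records and
    # field names below are invented for this sketch.

    # What an online service provider sees in its access log: a dynamic
    # IP address and a timestamp, but no name.
    service_log = [
        {"ip": "203.0.113.7", "seen_at": datetime(2016, 10, 19, 14, 32)},
    ]

    # The 'additional data' held only by the ISP: which subscriber was
    # assigned which dynamic IP address, and when.
    isp_assignments = [
        {"ip": "203.0.113.7",
         "start": datetime(2016, 10, 19, 14, 0),
         "end": datetime(2016, 10, 19, 15, 0),
         "subscriber": "subscriber-42"},
    ]

    def identify(entry):
        """Join a log entry against the ISP's assignment records.

        Identification only succeeds with the ISP's additional data;
        without it, the dynamic IP address points at nobody.
        """
        for a in isp_assignments:
            if a["ip"] == entry["ip"] and a["start"] <= entry["seen_at"] <= a["end"]:
                return a["subscriber"]
        return None

    print(identify(service_log[0]))  # -> 'subscriber-42'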

In a similar vein, the CJEU ruled in 2023 that vehicle identification numbers can constitute personal data where a business accessing that data, such as in the context of third-party after-sales repair and maintenance services, can match those numbers with other information to identify individuals.

This trend of an apparent broadening of the concept of personal data was arrested by the CJEU in September last year in a case between the European Data Protection Supervisor – responsible for overseeing EU institutions’ compliance with data protection laws – and the Single Resolution Board (SRB), which has a supervisory role in EU banking.

The SRB case raised the question of whether information from which a recipient cannot identify anyone constitutes personal data in that recipient’s hands. The CJEU ruling was clear that even if information is plainly personal data in the hands of the provider, it may not be personal data in the hands of the recipient and, if so, can be treated as anonymised and out of scope of the GDPR. Whether that is the case in practice will depend, the court said, on whether the recipient of the data has the means to identify someone from the data.

This is a question of fact, and businesses should note that the test for whether data is fully anonymised is extremely stringent. Among other things, it requires an assessment of the reasonable likelihood of reidentification – an assessment that must be made through the lens of a so-called ‘motivated intruder’. If there is any reasonable means of reverse engineering the data back to an identifiable individual, the data should be treated as pseudonymised data, which still constitutes personal data under the GDPR.
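The distinction can be made concrete with a minimal Python sketch of keyed tokenisation – the key, identifier and function names are all invented for the example. Because the provider retains the key, and a motivated intruder who obtains it could re-link the tokens, the output is pseudonymised rather than anonymised:

    import hashlib
    import hmac

    # Minimal sketch of keyed tokenisation; the key and identifier are
    # invented for the example. Because the provider retains the key, the
    # output is pseudonymised - not anonymised - in the provider's hands.

    SECRET_KEY = b"held-only-by-the-data-provider"

    def pseudonymise(identifier: str) -> str:
        """Replace a direct identifier with a deterministic token."""
        digest = hmac.new(SECRET_KEY, identifier.encode(), hashlib.sha256)
        return digest.hexdigest()[:16]

    token = pseudonymise("jane.doe@example.com")

    # A recipient holding only the token cannot reverse it. But a
    # motivated intruder who obtains the key - or a candidate list of
    # identifiers to test against it - can re-link the token, because
    # the mapping is deterministic:
    assert pseudonymise("jane.doe@example.com") == token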

However, the SRB ruling does open up the possibility of sharing data in a sufficiently masked form that individuals are not identifiable in the hands of the recipient.

What was also clear from the judgment, however, is that even if a controller is providing information that cannot be identified by a third-party recipient, the controller is not relieved of its GDPR obligations – it must, for example, meet transparency duties it owes data subjects.

Codification of the SRB ruling – but not everything will change

The principle established in the SRB ruling is set to be codified as part of the EU’s planned new Digital Omnibus. This would not only represent the codification of evolving EU case law – confirming that information is not necessarily personal data for every entity simply because another entity can identify the individual – but also provide a much stronger basis for businesses to argue that data received or acquired in pseudonymised or aggregated form should not be regarded as personal data.

However, although the proposed changes to the definition of personal data would help clarify whether data qualifies as personal data in situations where the data subject is ‘identifiable’ but not yet ‘identified’, they do not alter the wide interpretation given to the term ‘identified’ under the GDPR: an individual is considered to be identified if they are singled out. For example, a biometric template is unique to the person from whom it is derived, so that person is singled out even if the controller has no means of associating the template with their real-world identity. The scenarios impacted by the change in the definition may therefore be quite limited.

Some campaign groups have argued that device identifiers would no longer constitute personal data if the Commission’s proposals were accepted, on the basis that the individual using the device could not be directly identified. However, this is likely to be a misconception. Although in some situations – for example, in the case of a shared computer – an IP address may qualify as personal data only because the data subject is ‘identifiable’, in many others the data subject can be considered ‘identified’ on the basis that the individual is singled out. A mobile phone, for example, is generally considered a personal device; an identifier of that device is therefore effectively an identifier of the data subject.

Nowadays it is widely accepted that data associated with a vehicle identification number (VIN) is personal data because the owner of the vehicle can be identified from the VIN. Although the manufacturer may not have sold the vehicle directly and therefore may not have access to information which identifies the owner, any person can see the VIN imprinted on a vehicle which is parked on the street. Under the new test, would it qualify as ‘reasonably likely’ that the manufacturer of the vehicle can obtain information to identify the owner based on the fact that this information is somewhere in the public domain? Does it make a difference that the manufacturer could potentially come to know the identity of the owner in the future if a warranty claim was made? The Breyer judgment considered data to be personal data based on hypothetical future identification. The proposed revised definition does not in any way exclude this interpretation.

What this might mean for AI development

If the Commission’s proposals are implemented as drafted – and significant scrutiny is still expected from EU lawmakers in both the European Parliament and the Council of Ministers during the course of this year – then it could benefit AI developers.

If the developer has access to the sources of the data it acquires for training its AI models, and those sources enable identification, the data would still constitute personal data, as there would be identification via “means reasonably likely” to be used by the controller. While the developer may be able to sever the link with the original data in such a way that it could argue the data is then anonymised, this may still be difficult in practice – if data is obtained from the internet, for example, including that data in a search query could return a link to the original source.

It may be easier for AI developers to demonstrate that the models themselves are anonymised. The EDPB’s guidance on AI model anonymity essentially provides that an AI model will constitute or comprise personal data if it is possible for any person to obtain personal data used in training in one of two ways: by extracting it from the parameters of the model, which were created using personal data but do not contain records of personal data; or by prompting the model, whether intentionally or unintentionally, to produce outputs containing training data.
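The second limb can be pictured as a probing exercise. The following Python sketch – in which the model stub, probe prompt and training snippets are all invented for illustration, and a real assessment would call the actual model with far more systematic probing – tests whether any verbatim span of training data reappears in the model’s output:

    # Sketch of a regurgitation probe in the spirit of the EDPB's second
    # limb. The model stub, probe prompt and training snippets are all
    # invented for illustration.

    def model_generate(prompt: str) -> str:
        # Stand-in for a real model API call.
        return "Contact Jane Doe at 17 Elm Street about the warranty claim."

    TRAINING_SNIPPETS = [
        "Jane Doe at 17 Elm Street",
        "John Smith, account 4421-889",
    ]

    def ngrams(text: str, n: int = 4):
        """All n-word spans of a text, lowercased."""
        words = text.lower().split()
        return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

    def regurgitates(prompt: str, n: int = 4) -> bool:
        """True if any n-word span of a training snippet reappears
        verbatim in the model's output for this prompt."""
        output = ngrams(model_generate(prompt), n)
        return any(output & ngrams(snippet, n) for snippet in TRAINING_SNIPPETS)

    print(regurgitates("Who handled the warranty claim?"))  # -> True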

Developers currently have to consider the identifiability of this data from the perspective of any person who may access the model. Limiting the assessment to the developer’s own means could make it more straightforward to satisfy the test of anonymity.

In that scenario, the GDPR would not be engaged, so AI developers would not have to consider issues such as lawful bases for processing, conditions for processing special category or criminal offence data, or data subject rights such as the rights of access, rectification and erasure, or the right to object where processing depends on legitimate interests.

That said, because it is not clear what constitutes means ‘reasonably likely’ to be used to enable identification, there is the potential for inconsistent interpretation and enforcement. It is certainly likely that the scope and boundaries of a revised definition would be tested by privacy advocates and campaigning groups.

Also on AI, the proposed new Digital Omnibus includes provisions that would explicitly recognise legitimate interests as a lawful basis for processing personal data for AI training and operation, provided controllers conduct a balancing test that considers data subjects’ rights and interests, ensure transparency, and allow opt-outs. While this broadly follows the approach already taken by a number of EU supervisory authorities and by the ICO in the UK, its explicit recognition would assist many AI developers.

However, there is an inherent tension. If an AI developer were to rely on legitimate interests as its lawful basis for processing personal data, that would still carry significant regulatory obligations – for example, the balancing test and the risk of objection. Consequently, there would be a clear temptation for AI developers to lean heavily on the relative approach to personal data and to argue that the GDPR is not engaged at all. A mixed approach might be possible – treating some acquired data as personal data and the rest as anonymised – but the granular analysis required to achieve that would be a significant challenge.

The impact on controller-processor arrangements

Since the SRB ruling, there has been significant debate within the data protection community over whether the judgment applies in the context of controller-processor arrangements or only to the disclosure of data to third parties – i.e. whether data ceases to be personal data if, in the hands of a processor or sub-processor, it does not enable a person’s identification.

Consensus on that issue appears to be building, with the prevalent view being that such data remains personal data however far down the processing chain it goes, even if, from the perspective of a particular processor, it is not identifiable. That is because the controller retains the contractual means, all the way down the processing chain, to reassemble the data and reidentify individuals.

The root of that consensus lies in how ‘processor’ is defined in the GDPR: the definition makes clear that a processor’s processing of personal data is done “on behalf of the controller”. On that basis, if information is identifiable in the hands of the controller, that is determinative of the matter.

Of course, it is for the data protection authorities and, ultimately, the courts to provide a definitive answer on this question. In the UK, we are expecting the ICO to provide some comment on the matter over the next couple of months.
