AI & GDPR: Minimise compliance costs with anonymisation & pseudonymisation!

True to the motto “the future is now!”, artificial intelligence is regarded as the key technology of the future but has long since been established in the present all over the world. More and more companies are discovering the advantages of machine learning and deep learning as well as Big Data for themselves. Processes can be accelerated, optimised and made more profitable. This is made possible by algorithm-based automated data processing that takes the place of human decisions and evaluations.

The GDPR is often called an obstacle to development because of its stringent requirements. In fact, it regulates precisely the area of data processing that matters for the application of artificial intelligence. This is where the instruments of anonymisation and pseudonymisation become relevant: they can help AI companies overcome the challenges the GDPR creates and reduce their compliance effort.

However, only those AI companies that apply anonymisation and pseudonymisation in a legally sound manner can benefit from this advantage. We explain step by step which challenges you may encounter when applying anonymisation and pseudonymisation, which benefits these tools offer, and what you should be aware of when using them.

Operational challenges in implementing GDPR requirements for automated processes

The central legal question is whether the GDPR applies to the data an AI company processes. The GDPR does not apply where no personal data are processed. An AI company can reach this position through anonymisation: the GDPR does not apply to anonymised data, so entrepreneurs in the AI sector who work only with such data fall outside its data protection requirements.

However, if personal data are processed and a company has to meet the data protection requirements of the GDPR, there are two steps to follow:

Step 1: Is the data processing legally justified? This is the case, for example, if the data subject has consented to the processing of their personal data.

Step 2: The operational implementation of the GDPR. What requirements does the GDPR impose on the specific data processing, and how can they be implemented?

Here, AI users encounter practical challenges both in justifying the data processing (step 1) and in implementing the requirements of the GDPR (step 2), such as staffing and financial burdens. Pseudonymising the data would provide considerable relief, since pseudonymisation reduces the risk of data protection incidents and is therefore given a privileged role in the GDPR.

Anonymise data – avoid the GDPR!

The advantage of anonymisation is obvious: the GDPR does not apply to anonymised data, so an AI company can process such data freely. From a data protection point of view, nothing then stands in the way of using artificial intelligence. In many cases, however, AI users need the personal information itself, so that deleting it, for example, would be a disadvantage for them. Anonymisation is then out of the question.

Exploit the balancing of interests for the benefit of the AI user!

Pseudonymisation makes things simpler for AI users. It plays a role in step 1, because data may be processed whenever a legal basis for the processing exists. Consent is not the only such basis.

Another possible legal basis is processing on the grounds of the overriding legitimate interests of the data controller. These interests must, however, be weighed against those of the data subject. Pseudonymisation is attractive for AI companies here because pseudonymised data is better protected than data that has not been pseudonymised, so the balancing of interests tends to come out in their favour. The data processing can then be lawful on this basis.

Eliminate requests of data subjects by means of pseudonymisation!

Pseudonymisation also has the advantage for AI companies that data subject requests may fall away. Implementing the requirements that the GDPR places on data controllers in connection with data subject rights (step 2) becomes easier, because the time-consuming handling of data subjects' enquiries, and the costs and workload that go with it, may no longer arise. AI users who pseudonymise personal information can avoid these issues. According to Art. 11 para. 2 GDPR, data subjects can no longer exercise these rights if the data controller is not in a position to identify them. This can include pseudonymous data.

AI entrepreneurs face these challenges because of the principle of transparency. Under this principle, the data subject must always be informed in a comprehensible way about how their data is used in AI systems. If artificially intelligent systems replace human decisions, the decision-making process must be explainable.

In addition, the data controller must be able to demonstrate compliance in general: the controller is obliged to provide evidence that the legal requirements of the GDPR are met (accountability). As a result, the processing must be presentable in a comprehensible way not only to the data subjects but, in case of doubt, also to business partners and authorities.

Where automated processes are used, these requirements lead to a general compliance problem, since automated processes are harder for the data controller to track and understand. Particularly in the context of deep learning, AI users often do not even know how their system will evolve, since they are working with changing, self-adapting models. Depending on the models selected (keyword: black box), fulfilling transparency obligations and information rights requires considerable documentation effort.

This problem is reflected in specific requirements of the GDPR, which gives special rights to persons affected by automated decisions (Arts. 13-15 GDPR). In such cases, the data subject also has the right to meaningful information about the logic involved, as well as the significance and envisaged consequences of the automated processing for them (Art. 13 (2) (f), Art. 15 (1) (h) GDPR). Without pseudonymisation, handling data subjects' enquiries means higher costs, more staff and greater compliance effort.

Risk minimisation by pseudonymisation – interaction of the compliance areas

Pseudonymised data is also advantageous in the context of the data protection impact assessment. The idea behind this assessment is that the data controller analyses and evaluates the risks of the data processing to be carried out. It is not always required, but it is required in particular for profiling and other types of automated decisions that are often used in the context of AI.

In addition, the data protection impact assessment is not a one-off exercise: the development and production processes must be monitored continuously in order to detect new risks and to meet the accountability requirements. This is often difficult because AI systems are developed in an agile way, and it requires good communication between developers and legal advisers who can provide legal guidelines for development. Carrying out the data protection impact assessment thus interacts with other compliance areas.

However, if the data is pseudonymised, the risk of the data processing to be assessed is lower than usual, and it may not be necessary to consult the supervisory authority, as would be the case with a high level of risk. Pseudonymised data often even removes the obligation to carry out a data protection impact assessment at all. Pseudonymisation also makes it easier for the data controller to meet the accountability obligations. Where an impact assessment is required, it must be carried out before the data processing begins, that is, while the AI system is still being developed.

What are anonymised and pseudonymised data?

Anonymous data are the flip side of personal data: data is either personal or anonymous. The consequence is that the GDPR does not apply to anonymous data. Anonymous data can arise in two ways. First, data can be anonymous from the outset simply because it does not relate to an identified or identifiable person. Second, data that was initially personal can be anonymised afterwards.

In contrast to anonymised data, pseudonymised data is still personal data; it is merely a subcategory of it. The GDPR therefore applies, and the processing of pseudonymous data is prohibited unless a legal basis justifies it.

Pseudonymous data is information that can be attributed to a person only by using additional information that is kept separately and protected (see Article 4 (5) GDPR). Without that additional information, the data can no longer be attributed to a person. The difference from anonymisation is that the information which makes identification possible is not deleted but stored separately. The guiding principle is functional separation: it must be technically ensured that users of the pseudonymised data have no access to the separate information and therefore cannot identify a person.
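As a rough illustration of functional separation, the following Python sketch keeps the link between identities and pseudonyms in an object that is separate from the working dataset. The class name PseudonymVault, the field names and the sample records are assumptions made for this example and are not prescribed by the GDPR; in practice the vault would sit in a separately secured system to which users of the pseudonymised data have no access.

```python
import secrets


class PseudonymVault:
    """Hypothetical store for the separately kept identity <-> pseudonym mapping."""

    def __init__(self):
        self._by_identity = {}
        self._by_pseudonym = {}

    def pseudonym_for(self, identity: str) -> str:
        # Reuse an existing pseudonym so the same person always gets the same token.
        if identity not in self._by_identity:
            token = secrets.token_hex(8)  # random, carries no information about the person
            self._by_identity[identity] = token
            self._by_pseudonym[token] = identity
        return self._by_identity[identity]

    def reidentify(self, pseudonym: str) -> str:
        # Reversing the mapping is only possible with access to the vault.
        return self._by_pseudonym[pseudonym]


def pseudonymise(records, vault, id_field="email"):
    """Return copies of the records with the identifying field replaced by a pseudonym."""
    result = []
    for record in records:
        copy = dict(record)  # leave the raw record untouched
        copy["pseudonym"] = vault.pseudonym_for(copy.pop(id_field))
        result.append(copy)
    return result


vault = PseudonymVault()
raw = [
    {"email": "anna@example.com", "purchases": 3},
    {"email": "ben@example.com", "purchases": 7},
    {"email": "anna@example.com", "purchases": 1},
]
working_data = pseudonymise(raw, vault)
```

Here working_data contains only pseudonyms and non-identifying fields, while re-identification goes through vault.reidentify(...). Whether such a setup satisfies the legal requirements in a given case still depends on how strictly access to the vault is controlled.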

In order to fully exploit the advantages of anonymisation and pseudonymisation, it makes sense for AI companies to consider these procedures at an early stage of development. One should begin with the raw data, so that identifying details such as the address are already masked as part of pseudonymisation. Care should be taken to ensure that no non-pseudonymised personal information finds its way into machine learning. Only in this way can the advantages of anonymisation and pseudonymisation be realised and privacy by design be implemented effectively.
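Starting with the raw data could look roughly like the sketch below, which replaces direct identifiers with a keyed hash before a record reaches the training pipeline. The keyed hash (HMAC-SHA256) is one possible technique chosen for this illustration and is not named in this article; SECRET_KEY, the field names and the sample record are hypothetical, and the key would have to be stored separately from the training environment.

```python
import hashlib
import hmac

# Assumption: in practice the key is loaded from a separately protected location.
SECRET_KEY = b"replace-with-a-key-stored-separately"

DIRECT_IDENTIFIERS = ("name", "address")  # hypothetical field names


def keyed_pseudonym(value: str, key: bytes = SECRET_KEY) -> str:
    """Derive a stable pseudonym with HMAC-SHA256.

    Without the key, the pseudonym cannot be recomputed from a known name and address.
    """
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def prepare_for_training(raw_record: dict) -> dict:
    """Replace the direct identifiers with a single pseudonym before training."""
    identity = "|".join(str(raw_record[field]) for field in DIRECT_IDENTIFIERS)
    features = {k: v for k, v in raw_record.items() if k not in DIRECT_IDENTIFIERS}
    features["subject_id"] = keyed_pseudonym(identity)
    return features


raw_record = {
    "name": "Anna Example",
    "address": "Musterstr. 1, Berlin",
    "age": 34,
    "basket_value": 59.9,
}
training_row = prepare_for_training(raw_record)
```

The resulting training_row keeps the features the model needs plus a stable subject_id, so records of the same person can still be linked without the name or address entering the model. The remaining fields may still allow identification in combination, which is why this is pseudonymisation and not anonymisation.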

Conclusion: Anonymisation and pseudonymisation are effective means of solving data protection problems

The GDPR poses challenges for the application of algorithm-based artificial intelligence. However, it does not stop innovation. With the instruments of anonymisation and pseudonymisation, the GDPR gives users procedures they can use to mitigate these challenges. This allows AI users to make use of Big Data, to apply machine learning and deep learning systems in accordance with data protection law, and to avoid rising costs, workloads and legal violations.

However, only those AI companies that use anonymisation and pseudonymisation in accordance with the legal requirements benefit from this advantage. If data has not been properly anonymised, the GDPR applies, and the company faces sanctions for failing to process the data in accordance with the GDPR, even if it was unaware that the GDPR applied. We therefore recommend obtaining appropriate legal advice on this matter.

We can help you create a concept for anonymisation. It must take into account all the means that the data controller or other persons might use to identify a person. Identification of a person must be made irreversibly impossible. The costs and time required for identification must also be taken into account. The law does not lay down technical requirements for the process of anonymisation; deleting or generalising the identifying features are examples of techniques that may be considered.
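Deletion and generalisation of identifying features can be sketched as follows. The field names, the ten-year age bands and the two-digit postcode areas are assumptions chosen for this example; the group-size check is a common heuristic and not a legal test, since whether data counts as anonymous depends on all the means that might be used for identification.

```python
from collections import Counter


def generalise(record: dict) -> dict:
    """Delete the direct identifier and coarsen the remaining identifying features."""
    decade = (record["age"] // 10) * 10
    return {
        # "name" is deliberately not copied: deletion of the direct identifier
        "age_band": f"{decade}-{decade + 9}",             # exact age -> ten-year band
        "postcode_area": record["postcode"][:2] + "***",  # full postcode -> area
        "diagnosis": record["diagnosis"],
    }


def smallest_group(records) -> int:
    """Size of the smallest group sharing the same generalised features."""
    groups = Counter((r["age_band"], r["postcode_area"]) for r in records)
    return min(groups.values())


patients = [
    {"name": "A", "age": 34, "postcode": "10115", "diagnosis": "flu"},
    {"name": "B", "age": 37, "postcode": "10117", "diagnosis": "asthma"},
    {"name": "C", "age": 52, "postcode": "20095", "diagnosis": "flu"},
    {"name": "D", "age": 58, "postcode": "20099", "diagnosis": "migraine"},
]
released = [generalise(p) for p in patients]
```

In this sample, smallest_group(released) is 2, meaning every released record shares its age band and postcode area with at least one other record. A small value signals that individuals may still stand out and that further generalisation or deletion should be considered.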

In order to apply pseudonymisation successfully and reap its full benefits, companies must ensure that the type of pseudonymisation they use complies with the requirements of the GDPR. Guidelines for pseudonymisation solutions are now available in a White Paper issued by the Ministry of the Interior, which entrepreneurs and their legal advisers can follow. One possibility, for example, is that the data subject chooses a user ID or is assigned one by a third party. A data controller who knows the identity of the data subject can also assign that person a pseudonym in the form of a code number.

If you need advice or have questions, ISiCO Datenschutz GmbH is your ideal contact. Together with our affiliated law firm Schürmann Rosenthal Dreyer Rechtsanwälte, we work successfully in the area of data protection law. When you implement the requirements of the GDPR in a legally compliant way, potential customers build trust in you and choose you over the competition. This is how you become a flagship among European AI companies! We know what matters, and you can count on our expertise!

Please also read our article on “Artificial Intelligence and Data Protection: application cases”!

Do you need advice on your AI project? Trust in the expertise of our ISiCO consultants. Please contact us directly.