ICO launches guidance on AI and data protection


The Information Commissioner’s Office (ICO) has published an 80-page guidance document for companies and other organizations on the use of artificial intelligence (AI) in accordance with data protection principles.

The guide is the result of two years of research and advice from Reuben Binns, an associate professor at Oxford University’s Department of Computer Science, and the ICO’s AI team.

The guidelines set out what the ICO sees as “best practice for data protection-compliant AI” and how it interprets data protection law as it applies to AI systems that process personal data. The guidance is not a legal code. It provides advice on interpreting the relevant law as it applies to AI, as well as recommendations on best practices for organizational and technical measures to mitigate the risks to people that AI can cause or exacerbate.

The aim is to create a framework for “auditing AI, with an emphasis on best practices for compliance with data protection – regardless of whether you are designing your own AI system or implementing one from a third party”.

It comprises, it says, “auditing tools and procedures that we will use in audits and investigations; detailed guidance on AI and data protection; and a toolkit designed to provide further practical assistance to organizations testing the compliance of their own AI systems”.

It is also an interactive document that invites further communication with the ICO.

These guidelines are intended to address two audiences: “People with a compliance focus, such as data protection officers (DPOs), general counsel, risk managers, senior executives and the ICO’s own auditors; and technology specialists, including machine learning experts, data scientists, software developers and engineers, and cybersecurity and IT risk managers.”

It identifies two security risks that can be exacerbated by AI, namely the “loss or misuse of the large amounts of personal data that are often required to train AI systems; and software vulnerabilities to be introduced as a result of the introduction of new AI-related code and infrastructure”.

As the guide shows, standard practices for developing and deploying AI necessarily involve processing large amounts of data. There is therefore an inherent risk that this will not comply with the principle of data minimization.

Under the GDPR [the EU General Data Protection Regulation], as former Computer Weekly journalist Warwick Ashford put it, this principle requires organizations not to retain data any longer than is strictly necessary and not to change the use of the data from the purpose for which it was originally collected, while at the same time deleting any data at the request of the data subject.

While the guide notes that privacy and “AI ethics” overlap, it does not seek to “provide general ethical or design principles for your use of AI”.

AI for the ICO

What is AI in the eyes of the ICO? “We use the umbrella term ‘AI’ because it has become a standard industry term for a range of technologies. One prominent area of AI is machine learning (ML), which is the use of computational techniques to create (often complex) statistical models using (typically) large amounts of data. These models can be used to make classifications or predictions about new data points. While not all AI involves ML, most of the recent interest in AI has been driven by ML in some way, be it image recognition, speech-to-text, or credit risk classification.

“These guidelines therefore focus on the privacy challenges that ML-based AI can pose, while recognizing that other types of AI can create different privacy challenges.”

Of particular interest for the ICO is the concept of “explainability” in AI. The guide continues: “In collaboration with the Alan Turing Institute, we created a guide on how organizations can best explain their use of AI to individuals. This resulted in the Explaining decisions made with AI guidance, which was published in May 2020.”

The guidance contains commentary on the distinction between a “controller” and a “processor”. It states: “Organizations that determine the purposes and means of processing will be controllers regardless of how they are described in a contract for processing services.”

This could be relevant to the controversy surrounding US data analytics firm Palantir’s involvement in the NHS data store project, where Palantir has repeatedly stressed that it is merely a processor, not a controller, a role that falls to the NHS in this contractual relationship.

Biased data

The guidance also discusses issues such as bias in data sets that causes AI systems to make biased decisions. Its advice includes: “In cases of unbalanced training data, it may be possible to balance it out by adding or removing data about under/overrepresented subgroups of the population (e.g. adding more data points on loan applications from women).

“In cases where the training data reflects previous discrimination, you can either change the data, change the learning process, or change the model after training.”
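The rebalancing step described in the guidance can be illustrated with a minimal Python sketch. It is not taken from the ICO document: the `rebalance` function, the `applicant_sex` field and the sample records are all hypothetical, and duplicating existing records is only a naive stand-in for collecting genuinely new data points about an underrepresented group.

```python
import random
from collections import defaultdict


def rebalance(records, group_key, mode="oversample", seed=0):
    """Return a new list in which every subgroup has the same number of records.

    mode="oversample": grow smaller groups to the size of the largest one
                       by duplicating randomly chosen records.
    mode="undersample": shrink larger groups to the size of the smallest one
                        by keeping a random subset.
    """
    if mode not in ("oversample", "undersample"):
        raise ValueError("mode must be 'oversample' or 'undersample'")
    if not records:
        return []

    rng = random.Random(seed)  # fixed seed so the sketch is reproducible

    # Split the records by the value of the (hypothetical) group field.
    groups = defaultdict(list)
    for record in records:
        groups[record[group_key]].append(record)

    sizes = [len(members) for members in groups.values()]
    target = max(sizes) if mode == "oversample" else min(sizes)

    balanced = []
    for members in groups.values():
        if len(members) >= target:
            # Keep a random subset of exactly `target` records.
            balanced.extend(rng.sample(members, target))
        else:
            # Keep every original record, then duplicate random ones.
            balanced.extend(members)
            balanced.extend(rng.choices(members, k=target - len(members)))
    return balanced


# Hypothetical loan applications: 8 from men, 2 from women.
applications = (
    [{"applicant_sex": "male", "approved": i % 2 == 0} for i in range(8)]
    + [{"applicant_sex": "female", "approved": True} for _ in range(2)]
)

oversampled = rebalance(applications, "applicant_sex", mode="oversample")
undersampled = rebalance(applications, "applicant_sex", mode="undersample")
```

Balancing group sizes in this way does not address the second case the guidance raises, where the labels themselves reflect past discrimination. That case calls for changing the data, the learning process or the trained model.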

Commenting on the guidelines, Simon McDougall, the ICO’s Deputy Commissioner for Regulatory Innovation and Technology, said: “Understanding how to assess compliance with data protection principles can be challenging in the context of AI. From the heightened and sometimes novel security risks posed by the use of AI systems to the potential for discrimination and data bias, it is difficult for technology specialists and compliance experts to navigate their way to compliant and functional AI systems.

“The guidelines provide recommendations on best practices and technical measures that companies can take to mitigate the risks created or exacerbated by the use of this technology. They reflect current AI practices and are practically applicable.”