Simplified GDPR compliance using MPC cryptography
The European Commission and the United Nations have sponsored three studies that describe a technique called “Secure Multi-Party Computation” (MPC) as a “state of the art privacy-preserving tool”.
MPC allows organizations to use data while it is encrypted. For example, several hospitals can collaborate in a clinical study without revealing patient records to each other, and only revealing the conclusion of the study. An NGO could analyze encrypted HR data to benchmark gender pay gaps without being able to see the salary data from participating organizations. An airport could run border control checks against a biometric database, without using the biometric data in the clear.
One of the key insights in the UN and EC studies is that the data is considered “anonymous” after the initial encryption step (more precisely, the “secret-sharing” step). Furthermore, according to these studies, processing personal data with MPC is not considered “data processing” under the GDPR. This is because MPC provides uniquely strong security and privacy features to data owners.
In this blog post, we highlight the key insights and provide links to the studies.
Background to Multi-Party Computation
MPC is a cryptographic technique that enables parties to perform computations on data such that each party learns nothing beyond its own input and, optionally, the output of these computations. In other words, the data input by one party remains hidden from the other parties.
MPC works as follows. The data is encrypted by randomly splitting it into so-called “secret shares”, which have the property that any single share, and any group of shares smaller than the number needed for reconstruction, reveals no information about the data. These secret shares are distributed among multiple servers, each controlled by a different trustee. The servers are set up to perform the necessary computation jointly. There can be two, three or more trustees, depending on the required trust model, and privacy is ensured as long as a sufficiently large subset of the trustees acts honestly (i.e., those trustees do not collude).
The beauty is that this technique can also be used by a single organization: after data collection, the data can be stored in secret-shared form, and several internal servers perform the secure computation. In this model, even inside a single organization, data ‘at rest’ and ‘in use’ are fully protected against prying eyes.
While the theoretical foundations of MPC have been in development since the 1980s, MPC has become practical for real-world applications only in the last few years, thanks to a combination of recent theoretical breakthroughs and advances in computer hardware and networking technology. (See this Wiki for an excellent overview.)
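To make the secret-sharing step concrete, here is a minimal Python sketch of additive secret sharing, one common way to split a value into shares. It is an illustration under stated assumptions, not the scheme used in the studies: the modulus `PRIME`, the function names `share` and `reconstruct`, and the example salary are all hypothetical choices made for this post.

```python
import secrets

# Assumption: shares are integers modulo a public prime, chosen here for illustration.
PRIME = 2**61 - 1


def share(secret: int, n_servers: int = 3) -> list[int]:
    """Split `secret` into n additive shares that sum to it modulo PRIME."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret must be in [0, PRIME)")
    # The first n-1 shares are uniformly random numbers.
    random_shares = [secrets.randbelow(PRIME) for _ in range(n_servers - 1)]
    # The last share is chosen so that all shares add up to the secret.
    last_share = (secret - sum(random_shares)) % PRIME
    return random_shares + [last_share]


def reconstruct(shares: list[int]) -> int:
    """Recombine ALL shares; with any share missing, the rest look uniformly random."""
    return sum(shares) % PRIME


salary = 52_000                 # hypothetical personal data item
shares = share(salary)          # one share per trustee's server
assert reconstruct(shares) == salary
```

In this particular scheme every share is needed to recover the value, so a single honest trustee is enough to keep the data hidden. Threshold schemes such as Shamir secret sharing relax this so that only a chosen number of shares is required.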
Legal Opinion on Privacy properties of MPC
Three recent studies provide insight into the privacy properties of Secure Multi-Party Computation. In the next sections, we recap some of the most relevant paragraphs.
If you study the reports in more detail, you will see that MPC gives substance to GDPR concepts such as:
“Data protection by design and by default” and “appropriate technical and organisational measures” (Art. 25 and Art. 24)
The sole requirement for confidentiality to hold throughout the business process is proper segregation of duties between the trustees (the administrators of the MPC servers). The fact that data remains encrypted during a business process is a significant paradigm shift versus today’s use of data. MPC also provides a significant level of protection: to steal or leak data, a malicious actor needs to compromise the admin keys of multiple trustees (this can be compared to protection by two-factor authentication).
“‘Processing’ means any operation .. performed on personal data” (Art. 4.2)
The output of the initial encryption step (the “secret shares” used during the operation) is considered non-personal data. As we will see below, legal experts explain that operations performed on the secret-shared data are not considered processing of personal data, due to MPC’s unique data protection properties.
“Purpose limitation” (Art. 5.1.b)
Parties involved in the computation have explicit control over what is calculated. If trustees do not agree to a computation, it cannot be executed.
“Data minimization” (Art. 5.1.c)
Selected parties receive the result of the calculation, and will not know more than this result.
“Rights of the data subject” (Chapter III)
As personal data is not exchanged, the data owner maintains control over the data, simplifying access (Art. 15), accuracy/rectification (Art. 16), erasure (Art. 17), restriction of processing (Art. 18), etc.
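The purpose-limitation and data-minimization points above can be illustrated with a small simulation. The sketch below is hypothetical: the `Server` class, its `approved` set, and `run_sum` are names invented for this post, and all servers run in one Python process rather than over a network. It only shows the idea that each trustee's server works on shares, refuses computations its trustee has not approved, and that only the final result is ever recombined.

```python
import secrets

PRIME = 2**61 - 1  # assumption: public modulus for additive shares


def share(value, n_servers):
    """Split `value` into additive shares modulo PRIME."""
    parts = [secrets.randbelow(PRIME) for _ in range(n_servers - 1)]
    return parts + [(value - sum(parts)) % PRIME]


class Server:
    """One MPC server, run by one trustee (hypothetical model)."""

    def __init__(self, approved):
        self.approved = set(approved)  # computations this trustee agreed to
        self.received = []             # shares only, never cleartext inputs

    def receive(self, share_value):
        self.received.append(share_value)

    def partial_sum(self, computation):
        # Purpose limitation: refuse anything the trustee did not approve.
        if computation not in self.approved:
            raise PermissionError(f"trustee has not approved {computation!r}")
        # Adding shares locally yields a share of the total.
        return sum(self.received) % PRIME


def run_sum(servers, computation="total_salary"):
    """Data minimization: only the aggregated result is reconstructed."""
    partials = [s.partial_sum(computation) for s in servers]
    return sum(partials) % PRIME


servers = [Server({"total_salary"}) for _ in range(3)]
for salary in (52_000, 61_500, 48_250):  # hypothetical inputs from three organizations
    for server, part in zip(servers, share(salary, len(servers))):
        server.receive(part)

total = run_sum(servers)
print(total)  # 161750
```

If any one of the three servers is created without `"total_salary"` in its approved set, `run_sum` raises `PermissionError` and no result is produced. A production MPC system would additionally need authenticated channels and protection against servers that deviate from the protocol, which this sketch leaves out.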
Summary from UN Report on Privacy Techniques
The UN’s recent Handbook on Privacy-Preserving Computation Techniques summarizes the impact of MPC as follows:
“One of the first significant precedents for secure multiparty computation was reached in Estonia with the Private Statistics project in 2015. In the project, 10 million identifiable tax records were linked with 600 000 identifiable education records and statistically analysed using secure multiparty computation. The Data Protection Agency, after studying the technical and organisational controls of the system, stated that no personal data was processed. The precedent has also been upheld with the MPC servers hosted in the public cloud.
The PRACTICE project (European Commission Framework Programme 7) spent significant effort in analysing legal aspects of secure computing technologies. The report studies the Estonian precedent described above under the European General Data Protection (GDPR) regulation and finds that precedent can be upheld under the GDPR.
Further research has been performed by the SafeCloud project and SODA project.”
SODA H2020 Study: Relevant Paragraphs
The report on 'Scalable Oblivious Data Analytics' states:
“The fact that the data fragmentation procedure [secret-sharing step] as such is processing of personal data does not mean that the output data has to fall under the scope of the GDPR. On the contrary, based on the arguments put forward here the data shards that have undergone the partitioning are considered to be non-personal data.”
It continues on page 52:
“If personal data is turned into non-personal data, then the subsequent storage of the data pieces should not be considered ‘data processing’ within the frames of Art. 4 Nr. 2 and the data protection provisions in general. Following this logic, the analytics carried out using MPC shall be considered as analysis carried out with feature data.”
“As to the technical details, structure and design of the processing, MPC is a state-of the-art privacy preserving tool. The cryptographic solutions in use protect the data from intruders during the analysis. Intruders in this sense mean unauthorised external adversaries not intended to have access to the anonymised data.”
A third relevant study on the privacy properties of MPC is by the EU PRACTICE consortium, which the UN Handbook quoted above describes as a European Commission Framework Programme 7 project. We link to it below under 'Sources'.
MPC enables analysis of, and collaboration on, sensitive data. Please don’t hesitate to reach out with further questions or suggestions. Toon Segers (for more info: firstname.lastname@example.org)
Sources
- EU Horizon 2020 "Scalable Oblivious Data Analytics" (SODA), Deliverable D3.5. See Section 3 on MPC.
- EU Framework Programme 7 PRACTICE, Deliverable D32.3.
- UN Handbook on Privacy-Preserving Computation Techniques. See Page 47.