Customers often ask how Multi-Party Computation compares to Federated Learning. In this post we explain the differences and synergies.
Federated Learning (FL) offers a way to train AI/ML models while keeping the inputs decentralized. By sharing local training parameters rather than the actual data, FL aims to protect participants’ privacy. However, the privacy properties of FL have limitations, as a 2021 survey published in ‘Future Generation Computer Systems’ explains:
“Federated Learning still has some privacy threats because the adversaries can partially reveal each participants’ training data in the original training dataset based on their uploaded parameter.”
In short: machine learning parameters are exchanged between the parties participating in the training. Information about the underlying data can be inferred from these parameters, which makes the residual risk hard to evaluate and can delay setting up a collaboration.
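To make the exchange concrete, here is a minimal sketch of one federated-averaging round on a toy one-parameter model. All names and the training setup are illustrative, not any particular FL framework; the point is that clients upload only parameters, yet the server sees those parameters in the clear, which is the attack surface the survey describes.

```python
def local_update(w, local_data, lr=0.1):
    """One gradient-descent step on a 1-D least-squares model w*x ~ y."""
    grad = sum(2 * (w * x - y) * x for x, y in local_data) / len(local_data)
    return w - lr * grad

def federated_round(global_w, client_datasets):
    # Clients share model parameters, not raw data ...
    uploads = [local_update(global_w, data) for data in client_datasets]
    # ... but the plaintext uploads can still leak information
    # about each client's local dataset.
    return sum(uploads) / len(uploads)

clients = [[(1.0, 2.0), (2.0, 4.0)],   # client A: roughly y = 2.0 * x
           [(1.0, 2.2), (3.0, 6.6)]]   # client B: roughly y = 2.2 * x
w = 0.0
for _ in range(50):
    w = federated_round(w, clients)
# w converges to a compromise between the two clients' models
```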
Multi-Party Computation (MPC), by contrast, is not limited to training AI/ML models: it supports a broad set of computations, e.g. tabular joins, queries and general statistics. It offers stronger privacy guarantees because the inputs remain encrypted throughout the entire computation, which eliminates the risk of inference attacks on intermediate results.
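One of the simplest MPC building blocks is additive secret sharing, sketched below for a joint sum: each party splits its private input into random shares, and no subset of shares short of all of them reveals anything about that input. This is a toy illustration (a real protocol would use cryptographically secure randomness and a full framework), not a production implementation.

```python
import random

P = 2**61 - 1  # public prime modulus; all arithmetic is mod P

def share(secret, n_parties):
    """Split `secret` into n additive shares that sum to it mod P."""
    shares = [random.randrange(P) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % P)
    return shares

def secure_sum(inputs):
    n = len(inputs)
    # Every party splits its input and distributes one share to each party ...
    all_shares = [share(x, n) for x in inputs]
    # ... each party locally sums the shares it received ...
    partials = [sum(all_shares[i][j] for i in range(n)) % P for j in range(n)]
    # ... and only the recombined total is ever revealed.
    return sum(partials) % P

salaries = [52000, 61000, 47000]
total = secure_sum(salaries)  # equals sum(salaries), inputs stay hidden
```

Each individual share is a uniformly random value mod P, so an observer holding fewer than all shares of an input learns nothing about it.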
However, the stronger privacy properties of MPC come at a higher computational cost. For example, training a decision tree with MPC on 10,000 records takes minutes, and training a neural network takes hours, while pure FL handles these tasks faster. (Try this user-friendly open-source MPC example for training a Convolutional Neural Network on the well-known MNIST training set of 60,000 handwritten-digit images.)
Whether you choose Federated Learning or MPC for your next project depends on your needs: if you need to train an AI/ML model on a large dataset and don’t have the strictest privacy requirements, FL is a great option. If you need to support a broad range of analyses, queries and statistics, or if the data is very sensitive, you should consider MPC.
And as a bonus, MPC can help if you need to train a large model and require strict privacy: use FL to train on the local datasets, and use MPC to combine the parameters!
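The combination above is often called secure aggregation. A minimal sketch, assuming additive secret sharing over integers (with fixed-point scaling, since shares are integers while model parameters are floats): each client secret-shares its locally trained parameter, and only the average ever appears in the clear. Names and the scaling constant are illustrative.

```python
import random

P = 2**61 - 1   # public prime modulus
SCALE = 10**6   # fixed-point scaling factor for float parameters

def share(secret_int, n):
    """Split an integer into n additive shares mod P."""
    shares = [random.randrange(P) for _ in range(n - 1)]
    shares.append((secret_int - sum(shares)) % P)
    return shares

def secure_average(local_params):
    """Aggregate clients' FL parameters so only the average is revealed."""
    n = len(local_params)
    fixed = [round(w * SCALE) for w in local_params]
    # Each client secret-shares its locally trained parameter ...
    all_shares = [share(x, n) for x in fixed]
    # ... the aggregators sum one share from every client ...
    partials = [sum(all_shares[i][j] for i in range(n)) % P for j in range(n)]
    total = sum(partials) % P
    # ... so no individual client's update is ever seen in the clear.
    return (total / SCALE) / n

avg = secure_average([0.91, 1.05, 0.98])  # only the average is revealed
```

The server learns the aggregated model it needs for the next FL round, while each client's individual update, and whatever it might leak about local data, stays hidden.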
What's your data collaboration challenge?