Understanding Privacy-preserving Computations by @encapsulation



Too Long; Didn't Read

Delve into the intricacies of MPC and HE, powerful privacy-preserving computation technologies, understanding their mechanisms, and discovering the importance of verification in ensuring trustworthy results.

This paper is available on arXiv under a CC 4.0 license.

Authors:

(1) Tariq Bontekoe, University of Groningen;

(2) Dimka Karastoyanova, University of Groningen;

(3) Fatih Turkmen, University of Groningen.

Table of Links

Abstract & Introduction

Preliminaries

Zero-knowledge proofs and verifiable computing

Privacy-preserving computations

Requirements: an application’s perspective

Verifiable, privacy-preserving computing

Open challenges and future directions

Conclusion & References

4. Privacy-preserving computations

There are different PETs that can be used to construct solutions for privacy-preserving computations. Two of them are particularly suitable for our setting: MPC and HE. Both are well-established PETs that provide flexibility and offer strong privacy guarantees, which makes them excellent building blocks for verifiable privacy-preserving computations. Below, we discuss both building blocks and provide an intuitive answer to RQ1.

4.1. Secure multi-party computation

MPC is a collection of techniques that allow for performing joint computations over distributed data, while keeping all input data private and revealing nothing but the computation output. Generally, n parties participate in an MPC computation, each contributing its own private input to the collaboration. All n parties execute an interactive protocol to compute the joint function output from their private inputs.


An MPC protocol must remain secure even when some of the participating parties are corrupted by an adversary. As discussed in Section 2.2, there exist different security models. However, any secure MPC protocol should at least satisfy the following (informal) requirements [14], within the constraints of the used adversarial model:


• Input privacy: No party should learn anything other than the function output it is assigned. Specifically, no party should learn anything about other parties’ inputs, other than what can be derived from its assigned output.


• Correctness: Every party that receives a (partial) output obtains a value that is correct with respect to the protocol specification and input data.


• Independence of inputs: Corrupted parties must pick their inputs independently of the honest parties’ private inputs.


There are several ways to construct an MPC protocol or framework. Below, we present the two most common constructions: secret sharing and garbled circuits.


Secret Sharing. In a secret sharing approach, private data is split over multiple parties. Information is split in such a way that only predefined sets of parties can reconstruct the full information, while all other (sub)sets cannot deduce even partial information. Most secret sharing schemes divide a secret s among n parties in such a way that only subsets of t or more parties can reconstruct s. Such a scheme is called a t-out-of-n threshold scheme.
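
To make this concrete, the sketch below implements a minimal t-out-of-n threshold scheme in the style of Shamir's secret sharing over a small prime field. The prime, function names, and parameters are illustrative only; a real implementation would use a cryptographically secure randomness source and a much larger field.

```python
# Minimal sketch of a t-out-of-n threshold secret sharing scheme (Shamir-style).
# All parameters are illustrative; this is not a hardened implementation.
import random

PRIME = 2_147_483_647  # small prime field, large enough for this toy example


def share(secret: int, t: int, n: int) -> list[tuple[int, int]]:
    """Split `secret` into n shares such that any t of them reconstruct it."""
    # Random polynomial of degree t-1 with the secret as its constant term.
    coeffs = [secret] + [random.randrange(PRIME) for _ in range(t - 1)]
    return [
        (x, sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME)
        for x in range(1, n + 1)
    ]


def reconstruct(shares: list[tuple[int, int]]) -> int:
    """Lagrange interpolation at x = 0 recovers the secret from t or more shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


shares = share(secret=42, t=3, n=5)
assert reconstruct(shares[:3]) == 42   # any 3 of the 5 shares suffice
assert reconstruct(shares[2:]) == 42
```

Fewer than t shares reveal nothing about the secret: every possible secret value remains equally consistent with the observed points.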



Finally, we note that shares from an LSSS can be multiplied efficiently using so-called Beaver multiplication triples [30]. These triples are independent of both the inputs and the function and can thus be generated in a so-called offline preprocessing phase. Using this approach, the remaining computations can be performed in a very efficient online phase.
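
The sketch below illustrates the idea for two simulated parties holding additive shares modulo a prime: given a precomputed triple (a, b, c) with c = a·b, multiplying two shared values only requires opening the masked differences d = x − a and e = y − b plus local arithmetic. Sampling the triple in the clear stands in for the offline phase; all names and parameters are illustrative.

```python
# Sketch of Beaver-triple multiplication on additive shares (mod P), simulating
# two parties in one process. The triple is assumed to come from a preprocessing
# phase; here it is simply sampled in the clear for illustration.
import random

P = 2_147_483_647


def additive_share(v: int) -> tuple[int, int]:
    r = random.randrange(P)
    return r, (v - r) % P   # share for party 0, share for party 1


# Private inputs x = 7 and y = 6, secret-shared between the two parties.
x0, x1 = additive_share(7)
y0, y1 = additive_share(6)

# Offline phase: a random triple (a, b, c) with c = a*b, also secret-shared.
a, b = random.randrange(P), random.randrange(P)
a0, a1 = additive_share(a)
b0, b1 = additive_share(b)
c0, c1 = additive_share(a * b % P)

# Online phase: the parties open d = x - a and e = y - b. These openings reveal
# nothing about x and y, since a and b act as uniformly random one-time masks.
d = ((x0 - a0) + (x1 - a1)) % P
e = ((y0 - b0) + (y1 - b1)) % P

# Each party computes its share of x*y locally; party 0 also adds the public d*e.
z0 = (c0 + d * b0 + e * a0 + d * e) % P
z1 = (c1 + d * b1 + e * a1) % P

assert (z0 + z1) % P == 7 * 6   # the reconstructed product equals x*y
```

The identity behind the last step is x·y = c + d·b + e·a + d·e, which every party can evaluate on its own shares once d and e are public.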


Many popular, modern secret sharing schemes make use of linear shares in combination with Beaver triples or oblivious transfer. Examples of schemes using secret sharing are Shamir's secret sharing (SSS) [31] and, following the offline/online paradigm, SPDZ [32] and MASCOT [33]. For an up-to-date overview of existing schemes with implementations, we refer to the MP-SPDZ library [34].


Garbled Circuits. Yao’s garbled circuits (GCs) [35] are a way to enable two distrusting parties to securely evaluate a function on their private input data. It requires that the underlying function is expressed as a boolean circuit consisting of gates with two input wires and one output wire.


One party, the garbler, generates a garbled version of the entire circuit by garbling the individual gates. In the original construction, a gate is garbled by assigning a unique, random label to each possible value (true, false) of each input wire, and doubly encrypting the four possible output values under the corresponding input labels. The garbler randomly permutes the encrypted outputs and shares these with the evaluator, accompanied by the random labels corresponding to the garbler’s own private inputs. The evaluator then participates in an oblivious transfer (OT) protocol [36] to obtain the labels corresponding to its private inputs, without revealing those inputs to the garbler.


Having received both input labels, the evaluator can correctly decrypt exactly one of the garbled outputs, thereby obtaining the true output bit(s). When a gate output is used as input to another gate, the true output will not be a bit, but rather a new random label that is used in the decryption of this other gate. An alternative construction to Yao’s GCs is the BMR framework [37]. Implementations of these and other frameworks can be found in, e.g., MP-SPDZ [34], ABY [38], or ABY3 [39].
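
The toy sketch below garbles and evaluates a single AND gate to illustrate these mechanics. It uses a hash of the two input labels as a one-time pad for the double encryption and zero-padding as a validity check, and it omits the oblivious transfer step (the evaluator is simply handed the labels of the chosen inputs). This is a simplified illustration, not the construction used by any of the cited frameworks.

```python
# Toy sketch: garbling and evaluating a single AND gate. Each wire value gets a
# random 128-bit label; each of the four output labels is encrypted under the
# hash of the corresponding pair of input labels. Oblivious transfer is omitted.
import hashlib
import os
import random


def pad_from(label_a: bytes, label_b: bytes) -> bytes:
    """Derive a 32-byte one-time pad from a pair of input labels."""
    return hashlib.sha256(label_a + label_b).digest()


def xor(x: bytes, y: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(x, y))


# Garbler: one random label per possible value (0 or 1) of each wire.
labels = {w: {bit: os.urandom(16) for bit in (0, 1)} for w in ("a", "b", "out")}

# Garble the AND gate: for every input combination, encrypt the correct output
# label (padded with zeros so a valid decryption is recognisable), then shuffle
# the four rows so that their order reveals nothing.
table = [
    xor(pad_from(labels["a"][va], labels["b"][vb]), labels["out"][va & vb] + b"\x00" * 16)
    for va in (0, 1)
    for vb in (0, 1)
]
random.shuffle(table)


def evaluate(garbled_table: list[bytes], label_a: bytes, label_b: bytes) -> bytes:
    """Evaluator: holding one label per input wire, exactly one row decrypts."""
    pad = pad_from(label_a, label_b)
    for row in garbled_table:
        plain = xor(row, pad)
        if plain.endswith(b"\x00" * 16):
            return plain[:16]          # the output-wire label
    raise ValueError("no row decrypted correctly")


out_label = evaluate(table, labels["a"][1], labels["b"][1])
assert out_label == labels["out"][1]   # AND(1, 1) = 1, learned without seeing the inputs
```

In a full circuit, the recovered output label would feed into the garbled tables of subsequent gates, and only the labels of the final output wires would be mapped back to plain bits.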


RQ1a: Why should we verify MPC? Most MPC frameworks are applied in a fully decentralized setting, where all parties communicate with one another directly. Many MPC schemes guarantee privacy, and in actively secure variants also correctness, for a given threat model, under the assumption that a certain number of parties remain honest, ranging from a strict majority to only one. Under these assumptions, one might say that correctness is a largely solved problem.


However, none of the above takes into account the use case in which parties that did not participate in the computation want to verify the results. In this situation, the results could be incorrect if all computing parties collude. The schemes are thus neither publicly verifiable nor publicly auditable [40], making the results untrustworthy for external parties.


Another issue regarding (public) verifiability that is often overlooked is the verification of data authenticity. Correctness of results is often only guaranteed under the assumption that input data is provided honestly. This especially plays a role when a party’s input data should be the same over multiple computations, e.g., for reproducibility, or when input data should be authenticated, e.g., signed by an expert authority. In practice, both external parties and computing parties that want verifiably correct results will require verification of the computation as well as of data authenticity.

4.2. Homomorphic encryption

HE is a term used to denote a collection of (asymmetric) encryption schemes that allow users to perform operations on encrypted data without decrypting it first. More formally, and following [41], an HE scheme is described by four (probabilistic) polynomial-time algorithms (KeyGen, Enc, Dec, Eval):


• KeyGen(1^κ) → (sk, pk, evk) is the key generation algorithm. It takes a unary representation of the security parameter κ as input and outputs a secret decryption key sk, public encryption key pk, and public evaluation key evk.

• Enc(pk, m) → c is the encryption algorithm. It takes the public key pk and a message m from the message space M as input and outputs a ciphertext c.

• Dec(sk, c) → m is the decryption algorithm. It takes the secret key sk and a ciphertext c as input and outputs the underlying message m.

• Eval(evk, f, c1, ..., cn) → cf is the evaluation algorithm. It takes the evaluation key evk, a function f, and ciphertexts c1, ..., cn as input and outputs a ciphertext cf that decrypts to f(m1, ..., mn).


While there are different ways to represent the function f in HE schemes, most constructions prefer a translation to an arithmetic circuit Cf over M. Notably, not all homomorphic encryption schemes can be used to evaluate arbitrary arithmetic circuits. Therefore, we distinguish, following [42], several types of homomorphic schemes in increasing order of evaluation capabilities.


Partially homomorphic encryption. PHE is a type of homomorphic encryption that is only additively or only multiplicatively homomorphic, meaning that it can evaluate arithmetic circuits consisting solely of addition gates or solely of multiplication gates, respectively. In general, partially homomorphic schemes have the lowest computational cost and a ciphertext size that stays stable under evaluation.
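
The Paillier cryptosystem is a classic example of an additively homomorphic PHE scheme. The toy sketch below, with deliberately tiny and insecure parameters chosen purely for illustration, shows the additive homomorphism in action: multiplying ciphertexts adds the underlying plaintexts, and raising a ciphertext to a public constant multiplies the plaintext by that constant.

```python
# Toy Paillier encryption, an additively homomorphic PHE scheme. The primes are
# tiny and hard-coded for illustration only; real deployments use 2048-bit or
# larger moduli and cryptographically secure randomness.
import math
import random

# KeyGen (toy parameters)
p, q = 1009, 1013                             # two small, distinct primes
n, n2 = p * q, (p * q) ** 2
g = n + 1                                     # standard choice of generator
lam = math.lcm(p - 1, q - 1)                  # part of the private key
mu = pow((pow(g, lam, n2) - 1) // n, -1, n)   # part of the private key


def encrypt(m: int) -> int:
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:                # r must be a unit modulo n
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2


def decrypt(c: int) -> int:
    return ((pow(c, lam, n2) - 1) // n) * mu % n


c1, c2 = encrypt(20), encrypt(22)

# Additive homomorphism on ciphertexts:
assert decrypt(c1 * c2 % n2) == 42            # Enc(20) * Enc(22) decrypts to 20 + 22
assert decrypt(pow(c1, 3, n2)) == 60          # Enc(20)^3 decrypts to 3 * 20
```

Since Paillier only supports additions (and multiplications by public constants) on encrypted data, it cannot evaluate circuits that multiply two encrypted values; that is exactly the gap addressed by the SWHE, LHE, and FHE schemes discussed next.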


Somewhat homomorphic encryption. SWHE constructions can evaluate arithmetic circuits consisting of both addition and multiplication gates. However, the circuits that can be evaluated may be limited in the number of operations (of either type); for example, a scheme may only be able to evaluate circuits with a maximum multiplication depth of 2. Moreover, for SWHE there is no requirement that prevents exponential blowup of the ciphertext size.


Leveled homomorphic encryption. LHE resembles SWHE in that it can evaluate circuits that are limited in depth d. However, since d is provided as input to KeyGen, the scheme parameters may be a function of d. In addition, the length of the ciphertexts output by the Eval algorithm must not depend on the depth d, meaning that ciphertexts do not blow up exponentially. Notice that the key sizes may still depend on the depth d and can become impractically large.


Fully homomorphic encryption. FHE schemes can evaluate arbitrary arithmetic circuits, without the scheme’s parameters depending on any circuit property. This generality often comes at the cost of expensive bootstrapping operations that must be performed after a certain number of multiplications [43].


RQ1b: Why should we verify HE? HE is often used in a setting where one (cloud) server or central entity performs the majority of computations, for example in an outsourcing scenario. A frequently observed scenario is where all data owners use a distributed, or threshold, key generation protocol to obtain a public-private key pair where the private key is shared over all data owners [44], [45]. To perform a computation, all data owners encrypt their data under the same public key and let the central server perform the actual computation on the received ciphertexts. Due to the high computational cost of operations on homomorphically encrypted data, it is often preferred to have a single, central computing server with sufficient computational resources.


In most cases, all parties are assumed to behave semi-honestly. This assumption, however, is unrealistic in most real-world scenarios and raises significant integrity and verifiability issues. Integrity issues arise from the possibility of key-recovery attacks [46], [47], [48], whereas verifiability issues stem from the fact that the server can deviate from the requested computation.


Another issue regarding verifiability is whether external parties can verify the results of a computation that uses HE. Such guarantees may be warranted in cases where external auditing is required or when results are of interest to a larger community. Therefore, verifiable solutions should provide guarantees even in the case where all data owners and the central computation server(s) are colluding. In addition, external parties may also require guarantees concerning the integrity and authenticity of input data, which is in general not considered in existing work.