FMD + TEE for private state sync proposal

Basic proposal

Based on our discussions, here is a first proposal for prototyping.

0. Building blocks

Fuzzy message detection

We use the IND-CCA secure construction FMD2 := (\mathsf{KeyGen},\mathsf{Flag},\mathsf{Extract},\mathsf{Test}). The flags are ciphertexts of 1 under a hash variant of ElGamal with plaintext \mathbb{Z}_2. ElGamal is instantiated over a cyclic group \mathbb{G} of prime order q with generator g. The scheme also uses two random oracles H:\mathbb{G}^3\rightarrow\mathbb{Z}_2, and G:\mathbb{G}\times\mathbb{Z}_2^\gamma\rightarrow\mathbb{Z}_q.

We slightly modify \mathsf{Flag} and \mathsf{Test} algorithms: detection keys are any ordered subset of the \gamma secret keys, along with their indices. Therefore, decryption of the flag ciphertexts is selected acording to those indices. The reason of this tweak is to allow using the filtered output of a TEE as input to another one. (Originally, detection keys are the first n secret keys for false positive 2^{-n}.)

:information_source: Only the \mathsf{Test} algorithm needs to be implemented in the TEEs. The other algorithms are executed by the party that creates the resource (the sender).

KeyGen

Inputs:

  • \gamma\in\mathbb{N}. Determines the range of false positive rates p\in\{2^{-n}\}_{1\leq n\leq \gamma}
  • q,g: The order and generator of \mathbb{G} (given by a security parameter)
  1. For 1\leq i\leq \gamma sample random x_i\gets\mathbb{Z}_q
  2. Output sk:=(x_1,\ldots,x_\gamma) and pk:=(h_1:=g^{x_1},\ldots,h_\gamma:=g^{x_\gamma})

The parameters q,g,\gamma are passed implicitely to the remaining algorithms.

Flag

Input: Public key pk:=(h_1,\ldots,h_\gamma)\in\mathbb{G}^\gamma

  1. Sample random r,z\gets\mathbb{Z}_q
  2. u:=g^r, w:=g^z
  3. For 1\leq i\leq \gamma do:
    a. k_i := H(u,h_i^r,w) // Hash to \mathbb{Z}_2
    b. c_i:=k_i+1 // Arithmetic in \mathbb{Z}_2
  4. m:=G(u,c_1,\ldots,c_\gamma) // Hash to \mathbb{Z}_q
  5. y:=(z-m)r^{-1} // Arithmetic in field \mathbb{Z}_q
  6. Output flag ciphertexts \vec{f}:=(u,y,c_1,\ldots,c_\gamma)

Extract

Inputs:

  • sk:=(x_1,\ldots,x_\gamma)\in\mathbb{Z}_q^\gamma
  • n\in[1,\gamma]. Yields false positive rate p=2^{-n}
  • \vec{i}:=(i_1,\ldots,i_n)\subseteq[1,\gamma]^n. An ordered subset of indices of size n // We add this input.
  1. Output detection key \tilde{dsk}:=((x_{i_1},\ldots,x_{i_n}),(i_1,\ldots,i_n))

Test

Inputs:

  • Detection key \tilde{dsk}:=((x_{i_1},\ldots,x_{i_n}),(i_1,\ldots,i_n))\in\mathbb{Z}_q^n\times[1,\gamma]^n for some n\leq\gamma
  • Flag ciphertexts \vec{f}:=(u,y,c_1,\ldots,c_\gamma)\in\mathbb{G}\times\mathbb{Z}_q\times\mathbb{Z}_2^\gamma
  1. m:=G(u,c_1,\ldots,c_\gamma) // Hash to \mathbb{Z}_q
  2. w:=g^m\cdot u^y
  3. For 1\leq j \leq n do: // Below i_j is an input index, i.e. decrypt ciphertext at position i_j
    a. k_j := H(u,u^{x_{i_j}},w) // Hash to \mathbb{Z}_2
    b. b_j:=k_{i_j}+c_{i_j} // Arithmetic in \mathbb{Z}_2
  4. If all b_j=1 output \mathsf{accept}. Else, output \mathsf{reject}.

TEE confidential channels

Explicit confidential and authenticated communication with the TEEs is out of the scope of this proposal. In the algorithms described below it is just assumed to happen ‘under the hood’. Most likely a symmetric/hybrid encryption scheme will be used.

Shielded encryption

Refer to this post for a description of the algorithms (\mathsf{KeyGen},\mathsf{Encrypt},\mathsf{Decrypt}) used in the implementation of the shielded resource machine.

1. Detection layer

The TEE implements \mathsf{AddDetectionKey} and \mathsf{Filter}. Both algorithms have access to an internal table \mathbb{K}_{det} storing pairs of user identifiers and detection keys (\mathsf{uid},\tilde{dsk}), initialized to empty.

AddDetectionKey

  • External input: (\mathsf{uid},\tilde{dsk}): // user id, detection keys
  • Internal inputs: \mathbb{K}_{det},k_{in}
  1. If entry for \mathsf{uid} exists in \mathbb{K}_{det}, or detection key \tilde{dsk} is not in the right domain do nothing
  2. Else, add (\mathsf{uid},\tilde{dsk}) to \mathbb{K}_{det}

:bulb: Alternatively, output a status flag indicating success or failure (leaks user existence).

Filter

  • External inputs: \mathsf{uid},m,\vec{f} // user id, message and flag ciphertexts
  • Internal inputs: \mathbb{K}_{det}
  1. Retrieve \tilde{dsk} for user \mathsf{uid} from \mathbb{K}_{det}
  2. If there is no entry in \mathbb{K}_{det} or \mathsf{FMD2.Test}(\tilde{dsk},\vec{f}) rejects, output (\mathsf{no\_match},\bot)
  3. Else \mathsf{FMD2.Test}(\tilde{dsk},\vec{f}) accepts. Output (\mathsf{match},\mathsf{uid},m,\vec{f})

:bulb: If m is sent to the decryption layer, \vec{f} can be excluded from the output. (For simplicty we assume it is always included.)

2. Decryption layer

The TEE implements \mathsf{AddDecryptionKey} and \mathsf{TrialDecrypt} algorithms. Both algorithms have access to an internal table \mathbb{K}_{dec} storing pairs of user identifiers and message decryption keys (\mathsf{uid},sk), initialized to empty.

AddDecryptionKey

  • External input: \mathsf{uid},sk // user id and message decryption key
  • Internal inputs: \mathbb{K}_{dec}
  1. If entry for \mathsf{uid} exists in \mathbb{K}_{dec} do nothing
  2. Else, add (\mathsf{uid},sk) to \mathbb{K}_{dec}

TrialDecrypt

  • External inputs: \mathsf{uid},m // user id and message
  • Internal inputs: \mathbb{K}_{dec}
  1. Retrieve sk for user \mathsf{uid} from \mathbb{K}_{dec}. If there is no entry, output (\mathsf{no\_user},\bot) and halt
  2. pt\gets \mathsf{ShieldRM.Decrypt}(sk,m) // Get the plaintext of ciphertext m
  3. If cannot parse pt as (\mathsf{uid},\star) output (\mathsf{false\_positive},\bot)
  4. Else output (\mathsf{true\_positive},m)

:bulb: We have assumed plaintext message pt is prepended with the user identifier \mathsf{uid}. Any other mean to identify that pt belongs to user \mathsf{uid} can be used.

3. Remarks

  • Algorithms \mathsf{Filter} and \mathsf{TrialDecrypt} could have as external input/output an stream of elements. The description above is discretized for simplicity.
  • For efficiency, instantiate ElGamal over an elliptic curve (non pairing-friendly!), for example Curve25519. The choice of the exact curve depends on what is supported by the TEE hardware, but the level of security should be at least 128 bits.
  • One possibility to instantiate the random oracles H and G is to serialize the elements of the domain, apply a cryptographic hash function defined over bits (SHA2/3, a XOF…), and then truncate appropriately. Use domain separators to differentiate H from G, and enforce full domain hashing for G.

4. Diagram

Another arrangements are possible. For example, a backend service holding a list of user identifiers interacts with the detection layer (instead of the user).

ZKPs of correct flag generation

:warning: I am not including this part in the above proposal because I feel it is not mature enough for prototyping (but this is just my opinion).

The problem

The concern is that a malicious resource creator (sender) can send a message m inconsistent with the flag ciphertexts \vec{f}. Thus, the message is addressed to user \mathsf{uid} , but flag \vec{f} is generated with the public key pk of a different user \mathsf{uid}'.

Enforcing consistency

Consistency between m and \vec{f} can be enforced attaching an non-interactive zero-knowledge (NIZK) proof \pi attesting for correct flag generation. Thus, the sender would send the triplet (m,\vec{f},\pi).

Let ciphertext flags \vec{f} for public key pk. Let us set the user identifier to the public key \mathsf{uid}:=pk, and as in the proposal above, suppose m encrypts plaintext (pk,\star) with \mathsf{ShieldRM.Encrypt}.

NP statement
Given public \vec{f}, and encrypted message m as above. Then, \vec{f} is the result of running \mathsf{FMD2.Flag} on input whatever is encrypted in m.

The statement above can be turned into an arithmetic circuit and prove its satisfiability in zero-knowledge.

Optimizations

Use FMD1 construction We can probably use the FMD1 construction from the original paper. To generate an FMD1 flag, everything related to the oracle G and tag w is removed. This means that the resulting circuit to prove in zero-knowledge would be much less heavy.
Note: FMD1 is only IND-CPA secure, but since we are attaching NIZKs involving the c_i, these cannot be modified. More specifically, FMD1+Knowledge-sound NIZKs should yield IND-CCA security (reduce IND-CPA to IND-CCA by answering decryption queries leveraging the knowledge extractor of the NIZK).

IVC-based approach Correct FMD1 flag generation \vec{f}:=(u,y,c_1,\ldots,c_\gamma) from public key pk:=(h_1,\ldots,h_\gamma) actually involves proving statements of the form:

NP statement (FMD1):
Given public c_i,u, encrypted message m and group generator g. Then, the discrete logarithm of u in base g is the same than the discrete logarithm of the preimage under H of c_i-1 in base h_i, where h_i is the i-th public key component encrypted in m.

A total of \gamma statements, one per c_i in \vec{f}. We could recurse and produce a single proof via a commit-and-prove approach to ensure it is used the same discrete logarithm r --the flag ciphertext encryption randomness–.

Random oracle instantiation. The random oracle H could be instantiated with a zk-friendly collision-resistant hash function (e.g. Poseidon, Pedersen).

Questions

  • Why is that the sender needs to prove correct flag generation, again? We have discussed this already, but going through it once more won’t hurt. Specially given this seems to require structural changes to resources, and FMD is just one way to speed up message retrieval.
  • How public keys relate to user identities? Above I’ve assumed \mathsf{uid}=pk for simplicity. There may be other ways.
  • How FMD public keys relate to resources? Above I’ve assumed they are encrypted in m. In principle could be placed somewhere else in the resource, for example, as an explicit field of the resource, or cheaply derived from some field.
1 Like