Brief summary of where we are now

We have spent quite a bit of time on improving synchronisation of notes for the multi-asset shielded pool of Namada.

In the last months, we have gone from brainstorming to proposing a concrete solution, based on Fuzzy Message Detection (FMD, a sophisticated cryptographic primitive; check it out here) and trusted enclaves for defense-in-depth.

The journey has been quite amazing, involving pure research as well as lots of engineering. The outcome is an advanced prototype composed of libraries written in Rust: one for performing multi-key FMD across several detection servers, and the Kassandra service, which bridges and coordinates the process, with room to accommodate TEEs in the future.

So, are we done? What we have got so far speeds up the computations required to sync on the wallet side. It does not significantly reduce the number of downloaded notes. This comes as no surprise: what FMD gives you is detection of your own notes; it doesn’t retrieve them.

What’s going on. In layman’s terms, in the current solution, wallets locally combine the responses of several detection servers to reduce the number of trial-decryptions they need to perform. BUT, to keep their own notes hidden, they can’t just query the reduced (combined) set of notes to a MASP indexer, at least not in the clear. If we were to do that, then we could just deploy a single server and ask it to detect with the reduced false-positive rate (with the corresponding downgrade of privacy).

  • The safe solution is having wallets query all the notes they get from the detection servers (i.e. the notes before reducing/combining). This is what is currently implemented. See the diagram below.
  • An alternative approach, which we are currently considering, is to query subsets of the notes to different indexers. As long as they don’t collude, none of them gets the whole view. This would reduce latency, but yes, it is an ad-hoc solution, and, unavoidably, the indexers learn something below the threshold set for the detection servers.
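The combining step described above can be sketched as follows. The index sets and the `combine` helper are illustrative, not Kassandra's actual API: each detection server returns the indices it flagged (true positives plus FMD false positives), and the wallet intersects them locally.

```rust
use std::collections::HashSet;

/// Combine the index sets returned by the detection servers.
/// The wallet trial-decrypts only indices flagged by *every* server.
fn combine(responses: &[HashSet<u64>]) -> HashSet<u64> {
    responses
        .iter()
        .skip(1)
        .fold(responses[0].clone(), |acc, s| &acc & s)
}

fn main() {
    // Hypothetical flagged-index sets from three detection servers.
    let i1: HashSet<u64> = [1, 4, 7, 9].into_iter().collect();
    let i2: HashSet<u64> = [2, 4, 7, 8].into_iter().collect();
    let i3: HashSet<u64> = [4, 5, 7, 9].into_iter().collect();

    let reduced = combine(&[i1.clone(), i2.clone(), i3.clone()]);
    let expected: HashSet<u64> = [4, 7].into_iter().collect();
    assert_eq!(reduced, expected);

    // The query sent to a single indexer must stay at the union,
    // or the indexer would learn the reduced set.
    let union = &(&i1 | &i2) | &i3;
    println!(
        "trial-decrypt {} of {} downloaded notes",
        reduced.len(),
        union.len()
    );
}
```

The gap between `reduced.len()` and `union.len()` is exactly the saving in trial-decryptions that the wallet gets for free, while the download size stays at the union.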

Is there more to it? A third option we could explore is private information retrieval (PIR). The point is that wallets do know what to retrieve from the public pool, so they could run a PIR query on the reduced/combined set of notes (in the diagram, after step 7). In theory, this works beautifully; in practice, well, it depends on the exact performance of the PIR scheme.

  • Single-server PIR tends to be slower because it relies on computationally hard problems.
  • Multi-server PIR is faster, and can be shown to be information-theoretically secure.
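To make the multi-server case concrete, here is a toy two-server XOR-based PIR in the style of the classic Chor–Goldreich–Kushilevitz–Sudan construction. The block size, database contents, and the fixed "random" subset are all illustrative; a real deployment would use fresh randomness per query and much larger blocks.

```rust
/// XOR together the database blocks selected by `subset`.
/// Each server runs this independently on the public note database.
fn xor_blocks(db: &[[u8; 4]], subset: &[bool]) -> [u8; 4] {
    let mut acc = [0u8; 4];
    for (block, &sel) in db.iter().zip(subset) {
        if sel {
            for (a, b) in acc.iter_mut().zip(block) {
                *a ^= b;
            }
        }
    }
    acc
}

fn main() {
    // Public pool of 8 toy "notes" (4 bytes each).
    let db: Vec<[u8; 4]> = (0..8u8).map(|i| [i, i, i + 1, i + 2]).collect();
    let target = 5usize;

    // Client: a random subset goes to server 1; the same subset with
    // the target index flipped goes to server 2. Each subset alone is
    // statistically independent of `target`, so a single
    // (non-colluding) server learns nothing about the query.
    let q1 = [false, true, false, true, true, false, false, true];
    let mut q2 = q1;
    q2[target] = !q2[target];

    let a1 = xor_blocks(&db, &q1);
    let a2 = xor_blocks(&db, &q2);

    // XOR of the two answers cancels every block except `target`.
    let mut recovered = [0u8; 4];
    for i in 0..4 {
        recovered[i] = a1[i] ^ a2[i];
    }
    assert_eq!(recovered, db[target]);
}
```

Note the communication pattern: each server still scans the whole database per query, which is exactly the kind of overhead that needs benchmarking before committing to this route.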

My suggestion is to spend a bit more time on this (in parallel with the planned integration): explore PIR schemes and, potentially, other related primitives, and see if we can reduce the communication overhead without hurting too much the computational performance we have already achieved.


If I understand correctly, we should ideally retrieve information privately, but this would complicate the protocol and increase costs, as you mentioned. Could we use the second approach instead, adding random dummy indices to the real queries to reduce information leakage to the indexers? In practice, the set of query indices may not be large, so the cost of dummy indices might be minimal. While indexers could still learn something, this may not be a major issue imo.
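The dummy-index idea could be sketched like this. The pool size, decoy count, and the toy generator are hypothetical parameters; a real implementation would use a CSPRNG and would likely tune the decoy count against the desired leakage bound.

```rust
use std::collections::HashSet;

/// Pad the real query with `decoys` extra indices drawn from the
/// whole note pool, so the indexer cannot tell real from dummy.
fn pad_query(
    real: &HashSet<u64>,
    pool_size: u64,
    decoys: usize,
    mut rng: impl FnMut() -> u64,
) -> HashSet<u64> {
    let mut query = real.clone();
    while query.len() < real.len() + decoys {
        query.insert(rng() % pool_size);
    }
    query
}

fn main() {
    let real: HashSet<u64> = [3, 17, 42].into_iter().collect();

    // Toy linear-congruential generator; stands in for a real CSPRNG.
    let mut state = 0x2545_F491_4F6C_DD1Du64;
    let rng = move || {
        state = state
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        state >> 33
    };

    let padded = pad_query(&real, 1_000_000, 5, rng);
    assert!(padded.is_superset(&real));
    assert_eq!(padded.len(), real.len() + 5);
}
```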


Thank you for the lovely overview @kike!

While I agree that this approach is suboptimal compared to full PIR, I think that the practical tradeoffs are quite reasonable in the short term – something to keep in mind is that if shielded synchronization is too slow, users simply won’t use it, so even if the theoretical privacy protections are very high the effective privacy protection is zero! I suggest that we implement this for Namada right now, and start in parallel on the future PIR research.

Are there known implementations of this scheme (or kind of scheme) with benchmarks?

I agree with @xuyang. It seems like querying a subset of notes is not only the simplest solution, and efficient enough, but it also practically makes the same trade-offs as the FMD part.

The detection server learns true positives and some false positives. Here the indexer will also learn some true positives and some false positives (the random notes we query to reduce leakage). If we are fine with this for the detection servers, I think it is fine to reveal as much information to the indexer as well. It is also more dynamic compared to the FMD servers’ false-positive rate: the number of “false positives” requested from the indexer can be adjusted more easily, since the number of random notes queried doesn’t have to be specified in advance.

The communication overhead is from the indexer to the receiver, when answering with the notes for the queried indices (notes may be large in size). We’d like to reduce the number of downloaded notes as much as possible.

The ideal situation is to reveal no more information to the indexer than what is revealed to the detection servers. The question is: how do we ensure this is the case, and under what circumstances?

Imagine three servers and a dishonest-majority setting: two servers can be corrupted, and they can collude (we don’t know which). After running detection on the flags, they send indices \mathcal I_1, \mathcal I_2, \mathcal I_3 to the receiver[1], who then runs trial-decryption on the notes indexed by the intersection \cap_j \mathcal I_j.

[2]If the receiver gives the indexer \mathcal Q = (\mathcal I_1\cap \mathcal I_2\cap\mathcal I_3) \cup \mathcal R, where \mathcal R is a small and random choice of indices, it is pretty likely that none of the indices in \mathcal R lie in \mathcal I_1 \cap \mathcal I_2. At this point, if the indexer colludes with the corrupted detection servers, they can recover the full intersection:

\mathcal I_1 \cap \mathcal I_2 \cap \mathcal I_3 = \mathcal Q \cap (\mathcal I_1 \cap \mathcal I_2)

We can somewhat mitigate the above with several indexers. However, whatever the indexers are given, a colluding one can always shrink \mathcal I_1 \cap \mathcal I_2 by intersecting it with its own indices. In other words, during the retrieval phase, the adversary can reduce the privacy threshold of the detection phase.
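A quick sanity check of the attack, with hypothetical index sets: a colluding indexer that learns \mathcal I_1 \cap \mathcal I_2 from the two corrupted servers strips the padding \mathcal R exactly.

```rust
use std::collections::HashSet;

/// What a colluding indexer computes: intersect the padded query
/// with the indices leaked by the corrupted detection servers.
fn collude(query: &HashSet<u64>, leaked: &HashSet<u64>) -> HashSet<u64> {
    query & leaked
}

fn main() {
    // Hypothetical flagged-index sets from three detection servers.
    let i1: HashSet<u64> = [1, 2, 4, 7, 9, 12].into_iter().collect();
    let i2: HashSet<u64> = [2, 4, 5, 7, 9, 13].into_iter().collect();
    let i3: HashSet<u64> = [2, 3, 7, 9, 11].into_iter().collect();

    let full = &(&i1 & &i2) & &i3; // I1 ∩ I2 ∩ I3 = {2, 7, 9}
    // R: random padding, (with high probability) outside I1 ∩ I2.
    let padding: HashSet<u64> = [20, 21, 22].into_iter().collect();
    let query = &full | &padding; // Q, as sent to the indexer

    // Servers 1 and 2 are corrupted and hand I1 ∩ I2 to the indexer:
    let recovered = collude(&query, &(&i1 & &i2));
    assert_eq!(recovered, full); // padding removed, full intersection exposed
}
```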

Based on the above, we can distinguish two cases: the indexer(s) collude with the detection servers, or they don’t.

Case 1: Indexer(s) do not collude with detection servers. In this case, receivers can query \mathcal I_1 \cap \mathcal I_2 to the indexer. In general, if in the FMD phase we assume up to t corrupted detection servers (out of d), the receiver can query, e.g., the intersection of the first t index sets, \cap_{j\leq t}\mathcal I_j. This is better than querying the union \cup_{j\leq d} \mathcal I_j, as Kassandra currently does, and provides the same privacy guarantees (against non-colluding adversaries).
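To illustrate the trade-off, here is a toy comparison of the three candidate query sets: the union (as currently implemented), the first-t intersection (Case 1), and the full intersection (Case 2, which would require PIR). The sets and t are made up.

```rust
use std::collections::HashSet;

/// Intersect the first `t` index sets (the Case 1 query).
fn intersect_first(responses: &[HashSet<u64>], t: usize) -> HashSet<u64> {
    responses[..t]
        .iter()
        .skip(1)
        .fold(responses[0].clone(), |acc, s| &acc & s)
}

fn main() {
    let responses = [
        [1u64, 4, 7, 9, 12, 15].into_iter().collect::<HashSet<_>>(),
        [2, 4, 7, 8, 12, 16].into_iter().collect(),
        [4, 5, 7, 9, 17].into_iter().collect(),
    ];
    let t = 2; // assumed corruption threshold

    let union: HashSet<u64> = responses.iter().flatten().copied().collect();
    let case1 = intersect_first(&responses, t); // ∩_{j≤t} I_j
    let case2 = intersect_first(&responses, responses.len()); // ∩_{j≤d} I_j

    // union ⊇ case1 ⊇ case2: each step downloads fewer notes,
    // but demands a stronger non-collusion (or PIR) assumption.
    assert!(case1.is_subset(&union));
    assert!(case2.is_subset(&case1));
    println!(
        "union: {}, case1: {}, case2: {}",
        union.len(),
        case1.len(),
        case2.len()
    );
}
```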

Case 2: Indexer can collude with detection servers. In this case, if we hand any indices in the clear to the indexer(s), the privacy threshold of the FMD phase can be reduced. So, it seems we are left with running a PIR to protect against this stronger adversary (again, to ensure the privacy threshold of the detection phase is not reduced in the retrieval phase). Since PIR guarantees that nothing is leaked about the queried indices, the receivers can query the fully-reduced set: the intersection \cap_{j\leq d}\mathcal I_j. It is unclear to me whether this reduction compensates for the overhead introduced by the PIR.


  1. If each server sends the notes \mathcal N_1,\mathcal N_2, \mathcal N_3 for the detected indices (instead of the indices), it is the same, in terms of communication overhead, as if the indexer sent the union of the notes \cup_j\mathcal N_j to the receiver. ↩︎

  2. I’ve edited this paragraph with a clearer reasoning. ↩︎

Agreed. There is tension between privacy and usability. Here, in favour of our solution, I’d like to say that the communication overhead of the FMD-based sync is no worse than the current sync (download all notes); actually, it is slightly better. So, one can say there is already some minor incentive to switch to the new solution. (Of course, the computational overhead is way better, though I’m not sure how this shows up in the overall user experience.)

This is a good point. I wonder the same. I will look into it.
