Relevant open research questions in distributed systems

I’m curating a list of open research questions related to Anoma’s distributed systems subcomponents. Feel free to respond with anything relevant; I’ll edit the top post to keep it up-to-date.

Which consensus algorithms are suitable for the slow game?

Which consensus algorithms are most suitable for the slow game, where we want to be able to run consensus on long timescales which can incorporate user preferences, doesn’t rely on a proposer, and supports large numbers (billions) of voters?

Can we make heterogeneous-trust randomized consensus protocols?

Can we make randomized consensus protocols (background) heterogeneous, in the style of Heterogenous Paxos and the rest of the Typhon protocol stack?

Can we cleanly characterize the optimisation space of distributed compute and storage?

If we take distributed storage and compute to be distributed cache locality optimisation problems subject to the additional constraint of trust and an unknown & evolving physical network topology, can we characterise the optimisation space cleanly and craft algorithms for dynamically selecting and reselecting storage and compute locations according to the best guess as to the underlying network topology? Can this be done with only local information in a way which composes well?

This is an interesting paper called TrustBoost which could be indirectly related to the slow game. Each chain has their own consensus but agrees to opt into a Meta-consensus protocol; consensus on top of consensus. Here is a helpful thread from one of the authors, Sreeram Kannan.

Excerpt from the Abstract:

Currently there exist many blockchains with weak trust guarantees, limiting applications and participation. Existing solutions to boost the trust using a stronger blockchain, e.g., via checkpointing, requires the weaker blockchain to give up sovereignty. In this paper we propose a family of protocols in which multiple blockchains interact to create a combined ledger with boosted trust. We show that even if several of the interacting blockchains cease to provide security guarantees, the combined ledger continues to be secure – our TrustBoost protocols achieve the optimal threshold of tolerating the insecure blockchains. Furthermore, the protocol simply operates via smart contracts and require no change to the underlying consensus protocols of the participating blockchains, a form of “consensus on top of consensus."

This is a very nice paper. I think it is solving a related but slightly different problem. As I understand the approach, their protocol allows for consensus among m ledgers, f of which may be faulty. By default, these would still be using the same operator set (e.g. validators, for Tendermint & proof-of-stake style systems) as the base chains. I don’t think that addresses the requirements of a slow game protocol because those operators are the same operators who might be price-fixing and we want to run consensus to potentially fork out!

This could be helpful for consensus among some heterogeneous chains which are instead operated directly by the users, and which are spun up precisely for the purpose of running the slow game protocol. I think the hard problems in such a construction are:

  • Efficiency, as we want to aggregate votes from many many (millions to billions) of users in a way that other users can directly verify.
  • Heterogeneous trust, and in particular using a user’s trust preferences to inform at least semiautomatic selection of what kinds of slow game quorums they would want to participate in and whose outputs would be meaningful to them. The slow game can be run independently for different trust domains (e.g. different proof-of-stake tokens), but it’s probably more practical and more credible if it can handle heterogeneous trust natively.