I’ll try to address your questions in order:
- What is a blob storage? What data is stored there and why?
- Unfortunately, there are several places where there may be storage, and each may call the things it stores “blobs.” I write about two that we might care about below: Per-Node Key/Value Stores and Anoma State Machine Blob Storage. In a prior Slack conversation, we were discussing Anoma State Machine Blob Storage.
- Who decides what to store in the blob storage and for how long?
- I try to address this in the sections below.
- What other types of storage are there? What are they used for?
- Anoma State Machine state also has to include commitments and nullifiers, and possibly some other stuff (past state roots, maybe endorsements on other machines). The Per-Node Key/Value Stores that we’ve talked about in the past treat all values as “blobs.” We can, of course, imagine other types of structured storage.
- What is the relationship between different types of storages?
- Each node (including validators) can maintain its own Per-Node Key/Value Store, and each validator also maintains a local copy of the Anoma State Machine state (which includes blob storage). However, these are conceptually different: one is a per-node affair, with reads and deletes (at least potentially) independent on each node. The Anoma State Machine Blob Storage exists “within” the replicated state machine: it is only (directly) accessible from TransactionCandidates running inside the state machine.
- What storage type deletion criteria are associated with? Is it one storage type or many?
- This may vary depending on which blob storage we’re talking about. In any case, we presumably need some predicate, and the storage is only supposed to delete the blob if it has evidence of some witness that satisfies the predicate. Since Anoma State Machine Blob Storage only updates on a successful anoma transaction, it seems reasonable to roll this into the resource logic that determines whether an anoma transaction is successful. For Per-Node Key/Value Storage, it may be preferable to have a different kind of predicate, particularly if the storing nodes are expected to produce the witness (and know when to do so).
- Who decides what the deletion criterion should be?
- I can think of no better answer than to bake the criterion into the “key” identifying the blob. That way, at least everyone referring to the blob agrees on what the criterion should be. Since Anoma State Machine Blob Storage is primarily used for whatever application-specific data TransactionCandidates need to use in post-ordering execution, we presumably want to permission parts of this storage space by application. This is why the first element of Anoma State Machine Blob keys is a resource logic hash.
- How to translate between RM `deletion_criterion` and actual delete and write instructions associated with transactions
- I think this is best explained with a little context, which I have tried to do in the sections below.
- What is the relationship between the RM transactions and distributed systems (?) transactions
- When talking about state machine replication or online transaction processing or database transactions in distributed systems, we imagine that the thing we care about is some `State`, which updates in a serializable manner according to some `state_transition_function`. A transaction is the only input to the state transition function besides the previous state, so `state_transition_function : State -> transaction -> State`. In principle, if a group of replicas agree on a starting state and a list of transactions, they can reach the current state by running the `state_transition_function` over and over. A fairly common pattern is to have the `transaction` type be a function `State -> State`, and so `state_transition_function` literally just applies `transaction` to the old state to get the new state. In the anoma documentation, these “distributed systems transactions” should always be called TransactionCandidates. We call them this to distinguish them from RM transactions, also known as anoma transactions. I occasionally fail at making this distinction, and that is my bad.
- RM transactions, also known as anoma transactions, to my understanding (and admittedly I am not up on the latest drafts), specify a specific set of nullifiers and commitments to emit, and a proof that the resource logics of all the resources involved are satisfied by some set of witnesses. They can also specify some kind of `appdata`, which, to my understanding, can include some writes of some kind (possibly to various kinds of blob storage) that may be required to satisfy some resource logics (i.e. resource logics can make writes mandatory). Specifically for the anoma state machine blob storage, I propose that this can be extended to a fairly arbitrary predicate over writes (including deletes) made: each resource logic can govern exactly what writes or deletes are allowed within its portion of blob-space.
- The anoma state machine update function needs to do several things:
- Perform “post-ordering execution,” running some kind of executable code in the TransactionCandidate. This can read from anoma state machine state (including blobs therein), and outputs an anoma transaction (which can include storage operations in `appdata`), as well as a set of “side effects,” which can include messages to be sent over the wire, including store requests to Per-Node Key/Value Stores.
- Verify the anoma transaction: check that witnesses exist for all the resource logics involved, all the nullifiers are new, and all the required commitments exist.
- If the verification passes, perform all side effects, and write updates to state (nullifiers, commitments, and storage operations from `appdata`).
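To make the `state_transition_function` pattern and the three steps of the update function above concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the Anoma implementation: `State`, `AnomaTransaction`, `verify`, `replay`, and the field names are all hypothetical, the resource logic proof is reduced to a boolean flag, and side effects are just strings collected in an outbox.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class AnomaTransaction:
    # Hypothetical, heavily simplified shape of an anoma (RM) transaction.
    nullifiers: frozenset = frozenset()            # nullifiers to emit
    commitments: frozenset = frozenset()           # commitments to emit
    required_commitments: frozenset = frozenset()  # must already exist in state
    appdata: tuple = ()                            # (key, value) ops; value None = delete
    logics_satisfied: bool = True                  # stand-in for the resource logic proof


@dataclass(frozen=True)
class State:
    nullifiers: frozenset = frozenset()
    commitments: frozenset = frozenset()
    blobs: Dict = field(default_factory=dict)


# A TransactionCandidate reads state and outputs an anoma transaction
# plus a list of side effects (assumption: side effects are plain strings).
TransactionCandidate = Callable[[State], Tuple[AnomaTransaction, List[str]]]


def verify(state: State, tx: AnomaTransaction) -> bool:
    """Logics satisfied, nullifiers new, required commitments present."""
    return (
        tx.logics_satisfied
        and tx.nullifiers.isdisjoint(state.nullifiers)
        and tx.required_commitments <= state.commitments
    )


def state_transition_function(
    state: State, candidate: TransactionCandidate, outbox: List[str]
) -> State:
    tx, side_effects = candidate(state)   # 1. post-ordering execution
    if not verify(state, tx):             # 2. verify the anoma transaction
        return state                      # invalid: no writes, no side effects
    outbox.extend(side_effects)           # 3. perform side effects ...
    blobs = dict(state.blobs)             # ... and write updates to state
    for key, value in tx.appdata:
        if value is None:
            blobs.pop(key, None)
        else:
            blobs[key] = value
    return State(
        state.nullifiers | tx.nullifiers,
        state.commitments | tx.commitments,
        blobs,
    )


def replay(initial: State, candidates: List[TransactionCandidate]):
    """Any replica with the same start state and list reaches the same state."""
    state, outbox = initial, []
    for candidate in candidates:
        state = state_transition_function(state, candidate, outbox)
    return state, outbox


def mint(state: State):
    tx = AnomaTransaction(
        commitments=frozenset({"cm1"}),
        appdata=((("logic_hash", "counter"), 1),),
    )
    return tx, ["store-request: cm1"]


def spend(state: State):
    tx = AnomaTransaction(
        nullifiers=frozenset({"nf1"}),
        commitments=frozenset({"cm2"}),
        required_commitments=frozenset({"cm1"}),
    )
    return tx, []


# The second `spend` fails verification because its nullifier already exists.
final_state, sent = replay(State(), [mint, spend, spend])
```

Note that a failed verification leaves the state untouched and drops the side effects, which matches the "if the verification passes" condition in the last step.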
Per-Node Key/Value Stores
We can have each node store key/value pairs, which other nodes can request by key.
- Storage is initiated by sending a node a message, and the node promises to store the value until some deletion criterion is met. Each node decides what it wants to store, and for how long. I’m not sure exactly what the deletion criteria here are.
- Presumably, after the deletion criterion is met, the node forgets about the key, the value, and any kind of proof the deletion criterion was met.
- Reads are likewise by message: someone sends in a request for a key, and the node responds with the value.
We have at times talked about implementing this as part of the network stack, for some reason.
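The message-driven store described above could look roughly like the following Python sketch. Everything here is hypothetical: the class name `PerNodeStore`, the handler names, the node-local `accept` policy, and especially the choice to model a deletion criterion as a predicate over a witness (the exact criteria for per-node storage are an open question above).

```python
from typing import Any, Callable, Dict, Hashable, Optional, Tuple

# Assumption: a deletion criterion is a predicate over some witness.
DeletionCriterion = Callable[[Any], bool]


class PerNodeStore:
    """Hypothetical per-node key/value store driven by incoming messages."""

    def __init__(self, accept: Callable[[Hashable, bytes], bool] = lambda k, v: True):
        self._accept = accept  # node-local policy: each node decides what to store
        self._entries: Dict[Hashable, Tuple[bytes, DeletionCriterion]] = {}

    def handle_store(self, key: Hashable, value: bytes, criterion: DeletionCriterion) -> bool:
        """Store request: returns True if this node promises to keep the value."""
        if not self._accept(key, value):
            return False
        self._entries[key] = (value, criterion)
        return True

    def handle_read(self, key: Hashable) -> Optional[bytes]:
        """Read request: respond with the value, or None if unknown here."""
        entry = self._entries.get(key)
        return None if entry is None else entry[0]

    def handle_delete(self, key: Hashable, witness: Any) -> bool:
        """Delete only when the witness satisfies the stored criterion."""
        entry = self._entries.get(key)
        if entry is None or not entry[1](witness):
            return False
        # Forget the key, the value, and the criterion; the witness is not kept.
        del self._entries[key]
        return True


# Usage: a node that refuses values over 1 KiB, and a criterion that is
# satisfied by any (hypothetical) block height of at least 100.
node = PerNodeStore(accept=lambda k, v: len(v) <= 1024)
stored = node.handle_store("resource-42", b"encrypted output resource", lambda h: h >= 100)
```

The criterion is stored next to the value here for brevity; baking it into the key, as suggested in the answers above, would let every node that refers to the blob agree on it.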
Interaction with Anoma Transactions and Post-Ordering Execution
We can, in fact, store stuff in per-node storage from within post-ordering execution. This is considered a “side effect” as far as the executor is concerned. We could, for example, say that all validators have to store all storage requests (up to some price?) output by all TransactionCandidates that produce valid anoma transactions. This would be super-convenient for storing, say, output resources (possibly encrypted) that we want other people to eventually read and learn about.
We cannot, however, read from per-node storage from within TransactionCandidates. We cannot rely on reads during post-ordering execution, and we cannot rely on reads for verifying anoma transaction validity. This is because different validators may have different experiences of per-node storage at execution time (a given read might succeed for some, but fail for others), which would make the TransactionCandidate non-deterministic.
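A toy Python illustration of the problem, with entirely hypothetical names and an integer standing in for state: the same candidate, run against the same starting state on two validators whose per-node stores differ, produces different results.

```python
from typing import Dict


def bad_candidate(state: int, local_store: Dict[str, int]) -> int:
    """A candidate that (incorrectly) reads per-node storage during execution."""
    value = local_store.get("bonus")  # may or may not be present on this node
    if value is None:
        return state                  # read failed on this validator
    return state + value              # read succeeded on this validator


# Validator A received the store request; validator B never did (or deleted it).
validator_a_store = {"bonus": 5}
validator_b_store: Dict[str, int] = {}

state_a = bad_candidate(10, validator_a_store)
state_b = bad_candidate(10, validator_b_store)
diverged = state_a != state_b  # the replicas no longer agree on the state
```

Writes to per-node storage do not have this problem because they are emitted as side effects and never feed back into the state transition.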
Anoma State Machine Blob Storage
Each executor (which may be a blockchain, run by validators) is a replicated state machine. This state is the most expensive to maintain, since there are a lot of demands on it. We therefore want to keep as little state in here as possible.
State within the state machine is accessible for reads and writes during Post-Ordering Execution, and can be used (along with whatever the TransactionCandidate outputs) to verify anoma transactions. It therefore needs to include:
- commitments, or at least roots thereof, so that anoma transactions can commit new resources
- nullifiers, so that anoma transactions can prove the nullifiers they emit have not been emitted before
- any application-specific state that transaction candidates need to write and read in post-ordering execution. These are what the state machine calls blobs.
Note that, in order to accomplish our goal of carrying the minimum information needed for post-ordering execution and checking anoma transaction validity, we do not need to store:
- proofs of past anoma transaction validity
- deleted stuff, or proof that it was ok to delete it.
Reads and writes (and deletes) to storage here are part of running TransactionCandidates. Writes are approved if and only if the anoma transaction is valid, which is why it makes some kind of sense to include resource logic hashes in blob identifiers: it’s how we determine which transaction candidates are allowed to write what.
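One way to picture this permissioning is the Python sketch below. It is only a sketch under loud assumptions: the key layout `(resource logic hash, sub-key)` follows the description above, but `logic_hash`, `apply_writes`, and `append_only` are hypothetical names, a resource logic is reduced to a plain predicate over the proposed operations in its own portion of blob-space, and the hash is taken over a name rather than over real resource logic code.

```python
import hashlib
from typing import Callable, Dict, List, Optional, Tuple

BlobKey = Tuple[str, str]                  # (resource logic hash, sub-key)
WriteOp = Tuple[BlobKey, Optional[bytes]]  # value None means delete
# Assumption: a resource logic is a predicate over (current blobs, proposed ops).
ResourceLogic = Callable[[Dict[BlobKey, bytes], List[WriteOp]], bool]


def logic_hash(name: str) -> str:
    """Stand-in for hashing a resource logic; here we just hash its name."""
    return hashlib.sha256(name.encode()).hexdigest()


def apply_writes(
    blobs: Dict[BlobKey, bytes],
    ops: List[WriteOp],
    logics: Dict[str, ResourceLogic],
) -> Dict[BlobKey, bytes]:
    """Apply all ops only if every affected resource logic approves its share."""
    by_namespace: Dict[str, List[WriteOp]] = {}
    for key, value in ops:
        by_namespace.setdefault(key[0], []).append((key, value))
    for namespace, namespace_ops in by_namespace.items():
        logic = logics.get(namespace)
        if logic is None or not logic(blobs, namespace_ops):
            return blobs                   # invalid: state is left unchanged
    updated = dict(blobs)
    for key, value in ops:
        if value is None:
            updated.pop(key, None)         # deleted blobs leave no trace
        else:
            updated[key] = value
    return updated


def append_only(blobs: Dict[BlobKey, bytes], ops: List[WriteOp]) -> bool:
    """Example logic: new keys may be written; no overwrites, no deletes."""
    return all(value is not None and key not in blobs for key, value in ops)


APP = logic_hash("append_only_app")
logics = {APP: append_only}

after_write = apply_writes({}, [((APP, "a"), b"1")], logics)
after_bad_delete = apply_writes(after_write, [((APP, "a"), None)], logics)
after_foreign = apply_writes(after_write, [(("unknown", "x"), b"z")], logics)
```

Because the first key element names the governing logic, a write outside any known logic's portion of blob-space is rejected, and each application's deletion rules live in its own predicate.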
My proposal for how to lay out state in the anoma state machine is The State Architecture ART Report.
I hope that helps,