Storage and deletion criteria questions

To have a more synchronised view between the RM and distributed systems sides, it would be helpful to explicitly answer these questions related to storage:

  1. What is a blob storage? What data is stored there and why?
  2. Who decides what to store in the blob storage and for how long?
  3. What other types of storage are there? What are they used for?
  4. What is the relationship between different types of storages?
  5. Which storage type(s) are deletion criteria associated with? Is it one storage type or many?
  6. Who decides what the deletion criterion should be?
  7. How do we translate between the RM deletion_criterion and the actual delete and write instructions associated with transactions?
  8. What is the relationship between RM transactions and distributed systems (?) transactions?

cc @isheff @cwgoes @degregat

I’ll try to address your questions in order:

  1. What is a blob storage? What data is stored there and why?
    • Unfortunately, there are several places where there may be storage, which may call the things they’re storing “blobs.” I write about two that we might care about below: Per-Node Key/Value Stores, and Anoma State Machine Blob Storage. In a prior Slack conversation, we were discussing Anoma State Machine blob storage.
  2. Who decides what to store in the blob storage and for how long?
    • I try to address this in the sections below.
  3. What other types of storage are there? What are they used for?
    • Anoma State Machine state also has to include commitments and nullifiers, and possibly some other stuff (past state roots, maybe endorsements on other machines). Per-Node key/value stores that we’ve talked about in the past think of all values as “blobs.” We can, of course, imagine other types of structured storage.
  4. What is the relationship between different types of storages?
    • Each node (including validators) can maintain their own Per-Node Key/Value Store, and each validator also maintains a local copy of the Anoma State Machine state (which includes blob storage). However, these are conceptually different: one is a per-node affair, with reads and deletes (at least potentially) independent on each node. The anoma state machine blob storage exists “within” the replicated state machine: it is only (directly) accessible from TransactionCandidates running inside the state machine.
  5. Which storage type(s) are deletion criteria associated with? Is it one storage type or many?
    • This may vary depending on which blob storage we’re talking about. In any case, we presumably need some predicate, and the storage is only supposed to delete the blob if it has evidence of some witness that satisfies the predicate. Since Anoma State Machine Blob Storage only updates on a successful anoma transaction, it seems reasonable to roll this into the resource logic that determines whether an anoma transaction is successful. For Per-Node Key/Value Storage, it may be preferable to have a different kind of predicate, particularly if the storing nodes are expected to produce the witness (and know when to do so).
  6. Who decides what the deletion criterion should be?
    • I can think of no better answer than to bake the criterion into the “key” identifying the blob. That way, at least everyone referring to the blob agrees on what the criterion should be. Since Anoma State Machine Blob Storage is primarily used for whatever application-specific data TransactionCandidates need to use in post-ordering execution, we presumably want to permission parts of this storage space by application. This is why the first element of Anoma State Machine Blob keys is resource logic hash.
  7. How do we translate between the RM deletion_criterion and the actual delete and write instructions associated with transactions?
    • I think this is best explained with a little context, which I have tried to do in the sections below.
  8. What is the relationship between RM transactions and distributed systems (?) transactions?
    • When talking about state machine replication or online transaction processing or database transactions in distributed systems, we imagine that the thing we care about is some State, which updates in a serializable manner according to some state_transition_function. A transaction is the only input to the state transition function besides the previous state, so state_transition_function : State -> transaction -> State. In principle, if a group of replicas agree on a starting state, and a list of transactions, they can reach the current state by running the state_transition_function over and over. A fairly common pattern is to have the transaction type be a function State -> State, and so state_transition_function literally just applies transaction to the old state to get the new state. In the anoma documentation, these “distributed systems transactions” should always be called TransactionCandidates. We call them this to distinguish them from RM transactions, also known as anoma transactions. I occasionally fail at making this distinction, and that is my bad.
    • RM transactions, also known as anoma transactions, to my understanding (and admittedly I am not up on the latest drafts), specify a specific set of nullifiers and commitments to emit, and a proof that the resource logics of all the resources involved are satisfied by some set of witnesses. They can also specify some kind of appData, which, to my understanding, can include some writes of some kind (possibly to various kinds of blob storage) that may be required to satisfy some resource logics (i.e. resource logics can make writes mandatory). Specifically for the anoma state machine blob storage, I propose that this can be extended to a fairly arbitrary predicate over the writes (including deletes) made: each resource logic can govern exactly what writes or deletes are allowed within its portion of blob-space.
    • The anoma state machine update function needs to do several things (a rough sketch follows this list):
      1. Perform “post-ordering execution,” running some kind of executable code in the TransactionCandidate. This can read from anoma state machine state (including blobs therein), and outputs an anoma transaction (which can include storage operations in appData), as well as a set of “side effects,” which can include messages to be sent over the wire, including store requests to Per-Node key/value stores.
      2. Verify the anoma transaction: check that witnesses exist for all the resource logics involved, all the nullifiers are new, and all the required commitments exist.
      3. If the verification passes, perform all side effects, and write updates to state (nullifiers, commitments, and storage operations from appData).
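
To make the shape concrete, here is a minimal Haskell sketch of the above, as I think of it. All of the type and function names (State, AnomaTransaction, SideEffect, verify, applyWrites, and so on) are placeholders of mine for illustration, not names from the spec:

```haskell
-- Placeholder for the replicated Anoma State Machine state
-- (commitments, nullifiers, blobs, ...).
data State = State

-- The RM ("anoma") transaction produced by post-ordering execution.
data AnomaTransaction = AnomaTransaction

-- Side effects are not writes to replicated state: e.g. messages over the
-- wire, including store requests to per-node key/value stores.
data SideEffect = SendMessage | PerNodeStoreRequest

-- The "distributed systems transaction": executable code run against the
-- replicated state after ordering.
newtype TransactionCandidate = TransactionCandidate
  { runCandidate :: State -> (AnomaTransaction, [SideEffect]) }

stateTransitionFunction :: State -> TransactionCandidate -> State
stateTransitionFunction oldState candidate =
  let (anomaTx, effects) = runCandidate candidate oldState  -- 1. post-ordering execution
  in if verify oldState anomaTx                             -- 2. verify the anoma transaction
       then performSideEffects effects `seq`                -- 3. perform side effects ...
            applyWrites oldState anomaTx                    --    ... and write updates to state
       else oldState                                        -- invalid: no effects, state unchanged

-- Stubs standing in for the real checks, effects, and writes.
verify :: State -> AnomaTransaction -> Bool
verify _ _ = True

performSideEffects :: [SideEffect] -> ()
performSideEffects _ = ()

applyWrites :: State -> AnomaTransaction -> State
applyWrites s _ = s
```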

Per-Node Key/Value Stores

We can have each node store key/value pairs, which other nodes can request by key.

  1. Storage is initiated by sending a node a message, and it promises to store the value until some deletion criterion is met. Each node decides what it wants to store, and for how long. I’m not sure exactly what the deletion criteria here are.
  2. Presumably, after the deletion criterion is met, the node forgets about the key, the value, and any kind of proof the deletion criterion was met.
  3. Reads are likewise by message: someone sends in a request for a key, and the node responds with the value.
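
Very roughly, the message interface might look something like the following Haskell sketch (all names and message shapes here are made up for illustration; in particular, the form of the deletion criterion is deliberately left open):

```haskell
import qualified Data.Map.Strict as Map

type Key   = String
type Value = String   -- values are opaque "blobs" as far as the store is concerned

-- Some predicate over evidence; the node deletes an entry only once it has
-- seen a witness satisfying it (the exact form is deliberately left open).
type DeletionCriterion = String -> Bool

data Request
  = Store Key Value DeletionCriterion   -- please keep this until the criterion is met
  | Read  Key                           -- please send me the value for this key

data Response
  = Stored   Key                        -- a promise to store until the criterion is met
  | Found    Key Value
  | NotFound Key                        -- unknown key, or already deleted

-- Each node keeps its own store and decides for itself what it accepts.
type PerNodeStore = Map.Map Key (Value, DeletionCriterion)

handle :: PerNodeStore -> Request -> (PerNodeStore, Response)
handle store (Store k v crit) = (Map.insert k (v, crit) store, Stored k)
handle store (Read k)         =
  case Map.lookup k store of
    Just (v, _) -> (store, Found k v)
    Nothing     -> (store, NotFound k)
```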

We have at times talked about implementing this as part of the network stack, for some reason.

Interaction with Anoma Transactions and Post-Ordering Execution

We can, in fact, store stuff in per-node storage from within post-ordering execution. This is considered a “side effect” as far as the executor is concerned. We could, for example, say that all validators have to store all storage requests (up to some price?) output by all TransactionCandidates that produce valid anoma transactions. This would be super-convenient for storing, say, output resources (possibly encrypted) that we want other people to eventually read and learn about.

We cannot, however, read from per-node storage from within TransactionCandidates. We cannot rely on reads during post-ordering execution, and we cannot rely on reads for verifying anoma transaction validity. This is because different validators may have different experiences of per-node storage at execution time (a given read might succeed for some, but fail for others), which would make the TransactionCandidate non-deterministic.

Anoma State Machine Blob Storage

Each executor (which may be a blockchain, run by validators) is a replicated state machine. Its state is the most expensive to maintain, since there are a lot of demands on it. We therefore want to keep as little state in here as possible.

State within the state machine is accessible for reads and writes during Post-Ordering Execution, and can be used (along with whatever the TransactionCandidate outputs) to verify anoma transactions. It therefore needs to include (sketched after this list):

  1. commitments, or at least roots thereof, so that anoma transactions can commit new resources
  2. nullifiers, so that anoma transactions can prove the nullifiers they emit have not been emitted before
  3. any application-specific state that transaction candidates need to write and read in post-ordering execution. These are what the state machine calls blobs.
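
As a rough sketch (with made-up type names, and hashes and blobs stood in by strings), that state looks something like:

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

type Hash = String   -- stand-in for whatever hash / commitment type is actually used

-- Blob keys are prefixed by the resource logic hash that governs that portion
-- of blob-space, so logics can permission writes to "their" keys.
type BlobKey = (Hash, String)

data AnomaStateMachineState = AnomaStateMachineState
  { commitmentRoots :: [Hash]                  -- 1. commitments, or at least roots thereof
  , nullifiers      :: Set.Set Hash            -- 2. so emitted nullifiers can be checked for freshness
  , blobs           :: Map.Map BlobKey String  -- 3. application-specific state ("blobs")
  }
```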

Note that, in order to accomplish our goal of carrying the minimum information needed for post-ordering execution and checking anoma transaction validity, we do not need to store:

  1. proofs of past anoma transaction validity
  2. deleted stuff, or proof that it was ok to delete it.

Reads and writes (and deletes) to storage here are part of running TransactionCandidates. Writes are approved if and only if the anoma transaction is valid, which is why it makes some kind of sense to include resource logic hashes in blob identifiers: it’s how we determine which transaction candidates are allowed to write what.

My proposal for how to lay out state in the anoma state machine is The State Architecture ART Report.

I hope that helps,

  • Isaac

Thanks for the detailed answer! I gave it a first read and have some clarifying questions (statements):

  1. There is one storage (per-node) associated with the physical machine itself, and one storage (Anoma State Machine) associated with the Anoma State Machine abstraction. The latter is part of the abstraction; the former belongs to the physical machine.
  2. Anoma State Machine storage contains everything needed to represent the state, and we are trying to minimise what goes into it. It is the same for all machines (maybe not in practice, but it strives to be), since it is part of the replicated state. The per-node storage is governed by the machine it exists on and may store whatever that machine deems appropriate.
  3. Blob storage is a meaningful concept from the engineering perspective, but it is not very helpful from the research perspective, as it refers to a type of storage rather than to its function.

I think it might be helpful to describe what appData is designed for in the RM specs to sync our understanding. appData is designed to contain instances (additional arguments to the proving system) needed to verify the logic proofs associated with the transaction. Each such entry in the appData has an associated deletion_criterion that specifies (in the way it is thought of in the RM realms) for how long to store this piece of data from the moment the Transaction settles.
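
In (very rough) type terms, I think of each appData entry as something like the following sketch; the constructor names are only illustrative, not what the RM specs actually define:

```haskell
-- Illustrative constructors only; the real criterion is whatever predicate
-- the RM specs settle on.
data DeletionCriterion
  = StoreForever
  | DeleteAfterBlock Integer
  | DeleteAfterSignature String

data AppDataEntry = AppDataEntry
  { instanceData      :: String             -- additional argument to the proving system
  , deletionCriterion :: DeletionCriterion  -- how long to keep it after the Transaction settles
  }

type AppData = [AppDataEntry]
```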

The assumption is that once the transaction settles, it is stored in storage (per-node storage, if choosing between the two) either forever or rolled up in a way that makes it possible to verify the transaction again at any point in time. I think my prior mental model was that it is stored forever (which makes the deletion criterion part rather confusing), but now the latter seems to make more sense, since it explains the role of the deletion criterion. Either way, I’ll try to clarify this.

This is interesting; it makes my previous thought about minimising the state machine storage incorrect. So if we replace the actual data with an instruction to fetch it, the data must be stored in the state machine storage? This is another trade-off to consider when choosing between the options, but it is only a trade-off if we assume the state machine storage size is a bottleneck. My understanding is that storing things in the state machine storage is very expensive (relative to that, the per-node storage is cheap). Is that correct?

Some of my questions seem to be answered in your response, but I asked them anyway to make sure I understood what you are saying correctly.

I see; thank you for the clarification. In general, validators will of course want to store such proof data until they think they won’t need it anymore (such as if / when some kind of roll-up is completed). The ultimate purpose here is for a validator to convince users that its replica of the replicated state machine is correct: it is the result of a series of correct state transitions. As far as I know, we don’t need such proof data inside the state machine (since, “inside” the state machine, we’re in some sense already assuming that whatever state we read is correct). I therefore assume this goes in per-node storage on all validators, where there should indeed be some deletion criteria that correspond with the “won’t need it anymore” requirement.

I still think that resource logics (or at least something very similar) can govern writes to state-machine blob storage. Applications need some way to specify what state updates are allowed to “their” portion of the state, and I don’t know a better way to do that than to partition blob-space by logic hash, and require that any writes to that part of blob space be approved by the resource logic (which may, in turn, demand additional witnesses or some resources being created or destroyed, just like when they approve creating or consuming a resource). Of course, we’d need to include such a thing in the type of resource logic.
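
As a sketch of what I mean (names are mine, and everything here is heavily simplified), the resource logic type would gain a predicate over the writes touching its part of blob-space:

```haskell
type Hash    = String
type BlobKey = (Hash, String)   -- (logic hash, application-chosen suffix)
type Witness = String           -- placeholder

data WriteOp
  = WriteBlob  BlobKey String   -- write (or overwrite) a blob
  | DeleteBlob BlobKey          -- delete a blob

-- Extend the resource logic so that, besides approving resource
-- creation/consumption, it also approves the writes touching its own
-- prefix of blob-space (possibly demanding extra witnesses or resources).
data ResourceLogic = ResourceLogic
  { logicHash     :: Hash
  , approveAction :: Witness -> Bool              -- the usual resource checks (simplified)
  , approveWrites :: Witness -> [WriteOp] -> Bool -- new: predicate over writes in this logic's keyspace
  }

-- Only writes whose key is prefixed by the logic hash are submitted to
-- that logic's approveWrites.
inKeyspace :: ResourceLogic -> WriteOp -> Bool
inKeyspace logic (WriteBlob  (h, _) _) = h == logicHash logic
inKeyspace logic (DeleteBlob (h, _))   = h == logicHash logic
```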

Storage in the replicated state machine is indeed very expensive (price will vary by chain, but in general, this is an expensive type of storage). In general, whenever possible, TransactionCandidates should include any information needed to create the anoma transaction (appData included). This avoids touching state machine state, which makes transaction candidates easier to run in parallel.

Aside: We might imagine that if a validator has some of this information in per-node storage, it would be more efficient to assemble the TransactionCandidate on the Validator, rather than assembling it on a Solver and then transmitting it to a Validator. I suppose we might call this “pre-ordering execution”? That said, all TransactionCandidates must eventually be replicated across all Validators, so this might not actually be a useful thing to do. We might imagine a data-syncing protocol (used by clients, solvers, and validators) that allows data structures (like TransactionCandidates) to reference sub-structures by hash, and nodes fetch only those sub-structures they don’t already have, as a way to make general data transfer more efficient. Anyway, for now, such things are not part of the spec.
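
Just to illustrate the kind of thing I mean (again, emphatically not part of the spec; all names are made up), the “fetch only what you don’t already have” pattern is roughly:

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

type Hash = String

-- A structure (e.g. a TransactionCandidate) that refers to sub-structures by
-- hash instead of carrying them inline.
data Node = Node { payload :: String, children :: [Hash] }

-- Whatever a node already has locally.
type Store = Map.Map Hash Node

-- Walk the references and collect only the hashes that would still need fetching.
missing :: Store -> Hash -> Set.Set Hash
missing store root = go Set.empty [root]
  where
    go acc []       = acc
    go acc (h:rest) =
      case Map.lookup h store of
        Nothing   -> go (Set.insert h acc) rest       -- don't have it: fetch it
        Just node -> go acc (children node ++ rest)   -- have it: follow its references
```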

In general, thanks for the clarifications. I think we’re much closer to the same page.

  • Isaac

I’m wondering now if we should have two distinct mechanisms for storage writes. From what I understand, it can be useful to specify writes to both storages since they seem to have different purposes:

  • blob storage keeps data accessible post ordering
  • per-node storage allows independent transaction verification and can generally store stuff for whatever reason the user has

I think either we never explicitly specify writes to the latter, or we should keep them separate to maintain a clear distinction and avoid confusion.

We can either do this by introducing some structure into the deletion criterion predicate, or have two separate predicates. Or some other way; these two just came to mind.
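
For instance (purely illustrative type names), the two options might look something like:

```haskell
type Evidence = String   -- placeholder for whatever witnesses a deletion

-- Option A: one criterion type, with explicit structure saying which storage it targets.
data StorageTarget = StateMachineBlobStorage | PerNodeStorage

data DeletionCriterion = DeletionCriterion
  { target    :: StorageTarget
  , predicate :: Evidence -> Bool
  }

-- Option B: keep the two kinds of writes (and their criteria) entirely separate.
newtype BlobDeletionCriterion    = BlobDeletionCriterion    (Evidence -> Bool)
newtype PerNodeDeletionCriterion = PerNodeDeletionCriterion (Evidence -> Bool)
```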

Side note: I think it would also be helpful to further think about distant-time verification: if the transaction is to be part of a rollup, then storing whatever data the user specifies is very different from when there is no rollup. If there is rollup verification, storing the data seems to be more of a user’s whim. If there is no rollup, everything must be stored, and if it must be stored, the user can’t be charged much for what is effectively infinite storage (because it isn’t just for the user now; the whole chain’s status would depend on the ability to verify the transaction). Of course it may not be very practical, but it makes sense to consider if we don’t exclude this option. But this generally also introduces the distinction between “must” writes and “want” writes, and they shouldn’t be treated the same, I think.

My understanding is that post- and pre-ordering execution partition the world of execution. Then what do we call shielded execution with a transaction candidate assembled by the solver? I think we used to call that pre-ordering execution, but even if we don’t, we need to call it something.

Indeed. We should definitely be able to do this. From the “state update function” or “executor” perspective, writes to per-node storage are a “side effect,” while writes to anoma state machine blob storage are a “write,” but that distinction isn’t terribly important for this discussion.

I have been imagining that these 2 writes arise from 2 different places: per-node storage writes come out of post-ordering execution, along with the anoma transaction, and are written (along with their deletion predicates and any other metadata they’re supposed to carry) if the anoma transaction’s proofs verify. Anoma state-machine storage writes are part of the transaction itself: some resource logics may depend on the set of writes (e.g. you can only consume this resource if you store a blob in storage here…). It occurs to me that we could remove this distinction, and allow anoma transaction proofs to reference write operations to per-node storage (to be stored on all validators, but not readable from within the anoma resource machine). Come to think of it, we could allow anoma transaction proofs to reference other side-effects as well, like messages to be sent over the network. I genuinely don’t know if this is a good idea.
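
In sketch form (illustrative names only, nothing here is from the spec), the picture I have been imagining is:

```haskell
type Proofs = String   -- placeholder

-- Part of the anoma transaction itself: resource logics can depend on these.
data BlobWrite = BlobWrite String String

-- A side effect: stored by validators (with deletion predicates and any other
-- metadata) only if the transaction's proofs verify.
data PerNodeWrite = PerNodeWrite String String

-- Output of post-ordering execution:
data ExecutionOutput = ExecutionOutput
  { anomaTransaction :: ([BlobWrite], Proofs)   -- blob writes travel inside the transaction
  , sideEffects      :: [PerNodeWrite]          -- per-node writes travel alongside it
  }

-- The possible unification mentioned above would let proofs reference
-- sideEffects too, removing the distinction; whether that is a good idea
-- is an open question.
```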

Good point. We should definitely not add another definition to the term “pre-ordering execution.”

I think I might understand post-ordering execution wrongly since it seems to imply more than I imagine. The way I understand post-ordering execution is that the solver submits a transaction “template” that describes the idea of what should be done and fills in some of the fields, but in the post-ordering phase the placeholders are replaced with the actual values.

  • Do we assume that submitted transactions that imply post-ordering execution are valid before ordering?
  • If not, how do we tell the difference between an invalid transaction and a transaction that will be valid after ordering?

But back to my understanding: such a definition practically means that post-ordering execution is not possible for shielded transactions (I describe it a bit further here), since they must have all the fields ready. But it seems there are other things that happen post-ordering that do affect shielded transactions? It would be helpful to understand better how per-node storage writes come out of post-ordering execution. I don’t have an intuition for that now, and so I can’t see how it applies to shielded transactions. Can users affect what is written in per-node storage if they have a particular desire to have something stored but do not need it in the blob storage?

From your description, it seems to me that we make sure users can easily specify what to write to the blob storage, which makes a lot of sense to me, but I don’t understand how to access this state in the shielded case and how it can be used in a helpful way there. At the same time, per-node storage can be useful, but it doesn’t have the same guarantees, so it might not be what is needed. I think a better understanding of post-ordering execution will help me with this.