Minimum viable abstract VM

The aim of this post is to describe a “minimum viable abstract VM” which does everything that we require. This could also be understood as an abstract interface specification for Taiga.

Basics

Fix a finite field F, with:

  • Multiplication: *_F, with type F \to F \to F
  • Addition: +_F, with type F \to F \to F
  • Additive identity: 0_F, with type F
  • Multiplicative identity: 1_F, with type F
  • Additive inverse: -_F, with type F \to F
  • Equality: =_F, with type F \to F \to {0 | 1}
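The field interface above can be sketched concretely; a minimal Python model, assuming an illustrative prime modulus (the modulus is not part of the specification):

```python
# A minimal model of the field interface, assuming the illustrative
# prime modulus 2**61 - 1 (not part of the specification).
MOD = 2**61 - 1

def f_add(a, b):
    return (a + b) % MOD       # +_F

def f_mul(a, b):
    return (a * b) % MOD       # *_F

def f_neg(a):
    return (-a) % MOD          # -_F

def f_eq(a, b):
    # =_F returns an element of {0 | 1}
    return 1 if (a - b) % MOD == 0 else 0
```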

Fix an abstract (possibly infinite) program field P, with:

  • Multiplication: *_P, with type P \to P \to P
  • Addition: +_P, with type P \to P \to P
  • Additive identity: 0_P, with type P
  • Multiplicative identity: 1_P, with type P
  • Additive inverse: -_P, with type P \to P
  • Comparison: <_P (less than, exclusive), with type P \to P \to {0 | 1}

Fix an abstract proving system with two functions:

  • Prove(C, x, w) ⟶ π
  • Verify(C, x, π) ⟶ { 0 | 1}

We say that this proving system is zero-knowledge if from π the verifier learns nothing of w.

(We will omit preprocessing for the sake of simplicity.)

Fix a hash function H, with type P \to F, such that:

  • H is non-invertible, in that a computationally bounded adversary cannot feasibly compute a preimage from the output
  • H is collision-resistant, in that a computationally bounded adversary cannot feasibly find two preimages with the same output

Programs

Now, we will define programs.

Programs PROG are extended multivariate polynomials in variables a_0 ... a_n \in P, written in canonical form, with designated variable b, for example:

a_0^3 + 2a_1^2 - 3a_0 + 2 - b = 0
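As a sanity check, this example can be evaluated directly; a satisfying assignment is one that makes the polynomial evaluate to zero (a sketch, with Python integers standing in for elements of P):

```python
def example_program(a0, a1, b):
    # The example polynomial a_0^3 + 2a_1^2 - 3a_0 + 2 - b;
    # an assignment satisfies the program when this evaluates to zero.
    return a0**3 + 2 * a1**2 - 3 * a0 + 2 - b
```

For instance, a_0 = 1, a_1 = 1 forces b = 2.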

We extend these polynomials with control flow and reflection.

First, let’s introduce the control flow primitives: comparison and branching.

Comparison is lt(a, b), with type P \to P \to {0 | 1}, where:

  • lt(a, b) = 0 if a <_P b
  • lt(a, b) = 1 otherwise

Branching is branch(a, b, c), with type P \to P \to P \to P, where:

  • branch(0_P, b, \_) = b
  • branch(1_P, \_, c) = c
  • branch(\_, \_, \_) = undefined

Thus, for example,

branch(lt(a_0, a_1), a_2, a_3^2) - b = 0

is a valid program.

Assuming that a \in {0, 1}, branch(a, b, c) is semantically equivalent to ((1 - a) *_P b) + (a *_P c), so we aren’t changing the expressive power here, just adding explicit information about control flow.
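A quick sketch of this equivalence, with Python integers standing in for elements of P (lt included for completeness):

```python
def lt(a, b):
    # Comparison primitive: 0 if a < b, 1 otherwise.
    return 0 if a < b else 1

def branch(a, b, c):
    # Explicit control flow: selector 0 picks b, selector 1 picks c.
    if a == 0:
        return b
    if a == 1:
        return c
    raise ValueError("branch is undefined for selectors outside {0, 1}")

def branch_arith(a, b, c):
    # The arithmetic encoding from the text: ((1 - a) * b) + (a * c).
    return (1 - a) * b + (a * c)
```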

Second, let’s introduce reflection:

  • reflect_P, with type P \to PROG
  • repr_P, with type PROG \to P
  • eval_P, with type PROG \to P \to \ldots \to P, where the arity depends on PROG

Argument order corresponds to numbering of a_0 ... a_n.

Thus, for example,

eval_P(branch(lt(a_0, a_1), a_2, a_3^2) - a_4 = 0, 3, 4, 5, b) = 0

is a valid program, and so is

eval_P(reflect_P(a_1), 3) + 2a_2 - b = 0

The behavior of applying eval_P with the wrong number of arguments is undefined.

Note, in particular, that we can freely swap a subprogram PROG with eval_P(reflect_P(repr_P(PROG))), where repr_P(PROG) can be calculated at compile-time, and thus “defer compilation”.
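A toy model of this round trip, assuming programs are encoded as indices into a global table (the table-based encoding is illustrative only; a real system would serialize the program into an element of P):

```python
# Toy model: programs are Python callables, and repr_p encodes a program
# as an index into a global table (illustrative only; a real system
# would serialize the program into an element of P).
PROGRAM_TABLE = []

def repr_p(prog):
    # PROG -> P
    PROGRAM_TABLE.append(prog)
    return len(PROGRAM_TABLE) - 1

def reflect_p(p):
    # P -> PROG
    return PROGRAM_TABLE[p]

def eval_p(prog, *args):
    # PROG -> P -> ... -> P; arity depends on prog
    return prog(*args)

# "Deferred compilation": the encoding can be computed at compile time,
# and the program recovered and evaluated later.
square = lambda x: x * x
code = repr_p(square)
```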

Circuit transformation

Programs may be transformed to a circuit, by:

  • factoring the polynomial
  • ordering the factors in canonical form
  • transforming to a circuit DAG
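The last step can be sketched as hash-consing: identical subexpressions map to a single shared node, turning the expression tree into a DAG (the node encoding here is an assumption for illustration, not part of the specification):

```python
# Hash-consing an expression tree into a DAG: identical subexpressions
# (here, nested tuples) map to one shared node.
def build_dag(expr, nodes, cache):
    if expr in cache:
        return cache[expr]              # reuse the shared node
    if expr[0] in ("var", "const"):
        idx = len(nodes)
        nodes.append(expr)
    else:
        op, left, right = expr
        l = build_dag(left, nodes, cache)
        r = build_dag(right, nodes, cache)
        idx = len(nodes)
        nodes.append((op, l, r))
    cache[expr] = idx
    return idx
```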

Stack machine compilation

The circuit DAG may then be compiled to a sequential stack machine in the standard manner. For example, define the stack machine state as (pc, ins, data), with:

  • pc the program counter
  • ins the instruction sequence
  • data the data stack

Stack machine operations:

  • PUSH a
  • NEG
  • ADD
  • MUL
  • COMPARE
  • BRANCH pc_1 pc_2
  • REPR
  • REFLECT
  • EVAL

TODO: Figure out any stack consistency issues on branching.
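For concreteness, a minimal interpreter for the arithmetic and control-flow subset of these operations (REPR/REFLECT/EVAL omitted; the instruction encoding, and the convention that BRANCH pops its selector and jumps to pc_1 on 0 and pc_2 on 1, are assumptions):

```python
# A minimal interpreter for the arithmetic/control-flow subset of the
# stack machine. Instruction encoding and BRANCH semantics (pop the
# selector, jump to pc_1 on 0 and pc_2 on 1) are assumptions;
# REPR/REFLECT/EVAL are omitted.
def run(ins):
    pc, stack = 0, []
    while pc < len(ins):
        op = ins[pc]
        if op[0] == "PUSH":
            stack.append(op[1])
        elif op[0] == "NEG":
            stack.append(-stack.pop())
        elif op[0] == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op[0] == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op[0] == "COMPARE":
            b, a = stack.pop(), stack.pop()
            stack.append(0 if a < b else 1)   # lt convention from above
        elif op[0] == "BRANCH":
            pc = op[1] if stack.pop() == 0 else op[2]
            continue
        pc += 1
    return stack

# branch(lt(2, 3), 10, 20): COMPARE pushes 0, so BRANCH takes the first
# arm; a PUSH 0 / BRANCH pair serves as an unconditional jump past the
# other arm.
program = [
    ("PUSH", 2), ("PUSH", 3), ("COMPARE",),
    ("BRANCH", 4, 7),
    ("PUSH", 10), ("PUSH", 0), ("BRANCH", 8, 8),
    ("PUSH", 20),
]
```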

Resources

Define a resource R as a seven-tuple (PROG, P, F, P, F, F, { 0_P | 1_P }) with fields named as follows:

  • R_{logic} of type PROG
  • R_{label} of type P
  • R_{quantity} of type F
  • R_{value} of type P
  • R_{nonce} of type F
  • R_{nc} of type F (“nullifier commitment”)
  • R_{ephemerality} of type { 0_P | 1_P }

Resources with R_{ephemerality} = 0_P are known as ephemeral, while resources with R_{ephemerality} = 1_P are known as persistent.

Define the commitment of a resource R_{commitment} as hash(R).

Define the address of a resource R_{address} as R_{commitment}.

TODO: The address is the same as the commitment for now.

Define the nullifier of a resource R_{nullifier} as n such that hash(n) = R_{nc}.

Define the type of a resource R_{type} as hash(R_{logic}, R_{label}).

Define the delta of a resource R_{delta} as the two-tuple (R_{type}, R_{quantity}).

Partial transactions

Define a partial transaction PTX as a six-tuple (Set\ F, Set\ F, Set\ π, Set\ (F, F), Set\ P, PROG) with fields named as follows:

  • PTX_{created} of type Set\ F (commitments to newly created resources)
  • PTX_{consumed} of type Set\ F (nullifiers for consumed resources)
  • PTX_{proofs} of type Set\ π (proofs for consumed and created resources)
  • PTX_{delta} of type Set (F, F) (as resource delta above)
  • PTX_{extradata} of type Set\ P (extradata possibly used as input by proofs)
  • PTX_{executable} of type PROG (possibly empty)

A partial transaction is valid w.r.t. a past commitment set CS if and only if:

  • For each commitment c in PTX_{created}, a valid proof π is included in PTX_{proofs} such that, for some resource R with R_{commitment} = c,
    • Verify(R_{logic}, (in, out, extradata, 1), π) = 1, with in \subset PTX_{consumed}, out \subset PTX_{created}, extradata \subset PTX_{extradata}.
  • For each nullifier n in PTX_{consumed}, a valid proof π is included in PTX_{proofs} such that, for some resource R with R_{nullifier} = n and R_{commitment} \in CS,
    • Verify(R_{logic}, (in, out, extradata, 0), π) = 1, with in \subset PTX_{consumed}, out \subset PTX_{created}, extradata \subset PTX_{extradata}.
  • PTX_{delta} = \sum_{R \in PTX_{consumed}} R_{delta} - \sum_{R \in PTX_{created}} R_{delta}

A partial transaction is balanced if and only if PTX_{delta} = 0.
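The delta computation and the balance check can be sketched as summation per resource type, consumed minus created (the type and quantity representations are illustrative):

```python
from collections import defaultdict

# PTX_delta as per-type sums: consumed minus created. The ptx is
# balanced when every per-type total is zero.
def ptx_delta(consumed, created):
    # consumed, created: iterables of (resource_type, quantity) deltas
    total = defaultdict(int)
    for t, q in consumed:
        total[t] += q
    for t, q in created:
        total[t] -= q
    return {t: q for t, q in total.items() if q != 0}

def balanced(consumed, created):
    return ptx_delta(consumed, created) == {}
```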

A partial transaction may be executed by running PTX_{executable}, which may additionally:

  • read resources by R_{address}
  • read from and add to PTX_{extradata}
  • add to PTX_{created} (altering PTX_{delta})
  • add to PTX_{consumed} (altering PTX_{delta})

Partial transactions may be composed either sequentially or concurrently, where sequential composition uses the output commitment set as input to the next transaction, and concurrent composition does not. In both cases, nullifier sets may not conflict.
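A sketch of the nullifier-disjointness requirement on composition, with each ptx reduced to its created/consumed sets for illustration (the sequential threading of the commitment set is omitted):

```python
# Composition requires disjoint nullifier (consumed) sets; a ptx is
# reduced here to its created/consumed sets, and the sequential
# threading of commitment sets is omitted.
def compose(ptx_a, ptx_b):
    if ptx_a["consumed"] & ptx_b["consumed"]:
        raise ValueError("conflicting nullifier sets")
    return {
        "created": ptx_a["created"] | ptx_b["created"],
        "consumed": ptx_a["consumed"] | ptx_b["consumed"],
    }
```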

Recursion

Abstractly, recursion may be achieved as follows: define a resource R with:

  • R_{label} set to the consensus provider identity
  • R_{quantity} set to 1
  • R_{value} set to the commitment and nullifier trees
  • R_{nonce} an incrementing counter
  • R_{nc} = hash(0_F)
  • R_{ephemerality} = 1
  • R_{logic} = verify\_partial\_tx\ \&\&\ verify\_signature where the partial tx is checked with respect to the input and output commitment and nullifier sets in R_{value} and the signature is checked with respect to the consensus provider identity in R_{label}.

Is there a reason to prefer a stack machine to a register machine?

e.g., a stack-based VM is going to make it a lot easier to write compilers. You don’t have to deal with register allocation strategies. You have, essentially, an unlimited number of registers to work with.

  1. “VM which does everything that we require” - what do we require it to do? Do we have an explicit set of requirements anywhere?
  2. When we introduce reflection/eval/repr it would be helpful to have more context about the meaning. I don’t understand from the name and the type what the defined things (what are they?..) are supposed to do
  3. Is there an explicit conversion function between F and P? I can see we apply functions defined for P elements on F elements and use equality inside functions defined for P elements
  4. If we define everything it might make sense to define circuits as well
  5. Do the bullet points in program-to-circuit compilations represent steps or options? If steps it would be more clear with a numbered list
  6. What is the standard manner for compiling circuit DAGs to sequential stack machines? Do you mean the approach used for boolean circuits?
  7. Why are we compiling circuits to sequential stack machines? My impression is that we do the same stuff as on normal computers but use arithmetic circuits and then use them to additionally create a proof. What are we trying to achieve with it then?
  8. I’m not sure if it is a mistake or a proposal about nullifiers. In the notes we don’t keep nullifier commitments, we have nullifier key commitments
  9. What is a resource label?
  10. What is a resource delta? Why is it delta? Is it somehow related to the value commitment? Now I start to get the delta thing… (sorry I’m just writing as I read) maybe it makes sense to call it quantity delta then, or maybe not
  11. I think a partial transaction is never balanced, otherwise it isn’t partial. If we want to unify them, it might make sense to call them something else instead of partial transactions, but I’m not sure if it will affect clarity positively
  12. Also it isn’t super clear how we combine partial transactions. I see two ways to think about it - one is to add ptxs to a set until the set balances and then turn them into a transaction (I think right now it is the one, at least from the implementation perspective), the other one is just merging ptxs until the result is balanced (which fits your description more). Maybe it makes sense to define it / agree on how we understand it; without explicit agreement we say different things (and might actually mean different things)
  13. If partial transactions are composed sequentially and it happens in the intent gossip network (when the resource being consumed was created in the preceding partial transaction that was not finalised yet), the notes need to be ephemeral; otherwise there is no way to check the existence of the resource
  14. Does concurrent composition only happen when two notes are both inputs (or outputs)?
  15. I’m not sure I understand the recursion part. I can’t see the recursion in it
  16. So, what does this VM do? We define how to compile programs to sequential stack machines, and then talk about resources and partial transactions. What is the connection between these parts?

How do you sum tuples (R_address, R_quantity)? Since there are multiple resources (and therefore multiple addresses) in a partial transaction, shouldn’t the type of PTX_{delta} be Set(F,F)?

Partial transactions may be composed either sequentially or concurrently , where sequential composition uses the output commitment set as input to the next transaction, and concurrent composition does not. In both cases, nullifier sets may not conflict.

I’m thinking about the restriction of the nullifier of the consumed resource to the \rho value of the created resource. Do we also want to mention that commitment sets may not conflict either?

Define the commitment of a resource R_{commitment} as hash(R)

Are the hiding properties via randomness important at this level?

Other minor comments:

  • We can say that a program field is an ordered ring and a PROG is the polynomial ring \mathbb{Z}[X] constrained to P.
  • Do we want to add a multiplicative inverse to the finite field’s operations?
  1. No (we should write one). Implicitly here, I’m including:
    • Resource-based state model
    • Partial transaction execution model
    • Post-ordering execution
    • Flexible compilation (can target circuits directly, or a stack machine, and can split a program into parts which target each)
  2. I will add more to this section. The basic idea is that reflect_P and repr_P allow us to convert opaque “program” objects to and from elements of P, which we can operate over directly in programs, and eval_P actually evaluates an abstract program object.
  3. No, we should probably add a conversion function - I may just have been sloppy sometimes with the delineation though.
  4. i.e. Halo2 circuits? Or what is the particular form?
  5. Will update.
  6. Probably similar, I can write down a more explicit algorithm.
  7. By proving the interpretation of a sequential stack machine, we don’t need to evaluate all branches of the program - this is the basic reason. We could also use another kind of virtual machine, but a stack machine seems most natural (e.g. corresponds to the linearity we already have in a circuit).
  8. I do not understand all of the Taiga details yet, so I approximated some parts. What is the difference, exactly?
  9. Basically “app_data_static” but renamed.
  10. I think we might have called this “denomination” previously but I thought that name might have been confusing. One word names are nice if possible, I think.
  11. The tricky part is that a “full transaction” is not really a different data structure, just a partial transaction which happens to be balanced, right? I thought of this too but wasn’t sure how to make it clearer (suggestions welcome).
  12. Yes, this should be defined.
  13. Yes.
  14. No, concurrent composition basically means that the inputs and outputs of two ptxs are unrelated.
  15. The idea is to describe how we encode the “balanced valid partial transaction” check directly in Taiga itself - this is what we would need in order to make “roll-up” proofs to other chains, for example.
  16. Taiga needs to execute the VM when running PTX_{executable}, and I’m trying to define a VM that will work for our needs in both resource logics (not evaluating all branches) and the executable (computing new resources), to avoid duplicate work. Also, there’s an interesting correspondence between the “virtual register” model of resource logics and the stack machine style evaluation that it might be interesting to diagram or explore further.

Summed by adding together quantities of resources with the same address. Yes, it should be Set(F, F), thanks - fixed.

  • Post-ordering execution

How does this VM model ensure post-ordering execution?

i.e. Halo2 circuits? Or what is the particular form?

I was thinking about something simple: gates, wires, inputs, outputs, types. It just seems a bit inconsistent to not have this definition. Also, what is the difference between a circuit and a circuit DAG?

By proving the interpretation of a sequential stack machine, we don’t need to evaluate all branches of the program

Why?

corresponds to the linearity we already have in a circuit

What kind of linearity you mean here? If you mean the “lack of branching” that circuits have then we don’t need to evaluate all branches even without stack machine because circuits don’t have branches

What is the difference, exactly? *between nullifier and nullifier key*

The nullifier key is used to compute nullifiers and acts as a sort of reference to the receiver of the note (private information known to the receiver only); the creator of the note only knows a commitment. The nullifier is computed from the note and the other note in the same action (each action contains one input and one output note; our note is an output note, and the nullifier of the input note will be used to compute the nullifier of the note we are creating now) - there is no way to precompute the nullifier for a note, so the note creator cannot use this instead

One word names are nice if possible, I think.

I actually like delta

The tricky part is that a “full transaction” is not really a different data structure, just a partial transaction which happens to be balanced, right?

Right now we see a partial transaction and a transaction as different structures (a transaction = a set of partial transactions), but they can be seen as one thing (if we want). In that case we need to rethink the way we explain and name them. I would say I prefer the separation of these notions because it is easier to reason about how things work, but maybe it could make sense to use some other word instead of transaction? A partial transaction is currently a data structure with strictly two output and two input notes; a transaction is a data structure with an undefined number of notes, built from a set of partial transactions. On the implementation level a partial transaction can balance, but to send it to the mempool you need to turn it into a transaction, because only transactions are allowed to go there. Clearly there are two evolution stages here; I think it makes sense to preserve the separation, but maybe call them something else

No, concurrent composition basically means that the inputs and outputs of two ptxs are unrelated.

Feels more like dependent/independent composition than concurrent/sequential, because “concurrent” gives a feeling that there is some relationship between the ingredients of the composition, which is wrong

The idea is to describe how we encode the “balanced valid partial transaction” check directly in Taiga itself - this is what we would need in order to make “roll-up” proofs to other chains, for example.

I think it might be helpful to describe it in more details

Taiga needs to execute the VM

We really need to define Taiga

I’m trying to define a VM that will work for our needs in both resource logics (not evaluating all branches) and the executable (computing new resources), to avoid duplicate work

So far I’m struggling to understand how it all works together but I’m sure it will change by the end of the hacker house

Also, there’s an interesting correspondence between the “virtual register” model of resource logics and the stack machine style evaluation that it might be interesting to diagram or explore further.

Are there any sources you could recommend to read to understand the background better?

Proposals for the Resource format:

  • R_{label} should also be of type F
  • R_{ephemerality} should either be 0_F|1_F, or even just F
  • We might also want to put commitments of R_{logic} and R_{value} (which are \in F) into the canonical form of Resources and always look up the p \in P they correspond to during execution.

Corrections for the Field specifications:

  • multiplicative inverses are missing
    • if this is intended, we should not call them fields