Investigating alternative backends for the resource machine

Requirements:

  • full recursion
  • working implementation (prover/verifier)
  • the higher-level, the better

Some potential options:

@mariari please take it from here!

Perhaps this is worth considering too:

I believe @jonathan and @Lukasz have already investigated zkLLVM in some depth.


From what I gather, there are issues on the zkLLVM side that have not been resolved, making it an unlikely candidate to be chosen.

We're getting a segfault in the zkLLVM transpiler, and that's a bug in the transpiler - it's up to the zkLLVM team to fix it. They said some time ago that they'll work on it, but it'll take them more time to fix this particular bug. Previously, they've been relatively quick to fix other bugs I encountered or to add necessary features. It's not clear whether this one is the last bug.

Investigations into ZK Backends

For this investigation of the various ZK backends I aim to answer the following question: "What ZK system works best for Anoma?" In order to do so, I've looked at the following properties:

1. How high-leverage is the system?
  2. How mature is the backend?
    • Does the project even work?
    • Has the interface changed dramatically?
    • Is there any audit of the codebase?

1. includes properties like full recursion, and hopefully a reasonably expressive language that we can use now and that Juvix can more easily target.

2. has a bunch of sub-clauses; the rough goal we are looking for is a mature and stable project. As we will see, no choice is fully mature, but there are some decent candidates.

Because of the nature of Anoma, none of these choices are fully binding, but rather express various systems that can fulfill the role of being the resource machine implementation.

For this I'll be assessing the following projects:

  1. zkLLVM
  2. Risc0
  3. Triton
  4. Lurk
  5. Miden
  6. Cairo

zkLLVM

The zkLLVM repository can be found here.

This is more of a toolset for circuit development; it seems to currently support a PLONKish system. From a leverage perspective, it seems to allow support for rather high-level languages. However, the maturity of the system seems lacking. The Juvix team have run into several bugs in the transpiler; while most have been fixed in a timely manner, there is still at least one bug stopping the compilation pipeline from properly succeeding. Due to these bugs, I would say the backend is currently immature and needs more time before being properly considered.

Triton

The Triton repository can be found here.

Triton programs can be found here.

Leverage Considerations

Triton leverages recursive verification of STARKs produced from the VM itself.

The language is a stack machine; however, the stack machine is a bit peculiar (in similar ways to Miden):

  1. arguments are passed both on the stack and as static instruction arguments
  2. recurse jumps back to the most recently called label
  3. branching is done via labels

The implication of 1. is that the system cannot dynamically take values off the stack; one has to place the values quite statically during compilation. This means that if we were to, say, pass a list of resources, we would need to generate n read_resource functions, each calling read_mem with the appropriate offset.
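As a rough sketch of what this implies for a compiler backend (the emitted instruction names below are illustrative, not exact Triton assembly), the code generator has to specialise one reader per index at compile time rather than emitting a single parameterised one:

```rust
// Hypothetical sketch: emit one specialised reader per resource index,
// because the memory offset has to be fixed at compile time.
fn emit_read_resource(index: usize, words_per_resource: usize) -> String {
    let offset = index * words_per_resource;
    format!("read_resource_{index}:\n  push {offset}\n  read_mem\n  return\n")
}

fn main() {
    // For a transaction with, say, 3 resources of 4 words each,
    // we emit read_resource_0 .. read_resource_2.
    for i in 0..3 {
        print!("{}", emit_read_resource(i, 4));
    }
}
```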

2. is implemented via a call stack (much like in normal Forth); however, looking at the control flow operations, there isn't much manipulation of where we ought to jump. This, combined with 3., means that various kinds of looping operations have to be expressed very awkwardly, if they can be expressed at all. You can read a correspondence I've had with Jan Ferdinand about this exact issue.

Maturity

For maturity, it seems that the team working on Triton are making a language called tasm over it. However, I am unaware of any other project building on it. The language seems rather stable, with the Rust bindings not changing much:

  1. Upgrade to Triton VM v0.20.0
  2. Upgrade to Triton VM v0.29.1
  3. Upgrade to Triton VM v0.33.0

If you look at the diffs above, the changes on the Rust side needed to upgrade to each new version are only minor. If we count from when the project was included in the ZKP compiler shootout, then these updates span the time from March 20th, 2023 to August 10th, 2023, which is 4 months and 21 days.

It seems a good amount of the improvements over this last period were focused on speed. Looking at the benchmarks from v0.1.0 of the ZKP compiler shootout, we can see the following results:

| Triton: fibonacci-1000 | Miden: iter-1000 | Risc0: iter-1000 |
| --- | --- | --- |
| 34.91 s (:white_check_mark: 1.00x) | 3.29 s (:rocket: 10.62x faster) | 8.88 s (:rocket: 3.93x faster) |

Comparing this to the latest v0.33.0 results (run manually), we can see the speed has considerably improved:

| Triton: fibonacci-1000 | Miden: iter-1000 | Risc0: iter-1000 |
| --- | --- | --- |
| 2.44 s (:white_check_mark: 1.00x) | 1.64 s (:white_check_mark: 1.49x faster) | 5.66 s (:x: 2.32x slower) |

The Miden version has not changed during this period, so the relative speed-ups should be somewhat indicative of the improvement in speed in the latest Triton benchmarks.

Thus the code seems to work; however, the lack of evidence of other projects on top of Triton, and the absence of much outside investigation, is a bit concerning from a maturity standpoint.

Risc0

The Risc0 repository can be found here.

Some example programs for Risc0 can be found here

Leverage Considerations

Risc0 works on a receipt and trace system, verifying the execution trace of programs. It uses a STARK system described here.

The benefit of Risc0 is that it lets us use C, CPP, or Rust out of the box. For Juvix, it can likely target the C backend, and we can choose one of the three for writing programs (more later on why this is an uphill battle).

A note about user experience: I've had Risc0 in the shootout since the start of the project, and I have noticed that the compilation strategy is an odd one. It requires a series of Cargo files; a good example can be found here:

  1. The code that has the execution trace to verify can be found in methods. Inside here there is a Cargo file, along with guest and src.
    • The src is simply a wrapper to tell Rust outside of the ZK environment about the code.
    • The real ZK code is in the guest folder.
    • The Cargo file and build file also must be there in the specified way you see.
  2. Inside guest we have another level of Cargo files; its src has the ZK code.
  3. Finally, inside that src folder we have the methods that are exported.
    • These have a special environment, and require special imports to interact properly.

It is unclear to me if other languages are as messy. However, it seems to me we need the following (a rough sketch of this layout is given below):

  1. Rust code whose execution will be traced (this ends up as an ELF binary).
  2. A clear module to export this to Rust proper.
  3. The host Rust code that this gets incorporated into.
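To make the shape of this concrete, here is a minimal, hedged sketch of the guest/host split. The module name methods and the build-generated constants GUEST_ELF / GUEST_ID are placeholders, and the exact prover API and guest boilerplate differ between Risc0 releases, so treat this as an outline rather than copy-paste code.

```rust
// guest/src/main.rs -- the code whose execution trace is proven.
#![no_main]
use risc0_zkvm::guest::env;

risc0_zkvm::guest::entry!(main);

fn main() {
    // Read input from the host, do the work, commit the public output.
    let n: u32 = env::read();
    let result = (0..n).fold(0u64, |acc, i| acc + i as u64);
    env::commit(&result);
}
```

```rust
// host/src/main.rs -- ordinary Rust that drives proving and verification.
use methods::{GUEST_ELF, GUEST_ID}; // generated by the methods build step; names are placeholders
use risc0_zkvm::{default_prover, ExecutorEnv};

fn main() {
    let env = ExecutorEnv::builder().write(&1000u32).unwrap().build().unwrap();
    // Depending on the release, prove() returns the receipt directly or a
    // wrapper exposing it as .receipt.
    let receipt = default_prover().prove(env, GUEST_ELF).unwrap().receipt;
    receipt.verify(GUEST_ID).unwrap();
}
```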

The mentions of C and CPP on their repo are quite misleading, as it seems that the work for the C and CPP backends is simply not there. This means that Juvix will have to do most of the heavy lifting in this regard. Overall, this means the Rust user experience is the best that they offer.

I suspect CPP and C will be just as messy. On their site they proudly claim to be Rust-first, meaning that if this is the user experience for Rust, I don't have much faith in C or CPP being any better. However, looking at the pipeline above, I believe an interactive programming environment may be able to leverage a proper system environment to hide this from the user. Since Juvix is a batch language, though, this will likely make programs rather messy. Due to the lack of a C backend, I suspect the previous attempts at integrating zkLLVM may not be completely reusable for Risc0.

Maturity

For maturity, I believe they are using Risc0 internally with their Bonsai; however, this is currently in alpha. I do not know who else has built on Risc0, but I believe such projects may exist, as the project seems more mature than Triton.

One other black mark against Risc0 is the breaking changes I've faced in the ZKP compiler shootout. I previously had issues with rustc versions due to VampIR and Risc0 disagreeing on which version was acceptable. Because of that, and because I was unable to see why my programs failed to compile on newer Risc0 versions, an outside PR was needed to fix the issues in the backend. As we can see in it, the API has changed quite a bit between 0.12.0 and 0.18.0.

Risc0 mentions its stability status and notes that, because of this, it has not been audited.

Speed-wise, the product performs fine:

| Triton: fibonacci-1000 | Miden: iter-1000 | Risc0: iter-1000 |
| --- | --- | --- |
| 2.44 s (:white_check_mark: 1.00x) | 1.64 s (:white_check_mark: 1.49x faster) | 5.66 s (:x: 2.32x slower) |

It is within an order of magnitude of the other solutions, at least for the subset of examples in the shootout.

With all these considerations, I'm not very pleased with Risc0, and I don't believe it is the best choice in its current state.

Lurk

The current Lurk repo can be found here.

Leverage Considerations

Lurk is a Turing-complete language for recursive zk-SNARKs; it leverages the Nova proving system in its current iteration.

From a leverage perspective, Lurk is rather nice. As a Lisp it's quite odd, but it seems they are fixing some of my criticisms of the current language:

  1. arithmetic operations (+, *, /) only accept 2 arguments (this is unbecoming of a Lisp)
  2. Lack of macros, apply, and macro infrastructure
  3. Lack of any kind of packages/namespaces
  4. Discouragement of normal definitions.

1. is a nit, but a rather odd choice. 2. should be possible to build, since the language has eval, I believe, even in the absence of apply. 3. is getting solved, but does show that the language could potentially be improved in the future. 4. is odd: they have !(def) for adding functions to the environment, but it seems that the infrastructure for tree-shaking those definitions depending on what is used is simply not there. This means I'm unsure if the language will grow towards being more reasonable from an ergonomic standpoint.

With these criticisms said, I believe Lurk is the most expressive of the ZK languages. From an ergonomics point of view, the project is quite nice: one can simply call their binary and start testing out programs inside of Lurk. The following

user> (letrec ((next (lambda (a b n target)
                 (if (eq n target)
                     a
                     (next b
                           (+ a b)
                           (+ 1 n)
                           target))))
         (fib (next 0 1 0)))
        (fib 100))
[4841 iterations] => 0x00000000000000000000000000000000000000000000001333db76a7c594bfc3
user> !(prove)
Claim hash: 0x2622abaf145a10664b02f109966d6863930d486a432654e2d49bead710f71166
Proof key: "Nova_Pallas_10_2622abaf145a10664b02f109966d6863930d486a432654e2d49bead710f71166"
user>
Exiting...

is an example usage of the system. This can be greatly improved, but the fact that the language works this way natively is a good sign for the direction they are taking.

Due to all of these factors I believe it should be an easy target for Juvix to compile to as well.

Maturity

There are good and bad signs regarding Lurk's maturity. On the positive side, there is an audit of Lurk. While the software was not found free of issues, the issues that were found were all addressed. The report is worth reading, as it covers various libraries that Lurk relies upon, along with a brief analysis of their current Nova implementation and caveats to its zero-knowledge properties.

The negatives regarding maturity mainly concern performance. I personally contacted them at one point, and they seemed reluctant to add results to a benchmarker, which is further confirmed by their Zulip discussions. However, we do have some benchmarking data from this closed topic. Sadly, this PR no longer compiles due to Rust dependency issues; regardless, the results are quite bad.

| Triton: fibonacci-1000 | Miden: iter-1000 | Risc0: iter-1000 | Lurk: fibonacci-1000 Reduction-100 | Lurk: fibonacci-1000 Reduction-10 |
| --- | --- | --- | --- | --- |
| 2.06 s (:white_check_mark: 1.00x) | 1.12 s (:rocket: 1.84x faster) | 3.89 s (:x: 1.89x slower) | 189.23 s (:x: 91.97x slower) | 276.43 s (:x: 134.36x slower) |

Running a naive Fibonacci function found on the Lurk website, I've confirmed the results:

./bin/lurk  7459.38s user 129.83s system 973% cpu 12:59.57 total

Further, this took 14 GB of RAM to run the !(prove) call.

On the bright side, they are quite aware of the issues, and their own README points out the performance gains that can be had; however, it is uncertain when they can achieve this.

Overall I'd say Lurk is an interesting backend, but in its current state it is most likely a bottleneck. Due to how expressive the language is, it may be trivial to port over the resource machine and thus use it as a comparative example against a much faster system.

Miden

The current Miden repo can be found here.

Some example Miden programs can be found here and here.

Leverage Considerations

Miden is a stack-based VM approach to a STARK system.

Miden, unlike Triton and Lurk, does not have recursive proofs; however, they are planned:

  1. planned features.
  2. comment on hashes in regards to recursive proofs

Some of the tooling for Miden seems nice; since I last tried it, they seem to have added a REPL.

Miden overall has the same oddities that Triton does; however, it does mitigate the issues somewhat:

  1. arguments are passed both on the stack and as static instruction arguments
  2. there is no recursion, only explicit loops
  3. flow control is not great

1. is much like Triton's limitation; however, due to how the memory model works, there are some upsides we can realize. Namely, since memory is one big array that we can treat like a stack, we can process over it generically, with loads advancing a pointer into this stack.
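As a minimal sketch of that access pattern (in Rust rather than Miden assembly, purely to illustrate the idea), reading the next resource just means advancing a pointer by a fixed stride over one flat memory array:

```rust
// Sketch: memory as one flat array, with a moving pointer acting as a stack.
struct MemCursor<'a> {
    mem: &'a [u64],
    ptr: usize,
}

impl<'a> MemCursor<'a> {
    // Return the next `words`-sized resource and advance the pointer;
    // in Miden the pointer itself would live on the operand stack.
    fn next_resource(&mut self, words: usize) -> &'a [u64] {
        let slice = &self.mem[self.ptr..self.ptr + words];
        self.ptr += words;
        slice
    }
}
```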

With the memory operators optionally taking addresses from the stack, there could be good strategies for storing items in memory; however, this seems not to extend to function-local memory, where you would want to use this.

# STACK EFFECT
# -- VERIFIED? #
proc.sudoku.27      # procedure with 27 local memory slots
  padw              # push a word of zeros to make room on the stack
  adv_loadw         # overwrite the top word with 4 elements from the advice stack
  loc_storew.0      # store that word into local memory slot 0
  adv_loadw
  loc_storew.1      # ... and so on for each consecutive local slot
  adv_loadw
  loc_storew.2
  adv_loadw
  loc_storew.3
  adv_loadw
  loc_storew.4

Above is an excerpt of code I generated, showing arguments being taken from the advice stack and stored into the procedure's local memory.

2. is the biggest letdown of Miden: it does not seem possible to encode arbitrary gotos, meaning that program structure is somewhat strict. It should be noted that this may or may not be less expressive than Triton, since Triton's flow control is also somewhat suspect.

3. adds to this criticism, as the only flow control is if-else, repeat, and while. However, some conditional stack-rolling operations may aid in programming.

Overall the language is okay. I do worry about how well Juvix can target this for computation, as I'm unsure how to convert recursive processes into the limited format expressed above without serious work.
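To make the concern concrete, here is a small sketch (in Rust, as stand-in pseudocode) of the kind of transformation a compiler would need in the simplest case: a tail-recursive function rewritten as a while loop over explicit state. General recursion would additionally require managing an explicit stack in memory, which is where the serious work comes in.

```rust
// Tail-recursive definition, as a compiler front end might see it.
fn fib_rec(n: u64, a: u64, b: u64) -> u64 {
    if n == 0 { a } else { fib_rec(n - 1, b, a + b) }
}

// The same computation flattened into the if/while shape Miden offers.
fn fib_loop(mut n: u64) -> u64 {
    let (mut a, mut b) = (0u64, 1u64);
    while n != 0 {
        let next = a + b; // the step the recursive call performed
        a = b;
        b = next;
        n -= 1;
    }
    a
}
```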

Maturity

Miden seems like a rather mature project with a few caveats.

Miden has had a security audit that found 16 issues (4 of them critical), all of which are now fixed.

Further, Miden as a project has had a Telegram group for developers and "pioneers" building on top of Miden (a private group) for over a year at this point. They often hold weekly calls where people can bring up issues they've had. In this group I have seen various compilers trying to target it (I don't have a link). Further, it seems that Polygon themselves have a language they are targeting on top of Miden.

Speed-wise, the project itself is quite decent:

| Triton: fibonacci-1000 | Miden: iter-1000 | Risc0: iter-1000 |
| --- | --- | --- |
| 2.44 s (:white_check_mark: 1.00x) | 1.64 s (:white_check_mark: 1.49x faster) | 5.66 s (:x: 2.32x slower) |

The only caveat I have is with their Status and Features section, where they state the following:

"At this point, Miden VM is good enough for experimentation, and even for real-world applications, but it is not yet ready for production use. The codebase has not been audited and contains known and unknown bugs and security flaws."

However, given the state of the other ZK systems, this may not be so much of an issue.

Cairo

The current Cairo repo can be found here.

An experimental VM implementation of it can be found here.

Leverage Considerations

Cairo offers two levels at which we can use it: one from Cairo Assembly, and another from the Cairo language itself.

As a language, Cairo is a Rust ripoff, even down to the testing pragmas.

However one important point of Cairo over Rust is that it does allow tail recursion.

Further Cairo enables recursive proofs.

For Juvix, this should be an easy target as Cairo assembly seems fairly standard.

An example program is linked here:

fn main() -> felt252 {
    fib_two(16)
}


fn fib_rec(n: felt252, a: felt252, b: felt252) -> felt252 {
    if n == 0 {
        a
    } else {
        fib_rec(n - 1, b, a + b)
    }
}

fn fib_two(n: felt252) -> felt252 {
    fib_rec(n, 0, 1)
}

#[cfg(test)]
mod tests {
    use super::{fib_two, fib, fib_rec};
    use debug::PrintTrait;

    #[test]
    #[available_gas(18446744073709551617)]
    fn it_works() {
      ...
    }
}

The only thing that has me concerned is that there seems to be gas metering on execution.

Maturity

Cairo has been used in production for over a year now on Starknet. I think that is fairly good.

Conclusion

Overall, the landscape is rather immature, though there are some decent up-and-coming projects. The most mature of these systems, Miden, does not have recursive proofs, which goes against some of the criteria that we need. The speed of Lurk makes me hesitant to recommend it, leaving Triton; however, Triton does not seem mature yet.

One good thing about Anoma is that we do not have to marry ourselves to any particular system. Further, the resource machine spec is rather simple, and the languages in this report are expressive enough to talk about the machine. Thus I suggest a plan like the following:

  1. Write a version of the resource machine that conforms to the spec in something like Elixir and Nock.
  2. Port this over to Miden
  3. Make a port to Lurk
  4. Finally port it to Triton
  5. Once the investigation into recursive Halo2 is done, update Taiga to respect the new spec and interface.

The multi-target support is primarily to get a shielded verification and proving system attached to a live Anoma system and to get the interface nicely lined up.

Out of all of these, I suspect we'll want to use Triton as the system matures, potentially switching over to Lurk or the Halo2 work in Taiga as they mature.

Edited Conclusion

Previously, the conclusion did not consider Cairo; however, Cairo seems the most mature and the natural pick for deeper investigation.


Thanks; this is a super helpful overview for me at least. A few questions which come to mind:

  • Would it be possible (even with some performance loss) to standardize a resource encoding, including specific field element types? In other words, what datatypes do Miden, Lurk, and Triton support (relatively) efficiently? It would be easier to port / cross-test if we can standardize an encoding across all of these backend implementations.
  • Would a better integration option for RISC0 change your mind at all regarding feasibility there (is it primarily a question of clean vs. messy integration)? One advantage I see is that RISC-V is more powerful compared to (as you describe) the somewhat janky models of Miden and Triton.

I agree about the standardized resource encoding, which we want to have in any case.

It should take any cryptographic details necessitated by the backend as parameters.

Worst case the different frameworks (including halo2/taiga) operate at different abstraction layers, but maybe we can handle that by standardizing the most abstract encoding and then building adaptors for compatibility.

I suspected we would indeed have the same encoding of data; where it would differ is how it is fed to each system. But as for your question:

Miden

The docs on the element size on the stack can be found here.

Triton

Lurk

We know from the security report that Lurk uses the Pasta curves, and in fact we can compute the largest field element on the fly:

user> (- 0 1)
[3 iterations] => 0x40000000000000000000000000000000224698fc0994a8dd8c46eb2100000000

which CL tells me is 255 bits.

For formal documentation

There are operations that can coerce data to any size

Risc0

I found this in a blog post

However, Risc0 will deal with any Rust value fine, meaning we can send in u64s or u256s; it'll just take more than one field element to represent them.

Overall

Miden and Triton use the same field size, a u64 with the lower 32 bits cut off (p = 2^64 - 2^32 + 1); Lurk gives you the full 255-bit range; and Risc0 gives you a smaller ~31-bit field.
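If we standardize on an encoding, the lowest common denominator is the ~31-bit Risc0 field, so one simple (hypothetical) approach is to serialize resource values as little-endian 31-bit limbs, each of which fits into a single field element on every backend. A minimal Rust sketch of that idea, purely illustrative and not part of any existing interface:

```rust
// Split an arbitrary value into little-endian 31-bit limbs, each of which
// fits into one field element on every backend discussed above.
fn to_limbs(mut n: u128) -> Vec<u32> {
    let mask: u128 = (1 << 31) - 1;
    let mut limbs = Vec::new();
    loop {
        limbs.push((n & mask) as u32);
        n >>= 31;
        if n == 0 {
            break;
        }
    }
    limbs
}

// Reassemble the original value from its limbs.
fn from_limbs(limbs: &[u32]) -> u128 {
    limbs
        .iter()
        .rev()
        .fold(0u128, |acc, &l| (acc << 31) | u128::from(l))
}
```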

On the Elixir and Hoon side, we can handle arbitrary bignums and serialize them however we want, so there are no issues feeding any numbers to our clear resource machine interpreters.


Potentially. Risc0 works over ELF binaries rather than being given assembly, so for a batch compiler this means having a second compilation step to generate an ELF binary with only the core ZK code one needs, which gets into why the Rust compilation is so complicated. For an interactive system this actually isn't as bad as it seems, as ELF binaries can be quite small; however, it places the constraint that one must define a tree shaker and partial dumps early in the programming language's life cycle. This would fix my issue with Risc0 from the perspective of a programming language's lifecycle (interactive languages win yet again!).

> is it primarily a question of clean vs messy integration

From an Anoma POV it does not matter too much; from a Juvix POV it may matter a lot. For Anoma you just have to submit the proof; that can all be done offline, with the location handed to Anoma. Ideally, the Anoma system can load the code, and a person can call prove from within Elixir/Nock and then submit that.


I have updated the original post with a brief analysis of Cairo, please give it a read.

From afar: very helpful analysis, the benchmarks in particular.

If the same code can run as easily inside of Risc0 as outside of it, then it doesn't seem quite so important that Risc0 itself be efficiently interactive. (I think that's broadly a good way to design things, e.g. debug your code on a CPU then run it on a GPU; of course it is suboptimal sometimes, principally when debugging the implementation itself.)