Standardization of Application-related Data

Recently, the topic of standardization came up again in different application-related contexts.
@vveiln pointed out that this doesn’t belong in the RM specs and that it might deserve a separate section in the specs.anoma.net page (or wherever else it makes sense).

This thread aims to collect and list these standardization topics to eventually create such a section/page.

Data Fields Relevant for Standardization

Standards might emerge wherever arbitrary, application-related data can be stored. This includes

  • the Resource.label and Resource.value data fields,
  • the transaction.action.appData[tag] entries, and
  • the LogicWitness.applicationInputs.

A standard can require specific data to be in one or more of the above-mentioned storage locations. For example, an ownership-related standard might require the ExternalIdentity of the owner of resource r to be stored in the Resource.value field and a corresponding signature of the owner in the transaction.action.appData[tag] related to r.
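
To make the example concrete, here is a minimal Haskell sketch (all type and key names, such as the "owner" key, are hypothetical illustrations, not part of any spec) of where such a standard would place the data:

    -- Hypothetical sketch of the ownership standard described above.
    -- None of these types are from the RM specs; they only illustrate
    -- where the standard would require data to live.
    import qualified Data.Map as Map
    import Data.Map (Map)

    type ExternalIdentity = String  -- stand-in for a real identity type
    type Signature        = String  -- stand-in for a real signature
    type Tag              = String  -- stand-in for a resource tag

    -- Resource.value as a map, so standardized and other entries coexist.
    data Resource = Resource { value :: Map String String }

    -- transaction.action.appData keyed by the tag of the resource.
    data Action = Action { appData :: Map Tag Signature }

    -- The standard: owner identity under a well-known key in value,
    -- and a matching signature in appData[tag].
    ownerOf :: Resource -> Maybe ExternalIdentity
    ownerOf r = Map.lookup "owner" (value r)

    ownerSignature :: Action -> Tag -> Maybe Signature
    ownerSignature a tag = Map.lookup tag (appData a)

    exampleResource :: Resource
    exampleResource = Resource (Map.fromList [("owner", "alice-id"), ("note", "a memo")])

    main :: IO ()
    main = print (ownerOf exampleResource)  -- Just "alice-id"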

As mentioned in the related forum post (see Standardization of Resource data (Label and Value)), and to allow multiple standardized and non-standardized data entries to coexist, label and value should, IMO, be required to be maps in the specs @vveiln @cwgoes.

Examples for Resource Data Standards

Standards of Resource.label data could be related to:

  • Symbol / name / description (e.g., to be displayed in a UI)
  • Supply type (e.g., fixed/unbound/capped supply)
  • Origin/originator identity of the resource (e.g., the issuer in the case of a token)
  • Intent formats
  • Compliance with other standards (that can be internal or external to Anoma, e.g., from other ecosystems such as Cosmos, Ethereum, or Solana)

Standards of Resource.value data could be related to:

  • Owner / ownership-related information
  • Time dependence (e.g., expiry date in a coupon or intent resource)

Examples for Action Data Standards

Standards related to action.applicationData (and LogicInstance.applicationData[tag]) could be related to:

Standards related to LogicWitness.applicationInputs could be related to:

  • authorization
  • nullification?

Just some thoughts about what should go where and standardisation:

  • My definition of the RM spec is the part of the spec that describes how the RM itself works. I believe anything related to applications should go elsewhere. All of it is related to RM, but it isn’t RM specs. Maybe we could rename RM spec to make the distinction clearer, but if we start referring to everything related to RM as RM specs, then everything will be RM specs.
  • I believe the main distinction between standardisation and the RM specs is that standardisation covers a subset of use cases, while the RM specs define how everything works at all times.
  • Another distinction that follows from the previous one is that the RM specs are to be enforced, while standardisation is more of an agreement. The subset of use cases “agrees” to work in a certain way to achieve certain results. Opting out of this agreement means leaving the subset while staying within the Anoma context; opting out of the RM specs agreement means leaving the Anoma protocol. This one is more vague, but that is just to share how I think about it.

label and value are fixed-size data structures that can refer to any type, including maps. They cannot be maps on their own.


I agree that standardization shouldn’t be mixed with specification.

We need some place to list standards. Some of our software components (such as the example indexer, solver, or application libraries) already rely on such de-facto standards, which are not documented anywhere.

I am not referring to labelRef and valueRef but the label and value data blobs that they refer to.


I am not referring to labelRef and valueRef but the label and value data blobs that they refer to.

Just in case, I wanted to make clear that, per the current specs, the fields are not references in the proper Anoma-side sense, i.e., they do not point to anything in the blob store. At the point where we evaluate the transaction function, there is nothing that we dereference from the blob store. They are just the values that get fed to the proving systems you use.

E.g., the Cairo logicref just gets fed to the CairoVM as the logic; we do not store the actual logic anywhere.


Then I don’t understand why they should be required to be of any format.


My understanding is that the actual logic, label, and value of resources are stored content-addressed in the local blob storage and are fetched from it for proof generation. The resource object only contains the binding references (the hash of the thing).

There are other software components (outside the RM/Anoma) that need to be able to look up data from resources / actions.

  • An indexer wants to have a standardized place to look for the owner identity information.
  • A solver wants to know what the intent standard is that an intent resource is following (e.g., it could follow ERC-7683: Cross Chain Intents which the solver can deal with).

How would you achieve this otherwise, without maps / key-value pairs, while still allowing other data to be present?
Let’s say that the owner information is stored in the value alongside other data. How can an indexer retrieve this information without needing to know how to decode all the other data as well?
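
As a hedged illustration of the difference (the types and the "owner" key are hypothetical), a map-shaped value lets the indexer decode exactly one well-known entry and ignore the rest, whereas an opaque blob forces it to know the application's full encoding:

    import qualified Data.Map as Map
    import Data.Map (Map)
    import Data.Word (Word8)

    -- Opaque variant: the whole value is one application-specific blob.
    -- An indexer cannot extract the owner without the app's full decoder.
    -- (Shown for contrast; deliberately unused below.)
    newtype OpaqueValue = OpaqueValue [Word8]

    -- Map variant: standardized entries sit next to app-specific ones.
    type MapValue = Map String String

    -- The indexer only decodes the one entry it cares about.
    ownerFromMap :: MapValue -> Maybe String
    ownerFromMap = Map.lookup "owner"

    main :: IO ()
    main = print (ownerFromMap (Map.fromList
      [ ("owner", "alice-id")                 -- standardized entry
      , ("appSpecific", "opaque-to-indexer")  -- ignored by the indexer
      ]))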

Is this only relevant to transparent resources? In that case, I don’t see a reason to fix it for all resources.

A solver can’t work with all applications anyway, and it can look up the standards of the relevant applications. I’m not sure I see how getting familiar with how the relevant applications work implies fixing the format for all applications.


In terms of both the current Cairo and Transparent RM implementations, neither does any interaction with blob storage after transaction function evaluation. Logic for the CairoRM gets read from JSON inputs; those are not stored anywhere, neither client-side nor node-side, unless explicitly stored by the user. Logic for the transparent case also need not be stored anywhere.

Of course these are binding references in the sense that they are hashes of some notion of a predicate defined for some specific VM, but they have no (guarantee of having) deref information available at verification time. This is what I mean by them not being “references in the Anoma-side sense”. References in the Anoma-side sense are Nockma 12 calls.

At the very least, I currently see nothing in the specification that points to the interactions with blob storage you describe.


There is no difference between transparent and shielded resources here. An indexer can index any resource it has visibility into. Shielded resources can be visible to indexers but generally are not.

Nobody has access to the label and logic of shielded resources, including the ownership information, so the part where indexers want to have access to any information about shielded resource objects feels confusing to me.

I don’t think they want to have access - they are given access voluntarily by users of their service. The mechanisms behind this haven’t been explored – maybe this has overlap with your SSS research (which I don’t know much about).

However, I think visibility is not the point here.

Even in the transparent domain, how do you find out that you own a resource (meaning that nobody else can spend it except you)? You must somehow be able to see that it has this property.
Should indexers maintain an ever-growing list of all resource kinds having the property of being ownable? How can you make sure that this list is up-to-date? To me, this doesn’t seem to work.

In an ideal world, we would be able to formally verify certain behavioral properties and invariants of resources and logics (such as “nobody else except me can spend it”), as @cwgoes has pointed out,

but to quote him once more:

HOWEVER, for now, this is a research problem, and we should do something pragmatic for the devnets.

Standardized lookup would be a pragmatic approach, but it would require these fields to be maps.
I also think that there are other software components besides indexers wanting to inspect standardized application-related data that we haven’t encountered/thought about yet.

Both of these points make sense to me, but they are specific to ownable / property-able resources, so from my perspective these things don’t belong in the RM spec since they don’t apply to all resources.


I think this is a very bad idea. The spec should not enforce what values valueRef and labelRef point to.

I can imagine many better ways to lay out data than a map, and I’ve proposed a different strategy entirely for getting the kinds of generic dispatch and interfaces we want in a story here:

Enforcing it to be a certain way will make these kinds of strategies impossible and lock away any chance of building a generic protocol around the RM.

I believe this is not a pragmatic approach. If you want to encourage developers to do it with Juvix for now, go ahead, but I’m skeptical that this is what we want in the long run. We are just now trying to get our feet wet making RM applications, and I believe some of the ones we have right now don’t do this, as it only complicates the logic for little gain; the Juvix type system can’t really support this, and I believe that, long term, this is not a good strategy to go with.


I think this discussion would be easier if we separate out two distinct questions:

  1. How should we standardize FFI calls, timestamps, and other such interactions between the RM backend / controller and the RM application?
    • This is not a question of behavioral properties of applications. It’s just a question of where to put the data (which we need to decide and specify). This should also not in any way restrict what kinds of applications can be built on top of the RM.
  2. How should we standardize semantic patterns in applications?
    • This covers everything else in @Michael’s post and this one.
    • We do not need to have a “forever” answer to this question for now, we just need a workable one. This should also not restrict what kinds of applications can be built on top of the RM. We can require that certain fields have a certain format for “applications written using the standard library in Juvix” without requiring it in the RM design or specs (but we should write down what that standard format is in some specs).

That’s fine for me and I hear you:

I don’t mind where we come up with answers to those questions.

I have the same feeling, but I wasn’t able to see an alternative. If you see one, that’s great. I’ve read the post you linked, but it is not so easy for me to read and digest (despite your terminology page). If we have a better solution for the devnet, e.g., for indexing, I would like to understand it – maybe .

Yes. Currently, the indexer requires the owner external identity to be stored in the nullifierKeyCommitment field, which we want to change soon to the value field.
However, if indexing services just try to decode the value field, apps cannot put anything else inside besides owner information. Even worse: putting other data inside could result in the indexer wrongly assuming that this resource is owned by some non-existing identity.
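
To make this failure mode concrete, here is a hypothetical sketch (not the indexer's actual code) of a naive decode that "succeeds" on unrelated data:

    import Data.Word (Word8)

    -- Hypothetical: the indexer treats the first 32 bytes of value as
    -- an owner identity, with no tag telling it whether that is valid.
    decodeOwnerNaively :: [Word8] -> Maybe [Word8]
    decodeOwnerNaively bytes
      | length bytes >= 32 = Just (take 32 bytes)  -- "succeeds"...
      | otherwise          = Nothing

    main :: IO ()
    main = do
      -- An app stores a counter and some metadata in value instead.
      let unrelatedValue = replicate 40 (7 :: Word8)
      -- The naive decode still "finds" an owner: a non-existing identity.
      print (decodeOwnerNaively unrelatedValue)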

We can require that certain fields have a certain format for “applications written using the standard library in Juvix” without requiring it in the RM design or specs (but we should write down what that standard format is in some specs).

Yes, by including these expected formats in the standard library, we are creating de-facto standards. Writing this down explicitly somewhere (not in the RM specs – I hear you, Yulia) is the main point I am trying to make here.

It isn’t just where they are. Things that belong to the RM spec apply to all resources, so my point is more that we cannot define the format of logic/label/value for all resources.

To

  1. How should we standardize FFI calls, timestamps, and other such interactions between the RM backend / controller and the RM application?

FFI Call Data
In our Tuesday meeting, we said that we want to put it in the appData[tag] of the wrapper resource under a specific lookup key. What would be the lookup key/algorithm?

Timestamps
Should these also be part of the appData[tag] of the resource that has time constraints? If so, I would like to better understand the mechanism that puts them there. Maybe this is a separate thread. Is there material to read up on this, @vveiln, or should we create a separate thread?

To

  2. How should we standardize semantic patterns in applications?

The summary I take away from this thread so far is that you, @vveiln and @mariari, don’t want the RM specs to require resource object data like value and label to be maps, and would instead leave this open. That’s fair.

The open question still is: how can some actor (either a professional indexer or me personally, sorting through my resources in local storage) practically inspect resources for the owner information?
Given the above, the next best thing apps will come up with is a check whether the field is a map, and a second check whether it contains entries complying with the standards the app requires (sketched below).
By doing this in the Juvix applib, we would create de-facto standards requiring value/label to be maps. This is obviously not elegant, but are you ok with this?
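
Such a de-facto check might look roughly like this (a hypothetical sketch; the key name is illustrative):

    import qualified Data.Map as Map
    import Data.Map (Map)

    -- Hypothetical decoded view of a value blob: either it decoded as a
    -- map or it is something else entirely.
    data Decoded
      = AsMap (Map String String)
      | NotAMap

    -- Step 1: is the field a map at all?
    -- Step 2: does it carry the entries the standard requires?
    followsOwnershipStandard :: Decoded -> Bool
    followsOwnershipStandard NotAMap   = False
    followsOwnershipStandard (AsMap m) = Map.member "owner" m

    main :: IO ()
    main = do
      print (followsOwnershipStandard (AsMap (Map.fromList [("owner", "alice")])))  -- True
      print (followsOwnershipStandard NotAMap)                                      -- False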

This was just the example of indexing. General solvers will have similar requirements.
Maybe storage, compute, or bandwidth service providers will also have terms and conditions that depend on resource properties going beyond the resource kind.

Applications (including apps related to service provisioning) should be able to handle all kinds of resources as long as they have certain properties, not just a list of specifically allow-listed kinds.
I.e., I want to be able to trade/sell/barter/lend/auction/… all ownable resources on the cool e-Bazaar app.


@Michael asked me to comment on the feasibility for requiring some RM Resource fields to have a certain format when written using the Juvix standard application library.

  1. It’s not possible to automatically detect the type (e.g., Map, Ed25519 key, etc.) of an encoded (i.e., Nock-encoded = jammed) field on a resource (e.g., Resource.label).
  2. We could support ‘type detection’ by requiring that a Resource field have additional structure.
    For example, we could require that the Resource.label field has the structure [typeId encodedPayload], where [ ... ] denotes a Nock cell, typeId is a nat that indicates the type of the encoded payload according to the Anoma/Juvix ‘standard’, and encodedPayload is a nat holding the encoded (jammed) value.
    We have a more detailed proposal of this idea in: Safer Anoma decoding proposal · Issue #3273 · anoma/juvix · GitHub
  3. One issue with the [typeId encodedPayload] proposal is that a label with some other type (e.g., unrelated to the Juvix app stdlib) could be encoded as [a b] with a equal to some typeId by coincidence. So to make this safe, we’d have to require that the Resource field has [typeId encodedPayload] with some id (e.g., 0) reserved to mean that no standard is used (see the sketch below).
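
A hedged Haskell rendering of the [typeId encodedPayload] idea from the proposal above (the Nock cell is modeled as a pair of naturals; the ids besides the reserved 0 are purely illustrative):

    -- Sketch of the [typeId encodedPayload] structure. A Nock cell is
    -- modeled as a pair of naturals; ids are illustrative, not official.
    type Nat = Integer

    data Tagged = Tagged { typeId :: Nat, encodedPayload :: Nat }

    -- Illustrative registry: 0 is reserved for "no standard used".
    describeTypeId :: Nat -> String
    describeTypeId 0 = "no standard"
    describeTypeId 1 = "map"
    describeTypeId 2 = "ed25519 public key"
    describeTypeId _ = "unknown"

    -- Detection becomes a lookup on the head of the cell rather than a
    -- guess about how the payload happens to be shaped.
    detect :: Tagged -> String
    detect = describeTypeId . typeId

    main :: IO ()
    main = do
      print (detect (Tagged 1 12345))  -- "map"
      print (detect (Tagged 0 12345))  -- "no standard"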

I am not ok with this; all of this is far too premature. The reasons are the following:

  1. Enforcing them to be maps means more indirection and complications for understanding programs. We are already mired in artificial complexity that we are trying to strip out, boiling things down to the essential complexity. I can see a strategy for how to maybe make this work, but I’ll get to that in point 4.
  2. We are having issues even making simple programs; if we are having issues, outside parties will have issues. I don’t think an explosion of programs will be made yet; dedicated programs with dedicated solvers can be made, and I believe these are fine first steps.
  3. Our understanding of applications is evolving; what may look like a fine idea now may be a detriment that we’d have to strip out later, choosing a different strategy instead.
  4. Even if maps are a backing data structure we choose for Juvix in the short term, we should not encourage users to explicitly use maps. This is error-prone, as every record access of the map would return a Maybe type before it can be used, only adding to the annoyance of writing programs (see the sketch after this list; note that the encoding of basic type information isn’t there yet, so currently, as of v0.2, it’d be worse than that: it’d be a massive footgun). I can see it potentially in a lower layer that is automatically translated, i.e., the base layer encodes them to simple maps while the more user-friendly library encourages users to use custom data types with an encoding function that moves them to a map. I’m heavily against doing this now, as it only makes more work for Juvix, and we don’t have the library up, meaning that a pile of boilerplate would have to accompany every program. We should first see how the layer above base turns out before making any decisions here.
  5. Looking at the long term (note this is not an argument about now, nor about Juvix), I think there are much better enforcement mechanisms than standards, which will have many false positives (e.g., I name a field owner because in my domain an owner is an unrelated concept carrying a different kind of information; there is no practical way to deal with this, as it’s just a field in a map). If we have a proper model for computation, this can be boiled down to a simple reflective query based on who implements which interfaces (and we can get much better information than this).
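
To illustrate the annoyance described in point 4 (a hypothetical sketch; the field names are made up): with a raw map backing, even a trivial two-field computation is wrapped in failure plumbing, because every access returns a Maybe:

    import qualified Data.Map as Map
    import Data.Map (Map)

    type Fields = Map String Integer

    -- With a custom data type this would be: balance r + pending r.
    -- With a raw map, every access can fail, so the logic is buried in
    -- failure plumbing before it can even run.
    total :: Fields -> Maybe Integer
    total fields = do
      balance <- Map.lookup "balance" fields
      pending <- Map.lookup "pending" fields
      pure (balance + pending)

    main :: IO ()
    main = do
      print (total (Map.fromList [("balance", 10), ("pending", 5)]))  -- Just 15
      print (total (Map.fromList [("balance", 10)]))                  -- Nothing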

For example, take the above image. Here I have a trait of mine, GithubJSON, which states that every implementer ought to accept the message fromJson:. I can query for all the users of this: the top left shows this in GUI form, and the first picture on the right shows how that GUI is made. Since it’s made with users, we can list all instances that conform to GithubJSON by simply running the query self users collect: #allInstances, and just like that I have a collection of all instances that can accept fromJson:. However, this may not be enough: what if fromJson: is common but my interface is not? Well, look at the image below!

In that case I just queried for every object in the current image that has fromJson: implemented. Now I can try writing long-running programs that work not only on GithubJSON but also, experimentally, on anything that accepts a fromJson: message. What if I want to filter it down to a specific controller?

Here we can see that the same filtering mechanism lets us change the scope of the query to the specific range of things I wish to see.

Thus I’d argue a self-reflective system that represents information about itself is a better model to go for, one that lets you slice by field criteria (owner) or by confirmed interface acceptance (implementing a trait). We can get this by implementing a MOP or Smalltalk object system on top of resources. From there, we can do sophisticated queries with known techniques, as we can look at GT/Moose’s rewriting and query system. This is extra powerful, as the queries can run on the AST as well, giving us the ability to query for all programs that have certain patterns[1]. I believe this can be realized not in the valueRef nor the labelRef, but instead either through kind mapping or through the reference to the logic directly (pending details). In the long run this is the direction I believe we ought to go and can actually realize; boiling things down to solved problems is nice.

[1]


I want to summarize all the pieces of information I got so far (from this thread, but also by talking to @mariari, @paulcadman, and @cwgoes).

Indexing and Decoding

Context
Users are interested in querying for unspent resources (e.g., “I want to know all Kudos resources that I own”). We assume that users will be offline sometimes. An indexer provides this information as a service to users.

This is how I understand that indexing works in v0.2 and as per Ray’s post:

  1. An indexer runs an Anoma node and stores the resource objects visible to them in their local storage.
  2. On user request, indexers query (scry) the set of all unspent resources, with a filtering function.
  3. The filtering function filters the set of unspent resources for certain properties (e.g., the owner information) and is written in Juvix; no gRPC calls need to happen, and no shell scripts or makefiles need to be written (see the sketch after this list).
  4. The resource set after filtering is communicated to the requesting client of the user.
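
A hedged sketch of steps 2–4 (all types and names are hypothetical stand-ins; the real filtering function is written in Juvix):

    import qualified Data.Map as Map
    import Data.Map (Map)

    -- Hypothetical stand-ins for the real Juvix types.
    data Resource = Resource { value :: Map String String }
      deriving Show

    type FilterFn = Resource -> Bool

    -- Steps 2/3: query the unspent set with a filtering function.
    queryUnspent :: [Resource] -> FilterFn -> [Resource]
    queryUnspent unspent keep = filter keep unspent

    -- Example filter: "resources owned by alice-id".
    ownedByAlice :: FilterFn
    ownedByAlice r = Map.lookup "owner" (value r) == Just "alice-id"

    main :: IO ()
    main = do
      let unspent = [ Resource (Map.fromList [("owner", "alice-id")])
                    , Resource (Map.fromList [("owner", "bob-id")]) ]
      -- Step 4: the filtered set is what gets sent back to the client.
      print (queryUnspent unspent ownedByAlice)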

Problem / Constraints
In Juvix we cannot have a maybeDecode – as soon as a resource with a wrong encoding is encountered, the filtering algorithm will crash. This is the motivation for the Juvix team’s proposal requiring a certain encoding format.
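
What a maybeDecode would buy, as a hypothetical sketch: a total decoder returns Nothing on foreign encodings instead of crashing, so one misencoded resource cannot abort the whole filtering pass:

    import Data.Maybe (mapMaybe)

    -- Hypothetical encoded blob: tagged (per the Juvix team's proposal)
    -- or arbitrary bytes from some unrelated application.
    data Blob = Tagged Integer String | Foreign

    -- A crashing decode: fine until the first foreign blob is hit.
    -- (Shown for contrast; not called below.)
    decodeOrCrash :: Blob -> String
    decodeOrCrash (Tagged _ payload) = payload
    decodeOrCrash Foreign            = error "crash: unknown encoding"

    -- A total maybeDecode: foreign blobs are skipped, not fatal.
    maybeDecode :: Blob -> Maybe String
    maybeDecode (Tagged _ payload) = Just payload
    maybeDecode Foreign            = Nothing

    main :: IO ()
    main = do
      let blobs = [Tagged 1 "owner=alice", Foreign, Tagged 1 "owner=bob"]
      print (mapMaybe maybeDecode blobs)  -- survives the foreign blob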

Furthermore, we wish to NOT specify (i.e., require in the RM specs) that resource and application-related data fields must have a specific format (e.g., to be a map).

Talking to @cwgoes, I confirmed that we want general indexers for the devnets in contrast to dedicated/specialized indexers for specific applications.

Solution
Indexing/filtering of resources requires a way to decode resource data without crashing.
For this to be possible in Juvix, @mariari told me that the Juvix team’s proposal is being considered and that something in this direction will be implemented as a first stab.
The discussions on how to make this decoding safer should be continued.

Solving and Decoding

General solving is, obviously, not possible, and solvers need to be implemented for specific use cases.
Nevertheless, the topic of decoding is important. Solvers need to filter the intent pool for intents (unbalanced transactions) to understand what the constraints are and how to solve them.
This means that solvers have to look for certain actions and resources expressing those constraints. As for indexers, this requires solvers to decode resource object data to understand intent resources and the constraints they carry (a sketch follows below). Specific intent formats have already emerged, and we can expect more to appear.
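
A hedged sketch of such solver-side filtering (the intentFormat key and the format tag are hypothetical illustrations, not an existing standard):

    import qualified Data.Map as Map
    import Data.Map (Map)

    -- Hypothetical intent resource: label declares the intent format.
    data Intent = Intent { label :: Map String String }

    -- Formats this solver knows how to handle (illustrative tag).
    supported :: [String]
    supported = ["erc-7683"]

    canSolve :: Intent -> Bool
    canSolve i =
      case Map.lookup "intentFormat" (label i) of
        Just fmt -> fmt `elem` supported
        Nothing  -> False

    main :: IO ()
    main = do
      let pool = [ Intent (Map.fromList [("intentFormat", "erc-7683")])
                 , Intent (Map.fromList [("intentFormat", "custom")]) ]
      print (length (filter canSolve pool))  -- 1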

Documenting Standards

I still think it is important to have a place where we write down conventions around topics like the above for specific versions. I would still call these standards, as standards can be temporary, deprecated, or superseded.

This includes questions like the FFI call data lookup key and the timestamp mechanism raised above, which we haven’t discussed much in this thread.
