Projection Functions and Data Discovery

Michael · August 15, 2024, 11:35am

This thread aims to clarify the definition of projection functions and their dependency on data discovery and data availability services.

My initial implementation of the balance projection function for Kudos was:

balance (kind : Kind) (account : PublicKey) : Nat :=
  let
    ownedResources : Set Commitment := anomaGet (kind, account);
  in for (sum := 0) (cm in Set.toList ownedResources)
       {let
         q := Resource.quantity (commitmentResource cm);
       in q + sum};

You, @cwgoes, commented:

They can have custom outputs, but I thought that we defined projection functions as taking as input only the application’s state (i.e. resources associated with the resource logics which define the application)

This makes sense to me, and I tried to align it with the brainstorming we did yesterday with @degregat, where we also discussed data discovery. We said that an indexing service might take a filter as an input argument and return a Set Hash associated with the resources (or data BLOBs in general) that have the filtered properties.
Given the set of hashes, the associated resource plaintexts could then be fetched from a data availability service and passed to the projection function.
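To make the flow concrete, here is a rough sketch in Python (not Juvix) of the three steps: filter at the indexer, fetch plaintexts from the DA service, then project. Everything in it is an assumption made for illustration: the `Resource` fields, the `nonce` used to keep otherwise identical resources distinct, the SHA-256 `resource_hash` stand-in for a real commitment, and the `Indexer` / `DataAvailability` classes, which are in-memory mocks rather than any actual service API.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, FrozenSet, Iterable


@dataclass(frozen=True)
class Resource:
    # Hypothetical, simplified resource plaintext.
    kind: str
    owner: str
    quantity: int
    nonce: int  # keeps otherwise identical resources distinct


def resource_hash(r: Resource) -> str:
    # Stand-in for a real commitment / content hash.
    data = f"{r.kind}|{r.owner}|{r.quantity}|{r.nonce}".encode()
    return hashlib.sha256(data).hexdigest()


class Indexer:
    """Mock indexing service: takes a filter, returns a set of hashes."""

    def __init__(self, resources: Iterable[Resource]) -> None:
        self._by_hash = {resource_hash(r): r for r in resources}

    def query(self, predicate: Callable[[Resource], bool]) -> FrozenSet[str]:
        return frozenset(h for h, r in self._by_hash.items() if predicate(r))


class DataAvailability:
    """Mock DA service: takes hashes, returns the resource plaintexts."""

    def __init__(self, resources: Iterable[Resource]) -> None:
        self._store = {resource_hash(r): r for r in resources}

    def fetch(self, hashes: Iterable[str]) -> FrozenSet[Resource]:
        return frozenset(self._store[h] for h in hashes)


def balance(resources: Iterable[Resource]) -> int:
    # The projection function only sees resources, never the services.
    return sum(r.quantity for r in resources)


state = [
    Resource("kudos", "alice", 5, 0),
    Resource("kudos", "alice", 7, 1),
    Resource("kudos", "bob", 3, 2),
    Resource("other", "alice", 11, 3),
]
indexer = Indexer(state)
da = DataAvailability(state)

hashes = indexer.query(lambda r: r.kind == "kudos" and r.owner == "alice")
alice_kudos = da.fetch(hashes)
print(balance(alice_kudos))  # 12
```

The point of the split is that `balance` stays a pure function of application state, while the kind/owner filtering moves into the query.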

Does this generally make sense?

If yes, then the balance projection function implementation would just be

balance (resources : Set Resource) : Nat :=
  for (sum := 0) (r in Set.toList resources) {sum + Resource.quantity r};

and I should change the report accordingly.

Error Handling / Sanity Checking

I am now wondering where error handling / sanity checking should take place. Probably, it should happen after the data discovery and data availability service requests.
E.g., for the balance function to work, we would initially query/filter for resources of a certain kind that are owned by a certain account. We would then also need to check that these properties actually hold for the returned resources. If not, the indexing or DA service should be penalized (which is, as I understand, where slow games become important).
Afterward, we can just pass the Set Resource to the balance projection function.
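A minimal sketch of what that check could look like, again in Python rather than Juvix and under stated assumptions: the `Resource` fields and the SHA-256 `resource_hash` are hypothetical stand-ins, and the violation labels (`"da-unrequested"`, `"indexer-filter"`, `"da-missing"`) are names I made up to show which service would be at fault. How a violation turns into an actual penalty is not modeled here.

```python
import hashlib
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Resource:
    # Hypothetical, simplified resource plaintext.
    kind: str
    owner: str
    quantity: int
    nonce: int


def resource_hash(r: Resource) -> str:
    # Stand-in for a real commitment / content hash.
    data = f"{r.kind}|{r.owner}|{r.quantity}|{r.nonce}".encode()
    return hashlib.sha256(data).hexdigest()


def check_response(
    kind: str,
    owner: str,
    hashes: FrozenSet[str],
    fetched: Iterable[Resource],
) -> Tuple[Set[Resource], List[Tuple[str, str]]]:
    """Split a DA response into accepted resources and violations."""
    accepted: Set[Resource] = set()
    violations: List[Tuple[str, str]] = []
    seen: Set[str] = set()
    for r in fetched:
        h = resource_hash(r)
        if h not in hashes:
            # The DA service returned data that was never requested.
            violations.append(("da-unrequested", h))
            continue
        seen.add(h)
        if r.kind != kind or r.owner != owner:
            # The indexer listed a hash that does not satisfy the filter.
            violations.append(("indexer-filter", h))
            continue
        accepted.add(r)
    for h in sorted(hashes - seen):
        # A requested hash was not served by the DA service.
        violations.append(("da-missing", h))
    return accepted, violations


def balance(resources: Iterable[Resource]) -> int:
    return sum(r.quantity for r in resources)


mine = Resource("kudos", "alice", 5, 0)
not_mine = Resource("kudos", "bob", 3, 1)

# A faulty indexer reports both hashes for the filter (kudos, alice).
reported = frozenset({resource_hash(mine), resource_hash(not_mine)})
accepted, violations = check_response("kudos", "alice", reported, [mine, not_mine])

print(balance(accepted))  # 5
print(violations)
```

Only `accepted` is handed to the projection function; `violations` is what would feed into whatever penalty mechanism is in place.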

Does this sound correct @cwgoes @degregat?

cwgoes · August 20, 2024, 6:20am

Thanks for the writeup here. Just some passing thoughts:

I think your revised projection function is in alignment with the original definition, yes. There’s a lot of nuance here in general which we haven’t fully unpacked (and will need to figure out soon). For example, I imagine that some applications would want to process historical state changes (transactions) in addition to current state (resources), so we might want to amend the projection function definition to allow them to do that. That’s just a type change, though - the difficult parts will mostly lie in figuring out how to analyze these functions, construct the right indexes, and route data around the network in an efficient manner.
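As a rough illustration of the "type change" (in Python rather than Juvix, and purely a sketch): the `Resource` and `Transaction` shapes, the two type aliases, and the `lift` helper below are all hypothetical names, not part of any existing definition. The idea is only that a history-aware projection takes the transaction history as a second input, and that the current state-only projections remain a special case.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Sequence


@dataclass(frozen=True)
class Resource:
    # Hypothetical, simplified resource plaintext.
    kind: str
    owner: str
    quantity: int
    nonce: int


@dataclass(frozen=True)
class Transaction:
    # Hypothetical, simplified state change.
    consumed: FrozenSet[Resource]
    created: FrozenSet[Resource]


# Current definition: a projection sees only the application's state.
StateProjection = Callable[[FrozenSet[Resource]], int]

# Possible amendment: a projection may also see historical state changes.
HistoryProjection = Callable[[FrozenSet[Resource], Sequence[Transaction]], int]


def balance(resources: FrozenSet[Resource]) -> int:
    return sum(r.quantity for r in resources)


def total_ever_created(
    resources: FrozenSet[Resource], history: Sequence[Transaction]
) -> int:
    # Needs history: consumed resources no longer appear in current state.
    return sum(r.quantity for tx in history for r in tx.created)


def lift(p: StateProjection) -> HistoryProjection:
    # Any state-only projection is a history projection that ignores history.
    return lambda resources, history: p(resources)


r1 = Resource("kudos", "alice", 5, 0)
r2 = Resource("kudos", "alice", 7, 1)
r3 = Resource("kudos", "bob", 5, 2)

history = [
    Transaction(consumed=frozenset(), created=frozenset({r1, r2})),
    Transaction(consumed=frozenset({r1}), created=frozenset({r3})),
]
state = frozenset({r2, r3})

print(lift(balance)(state, history))       # 12
print(total_ever_created(state, history))  # 17
```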

Regarding error handling and sanity checking, this discussion on proofs is relevant - whatever services we query from, we should request proofs (which may often just be signatures) - and there should be a service commitment in place such that if a computational result is found to be invalid, proof that the service commitment has been broken can be published (and perhaps an operator slashed). There’s a separate question about how to figure out that we have actually queried all of the relevant state, which is not a trivial problem as the state is typically distributed around the network. If we’re just querying created but not yet consumed resources, we can get a proof (at least in principle) that all of the resources have been iterated over and checked (by whatever the filter function is). For more general state / history queries this isn’t possible in the same way - we should probably construct a little taxonomy of query type vs. what kinds of “completeness guarantees” we can get.
