Applications and service commitments

I think the most underdetermined remaining part of application interface standardization is figuring out how applications interact with service commitments (promises to store data, perform computation, perform ordering, etc.). In particular, I would expect that applications rely on service commitments in the following ways:

  • In order to maintain state over time, applications rely on storage services.
  • In order to access state efficiently, applications rely on availability services (serving data over the P2P layer), which will likely be tightly coupled with some compute services to index over that data and serve computed results.
  • In order to compute longer-term aggregate statistics, applications will rely on perhaps a separate subclass of compute services.
  • In order to provide counterparty discovery for user intents, applications will rely on compute and bandwidth services (i.e. solvers).
  • In order to order user transactions, applications will rely on ordering services.

I think we should be able to clearly elucidate what services a specific application requires by:

  • describing the state of the application (which will be sharded over the network) and who is expected to store each part at each point in logical application time
  • describing the services the application wishes to provide to users, which will entail availability, compute, networking, and ordering services

We will need to figure out which parts of these definitions belong in which data structures - the same application (in the resource machine sense) can be coupled with different sets of promises chosen by different users. Perhaps it would be helpful to introduce a notion of an “application service configuration” or something like this. With a sufficiently clear definition of a service configuration, we should be able to analyze whether or not an application can “keep promises” to its users - promises of a more gestalt form - on the basis of whether the service providers keep the component promises that make up the application service configuration.

Curious for thoughts from @nzarin @vveiln @Michael.

  1. You mention applications relying on storage, data availability (D/A), compute, bandwidth, and ordering services. What is the difference between these services? Are they expected to have different properties (if so, which properties are expected from each kind?), or can they all be described as compute + reduced-functionality-compute services?
  2. Does the application decide who provides all of the services above or only some of them? I see it is at least responsible for choosing who works with its state
  3. How do networking services (that the application can choose to provide) relate to the other services from the list? Asking because we don’t expect the application to rely on them but to provide them to the users
  4. What is the relationship between the services the application provides for the users and the services the application relies on? Is the application in some sense an intermediary between the service providers and the users, so by comparing what the application promises to the users and what services it relies on we can tell if it can actually deliver what it promises?

Great questions!

At the highest level of abstraction, all of these services can be described as promises about what messages to send or not send in response to other messages over time, so in that sense they can be unified, but there are important sub-categories, I would say maybe four:

  • storage services concern (a) storing a particular piece of data for a certain time period, and (b) responding promptly to requests to retrieve that data (so-called “data availability”)
  • compute services concern performing computation and possibly providing proofs-of-correctness upon request
  • ordering services concern ordering transactions on request and maintaining safety (not double-signing)
  • bandwidth services concern sending network messages between various nodes upon request

These services will often be colocated (i.e. performed by the same physical node) for efficiency reasons, but they are conceptually and analytically distinct.
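To make the distinction a bit more concrete, here is a minimal sketch of the four categories as separate promise types. All class and field names are hypothetical and chosen for illustration only; nothing here is an existing Anoma data structure.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Union


@dataclass(frozen=True)
class StoragePromise:
    provider: str
    blob_hash: str
    until: int  # logical time until which the blob is stored and retrievable


@dataclass(frozen=True)
class ComputePromise:
    provider: str
    task_class: str
    with_proof: bool  # whether proofs-of-correctness are provided on request


@dataclass(frozen=True)
class OrderingPromise:
    provider: str
    no_double_signing: bool = True  # the safety part of the promise


@dataclass(frozen=True)
class BandwidthPromise:
    provider: str
    max_bytes: int


ServicePromise = Union[StoragePromise, ComputePromise, OrderingPromise, BandwidthPromise]


def colocated(promises: List[ServicePromise]) -> Dict[str, Set[str]]:
    """Group promise kinds by provider: one node may offer several distinct services."""
    by_provider: Dict[str, Set[str]] = {}
    for promise in promises:
        by_provider.setdefault(promise.provider, set()).add(type(promise).__name__)
    return by_provider
```

The point of `colocated` is only that a single provider can appear under several kinds while the kinds themselves stay separate types.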

I think we probably want to reify services as first-class objects within the resource machine, i.e. represent promises to provide particular services and requests to obtain particular services as resources, in which case yes, applications can reason about the promises concerning their state. That said, there should still be a distinction between the application definitions (resource logics, transaction functions, projection functions) and choices of who should provide what services, which should always be up to the user. Maybe we will want some notion of an ApplicationServiceConfiguration which can often be packaged along with the other parts of an application definition, or something like that. I also expect that applications will often be configuring services on the basis of who has permissions in the application itself - e.g. in multichat, users with publishing rights to a channel should perhaps also be responsible for storing messages, or choosing themselves another party to do so.

I’m not sure I entirely follow this line of thought, can you expand a bit? Do you have an example application and example networking services in mind?

An application is not an intermediary in the network sense - it’s not a physical node - and as such it does not itself technically issue promises or provide services. I guess maybe a better way to phrase what I mean is that an application should often come with definitions as to what users should be able to do in interacting with it, which will require various services, and users should be able to reason about - given known promises to provide services - whether the application can indeed provide for those kinds of interactions or not.
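As a rough sketch of that kind of reasoning: if each user interaction lists the services it needs, and the user has a set of known promises, then checking deliverability is a set difference. The `(kind, subject)` encoding and the function name are assumptions made for this example, not a proposed format.

```python
from typing import Dict, Set, Tuple

# Hypothetical encoding: (service kind, subject), e.g. ("storage", "channel-1").
Requirement = Tuple[str, str]


def unmet_requirements(
    interactions: Dict[str, Set[Requirement]],
    known_promises: Set[Requirement],
) -> Dict[str, Set[Requirement]]:
    """For each interaction, return the required services no known promise covers.

    Interactions whose requirements are all covered are omitted from the result.
    """
    gaps: Dict[str, Set[Requirement]] = {}
    for name, needs in interactions.items():
        missing = needs - known_promises
        if missing:
            gaps[name] = missing
    return gaps
```

An empty result means every listed interaction is covered by some known promise; it says nothing about whether providers actually keep those promises.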


I’m asking because you didn’t mention networking services when listing the services applications rely on, but you did mention them when describing what applications provide to users. I’m not sure what these networking services actually are, and I don’t have a specific example in mind.


Some questions related to the ApplicationServiceConfiguration:

We store content-addressed BLOBs in the Anoma key-value storage, so the data is immutable.
How can we have dynamic data, e.g., a list of service providers changing over time?
Is this supposed to happen via the pub-sub system, i.e., could this configuration file refer to a topic ID or be a topic itself?

I know that Protocol Labs’ IPLD/IPNS is a solution for this, but I am not sure how it maps to Anoma.


Service providers could potentially be discovered over the pub/sub system, yes, and one could most likely use patterns very similar to the intent gossip / counterparty discovery system - I think we can just express service commitments and requests in the resource machine as an application, which would make that easy to do.

I think perhaps a related question is the question of nameservices, i.e. how would users or applications be able to associate names of semantic import with particular content-addressed pieces of data such as blobs in storage or external identities. Here, there are at least two kinds of nameservices which might be useful:

  • Totally-ordered nameservices, not entirely unlike something such as ENS, which would be applications in the resource machine, where, say, each controller can maintain their own namespace. Totally-ordered nameservices would make sense for e.g. tracking new versions of a particular application frontend code.
  • Eventually-consistent nameservices, which would just be associations gossiped around the P2P network, and stored in a distributed fashion. See this subsection of the identity specs for some relevant descriptions, but we still need to fully specify this. Eventually-consistent nameservices would make sense for e.g. trying to reach another node associated with some name (where totally-ordered consistency is not required), trying to look up validator quorums, etc.

Does that help answer your question? I can take a further look at IPLD/IPNS as well soon.


Does that help answer your question?

Yes, at least partially. Maybe it helps if I provide more context.

Concretely, I was thinking how an application could be loaded into the local storage/memory of the node when used for the first time (or after a longer period of inactivity). In this context, I imagined that there is an application configuration file pointing to all the data blobs being required and known already for this specific application (version). This could include, e.g.,

  • frontend code
  • transaction, projection, and logic function objects
  • non-linear/constant resources (?)

as well as a list of providers that are expected to store these (and provide other services) as you wrote above:

I think we should be able to clearly elucidate what services a specific application requires by:

  • describing the state of the application (which will be sharded over the network) and who is expected to store each part at each point in logical application time
  • describing the services the application wishes to provide to users, which will entail availability, compute, networking, and ordering services

An example config file could look like this:

// Application Configuration
{
  "frontend": "Bytes32", // Blob CID
  "interface": {
    "transactionFunctions": "Map String Bytes32", // Name + Blob CID
    "projectionFunctions": "Map String Bytes32"
  },
  "resources": {
    "logicFunctions": "List Bytes32", // Blob CIDs
    "nonLinear": "List Bytes32"
  },
  "services": {
    "storage": {}, // TODO: Unclear as it depends on how services are defined
    "compute": {},
    "ordering": {},
    "bandwidth": {}
  }
}
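To illustrate how a node might use such a file on first load, here is a small Python sketch that collects every blob CID the configuration points to. It assumes the config has been parsed into a dict with exactly the keys shown above; the function name and the placeholder CIDs are made up for the example.

```python
from typing import Any, Dict, List


def required_blobs(config: Dict[str, Any]) -> List[str]:
    """Collect every blob CID the configuration references, without duplicates."""
    interface = config["interface"]
    resources = config["resources"]
    cids: List[str] = [config["frontend"]]
    cids += sorted(interface["transactionFunctions"].values())
    cids += sorted(interface["projectionFunctions"].values())
    cids += resources["logicFunctions"]
    cids += resources["nonLinear"]
    # dict.fromkeys keeps first-seen order, so shared blobs are fetched once.
    return list(dict.fromkeys(cids))


# Placeholder CIDs; real entries would be 32-byte content hashes.
example_config = {
    "frontend": "cid-frontend",
    "interface": {
        "transactionFunctions": {"send": "cid-send"},
        "projectionFunctions": {"balance": "cid-balance"},
    },
    "resources": {"logicFunctions": ["cid-logic"], "nonLinear": ["cid-const"]},
    "services": {"storage": {}, "compute": {}, "ordering": {}, "bandwidth": {}},
}
```

The `services` section is ignored here since its shape is still open.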

Some entries could change over time:

  • the list of providers changes over time
  • a newer frontend or an updated transaction function becomes available (basically a new version)

With a name service, finding the latest application configuration file is straightforward.
However, this also could raise security concerns, e.g., if the application frontend or transaction function is switched out and suddenly behaves differently/maliciously.
If things are heavily composed, that could be more dangerous.
In my previous work, we built an on-chain setup process with a pre-approval mechanism that had to pass before any installation, update, or uninstallation.
Is there an easier way to deal with this? I am realizing that we enter the complex topic of application versioning, upgradeability, and curation here.

I think we can just express service commitments and requests in the resource machine as an application

Oh, this sounds great!

I am now wondering what this could look like.
Let’s say my intent is to store a 1 GB blob for 1 year.

  1. I call a storeBlob transaction function and provide
    – my intent (“I give at most 10 Kudos. I want a StorageCertificate for my blob that’s valid for 1 year.”)
    – a reference to my blob (e.g., my own node ID and the key to the blob in my local key-value store)

  2. A service provider matches my intent and it gets settled. The provider receives my Kudos and I receive a StorageCertificate resource referencing my blob address

  3. Within one year I can get availability proofs from a projection function. If my blob is unavailable, I can call a transaction function that checks the availability again and allows me to claim a refund + compensation for the StorageCertificate.

  4. After 1 year, the StorageCertificate expires and anyone can burn it.

Does this sound right? I find it harder to imagine how this would work for bandwidth, compute and ordering services.
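To pin down the storage flow in steps 1-4, here is a minimal Python sketch of it as a tiny state machine. Everything in it is an assumption for illustration: the class and function names, treating a successful refund claim as consuming the certificate, and passing the availability check in as a boolean (how that check would actually work is left open).

```python
from dataclasses import dataclass

YEAR = 365 * 24 * 3600  # seconds


@dataclass
class StorageIntent:
    owner: str
    blob_ref: str
    max_price: int  # in Kudos
    duration: int   # seconds


@dataclass
class StorageCertificate:
    owner: str
    provider: str
    blob_ref: str
    price_paid: int
    expires_at: int
    burned: bool = False


def settle(intent: StorageIntent, provider: str, ask_price: int, now: int) -> StorageCertificate:
    """Step 2: a provider matches the intent; the owner receives a certificate."""
    if ask_price > intent.max_price:
        raise ValueError("ask exceeds the intent's maximum price")
    return StorageCertificate(
        intent.owner, provider, intent.blob_ref, ask_price, now + intent.duration
    )


def claim_refund(cert: StorageCertificate, blob_available: bool, now: int, compensation: int) -> int:
    """Step 3: pay out refund + compensation if the blob is unavailable before expiry."""
    if cert.burned or now >= cert.expires_at or blob_available:
        return 0
    cert.burned = True  # assumption: a successful claim consumes the certificate
    return cert.price_paid + compensation


def burn_if_expired(cert: StorageCertificate, now: int) -> bool:
    """Step 4: after expiry, anyone can burn the certificate."""
    if now >= cert.expires_at and not cert.burned:
        cert.burned = True
        return True
    return False
```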


I see, makes sense (in general). It’s a good intuition. I think we can split this into (at least) three distinct concerns, which I will call the service commitment cache, state-dependent provider lookup, and application frontend versioning:

Service commitment cache

I think it’s worth noting that many data blobs may be shared between different applications, so at least some of this data (e.g. who was last known to be storing a particular blob, who made a service commitment to store a particular blob, …) would make sense to keep in a local cache shared across all applications. Many service commitments will also be made in a way which is not strictly tied to a particular application - for example:

  • storage service commitments will be tied to particular data blobs (content-referenced by hash)
  • compute service commitments will probably be tied to classes of intents (so related to applications, but not necessarily a 1-1 correspondence)
  • ordering service commitments may be tied to assets used for fee payment and/or identities of the parties transacting or proofs which the parties are required to make in order to use a particular ordering service
  • packet relay service commitments may be similarly tied to assets used for fee payment, identities of the parties transacting, or other historical data

So rather - it seems to me - it might make sense to keep a “service commitment cache” which stores information about known service commitments - and use that cache in conjunction with the application definition and known references (e.g. data blobs) to look up service providers.
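A minimal sketch of what such a cache could look like, keyed by what each commitment is tied to rather than by application. The `Commitment` fields and the class name are hypothetical; in particular, a single `expires_at` is a simplification of whatever terms a real commitment would carry.

```python
from collections import defaultdict
from typing import DefaultDict, List, NamedTuple, Tuple


class Commitment(NamedTuple):
    kind: str        # "storage", "compute", "ordering", or "bandwidth"
    subject: str     # what it is tied to: blob hash, intent class, fee asset, ...
    provider: str
    expires_at: int  # logical time after which the commitment no longer holds


class ServiceCommitmentCache:
    """Node-local cache of known commitments, shared across all applications."""

    def __init__(self) -> None:
        self._by_key: DefaultDict[Tuple[str, str], List[Commitment]] = defaultdict(list)

    def record(self, commitment: Commitment) -> None:
        self._by_key[(commitment.kind, commitment.subject)].append(commitment)

    def providers(self, kind: str, subject: str, now: int) -> List[str]:
        """Providers with an unexpired commitment of this kind for this subject."""
        return [
            c.provider
            for c in self._by_key.get((kind, subject), [])
            if c.expires_at > now
        ]
```

An application would then combine its own definition (which blobs, which intent classes) with `providers(...)` lookups instead of carrying a provider list itself.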

State-dependent provider lookup

That doesn’t cover all of what I think you’re describing here, though - in particular, you mention that service commitments can change over time, which is true. Another part of what I meant by “who is expected to store each part at each point in logical application time” is that these expectations may be state-dependent. Concretely, for example:

  • in multichat (a version without a designated storage provider), the sender of a message to a channel may be expected to store it for at least 7 days, and the receiver for at least 1 day
  • … so, if I want to fetch the message data, as a party syncing my local state, I should contact the sender and/or receiver (depending on time elapsed since the message)
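The multichat rule above can be written as a small lookup function. This is only a sketch of that one example: the function name is made up, and timestamps are assumed to be plain seconds.

```python
from typing import List

DAY = 24 * 3600  # seconds


def expected_holders(sender: str, receiver: str, sent_at: int, now: int) -> List[str]:
    """Who is still expected to hold a message, given how long ago it was sent."""
    age = now - sent_at
    holders: List[str] = []
    if age <= 7 * DAY:   # sender stores for at least 7 days
        holders.append(sender)
    if age <= 1 * DAY:   # receiver stores for at least 1 day
        holders.append(receiver)
    return holders
```

After 7 days nobody is obliged to hold the message, so the list is empty, although a party may of course still have it.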

In other words, often we will not be able to say which service providers are even relevant until we are talking about a particular piece of state about which we already know some information (in this case, the channel metadata and the message timestamp) - and - if I don’t already know this metadata - I will need to first fetch it before fetching the message data. This implies:

  1. We will need a recursive synchronization procedure which - each run - fetches data that we know the providers for - and runs until all data is fetched.
  2. For efficiency reasons (avoiding too many round-trips), we will probably want to come up with a way to represent and forward these complex queries as functions (so that e.g. a party who might have all the data already can serve the query using only their local cache and return all the results to you as a bundle). Luckily, this should be more or less the same problem as representing the application read interface that we’ve already been discussing.
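Point 1 can be sketched as a fixpoint loop. In this hypothetical version, `try_fetch` returns `None` when the local state does not yet tell us who to ask, and `references` extracts further keys from fetched data; both are stand-ins for whatever the real lookup and decoding would be.

```python
from typing import Callable, Dict, Iterable, Optional, Set


def synchronize(
    roots: Iterable[str],
    try_fetch: Callable[[str, Dict[str, bytes]], Optional[bytes]],
    references: Callable[[bytes], Iterable[str]],
) -> Dict[str, bytes]:
    """Fetch everything reachable from `roots`, one round at a time.

    Each round fetches the pending keys whose providers can be derived from
    what is already local; newly fetched data may reveal more keys. The loop
    stops when nothing is pending or a round makes no progress.
    """
    local: Dict[str, bytes] = {}
    pending: Set[str] = set(roots)
    while pending:
        progressed = False
        for key in sorted(pending):  # sorted() copies, so pending can change below
            data = try_fetch(key, local)
            if data is None:
                continue  # provider not known yet; retry in a later round
            local[key] = data
            pending.discard(key)
            pending.update(k for k in references(data) if k not in local)
            progressed = True
        if not progressed:
            break
    return local
```

Each round here costs at least one round-trip per key, which is exactly the overhead point 2 is about.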

Application frontend versioning

First, just to avoid confusion: I think it may be easier not to think of applications and frontends as abstractions which are particularly coupled. An application in Anoma is a set of interface definitions - which can be composed and decomposed - while a frontend (for now, before we have some sort of composable UI library) is a specific piece of data that may allow the user to interact with parts of one application or parts of many applications. The questions of upgrading, versioning, etc. for the frontend and for the application are very different - in particular, because the application state must be handled in the latter case. I think application state versioning deserves a separate treatment, so let’s just talk about frontend versioning for now here. I think this is quite related to the nameservice question I discussed in my post - a simple implementation could say that:

  • a particular frontend name is reserved by a particular party, represented by a non-fungible token resource
  • whenever they want to change what code that name points to, they consume the old NFT resource and make a new NFT resource with the same name and the new code reference (hash)
  • a second party who wants to use this party’s name can look it up and download the frontend code - then perhaps perform other kinds of verification on the frontend code (e.g. typecheck it)
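Here is a minimal Python sketch of that simple implementation. It models the registry as an in-memory object rather than actual resources, and it assumes SHA-256 as the content hash; both are simplifications for illustration, and all names are hypothetical.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class NameResource:
    name: str
    owner: str
    code_hash: str


class FrontendNames:
    def __init__(self) -> None:
        self.live: Dict[str, NameResource] = {}
        self.consumed: List[NameResource] = []

    def reserve(self, name: str, owner: str, code_hash: str) -> None:
        if name in self.live:
            raise ValueError("name already reserved")
        self.live[name] = NameResource(name, owner, code_hash)

    def update(self, name: str, caller: str, new_code_hash: str) -> None:
        """Consume the old resource and create a new one with the same name."""
        old = self.live[name]
        if old.owner != caller:
            raise PermissionError("only the holder can repoint the name")
        self.consumed.append(old)
        self.live[name] = NameResource(name, caller, new_code_hash)

    def resolve(self, name: str) -> str:
        return self.live[name].code_hash


def matches(code: bytes, code_hash: str) -> bool:
    """Check downloaded frontend code against the hash the name points to."""
    return hashlib.sha256(code).hexdigest() == code_hash
```

A second party would call `resolve`, download the code from whoever stores it, and run `matches` before doing any further verification.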

More complex versions could add attestations to new code versions from different parties which the user could then check according to their personal trust graph - but let’s start with something simple first :)

Yes, although the part about checking the availability and claiming the refund + compensation hides a lot of distributed systems and game theoretic complexity that we’ll need to reason through - but in principle, one could imagine such refund (insurance) mechanisms existing, yes. The rest of the steps you describe match my understanding, except that as part of (1) or (2) you need to actually send the blob to the storage provider in question (somehow) - could be part of the transaction, or perhaps sent separately and just referenced.

I haven’t thought through everything in detail here (@Jamie and @nzarin are also thinking about these problems), but for example:

  • for bandwidth services, you could request that another node forwards the next 1 GB of packets you send them in the next month
  • for compute services, you could request that another node commits X hours of CPU-time on request for the next month
  • for ordering services, you could request that the consensus provider commits X units of gas on request for the next month
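All three examples share the same shape: a quota in some unit, drawn down on request until a deadline. A minimal sketch of that shape, with hypothetical names and a 30-day month assumed:

```python
from dataclasses import dataclass

MONTH = 30 * 24 * 3600  # seconds; a 30-day month is an assumption


@dataclass
class MeteredCommitment:
    provider: str
    unit: str        # e.g. "bytes", "cpu-seconds", "gas"
    quota: int       # total amount committed
    expires_at: int  # time after which no more requests are honoured
    used: int = 0

    def draw(self, amount: int, now: int) -> bool:
        """Consume part of the quota; refuse if expired or over quota."""
        if now >= self.expires_at or self.used + amount > self.quota:
            return False
        self.used += amount
        return True
```

For instance, `MeteredCommitment("node-b", "bytes", 10**9, MONTH)` would stand for the 1 GB packet-forwarding request, with `draw` called per forwarded batch.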

My thinking was that, because the storage provider has the reference to my blob from my intent, they can fetch it from my node, but maybe this is not how it works.
We definitely don’t want to put 1 GB of data into the tx extra data and gossip it around until a counterparty is found.


Yes, something like that would probably make sense.
