[Proposal] Characterizing the Hierarchy of Games in Anoma

As a follow-up to the intent machine ART (which is going to be published within the next two weeks), we want to make progress in characterizing the hierarchy of games within Anoma by analyzing where boundaries can be drawn and how they relate to each other. The notes below are the result of a meeting with @nikete and @cwgoes.

Intro

Here we want to introduce the fast and slow games.

Mechanisms of regulation

  • How does information flow between games?
  • Formulation of game relations in terms of a control-theoretic and distributed-systems approach

Objects and Actors in Games

Controllers

Analyze games around execution and state updates, e.g. around inclusion, censorship and ordering

Solvers

Analyze barter auctions and solving

Time Relations

Introduce time relations along different dimensions (bandwidth, compute, etc.).

Can we derive bounds on welfare extraction related to the time relations of games:

  • what do the shapes of the extraction functions look like to fast actors?
  • what does aggregate data look like to slower actors?
  • what are the costs of the verification/auditing procedures necessary to regulate a fast game from a slower one?
  • given this data and the auditing procedure costs, what do the bounds look like?

Try to formulate/draft (coarse) quantifications for expected and worst-case scenarios.

Mechanisms

  • Introduce the purposes of mechanisms for controllers and solvers
  • The slow game will likely want to decide on properties of mechanisms to be approximated, or on research to be performed into what might be possible, e.g. requesting examples along a tradeoff frontier, s.t. better decisions can be made in the next round or in slower games

Analysis of a single slow game

Work out an example in detail with a single fast and slow game.

Analysis of multiple slow games

Work out an example of multiple, interacting slow games, up to the slowest game.

Slowest game

This is the fixed point with respect to the time operator. Decisions in this game do not come from any slower/outer game.

Future Work

Build the bridge to further reports on specific slow and fast games which are of practical interest to the running of the Anoma network.


My understanding of the scope here:

  1. Introduction
    • Basic intuition for the frame, what is the fast game, what is the slow game
    • Basic intuition for control / feedback here, periodic cycles
    • What questions will this report try to answer
  2. Examples
    • Controller selection in Anoma
    • Solver selection in Anoma
    • Ethereum validator selection
    • Coordinated group electing some decision-makers (keep it relatively abstract)
  3. Analysis of a single slow game
    • Time relations (as mentioned above)
    • Verification procedures
    • External time reference for the slow game
    • Ratio of the expected case to worst case
  4. Analysis of multiple slow games
    • “Slow/fast game” is a relative distinction
    • Slow game hierarchy & fix point
    • Multiple slow games at once and questions of game identity
  5. Future work

Simple example:

  • Fast game agent making a decision about a number (maybe the number is ~ control input into a heater, input electricity)
  • Slow game agents trying to bound the temperature

One of the things we want to achieve is to calculate bounds/probabilities of being able to detect defection of fast agents from rules set and enforced by slow agents.
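The heater example can be simulated directly as a sanity check. The sketch below is purely illustrative: the linear world model, the noise level, the parameter values, and the detection rule (a threshold on a noisy mean) are my own assumptions, not part of the proposal.

```python
import random

random.seed(0)

# Hypothetical parameters, chosen only for illustration.
F_PER_S = 10          # fast rounds per slow round
TEMP_BOUND = 25.0     # slow agents want temperature <= this bound
HONEST_INPUT = 1.0    # heater input that keeps temperature in bounds
DEFECT_INPUT = 2.0    # input a defecting fast agent might choose
NOISE_STD = 0.5       # measurement noise of the slow agents

def temperature(heat_input):
    """Toy world model: steady-state temperature is linear in input."""
    return 20.0 + 4.0 * heat_input

def slow_round(fast_inputs):
    """One slow round: the slow agent sees only an aggregate (here the
    mean) of the fast rounds, plus measurement noise, and flags defection
    if the observation exceeds the agreed bound."""
    mean_temp = sum(temperature(x) for x in fast_inputs) / len(fast_inputs)
    observed = mean_temp + random.gauss(0.0, NOISE_STD)
    return observed > TEMP_BOUND

# Honest fast agent: flagged only due to measurement noise.
honest = [HONEST_INPUT] * F_PER_S
# Defecting agent: deviates in 3 of 10 fast rounds.
defecting = [DEFECT_INPUT] * 3 + [HONEST_INPUT] * 7

trials = 10_000
false_alarms = sum(slow_round(honest) for _ in range(trials)) / trials
detections = sum(slow_round(defecting) for _ in range(trials)) / trials
print(f"false alarm rate: {false_alarms:.3f}")
print(f"detection rate:   {detections:.3f}")
```

Even this toy version shows the tradeoff we care about: coarser aggregation (larger F_PER_S) dilutes short deviations into the mean, lowering the detection probability for the same noise level.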

Disclaimer: I am not a domain expert on information and coding theory, so the following postulated correspondences could be arbitrarily misguided and are likely naive. This post will receive edits as we clarify the model.

Bounds on detection probability via rate-distortion theory

One modelling approach, which could inform bounds on the resolution with which we need to make measurements to detect defection, could work via rate-distortion theory:

We postulate the following correspondences:
  • X^n: actions of fast agents
  • f^n: mapping from actions to observable(s) (this can include aggregation)
  • Y^n: observables over actions
  • g^n: mapping from (a subset of) observables to estimations
  • \hat{X}^n: estimated action, derived by decoding observables

X^n, Y^n, and \hat{X}^n are random variables with associated probability functions p.

We should then be able to give bounds on the distortion for a given encoder/decoder pair, the choice of which will depend on the available observations agents could make on proxies for the actions to be approximated.

n, which represents the message length, corresponds to the highest resolution perceivable by the fast player. The difference in the perception of time between slow and fast actors would be implemented in the encoder and decoder pairing.

For a given acceptable distortion, we obtain a lower bound on the bitrate.
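For intuition about what such a bound looks like, the Bernoulli source under Hamming distortion has the classical closed form R(D) = H(p) − H(D). The sketch below computes it; mapping "action" to a binary comply/defect variable and the particular numbers are my own assumptions for illustration, not part of the model above.

```python
from math import log2

def h2(p):
    """Binary entropy in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def rate_distortion_bernoulli(p, d):
    """Closed-form rate-distortion function for a Bernoulli(p) source
    under Hamming distortion: R(D) = H(p) - H(D) for
    0 <= D < min(p, 1 - p), and 0 otherwise."""
    if d >= min(p, 1 - p):
        return 0.0
    return h2(p) - h2(d)

# Illustrative numbers: actions are binary comply/defect with defection
# rate p = 0.2; tolerating 5% estimation error still requires roughly
# 0.44 bits of observation per action.
rate = rate_distortion_bernoulli(0.2, 0.05)
print(f"required bits per observed action: {rate:.3f}")
```

The qualitative takeaway carries over to our setting: the coarser the acceptable estimate of the fast agents' actions, the lower the observation bitrate the slow agents need to sustain.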

Perceptual divergence

Rethinking Lossy Compression: The Rate-Distortion-Perception Tradeoff focuses on a third metric, the divergence d(p_X, p_{\hat{X}}) between input and output distributions. This divergence is meant to capture the divergence in the perception of users and can thus be defined to measure only the relevant dimensions, or to weigh them accordingly.

Expressing our models in these terms should give us a tool to calculate tradeoffs between the coarseness of data aggregation during observation and the detection probability. These should correspond to rate (which is connected to compression) and divergence (which expresses the uncertainty of the estimated actions).
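As a minimal illustration of the divergence term, the following sketch computes the KL divergence between a hypothetical true action distribution p_X and the distribution p_{\hat{X}} the slow agent reconstructs from coarse observations. The distributions and their three-way support are invented for the example.

```python
from math import log2

def kl_divergence(p, q):
    """KL divergence D(p || q) in bits between two discrete distributions
    given as lists over the same support. Assumes q > 0 wherever p > 0."""
    return sum(pi * log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical action distribution of the fast agent vs. the distribution
# reconstructed by the slow agent; coarse observation smooths the tails.
p_true = [0.7, 0.2, 0.1]    # comply / mild defect / strong defect
p_est = [0.8, 0.15, 0.05]
print(f"perceptual divergence: {kl_divergence(p_true, p_est):.4f} bits")
```

Choosing KL divergence here is itself an assumption; the paper's framework permits any divergence, which is what lets us weigh only the dimensions relevant to detecting defection.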

Connection to regulation of fast game actors by slow game actors

Slow game players should decide on policies about actions, to which the fast game players must adhere. For now, we want to ignore the cost of observations, as well as the structure of enforcement mechanisms, but being able to bound or even calculate the probability of detecting defection is a necessary component of the machinery.
Other concerns will be estimating the risk of defection by analyzing observations over time, and calculating the costs of different types of defection.

Example

I will try to work out an informative example in the following days to check these ideas and see if I can derive any meaningful quantitative statements.


Further notes concerning some of these topics:

  • Examples
    • Controller selection: the basic frame is that users want to delegate “custody” (required signature) over some of their state to particular highly-available, performant parties in order to allow those parties to order state transitions involving this state. If this state becomes valuable, these controllers can try to charge higher and higher fees for users to change it, so users need a way to credibly threaten to fork away and just no longer consider the state held by a particular controller to be semantically relevant - this is the “slow game” here.
    • Solver selection: the basic frame is that users want to delegate solving (bandwidth and compute) over some of their intents to particular highly-available, performant parties in order to avoid doing the work themselves and to find more counterparties. These solvers can make solving decisions which are not in alignment with the users’ welfare, so users need a way to measure how aligned/misaligned their solvers decisions are and to switch away if their solvers’ decisions become too misaligned - that’s the “slow game” here.
    • Ethereum validator selection - just an example of a controller.
    • Coordinated group electing some decision-makers - quite similar to the solver case, probably some measure of welfare involved here.
  • Time relations: the slow game and the fast game each have a period - say that the slow game period is S and the fast game period is F. The ratio between the two - S/F - how many fast game periods happen per slow game period - will affect how tightly the slow game participants can bound actions taken by the fast game participants. In particular, if the fast game participants deviate from some desired behaviour in D rounds of the fast game, they then have S/F - D rounds to “exploit” this deviation before the slow game takes action - so, depending on S/F, this may or may not be “worth it” for different kinds of deviation and slow game responses.
  • Verification procedures - this is pretty much straight Bayesian inference - the fast game participants have some configuration space of actions, and the slow game participants take measurements, which they want to use to gain information about what actions were actually taken. How much and what information can be gained here will correspond to the “tightness” of the regulation the slow game can enforce.
  • External time reference - in order to decide to “run” a new slow game round, slow game participants probably need some shared external signal (local clocks within some epsilon of each other count here, as do periodic astronomical events, etc.)
  • Ratio of the expected case to the worst case - much of the benefit of slow game regulation is - if the bounds can be made tight enough - a rational group of fast game actors will act mostly in alignment with the desired behaviour profiles of the slow game participants, without the slow game participants having to run the fast game themselves or go to the bother of verifying exactly what happened all of the time. We should spell this out in more detail.
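The “worth it” calculus from the time-relations bullet can be sketched as a simple expected-payoff comparison. Everything here is a toy model of my own: the parameter names, the linear per-round gain, and the single end-of-round penalty are assumptions, not a claim about how the slow game actually responds.

```python
def deviation_worth_it(rounds_per_slow_round, setup_rounds,
                       gain_per_round, penalty, detection_prob):
    """Expected value to a fast agent of deviating within one slow round.
    Toy model: the agent spends `setup_rounds` fast rounds establishing
    the deviation, extracts `gain_per_round` for each remaining fast
    round, and at the end of the slow round is detected with probability
    `detection_prob`, incurring `penalty` (e.g. users switching away,
    zeroing future rewards)."""
    exploit_rounds = max(rounds_per_slow_round - setup_rounds, 0)
    expected_gain = exploit_rounds * gain_per_round
    expected_loss = detection_prob * penalty
    return expected_gain - expected_loss

# 100 fast rounds per slow round, a 10-round setup, unit per-round gain,
# and a large penalty: high detection probability deters the deviation,
# low detection probability does not.
deterred = deviation_worth_it(100, 10, 1.0, 500.0, 0.9)
not_deterred = deviation_worth_it(100, 10, 1.0, 500.0, 0.1)
print(deterred, not_deterred)
```

Even this crude form makes the regulation knobs visible: the slow game can deter a given deviation by raising the detection probability (better measurements), raising the penalty, or shortening its own period so fewer fast rounds fit inside it.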

Notes on model review:

  • Distinction between observables (in principle can be observable) and observations (what we actually observe).
  • Theoretical encoder - everything that is observable is observed; theoretical decoder - we take everything that was observed, which was everything that was observable, and infer the estimated action.
  • The estimated action should be a probability distribution.
  • What is time in this model? Answer: n.
  • We want there to be two rates - f and s - and be able to reason about the sequence of interactions / measurements here.
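The “theoretical decoder” whose output is a probability distribution can be illustrated with straightforward Bayesian updating, using the censorship setting from the examples above. The drop probabilities and the two-hypothesis model (honest vs. defecting) are invented for illustration.

```python
def posterior_defection(prior, obs, p_drop_honest, p_drop_defect):
    """Posterior probability that the fast agent is defecting (censoring),
    given binary observations (True = transaction dropped). Toy model:
    an honest agent drops transactions with probability p_drop_honest
    (unreliable network); a defecting agent with p_drop_defect."""
    p = prior
    for dropped in obs:
        like_defect = p_drop_defect if dropped else 1 - p_drop_defect
        like_honest = p_drop_honest if dropped else 1 - p_drop_honest
        # Standard Bayes update on the defection hypothesis.
        p = p * like_defect / (p * like_defect + (1 - p) * like_honest)
    return p

# Ten observed drops in a row: strong evidence of censorship even under
# an unreliable network (hypothetical numbers).
p = posterior_defection(prior=0.1, obs=[True] * 10,
                        p_drop_honest=0.05, p_drop_defect=0.9)
print(f"posterior defection probability: {p:.4f}")
```

The “tightness” of regulation mentioned above corresponds here to how fast this posterior concentrates, which depends on the gap between the two likelihoods and on how many observations fit into one slow round.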

Here is my conceptual frame, articulated as clearly as I can manage at the moment :grin:

An instance of the slow game consists of, at minimum:

  • A fast agent (which might be a coordinated group) taking actions. The identity of the fast agent, their action space, and costs/rewards to taking particular actions are specific to each instance.
  • A slow agent (which might be a coordinated group) taking measurements. The identity of the slow agent, the measurements which can be taken, how frequently they can be taken, and how much they cost to take are specific to each instance.
  • A world model (which may or may not be fully known) which determines how the actions taken by the fast agent affect the measurements taken by the slow agent (often over time). The nature of the world model (and how much of it is known) is specific to each instance.
  • A regulatory mechanism through which the slow agent can reward or punish the fast agent, depending on the measurements which they take over time. The nature of the possible rewards and punishments is specific to each instance.
  • A target world profile chosen by the slow agent (often changing over time). This target profile may include actions taken by the fast agents, measurements taken by the slow agents, or in-between (inferable) variables of the world state. The type of the target world profile is specific to each instance, and the value is typically an input to the system over time.
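The five components above could be captured roughly as a typed sketch; all names here are illustrative placeholders of mine, not from any spec.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class SlowGameInstance:
    """Hypothetical sketch of the five instance-specific components."""
    # Fast agent: action space and costs/rewards for taking actions.
    actions: list
    fast_reward: Callable[[Any], float]
    # Slow agent: available measurements and their cost (frequency
    # limits could be folded into the cost function).
    measurements: list
    measurement_cost: Callable[[Any], float]
    # World model: how actions affect measurements (possibly uncertain).
    world_model: Callable[[Any], Any]
    # Regulatory mechanism: reward/punishment from measurement history.
    regulate: Callable[[list], float]
    # Target world profile chosen by the slow agent.
    target: Any
```

One design note: modelling `regulate` as a function of the whole measurement history (rather than the last measurement) is deliberate, since several examples above rely on inference over time.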

The characteristic questions for a slow game instance are:

Given the action space and costs/rewards of the fast agent, the measurement space, frequencies, and costs of the slow agent, the (possibly uncertain) world model, and the available regulatory mechanism:

  • Can a policy be crafted which will achieve the target world profile in incentive-compatible equilibrium?
  • What is that policy?
  • What is the deviation between the reward profile of the actions which best maximize the target world profile, and the reward profile of the actions which best maximize the fast agent’s returns? This could be called something like slack (or extractable value - this is a sort of generalized MEV).

Here are some slow game examples, and how they instantiate each of these variables:

Controller selection in Anoma / Validator selection in Ethereum

  • Fast agent: controller in question, who can choose what fees to charge, and which transactions to possibly censor. The controller’s reward is the fees paid, and possibly side rewards (bribes) for censorship.
  • Slow agent: users submitting transactions to the controller in question, who can measure the fees charged, and can measure over time whether particular transactions are being censored.
  • World model: fees are directly measurable; censorship is probabilistically measurable over time (since we also assume unreliable network conditions).
  • Regulatory mechanism: users can decide whether or not to pay fees, and they can switch controllers, which reduces future rewards for the controller to zero.
  • Target world profile: controller charges fees not more than a fixed margin above its operating costs and what would be needed to clear the market, and controller does not censor transactions.

Solver selection in Anoma

  • Fast agent: solver in question, who can choose to accept or not accept particular intents and to exploit slack (price differences between intents) or to return slack back to users.
  • Slow agent: users submitting intents to the solver in question, who can measure (over time and by comparing with each other) whether the solver is censoring intents and how much slack is being returned to users.
  • World model: slack (MEV) return and censorship are probabilistically measurable over time (since we also assume unreliable network conditions).
  • Regulatory mechanism: users can decide whether or not to keep sending intents to this particular solver, which reduces future rewards for the solver to zero.
  • Target world profile: solver exploits slack not more than a fixed margin above its operating costs and does not censor intents.

Delegated governance systems

  • Fast agent: governance delegates, who can make particular decisions more for their own benefit or more for the benefit of a public (slow agent).
  • Slow agent: voters, who can measure which decisions are made, or at least their impacts.
  • World model: decisions made impact the state of the world (very general)
  • Regulatory mechanism: varies, often voting out particular delegates on a periodic basis, sometimes also emergency referenda
  • Target world profile: general happiness and stability