Yesterday in discussions with @degregat and @Michael it became clear to me that we have not sufficiently spelled out the various flavours of transaction functions and times during the transaction authoring, solving, and ordering process. I thought it might be useful to write up my current understanding, so this is an attempt at that.
First, let’s distinguish between three times which are relevant for each transaction:
“Creation time”, which happens when the user decides that they want to issue an intent to the network. The first computation here happens on the user’s local node, and may access state which is private to that user (e.g. shielded resources). From the perspective of the network, all of this is a single point in time (visible to the network as soon as the user’s node sends a message to some other node).
“Solving time”, which happens when nodes on the network are searching for ways to compose intents (unbalanced transactions) of which they are aware. This is really multiple points in time, but all of them happen in between the creation time and the execution time, so we can treat them as a block for the reasoning here.
“Execution time”, which happens after a final ordering of transactions involving the controller in question has been picked. For simplicity, we can assume that the controller themselves is executing the transaction functions, although this need not always be the case.
Second, I think it’s necessary to clarify the two kinds of representation in play:
A transaction object (the data structure specified in the RM report) is only fully produced during execution time, when all ordering information is known.
Prior to execution time, we’re really just creating, sending around, and processing transaction functions of various flavours - and technically they don’t even return a transaction object until the final transaction function when it is executed - transaction functions during creation time and solving time just return more transaction functions, in a sort of “distributed partial application” process (there’s an old discussion about this)
A further complexity here is that, in order to search for potential matches, a node really must be able to introspect the transaction function and extract the transaction object inside, since there’s no way to search for matches to an opaque function. Now, there are two issues here:
If ordering information is required to determine what the transaction can be matched with, the solver either cannot search at all (since they don’t have that information), or must somehow analyze the relationship between ordering and the transaction (whatever computation in the transaction function), and try to find a match which guarantees that the computation will result in a balanced, valid transaction.
In more common cases, simple post-ordering computations such as incrementing a counter for statistics purposes are independent from the transaction delta, and the solver really just needs to extract the “constant part” of the transaction function to do their analysis, then add back the counter-incrementing code somehow later.
One can imagine that we might want a representation of transaction functions which is easier to decompose and analyze. A first stab would be to simply represent transaction functions as Set TxOrFunc, where:
and the results of the transactions and functions in the set (applied post-ordering) are to be composed together via the transaction composition operator. Then, a solver node can examine directly the Tx items in this set to search for matches, and if those independently balance, the functions computed during execution time should result in a valid and balanced transaction.
I don’t know if this is the right direction here or not yet, there are probably other options. I also wonder if we need separate names for:
the structure passed to the node during creation time (maybe “transaction creation function”)
the structure passed around the solver network (maybe “transaction solver function”)
the structure passed to the controller (this one could stay as “transaction function”)
… but these are just candidates. Thoughts welcome.
So I guess when you say that solvers need to extract the transaction object you really mean they need to have access to it? Because the approach you suggest doesn’t seem to help with extraction
Can the extraction be achieved by executing the transaction function? If so, I would mention it explicitly
My intuition here is that it can be important for example for exchanges where the exchange rate depends on the pool size, and maybe previous transactions affect the rate in a way that doesn’t work for the user. Is that a valid example?
Would it be basically two representations of the same thing (the dream transaction), perhaps the first part ignoring the counter-incrementing parts?
So, solvers learn from the tx object and update both tx and tx function objects in TxOrFunc. How do we ensure the correspondence between the two, assuming that the first one might correspond to the “constant part” of the transaction (so always differs from the transaction object fetus “contained” in the tx function)?
I mean that they need to analyze the transaction function (which will eventually produce a transaction object) to figure out what sort of transaction object it’s going to produce.
Maybe. The final ordering information isn’t known, so e.g. resources cannot be dynamically read, but some transaction functions could be executed beforehand in a simulation which the solver would know would still be valid, yes. We can break this down into enumerated cases.
Yes, that’s a good example.
We don’t need to ensure correspondence, this proposed data structure would just replace whatever solvers are sending around at the moment (which is not very well-specified).
Ah, I misunderstood the proposed solution. But now I don’t understand how it actually helps. Do I understand correctly that:
before the solvers were sending transaction functions around
you propose to replace it with a set of elements each of which can either be a transaction or a transaction function? I assume the set here corresponds to the set of arguments to the composition function
if 1 and 2 are true, we probably would first turn transaction functions from the set into transactions and then compose the transactions? (because i can imagine that we might otherwise need to compose transaction with a transaction function and it isn’t defined)
the solvers would only examine the elements that are transactions and not transaction functions? That seems to imply some kind of rules for when we do turn transaction functions into transactions and when we don’t. Is it only about the ordering information? Would it be fair to say that if a transaction function output doesn’t depend on the ordering information, then we can/will evaluate the transaction function right away and keep dealing with the transaction directly?
The idea is that - before sending to a solver - a user would split their transaction function into the part which is already determined (e.g. the kudos transfer) and the part which is not (e.g. the counter update) and encode those here (so the set would contain two items), and the solver can try to find matches for the transactions in the set while ignoring the transaction functions, and just add them back later.
In the simplest version of this, solvers never turn transaction functions into transactions. In a more complex version, they could elect to run transaction functions sometimes (running the risk of the actual ordering being different than the ordering they use to run the transaction functions).
can’t these be the same thing? (If this is pedantic ignore).
Solver recieves “intent” from Alice. Solver decides to create their own intent to match Alice’s or capture some slack (MEV) between matching Alice and Bob’s.
It doesn’t actually answer my composition question. From what I understand, you propose to compose the set elements in the end, but the set contains both transactions and transaction functions, and composition is not defined for such pairs.
A separate question: are there ways to match intents that depend on ordering information? I can see that it isn’t covered here, but I’m just wondering generally because matching happens before ordering and I can’t think of a way
Curious about this also - shouldn’t you be able to match any intent and do state changes that are far out in the future as long as the tx at execution time (post-ordering execution) is valid?
Lets say there is are intents A and B
A can be executed anytime after time t
B can be executed anytime after time t + 1
A solver with probabilistic knowledge of potential future states could match A and B at time t but not submit the tx for execution until after time t + 1.
If waiting until after time t + 1 to submit the tx including A matched with B results in greater user welfare than otherwise matching A with another intent slightly after time t, then this type of thing would happen often.
I propose to compose the set elements with a different composition function - call it transaction function composition - which simply takes two transaction functions a and b and returns a third function which runs a, runs b, and composes their results (which are transaction objects, on which composition is defined). For set elements which are transaction functions, this composition function can be used directly; for set elements which are transactions, the constant function can be applied to turn them into transaction functions.
I think it depends on what you mean by “ways to match”. Insofar as parts of the intent - e.g. constraint carrier resources - may require information in order to craft a proof that can only be known post-ordering, then in a certain sense some solving must be done post-ordering (but this “solving” would typically happen in the transaction function). In principle, it is possible to express constraints which would require matching (of other intents) to be done post-ordering, but in the current execution model this doesn’t make sense, since the transactions ordered cannot be changed once agreed upon (so the transactions would just fail).
Yes (there’s just some risk that other state changes will happen in the meantime and cause the transaction to fail).
We probably want that as an option for cases in which the likelihood of turning a match into a mismatch after execution is zero to low.
This sounds resonable to me.
After talking to @isheff I understand the following:
A transaction function can access arbitrary blob storage at any point before ordering, to perform data dependent computation.
Post ordering, it should only ever access blob storage on the replicated state machine, since any kind of unreliable storage could lead to the replicas diverging.
As long as the transaction function can only access the blob storage if the node it is being executed on, this should not be a problem.
If the transaction function needs to access state in respect to some controller, the resource logic should probably specify if that can happen at any point in time, or if it must happen post-ordering.
State dependent operations that happen before ordering could be used to increase optionality without user interaction, e.g. invalidate transactions before they reach the controller, after the solver checking some non-fulfilled requirements, or return a different transaction object/transaction function containing alternative resources.
Transaction functions of this kind could be executed to produce transaction objects which could then be matched.
Given the above, I don’t think we need to distinguish between
instead we should probably distinguish between functions which can/should be executed at pre- or post-ordering time (or both) and wether they can/should return a transaction object or another transaction function.
Mostly, resources which might not have been created when the transaction function was written, but have been created by the time the transaction function is executed (for example, the latest version of a counter resource). Also, the latest block height and timestamp information from the controller, and possibly other oracle information created in between the time when the transaction function was written and the time it was executed.