Decompositional object systems

If it looks like an object and quacks like an object, it’s an object - @mariari (apocryphal).

Decompositional declarative object systems

The objective of this post is to construct an “object system”, i.e. a system where complex programs can be constructed from “objects” with method interfaces, that permits operational decomposition: the implementation of a complex program involving higher-level objects solely in terms of a set of stateful primitive objects and message constraints, where the higher-level objects need not be reified. This system will be constructed in a declarative style, but this is just an aesthetic choice – it could also be programmed imperatively in practice.

Why aim for operational decompositionality? The short answer is “distribution”: decomposing a complex program into little primitive objects allows for those primitive objects to be (in some cases) distributed, while the whole system still behaves as if it were one logical object. Decomposition into primitive objects is also conducive to distribution of knowledge, as different parties can make proofs as part of a whole transaction the full details of which no single party knows.

Definitions

Assume a primitive type Type, instances of which can be understood as specific sets.

A method type can be defined as a pair of types (Input, Output) where the first type is the type of the method input and the second type is the type of the method output.

An object type O can be defined as (M, T) where

  • M is a method type.
  • T is a trace invariant, i.e. a predicate of type List\ M \rightarrow Boolean.

This is a behavioral definition (“quacks like an object”), in that it defines the observable behavior which something must have in order to qualify as a particular object type.

Namely, for O := (M := (Input, Output), T):

  1. It must be possible to send messages of type Input, and get back responses of type Output.
  2. For all possible traces (sequences of method calls), the trace invariant must hold.

Given an object type definition, anything that satisfies these two conditions can be said to be an object of that type, and anything which does not cannot.

Here, objects are assumed to have only one method. The typical convenience of multiple named methods can be compiled to a single method by using the name (or index) of the individual methods as the first part of the input to the single method. If the method is broken down as such, there will likely be some named methods which do not change the future possible traces – these we would call “reads”.
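
As a rough illustration (all names here are mine, not fixed by the post), the definitions above might be transcribed into Haskell as follows, including the compilation of named methods into a single method for a hypothetical counter:

```haskell
-- Illustrative Haskell transcription of the definitions above
-- (type and constructor names are mine, not part of the post).

-- A method call pairs an input with the output it produced.
type MethodCall input output = (input, output)

-- A trace is a list of method calls in the order they occurred.
type Trace input output = [MethodCall input output]

-- An object type is characterised behaviourally by a predicate on traces.
type TraceInvariant input output = Trace input output -> Bool

-- Several named methods compiled into one method: the method name becomes
-- a constructor tag on the single input type (hypothetical counter).
data CounterIn  = Add | Get          -- "Add" and "Get" as one sum type
data CounterOut = Ack | Value Int    -- Add acknowledges, Get returns a count
-- Here "Get" would be a read: it does not change which future traces are possible.
```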

Composition

Given a set of objects O_0, O_1, ... O_n with method types M_0, M_1, ... M_n and trace invariants T_0, T_1, ... T_n, one can define a new object O as:

  1. A method type M := (Input, Output), and
  2. An internal constraint IC, of type (Input, Output) \rightarrow {Messages}_{0} \rightarrow {Messages}_{1} \rightarrow ... \rightarrow {Messages}_{n} \rightarrow Boolean, where {Messages}_{i} = List\ M_i and M_i is the method type of sub-object O_i. Informally, the internal constraint describes the relationship between method calls to this object and the method calls to the sub-objects, which are assumed to happen atomically (a single transaction, in database parlance).

Given the trace invariants of the sub-objects, the internal constraint, and the method type, we can then derive the trace invariant T of this object by logical conjunction and a \forall quantification over the sub-object traces, i.e.

T\ (trace) := \\
\quad \forall\ {Msgs}_{0}\ ...\ {Msgs}_{n}\ . \\
\qquad let\ step\ ((input, output) : trace)\ ((msgs_0 : rest_0)\ ...\ (msgs_n : rest_n)) := \\
\qquad\quad IC(input, output, msgs_0\ ...\ msgs_n)\ \land\ step(trace, (rest_0\ ...\ rest_n)) \\
\qquad in\ step(trace, ({Msgs}_0\ ...\ {Msgs}_n))\ \land\ T_0({Msgs}_0)\ \land\ ...\ \land\ T_n({Msgs}_n)

Intuitively, what this says is roughly: if we take some objects which behave as described, and interact with them in a way which satisfies the internal constraint, then we will end up with an object which behaves in a way consistent with the objects’ trace invariants and the internal constraint.
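
As a hedged sketch of one plausible reading of this formula (names are mine; for simplicity all sub-objects share a single message type, and each {Msgs}_i is taken to be the list of per-call message batches of sub-object i, whose concatenation must satisfy T_i):

```haskell
import Data.List (transpose)

-- One possible reading of the derived trace invariant, checked against a
-- single candidate assignment of sub-object traces; the \forall in the
-- formula corresponds to requiring this for every such assignment.
type Trace i o = [(i, o)]

-- IC: relates one top-level call to the batch of messages that each
-- sub-object receives during that call (one inner list per sub-object).
type InternalConstraint i o m = (i, o) -> [[m]] -> Bool

derivedInvariant
  :: InternalConstraint i o m   -- IC
  -> [[m] -> Bool]              -- T_0 .. T_n, the sub-object trace invariants
  -> Trace i o                  -- top-level trace
  -> [[[m]]]                    -- Msgs_0 .. Msgs_n: per-call batches, one outer list per sub-object
  -> Bool
derivedInvariant ic ts trace msgs =
  let perCall = transpose msgs  -- regroup batches: one entry per top-level call
      icOk    = all ((== length trace) . length) msgs
                && and (zipWith ic trace perCall)
      subsOk  = and (zipWith ($) ts (map concat msgs))
  in icOk && subsOk
```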

We might want to require that outputs be computable. In that case the internal constraint must be represented as a program that, given the input, computes a series of message sends until the output is determined (and then returns it); the \forall is then unnecessary, since the sub-object messages can be computed from the inputs.

Decomposition

Given an object composed of a set of sub-objects and an internal constraint, we can decompose this object by dividing the internal constraint up amongst the sub-objects. Assume that the internal constraint is expressed as conjunctions/disjunctions of sets of clauses which reference the input, output, and sub-object messages. Assign to each sub-object the clauses which reference it (duplicating clauses where multiple sub-objects are involved – these can be operationally optimized out later). Proceed recursively until non-decomposable objects are reached, splitting carried clauses by which sub-objects are involved. Finally, join all clauses conjunctively.
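
A small sketch of the clause-assignment step (the Clause representation and names are hypothetical; the post does not fix one):

```haskell
import           Data.Set (Set)
import qualified Data.Set as Set

-- Clauses are kept opaque; all we track is which sub-objects each clause
-- references (by index).
data Clause c = Clause
  { clauseBody :: c        -- the constraint itself, left abstract here
  , mentions   :: Set Int  -- indices of the sub-objects it references
  }

-- Assign to each of the n sub-objects the clauses that mention it.
-- Clauses mentioning several sub-objects are duplicated, as described
-- above; the duplicates can be optimised away operationally later.
assignClauses :: Int -> [Clause c] -> [[Clause c]]
assignClauses n clauses =
  [ [ cl | cl <- clauses, i `Set.member` mentions cl ] | i <- [0 .. n - 1] ]
```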

Examples

  • A counter object would have Add and Get methods, and a trace invariant specifying that the result of any Get is the number of Add inputs preceding it in the trace (see the sketch after this list).
  • Two counters could be combined by specifying AddTwo, which calls both their Add methods, and GetTwo, which calls both their Get methods and adds the results (enforced by the internal constraint).
  • Note that we can prove that this combined counter is eventually consistent: two traces whose Add messages arrive in different orders will yield the same Get results, as long as they eventually receive the same Add messages.
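
A hedged sketch of the counter example and the internal constraint of the combined counter (concrete types and names are mine; Add takes no argument here, matching “the number of Add inputs”):

```haskell
-- Counter method type (as in the earlier sketch).
data CounterIn  = Add | Get          deriving (Eq, Show)
data CounterOut = Ack | Value Int    deriving (Eq, Show)

type CounterTrace = [(CounterIn, CounterOut)]

-- Trace invariant: every Get reports the number of Adds preceding it.
counterInvariant :: CounterTrace -> Bool
counterInvariant = go 0
  where
    go _    []                      = True
    go adds ((Add, Ack)     : rest) = go (adds + 1) rest
    go adds ((Get, Value v) : rest) = v == adds && go adds rest
    go _    _                       = False  -- mismatched input/output pairing

-- Combined counter: AddTwo sends Add to both sub-counters, GetTwo sends
-- Get to both and returns the sum of their results.
data TwoIn  = AddTwo | GetTwo        deriving (Eq, Show)
data TwoOut = AckTwo | Sum Int       deriving (Eq, Show)

-- Internal constraint relating one top-level call to the (single-message)
-- batches sent to the two sub-counters during that call.
internalConstraint :: (TwoIn, TwoOut)
                   -> [(CounterIn, CounterOut)]
                   -> [(CounterIn, CounterOut)]
                   -> Bool
internalConstraint (AddTwo, AckTwo) [(Add, Ack)]     [(Add, Ack)]     = True
internalConstraint (GetTwo, Sum s)  [(Get, Value a)] [(Get, Value b)] = s == a + b
internalConstraint _                _                _                = False
```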

Extensions

Asynchronous calls: implicitly, we’ve assumed a transactional model here where each method call is an atomic, synchronous transaction. This system could be extended to asynchronous sub-object calls, where (a) asynchronous calls do not return any data, and (b) asynchronous calls may be arbitrarily re-ordered.
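
One rough way to state requirement (b) (brute force, illustrative only; names are mine): a trace invariant over asynchronous calls, whose outputs are fixed to the unit type per (a), should give the same verdict on every re-ordering of a trace.

```haskell
import Data.List (permutations)

-- An asynchronous call carries an input and returns no data (output is ()).
type AsyncCall i = (i, ())

-- Check, for one sample trace, that an invariant does not distinguish
-- re-orderings of the asynchronous calls (brute force, for illustration).
orderInsensitiveOn :: ([AsyncCall i] -> Bool) -> [AsyncCall i] -> Bool
orderInsensitiveOn inv trace =
  all ((== inv trace) . inv) (permutations trace)
```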


I think hierarchical decomposition is a nice idea to hide complexity, in particular in graphical representations, but also for the sake of being able to have several alternative ways to implement the “internals” of a (top-level) object. I think it is worthwhile to spell out the nice properties of this decomposition on the underlying message sequence charts in terms of refinement, as covered in §5 of this thesis, titled

Distributed System Design with Message Sequence Charts.

Roughly, each incoming message is replaced by a whole diagram of messages, on the level of the graphical representation.

Each step [1] can be represented as an “activation box” in a diagram of messages, with incoming and outgoing arrows, looking something like the following picture:

Now, we may open up the activation box, to find that there are additional messages sent “inside” object 1, and that the messages m2 and m3 in fact have different “actual” senders, which send on behalf of the surrounding (top-level) object.

If we keep expanding boxes, we eventually reach a most refined level. On the level of message sequence charts, we have additional scopes, which describe which messages are to be considered direct causes of each message (relative to an object).

I hope that this is a proper interpretation of the decomposition that you have described.


The interesting point now concerns the transactional structure of each (top-level) activation box. This is usually the hard part if we think of the “usual” actor model, because one would have to add a final join/sync of sub-objects so that we are sure that either all or none of the messages will be sent. For the above example, a third sub-object, object 1.3, would have to sync the objects 1.2 and 1.1 and send the messages m2 and m3.[2]

However, if all (top-level) messages are (non-ephemeral) resources, then we can simply wrap the (top-level) step into a transaction, with all the internal messaging represented by (ephemeral) resources. Thus, I think, the decomposition principle is good to go essentially as described. The interesting questions arise, of course, when one does not have the “enclosing” controller that makes sure we can fabricate a single transaction bundling together all the steps of the sub-objects into a single step of a top-level object (cc @isheff).[3] But that would bring us into the deep waters of distributed systems, which we care about, but want to abstract away for application development.[4]


  1. Reminding me very much of isolated turns in actor model terminology: “a turn is taking a waiting message, interpreting it, and deciding on both a state update for the actor and a collection of actions to take in response, perhaps sending messages or creating new actors.” ↩︎

  2. This paper discusses how one may try to solve this and similar issues. ↩︎

  3. Finally, we have the cross-controller case, which however is also relevant to the object system per se. ↩︎

  4. Drawings can be found here. ↩︎
