Protocol versioning

We need protocol version negotiation when two nodes establish a connection:
the connecting node advertises a range of supported protocol versions in the connection request,
while the responder either accepts the connection, indicating the highest mutually supported protocol version, or rejects it.

Nodes need to support multiple protocol versions because we can’t upgrade the whole network at once. To be able to reason about the correctness of the protocol, we should ensure that nodes speak the exact same protocol version when they communicate, and the implementation should make sure they cannot fall back to another version after agreeing on one.
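
The handshake described above can be sketched roughly as follows. This is a minimal illustration in Python rather than in the specs language, and it assumes that protocol versions are integers and that each node supports one contiguous, inclusive range. The names `VersionRange`, `negotiate` and `Connection` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VersionRange:
    """Inclusive range of protocol versions a node supports (assumed contiguous)."""
    lowest: int
    highest: int


def negotiate(offered: VersionRange, supported: VersionRange) -> Optional[int]:
    """Responder side: pick the highest mutually supported version, or None to reject."""
    lo = max(offered.lowest, supported.lowest)
    hi = min(offered.highest, supported.highest)
    return hi if lo <= hi else None


@dataclass(frozen=True)
class Connection:
    """An accepted connection. The dataclass is frozen, so the agreed version cannot be reassigned."""
    version: int

    def check(self, msg_version: int) -> None:
        # No fallback: any message from a different protocol version is refused.
        if msg_version != self.version:
            raise ValueError(f"expected protocol v{self.version}, got v{msg_version}")


# The connecting node offers v1..v2, the responder supports v2..v3, so they agree on v2.
agreed = negotiate(VersionRange(1, 2), VersionRange(2, 3))
conn = Connection(agreed) if agreed is not None else None
```

Pinning the version in an immutable connection object is one way to make falling back impossible by construction; an implementation could equally enforce this in its message decoder.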

Protocol versioning fits into our type system neatly; we could do it as follows:

--- Ticker Engine v1
type TickerMsg1 :=
  | Increment1
  | Count1
  ;

--- Ticker Engine v2
type TickerMsg2 :=
  | Increment2
  | Decrement2
  ;

--- Protocol v1
type Msg1 :=
  | MsgNetwork1 NetworkMsg1
  | MsgTicker1 TickerMsg1
  ;

--- Protocol v2
type Msg2 :=
  | MsgNetwork1 NetworkMsg1
  | MsgTicker2 TickerMsg2
  ;

--- Supported protocols in the current specs release
type Msg :=
  | Msgv1 Msg1
  | Msgv2 Msg2
  ;

Msg comprises all protocol versions supported by the current specs release. The protocol version gets incremented at every release.
Individual protocol versions (Msg1, Msg2) contain the versioned engine messages that belong to that specific protocol version.

Engines have versioned messages as well.
When adding, changing, or removing any engine message between specs releases,
we need to increment both the message version and the engine version.

We should describe each engine version in separate files,
e.g. ticker_messages_v1.juvix.md defines TickerMsg1,
and ticker_messages_v2.juvix.md defines TickerMsg2.

When incrementing an engine version we can thus copy the latest version to a new file,
and increment all message versions as well, even if they didn’t change.
This will make it easier to remove old protocol versions just by deleting the corresponding file.

After a while we can remove old protocol versions that are no longer supported.
For example, when adding Msg3 we could remove Msg1. In practice we should allow a reasonable time period for upgrades; a few weeks should be sufficient, similar to what Signal does.

Later we should also have an in-band software update protocol with signed releases. For this we could use a more stable protocol that doesn’t change that often, or something like HTTP, so that long-offline nodes can also upgrade when they return.

Protocol vs specs vs software vs product versioning

It’d make sense to keep the protocol and specs versions in sync, since these are closely coupled. Software and product versions should specify which specs version they use.

Thanks for the writeup. Overall I like the direction. A few specific questions:

This part I didn’t quite follow. What is a “versioned engine message”? Is there a sub “Engine1Msgv1” and “Engine1Msgv2” type, where multiple engine versions are supported by the same protocol version? Or what did you have in mind?

Is there a difference between the engine version and the version of messages that the engine processes? What’s the difference / when would we update one vs the other?

Yes, there are multiple protocol versions supported simultaneously.
In the above example, Msg1 is protocol v1 and Msg2 is protocol v2.
As for engine versions: TickerMsg1 contains messages for v1 of the Ticker engine, TickerMsg2 for v2, etc.
Msg contains all supported protocol versions, in this case v1 & v2.

What I meant is that I suggest keeping the engine version and the engine message versions in sync, i.e. at the same version number. Thus, when changing any engine message in a specs release, we’d bump both the engine version and all message versions in the engine. This makes it easier to write a new version and to remove old versions. Also, even if a message format does not change between engine versions, the message handling behaviour may change.

Makes sense.

I understand what you mean, but I’m not entirely convinced that this is the system we want. Wouldn’t it be better to increment a message version only when the processing behaviour for that message has changed, so that message versions always correspond to processing behaviour changes (and the absence of a message version change indicates the absence of behaviour changes)?

Yes, I have thought about this, and I think it makes it harder and more error-prone to maintain the specs. If we don’t increment all message numbers when incrementing the engine version:

  • while writing the new version we have to keep track of which message version numbers have changed and which have not, and make sure not to miss any place in the code that should be incremented.
  • when removing an old engine version, the messages that are still in use must be moved to the file with the next version.

However, we could make this easier by a clever use of includes:

  • we never actually remove engine versions from the repo, only from the ToC
  • in a new version we include unchanged messages from the version they were defined at,
    and increment the ones where either the message type or the behaviour changed

This avoids the above problems, I think.
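
To make the include idea concrete, here is a small sketch of the bookkeeping it implies. It is written in Python for illustration only, not in the specs language. It assumes each engine version can be summarised as a table from message name to the engine version in which that message was last defined. `next_engine_version` is a hypothetical helper, and the assumption that Increment is unchanged in Ticker v2 is made up for the example.

```python
from typing import AbstractSet, Dict

# message name -> engine version in which its current definition was introduced
MessageTable = Dict[str, int]


def next_engine_version(
    previous: MessageTable,
    new_version: int,
    changed: AbstractSet[str],
    removed: AbstractSet[str] = frozenset(),
) -> MessageTable:
    """Build the message table for a new engine version.

    Unchanged messages keep pointing at the version that defined them
    (the "include"); messages whose type or handling behaviour changed,
    and newly added messages, get the new version number.
    """
    table = {name: v for name, v in previous.items() if name not in removed}
    for name in changed:
        table[name] = new_version
    return table


# Hypothetical Ticker example: v2 drops Count, adds Decrement, leaves Increment untouched.
ticker_v1 = {"Increment": 1, "Count": 1}
ticker_v2 = next_engine_version(
    ticker_v1, new_version=2, changed={"Decrement"}, removed={"Count"}
)
print(ticker_v2)
```

Under this scheme Ticker v2 would reuse the v1 definition of Increment instead of copying it, which is why the v1 file has to stay in the repo even after protocol v1 is dropped from the ToC.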

Paging @jonathan the specs-automation-maestro: what do you think of the above?

Does it make sense to track message type updates at a granularity below the engine level? If we define only external interfaces for engines (which are used via messages), can we guarantee that the processing of one message type does not change when the behaviour of the engine changes at all, even if this message type is not directly affected?
I feel that tracking message versions independently implies that we guarantee composability at this level. We should only do that if the components processing these messages are truly independent, e.g. guaranteeing stability of interfaces internal to the engine.

Yes, it is a possibility that message behaviour changes across engine versions even if the message type does not.
A solution to this is that we bump the message version if either the message type or the message handling behaviour changes in the specs.

Yes, you mentioned that above. If we can’t guarantee independence of message processing, I think we should go with this approach.

Discussed with @isheff @mariari @jonathan @degregat :

What are versions?

  • Product versions consist of an enumerated list of high-level requirements.
  • Protocol versions are sets of compatible engine versions.
  • Implementation versions are code.

Which versions implement what?

  • Protocol versions implement product versions (if and only if they satisfy all the product requirements of a particular product version).
  • Implementation versions implement potentially multiple protocol versions (if and only if they handle all of the messages specified by a particular protocol version). Implementation versions, in general, should not support any unspecified messages. Multiple version support is systematized in the manner described above.
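
The "implements" relation for implementation versions can be read as two set conditions. This is sketched below in Python for illustration, using the message constructor names from the Msg1/Msg2 example earlier in the thread. The functions `implements` and `unspecified` and the set-of-names representation are assumptions made for this sketch.

```python
from typing import AbstractSet, Dict, Set


def implements(handled: AbstractSet[str], protocol_messages: AbstractSet[str]) -> bool:
    """An implementation implements a protocol version iff it handles all of its messages."""
    return protocol_messages <= handled


def unspecified(handled: AbstractSet[str], protocols: Dict[int, AbstractSet[str]]) -> Set[str]:
    """Messages the implementation handles that no listed protocol version specifies."""
    specified: Set[str] = set().union(*protocols.values())
    return set(handled) - specified


# Message sets per protocol version, taken from the Msg1 / Msg2 example.
protocols = {
    1: {"MsgNetwork1", "MsgTicker1"},
    2: {"MsgNetwork1", "MsgTicker2"},
}

# A hypothetical implementation that only speaks protocol v2.
handled = {"MsgNetwork1", "MsgTicker2"}

supported = [v for v, msgs in protocols.items() if implements(handled, msgs)]
extra = unspecified(handled, protocols)
```

Here `supported` lists the protocol versions the implementation may advertise during negotiation, and a non-empty `extra` would flag messages that, per the rule above, an implementation should not support.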

Current specs process implications:

  • Track a number associated with each engine
  • When the engine changes, the number gets incremented
  • When a number gets incremented, some top-level number also gets incremented
  • New instructions for authors: when you change an engine, increment the engine version number by one
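
As a rough illustration of these bookkeeping rules, the sketch below computes the numbers for the next release. It is written in Python and is not part of the specs tooling. It assumes that bumps are applied once per release rather than once per change, and that the top-level number is only bumped when at least one engine changed. `SpecsVersions` and `release` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable


@dataclass
class SpecsVersions:
    protocol: int                                   # the top-level number
    engines: Dict[str, int] = field(default_factory=dict)


def release(current: SpecsVersions, changed_engines: Iterable[str]) -> SpecsVersions:
    """Compute the version numbers for the next release.

    Each changed engine is bumped exactly once, however many changes touched it,
    and the top-level number is bumped once if any engine changed.
    """
    changed = set(changed_engines)
    engines = dict(current.engines)
    for name in changed:
        # An engine that did not exist before starts at version 1.
        engines[name] = engines.get(name, 0) + 1
    protocol = current.protocol + 1 if changed else current.protocol
    return SpecsVersions(protocol, engines)


# Two changes to Ticker in the same release still bump it only once.
v1 = SpecsVersions(protocol=1, engines={"Ticker": 1, "Network": 1})
v2 = release(v1, ["Ticker", "Ticker"])
```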

Looks good to me in general. Some clarifications I’d add:

What are specs versions? Equal to the protocol version described therein or something else?

Here I’d clarify that version numbers should be incremented once per specs release, not per PR: the increment should happen in the first PR that changes either message types or message handling behaviour (engine dynamics) in a specs release.
I assume this top-level number is the protocol version number; we should state that explicitly.
It should also be incremented once per specs release.

Per request of @mariari, specs versions are major-minor-patch, associated with a changelog and used primarily for tracking changes that implementors need to pay attention to.

This should be coordinated with @jonathan.
