Distributed Publish/Subscribe Protocol

Overview & properties

The distributed publish/subscribe protocol is responsible for few-to-many dissemination of events from a set of publishers to subscribers.

The topic ID is derived from the public key of the topic owner, who can designate publishers.

The protocol offers Byzantine Causal Broadcast and Byzantine Eventual Consistency properties within the topic.
This is enabled by the underlying Blocklace data structure,
which builds a causal message history DAG of messages signed by a publisher.
The Blocklace also prevents equivocation, i.e. a publisher publishing independent messages that do not transitively depend on each other, which would allow a publisher to send different messages to different subscribers.

The message DAG allows detecting and recovery of lost messages,
as well as synchronization after reconnecting to the network.

A pub/sub topic offers the following functionality:

  • Message send (for publishers).
  • Subscription either directly or via brokers.
  • Message reception from other subscribers or brokers.
  • Message delivery in causal order (for both publishers and subscribers).
  • Message history synchronization (after reconnection and lost message detection).

Messages

Messages sent by publishers contain:

  • Dependencies: set of message IDs (message content hashes) of previous messages:
    • Reference to the publisher’s own previous message ID.
    • Reference to all message IDs from other publishers that are leaves of the message DAG.
  • Typed message (Event, Ack, Epoch, TopicAdvertisement)
    • Events contain a payload with an application-specific message that may be encrypted.
  • Content tags: opaque strings for content-based filtering that may be obfuscated or encrypted (optional).
  • Signature by a publisher listed in the latest TopicAdvertisement.

Topic Advertisement

The configuration and advertisement of a pub/sub topic happens via a TopicAdvertisement, which contains:

  • Topic ID: corresponds to the topic owner’s public key. May be a compositional identity.
  • Publishers: set of publisher identities, along with permissions of which message types they are allowed to send.
  • Relays: node IDs that may be used for direct subscription.
  • Version: sequential version number of the advertisement, only the last known version is valid.
  • Signature: cryptographic signature by the owner’s private key. Threshold signature in case of compositional identity.

Updates to the TopicAdvertisement are always sent to the topic as a message, and always retained until a new version becomes available.
A TopicAdvertisement may also be published in a zone in the Distributed Name System.

Overlay structure and dissemination

Pub/sub topics inside a domain form a Topic-Connected Overlay (TCO).
This property ensures that only nodes that are explicitly interested participate in message dissemination in the topic. The overlay structure may be optimized by alignment of different topics by each node prioritizing connection to other nodes that they share topic subscriptions with.

Message dissemination in the topic happens in phases:

  1. Dissemination among publishers.
  2. Dissemination from publishers to direct subscribers and brokers.
  3. Dissemination from brokers to indirect subscribers.

This allows various dissemination strategies, either using direct dissemination to subscribers, or indirect dissemination via brokers, or a hybrid between the two, dependending on requirements of each topic.

Subscription

Direct subscribers specify and advertise the set of topics they subscribe to within a domain, which is used by the overlay construction protocol to create and optimize a suboverlay for each topic.
While indirect subscribers subscribe to a topic at one or more brokers.

In both cases, the subscriber may specify content-based filters for the messages delivered over the network.
These filters operate on the set of tags and publisher identity in each message. Tags may correspond to topics in a forum, threads in a chat, as well as keys or shards in a key-value store.

Epochs and message retention

Subscribers of the topic, including publishers and brokers, retain all messages received in the topic for the current epoch.
Thus history synchronization and lost message recovery is only possible within the current epoch.

Changing epoch requires a quorum of publishers to agree.
This may happen automatically when the required number of publishers acknowledge a cut in the message history DAG.

Messages may be deleted immediately after an epoch change, or after an ean xpiry time set in the topic configuration or in the message. The expiry time is only considered after an epoch change.

Message types and permissions

The following message types may be sent to the topic by publishers, given they have permission to do so:

  • Event: contains application-specific payload that may be encrypted
  • Ack: transitive acknowledgement of all referenced dependencies, contains no payload
  • Epoch: initiate or acknowledge an epoch change, when sent in response to another Epoch message. Contains the version number of the latest TopicAdvertisement.
  • TopicAdvertisement: updated topic advertisement, latest version always retained, regardless of in which epoch it was sent.

Usage

Pub/sub topics are used by engines for sending event notifications among them, as well as higher layer protocols such as the Mempool and Consensus, and later the MRDT and Multi-Chat protocols.

When used for event notifications between engines, the payload contains a message type emitted by the publisher engine that is understood by all subscribers.

In case of Mempool, Consensus and MRDT, the payload contains transaction candidates that are later executed, if valid.
In these cases tags may correspond to keys and/or shards in the KV store that allow
selective subscription.
In case of MRDTs, an epoch corresponds to a snapshot of the KV store, and allows deleting transaction history.

For Multi-Chat, tags may correspond to threads,
and using epochs and message expiration timers allow message deletion.

The Resource Machine uses pub/sub to notify about blobs in Distributed Storage that are published and about Storage Commitments that nodes issue who commit storing blobs for a certain amount of time.

Intents are also published in pub/sub topics, where users who submit them assign appropriate tags for content-based filtering, while solvers subscribe to topics and tags of interest, perform intent matching, then publish transaction candidates to the mempool.

1 Like

Thanks for the overview!

What do these properties mean, exactly? I can guess, but do you have precise definitions?

I infer from this that a topic has an updatable set of publishers, and only those publishers listed are allowed to publish to that topic – is this correct? Is only the topic owner allowed to update this set?

By “message types”, do you mean Event/Ack/Epoch/TopicAdvertisement, or something else? How are these permissions represented?

What is the form of these filters? Are these predicates over tags (and publisher identity)? How are they represented?

As in, a message may be deleted only if the epoch has changed and the expiry time has been reached? Is this expiry time measured relative to a local timestamp on the node, or relative to some timestamp broadcast by the publishers of the topic?

yes, i’ll post a precise definition
for reference, these papers talk about it:

yes, only the owner is allowed to update it, however the owner can be a compositional identity that could consist of a subset of the publishers.

yes, those types.
they are represented as a map of: Publisher ID → set of message types the publisher is allowed to send

This part haven’t worked out in detail, the rough idea is the following:

In its simplest form it is a set of tags/publishers to include/exclude, we could potentially allow prefix/postfix/regex match as well.
Matching on tags and publisher is always possible on the network level,
however matching on the content itself is only possible after decryption of the payload, which is only possible by subscribers before delivery, but not relays or brokers that do not have the decryption key.
Thus there are two kinds of distinct filters, a coarser one that can be expressed on the network level, and a finer one that happens locally before delivery (either in the pub/sub engine or the subscribing engine’s guard function)
We started talking about coming up with a DSL to express these filters, we can investigate this direction further, first identifying the requirements and use cases for such filters.

Messages for the current epoch must be retained always, after that we have the option to either count the number of epochs relative to the one the message is sent to (where 0 means it is deleted at the end of current the epoch), or use an absolute time stamp which must be after the current epoch and if it happens earlier then it is only deleted after the current epoch ended.
Using an absolute time stamp requires an additional assumption that nodes have roughly synchronized time within some delta T, especially when we want to use this to verify service commitments and punish nodes if they don’t keep their commitments.