Event Filtering: filter DSLs

ray · February 5, 2025, 2:21pm

Event filtering as a model is simple for internal messages. To recap, event filtering means that the subscriber chooses the scope of its subscription, rather than the event producer having to think about subscription scopes: a subscription is a pattern, or a sequence of patterns, against which events are matched. For a real-life example, have the following:

The main barrier against using it for non-internal events is the need for a pattern-matching DSL if you can’t use native pattern-matching capabilities. This is hardly insurmountable; GraphQL reaches towards this, for example, although not that well, since it doesn’t think in terms of pattern matching but rather has a jq-like mental model, as I see it.

But if it’s something we can mechanically translate into a filter function, e.g. a pair consisting of some sort of lens into the “transaction id” field and the value of the desired transaction ID, it’s quite doable. So the problem pretty much reduces to a lens DSL, potentially for ur-formats?
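
To make the "lens plus value" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: events are modelled as nested dictionaries, a "lens" is reduced to a read-only path of keys, and the names `view` and `compile_filter` are hypothetical, not part of any existing codebase.

```python
from typing import Any, Callable, Mapping, Sequence

# Assumption: a "lens" is simplified to a read-only path of keys into a
# nested mapping. A real lens DSL would likely be richer than this.
Path = Sequence[str]

_MISSING = object()  # sentinel meaning "the path does not exist in this event"


def view(path: Path, event: Any) -> Any:
    """Follow `path` into `event`; return _MISSING if any step is absent."""
    current = event
    for key in path:
        if not isinstance(current, Mapping) or key not in current:
            return _MISSING
        current = current[key]
    return current


def compile_filter(path: Path, expected: Any) -> Callable[[Any], bool]:
    """Translate a (lens, value) pair into a plain filter function."""
    def matches(event: Any) -> bool:
        return view(path, event) == expected
    return matches
```

A subscriber would then hand over only the pair, for example `compile_filter(["transaction", "id"], 42)`, and the receiving side would apply the resulting function to each event. Events that lack the field simply do not match.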

cdetroye · February 5, 2025, 2:42pm

I agree with your summary.

Three things I think have to be determined:

How complex do we want to make the DSL? I believe that, given that this is user-facing, we should opt for maximizing simplicity.
How expressive do we want this DSL to be? From a meeting with specs I remember that the events themselves can be arbitrary, so I assume we would want a DSL that is very generic so that it can express “everything.”
Do we want to be able to subscribe to messages whose structure we don’t know beforehand, or only to messages with a known structure? I assume the former.

On Slack I posted a first example of what I think could be a good level of both expressiveness and simplicity:

event = (node, engine, [(key, value)])
key = string
value = string | integer | event

# filter language
topic_filter = [filter]
filter = (predicate, expr, key)
predicate = equal | begins | ends | contains
expr = string | integer
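
One possible reading of the grammar above, as a Python sketch that compiles a `topic_filter` into a filter function. Several choices here are assumptions the grammar does not settle: all filters in a `topic_filter` must match (conjunction), keys are looked up only among the top-level fields of the event (nested events are not traversed), `equal` requires the same type on both sides, and the string predicates never match non-string values. The name `compile_topic_filter` is hypothetical.

```python
from typing import Callable, List, Tuple, Union

# event = (node, engine, [(key, value)])
Value = Union[str, int, "Event"]
Event = Tuple[str, str, List[Tuple[str, Value]]]
# filter = (predicate, expr, key)
Filter = Tuple[str, Union[str, int], str]


def _equal(actual, expr) -> bool:
    # Assumption: "equal" never matches across types (e.g. 10 vs "10").
    return type(actual) is type(expr) and actual == expr


def _string_predicate(test):
    # Assumption: begins/ends/contains only apply when both sides are strings.
    def check(actual, expr) -> bool:
        return isinstance(actual, str) and isinstance(expr, str) and test(actual, expr)
    return check


PREDICATES = {
    "equal": _equal,
    "begins": _string_predicate(str.startswith),
    "ends": _string_predicate(str.endswith),
    "contains": _string_predicate(lambda actual, expr: expr in actual),
}


def compile_topic_filter(topic_filter: List[Filter]) -> Callable[[Event], bool]:
    """Return a function that is True when every filter matches some field."""
    def matches(event: Event) -> bool:
        _node, _engine, fields = event
        for predicate, expr, key in topic_filter:
            check = PREDICATES[predicate]
            if not any(k == key and check(v, expr) for k, v in fields):
                return False
        return True
    return matches
```

For example, `compile_topic_filter([("begins", "abc", "tx_id")])` accepts the event `("node-1", "mempool", [("tx_id", "abc123")])`. Under these assumptions an empty `topic_filter` matches every event.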

Regarding the known/unknown messages, I think the following.

A message should have a part that’s always required, and is present for each message. This could be the “topic” of the message; it is something that can be used to do an initial filter. In the example above I used node id and engine as two values that are assumed to be always there.

Any other fields of the message should have the same structure. In the ur-messages that could be a triple {name, value, type} or something.
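
As a rough sketch of that message shape in Python: a fixed header (node id and engine) that can be checked first, followed by uniformly shaped {name, value, type} fields. The class and function names, and the type tags "string" and "integer", are assumptions for illustration only. Nested events as field values are left out to keep the sketch short.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class Field:
    name: str
    value: Union[str, int]
    type: str  # assumed tag vocabulary, e.g. "string" or "integer"


@dataclass(frozen=True)
class Message:
    node: str    # always present
    engine: str  # always present
    fields: Tuple[Field, ...] = ()


def header_matches(msg: Message,
                   node: Optional[str] = None,
                   engine: Optional[str] = None) -> bool:
    """Cheap initial filter on the always-present part; None means 'any'."""
    return (node is None or msg.node == node) and \
           (engine is None or msg.engine == engine)


def find_field(msg: Message, name: str, type_: str) -> Optional[Field]:
    """Look up a field by name and type tag without knowing the full schema."""
    for field in msg.fields:
        if field.name == name and field.type == type_:
            return field
    return None
```

Because every field has the same shape, a subscriber can filter on messages whose overall structure it does not know in advance: it only needs the header and the name and type of the fields it cares about.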

cwgoes · February 12, 2025, 11:37am

I think starting with a simple DSL that can be mechanically translated to a filter function, where the DSL only allows for expression of predicates which can be efficiently evaluated (we’ll need to nail down that definition a bit, but roughly “constant time bounded by the size of the DSL term” would do), makes sense for the time being. @cdetroye what you propose seems roughly the right shape to me. How difficult would this be to implement?
