Transport Engine Design

This is the physical transport engine design.

An Erlang system is a maximum trust domain, so messages between engines on the same Erlang system are just simple Erlang messages. We never rely on Erlang for security; this is all arbitrary code running on our CPU (even if we use Erlang distribution, this is just system administration).

So every engine has an Erlang PID (it’s actually a process group, but it has one PID that receives interengine messages). We extend this to foreign engines as well, creating a process for each foreign engine we know about, and registering it under the foreign engine’s address just as with local engines.

The difference is that these foreign engine proxy processes are not the actual engine. Instead, they do two things (sketched in code after the list):

  1. Sign, MAC, or otherwise authenticate the message with the local sending engine’s key; this involves asking the keyholder (currently the Router) to produce the signature.
  2. Forward {src, dst, signed_message} to wherever it’s been told to forward signed messages to. This could be a TCP connection process (holding a physical TCP connection), a Unix socket process, or anything. It’s just a PID.
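
A minimal sketch of such a proxy as a GenServer, assuming a hypothetical Keyholder module standing in for the Router-held signing key; all names and message shapes are illustrative, not the actual implementation:

```elixir
defmodule Transport.EngineProxy do
  @moduledoc """
  Sketch only: a local stand-in for a foreign engine. Keyholder and the
  message shapes are placeholders, not the real module names.
  """
  use GenServer

  def start_link(%{engine_addr: addr} = args) do
    GenServer.start_link(__MODULE__, args, name: {:global, addr})
  end

  @impl true
  def init(args), do: {:ok, args}

  # Ordinary Erlang messages addressed to the foreign engine land here.
  @impl true
  def handle_info(msg, %{engine_addr: dst, local_engine: src, transport_pid: tp} = state) do
    # 1. Sign with the local sending engine's key, via the keyholder (Router).
    signed_message = Keyholder.sign(src, msg)

    # 2. Forward {src, dst, signed_message} to the connection-holding process.
    send(tp, {src, dst, signed_message})
    {:noreply, state}
  end
end
```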

On the receiving side, the TCP connection process verifies signed_message, silently dropping the message if verification fails and delivering it as a normal Erlang message to dst if it passes.
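
A minimal sketch of that receive path, assuming a hypothetical Keyholder.verify/2 for verification:

```elixir
defmodule Transport.TCPConn do
  # Sketch: only the delivery path is shown; Keyholder.verify/2 is a placeholder.
  def deliver({src, dst, signed_message}) do
    case Keyholder.verify(src, signed_message) do
      {:ok, msg} ->
        case :global.whereis_name(dst) do
          pid when is_pid(pid) -> send(pid, msg)
          :undefined -> :ok   # destination engine unknown locally: drop
        end

      :error ->
        # Verification failed: drop silently, never reply.
        :ok
    end
  end
end
```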

Transport is the supervisor over these two types of processes (foreign proxies and physical connection holders). It also supervises e.g. TCP listening sockets and similar, which create physical connection holders. Upon the creation of a physical connection, we simply burst “learn this engine” messages for all our engines, signed by the Router, whose job is to know what “all our engines” means. Over the connection’s lifetime, we may send further “learn this engine” messages and also “forget this engine” messages; these create and destroy foreign engine proxies on the other side.
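
As a rough sketch of that burst on connection setup (Router.local_engines/0, Keyholder.sign/2, and the message shapes are placeholder names, not the actual API):

```elixir
defmodule Transport.Handshake do
  # Sketch: when a physical connection comes up, announce every local engine.
  def announce_all(conn_pid) do
    for engine <- Router.local_engines() do
      # "learn this engine" messages are signed by the Router's key.
      signed = Keyholder.sign(:router, {:learn_engine, engine})
      send(conn_pid, {:outbound, signed})
    end

    :ok
  end
end
```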

The practical reality of this is that a message sender never needs to do anything special. It simply sends the message it wants to send as an ordinary Erlang message; the recipient may just happen to be the above-described proxy, which delivers to the actual engine, rather than the actual engine itself. The built-in Erlang global registry may be used effectively here; it maps engine names to PIDs, and its names may be used as arguments to e.g. GenServer.call and friends.
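
For example (the engine name here is made up), the caller cannot tell whether the name resolves to a real local engine or to a foreign engine proxy:

```elixir
# :storage_engine may be a real local engine or a foreign engine proxy
# registered under the same name in the global registry.
GenServer.cast({:global, :storage_engine}, {:put, "key", "value"})

# A plain send through the same registry works identically.
send(:global.whereis_name(:storage_engine), {:put, "key", "value"})
```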

This design handles call and cast trivially, but subscription needs additional design work. Naively, each proxy could subscribe to the firehose, which would make porting the semantics trivial, but we don’t want or need that much network traffic. As such, engine proxies should have a specific filter spec as part of their state, which they subscribe to; what this actually contains is a matter for future design. Most will be able to subscribe to nothing.


I say it handles call and cast trivially but I feel call needs a bit more explanation. Call blocks the caller until it gets a return value, but it does not block the callee! It’s typical to reply at the end of handle_call, but not required. Instead, the from argument to handle_call may be handed off; anyone with this value can perform the reply. We just inject a short-lived dummy caller on the receiving side, while the foreign engine proxy spawns a short-lived call replier on the sending side. So call requires a small amount of special handling to be transport-transparent.

(It’s possible that no interengine messages are ever calls; I think currently this is the case. But it’s worth noting.)
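
If calls are ever needed across the transport, the handling could look roughly like the fragments below (sketch only; these would live in the proxy and connection processes sketched above, and all names and message shapes are illustrative):

```elixir
# Sending side, inside the foreign engine proxy (`pending` is a map of
# in-flight calls kept in the proxy's state).
def handle_call(request, from, %{pending: pending} = state) do
  ref = make_ref()
  signed = Keyholder.sign(state.local_engine, {:call, ref, request})
  send(state.transport_pid, {state.local_engine, state.engine_addr, signed})
  # No reply yet: the caller stays blocked until the remote answer arrives.
  {:noreply, %{state | pending: Map.put(pending, ref, from)}}
end

# The remote answer eventually comes back over the transport.
def handle_info({:call_reply, ref, result}, %{pending: pending} = state) do
  {from, pending} = Map.pop(pending, ref)
  GenServer.reply(from, result)
  {:noreply, %{state | pending: pending}}
end

# Receiving side: a short-lived dummy caller per incoming call, so the real
# engine sees an ordinary GenServer.call and may hand `from` off as usual.
def spawn_dummy_caller(dst, ref, request, reply_to) do
  spawn(fn ->
    result = GenServer.call({:global, dst}, request)
    send(reply_to, {:call_reply, ref, result})
  end)
end
```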

For posterity, the offline version of the diagram can be found here.

Thanks for the writeup. Just paging @tg-x @graphomath @jonathan to take a look here as we will need to synchronize on the engine model and implementation w.r.t. networking.


(The diagram is also attached as a PNG for simplicity.)

Artem, Ray, and I discussed this during HHH, and came up with a revised approach.

The gist of the design is that a proxy engine serves as the communication point with foreign nodes. If communication over TCP is interrupted, messages are queued and flushed when the TCP connection is restored.

Assumptions:

  • Any proxy engine communicates with a node via a specific port: if a node is discovered on port X, all traffic to that node goes through port X from that point on.

The architecture must satisfy the following (one possible supervision layout is sketched after the list):

  • When a foreign node is unavailable, messages are queued by the proxy process and flushed by the proxy process when the TCP connection is reestablished.
  • If the TCP server stops in an expected way, the proxy process, TCP server, and all listeners are terminated in an orderly fashion.
  • If the TCP server stops in an unexpected way, the TCP server and all listeners are restarted. The proxy engine remains alive to buffer the messages.
  • A TCP server can be used by zero or more proxy engines.
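
One possible supervision layout consistent with these constraints, as a sketch only (module names are placeholders, not the agreed design): proxies live under their own DynamicSupervisor so a TCP crash never takes them down, while the TCP server and its listeners form a separate subtree that restarts together.

```elixir
defmodule Transport.Supervisor do
  # Sketch only: one arrangement satisfying the constraints above.
  use Supervisor

  def start_link(opts), do: Supervisor.start_link(__MODULE__, opts, name: __MODULE__)

  @impl true
  def init(_opts) do
    children = [
      # Proxy engines: survive TCP failures and keep buffering messages.
      {DynamicSupervisor, name: Transport.ProxySupervisor, strategy: :one_for_one},
      # TCP server plus its listeners: their own subtree, restarted together.
      Transport.TCPSupervisor
    ]

    # :one_for_one here, so a crash in the TCP subtree never restarts proxies.
    Supervisor.init(children, strategy: :one_for_one)
  end
end
```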

Functional implications:

  • TCP servers must be addressable by a unique name, which must contain their port, for the above to work. This lets a proxy process know where to send its messages.

  • If the proxy engine is alive but its TCP server is offline, the dependency must be re-established via a notification mechanism, e.g., a process sends the proxy engine a message such as {:tcp_available, {:port, 5001}} (see the sketch below).
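
A sketch of the proxy’s buffering side of this, continuing the earlier proxy fragments ({:tcp_server, port} stands in for whatever unique, port-containing name the TCP server registers under):

```elixir
# Flush the queue when the TCP server for this proxy's port comes back.
def handle_info({:tcp_available, {:port, port}}, %{queue: queue} = state) do
  server = :global.whereis_name({:tcp_server, port})

  # Flush in the order the messages were queued.
  for {src, dst, signed} <- Enum.reverse(queue) do
    send(server, {src, dst, signed})
  end

  {:noreply, %{state | tcp_server: server, queue: []}}
end

# While the TCP server is offline, outgoing messages are queued instead of sent.
def handle_info(msg, %{tcp_server: nil, queue: queue} = state) do
  signed = Keyholder.sign(state.local_engine, msg)
  {:noreply, %{state | queue: [{state.local_engine, state.engine_addr, signed} | queue]}}
end
```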

The supervision tree will look like the design below.

To think about later:

  • How long do we queue? What if a node is disconnected for a long period of time? Do we eventually forget it, put limits on the queue, etc.?