Chapter 21 — Multi-Node Message Routing (In Development)

Feature Status: In Development

Multi-node local message routing is a known gap in the current implementation. This chapter documents the problem, the current behavior, and the approaches under consideration.

21.1 The Multi-Node Architecture

A single physical device can host multiple virtual OpenLCB nodes. For example, a command station might host one configuration node plus several train nodes. Each virtual node has its own unique 48-bit Node ID and its own 12-bit CAN alias. From the perspective of other devices on the CAN bus, each virtual node appears as an independent node.

flowchart TB
    subgraph Device["Single Physical Device"]
        N1["Virtual Node A<br/>ID: 05.01.01.01.00.01<br/>Alias: 0x3AB"]
        N2["Virtual Node B<br/>ID: 05.01.01.01.00.02<br/>Alias: 0x4CD"]
        N3["Virtual Node C<br/>ID: 05.01.01.01.00.03<br/>Alias: 0x5EF"]
        SM["Shared CAN Interface<br/>(single hardware port)"]
    end
    BUS["CAN Bus"]
    N1 --> SM
    N2 --> SM
    N3 --> SM
    SM <--> BUS
    style Device fill:#fff3e0,stroke:#e65100
    style BUS fill:#e3f2fd,stroke:#1565c0
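As a rough illustration of the arrangement in the diagram, the sketch below keeps one record per virtual node and looks a node up by its alias. The type `virtual_node_t`, the table `local_nodes`, and the function `find_local_node_by_alias` are hypothetical names chosen for this example; they are not taken from the library.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one virtual node; not the library's actual type. */
typedef struct {
    uint64_t node_id; /* 48-bit Node ID held in the low 48 bits */
    uint16_t alias;   /* 12-bit CAN alias */
} virtual_node_t;

/* The three example nodes from the diagram. */
static const virtual_node_t local_nodes[] = {
    { 0x050101010001ULL, 0x3AB }, /* Virtual Node A */
    { 0x050101010002ULL, 0x4CD }, /* Virtual Node B */
    { 0x050101010003ULL, 0x5EF }, /* Virtual Node C */
};

#define LOCAL_NODE_COUNT (sizeof local_nodes / sizeof local_nodes[0])

/* Return the local node that owns `alias`, or NULL if no local node does.
 * Only the low 12 bits of `alias` are compared. */
const virtual_node_t *find_local_node_by_alias(uint16_t alias)
{
    for (size_t i = 0; i < LOCAL_NODE_COUNT; i++) {
        if (local_nodes[i].alias == (alias & 0x0FFF)) {
            return &local_nodes[i];
        }
    }
    return NULL;
}
```

A lookup of this kind is what any local routing scheme would need in order to tell a sibling node apart from a remote one.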

21.2 Per-Node Alias Allocation

Each virtual node goes through the full CAN login sequence independently, via round-robin processing in the CAN main state machine. The login state machine iterates through all nodes that are not yet in RUNSTATE_RUN, running one state transition per node per main loop iteration. This means all nodes log in concurrently, each generating its own alias from its own Node ID seed.
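The round-robin behavior can be sketched as follows. This is a minimal model, not the library's login code: only `RUNSTATE_RUN` is named in this chapter, so the other states, the `node_t` type, and the functions `login_step` and `login_run_once` are placeholders invented for the example.

```c
#include <stddef.h>

/* Simplified run states. Only RUNSTATE_RUN appears in the text; the other
 * values are placeholders standing in for the real login steps. */
typedef enum {
    RUNSTATE_INIT,
    RUNSTATE_CHECK_ALIAS,
    RUNSTATE_RESERVE_ALIAS,
    RUNSTATE_RUN
} run_state_t;

typedef struct {
    run_state_t run_state;
} node_t;

/* Advance one node by exactly one state transition. */
static void login_step(node_t *node)
{
    if (node->run_state != RUNSTATE_RUN) {
        node->run_state = (run_state_t)(node->run_state + 1);
    }
}

/* One main-loop iteration: every node still logging in takes one step.
 * Returns how many nodes have not yet reached RUNSTATE_RUN. */
size_t login_run_once(node_t *nodes, size_t count)
{
    size_t pending = 0;
    for (size_t i = 0; i < count; i++) {
        if (nodes[i].run_state == RUNSTATE_RUN) {
            continue; /* already logged in, nothing to do */
        }
        login_step(&nodes[i]);
        if (nodes[i].run_state != RUNSTATE_RUN) {
            pending++;
        }
    }
    return pending;
}
```

Because each call advances every pending node by one step, nodes that start together finish together, and no node blocks the others while it waits.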

21.3 The Loopback Problem

When virtual Node A sends a message that virtual Node B (on the same device) should receive, the CAN driver transmits the frame onto the bus, but the hardware does not deliver that frame back to the local receive path. Remote nodes see it, but local sibling nodes do not.

sequenceDiagram
    participant A as Virtual Node A
    participant TX as CAN TX
    participant BUS as CAN Bus
    participant RX as CAN RX
    participant B as Virtual Node B
    A->>TX: Send event PCER
    TX->>BUS: CAN frame transmitted
    BUS->>RX: Frame NOT looped back
    Note over B: Node B never sees<br/>Node A's event
    BUS-->>BUS: Remote nodes receive it

This affects several protocol areas:

| Protocol | Impact |
| --- | --- |
| Event Transport | If Node A produces an event that Node B consumes, Node B will not see the PCER. |
| Datagrams | If Node A sends a datagram addressed to Node B, Node B will not receive it. |
| Train Control | If a local throttle node sends a speed command to a local train node, the train will not respond. |
| Verify Node ID | Global Verify Node ID from Node A should elicit a response from Node B, but it will not. |

21.4 Current Behavior

In the current implementation, the openlcb_msg_t structure includes a state.loopback flag that is used to mark messages as sibling dispatch copies. When a global message is sent by a local node, the main state machine can set the loopback flag on a copy and push it to the incoming FIFO so that other local nodes can process it. However, this mechanism is not yet fully implemented for all message types.
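The mechanism described above might look roughly like the sketch below. It is an assumption-laden model: `msg_t` is a cut-down stand-in for `openlcb_msg_t` (only the `state.loopback` field comes from the text), and `fifo_t`, `fifo_push`, `fifo_pop`, `queue_sibling_copy`, and `node_should_process` are hypothetical names. The rule that a node skips loopback copies of its own messages is also an assumption of this example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Cut-down stand-in for openlcb_msg_t; only state.loopback is from the text. */
typedef struct {
    struct {
        bool loopback; /* true when this is a sibling dispatch copy */
    } state;
    uint16_t mti;
    uint16_t source_alias;
} msg_t;

#define FIFO_DEPTH 8

/* Hypothetical fixed-size ring buffer standing in for the incoming FIFO. */
typedef struct {
    msg_t  slots[FIFO_DEPTH];
    size_t head;
    size_t count;
} fifo_t;

bool fifo_push(fifo_t *f, const msg_t *m)
{
    if (f->count == FIFO_DEPTH) {
        return false; /* full: the copy is dropped */
    }
    f->slots[(f->head + f->count) % FIFO_DEPTH] = *m;
    f->count++;
    return true;
}

bool fifo_pop(fifo_t *f, msg_t *out)
{
    if (f->count == 0) {
        return false;
    }
    *out = f->slots[f->head];
    f->head = (f->head + 1) % FIFO_DEPTH;
    f->count--;
    return true;
}

/* After a global message has been sent, queue a flagged copy for siblings.
 * The original message is left untouched. */
bool queue_sibling_copy(fifo_t *incoming, const msg_t *sent)
{
    msg_t copy = *sent;
    copy.state.loopback = true;
    return fifo_push(incoming, &copy);
}

/* Assumed rule: a node ignores loopback copies of messages it sent itself. */
bool node_should_process(uint16_t node_alias, const msg_t *m)
{
    return !(m->state.loopback && m->source_alias == node_alias);
}
```

Note that every queued copy occupies a FIFO slot, which is the buffer cost that any loopback-based scheme has to budget for.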

21.5 Approaches Under Consideration

| Approach | Description | Pros | Cons |
| --- | --- | --- | --- |
| TX-side loopback | When the TX path sends a message, also push a copy (with loopback flag) into the RX FIFO for local delivery. | Simple, transparent to protocol handlers | Consumes extra buffer slots, must handle reference counting correctly |
| Application-level routing | The application explicitly routes messages between local nodes by calling the appropriate API functions directly. | No library changes needed | Burden on application developer, easy to miss edge cases |
| Dispatcher-level sibling fan-out | The main state machine dispatcher detects that the destination is a local node and delivers the message directly, bypassing the CAN bus entirely. | Most efficient, no bus traffic for local messages | Requires dispatcher to know about all local nodes' aliases, adds complexity to the dispatch path |
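To make the dispatcher-level option more concrete, here is one possible shape for the routing decision. Nothing in it reflects a committed design: `msg_t`, `local_node_t`, `route_t`, and `dispatch` are hypothetical, and using a destination alias of 0 to mean "global" is a convention of this sketch only.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical message header; dest_alias == 0 means global in this sketch. */
typedef struct {
    uint16_t source_alias;
    uint16_t dest_alias;
    uint16_t mti;
} msg_t;

/* Hypothetical local node record; `delivered` counts messages handed over. */
typedef struct {
    uint16_t alias;
    unsigned delivered;
} local_node_t;

typedef enum {
    ROUTE_LOCAL_ONLY,   /* addressed to a sibling: no bus traffic needed */
    ROUTE_BUS_ONLY,     /* no local recipient: transmit as usual */
    ROUTE_BUS_AND_LOCAL /* global message: transmit and fan out to siblings */
} route_t;

/* Deliver `m` to matching local nodes and report where else it must go. */
route_t dispatch(local_node_t *nodes, size_t count, const msg_t *m)
{
    bool addressed = (m->dest_alias != 0);
    bool delivered_locally = false;

    for (size_t i = 0; i < count; i++) {
        if (nodes[i].alias == m->source_alias) {
            continue; /* never hand a message back to its sender */
        }
        if (!addressed || nodes[i].alias == m->dest_alias) {
            nodes[i].delivered++;
            delivered_locally = true;
        }
    }

    if (addressed) {
        return delivered_locally ? ROUTE_LOCAL_ONLY : ROUTE_BUS_ONLY;
    }
    return delivered_locally ? ROUTE_BUS_AND_LOCAL : ROUTE_BUS_ONLY;
}
```

The loop over `nodes` is where the cost listed in the table shows up: the dispatcher has to know every local alias and consult that list for each outgoing message.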

21.6 Impact on Protocol Correctness

For single-node devices (the most common case), this limitation has no effect. Multi-node devices where nodes do not need to communicate with each other are also unaffected. The issue only arises when co-hosted virtual nodes need to exchange messages -- for example, a command station node sending speed commands to a co-hosted train node.

Applications that require local inter-node communication today can work around this by calling the application layer API directly instead of going through the message path (e.g., calling the train state modification functions for a co-hosted train node rather than sending it a Train Control message).
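A workaround of this kind could be structured as in the sketch below. The names `train_state_t`, `train_set_speed`, and `throttle_command_speed` are invented for illustration and do not correspond to actual library functions; the real train state API should be substituted.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical train state; the library's real structure will differ. */
typedef struct {
    float speed;          /* signed speed, sign encodes direction here */
    bool  emergency_stop;
} train_state_t;

/* Hypothetical application-layer setter for a train node's speed. */
void train_set_speed(train_state_t *train, float speed)
{
    train->speed = speed;
    train->emergency_stop = false;
}

/* Throttle-side helper: if the target train is co-hosted, update its state
 * directly. Returns false when the train is not local, in which case the
 * caller still has to send the command over the bus. */
bool throttle_command_speed(train_state_t *local_train, float speed)
{
    if (local_train == NULL) {
        return false;
    }
    train_set_speed(local_train, speed);
    return true;
}
```

The application, not the library, decides here whether a train is local, which is the burden this workaround places on the developer.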
