Chapter 21 — Multi-Node Message Routing (In Development)
Multi-node local message routing is a known gap in the current implementation. This chapter documents the problem, the current behavior, and the approaches under consideration.
21.1 The Multi-Node Architecture
A single physical device can host multiple virtual OpenLCB nodes. For example, a command station might host one configuration node plus several train nodes. Each virtual node has its own unique 48-bit Node ID and its own 12-bit CAN alias. From the perspective of other devices on the CAN bus, each virtual node appears as an independent node.
Figure: One physical device hosting three virtual nodes: Virtual Node A (ID 05.01.01.01.00.01, alias 0x3AB), Virtual Node B (ID 05.01.01.01.00.02, alias 0x4CD), and Virtual Node C (ID 05.01.01.01.00.03, alias 0x5EF). All three connect to a shared CAN interface (a single hardware port), which in turn connects to the CAN bus.
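As a rough illustration of the arrangement above, the following sketch keeps a table of the three example nodes and shows how a receiver can tell them apart on the wire. The `virtual_node_t` type and both function names are hypothetical and are not taken from the library. The sketch relies on the source alias occupying the low 12 bits of the 29-bit CAN identifier.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one virtual node; not the library's own structure. */
typedef struct {
    uint64_t node_id; /* 48-bit Node ID, stored in the low 48 bits */
    uint16_t alias;   /* 12-bit CAN alias, stored in the low 12 bits */
} virtual_node_t;

/* The three example nodes from this section. */
static const virtual_node_t local_nodes[] = {
    { 0x050101010001ULL, 0x3AB },
    { 0x050101010002ULL, 0x4CD },
    { 0x050101010003ULL, 0x5EF },
};

#define LOCAL_NODE_COUNT (sizeof local_nodes / sizeof local_nodes[0])

/* The source alias occupies the low 12 bits of the 29-bit CAN identifier. */
uint16_t can_source_alias(uint32_t can_id) {
    return (uint16_t)(can_id & 0x0FFFu);
}

/* Return the local node that owns the alias, or NULL if it is not ours. */
const virtual_node_t *find_local_node(uint16_t alias) {
    for (size_t i = 0; i < LOCAL_NODE_COUNT; i++) {
        if (local_nodes[i].alias == alias) {
            return &local_nodes[i];
        }
    }
    return NULL;
}
```

Because every frame carries the alias of the virtual node that sent it, a remote device has no way of knowing (and no need to know) that the three nodes share one CAN port.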
21.2 Per-Node Alias Allocation
Each virtual node goes through the full CAN login sequence independently, via round-robin processing in the CAN main state machine. The login state machine iterates through all nodes that are not yet in RUNSTATE_RUN, running one state transition per node per main loop iteration. This means all nodes log in concurrently, each generating its own alias from its own Node ID seed.
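The round-robin behavior can be sketched as follows. This is a simplified model, not the library's code: only `RUNSTATE_RUN` is named in the text, so the other state names, the `login_node_t` type, and both functions are assumptions made for illustration. Each call to `login_run_once` stands for one main loop iteration.

```c
#include <stddef.h>

/* Simplified run states. Only RUNSTATE_RUN appears in the text above;
 * the intermediate names are placeholders for this sketch. */
typedef enum {
    RUNSTATE_INIT,
    RUNSTATE_GENERATE_ALIAS,
    RUNSTATE_SEND_CHECK_IDS,
    RUNSTATE_WAIT,
    RUNSTATE_RESERVE_ALIAS,
    RUNSTATE_RUN
} run_state_t;

/* Hypothetical per-node login record. */
typedef struct {
    run_state_t state;
} login_node_t;

/* Advance a single node by exactly one state transition. */
static void login_step(login_node_t *node) {
    if (node->state != RUNSTATE_RUN) {
        node->state = (run_state_t)(node->state + 1);
    }
}

/* One main loop iteration: every node that is not yet running gets one
 * transition. Returns how many nodes are still logging in afterwards. */
size_t login_run_once(login_node_t *nodes, size_t count) {
    size_t pending = 0;
    for (size_t i = 0; i < count; i++) {
        if (nodes[i].state == RUNSTATE_RUN) {
            continue; /* already logged in, nothing to do */
        }
        login_step(&nodes[i]);
        if (nodes[i].state != RUNSTATE_RUN) {
            pending++;
        }
    }
    return pending;
}
```

In this model no node waits for another to finish: three nodes that start together also reach `RUNSTATE_RUN` together, after the same number of loop iterations a single node would need.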
21.3 The Loopback Problem
When virtual Node A sends a message that virtual Node B (on the same device) should receive, the CAN driver transmits the frame onto the bus, but the hardware does not deliver that frame back to the local receive path. Remote nodes see it, but local sibling nodes do not.
Figure (sequence diagram): Node A's event is transmitted onto the CAN bus. Remote nodes receive it; local sibling nodes do not.
This affects several protocol areas:
| Protocol | Impact |
|---|---|
| Event Transport | If Node A produces an event that Node B consumes, Node B will not see the PCER. |
| Datagrams | If Node A sends a datagram addressed to Node B, Node B will not receive it. |
| Train Control | If a local throttle node sends a speed command to a local train node, the train will not respond. |
| Verify Node ID | A global Verify Node ID from Node A should elicit a Verified Node ID reply from Node B, but Node B never sees the request. |
21.4 Current Behavior
In the current implementation, the openlcb_msg_t structure includes a state.loopback flag that is used to mark messages as sibling dispatch copies. When a global message is sent by a local node, the main state machine can set the loopback flag on a copy and push it to the incoming FIFO so that other local nodes can process it. However, this mechanism is not yet fully implemented for all message types.
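A minimal sketch of the idea, under stated assumptions: `sketch_msg_t` is a stand-in that mirrors only the `state.loopback` flag of `openlcb_msg_t`, and the FIFO, `queue_sibling_copy`, and `node_should_process` are hypothetical helpers rather than library functions. The rule that the originating node skips its own loopback copy is also an assumption of this sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in message type; only state.loopback mirrors the real structure. */
typedef struct {
    uint16_t mti;
    uint16_t source_alias;
    uint16_t dest_alias; /* 0 means a global (unaddressed) message */
    struct {
        bool loopback; /* marks a sibling dispatch copy */
    } state;
} sketch_msg_t;

#define FIFO_DEPTH 4

/* Hypothetical fixed-size incoming FIFO (ring buffer). */
typedef struct {
    sketch_msg_t slots[FIFO_DEPTH];
    size_t head;
    size_t count;
} sketch_fifo_t;

bool fifo_push(sketch_fifo_t *fifo, const sketch_msg_t *msg) {
    if (fifo->count == FIFO_DEPTH) {
        return false; /* no free slot: the copy cannot be queued */
    }
    fifo->slots[(fifo->head + fifo->count) % FIFO_DEPTH] = *msg;
    fifo->count++;
    return true;
}

bool fifo_pop(sketch_fifo_t *fifo, sketch_msg_t *out) {
    if (fifo->count == 0) {
        return false;
    }
    *out = fifo->slots[fifo->head];
    fifo->head = (fifo->head + 1) % FIFO_DEPTH;
    fifo->count--;
    return true;
}

/* After a global message has been sent, queue a flagged copy so that
 * sibling nodes can process it. The original message is left untouched. */
bool queue_sibling_copy(sketch_fifo_t *incoming, const sketch_msg_t *sent) {
    if (sent->dest_alias != 0) {
        return false; /* this sketch only handles global messages */
    }
    sketch_msg_t copy = *sent;
    copy.state.loopback = true;
    return fifo_push(incoming, &copy);
}

/* Assumed rule: the node that sent the message ignores its own copy. */
bool node_should_process(uint16_t node_alias, const sketch_msg_t *msg) {
    return !(msg->state.loopback && msg->source_alias == node_alias);
}
```

Note that `queue_sibling_copy` can fail when the FIFO is full, so a real implementation would need a policy for that case (retry, drop, or defer).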
21.5 Approaches Under Consideration
| Approach | Description | Pros | Cons |
|---|---|---|---|
| TX-side loopback | When the TX path sends a message, also push a copy (with loopback flag) into the RX FIFO for local delivery. | Simple, transparent to protocol handlers | Consumes extra buffer slots, must handle reference counting correctly |
| Application-level routing | The application explicitly routes messages between local nodes by calling the appropriate API functions directly. | No library changes needed | Burden on application developer, easy to miss edge cases |
| Dispatcher-level sibling fan-out | The main state machine dispatcher detects that the destination is a local node and delivers the message directly, bypassing the CAN bus entirely. | Most efficient, no bus traffic for local messages | Requires dispatcher to know about all local nodes' aliases, adds complexity to the dispatch path |
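To make the third row more concrete, here is a small sketch of the routing decision for an addressed message. It is an illustration only: `route_t` and `route_addressed_message` are hypothetical names, and the sketch assumes the dispatcher has an up-to-date array of local aliases available.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical routing decision for an addressed message. */
typedef enum {
    ROUTE_TO_BUS,       /* destination is not local: transmit on CAN */
    ROUTE_TO_LOCAL_NODE /* destination is a sibling: deliver directly */
} route_t;

/* Decide where an addressed message should go. On ROUTE_TO_LOCAL_NODE,
 * *local_index receives the position of the destination in local_aliases;
 * otherwise *local_index is left unchanged. */
route_t route_addressed_message(const uint16_t *local_aliases,
                                size_t local_count,
                                uint16_t dest_alias,
                                size_t *local_index) {
    for (size_t i = 0; i < local_count; i++) {
        if (local_aliases[i] == dest_alias) {
            *local_index = i;
            return ROUTE_TO_LOCAL_NODE;
        }
    }
    return ROUTE_TO_BUS;
}
```

The linear scan is cheap for a handful of nodes, but it shows where the listed cost comes from: the alias array must track every local node and be updated whenever a node's alias changes.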
21.6 Impact on Protocol Correctness
For single-node devices (the most common case), this limitation has no effect. Multi-node devices where nodes do not need to communicate with each other are also unaffected. The issue only arises when co-hosted virtual nodes need to exchange messages -- for example, a command station node sending speed commands to a co-hosted train node.
Applications that need local inter-node communication today can work around the gap by calling the application-layer API directly (e.g., invoking the train state modification functions instead of sending a message).