We've launched our new site at www.openlighting.org. This wiki will remain and be updated with more technical information.
RDM Discovery
From wiki.openlighting.org
This briefly describes the RDM discovery process and outlines various scenarios that should be considered when writing software that performs RDM Discovery. Unlike DMX, RDM is bi-directional, which means bugs in responders can cause the controller to crash or go in to an infinite loop.
This page is not a substitute for the E1.20 standard, which takes precedence over all information presented here. It simply presents situations which designers of controllers should consider when writing their software.
Contents
Process Overview
Each responder has a unique identifier (UID) which consists of a 2 byte ESTA manufacturer ID, and a 4 byte device ID. Like a MAC address, no two responders should have the same UID. The 'all devices' UID (ffff:ffffffff) is a broadcast address, to which all devices will listen but not necessarily take action or respond.
There are three RDM messages used during the discovery process, all use the DISCOVERY_COMMAND Command Class.
- DISC_UNIQUE_BRANCH, which takes two parameters, a lower and an upper UID. Any responders within this range (inclusive) must respond.
- DISC_MUTE
- DISC_UN_MUTE
There are many different ways to implement the discovery process, one (simplified) method is described below:
- Send a DISC_UNMUTE to the ALL DEVICES UID
- Send a DISC_MUTE to any previously discovered devices, if they don't respond after multiple attempts remove them from the UID list.
-  Send a DISC_UNIQUE_BRANCH with a range of (0000:00000000, ffff:ffffffff). One of three things can happen:
- No response, which generally means there are no further responders to be muted
- A single response with a valid checksum. In this case the controller should attempt to mute the responder (DISC_MUTE) and if that succeeds, add the responder to the list of UIDs.
- A collision, in which case a controller should divide the range in two (0000:7fff:ffffffff), (8000:00000000, ffff:ffffffff) and proceed from step 3.
 
If everything goes well, eventually all devices will be muted and sending a DISC_UNIQUE_BRANCH with a range of (0000:00000000, ffff:ffffffff) will result in no responses.
Failure Modes
Responders that reply outside their range
Responders may have bugs in the UID inequality code, causing them to reply to DISC_UNIQUE_BRANCH messages which don't cover their UID. One example may be a responder that replies when just the Manufacturer ID part of the UID matches. Depending on the type of bug, discovery can sometimes proceed in these cases and usually other devices are found and muted first before the bad responder is located.
The worst case is a responder that replies to every DISC_UNIQUE_BRANCH request, which usually prevents discovery of any other responders.
Responders that don't reply when within range
The opposite of the case above is where a responder fails to respond to a DISC_UNIQUE_BRANCH request for a range which it is part of. A common example of this is the off-by-one case, where a responder doesn't reply if it's at the endpoints of the range. This can cause responders to 'disappear' which happens when a range that previously caused a collision, produces no responses when split in two.
Without proper detection, this can cause controllers to loop indefinitely as they try to locate the missing responders.
Responders that don't reply to Mute
Some responders may not reply to the MUTE_DEVICE message. This case must be differentiated from the case where the MUTE_DEVICE message is corrupted or lost so the responder never replies. The pseudo-code in the E1.20 standard attempts to mute each device up to 10 times before continuing.
Responders that don't mute
Not to be confused with the scenario above, some responders may acknowledge the mute request but continue to respond to discovery requests. If there is only one responder that behaves this way controllers should be able to succeed,
Proxied Devices
The E1.20 states that a proxy shall not provide more than one DISC_UNIQUE_BRANCH response at once. This means that once the proxy has been located and muted, controllers must continue to
Example Implementation
The implementation of the RDM discovery algorithm as used in the Open Lighting Architecture can be seen here . The tests which cover the cases above are in DiscoveryAgentTest.cpp
