firmata/ConfigurableFirmata

create a Device ID scheme

Closed this issue · 10 comments

Now that it's possible to wrap 3rd party libraries to add support for individual devices, we need a scheme for assigning IDs for these devices. The ID for existing Firmata features is the 2nd byte of a Sysex message (the command byte). Using this byte for device IDs will not scale well as there are only about 80 unassigned values remaining and at least half of those should be reserved for future system features (SPI, SoftSerial, PulseIn/Out, etc). What is needed is a way to provide a unique ID for each device added to the protocol. The ideal solution will require as few additional bytes as possible. There may also be an advantage in having a way to group similar devices (which would help establish common interfaces for devices of the same type).

Here are 4 options I'm considering:

Option A: No change

Supports approximately 40 individual Device IDs

Only ~ 80 IDs remaining and need to reserve space for future system features
such as SPI, SoftwareSerial, PulseIn/Out, etc. At most 40 available
for individual devices to be safe.

// commands
0x10 = DHT11
0x11 = NeoPixel
0x12 = RFSwitch
etc.

// use
0  START_SYSEX   (0xF0)
1  command byte: DEVICE_ID (0x10 - limit TBD)
2  device data
...
N  END_SYSEX     (0xF7)

pros:

  • fewest bytes required
  • simplest approach (requires no changes)

cons:

  • does not scale
  • need to figure out how to map to pin configuration
  • difficult to enforce common interface for similar devices

Option B: Single device command, individual device IDs

Supports up to 128 individual Device IDs

// command
0x10 = DEVICE
// subcommand
0x00 = DHT11 (0 - 127 available for individual devices)
etc.

// use
0  START_SYSEX      (0xF0)
1  command byte:    DEVICE (0x10)
2  subcommand byte: DEVICE_ID (0 - 127)
3  device data
...
N  END_SYSEX        (0xF7)

pros:

  • simple
  • scales better than option A (but still limited to 128 unique device IDs)

cons:

  • need to figure out how to map to pin configuration
  • difficult to enforce common interface for similar devices
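
A minimal sketch of how a client might frame an Option B message. The DEVICE command (0x10) and the DHT11 ID (0x00) come from the listing above; the function name `build_device_message` is hypothetical, and the device data is assumed to already be packed into 7-bit bytes.

```python
START_SYSEX = 0xF0
END_SYSEX = 0xF7
DEVICE = 0x10  # single device command proposed in Option B
DHT11 = 0x00   # example device ID from the listing above


def build_device_message(device_id, data=b""):
    """Frame an Option B message: F0, DEVICE, device ID, data..., F7."""
    if not 0 <= device_id <= 0x7F:
        raise ValueError("device ID must fit in 7 bits (0 - 127)")
    if any(b > 0x7F for b in data):
        raise ValueError("sysex data bytes must be 7-bit values")
    return bytes([START_SYSEX, DEVICE, device_id]) + bytes(data) + bytes([END_SYSEX])
```

The only cost over Option A is the one extra subcommand byte per message.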

Option C: Device categories

Supports up to 128 individual device IDs per device category

// commands
0x10 = DEVICE_SENSOR
0x11 = DEVICE_ACTUATOR
0x12 = DEVICE_COMM (radios, etc)
etc.

// sensor subcommands
0x00 = DHT11
0x01 = some other specific sensor
0x02 = yet another specific sensor
etc.

// use
0  START_SYSEX      (0xF0)
1  command byte:    DEVICE_CATEGORY (0x10 - some limit TBD)
2  subcommand byte: DEVICE_ID (0 - 127 per category)
3  device data
...
N  END_SYSEX        (0xF7)

pros:

  • accommodates more devices than Options A and B
  • manageable number of device categories

cons:

  • categories have less semantic meaning than the types in Option D
  • need to figure out how to map to pin configuration
  • difficult to enforce common interface for similar devices

Option D: Device types

Supports up to 128 individual device IDs per type

// commands
0x10 = DEVICE_TEMP
0x11 = DEVICE_RADIO
0x12 = DEVICE_LED_STRIP

// sub command of DEVICE_TEMP
0x00 = reserved for generic type (assuming common interface across devices of this type)
0x01 = DHT11
0x02 = some other temp / humidity sensor

// use
0  START_SYSEX      (0xF0)
1  command byte:    DEVICE_TYPE (0x10 - some limit TBD)
2  subcommand byte: DEVICE_ID (0 - 127 per type)
3  device data
...
N  END_SYSEX        (0xF7)

pros:

  • types could map to pin configurations
  • could (attempt to) implement a common interface for devices within a type

cons:

  • limited number of types (no more than 40, and fewer is even better)
  • types could be difficult to maintain
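
As a rough illustration of Option D, here is how a client could split an incoming frame into its type, device ID and data. The three command values are the ones in the listing above; the function name `parse_device_message` and the dictionary layout are assumptions made for this sketch.

```python
START_SYSEX = 0xF0
END_SYSEX = 0xF7

# Command values taken from the Option D listing.
DEVICE_TYPES = {0x10: "DEVICE_TEMP", 0x11: "DEVICE_RADIO", 0x12: "DEVICE_LED_STRIP"}
GENERIC_DEVICE = 0x00  # sub command reserved for the generic interface


def parse_device_message(message):
    """Split an Option D frame into (type name, device ID, data)."""
    if len(message) < 4 or message[0] != START_SYSEX or message[-1] != END_SYSEX:
        raise ValueError("not a complete device sysex message")
    command, device_id = message[1], message[2]
    if command not in DEVICE_TYPES:
        raise ValueError("command byte is not a known device type")
    return DEVICE_TYPES[command], device_id, bytes(message[3:-1])
```

A handler could then check `device_id == GENERIC_DEVICE` to decide whether to use the shared interface for that type or a device-specific one.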

Currently I'm leaning towards Option D but am concerned about defining and maintaining categories. Option B is a good option in terms of simplicity with a big assumption that we'd never have more than 128 unique devices.

I would also lean towards D as I think this provides future extensibility. Having said that, the number of things that are not I2C compatible is relatively finite as most new stuff is shipping with that capability. Also note the work we've been doing on the backpack stuff which extends common devices towards having I2C support (neopixel, ping sensor etc).

Given that, perhaps we go for Option B with the following mods:

Reserve 0 - 31 for devices where we think it's possible to define a generic interface. 32-126 are then used for device specific implementations. Thus we may be able to implement a generic LED strip (that wraps an interface that talks WS2811, WS2812 etc...) which allows for extensibility but maintains generic interfaces to these things as well. In the case of things like controllable LEDs, chances are we'll support all the common interface forms and then keep extending that capability over time rather than wrap new libs etc. The further assumption here is that we may need to include config sub messages but

32-126 are then used for device specific implementations that are singular devices.

Reserve 127 as a catchall like 127.0.0.1 works so that people can work locally with something they aren't necessarily going to release or have integrated back into Configurable Firmata.
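
To make the proposed split of the subcommand byte concrete, a small sketch follows. The three ranges are the ones suggested above (0 - 31, 32 - 126, 127); the function name and the label strings are made up for illustration.

```python
def classify_device_id(device_id):
    """Classify an Option B subcommand byte using the ranges suggested above."""
    if not 0 <= device_id <= 127:
        raise ValueError("device ID must be a 7-bit value")
    if device_id <= 31:
        return "generic"          # generic interfaces (e.g. a generic LED strip)
    if device_id <= 126:
        return "device-specific"  # singular devices
    return "local"                # 127: local use only, never released
```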

Might also be nice to have a method by which if you want an actual device ID assigned you have to help out on some wider parts of firmata to encourage some wider help on the project. Or do what Philips & USB peeps do and get a payment for an address ;P

I'm not sure I've fully digested all the options here but I hope the following is still useful.

The kind of grouping that seems useful to me is one that groups all devices using the same protocol. That is, making all I2C devices part of the I2C category / group, with some (potentially I2C-specific) way of distinguishing between them. Similarly for OneWire devices, etc.

Beyond devices that use the same protocol, though, I'm not sure you get much value in trying to maintain semantic groups, especially if there's no actual commonality to the kind of data / messages they're going to require.

You might just continue with the current scheme and when you get close to running out of IDs, simply use the last few IDs for an "extended" set of devices, which have two-byte IDs (where the first byte is one of the "extended" IDs). Something like the UTF8 encoding strategy for unicode.
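
A sketch of what that "extended" scheme could look like on the parsing side. Which IDs become the extended prefixes is not decided anywhere in this thread; the range 0x5C - 0x5F below is an arbitrary assumption, as are the function name and the way the two bytes are combined into one number.

```python
# Hypothetical choice: the last four free command bytes become "extended" prefixes.
EXTENDED_PREFIXES = range(0x5C, 0x60)


def read_device_id(data):
    """Read a one- or two-byte device ID from the bytes following START_SYSEX.

    Returns (device_id, bytes_consumed).
    """
    if not data:
        raise ValueError("no ID byte present")
    first = data[0]
    if first not in EXTENDED_PREFIXES:
        return first, 1  # ordinary single-byte ID
    if len(data) < 2:
        raise ValueError("extended prefix without a second ID byte")
    # Combine prefix and second byte so extended IDs never collide
    # with single-byte IDs (which are all below 128).
    return (first << 7) | data[1], 2
```

As with UTF-8, existing single-byte IDs keep their meaning and only the prefix values change role.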

I like options C and D, because I like the additional level of specificity. If I was forced to choose, I would pick C because I think there is a higher likelihood of 128 different devices by category than by type (maybe I'm totally wrong about that?)

To clarify, this would not typically be used for I2C, SPI (to be supported in the near future) or other common protocols. This issue is only for devices that don't use a common protocol (such as a DHT11 temp sensor or a NeoPixel Led array, RF radios, etc). It could also be applied to certain I2C, SPI, Serial devices that don't conform to the general rules for those protocols (extra delays, timing critical, too much data to buffer through Firmata, etc).

Also bytes 0x00 - 0x0F are currently reserved for user-defined IDs (these serve as the "catch all" that @ajfisher referred to). These are IDs that are not added to Firmata client implementations and are useful for one-off applications, testing new features, etc. The Device IDs I'm talking about here (0x10 - 0x40 or so and sub commands if necessary) are IDs that will be assigned so that Firmata client libraries can implement the appropriate interfaces for them.

Some combination of option A and D could work. If a device type is obscure, just assign that device its own unique ID from the existing pool, like some obscure radio that doesn't use any standard protocol to interface with the MCU. However if the device fits into a clear category, such as the DHT11/21/22 fitting into the general "TEMP_HUMIDITY" category, then assign a top level TEMP_HUMIDITY (or perhaps just TEMP or WEATHER) type and a DHT11_21_22 sub command type. Reserve the subcommand of 0x00 for a generic implementation (0x10 = TEMP, 0x00 is sub command and is reserved) or more likely for a library bundled with an IDE, such as the libraries bundled with the Arduino IDE in the event new core libraries are added in the future.

Once the command IDs are nearly used up (and that may take a while, even with only about 80 remaining) a few remaining bytes could be used as extensions as David suggested. Command bytes 0x00 - 0x0F would still be reserved for user-defined device IDs (I could also cut this down to 0x00 - 0x08 or smaller to expand the range of available commands for dedicated IDs).

Even with the use of a generic device type, each specific device will still need its own implementation. Otherwise the generic type would need to include libraries for each specific type even if they are not needed. I want to avoid unnecessary includes whenever possible. For example, if anyone was following the LED_STRIP discussion a while back (a generic LED_STRIP type and specific NEOPIXEL subtype), that implementation had to include the neopixel libraries in the generic LedStrip class even if not needed. A better approach would be a specific NeoPixel wrapper library. It would still have the LED_STRIP device type, but a specific NEOPIXEL sub type. The advantage is that another (non-neopixel) LED_STRIP implementation would not need to include the neopixel libraries, but both implementations would share the same device type identifier and could conform as much as possible to a common interface. It would create some duplicate code across implementations but would eliminate unnecessary includes (which are expensive in a resource constrained microcontroller).

Option E: 14-bit IDs/Commands for new devices and features

"ID" here is the first 2 bytes of the SYSEX message. Traditionally only a single 7-bit COMMAND value has been used for this purpose. Existing values are in the range 0x60 - 0x7F (eg SERIAL_MESSAGE, SERVO_CONFIG, STRING_DATA, STEPPER_DATA, etc).

Supports up to 9017 IDs in the 0x0900 - 0x5000 range and an additional 1016 in the reserved 0x0000-0x0800 range.

ID / Command allocation

0x0000 - 0x0800 = reserved (could reduce this range to expand the available range)
0x0900 - 0x5000 = available
0x51XX - 0x7FXX = reserved for backwards compatibility

Example use

0  START_SYSEX         (0xF0)
1  MESSAGE_ID msb      the msb and lsb are reversed from the typical pattern in Firmata
2  MESSAGE_ID lsb
3  message body
...
N  END_SYSEX           (0xF7)

pros:

  • No need to categorize devices and features
  • Provides a sufficient number of IDs to scale well into the future

cons:

  • ranges of IDs need to be avoided to support legacy 7-bit IDs
  • need to swap LSB and MSB order in order to support legacy 7-bit IDs
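
A sketch of framing an Option E message, msb first. It reads the 0xMMLL notation in the allocation table as a pair of 7-bit bytes (first byte MM, second byte LL), which is an interpretation rather than something stated above; the function names are hypothetical. Only the legacy range (first byte 0x51 and up) is checked here, since that is the collision the msb-first order exists to avoid.

```python
START_SYSEX = 0xF0
END_SYSEX = 0xF7
LEGACY_FIRST_BYTE_MIN = 0x51  # 0x51XX - 0x7FXX is reserved for legacy 7-bit commands


def build_message(message_id, body=b""):
    """Frame an Option E message; message_id is written as 0xMMLL."""
    msb, lsb = message_id >> 8, message_id & 0xFF
    if not 0 <= msb <= 0x7F or lsb > 0x7F:
        raise ValueError("each ID byte must be a 7-bit value")
    if msb >= LEGACY_FIRST_BYTE_MIN:
        raise ValueError("first byte collides with the legacy command range")
    return bytes([START_SYSEX, msb, lsb]) + bytes(body) + bytes([END_SYSEX])


def is_legacy_command(message):
    """True when the byte after START_SYSEX falls in the legacy-reserved range."""
    return message[1] >= LEGACY_FIRST_BYTE_MIN
```

Because the msb comes first, a parser can tell a legacy single-byte command from a new 14-bit ID by looking at one byte.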

Option F: 14-bit IDs for devices

This is similar to Option B, but extended to include a wider range of IDs.

Support up to 2032 unique device IDs by allocating SYSEX subcommands in the range 0x10 - 0x1F to devices as the LSB in the 14 bit device ID. For each subcommand in this range there are 127 values where the value is the MSB.

SYSEX subcommand (2nd byte of SYSEX message) allocation per this proposal:

0x00 - 0x0F = user defined commands / IDs (no change)
0x10 - 0x1F = device ID LSB range (where each ID is 14 bits)
0x20 - 0x5F = available for new Firmata features (allocate new IDs backwards from 0x5F)
0x60 - 0x7F = mostly allocated to existing firmata features

Alternatively, start with half the range of Device IDs and reserve the remaining half

0x00 - 0x0F = user defined commands / IDs (no change)
0x10 - 0x17 = device ID LSB range (where each ID is 14 bits)
0x18 - 0x1F = reserved (in case overflow is necessary for new firmata features)
0x20 - 0x5F = available for new Firmata features (allocate new IDs backwards from 0x5F)
0x60 - 0x7F = mostly allocated to existing firmata features

Example use

0  START_SYSEX      (0xF0)
1  DEVICE_ID LSB    (0x10) // range 0x10 - 0x1F
2  DEVICE_ID MSB    (0x00) // range 0x00 - 0x7F (per LSB 0x10 - 0x1F range)
3  message body
...
N  END_SYSEX        (0xF7)

pros:

  • No need to categorize devices and features
  • Provides a sufficient number of Device IDs to scale well into the future
  • Preserves sufficient range of single byte commands for new firmata features
  • No change to existing user defined command range (since these may be in use)

cons:

  • mix of 7-bit commands and 14-bit IDs, but for the most part initial parsing only needs to use LSB (since range 0x10 - 0x1F identifies a "device" and the exact device ID can then be parsed in a second step)
  • uses more IDs than may ever be allocated to devices (could be an issue if more feature commands are needed - one solution is to allocate device IDs in order so if space gets tight for features, a part of the end of the device ID range can be reallocated to new features)
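
The two-step parsing mentioned in the first con could look roughly like this. The ranges follow the first allocation table in this option; the function names and label strings are assumptions made for the sketch.

```python
START_SYSEX = 0xF0
END_SYSEX = 0xF7


def classify_command(command):
    """Step 1: map the 2nd sysex byte to the Option F allocation table."""
    if not 0 <= command <= 0x7F:
        raise ValueError("command must be a 7-bit value")
    if command <= 0x0F:
        return "user-defined"
    if command <= 0x1F:
        return "device"
    if command <= 0x5F:
        return "new feature"
    return "existing feature"


def parse_device_id(message):
    """Step 2: return the 14-bit device ID from an Option F frame (LSB first)."""
    if len(message) < 4 or message[0] != START_SYSEX or message[-1] != END_SYSEX:
        raise ValueError("not a complete sysex message")
    lsb, msb = message[1], message[2]
    if classify_command(lsb) != "device":
        raise ValueError("2nd byte is not in the device ID range")
    if msb > 0x7F:
        raise ValueError("MSB must be a 7-bit value")
    return (msb << 7) | lsb
```

Existing parsers that only dispatch on the 2nd byte keep working; only handlers for the "device" range need to read the extra byte.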

Option G: Device driver

See the proposal in this thread: firmata/protocol#47

I finally have a solution here. Let's align with the way it is in the midi spec for sysex where the 2nd byte is the feature ID in the 1 - 127 range (in midi it's the manufacturer ID), and if the ID is set to 0, then the following two bytes are the extended ID (midi extended manufacturer ID). It looks like this and could be the model for all future Firmata features:

byte 0        byte 1       bytes 2 - N-1                      byte N
START_SYSEX   ID (1-127)   PAYLOAD                            END_SYSEX
START_SYSEX   ID (0)       EXTENDED_ID (0-16383) + PAYLOAD    END_SYSEX

Except IDs 1 - 8 or 1 - 15 would still be reserved for custom Firmata features (IDs that aren't added to the feature registry).
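
A sketch of parsing this layout. The byte order of the two-byte extended ID is not specified here; the code assumes LSB first, as is typical for 14-bit values in Firmata, and the function name is hypothetical.

```python
START_SYSEX = 0xF0
END_SYSEX = 0xF7


def parse_feature_id(message):
    """Return (feature ID, is_extended, payload) for a MIDI-style frame."""
    if len(message) < 3 or message[0] != START_SYSEX or message[-1] != END_SYSEX:
        raise ValueError("not a complete sysex message")
    if message[1] != 0:
        # Single-byte ID in the 1 - 127 range.
        return message[1], False, bytes(message[2:-1])
    if len(message) < 5:
        raise ValueError("ID 0 must be followed by a two-byte extended ID")
    # Assumed byte order: LSB first, two 7-bit bytes giving 0 - 16383.
    extended_id = message[2] | (message[3] << 7)
    return extended_id, True, bytes(message[4:-1])
```

The `is_extended` flag is returned because a single-byte ID and an extended ID can have the same numeric value.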

A feature registry would be added to firmata/protocol where authors can request that new features get an ID (either a 1 byte or 2 byte ID, where the functionality of the former is more general and the latter is more specific).

The Device Driver proposal could still exist in this model as an option for users who prefer the open/status/control/read/write/close driver syntax.