emfcamp/badge-2024-software

Compact single-file EEPROM format

gsuberland opened this issue · 2 comments

Given the issues described in #129, it looks like the M24C16 EEPROMs will have little space for user code, making it difficult for developers to write feature-rich firmware for existing hexpansion designs that use this EEPROM. Space was already rather tight, since in practice the LittleFS filesystem isn't all that "little" on an EEPROM of this size, but with careful optimisation I was able to get the header and filesystem overhead down to just over 25%, leaving about 1500 of the 2048 bytes for code. If any of the EEPROM pages end up being inaccessible due to addressing conflicts, this becomes more problematic. While the option of having a separate app, with the EEPROM used only for identification, does somewhat resolve this, the UX isn't as good and it would be nice to retain the pure-EEPROM approach.

One way to reduce this impact would be to offer a "compact" format EEPROM structure for hexpansions that only need a single source file and no additional resources. Rather than the header being followed by a LittleFS filesystem, the header would be immediately followed by the raw contents of the app.py file, in whichever string encoding is normally expected in the filesystem version (presumably either UTF-8 or ASCII).

Naturally the badge firmware would need to be able to detect which of the two data formats is in use. There are two ways I can see of implementing this:

  1. Update the header with an additional flags bitfield added, using a new manifest version.
  2. Keep the existing header, and use one of the existing fields as a signal, e.g. setting the filesystem offset to zero or 0xFFFF.

The first option has the benefit of also allowing any new features that have been thought of since the badge release to be added at the same time, and also provides the opportunity for easy future extensibility through the flags field without needing to actually bump the header version again and change the structure. It also means that older badge firmware versions won't try to load the EEPROM data since the manifest version will be wrong.

The second option has the benefit of being quicker to implement, and not having to handle multiple header versions, but does kinda pollute the meaning of one of the existing fields, and doesn't offer as many benefits overall. Header versioning is probably inevitable, too.
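A rough sketch of how the detection could look with both options side by side. Everything here is hypothetical: the "2025" version string, the flag bit values, and the function name are made up for illustration and don't come from the badge firmware. Only the "2024" manifest version and the 0 / 0xFFFF sentinel idea are from this thread.

```python
# Hypothetical values throughout: none of these come from the real badge firmware.
MANIFEST_V1 = "2024"   # current manifest version, per this thread
MANIFEST_V2 = "2025"   # made-up identifier for a new header version
FLAG_COMPACT = 0x01    # made-up bit: payload is a raw app.py, no LittleFS
FLAG_DEFLATE = 0x02    # made-up bit: payload is deflate-compressed
COMPACT_SENTINELS = (0x0000, 0xFFFF)  # option 2: reserved filesystem offsets


def detect_format(manifest_version, fs_offset, flags=0):
    """Return 'littlefs', 'compact', or 'compact+deflate' for a parsed header."""
    if manifest_version == MANIFEST_V2:
        # Option 1: the new header version carries an explicit flags bitfield.
        if not flags & FLAG_COMPACT:
            return "littlefs"
        return "compact+deflate" if flags & FLAG_DEFLATE else "compact"
    if manifest_version == MANIFEST_V1:
        # Option 2: the old header overloads the filesystem offset as a signal.
        return "compact" if fs_offset in COMPACT_SENTINELS else "littlefs"
    # Unknown versions are rejected rather than guessed at.
    raise ValueError("unsupported manifest version: %r" % (manifest_version,))
```

In practice only one of the two branches would exist; they're shown together just to compare how much each option asks of the loader.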

If we wanted to get fancy, we could also have a flag in the bitfield for compression (e.g. deflate) in raw mode. Based on my tests, we should expect to be able to fit approximately 6KiB of minified Python into a 2KiB EEPROM once compressed, including an expanded header. For reference, that's approximately the size of the BQ25895 module from this repo, which is 19KB unminified. This is a pretty good value proposition in my view.
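For anyone wanting to try the numbers on their own code, this is roughly how the packing side could be done on a desktop with CPython's zlib. The helper names and the 1 KiB window size are my own assumptions, not anything the badge firmware defines. The badge would need a matching raw-deflate decompressor on the MicroPython side, and I haven't checked which module the badge build ships.

```python
import zlib

# Assumption: a small window keeps decompression RAM low on the badge.
# -10 means "raw deflate stream, 2**10 byte window" in zlib's wbits convention.
WBITS = -10


def pack_source(source):
    """Compress Python source text into a raw deflate payload."""
    compressor = zlib.compressobj(level=9, wbits=WBITS)
    return compressor.compress(source.encode("utf-8")) + compressor.flush()


def unpack_source(payload):
    """Inverse of pack_source: raw deflate payload back to source text."""
    return zlib.decompress(payload, wbits=WBITS).decode("utf-8")


# Toy input: repetitive code compresses well, real apps will vary.
sample = "def handler_%d(event):\n\treturn event.value + %d\n"
source = "".join(sample % (i, i) for i in range(40))
packed = pack_source(source)
print(len(source), "bytes of source ->", len(packed), "bytes packed")
```

The ratio you get depends heavily on the code, so it's worth measuring against the real app rather than trusting a toy sample like this one.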

I'd be happy to collaborate on this idea if you think it has legs.

I like this idea. I'd thought about it when working on the EEPROM for the Flopagon but I managed to fit what I wanted into the littlefs.

The header format is already versioned, the manifest version (currently "2024") is the first variable field and changing it should change the way the rest of the header is interpreted.

Compressing the file is interesting. I'm not sure what you mean by "minimised" python. I have been compiling the python to *.mpy files which makes them smaller.

The real issue is that to run an app you have to be able to import it. If we have a filesystem we can mount it and import from that folder. If we just have a bytes object I don't know what we do with it. I don't want to copy the file to flash on the badge.

> The header format is already versioned, the manifest version (currently "2024") is the first variable field and changing it should change the way the rest of the header is interpreted.

The main thing I was referring to regarding versioning was the actual version management code. As far as I'm aware right now the code basically just checks "is this string '2024'?" in a few places and that's it, so once a new version of the header is added there'll need to be code to handle it, and that handling will need to be robust. But I agree that it's worth it to use this approach.

> Compressing the file is interesting. I'm not sure what you mean by "minimised" python. I have been compiling the python to *.mpy files which makes them smaller.

Minified code is just code that's been rewritten (often automatically) to have the same function but with a smaller size by doing things like renaming variables to single characters, aliasing class names to shorter ones, swapping space indentation for tabs, stripping comments, reducing newlines, and things of that nature. An example would be this python minifier web tool. Often this process reduces the size of the code dramatically, with a compression ratio comparable to that of actual data compression algorithms like Deflate, allowing for significant compounded space savings.
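To make that concrete, here's a toy example minified by hand (not the output of any particular tool). The function and its contents are invented for illustration. Both versions behave identically, but the second drops the comment, renames the locals, and uses tabs.

```python
READABLE = '''
def average_temperature(readings):
    # Sum every reading, then divide by the count.
    total = 0
    for reading in readings:
        total = total + reading
    return total / len(readings)
'''

# Same logic: comment stripped, locals renamed, tab indentation, loop body inlined.
MINIFIED = "def average_temperature(a):\n\tb=0\n\tfor c in a:b=b+c\n\treturn b/len(a)\n"

# Load each version into its own namespace and check they agree.
readable_ns, minified_ns = {}, {}
exec(READABLE, readable_ns)
exec(MINIFIED, minified_ns)

data = [20, 22, 24]
same = readable_ns["average_temperature"](data) == minified_ns["average_temperature"](data)
print("same result:", same)
print(len(READABLE), "bytes ->", len(MINIFIED), "bytes")
```

The public function name is left alone, since anything importing the module would still need to find it.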

Compiling to a binary mpy file is definitely an option, although this does still require the overhead of the LittleFS filesystem which is significant on these small EEPROMs.

> The real issue is that to run an app you have to be able to import it. If we have a filesystem we can mount it and import from that folder. If we just have a bytes object I don't know what we do with it. I don't want to copy the file to flash on the badge.

There are, broadly speaking, two approaches I can envision here. Both require some careful thought but should be workable.

The first approach is using dependency injection techniques. There are quite a few articles online about how this is typically implemented in Python, but in short you can create your own implementation of Python's import handler, allowing you to do your own module spec resolution, module loading/creation, type creation, and execution, all within the same Python execution environment as your regular code. The general workflow is to subclass MetaPathFinder (from importlib.abc) with a custom find_spec function to override the default behaviour, and point that at a custom loader class that resolves the spec to an object representing the loaded module. The loader calls exec on the source while passing in an empty dictionary as the globals, so that everything the source defines ends up in that object, which now acts as a proxy for a module. You then use __import__(namestr) as you usually would to load the module by name. There's a blog post here that talks about this in more detail, albeit without the exec part. This is something I should be able to write without too much trouble, assuming MicroPython uses the same importlib stuff under the hood.
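As a sketch of that workflow on desktop CPython (whether MicroPython exposes sys.meta_path and importlib.abc in the same way is exactly the open question, so treat this as untested on the badge). The class names and the module name are hypothetical. One deliberate difference from the description above: the source is exec'd straight into the new module's own __dict__ rather than into a separate empty dictionary, which amounts to the same thing with one less copy.

```python
import importlib.abc
import importlib.util
import sys


class EepromSourceLoader(importlib.abc.Loader):
    """Loader that runs source text held in memory instead of a file."""

    def __init__(self, source):
        self.source = source

    def create_module(self, spec):
        return None  # let the import machinery create a plain module object

    def exec_module(self, module):
        code = compile(self.source, "<eeprom:%s>" % module.__name__, "exec")
        exec(code, module.__dict__)  # definitions land in the module itself


class EepromFinder(importlib.abc.MetaPathFinder):
    """Finder that resolves only the module names registered with it."""

    def __init__(self):
        self.sources = {}

    def register(self, name, source):
        self.sources[name] = source

    def find_spec(self, fullname, path=None, target=None):
        if fullname not in self.sources:
            return None  # not ours: fall through to the normal importers
        loader = EepromSourceLoader(self.sources[fullname])
        return importlib.util.spec_from_loader(fullname, loader)


finder = EepromFinder()
sys.meta_path.insert(0, finder)

# Stand-in for source text read out of a compact-format EEPROM.
finder.register("hexpansion_slot1_app", "GREETING = 'hello'\ndef greet():\n    return GREETING\n")
module = __import__("hexpansion_slot1_app")
print(module.greet())
```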

The upside of this is that it solves the import problem directly - we want to import a module that isn't on the filesystem, so we override the importer behaviour to do what we need. All the code ends up in the same vicinity as the existing EEPROM loader stuff. The downsides are twofold. The main issue is I'm not sure if the regular importlib stuff exists in MicroPython, or if it behaves in a similar enough fashion that it'd be easy to implement. It may well, I just haven't tried. There's some mention of MicroPython using CPython's implementation of it, and the version they talk about seems to at least support some degree of this feature set, but I'd have to actually try it to know for sure. The second issue is a lesser one but, since (afaik) MicroPython doesn't offer a method of executing a compiled mpy file from runtime data in the same way we can use exec(...) to load source code from a string, we would never be able to support a compact EEPROM format containing mpy data instead of source.

The second approach would be to tackle the problem from the other end - we want to import a module that isn't on the filesystem, so why don't we make (or fake) a filesystem and put the module on it? MicroPython's vfs library supports custom block devices which would allow us to create a filesystem whose contents are backed by custom code or memory. This does result in two separate filesystems, but we can leave the import code alone and just use it like normal.

The simplest way to implement this would be to follow the example in the linked docs to create a RAM-backed block device, format it to LittleFS, mount it, add the mount path to sys.path, then when reading the compact EEPROM format (doing decompression if required) we copy the file contents into this in-memory filesystem. The downside would be increased memory usage, since we need to store the whole Python file (and the filesystem data) in memory plus the loaded module representation at runtime. We also need to remember that there may be up to 6 hexpansions loading these scripts into memory. However, this approach is likely the simplest overall option to implement and would work as a minimum viable product if we want to take the VFS approach. We'd also be able to support compressed mpy files, for potentially even higher density.
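A minimal sketch of that RAM-backed route. The block device class follows the shape of the RAM block device example in the MicroPython docs as I remember it. The mount_app_ramdisk helper, the mount point, and the sizes are my own placeholders. The helper is untested and only meaningful on MicroPython builds that provide the vfs module (older builds expose the same calls on os instead), so nothing in it runs on a desktop Python.

```python
class RAMBlockDev:
    """Block device whose storage is a bytearray in RAM."""

    def __init__(self, block_size, num_blocks):
        self.block_size = block_size
        self.data = bytearray(block_size * num_blocks)

    def readblocks(self, block_num, buf, offset=0):
        addr = block_num * self.block_size + offset
        for i in range(len(buf)):
            buf[i] = self.data[addr + i]

    def writeblocks(self, block_num, buf, offset=None):
        if offset is None:
            # Whole-block write: erase first (a no-op for RAM), then write.
            for i in range(len(buf) // self.block_size):
                self.ioctl(6, block_num + i)
            offset = 0
        addr = block_num * self.block_size + offset
        for i in range(len(buf)):
            self.data[addr + i] = buf[i]

    def ioctl(self, op, arg):
        if op == 4:  # number of blocks
            return len(self.data) // self.block_size
        if op == 5:  # block size in bytes
            return self.block_size
        if op == 6:  # erase block: nothing to do for RAM
            return 0


def mount_app_ramdisk(source_bytes, mount_point="/hexram", block_size=512, num_blocks=50):
    """MicroPython only (untested sketch): RAM disk holding a single app.py."""
    import sys
    import vfs  # MicroPython module, not available on CPython

    bdev = RAMBlockDev(block_size, num_blocks)
    vfs.VfsLfs2.mkfs(bdev)         # format the RAM disk as LittleFS
    vfs.mount(bdev, mount_point)
    with open(mount_point + "/app.py", "wb") as f:
        f.write(source_bytes)      # decompress before this point if needed
    if mount_point not in sys.path:
        sys.path.append(mount_point)
    return bdev
```

The 512 x 50 geometry is just what I recall the docs using. With up to six hexpansions each holding one of these, it would want shrinking to the smallest size LittleFS will accept.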

If we wanted to avoid the memory overhead we could go a slightly more complex route. I did some tinkering in Wokwi's simulator and put together a block device fully backed by code, with each block dynamically generated at runtime to emulate a FAT12 filesystem containing a single file called app.py, with its contents also generated at runtime. Under the hood it's just a Python class with a readblocks method that generates the necessary data to appear like a FAT12 filesystem, which MicroPython can consume as a custom block device. The code is nowhere near production-ready, but it proves the concept. The same idea would work for FAT16, FAT32, or LittleFS - I just went with FAT12 because it was the default filesystem generated in MicroPython on the ESP32 Wokwi simulator. FAT16 would probably be easier if only because it avoids needing to pack 12-bit little-endian values in the alloc table, which was a source of a lot of head-scratching... endianness across byte boundaries is weird. We could use this trick to emulate six separate mounted filesystems and add them all to the system path, allowing importing. The emulated storage class could then be provided with a pair of callback functions for retrieving the total file size and blocks of file data. We could even have a global cache for decompressed file blocks to prevent needing to re-read and decompress EEPROM data constantly, as a controlled time-memory tradeoff.
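For anyone curious about the 12-bit packing that caused the head-scratching: two FAT12 entries share three bytes, with the middle byte split between them. A small sketch of the get/set logic follows (the function names are mine, and this is a fresh illustration rather than the code from the Wokwi experiment).

```python
def fat12_set(table, index, value):
    """Store a 12-bit entry at the given index of a FAT12 allocation table."""
    offset = index + index // 2  # each entry is 1.5 bytes
    if index % 2 == 0:
        # Even entry: all of the first byte plus the low nibble of the second.
        table[offset] = value & 0xFF
        table[offset + 1] = (table[offset + 1] & 0xF0) | ((value >> 8) & 0x0F)
    else:
        # Odd entry: the high nibble of the first byte plus all of the second.
        table[offset] = (table[offset] & 0x0F) | ((value & 0x0F) << 4)
        table[offset + 1] = (value >> 4) & 0xFF


def fat12_get(table, index):
    """Read back the 12-bit entry at the given index."""
    offset = index + index // 2
    word = table[offset] | (table[offset + 1] << 8)  # little-endian 16 bits
    return (word >> 4) if index % 2 else (word & 0x0FFF)


# Four entries: media descriptor, end-of-chain, "next cluster is 3", end-of-chain.
entries = [0xFF8, 0xFFF, 0x003, 0xFFF]
table = bytearray(6)
for i, entry in enumerate(entries):
    fat12_set(table, i, entry)
print(table.hex())
```

A generated readblocks implementation only needs the get side's layout in reverse, emitting these bytes for whichever part of the table falls inside the requested block.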

All approaches described here have their merits. I feel like the emulated filesystem solution might be slightly cleaner overall since it doesn't get into the internal gubbins of how Python's import stuff works, and makes it super easy to debug stuff since the Python file appears just like any other, while also keeping the memory usage down. The dynamic filesystem generation stuff is potentially prone to more quirky bugs, but with some careful design and testing I think we could make it work.