Metadata handling
abey79 opened this issue · 2 comments
I'm still in the process of sorting that out.
There are at least two design goals:
- The data structure should support the hierarchical nature of metadata, i.e. if a color is looked up for a path but is not defined there, the look-up should escalate to the layer, then to the document, then to default values.
- Cloning metadata should be cheap. For example, flattening a document should not copy all the metadata upon cloning, but only lazily upon mutation—if any.
Maybe a `HashMap<_, Cow<_>>`? Or an immutable data structure from the `im` crate?
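To make the two goals concrete, here is a minimal sketch using only the standard library. It is not a settled design: the `Metadata` type, its string keys and values, and the `resolve` function are assumptions made up for illustration. Sharing the map behind an `Arc` and mutating through `Arc::make_mut` gives cheap clones that copy only on the first write, and the look-up chain falls back from path to layer to document to defaults.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical metadata container: a shared, copy-on-write key/value map.
#[derive(Clone, Debug, Default)]
pub struct Metadata {
    values: Arc<HashMap<String, String>>,
}

impl Metadata {
    /// Returns the value stored at this level only (no escalation).
    pub fn get(&self, key: &str) -> Option<&str> {
        self.values.get(key).map(String::as_str)
    }

    /// Sets a value. The underlying map is copied only if it is currently
    /// shared with another clone; otherwise it is mutated in place.
    pub fn set(&mut self, key: &str, value: &str) {
        Arc::make_mut(&mut self.values).insert(key.to_owned(), value.to_owned());
    }

    /// True if both handles still point at the same allocation.
    pub fn shares_storage_with(&self, other: &Metadata) -> bool {
        Arc::ptr_eq(&self.values, &other.values)
    }
}

/// Looks a key up on the path, then the layer, then the document,
/// then the default values.
pub fn resolve<'a>(
    key: &str,
    path: &'a Metadata,
    layer: &'a Metadata,
    document: &'a Metadata,
    defaults: &'a Metadata,
) -> Option<&'a str> {
    path.get(key)
        .or_else(|| layer.get(key))
        .or_else(|| document.get(key))
        .or_else(|| defaults.get(key))
}
```

With this shape, `layer.clone()` only bumps a reference count, and the first `set` on the clone is what triggers the copy. A `Cow`-based map or a persistent map from the `im` crate would be alternative ways to get similar behaviour, possibly with finer-grained sharing.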
After thinking about this and discussing it, I'm strongly leaning towards forgoing a hierarchical metadata organisation entirely:
I'm still trying to figure out a good strategy for metadata handling, and it occurs to me that there might actually be no benefit to vpype for metadata to be hierarchical. In other words, what I'm considering is to have only path metadata, and to skip trying to maintain layer- or global-level metadata and dealing with the related hierarchical relationship (e.g. if path X doesn't have stroke-weight but its parent layer does, then it should inherit its parent's stroke-weight).
Of course, reading SVG absolutely must handle the attribute hierarchy properly, but the resulting metadata can then be entirely flattened into the paths. Likewise, exporting SVG should try to extract common subsets of metadata within each layer so that the corresponding attributes are set at the layer/group level rather than on individual paths (in order to limit file size). Finally, some commands will definitely want to operate at the layer level (e.g. the current `color --layer 3 red`). In such a case, the command will have to iterate over the layer's paths and set the metadata for each of them, and its effect will not apply to subsequently added paths (e.g. `color red` and `line 0 0 10 20` would no longer be commutative, as is currently the case; this could possibly be addressed by maintaining layer-level "default values").
The price to pay for this feels very reasonable compared to the complexity of handling the hierarchical relationship. A path can truly be self-contained, without needing to know which layer or global context it is attached to in order to know what its own color or stroke weight is. I believe much code can be simplified with this approach.
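The import-time flattening and export-time extraction of common attributes described above could look roughly like the sketch below. Everything in it is hypothetical: the `Attrs` alias, the string-based attribute representation, and the function names are only there to illustrate the idea, not to suggest an actual API.

```rust
use std::collections::BTreeMap;

/// Hypothetical attribute map, e.g. "stroke" -> "red".
pub type Attrs = BTreeMap<String, String>;

/// Convenience constructor used for illustration only.
pub fn attrs(pairs: &[(&str, &str)]) -> Attrs {
    pairs
        .iter()
        .map(|&(k, v)| (k.to_owned(), v.to_owned()))
        .collect()
}

/// Import side: flattens inherited attributes into a path. The path's own
/// values win; the layer's values fill the gaps.
pub fn flatten(layer: &Attrs, path: &Attrs) -> Attrs {
    let mut resolved = layer.clone();
    resolved.extend(path.iter().map(|(k, v)| (k.clone(), v.clone())));
    resolved
}

/// Export side: returns the attributes that every path in the layer carries
/// with the same value, so they can be written once on the group.
pub fn common_subset(paths: &[Attrs]) -> Attrs {
    match paths.split_first() {
        None => Attrs::new(),
        Some((first, rest)) => first
            .iter()
            .filter(|&(k, v)| rest.iter().all(|p| p.get(k) == Some(v)))
            .map(|(k, v)| (k.clone(), v.clone()))
            .collect(),
    }
}

/// Export side: drops from a path the attributes already set on its group.
pub fn strip_common(path: &Attrs, common: &Attrs) -> Attrs {
    path.iter()
        .filter(|&(k, v)| common.get(k) != Some(v))
        .map(|(k, v)| (k.clone(), v.clone()))
        .collect()
}
```

After `flatten`, each path carries everything needed to draw it. On export, `common_subset` followed by `strip_common` moves the shared values back to the group level to limit file size.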
My current plan is as follows:
- Each hierarchy level has its own metadata structure with relevant information: `DocumentMetadata` (e.g. page size), `LayerMetadata` (e.g. name), `PathMetadata` (e.g. stroke width & colour).
- Metadata structures contain mostly `Option` fields (an empty layer name isn't the same thing as no name). `DocumentMetadata` and `LayerMetadata` include a `default_path_metadata` field that cascades to paths when relevant.
- Metadata structures implement some "boolean" operations like merge, diff, etc., to support some high-level operations.
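As a rough sketch of what these structures might look like: the three struct names and the `default_path_metadata` field come from the plan above, while the individual fields (`stroke_width`, `color`, `name`, `page_size`), their types, and the `or` and `resolve` helpers are assumptions picked for illustration.

```rust
/// Per-path metadata. Field names and types are hypothetical.
#[derive(Clone, Debug, Default, PartialEq)]
pub struct PathMetadata {
    pub stroke_width: Option<f64>,
    pub color: Option<String>,
}

impl PathMetadata {
    /// Returns a copy of `self` whose unset fields are filled from `fallback`.
    pub fn or(&self, fallback: &PathMetadata) -> PathMetadata {
        PathMetadata {
            stroke_width: self.stroke_width.or(fallback.stroke_width),
            color: self.color.clone().or_else(|| fallback.color.clone()),
        }
    }
}

/// Per-layer metadata. `Some(String::new())` (empty name) is distinct
/// from `None` (no name).
#[derive(Clone, Debug, Default, PartialEq)]
pub struct LayerMetadata {
    pub name: Option<String>,
    pub default_path_metadata: PathMetadata,
}

/// Document-level metadata.
#[derive(Clone, Debug, Default, PartialEq)]
pub struct DocumentMetadata {
    pub page_size: Option<(f64, f64)>,
    pub default_path_metadata: PathMetadata,
}

/// Computes the "resolved" attributes of a path: its own values first,
/// then the layer defaults, then the document defaults.
pub fn resolve(
    path: &PathMetadata,
    layer: &LayerMetadata,
    document: &DocumentMetadata,
) -> PathMetadata {
    path.or(&layer.default_path_metadata)
        .or(&document.default_path_metadata)
}
```

Keeping the cascade in a single `resolve` function means paths never need a back-reference to their layer or document; the caller supplies the context only when resolved values are actually needed.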
For example, merging two layers involves the following steps:
- Both layers' metadata are merged, which means only identical fields remain and the rest become `None`.
- The `default_path_metadata` gets special treatment: the difference after the merge is cascaded to the corresponding paths' metadata, such that the overall operation isn't lossy on the "resolved" path attributes.
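The merge steps above can be sketched as follows. This is only one possible reading of the plan: the `Layer` struct (which bundles the layer metadata with its paths' metadata for brevity), the field names, and the exact semantics of `merge`, `diff`, and `or` are assumptions.

```rust
#[derive(Clone, Debug, Default, PartialEq)]
pub struct PathMetadata {
    pub stroke_width: Option<f64>,
    pub color: Option<String>,
}

impl PathMetadata {
    /// Keeps only the fields on which both sides agree; the rest become `None`.
    pub fn merge(&self, other: &PathMetadata) -> PathMetadata {
        PathMetadata {
            stroke_width: if self.stroke_width == other.stroke_width { self.stroke_width } else { None },
            color: if self.color == other.color { self.color.clone() } else { None },
        }
    }

    /// Keeps the fields of `self` that did not survive into `merged`.
    pub fn diff(&self, merged: &PathMetadata) -> PathMetadata {
        PathMetadata {
            stroke_width: if merged.stroke_width.is_some() { None } else { self.stroke_width },
            color: if merged.color.is_some() { None } else { self.color.clone() },
        }
    }

    /// Returns a copy of `self` whose unset fields are filled from `fallback`.
    pub fn or(&self, fallback: &PathMetadata) -> PathMetadata {
        PathMetadata {
            stroke_width: self.stroke_width.or(fallback.stroke_width),
            color: self.color.clone().or_else(|| fallback.color.clone()),
        }
    }
}

/// Hypothetical layer: its own metadata plus the metadata of its paths.
#[derive(Clone, Debug, Default, PartialEq)]
pub struct Layer {
    pub name: Option<String>,
    pub default_path_metadata: PathMetadata,
    pub paths: Vec<PathMetadata>,
}

/// Merges two layers without changing any path's resolved attributes.
pub fn merge_layers(a: &Layer, b: &Layer) -> Layer {
    let merged_default = a.default_path_metadata.merge(&b.default_path_metadata);
    let mut paths = Vec::new();
    for layer in [a, b] {
        // Defaults that the merged layer no longer carries are pushed
        // down into each of this layer's paths.
        let lost = layer.default_path_metadata.diff(&merged_default);
        paths.extend(layer.paths.iter().map(|p| p.or(&lost)));
    }
    Layer {
        name: if a.name == b.name { a.name.clone() } else { None },
        default_path_metadata: merged_default,
        paths,
    }
}
```

The property to check is that, for every path, `path.or(&layer.default_path_metadata)` gives the same result before and after the merge, even though the merged layer's defaults may have lost fields.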