SynBioDex/SEPs

SEP 041 -- Make all sequence associations explicit

Closed this issue · 4 comments

This SEP proposes to require that the association of a Sequence to a complete Component be explicit, rather than implicit, and to enable this via changes to Location and Component, plus a new EntireComponent location class.

The SEP is available at: https://github.com/SynBioDex/SEPs/blob/master/sep_041.md

It took me a while to wrap my head around this one, but I think it works well. It is like the common source annotation in many GenBank files. I’m not sure I love the name EntireComponent, but I cannot think of a better one. In SBOL2 the sequence list is a set, but I assume this would translate to multiple EntireComponent SAs. It is a bit more complex for basic components, but I guess that is okay.

@cjmyers That's right: adding an EntireComponent link in a basic component is the price we pay for being able to have two sequences in a flattened system.

Note, however, that there will usually not be multiple EntireComponent SAs, but more likely multiple EntireComponent locations on SubComponents. The second example in the SEP shows this with a two-plasmid system.
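To make the two-plasmid case concrete, here is a minimal sketch in plain Python dataclasses. It is not the SBOL library API and not the SEP's own example: the class layout, the field names (`sequences`, `sub_components`, `instance_of`, `locations`), the helper `sequence_of`, and the toy sequences are all assumptions made for illustration. It only shows the idea that each SubComponent carries an EntireComponent location naming the Sequence it spans.

```python
from dataclasses import dataclass, field


@dataclass
class Sequence:
    name: str
    elements: str


@dataclass
class EntireComponent:
    """Hypothetical location: the feature spans all of `sequence`."""
    sequence: Sequence


@dataclass
class Component:
    name: str
    sequences: list = field(default_factory=list)
    sub_components: list = field(default_factory=list)


@dataclass
class SubComponent:
    instance_of: Component
    locations: list = field(default_factory=list)


# Toy data (assumed): a flattened system that holds two sequences.
seq_a = Sequence("plasmid_a_seq", "atgc")
seq_b = Sequence("plasmid_b_seq", "ggcc")
plasmid_a = Component("plasmid_a")
plasmid_b = Component("plasmid_b")

system = Component(
    "two_plasmid_system",
    sequences=[seq_a, seq_b],
    sub_components=[
        SubComponent(plasmid_a, [EntireComponent(seq_a)]),
        SubComponent(plasmid_b, [EntireComponent(seq_b)]),
    ],
)


def sequence_of(parent: Component, part_name: str) -> Sequence:
    """Return the sequence explicitly linked to the named sub-component."""
    for sub in parent.sub_components:
        if sub.instance_of.name == part_name:
            for loc in sub.locations:
                if isinstance(loc, EntireComponent):
                    return loc.sequence
    raise KeyError(part_name)
```

With this layout, `sequence_of(system, "plasmid_b")` resolves to `seq_b` without guessing, even though the parent holds two sequences.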

graik commented

I have not entirely wrapped my head around it, but I definitely think that each plasmid in these examples should be wrapped by its own Component with the sequence attached to it. This is the most basic level of a hierarchical (layer-based) representation of a design. This SEP seems to go exactly the other way. The "flattening" argument in general seems to prevent a clean representation of design layers. I think this idea would have to be discussed more widely, and it was a bit premature to push it through to the SEP vote.

@graik As noted in the SEP 042 thread, this one has also been discussed more widely in other interactions, not all of them written.

Second, I would encourage you to look again at the "annotation in context" example, which does have hierarchical wrappings just like you are looking for. In this case, however, the SequenceAnnotations only make sense in the context of the multi-sequence module, and we need some way of indicating which sequence is being annotated.
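As a rough illustration of why the annotated sequence has to be named, here is a small sketch, again in plain Python rather than any SBOL library. The `sequence` field on `Range`, the helper `annotated_elements`, the 1-based inclusive coordinates, and the toy sequences are assumptions for this example only. The same coordinates select different bases depending on which sequence the location points at.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sequence:
    name: str
    elements: str


@dataclass(frozen=True)
class Range:
    """Hypothetical location with an explicit sequence reference."""
    sequence: Sequence
    start: int  # assumed 1-based, inclusive
    end: int    # assumed 1-based, inclusive


@dataclass(frozen=True)
class SequenceAnnotation:
    name: str
    location: Range


def annotated_elements(annotation: SequenceAnnotation) -> str:
    """Slice the bases out of the sequence the location names."""
    loc = annotation.location
    return loc.sequence.elements[loc.start - 1:loc.end]


# Toy data (assumed): two sequences held by one multi-sequence module.
seq_a = Sequence("plasmid_a_seq", "aaacccggg")
seq_b = Sequence("plasmid_b_seq", "tttgggccc")

site_on_a = SequenceAnnotation("site_on_a", Range(seq_a, 4, 6))
site_on_b = SequenceAnnotation("site_on_b", Range(seq_b, 4, 6))
```

Here `annotated_elements(site_on_a)` and `annotated_elements(site_on_b)` differ even though both ranges are 4 to 6, which is the ambiguity an explicit sequence reference removes.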