Experimental support for SampleCollections and Loading_ID
clintval opened this issue · 1 comments
clintval commented
A SampleCollection is a container for samples which may have originated from multiple sample sheets / flow cells / lanes. A SampleCollection will facilitate organizing samples by their Sample_Name or Library_ID. A few methods will help with merge strategies for identical samples that have either been topped-off (same library, sequenced on different flow cells or lanes) or re-prepared (different library, can exist on same flow cell or lane).
>>> from sample_sheet import SampleCollection
>>> collection = SampleCollection(samples)
>>> collection.visualize()
"""
collection(n=4)
├─ sample1
│  ├─ library1
│  │  ├─ loading1
│  │  └─ loading2
│  └─ library2
│     └─ loading1
└─ sample2
   └─ library1
      └─ loading1
"""
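To make the nesting concrete, here is a minimal standalone sketch of how a sample / library / loading tree like the one above could be built and rendered. It does not use the sample_sheet package: `Record`, `build_tree`, and `render` are hypothetical names invented for this illustration, and the assumption is that every sample carries a Sample_Name, a Library_ID, and a Loading_ID.

```python
from collections import defaultdict
from typing import NamedTuple


class Record(NamedTuple):
    """Hypothetical stand-in for one sample sheet row (not the library's Sample class)."""
    Sample_Name: str
    Library_ID: str
    Loading_ID: str


def build_tree(records):
    """Nest records as sample -> library -> [loadings], preserving input order."""
    tree = defaultdict(lambda: defaultdict(list))
    for rec in records:
        tree[rec.Sample_Name][rec.Library_ID].append(rec.Loading_ID)
    return tree


def render(node, prefix=""):
    """Return the tree lines for a nested dict (branches) or list (leaves)."""
    if isinstance(node, dict):
        children = list(node.items())
    else:
        children = [(leaf, None) for leaf in node]
    lines = []
    for index, (name, child) in enumerate(children):
        last = index == len(children) - 1
        lines.append(f"{prefix}{'└─ ' if last else '├─ '}{name}")
        if child is not None:
            # Keep the vertical rule only while siblings remain below.
            lines.extend(render(child, prefix + ("   " if last else "│  ")))
    return lines


records = [
    Record("sample1", "library1", "loading1"),
    Record("sample1", "library1", "loading2"),
    Record("sample1", "library2", "loading1"),
    Record("sample2", "library1", "loading1"),
]
text = "\n".join([f"collection(n={len(records)})"] + render(build_tree(records)))
print(text)
```

The count in the header is the number of individual records, which is why the example collection reports n=4 even though it holds only two distinct sample names.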
Grouping samples by loading returns a new collection. Samples that can be merged at this level will be equivalent (see L261-L265).
>>> collection = collection.group_by_loading(attr='Loading_ID')
>>> collection.visualize()
"""
collection(n=3)
├─ sample1
│  ├─ library1
│  └─ library2
└─ sample2
   └─ library1
"""
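As a rough sketch of what grouping by loading might do under the hood, the snippet below merges records that share a sample name and a library, pooling their loadings into one entry. This is only an illustration under stated assumptions: `Record` and this standalone `group_by_loading` function are hypothetical and are not part of the sample_sheet package.

```python
from typing import NamedTuple, Tuple


class Record(NamedTuple):
    """Hypothetical row: one library of one sample, with the loadings it came from."""
    Sample_Name: str
    Library_ID: str
    Loading_IDs: Tuple[str, ...]


def group_by_loading(records):
    """Merge records that share a sample and a library, pooling their loadings."""
    merged = {}
    for rec in records:
        key = (rec.Sample_Name, rec.Library_ID)
        # Start from the loadings already pooled for this key, if any.
        pooled = merged[key].Loading_IDs if key in merged else ()
        merged[key] = Record(rec.Sample_Name, rec.Library_ID, pooled + rec.Loading_IDs)
    return list(merged.values())


records = [
    Record("sample1", "library1", ("loading1",)),
    Record("sample1", "library1", ("loading2",)),
    Record("sample1", "library2", ("loading1",)),
    Record("sample2", "library1", ("loading1",)),
]
grouped = group_by_loading(records)
print(len(grouped))  # 3
```

Only the two top-off loadings of sample1 / library1 collapse here, so four records become three. Keeping the pooled loading identifiers on the merged record is a design choice of this sketch, so that the origin of the merged data stays traceable.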
Grouping samples by library returns a final collection.
>>> collection = collection.group_by_library(attr='Library_ID')
>>> collection.visualize()
"""
collection(n=2)
├─ sample1
└─ sample2
"""
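The two merge situations described at the top of this issue (topped-off versus re-prepared) can be told apart by comparing identifiers pairwise. The following is a small sketch of that decision, assuming that Library_ID is scoped within a sample and that Loading_ID distinguishes flow cell / lane loadings. `Record`, `Relation`, and `classify` are hypothetical names for illustration only.

```python
from enum import Enum
from typing import NamedTuple


class Record(NamedTuple):
    """Hypothetical stand-in for one sample sheet row."""
    Sample_Name: str
    Library_ID: str
    Loading_ID: str


class Relation(Enum):
    UNRELATED = "unrelated"      # different samples
    RE_PREPARED = "re-prepared"  # same sample, different library
    TOP_OFF = "top-off"          # same sample and library, different loading
    IDENTICAL = "identical"      # same sample, library, and loading


def classify(a, b):
    """Classify how two records relate, checking the coarsest level first."""
    if a.Sample_Name != b.Sample_Name:
        return Relation.UNRELATED
    if a.Library_ID != b.Library_ID:
        return Relation.RE_PREPARED
    if a.Loading_ID != b.Loading_ID:
        return Relation.TOP_OFF
    return Relation.IDENTICAL


first = Record("sample1", "library1", "loading1")
top_off = Record("sample1", "library1", "loading2")
re_prep = Record("sample1", "library2", "loading1")
other = Record("sample2", "library1", "loading1")

print(classify(first, top_off).value)  # top-off
print(classify(first, re_prep).value)  # re-prepared
print(classify(first, other).value)    # unrelated
```

Checking Sample_Name before Library_ID matters in this sketch because two different samples may both have a library called library1, as in the example tree.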
clintval commented
Cool thought, maybe another time.