[QUESTION]
saramoein372 opened this issue · 5 comments
Hello,
I have a question about the definition of clone_id. Originally it had 4 parts, as explained in the tutorial. But in my new run of dandelion, each clone_id has 6 parts (a clone_id looks like a_b_c_d_e_f).
Would you please explain what each part is?
Thanks
Sara
Hi Sara, yes, sorry, I need to update the documentation.
The behaviour is as before: a_b_c refers to the heavy chain and d_e_f refers to the light chain (V-J gene pairing, CDR3 length, and sequence similarity, respectively). The last update, #124, made it easier to extract the light chain numbering assignments.
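To make the layout concrete, here is a minimal sketch of splitting a 6-part id into its heavy and light chain halves. It is plain Python and not part of dandelion: the helper name split_clone_id, the ChainLabel type, and its field names are made up for illustration, and the example id is invented.

```python
from typing import NamedTuple, Tuple


class ChainLabel(NamedTuple):
    """One chain's three components (field names are hypothetical)."""
    vj_pairing: str       # V-J gene pairing group
    cdr3_length: str      # CDR3 length group
    similarity: str       # sequence similarity group


def split_clone_id(clone_id: str) -> Tuple[ChainLabel, ChainLabel]:
    """Split a 6-part id 'a_b_c_d_e_f' into (heavy, light) labels."""
    parts = clone_id.split("_")
    if len(parts) != 6:
        raise ValueError(
            f"expected 6 underscore-separated parts, got {len(parts)}: {clone_id!r}"
        )
    # First three parts describe the heavy chain, last three the light chain.
    return ChainLabel(*parts[:3]), ChainLabel(*parts[3:])


# Invented example id, only to show the shape of the result.
heavy, light = split_clone_id("12_3_1_4_2_1")
print(heavy)
print(light)
```

The function raises on anything that is not exactly 6 parts, so 4-part ids from older runs are rejected rather than silently misread.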
Hmm, but you have raised a possible bug. I thought I had added a full_pairing_label option to the call to revert the clone id definitions to the original 4-part definition, but it looks like this isn't working as intended if you are getting the 6-part clone ids with a normal run.
Hi Sara,
If you run it as ddl.tl.find_clones(vdj, full_pairing_label=True), it should revert to the 4-part id.
I named this new option wrongly, but it's working as intended (toggling between the 6-part and 4-part ids). I will most likely rename the option to something more sensible in a future update.
The option has been renamed from full_pairing_label to collapse_label in v0.2.0.