zktuong/dandelion

[QUESTION]

saramoein372 opened this issue · 5 comments

Hello,

I have a question about the definition of the clone_id. Originally it had 4 parts, as explained in the tutorial. But in my new run of dandelion, each clone_id has 6 parts (a clone_id looks like: a_b_c_d_e_f).

Would you please explain what each part is?

Thanks
Sara

Hi Sara, yes, sorry, I need to update the documentation.

The behaviour is as before, in that a_b_c refers to the heavy chain and d_e_f refers to the light chain (V_J gene pairing, CDR3 length, and sequence similarity, respectively, for each chain), but the last update #124 made it easier to extract the light chain numbering assignments.
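As a rough illustration of that layout, here is a minimal sketch that splits a single 6-part id into its heavy and light chain halves. The helper name `split_clone_id` and the `ChainLabel` field names are made up for this example and are not part of dandelion; it also assumes the id is one plain string with exactly six underscore-separated fields.

```python
from typing import NamedTuple, Tuple


class ChainLabel(NamedTuple):
    """Three fields describing one chain (field names are illustrative)."""
    vj_pairing: str
    cdr3_length: str
    similarity: str


def split_clone_id(clone_id: str) -> Tuple[ChainLabel, ChainLabel]:
    """Split a 6-part id 'a_b_c_d_e_f' into (heavy, light) labels.

    Hypothetical helper: assumes exactly six underscore-separated fields.
    """
    parts = clone_id.split("_")
    if len(parts) != 6:
        raise ValueError(f"expected 6 fields, got {len(parts)}: {clone_id!r}")
    # First three fields: heavy chain. Last three fields: light chain.
    return ChainLabel(*parts[:3]), ChainLabel(*parts[3:])


heavy, light = split_clone_id("a_b_c_d_e_f")
print(heavy)  # the a_b_c half
print(light)  # the d_e_f half
```

Anything that is not a 6-field id (for example a 4-part id) raises a `ValueError` rather than being guessed at.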

Hmm, but you have raised a possible bug. I thought I had added a full_pairing_label option to the call to revert the clone id definitions to the original 4-part definition, but it looks like this isn't working as intended if you are getting the 6-part clone ids with a normal run.

Hi Sara,

If you run it as ddl.tl.find_clones(vdj, full_pairing_label = True), it should revert to the 4-part id.
I named this new option wrongly, but it's working as intended (to toggle between the 6-part and 4-part ids). I will most likely rename the option to something more sensible in a future update.

The option has been renamed from full_pairing_label to collapse_label in v0.2.0.
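For scripts that need to run against versions on either side of the rename, a small sketch like the following could pick the keyword name. The helper `label_kwarg` is hypothetical and not part of dandelion. It assumes that only the name changed in v0.2.0 (so `True` still selects the 4-part id), that the version string starts with `major.minor`, and that the installed version is recent enough to have the option at all.

```python
from typing import Dict


def label_kwarg(version: str, four_part: bool = True) -> Dict[str, bool]:
    """Return the keyword argument that toggles 4-part vs 6-part ids.

    Hypothetical helper: assumes the option is called collapse_label from
    v0.2.0 onwards and full_pairing_label before that.
    """
    # Compare only the leading major.minor numbers of the version string.
    major, minor = (int(x) for x in version.split(".")[:2])
    name = "collapse_label" if (major, minor) >= (0, 2) else "full_pairing_label"
    return {name: four_part}


print(label_kwarg("0.2.0"))
print(label_kwarg("0.1.9"))

# Intended use (not run here, needs dandelion and a vdj object):
# ddl.tl.find_clones(vdj, **label_kwarg("0.2.0"))
```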