zktuong/dandelion

Accept metadata column as clones for germline inference (pp.create_germlines)

Ngort opened this issue · 3 comments

Ngort commented

Is your feature request related to a problem?

Right now, the documentation for pp.create_germlines reads "data (Union[Dandelion, pd.DataFrame, str]) – Dandelion object, pandas DataFrame in changeo/airr format, or file path to changeo/airr file after clones have been determined."

I believe there should be a way to extract clone assignments from a data/metadata column and run buildGermline with clone assignments from a Dandelion object alone, rather than having to export it as a ChangeO file first.

Describe the solution you'd like

Add a parameter clones_col or something like that. Turn this column into the kind of list that ChangeO expects for buildGermline and pass that as --cf (and set --cloned).

Describe alternatives you've considered

No response

Additional context

No response

I'm actually thinking of torching this entire code chunk and just replacing it with how ddl.pp.ext.create_germlines calls CreateGermlines.py directly, plus the additional args option in #280.

This would trivialise the requirement to keep this function updated in its current state. As this will probably involve a bit of rework, my suggestion for now is to run CreateGermlines.py manually while I rework these two functions (they impact quite a bit of code, so I'll have to take my time to comb through it).
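As a rough illustration of the manual route, the sketch below assembles a CreateGermlines.py command line that builds one germline per clone from a chosen clone column. It is only a sketch under assumptions: the helper name build_create_germlines_cmd, the file name and the reference directory are hypothetical, and the flags (-d, -g, -r, --cloned, --cf) are Change-O's CLI options as I understand them, so please check them against CreateGermlines.py --help for your installed version.

```python
import shlex
from typing import List, Sequence


def build_create_germlines_cmd(
    airr_tsv: str,
    references: Sequence[str],
    clone_col: str = "clone_id",
    germ_type: str = "dmask",
) -> List[str]:
    """Assemble (but do not run) a CreateGermlines.py call for cloned data.

    Hypothetical helper for illustration; it is not part of dandelion.
    """
    if not references:
        raise ValueError("at least one germline reference path is required")
    return [
        "CreateGermlines.py",
        "-d", airr_tsv,      # AIRR/Change-O table that already holds clone assignments
        "-g", germ_type,     # germline type, e.g. dmask
        "--cloned",          # build a single germline per clone
        "--cf", clone_col,   # name of the column holding the clone identifiers
        "-r", *references,   # germline reference FASTA files or directories
    ]


# Hypothetical input file and reference directory.
cmd = build_create_germlines_cmd(
    "filtered_contig_dandelion.tsv",
    ["germlines/imgt/human/vdj"],
    clone_col="clone_id",
)
print(shlex.join(cmd))
# To execute it (requires Change-O on PATH):
#   import subprocess; subprocess.run(cmd, check=True)
```

Building the argument list separately from running it makes it easy to inspect the exact command before submitting it on a cluster.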

@Ngort hi, if you try the branch at #288 (pip install git+https://www.github.com/zktuong/dandelion@path-update), you should now be able to do the above by adding additional arguments, just like in #279.

Ngort commented

Thank you very much. My HPC is under maintenance, but I will try it ASAP :).