zktuong/dandelion

Define clones within each group

Ngort opened this issue · 3 comments

Ngort commented

Is your feature request related to a problem?

I have data from multiple replicates pooled together, each with an identity assigned by a metadata column. I want the defined clones to be within each group, and the edges across groups to be of zero weight.

Right now the DefineClones.py command is built as:

    cmd = [
        "DefineClones.py",
        "-d",
        h_file1,
        "-o",
        h_file2,
        "--act",
        action,
        "--model",
        model,
        "--norm",
        norm,
        "--dist",
        str(dist_),
        "--nproc",
        str(nproc),
        "--vf",
        v_field,
    ]

Could we add a way to pass an argument for --gf (aka --group_fields)?

Describe the solution you'd like

I want the defined clones to be within each group, and the edges across groups to be of zero weight.

Describe alternatives you've considered

Slicing the data and running the algorithm.

Additional context

No response

Hi @Ngort, thanks - yes, I can certainly add this.

I'm currently thinking of adding a free-form list [] where you can specify that option, and any other options, so it would be like:

    cmd_final = cmd + user_additional_cmd
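
A minimal sketch of that idea, for illustration only. It is not the code from #280: the helper name `build_define_clones_cmd`, the file names, and the `sample_id` column are hypothetical placeholders, and only `--gf` and the list concatenation come from this thread.

```python
from typing import List, Optional, Sequence


def build_define_clones_cmd(
    base_cmd: Sequence[str],
    additional_args: Optional[Sequence[str]] = None,
) -> List[str]:
    """Return base_cmd with any user-supplied arguments appended."""
    # Copy so the caller's list is never mutated.
    cmd_final = list(base_cmd)
    if additional_args:
        # Command-line entries must be strings, so coerce defensively.
        cmd_final += [str(arg) for arg in additional_args]
    return cmd_final


# Placeholder base command; the real one is built inside dandelion.
base = ["DefineClones.py", "-d", "in.tsv", "-o", "out.tsv", "--act", "set"]

# "sample_id" stands in for whatever metadata column identifies the group.
cmd_final = build_define_clones_cmd(base, ["--gf", "sample_id"])
print(cmd_final)
```

Appending to the end of the list keeps the defaults untouched while letting the user pass any extra flag, at the cost of not validating what is passed.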

Can you try installing the version at #280

    pip install git+https://www.github.com/zktuong/dandelion@update-cmd-for-external

and let me know if this works?

    # if there are multiple group fields, pass each one as its own string
    ddl.tl.define_clones(..., additional_args=['--gf', GROUP_FIELD])

Ngort commented

This is perfect! It works like a charm.