ml-struct-bio/cryodrgn

Extend cryodrgn filter to cryodrgn-et

ryanfeathers opened this issue · 1 comment

Running cryodrgn filter on cryoDRGN-ET outputs produces the following error:

  File "/scratch/gpfs/ZHONGE/rf2366/conda/cryodrgn_internal/bin/cryodrgn", line 8, in <module>
    sys.exit(main())
  File "/scratch/gpfs/ZHONGE/rf2366/dev/cryodrgn_internal/cryodrgn/__main__.py", line 74, in main
    args.func(args)
  File "/scratch/gpfs/ZHONGE/rf2366/dev/cryodrgn_internal/cryodrgn/commands/filter.py", line 152, in main
    plot_df = analysis.load_dataframe(
  File "/scratch/gpfs/ZHONGE/rf2366/dev/cryodrgn_internal/cryodrgn/analysis.py", line 647, in load_dataframe
    df = pd.DataFrame(data=data)
  File "/scratch/gpfs/ZHONGE/rf2366/conda/cryodrgn_internal/lib/python3.9/site-packages/pandas/core/frame.py", line 664, in __init__
    mgr = dict_to_mgr(data, index, columns, dtype=dtype, copy=copy, typ=manager)
  File "/scratch/gpfs/ZHONGE/rf2366/conda/cryodrgn_internal/lib/python3.9/site-packages/pandas/core/internals/construction.py", line 493, in dict_to_mgr
    return arrays_to_mgr(arrays, columns, index, dtype=dtype, typ=typ, consolidate=copy)
  File "/scratch/gpfs/ZHONGE/rf2366/conda/cryodrgn_internal/lib/python3.9/site-packages/pandas/core/internals/construction.py", line 118, in arrays_to_mgr
    index = _extract_index(arrays)
  File "/scratch/gpfs/ZHONGE/rf2366/conda/cryodrgn_internal/lib/python3.9/site-packages/pandas/core/internals/construction.py", line 666, in _extract_index
    raise ValueError("All arrays must be of the same length")

I get the same error when running cryodrgn filter on the outputs of a reconstruction done with --encode-mode=tilt using v3.3.3-b0. The cause is that the poses and CTFs for the dataset are stored on a per-tilt basis, whereas model results (such as latent space coordinates) are given on a per-particle basis, so the columns passed to pd.DataFrame in analysis.load_dataframe have different lengths.
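The length mismatch can be reproduced outside cryoDRGN with a minimal sketch. The column names and the sizes (3 particles, 4 tilts per particle) are made up for illustration and are not taken from the actual dataset or from `load_dataframe`:

```python
import numpy as np
import pandas as pd

n_particles = 3
tilts_per_particle = 4  # hypothetical; chosen only for illustration

# Particle-level model output: one latent coordinate per particle.
z = np.zeros(n_particles)
# Tilt-level dataset input: one defocus value per tilt image.
defocus = np.zeros(n_particles * tilts_per_particle)

try:
    # Columns of length 3 and 12 cannot share a single index.
    pd.DataFrame(data={"z0": z, "defocus": defocus})
    error_message = None
except ValueError as err:
    error_message = str(err)

print(error_message)
```

pandas refuses to build the frame because it cannot derive a single row index from columns of different lengths, which is the `ValueError` raised at the bottom of the traceback above.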

We may yet figure out a way to retrieve particle-level pose and CTF values for tilt series reconstruction experiments. For now, in ec16068 I am updating cryodrgn filter to use only covariates with particle-level values (UMAP, z-space, PCA), which runs successfully on tilt series experiments!
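As a rough illustration of that idea (not the code in ec16068), one way to restrict the plotting table to particle-level covariates is to drop any column whose length differs from the number of particles. The helper name `keep_particle_level` and the example values below are hypothetical:

```python
def keep_particle_level(covariates, n_particles):
    """Return only the covariates that have exactly one value per particle."""
    return {
        name: values
        for name, values in covariates.items()
        if len(values) == n_particles
    }


n_particles = 3
covariates = {
    "UMAP1": [0.1, 0.2, 0.3],  # particle-level
    "PC1": [1.0, 2.0, 3.0],    # particle-level
    "defocus": [0.0] * 12,     # tilt-level (4 tilts per particle), dropped
}

kept = keep_particle_level(covariates, n_particles)
print(sorted(kept))
```

The surviving columns all share the same length, so they can be combined into a single data frame without the error above. The trade-off is that pose and CTF values are not available as filtering covariates for tilt series experiments.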