make_chromarms - broken, or changed?
Closed this issue · 3 comments
Phlya commented
This code from our open2c_examples notebooks doesn't work:
# Use bioframe to fetch the genomic features from the UCSC.
hg38_chromsizes = bioframe.fetch_chromsizes('hg38')
hg38_cens = bioframe.fetch_centromeres('hg38')
# create a view with chromosome arms using chromosome sizes and definition of centromeres
hg38_arms = bioframe.make_chromarms(hg38_chromsizes, hg38_cens)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
/tungstenfs/scratch/ggiorget/ilya/Projects/open2c_examples/contacts_vs_distance.ipynb Cell 8 line 5
hg38_cens = bioframe.fetch_centromeres('hg38')
# create a view with chromosome arms using chromosome sizes and definition of centromeres
----> hg38_arms = bioframe.make_chromarms(hg38_chromsizes, hg38_cens)
# select only those chromosomes available in cooler
hg38_arms = hg38_arms[hg38_arms.chrom.isin(clr.chromnames)].reset_index(drop=True)
File /tungstenfs/scratch/ggiorget/ilya/condaenvs/open2c/lib/python3.9/site-packages/bioframe/extras.py:72, in make_chromarms(chromsizes, midpoints, cols_chroms, cols_mids, suffixes)
69 raise ValueError("unknown input type for chromsizes")
71 if len(cols_chroms) == 2:
---> 72 _verify_columns(df_chroms, [ck1, sk1])
73 columns_to_drop += [sk1]
74 df_chroms["end"] = df_chroms[sk1].values
File /tungstenfs/scratch/ggiorget/ilya/condaenvs/open2c/lib/python3.9/site-packages/bioframe/core/specs.py:89, in _verify_columns(df, colnames, unique_cols, return_as_bool)
87 if return_as_bool:
88 return False
---> 89 raise ValueError(
90 ", ".join(set(colnames).difference(set(df.columns)))
91 + " not in keys of df.columns"
92 )
93 if return_as_bool:
94 return True
ValueError: chrom not in keys of df.columns
Phlya commented
@nvictus the issue is that the 'local' provider returns a different Series than 'ucsc':
For some reason the highlighted "name" index name breaks make_chromarms
... at least using the 'ucsc' provider fixes the problem.
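A minimal pandas-only sketch of how a named index could produce this error. It assumes (the traceback does not confirm this) that the function turns the Series into a DataFrame with `reset_index()` and then renames the default `"index"` column to `"chrom"`. The toy Series and its `"name"` index name only mimic what is described above for the 'local' provider.

```python
import pandas as pd

# Toy chromsizes Series. The index name "name" mimics the 'local' provider
# output described in this thread; it is an assumption, not fetched data.
named = pd.Series({"chr1": 248956422, "chr2": 242193529}, name="length")
named.index.name = "name"

# reset_index() names the new column after the index name ("name"),
# so renaming "index" -> "chrom" finds nothing to rename.
broken = named.reset_index().rename(columns={"index": "chrom"})
print(list(broken.columns))

# Clearing the index name first restores the default "index" column.
fixed = named.rename_axis(None).reset_index().rename(columns={"index": "chrom"})
print(list(fixed.columns))
```

If that assumption holds, calling `rename_axis(None)` on the chromsizes Series before passing it in may be another workaround on affected versions, besides switching to the 'ucsc' provider.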
nvictus commented
The input behavior for this function is kind of inconsistent. A Series should be treated like a dictionary (index names should be ignored), and dicts should be accepted as input for both chromsizes and midpoints.
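A rough sketch of what "treat a Series like a dictionary" could look like. `to_chrom_table` is a hypothetical helper written for illustration, not bioframe's actual code, and the midpoint value below is a toy number.

```python
import pandas as pd


def to_chrom_table(obj, value_col):
    """Hypothetical helper: coerce a dict, Series, or DataFrame to a table.

    Dicts and Series become a two-column DataFrame ("chrom", value_col);
    Series names and index names are ignored. DataFrames pass through.
    """
    if isinstance(obj, pd.DataFrame):
        return obj.reset_index(drop=True)
    if isinstance(obj, pd.Series):
        obj = obj.to_dict()  # keeps only index labels and values
    if isinstance(obj, dict):
        return pd.DataFrame(
            {"chrom": list(obj.keys()), value_col: list(obj.values())}
        )
    raise ValueError(f"unknown input type: {type(obj).__name__}")


# A Series with a named index and a plain dict are handled the same way.
sizes = pd.Series({"chr1": 248956422}, name="length")
sizes.index.name = "name"
sizes_table = to_chrom_table(sizes, "length")
mids_table = to_chrom_table({"chr1": 1000}, "mid")  # toy midpoint value
print(list(sizes_table.columns), list(mids_table.columns))
```

Going through `to_dict()` discards the index name entirely, so the result no longer depends on which provider produced the Series.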
Meant to create a PR in the vscode web UI, but it ended up pushing to main by accident: 3d2f347
Please review and make sure it works for you.
Phlya commented
Thank you, this fixed the problem!