open2c/bioframe

coverage crashes when `cols` is specified

Issue closed · 2 comments

This code fails

import bioframe as bf
import pandas as pd

df1 = pd.DataFrame(data={"A": ["chr1"], "B": [1], "C": [4]})
df2 = pd.DataFrame(data={"D": ["chr1"], "E": [2], "F": [5]})
bf.coverage(
    df1,
    df2,
    cols1=["A", "B", "C"],
    cols2=["D", "E", "F"],
)

with:

Traceback (most recent call last):
  File "/home/vscode/.local/lib/python3.11/site-packages/pandas/core/indexes/base.py", line 3790, in get_loc
    return self._engine.get_loc(casted_key)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "index.pyx", line 152, in pandas._libs.index.IndexEngine.get_loc
  File "index.pyx", line 181, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 7080, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 7088, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'overlap_end'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/workspaces/process-modeling/coverage.py", line 12, in <module>
    bf.coverage(
  File "/home/vscode/.local/lib/python3.11/site-packages/bioframe/ops.py", line 884, in coverage
    df_overlap["overlap"] = df_overlap["overlap_end"] - df_overlap["overlap_start"]
                            ~~~~~~~~~~^^^^^^^^^^^^^^^
  File "/home/vscode/.local/lib/python3.11/site-packages/pandas/core/frame.py", line 3896, in __getitem__
    indexer = self.columns.get_loc(key)
              ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/vscode/.local/lib/python3.11/site-packages/pandas/core/indexes/base.py", line 3797, in get_loc
    raise KeyError(key) from err
KeyError: 'overlap_end'

while this works as expected:

df1 = pd.DataFrame(data={"chrom": ["chr1"], "start": [1], "end": [4]})
df2 = pd.DataFrame(data={"chrom": ["chr1"], "start": [2], "end": [5]})

bf.coverage(df1, df2)

Am I using it wrong or is this a bug?

Setup:

  • bioframe 0.5.0
  • pandas 2.1.1
  • python 3.11.6

Looks like a bug where a string is not getting properly renamed!

Will try to address soon.

Should be closed by #170.
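
For anyone still on an affected version, one possible workaround (a sketch, not part of the fix in #170) is to rename the custom columns to the default `chrom`/`start`/`end` names before calling `coverage`, since the report shows that call working. The helper `to_default_cols` is a hypothetical name introduced here for illustration; it uses only pandas.

```python
import pandas as pd

# Default interval column names, as used in the working example from the report.
DEFAULT_COLS = ("chrom", "start", "end")


def to_default_cols(df, cols):
    """Return a copy of df with the three interval columns renamed to the defaults.

    Hypothetical helper for this workaround; `cols` lists the chromosome,
    start, and end columns in that order.
    """
    return df.rename(columns=dict(zip(cols, DEFAULT_COLS)))


df1 = pd.DataFrame(data={"A": ["chr1"], "B": [1], "C": [4]})
df2 = pd.DataFrame(data={"D": ["chr1"], "E": [2], "F": [5]})

# rename() returns new frames, so df1 and df2 keep their original column names.
df1_std = to_default_cols(df1, ["A", "B", "C"])
df2_std = to_default_cols(df2, ["D", "E", "F"])
```

The renamed frames can then be passed as `bf.coverage(df1_std, df2_std)` without `cols1`/`cols2`, which is the form of the call reported above as working. The result will use the default column names, so rename them back afterwards if the original names matter downstream.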