PayneLab/cptac

ValueError: Length mismatch: Expected axis has 1 elements, new values have 3 elements

Closed this issue · 1 comment

This is with cptac version 1.5.11 and Python 3.11.7. I also tried cptac versions 1.5.10 and 1.5.8 and got the same error.

This error appears when I import cptac in a Jupyter notebook. It wasn't happening about a month ago. It looks like a simple error where an index.tsv file is being loaded incorrectly. I can't find this file in the repository, so the problem may be with a downloaded copy. Any ideas on how to resolve this?

What I ran:

import cptac

Error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[1], line 1
----> 1 import cptac
      2 import pandas as pd
      3 import numpy as np

File ~/miniforge3/envs/dl2/lib/python3.11/site-packages/cptac/__init__.py:83
     79     # options_df.loc[options_df.iloc[:2].str.contains('miRNA')] = 'miRNA' # condense all forms of micro RNA
     80     # options_df = options_df.unique().reset_index(drop=True)
     81     return options_df
---> 83 OPTIONS = _load_options()
     85 def list_datasets(*, condense_on = None, column_order = None, print_tree=False):
     86     """
     87     List all available datasets.
     88     
     89     :param condense_on (list): A list of column names. Values in selected columns will be aggregated into a list.
     90     :param print_tree (bool): If True, returns the database split in a pretty tree.
     91     """

File ~/miniforge3/envs/dl2/lib/python3.11/site-packages/cptac/__init__.py:77, in _load_options()
     75 """Load the tsv file with all the possible cancer, source, datatype combinations"""
     76 options_df = pd.DataFrame(INDEX['description'].str.split('-').tolist())
---> 77 options_df.columns = ['Source', 'Cancer', 'Datatype']
     78 options_df = options_df[['Cancer', 'Source', 'Datatype']]
     79 # options_df.loc[options_df.iloc[:2].str.contains('miRNA')] = 'miRNA' # condense all forms of micro RNA
     80 # options_df = options_df.unique().reset_index(drop=True)

File ~/miniforge3/envs/dl2/lib/python3.11/site-packages/pandas/core/generic.py:6218, in NDFrame.__setattr__(self, name, value)
   6216 try:
   6217     object.__getattribute__(self, name)
-> 6218     return object.__setattr__(self, name, value)
   6219 except AttributeError:
   6220     pass

File properties.pyx:69, in pandas._libs.properties.AxisProperty.__set__()

File ~/miniforge3/envs/dl2/lib/python3.11/site-packages/pandas/core/generic.py:767, in NDFrame._set_axis(self, axis, labels)
    762 """
    763 This is called from the cython code when we set the `index` attribute
    764 directly, e.g. `series.index = [1, 2, 3]`.
    765 """
    766 labels = ensure_index(labels)
--> 767 self._mgr.set_axis(axis, labels)
    768 self._clear_item_cache()

File ~/miniforge3/envs/dl2/lib/python3.11/site-packages/pandas/core/internals/managers.py:227, in BaseBlockManager.set_axis(self, axis, new_labels)
    225 def set_axis(self, axis: AxisInt, new_labels: Index) -> None:
    226     # Caller is responsible for ensuring we have an Index object.
--> 227     self._validate_set_axis(axis, new_labels)
    228     self.axes[axis] = new_labels

File ~/miniforge3/envs/dl2/lib/python3.11/site-packages/pandas/core/internals/base.py:85, in DataManager._validate_set_axis(self, axis, new_labels)
     82     pass
     84 elif new_len != old_len:
---> 85     raise ValueError(
     86         f"Length mismatch: Expected axis has {old_len} elements, new "
     87         f"values have {new_len} elements"
     88     )

ValueError: Length mismatch: Expected axis has 1 elements, new values have 3 elements
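
For anyone hitting the same message, here is a minimal sketch of how the two lines at `__init__.py:76-77` can produce it. It is not cptac's code: `split_descriptions` is a hypothetical helper and the description strings are made-up examples. A DataFrame built from a list of lists gets as many columns as the longest list, so "Expected axis has 1 elements" suggests that no entry in the `description` column contained a hyphen when it was split.

```python
import pandas as pd


def split_descriptions(descriptions):
    """Split 'Source-Cancer-Datatype' strings into three named columns.

    Hypothetical helper mirroring the two lines in the traceback, with an
    explicit shape check so malformed input fails with a readable message.
    """
    series = pd.Series(descriptions, dtype="object")
    parts = pd.DataFrame(series.str.split("-").tolist())
    if parts.shape[1] != 3:
        raise ValueError(
            f"expected 3 fields per description, got {parts.shape[1]} column(s)"
        )
    parts.columns = ["Source", "Cancer", "Datatype"]
    return parts[["Cancer", "Source", "Datatype"]]


# Made-up, well-formed input: three hyphen-separated fields per row.
good = split_descriptions(["srcA-cancerX-proteomics", "srcB-cancerY-miRNA"])

# Made-up, malformed input: no hyphens, so the split yields one column.
bad = pd.DataFrame(pd.Series(["not a description"]).str.split("-").tolist())
try:
    bad.columns = ["Source", "Cancer", "Datatype"]  # same assignment as line 77
except ValueError as err:
    message = str(err)  # pandas reports a length mismatch here
```

The shape check is only there to show where the failure comes from; whether the index really lost its hyphenated descriptions in this case is a guess based on the traceback.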

I created an entirely new environment and reinstalled cptac, and it seems to be working fine now. Previously I had only run pip uninstall cptac rather than starting entirely fresh. I suspect I inadvertently moved the index.tsv file at some point, hence the error. I'll go ahead and close this.
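
In case it helps someone check for a stale copy before rebuilding an environment, below is a rough sketch that looks for files named index.tsv under the installed package directory. It assumes the file lives somewhere inside the package directory, which I have not confirmed; `find_index_files` is a hypothetical helper, not part of cptac. It uses `importlib.util.find_spec`, which locates a top-level package without importing it, so it can run even when `import cptac` fails.

```python
from importlib.util import find_spec
from pathlib import Path


def find_index_files(package="cptac", filename="index.tsv"):
    """Return paths named `filename` under the installed package directory.

    Assumption: the file is stored inside the package directory. If it is
    kept elsewhere (for example in a user cache), this will find nothing.
    """
    spec = find_spec(package)  # does not import a top-level package
    if spec is None or not spec.submodule_search_locations:
        return []  # package not installed, or not a package directory
    found = []
    for location in spec.submodule_search_locations:
        found.extend(sorted(Path(location).rglob(filename)))
    return found


for path in find_index_files():
    print(path, path.stat().st_size, "bytes")
```

An empty result means either the package is not installed in the active environment or the file is kept somewhere else.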