zktuong/dandelion

Singularity Container Preprocessing Error


Description of the bug

Hi Zewen,
Great package, and I love the container you made! I was trying to do the preprocessing manually but had some issues with file paths, so I decided to use your container and its preprocessing function. However, I think there was an issue with how I labeled the individual column. Stupidly, I didn't think to make the values start with an alphabetical character, and I think the container read the column in as int64 (if I am understanding this correctly). Maybe a type assignment when importing in preprocessing.py would help? I will try again with the individual changed to ms'3'.
Thanks and let me know if you need anything else!

Minimal reproducible example

sample,prefix,individual
3s_bcr,3s,3
4s_bcr,4s,4
5s_bcr,5s,5
6s_bcr,6s,6
7s_bcr,7s,7
8s_bcr,8s,8
3b_bcr,3b,3
4b_bcr,4b,4
5b_bcr,5b,5
6b_bcr,6b,6
7b_bcr,7b,7
8b_bcr,8b,8

apptainer run -B $PWD ~/kt16_default_sc-dandelion.sif dandelion-preprocess --org=mouse --filter_to_high_confidence --meta ./sample_info.csv

The error message produced by the code above

Traceback (most recent call last):
  File "/share/dandelion_preprocess.py", line 378, in <module>
    main()
  File "/share/dandelion_preprocess.py", line 288, in main
    ddl.pp.reassign_alleles(
  File "/opt/conda/envs/sc-dandelion-container/lib/python3.11/site-packages/dandelion/preprocessing/_preprocessing.py", line 1439, in reassign_alleles
    out_dir = Path(combined_folder)
              ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/sc-dandelion-container/lib/python3.11/pathlib.py", line 871, in __new__
    self = cls._from_parts(args)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/sc-dandelion-container/lib/python3.11/pathlib.py", line 509, in _from_parts
    drv, root, parts = self._parse_args(args)
                       ^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/sc-dandelion-container/lib/python3.11/pathlib.py", line 493, in _parse_args
    a = os.fspath(a)
        ^^^^^^^^^^^^
TypeError: expected str, bytes or os.PathLike object, not int64

OS information

MacOS container most recent

Version information

command line

Additional context

No response

Hi @bpr4242, thanks! Yes, you are right: it's a problem with the way numbers are interpreted by default by pandas.read_csv. I would normally never name files/folders as numbers, as it causes issues like this. So yes, changing to an actual string should work.
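
For anyone hitting the same error, here is a minimal sketch of the behaviour described above, using a shortened copy of the sample sheet from this issue. It only illustrates how pandas.read_csv infers an integer dtype for an all-digit column and how passing dtype for that column keeps the values as strings. It is not the change made in dandelion itself, and the variable names are made up for illustration.

```python
import io
from pathlib import Path

import pandas as pd

# Shortened copy of the sample sheet from this issue (assumption: two rows are
# enough to show the effect).
CSV_TEXT = """sample,prefix,individual
3s_bcr,3s,3
4s_bcr,4s,4
"""

# Default parsing: the all-digit "individual" column is inferred as an integer dtype.
inferred = pd.read_csv(io.StringIO(CSV_TEXT))
print(inferred["individual"].dtype)

# Passing one of those integer values to Path raises a TypeError,
# as in the traceback above.
try:
    Path(inferred["individual"][0])
    default_failed = False
except TypeError as err:
    default_failed = True
    print(f"TypeError: {err}")

# Asking pandas to read the column as text keeps "3" as a string,
# which Path accepts.
as_text = pd.read_csv(io.StringIO(CSV_TEXT), dtype={"individual": str})
out_dir = Path(as_text["individual"][0])
print(out_dir)
```

Until a fix is released, renaming the individuals so they are not purely numeric (as suggested in the original post) avoids the problem without touching any code.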

Potentially solved with #403. Will be implemented in a new release shortly.