dRep issue - DataFrame.pivot
Hi Silas, I'm not quite sure whether I should be raising this with you or on the drep page, but when running through the atlas pipeline I get stuck in the second step of dRep, which throws the error TypeError: DataFrame.pivot() takes 1 positional argument but 4 were given. I'm currently running this stage with
atlas run genomes
On the off chance it was an install error, I deleted and reran drep along the lines of your working suggestion in #547, but alas I got the same error. That issue was the closest similar error I was able to find. Any advice?
***************************************************
..:: dRep dereplicate Step 1. Filter ::..
***************************************************
Will filter the genome list
Loading genomes from a list
325 genomes were input to dRep
Calculating genome info of genomes
100.00% of genomes passed length filtering
100.00% of genomes passed checkM filtering
***************************************************
..:: dRep dereplicate Step 2. Cluster ::..
***************************************************
Running primary clustering
Running pair-wise MASH clustering
Traceback (most recent call last):
File "/p/work1/mkardish/subs/databases/conda_envs/e4ca0a910149c0c7b21c70f20a241e3d_/bin/dRep", line 32, in <module>
Controller().parseArguments(args)
File "/p/work1/mkardish/subs/databases/conda_envs/e4ca0a910149c0c7b21c70f20a241e3d_/lib/python3.10/site-packages/drep/controller.py", line 100, in parseArguments
self.dereplicate_operation(**vars(args))
File "/p/work1/mkardish/subs/databases/conda_envs/e4ca0a910149c0c7b21c70f20a241e3d_/lib/python3.10/site-packages/drep/controller.py", line 48, in dereplicate_operation
drep.d_workflows.dereplicate_wrapper(kwargs['work_directory'],**kwargs)
File "/p/work1/mkardish/subs/databases/conda_envs/e4ca0a910149c0c7b21c70f20a241e3d_/lib/python3.10/site-packages/drep/d_workflows.py", line 37, in dereplicate_wrapper
drep.d_cluster.controller.d_cluster_wrapper(wd, **kwargs)
File "/p/work1/mkardish/subs/databases/conda_envs/e4ca0a910149c0c7b21c70f20a241e3d_/lib/python3.10/site-packages/drep/d_cluster/controller.py", line 179, in d_cluster_wrapper
GenomeClusterController(workDirectory, **kwargs).main()
File "/p/work1/mkardish/subs/databases/conda_envs/e4ca0a910149c0c7b21c70f20a241e3d_/lib/python3.10/site-packages/drep/d_cluster/controller.py", line 32, in main
self.run_primary_clustering()
File "/p/work1/mkardish/subs/databases/conda_envs/e4ca0a910149c0c7b21c70f20a241e3d_/lib/python3.10/site-packages/drep/d_cluster/controller.py", line 100, in run_primary_clustering
Mdb, Cdb, cluster_ret = drep.d_cluster.compare_utils.all_vs_all_MASH(self.Bdb, self.wd.get_dir('MASH'), **self.kwargs)
File "/p/work1/mkardish/subs/databases/conda_envs/e4ca0a910149c0c7b21c70f20a241e3d_/lib/python3.10/site-packages/drep/d_cluster/compare_utils.py", line 115, in all_vs_all_MASH
Cdb, cluster_ret = cluster_mash_database(Mdb, **kwargs)
File "/p/work1/mkardish/subs/databases/conda_envs/e4ca0a910149c0c7b21c70f20a241e3d_/lib/python3.10/site-packages/drep/d_cluster/compare_utils.py", line 279, in cluster_mash_database
linkage_db = db.pivot("genome1","genome2","dist")
TypeError: DataFrame.pivot() takes 1 positional argument but 4 were given
New pandas version - new conflicts.
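I guess that you got the latest pandas version, 2.0, which breaks drep: the traceback shows drep calling DataFrame.pivot with positional arguments, and pandas 2.0 accepts index, columns and values only as keyword arguments.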
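To illustrate what goes wrong, here is a minimal sketch of the same pivot call written with keyword arguments, which should work on both pandas 1.x and 2.x. The toy table is made up for illustration; only the column names genome1, genome2 and dist come from the traceback above. This is not drep's actual code or an official patch.

```python
import pandas as pd

# Toy stand-in for the MASH distance table (values are invented);
# the column names match the ones in the traceback.
db = pd.DataFrame({
    "genome1": ["A", "A", "B", "B"],
    "genome2": ["A", "B", "A", "B"],
    "dist": [0.0, 0.1, 0.1, 0.0],
})

# Failing form under pandas 2.0:
#   db.pivot("genome1", "genome2", "dist")
# Keyword form, accepted by pandas 1.x and 2.x:
linkage_db = db.pivot(index="genome1", columns="genome2", values="dist")
print(linkage_db)
```

A change like this would have to be made inside drep itself, so pinning pandas is the practical workaround in the meantime.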
What you can do is activate the conda env:
conda activate /p/work1/mkardish/subs/databases/conda_envs/e4ca0a910149c0c7b21c70f20a241e3d_
Then check the version of pandas you have:
conda list pandas
If my assumption is correct, install an older version of pandas:
conda install pandas=1.5.1
That's the version I have.
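If you prefer to check from Python rather than with conda list, something like the following could be run inside the activated env. It is only a sketch: the helper name pandas_major is made up for this example, and it assumes the version string starts with the major number.

```python
import pandas as pd

def pandas_major(version: str) -> int:
    """Return the major version number from a string such as '1.5.1'."""
    return int(version.split(".")[0])

major = pandas_major(pd.__version__)
if major >= 2:
    # Positional arguments to DataFrame.pivot are rejected from 2.0 on
    print(f"pandas {pd.__version__}: positional pivot() calls will fail")
else:
    print(f"pandas {pd.__version__}: positional pivot() calls still work")
```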
I think it's also a good idea to raise the issue on the drep GitHub and link the two issues.
Awesome! That seemed to fix the conflict.
drep issue opened at: MrOlm/drep#189