leylabmpi/DeepMAsED

Initial usage help

Closed this issue · 5 comments

wwood commented

Hi,

I'm looking to predict misassemblies in my MAGs, but struggling to work out exactly what needs to be done. I suppose I should use DeepMAsED predict, but it is not clear to me what feature tables are required, or how to generate these tables. Is there a worked example from assembly+reads available?

Also, minor thing, I noticed this grammatical error in the description:

DESCRIPTION:
    Predicting misassemblies by used a model

Thanks, ben

You're completely right. The docs are pretty bad. I've updated them, so let me know if that helps. Note that currently you need your feature files in a specific directory structure (see the updated docs), which is a bit of a pain. I plan on changing that in the next few days.

@wwood I've updated the UI and docs some more. Please feel free to reopen this issue if you are still having problems.

wwood commented

Hi,
Thanks for working on this. Much improved.

I cannot reopen this issue since you closed it and I'm a nobody here, but that's OK.

I ran into this while running through the test data:

[Sun Mar 15 08:08:31 2020]
Finished job 36.
1 of 80 steps (1%) done
Waiting at most 5 seconds for missing files.
MissingOutputException in line 8 of /srv/projects2/wierdbin1/ben/41_deepmased/DeepMAsED/DeepMAsED-SM/bin/MGSIM/Snakefile:
Missing files after 5 seconds:
tests/output_n10/genomes/Escherichia_coli_O104_H4.fna
tests/output_n10/genomes/Clostridium_perfringens_ATCC_13124.fna
tests/output_n10/genomes/Clostridium_botulinum_A_str_Hall.fna
tests/output_n10/genomes/Methanosarcina_barkeri_MS.fna
tests/output_n10/genomes/Methanobrevibacter_smithii.fna
tests/output_n10/genomes/Lactobacillus_acidophilus.fna
tests/output_n10/genomes/Bifidobacterium_breve.fna
tests/output_n10/genomes/Christensenella_massiliensis_P2438.fna
tests/output_n10/genomes/Shigella_flexneri_2a_str301.fna
This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait.

I guess those files are missing from the git repo.
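One way to tell filesystem latency apart from files that were never produced is to recheck the paths from the error message a little later. Below is a minimal sketch in Python, assuming it is run from the same working directory as snakemake. `missing_outputs` is a hypothetical helper written for this illustration, not part of DeepMAsED.

```python
from pathlib import Path

# File names copied from the MissingOutputException message above.
EXPECTED_GENOMES = [
    "Escherichia_coli_O104_H4.fna",
    "Clostridium_perfringens_ATCC_13124.fna",
    "Clostridium_botulinum_A_str_Hall.fna",
    "Methanosarcina_barkeri_MS.fna",
    "Methanobrevibacter_smithii.fna",
    "Lactobacillus_acidophilus.fna",
    "Bifidobacterium_breve.fna",
    "Christensenella_massiliensis_P2438.fna",
    "Shigella_flexneri_2a_str301.fna",
]


def missing_outputs(genome_dir, names=EXPECTED_GENOMES):
    """Return the expected file names that are absent or empty in genome_dir.

    Hypothetical helper for illustration only.
    """
    genome_dir = Path(genome_dir)
    missing = []
    for name in names:
        path = genome_dir / name
        # Treat zero-byte files as missing too, since an empty FASTA is not usable.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Directory taken from the paths in the log above.
    for name in missing_outputs("tests/output_n10/genomes"):
        print("missing:", name)
```

If the files are still reported as missing well after the job has finished, latency is an unlikely cause and the inputs are probably absent from the checkout.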

It'd be quite helpful if there was an actual worked example with code that could be copy/pasted, including an install procedure. For instance "use DeepMAsED-SM" is a little vague. I guess you mean clone the repo, use the conda env from the .travis file, cd into the DeepMAsED directory and then run snakemake?

Thanks.

OK, it took all day, but I did update the code to provide a full working example for creating a feature table (see the script: section of the .travis.yml file). After you create the feature tables + "feature_table_file", running DeepMAsED predict is straightforward.

Note: I'm probably going to create a new subcommand, DeepMAsED features, to create the feature table(s) directly via the DeepMAsED package code (if you don't want to use snakemake).

I've added DeepMAsED features and updated the docs. Open a new issue if you are still running into problems.