gem-pasteur/macsyfinder

How can I get the models for MacSyFinder v2.0?

Closed this issue · 9 comments

Hi, your achievement is very admirable, but I still have some problems:
1. Do I have to adapt the models from MacSyFinder v1 myself on my Linux machine?
2. Is there any way to get the models for MacSyFinder v2?
3. I checked the "macsydata" command, but I still don't know how to use it.

Hi !

We are currently preparing a new candidate release of MacSyFinder v2 (including improved documentation for modellers), together with updated models for protein secretion systems (TXSScan) and the type IV filament super-family (TFF-SF).
This release should be out in the next few days.

Which models are you currently using?
For now, you can update your favorite models following the Documentation here.
We'll gradually update all published models for the V2 version.

Hope this helps! I'll post updates here on how to install models from our repo once they are online.

Kind regards

To follow up on this, we've just released models updated for V2 of MacSyFinder (for now, only TXSScan and TFF-SF; see my previous post). They should now be accessible via:

macsydata available
to list available models

macsydata install TXSS
to install the models for protein secretion systems.

Then:
macsyfinder --db-type ordered_replicon --sequence-db myproteins.fasta --models TXSS all
to run a MacSyFinder search for all protein secretion systems in the TXSS package on your favorite organism's genome (that is, the protein-sequence version of it).
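
For anyone who wants to script these three steps, here is a minimal Python sketch. It only assembles the command lines quoted above; the helper names `build_commands` and `run_all` are hypothetical and not part of MacSyFinder, and `myproteins.fasta` is the placeholder file name from the example.

```python
import shlex
import subprocess


def build_commands(package="TXSS", fasta="myproteins.fasta"):
    """Return the three command lines from this post as argv lists.

    Only the sub-commands and options quoted in the thread are used.
    """
    return [
        ["macsydata", "available"],        # list available model packages
        ["macsydata", "install", package],  # install one package
        ["macsyfinder",
         "--db-type", "ordered_replicon",
         "--sequence-db", fasta,
         "--models", package, "all"],       # search all systems of the package
    ]


def run_all(commands):
    """Run each command in order and stop at the first failure.

    Assumes macsydata and macsyfinder are installed and on the PATH.
    """
    for argv in commands:
        subprocess.run(argv, check=True)


# Print the commands instead of running them, so the sketch is safe to try.
for argv in build_commands():
    print(shlex.join(argv))
```

Calling `run_all(build_commands())` would execute the steps for real; `check=True` makes a failing step raise instead of silently continuing.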

We should be able to release more updated models in the coming weeks!

Best,

Sophie

Great! Many thanks to you!!

Hi!
I am wondering whether the models subfolders should be renamed, or whether there is just a help page update to do.
Indeed, in the help it is written:

  --models-dir MODELS_DIR
                        Specifies the path to the models if the models are not installed in the canonical place.
                        It gathers definitions (xml files) and HMM profiles arranged in a specific
                        file structure. A directory with the name of the model with at least two directories
                        	'profiles' - which contains HMM profiles for each gene components described in the systems' models
                        	'models' - which contains either the XML files of models' definitions or subdirectories
                        to organize the models in subsystems.

which suggests that a set of models should have two subfolders, profiles and models,
but in the latest installed package, the XML files are in a definitions subfolder.

$ ls /home/flejzerowicz/usr/miniconda3/envs/macsyfinder/share/macsyfinder/data/models/TXSS
definitions  LICENSE  metadata.yml  profiles  README.md

<-- no TXSS/models folder in there...

Also, I noted that in the previous version, this folder was named DEF. If I want to use the previous HMMs, shall I rename the DEF subfolder to definitions, or to models?
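
To check an older package against the layout in the listing above, a small Python sketch like this could help. It assumes the installed TXSS listing reflects what V2 expects: `definitions` and `profiles` directories plus a `metadata.yml` file. Whether `metadata.yml` is strictly required is my assumption, and `missing_entries` is a hypothetical helper, not part of MacSyFinder.

```python
from pathlib import Path

# Entries taken from the `ls` output of the installed TXSS package above.
# Treating metadata.yml as required is an assumption.
EXPECTED_DIRS = ("definitions", "profiles")
EXPECTED_FILES = ("metadata.yml",)


def missing_entries(package_dir):
    """Return the expected sub-directories and files absent from package_dir."""
    root = Path(package_dir)
    missing = [name for name in EXPECTED_DIRS if not (root / name).is_dir()]
    missing += [name for name in EXPECTED_FILES if not (root / name).is_file()]
    return missing
```

For a package that still uses the old DEF folder, `missing_entries` would report `definitions` as missing, which points at the folder to rename.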

I just wanted to be sure that the tool will find the info it needs, and that all its capabilities are used!
Thanks.

Hi,

Yes, sorry if it was not clear (I'll try and see where potentially misleading messages remain in the documentation): in the latest version V2, the XML models should be placed in the "definitions" folder (they can also be placed in sub-folders within), and all HMM profiles should be placed in the folder "profiles".

You can find the full description of the new data structure here: https://macsyfinder.readthedocs.io/en/latest/modeler_guide/package.html

The macsydata tool can also help you "fix" your package in case you want to share it with colleagues, or even contribute to the macsy-models repository. See here for more details on how to share your macsy-models: https://macsyfinder.readthedocs.io/en/latest/modeler_guide/publish_package.html

I hope this helps! Let me know if you need more details.

Best,

Sophie

Hello,
MacSyFinder has been very helpful in my research, thank you very much for your work.

I was able to install MacSyFinder V2 on another computer, but I ran into a problem downloading the protein models with the command macsydata install TXSS. Can you give me a hand?

(macsyfinder2.0) lkj666@Cool:~/3_software/macsyfinder-master$ macsydata install TXSS
Downloading TXSS (1.0rc1).
Extracting TXSS (1.0rc1).
Traceback (most recent call last):
  File "/home/lkj666/miniconda3/envs/macsyfinder2.0/bin/macsydata", line 8, in <module>
    sys.exit(main())
  File "/home/lkj666/miniconda3/envs/macsyfinder2.0/lib/python3.7/site-packages/macsypy/scripts/macsydata.py", line 938, in main
    parsed_args.func(parsed_args)
  File "/home/lkj666/miniconda3/envs/macsyfinder2.0/lib/python3.7/site-packages/macsypy/scripts/macsydata.py", line 348, in do_install
    dest = config.models_dir()[0]
IndexError: list index out of range

Can you give me the version of MacSyFinder you use (macsyfinder --version)?

I'm not 100% sure, but I think I understand what's happening:
you don't have any directory where MacSyFinder looks for the models.
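
To illustrate the failure mode, here is a minimal Python sketch. It is not the macsypy code: `first_models_dir` is a hypothetical helper, and the fallback path is simply the home-directory location used in this thread. It shows why taking element `[0]` of an empty list raises the `IndexError` from the traceback, and one way such a lookup could be guarded.

```python
from pathlib import Path


def first_models_dir(models_dirs, fallback="~/.macsyfinder/models"):
    """Return the first known models directory, or a fallback if the list is empty.

    Hypothetical helper: the real lookup in macsypy may differ.
    """
    if not models_dirs:
        return Path(fallback).expanduser()
    return Path(models_dirs[0])


# Indexing an empty list reproduces the error message seen in the traceback.
try:
    [][0]
except IndexError as err:
    message = str(err)

print(message)  # list index out of range
```

With a guard like this, an empty list of models directories leads to a usable default instead of a crash.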

As a workaround (while I fix this), I suggest you install the TXSS data in your home directory
with the following command:
macsydata install --models_dir ~/.macsyfinder/models TXSS
It should create the directory ~/.macsyfinder/models
and install TXSS in it,
and macsyfinder should then look in this directory to find TXSS.
Check whether the installation succeeded with:
macsyfinder --list-models

The bug is fixed in the dev version.
The workaround works; closing the issue.