tbepler/topaz

Preparing training data


Hi,

I wanted some clarification on creating the training data. My current directory layout is as follows:

raw/
     particles.txt
     images/
          a.mrc
          b.mrc
          c.mrc

My particles.txt file (tab-separated) looks like:

image_name    x_coord    y_coord
a.mrc         345        123
b.mrc         234        344
c.mrc         566        987

I am running training as follows:

topaz train --train-images <path>/raw/images/ \
            --train-targets <path>/raw/particles.txt \
            --save-prefix=saved_models/model \
            -o saved_models/model_tranining.txt \
            -n 400 --num-workers=8 --no-pretrained --image-ext .mrc

I am getting the following error:

WARNING: 5846 micrographs listed in the coordinates file are missing from the training images. Image names are listed below.
# Loaded 5846 training micrographs with 0 labeled particles
ERROR: no training particles specified. Check that micrograph names in the particles file match those in the micrographs file/directory.
Traceback (most recent call last):
  File "/vast/projects/miti2324/envs/topaz_env/bin/topaz", line 33, in <module>
    sys.exit(load_entry_point('topaz-em==0.2.5', 'console_scripts', 'topaz')())
  File "/vast/projects/miti2324/envs/topaz_env/lib/python3.6/site-packages/topaz/main.py", line 148, in main
    args.func(args)
  File "/vast/projects/miti2324/envs/topaz_env/lib/python3.6/site-packages/topaz/commands/train.py", line 641, in main
    image_ext=args.image_ext
  File "/vast/projects/miti2324/envs/topaz_env/lib/python3.6/site-packages/topaz/commands/train.py", line 272, in load_data
    raise Exception('No training particles.')
Exception: No training particles.

Can you tell me if what I'm doing is correct?

Can you create a version of particles.txt in which the image names don't include the file extension (for example, a instead of a.mrc) and give that a try? Also, it won't cause any issues with training, but your output file name has a typo (model_tranining.txt).
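
One way to produce such a file is sketched below. It assumes the layout shown in the question: a tab-separated file with a header row and the image name in the first column. The function name strip_image_extensions and the output file name used afterwards are hypothetical and not part of Topaz.

```python
import csv


def strip_image_extensions(src_path, dst_path, ext=".mrc"):
    """Copy a tab-separated coordinates file, dropping `ext` from the first column.

    Hypothetical helper, not part of Topaz. Assumes a header row followed by
    rows of the form: image_name <TAB> x_coord <TAB> y_coord.
    """
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.reader(src, delimiter="\t")
        writer = csv.writer(dst, delimiter="\t", lineterminator="\n")

        header = next(reader, None)
        if header is None:
            return  # empty input file, nothing to convert
        writer.writerow(header)  # keep the header row unchanged

        for row in reader:
            if not row:
                continue  # skip blank lines
            name = row[0]
            if name.endswith(ext):
                name = name[: -len(ext)]  # "a.mrc" becomes "a"
            writer.writerow([name] + row[1:])
```

Calling strip_image_extensions("raw/particles.txt", "raw/particles_noext.txt") and then passing the new file to --train-targets should make the names match, provided the only mismatch is the extension.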

It ran! Thanks a lot :)
Also, yes, that was a silly typo in the output file name, but it still worked.