'Species Segregator', or SpSeg, is a machine-learning tool for species-level segregation of camera-trap images originating from wildlife censuses and studies. SpSeg is currently trained specifically for the Central Indian landscape. The model is built as a second step to Microsoft's MegaDetector, which identifies animals, people and vehicles in these images. SpSeg reads the results of MegaDetector and classifies the animal images into species (or a defined biological taxonomic level).
The tool is currently under development, and instructions for installation and use on new data are not shared yet. One can use the current 'environment_multimodel.yml' to set up an Anaconda environment and find the models in the publicly shared SpSeg_Models Google Drive folder to run on a new dataset at their own risk. There is no need to set up a separate MegaDetector environment, as it is incorporated in the code here. However, the MegaDetector model v4.1.0 is required to obtain images with 'Animal' tags and their bounding boxes.
We further plan to 1) train a couple of EfficientNet models with PyTorch, 2) finalize the model based on the top-performing models in a multi-model approach, and 3) share the tools for practical use in camera-trap studies.
Models with different architectures were trained for 100 epochs each with the same training and test datasets. So far we have achieved the highest test accuracy with ResNet152v2 and InceptionResNetv2, at 89.2%.
Architecture | avg top-1 acc | Architecture | avg top-1 acc |
---|---|---|---|
Xception | 88.9% | | |
VGG16 | 3.4% | VGG19 | 3.3% |
ResNet50 | 88.5% | ResNet50v2 | 87.5% |
ResNet101 | 88.8% | ResNet101v2 | 89.1% |
ResNet152 | 82.0% | ResNet152v2 | 89.2% |
InceptionResNetv2 | 89.2% | | |
The training dataset includes 36 species commonly encountered in camera-trap surveys in the Eastern Vidarbha Landscape, Maharashtra, India:
Species | Scientific name | Image set | Species | Scientific name | Image set |
---|---|---|---|---|---|
00_barking_deer | Muntiacus muntjak | 7920 | 18_langur | Semnopithecus entellus | 12913 |
01_birds | Excluding fowls | 2005 | 19_leopard | Panthera pardus | 7449 |
02_buffalo | Bubalus bubalis | 7265 | 20_rhesus_macaque | Macaca mulatta | 5086 |
03_spotted_deer | Axis axis | 45790 | 21_nilgai | Boselaphus tragocamelus | 6864 |
04_four_horned_antelope | Tetracerus quadricornis | 6383 | 22_palm_squirrel | Funambulus palmarum & Funambulus pennantii | 1854 |
05_common_palm_civet | Paradoxurus hermaphroditus | 8571 | 23_indian_peafowl | Pavo cristatus | 10534 |
06_cow | Bos taurus | 7031 | 24_ratel | Mellivora capensis | 5751 |
07_dog | Canis lupus familiaris | 4150 | 25_rodents | Several mouse, rat, gerbil and vole species | 4992 |
08_gaur | Bos gaurus | 14646 | 26_mongooses | Urva edwardsii & Urva smithii | 5716 |
09_goat | Capra hircus | 3959 | 27_rusty_spotted_cat | Prionailurus rubiginosus | 1649 |
10_golden_jackal | Canis aureus | 2189 | 28_sambar | Rusa unicolor | 28040 |
11_hare | Lepus nigricollis | 8403 | 29_domestic_sheep | Ovis aries | 2891 |
12_striped_hyena | Hyaena hyaena | 2303 | 30_sloth_bear | Melursus ursinus | 6348 |
13_indian_fox | Vulpes bengalensis | 379 | 31_small_indian_civet | Viverricula indica | 4187 |
14_indian_pangolin | Manis crassicaudata | 1442 | 32_tiger | Panthera tigris | 9111 |
15_indian_porcupine | Hystrix indica | 5090 | 33_wild_boar | Sus scrofa | 18871 |
16_jungle_cat | Felis chaus | 4376 | 34_wild_dog | Cuon alpinus | 7743 |
17_jungle_fowls | Includes Gallus gallus, Gallus sonneratii & Galloperdix spadicea | 4760 | 35_indian_wolf | Canis lupus pallipes | 553 |
The SpSeg repository contains all the tools required to train and test the model. Run the code from the Tools directory in the repository.
Step 1: Run the MegaDetector model on the images to separate animal images. The latest model, v4.1, can be downloaded from here.
python run_tf_detector_batch.py path_to_model/md_v4.1.0.pb image_directory image_directory/output_file.json
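The JSON file written in Step 1 can be inspected before cropping. The sketch below assumes the commonly documented MegaDetector batch output layout: an `images` list whose `detections` entries carry `category`, `conf` and a normalized `bbox`, with category `"1"` meaning animal. The helper names `load_results` and `animal_detections` and the sample data are illustrative and not part of this repository.

```python
import json

# Assumption: MegaDetector batch output uses "1" as the animal category ID.
ANIMAL_CATEGORY = "1"


def load_results(path):
    """Read a MegaDetector results file such as image_directory/output_file.json."""
    with open(path) as f:
        return json.load(f)


def animal_detections(results, threshold=0.8):
    """Yield (file, conf, bbox) for animal detections at or above the threshold."""
    for image in results.get("images", []):
        # Images that failed to load may have no usable "detections" entry.
        for det in image.get("detections") or []:
            if det["category"] == ANIMAL_CATEGORY and det["conf"] >= threshold:
                yield image["file"], det["conf"], det["bbox"]


# Small in-memory stand-in for the contents of output_file.json.
sample = {
    "detection_categories": {"1": "animal", "2": "person", "3": "vehicle"},
    "images": [
        {"file": "a.jpg", "detections": [
            {"category": "1", "conf": 0.95, "bbox": [0.1, 0.2, 0.3, 0.4]},
            {"category": "2", "conf": 0.90, "bbox": [0.5, 0.5, 0.1, 0.2]},
        ]},
        {"file": "b.jpg", "detections": [
            {"category": "1", "conf": 0.40, "bbox": [0.0, 0.0, 0.5, 0.5]},
        ]},
    ],
}

hits = list(animal_detections(sample, threshold=0.8))
print(hits)
```

With a real file, `animal_detections(load_results("image_directory/output_file.json"))` would list the detections that pass the same 0.8 threshold used in Step 2.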
Step 2: Crop the animals into square images.
python crop_detections.py image_directory/output_file.json path_to_crops --images-dir image_directory --detector-version "4.0" --threshold 0.8 --logdir "." --threads 25 --square-crops
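To illustrate what a square crop involves, here is a minimal sketch that turns a normalized bounding box into a square pixel box centred on the detection. It assumes the box is given as `[x_min, y_min, width, height]` in relative coordinates; the function `square_crop_box` is hypothetical and is not the implementation used by `crop_detections.py`.

```python
def square_crop_box(bbox, img_w, img_h):
    """Return (left, top, right, bottom) in pixels for a square around bbox.

    bbox is [x_min, y_min, width, height], normalized to the range 0-1.
    The square is clipped to the image, so it can end up non-square when
    the animal is close to an image border.
    """
    x, y, w, h = bbox
    # Centre of the detection in pixel coordinates.
    cx = (x + w / 2) * img_w
    cy = (y + h / 2) * img_h
    # Side of the square is the longer side of the detection box.
    side = max(w * img_w, h * img_h)
    left = int(round(cx - side / 2))
    top = int(round(cy - side / 2))
    right = left + int(round(side))
    bottom = top + int(round(side))
    # Clip to the image bounds.
    return max(left, 0), max(top, 0), min(right, img_w), min(bottom, img_h)


box = square_crop_box([0.25, 0.25, 0.5, 0.25], img_w=400, img_h=400)
print(box)
```

The resulting tuple has the same order as the box argument of `PIL.Image.crop`. Padding a clipped crop back to a square would be a separate step.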
Step 3: Create a CSV file with paths to each image in the directory along with a numerical identifier of the species.
python csv_paths.py --image_folder path_to_crops --image_format jpg --output_csv ../paths/species_data.csv --net_type cnn
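One way such a CSV could be built is sketched below. It assumes that the crops are sorted into class folders named like the species table (for example `00_barking_deer`), so that the numeric prefix can serve as the identifier. The folder layout, the `path` and `label` column names, and the helper functions are assumptions for illustration; `csv_paths.py` may work differently.

```python
import csv
import os
import tempfile


def list_labeled_images(image_folder, image_format="jpg"):
    """Collect (path, label) pairs, taking the label from the folder prefix."""
    rows = []
    for class_dir in sorted(os.listdir(image_folder)):
        full_dir = os.path.join(image_folder, class_dir)
        prefix = class_dir.split("_", 1)[0]
        # Skip anything that is not a folder named "<number>_<species>".
        if not os.path.isdir(full_dir) or not prefix.isdigit():
            continue
        for name in sorted(os.listdir(full_dir)):
            if name.lower().endswith("." + image_format):
                rows.append((os.path.join(full_dir, name), int(prefix)))
    return rows


def write_csv(rows, output_csv):
    """Write the pairs with a header row (hypothetical column names)."""
    with open(output_csv, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["path", "label"])
        writer.writerows(rows)


# Demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as root:
    layout = {"00_barking_deer": ["a.jpg"], "32_tiger": ["b.jpg", "notes.txt"]}
    for folder, files in layout.items():
        os.makedirs(os.path.join(root, folder))
        for name in files:
            open(os.path.join(root, folder, name), "w").close()
    rows = list_labeled_images(root)
    labels = [label for _, label in rows]
    out = os.path.join(root, "species_data.csv")
    write_csv(rows, out)
    with open(out, newline="") as f:
        n_lines = len(f.read().splitlines())

print(labels, n_lines)
```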
Step 4: On Windows systems, file paths sometimes lack the required extension at the end ('file_01.' instead of 'file_01.jpg'). This step removes these paths from the data (use with caution).
python test_files.py --input_csv ../paths/species_data_test.csv --output_csv ../paths/species_data_cleaned.csv
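The check described in Step 4 can be pictured as a filter on the file extension. The sketch below is an assumption about the general idea, not the logic of `test_files.py`; the list of allowed extensions and the example paths are made up.

```python
import os


def has_extension(path, allowed=(".jpg", ".jpeg", ".png")):
    """True if the path ends in one of the allowed image extensions."""
    # os.path.splitext returns "." for a path with a bare trailing dot.
    return os.path.splitext(path)[1].lower() in allowed


paths = ["crops/file_01.", "crops/file_02.jpg", "crops/file_03.JPG", "crops/file_04"]
kept = [p for p in paths if has_extension(p)]
dropped = [p for p in paths if not has_extension(p)]
print(kept)
print(dropped)
```

Reviewing the dropped paths before discarding them is a simple way to honour the caution noted above.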
Step 5: Since the number of images varies across species classes, we restricted the sample size to a maximum of 5000 images per class by random undersampling.
python random_sampling.py --input_csv ../paths/species_data_test.csv --output_csv ../paths/species_data_usample.csv --sample_size 5000
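Random undersampling with a per-class cap can be sketched as follows. This is a minimal illustration that assumes the data are `(path, label)` pairs; the `undersample` function and the fixed seed are hypothetical choices, not taken from `random_sampling.py`.

```python
import random
from collections import defaultdict


def undersample(rows, sample_size, seed=0):
    """Keep at most sample_size randomly chosen rows for each label."""
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    by_label = defaultdict(list)
    for path, label in rows:
        by_label[label].append((path, label))
    sampled = []
    for label in sorted(by_label):
        items = by_label[label]
        # Classes at or below the cap are kept in full.
        if len(items) > sample_size:
            items = rng.sample(items, sample_size)
        sampled.extend(items)
    return sampled


# Toy data: 12 images of class 3 and 4 images of class 13, capped at 5.
rows = [(f"deer_{i}.jpg", 3) for i in range(12)]
rows += [(f"fox_{i}.jpg", 13) for i in range(4)]
sampled = undersample(rows, sample_size=5)
counts = {label: sum(1 for _, l in sampled if l == label) for label in (3, 13)}
print(counts)
```

With a cap of 5000, classes such as 13_indian_fox (379 images) would be kept whole, while 03_spotted_deer (45790 images) would be reduced to 5000.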
Step 6: Split the dataset into training, testing and validation sets. The validation set size is set equal to the testing set size.
python split_dataset.py --input_csv ../paths/species_data_usample.csv --output_dir ../paths/ --file_name species_data --test_per 0.15
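The arithmetic of this split is straightforward: with `--test_per 0.15`, 15% of the rows go to testing, another 15% to validation, and the remaining 70% to training. The sketch below shows one simple, non-stratified way to do this; `split_rows` is a hypothetical helper and `split_dataset.py` may split differently (for example per class).

```python
import random


def split_rows(rows, test_per=0.15, seed=0):
    """Shuffle rows and split them into train, validation and test lists."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_per))
    test = shuffled[:n_test]
    # Validation set has the same size as the test set.
    valid = shuffled[n_test:2 * n_test]
    train = shuffled[2 * n_test:]
    return train, valid, test


rows = [(f"img_{i}.jpg", i % 4) for i in range(100)]
train, valid, test = split_rows(rows, test_per=0.15)
print(len(train), len(valid), len(test))
```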
Step 7: Train CNN models from Keras Applications (https://keras.io/api/applications/).
python train_cnn.py --model model_name --train_csv ../paths/species_data_train.csv --valid_csv ../paths/species_data_valid.csv --batch_size 10 --num_classes 37 --epochs 100 --input_shape 224 224 3
Step 8: Calculate the accuracy of the CNN models.
python accuracy_cnn.py --model model_name --input_shape 224 224 3 --csv_paths ../paths/species_data_test.csv --weights ../trained_models/model_file.hdf5
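For reference, top-1 accuracy is the fraction of images whose highest-scoring class matches the true label. The sketch below computes it from per-class scores in plain Python; the function name and the toy scores are illustrative, and `accuracy_cnn.py` may report additional or differently averaged metrics.

```python
def top1_accuracy(probabilities, labels):
    """Fraction of samples whose highest-scoring class equals the label."""
    correct = 0
    for probs, label in zip(probabilities, labels):
        # Index of the largest score is the predicted class.
        predicted = max(range(len(probs)), key=probs.__getitem__)
        correct += int(predicted == label)
    return correct / len(labels)


# Toy example with three classes and four images.
probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.4, 0.3],
    [0.2, 0.2, 0.6],
]
labels = [0, 1, 2, 2]
acc = top1_accuracy(probs, labels)
print(acc)
```

Here three of the four predictions match their labels, so the function returns 0.75.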
Dr Bilal Habib's lab at the Wildlife Institute of India, Dehradun, India is a partner in the development, evaluation and use of the MegaDetector model. Development of SpSeg was supported by Microsoft AI for Earth (Grant ID: 00138001338).