Tool for species-level segregation of camera-trap images. The 'Species Segregator' is currently trained for the Central Indian landscape.

SpSeg

'Species Segregator', or SpSeg, is a machine-learning tool for species-level segregation of camera-trap images originating from wildlife censuses and studies. SpSeg is currently trained specifically for the Central Indian landscape. The model is built as a second step to Microsoft's MegaDetector, which identifies animals, people and vehicles in these images. SpSeg reads the results of MegaDetector and classifies the animal images into species (or a defined biological taxonomic level).

Kindly note:

The tool is currently under development, and instructions for installation and use on new data have not been shared yet. One can use the current 'environment_multimodel.yml' to set up an Anaconda environment and find the models in the publicly shared SpSeg_Models Google Drive folder to run on a new dataset at their own risk. There is no need to set up a separate MegaDetector environment, as it is incorporated in the code here. However, MegaDetector model v4.1.0 is required to obtain images with 'Animal' tags and their bounding boxes.

We further plan to 1) train a couple of EfficientNet models with PyTorch, 2) finalize the model based on the top-performing models in a multi-model approach, and 3) share tools for practical use in camera-trap studies.

Results of initial trained models

Models with different architectures were each trained for 100 epochs on the same training and test datasets. So far, the highest test accuracy, 89.2%, has been achieved by ResNet152v2 and InceptionResNetv2.

| Architecture | avg top-1 acc |
| --- | --- |
| Xception | 88.9% |
| VGG16 | 3.4% |
| VGG19 | 3.3% |
| ResNet50 | 88.5% |
| ResNet50v2 | 87.5% |
| ResNet101 | 88.8% |
| ResNet101v2 | 89.1% |
| ResNet152 | 82.0% |
| ResNet152v2 | 89.2% |
| InceptionResNetv2 | 89.2% |

Training data

The training dataset includes 36 species commonly encountered in camera-trap surveys in the Eastern Vidarbha Landscape, Maharashtra, India:

| Species | Scientific name | Image set | Species | Scientific name | Image set |
| --- | --- | --- | --- | --- | --- |
| 00_barking_deer | Muntiacus muntjak | 7920 | 18_langur | Semnopithecus entellus | 12913 |
| 01_birds | Excluding fowls | 2005 | 19_leopard | Panthera pardus | 7449 |
| 02_buffalo | Bubalus bubalis | 7265 | 20_rhesus_macaque | Macaca mulatta | 5086 |
| 03_spotted_deer | Axis axis | 45790 | 21_nilgai | Boselaphus tragocamelus | 6864 |
| 04_four_horned_antelope | Tetracerus quadricornis | 6383 | 22_palm_squirrel | Funambulus palmarum & Funambulus pennantii | 1854 |
| 05_common_palm_civet | Paradoxurus hermaphroditus | 8571 | 23_indian_peafowl | Pavo cristatus | 10534 |
| 06_cow | Bos taurus | 7031 | 24_ratel | Mellivora capensis | 5751 |
| 07_dog | Canis lupus familiaris | 4150 | 25_rodents | Several mouse, rat, gerbil and vole species | 4992 |
| 08_gaur | Bos gaurus | 14646 | 26_mongooses | Urva edwardsii & Urva smithii | 5716 |
| 09_goat | Capra hircus | 3959 | 27_rusty_spotted_cat | Prionailurus rubiginosus | 1649 |
| 10_golden_jackal | Canis aureus | 2189 | 28_sambar | Rusa unicolor | 28040 |
| 11_hare | Lepus nigricollis | 8403 | 29_domestic_sheep | Ovis aries | 2891 |
| 12_striped_hyena | Hyaena hyaena | 2303 | 30_sloth_bear | Melursus ursinus | 6348 |
| 13_indian_fox | Vulpes bengalensis | 379 | 31_small_indian_civet | Viverricula indica | 4187 |
| 14_indian_pangolin | Manis crassicaudata | 1442 | 32_tiger | Panthera tigris | 9111 |
| 15_indian_porcupine | Hystrix indica | 5090 | 33_wild_boar | Sus scrofa | 18871 |
| 16_jungle_cat | Felis chaus | 4376 | 34_wild_dog | Cuon alpinus | 7743 |
| 17_jungle_fowls | Includes Gallus gallus, Gallus sonneratii & Galloperdix spadicea | 4760 | 35_indian_wolf | Canis lupus pallipes | 553 |

Training pipeline

The SpSeg repository contains all the tools required to train and test the model. Run the scripts from the Tools directory of the repository.

Step 1: Run the MegaDetector model on the images to separate out the animal images. The latest model (v4.1) can be downloaded from here.

python run_tf_detector_batch.py path_to_model/md_v4.1.0.pb image_directory image_directory/output_file.json
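
The JSON file written by this step can be inspected before cropping. The sketch below is not part of the SpSeg tools. It assumes the usual MegaDetector batch output layout (an `images` list whose entries carry `detections`, each with a `category`, a `conf` score and a normalized `bbox`, where category `"1"` means animal). The helper name `animal_detections` and the default threshold are illustrative assumptions.

```python
import json

ANIMAL_CATEGORY = "1"  # assumed MegaDetector category id for 'animal'


def animal_detections(results, threshold=0.8):
    """Yield (file, conf, bbox) for animal detections at or above threshold."""
    for image in results.get("images", []):
        # Images that failed to load may have no usable 'detections' entry.
        for det in image.get("detections") or []:
            if det["category"] == ANIMAL_CATEGORY and det["conf"] >= threshold:
                yield image["file"], det["conf"], det["bbox"]


def load_results(json_path):
    """Read a MegaDetector output file such as image_directory/output_file.json."""
    with open(json_path) as handle:
        return json.load(handle)
```

Calling `list(animal_detections(load_results("image_directory/output_file.json")))` would then give the detections that a confidence threshold of 0.8 keeps.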

Step 2: Crop the animals into square images.

python crop_detections.py image_directory/output_file.json path_to_crops --images-dir image_directory --detector-version "4.0" --threshold 0.8 --logdir "." --threads 25 --square-crops
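
To illustrate what a square crop involves, here is a minimal sketch that turns a normalized `[x, y, width, height]` bounding box into a square pixel box. It is a simplification and not the logic of `crop_detections.py`: the function name `square_crop_box` is hypothetical, and the choice to shift the square back inside the image (rather than pad the image) is an assumption made to keep the example short.

```python
def square_crop_box(bbox, img_w, img_h):
    """Return (left, top, right, bottom) pixel coordinates of a square around bbox.

    bbox is [x, y, width, height], normalized to the 0-1 range.
    """
    x, y, w, h = bbox
    # Convert the normalized box to pixel sizes and find its centre.
    box_w, box_h = w * img_w, h * img_h
    cx, cy = (x + w / 2) * img_w, (y + h / 2) * img_h
    # Square side: the longer box edge, but never larger than the image.
    side = min(max(box_w, box_h), img_w, img_h)
    # Centre the square on the box, then shift it back inside the image.
    left = min(max(cx - side / 2, 0), img_w - side)
    top = min(max(cy - side / 2, 0), img_h - side)
    # Truncate to whole pixels so the square never exceeds the image bounds.
    left, top, side = int(left), int(top), int(side)
    return left, top, left + side, top + side
```

The returned tuple follows the (left, upper, right, lower) order that Pillow's `Image.crop` accepts.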

Step 3: Create a CSV file with the path to each image in the directory along with a numerical identifier of the species.

python csv_paths.py --image_folder path_to_crops --image_format jpg --output_csv ../paths/species_data.csv --net_type cnn
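
A rough sketch of how such a CSV could be built is shown below. It rests on assumptions that the source does not confirm: that crops sit in one sub-folder per class named like `00_barking_deer`, that the numerical identifier is the folder's numeric prefix, and that the columns are called `path` and `label`. The function names are hypothetical and this is not the code of `csv_paths.py`.

```python
import csv
from pathlib import Path


def list_labelled_images(image_folder, image_format="jpg"):
    """Return sorted (path, label) rows; label is the numeric folder prefix."""
    rows = []
    for path in sorted(Path(image_folder).rglob(f"*.{image_format}")):
        prefix = path.parent.name.split("_", 1)[0]
        if prefix.isdigit():  # skip folders that do not follow the NN_name pattern
            rows.append((str(path), int(prefix)))
    return rows


def write_csv(rows, output_csv):
    """Write the rows with a header line (column names are an assumption)."""
    with open(output_csv, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["path", "label"])
        writer.writerows(rows)
```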

Step 4: On Windows systems, file paths sometimes lack the required extension at the end (‘file_01.’ instead of ‘file_01.jpg’). This step removes such paths from the data (it should be used cautiously).

python test_files.py --input_csv ../paths/species_data_test.csv --output_csv ../paths/species_data_cleaned.csv
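
The check described in this step can be sketched as a simple suffix test. The function name `split_by_extension` is hypothetical and the sketch is not the code of `test_files.py`. It returns the rejected paths as well, so that they can be reviewed before anything is discarded.

```python
def split_by_extension(paths, image_format="jpg"):
    """Separate paths ending in '.<image_format>' from those that do not."""
    wanted = "." + image_format.lower()
    kept, dropped = [], []
    for path in paths:
        # Case-insensitive comparison so that 'x.JPG' is still accepted.
        if path.lower().endswith(wanted):
            kept.append(path)
        else:
            dropped.append(path)
    return kept, dropped
```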

Step 5: Since the number of images varies between species classes, we capped each class at a maximum of 5000 images by random undersampling.

python random_sampling.py --input_csv ../paths/species_data_test.csv --output_csv ../paths/species_data_usample.csv --sample_size 5000
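
Random undersampling with a per-class cap can be sketched as follows. The function name `undersample`, the `(path, label)` row layout and the fixed seed are assumptions for illustration, not the behaviour of `random_sampling.py`. With a cap of 5000, a class such as 03_spotted_deer (45790 images) would be reduced to 5000, while 13_indian_fox (379 images) would be kept whole.

```python
import random
from collections import defaultdict


def undersample(rows, sample_size=5000, seed=0):
    """Cap each class at sample_size rows; rows are (path, label) pairs."""
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[1]].append(row)
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    sampled = []
    for label in sorted(by_label):
        group = by_label[label]
        if len(group) > sample_size:
            # Draw without replacement only from classes above the cap.
            group = rng.sample(group, sample_size)
        sampled.extend(group)
    return sampled
```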

Step 6: Split the dataset into training, testing and validation sets. The validation set is made the same size as the testing set.

python split_dataset.py --input_csv ../paths/species_data_usample.csv --output_dir ../paths/ --file_name species_data --test_per 0.15
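
With `--test_per 0.15` and a validation set of equal size, roughly 70% of the rows remain for training. The sketch below shows one way to produce such a split. It uses a plain shuffle without stratification by class, which is an assumption and may differ from what `split_dataset.py` does; the function name `split_rows` is hypothetical.

```python
import random


def split_rows(rows, test_per=0.15, seed=0):
    """Shuffle rows and return (train, valid, test); valid is the same size as test."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed for a reproducible split
    n_test = round(len(shuffled) * test_per)
    test = shuffled[:n_test]
    valid = shuffled[n_test:2 * n_test]
    train = shuffled[2 * n_test:]
    return train, valid, test
```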

Step 7: Train CNN models from Keras Applications (https://keras.io/api/applications/).

python train_cnn.py --model model_name --train_csv ../paths/species_data_train.csv --valid_csv ../paths/species_data_valid.csv --batch_size 10 --num_classes 37 --epochs 100 --input_shape 224 224 3

Step 8: Calculate accuracy of CNN models

python accuracy_cnn.py --model model_name --input_shape 224 224 3 --csv_paths ../paths/species_data_test.csv --weights ../trained_models/model_file.hdf5
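
For reference, top-1 accuracy is the fraction of test images whose highest-scoring class matches the true label. The sketch below computes it from plain lists of class scores; the function name `top1_accuracy` is hypothetical, and whether `accuracy_cnn.py` averages over images or over classes is not stated here, so the per-image average is an assumption.

```python
def top1_accuracy(scores, labels):
    """Fraction of samples whose highest-scoring class equals the true label."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    correct = 0
    for row, label in zip(scores, labels):
        predicted = max(range(len(row)), key=row.__getitem__)  # index of the top score
        correct += predicted == label
    return correct / len(labels)
```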


Dr Bilal Habib's lab at the Wildlife Institute of India, Dehradun, India, is a partner in the development, evaluation and use of the MegaDetector model. Development of SpSeg was supported by Microsoft AI for Earth (Grant ID: 00138001338).