Documentation for COMPSCI 760 group project
The main preprocessing code loads the raw dataset.hdf5 file and performs the following operations:
- Discards videos missing a usable tag.
- Discards videos with fewer than 45 frames.
- Trims the length of the videos to 45 frames.
- Interpolates each of the cropped frames to 24 x 24.
- Outputs 3 channels:
  - The raw thermal values (min-max normalisation)
  - The raw thermal values (each frame normalised independently)
  - The thermal values minus the background (min-max normalisation)
- Splits the data into training, validation, and test sets. The split uses a fixed seed for reproducibility, and the set sizes are 7664/1500/1500 (72%/14%/14%). Stratification is used in the split so that each class is represented in the same proportion in all three sets.
- Encodes the labels as integers.
- Saves the pre-processed data and the labels as numpy arrays.
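The channel construction and the stratified split described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names, the choice to keep the first 45 frames, the shape of the background array, and the hand-rolled per-class split are all assumptions.

```python
import numpy as np

CLIP_LEN = 45  # frames kept per clip, as stated in the list above


def min_max(x, axis=None, eps=1e-8):
    """Scale x to [0, 1] over the given axes; eps guards against flat input."""
    lo = x.min(axis=axis, keepdims=True)
    hi = x.max(axis=axis, keepdims=True)
    return (x - lo) / (hi - lo + eps)


def build_channels(thermal, background):
    """Build the 3-channel clip from cropped, resized thermal frames.

    thermal:    (T, 24, 24) array of raw thermal values (assumed layout)
    background: (24, 24) array, assumed to broadcast against each frame
    Returns a (45, 24, 24, 3) float32 array, or None if the clip is too short.
    """
    if thermal.shape[0] < CLIP_LEN:
        return None  # clip would be discarded
    thermal = thermal[:CLIP_LEN].astype(np.float32)  # assumption: keep first 45
    ch_clip = min_max(thermal)                 # normalised over the whole clip
    ch_frame = min_max(thermal, axis=(1, 2))   # each frame independently
    ch_fg = min_max(thermal - background)      # background-subtracted
    return np.stack([ch_clip, ch_frame, ch_fg], axis=-1).astype(np.float32)


def stratified_split(labels, val_frac, test_frac, seed=0):
    """Return (train, val, test) index arrays, split class by class."""
    rng = np.random.default_rng(seed)  # fixed seed makes the split repeatable
    labels = np.asarray(labels)
    train, val, test = [], [], []
    for cls in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == cls))
        k_val = round(len(idx) * val_frac)
        k_test = round(len(idx) * test_frac)
        val.extend(idx[:k_val])
        test.extend(idx[k_val:k_val + k_test])
        train.extend(idx[k_val + k_test:])
    return (np.array(train, dtype=int),
            np.array(val, dtype=int),
            np.array(test, dtype=int))
```

Because the split is rounded per class, the resulting set sizes only approximate the requested fractions. String tags can be turned into integer labels with np.unique(tags, return_inverse=True), which returns the sorted class names together with an integer code for each sample.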
The single-frame preprocessing code performs the same operations as the main preprocessing code, except it extracts only the most useful single frame from the entire video clip.
The movement preprocessing code collates information about the movement of the cropped region and outputs 9 normalised variables for each of the 45 frames:
- Left boundary of cropped region
- Upper boundary of cropped region
- Right boundary of cropped region
- Lower boundary of cropped region
- Number of pixels above a temperature threshold (mass)
- Cropped region horizontal velocity
- Cropped region vertical velocity
- Horizontal velocity scaled by area of cropped region
- Vertical velocity scaled by area of cropped region
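One way the nine variables above could be assembled is sketched below. The input layout, the definition of velocity as the frame-to-frame change of the box centre, division by area as the scaling, and per-clip min-max normalisation are all assumptions made for illustration; the project's code may define these differently.

```python
import numpy as np


def movement_features(boxes, mass, eps=1e-8):
    """Return a (T, 9) array of normalised movement variables.

    boxes: (T, 4) array of [left, upper, right, lower] per frame (assumed layout)
    mass:  (T,) number of pixels above the temperature threshold per frame
    """
    boxes = np.asarray(boxes, dtype=np.float32)
    mass = np.asarray(mass, dtype=np.float32)
    left, upper, right, lower = boxes.T

    # Assumed velocity: change in box centre between consecutive frames.
    # The first frame has no predecessor, so its velocity is set to zero.
    cx = (left + right) / 2
    cy = (upper + lower) / 2
    vx = np.diff(cx, prepend=cx[0])
    vy = np.diff(cy, prepend=cy[0])

    # Assumed scaling: divide by box area, floored at 1 to avoid division by zero.
    area = np.maximum((right - left) * (lower - upper), 1.0)

    feats = np.stack(
        [left, upper, right, lower, mass, vx, vy, vx / area, vy / area], axis=1
    )

    # Assumed normalisation: min-max per variable over the clip.
    lo = feats.min(axis=0, keepdims=True)
    hi = feats.max(axis=0, keepdims=True)
    return (feats - lo) / (hi - lo + eps)
```

For a 45-frame clip the result has shape (45, 9), one row per frame in the order of the list above.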
Hidden / Latent Space Visualisation
The dimensionality reduction code allows any selected layer from any model to be visualised by calling plot_dim_reduction(model, layer).
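The source does not say which reduction technique plot_dim_reduction uses or how activations are extracted from the model. As a rough illustration of the idea only, the sketch below projects an array of layer activations to two dimensions with PCA; the helper name pca_project and the use of PCA are assumptions, and the step that obtains activations from a model is left out because it depends on the framework.

```python
import numpy as np


def pca_project(activations, n_components=2):
    """Project layer activations onto their top principal components.

    activations: array of shape (n_samples, ...); trailing axes are flattened.
    Returns an (n_samples, n_components) array.
    """
    x = np.asarray(activations, dtype=np.float64)
    x = x.reshape(len(x), -1)               # one feature vector per sample
    x = x - x.mean(axis=0, keepdims=True)   # centre each feature
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(x, full_matrices=False)
    return x @ vt[:n_components].T
```

The two output columns can then be drawn as a scatter plot coloured by class label to see how well the chosen layer separates the classes.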
Hyperparameter tuning is done using the Ray Tune library.
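A minimal sketch of what a Ray Tune search can look like is given below, using the Tuner interface from Ray 2.x (older releases use tune.run instead). The objective, the hyperparameter names lr and batch_size, their ranges, and the metric name val_loss are hypothetical placeholders, not the project's actual search space.

```python
def toy_objective(config):
    """Hypothetical stand-in for a train-and-validate run; returns a loss."""
    return (config["lr"] - 0.01) ** 2 + 0.001 * config["batch_size"]


def trainable(config):
    # A function trainable can return its final metrics as a dict.
    return {"val_loss": toy_objective(config)}


def run_search(num_samples=8):
    # Imported here so this module can be loaded without Ray installed.
    from ray import tune

    tuner = tune.Tuner(
        trainable,
        param_space={
            "lr": tune.loguniform(1e-4, 1e-1),        # placeholder range
            "batch_size": tune.choice([16, 32, 64]),  # placeholder choices
        },
        tune_config=tune.TuneConfig(
            metric="val_loss", mode="min", num_samples=num_samples
        ),
    )
    results = tuner.fit()
    return results.get_best_result().config
```

Calling run_search() runs the sampled trials and returns the configuration with the lowest val_loss. In a real run, toy_objective would be replaced by code that trains a model with the given configuration and evaluates it on the validation set.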