Connectivity Training

This is the Python implementation of the Connectivity training code.

Prepare dataset

  1. Run Generate triplet feature.ipynb in the "TripletTraining" repository to generate the triplet features.
  2. Run Generate train test txt.ipynb to generate the train/test.txt files, which define which races are used for training and testing.
  3. Set the parameters and run Training.ipynb.

Analysis

Use the Performance Analysis IPython notebook to evaluate the model.

It requires the ground truth CSV and the features generated by the triplet model. It calculates the recall and precision of the model.

The analysis loops through each frame and, for each cap in that frame (frame0), tries to pair it with a cap in a later frame (frame1), where frame1 - frame0 = 1 or 2.
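
The frame pairing described above can be sketched as follows. This is only an illustration: `frame_pairs` and its `gaps` parameter are hypothetical names, not taken from the notebook, and the sketch assumes frames are identified by integer indices and that a pair is skipped when the later frame is missing.

```python
def frame_pairs(frame_ids, gaps=(1, 2)):
    """Yield (frame0, frame1) pairs where frame1 - frame0 is one of `gaps`.

    Assumption: frames are identified by integer indices, and a pair is
    only produced when both frames are actually present.
    """
    present = set(frame_ids)
    for frame0 in sorted(present):
        for gap in gaps:
            frame1 = frame0 + gap
            if frame1 in present:
                yield frame0, frame1


# Frame 3 is missing, so frame 2 is only paired with frame 4.
print(list(frame_pairs([0, 1, 2, 4])))
# [(0, 1), (0, 2), (1, 2), (2, 4)]
```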

A reject threshold filters out scores that are too low. Among the remaining candidates, the one with the highest score is selected for the pairing.

Example:
Reject Threshold - 0.3
Frame 0 - jockey-id[1,2,5,8] (4 candidates)
Frame 1 - jockey-id[2,5,7] (3 candidates)
We will get a 4x3 score matrix (one row per frame 0 jockey, one column per frame 1 jockey):
[[0.05,0.1,0.4], For frame 0 jockey 1
 [0.9,0.05,0.3], For frame 0 jockey 2
 [0.05,0.1,0.2], For frame 0 jockey 5
 [0.05,0.1,0.1], For frame 0 jockey 8
]
In this case, we get the pairs 1-7 and 2-2. For jockeys 5 and 8, all candidates are rejected by the threshold.

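There are 2 matchable pairs here, 2-2 and 5-5, because jockeys 2 and 5 appear in both frames. We produced 2 pairs, of which 1 is correct (2-2) and 1 is wrong (1-7).
The precision is therefore 1/2 (1 correct out of 2 produced pairs) and the recall is 1/2 (1 correct out of 2 matchable pairs).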
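
The example above can be reproduced with a minimal sketch like the one below. It is not the notebook's code: `match_caps` and `precision_recall` are hypothetical helper names, and the sketch assumes that each frame 0 cap independently takes its highest-scoring candidate and that a score strictly below the threshold is rejected.

```python
def match_caps(scores, ids0, ids1, reject_threshold):
    """Pair each frame 0 cap with its highest-scoring frame 1 cap.

    Assumptions: rows are scored independently, and a best score
    strictly below `reject_threshold` means the cap is left unpaired.
    """
    pairs = []
    for id0, row in zip(ids0, scores):
        best = max(range(len(row)), key=row.__getitem__)
        if row[best] >= reject_threshold:
            pairs.append((id0, ids1[best]))
    return pairs


def precision_recall(pairs, ids0, ids1):
    """Precision over produced pairs, recall over matchable pairs."""
    matchable = set(ids0) & set(ids1)  # jockeys present in both frames
    correct = sum(1 for id0, id1 in pairs if id0 == id1)
    precision = correct / len(pairs) if pairs else 0.0
    recall = correct / len(matchable) if matchable else 0.0
    return precision, recall


ids0 = [1, 2, 5, 8]  # frame 0 jockey ids
ids1 = [2, 5, 7]     # frame 1 jockey ids
scores = [
    [0.05, 0.1, 0.4],   # frame 0 jockey 1
    [0.9, 0.05, 0.3],   # frame 0 jockey 2
    [0.05, 0.1, 0.2],   # frame 0 jockey 5
    [0.05, 0.1, 0.1],   # frame 0 jockey 8
]

pairs = match_caps(scores, ids0, ids1, reject_threshold=0.3)
precision, recall = precision_recall(pairs, ids0, ids1)
print(pairs)              # [(1, 7), (2, 2)]
print(precision, recall)  # 0.5 0.5
```

Taking the best candidate per row means two frame 0 caps could in principle select the same frame 1 cap. Whether the notebook resolves such conflicts is not stated here, so the sketch leaves them as they are.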

This is done for all of the test races.

TO-DO

  1. Add more documentation/Pipeline instruction/Clean up?
  2. Generate training data in more formats?
  3. Add more testing modules
  4. Potentially need some data for STT/STAWT for training/testing/analysis purposes? (Not doing it for now because of the time-cost concern raised by Suzy)