Set aside N tracks for "real-world" experiment

Question

Closed this issue 4 years ago · 5 comments

While metrics from model training are insightful and helpful, they don't accurately portray how the model will perform in practice. We need to set aside N tracks (ground station, satellite combinations) for use in an experiment that simulates real-world use of the model.

Answer 1 · 2021-01-22T15:25:12.000Z

#69 must be completed first.

Answer 2 · 2021-01-22T15:35:36.000Z

Also plot these tracks and save the figures as part of this issue.

Answer 3 · 2021-01-23T19:11:50.000Z

To ensure the validation set is at least 20% of the data, 34 (ground station, satellite) combinations should be set aside. However, with 3 satellites in the data (G07, G08, and G20), it may make sense to ensure that the validation set contains an equal number of tracks from each satellite. We could have 11 ground stations for each satellite in the validation set, resulting in 33 observations (so the validation set would be 19.64% of the original data). I think that's close enough to 20% to justify balancing the validation set.
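For reference, a quick check of that arithmetic. The total of 168 tracks is an assumption: it isn't stated in this thread, but it is the only total consistent with 33 tracks being 19.64% of the data and 34 being the smallest count that reaches 20%.

```python
import math

# Assumption: total number of (ground station, satellite) tracks, inferred
# from the percentages quoted in this comment rather than read from the data.
TOTAL_TRACKS = 168
SATELLITES = ["G07", "G08", "G20"]
PER_SATELLITE = 11

# Smallest number of tracks that makes the validation set >= 20% of the data.
min_tracks_for_20_percent = math.ceil(TOTAL_TRACKS / 5)

# Size and share of a validation set balanced across the satellites.
balanced_tracks = PER_SATELLITE * len(SATELLITES)
balanced_percent = round(100 * balanced_tracks / TOTAL_TRACKS, 2)

print(min_tracks_for_20_percent, balanced_tracks, balanced_percent)
```

If the real track count differs, only `TOTAL_TRACKS` needs to change.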

I will write some code to randomly sample which ground station and satellite combinations to hold out, and then manually set those aside in the data by reorganizing the directory structure. Once that's complete, I can return to #65 and update the README accordingly.
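A minimal sketch of what the sampling could look like, assuming the tracks are available as `(ground_station, satellite)` tuples. The function name, the seed, and the placeholder station names are all hypothetical, and this is not necessarily the code that ends up in the notebook.

```python
import random
from collections import defaultdict


def sample_validation_tracks(tracks, per_satellite=11, seed=0):
    """Pick `per_satellite` ground stations at random for each satellite.

    `tracks` is an iterable of (ground_station, satellite) tuples.
    Returns the held-out tracks as a list of the same kind of tuples.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible

    # Group ground stations by satellite.
    stations_by_satellite = defaultdict(list)
    for station, satellite in tracks:
        stations_by_satellite[satellite].append(station)

    held_out = []
    for satellite in sorted(stations_by_satellite):
        # Sort first so the result does not depend on input order.
        stations = sorted(stations_by_satellite[satellite])
        if len(stations) < per_satellite:
            raise ValueError(
                f"{satellite} has only {len(stations)} tracks, "
                f"need {per_satellite}"
            )
        for station in rng.sample(stations, per_satellite):
            held_out.append((station, satellite))
    return held_out


# Placeholder data: 56 made-up station names for each of the 3 satellites.
example_tracks = [
    (f"station_{i:02d}", satellite)
    for satellite in ("G07", "G08", "G20")
    for i in range(56)
]
validation_tracks = sample_validation_tracks(example_tracks)
print(len(validation_tracks))
```

Sampling per satellite, rather than from the pooled list, is what keeps the validation set balanced at 11 tracks each.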

Answer 4 · 2021-01-23T19:12:39.000Z

Oh and I'll include the code in notebooks/data_validation.ipynb on the feature/validate_data branch.

Answer 5 · 2021-01-23T19:35:10.000Z

I set aside some data to use for validation: 11 tracks from each satellite. I then started the model training process, and everything seems to be in order and working.

I'll close this issue and proceed to #65.