Hotdog | Not Hotdog

This model is based on the premise from Silicon Valley Season 4 | Episode 4: https://www.youtube.com/watch?v=ACmydtFDTGs

Dataset

The HuggingFace Food101 dataset was used, split into hotdog (750 images) and non-hotdog images.
The non-hotdog images were reduced to a sample of 750.
The two sets were recombined and then split into a training set and a validation set.
The model was built on the google/vit-base-patch16-224-in21k checkpoint, using transfer learning.
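
The balancing and splitting described above can be sketched in plain Python. This is a minimal illustration, not the code that was actually run: the helper name `balance_and_split`, the 20% validation fraction, the seed, and the toy data are all assumptions, and the real pipeline works on HuggingFace dataset objects rather than lists of tuples.

```python
import random


def balance_and_split(examples, positive_label, per_class=750,
                      val_fraction=0.2, seed=42):
    """Balance a multi-class dataset into two classes and split it.

    examples: list of (item, label) pairs.
    Returns (train, val), each a list of (item, binary_label) pairs,
    where 0 means hot_dog and 1 means not_hot_dog.
    """
    rng = random.Random(seed)

    # Separate the positive class from everything else.
    positives = [(item, 0) for item, label in examples if label == positive_label]
    negatives = [(item, 1) for item, label in examples if label != positive_label]

    # Cap both classes at per_class examples so they are balanced.
    positives = positives[:per_class]
    negatives = rng.sample(negatives, min(per_class, len(negatives)))

    # Recombine, shuffle, and carve off the validation set.
    combined = positives + negatives
    rng.shuffle(combined)
    n_val = int(len(combined) * val_fraction)
    return combined[n_val:], combined[:n_val]


# Toy stand-in for the dataset: 750 items per class, four classes (hypothetical names).
toy = [(f"{name}_{i}", name)
       for name in ("hot_dog", "pizza", "sushi", "tacos")
       for i in range(750)]

train_set, val_set = balance_and_split(toy, positive_label="hot_dog")
print(len(train_set), len(val_set))
```

Fixing the seed keeps the sample and the split reproducible between runs.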

Transfer learning can significantly reduce both the training time and the need for large labeled datasets, since the model has already learned a set of features that can be applied to the new task.
Normalization and transforms are defined and applied to reduce overfitting (gotta be honest, I pulled this from the Interwebs).
A data collator is defined to batch the data.
Accuracy and metric functions (also yanked from the Interwebs) are defined to compute the accuracy of the model.
All labels are then mapped to ids (even though we know it's just 0 and 1 for hot_dog and not_hot_dog).
The model is loaded from the pretrained image classification transformer.
Training arguments are defined (these are just defaults).
Train the model!
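
A few of the supporting pieces in the steps above (label mapping, normalization, batching, and the accuracy metric) can be sketched with NumPy alone. Treat this as a rough illustration under stated assumptions rather than the code that was actually used: the function names are made up for this example, the mean and standard deviation of 0.5 per channel are assumed values for the checkpoint's preprocessing, and a real training run would return PyTorch tensors from the collator instead of NumPy arrays.

```python
import numpy as np

# Label mapping: 0 is hot_dog and 1 is not_hot_dog, as described above.
labels = ["hot_dog", "not_hot_dog"]
label2id = {name: idx for idx, name in enumerate(labels)}
id2label = {idx: name for name, idx in label2id.items()}

# Assumed per-channel normalization constants (check the checkpoint's
# preprocessor config before relying on these).
MEAN = np.array([0.5, 0.5, 0.5], dtype=np.float32)
STD = np.array([0.5, 0.5, 0.5], dtype=np.float32)


def normalize(image):
    """Convert an HxWx3 uint8 image to a normalized 3xHxW float32 array."""
    scaled = image.astype(np.float32) / 255.0   # scale to [0, 1]
    normalized = (scaled - MEAN) / STD          # broadcast over the channel axis
    return normalized.transpose(2, 0, 1)        # channels first


def collate_fn(examples):
    """Stack individual examples into one batch."""
    return {
        "pixel_values": np.stack([ex["pixel_values"] for ex in examples]),
        "labels": np.array([ex["label"] for ex in examples], dtype=np.int64),
    }


def compute_metrics(eval_pred):
    """Accuracy from (logits, reference labels)."""
    logits, references = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return {"accuracy": float((predictions == references).mean())}


# Usage with fake data: two random 224x224 RGB images.
rng = np.random.default_rng(0)
examples = [
    {"pixel_values": normalize(rng.integers(0, 256, (224, 224, 3), dtype=np.uint8)),
     "label": label2id[name]}
    for name in ("hot_dog", "not_hot_dog")
]
batch = collate_fn(examples)
print(batch["pixel_values"].shape, batch["labels"])

fake_logits = np.array([[2.0, 1.0], [0.0, 3.0], [1.0, 0.0]])
fake_refs = np.array([0, 1, 1])
metrics = compute_metrics((fake_logits, fake_refs))
print(metrics)
```

The `compute_metrics` signature, a single argument holding logits and labels, follows the shape the HuggingFace `Trainer` passes to its metric callback, so the same function body carries over with little change.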