Project competition for the Introduction to Machine Learning course (2023/2024)
Follow these steps:
- Select one of the pre-trained models available in PyTorch.
- Create a folder for the model in the models folder.
- Inside the new folder, create three files: config.yml, main.py and README.md (to describe what the model does).
- Call the main function with the required parameters (there is an example in the SwinTransformer folder).
- Specify the run parameters in the config.yml file.
- From the terminal, move to the folder of the model and run the following command:
  python main.py --config ./config.yml --run_name <run_name>
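The run command passes two flags to main.py. As a minimal sketch of how a script might read them, assuming Python's standard argparse module: the helper name parse_args is hypothetical, and the project's actual main function (see the SwinTransformer folder) may handle its arguments differently.

```python
import argparse


def parse_args(argv=None):
    """Parse the two flags used in the run command (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Run a model from a config file.")
    parser.add_argument("--config", required=True, help="Path to config.yml")
    parser.add_argument("--run_name", required=True, help="Name of this run")
    # argv=None makes argparse read sys.argv; a list is handy for testing.
    return parser.parse_args(argv)


# Equivalent to: python main.py --config ./config.yml --run_name demo
args = parse_args(["--config", "./config.yml", "--run_name", "demo"])
print(args.config, args.run_name)
```

Marking both flags as required makes the script fail early with a usage message when one of them is missing.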
The datasets can be manually downloaded and added to the src/data folder. This folder is ignored by git, however, so it will only exist in the local environment.
To keep the process of training the models as smooth as possible, the utils.py file defines some functions that download datasets directly from the code. Datasets can be downloaded in the following ways:
- plain download from the web (.zip and .tgz)
- download from Kaggle (with the Kaggle API)
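A plain web download of a .zip or .tgz archive could look roughly like the sketch below. It uses only the Python standard library. The function names extract_archive and download_and_extract are hypothetical and are not the helpers defined in utils.py, which may work differently.

```python
import tarfile
import urllib.request
import zipfile
from pathlib import Path
from urllib.parse import urlparse


def extract_archive(archive_path, dest_dir):
    """Extract a .zip or .tgz archive into dest_dir (use on trusted archives only)."""
    archive_path = Path(archive_path)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    if zipfile.is_zipfile(archive_path):
        with zipfile.ZipFile(archive_path) as zf:
            zf.extractall(dest_dir)
    elif tarfile.is_tarfile(archive_path):
        # "r:*" lets tarfile detect the compression (gzip for .tgz).
        with tarfile.open(archive_path, "r:*") as tf:
            tf.extractall(dest_dir)
    else:
        raise ValueError(f"Unsupported archive format: {archive_path}")
    return dest_dir


def download_and_extract(url, dest_dir):
    """Download the archive at url into dest_dir, then extract it there."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    # Name the local file after the last component of the URL path.
    archive_path = dest_dir / Path(urlparse(url).path).name
    urllib.request.urlretrieve(url, archive_path)
    return extract_archive(archive_path, dest_dir)
```

Detecting the format from the file contents rather than the extension means the same function covers both archive types listed above.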
Some extra setup is required to download datasets from Kaggle. Follow these steps to use the Kaggle download:
- Install Kaggle with pip.
  pip install --upgrade kaggle
- Create a Kaggle account.
- In the account settings, look for API and click on Create New Token. A file called kaggle.json will be downloaded automatically.
- Place this file at ~/.kaggle/kaggle.json on your machine. You may need to create the directory and set the correct permissions.
  mkdir -p ~/.kaggle
  chmod 600 ~/.kaggle/kaggle.json
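To confirm that the token is in place before starting a download, a small check like the following may help. It is only a sketch: check_kaggle_credentials is a hypothetical helper, not part of the Kaggle package or of this repository, and the permission test assumes a POSIX system.

```python
import stat
from pathlib import Path


def check_kaggle_credentials(path=None):
    """Return a list of problems with the credentials file (empty if none)."""
    path = Path(path) if path is not None else Path.home() / ".kaggle" / "kaggle.json"
    if not path.is_file():
        return [f"{path} does not exist"]
    problems = []
    mode = stat.S_IMODE(path.stat().st_mode)
    # Any permission bit set for group or others exposes the token to other users.
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        problems.append(f"{path} has mode {oct(mode)}; expected 0o600")
    return problems


for problem in check_kaggle_credentials():
    print("Kaggle setup issue:", problem)
```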
Finally, datasets from Kaggle can be downloaded by calling the function download_dataset_from_kaggle and passing as arguments the name of the dataset (<author>/<name>) and the name of the directory where the dataset will be saved. There is an example call in the test.py file.
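For orientation, a download through the official Kaggle Python client can be sketched as follows. This is not the implementation of download_dataset_from_kaggle in utils.py, which may differ. The name kaggle_download_sketch and the name validation are assumptions made for illustration, and the sketch assumes the kaggle package is installed and the token is configured as described above.

```python
def kaggle_download_sketch(dataset, dest_dir):
    """Download and unzip a Kaggle dataset given as '<author>/<name>' (hypothetical helper)."""
    author, sep, name = dataset.partition("/")
    if not (author and sep and name) or "/" in name:
        raise ValueError(f"Expected '<author>/<name>', got {dataset!r}")
    # Imported lazily: the kaggle package may try to authenticate on import,
    # which fails when ~/.kaggle/kaggle.json is missing.
    from kaggle.api.kaggle_api_extended import KaggleApi

    api = KaggleApi()
    api.authenticate()
    # unzip=True extracts the downloaded archive into dest_dir.
    api.dataset_download_files(dataset, path=str(dest_dir), unzip=True)
    return dest_dir
```

Validating the dataset name before touching the network gives a clearer error than a failed API request when the <author>/<name> format is wrong.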