- There are 4 subfolders in the project folder:
  - `code`: this folder stores all the project code.
  - `dataset`: please look at the `README.md` in this folder.
  - `git-repo`: please look at the `README.md` in this folder.
  - `research`: please look at the `README.md` in this folder.
- Research is an important step in the project. It is where you study the problem and find the best way to solve it.
-> How to research:
- Step 1: Define the problem
You need to know clearly what the problem is and what you want to achieve. For example:
"You have a request to build a model which can predict the cat or dog in the image."
-> Question: What is the input?
-> Answer: The image of the cat or dog.
-> More specifically, the image of the cat or dog is a 3D array of shape (height, width, channel).
-> Question: What is the output?
-> Answer: "cat" if the input is a cat image, "dog" if the input is a dog image.
-> More specifically, the output is a scalar of 0 or 1. 0 means the input is a cat image, 1 means the input is a dog image.
-> Question: What is the type of the problem?
-> Answer: The problem is a classification problem.
-> More specifically, the problem is a binary classification problem.
If you don't know what the problem is, you can't solve it. So, you need to clearly define the problem first.
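To make the questions above concrete, here is a minimal Python sketch that writes the cat/dog problem down as data. The `ProblemSpec` class, its field names, and the 224x224 image size are illustrative assumptions, not part of any library or of this project.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class ProblemSpec:
    """Hypothetical record of the answers from Step 1."""
    input_shape: Tuple[int, int, int]  # (height, width, channel)
    labels: Dict[int, str]             # output scalar -> class name
    task: str                          # type of the problem

    def is_binary(self) -> bool:
        # A classification problem with exactly two classes is binary.
        return self.task == "classification" and len(self.labels) == 2

    def decode(self, y: int) -> str:
        # Turn the scalar output (0 or 1) back into a readable class name.
        return self.labels[y]


# Assumed example values: 224x224 RGB images, 0 = cat, 1 = dog.
spec = ProblemSpec(
    input_shape=(224, 224, 3),
    labels={0: "cat", 1: "dog"},
    task="classification",
)

print(spec.is_binary())  # True
print(spec.decode(1))    # dog
```

Writing the answers down like this gives you the keywords (input type, output type, problem type) to search with in the next step.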
- Step 2: Find the best way to solve the problem based on step 1
After you know clearly what the problem is, you need to find a method to solve it. paperswithcode.com, scholar.google.com, kaggle.com, etc. are good places to look. Tip: you can use the keywords from step 1 to search.
- Step 2.1: What to look at when you find a (deep learning) method that can solve the problem
What is the input?
What is the output?
What is the model architecture?
What is the loss function?
What is the optimizer?
What is the metric?
What are the hyperparameters?
Is there any provided code? Try to use the provided code first and see if it works.
P.S.: This is just a general guideline.
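One way to keep track of the questions above is to fill in a small checklist for every method you find. This is only a sketch: the `MethodNotes` class and the example values are hypothetical, and you may prefer a spreadsheet or plain notes.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class MethodNotes:
    """Hypothetical checklist for one candidate method; None means 'not found yet'."""
    name: str
    input: Optional[str] = None
    output: Optional[str] = None
    architecture: Optional[str] = None
    loss: Optional[str] = None
    optimizer: Optional[str] = None
    metric: Optional[str] = None
    hyperparameters: Optional[str] = None
    code_url: Optional[str] = None

    def missing(self) -> List[str]:
        # List the questions that still have no answer.
        return [f.name for f in fields(self) if getattr(self, f.name) is None]


# Placeholder values for illustration only.
notes = MethodNotes(
    name="example CNN classifier",
    input="RGB image of shape (height, width, channel)",
    output="0 = cat, 1 = dog",
    loss="binary cross-entropy",
)

print(notes.missing())
# ['architecture', 'optimizer', 'metric', 'hyperparameters', 'code_url']
```

When `missing()` returns an empty list, you know enough about the method to compare it with the others.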
- Step 2.2: Dataset
If you don't have a dataset, you need to find one. You can find datasets on kaggle.com, google.com, etc. For each dataset, check:
The preprocessing method that you need to apply to it.
The format of the dataset (csv, json, folder, jpg, etc.).
The information about the dataset (number of samples, number of classes, etc.), so that you can choose the best dataset for your problem.
The train, val, and test splits.
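If the dataset does not come with train, val, and test splits, you can create them yourself. Below is a minimal sketch using only the Python standard library. The function name `split_dataset`, the 80/10/10 ratios, and the file names are assumptions chosen for illustration.

```python
import random
from typing import Iterable, List, Tuple


def split_dataset(
    samples: Iterable,
    val_ratio: float = 0.1,
    test_ratio: float = 0.1,
    seed: int = 42,
) -> Tuple[List, List, List]:
    """Shuffle the samples and split them into train, val, and test lists."""
    if val_ratio < 0 or test_ratio < 0 or val_ratio + test_ratio >= 1:
        raise ValueError("val_ratio and test_ratio must be >= 0 and sum to less than 1")

    items = list(samples)
    # A fixed seed makes the split reproducible between runs.
    random.Random(seed).shuffle(items)

    n_test = int(len(items) * test_ratio)
    n_val = int(len(items) * val_ratio)

    test = items[:n_test]
    val = items[n_test:n_test + n_val]
    train = items[n_test + n_val:]
    return train, val, test


# Hypothetical file names standing in for a real dataset.
files = [f"img_{i:03d}.jpg" for i in range(100)]
train, val, test = split_dataset(files)
print(len(train), len(val), len(test))  # 80 10 10
```

Note that this simple split does not balance the classes. If some classes are rare, you may want a stratified split instead.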
- Step 3: Plan the project
After you find the best way to solve the problem, you need to plan the project so that you know everything you have to do. For example:
- Step 1: Download the dataset
- Step 2: Preprocess the dataset
- Step 3: Build the model
- Step 4: Train the model
- Step 5: Evaluate the model
- Step 6: Deploy the model
Or you may only need to do some of the steps above. For example:
- Step 1: Preprocess the dataset
- Step 2: Train the model (using someone else's code)
- Step 3: Evaluate the model
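The two plans above can be sketched as an ordered list of steps that you run one by one. This is only an illustration: the step names, `make_step`, and `run_plan` are hypothetical, and each step here is a placeholder that only records that it ran.

```python
from typing import Callable, Dict, List


def make_step(name: str) -> Callable[[List[str]], None]:
    """Return a placeholder step that only records its own name."""
    def step(log: List[str]) -> None:
        log.append(name)  # replace this line with the real work
    return step


# The full plan from the first example, in order.
ALL_STEPS = ["download", "preprocess", "build", "train", "evaluate", "deploy"]
STEPS: Dict[str, Callable[[List[str]], None]] = {n: make_step(n) for n in ALL_STEPS}


def run_plan(plan: List[str]) -> List[str]:
    """Run the named steps in the given order and return the log of what ran."""
    unknown = [name for name in plan if name not in STEPS]
    if unknown:
        raise ValueError(f"unknown steps: {unknown}")
    log: List[str] = []
    for name in plan:
        STEPS[name](log)
    return log


full_run = run_plan(ALL_STEPS)
# The shorter plan: the dataset and the model code already exist.
short_run = run_plan(["preprocess", "train", "evaluate"])
print(short_run)  # ['preprocess', 'train', 'evaluate']
```

Keeping the plan as a plain list makes it easy to see which steps your project actually needs.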
- Please look at here
- Please create a Readme.md in the `code` folder. This is the documentation of the code, which lets other people know how to use it. For example: how to create the environment, how to run the code, how to train the model, how to evaluate the model, etc.
- You may want to provide a notebook that shows how to use your code in a notebooks folder, some images in a docs folder, etc.