This is a web-based application that simulates Google's Quick Draw! game. Users can doodle on a canvas, and the application predicts what the doodle represents in real time using a pre-trained machine learning model exported to ONNX (trained with PyTorch via the Python code in this repo). The application also features a countdown timer and randomly selects a label for the user to draw, adding an interactive challenge.
- HTML5 (for markup)
- JavaScript (for client-side logic)
- ONNX (Open Neural Network Exchange for the ML model)
- CSS3 (for styling)
- Canvas for Drawing: Provides a designated canvas area for doodling.
- Real-time Prediction: Predicts what the user is drawing in real-time and displays the predicted label.
- Random Labeling: Generates a random label for the user to draw within a set time.
- Timer Countdown: Displays a countdown timer, setting a time limit for drawing each label.
- A modern web browser that supports HTML5, CSS3, and JavaScript.
- Clone the repository to your local machine, e.g. `git clone https://github.com/simpetre/quick-draw.git`
- Navigate to the project directory.
- Open `index.html` in your web browser.
- The application sets a timer and provides a random label for you to draw.
- Start doodling in the canvas area.
- The application continuously predicts and displays what it believes you are drawing.
- When the timer runs out, a new random label is generated for you to draw.
The application uses CSS3 for custom styling. Refer to the `styles.css` file for more information.
The layer-wise breakdown of the model used in this project can be summarized as follows:
- Input: [28x28] Grayscale Image
- Conv1: 32 filters, [3x3] kernel, stride 1, padding 1 -> ReLU -> Max Pooling [2x2]
- Conv2: 64 filters, [3x3] kernel, stride 1, padding 1 -> ReLU -> Max Pooling [2x2]
- Flatten
- FC1: 128 output units -> ReLU
- FC2: 345 output units
This architecture provides a good balance between computational efficiency and model effectiveness for the QuickDraw dataset.
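For reference, the summary above can be sketched as a PyTorch module. The class name `QuickDrawCNN` and the layer attribute names here are illustrative; see the Python code in this repo for the actual definition:

```python
import torch
import torch.nn as nn

class QuickDrawCNN(nn.Module):
    """Illustrative sketch of the architecture summarized above."""

    def __init__(self, num_classes: int = 345):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 32, kernel_size=3, stride=1, padding=1)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)  # [2x2] window
        self.relu = nn.ReLU()
        # Two [2x2] poolings reduce 28x28 to 7x7; Conv2 outputs 64 channels.
        self.fc1 = nn.Linear(64 * 7 * 7, 128)
        self.fc2 = nn.Linear(128, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.pool(self.relu(self.conv1(x)))  # 28x28 -> 14x14
        x = self.pool(self.relu(self.conv2(x)))  # 14x14 -> 7x7
        x = torch.flatten(x, start_dim=1)        # -> [batch, 64*7*7]
        x = self.relu(self.fc1(x))
        return self.fc2(x)                       # raw logits, one per class
```

Feeding the network a `[1, 1, 28, 28]` grayscale image produces a `[1, 345]` tensor of class logits.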
- Input: the input to the network is a grayscale image with dimensions [28x28].
- Conv1: the first convolutional layer has 32 filters of kernel size [3x3], with stride 1 and padding 1.
  - Activation Function: ReLU (Rectified Linear Unit)
  - Max-Pooling: [2x2] window with stride 2
- Conv2: the second convolutional layer has 64 filters of kernel size [3x3], with stride 1 and padding 1.
  - Activation Function: ReLU (Rectified Linear Unit)
  - Max-Pooling: [2x2] window with stride 2
- Flatten: after the convolutional layers, the tensor is flattened to prepare it for the fully connected layers.
- FC1: the first fully connected layer has 128 output units.
  - Activation Function: ReLU (Rectified Linear Unit)
- FC2: the second fully connected layer has 345 output units, corresponding to the 345 classes of the QuickDraw dataset.
- Output: a 345-dimensional tensor where the value at index i represents the model's confidence that the input image belongs to class i.
- ReLU (Rectified Linear Unit) is used as the activation function in all layers except the output layer.
- Cross-entropy loss is used as the loss function to train the model.
- The optimizer is Stochastic Gradient Descent (SGD) with a learning rate of 0.001 and momentum of 0.9.
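A minimal training-loop sketch using the loss and optimizer above. The function name and the `model`/`train_loader` parameters are assumptions (they come from elsewhere in the repo); the hyperparameters (lr=0.001, momentum=0.9, cross-entropy loss) match those listed here, while the epoch count is arbitrary:

```python
import torch
import torch.nn as nn

def train(model, train_loader, epochs=5, device="cpu"):
    """Train with cross-entropy loss and SGD(lr=0.001, momentum=0.9)."""
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
    model = model.to(device)
    model.train()
    for _ in range(epochs):
        for images, labels in train_loader:        # images: [B, 1, 28, 28]
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)  # logits vs. class indices
            loss.backward()
            optimizer.step()

# After training, the model can be exported for in-browser inference, e.g.:
# torch.onnx.export(model, torch.randn(1, 1, 28, 28), "model.onnx")
```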
The JavaScript code handles client-side logic, including event handling and real-time doodle prediction. See the `script.js` file for details.
Feel free to contribute to this project by submitting a Pull Request.