googlecreativelab/quickdraw-dataset

Can a normal doodle image be used?

shubhank008 opened this issue · 2 comments

I tried searching on this topic quite a lot but could not find any information. Most implementations of Quick Draw seem to use the stroke data rather than the doodle images themselves, which means that for prediction/inference you also have to provide the stroke data of your drawing and not an image of it, at least from what I have concluded so far.

So I wanted to ask and confirm: is it correct that you cannot use an image of your drawing/doodle to make the same kind of categorical prediction as Quick Draw?

Any references or pointers on this would be much appreciated.

I found this notebook in a Kaggle competition around QuickDraw which uses a CNN instead of the stroke data:
https://www.kaggle.com/gaborfodor/greyscale-mobilenet-lb-0-892

So it's certainly possible to classify on rendered pixels rather than the strokes.
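
For anyone who wants to try that route: as far as I can tell, kernels like that one rasterize the simplified stroke data into small grayscale bitmaps for training, and at inference time any pixel image preprocessed the same way will do. Here is a minimal sketch of that rasterization, assuming the simplified `drawing` format from this dataset (lists of `[xs, ys]` with 0-255 coordinates) and OpenCV for line drawing:

```python
import cv2
import numpy as np

def strokes_to_image(drawing, size=64, line_width=2):
    """Rasterize one simplified QuickDraw 'drawing' (a list of
    [xs, ys] strokes with 0-255 coordinates) into a grayscale bitmap."""
    img = np.zeros((256, 256), dtype=np.uint8)
    for xs, ys in drawing:
        points = list(zip(xs, ys))
        for (x0, y0), (x1, y1) in zip(points, points[1:]):
            cv2.line(img, (x0, y0), (x1, y1), 255, line_width)
    return cv2.resize(img, (size, size))  # match the CNN's input resolution
```

A CNN trained on such bitmaps doesn't care whether the input came from strokes or from a scanned/photographed doodle, as long as the preprocessing matches.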

Just recently I also found this good-looking git repo which seems to use an R-CNN and run on images, just like the one you linked:
https://github.com/zihenglin/quick-draw-recognition

The problem is, everything is fine and dandy with a limited set of categories like the ones shown in the repo's readme, but using all 345 categories seems to be a hassle due to the way it creates "canvas images" with doodles on them for training/testing.

So in its example, it uses a canvas image size of 500x500 and a grid size of (4,4) for 5 categories. combine_quick_drawings.py creates a single image by randomly positioning random doodles from those 5 categories, and it generates 100k such images to run the training on. Pretty straightforward.
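
I haven't verified the script's exact logic, but from the readme the compositing step should be roughly like this (`make_canvas` and the cell math are my own reconstruction, not the repo's actual code):

```python
import random
import numpy as np

def make_canvas(doodles, canvas_size=500, grid=(4, 4)):
    """Place each doodle bitmap into a randomly chosen grid cell of a
    blank canvas (a rough reconstruction of combine_quick_drawings.py)."""
    canvas = np.zeros((canvas_size, canvas_size), dtype=np.uint8)
    cell_h, cell_w = canvas_size // grid[0], canvas_size // grid[1]
    cells = random.sample([(r, c) for r in range(grid[0])
                           for c in range(grid[1])], len(doodles))
    for doodle, (r, c) in zip(doodles, cells):
        h, w = doodle.shape
        # jitter the doodle inside its cell (assumes the doodle fits the cell)
        y = r * cell_h + random.randrange(max(1, cell_h - h))
        x = c * cell_w + random.randrange(max(1, cell_w - w))
        canvas[y:y + h, x:x + w] = np.maximum(canvas[y:y + h, x:x + w], doodle)
    return canvas
```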

Now the thing is, when I tried it with all 345 categories, I first had to change the canvas size to 2000x2000 and the grid size to, I believe, 20x20 or 25x25.
But since the combine script places a random doodle from every category on the canvas before exporting it as an image, you end up with a single 1800-2000px image with 345 doodles on it.
And you need to export 100k such images.

Since this process does not run on the GPU, it takes about 10 seconds to create one such image, so 27 hours to create 10k images or 277 hours to create 100k images.
And that is on p2 and p3 instances from AWS.

Right now I am exporting 10k images to first verify that everything works; after that I may have to export them in parallel batches of 10k images across 10 servers.
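
Since the compositing is pure CPU/NumPy work, the GPU on the p2/p3 sits idle anyway, so instead of sharding across 10 servers I am also considering fanning out across the cores of a single machine with multiprocessing. Something like this sketch, reusing the hypothetical `make_canvas` from above (the random stand-in doodles are just placeholders for real per-category bitmaps):

```python
import os
from multiprocessing import Pool

import numpy as np

def render_one(i):
    # Stand-in doodles; in practice these would be 345 real bitmaps,
    # one sampled per category. make_canvas is the sketch from above.
    doodles = [np.random.randint(0, 256, (80, 80), dtype=np.uint8)
               for _ in range(345)]
    canvas = make_canvas(doodles, canvas_size=2000, grid=(20, 20))
    np.save(f"canvases/{i:06d}.npy", canvas)

if __name__ == "__main__":
    os.makedirs("canvases", exist_ok=True)
    with Pool() as pool:  # defaults to one worker per CPU core
        pool.map(render_one, range(100_000), chunksize=64)
```

At ~10 s per image, spreading the work over all cores should cut the wall-clock time roughly by the core count.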

That's the current technical limitation I am hitting with that repo, and it seems to be one of a kind; I could not really find anything similar to it, except for a mobile/Android project which does work on Android, but rewriting its code in Python is not working for me (I think it is something to do with how I am converting the image file to a data/buffer/array to pass to the model, and its np.shape).
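
For reference, my understanding is that most Keras/TF image models expect a 4-D (batch, height, width, channels) tensor, and feeding a bare 2-D array is a classic cause of such shape errors. This is roughly what I am trying (the 64x64 grayscale choice is my assumption and has to match whatever the model was trained on):

```python
import numpy as np
from PIL import Image

def load_for_model(path, size=64):
    """Load a doodle image as the (1, H, W, 1) float tensor most
    Keras/TF models expect; size and channel count are assumptions
    that must match the model's training setup."""
    img = Image.open(path).convert("L")   # force grayscale
    img = img.resize((size, size))
    arr = np.asarray(img, dtype=np.float32) / 255.0
    return arr.reshape(1, size, size, 1)  # add batch + channel axes
```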

Unfortunately, both repo authors seem to be inactive or away. No replies on Stack Overflow either.