googlecreativelab/quickdraw-dataset

100% recognized in the first and last days

celacanto opened this issue · 1 comment

I was interested in seeing how the percentage of recognized drawings changes over time. So, for a few words, I tabulated the "recognized" variable by day. The plots all came out with the same shape. Here are some examples.

Does anyone know the reason for this pattern? Is there something I'm missing?

Because of the way we sampled the dataset from the raw dump of the game, the samples aren't evenly distributed over time, and they only represent a subset of the period the game has been online. Also, the distribution between words isn't necessarily representative of the data we received through the game, since we prioritized getting a good representation of all the words.
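One way to check how much the uneven sampling affects a plot like this is to tabulate the number of drawings per day next to the recognized share, since a day with very few samples can easily show 100%. The following is a minimal sketch, not the method used in the original post. It assumes the simplified ndjson files, where each line is a JSON object with a `recognized` boolean and a `timestamp` string starting with `YYYY-MM-DD`; the function name `daily_recognition` is made up for illustration.

```python
import json
from collections import defaultdict


def daily_recognition(lines):
    """Return {day: (n_drawings, share_recognized)} from ndjson lines.

    Assumes each record has a "recognized" boolean and a "timestamp"
    string that begins with the date as YYYY-MM-DD.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        day = record["timestamp"][:10]  # keep only the date part
        totals[day] += 1
        hits[day] += bool(record["recognized"])
    # Sort by day so the result reads chronologically.
    return {day: (totals[day], hits[day] / totals[day]) for day in sorted(totals)}
```

To run it on one word, pass an open file object for that word's ndjson file, for example `daily_recognition(open(path, encoding="utf-8"))`, where `path` points to your local copy. Days with a small count are the ones where the percentage is least reliable.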