unsplash/datasets

cardinality and type of unsplash_photos.photo_featured

copiousfreetime opened this issue · 1 comments

I was checking the cardinality of various columns in the dataset and found that unsplash_photos.photo_featured is always true.

unsplash_lite=# select photo_featured, count(*) from unsplash_photos group by 1;
 photo_featured | count
----------------+-------
 t              | 25000

Is this the expected value?

Also, the data type for this column in create-tables.sql is varchar, and I think it should be bool. I did a quick reload of the data to check whether it would still be valid with that change, and it was. Happy to submit a pull request for that if you like.
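For an already-loaded database, the same type change could be applied in place. This is only a rough sketch, assuming PostgreSQL and assuming every stored value is a string PostgreSQL accepts as a boolean (such as `t` or `f`); it is not necessarily how the fix to create-tables.sql would look.

```sql
-- Inspect the distinct values first; any value that cannot be cast
-- to boolean would make the ALTER below fail.
SELECT photo_featured, count(*)
FROM unsplash_photos
GROUP BY 1;

-- Convert the existing varchar column to boolean in place.
-- The USING clause tells PostgreSQL how to convert each stored value.
ALTER TABLE unsplash_photos
  ALTER COLUMN photo_featured TYPE boolean
  USING photo_featured::boolean;
```

For a fresh load, changing the column definition from varchar to bool in create-tables.sql should be enough, since the reload described above succeeded with that change.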

It is expected that all the photos in the Lite dataset are featured photos. That won't be the case in the Full dataset.

The type being wrong is obviously an issue here. Thank you for catching this. Feel free to submit a PR and I'll approve it.
If you don't have time, I'll fix this asap, but note that I'm currently on paternity leave, so I don't have much time for coding 😄