Question about the entities behind the "anonymous_user_id"
vii33 opened this issue · 2 comments
Hey,
I'm trying to calculate the average number of images a user downloads.
As I know from my own photo stats, a lot of downloads are generated via API requests from external applications. You state in your API doc that external applications don't need to authenticate on a user level.
My question: Is a single anonymous user ID generated for an external application like Trello, or do you guys have a better approach to distinguish the users "behind" the external application?
Example from the test dataset
Could the user in the first row (942 downloads) really be one person, or could it be a whole logical entity like Trello?
| anonymous_user_id | downloads |
|---|---|
| 5a055748-57d2-45c1-a882-5b9bb9313509 | 942 |
| beb0923e-c17d-4a90-a8db-47b0f45fb0fc | 897 |
| 85e5db9c-07c7-49bf-9e08-5cbd1603dd74 | 546 |
| ... | ... |
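A minimal sketch of how such a per-ID table and the average could be computed, assuming one record per download that carries an `anonymous_user_id`. The function name `downloads_per_id` and the sample IDs are hypothetical and not taken from the dataset.

```python
from collections import Counter
from statistics import mean, median


def downloads_per_id(download_events):
    """Count downloads per anonymous_user_id.

    `download_events` is an iterable of anonymous_user_id strings,
    one entry per download (an assumed input shape).
    """
    return Counter(download_events)


# Hypothetical sample: three IDs with 3, 2 and 1 downloads.
events = ["id-a", "id-a", "id-b", "id-a", "id-c", "id-b"]
counts = downloads_per_id(events)

# IDs ranked by download count, highest first, like the table above.
ranked = counts.most_common()

# The mean is pulled up by very heavy IDs (such as one with 942 downloads);
# the median is less sensitive to them, so it is worth reporting both.
avg = mean(counts.values())
med = median(counts.values())
print(ranked, avg, med)
```

If a single ID can stand for a whole application rather than one person, the mean in particular would overstate what a typical user downloads.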
Thanks a lot for the answer and great job with the data set. 👍
Hi @vii33!
These conversions/downloads only concern our main website, unsplash.com, and only those that follow a search. Technically, the IDs are unique devices. You're right that this needs some clarification.
We could look into including another table in the dataset with the downloads from all sources, but I don't think we'd be able to tie individual users of third-party apps to the downloads they make. That's mainly for the reason you mentioned: some third-party apps act as a proxy for the photo download and would "override" the little we know about the device that's actually downloading.
I think it could be an idea for a future version of the dataset though. I'll talk with the team and we'll see what we can do.
Hey Timmy, all good. Thanks for clearing this up 👍