unsplash/datasets

Values of latitude and longitude entries in dataset are swapped

ys-koshelev opened this issue · 0 comments

Describe the bug
The values of the photo_location_latitude and photo_location_longitude columns in photos.tsv are swapped (in both the Lite and Full versions).

To Reproduce
Using the photo with ID gXSFnk2a9V4 as an example (currently at index 1 in the Lite dataset):

  1. Check the location listed in the dataset:
import pandas as pd
df = pd.read_csv('photos.tsv000', sep='\t', header=0)
print({'latitude': df.loc[1]['photo_location_latitude'], 'longitude': df.loc[1]['photo_location_longitude']})

This outputs {'latitude': -123.97116667, 'longitude': 45.4655}. The result is already visibly incorrect, since latitude must lie within [-90, 90].

  2. Now verify that the values are simply swapped: check the location returned for the same photo by the API (Bash with curl and python3 installed):
curl -k 'https://unsplash.com/napi/photos/aerial-photography-of-seashore-gXSFnk2a9V4' \
  -H 'Accept: */*' \
  -H 'Connection: keep-alive' \
  -H 'Sec-Fetch-Dest: empty' \
  -H 'Sec-Fetch-Mode: cors' \
  -H 'Sec-Fetch-Site: same-origin' \
  -H 'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36 Edg/124.0.0.0' \
  -H 'accept-language: en-US' \
  -H 'sec-ch-ua: "Chromium";v="124", "Microsoft Edge";v="124", "Not-A.Brand";v="99"' \
  -H 'sec-ch-ua-mobile: ?0' | \
  python3 -c "import sys, json; resp = json.load(sys.stdin); print(resp['location']['position'])"

This outputs {'latitude': 45.4655, 'longitude': -123.97116667}, the same two values with the keys exchanged.
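To gauge how many rows are affected beyond this one photo, the range check from step 1 can be applied to the whole latitude column. The following is a minimal sketch, not a definitive test: the helper name count_out_of_range_latitudes and the small in-memory frame (including the row with id hypothetical_id) are illustrative assumptions, and the frame stands in for the real photos.tsv000. The check only catches rows whose stored value has an absolute value above 90, so it gives a lower bound on the number of swapped rows.

```python
import pandas as pd


def count_out_of_range_latitudes(df: pd.DataFrame) -> int:
    """Count rows whose latitude column holds a value outside [-90, 90]."""
    # Coerce to numeric so missing or malformed entries become NaN and are skipped.
    lat = pd.to_numeric(df["photo_location_latitude"], errors="coerce")
    return int((lat.abs() > 90).sum())


# Illustrative stand-in for photos.tsv000: the first row uses the values
# reported above, the second row is a hypothetical photo without a location.
sample = pd.DataFrame(
    {
        "photo_id": ["gXSFnk2a9V4", "hypothetical_id"],
        "photo_location_latitude": [-123.97116667, None],
        "photo_location_longitude": [45.4655, None],
    }
)

print(count_out_of_range_latitudes(sample))
```

On the real file, the same function could be called on the frame returned by pd.read_csv('photos.tsv000', sep='\t', header=0).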

Expected behavior
The dataset should contain the correct coordinates, meaning that the values currently stored in the photo_location_latitude and photo_location_longitude columns should be exchanged.
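Until the files are corrected, a possible client-side workaround is to exchange the two columns after loading. This is only a sketch under the assumption that every row is affected, which this report has not verified; swap_lat_lon is a hypothetical helper name, and the sample frame again stands in for photos.tsv000. The workaround should be removed once the dataset is fixed, otherwise it would reintroduce the error.

```python
import pandas as pd

LAT = "photo_location_latitude"
LON = "photo_location_longitude"


def swap_lat_lon(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with the values of the two coordinate columns exchanged."""
    out = df.copy()
    # Read from the untouched original so the second assignment
    # does not see the already-overwritten column.
    out[LAT] = df[LON]
    out[LON] = df[LAT]
    return out


# Illustrative one-row frame using the values reported for gXSFnk2a9V4.
sample = pd.DataFrame(
    {
        "photo_id": ["gXSFnk2a9V4"],
        LAT: [-123.97116667],
        LON: [45.4655],
    }
)

fixed = swap_lat_lon(sample)
print({"latitude": fixed.loc[0, LAT], "longitude": fixed.loc[0, LON]})
```

For the sample row, the swapped values match what the API returns in step 2.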

Additional context
N/A