In this lab, you'll practice working with JSON files whose schema you know beforehand.
You will be able to:
- Use the JSON module to load and parse JSON documents
- Extract data using predefined JSON schemas
- Convert JSON to a pandas dataframe
Here's the JSON schema provided for a section of the NY Times API:
or a fully expanded view:
You can more about the documentation here.
Note that this is a different schema than the schema used in the previous lesson, although both come from the New York Times.
Open the JSON file located at ny_times_movies.json
, and use the json
module to load the data into a variable called data
.
# Your code here
Run the code below to investigate its contents:
# Run this cell without changes
print("`data` has type", type(data))
print("The keys are", list(data.keys()))
Create a variable results
that contains the value associated with the 'results'
key.
# Your code here
Below we display this variable as a table using pandas:
# Run this cell without changes
import pandas as pd
df = pd.DataFrame(results)
df
Now that you have a general sense of the data, answer some questions about it.
The metadata says this:
# Run this cell without changes
data['num_results']
Double-check that by looking at results
. Does it line up?
# Your code here
"""
Your written answer here
"""
A critic's name can be identified using the 'byline'
key. Assign your answer to the variable unique_critics
.
# Your code here
This code checks your answer.
# Run this cell without changes
assert unique_critics == 7
Create a list review_urls
that contains the URL for each review. This can be found using the 'url'
key nested under 'link'
.
# Your code here (create more cells as needed)
The following code will check your answer:
# Run this cell without changes
# review_urls should be a list
assert type(review_urls) == list
# The length should be 20, same as the length of reviews
assert len(review_urls) == 20
# The data type contained should be string
assert type(review_urls[0]) == str and type(review_urls[-1]) == str
# Spot checking a specific value
assert review_urls[6] == 'http://www.nytimes.com/2018/10/11/movies/barbara-review.html'
Well done! In this lab you continued to practice extracting and transforming data from JSON files with known schemas.