Lab SQL Subqueries
Jorge Castro DAPT NOV2021
- Instructions:
- 1. How many copies of the film Hunchback Impossible exist in the inventory system?
- 2. List all films whose length is longer than the average of all the films.
- 3. Use subqueries to display all actors who appear in the film Alone Trip.
- 4. Sales have been lagging among young families, and you wish to target all family movies for a promotion. Identify all movies categorized as family films.
- 5. Get name and email from customers from Canada using subqueries. Do the same with joins. Note that to create a join, you will have to identify the correct tables with their primary keys and foreign keys, that will help you get the relevant information.
- 6. Which are films starred by the most prolific actor? Most prolific actor is defined as the actor that has acted in the most number of films. First you will have to find the most prolific actor and then use that actor_id to find the different films that he/she starred.
- 7. Films rented by most profitable customer. You can use the customer table and payment table to find the most profitable customer ie the customer that has made the largest sum of payments
- 8. Get the client_id and the total_amount_spent
Instructions:
1. How many copies of the film Hunchback Impossible exist in the inventory system?
Answer:
SELECT
title,
(
SELECT
count(inventory_id) AS num_of_copies
FROM
inventory
WHERE
film_id IN (
SELECT
film_id
FROM
film
WHERE
title = 'Hunchback Impossible'
)
) AS num_of_copies_available
FROM
film
WHERE
title = 'Hunchback Impossible';
2. List all films whose length is longer than the average of all the films.
Answer:
SELECT
title AS film_title,
length AS above_avg_duration
FROM
film
WHERE
length > (
SELECT
avg(length)
FROM
film
)
ORDER BY
2 DESC;
3. Use subqueries to display all actors who appear in the film Alone Trip.
Answer:
SELECT
a.actor_id,
concat(a.first_name, ' ', a.last_name) AS actors,
(
SELECT
title
FROM
film
WHERE
title = 'Alone Trip'
) AS film_title
FROM
film f
INNER JOIN film_actor fa ON f.film_id = fa.film_id
INNER JOIN actor a ON fa.actor_id = a.actor_id
WHERE
title = 'Alone Trip'
GROUP BY
1;
4. Sales have been lagging among young families, and you wish to target all family movies for a promotion. Identify all movies categorized as family films.
Answer:
SELECT
title
FROM
film
WHERE
film_id IN (
SELECT
film_id
FROM
film_category
WHERE
category_id = (
SELECT
category_id
FROM
category
WHERE
name = 'Family'
)
);
ℹ️
|
Here only using subqueries I can output the expected result: a list of film titles. However, I wanted to add more details so a user can quickly identify what the table is showing. If I wanted to add a column with the name of the category, and category_id for example, it seems to me that the query would become much bigger, having to write more when the same result can be obtained with a join approach. |
SELECT
(
SELECT
category_id
FROM
category
WHERE
`name` = 'Family'
) AS category_id,
(
SELECT
`name`
FROM
category
WHERE
`name` = 'Family'
) AS film_category,
title AS film_title
FROM
film
WHERE
film_id IN (
SELECT
film_id
FROM
film_category
WHERE
category_id = (
SELECT
category_id
FROM
category
WHERE
name = 'Family'
)
);
Answer: combining joins and subqueries
SELECT
c.category_id,
c.`name` AS category,
f.film_id,
f.title AS film_title
FROM
film f
INNER JOIN (
SELECT
film_id,
category_id
FROM
film_category
) fc ON f.film_id = fc.film_id
INNER JOIN (
SELECT
category_id,
`name`
FROM
category
WHERE
`name` = 'Family'
) c ON fc.category_id = c.category_id;
5. Get name and email from customers from Canada using subqueries. Do the same with joins. Note that to create a join, you will have to identify the correct tables with their primary keys and foreign keys, that will help you get the relevant information.
Answer: subqueries
SELECT
customer_id,
first_name,
last_name,
email,
(
SELECT
country
FROM
country
WHERE
country = 'Canada'
) AS country
FROM
customer
WHERE
address_id IN (
SELECT
address_id
FROM
address
WHERE
city_id IN (
SELECT
city_id
FROM
address
WHERE
city_id IN (
SELECT
city_id
FROM
city
WHERE
country_id = (
SELECT
country_id
FROM
country
WHERE
country = 'Canada'
)
)
)
);
Answer: with joins
SELECT
c.customer_id,
concat(c.first_name, ' ', c.last_name) AS customer_name,
c.email,
co.country
FROM
customer c
INNER JOIN address a ON c.address_id = a.address_id
INNER JOIN city ci ON a.city_id = ci.city_id
INNER JOIN country co ON ci.country_id = co.country_id
WHERE
country = 'Canada';
6. Which are films starred by the most prolific actor? Most prolific actor is defined as the actor that has acted in the most number of films. First you will have to find the most prolific actor and then use that actor_id to find the different films that he/she starred.
Answer:
SELECT
*
FROM
(
SELECT
title AS film_title
FROM
film
WHERE
film_id IN (
SELECT
film_id
FROM
film_actor
WHERE
actor_id = (
SELECT
actor_id
FROM
(
SELECT
actor_id,
count(film_id) AS number_of_films
FROM
film_actor
GROUP BY
1
ORDER BY
2 DESC
LIMIT
1
) t
)
)
) t0, (
SELECT
concat(first_name, ' ', last_name) AS most_prolific_actor_name
FROM
actor
WHERE
actor_id = (
SELECT
actor_id
FROM
(
SELECT
actor_id,
count(film_id) AS number_of_films
FROM
film_actor
GROUP BY
1
ORDER BY
2 DESC
LIMIT
1
) t1
)
) t2;
7. Films rented by most profitable customer. You can use the customer table and payment table to find the most profitable customer ie the customer that has made the largest sum of payments
Answer task 7: using subqueries, step by step
Step 1:
-
find the most profitable customer
Every query will be used as a filter in the WHERE
clause. First I need to identify the customer_id
of the customer that has made the largest sum of payments
SELECT
customer_id,
sum(amount)
FROM
payment
GROUP BY
1
ORDER BY
2 DESC
LIMIT
1;
Step 2
-
customer_id
Now I need to have a query that outputs only the customer_id
. The previous query has two columns and it would not work to use it as a subquery. But I will use it to create the new query
SELECT
customer_id
FROM
(
SELECT
customer_id,
sum(amount)
FROM
payment
GROUP BY
1
ORDER BY
2 DESC
LIMIT
1
) t;
Step 3
-
inventory_id
The previous query with the customer_id, will help me to find the inventory_id
so we can reach the film titles. I will limit the 45 rows output
to 5 just for esthetic reasons
SELECT
inventory_id
FROM
rental
WHERE
customer_id IN (
SELECT
customer_id
FROM
(
SELECT
customer_id,
sum(amount)
FROM
payment
GROUP BY
1
ORDER BY
2 DESC
LIMIT
1
) t
);
Step 4
-
film_id
Next I will output the `film_id`
SELECT
film_id
FROM
inventory
WHERE
inventory_id IN (
SELECT
inventory_id
FROM
rental
WHERE
customer_id IN (
SELECT
customer_id
FROM
(
SELECT
customer_id,
sum(amount)
FROM
payment
GROUP BY
1
ORDER BY
2 DESC
LIMIT
1
) t
)
);
Step 5
-
film title
finding the `titles: I have 44 titles returned`
SELECT
title AS film_title
FROM
film
WHERE
film_id IN (
SELECT
film_id
FROM
inventory
WHERE
inventory_id IN (
SELECT
inventory_id
FROM
rental
WHERE
customer_id IN (
SELECT
customer_id
FROM
(
SELECT
customer_id,
sum(amount)
FROM
payment
GROUP BY
1
ORDER BY
2 DESC
LIMIT
1
) t
)
)
);
Step 6:
-
customer_name
To be able to obtain the customer_name, I need to recycle the customer_id
query in step 2
SELECT
concat(first_name, ' ', last_name) AS customer_name
FROM
customer
WHERE
customer_id IN (
SELECT
customer_id
FROM
(
SELECT
customer_id,
sum(amount)
FROM
payment
GROUP BY
1
ORDER BY
2 DESC
LIMIT
1
) t
);
Step 7
-
Main outer final query
On the final query, I will insert in the SELECT
statement the customer_name
so we have a table with the rented films and the name of the person that rented them, in this case is our most profitable customer
SELECT
title AS film_titles_rented,
(
SELECT
concat(first_name, ' ', last_name) AS customer_name
FROM
customer
WHERE
customer_id IN (
SELECT
customer_id
FROM
(
SELECT
customer_id,
sum(amount)
FROM
payment
GROUP BY
1
ORDER BY
2 DESC
LIMIT
1
) t
)
) AS customer_name_most_prof
FROM
film
WHERE
film_id IN (
SELECT
film_id
FROM
inventory
WHERE
inventory_id IN (
SELECT
inventory_id
FROM
rental
WHERE
customer_id IN (
SELECT
customer_id
FROM
(
SELECT
customer_id,
sum(amount)
FROM
payment
GROUP BY
1
ORDER BY
2 DESC
LIMIT
1
) t
)
)
);
Answer task 7 using joins:
in this approach, I combined joins and subqueries. To be able to filter the films by the most profitable customer at the WHERE
clause, I recycled the subqueries in Step 2 when building the query using subqueries only.
Also this time Im using the USING
clause, which I find very convenient to setup joins when the joining keys have the same name
SELECT
f.film_id,
f.title AS rented_films,
c.customer_id,
concat(c.first_name, ' ', c.last_name) AS most_profit_customer
FROM
customer c
INNER JOIN payment p USING (customer_id)
INNER JOIN rental r USING (rental_id)
INNER JOIN inventory i USING (inventory_id)
INNER JOIN film f USING (film_id)
WHERE
c.customer_id = (
SELECT
customer_id
FROM
(
SELECT
customer_id,
sum(amount)
FROM
payment
GROUP BY
1
ORDER BY
2 DESC
LIMIT
1
) t
)
GROUP BY
2;
8. Get the client_id and the total_amount_spent
-
of those clients who spent more than the average of the total_amount spent by each client.
Answer:
SELECT
concat(c.first_name, ' ', c.last_name) AS customer_name,
round(sum(amount), 2) AS amount
FROM
customer c
JOIN payment p ON c.customer_id = p.customer_id
GROUP BY
1
HAVING
amount > (
SELECT
avg(amount)
FROM
(
SELECT
concat(c.first_name, ' ', c.last_name) AS customer_name,
round(sum(amount), 2) AS amount
FROM
customer c
JOIN payment p ON c.customer_id = p.customer_id
GROUP BY
1
) AS t
)
ORDER BY
2 DESC;