Lab SQL Subqueries

Instructions:

1. How many copies of the film Hunchback Impossible exist in the inventory system?

Answer:

SELECT
    title,
    (
        SELECT
            count(inventory_id) AS num_of_copies
        FROM
            inventory
        WHERE
            film_id IN (
                SELECT
                    film_id
                FROM
                    film
                WHERE
                    title = 'Hunchback Impossible'
            )
    ) AS num_of_copies_available
FROM
    film
WHERE
    title = 'Hunchback Impossible';
151388929 7ae54480 08f3 49f1 84d2 8b039b8734f1

2. List all films whose length is longer than the average of all the films.

Answer:

SELECT
    title AS film_title,
    length AS above_avg_duration
FROM
    film
WHERE
    length > (
        SELECT
            avg(length)
        FROM
            film
    )
ORDER BY
    2 DESC;
151395424 c438b1ec a295 47cd a108 e348111febf2

3. Use subqueries to display all actors who appear in the film Alone Trip.

Answer:

SELECT
    a.actor_id,
    concat(a.first_name, ' ', a.last_name) AS actors,
    (
        SELECT
            title
        FROM
            film
        WHERE
            title = 'Alone Trip'
    ) AS film_title
FROM
    film f
    INNER JOIN film_actor fa ON f.film_id = fa.film_id
    INNER JOIN actor a ON fa.actor_id = a.actor_id
WHERE
    title = 'Alone Trip'
GROUP BY
    1;
151462706 6444c2a0 0f75 40e6 8a60 a575008578e9

4. Sales have been lagging among young families, and you wish to target all family movies for a promotion. Identify all movies categorized as family films.

Answer:

SELECT
    title
FROM
    film
WHERE
    film_id IN (
        SELECT
            film_id
        FROM
            film_category
        WHERE
            category_id = (
                SELECT
                    category_id
                FROM
                    category
                WHERE
                    name = 'Family'
            )
    );
151557399 3d5c8491 1f96 4958 963f a497d5b41daa
ℹ️

Here only using subqueries I can output the expected result: a list of film titles. However, I wanted to add more details so a user can quickly identify what the table is showing. If I wanted to add a column with the name of the category, and category_id for example, it seems to me that the query would become much bigger, having to write more when the same result can be obtained with a join approach.

SELECT
    (
        SELECT
            category_id
        FROM
            category
        WHERE
            `name` = 'Family'
    ) AS category_id,
    (
        SELECT
            `name`
        FROM
            category
        WHERE
            `name` = 'Family'
    ) AS film_category,
    title AS film_title
FROM
    film
WHERE
    film_id IN (
        SELECT
            film_id
        FROM
            film_category
        WHERE
            category_id = (
                SELECT
                    category_id
                FROM
                    category
                WHERE
                    name = 'Family'
            )
    );
151578168 8e6bf390 9491 4a5e 9ca2 030af2145574

Answer: combining joins and subqueries

SELECT
    c.category_id,
    c.`name` AS category,
    f.film_id,
    f.title AS film_title
FROM
    film f
    INNER JOIN (
        SELECT
            film_id,
            category_id
        FROM
            film_category
    ) fc ON f.film_id = fc.film_id
    INNER JOIN (
        SELECT
            category_id,
            `name`
        FROM
            category
        WHERE
            `name` = 'Family'
    ) c ON fc.category_id = c.category_id;
151557176 213b276c 868d 499a 8f60 0dcbacb456cc

5. Get name and email from customers from Canada using subqueries. Do the same with joins. Note that to create a join, you will have to identify the correct tables with their primary keys and foreign keys, that will help you get the relevant information.

Answer: subqueries

SELECT
    customer_id,
    first_name,
    last_name,
    email,
    (
        SELECT
            country
        FROM
            country
        WHERE
            country = 'Canada'
    ) AS country
FROM
    customer
WHERE
    address_id IN (
        SELECT
            address_id
        FROM
            address
        WHERE
            city_id IN (
                SELECT
                    city_id
                FROM
                    address
                WHERE
                    city_id IN (
                        SELECT
                            city_id
                        FROM
                            city
                        WHERE
                            country_id = (
                                SELECT
                                    country_id
                                FROM
                                    country
                                WHERE
                                    country = 'Canada'
                            )
                    )
            )
    );
151573620 a99b4eda 9131 4b6a 9671 7c9855e8295a

Answer: with joins

SELECT
    c.customer_id,
    concat(c.first_name, ' ', c.last_name) AS customer_name,
    c.email,
    co.country
FROM
    customer c
    INNER JOIN address a ON c.address_id = a.address_id
    INNER JOIN city ci ON a.city_id = ci.city_id
    INNER JOIN country co ON ci.country_id = co.country_id
WHERE
    country = 'Canada';
151582323 a5c1d1f3 a3a9 46b7 8b2f 4d3188c15f1e

6. Which are films starred by the most prolific actor? Most prolific actor is defined as the actor that has acted in the most number of films. First you will have to find the most prolific actor and then use that actor_id to find the different films that he/she starred.

Answer:

SELECT
    *
FROM
    (
        SELECT
            title AS film_title
        FROM
            film
        WHERE
            film_id IN (
                SELECT
                    film_id
                FROM
                    film_actor
                WHERE
                    actor_id = (
                        SELECT
                            actor_id
                        FROM
                            (
                                SELECT
                                    actor_id,
                                    count(film_id) AS number_of_films
                                FROM
                                    film_actor
                                GROUP BY
                                    1
                                ORDER BY
                                    2 DESC
                                LIMIT
                                    1
                            ) t
                    )
            )
    ) t0, (
        SELECT
            concat(first_name, ' ', last_name) AS most_prolific_actor_name
        FROM
            actor
        WHERE
            actor_id = (
                SELECT
                    actor_id
                FROM
                    (
                        SELECT
                            actor_id,
                            count(film_id) AS number_of_films
                        FROM
                            film_actor
                        GROUP BY
                            1
                        ORDER BY
                            2 DESC
                        LIMIT
                            1
                    ) t1
            )
    ) t2;
151595302 d4e961ef cc17 4518 a048 16ee8973504f

7. Films rented by most profitable customer. You can use the customer table and payment table to find the most profitable customer ie the customer that has made the largest sum of payments

Answer task 7: using subqueries, step by step

Step 1:
  • find the most profitable customer

Every query will be used as a filter in the WHERE clause. First I need to identify the customer_id of the customer that has made the largest sum of payments

SELECT
    customer_id,
    sum(amount)
FROM
    payment
GROUP BY
    1
ORDER BY
    2 DESC
LIMIT
    1;
151601450 eade6fec 4e04 4763 9db1 7d227c4dffec

Step 2
  • customer_id

Now I need to have a query that outputs only the customer_id. The previous query has two columns and it would not work to use it as a subquery. But I will use it to create the new query

SELECT
    customer_id
FROM
    (
        SELECT
            customer_id,
            sum(amount)
        FROM
            payment
        GROUP BY
            1
        ORDER BY
            2 DESC
        LIMIT
            1
    ) t;
151602299 c3420d17 9ec7 4922 9ea5 b7dad042e97f

Step 3
  • inventory_id

The previous query with the customer_id, will help me to find the inventory_id so we can reach the film titles. I will limit the 45 rows output to 5 just for esthetic reasons

SELECT
    inventory_id
FROM
    rental
WHERE
    customer_id IN (
        SELECT
            customer_id
        FROM
            (
                SELECT
                    customer_id,
                    sum(amount)
                FROM
                    payment
                GROUP BY
                    1
                ORDER BY
                    2 DESC
                LIMIT
                    1
            ) t
    );
151603810 8d3cd45a 7abe 4ac5 a1c0 9a99cac2fd2b

Step 4
  • film_id

Next I will output the `film_id`

SELECT
    film_id
FROM
    inventory
WHERE
    inventory_id IN (
        SELECT
            inventory_id
        FROM
            rental
        WHERE
            customer_id IN (
                SELECT
                    customer_id
                FROM
                    (
                        SELECT
                            customer_id,
                            sum(amount)
                        FROM
                            payment
                        GROUP BY
                            1
                        ORDER BY
                            2 DESC
                        LIMIT
                            1
                    ) t
            )
    );
151604699 c4482ed4 3068 4c9f 9a82 9a29c131df2c

Step 5
  • film title

finding the `titles: I have 44 titles returned`

SELECT
    title AS film_title
FROM
    film
WHERE
    film_id IN (
        SELECT
            film_id
        FROM
            inventory
        WHERE
            inventory_id IN (
                SELECT
                    inventory_id
                FROM
                    rental
                WHERE
                    customer_id IN (
                        SELECT
                            customer_id
                        FROM
                            (
                                SELECT
                                    customer_id,
                                    sum(amount)
                                FROM
                                    payment
                                GROUP BY
                                    1
                                ORDER BY
                                    2 DESC
                                LIMIT
                                    1
                            ) t
                    )
            )
    );
151605325 21d40b07 73a1 4866 b9cf d175682f21e9

Step 6:
  • customer_name

To be able to obtain the customer_name, I need to recycle the customer_id query in step 2

SELECT
    concat(first_name, ' ', last_name) AS customer_name
FROM
    customer
WHERE
    customer_id IN (
        SELECT
            customer_id
        FROM
            (
                SELECT
                    customer_id,
                    sum(amount)
                FROM
                    payment
                GROUP BY
                    1
                ORDER BY
                    2 DESC
                LIMIT
                    1
            ) t
    );
151606431 e4e63ebf a5c0 4ae9 b46c c8357ce066ad

Step 7
  • Main outer final query

On the final query, I will insert in the SELECT statement the customer_name so we have a table with the rented films and the name of the person that rented them, in this case is our most profitable customer

SELECT
    title AS film_titles_rented,
    (
        SELECT
            concat(first_name, ' ', last_name) AS customer_name
        FROM
            customer
        WHERE
            customer_id IN (
                SELECT
                    customer_id
                FROM
                    (
                        SELECT
                            customer_id,
                            sum(amount)
                        FROM
                            payment
                        GROUP BY
                            1
                        ORDER BY
                            2 DESC
                        LIMIT
                            1
                    ) t
            )
    ) AS customer_name_most_prof
FROM
    film
WHERE
    film_id IN (
        SELECT
            film_id
        FROM
            inventory
        WHERE
            inventory_id IN (
                SELECT
                    inventory_id
                FROM
                    rental
                WHERE
                    customer_id IN (
                        SELECT
                            customer_id
                        FROM
                            (
                                SELECT
                                    customer_id,
                                    sum(amount)
                                FROM
                                    payment
                                GROUP BY
                                    1
                                ORDER BY
                                    2 DESC
                                LIMIT
                                    1
                            ) t
                    )
            )
    );
151607288 cf8aa5f7 91f5 4071 a07d 8f81f2874d89

Answer task 7 using joins:

in this approach, I combined joins and subqueries. To be able to filter the films by the most profitable customer at the WHERE clause, I recycled the subqueries in Step 2 when building the query using subqueries only.

Also this time Im using the USING clause, which I find very convenient to setup joins when the joining keys have the same name

SELECT
    f.film_id,
    f.title AS rented_films,
    c.customer_id,
    concat(c.first_name, ' ', c.last_name) AS most_profit_customer
FROM
    customer c
    INNER JOIN payment p USING (customer_id)
    INNER JOIN rental r USING (rental_id)
    INNER JOIN inventory i USING (inventory_id)
    INNER JOIN film f USING (film_id)
WHERE
    c.customer_id = (
        SELECT
            customer_id
        FROM
            (
                SELECT
                    customer_id,
                    sum(amount)
                FROM
                    payment
                GROUP BY
                    1
                ORDER BY
                    2 DESC
                LIMIT
                    1
            ) t
    )
GROUP BY
    2;
151650283 118c840b 4c5b 4b59 94b5 5b2485d607f4

8. Get the client_id and the total_amount_spent

  • of those clients who spent more than the average of the total_amount spent by each client.

Answer:

SELECT
    concat(c.first_name, ' ', c.last_name) AS customer_name,
    round(sum(amount), 2) AS amount
FROM
    customer c
    JOIN payment p ON c.customer_id = p.customer_id
GROUP BY
    1
HAVING
    amount > (
        SELECT
            avg(amount)
        FROM
            (
                SELECT
                    concat(c.first_name, ' ', c.last_name) AS customer_name,
                    round(sum(amount), 2) AS amount
                FROM
                    customer c
                    JOIN payment p ON c.customer_id = p.customer_id
                GROUP BY
                    1
            ) AS t
    )
ORDER BY
    2 DESC;
151611296 4b0cad6d f3fe 4bc1 acf1 99b81f8de422