The data for this exercise has been derived from the Indeed Data Scientist/Analyst/Engineer dataset on kaggle.com

Provide the SQL queries and answers for the following questions/tasks using the data_analyst_jobs table you have created in PostgreSQL:

  1. How many rows are in the data_analyst_jobs table? 1793

SELECT COUNT (*) FROM data_analyst_jobs;

  1. Write a query to look at just the first 10 rows. What company is associated with the job posting on the 10th row? ExxonMobil

SELECT * FROM data_analyst_jobs LIMIT 10;

  1. How many postings are in Tennessee? How many are there in either Tennessee or Kentucky? 21

SELECT COUNT (location) FROM data_analyst_jobs WHERE location = 'TN';

27

SELECT COUNT (location) FROM data_analyst_jobs WHERE location = 'TN' OR location = 'KY';

  1. How many postings in Tennessee have a star rating above 4? 3

SELECT COUNT (location) FROM data_analyst_jobs WHERE location = 'TN' AND star_rating > 4;

  1. How many postings in the dataset have a review count between 500 and 1000? 151

SELECT COUNT (review_count) FROM data_analyst_jobs WHERE review_count BETWEEN 500 AND 1000;

  1. Show the average star rating for each state. The output should show the state as state and the average rating for the state as avg_rating. Which state shows the highest average rating? NE

SELECT location AS state, AVG(star_rating) AS avg_rating FROM data_analyst_jobs GROUP BY state ORDER BY avg_rating DESC;

  1. Select unique job titles from the data_analyst_jobs table. How many are there? 881

SELECT DISTINCT title FROM data_analyst_jobs;

SELECT COUNT(DISTINCT title) FROM data_analyst_jobs;

  1. How many unique job titles are there for California companies? 230

SELECT COUNT(DISTINCT title) FROM data_analyst_jobs WHERE location = 'CA';

  1. Find the name of each company and its average star rating for all companies that have more than 5000 reviews across all locations. How many companies are there with more that 5000 reviews across all locations? 70

SELECT company, AVG(star_rating) AS avg_star_rating FROM data_analyst_jobs GROUP BY company HAVING SUM(review_count) > 5000 AND company IS NOT NULL;

SELECT COUNT(company) FROM (SELECT company FROM data_analyst_jobs GROUP BY company HAVING SUM(review_count) > 5000 AND company IS NOT NULL) AS subquery;

  1. Add the code to order the query in #9 from highest to lowest average star rating. Which company with more than 5000 reviews across all locations in the dataset has the highest star rating? What is that rating? Google, 4.3...

SELECT company, AVG(star_rating) AS avg_star_rating FROM data_analyst_jobs GROUP BY company HAVING SUM(review_count) > 5000 AND company IS NOT NULL ORDER BY avg_star_rating DESC;

  1. Find all the job titles that contain the word ‘Analyst’. How many different job titles are there?
  2. 774 with all case types

SELECT title FROM data_analyst_jobs WHERE title LIKE '%Analyst%';

SELECT COUNT(DISTINCT title) FROM data_analyst_jobs WHERE title LIKE '%Analyst%';

SELECT COUNT(DISTINCT title) FROM data_analyst_jobs WHERE title LIKE '%Analyst%' OR title LIKE '%ANALYST%' OR title LIKE '%analyst%';

SELECT COUNT(DISTINCT title) FROM data_analyst_jobs WHERE LOWER(title) LIKE '%analyst%'

  1. How many different job titles do not contain either the word ‘Analyst’ or the word ‘Analytics’? What word do these positions have in common?
  2. Tableau

SELECT DISTINCT title FROM data_analyst_jobs WHERE title NOT LIKE '%Analyst%' AND title NOT LIKE '%analyst%' AND title NOT LIKE '%ANALYST%' AND title NOT LIKE '%Analytics%' AND title NOT LIKE '%analytics%' AND title NOT LIKE '%ANALYTICS%' ;

SELECT DISTINCT title FROM data_analyst_jobs WHERE LOWER(title) NOT LIKE '%analyst%' AND LOWER(title) NOT LIKE '%analytics%' ;