Team of three:
- Eliana Suarez
- Zoe Liu
- Jean-Paul Mittehofer
In which parts of the United States are job postings concentrated, in a particular area and at a particular time? To answer this, we searched for an appropriate data source and then determined what information the data needed to include:
- Category
- Time
- Lat/Lng
- Unique ID
- Company Name
With the source determined, we cleaned the data with pandas. We then moved on to data visualization and decided which information was important to extract.
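As a rough illustration of the cleaning step, the sketch below keeps only the fields listed above, converts types, and drops incomplete or duplicate rows. The column names (`id`, `category`, `created`, `latitude`, `longitude`, `company`) and the function name `clean_postings` are assumptions made for this example; the actual notebook may use different headers and additional steps.

```python
import pandas as pd

# Hypothetical column names; the real CSV headers may differ.
REQUIRED = ["id", "category", "created", "latitude", "longitude", "company"]


def clean_postings(raw: pd.DataFrame) -> pd.DataFrame:
    """Keep the required columns and drop unusable or duplicate rows."""
    df = raw[REQUIRED].copy()

    # Convert types; values that cannot be parsed become NaT/NaN.
    df["created"] = pd.to_datetime(df["created"], errors="coerce")
    df["latitude"] = pd.to_numeric(df["latitude"], errors="coerce")
    df["longitude"] = pd.to_numeric(df["longitude"], errors="coerce")

    # A posting needs an ID, a date, and coordinates to be plotted.
    df = df.dropna(subset=["id", "created", "latitude", "longitude"])

    # The unique ID identifies a posting, so keep the first copy only.
    df = df.drop_duplicates(subset="id")
    return df.reset_index(drop=True)
```

A cleaned frame like this could then be written out with `DataFrame.to_csv` for the visualization notebooks.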
Use the package manager pip to install the following:
pip install jupyter notebook
pip install pandas
pip install gmaps
pip install numpy
pip install matplotlib
pip install opencage
The random module is part of the Python standard library, so it does not need to be installed with pip.
- classResources (Contains related material needed to complete the project assignment.)
- data (Contains all data sources.)
  - cleanData (Data that is ready for visualization. 1_master_clean_data is the data used in our analysis.)
  - rawData (Unformatted data straight from the source.)
  - vizData (Visual output of the data.)
- notebooks (Working notebook for each team member.)
- reports (Final notebooks and presentation.)
ProjectOnePyBootCamp\reports:
- Final_API.ipynb (Key needed to connect to Adzuna and Google Geocoding)
- Final_Cleaning_Up_Data.ipynb
- Final_API_City_State.ipynb (Key needed to connect to OpenCage Geocoding)
- Final_Visualization.ipynb
- Final_Gmaps.ipynb
ProjectOnePyBootCamp\data\vizData:
- 1_master_Graph_Top_5_Categories
- 2_master_Graph_Top_5_Cities
- 3_master_Graph_Top_5_State
- 4_master_Graph_Top_City_Top_5_Category
- 5_master_Graph_Top_State_Top_5_Category
- 6_master_Plot_Category_Job_Postings_Month
- 7_master_Plot_City_Job_Postings_Month
- 8_master_Plot_State_Job_Postings_Month
- 9_master_Plot_Total_Job_Postings_Month
- Adzuna is a search engine for job advertisements. The company operates in 16 countries worldwide, and its UK website aggregates job ads from several thousand sources. It is the source of our dataset, including the latitude and longitude that give the exact location of each job offer.
- OpenCage Geocoder is an open geocoding service. Its API combines multiple geocoding systems in the background, each optimized for different parts of the world and types of requests. We used it to get the latitude and longitude of cities in the United States.
- Google Geocoding is part of the Google Maps Platform. We used it for reverse geocoding: converting geographic coordinates into a readable address with city, state, and country.
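The two geocoding directions described above can be sketched as follows. This is a minimal example, not the code from our notebooks: the function names are hypothetical, and the lookups are written against the general shape of each service's JSON response (a `results` list with a `geometry` entry for OpenCage, and `address_components` with `types` for Google) so that the parsing can be tried without an API key.

```python
def city_coordinates(geocode, city, state):
    """Forward geocoding: return (lat, lng) for a US city, or None.

    `geocode` is any callable that takes a query string and returns a
    list of results, for example the `geocode` method of an
    `opencage.geocoder.OpenCageGeocode` instance created with your key.
    """
    results = geocode(f"{city}, {state}, USA")
    if not results:
        return None
    geometry = results[0]["geometry"]
    return geometry["lat"], geometry["lng"]


# Maps Google address component types to the fields we want to keep.
WANTED_TYPES = {
    "locality": "city",
    "administrative_area_level_1": "state",
    "country": "country",
}


def extract_city_state_country(payload):
    """Reverse geocoding: pull city, state, and country from a response.

    `payload` is the decoded JSON of a Google Geocoding response.
    Returns None when the request did not succeed or found nothing.
    """
    if payload.get("status") != "OK" or not payload.get("results"):
        return None
    found = {"city": None, "state": None, "country": None}
    for component in payload["results"][0]["address_components"]:
        for type_name in component["types"]:
            field = WANTED_TYPES.get(type_name)
            if field is not None and found[field] is None:
                found[field] = component["long_name"]
    return found
```

Passing the geocoder in as a callable keeps the parsing separate from the network call, which makes it easy to try the functions on saved responses before spending API quota.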
Our main raw data set could not be pushed to GitHub because the CSV file exceeded 200 MB. Once the raw data was cleaned, its size dropped below 50 MB, small enough to meet GitHub's file size policies.
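A quick check like the one below can flag oversized CSV files before a push. It is only a sketch: the function name `files_over_limit` is hypothetical, and the 50 MB default is taken from the figure mentioned above rather than from any official limit.

```python
from pathlib import Path


def files_over_limit(root, limit_mb=50.0):
    """Return the CSV files under `root` that are larger than `limit_mb`.

    Paths are returned relative to `root`, sorted alphabetically.
    """
    root = Path(root)
    limit_bytes = limit_mb * 1024 * 1024
    return sorted(
        path.relative_to(root).as_posix()
        for path in root.rglob("*.csv")
        if path.is_file() and path.stat().st_size > limit_bytes
    )
```

For example, `files_over_limit("data")` would list any CSV under the data folder that is still too large to commit.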