Hey there!
This repository contains all the code for my story for the San Antonio Express-News that looks at the highest-paying bachelor's degrees in the San Antonio area.
The story is based on data from the U.S. Department of Education's College Scorecard, a database of information on colleges and universities in the United States. The data is available on the College Scorecard website. You can find the main analysis in the notebooks/san-antonio-major-earnings.ipynb Jupyter notebook.
Specifics of the analysis:
- Data were pulled March 16, 2023.
- The U.S. Department of Education says that the data was last updated September 14, 2022.
- Median income figures are based on students who graduated during award years 2014-15 and 2015-16 and who accepted federal aid during their studies. The scorecard sources its income figures from administrative tax records maintained by the IRS within the Department of the Treasury.
- Not all median income figures are publicly available. If a program's class size was too small, its figure was hidden for privacy reasons.
- If the scorecard's data provides a median income figure for a major while listing 0 graduates for that major, it suggests that either the program is no longer offered at that school or there is a mismatch in the data. Refer to page 12 of the education department's technical documentation for more information. During my analysis, I found two programs with zero graduates and chose to filter them out.
- Only bachelor's degrees were analyzed for this story.
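The filtering rules above can be illustrated with a small sketch. This is not the code from the notebook: it uses only the Python standard library, the toy rows are made up, and the column names (`school`, `major`, `credential`, `graduates`, `median_income`) and the `PrivacySuppressed` marker are assumptions chosen for illustration rather than the scorecard's confirmed schema.

```python
import csv
import io

# Toy rows standing in for the scorecard's program-level data.
# Column names and values are illustrative assumptions, not real figures.
SAMPLE_CSV = """\
school,major,credential,graduates,median_income
School A,Nursing,Bachelor's Degree,120,61000
School A,History,Bachelor's Degree,8,PrivacySuppressed
School B,Engineering,Bachelor's Degree,0,70000
School B,Welding,Associate's Degree,45,42000
School B,Accounting,Bachelor's Degree,60,52000
"""


def load_programs(text):
    """Parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def filter_programs(rows):
    """Apply the three rules described above, then rank by median income."""
    kept = []
    for row in rows:
        # Rule 1: only bachelor's degrees are analyzed.
        if row["credential"] != "Bachelor's Degree":
            continue
        # Rule 2: skip figures hidden for privacy (non-numeric values).
        try:
            income = int(row["median_income"])
        except ValueError:
            continue
        # Rule 3: drop programs that list zero graduates.
        if int(row["graduates"]) == 0:
            continue
        kept.append({**row, "median_income": income})
    # Highest-paying programs first.
    return sorted(kept, key=lambda r: r["median_income"], reverse=True)


ranked = filter_programs(load_programs(SAMPLE_CSV))
for row in ranked:
    print(row["school"], row["major"], row["median_income"])
```

With the toy data, only the Nursing and Accounting rows survive all three rules. The real analysis may apply the rules in a different order or with different tooling.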
If you'd like to run the code yourself, use the following steps:
- Clone the repo.
- Install the requirements: pip install -r requirements.txt
- You can either run the Jupyter notebook or cd into the notebooks folder and use the nbexec get-coordinates.ipynb command to run it as a script.