This is a data analysis of the Stack Overflow Annual Developer Survey. The project follows the Cross-Industry Standard Process for Data Mining (CRISP-DM), which includes the following steps:
- Business Understanding
- Data Understanding
- Prepare Data
- Model Data
- Results
- Deploy
The environment needed for this project:
- data analysis.ipynb: Jupyter notebook that contains the code and the detailed data analysis
- data
- latin_america.csv: list of countries that are part of Latin America and the Caribbean
- developer_survey_2020.zip: developer survey data; it can also be downloaded here.
- Clone the GitHub repository:
git clone https://github.com/Erickramirez/Stack-Overflow-data-analysis.git
- Verify the prerequisites.
- Extract the developer survey data into the following path (create it if it doesn't exist):
/data/2020
- Run the Jupyter notebook:
data analysis.ipynb
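The extraction step can also be scripted. The following is a minimal sketch, not part of the repository: the function name `extract_survey` is hypothetical, and the paths in the usage comment assume the archive sits in `data/` and that the `/data/2020` path above is relative to the repository root.

```python
import zipfile
from pathlib import Path
from typing import List


def extract_survey(archive: Path, target_dir: Path) -> List[str]:
    """Extract every file in `archive` into `target_dir`.

    The target directory is created first if it does not exist.
    Returns the names of the archive members.
    """
    target_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target_dir)
        return zf.namelist()


# Example call (assumed locations, relative to the repository root):
# extract_survey(Path("data/developer_survey_2020.zip"), Path("data/2020"))
```

Running the example call from the repository root should leave the survey files under `data/2020`, ready for the notebook to read.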
I'm from Guatemala, and I'm interested in how Latin American developers work. I would like to answer the following questions:
- What are the demographics of developers in Latin America and the Caribbean (LAC): average age, country, and level of education?
- Does job satisfaction change with professional experience in LAC?
- How does salary vary with years of experience, country, and developer type?
- Which skills do developers value most (the languages they love), and which skills earn the most?
The answers to these questions are in this Medium post.
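As a rough illustration of the salary question, here is a sketch of a per-country median restricted to LAC countries. It is not the notebook's actual code. It uses only the standard library, assumes the survey rows expose `Country` and `ConvertedComp` columns with missing answers left blank or written as `NA`, and the function name `median_salary_by_country` is hypothetical.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, Iterable, Set


def median_salary_by_country(
    rows: Iterable[dict], lac_countries: Set[str]
) -> Dict[str, float]:
    """Median of ConvertedComp per country, limited to `lac_countries`."""
    salaries = defaultdict(list)
    for row in rows:
        country = row.get("Country") or ""
        comp = row.get("ConvertedComp") or ""
        # Skip respondents outside LAC and rows without a usable salary.
        if country in lac_countries and comp not in ("", "NA"):
            salaries[country].append(float(comp))
    return {country: median(values) for country, values in salaries.items()}


# Example call (assumed file name inside data/2020):
# import csv
# with open("data/2020/survey_results_public.csv", newline="", encoding="utf-8") as f:
#     result = median_salary_by_country(csv.DictReader(f), {"Guatemala", "Brazil"})
```

The set of LAC countries would come from `latin_america.csv`; its column layout is not described here, so the example passes the set in directly. The median is used instead of the mean because a few very high salaries would otherwise skew the result.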