/Udacity

Primary language: HTML

Repository where I collect all the assignments for the Udacity Data Analytics Nanodegree.

Projects

Statistics -- Check repo

• Analyzed the Stroop effect, using descriptive statistics to build intuition about the data and inferential statistics to draw a conclusion from the results.

Skills: Python, IPython, Pandas.
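
As a rough illustration of the inferential step, a paired t-test on per-participant reading times can be sketched with the standard library alone. This is an assumption about the kind of test that suits Stroop data, not a copy of the notebook in the repo; the function name `paired_t_statistic` and the sample times are made up for the example.

```python
import math
import statistics


def paired_t_statistic(congruent, incongruent):
    """Return (mean difference, t statistic, degrees of freedom) for paired samples."""
    if len(congruent) != len(incongruent):
        raise ValueError("paired samples must have the same length")
    # Work on the per-participant differences (incongruent minus congruent).
    diffs = [b - a for a, b in zip(congruent, incongruent)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    t_stat = mean_diff / (sd_diff / math.sqrt(n))
    return mean_diff, t_stat, n - 1


# Hypothetical reading times in seconds, one pair per participant.
congruent = [12.0, 15.0, 11.0, 14.0, 13.0]
incongruent = [18.0, 21.0, 19.0, 20.0, 22.0]

mean_diff, t_stat, dof = paired_t_statistic(congruent, incongruent)
print(f"mean difference = {mean_diff:.2f} s, t = {t_stat:.2f}, dof = {dof}")
```

The t statistic would then be compared against the critical value for the chosen significance level and `dof` degrees of freedom.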


Exploratory Analysis with Pandas -- Check repo

• Which variables are related to surviving the Titanic? I posed this question of the data set and used descriptive and modelling strategies to uncover the relationships.

Skills: Python, IPython, Pandas.


Data gathering and wrangling with SQL and Pandas -- Check repo

• Parsed a 140 MB XML document to obtain relevant data.
• Cleaned, audited and corrected more than 2,500 records.
• Stored the cleaned data in a SQL database, ran queries and generated plots. Created map plots to inspect georeferenced data.

Skills: Python, SQL, XML parsing, regular expressions, Pandas, Basemap, GeoPandas.
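
A file of that size is usually parsed as a stream rather than loaded whole. The following is a minimal sketch of that idea using `xml.etree.ElementTree.iterparse`; the helper name `iter_elements`, the element names `node` and `way`, and the inline sample are hypothetical and do not describe the actual data set or the code in the repo.

```python
import io
import xml.etree.ElementTree as ET


def iter_elements(source, wanted_tag):
    """Yield the attributes of each `wanted_tag` element, one at a time."""
    # "end" events fire once an element has been read completely.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == wanted_tag:
            yield dict(elem.attrib)  # copy before clearing
            elem.clear()  # drop the element's contents to limit memory use


# Tiny in-memory stand-in for the large file (made-up content).
sample = io.BytesIO(
    b"<data><node id='1' name='a'/><node id='2' name='b'/><way id='3'/></data>"
)
records = list(iter_elements(sample, "node"))
print(records)
```

In practice `source` would be a file path, and each yielded record would go through the auditing and cleaning steps before being written to the database.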


Exploratory Analysis with R -- Check repo

• Cleaned, merged and analyzed data on the consequences of earthquakes worldwide from the 1900s onward.
• Created a notebook with clear steps for getting, cleaning and merging the data from different sources. Created a codebook describing all the variables included in the final dataset.
• Created more than 20 visualizations to understand the data. Analyzed the conditional relationships between earthquake deaths and magnitude, regime type and GDP per capita.

Skills: R, RStudio, ggplot2, Python, pandas, GeoPandas.


Machine Learning - Enron Case -- Check repo

• Identified which Enron employees are more likely to have committed fraud, using machine learning and public Enron financial and email data.
• Trained and tested different algorithms and used feature selection techniques.
• Tuned the algorithms' parameters to improve on the original results.

Skills: Python, Scikit-learn, Pandas, machine learning.


Data visualization - Earthquake project -- Check repo

• Developed a visualization in which users can interact with the geographical and temporal features of earthquakes.
• Integrated D3.js and Leaflet to produce animations and transitions.
• Project featured by Data Science Weekly.

Skills: D3.js, Leaflet, GeoPandas, Pandas, Python.


A/B testing -- Check repo

• Designed an A/B test, choosing which metrics to measure and how long the test should run. Also analyzed the results of an A/B test run by Udacity, recommended a decision, and proposed a follow-up experiment.

Skills: Pandas, IPython.
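
To give a flavour of the analysis step, the sketch below compares a conversion rate between a control and an experiment group with a pooled two-proportion z-test. It is only an assumed example of such an analysis: the function name `two_proportion_z` and the counts are invented, and the project itself may have used different metrics and methods.

```python
import math


def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """Return (difference, z statistic, two-sided p-value) for two proportions."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both groups share one rate.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_b - p_a, z, p_value


# Hypothetical counts: conversions and visitors in control (a) and experiment (b).
diff, z, p_value = two_proportion_z(conv_a=200, n_a=2000, conv_b=240, n_b=2000)
print(f"difference = {diff:.3f}, z = {z:.2f}, p = {p_value:.3f}")
```

A launch recommendation would weigh this statistical result against a practical significance threshold set before the test is run.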
