Ironhack Data Analytics Bootcamp

This repo contains all of the practical exercises I completed during the Data Analytics Bootcamp at Ironhack. The course ran for 9 weeks (20 January to 20 March 2020), followed by an additional career week. It was divided into 3 modules:

Git, Python and SQL;
Statistics and probability;
Machine Learning;

Lab Index

The table below indexes each exercise by bootcamp module and week, with a link to the exercise, the programming language, the libraries used, and the main topics covered or methods I used to solve the problems.

Mod/Week	Lab	Language	Libraries	Topics/Methods
M1-W1	resolving-git-conflicts	Git, Command Line, Bash	-	GitHub, add, commit, push, pull, merge, conflicts, pull requests
M1-W1	tuple-set-dict	Python	random, operator, pandas	random.sample, operator.itemgetter, pd.DataFrame
M1-W1	list-comprehensions	Python	os, numpy, pandas	os.listdir, os.path.join, pd.concat, np.array, _get_numeric_data
M1-W1	string-operations	Python	re, math	f-strings, str.lower, str.endswith, str.join, str.split, str.replace, re.findall, re.search, bag of words
M1-W1	lambda-functions	Python	-	functions, lambda, zip, sorted, dict.items
M1-W1	numpy	Python	numpy	np.random (random, rand, sample), np.ones, size, shape, np.reshape, np.transpose, np.array_equal, max, min, mean, np.empty, np.nditer
M1-W1	functions	Python	iter	functions, iterators, generators, yield
M1-W1	intro-pandas	Python	pandas, numpy	pd.Series, pd.DataFrame, df.columns, subsetting, df.mean, df.max, df.median, df.sum
M1-W1	python-project	Python	inquirer, playsound	Escape Room Python text game: functions, dictionaries, conditions
M1-W2	map-reduce-filter	Python	numpy, pandas, functools	functions, map, reduce, filter
M1-W2	import-export	Python	pandas	pd.read_csv, df.to_csv, pd.read_excel, df.head, df.value_counts
M1-W2	dataframe-calculations	Python	pandas, numpy, zipfile	df.shape, df.unique, str.contains, df.astype, df.isnull, df.apply, df.sort_values, df.equals, pd.get_dummies, df.corr, df.drop, df.groupby.agg, df.quantile
M1-W2	first-queries	SQL	-	create db, create table, select, distinct, group by, order by, where, limit, count
M1-W2	my-sql-select	SQL	-	aliases, inner join, left join, sum, coalesce
M1-W2	my-sql	SQL	-	db design, table relationships, db seeding, forward engineering schemas, one-to-many, many-to-one, many-to-many, linking tables
M1-W2	advanced-mysql	SQL	-	temporary tables, subqueries, permanent tables
M1-W2	data-cleaning	Python	pandas, numpy, scipy.stats	df.rename, df.dtypes, pd.merge, df.fillna, np.abs, stats.zscore
M1-W2	project-cities	Python	pandas	collected data online from different sources and analyzed the effect of increasing AirBnBs in Lisbon on hotel prices
M1-W3	api-scavenger	Python, APIs, Command Line	pandas, pandas.io.json	curl, pd.read_json, json_normalize, pd.to_datetime
M1-W3	web-scraping	Python, APIs	requests, beautifulsoup, tweepy	requests.get, requests.get.content, BeautifulSoup, soup.find_all, soup.tag.text, soup.tag.get, soup.tag.find, tweepy.get_user, tweepy.user_timeline, tweepy.user.statuses_count, tweepy.user.followers_count
M1-W3	advanced-regex	Python	re	re.findall, re.sub
M1-W3	matplotlib-seaborn	Python	matplotlib.pyplot, seaborn, numpy, pandas	plt.plot, plt.show, plt.subplots, plt.legend, plt.bar, plt.barh, plt.pie, plt.boxplot, plt.xticks, ax.set_title, ax.set_xlabel, sns.set, sns.distplot, sns.barplot, sns.despine, sns.violinplot, sns.catplot, sns.heatmap, np.linspace, df.select_dtypes, pd.Categorical, df.cat.codes, np.triu, sns.diverging_palette
M1-W3	pandas-deep-dive	Python	pandas	df.describe, df.groupby.agg, df.apply
M1-W3	project-data-thieves	Python	pandas, geopandas, geoplot	data from kaggle survey and web scraping to analyze the best countries in the world to work in data jobs (quality of life, number of offers and average salaries)
M2-W4	subsetting-and-descriptive-stats	Python	pandas, matplotlib, seaborn	df.loc, df.groupby.agg, df.quantile, df.describe
M2-W4	understanding-descriptive-stats	Python	pandas, random, matplotlib, numpy	random.choice, plt.hist, plt.vlines, np.mean, np.std
M2-W4	regression-analysis	Python	numpy, pandas, scipy, sklearn.linear_model, matplotlib, seaborn	plt.scatter, df.corr, scipy.stats.linregress, sns.heatmap, sklearn.LinearRegression, lm.fit, lm.score, lm.coef_, lm.intercept_
M2-W4	advanced-pandas	Python	pandas, numpy, random	df.isnull, df.set_index, df.reset_index, random.choices, df.lookup, pd.cut
M2-W4	mini-project1	Python	pandas, numpy, matplotlib, seaborn, scipy.stats	EDA, df.map, df.info, df.apply (with lambda), df.replace, df.dropna, sns.boxplot, plt.subplots_adjust, df.drop, sns.pairplot, sns.regplot, sns.jointplot, stats.linregress
M2-W4	pivot-table-and-correlation	Python	pandas, scipy.stats	df.pivot_table(index, columns, aggfunc), stats.linregress, plt.scatter, stats.pearsonr, stats.spearmanr
M2-W4	tableau	Tableau	-	mini project: analyzed the relationship between the number of characters in the title and description of apps and the number of downloads
M2-W5	intro-probability	Probability	-	probability space, conditional probability, contingency tables
M2-W5	reading-stats-concepts	Statistics	-	p-values, AB testing, means and expected values
M2-W5	probability-distributions	Python	scipy.stats, numpy	discrete: stats.binom, stats.poisson. continuous: stats.uniform, stats.norm, stats.expon, np.random.exponential, stats.rvs, stats.cdf, stats.pdf, stats.ppf
M2-W5	confidence-intervals	Python	scipy.stats, numpy	stats.norm.interval, calculating sample sizes
M2-W5	intro-to-scipy	Python	scipy, numpy	stats.tmean, stats.fisher_exact, scipy.interpolate, interpolate.interp1d, np.arange
M2-W5	hypothesis-testing-1	Python	scipy.stats, numpy, pandas, statsmodels	stats.ttest_1samp, stats.sem, stats.t.interval, pd.crosstab, statsmodels.proportions_ztest
M2-W5	hypothesis-testing-2	Python	pandas, scipy.stats	stats.f_oneway, stats.ttest_ind, stats.ttest_rel, pd.concat
M2-W5	mini-project2	Python	pandas, numpy, scipy.stats, matplotlib	stats.norm, stats.ppf, stats.t.interval, stats.pdf, np.linspace, stats.shapiro
M2-W6	two-sample-hyp-test	Python	pandas, scipy.stats, numpy	stats.ttest_ind, stats.ttest_rel, stats.ttest_1samp, stats.chi2_contingency, np.where
M2-W6	goodfit-indeptests	Python	scipy.stats, numpy	stats.poisson, stats.pmf, stats.chisquare, stats.norm, stats.kstest, stats.cdf, stats.chi2_contingency, stats.binom
M3-W7	intro-to-ml	Python	pandas, numpy, datetime, sklearn.model_selection	pd.to_numeric, df.interpolate, np.where, dt.strptime, dt.toordinal, train_test_split
M3-W7	supervised-learning-feature-extraction	Python	pandas, numpy	pd.to_numeric, df.apply, pd.to_datetime, np.where, pd.merge
M3-W7	supervised-learning	Python	pandas, seaborn, sklearn.model_selection, sklearn.linear_model, LogisticRegression, sklearn.neighbors, sklearn.preprocessing	df.corr, sns.heatmap, df.drop, df.dropna, pd.get_dummies, train_test_split, LogisticRegression, confusion_matrix, accuracy_score, KNeighborsClassifier, RobustScaler
M3-W7	supervised-learning-sklearn	Python	sklearn.linear_model, sklearn.datasets, sklearn.preprocessing, sklearn.model_selection, statsmodels.api, sklearn.metrics, sklearn.feature_selection	LinearRegression, load_diabetes, PolynomialFeatures, StandardScaler, train_test_split, sm.OLS, r2_score, RFE
M3-W7	unsupervised-learning	Python	sklearn.preprocessing, sklearn.cluster, sklearn.metrics, yellowbrick.cluster	StandardScaler, KMeans, silhouette_score, KElbowVisualizer, DBSCAN
M3-W7	unsupervised-learning-and-sklearn	Python	sklearn.preprocessing, sklearn.cluster, mpl_toolkits.mplot3d	LabelEncoder, KMeans, fig.gca(projection='3d')
M3-W8	problems-in-ml	Python	sklearn.metrics, sklearn.model_selection, sklearn.ensemble, sklearn.datasets, sklearn.svm, matplotlib.colors	r2_score, mean_squared_error, train_test_split, RandomForestRegressor, load_boston, SVC, ListedColormap
M3-W8	imbalance	Python	sklearn.model_selection, sklearn.preprocessing, sklearn.linear_model, sklearn.tree, sklearn.metrics	train_test_split, LabelEncoder, LogisticRegression, DecisionTreeClassifier, RobustScaler, StandardScaler, PolynomialFeatures, MinMaxScaler, confusion_matrix, accuracy_score
M3-W8	deep-learning	Python	tensorflow, keras.models, keras.layers, keras.utils, sklearn.model_selection	keras.Sequential, keras.Dense, keras.to_categorical, save_weights, load_weights
M3-W8	nlp	Python	re, nltk, nltk.stem, nltk.corpus, sklearn.feature_extraction.text, nltk.probability	WordNetLemmatizer, stopwords, CountVectorizer, TfidfVectorizer, ConditionalFreqDist, nltk.word_tokenize, nltk.PorterStemmer, nltk.WordNetLemmatizer, nltk.NaiveBayesClassifier, nltk.classify.accuracy, classifier.show_most_informative_features

ricardozacarias/ironhack-labs
