Data Science Bootcamp at IT Academy from Barcelona Activa
September 2023 to February 2024
Presentation of the course:
The main objective of this course was to learn the basic concepts of data science and to become proficient in Python for data analytics. The course was aimed at students with a basic knowledge of statistics, SQL, and Python. All of the work was carried out in Jupyter Notebook.
Course contents:
The course consisted of 10 thematic modules and a cross-cutting final project. Each module included self-learning resources and tasks.
- Sprint 0. Introduction to the course and installation of the working environment: This module explained what a working environment is and what it is for, then covered the installation of Jupyter Notebook and the setup of GitHub, which were used to deliver the exercises.
- Sprint 1, also known as "Base de dades" (Catalan for "Databases"). Relational databases (SQL): Basic concepts of SQL were covered, and relational databases were created and queried with MySQL.
- Sprint 2. Python basics: Basic Python concepts such as data structures and control structures were covered.
- Sprint 3. Numerical programming, dataframes, and statistical analysis: The basic Python packages for statistical data processing and for working with tables were introduced, specifically the NumPy and pandas libraries.
- Sprint 4. Graphical visualization of data: The main packages that enable graphical data visualization were explored, specifically the Matplotlib and Seaborn packages.
- Sprint 5. Introduction to Machine Learning: The first concepts of machine learning were introduced, along with one of the main data science packages for Python: scikit-learn.
- Sprint 6. Supervised learning algorithms: Regression: The main regression algorithms were presented, along with how to evaluate them.
- Sprint 7. Supervised learning algorithms: Classification: The main classification algorithms were presented, along with how to evaluate them.
- Sprint 8. Unsupervised learning algorithms: Clustering: The main clustering algorithms were presented, along with how to evaluate them.
- Sprint 9. Sentiment and text analysis: Text mining was introduced and explored.
- Sprint 10. Web scraping and automation: The main web scraping packages for Python were presented and used in various exercises.
Instructions provided for the final project:
The final project was required to have a professional focus, meaning it had to address a real case or a topic of interest. The project had to include at least the following sections:
- Presentation, contextualization, and interest of the case.
- Description of the data.
- Statistical and graphical analysis of the most relevant data.
- Application of algorithms seen throughout the course, which may be useful in the case.
- Analysis of the extracted information.
- Conclusions.
This project had to be presented and defended in front of colleagues and mentors. This was successfully completed on Wednesday, 7 February 2024.
Important:
All sprints were conducted in Catalan (the institutional language). After receiving approval from my mentor and colleagues, I decided to complete my final project, "Analyzing Female Bouldering Results in Climbing Competitions for Olympic Qualification Predictions", in English to make it accessible to a wider audience.
Thank you!