Author: Dr. Jody-Ann S. Jones, Data Steward | Contact me at jody.jones@tum.de
I am Dr. Jody-Ann S. Jones, a passionate Data Steward at the Munich Data Science Institute. My expertise lies in leveraging Python for insightful data analysis and advancing machine learning techniques. With a commitment to fostering knowledge and skill development in these areas, I've been instrumental in designing and leading the "Introduction to Python for Data Analysis Seminar."
My journey in data science is marked by a strong academic background and practical experiences that span various industries. I believe in the power of data to drive decision-making and innovation, which is reflected in my approach to teaching and mentoring. My goal is to empower learners by providing them with the tools and understanding necessary to harness the potential of data.
In this repository, I share my insights, best practices, and a wealth of resources to guide both beginners and seasoned professionals in the ever-evolving field of data science. Whether you're looking to grasp the basics of Python, delve into complex machine learning algorithms, or refine your data analysis skills, this repository is your go-to resource.
Feel free to reach out to me at jody.jones@tum.de for any queries, collaborations, or discussions related to data science. Let's embark on this exciting journey of discovery and innovation together!
This repository hosts a curated set of notebooks and related resources, offering an extensive guide for those using Python in data analysis and machine learning.
├── README.md
├── datasets
│   ├── seoul_bike_data.csv
│   └── world_education_data.csv
├── end_to_end_research
│   ├── end_to_end.ipynb
│   └── end_to_end_checklist.md
├── exploratory_data_analysis
│   └── eda.ipynb
├── feature_engineering
│   └── feature_engineering.ipynb
├── getting_started
│   ├── first_steps.ipynb
│   └── resources.ipynb
├── images
│   └── mdsi_logo.png
└── python_basics
    └── python_crash_course.ipynb
README.md serves as the repository's front page. It gives an overview of the repository's purpose and contents and explains how to navigate and use the resources effectively. It is the first point of contact for anyone exploring the repository.
The datasets folder contains the datasets used in the seminar: seoul_bike_data.csv for bike-sharing analysis in Seoul and world_education_data.csv for a global education analysis. These datasets are integral to hands-on learning and to applying Python to real-world data analysis.
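As a minimal sketch of how one of these files might be opened, the snippet below previews the first rows of a CSV using only Python's standard library. The helper name preview_csv, the utf-8 default encoding, and the path relative to the repository root are assumptions for illustration; the seminar notebooks may load the data differently.

```python
import csv
from itertools import islice
from pathlib import Path


def preview_csv(path, n=5, encoding="utf-8"):
    """Return the header row and the first n data rows of a CSV file."""
    with Path(path).open(newline="", encoding=encoding) as handle:
        reader = csv.reader(handle)
        header = next(reader, [])        # empty list if the file is empty
        rows = list(islice(reader, n))   # read at most n data rows
    return header, rows


if __name__ == "__main__":
    # Assumed location: run from the repository root.
    csv_path = Path("datasets") / "world_education_data.csv"
    if csv_path.exists():
        header, rows = preview_csv(csv_path)
        print(header)
        for row in rows:
            print(row)
```

Reading only the first few rows keeps the preview cheap even for large files. If a file fails to decode, passing a different encoding argument is one thing to try.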
The end_to_end_research folder houses materials for conducting comprehensive research projects. It includes End to End ML Project (end_to_end.ipynb), a Jupyter notebook that guides users through a complete data science project from start to finish, and End to End Project Checklist (end_to_end_checklist.md), a markdown checklist that helps ensure all essential steps of a data science project are covered.
The exploratory_data_analysis folder contains eda.ipynb, a Jupyter notebook on Exploratory Data Analysis (EDA). It teaches how to perform initial investigations on data to discover patterns, spot anomalies, test hypotheses, and check assumptions using summary statistics and plots.
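To make the idea of an initial investigation concrete, here is a small sketch that computes summary statistics and flags possible anomalies with the common 1.5 × IQR rule. It uses only the standard library; the function names and the sample values are made up for illustration and are not taken from the notebook or the seminar datasets.

```python
import statistics


def summarize(values):
    """Return basic descriptive statistics for a list of numbers."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "q1": q1,
        "q3": q3,
        "min": min(values),
        "max": max(values),
    }


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Made-up sample values, purely illustrative.
sample = [12, 15, 14, 13, 16, 15, 14, 90]

if __name__ == "__main__":
    print(summarize(sample))
    print(iqr_outliers(sample))
```

The IQR rule is only one of several ways to flag unusual values. A flagged point deserves a closer look, not automatic removal.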
The feature_engineering folder hosts feature_engineering.ipynb, a notebook dedicated to feature engineering: creating new features from existing data and selecting the most relevant features for modeling, both of which can substantially improve the performance of machine learning algorithms.
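The two halves of feature engineering described above, creating features and selecting them, can be sketched as follows. The example derives calendar features from a timestamp and then ranks numeric features by their absolute Pearson correlation with a target. The field names ("timestamp", "a", "b", "y"), the helper names, and the toy rows are all hypothetical, and correlation ranking is just one simple selection heuristic chosen for illustration, not necessarily the method used in the notebook.

```python
from datetime import datetime
from math import sqrt


def add_time_features(record):
    """Return a copy of record with features derived from its ISO timestamp."""
    ts = datetime.fromisoformat(record["timestamp"])
    return {
        **record,
        "hour": ts.hour,
        "weekday": ts.weekday(),              # Monday is 0, Sunday is 6
        "is_weekend": int(ts.weekday() >= 5),
    }


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return cov / (norm_x * norm_y)


def rank_features(rows, feature_names, target):
    """Order feature names by absolute correlation with the target, strongest first."""
    ys = [row[target] for row in rows]
    scores = {
        name: abs(pearson([row[name] for row in rows], ys))
        for name in feature_names
    }
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    # Hypothetical toy rows, not drawn from the seminar datasets.
    print(add_time_features({"timestamp": "2024-01-06T08:00:00"}))
    rows = [
        {"a": 1, "b": 5, "y": 2},
        {"a": 2, "b": 3, "y": 4},
        {"a": 3, "b": 6, "y": 6},
        {"a": 4, "b": 4, "y": 8},
    ]
    print(rank_features(rows, ["a", "b"], "y"))
```

Correlation only captures linear relationships, so a feature with a low score may still be useful to a non-linear model.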
The getting_started folder helps you prepare your machine to follow along with this repo. It contains two notebooks. First Steps (first_steps.ipynb) guides you through downloading and installing the software and tools you need to take full advantage of the content in this repo. Resources (resources.ipynb) lists additional learning material, tutorials, and guides for Python and data science; it supplements the content in this repo with documentation links, cheatsheets, and other references you can turn to if you get stuck in your research projects.
The images folder stores images used in the repository, including mdsi_logo.png. It holds visuals that are referenced in the notebooks or the README file for illustrative purposes.
The python_basics folder includes Python Basics Crash Course (python_crash_course.ipynb), a notebook aimed at those who are new to Python or to programming in general. It covers fundamental Python concepts and operations.
Each folder is structured to facilitate a step-by-step learning process, starting from Python basics to more advanced topics in data science.