Data Science is an interdisciplinary field that focuses on extracting knowledge from data sets, which are typically very large. The field encompasses preparing data for analysis, analyzing it, and presenting findings to inform high-level decisions in an organization.
In this roadmap I will recommend Python, although you may encounter R in more Data Analytics related jobs. Python mastery will come with time - learn enough of the basics to be able to read code and implement some simple projects with it.
To become a successful data scientist, you should have a knowledge of Statistics. Statistics will give you the ability to decide which algorithm suits a given problem.
Note: I strongly recommend studying these libraries from This Book. In case you aren't comfortable with reading, I will list some resources that I hope will help!
Pandas is used for data cleaning and data preprocessing.
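To make "data cleaning and preprocessing" a bit more concrete, here is a minimal sketch of a typical Pandas cleaning pass. The table, its column names (`city`, `price`), and its values are made up for illustration; real data will need its own rules.

```python
import pandas as pd

# Hypothetical raw data with common problems: a missing value,
# inconsistent text casing/whitespace, a duplicate row, and numbers stored as strings.
raw = pd.DataFrame(
    {
        "city": ["Cairo", "cairo ", "Giza", "Giza", None],
        "price": ["100", "250", "80", "80", "120"],
    }
)

clean = (
    raw.dropna(subset=["city"])  # drop rows that have no city
    .assign(
        city=lambda df: df["city"].str.strip().str.title(),  # normalize text
        price=lambda df: pd.to_numeric(df["price"]),  # strings -> numbers
    )
    .drop_duplicates()  # remove exact duplicate rows
    .reset_index(drop=True)
)

# A simple preprocessing step: aggregate the cleaned data per city.
mean_price_by_city = clean.groupby("city")["price"].mean()
print(clean)
print(mean_price_by_city)
```

The order matters here: the text is normalized before duplicates are dropped, so rows that differ only by casing or stray spaces are compared in their cleaned form.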
NumPy is short for Numerical Python and is one of Python's most important libraries. It is used for matrix and multidimensional array operations.
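As a small illustration of the matrix and multidimensional array operations mentioned above, the sketch below uses arbitrary example values; only the NumPy calls themselves are the point.

```python
import numpy as np

# A 2x3 matrix and a 3x2 matrix (values chosen only for illustration).
a = np.array([[1, 2, 3],
              [4, 5, 6]])
b = np.array([[1, 0],
              [0, 1],
              [1, 1]])

product = a @ b               # matrix multiplication, result has shape (2, 2)
col_means = a.mean(axis=0)    # mean of each column
centered = a - col_means      # broadcasting subtracts the column means from every row

cube = np.arange(24).reshape(2, 3, 4)  # a 3-dimensional array
print(product)
print(centered)
print(cube.shape)
```

Operations like `a - col_means` work on whole arrays at once, which is usually both shorter and faster than writing explicit Python loops.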
Feel free to take a look at a bunch of other people's notebooks on Kaggle :)
As a data scientist, you have to showcase your findings in visual form so that stakeholders can understand them properly.
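Here is a minimal sketch of what presenting a finding visually can look like with Matplotlib. The segments and numbers are invented for the example, and a bar chart is just one reasonable choice for comparing a few categories.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the script runs without a display
import matplotlib.pyplot as plt

# Hypothetical finding: average order value per customer segment.
segments = ["New", "Returning", "Loyal"]
avg_order_value = [32.5, 48.0, 71.2]

fig, ax = plt.subplots(figsize=(6, 4))
bars = ax.bar(segments, avg_order_value, color="steelblue")

# A title and labelled axes with units let the chart stand on its own.
ax.set_title("Average order value by customer segment")
ax.set_xlabel("Customer segment")
ax.set_ylabel("Average order value (USD)")
ax.bar_label(bars, fmt="%.1f")  # print the value on top of each bar
fig.tight_layout()
```

Calling `fig.savefig("chart.png")` afterwards (the filename is up to you) writes the chart to disk so it can go into a report or slide deck.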
- How to ask good questions?
- Practical Statistics for Data Scientists Book
- How to choose the right chart
- Matplotlib Course 1
- Matplotlib Course 2
- Seaborn Course 1
- Seaborn Course 2
Note: all you need is week 1 to week 5.
Quick reminder: going into ML is a lifelong learning commitment - you are going to learn new things time after time, no matter how far along you are in your career.
- Mathematics for Machine Learning Specialization
- The essence of calculus
- Essence of linear algebra
- Multivariable Calculus
- Intro to Inferential Statistics
- How do I choose which attributes of my data to include in the model?
- How do I choose which model to use?
- How do I optimize this model for best performance?
- How do I ensure that I'm building a model that will generalize to unseen data?
- Can I estimate how well my model is likely to perform on unseen data?
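One common way to approach the last two questions is k-fold cross-validation: hold out part of the data, fit on the rest, and score on the held-out part. The sketch below is a hand-written version using only NumPy with a plain least-squares linear model on synthetic data. The function name `k_fold_mse` and the data are my own assumptions for illustration; in practice you would likely use a library's ready-made utilities.

```python
import numpy as np

def k_fold_mse(X, y, k=5, seed=0):
    """Estimate the out-of-sample mean squared error of a least-squares linear model."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(X))    # shuffle once before splitting
    folds = np.array_split(indices, k)   # k roughly equal validation folds
    scores = []
    for i in range(k):
        val_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        # Add an intercept column, then fit on the training folds only.
        X_train = np.column_stack([np.ones(len(train_idx)), X[train_idx]])
        X_val = np.column_stack([np.ones(len(val_idx)), X[val_idx]])
        coef, *_ = np.linalg.lstsq(X_train, y[train_idx], rcond=None)
        # Score on the held-out fold, which the model did not see during fitting.
        scores.append(float(np.mean((X_val @ coef - y[val_idx]) ** 2)))
    return float(np.mean(scores)), scores

# Synthetic data (an assumption for the demo): y = 3*x + 2 plus Gaussian noise.
rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(100, 1))
y = 3 * X[:, 0] + 2 + rng.normal(0, 1, size=100)

mean_mse, fold_mses = k_fold_mse(X, y, k=5)
print(f"Cross-validated MSE: {mean_mse:.3f}")
```

Because every score comes from data the model was not fitted on, the average is a more honest estimate of performance on unseen data than the training error. Comparing this number across candidate models or feature sets is also one way to tackle the "which model" and "which attributes" questions.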
Last but not least, for everything you need to know about statistics, feel free to head over to this amazing YouTube channel: BrandonFoltz
- Kaggle Competitions
- The Top 15 Places to Find Datasets
- Deploying Machine Learning Models in Production
In the upcoming years I will update the roadmap with the Advanced Level.
Happy learning.