Welcome to the tutorials page for the IS509 Introduction to Data Science course. The instructor of the course is Assoc. Prof. Dr. Altan Koçyiğit, and the assistant handling the applied portion is Ece Işık Polat. This page contains the sample code for the classes and assignments. The full files are available in the GitHub repository for use in your local environment.
The course syllabus is as follows:
-Week 1: Introduction to Data Science
-Week 2: Frameworks and Platforms
-Week 3: Understanding Data
-Week 4: Probability Overview
-Week 5: Statistical Inference - Part I
-Week 6: Statistical Inference - Part II
-Week 7: Data Preprocessing - Part I
-Week 8: Data Preprocessing - Part II
-Week 9: Data Preprocessing - Part III
-Week 10: Regression Analysis
-Week 11: Classification - Part I
-Week 12: Classification - Part II
-Week 13: Clustering - Part I
-Week 14: Clustering - Part II
-Anaconda is a free and open-source distribution of the Python and R programming languages for scientific computing (data science, machine learning applications, large-scale data processing, predictive analytics, etc.).
-To install Anaconda, you can use the installer or follow these steps.
-Please see the Anaconda Starter Guide and Cheat Sheet.
Some of the useful packages used in the course are listed below. Detailed documentation and examples are available from the given links.
1. NumPy: N-dimensional array for numerical computation
2. SciPy: Scientific computing library for Python
3. Matplotlib: 2D plotting library for Python
4. Pandas: Powerful Python data structures and data analysis toolkit
5. Seaborn: Statistical graphics library for Python
6. Scikit-Learn: Python modules for machine learning and data mining
7. Jupyter Notebook/Lab: Web app that allows you to create and share documents that contain live code, equations, visualizations, and explanatory text
You may want to see the Quick Introduction to Jupyter tutorial.
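Once Anaconda is installed, it can be useful to confirm that the packages listed above are importable in your environment. The following is a minimal sketch, not part of the official course material: the helper name `installed_versions` and the `PACKAGES` list are illustrative assumptions, and the script only reports what it finds without installing anything.

```python
import importlib

# Import names for the packages listed above.
# Note: scikit-learn is imported under the name "sklearn".
PACKAGES = ["numpy", "scipy", "matplotlib", "pandas", "seaborn", "sklearn"]


def installed_versions(names=PACKAGES):
    """Map each import name to its version string, or None if it cannot be imported."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # The package is missing from the current environment.
            versions[name] = None
        else:
            # Fall back to "unknown" if the module exposes no __version__ attribute.
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        status = version if version is not None else "NOT INSTALLED"
        print(f"{name}: {status}")
```

You can run this as a script or paste it into a Jupyter Notebook cell. Any package reported as not installed can usually be added with conda or pip.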