
Getting Started with Data Science - Introduction

Introduction

Congratulations on making it this far! Now that you have mastered the fundamentals of programming with Python, descriptive statistics, and data visualization, we're going to start digging into the process of "doing data science".

Data Science Fundamentals

In the first half of this section, we will introduce a lot of new ideas about what we mean by "data science". What is the process? What kinds of problems can data science solve?

We will also go over some key professional concerns of data scientists, including following code best practices and being ethical in our use of data.

Professional Data Science Environment Setup

So far, all of your lessons have been completed in a cloud environment that "just works". You open a lesson and are immediately able to run through your own copy of the code without worrying about where the code came from, how it is stored, whether you have the appropriate software downloaded, etc.

This is very convenient for educational purposes, but it is not very representative of a real-world data science environment. So, in the second half of this section, we will show you how to set up all of these tools so that your computer has a professional data science environment!

The tools we cover in this section include:

  • Python
  • Jupyter Notebook
  • Anaconda
  • Git
  • GitHub

You have actually already been using all of these tools "under the hood", but these lessons will walk through what they are all used for and how to install and use them on your computer.
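If you are curious which of these tools are already available on your computer, the short sketch below is one way to check. It only looks for each tool's command on your `PATH` and does not install anything. The command names (`python`, `jupyter`, `conda`, `git`) and the helper `find_tools` are assumptions made for illustration; on some systems the Python command is `python3`, for example. GitHub is a website rather than a program you install, so it is not included.

```python
import shutil
import sys

# Assumed command names for each tool; these may differ on your system
# (for example, "python3" instead of "python").
TOOLS = {
    "Python": "python",
    "Jupyter Notebook": "jupyter",
    "Anaconda": "conda",
    "Git": "git",
}


def find_tools(tools=TOOLS):
    """Map each tool name to the path of its command, or None if it is not on PATH."""
    return {name: shutil.which(command) for name, command in tools.items()}


if __name__ == "__main__":
    # The interpreter running this script is always available, so report its version.
    print(f"Running Python {sys.version.split()[0]}")
    for name, path in find_tools().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

If a tool shows up as "not found on PATH", that is perfectly fine at this point. The upcoming lessons walk through installing each one.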

Summary

Remember, it's okay to feel a little uncomfortable. We are going to throw a lot of new concepts at you, and some of them won't fully make sense until much further down the line. You'll continue to practice these day after day, until they become second nature!