Primary Language: Python

Setting-up-a-professional-data-science-environment

This repository aims to help data science beginners get started easily. So far it covers the following topics. It will be updated continuously, and contributions are welcome:

Essential tools used by professional Data Scientists

Visual Studio Code, Anaconda, Jupyter Notebook, Virtual Environment, Git, GitHub

Visual Studio Code is a free, lightweight code editor from Microsoft, built on an open-source codebase, and is very popular among developers. Alternatives: IntelliJ, PyCharm, Spyder, etc.

Anaconda is a widely used Python distribution for data science. It bundles the conda package manager, which greatly simplifies package management.

Jupyter Notebook is a web-based interactive computing environment for creating and running notebook documents. Alternatives: JupyterLab, Google Colab, Databricks Notebook.

Virtual Environment is an isolated Python environment with its own set of installed packages. Creating a separate one for each project keeps projects from conflicting, because different projects may require different packages or different versions of the same package.

Git is a free and open source distributed version control system. Tech teams rely on it for collaborative work.

GitHub is an online platform that hosts software projects and provides version control using Git. Alternatives: Bitbucket, GitLab.
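To make the idea of one virtual environment per project concrete, here is a minimal sketch. It uses Python's standard-library `venv` module rather than Anaconda's own environment tooling, so it illustrates the concept and is not the setup this repository prescribes. The helper names `create_project_env` and `env_python`, the folder name `.venv`, and the project names are hypothetical.

```python
import sys
import tempfile
import venv
from pathlib import Path


def create_project_env(project_dir: Path, env_name: str = ".venv") -> Path:
    """Create an isolated environment inside project_dir and return its path."""
    env_dir = project_dir / env_name
    # with_pip=False keeps the sketch fast; pass True to bootstrap pip as well.
    venv.create(env_dir, with_pip=False)
    return env_dir


def env_python(env_dir: Path) -> Path:
    """Return the expected location of the environment's Python interpreter."""
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


if __name__ == "__main__":
    # Two hypothetical projects, each with its own environment.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("project_a", "project_b"):
            project = Path(tmp) / name
            project.mkdir()
            env = create_project_env(project)
            print(name, "->", env_python(env))
```

Each project ends up with its own interpreter path, so packages installed for one project do not affect the other.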

Setup tools

Visual Studio Code can be set up by following these documents: Windows, macOS, Linux.

Anaconda can be set up by following this document.

Jupyter Notebook does not need to be set up separately; it comes with Anaconda.

Virtual Environment does not need to be set up separately; it comes with Anaconda.

Git can be set up by following this document.

GitHub does not require any installation; you only need to register for a GitHub account.
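Once the tools above are installed, a quick check can confirm that they are reachable from the command line. The sketch below is one possible way to do that. The executable names (`code`, `conda`, `jupyter`, `git`) are assumptions about typical installations and may differ on your machine. For example, the `code` command is only available if Visual Studio Code has been added to your PATH. The names `TOOLS` and `find_tools` are hypothetical.

```python
import shutil
from typing import Dict, Optional

# Assumed executable names; adjust them if your installation differs.
TOOLS = {
    "Visual Studio Code": "code",
    "Anaconda (conda)": "conda",
    "Jupyter Notebook": "jupyter",
    "Git": "git",
}


def find_tools(tools: Dict[str, str] = TOOLS) -> Dict[str, Optional[str]]:
    """Map each tool name to the path of its executable, or None if not found."""
    return {name: shutil.which(command) for name, command in tools.items()}


if __name__ == "__main__":
    for name, path in find_tools().items():
        status = path if path else "not found on PATH"
        print(f"{name}: {status}")
```

A tool reported as not found may still be installed. In that case, open the terminal that ships with the tool (for example Anaconda Prompt on Windows) or add the tool to your PATH.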

Use tools effectively

Good coding habits