Data Science with Python will help you get comfortable with using the Python environment for data science. You will learn all the libraries that a data scientist uses on a daily basis. By the end of this book, you will be able to take a large raw dataset, clean it, manipulate it, and run machine learning algorithms to obtain results that influence business decisions.
Data Science with Python by Rohan Chopra, Aaron England and Mohamed Noordeen
- Pre-process data to make it ready to use for machine learning
- Create data visualizations with Matplotlib
- Use scikit-learn to perform dimension reduction using principal component analysis (PCA)
- Solve classification and regression problems
- Get predictions using the XGBoost library
- Process images and create machine learning models to decode them
- Process human language for prediction and classification
- Use TensorBoard to monitor training metrics in real time
- Find the best hyperparameters for your model with AutoML
For an optimal student experience, we recommend the following hardware configuration:
- Processor: Intel Core i5 or equivalent
- Memory: 4 GB RAM (8 GB preferred)
- Storage: 15 GB available hard disk space
- Internet connection
You'll also need the following software installed in advance:
- OS: Windows 7 SP1 64-bit, Windows 8.1 64-bit or Windows 10 64-bit, Ubuntu Linux, or the latest version of OS X
- Browser: the latest version of Google Chrome or Mozilla Firefox
- Notepad++ or Sublime Text as an IDE (optional, as you can practice everything using Jupyter Notebook in your browser)
- Python 3.4 or later (Python 3.7 was the latest version at the time of writing), available from https://python.org
- Anaconda (https://www.anaconda.com/distribution/)
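To confirm that an installation meets the software requirements above, a short check like the following may help. It is only a minimal sketch: the minimum version (3.4) comes from the list above, but the package names in PACKAGES are an assumption based on the libraries mentioned in the book description, and the helper names python_ok and missing_packages are hypothetical, not part of the book's code.

```python
import sys
from importlib.util import find_spec

# Minimum Python version stated in the software requirements.
MIN_PYTHON = (3, 4)

# Assumed import names for libraries mentioned in the description;
# adjust this list to match the lessons you plan to follow.
PACKAGES = ["numpy", "pandas", "matplotlib", "sklearn", "xgboost"]


def python_ok(version=None, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= tuple(minimum)


def missing_packages(names=PACKAGES):
    """Return the names that the import system cannot locate."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    print("Python version OK:", python_ok())
    print("Missing packages:", missing_packages() or "none")
```

Running the script inside the Anaconda environment you intend to use should report whether the interpreter is recent enough and which of the assumed packages, if any, still need to be installed.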
You can download the datasets for the following lessons from the URL below:
Lesson 06, Lesson 07 and Lesson 08: https://drive.google.com/drive/folders/1SZ7vVby_gfb8Isu4b-fJlSJBfqDCsCVh?usp=sharing
Lesson 06 and Lesson 08 use the same dataset.