Python is a high-level, general-purpose, dynamic programming language that is increasingly widely used. It is readable, succinct, and scalable, and it supports multiple programming paradigms. It is now the most common ‘starter’ language taught on university programming courses and is seen by many as the future of coding.
This repository holds the notebooks for the book "Regression Analysis with Python" by Luca Massaron and Alberto Boschetti. You can find details about the book on the Packt website.
The book requires the current development version of scikit-learn, that is
0.18-dev. Most of the book can also be used with previous versions of
scikit-learn, though you need to adjust the imports for everything from the
model_selection module, mostly cross_val_score, train_test_split and GridSearchCV.
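If you are on an older scikit-learn, one possible way to adjust the imports is a fallback like the sketch below. It assumes the pre-0.18 layout, in which cross_val_score and train_test_split lived in sklearn.cross_validation and GridSearchCV lived in sklearn.grid_search. The helper name load_model_selection_tools is hypothetical and is not part of the book's notebooks.

```python
def load_model_selection_tools():
    """Return (cross_val_score, train_test_split, GridSearchCV).

    Hypothetical helper: tries the 0.18+ module first and falls back to
    the older module layout if model_selection is not available.
    """
    try:
        # scikit-learn 0.18 and later
        from sklearn.model_selection import (
            cross_val_score, train_test_split, GridSearchCV)
    except ImportError:
        # Assumed layout of scikit-learn releases before 0.18
        from sklearn.cross_validation import cross_val_score, train_test_split
        from sklearn.grid_search import GridSearchCV
    return cross_val_score, train_test_split, GridSearchCV
```

In a notebook you could then write cross_val_score, train_test_split, GridSearchCV = load_model_selection_tools() in place of the direct import from model_selection. If scikit-learn is not installed at all, the helper raises ImportError.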
To run the notebooks, you need the packages numpy, scipy, scikit-learn, matplotlib, and pandas.
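To see which of these packages are already available in your environment, a small check like the following may help. It is only a sketch: the function name check_packages and the REQUIRED mapping are made up for illustration, and the only assumption about the packages is that scikit-learn is imported under the module name sklearn.

```python
import importlib

# Map each package name (as installed) to the module name used to import it.
REQUIRED = {
    "numpy": "numpy",
    "scipy": "scipy",
    "scikit-learn": "sklearn",
    "matplotlib": "matplotlib",
    "pandas": "pandas",
}


def check_packages(required=REQUIRED):
    """Return {package: version string, or None if the import fails}."""
    report = {}
    for package, module_name in required.items():
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            report[package] = None
        else:
            # Fall back to "unknown" if the module exposes no __version__.
            report[package] = getattr(module, "__version__", "unknown")
    return report


if __name__ == "__main__":
    for package, version in check_packages().items():
        status = version if version is not None else "MISSING"
        print("{}: {}".format(package, status))
```

Any package reported as MISSING can be installed with one of the commands below.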
The easiest way to set up an environment is by installing Anaconda.
If you already have a Python environment set up, and you are using the conda
package manager, you can get all packages by running
conda install numpy scipy scikit-learn matplotlib pandas
If you already have a Python environment and install packages with pip (pip for Python 2, pip3 for Python 3), you need to run
pip install numpy scipy scikit-learn matplotlib pandas
If you are using OS X and MacPorts, you can run sudo port install packagename. If you are on Ubuntu or Debian, you can run apt-get install packagename.