/2022-10-26-machine-learning-novice-sklearn

A Carpentry style lesson on machine learning with Python and scikit-learn.


Introduction to Machine Learning with Scikit Learn and Python

Create a Slack Account with us

This repository generates the corresponding lesson website from The Carpentries repertoire of lessons.

Contributing

We welcome all contributions to improve the lesson! Maintainers will do their best to help you if you have any questions or concerns, or if you experience any difficulties along the way.

We'd like to ask you to familiarize yourself with our Contribution Guide and have a look at the more detailed guidelines on proper formatting, ways to render the lesson locally, and even how to write new episodes.

Please see the current list of issues for ideas on how to contribute to this repository. Look for the tag good_first_issue, which indicates that the maintainers will welcome a pull request fixing that issue. To make your contribution, we use the GitHub flow, which is nicely explained in the chapter Contributing to a Project in Pro Git by Scott Chacon.

Maintainer(s)

Current maintainers of this lesson are:

Outline

As determined by the attendees of CarpentryConnect Manchester 2019, the proposed outline of this lesson is as follows:

Unsupervised Learning

I. Clustering

1. K-means

II. Dimensionality Reduction

1. PCA
2. t-SNE
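
As a rough illustration of the unsupervised topics above, the following is a minimal sketch and not the lesson's own material. It assumes scikit-learn is installed, and the synthetic dataset and every parameter value (number of samples, features, clusters, and components) are arbitrary choices made for the example.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA

# Synthetic data (illustrative values): 300 points in 5 dimensions around 3 centres.
X, _ = make_blobs(n_samples=300, n_features=5, centers=3, random_state=0)

# K-means clustering: the number of clusters has to be chosen by the user.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

# PCA: project the 5-dimensional data onto its first two principal components,
# for example so that the clusters can be plotted in 2D.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)

print(labels.shape)  # one cluster label per sample
print(X_2d.shape)    # two coordinates per sample
print(pca.explained_variance_ratio_)  # share of variance kept by each component
```

t-SNE (sklearn.manifold.TSNE) exposes the same fit_transform interface, so it could be swapped in for PCA in a sketch like this one.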

Supervised Learning

Objectives for all models:

  • what it is;
  • when to use it and on what type of data;
  • how to evaluate the fit, over/underfitting;
  • computational complexity.

I. Regression

1. Linear
2. Polynomial
  • Overfitting/underfitting
  • Test sets (how and why)
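
The regression topics above (linear and polynomial fits, over/underfitting, and test sets) can be sketched as follows. This is only an illustrative example under stated assumptions: the data are synthetic, the quadratic ground truth, noise level, polynomial degrees, and the 70/30 split are arbitrary choices, and none of it is taken from the lesson itself.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data (assumption): a noisy quadratic relationship.
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=60)
y = 0.5 * x**2 - x + rng.normal(scale=1.0, size=60)
X = x.reshape(-1, 1)  # scikit-learn expects a 2D feature array

# Hold out a test set so the fit is judged on data the model has not seen.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Compare a linear fit with polynomial fits of increasing degree.
scores = {}
for degree in (1, 2, 10):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    # score() returns R^2; record it on both the training and the test set.
    scores[degree] = (model.score(X_train, y_train), model.score(X_test, y_test))
    print(degree, scores[degree])
```

The training score cannot get worse as the degree grows, because each higher-degree model contains the lower-degree ones. A test score that stalls or drops while the training score keeps rising is the usual sign of overfitting, which is why the held-out test set matters.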

II. Classification

1. Logistic regression
  • Over/underfitting can happen in classification too
  • Accuracy
  • Confusion Matrix
  • Precision
  • Recall
2. Random Forest
3. Neural Networks
  • Evaluation
  • Cross Validation
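
The classification metrics listed above can be illustrated with a small hand-made example. This is a minimal sketch, assuming scikit-learn is installed; the labels and predictions are invented for the illustration and do not come from any model in the lesson.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    precision_score,
    recall_score,
)

# Made-up true labels and predictions for a binary problem (1 = positive class).
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 1]

# Rows are true classes, columns are predicted classes:
# [[TN, FP],
#  [FN, TP]]
cm = confusion_matrix(y_true, y_pred)
print(cm)  # [[2 2]
           #  [1 3]]

acc = accuracy_score(y_true, y_pred)    # (TP + TN) / total = 5 / 8 = 0.625
prec = precision_score(y_true, y_pred)  # TP / (TP + FP)    = 3 / 5 = 0.6
rec = recall_score(y_true, y_pred)      # TP / (TP + FN)    = 3 / 4 = 0.75
print(acc, prec, rec)
```

Here precision and recall differ because the predictions contain more false positives (2) than false negatives (1), which accuracy alone does not reveal.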

Ethics

Authors

A list of contributors to the lesson can be found in AUTHORS.

Citation

To cite this lesson, please consult CITATION.