A beginner-friendly repository for learning about various machine learning techniques. Contributions are welcome.

ML Tutorial

Welcome to the ML Tutorial! This repository is designed to help beginner and intermediate learners understand machine learning and its main techniques and algorithms.

Prerequisites

To get the most out of these tutorials, you should understand basic programming concepts and have some experience with Python.

Getting Started

Clone the repository by running the following command in your terminal:

git clone https://github.com/iiitl/ML_tutorial.git

Navigate to the repository directory:

cd ML_tutorial

Open the tutorials and start learning!

Contributing

We welcome contributions to this repository! If you have any suggestions for tutorials or want to contribute a tutorial of your own, please submit a pull request.

Fork the repository
Create your feature branch (git checkout -b feature/new-feature)
Make the changes in your code editor.
To preview the changes, we suggest using a local server like Live Server for your code editor.
Commit your changes (git commit -am 'Added new feature')
Push to the branch (git push origin feature/new-feature)
Create a new pull request

Types of ML problems and their algorithms

  1. Regression :-

    Definition: Problems in which we predict a continuous numerical value are called regression problems.

    Algorithms/Models: Simple Linear Regression, Multiple Linear Regression, Polynomial Regression, K-Nearest Neighbours (K-NN), Decision Tree, Random Forest
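
    Example: A minimal sketch of simple linear regression in plain Python, fitted by ordinary least squares. The function name and the hours/scores data are made up for illustration; in practice a library such as scikit-learn would usually be used instead.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: hours studied vs. exam score (score = 2 * hours + 50).
hours = [1, 2, 3, 4, 5]
scores = [52, 54, 56, 58, 60]

slope, intercept = fit_simple_linear_regression(hours, scores)
predicted = slope * 6 + intercept  # predicted score for 6 hours of study
print(slope, intercept, predicted)  # prints 2.0 50.0 62.0
```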

  2. Classification :-

    Definition: Problems in which we assign each data point to one of a set of classes are called classification problems.

    With two classes it is a binary classification problem; with more than two it is a multiclass classification problem. For example, classifying cars as Hatchback, SUV, or Sedan is a multiclass problem.

    Algorithms/Models: Logistic Regression, Random Forest, K-NN, Gradient Boosting Classifier, Neural Networks
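
    Example: A rough sketch of K-NN classification applied to the car example. The two features (length in metres, ground clearance in centimetres) and all values are hypothetical, and the features are left unscaled to keep the code short; real data would normally be scaled first.

```python
from collections import Counter
import math


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical training data: (length in m, ground clearance in cm).
cars = [(3.7, 16.5), (3.8, 17.0), (4.5, 16.0), (4.6, 16.5), (4.7, 21.0), (4.8, 22.0)]
labels = ["Hatchback", "Hatchback", "Sedan", "Sedan", "SUV", "SUV"]

prediction = knn_predict(cars, labels, query=(4.75, 21.5), k=3)
print(prediction)  # prints SUV
```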

  3. Clustering :-

    Definition: Problems involving the grouping of unlabeled data points into clusters, so that points in the same cluster are similar to each other.

    Algorithms/Models: K-Means (partitions the data into K clusters of mutually similar points), DBSCAN, Hierarchical clustering, Gaussian mixture models, BIRCH
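
    Example: A small sketch of the two alternating K-Means steps (assign points to the nearest centroid, then move each centroid to the mean of its points). The starting centroids are passed in explicitly so the run is deterministic; the points and the fixed iteration count are assumptions for illustration.

```python
import math


def k_means(points, centroids, iterations=10):
    """Run a fixed number of K-Means iterations from the given starting centroids."""
    centroids = list(centroids)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: math.dist(point, centroids[i]),
            )
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its cluster.
        for i, cluster in enumerate(clusters):
            if cluster:  # keep the old centroid if a cluster is empty
                centroids[i] = tuple(sum(c) / len(cluster) for c in zip(*cluster))
    return centroids, clusters


# Hypothetical 2-D points forming two visibly separate groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
centroids, clusters = k_means(points, centroids=[(1, 1), (8, 8)])
print(clusters)
```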

  4. Time-series forecasting :-

    Definition: Problems in which we predict future values from time-series data (data points recorded in time order) are time-series forecasting problems. They arise when we have observations of a quantity over a time interval and want to predict its future values (e.g. weather).

    Algorithms/Models: ARIMA (Autoregressive integrated moving average), SARIMA (Seasonal autoregressive integrated moving average), LSTM (Long short-term memory), Exponential smoothing, Prophet
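
    Example: A minimal sketch of simple exponential smoothing, the most basic form of the exponential smoothing family. Each smoothed value is a weighted average of the newest observation and the previous smoothed value. The temperature readings and the choice alpha = 0.5 are assumptions for illustration.

```python
def exponential_smoothing(series, alpha):
    """Return the smoothed series; its last value is the one-step-ahead forecast."""
    smoothed = [series[0]]  # initialise with the first observation
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical daily temperatures in degrees Celsius.
temperatures = [20.0, 22.0, 21.0, 23.0]
smoothed = exponential_smoothing(temperatures, alpha=0.5)
forecast = smoothed[-1]  # forecast for the next day
print(smoothed, forecast)  # prints [20.0, 21.0, 21.0, 22.0] 22.0
```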

  5. Anomaly detection :-

    Definition: Problems in which we need to find outliers (unexpected values) in a dataset. One common use is fraud detection in online transactions.

    Algorithms/Models: Isolation Forest, Minimum covariance determinant, One-class SVM
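
    Example: The algorithms listed above need a library, so this sketch swaps in a much simpler statistical rule: the modified z-score, which measures how far a value lies from the median in units of the median absolute deviation. The transaction amounts are made up, and the 3.5 cutoff is a commonly used default rather than a fixed rule.

```python
import statistics


def find_outliers(values, threshold=3.5):
    """Return values whose modified z-score (median-based) exceeds `threshold`."""
    median = statistics.median(values)
    # Median absolute deviation: a spread estimate that outliers barely affect.
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        return []  # no spread to measure against; this sketch flags nothing
    return [v for v in values if 0.6745 * abs(v - median) / mad > threshold]


# Hypothetical transaction amounts with one suspicious value.
amounts = [20, 22, 19, 21, 23, 20, 500]
outliers = find_outliers(amounts)
print(outliers)  # prints [500]
```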

  6. Ranking :-

    Definition: Problems in which we need to order items according to some criterion. Each item is assigned a score, and the output is the list of items sorted by that score. Amazon, Flipkart, etc. use ranking in their recommendation engines.

    Algorithms/Models: Bipartite ranking (Bipartite RankBoost, Bipartite RankSVM)
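
    Example: A sketch of the score-then-sort idea only. The feature names, weights, and products are hypothetical, and the weights are set by hand here; a learned ranking model such as those listed above would fit the scoring function from data instead.

```python
def rank_items(items, weights):
    """Sort items from highest to lowest score (score = weighted sum of features)."""
    def score(item):
        return sum(weights[name] * item[name] for name in weights)

    return sorted(items, key=score, reverse=True)


# Hypothetical products with two hand-picked features.
products = [
    {"name": "A", "relevance": 0.9, "rating": 3.0},
    {"name": "B", "relevance": 0.5, "rating": 4.5},
    {"name": "C", "relevance": 0.8, "rating": 4.5},
]
weights = {"relevance": 10.0, "rating": 1.0}  # assumed weights, not learned

ranking = [item["name"] for item in rank_items(products, weights)]
print(ranking)  # prints ['C', 'A', 'B']
```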

  7. Recommendation :-

    Definition: Problems in which we suggest items a user is likely to want, based on their past behaviour, are recommendation problems. We see them at work on platforms such as YouTube and Netflix, which recommend the next video to watch based on what we have watched before.

    Algorithms/Models: Content-based and collaborative filtering machine learning methods
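
    Example: A rough sketch of user-based collaborative filtering. Unseen items are scored by the ratings of other users, weighted by how similar those users are to the target user. The user names, video names, and ratings are invented, and cosine similarity over co-rated items is just one possible similarity measure.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two users' ratings, over items both have rated."""
    shared = [item for item in a if item in b]
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = math.sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = math.sqrt(sum(b[i] ** 2 for i in shared))
    return dot / (norm_a * norm_b)


def recommend(target, ratings):
    """Return the unseen item with the highest similarity-weighted average rating."""
    scores, weights = {}, {}
    for user, user_ratings in ratings.items():
        if user == target:
            continue
        sim = cosine_similarity(ratings[target], user_ratings)
        for item, rating in user_ratings.items():
            if item not in ratings[target]:
                scores[item] = scores.get(item, 0.0) + sim * rating
                weights[item] = weights.get(item, 0.0) + sim
    predicted = {i: scores[i] / weights[i] for i in scores if weights[i] > 0}
    return max(predicted, key=predicted.get) if predicted else None


# Hypothetical ratings on a 1-5 scale.
ratings = {
    "user_a": {"video_1": 5, "video_2": 4},
    "user_b": {"video_1": 5, "video_2": 5, "video_3": 5, "video_4": 1},
    "user_c": {"video_1": 1, "video_2": 5, "video_3": 2, "video_4": 5},
}
suggestion = recommend("user_a", ratings)
print(suggestion)  # prints video_3
```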

  8. Data generation :-

    Definition: Problems in which a model must generate new data, such as images or videos, that resembles the data it was trained on are data generation problems.

    Algorithms/Models: Generative adversarial network (GAN), Hidden Markov models
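
    Example: A small sketch of generating a sequence by sampling from a hidden Markov model. The states, observations, and probabilities are hand-written assumptions; a real model would learn them from data. The seed only makes the run repeatable.

```python
import random


def sample_hmm(start_probs, transition_probs, emission_probs, length, seed=0):
    """Generate `length` observations by sampling from a hidden Markov model."""
    rng = random.Random(seed)

    def draw(distribution):
        # Pick a key with probability proportional to its value.
        outcomes = list(distribution)
        return rng.choices(outcomes, weights=[distribution[o] for o in outcomes])[0]

    state = draw(start_probs)  # hidden state, never returned to the caller
    observations = []
    for _ in range(length):
        observations.append(draw(emission_probs[state]))
        state = draw(transition_probs[state])
    return observations


# Hypothetical model: hidden weather states emit observable activities.
start = {"sunny": 0.6, "rainy": 0.4}
transitions = {
    "sunny": {"sunny": 0.7, "rainy": 0.3},
    "rainy": {"sunny": 0.4, "rainy": 0.6},
}
emissions = {
    "sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1},
    "rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
}
generated = sample_hmm(start, transitions, emissions, length=5)
print(generated)
```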

  9. Optimization :-

    Definition: In such problems, we adjust the parameters step by step until we reach the optimum result. The goal is to find the maximum or minimum of an objective function. Optimization also helps us build more accurate models, since training a model usually means minimizing its error.

    Algorithms/Models: Linear programming methods, genetic programming
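
    Example: A minimal sketch of step-by-step parameter updates using gradient descent, which is a different technique from the two listed above but shows the same idea in a few lines. The objective f(x) = (x - 3)^2, the learning rate, and the step count are assumptions chosen for illustration.

```python
def gradient_descent(gradient, start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient to approach a minimum."""
    x = start
    for _ in range(steps):
        x -= learning_rate * gradient(x)
    return x


# Minimise f(x) = (x - 3)^2, whose gradient is 2 * (x - 3) and minimum is at x = 3.
minimum = gradient_descent(lambda x: 2 * (x - 3), start=0.0)
print(round(minimum, 6))
```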

Conclusion

We hope that this repository will help you on your journey to becoming a machine learning expert. Happy learning!