LightML.jl is a collection of reimplementations of general machine learning algorithms in Julia.
The purpose of this project is purely self-educational.
This project targets people who want to learn the internals of ML algorithms or implement them from scratch.
The code is much easier to follow and experiment with than that of optimized libraries.
All algorithms are implemented in Julia.
See the test function of each implementation for detailed usage. Every model is constructed in a similar manner.
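To make the idea of a shared construction pattern concrete, here is a minimal, self-contained sketch of a construct, train, predict workflow. Every name in it (`ToyLinearRegression`, `train!`, `predict`, `add_intercept`) is hypothetical and chosen only for illustration; none is confirmed to match LightML.jl's actual API, so consult the test functions for the real usage.

```julia
# Hypothetical names throughout: this is an illustrative sketch,
# not LightML.jl's actual API.

mutable struct ToyLinearRegression
    weights::Vector{Float64}   # learned coefficients, intercept first
end

# Construct an untrained model with empty weights.
ToyLinearRegression() = ToyLinearRegression(Float64[])

# Prepend a column of ones so the first weight acts as the intercept.
add_intercept(X::Matrix{Float64}) = hcat(ones(size(X, 1)), X)

# Fit by ordinary least squares; `\` solves the least-squares problem.
function train!(model::ToyLinearRegression, X::Matrix{Float64}, y::Vector{Float64})
    model.weights = add_intercept(X) \ y
    return model
end

# Predict by applying the learned weights to new samples.
predict(model::ToyLinearRegression, X::Matrix{Float64}) = add_intercept(X) * model.weights

# Usage: construct, train, predict.
X = reshape([1.0, 2.0, 3.0, 4.0], 4, 1)   # 4 samples, 1 feature
y = [3.0, 5.0, 7.0, 9.0]                  # generated from y = 2x + 1
model = ToyLinearRegression()
train!(model, X, y)
predictions = predict(model, reshape([5.0], 1, 1))
```

Because the toy data lie exactly on a line, the prediction for an input of 5 should come out close to 11, up to floating-point error.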
First, make sure you have the correct Python dependency. You can use the Conda Julia package to install more Python packages, and import Conda to print the Conda.PYTHONDIR directory where Python was installed. On GNU/Linux systems, PyCall will default to using the python program (if any) in your PATH.
The advantage of a Conda-based configuration is particularly compelling if you are installing PyCall in order to use packages like PyPlot.jl or SymPy.jl, as these can then automatically install their Python dependencies.
ENV["PYTHON"]=""
Pkg.add("Conda")
using Conda
Conda.add("python==2.7.13")
Conda.add("matplotlib")
Conda.add("scikit-learn")
Pkg.add("PyCall")
Pkg.build("PyCall")
Alternatively, you can simply run:
Pkg.build("LightML")
This performs the same procedure as the steps above.
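To check which Python installation the Conda-based setup is using, the `Conda.PYTHONDIR` directory mentioned earlier can be printed. This is a small sketch that assumes Conda.jl has already been installed as described; the path it prints depends on your machine.

```julia
using Conda   # assumes Conda.jl was installed in the steps above

# Directory of the Python installation managed by Conda.jl
println(Conda.PYTHONDIR)
```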
Once every dependency is configured, run the command below to install the package.
Pkg.clone("https://github.com/memoiry/LightML.jl")
Let's first try the overall functionality test.
using LightML
test_LSC()
Figure 1: Smiley, spirals, shapes, and cassini datasets using LSC (large scale spectral clustering)
using LightML
demo()
Figure 2: The Digit Dataset using Demo algorithms
- Adaboost
- Decision Tree
- Gradient Boosting
- Gaussian Discriminant Analysis
- K Nearest Neighbors
- Linear Discriminant Analysis
- Linear Regression
- Logistic Regression
- Multilayer Perceptron
- Naive Bayes
- Ridge Regression
- Lasso Regression
- Support Vector Machine
- Hidden Markov Model
- Label propagation
- Random Forests
- XGBoost
- Gaussian Mixture Model
- K-Means
- Principal Component Analysis
- Spectral Clustering
- Large Scale Spectral Clustering
- test_ClassificationTree()
- test_RegressionTree()
- test_label_propagation()
- test_LDA()
- test_naive()
- test_NeuralNetwork()
- test_svm()
- test_kmeans_random()
- test_PCA()
- test_Adaboost()
- test_BoostingTree()
- test_spec_cluster()
- test_LogisticRegression()
- test_LinearRegression()
- test_kneast_regression()
- test_kneast_classification()
- test_LSC()
- test_GaussianMixture() (Fixing)
- test_GDA() (Fixing)
- test_HMM() (Fixing)
- test_xgboost (Fixing)
Please see the todo list for contribution details.
Pull requests are welcome.
using LightML
test_LinearRegression()
Figure 3: The regression Dataset using LinearRegression
test_Adaboost()
Figure 4: The classification Dataset using Adaboost
test_svm()
Figure 5: The classification Dataset using Support Vector Machine
test_ClassificationTree()
Figure 6: The digit Dataset using Classification Tree
test_kmeans_random()
Figure 7: The blobs Dataset using k-means
test_LDA()
Figure 8: The classification Dataset using LDA
test_PCA()
Figure 9: The Digit Dataset using PCA