Portable, scalable and reliable distributed machine learning.
Wormhole is a place where DMLC projects work together to provide scalable and reliable machine learning toolkits that can run on various platforms.
- Portable:
- Supported platforms: local machine, Apache YARN, MPI and Sun Grid Engine
- Rich support for data sources
- All projects can read data from HDFS, S3 or local filesystem
- Scalable and Reliable
- Boosted Trees (GBDT): XGBoost: eXtreme Gradient Boosting
- Clustering: kmeans
- Linear method: Asynchronous SGD, L-BFGS
- Factorization Machine: DiFacto
- Requires a C++11 compiler (e.g. g++ >= 4.8) and git. Install them on Ubuntu >= 13.10:
sudo apt-get update && sudo apt-get install -y build-essential git
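To confirm that the installed compiler meets the g++ >= 4.8 requirement before building, a check along these lines may help. This is a minimal sketch, not part of Wormhole: the helper name version_ge is hypothetical, and it assumes a sort that supports -V (version sort), as GNU coreutils does.

```shell
#!/bin/sh
# Hypothetical helper (not part of Wormhole): succeeds if version $1 >= version $2.
# Assumes `sort -V` (version sort) is available, as in GNU coreutils.
version_ge() {
    # The smaller of the two versions sorts first; $1 >= $2 iff that is $2.
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

if command -v g++ >/dev/null 2>&1; then
    gxx_version=$(g++ -dumpversion)
    if version_ge "$gxx_version" 4.8; then
        echo "g++ $gxx_version should be new enough for C++11"
    else
        echo "g++ $gxx_version is older than 4.8; please upgrade"
    fi
else
    echo "g++ not found; install build-essential first"
fi
```

Note that recent GCC releases may print only the major version for -dumpversion (for example 11), which the comparison still handles.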
- Type make to build all dependencies and tools.
- All tools can run both on a laptop and in a cluster. For example, train logistic regression using 2 workers and 1 server on the local machine:
tracker/dmlc_local.py -n 2 -s 1 bin/linear.dmlc learn/linear/guide/demo.conf
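To vary the number of workers and servers, the launch command can be wrapped in a small script. The sketch below is illustrative only: build_launch_cmd and its variable names are hypothetical, and it uses just the -n and -s options and the paths shown in the example above. It assumes it is run from the repository root after a successful build.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of Wormhole): assemble the local launch command.
# -n and -s are taken from the example above (workers and servers, respectively).
build_launch_cmd() {
    num_workers=$1
    num_servers=$2
    conf=$3
    echo "tracker/dmlc_local.py -n $num_workers -s $num_servers bin/linear.dmlc $conf"
}

cmd=$(build_launch_cmd 2 1 learn/linear/guide/demo.conf)
echo "Launch command: $cmd"

# Only launch if the tracker and the built binary are present.
if [ -x tracker/dmlc_local.py ] && [ -x bin/linear.dmlc ]; then
    $cmd
else
    echo "tracker/dmlc_local.py or bin/linear.dmlc not found; run make first"
fi
```

Calling build_launch_cmd 4 2 learn/linear/guide/demo.conf would produce the same command with 4 workers and 2 servers.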
If you are having issues, please let us know.
- We are actively building new tools. The source code of all tools is available under learn/.
- Wormhole depends on other DMLC projects, which are also under active development.