Multiverso is a parameter-server-based framework for training machine learning models on big data across many machines. It is currently a standard C++ library that provides a set of friendly programming interfaces, and it has been extended to support calls from Python and Lua programs. With these easy-to-use APIs, machine learning researchers and practitioners do not need to worry about routine system issues such as distributed model storage and operation, inter-process and inter-thread communication, and multi-threading management. Instead, they can focus on the core machine learning logic: data, model, and training.
For more details, please visit our website: http://www.dmtk.io.
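To make the parameter server idea above concrete, here is a minimal, single-process sketch in Python. It is not the Multiverso API: `ToyParameterServer`, `worker`, and `train` are hypothetical names invented for this illustration, and threads stand in for separate machines. It only shows the pull/compute/push loop that a parameter server framework manages for you.

```python
import threading


class ToyParameterServer:
    """In-process stand-in for a parameter server (hypothetical, not Multiverso's API)."""

    def __init__(self, size):
        self._params = [0.0] * size
        self._lock = threading.Lock()

    def get(self):
        # Workers pull a snapshot of the current parameters.
        with self._lock:
            return list(self._params)

    def add(self, delta):
        # Workers push updates; the server applies them additively.
        with self._lock:
            for i, d in enumerate(delta):
                self._params[i] += d


def worker(server, shard, lr=0.05, steps=50):
    # Each worker fits y = w * x on its own data shard with plain SGD.
    for _ in range(steps):
        for x, y in shard:
            w = server.get()[0]                # pull
            grad = 2.0 * (w * x - y) * x       # gradient of (w * x - y) ** 2
            server.add([-lr * grad])           # push


def train():
    server = ToyParameterServer(size=1)
    # Two noiseless shards generated from the same true weight, w = 3.
    shards = [
        [(1.0, 3.0), (2.0, 6.0)],
        [(0.5, 1.5), (1.5, 4.5)],
    ]
    threads = [threading.Thread(target=worker, args=(server, s)) for s in shards]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return server.get()[0]


if __name__ == "__main__":
    print("learned weight:", train())
```

Both workers update the same shared weight without coordinating with each other, and the learned value should end up close to 3. In a real deployment the server state is distributed across processes and machines, which is the part Multiverso is designed to handle.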
Linux (Tested on Ubuntu 14.04)
sudo apt-get install libopenmpi-dev openmpi-bin build-essential cmake git
git clone https://github.com/Microsoft/multiverso.git && cd multiverso
mkdir build && cd build
cmake .. && make && sudo make install
Windows
Open Multiverso.sln with Visual Studio 2013 and build.
Current distributed systems based on Multiverso:
- lightLDA: Scalable, fast, lightweight system for large scale topic modeling
- distributed_word_embedding: Distributed system for word embedding
- distributed_word_embedding(deprecated): Distributed system for word embedding
- distributed_skipgram_mixture(deprecated): Distributed skipgram mixture for multi-sense word embedding
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.