NMF-mGPU implements the Non-negative Matrix Factorization (NMF) algorithm on Graphics Processing Units (GPUs). NMF takes a non-negative input matrix V and returns two non-negative matrices, W and H, whose product approximates it (i.e., V ≈ W ∗ H). If V has n rows and m columns, then W and H have dimensions n × k and k × m, respectively. The factorization rank k, specified by the user, is usually much smaller than both n and m.
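To make the dimensions concrete, here is a minimal pure-Python sketch of NMF using the classic Lee-Seung multiplicative update rules for the Euclidean distance. It is an illustration only: the choice of update rule, the helper names (`matmul`, `transpose`, `frobenius_error`, `nmf`) and the small `eps` guard are assumptions of this sketch, not a description of how NMF-mGPU is implemented.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Euclidean (Frobenius) distance between V and W*H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0]))) ** 0.5


def nmf(v, k, iterations=200, seed=0, eps=1e-9):
    """Factorize an n x m non-negative matrix V into W (n x k) and H (k x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Strictly positive random initialization (an assumption of this sketch).
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return w, h
```

For an n × m input and rank k, `nmf` returns an n × k matrix W and a k × m matrix H whose entries all remain non-negative, because each update only multiplies non-negative values.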
This software has been developed using NVIDIA's CUDA (Compute Unified Device Architecture) framework for GPU computing. CUDA represents a GPU device as a programmable general-purpose coprocessor able to perform linear-algebra operations.
On detached devices with low on-board memory available, large datasets can be blockwise transferred from the CPU's main memory to the GPU's memory and processed accordingly. In addition, NMF-mGPU has been explicitly optimized for the different existing CUDA architectures.
Finally, NMF-mGPU also provides a multi-GPU version that makes use of multiple GPU devices through the MPI (Message Passing Interface) standard.
If you use this software, please cite the following work:
E. Mejía-Roa, D. Tabas-Madrid, J. Setoain, C. García, F. Tirado and A. Pascual-Montano. NMF-mGPU: Non-negative matrix factorization on multi-GPU systems. BMC Bioinformatics 2015, 16:43. doi:10.1186/s12859-015-0485-4 [http://www.biomedcentral.com/1471-2105/16/43]
Basic steps:
- Install the NVIDIA CUDA Toolkit and Drivers.
- Download, Decompress and Compile NMF-mGPU.
- Example of Use.
The full installation guide can be found in the doc/ folder. Similarly, please read the user guide for information on program usage and a detailed description of the analysis process.
- UNIX System (GNU/Linux or Darwin/Mac OS X).
- One or more CUDA-capable GPU devices: A detailed list of compatible hardware can be found at http://developer.nvidia.com/cuda-gpus
  Please note that all devices must be of the same architecture (i.e., heterogeneous GPU clusters are not supported yet).
- CUDA Toolkit and CUDA Driver: They are freely available at the CUDA Downloads Page. Nevertheless, for deprecated GPU devices and/or OS platforms, you can download a previous CUDA release (e.g., version 5.5) from the CUDA Archive Page. Please note that NMF-mGPU requires at least version 4.2.
- A C compiler conforming to the ISO-C99 standard, such as GNU GCC or LLVM Clang (64-bit only).
- The optional multi-GPU version also requires an MPI-2.0 (or greater) software library, such as OpenMPI or MPICH.
- For GNU/Linux:
  Instructions vary among different distributions. For instance, on Ubuntu 14.04 LTS (Trusty Tahr):
  - NVIDIA proprietary driver: Open the program Software & Updates, then go to the Additional Drivers section and check the option "Using NVIDIA binary driver".
    Alternatively, you can open a terminal and type:
    $> sudo apt-get install nvidia-current
    You may have to reboot the system in order to use this driver after installing it.
  - Additional packages: The following packages are required: build-essential, nvidia-cuda-dev and nvidia-cuda-toolkit.
    They can be installed through the Ubuntu Software Center, or via a terminal by typing:
    $> sudo apt-get install build-essential nvidia-cuda-dev nvidia-cuda-toolkit
  - Multi-GPU version (optional): This version also requires any of the following packages: openmpi or mpich.
  For other GNU/Linux distributions, we recommend reading the Getting Started Guide for GNU/Linux.
- For Darwin/Mac OS X:
  - C/C++ compiler: Please install Apple's Xcode toolset. Some versions may require explicitly adding the Command Line Developer Tools plug-in in order to make the required commands available on the Terminal.
  - CUDA Toolkit and Drivers: Just download and execute the proper .dmg file from the CUDA Download Page (or the Archive Page for previous releases), and follow the instructions.
  - Multi-GPU version (optional): Most MPI libraries are available on package managers, such as MacPorts or Homebrew. Otherwise, you can download the source code and compile it.
  We highly recommend reading the Getting Started Guide for Darwin/Mac OS X for detailed instructions.
NMF-mGPU can be downloaded from two main sources:
- From the Releases Page for the latest and previous releases.
- From the Applications Page to get the latest development code.
Once the file has been unzipped, please open a Terminal in the generated folder and execute the following command:
$> source env.sh <CUDA_PATH>
where <CUDA_PATH> denotes the path to your CUDA Toolkit (e.g., /usr/local/cuda-5.5 or /Developer/NVIDIA/cuda-5.5).
To compile the program, just execute:
$> make
The compilation process may take some time since different versions of the software (one per GPU model) will be generated into the executable file. To compile code for just a particular GPU architecture (e.g., for Compute Capability 1.3), you can use the following command:
$> make SM_VERSIONS=13
By default, no code is compiled for Compute Capabilities 1.x, since they are being deprecated in newer versions of the CUDA Toolkit. For such GPU models, code must be explicitly requested with the SM_VERSIONS argument, as shown in the example above.
The resulting executable file, NMF_GPU, will be stored in the bin/ folder.
To compile the multi-GPU version (process not performed by default), just execute:
$> make multi_gpu MPICC=<path_to_mpi>/bin/mpicc
Please make sure that your MPI library is properly installed and configured (e.g., that its environment variables are set). As above, the resulting executable file, NMF_mGPU, is stored in the bin/ folder.
In the test/ folder, you can find different examples of a valid input file. They contain a 5000-by-38 gene-expression data matrix, with or without row/column labels.
For example, to process the file "test/ALL_AML_data.txt" with a factorization rank of k=2, you can use the following command:
$> bin/NMF_GPU test/ALL_AML_data.txt -k 2 -j 10 -t 40 -i 2000
The remaining arguments have the following meaning:
-j 10: The convergence test will be performed every 10 iterations.
-t 40: If there are no relative differences in matrix H after 40 consecutive convergence tests, the algorithm is considered to have converged.
-i 2000: If no convergence is detected, the algorithm stops after 2000 iterations.
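The interplay of these three options can be sketched as the following control loop. This is a simplified illustration in Python, not the program's actual code: the function and parameter names are hypothetical, and `step` and `unchanged` stand in for one NMF iteration and for the convergence test on matrix H, respectively.

```python
def run_until_converged(step, unchanged, niter_test_conv=10,
                        stop_threshold=40, max_iterations=2000):
    """Run step() repeatedly and report (iterations_performed, converged).

    niter_test_conv mirrors -j, stop_threshold mirrors -t and
    max_iterations mirrors -i (names are hypothetical).
    """
    consecutive = 0  # convergence tests in a row that saw no change
    for iteration in range(1, max_iterations + 1):
        step()
        # Only test for convergence every niter_test_conv iterations.
        if iteration % niter_test_conv == 0:
            if unchanged():
                consecutive += 1
                if consecutive >= stop_threshold:
                    return iteration, True
            else:
                # Any detected change restarts the count.
                consecutive = 0
    # The iteration limit was reached without convergence.
    return max_iterations, False
```

Under this reading, with -j 10 and -t 40 the earliest possible stop is at iteration 400 (40 tests, one every 10 iterations), and any test that detects a change in H resets the count.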
On the screen, you should see something similar to:
<<< NMF-GPU: Non-negative Matrix Factorization on GPU >>>
Single-GPU version
Loading input file...
File selected as ASCII text. Loading...
Data matrix selected as having numeric column headers: No.
Data matrix selected as having numeric row labels: No.
Row labels detected.
Number of data columns detected (excluding row labels): 38.
Name (i.e., description string) detected.
Column headers detected.
Loaded a 5000 x 38 data matrix (190000 items).
Starting NMF( K=2 )...
NMF: Algorithm converged in 430 iterations.
Distance between V and W*H: 0.566049
Saving output file...
File selected as ASCII text.
Done.
In this case, the algorithm converged after a total of 430 iterations.
After completion, both output matrices, W and H, are stored in the same folder and with a similar filename as the input matrix, but suffixed with _W and _H, respectively. In the example above, such output files would be:
test/ALL_AML_data.txt_W.txt
test/ALL_AML_data.txt_H.txt
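Judging only from the example above, the output names appear to be the full input path with `_W.txt` or `_H.txt` appended. A hypothetical Python helper that reproduces this pattern, for instance to locate the results from a script, might look like this (the function name is an assumption, and other output formats may use a different extension):

```python
def output_paths(input_path):
    """Return the (W, H) output paths inferred from the ASCII-text example."""
    # The suffix is appended to the whole input name, extension included.
    return input_path + "_W.txt", input_path + "_H.txt"


w_path, h_path = output_paths("test/ALL_AML_data.txt")
print(w_path)
print(h_path)
```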
Notes:
- An exhaustive list of all valid parameters for NMF_GPU can be shown with the option -h. That is,
  $> bin/NMF_GPU -h
  In that case, any other argument will be ignored.
- On any error, please first check the troubleshooting section in the installation guide, located in the doc/ folder.
The multi-GPU version works similarly. Nevertheless, MPI programs are normally launched through the mpiexec or mpirun commands. Using similar arguments as in the example above, NMF-mGPU can be executed as follows:
$> mpiexec -np 2 bin/NMF_mGPU test/ALL_AML_data.txt -k 2 -j 10 -t 40 -i 2000
The argument -np 2 launches two MPI processes, so two GPU devices will be used.
- All GPU devices must have a similar Compute Capability.
- Please remember to properly set up the environment of your MPI library.
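When the program is driven from a script, the two command lines shown in this section can be assembled as argument lists. The Python helper below is a hypothetical convenience (its name and parameters are not part of NMF-mGPU); it only uses the options that appear in the examples above and does not execute anything.

```python
def nmf_command(input_file, k, j, t, i, processes=None):
    """Build the argument list for the single-GPU or multi-GPU executable.

    If processes is None, the single-GPU program is used; otherwise the
    multi-GPU program is launched through mpiexec with that many processes.
    """
    if processes is None:
        cmd = ["bin/NMF_GPU"]
    else:
        cmd = ["mpiexec", "-np", str(processes), "bin/NMF_mGPU"]
    # Same options as in the examples: rank, test interval,
    # consecutive-test threshold and iteration limit.
    cmd += [input_file, "-k", str(k), "-j", str(j), "-t", str(t), "-i", str(i)]
    return cmd


print(" ".join(nmf_command("test/ALL_AML_data.txt", 2, 10, 40, 2000, processes=2)))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the folder that contains bin/.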