Cannot compile
thangvubk opened this issue · 21 comments
I am totally new to Golang.
Me too, so take the answer below as one from a fellow beginner:
You need Go 1.5 or newer, as the repository relies on the vendor directory (on 1.5 this needs GO15VENDOREXPERIMENT=1; from 1.6 it is enabled by default).
Go then expects a single workspace (GOPATH) for all projects. Mine is:
# in .bashrc
export GOROOT=/home/patwie/libs/go # contains "api, bin, doc, lib, misc, pkg, robots.txt, ..."
export GOPATH=/home/patwie/gocode # see below
My structure looks like:
/home
  /patwie
    /gocode
      /src
        /github.com
          /patwie
            /cluster-smi
        /golang.org # not needed
        /gopkg.in # not needed
Did you git clone the entire repository like this?
mkdir -p ${GOPATH}/src/github.com/patwie
cd ${GOPATH}/src/github.com/patwie
git clone https://github.com/PatWie/cluster-smi.git
cd cluster-smi
cp cluster-smi.example.env cluster-smi.env
make
You need to clone this repo into exactly one of these two paths:
/usr/local/go/src/github.com/patwie/cluster-smi
# or
/root/work/src/github.com/patwie/cluster-smi
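If you are unsure whether the checkout ended up in the right place, a small check along these lines may help. It is only a sketch: `expected_repo_dir` and `check_repo_location` are hypothetical helper names, not part of cluster-smi or the Go toolchain, and it assumes a classic GOPATH workspace as described above.

```shell
# Hypothetical helper: print where a GOPATH-style workspace expects a package.
# $1 = GOPATH, $2 = import path (e.g. github.com/patwie/cluster-smi)
expected_repo_dir() {
  # ${1%/} drops one trailing slash so "/gocode/" and "/gocode" behave the same
  printf '%s/src/%s\n' "${1%/}" "$2"
}

# Hypothetical helper: compare a directory against the expected location.
# $1 = GOPATH, $2 = import path, $3 = directory to check (defaults to $PWD)
check_repo_location() {
  want=$(expected_repo_dir "$1" "$2")
  have=${3:-$PWD}
  if [ "$have" = "$want" ]; then
    echo "ok: $have"
  else
    echo "mismatch: expected $want, got $have"
    return 1
  fi
}
```

From inside the checkout you would call it as `check_repo_location "$GOPATH" github.com/patwie/cluster-smi`.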
Thank you for your quick response. The problem is gone. I have just left my lab, so I tested on another computer without an Nvidia GPU. Now there is a problem with the nvml.h header, which I think will be resolved on the computer with the Nvidia driver.
See my other project:
https://github.com/PatWie/tf_zmq
which basically says:
# compile ZMQ library for c++
cd /path/to/your_lib_folder
git clone https://github.com/zeromq/libzmq
cd libzmq
./autogen.sh
./configure --prefix=/path/to/your_lib_folder/libzmq/dist
make
make install
and add this to your .bashrc:
export PKG_CONFIG_PATH=/path/to/your_lib_folder/libzmq/dist/lib/pkgconfig/:$PKG_CONFIG_PATH
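A small detail with that export: if PKG_CONFIG_PATH was empty before, the line leaves a trailing colon. That is mostly cosmetic, but a hedged variant that avoids it could look like the sketch below. `prepend_path` is a hypothetical helper name, and the pkgconfig directory is the same placeholder path as above.

```shell
# Hypothetical helper: print "$1:$2", or just "$1" when $2 is empty.
prepend_path() {
  if [ -n "$2" ]; then
    printf '%s:%s\n' "$1" "$2"
  else
    printf '%s\n' "$1"
  fi
}

# Placeholder path from the steps above; adjust to your own --prefix.
ZMQ_PC_DIR=/path/to/your_lib_folder/libzmq/dist/lib/pkgconfig
PKG_CONFIG_PATH=$(prepend_path "$ZMQ_PC_DIR" "${PKG_CONFIG_PATH:-}")
export PKG_CONFIG_PATH

# To see which libzmq pkg-config now resolves (needs pkg-config installed):
#   pkg-config --modversion libzmq
```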
Please let me know if this helps, so this can be documented somewhere in the readme here.
Edit: cluster-smi-node only works on GPU machines, so you need the CUDA toolkit. You might need the CUDA_INSTALL_PATH environment variable as well.
See the readme in https://github.com/PatWie/cluster-smi/tree/master/vendor/github.com/pebbe/zmq4
go get github.com/pebbe/zmq4
If you need support for ZeroMQ 4.2 DRAFT, checkout the branch draft4.2.
Why not download the tarball with the correct version from:
http://download.zeromq.org/
Unfortunately, ZMQ can only be dynamically linked into the Go app.
So you mean I need another version of Go?
I also have some questions.
- Do both the server and the nodes need the CUDA toolkit?
- Can I execute cluster-smi on the nodes, or just on the server?
Thank you!
You need another version of ZMQ.
- Only cluster-smi-node should need the CUDA toolkit. The others should run and be compilable on machines without CUDA.
- Is the image in the readme so bad? You can place all these apps on completely different machines as long as they can communicate. You should be able to call cluster-smi even from a different network if the firewall allows it. But you should compile all apps once on a machine that supports CUDA, as this is easier.
Here the setup is: one dumb machine runs cluster-smi-server with the ports open, several different machines with GPUs run cluster-smi-node, and cluster-smi can run anywhere.
Feel free to update the readme if this is presented confusingly there.
Aha. Your figure is pretty cool, but the server looks like a modem, which should be changed. In my opinion, when the binaries run on multiple machines, it is important to clarify where to build the code and where to run the binaries. Previously, I thought that I had to build the code on every node and run the client and server respectively.
As for the zmq lib, I found on GitHub that the latest version is 4.2.3, but the error says the installed version is 4.2.4. I also have a question: will the zmq4.go in the /vendor directory automatically use the latest version of zmq?
Could you please send me your binaries at thangvubk@gmail.com? I think I can use your binaries in case I cannot build the code :((
Try godep update <name>
to update the package in the vendor directory. I cannot provide precompiled binaries, as they include specific settings:
# ip of cluster-smi-server
cluster_smi_server_ip="127.0.0.1"
# port of cluster-smi-server, which nodes send to
cluster_smi_server_port_gather="9080"
# port of cluster-smi-server, where clients subscribe to
cluster_smi_server_port_distribute="9081"
# tick for receiving data in milliseconds
cluster_smi_tick_ms="1000"
Further, this would not help in your case: ZMQ is dynamically linked, so a prebuilt binary would not solve version mismatches. I will consider trying to update libzmq here.
Yes. I think it would be nice if you checked the compilation in your code. :) Thank you.
Finally, it works beautifully. I downloaded version 4.1.4 of zeromq and compiled it. I really appreciate your support. Thank you very much and have a nice day :D
You are welcome!
Since you are able to compile this project, cluster-top might be interesting for you as well.
Wow. This is great. Thank you very much :)
I cannot imagine how to display this information. For multiple machines this list would be long. Any suggestions #5?
Hi @PatWie,
It seems I successfully compiled cluster-smi; however, when I try to launch cluster-smi-node or cluster-smi-router, I get the following error:
./cluster-smi-node: error while loading shared libraries: libzmq.so.4: cannot open shared object file: No such file or director
I followed all steps in the README, including the compilation of zmq (from http://files.patwie.com/mirror/zeromq-4.1.0-rc1.tar.gz)
Just so you know: I have no experience at all with Go, so I might have missed something that is totally obvious to you.
Thanks for your help,
-Ivan
It seems the only missing step is adding the directory that contains libzmq.so to the LD_LIBRARY_PATH environment variable.
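As a rough sketch of that step: the helper below looks for libzmq.so.4 in a few candidate directories and prints the first one that has it. `find_lib_dir` is a hypothetical name, and the candidate directories in the usage comment are assumptions that depend on the --prefix you configured libzmq with.

```shell
# Hypothetical helper: print the first directory that contains the given file.
# $1 = library file name, remaining arguments = directories to search in order
find_lib_dir() {
  lib=$1
  shift
  for dir in "$@"; do
    if [ -e "$dir/$lib" ]; then
      printf '%s\n' "$dir"
      return 0
    fi
  done
  return 1
}

# Usage (candidate directories are assumptions; adjust to your install prefix):
#   dir=$(find_lib_dir libzmq.so.4 /path/to/your_lib_folder/libzmq/dist/lib /usr/local/lib) \
#     && export LD_LIBRARY_PATH="$dir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
#   ldd ./cluster-smi-node | grep zmq   # should no longer report "not found"
```

Putting the export into .bashrc makes it stick across shells, in the same way as the PKG_CONFIG_PATH line.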
That was exactly the missing step. Thank you:)
I will add this step to the README.md and create a new PR.