Non-trivial build

Question

mckib2 opened this issue 5 years ago · 12 comments

Problem:
This fork is a few years out of date and is not easy to install. For example, an Ubuntu 16 Docker image with the correct Bazel version still leads to seemingly endless compilation issues, even when using gcc v4.

Question:
Is there an existing Dockerfile or docker image that has a successful build?

Current solution:
I have forked this repo, merged in the changes from the current tensorflow master, and successfully built it on top of the tensorflow/tensorflow:devel docker image (see fork). The ultimate goal is to run this variational network, but I am still running into a few problems using my fork of the tensorflow-icg repo. I'll keep working on it, but something portable like a docker image would be very helpful.

Answer 1 · 2020-03-25T22:36:26.000Z

Dear mckib2,

Sorry to leave a message here; I should have left it at your forked repo, but I could not find a place to open an issue there. I'm very interested in the Variational Network. While trying to implement it, I ran into the same problem of updating TF that you described last year in this repo. I noticed that the "contrib" folder, which holds most of the icg code, was moved as of TF 2.0, and I'm trying to compile against TF 2.1 (I tried compiling TF 2.0 with GPU support, but it does not work with the VS 2019 compiler, while TF 2.1 + GPU does compile). Since I'm not very familiar with the architecture of TF, could you suggest how to transfer the icg part to TF 2.1, e.g. where to put this part of the code and which modifications to the build configuration are necessary? I would really appreciate your help.

Best regards,

Zhe Wu

Answer 2 · 2020-03-25T22:55:59.000Z

Hi Zhe, I was not able to get everything working in my fork and ran out of time with it. I think the best strategy would be to have @khammernik provide a functional docker image. She does not seem responsive on here, so you might try looking up her email for correspondence. It doesn't count as reproducible research unless you can reproduce it.

I apologize that I am unable to help you get up and running.

Answer 3 · 2020-03-29T00:52:39.000Z

Hello, I am very interested in trying to get this Variational Network to work as well. I managed to compile the source code after a minor modification using Ubuntu 18.04, CUDA 10.0, cuDNN 7.6.5, Python 3.6, Anaconda, and Bazel 0.11.1. However, I get the error:

Error: /anaconda3/envs/tficg/lib/python3.6/site-packages/tensorflow/contrib/data/python/ops/../../_prefetching_ops.so: undefined symbol: _ZN6google8protobuf8internal26fixed_address_empty_stringE

after running

import tensorflow as tf
print(help(tf.contrib.icg))

in IPython, and I am not sure what is causing it. Zhe, have you attempted contacting khammernik?

Thanks!
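
The symbol in the error above is a mangled C++ name, and decoding it shows which library it belongs to. The following is a minimal sketch that handles only simple nested names of the form _ZN<len><name>...E; the function name demangle_nested_name is hypothetical, and a full demangler such as c++filt covers far more cases. The decoded name points at protobuf internals, which often (though this is not confirmed in this thread) indicates that the .so was built against a different protobuf than the one loaded at runtime.

```python
def demangle_nested_name(symbol: str) -> str:
    """Decode a simple Itanium C++ ABI nested name such as _ZN3foo3barE.

    Only plain length-prefixed identifiers are handled; templates,
    function signatures, and substitutions are out of scope for this sketch.
    """
    prefix = "_ZN"
    if not symbol.startswith(prefix):
        raise ValueError("not a nested Itanium-mangled name")
    pos = len(prefix)
    parts = []
    while pos < len(symbol) and symbol[pos] != "E":
        start = pos
        # Read the decimal length that precedes each identifier.
        while pos < len(symbol) and symbol[pos].isdigit():
            pos += 1
        if pos == start:
            raise ValueError("unsupported mangling at offset %d" % pos)
        length = int(symbol[start:pos])
        parts.append(symbol[pos:pos + length])
        pos += length
    return "::".join(parts)


missing = "_ZN6google8protobuf8internal26fixed_address_empty_stringE"
print(demangle_nested_name(missing))
# prints: google::protobuf::internal::fixed_address_empty_string
```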

Answer 4 · 2020-04-02T05:26:55.000Z

Hi @cecilr14, I have no access to the Linux machine at present, so I used a Windows machine to compile with CMake and succeeded. If you also want to try it on Windows, I can share more details with you. For now, I have only configured the compilation environment for the old version, TF 1.5 with icg, and have not upgraded the code to a newer version of TF.

The current issue after compilation is that the extra custom ops in tf-icg cannot be found in the compiled binary library (_icg_ops.dll for Windows or _icg_ops.so for Linux). This could be a load failure caused by the early version of the tf.load_op_library function, or there may still be something wrong in the compilation configuration. I will try to upgrade the tf-icg source code to 2.1 or 2.2 this week.

Answer 5 · 2020-04-02T14:01:47.000Z

Hey @nbwuzhe, I believe it was due to flaws in the earlier version of TensorFlow. I compiled the icg code with the r1.8 branch and the functions began to work. To insert the icg code into the r1.8 branch of TensorFlow, I simply had to put /tensorflow/contrib/icg into the same location in the r1.8 branch, then do the last two steps listed in the file /tensorflow/contrib/icg/HOWTO.md:

Add line "//tensorflow/contrib/(your-module):(your-module)_py" to tensorflow/contrib/BUILD under section py_library( name = "contrib_py",
Add line from tensorflow.contrib import "your-module" to tensorflow/contrib/__init__.py
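
The two HOWTO steps above can be sketched as a small text transformation. This is only an illustration under assumptions: the helper name add_contrib_module is hypothetical, the BUILD file is assumed to contain a "deps = [" line inside the contrib_py rule, and appending the import at the end of the init file is a simplification (the HOWTO does not say where the line must go). Editing the two files by hand is just as valid.

```python
def add_contrib_module(build_text: str, init_text: str, module: str):
    """Return (build_text, init_text) with `module` registered in contrib.

    Assumes the contrib_py rule has a `deps = [` line; this layout is an
    assumption, not something taken from the HOWTO.
    """
    dep_line = '        "//tensorflow/contrib/%s:%s_py",' % (module, module)
    import_line = "from tensorflow.contrib import %s" % module

    out, in_contrib_py, added = [], False, False
    for line in build_text.splitlines():
        out.append(line)
        if 'name = "contrib_py"' in line:
            in_contrib_py = True
        elif in_contrib_py and not added and line.strip().startswith("deps = ["):
            # Insert the new dependency as the first entry of the deps list.
            out.append(dep_line)
            added = True
    if not added:
        raise ValueError("contrib_py deps list not found")

    new_init = init_text.rstrip("\n") + "\n" + import_line + "\n"
    return "\n".join(out) + "\n", new_init


# Tiny stand-in file contents, not the real TensorFlow files.
build = 'py_library(\n    name = "contrib_py",\n    deps = [\n    ],\n)\n'
init = "from tensorflow.contrib import layers\n"
new_build, new_init = add_contrib_module(build, init, "icg")
print(new_build)
print(new_init)
```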

Answer 6 · 2020-04-02T15:27:39.000Z

Thank you @cecilr14. I have converted these Bazel commands into CMake and performed the compilation. I also tried Bazel with a very early version (0.11.1, the same one you used) on Windows, but it did not work. I will upgrade it to >=2.0 soon.

Answer 7 · 2020-04-24T11:36:06.000Z

Hi @nbwuzhe. I would like to help with that, as I am looking to create (ideally a Docker) version for Linux. Let me know about the progress or if I can add something.

Answer 8 · 2020-04-24T15:50:19.000Z

Hi @wizofe, for now I only have access to Windows, so compiling differs a lot from Linux. I managed a successful build of TF 1.5 with ICG on Windows using CMake, but loading the ICG binary fails: Python complains that it cannot find the required functions in the binary file, which is really weird.

To be more specific, please refer to lines 11 to 13 in the file tensorflow/contrib/icg/python/ops/icg_ops.py

_icg_ops_so = loader.load_op_library(resource_loader.get_path_to_datafile("_icg_ops.so"))

fftshift2d = _icg_ops_so.fftshift2d
ifftshift2d = _icg_ops_so.ifftshift2d

The functions named fftshift2d and ifftshift2d could not be found. I have already changed the extension from .so to .dll for Windows.
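
One way to narrow this down is to list what the loaded library object actually exposes and compare it with the names icg_ops.py expects. The sketch below is an assumption-laden illustration: missing_ops and exposed_names are hypothetical helpers, and a SimpleNamespace stands in for the object returned by load_op_library so the example runs without TensorFlow.

```python
from types import SimpleNamespace


def missing_ops(op_module, expected_names):
    """Return the expected op names that `op_module` does not expose."""
    return [name for name in expected_names if not hasattr(op_module, name)]


def exposed_names(op_module):
    """List public attribute names, i.e. candidates for generated op wrappers."""
    return sorted(n for n in dir(op_module) if not n.startswith("_"))


# Stand-in for the object returned by load_op_library; a real check would
# pass that object (e.g. _icg_ops_so) instead.
fake_module = SimpleNamespace(fftshift2d=lambda x: x)
print(missing_ops(fake_module, ["fftshift2d", "ifftshift2d"]))  # ['ifftshift2d']
print(exposed_names(fake_module))  # ['fftshift2d']
```

If the list of exposed names is empty, the ops were probably never registered in the binary, which would point at the build configuration rather than at the Python loader.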

It would be nice if everything could be executed in Docker, but in my current Windows environment Docker requires Hyper-V, which conflicts with some software that is essential for my work, so I am not trying that for now.

I'm still looking into possible ways to upgrade the code to TF 2.x, since the APIs changed significantly. I would really appreciate it if anyone who is familiar with these differences could give some hints.

Answer 9 · 2020-07-04T16:24:16.000Z

@nbwuzhe I haven't tried to port it to TF 2.x yet, but I have a wheel for TensorFlow 1.8.0 that I am happy to share with you if you're on Linux. You just need to create a virtualenv and install the whl in there. Let me know!

Answer 10 · 2020-07-16T13:45:29.000Z

It also works when added into TensorFlow 1.14.0. If you plan on running the code from here, you may get errors involving FigureCanvasBase; I had to downgrade my Matplotlib to 2.1.0 to get it to work. You may also get some errors in the code that saves the images, because it is outdated.

Answer 11 · 2020-08-12T22:23:02.000Z

I have it working in TF 2.3. Currently I build the icg functions separately with Bazel, following TensorFlow's custom operator guide, and then load the custom operators with tf.load_op_library(). The denoising-variational network code also had to be updated to TensorFlow 2. There is probably a better way to do it through TensorFlow Addons, which I may look into later.
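
The loading step can be sketched roughly as follows. This is not the exact code used above: load_custom_ops is a hypothetical wrapper, the library file name is illustrative, and the only TensorFlow call involved is tf.load_op_library, which takes the path to the compiled shared library.

```python
import os


def load_custom_ops(library_path):
    """Load a compiled custom-op library and return the generated Python module."""
    if not os.path.isfile(library_path):
        raise FileNotFoundError("custom-op library not found: %s" % library_path)
    # Imported lazily so the path check above works without TensorFlow installed.
    import tensorflow as tf
    return tf.load_op_library(library_path)


# Hypothetical usage; the file name depends on how the ops were built:
# icg_ops = load_custom_ops("./_icg_ops.so")
```

Checking the path first gives a clearer message than the loader error when the shared library was built to a different location.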

Answer 12 · 2021-01-27T12:26:21.000Z

Initially, the variational network was written in pure C++/CUDA, because we were not able to fit the memory-consuming task of MRI reconstruction onto the GPU when we started this work a couple of years ago. However, releasing, building, and maintaining that code would have been even more challenging. At the time the TF version was published, adding extensions to TF was not straightforward either, and adding custom CMake operators did not work smoothly.

The cumbersome build of tensorflow-icg is no longer required, and a new interface, optox, is provided instead. The fftshift/ifftshift functions are available in tensorflow (from 1.15), so the custom versions are not needed anymore. The repository mri-variationalnetwork has been updated.
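
For readers unsure what the shift ops do, here is a dependency-free sketch of the standard 2D fftshift/ifftshift definition on nested lists. It assumes the icg ops followed this standard definition, which this thread does not confirm; in TensorFlow the built-in counterparts are presumably tf.signal.fftshift and tf.signal.ifftshift.

```python
def _roll(seq, shift):
    """Cyclically shift a non-empty list to the right by `shift` positions."""
    shift %= len(seq)
    return seq[-shift:] + seq[:-shift] if shift else list(seq)


def fftshift2d(rows):
    """Move the zero-frequency entry to the centre of a 2D grid."""
    h, w = len(rows), len(rows[0])
    return _roll([_roll(r, w // 2) for r in rows], h // 2)


def ifftshift2d(rows):
    """Undo fftshift2d (the two differ only for odd-sized axes)."""
    h, w = len(rows), len(rows[0])
    return _roll([_roll(r, -(w // 2)) for r in rows], -(h // 2))


grid = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
print(fftshift2d(grid))               # [[8, 6, 7], [2, 0, 1], [5, 3, 4]]
print(ifftshift2d(fftshift2d(grid)))  # [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
```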