tensorflow/models

Adding a new op when using TensorFlow on Windows

AnmolChachra opened this issue · 26 comments

I am following the guide (https://www.tensorflow.org/extend/adding_an_op) step by step.
The first two steps, i.e. registering the new op and implementing the kernels for the op, are the same on all operating systems.
Building the op library (https://www.tensorflow.org/extend/adding_an_op#build_the_op_library) is OS-specific.
The code for Linux and macOS users is given.
How do I do it on Windows?

Thanks for the clear question @AnmolChachra !

I bet @mrry has an answer for this.

mrry commented

The current Windows release does not support tf.load_op_library() or separate compilation of a custom op as a DLL. Currently, the only option for adding an op to TensorFlow for Windows is to compile it into a source build of the entire library. The easiest way to do that is to add your op registration to one of the .cc files in tensorflow/core/ops/, add your kernel implementation to tensorflow/core/kernels/ and follow the instructions for the CMake build.

(We're currently investigating a way to provide tf.load_op_library() support on Windows, and will update the adding an op instructions when this is possible.)

/cc @guschmue

Thanks for clearing that up, @mrry.
TensorFlow has grown beautifully over time.
I am sure this issue will also be resolved soon.

I pushed a change for tf.load_op_library() on Windows here:
https://github.com/guschmue/tensorflow/tree/win-loadlibrary

I think I'll get contrib/rnn working tomorrow before sending a pull request.
I have a sample CMakeLists.txt to build a user op (CPU only) for Windows outside of the TensorFlow build here:
https://gist.github.com/guschmue/4b4178addcc96a4ecc7f03659b59bcd9
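For illustration, a standalone CMakeLists.txt along those lines might look roughly like the sketch below. This is not the gist's actual content: the op name my_op and all paths are placeholders, and it assumes a locally built TensorFlow source tree so that the headers and the pywrap_tensorflow_internal import library exist.

```cmake
# Hypothetical standalone CMakeLists.txt for a CPU-only user op DLL.
cmake_minimum_required(VERSION 3.5)
project(my_op)

# Placeholder paths: a TensorFlow source tree and its CMake build output.
set(TF_SOURCE "C:/src/tensorflow" CACHE PATH "TensorFlow source tree")
set(TF_BUILD "${TF_SOURCE}/tensorflow/contrib/cmake/build" CACHE PATH "TensorFlow build dir")

# Headers live partly in the source tree and partly in the build tree
# (generated protobuf headers and the like).
include_directories(${TF_SOURCE} ${TF_BUILD})

add_library(my_op SHARED my_op.cc)
# Link against the import library so op/kernel registration symbols resolve.
target_link_libraries(my_op "${TF_BUILD}/Release/pywrap_tensorflow_internal.lib")
```

The resulting my_op.dll would then be loaded from Python with tf.load_op_library.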

contrib/rnn with tf.load_op_library() on Windows is working now (CPU). Waiting for a GPU build before sending the PR.

Hi @mrry, I'm running tensorflow-gpu (1.0.1) on Windows 10. How would you suggest I create and load word2vec_ops.so? My options appear limited - would really appreciate help...

mrry commented

@robosoup We don't currently have any docs for this, but if you're feeling adventurous you could try following @guschmue's CMake recipe for building a user ops library, which we currently use for the fused RNN ops only:

https://github.com/tensorflow/tensorflow/blob/557b7b85f6d1ac6879f23b9b57828ce7ccfd57dd/tensorflow/contrib/cmake/tf_python.cmake#L771

(In the long term, we hope to switch this process to use a set of Bazel rules that match the Linux and Mac OS builds, but this is still TBD.)

I have been using load_op_library() and have a workable process ... I can put up a little how-to shortly.

Thank you @mrry and @guschmue, I'll give this a try!

My favorite way is here:
https://gist.github.com/guschmue/2908e4411edc2faef6ddfe87a2ce4a1d
This becomes part of the normal CMake process, so one does not need to worry about compiler flags, includes ...
The drawback is that you need to build the source tree once to generate all the header files ... eventually we can add the includes to the Python wheel.
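As a rough sketch of this in-tree approach (the op name and path below are hypothetical; the gist above has the real recipe), the fragment added to the TensorFlow CMake build can stay very small, because compiler flags and include paths are inherited from the surrounding build:

```cmake
# Hypothetical fragment (e.g. a tf_user_op.cmake included by the main build).
set(my_op_srcs "${CMAKE_CURRENT_SOURCE_DIR}/my_op.cc")

add_library(my_op SHARED ${my_op_srcs})
# pywrap_tensorflow_internal provides the registration symbols the op needs.
target_link_libraries(my_op pywrap_tensorflow_internal)
```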

I was wondering if there are any updates on this?
Has tf.load_op_library() support been added yet, or is there an ETA for it?

Thanks a lot for the wonderful library. I absolutely love it!!

mrry commented

@zeeshanzia84 Thanks to some heroic efforts by @guschmue, it's now possible to use tf.load_op_library() on Windows. There isn't a lot of documentation for it right now, but the AddUserOps CMake function will help, and you can see an example of its use here:

https://github.com/tensorflow/tensorflow/blob/e4296aefff97e6edd3d7cee9a09b9dd77da4c034/tensorflow/contrib/cmake/tf_python.cmake#L792
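Sketched from memory of that revision (check the linked file for the exact target names and arguments), a call to AddUserOps looks roughly like:

```cmake
# Approximate shape of an AddUserOps invocation in tf_python.cmake;
# the source lists and DISTCOPY path below are illustrative, not verbatim.
AddUserOps(TARGET _gru_ops
           SOURCES "${tf_gru_srcs}"
           GPUSOURCES ${tf_gru_gpu_srcs}
           DEPENDS pywrap_tensorflow_internal tf_python_ops
           DISTCOPY ${CMAKE_CURRENT_BINARY_DIR}/tf_python/tensorflow/contrib/rnn/python/ops/)
```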

Thank you @guschmue for your great work. I wonder whether this workaround works for GPUs?

Yes, GPU does work with the DLL; for example, the GRU uses a GPU kernel that loads as a DLL:
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/cmake/tf_python.cmake#L943

Does someone have an example that shows how to compile external C/C++ code into a DLL that can be loaded by TensorFlow as an external op (without building the entire TensorFlow source) on Windows? The word2vec code that builds word2vec_ops.so/.dll would be a great example...

Automatically closing due to lack of recent activity. Please update the issue when new information becomes available, and we will reopen the issue. Thanks!

Hello, I'm trying to compile a custom op and I'm struggling to get everything working on Windows.
I've read the official guide and followed both this issue and the issue #19122 that was just mentioned here, with no luck so far.

You can see all my attempts here on StackOverflow.

My current situation is that I've installed both CMake and VS2015 and I'm using the test .cc, .cu and CMakeLists.txt files linked in the SO question. I can get CMake to run with this:

CMake CMakeLists.txt -DCMAKE_GENERATOR_PLATFORM=x64 -G "Visual Studio 14 2015"

to generate the VS project files, and then I manage to compile something with:

MSBuild /p:Configuration=Release add_one.vcxproj

But it fails due to the missing nsync_cv.h header file.

Any ideas on how to proceed? I feel like I'm not missing much to get this to work, at least I hope so.
I know the build process on Windows is not officially supported (at least not entirely), but any help or suggestion would be much appreciated.
I realize now those headers were probably missing because I was not compiling the op against a locally built TensorFlow (even though the guide didn't say that was strictly necessary, I guess it is, on Windows at least). See the following message, where I follow the advice here and build TF from source (or at least try to).

Decided to go another way and followed the advice posted here, to build TensorFlow from source.
I've also updated the question on SO with the info, but anyways, here's what I did:

  • I have this .cc file and this GPU kernel .cu.cc file to build.
  • Installed VS2015, the C++ compiler and the Windows SDK; installed Anaconda 3 version 4.1.1 as suggested, with Python 3.5 and the other required toolkits. Small note: I'm using CUDA 9.0, because the 8.0 version just failed with an error every time I tried to install it. I also tried installing CUDA 8.0, but then the first CMake script would fail, saying it found version 8.0 but needed version 9.0 (even though I specified version 8.0 in the build script).
  • Followed the guide to setup the environment, added the tf_user_op.cmake file with the paths to my .cc and .cu.cc files to compile, then built TensorFlow using:
cmake -A x64 -DCMAKE_BUILD_TYPE=Release -DPYTHON_EXECUTABLE="C:\Users\Sergio\Anaconda3\python.exe" -DPYTHON_LIBRARIES="C:\Users\Sergio\Anaconda3\libs\python35.lib" -DSWIG_EXECUTABLE="C:\Users\Sergio\Downloads\swigwin-3.0.12\swig.exe" -Dtensorflow_ENABLE_GPU=ON -DCUDNN_HOME="C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0" -G "Visual Studio 14 2015"
  • Went ahead to build the custom op using MSBuild /p:Configuration=Release ackermann_ops.vcxproj, but it failed with this error:

add_one.obj : error LNK2019: unresolved external symbol "void __cdecl AddOneKernelLauncher(float const *,int,int,int,int,float *)" (?AddOneKernelLauncher@@YAXPEBMHHHHPEAM@Z) referenced in function "public: virtual void __cdecl AddOneOpGPU::Compute(class tensorflow::OpKernelContext *)" (?Compute@AddOneOpGPU@@UEAAXPEAVOpKernelContext@tensorflow@@@Z) [C:\Users\Sergio\Documents\GitHub\tensorflow\tensorflow\contrib\cmake\user_ops_gen_python.vcxproj]

@guschmue Do you have any suggestions here? I might have configured something wrong in the .cmake file, or did something wrong elsewhere, but I tried to follow your guide step by step and I'm not sure about what else to do at this point. Thanks!

Hello, any ideas on how to tackle this?
I've been forced to switch to Ubuntu for now to keep working on my project (no issues compiling the custom op there, following the official guide), but I'd like to be able to eventually release it on Windows too.

Is there any way to achieve this? I saw that you've been doing some work to use Bazel to build TF on Windows in the last few months, is there maybe a way to use that to build the .dll for the custom op on Windows too?

@mrry I'd love to help if needed, I just have no clue what I could do here. I'd be happy to test build configurations and give feedback though, if that might help.
Thanks again!

add_one.obj : error LNK2019: unresolved external symbol "void __cdecl AddOneKernelLauncher(float const *,int,int,int,int,float *)" (?AddOneKernelLauncher@@YAXPEBMHHHHPEAM@Z) referenced in function "public: virtual void __cdecl AddOneOpGPU::Compute(class tensorflow::OpKernelContext *)" (?Compute@AddOneOpGPU@@UEAAXPEAVOpKernelContext@tensorflow@@@Z)

This basically says that the symbol was not linked against the right library; just add the following lines, as shown by @guschmue:

set (pywrap_tensorflow_lib "${tensorflow_from_site_package}/python/pywrap_tensorflow_internal.lib")

add_library(ackermann_op SHARED "ackermann_op.cc")
target_link_libraries(ackermann_op ${pywrap_tensorflow_lib})
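For what it's worth, an LNK2019 on AddOneKernelLauncher can also mean that the definition in the .cu.cc file was never compiled and linked into the DLL at all; under CMake, the CUDA source then has to go through cuda_add_library rather than a plain add_library. A minimal sketch, assuming the add_one file names from the error above and a Tensorflow_INCLUDE_DIRS variable pointing at the TF headers:

```cmake
# Compile both the C++ op and the CUDA kernel launcher into one DLL so
# that AddOneKernelLauncher resolves at link time. Names are assumptions
# taken from the error message, not a verified configuration.
find_package(CUDA REQUIRED)
include_directories(${Tensorflow_INCLUDE_DIRS})
set(CUDA_NVCC_FLAGS "-I ${Tensorflow_INCLUDE_DIRS} -D GOOGLE_CUDA=1 -x cu -Xcompiler")

cuda_add_library(add_one SHARED add_one.cc add_one.cu.cc)
target_link_libraries(add_one ${CUDA_LIBRARIES}
                      "${Tensorflow_INCLUDE_DIRS}/pywrap_tensorflow_internal.lib")
```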

Hi @karShetty, thank you for your answer.
The issue, though, is that CMake is no longer supported (and in fact it's apparently very difficult to even get TF to compile on Windows with it in the latest versions), and as I've mentioned in #22322, I'm actually looking for the ability to compile custom GPU ops on Windows using the same Bazel pipeline as on Mac/Linux.

I mean, ideally this should be added to the official docs and integrated into TF, not as some weird thing you have to dig around GitHub to figure out, but as a properly supported and documented process.

I saw they've actually added a page dedicated to building TF on Windows with Bazel with TF 1.11, so hopefully we're getting there, or at least moving in that direction 😄

Ah, well, I thought you wanted separate compilation of a custom op as a DLL, loaded by calling tf.load_op_library().

@karShetty That's indeed exactly what I want (and what I've been doing on Linux for the last 6 months).
I'd just like to have the same feature on Windows as well, with a well-documented process that uses the official Bazel build scripts, rather than a convoluted, undocumented, deprecated, borderline-impossible-to-set-up CMake script that some random (although very kind indeed) user on GitHub came up with months ago.

As I said on the other issue, I'd just like to see feature parity on Windows here, nothing more and nothing less than what has already been available for years on Mac/Linux.

EDIT: now that I re-read your message, this doesn't seem to be clear. I want to be able to build a custom GPU op against the binary TF release, without having to build the whole TF library from source. That's the key difference from what you're suggesting.

Well, the suggested method works fine even with TF r1.12; I use CMake just to build the DLL file.

This is what it typically looks like:

find_package(CUDA REQUIRED)
include_directories(${Tensorflow_INCLUDE_DIRS})  # Here Tensorflow_INCLUDE_DIRS refers to your prebuilt TF package at 'Lib\site-packages\tensorflow'
....

set(CUDA_NVCC_FLAGS "-I ${Tensorflow_INCLUDE_DIRS} -D GOOGLE_CUDA=1 -x cu -Xcompiler")
set(pywrap_tensorflow_lib "${Tensorflow_INCLUDE_DIRS}/pywrap_tensorflow_internal.lib")

cuda_add_library(xxx SHARED xxx.cc xxx.cc.cu .....)
target_link_libraries(xxx ${CUDA_LIBRARIES} ${pywrap_tensorflow_lib})

This would generate my xxx.dll

Also, in my header files I add #define NOMINMAX to prevent the Windows min/max macros from interfering.

And in Python I just call xxx = tf.load_op_library('xxx.dll')

@karShetty Could you please show the full CMakeLists file?

Hi all,

At this time, do we have a custom GPU op guide for TensorFlow > 2.0 on Windows, other than https://www.tensorflow.org/guide/create_op, which works on Linux? It seems some pioneers tried the CMake method in this issue discussion but eventually failed at some point. I have my custom op working on Linux, but I would love to see it work on Windows as well.

Any suggestion would be appreciated. Thanks.