tensorflow/decision-forests

GPU support for layer use

Hi,
I trained a TF-DF model and use it as a layer in a deep architecture.
I trained the TF-DF model on CPU, but I would like to train the deep model on GPU because my dataset is large.
Training the deep model with the TF-DF layer on CPU works fine, but on GPU I get the following error:

[1,mpirank:0,algo-1]:[WARNING] Failure to load the inference.so custom c++ tensorflow ops. This error is likely caused the version of TensorFlow and TensorFlow Decision Forests are not compatible. Full error:CHECK failed: GeneratedDatabase()->Add(encoded_file_descriptor, size):

stack trace:

2024-02-05T17:52:00.841+02:00 | [1,mpirank:0,algo-1]: import tensorflow_decision_forests as tfdf
2024-02-05T17:52:00.841+02:00 | [1,mpirank:0,algo-1]: File "/usr/local/lib/python3.9/site-packages/tensorflow_decision_forests/__init__.py", line 63, in <module>
2024-02-05T17:52:00.841+02:00 | [1,mpirank:0,algo-1]: from tensorflow_decision_forests import keras
2024-02-05T17:52:00.841+02:00 | [1,mpirank:0,algo-1]: File "/usr/local/lib/python3.9/site-packages/tensorflow_decision_forests/keras/__init__.py", line 54, in <module>
2024-02-05T17:52:00.841+02:00 | [1,mpirank:0,algo-1]: from tensorflow_decision_forests.keras import core
2024-02-05T17:52:00.841+02:00 | [1,mpirank:0,algo-1]: File "/usr/local/lib/python3.9/site-packages/tensorflow_decision_forests/keras/core.py", line 65, in <module>
2024-02-05T17:52:00.841+02:00 | [1,mpirank:0,algo-1]: from tensorflow_decision_forests.tensorflow.ops.inference import api as tf_op
2024-02-05T17:52:00.841+02:00 | [1,mpirank:0,algo-1]: File "/usr/local/lib/python3.9/site-packages/tensorflow_decision_forests/tensorflow/ops/inference/api.py", line 180, in <module>
2024-02-05T17:52:00.841+02:00 | [1,mpirank:0,algo-1]: from tensorflow_decision_forests.tensorflow.ops.inference import op
2024-02-05T17:52:00.841+02:00 | [1,mpirank:0,algo-1]: File "/usr/local/lib/python3.9/site-packages/tensorflow_decision_forests/tensorflow/ops/inference/op.py", line 15, in <module>
2024-02-05T17:52:00.841+02:00 | [1,mpirank:0,algo-1]: from tensorflow_decision_forests.tensorflow.ops.inference.op_dynamic import *
2024-02-05T17:52:00.841+02:00 | [1,mpirank:0,algo-1]: File "/usr/local/lib/python3.9/site-packages/tensorflow_decision_forests/tensorflow/ops/inference/op_dynamic.py", line 24, in <module>
2024-02-05T17:52:00.841+02:00 | [1,mpirank:0,algo-1]: raise e
2024-02-05T17:52:00.841+02:00 | [1,mpirank:0,algo-1]: File "/usr/local/lib/python3.9/site-packages/tensorflow_decision_forests/tensorflow/ops/inference/op_dynamic.py", line 21, in <module>
2024-02-05T17:52:00.841+02:00 | [1,mpirank:0,algo-1]: ops = tf.load_op_library(resource_loader.get_path_to_datafile("inference.so"))
2024-02-05T17:52:00.841+02:00 | [1,mpirank:0,algo-1]: File "/usr/local/lib/python3.9/site-packages/tensorflow/python/framework/load_library.py", line 54, in load_op_library
2024-02-05T17:52:00.842+02:00 | [1,mpirank:0,algo-1]: lib_handle = py_tf.TF_LoadLibrary(library_filename)
2024-02-05T17:52:00.842+02:00 | [1,mpirank:0,algo-1]:RuntimeError: [1,mpirank:0,algo-1]:CHECK failed: GeneratedDatabase()->Add(encoded_file_descriptor, size):

I know that training TF-DF models on GPU is currently not supported, but I wonder whether GPU support is planned for prediction-style calls, such as using a trained TF-DF model as a layer.

I would appreciate your help. I think this capability would be very useful.
Regards

rstz commented

Hi,

Thank you for reporting this. We'll look at it in more detail soon (though probably not this week). I wasn't able to reproduce it right away using https://www.tensorflow.org/decision_forests/tutorials/model_composition_colab: in Colab with GPU support, it worked fine and executed on GPU where possible. Could you provide a small repro?
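
Since the error message points at a possible version mismatch, a repro is easier to act on when it comes with the installed package versions from the failing environment. The following is a minimal sketch, not an official TF-DF diagnostic: the helper name `report_versions` and the package list are assumptions made for illustration, and `protobuf` is included only on the assumption that it is relevant because the failing CHECK mentions `GeneratedDatabase()`. It reads package metadata only, so it still runs when `import tensorflow_decision_forests` fails.

```python
from importlib import metadata

# Distribution names to look up. This list is an assumption for illustration;
# extend it with whatever else is installed in the failing image.
PACKAGES = (
    "tensorflow",
    "tensorflow-cpu",
    "tensorflow-gpu",
    "tensorflow-decision-forests",
    "protobuf",
)


def report_versions(packages=PACKAGES):
    """Return {distribution name: version string, or None if not installed}.

    Only package metadata is read; nothing is imported, so this works even
    when loading the TF-DF custom ops fails at import time.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in report_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Running this in both the working CPU environment and the failing GPU environment and comparing the two outputs would show whether the images differ in their TensorFlow or TF-DF versions.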