AlexanderLutsenko/nobuco

tensorflowjs woes with saved keras file

Closed this issue · 4 comments

background
I arrived at this library after a failed upgrade/sidegrade from PyTorch to TensorFlow (missing ops in ONNX, too slow, broken, plus HWC layout headaches).
I need the model in TensorFlow.js to do some advanced WebGPU work (like this: https://github.com/shiguredo/media-processors), which doesn't seem possible with ONNX. Regarding the Twitter link above: does Keras 3 mean we can port a PyTorch nn.Module over to Keras, and then potentially get to TensorFlow.js (bypassing nobuco)?

I may be overlooking something, but:
if I use this library to convert a PyTorch 2 model to Keras (already did this), it spits out a Keras file.
Does it make a difference whether I save to .h5 or .keras?
I then attempt to load that file to convert it to TensorFlow.js.

Is there a supported version of tensorflowjs_converter (take a Keras file, spit out .bin files) to get to TensorFlow.js?

tensorflowjs_converter --input_format=keras implicit_motion_alignment.keras test

2024-10-18 15:46:56.398170: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-10-18 15:46:56.406687: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-10-18 15:46:56.416802: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-10-18 15:46:56.419778: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-10-18 15:46:56.426812: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-10-18 15:46:57.037827: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Traceback (most recent call last):
  File "/home/oem/miniconda3/envs/comfyui/bin/tensorflowjs_converter", line 8, in <module>
    sys.exit(pip_main())
             ^^^^^^^^^^
  File "/home/oem/miniconda3/envs/comfyui/lib/python3.11/site-packages/tensorflowjs/converters/converter.py", line 959, in pip_main
    main([' '.join(sys.argv[1:])])
  File "/home/oem/miniconda3/envs/comfyui/lib/python3.11/site-packages/tensorflowjs/converters/converter.py", line 963, in main
    convert(argv[0].split(' '))
  File "/home/oem/miniconda3/envs/comfyui/lib/python3.11/site-packages/tensorflowjs/converters/converter.py", line 949, in convert
    _dispatch_converter(input_format, output_format, args, quantization_dtype_map,
  File "/home/oem/miniconda3/envs/comfyui/lib/python3.11/site-packages/tensorflowjs/converters/converter.py", line 619, in _dispatch_converter
    dispatch_keras_h5_to_tfjs_layers_model_conversion(
  File "/home/oem/miniconda3/envs/comfyui/lib/python3.11/site-packages/tensorflowjs/converters/converter.py", line 84, in dispatch_keras_h5_to_tfjs_layers_model_conversion
    h5_file = h5py.File(h5_path, 'r')
              ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/oem/miniconda3/envs/comfyui/lib/python3.11/site-packages/h5py/_hl/files.py", line 561, in __init__
    fid = make_fid(name, mode, userblock_size, fapl, fcpl, swmr=swmr)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/oem/miniconda3/envs/comfyui/lib/python3.11/site-packages/h5py/_hl/files.py", line 235, in make_fid
    fid = h5f.open(name, flags, fapl=fapl)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 102, in h5py.h5f.open
OSError: Unable to synchronously open file (file signature not found)

It seems like this library can only work with TensorFlow 1...?
Because:

tensorflowjs 4.5 depends on the latest TensorFlow 2. Does that imply the Keras 3 format, and is that the cause of the signature error? Or is it just a malformed Keras save file? The prediction works, though.
The conflict is caused by:
tensorflow 2.12.0 depends on protobuf!=4.21.0, !=4.21.1, !=4.21.2, !=4.21.3, !=4.21.4, !=4.21.5, <5.0.0dev and >=3.20.3
tensorflowjs 3.21.0 depends on protobuf<3.20 and >=3.9.2
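As a side note on the `OSError` above: "file signature not found" is h5py reporting that the file is not HDF5 at all. The converter's `keras` input format goes through h5py, which expects the legacy HDF5 layout, while a Keras 3 `.keras` save is a zip archive. A stdlib-only sketch can tell the two apart (the magic bytes are the standard HDF5 and zip signatures; the helper name is mine):

```python
# Leading "magic" bytes: HDF5 files start with \x89HDF\r\n\x1a\n,
# zip archives (including Keras 3 .keras saves) with PK\x03\x04.
HDF5_MAGIC = b"\x89HDF\r\n\x1a\n"
ZIP_MAGIC = b"PK\x03\x04"

def detect_save_format(path):
    """Return 'hdf5' (legacy .h5), 'zip' (Keras 3 .keras archive),
    or 'unknown' based on the file's first bytes."""
    with open(path, "rb") as f:
        head = f.read(8)
    if head.startswith(HDF5_MAGIC):
        return "hdf5"
    if head.startswith(ZIP_MAGIC):
        return "zip"
    return "unknown"
```

If this reports `zip` for `implicit_motion_alignment.keras`, the h5py code path in the traceback was never going to open it, regardless of whether the save itself is well-formed.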

Can anyone confirm whether they can get a saved Keras file ported to TensorFlow.js?
tensorflowjs_converter --input_format=keras YOURFILEHERE.keras test


UPDATE - just found this; maybe the Keras file format causes my problems:
"should we support the full keras save with embedded config…"

UPDATE - in my code, instead of using h5 or keras, just use the tf (SavedModel) format.

# Save the Keras model as a TF SavedModel
keras_model.save("implicit_motion_alignment", save_format="tf")

that gets me this
[screenshot]
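Before pointing the converter at that directory, a quick sanity check that the export really produced a SavedModel layout can save a confusing converter error. This is just a sketch: `saved_model.pb` and `variables/` are the standard SavedModel entries, and the helper name is mine.

```python
import os

def looks_like_saved_model(model_dir):
    """Heuristic: a TF SavedModel directory contains a saved_model.pb
    protobuf plus a variables/ subdirectory with the checkpoint shards."""
    return (
        os.path.isfile(os.path.join(model_dir, "saved_model.pb"))
        and os.path.isdir(os.path.join(model_dir, "variables"))
    )
```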

and then

tensorflowjs_converter --input_format=tf_saved_model \
                       --output_format=tfjs_graph_model \
                       implicit_motion_alignment test

i get to this

[screenshot]

N.B. - it did require me to downgrade a couple of the libraries:

pip install tensorflow==2.15.0
pip install tensorflow-decision-forests==1.8.0
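Since the working combination here was pinned to TensorFlow 2.15, one way to fail fast on future environment drift is a small version gate before invoking the converter. This is a hypothetical helper (pure stdlib; the 2.15 pin is just the version that worked in this thread):

```python
import re

def version_tuple(v):
    """'2.15.0' -> (2, 15, 0); ignores non-numeric suffixes like 'rc1'."""
    parts = []
    for p in v.split("."):
        m = re.match(r"\d+", p)
        if not m:
            break
        parts.append(int(m.group()))
    return tuple(parts)

def tf_pin_ok(installed, pin="2.15.0"):
    """True if the installed TensorFlow is no newer than the known-good pin."""
    return version_tuple(installed) <= version_tuple(pin)
```

For example, check `tf_pin_ok(tf.__version__)` before shelling out to `tensorflowjs_converter`.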

awesome thanks for this library

UPDATE

  • this actually spits out a graph model, but I can't load it in the browser for inference.

Hi!

this actually spits out a graph model -- but I can't load in browser for inference.

Why won't it load, exactly? What error does it spit out?

TFJS converter is notoriously buggy. I've had quite a bit of (dis)pleasure working with it. Here's the script that works best for me:

os.system(f'''
tensorflowjs_converter \
--input_format=tf_saved_model \
--output_format=tfjs_graph_model \
--signature_name=serving_default \
--saved_model_tags=serve \
--quantize_float16="*" \
{os.path.join(output_dir, checkpoint_name)} \
{os.path.join(output_dir, checkpoint_name) + '.js'}
''')

One particularly annoying problem: if your model has multiple inputs/outputs, the converter can randomly shuffle them around. My way of putting them back in place is to manually edit model.json, which is part of the converted model.
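That manual model.json edit can be scripted. Below is a sketch that rewrites the `signature.inputs` / `signature.outputs` maps into a chosen name order; the field layout is my assumption based on typical graph-model output, and whether TFJS actually honors JSON key order is worth verifying against your converted model:

```python
import json

def reorder_signature(model_json_path, input_order=None, output_order=None):
    """Rewrite model.json so the signature.inputs / signature.outputs maps
    follow the given name order (Python dicts preserve insertion order,
    and a json round-trip keeps it)."""
    with open(model_json_path) as f:
        model = json.load(f)
    sig = model["signature"]
    if input_order:
        sig["inputs"] = {name: sig["inputs"][name] for name in input_order}
    if output_order:
        sig["outputs"] = {name: sig["outputs"][name] for name in output_order}
    with open(model_json_path, "w") as f:
        json.dump(model, f, indent=2)
```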

Thanks @AlexanderLutsenko, I stand corrected. I had given up on this, but since you pointed it out:
the key was adjusting how I load the graph model.

  // DON'T USE (according to SO): output_format=tfjs_layers_model
  // model = await tf.loadLayersModel(modelUrl);

  // USE: output_format=tfjs_graph_model
  model = await tf.loadGraphModel(modelPath);

This actually got me further, and though it's not exactly working how I need it, it gives me a path forward.

It would be helpful to guide people here to avoid using the layers format.
https://stackoverflow.com/questions/55829043/difference-between-tfjs-layers-model-and-tfjs-graph-model

I have a private nextjs project that I may opensource

side note - from your experience, have you been happy with TensorFlow.js? What has been your use case?
I'm kind of out on a skinny branch where ONNX didn't deliver, and I'm sort of deflated/disillusioned about the whole thing. It's been about 3 weeks of pushing, but I have yet to run any inference in the browser....
I'm attempting to build a neural video codec based on a Microsoft paper.
https://github.com/johndpope/imf

there are these video / tensorflowjs projects - https://github.com/shiguredo/media-processors

This post shared by @DrSleep is what led me to TensorFlow.js:
https://drsleep.github.io/technical/Tutorial-PyTorch-to-TensorFlow-JS/

Did you have a bad experience with ONNX?
I was going through another codebase - https://github.com/iperov/DeepFaceLive/tree/master - and they are using ONNX, though not in the browser....

Is nobuco predominantly for getting code working in the browser? Do you see ai-edge-torch superseding this library?
https://github.com/google-ai-edge/ai-edge-torch

UPDATE
Actually, it's completely working now. Thanks again.
[screenshot]