AlexanderLutsenko/nobuco

tensorflowjs woes with saved keras file

Closed this issue · 4 comments

background
I arrived at this library after a failed upgrade/sidegrade from PyTorch to TensorFlow (missing ops in ONNX, too slow, broken, plus HWC layout headaches).
I need the model in TensorFlow.js to do some advanced WebGPU work (like this: https://github.com/shiguredo/media-processors), which doesn't seem possible with ONNX. Regarding the Twitter link above: does Keras 3 mean we can port a PyTorch nn.Module over to Keras, and then potentially get to TensorFlow.js (bypassing nobuco)?

I may be overlooking something, but:
if I use this library to convert a PyTorch 2 model to Keras (already did this), it spits out a Keras file.
Does it make a difference whether I save to .h5 or .keras?
I then attempt to load that file to convert it to TensorFlow.js.

Is there a supported version of tensorflowjs_converter (take a Keras file, spit out .bin files) to get to TensorFlow.js?

tensorflowjs_converter --input_format=keras implicit_motion_alignment.keras test

2024-10-18 15:46:56.398170: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-10-18 15:46:56.406687: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-10-18 15:46:56.416802: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-10-18 15:46:56.419778: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-10-18 15:46:56.426812: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-10-18 15:46:57.037827: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Traceback (most recent call last):
  File "/home/oem/miniconda3/envs/comfyui/bin/tensorflowjs_converter", line 8, in <module>
    sys.exit(pip_main())
             ^^^^^^^^^^
  File "/home/oem/miniconda3/envs/comfyui/lib/python3.11/site-packages/tensorflowjs/converters/converter.py", line 959, in pip_main
    main([' '.join(sys.argv[1:])])
  File "/home/oem/miniconda3/envs/comfyui/lib/python3.11/site-packages/tensorflowjs/converters/converter.py", line 963, in main
    convert(argv[0].split(' '))
  File "/home/oem/miniconda3/envs/comfyui/lib/python3.11/site-packages/tensorflowjs/converters/converter.py", line 949, in convert
    _dispatch_converter(input_format, output_format, args, quantization_dtype_map,
  File "/home/oem/miniconda3/envs/comfyui/lib/python3.11/site-packages/tensorflowjs/converters/converter.py", line 619, in _dispatch_converter
    dispatch_keras_h5_to_tfjs_layers_model_conversion(
  File "/home/oem/miniconda3/envs/comfyui/lib/python3.11/site-packages/tensorflowjs/converters/converter.py", line 84, in dispatch_keras_h5_to_tfjs_layers_model_conversion
    h5_file = h5py.File(h5_path, 'r')
              ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/oem/miniconda3/envs/comfyui/lib/python3.11/site-packages/h5py/_hl/files.py", line 561, in __init__
    fid = make_fid(name, mode, userblock_size, fapl, fcpl, swmr=swmr)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/oem/miniconda3/envs/comfyui/lib/python3.11/site-packages/h5py/_hl/files.py", line 235, in make_fid
    fid = h5f.open(name, flags, fapl=fapl)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 102, in h5py.h5f.open
OSError: Unable to synchronously open file (file signature not found)

It seems like this library can only work with TensorFlow 1...?
Because:

tensorflowjs 4.5 depends on the latest TensorFlow 2. Does that imply the Keras 3 format, and is that the cause of the signature error? Or is it just a malformed Keras save file? The prediction works, though.
The conflict is caused by:
tensorflow 2.12.0 depends on protobuf!=4.21.0, !=4.21.1, !=4.21.2, !=4.21.3, !=4.21.4, !=4.21.5, <5.0.0dev and >=3.20.3
tensorflowjs 3.21.0 depends on protobuf<3.20 and >=3.9.2
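As a side note on the `OSError` above: "file signature not found" is h5py reporting that the file is not HDF5 at all. The converter's `keras` input format goes through h5py, which expects the legacy HDF5 layout, while a Keras 3 `.keras` save is a zip archive. A stdlib-only sketch can tell the two apart (the magic bytes are the standard HDF5 and zip signatures; the helper name is mine):

```python
# Leading "magic" bytes: HDF5 files start with \x89HDF\r\n\x1a\n,
# zip archives (including Keras 3 .keras saves) with PK\x03\x04.
HDF5_MAGIC = b"\x89HDF\r\n\x1a\n"
ZIP_MAGIC = b"PK\x03\x04"

def detect_save_format(path):
    """Return 'hdf5' (legacy .h5), 'zip' (Keras 3 .keras archive),
    or 'unknown' based on the file's first bytes."""
    with open(path, "rb") as f:
        head = f.read(8)
    if head.startswith(HDF5_MAGIC):
        return "hdf5"
    if head.startswith(ZIP_MAGIC):
        return "zip"
    return "unknown"
```

If this reports `zip` for `implicit_motion_alignment.keras`, the h5py code path in the traceback was never going to open it, regardless of whether the save itself is well-formed.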

Can anyone confirm whether they can get a saved Keras file ported to TensorFlow.js?
tensorflowjs_converter --input_format=keras YOURFILEHERE.keras test


UPDATE - just found this; maybe the Keras file format causes my problems:
"should we support the full keras save with embedded config…"

UPDATE - in my code, instead of using h5 or keras, just use the tf (SavedModel) format.

# Save the Keras model as a TF SavedModel
keras_model.save("implicit_motion_alignment", save_format="tf")

that gets me this
[screenshot]
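Before pointing the converter at that directory, a quick sanity check that the export really produced a SavedModel layout can save a confusing converter error. This is just a sketch: `saved_model.pb` and `variables/` are the standard SavedModel entries, and the helper name is mine.

```python
import os

def looks_like_saved_model(model_dir):
    """Heuristic: a TF SavedModel directory contains a saved_model.pb
    protobuf plus a variables/ subdirectory with the checkpoint shards."""
    return (
        os.path.isfile(os.path.join(model_dir, "saved_model.pb"))
        and os.path.isdir(os.path.join(model_dir, "variables"))
    )
```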

and then

tensorflowjs_converter --input_format=tf_saved_model \
                       --output_format=tfjs_graph_model \
                       implicit_motion_alignment test

i get to this

[screenshot]

N.B. - it did require me to downgrade a couple of the libraries:

pip install tensorflow==2.15.0
pip install tensorflow-decision-forests==1.8.0
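Since the working combination here was pinned to TensorFlow 2.15, one way to fail fast on future environment drift is a small version gate before invoking the converter. This is a hypothetical helper (pure stdlib; the 2.15 pin is just the version that worked in this thread):

```python
import re

def version_tuple(v):
    """'2.15.0' -> (2, 15, 0); ignores non-numeric suffixes like 'rc1'."""
    parts = []
    for p in v.split("."):
        m = re.match(r"\d+", p)
        if not m:
            break
        parts.append(int(m.group()))
    return tuple(parts)

def tf_pin_ok(installed, pin="2.15.0"):
    """True if the installed TensorFlow is no newer than the known-good pin."""
    return version_tuple(installed) <= version_tuple(pin)
```

For example, check `tf_pin_ok(tf.__version__)` before shelling out to `tensorflowjs_converter`.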

awesome thanks for this library

UPDATE

  • this actually spits out a graph model, but I can't load it in the browser for inference.

Hi!

this actually spits out a graph model -- but I can't load in browser for inference.

Why won't it load, exactly? What error does it spit out?

TFJS converter is notoriously buggy. I've had quite a bit of (dis)pleasure working with it. Here's the script that works best for me:

os.system(f'''
tensorflowjs_converter \
--input_format=tf_saved_model \
--output_format=tfjs_graph_model \
--signature_name=serving_default \
--saved_model_tags=serve \
--quantize_float16="*" \
{os.path.join(output_dir, checkpoint_name)} \
{os.path.join(output_dir, checkpoint_name) + '.js'}
''')

One particularly annoying problem: if your model has multiple inputs/outputs, the converter can randomly shuffle them around. My way of putting them back in place is to manually edit model.json, which is part of the converted model.
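That manual model.json edit can be scripted. Below is a sketch that rewrites the `signature.inputs` / `signature.outputs` maps into a chosen name order; the field layout is my assumption based on typical graph-model output, and whether TFJS actually honors JSON key order is worth verifying against your converted model:

```python
import json

def reorder_signature(model_json_path, input_order=None, output_order=None):
    """Rewrite model.json so the signature.inputs / signature.outputs maps
    follow the given name order (Python dicts preserve insertion order,
    and a json round-trip keeps it)."""
    with open(model_json_path) as f:
        model = json.load(f)
    sig = model["signature"]
    if input_order:
        sig["inputs"] = {name: sig["inputs"][name] for name in input_order}
    if output_order:
        sig["outputs"] = {name: sig["outputs"][name] for name in output_order}
    with open(model_json_path, "w") as f:
        json.dump(model, f, indent=2)
```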

Thanks @AlexanderLutsenko, I stand corrected. I had given up on this, but since you pointed it out:
the key was adjusting how I load the graph model.

  // DON'T USE (according to SO): output_format=tfjs_layers_model
  // model = await tf.loadLayersModel(modelUrl);

  // USE: output_format=tfjs_graph_model
  model = await tf.loadGraphModel(modelPath);

This actually got me further, and though it's not exactly working how I need it, it gives me a path forward.

It would be helpful to guide people here to avoid using the layers format.
https://stackoverflow.com/questions/55829043/difference-between-tfjs-layers-model-and-tfjs-graph-model

I have a private nextjs project that I may opensource

side note - from your experience, have you been happy with TensorFlow.js? What has been your use case?
I'm kind of out on a skinny branch where ONNX didn't deliver, and I'm sort of deflated/disillusioned about the whole thing. It's been about 3 weeks of pushing, but I have yet to run any inference in the browser....
I'm attempting to build a neural video codec based on a Microsoft paper.
https://github.com/johndpope/imf

there are these video / tensorflowjs projects - https://github.com/shiguredo/media-processors

This post shared by @DrSleep is what led me to TensorFlow.js:
https://drsleep.github.io/technical/Tutorial-PyTorch-to-TensorFlow-JS/

Did you have a bad experience with ONNX?
I was going through another codebase - https://github.com/iperov/DeepFaceLive/tree/master - and they are using ONNX, though not in the browser....

Is nobuco predominantly for getting code working in the browser? Do you see ai-edge-torch superseding this library?
https://github.com/google-ai-edge/ai-edge-torch

UPDATE
Actually, it's completely working now. Thanks again.
[screenshot]