NVlabs/FasterViT

Issue Converting FasterViT to CoreML

rotem154154 opened this issue · 8 comments

First off, I want to thank you for the incredible work on FasterViT. It's clearly a state-of-the-art model, and its performance has been truly outstanding.

I'm writing to bring up an issue I encountered when attempting to convert FasterViT to CoreML. Given the model's promise for speed and power, it was surprising to find that there's no straightforward way to convert it to CoreML for deployment on Apple devices.

During conversion, I got: RuntimeError: PyTorch convert function for op 'bernoulli_' not implemented.

Is there a known workaround to successfully convert FasterViT to CoreML?
Are there any plans to provide support for CoreML conversion in the near future?
Having this capability would significantly broaden the potential deployment platforms for the model, especially in mobile and edge environments.

Thank you for your time and looking forward to your feedback.

Best regards,
Rotem

Hi @rotem154154, thanks for your note. First, I would like to mention that you need to use a newer version of CoreML for this to work properly.

Specifically, we used the CoreML tooling that ships with Xcode 14 (which requires macOS 13.0 or newer). Simply double-clicking the converted model in Xcode exposes a variety of options, including a performance tab for measuring latency.

We did not encounter any issues converting the model with that CoreML release.

Edit: For model generation, please see my latest comment with a code snippet.

I hope this helps!

Hi @ahatamiz

Thank you for your quick response. I appreciate the clarification.

To ensure I'm using the appropriate tools, I've verified that I'm using the latest version of coremltools, specifically coremltools-6.3.0.

Would you be able to provide a more detailed step-by-step guide or perhaps a small code snippet on how you managed to convert the FasterViT model to CoreML? This would be incredibly helpful for my use case and for others who might be facing similar challenges.

Thank you once again for your support and the fantastic work on FasterViT.

Best regards,
Rotem

Hi @rotem154154, we used this version of Xcode.

Edit: For model generation, please see my latest comment with a code snippet.

Hi @ahatamiz

Thank you for pointing out the use of Xcode 14 for the CoreML model generation. I'm a bit puzzled, as my understanding has been that conversion from PyTorch models to CoreML primarily happens through coremltools.

Here's how I've been attempting the conversion:

import torch
from fastervit import create_model
import coremltools as ct

m = create_model('faster_vit_0_224', pretrained=False)
dummy_input = torch.zeros(1, 3, 224, 224)
traced_model = torch.jit.trace(m, dummy_input)

model = ct.convert(
    traced_model,
    inputs=[ct.TensorType(shape=dummy_input.shape)]
)

If there's a direct way to generate the CoreML model with Xcode 14 (without relying on coremltools), I'd appreciate a bit more insight into that process. To my understanding, Xcode primarily aids in running and benchmarking an already converted CoreML model, rather than converting directly from a PyTorch model.

Any clarification or guidance on this would be greatly appreciated.

Best regards,
Rotem

Hi @rotem154154

Sorry for the confusion. Please see my latest comment with a code snippet.

Hi @ahatamiz

Thank you for clarifying. It's intriguing to learn about this direct conversion feature in Xcode 14.

However, I'm encountering a challenge with this approach. The pre-trained models I have are in .pth.tar format, and when I attempt to open them directly with Xcode 14 on macOS 13, I don't see the options you've mentioned. It appears there's something I might be missing.

Additionally, I've been trying to find online documentation or any references about this new Xcode 14 feature of direct CoreML model generation, but my searches haven't yielded relevant results. If you have any official documentation, tutorials, or other resources about this process, it would be immensely helpful. This will ensure I'm following the correct steps and might clarify where I'm going astray.
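For completeness, here's how I'm loading those checkpoints into the PyTorch model before tracing (a sketch; I'm assuming the archive stores the weights under a 'state_dict' key, which may differ per checkpoint):

import torch
from fastervit import create_model

model = create_model('faster_vit_0_224', pretrained=False)
ckpt = torch.load('faster_vit_0_224.pth.tar', map_location='cpu')
state_dict = ckpt.get('state_dict', ckpt)  # some checkpoints wrap the weights
model.load_state_dict(state_dict)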

Once again, thank you for your patience and assistance.

Hi @rotem154154, apologies for the confusion. One still needs to use coremltools for the conversion.

I wrote the following script to facilitate the process. Please use coremltools==5.2.0:

import torch
import coremltools
from fastervit import create_model

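# Note: calling .eval() below is essential; it disables stochastic depth
# (DropPath), whose in-place bernoulli_ op the converter cannot handle.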
model = create_model('faster_vit_0_224').eval()
input_size = 224
bs_size = 1
file_name = 'faster_vit_0_224.mlmodel'
img = torch.randn((bs_size, 3, input_size, input_size), dtype=torch.float32)
model_jit_trace = torch.jit.trace(model, img)
model = coremltools.convert(model_jit_trace, inputs=[coremltools.ImageType(shape=img.shape)])
model.save(file_name)

The benchmarking can then be done directly in Xcode 14.
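If you want a quick sanity check before opening the model in Xcode, you can also run a prediction straight from coremltools (this only works on macOS). A small sketch: the input name is auto-generated by the converter, so it is read from the spec rather than hard-coded.

import coremltools
from PIL import Image

mlmodel = coremltools.models.MLModel('faster_vit_0_224.mlmodel')
input_name = mlmodel.get_spec().description.input[0].name
dummy = Image.new('RGB', (224, 224))  # any 224x224 RGB image works
print(mlmodel.predict({input_name: dummy}))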

I hope this helps!

I've reviewed the script you provided and realized the missing piece was calling .eval() on the model before tracing. That was the root of the issue I was facing.
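For anyone who hits the same error: as far as I can tell, FasterViT uses stochastic depth (drop path), which draws a per-sample keep mask with the in-place bernoulli_ op during training. Calling .eval() turns it into the identity, so the tracer never records that op. Here's a minimal sketch of the pattern (my approximation, assuming an implementation along the lines of timm's drop_path, not the exact repo code):

import torch

def drop_path(x: torch.Tensor, drop_prob: float = 0.1, training: bool = False) -> torch.Tensor:
    # In eval mode (or with drop_prob == 0) this is the identity,
    # so a model traced in eval mode never contains bernoulli_.
    if drop_prob == 0.0 or not training:
        return x
    keep_prob = 1.0 - drop_prob
    # One keep/drop decision per sample, broadcast over the remaining dims.
    shape = (x.shape[0],) + (1,) * (x.ndim - 1)
    mask = x.new_empty(shape).bernoulli_(keep_prob)  # the op the converter can't handle
    return x * mask / keep_prob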

Thank you so much for taking the time to help and for providing the comprehensive solution. Your guidance has been invaluable.

I'll go ahead and close this issue now. Thanks again!

Best regards,
Rotem