Any plans to add moondream and build a demo? Xenova/moondream2

Question

Closed this issue 23 days ago · 2 comments

BChip commented a month ago

Model description

I noticed that https://huggingface.co/Xenova/moondream2 has been created.

Are there plans to add moondream2 in v3, and has anyone started a demo yet?

Prerequisites

The model is supported in Transformers (i.e., listed here)
The model can be exported to ONNX with Optimum (i.e., listed here)

Additional information

from transformers import AutoModelForCausalLM, AutoTokenizer
from PIL import Image

model_id = "vikhyatk/moondream2"
revision = "2024-04-02"
model = AutoModelForCausalLM.from_pretrained(
    model_id, trust_remote_code=True, revision=revision
)
tokenizer = AutoTokenizer.from_pretrained(model_id, revision=revision)

image = Image.open('<IMAGE_PATH>')
enc_image = model.encode_image(image)
print(model.answer_question(enc_image, "Describe this image.", tokenizer))

Your contribution

Please let me know if you need any help with this. I am looking forward to having a tiny VLM available in transformers.js! 🤗

Answer 1 · 2024-05-08T14:20:31.000Z

Hi there 👋 Indeed, this is on our list :) The main issue is that the WebGPU version is still pretty slow, but now that we have Phi-3 running w/ WebGPU (demo), you should be seeing a Moondream demo soon. 🤞

Answer 2 · 2024-05-17T13:37:14.000Z

It's out! https://huggingface.co/spaces/Xenova/experimental-moondream-webgpu

[Video attachment: moondream-webgpu-2.mp4]

See the model card for usage instructions.