xenova/transformers.js

Any plans to add moondream and build a demo? Xenova/moondream2

Closed this issue · 2 comments

BChip commented

Model description

I noticed that https://huggingface.co/Xenova/moondream2 has been created.

Are there plans to add moondream2 in v3, and has anyone started a demo yet?

Prerequisites

  • The model is supported in Transformers (i.e., listed here)
  • The model can be exported to ONNX with Optimum (i.e., listed here)

Additional information

from transformers import AutoModelForCausalLM, AutoTokenizer
from PIL import Image

# Pin a revision so the remote model code stays reproducible
model_id = "vikhyatk/moondream2"
revision = "2024-04-02"
model = AutoModelForCausalLM.from_pretrained(
    model_id, trust_remote_code=True, revision=revision
)
tokenizer = AutoTokenizer.from_pretrained(model_id, revision=revision)

# Encode the image, then ask a question about the encoding
image = Image.open('<IMAGE_PATH>')
enc_image = model.encode_image(image)
print(model.answer_question(enc_image, "Describe this image.", tokenizer))
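Since the snippet above separates image encoding from question answering, one encoding can presumably be reused for several questions. Here is a minimal sketch of that idea. `ask_questions` is a hypothetical helper, not part of moondream or transformers. It assumes only the `encode_image` and `answer_question` methods shown above, and that an encoded image stays valid across calls.

```python
from typing import Any, Dict, Iterable


def ask_questions(
    model: Any, tokenizer: Any, image: Any, questions: Iterable[str]
) -> Dict[str, str]:
    """Encode `image` once and answer each question against that encoding.

    Hypothetical helper: relies only on the `encode_image` and
    `answer_question` methods used in the snippet above.
    """
    # Run the vision encoder a single time (assumed to be the costly step)
    enc_image = model.encode_image(image)
    # Reuse the same encoding for every question
    return {q: model.answer_question(enc_image, q, tokenizer) for q in questions}
```

With the `model`, `tokenizer`, and `image` objects from the snippet above, it could be called as `ask_questions(model, tokenizer, image, ["Describe this image.", "What colors stand out?"])`.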

Your contribution

Please let me know if you need any help with this. I am looking forward to having a tiny VLM available in transformers.js! 🤗

Hi there 👋 Indeed, this is on our list :) The main issue is that the WebGPU version is still pretty slow, but now that we have Phi-3 running w/ WebGPU (demo), you should be seeing a Moondream demo soon. 🤞

It's out! https://huggingface.co/spaces/Xenova/experimental-moondream-webgpu


See the model card for usage instructions.