/infer_grounding_dino

Primary LanguagePythonApache License 2.0Apache-2.0

Algorithm icon

infer_grounding_dino


Stars Website GitHub
Discord community

The Algorithm proposes a zero-shot object grounding model that can localize objects in an image with a natural language query.

Grounding Dino dog detection

🚀 Use with Ikomia API

1. Install Ikomia API

We strongly recommend using a virtual environment. If you're not sure where to start, we offer a tutorial here.

pip install ikomia

2. Create your workflow

from ikomia.dataprocess.workflow import Workflow
from ikomia.utils.displayIO import display


# Init your workflow
wf = Workflow()    

# Add the Grounding DINO Object Detector
dino = wf.add_task(name="infer_grounding_dino", auto_connect=True)

# Run on your image  
# wf.run_on(path="path/to/your/image.png")
wf.run_on(url="https://raw.githubusercontent.com/Ikomia-dev/notebooks/main/examples/img/img_dog.png")

# Inspect your results
display(dino.get_image_with_graphics())

☀️ Use with Ikomia Studio

Ikomia Studio offers a friendly UI with the same features as the API.

  • If you haven't started using Ikomia Studio yet, download and install it from this page.

  • For additional guidance on getting started with Ikomia Studio, check out this blog post.

📝 Set algorithm parameters

  • model_name (str) - default 'Swin-T': The GroundingDINO algorithm has two different checkpoint models: ‘Swin-B’ and ‘Swin-T’, with respectively, 172M and 341M of parameters.
  • prompt (str) - default 'car . person . dog .': Text prompt for the model
  • conf_thres (float) - default '0.35': Box threshold for the prediction‍
  • conf_thres_text (float) - default '0.25': Text threshold for the prediction
  • cuda (bool): If True, CUDA-based inference (GPU). If False, run on CPU

Parameters should be in strings format when added to the dictionary.

from ikomia.dataprocess.workflow import Workflow
from ikomia.utils.displayIO import display

# Init your workflow
wf = Workflow()    

# Add the Grounding DINO Object Detector
dino = wf.add_task(name="infer_grounding_dino", auto_connect=True)

dino.set_parameters({
    "model_name": "Swin-B",
    "prompt": "laptops . smartphone . headphone .",
    "conf_thres": "0.35",
    "conf_thres_text": "0.25"
})

# Run on your image  
# wf.run_on(path="path/to/your/image.png")
wf.run_on(url="https://raw.githubusercontent.com/Ikomia-dev/notebooks/main/examples/img/img_work.jpg")

# Inspect your results
display(dino.get_image_with_graphics())

🔍 Explore algorithm outputs

Every algorithm produces specific outputs, yet they can be explored them the same way using the Ikomia API. For a more in-depth understanding of managing algorithm outputs, please refer to the documentation.

import ikomia
from ikomia.dataprocess.workflow import Workflow

# Init your workflow
wf = Workflow()

# Add algorithm
algo = wf.add_task(name="infer_grounding_dino", auto_connect=True)

# Run on your image  
wf.run_on(url="https://raw.githubusercontent.com/Ikomia-dev/notebooks/main/examples/img/img_dog.png")

# Iterate over outputs
for output in algo.get_outputs():
    # Print information
    print(output)
    # Export it to JSON
    output.to_json()

⏩ Advanced usage

Check out the Grounding Dino blog post for more information on this algorithm.