This algorithm runs Grounding DINO, a zero-shot object grounding model that localizes objects in an image from a natural language query.
We strongly recommend using a virtual environment. If you're not sure where to start, we offer a tutorial here.
pip install ikomia
from ikomia.dataprocess.workflow import Workflow
from ikomia.utils.displayIO import display
# Init your workflow
wf = Workflow()
# Add the Grounding DINO Object Detector
dino = wf.add_task(name="infer_grounding_dino", auto_connect=True)
# Run on your image
# wf.run_on(path="path/to/your/image.png")
wf.run_on(url="https://raw.githubusercontent.com/Ikomia-dev/notebooks/main/examples/img/img_dog.png")
# Inspect your results
display(dino.get_image_with_graphics())
Ikomia Studio offers a friendly UI with the same features as the API.
- If you haven't started using Ikomia Studio yet, download and install it from this page.
- For additional guidance on getting started with Ikomia Studio, check out this blog post.
- model_name (str) - default 'Swin-T': The GroundingDINO algorithm has two checkpoint models, 'Swin-T' and 'Swin-B', with 172M and 341M parameters respectively.
- prompt (str) - default 'car . person . dog .': Text prompt for the model
- conf_thres (float) - default '0.35': Box threshold for the prediction
- conf_thres_text (float) - default '0.25': Text threshold for the prediction
- cuda (bool): If True, run inference on the GPU with CUDA. If False, run on the CPU.
Parameter values must be passed as strings when added to the dictionary.
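Since every value has to be a string and the prompt follows a "name . name ." pattern, a small helper can build the dictionary from native Python types. This is a minimal sketch: `build_prompt` and `to_string_params` are hypothetical helpers, not part of the Ikomia API, and it assumes that the task accepts the default `str()` form of floats and booleans (for example "True" for `cuda`).

```python
def build_prompt(class_names):
    """Join class names into the 'a . b . c .' format of the default prompt."""
    # Drop empty entries and surrounding whitespace
    cleaned = [name.strip() for name in class_names if name.strip()]
    return " . ".join(cleaned) + " ." if cleaned else ""


def to_string_params(params):
    """Convert every parameter value to a string (hypothetical helper)."""
    return {key: str(value) for key, value in params.items()}


params = to_string_params({
    "model_name": "Swin-T",
    "prompt": build_prompt(["car", "person", "dog"]),
    "conf_thres": 0.35,
    "conf_thres_text": 0.25,
    "cuda": True,  # assumption: the task parses "True"/"False" strings
})
print(params)
```

The resulting dictionary can then be passed to the task's `set_parameters` method.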
from ikomia.dataprocess.workflow import Workflow
from ikomia.utils.displayIO import display
# Init your workflow
wf = Workflow()
# Add the Grounding DINO Object Detector
dino = wf.add_task(name="infer_grounding_dino", auto_connect=True)
dino.set_parameters({
    "model_name": "Swin-B",
    "prompt": "laptops . smartphone . headphone .",
    "conf_thres": "0.35",
    "conf_thres_text": "0.25"
})
# Run on your image
# wf.run_on(path="path/to/your/image.png")
wf.run_on(url="https://raw.githubusercontent.com/Ikomia-dev/notebooks/main/examples/img/img_work.jpg")
# Inspect your results
display(dino.get_image_with_graphics())
Every algorithm produces specific outputs, yet they can all be explored the same way using the Ikomia API. For a more in-depth understanding of managing algorithm outputs, please refer to the documentation.
import ikomia
from ikomia.dataprocess.workflow import Workflow
# Init your workflow
wf = Workflow()
# Add algorithm
algo = wf.add_task(name="infer_grounding_dino", auto_connect=True)
# Run on your image
wf.run_on(url="https://raw.githubusercontent.com/Ikomia-dev/notebooks/main/examples/img/img_dog.png")
# Iterate over outputs
for output in algo.get_outputs():
    # Print information
    print(output)
    # Export it to JSON
    output.to_json()
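To keep the exported JSON rather than discard it, the outputs can be written to disk. The sketch below is illustrative only: `export_outputs` is a hypothetical helper, not an Ikomia function, and it assumes that `to_json()` returns a JSON string. Objects without a `to_json()` method are skipped.

```python
from pathlib import Path


def export_outputs(outputs, out_dir):
    """Write each output's JSON to <out_dir>/output_<index>.json (hypothetical helper)."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    written = []
    for index, output in enumerate(outputs):
        # Skip anything that does not expose to_json()
        if not hasattr(output, "to_json"):
            continue
        target = out_path / f"output_{index}.json"
        # Assumption: to_json() returns a str
        target.write_text(output.to_json(), encoding="utf-8")
        written.append(target)
    return written
```

With the workflow above, a call such as `export_outputs(algo.get_outputs(), "results")` would create one file per exportable output in a `results` folder.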
Check out the Grounding DINO blog post for more information on this algorithm.