Add GroundingDINO ONNX Runtime Python example #560

Open
siromermer wants to merge 1 commit into microsoft:main from siromermer:add-grounding-dino-python-ort-example

@siromermer commented May 24, 2026

Summary

Adds a Python GroundingDINO zero-shot object detection example using ONNX Runtime.

The example is self-contained under python/models/grounding_dino and uses:

  • ONNX Runtime for ONNX model execution.
  • Hugging Face Transformers for preprocessing and post-processing.
  • onnx-community/grounding-dino-tiny-ONNX, onnx/model_quantized.onnx as the default model.

The example does not commit model files, images, or generated artifacts. ONNX Runtime package selection is documented explicitly so users can choose either onnxruntime or onnxruntime-gpu.
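As an illustration of the package choice above, here is a minimal sketch of how a script could pick execution providers at runtime. The helper names `choose_providers` and `available_providers` are hypothetical and not taken from this PR; `onnxruntime.get_available_providers()` is the ONNX Runtime call that reports what the installed package supports.

```python
def choose_providers(requested, available):
    """Return an ordered provider list, keeping CPU as the fallback.

    `requested` is a provider name such as "CUDAExecutionProvider";
    `available` is the list reported by the installed ONNX Runtime build.
    (Hypothetical helper for illustration, not part of the PR.)
    """
    providers = []
    if requested in available:
        providers.append(requested)
    # Keep CPUExecutionProvider last so a session can still be created
    # when the requested provider is missing from the installed package.
    if "CPUExecutionProvider" not in providers:
        providers.append("CPUExecutionProvider")
    return providers


def available_providers():
    """Ask the installed package (onnxruntime or onnxruntime-gpu) what it supports."""
    import onnxruntime as ort  # imported lazily; both packages expose this module

    return ort.get_available_providers()
```

With the CPU-only `onnxruntime` package installed, `choose_providers("CUDAExecutionProvider", available_providers())` would be expected to fall back to `["CPUExecutionProvider"]`; the resulting list can be passed as the `providers` argument of `onnxruntime.InferenceSession`.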

Files

  • python/models/grounding_dino/infer_grounding_dino_onnxruntime.py
  • python/models/grounding_dino/README.md
  • python/models/grounding_dino/requirements.txt
  • python/models/grounding_dino/.gitignore
  • python/README.md

Verification

Tested on CPU with:

python python/models/grounding_dino/infer_grounding_dino_onnxruntime.py \
  --provider CPUExecutionProvider \
  --image http://images.cocodataset.org/val2017/000000039769.jpg \
  --text "a cat. a remote control." \
  --output python/models/grounding_dino/output.jpg
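For readers who want to see how the four flags in the command above could be wired up, here is a rough `argparse` sketch. The flag names come from the command; the defaults, the required/optional split, and the help strings are assumptions and may differ from the actual script.

```python
import argparse


def build_parser():
    """Build a parser for the flags used in the verification command.

    Defaults and help text are assumptions for illustration only.
    """
    parser = argparse.ArgumentParser(
        description="GroundingDINO zero-shot detection with ONNX Runtime (sketch)"
    )
    parser.add_argument(
        "--provider",
        default="CPUExecutionProvider",  # assumed default
        help="ONNX Runtime execution provider name",
    )
    parser.add_argument("--image", required=True, help="Image path or URL")
    parser.add_argument(
        "--text",
        required=True,
        help='Text prompt, e.g. "a cat. a remote control."',
    )
    parser.add_argument(
        "--output",
        default="output.jpg",  # assumed default
        help="Where to write the annotated image",
    )
    return parser
```

Calling `build_parser().parse_args()` in a script entry point would then yield `args.provider`, `args.image`, `args.text`, and `args.output`.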

Observed ONNX Runtime inputs:

pixel_values: np.ndarray[float32], Shape: (1, 3, 800, 800)
input_ids: np.ndarray[int64], Shape: (1, 9)
token_type_ids: np.ndarray[int64], Shape: (1, 9)
attention_mask: np.ndarray[int64], Shape: (1, 9)
pixel_mask: np.ndarray[int64], Shape: (1, 800, 800)

Observed ONNX Runtime outputs:

logits: np.ndarray[float32], Shape: (1, 900, 256)
pred_boxes: np.ndarray[float32], Shape: (1, 900, 4)
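To make the two output tensors more concrete, the sketch below shows roughly what post-processing does with them for a single image: each of the 900 queries is scored by its best per-token sigmoid probability, and surviving boxes are mapped to pixel corners. It assumes `pred_boxes` follows the DETR-family convention of normalized (cx, cy, w, h); the 0.3 threshold and all function names are hypothetical. The example in this PR relies on Hugging Face Transformers for this step, which additionally maps token scores back to text labels.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function (also handles -inf logits)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def query_score(token_logits):
    """Score one query by its highest per-token probability."""
    return max(sigmoid(v) for v in token_logits)


def cxcywh_to_xyxy(box, width, height):
    """Convert a normalized (cx, cy, w, h) box to pixel (x0, y0, x1, y1)."""
    cx, cy, w, h = box
    return (
        (cx - w / 2) * width,
        (cy - h / 2) * height,
        (cx + w / 2) * width,
        (cy + h / 2) * height,
    )


def filter_queries(logits, boxes, width, height, threshold=0.3):
    """Keep queries whose score exceeds `threshold` (assumed value).

    `logits` is a sequence of per-query token logits and `boxes` the matching
    sequence of normalized boxes, i.e. one batch element of each output.
    Results are sorted by descending score.
    """
    kept = []
    for token_logits, box in zip(logits, boxes):
        score = query_score(token_logits)
        if score > threshold:
            kept.append((score, cxcywh_to_xyxy(box, width, height)))
    return sorted(kept, key=lambda item: item[0], reverse=True)
```

Given NumPy outputs, `filter_queries(logits[0].tolist(), pred_boxes[0].tolist(), image_width, image_height)` would apply this to the first batch element; a vectorized NumPy version would be preferable for real use.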

Observed detections:

Detection 1: Label: a cat, Score: 0.467, Box: [342.32, 25.67, 635.34, 375.47]
Detection 2: Label: a cat, Score: 0.392, Box: [12.62, 56.25, 317.0, 477.64]

Checks run:

python -m ruff check python/models/grounding_dino/infer_grounding_dino_onnxruntime.py
python -m ruff format --check python/models/grounding_dino/infer_grounding_dino_onnxruntime.py
python -m compileall python/models/grounding_dino/infer_grounding_dino_onnxruntime.py

Notes

This example intentionally avoids an Ultralytics dependency and does not include checked-in model artifacts.

@siromermer
Author

@microsoft-github-policy-service agree
