[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
A third-party implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection".
[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.
awesome grounding: A curated list of research papers in visual grounding
Video Grounding and Captioning
[ACL 2024] GroundingGPT: Language-Enhanced Multi-modal Grounding Model
Grounding Image Matching in 3D with MASt3R
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
Train an RL agent to execute natural language instructions in a 3D Environment (PyTorch)
A Fast and Accurate One-Stage Approach to Visual Grounding, ICCV 2019 (Oral)
Grounding DINO with Segment Anything & Stable Diffusion colab
A curated list of “Temporally Language Grounding” and related areas
[EMNLP 2022] Unifying and multi-tasking structured knowledge grounding with language models
Paper collection on building and evaluating language model agents via executable language grounding
Auto Segmentation label generation with SAM (Segment Anything) + Grounding DINO
[AAAI 2020] Official implementation of "Learning Cross-modal Context Graph for Visual Grounding"
[TPAMI 2024 & CVPR 2023] PyTorch code for DGM4: Detecting and Grounding Multi-Modal Media Manipulation and beyond
Graph grounding for graph coloring algorithms such as Welsh-Powell and evolutionary algorithms such as Harmony Search and Genetic Algorithms
[Computer Science] Examples and tutorials on using SOTA computer vision models and techniques. Learn everything from old-school ResNet, through YOLO and object-detection transformers like DETR, to the latest models like...