🪩 Create Disco Diffusion artworks in one line
#Computer Science#Represent, send, store and search multimodal data
Translation: the data structure for unstructured data
#Natural Language Processing#A collection of research on knowledge graphs
#Awesome#A curated list of different papers and datasets in various areas of audio-visual processing
#Computer Science#PyTorch source code for "Stacked Cross Attention for Image-Text Matching" (ECCV 2018)
#Natural Language Processing#Analyze unstructured data with Towhee for tasks such as reverse image search, reverse video search, audio classification, question answering systems, molecular search, etc.
The official implementation of Achieving Cross Modal Generalization with Multimodal Unified Representation (NeurIPS '23)
[CVPR 2023] Referring Image Matting
Remote Sensing SAR-Optical Land-use Classification in PyTorch: high-resolution remote sensing semantic segmentation / land-cover segmentation / land-cover classification
#Vector Search Engine#[NAACL 2022] Mobile Text-to-Image search powered by multimodal semantic representation models (e.g., OpenAI's CLIP)
Weakly Supervised 3D Object Detection from Point Clouds (VS3D), ACM MM 2020
#Natural Language Processing#BioT5 (EMNLP 2023) and BioT5+ (ACL 2024 Findings)
DistillBEV: Boosting Multi-Camera 3D Object Detection with Cross-Modal Knowledge Distillation (ICCV 2023)
Improving Visual Grounding with Visual-Linguistic Verification and Iterative Reasoning, CVPR 2022
#Computer Science#Unofficial implementation of Google DeepMind's paper `Objects that Sound`
Code for journal paper "Learning Dual Semantic Relations with Graph Attention for Image-Text Matching", TCSVT, 2020.
This repository provides a comprehensive collection of research papers focused on multimodal representation learning, all of which have been cited and discussed in the survey just accepted https://dl....
[CVPR2024] The code of "UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory"
Unleash the Potential of Image Branch for Cross-modal 3D Object Detection [NeurIPS2023]
Official PyTorch implementation of our CVPR 2022 paper: Beyond a Pre-Trained Object Detector: Cross-Modal Textual and Visual Context for Image Captioning