[ICLR 2024] Official implementation of "🦙 Time-LLM: Time Series Forecasting by Reprogramming Large Language Models"
Official implementation of "CrossPoint: Self-Supervised Cross-Modal Contrastive Learning for 3D Point Cloud Understanding" (CVPR 2022)
[CVPR 2023 Highlight & TPAMI] Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval?
[AAAI 2023 & IJCV] Transferring Vision-Language Models for Visual Recognition: A Classifier Perspective
[CVPR 2023] Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models
[CVPR 2022] Cross-Modal Perceptionist: Can Face Geometry be Gleaned from Voices?
[CVPR 2023 Highlight 💡] Hidden Gems: 4D Radar Scene Flow Learning Using Cross-Modal Supervision
[ICLR 2023] Autoencoders as Cross-Modal Teachers: Can Pretrained 2D Image Transformers Help 3D Representation Learning?
Code, dataset, and models for our CVPR 2022 publication "Text2Pos"
[AAAI 2024] DGL: Dynamic Global-Local Prompt Tuning for Text-Video Retrieval.
In this work, we implement different cross-modal learning schemes, such as the Siamese Network, the Correlational Network, and the Deep Cross-Modal Projection Learning model, and study their performance. We also pro...
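As a rough illustration of the Siamese-style scheme mentioned above (not the repository's own code), two modality-specific encoders can be trained with a pair-based contrastive loss in a shared embedding space. The layer sizes, feature dimensions, and margin below are hypothetical.

```python
# Minimal sketch of a Siamese-style cross-modal embedding; dimensions,
# layer sizes, and the margin are illustrative assumptions, not taken
# from the repository above.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModalityEncoder(nn.Module):
    """Projects one modality's features into a shared embedding space."""
    def __init__(self, in_dim: int, embed_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(),
            nn.Linear(256, embed_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return F.normalize(self.net(x), dim=-1)  # unit-length embeddings

def contrastive_loss(z_a, z_b, label, margin: float = 0.5):
    """label = 1 for matching cross-modal pairs, 0 for mismatched pairs."""
    d = (z_a - z_b).pow(2).sum(dim=-1).sqrt()
    return (label * d.pow(2) + (1 - label) * F.relu(margin - d).pow(2)).mean()

# Toy usage with random "image" (2048-d) and "text" (300-d) features.
img_enc, txt_enc = ModalityEncoder(2048), ModalityEncoder(300)
img, txt = torch.randn(8, 2048), torch.randn(8, 300)
labels = torch.randint(0, 2, (8,)).float()
loss = contrastive_loss(img_enc(img), txt_enc(txt), labels)
loss.backward()
```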
[IJBHI 2023] Official implementation of CAMANet: Class Activation Map Guided Attention Network for Radiology Report Generation, accepted to the IEEE Journal of Biomedical and Health Informatics.
Original PyTorch implementation of the paper "Straight to the Point: Fast-forwarding Videos via Reinforcement Learning Using Textual Data" at the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
Code for Limbacher, T., Özdenizci, O., & Legenstein, R. (2022). Memory-enriched computation and learning in spiking neural networks through Hebbian plasticity. arXiv preprint arXiv:2205.11276.
This project creates the T4SA 2.0 dataset, a large-scale dataset for training visual models for sentiment analysis in the Twitter domain using a cross-modal student-teacher approach.
An intentionally simple Image to Food cross-modal search. Created by Prithiviraj Damodaran.
This is a cross-modal benchmark for industrial anomaly detection.
This is the code for our ICCV'19 paper on cross-modal learning and retrieval.
We design a cross-modal GAN that learns cross-domain, image-to-image modality transformation. The network synthesizes infrared images from visible images on the VEDAI dataset.
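For orientation only, a visible-to-infrared translation GAN of this kind typically pairs an encoder-decoder generator with a patch-wise discriminator and combines an adversarial loss with an L1 reconstruction term. The generic architecture, channel counts, and loss weight below are assumptions for illustration, not the paper's network.

```python
# Illustrative sketch of visible-to-infrared image translation, assuming
# 3-channel visible inputs and 1-channel infrared targets; this is a
# generic encoder-decoder GAN, not the architecture from the paper above.
import torch
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 1, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, visible):
        return self.net(visible)  # synthesized infrared image

class Discriminator(nn.Module):
    """Scores (visible, infrared) pairs as real or generated, patch-wise."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(4, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 1, 4, stride=2, padding=1),  # patch-wise logits
        )

    def forward(self, visible, infrared):
        return self.net(torch.cat([visible, infrared], dim=1))

# Toy generator objective: adversarial loss + weighted L1 reconstruction.
G, D = Generator(), Discriminator()
vis, ir = torch.randn(2, 3, 64, 64), torch.randn(2, 1, 64, 64)
fake_ir = G(vis)
logits = D(vis, fake_ir)
adv = nn.functional.binary_cross_entropy_with_logits(logits, torch.ones_like(logits))
g_loss = adv + 100 * nn.functional.l1_loss(fake_ir, ir)
```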