A multi-modal learning toolkit based on PaddlePaddle and PyTorch, supporting applications such as multi-modal classification, cross-modal retrieval, and image captioning.
Source code for AMFMN and the RSITMD dataset
#Computer Science# [IJCAI2022] Unsupervised Voice-Face Representation Learning by Cross-Modal Prototype Contrast
Secure and verifiable cross-modal retrieval