#Computer Science# Jina is a deep-learning-based search framework that supports many data types, such as images, video, long text, and PDFs.
A Survey on multimodal learning research.
Multimodal-GPT
#Large Language Models# AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.
Multimodal Unsupervised Image-to-Image Translation
✨✨Latest Advances on Multimodal Large Language Models
A framework to enable multimodal models to operate a computer.
Reading list for research topics in multimodal machine learning
This repository contains various models targeting multimodal representation learning and multimodal fusion for downstream tasks such as multimodal sentiment analysis.
Toward Multimodal Image-to-Image Translation
#Large Language Models# mPLUG-Owl: The Powerful Multi-modal Large Language Model Family
Reasoning in Large Language Models: Papers and Resources, including Chain-of-Thought and OpenAI o1 🍓
A Collaborative Correlation-Matching Network for Multimodality Remote Sensing Image Classification
This repository is built in association with our position paper on "Multimodality for NLP-Centered Applications: Resources, Advances and Frontiers". As a part of this release we share the information...