[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
#Large language models# Code and models for NExT-GPT: Any-to-Any Multimodal Large Language Model
MultiMAE: Multi-modal Multi-task Masked Autoencoders, ECCV 2022
A Modular and Multi-Modal Mapping Framework
An open visual-inertial mapping framework.
An open source multi-modal trip planner
Multi-Modal learning toolkit based on PaddlePaddle and PyTorch, supporting multiple applications such as multi-modal classification, cross-modal retrieval and image captioning.
M2DGR: a Multi-modal and Multi-scenario Dataset for Ground Robots (RA-L 2021 & ICRA 2022)
The TypeScript library for building multi-modal AI applications.
Multi-modal Image Registration And Connectivity anaLysis
#Android# Mobile-Agent: The Powerful Mobile Device Operation Assistant Family
FarmVibes.AI: Multi-Modal GeoSpatial ML Models for Agriculture and Sustainability
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
FaceBagNet - Patch-based Methods for Multi-modal Face Anti-spoofing (FAS)
RELLIS-3D: A Multi-modal Dataset for Off-Road Robotics
Adaptive Context-Aware Multi-Modal Network for Depth Completion
High-quality resources & applications for LLMs, multi-modal models and VectorDBs
A curated list of Multi-Modal Reinforcement Learning resources (continually updated)
AMC: Attention guided Multi-modal Correlation Learning for Image Search
MMGCN: Multi-modal Graph Convolution Network for Personalized Recommendation of Micro-video