#Computer Science# LAVIS - A One-stop Library for Language-Vision Intelligence
#Interview# A one-stop repository for generative AI research updates, interview resources, notebooks, and much more!
The implementation of "Prismer: A Vision-Language Model with Multi-Task Experts".
X-modaler is a versatile and high-performance codebase for cross-modal analytics (e.g., image captioning, video captioning, vision-language pre-training, visual question answering, visual commonsense r...
A general representation model across vision, audio, and language modalities. Paper: "ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities"
#Computer Science# [ICML 2024 (Oral)] Official PyTorch implementation of DoRA: Weight-Decomposed Low-Rank Adaptation
PyTorch code for CVPR 2019 paper: The Regretful Agent: Heuristic-Aided Navigation through Progress Estimation