#Computer Science# LAVIS - A One-stop Library for Language-Vision Intelligence (see the usage sketch after this list)
Recent Advances in Vision and Language Pre-Trained Models (VL-PTMs)
#Computer Science# A collection of awesome vision-language models for vision tasks
DeepSeek-VL: Towards Real-World Vision-Language Understanding
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
Multi-Task Vision and Language
#Awesome# A curated list of awesome vision and language resources (still under construction... stay tuned!)
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Bridging Vision and Language Model
Mixture-of-Experts for Large Vision-Language Models
A curated list of prompt-based papers in computer vision and vision-language learning.
Long Context Transfer from Language to Vision
Deep learning, image retrieval, and vision and language
This repository contains the official implementation of "FastVLM: Efficient Vision Encoding for Vision Language Models" - CVPR 2025
[ECCV 2024] Official code implementation of Vary: Scaling Up the Vision Vocabulary of Large Vision Language Models.
Prompt Learning for Vision-Language Models (IJCV'22, CVPR'22)
Code for ALBEF: a new vision-language pre-training method
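
As a quick orientation for the LAVIS entry above, here is a minimal image-captioning sketch. It assumes LAVIS's `load_model_and_preprocess` entry point and the `blip_caption`/`base_coco` model names from the library's quick-start docs, plus a placeholder image path; verify the exact names against the repository before relying on them.

```python
# Minimal LAVIS captioning sketch (assumed API: load_model_and_preprocess,
# model name "blip_caption" with type "base_coco"; image path is a placeholder).
import torch
from PIL import Image
from lavis.models import load_model_and_preprocess

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load a BLIP captioning model together with its matching image preprocessors.
model, vis_processors, _ = load_model_and_preprocess(
    name="blip_caption", model_type="base_coco", is_eval=True, device=device
)

# Preprocess a single image and generate a caption.
raw_image = Image.open("example.jpg").convert("RGB")
image = vis_processors["eval"](raw_image).unsqueeze(0).to(device)
print(model.generate({"image": image}))
```

The same `load_model_and_preprocess` call pattern is how LAVIS exposes its other pretrained vision-language models (retrieval, VQA, etc.), which is what makes it a "one-stop" library in practice.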