#Awesome# A curated list of awesome vision and language resources (still under construction... stay tuned!)
Multi Task Vision and Language
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Code for the ICML 2021 (long talk) paper: "ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision"
Recent Advances in Vision and Language Pre-training (VLP)
Pretrain Vision and Large Language Models in Python, published by Packt
deep learning, image retrieval, vision and language
Strong and Open Vision Language Assistant for Mobile Devices
Bridging Vision and Language Model
#Computer Science# Collection of AWESOME vision-language models for vision tasks
A curated list of prompt-based papers in computer vision and vision-language learning.
A curated list for vision-and-language navigation. ACL 2022 paper "Vision-and-Language Navigation: A Survey of Tasks, Methods, and Future Directions"
Recent Advances in Vision and Language PreTrained Models (VL-PTMs)
Ideas and thoughts about the fascinating field of Vision-and-Language Navigation
Vision-Language Pre-training for Image Captioning and Question Answering
A curated list of foundation models for vision and language tasks
#Computer Science# LAVIS - A One-stop Library for Language-Vision Intelligence
DeepSeek-VL: Towards Real-World Vision-Language Understanding