Recent Advances in Vision and Language PreTrained Models (VL-PTMs)
The official repo of Qwen-VL (通义千问-VL), a chat & pretrained large vision-language model proposed by Alibaba Cloud.
A curated list of vision-and-language pre-training (VLP). :-)
A paper list on large multi-modality models, parameter-efficient finetuning, vision-language pretraining, and conventional image-text matching, for preliminary insight.
Code and models for VALOR: Vision-Audio-Language Omni-Perception Pretraining Model and Dataset
DeepSeek-VL: Towards Real-World Vision-Language Understanding
Awesome Vision-Language Pretraining Papers
Vision-Language Pretraining & Efficient Transformer Papers.
Evaluating Vision & Language Pretraining Models with Objects, Attributes and Relations. [EMNLP 2022]
XLNet: Generalized Autoregressive Pretraining for Language Understanding
PyTorch original implementation of Cross-lingual Language Model Pretraining.
A large-scale 7B pretrained language model developed by BaiChuan-Inc.
Collection of AWESOME vision-language models for vision tasks
[CVPR 2022] Official code for "RegionCLIP: Region-based Language-Image Pretraining"
【ICLR 2024🔥】 Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment
Code accompanying the paper Pretraining Language Models with Human Preferences
Multi-Task Vision and Language
Bridging Vision and Language Model
LAVIS - A One-stop Library for Language-Vision Intelligence
Pipeline for pulling and processing online language model pretraining data from the web