LAVIS - A One-stop Library for Language-Vision Intelligence
[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
DeepSeek-VL: Towards Real-World Vision-Language Understanding