A curated list of foundation models for vision and language tasks
#Large Language Models# 🔥🔥🔥 A curated list of papers on LLM-based multimodal generation (image, video, 3D, and audio).
#Awesome# A curated list of awesome resources on personalized large multimodal models
Multimodal Bi-Transformers (MMBT) in Biomedical Text/Image Classification
#Computer Science# Phi-3-Vision model test - running locally
#Computer Science# Leverage VideoLLaMA 3's capabilities using LitServe.