[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
#Computer Science# Port of MiniGPT4 in C++ (4bit, 5bit, 6bit, 8bit, 16bit CPU inference with GGML)
[NAACL 2024] MMC: Advancing Multimodal Chart Understanding with LLM Instruction Tuning
Chinese medical multimodal large model: Large Chinese Language-and-Vision Assistant for BioMedicine
#Large Language Model# Medical Report Generation And VQA (Adapting XrayGPT to Any Modality)
Streamline the creation of supervised datasets to facilitate data augmentation for deep learning architectures focused on image captioning. The core framework leverages MiniGPT-4, complemented by the ...