[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
#Computer Science# Port of MiniGPT4 in C++ (4bit, 5bit, 6bit, 8bit, 16bit CPU inference with GGML)
[NAACL 2024] MMC: Advancing Multimodal Chart Understanding with LLM Instruction Tuning
Chinese medical multimodal large model: Large Chinese Language-and-Vision Assistant for BioMedicine
Streamline the creation of supervised datasets to facilitate data augmentation for deep learning architectures focused on image captioning. The core framework leverages MiniGPT-4, complemented by the ...
#Large Language Model# Medical Report Generation And VQA (Adapting XrayGPT to Any Modality)