#Large Language Models#🔥🔥 LLaVA++: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3)
【EMNLP 2024🔥】Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
Large Language-and-Vision Assistant for Biomedicine, built toward multimodal GPT-4-level capabilities.
Mixture-of-Experts for Large Vision-Language Models
LLaVA-Interactive-Demo
LLaVA-CoT, a visual language model capable of spontaneous, systematic reasoning
A simple "Be My Eyes" web app with a llama.cpp/llava backend
LLaVA server (llama.cpp).
Aligning LMMs with Factually Augmented RLHF
LLaVA-Plus: Large Language and Vision Assistants that Plug and Learn to Use Skills
Streamlines the fine-tuning process for multimodal models: PaliGemma, Florence-2, and Qwen2-VL.
An open-source, commercially usable multimodal model supporting bilingual Chinese-English visual-text dialogue.
#Vector Search Engine#Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️
[CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models
Chatbot Arena meets multi-modality! Multi-Modality Arena allows you to benchmark vision-language models side-by-side while providing images as inputs. Supports MiniGPT-4, LLaMA-Adapter V2, LLaVA, BLIP...
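Many of the entries above are LLaVA variants distributed as Hugging Face checkpoints. As a rough illustration only (not tied to any specific repo in this list), a minimal sketch of querying a LLaVA-1.5-style model through the `transformers` API might look like the following; the checkpoint id, image URL, and prompt template are assumptions:

```python
# Minimal, hypothetical sketch of running a LLaVA-1.5-style checkpoint with transformers.
# The model id, image URL, and prompt format below are assumptions for illustration.
import requests
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "llava-hf/llava-1.5-7b-hf"  # assumed checkpoint id
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(model_id, device_map="auto")

# Placeholder image URL; substitute your own image.
image = Image.open(requests.get("https://example.com/cat.png", stream=True).raw)
prompt = "USER: <image>\nWhat is shown in this image? ASSISTANT:"

# Preprocess the image/text pair, generate, and decode the answer.
inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=100)
print(processor.decode(output_ids[0], skip_special_tokens=True))
```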