
llava

ollama / ollama

#Large Language Model#Set up and run Llama 2 and other large models locally

llama llm llama2 Go ollama mistral gemma llama3 llava phi4 deepseek gemma3 qwen gemma3n gpt-oss

Go 154.01 k

2 days ago

haotian-liu / LLaVA

#Large Language Model#LLaVA is a large language-and-vision assistant with GPT-4V-level capabilities

gpt-4 chatbot ChatGPT llama multimodal llava foundation-models instruction-tuning multi-modality visual-language-learning llama-2 llama2 vision-language-model

Python 23.72 k

1 year ago

sgl-project / sglang

#Large Language Model#SGLang is a fast serving framework for large language models and vision language models.

CUDA inference llama llava llm llm-serving moe PyTorch transformer vlm llama3 deepseek deepseek-v3 deepseek-r1 qwen3 blackwell openai kimi gpt-oss deepseek-v3-2

Python 18.79 k

3 hours ago

Fanghua-Yu / SUPIR

#Computer Science#SUPIR aims at developing practical algorithms for photo-realistic image restoration in the wild. A new online demo is also available at suppixel.ai.

deep-learning diffusion-models llava sdxl stable-diffusion super-resolution restoration PyTorch pytorch-lightning

Python 5.29 k

5 months ago

yuanzhoulvpi2017 / zero_nlp

#Natural Language Processing#Chinese NLP solutions (large models, data, models, training, inference)

bert nlp transformers gpt2 chatglm-6b clip gpt PyTorch text-generation huggingface-transformers llama2 llama llava

Jupyter Notebook 3.66 k

2 months ago

SciSharp / LLamaSharp

#Large Language Model#A C#/.NET library to run LLMs (🦙LLaMA/LLaVA) efficiently on your local device.

chatbot gpt llama llamacpp llm semantic-kernel llava multi-modal llama2 llama3 llama-cpp

C# 3.38 k

10 days ago

open-compass / VLMEvalKit

#Large Language Model#Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks

gpt-4v large-language-models llava multi-modal openai vqa llm openai-api qwen gpt computer-vision PyTorch gpt4 ChatGPT clip vit evaluation claude gemini

Python 3.18 k

17 hours ago

om-ai-lab / OmAgent

#Large Language Model#Build multimodal language agents for fast prototyping and production

large-language-models multimodal-agent vision-and-language agent workflow chatbot gpt4 llm multimodal rag vlm gpt gradio llama llava openai Python gemini

Python 2.56 k

7 months ago

chenking2020 / FindTheChatGPTer

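#Large Language Model#ChatGPT's explosive popularity marked a key step toward AGI. This project collects open-source alternatives to ChatGPT, including text LLMs and multimodal LLMs, as a convenience for everyone.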

chatglm llama belle vicuna ChatGPT alpaca guanaco lora llava minigpt4 autogpt agi ceval baichuan llama2

2.03 k

2 years ago

Blaizzy / mlx-vlm

#Large Language Model#MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX.

llava llm MLX vision-transformer apple-silicon idefics local-ai paligemma vision-framework vision-language-model florence2 molmo pixtral

Python 1.69 k

7 hours ago

mbzuai-oryx / Video-ChatGPT

[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for ...

chatbot clip gpt-4 llama llava vicuna vision-language vision-language-pretraining

Python 1.44 k

2 months ago

unum-cloud / UForm

#Vector Search Engine#Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️

huggingface-transformers language-vision multimodal PyTorch semantic-search transformer cross-attention vector-search bert neural-network pretrained-models multi-lingual clip openai contrastive-learning representation-learning clustering image-search llava

Python 1.18 k

1 month ago

jhc13 / taggui

Tag manager and captioner for image datasets

image-captioning pyside6 stable-diffusion llava cogvlm florence-2

Python 1.15 k

2 days ago

gokayfem / awesome-vlm-architectures

#Awesome#Famous Vision Language Models and Their Architectures

clip llava vlm multimodal blip cogvlm internlm kosmos vision-language-model Awesome Lists

Markdown 1.04 k

8 months ago

TinyLLaVA / TinyLLaVA_Factory

#Natural Language Processing#A Framework of Small-scale Large Multimodal Models

large-multimodal-models llama llava nlp transformers vision-language

Python 906

6 months ago

NVlabs / Eagle

#Large Language Model#Eagle: Frontier Vision-Language Models with Data-Centric Strategies

Demo gpt4 huggingface llama llama3 llava lmm mllm llm large-language-models

Python 879

2 months ago

mbzuai-oryx / LLaVA-pp

#Large Language Model#🔥🔥 LLaVA++: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3)

conversation llama3 llava llm lmms phi3 vision-language llama-3-llava llama-3-vision llama3-llava phi-3-vision phi3-vision

Python 842

2 months ago

PsyChip / machina

OpenCV+YOLO+LLaVA-powered video surveillance system

camera llava ollama-api OpenCV Python rtsp yolo

Python 777

20 days ago

PaddlePaddle / PaddleMIX

Paddle Multimodal Integration and eXploration, supporting mainstream multi-modal tasks, including end-to-end large-scale multi-modal pretrain models and diffusion model toolbox. Equipped with high per...

aigc stable-diffusion clip image-to-text text-to-image controlnet multimodal text-to-video dit llava sora qwen2-vl minicpm-v

Python 701

1 month ago

SkalskiP / awesome-foundation-and-multimodal-models

#Natural Language Processing#👁️ + 💬 + 🎧 = 🤖 Curated list of top foundation and multimodal models! [Paper + Code + Examples + Tutorials]

blip clip foundational-models grounding-dino llava multimodal segment-anything computer-vision nlp open-vocabulary-detection open-vocabulary-segmentation image-captioning

Python 636

2 years ago
