#Large Language Models# Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
#Natural Language Processing# Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
A state-of-the-art open visual language model | Multimodal pre-trained model
The simplest, fastest repository for training/finetuning small-sized VLMs.
A flexible and efficient codebase for training visually-conditioned language models (VLMs)
#Large Language Models# Solve Visual Understanding with Reinforced VLMs
#Computer Science# Collection of AWESOME vision-language models for vision tasks
Inference with the Microsoft Florence2 VLM
[CoRL 2024] VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding
Witness the aha moment of VLM with less than $3.
LLM/VLM gaming agents and model evaluation through games.
#Front-End Development# Flame is an open-source multimodal AI system designed to translate UI design mockups into high-quality React code. It leverages vision-language modeling, automated data synthesis, and structured train...
X-VLM: Multi-Grained Vision Language Pre-Training (ICML 2022)
#Large Language Models# MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX.
#Large Language Models# 🚀 [Large Models] Train a 26M-parameter visual multimodal VLM from scratch in just 1 hour! 🌏
Easily fine-tune, evaluate and deploy gpt-oss, Qwen3, DeepSeek-R1, or any open source LLM / VLM!
[CVPR 2025] FLAIR: VLM with Fine-grained Language-informed Image Representations
Visualizing the attention of vision-language models
A curated list of awesome LLM/VLM/VLA resources for Autonomous Driving (LLM4AD), continually updated
VLM-RL: A Unified Vision Language Models and Reinforcement Learning Framework for Safe Autonomous Driving
#Large Language Models# A powerful toolkit for compressing large models, including LLMs, VLMs, and video generation models.
#Large Language Models# [NAACL 2025] LiteWebAgent: The Open-Source Suite for VLM-Based Web-Agent Applications
The official repo of Pai-Megatron-Patch for large-scale LLM & VLM training, developed by Alibaba Cloud.