#Large Language Models#SGLang is a fast serving framework for large language models and vision language models.
A GUI agent application based on UI-TARS (a vision-language model) that allows you to control your computer using natural language.
#Computer Science#This repository offers a comprehensive collection of tutorials on state-of-the-art computer vision models and techniques. Explore everything from foundational architectures like ResNet to cutting-edge...
#Large Language Models#Effortless data labeling with AI support from Segment Anything and other awesome models.
#Large Language Models#Solve Visual Understanding with Reinforced VLMs
#Large Language Models#Nexa SDK is a comprehensive toolkit for supporting GGML and ONNX models. It supports text generation, image generation, vision-language models (VLM), audio language models, auto-speech-recognition (ASR...
#Large Language Models#StarVector is a foundation model for SVG generation that transforms vectorization into a code generation task. Using a vision-language modeling architecture, StarVector processes both visual and textu...
#Large Language Models#The official repo of MiniMax-Text-01 and MiniMax-VL-01, a large language model and a vision-language model based on linear attention
#Large Language Models#Build multimodal language agents for fast prototyping and production
#Large Language Models#An AI-powered file management tool that ensures privacy by organizing local texts and images. Using Llama3.2 3B and Llava v1.6 models with the Nexa SDK, it intuitively scans, restructures, and organizes ...
#Large Language Models#The Cradle framework is a first attempt at General Computer Control (GCC). Cradle supports agents to ace any computer task by enabling strong reasoning abilities, self-improvement, and skill curation, ...
#Natural Language Processing#[NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
LLM Agent Framework in ComfyUI includes MCP server, Omost, GPT-sovits, ChatTTS, GOT-OCR2.0, and FLUX prompt nodes, access to Feishu, Discord, and adapts to all LLMs with similar OpenAI / aisuite interfaces,...
#Natural Language Processing#A reading list for large model safety, security, and privacy (including Awesome LLM Security, Safety, etc.).
#Large Language Models#A family of lightweight multimodal models.
Aircraft design optimization made fast through computational graph transformations (e.g., automatic differentiation). Composable analysis tools for aerodynamics, propulsion, structures, trajectory des...
#Large Language Models#A streamlined and customizable framework for efficient large model evaluation and performance benchmarking
#Awesome#Famous Vision Language Models and Their Architectures