#Large Language Model#SGLang is a fast serving framework for large language models and vision language models.
#Computer Science#This repository offers a comprehensive collection of tutorials on state-of-the-art computer vision models and techniques. Explore everything from foundational architectures like ResNet to cutting-edge...
#Large Language Model#Effortless data labeling with AI support from Segment Anything and other awesome models.
#Large Language Model#Nexa SDK is a comprehensive toolkit for supporting GGML and ONNX models. It supports text generation, image generation, vision-language models (VLM), audio language models, automatic speech recognition (ASR...
#Large Language Model#Solve Visual Understanding with Reinforced VLMs
A GUI Agent application based on UI-TARS (Vision-Language Model) that allows you to control your computer using natural language.
#Large Language Model#Build multimodal language agents for fast prototyping and production
#Large Language Model#An AI-powered file management tool that ensures privacy by organizing local texts and images. Using Llama3.2 3B and Llava v1.6 models with the Nexa SDK, it intuitively scans, restructures, and organizes ...
#Large Language Model#The Cradle framework is a first attempt at General Computer Control (GCC). Cradle supports agents to ace any computer task by enabling strong reasoning abilities, self-improvement, and skill curation, ...
#Natural Language Processing#[NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
An LLM agent framework in ComfyUI that includes an MCP server, Omost, GPT-SoVITS, ChatTTS, GOT-OCR2.0, and FLUX prompt nodes, offers access to Feishu and Discord, and adapts to all LLMs with OpenAI-like / aisuite interfaces,...
#Data Warehouse#🚀🚀🚀 A collection of awesome public YOLO-series object detection projects and related object detection datasets.
#Natural Language Processing#A reading list for large model safety, security, and privacy (including Awesome LLM Security, Safety, etc.).
#Large Language Model#A family of lightweight multimodal models.
Aircraft design optimization made fast through computational graph transformations (e.g., automatic differentiation). Composable analysis tools for aerodynamics, propulsion, structures, trajectory des...
#Awesome#Famous Vision Language Models and Their Architectures
#Large Language Model#A curated list of 3D vision papers related to the robotics domain in the era of large models (i.e., LLMs/VLMs), inspired by awesome-computer-vision, including papers, code, and related websites
#Data Warehouse#🚀🚀🚀 A collection of some awesome public projects about Large Language Model (LLM), Vision Language Model (VLM), Vision Language Action (VLA), AI Generated Content (AIGC), the related Datasets and Applica...
[CVPR 2024 🔥] GeoChat, the first grounded Large Vision Language Model for Remote Sensing