A GUI agent application based on UI-TARS (a vision-language model) that allows you to control your computer using natural language.
Let AI be your browser operator.
The most reliable AI agent framework that supports MCP.
Open-Source Chrome extension for AI-powered web automation. Run multi-agent workflows using your own LLM API key. Alternative to OpenAI Operator.
Create and run high-performance macOS and Linux VMs on Apple Silicon, with built-in support for AI agents.
Agent S: an open agentic framework that uses computers like a human
Ui.Vision Open-Source RPA Software with Computer Vision, OCR, Anthropic Computer Use/LLM. Selenium IDE import/export.
Open-source Generative Process Automation (i.e. Generative RPA). AI-first process automation with Large Language (LLMs), Large Action (LAMs), Large Multimodal (LMMs), and Visual Language (VLMs) Models.
[CVPR 2025] Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use.
A curated list of resources about AI agents for Computer Use, including research papers, projects, frameworks, and tools.
AI computer use powered by open source LLMs and E2B Desktop Sandbox
A fork of Anthropic Computer Use that you can run on Mac computers to give Claude and other AI models autonomous access to your computer.
Windows Agent Arena (WAA) 🪟 is a scalable OS platform for testing and benchmarking of multi-modal AI agents.
Desktop app powered by Claude’s computer use capability to control your computer
A framework to enable autonomous Android and computer use using any LLM (local or remote)
This is the repo for the paper "OS Agents: A Survey on MLLM-based Agents for Computer, Phone and Browser Use".
This is the repo for the paper "OS Agents: A Survey on MLLM-based Agents for General Computing Devices Use".