# UI-TARS-desktop

**Repository Path**: devai/UI-TARS-desktop

## Basic Information

- **Project Name**: UI-TARS-desktop
- **Description**: Local and remote GUI Agent: one-click launch, zero configuration, ready to use in seconds.
- **Primary Language**: Unknown
- **License**: Apache-2.0
- **Default Branch**: main
- **Homepage**: https://agent-tars.com/
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-08-16
- **Last Updated**: 2025-08-16

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

## Introduction

English | [简体中文](./README.zh-CN.md)

[![](https://trendshift.io/api/badge/repositories/13584)](https://trendshift.io/repositories/13584)

TARS\* is a Multimodal AI Agent stack, currently shipping two projects: [Agent TARS](#agent-tars) and [UI-TARS-desktop](#ui-tars-desktop):

| Agent TARS | UI-TARS-desktop |
| :--- | :--- |
| Agent TARS is a general multimodal AI Agent stack that brings the power of GUI Agent and Vision into your terminal, computer, browser, and product.<br><br>It primarily ships with a CLI and a Web UI. It aims to provide a workflow that is closer to human-like task completion through cutting-edge multimodal LLMs and seamless integration with various real-world MCP tools. | UI-TARS Desktop is a desktop application that provides a native GUI Agent based on the UI-TARS model.<br><br>It primarily ships local and remote computer operators as well as browser operators. |

## Table of Contents

- [News](#news)
- [Agent TARS](#agent-tars)
  - [Showcase](#showcase)
  - [Core Features](#core-features)
  - [Quick Start](#quick-start)
  - [Documentation](#documentation)
- [UI-TARS Desktop](#ui-tars-desktop)
  - [Showcase](#showcase-1)
  - [Features](#features)
  - [Quick Start](#quick-start-1)
- [Contributing](#contributing)
- [License](#license)
- [Citation](#citation)

## News

- **\[2025-06-25\]** We released Agent TARS Beta and Agent TARS CLI: [Introducing Agent TARS Beta](https://agent-tars.com/blog/2025-06-25-introducing-agent-tars-beta.html), a multimodal AI agent that aims to explore a way of working that is closer to human-like task completion through rich multimodal capabilities (such as GUI Agent and Vision) and seamless integration with various real-world tools.
- **\[2025-06-12\]** 🎁 We are thrilled to announce the release of UI-TARS Desktop v0.2.0! This update introduces two powerful new features, **Remote Computer Operator** and **Remote Browser Operator**, both completely free. No configuration is required: simply click to remotely control any computer or browser.
- **\[2025-04-17\]** 🎉 We're thrilled to announce the release of the new UI-TARS Desktop application v0.1.0, featuring a redesigned Agent UI. The application improves the computer-use experience, introduces new browser operation features, and supports [the advanced UI-TARS-1.5 model](https://seed-tars.com/1.5) for improved performance and precise control.
- **\[2025-02-20\]** 📦 Introduced the [UI TARS SDK](./docs/sdk.md), a powerful cross-platform toolkit for building GUI automation agents.
- **\[2025-01-23\]** 🚀 We updated the **[Cloud Deployment](./docs/deployment.md#cloud-deployment)** section of the Chinese-language guide, [GUI Model Deployment Tutorial](https://bytedance.sg.larkoffice.com/docx/TCcudYwyIox5vyxiSDLlgIsTgWf#U94rdCxzBoJMLex38NPlHL21gNb), with new information related to the ModelScope platform. You can now use the ModelScope platform for deployment.

## Agent TARS


Agent TARS is a general multimodal AI Agent stack that brings the power of GUI Agent and Vision into your terminal, computer, browser, and product.

It primarily ships with a CLI and a Web UI. It aims to provide a workflow that is closer to human-like task completion through cutting-edge multimodal LLMs and seamless integration with various real-world MCP tools.

### Showcase

```
Please help me book the earliest flight from San Jose to New York on September 1st and the last return flight on September 6th on Priceline
```

https://github.com/user-attachments/assets/772b0eef-aef7-4ab9-8cb0-9611820539d8

| Booking Hotel | Generate Chart with extra MCP Servers |
| :---: | :---: |
| Instruction: I am in Los Angeles from September 1st to September 6th, with a budget of $5,000. Please help me book a Ritz-Carlton hotel closest to the airport on booking.com and compile a transportation guide for me | Instruction: Draw me a chart of Hangzhou's weather for one month |

For more use cases, please check out [#842](https://github.com/bytedance/UI-TARS-desktop/issues/842).

### Core Features

- 🖱️ **One-Click Out-of-the-box CLI** - Supports both **headful** [Web UI](https://agent-tars.com/guide/basic/web-ui.html) and **headless** [server](https://agent-tars.com/guide/advanced/server.html) [execution](https://agent-tars.com/guide/basic/cli.html).
- 🌐 **Hybrid Browser Agent** - Control browsers using [GUI Agent](https://agent-tars.com/guide/basic/browser.html#visual-grounding), [DOM](https://agent-tars.com/guide/basic/browser.html#dom), or a hybrid strategy.
- 🔄 **Event Stream** - A protocol-driven Event Stream drives [Context Engineering](https://agent-tars.com/beta#context-engineering) and [Agent UI](https://agent-tars.com/blog/2025-06-25-introducing-agent-tars-beta.html#easy-to-build-applications).
- 🧰 **MCP Integration** - The kernel is built on MCP and also supports mounting [MCP Servers](https://agent-tars.com/guide/basic/mcp.html) to connect to real-world tools.

### Quick Start

Agent TARS CLI

```bash
# Launch with `npx`.
npx @agent-tars/cli@latest

# Install globally (requires Node.js >= 22).
npm install @agent-tars/cli@latest -g

# Run with your preferred model provider.
agent-tars --provider volcengine --model doubao-1-5-thinking-vision-pro-250428 --apiKey your-api-key
agent-tars --provider anthropic --model claude-3-7-sonnet-latest --apiKey your-api-key
```

Visit the comprehensive [Quick Start](https://agent-tars.com/guide/get-started/quick-start.html) guide for detailed setup instructions.

### Documentation

> 🌟 **Explore Agent TARS Universe** 🌟

| Category | Resource Link | Description |
| :--- | :--- | :--- |
| 🏠 Central Hub | Website | Your gateway to the Agent TARS ecosystem |
| 📚 Quick Start | Quick Start | Zero to hero in 5 minutes |
| 🚀 What's New | Blog | Discover cutting-edge features and vision |
| 🛠️ Developer Zone | Docs | Master every command and feature |
| 🎯 Showcase | Examples | View use cases built by the official team and the community |
| 🔧 Reference | API | Complete technical reference |
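
The Quick Start above notes that a global install requires Node.js >= 22. As a minimal sketch of a preflight check, the helpers below parse a version string and test it against that requirement. The names `node_major` and `node_ok` are hypothetical and are not part of the Agent TARS CLI; only the `>= 22` threshold comes from this README.

```bash
# Sketch only: node_major and node_ok are hypothetical helper names,
# not commands provided by Agent TARS.

# Print the major version from a string such as "v22.3.0" or "22.3.0".
node_major() {
  version="${1#v}"        # drop a leading "v" if present
  echo "${version%%.*}"   # keep everything before the first dot
}

# Succeed (exit status 0) only when the given version is at least 22.
node_ok() {
  major="$(node_major "$1")"
  case "$major" in
    ''|*[!0-9]*) return 1 ;;   # empty or non-numeric input is rejected
  esac
  [ "$major" -ge 22 ]
}
```

One possible use is to guard the global install, for example `node_ok "$(node --version)" && npm install @agent-tars/cli@latest -g`, so the install only runs when the local Node.js meets the stated requirement.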



## UI-TARS Desktop


UI-TARS Desktop is a native GUI agent driven by [UI-TARS](https://github.com/bytedance/UI-TARS) and the Seed-1.5-VL/1.6 series models, available on your local computer and in a remote VM sandbox in the cloud.

📑 Paper | 🤗 Hugging Face Models | 🫨 Discord | 🤖 ModelScope

🖥️ Desktop Application | 👓 Midscene (use in browser)

### Showcase

| Instruction | Local Operator | Remote Operator |
| :---: | :---: | :---: |
| Please help me open the autosave feature of VS Code and delay AutoSave operations for 500 milliseconds in the VS Code setting. |