<!-- DOCTOC SKIP -->
<h1 align="center">
<a href="https://www.skyvern.com">
<picture>
<source media="(prefers-color-scheme: dark)" srcset="fern/images/skyvern_logo.png"/>
<img height="120" src="fern/images/skyvern_logo_blackbg.png"/>
</picture>
</a>
<br />
</h1>
<p align="center">
🐉 Automate Browser-based workflows using LLMs and Computer Vision 🐉
</p>
<p align="center">
<a href="https://www.skyvern.com/"><img src="https://img.shields.io/badge/Website-blue?logo=googlechrome&logoColor=black"/></a>
<a href="https://www.skyvern.com/docs/"><img src="https://img.shields.io/badge/Docs-yellow?logo=gitbook&logoColor=black"/></a>
<a href="https://discord.gg/fG2XXEuQX3"><img src="https://img.shields.io/discord/1212486326352617534?logo=discord&label=discord"/></a>
<!-- <a href="https://pepy.tech/project/skyvern" target="_blank"><img src="https://static.pepy.tech/badge/skyvern" alt="Total Downloads"/></a> -->
<a href="https://github.com/skyvern-ai/skyvern"><img src="https://img.shields.io/github/stars/skyvern-ai/skyvern" /></a>
<a href="https://github.com/Skyvern-AI/skyvern/blob/main/LICENSE"><img src="https://img.shields.io/github/license/skyvern-ai/skyvern"/></a>
<a href="https://twitter.com/skyvernai"><img src="https://img.shields.io/twitter/follow/skyvernai?style=social"/></a>
<a href="https://www.linkedin.com/company/95726232"><img src="https://img.shields.io/badge/Follow%20on%20LinkedIn-8A2BE2?logo=linkedin"/></a>
</p>
Skyvern automates browser-based workflows using LLMs and computer vision. It provides a Playwright-compatible SDK that adds AI functionality on top of Playwright, as well as a no-code workflow builder that helps both technical and non-technical users automate manual workflows on any website, replacing brittle or unreliable automation solutions.
<p align="center">
<img src="fern/images/geico_shu_recording_cropped.gif"/>
</p>
Traditional approaches to browser automation required writing custom scripts for each website, often relying on DOM parsing and XPath-based interactions that broke whenever the website layout changed.
Instead of relying only on code-defined XPath interactions, Skyvern uses Vision LLMs to understand and interact with websites.
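To make the brittleness concrete, here is a minimal sketch using only the Python standard library. The two page layouts, the hard-coded path, and the helper names are hypothetical, and matching on the visible label is only a simplified stand-in for what a vision model does. It is not how Skyvern locates elements.

```python
import xml.etree.ElementTree as ET

# Two hypothetical versions of the same page: the button moved into new wrappers.
LAYOUT_V1 = "<body><div><form><button>Get a quote</button></form></div></body>"
LAYOUT_V2 = "<body><main><section><form><button>Get a quote</button></form></section></main></body>"

# A position-based path written against the first layout.
HARDCODED_PATH = "./div/form/button"


def find_by_path(markup: str):
    """Locate the button with the hard-coded structural path."""
    return ET.fromstring(markup).find(HARDCODED_PATH)


def find_by_label(markup: str, label: str):
    """Locate the button by what the user sees, ignoring where it sits in the tree."""
    for button in ET.fromstring(markup).iter("button"):
        if (button.text or "").strip() == label:
            return button
    return None


for name, markup in (("v1", LAYOUT_V1), ("v2", LAYOUT_V2)):
    path_hit = find_by_path(markup) is not None
    label_hit = find_by_label(markup, "Get a quote") is not None
    print(name, "path:", path_hit, "label:", label_hit)
```

The structural path only matches the first layout, while the label-based lookup finds the button in both.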
How it works
Skyvern was inspired by the Task-Driven autonomous agent design popularized by BabyAGI and AutoGPT -- with one major bonus: we give Skyvern the ability to interact with websites using browser automation libraries like Playwright.
Skyvern uses a swarm of agents to comprehend a website, and plan and execute its actions:
<picture>
<source media="(prefers-color-scheme: dark)" srcset="fern/images/skyvern_2_0_system_diagram.png" />
<img src="fern/images/skyvern_2_0_system_diagram.png" />
</picture>
This approach has a few advantages:
Skyvern can operate on websites it's never seen before, as it's able to map visual elements to actions necessary to complete a workflow, without any customized code
Skyvern is resistant to website layout changes, as there are no pre-determined XPaths or other selectors our system is looking for while trying to navigate
Skyvern is able to take a single workflow and apply it to a large number of websites, as it's able to reason through the interactions necessary to complete the workflow
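For readers unfamiliar with the task-driven agent pattern mentioned above, the following is a generic sketch of a plan-and-execute loop. It is an illustration of the general pattern only, not Skyvern's agent code: `run_task_loop`, the planner and executor signatures, and the stub steps are all hypothetical.

```python
from collections import deque
from typing import Callable, Iterable

# Hypothetical signatures: a planner turns a goal into steps, and an executor
# performs one step and may return follow-up steps discovered along the way.
Planner = Callable[[str], Iterable[str]]
Executor = Callable[[str], tuple[str, list[str]]]


def run_task_loop(
    goal: str, plan: Planner, execute: Executor, max_steps: int = 10
) -> list[tuple[str, str]]:
    """Work through a queue of steps until it is empty or the step budget is spent."""
    queue = deque(plan(goal))
    history: list[tuple[str, str]] = []
    while queue and len(history) < max_steps:
        step = queue.popleft()
        result, follow_ups = execute(step)
        history.append((step, result))
        queue.extend(follow_ups)  # newly discovered work joins the back of the queue
    return history


def stub_plan(goal: str) -> list[str]:
    # Stand-in for an LLM planner.
    return ["open the quote form", "fill in driver details"]


def stub_execute(step: str) -> tuple[str, list[str]]:
    # Stand-in for a browser action; filling in the form reveals one more step.
    follow_ups = ["submit the form"] if step == "fill in driver details" else []
    return "done", follow_ups


history = run_task_loop("get an insurance quote", stub_plan, stub_execute)
print([step for step, _ in history])
```

In a real system the planner and executor would be backed by an LLM and a browser; the stubs here only show how the queue drives the loop.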
A detailed technical report can be found here.
Demo
<!-- Redo demo -->
https://github.com/user-attachments/assets/5cab4668-e8e2-4982-8551-aab05ff73a7f
Quickstart
Skyvern Cloud
Skyvern Cloud is a managed version of Skyvern that lets you run Skyvern without managing the infrastructure. You can run multiple Skyvern instances in parallel, and it comes bundled with anti-bot detection mechanisms, a proxy network, and CAPTCHA solvers.
If you'd like to try it out, navigate to app.skyvern.com and create an account.
Run Locally (UI + Server)
Choose your preferred setup method:
Option A: pip install (Recommended)
Dependencies needed:
Python 3.11.x (3.12 also works; 3.13 is not yet supported)
NodeJS & NPM
Additionally, for Windows:
Rust
VS Code with C++ dev tools and Windows SDK
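If you want to confirm the interpreter before installing, a small check like the one below may help. It is a sketch based only on the versions listed above; `python_is_supported` is a hypothetical helper, not part of Skyvern.

```python
import sys

# Minor versions taken from the dependency list: 3.11 is the target, 3.12 works.
SUPPORTED_MINORS = {11, 12}


def python_is_supported(version_info=sys.version_info) -> bool:
    """Return True when the interpreter is Python 3.11 or 3.12."""
    major, minor = version_info[0], version_info[1]
    return major == 3 and minor in SUPPORTED_MINORS


status = "supported" if python_is_supported() else "not supported"
print(f"Python {sys.version_info[0]}.{sys.version_info[1]} is {status}")
```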
Install Skyvern
Run Skyvern
If you already have a database you want to use, pass a custom connection string to skip the
local Docker PostgreSQL setup:
Option B: Docker Compose
Install Docker Desktop
Clone the repository:
Run quickstart with Docker Compose:
When prompted, choose "Docker Compose" for the full containerized setup.
Navigate to http://localhost:8080
SDK
**Skyvern is a Playwright extension that adds AI-powered browser automation.** It gives you the full power of Playwright with additional AI capabilities—use natural language prompts to interact with elements, extract data, and automate complex multi-step workflows.
**Installation:**
Python: pip install skyvern, then run skyvern quickstart for local setup
TypeScript: npm install @skyvern/client
AI-Powered Page Commands
Skyvern adds four core AI commands directly on the page object:
| Command | Description |
|---------|-------------|
| page.act(prompt) | Perform actions using natural language (e.g., "Click the login button") |
| page.extract(prompt, schema) | Extract structured data from the page with optional JSON schema |
| page.validate(prompt) | Validate page state, returns bool (e.g., "Check if user is logged in") |
| page.prompt(prompt, schema) | Send arbitrary prompts to the LLM with optional response schema |
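The commands in the table can be combined as in the sketch below. It assumes the methods are awaitable, as in Playwright's async API, and that `extract` accepts the schema as a keyword argument; neither is confirmed by the table. `StubPage`, `read_order_total`, and the example schema are hypothetical and exist only so the sketch runs without a browser or a Skyvern install. How you obtain a real `page` object is not shown here.

```python
import asyncio
from typing import Any, Optional, Protocol


class AIPage(Protocol):
    """The subset of the page object used here, taken from the command table."""

    async def act(self, prompt: str) -> Any: ...
    async def extract(self, prompt: str, schema: Optional[dict] = None) -> Any: ...
    async def validate(self, prompt: str) -> bool: ...


async def read_order_total(page: AIPage) -> Any:
    """Act, check the resulting page state, then extract structured data."""
    await page.act("Click the login button")
    if not await page.validate("Check if user is logged in"):
        raise RuntimeError("login did not succeed")
    schema = {"type": "object", "properties": {"total": {"type": "string"}}}
    return await page.extract("Get the order total", schema=schema)


class StubPage:
    """Hypothetical stand-in that records calls instead of driving a browser."""

    def __init__(self) -> None:
        self.calls: list[str] = []

    async def act(self, prompt: str) -> None:
        self.calls.append("act")

    async def extract(self, prompt: str, schema: Optional[dict] = None) -> dict:
        self.calls.append("extract")
        return {"total": "42.00"}

    async def validate(self, prompt: str) -> bool:
        self.calls.append("validate")
        return True


stub = StubPage()
result = asyncio.run(read_order_total(stub))
print(stub.calls, result)
```

Validating before extracting keeps a failed login from silently producing empty data.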
Additionally, page.agent provides higher-level workflow commands:
| Command | Description |
|---------|-------------|
| page.agent.run_task(prompt) | Execute complex multi-step tasks |
| page.agent.login(credential_type, credential_id) | Authenticate with stored credentials (Skyvern, Bitwarden, 1Password) |
| page.agent.download_files(prompt) | Navigate and download files |
| page.agent.run_workflow(workflow_id) | Execute pre-built workflows |
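A workflow-level sketch using the page.agent commands might look like the following. As before, it assumes the methods are awaitable. The stub classes, the helper name, the credential ID, and the "skyvern" credential type value are placeholders rather than documented values, so check the SDK reference for the real ones.

```python
import asyncio
from typing import Any


async def download_latest_invoice(page: Any, credential_type: Any, credential_id: str) -> Any:
    """Log in with stored credentials, navigate, then download a file."""
    await page.agent.login(credential_type, credential_id)
    await page.agent.run_task("Open the billing page")
    return await page.agent.download_files("Download the most recent invoice")


class StubAgent:
    """Hypothetical stand-in that records calls instead of driving a browser."""

    def __init__(self) -> None:
        self.calls: list[str] = []

    async def login(self, credential_type: Any, credential_id: str) -> None:
        self.calls.append(f"login:{credential_id}")

    async def run_task(self, prompt: str) -> None:
        self.calls.append("run_task")

    async def download_files(self, prompt: str) -> list[str]:
        self.calls.append("download_files")
        return ["invoice.pdf"]


class StubPage:
    def __init__(self) -> None:
        self.agent = StubAgent()


stub_page = StubPage()
# "skyvern" and "cred_123" are placeholder values for illustration only.
files = asyncio.run(download_latest_invoice(stub_page, "skyvern", "cred_123"))
print(stub_page.agent.calls, files)
```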
AI-Augmented Playwright Actions
All standard Playwright actions support an optional prompt parameter for AI-powered element location:
| Action | Playwright | AI-Augmented |
|--------…