Wyse Browser is a powerful, multi-process runtime engine designed for executing automated flows within a browser environment. It provides a robust platform for creating, managing, and executing complex automation workflows through a comprehensive REST API.
- Powerful & Scalable Automation Core ✨: Built on NestJS and Playwright, Wyse Browser provides a reliable and efficient multi-process runtime engine. It orchestrates multiple sandboxed Chrome instances, enabling robust and scalable browser automation.
- AI-Driven Workflow Orchestration 🧠: Designed to integrate seamlessly with LLMs and AI Agents, facilitating the creation, management, and execution of sophisticated, AI-driven automation workflows.
- Modular & Extensible Worklets 🧩: Leverage Worklets as autonomous, reusable, and highly composable code blocks for specific tasks, allowing for flexible and extensible automation solutions.
- Comprehensive REST API Control 🔗: Offers a full-featured REST API for programmatic control over every aspect of the browser environment, including sessions, pages, flows, and individual browser actions.
- Parallel & Isolated Session Execution ⚡: Manages multiple independent browser sessions in parallel, each running in a sandboxed Chrome instance with isolated contexts (cookies, local storage), ensuring tasks run without interference.
- Rich & Granular Action Space 🤖: Provides a wide array of built-in, low-level browser actions—from navigation and clicking to executing custom JavaScript—offering precise control over browser interactions.
- Robust Security & Data Privacy 🔒: Prioritizes user safety with explicit consent mechanisms for data access, strong data privacy measures, and secure handling of worklets, which involve arbitrary code execution.
The Wyse Browser protocol is built for distributed systems, enabling each engine to manage multiple workflow and worklet instances efficiently.
graph TD
subgraph Intelligence Layer
M[LLM]
A[AI Agents]
end
M <--> A
APIS -- Exposes --> APIList[/api/session/<br>/api/flow/<br>/api/browser/]
subgraph Wyse Browser
B(RunTime)
C{Workflow}
AS[Browser Actions]
APIS[API Service]
end
subgraph "(Playwright)"
BIs[Chrome Instances]
end
A <--> B
B <--> APIS
B --> C
B --> AS
B -- Manages --> BIs
AL[visit, history, search, refresh_page<br/>click, click_full, double_click, text<br/>scroll_up, scroll_down, scroll_element_up, scroll_element_down<br/>scroll_to, wait, key_press, hover<br/>evaluate, init_js, content, create_tab<br/>switch_tab, close_tab, tabs_info, set_content<br/>select_option, drag, screenshot]
AS --> AL
AL --> W2[Website2]
C --> D[Worklet1]
C --> E[Worklet2]
C --> F[Worklet3]
C --> G[Worklet4]
subgraph External Resource
D
E
F
G
end
D --> I[Filesystem]
E --> J[Datasource1]
F --> K[Website1]
G --> L[External API]
style M fill:#D9FFD9,stroke:#66CC66,stroke-width:2px,rx:5px,ry:5px;
style A fill:#D9FFD9,stroke:#66CC66,stroke-width:2px,rx:5px,ry:5px;
style B fill:#E0F2FF,stroke:#3399FF,stroke-width:2px;
style C fill:#E0F2FF,stroke:#3399FF,stroke-width:1px,rx:5px,ry:5px;
style AS fill:#E0F2FF,stroke:#3399FF,stroke-width:1px,rx:5px,ry:5px;
style APIS fill:#E0F2FF,stroke:#3399FF,stroke-width:2px;
style APIList fill:#FFF,stroke:#333,stroke-width:1px,rx:5px,ry:5px;
style D fill:#FFD9D9,stroke:#FF6666,stroke-width:1px,rx:5px,ry:5px;
style E fill:#FFD9D9,stroke:#FF6666,stroke-width:1px,rx:5px,ry:5px;
style F fill:#E6E6FA,stroke:#9370DB,stroke-width:1px,rx:5px,ry:5px;
style G fill:#E6E6FA,stroke:#9370DB,stroke-width:1px,rx:5px,ry:5px;
style I fill:#FFD9D9,stroke:#FF6666,stroke-width:1px,rx:5px,ry:5px;
style J fill:#FFD9D9,stroke:#FF6666,stroke-width:1px,rx:5px,ry:5px;
style K fill:#E6E6FA,stroke:#9370DB,stroke-width:1px,rx:5px,ry:5px;
style L fill:#E6E6FA,stroke:#9370DB,stroke-width:1px,rx:5px,ry:5px;
style W2 fill:#E6E6FA,stroke:#9370DB,stroke-width:1px,rx:5px,ry:5px;
style AL fill:#FFF,stroke:#333,stroke-width:1px,rx:5px,ry:5px;
style BIs fill:#D5F5E3,stroke:#2ECC71,stroke-width:1px,rx:5px,ry:5px;
- Session 🌐: A dedicated, isolated browser environment (a sandboxed Chrome instance) that provides a consistent context for executing workflows and browser actions. Each session manages its own cookies, local storage, and pages (tabs), ensuring that automated tasks run without interference from other operations.
- Browser Actions 🤖: The fundamental building blocks for automation within a session. These are low-level, atomic operations executed on a browser page, such as visiting a URL (`visit`), clicking an element (`click`), typing text (`text`), or taking a screenshot (`screenshot`). They are exposed through the REST API, allowing granular control over browser interactions.
- Workflow 🚀: Defines a precise sequence of worklets executed in a specific order. Workflows are designed and created by AI agents to automate complex multi-step tasks within the browser. Each workflow maintains isolated data connections and state, ensuring independent and reliable execution.
- Worklet 🧩: A reusable, autonomous, and highly composable code block dedicated to performing a specific task. Worklets act as the modular units of automation, encapsulating logic for interactions with external resources or complex browser operations. They can be implemented in various languages and function as local processes or remote services, allowing for flexible and extensible automation.
- Node.js (v20.x or later)
- pnpm
- Clone the repository:

      git clone https://github.com/wyse-work/wyse-browser.git
      cd wyse-browser

- Navigate to the browser engine directory and install dependencies:

      cd browser
      pnpm install

- Build all worklets:

      ./build_worklets.sh

- Run the API development server:

      pnpm run start:dev

  The API server will be running at http://127.0.0.1:13100.

Here's a quick example of how to use curl to create a session, navigate to a page, and take a screenshot.

- Create a new session:

      SESSION_ID=$(curl -s -X POST http://127.0.0.1:13100/api/session/create \
        -H "Content-Type: application/json" \
        -d '{}' | grep -o '"session_id":"[^"]*' | cut -d'"' -f4)
      echo "Session created with ID: $SESSION_ID"

- Perform a "visit" action:

      curl -X POST http://127.0.0.1:13100/api/browser/action \
        -H "Content-Type: application/json" \
        -d '{
          "session_id": "'"$SESSION_ID"'",
          "action_name": "visit",
          "data": { "url": "https://www.google.com" }
        }'

- Take a screenshot:

      curl -X GET "http://127.0.0.1:13100/api/session/$SESSION_ID/screenshot"

The Wyse Browser exposes a rich set of API endpoints for programmatic control over browser automation tasks.
Base URL: http://127.0.0.1:13100

| Method | Endpoint | Description | Parameters |
|---|---|---|---|
| `GET` | `/api/health` | Checks if the API server is running. | None |
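
A script that launches the server may want to wait until the health endpoint responds before sending other requests. The following is a minimal sketch, assuming `/api/health` answers with a 2xx status once the server is up. The `wait_for_health` helper and the `WYSE_BASE_URL` variable are hypothetical names used for illustration, not part of Wyse Browser.

```shell
# Base URL of the API server; WYSE_BASE_URL is a hypothetical override.
BASE_URL="${WYSE_BASE_URL:-http://127.0.0.1:13100}"

# Poll GET /api/health until it answers with a 2xx status
# (assumed here to mean "ready").
# $1: maximum attempts (default 30), $2: seconds between attempts (default 1).
wait_for_health() {
  attempts="${1:-30}"
  delay="${2:-1}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP errors; --max-time bounds each try.
    if curl -sf --max-time 5 -o /dev/null "$BASE_URL/api/health"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

For example, `wait_for_health 10 2 || exit 1` makes up to ten attempts, two seconds apart, and aborts the script if none succeeds.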

| Method | Endpoint | Description | Parameters |
|---|---|---|---|
| `GET` | `/api/metadata/flow/:name` | Retrieves the manifest for a specific flow. | Path: name (string, required) |
| `GET` | `/api/metadata/worklet/:name` | Retrieves the manifest for a specific worklet. | Path: name (string, required) |
| `GET` | `/api/metadata/list/:type` | Lists all available metadata for a given type (flow or worklet). | Path: type (string, required) - flow or worklet |
| `POST` | `/api/metadata/save` | Saves or updates a flow manifest. | Body: UpdateMetadataDto<br>- metadata_type (string, required)<br>- name (string, required)<br>- data (object, required) |
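
The metadata endpoints can be exercised with a few small helpers. This is a sketch only: the helper names are hypothetical, the manifest content is supplied by the caller because its schema is not described here, and passing `flow` as `metadata_type` is an assumption based on the `flow` / `worklet` values accepted by the list endpoint.

```shell
# Base URL of the API server; WYSE_BASE_URL is a hypothetical override.
BASE_URL="${WYSE_BASE_URL:-http://127.0.0.1:13100}"

# Build the JSON body for POST /api/metadata/save (UpdateMetadataDto).
# $1: metadata_type, $2: name, $3: data as a raw JSON object.
# Values are inserted verbatim, so they must not contain unescaped quotes.
metadata_save_payload() {
  printf '{"metadata_type":"%s","name":"%s","data":%s}' "$1" "$2" "$3"
}

# List the manifests of one type ("flow" or "worklet").
list_metadata() {
  curl -s "$BASE_URL/api/metadata/list/$1"
}

# Save a flow manifest. $1: flow name, $2: manifest as raw JSON.
# ASSUMPTION: "flow" is the metadata_type value the server expects.
save_flow_manifest() {
  metadata_save_payload "flow" "$1" "$2" |
    curl -s -X POST "$BASE_URL/api/metadata/save" \
      -H "Content-Type: application/json" \
      -d @-
}
```

A typical call would be `list_metadata flow` to see what is available, followed by `save_flow_manifest my_flow "$(cat my_flow.json)"`, where `my_flow.json` is a placeholder for a manifest file you already have.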

| Method | Endpoint | Description | Parameters |
|---|---|---|---|
| `POST` | `/api/session/create` | Creates a new browser session. | Body: CreateSessionDto<br>- session_context (object, optional)<br>- session_id (string, optional) |
| `POST` | `/api/session/:sessionId/add_init_script` | Adds an initialization script to the session. | Path: sessionId (string, required)<br>Body: AddInitScriptDto<br>- script (string, required) |
| `GET` | `/api/session/:sessionId` | Retrieves details for a specific session. | Path: sessionId (string, required) |
| `GET` | `/api/session/:sessionId/context` | Gets the context (cookies, local storage) of a session. | Path: sessionId (string, required) |
| `GET` | `/api/session/:sessionId/release` | Closes and cleans up a session. | Path: sessionId (string, required) |
| `GET` | `/api/sessions/list` | Lists all active sessions. | None |
| `GET` | `/api/session/:sessionId/screenshot` | Takes a screenshot of the current page in a session. | Path: sessionId (string, required) |
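
Because each session holds a Chrome instance, scripts usually want to release it when they are done, even if a step fails. One way to sketch that pattern is shown below. The helper names are hypothetical, and the code assumes, as the quick example does, that the create response contains a `session_id` string field.

```shell
# Base URL of the API server; WYSE_BASE_URL is a hypothetical override.
BASE_URL="${WYSE_BASE_URL:-http://127.0.0.1:13100}"

# Pull the "session_id" string out of a JSON response read from stdin.
# Tolerates whitespace around the colon; assumes the id contains no quotes.
extract_session_id() {
  sed -n 's/.*"session_id"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Create a session, run the given command with the session id appended as
# its last argument, then release the session whether or not it succeeded.
with_session() {
  sid=$(curl -s -X POST "$BASE_URL/api/session/create" \
    -H "Content-Type: application/json" -d '{}' | extract_session_id)
  [ -n "$sid" ] || return 1
  status=0
  "$@" "$sid" || status=$?
  curl -s "$BASE_URL/api/session/$sid/release" > /dev/null
  return "$status"
}

# Example callback: print the session context (cookies, local storage).
show_context() {
  curl -s "$BASE_URL/api/session/$1/context"
}
```

With these definitions, `with_session show_context` creates a session, prints its context, and releases it again.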

| Method | Endpoint | Description | Parameters |
|---|---|---|---|
| `POST` | `/api/browser/action` | Executes a single browser action (e.g., click, text). | Body: BrowserActionDto<br>- session_id (string, required)<br>- page_id (number, optional, default: 0)<br>- action_name (string, required)<br>- data (object, required) |
| `POST` | `/api/browser/batch_actions` | Executes a batch of browser actions sequentially. | Body: BatchActionsDto<br>- session_id (string, required)<br>- page_id (number, optional, default: 0)<br>- actions (array, required), each item with:<br>&nbsp;&nbsp;- action_name (string, required)<br>&nbsp;&nbsp;- data (object, required) |
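
To illustrate the shape of a `BatchActionsDto` body, the sketch below assembles one from individual actions. The helper names are hypothetical, and the optional `page_id` is omitted so the documented default of 0 applies.

```shell
# Base URL of the API server; WYSE_BASE_URL is a hypothetical override.
BASE_URL="${WYSE_BASE_URL:-http://127.0.0.1:13100}"

# Build one entry of the "actions" array.
# $1: action_name, $2: data as a raw JSON object (defaults to {}).
action_entry() {
  data="${2:-}"
  [ -n "$data" ] || data='{}'
  printf '{"action_name":"%s","data":%s}' "$1" "$data"
}

# Build the BatchActionsDto body.
# $1: session_id, remaining arguments: entries produced by action_entry.
batch_payload() {
  session_id="$1"
  shift
  actions=""
  for entry in "$@"; do
    # Join entries with commas, without a leading comma on the first one.
    actions="${actions:+$actions,}$entry"
  done
  printf '{"session_id":"%s","actions":[%s]}' "$session_id" "$actions"
}

# Send the batch to the server.
run_batch() {
  batch_payload "$@" |
    curl -s -X POST "$BASE_URL/api/browser/batch_actions" \
      -H "Content-Type: application/json" -d @-
}
```

A call such as `run_batch "$SESSION_ID" "$(action_entry visit '{"url":"https://example.com"}')" "$(action_entry wait '{"time":2}')" "$(action_entry screenshot)"` would send three actions in one request.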

| Method | Endpoint | Description | Parameters |
|---|---|---|---|
| `POST` | `/api/session/:sessionId/page/create` | Creates a new page (tab) in a session. | Path: sessionId (string, required) |
| `GET` | `/api/session/:sessionId/page/:pageId/switch` | Switches the active page in a session. | Path:<br>- sessionId (string, required)<br>- pageId (number, required) |
| `GET` | `/api/session/:sessionId/page/:pageId/release` | Closes a specific page in a session. | Path:<br>- sessionId (string, required)<br>- pageId (number, required) |
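
The page endpoints differ only in their path, so a single URL helper is enough to wrap all three. This is a sketch with hypothetical helper names. It does not parse the responses, whose format is not described here.

```shell
# Base URL of the API server; WYSE_BASE_URL is a hypothetical override.
BASE_URL="${WYSE_BASE_URL:-http://127.0.0.1:13100}"

# Compose a page endpoint URL.
# $1: session id, $2: "create", or a page id followed by
# $3: "switch" or "release".
page_url() {
  if [ "$2" = "create" ]; then
    printf '%s/api/session/%s/page/create' "$BASE_URL" "$1"
  else
    printf '%s/api/session/%s/page/%s/%s' "$BASE_URL" "$1" "$2" "$3"
  fi
}

# Thin wrappers: create is a POST, switch and release are GETs.
create_page()  { curl -s -X POST "$(page_url "$1" create)"; }
switch_page()  { curl -s "$(page_url "$1" "$2" switch)"; }
release_page() { curl -s "$(page_url "$1" "$2" release)"; }
```

For instance, `create_page "$SESSION_ID"` opens a new tab and `switch_page "$SESSION_ID" 0` returns to the first page, assuming page ids start at 0 as the `page_id` default suggests.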

| Method | Endpoint | Description | Parameters |
|---|---|---|---|
| `POST` | `/api/flow/create` | Creates a new flow instance from a predefined manifest. | Body: CreateFlowDto<br>- flow_name (string, required)<br>- session_id (string, optional)<br>- is_save_video (boolean, optional)<br>- extension_names (string[], optional) |
| `POST` | `/api/flow/deploy` | Deploys a new flow using an inline JSON definition. | Body: DeployFlowDto<br>- flow (object, required)<br>- session_id (string, optional)<br>- is_save_video (boolean, optional)<br>- extension_names (string[], optional) |
| `POST` | `/api/flow/fire` | Executes an action within a running flow instance. | Body: FireFlowDto<br>- flow_instance_id (string, required)<br>- action_name (string, optional, default: action_flow_start)<br>- data (object, required) |
| `GET` | `/api/flow/list` | Lists all active flow instances. | None |
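
Creating a flow instance and then firing it takes two requests. The sketch below builds both bodies from the documented fields. The helper names are hypothetical, and the optional `is_save_video` and `extension_names` fields are left out for brevity.

```shell
# Base URL of the API server; WYSE_BASE_URL is a hypothetical override.
BASE_URL="${WYSE_BASE_URL:-http://127.0.0.1:13100}"

# Body for POST /api/flow/create. $1: flow_name, $2: optional session_id.
flow_create_payload() {
  if [ -n "${2:-}" ]; then
    printf '{"flow_name":"%s","session_id":"%s"}' "$1" "$2"
  else
    printf '{"flow_name":"%s"}' "$1"
  fi
}

# Body for POST /api/flow/fire.
# $1: flow_instance_id, $2: data as raw JSON, $3: optional action_name
# (action_flow_start is the documented default).
flow_fire_payload() {
  printf '{"flow_instance_id":"%s","action_name":"%s","data":%s}' \
    "$1" "${3:-action_flow_start}" "$2"
}

# POST the JSON read from stdin to the API path given as $1.
post_json() {
  curl -s -X POST "$BASE_URL$1" -H "Content-Type: application/json" -d @-
}
```

In use, `flow_create_payload my_flow | post_json /api/flow/create` creates the instance, and `flow_fire_payload "$FLOW_INSTANCE_ID" '{}' | post_json /api/flow/fire` starts it. Here `my_flow` is a placeholder name and `FLOW_INSTANCE_ID` must be taken from the create response, whose exact format is not shown here.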

| Method | Endpoint | Description | Parameters |
|---|---|---|---|
| `POST` | `/api/sessions/:sessionId/files` | Uploads one or more files to a session, checking against session storage limits. | Path: sessionId (string, required)<br>Body: filePath (string, optional) - target path for the file, defaults to original name. |
| `GET` | `/api/sessions/:sessionId/files` | Lists all files stored within a specific session. | Path: sessionId (string, required) |
| `GET` | `/api/sessions/:sessionId/files/*` | Downloads a specific file from a session. | Path:<br>- sessionId (string, required)<br>- filePath (string, required) |
| `HEAD` | `/api/sessions/:sessionId/files/*` | Retrieves metadata (headers) for a specific file in a session without downloading the content. | Path:<br>- sessionId (string, required)<br>- filePath (string, required) |
| `GET` | `/api/sessions/:sessionId/files.zip` | Downloads all files from a session as a ZIP archive. | Path: sessionId (string, required) |
| `DELETE` | `/api/sessions/:sessionId/files/*` | Deletes a specific file from a session. | Path:<br>- sessionId (string, required)<br>- filePath (string, required) |
| `DELETE` | `/api/sessions/:sessionId/files` | Deletes all files associated with a specific session. | Path: sessionId (string, required) |
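
The file endpoints can be wrapped in the same way. The sketch below is illustrative only: the helper names are hypothetical, and the upload assumes a multipart form whose file field is named `file`, which is a guess, since only the optional `filePath` field is documented.

```shell
# Base URL of the API server; WYSE_BASE_URL is a hypothetical override.
BASE_URL="${WYSE_BASE_URL:-http://127.0.0.1:13100}"

# URL of the session file collection, or of one file when $2 is given.
# The path is used verbatim; it is not URL-encoded here.
files_url() {
  if [ -n "${2:-}" ]; then
    printf '%s/api/sessions/%s/files/%s' "$BASE_URL" "$1" "$2"
  else
    printf '%s/api/sessions/%s/files' "$BASE_URL" "$1"
  fi
}

# Upload a local file as multipart form data.
# ASSUMPTION: the form field is named "file"; only filePath is documented.
# $1: session id, $2: local file, $3: optional target path in the session.
upload_file() {
  if [ -n "${3:-}" ]; then
    curl -s -X POST "$(files_url "$1")" -F "file=@$2" -F "filePath=$3"
  else
    curl -s -X POST "$(files_url "$1")" -F "file=@$2"
  fi
}

# Download one session file ($2) into a local path ($3).
download_file() {
  curl -sf -o "$3" "$(files_url "$1" "$2")"
}

# Download every session file as a ZIP archive into a local path ($2).
download_all() {
  curl -sf -o "$2" "$(files_url "$1").zip"
}
```

For example, `upload_file "$SESSION_ID" ./input.csv data/input.csv` followed by `download_all "$SESSION_ID" ./session-files.zip` would round-trip a file, provided the field-name assumption holds.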
The BrowserAction module provides a comprehensive set of low-level actions that can be executed on a page within a session. These actions are the fundamental building blocks for creating complex automation flows.

| Action | Description | Parameters |
|---|---|---|
| `url` | Gets the URL of the current page. | None |
| `visit` | Navigates the page to a specified URL. | url: The URL to visit. |
| `history` | Navigates forward or backward in the browser history. | num: A positive number to go forward, a negative number to go back. |
| `search` | Performs a Google search. | search_key: The text to search for. |
| `refresh_page` | Reloads the current page. | None |
| `click` | Clicks an element or a point on the page. | element_id or (x, y coordinates). |
| `click_full` | A more comprehensive click action. | element_id or (x, y coordinates). Optional: hold (seconds), button ("left", "right", "middle"). |
| `double_click` | Double-clicks an element or a point on the page. | element_id or (x, y coordinates). |
| `text` | Enters text into an element or at the current cursor position. | text: The text to type. Optional: element_id, press_enter (boolean), delete_existing_text (boolean), or (x, y coordinates). |
| `scroll_up` | Scrolls the page up. | None |
| `scroll_down` | Scrolls the page down. | None |
| `scroll_element_up` | Scrolls an element's container up. | element_id, page_number: Number of pages to scroll. |
| `scroll_element_down` | Scrolls an element's container down. | element_id, page_number: Number of pages to scroll. |
| `scroll_to` | Scrolls to make an element visible. | element_id: The ID of the element to scroll to. |
| `wait` | Pauses execution for a specified duration. | time: The number of seconds to wait. |
| `key_press` | Simulates key presses. | keys: A string or array of strings of keys to press (e.g., 'Enter', 'Control+A'). |
| `hover` | Hovers over an element or a point on the page. | element_id or (x, y coordinates). |
| `evaluate` | Executes a JavaScript snippet in the page context. | script: The JavaScript code to execute. |
| `init_js` | Injects initialization JavaScript into the page. | None |
| `wait_for_load_state` | Waits for the page to reach a specific load state. | None |
| `content` | Gets the full HTML content of the page. | None |
| `create_tab` | Creates a new browser tab. | Optional: url: The URL to open in the new tab. |
| `switch_tab` | Switches to a different tab. | tab_index: The index of the tab to switch to. |
| `close_tab` | Closes a browser tab. | tab_index: The index of the tab to close. |
| `tabs_info` | Retrieves information about all open tabs. | None |
| `cleanup_animations` | Removes animations from the page to stabilize tests. | None |
| `preview_action` | Highlights an element to preview an action without executing it. | element_id: The ID of the element to preview. |
| `set_content` | Sets the HTML content of the page. | content: The HTML content to set. |
| `ensure_page_ready` | Ensures the page is fully loaded and ready for interaction. | None |
| `select_option` | Selects an option from a dropdown or custom select component. | element_id or (x, y coordinates). |
| `drag` | Performs a drag-and-drop operation. | drag_path: A JSON string or array of points {x, y} representing the drag path. |
| `screenshot` | Takes a screenshot of the current page. | None |
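
Each action in the table is sent as the `action_name` of a `BrowserActionDto`, with its parameters in `data`. The sketch below shows one way to build and send such a body. The helper names are hypothetical, and `data` is passed as raw JSON so that no assumption is made about parameter types.

```shell
# Base URL of the API server; WYSE_BASE_URL is a hypothetical override.
BASE_URL="${WYSE_BASE_URL:-http://127.0.0.1:13100}"

# Build a BrowserActionDto body.
# $1: session_id, $2: action_name, $3: data as raw JSON (defaults to {}),
# $4: page_id (defaults to 0, the documented default).
action_payload() {
  data="${3:-}"
  [ -n "$data" ] || data='{}'
  printf '{"session_id":"%s","page_id":%s,"action_name":"%s","data":%s}' \
    "$1" "${4:-0}" "$2" "$data"
}

# Send a single action to POST /api/browser/action.
browser_action() {
  action_payload "$@" |
    curl -s -X POST "$BASE_URL/api/browser/action" \
      -H "Content-Type: application/json" -d @-
}
```

Some calls built from the parameters listed above: `browser_action "$SESSION_ID" search '{"search_key":"playwright"}'`, `browser_action "$SESSION_ID" text '{"text":"hello","press_enter":true}'`, `browser_action "$SESSION_ID" key_press '{"keys":["Control+A"]}'`, and `browser_action "$SESSION_ID" history '{"num":-1}'` to go back one entry.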
Security and user safety are paramount in Wyse Browser:
- User Consent and Control: Users must explicitly consent to and fully understand all data access and operations.
- Data Privacy: Applications must obtain explicit user consent before exposing any user data to external servers.
- Worklet Safety: Worklets involve arbitrary code execution and must be handled with extreme caution. Hosts must obtain explicit user consent before invoking any worklet.
Contributions are welcome! Please feel free to submit a pull request.
- Fork the repository.
- Create your feature branch (`git checkout -b feature/AmazingFeature`).
- Commit your changes (`git commit -m 'Add some AmazingFeature'`).
- Push to the branch (`git push origin feature/AmazingFeature`).
- Open a Pull Request.
This project is licensed under the MIT License. See the LICENSE file for details.