Commit e6d2a72

Merge pull request #90 from askui/feat/web-support
feat: web automation support using playwright + install chat api with pip
2 parents 8f66e0f + f7070b6 commit e6d2a72

124 files changed: +1310 -15330 lines changed

.vscode/launch.json

Lines changed: 1 addition & 1 deletion
@@ -28,7 +28,7 @@
 "preLaunchTask": "Create .env.tmp file",
 "postDebugTask": "Delete .env.tmp file",
 "module": "uvicorn",
-"args": ["src.chat.api.app:app","--reload","--port","8000"],
+"args": ["src.askui.chat.api.app:app","--reload","--port","9261"],
 "envFile": "${workspaceFolder}/.env.tmp",
 "env": {
 "ASKUI_WORKSPACES__LOG__FORMAT": "logfmt",

README.md

Lines changed: 13 additions & 15 deletions
@@ -775,34 +775,39 @@ If you would like to disable the recording of usage data, set the `ASKUI__VA__TE
 ### AskUI Chat

 AskUI Chat is a web application that allows interacting with an AskUI Vision Agent similar how it can be
-done with `VisionAgent.act()` but in a more interactive manner that involves less code. Aside from
-telling the AskUI Vision Agent what to do, the user can also demonstrate what to do (currently, only
+done with `VisionAgent.act()` or `AndroidVisionAgent.act()` but in a more interactive manner that involves less code. Aside from
+telling the agent what to do, the user can also demonstrate what to do (currently, only
 clicking is supported).

 **⚠️ Warning:** AskUI Chat is currently in an experimental stage and has several limitations (see below).

+#### Architecture
+
+This repository only includes the AskUI Chat API (`src/askui/chat`). The AskUI Chat UI can be accessed through the [AskUI Hub](https://hub.askui.com/) and connects to the local Chat API after it has been started.
+
 #### Configuration

 To use the chat, configure the following environment variables:

 - `ASKUI_TOKEN`: AskUI Vision Agent behind chat uses currently the AskUI API
 - `ASKUI_WORKSPACE_ID`: AskUI Vision Agent behind chat uses currently the AskUI API
-- `ASKUI__CHAT_API__DATA_DIR` (optional, defaults to `$(pwd)/chat`): Currently, the AskUI chat stores its data in a directory locally. You can change the default directory by setting this environment variable.
+- `ASKUI__CHAT_API__DATA_DIR` (optional, defaults to `$(pwd)/chat`): Currently, the AskUI chat stores all data in a directory locally. You can change the default directory by setting this environment variable.
+- `ASKUI__CHAT_API__HOST` (optional, defaults to `127.0.0.1`): The host to bind the chat API to.
+- `ASKUI__CHAT_API__PORT` (optional, defaults to `9261`): The port to bind the chat API to.
+- `ASKUI__CHAT_API__LOG_LEVEL` (optional, defaults to `info`): The log level to use for the chat API.

 #### Installation

 ```bash
-pdm install # is going to install the dependencies of the api
-pdm run chat:ui:install # is going to install the dependencies of the ui
+pip install askui[chat]
 ```

 You may need to give permissions on the fast run of the Chat UI to demonstrate actions (aka record clicks).

 #### Usage

 ```bash
-pdm run chat:api # is going to start the api at port 8000
-pdm run chat:ui # is going to start the ui at port 3000
+python -m askui.chat
 ```

 You can use the chat to record a workflow and redo it later. For that, just tell the agent to redo all previous steps.
@@ -815,7 +820,7 @@ You can use the chat to record a workflow and redo it later. For that, just tell
 #### Limitations

 - A lot of errors are not handled properly and we allow the user to do a lot of actions that can lead to errors instead of properly guiding the user.
-- The chat currently only allows rerunning actions through `VisionAgent.act()` which can be expensive, slow and is not necessary the most reliable way to do it.
+- The chat currently only allows rerunning actions through `VisionAgent.act()` (or `AndroidVisionAgent.act()` or `WebVisionAgent.act()`) which can be expensive, slow and is not necessary the most reliable way to do it.
 - A lot quirks in UI and API.
 - Currently, api and ui need to be run in dev mode.
 - When demonstrating actions, the corresponding screenshot may not reflect the correct state of the screen before the action. In this case, cancel demonstrating, delete messages and try again.
@@ -824,10 +829,3 @@ You can use the chat to record a workflow and redo it later. For that, just tell
 - The agent is going to fail if there are no messages in the conversation, there is no tool use result message following the tool use message somewhere in the conversation, a message is too long etc.
 Just adding or deleting the message in this case should fix the issue.
 - You should not switch the conversation while waiting for an agent's answers or demonstrating actions.
-
-
-
-#### Architecture
-
-- The chat api/backend is a [FastAPI](https://fastapi.tiangolo.com/) application that provides a REST API similar to [OpenAI's Assistants API](https://platform.openai.com/docs/assistants/overview).
-- The chat ui/frontend is a [Next.js](https://nextjs.org/) application that provides a web interface to the chat api.
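
Reviewer note: the README hunk above documents four optional `ASKUI__CHAT_API__*` environment variables and their defaults. The following is a minimal sketch of how such variables could be resolved, written only to illustrate the documented defaults. It is not taken from this commit: `ChatApiSettings` and `load_chat_api_settings` are hypothetical names, and the actual settings code in `src/askui/chat` may work differently.

```python
from __future__ import annotations

import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping


@dataclass(frozen=True)
class ChatApiSettings:
    """Hypothetical container for the settings named in the README diff."""

    data_dir: Path
    host: str
    port: int
    log_level: str


def load_chat_api_settings(env: Mapping[str, str] | None = None) -> ChatApiSettings:
    """Resolve settings from `env`, falling back to the defaults the README documents.

    Assumption: this mirrors only the documented variable names and defaults,
    not the project's real loading logic.
    """
    env = os.environ if env is None else env
    return ChatApiSettings(
        # README default: `$(pwd)/chat`
        data_dir=Path(env.get("ASKUI__CHAT_API__DATA_DIR", str(Path.cwd() / "chat"))),
        # README default: `127.0.0.1`
        host=env.get("ASKUI__CHAT_API__HOST", "127.0.0.1"),
        # README default: `9261` (the same port the launch.json hunk switches to)
        port=int(env.get("ASKUI__CHAT_API__PORT", "9261")),
        # README default: `info`
        log_level=env.get("ASKUI__CHAT_API__LOG_LEVEL", "info"),
    )
```

Passing an explicit mapping instead of reading `os.environ` directly keeps the sketch easy to check in isolation, for example `load_chat_api_settings({"ASKUI__CHAT_API__PORT": "8000"})` to override only the port.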
