Turns alerts into actionable diagnoses for on-call SRE and operations teams. Evidence first, human approved.
- Focused on incident diagnosis, not another generic ops dashboard
- Puts diagnosis, evidence, risk, and next action in one flow
- Keeps approval and execution behind a human gate
TARS is an AIOps MVP for on-call SRE and operations teams. It is built around a narrow job: when an alert arrives, help the responder understand what is happening and what to do next, without bouncing between metrics, logs, SSH, runbooks, and chat threads.
The product focus is not "full platform coverage." It is a tighter loop: faster diagnosis, clearer evidence, safer execution, and better auditability.
The win is not "AI wrote a long summary." The win is a responder opening one session and seeing:
- the current diagnosis
- the evidence behind it
- the risk level
- the next recommended action
That is the whole game.
The session detail page is being refocused around operator decision speed: current diagnosis, recommended next step, left-side timeline, and expandable evidence rows from the tool plan.
This image is the confirmed target information architecture reference for /sessions/:id. It is a design reference only, not a substitute for runtime validation in the shared environment.
```mermaid
flowchart TD
    A[Alert / Telegram / Web request] --> B[Session intake]
    B --> C[Evidence-first diagnosis]
    C --> C1[Metrics]
    C --> C2[Logs]
    C --> C3[Traces / observability]
    C --> C4[Release evidence]
    C --> C5[SSH only when needed]
    C1 --> D[Diagnosis, risk, next action]
    C2 --> D
    C3 --> D
    C4 --> D
    C5 --> D
    D --> E[Approval-gated execution]
    E --> F[Audit trail and knowledge capture]
```
- Go `1.25`
- Node.js `20.19+` or `22.12+`
- npm
- Ruby
- Docker, optional
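
A minimal preflight sketch for the prerequisites above. The function names `missing_tools` and `node_version_ok` are hypothetical and not part of the repository. It assumes "20.19+ or 22.12+" means the 20.x line from 20.19 upward, or anything from 22.12 upward, so 21.x and 22.0 through 22.11 are rejected. Adjust it if the project intends a different reading.

```bash
#!/bin/sh
# Sketch only: helper names are illustrative, not provided by TARS.

# missing_tools TOOL...
# Prints each named tool that is not found on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# node_version_ok VERSION
# Succeeds for 20.19+ within the 20.x line, or for 22.12 and later.
# Rejecting 21.x and 22.0-22.11 is an assumption about the requirement.
node_version_ok() {
  version=${1#v}                       # drop the "v" printed by `node --version`
  case "$version" in
    *.*) ;;                            # need at least MAJOR.MINOR
    *) return 1 ;;
  esac
  major=${version%%.*}
  rest=${version#*.}
  minor=${rest%%.*}
  case "$major" in ''|*[!0-9]*) return 1 ;; esac   # reject non-numeric parts
  case "$minor" in ''|*[!0-9]*) return 1 ;; esac
  if [ "$major" -eq 20 ] && [ "$minor" -ge 19 ]; then return 0; fi
  if [ "$major" -eq 22 ] && [ "$minor" -ge 12 ]; then return 0; fi
  [ "$major" -gt 22 ]
}

# Example usage (Docker is optional, so it is not checked):
#   missing_tools go node npm ruby
#   node_version_ok "$(node --version)" || echo "unsupported Node.js version"
```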
- Clone the repository.

  ```bash
  git clone https://github.com/evilgaoshu/TARS.git
  cd TARS
  ```

- Install frontend dependencies.

  ```bash
  make web-install
  ```

- Run the publishable baseline checks.

  ```bash
  make secret-scan
  make pre-check
  make check-mvp
  ```

- Copy the `.example` config files into ignored local files and fill in real values outside Git.
- If you want a local stack after the baseline checks, use the Docker Compose path described in docs/README-notes.md.
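
The config-copy step above could be sketched as follows. The helper `copy_example_configs` is hypothetical, and the assumption that example files end in a literal `.example` suffix (for instance `.env.example`) should be checked against the actual file names in the repository. Existing local files are never overwritten. Before filling in real values, `git check-ignore <path>` can confirm that a copied file is ignored by Git.

```bash
#!/bin/sh
# Sketch only: assumes example configs are named "<target>.example".

# copy_example_configs DIR
# For each "*.example" file directly under DIR (including dotfiles), create a
# copy without the suffix unless that target already exists.
copy_example_configs() {
  dir=${1:-.}
  for src in "$dir"/*.example "$dir"/.*.example; do
    [ -f "$src" ] || continue          # skip glob patterns that matched nothing
    dest=${src%.example}
    if [ -e "$dest" ]; then
      echo "keep   $dest (already exists)"
    else
      cp "$src" "$dest"
      echo "create $dest"
    fi
  done
}

# Example usage from the repository root:
#   copy_example_configs .
```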
If you just cloned the repo, run `make web-install` before `make check-mvp`, because `check-mvp` expects `web/node_modules` to exist.
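
The ordering note above can be turned into a small guard. This is a sketch under the assumption that the presence of the `web/node_modules` directory is a good enough signal; the helper name `check_mvp_ready` is hypothetical.

```bash
#!/bin/sh
# Sketch only: helper name is illustrative, not provided by TARS.

# check_mvp_ready REPO_ROOT
# Succeeds when REPO_ROOT/web/node_modules exists, which `make check-mvp`
# expects according to the note above.
check_mvp_ready() {
  [ -d "${1:-.}/web/node_modules" ]
}

# Example usage from the repository root:
#   check_mvp_ready . || make web-install
#   make check-mvp
```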
TARS currently fits teams that:
- run an on-call SRE or operations rotation
- already have a real alert source such as `VMAlert`
- handle repeatable infrastructure incidents
- want AI-assisted diagnosis, but not unsupervised execution
Typical early scenarios include service unavailability, CPU or memory spikes, disk pressure, and unhealthy instances.
| Need | Where to start |
|---|---|
| Product and technical baseline | project/README.md |
| User, admin, deployment, troubleshooting guides | docs/guides/README.md |
| API, configuration, schema, compatibility | docs/reference/README.md |
| Operations, CI, rollout, deeper runbooks | docs/operations/README.md |
| Extra README notes and deeper entry links | docs/README-notes.md |
If the repository owner wants to update the GitHub About section manually, this wording matches the current project scope:
- About: `AI-assisted incident diagnosis and approval-gated execution for on-call SRE teams. Turns alerts into actionable diagnoses with evidence-first workflows and human approval.`
- Topics: `aiops`, `sre`, `incident-response`, `oncall`, `observability`, `devops`, `golang`, `react`, `postgresql`, `telegram`
See CONTRIBUTING.md for local setup, development workflow, and contribution expectations.
No standalone LICENSE file is present in the repository today. Before public release or open source distribution, add the intended license file and update this section.
