oasis-main/ox-collab

ox-collab

Collaborative data review for AI datasets — voting, threaded discussion, and PCA embedding maps over LLM-labeled examples and ML profiling reports.

ox-collab is the social and visualization layer in the oasis-data product line. It wraps two upstream tools — argilla (LLM dataset labeling) and ydata-profiling (tabular ML profiling) — with a unified UI that lets a research team upvote, discuss, and explore datasets together.

Status: WIP. Not yet production-ready. Local Docker stack works; LAN deployment is the next milestone.


What's in here

ox-collab/
├── api/         FastAPI backend — votes, comments, mentions, embeddings, auth toggle
├── frontend/    React + Mantine + Plotly UI — record cards, threads, 1D/2D/3D/4D viewer
├── docs/        Architecture, deployment, adapter notes
├── docker-compose.yaml       full local stack (postgres + api + frontend)
├── docker-compose.lan.yaml   overlay to expose on the LAN
└── .swarm/      cross-cutting project coordination (dot_swarm protocol)

Each subdirectory can be developed independently: api/.swarm/ and frontend/.swarm/ track per-component work, while the top-level .swarm/ tracks cross-cutting items such as releases and integration with the forks.


The product line

ox-collab does not produce dataset records itself; it receives them from sibling repos in the oasis-data family:

Repo                            Role                                  Upstream
oasis-main/ox-collab            This repo. Collaboration UI + API.    new
oasis-main/ox-llm-data-collab   LLM dataset labeling                  argilla-io/argilla
oasis-main/ox-ml-data-collab    Tabular ML profiling                  ydata-profiling family

The two fork repos each contain an oasis-extensions/ox_collab_adapter.py that pushes their records into ox-collab-api for collaboration. See docs/ADAPTERS.md for details.


Features

  • Records browser — filter by source (LLM data / ML profiling / all), score-sorted, paginated.
  • Voting — upvote / neutral / downvote. One vote per record per user; aggregate score on every card.
  • Threaded comments — markdown body, infinite reply depth, @mentions parsed server-side and stored as notifications.
  • Embedding map — PCA projection of every record into:
    • 1D strip plot (density)
    • 2D scatter (color by source)
    • 3D rotatable scatter
    • 4D = 3D + scrubbable time slider with play/pause, for watching the manifold evolve as records are added or relabeled.
  • Optional auth — host sets OX_AUTH_REQUIRED=true|false. False = anonymous-ok mode (X-Anon-Name header). True = JWT required for writes; reads stay open.
  • LAN-friendly: docker-compose.lan.yaml overlay binds to 0.0.0.0 and advertises via mDNS, so colleagues on the same network can reach http://<host>.local:8002 without configuration.
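
The PCA projection behind the embedding map can be illustrated with a small, dependency-free sketch. The README says only that PCA is used; it does not say how the API computes it, so the power-iteration-with-deflation method below is a stand-in chosen to avoid third-party imports, not the project's implementation. The function name `pca_project`, its parameters, and the sample vectors are all hypothetical.

```python
import math
import random


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pca_project(rows, k, iters=100, seed=0):
    """Project each row onto its top-k principal components.

    Illustrative only: uses power iteration with deflation instead of a
    numerical library. `rows` is a list of equal-length float vectors.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    x = [[r[j] - means[j] for j in range(d)] for r in rows]  # center the data
    rng = random.Random(seed)
    coords = [[] for _ in rows]

    for _component in range(k):
        # Start from a random direction and repeatedly apply X^T X.
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for _step in range(iters):
            scores = [_dot(r, v) for r in x]
            w = [sum(s * r[j] for s, r in zip(scores, x)) for j in range(d)]
            norm = math.sqrt(_dot(w, w))
            if norm == 0.0:  # no variance left in the data
                break
            v = [c / norm for c in w]

        # Coordinates along this component, one per record.
        scores = [_dot(r, v) for r in x]
        for c, s in zip(coords, scores):
            c.append(s)

        # Deflate: remove the variance explained by this component.
        x = [[r[j] - s * v[j] for j in range(d)] for r, s in zip(x, scores)]

    return coords


# Hypothetical 3-dimensional embeddings for five records.
embeddings = [[float(t), 2.0 * t, 0.0] for t in range(5)]
points_2d = pca_project(embeddings, k=2)  # k=1, 2, or 3 for the 1D/2D/3D views
```

Each returned row holds one record's coordinates, so `k=2` would give the x/y pairs for a scatter plot. For embeddings of realistic size a linear-algebra library would be a better fit than this pure-Python loop.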

Quick start

git clone https://github.com/oasis-main/ox-collab.git
cd ox-collab

# Bring up the full stack (postgres + api + frontend)
docker compose --profile full up

# Open http://localhost:8002 — API at http://localhost:8001

To expose on the LAN:

docker compose -f docker-compose.yaml -f docker-compose.lan.yaml --profile full --profile lan up

For local dev without Docker, see api/README.md and frontend/README.md.


Architecture

┌─────────────────────────────────────────────────────────────────┐
│  React + Mantine + Plotly  (port 8002, served by nginx)         │
│  routes: /, /records/:id, /embedding                            │
└──────────────────────────────┬──────────────────────────────────┘
                               │  /api/* reverse-proxy
┌──────────────────────────────▼──────────────────────────────────┐
│  FastAPI  (port 8001)                                           │
│  /records  /votes  /comments  /embeddings  /auth                │
│    + auth-toggle middleware                                     │
│    + lazy sentence-transformers worker for embeddings           │
└──────────────────────────────┬──────────────────────────────────┘
                               │
                ┌──────────────┴──────────────┐
                ▼                             ▼
       Postgres 16 (records,        sentence-transformers
       votes, comments,             all-MiniLM-L6-v2
       embeddings, mentions)        (loaded on first /embeddings/refresh)

                ▲                             ▲
                │                             │
        ┌───────┴────────┐         ┌──────────┴────────┐
        │ ox-llm-data-   │         │ ox-ml-data-       │
        │ collab adapter │         │ collab adapter    │
        │ (argilla API)  │         │ (profiling JSON)  │
        └────────────────┘         └───────────────────┘

Full schema and design tradeoffs in docs/ARCHITECTURE.md.
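
Three rules that sit behind the API box (one vote per record per user, server-side @mention parsing, and the auth toggle) can be sketched with the standard library alone. This is not the project's code. `VoteStore`, `parse_mentions`, `resolve_identity`, the mention character set, the sum-based score, the `Authorization: Bearer` header format, the `"anonymous"` fallback name, and the `verify_token` callback are all assumptions made for illustration; only the vote values, the X-Anon-Name header, and the "JWT for writes, open reads" rule come from the README.

```python
import re
from dataclasses import dataclass, field

# Assumed mention syntax: "@" plus letters, digits, "_" or "-", not preceded
# by a word character (so the "@" in an email address is skipped).
MENTION_RE = re.compile(r"(?<![\w@])@([A-Za-z0-9_-]+)")


def parse_mentions(body):
    """Return mentioned usernames in first-seen order, without duplicates."""
    return list(dict.fromkeys(MENTION_RE.findall(body)))


@dataclass
class VoteStore:
    """In-memory stand-in for a votes table keyed by (record, user)."""

    votes: dict = field(default_factory=dict)

    def cast(self, record_id, user, value):
        if value not in (-1, 0, 1):  # downvote, neutral, upvote
            raise ValueError("vote must be -1, 0, or 1")
        self.votes[(record_id, user)] = value  # a repeat vote overwrites

    def score(self, record_id):
        # Assumption: the aggregate score is the plain sum of votes.
        return sum(v for (rid, _), v in self.votes.items() if rid == record_id)


SAFE_METHODS = {"GET", "HEAD", "OPTIONS"}


def resolve_identity(method, headers, auth_required, verify_token):
    """Return a user name, or None for an open read.

    `verify_token` is a hypothetical callable that maps a JWT string to a
    user name, or returns None when the token is invalid.
    Raises PermissionError for a write without a valid token when
    auth is required.
    """
    h = {k.lower(): v for k, v in headers.items()}
    if not auth_required:
        # Anonymous-ok mode: trust the self-reported display name.
        return h.get("x-anon-name") or "anonymous"

    auth = h.get("authorization", "")
    if auth.lower().startswith("bearer "):
        user = verify_token(auth[7:].strip())
        if user:
            return user
    if method.upper() not in SAFE_METHODS:
        raise PermissionError("JWT required for writes")
    return None  # reads stay open
```

In a FastAPI app the last function would typically be wrapped in a dependency or middleware, with `auth_required` read once from OX_AUTH_REQUIRED at startup.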


Roadmap (cross-cutting; per-component queues live in each .swarm/)

  • v0 scaffold — api + frontend + Docker + adapters
  • OXC-001: end-to-end smoke test on a clean machine
  • OXC-002: deploy on LAN with mDNS, validate multi-user collaboration
  • OXC-003: oasis-auth SSO integration
  • OXC-004: WebSocket live updates (votes/comments)
  • OXC-005: ModalSheaf integration for cross-source consistency scoring
  • OXC-006: oasis-cloud production deployment manifests

See .swarm/queue.md for the full coordination queue.


License

ox-collab (this repo's original code) is licensed under Apache-2.0.

The sibling fork repos (ox-llm-data-collab, ox-ml-data-collab) inherit the upstream licenses (Apache-2.0 and MIT respectively); see each fork's LICENSE and NOTICE files for attribution.


Part of oasis-data — collaborative tooling for AI dataset development.
