187 changes: 36 additions & 151 deletions README.md
@@ -1,19 +1,24 @@
# roteiro-agent

MCP (Model Context Protocol) server for Roteiro, a spatial data platform. Enables AI agents (Claude Desktop, VS Code, Cursor) to work with geospatial datasets, run geoprocessing operations, execute SQL, and more.
MCP server for Cairn's current public API.

## Installation
The agent now exposes a smaller, explicit tool surface built around the stable workflows:

```bash
go install github.com/i-norden/roteiro-agent@latest
```
- datasets and collection queries
- uploads and remote dataset intake
- celestial body metadata and recipe execution
- unified vector operations and async jobs
- ad hoc and saved pipelines
- SQL query control plane
- projects and workspace state
- published map management

Legacy tools for `/api/process`, raster processing, catalog browsing, STAC import, routing, geocoding, and the old `map_api` catch-all have been removed.

Or build from source:
## Install

```bash
git clone https://github.com/i-norden/roteiro-agent
cd roteiro-agent
go build -o roteiro-agent .
go install github.com/i-norden/roteiro-agent@latest
```

## Usage
@@ -22,149 +27,29 @@ go build -o roteiro-agent .
roteiro-agent --server-url http://localhost:8080 --api-key roteiro_abc123 --project-id 42
```

The server communicates via JSON-RPC 2.0 over stdio (stdin/stdout), following the MCP specification.
The server speaks JSON-RPC 2.0 over stdio and follows the MCP protocol.

### Environment variables
## Environment Variables

| Variable | Flag | Description |
|----------|------|-------------|
| `ROTEIRO_SERVER_URL` | `--server-url` | Roteiro server base URL |
| `ROTEIRO_API_KEY` | `--api-key` | Roteiro API key |
| `ROTEIRO_SESSION_COOKIE` | `--session-cookie` | Session cookie (alternative to API key) |
| `ROTEIRO_PROJECT_ID` | `--project-id` | Optional default project scope sent as `X-Project-ID` |

When `--project-id` or `ROTEIRO_PROJECT_ID` is set, the agent scopes compatible requests to that project by default. Individual tool calls can also override the scope with a `project_id` argument.
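Flags and environment variables can be mixed. A minimal launch sketch using the placeholder values from the table above (substitute your own instance URL and key):

```shell
# Placeholder values; replace with your Cairn instance URL and API key.
export ROTEIRO_SERVER_URL="http://localhost:8080"
export ROTEIRO_API_KEY="roteiro_abc123"
# Optional: scope compatible requests to project 42 via X-Project-ID.
export ROTEIRO_PROJECT_ID="42"

# roteiro-agent   # flags omitted; the env vars above supply the configuration
```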

## MCP Client Configuration

### Claude Desktop

Add to `~/Library/Application Support/Claude/claude_desktop_config.json`:

```json
{
"mcpServers": {
"roteiro": {
"command": "roteiro-agent",
"args": ["--server-url", "https://your-roteiro-instance.com", "--api-key", "roteiro_abc123"]
}
}
}
```

### VS Code (Copilot)

Add to `.vscode/mcp.json`:

```json
{
"servers": {
"roteiro": {
"command": "roteiro-agent",
"args": ["--server-url", "http://localhost:8080", "--api-key", "roteiro_abc123"]
}
}
}
```

### Claude Code

Add to `.mcp.json`:

```json
{
"mcpServers": {
"roteiro": {
"command": "roteiro-agent",
"args": ["--server-url", "http://localhost:8080", "--api-key", "roteiro_abc123"]
}
}
}
```

## Available Tools

| Tool | Description |
|------|-------------|
| `list_datasets` | List all registered datasets |
| `get_dataset_info` | Dataset schema, CRS, bounds, feature count |
| `get_dataset_schema` | Field names and types |
| `get_dataset_profile` | Statistical profile of a dataset |
| `query_features` | Query with bbox, bbox CRS, response CRS, CQL2 filter, limit, properties |
| `get_feature` | Single feature by ID |
| `upload_dataset` | Upload a spatial data file, optionally naming it and attaching it to a project |
| `run_process` | Single synchronous geoprocessing operation |
| `run_raster_process` | Generic synchronous raster processing via file-path inputs |
| `preflight_process` | Validate and normalize a processing request |
| `submit_process_job` | Submit an async processing job |
| `submit_process_batch` | Submit dependent async processing jobs |
| `list_process_jobs` | List async processing jobs |
| `get_process_job` | Inspect an async processing job |
| `cancel_process_job` | Cancel an async processing job |
| `rerun_process_job` | Re-submit an async processing job |
| `run_pipeline` | Multi-step geoprocessing pipeline |
| `convert_format` | Convert between formats (GeoJSON, Shapefile, etc.) |
| `diff_datasets` | Compare two dataset versions |
| `execute_sql` | Run PostGIS SQL query |
| `list_spatial_tables` | List spatial tables in the database |
| `get_duckdb_info` | DuckDB SQL engine status/capabilities |
| `list_duckdb_datasets` | Datasets available to DuckDB SQL |
| `geocode` | Address to coordinates |
| `reverse_geocode` | Coordinates to address |
| `compute_route` | Driving/walking route computation |
| `compute_isochrone` | Travel-time isochrone polygons |
| `compute_route_matrix` | Origin-destination time/distance matrix |
| `compute_service_area` | Distance-based service area polygons |
| `list_operations` | Available geoprocessing operations |
| `list_analysis_operations` | Available advanced analysis operations |
| `browse_catalog` | Browse the built-in data catalog |
| `browse_catalog_enhanced` | Browse enhanced catalog with filters |
| `get_catalog_entry` | Get enhanced catalog entry by ID |
| `list_catalog_categories` | List catalog categories |
| `list_catalog_tags` | List catalog tags |
| `import_from_catalog` | Import a dataset from the data catalog, optionally into a project |
| `browse_stac_catalog` | Browse a remote STAC catalog |
| `browse_stac_collections` | List collections in a remote STAC catalog |
| `browse_stac_items` | List items in a remote STAC collection |
| `import_stac_asset` | Import a STAC asset as a local dataset, optionally with namespace/catalog metadata and project attachment |
| `search_stac` | Search local STAC with bbox/datetime/CQL2 filters |
| `map_api` | Allowlisted map endpoint access (publish/unpublish/stats/embed config, raster metadata/JSON analysis/export ops including contour/viewshed/profile/KDE/slope/aspect, geodesic area/length, raster classification via k-means/ISODATA/max-likelihood/random-forest, OGC feature edit ops). Mutations require `confirm=true`. |

Use `list_operations` for the live vector-processing catalog. Raster operations do not currently have a live catalog endpoint, so `run_raster_process` documents the current backend families directly: terrain, hydrology, distance/cost, spectral/change, classification, and raster-vector conversion. For dataset-name-based raster JSON routes, `map_api` now also exposes contour, viewshed, profile, KDE, slope, and aspect. Geodesic area/length and raster classification (k-means, ISODATA, maximum-likelihood, random-forest) are also available via `map_api`.

## Example Workflows

**"Show me all parks larger than 10 acres near downtown"**
1. Agent calls `list_datasets` to find the parks dataset
2. Agent calls `query_features` with a CQL2 filter: `area_acres > 10` and bbox around downtown
3. Returns matching parks as GeoJSON features

**"Buffer all schools by 1km and find which residential zones intersect"**
1. Agent calls `run_pipeline` with two steps:
- Buffer "schools" by 1000m
- Spatial join the buffer result with "residential_zones"
2. Returns the intersection result

**"What's the average building height per neighborhood?"**
1. Agent calls `execute_sql` with PostGIS SQL:
```sql
SELECT n.name, AVG(b.height) as avg_height
FROM neighborhoods n
JOIN buildings b ON ST_Intersects(n.geom, b.geom)
GROUP BY n.name ORDER BY avg_height DESC
```

**"Import building footprints from a STAC catalog and calculate total area"**
1. Agent calls `browse_stac_collections` to discover available collections
2. Agent calls `browse_stac_items` to preview the buildings collection
3. Agent calls `import_stac_asset` to download and register the data
4. Agent calls `execute_sql` to calculate total building area with PostGIS

**"Find open data about transportation in our catalog"**
1. Agent calls `browse_catalog` with search="transportation"
2. Agent calls `import_from_catalog` to import the desired dataset
3. Agent calls `get_dataset_info` to inspect the imported data

## License

MIT
| `ROTEIRO_SERVER_URL` | `--server-url` | Cairn server base URL |
| `ROTEIRO_API_KEY` | `--api-key` | API key |
| `ROTEIRO_SESSION_COOKIE` | `--session-cookie` | Session cookie alternative |
| `ROTEIRO_PROJECT_ID` | `--project-id` | Default project scope sent as `X-Project-ID` |

## Tool Groups

- data: `list_datasets`, `get_dataset_info`, `query_features`, `get_feature`, `create_feature`, `update_feature`, `delete_feature`, `upload_dataset`, `import_source`
- celestial: `get_scene_manifest`, `list_bodies`, `get_body`, `get_body_recipes`, `execute_body_recipe`
- operations: `list_operations`, `preflight_operation`, `run_operation`, `submit_operation_job`, `submit_operation_batch`, `list_operation_jobs`, `get_operation_job`, `cancel_operation_job`, `rerun_operation_job`
- pipelines: `list_pipeline_operations`, `run_pipeline`, `list_pipelines`, `get_pipeline`, `create_pipeline`, `update_pipeline`, `delete_pipeline`, `duplicate_pipeline`, `execute_saved_pipeline`, `list_pipeline_runs`, `get_pipeline_run`
- SQL: `list_query_engines`, `get_query_engine_info`, `list_query_datasets`, `execute_sql`, `save_sql_result`
- projects: `list_projects`, `get_project`, `create_project`, `update_project`, `delete_project`, `get_project_workspace`, `set_project_workspace`
- publishing: `publish_map`, `list_published_maps`, `delete_published_map`, `get_published_map_stats`, `update_map_embed_config`

## Notes

- Most tools accept project scoping through the agent's global `--project-id` or a per-call `project_id` override.
- `upload_dataset` and `import_source` both support `body_id` so tenant-defined celestial bodies flow through the intake path.
- SQL tools operate against Cairn's engine-aware control plane, so `engine` is required where applicable.
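To illustrate the last two notes, a hedged sketch of the arguments an MCP client might pass to `execute_sql`. The `engine` field and the per-call `project_id` override are described above; the `sql` field name and the `postgis` engine value are assumptions for illustration:

```json
{
  "name": "execute_sql",
  "arguments": {
    "engine": "postgis",
    "project_id": 42,
    "sql": "SELECT COUNT(*) FROM parks"
  }
}
```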
139 changes: 33 additions & 106 deletions SKILL.md
@@ -1,123 +1,50 @@
# Roteiro Spatial Platform — Agent Guide
# Cairn Agent Guide

## What is Roteiro?
## Shape

Roteiro is a full-featured spatial data platform. It stores, processes, and serves geospatial datasets. Think of it as a self-hosted GIS server with a REST API.
`roteiro-agent` is a narrow MCP wrapper around Cairn's current stable workflows. Prefer the explicit tools over inventing raw REST calls.

## Authentication
## Core Workflows

All requests require either an API key (`X-API-Key` header) or a session cookie. The roteiro-agent MCP server handles this automatically — you just need to provide credentials when starting it.
### Data discovery and query

## Key Concepts
- Start with `list_datasets`.
- Use `get_dataset_info` before writing filters or SQL.
- Use `query_features` for bounded inspection and `get_feature` for a single record.
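A sketch of a bounded `query_features` call. The argument names follow the historical parameter set (`bbox`, `filter`, `limit`, `properties`) and the `dataset` key is an assumption; check the live tool schema before relying on them:

```json
{
  "dataset": "parks",
  "bbox": "-74.05,40.68,-73.90,40.88",
  "filter": "area_acres > 10",
  "limit": 10,
  "properties": "name,area_acres"
}
```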

- **Dataset**: A named collection of spatial features (points, lines, polygons). Can be GeoJSON, Shapefile, GeoPackage, etc.
- **Collection**: OGC API term for a dataset. Used interchangeably.
- **Feature**: A single geographic entity with geometry and properties (attributes).
- **CQL2**: Common Query Language v2 — a standard for filtering features by attributes and spatial relationships.
- **Pipeline**: A chain of geoprocessing operations where each step's output feeds the next.
### Intake

## Working with Data
- Use `upload_dataset` for local files.
- Use `import_source` for remote URLs or catalog-backed sources.
- Set `body_id` when the dataset belongs to Earth, Moon, Mars, or a tenant-defined body.
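For example, a sketch of an `import_source` call for a non-Earth dataset. Only `body_id` is documented above; the `url` and `name` fields and the `mars` slug are illustrative assumptions:

```json
{
  "url": "https://example.com/craters.geojson",
  "name": "mars_craters",
  "body_id": "mars"
}
```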

### Discovering datasets
### Celestial bodies

Start with `list_datasets` to see what's available. Use `get_dataset_info` to drill into a specific dataset's schema, CRS, extent, and feature count. Use `get_dataset_schema` for just the field types, or `get_dataset_profile` for statistical summaries.
- Use `get_scene_manifest` to inspect the current body-aware scene configuration.
- Use `list_bodies`, `get_body`, and `get_body_recipes` to discover body metadata.
- Use `execute_body_recipe` to trigger a configured recipe source for a body.
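A sketch of triggering a recipe via `execute_body_recipe`. Both field names and both values are assumptions; discover real recipe identifiers with `get_body_recipes` first:

```json
{
  "body_id": "moon",
  "recipe_id": "elevation_basemap"
}
```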

### Querying features
### Operations and pipelines

Use `query_features` with:
- `bbox`: spatial bounding box filter (`west,south,east,north`)
- `bbox_crs`: optional CRS for the bbox coordinates
- `crs`: optional CRS for returned geometries
- `filter`: CQL2 expression (e.g. `population > 10000 AND status = 'active'`)
- `limit`: max features (default 10, use higher values carefully)
- `properties`: comma-separated list of properties to include (reduces response size)
- `sortby`: property to sort by (prefix with `-` for descending)
- Call `list_operations` first.
- Use `preflight_operation` before expensive operations.
- Use `run_operation` for synchronous work and the `*_operation_job` tools for async work.
- Use `run_pipeline` for ad hoc multi-step chains.
- Use saved pipeline tools only when the user is clearly asking about persisted workflows.
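An ad hoc `run_pipeline` request chains steps so each step consumes the previous output. A sketch reusing the buffer-and-intersect shape (hypothetical dataset names):

```json
{
  "steps": [
    {"operation": "buffer", "input": "schools", "params": {"distance": 1000}},
    {"operation": "intersect", "params": {"mask": "residential_zones"}}
  ]
}
```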

Most data-management tools also accept an optional `project_id` argument. Use it when the same user has access to multiple projects or when the agent is started without a global `--project-id`.
### SQL

### SQL queries
- Use `list_query_engines` and `get_query_engine_info` to discover available engines.
- Use `execute_sql` for analysis and `save_sql_result` when the result should become a dataset.
- Always specify `engine`.
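A sketch of promoting a query result to a dataset with `save_sql_result`. Every argument name except `engine` is an assumption, and `duckdb` is a placeholder engine value:

```json
{
  "engine": "duckdb",
  "sql": "SELECT name, ST_Area(geom) AS area FROM parcels",
  "dataset_name": "parcel_areas"
}
```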

Use `execute_sql` for complex spatial queries. Roteiro exposes PostGIS, so all spatial functions are available:
- `ST_Area`, `ST_Length`, `ST_Distance` — measurements
- `ST_Buffer`, `ST_Intersection`, `ST_Union` — geometry operations
- `ST_Intersects`, `ST_Contains`, `ST_Within` — spatial predicates
- `ST_Transform` — coordinate system transformation
### Projects and publishing

Queries must be SELECT-only (read-only).
- Use project tools for workspace state and basic lifecycle.
- Use publish tools for public map links and embed configuration.

## Geoprocessing Operations
## Guardrails

Use `preflight_process` to validate and normalize a request first. Use `run_process` for synchronous execution, `submit_process_job` or `submit_process_batch` for async execution, and `run_pipeline` for chains.

Always call `list_operations` first to fetch the live server operation catalog and parameter names. The server now returns rich metadata including category, UI availability, projected-CRS requirements, and typed parameter definitions.

Important parameter names for common ops:
- `geodesic_buffer` uses metric `distance` in meters
- `clip` uses `mask`
- `sjoin` uses `right` and `predicate`
- `reproject` uses `from_crs` and `to_crs`
- `dissolve` uses `group_by`
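For example, the `geodesic_buffer` parameters above translate to a `run_process` request along these lines (the `operation`/`input`/`params` shape mirrors pipeline steps; the dataset name is hypothetical):

```json
{
  "operation": "geodesic_buffer",
  "input": "schools",
  "params": {"distance": 1000}
}
```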

Async process jobs expose queue state, phase, progress, and failure metadata via the `/api/process/jobs*` endpoints.

Use `list_analysis_operations` for advanced analysis catalog endpoints under `/api/analysis/operations`.

The async process workflow is available through `submit_process_job`, `submit_process_batch`, `list_process_jobs`, `get_process_job`, `cancel_process_job`, and `rerun_process_job`.

For raster analysis, use `run_raster_process` with file paths when you need the generic `/api/raster/process` endpoint. Typical operation families include terrain, hydrology, distance/cost, spectral/change, classification, and raster-vector conversion.

For registered raster datasets and JSON-returning raster endpoints, use `map_api` with operations such as `get_raster_info`, `get_raster_stats`, `get_raster_histogram`, `get_raster_dimensions`, `get_raster_values`, `raster_zonal_stats`, `export_raster_band`, `raster_contour`, `raster_viewshed`, `raster_profile`, and `raster_kde`.

## Data Catalog & STAC

Roteiro includes a built-in data catalog and supports importing from remote STAC (SpatioTemporal Asset Catalog) servers.

### Built-in catalog

Use `browse_catalog` to discover datasets available for import. Filter by `search` (text) or `category`. Use `import_from_catalog` with a `catalog_id` to download and register a dataset. Pass `project_id` when the imported dataset should be attached to a specific workspace project.

### Remote STAC catalogs

For external data sources:
1. `browse_stac_catalog` — inspect a remote STAC catalog by URL
2. `browse_stac_collections` — list available collections
3. `browse_stac_items` — preview items with optional `bbox` and `datetime` filters
4. `import_stac_asset` — download an asset URL and register it as a local dataset; optionally include `namespace`, `collection`, `catalog_url`, and `project_id`

### Local STAC search

Use `search_stac` to search Roteiro's own STAC endpoint with spatial (`bbox`), temporal (`datetime`), collection, and CQL2 (`filter`) criteria.

## Tips for Effective Use

1. **Start with discovery**: Always `list_datasets` first to understand what's available.
2. **Use small limits**: Default to `limit=10` when exploring. Increase only when needed.
3. **Prefer SQL for analytics**: For aggregations, joins, and complex spatial queries, `execute_sql` is more efficient than fetching features and computing client-side.
4. **Chain operations with pipelines**: Instead of running operations one by one, use `run_pipeline` to chain them in a single request.
5. **Check schemas before querying**: Use `get_dataset_schema` to see available fields before writing CQL2 filters or SQL.

## Common Patterns

### Find features near a point
```sql
SELECT * FROM parks
WHERE ST_DWithin(geom, ST_SetSRID(ST_MakePoint(-73.97, 40.77), 4326), 0.01)
LIMIT 20
```

### Aggregate by region
```sql
SELECT r.name, COUNT(p.*) as count, SUM(ST_Area(p.geom::geography)) as total_area_m2
FROM regions r JOIN parcels p ON ST_Intersects(r.geom, p.geom)
GROUP BY r.name ORDER BY count DESC
```

### Buffer and intersect
```json
{
"steps": [
{"operation": "buffer", "input": "schools", "params": {"distance": 1000}},
{"operation": "intersect", "params": {"mask": "residential_zones"}}
]
}
```
- Keep feature and query requests bounded with `limit` unless the user explicitly needs a large result.
- Prefer the explicit MCP tools here over legacy routes like `/api/process`, `/api/query/sql`, `/api/catalog`, `/api/stac`, or the old `map_api` wrapper.
- Do not assume Earth-only data. Carry `body_id` or body slug context through the workflow when the task is celestial.