Add cecil-data-analysis-xql skill by jayendra13 · Pull Request #4 · cecilearth/examples

jayendra13 · 2026-05-11T14:02:03Z

Summary

Adds a new skill, skills/cecil-data-analysis-xql/, that answers earth-observation analysis questions against Cecil datasets using xarray-sql (xql) as the SQL layer.

The skill owns the full loop:

Pick the right dataset from the Cecil catalog (prefers existing subscriptions; gates new ones on explicit confirmation).
Load it via client.load_xarray(subscription_id).
Register the dataset and every variable's reference_table on an xarray_sql.XarrayContext.
Run the query and present SQL → result table → plain-English interpretation as one compact block.
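
The gating rule in the first step (reuse an existing subscription, create a new one only after explicit confirmation) can be sketched in plain Python. This is an illustration under stated assumptions, not the skill's code: `choose_subscription`, the record shape, and the `confirm` callback are hypothetical names, and no Cecil SDK call is made.

```python
from typing import Callable, Optional


def choose_subscription(
    dataset_id: str,
    existing: list[dict],
    confirm: Callable[[str], bool],
) -> Optional[str]:
    """Return a subscription id to load, or None if the user declines.

    `existing` is a hypothetical list of {"id": ..., "dataset_id": ...}
    records; the real SDK objects may look different.
    """
    # Prefer a subscription that already covers the requested dataset.
    for sub in existing:
        if sub["dataset_id"] == dataset_id:
            return sub["id"]

    # Nothing to reuse: a new subscription may cost money, so ask first.
    if not confirm(f"Create a new subscription for dataset {dataset_id}?"):
        return None

    # Placeholder: a real implementation would create the subscription here.
    return f"new-subscription-for-{dataset_id}"


subs = [{"id": "sub-1", "dataset_id": "land-cover"}]
print(choose_subscription("land-cover", subs, confirm=lambda _msg: False))
```

Passing the confirmation step in as a callback keeps the gate testable without prompting a real user.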

Why xql: SQL beats pandas chains for joining categorical reference tables (no integer codes leaking into answers); window functions such as ROW_NUMBER() OVER and LAG() OVER reduce dominant-class and pixel-level change-detection queries to a few lines; and DataFusion streams over the dask-backed xarray Dataset the SDK already returns.
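
To make the dominant-class idiom concrete, here is a small sketch of the ROW_NUMBER pattern joined to a reference table. It uses Python's built-in sqlite3 as a stand-in for the DataFusion engine behind xarray-sql, and the table names, column names, and pixel values are invented toy data, not the skill's schema or results. It needs a SQLite build with window-function support (3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE land_cover (pixel_id INTEGER, year INTEGER, class_code INTEGER);
    CREATE TABLE class_reference (code INTEGER PRIMARY KEY, label TEXT);
    INSERT INTO class_reference VALUES (1, 'Trees'), (2, 'Crops');
    """
)

# Toy data: five pixels observed in two years (hypothetical values).
rows = [(i, 2020, c) for i, c in enumerate([1, 1, 1, 2, 2])]
rows += [(i, 2023, c) for i, c in enumerate([1, 2, 2, 2, 2])]
conn.executemany("INSERT INTO land_cover VALUES (?, ?, ?)", rows)

DOMINANT_CLASS_SQL = """
WITH counts AS (
    -- Join to the reference table so answers carry labels, not integer codes.
    SELECT lc.year, ref.label AS dominant_class, COUNT(*) AS px
    FROM land_cover AS lc
    JOIN class_reference AS ref ON ref.code = lc.class_code
    GROUP BY lc.year, ref.label
),
ranked AS (
    -- Rank classes within each year by pixel count, largest first.
    SELECT year, dominant_class, px,
           ROW_NUMBER() OVER (PARTITION BY year ORDER BY px DESC) AS rn
    FROM counts
)
SELECT year, dominant_class, px FROM ranked WHERE rn = 1 ORDER BY year
"""

result = conn.execute(DOMINANT_CLASS_SQL).fetchall()
print(result)  # [(2020, 'Trees', 3), (2023, 'Crops', 4)]
```

A change-detection query would follow the same shape, with LAG(class_code) OVER (PARTITION BY pixel_id ORDER BY year) in place of the ranking step.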

What's in the skill

SKILL.md — instructions, golden-rule output format, worked example
references/ — Cecil dataset catalog pointers, condensed SDK reference, xql patterns + gotchas, and the canonical demo script
scripts/ — list_subscriptions.py, inspect_dataset.py, run_analysis.py (load → register → run SQL → save result.csv + result.md + result.png)

Cross-references

Per CONTRIBUTING:

Prerequisite: subscribe-and-load — sets up the subscription load_xarray expects.
Hand-written counterpart: land-cover-baseline-and-change — same outputs without a SQL layer.

Skill checklist

Directory name (cecil-data-analysis-xql) is kebab-case and matches the name: in frontmatter.
Frontmatter contains name, description, license: MIT.
Body covers prerequisites, steps, constraints, references.
Cross-references to sibling skills use relative paths.

Test plan

Smoke test: list_subscriptions.py — returns 2 subscriptions on Land Cover 9-Class, output matches the documented format.
Smoke test: inspect_dataset.py against the Land Cover 9-Class dataset — returns 9-class reference table cleanly.
End-to-end: dominant-land-cover-class worked example through run_analysis.py — runs the windowed SQL, joins the reference table, writes result.csv / result.md / result.png, prints the Markdown summary.

End-to-end output:

|   year | dominant_class   |      px |
|-------:|:-----------------|--------:|
|   2020 | Trees            | 9167795 |
|   2023 | Crops            | 8876719 |

The end-to-end test caught one missing transitive dependency (tabulate, needed by pandas.DataFrame.to_markdown()); fix included in the second commit on this branch.
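
A missing optional dependency like this can be caught before any data is loaded. The following is a hedged sketch of a preflight check using only the standard library; `missing_modules` and the `REQUIRED` list are hypothetical and not part of the skill's scripts.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# tabulate is an optional pandas dependency used by DataFrame.to_markdown().
REQUIRED = ["pandas", "tabulate"]

missing = missing_modules(REQUIRED)
if missing:
    print("Install before running:", ", ".join(missing))
else:
    print("All report dependencies are importable.")
```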

A Claude skill for answering earth-observation analysis questions against Cecil datasets, with the SQL layer provided by xarray-sql (xql). Picks a dataset, loads it via the Cecil SDK, registers the dataset and every variable's reference_table on an XarrayContext, runs the query, and presents query + result table + interpretation as a single block.

Contents:

- SKILL.md: instructions, golden-rule output format, worked example
- references/datasets.md: per-category selection guidance + gotchas (the live catalog comes from client.list_datasets())
- references/sdk.md: SDK gotchas that aren't in docstrings (the SDK has none; signatures come from inspect.signature)
- references/xarray_sql.md: XarrayContext patterns, quoting rules, the cftime UDF caveat
- scripts/list_subscriptions.py, inspect_dataset.py, run_analysis.py (load → register → run SQL → save result.csv + result.md + result.png)

Structural restructure of #4 (jayendra13/add-cecil-data-analysis-xql-skill).

The original PR is a single skill containing 4 runnable Python scripts and 3 reference documents (~1,200 LOC). That shape is closer to a tutorial than a skill: skills are short, focused text loaded into an agent's context at inference time, not multi-file CLI projects.

Splitting into two artifacts:

- skills/cecil-data-analysis-xql/SKILL.md (~135 lines)
  Just what an agent needs in-context: the subscription gate (wallet safety), the golden-rule output format, the SQL idioms, the PascalCase quoting rule, vector-vs-raster routing, and failure modes. Links out to the tutorial for everything else.
- tutorials/cecil-data-analysis-xql/
  ├── README.md     full Step 0–5 walkthrough + worked example
  ├── references/   datasets.md, sdk.md, xarray_sql.md (Jayendra's)
  └── scripts/      _env.py, list_subscriptions.py, inspect_dataset.py, run_analysis.py (Jayendra's, with three small fixes)

Three review fixes applied:

1. run_analysis.py: added --vector flag using load_dataframe + XarrayContext.from_pandas. Previously the runner unconditionally called load_xarray, which would fail on IBAT vector datasets the description says are in scope ("threatened species ranges intersect this AOI").
2. run_analysis.py: replaced fig.autofmt_xdate() in the bar-chart branch with rotation of categorical xticks. autofmt_xdate is a date-axis helper and was being applied to a non-date axis.
3. README.md: pinned xarray-sql to a sub-0.1 range (pre-1.0 package; the XarrayContext().from_dataset() API is the kind that drifts in 0.x).

Also dropped the bash-specific `set -a && source .env && set +a` syntax from the prerequisites, because it breaks for fish/zsh-without-posix users.

If this merges, #4 should close as superseded.

Co-Authored-By: jayendra13 <651057+jayendra13@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
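
The vector-vs-raster routing behind the --vector fix can be sketched with argparse. This is an assumption-laden illustration rather than the actual run_analysis.py: `build_parser` and `pick_loader` are hypothetical helpers, and the function only returns the name of the loader described above instead of calling the SDK.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a hypothetical command line for an analysis runner."""
    parser = argparse.ArgumentParser(description="Hypothetical analysis runner")
    parser.add_argument("subscription_id", help="subscription to load")
    parser.add_argument(
        "--vector",
        action="store_true",
        help="treat the dataset as vector data instead of raster",
    )
    return parser


def pick_loader(args: argparse.Namespace) -> str:
    """Return the loader name to use for the parsed arguments."""
    # Vector datasets go through the DataFrame path; rasters through xarray.
    return "load_dataframe" if args.vector else "load_xarray"


if __name__ == "__main__":
    print(pick_loader(build_parser().parse_args()))
```

Keeping the routing decision in a small pure function makes it easy to test without network access or credentials.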


jayendra13 force-pushed the add-cecil-data-analysis-xql-skill branch from f4fcb2f to a232e5d on May 11, 2026 14:28

jayendra13 force-pushed the add-cecil-data-analysis-xql-skill branch from a232e5d to 4eb928b on May 11, 2026 14:29

jayendra13 wants to merge 1 commit into cecilearth:main from jayendra13:add-cecil-data-analysis-xql-skill
