feat(parliament): surface sansad-semantic-crawler v1.0.0 discourse layer by skishchampi · Pull Request #9 · CommonerLLP/theright2read

skishchampi · 2026-05-12T14:58:24Z

Summary

Reframes the data-page parliament section from what MPs ask to how the State responds when asked. Consumes the v1.0.0 analytical pipeline (extract-answers → analyse-discourse → analyse-ministry) and joins it into the public dataset.

Pipeline

Makefile — new targets: corpus-extract-answers, corpus-analyse-discourse, corpus-analyse-ministry, corpus-analyse, corpus-enrich. corpus-refresh now runs the full pipeline end-to-end.
scripts/build_parliament_libraries.py (new) — joins manifest.jsonl + analysis_discourse.jsonl + ministry_summary_qa.jsonl into the public JS export. Emits four new top-level keys: discourseSummary, ministryDiscourse, discourseExcerpts, rrrlfDeflections.
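
The join described above might look roughly like the sketch below. It is an illustration under stated assumptions, not the contents of scripts/build_parliament_libraries.py: only the four output key names come from this PR, while the record fields (`id`, `label`, `ministry`, `tags`, `date`), the shape of `discourseSummary`, and the helper names are hypothetical. Dates are assumed to be ISO strings so that a plain string sort is chronological.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Return one dict per non-empty line of a JSONL file."""
    with Path(path).open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


def join_records(manifest, discourse, ministry_summary):
    """Join the three inputs into the four exported top-level keys.

    Hypothetical fields: every record has an "id"; discourse rows have a
    "label"; ministry rows have a "ministry"; manifest rows may have
    "tags" (list of str) and "date" (ISO string).
    """
    labels = {row["id"]: row["label"] for row in discourse}
    # Keep only manifest rows that the classifier actually labelled.
    excerpts = [
        {**m, "label": labels[m["id"]]} for m in manifest if m["id"] in labels
    ]
    # The "state subject" cascade: federal deflections on rrrlf-tagged rows.
    cascade = sorted(
        (
            e
            for e in excerpts
            if e["label"] == "FEDERAL_DEFLECTION" and "rrrlf" in e.get("tags", [])
        ),
        key=lambda e: e.get("date", ""),
    )
    return {
        "discourseSummary": {"classified": len(excerpts), "total": len(manifest)},
        "ministryDiscourse": {row["ministry"]: row for row in ministry_summary},
        "discourseExcerpts": excerpts,
        "rrrlfDeflections": cascade,
    }
```

In this sketch the caller would pass the results of `read_jsonl` for each of the three input files and then serialise the returned dict into the JS export.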

Surface

Headline stat: Of 105 classified responses to library questions, 64 (61%) were evasive.
"Library is a state subject" cascade: every FEDERAL_DEFLECTION response on rrrlf-tagged questions, sorted by date. The same five words appear in 1998 (HRD) and 2018 (Culture × 3) — across two decades and two political dispensations. Pairs with the existing RRRLF "two decades, a nose dive" section directly above it: Centre underfunded → Centre evaded when asked why.
Per-ministry evasion bars with classified-N denominators (HRD: 60% on N=15 of 83; Culture: 40% on N=15 of 67) — honest framing for the small classifier sample.
Taxonomy of evasion: single-column row list, one verbatim phrase per label as the hero (italic quote), one-line citation, collapsible full passage. Labels covered: REJECTED, SUBSTITUTED, FEDERAL_DEFLECTION, DEFLECTED, DATA_WITHHELD, STRUCTURAL_REFUSAL, CONSTITUTIONAL_DEFAULT, REPRESENTATIONAL_SILENCE.
Method note with corpus sizes + classifier name + repo link.
Old keyQuestions + topTags rendering demoted to a collapsible <details> block.
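
The rate arithmetic behind the headline stat and the per-ministry bars can be sketched as below. The function names are hypothetical. The headline counts (64 of 105) come from this PR; the per-ministry evasive counts (9 and 6) are not stated in the PR and are back-derived here from the quoted rates on N=15, so treat them as assumptions.

```python
def evasion_rate(evasive, classified):
    """Return the evasive share of classified responses as a whole percent."""
    if classified == 0:
        return None  # nothing classified, so no meaningful rate
    return round(100 * evasive / classified)


def bar_label(ministry, evasive, classified, total):
    """Format a bar label that keeps the classified-N denominator visible."""
    rate = evasion_rate(evasive, classified)
    shown = "n/a" if rate is None else f"{rate}%"
    return f"{ministry}: {shown} on N={classified} of {total}"


print(evasion_rate(64, 105))            # 61
print(bar_label("HRD", 9, 15, 83))      # HRD: 60% on N=15 of 83
print(bar_label("Culture", 6, 15, 67))  # Culture: 40% on N=15 of 67
```

Dividing by the classified count rather than the ministry total is what keeps the small classifier sample honest: the rate describes only the responses that were actually labelled.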

Followup tracked separately

Issue CommonerLLP/sansad-semantic-crawler#41: the v1.0.0 discourse classifier is voice-blind. The voice, passive_ratio, and agent_named fields are slated to land in v1.1.0 of the upstream package; this consumer is structured to receive those fields when they arrive.
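
One way a consumer could be "structured to receive" fields that do not exist yet is to fill in defaults for them. The sketch below is an assumption about the approach, not the PR's code: the three field names come from the note above, while the `None` defaults and the helper name are hypothetical.

```python
# Field names from the follow-up note; the None defaults are an assumption.
OPTIONAL_VOICE_FIELDS = {"voice": None, "passive_ratio": None, "agent_named": None}


def with_voice_fields(record):
    """Return a copy of record with any missing voice fields set to None."""
    # Values already present in the record win over the defaults.
    return {**OPTIONAL_VOICE_FIELDS, **record}
```

A v1.0.0 record passes through with the three fields set to `None`, and a later record that already carries them is left unchanged.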

Test plan

CI passes (link checker + html validation)
After merge, load /data/#parliament:
- Headline stat renders with red 61% and the lede sentence
- RRRLF cascade shows the 1998 + three 2018 deflections, dates left-rail
- Per-ministry bars render with rate + classified/total N
- Each evasion row renders the italic pattern + cite line; "Read the passage" expands
- Method note lists corpus stats and links the upstream repo

Reframes the data-page parliament section from "what MPs ask" to "how the State responds when asked." Consumes the v1.0.0 analytical pipeline (extract-answers → analyse-discourse → analyse-ministry) and joins it into the public dataset.

- Makefile: corpus-extract-answers, corpus-analyse-discourse, corpus-analyse-ministry, corpus-analyse, corpus-enrich. corpus-refresh now chains the full pipeline.
- scripts/build_parliament_libraries.py: joins manifest + analysis_discourse + ministry_summary_qa into assets/parliament_libraries.js. Emits new top-level keys: discourseSummary, ministryDiscourse, discourseExcerpts, rrrlfDeflections.
- data/index.html, assets/main.js, assets/styles.css: new section structure with headline evasion rate, "Library is a state subject" cascade (FEDERAL_DEFLECTION on RRRLF-tagged questions, sorted by date), per-ministry evasion bars with classified-N denominators, taxonomy of evasion (one verbatim phrase per label, with collapsible full passage), and method note. Old keyQuestions / topTags grid demoted to a collapsible details element.
- Cache-bust v=43 → v=44 across index.html, data/index.html, inequality/index.html.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4d462c0e7f


chatgpt-codex-connector · 2026-05-12T15:03:19Z

+                    "qtype": m.get("qtype", ""),
+                    "qno": m.get("qno", ""),
+                    "date": m.get("date", ""),
+                    "title": escape(m.get("title", "")),


Stop pre-escaping titles in export builder

The builder escapes title before writing JSON, but the UI escapes the same field again in assets/main.js when rendering both the evasion rows and the RRRLF cascade. This double-escaping turns legitimate characters like & into visible entities (&amp;) whenever a parliamentary title contains special characters, so end users see corrupted text instead of the original title.
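
The effect can be reproduced in a few lines. This sketch assumes the builder's `escape` behaves like Python's standard `html.escape`; the title string is a made-up example, and `unescape` stands in for the single level of entity decoding a browser performs when rendering.

```python
from html import escape, unescape

title = "Libraries & Reading Rooms"  # hypothetical title

once = escape(title)   # what the builder writes into the export
twice = escape(once)   # what the UI produces by escaping again

# A browser decodes entities once, so this is what the reader sees.
displayed = unescape(twice)

print(once)       # Libraries &amp; Reading Rooms
print(twice)      # Libraries &amp;amp; Reading Rooms
print(displayed)  # Libraries &amp; Reading Rooms
```

Escaping in exactly one place, at render time, leaves the exported data as plain text and avoids the visible entity.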


chatgpt-codex-connector · 2026-05-12T15:03:19Z

+
+# Join the upstream manifest export with the v1.0.0 analytical outputs
+# into a single enriched assets/parliament_libraries.js.
+corpus-enrich: corpus-export


Make corpus-enrich require analysis outputs

corpus-enrich only depends on corpus-export, but build_parliament_libraries.py reads analysis_discourse.jsonl and ministry_summary_qa.jsonl and silently treats missing files as empty arrays. As a result, running make corpus-enrich (or running corpus-refresh with parallel make) can produce a “successful” artifact with zeroed discourse metrics and missing sections instead of failing fast, which makes the published dataset silently incomplete.
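
On the Python side, failing fast could look like the sketch below. It is one possible approach, not the PR's code: the helper name and error messages are hypothetical, and treating an empty file as an error is an added assumption. Declaring the analysis targets as prerequisites of corpus-enrich in the Makefile would be the complementary fix.

```python
import json
from pathlib import Path


def load_required_jsonl(path):
    """Load a JSONL file, raising instead of returning an empty list."""
    p = Path(path)
    if not p.is_file():
        raise FileNotFoundError(f"missing analysis output: {p}")
    with p.open(encoding="utf-8") as fh:
        rows = [json.loads(line) for line in fh if line.strip()]
    if not rows:
        # Assumption: an empty analysis file signals a broken pipeline run.
        raise ValueError(f"analysis output is empty: {p}")
    return rows
```

With a loader like this, a run that skipped the analysis steps would stop with an error rather than publish zeroed discourse metrics.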


skishchampi merged commit 5ba100c into main May 12, 2026
2 checks passed

skishchampi deleted the feat/parliament-classifier-v2 branch May 12, 2026 15:00

skishchampi merged 1 commit into main from feat/parliament-classifier-v2
