Reference implementation of proposed JSON-LD 1.2 extensions for AI/ML data exchange, security hardening, and validation.
Companion implementation for: "Extending JSON-LD for Modern AI: Addressing Security, Data Modeling, and Implementation Gaps" (FLAIRS-39, 2026)
jsonld-ex extends the existing JSON-LD ecosystem with backward-compatible extensions that address critical gaps in:
- AI/ML Data Modeling: @confidence, @source, @vector container, provenance tracking, multimodal annotations, calibration & aggregation metadata
- Confidence Algebra: Full Subjective Logic framework (Jøsang 2016): opinions, cumulative/averaging fusion, trust discount, deduction, conflict detection, Byzantine-resistant fusion, temporal decay
- Compliance Algebra: GDPR regulatory uncertainty modeling: jurisdictional meet, compliance propagation, consent assessment, temporal triggers, erasure scope
- Similarity Metrics: Extensible registry with 7 built-in + 10 example metrics, metric selection advisory system (compare, analyze, recommend, evaluate)
- Data Protection: GDPR/privacy compliance with W3C DPV v2.2 interop: consent lifecycle, data subject rights (Art. 15-20), personal data classification
- Security Hardening: @integrity context verification, context allowlists, resource limits
- Validation: @shape native validation with nested shapes, conditional constraints (@if/@then/@else), severity levels, shape inheritance (@extends)
- Inference: Confidence propagation through inference chains, multi-source combination (noisy-OR, Dempster-Shafer)
- Graph Operations: Confidence-aware merging, semantic diff, conflict resolution
- Temporal Modeling: @validFrom, @validUntil, @asOf for time-aware assertions
- Dataset Metadata: ML dataset cards with Croissant interop (to_croissant/from_croissant)
- IoT Transport: CBOR-LD binary serialization, MQTT topic/QoS derivation, SSN/SOSA interop
- Context Versioning: Context diff, backward compatibility checking
- MCP Server: 53 tools exposing all library capabilities to LLM agents via the Model Context Protocol
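The Inference item above names noisy-OR as one way to combine independent sources. As a rough illustration of the arithmetic only, the sketch below computes 1 minus the product of the complements. The function name `noisy_or` is hypothetical and this is not the library's implementation; it assumes the sources are independent and that every score lies in [0, 1].

```python
from math import prod
from typing import Iterable


def noisy_or(confidences: Iterable[float]) -> float:
    """Combine independent confidence scores as 1 - prod(1 - p).

    Hypothetical helper for illustration; not part of jsonld-ex.
    """
    values = list(confidences)
    if not values:
        raise ValueError("at least one confidence score is required")
    for p in values:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"confidence out of range [0, 1]: {p}")
    # Each source "fails" with probability (1 - p); the combined claim
    # fails only if every independent source fails.
    return 1.0 - prod(1.0 - p for p in values)


combined = noisy_or([0.8, 0.7])
print(round(combined, 2))  # 0.94
```

Two agreeing sources at 0.8 and 0.7 give 1 - (0.2 × 0.3) = 0.94, so agreement raises confidence above either input.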
jsonld-ex does not replace existing standards; it bridges them:
| Standard | Relationship |
|---|---|
| PROV-O | Bidirectional conversion via to_prov_o / from_prov_o (60-75% fewer triples) |
| SHACL | Bidirectional mapping via shape_to_shacl / shacl_to_shape |
| OWL | Bidirectional: shape_to_owl_restrictions / owl_to_shape |
| RDF-Star | Bidirectional: to_rdf_star_ntriples / from_rdf_star_ntriples, plus Turtle export |
| SSN/SOSA | Bidirectional IoT sensor metadata via to_ssn / from_ssn |
| Croissant | ML dataset metadata via to_croissant / from_croissant |
| DPV v2.2 | Data privacy vocabulary via to_dpv / from_dpv |
| CBOR-LD | Binary serialization with context compression |
┌───────────────────────────────────────────────────────────────────────────────┐
│                 MCP Server (53 tools, 5 resources, 4 prompts)                 │
├───────────────────────────────────────────────────────────────────────────────┤
│                         jsonld-ex Extensions (v0.6.5)                         │
│                                                                               │
│  ┌─────────────────────────────────────────────────────────────────────────┐  │
│  │ Confidence Algebra (Subjective Logic) + Compliance Algebra (GDPR)       │  │
│  │ Opinions, fusion, trust discount, deduction, Byzantine-resistant        │  │
│  │ Jurisdictional meet, consent, propagation, erasure, triggers            │  │
│  └─────────────────────────────────────────────────────────────────────────┘  │
│                                                                               │
│  ┌────────────────┐ ┌────────────────┐ ┌────────────────┐ ┌────────────────┐  │
│  │ AI/ML          │ │ Security       │ │ Validation     │ │ Inference      │  │
│  │ @confidence    │ │ @integrity     │ │ @shape         │ │ propagation    │  │
│  │ @source        │ │ allowlist      │ │ @if/@then      │ │ combination    │  │
│  │ @vector        │ │ limits         │ │ @extends       │ │ conflict res.  │  │
│  └────────────────┘ └────────────────┘ └────────────────┘ └────────────────┘  │
│                                                                               │
│  ┌────────────────┐ ┌────────────────┐ ┌────────────────┐ ┌────────────────┐  │
│  │ Data Protection│ │ Similarity     │ │ Dataset /      │ │ Context        │  │
│  │ GDPR, DPV      │ │ 7 built-in     │ │ Croissant      │ │ versioning     │  │
│  │ consent, rights│ │ 10 examples    │ │ interop        │ │ diff, compat   │  │
│  │ erasure, audit │ │ advisory sys.  │ │                │ │                │  │
│  └────────────────┘ └────────────────┘ └────────────────┘ └────────────────┘  │
│                                                                               │
│  ┌────────────────┐ ┌────────────────┐ ┌────────────────┐ ┌────────────────┐  │
│  │ Temporal       │ │ Merge / Diff   │ │ Interop        │ │ IoT Transport  │  │
│  │ @validFrom     │ │ graphs         │ │ PROV-O, SHACL  │ │ CBOR-LD, MQTT  │  │
│  │ @validUntil    │ │ conflict       │ │ OWL, RDF-Star  │ │ SSN/SOSA       │  │
│  │ @asOf          │ │ resolution     │ │ SSN, Croissant │ │ topic, QoS     │  │
│  └────────────────┘ └────────────────┘ └────────────────┘ └────────────────┘  │
├───────────────────────────────────────────────────────────────────────────────┤
│                      PyLD (Core JSON-LD 1.1 Processing)                       │
├───────────────────────────────────────────────────────────────────────────────┤
│                           JSON-LD 1.1 Specification                           │
└───────────────────────────────────────────────────────────────────────────────┘
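The Security box in the diagram lists @integrity alongside allowlists and limits. To make the idea of context integrity checking concrete, here is a minimal sketch using only the Python standard library. It assumes an SRI-style `<algorithm>-<base64 digest>` string and sorted-key JSON as the canonical form; both are assumptions made for illustration, and the helper names `integrity_of` and `matches_integrity` are hypothetical rather than the library's API.

```python
import base64
import hashlib
import hmac
import json

# Assumption: only these SHA-2 variants are accepted.
SUPPORTED_ALGORITHMS = {"sha256", "sha384", "sha512"}


def integrity_of(context: dict, algorithm: str = "sha256") -> str:
    """Return '<algorithm>-<base64 digest>' for a context (illustrative only)."""
    if algorithm not in SUPPORTED_ALGORITHMS:
        raise ValueError(f"unsupported algorithm: {algorithm}")
    # Sorted keys and fixed separators make the hash independent of key order.
    canonical = json.dumps(context, sort_keys=True, separators=(",", ":"))
    digest = hashlib.new(algorithm, canonical.encode("utf-8")).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


def matches_integrity(context: dict, expected: str) -> bool:
    """Check a context against a previously recorded integrity string."""
    algorithm = expected.split("-", 1)[0]
    if algorithm not in SUPPORTED_ALGORITHMS:
        return False
    actual = integrity_of(context, algorithm)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(actual.encode("utf-8"), expected.encode("utf-8"))


context = {"@vocab": "http://schema.org/", "name": "http://schema.org/name"}
tag = integrity_of(context)
tampered = {**context, "name": "http://evil.example.org/name"}
print(matches_integrity(context, tag))   # True
print(matches_integrity(tampered, tag))  # False
```

A consumer would record the tag when a context is first vetted and refuse to process a document whose fetched context no longer matches it.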
# Core (all features except IoT transport)
pip install jsonld-ex
# With IoT transport (CBOR-LD + MQTT helpers)
pip install jsonld-ex[iot]

from jsonld_ex import annotate, get_confidence
doc = {
"@context": "http://schema.org/",
"@type": "Person",
"name": annotate(
"John Smith",
confidence=0.95,
source="https://ml-model.example.org/ner-v2",
extracted_at="2026-01-15T10:30:00Z",
method="NER",
),
}
get_confidence(doc["name"])  # 0.95

from jsonld_ex import propagate_confidence, combine_sources
# Source (0.9 conf) → Rule (0.8 conf) → Conclusion
result = propagate_confidence([0.9, 0.8], method="dampened")
result.score  # 0.849 (less aggressive than naive 0.72)
# Two sources independently say the same thing
combined = combine_sources([0.8, 0.7], method="noisy_or")
combined.score  # 0.94

from jsonld_ex import merge_graphs
graph_a = {"@context": "http://schema.org/", "@graph": [
{"@id": "ex:alice", "@type": "Person",
"name": {"@value": "Alice", "@confidence": 0.8, "@source": "model-A"}}
]}
graph_b = {"@context": "http://schema.org/", "@graph": [
{"@id": "ex:alice", "@type": "Person",
"name": {"@value": "Alice", "@confidence": 0.7, "@source": "model-B"}}
]}
merged, report = merge_graphs([graph_a, graph_b])
# Agreement → confidence boosted via noisy-OR: 0.94
# report.properties_agreed == 1, report.properties_conflicted == 0

from jsonld_ex import add_temporal, query_at_time
nodes = [
{"@id": "ex:alice", "jobTitle": add_temporal(
{"@value": "Engineer", "@confidence": 0.9},
valid_from="2020-01-01", valid_until="2023-12-31",
)},
{"@id": "ex:alice", "jobTitle": add_temporal(
{"@value": "Manager", "@confidence": 0.85},
valid_from="2024-01-01",
)},
]
query_at_time(nodes, "2022-06-15")  # → Engineer
query_at_time(nodes, "2025-01-01")  # → Manager

from jsonld_ex import to_cbor, from_cbor, payload_stats
doc = {"@context": "http://schema.org/", "@type": "SensorReading",
"value": {"@value": 42.5, "@confidence": 0.9}}
stats = payload_stats(doc)
# stats.cbor_ratio ≈ 0.65 (35% smaller than JSON)
# stats.gzip_cbor_ratio ≈ 0.45 (55% smaller than JSON)
payload = to_cbor(doc)  # bytes for wire transmission
restored = from_cbor(payload)  # back to dict

from jsonld_ex import to_prov_o, from_prov_o
doc = {
"@context": "http://schema.org/",
"@type": "Person",
"name": {"@value": "Alice", "@confidence": 0.95,
"@source": "https://model.example.org/v2",
"@method": "NER"},
}
prov_doc, report = to_prov_o(doc)
# Full PROV-O graph with Entity, Activity, Agent nodes
# report.compression_ratio shows jsonld-ex is 3-5x more compact
round_tripped = from_prov_o(prov_doc)
# Back to inline annotations (lossless round-trip)

| Module | Key Exports | Description |
|---|---|---|
| ai_ml | annotate, get_confidence, get_provenance, filter_by_confidence | Core annotation with 23 provenance fields |
| confidence_algebra | Opinion, cumulative_fuse, averaging_fuse, trust_discount, deduce, robust_fuse | Subjective Logic framework (Jøsang 2016) |
| compliance_algebra | ComplianceOpinion, jurisdictional_meet, compliance_propagation, consent_validity, erasure_scope_opinion | GDPR regulatory uncertainty modeling |
| similarity | similarity, compare_metrics, analyze_vectors, recommend_metric, evaluate_metrics, MetricProperties | 7 built-in + extensible metrics, advisory system |
| data_protection | annotate_protection, create_consent_record, is_consent_active, filter_by_jurisdiction | GDPR/privacy compliance metadata |
| data_rights | request_erasure, execute_erasure, export_portable, right_of_access_report | Data subject rights (GDPR Art. 15-20) |
| dpv_interop | to_dpv, from_dpv, compare_with_dpv | W3C Data Privacy Vocabulary v2.2 |
| validation | validate_node, validate_document | @shape validation with @if/@then, @extends |
| security | compute_integrity, verify_integrity, is_context_allowed | @integrity and allowlists |
| owl_interop | to_prov_o, from_prov_o, shape_to_shacl, shacl_to_shape, to_ssn, from_ssn | Bidirectional: PROV-O, SHACL, OWL, RDF-Star, SSN/SOSA |
| dataset | create_dataset_metadata, to_croissant, from_croissant | ML dataset cards, Croissant interop |
| inference | propagate_confidence, combine_sources, resolve_conflict | Confidence propagation and combination |
| confidence_bridge | combine_opinions_from_scalars, propagate_opinions_from_scalars | Scalar-to-opinion bridge |
| confidence_decay | decay_opinion, exponential_decay, linear_decay, step_decay | Temporal decay of evidence |
| merge | merge_graphs, diff_graphs | Graph merging and diff |
| temporal | add_temporal, query_at_time, temporal_diff | Time-aware assertions |
| vector | validate_vector, cosine_similarity, vector_term_definition | @vector container support |
| batch | annotate_batch, validate_batch, filter_by_confidence_batch | Batch operations |
| context | context_diff, check_compatibility | Context versioning and migration |
| cbor_ld | to_cbor, from_cbor, payload_stats | Binary serialization (requires cbor2) |
| mqtt | to_mqtt_payload, from_mqtt_payload, derive_mqtt_topic, derive_mqtt_qos | IoT transport (requires cbor2) |
| mcp | MCP server (53 tools, 5 resources, 4 prompts) | LLM agent integration (requires mcp) |
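The confidence_algebra row refers to cumulative fusion in Subjective Logic. For readers unfamiliar with the operator, the sketch below shows the standard cumulative fusion formula for two binomial opinions (belief, disbelief, uncertainty, base rate). The names `SketchOpinion` and `fuse_cumulative` are hypothetical and deliberately differ from the library's exports; the sketch also assumes both opinions share a base rate and that at least one carries non-zero uncertainty.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchOpinion:
    """Binomial opinion; belief + disbelief + uncertainty should equal 1."""
    belief: float
    disbelief: float
    uncertainty: float
    base_rate: float = 0.5

    def projected_probability(self) -> float:
        # P = b + a * u: uncertainty mass is split according to the base rate.
        return self.belief + self.base_rate * self.uncertainty


def fuse_cumulative(x: SketchOpinion, y: SketchOpinion) -> SketchOpinion:
    """Cumulative fusion of two opinions built from independent evidence."""
    if x.base_rate != y.base_rate:
        raise ValueError("this sketch assumes equal base rates")
    denom = x.uncertainty + y.uncertainty - x.uncertainty * y.uncertainty
    if denom == 0:
        raise ValueError("both opinions are dogmatic (uncertainty == 0)")
    return SketchOpinion(
        belief=(x.belief * y.uncertainty + y.belief * x.uncertainty) / denom,
        disbelief=(x.disbelief * y.uncertainty + y.disbelief * x.uncertainty) / denom,
        uncertainty=(x.uncertainty * y.uncertainty) / denom,
        base_rate=x.base_rate,
    )


a = SketchOpinion(belief=0.6, disbelief=0.2, uncertainty=0.2)
b = SketchOpinion(belief=0.5, disbelief=0.1, uncertainty=0.4)
fused = fuse_cumulative(a, b)
print(round(fused.uncertainty, 4))  # 0.1538
```

The fused uncertainty (0.08 / 0.52 ≈ 0.154) is lower than either input, which is the point of cumulative fusion: independent evidence accumulates instead of being averaged away.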
Detailed documentation, usage examples, and API reference for each language implementation:
| Package | Path | Status |
|---|---|---|
| Python | packages/python/README.md | ✅ Published on PyPI: 23 modules, 53 MCP tools, 2025+ tests |
| JavaScript/TypeScript | packages/js/README.md | 🚧 Early development (v0.1.0): 4 core modules (ai-ml, security, validation, vector) |
Formal specifications for each extension are in /spec:
- AI/ML Extensions: Confidence, provenance, vector embeddings
See DOCS_PLAN.md for the comprehensive documentation roadmap.
This is a research implementation accompanying an academic publication. Contributions welcome via issues and PRs.
MIT
@inproceedings{jsonld-ex-flairs-2026,
title={Extending JSON-LD for Modern AI: Addressing Security, Data Modeling, and Implementation Gaps},
author={Syed, Muntaser and Silaghi, Marius and Abujar, Sheikh and Alssadi, Rwaida},
booktitle={Proceedings of the 39th International FLAIRS Conference},
year={2026}
}

A follow-up paper targeting NeurIPS 2026 Datasets & Benchmarks is in preparation, covering the formal confidence algebra, comprehensive benchmarks, and extended evaluation.