Skip to content

jemsbhai/jsonld-ex

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

178 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

jsonld-ex β€” JSON-LD 1.2 Extensions

Reference implementation of proposed JSON-LD 1.2 extensions for AI/ML data exchange, security hardening, and validation.

Companion implementation for: "Extending JSON-LD for Modern AI: Addressing Security, Data Modeling, and Implementation Gaps" β€” FLAIRS-39 (2026)

PyPI Tests License: MIT

Overview

jsonld-ex extends the existing JSON-LD ecosystem with backward-compatible extensions that address critical gaps in:

  1. AI/ML Data Modeling β€” @confidence, @source, @vector container, provenance tracking, multimodal annotations, calibration & aggregation metadata
  2. Confidence Algebra β€” Full Subjective Logic framework (JΓΈsang 2016): opinions, cumulative/averaging fusion, trust discount, deduction, conflict detection, Byzantine-resistant fusion, temporal decay
  3. Compliance Algebra β€” GDPR regulatory uncertainty modeling: jurisdictional meet, compliance propagation, consent assessment, temporal triggers, erasure scope
  4. Similarity Metrics β€” Extensible registry with 7 built-in + 10 example metrics, metric selection advisory system (compare, analyze, recommend, evaluate)
  5. Data Protection β€” GDPR/privacy compliance with W3C DPV v2.2 interop: consent lifecycle, data subject rights (Art. 15–20), personal data classification
  6. Security Hardening β€” @integrity context verification, context allowlists, resource limits
  7. Validation β€” @shape native validation with nested shapes, conditional constraints (@if/@then/@else), severity levels, shape inheritance (@extends)
  8. Inference β€” Confidence propagation through inference chains, multi-source combination (noisy-OR, Dempster–Shafer)
  9. Graph Operations β€” Confidence-aware merging, semantic diff, conflict resolution
  10. Temporal Modeling β€” @validFrom, @validUntil, @asOf for time-aware assertions
  11. Dataset Metadata β€” ML dataset cards with Croissant interop (to_croissant/from_croissant)
  12. IoT Transport β€” CBOR-LD binary serialization, MQTT topic/QoS derivation, SSN/SOSA interop
  13. Context Versioning β€” Context diff, backward compatibility checking
  14. MCP Server β€” 53 tools exposing all library capabilities to LLM agents via the Model Context Protocol

Ecosystem Interoperability

jsonld-ex does not replace existing standards β€” it bridges them:

Standard Relationship
PROV-O Bidirectional conversion via to_prov_o / from_prov_o (60–75% fewer triples)
SHACL Bidirectional mapping via shape_to_shacl / shacl_to_shape
OWL Bidirectional: shape_to_owl_restrictions / owl_to_shape
RDF-Star Bidirectional: to_rdf_star_ntriples / from_rdf_star_ntriples, plus Turtle export
SSN/SOSA Bidirectional IoT sensor metadata via to_ssn / from_ssn
Croissant ML dataset metadata via to_croissant / from_croissant
DPV v2.2 Data privacy vocabulary via to_dpv / from_dpv
CBOR-LD Binary serialization with context compression

Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚            MCP Server (53 tools, 5 resources, 4 prompts)              β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚                       jsonld-ex Extensions (v0.6.5)                    β”‚
β”‚                                                                       β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”‚
β”‚  β”‚  Confidence Algebra (Subjective Logic) + Compliance Algebra (GDPR) β”‚  β”‚
β”‚  β”‚  Opinions, fusion, trust discount, deduction, Byzantine-resistant  β”‚  β”‚
β”‚  β”‚  Jurisdictional meet, consent, propagation, erasure, triggers     β”‚  β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β”‚
β”‚                                                                       β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”‚
β”‚  β”‚ AI/ML          β”‚ β”‚ Security       β”‚ β”‚ Validation     β”‚ β”‚ Inference      β”‚  β”‚
β”‚  β”‚ @confidence    β”‚ β”‚ @integrity     β”‚ β”‚ @shape         β”‚ β”‚ propagation    β”‚  β”‚
β”‚  β”‚ @source        β”‚ β”‚ allowlist      β”‚ β”‚ @if/@then      β”‚ β”‚ combination    β”‚  β”‚
β”‚  β”‚ @vector        β”‚ β”‚ limits         β”‚ β”‚ @extends       β”‚ β”‚ conflict res.  β”‚  β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β”‚
β”‚                                                                       β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”‚
β”‚  β”‚ Data Protectionβ”‚ β”‚ Similarity     β”‚ β”‚ Dataset /      β”‚ β”‚ Context        β”‚  β”‚
β”‚  β”‚ GDPR, DPV      β”‚ β”‚ 7 built-in     β”‚ β”‚ Croissant      β”‚ β”‚ versioning     β”‚  β”‚
β”‚  β”‚ consent, rightsβ”‚ β”‚ 10 examples    β”‚ β”‚ interop        β”‚ β”‚ diff, compat   β”‚  β”‚
β”‚  β”‚ erasure, audit β”‚ β”‚ advisory sys.  β”‚ β”‚                β”‚ β”‚                β”‚  β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β”‚
β”‚                                                                       β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”‚
β”‚  β”‚ Temporal       β”‚ β”‚ Merge / Diff  β”‚ β”‚ Interop        β”‚ β”‚ IoT Transport  β”‚  β”‚
β”‚  β”‚ @validFrom     β”‚ β”‚ graphs        β”‚ β”‚ PROV-O, SHACL  β”‚ β”‚ CBOR-LD, MQTT  β”‚  β”‚
β”‚  β”‚ @validUntil    β”‚ β”‚ conflict      β”‚ β”‚ OWL, RDF-Star  β”‚ β”‚ SSN/SOSA       β”‚  β”‚
β”‚  β”‚ @asOf          β”‚ β”‚ resolution    β”‚ β”‚ SSN, Croissant β”‚ β”‚ topic, QoS     β”‚  β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚                  PyLD (Core JSON-LD 1.1 Processing)                    β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚                       JSON-LD 1.1 Specification                        β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Quick Start

Installation

# Core (all features except IoT transport)
pip install jsonld-ex

# With IoT transport (CBOR-LD + MQTT helpers)
pip install jsonld-ex[iot]

Annotate Values with Confidence and Provenance

from jsonld_ex import annotate, get_confidence

doc = {
    "@context": "http://schema.org/",
    "@type": "Person",
    "name": annotate(
        "John Smith",
        confidence=0.95,
        source="https://ml-model.example.org/ner-v2",
        extracted_at="2026-01-15T10:30:00Z",
        method="NER",
    ),
}

get_confidence(doc["name"])  # 0.95

Propagate Confidence Through Inference Chains

from jsonld_ex import propagate_confidence, combine_sources

# Source (0.9 conf) β†’ Rule (0.8 conf) β†’ Conclusion
result = propagate_confidence([0.9, 0.8], method="dampened")
result.score  # 0.849 (less aggressive than naive 0.72)

# Two sources independently say the same thing
combined = combine_sources([0.8, 0.7], method="noisy_or")
combined.score  # 0.94

Merge Graphs from Multiple Sources

from jsonld_ex import merge_graphs

graph_a = {"@context": "http://schema.org/", "@graph": [
    {"@id": "ex:alice", "@type": "Person",
     "name": {"@value": "Alice", "@confidence": 0.8, "@source": "model-A"}}
]}
graph_b = {"@context": "http://schema.org/", "@graph": [
    {"@id": "ex:alice", "@type": "Person",
     "name": {"@value": "Alice", "@confidence": 0.7, "@source": "model-B"}}
]}

merged, report = merge_graphs([graph_a, graph_b])
# Agreement β†’ confidence boosted via noisy-OR: 0.94
# report.properties_agreed == 1, report.properties_conflicted == 0

Time-Aware Assertions

from jsonld_ex import add_temporal, query_at_time

nodes = [
    {"@id": "ex:alice", "jobTitle": add_temporal(
        {"@value": "Engineer", "@confidence": 0.9},
        valid_from="2020-01-01", valid_until="2023-12-31",
    )},
    {"@id": "ex:alice", "jobTitle": add_temporal(
        {"@value": "Manager", "@confidence": 0.85},
        valid_from="2024-01-01",
    )},
]

query_at_time(nodes, "2022-06-15")  # β†’ Engineer
query_at_time(nodes, "2025-01-01")  # β†’ Manager

CBOR-LD Payload Optimization

from jsonld_ex import to_cbor, from_cbor, payload_stats

doc = {"@context": "http://schema.org/", "@type": "SensorReading",
       "value": {"@value": 42.5, "@confidence": 0.9}}

stats = payload_stats(doc)
# stats.cbor_ratio β‰ˆ 0.65 (35% smaller than JSON)
# stats.gzip_cbor_ratio β‰ˆ 0.45 (55% smaller than JSON)

payload = to_cbor(doc)          # bytes for wire transmission
restored = from_cbor(payload)   # back to dict

Convert to/from PROV-O

from jsonld_ex import to_prov_o, from_prov_o

doc = {
    "@context": "http://schema.org/",
    "@type": "Person",
    "name": {"@value": "Alice", "@confidence": 0.95,
             "@source": "https://model.example.org/v2",
             "@method": "NER"},
}

prov_doc, report = to_prov_o(doc)
# Full PROV-O graph with Entity, Activity, Agent nodes
# report.compression_ratio shows jsonld-ex is 3-5x more compact

round_tripped = from_prov_o(prov_doc)
# Back to inline annotations β€” lossless round-trip

Module Reference

Module Key Exports Description
ai_ml annotate, get_confidence, get_provenance, filter_by_confidence Core annotation with 23 provenance fields
confidence_algebra Opinion, cumulative_fuse, averaging_fuse, trust_discount, deduce, robust_fuse Subjective Logic framework (JΓΈsang 2016)
compliance_algebra ComplianceOpinion, jurisdictional_meet, compliance_propagation, consent_validity, erasure_scope_opinion GDPR regulatory uncertainty modeling
similarity similarity, compare_metrics, analyze_vectors, recommend_metric, evaluate_metrics, MetricProperties 7 built-in + extensible metrics, advisory system
data_protection annotate_protection, create_consent_record, is_consent_active, filter_by_jurisdiction GDPR/privacy compliance metadata
data_rights request_erasure, execute_erasure, export_portable, right_of_access_report Data subject rights (GDPR Art. 15–20)
dpv_interop to_dpv, from_dpv, compare_with_dpv W3C Data Privacy Vocabulary v2.2
validation validate_node, validate_document @shape validation with @if/@then, @extends
security compute_integrity, verify_integrity, is_context_allowed @integrity and allowlists
owl_interop to_prov_o, from_prov_o, shape_to_shacl, shacl_to_shape, to_ssn, from_ssn Bidirectional: PROV-O, SHACL, OWL, RDF-Star, SSN/SOSA
dataset create_dataset_metadata, to_croissant, from_croissant ML dataset cards, Croissant interop
inference propagate_confidence, combine_sources, resolve_conflict Confidence propagation and combination
confidence_bridge combine_opinions_from_scalars, propagate_opinions_from_scalars Scalar-to-opinion bridge
confidence_decay decay_opinion, exponential_decay, linear_decay, step_decay Temporal decay of evidence
merge merge_graphs, diff_graphs Graph merging and diff
temporal add_temporal, query_at_time, temporal_diff Time-aware assertions
vector validate_vector, cosine_similarity, vector_term_definition @vector container support
batch annotate_batch, validate_batch, filter_by_confidence_batch Batch operations
context context_diff, check_compatibility Context versioning and migration
cbor_ld to_cbor, from_cbor, payload_stats Binary serialization (requires cbor2)
mqtt to_mqtt_payload, from_mqtt_payload, derive_mqtt_topic, derive_mqtt_qos IoT transport (requires cbor2)
mcp MCP server (53 tools, 5 resources, 4 prompts) LLM agent integration (requires mcp)

Packages

Detailed documentation, usage examples, and API reference for each language implementation:

Package Path Status
Python packages/python/README.md βœ… Published on PyPI β€” 23 modules, 53 MCP tools, 2025+ tests
JavaScript/TypeScript packages/js/README.md 🚧 Early development (v0.1.0) β€” 4 core modules (ai-ml, security, validation, vector)

Extension Specifications

Formal specifications for each extension are in /spec:

See DOCS_PLAN.md for the comprehensive documentation roadmap.

Contributing

This is a research implementation accompanying an academic publication. Contributions welcome via issues and PRs.

License

MIT

Citation

@inproceedings{jsonld-ex-flairs-2026,
  title={Extending JSON-LD for Modern AI: Addressing Security, Data Modeling, and Implementation Gaps},
  author={Syed, Muntaser and Silaghi, Marius and Abujar, Sheikh and Alssadi, Rwaida},
  booktitle={Proceedings of the 39th International FLAIRS Conference},
  year={2026}
}

A follow-up paper targeting NeurIPS 2026 Datasets & Benchmarks is in preparation, covering the formal confidence algebra, comprehensive benchmarks, and extended evaluation.

About

JSONLD Extensions for ML and SL

Resources

License

Stars

Watchers

Forks

Packages