Releases · discretewater/orga

ORGA v0.1.3 is the second public release of the project.

ORGA is a fast, explainable, non-LLM extraction engine for profiling institutional websites. It discovers key pages, extracts structured contact information, and assigns primary organization categories using deterministic rules, semantic heuristics, JSON-LD parsing, and lightweight Bayesian classification.
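To make the JSON-LD step concrete, here is a minimal sketch of pulling organization data out of `<script type="application/ld+json">` blocks using only the Python standard library. It is an illustration under stated assumptions, not ORGA's actual code: the names `JsonLdCollector` and `extract_organizations` are hypothetical, and only the schema.org `Organization` type with `name` and `telephone` fields is handled.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def extract_organizations(html):
    """Return name/telephone pairs for each Organization found in JSON-LD (hypothetical helper)."""
    collector = JsonLdCollector()
    collector.feed(html)
    found = []
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # malformed JSON-LD is skipped rather than raising
        items = data if isinstance(data, list) else [data]
        for item in items:
            if isinstance(item, dict) and item.get("@type") == "Organization":
                found.append({"name": item.get("name"), "telephone": item.get("telephone")})
    return found
```

Skipping malformed blocks instead of raising reflects the defensive handling that noisy real-world pages tend to require.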

Highlights in this release:

The second public open-source release of ORGA
Dockerized FastAPI microservices for single-site extraction and batch jobs
Deterministic extraction pipeline for names, locations, phones, emails, and social links
Layered organization classification with rule-based scoring and Bayesian fallback
Stronger parser hardening and defensive handling for noisy real-world websites
Improved address/location deduplication and sanitization
Public-facing documentation, examples, and design notes
CI workflow, CHANGELOG, and CONTRIBUTING guide included
PyPI package published for the core library
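The layered classification idea in the highlights can be sketched roughly as follows: score categories with weighted keyword rules first, and fall back to a small naive Bayes model only when no rule score is strong enough. Everything here is an assumption made for illustration, including the `RULES` table, the toy `TRAINING` corpus, the threshold, and the function names; ORGA's real rules, categories, and model may differ.

```python
import math
from collections import Counter

# Hypothetical keyword weights per category (not ORGA's actual rules).
RULES = {
    "hospital": {"hospital": 3, "clinic": 2, "patients": 1},
    "university": {"university": 3, "faculty": 2, "admissions": 1},
}

# Hypothetical toy corpus used only by the Bayesian fallback.
TRAINING = {
    "hospital": "emergency care doctors nurses surgery patients health",
    "university": "students campus research degrees courses professors",
}


def rule_scores(tokens):
    """Sum the keyword weights that fire for each category."""
    counts = Counter(tokens)
    return {cat: sum(w * counts[k] for k, w in kws.items()) for cat, kws in RULES.items()}


def naive_bayes(tokens):
    """Multinomial naive Bayes with add-one smoothing and a uniform prior."""
    vocab = {w for text in TRAINING.values() for w in text.split()}
    best_cat, best_lp = None, -math.inf
    for cat, text in TRAINING.items():
        counts = Counter(text.split())
        total = sum(counts.values())
        lp = math.log(1 / len(TRAINING))
        for t in tokens:
            if t in vocab:  # out-of-vocabulary tokens are ignored
                lp += math.log((counts[t] + 1) / (total + len(vocab)))
        if lp > best_lp:
            best_cat, best_lp = cat, lp
    return best_cat


def classify(text, min_rule_score=3):
    """Prefer the rule layer; use the Bayesian layer only when rules are weak."""
    tokens = text.lower().split()
    scores = rule_scores(tokens)
    best = max(scores, key=scores.get)
    if scores[best] >= min_rule_score:
        return {"category": best, "method": "rules", "score": scores[best]}
    return {"category": naive_bayes(tokens), "method": "bayes_fallback"}
```

Returning the `method` alongside the category keeps the decision traceable, since a caller can see which layer produced the label.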

What ORGA is good at:

Profiling institutional websites such as hospitals, universities, government agencies, nonprofits, and international organizations
Producing structured JSON output quickly and predictably
Serving as an explainable, low-cost baseline without LLM dependencies
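As a rough picture of what predictable, structured JSON output from a deterministic extractor can look like, the sketch below pulls emails and phone numbers from page text with regular expressions, deduplicates them, and serializes the result with sorted keys. The patterns, the `extract_contacts` and `to_json` names, and the output shape are assumptions chosen for illustration and do not describe ORGA's actual schema or API.

```python
import json
import re

# Deliberately simple patterns; real-world extraction needs more care.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def extract_contacts(text):
    """Return deduplicated emails and phones found in text (hypothetical helper)."""
    # Lowercase and sort emails so the same input always yields the same output.
    emails = sorted({m.lower() for m in EMAIL_RE.findall(text)})
    phones = []
    for match in PHONE_RE.findall(text):
        digits = re.sub(r"\D", "", match)
        normalized = ("+" if match.startswith("+") else "") + digits
        if normalized not in phones:  # keep first-seen order, drop duplicates
            phones.append(normalized)
    return {"emails": emails, "phones": phones}


def to_json(profile):
    """Serialize with sorted keys so output is stable across runs."""
    return json.dumps(profile, sort_keys=True, indent=2)
```

For example, `to_json(extract_contacts(page_text))` would give a stable JSON document for a given `page_text`, where `page_text` is whatever visible text has already been pulled from the page.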

Known boundaries:

Not designed for open-world semantic understanding
Not intended for deep PDF reading or nuanced corporate hierarchy interpretation
Address parsing from noisy footer text may still yield partially parsed raw strings in difficult cases

This release establishes the frozen baseline for the current deterministic architecture. Future work may explore lightweight supervised calibration and optional LLM-assisted features without compromising the core system’s speed, traceability, and deterministic behavior.
