Add App 3 initial implementation: classifier, catalog, indexer and GUI #1

Open
ElPoot wants to merge 3 commits into main from codex/explain-repository-purpose-and-details
Conversation

ElPoot (Owner) commented Feb 26, 2026

Motivation

  • Provide a first executable version of "App 3" that reuses parsing and settings logic from the legacy apps to accelerate migration.
  • Offer a basic workflow to index invoices, classify PDFs into account folders with SHA256-verified safe moves, and persist classifications.
  • Include a simple GUI to load a client by cédula, browse invoices and perform manual classification.

Description

  • Added a new app3/ package with a bootstrapper that inserts legacy app folders into sys.path so APP 1/APP 2 modules can be reused.
  • Implemented configuration helpers in app3/config.py and a FacturaRecord dataclass in app3/core/models.py.
  • Implemented CatalogManager with atomic JSON save/load and ClassificationDB backed by clasificacion.sqlite including an ON CONFLICT upsert and SHA256 hashing/validation in app3/core/classifier.py.
  • Added FacturaIndexer to build invoice records from XML and PDF folders, plus session resolution that reuses the existing profiles and settings.
  • Added a Tkinter GUI (app3/gui/main_window.py) and a main entrypoint (app3/main.py), and updated the README with usage and status.
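
For readers unfamiliar with the two persistence techniques named above, here is a minimal sketch of an atomic JSON save and an ON CONFLICT upsert. It is not the code in app3/core/classifier.py: the function names, the table name `clasificacion`, and its columns are assumptions made for illustration only.

```python
import json
import os
import sqlite3
import tempfile
from pathlib import Path


def atomic_save_json(path: Path, data: dict) -> None:
    """Write JSON to a temp file next to the target, then rename over it."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(data, fh, ensure_ascii=False, indent=2)
            fh.flush()
            os.fsync(fh.fileno())
        # os.replace swaps the file in one step when source and target
        # live on the same filesystem, so readers never see a partial file.
        os.replace(tmp_name, path)
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def create_schema(conn: sqlite3.Connection) -> None:
    # Hypothetical schema: one row per invoice key (clave).
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS clasificacion (
            clave TEXT PRIMARY KEY,
            categoria TEXT NOT NULL,
            sha256 TEXT NOT NULL
        )
        """
    )


def upsert_classification(
    conn: sqlite3.Connection, clave: str, categoria: str, sha256: str
) -> None:
    """Insert a classification, or update it if the clave already exists."""
    conn.execute(
        """
        INSERT INTO clasificacion (clave, categoria, sha256)
        VALUES (?, ?, ?)
        ON CONFLICT(clave) DO UPDATE SET
            categoria = excluded.categoria,
            sha256 = excluded.sha256
        """,
        (clave, categoria, sha256),
    )
    conn.commit()
```

Writing to a temporary file in the same directory keeps the final rename on one filesystem, which is what makes the replace step safe against interrupted writes.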

Testing

  • Added tests in tests/test_app3_core.py and tests/conftest.py exercising catalog atomic save/load and classify_record behavior.
  • Ran pytest covering tests/test_app3_core.py::test_catalog_atomic_save_load and tests/test_app3_core.py::test_classify_record_moves_pdf_and_registers, and both tests passed.

chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a246c127d0

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you:

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

clave = str(parsed.get("clave_numerica") or "").strip()
if len(clave) != 50:
    continue
records[clave] = FacturaRecord(

P1: Preserve invoice record when the same clave appears in another XML

This assignment overwrites any previously indexed record for the same clave, so when a client has both the invoice XML and a MensajeHacienda XML with the same key, whichever file is processed last wins. If MensajeHacienda is encountered after FacturaElectronica, the indexed row loses invoice fields like issuer, date, and totals, which leaves the UI with incomplete/incorrect invoice context for classification.
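
One possible way to address this is to merge into an existing record instead of overwriting it, so that a later XML can only fill in fields that are still empty. The sketch below uses a simplified stand-in for FacturaRecord; its field names (`emisor`, `fecha`, `total`, `estado_hacienda`) and the `merge_record` helper are assumptions for illustration, not the definitions in app3/core/models.py.

```python
from dataclasses import dataclass, fields
from typing import Dict, Optional


@dataclass
class FacturaRecord:
    # Simplified, hypothetical stand-in for the real dataclass.
    clave: str
    emisor: Optional[str] = None
    fecha: Optional[str] = None
    total: Optional[float] = None
    estado_hacienda: Optional[str] = None


def merge_record(records: Dict[str, FacturaRecord], incoming: FacturaRecord) -> None:
    """Add a record, or fill only the still-empty fields of an existing one."""
    existing = records.get(incoming.clave)
    if existing is None:
        records[incoming.clave] = incoming
        return
    for f in fields(FacturaRecord):
        current = getattr(existing, f.name)
        new_value = getattr(incoming, f.name)
        # Never replace a populated field with data from a later file.
        if current in (None, "") and new_value not in (None, ""):
            setattr(existing, f.name, new_value)


# Usage: the result does not depend on which XML is processed first.
records: Dict[str, FacturaRecord] = {}
key = "5" * 50
merge_record(records, FacturaRecord(key, emisor="ACME S.A.", fecha="2026-02-01", total=1130.0))
merge_record(records, FacturaRecord(key, estado_hacienda="aceptado"))
```

With this approach a MensajeHacienda file contributes its status without erasing issuer, date, or totals. A stricter variant could instead give FacturaElectronica fields explicit priority by document type.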

)
return None

dest_folder = client_folder / categoria / subcategoria / proveedor

P2: Sanitize destination folder names before classifying PDFs

The destination path is built directly from user/catalog text (categoria, subcategoria, proveedor) without filesystem sanitization. In this flow proveedor is prefilled from XML issuer names, and names containing Windows-invalid characters (for example : or ?) will make mkdir/copy fail, while separators like / create unintended nested directories; both cases block or misroute real classifications.
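
A minimal sketch of the kind of sanitization suggested here, assuming each of the three values should become exactly one path component. The helper names, the `_` replacement character, and the `SIN_NOMBRE` fallback are hypothetical choices, not part of the PR.

```python
import re
from pathlib import Path

# Characters Windows rejects in file names, plus both path separators
# and ASCII control characters.
_INVALID_CHARS = re.compile(r'[<>:"/\\|?*\x00-\x1f]')
# Device names Windows reserves regardless of extension.
_RESERVED_NAMES = {
    "CON", "PRN", "AUX", "NUL",
    *(f"COM{i}" for i in range(1, 10)),
    *(f"LPT{i}" for i in range(1, 10)),
}


def sanitize_folder_name(name: str, fallback: str = "SIN_NOMBRE") -> str:
    """Turn free text into a single, portable folder name."""
    cleaned = _INVALID_CHARS.sub("_", name)
    # Collapse whitespace runs; Windows also rejects trailing dots/spaces.
    cleaned = re.sub(r"\s+", " ", cleaned).strip(" .")
    if not cleaned:
        return fallback
    if cleaned.split(".")[0].upper() in _RESERVED_NAMES:
        cleaned = f"_{cleaned}"
    return cleaned


def build_dest_folder(
    client_folder: Path, categoria: str, subcategoria: str, proveedor: str
) -> Path:
    """Build the destination path from sanitized components only."""
    return (
        client_folder
        / sanitize_folder_name(categoria)
        / sanitize_folder_name(subcategoria)
        / sanitize_folder_name(proveedor)
    )


# Usage: a separator in the supplier name no longer creates extra levels.
dest = build_dest_folder(Path("cliente"), "Gastos", "Servicios", "Luz/Agua: Norte?")
```

Because separators are replaced rather than interpreted, the result always sits exactly three levels below the client folder.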
