
fix(libreoffice): import existing office files #264

Merged
yuh-yang merged 3 commits into main from fix/libreoffice-existing-office-import
Apr 30, 2026


@yuh-yang
Collaborator

Description

Fixes #

Type of Change

  • New Software CLI (in-repo) — adds a CLI harness inside this monorepo
  • New Software CLI (standalone repo) — registry-only PR pointing to an external repo
  • New Feature — adds new functionality to an existing harness or the plugin
  • Bug Fix — fixes incorrect behavior
  • Documentation — updates docs only
  • Other — please describe:

For New Software CLIs (in-repo)

  • <SOFTWARE>.md SOP document exists at <software>/agent-harness/<SOFTWARE>.md
  • Canonical SKILL.md exists at skills/cli-anything-<software>/SKILL.md
  • Packaged compatibility SKILL.md exists at cli_anything/<software>/skills/SKILL.md
  • Unit tests at cli_anything/<software>/tests/test_core.py are present and pass without backend
  • E2E tests at cli_anything/<software>/tests/test_full_e2e.py are present
  • README.md includes the new software (with link to harness directory)
  • registry.json includes an entry with source_url: null (see Contributing guide)
  • repl_skin.py in utils/ is an unmodified copy from the plugin

For New Software CLIs (standalone repo)

  • CLI is installable via pip install <package-name> or a pip install git+https://... URL
  • SKILL.md exists in the external repo
  • External repo has its own test suite
  • registry.json entry includes source_url pointing to the external repo
  • registry.json entry includes skill_md with full URL to the external SKILL.md
  • install_cmd in registry.json works (tested locally)

For Existing CLI Modifications

  • All unit tests pass: python3 -m pytest cli_anything/<software>/tests/test_core.py -v
  • All E2E tests pass: python3 -m pytest cli_anything/<software>/tests/test_full_e2e.py -v
  • No test regressions — no previously passing tests were removed or weakened
  • registry.json entry is updated if version, description, or requirements changed

General Checklist

  • Code follows existing patterns and conventions
  • --json flag is supported on any new commands
  • Commit messages follow the conventional format (feat:, fix:, docs:, test:)
  • I have tested my changes locally

Test Results

<paste test output here>

Copilot AI review requested due to automatic review settings April 29, 2026 12:04
@github-actions (bot) added the existing-cli-fix and cli-anything-skill labels on Apr 29, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 77c4ae6202

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread libreoffice/agent-harness/cli_anything/libreoffice/core/importer.py
Comment thread libreoffice/agent-harness/cli_anything/libreoffice/core/importer.py Outdated
Contributor

Copilot AI left a comment

Pull request overview

Adds an import pipeline to the LibreOffice harness so existing ODF/Microsoft Office documents can be converted/parsed into the harness’s editable project JSON model, with corresponding CLI commands, docs, and tests.

Changes:

  • Introduces core/importer.py to import ODF directly and convert Office formats to ODF via LibreOffice headless before parsing into the project model.
  • Extends the CLI with document import, document import-formats, and enhances document open to import Office/ODF files (optionally saving to JSON).
  • Updates LibreOffice conversion runtime isolation (temp profile/XDG dirs) and adds unit/E2E coverage + documentation updates for import workflows.
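
The routing in the first bullet (ODF parsed directly, Office formats converted to ODF first) might look roughly like the sketch below. The extension tables and the `plan_import` name are hypothetical and chosen for illustration; the real lists live in core/importer.py and may differ.

```python
from pathlib import Path

# Hypothetical extension tables; the PR's actual supported formats may differ.
ODF_EXTENSIONS = {".odt", ".ods", ".odp"}
OFFICE_TO_ODF = {
    ".docx": ".odt", ".doc": ".odt",
    ".xlsx": ".ods", ".xls": ".ods",
    ".pptx": ".odp", ".ppt": ".odp",
}


def plan_import(path: str) -> dict:
    """Decide how a file would be imported, without touching the file itself."""
    ext = Path(path).suffix.lower()
    if ext in ODF_EXTENSIONS:
        # ODF packages can be parsed into the project model directly.
        return {"route": "parse", "odf_ext": ext}
    if ext in OFFICE_TO_ODF:
        # Office formats are first converted to the matching ODF type.
        return {"route": "convert", "odf_ext": OFFICE_TO_ODF[ext]}
    raise ValueError(f"Unsupported import format: {ext or '(none)'}")
```

Keeping the decision separate from the conversion makes the routing testable without a LibreOffice install, which matches the checklist requirement that unit tests pass without the backend.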

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| libreoffice/agent-harness/setup.py | Updates package description to mention import support. |
| libreoffice/agent-harness/cli_anything/libreoffice/utils/lo_backend.py | Runs LibreOffice conversion with isolated temp profile/runtime/config/cache env. |
| libreoffice/agent-harness/cli_anything/libreoffice/core/importer.py | New importer: parse ODF content into project model; convert Office→ODF first. |
| libreoffice/agent-harness/cli_anything/libreoffice/libreoffice_cli.py | Adds document import / import-formats; enhances document open to import. |
| libreoffice/agent-harness/cli_anything/libreoffice/core/document.py | Improves invalid project file error to direct users to import. |
| libreoffice/agent-harness/cli_anything/libreoffice/tests/test_core.py | Adds unit tests for import formats + ODF import + conversion routing. |
| libreoffice/agent-harness/cli_anything/libreoffice/tests/test_full_e2e.py | Adds E2E coverage for importing existing Office docs and CLI open/import flow. |
| libreoffice/agent-harness/cli_anything/libreoffice/tests/TEST.md | Updates test inventory and results summary to include import tests. |
| libreoffice/agent-harness/cli_anything/libreoffice/skills/SKILL.md | Documents new import-related commands and recommended workflow. |
| libreoffice/agent-harness/cli_anything/libreoffice/README.md | Documents prerequisites and examples for importing existing files. |
| libreoffice/agent-harness/LIBREOFFICE.md | Updates architecture and adds import workflow example. |
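
For readers unfamiliar with the isolation mentioned for lo_backend.py, here is a minimal sketch of one way to run a headless conversion against a throwaway profile and XDG directories. The function names are hypothetical and this is not the PR's code; it assumes a `soffice` binary on PATH and relies only on LibreOffice's documented `--headless`, `--convert-to`, `--outdir`, and `-env:UserInstallation` options.

```python
import os
import subprocess
import tempfile
from pathlib import Path


def isolated_lo_env(base_dir: str) -> dict:
    """Copy os.environ and point the XDG config/cache/runtime dirs under base_dir."""
    env = dict(os.environ)
    for var, sub in (
        ("XDG_CONFIG_HOME", "config"),
        ("XDG_CACHE_HOME", "cache"),
        ("XDG_RUNTIME_DIR", "runtime"),
    ):
        target = Path(base_dir) / sub
        target.mkdir(mode=0o700, parents=True, exist_ok=True)
        env[var] = str(target)
    return env


def convert_command(src: str, out_dir: str, target: str, profile_dir: str) -> list:
    """Build the soffice argv; the profile is passed as a file:// URI."""
    profile_uri = Path(profile_dir).resolve().as_uri()
    return [
        "soffice",
        f"-env:UserInstallation={profile_uri}",
        "--headless",
        "--convert-to", target,
        "--outdir", out_dir,
        src,
    ]


def convert_isolated(src: str, out_dir: str, target: str = "odt", timeout: int = 120) -> None:
    """Run one conversion; the temp profile is deleted when the call returns."""
    with tempfile.TemporaryDirectory(prefix="lo-profile-") as tmp:
        cmd = convert_command(src, out_dir, target, tmp)
        subprocess.run(cmd, env=isolated_lo_env(tmp), check=True, timeout=timeout)
```

A per-call profile avoids lock-file clashes with a desktop LibreOffice session or with parallel test runs, at the cost of a slower first start for each conversion.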


Comment thread libreoffice/agent-harness/cli_anything/libreoffice/utils/lo_backend.py Outdated
Comment thread libreoffice/agent-harness/cli_anything/libreoffice/core/importer.py

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ed894b917c


Comment on lines +123 to +125
content_xml = parsed.get("content_xml")
if not content_xml:
    return project
[P2] Reject ODF packages missing content.xml

import_odf returns a freshly created empty project when content.xml is absent, so a structurally broken .odt/.ods/.odp can be reported as a successful import and then re-exported, silently discarding original document data. Since content.xml is the primary document payload, this should fail fast with a user-facing error instead of returning an empty document.

Useful? React with 👍 / 👎.
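
One possible shape for the fail-fast behaviour suggested above is sketched here. `read_content_xml` is a hypothetical helper, not a function from the PR; it uses only the standard `zipfile` module and treats a missing or empty content.xml as an error instead of yielding an empty project.

```python
import zipfile


def read_content_xml(path: str) -> bytes:
    """Return the raw content.xml payload of an ODF package, or raise ValueError."""
    try:
        with zipfile.ZipFile(path) as archive:
            if "content.xml" not in archive.namelist():
                raise ValueError(f"Invalid ODF file (missing content.xml): {path}")
            payload = archive.read("content.xml")
    except zipfile.BadZipFile as e:
        raise ValueError(f"Invalid ODF file: {path}") from e
    if not payload:
        # An empty entry carries no document data, so treat it like a missing one.
        raise ValueError(f"Invalid ODF file (empty content.xml): {path}")
    return payload
```

Raising `ValueError` keeps this consistent with the existing `BadZipFile` handling quoted in the next comment, so a CLI error handler that already maps `ValueError` to a clean message would cover both cases.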

Comment on lines +110 to +113
try:
    parsed = parse_odf(path)
except zipfile.BadZipFile as e:
    raise ValueError(f"Invalid ODF file: {path}") from e
[P2] Catch non-UTF8 XML decode errors in import pipeline

import_odf only wraps zipfile.BadZipFile, but parse_odf can also raise UnicodeDecodeError when XML entries are not UTF-8 decodable (e.g., malformed/corrupt archives). That exception is not handled by handle_error, so document import/open can terminate with a traceback instead of a clean CLI error, which is a user-visible robustness regression for file import.

Useful? React with 👍 / 👎.
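
A minimal sketch of the broader exception handling this comment asks for follows. `parse_odf_stub` is a hypothetical stand-in for the harness's `parse_odf` (whose real behaviour is not shown in this thread), and `import_odf_safely` is likewise an illustrative name.

```python
import zipfile


def parse_odf_stub(path: str) -> dict:
    """Hypothetical stand-in for parse_odf: read and decode content.xml."""
    with zipfile.ZipFile(path) as archive:
        # archive.read raises KeyError if the entry does not exist.
        return {"content_xml": archive.read("content.xml").decode("utf-8")}


def import_odf_safely(path: str) -> dict:
    """Map archive-level failures to ValueError so the CLI can report them cleanly."""
    try:
        return parse_odf_stub(path)
    except (zipfile.BadZipFile, UnicodeDecodeError, KeyError) as e:
        raise ValueError(f"Invalid ODF file: {path}") from e
```

Chaining with `from e` preserves the original exception for debugging while the user sees a single, consistent error type.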

@yuh-yang yuh-yang merged commit 0c92c54 into main Apr 30, 2026
2 checks passed

Labels

  • cli-anything-skill: Changes CLI-Anything plugin or skill files
  • existing-cli-fix: Fixes or improves an existing CLI harness


2 participants