Add source provenance and publish-scope preflight checks #103

Description

@joshka

Parent: #100

Related Gitcrawl context: openclaw/gitcrawl#81

Observed Workflow

Discrawl supports both Discord bot API sync and local Discord Desktop cache import. That combination is useful, but the source mix can matter for metadata quality.

During initial use, a combined sync path refreshed messages from wiretap but left guild metadata too incomplete for public/private publish classification. A later discrawl sync --source discord restored the role/visibility metadata that publish --public-only checks need.

Current Workaround

The workaround was to remember that privacy-sensitive publish checks need bot-sourced guild metadata and to run a Discord-only repair sync before repeating the publish check.

That is easy to miss, especially when discrawl sync defaults to a combined source.

Request

Add source/provenance and publish-scope preflight diagnostics.

Example shape:

discrawl doctor --publish-scope --json
discrawl publish --public-only --check
discrawl status --sources --json

Useful Output

  • whether guild metadata came from bot sync, wiretap import, snapshot import, or an incomplete local cache record
  • whether role/private-channel metadata is present enough for public/private filtering
  • whether publish --public-only is likely to export zero rows because metadata is incomplete
  • suggested repair command, such as discrawl sync --source discord, when bot metadata is required
  • clear warning when only local desktop cache data is available
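
One possible shape for such a report, as a minimal sketch only. Every name here (GuildScopeReport, build_report, the provenance labels, the JSON field names) is hypothetical and does not describe discrawl's existing schema or output. It also assumes that a wiretap import counts as local desktop cache data.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Optional

# Hypothetical provenance labels mirroring the list above; not discrawl's real schema.
BOT_SYNC = "bot_sync"
WIRETAP_IMPORT = "wiretap_import"
SNAPSHOT_IMPORT = "snapshot_import"
INCOMPLETE_CACHE = "incomplete_local_cache"

# The repair command is the one named in this issue.
REPAIR_COMMAND = "discrawl sync --source discord"


@dataclass
class GuildScopeReport:
    guild_id: str
    metadata_source: str
    has_role_metadata: bool
    has_channel_visibility: bool
    filtering_ready: bool
    warnings: List[str]
    suggested_repair: Optional[str]


def build_report(guild_id: str, metadata_source: str,
                 has_role_metadata: bool,
                 has_channel_visibility: bool) -> GuildScopeReport:
    """Summarize whether one guild has enough metadata for public/private filtering."""
    warnings: List[str] = []
    ready = has_role_metadata and has_channel_visibility
    # Assumption: wiretap imports and incomplete cache records are both
    # "local desktop cache only" for the purpose of this warning.
    if metadata_source in (WIRETAP_IMPORT, INCOMPLETE_CACHE):
        warnings.append("only local desktop cache data is available")
    if not ready:
        warnings.append(
            "publish --public-only may export zero rows because metadata is incomplete"
        )
    return GuildScopeReport(
        guild_id=guild_id,
        metadata_source=metadata_source,
        has_role_metadata=has_role_metadata,
        has_channel_visibility=has_channel_visibility,
        filtering_ready=ready,
        warnings=warnings,
        suggested_repair=None if ready else REPAIR_COMMAND,
    )


def to_json(report: GuildScopeReport) -> str:
    """Serialize the report with stable key order for machine consumers."""
    return json.dumps(asdict(report), indent=2, sort_keys=True)
```

An agent or script could call to_json(build_report(...)) for each guild and branch on filtering_ready and suggested_repair instead of parsing human-readable text.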

Why This Matters

The use case is not just "can I search locally?" It is maintainer archive work done locally, where private/public boundaries matter. A user or Codex agent should be able to tell whether the archive has enough provenance and metadata to make a publish-scope decision.

Acceptance Criteria

  • doctor --json or a dedicated check reports metadata quality relevant to public/private filtering.
  • publish --public-only --check can validate scope without writing a snapshot.
  • The diagnostic recommends a concrete repair step when bot-sourced metadata is missing.
  • The output distinguishes missing metadata from an intentionally empty public export.
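
The last criterion can be illustrated with a small decision function. This is a sketch under stated assumptions, not a proposed implementation: CheckOutcome, classify_public_only_check, the exit codes, and the two inputs (a metadata-completeness flag and a count of rows that pass the public filter) are all hypothetical.

```python
from enum import Enum
from typing import Optional


class CheckOutcome(Enum):
    READY = "ready"                        # metadata complete, rows to export
    EMPTY_BY_DESIGN = "empty_by_design"    # metadata complete, nothing is public
    METADATA_MISSING = "metadata_missing"  # the row count cannot be trusted


def classify_public_only_check(metadata_complete: bool,
                               public_row_count: int) -> CheckOutcome:
    """Classify a dry-run publish check without writing a snapshot."""
    if not metadata_complete:
        # Checked first: with incomplete metadata a count of zero (or any
        # other count) says nothing about what is actually public.
        return CheckOutcome.METADATA_MISSING
    if public_row_count == 0:
        return CheckOutcome.EMPTY_BY_DESIGN
    return CheckOutcome.READY


def repair_hint(outcome: CheckOutcome) -> Optional[str]:
    """Return the repair command named in this issue when metadata is missing."""
    if outcome is CheckOutcome.METADATA_MISSING:
        return "discrawl sync --source discord"
    return None


def exit_code(outcome: CheckOutcome) -> int:
    """Hypothetical exit codes: 0 for a trustworthy result, 2 for missing metadata."""
    return 2 if outcome is CheckOutcome.METADATA_MISSING else 0
```

Checking metadata completeness before the row count is the point of the sketch: an empty export is only reported as intentional once the filter inputs are known to be trustworthy.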

Prepared with Codex, confirmed as accurate by human.

Labels

  • P2: Normal priority bug or improvement with limited blast radius.
  • clawsweeper:needs-maintainer-review: ClawSweeper marked this issue as needing maintainer review before automation.
  • clawsweeper:needs-product-decision: ClawSweeper marked this issue as needing a product or behavior decision.
  • clawsweeper:needs-security-review: ClawSweeper marked this issue as needing security-sensitive review.
  • clawsweeper:no-new-fix-pr: ClawSweeper does not recommend queueing a new automated fix PR for this issue.
  • impact:security: This issue is about security boundaries, credentials, authz, sandboxing, or sensitive data.
  • issue-rating: 🌊 off-meta tidepool: Issue quality rating does not apply to this item.
