
fix(gitea): return repo settings as bytes so .pr_agent.toml loads#2435

Open
IsmaelMartinez wants to merge 1 commit into The-PR-Agent:main from IsmaelMartinez:fix/gitea-get-file-content-bytes

IsmaelMartinez commented Jun 8, 2026

PR Summary by Qodo

Fix Gitea repo settings to return bytes so .pr_agent.toml loads
🐞 Bug fix 🧪 Tests 🕐 10-20 Minutes


Walkthroughs

User Description

What

GiteaProvider.get_repo_settings() returned a str, but utils.apply_repo_settings() writes the value with os.write(fd, repo_settings) and handle_configurations_errors() later calls .decode() on it — both require bytes. This is the contract the GitHub (decoded_content), GitLab (python-gitlab .decode()) and Bitbucket (response.text.encode('utf-8')) providers already follow.

As a result, loading a repo-level .pr_agent.toml on Gitea failed with "a bytes-like object is required, not 'str'", and the error handler then crashed in turn with "'str' object has no attribute 'decode'", so repo-level configuration was effectively unusable.
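
The two failures described above can be reproduced outside pr-agent. This is a minimal sketch, not the project's code: `write_settings`, `error_name`, and the TOML text are hypothetical names and data chosen for illustration; only `os.write()` and `.decode()` come from the description.

```python
import os
import tempfile

# Hypothetical settings text; any TOML content behaves the same way.
settings_as_str = "[config]\nverbosity_level = 1\n"


def write_settings(content):
    """Write content to a temp file with os.write() and return the stored bytes."""
    fd, path = tempfile.mkstemp(suffix=".toml")
    try:
        try:
            os.write(fd, content)  # raises TypeError when content is a str
        finally:
            os.close(fd)
        with open(path, "rb") as handle:
            return handle.read()
    finally:
        os.remove(path)


def error_name(func, *args):
    """Return the name of the exception raised by func(*args), or None."""
    try:
        func(*args)
    except Exception as exc:
        return type(exc).__name__
    return None


print(error_name(write_settings, settings_as_str))                  # TypeError
print(error_name(lambda s: s.decode(), settings_as_str))            # AttributeError
print(error_name(write_settings, settings_as_str.encode("utf-8")))  # None
```

The first call corresponds to the write failure and the second to the follow-up crash in the error handler; the third shows that the same text, once encoded, passes through `os.write()` without error.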

Fix

Encode the content to bytes in get_repo_settings() before returning, mirroring the Bitbucket provider (which also starts from a str). RepoApi.get_file_content() is intentionally left returning str, since its other callers (diff and language analysis) legitimately consume text.
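
As a rough illustration of the shape of the fix, the sketch below re-encodes decoded text before returning it. `StubRepoApi` is a hypothetical stand-in with a simplified signature, and the free function `get_repo_settings` only approximates the provider method; neither is the actual pr-agent code.

```python
class StubRepoApi:
    """Hypothetical stand-in for RepoApi (simplified signature, assumption)."""

    def __init__(self, files):
        self._files = files  # maps path -> raw bytes

    def get_file_content(self, path: str) -> str:
        # As described in the PR: decode raw bytes to str, or "" when missing.
        raw = self._files.get(path)
        return raw.decode("utf-8") if raw else ""


def get_repo_settings(repo_api, settings_path: str = ".pr_agent.toml") -> bytes:
    """Return the settings file as bytes, the type os.write() and .decode() need."""
    response = repo_api.get_file_content(settings_path)
    return response.encode("utf-8")  # re-encode the decoded text


# Hypothetical file content, used only to exercise the sketch.
api = StubRepoApi({".pr_agent.toml": b"[config]\nverbosity_level = 1\n"})
settings = get_repo_settings(api)
print(type(settings).__name__)  # bytes
```

Keeping the encode step in `get_repo_settings()` leaves `get_file_content()` free to keep returning text for its other callers, which is the trade-off the description argues for.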

Tests

Added regression tests in tests/unittest/test_gitea_provider.py:

  • test_get_repo_settings_returns_bytes — asserts the return is bytes and round-trips through the exact os.write / .decode operations utils.py performs.
  • test_get_repo_settings_falsy_when_unset_or_missing — locks the falsy-on-missing behaviour shared with the other providers.
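
The two tests listed above might look roughly like the following. This is a sketch under assumptions: the free function `get_repo_settings` is a simplified stand-in for the provider method, and `make_provider` with its attribute names (`repo_settings`, `repo_api`) is hypothetical, not copied from tests/unittest/test_gitea_provider.py.

```python
import os
import tempfile
from unittest.mock import MagicMock


def get_repo_settings(provider) -> bytes:
    """Hypothetical, simplified copy of the method under test."""
    if not provider.repo_settings:
        return b""
    content = provider.repo_api.get_file_content(provider.repo_settings)
    return content.encode("utf-8") if content else b""


def make_provider(repo_settings, file_content):
    """Build a fake provider; attribute names are assumptions for this sketch."""
    provider = MagicMock()
    provider.repo_settings = repo_settings
    provider.repo_api.get_file_content.return_value = file_content
    return provider


def test_get_repo_settings_returns_bytes():
    toml_text = "[pr_description]\npublish_labels = false\n"
    result = get_repo_settings(make_provider(".pr_agent.toml", toml_text))
    assert isinstance(result, bytes)

    # Round-trip through the operations the description attributes to utils.py.
    fd, path = tempfile.mkstemp(suffix=".toml")
    try:
        os.write(fd, result)
    finally:
        os.close(fd)
        os.remove(path)
    assert result.decode() == toml_text


def test_get_repo_settings_falsy_when_unset_or_missing():
    # Only falsiness is asserted here, matching the description of the test.
    assert not get_repo_settings(make_provider(None, "ignored"))
    assert not get_repo_settings(make_provider(".pr_agent.toml", ""))
```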

Fixes #2347.

AI Description
• Fix Gitea provider to return repo settings as bytes, matching other providers.
• Prevent .pr_agent.toml loading failures caused by os.write/.decode bytes contract.
• Add regression tests for bytes return and falsy behavior when settings are missing.
Diagram
graph TD
  A[GiteaProvider] --> B["get_repo_settings()"] --> C[RepoApi] --> D["get_file_content()"] --> E["encode('utf-8')"] --> F["utils.apply_repo_settings()"]
  subgraph Legend
    direction LR
    _mod([Module/Class]) ~~~ _fn[Function] ~~~ _ext[[External/Helper]]
  end
High-Level Assessment

The following are alternative approaches to this PR:

1. Make utils.apply_repo_settings accept str as well as bytes
  • ➕ Avoids re-encoding in providers that return decoded text
  • ➕ Centralizes the type-handling logic in one place
  • ➖ Expands the utils contract and increases branching/complexity there
  • ➖ Diverges from the established cross-provider expectation that repo settings are bytes
2. Return empty bytes (b'') consistently on error/missing settings
  • ➕ Keeps the get_repo_settings() return type consistent (always bytes or falsy bytes)
  • ➕ Avoids mixed-type falsy returns ('' vs b'') in callers
  • ➖ Slight behavior change if any caller distinguishes '' from b''
  • ➖ Not strictly required to fix the reported crash if callers only check truthiness

Recommendation: The chosen approach (encode to bytes in GiteaProvider.get_repo_settings) is the best fit because it restores the existing provider contract relied upon by utils.apply_repo_settings and aligns Gitea with GitHub/GitLab/Bitbucket behavior. Consider a follow-up to return b'' (instead of '') on failure/missing cases to fully match the new bytes return annotation and avoid mixed-type falsy values.
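
For comparison, alternative 1 (accepting str as well as bytes on the utils side) could be sketched as a small normalising helper. `ensure_bytes` is a hypothetical name, not an existing pr-agent function, and the sketch does not show how `utils.apply_repo_settings()` would call it.

```python
from typing import Optional, Union


def ensure_bytes(repo_settings: Optional[Union[str, bytes, bytearray]]) -> bytes:
    """Hypothetical helper: normalise repo settings to bytes before os.write()."""
    if repo_settings is None:
        return b""
    if isinstance(repo_settings, str):
        return repo_settings.encode("utf-8")
    if isinstance(repo_settings, (bytes, bytearray)):
        return bytes(repo_settings)
    raise TypeError(f"unsupported settings type: {type(repo_settings).__name__}")


print(ensure_bytes("a = 1\n"))   # b'a = 1\n'
print(ensure_bytes(b"a = 1\n"))  # b'a = 1\n'
print(ensure_bytes(None))        # b''
```

The extra branches in this helper are the added complexity the assessment lists as a drawback of moving the type handling into utils.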


File Changes

Bug fix (1)
gitea_provider.py Encode repo settings content to bytes in get_repo_settings() +6/-2

Encode repo settings content to bytes in get_repo_settings()

• Changes GiteaProvider.get_repo_settings() to return bytes by UTF-8 encoding the decoded file content. Adds inline documentation explaining the bytes contract expected by utils.apply_repo_settings() and why re-encoding is required for Gitea.

pr_agent/git_providers/gitea_provider.py


Tests (1)
test_gitea_provider.py Add regression tests for Gitea repo settings bytes contract +44/-0

Add regression tests for Gitea repo settings bytes contract

• Adds unit tests asserting get_repo_settings() returns bytes and round-trips through decode, matching the downstream os.write/.decode usage. Adds coverage for falsy returns when repo settings are unset or the file content is empty.

tests/unittest/test_gitea_provider.py



qodo-free-for-open-source-projects Bot commented Jun 8, 2026


Code Review by Qodo

🐞 Bugs (1) 📘 Rule violations (0) 📎 Requirement gaps (1)


Action required

1. get_repo_settings() assumes str content 📎 Requirement gap ☼ Reliability ⭐ New
Description
get_repo_settings() unconditionally calls response.encode('utf-8'), which will raise
AttributeError if get_file_content() (or a mock/SDK change) returns bytes. This violates the
requirement to accept both str and bytes repo settings content for Gitea without type errors.
Code

pr_agent/git_providers/gitea_provider.py[R624-628]

+        # utils.apply_repo_settings() writes this via os.write() and later
+        # calls .decode() on it, so it must be bytes to match the GitHub/
+        # GitLab/Bitbucket contract. get_file_content() decodes the raw bytes
+        # to str, so re-encode here (see issue #2347).
+        return response.encode('utf-8')
Evidence
PR Compliance ID 6 requires the Gitea repo settings loading path to handle file contents returned as
either str or bytes. The added code at get_repo_settings() always encodes the response, which
only works for str and will fail for bytes, violating the rule’s acceptance criteria.

Gitea provider repo settings loading must accept both str and bytes file content
pr_agent/git_providers/gitea_provider.py[624-628]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`GiteaProvider.get_repo_settings()` must accept repository settings content returned as either `str` or `bytes`. The current implementation always does `response.encode('utf-8')`, which will crash if `response` is already `bytes`.

## Issue Context
Compliance requires Gitea repo settings loading to be robust to `str`/`bytes` variations to avoid `os.write()`/`.decode()` type errors.

## Fix Focus Areas
- pr_agent/git_providers/gitea_provider.py[624-628]




Remediation recommended

2. Bytes annotation returns str 🐞 Bug ≡ Correctness
Description
GiteaProvider.get_repo_settings() is annotated as returning bytes but still returns "" (str) when
repo_settings is unset or the file content is empty/missing. This violates the bytes contract relied
on by utils.apply_repo_settings()/handle_configurations_errors and can reintroduce type errors for
any caller that treats the return value as bytes based on the annotation/contract.
Code

pr_agent/git_providers/gitea_provider.py[R608-611]

+    def get_repo_settings(self) -> bytes:
         """Get repository settings"""
         if not self.repo_settings:
             self.logger.error("Repository settings not found")
Evidence
The Gitea provider method is annotated to return bytes but returns an empty string in two branches;
meanwhile the settings application and error-reporting paths operate on the assumption that repo
settings are bytes when present/used.

pr_agent/git_providers/gitea_provider.py[608-623]
pr_agent/git_providers/utils.py[274-345]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`GiteaProvider.get_repo_settings()` now declares `-> bytes`, but it returns `""` (a `str`) in the early-return branches (unset settings path / missing file). This breaks the function's declared contract and the repo-settings contract expected by `utils.apply_repo_settings()` / `handle_configurations_errors()`.
## Issue Context
- `apply_repo_settings()` writes repo settings via `os.write(fd, repo_settings)` (bytes-only) and `handle_configurations_errors()` does `err['settings'].decode()` (bytes-only).
- The PR correctly encodes the success path, but the error/missing paths still return `str`.
## Fix Focus Areas
- pr_agent/git_providers/gitea_provider.py[608-623]


Previous review results

Review updated until commit b4b61be

Results up to commit a22c23f


🐞 Bugs (1) 📘 Rule violations (0) 📎 Requirement gaps (0)

Remediation recommended
1. Bytes annotation returns str 🐞 Bug ≡ Correctness


github-actions Bot added the bug label Jun 8, 2026
GiteaProvider.get_repo_settings() returned a str, but
utils.apply_repo_settings() writes the value with os.write() and later
calls .decode() on it, both of which require bytes — the contract the
GitHub, GitLab and Bitbucket providers already follow. On Gitea this
raised "a bytes-like object is required, not 'str'" and broke
repo-level .pr_agent.toml loading. Encode the content before returning,
mirroring the Bitbucket provider, and add regression tests for the
bytes contract and the empty cases. See The-PR-Agent#2347.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
IsmaelMartinez force-pushed the fix/gitea-get-file-content-bytes branch from a22c23f to b4b61be on June 9, 2026 05:46

qodo-free-for-open-source-projects Bot commented Jun 9, 2026


Code review by qodo was updated up to the latest commit b4b61be

@IsmaelMartinez (Author)

Good catch — fixed. The two early-return branches now return b"" so every code path honours the -> bytes annotation and the bytes contract that utils.apply_repo_settings / handle_configurations_errors rely on, not just the success path. I also tightened the regression test to assert the unset/missing cases return b"". Pushed in b4b61be.
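
The distinction between "" and b"" matters only to callers that go beyond a truthiness check, as the short sketch below suggests. `can_write` is a hypothetical helper written for this illustration; whether any pr-agent caller actually reaches `os.write()` with an empty value is not established here.

```python
import os
import tempfile


def can_write(value) -> bool:
    """Return True if os.write() accepts value, False if it raises TypeError."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, value)
        return True
    except TypeError:
        return False
    finally:
        os.close(fd)
        os.remove(path)


# Both values are falsy, so a plain `if not settings:` check treats them alike.
print(bool(""), bool(b""))             # False False
# Only the bytes value satisfies the operations that require bytes.
print(can_write(""), can_write(b""))   # False True
print(hasattr("", "decode"), hasattr(b"", "decode"))  # False True
```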


@IsmaelMartinez (Author)

Two findings on the latest commit (b4b61be):

1. get_repo_settings() "assumes str" / encode could fail on bytes: RepoApi.get_file_content() always returns str. Every path is raw_data.decode('utf-8') or "", so response is never bytes and .encode('utf-8') can't raise. Issue #2347 asks for get_repo_settings() to return bytes to match the GitHub/GitLab/Bitbucket contract, not for the loader to accept both str and bytes. I'd rather not add a branch for an input type the function can't produce; if get_file_content() ever returns bytes, that's where the change would belong.

2. "Bytes annotation returns str" — resolved in b4b61be: both early-return branches now return b"" (the finding links to the earlier a22c23f6). All three paths return bytes, and the regression test asserts the unset/missing cases return b"".

@IsmaelMartinez (Author)

I've opened a few small contributions recently, including this fix and a regression test on PR #2258. I'd genuinely appreciate it if the maintainers could let me know whether these fit the way the team likes to work, and I'm very happy to adjust them or change my approach if you'd prefer something different.

Successfully merging this pull request may close these issues.

Fix: Gitea provider returns str instead of bytes, causing repo settings loading failure
