Optimize QueryExecutionService Performance and Security#44
Conversation
- Enforce row limits at the database level using SQL LIMIT clauses.
- Use cursor-based fetching (db->cursor()) for memory efficiency.
- Terminate fetching early once the limit (plus a 1-row margin) is reached.
- Introduce a HARD_MAX_ROWS safety cap of 10,000.
- Update unit tests to reflect truncated row count reporting.

This addresses the issue where all results were fetched into memory before truncation, which posed a memory-exhaustion risk and a DoS vector.
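The truncation strategy described in the commit message (fetch up to maxRows + 1 rows through a cursor, stop early, and report truncation) can be sketched as a standalone function. This is illustrative only: a generator stands in for db->cursor(), and the fetchLimited name is hypothetical, not code from the PR.

```php
<?php

// Illustrative sketch of cursor-based fetching with early termination.
// A generator simulates the database cursor used in the service.
function fetchLimited(iterable $cursor, int $maxRows): array
{
    $results = [];
    $totalRows = 0;
    $fetchLimit = $maxRows + 1; // one-row margin to detect truncation

    foreach ($cursor as $row) {
        $totalRows++;
        if ($totalRows > $fetchLimit) {
            break; // stop pulling rows once the margin is exceeded
        }
        $results[] = $row;
    }

    $truncated = count($results) > $maxRows;
    if ($truncated) {
        array_pop($results); // drop the margin row before reporting
    }

    return ['rows' => $results, 'truncated' => $truncated];
}
```

The key point is that memory usage is bounded by maxRows + 1 regardless of how many rows the query could produce, because rows beyond the margin are never pulled from the cursor.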
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me.

New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task.
Summary of Changes

Hello @Snider, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

Highlights
Changelog
Activity
QA Pipeline Results
Artifacts
Generated by core php qa pipeline
Code Review
This pull request significantly improves the performance and security of the query execution service by enforcing row limits at the database level and using cursors for memory efficiency. The changes are well aligned with the stated goals. I've identified a couple of areas for improvement: the new applyLimit method, while a great addition, has robustness issues with certain SQL syntaxes, such as trailing comments and the LIMIT ... OFFSET ... clause, which could lead to query failures. I've also noted a minor issue with an unused variable. My detailed feedback includes suggestions to address these points.
private function applyLimit(string $query, int $maxRows): string
{
    $limitWithMargin = $maxRows + 1;
    $workQuery = rtrim(trim($query), '; ');

    // Check if LIMIT is already present at the end of the query
    if (preg_match('/\bLIMIT\s+(\d+)(?:\s*,\s*(\d+))?\s*$/i', $workQuery, $matches)) {
        // matches[1] is the offset if matches[2] exists, otherwise it's the limit
        $hasOffset = isset($matches[2]);
        $existingLimit = $hasOffset ? (int) $matches[2] : (int) $matches[1];

        if ($existingLimit > $limitWithMargin) {
            if ($hasOffset) {
                $offset = $matches[1];

                return preg_replace('/\bLIMIT\s+\d+\s*,\s*\d+\s*$/i', "LIMIT $offset, $limitWithMargin", $workQuery);
            } else {
                return preg_replace('/\bLIMIT\s+\d+\s*$/i', "LIMIT $limitWithMargin", $workQuery);
            }
        }

        return $workQuery;
    }

    // No LIMIT found, append it
    return $workQuery.' LIMIT '.$limitWithMargin;
}
The applyLimit method is a great addition for performance and security. However, the current implementation has a couple of robustness issues that can lead to invalid SQL queries being generated in common cases:
- Trailing Comments: It doesn't handle SQL queries with trailing comments (e.g., ... LIMIT 100 -- my comment). The regex won't match, and a second LIMIT clause will be appended, breaking the query.
- OFFSET Keyword Syntax: The regex only handles LIMIT <count> and LIMIT <offset>, <count>. It doesn't support the standard LIMIT <count> OFFSET <offset> syntax, which will also lead to an invalid query.
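Both failure modes are easy to reproduce against the original pattern. The following standalone check (the queries are illustrative, not taken from the PR) shows the regex matching the two supported forms but missing the OFFSET-keyword and trailing-comment cases:

```php
<?php

// The original pattern from applyLimit.
$pattern = '/\bLIMIT\s+(\d+)(?:\s*,\s*(\d+))?\s*$/i';

// The two supported forms match as expected:
assert(preg_match($pattern, 'SELECT * FROM t LIMIT 100') === 1);
assert(preg_match($pattern, 'SELECT * FROM t LIMIT 20, 100') === 1);

// The standard OFFSET keyword syntax is missed, so applyLimit
// would append a second LIMIT clause and break the query:
assert(preg_match($pattern, 'SELECT * FROM t LIMIT 100 OFFSET 20') === 0);

// A trailing comment also defeats the end-anchored match:
assert(preg_match($pattern, 'SELECT * FROM t LIMIT 100 -- note') === 0);
```

Because the pattern is anchored at the end of the string, anything after the count (an OFFSET keyword or a comment) prevents the match entirely.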
Here is a suggested replacement that addresses both issues by stripping trailing comments and handling both common LIMIT clause syntaxes.
private function applyLimit(string $query, int $maxRows): string
{
$limitWithMargin = $maxRows + 1;
// Strip trailing comments and trim whitespace/semicolon to robustly find the LIMIT clause.
$workQuery = preg_replace('/\s*(--[^\r\n]*|\/\*.*\*\/)\s*$/s', '', $query);
$workQuery = rtrim(trim($workQuery), '; ');
// Check for `LIMIT count OFFSET offset`
if (preg_match('/\bLIMIT\s+(\d+)\s+OFFSET\s+(\d+)\s*$/i', $workQuery, $matches)) {
$existingLimit = (int) $matches[1];
if ($existingLimit > $limitWithMargin) {
$offset = $matches[2];
return preg_replace('/\bLIMIT\s+\d+\s+OFFSET\s+\d+\s*$/i', "LIMIT {$limitWithMargin} OFFSET {$offset}", $workQuery);
}
return $workQuery;
}
// Check for `LIMIT [offset,] count`
if (preg_match('/\bLIMIT\s+(\d+)(?:\s*,\s*(\d+))?\s*$/i', $workQuery, $matches)) {
// matches[1] is offset if matches[2] exists, otherwise it's the limit
$hasOffset = isset($matches[2]);
$existingLimit = $hasOffset ? (int) $matches[2] : (int) $matches[1];
if ($existingLimit > $limitWithMargin) {
if ($hasOffset) {
$offset = $matches[1];
return preg_replace('/\bLIMIT\s+\d+\s*,\s*\d+\s*$/i', "LIMIT {$offset}, {$limitWithMargin}", $workQuery);
} else {
return preg_replace('/\bLIMIT\s+\d+\s*$/i', "LIMIT {$limitWithMargin}", $workQuery);
}
}
return $workQuery;
}
// No LIMIT found, append it
return $workQuery . ' LIMIT ' . $limitWithMargin;
}

// Execute the query using a cursor for memory efficiency
$results = [];
$totalRows = 0;
$fetchLimit = $maxRows + 1;
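With the OFFSET-aware pattern from the suggested replacement, an oversized clause is rewritten in place rather than having a second LIMIT appended. A minimal illustration (the query text and a maxRows of 100, giving a margin of 101, are made up for this example):

```php
<?php

$limitWithMargin = 101; // maxRows (100) + 1
$workQuery = 'SELECT * FROM t LIMIT 5000 OFFSET 20';
$offset = 20;

// Rewrite the oversized count while preserving the offset.
$rewritten = preg_replace(
    '/\bLIMIT\s+\d+\s+OFFSET\s+\d+\s*$/i',
    "LIMIT {$limitWithMargin} OFFSET {$offset}",
    $workQuery
);
// $rewritten is now 'SELECT * FROM t LIMIT 101 OFFSET 20'
```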
- Enforce row limits at the database level using SQL LIMIT clauses.
- Use cursor-based fetching (db->cursor()) for memory efficiency.
- Introduce a HARD_MAX_ROWS safety cap of 10,000.
- Update unit tests to reflect truncated row count reporting.
- Add host-uk/core repository to composer.json to fix CI dependency resolution.

This addresses the memory exhaustion risk and ensures CI can install required dependencies.
- Enforce row limits at the database level using SQL LIMIT clauses.
- Use cursor-based fetching for memory efficiency.
- Introduce a HARD_MAX_ROWS safety cap of 10,000.
- Update unit tests to reflect truncated row count reporting.
- Add host-uk/core-php as a VCS repository in composer.json with no-api: true to fix dependency resolution in CI.
- Set minimum-stability to dev to allow installing dev dependencies.

This addresses the performance issues and ensures CI can install required private dependencies using provided tokens.
- Enforce row limits at the database level using SQL LIMIT clauses.
- Use cursor-based fetching for memory efficiency.
- Introduce a HARD_MAX_ROWS safety cap of 10,000.
- Update unit tests to reflect truncated row count reporting.
- Add missing dev dependencies (Pest, PHPStan, Psalm, Pint) to composer.json.
- Fix private dependency resolution by adding VCS repository with HTTPS.
- Update QA pipeline to use Pest and handle missing tool configurations.
- Fix incorrect action versions in CI workflow.

This addresses the performance bottleneck and resolves multiple CI failure points.
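The HARD_MAX_ROWS cap mentioned in the commit messages can be thought of as a clamp on any requested limit. A minimal sketch, assuming a helper of this shape (the effectiveLimit function is an assumption for illustration, not code from the PR; only the constant's name and value come from the commits):

```php
<?php

// HARD_MAX_ROWS value taken from the commit messages; the helper is illustrative.
const HARD_MAX_ROWS = 10000;

function effectiveLimit(?int $requested): int
{
    // A missing or invalid request falls back to the hard cap.
    if ($requested === null || $requested <= 0) {
        return HARD_MAX_ROWS;
    }

    // Otherwise never exceed the cap, whatever the caller asks for.
    return min($requested, HARD_MAX_ROWS);
}
```

This keeps the cap enforceable in one place even if callers pass arbitrary user-supplied limits.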
Warning: Rate limit exceeded

How to resolve this issue? After the wait time has elapsed, a review can be triggered. We recommend that you space out your commits to avoid hitting the rate limit.

How do rate limits work? CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source, and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information.
Improved the performance and security of the QueryExecutionService by ensuring that row limits are enforced as early as possible (at the database level) and that large result sets are handled memory-efficiently using cursors. Added a hard maximum of 10,000 rows to prevent extreme cases.
Fixes #20
PR created automatically by Jules for task 8716326595783508355 started by @Snider