feat: Add wp search-replace file subcommand for SQL file processing #235

Open
AlextheYounga wants to merge 2 commits into wp-cli:main from
AlextheYounga:feat/alexy/file-search-replace

Conversation

@AlextheYounga

This is an attempt to introduce the go-search-replace logic requested in #137 by incorporating the PHP port from php-search-replace.

I quickly realized the two approaches are fundamentally different: go-search-replace is designed for raw SQL, while WP-CLI's search-replace is designed to work with typed PHP values. Trying to combine the two would be futile.

But I think this would still be an extremely useful addition to the search-replace library. Although this library has been incredibly handy as is, I also can't tell you how many times I needed this exact feature and had to download Go to solve this problem. I hope you will agree; I feel like it should live here.

This change introduces a new file subcommand that performs search/replace directly on SQL dump files (or streams) using a WP-CLI-compliant port of Automattic's go-search-replace algorithm. The original WP-CLI search-replace logic is untouched.

🫡

Usage

wp search-replace file <old> <new> [<input.sql> [<output.sql>]]
wp search-replace file --old=<old> --new=<new> input.sql output.sql
wp search-replace file old new --in-place dump.sql
cat dump.sql | wp search-replace file old new - -

Supported flags:
  --old, --new    Alternative to positional arguments (for strings
                  starting with '--')
  --in-place      Edit the input file in place
  --dry-run       Preview changes without writing output
  --verbose       Show per-line processing information

Testing

The tests added here come from php-search-replace, which in turn ported them from the original go-search-replace repository.

Refs

This change migrates tests from the php-search-replace repository
for testing various file edge cases. Using actual SQL files proved
necessary to avoid oddities in how PHP handles strings. There may be
a better way around that problem, but I never found one, and testing
against actual SQL files seems better anyway, since that exercises
the exact behavior of the command.

Introduces a new `file` subcommand that performs search/replace directly
on SQL dump files (or streams) using a WP-CLI-compliant port of
Automattic's go-search-replace algorithm.
This complements the existing database-centric `search-replace` command
by providing a text-level engine that correctly handles serialized PHP
strings and updates their length markers — including when the search
string appears as an array key.
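
To illustrate why the length markers matter, here is a minimal sketch. It is not the PR's algorithm: `replace_in_serialized()` is a hypothetical helper that uses a deliberately simple regex, ignores SQL escaping, and would break on strings that themselves contain `";`. It only shows that a plain text replacement leaves a stale `s:N` marker, while a length-aware replacement keeps the value readable by `unserialize()`.

```php
<?php
/**
 * Hypothetical helper (illustration only): replace $from with $to inside
 * every serialized string and recompute that string's s:N length marker.
 */
function replace_in_serialized( string $serialized, string $from, string $to ): string {
	return (string) preg_replace_callback(
		'/s:(\d+):"(.*?)";/s',
		function ( array $m ) use ( $from, $to ) {
			$replaced = str_replace( $from, $to, $m[2] );
			// strlen() counts bytes, which is what PHP's serialize format stores.
			return sprintf( 's:%d:"%s";', strlen( $replaced ), $replaced );
		},
		$serialized
	);
}

$serialized = serialize( array( 'siteurl' => 'http://old.test' ) );

// Naive replacement keeps the old s:15 marker, so PHP rejects the result.
$naive = str_replace( 'http://old.test', 'https://new.example', $serialized );
var_dump( @unserialize( $naive ) ); // bool(false)

// Length-aware replacement updates the marker from 15 to 19.
$fixed = replace_in_serialized( $serialized, 'http://old.test', 'https://new.example' );
echo $fixed, "\n"; // a:1:{s:7:"siteurl";s:19:"https://new.example";}
```

The same callback also runs for array keys, since keys are stored as `s:N:"...";` entries too.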

Usage:
  wp search-replace file <old> <new> [<input.sql> [<output.sql>]]
  wp search-replace file --old=<old> --new=<new> input.sql output.sql
  wp search-replace file old new --in-place dump.sql
  cat dump.sql | wp search-replace file old new - -

Supported flags:
  --old, --new    Alternative to positional arguments (for strings
                  starting with '--')
  --in-place      Edit the input file in place
  --dry-run       Preview changes without writing output
  --verbose       Show per-line processing information

The implementation follows existing project conventions:
- `FileSearchReplacer` and `Serialized_Replace_Result` live under the
  `WP_CLI` namespace with proper PHPCS exclusions in phpcs.xml.dist
- All error handling uses exceptions (CLI layer converts to WP_CLI::error)
- Full `composer test` passes (production code is zero-warning)
- 19 unit tests pass, including large fixture parity tests against
  the original go-search-replace binary

Refs:
  https://github.com/AlextheYounga/php-search-replace
  https://github.com/Automattic/go-search-replace
@AlextheYounga requested a review from a team as a code owner on May 10, 2026 01:21
@github-actions
Contributor

Hello! 👋

Thanks for opening this pull request! Please check out our contributing guidelines. We appreciate you taking the initiative to contribute to this project.

Contributing isn't limited to just code. We encourage you to contribute in the way that best fits your abilities, by writing tutorials, giving a demo at your local meetup, helping other users with their support questions, or revising our documentation.

Here are some useful Composer commands to get you started:

  • composer install: Install dependencies.
  • composer test: Run the full test suite.
  • composer phpcs: Check for code style violations.
  • composer phpcbf: Automatically fix code style violations.
  • composer phpunit: Run unit tests.
  • composer behat: Run behavior-driven tests.

To run a single Behat test, you can use the following command:

# Run all tests in a single file
composer behat features/some-feature.feature

# Run only a specific scenario (where 123 is the line number of the "Scenario:" title)
composer behat features/some-feature.feature:123

You can find a list of all available Behat steps in our handbook.

@github-actions bot added the command:search-replace, scope:distribution, and scope:testing labels on May 10, 2026

@gemini-code-assist bot left a comment

Code Review

This pull request introduces a new "wp search-replace file" command that performs search and replace operations directly on SQL files, including correct handling of serialized PHP strings by porting the go-search-replace algorithm. Critical issues were identified regarding potential data loss when using the --in-place flag due to file truncation during simultaneous read/write operations. Additionally, a typo in the NUL character mapping was found, along with missing SQL escape sequences and opportunities to improve performance and reduce code duplication by refactoring the processing loops and normalization logic.

	 */
	private function do_replace( \WP_CLI\FileSearchReplacer $replacer, string $input_file, string $output_file, array $replacements, bool $verbose ): void {
		$input_handle = $this->open_input( $input_file );
		$output_handle = $this->open_output( $output_file );

critical

Opening the same file for writing with 'wb' while it is still being read will truncate the file to zero bytes immediately. This will result in total data loss when the --in-place flag is used. To fix this, you should write the output to a temporary file and then rename it to the original filename after the processing loop is finished.
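
A minimal sketch of the temp-file approach suggested above, under stated assumptions: `rewrite_in_place()` and its `$transform` callback are hypothetical names, not part of the PR, and the temporary file is created in the target's directory so that `rename()` does not have to cross filesystems.

```php
<?php
/**
 * Hypothetical helper: rewrite $path line by line without truncating it
 * before it has been read. $transform maps one input line to one output line.
 */
function rewrite_in_place( string $path, callable $transform ): void {
	$input = fopen( $path, 'rb' );
	if ( false === $input ) {
		throw new RuntimeException( sprintf( 'Unable to open "%s" for reading.', $path ) );
	}

	// Write to a sibling temp file instead of opening $path with 'wb'.
	$tmp_path = tempnam( dirname( $path ), 'wpsr' );
	$output   = false === $tmp_path ? false : fopen( $tmp_path, 'wb' );
	if ( false === $output ) {
		fclose( $input );
		throw new RuntimeException( 'Unable to create a temporary output file.' );
	}

	try {
		while ( true ) {
			$line = fgets( $input );
			if ( false === $line ) {
				break;
			}
			if ( false === fwrite( $output, $transform( $line ) ) ) {
				throw new RuntimeException( 'Write to temporary file failed.' );
			}
		}
	} catch ( Throwable $e ) {
		// Leave the original untouched and clean up on any failure.
		fclose( $input );
		fclose( $output );
		unlink( $tmp_path );
		throw $e;
	}

	fclose( $input );
	fclose( $output );

	// Replace the original only after the whole file was processed.
	if ( ! rename( $tmp_path, $path ) ) {
		unlink( $tmp_path );
		throw new RuntimeException( sprintf( 'Unable to replace "%s".', $path ) );
	}
}
```

One caveat with this sketch: the file created by `tempnam()` may not carry the original file's permissions, so a real implementation would probably want to copy them before the rename.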

Author

Good catch. Good bot.

			throw new \RuntimeException( sprintf( 'Unable to open "%s" for reading.', $input_path ) );
		}

		$output = @fopen( $output_path, 'wb' );

critical

This method also suffers from a truncation bug if $input_path and $output_path are the same. Opening the output file with 'wb' will wipe the input file before it can be read. A temporary file should be used for in-place operations.

			't' => "\t",
			'b' => "\x08",
			'f' => "\f",
			'0' => '0',

high

The mapping for the escaped NUL character is incorrect. In SQL dumps, \0 represents the ASCII NUL byte (0x00), not the character '0'. This typo will cause incorrect length calculations for serialized strings containing NUL bytes.

			'0'  => "\0",

			return '';
		}

		$normalized = $this->normalize_replacements( $replacements );

medium

Calling normalize_replacements for every line processed is inefficient for large SQL files. Since the replacements are constant throughout the command execution, they should be normalized once and passed to process_line.
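
A rough sketch of what hoisting the normalization out of the loop could look like. Everything here is a standalone stand-in: the real `normalize_replacements()` rules are not shown in this review, the `'to'` key is an assumption (only `'from'` appears in the quoted code), and plain `str_replace()` stands in for the serialization-aware engine.

```php
<?php
// Hypothetical stand-in for the PR's normalization step.
function normalize_replacements( array $replacements ): array {
	$normalized = array();
	foreach ( $replacements as $pair ) {
		if ( '' === (string) $pair['from'] ) {
			continue; // An empty search string can never match.
		}
		$normalized[] = array(
			'from' => (string) $pair['from'],
			'to'   => (string) $pair['to'],
		);
	}
	return $normalized;
}

// Expects replacements that are already normalized.
function process_line( string $line, array $normalized ): string {
	foreach ( $normalized as $pair ) {
		$line = str_replace( $pair['from'], $pair['to'], $line );
	}
	return $line;
}

function process_stream( $input, $output, array $replacements ): void {
	// Normalize once per run instead of once per line.
	$normalized = normalize_replacements( $replacements );

	while ( true ) {
		$line = fgets( $input );
		if ( false === $line ) {
			break;
		}
		fwrite( $output, process_line( $line, $normalized ) );
	}
}
```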

Comment on lines +139 to +178
	private function do_dry_run( \WP_CLI\FileSearchReplacer $replacer, string $input_file, array $replacements, bool $verbose ): void {
		$input_handle = $this->open_input( $input_file );

		$total_lines = 0;
		$changed_lines = 0;
		$total_replacements = 0;

		while ( true ) {
			$line = fgets( $input_handle );
			if ( false === $line ) {
				break;
			}
			++$total_lines;
			$processed = $replacer->process_line( $line, $replacements );

			if ( $processed !== $line ) {
				++$changed_lines;
				// Count how many times old appears in the original line
				$old = $replacements[0]['from'];
				$total_replacements += substr_count( $line, $old );
			}

			if ( $verbose ) {
				\WP_CLI::line( sprintf( 'Line %d: %s', $total_lines, $processed !== $line ? 'changed' : 'unchanged' ) );
			}
		}

		if ( '-' !== $input_file ) {
			fclose( $input_handle );
		}

		\WP_CLI::success(
			sprintf(
				'Dry run complete. %d lines processed, %d lines would change, %d total replacements.',
				$total_lines,
				$changed_lines,
				$total_replacements
			)
		);
	}

medium

The do_dry_run and do_replace methods contain almost identical logic for iterating over the file and processing lines. Additionally, this logic is duplicated from FileSearchReplacer::replace_in_file(). Consider refactoring to share the core processing loop.
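
One possible shape for a shared loop, offered as a sketch rather than a prescription. `run_lines()` is a hypothetical function: it takes the per-line processor and a "sink" callback, so a dry run can pass a sink that only reports while a real run passes one that writes.

```php
<?php
/**
 * Hypothetical shared loop: read $input line by line, apply $process, and
 * hand each result to $sink( $processed, $line_number, $changed ).
 * Returns counters that either mode can report.
 */
function run_lines( $input, callable $process, callable $sink ): array {
	$stats = array(
		'total'   => 0,
		'changed' => 0,
	);

	while ( true ) {
		$line = fgets( $input );
		if ( false === $line ) {
			break;
		}
		++$stats['total'];

		$processed = $process( $line );
		$changed   = $processed !== $line;
		if ( $changed ) {
			++$stats['changed'];
		}

		$sink( $processed, $stats['total'], $changed );
	}

	return $stats;
}
```

With this shape, a dry run could call `run_lines( $in, $process, function () {} )`, and a real run could pass `function ( $processed ) use ( $out ) { fwrite( $out, $processed ); }`.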

Comment on lines +288 to +307
		$map = array(
			'\\' => '\\',
			"'" => "'",
			'"' => '"',
			'n' => "\n",
			'r' => "\r",
			't' => "\t",
			'b' => "\x08",
			'f' => "\f",
			'0' => '0',
		);

		$second = isset( $pair[1] ) ? $pair[1] : '';

		if ( '' !== $second && isset( $map[ $second ] ) ) {
			return $map[ $second ];
		}

		return $pair;
	}

medium

The unescaping logic is missing some SQL escape sequences like \Z (ASCII 26), which are supported by the original go-search-replace tool. This could lead to incorrect length markers if these characters appear in serialized data.
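
To make the length concern concrete, here is a small sketch. `unescaped_length()` is a hypothetical helper, not code from the PR. It counts the bytes a MySQL-escaped string occupies once unescaped, treating unknown escape pairs as two literal bytes, which mirrors the `return $pair;` fallback in the quoted code. The `'Z' => "\x1a"` entry is the addition being illustrated (MySQL documents `\Z` as ASCII 26).

```php
<?php
/**
 * Hypothetical helper: byte length of a MySQL-escaped string after unescaping.
 * Unknown escape pairs are counted as two literal bytes.
 */
function unescaped_length( string $escaped, array $map ): int {
	$length = 0;
	$total  = strlen( $escaped );

	for ( $i = 0; $i < $total; $i++ ) {
		if ( '\\' === $escaped[ $i ] && $i + 1 < $total ) {
			$next    = $escaped[ $i + 1 ];
			$length += isset( $map[ $next ] ) ? strlen( $map[ $next ] ) : 2;
			++$i; // Skip the character that followed the backslash.
			continue;
		}
		++$length;
	}

	return $length;
}

$map_without_z = array(
	'\\' => '\\',
	"'"  => "'",
	'"'  => '"',
	'n'  => "\n",
	'r'  => "\r",
	't'  => "\t",
	'b'  => "\x08",
	'0'  => "\0",
);
$map_with_z    = $map_without_z + array( 'Z' => "\x1a" );

$escaped = 'a\\Zb'; // Four bytes in the dump: a, backslash, Z, b.

echo unescaped_length( $escaped, $map_without_z ), "\n"; // 4
echo unescaped_length( $escaped, $map_with_z ), "\n";    // 3
```

The one-byte difference is the amount by which a recomputed `s:N` marker would be off for each unhandled `\Z` in a serialized string.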

@AlextheYounga
Author

I've never used Behat, so please forgive me in advance. I'll look into these failing tests.
