UPSTREAM PR #27013: Add php_64_bit_only codegen option to the PHP generator #153

Open
loci-dev wants to merge 1 commit into main from loci/pr-27013-php-64-bit-only

Conversation

@loci-dev

Note

Source pull request: protocolbuffers/protobuf#27013

Generated PHP code currently type-hints 64-bit integer fields as int|string to cover 32-bit PHP builds, where values that overflow a native int are returned as strings. On 64-bit PHP, the mainstream deployment, the runtime always returns native int, so the union is dead weight that clutters static analysis, IDE autocompletion, and docs for everyone who has committed to 64-bit.

This change adds an opt-in CLI option, analogous to aggregate_metadata:

protoc --php_out=php_64_bit_only=true:. foo.proto

When the flag is on, INT64/UINT64/SINT64/FIXED64/SFIXED64 fields emit int instead of int|string across setter signatures, PHPDoc @param / @return / @type, and RepeatedField<...> generics (which naturally collapse from RepeatedField<int>|RepeatedField<string> to RepeatedField<int>). Wrapper types (Int64Value, UInt64Value) honor the flag via Options threaded into the wrapper doc-comment helpers. Default behavior (flag absent or =false) is unchanged.
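The mapping described above can be sketched roughly as follows. This is an illustration only, not the generator's actual code: `FieldKind`, `Options`, `Is64BitInteger`, `ScalarPhpType`, and `RepeatedPhpType` are hypothetical names made up for the sketch, and the real generator works on protobuf field descriptors rather than a small enum.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for illustration only. The real generator works on
// protobuf field descriptors and its own Options type.
enum class FieldKind {
  kInt32, kInt64, kUInt64, kSInt64, kFixed64, kSFixed64, kString
};

struct Options {
  bool php_64_bit_only = false;  // Off by default, so output is unchanged.
};

// True for the five 64-bit integer field types named in the description.
bool Is64BitInteger(FieldKind kind) {
  switch (kind) {
    case FieldKind::kInt64:
    case FieldKind::kUInt64:
    case FieldKind::kSInt64:
    case FieldKind::kFixed64:
    case FieldKind::kSFixed64:
      return true;
    default:
      return false;
  }
}

// PHP type hint for a singular field.
std::string ScalarPhpType(FieldKind kind, const Options& options) {
  if (Is64BitInteger(kind)) {
    return options.php_64_bit_only ? "int" : "int|string";
  }
  return kind == FieldKind::kString ? "string" : "int";
}

// PHPDoc generic for a repeated field. With the flag on, the union of two
// RepeatedField types collapses to a single RepeatedField<int>.
std::string RepeatedPhpType(FieldKind kind, const Options& options) {
  if (Is64BitInteger(kind) && !options.php_64_bit_only) {
    return "RepeatedField<int>|RepeatedField<string>";
  }
  return "RepeatedField<" + ScalarPhpType(kind, options) + ">";
}

// Small usage check covering the default and flag-on cases.
void CheckTypeHintSketch() {
  Options flag_on;
  flag_on.php_64_bit_only = true;
  assert(ScalarPhpType(FieldKind::kInt64, Options()) == "int|string");
  assert(ScalarPhpType(FieldKind::kInt64, flag_on) == "int");
  assert(RepeatedPhpType(FieldKind::kUInt64, flag_on) == "RepeatedField<int>");
}
```

Passing `Options` explicitly to every helper, rather than relying on a default-argument overload, makes it harder for a call site to silently ignore the flag.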

Scope:

  • Codegen only. Runtime libraries (GPBUtil.php, convert.c) already return native int on 64-bit PHP regardless of this flag.
  • No change to checked-in generated WKT .php files; they remain on default behavior unless regenerated with the flag.
  • No new .proto file-level option.
  • Descriptor-mode generation (is_descriptor=true) is intentionally unchanged.

Unit tests cover flag on/off/absent/invalid paths, scalar and repeated 64-bit fields, and verify the int|string union is fully absent when the flag is enabled.
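For readers unfamiliar with how a `key=value` generator parameter might be handled, here is a minimal sketch of the four paths the tests mention (on, off, absent, invalid). It is written under assumptions: `ParseParameters`, `ApplyParameter`, and the error strings are hypothetical, and the assumption that only the literal values `true` and `false` are accepted is mine, not something the PR text states. The real generator uses protoc's own parameter-parsing utilities.

```cpp
#include <cassert>
#include <string>

// Hypothetical options struct for this sketch.
struct Options {
  bool php_64_bit_only = false;  // Absent flag keeps the default behavior.
};

// Applies one key/value pair. Returns false and fills *error when the key is
// unknown or the value is not "true" or "false" (an assumed rule).
bool ApplyParameter(const std::string& key, const std::string& value,
                    Options* options, std::string* error) {
  if (key != "php_64_bit_only") {
    *error = "unknown option: " + key;
    return false;
  }
  if (value == "true") {
    options->php_64_bit_only = true;
    return true;
  }
  if (value == "false") {
    options->php_64_bit_only = false;
    return true;
  }
  *error = "php_64_bit_only expects true or false, got: " + value;
  return false;
}

// Splits a comma-separated parameter string such as "php_64_bit_only=true"
// and applies each non-empty item in order.
bool ParseParameters(const std::string& parameter, Options* options,
                     std::string* error) {
  std::string::size_type start = 0;
  while (start <= parameter.size()) {
    std::string::size_type end = parameter.find(',', start);
    if (end == std::string::npos) end = parameter.size();
    const std::string item = parameter.substr(start, end - start);
    if (!item.empty()) {
      const std::string::size_type eq = item.find('=');
      const std::string key = item.substr(0, eq);
      const std::string value =
          eq == std::string::npos ? "" : item.substr(eq + 1);
      if (!ApplyParameter(key, value, options, error)) return false;
    }
    start = end + 1;
  }
  return true;
}

// Small usage check: flag on, then an invalid value.
void CheckParameterSketch() {
  Options options;
  std::string error;
  assert(ParseParameters("php_64_bit_only=true", &options, &error));
  assert(options.php_64_bit_only);
  assert(!ParseParameters("php_64_bit_only=maybe", &options, &error));
}
```

Rejecting an unrecognized value outright, instead of treating it as `false`, surfaces typos at generation time rather than leaving them to be noticed in the generated type hints.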

@loci-review

loci-review Bot commented Apr 21, 2026

Overview

Analysis of 10,156 functions in build.protoc-stable shows 41 modified functions (0.4%) with negligible overall impact: power consumption decreased 0.037% (-219.75 nJ). The changes implement a php_64_bit_only codegen option for the PHP generator. They affect only compile-time code generation and have no impact on runtime message parsing or serialization.

Function counts: 41 modified, 0 new, 0 removed, 10,115 unchanged

Function Analysis

Most Significant Changes:

  • _upb_DefPool_LoadDefInitEx: Response time -99.5% (-564.5 μs), throughput -10.9% (-33.9 ns). Caching optimization bypasses expensive definition building when proto files are already loaded, achieving 215x speedup for protoc initialization.

  • Formatter::operator(): Response time +99.9% (+89.4 ms), throughput +3.1% (+7.8 ns). Measures error path with logging infrastructure from new ABSL_LOG(FATAL) handler; normal code generation unaffected.

  • PhpDocSetterTypeName: Response time -2.4% (-153.7 ns), throughput -32.5% (-145.3 ns). Implements php_64_bit_only option with optimized switch table and eliminated function overload.

  • Printer::Print: Response time -0.06% (-50.5 μs), throughput -52.3% (-221 ns). Compiler optimization replaced inline hash operations with abstracted calls, reducing CFG complexity 28%.

  • std::vector::_M_realloc_insert (two variants): Response time -46.5% and -7.5% from exception handling consolidation; compiler optimization without source changes.

Other analyzed functions show sub-microsecond changes from security hardening (stack canary validation) or compiler optimizations.

Flame Graph Comparison

Function: Formatter::operator() — illustrates the 99.9% response time increase from new error handling path

Base version:

Flame Graph: build.protoc-stable::ZNK6google8protobuf8compiler3cpp9FormatterclIJSt17basic_string_viewIcSt11char_traitsIcEEEEEvPKcDpRKT

Target version:

Flame Graph: build.protoc-stable::ZNK6google8protobuf8compiler3cpp9FormatterclIJSt17basic_string_viewIcSt11char_traitsIcEEEEEvPKcDpRKT

Target introduces zero-argument template instantiation dominated by logging infrastructure (SendToLog 3.85ms, Flush 3.92ms, LogMessage 1.03ms), measuring error/diagnostic path rather than normal code generation.

💬 Questions? Tag @loci-dev

Generated PHP code currently type-hints 64-bit integer fields as
`int|string` to cover 32-bit PHP builds, where values that overflow a
native `int` are returned as strings. On 64-bit PHP, the mainstream
deployment, the runtime always returns native `int`, so the union is
dead weight that clutters static analysis, IDE autocompletion, and
docs for everyone who has committed to 64-bit.

This change adds an opt-in CLI option, analogous to `aggregate_metadata`:

    protoc --php_out=assume_64_bit_php:. foo.proto

When the flag is present, INT64/UINT64/SINT64/FIXED64/SFIXED64 fields
emit `int` instead of `int|string` across setter signatures, PHPDoc
`@param` / `@return` / `@type`, and `RepeatedField<...>` generics
(which naturally collapse from `RepeatedField<int>|RepeatedField<string>`
to `RepeatedField<int>`). Wrapper types (`Int64Value`, `UInt64Value`)
honor the flag via `Options` threaded into the wrapper doc-comment
helpers. Default behavior (flag absent) is unchanged.

Scope:

- Codegen only. Runtime libraries (GPBUtil.php, convert.c) already
  return native `int` on 64-bit PHP regardless of this flag.
- No change to checked-in generated WKT .php files; they remain on
  default behavior unless regenerated with the flag.
- No new .proto file-level option.
- Descriptor-mode generation (`is_descriptor=true`) is intentionally
  unchanged.

Unit tests cover the flag-on and flag-absent paths, scalar and repeated
64-bit fields, and verify the `int|string` union is fully absent when
the flag is enabled.
loci-dev force-pushed the loci/pr-27013-php-64-bit-only branch from fd419f5 to d64617d on April 22, 2026 18:49
@loci-review

loci-review Bot commented Apr 22, 2026

Overview

Analysis of 10,162 functions across build.protoc-stable shows 11 modified functions (0.11%) with minor performance improvements. Power consumption decreased by 0.044% (589,046 nJ → 588,788 nJ). Changes introduce assume_64_bit_php option to the PHP code generator, enabling stricter type hints for 64-bit environments.

Function Analysis

PhpDocSetterTypeName (most impacted):

  • Response time: 6,428 ns → 6,277 ns (-2.35%, -151 ns)
  • Throughput time: 447 ns → 302 ns (-32.59%, -146 ns)
  • Source changes: Added assume_64_bit_php flag for conditional type hint generation ("int" vs "int|string"); removed the convenience overload, so callers must now pass Options explicitly

GenerateAll (entry point):

  • Response time: 31,939 ms → 31,793 ms (-0.46%, -146 ms)
  • Throughput time: 7,489 ns → 8,095 ns (+8.08%, +605 ns)
  • Source changes: Added parsing for new option parameter; throughput regression is expected overhead from option parsing, negligible compared to 31.8-second total execution

GenerateFieldAccessor:

  • Response time: 7,813 ms → 7,776 ms (-0.47%, -37 ms)
  • Throughput time: 5,340 ns → 5,161 ns (-3.35%, -179 ns)
  • Source changes: Modified type-name functions to conditionally emit type hints based on flag, removed convenience overloads

ConstantNamePrefix and GeneratedMetadataFileName: Minor improvements (-0.56% to -2.89% throughput) from compiler optimizations with no source code changes.

All modified functions are in the PHP code generator (compile-time), not runtime hot paths. Improvements affect protoc build times, not generated code performance. Changes demonstrate good engineering: adding functionality with minimal overhead while achieving performance gains through API cleanup.

💬 Questions? Tag @loci-dev

loci-dev force-pushed the main branch 15 times, most recently from ec8c960 to e3c8630 on April 27, 2026 07:22
loci-dev force-pushed the main branch 6 times, most recently from f292971 to 1fdfb93 on April 29, 2026 07:16
2 participants