Skip to content

Optimize SRS serialization: parallel uncompressed format, file caching, buffered I/O#325

Merged
niconiconi merged 2 commits into
mainfrom
hc/fast_srs_setup
Mar 12, 2026
Merged

Optimize SRS serialization: parallel uncompressed format, file caching, buffered I/O#325
niconiconi merged 2 commits into
mainfrom
hc/fast_srs_setup

Conversation

@hczphn

@hczphn hczphn commented Feb 22, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Parallel uncompressed SRS serialization (structs_kzg.rs): Replace derived ExpSerde on CoefFormUniKZGSRS with custom implementation using uncompressed G1 point format + rayon parallel conversion. Deserialization uses from_uncompressed_unchecked to skip expensive square root computation. SRS loading goes from minutes to seconds.
  • SRS file caching (poly_commit/src/utils.rs): Add expander_pcs_init_with_srs_path() — generates SRS once and saves to disk; subsequent runs load from file. expander_pcs_init_testing_only now defaults to /tmp/kzg_srs_{n}.bin cache.
  • 64MB buffered I/O (definition.rs): Wrap SRS file reads/writes in BufReader/BufWriter with 64MB capacity to reduce syscall overhead.
  • Vec pre-allocation (serdes.rs): Use Vec::with_capacity(len) instead of Vec::default() in generic Vec deserialization to avoid repeated reallocation.

🤖 Generated with Claude Code

…g, buffered I/O

- Replace derived ExpSerde on CoefFormUniKZGSRS with custom implementation that
  uses uncompressed G1 point format and parallel conversion via rayon, avoiding
  expensive square root computation during deserialization
- Add expander_pcs_init_with_srs_path() for SRS file caching: generate once,
  load from disk on subsequent runs (seconds vs minutes)
- Use 64MB BufReader/BufWriter for SRS file I/O to reduce syscall overhead
- Pre-allocate Vec with_capacity in generic Vec deserialization to avoid realloc

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@gemini-code-assist

Copy link
Copy Markdown
Contributor

Summary of Changes

Hello @hczphn, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces significant performance enhancements for handling Structured Reference Strings (SRS) within the KZG polynomial commitment scheme. It achieves this by optimizing serialization and deserialization processes through parallel uncompressed point conversion, implementing a robust file caching mechanism for SRS, and improving I/O efficiency with buffered operations. Additionally, a general optimization for Vec deserialization was introduced to reduce memory reallocations. These changes collectively aim to reduce SRS loading times from minutes to seconds, improving overall system responsiveness and developer experience.

Highlights

  • Parallel Uncompressed SRS Serialization: Replaced the derived ExpSerde for CoefFormUniKZGSRS with a custom implementation that leverages uncompressed G1 points and rayon for parallel conversion, significantly reducing SRS loading times by avoiding expensive square root computations during deserialization.
  • SRS File Caching: Introduced a mechanism to cache generated Structured Reference Strings (SRS) to disk. This allows subsequent runs to load the SRS from a file instead of regenerating it, with expander_pcs_init_testing_only now defaulting to using a temporary cache file.
  • Buffered I/O for SRS: Implemented 64MB BufReader and BufWriter for SRS file operations to minimize syscall overhead and enhance overall I/O performance during serialization and deserialization.
  • Vec Pre-allocation: Optimized generic Vec deserialization by utilizing Vec::with_capacity to pre-allocate memory based on the deserialized length, thereby preventing repeated reallocations and improving performance.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Changelog
  • gkr_engine/src/poly_commit/definition.rs
    • Imported BufReader and BufWriter from std::io.
    • Wrapped file open and create operations with BufReader::with_capacity and BufWriter::with_capacity respectively, using a 64MB buffer size for SRS serialization and deserialization.
  • poly_commit/src/kzg/uni_kzg/structs_kzg.rs
    • Added imports for PrimeCurveAffine, UncompressedEncoding, rayon::prelude::*, and std::io::{Read, Write}.
    • Removed the ExpSerde derive macro from CoefFormUniKZGSRS.
    • Implemented a custom ExpSerde trait for CoefFormUniKZGSRS to handle serialization and deserialization.
    • Serialization now converts G1 points to an uncompressed format in parallel using rayon and writes them to a buffer, then writes the buffer to the writer.
    • Deserialization reads uncompressed G1 points into a buffer, then parses them in parallel using from_uncompressed_unchecked to avoid expensive square root computations.
  • poly_commit/src/utils.rs
    • Added a doc comment for expander_pcs_init_testing_only.
    • Modified expander_pcs_init_testing_only to call a new function expander_pcs_init_with_srs_path with a default temporary SRS cache path (/tmp/kzg_srs_{n}.bin).
    • Introduced a new public function expander_pcs_init_with_srs_path which takes an Option<&str> for the SRS path, allowing for explicit caching or no caching.
    • Updated the call to gen_or_load_srs_for_testing to pass the srs_path argument.
  • serdes/src/serdes.rs
    • Modified the deserialize_from implementation for Vec<V> to initialize the vector with Vec::with_capacity(len) instead of Self::default() (which is Vec::new()), based on the deserialized length.
Activity
  • No human activity (comments, reviews, etc.) has been recorded for this pull request yet.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces several significant performance optimizations for SRS serialization and loading. The use of parallel processing with rayon for uncompressed point format conversion, buffered I/O, and pre-allocation for vectors are all excellent improvements. The addition of SRS file caching is also a great feature for testing workflows. My review focuses on improving code maintainability, portability, and error handling. I've suggested replacing magic numbers with constants, using a platform-agnostic way to get temporary directories, improving error propagation, and cleaning up some redundant trait bounds.

Comment on lines +106 to +116
let powers_of_tau: Vec<E::G1Affine> = buffer
.par_chunks(point_size)
.map(|chunk| {
let mut uncompressed =
<E::G1Affine as UncompressedEncoding>::Uncompressed::default();
uncompressed.as_mut().copy_from_slice(chunk);
E::G1Affine::from_uncompressed_unchecked(&uncompressed)
.into_option()
.expect("Invalid G1 point in SRS file")
})
.collect();

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Using expect() here will cause a panic on invalid data. Since this function returns a SerdeResult, it's better to propagate the error instead of panicking. You can achieve this by using ok_or and collecting the results into a SerdeResult<Vec<...>>.

        let powers_of_tau = buffer
            .par_chunks(point_size)
            .map(|chunk| {
                let mut uncompressed =
                    <E::G1Affine as UncompressedEncoding>::Uncompressed::default();
                uncompressed.as_mut().copy_from_slice(chunk);
                E::G1Affine::from_uncompressed_unchecked(&uncompressed)
                    .into_option()
                    .ok_or(serdes::SerdeError::DeserializeError)
            })
            .collect::<SerdeResult<Vec<_>>>()?;

Ok(file) => {
// file exists; deserialize SRS from file
Self::SRS::deserialize_from(&mut file).unwrap_or_else(|_| {
let mut reader = BufReader::with_capacity(64 * 1024 * 1024, file);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The buffer capacity 64 * 1024 * 1024 is a magic number. It would be better to define it as a constant at the module level for clarity and maintainability, for example:

const SRS_IO_BUFFER_CAPACITY: usize = 64 * 1024 * 1024; // 64MB

This constant can then be used here and for the BufWriter on line 79.

let file =
std::fs::File::create(path).expect("Failed to create SRS file");
srs.serialize_into(&mut file)
let mut writer = BufWriter::with_capacity(64 * 1024 * 1024, file);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Similar to the BufReader on line 69, this magic number 64 * 1024 * 1024 should be extracted into a shared constant to improve readability and maintainability.

Comment on lines +59 to +63
E::G1Affine: ExpSerde
+ CurveAffine<ScalarExt = E::Fr, CurveExt = E::G1>
+ UncompressedEncoding
+ Send
+ Sync,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The ExpSerde trait bound on E::G1Affine appears to be redundant. This custom implementation of ExpSerde for CoefFormUniKZGSRS serializes powers_of_tau using to_uncompressed and does not rely on an ExpSerde implementation for E::G1Affine. Removing this bound would make the trait constraints more precise.

    E::G1Affine: CurveAffine<ScalarExt = E::Fr, CurveExt = E::G1>
        + UncompressedEncoding
        + Send
        + Sync,

Comment thread poly_commit/src/utils.rs
Comment on lines +19 to +20
let srs_path = format!("/tmp/kzg_srs_{}.bin", n_input_vars);
expander_pcs_init_with_srs_path::<FieldConfig, PCS>(n_input_vars, mpi_config, Some(&srs_path))

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Hardcoding the temporary path to /tmp is not portable and will fail on non-Unix-like systems such as Windows. It's better to use std::env::temp_dir() to get a platform-specific temporary directory.

Suggested change
let srs_path = format!("/tmp/kzg_srs_{}.bin", n_input_vars);
expander_pcs_init_with_srs_path::<FieldConfig, PCS>(n_input_vars, mpi_config, Some(&srs_path))
let srs_path = std::env::temp_dir().join(format!("kzg_srs_{}.bin", n_input_vars));
expander_pcs_init_with_srs_path::<FieldConfig, PCS>(n_input_vars, mpi_config, srs_path.to_str())

…buffer constant

- Replace expect() with ok_or + ? for proper error propagation when
  deserializing G1 points from SRS file
- Extract 64MB magic number into SRS_IO_BUFFER_CAPACITY constant

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@hczphn

hczphn commented Feb 22, 2026

Copy link
Copy Markdown
Contributor Author

@gemini review

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

The pull request introduces significant performance optimizations for SRS serialization and loading by using an uncompressed format, parallel processing with Rayon, and buffered I/O. These changes effectively reduce SRS loading times from minutes to seconds. Key improvements include custom ExpSerde implementations and file caching. However, there are security concerns regarding the use of /tmp for caching and potential OOM vulnerabilities during deserialization. Additionally, the documentation for expander_pcs_init_testing_only is now misleading as it contradicts the new caching behavior.

use crate::{ExpErrors, ExpanderSingleVarChallenge, FieldEngine, MPIEngine, Transcript};

/// Buffer capacity for SRS file I/O (64 MB).
const SRS_IO_BUFFER_CAPACITY: usize = 64 * 1024 * 1024;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

A 64MB buffer capacity for BufReader and BufWriter is quite large and likely excessive. Since the SRS data is mostly read or written in large chunks (via read_exact or write_all), BufReader/BufWriter will often bypass the internal buffer for these large operations. A smaller, more standard buffer size (e.g., 64KB or 1MB) would be more memory-efficient without sacrificing performance.

Comment on lines +112 to +114
E::G1Affine::from_uncompressed_unchecked(&uncompressed)
.into_option()
.ok_or(serdes::SerdeError::DeserializeError)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using from_uncompressed_unchecked skips critical validation checks, such as ensuring the point lies on the curve and belongs to the correct subgroup. While this is a significant performance optimization for loading trusted SRS files, it introduces a security risk if the SRS file is tampered with or loaded from an untrusted source. Consider adding a comment warning about this or providing a way to perform validation if needed.

Comment thread poly_commit/src/utils.rs
};
use polynomials::{MultiLinearPoly, MultilinearExtension, MutableMultilinearExtension};

/// Initialize PCS for testing without SRS caching (always regenerates SRS)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The docstring states that this function initializes PCS 'without SRS caching', but the implementation was changed to use a hardcoded path in /tmp, which enables caching. This documentation should be updated to reflect the actual behavior.

Suggested change
/// Initialize PCS for testing without SRS caching (always regenerates SRS)
/// Initialize PCS for testing with SRS caching

Comment thread poly_commit/src/utils.rs
<PCS::SRS as StructuredReferenceString>::VKey,
PCS::ScratchPad,
) {
let srs_path = format!("/tmp/kzg_srs_{}.bin", n_input_vars);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using a hardcoded path in /tmp with a predictable filename is insecure on multi-user systems as it is vulnerable to symlink attacks. Additionally, it can lead to permission issues or collisions if multiple users run the same tests. It is recommended to use a more secure location or at least use std::env::temp_dir() to respect environment variables and improve cross-platform compatibility.

Comment thread serdes/src/serdes.rs
Comment on lines 71 to +72
let len = usize::deserialize_from(&mut reader)?;
let mut v = Vec::with_capacity(len);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Pre-allocating a Vec with with_capacity(len) where len is read directly from the input stream can lead to Out-Of-Memory (OOM) vulnerabilities if the input is malicious or corrupted (e.g., a very large len value). It is safer to bound the maximum allowed length or use a more conservative allocation strategy.

@niconiconi niconiconi merged commit aa4b214 into main Mar 12, 2026
0 of 9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants