Propagate WAL write failures instead of silently discarding them#105
Merged
Propagate WAL write failures instead of silently discarding them#105
Conversation
insert_internal() used `let _ = self.wal.append(...)` which caused the key to be written to the memtable even when the WAL failed, making recovery after a crash impossible for those writes. Change insert_internal, insert_ts, insert, insert_ns_ts, and insert_ns to return std::io::Result<()> and return an Err immediately on WAL failure so the data is never inserted into the memtable. Update callers: - exec_insert, exec_update, exec_delete (query.rs): propagate as QueryError::Other so the client receives an explicit error. - register_table, save_schema (query.rs): log at ERROR level since these helpers currently have no return type. - lwt_commit (cluster.rs): log at ERROR level since it is fire-and-forget. https://claude.ai/code/session_011eyXTLxHcSEAeV6PRG3izJ
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6618d9b09f
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
The previous fix logged the error but still cleared the slot, so a chosen LWT value was permanently dropped on this replica whenever the WAL append failed (e.g. I/O error in sync mode). The coordinator could still report success because it achieved quorum, leaving this replica silently diverged with no retry path. Fix by only removing the composite key from the Paxos slot map when the WAL write succeeds. On failure the slot is kept in memory so that the next prepare round from any coordinator can re-commit the chosen value through the normal Paxos flow. https://claude.ai/code/session_011eyXTLxHcSEAeV6PRG3izJ
Both helpers previously swallowed write errors: after the fail-fast insert_internal change, a WAL failure would leave _tables/_schemas unpopulated while CREATE TABLE still returned success, producing a false-positive DDL acknowledgement and broken metadata state. Change register_table and save_schema to return std::io::Result<()> and propagate WAL failures to their callers. The CreateTable branch in exec_statement now maps those errors to QueryError::Other so the client receives an explicit failure instead of a silent lie. https://claude.ai/code/session_011eyXTLxHcSEAeV6PRG3izJ
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
insert_internal() used
let _ = self.wal.append(...)which caused the key to be written to the memtable even when the WAL failed, making recovery after a crash impossible for those writes.Change insert_internal, insert_ts, insert, insert_ns_ts, and insert_ns to return std::io::Result<()> and return an Err immediately on WAL failure so the data is never inserted into the memtable.
Update callers:
https://claude.ai/code/session_011eyXTLxHcSEAeV6PRG3izJ