
feat:Progressive compact #3229

Open
chejinge wants to merge 2 commits into OpenAtomFoundation:unstable from chejinge:unstable-compact

Conversation

@chejinge
Collaborator

@chejinge chejinge commented Mar 12, 2026

Summary by CodeRabbit

  • New Features

    • Added automatic progressive compaction that runs on a regular cadence (≈60s) to optimize storage and reduce fragmentation.
  • Improvements

    • Improved server log management: preserves the most recent file per severity and more robustly prunes older logs with additional safeguards and clearer logging on deletions.

@github-actions github-actions bot added Invalid PR Title ✏️ Feature New feature or request labels Mar 12, 2026

coderabbitai bot commented Mar 12, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 30c00113-f278-46b0-8459-e0a8bbdaebad

📥 Commits

Reviewing files that changed from the base of the PR and between b11fea6 and 786b98a.

📒 Files selected for processing (1)
  • src/pika_server.cc
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/pika_server.cc

📝 Walkthrough

Walkthrough

Adds a periodic progressive compaction routine to PikaServer via a new AutoProgressiveCompact() method (tracked by last_progressive_compact_time_), invoked from DoTimingTask on a 60s cadence; also adjusts server log purging to preserve the most recent log per severity and delete older files beyond retention.

Changes

Cohort / File(s) Summary
Public API
include/pika_server.h
Declared void AutoProgressiveCompact() and added private struct timeval last_progressive_compact_time_ member.
Implementation: compaction & log purging
src/pika_server.cc
Implemented PikaServer::AutoProgressiveCompact(), invoked from DoTimingTask on ~60s cadence; added timestamp tracking and updated log purge logic to keep the newest file per severity and delete older files past retention.

Sequence Diagram

sequenceDiagram
    participant DoTimingTask as DoTimingTask
    participant AutoCompact as AutoProgressiveCompact
    participant DBManager as DB Manager
    participant DB as Database
    participant Storage as Storage Engine

    DoTimingTask->>AutoCompact: invoke (every ~60s)
    activate AutoCompact
    AutoCompact->>DBManager: acquire read lock on all DBs
    activate DBManager
    DBManager-->>AutoCompact: lock acquired
    deactivate DBManager

    loop per database
        AutoCompact->>DB: acquire shared DB lock
        activate DB
        DB-->>AutoCompact: lock acquired
        deactivate DB
        AutoCompact->>Storage: LongestNotCompactionSstCompact(kAll)
        activate Storage
        Storage-->>AutoCompact: compaction result
        deactivate Storage
    end

    AutoCompact->>AutoCompact: update last_progressive_compact_time_
    AutoCompact-->>DoTimingTask: return
    deactivate AutoCompact

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

  • dingxiaoshuai123
  • wangshao1

Poem

🐰 In sixty beats my paws will tap,
A gentle hum, a tidy snap,
Old files cleared, small stones compact,
I hop, I sort, I keep intact.
Hooray — the server stays on track! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Docstring Coverage: ⚠️ Warning. Docstring coverage is 16.67%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
Title check: ✅ Passed. The title 'feat:Progressive compact' directly corresponds to the main changes: a new progressive compaction mechanism, including the AutoProgressiveCompact() method and its timing infrastructure.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (3)
src/pika_server.cc (3)

1150-1152: Consider extracting the 60-second interval to a named constant or config parameter.

The hardcoded 60 is a magic number. For maintainability and potential future configurability, consider using a named constant:

+// At file/namespace scope
+constexpr int kProgressiveCompactIntervalSeconds = 60;
+
 void PikaServer::AutoProgressiveCompact() {
   struct timeval now;
   gettimeofday(&now, nullptr);
   
   // Execute progressive compact every 60 seconds
   if (last_progressive_compact_time_.tv_sec == 0 || 
-      now.tv_sec - last_progressive_compact_time_.tv_sec >= 60) {
+      now.tv_sec - last_progressive_compact_time_.tv_sec >= kProgressiveCompactIntervalSeconds) {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/pika_server.cc` around lines 1150 - 1152, Replace the magic number 60
with a named constant or configurable parameter to improve readability and
maintainability: define a constant (e.g. kProgressiveCompactIntervalSec) or read
from configuration, then use that constant in the condition that checks
last_progressive_compact_time_.tv_sec and now.tv_sec (the if block using
last_progressive_compact_time_.tv_sec and now.tv_sec) and anywhere else the
interval is referenced so the interval is centralized and easy to change.

1157-1168: Use RAII guard for DBLockShared()/DBUnlockShared() to ensure exception safety.

The manual lock/unlock pattern is exception-unsafe. If any code between DBLockShared() and DBUnlockShared() throws (e.g., LOG could theoretically throw on allocation failure), the lock won't be released. Consider using std::shared_lock for RAII semantics.

♻️ Proposed refactor using RAII
     std::shared_lock db_rwl(dbs_rw_);
     for (const auto& db_item : dbs_) {
-      db_item.second->DBLockShared();
-      auto storage = db_item.second->storage();
-      if (storage) {
-        Status s = storage->LongestNotCompactionSstCompact(storage::DataType::kAll);
-        if (!s.ok()) {
-          LOG(WARNING) << "Progressive compact for DB: " << db_item.first 
-                      << " failed: " << s.ToString();
-        } else {
-          LOG(INFO) << "Progressive compact for DB: " << db_item.first << " completed";
+      {
+        std::shared_lock db_lock(db_item.second->GetDBLock());
+        auto storage = db_item.second->storage();
+        if (storage) {
+          Status s = storage->LongestNotCompactionSstCompact(storage::DataType::kAll);
+          if (!s.ok()) {
+            LOG(WARNING) << "Progressive compact for DB: " << db_item.first 
+                        << " failed: " << s.ToString();
+          } else {
+            LOG(INFO) << "Progressive compact for DB: " << db_item.first << " completed";
+          }
         }
       }
-      db_item.second->DBUnlockShared();
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/pika_server.cc` around lines 1157 - 1168, Replace the manual
DBLockShared()/DBUnlockShared() pair with an RAII guard to ensure the lock is
released on scope exit: create or use a RAII helper (e.g., a DBSharedLockGuard
whose constructor calls db->DBLockShared() and whose destructor calls
db->DBUnlockShared()), or use std::shared_lock if you can access the underlying
shared_mutex; then wrap the code that calls db_item.second->storage() and
storage->LongestNotCompactionSstCompact(...) inside that guard's scope (replace
the explicit DBLockShared/DBUnlockShared around db_item with the guard).
Reference symbols: DBLockShared, DBUnlockShared, db_item, storage(),
LongestNotCompactionSstCompact, storage::DataType::kAll.

1164-1166: Consider reducing logging verbosity for successful progressive compaction.

Logging at INFO level every 60 seconds for each DB will generate significant log volume. Consider:

  • Using VLOG or LOG_EVERY_N for routine success messages
  • Only logging when actual compaction work was performed (vs. when task was queued)
  • Moving success logging to the background task completion handler
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/pika_server.cc` around lines 1164 - 1166, The INFO log "Progressive
compact for DB: " << db_item.first << " completed" is too verbose; change it to
use a lower verbosity macro (e.g., VLOG or LOG_EVERY_N) and only emit when
actual compaction work ran (check the compaction result/flag rather than queued
state), and move the success log into the background task completion handler
that finalizes progressive compaction (the code paths around db_item.first and
the progressive compact completion callback/worker should be updated
accordingly).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a98ff8cf-2005-4315-a8cd-2ee54b0a0eb1

📥 Commits

Reviewing files that changed from the base of the PR and between bc72424 and b11fea6.

📒 Files selected for processing (2)
  • include/pika_server.h
  • src/pika_server.cc

if (interval_days > retention_time) {
  std::string log_file = log_path + "/" + file;
  LOG(INFO) << "Deleting out of date log file: " << log_file;
  if(!pstd::DeleteFile(log_file)) {


⚠️ Potential issue | 🟡 Minor

Minor formatting issue: missing space after if.

-        if(!pstd::DeleteFile(log_file)) {
+        if (!pstd::DeleteFile(log_file)) {
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-        if(!pstd::DeleteFile(log_file)) {
+        if (!pstd::DeleteFile(log_file)) {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/pika_server.cc` at line 1350, There is a formatting issue: add a space
after the `if` keyword in the conditional calling pstd::DeleteFile so it reads
`if (!pstd::DeleteFile(log_file)) {`; update the occurrence around the use of
pstd::DeleteFile and log_file to follow this spacing convention (and optionally
scan nearby conditionals for the same pattern) to keep code style consistent.

@chejinge chejinge changed the title from feat:渐进式compact (Chinese for "progressive compact") to feat:Progressive compact on Mar 12, 2026

Labels

✏️ Feature New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants