feat: add RedisBackedAC with preset-optimized engine and benchmarks #124

Merged
skyoo2003 merged 8 commits into master from feature/redis-backed-preset
Apr 19, 2026
@skyoo2003
Owner

Summary

  • Add 4 in-memory Aho-Corasick engine implementations (Flat, Map, BandedDFA, DAT) with architecture presets (Speed, Balanced, MemoryEfficient, Ultimate)
  • Add Redis-backed AC engine (RedisBackedAC) that uses Redis as source of truth and a local preset-optimized automaton for fast reads
  • Add comprehensive tests (15 tests with race detector) and a full benchmark suite covering unit, E2E, scale, comparison, unicode, and memory profiling categories

Test plan

  • Run go test ./pkg/acor/... -race -count=1 and verify all tests pass
  • Run go test ./pkg/acor/... -bench=. -benchmem and verify benchmarks produce results
  • Verify Redis-backed tests pass with a running Redis instance

Implement Redis-backed Aho-Corasick engine (RedisBackedAC) with selectable
architecture presets (Speed, Balanced, MemoryEfficient, Ultimate). Redis
serves as source of truth; a local preset-optimized automaton is cached for
fast reads. Writes use V2 Lua scripts with optimistic locking; reads hit
the local engine with no Redis I/O on the hot path.

Also add in-memory engine implementations for all 4 presets, comprehensive
tests (15 tests with race detector), and a full benchmark suite covering
unit, E2E, scale, comparison, unicode, and memory profiling categories.
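
The optimistic-locking write path described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `versionedStore` is a hypothetical in-process stand-in for Redis, `addIfVersion` stands in for what a server-side script would do atomically, and `errConflict` stands in for `ErrConcurrencyConflict`.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errConflict stands in for the PR's ErrConcurrencyConflict.
var errConflict = errors.New("version conflict")

// versionedStore is a hypothetical in-process stand-in for Redis:
// a keyword set guarded by a version counter.
type versionedStore struct {
	mu       sync.Mutex
	version  int64
	keywords map[string]struct{}
}

func newVersionedStore() *versionedStore {
	return &versionedStore{keywords: make(map[string]struct{})}
}

// currentVersion returns the version a writer should base its update on.
func (s *versionedStore) currentVersion() int64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.version
}

// addIfVersion applies the write only if nobody else has written since
// `expected` was read, the way an atomic server-side script could.
func (s *versionedStore) addIfVersion(expected int64, kw string) (int64, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.version != expected {
		return s.version, errConflict
	}
	s.keywords[kw] = struct{}{}
	s.version++
	return s.version, nil
}

// addWithRetry re-reads the version and retries when it loses a race.
func addWithRetry(s *versionedStore, kw string, maxRetries int) (int64, error) {
	for attempt := 0; attempt <= maxRetries; attempt++ {
		v, err := s.addIfVersion(s.currentVersion(), kw)
		if err == nil {
			return v, nil
		}
		if !errors.Is(err, errConflict) {
			return 0, err
		}
	}
	return 0, fmt.Errorf("add %q after %d retries: %w", kw, maxRetries, errConflict)
}

func main() {
	s := newVersionedStore()
	v, err := addWithRetry(s, "hello", 3)
	fmt.Println(v, err)
}
```

A writer that holds an outdated version gets `errConflict` and nothing is written, which is what lets the read path stay lock-free with respect to Redis.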
@sourcery-ai
sourcery-ai Bot commented Apr 19, 2026

Reviewer's Guide

Introduces a new pluggable Aho-Corasick engine architecture with in-memory and Redis-backed preset-optimized engines, wires them into the existing AhoCorasick.Create API via new InMemory and Preset options, and adds tests/benchmarks plus documentation for the new modes and presets.

File-Level Changes

Each entry below lists the change, its details as bullets, and the files touched.
Extend AhoCorasick core types and creation flow to support new backend modes (original Redis, in-memory, preset-Redis) and presets while keeping legacy behavior intact.
  • Add InMemory and Preset fields to AhoCorasickArgs with validation helpers for Redis vs in-memory configuration.
  • Introduce backendMode enum and mode field on AhoCorasick to distinguish original, in-memory, and preset-Redis backends.
  • Refactor Create into three code paths: createInMemory, createPresetRedis, and createOriginal, selecting based on InMemory/Preset and wiring mode, ops, and closeFn appropriately.
  • Extract logger construction into newLogger for reuse across creation paths.
  • Change Close to delegate to closeFn and adjust error semantics based on backendMode, and gate Debug to only operate in original Redis modes.
  • Extend AhoCorasickInfo with Preset, MemoryBytes, and TrieDepth fields for engines that expose these stats.
pkg/acor/acor.go
pkg/acor/modes.go
docs/content/reference/api.md
docs/content/reference/_index.md
Implement a family of in-memory Aho-Corasick engines with architecture presets (Flat/Speed, Map/MemoryEfficient, BandedDFA/Balanced, DAT+SIMD/Ultimate) and integrate them via an operations adapter.
  • Define Preset enum, its string representation, and presetConfigs plus newMatchEngine factory to choose concrete engine implementations by preset.
  • Add matchEngine interface and shared InMemoryInfo struct for engine statistics.
  • Implement speedEngine using a flat-array full DFA with compact alphabet mapping for maximum throughput.
  • Implement memEfficientEngine using a sparse map-based trie plus Bloom filter pre-filtering to minimize memory usage.
  • Implement doubleArrayTrie and bandedDFA-based balancedEngine for DAT+Banded DFA architecture, and ultimateEngine combining DAT+Banded DFA with SIMD-style prefiltering via simdScanner.
  • Add bloomFilter and simdScanner helpers to provide probabilistic and bitmap-based pre-filters.
  • Provide inMemoryOps adapter that wraps a matchEngine to conform to the existing operations interface, including erroring Suggest/SuggestIndex and exposing engine metrics through AhoCorasickInfo.
pkg/acor/engine.go
pkg/acor/engine_flat.go
pkg/acor/engine_map.go
pkg/acor/engine_bloom.go
pkg/acor/engine_dat.go
pkg/acor/engine_banded_dfa.go
pkg/acor/engine_deferred.go
pkg/acor/engine_simd.go
pkg/acor/engine_presets.go
pkg/acor/in_memory_ops.go
Add RedisBackedAC preset-optimized backend that keeps Redis as source of truth while serving reads from a local engine, with optimistic-locking writes and Pub/Sub-based invalidation.
  • Introduce redisBackedAC type encapsulating local matchEngine, Redis KVStorage, versioning, self-invalidation tracking, and Pub/Sub listener lifecycle.
  • Implement newRedisBacked constructor that initializes V2 trie state in Redis if necessary, loads keywords into a local engine, and starts a Redis Pub/Sub listener for invalidation messages.
  • Add redis_backed_ops layer with Add/Remove/Find/FindIndex/Flush/Info implemented using V2 Lua scripts plus local engine rebuilds and versioning, with retry on ErrConcurrencyConflict.
  • Implement presetRedisOps adapter that exposes redisBackedAC via the existing operations interface and maps Info into AhoCorasickInfo with preset-aware fields.
  • Reuse existing V2 add/remove Lua scripts by introducing redisBackedV2 adapter to bridge between new code and script APIs, and use singleflight to coalesce concurrent reloads during invalidation.
  • Ensure degraded-mode behavior where reads can still use the last-good automaton if Redis goes down, and cross-instance invalidation via invalidateChannelPrefix with self-skip TTL cleanup.
pkg/acor/redis_backed.go
pkg/acor/redis_backed_ops.go
pkg/acor/preset_redis_ops.go
Extend error handling and behavior for modes and unsupported APIs (Suggest/SuggestIndex) in new backends.
  • Add ErrInMemoryWithRedisConfig, ErrPresetRequiresRedis, ErrPresetRequiresV2, ErrPresetWithCache, and ErrSuggestRequiresRedis with clear messages for invalid configuration and unsupported operations.
  • Wire these errors into Create precondition checks and inMemoryOps/presetRedisOps Suggest/SuggestIndex implementations.
  • Adjust topology tests to assert new Close semantics when storage is nil but modeOriginal is set.
pkg/acor/errors.go
pkg/acor/topology_test.go
Add comprehensive unit tests and benchmarks for in-memory and Redis-backed preset engines, including concurrency, unicode, and scale coverage.
  • Add engine_test.go covering all presets in in-memory mode: Add/Remove/Find/FindIndex, Flush, Info, case sensitivity, empty/edge cases, unicode handling, concurrency, API stability across presets, and Suggest/SuggestIndex error behavior.
  • Add redis_backed_test.go covering preset-Redis mode: creation, Add/Remove/Find/FindIndex behaviors, Flush, Info (including Preset), concurrent usage, invalid config combinations, cross-instance invalidation, degraded mode, and Suggest/SuggestIndex errors.
  • Add benchmark_redis_backed_test.go with benchmark suites for in-memory engines and Redis-backed preset engines across presets and keyword scales, including memory-allocation profiling via BenchmarkMemoryUsage.
  • Update go.mod/go.sum to include golang.org/x/sync for singleflight use and update linter configuration to relax mnd/errcheck/gosec in engine and *_test paths.
pkg/acor/engine_test.go
pkg/acor/redis_backed_test.go
pkg/acor/benchmark_redis_backed_test.go
go.mod
go.sum
.golangci.yaml
Update documentation and README to describe in-memory and preset-optimized Redis engines, presets, and extended API/Info fields.
  • Extend API reference docs to include InMemory and Preset in AhoCorasickArgs, expanded AhoCorasickInfo struct fields, and example usage for both in-memory and preset-Redis modes.
  • Add new guides for in-memory-engine and redis-backed-engine explaining when to use each, architecture diagrams, presets table, and API snippets.
  • Update top-level docs index and guides index to link the new guides and mention unified Create API with InMemory/Preset options.
  • Enhance README with quick-start examples for in-memory and preset-Redis engines plus an architecture presets comparison table, and clarify that local caching pertains to the original mode only.
docs/content/reference/api.md
docs/content/_index.md
docs/content/reference/_index.md
docs/content/guides/in-memory-engine.md
docs/content/guides/redis-backed-engine.md
docs/content/guides/_index.md
docs/content/getting-started/quick-start.md
README.md
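
To make the preset-to-engine wiring in the guide more concrete, here is a minimal sketch of a preset enum and factory. The preset-to-architecture pairing follows the guide; everything else is an assumption for illustration: the constant order (apart from `PresetNone` being the zero value), the reduced `matchEngine` method set, and `naiveEngine`, which is a placeholder scanner and not an Aho-Corasick automaton.

```go
package main

import (
	"fmt"
	"strings"
)

// Preset mirrors the PR's enum; the constant order here is an assumption,
// apart from PresetNone being the zero value.
type Preset int

const (
	PresetNone Preset = iota
	PresetSpeed
	PresetBalanced
	PresetMemoryEfficient
	PresetUltimate
)

// archFor maps each preset to the architecture named in the reviewer's guide.
func archFor(p Preset) string {
	switch p {
	case PresetSpeed:
		return "flat DFA"
	case PresetMemoryEfficient:
		return "map trie + Bloom filter"
	case PresetUltimate:
		return "DAT + banded DFA + SIMD-style prefilter"
	default: // PresetBalanced; PresetNone falls back to it in this sketch
		return "DAT + banded DFA"
	}
}

// matchEngine is a reduced, hypothetical version of the engine interface.
type matchEngine interface {
	add(keyword string)
	find(text string) []string
	arch() string
}

// naiveEngine is a placeholder that scans with strings.Contains. It reports
// matches in insertion order, not in order of occurrence in the text.
type naiveEngine struct {
	archName string
	keywords []string
}

func (e *naiveEngine) add(keyword string) { e.keywords = append(e.keywords, keyword) }
func (e *naiveEngine) arch() string       { return e.archName }
func (e *naiveEngine) find(text string) []string {
	var out []string
	for _, kw := range e.keywords {
		if strings.Contains(text, kw) {
			out = append(out, kw)
		}
	}
	return out
}

// newMatchEngine is where a real factory would return a different
// concrete engine type per preset.
func newMatchEngine(p Preset) matchEngine {
	return &naiveEngine{archName: archFor(p)}
}

func main() {
	e := newMatchEngine(PresetSpeed)
	e.add("he")
	e.add("she")
	fmt.Println(e.arch(), e.find("ushers"))
}
```

Keeping callers behind one small interface is what allows the adapter layer to swap engines without touching the public API.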

@github-actions github-actions Bot added the enhancement, dependencies, go, and breaking_changes labels Apr 19, 2026
@skyoo2003 skyoo2003 self-assigned this Apr 19, 2026

@sourcery-ai sourcery-ai Bot left a comment

Hey - I've found 3 issues, and left some high level feedback:

  • In RedisBackedAC.ensureValid you call reloadFromRedisLocked, which performs HGetAll while holding ac.mu—this can unnecessarily block all readers during slow Redis operations; consider restructuring to fetch the snapshot outside the mutex and only hold the lock while swapping in the new engine state.
  • There is quite a bit of overlap between reloadFromRedis, reloadFromRedisLocked, and readTrieSnapshot in how trie state is loaded and interpreted; consolidating these into a single snapshot-loading helper would reduce duplication and the risk of subtle divergence between code paths.
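
The first point, loading outside the mutex and locking only for the swap, might look something like the sketch below. All names are hypothetical (`cachedAC`, `load`, the generation counters); the counters are one possible way to avoid clearing staleness that arrived while a load was in flight, and the PR may solve this differently.

```go
package main

import (
	"fmt"
	"sync"
)

type engine struct{ keywords []string }

// cachedAC is a hypothetical reduction of the Redis-backed type.
type cachedAC struct {
	mu        sync.RWMutex
	eng       *engine
	staleGen  uint64                   // bumped on every invalidation
	loadedGen uint64                   // generation the current engine was built for
	load      func() ([]string, error) // slow snapshot fetch, e.g. from Redis
}

func (c *cachedAC) markStale() {
	c.mu.Lock()
	c.staleGen++
	c.mu.Unlock()
}

func (c *cachedAC) ensureValid() error {
	// Take only a brief read lock to decide whether a reload is needed.
	c.mu.RLock()
	want, have := c.staleGen, c.loadedGen
	c.mu.RUnlock()
	if have == want {
		return nil
	}

	// The slow fetch and the engine build happen with no lock held,
	// so concurrent readers keep using the old engine.
	kws, err := c.load()
	if err != nil {
		return err
	}
	next := &engine{keywords: kws}

	// Hold the write lock only for the pointer swap.
	c.mu.Lock()
	if c.loadedGen < want {
		c.eng, c.loadedGen = next, want
	}
	c.mu.Unlock()
	return nil
}

func main() {
	c := &cachedAC{
		eng:  &engine{},
		load: func() ([]string, error) { return []string{"hello"}, nil },
	}
	c.markStale()
	fmt.Println(c.ensureValid(), c.eng.keywords)
}
```

If another invalidation lands during the fetch, `staleGen` moves past `want`, so the next call reloads again instead of treating the engine as fresh.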
## Individual Comments

### Comment 1
<location path="pkg/acor/engine_flat.go" line_range="209-218" />
<code_context>
+	return matched
+}
+
+func (e *speedEngine) info() *InMemoryInfo {
+	if e.dfa == nil {
+		return &InMemoryInfo{Preset: e.preset}
+	}
+	var mem int64
+	for _, row := range e.dfa {
+		mem += int64(len(row)) * 8
+	}
+	for _, outs := range e.outputMap {
+		mem += int64(16 + len(outs)*16)
+	}
+	mem += int64(len(e.alphabet)) * 16
+	mem += int64(len(e.alphaMap)) * 24
+
+	return &InMemoryInfo{
+		Keywords:    e.countKeywords(),
+		Nodes:       e.numStates,
</code_context>
<issue_to_address>
**suggestion:** InMemoryInfo from speedEngine does not populate TrieDepth, unlike other engines

info() leaves TrieDepth at its zero value here, while the other engines populate it, which makes preset comparisons less consistent. Since you already know node depths when building the flat trie, consider tracking the maximum depth there and setting TrieDepth in InMemoryInfo for this engine as well.
</issue_to_address>

### Comment 2
<location path="pkg/acor/engine_test.go" line_range="105-106" />
<code_context>
+					for _, kw := range tc.keywords {
+						ac.Add(kw)
+					}
+					matches := ac.Find(tc.text)
+					if !equalUnordered(matches, tc.expected) {
+						t.Errorf("expected %v, got %v", tc.expected, matches)
+					}
</code_context>
<issue_to_address>
**suggestion (testing):** Tests use order-insensitive comparison while the API comment says results are returned in order of occurrence

`TestInMemoryFind` uses `equalUnordered`, but `InMemoryAC.Find` promises results "in order of occurrence". Order-insensitive checks could mask regressions if the implementation stops preserving order. Please either assert the specific expected order (especially for overlapping/repeated matches) or update the API comment if ordering is not guaranteed.

Suggested implementation:

```golang
import (
	"reflect"

```

```golang
					matches := ac.Find(tc.text)
					if !reflect.DeepEqual(matches, tc.expected) {
						t.Errorf("expected matches in order %v, got %v", tc.expected, matches)
					}

```
</issue_to_address>
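
To illustrate why the order-insensitive check can hide an ordering regression, here is a small standalone comparison. `equalUnordered` below is a guess at what the test helper does (sort copies, then compare), not the PR's actual implementation, and `equalOrdered` is a hypothetical wrapper around `reflect.DeepEqual`.

```go
package main

import (
	"fmt"
	"reflect"
	"sort"
)

// equalUnordered reports whether a and b hold the same strings in any order.
func equalUnordered(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	x := append([]string(nil), a...)
	y := append([]string(nil), b...)
	sort.Strings(x)
	sort.Strings(y)
	return reflect.DeepEqual(x, y)
}

// equalOrdered requires the same strings in the same positions.
// reflect.DeepEqual treats a nil slice and an empty slice as different.
func equalOrdered(a, b []string) bool {
	return reflect.DeepEqual(a, b)
}

func main() {
	want := []string{"she", "he", "hers"} // illustrative expected order
	got := []string{"he", "she", "hers"}  // same matches, different order

	fmt.Println(equalUnordered(got, want)) // true: the reordering goes unnoticed
	fmt.Println(equalOrdered(got, want))   // false: the reordering is caught
}
```

If ordering is part of the contract, the ordered comparison is the one that protects it.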

### Comment 3
<location path="pkg/acor/redis_backed_test.go" line_range="426-421" />
<code_context>
+	}
+}
+
+func TestRedisBackedDegradedMode(t *testing.T) {
+	mr := miniredis.RunT(t)
+	ac, err := NewRedisBacked(context.Background(), &RedisBackedArgs{
+		AhoCorasickArgs: AhoCorasickArgs{Addr: mr.Addr(), Name: "test-degraded"},
+		Preset:          PresetBalanced,
+	})
+	if err != nil {
+		t.Fatalf("NewRedisBacked: %v", err)
+	}
+	defer func() { _ = ac.Close() }()
+
+	ctx := context.Background()
+	ac.Add(ctx, "hello") //nolint:errcheck,gosec
+
+	mr.Close()
+
+	matched, err := ac.Find(ctx, "hello world")
+	if err != nil {
+		t.Fatalf("Find degraded: %v", err)
+	}
+	if len(matched) != 1 || matched[0] != testKeywordHello {
+		t.Errorf("Find degraded = %v, want [hello]", matched)
+	}
</code_context>
<issue_to_address>
**suggestion (testing):** Add a test for Find/FindIndex behavior when ensureValid fails due to context cancellation

There’s coverage for degraded mode when Redis goes away, but not for `Find`/`FindIndex` when `ensureValid` fails (e.g. due to context cancellation during reload). Since `ensureValid` is on the hot path, it’d be good to add a focused test that:

- Constructs via `newTestRedisBacked`.
- Marks the instance stale with `ac.markStale()`.
- Calls `Find`/`FindIndex` with a cancelled `context.Context`.
- Asserts the error is `context.Canceled` (or at least non-nil) and that there’s no panic.

This would directly exercise the error and cancellation path in `ensureValid` without relying on Redis failures.

Suggested implementation:

```golang
func TestEnsureValidContextCanceled(t *testing.T) {
	ac, mr := newTestRedisBacked(t, "test-ensure-valid-context-canceled")
	defer func() { _ = ac.Close() }()
	defer mr.Close()

	// Mark the instance stale so that the next Find/FindIndex calls go through ensureValid.
	ac.markStale()

	// Use a context that is already canceled to force ensureValid to fail fast.
	ctx, cancel := context.WithCancel(context.Background())
	cancel()

	_, err := ac.Find(ctx, "hello world")
	if err == nil {
		t.Fatalf("Find with canceled context: got nil error, want non-nil")
	}
	if !errors.Is(err, context.Canceled) {
		t.Fatalf("Find with canceled context: got %v, want %v", err, context.Canceled)
	}

	_, err = ac.FindIndex(ctx, "hello world")
	if err == nil {
		t.Fatalf("FindIndex with canceled context: got nil error, want non-nil")
	}
	if !errors.Is(err, context.Canceled) {
		t.Fatalf("FindIndex with canceled context: got %v, want %v", err, context.Canceled)
	}
}

func TestRedisBackedNew(t *testing.T) {
	mr := miniredis.RunT(t)
	ac, err := NewRedisBacked(context.Background(), &RedisBackedArgs{
		AhoCorasickArgs: AhoCorasickArgs{
			Addr: mr.Addr(),
			Name: "test-new",

```

1. Ensure the test file imports the `errors` package:
   - In the import block at the top of `pkg/acor/redis_backed_test.go`, add `errors` if it is not already present:
   `import ("errors" ... )`.
2. Confirm that `newTestRedisBacked` returns `(ac *RedisBacked, mr *miniredis.Miniredis)` or analogous types and that the package name of the test matches where `markStale` is defined so the test can access this unexported method.
3. If `FindIndex` has a different signature (e.g., additional parameters or a different return type), adjust the call accordingly while still using the canceled `ctx` and asserting the returned error as shown.
</issue_to_address>

Extend AhoCorasickArgs with InMemory and Preset fields so all modes
go through the same Create() constructor and AhoCorasick type.
Remove NewInMemory, NewRedisBacked, InMemoryAC, RedisBackedAC to
preserve backward compatibility of the existing public API.

- Add PresetNone (zero value) to avoid implicit preset activation
- Add inMemoryOps and presetRedisOps adapters for operations interface
- Add mode-based branching in Create() and closeFn pattern in Close()
- Guard Debug() for in-memory and preset modes
- Exclude errcheck/gosec from test files in golangci-lint config
- Update all tests and benchmarks to use unified Create API
- Update documentation to reflect unified API
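
The mode-based branching and `closeFn` pattern from this commit could be sketched as below. The types are deliberately reduced and hypothetical (`createArgs`, `automaton`, an `int` preset), the close functions are stubs, and the error text borrows the wording suggested in review rather than the merged message.

```go
package main

import (
	"errors"
	"fmt"
)

var errPresetRequiresRedis = errors.New("Preset requires a Redis configuration when InMemory is false")

type backendMode int

const (
	modeOriginal backendMode = iota
	modeInMemory
	modePresetRedis
)

// createArgs is a reduced stand-in for AhoCorasickArgs.
type createArgs struct {
	Addr     string
	InMemory bool
	Preset   int // 0 means no preset, like PresetNone
}

type automaton struct {
	mode    backendMode
	closeFn func() error
}

// create picks exactly one backend; InMemory is checked before Preset.
func create(a createArgs) (*automaton, error) {
	switch {
	case a.InMemory:
		// Nothing external to release in pure in-memory mode.
		return &automaton{mode: modeInMemory, closeFn: func() error { return nil }}, nil
	case a.Preset != 0:
		if a.Addr == "" {
			return nil, errPresetRequiresRedis
		}
		// A real closeFn would stop the Pub/Sub listener and close the client.
		return &automaton{mode: modePresetRedis, closeFn: func() error { return nil }}, nil
	default:
		// A real closeFn would close the Redis client.
		return &automaton{mode: modeOriginal, closeFn: func() error { return nil }}, nil
	}
}

// Close delegates to whatever the chosen backend registered.
func (a *automaton) Close() error {
	if a.closeFn == nil {
		return nil
	}
	fn := a.closeFn
	a.closeFn = nil // makes Close idempotent in this sketch
	return fn()
}

func main() {
	ac, err := create(createArgs{InMemory: true, Preset: 2})
	fmt.Println(ac.mode == modeInMemory, err)
	_, err = create(createArgs{Preset: 2})
	fmt.Println(err)
}
```

Storing the teardown as a closure keeps `Close` free of per-mode switch statements.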
@skyoo2003
Owner Author

@sourcery-ai review

@sourcery-ai sourcery-ai Bot left a comment

Hey - I've found 6 issues, and left some high level feedback:

  • In redisBackedAC.ensureValid/reloadFromRedisLocked, reload errors are swallowed and stale may remain true while callers see a nil error; consider at least logging or surfacing a wrapped error so operators can detect and debug persistent reload failures instead of silently staying on a stale snapshot.
  • There is a fair amount of duplicated logic between the original V2 path and the new Redis-backed preset path (e.g., trie snapshot read/unmarshal and version handling); consider factoring common helpers to reduce maintenance overhead and keep schema/V2 behavior changes in one place.
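
A minimal sketch of the first point, surfacing reload failures instead of swallowing them, is given below. It is single-goroutine for brevity and every name is hypothetical (`snapshotCache`, `load`, `logf`, `errRedisDown`); it shows the "at least log it" option, where reads still fall back to the last-good snapshot.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

var errRedisDown = errors.New("redis unavailable")

type snapshotCache struct {
	stale    bool
	keywords []string                         // last-good snapshot
	load     func() ([]string, error)         // hypothetical snapshot loader
	logf     func(format string, args ...any) // hypothetical logger hook
}

// ensureValid returns a wrapped error on failure and leaves the
// last-good snapshot and the stale flag untouched.
func (c *snapshotCache) ensureValid() error {
	if !c.stale {
		return nil
	}
	kws, err := c.load()
	if err != nil {
		return fmt.Errorf("reload trie snapshot: %w", err)
	}
	c.keywords, c.stale = kws, false
	return nil
}

// find keeps serving in degraded mode but makes the failure visible.
// The scan is a placeholder, not an Aho-Corasick search.
func (c *snapshotCache) find(text string) []string {
	if err := c.ensureValid(); err != nil {
		c.logf("serving stale snapshot: %v", err)
	}
	var out []string
	for _, kw := range c.keywords {
		if strings.Contains(text, kw) {
			out = append(out, kw)
		}
	}
	return out
}

// isRedisDown shows that the wrapped cause stays inspectable.
func isRedisDown(err error) bool { return errors.Is(err, errRedisDown) }

func main() {
	c := &snapshotCache{
		stale:    true,
		keywords: []string{"hello"},
		load:     func() ([]string, error) { return nil, errRedisDown },
		logf:     func(f string, a ...any) { fmt.Printf(f+"\n", a...) },
	}
	fmt.Println(c.find("hello world"))
}
```

Wrapping with `%w` preserves the cause for `errors.Is`, so an operator-facing layer can tell a Redis outage from a decode error.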
## Individual Comments

### Comment 1
<location path="pkg/acor/acor.go" line_range="260-269" />
<code_context>
 	// complete even if the caller's context is already canceled.
 	RollbackTimeout time.Duration
+
+	// InMemory enables pure in-memory mode with no Redis dependency.
+	// When true, Addr, Addrs, MasterName, RingAddrs, Password, DB, SchemaVersion,
+	// and EnableCache must all be unset (zero values). A Preset may optionally be
+	// specified to select the engine architecture (defaults to PresetBalanced).
+	InMemory bool
+
+	// Preset selects the architecture for the local match engine.
+	// When set and InMemory is true, selects the in-memory engine architecture.
+	// When set and InMemory is false, uses Redis-backed engine with a local
+	// preset-optimized automaton for fast reads.
+	// When unset (zero), the original AhoCorasick engine is used.
+	// Preset mode forces V2 schema and is incompatible with EnableCache.
+	Preset Preset
 }

</code_context>
<issue_to_address>
**issue (bug_risk):** The documented invariants for InMemory (SchemaVersion, EnableCache) are not fully enforced in Create.

The InMemory docs require SchemaVersion and EnableCache to be unset, but the Create() path for args.InMemory only relies on hasAnyRedisConfig(), which doesn’t consider those fields. As a result, callers can set InMemory=true along with SchemaVersion/EnableCache and still get an in-memory instance, violating the documented contract. Please either extend hasAnyRedisConfig() to include SchemaVersion/EnableCache or add explicit checks in the InMemory branch so behavior matches the documented invariants.
</issue_to_address>
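
The explicit-check option could be sketched as follows. The field names come from the doc comment quoted above, but the field types, the `Args` struct, and the `validateInMemory` helper are assumptions made for this example.

```go
package main

import (
	"errors"
	"fmt"
)

var ErrInMemoryWithRedisConfig = errors.New("InMemory mode cannot be used with Redis configuration")

// Args is a reduced stand-in for AhoCorasickArgs; field types are assumed.
type Args struct {
	Addr          string
	Addrs         []string
	MasterName    string
	RingAddrs     map[string]string
	Password      string
	DB            int
	SchemaVersion int
	EnableCache   bool
	InMemory      bool
}

// hasAnyRedisConfig covers only the connection-related fields.
func hasAnyRedisConfig(a *Args) bool {
	return a.Addr != "" || len(a.Addrs) > 0 || a.MasterName != "" ||
		len(a.RingAddrs) > 0 || a.Password != "" || a.DB != 0
}

// validateInMemory also rejects SchemaVersion and EnableCache, so the
// behavior matches the documented "must all be unset" contract.
func validateInMemory(a *Args) error {
	if !a.InMemory {
		return nil
	}
	if hasAnyRedisConfig(a) || a.SchemaVersion != 0 || a.EnableCache {
		return ErrInMemoryWithRedisConfig
	}
	return nil
}

func main() {
	fmt.Println(validateInMemory(&Args{InMemory: true}))
	fmt.Println(validateInMemory(&Args{InMemory: true, EnableCache: true}))
}
```

Keeping the two extra checks out of `hasAnyRedisConfig` avoids changing what that helper means for the preset-Redis branch, which is one reason to prefer explicit checks here.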

### Comment 2
<location path="pkg/acor/errors.go" line_range="30-32" />
<code_context>
+	// ErrInMemoryWithRedisConfig is returned when InMemory is true but Redis
+	// configuration fields are also set.
+	ErrInMemoryWithRedisConfig = errors.New("InMemory mode cannot be used with Redis configuration")
+	// ErrPresetRequiresRedis is returned when a Preset is specified without
+	// InMemory=true and without any Redis address.
+	ErrPresetRequiresRedis = errors.New("Preset requires either InMemory mode or a Redis address")
+	// ErrPresetRequiresV2 is returned when a Preset is set with SchemaVersion=1.
+	ErrPresetRequiresV2 = errors.New("Preset engine requires V2 schema")
</code_context>
<issue_to_address>
**issue:** ErrPresetRequiresRedis message is inconsistent with how Preset is actually validated.

Create() routes InMemory=true into the in-memory branch before the Preset branch, and the Preset branch requires hasAnyRedisConfig()==true with InMemory=false. So a Preset with InMemory=true never hits the Preset-Redis branch, and a Preset without Redis always yields ErrPresetRequiresRedis regardless of InMemory. Please either update the error text to reflect the actual constraint (e.g. "Preset requires a Redis configuration when InMemory is false") or adjust the branching logic if the intention is to support both modes.
</issue_to_address>

### Comment 3
<location path="pkg/acor/redis_backed_test.go" line_range="434-442" />
<code_context>
+	}
+}
+
+func TestPresetRedisSuggestError(t *testing.T) {
+	ac := newTestPresetRedis(t, PresetBalanced)
+	_, err := ac.Suggest("he")
</code_context>
<issue_to_address>
**suggestion (testing):** Also cover SuggestIndex error behavior in preset-Redis mode

Please add a similar test for `SuggestIndex` (e.g. `TestPresetRedisSuggestIndexError`) that asserts it also returns `ErrSuggestRequiresRedis` in preset-Redis mode, so both methods’ error contracts are covered and protected against regressions.

```suggestion
func TestPresetRedisSuggestError(t *testing.T) {
	ac := newTestPresetRedis(t, PresetBalanced)
	_, err := ac.Suggest("he")
	if err != ErrSuggestRequiresRedis {
		t.Errorf("expected ErrSuggestRequiresRedis, got %v", err)
	}
}

func TestPresetRedisSuggestIndexError(t *testing.T) {
	ac := newTestPresetRedis(t, PresetBalanced)
	_, err := ac.SuggestIndex("he")
	if err != ErrSuggestRequiresRedis {
		t.Errorf("expected ErrSuggestRequiresRedis, got %v", err)
	}
}

func TestPresetRedisV1Error(t *testing.T) {
```
</issue_to_address>

### Comment 4
<location path="pkg/acor/engine_test.go" line_range="32-33" />
<code_context>
+func TestInMemoryAdd(t *testing.T) {
+	for _, preset := range allPresets() {
+		t.Run(preset.String(), func(t *testing.T) {
+			ac := createTestInMemory(t, preset)
+			ac.Close()
+
+			added, err := ac.Add("hello")
</code_context>
<issue_to_address>
**suggestion (testing):** Avoid closing in-memory instances at the start of tests or explicitly assert post-close behavior

Several in-memory tests (e.g., `TestInMemoryAdd`, `TestInMemoryRemove`, `TestInMemoryFind`) call `createTestInMemory` and then immediately `ac.Close()` before using `ac`. This makes the tests depend on undocumented post-close behavior and could let them keep passing even if `Close` later starts returning errors or invalidating the instance. Consider either registering the close with `t.Cleanup` so the instance stays open during the test, or adding explicit assertions that operations after `Close` are still valid and error-free.

Suggested implementation:

```golang
func TestInMemoryAdd(t *testing.T) {
	for _, preset := range allPresets() {
		t.Run(preset.String(), func(t *testing.T) {
			ac := createTestInMemory(t, preset)
			// t.Cleanup takes a func(), so wrap Close and discard its error.
			t.Cleanup(func() { _ = ac.Close() })

			added, err := ac.Add("hello")

```

The same pattern should be applied to other in-memory tests that currently call `ac.Close()` immediately after `createTestInMemory` (e.g., `TestInMemoryRemove`, `TestInMemoryFind`, and any others with this pattern). For each such test:
1. Replace `ac.Close()` directly after initialization with `t.Cleanup(func() { _ = ac.Close() })` so the instance remains valid during the test.
2. If you intentionally want to test post-close behavior, keep `ac.Close()` but then add explicit assertions about the expected behavior of subsequent operations after close (errors or success), rather than relying on implicit behavior.
</issue_to_address>

### Comment 5
<location path="pkg/acor/engine_test.go" line_range="336-345" />
<code_context>
+	}
+}
+
+func TestInMemorySuggestError(t *testing.T) {
+	ac := createTestInMemory(t, PresetBalanced)
+	ac.Close()
</code_context>
<issue_to_address>
**suggestion (testing):** Add a test for SuggestIndex error in in-memory mode

There’s no equivalent coverage for `SuggestIndex`. Please add a `TestInMemorySuggestIndexError` (mirroring the preset-Redis tests) that asserts `SuggestIndex` returns `ErrSuggestRequiresRedis` in in-memory mode, so both APIs are consistently guarded against this configuration and regressions are less likely.

```suggestion
func TestInMemorySuggestError(t *testing.T) {
	ac := createTestInMemory(t, PresetBalanced)
	ac.Close()
	_, err := ac.Suggest("he")
	if err != ErrSuggestRequiresRedis {
		t.Errorf("expected ErrSuggestRequiresRedis, got %v", err)
	}
}

func TestInMemorySuggestIndexError(t *testing.T) {
	ac := createTestInMemory(t, PresetBalanced)
	ac.Close()
	_, err := ac.SuggestIndex("he")
	if err != ErrSuggestRequiresRedis {
		t.Errorf("expected ErrSuggestRequiresRedis, got %v", err)
	}
}

func TestInMemoryWithRedisConfigError(t *testing.T) {
```
</issue_to_address>

### Comment 6
<location path="pkg/acor/redis_backed_test.go" line_range="442" />
<code_context>
+	}
+}
+
+func TestPresetRedisV1Error(t *testing.T) {
+	_, err := Create(&AhoCorasickArgs{
+		Addr:          "localhost:6379",
</code_context>
<issue_to_address>
**suggestion (testing):** Add a test for Preset with no Redis and InMemory=false (ErrPresetRequiresRedis)

One remaining constructor edge case is when `Preset` is set, `InMemory` is false, and no Redis address is provided. In that case `Create` should return `ErrPresetRequiresRedis`. Please add a test (e.g. `TestPresetRequiresRedisError`) that calls `Create(&AhoCorasickArgs{Name: "x", Preset: PresetBalanced})` with no `Addr` and asserts that `ErrPresetRequiresRedis` is returned.
</issue_to_address>


Comment thread pkg/acor/acor.go Outdated
Comment thread pkg/acor/errors.go Outdated
Comment on lines +434 to +442
func TestPresetRedisSuggestError(t *testing.T) {
ac := newTestPresetRedis(t, PresetBalanced)
_, err := ac.Suggest("he")
if err != ErrSuggestRequiresRedis {
t.Errorf("expected ErrSuggestRequiresRedis, got %v", err)
}
}

func TestPresetRedisV1Error(t *testing.T) {

suggestion (testing): Also cover SuggestIndex error behavior in preset-Redis mode

Please add a similar test for SuggestIndex (e.g. TestPresetRedisSuggestIndexError) that asserts it also returns ErrSuggestRequiresRedis in preset-Redis mode, so both methods’ error contracts are covered and protected against regressions.

Suggested change
func TestPresetRedisSuggestError(t *testing.T) {
	ac := newTestPresetRedis(t, PresetBalanced)
	_, err := ac.Suggest("he")
	if err != ErrSuggestRequiresRedis {
		t.Errorf("expected ErrSuggestRequiresRedis, got %v", err)
	}
}

func TestPresetRedisSuggestIndexError(t *testing.T) {
	ac := newTestPresetRedis(t, PresetBalanced)
	_, err := ac.SuggestIndex("he")
	if err != ErrSuggestRequiresRedis {
		t.Errorf("expected ErrSuggestRequiresRedis, got %v", err)
	}
}

func TestPresetRedisV1Error(t *testing.T) {

Comment thread pkg/acor/engine_test.go Outdated
Comment on lines +32 to +33
ac := createTestInMemory(t, preset)
ac.Close()

suggestion (testing): Avoid closing in-memory instances at the start of tests or explicitly assert post-close behavior

Several in-memory tests (e.g., TestInMemoryAdd, TestInMemoryRemove, TestInMemoryFind) call createTestInMemory and then immediately ac.Close() before using ac. This makes the tests depend on undocumented post-close behavior and could let them keep passing even if Close later starts returning errors or invalidating the instance. Consider either using t.Cleanup(ac.Close) so the instance stays open during the test, or adding explicit assertions that operations after Close are still valid and error-free.

Suggested implementation:

func TestInMemoryAdd(t *testing.T) {
	for _, preset := range allPresets() {
		t.Run(preset.String(), func(t *testing.T) {
			ac := createTestInMemory(t, preset)
			t.Cleanup(func() { _ = ac.Close() }) // wrapper needed if Close returns an error

			added, err := ac.Add("hello")

The same pattern should be applied to other in-memory tests that currently call ac.Close() immediately after createTestInMemory (e.g., TestInMemoryRemove, TestInMemoryFind, and any others with this pattern). For each such test:

  1. Replace ac.Close() directly after initialization with t.Cleanup(ac.Close) so the instance remains valid during the test.
  2. If you intentionally want to test post-close behavior, keep ac.Close() but then add explicit assertions about the expected behavior of subsequent operations after close (errors or success), rather than relying on implicit behavior.
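
If post-close behavior is meant to be part of the contract, one way to make it explicit is a sentinel error that tests can check with `errors.Is`. The sketch below is only an illustration: the `matcher` type and the `errClosed` sentinel are hypothetical stand-ins and are not part of the acor API.

```go
package main

import (
	"errors"
	"fmt"
)

// errClosed is a hypothetical sentinel; acor may expose no such error.
var errClosed = errors.New("matcher: use after close")

// matcher is a stand-in for an automaton handle with an explicit closed state.
type matcher struct {
	keywords map[string]struct{}
	closed   bool
}

func newMatcher() *matcher {
	return &matcher{keywords: make(map[string]struct{})}
}

// Add reports how many keywords were newly inserted (0 or 1) and
// fails with errClosed once the handle has been closed.
func (m *matcher) Add(kw string) (int, error) {
	if m.closed {
		return 0, errClosed
	}
	if _, ok := m.keywords[kw]; ok {
		return 0, nil
	}
	m.keywords[kw] = struct{}{}
	return 1, nil
}

// Close marks the handle unusable; calling it twice is harmless.
func (m *matcher) Close() error {
	m.closed = true
	return nil
}

// addAfterClose reports whether Add on a closed handle yields errClosed.
func addAfterClose() bool {
	m := newMatcher()
	_ = m.Close()
	_, err := m.Add("hello")
	return errors.Is(err, errClosed)
}

func main() {
	m := newMatcher()
	added, err := m.Add("hello")
	fmt.Println(added, err)      // open handle: the insert succeeds
	fmt.Println(addAfterClose()) // closed handle: the sentinel is returned
}
```

With a contract like this, a test that closes the handle first states what it expects instead of relying on whatever the implementation happens to do.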

Comment thread pkg/acor/engine_test.go Outdated
Comment on lines +336 to +345
func TestInMemorySuggestError(t *testing.T) {
ac := createTestInMemory(t, PresetBalanced)
ac.Close()
_, err := ac.Suggest("he")
if err != ErrSuggestRequiresRedis {
t.Errorf("expected ErrSuggestRequiresRedis, got %v", err)
}
}

func TestInMemoryWithRedisConfigError(t *testing.T) {

suggestion (testing): Add a test for SuggestIndex error in in-memory mode

There’s no equivalent coverage for SuggestIndex. Please add a TestInMemorySuggestIndexError (mirroring the preset-Redis tests) that asserts SuggestIndex returns ErrSuggestRequiresRedis in in-memory mode, so both APIs are consistently guarded against this configuration and regressions are less likely.

Suggested change
func TestInMemorySuggestError(t *testing.T) {
	ac := createTestInMemory(t, PresetBalanced)
	ac.Close()
	_, err := ac.Suggest("he")
	if err != ErrSuggestRequiresRedis {
		t.Errorf("expected ErrSuggestRequiresRedis, got %v", err)
	}
}

func TestInMemorySuggestIndexError(t *testing.T) {
	ac := createTestInMemory(t, PresetBalanced)
	ac.Close()
	_, err := ac.SuggestIndex("he")
	if err != ErrSuggestRequiresRedis {
		t.Errorf("expected ErrSuggestRequiresRedis, got %v", err)
	}
}

func TestInMemoryWithRedisConfigError(t *testing.T) {

}
}

func TestPresetRedisV1Error(t *testing.T) {

suggestion (testing): Add a test for Preset with no Redis and InMemory=false (ErrPresetRequiresRedis)

One remaining constructor edge case is when Preset is set, InMemory is false, and no Redis address is provided. In that case Create should return ErrPresetRequiresRedis. Please add a test (e.g. TestPresetRequiresRedisError) that calls Create(&AhoCorasickArgs{Name: "x", Preset: PresetBalanced}) with no Addr and asserts that ErrPresetRequiresRedis is returned.
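
A rough sketch of the edge case described above, assuming a simplified validation step. The `args` struct and `validate` function are hypothetical stand-ins for `AhoCorasickArgs` and `Create`, and the error message is illustrative. The check uses `errors.Is` rather than `!=` so it keeps working if the constructor ever wraps the sentinel.

```go
package main

import (
	"errors"
	"fmt"
)

// errPresetRequiresRedis mirrors the name used in the review; the message is illustrative.
var errPresetRequiresRedis = errors.New("preset requires a Redis address")

// args is a hypothetical, trimmed-down stand-in for AhoCorasickArgs.
type args struct {
	Name     string
	Addr     string
	Preset   string // empty means "no preset"
	InMemory bool
}

// validate rejects a preset that has neither Redis nor in-memory mode to run on.
func validate(a args) error {
	if a.Preset != "" && !a.InMemory && a.Addr == "" {
		return fmt.Errorf("create %q: %w", a.Name, errPresetRequiresRedis)
	}
	return nil
}

// isPresetRequiresRedis reports whether validate fails with the sentinel,
// even when the sentinel is wrapped with extra context.
func isPresetRequiresRedis(a args) bool {
	return errors.Is(validate(a), errPresetRequiresRedis)
}

func main() {
	cases := []args{
		{Name: "x", Preset: "balanced"},                         // no Addr, not in-memory
		{Name: "x", Preset: "balanced", Addr: "localhost:6379"}, // Redis configured
		{Name: "x", Preset: "balanced", InMemory: true},         // in-memory preset
	}
	for _, c := range cases {
		fmt.Printf("%+v -> %v\n", c, validate(c))
	}
}
```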

- Enforce InMemory invariants: reject SchemaVersion and EnableCache
  when InMemory=true (matches documented contract)
- Update ErrInMemoryWithRedisConfig message to include SchemaVersion
  and EnableCache
- Update ErrPresetRequiresRedis message to reflect actual constraint
- Surface reload errors in ensureValid instead of swallowing them
- Replace ac.Close() with t.Cleanup in in-memory tests so instances
  remain valid during test execution
- Add TestPresetRedisSuggestIndexError for SuggestIndex error coverage
- Add TestInMemorySuggestIndexError for SuggestIndex error coverage
- Add TestPresetRequiresRedisError for constructor edge case
- Expand TestInMemoryWithRedisConfigError with sub-tests for
  SchemaVersion and EnableCache
@skyoo2003
Owner Author

@sourcery-ai review

@sourcery-ai sourcery-ai Bot left a comment

Hey - I've found 4 issues, and left some high level feedback:

  • In TestInMemoryCaseSensitive you call ac.Close() and then continue to use ac (calling Add and Find), which violates the usual expectation that a closed handle is no longer usable and could mask bugs in future Close semantics.
  • AhoCorasick.Debug becomes a no-op for the new in-memory and preset-Redis modes, which might surprise callers who relied on debug output; consider either documenting this explicitly or adding a basic debug implementation for the new backends.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In `TestInMemoryCaseSensitive` you call `ac.Close()` and then continue to use `ac` (calling `Add` and `Find`), which violates the usual expectation that a closed handle is no longer usable and could mask bugs in future Close semantics.
- `AhoCorasick.Debug` becomes a no-op for the new in-memory and preset-Redis modes, which might surprise callers who relied on debug output; consider either documenting this explicitly or adding a basic debug implementation for the new backends.

## Individual Comments

### Comment 1
<location path="pkg/acor/redis_backed.go" line_range="143-152" />
<code_context>
+func (ac *redisBackedAC) reloadFromRedis(ctx context.Context) error {
</code_context>
<issue_to_address>
**suggestion:** Deduplicate the reload-from-Redis logic to avoid divergence between locked and unlocked paths.

`reloadFromRedis` and `reloadFromRedisLocked` are almost identical except for locking, which risks future changes (e.g., new trie state fields) being applied to only one path. Please extract the common `HGETALL` + unmarshal logic into a shared helper that returns the parsed data, and let each function focus only on locking and assigning the results. This centralizes correctness-sensitive logic and simplifies maintenance.

Suggested implementation:

```golang
func (ac *redisBackedAC) loadTrieDataFromRedis(ctx context.Context) (keywords []string, suffixes []string, version int64, err error) {
	trieData, err := ac.storage.HGetAll(ctx, trieKey(ac.name))
	if err != nil {
		return nil, nil, 0, fmt.Errorf("HGETALL %s: %w", trieKey(ac.name), err)
	}

	if data, ok := trieData["keywords"]; ok {
		if err := json.Unmarshal([]byte(data), &keywords); err != nil {
			return nil, nil, 0, fmt.Errorf("unmarshal keywords: %w", err)
		}
	}

	if data, ok := trieData["suffixes"]; ok {
		if err := json.Unmarshal([]byte(data), &suffixes); err != nil {
			return nil, nil, 0, fmt.Errorf("unmarshal suffixes: %w", err)
		}
	}

	if data, ok := trieData["version"]; ok {
		v, errParse := strconv.ParseInt(data, 10, 64)
		if errParse != nil {
			return nil, nil, 0, fmt.Errorf("parse version: %w", errParse)
		}
		version = v
	}

	return keywords, suffixes, version, nil
}

func (ac *redisBackedAC) reloadFromRedis(ctx context.Context) error {

```

```golang
func (ac *redisBackedAC) reloadFromRedis(ctx context.Context) error {
	keywords, suffixes, version, err := ac.loadTrieDataFromRedis(ctx)
	if err != nil {
		return err
	}

	// NOTE: the assignment to ac's trie state should remain here,
	// reusing the existing logic but now using keywords, suffixes, and version.

```

1. In the same file, update `reloadFromRedisLocked` to call `ac.loadTrieDataFromRedis(ctx)` and remove its duplicated `HGetAll`/`json.Unmarshal`/version-parsing code. It should:
   - Acquire the lock as it currently does.
   - Call `loadTrieDataFromRedis`.
   - On success, assign `keywords`, `suffixes`, `version` to the locked trie state.
2. Anywhere in `reloadFromRedis` and `reloadFromRedisLocked` that currently references `trieData["suffixes"]`, `trieData["version"]`, or does separate JSON unmarshalling/parsing should be replaced to use the `suffixes` and `version` returned from `loadTrieDataFromRedis`.
3. Ensure the file imports `strconv` if it is not already imported:
   - Add `strconv` to the existing import block: `import ( ... "strconv" ... )`.
4. If the trie state struct has additional fields that were previously deserialized in both functions (e.g., new fields added in future), move that deserialization into `loadTrieDataFromRedis` so both paths stay in sync.
</issue_to_address>

### Comment 2
<location path="pkg/acor/engine_test.go" line_range="199-207" />
<code_context>
+	}
+}
+
+func TestInMemoryCaseSensitive(t *testing.T) {
+	for _, preset := range allPresets() {
+		t.Run(preset.String(), func(t *testing.T) {
+			ac, _ := Create(&AhoCorasickArgs{
+				InMemory: true,
+				Name:     "test",
+				Preset:   preset,
+			})
+			ac.Close()
+			ac.Add("Hello")
+			if matches, _ := ac.Find("say HELLO world"); len(matches) == 0 {
</code_context>
<issue_to_address>
**issue (testing):** Test uses an AhoCorasick instance after it has been closed, which makes the assertions unreliable

In `TestInMemoryCaseSensitive`, `ac` is closed before `ac.Add("Hello")` and `ac.Find(...)`, so the test is exercising undefined behavior rather than case sensitivity. Please either delay `ac.Close()` until after the assertions, or create a new instance for the case-insensitive path and close it via `t.Cleanup` so the test reliably verifies case handling on a valid automaton.
</issue_to_address>

### Comment 3
<location path="pkg/acor/benchmark_redis_backed_test.go" line_range="48" />
<code_context>
+	}
+}
+
+func BenchmarkInMemoryAdd(b *testing.B) {
+	for _, preset := range allPresets() {
+		b.Run(preset.String(), func(b *testing.B) {
+			b.ResetTimer()
+			for i := 0; i < b.N; i++ {
</code_context>
<issue_to_address>
**issue (performance):** BenchmarkInMemoryAdd creates instances inside the benchmark loop and registers b.Cleanup each iteration, which distorts results

This pattern both inflates `b.Cleanup` registrations and measures construction/teardown rather than steady-state `Add` performance. Instead, initialize the `AhoCorasick` instance once per sub-benchmark before `b.ResetTimer`, register a single cleanup, and only loop over `ac.Add` inside the timed section, e.g.:

```go
for _, preset := range allPresets() {
    b.Run(preset.String(), func(b *testing.B) {
        ac := createTestInMemory(b, preset)
        b.Cleanup(func() { _ = ac.Close() })
        b.ResetTimer()
        for i := 0; i < b.N; i++ {
            for j := 0; j < 100; j++ {
                ac.Add(fmt.Sprintf("keyword%d", j))
            }
        }
    })
}
```
</issue_to_address>

### Comment 4
<location path="docs/content/guides/in-memory-engine.md" line_range="96-97" />
<code_context>
+ac.Remove("keyword")
+
+// Find
+matches := ac.Find("text")          // []string
+positions := ac.FindIndex("text")   // map[string][]int
+
+// Stats
</code_context>
<issue_to_address>
**issue:** Update `Find` and `FindIndex` examples to include the error return for consistency with the actual API.

Elsewhere in the docs (e.g., Quick Start), these functions are shown as returning `(result, error)` (e.g., `matches, _ := ac.Find(...)`), but here they appear to return a single value, implying a different signature. Please update these examples to include the error return (e.g., `matches, err := ac.Find("text")`, `positions, err := ac.FindIndex("text")`) to align with the real API and maintain consistency across the documentation.
</issue_to_address>


Comment thread pkg/acor/redis_backed.go Outdated
Comment thread pkg/acor/engine_test.go Outdated
func BenchmarkEngineFind(b *testing.B) {
keywords := []string{"he", "she", "his", "hers", "hello", "world", "benchmark"}
for _, preset := range allPresets() {
b.Run(preset.String(), func(b *testing.B) {

issue (performance): BenchmarkInMemoryAdd creates instances inside the benchmark loop and registers b.Cleanup each iteration, which distorts results

This pattern both inflates b.Cleanup registrations and measures construction/teardown rather than steady-state Add performance. Instead, initialize the AhoCorasick instance once per sub-benchmark before b.ResetTimer, register a single cleanup, and only loop over ac.Add inside the timed section, e.g.:

for _, preset := range allPresets() {
    b.Run(preset.String(), func(b *testing.B) {
        ac := createTestInMemory(b, preset)
        b.Cleanup(func() { _ = ac.Close() })
        b.ResetTimer()
        for i := 0; i < b.N; i++ {
            for j := 0; j < 100; j++ {
                ac.Add(fmt.Sprintf("keyword%d", j))
            }
        }
    })
}

Comment thread docs/content/guides/in-memory-engine.md Outdated
- Deduplicate reload-from-Redis logic: extract loadTrieDataFromRedis
  and applyReload helpers, remove reloadFromRedisLocked
- Fix TestInMemoryCaseSensitive to use t.Cleanup instead of Close
  before assertions, preventing use-after-close
- Fix BenchmarkInMemoryAdd to create instance once outside the
  benchmark loop instead of per-iteration
- Update in-memory engine docs to show error returns on Find/FindIndex
- Document that Debug() is a no-op for in-memory and preset-Redis modes
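
The split between loading and applying can be sketched roughly as follows. This is not the PR's code: `snapshot`, `parseSnapshot`, and `cache` are hypothetical names, and a plain `map[string]string` stands in for the result of `HGETALL`. The point is that parsing takes no locks and touches no shared state, so every reload path can reuse it and only the final assignment needs the lock.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"sync"
)

// snapshot holds the fields named in the review; the layout is an assumption.
type snapshot struct {
	keywords []string
	suffixes []string
	version  int64
}

// parseSnapshot decodes a hash as HGETALL would return it.
// It is lock-free and has no side effects.
func parseSnapshot(fields map[string]string) (snapshot, error) {
	var s snapshot
	if raw, ok := fields["keywords"]; ok {
		if err := json.Unmarshal([]byte(raw), &s.keywords); err != nil {
			return snapshot{}, fmt.Errorf("unmarshal keywords: %w", err)
		}
	}
	if raw, ok := fields["suffixes"]; ok {
		if err := json.Unmarshal([]byte(raw), &s.suffixes); err != nil {
			return snapshot{}, fmt.Errorf("unmarshal suffixes: %w", err)
		}
	}
	if raw, ok := fields["version"]; ok {
		v, err := strconv.ParseInt(raw, 10, 64)
		if err != nil {
			return snapshot{}, fmt.Errorf("parse version: %w", err)
		}
		s.version = v
	}
	return s, nil
}

// cache is a hypothetical holder for the local automaton state.
type cache struct {
	mu   sync.RWMutex
	snap snapshot
}

// apply swaps in a parsed snapshot under the write lock.
func (c *cache) apply(s snapshot) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.snap = s
}

// version reads the current version under the read lock.
func (c *cache) version() int64 {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return c.snap.version
}

func main() {
	fields := map[string]string{
		"keywords": `["he","she"]`,
		"suffixes": `["e","he"]`,
		"version":  "7",
	}
	s, err := parseSnapshot(fields)
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	var c cache
	c.apply(s)
	fmt.Println(c.version(), len(s.keywords), len(s.suffixes))
}
```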
@skyoo2003
Owner Author

@sourcery-ai review

@sourcery-ai sourcery-ai Bot left a comment

Hey - I've found 2 issues, and left some high level feedback:

  • The new in-memory and preset-Redis paths (createInMemory, createPresetRedis, redisBackedAC) hard-code context.Background() for long-lived operations, which ignores caller cancellation; consider threading the caller’s context (or a derived one) through so shutdown and reload behavior can respect upstream cancellation/timeouts.
  • Keyword normalization (trim + optional lowercasing) is duplicated across inMemoryOps and redisBackedAC methods; consider extracting a shared helper to reduce drift and keep behavior consistent across backends.
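
As a rough illustration of the first point, a reload helper could accept the caller's context and derive a bounded one from it. Everything named here (`fetchFunc`, `reload`, the timeout value) is hypothetical; the PR's `Create` API takes no context, so this only sketches what threading one through might look like.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// fetchFunc stands in for a Redis read; it is expected to honor ctx.
type fetchFunc func(ctx context.Context) (int64, error)

// reload derives a bounded context from the caller's, so both upstream
// cancellation and the local timeout can stop the fetch.
func reload(ctx context.Context, timeout time.Duration, fetch fetchFunc) (int64, error) {
	ctx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()

	// Bail out early if the caller has already given up.
	if err := ctx.Err(); err != nil {
		return 0, fmt.Errorf("reload: %w", err)
	}
	v, err := fetch(ctx)
	if err != nil {
		return 0, fmt.Errorf("reload: %w", err)
	}
	return v, nil
}

// reloadCanceled reports whether a pre-canceled caller context aborts
// reload before any fetch happens.
func reloadCanceled() bool {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	called := false
	_, err := reload(ctx, time.Second, func(context.Context) (int64, error) {
		called = true
		return 1, nil
	})
	return errors.Is(err, context.Canceled) && !called
}

// reloadOK runs reload with a live context and a fetch that returns v.
func reloadOK(v int64) (int64, error) {
	return reload(context.Background(), time.Second, func(context.Context) (int64, error) {
		return v, nil
	})
}

func main() {
	fmt.Println(reloadOK(42))
	fmt.Println(reloadCanceled())
}
```

With `context.Background()` hard-coded instead, the canceled case above would proceed to the fetch regardless of what the caller wanted.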
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The new in-memory and preset-Redis paths (`createInMemory`, `createPresetRedis`, `redisBackedAC`) hard-code `context.Background()` for long-lived operations, which ignores caller cancellation; consider threading the caller’s context (or a derived one) through so shutdown and reload behavior can respect upstream cancellation/timeouts.
- Keyword normalization (trim + optional lowercasing) is duplicated across `inMemoryOps` and `redisBackedAC` methods; consider extracting a shared helper to reduce drift and keep behavior consistent across backends.

## Individual Comments

### Comment 1
<location path="pkg/acor/engine_flat.go" line_range="241-250" />
<code_context>
+	return len(seen)
+}
+
+func sortRunes(runes []rune) {
+	for i := 1; i < len(runes); i++ {
+		key := runes[i]
+		j := i - 1
+		for j >= 0 && runes[j] > key {
+			runes[j+1] = runes[j]
+			j--
+		}
+		runes[j+1] = key
+	}
+}
+
+func sortRunesFromPairs(pairs []struct {
+	ch    rune
+	child int
+}) {
+	for i := 1; i < len(pairs); i++ {
+		key := pairs[i]
+		j := i - 1
+		for j >= 0 && pairs[j].ch > key.ch {
+			pairs[j+1] = pairs[j]
+			j--
+		}
+		pairs[j+1] = key
+	}
+}
+
+func sortStrings(s []string) {
+	for i := 1; i < len(s); i++ {
+		key := s[i]
</code_context>
<issue_to_address>
**suggestion (performance):** Custom insertion-sort implementations can be slower and harder to maintain than using the standard library sort package.

sortRunes, sortRunesFromPairs, and sortStrings all manually implement insertion sort, which is O(n²) and can be slow for larger inputs. The standard library’s sort.Slice/sort.Strings use more efficient, well-optimized algorithms and are easier to read. Unless you have benchmarks showing these custom sorts are faster for your typical input sizes, consider replacing them with the standard sort package.

Suggested implementation:

```golang
func sortRunes(runes []rune) {
	sort.Slice(runes, func(i, j int) bool {
		return runes[i] < runes[j]
	})
}

```

```golang
func sortRunesFromPairs(pairs []struct {
	ch    rune
	child int
}) {
	sort.Slice(pairs, func(i, j int) bool {
		return pairs[i].ch < pairs[j].ch
	})
}

```

```golang
func sortStrings(s []string) {
	sort.Strings(s)
}

```

You’ll also need to ensure the file imports the `sort` package. If you have a standard Go import block like:

```go
import (
    "some/package"
)
```

add `"sort"` to it, for example:

```go
import (
    "sort"
    "some/package"
)
```

Adjust the exact location/order to match your existing import style.
</issue_to_address>
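
One detail worth checking when making this swap: the insertion sorts shown above are stable, while `sort.Slice` is not. That makes no difference if each rune appears at most once per node, which seems likely for trie transitions but is an assumption here. If equal keys can occur and their relative order matters, `sort.SliceStable` preserves it. A small sketch, with an illustrative `edge` type standing in for the anonymous struct:

```go
package main

import (
	"fmt"
	"sort"
)

// edge pairs a transition rune with a child node index (names are illustrative).
type edge struct {
	ch    rune
	child int
}

// sortEdges orders transitions by rune. sort.Slice is not stable, which is
// fine when each rune appears at most once.
func sortEdges(edges []edge) {
	sort.Slice(edges, func(i, j int) bool { return edges[i].ch < edges[j].ch })
}

// sortEdgesStable keeps the original relative order of equal runes,
// matching what a stable insertion sort would produce.
func sortEdgesStable(edges []edge) {
	sort.SliceStable(edges, func(i, j int) bool { return edges[i].ch < edges[j].ch })
}

func main() {
	edges := []edge{{'s', 3}, {'h', 1}, {'w', 2}}
	sortEdges(edges)
	fmt.Println(edges)

	dup := []edge{{'b', 1}, {'a', 2}, {'b', 3}, {'a', 4}}
	sortEdgesStable(dup)
	fmt.Println(dup)

	words := []string{"she", "he", "hers", "his"}
	sort.Strings(words)
	fmt.Println(words)
}
```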

### Comment 2
<location path="pkg/acor/engine_test.go" line_range="199-202" />
<code_context>
+	}
+}
+
+func TestInMemoryCaseSensitive(t *testing.T) {
+	for _, preset := range allPresets() {
+		t.Run(preset.String(), func(t *testing.T) {
+			ac, _ := Create(&AhoCorasickArgs{
+				InMemory: true,
+				Name:     "test",
</code_context>
<issue_to_address>
**suggestion (testing):** Check and assert `Create` errors in `TestInMemoryCaseSensitive` instead of ignoring them

In `TestInMemoryCaseSensitive`, both `Create` calls discard the error (`ac, _ := ...`, `ac2, _ := ...`). If construction fails, the test may panic later with an unclear cause. Please capture and assert `err` is nil for both instances and fail the test immediately otherwise, consistent with the other tests.

Suggested implementation:

```golang
			ac, err := Create(&AhoCorasickArgs{
				InMemory: true,
				Name:     "test",
				Preset:   preset,
			})
			if err != nil {
				t.Fatalf("failed to create in-memory case-sensitive automaton: %v", err)
			}
			t.Cleanup(func() { _ = ac.Close() })

```

```golang
			ac2, err := Create(&AhoCorasickArgs{

```

Right after the full `AhoCorasickArgs` literal for the `ac2` creation (i.e., immediately after the closing `})` of `Create(&AhoCorasickArgs{...})`), add an error check analogous to the first one:

```go
			if err != nil {
				t.Fatalf("failed to create second in-memory case-sensitive automaton: %v", err)
			}
```

This ensures both `Create` calls in `TestInMemoryCaseSensitive` assert `err` is nil before proceeding.
</issue_to_address>


Comment thread pkg/acor/engine_flat.go Outdated
Comment thread pkg/acor/engine_test.go Outdated
- Replace custom insertion sorts with sort.Slice/sort.Strings in
  engine_flat.go for better performance and readability
- Extract normalizeKeyword and normalizeText helpers in modes.go to
  deduplicate keyword normalization across inMemoryOps and
  redisBackedAC
- Document context.Background() usage in createInMemory and
  createPresetRedis (Create API does not accept caller context)
- Assert Create errors in TestInMemoryCaseSensitive instead of
  discarding them
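
For illustration, shared normalization helpers along these lines might look like the sketch below. The PR only names `normalizeKeyword` and `normalizeText`; the signatures and the exact behavior of `normalizeText` are assumptions, based on the "trim + optional lowercasing" description in the review.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeKeyword trims surrounding whitespace and, when caseSensitive is
// false, lowercases the keyword. The signature is an assumption.
func normalizeKeyword(kw string, caseSensitive bool) string {
	kw = strings.TrimSpace(kw)
	if !caseSensitive {
		kw = strings.ToLower(kw)
	}
	return kw
}

// normalizeText applies the same case rule to searched text. Whether the
// real helper also trims is not stated in the PR; this sketch does not.
func normalizeText(text string, caseSensitive bool) string {
	if caseSensitive {
		return text
	}
	return strings.ToLower(text)
}

func main() {
	fmt.Printf("%q\n", normalizeKeyword("  Hello ", false))
	fmt.Printf("%q\n", normalizeKeyword("  Hello ", true))
	fmt.Printf("%q\n", normalizeText("say HELLO world", false))
}
```

Routing every backend through the same pair keeps a keyword added as `" Hello "` findable in `"say HELLO world"` regardless of which engine serves the read.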
@skyoo2003
Owner Author

@sourcery-ai review

@sourcery-ai sourcery-ai Bot left a comment

Hey - I've left some high level feedback:

  • The Redis-backed preset path (redisBackedAC) has separate snapshot loading (loadTrieDataFromRedis) and readTrieSnapshot/marshalTrieArgs logic, which now diverges from the existing V2 code; consider consolidating these into shared helpers so schema evolution only has one code path to maintain.
  • In Create, several different invalid configurations for in-memory mode currently map to ErrInMemoryWithRedisConfig (including SchemaVersion and EnableCache), which might be confusing; consider either splitting these into more precise errors or adjusting the error name/message to reflect all the conditions it covers.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The Redis-backed preset path (`redisBackedAC`) has separate snapshot loading (`loadTrieDataFromRedis`) and `readTrieSnapshot`/`marshalTrieArgs` logic, which now diverges from the existing V2 code; consider consolidating these into shared helpers so schema evolution only has one code path to maintain.
- In `Create`, several different invalid configurations for in-memory mode currently map to `ErrInMemoryWithRedisConfig` (including `SchemaVersion` and `EnableCache`), which might be confusing; consider either splitting these into more precise errors or adjusting the error name/message to reflect all the conditions it covers.


- Split ErrInMemoryWithRedisConfig into three precise errors:
  ErrInMemoryWithRedisConfig, ErrInMemoryWithSchemaVersion,
  ErrInMemoryWithCache
- Consolidate trie snapshot loading: rename loadTrieDataFromRedis
  to loadTrieSnapshot returning trieSnapshot, and remove duplicate
  readTrieSnapshot from redis_backed_ops.go. Both redisBackedAC
  and v2Operations now deserialize from the same pattern.
Extract readTrieSnapshot(ctx, storage, name) as a package-level
function replacing duplicate implementations in redisBackedAC and
v2Operations. Removes ~30 lines of identical deserialization code.
@skyoo2003 skyoo2003 removed documentation enhancement This issue is about improvement of project. dependencies Pull requests that update a dependency file go Pull requests that update Go code breaking_changes labels Apr 19, 2026
Remove the pure in-memory mode (InMemory bool field) since preset
mode with Redis persistence supersedes it. Preset without Redis now
returns ErrPresetRequiresRedis. Also rename RedisBacked-prefixed
test/benchmark functions to Preset-prefixed for consistency.
@github-actions github-actions Bot added enhancement This issue is about improvement of project. dependencies Pull requests that update a dependency file go Pull requests that update Go code breaking_changes labels Apr 19, 2026
@skyoo2003 skyoo2003 merged commit eaf244e into master Apr 19, 2026
8 checks passed
@skyoo2003 skyoo2003 deleted the feature/redis-backed-preset branch April 19, 2026 22:13
@skyoo2003 skyoo2003 mentioned this pull request Apr 19, 2026
2 tasks

Labels

dependencies Pull requests that update a dependency file documentation enhancement This issue is about improvement of project. go Pull requests that update Go code
