Skip to content

Audit/bug hunt#90

Merged
webcoderspeed merged 27 commits into
mainfrom
audit/bug-hunt
Jun 10, 2026
Merged

Audit/bug hunt#90
webcoderspeed merged 27 commits into
mainfrom
audit/bug-hunt

Conversation

@webcoderspeed

Copy link
Copy Markdown
Collaborator

Pull Request

📋 Description

What does this PR do?

Why is this change needed?

Related Issues

  • Fixes #
  • Related to #

🔄 Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • ✨ New feature (non-breaking change that adds functionality)
  • 💥 Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • 📚 Documentation update
  • 🧹 Code cleanup/refactoring
  • ⚡ Performance improvement
  • 🧪 Test improvements
  • 🔧 Build/CI improvements

🧪 Testing

Test Coverage

  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have tested this change manually

Test Details

# Commands used for testing
npm test
npm run test:coverage
npm run lint

Manual Testing

📖 Documentation

  • Code comments updated
  • README.md updated
  • API documentation updated
  • Examples updated
  • CHANGELOG.md updated (for significant changes)

🔍 Code Quality

  • My code follows the project's style guidelines
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • My changes generate no new warnings or errors
  • I have run the linter and fixed any issues

📦 Dependencies

  • No new dependencies added
  • New dependencies are justified and documented
  • Dependencies are pinned to specific versions
  • Security implications have been considered

🚀 Performance

  • No performance impact
  • Performance impact is minimal and justified
  • Performance improvements included
  • Benchmarks provided (if applicable)

🔒 Security

  • No security implications
  • Security implications have been considered and addressed
  • No sensitive data exposed
  • Input validation added where necessary

📱 Compatibility

  • Node.js versions:
  • Operating Systems:
  • Breaking changes documented
  • Migration guide provided (if needed)

🎃 Hacktoberfest

  • This is a Hacktoberfest contribution
  • I have read and followed the contribution guidelines
  • This PR provides meaningful value (not spam)
  • I understand the project's goals and scope

📸 Screenshots/Examples

Before

After

Code Example

// Example of how to use the new feature
import { Logixia } from 'logixia';

const logger = new Logixia({
  // example configuration
});

// example usage

🔗 Additional Context

Deployment Notes

Rollback Plan

Future Considerations

✅ Reviewer Checklist

  • Code quality meets project standards
  • Tests are comprehensive and pass
  • Documentation is updated and clear
  • Breaking changes are properly documented
  • Security implications reviewed
  • Performance impact acceptable
  • Ready for merge

📝 Notes for Reviewers

🙏 Acknowledgments


Thank you for contributing to Logixia! 🚀

webcoderspeed and others added 24 commits June 10, 2026 21:25
Mirror the FileTransport overlapping-batch-flush fix for DatabaseTransport.
write() fires flush() un-awaited once the batch crosses batchSize and the
interval timer fires it too, so a synchronous burst (e.g. a NestJS bootstrap
replaying buffered logs) triggered many overlapping flushes. The old
isFlushing guard merely skipped the size-triggered flush, defeating
backpressure and letting the batch grow unbounded; and any snapshot-then-clear
race risked writing the same entries twice.

Changes:
- Serialize flushes through a single shared drain promise; detach the batch
  synchronously before awaiting so concurrent writes land in a fresh array and
  are written exactly once (no duplication).
- Harden close() to drain on shutdown: retry transient flush failures and
  re-buffer entries between attempts so buffered logs are not lost on deploy
  (the 'last N seconds of logs lost' problem). Warn if anything truly remains.
- Clear connectionPromise on failed connect so a transient startup error does
  not permanently wedge the transport.
- Coerce getLogCount() to a number (pg/mysql return COUNT as a string).

Adds regression tests: 500-entry un-awaited burst written exactly once,
concurrent flush() de-duplication, failed-flush re-buffering, and close()
draining all entries through a transient failure with zero loss.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
AnalyticsTransport is the shared base for Mixpanel, DataDog, Segment, and
Google Analytics, so its overlapping-batch-flush bug affected all four. Same
class of bug as the FileTransport/DatabaseTransport fixes: addToBatch() fires
flush() un-awaited on every Nth entry, and the old flush() snapshotted
[...this.batch] then cleared it only after the async send, so overlapping
flushes re-sent the same entries — turning N logs into N² delivered events.

Changes:
- Serialize flushes through a single shared drain promise; detach the batch
  synchronously before awaiting sendBatch() so concurrent writes are never
  sent twice.
- Harden close() to retry-drain on shutdown so a transient provider outage
  does not drop buffered logs (the 'last N seconds lost on deploy' problem).
- flush() no longer swallows the timer; batchTimer cleared up front.

Adds regression tests (600-entry burst delivered once, 4 concurrent flushes,
failed-send re-buffering, close() drain-through-failure) and a runnable
verification example (examples/verify-flush-no-duplication.ts) that prints
live PASS/FAIL for each guarantee.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
CloudWatch, GCP Cloud Logging, and Azure Monitor transports each buffered
entries on an interval but (a) flushed only ONE batchSize chunk per flush()
call, leaving a tail behind when the batch exceeded batchSize, and (b) had no
close() method at all — so TransportManager.close() skipped them on shutdown
and any buffered logs were silently lost on deploy (the exact 'last N seconds
of logs lost' problem the library promises to avoid).

These three use batch.splice(0, batchSize), which detaches synchronously, so
they were never vulnerable to the N² duplication bug — the issue here is loss,
not duplication.

Changes (all three transports):
- flush() now loops until the batch is empty, draining every chunk.
- Add close(): clears the interval timer and retry-drains the batch (bounded
  to 3 attempts) so a transient cloud outage does not drop buffered logs.

Adds regression tests (whole-batch drain + close-drain-through-failure for each
provider) and extends examples/verify-flush-no-duplication.ts with a live cloud
close() check (95 buffered logs all drained on shutdown).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…tdown

The per-transport averageWriteTime used (avg + sample) / 2, an exponential
decay that over-weighted the most recent write so the reported average never
matched the true mean. Replace with the standard incremental cumulative mean
avg += (sample - avg) / n.

Adds the first TransportManager test coverage: the corrected metric, the
shutdown guarantee that batched logs are flushed through manager.close()
(no loss on deploy), and documents the post-shutdown write early-return.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Pattern-based redaction only ran against the payload object, so a secret
interpolated into the message itself (e.g. logger.info(`Auth: Bearer ${token}`))
bypassed redaction entirely and leaked in plaintext — a real gap for a library
whose headline feature is redacting secrets.

Add applyRedactionToString() (patterns only — path rules don't apply to a bare
string) and run it over the message in the logger hot path, gated behind the
existing _hasRedact fast-path flag so the no-redact common case stays
allocation-free.

Adds unit tests for applyRedactionToString and an end-to-end logger test
asserting the raw token never appears in output. Extends the verification
example with a live message-redaction check.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…nals

flushOnExit() registers one handler for both SIGTERM and SIGINT, so two
signals (e.g. SIGINT after SIGTERM, or an impatient double Ctrl+C) ran the
async flush+exit handler twice concurrently. The two sequences raced: the
first process.exit(0) to fire could truncate the other's in-flight flush
mid-write — the exact log loss this util exists to prevent.

Add a shutdownInProgress re-entrancy guard so only the first signal runs the
flush; later signals are ignored while it is in progress. Reset the flag in
resetShutdownHandlers().

Adds regression tests: a second signal mid-shutdown does not re-flush loggers,
and two concurrent signals call process.exit only once.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Three lifecycle bugs in WorkerTransport:

1. No close() — the cleanup method was named shutdown(), but TransportManager
   calls transport.close(). So on graceful exit the worker was never told to
   flush+exit: its buffered logs were lost and the thread leaked. Rename to
   close() (shutdown() kept as a delegating alias) and forward any locally
   buffered pre-ready entries before sending the shutdown message.

2. flush() could hang forever — it resolved only on the worker's 'flushed'
   ack, so if the worker died/restarted mid-flush the promise never settled and
   blocked shutdown until the force-exit timer killed the process (losing
   everything). Time-box it.

3. Restart loop leaked / resurrected workers — the backoff setTimeout was never
   unref()'d (kept the process alive) and wasn't cancelled on close(), so a
   restart could spawn a fresh worker after shutdown. Track the timer, clear it
   in close(), unref() it, and add a  flag so a late exit event no
   longer triggers a restart.

Adds lifecycle tests with an injected fake worker (close posts shutdown +
resolves on exit, flush times out instead of hanging, buffered entries are
forwarded, shutdown() aliases close()).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The sampledTraces / droppedTraces sets are only cleared by resetStats(), which
only runs when a stats timer is active (an onStats callback was provided). With
traceConsistent sampling but no stats callback — a common config — they grew one
entry per unique traceId for the life of the process: an unbounded memory leak
in long-running services.

Cap the sets at maxTrackedTraces (100k) and clear on overflow via a
_rememberTrace helper. A re-seen trace after a clear just gets a fresh sampling
decision, which is acceptable for sampling consistency.

Adds tests asserting the set size never exceeds the cap under high trace
cardinality and that sampling still decides correctly after an overflow clear.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
createTraceMiddleware unconditionally called res.setHeader(), which throws when
the response object lacks it (Fastify replies use reply.header()) or when
headers are already sent or the stream is closed — crashing the whole request
for a middleware advertised for Express/NestJS (NestJS can run on Fastify).

Guard the header write: prefer setHeader(), fall back to header(), skip when
headersSent, and swallow any throw. The trace ID is still propagated via async
context regardless, so observability is unaffected.

Adds tests for Fastify-style reply.header(), a response with no header method,
headersSent=true, and a setHeader that throws.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…cle guard

Two correctness bugs in the defensive serialization helpers:

1. safeToString() returned JSON.stringify(value) directly, but JSON.stringify
   returns undefined for some inputs (e.g. an object whose toJSON() returns
   undefined). The whole point of safeToString is that safeReplace can call
   .replace() on the result without crashing — returning undefined defeated it
   and threw 'Cannot read properties of undefined'. Fall back to the
   constructor-name tag when stringify yields undefined.

2. serializeError's serializeValue created a fresh WeakSet for Errors nested in
   plain props, so the circular-reference guard didn't span the whole tree —
   cross-referencing errors relied only on the depth limit. Thread the shared
   seen set through serializeValue and guard arrays/objects too.

Adds a coerce.utils test suite (incl. the undefined-stringify and circular
cases) and error cross-reference / self-reference cycle tests.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The Kafka and WebSocket trace interceptors wrap next.handle() in a new
Observable and subscribe to it inside ctx.run(), but never returned a teardown.
When the outer subscriber unsubscribed (consumer shutdown, socket disconnect,
upstream takeUntil/timeout), the inner handler subscription kept running — a
subscription/memory leak that also let handler side effects continue after
cancellation.

Capture the inner subscription and return () => inner?.unsubscribe() from the
Observable so cancellation propagates.

Adds a Kafka interceptor test asserting the inner subscription is torn down
when the outer is unsubscribed.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ssages

LogixiaLoggerService.formatMessage did JSON.stringify(message) for object
messages, which throws on circular structures (crashing the whole log call —
NestJS routinely logs request/response objects that can contain cycles) and
returns undefined for some inputs. Delegate to safeToString (already imported),
which is circular-safe and never returns a non-string.

Adds a regression test logging a circular object as the message.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… files

Two file-rotation bugs:

1. compressFile() was an empty stub — rotation with compress: true silently did
   nothing despite the documented 'gzip compression of rotated files'. Implement
   it with a streamed zlib gzip to <file>.gz (no full-file buffering), removing
   the original only after the compressed copy is written, and cleaning up a
   partial .gz on failure.

2. cleanupOldFiles() matched files with startsWith(base), so with filename
   app.log it also matched (and could delete) unrelated files like
   application.log. Tighten the match to this transport's own rotated pattern:
    (optionally .gz).

Adds rotation tests: the rotated file is gzip-compressed with no uncompressed
original left behind, and cleanup leaves a similarly-named unrelated file intact.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…hooks

The plugin lifecycle hooks promised that thrown errors are swallowed so a buggy
plugin can't crash logging, but only the ASYNC branch was guarded (.catch on the
returned promise). A synchronous throw propagated to the caller:

- onInit sync throw broke register()/use()
- onError sync throw turned a recoverable transport failure into a crash
  (onError fires precisely during failure handling)
- onShutdown sync throw rejected the Promise.all and broke graceful shutdown,
  risking log loss

Wrap each synchronous hook call in try/catch. runOnLog already did this; this
brings the other three to parity.

Updates the stale 'sync throws ARE propagated' onInit test to assert the new
(safe) behavior and adds sync-throw isolation tests for onError and onShutdown,
including that a later well-behaved hook still runs.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…tters

Both formatters called raw JSON.stringify on the payload, which throws
'Converting circular structure to JSON' on any cyclic object — and a logging
library is routinely handed cyclic objects (Express req/res, DB connections,
Mongoose docs). The throw escaped format() and crashed the whole log/transport
path.

- JsonFormatter: stringify with a circular-safe replacer that emits '[Circular]'
  for repeated object references (fresh per call — it holds per-serialization
  state).
- TextFormatter: replace JSON.stringify(value) and the catch-fallback
  JSON.stringify(payload) (which threw AGAIN on the same cycle) with the
  circular-safe safeToString, keeping the CWE-117 control-char strip on object
  values too.

Adds a formatters test suite covering circular payloads for both, error
serialization, and the control-char strip.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
OTel bridge: getActiveOtelContext runs on every log call when the bridge is
enabled. A broken or version-mismatched @opentelemetry/api whose context/trace
methods throw would crash logging itself. Wrap the whole interaction in
try/catch so it degrades to no-injection like the rest of the (intentionally
silent) integration. Adds the first OTel tests (graceful degradation, never
throws, init/disable state).

Metrics: metric names (from user map keys) and label names were emitted
unsanitized. An invalid name like 'my-bad.metric' or label 'status code'
produces output Prometheus can't parse, which makes the scraper reject the
ENTIRE /metrics endpoint — not just that one metric. Add sanitizePromName
([a-zA-Z_][a-zA-Z0-9_]*, leading digit prefixed with _) and apply it to metric
and label names. Adds the first metrics tests (sanitization + counter/histogram/
gauge extraction + reset + pass-through).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…+close

The Express HTTP logger registered onFinish on BOTH res.once('finish') and
res.once('close'). A normal response fires both events, so the 'request
completed' entry (and the slow-request warning) could be logged twice — the
'close' guard only checked statusCode === 0, which doesn't hold for a normal
finish+close pair on every Node version / response shape.

Add a  flag so the completion is logged exactly once regardless of
which events fire, while 'close' still covers a client that disconnects before
'finish'.

Adds http-logger middleware tests: single completion on finish+close, request
start level, 5xx error level, header redaction, and skip predicate.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
calculateIndexSize() JSON.stringify'd EVERY entry in the index on each
getIndexStats() call. With the default maxIndexSize of 1,000,000 that means
serializing a million objects per call, blocking the event loop — pathological
if stats are polled for a dashboard.

Maintain the serialized size incrementally: trackInsert/trackDelete keep a
per-id byte estimate and a running total, so getIndexStats() is O(1). Clearing
and age/size-based eviction all update the total, keeping it consistent.

Adds the first BasicLogIndexer tests: size stays consistent across
add/remove/clear, field search (case-insensitive), and max-size eviction.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Two BasicSearchEngine bugs:

1. getSearchableText built its text with raw JSON.stringify(log.payload), which
   throws on a cyclic payload (Express req/res, DB handle). Since it runs for
   every candidate log, one such stored entry crashed the ENTIRE search. Use the
   circular-safe safeToString.

2. addLogs did this.logs.push(...logs) with no cap — the buffer grew without
   bound (memory leak in any long-running process feeding the engine), and
   spreading a very large array risks a RangeError. Append via a loop and cap at
   maxLogs (default 100k, configurable) with FIFO eviction of the oldest.

Adds the first BasicSearchEngine tests: text search, circular-payload search
without crashing, trace-id correlation ordering, and the bounded buffer.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
formatAsTable used (r[c] || '').toString(), so a real value of 0, false, or ''
was treated as empty and the cell rendered blank. In a log-analysis CLI that
silently hides numeric columns like count, statusCode, or duration when they
are zero. Use a null/undefined check via a cellText helper so falsy values
still render.

Adds CLI utils tests: falsy-value rendering, empty rows, missing fields, and
safeParseLogs JSON-line / fallback / blank-line handling.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
BrowserRemoteTransport.flush() spliced only one batchSize chunk, so a batch
larger than batchSize (e.g. after a failed POST re-queued entries) left a tail
unsent. And destroy() cleared the interval timer but never flushed, dropping
buffered logs on page unload / teardown — the browser equivalent of losing the
last N seconds of logs.

Loop flush() until the batch is empty (splice is synchronous so no duplication)
and flush() the remainder inside destroy().

Adds browser transport tests (whole-batch drain, destroy flush, failed-POST
re-buffer, non-http url rejection) with a mocked fetch.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Two bugs in the NestJS TraceMiddleware / functional traceMiddleware:

1. res.setHeader() was called unconditionally. NestJS can run on Fastify (reply
   uses .header()), and on an already-sent/closed response setHeader throws —
   crashing the request. Add writeTraceHeader() which prefers setHeader, falls
   back to header(), skips when headersSent, and swallows throws. Trace context
   propagation is unaffected.

2. ip resolution used . req.connection
   is deprecated and can be undefined, so when req.ip was empty the access threw
   'Cannot read properties of undefined (reading remoteAddress)'. Prefer
   req.socket and guard both with optional chaining.

Adds traceMiddleware tests for Express setHeader, Fastify header() fallback,
no-header-method response, and headersSent — which also cover the ip guard.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The LOGIXIA_SILENT_INTERNAL flag was captured into a module-level const at import
time, so setting it AFTER the module was first imported (the common case for a
test setup file) had no effect — the documented test-silencing was unreliable.
Read process.env per call via isSilent() instead.

Adds the first internal-log tests, including silencing applied after import and
resuming when the flag is cleared.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…d method

@logmethod unconditionally replaced the method with an async function and
awaited the original, so a SYNCHRONOUS method silently began returning a
Promise — breaking any caller that consumed its value directly (const x =
obj.syncMethod()). The decorator even documented 'works on sync methods' while
changing their contract.

Rewrite so the wrapper inspects the original return value: if it's a thenable,
await it (async path — exit/error logs reflect the resolved outcome and the
Promise type is preserved); otherwise treat it as synchronous, emit the exit log
fire-and-forget, and return the value as-is. Entry logging is now
fire-and-forget so a sync method is never forced to become async just to await
it. Shared emit/emitError helpers keep transport failures routed to stderr.

Adds the first @logmethod tests: sync returns a value (not a Promise), async
returns a resolving Promise, entry/exit logging, and sync-throw / async-reject
propagation with error logging.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@github-actions

github-actions Bot commented Jun 10, 2026

Copy link
Copy Markdown

size-limit report 📦

Path Size
ESM (gzipped) 10.22 KB (+8.36% 🔺)
CJS (gzipped) 10.24 KB (+6.82% 🔺)

webcoderspeed and others added 3 commits June 10, 2026 17:23
The earlier commit 0e8f915 recorded the message and intent but landed an EMPTY
diff — the staged changes were dropped (a pre-commit hook stripped them), so the
applyRedactionToString implementation, its wiring into the logger hot path, and
the tests lived only in the working tree. Tests still passed because jest runs
the working tree, masking the gap. This commit lands the actual code so the
redaction-in-message fix is in history, not just on disk.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@webcoderspeed webcoderspeed merged commit feee9e7 into main Jun 10, 2026
9 of 10 checks passed
@webcoderspeed webcoderspeed deleted the audit/bug-hunt branch June 10, 2026 17:33
@github-actions

Copy link
Copy Markdown

🎉 This issue has been resolved in version 1.10.3.
Install it: npm install logixia@1.10.3

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant