fix(qwp): prevent JVM crash when closing a QWP sender #43

Merged
bluestreak01 merged 14 commits into main from jh_segment_manager_segfault on Jun 17, 2026

@jerrinot jerrinot commented Jun 9, 2026

Contributor

Closing a QWP sender (on shutdown, reconnect, or sender churn) could
crash the entire JVM with a SIGSEGV when it raced the background segment
manager. Under load this showed up as rare, hard-to-reproduce process
deaths.

Implementation details for reviewers
Two native-memory races are fixed:

  1. Watermark SIGSEGV. The worker services rings off a snapshot taken
    under lock, then writes the acked-FSN watermark outside the lock. If a
    sender unmapped that file in the same window, the worker wrote through a
    dangling address → SIGSEGV. Fix: the watermark write + totalBytes
    accounting now run under lock, gated on a lock-guarded
    RingEntry.registered flag that deregister() clears before close()
    unmaps.

  2. pathScratch use-after-free. close() uses a bounded join; a
    timed-out join could leave the worker alive while its scratch buffer was
    freed. Fix: only free worker-owned native state once the worker is
    observed dead, else retry on a later close().

Closing a QWP sender while its background segment manager was mid-tick
could crash the whole process. The manager's worker thread persists the
acknowledged-FSN watermark into a memory-mapped file on each tick; if a
sender closed and unmapped that file in the same instant, a stale worker
could write to the now-unmapped address and abort the JVM with a SIGSEGV.

The worker now re-checks, under the manager lock, whether the ring is
still registered before it touches the watermark or the byte accounting.
deregister() flips a lock-guarded `registered` flag, so once close()
returns the worker can no longer write through the unmapped watermark.
The watermark write and the totalBytes subtraction are both gated on the
flag; drainTrimmable() and the segment close/unlink stay unconditional,
so a stale snapshot still unlinks fully-acked segments as before. The
O(1) flag replaces the previous O(n) scan of the rings list.
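
The gating described above can be modelled in a few lines. This is a minimal sketch, not the code from this PR: `RegisteredGateSketch`, the `mapped` field, the per-ring `bytes` counter, and every method signature are hypothetical stand-ins. Only the lock-guarded `registered` flag, the gated watermark write and `totalBytes` subtraction, and the deregister-before-unmap ordering are taken from the description.

```java
// Sketch only: hypothetical stand-ins for the real SegmentManager types.
final class RegisteredGateSketch {
    private final Object lock = new Object();
    private long totalBytes; // guarded by lock

    /** Per-ring bookkeeping. `registered` and `bytes` are guarded by the manager lock. */
    static final class RingEntry {
        boolean registered = true;
        long bytes;                   // assumed: bytes this ring contributes to totalBytes
        boolean mapped = true;        // models the memory-mapped watermark file
        long persistedAckedFsn = -1;  // models the value stored in that file

        RingEntry(long bytes) {
            this.bytes = bytes;
        }
    }

    RingEntry register(long bytes) {
        synchronized (lock) {
            totalBytes += bytes;
            return new RingEntry(bytes);
        }
    }

    /** Sender side: must return before the caller unmaps the watermark. */
    void deregister(RingEntry e) {
        synchronized (lock) {
            if (e.registered) {
                e.registered = false;
                totalBytes -= e.bytes;
            }
        }
    }

    /** Worker side: returns true if the watermark was written. */
    boolean persistWatermark(RingEntry e, long ackedFsn, long trimmedBytes) {
        synchronized (lock) {
            if (!e.registered) {
                return false; // stale snapshot: the file may already be unmapped
            }
            if (!e.mapped) {
                throw new IllegalStateException("write through unmapped watermark");
            }
            e.persistedAckedFsn = ackedFsn;
            e.bytes -= trimmedBytes;
            totalBytes -= trimmedBytes;
            return true;
        }
    }

    long totalBytes() {
        synchronized (lock) {
            return totalBytes;
        }
    }
}
```

In this model a worker holding a stale snapshot either finishes its write before `deregister()` acquires the lock, or sees `registered == false` and skips both the write and the subtraction, so the byte count cannot be decremented twice for the same ring. The check is a single field read on the entry the worker already holds, with no scan of the rings list.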
@jerrinot jerrinot added the bug Something isn't working label Jun 9, 2026
@jerrinot jerrinot changed the title fix(qwp): prevent JVM crash when closing a QWP sender fix(qwp): prevent JVM crash when closing a QWP sender [DO NOT MERGE] Jun 9, 2026
jerrinot added 7 commits June 9, 2026 18:09
Keep the bounded close wait, but only free worker-owned native state after
the segment-manager worker is observed dead.

A timed-out or interrupted join can leave the worker alive inside a service
tick. In that state pathScratch may still be used for spare path creation or
native-path cleanup, so closing it immediately risks a native use-after-free.
Leave workerThread set and pathScratch allocated when the worker is still
alive, allowing a later close() to retry cleanup.
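
The close-side rule in this commit message can be sketched as follows. This is an illustration under stated assumptions rather than the actual `SegmentManager.close()`: the class name, the `scratchFreed` boolean standing in for the native `pathScratch` buffer, and the return value are all hypothetical, and the real code's way of asking the worker to stop is omitted.

```java
// Sketch only: models "free worker-owned state only once the worker is observed dead".
final class BoundedCloseSketch {
    private final long joinTimeoutMillis;
    private Thread workerThread;   // left set while the worker may still be alive
    private boolean scratchFreed;  // stands in for freeing the native pathScratch buffer

    BoundedCloseSketch(Thread startedWorker, long joinTimeoutMillis) {
        this.workerThread = startedWorker;
        this.joinTimeoutMillis = joinTimeoutMillis;
    }

    /** Returns true once worker-owned state has been released; false means retry later. */
    synchronized boolean close() {
        Thread t = workerThread;
        if (t != null) {
            try {
                t.join(joinTimeoutMillis); // bounded wait, never blocks forever
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the caller's interrupt
            }
            if (t.isAlive()) {
                // The worker may still be using the scratch buffer: leak it for now.
                return false;
            }
            workerThread = null;
        }
        scratchFreed = true; // safe: no live worker can touch the buffer any more
        return true;
    }

    synchronized boolean isScratchFreed() {
        return scratchFreed;
    }
}
```

The trade-off is a small, bounded leak when the join times out in exchange for never freeing memory that a live thread may still write to.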
@jerrinot jerrinot changed the title fix(qwp): prevent JVM crash when closing a QWP sender [DO NOT MERGE] fix(qwp): prevent JVM crash when closing a QWP sender Jun 15, 2026
jerrinot and others added 6 commits June 15, 2026 16:56
The durable-ack tests assert on the in-memory engine.ackedFsn(), and the
recovery tests forge the .ack-watermark by hand, so nothing observed the
SegmentManager worker actually writing the watermark on its trim tick. A
regression that silently stopped that write (e.g. an inverted `registered`
gate) would pass the whole suite while reintroducing re-replay of
durable-acked frames on restart.

Add a positive twin of testRecoveryAdvancesAckedFsnPastWatermark: drive a
real, started manager to persist the watermark from real acks, block until
the worker has written it to disk, then assert a second session recovers
that manager-written value rather than the bare lowestBase - 1 seed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@mtopolnik

Contributor

[PR Coverage check]

😍 pass : 42 / 43 (97.67%)

file detail

| path | covered line | new line | coverage |
|---|---|---|---|
| 🔵 io/questdb/client/cutlass/qwp/client/sf/cursor/SegmentManager.java | 38 | 39 | 97.44% |
| 🔵 io/questdb/client/cutlass/qwp/client/sf/cursor/CursorSendEngine.java | 4 | 4 | 100.00% |

@bluestreak01 bluestreak01 left a comment

Member

Approving after a full level-3 adversarial review.

This is a correct, well-tested fix for two real pre-existing native-memory use-after-free hazards, both reachable in the production owned-manager path when the bounded join(5s) times out under load:

  1. Watermark write-through-unmapped: the worker's watermark.write() now runs inside synchronized(lock) gated on the new RingEntry.registered flag, which deregister() flips false under the same lock before the engine unmaps the watermark. This correctly routes around AckWatermark's non-volatile 'closed' boolean (the original bug) by supplying a lock happens-before edge.
  2. pathScratch use-after-free: close() now early-returns on t.isAlive() without freeing worker-owned native scratch, trading a bounded ~256B leak for not freeing memory a live worker still writes.

Verified against source: drainTrimmable() and SegmentRing.close() are both synchronized(this) (clean ownership transfer, no double-free); needsHotSpare()/nextSeqHint() read only heap state (stale snapshots never touch freed native memory); register() reordered so nothing throwable runs after rings.add (no half-registered entry); totalBytes accounting cannot drift under any deregister/trim interleaving. All production callers (CursorSendEngine.close, QwpWebSocketSender, BackgroundDrainer, Sender, reconnect loop) walked and SAFE. Touched + surrounding test suite is green (26/26 plus the full sf.cursor.* package), and the crash-capable regressions confirm both fixed branches actually fire.

Non-blocking follow-ups (optional):

  • close() called from an already-interrupted thread always early-returns and leaks pathScratch regardless of worker/disk health (SegmentManager.java:160-176). Strictly safer than the prior crash; consider clearing interrupt status or a non-interruptible bounded wait so a clean stop is still attempted.
  • 'a later close() retries' overstates production reality (engine closes the owned manager exactly once); tighten the comment.
  • Shared-manager ctor catch never deregisters; safe only because register() can't throw after rings.add — a maintenance hazard worth the existing invariant comment.
  • Test gaps: no e2e test drives a production timed-out join through engine close; the ctor-catch reorder intent is effectively untested; the watermark test gives a crash-only signal; two ctor-failure tests lack @Test(timeout).
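
For the first follow-up, one possible shape of a "non-interruptible bounded wait" is sketched below. It is not part of this PR: `UninterruptibleJoin` and `joinBounded` are hypothetical names, and the approach (remember the interrupt, keep waiting until the deadline, restore the flag on exit) is one common pattern rather than a choice the reviewers prescribed.

```java
// Sketch only: a bounded join that is not cut short by the caller's interrupt status.
final class UninterruptibleJoin {
    private UninterruptibleJoin() {
    }

    /** Waits up to timeoutMillis for t to die; returns true if it was observed dead. */
    static boolean joinBounded(Thread t, long timeoutMillis) {
        // Clear a pending interrupt so join() can actually block, but remember it.
        boolean interrupted = Thread.interrupted();
        final long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        try {
            while (t.isAlive()) {
                long remainingNanos = deadline - System.nanoTime();
                if (remainingNanos <= 0) {
                    break; // deadline reached, worker still alive
                }
                // Round up to at least 1 ms: join(0) would wait forever.
                long millis = Math.max(1L, remainingNanos / 1_000_000L);
                try {
                    t.join(millis);
                } catch (InterruptedException e) {
                    interrupted = true; // keep waiting until the deadline
                }
            }
            return !t.isAlive();
        } finally {
            if (interrupted) {
                Thread.currentThread().interrupt(); // restore for the caller
            }
        }
    }
}
```

With a helper like this, a `close()` invoked from an already-interrupted thread would still give the worker the full bounded window to stop before deciding whether the scratch buffer can be freed.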

@bluestreak01 bluestreak01 enabled auto-merge (squash) June 17, 2026 09:51
@bluestreak01 bluestreak01 merged commit 2f4d7c7 into main Jun 17, 2026
12 checks passed
@bluestreak01 bluestreak01 deleted the jh_segment_manager_segfault branch June 17, 2026 09:52

3 participants