Fix use-after-free when a terminated worker has children still starting#31951
Fix use-after-free when a terminated worker has children still starting#31951robobun wants to merge 8 commits into
Conversation
A worker's children hold a BackRef to its VirtualMachine and read it throughout start_vm() (transform options, env clone, standalone module graph). When the parent worker was terminated, its shutdown() freed that VM without waiting for children still starting up, a use-after-free caught by ASAN in VirtualMachine::init_worker. shutdown() now terminates its own children and waits for each to unlink from the live-workers registry before freeing the VM, the per-worker analogue of the main thread's terminate_all_and_wait() and of Node's Environment::stop_sub_worker_contexts(). Also fix two crashes on the same termination race: - flush_logs() panicked with "unhandled exception" when the log-to-JS conversion hit a pending TerminationException; skip dispatching the error event instead. - A TerminationException left pending after JSC clears the termination request at VM-entry-scope exit tripped ASSERT(vm.hasTerminationRequest()) in VMTraps::deferTerminationSlow when the death path re-entered JS; clear the stale exception at the re-entry points (flush_logs and the exit-handler step of shutdown).
|
Found 3 issues this PR may fix:
🤖 Generated with Claude Code |
|
Checked all three against this PR's builds before adding any Fixes lines. None of them verify, so I am not claiming them:
|
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughStops and waits for nested child workers during parent shutdown (step 3.5), refactors termination into a filtered terminate-and-wait helper, changes futex wake semantics to wake-all, clears stale JSC termination exceptions during shutdown/log flush, and adds two regression tests for termination lifetimes. ChangesNested worker termination and shutdown guarantees
Possibly related issues
🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
The filtered wait derives its done-condition from the list sweep (matching) but passed a futex expected value loaded after the mutex was released. If the last matching child ran unlink() and mark_exited() in that gap, matching stayed stale at nonzero while OUTSTANDING already equaled the loaded value, so Futex::wait slept through the already-fired wake until the timeout. Loading OUTSTANDING before releasing the mutex guarantees every counted child's decrement is still ahead of the expected value, so the wait either observes the change or is woken.
terminate_children_and_wait means the main thread and any number of worker parents can wait on OUTSTANDING at the same time. A futex waiter queued with a stale expected value is only released by a wake or its timeout, so waking a single waiter can hand the event to a thread whose own condition is unmet while the one that could make progress stays asleep until its deadline (e.g. a three-level tree where the grandchild's exit wakes the grandparent instead of the middle worker). Wake all waiters in register() and mark_exited(); each re-sweeps under the mutex and re-checks its own condition, and the waiter count is bounded by the worker-tree depth.
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/jsc/web_worker.rs (1)
419-436:⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy liftWait on a monotonic wake sequence, not
OUTSTANDINGitself.Between Line 420 and Line 436,
OUTSTANDINGcan gon → n-1 → nbefore this thread actually entersFutex::wait(): one worker exits, a mid-WebWorker__createworker registers, bothwake()calls fire, and the counter is back at the originaln. At that pointFutex::wait(&OUTSTANDING, n, ...)will sleep on a stale value, so the newly registered worker is never re-swept and never getsrequested_terminate = true. If the wait then times out, the caller can still tear down the parent/main VM while that worker is instart_vm().Possible direction
pub(super) static OUTSTANDING: AtomicU32 = AtomicU32::new(0); +pub(super) static WAKE_SEQ: AtomicU32 = AtomicU32::new(0); pub(super) fn register(worker: *mut WebWorker) { MUTEX.lock(); ... OUTSTANDING.fetch_add(1, Ordering::Release); - Futex::wake(&OUTSTANDING, u32::MAX); + WAKE_SEQ.fetch_add(1, Ordering::AcqRel); + Futex::wake(&WAKE_SEQ, u32::MAX); MUTEX.unlock(); } pub(super) fn mark_exited() { OUTSTANDING.fetch_sub(1, Ordering::Release); - Futex::wake(&OUTSTANDING, u32::MAX); + WAKE_SEQ.fetch_add(1, Ordering::AcqRel); + Futex::wake(&WAKE_SEQ, u32::MAX); } - let n = live_workers::OUTSTANDING.load(Ordering::Acquire); + let n = live_workers::OUTSTANDING.load(Ordering::Acquire); + let seq = live_workers::WAKE_SEQ.load(Ordering::Acquire); live_workers::MUTEX.unlock(); ... - let _ = Futex::wait(&live_workers::OUTSTANDING, n, Some(deadline_ns - elapsed)); + let _ = Futex::wait(&live_workers::WAKE_SEQ, seq, Some(deadline_ns - elapsed));As per coding guidelines, "Rust code: fix the whole bug class in the same PR."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/jsc/web_worker.rs` around lines 419 - 436, The wait is currently sleeping on the mutable counter live_workers::OUTSTANDING (via Futex::wait(&live_workers::OUTSTANDING, n, ...)), which can change n→n-1→n and cause a stale-wait race; instead introduce and wait on a monotonic sequence/version (e.g., live_workers::WAKE_SEQ or similar) that you increment whenever workers are added/removed or when requested_terminate is toggled, read that sequence under live_workers::MUTEX, compute the done condition the same way (using parent_filter or outstanding==0), then call Futex::wait(&live_workers::WAKE_SEQ, seq, Some(...)) so wakes are not lost (also update all corresponding wake() calls to futex::wake(&live_workers::WAKE_SEQ) when registering/unregistering workers or setting requested_terminate, and ensure WebWorker__create and any code that toggles requested_terminate increments the sequence while holding live_workers::MUTEX).Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@src/jsc/web_worker.rs`:
- Around line 419-436: The wait is currently sleeping on the mutable counter
live_workers::OUTSTANDING (via Futex::wait(&live_workers::OUTSTANDING, n, ...)),
which can change n→n-1→n and cause a stale-wait race; instead introduce and wait
on a monotonic sequence/version (e.g., live_workers::WAKE_SEQ or similar) that
you increment whenever workers are added/removed or when requested_terminate is
toggled, read that sequence under live_workers::MUTEX, compute the done
condition the same way (using parent_filter or outstanding==0), then call
Futex::wait(&live_workers::WAKE_SEQ, seq, Some(...)) so wakes are not lost (also
update all corresponding wake() calls to futex::wake(&live_workers::WAKE_SEQ)
when registering/unregistering workers or setting requested_terminate, and
ensure WebWorker__create and any code that toggles requested_terminate
increments the sequence while holding live_workers::MUTEX).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 7892402e-59aa-4913-a414-fb0fff0d8faf
📒 Files selected for processing (1)
src/jsc/web_worker.rs
OUTSTANDING as the futex word is ABA-prone: one worker exit plus one registration in the gap between a waiter's snapshot and its Futex::wait restores the exact expected value while both wakes fire with no waiter queued, so the wait sleeps through events it needed to re-sweep for (a newly registered worker then misses requested_terminate until the deadline forces a re-sweep). Add WAKE_SEQ, bumped on register() and mark_exited(), and wait on that; it only increments, so it cannot alias. The sequence is snapshotted under the registry mutex before the sweep and the OUTSTANDING load, so any event invalidating those inputs changes the futex word afterwards.
There was a problem hiding this comment.
Additional findings (outside current diff — PR may have been updated during review):
-
🟡
src/jsc/web_worker.rs:420-428— nit: this timeout log still saysterminateAllAndWaitand prints the globaln = OUTSTANDING, but the function is now also reached viaterminate_children_and_waitfromshutdown()step 3.5 — in that path the relevant figure ismatching(the global count always includes the caller itself plus unrelated workers). Consider branching onparent_filterto emit a distinct message withmatching, or at least genericizing the prefix.Extended reasoning...
What this is
This PR refactored
terminate_all_and_waitinto a sharedterminate_and_wait(parent_filter, timeout_ms)with a new filtered caller,terminate_children_and_wait, invoked from every exiting worker'sshutdown()step 3.5. The timeout-path debug log was not updated to reflect the generalization:if elapsed >= deadline_ns { log!("terminateAllAndWait: timed out with {} outstanding", n); return; }
When reached via the filtered path, this is misleading on two axes:
- Prefix names the wrong entry point — it says
terminateAllAndWaiteven when the caller isterminate_children_and_waitfrom a worker's shutdown. - Reports the wrong count —
nis the globalOUTSTANDINGcount, which in the filtered path always includes the caller worker itself (it unlinks after step 3.5 returns) plus any unrelated workers in the process. The figure the filtered wait actually conditions on ismatching.
Concrete example
Take Main → M → G, with M terminated and G stuck somewhere past the cooperative checkpoints for 10 s:
- M's step 3.5 sweep finds G:
matching = 1. n = OUTSTANDING.load()under the mutex →n = 2(M itself + G; M is still registered until after step 3.5).- 10 s elapses → log emits
terminateAllAndWait: timed out with 2 outstanding.
A developer with
BUN_DEBUG_Worker=1enabled who is chasing a per-worker child-wait stall sees a message that (a) names the main-thread global-exit sweep and (b) reports 2 outstanding when only 1 child of M is actually blocking the wait. If there were sibling worker trees in the process,nwould be larger still whilematchingstayed at 1.Why this isn't entirely academic
This is a
hiddenscoped debug log (define_scoped_log!(log, Worker, hidden)— only emits when the Worker debug scope is explicitly enabled via env var) on a 10 s timeout path, so there is zero functional impact and it is invisible to users. The counter-argument that this is below the nit threshold is reasonable.That said, this PR's own review history went through two rounds of debugging timeout stalls in exactly this code path (the lost-wakeup snapshot race fixed in b6b37a7, and the wake-one-vs-multiple-waiters issue fixed in 9e0980f — the latter explicitly cited a 10 s deadline stall on the debian-13 ASAN job). Anyone debugging the next stall here with the Worker scope enabled will land on this line, and "terminateAllAndWait timed out with N" pointing at the wrong sweep with an inflated count is a small but real speed bump. Since the PR is what generalized the function, updating the one diagnostic it contains to match seems worth a one-line follow-up.
Suggested fix
if elapsed >= deadline_ns { match parent_filter { Some(_) => log!("terminateChildrenAndWait: timed out with {} matching ({} outstanding)", matching, n), None => log!("terminateAllAndWait: timed out with {} outstanding", n), } return; }
(or just genericize the prefix to
terminate_and_waitand print both counts). - Prefix names the wrong entry point — it says
The timeout log always named terminateAllAndWait and printed the global OUTSTANDING count, which on the filtered path includes the calling worker itself plus unrelated workers; matching is the number the wait actually conditions on.
There was a problem hiding this comment.
Additional findings (outside current diff — PR may have been updated during review):
-
🟡
src/jsc/web_worker.rs:710-714— The parallel nested-worker comment increate()(line 631) still says "parent is itself a worker, not joined on exit" — the PR updated this rationale at the file header, theparentfield doc, andset_ref(), but missed this 4th site. With step 3.5 in place, worker parents do terminate-and-wait for children, so this wording now contradicts the rest of the file; it should get the same updateset_ref()did.Extended reasoning...
What the issue is
This PR updates the nested-worker keepalive rationale to reflect that worker parents now terminate-and-wait for their children via the new
shutdown()step 3.5. It updates that rationale at three sites:- the file header (lines 54-60): "Nested workers ARE stopped when their WORKER parent tears down:
shutdown()step 3.5 terminates and waits for them…" - the
parentfield doc (lines 88-94): "When the parent is itself a worker, itsshutdown()step 3.5 terminates us and waits for ourunlink()before freeing its VM…" - the
set_ref()comment (lines 710-714): changed from "worker parents aren't joined on exit" to "keeping the parent's loop alive avoids the child being terminated by the parent's natural exit (shutdown()step 3.5)"
But the parallel comment in
create()was missed:// src/jsc/web_worker.rs:628-632 // Keep the parent's event loop alive until the close task releases this. // If the user passed `{ ref: false }` we skip — they've opted out of the // worker keeping the process alive. Exception: a nested worker (parent is // itself a worker, not joined on exit) must hold the parent-loop keepalive // regardless, because the child holds a non-owning `BackRef` to the parent VM.
Why it's now wrong
The phrase "not joined on exit" was the old rationale: before this PR, a worker parent did not wait for its children, so the keepalive was the only thing standing between the child's
BackRefand a freed parent VM. After this PR, step 3.5 does wait for children before freeing the parent VM — so "not joined on exit" is no longer true, and the implied UAF-prevention rationale is no longer the operative one. The keepalive's purpose for nested workers is now to keep the parent's loop alive so the child isn't terminated by the parent's natural exit (exactly what the updatedset_ref()comment says).Step-by-step
- Pre-PR: file header documented "nested workers are not stopped when their WORKER parent's context tears down" → both
create()andset_ref()said "not joined on exit" → consistent. - PR adds step 3.5 (
terminate_children_and_wait) → worker parents now DO wait for children. - PR updates file header,
parentfield doc, andset_ref()to match → 3 of 4 sites consistent. create()at line 631 still says "not joined on exit" → contradicts the other 3 sites and the actual behavior.
Impact
Pure documentation inconsistency — no runtime effect. But these comments document a memory-safety invariant (the parent-VM
BackReflifetime), and a future reader who lands oncreate()first would form the wrong model of why the nested-worker keepalive exists and what guarantees it provides.Fix
Give
create()the same wordingset_ref()got, e.g.:// … Exception: a nested worker (parent is itself a worker) must hold the // parent-loop keepalive regardless: the child holds a non-owning `BackRef` // to the parent VM, and keeping the parent's loop alive avoids the child // being terminated by the parent's natural exit (`shutdown()` step 3.5).
- the file header (lines 54-60): "Nested workers ARE stopped when their WORKER parent tears down:
Each child is a full JSC VM; 12 children x 4 rounds per test overloaded saturated ASAN runners into the test timeout. 6 children x 3 rounds still crashed the unfixed build on every verification run (the race window spans the whole child VM startup), and the fixed suite drops to about 2.3s per test locally. Also update the one remaining create() comment that still described worker parents as not waiting for their children on exit.
The debian-13 x64-asan lane runs tests with detect_leaks=1 and BUN_DESTRUCT_VM_ON_EXIT=1 unless the file is listed in test/no-validate-leaksan.txt. Worker teardown intentionally leaks when a parent context is gone before the close task runs (the thread-held Worker ref and detached thread bookkeeping documented in src/jsc/web_worker.rs), and the new regression tests terminate workers while nested children are starting, which is exactly that window. Reproduced on a local release-asan build with the lane environment: a flaky 32-byte direct leak report in the spawned fixture's stderr failed the test about one run in five. worker.test.ts and worker_blob.test.ts are already listed for the same reason. Leak exclusion does not affect the tests' crash coverage; AddressSanitizer UAF detection stays on.
There was a problem hiding this comment.
All four of my earlier findings (snapshot race, wake-one vs wake-all, timeout log, stale create() comment) are addressed and the current WAKE_SEQ approach looks sound, but this reworks the cross-thread worker shutdown/futex protocol enough that a human should sign off.
Extended reasoning...
Overview
This PR fixes a use-after-free where a terminated worker frees its VirtualMachine while nested children are still inside start_vm() reading it via BackRef. The fix adds step 3.5 to WebWorker::shutdown(): terminate_children_and_wait(vm_ptr, 10_000) terminates direct children and futex-waits for each to unlink from the live-workers registry before the parent VM is freed. Supporting changes: terminate_all_and_wait is refactored into a generic terminate_and_wait(parent_filter, timeout); a new monotonic WAKE_SEQ futex word replaces waiting on OUTSTANDING (avoids ABA where exit+register restores the expected value); both register() and mark_exited() now wake-all since multiple threads can wait concurrently; flush_logs and shutdown() step 2 now clear stale TerminationExceptions and handle JsError::Terminated/Thrown instead of panic!. Two regression tests are added, and the test file is added to the LeakSanitizer exclusion list.
Security risks
None identified. This is internal lifetime/synchronization machinery for worker teardown; no auth, crypto, untrusted input parsing, or external surface.
Level of scrutiny
High. The change sits at the intersection of cross-thread raw-pointer lifetime management (extensive unsafe, BackRef validity invariants documented in load-bearing comments), a futex wait/wake protocol that this PR substantially redesigns (new WAKE_SEQ word, wake-one → wake-all, snapshot-under-mutex ordering), and the worker shutdown() sequence whose step ordering is a documented hard constraint. Earlier review rounds on this PR surfaced two real concurrency bugs (lost-wakeup race; wrong-waiter with wake-one) that were fixed, which underscores the subtlety. The 10s timeout fallback bounds any remaining protocol error to a stall rather than a UAF, but that's still user-visible.
Other factors
All four of my prior inline findings are resolved in the current diff. Since my last review the author landed 50fff74 (the WAKE_SEQ monotonic-sequence redesign — a stronger fix than the one I suggested) and c02cb19 (LeakSanitizer exclusion for the new test, with a documented rationale matching the file-header's intentional-leak note). The robobun CI comment for c99f30c shows the new test failing on x64-asan and an unrelated bunx.test.ts failure across all platforms; c02cb19 should address the former, but I haven't seen green CI for HEAD. Given the scope — new synchronization primitive, modified shutdown ordering, and a leaksan exclusion — this warrants human review rather than auto-approval.
There was a problem hiding this comment.
All four of my earlier findings have been addressed and I didn't spot anything new in the WAKE_SEQ rework, but this is cross-thread futex synchronization on the worker VM teardown path — worth a human sign-off.
Extended reasoning...
Overview
This PR fixes a use-after-free in nested Web Worker teardown by adding a terminate_children_and_wait step (3.5) to WebWorker::shutdown(), so a worker parent terminates and waits for its own children before freeing the VirtualMachine they back-reference. To support this it refactors terminate_all_and_wait into a shared terminate_and_wait(parent_filter, timeout) helper, replaces the futex word with a monotonic WAKE_SEQ counter (avoiding ABA on OUTSTANDING), switches register()/mark_exited() to wake-all, and hardens flush_logs() / the shutdown path against stale JSC TerminationExceptions. Two regression tests and a LeakSan exclusion are added; the VirtualMachine.rs change is doc-only.
Security risks
None in the conventional sense (no auth/crypto/input-handling). The change is memory-safety-relevant though: it edits the cross-thread lifetime contract that justifies every BackRef<VirtualMachine> deref in start_vm(), and the ordering of steps in shutdown() is load-bearing for soundness.
Level of scrutiny
High. This is core runtime concurrency code: an intrusive linked list walked under a process-global mutex, futex wait/wake protocol with a newly-introduced sequence word, and a multi-step teardown where each step is a barrier for the next. The PR already went through four review iterations here (lost-wakeup snapshot race, wake-one vs wake-all with multiple waiters, timeout-log accuracy, stale doc comment) — all fixed — which itself signals the subtlety. The final WAKE_SEQ design looks correct to me (snapshot taken under MUTEX before the sweep; every state change bumps it and wakes all), but a human reviewer familiar with this file's history should confirm the step-3.5 placement (after JSC teardown, before unlink()/dispatchExit) and the 10-second bounded-wait fallback are acceptable.
Other factors
- All four of my prior inline comments are resolved by commits on this branch, and the current bug-hunting pass found nothing new.
- Regression tests reproduce the original UAF and the
flush_logspanic on unfixed builds per the PR description. - The
no-validate-leaksan.txtaddition is justified (intentional leak documented in the file header) and matches existing exclusions forworker.test.ts/worker_blob.test.ts. - I'm not approving because the change is neither simple nor mechanical and sits squarely on a critical, hard-to-test code path.
|
CI status for reviewers: everything this PR touches is green. The worker test suites pass on all lanes, including debian-13 x64-asan after the LeakSanitizer exclusion (that lane's earlier failures were a flaky 32-byte leak report from the intentional worker-teardown leak, reproduced and verified locally on a release-asan build). The remaining red is unrelated: bunx.test.ts fails on every OS in every recent build of this branch, including builds whose diff was a text file or an empty commit, and the two single-platform failures in build 61283 (sql-mysql-bind-blob-borrow on alpine, install migration complex-workspace on ubuntu 25.04) are in suites this PR does not touch. |
What does this PR do?
Fixes a use-after-free found by Fuzzilli (fingerprint
Address:heap-use-after-free, flaky, on a Worker thread).Root cause: a worker's children hold a
BackRefto the parent'sVirtualMachineand read it throughoutstart_vm()(transform options clone, env map clone,standalone_module_graphinVirtualMachine::init_worker). When the parent is itself a worker and gets terminated, itsshutdown()freed that VM immediately. Children still starting up then read freed memory:This was the "Known gap" documented in the
web_worker.rsheader: the main thread waits for all workers at exit (terminate_all_and_wait), but a worker parent did not wait for its own children. The first regression test below crashes release builds too, so this is reachable in production, not only under ASAN.The fix: the fixing line is the
terminate_children_and_wait(vm_ptr, 10_000)call inshutdown()(step 3.5): before freeing its VM, an exiting worker terminates the workers whose parent VM is its own and futex-waits until each has unlinked from the live-workers registry, which is past every parent-VM read. It reuses the existingterminate_all_and_waitsweep, parameterized with a parent filter. This mirrors Node'sEnvironment::stop_sub_worker_contexts(). The placement after JSC teardown means no JS can run to register new children, so the sweep is complete.Two more crashes on the same termination race, surfaced by the same repro once the UAF was fixed:
flush_logs()hitpanic!("unhandled exception")when a child's entry-point resolution failed (revoked blob URL) while its termination was in flight: the pending TerminationException makes the log-to-JS conversion returnJsError::Terminated. Termination is a normal event, not an invariant violation, so skip dispatching the error event. A genuineJsError::Thrownfrom the conversion is reported throughreport_uncaught_exceptioninstead of crashing.handleTrapsre-arms from the trap bit). Re-entering JS in that state tripsASSERT(vm.hasTerminationRequest())inVMTraps::deferTerminationSlow(debug builds). The worker death path now clears the stale exception at its JS re-entry points, using the existingclear_termination_exception()(same pattern as the test runner and repl).How did you verify your code works?
Two regression tests in
test/js/web/workers/worker-terminate-lifetime.test.ts, each a spawned fixture where a worker creates children and is terminated while they start:flush_logspanic and the VMTraps assertion path; crashes 5/5 on the unfixed ASAN build.Also ran the existing worker suites (
worker.test.ts,worker_blob.test.ts,worker-terminate-lifetime.test.ts,worker_threads.test.ts,worker-async-dispose.test.ts,message-port-context-destroy-leak.test.ts) with the debug+ASAN build. The pre-existing failures inworker.test.ts("worker with event listeners doesn't close event loop", 1000ms budget) andworker_threads.test.ts("eval does not leak source code") reproduce identically on an unfixed build on this runner; they are unrelated timing flakes on slow debug machines.The file is also added to
test/no-validate-leaksan.txt: the debian-13 x64-asan lane runs tests withdetect_leaks=1, and worker teardown intentionally leaks when a parent context is gone before the close task runs (the thread-held Worker ref documented inweb_worker.rs), which these tests exercise by design. Reproduced on a local release-asan build (flaky 32-byte leak report in the fixture's stderr, about one run in five);worker.test.tsandworker_blob.test.tsare already excluded for the same reason. AddressSanitizer's use-after-free detection is unaffected by the exclusion.