[Wasm RyuJIT] Spill live ref/byref values to pinned stack slots at calls#129059
kg wants to merge 3 commits into
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
Pull request overview
This PR introduces a Wasm-specific mechanism intended to make ref/byref values GC-visible across call sites by injecting a new IR node (GT_WASM_SPILL_REF) and a new Wasm phase (WasmSpillRefs) that inserts these nodes and allocates pinned stack spill slots used during Wasm codegen.
Changes:
- Add GT_WASM_SPILL_REF node kind and operand iteration support.
- Add Compiler::WasmSpillRefs phase to insert spill nodes around calls and allocate spill locals.
- Extend Wasm codegen/regalloc to track a spill index and (temporarily) force-enregister a scratch “splash zone” local.
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| src/coreclr/jit/regallocwasm.cpp | Forces the “splash zone” local to be treated as a reg candidate. |
| src/coreclr/jit/gtlist.h | Adds new Wasm node WASM_SPILL_REF. |
| src/coreclr/jit/gentree.cpp | Updates operand-edge iterator to treat GT_WASM_SPILL_REF as unary. |
| src/coreclr/jit/fgwasm.cpp | Implements Compiler::WasmSpillRefs to insert spill nodes and allocate spill locals. |
| src/coreclr/jit/compphases.h | Adds PHASE_WASM_SPILL_REFS. |
| src/coreclr/jit/compmemkind.h | Adds WasmSpillRefs memory kind. |
| src/coreclr/jit/compiler.h | Adds m_wasmSpillSlots field and WasmSpillRefs declaration. |
| src/coreclr/jit/compiler.cpp | Wires WasmSpillRefs into the Wasm compilation pipeline. |
| src/coreclr/jit/codegenwasm.cpp | Emits Wasm for GT_WASM_SPILL_REF and resets spill index at calls. |
| src/coreclr/jit/codegenlinear.cpp | Resets spill index at block boundaries on Wasm. |
| src/coreclr/jit/codegen.h | Adds wasmSpillRefIndex state. |
    if (defs.size())
    {
        JITDUMP("Spilling %d live ref(s) for call\n", defs.size());
        DISPNODE(tree);

    JITDUMP("High water mark for refs was %d\n", highWaterMark);
    if (highWaterMark == 0)
        return PhaseStatus::MODIFIED_NOTHING;

    if (tree->IsValue() && tree->TypeIs(TYP_REF, TYP_BYREF) && !tree->OperIs(GT_WASM_SPILL_REF))
    {
        // TODO: Can we skip this for GT_LCL_VAR when it lives in memory? Or is it possible
        // that the LCL_VAR has been modified since it was loaded onto the Wasm stack?
        defs.push_back(tree);
    }

    varDsc->lvHasExplicitInit = true;
    varDsc->lvImplicitlyReferenced = true;
    // If we don't make this var tracked, regalloc will crash when allocating a register for it
    varDsc->lvTracked = true;
    m_wasmSpillSlots->at(0) = varNum;

    // HACK: Ensure that we always enregister the splash zone, even if we are not enregistering other locals
    if (m_compiler->m_wasmSpillSlots && m_compiler->m_wasmSpillSlots->size() && m_compiler->m_wasmSpillSlots->at(0) == lclNum)
    {
        varIsRegCandidate = true;
    }

AndyAyersMS left a comment
Generally looks good.
I'm curious how this intersects/overlaps with LSRA's spill temp mechanism. Not saying we should use that here, but it likely serves a similar purpose.
        return GenTree::VisitResult::Continue;
    });

    if (tree->IsValue() && tree->TypeIs(TYP_REF, TYP_BYREF) && !tree->OperIs(GT_WASM_SPILL_REF))

Check for unused values here?
    if (!op->TypeIs(TYP_REF, TYP_BYREF))
        return GenTree::VisitResult::Continue;

    for (size_t i = defs.size(); i > 0; i--)

Probably worth commenting what this is doing (removing active defs once we find their use, and keeping the defs collection compact).
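
To make the bookkeeping in this loop easier to follow, here is a minimal, self-contained sketch of the idea: walk nodes in execution order, retire a def once its use is found, and record how many ref-typed values are live at each call. `Node`, its fields, and `MaxLiveRefsAtCalls` are hypothetical stand-ins, not the JIT's `GenTree` API. The sketch also assumes that a call's own ref arguments count as live at that call and that unused values are skipped; the PR may decide either point differently.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an IR node; NOT the real GenTree.
struct Node
{
    bool isGcRef     = false; // produces a ref/byref value
    bool isCall      = false;
    bool unusedValue = false; // value is produced but never consumed
    std::vector<const Node*> operands;
};

// Walks nodes in execution order and returns the largest number of ref
// values that are live (defined, not yet consumed) when a call is reached.
// Assumption: the call's own ref arguments are counted as live at the call.
size_t MaxLiveRefsAtCalls(const std::vector<const Node*>& linearOrder)
{
    std::vector<const Node*> defs; // active defs, kept compact
    size_t highWaterMark = 0;

    for (const Node* node : linearOrder)
    {
        if (node->isCall && defs.size() > highWaterMark)
        {
            highWaterMark = defs.size();
        }

        // Remove active defs once we find their use. Order in 'defs' does
        // not matter, so overwrite the match with the last entry and shrink.
        for (const Node* op : node->operands)
        {
            for (size_t i = defs.size(); i > 0; i--)
            {
                if (defs[i - 1] == op)
                {
                    defs[i - 1] = defs.back();
                    defs.pop_back();
                    break;
                }
            }
        }

        // A ref-typed value becomes an active def until something uses it.
        if (node->isGcRef && !node->unusedValue)
        {
            defs.push_back(node);
        }
    }

    return highWaterMark;
}
```

With two ref-producing nodes followed by a call that consumes only the second one, the function reports two live refs at the call; a ref flagged as unused is never tracked.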
    GTNODE(WASM_JEXCEPT , GenTree ,0,0,GTK_LEAF|GTK_NOVALUE|DBK_NOTHIR) // Special jump for Wasm exception handling
    GTNODE(WASM_THROW_REF , GenTree ,0,0,GTK_LEAF|GTK_NOVALUE|DBK_NOTHIR) // Wasm rethrow host exception (exception is an implicit operand)
    GTNODE(WASM_SPILL_REF , GenTreeOp ,0,0,GTK_UNOP|DBK_NOTHIR)

Why a new node instead of just using normal stores?
STORE_LCL_VAR doesn't return a value, does it? For this to work, the value has to 'flow through' the spill into the call consuming it.
        varIsRegCandidate = false;
    }

    // HACK: Ensure that we always enregister the splash zone, even if we are not enregistering other locals

Why do these need to be enregistered?
We would have to change the shape of the generated code to have the splash zone live in memory, and it would be much worse CQ
Not sure I understand. If we were doing a similar transformation anywhere else in the JIT, we would just create new locals and store to them, as gtSplitTree or async already do. What makes that less desirable, and why can't the wasm backend handle it automatically as well as it handles this?
For context: https://gist.github.com/kg/cba44f4907f5320966058446fa01c25f
Essentially, the stack shape required to do a memory store on wasm is
[dst, value]
so if I have a list of call args like
[a, b, c, d]
I can't insert a node after b that does a memory store for a spill without first inserting a LCL_ADDR node before b.
If I have a local that's guaranteed to be enregistered, I can insert a local.tee opcode inline that will make a copy of b, and then I can do anything I want as long as I leave b on the stack afterwards (which is easy thanks to the local). By default in debug we don't enregister anything, so I need this hack to ensure I have the ability to use local.tee.
Let me know if that doesn't make sense and I can explain in more detail.
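
The stack-shape argument above can be illustrated with a toy operand-stack model. This is only a sketch under stated assumptions: `ToyMachine`, `SpillTopOfStack`, the integer "addresses", and the scratch local index are all hypothetical, and the real codegen would emit Wasm instructions rather than run a simulator. The three operations mirror the Wasm instructions `local.get`, `local.tee`, and a store that expects `[dst, value]` on top of the stack.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Toy model of a Wasm-like operand stack. Every name here is hypothetical.
struct ToyMachine
{
    std::vector<int32_t>       stack;
    std::map<int, int32_t>     locals;
    std::map<int32_t, int32_t> memory;

    void Push(int32_t v) { stack.push_back(v); }

    int32_t Pop()
    {
        int32_t v = stack.back();
        stack.pop_back();
        return v;
    }

    // Models local.get: push the local's value.
    void LocalGet(int local) { Push(locals[local]); }

    // Models local.tee: copy top of stack into the local, leave it in place.
    void LocalTee(int local) { locals[local] = stack.back(); }

    // Models a store: expects [dst, value] on top of the stack, pops both.
    void Store()
    {
        int32_t value = Pop();
        int32_t dst   = Pop();
        memory[dst]   = value;
    }
};

// Spills the value on top of the stack to memory[slotAddr] without needing
// the destination address to have been pushed underneath it beforehand.
void SpillTopOfStack(ToyMachine& m, int scratchLocal, int32_t slotAddr)
{
    m.LocalTee(scratchLocal); // [..., b]          scratch = b
    m.Push(slotAddr);         // [..., b, dst]
    m.LocalGet(scratchLocal); // [..., b, dst, b]
    m.Store();                // [..., b]          memory[dst] = b
}
```

Pushing `a` and `b`, calling `SpillTopOfStack`, and then pushing `c` and `d` leaves `[a, b, c, d]` on the stack with a copy of `b` in memory, which is the "flow through" behavior the spill node needs. Without the scratch local, the destination address would have to be on the stack before `b` was pushed.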
    if (op == defs[i - 1])
    {
        defs[i - 1] = defs[defs.size() - 1];
        defs.erase(defs.begin() + (defs.size() - 1), defs.end());

This can use pop_back (the async.cpp version could too).
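
As a small illustration of the suggestion, the overwrite-with-last-then-shrink pattern can be written with `back()` and `pop_back()`. This sketch uses `std::vector` and a hypothetical helper name, `SwapRemove`; whether the container used in the PR exposes the same members is an assumption here.

```cpp
#include <cstddef>
#include <vector>

// Removes items[index] in O(1) without preserving element order:
// overwrite the slot with the last element, then drop the last element.
// Precondition: index < items.size().
template <typename T>
void SwapRemove(std::vector<T>& items, size_t index)
{
    items[index] = items.back();
    items.pop_back();
}
```

This does the same work as erasing the one-element range at the end of the vector, but states the intent directly and avoids the iterator arithmetic.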