Skip to content

Fix quotations leaking empty-string match lowering into AST (#19873)#19923

Open
T-Gro wants to merge 12 commits into
mainfrom
feature/pattern-match-empty-string-quotations
Open

Fix quotations leaking empty-string match lowering into AST (#19873)#19923
T-Gro wants to merge 12 commits into
mainfrom
feature/pattern-match-empty-string-quotations

Conversation

@T-Gro

@T-Gro T-Gro commented Jun 9, 2026

Copy link
Copy Markdown
Member

Fixes #19873.

#19189 added an empty-string fast path inside BuildSwitch, replacing the equality test with if s <> null then s.Length = 0 else false. Because the rewrite happened during pattern-match compilation — which runs before the quotation translator — every quoted form involving match s with "" -> _ started leaking that lowering into its AST:

Lambda (_arg1,
   IfThenElse (
      IfThenElse (Call (None, op_Inequality, [_arg1, Value (<null>)]),
                  Call (None, op_Equality, [PropertyGet (Some _arg1, Length, []), Value 0]),
                  Value false),
      ...))

That broke quotation consumers (translators, target-language compilers) that assume the quoted tree mirrors the source — see the issue for the regression scenario.

The fix moves the rewrite out of pattern-match compilation and into OptimizeExprOp, which already skips Expr.Quote bodies. match s with "" -> _ now reaches the optimizer as a plain op_Equality call, so the quoted form is back to:

Lambda (x, IfThenElse (Call (None, op_Equality, [x, Value ""]), ...))

The optimized IL for compiled code is unchanged. As a side effect, if s = "" then _ now produces the same null-safe length-check IL as match s with "" -> _ — improvement, not regression. Two accepted IL trade-offs are pinned by baseline tests: --optimize- no longer collapses to the null+Length form ((=) is only inlined when LocalOptimizationsEnabled), and match s with null | "" -> _ emits one redundant brfalse.s the JIT folds.

T-Gro and others added 10 commits June 5, 2026 14:18
…clean

The empty-string lowering added in #19189 produced `if x <> null then x.Length = 0
else false` inside `BuildSwitch`, which leaked through pattern-match compilation
into captured quotations (#19873). Move the rewrite to `OptimizeExprOp`, which
already skips `Expr.Quote` bodies. `match s with "" -> _` keeps the same IL and
quotations now see `op_Equality(s, "")`. As a side effect, `if s = "" then _`
gets the same null-safe length-check IL — improvement, not regression.

Also adds a generic `verifyOutputAgainstBaseline` helper in `Compiler.fs` and a
`.bsl`-driven quotation rendering test suite under `Conformance/Expressions/
ExpressionQuotations/QuotationRendering/` to catch future leaks of pattern-match
lowerings into quotation ASTs.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adversarial reviews flagged two IL-quality concerns. Verified both by
inspecting actual ildasm output, accepted both, and added IL tests
locking the current behavior in so future changes are conscious:

  1. --optimize- (debug) builds no longer get the null+Length fast path
     for `match s with "" -> _` because F#'s `(=)` is only inlined when
     LocalOptimizationsEnabled is true. The fallback `String.Equals(s,"")`
     call is still correct; JIT tiered compilation reaches the fast path.
  2. `match s with null | "" -> _` emits one redundant `brfalse` on the
     empty-string branch because the optimizer cannot see the enclosing
     null-filtered context that BuildSwitch's `isNullFiltered` flag tracked.
     JIT trivially eliminates the redundant branch.

Also tightens `IsILMethodRefSystemStringEquals` to require the call to
be static (rejects hypothetical instance/user-defined `System.String`
type with a 2-arg static `Equals`) and fixes a stale function-name
comment in PatternMatchCompilation.fs.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Driver: extract `printerProgram` helper using a triple-quoted F# string
(no more `\"\"` escape hell); group tests into Regression / NoLeakReference
/ SideEffect / PreExistingError submodules.

Baselines: drop the `Match` suffix (folder is already `QuotationRendering`),
delete redundant `SparseIntMatch` (same code path as Consecutive), shrink
all multi-case tests to the minimum that still demonstrates the point:
- ConsecutiveInts: 1..10 → 1..3 (was 41 lines, now 7)
- Chars, NonEmptyString: 3 → 2 cases
- Int64, Float, Decimal: 3 → 1 case

Result: 9 baselines totalling 37 lines, each fits on screen.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The empty-string regression tests and surrounding trade-off comments grew
beyond what they need to convey. This trims prose to the causal mechanism
only, flattens four banner-only sub-modules into one, collapses the five
no-leak reference tests into a parametrised Theory, and corrects a stale
docstring claim that Int64/Float/Decimal each take a distinct BuildSwitch
arm (Int64 and Float share the mkILAsmCeq path, so Float.bsl was redundant
with Int64.bsl and is removed). Release note shrinks from a paragraph to
the one user-observable change.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…-architect review

Three rounds of adversarial multi-model (gpt-5.5, opus-4.7-xhigh, opus-4.8) architect
review converged on a cleaner design for the empty-string regression coverage:

- Quotation literals now live directly in test method bodies (`<@ … @>`) instead of
  inside a generated FSI script. They are desugared at test-project compile time by the
  proto fsc, which converts a runtime baseline diff into a compile-time gate: if #19873
  ever regresses, the test source itself fails to compile with FS0452.
- Removes the shared-FSI session, the no-op `RunTestCasesInSequence` cargo cult, the
  string-templated printer with its triple-quote escape guard, and the unused
  `verifyOutputAgainstBaseline` helper this PR briefly introduced.
- Baselines carry a `// <name>` provenance header so a `.bsl` opened in isolation
  identifies itself in PR diffs.
- Adds an orphan-guard `[<Fact>]` that fails when `*.bsl` on disk drifts from the
  test method set (skipped during `TEST_UPDATE_BSL=1` to avoid racing with writes).
- Adds a structural Expr walker as belt-and-suspenders alongside the baseline for the
  two known leaked-lowering shapes (`String.Length`, `String.Equals`).
- Adds a convergence assertion that the `match x with ""` and `if x = ""` quotations
  desugar to byte-identical Exprs, using shared module-level let bindings so the
  proof and the baselines cannot drift apart.
- Splits the FS0452 array-pattern diagnostic test into its own file since it uses a
  different harness (`FSharp |> typecheck |> shouldFail`, not a `.bsl` snapshot).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
`tname_String`, `tname_Bool`, `tname_Type` already exist in `src/Compiler/AbstractIL/il.fs`
as `[<Literal>]` constants alongside ~20 other type-name literals, but they were not
exposed via `il.fsi` so callers in `Optimize/Optimizer.fs` and `Symbols/Exprs.fs` had
been duplicating the magic strings (`"System.String"`, `"System.Boolean"`, `"System.Type"`).

Exposes the three names actually referenced outside `il.fs` via `il.fsi` and switches
the four affected `IL{TypeRef,MethodRef}.Name` comparisons to use them. No behavior
change. The fix applies to the empty-string-equals matcher this PR added plus the
pre-existing `String.Concat` / `String.GetHashCode` / `Type.GetTypeFromHandle` matchers.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
5-line walker docstring, 3-line section-divider blocks, 6-line bullet list above
the no-leak reference Facts, and a 12-line "Design notes" header were rationalising
self-evident code or restating test names. Trimmed to the essentials: top-of-file
purpose + bootstrap caveat, one-line note next to the shared `let` bindings, and
the one genuinely-non-obvious comment (Int64 takes the mkILAsmCeq arm).

Net effect: file shrinks from 156 to 100 lines, signal density up, tests unchanged.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
It tested filesystem bookkeeping (does *.bsl on disk equal a hand-maintained
expectedBaselines set), not the compiler. The drift it catches — a deleted
test leaving its .bsl behind — is already visible in `git diff` and adds zero
signal anyone would actually use to find a regression.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Three garbage hunters (production / tests / prose) found redundancies that
slipped through the earlier iterations:

Production:
- Merge two String.Equals(_, "")/("", _) optimizer arms into one or-pattern.
- Drop the redundant CallingConv.IsStatic guard in IsILMethodRefSystemStringEquals
  (ArgCount=2 + both args String + return Bool already pin the static overload).
- Inline single-use let bindings in MakeOptimizedStringEqualsEmptyCall.
- Shrink 4-line docstring to 2 lines; shrink match-arm comment to one line.
- Shrink il.fsi block comment and PatternMatchCompilation.fs trade-off comment.

Tests:
- Delete QuotationUnsupportedConstructsTests.fs (FS0452 array-pattern coverage
  already exists in Regressions/E_DecomposeArray01.fs — pure duplication).
- Delete IfEqualEmpty.bsl + its test + the convergence Fact + the shared `let`
  bindings + assertNoEmptyStringLowering walker; all duplicated EmptyString.bsl
  coverage either as identical AST or as belt-and-suspenders restating the
  baseline diff.
- Strip the `// <name>` provenance header from the printer and from all 7
  remaining baselines — the test method name already identifies each baseline.
- Trim trade-off comment tails in TypeTestsInPatternMatching.fs (drop the
  JIT-folds-the-duplicate / tiered-compilation trivia).

Release notes: drop the implementation-detail bullet about `if s = ""` getting
the same null-safe IL — implied by moving the rewrite to the optimizer.

Net: -115/+30 across 16 files. 14 tests still pass; build clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Extracts a `(|StringTy|_|)` partial active pattern over `ILType` and a small
`isILMethodRefOnSystemString` helper that bakes in `tname_String` as the
declaring type and takes the per-arg-shape check as an `[<InlineIfLambda>]`.

Each `IsILMethodRefSystem*` now reads as the literal shape it matches:
`[StringTy; StringTy]` for Equals(string, string); the three overload arities
of String.Concat as alternative list patterns; `[ILType.Array (shape, StringTy)]`
for Concat(string[]). No `ArgCount` book-keeping or `List.forall` walks left.

Behaviour unchanged: build clean, 14 quotation + IL tests still green.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions

github-actions Bot commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

❗ Release notes required

You can open this PR in browser to add release notes: open in github.dev


✅ Found changes and release notes in following paths:

Warning

No PR link found in some release notes, please consider adding it.

Change path Release notes path Description
src/Compiler docs/release-notes/.FSharp.Compiler.Service/11.0.100.md No current pull request URL (#19923) found, please consider adding it

@github-actions github-actions Bot added the AI-Tooling-Check-Bypassed Tooling check: non-fork PR, not diff-analyzed label Jun 9, 2026
@T-Gro T-Gro requested a review from abonie June 10, 2026 13:25
let finalTest = if isNullFiltered then test else mkLazyAnd g m (mkNonNullTest g m vExpr) test
mkLetBind m bind finalTest
| DecisionTreeTest.Const (Const.String _ as c) ->
// Empty-string is rewritten to a null-safe length check by the Optimizer (#19873).

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd remove this comment, seems out of place

@github-project-automation github-project-automation Bot moved this from New to In Progress in F# Compiler and Tooling Jun 10, 2026
…a FSI

- PatternMatchCompilation.fs: remove the misplaced `// Empty-string is rewritten…`
  comment on the (now-pristine) string arm (review feedback from @abonie).

- QuotationRenderingTests.fs: replace literal `<@…@>` quotations with source-string
  evaluation through a shared FSI session. The bootstrap fsc that builds the test
  project still has the pre-fix desugar and rejects literal `match s with ""`
  quotations with FS0452 — the literals only become valid AFTER this PR is in the
  bootstrap. FSI uses the just-built FCS (with the fix), so source-as-string
  evaluation is bootstrap-immune. The .bsl baselines are unchanged.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

AI-Tooling-Check-Bypassed Tooling check: non-fork PR, not diff-analyzed

Projects

Status: In Progress

Development

Successfully merging this pull request may close these issues.

Quotations matching "" produce incorrect representation

2 participants