Skip to content

test(conformance): arm the fixtures for the input-required scenario family#2319

Merged
felixweinberger merged 3 commits into
v2-2026-07-28from
fweinberger/mrtr-conformance-arming
Jun 18, 2026
Merged

test(conformance): arm the fixtures for the input-required scenario family#2319
felixweinberger merged 3 commits into
v2-2026-07-28from
fweinberger/mrtr-conformance-arming

Conversation

@felixweinberger

@felixweinberger felixweinberger commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

This arms the conformance fixtures for the multi round-trip request scenarios (SEP-2322,
protocol revision 2026-07-28) now that the SDK has the inputRequired() server API and the
client auto-fulfilment engine, and burns the corresponding entries out of the expected-failures
baselines.

Motivation and Context

The conformance suite's input-required-result-* server family and the SEP-2322 client
scenario were baselined as expected failures because the fixtures predate the multi round-trip
feature: the everything server did not register the diagnostic test_input_required_result_*
tools and the everything client never reached the auto-fulfilment driver. With the feature in
place, the fixtures can exercise it and the baselines can shrink.

How Has This Been Tested?

  • All five conformance legs run exactly as CI runs them, against the pinned
    @modelcontextprotocol/conformance release, and exit 0 with zero stale baseline entries:
    the 2025 server leg stays 42/0, the draft server leg goes 39→58 passed checks (14→3
    expected-failure entries), the forced-2026 server leg 63→82 (15→4), the client all leg
    324→329 (15→14), and the forced-2026 client leg 207→212 (25→24).
  • The intermediate state (armed fixtures, baselines untouched) makes each leg exit non-zero
    with exactly the newly passing scenarios reported as stale, which is the gating mechanism
    this change relies on.
  • The conformance package's lint/format checks pass.

Breaking Changes

None — test fixtures and expected-failure baselines only.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update

Checklist

  • I have read the MCP Documentation
  • My code follows the repository's style guidelines
  • New and existing tests pass locally
  • I have added appropriate error handling
  • I have added or updated documentation as needed

Additional context

  • The everything server registers the diagnostic tools write-once style with inputRequired()
    and completes from ctx.mcpReq.inputResponses on the retry. Tools that mint requestState
    integrity-protect it with a small fixture-local HMAC helper and verify it on every retry —
    the worked example of the consumer obligation the migration guide documents.
  • input-required-result-tampered-state stays in the baseline: the fixture detects tampering
    and rejects, but a rejection thrown from a tool callback surfaces as an isError tool result
    (the tools/call funnel wraps handler throws), not the JSON-RPC error the scenario requires.
    It unblocks when the SDK can reject invalid requestState at the protocol layer (the planned
    opt-in sealing helper) or lets this rejection escape as a protocol error.
  • The SEP-2322 client scenario's mock server answers neither server/discover nor
    initialize, so the everything client supplies the connect-time discover response through
    the transport's custom fetch and lets every other request hit the real mock; everything the
    scenario measures (auto-fulfilment through the registered elicitation handler, byte-exact
    requestState echo, fresh request ids, isolation of unrelated calls, no retry on complete
    results) is the SDK driver's behavior.
  • The previously WARNING-only missing-input-response and ignore-extra-params entries are
    removed too: the armed tool re-requests missing keys and ignores unrecognized ones, so those
    scenarios now pass cleanly.

(checklist note: "Documentation update" is the closest box for a tests/baselines-only change —
swap to "New feature" if the reviewer prefers classifying fixture arming that way.)

@felixweinberger felixweinberger requested a review from a team as a code owner June 18, 2026 12:50
@changeset-bot

changeset-bot Bot commented Jun 18, 2026

Copy link
Copy Markdown

⚠️ No Changeset found

Latest commit: 72cab85

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@pkg-pr-new

pkg-pr-new Bot commented Jun 18, 2026

Copy link
Copy Markdown

Open in StackBlitz

@modelcontextprotocol/client

npm i https://pkg.pr.new/modelcontextprotocol/typescript-sdk/@modelcontextprotocol/client@2319

@modelcontextprotocol/codemod

npm i https://pkg.pr.new/modelcontextprotocol/typescript-sdk/@modelcontextprotocol/codemod@2319

@modelcontextprotocol/server

npm i https://pkg.pr.new/modelcontextprotocol/typescript-sdk/@modelcontextprotocol/server@2319

@modelcontextprotocol/server-legacy

npm i https://pkg.pr.new/modelcontextprotocol/typescript-sdk/@modelcontextprotocol/server-legacy@2319

@modelcontextprotocol/express

npm i https://pkg.pr.new/modelcontextprotocol/typescript-sdk/@modelcontextprotocol/express@2319

@modelcontextprotocol/fastify

npm i https://pkg.pr.new/modelcontextprotocol/typescript-sdk/@modelcontextprotocol/fastify@2319

@modelcontextprotocol/hono

npm i https://pkg.pr.new/modelcontextprotocol/typescript-sdk/@modelcontextprotocol/hono@2319

@modelcontextprotocol/node

npm i https://pkg.pr.new/modelcontextprotocol/typescript-sdk/@modelcontextprotocol/node@2319

commit: 72cab85

@claude claude Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No bugs found, but this is a sizeable conformance-fixture change (new SEP-2322 tool/prompt fixtures, a custom-fetch discover shim in the client scenario, and baseline burn-down) stacked on the unmerged MRTR server-seam branch, so a human should confirm the fixture design and that the conformance legs stay green once the base lands.

Extended reasoning...

Overview

This PR is test-infrastructure only: it adds SEP-2322 (multi round-trip / input-required) diagnostic tools and a prompt to the conformance everything server, a new sep-2322-client-request-state scenario handler to the everything client, and removes the corresponding entries from both expected-failures baselines (keeping input-required-result-tampered-state with an updated rationale). No SDK source under src/ or published package code is touched.

Security risks

None to production code — the HMAC requestState minting/verification helpers and the local server/discover fetch shim live entirely in the conformance fixture. The HMAC helper is actually a worked example of the integrity-protection obligation the migration guide documents, and uses timingSafeEqual correctly with a length pre-check.

Level of scrutiny

Moderate. Although test-only, the change is large (~450 added lines), depends on the not-yet-merged MRTR server-seam branch for the inputRequired() / acceptedContent / ServerContext APIs it imports, and encodes design judgments a maintainer may want to weigh in on: which scenarios stay baselined (tampered-state because tool throws surface as isError results), the ctx as ServerContext cast in the no-args prompt registration, and especially the client scenario answering server/discover locally through a custom fetch because the conformance mock implements neither negotiation path — that shim is a reasonable workaround but is the kind of fixture trick a maintainer should bless. The baselines are self-policing (the runner fails on stale entries and unexpected failures), so CI on the merged stack will catch any mismatch between the registered fixture tool names and what the pinned conformance release expects.

Other factors

The bug-hunting pass found no issues, and the author reports all five conformance legs exiting 0 against the pinned conformance release with zero stale entries. The changeset bot warning is fine since this is a test package only. Given the size, the stacked base, and the draft-spec (SEP-2322) subject matter, this falls outside the simple/mechanical bar for shadow approval, but there is nothing blocking from the automated review side.

@felixweinberger felixweinberger force-pushed the fweinberger/mrtr-server-seam branch from cfa23e9 to 58297e2 Compare June 18, 2026 13:06
@felixweinberger felixweinberger force-pushed the fweinberger/mrtr-conformance-arming branch 2 times, most recently from e62b9cc to e240f12 Compare June 18, 2026 13:19
@felixweinberger felixweinberger force-pushed the fweinberger/mrtr-server-seam branch from 317ad59 to 893242c Compare June 18, 2026 13:25
@felixweinberger felixweinberger force-pushed the fweinberger/mrtr-conformance-arming branch from e240f12 to dccb610 Compare June 18, 2026 13:25
Comment thread test/conformance/expected-failures.yaml
@felixweinberger felixweinberger force-pushed the fweinberger/mrtr-conformance-arming branch from dccb610 to 6c96dd6 Compare June 18, 2026 13:49
Base automatically changed from fweinberger/mrtr-server-seam to v2-2026-07-28 June 18, 2026 14:02
…cenarios

Register the test_input_required_result_* diagnostic tools and the
input-required prompt in the everything server, written write-once style with
the inputRequired() builder and completing from ctx.mcpReq.inputResponses on
the retried request. Tools that mint requestState integrity-protect it with a
fixture-local HMAC helper and verify it on every retry, as the worked example
of the consumer obligation documented in the migration guide.

Add the SEP-2322 client scenario to the everything client: the SDK client
negotiates the 2026-07-28 era and its auto-fulfilment driver answers the
mock's embedded elicitation requests, echoes requestState byte-exact, and
retries on fresh request ids. The scenario's mock answers neither
server/discover nor initialize, so the fixture supplies the connect-time
discover response through the transport's custom fetch; everything the
scenario measures runs against the real mock.
… now pass

Remove the scenarios the armed fixtures make pass from both expected-failures
files: the input-required-result server family (except tampered-state) from
the draft and forced-2026 server legs, and sep-2322-client-request-state from
both client legs. The tampered-state entry stays with an updated reason: the
fixture detects tampering, but a rejection thrown from a tool callback
surfaces as an isError tool result rather than the JSON-RPC error the
scenario requires.
…red-state now passes

Moves the fixture-local HMAC verification of requestState into the new
ServerOptions.requestState.verify hook so a tampered retry answers the
wire-level -32602 the scenario requires (instead of an isError tool
result from the tools/call funnel). Removes the
input-required-result-tampered-state entry from both expected-failures
baselines.
@felixweinberger felixweinberger force-pushed the fweinberger/mrtr-conformance-arming branch from 6c96dd6 to 72cab85 Compare June 18, 2026 14:04
@felixweinberger felixweinberger merged commit f7c29e8 into v2-2026-07-28 Jun 18, 2026
15 checks passed
@felixweinberger felixweinberger deleted the fweinberger/mrtr-conformance-arming branch June 18, 2026 14:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant