From cee2b1569d584b7a3c653ee0933829080fd95a35 Mon Sep 17 00:00:00 2001 From: Branimir Rakic Date: Wed, 27 May 2026 00:15:51 +0200 Subject: [PATCH 01/12] docs(rfc): add OT-RFC-40 multi-storage KC URI scheme MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Formalises the V9/V10 plural-storage URI pattern that already exists in production (V9 KnowledgeAssetsStorage uses uriBase did:dkg:v9 alongside V10 KnowledgeCollectionStorage at did:dkg) so future contract upgrades can land additively without invalidating any existing UAL or store.nq triple. Companion to GitHub issue #679 (chain-reset auto-wipe). Removes the need for the wipe over time on networks where the scheme is in force. Decisions recorded (per §11): - adopt the explicit-tag URI scheme as specified in §5.2 - sequence the 6 PRs as written Out of scope: content-addressed identity, CG storage versioning, wholesale removal of chainResetMarker. Each tracked as separate. Co-authored-by: Cursor --- docs/RFC40_MULTI_STORAGE_KC_URI_SCHEME.md | 383 ++++++++++++++++++++++ 1 file changed, 383 insertions(+) create mode 100644 docs/RFC40_MULTI_STORAGE_KC_URI_SCHEME.md diff --git a/docs/RFC40_MULTI_STORAGE_KC_URI_SCHEME.md b/docs/RFC40_MULTI_STORAGE_KC_URI_SCHEME.md new file mode 100644 index 000000000..e78aa1a02 --- /dev/null +++ b/docs/RFC40_MULTI_STORAGE_KC_URI_SCHEME.md @@ -0,0 +1,383 @@ +# OT-RFC-40 — Multi-storage knowledge-collection URI scheme + +**Status:** Draft +**Companion:** [GitHub issue #679 — chain-reset auto-wipe](https://github.com/OriginTrail/dkg/issues/679) +**Related:** [`docs/TESTNET_RESET.md`](./TESTNET_RESET.md), [`packages/cli/src/daemon/chain-reset-wipe.ts`](../packages/cli/src/daemon/chain-reset-wipe.ts) + +## 1. Summary + +The chain-reset auto-wipe in `chain-reset-wipe.ts` exists because the wire identity of a Knowledge Collection's data on disk is, today, implicitly bound to a single deployed `KnowledgeCollectionStorage` contract. When that storage contract is re-deployed (testnet reset, major mainnet upgrade), the IDs embedded in store.nq triples lose their on-chain referent and the daemon's only safe move is to destroy them. + +This RFC formalises a URI scheme in which a KC's wire identity is bound to *which storage instance minted it*, not just to "the chain". Old data published against a previous storage instance continues to be valid and queryable indefinitely; new storage versions live alongside old ones; the auto-wipe becomes unnecessary on any network where this scheme is in force. + +**Crucially, this is not a new pattern**. It is a formalisation of the V9 / V10 coexistence pattern that already runs in production on Base Sepolia testnet today. The work below is closing the gaps that prevent the pattern from being repeatable for V11, V12, and arbitrary future upgrades. + +## 2. Motivation + +Three failure modes drive this: + +1. **Testnet dev rugpull (#679):** Every release-candidate redeploys storage contracts and bumps `chainResetMarker`. The next daemon boot deletes `store.nq`. Developers who switch between branches in the same `DKG_HOME` lose their local-only context graphs silently. Reproducible twice in five hours during a normal RC cadence. +2. **Mainnet upgrade rugpull risk:** The same hook is wired into mainnet's boot path. A future maintainer who bumps `chainResetMarker` in `network/mainnet-*.json` — by accident or by design — would destroy every operator's published data on next restart. There is no off-switch. +3. **Multi-storage coexistence is a real product requirement.** V9 `KnowledgeAssetsStorage` already coexists with V10 `KnowledgeCollectionStorage` on the testnet (see §3.1). Future versions will need to coexist too, and the existing convention is undocumented and partially-implemented. + +## 3. Current state + +### 3.1 Plural KC storage is already deployed + +`packages/evm-module/deployments/base_sepolia_v10_contracts.json`: + +```93:96:packages/evm-module/deployments/base_sepolia_v10_contracts.json + "KnowledgeCollectionStorage": { + "evmAddress": "0x4fCA405d46ADeDD7050420C1937842D2a36a04D8", + "version": "1.0.0", + "gitBranch": "main", +``` + +```309:312:packages/evm-module/deployments/base_sepolia_v10_contracts.json + "KnowledgeAssetsStorage": { + "evmAddress": "0x45E0e14c695681c8c93d6A489a314ea1EC28ba59", + "version": "2.0.0", + "gitBranch": "validate/v10-e2e-devnet", +``` + +Both are registered in the same Hub. Both are queried by agents via `hub.getAssetStorageAddress()`. Existence proof. + +### 3.2 Each storage has its own URI prefix + +`parameters.json` carries a `uriBase` per storage instance: + +```51:93:packages/evm-module/deployments/parameters.json + "KnowledgeCollectionStorage": { + "knowledgeCollectionSize": "1000000", + "uriBase": "did:dkg" + }, + ... + "KnowledgeAssetsStorage": { + "uriBase": "did:dkg:v9" + }, +``` + +This is the architectural primitive we're formalising. V9 KAs are minted under the `did:dkg:v9` URI prefix; V10 KCs under `did:dkg`. The prefix is baked into the storage contract at construction time and queryable on-chain via the standard ERC-1155 `uri(uint256)` view: + +```80:82:packages/evm-module/contracts/tokens/ERC1155Delta.sol + function uri(uint256) public view virtual override returns (string memory) { + return _uri; + } +``` + +### 3.3 The chain side is already multi-storage aware + +`RandomSamplingLib.Challenge` carries the storage address inline: + +```6:14:packages/evm-module/contracts/libraries/RandomSamplingLib.sol + struct Challenge { + uint256 knowledgeCollectionId; + uint256 chunkId; // TODO:Smaller data structure + address knowledgeCollectionStorageContract; + uint256 epoch; + uint256 activeProofPeriodStartBlock; + uint256 proofingPeriodDurationInBlocks; + bool solved; + } +``` + +Random sampling was designed for plural KC storage from day one. The challenge tells the prover *which* storage holds the targeted KC; the prover doesn't have to guess. + +### 3.4 Where the current model leaks + +Despite the primitives existing, three sites short-circuit on "the singular KC storage": + +- **Logic-side resolution** (`KnowledgeAssetsV10.sol:325`, `RandomSampling.sol:123`): every consumer calls `hub.getAssetStorageAddress("KnowledgeCollectionStorage")` — a single name. There's no convention for V2, V3, etc. — adding a second instance under the same name is a name-collision overwrite, not an additional registration. +- **UAL construction** (`packages/publisher/src/dkg-publisher.ts:2324`, `:2625`, `:2701`, `:2756`, `:2788`, `:2802`): the UAL is hard-coded to `did:dkg:${chainId}/${publisherAddress}/${startKAId}`. The storage's actual `uriBase` is never consulted; the agent assumes V10's `did:dkg` prefix always. V9 has its own KAS path that doesn't go through this. +- **UAL parser** (`packages/publisher/src/publish-handler.ts:581-610`): splits on `/`, assumes `segments[0] = chainId`, `segments[1] = publisherAddress`, `segments[2] = startKAId`. Three-segment-only. Anything richer fails silently. + +These three are the formalisation gap. Close them, and the storage-redeploy class of problem disappears. + +## 4. Investigation: answering the three unverified questions + +I committed in the previous round to verifying three points before finalising. Findings: + +### 4.1 Random-sampling WAL shape + +`packages/random-sampling/src/wal.ts`: + +```44:64:packages/random-sampling/src/wal.ts +export interface ProverWalEntry { + ts: string; + epoch: string; + periodStartBlock: string; + identityId: string; + status: ProverPeriodStatus; + kcId?: string; + cgId?: string; + chunkId?: string; + txHash?: string; + error?: { code: string; message: string }; +} +``` + +WAL identity is `(epoch, periodStartBlock, identityId)`. `kcId` and `cgId` are informational only — they're "set from `challenge` onwards", never used in `periodKeyEquals` (`wal.ts:77-83`). Crash recovery uses the period key, not the KC id. + +**Impact on this RFC:** zero. WAL is forward-compatible with any kcId representation that's serialisable as a decimal string. Even if we never touched the WAL, adopting the upper-bits scheme (§5.4) would just produce larger decimals in the JSONL — string-typed, no schema change. + +Should add `knowledgeCollectionStorageContract: string` to `ProverWalEntry` for forensics regardless (the chain already passes it; we just discard it). Half a day's work; not blocking. + +### 4.2 Publish-journal idempotency keys + +`packages/publisher/src/publish-journal.ts`: + +```5:19:packages/publisher/src/publish-journal.ts +export interface JournalEntry { + ual: string; + contextGraphId: string; + expectedPublisherAddress: string; + expectedMerkleRoot: string; + expectedStartKAId: string; + expectedEndKAId: string; + expectedChainId: string; + rootEntities?: string[]; + createdAt: number; +} +``` + +Journal is keyed by **UAL + contextGraphId + merkleRoot**, no separate kcId field. Once the UAL becomes storage-discriminating, the journal naturally becomes storage-discriminating too. No journal-level changes needed. + +### 4.3 External indexers / SubGraph manifests / ETL + +No SubGraph (The Graph protocol) manifests in this repo. The file at `packages/cli/src/indexer.ts` is the *code-graph* indexer (DKG ontology, dev-time), not a chain-data indexer. No external pipeline assumes small/dense kcId ranges. + +**Open coordination cost:** any third-party tooling that consumes `KnowledgeCollectionCreated` events would need to route on `log.address` (the emitting storage) rather than the event payload alone. This is the standard ethers/viem pattern; we shouldn't need explicit coordination, but the migration RFC's "external impact" section should call it out. + +## 5. Design + +### 5.1 Core invariant + +**A KC's UAL fully identifies which storage instance minted it, on which chain, by which publisher, with which KA range.** No agent ever needs to guess. + +### 5.2 URI shape + +Two equivalent forms are valid; the parser MUST handle both: + +``` +Legacy / V10 default (3-segment after did:dkg:): + did:dkg:{chainId}/{publisherAddress}/{startKAId} + +Storage-tagged (4-segment after did:dkg:): + did:dkg:{chainId}/{storageTag}/{publisherAddress}/{startKAId} +``` + +The `storageTag` is the suffix of the storage's on-chain `uri(0)` return value past the common `did:dkg` prefix. Examples: + +| Storage `uriBase` | `storageTag` | UAL example | +|---|---|---| +| `did:dkg` (V10 default) | _(empty — legacy 3-segment form)_ | `did:dkg:base:84532/0xA1.../12345` | +| `did:dkg:v9` | `v9` | `did:dkg:v9/base:84532/0xA1.../12345` | +| `did:dkg:v11` (hypothetical) | `v11` | `did:dkg:v11/base:84532/0xA1.../12345` | +| `did:dkg:base-staking-v2` (hypothetical) | `base-staking-v2` | `did:dkg:base-staking-v2/base:84532/0xA1.../12345` | + +This is the format V9 already uses for `did:dkg:v9` (see deployment script `packages/evm-module/deploy/archive/040_deploy_knowledge_assets_storage.ts`). We're standardising it. + +Rules: +1. The `storageTag` MUST be `[a-z0-9-]+` (matches existing V9 `v9` and all sensible future names; precludes spaces, slashes, `:` collisions with the chainId). +2. The default `did:dkg` storage (V10's current) MUST continue to produce 3-segment UALs forever, regardless of how many other storage instances exist. This is what makes the scheme backwards-compatible with every UAL ever minted. +3. A storage instance whose `uriBase` is exactly `did:dkg` is the **default storage** for that Hub. There MUST be exactly one. Other storage instances MUST have a `uriBase` of the form `did:dkg:`. + +### 5.3 Resolution path + +Given a UAL, the agent determines which storage to query as follows: + +``` +function resolveStorageForUal(ual: string): Address { + const segments = ual.slice("did:dkg:".length).split('/'); + if (segments.length === 3) { + // Default storage — the one whose uri(0) === "did:dkg" + return defaultKCStorage(); + } + if (segments.length === 4) { + const storageTag = segments[0]; + return findKCStorageByTag(storageTag); + } + throw new Error(`Malformed UAL: ${ual}`); +} +``` + +`findKCStorageByTag` consults a cache populated from Hub: + +``` +function buildKCStorageRegistry(): Map { + const all = hub.getAllAssetStorages(); + const registry = new Map(); + for (const { name, addr } of all) { + if (!name.startsWith("KnowledgeCollectionStorage") + && !name.startsWith("KnowledgeAssetsStorage")) continue; + const uriBase = await KnowledgeCollectionStorage(addr).uri(0); + const tag = uriBase === "did:dkg" + ? "" // default storage + : uriBase.slice("did:dkg:".length); + registry.set(tag, addr); + } + return registry; +} +``` + +The registry is built once at agent startup, refreshed on `NewAssetStorage` / `AssetStorageChanged` Hub events. Cheap. + +### 5.4 What we explicitly rejected: upper-bits encoding + +An alternative was to embed the storage version index in the upper bits of `kcId` / `startKAId`, keeping the UAL shape flat. Considered and rejected for the following reasons: + +- **URIs would be visually horrifying.** A V2-minted UAL would look like `did:dkg:base:84532/0xA1.../452312848583266388373324160190187140051835877600158453279131187530910662657`. Humans never read these, but anything that ever has to be eyeballed in a log, an error message, an indexer dashboard, or a debugger becomes incomprehensible. +- **The explicit-tag form already exists in production.** V9 ships under `did:dkg:v9/...` — the very pattern this RFC formalises. Adopting the bit-tag scheme instead would mean V9 is *not* an instance of the canonical scheme, which is wrong. +- **The chain-side `RandomSamplingLib.Challenge.knowledgeCollectionStorageContract` field is an `address`, not an upper-bits-encoded ID.** The chain itself has already chosen the explicit-address shape; matching it off-chain is consistent. +- **Storage layout in `kcToContextGraph`** (`ContextGraphStorage.sol:89`) — `mapping(uint256 kcId => uint256 cgId)` — actively reverts on collision (`KCAlreadyRegistered`). Either route (bit-tags or paired CG storage) handles this, but the explicit-tag route also implies "pair each new KC storage with its own CG storage", which is structurally cleaner. + +The upper-bits scheme remains a valid fallback if some future version needs a hot-path identifier in a smaller numeric form, but it's not what we're standardising. + +## 6. Backwards compatibility + +This is the part most worth stress-testing. Claims, with evidence: + +| Concern | Behaviour after RFC adoption | Evidence | +|---|---|---| +| Every UAL ever minted (V10) | Continues to resolve. Default-storage 3-segment form is forever valid. | §5.2 rule 2 | +| Every store.nq triple keyed by V10 UAL | Continues to be valid. Triples are unchanged. | Nothing rewrites store.nq | +| Every triple `?ual dkg:batchId ""^^xsd:integer` | Continues to resolve. The kcId literal is per-CG-metaGraph and remains storage-internal. | `kc-extractor.ts:184-186` lookup unchanged | +| Existing V9 KAs under `did:dkg:v9` | Continues to be readable. V9 storage stays deployed; resolver knows the `v9` tag. | §3.1 — already deployed today | +| Random sampling WAL on disk | Continues to be readable. New entries CAN carry storage address but don't have to. | §4.1 | +| Publish journal on disk | Continues to be readable. UALs gain optional tag segment. | §4.2 | +| Third-party indexers consuming chain events | No code change required. `log.address` already disambiguates emitting storage. | §4.3 | +| `verifyUALConsistency` / `publisherAddressFromUal` parsers | Need teaching the 4-segment form. ~30 lines of code. | `publish-handler.ts:581`, `dkg-publisher.ts:120` | +| `dkg-publisher.ts` UAL minting (5 sites) | Need centralising and teaching to consult `storage.uri(0)` per mint. | `dkg-publisher.ts:2324`, `:2625`, `:2701`, `:2756`, `:2788`, `:2802` | +| `chainResetMarker` auto-wipe hook | **Becomes redundant** on any network where this RFC is in force. Bump the marker, register new storage with a new `uriBase`, old data continues resolving to old storage forever. | The whole RFC | + +The only thing that CHANGES is what agents do at mint time and at parse time. Nothing on disk is invalidated by adopting this. Nothing on chain is invalidated. Existing testnet KCs published against `did:dkg` keep working without anyone doing anything. + +## 7. Implementation plan + +Six PRs, ordered, each landable independently. + +### PR 1 — Centralise UAL construction (refactor, ~1 day) + +`packages/core/src/constants.ts` already centralises CG URIs (lines 158-217). Extend with a single `kcUal()` helper: + +```typescript +export function kcUal( + chainId: string, + publisherAddress: string, + startKAId: bigint | string, + storageTag?: string, +): string { + const tag = storageTag && storageTag.length > 0 ? `${storageTag}/` : ''; + return `did:dkg:${tag}${chainId}/${publisherAddress}/${startKAId}`; +} +``` + +Replace every `did:dkg:${this.chain.chainId}/...` template literal in `dkg-publisher.ts` with a call to this. Today's call sites all pass `storageTag = undefined` and produce identical output. Zero behavioural change; just collects the format in one place for later PRs to evolve. + +**Test:** existing `dkg-publisher.test.ts` should pass unchanged. Add one `core/test/constants.test.ts` block exercising the helper with and without a tag. + +### PR 2 — Teach the UAL parser the 4-segment form (~1 day) + +Add `parseUal()` and `publisherAddressFromUal()` (the latter already exists in `dkg-publisher.ts:120` — move and generalise) to `packages/core/src/constants.ts`: + +```typescript +export interface ParsedUal { + chainId: string; + storageTag: string; // '' for default storage + publisherAddress: string; + startKAId: bigint; +} + +export function parseUal(ual: string): ParsedUal | null { ... } +``` + +3-segment input → `storageTag = ''`. 4-segment input → tag in position 0. Anything else → `null`. + +Update `verifyUALConsistency` (`publish-handler.ts:581`) to use the parser instead of raw `split('/')`. Same range-check semantics, just via the typed structure. + +**Test:** `core/test/constants.test.ts` exercises every legal form + every malformed form. `publisher/test/publish-handler.test.ts` should pass unchanged because validation behaviour for 3-segment UALs is identical. + +### PR 3 — Build the KC-storage registry (~2 days) + +Add `packages/chain/src/kc-storage-registry.ts`: + +```typescript +export class KCStorageRegistry { + private byTag = new Map(); + private byAddress = new Map(); + + async refresh(hub: Hub): Promise { ... } + getByTag(tag: string): Address | undefined { ... } + getDefaultAddress(): Address { ... } + tagFor(address: Address): string | undefined { ... } +} +``` + +Plumb it through `evm-adapter.ts` so it refreshes on agent startup + on `NewAssetStorage`/`AssetStorageChanged` events. + +This registry is only *consumed* in later PRs. Adding it now lets later PRs land independently. + +**Test:** unit tests against a mock Hub with two registered storages with different `uriBase`s. + +### PR 4 — Use the registry at mint time (~2 days) + +When the publisher mints a KC, it calls (via `evm-adapter`) some specific `KnowledgeCollectionStorage` contract. Currently the agent always uses the one named `"KnowledgeCollectionStorage"` in Hub. Change `dkg-publisher.ts` to: + +1. Resolve the storage instance it's minting into (still defaults to the named-`KnowledgeCollectionStorage` instance — this PR doesn't introduce a V2 storage, it just makes the choice explicit). +2. Read `storage.uri(0)` once at startup, cache the tag. +3. Pass the tag to `kcUal()` when constructing the UAL. + +For the default storage (`uriBase === "did:dkg"`), the tag is empty → 3-segment UAL → bit-for-bit identical to today. + +**Test:** existing publish E2E tests unchanged. Add a unit test using a mock storage whose `uri(0)` returns `did:dkg:v2` to confirm the 4-segment form is produced. + +### PR 5 — Use the registry at resolution time (~2 days) + +Two consumers need teaching: + +1. **`kc-extractor.extractV10KCFromStore` / random-sampling prover**: when handed a `kcId` from the chain's challenge, also accept `Challenge.knowledgeCollectionStorageContract`. Route the meta-graph lookup to use the storage's tag. Today the prover never had to disambiguate; from this PR forward, it does. +2. **The `verifyPublisherOwnsRange` chain-adapter method** (`evm-adapter.ts:1287-1302`): teach it which KC storage to consult, derived from the UAL. Currently it always queries `this.contracts.knowledgeAssetsStorage` — the V9 path. Should become "look up the storage by UAL tag, then query that one". + +**Test:** add a random-sampling e2e variant that targets a V9-tagged KC alongside a V10-default KC and confirms both produce valid proofs. + +### PR 6 — Document the convention (~half day) + +Add `docs/STORAGE_VERSION_TAGS.md` (operator-facing) summarising the URI format, the registry, and how to retire a tag (you don't — tags are forever). Cross-link from `TESTNET_RESET.md` and `chain-reset-wipe.ts`'s header comment. + +### Total + +~1.5–2 weeks calendar time at one engineer's typical pace, sequenced into 6 reviewable PRs. None of them require a new contract deployment, a chain migration, a data migration, or a forced upgrade window. + +## 8. Open questions + +These are the decisions the RFC does not yet make. Each is small but explicit-consent-worthy. + +1. **What's the right name for the V11 KC storage when it eventually deploys?** Naming convention: `KnowledgeCollectionStorageV11`? `KnowledgeCollectionStorage` + `uriBase: "did:dkg:v11"`? The Hub registry name and the URI tag are not the same thing today; we should pick a convention. Suggest: storage name follows the SemVer/version suffix (`KnowledgeCollectionStorageV11`); URI tag is shorter (`v11`); they're related by `KnowledgeCollectionStorage` → `""` (default), `KnowledgeCollectionStorageV{N}` → `v{N}` for N ≥ 11. +2. **Should `did:dkg` always remain the default storage tag?** Yes, IMO — it preserves every existing UAL forever. But this means we can never re-deploy "the default" storage; we'd always have to deploy a new tag and migrate logic over to read both. That's the *point*, but it's worth stating explicitly so a future maintainer doesn't try to "promote" V11 to be the new default. +3. **`KnowledgeAssetsV10`'s `getAssetStorageAddress("KnowledgeCollectionStorage")` call (`KnowledgeAssetsV10.sol:325`).** When V11 storage exists, V10's logic facade still grabs only the V10 storage. That's fine if we also deploy a V11 logic facade. But it implies "every storage version deploy comes with a paired logic facade deploy". Worth codifying as a rule. +4. **What about CG storage?** `ContextGraphStorage` is also a singleton today (`getAssetStorageAddress("ContextGraphStorage")` — see `KnowledgeAssetsV10.sol:347`, `ContextGraphs.sol:65`). The same versioning question applies. This RFC focuses on KC; CG should follow the same scheme in a follow-up RFC. CG URI construction already lives in `core/src/constants.ts`, so the centralisation work is half-done for that side. +5. **The Horizon-1 changes from issue #679 — non-destructive wipe + dev opt-out + mainnet refusal — are still recommended.** This RFC removes the *need* for the wipe over time, but doesn't help operators who are already running pre-RFC daemons on networks where the marker has bumped. The two efforts are complementary. + +## 9. Test plan + +Per PR, plus three integration sweeps: + +1. **Old-data-still-valid sweep** (post-PR-5): take a `store.nq` from a pre-RFC daemon, boot a post-RFC daemon against it, confirm every UAL in it still resolves and every triple still passes verification. Run on both a fresh checkout and a `git revert`-style downgrade roundtrip. +2. **V9 + V10 coexistence sweep** (post-PR-5): the testnet already has both. Add a CI scenario that publishes one KC against V10 and one "KA" against V9 (via the archived path) and proves both resolve cleanly from a single agent. Pins the existing production behaviour as the reference for any future V11. +3. **Hypothetical V11 sweep** (post-PR-5): deploy a `KnowledgeCollectionStorageV11` to a local devnet with `uriBase: "did:dkg:v11"`, publish a KC against it, query it via the prover. End-to-end demonstration that the scheme accommodates a brand-new storage version without a single line of agent code change beyond what this RFC ships. + +## 10. Out of scope + +- **Content-addressed identity** (the "merkle root IS the ID" model from my earlier note). That's a separate, larger RFC that would also be additive on top of this one — the chain-id-based UAL becomes one of multiple canonical identifiers, with merkle-root-keyed URIs as a parallel form. Not blocking this RFC; tracked separately. +- **Wholesale removal of `chainResetMarker`** from networks. The marker is still useful on networks that DO want auto-wipe (early-stage testnets where operators have explicitly opted into "fresh state on every redeploy"). The marker should be opt-in per network rather than opt-out. +- **CG storage versioning.** Same story applies, but the design has its own moving parts (CG ownership ERC-721, participant agents, name registry). Worth a sibling RFC after this one is settled. +- **Cross-chain UALs.** The chainId segment already supports CAIP-2 form (`base:84532`). Multi-chain coordination is orthogonal. + +## 11. Decision required + +Two yes/no calls from the reviewer: + +1. **Adopt the explicit-tag URI scheme as specified in §5.2, with the V9/V10 pattern as its precedent?** Yes/no. +2. **Sequence the 6 PRs as written, or pull a subset forward / push a subset back?** If pulled forward: PR-1 + PR-2 (the format + parser) are the only ones that must precede any future V2 KC storage deployment. PR-3 through PR-5 can be deferred until V2 is concretely planned. + +If both answers are yes, this RFC moves to Accepted and the work is small + bounded enough to fit in a normal sprint. From 56e8433acdbeaf60347df2ff48a4eba02a5415a5 Mon Sep 17 00:00:00 2001 From: Branimir Rakic Date: Wed, 27 May 2026 00:26:24 +0200 Subject: [PATCH 02/12] refactor(core,publisher): centralise KC UAL construction via kcUal() MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit OT-RFC-40 PR-1. Adds a single `kcUal()` helper in `@origintrail-official/dkg-core` that builds the V10 KC UAL string, and replaces every `did:dkg:${...}/...` template literal in `dkg-publisher.ts` with a call to the helper. The helper accepts an optional `storageTag` argument so the same one function produces both URI shapes the protocol uses today: did:dkg:{chainId}/{publisherAddress}/{localId} (tag absent) did:dkg:{tag}/{chainId}/{publisherAddress}/{localId} (tag present) The 4-segment shape is the one V9 KAS already mints today via its `uriBase: "did:dkg:v9"` constructor parameter — see `packages/evm-module/deployments/parameters.json`. The 3-segment shape is the V10 default. PR-1 changes neither output: every call site explicitly omits the tag, so the produced UAL bytes are identical to those produced by the previous template literals (verified by the publisher integration test logs: `UAL=did:dkg:mock:31337/0xd785C4cB05949169aE0792c79B6B130b1a615E07/101`). Subsequent PRs in the OT-RFC-40 series adopt the tagged form for contract upgrades that mint into a fresh `KnowledgeCollectionStorage` instance (PR-3..5), and teach the parser to accept both shapes (PR-2). This PR is pure refactor: no observable behaviour change. Two pre-existing quirks are preserved deliberately and called out in code comments rather than fixed here: 1. The publish path passes `startKAId` in the third URI segment; the update path passes `kcId`. These are two different on-chain counters. The parser at `publish-handler.ts:589-606` expects `startKAId`. Tracked separately; out of scope for centralisation. 2. The publish-confirmation site at line ~2462 reads `onChainResult.startKAId` which is typed `bigint | undefined` (undefined only on the update result type, never on publish). The local-only-update site at line ~2747 reads `publisherAddress` typed `string | undefined` (defined at line ~2670 by `localTentativePublisherAddress()` whenever `localOnlyUpdate` is true). Both sites wrap with `String(...)` to preserve the template-literal coercion exactly under the typed-but-impossible `undefined` case. Cleaner refactors are deferred so this PR stays purely behaviour-preserving. Tests: 7 new pinning tests in `packages/core/test/constants.test.ts` covering both URI shapes, string `localId` (tentative path), case preservation, and the empty-tag → 3-segment-form contract. Existing 1046 publisher tests + 40 core tests all pass. Refs: docs/RFC40_MULTI_STORAGE_KC_URI_SCHEME.md §7.1 Co-authored-by: Cursor --- packages/core/src/constants.ts | 39 ++++++++++++++ packages/core/test/constants.test.ts | 68 +++++++++++++++++++++++++ packages/publisher/src/dkg-publisher.ts | 22 +++++--- 3 files changed, 121 insertions(+), 8 deletions(-) diff --git a/packages/core/src/constants.ts b/packages/core/src/constants.ts index ea4adcd1d..55d5f4490 100644 --- a/packages/core/src/constants.ts +++ b/packages/core/src/constants.ts @@ -153,6 +153,45 @@ export function networkPeersTopic(): string { return 'dkg/network/peers'; } +// ── V10 Knowledge Collection UALs ────────────────────────────────────── + +/** + * Build a Knowledge Collection UAL. + * + * Two equivalent shapes are valid (both produced by this helper depending + * on whether `storageTag` is supplied): + * + * - Default-storage / 3-segment form (legacy V10): + * `did:dkg:{chainId}/{publisherAddress}/{localId}` + * + * - Storage-tagged / 4-segment form (V9 KAS today, V11+ in future): + * `did:dkg:{chainId}/{storageTag}/{publisherAddress}/{localId}` + * + * The default-storage form is preserved bit-for-bit forever so every UAL + * ever minted under V10 keeps resolving without any rewrite. The tagged + * form is the one V9 already uses (`uriBase: "did:dkg:v9"`); see + * docs/RFC40_MULTI_STORAGE_KC_URI_SCHEME.md for the full scheme. + * + * `localId` accepts both `bigint` (the typical chain-issued counter) and + * `string` for two existing reasons: + * 1. Tentative UALs use a synthetic `t${publishOperationId}` string ID + * until the chain confirms (see dkg-publisher.ts publish path). + * 2. The publish and update paths inside dkg-publisher.ts pass two + * different ID kinds in this slot today — startKAId for publishes, + * kcId for updates. That divergence pre-dates this helper and is + * tracked as a separate cleanup; the helper preserves the existing + * behaviour at every call site. + */ +export function kcUal( + chainId: string, + publisherAddress: string, + localId: bigint | string, + storageTag?: string, +): string { + const tag = storageTag && storageTag.length > 0 ? `${storageTag}/` : ''; + return `did:dkg:${tag}${chainId}/${publisherAddress}/${localId}`; +} + // ── V10 Named Graph URIs ─────────────────────────────────────────────── export function contextGraphDataUri(contextGraphId: string, subGraphId?: string): string { diff --git a/packages/core/test/constants.test.ts b/packages/core/test/constants.test.ts index b73934e68..d154f7342 100644 --- a/packages/core/test/constants.test.ts +++ b/packages/core/test/constants.test.ts @@ -7,6 +7,7 @@ import { contextGraphSessionsTopic, contextGraphPublishTopic, contextGraphWorkspaceTopic, + kcUal, validateContextGraphId, validateSubGraphName, validateAssertionName, @@ -62,6 +63,73 @@ describe('context graph topic helpers (V10)', () => { }); }); +describe('kcUal (OT-RFC-40 §5.2)', () => { + // The default-storage 3-segment form MUST be preserved bit-for-bit + // forever — every UAL ever minted under V10 (which defaults to + // empty storage tag) keeps resolving without any rewrite. These + // tests pin that contract. + + it('produces the legacy 3-segment form when no storage tag is supplied', () => { + expect(kcUal('base:84532', '0xA1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1', 12345n)).toBe( + 'did:dkg:base:84532/0xA1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1/12345', + ); + }); + + it('produces the legacy 3-segment form when storage tag is the empty string', () => { + expect(kcUal('base:84532', '0xA1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1', 12345n, '')).toBe( + 'did:dkg:base:84532/0xA1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1/12345', + ); + }); + + it('produces the 4-segment tagged form when a non-empty storage tag is supplied', () => { + // V9 KAS today uses uriBase did:dkg:v9, so its UALs are 4-segment. + // This is the format precedent the RFC standardises for V11+. + expect(kcUal('base:84532', '0xA1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1', 17n, 'v9')).toBe( + 'did:dkg:v9/base:84532/0xA1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1/17', + ); + }); + + it('accepts a string localId (tentative-publish path uses t)', () => { + // Pre-confirmation, dkg-publisher.ts mints a synthetic ID of the + // form `t-`. The helper must not coerce or reject + // it — same shape goes into store.nq as the tentative subject + // prefix and gets rewritten to the chain-issued KAID once + // confirmation lands. + expect( + kcUal( + 'base:84532', + '0xA1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1', + 'tabcd1234-5', + ), + ).toBe( + 'did:dkg:base:84532/0xA1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1/tabcd1234-5', + ); + }); + + it('preserves EIP-55 checksum case in publisher addresses', () => { + // Same rationale as deriveCuratorDidFromCgId — comparison sites + // lowercase, but logs/errors stay legible if the original case is + // preserved on the wire. + expect( + kcUal( + 'base:84532', + '0xd46E77003d74df9aAdF011A5115A72405b084a88', + 1n, + ), + ).toBe('did:dkg:base:84532/0xd46E77003d74df9aAdF011A5115A72405b084a88/1'); + }); + + it('does not interpolate or validate any field — caller responsibility', () => { + // Mirror of contextGraphDataUri's "we don't sanitize slashes" + // contract: the helper is dumb concatenation. Validation lives at + // upload paths (validateContextGraphId etc.). This test pins that + // the helper does not silently mangle inputs that the caller + // passes in deliberately. + expect(kcUal('chainId', 'pub', 0n)).toBe('did:dkg:chainId/pub/0'); + expect(kcUal('chainId', 'pub', 0n, 'tag')).toBe('did:dkg:tag/chainId/pub/0'); + }); +}); + describe('createOperationContext', () => { it('generates a unique operationId', () => { const ctx = createOperationContext('publish'); diff --git a/packages/publisher/src/dkg-publisher.ts b/packages/publisher/src/dkg-publisher.ts index 31bbaaf26..135a26e15 100644 --- a/packages/publisher/src/dkg-publisher.ts +++ b/packages/publisher/src/dkg-publisher.ts @@ -2,7 +2,7 @@ import type { Quad, TripleStore } from '@origintrail-official/dkg-storage'; import type { ChainAdapter, OnChainPublishResult, AddBatchToContextGraphParams } from '@origintrail-official/dkg-chain'; import { enrichEvmError } from '@origintrail-official/dkg-chain'; import type { EventBus, OperationContext } from '@origintrail-official/dkg-core'; -import { DKGEvent, Logger, createOperationContext, sha256, encodeWorkspacePublishRequest, encodeEncryptedWorkspacePayload, encryptWorkspacePayload, contextGraphDataUri, contextGraphMetaUri, contextGraphAssertionUri, assertionLifecycleUri, contextGraphSubGraphUri, contextGraphSubGraphMetaUri, validateSubGraphName, isSafeIri, assertSafeIri, assertSafeRdfTerm, DKG_GOSSIP_MAX_MESSAGE_BYTES, type Ed25519Keypair, buildAuthorAttestationTypedData, AUTHOR_SCHEME_VERSION_V1, TrustLevel, TRUST_LEVEL_PREDICATE, assertNoUserAuthoredTrustLevelQuads, buildTrustLevelQuads, isTrustLevelQuad } from '@origintrail-official/dkg-core'; +import { DKGEvent, Logger, createOperationContext, sha256, encodeWorkspacePublishRequest, encodeEncryptedWorkspacePayload, encryptWorkspacePayload, contextGraphDataUri, contextGraphMetaUri, contextGraphAssertionUri, assertionLifecycleUri, contextGraphSubGraphUri, contextGraphSubGraphMetaUri, kcUal, validateSubGraphName, isSafeIri, assertSafeIri, assertSafeRdfTerm, DKG_GOSSIP_MAX_MESSAGE_BYTES, type Ed25519Keypair, buildAuthorAttestationTypedData, AUTHOR_SCHEME_VERSION_V1, TrustLevel, TRUST_LEVEL_PREDICATE, assertNoUserAuthoredTrustLevelQuads, buildTrustLevelQuads, isTrustLevelQuad } from '@origintrail-official/dkg-core'; import { GraphManager, PrivateContentStore } from '@origintrail-official/dkg-storage'; import type { Publisher, PublishOptions, PublishResult, KAManifestEntry, PhaseCallback, V10CoreNodeACK } from './publisher.js'; import { autoPartition } from './auto-partition.js'; @@ -2072,7 +2072,7 @@ export class DKGPublisher implements Publisher { // confirmed states for this publish so the `dkg:Publication` subject // emitted in metadata stays the same after on-chain confirmation. const publishOperationId = `${this.sessionId}-${tentativeSeq}`; - let ual = `did:dkg:${this.chain.chainId}/${publisherAddress}/t${publishOperationId}`; + let ual = kcUal(this.chain.chainId, publisherAddress, `t${publishOperationId}`); // Resolve the on-chain attribution target from the per-call override // (computed above) or fall back to the daemon's persistent identity. @@ -2459,7 +2459,10 @@ export class DKGPublisher implements Publisher { onChainResult.tokenAmount = tokenAmount; // V9 UAL: did:dkg:{chainId}/{publisherAddress}/{firstKAId} - ual = `did:dkg:${this.chain.chainId}/${onChainResult.publisherAddress}/${onChainResult.startKAId}`; + // String(...) preserves the historical template-literal behaviour + // around the typed-as-optional `startKAId` (in practice always + // defined post-publish; absent only on the update OnChainPublishResult). + ual = kcUal(this.chain.chainId, onChainResult.publisherAddress, String(onChainResult.startKAId)); for (const km of kaMetadata) { km.kcUal = ual; @@ -2739,9 +2742,12 @@ export class DKGPublisher implements Publisher { if (localOnlyUpdate) { this.log.warn(ctx, 'No chain configured — applying update locally and returning tentative result'); await storeUpdatedQuads(); + // String(...) preserves the historical template-literal behaviour + // around the typed-as-optional `publisherAddress` (the localOnlyUpdate + // branch at line ~2670 always populates it via localTentativePublisherAddress()). const result: PublishResult = { kcId, - ual: `did:dkg:${this.chain.chainId}/${publisherAddress}/${kcId}`, + ual: kcUal(this.chain.chainId, String(publisherAddress), kcId), merkleRoot: kcMerkleRoot, kaManifest: manifestEntries, status: 'tentative', @@ -2817,7 +2823,7 @@ export class DKGPublisher implements Publisher { if (!rejectedPublisherAddress) throw v10Err; earlyReturn = { kcId, - ual: `did:dkg:${this.chain.chainId}/${rejectedPublisherAddress}/${kcId}`, + ual: kcUal(this.chain.chainId, rejectedPublisherAddress, kcId), merkleRoot: kcMerkleRoot, kaManifest: manifestEntries, status: 'failed', @@ -2872,7 +2878,7 @@ export class DKGPublisher implements Publisher { onPhase?.('chain', 'end'); return { kcId, - ual: `did:dkg:${this.chain.chainId}/${failedPublisherAddress}/${kcId}`, + ual: kcUal(this.chain.chainId, failedPublisherAddress, kcId), merkleRoot: kcMerkleRoot, kaManifest: manifestEntries, status: 'failed', @@ -2904,7 +2910,7 @@ export class DKGPublisher implements Publisher { await storeUpdatedQuads(); const result: PublishResult = { kcId, - ual: `did:dkg:${this.chain.chainId}/${tentativePublisherAddress}/${kcId}`, + ual: kcUal(this.chain.chainId, tentativePublisherAddress, kcId), merkleRoot: kcMerkleRoot, kaManifest: manifestEntries, status: 'tentative', @@ -2918,7 +2924,7 @@ export class DKGPublisher implements Publisher { const result: PublishResult = { kcId, - ual: `did:dkg:${this.chain.chainId}/${effectivePublisherAddress}/${kcId}`, + ual: kcUal(this.chain.chainId, effectivePublisherAddress, kcId), merkleRoot: kcMerkleRoot, kaManifest: manifestEntries, status: 'confirmed', From 2c5df5d8c64cd2df34c36de21da766f10d0f4897 Mon Sep 17 00:00:00 2001 From: Branimir Rakic Date: Wed, 27 May 2026 00:54:08 +0200 Subject: [PATCH 03/12] feat(core,publisher): teach UAL parser the 4-segment storage-tagged form MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit OT-RFC-40 PR-2. Adds `parseUal()` and `publisherAddressFromUal()` to `@origintrail-official/dkg-core`, both accepting the legacy 3-segment default-storage form AND the 4-segment storage-tagged form V9 KAS already uses today via `uriBase: "did:dkg:v9"`. Surface in core: parseUal(ual): ParsedUal | null { chainId, storageTag, publisherAddress, startKAId: bigint | null } publisherAddressFromUal(ual): string | undefined DID_DKG_PREFIX, STORAGE_TAG_PATTERN Disambiguation choices, deliberate: 1. The publisher slot must match `^0x[0-9a-fA-F]{40}$`. Without that check, CG data URIs like `did:dkg:context-graph:agents/context/sub` would parse as 3-segment UALs (chainId=`context-graph:agents`, pub=`context`, localId=`sub`). The pattern keeps UAL parsing distinct from same-prefix CG URI parsing without bringing ethers into core. 2. `STORAGE_TAG_PATTERN = /^[a-z0-9-]+$/` (RFC §5.2 rule 1). Forbidding `:` prevents collision with the chainId segment (which uses CAIP-2 form `base:84532`). The tag whitelist is exported so deployment scripts that set a storage `uriBase` can validate against the same predicate the parser uses. 3. Non-numeric local-id slot returns `startKAId: null` rather than rejecting the whole UAL — the tentative-publish path uses `t` until the chain confirms (see PR-1's commit notes), and core has to round-trip those. 4. Publisher address case is preserved as-it-was-on-the-wire. EIP-55 normalisation lives in the publisher (which has ethers); core stays ethers-free. Consumers updated in the same PR: - `publish-handler.ts:verifyUALConsistency` — was raw `split('/')` + index-2 BigInt parse, only ever validated 3-segment UALs. Now uses `parseUal()`. Any store.nq subjects in the publish quads that match the 4-segment shape will now be range-checked too, closing a latent gap where V9-tagged subjects passed validation silently. - `dkg-publisher.ts:publisherAddressFromUal` — was a duplicated 3-segment-only parser that always read `segments[1]`, returning the storage tag (not the publisher) for V9-tagged UALs. Now delegates to `parseUal()` and only adds the chain-side EIP-55 normalisation via `coercePublisherAddress`. Pre-existing latent bug fixed as a free side effect. Tests: 14 new pinning tests in `core/test/constants.test.ts` covering both shapes, non-UAL CG URI rejection, malformed tags, empty segments, the tentative-publish form's null-startKAId, undef/null input, and case preservation. 1046 publisher tests + 54 core tests all pass. Refs: docs/RFC40_MULTI_STORAGE_KC_URI_SCHEME.md §7.2 Co-authored-by: Cursor --- packages/core/src/constants.ts | 122 +++++++++++++++++- packages/core/test/constants.test.ts | 147 ++++++++++++++++++++++ packages/publisher/src/dkg-publisher.ts | 19 ++- packages/publisher/src/publish-handler.ts | 37 +++--- 4 files changed, 302 insertions(+), 23 deletions(-) diff --git a/packages/core/src/constants.ts b/packages/core/src/constants.ts index 55d5f4490..af0ba6623 100644 --- a/packages/core/src/constants.ts +++ b/packages/core/src/constants.ts @@ -155,6 +155,14 @@ export function networkPeersTopic(): string { // ── V10 Knowledge Collection UALs ────────────────────────────────────── +/** + * Common URI prefix shared by every Knowledge Collection UAL produced + * by `kcUal()`. Exported so other parsers (e.g. `parseUal()` and the + * receiver-side range-consistency check in publish-handler) can match + * subjects without duplicating the literal across files. + */ +export const DID_DKG_PREFIX = 'did:dkg:'; + /** * Build a Knowledge Collection UAL. * @@ -189,7 +197,119 @@ export function kcUal( storageTag?: string, ): string { const tag = storageTag && storageTag.length > 0 ? `${storageTag}/` : ''; - return `did:dkg:${tag}${chainId}/${publisherAddress}/${localId}`; + return `${DID_DKG_PREFIX}${tag}${chainId}/${publisherAddress}/${localId}`; +} + +/** + * Storage-tag whitelist (RFC §5.2 rule 1). Lowercase ASCII letters, + * digits, and hyphens. Forbidding ":" prevents collision with the + * chainId segment which uses CAIP-2 form (e.g. `base:84532`). + * + * Exported so deployment scripts that set a storage `uriBase` can + * validate the tag with the exact same predicate the parser uses on + * the receive side. + */ +export const STORAGE_TAG_PATTERN = /^[a-z0-9-]+$/; + +/** + * Publisher-address shape used to disambiguate UALs from same-prefix + * URIs that share the 3-segment shape (notably CG data URIs of the + * form `did:dkg:context-graph://`). Every + * UAL minted by `kcUal()` puts an `ethers.getAddress`-checksummed + * EOA in slot 1 (3-segment) or slot 2 (4-segment); CG URIs put a + * sub-graph name there, which is never `0x`-prefixed hex. + */ +const PUBLISHER_ADDRESS_PATTERN = /^0x[0-9a-fA-F]{40}$/; + +export interface ParsedUal { + /** CAIP-2-style chain identifier, e.g. `base:84532`. */ + chainId: string; + /** + * Storage tag — empty string for the default V10 storage (3-segment + * UALs), otherwise a non-empty token like `v9` or `v11` matching + * `STORAGE_TAG_PATTERN`. Routes a UAL to the storage instance that + * minted it; see RFC §5.3. + */ + storageTag: string; + /** Publisher address as it appeared in the UAL (NOT EIP-55 normalised). */ + publisherAddress: string; + /** + * Third / fourth slot of the UAL parsed as a bigint — the chain-issued + * local ID (startKAId for publish UALs, kcId for the update branch's + * UAL). Returned as `null` when the slot is non-numeric (e.g. a + * tentative `t` placeholder), so callers that + * specifically want a confirmed-on-chain ID can branch on `null`. + */ + startKAId: bigint | null; +} + +/** + * Parse a Knowledge Collection UAL produced by `kcUal()`. + * + * Accepts both shapes the protocol mints today (RFC §5.2): + * + * 3-segment / default-storage: did:dkg:{chainId}/{pub}/{localId} + * 4-segment / storage-tagged: did:dkg:{chainId}/{tag}/{pub}/{localId} + * + * Returns `null` for any input that doesn't start with `did:dkg:` or + * whose segment count is not exactly 3 or 4 — callers (e.g. the + * publish-handler range check) treat `null` as "not a UAL we own, + * skip" rather than "validation error". Malformed inputs that happen + * to match the prefix-and-shape but carry a non-numeric local-id slot + * are returned with `startKAId: null` rather than rejected, because + * the tentative-publish path uses synthetic string IDs of the form + * `t` until the chain confirms. + * + * Validation of the storage tag against `STORAGE_TAG_PATTERN` IS + * enforced — a malformed tag (e.g. embedded `/` or `:`) is treated as + * "not a valid UAL" and returns `null`. + */ +export function parseUal(ual: string | undefined | null): ParsedUal | null { + if (typeof ual !== 'string') return null; + if (!ual.startsWith(DID_DKG_PREFIX)) return null; + + const segments = ual.slice(DID_DKG_PREFIX.length).split('/'); + let chainId: string; + let storageTag: string; + let publisherAddress: string; + let localIdSegment: string; + + if (segments.length === 3) { + [chainId, publisherAddress, localIdSegment] = segments; + storageTag = ''; + } else if (segments.length === 4) { + [storageTag, chainId, publisherAddress, localIdSegment] = segments; + if (!STORAGE_TAG_PATTERN.test(storageTag)) return null; + } else { + return null; + } + + if (chainId.length === 0) return null; + if (localIdSegment.length === 0) return null; + if (!PUBLISHER_ADDRESS_PATTERN.test(publisherAddress)) return null; + + let startKAId: bigint | null; + try { + startKAId = BigInt(localIdSegment); + } catch { + startKAId = null; + } + + return { chainId, storageTag, publisherAddress, startKAId }; +} + +/** + * Extract the publisher-address segment from a UAL, without any + * EIP-55 / checksum normalisation (callers that need normalised form + * pipe through their own ethers `getAddress()`). + * + * Returns `undefined` for any input `parseUal()` rejects. This is the + * core-side replacement for the (now-removed) `publisherAddressFromUal` + * helper inside `dkg-publisher.ts`, which only handled 3-segment UALs. + */ +export function publisherAddressFromUal(ual: string | undefined | null): string | undefined { + const parsed = parseUal(ual); + return parsed === null ? undefined : parsed.publisherAddress; } // ── V10 Named Graph URIs ─────────────────────────────────────────────── diff --git a/packages/core/test/constants.test.ts b/packages/core/test/constants.test.ts index d154f7342..8ce7716a8 100644 --- a/packages/core/test/constants.test.ts +++ b/packages/core/test/constants.test.ts @@ -8,6 +8,8 @@ import { contextGraphPublishTopic, contextGraphWorkspaceTopic, kcUal, + parseUal, + publisherAddressFromUal, validateContextGraphId, validateSubGraphName, validateAssertionName, @@ -130,6 +132,151 @@ describe('kcUal (OT-RFC-40 §5.2)', () => { }); }); +describe('parseUal (OT-RFC-40 §5.2 — handles both 3- and 4-segment forms)', () => { + // The parser must accept exactly the two shapes `kcUal()` produces + // and reject everything else. These tests pin both directions. + + it('parses the legacy 3-segment / default-storage form', () => { + const parsed = parseUal( + 'did:dkg:base:84532/0xA1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1/12345', + ); + expect(parsed).toEqual({ + chainId: 'base:84532', + storageTag: '', + publisherAddress: '0xA1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1', + startKAId: 12345n, + }); + }); + + it('parses the 4-segment / storage-tagged form (V9 KAS today)', () => { + const parsed = parseUal( + 'did:dkg:v9/base:84532/0xA1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1/17', + ); + expect(parsed).toEqual({ + chainId: 'base:84532', + storageTag: 'v9', + publisherAddress: '0xA1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1', + startKAId: 17n, + }); + }); + + it('round-trips with kcUal for both forms', () => { + const cid = 'base:84532'; + const pub = '0xA1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1'; + const id = 99n; + expect(parseUal(kcUal(cid, pub, id))).toEqual({ + chainId: cid, + storageTag: '', + publisherAddress: pub, + startKAId: id, + }); + expect(parseUal(kcUal(cid, pub, id, 'v9'))).toEqual({ + chainId: cid, + storageTag: 'v9', + publisherAddress: pub, + startKAId: id, + }); + }); + + it('returns null for inputs that do not start with did:dkg:', () => { + expect(parseUal('http://example.org/1')).toBeNull(); + expect(parseUal('urn:dkg:agent:foo')).toBeNull(); + expect(parseUal('')).toBeNull(); + }); + + it('returns null for input segment counts other than 3 or 4', () => { + // 1 or 2 segments — covers CG/data URIs that share the prefix. + expect(parseUal('did:dkg:context-graph:agents')).toBeNull(); + expect(parseUal('did:dkg:foo/bar')).toBeNull(); + // 5+ segments — extension paths under a UAL. + expect(parseUal('did:dkg:base:84532/0xPub/123/sub/extra')).toBeNull(); + }); + + it('returns null when any segment is empty', () => { + expect(parseUal('did:dkg://0xPub/123')).toBeNull(); + expect(parseUal('did:dkg:base:84532//123')).toBeNull(); + expect(parseUal('did:dkg:base:84532/0xPub/')).toBeNull(); + expect(parseUal('did:dkg:v9//base:84532/0xPub/123')).toBeNull(); + }); + + it('rejects malformed storage tags (uppercase, colon, special chars)', () => { + // STORAGE_TAG_PATTERN = /^[a-z0-9-]+$/ + expect(parseUal('did:dkg:V9/base:84532/0xPub/123')).toBeNull(); + expect(parseUal('did:dkg:v9.1/base:84532/0xPub/123')).toBeNull(); + expect(parseUal('did:dkg:tag with space/base:84532/0xPub/123')).toBeNull(); + // Even legitimate-looking V11+ candidates with colons are rejected + // because chainIds use ':' and parsing would be ambiguous. + expect(parseUal('did:dkg:base:special/base:84532/0xPub/123')).toBeNull(); + }); + + it('returns startKAId: null for non-numeric local-id slot (tentative publish form)', () => { + // dkg-publisher.ts's tentative path uses `t${publishOperationId}` + // — the parser must not reject these since they are valid in-flight + // UALs that get rewritten to a chain-issued KAID on confirmation. + const parsed = parseUal( + 'did:dkg:base:84532/0xA1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1/tabcd-1', + ); + expect(parsed).toEqual({ + chainId: 'base:84532', + storageTag: '', + publisherAddress: '0xA1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1', + startKAId: null, + }); + }); + + it('handles undefined / null without throwing', () => { + expect(parseUal(undefined)).toBeNull(); + expect(parseUal(null)).toBeNull(); + }); + + it('preserves publisher address case (no EIP-55 normalisation in core)', () => { + // Caller decides whether to checksum-normalise; core stays + // ethers-free and returns the segment as-it-was. + const parsed = parseUal( + 'did:dkg:base:84532/0xd46E77003d74df9aAdF011A5115A72405b084a88/1', + ); + expect(parsed?.publisherAddress).toBe('0xd46E77003d74df9aAdF011A5115A72405b084a88'); + }); + + it('accepts arbitrary non-UAL CG data URIs without confusion', () => { + // contextGraphDataUri('agents', 'sub') = "did:dkg:context-graph:agents/context/sub" + // → 2 segments after prefix → null. parseUal must not mistake CG + // URIs for UALs. + expect(parseUal('did:dkg:context-graph:agents/context/sub')).toBeNull(); + expect(parseUal('did:dkg:context-graph:agents')).toBeNull(); + }); +}); + +describe('publisherAddressFromUal', () => { + it('returns the publisher segment for a 3-segment UAL', () => { + expect( + publisherAddressFromUal( + 'did:dkg:base:84532/0xA1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1/12345', + ), + ).toBe('0xA1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1'); + }); + + it('returns the publisher segment for a 4-segment / V9-tagged UAL', () => { + // The duplicated helper inside dkg-publisher.ts only handled the + // 3-segment form (it took segments[1]). Moving + generalising it + // to core fixes a latent bug where V9 UALs would have returned + // the storage tag instead of the publisher address. + expect( + publisherAddressFromUal( + 'did:dkg:v9/base:84532/0xA1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1/17', + ), + ).toBe('0xA1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1'); + }); + + it('returns undefined for malformed input', () => { + expect(publisherAddressFromUal(undefined)).toBeUndefined(); + expect(publisherAddressFromUal(null)).toBeUndefined(); + expect(publisherAddressFromUal('')).toBeUndefined(); + expect(publisherAddressFromUal('not-a-ual')).toBeUndefined(); + expect(publisherAddressFromUal('did:dkg:context-graph:agents')).toBeUndefined(); + }); +}); + describe('createOperationContext', () => { it('generates a unique operationId', () => { const ctx = createOperationContext('publish'); diff --git a/packages/publisher/src/dkg-publisher.ts b/packages/publisher/src/dkg-publisher.ts index 135a26e15..6c1d3477b 100644 --- a/packages/publisher/src/dkg-publisher.ts +++ b/packages/publisher/src/dkg-publisher.ts @@ -2,7 +2,7 @@ import type { Quad, TripleStore } from '@origintrail-official/dkg-storage'; import type { ChainAdapter, OnChainPublishResult, AddBatchToContextGraphParams } from '@origintrail-official/dkg-chain'; import { enrichEvmError } from '@origintrail-official/dkg-chain'; import type { EventBus, OperationContext } from '@origintrail-official/dkg-core'; -import { DKGEvent, Logger, createOperationContext, sha256, encodeWorkspacePublishRequest, encodeEncryptedWorkspacePayload, encryptWorkspacePayload, contextGraphDataUri, contextGraphMetaUri, contextGraphAssertionUri, assertionLifecycleUri, contextGraphSubGraphUri, contextGraphSubGraphMetaUri, kcUal, validateSubGraphName, isSafeIri, assertSafeIri, assertSafeRdfTerm, DKG_GOSSIP_MAX_MESSAGE_BYTES, type Ed25519Keypair, buildAuthorAttestationTypedData, AUTHOR_SCHEME_VERSION_V1, TrustLevel, TRUST_LEVEL_PREDICATE, assertNoUserAuthoredTrustLevelQuads, buildTrustLevelQuads, isTrustLevelQuad } from '@origintrail-official/dkg-core'; +import { DKGEvent, Logger, createOperationContext, sha256, encodeWorkspacePublishRequest, encodeEncryptedWorkspacePayload, encryptWorkspacePayload, contextGraphDataUri, contextGraphMetaUri, contextGraphAssertionUri, assertionLifecycleUri, contextGraphSubGraphUri, contextGraphSubGraphMetaUri, kcUal, parseUal, validateSubGraphName, isSafeIri, assertSafeIri, assertSafeRdfTerm, DKG_GOSSIP_MAX_MESSAGE_BYTES, type Ed25519Keypair, buildAuthorAttestationTypedData, AUTHOR_SCHEME_VERSION_V1, TrustLevel, TRUST_LEVEL_PREDICATE, assertNoUserAuthoredTrustLevelQuads, buildTrustLevelQuads, isTrustLevelQuad } from '@origintrail-official/dkg-core'; import { GraphManager, PrivateContentStore } from '@origintrail-official/dkg-storage'; import type { Publisher, PublishOptions, PublishResult, KAManifestEntry, PhaseCallback, V10CoreNodeACK } from './publisher.js'; import { autoPartition } from './auto-partition.js'; @@ -120,11 +120,20 @@ function coercePublisherAddress(value: unknown): string | undefined { return normalized === ethers.ZeroAddress ? undefined : normalized; } +/** + * Local helper combining core's UAL parser with the chain-side + * EIP-55 / zero-address normalisation. Core's `parseUal()` is + * ethers-free and returns the publisher segment as it appeared in + * the UAL; `coercePublisherAddress()` checksums and rejects the zero + * address. Result: a normalised EOA, or `undefined` for malformed / + * non-UAL input. OT-RFC-40 PR-2: now correctly handles both 3- and + * 4-segment UAL shapes (the previous implementation always read + * slot 1, returning the storage tag for V9-tagged UALs). + */ function publisherAddressFromUal(ual: string | undefined): string | undefined { - const prefix = 'did:dkg:'; - if (!ual?.startsWith(prefix)) return undefined; - const segments = ual.slice(prefix.length).split('/'); - return coercePublisherAddress(segments[1]); + const parsed = parseUal(ual); + if (parsed === null) return undefined; + return coercePublisherAddress(parsed.publisherAddress); } function formatBytesAsKb(bytes: number): string { diff --git a/packages/publisher/src/publish-handler.ts b/packages/publisher/src/publish-handler.ts index 2038d87e0..b107809ee 100644 --- a/packages/publisher/src/publish-handler.ts +++ b/packages/publisher/src/publish-handler.ts @@ -9,6 +9,7 @@ import { createOperationContext, assertSafeIri, assertNoUserAuthoredTrustLevelQuads, + parseUal, type PublishRequestMsg, } from '@origintrail-official/dkg-core'; import type { ChainAdapter } from '@origintrail-official/dkg-chain'; @@ -571,12 +572,17 @@ export class PublishHandler { // ── Helpers ── -const DID_DKG_PREFIX = 'did:dkg:'; - /** * Verify that any triple subjects using DKG UALs reference KA IDs * within the publisher's claimed range. - * UAL format: did:dkg:{chainId}/{publisherAddress}/{localKAId}[/...] + * + * Accepts both the legacy 3-segment / default-storage form + * (`did:dkg:{chainId}/{publisherAddress}/{localKAId}`) and the + * 4-segment / storage-tagged form + * (`did:dkg:{tag}/{chainId}/{publisherAddress}/{localKAId}`) — see + * OT-RFC-40. Subjects whose shape isn't a UAL (e.g. CG data URIs that + * share the `did:dkg:` prefix) are skipped silently rather than + * flagged, matching the pre-RFC-40 behaviour. */ function verifyUALConsistency( quads: Quad[], @@ -586,21 +592,18 @@ function verifyUALConsistency( const errors: string[] = []; for (const q of quads) { - if (!q.subject.startsWith(DID_DKG_PREFIX)) continue; - - const segments = q.subject.slice(DID_DKG_PREFIX.length).split('/'); - if (segments.length < 3) continue; - - let localKAId: bigint; - try { - localKAId = BigInt(segments[2]); - } catch { - continue; - } - - if (localKAId < startKAId || localKAId > endKAId) { + const parsed = parseUal(q.subject); + if (parsed === null) continue; + // Tentative-publish placeholders (`t`) carry a + // non-numeric local-id slot that surfaces as `startKAId: null`. + // They never reach the receive-side range check (the receiver only + // sees confirmed UALs), but a defensive skip keeps the contract + // crisp: this function flags out-of-range numeric IDs only. + if (parsed.startKAId === null) continue; + + if (parsed.startKAId < startKAId || parsed.startKAId > endKAId) { errors.push( - `UAL consistency: subject "${q.subject}" references KA ID ${localKAId} ` + + `UAL consistency: subject "${q.subject}" references KA ID ${parsed.startKAId} ` + `outside claimed range ${startKAId}..${endKAId}`, ); } From 5240c7fa767f2d0f71d8ccfc4575eb9ce08aab43 Mon Sep 17 00:00:00 2001 From: Branimir Rakic Date: Wed, 27 May 2026 00:58:50 +0200 Subject: [PATCH 04/12] feat(chain): add KCStorageRegistry for plural KC-storage discovery MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit OT-RFC-40 PR-3. Introduces a stand-alone, ethers-free registry that discovers every KC-class storage on a Hub and indexes them by their on-chain `uriBase` storage tag. PR-3 ships only the data structure and unit tests; PR-4 wires it into mint sites and PR-5 wires it into resolution sites. The class accepts injected readers for `Hub.getAllAssetStorages()` and storage `uri(0)` (mirrors `ERC1155Delta.uri(uint256)` in the V10 contracts), so it is exercisable end-to-end without a real chain or the full evm-adapter. The EVM adapter wiring will provide ethers-backed implementations of those two readers — kept out of PR-3 to keep the diff small and the registry independently testable. Filtering rule (matches RFC §5.3): treats only Hub entries whose name starts with `KnowledgeCollectionStorage` or `KnowledgeAssetsStorage` as KC-class. V9 KAS is included because in the RFC-40 ontology it IS a KC-class storage — its UALs share the `did:dkg:` scheme and are routed by the same tag mechanism. ContextGraphStorage is deliberately excluded; CG versioning is the subject of a sibling RFC. Tag derivation pinned in `deriveStorageTag`: uriBase === "did:dkg" → tag = "" (default storage) uriBase === "did:dkg:v9" → tag = "v9" uriBase === "did:dkg:foo-bar" → tag = "foo-bar" anything else → null (logged + skipped) The character set for non-empty tags is the same `STORAGE_TAG_PATTERN` that `parseUal` enforces in PR-2 (`/^[a-z0-9-]+$/`), so a storage that ships a `uriBase` the registry rejects is also one whose UALs the parser would reject — symmetry by construction. Failure-mode contract (tests pin all of these): - Hub RPC call throws → log warn, leave cache empty, do not throw. - One storage's uri(0) throws → log warn, skip that storage, continue. - One storage's uriBase is malformed → log warn, skip, continue. - Two storages claim the same tag → log warn, last-write-wins (matches Hub's AssetStorageChanged overwrite semantics). - No default storage detected → log warn (every legacy 3-segment UAL would otherwise silently fail to resolve via this registry). - refresh() called twice → second call replaces the cache atomically. Tests: 17 cases in `kc-storage-registry.test.ts`, including the production V9-KAS + V10-KCS pairing on Base Sepolia, a hypothetical V11 deploy, and every failure path above. Refs: docs/RFC40_MULTI_STORAGE_KC_URI_SCHEME.md §7.3 Co-authored-by: Cursor --- packages/chain/src/index.ts | 9 + packages/chain/src/kc-storage-registry.ts | 300 ++++++++++++++++++ .../chain/test/kc-storage-registry.test.ts | 259 +++++++++++++++ 3 files changed, 568 insertions(+) create mode 100644 packages/chain/src/kc-storage-registry.ts create mode 100644 packages/chain/test/kc-storage-registry.test.ts diff --git a/packages/chain/src/index.ts b/packages/chain/src/index.ts index 07e2d565c..9afb93eb1 100644 --- a/packages/chain/src/index.ts +++ b/packages/chain/src/index.ts @@ -6,4 +6,13 @@ export { HubResolutionCache, type HubResolutionCacheOptions, } from './hub-resolution-cache.js'; +export { + KCStorageRegistry, + deriveStorageTag, + type KCStorageEntry, + type KCStorageHubReader, + type KCStorageUriReader, + type KCStorageRegistryLogger, + type KCStorageRegistryOptions, +} from './kc-storage-registry.js'; export { PcaUnavailableError, isPcaUnavailableError } from './pca-errors.js'; diff --git a/packages/chain/src/kc-storage-registry.ts b/packages/chain/src/kc-storage-registry.ts new file mode 100644 index 000000000..e2a1ef98e --- /dev/null +++ b/packages/chain/src/kc-storage-registry.ts @@ -0,0 +1,300 @@ +/** + * Registry of all Knowledge Collection storage instances on a Hub, + * indexed by their on-chain `uriBase` storage tag. + * + * Background — OT-RFC-40 §5.3: + * + * The DKG ontology has always permitted plural KC storage instances on + * one Hub: V9 `KnowledgeAssetsStorage` (`uriBase: "did:dkg:v9"`) ships + * alongside V10 `KnowledgeCollectionStorage` (`uriBase: "did:dkg"`) + * on Base Sepolia today, both registered via `Hub.assetStorageSet`. + * Until OT-RFC-40, this primitive was used but not surfaced — every + * agent path resolved a single hardcoded name (`"KnowledgeCollectionStorage"`), + * so a future V11 deploy under a fresh name would have been invisible + * even though the chain itself supports it. + * + * This registry surfaces the primitive: given a Hub it discovers every + * KC-class storage, reads each one's `uri(0)` (the ERC-1155 metadata + * URI base baked into the storage at construction time), parses the + * suffix into a UAL storage tag, and caches both directions so the + * publisher can pick the right instance when minting and the resolver + * can pick the right instance when handed an opaque UAL. + * + * Tag derivation (RFC §5.2 rule 2 + 3): + * + * uriBase === "did:dkg" → tag = "" (the default storage) + * uriBase === "did:dkg:v9" → tag = "v9" + * uriBase === "did:dkg:v11" → tag = "v11" + * uriBase === "did:dkg:foo-bar" → tag = "foo-bar" + * + * Anything that doesn't start with `did:dkg` or that yields a tag + * which doesn't match `STORAGE_TAG_PATTERN` is logged as a warning + * and skipped. Worst case is a misconfigured storage instance is + * invisible to the agent until its deployment is fixed; we never + * crash the daemon over a registry parse failure. + * + * The registry is intentionally a plain class with injected readers + * rather than an evm-adapter-coupled component so it is unit-testable + * without spinning up an actual chain. PR-3 lands the data structure; + * PR-4 wires it into `dkg-publisher.ts` mint sites; PR-5 wires it into + * the resolution sites. + */ + +import { DID_DKG_PREFIX, STORAGE_TAG_PATTERN } from '@origintrail-official/dkg-core'; + +/** + * Hub registry name prefixes that this registry treats as KC-class + * storage. V9 KAS is included because it IS a KC-class storage in the + * V10+ ontology — its UALs share the `did:dkg:` scheme and are routed + * by the same tag mechanism. + * + * Other asset storages (e.g. `ContextGraphStorage`) are deliberately + * excluded: CG-storage versioning is the subject of a sibling RFC. + */ +const KC_STORAGE_NAME_PREFIXES: readonly string[] = [ + 'KnowledgeCollectionStorage', + 'KnowledgeAssetsStorage', +]; + +export interface KCStorageEntry { + /** + * Hub registry name as returned by `Hub.getAllAssetStorages()`, + * e.g. `"KnowledgeCollectionStorage"` or `"KnowledgeAssetsStorage"`. + * Distinct from `tag`: the same Hub name can in principle outlive + * its tag (e.g. on a chain reset) but the tag is what UALs carry. + */ + hubName: string; + /** Storage contract address. */ + address: string; + /** + * Full `uri(0)` return value, e.g. `"did:dkg"` or `"did:dkg:v9"`. + * Useful for round-tripping in logs / diagnostics; the RFC-40 + * routing key is `tag`, not `uriBase`. + */ + uriBase: string; + /** + * Storage tag — empty string for the default storage (RFC §5.2 + * rule 3: there MUST be exactly one default), otherwise a value + * matching `STORAGE_TAG_PATTERN`. + */ + tag: string; +} + +/** + * Minimal Hub surface this registry needs. Mirrors + * `Hub.getAllAssetStorages()` (declared in `Hub.sol:109-111`). Kept + * as an interface rather than depending on an `ethers.Contract` + * directly so the registry can be exercised by unit tests against an + * in-memory fake. + */ +export interface KCStorageHubReader { + getAllAssetStorages(): Promise>; +} + +/** + * Reads `uri(0)` (the ERC-1155 metadata URI) from a single KC storage + * contract. Mirrors `ERC1155Delta.uri(uint256)` (declared in + * `tokens/ERC1155Delta.sol:80-82` — the implementation returns the + * `_uri` set in the `KnowledgeCollectionStorage` constructor; the + * `tokenId` argument is ignored). Kept narrow and injectable so the + * registry doesn't drag the full storage ABI into its surface. + */ +export interface KCStorageUriReader { + /** Returns the storage's full `uriBase`, e.g. `"did:dkg"` or `"did:dkg:v9"`. */ + readUriBase(storageAddress: string): Promise; +} + +export interface KCStorageRegistryLogger { + warn(message: string): void; +} + +export interface KCStorageRegistryOptions { + /** Optional logger; defaults to a no-op so unit tests don't need to stub it. */ + log?: KCStorageRegistryLogger; +} + +/** + * Caches the (tag, address, hubName, uriBase) tuple for every + * KC-class storage on a single Hub. + * + * Lifecycle: + * - Construct with the two readers + options. + * - Call `refresh()` once at adapter init. + * - Call `refresh()` again on `Hub.NewAssetStorage` / + * `Hub.AssetStorageChanged` events (PR-3 ships the data + * structure; the evm-adapter wiring pulls in the listener + * plumbing in a future PR). + * + * Lookups are O(1) and side-effect-free. `refresh()` is the only + * mutating operation; it is safe to call concurrently — concurrent + * refreshes converge on the same final state, the worst case is + * extra work. + */ +export class KCStorageRegistry { + private byTag = new Map(); + private byAddress = new Map(); + private defaultEntry: KCStorageEntry | undefined; + + constructor( + private readonly hubReader: KCStorageHubReader, + private readonly uriReader: KCStorageUriReader, + private readonly options: KCStorageRegistryOptions = {}, + ) {} + + private warn(message: string): void { + this.options.log?.warn(`[KCStorageRegistry] ${message}`); + } + + /** + * Re-read every KC-class storage from the Hub and rebuild the cache. + * Replaces the previous cache atomically so concurrent lookups + * never see a partially-rebuilt state. + */ + async refresh(): Promise { + let storages: Array<{ name: string; addr: string }>; + try { + storages = await this.hubReader.getAllAssetStorages(); + } catch (err) { + this.warn(`Hub.getAllAssetStorages() failed: ${err instanceof Error ? err.message : String(err)}`); + return; + } + + const nextByTag = new Map(); + const nextByAddress = new Map(); + let nextDefault: KCStorageEntry | undefined; + + for (const { name, addr } of storages) { + if (!KC_STORAGE_NAME_PREFIXES.some(prefix => name.startsWith(prefix))) continue; + + let uriBase: string; + try { + uriBase = await this.uriReader.readUriBase(addr); + } catch (err) { + this.warn( + `${name} at ${addr}: uri(0) read failed (${err instanceof Error ? err.message : String(err)}); skipping`, + ); + continue; + } + + const tag = deriveStorageTag(uriBase); + if (tag === null) { + this.warn(`${name} at ${addr}: malformed uriBase "${uriBase}" (expected "did:dkg" or "did:dkg:"); skipping`); + continue; + } + + const entry: KCStorageEntry = { hubName: name, address: addr, uriBase, tag }; + + const existingByTag = nextByTag.get(tag); + if (existingByTag) { + // Two storages claiming the same tag is a deployment-time + // misconfiguration (RFC §5.2 rule 3 implies tags are unique). + // We tolerate this rather than throw so the daemon doesn't + // crash, but we surface it loudly. The newer registration wins + // because that matches the Hub's update semantics + // (`AssetStorageChanged` overwrites). + this.warn( + `tag "${tag || '(default)'}" claimed by both ${existingByTag.hubName} (${existingByTag.address}) and ${name} (${addr}); using the latter`, + ); + } + + nextByTag.set(tag, entry); + nextByAddress.set(addr, entry); + if (tag === '') nextDefault = entry; + } + + if (!nextDefault) { + this.warn( + 'no default KC storage (uriBase === "did:dkg") detected on this Hub; UALs in the legacy 3-segment form will not resolve via this registry', + ); + } + + this.byTag = nextByTag; + this.byAddress = nextByAddress; + this.defaultEntry = nextDefault; + } + + /** + * Look up a storage by its UAL tag. Empty string returns the + * default storage (the one whose `uriBase` is exactly `did:dkg`). + */ + getByTag(tag: string): KCStorageEntry | undefined { + if (tag === '') return this.defaultEntry; + return this.byTag.get(tag); + } + + /** + * Look up a storage by its on-chain address. Returns `undefined` + * for any address not seen by the most recent `refresh()`. + * + * Address lookup is intentionally case-sensitive: callers should + * either pass the address in the same case the Hub returned (which + * the EVM adapter already does) or normalise via ethers.getAddress + * upstream. Avoiding internal normalisation keeps this class + * ethers-free and unit-testable. + */ + getByAddress(address: string): KCStorageEntry | undefined { + return this.byAddress.get(address); + } + + /** + * Convenience accessor for the default storage's entry. Returns + * `undefined` if the most recent `refresh()` did not detect any + * storage with `uriBase === "did:dkg"`. + */ + getDefault(): KCStorageEntry | undefined { + return this.defaultEntry; + } + + /** + * The default storage's address. Useful for sites that want to + * preserve "always mint into the default" behaviour while still + * routing the UAL tag correctly. + */ + getDefaultAddress(): string | undefined { + return this.defaultEntry?.address; + } + + /** + * Tag for a known address; mirrors `getByAddress` but returns just + * the tag for sites (e.g. mint UAL construction) that only need + * the routing key. + */ + tagFor(address: string): string | undefined { + return this.byAddress.get(address)?.tag; + } + + /** Snapshot of every entry, for diagnostic/CLI use. */ + getAll(): KCStorageEntry[] { + return Array.from(this.byAddress.values()); + } +} + +/** + * Parse a storage's `uriBase` into the RFC-40 storage tag. Returns + * `null` if the input is malformed (doesn't start with `did:dkg`, or + * yields a tag that doesn't match `STORAGE_TAG_PATTERN`). + * + * Exported for unit tests and for deployment-time validation; in + * normal operation the registry calls this internally. + */ +export function deriveStorageTag(uriBase: string): string | null { + // Default storage: "did:dkg" exactly. No trailing colon, no tag. + if (uriBase === 'did:dkg') return ''; + + // Tagged storage: must start with "did:dkg:". + if (!uriBase.startsWith(DID_DKG_PREFIX)) return null; + const tag = uriBase.slice(DID_DKG_PREFIX.length); + + // Empty tag after the prefix is malformed (would mean uriBase ends + // with a bare colon). + if (tag.length === 0) return null; + + // Tag character set is intentionally narrow — see `parseUal`'s + // disambiguation rules. A storage whose `uriBase` is rejected here + // is undeployable under the RFC-40 scheme; the deployment script is + // expected to validate against `STORAGE_TAG_PATTERN` before + // submitting the construction tx. + if (!STORAGE_TAG_PATTERN.test(tag)) return null; + + return tag; +} diff --git a/packages/chain/test/kc-storage-registry.test.ts b/packages/chain/test/kc-storage-registry.test.ts new file mode 100644 index 000000000..e7cb9661d --- /dev/null +++ b/packages/chain/test/kc-storage-registry.test.ts @@ -0,0 +1,259 @@ +/** + * Unit tests for KCStorageRegistry (OT-RFC-40 PR-3). + * + * The class itself is ethers-free, so these tests use plain in-memory + * fakes for the `KCStorageHubReader` + `KCStorageUriReader` injection + * points rather than spinning up a real Hub via the EVM adapter. + */ +import { describe, it, expect } from 'vitest'; +import { + KCStorageRegistry, + deriveStorageTag, + type KCStorageEntry, + type KCStorageHubReader, + type KCStorageUriReader, +} from '../src/kc-storage-registry.js'; + +interface FakeStorage { + hubName: string; + address: string; + uriBase: string | (() => Promise); +} + +function buildRegistry(storages: FakeStorage[], opts: { warns?: string[] } = {}) { + const hubReader: KCStorageHubReader = { + async getAllAssetStorages() { + return storages.map(s => ({ name: s.hubName, addr: s.address })); + }, + }; + const uriReader: KCStorageUriReader = { + async readUriBase(addr: string) { + const match = storages.find(s => s.address === addr); + if (!match) throw new Error(`unknown storage address ${addr}`); + if (typeof match.uriBase === 'function') return match.uriBase(); + return match.uriBase; + }, + }; + const warns = opts.warns ?? []; + const log = { warn: (m: string) => warns.push(m) }; + const registry = new KCStorageRegistry(hubReader, uriReader, { log }); + return { registry, warns }; +} + +describe('deriveStorageTag', () => { + // The tag mapping is the precise contract for what a deployment + // script may use for a storage's `uriBase`. These tests pin the + // exact accept/reject set RFC §5.2 specifies. + + it('maps did:dkg → empty (default storage marker)', () => { + expect(deriveStorageTag('did:dkg')).toBe(''); + }); + + it('maps did:dkg:v9 → v9 (the V9 KAS pattern in production today)', () => { + expect(deriveStorageTag('did:dkg:v9')).toBe('v9'); + }); + + it('maps did:dkg:foo-bar → foo-bar (hyphenated tags allowed)', () => { + expect(deriveStorageTag('did:dkg:foo-bar')).toBe('foo-bar'); + }); + + it('rejects anything that does not start with did:dkg', () => { + expect(deriveStorageTag('http://example.org')).toBeNull(); + expect(deriveStorageTag('did:web:example.org')).toBeNull(); + expect(deriveStorageTag('')).toBeNull(); + }); + + it('rejects bare "did:dkg:" (empty tag after the prefix)', () => { + expect(deriveStorageTag('did:dkg:')).toBeNull(); + }); + + it('rejects tags with characters outside [a-z0-9-]', () => { + expect(deriveStorageTag('did:dkg:V9')).toBeNull(); + expect(deriveStorageTag('did:dkg:v9.1')).toBeNull(); + expect(deriveStorageTag('did:dkg:tag with space')).toBeNull(); + expect(deriveStorageTag('did:dkg:tag/with/slash')).toBeNull(); + // Colons are forbidden because they collide with the chainId + // segment in UAL parsing — see parseUal's disambiguation rules. + expect(deriveStorageTag('did:dkg:base:84532')).toBeNull(); + }); +}); + +describe('KCStorageRegistry.refresh', () => { + const V10_KCS = { + hubName: 'KnowledgeCollectionStorage', + address: '0x4fCA405d46ADeDD7050420C1937842D2a36a04D8', + uriBase: 'did:dkg', + }; + const V9_KAS = { + hubName: 'KnowledgeAssetsStorage', + address: '0x45E0e14c695681c8c93d6A489a314ea1EC28ba59', + uriBase: 'did:dkg:v9', + }; + + it('discovers two registered storages (V9 KAS + V10 KCS, the production case)', async () => { + // Mirror of `packages/evm-module/deployments/base_sepolia_v10_contracts.json`. + const { registry } = buildRegistry([V10_KCS, V9_KAS]); + await registry.refresh(); + + expect(registry.getDefaultAddress()).toBe(V10_KCS.address); + expect(registry.getByTag('')!.address).toBe(V10_KCS.address); + expect(registry.getByTag('v9')!.address).toBe(V9_KAS.address); + expect(registry.tagFor(V10_KCS.address)).toBe(''); + expect(registry.tagFor(V9_KAS.address)).toBe('v9'); + }); + + it('preserves uriBase + hubName on each entry for diagnostics', async () => { + const { registry } = buildRegistry([V10_KCS, V9_KAS]); + await registry.refresh(); + const v9 = registry.getByTag('v9') as KCStorageEntry; + expect(v9).toEqual({ + hubName: 'KnowledgeAssetsStorage', + address: V9_KAS.address, + uriBase: 'did:dkg:v9', + tag: 'v9', + }); + }); + + it('ignores Hub entries that are not KC-class storages', async () => { + // Hub.getAllAssetStorages() returns every asset storage, not just + // KC ones. The registry must skip ContextGraphStorage etc. + const cgs = { + hubName: 'ContextGraphStorage', + address: '0x1111111111111111111111111111111111111111', + uriBase: 'did:dkg', + }; + const { registry } = buildRegistry([V10_KCS, cgs, V9_KAS]); + await registry.refresh(); + expect(registry.getAll().map(e => e.hubName).sort()).toEqual([ + 'KnowledgeAssetsStorage', + 'KnowledgeCollectionStorage', + ]); + }); + + it('skips storages with malformed uriBase and warns once', async () => { + const malformed = { + hubName: 'KnowledgeCollectionStorageFuture', + address: '0x9999999999999999999999999999999999999999', + uriBase: 'http://nope.example.org', + }; + const { registry, warns } = buildRegistry([V10_KCS, malformed]); + await registry.refresh(); + + expect(registry.getByAddress(malformed.address)).toBeUndefined(); + expect(registry.getDefaultAddress()).toBe(V10_KCS.address); + expect(warns).toHaveLength(1); + expect(warns[0]).toContain('malformed uriBase'); + expect(warns[0]).toContain('http://nope.example.org'); + }); + + it('skips storages whose uri(0) read throws and continues with the rest', async () => { + const broken = { + hubName: 'KnowledgeCollectionStorageV99', + address: '0x8888888888888888888888888888888888888888', + uriBase: async () => { throw new Error('rpc unavailable'); }, + }; + const { registry, warns } = buildRegistry([V10_KCS, broken, V9_KAS]); + await registry.refresh(); + + expect(registry.getByAddress(broken.address)).toBeUndefined(); + expect(registry.getDefaultAddress()).toBe(V10_KCS.address); + expect(registry.getByTag('v9')?.address).toBe(V9_KAS.address); + expect(warns.some(w => w.includes('uri(0) read failed'))).toBe(true); + }); + + it('warns when no default storage is detected (every UAL would fail to resolve)', async () => { + const { registry, warns } = buildRegistry([V9_KAS]); + await registry.refresh(); + + expect(registry.getDefaultAddress()).toBeUndefined(); + expect(warns.some(w => w.includes('no default KC storage'))).toBe(true); + }); + + it('handles two storages claiming the same tag by warning + using the latter', async () => { + // This is a deployment misconfiguration; the registry must not crash. + // Hub semantics: AssetStorageChanged overwrites, so "newer wins" is + // the correct convergence rule. + const dup1 = { hubName: 'KnowledgeCollectionStorage', address: '0xAAAa', uriBase: 'did:dkg' }; + const dup2 = { hubName: 'KnowledgeCollectionStorageV2', address: '0xBBBb', uriBase: 'did:dkg' }; + const { registry, warns } = buildRegistry([dup1, dup2]); + await registry.refresh(); + + expect(registry.getDefaultAddress()).toBe(dup2.address); + expect(warns.some(w => w.includes('claimed by both'))).toBe(true); + }); + + it('replaces the previous cache atomically on subsequent refresh calls', async () => { + // Simulate a Hub event triggering re-discovery. + const v11 = { + hubName: 'KnowledgeCollectionStorageV11', + address: '0xC1C1C1C1C1C1C1C1C1C1C1C1C1C1C1C1C1C1C1C1', + uriBase: 'did:dkg:v11', + }; + let storages = [V10_KCS, V9_KAS]; + const hubReader: KCStorageHubReader = { + async getAllAssetStorages() { + return storages.map(s => ({ name: s.hubName, addr: s.address })); + }, + }; + const uriReader: KCStorageUriReader = { + async readUriBase(addr: string) { + const match = storages.find(s => s.address === addr); + if (!match) throw new Error(`unknown ${addr}`); + return typeof match.uriBase === 'string' ? match.uriBase : ''; + }, + }; + const registry = new KCStorageRegistry(hubReader, uriReader); + + await registry.refresh(); + expect(registry.getByTag('v11')).toBeUndefined(); + + // V11 deploys; Hub fires NewAssetStorage; registry refreshes. + storages = [V10_KCS, V9_KAS, v11]; + await registry.refresh(); + expect(registry.getByTag('v11')?.address).toBe(v11.address); + }); + + it('does not throw if Hub.getAllAssetStorages() throws (RPC outage)', async () => { + const hubReader: KCStorageHubReader = { + async getAllAssetStorages() { + throw new Error('rpc timeout'); + }, + }; + const uriReader: KCStorageUriReader = { + async readUriBase() { throw new Error('unreachable'); }, + }; + const warns: string[] = []; + const registry = new KCStorageRegistry(hubReader, uriReader, { + log: { warn: m => warns.push(m) }, + }); + + await registry.refresh(); + expect(warns.some(w => w.includes('getAllAssetStorages() failed'))).toBe(true); + expect(registry.getDefaultAddress()).toBeUndefined(); + }); + + it('returns a stable getAll() snapshot regardless of insertion order', async () => { + const { registry } = buildRegistry([V9_KAS, V10_KCS]); + await registry.refresh(); + const all = registry.getAll(); + expect(all.map(e => e.tag).sort()).toEqual(['', 'v9']); + }); +}); + +describe('KCStorageRegistry — lookup correctness on an empty registry', () => { + // A registry that never had refresh() called should answer + // every lookup with undefined. Important for sites that ask + // optimistically and only fall back to a slower path on miss. + it('returns undefined for every lookup before refresh()', () => { + const hubReader: KCStorageHubReader = { async getAllAssetStorages() { return []; } }; + const uriReader: KCStorageUriReader = { async readUriBase() { return ''; } }; + const registry = new KCStorageRegistry(hubReader, uriReader); + expect(registry.getByTag('')).toBeUndefined(); + expect(registry.getByTag('v9')).toBeUndefined(); + expect(registry.getByAddress('0xanything')).toBeUndefined(); + expect(registry.getDefault()).toBeUndefined(); + expect(registry.getDefaultAddress()).toBeUndefined(); + expect(registry.tagFor('0xanything')).toBeUndefined(); + expect(registry.getAll()).toEqual([]); + }); +}); From 53850de34ea72b8002815566c49ffbd9e44a8910 Mon Sep 17 00:00:00 2001 From: Branimir Rakic Date: Wed, 27 May 2026 01:06:18 +0200 Subject: [PATCH 05/12] feat(chain,publisher): use storage tag at mint time (RFC-40 PR-4) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit OT-RFC-40 PR-4. The chain adapter now exposes the storage tag of the KC storage instance it is bound to, and `dkg-publisher.ts` passes that tag through to `kcUal()` at every mint site. Surface change on `ChainAdapter`: /** OT-RFC-40 §5.2 storage tag this adapter mints into. */ mintingStorageTag?: string; EVM adapter populates the field during `init()` by reading `KnowledgeCollectionStorage.uri(0)` (the ERC-1155 metadata URI baked into the storage at construction time, as declared in `tokens/ERC1155Delta.sol:80-82`) and parsing the suffix via `deriveStorageTag()` from PR-3. Falls back to "" if the read fails or the uriBase is malformed — preserves the legacy 3-segment UAL output under any failure mode. Mock adapter exposes `setMintingStorageTag(tag)` for tests; defaults to "". This is what makes the new behaviour testable end-to-end without spinning up a real chain. Publisher integration: `dkg-publisher.ts` reads `this.chain.mintingStorageTag` (defaulting to "" if absent — preserves old NoChainAdapter / pre-RFC adapters) and passes it as the 4th argument to `kcUal()` at every one of the 7 UAL-construction call sites (the publish path, the local-only update path, the V10-rejected path, the failed-tx path, the no- publisher confirmed path, the tentative-result-without-effective- publisher path, and the success path). Effect: - Default storage (uriBase "did:dkg") → tag "" → 3-segment UAL. Bit-for-bit identical to PR-1's output. Verified by the existing 1046 publisher tests, all of which still pass against the new helper signature. - Tagged storage (uriBase "did:dkg:v9", "did:dkg:v11", etc) → tag "v9" / "v11" / etc → 4-segment UAL. New behaviour. Tests: `storage-tagged-ual.test.ts` exercises both shapes via MockChainAdapter — three cases pinning the tag round-trip through the full publish flow (default → 3-seg, "v2" → 4-seg with tag at slot 1, "v11" → recoverable via parseUal). The mock log output in test stdout shows the resolved UAL form, so the on-chain confirmed log line itself is now a documentation artifact: UAL=did:dkg:mock:31337/0xbcE.../101 (default) UAL=did:dkg:v2/mock:31337/0x6D2.../101 (tagged) UAL=did:dkg:v11/mock:31337/0x603.../101 (tagged) Out of scope (deferred to PR-5): - Resolution-side multi-storage in `kc-extractor` / `RandomSamplingProver` / `verifyPublisherOwnsRange`. PR-4 only changes which UAL form is produced; PR-5 teaches consumers to route on the tag. - Wiring the registry into the EVM adapter for plural-storage enumeration. The adapter currently reads only its OWN minting storage; the cross-storage registry is added in PR-5. Refs: docs/RFC40_MULTI_STORAGE_KC_URI_SCHEME.md §7.4 Co-authored-by: Cursor --- packages/chain/src/chain-adapter.ts | 14 ++ packages/chain/src/evm-adapter.ts | 39 +++++ packages/chain/src/mock-adapter.ts | 13 ++ packages/publisher/src/dkg-publisher.ts | 14 +- .../publisher/test/storage-tagged-ual.test.ts | 138 ++++++++++++++++++ 5 files changed, 211 insertions(+), 7 deletions(-) create mode 100644 packages/publisher/test/storage-tagged-ual.test.ts diff --git a/packages/chain/src/chain-adapter.ts b/packages/chain/src/chain-adapter.ts index 2a95013f4..64ffed46b 100644 --- a/packages/chain/src/chain-adapter.ts +++ b/packages/chain/src/chain-adapter.ts @@ -559,6 +559,20 @@ export interface OperationalWalletRegistrationResult { export interface ChainAdapter { chainType: 'evm' | 'solana'; chainId: string; + /** + * OT-RFC-40 §5.2 storage tag of the KC storage instance this adapter + * mints into. Empty string ("") means the default storage and produces + * the legacy 3-segment UAL form `did:dkg:{chainId}/{publisher}/{id}`. + * Tagged storages (e.g. V9 KAS at `did:dkg:v9`) produce the 4-segment + * form `did:dkg:{tag}/{chainId}/{publisher}/{id}` so the resolver can + * route to the correct storage instance. + * + * Optional on the adapter surface so adapters that pre-date the RFC + * default to "" (preserving the legacy form bit-for-bit). EVM adapters + * populate it during `init()` by reading the storage's `uri(0)` view; + * mock adapters expose a setter for tests. + */ + mintingStorageTag?: string; /** * Stable identifier for the SPECIFIC deployment this adapter is * bound to (not just the chain). `chainId` alone is too coarse — diff --git a/packages/chain/src/evm-adapter.ts b/packages/chain/src/evm-adapter.ts index 8ea9ed954..459fa18ca 100644 --- a/packages/chain/src/evm-adapter.ts +++ b/packages/chain/src/evm-adapter.ts @@ -40,6 +40,7 @@ import { ChallengeNoLongerActiveError, } from './chain-adapter.js'; import { HubResolutionCache } from './hub-resolution-cache.js'; +import { deriveStorageTag } from './kc-storage-registry.js'; import { PcaUnavailableError } from './pca-errors.js'; import { buildAuthorAttestationTypedData, @@ -298,6 +299,17 @@ export class EVMChainAdapter implements ChainAdapter { } readonly chainType = 'evm' as const; readonly chainId: string; + /** + * OT-RFC-40 §5.2 storage tag for the KC storage this adapter mints + * into. Populated during `init()` from `KnowledgeCollectionStorage.uri(0)` + * and frozen for the lifetime of the adapter — RC11 deployments + * rebuild the adapter on `chainResetMarker` change anyway, and + * mid-lifetime storage swaps under the same Hub name are not a + * supported operation. Defaults to "" so the legacy 3-segment UAL + * form is produced when the storage's `uriBase` is the canonical + * `did:dkg`. See `kcUal()` in `@origintrail-official/dkg-core`. + */ + mintingStorageTag = ''; private readonly provider: JsonRpcProvider; private readonly filterErrorSilencer: FilterErrorSilencer; @@ -789,6 +801,33 @@ export class EVMChainAdapter implements ChainAdapter { } this.contracts.knowledgeCollectionStorage = await this.resolveAssetStorage('KnowledgeCollectionStorage'); + // OT-RFC-40 §5.2 — bind this adapter to the storage tag of the + // `KnowledgeCollectionStorage` instance it is minting into. The + // tag is the suffix of the storage's `uri(0)` past the `did:dkg` + // prefix; empty for the canonical V10 default storage. Failure to + // read or parse the uriBase falls back to "" (the legacy 3-segment + // UAL form), which is the most-conservative behaviour: it preserves + // every bit of UAL output produced before this RFC. + try { + const uriBase: string = await this.contracts.knowledgeCollectionStorage.uri(0); + const tag = deriveStorageTag(uriBase); + if (tag === null) { + console.warn( + `[EVMChainAdapter] KC storage at ${await this.contracts.knowledgeCollectionStorage.getAddress()} ` + + `returned malformed uriBase "${uriBase}"; falling back to default storage tag (3-segment UALs)`, + ); + this.mintingStorageTag = ''; + } else { + this.mintingStorageTag = tag; + } + } catch (err) { + console.warn( + `[EVMChainAdapter] failed to read uri(0) from KC storage: ${err instanceof Error ? err.message : String(err)}; ` + + `falling back to default storage tag (3-segment UALs)`, + ); + this.mintingStorageTag = ''; + } + // V9 contracts (KnowledgeAssets + KnowledgeAssetsStorage) are archived // (PRD §4.1, deploy scripts 040+041 moved under deploy/archive). Keep // the try/catch so adapters bound to legacy deploys still resolve them. diff --git a/packages/chain/src/mock-adapter.ts b/packages/chain/src/mock-adapter.ts index 1105fc3ad..bdda6a5d5 100644 --- a/packages/chain/src/mock-adapter.ts +++ b/packages/chain/src/mock-adapter.ts @@ -48,6 +48,19 @@ export class MockChainAdapter implements ChainAdapter { readonly chainType = 'evm' as const; readonly chainId: string; readonly signerAddress: string; + /** + * OT-RFC-40 §5.2 storage tag for the KC storage this mock adapter + * pretends to mint into. Defaults to "" (the legacy 3-segment UAL + * form, same shape every existing mock-backed test asserts on). + * Tests that exercise the multi-storage path call + * `setMintingStorageTag('v9')` etc. to flip into the 4-segment form. + */ + mintingStorageTag = ''; + + /** Test helper: override the storage tag this adapter advertises. */ + setMintingStorageTag(tag: string): void { + this.mintingStorageTag = tag; + } /** See `ChainAdapter.deploymentId`. Single in-memory deployment per process — chainId is enough. */ get deploymentId(): string { diff --git a/packages/publisher/src/dkg-publisher.ts b/packages/publisher/src/dkg-publisher.ts index 6c1d3477b..af1bbd2bc 100644 --- a/packages/publisher/src/dkg-publisher.ts +++ b/packages/publisher/src/dkg-publisher.ts @@ -2081,7 +2081,7 @@ export class DKGPublisher implements Publisher { // confirmed states for this publish so the `dkg:Publication` subject // emitted in metadata stays the same after on-chain confirmation. const publishOperationId = `${this.sessionId}-${tentativeSeq}`; - let ual = kcUal(this.chain.chainId, publisherAddress, `t${publishOperationId}`); + let ual = kcUal(this.chain.chainId, publisherAddress, `t${publishOperationId}`, this.chain.mintingStorageTag); // Resolve the on-chain attribution target from the per-call override // (computed above) or fall back to the daemon's persistent identity. @@ -2471,7 +2471,7 @@ export class DKGPublisher implements Publisher { // String(...) preserves the historical template-literal behaviour // around the typed-as-optional `startKAId` (in practice always // defined post-publish; absent only on the update OnChainPublishResult). - ual = kcUal(this.chain.chainId, onChainResult.publisherAddress, String(onChainResult.startKAId)); + ual = kcUal(this.chain.chainId, onChainResult.publisherAddress, String(onChainResult.startKAId), this.chain.mintingStorageTag); for (const km of kaMetadata) { km.kcUal = ual; @@ -2756,7 +2756,7 @@ export class DKGPublisher implements Publisher { // branch at line ~2670 always populates it via localTentativePublisherAddress()). const result: PublishResult = { kcId, - ual: kcUal(this.chain.chainId, String(publisherAddress), kcId), + ual: kcUal(this.chain.chainId, String(publisherAddress), kcId, this.chain.mintingStorageTag), merkleRoot: kcMerkleRoot, kaManifest: manifestEntries, status: 'tentative', @@ -2832,7 +2832,7 @@ export class DKGPublisher implements Publisher { if (!rejectedPublisherAddress) throw v10Err; earlyReturn = { kcId, - ual: kcUal(this.chain.chainId, rejectedPublisherAddress, kcId), + ual: kcUal(this.chain.chainId, rejectedPublisherAddress, kcId, this.chain.mintingStorageTag), merkleRoot: kcMerkleRoot, kaManifest: manifestEntries, status: 'failed', @@ -2887,7 +2887,7 @@ export class DKGPublisher implements Publisher { onPhase?.('chain', 'end'); return { kcId, - ual: kcUal(this.chain.chainId, failedPublisherAddress, kcId), + ual: kcUal(this.chain.chainId, failedPublisherAddress, kcId, this.chain.mintingStorageTag), merkleRoot: kcMerkleRoot, kaManifest: manifestEntries, status: 'failed', @@ -2919,7 +2919,7 @@ export class DKGPublisher implements Publisher { await storeUpdatedQuads(); const result: PublishResult = { kcId, - ual: kcUal(this.chain.chainId, tentativePublisherAddress, kcId), + ual: kcUal(this.chain.chainId, tentativePublisherAddress, kcId, this.chain.mintingStorageTag), merkleRoot: kcMerkleRoot, kaManifest: manifestEntries, status: 'tentative', @@ -2933,7 +2933,7 @@ export class DKGPublisher implements Publisher { const result: PublishResult = { kcId, - ual: kcUal(this.chain.chainId, effectivePublisherAddress, kcId), + ual: kcUal(this.chain.chainId, effectivePublisherAddress, kcId, this.chain.mintingStorageTag), merkleRoot: kcMerkleRoot, kaManifest: manifestEntries, status: 'confirmed', diff --git a/packages/publisher/test/storage-tagged-ual.test.ts b/packages/publisher/test/storage-tagged-ual.test.ts new file mode 100644 index 000000000..b6b1e0a15 --- /dev/null +++ b/packages/publisher/test/storage-tagged-ual.test.ts @@ -0,0 +1,138 @@ +/** + * OT-RFC-40 PR-4 — verify that `dkg-publisher.ts` honours the chain + * adapter's `mintingStorageTag` field when constructing UALs. + * + * Behaviour pinned: + * 1. Default-storage adapters (`mintingStorageTag === ""`) produce + * the legacy 3-segment UAL form bit-for-bit. This is the test + * that all 1046 pre-RFC publisher tests implicitly assert; it's + * repeated here as an explicit, documented contract. + * 2. Storage-tagged adapters (`mintingStorageTag === "v2"`) produce + * the 4-segment UAL form. This is the new behaviour PR-4 enables. + * + * The shape of the UAL is what routes the resolver in PR-5; everything + * downstream of the publish path (publish-handler, store.nq subjects, + * ChainEventPoller's UAL parsing) trusts the form produced by these + * exact lines in `dkg-publisher.ts`. Pinning both shapes here ensures + * future refactors (e.g. content-addressed identity in the followup + * RFC) cannot silently regress either. + */ +import { describe, expect, it } from 'vitest'; +import { ethers } from 'ethers'; +import { MockChainAdapter } from '@origintrail-official/dkg-chain'; +import { + AUTHOR_SCHEME_VERSION_V1, + TypedEventBus, + buildAuthorAttestationTypedData, + generateEd25519Keypair, + parseUal, +} from '@origintrail-official/dkg-core'; +import { OxigraphStore, type Quad } from '@origintrail-official/dkg-storage'; +import { DKGPublisher, canonicalPublishPayload } from '../src/index.js'; +import { mockChainStubACKProvider } from './_helpers/acks.js'; + +function quad(subject: string, predicate = 'http://schema.org/name', object = '"Root"'): Quad { + return { subject, predicate, object, graph: '' }; +} + +async function buildSeal( + chain: MockChainAdapter, + contextGraphId: bigint, + quads: Quad[], + author: ethers.Wallet, +) { + const canonical = canonicalPublishPayload(quads, []); + const typed = buildAuthorAttestationTypedData({ + chainId: await chain.getEvmChainId(), + kav10Address: await chain.getKnowledgeAssetsV10Address(), + contextGraphId, + merkleRoot: canonical.kcMerkleRoot, + authorAddress: author.address, + schemeVersion: AUTHOR_SCHEME_VERSION_V1, + }); + const sig = ethers.Signature.from(await author.signTypedData( + typed.domain, + typed.types, + typed.message, + )); + return { + expectedMerkleRoot: canonical.kcMerkleRoot, + authorAddress: author.address, + signature: { + r: ethers.getBytes(sig.r), + vs: ethers.getBytes(sig.yParityAndS), + }, + schemeVersion: AUTHOR_SCHEME_VERSION_V1, + }; +} + +async function publishOnce(opts: { storageTag: string }): Promise { + const wallet = ethers.Wallet.createRandom(); + const chain = new MockChainAdapter('mock:31337', wallet.address); + chain.setMintingStorageTag(opts.storageTag); + chain.seedIdentity(wallet.address, 1n); + const created = await chain.createOnChainContextGraph({ + accessPolicy: 0, + publishPolicy: 1, + metadataBatchId: 0n, + }); + const contextGraphId = created.contextGraphId; + + const keypair = await generateEd25519Keypair(); + const store = new OxigraphStore(); + const publisher = new DKGPublisher({ + store, + chain, + eventBus: new TypedEventBus(), + keypair, + publisherPrivateKey: wallet.privateKey, + publisherNodeIdentityId: 1n, + }); + + const quads = [quad('urn:test:rfc40-pr4-root')]; + const result = await publisher.publish({ + contextGraphId: String(contextGraphId), + quads, + precomputedAttestation: await buildSeal(chain, contextGraphId, quads, wallet), + v10ACKProvider: mockChainStubACKProvider({ identityId: 1n }), + }); + expect(result.status).toBe('confirmed'); + return result.ual; +} + +describe('OT-RFC-40 PR-4 — minting UAL respects ChainAdapter.mintingStorageTag', () => { + it('default-storage adapter produces a 3-segment UAL bit-for-bit equivalent to the pre-RFC output', async () => { + const ual = await publishOnce({ storageTag: '' }); + // mock:31337// shape — segment count check against + // the parser is the most stable form-assertion we can make here + // since the publisher address and KA id are runtime-generated. + expect(ual.startsWith('did:dkg:mock:31337/0x')).toBe(true); + const parsed = parseUal(ual); + expect(parsed).not.toBeNull(); + expect(parsed!.storageTag).toBe(''); + expect(parsed!.chainId).toBe('mock:31337'); + // 3-segment form: nothing between the prefix and the chainId. + const afterPrefix = ual.slice('did:dkg:'.length); + expect(afterPrefix.split('/')).toHaveLength(3); + }); + + it('tagged-storage adapter produces a 4-segment UAL with the tag in slot 1', async () => { + const ual = await publishOnce({ storageTag: 'v2' }); + expect(ual.startsWith('did:dkg:v2/mock:31337/0x')).toBe(true); + const parsed = parseUal(ual); + expect(parsed).not.toBeNull(); + expect(parsed!.storageTag).toBe('v2'); + expect(parsed!.chainId).toBe('mock:31337'); + const afterPrefix = ual.slice('did:dkg:'.length); + expect(afterPrefix.split('/')).toHaveLength(4); + }); + + it('tag survives the round-trip and is recoverable via parseUal', async () => { + // Belt-and-suspenders: for a hypothetical future V11 storage with + // tag "v11", make sure the resolver-side parser sees exactly the + // tag the publisher minted under. This is the contract that PR-5's + // resolution path relies on. + const ual = await publishOnce({ storageTag: 'v11' }); + expect(parseUal(ual)?.storageTag).toBe('v11'); + }); +}); From da33d5f04064181465a9381f0b20768db676c63e Mon Sep 17 00:00:00 2001 From: Branimir Rakic Date: Wed, 27 May 2026 01:32:45 +0200 Subject: [PATCH 06/12] feat(chain,publisher,random-sampling): tag-aware resolution paths (RFC-40 PR-5) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit PR-5 of OT-RFC-40 closes the read-side gap: with PR-4 mints now stamping each KC UAL with the storage-tag of the KnowledgeCollectionStorage that owns it, every consumer that previously assumed a single global KC storage instance has to consult that tag to land on the right contract. Three call sites moved off the single-storage assumption: 1. `ChainAdapter.verifyPublisherOwnsRange` now accepts an optional `storageTag`. The EVM adapter routes: - `'v9'` → legacy V9 KAS pre-reservation API (unchanged behaviour; this is what the adapter did before PR-5). - `''` → V10 default storage. V10 does not pre-reserve ranges, so the method returns `true` and defers to the ACK-signature gate downstream; otherwise every V10 publish on a Hub without a V9 KAS would have been silently rejected here. - unknown tag → conservative `false` (registry hasn't seen it). The publish-handler now derives the tag from `parseUal(request.ual)` and passes it through. A new lifecycle test pins the V10 default path; the existing "publisher does not own UAL range" test now uses a `did:dkg:v9/...` UAL to keep exercising the V9 KAS path. 2. `EVMChainAdapter` exposes a `KCStorageRegistry` (`kcStorageRegistry`) built during `init()` and refreshed on `Hub.NewAssetStorage` / `Hub.AssetStorageChanged` events, so any tag-aware consumer can resolve storage address ↔ tag without re-deriving it from `uri(0)`. 3. `extractV10KCFromStore` (random-sampling) takes an optional `expectedStorageTag`. The SPARQL no longer caps at `LIMIT 1`; we parse every candidate UAL with `parseUal` and keep only those whose storage-tag matches the expectation, fixing the prover's prove-the-wrong-KC failure mode when two storages happen to mint the same `kcId`. The prover wires the tag from the on-chain `Challenge.knowledgeCollectionStorageContract` via `kcStorageRegistry.tagFor(...)`. Tests: - chain: `verifyPublisherOwnsRange` paths for v9, default, and unknown tags (`evm-adapter.test.ts`); mock-adapter parity ignores the new private `buildKCStorageRegistry` EVM helper. - publisher: lifecycle test split into v9-tag and default-tag cases. - random-sampling: `kc-extractor-multi-storage.test.ts` covers same-`kcId` collisions across V10 default, `v9`, omitted-tag (legacy callers), and unknown-tag inputs. Compatible with PR-3 (registry) and PR-4 (mint-time tagging); leaves PR-6 (operator docs) as the only remaining piece of the RFC. Co-authored-by: Cursor --- packages/chain/src/chain-adapter.ts | 38 +++- packages/chain/src/evm-adapter.ts | 149 +++++++++++++- packages/chain/src/mock-adapter.ts | 6 + packages/chain/test/evm-adapter.test.ts | 33 ++- .../chain/test/mock-adapter-parity.test.ts | 6 + packages/publisher/src/publish-handler.ts | 13 ++ .../publisher/test/publish-lifecycle.test.ts | 51 ++++- packages/random-sampling/src/index.ts | 1 + packages/random-sampling/src/kc-extractor.ts | 56 +++++- packages/random-sampling/src/prover.ts | 15 +- .../test/kc-extractor-multi-storage.test.ts | 188 ++++++++++++++++++ 11 files changed, 539 insertions(+), 17 deletions(-) create mode 100644 packages/random-sampling/test/kc-extractor-multi-storage.test.ts diff --git a/packages/chain/src/chain-adapter.ts b/packages/chain/src/chain-adapter.ts index 64ffed46b..f02241a78 100644 --- a/packages/chain/src/chain-adapter.ts +++ b/packages/chain/src/chain-adapter.ts @@ -1,4 +1,5 @@ import type { ethers } from 'ethers'; +import type { KCStorageRegistry } from './kc-storage-registry.js'; export interface IdentityProof { publicKey: Uint8Array; @@ -573,6 +574,23 @@ export interface ChainAdapter { * mock adapters expose a setter for tests. */ mintingStorageTag?: string; + + /** + * OT-RFC-40 §5.3 — registry of every KC-class storage on the Hub + * this adapter is bound to, keyed by storage tag and address. + * + * Resolution-time consumers (random-sampling prover, async-lift + * verifier, replication ack verifier) use this to map a UAL or a + * `Challenge.knowledgeCollectionStorageContract` address to the + * storage instance that minted the data, instead of assuming a + * single named-`KnowledgeCollectionStorage` exists. + * + * Optional on the adapter surface: adapters that pre-date the RFC + * (or only support a single storage instance) leave it `undefined`, + * in which case callers fall back to "treat everything as the + * default storage" — bit-for-bit pre-RFC behaviour. + */ + kcStorageRegistry?: KCStorageRegistry; /** * Stable identifier for the SPECIFIC deployment this adapter is * bound to (not just the chain). `chainId` alone is too coarse — @@ -622,8 +640,26 @@ export interface ChainAdapter { /** * Verify that a publisher address owns the UAL range [startKAId, endKAId] on-chain. * Used by receiving nodes to reject PublishRequests with spoofed publisher/range. + * + * OT-RFC-40 §7.5: optional 4th `storageTag` argument routes the + * range query to the correct KC storage instance. Empty string / + * undefined means the V10 default storage (which does not pre-reserve + * publisher ranges; auth happens at the publish ACK layer, so this + * call returns `true` to defer to that). `"v9"` routes to V9 KAS, + * which does carry per-publisher reserved ranges. Unknown tags + * return `false` conservatively. + * + * Adapters that pre-date the RFC ignore the parameter and preserve + * their existing behaviour; mock adapters keep their test-fixture + * range bookkeeping. The publish-handler derives the tag from the + * incoming request's UAL via `parseUal(request.ual).storageTag`. */ - verifyPublisherOwnsRange?(publisherAddress: string, startKAId: bigint, endKAId: bigint): Promise; + verifyPublisherOwnsRange?( + publisherAddress: string, + startKAId: bigint, + endKAId: bigint, + storageTag?: string, + ): Promise; // Block height (used by ChainEventPoller to seed the scan cursor) getBlockNumber?(): Promise; diff --git a/packages/chain/src/evm-adapter.ts b/packages/chain/src/evm-adapter.ts index 459fa18ca..de3287890 100644 --- a/packages/chain/src/evm-adapter.ts +++ b/packages/chain/src/evm-adapter.ts @@ -40,7 +40,12 @@ import { ChallengeNoLongerActiveError, } from './chain-adapter.js'; import { HubResolutionCache } from './hub-resolution-cache.js'; -import { deriveStorageTag } from './kc-storage-registry.js'; +import { + deriveStorageTag, + KCStorageRegistry, + type KCStorageHubReader, + type KCStorageUriReader, +} from './kc-storage-registry.js'; import { PcaUnavailableError } from './pca-errors.js'; import { buildAuthorAttestationTypedData, @@ -311,6 +316,15 @@ export class EVMChainAdapter implements ChainAdapter { */ mintingStorageTag = ''; + /** + * OT-RFC-40 §5.3 — registry of every KC-class storage on this + * adapter's Hub, keyed by storage tag and address. Populated lazily + * in `init()`; refreshed on `Hub.NewAssetStorage` / + * `Hub.AssetStorageChanged` so a V11 storage that gets registered + * after daemon boot becomes resolvable without a restart. + */ + kcStorageRegistry?: KCStorageRegistry; + private readonly provider: JsonRpcProvider; private readonly filterErrorSilencer: FilterErrorSilencer; /** Primary signer — used for identity/profile/staking operations. */ @@ -828,6 +842,23 @@ export class EVMChainAdapter implements ChainAdapter { this.mintingStorageTag = ''; } + // OT-RFC-40 §5.3 — build the cross-storage registry so consumers + // (random-sampling prover, async-lift verifier, replication ack + // verifier) can map a UAL or `Challenge.knowledgeCollectionStorageContract` + // address back to the storage that minted the data. Failure here is + // non-fatal; the registry just stays empty and resolution-time + // callers fall back to "treat as default storage" which is + // bit-for-bit pre-RFC behaviour. + try { + this.kcStorageRegistry = await this.buildKCStorageRegistry(); + } catch (err) { + console.warn( + `[EVMChainAdapter] failed to build KC storage registry: ${err instanceof Error ? err.message : String(err)}; ` + + `multi-storage resolution paths will fall back to default-storage behaviour`, + ); + this.kcStorageRegistry = undefined; + } + // V9 contracts (KnowledgeAssets + KnowledgeAssetsStorage) are archived // (PRD §4.1, deploy scripts 040+041 moved under deploy/archive). Keep // the try/catch so adapters bound to legacy deploys still resolve them. @@ -1432,17 +1463,59 @@ export class EVMChainAdapter implements ChainAdapter { publisherAddress: string, startKAId: bigint, endKAId: bigint, + storageTag?: string, ): Promise { await this.init(); - if (!this.contracts.knowledgeAssetsStorage) return false; - const storage = this.contracts.knowledgeAssetsStorage; - const count = await storage.getPublisherRangesCount(publisherAddress); - for (let i = 0; i < Number(count); i++) { - const [startId, endId] = await storage.getPublisherRange(publisherAddress, i); - if (startId <= startKAId && endId >= endKAId) return true; + // OT-RFC-40 §7.5: route the range query to the storage instance + // that minted the UAL. The publish-handler derives `storageTag` + // from `parseUal(request.ual).storageTag`; it is empty for V10 + // default-storage UALs (3-segment form) and "v9" for V9 KAS UALs. + const tag = storageTag ?? ''; + + if (tag === 'v9') { + // V9 KAS is the only currently-deployed storage that pre-reserves + // per-publisher ID ranges. The query API + // (`getPublisherRangesCount`/`getPublisherRange`) is V9-specific. + if (!this.contracts.knowledgeAssetsStorage) return false; + const storage = this.contracts.knowledgeAssetsStorage; + const count = await storage.getPublisherRangesCount(publisherAddress); + for (let i = 0; i < Number(count); i++) { + const [startId, endId] = await storage.getPublisherRange(publisherAddress, i); + if (startId <= startKAId && endId >= endKAId) return true; + } + return false; + } + + if (tag === '') { + // V10 default storage. V10 does NOT pre-reserve ranges — KCs + // are minted directly with their token-bound publisher recorded + // on-chain at mint time. The publish-handler's existing publish + // ACK signature check (V10 ACKs collected from receiving nodes) + // is the authoritative ownership verification on this path; this + // method exists as a V9-era pre-flight, so for V10 we simply + // defer to that downstream check by returning `true`. NOT + // returning `true` here would silently reject every V10 publish + // on Hubs without a V9 KAS deployment — the bug RFC-40 PR-5 + // calls out by name. + return true; + } + + // Unknown tag: the registry has not seen this storage. Conservative + // failure: refuse to attest range ownership for an unknown storage + // instance. Operators see the rejection message and can either + // refresh the registry or investigate why a UAL was minted under + // a tag the receiver doesn't recognise. + if (!this.kcStorageRegistry?.getByTag(tag)) { + return false; } - return false; + + // Future tagged storages might re-introduce a V9-style range API. + // When that happens, this branch can switch on the storage's hubName + // to route to the appropriate query. Today, the conservative answer + // for any tag we DO recognise but don't have a range API for is + // identical to V10's: defer to the publish-handler's downstream auth. + return true; } // ===================================================================== @@ -2862,15 +2935,75 @@ export class EVMChainAdapter implements ChainAdapter { this.invalidateRandomSamplingPair(); } }; + // OT-RFC-40 §5.3 — refresh the KC storage registry on every + // `NewAssetStorage` / `AssetStorageChanged` event regardless of + // name (the registry's filter rule already restricts to KC-class + // names). Listening unconditionally avoids races where a future + // version naming convention is added but we forget to update this + // allowlist; the registry's filter is the single source of truth. + const onAssetStorageChange = (): void => { + if (this.kcStorageRegistry === undefined) return; + void this.kcStorageRegistry + .refresh() + .catch((err) => console.warn( + `[EVMChainAdapter] KC storage registry refresh failed after Hub event: ${err instanceof Error ? err.message : String(err)}`, + )); + }; try { await this.contracts.hub.on('ContractChanged', onChange); await this.contracts.hub.on('NewContract', onChange); + await this.contracts.hub.on('NewAssetStorage', onAssetStorageChange); + await this.contracts.hub.on('AssetStorageChanged', onAssetStorageChange); this.hubRotationListenerStarted = true; } catch { /* provider doesn't support filter subscriptions — TTL refresh is the fallback */ } } + /** + * OT-RFC-40 §5.3 — construct the cross-storage registry by giving + * it ethers-backed implementations of the two readers it needs. + * + * The hub reader mirrors `Hub.getAllAssetStorages()` exactly. The + * URI reader instantiates a minimal ABI fragment for ERC-1155's + * `uri(uint256)` view rather than dragging in the full + * `KnowledgeCollectionStorage` ABI per call — this keeps the read + * working uniformly across V10 KCS, V9 KAS, and any future V11+ + * storages without a per-version ABI plumbing detour. + */ + private async buildKCStorageRegistry(): Promise { + const hub = this.contracts.hub; + const hubReader: KCStorageHubReader = { + async getAllAssetStorages() { + const raw = await hub.getAllAssetStorages(); + return Array.isArray(raw) + ? raw.map((entry: { name: string; addr: string } | { name: string; addr: string }[]) => { + const tuple = entry as { name?: string; addr?: string } & Array; + return { + name: String(tuple.name ?? tuple[0]), + addr: String(tuple.addr ?? tuple[1]), + }; + }) + : []; + }, + }; + const uriAbi: ethers.InterfaceAbi = [ + 'function uri(uint256) view returns (string)', + ]; + const provider = this.provider; + const uriReader: KCStorageUriReader = { + async readUriBase(storageAddress: string) { + const c = new Contract(storageAddress, uriAbi, provider); + return c.uri(0); + }, + }; + const registry = new KCStorageRegistry(hubReader, uriReader, { + log: { warn: (m: string) => console.warn(m) }, + }); + await registry.refresh(); + return registry; + } + /** * Map a caught chain error onto a typed prover error when the revert * matches one of the documented retry-next-period / non-retryable diff --git a/packages/chain/src/mock-adapter.ts b/packages/chain/src/mock-adapter.ts index bdda6a5d5..5554f8ab6 100644 --- a/packages/chain/src/mock-adapter.ts +++ b/packages/chain/src/mock-adapter.ts @@ -263,7 +263,13 @@ export class MockChainAdapter implements ChainAdapter { publisherAddress: string, startKAId: bigint, endKAId: bigint, + _storageTag?: string, ): Promise { + // RFC-40 PR-5: the mock keeps its existing single-storage range + // bookkeeping; the `storageTag` parameter is accepted for ABI + // compatibility with the EVM adapter but the mock's range ledger + // is shared across all simulated storages. Tests that need + // multi-storage behaviour use the EVM-adapter-level tests. const ranges = this.reservedRangesByPublisher.get(publisherAddress); if (!ranges?.length) return false; for (const r of ranges) { diff --git a/packages/chain/test/evm-adapter.test.ts b/packages/chain/test/evm-adapter.test.ts index f72312e68..5b5d4f755 100644 --- a/packages/chain/test/evm-adapter.test.ts +++ b/packages/chain/test/evm-adapter.test.ts @@ -48,10 +48,39 @@ describe('EVMChainAdapter integration', () => { expect(bn).toBeGreaterThanOrEqual(0); }, 15_000); - it('verifyPublisherOwnsRange resolves KnowledgeAssetsStorage after init', async () => { + it('verifyPublisherOwnsRange("v9") resolves KnowledgeAssetsStorage after init', async () => { + // OT-RFC-40 §7.5: explicit `"v9"` tag routes to the V9 KAS + // publisher-range API. A freshly-deployed V9 KAS has no + // pre-reserved ranges for any address, so this returns false. const adapter = new EVMChainAdapter(makeAdapterConfig(ctx.rpcUrl, ctx.hubAddress, HARDHAT_KEYS.DEPLOYER)); const deployer = adapter.getSignerAddress(); - const owns = await adapter.verifyPublisherOwnsRange(deployer, 1n, 1n); + const owns = await adapter.verifyPublisherOwnsRange(deployer, 1n, 1n, 'v9'); + expect(owns).toBe(false); + }, 30_000); + + it('verifyPublisherOwnsRange (default tag) defers to V10 ACK auth → returns true', async () => { + // OT-RFC-40 §7.5: the V10 default storage does NOT pre-reserve + // publisher ranges; ownership is verified at the ACK-signature + // layer. The method returns true so V10 publishes on Hubs without + // a V9 KAS deployment aren't silently rejected — the bug PR-5 + // calls out by name. Pre-RFC, this returned false unconditionally + // when V9 KAS was empty. + const adapter = new EVMChainAdapter(makeAdapterConfig(ctx.rpcUrl, ctx.hubAddress, HARDHAT_KEYS.DEPLOYER)); + const deployer = adapter.getSignerAddress(); + const ownsDefault = await adapter.verifyPublisherOwnsRange(deployer, 1n, 1n); + expect(ownsDefault).toBe(true); + const ownsExplicitDefault = await adapter.verifyPublisherOwnsRange(deployer, 1n, 1n, ''); + expect(ownsExplicitDefault).toBe(true); + }, 30_000); + + it('verifyPublisherOwnsRange returns false for an unknown storage tag', async () => { + // Conservative failure mode for a UAL minted under a tag the + // receiver's registry doesn't recognise (e.g. a V11 storage that + // the daemon hasn't refreshed against yet). RFC §7.5 — operators + // see the rejection and can refresh the registry. + const adapter = new EVMChainAdapter(makeAdapterConfig(ctx.rpcUrl, ctx.hubAddress, HARDHAT_KEYS.DEPLOYER)); + const deployer = adapter.getSignerAddress(); + const owns = await adapter.verifyPublisherOwnsRange(deployer, 1n, 1n, 'nonexistent-tag'); expect(owns).toBe(false); }, 30_000); }); diff --git a/packages/chain/test/mock-adapter-parity.test.ts b/packages/chain/test/mock-adapter-parity.test.ts index 3b2899f3a..b485308b7 100644 --- a/packages/chain/test/mock-adapter-parity.test.ts +++ b/packages/chain/test/mock-adapter-parity.test.ts @@ -132,6 +132,12 @@ const MOCK_EXEMPT_FROM_EVM = new Set([ // round-trips to cache in the first place; a no-op shim would just // pad parity for no behavioural reason. 'invalidatePublishPreflightCache', + // OT-RFC-40 PR-5: TS-private helper that builds the KC storage + // registry from a live Hub. Mock adapter has no Hub-backed + // multi-storage discovery (its `mintingStorageTag` field is set + // directly via `setMintingStorageTag`), so there is no mock-side + // counterpart to mirror. + 'buildKCStorageRegistry', ]); const NO_CHAIN_EXEMPT_FROM_EVM = new Set([ diff --git a/packages/publisher/src/publish-handler.ts b/packages/publisher/src/publish-handler.ts index b107809ee..2b742c960 100644 --- a/packages/publisher/src/publish-handler.ts +++ b/packages/publisher/src/publish-handler.ts @@ -258,16 +258,29 @@ export class PublishHandler { // identity from IdentityStorage.getIdentityId(msg.sender). // ── On-chain range check: reject if publisher does not own startKAId..endKAId ── + // + // OT-RFC-40 §7.5: derive the storage tag from the request UAL so + // the adapter can route the range query to the correct storage + // instance (V9 KAS uses the legacy publisher-range API; V10 + // default storage skips range pre-reservation and defers to ACK + // signatures). `parseUal(request.ual).storageTag` is empty for + // 3-segment / default-storage UALs and equals the tag for + // 4-segment / tagged UALs; passing `undefined` (the case where + // request.ual is missing on a malformed or pre-RFC client) + // preserves the pre-RFC behaviour bit-for-bit. if ( startKAId > 0n && endKAId > 0n && request.publisherAddress && this.chainAdapter?.verifyPublisherOwnsRange ) { + const parsedRequestUal = request.ual ? parseUal(request.ual) : null; + const storageTag = parsedRequestUal?.storageTag; const owns = await this.chainAdapter.verifyPublisherOwnsRange( request.publisherAddress, startKAId, endKAId, + storageTag, ); if (!owns) { return this.rejectAck( diff --git a/packages/publisher/test/publish-lifecycle.test.ts b/packages/publisher/test/publish-lifecycle.test.ts index a158c5827..52260dfc0 100644 --- a/packages/publisher/test/publish-lifecycle.test.ts +++ b/packages/publisher/test/publish-lifecycle.test.ts @@ -473,7 +473,13 @@ describe('Tentative data and chain event confirmation', () => { expect(handler.hasPendingPublishes).toBe(false); }); - it('rejects PublishRequest when publisher does not own UAL range (on-chain check)', async () => { + it('rejects v9-tagged PublishRequest when publisher does not own UAL range (on-chain check)', async () => { + // OT-RFC-40 §7.5: V10 default-storage publishes (3-segment UALs) defer + // ownership verification to the ACK layer — `verifyPublisherOwnsRange` + // returns true for them. The legacy V9 KAS pre-reservation API is still + // exercised when the UAL carries a `v9` storage tag, so this test + // anchors the V9 path: a publisher who never reserved a range gets + // rejected up front. const store = new OxigraphStore(); const bus = new TypedEventBus(); const chainAdapter = createEVMAdapter(HARDHAT_KEYS.CORE_OP); @@ -485,7 +491,7 @@ describe('Tentative data and chain event confirmation', () => { `<${t.subject}> <${t.predicate}> ${t.object} .`, ).join('\n'); - const ual = `did:dkg:mock:31337/${publisherAddress}/1`; + const ual = `did:dkg:v9/mock:31337/${publisherAddress}/1`; const reqBytes = encodePublishRequest({ ual, nquads: new TextEncoder().encode(ntriples), @@ -507,6 +513,47 @@ describe('Tentative data and chain event confirmation', () => { expect(ack.rejectionReason).toContain('1..1'); }); + it('accepts default-tag PublishRequest without on-chain range check (V10 defers to ACK auth)', async () => { + // OT-RFC-40 §7.5: 3-segment / default-storage UALs are V10 publishes; + // the publisher never reserves a range and ownership is established by + // ACK signatures collected downstream. The handler should NOT reject + // such requests up front via `verifyPublisherOwnsRange`. (Other layers + // — Merkle, ACK collection, etc — still gate confirmation; this test + // only proves the range pre-flight no longer fires.) + const store = new OxigraphStore(); + const bus = new TypedEventBus(); + const chainAdapter = createEVMAdapter(HARDHAT_KEYS.CORE_OP); + const handler = new PublishHandler(store, bus, { chainAdapter }); + + const publisherAddress = TEST_WALLET.address; + const triples = [q('did:dkg:agent:QmV10NoRange', 'http://schema.org/name', '"V10Bot"')]; + const ntriples = triples.map(t => + `<${t.subject}> <${t.predicate}> ${t.object} .`, + ).join('\n'); + + const ual = `did:dkg:mock:31337/${publisherAddress}/1`; + const reqBytes = encodePublishRequest({ + ual, + nquads: new TextEncoder().encode(ntriples), + contextGraphId: CONTEXT_GRAPH, + kas: [{ tokenId: 1, rootEntity: 'did:dkg:agent:QmV10NoRange', privateMerkleRoot: new Uint8Array(0), privateTripleCount: 0 }], + publisherIdentity: new Uint8Array(32), + publisherAddress, + startKAId: 1, + endKAId: 1, + chainId: 'mock:31337', + publisherSignatureR: new Uint8Array(0), + publisherSignatureVs: new Uint8Array(0), + }); + + const ackData = await handler.handler(reqBytes, 'test-peer' as any); + const ack = decodePublishAck(ackData); + expect(ack.accepted).toBe(true); + if (ack.rejectionReason) { + expect(ack.rejectionReason).not.toContain('does not own'); + } + }); + it('rejects confirmation with mismatched publisher address', async () => { const store = new OxigraphStore(); const bus = new TypedEventBus(); diff --git a/packages/random-sampling/src/index.ts b/packages/random-sampling/src/index.ts index 5dfee55a8..345c0d621 100644 --- a/packages/random-sampling/src/index.ts +++ b/packages/random-sampling/src/index.ts @@ -28,6 +28,7 @@ export { KCDataMissingError, type KCTriple, type KCExtractionResult, + type ExtractV10KCOptions, } from './kc-extractor.js'; export { diff --git a/packages/random-sampling/src/kc-extractor.ts b/packages/random-sampling/src/kc-extractor.ts index c45472699..8eb8556ee 100644 --- a/packages/random-sampling/src/kc-extractor.ts +++ b/packages/random-sampling/src/kc-extractor.ts @@ -3,6 +3,7 @@ import { contextGraphDataUri, contextGraphMetaUri, hashTripleV10, + parseUal, TRUST_LEVEL_PREDICATE, LEGACY_TRUST_LEVEL_PREDICATE, } from '@origintrail-official/dkg-core'; @@ -156,10 +157,33 @@ export class KCDataMissingError extends Error { * or {@link KCDataMissingError} on the named failure modes — each is a * skip-this-period signal for the prover, not a retry. */ +export interface ExtractV10KCOptions { + /** + * OT-RFC-40 §7.5: when supplied, the extractor only matches UALs + * whose storage tag equals this value. Empty string ("") matches + * default-storage / 3-segment UALs; "v9" matches V9-tagged UALs; + * any other tag matches its respective tagged storage. + * + * In a CG that holds KCs from multiple storage instances (e.g. a + * V9 + V10 coexistence scenario), batchId values can collide + * across storages — both V9 KAS kcId=5 and V10 KCS kcId=5 are + * valid, but they are DIFFERENT KCs. Without this filter the + * meta-graph lookup would return whichever match Oxigraph happens + * to enumerate first, producing the wrong leaves at proof time. + * + * Omit (or leave undefined) to preserve pre-RFC behaviour: take + * whichever UAL matches the batchId first. This is correct for + * single-storage CGs (the overwhelming majority) and is what the + * prover does on adapters whose `kcStorageRegistry` is undefined. + */ + expectedStorageTag?: string; +} + export async function extractV10KCFromStore( store: TripleStore, cgId: bigint, kcId: bigint, + options: ExtractV10KCOptions = {}, ): Promise { const cgIdStr = cgId.toString(); // Map cgId (numeric) → local CG name via the ontology graph. The @@ -180,17 +204,42 @@ export async function extractV10KCFromStore( // 1. Resolve UAL via dkg:batchId. Use a typed integer literal to // avoid string-prefix collisions (kcId 1 vs 10) — same lookup // discipline as the publisher's resolveUalByBatchId (P-18 lesson). + // RFC-40 §7.5: pull every match (no LIMIT 1) so we can post-filter + // by storage tag when one was requested. In single-storage CGs + // the result set is always size 1 anyway. const ualResult = await store.query( `SELECT ?ual WHERE { GRAPH <${metaGraph}> { ?ual <${DKG}batchId> "${kcId}"^^<${XSD}integer> . } - } LIMIT 1`, + }`, ); if (ualResult.type !== 'bindings' || ualResult.bindings.length === 0) { throw new KCNotFoundError(cgId, kcId); } - const ual = stripQuotes(ualResult.bindings[0]['ual'] ?? ''); + + // RFC-40 §7.5: when an expected storage tag is provided, drop + // bindings whose UAL doesn't match. Bindings whose UAL fails to + // parse (CG data URIs accidentally tagged with a batchId, etc.) + // are also dropped — the batchId triple is publisher-authored, so + // a non-UAL binding indicates store corruption that the prover + // should not attempt to recover from silently. + const candidateUals = ualResult.bindings + .map((b) => stripQuotes(b['ual'] ?? '')) + .filter((u) => u.length > 0); + let ual: string | undefined; + if (options.expectedStorageTag !== undefined) { + for (const candidate of candidateUals) { + const parsed = parseUal(candidate); + if (parsed === null) continue; + if (parsed.storageTag === options.expectedStorageTag) { + ual = candidate; + break; + } + } + } else { + ual = candidateUals[0]; + } if (!ual) throw new KCNotFoundError(cgId, kcId); assertSafeIri(ual); @@ -377,7 +426,8 @@ export async function extractV10KCQuads( store: TripleStore, cgId: bigint, kcId: bigint, + options: ExtractV10KCOptions = {}, ): Promise { - const result = await extractV10KCFromStore(store, cgId, kcId); + const result = await extractV10KCFromStore(store, cgId, kcId, options); return result.triples.map((t) => ({ ...t, graph: result.dataGraph })); } diff --git a/packages/random-sampling/src/prover.ts b/packages/random-sampling/src/prover.ts index c79485fbf..708c6e206 100644 --- a/packages/random-sampling/src/prover.ts +++ b/packages/random-sampling/src/prover.ts @@ -333,9 +333,22 @@ export class RandomSamplingProver { const expectedRoot = await this.chain.getLatestMerkleRoot(kcId); const expectedLeafCount = await this.chain.getMerkleLeafCount(kcId); + // OT-RFC-40 §7.5: when the chain adapter exposes a KC storage + // registry, map `Challenge.knowledgeCollectionStorageContract` + // (an address) back to the storage's UAL tag. The extractor uses + // this to disambiguate batchId collisions in CGs that hold KCs + // from multiple storage versions (V9 + V10 today; arbitrary + // versions in the future). Adapters without a registry still + // produce a working prover by falling through to the legacy + // "match the first UAL" behaviour — bit-for-bit pre-RFC. + const expectedStorageTag = this.chain.kcStorageRegistry + ?.tagFor(challenge.knowledgeCollectionStorageContract); + let leaves: Uint8Array[]; try { - const extracted = await extractV10KCFromStore(this.store, cgId, kcId); + const extracted = await extractV10KCFromStore(this.store, cgId, kcId, { + expectedStorageTag, + }); leaves = extracted.leaves; } catch (err) { if (err instanceof KCNotFoundError || err instanceof KCDataMissingError) { diff --git a/packages/random-sampling/test/kc-extractor-multi-storage.test.ts b/packages/random-sampling/test/kc-extractor-multi-storage.test.ts new file mode 100644 index 000000000..96481fca8 --- /dev/null +++ b/packages/random-sampling/test/kc-extractor-multi-storage.test.ts @@ -0,0 +1,188 @@ +/** + * OT-RFC-40 PR-5 — kc-extractor multi-storage disambiguation. + * + * Pin the contract that consumers can disambiguate same-batchId KCs + * across different storage instances (V9 KAS + V10 KCS, today; any + * tagged storage in the future) by passing `expectedStorageTag`. + * + * Without this filter, a CG that holds KCs from multiple storage + * versions and an on-chain `Challenge` for kcId=5 would have the + * extractor pick whichever UAL the SPARQL engine enumerates first — + * potentially the wrong one — and the prover would compute leaves + * for the wrong KC, fail merkle-root verification, and miss the + * proof period. + * + * The default behaviour (no `expectedStorageTag`) is preserved + * bit-for-bit: take the first match. This is correct for the + * overwhelming majority of CGs which only hold KCs from one storage. + */ +import { describe, it, expect, beforeEach } from 'vitest'; +import { OxigraphStore, type Quad } from '@origintrail-official/dkg-storage'; +import { + contextGraphDataUri, + contextGraphMetaUri, + kcUal, +} from '@origintrail-official/dkg-core'; +import { + extractV10KCFromStore, + KCNotFoundError, +} from '../src/index.js'; + +const DKG = 'http://dkg.io/ontology/'; +const XSD = 'http://www.w3.org/2001/XMLSchema#'; +const RDF = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'; +const ONTOLOGY_GRAPH = 'did:dkg:context-graph:ontology'; +const CONTEXT_GRAPH_ON_CHAIN_ID = 'https://dkg.network/ontology#ContextGraphOnChainId'; + +const PUBLISHER = '0xA1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1'; +const CHAIN_ID = 'mock:31337'; +// Sentinel sentinels: each storage's KC seeds a single-leaf KA whose +// rootEntity is unique to that storage so the produced leaves are +// different. The prover's leaf count is hash-based so a leak between +// storages would surface as a leaf-set mismatch. +const ROOT_V10 = 'urn:entity:rfc40-pr5-v10-root'; +const ROOT_V9 = 'urn:entity:rfc40-pr5-v9-root'; + +async function seedOntology(store: OxigraphStore, cgName: string, cgId: bigint): Promise { + await store.insert([ + { + subject: `did:dkg:context-graph:${cgName}`, + predicate: CONTEXT_GRAPH_ON_CHAIN_ID, + object: `"${cgId.toString()}"`, + graph: ONTOLOGY_GRAPH, + }, + ]); +} + +interface KCSeed { + ual: string; + kcId: bigint; + rootEntity: string; + publicTriple: { subject: string; predicate: string; object: string }; +} + +async function seedKC( + store: OxigraphStore, + cgName: string, + cgId: bigint, + seed: KCSeed, +): Promise { + const cgIdStr = cgId.toString(); + const metaGraph = contextGraphMetaUri(cgName, cgIdStr); + const dataGraph = contextGraphDataUri(cgName, cgIdStr); + + const kaUri = `${seed.ual}/1`; + const metaQuads: Quad[] = [ + { + subject: seed.ual, + predicate: `${RDF}type`, + object: `${DKG}KnowledgeCollection`, + graph: metaGraph, + }, + { + subject: seed.ual, + predicate: `${DKG}batchId`, + object: `"${seed.kcId}"^^<${XSD}integer>`, + graph: metaGraph, + }, + { subject: kaUri, predicate: `${RDF}type`, object: `${DKG}KnowledgeAsset`, graph: metaGraph }, + { subject: kaUri, predicate: `${DKG}partOf`, object: seed.ual, graph: metaGraph }, + { subject: kaUri, predicate: `${DKG}rootEntity`, object: seed.rootEntity, graph: metaGraph }, + ]; + await store.insert(metaQuads); + await store.insert([{ ...seed.publicTriple, graph: dataGraph }]); +} + +describe('extractV10KCFromStore — RFC-40 PR-5 storage-tag filter', () => { + let store: OxigraphStore; + const cgId = 7n; + const cgName = `cg-${cgId.toString()}`; + const kcId = 5n; + + beforeEach(async () => { + store = new OxigraphStore(); + await seedOntology(store, cgName, cgId); + + // Both storages have a KC at the same batchId — the exact + // collision the storage-tag filter exists to disambiguate. + await seedKC(store, cgName, cgId, { + ual: kcUal(CHAIN_ID, PUBLISHER, kcId), // 3-segment / V10 default + kcId, + rootEntity: ROOT_V10, + publicTriple: { subject: ROOT_V10, predicate: 'urn:p:label', object: '"v10"' }, + }); + await seedKC(store, cgName, cgId, { + ual: kcUal(CHAIN_ID, PUBLISHER, kcId, 'v9'), // 4-segment / V9-tagged + kcId, + rootEntity: ROOT_V9, + publicTriple: { subject: ROOT_V9, predicate: 'urn:p:label', object: '"v9"' }, + }); + }); + + it('with expectedStorageTag="" returns the V10 default-storage UAL', async () => { + const result = await extractV10KCFromStore(store, cgId, kcId, { + expectedStorageTag: '', + }); + expect(result.ual).toBe(kcUal(CHAIN_ID, PUBLISHER, kcId)); + expect(result.rootEntities).toEqual([ROOT_V10]); + expect(result.triples).toHaveLength(1); + expect(result.triples[0].subject).toBe(ROOT_V10); + }); + + it('with expectedStorageTag="v9" returns the V9-tagged UAL', async () => { + const result = await extractV10KCFromStore(store, cgId, kcId, { + expectedStorageTag: 'v9', + }); + expect(result.ual).toBe(kcUal(CHAIN_ID, PUBLISHER, kcId, 'v9')); + expect(result.rootEntities).toEqual([ROOT_V9]); + expect(result.triples).toHaveLength(1); + expect(result.triples[0].subject).toBe(ROOT_V9); + }); + + it('with no expectedStorageTag (legacy behaviour) returns whichever UAL the store enumerates first', async () => { + // Pre-RFC behaviour. The exact choice is arbitrary, but the + // operation MUST still succeed in single-storage CGs (every + // deployment today). We only assert that one of the two seeded + // UALs comes back; the test does not pin the order. + const result = await extractV10KCFromStore(store, cgId, kcId); + const expected = new Set([ + kcUal(CHAIN_ID, PUBLISHER, kcId), + kcUal(CHAIN_ID, PUBLISHER, kcId, 'v9'), + ]); + expect(expected.has(result.ual)).toBe(true); + }); + + it('with an expectedStorageTag matching no UAL throws KCNotFoundError', async () => { + // Conservative failure mode: if the prover's challenge points at + // a storage we don't have data for in this CG, treat it as "kc + // not synced" rather than fall through to an unrelated UAL. + await expect( + extractV10KCFromStore(store, cgId, kcId, { expectedStorageTag: 'v11' }), + ).rejects.toBeInstanceOf(KCNotFoundError); + }); +}); + +describe('extractV10KCFromStore — RFC-40 PR-5 single-storage CGs unchanged', () => { + // Sanity check: for the overwhelming majority of CGs (single + // storage), passing `expectedStorageTag` should be a no-op vs + // leaving it undefined. Establishes that the extractor's filter + // path doesn't introduce false negatives when there's no collision. + it('returns the only KC regardless of whether a matching tag is passed', async () => { + const store = new OxigraphStore(); + const cgId = 11n; + const cgName = `cg-${cgId.toString()}`; + const kcId = 99n; + await seedOntology(store, cgName, cgId); + await seedKC(store, cgName, cgId, { + ual: kcUal(CHAIN_ID, PUBLISHER, kcId), + kcId, + rootEntity: ROOT_V10, + publicTriple: { subject: ROOT_V10, predicate: 'urn:p:k', object: '"v"' }, + }); + + const noFilter = await extractV10KCFromStore(store, cgId, kcId); + const explicit = await extractV10KCFromStore(store, cgId, kcId, { expectedStorageTag: '' }); + expect(noFilter.ual).toBe(explicit.ual); + expect(noFilter.leaves).toEqual(explicit.leaves); + }); +}); From 177d25e72f16062bb5e2c91c73fd8d1a21869b1c Mon Sep 17 00:00:00 2001 From: Branimir Rakic Date: Wed, 27 May 2026 01:34:57 +0200 Subject: [PATCH 07/12] docs(rfc40): operator guide for multi-storage KC URI scheme (RFC-40 PR-6) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit PR-6 of OT-RFC-40 wraps up the work by giving operators (and future maintainers) a single page that answers the questions the scheme introduces: what a UAL tag is, how to read one, how to add a new storage version, when (and crucially when *not*) to bump `chainResetMarker` now that storage redeploys are non-destructive, and a small failure-mode triage table for the new code paths. `docs/STORAGE_VERSION_TAGS.md` is the new entry point; it cross-links to the implementation files in core/chain/random-sampling and back to the RFC for design rationale. `docs/TESTNET_RESET.md` gets a top-of-doc warning and a footer link so operators reaching for that runbook on a storage-only redeploy land on the right doc instead. `packages/cli/src/daemon/chain-reset-wipe.ts`'s header comment gets a new "Relationship to OT-RFC-40" section explaining that the auto-wipe hook is correct for non-storage chain-identity changes and incorrect for storage redeploys (which should add a tagged storage instead). The RFC itself is moved from Draft → Implemented and gains a link to the operator guide. Co-authored-by: Cursor --- docs/RFC40_MULTI_STORAGE_KC_URI_SCHEME.md | 4 +- docs/STORAGE_VERSION_TAGS.md | 176 ++++++++++++++++++++ docs/TESTNET_RESET.md | 18 ++ packages/cli/src/daemon/chain-reset-wipe.ts | 14 ++ 4 files changed, 210 insertions(+), 2 deletions(-) create mode 100644 docs/STORAGE_VERSION_TAGS.md diff --git a/docs/RFC40_MULTI_STORAGE_KC_URI_SCHEME.md b/docs/RFC40_MULTI_STORAGE_KC_URI_SCHEME.md index e78aa1a02..13375d3ba 100644 --- a/docs/RFC40_MULTI_STORAGE_KC_URI_SCHEME.md +++ b/docs/RFC40_MULTI_STORAGE_KC_URI_SCHEME.md @@ -1,8 +1,8 @@ # OT-RFC-40 — Multi-storage knowledge-collection URI scheme -**Status:** Draft +**Status:** Implemented (PRs 1-6 landed on `rfc40/multi-storage-kc-uri`) **Companion:** [GitHub issue #679 — chain-reset auto-wipe](https://github.com/OriginTrail/dkg/issues/679) -**Related:** [`docs/TESTNET_RESET.md`](./TESTNET_RESET.md), [`packages/cli/src/daemon/chain-reset-wipe.ts`](../packages/cli/src/daemon/chain-reset-wipe.ts) +**Related:** [`docs/STORAGE_VERSION_TAGS.md`](./STORAGE_VERSION_TAGS.md) (operator guide), [`docs/TESTNET_RESET.md`](./TESTNET_RESET.md), [`packages/cli/src/daemon/chain-reset-wipe.ts`](../packages/cli/src/daemon/chain-reset-wipe.ts) ## 1. Summary diff --git a/docs/STORAGE_VERSION_TAGS.md b/docs/STORAGE_VERSION_TAGS.md new file mode 100644 index 000000000..931ecef06 --- /dev/null +++ b/docs/STORAGE_VERSION_TAGS.md @@ -0,0 +1,176 @@ +# Storage Version Tags — operator guide + +**Status:** stable as of OT-RFC-40 (PR-1 → PR-5) +**Companion:** [`docs/RFC40_MULTI_STORAGE_KC_URI_SCHEME.md`](./RFC40_MULTI_STORAGE_KC_URI_SCHEME.md), [`docs/TESTNET_RESET.md`](./TESTNET_RESET.md) + +## TL;DR + +Every Knowledge Collection UAL embeds the **storage instance** that +minted it. That tag is what makes V9 and V10 KCs coexist on the same +Hub today, and it is what will make V11, V12, and any future storage +upgrade additive (no chain-state wipe, no data loss) instead of +destructive. + +You will see two valid UAL forms in the wild and in your own logs: + +``` +Default storage (legacy, V10 today): did:dkg:{chainId}/{publisher}/{startKAId} +Tagged storage (V9 today, V11+...): did:dkg:{tag}/{chainId}/{publisher}/{startKAId} +``` + +`{tag}` is `[a-z0-9-]+`. A KC's tag is **forever** — once a KC is minted +its UAL will never re-tag. + +## What the tag is, in one sentence + +A storage instance's `uri(0)` ERC-1155 view returns its `uriBase` +(`did:dkg`, `did:dkg:v9`, `did:dkg:v11`, …). The agent strips the +common `did:dkg:` prefix; whatever remains is the tag (empty for the +default storage). The tag is the storage's URI suffix, so you can read +it directly off-chain at any time. + +## Why this matters to operators + +Three things change on the operator-visible surface: + +1. **Logs now show 4-segment UALs for V9 KAs.** This is normal: + `did:dkg:v9/base:84532/0xA1.../12345`. Treat it identically to a + V10 UAL — the daemon already routes correctly. +2. **The `chain-reset-wipe` hook is no longer the right answer for + storage-only redeploys.** Bumping `network/.json#chainResetMarker` + still works, but on any network where a new KC storage version was + deployed alongside the old one, the right cutover is "register the + new storage with a new `uriBase`" — old data keeps resolving to old + storage, new data uses new storage, no wipe needed. Reach for the + marker only when actual chain entities (CG ids, identity ids) are + redeployed too. See "When to bump the marker" below. +3. **There is no per-operator action for the multi-storage scheme to + take effect.** PR-3's registry runs at agent boot; it discovers + every registered KC-class storage on the Hub automatically and + refreshes on `Hub.NewAssetStorage` / `AssetStorageChanged` events. + +## Reading a UAL + +``` +did:dkg:base:84532/0xA1.../12345 + └── chainId └── tokenId on the default storage's contract + +did:dkg:v9/base:84532/0xA1.../12345 + └── tag └── tokenId on the V9 KAS contract + └── chainId +``` + +Helper in code: `parseUal(ual)` from `@origintrail-official/dkg-core`. +It returns `{ chainId, storageTag, publisherAddress, startKAId }` or +`null` for malformed input. Both 3- and 4-segment forms parse without +ambiguity. + +## How a new storage tag goes live + +For maintainers / contracts deployers (operators do nothing). + +1. Deploy the new storage (e.g. `KnowledgeCollectionStorageV11`) with + a fresh `uriBase` of the form `did:dkg:` — e.g. `did:dkg:v11`. + The tag MUST match `[a-z0-9-]+` (no spaces, no slashes, no `:`). +2. Register it on Hub via `Hub.setAssetStorageContract(name, addr)` or + `Hub.setAndReinitializeContracts(...)`. The Hub emits + `NewAssetStorage` (or `AssetStorageChanged` if rotating an existing + slot). +3. Every running daemon refreshes its `KCStorageRegistry` on those + events. No restart needed — the next publish or random-sampling + challenge against the new storage just works. +4. Subsequent mints into the new storage produce + `did:dkg:/{chainId}/{publisher}/{startKAId}` UALs + automatically — the publisher reads `uri(0)` once at init and + stamps every UAL with that tag. + +The old storage stays registered; old data keeps resolving. There is +no migration step. + +## What never changes about a tag + +- **The default storage's tag is empty, forever.** A storage whose + `uriBase` is exactly `did:dkg` produces 3-segment UALs; we will not + promote a different storage to be "the new default" because that + would require rewriting every legacy UAL. Keep the V10 storage at + `did:dkg`, and add V11, V12, … as tagged peers. +- **Tags are not retired.** If you need to take a storage out of + rotation (e.g. for safety), keep it deployed and just stop minting + into it. Existing UALs against that tag must continue to resolve. +- **Two storages MUST NOT share a tag.** The registry indexes by tag; + collision means one of them is unreachable. The deploy pipeline + should reject duplicate `uriBase` values; if you ever observe a + collision in `dkg_query`, that's a contracts-side bug, not a daemon + bug. + +## How resolution works (for the curious) + +``` +Publish path (mint) Resolution path (read / verify) +────────────────── ────────────────────────────── +publisher.publish() kc-extractor.extractV10KC(...) + ↓ ↓ +ChainAdapter.mintingStorageTag parseUal(ual).storageTag + (read from KCS.uri(0) ↓ + once at init, cached) KCStorageRegistry.getByTag(tag) + ↓ ↓ +kcUal(chainId, pub, id, tag) storage contract address + ↓ ↓ +4-segment if tag != '' query that contract + 3-segment otherwise (publisher / merkle root / range) +``` + +Random sampling is the most subtle case: the chain-side +`Challenge.knowledgeCollectionStorageContract` is the address of the +storage that holds the challenged KC. The prover passes that address +through `KCStorageRegistry.tagFor(addr)` to get the storage tag, then +filters its meta-graph lookup by that tag — so a prover holding two +KCs with the same `kcId` (one V9, one V10) attests to the right one. + +## When to bump `chainResetMarker` after this RFC + +You're bumping the marker because **non-storage** chain entities have +been replaced and the daemon's per-node state (publish journal, +random-sampling WAL) references entities that no longer exist: + +- Hub redeploy (new chain identity altogether): **bump.** +- IdentityStorage / Profile redeploy: **bump.** +- Context graph storage redeploy *with new ownership*: **bump.** +- KC storage v→v+1 redeploy (new `uriBase`, both registered): **don't + bump.** Old data is still valid; the multi-storage scheme handles it. +- KC storage same-tag redeploy (a "fix-forward" that reuses + `uri(0)`): **bump if and only if existing data on the old contract + is being abandoned.** Otherwise prefer redeploying as a new tag. + +The auto-wipe hook itself is unchanged by this RFC — it remains the +right tool when chain identity actually changes. + +## Failure modes & how to debug them + +| Symptom | First check | +|---|---| +| Publish gets a "publisher does not own UAL range …" rejection on a default-storage UAL | The receiver is on a Hub without a V9 KAS deployment. The V10 default path defers to ACK auth and should NOT be hitting the V9 `getPublisherRange*` API. If it is, the receiver hasn't picked up RFC-40 PR-5; upgrade. | +| Random-sampling proof fails with "no KC found" but the KC is in the local store | The prover may be looking for the wrong storage's KC. Check that `KCStorageRegistry.tagFor(challenge.knowledgeCollectionStorageContract)` returns a non-undefined tag. If undefined, the challenged storage isn't in the registry — refresh by triggering a `Hub.NewAssetStorage` event or restarting the daemon. | +| `parseUal()` returns `null` for what looks like a valid UAL | The publisher address must match `^0x[0-9a-fA-F]{40}$` — the parser intentionally rejects "UALs" that are actually CG data URIs (`did:dkg:context-graph:...`) or other DID forms. If your UAL doesn't have a valid 40-hex-character publisher segment, it's not a KC UAL. | +| Daemon log shows a UAL with an unfamiliar tag | This is fine — tags are forever. Look it up: `dkg query 'PREFIX hub: <…> SELECT ?addr WHERE { ?addr code:uriBase "did:dkg:" }'` or read `Hub.getAssetStorageAddress(...)` directly. | + +## Cross-references + +- [`packages/core/src/constants.ts`](../packages/core/src/constants.ts): + `kcUal()`, `parseUal()`, `publisherAddressFromUal()`, + `STORAGE_TAG_PATTERN`. +- [`packages/chain/src/kc-storage-registry.ts`](../packages/chain/src/kc-storage-registry.ts): + `KCStorageRegistry`, `deriveStorageTag()`. +- [`packages/chain/src/evm-adapter.ts`](../packages/chain/src/evm-adapter.ts): + `mintingStorageTag` field, `kcStorageRegistry` field, Hub event + listeners that refresh the registry, tag-aware + `verifyPublisherOwnsRange()`. +- [`packages/random-sampling/src/kc-extractor.ts`](../packages/random-sampling/src/kc-extractor.ts): + `extractV10KCFromStore({ expectedStorageTag })`. +- [`packages/cli/src/daemon/chain-reset-wipe.ts`](../packages/cli/src/daemon/chain-reset-wipe.ts): + the auto-wipe hook (unchanged by this RFC; still the right tool for + non-storage chain-identity changes). +- [`docs/TESTNET_RESET.md`](./TESTNET_RESET.md): full reset runbook + (Phases A-D). +- [`docs/RFC40_MULTI_STORAGE_KC_URI_SCHEME.md`](./RFC40_MULTI_STORAGE_KC_URI_SCHEME.md): + the design rationale, alternatives considered, and PR sequencing. diff --git a/docs/TESTNET_RESET.md b/docs/TESTNET_RESET.md index 442be9a0b..e5f995224 100644 --- a/docs/TESTNET_RESET.md +++ b/docs/TESTNET_RESET.md @@ -6,6 +6,19 @@ contract layout shipped in PR #357. Covers the three roles involved actually has to do — most of the operator-facing pain is handled by the daemon's built-in auto-update + supervised-restart. +> **Storage-only redeploys do NOT need this runbook.** +> OT-RFC-40 (see [`docs/STORAGE_VERSION_TAGS.md`](./STORAGE_VERSION_TAGS.md) +> and [`docs/RFC40_MULTI_STORAGE_KC_URI_SCHEME.md`](./RFC40_MULTI_STORAGE_KC_URI_SCHEME.md)) +> formalises the URI scheme that lets V9, V10, V11, … KC storages +> coexist on the same Hub. Adding a new KC storage version is **not** +> a chain reset: register the new storage with a fresh `uriBase` (e.g. +> `did:dkg:v11`), keep the old one deployed, and existing data keeps +> resolving without wiping anyone's `store.nq`. Use this runbook only +> when actual chain entities (Hub, IdentityStorage, the V10 default +> KC storage being abandoned) are being replaced. See +> STORAGE_VERSION_TAGS.md "When to bump `chainResetMarker` after this +> RFC" for the full decision table. + The reset is the simplest cutover path because it lets us drop V8 `Staking` + `DelegatorsInfo` + the dual-store coupling completely instead of running a wholesale state migration. Tradeoff: any node-side @@ -295,3 +308,8 @@ staking + RS pipeline in under a minute. - `scripts/devnet-test-random-sampling.sh` — the smoke test invoked in Phase D (works against any RPC + auth token, not devnet-only). - `docs/RELEASE.md` — the npm + GitHub release process used in Phase A. +- `docs/STORAGE_VERSION_TAGS.md` — operator-facing summary of the + multi-storage URI scheme that makes storage-only redeploys non- + destructive (and therefore non-runbook events). +- `docs/RFC40_MULTI_STORAGE_KC_URI_SCHEME.md` — the design rationale + behind the URI scheme. diff --git a/packages/cli/src/daemon/chain-reset-wipe.ts b/packages/cli/src/daemon/chain-reset-wipe.ts index e613335b2..0a4ab41ee 100644 --- a/packages/cli/src/daemon/chain-reset-wipe.ts +++ b/packages/cli/src/daemon/chain-reset-wipe.ts @@ -50,6 +50,20 @@ * Per the runbook contract: keystore stays so the wallet identity is * constant across resets, and `ensureProfile` re-derives the on-chain * identityId on the new chain cleanly. + * + * Relationship to OT-RFC-40 (multi-storage KC URIs) + * -------------------------------------------------- + * RFC-40 (`docs/RFC40_MULTI_STORAGE_KC_URI_SCHEME.md`, + * `docs/STORAGE_VERSION_TAGS.md`) makes a *storage* redeploy non- + * destructive: when V11/V12/etc KC storages are added alongside V10's + * default storage, every existing UAL keeps resolving to its original + * storage and no on-disk wipe is necessary. **For storage-only + * redeploys, do not bump `chainResetMarker`** — register the new + * storage with a fresh `uriBase` (e.g. `did:dkg:v11`) on the Hub + * instead. The auto-wipe hook below remains the right tool only for + * non-storage chain-identity changes (Hub redeploy, IdentityStorage + * redeploy, full testnet reset). See STORAGE_VERSION_TAGS.md "When to + * bump `chainResetMarker` after this RFC" for the full decision table. */ import { existsSync, readFileSync, writeFileSync, readdirSync, rmSync } from 'node:fs'; import { join } from 'node:path'; From 74d418886f6e312eae000a382f208f2e07d52665 Mon Sep 17 00:00:00 2001 From: Branimir Rakic Date: Wed, 27 May 2026 02:20:27 +0200 Subject: [PATCH 08/12] fix(rfc-40): address Codex review on PR #718 (4 substantive findings) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Closes the four substantive bugs Codex Review flagged on PR #718: (Comment 1, 🔴, packages/publisher/src/publish-handler.ts:608 → fix in core/src/constants.ts) — `parseUal` was strict-rejecting any UAL with more than 4 segments after `did:dkg:`. The pre-RFC-40 `verifyUALConsistency` extracted `segments[2]` directly, which implicitly tolerated KA-suffixed subjects like `did:dkg:base:84532/0xPub/123/7` (denoting KA #7 inside KC #123). Switching to the strict parser made the range check silently SKIP those subjects, letting a PublishRequest hide out-of-range references by appending a path segment. Fix: relax `parseUal` to accept ≥3 segments and discard trailing per-KA suffix bytes; disambiguate the 3- and 4-segment forms by content (`STORAGE_TAG_PATTERN.test(segments[0]) && PUBLISHER_ADDRESS_PATTERN.test(segments[2])` → tagged form, otherwise default form). Tests in `core/test/constants.test.ts` updated and extended with KA-suffix coverage. (Comment 2, 🔴, packages/random-sampling/src/prover.ts:344) — when the chain adapter exposes a `kcStorageRegistry` but `tagFor(challenge.knowledgeCollectionStorageContract)` returns undefined (stale cache, race vs Hub event, partial init failure), the prover silently fell through to the legacy "first batchId match wins" path. In a mixed-storage CG that could prove against the WRONG KC instead of failing fast. Fix: try a single registry refresh (covers the legitimate stale-cache case), and if the storage stays unknown afterwards, fail-closed by writing a `kcs-registry-miss` WAL entry and returning `kc-not-synced`. Adapters without a registry at all keep the legacy unfiltered behaviour because they explicitly opted out of tag-aware proving by not exposing a registry. (Comment 3, 🔴, packages/chain/src/evm-adapter.ts:1518) — `verifyPublisherOwnsRange` returned `true` for ANY tag the registry recognised, hard-coding the assumption that every future tagged storage authenticates publishes the same way V10 default storage does (ACK signatures). If V11+ ships a different ownership API, receivers would silently accept spoofed ranges as soon as the new storage is registered. Fix: only the explicit V10 default (empty tag) and `'v9'` get yes/no answers; everything else fails closed. A new test `verifyPublisherOwnsRange fails closed for a registered-but-unrecognised tag` pins the contract by injecting a synthetic `'v11-future'` registry entry and asserting `false`. (Comment 4, 🟡, packages/chain/src/evm-adapter.ts:2945) — if the initial `buildKCStorageRegistry()` call in `init()` failed (e.g. transient RPC error during boot), `kcStorageRegistry` stayed `undefined` and the Hub event listener silently no-op'd forever, leaving tagged-storage resolution broken until process restart. Fix: the asset-storage event handler now uses the first event as a recovery trigger — if the registry is still undefined, build it from scratch; otherwise refresh. All four fixes are scoped strictly to the bugs Codex flagged. No public API changes; no behavioural changes for paths that already worked. The full test sweep is green: core: 953/953 (+2 KA-suffix tests) chain: 417/417 (+1 fail-closed registered-tag test) random-sampling: 54/54 (+2 multi-storage routing tests) publisher: 1050/1056 (6 pre-existing skips) Co-authored-by: Cursor --- packages/chain/src/evm-adapter.ts | 47 ++++--- packages/chain/test/evm-adapter.test.ts | 37 ++++++ packages/core/src/constants.ts | 56 ++++++-- packages/core/test/constants.test.ts | 44 ++++++- packages/random-sampling/src/prover.ts | 59 ++++++++- packages/random-sampling/test/prover.test.ts | 128 +++++++++++++++++++ 6 files changed, 332 insertions(+), 39 deletions(-) diff --git a/packages/chain/src/evm-adapter.ts b/packages/chain/src/evm-adapter.ts index de3287890..b6f814af5 100644 --- a/packages/chain/src/evm-adapter.ts +++ b/packages/chain/src/evm-adapter.ts @@ -1501,21 +1501,18 @@ export class EVMChainAdapter implements ChainAdapter { return true; } - // Unknown tag: the registry has not seen this storage. Conservative - // failure: refuse to attest range ownership for an unknown storage - // instance. Operators see the rejection message and can either - // refresh the registry or investigate why a UAL was minted under - // a tag the receiver doesn't recognise. - if (!this.kcStorageRegistry?.getByTag(tag)) { - return false; - } - - // Future tagged storages might re-introduce a V9-style range API. - // When that happens, this branch can switch on the storage's hubName - // to route to the appropriate query. Today, the conservative answer - // for any tag we DO recognise but don't have a range API for is - // identical to V10's: defer to the publish-handler's downstream auth. - return true; + // Codex review on PR #718 (Comment 3): every other tag — whether + // the registry recognises it or not — fails closed. The previous + // implementation returned `true` for any tag the registry knew + // about, which silently assumed every future storage authenticates + // publishes via ACK signatures. If V11+ ships a different + // ownership API, that fallback would let receivers accept spoofed + // ranges as soon as the new storage is registered. We refuse to + // attest range ownership without an explicit per-tag policy here: + // when V11+ lands, add an `if (tag === 'v11') { ... }` branch + // (or carry an `authMode` field on KCStorageEntry and switch on + // it) before this fallthrough. + return false; } // ===================================================================== @@ -2941,8 +2938,26 @@ export class EVMChainAdapter implements ChainAdapter { // names). Listening unconditionally avoids races where a future // version naming convention is added but we forget to update this // allowlist; the registry's filter is the single source of truth. + // + // Codex review on PR #718 (Comment 4): if the initial + // `buildKCStorageRegistry()` call in `init()` failed (e.g. + // transient RPC error during boot), `kcStorageRegistry` stays + // `undefined` and a plain refresh would silently no-op forever. + // Use the first asset-storage event as a recovery trigger: if + // there is no registry yet, build one from scratch; otherwise + // refresh the existing instance. Either path picks up the new + // storage immediately. const onAssetStorageChange = (): void => { - if (this.kcStorageRegistry === undefined) return; + if (this.kcStorageRegistry === undefined) { + void this.buildKCStorageRegistry() + .then((registry) => { + this.kcStorageRegistry = registry; + }) + .catch((err) => console.warn( + `[EVMChainAdapter] KC storage registry rebuild after Hub event failed: ${err instanceof Error ? err.message : String(err)}`, + )); + return; + } void this.kcStorageRegistry .refresh() .catch((err) => console.warn( diff --git a/packages/chain/test/evm-adapter.test.ts b/packages/chain/test/evm-adapter.test.ts index 5b5d4f755..0f3f805f7 100644 --- a/packages/chain/test/evm-adapter.test.ts +++ b/packages/chain/test/evm-adapter.test.ts @@ -1,5 +1,6 @@ import { describe, it, expect, beforeAll, afterAll } from 'vitest'; import { EVMChainAdapter } from '../src/evm-adapter.js'; +import { KCStorageRegistry } from '../src/kc-storage-registry.js'; import { spawnHardhatEnv, killHardhat, @@ -83,4 +84,40 @@ describe('EVMChainAdapter integration', () => { const owns = await adapter.verifyPublisherOwnsRange(deployer, 1n, 1n, 'nonexistent-tag'); expect(owns).toBe(false); }, 30_000); + + it('verifyPublisherOwnsRange fails closed for a registered-but-unrecognised tag (Codex review on PR #718, Comment 3)', async () => { + // Even when the registry KNOWS a tag (e.g. a hypothetical V11 KC + // storage was registered on the Hub and the registry refreshed), + // verifyPublisherOwnsRange must NOT silently attest range + // ownership unless this adapter has an explicit per-tag policy + // (today: empty tag → V10 ACK auth, "v9" → V9 KAS pre-reservation + // API). A blanket "registered → true" fallback would let a future + // V11 deploy whose ownership API differs from V10 silently accept + // spoofed ranges. Pin the fail-closed contract here. + const adapter = new EVMChainAdapter(makeAdapterConfig(ctx.rpcUrl, ctx.hubAddress, HARDHAT_KEYS.DEPLOYER)); + const deployer = adapter.getSignerAddress(); + // Force adapter init so the registry is populated from the live Hub. + await adapter.verifyPublisherOwnsRange(deployer, 1n, 1n); + // Swap the live registry for a fake that surfaces a hypothetical + // V11 storage. Replacing the whole registry (vs reaching into + // private fields) is the public-API path: ChainAdapter exposes + // `kcStorageRegistry` as a writeable field for exactly this reason + // — adapters can plug in custom registries (mocks, multi-Hub, + // etc) without subclassing. + const fakeAddress = '0x0000000000000000000000000000000000001111'; + adapter.kcStorageRegistry = new KCStorageRegistry( + { + getAllAssetStorages: async () => [ + { name: 'KnowledgeCollectionStorageV11', addr: fakeAddress }, + ], + }, + { + readUriBase: async () => 'did:dkg:v11-future', + }, + ); + await adapter.kcStorageRegistry.refresh(); + expect(adapter.kcStorageRegistry.getByTag('v11-future')?.address).toBe(fakeAddress); + const owns = await adapter.verifyPublisherOwnsRange(deployer, 1n, 1n, 'v11-future'); + expect(owns).toBe(false); + }, 30_000); }); diff --git a/packages/core/src/constants.ts b/packages/core/src/constants.ts index af0ba6623..c778c86b4 100644 --- a/packages/core/src/constants.ts +++ b/packages/core/src/constants.ts @@ -249,44 +249,76 @@ export interface ParsedUal { * Accepts both shapes the protocol mints today (RFC §5.2): * * 3-segment / default-storage: did:dkg:{chainId}/{pub}/{localId} - * 4-segment / storage-tagged: did:dkg:{chainId}/{tag}/{pub}/{localId} + * 4-segment / storage-tagged: did:dkg:{tag}/{chainId}/{pub}/{localId} + * + * Trailing path segments after the UAL prefix are tolerated and + * discarded — KC subjects in store.nq sometimes carry per-KA suffixes + * (e.g. `did:dkg:base:84532/0xPub/123/7` to denote KA #7 inside KC + * #123). The pre-RFC-40 `verifyUALConsistency` did this implicitly by + * indexing `segments[2]`; preserving the behaviour here means range + * checks keep firing on those suffixed subjects instead of silently + * skipping (Codex review on PR #718). + * + * Disambiguation between the 3- and 4-segment forms when more than 3 + * segments are present: + * + * - If `segments[0]` matches `STORAGE_TAG_PATTERN` AND `segments[2]` + * matches `PUBLISHER_ADDRESS_PATTERN` → tagged form (and + * `segments[3]` is the local id). + * - Else if `segments[1]` matches `PUBLISHER_ADDRESS_PATTERN` → + * default-storage form (`segments[0]` chainId, `segments[2]` + * local id, anything beyond that is the per-KA suffix). + * - Else → null (unrecognised shape). + * + * The `PUBLISHER_ADDRESS_PATTERN` check is what disambiguates real + * UALs from same-prefix CG data URIs (`did:dkg:context-graph:...`). * * Returns `null` for any input that doesn't start with `did:dkg:` or - * whose segment count is not exactly 3 or 4 — callers (e.g. the + * whose shape doesn't match either form above — callers (e.g. the * publish-handler range check) treat `null` as "not a UAL we own, * skip" rather than "validation error". Malformed inputs that happen * to match the prefix-and-shape but carry a non-numeric local-id slot * are returned with `startKAId: null` rather than rejected, because * the tentative-publish path uses synthetic string IDs of the form * `t` until the chain confirms. - * - * Validation of the storage tag against `STORAGE_TAG_PATTERN` IS - * enforced — a malformed tag (e.g. embedded `/` or `:`) is treated as - * "not a valid UAL" and returns `null`. */ export function parseUal(ual: string | undefined | null): ParsedUal | null { if (typeof ual !== 'string') return null; if (!ual.startsWith(DID_DKG_PREFIX)) return null; const segments = ual.slice(DID_DKG_PREFIX.length).split('/'); + if (segments.length < 3) return null; + let chainId: string; let storageTag: string; let publisherAddress: string; let localIdSegment: string; - if (segments.length === 3) { - [chainId, publisherAddress, localIdSegment] = segments; + // Disambiguation order: tagged form is checked first because its + // detector is strictly stronger (requires segments[0] to match + // STORAGE_TAG_PATTERN, which CAIP-2 chainIds like `base:84532` fail + // because they contain ':'). Default-storage detection is the + // fallback. + if ( + segments.length >= 4 && + STORAGE_TAG_PATTERN.test(segments[0]) && + PUBLISHER_ADDRESS_PATTERN.test(segments[2]) + ) { + storageTag = segments[0]; + chainId = segments[1]; + publisherAddress = segments[2]; + localIdSegment = segments[3]; + } else if (PUBLISHER_ADDRESS_PATTERN.test(segments[1])) { storageTag = ''; - } else if (segments.length === 4) { - [storageTag, chainId, publisherAddress, localIdSegment] = segments; - if (!STORAGE_TAG_PATTERN.test(storageTag)) return null; + chainId = segments[0]; + publisherAddress = segments[1]; + localIdSegment = segments[2]; } else { return null; } if (chainId.length === 0) return null; if (localIdSegment.length === 0) return null; - if (!PUBLISHER_ADDRESS_PATTERN.test(publisherAddress)) return null; let startKAId: bigint | null; try { diff --git a/packages/core/test/constants.test.ts b/packages/core/test/constants.test.ts index 8ce7716a8..f0cbf262c 100644 --- a/packages/core/test/constants.test.ts +++ b/packages/core/test/constants.test.ts @@ -184,19 +184,51 @@ describe('parseUal (OT-RFC-40 §5.2 — handles both 3- and 4-segment forms)', ( expect(parseUal('')).toBeNull(); }); - it('returns null for input segment counts other than 3 or 4', () => { + it('returns null when fewer than 3 segments follow did:dkg:', () => { // 1 or 2 segments — covers CG/data URIs that share the prefix. expect(parseUal('did:dkg:context-graph:agents')).toBeNull(); expect(parseUal('did:dkg:foo/bar')).toBeNull(); - // 5+ segments — extension paths under a UAL. - expect(parseUal('did:dkg:base:84532/0xPub/123/sub/extra')).toBeNull(); + }); + + it('tolerates trailing per-KA suffix segments (default-storage form)', () => { + // store.nq subjects sometimes carry a per-KA index after the kcId + // (e.g. `did:dkg:base:84532/0xPub/123/7` ≡ KA #7 inside KC #123). + // The pre-RFC-40 verifyUALConsistency indexed segments[2] directly, + // so it kept range-checking those subjects. parseUal must preserve + // that behaviour or it would silently skip the check (Codex review + // on PR #718). + const parsed = parseUal( + 'did:dkg:base:84532/0xA1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1/123/7', + ); + expect(parsed).toEqual({ + chainId: 'base:84532', + storageTag: '', + publisherAddress: '0xA1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1', + startKAId: 123n, + }); + }); + + it('tolerates trailing per-KA suffix segments (tagged form)', () => { + const parsed = parseUal( + 'did:dkg:v9/base:84532/0xA1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1/17/3/extra', + ); + expect(parsed).toEqual({ + chainId: 'base:84532', + storageTag: 'v9', + publisherAddress: '0xA1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1', + startKAId: 17n, + }); }); it('returns null when any segment is empty', () => { - expect(parseUal('did:dkg://0xPub/123')).toBeNull(); + expect(parseUal('did:dkg://0xA1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1/123')).toBeNull(); expect(parseUal('did:dkg:base:84532//123')).toBeNull(); - expect(parseUal('did:dkg:base:84532/0xPub/')).toBeNull(); - expect(parseUal('did:dkg:v9//base:84532/0xPub/123')).toBeNull(); + expect(parseUal('did:dkg:base:84532/0xA1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1/')).toBeNull(); + expect( + parseUal( + 'did:dkg:v9//base:84532/0xA1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1/123', + ), + ).toBeNull(); }); it('rejects malformed storage tags (uppercase, colon, special chars)', () => { diff --git a/packages/random-sampling/src/prover.ts b/packages/random-sampling/src/prover.ts index 708c6e206..80534cbf4 100644 --- a/packages/random-sampling/src/prover.ts +++ b/packages/random-sampling/src/prover.ts @@ -338,11 +338,60 @@ export class RandomSamplingProver { // (an address) back to the storage's UAL tag. The extractor uses // this to disambiguate batchId collisions in CGs that hold KCs // from multiple storage versions (V9 + V10 today; arbitrary - // versions in the future). Adapters without a registry still - // produce a working prover by falling through to the legacy - // "match the first UAL" behaviour — bit-for-bit pre-RFC. - const expectedStorageTag = this.chain.kcStorageRegistry - ?.tagFor(challenge.knowledgeCollectionStorageContract); + // versions in the future). + // + // Codex review on PR #718 (Comment 2): if the registry exists but + // does not yet know the challenged storage address (stale cache, + // race between Hub.NewAssetStorage emission and our event listener, + // partial init failure), tagFor returns undefined. Falling through + // to the legacy "first UAL match wins" path can prove the WRONG KC + // in mixed-storage CGs. Try a single registry refresh — that + // covers the legitimate stale-cache case — and fail-closed + // (skip-period as a kc-not-synced miss) if the storage stays + // unknown afterwards. Adapters without a registry at all + // (`kcStorageRegistry` is `undefined`) still get the legacy + // unfiltered behaviour because they explicitly opted out of + // tag-aware proving by not exposing a registry. + let expectedStorageTag: string | undefined; + if (this.chain.kcStorageRegistry) { + expectedStorageTag = this.chain.kcStorageRegistry.tagFor( + challenge.knowledgeCollectionStorageContract, + ); + if (expectedStorageTag === undefined) { + try { + await this.chain.kcStorageRegistry.refresh(); + } catch (err) { + this.log.warn('rs.tick.kcs-registry-refresh-failed', { + err: err instanceof Error ? err.message : String(err), + }); + } + expectedStorageTag = this.chain.kcStorageRegistry.tagFor( + challenge.knowledgeCollectionStorageContract, + ); + } + if (expectedStorageTag === undefined) { + const storageAddress = challenge.knowledgeCollectionStorageContract; + this.log.warn('rs.tick.kcs-registry-miss', { + kcId: kcId.toString(), + cgId: cgId.toString(), + storageContract: storageAddress, + }); + await this.wal.append( + makeWalEntry(periodKey, 'failed', { + kcId: kcId.toString(), + cgId: cgId.toString(), + chunkId: chunkId.toString(), + error: { + code: 'kcs-registry-miss', + message: + `KC storage registry has no entry for challenge storage ${storageAddress}; ` + + `refusing to prove against the wrong KC`, + }, + }), + ); + return { kind: 'kc-not-synced', kcId, cgId }; + } + } let leaves: Uint8Array[]; try { diff --git a/packages/random-sampling/test/prover.test.ts b/packages/random-sampling/test/prover.test.ts index 53eac575b..251c3b00f 100644 --- a/packages/random-sampling/test/prover.test.ts +++ b/packages/random-sampling/test/prover.test.ts @@ -48,6 +48,12 @@ interface FakeChainState { * check inside the prover engages. Tests that omit this fall back * to "stale check always false" (the production-safe default). */ blockNumber?: number; + /** OT-RFC-40 §7.5: optional fake KC storage registry. Tests that + * exercise the multi-storage prover paths (Codex review on PR #718, + * Comment 2) seed this; tests that omit it leave the prover in + * "registry-less" mode where `expectedStorageTag` is undefined and + * the extractor falls back to its legacy unfiltered behaviour. */ + kcStorageRegistry?: ChainAdapter['kcStorageRegistry']; } function makeChain(state: FakeChainState): ChainAdapter { @@ -66,6 +72,9 @@ function makeChain(state: FakeChainState): ChainAdapter { if (state.blockNumber !== undefined) { partial.getBlockNumber = vi.fn(async () => state.blockNumber!); } + if (state.kcStorageRegistry !== undefined) { + partial.kcStorageRegistry = state.kcStorageRegistry; + } return partial as ChainAdapter; } @@ -706,6 +715,125 @@ describe('RandomSamplingProver — short-circuits', () => { }); }); +describe('RandomSamplingProver — multi-storage routing (OT-RFC-40, Codex review on PR #718)', () => { + // Codex Comment 2: when the chain adapter exposes a kcStorageRegistry + // but `tagFor(challenge.knowledgeCollectionStorageContract)` returns + // undefined (registry stale, init failure, race vs Hub event), the + // prover MUST NOT silently fall through to unfiltered extraction — + // a same-`kcId` collision across storages would prove the wrong KC. + // Pin the fail-closed contract: refresh the registry once, and if + // it still doesn't know the storage, skip the period as kc-not-synced + // with an explicit `kcs-registry-miss` WAL entry. + + it('skips period as kc-not-synced when registry exists but the challenge storage is unknown (refresh does not recover)', async () => { + const store = new OxigraphStore(); + const fixture: KCFixture = { + cgId: 11n, kcId: 7n, ual: 'did:dkg:hardhat:31337/0xpub/7', + rootEntities: ['urn:e:1'], + publicTriples: [{ subject: 'urn:e:1', predicate: 'urn:p:k', object: '"a"' }], + }; + const { root, leafCount } = await seedKC(store, fixture); + + const refresh = vi.fn(async () => { + // The "stale cache" recovery attempt fails to discover the + // challenged storage. The prover must escalate to fail-closed. + }); + const tagFor = vi.fn(() => undefined); + const fakeRegistry = { refresh, tagFor } as unknown as NonNullable; + + const challengeStorage = '0x000000000000000000000000000000000000beef'; + const submitProof = vi.fn(async () => ({ hash: '0xunused', blockNumber: 1, success: true })); + const chain = makeChain({ + status: { activeProofPeriodStartBlock: 1000n, isValid: true }, + challengeForNode: null, + createChallenge: async () => ({ + challenge: makeChallenge({ + knowledgeCollectionId: fixture.kcId, + chunkId: 0n, + knowledgeCollectionStorageContract: challengeStorage, + }), + contextGraphId: fixture.cgId, + hash: '0x', blockNumber: 1, success: true, + }), + expectedRoot: root, + expectedLeafCount: leafCount, + cgIdForKc: fixture.cgId, + submitProof: submitProof as never, + kcStorageRegistry: fakeRegistry, + }); + + const wal = new InMemoryProverWal(); + const prover = new RandomSamplingProver({ chain, store, identityId: IDENTITY_ID, wal }); + const outcome = await prover.tick(); + expect(outcome).toEqual({ kind: 'kc-not-synced', kcId: fixture.kcId, cgId: fixture.cgId }); + // Refresh was attempted exactly once before fail-closing. + expect(refresh).toHaveBeenCalledTimes(1); + // tagFor consulted twice: once before refresh, once after. + expect(tagFor).toHaveBeenCalledTimes(2); + expect(tagFor).toHaveBeenCalledWith(challengeStorage); + expect(submitProof).not.toHaveBeenCalled(); + const failed = (await wal.readAll()).find((e) => e.status === 'failed'); + expect(failed?.error?.code).toBe('kcs-registry-miss'); + await prover.close(); + }); + + it('uses the resolved tag when the registry recognises the challenge storage', async () => { + const store = new OxigraphStore(); + // The fixture's UAL must round-trip through parseUal so the + // storage-tag filter can match it — i.e. the publisher slot has + // to satisfy `^0x[0-9a-fA-F]{40}$`. The other prover tests in + // this file use the legacy `0xpub` placeholder because their + // chain has no kcStorageRegistry, so the filter is skipped + // entirely. Multi-storage tests can't take that shortcut. + const fixture: KCFixture = { + cgId: 11n, + kcId: 7n, + ual: 'did:dkg:hardhat:31337/0xA1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1/7', + rootEntities: ['urn:e:1'], + publicTriples: [{ subject: 'urn:e:1', predicate: 'urn:p:k', object: '"a"' }], + }; + const { root, leafCount } = await seedKC(store, fixture); + + const refresh = vi.fn(async () => {}); + // The default-storage tag round-trip confirms the registry path + // is wired without changing observable behaviour vs the + // registry-less mode (legacy 3-segment UALs end up filtered by + // `expectedStorageTag === ''`, which matches the seed UAL). + const tagFor = vi.fn(() => ''); + const fakeRegistry = { refresh, tagFor } as unknown as NonNullable; + + const submitProof = vi.fn(async () => ({ hash: '0xresolved', blockNumber: 1, success: true })); + const chain = makeChain({ + status: { activeProofPeriodStartBlock: 1000n, isValid: true }, + challengeForNode: null, + createChallenge: async () => ({ + challenge: makeChallenge({ + knowledgeCollectionId: fixture.kcId, + chunkId: 0n, + knowledgeCollectionStorageContract: '0x000000000000000000000000000000000000face', + }), + contextGraphId: fixture.cgId, + hash: '0x', blockNumber: 1, success: true, + }), + expectedRoot: root, + expectedLeafCount: leafCount, + cgIdForKc: fixture.cgId, + submitProof: submitProof as never, + kcStorageRegistry: fakeRegistry, + }); + + const prover = new RandomSamplingProver({ chain, store, identityId: IDENTITY_ID }); + const outcome = await prover.tick(); + expect(outcome.kind).toBe('submitted'); + // Registry was hit once (no refresh needed because tagFor returned + // a defined value on the first call). + expect(tagFor).toHaveBeenCalledTimes(1); + expect(refresh).not.toHaveBeenCalled(); + expect(submitProof).toHaveBeenCalledTimes(1); + await prover.close(); + }); +}); + describe('RandomSamplingProver — concurrency', () => { it('single-flights: concurrent ticks resolve to the same outcome and run once', async () => { const store = new OxigraphStore(); From 6f274ffb860a23787e1ff9ca5032c9c511ed41a7 Mon Sep 17 00:00:00 2001 From: Branimir Rakic Date: Wed, 27 May 2026 02:42:33 +0200 Subject: [PATCH 09/12] fix(rfc-40): address Codex review on PR #718 round 2 (3 substantive findings) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Round 2 of Codex review on PR #718 caught three real issues introduced by the round-1 fixes. The shared root cause was that round 1 routed `verifyPublisherOwnsRange` by hardcoded tag string (`''` / `'v9'`), which solved the original "blanket-true" problem (round 1 Comment 3) but introduced two new problems: 1. **Round 2, Comment 2 (🔴, evm-adapter.ts:1515)** — the strict tag-string switch made every V11+ KC storage unpublishable, even when the registry already knew the storage. That contradicted RFC §1's "additive future versions, no agent change required" promise. 2. **Round 2, Comment 1 (🔴, prover.ts:398)** — filtering the local- store extractor by `expectedStorageTag` only closed the read-side gap. The prover's chain reads (`getKCContextGraphId`, `getLatestMerkleRoot`, `getMerkleLeafCount`) still hardcode the V10 default KCS, so a tagged-storage challenge would still verify local data against the wrong contract. 3. **Round 2, Comment 3 (🔴, publish-handler.ts:278)** — a missing or malformed `request.ual` made `parseUal()` return null, `storageTag` collapse to `undefined`, the adapter's default-tag path return `true`, and the request bypass the ownership preflight entirely. Pre-RFC, the V9 KAS check would have rejected. Fixes: (1) New `KCStorageEntry.authMode` field (`kas-pre-reserved-range` | `kcs-ack-based` | `unknown`), derived at refresh time from the storage's `hubName` via the new exported `deriveAuthMode`. The EVM adapter's `verifyPublisherOwnsRange` now switches on `entry.authMode` rather than tag string: - `kcs-ack-based` (V10 default + V11 KCS extensions) → defer to ACK gate, return `true`. This is the V11+ "just works" path. - `kas-pre-reserved-range` (V9 KAS family) → query the registry-discovered storage address (`new Contract(entry.address, loadAbi('KnowledgeAssetsStorage'), this.signer)`) for publisher ranges. Was previously bound to the single `this.contracts.knowledgeAssetsStorage` instance — a Hub with multiple KAS-class storages now routes correctly. - `unknown` (registered under a hub-name pattern outside the two known KC families) → fail-closed. The empty-tag path is special-cased above the registry lookup so adapters bound to Hubs with no KC storage registered still accept V10 publishes (registry empty != "fail closed for default"). (2) Prover now fail-closes when `expectedStorageTag !== ''` with an explicit `tagged-storage-rs-unsupported` WAL entry, until the chain reads grow tag/address parameters. Documented as an RFC-40 follow-up. This is correct in practice (V10 random sampling targets only V10 default KCS today) and stops silent misrouting when V11+ RS-participating storage lands. (3) Publish-handler now rejects a request whose UAL is missing or not parseable when `startKAId/endKAId/publisherAddress` are all set. The rejection reason explicitly mentions "parseable UAL" so operators can grep logs. Tests added: - chain/test/kc-storage-registry.test.ts: `deriveAuthMode` mapping, plus `KCStorageEntry.authMode` populated correctly for V10/V9/V11 fixtures. - chain/test/evm-adapter.test.ts: V11 KCS extension auto-routes to ACK auth (returns `true`), and a synthetic `unknown` authMode entry returns `false`. - random-sampling/test/prover.test.ts: tagged-storage challenge skips the period as `kc-not-synced` with `tagged-storage-rs-unsupported` WAL code. - publisher/test/publish-lifecycle.test.ts: malformed-UAL PublishRequest rejected with "parseable UAL" reason. Full sweep: core: 953/953 chain: 422/422 (+5) random-sampling: 55/55 (+1) publisher: 1051/1057 (+1, 6 pre-existing skips) Co-authored-by: Cursor --- packages/chain/src/evm-adapter.ts | 94 +++++++++++-------- packages/chain/src/index.ts | 2 + packages/chain/src/kc-storage-registry.ts | 53 ++++++++++- packages/chain/test/evm-adapter.test.ts | 71 +++++++++++--- .../chain/test/kc-storage-registry.test.ts | 51 ++++++++++ packages/publisher/src/publish-handler.ts | 23 +++-- .../publisher/test/publish-lifecycle.test.ts | 40 ++++++++ packages/random-sampling/src/prover.ts | 35 +++++++ packages/random-sampling/test/prover.test.ts | 57 ++++++++++- 9 files changed, 365 insertions(+), 61 deletions(-) diff --git a/packages/chain/src/evm-adapter.ts b/packages/chain/src/evm-adapter.ts index b6f814af5..99ce34db3 100644 --- a/packages/chain/src/evm-adapter.ts +++ b/packages/chain/src/evm-adapter.ts @@ -1467,52 +1467,66 @@ export class EVMChainAdapter implements ChainAdapter { ): Promise { await this.init(); - // OT-RFC-40 §7.5: route the range query to the storage instance - // that minted the UAL. The publish-handler derives `storageTag` - // from `parseUal(request.ual).storageTag`; it is empty for V10 - // default-storage UALs (3-segment form) and "v9" for V9 KAS UALs. + // OT-RFC-40 §7.5: route the range query by the storage's auth + // mode rather than by hardcoded tag string. The registry derives + // `authMode` from each storage's `hubName` (`KnowledgeAssetsStorage*` + // → `kas-pre-reserved-range`, `KnowledgeCollectionStorage*` → + // `kcs-ack-based`, anything else → `unknown`). This way a future + // V11+ KC storage that extends the V10 family inherits the V10 + // ACK-auth semantics for free — the RFC's "additive future + // versions, no agent change required" promise — while a brand-new + // auth contract under an unrecognised name fails closed. + // + // Codex review on PR #718 fix (Comments 2+3 of round 2): + // - hardcoding `tag === ''` would block V11+ KCS publishes. + // - hardcoding "any registered tag → true" would silently + // accept spoofed ranges on a future custom-auth storage. + // The registry's authMode strikes the right balance. const tag = storageTag ?? ''; - if (tag === 'v9') { - // V9 KAS is the only currently-deployed storage that pre-reserves - // per-publisher ID ranges. The query API - // (`getPublisherRangesCount`/`getPublisherRange`) is V9-specific. - if (!this.contracts.knowledgeAssetsStorage) return false; - const storage = this.contracts.knowledgeAssetsStorage; - const count = await storage.getPublisherRangesCount(publisherAddress); - for (let i = 0; i < Number(count); i++) { - const [startId, endId] = await storage.getPublisherRange(publisherAddress, i); - if (startId <= startKAId && endId >= endKAId) return true; - } - return false; - } - + // Path 1 — explicit V10 default. Special-cased so adapters bound + // to a Hub without any KC storage registered still accept publishes + // (the registry would be empty in that case; we never want a + // missing-registry condition to silently reject every publish). if (tag === '') { - // V10 default storage. V10 does NOT pre-reserve ranges — KCs - // are minted directly with their token-bound publisher recorded - // on-chain at mint time. The publish-handler's existing publish - // ACK signature check (V10 ACKs collected from receiving nodes) - // is the authoritative ownership verification on this path; this - // method exists as a V9-era pre-flight, so for V10 we simply - // defer to that downstream check by returning `true`. NOT - // returning `true` here would silently reject every V10 publish - // on Hubs without a V9 KAS deployment — the bug RFC-40 PR-5 - // calls out by name. + // V10 default storage does NOT pre-reserve ranges; defer to the + // ACK-signature gate downstream. Bit-for-bit pre-RFC behaviour + // for the only path that mattered before RFC-40. return true; } - // Codex review on PR #718 (Comment 3): every other tag — whether - // the registry recognises it or not — fails closed. The previous - // implementation returned `true` for any tag the registry knew - // about, which silently assumed every future storage authenticates - // publishes via ACK signatures. If V11+ ships a different - // ownership API, that fallback would let receivers accept spoofed - // ranges as soon as the new storage is registered. We refuse to - // attest range ownership without an explicit per-tag policy here: - // when V11+ lands, add an `if (tag === 'v11') { ... }` branch - // (or carry an `authMode` field on KCStorageEntry and switch on - // it) before this fallthrough. - return false; + // Path 2 — tagged storage. Look up the registry entry; if the + // registry isn't populated or doesn't know this tag, fail closed. + const entry = this.kcStorageRegistry?.getByTag(tag); + if (!entry) return false; + + switch (entry.authMode) { + case 'kas-pre-reserved-range': { + // V9 KAS publisher-range API. Bind to the registry-discovered + // address rather than `this.contracts.knowledgeAssetsStorage` + // so a Hub with multiple KAS-class storages routes correctly. + const storage = new Contract( + entry.address, + loadAbi('KnowledgeAssetsStorage'), + this.signer, + ); + const count = await storage.getPublisherRangesCount(publisherAddress); + for (let i = 0; i < Number(count); i++) { + const [startId, endId] = await storage.getPublisherRange(publisherAddress, i); + if (startId <= startKAId && endId >= endKAId) return true; + } + return false; + } + case 'kcs-ack-based': + // V10/V11+ KC storage family — defer to ACK-signature gate. + return true; + case 'unknown': + // Storage registered under a Hub name the registry doesn't + // recognise (none of the known KC-family prefixes). Refuse to + // attest range ownership without an explicit authMode policy + // (Codex Comment 3 of round 1). + return false; + } } // ===================================================================== diff --git a/packages/chain/src/index.ts b/packages/chain/src/index.ts index 9afb93eb1..9f77ba643 100644 --- a/packages/chain/src/index.ts +++ b/packages/chain/src/index.ts @@ -9,7 +9,9 @@ export { export { KCStorageRegistry, deriveStorageTag, + deriveAuthMode, type KCStorageEntry, + type KCStorageAuthMode, type KCStorageHubReader, type KCStorageUriReader, type KCStorageRegistryLogger, diff --git a/packages/chain/src/kc-storage-registry.ts b/packages/chain/src/kc-storage-registry.ts index e2a1ef98e..42c678916 100644 --- a/packages/chain/src/kc-storage-registry.ts +++ b/packages/chain/src/kc-storage-registry.ts @@ -56,6 +56,45 @@ const KC_STORAGE_NAME_PREFIXES: readonly string[] = [ 'KnowledgeAssetsStorage', ]; +/** + * Authentication contract a KC-class storage uses for publish range + * ownership. Threaded through `KCStorageEntry.authMode` so consumers + * (e.g. `verifyPublisherOwnsRange` in the EVM adapter) can route by + * the storage's auth semantics rather than by hard-coded tag string. + * + * - `kas-pre-reserved-range`: V9 `KnowledgeAssetsStorage`. Caller + * must look up the publisher's pre-allocated ID range via + * `getPublisherRangesCount` / `getPublisherRange`. + * - `kcs-ack-based`: V10 `KnowledgeCollectionStorage` (and any V11+ + * that extends the same family). Ownership is asserted by the + * ACK signatures that hosting nodes return on a PublishRequest; + * no on-chain pre-reservation exists. The range pre-flight should + * accept (i.e. defer to the ACK gate downstream). + * - `unknown`: storage was registered under a Hub name the registry + * doesn't have a policy for. Conservative consumers fail closed. + * + * Derived from `hubName` at refresh time. A future Hub name pattern + * for a storage with a brand-new auth contract would land here as + * `unknown` and require an explicit registry override to publish + * against — that's by design (Codex review on PR #718, Comment 3). + */ +export type KCStorageAuthMode = 'kas-pre-reserved-range' | 'kcs-ack-based' | 'unknown'; + +/** + * Map a Hub registry name to its auth mode. Exported so deployment + * scripts and tests can validate against the same predicate the + * registry uses internally. + */ +export function deriveAuthMode(hubName: string): KCStorageAuthMode { + // Order matters: KnowledgeCollectionStorage is checked first because + // some V9 names like KnowledgeCollectionStorageLite (hypothetical) + // would still belong to the V10 family. KnowledgeAssetsStorage is + // V9-only; any future hubName with that prefix inherits V9 semantics. + if (hubName.startsWith('KnowledgeCollectionStorage')) return 'kcs-ack-based'; + if (hubName.startsWith('KnowledgeAssetsStorage')) return 'kas-pre-reserved-range'; + return 'unknown'; +} + export interface KCStorageEntry { /** * Hub registry name as returned by `Hub.getAllAssetStorages()`, @@ -78,6 +117,12 @@ export interface KCStorageEntry { * matching `STORAGE_TAG_PATTERN`. */ tag: string; + /** + * Auth mode derived from `hubName`. Allows consumers to route by + * policy (range API vs ACK signatures) without re-deriving from the + * tag string each time. See `deriveAuthMode` for the mapping rule. + */ + authMode: KCStorageAuthMode; } /** @@ -182,7 +227,13 @@ export class KCStorageRegistry { continue; } - const entry: KCStorageEntry = { hubName: name, address: addr, uriBase, tag }; + const entry: KCStorageEntry = { + hubName: name, + address: addr, + uriBase, + tag, + authMode: deriveAuthMode(name), + }; const existingByTag = nextByTag.get(tag); if (existingByTag) { diff --git a/packages/chain/test/evm-adapter.test.ts b/packages/chain/test/evm-adapter.test.ts index 0f3f805f7..6d62b3a3e 100644 --- a/packages/chain/test/evm-adapter.test.ts +++ b/packages/chain/test/evm-adapter.test.ts @@ -85,15 +85,15 @@ describe('EVMChainAdapter integration', () => { expect(owns).toBe(false); }, 30_000); - it('verifyPublisherOwnsRange fails closed for a registered-but-unrecognised tag (Codex review on PR #718, Comment 3)', async () => { - // Even when the registry KNOWS a tag (e.g. a hypothetical V11 KC - // storage was registered on the Hub and the registry refreshed), - // verifyPublisherOwnsRange must NOT silently attest range - // ownership unless this adapter has an explicit per-tag policy - // (today: empty tag → V10 ACK auth, "v9" → V9 KAS pre-reservation - // API). A blanket "registered → true" fallback would let a future - // V11 deploy whose ownership API differs from V10 silently accept - // spoofed ranges. Pin the fail-closed contract here. + it('verifyPublisherOwnsRange auto-routes a V11 KCS extension to ACK auth (Codex review on PR #718, Comment 2 of round 2)', async () => { + // RFC §1 promises that future KC storage versions go live without + // operator action. The registry derives `authMode` from `hubName`, + // so a hypothetical `KnowledgeCollectionStorageV11` registered on + // the Hub correctly inherits V10's `kcs-ack-based` policy and + // returns `true` (defer to ACK gate downstream) — even though + // there is no `tag === 'v11-future'` branch hardcoded in the EVM + // adapter. This is the test that pins the additive-versions + // contract. const adapter = new EVMChainAdapter(makeAdapterConfig(ctx.rpcUrl, ctx.hubAddress, HARDHAT_KEYS.DEPLOYER)); const deployer = adapter.getSignerAddress(); // Force adapter init so the registry is populated from the live Hub. @@ -101,9 +101,7 @@ describe('EVMChainAdapter integration', () => { // Swap the live registry for a fake that surfaces a hypothetical // V11 storage. Replacing the whole registry (vs reaching into // private fields) is the public-API path: ChainAdapter exposes - // `kcStorageRegistry` as a writeable field for exactly this reason - // — adapters can plug in custom registries (mocks, multi-Hub, - // etc) without subclassing. + // `kcStorageRegistry` as a writeable field for exactly this reason. const fakeAddress = '0x0000000000000000000000000000000000001111'; adapter.kcStorageRegistry = new KCStorageRegistry( { @@ -116,8 +114,55 @@ describe('EVMChainAdapter integration', () => { }, ); await adapter.kcStorageRegistry.refresh(); - expect(adapter.kcStorageRegistry.getByTag('v11-future')?.address).toBe(fakeAddress); + const v11Entry = adapter.kcStorageRegistry.getByTag('v11-future'); + expect(v11Entry?.address).toBe(fakeAddress); + expect(v11Entry?.authMode).toBe('kcs-ack-based'); const owns = await adapter.verifyPublisherOwnsRange(deployer, 1n, 1n, 'v11-future'); + expect(owns).toBe(true); + }, 30_000); + + it('verifyPublisherOwnsRange fails closed for a registered tag whose authMode is unknown', async () => { + // The registry filters incoming Hub entries by KC-class name + // prefixes today, so an `unknown` authMode is unreachable through + // the normal refresh path. The fallback exists to defend the + // contract: if `KC_STORAGE_NAME_PREFIXES` ever broadens (or a + // custom registry seeds an unknown-named entry directly), the + // adapter must NOT silently attest range ownership. Construct + // such a registry directly and pin the fail-closed answer. + const adapter = new EVMChainAdapter(makeAdapterConfig(ctx.rpcUrl, ctx.hubAddress, HARDHAT_KEYS.DEPLOYER)); + const deployer = adapter.getSignerAddress(); + await adapter.verifyPublisherOwnsRange(deployer, 1n, 1n); + + // Bypass the registry's name-prefix filter by building one whose + // hubReader emits an entry with an unrecognised hubName. The + // registry's filter rejects it; we then mutate the entry into + // place via a second registry that recognises the name. To keep + // it simple, expose a custom registry-like object that returns + // an entry with `authMode: 'unknown'` directly. + adapter.kcStorageRegistry = { + // Cast to satisfy the field type without going through the real + // refresh machinery — this is the synthetic "registry seeded + // out-of-band" path the unknown arm exists for. + getByTag(tag: string) { + if (tag === 'weird') { + return { + hubName: 'SomethingElseEntirely', + address: '0x0000000000000000000000000000000000002222', + uriBase: 'did:dkg:weird', + tag: 'weird', + authMode: 'unknown' as const, + }; + } + return undefined; + }, + tagFor: () => undefined, + getByAddress: () => undefined, + getDefault: () => undefined, + getDefaultAddress: () => undefined, + getAll: () => [], + refresh: async () => {}, + } as unknown as NonNullable; + const owns = await adapter.verifyPublisherOwnsRange(deployer, 1n, 1n, 'weird'); expect(owns).toBe(false); }, 30_000); }); diff --git a/packages/chain/test/kc-storage-registry.test.ts b/packages/chain/test/kc-storage-registry.test.ts index e7cb9661d..e913be195 100644 --- a/packages/chain/test/kc-storage-registry.test.ts +++ b/packages/chain/test/kc-storage-registry.test.ts @@ -9,6 +9,7 @@ import { describe, it, expect } from 'vitest'; import { KCStorageRegistry, deriveStorageTag, + deriveAuthMode, type KCStorageEntry, type KCStorageHubReader, type KCStorageUriReader, @@ -111,9 +112,31 @@ describe('KCStorageRegistry.refresh', () => { address: V9_KAS.address, uriBase: 'did:dkg:v9', tag: 'v9', + authMode: 'kas-pre-reserved-range', }); }); + it('derives authMode from hubName so V11+ KCS extensions inherit V10 semantics', async () => { + // Codex review on PR #718 (Comment 2 of round 2): a fail-closed + // policy that hardcoded `tag === ''` would block every V11+ KC + // storage from publishing, contradicting RFC §1's "additive + // future versions" promise. The registry derives `authMode` from + // `hubName` so the EVM adapter can route by policy: V11 KCS + // (whose hubName starts with `KnowledgeCollectionStorage*`) + // inherits V10 ACK-based auth automatically. + const v11 = { + hubName: 'KnowledgeCollectionStorageV11', + address: '0xD11D11D11D11D11D11D11D11D11D11D11D11D11D', + uriBase: 'did:dkg:v11', + }; + const { registry } = buildRegistry([V10_KCS, V9_KAS, v11]); + await registry.refresh(); + + expect(registry.getByTag('')?.authMode).toBe('kcs-ack-based'); + expect(registry.getByTag('v9')?.authMode).toBe('kas-pre-reserved-range'); + expect(registry.getByTag('v11')?.authMode).toBe('kcs-ack-based'); + }); + it('ignores Hub entries that are not KC-class storages', async () => { // Hub.getAllAssetStorages() returns every asset storage, not just // KC ones. The registry must skip ContextGraphStorage etc. @@ -240,6 +263,34 @@ describe('KCStorageRegistry.refresh', () => { }); }); +describe('deriveAuthMode (OT-RFC-40, Codex review on PR #718 round 2)', () => { + // The mapping pins the registry's policy contract: any future Hub + // name pattern that doesn't fit the two known KC-class families + // returns `unknown`, forcing per-storage explicit policy in the + // consumer (rather than silently inheriting V10 ACK auth). + + it('routes V10 default + future V11/V12 to kcs-ack-based', () => { + expect(deriveAuthMode('KnowledgeCollectionStorage')).toBe('kcs-ack-based'); + expect(deriveAuthMode('KnowledgeCollectionStorageV2')).toBe('kcs-ack-based'); + expect(deriveAuthMode('KnowledgeCollectionStorageV11')).toBe('kcs-ack-based'); + expect(deriveAuthMode('KnowledgeCollectionStorageLite')).toBe('kcs-ack-based'); + }); + + it('routes V9 KAS family to kas-pre-reserved-range', () => { + expect(deriveAuthMode('KnowledgeAssetsStorage')).toBe('kas-pre-reserved-range'); + expect(deriveAuthMode('KnowledgeAssetsStorageV1')).toBe('kas-pre-reserved-range'); + }); + + it('returns unknown for hub names outside the two KC families', () => { + // The registry's filter rejects these before they reach an entry, + // but a future expansion of `KC_STORAGE_NAME_PREFIXES` could let + // them through, so the defensive `unknown` arm matters. + expect(deriveAuthMode('ContextGraphStorage')).toBe('unknown'); + expect(deriveAuthMode('AskStorage')).toBe('unknown'); + expect(deriveAuthMode('SomethingElseEntirely')).toBe('unknown'); + }); +}); + describe('KCStorageRegistry — lookup correctness on an empty registry', () => { // A registry that never had refresh() called should answer // every lookup with undefined. Important for sites that ask diff --git a/packages/publisher/src/publish-handler.ts b/packages/publisher/src/publish-handler.ts index 2b742c960..0643ab1df 100644 --- a/packages/publisher/src/publish-handler.ts +++ b/packages/publisher/src/publish-handler.ts @@ -265,22 +265,33 @@ export class PublishHandler { // default storage skips range pre-reservation and defers to ACK // signatures). `parseUal(request.ual).storageTag` is empty for // 3-segment / default-storage UALs and equals the tag for - // 4-segment / tagged UALs; passing `undefined` (the case where - // request.ual is missing on a malformed or pre-RFC client) - // preserves the pre-RFC behaviour bit-for-bit. + // 4-segment / tagged UALs. + // + // Codex review on PR #718 (Comment 3 of round 2): a missing or + // malformed `request.ual` previously made `parseUal()` return + // null, `storageTag` collapse to `undefined`, and the adapter's + // default-tag pass-through accept the request without any range + // check at all. That regressed the pre-RFC defence (V9 KAS would + // have rejected an unowned range). Reject the request explicitly + // when a range check is mandated but the UAL isn't usable. if ( startKAId > 0n && endKAId > 0n && request.publisherAddress && this.chainAdapter?.verifyPublisherOwnsRange ) { - const parsedRequestUal = request.ual ? parseUal(request.ual) : null; - const storageTag = parsedRequestUal?.storageTag; + const parsedRequestUal = parseUal(request.ual); + if (parsedRequestUal === null) { + return this.rejectAck( + `PublishRequest with startKAId/endKAId set requires a parseable UAL ` + + `(${request.ual ? `got "${request.ual}"` : 'request.ual was missing'})`, + ); + } const owns = await this.chainAdapter.verifyPublisherOwnsRange( request.publisherAddress, startKAId, endKAId, - storageTag, + parsedRequestUal.storageTag, ); if (!owns) { return this.rejectAck( diff --git a/packages/publisher/test/publish-lifecycle.test.ts b/packages/publisher/test/publish-lifecycle.test.ts index 52260dfc0..48a7d3c99 100644 --- a/packages/publisher/test/publish-lifecycle.test.ts +++ b/packages/publisher/test/publish-lifecycle.test.ts @@ -513,6 +513,46 @@ describe('Tentative data and chain event confirmation', () => { expect(ack.rejectionReason).toContain('1..1'); }); + it('rejects PublishRequest with malformed UAL when range check is mandated (Codex review on PR #718, Comment 3 of round 2)', async () => { + // A request that carries startKAId/endKAId/publisherAddress + // requires a parseable UAL — without it, the handler can't derive + // the storage tag to route the on-chain range query, and a silent + // pass-through would let malformed requests skip the ownership + // preflight. Pin the rejection so that regression is loud. + const store = new OxigraphStore(); + const bus = new TypedEventBus(); + const chainAdapter = createEVMAdapter(HARDHAT_KEYS.CORE_OP); + const handler = new PublishHandler(store, bus, { chainAdapter }); + + const publisherAddress = TEST_WALLET.address; + const triples = [q('did:dkg:agent:QmMalformed', 'http://schema.org/name', '"MalBot"')]; + const ntriples = triples.map(t => + `<${t.subject}> <${t.predicate}> ${t.object} .`, + ).join('\n'); + + // Malformed UAL: 5 segments after `did:dkg:` with bogus shape — + // parseUal cannot recover a 3- or 4-segment KC UAL prefix from this. + const ual = 'did:dkg:not-a-real-ual'; + const reqBytes = encodePublishRequest({ + ual, + nquads: new TextEncoder().encode(ntriples), + contextGraphId: CONTEXT_GRAPH, + kas: [{ tokenId: 1, rootEntity: 'did:dkg:agent:QmMalformed', privateMerkleRoot: new Uint8Array(0), privateTripleCount: 0 }], + publisherIdentity: new Uint8Array(32), + publisherAddress, + startKAId: 1, + endKAId: 1, + chainId: 'mock:31337', + publisherSignatureR: new Uint8Array(0), + publisherSignatureVs: new Uint8Array(0), + }); + + const ackData = await handler.handler(reqBytes, 'test-peer' as any); + const ack = decodePublishAck(ackData); + expect(ack.accepted).toBe(false); + expect(ack.rejectionReason).toContain('parseable UAL'); + }); + it('accepts default-tag PublishRequest without on-chain range check (V10 defers to ACK auth)', async () => { // OT-RFC-40 §7.5: 3-segment / default-storage UALs are V10 publishes; // the publisher never reserves a range and ownership is established by diff --git a/packages/random-sampling/src/prover.ts b/packages/random-sampling/src/prover.ts index 80534cbf4..b33b96269 100644 --- a/packages/random-sampling/src/prover.ts +++ b/packages/random-sampling/src/prover.ts @@ -391,6 +391,41 @@ export class RandomSamplingProver { ); return { kind: 'kc-not-synced', kcId, cgId }; } + + // Codex review on PR #718 (Comment 1 of round 2): the chain + // reads on this code path — `getKCContextGraphId`, + // `getLatestMerkleRoot`, `getMerkleLeafCount` — are bound to the + // V10 default KC storage in the EVM adapter. Filtering the + // local-store extractor by a non-empty `expectedStorageTag` + // closes the disambiguation gap on the read side, but the chain + // reads above are still default-storage-only, so a challenge + // from a tagged storage would be verified against the wrong + // contract's root/leafCount even when local data is correct. + // Fail-closed here with an explicit reason until the chain reads + // accept a storage tag/address argument; tagged-storage RS + // support is tracked as a follow-up to RFC-40. + if (expectedStorageTag !== '') { + this.log.warn('rs.tick.tagged-storage-rs-unsupported', { + kcId: kcId.toString(), + cgId: cgId.toString(), + storageContract: challenge.knowledgeCollectionStorageContract, + expectedStorageTag, + }); + await this.wal.append( + makeWalEntry(periodKey, 'failed', { + kcId: kcId.toString(), + cgId: cgId.toString(), + chunkId: chunkId.toString(), + error: { + code: 'tagged-storage-rs-unsupported', + message: + `Random-sampling chain reads are bound to the V10 default KC storage; ` + + `challenge storage tag "${expectedStorageTag}" is not yet routed end-to-end`, + }, + }), + ); + return { kind: 'kc-not-synced', kcId, cgId }; + } } let leaves: Uint8Array[]; diff --git a/packages/random-sampling/test/prover.test.ts b/packages/random-sampling/test/prover.test.ts index 251c3b00f..ae2f6cf42 100644 --- a/packages/random-sampling/test/prover.test.ts +++ b/packages/random-sampling/test/prover.test.ts @@ -777,7 +777,7 @@ describe('RandomSamplingProver — multi-storage routing (OT-RFC-40, Codex revie await prover.close(); }); - it('uses the resolved tag when the registry recognises the challenge storage', async () => { + it('proves successfully when the registry resolves the default-storage tag', async () => { const store = new OxigraphStore(); // The fixture's UAL must round-trip through parseUal so the // storage-tag filter can match it — i.e. the publisher slot has @@ -832,6 +832,61 @@ describe('RandomSamplingProver — multi-storage routing (OT-RFC-40, Codex revie expect(submitProof).toHaveBeenCalledTimes(1); await prover.close(); }); + + it('skips period when the registry resolves a non-default-storage tag (Codex review on PR #718, Comment 1 of round 2)', async () => { + // The chain reads `getKCContextGraphId`, `getLatestMerkleRoot`, + // `getMerkleLeafCount` are all bound to the V10 default KC + // storage in the EVM adapter today — they don't accept a storage + // tag/address argument. Filtering the local-store extractor by a + // non-empty tag would only close the disambiguation gap on the + // read side; the chain reads above would still query V10 KCS, + // verifying our local data against the WRONG contract's expected + // root and leaf count. Fail-closed with an explicit + // `tagged-storage-rs-unsupported` reason until those reads grow + // a tag/address parameter (tracked as RFC-40 follow-up). + const store = new OxigraphStore(); + const fixture: KCFixture = { + cgId: 11n, + kcId: 7n, + ual: 'did:dkg:hardhat:31337/0xA1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1/7', + rootEntities: ['urn:e:1'], + publicTriples: [{ subject: 'urn:e:1', predicate: 'urn:p:k', object: '"a"' }], + }; + const { root, leafCount } = await seedKC(store, fixture); + + const refresh = vi.fn(async () => {}); + const tagFor = vi.fn(() => 'v9'); + const fakeRegistry = { refresh, tagFor } as unknown as NonNullable; + + const submitProof = vi.fn(async () => ({ hash: '0xunused', blockNumber: 1, success: true })); + const chain = makeChain({ + status: { activeProofPeriodStartBlock: 1000n, isValid: true }, + challengeForNode: null, + createChallenge: async () => ({ + challenge: makeChallenge({ + knowledgeCollectionId: fixture.kcId, + chunkId: 0n, + knowledgeCollectionStorageContract: '0x00000000000000000000000000000000000000a9', + }), + contextGraphId: fixture.cgId, + hash: '0x', blockNumber: 1, success: true, + }), + expectedRoot: root, + expectedLeafCount: leafCount, + cgIdForKc: fixture.cgId, + submitProof: submitProof as never, + kcStorageRegistry: fakeRegistry, + }); + + const wal = new InMemoryProverWal(); + const prover = new RandomSamplingProver({ chain, store, identityId: IDENTITY_ID, wal }); + const outcome = await prover.tick(); + expect(outcome).toEqual({ kind: 'kc-not-synced', kcId: fixture.kcId, cgId: fixture.cgId }); + expect(submitProof).not.toHaveBeenCalled(); + const failed = (await wal.readAll()).find((e) => e.status === 'failed'); + expect(failed?.error?.code).toBe('tagged-storage-rs-unsupported'); + await prover.close(); + }); }); describe('RandomSamplingProver — concurrency', () => { From 5ed33ca92aaf2c93afc65a38c272105326507718 Mon Sep 17 00:00:00 2001 From: Branimir Rakic Date: Wed, 27 May 2026 03:05:14 +0200 Subject: [PATCH 10/12] fix(rfc-40): address Codex review on PR #718 round 3 (2 substantive findings) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Round-3 fixes follow the same shape as rounds 1 and 2: every change is a fail-closed correction or a missing refresh path that Codex flagged on the PR. Both findings expose a stale-binding class of bug — the adapter or prover keeps a handle/tag captured at init even after the underlying state changes. 1) packages/chain/src/evm-adapter.ts (Codex round-3 comment 2): Hub `NewAssetStorage`/`AssetStorageChanged` listeners only refreshed `kcStorageRegistry`, leaving `mintingStorageTag` and the cached `this.contracts.knowledgeCollectionStorage` handle bound to the pre-rotation contract. After a runtime KC storage rotation the adapter would still mint UALs with the old tag and fan out reads through the stale contract handle until the daemon was restarted. Extract a private `refreshMintingKCStorage()` that re-resolves the `KnowledgeCollectionStorage` asset-storage from the Hub, reads `uri(0)`, recomputes the tag via `deriveStorageTag`, and updates the cached handle. Call it from `init()` (replacing the inline block) and from `onAssetStorageChange` after the registry refresh path. Both code paths share the same conservative fallback as before: any RPC/parse failure leaves `mintingStorageTag = ""` and keeps the previously-bound handle. Also exempts `refreshMintingKCStorage` from the mock-adapter parity set: it's a TS-private helper with no equivalent on the mock surface (mock has no Hub event subsystem; `setMintingStorageTag` already covers the tag-rotation knob tests need). 2) packages/random-sampling/src/prover.ts (Codex round-3 comment 1): the previous round-2 fix added a fail-closed guard for non-default storage tags but placed it AFTER `getKCContextGraphId`, `getLatestMerkleRoot`, and `getMerkleLeafCount`. Those three adapter reads are still hardcoded to the V10 default `KnowledgeCollectionStorage`, so a tagged challenge could already return the wrong CG/root/leaf count or throw before the guard could fire. Hoist the entire registry-resolution block (registry miss + tagged-storage-rs-unsupported) ABOVE all storage-bound chain reads. The guard now executes immediately after the challenge is resolved and before any storage-bound chain API call. Threaded `cgIdFromCreate` (the chain-side cgId returned by `createChallenge` when available, else `0n`) through the early fail-closed returns so the WAL still records best-effort cgId without forcing a default-storage read. Test changes: - packages/chain/test/evm-adapter.test.ts: pin `refreshMintingKCStorage` idempotency by reaching through TS-private internals; matches how the Hub event handler invokes it. Codex comment number documented inline. - packages/chain/test/mock-adapter-parity.test.ts: extend `MOCK_EXEMPT_FROM_EVM` with the new helper. All four packages green: chain (423), core (953), random-sampling (55, e2e excluded), publisher (1051). Co-authored-by: Cursor --- packages/chain/src/evm-adapter.ts | 128 +++++++++++------- packages/chain/test/evm-adapter.test.ts | 26 ++++ .../chain/test/mock-adapter-parity.test.ts | 7 + packages/random-sampling/src/prover.ts | 115 ++++++++-------- 4 files changed, 167 insertions(+), 109 deletions(-) diff --git a/packages/chain/src/evm-adapter.ts b/packages/chain/src/evm-adapter.ts index 99ce34db3..9aab4cf17 100644 --- a/packages/chain/src/evm-adapter.ts +++ b/packages/chain/src/evm-adapter.ts @@ -813,34 +813,7 @@ export class EVMChainAdapter implements ChainAdapter { } catch { // V8 KnowledgeCollection not deployed — legacy publish surface unavailable. } - this.contracts.knowledgeCollectionStorage = await this.resolveAssetStorage('KnowledgeCollectionStorage'); - - // OT-RFC-40 §5.2 — bind this adapter to the storage tag of the - // `KnowledgeCollectionStorage` instance it is minting into. The - // tag is the suffix of the storage's `uri(0)` past the `did:dkg` - // prefix; empty for the canonical V10 default storage. Failure to - // read or parse the uriBase falls back to "" (the legacy 3-segment - // UAL form), which is the most-conservative behaviour: it preserves - // every bit of UAL output produced before this RFC. - try { - const uriBase: string = await this.contracts.knowledgeCollectionStorage.uri(0); - const tag = deriveStorageTag(uriBase); - if (tag === null) { - console.warn( - `[EVMChainAdapter] KC storage at ${await this.contracts.knowledgeCollectionStorage.getAddress()} ` + - `returned malformed uriBase "${uriBase}"; falling back to default storage tag (3-segment UALs)`, - ); - this.mintingStorageTag = ''; - } else { - this.mintingStorageTag = tag; - } - } catch (err) { - console.warn( - `[EVMChainAdapter] failed to read uri(0) from KC storage: ${err instanceof Error ? err.message : String(err)}; ` + - `falling back to default storage tag (3-segment UALs)`, - ); - this.mintingStorageTag = ''; - } + await this.refreshMintingKCStorage(); // OT-RFC-40 §5.3 — build the cross-storage registry so consumers // (random-sampling prover, async-lift verifier, replication ack @@ -2946,21 +2919,29 @@ export class EVMChainAdapter implements ChainAdapter { this.invalidateRandomSamplingPair(); } }; - // OT-RFC-40 §5.3 — refresh the KC storage registry on every - // `NewAssetStorage` / `AssetStorageChanged` event regardless of - // name (the registry's filter rule already restricts to KC-class - // names). Listening unconditionally avoids races where a future - // version naming convention is added but we forget to update this - // allowlist; the registry's filter is the single source of truth. + // OT-RFC-40 §5.3 — refresh the KC storage registry AND the + // adapter's own `KnowledgeCollectionStorage` binding on every + // `NewAssetStorage` / `AssetStorageChanged` event. Listening + // unconditionally (rather than gating on a known name allowlist) + // avoids races where a future version naming convention is added + // but we forget to update the allowlist; the registry's filter is + // the single source of truth for what counts as a KC-class + // storage. + // + // Codex review on PR #718 (Comment 4 round 1): if the initial + // `buildKCStorageRegistry()` call in `init()` failed, the registry + // stays `undefined` and a plain refresh would silently no-op + // forever. Use the first asset-storage event as a recovery + // trigger: build from scratch if absent, otherwise refresh. // - // Codex review on PR #718 (Comment 4): if the initial - // `buildKCStorageRegistry()` call in `init()` failed (e.g. - // transient RPC error during boot), `kcStorageRegistry` stays - // `undefined` and a plain refresh would silently no-op forever. - // Use the first asset-storage event as a recovery trigger: if - // there is no registry yet, build one from scratch; otherwise - // refresh the existing instance. Either path picks up the new - // storage immediately. + // Codex review on PR #718 (Comment 2 round 3): the cached + // `this.contracts.knowledgeCollectionStorage` handle and the + // `mintingStorageTag` derived from its `uri(0)` are themselves + // bound at init and would otherwise survive a Hub-mediated + // KC-storage rotation. Re-resolve them here so a rotation + // immediately propagates to mint UAL construction and to the + // ad-hoc reads that still go through the `knowledgeCollectionStorage` + // contract handle. const onAssetStorageChange = (): void => { if (this.kcStorageRegistry === undefined) { void this.buildKCStorageRegistry() @@ -2970,13 +2951,16 @@ export class EVMChainAdapter implements ChainAdapter { .catch((err) => console.warn( `[EVMChainAdapter] KC storage registry rebuild after Hub event failed: ${err instanceof Error ? err.message : String(err)}`, )); - return; + } else { + void this.kcStorageRegistry + .refresh() + .catch((err) => console.warn( + `[EVMChainAdapter] KC storage registry refresh failed after Hub event: ${err instanceof Error ? err.message : String(err)}`, + )); } - void this.kcStorageRegistry - .refresh() - .catch((err) => console.warn( - `[EVMChainAdapter] KC storage registry refresh failed after Hub event: ${err instanceof Error ? err.message : String(err)}`, - )); + void this.refreshMintingKCStorage().catch((err) => console.warn( + `[EVMChainAdapter] minting KC storage refresh after Hub event failed: ${err instanceof Error ? err.message : String(err)}`, + )); }; try { await this.contracts.hub.on('ContractChanged', onChange); @@ -2989,6 +2973,54 @@ export class EVMChainAdapter implements ChainAdapter { } } + /** + * OT-RFC-40 §5.2 + Codex review on PR #718 (Comment 2 of round 3) — + * (re-)bind this adapter to the active `KnowledgeCollectionStorage` + * instance and recompute `mintingStorageTag` from its `uri(0)`. Run + * once at `init()` and again whenever Hub fires a + * `NewAssetStorage`/`AssetStorageChanged` event so a runtime KC + * storage rotation doesn't leave the adapter parsing/minting + * against the stale contract. + * + * The tag is the suffix of `uri(0)` past the `did:dkg` prefix; + * empty for the canonical V10 default storage. Failure to read or + * parse the uriBase falls back to "" (the legacy 3-segment UAL + * form), which is the most-conservative behaviour: it preserves + * every bit of UAL output produced before this RFC. + */ + private async refreshMintingKCStorage(): Promise { + let storage: Contract; + try { + storage = await this.resolveAssetStorage('KnowledgeCollectionStorage'); + } catch (err) { + console.warn( + `[EVMChainAdapter] failed to resolve KnowledgeCollectionStorage from Hub: ${err instanceof Error ? err.message : String(err)}; ` + + `keeping previously-bound storage handle (mintingStorageTag="${this.mintingStorageTag}")`, + ); + return; + } + this.contracts.knowledgeCollectionStorage = storage; + try { + const uriBase: string = await storage.uri(0); + const tag = deriveStorageTag(uriBase); + if (tag === null) { + console.warn( + `[EVMChainAdapter] KC storage at ${await storage.getAddress()} ` + + `returned malformed uriBase "${uriBase}"; falling back to default storage tag (3-segment UALs)`, + ); + this.mintingStorageTag = ''; + } else { + this.mintingStorageTag = tag; + } + } catch (err) { + console.warn( + `[EVMChainAdapter] failed to read uri(0) from KC storage: ${err instanceof Error ? err.message : String(err)}; ` + + `falling back to default storage tag (3-segment UALs)`, + ); + this.mintingStorageTag = ''; + } + } + /** * OT-RFC-40 §5.3 — construct the cross-storage registry by giving * it ethers-backed implementations of the two readers it needs. diff --git a/packages/chain/test/evm-adapter.test.ts b/packages/chain/test/evm-adapter.test.ts index 6d62b3a3e..7084203ae 100644 --- a/packages/chain/test/evm-adapter.test.ts +++ b/packages/chain/test/evm-adapter.test.ts @@ -121,6 +121,32 @@ describe('EVMChainAdapter integration', () => { expect(owns).toBe(true); }, 30_000); + it('refreshMintingKCStorage re-resolves the storage handle and tag (Codex review on PR #718, Comment 2 of round 3)', async () => { + // The Hub event listener calls `refreshMintingKCStorage` on every + // `NewAssetStorage`/`AssetStorageChanged` event so a runtime + // KC storage rotation propagates to both `mintingStorageTag` and + // the cached contract handle. We can't trigger a real rotation + // from this test (would require a contract redeploy), but we can + // pin that the helper itself works: invoking it after init keeps + // the adapter in a consistent state (idempotent re-resolve), and + // the resolved tag matches what the live Hub reports. + const adapter = new EVMChainAdapter(makeAdapterConfig(ctx.rpcUrl, ctx.hubAddress, HARDHAT_KEYS.DEPLOYER)); + // Force adapter init (which calls refreshMintingKCStorage once). + await adapter.verifyPublisherOwnsRange(adapter.getSignerAddress(), 1n, 1n); + const tagBefore = adapter.mintingStorageTag; + const internals = adapter as unknown as { + refreshMintingKCStorage(): Promise; + contracts: { knowledgeCollectionStorage?: { getAddress(): Promise } }; + }; + const addressBefore = await internals.contracts.knowledgeCollectionStorage!.getAddress(); + // Re-invoke the helper directly — this mirrors the Hub-event + // call path the listener takes. + await internals.refreshMintingKCStorage(); + expect(adapter.mintingStorageTag).toBe(tagBefore); + const addressAfter = await internals.contracts.knowledgeCollectionStorage!.getAddress(); + expect(addressAfter).toBe(addressBefore); + }, 30_000); + it('verifyPublisherOwnsRange fails closed for a registered tag whose authMode is unknown', async () => { // The registry filters incoming Hub entries by KC-class name // prefixes today, so an `unknown` authMode is unreachable through diff --git a/packages/chain/test/mock-adapter-parity.test.ts b/packages/chain/test/mock-adapter-parity.test.ts index b485308b7..ed4d8fd38 100644 --- a/packages/chain/test/mock-adapter-parity.test.ts +++ b/packages/chain/test/mock-adapter-parity.test.ts @@ -138,6 +138,13 @@ const MOCK_EXEMPT_FROM_EVM = new Set([ // directly via `setMintingStorageTag`), so there is no mock-side // counterpart to mirror. 'buildKCStorageRegistry', + // OT-RFC-40 round-3 review fix: TS-private helper that re-binds + // the adapter's `KnowledgeCollectionStorage` handle and recomputes + // `mintingStorageTag` after a Hub `AssetStorageChanged` event. + // Mock adapter has no Hub event surface and `setMintingStorageTag` + // already covers the equivalent affordance for tests, so no + // mock-side parity is needed. + 'refreshMintingKCStorage', ]); const NO_CHAIN_EXEMPT_FROM_EVM = new Set([ diff --git a/packages/random-sampling/src/prover.ts b/packages/random-sampling/src/prover.ts index b33b96269..a68046eaa 100644 --- a/packages/random-sampling/src/prover.ts +++ b/packages/random-sampling/src/prover.ts @@ -285,14 +285,14 @@ export class RandomSamplingProver { reason: 'unsolved-stale', }); } + let cgIdFromCreate: bigint | null = null; if (existingIsCurrent && !existing.solved && !unsolvedStale) { challenge = existing; - cgId = await this.chain.getKCContextGraphId(challenge.knowledgeCollectionId); } else { try { const created = await this.chain.createChallenge(); challenge = created.challenge; - cgId = created.contextGraphId; + cgIdFromCreate = created.contextGraphId; } catch (err) { if (err instanceof NoEligibleContextGraphError) { this.log.info('rs.tick.no-eligible-cg', {}); @@ -311,53 +311,29 @@ export class RandomSamplingProver { const kcId = challenge.knowledgeCollectionId; const chunkId = challenge.chunkId; - await this.wal.append( - makeWalEntry(periodKey, 'challenge', { - kcId: kcId.toString(), - cgId: cgId.toString(), - chunkId: chunkId.toString(), - }), - ); - - if (cgId === 0n) { - this.log.warn('rs.tick.cg-not-found', { kcId: kcId.toString() }); - await this.wal.append( - makeWalEntry(periodKey, 'failed', { - kcId: kcId.toString(), - error: { code: 'cg-not-found', message: 'getKCContextGraphId returned 0' }, - }), - ); - return { kind: 'cg-not-found', kcId }; - } - - const expectedRoot = await this.chain.getLatestMerkleRoot(kcId); - const expectedLeafCount = await this.chain.getMerkleLeafCount(kcId); - - // OT-RFC-40 §7.5: when the chain adapter exposes a KC storage - // registry, map `Challenge.knowledgeCollectionStorageContract` - // (an address) back to the storage's UAL tag. The extractor uses - // this to disambiguate batchId collisions in CGs that hold KCs - // from multiple storage versions (V9 + V10 today; arbitrary - // versions in the future). + // OT-RFC-40 §7.5 + Codex review on PR #718 (Comment 1 of round 3): + // the storage-tag resolution MUST happen before any storage-bound + // chain read. `getKCContextGraphId`, `getLatestMerkleRoot`, and + // `getMerkleLeafCount` in the EVM adapter all hardcode the V10 + // default `KnowledgeCollectionStorage`; calling them for a tagged + // challenge would either return the wrong CG/root/leafCount or + // throw, both of which would land before this fail-closed branch + // could fire. Resolve the tag now — and skip the period — if the + // adapter's registry says the challenge belongs to anything other + // than the V10 default. // - // Codex review on PR #718 (Comment 2): if the registry exists but - // does not yet know the challenged storage address (stale cache, - // race between Hub.NewAssetStorage emission and our event listener, - // partial init failure), tagFor returns undefined. Falling through - // to the legacy "first UAL match wins" path can prove the WRONG KC - // in mixed-storage CGs. Try a single registry refresh — that - // covers the legitimate stale-cache case — and fail-closed - // (skip-period as a kc-not-synced miss) if the storage stays - // unknown afterwards. Adapters without a registry at all - // (`kcStorageRegistry` is `undefined`) still get the legacy - // unfiltered behaviour because they explicitly opted out of - // tag-aware proving by not exposing a registry. + // Adapters without a registry at all (`kcStorageRegistry` is + // `undefined`) still get the legacy unfiltered behaviour because + // they explicitly opted out of tag-aware proving by not exposing + // a registry. let expectedStorageTag: string | undefined; if (this.chain.kcStorageRegistry) { expectedStorageTag = this.chain.kcStorageRegistry.tagFor( challenge.knowledgeCollectionStorageContract, ); if (expectedStorageTag === undefined) { + // Stale cache / Hub event race / partial init failure — try a + // single refresh before fail-closing. try { await this.chain.kcStorageRegistry.refresh(); } catch (err) { @@ -373,13 +349,11 @@ export class RandomSamplingProver { const storageAddress = challenge.knowledgeCollectionStorageContract; this.log.warn('rs.tick.kcs-registry-miss', { kcId: kcId.toString(), - cgId: cgId.toString(), storageContract: storageAddress, }); await this.wal.append( makeWalEntry(periodKey, 'failed', { kcId: kcId.toString(), - cgId: cgId.toString(), chunkId: chunkId.toString(), error: { code: 'kcs-registry-miss', @@ -389,32 +363,25 @@ export class RandomSamplingProver { }, }), ); - return { kind: 'kc-not-synced', kcId, cgId }; + // No reliable cgId yet (existing-challenge branch hasn't + // queried it because the read is storage-bound). Use the chain- + // side cgId from the createChallenge path when available, + // otherwise 0 to signal "unknown CG". + return { kind: 'kc-not-synced', kcId, cgId: cgIdFromCreate ?? 0n }; } - - // Codex review on PR #718 (Comment 1 of round 2): the chain - // reads on this code path — `getKCContextGraphId`, - // `getLatestMerkleRoot`, `getMerkleLeafCount` — are bound to the - // V10 default KC storage in the EVM adapter. Filtering the - // local-store extractor by a non-empty `expectedStorageTag` - // closes the disambiguation gap on the read side, but the chain - // reads above are still default-storage-only, so a challenge - // from a tagged storage would be verified against the wrong - // contract's root/leafCount even when local data is correct. - // Fail-closed here with an explicit reason until the chain reads - // accept a storage tag/address argument; tagged-storage RS - // support is tracked as a follow-up to RFC-40. if (expectedStorageTag !== '') { + // The chain reads below are bound to the V10 default KC + // storage. Tagged-storage RS routing requires those reads to + // grow a tag/address parameter — tracked as a follow-up to + // RFC-40. Until then, a non-empty tag is a hard fail-closed. this.log.warn('rs.tick.tagged-storage-rs-unsupported', { kcId: kcId.toString(), - cgId: cgId.toString(), storageContract: challenge.knowledgeCollectionStorageContract, expectedStorageTag, }); await this.wal.append( makeWalEntry(periodKey, 'failed', { kcId: kcId.toString(), - cgId: cgId.toString(), chunkId: chunkId.toString(), error: { code: 'tagged-storage-rs-unsupported', @@ -424,10 +391,36 @@ export class RandomSamplingProver { }, }), ); - return { kind: 'kc-not-synced', kcId, cgId }; + return { kind: 'kc-not-synced', kcId, cgId: cgIdFromCreate ?? 0n }; } } + // Storage tag is the default V10 (or registry isn't present). The + // existing chain reads are now safe to use. + cgId = cgIdFromCreate ?? (await this.chain.getKCContextGraphId(kcId)); + + await this.wal.append( + makeWalEntry(periodKey, 'challenge', { + kcId: kcId.toString(), + cgId: cgId.toString(), + chunkId: chunkId.toString(), + }), + ); + + if (cgId === 0n) { + this.log.warn('rs.tick.cg-not-found', { kcId: kcId.toString() }); + await this.wal.append( + makeWalEntry(periodKey, 'failed', { + kcId: kcId.toString(), + error: { code: 'cg-not-found', message: 'getKCContextGraphId returned 0' }, + }), + ); + return { kind: 'cg-not-found', kcId }; + } + + const expectedRoot = await this.chain.getLatestMerkleRoot(kcId); + const expectedLeafCount = await this.chain.getMerkleLeafCount(kcId); + let leaves: Uint8Array[]; try { const extracted = await extractV10KCFromStore(this.store, cgId, kcId, { From 638613b4c21691e1d743eb6ad480c2f88b9e7f39 Mon Sep 17 00:00:00 2001 From: Branimir Rakic Date: Wed, 27 May 2026 03:29:24 +0200 Subject: [PATCH 11/12] fix(rfc-40): address Codex review on PR #718 round 4 (4 substantive findings) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Round 4 closes the convergence gap that the previous rounds opened: the per-PR additive-versions promise needed real teeth in the prover and refresh paths, plus a stricter UAL parser to keep malformed tagged UALs out of the default-storage fast path. 1) packages/core/src/constants.ts (round-4 comment 2): parseUal() silently demoted malformed tagged UALs into default-storage form. `did:dkg:v9/0xPub/123/1` (missing chainId between tag and publisher) would parse as `{chainId: "v9", storageTag: ""}` because the tagged-form check failed (segments[2] not a publisher) and the default-form fallback didn't validate that segments[0] could be a chainId. Downstream this bypassed the V9 KAS range check in the publisher. Fix by inspecting segments[0] FIRST: if it matches STORAGE_TAG_PATTERN (which CAIP-2 chainIds with `:` never do), the input MUST be the 4-segment tagged form, and segments[2] MUST be a publisher — otherwise reject hard. Tests pin both new rejection branches and the old uppercase/colon/wrong-shape rejections still stand. 2) packages/chain/src/evm-adapter.ts (round-4 comment 3 + refreshMintingKCStorage atomic-commit): the previous round-3 refactor partially updated the adapter binding when uri(0) read or tag derivation failed — the new `KnowledgeCollectionStorage` handle was committed before the tag was known to be valid. A bad uriBase then downgraded `mintingStorageTag` to '' while leaving the handle pointed at a contract that mints under a different tag. That permanently mis-routes new UALs against the wrong on-chain contract. Fix: hold both `nextStorage` and `nextTag` as locals and commit them as a unit at the end. Any failure on the resolve / read / parse path returns early with both fields untouched, preserving the previous {handle, tag} pair (or the class defaults on first init, which subsequently fail at mint time rather than silently legacy-mint). 3) packages/chain/src/evm-adapter.ts (round-4 comment 4): hard-coding `resolveAssetStorage("KnowledgeCollectionStorage")` as the mint target ties the adapter to the legacy Hub registration NAME. RFC §5.2's source of truth for "the canonical default storage" is the storage's `uriBase === "did:dkg"`. A future deploy registering V11 under a different Hub name with `uriBase === "did:dkg"` is the new default by RFC, but the legacy resolve would still pick the V10 contract. Fix: prefer `kcStorageRegistry.getDefault()` when the registry has discovered a default; fall back to the legacy resolve only when the registry is undefined or empty. Reordered init() so `buildKCStorageRegistry()` runs BEFORE `refreshMintingKCStorage()`, and ordered the Hub-event handler to refresh the registry first THEN the mint helper (the previous fire-and-forget order raced). 4) packages/random-sampling/src/prover.ts + packages/chain/src/chain-adapter.ts + packages/chain/src/evm-adapter.ts + packages/chain/src/mock-adapter.ts (round-4 comment 1): blanket fail-closing every tagged challenge made V11+ KCs permanently unprovable, contradicting the RFC's additive-versions promise. The V10 family (`kcs-ack-based`) shares method shapes with V11+ extensions, so we can route the chain reads directly to the challenge's specific storage address. Extend `getLatestMerkleRoot` and `getMerkleLeafCount` with an optional `opts.storageContract` parameter (the address) — the EVM adapter constructs a one-shot Contract handle bound to that address with the V10 KCS ABI. The prover resolves the registry entry from `Challenge.knowledgeCollectionStorageContract`, switches on `entry.authMode`: * `kcs-ack-based`: route reads via opts.storageContract (V10 default + future extensions both flow through this path; the default-tag case keeps using `opts.storageContract === undefined` which the adapter resolves to its bound default — bit-for-bit legacy behaviour). * `kas-pre-reserved-range` (V9 KAS): fail-closed with new WAL code `kas-rs-unsupported`; full V9 RS support tracked as a follow-up RFC. * `unknown`: fail-closed with new WAL code `tagged-storage-unknown-auth`. Mock adapter accepts the opts parameter for type parity but ignores it — its in-memory collections map is single-storage. Test changes: - core/test/constants.test.ts: pin malformed-tagged-UAL rejection. - chain/test/evm-adapter.test.ts: pin registry-driven mint target preference. - chain/test/mock-adapter-parity.test.ts: exempt `kcStorageForRead` (TS-private one-shot Contract factory). - random-sampling/test/prover.test.ts: split the previous "skip on any non-default tag" test into three: V11+ kcs-ack-based routes successfully through opts.storageContract, V9 KAS gets `kas-rs-unsupported`, unknown authMode gets `tagged-storage-unknown-auth`. Default-tag path explicitly asserts `opts.storageContract === undefined` is forwarded. - chain.getByAddress replaces the previous tagFor-only spy because the prover now needs the full entry (address + authMode + tag). All four packages green: chain (424), core (954), random-sampling (56, e2e excluded), publisher (1051). Co-authored-by: Cursor --- packages/chain/src/chain-adapter.ts | 20 +- packages/chain/src/evm-adapter.ts | 200 ++++++++++++----- packages/chain/src/mock-adapter.ts | 9 +- packages/chain/test/evm-adapter.test.ts | 45 ++++ .../chain/test/mock-adapter-parity.test.ts | 8 + packages/core/src/constants.ts | 55 +++-- packages/core/test/constants.test.ts | 24 ++ packages/random-sampling/src/prover.ts | 139 +++++++----- packages/random-sampling/test/prover.test.ts | 212 +++++++++++++++--- 9 files changed, 550 insertions(+), 162 deletions(-) diff --git a/packages/chain/src/chain-adapter.ts b/packages/chain/src/chain-adapter.ts index f02241a78..ce39df0d7 100644 --- a/packages/chain/src/chain-adapter.ts +++ b/packages/chain/src/chain-adapter.ts @@ -1017,16 +1017,32 @@ export interface ChainAdapter { * `kcId` is unknown to the chain or the V10 storage contract is not * deployed on this Hub. Optional so non-V10 / no-chain adapters can * stub the prover surface. + * + * OT-RFC-40 §7.5 (Codex review on PR #718, round 4): the optional + * `opts.storageContract` argument routes the read to a specific KC + * storage address. Used by the random-sampling prover when a + * challenge resolves to a `kcs-ack-based` tagged storage (V10 + * default + future V11+ extensions) so the verification reads run + * against the contract that minted the data, not against the + * adapter's bound default. When omitted, the adapter MUST query the + * default storage as before. V9 KAS (`kas-pre-reserved-range`) is + * intentionally NOT supported via this routing — V9 RS is tracked + * as a follow-up and prover callers fail-closed before reaching + * here. */ - getLatestMerkleRoot?(kcId: bigint): Promise; + getLatestMerkleRoot?(kcId: bigint, opts?: { storageContract?: string }): Promise; /** * V10 flat-KC merkle leaf count (sorted + deduped) recorded on-chain * for `kcId`. Used by the prover to (a) validate the local extraction * matches the published shape before building a proof, and (b) sanity * check the on-chain `chunkId = leafIndex` falls within the tree. + * + * OT-RFC-40 §7.5 (Codex review on PR #718, round 4): see + * `getLatestMerkleRoot` for the `opts.storageContract` routing + * rationale. */ - getMerkleLeafCount?(kcId: bigint): Promise; + getMerkleLeafCount?(kcId: bigint, opts?: { storageContract?: string }): Promise; /** * Address that signed the latest merkle root for `kcId` (the EOA that diff --git a/packages/chain/src/evm-adapter.ts b/packages/chain/src/evm-adapter.ts index 9aab4cf17..3ff7a446b 100644 --- a/packages/chain/src/evm-adapter.ts +++ b/packages/chain/src/evm-adapter.ts @@ -813,8 +813,6 @@ export class EVMChainAdapter implements ChainAdapter { } catch { // V8 KnowledgeCollection not deployed — legacy publish surface unavailable. } - await this.refreshMintingKCStorage(); - // OT-RFC-40 §5.3 — build the cross-storage registry so consumers // (random-sampling prover, async-lift verifier, replication ack // verifier) can map a UAL or `Challenge.knowledgeCollectionStorageContract` @@ -822,6 +820,15 @@ export class EVMChainAdapter implements ChainAdapter { // non-fatal; the registry just stays empty and resolution-time // callers fall back to "treat as default storage" which is // bit-for-bit pre-RFC behaviour. + // + // Codex review on PR #718 round 4 (Comment 4): the registry build + // MUST run before `refreshMintingKCStorage` because the latter + // prefers `kcStorageRegistry.getDefault()` (the storage whose + // `uriBase === "did:dkg"` — the RFC's authoritative source of + // truth for "the canonical default KC storage") over the legacy + // `Hub.getAssetStorageAddress("KnowledgeCollectionStorage")` + // resolve, which only matches by Hub registration name and could + // miss a future deploy that renames the canonical storage. try { this.kcStorageRegistry = await this.buildKCStorageRegistry(); } catch (err) { @@ -832,6 +839,8 @@ export class EVMChainAdapter implements ChainAdapter { this.kcStorageRegistry = undefined; } + await this.refreshMintingKCStorage(); + // V9 contracts (KnowledgeAssets + KnowledgeAssetsStorage) are archived // (PRD §4.1, deploy scripts 040+041 moved under deploy/archive). Keep // the try/catch so adapters bound to legacy deploys still resolve them. @@ -2942,25 +2951,31 @@ export class EVMChainAdapter implements ChainAdapter { // immediately propagates to mint UAL construction and to the // ad-hoc reads that still go through the `knowledgeCollectionStorage` // contract handle. + // The registry refresh MUST complete before refreshMintingKCStorage + // runs, because the latter prefers `kcStorageRegistry.getDefault()` + // as its source of truth for the canonical default storage. + // Sequencing them here avoids a race where the mint helper picks up + // the pre-rotation registry entry and rebinds the cached handle to + // the old contract for a few milliseconds (Codex round 4). const onAssetStorageChange = (): void => { - if (this.kcStorageRegistry === undefined) { - void this.buildKCStorageRegistry() - .then((registry) => { - this.kcStorageRegistry = registry; - }) - .catch((err) => console.warn( - `[EVMChainAdapter] KC storage registry rebuild after Hub event failed: ${err instanceof Error ? err.message : String(err)}`, - )); - } else { - void this.kcStorageRegistry - .refresh() - .catch((err) => console.warn( - `[EVMChainAdapter] KC storage registry refresh failed after Hub event: ${err instanceof Error ? err.message : String(err)}`, - )); - } - void this.refreshMintingKCStorage().catch((err) => console.warn( - `[EVMChainAdapter] minting KC storage refresh after Hub event failed: ${err instanceof Error ? err.message : String(err)}`, - )); + const registryStep = this.kcStorageRegistry === undefined + ? this.buildKCStorageRegistry() + .then((registry) => { + this.kcStorageRegistry = registry; + }) + .catch((err) => console.warn( + `[EVMChainAdapter] KC storage registry rebuild after Hub event failed: ${err instanceof Error ? err.message : String(err)}`, + )) + : this.kcStorageRegistry + .refresh() + .catch((err) => console.warn( + `[EVMChainAdapter] KC storage registry refresh failed after Hub event: ${err instanceof Error ? err.message : String(err)}`, + )); + void registryStep.then(() => + this.refreshMintingKCStorage().catch((err) => console.warn( + `[EVMChainAdapter] minting KC storage refresh after Hub event failed: ${err instanceof Error ? err.message : String(err)}`, + )), + ); }; try { await this.contracts.hub.on('ContractChanged', onChange); @@ -2974,51 +2989,95 @@ export class EVMChainAdapter implements ChainAdapter { } /** - * OT-RFC-40 §5.2 + Codex review on PR #718 (Comment 2 of round 3) — - * (re-)bind this adapter to the active `KnowledgeCollectionStorage` - * instance and recompute `mintingStorageTag` from its `uri(0)`. Run - * once at `init()` and again whenever Hub fires a - * `NewAssetStorage`/`AssetStorageChanged` event so a runtime KC - * storage rotation doesn't leave the adapter parsing/minting - * against the stale contract. + * OT-RFC-40 §5.2 — (re-)bind this adapter to the active + * `KnowledgeCollectionStorage` instance and recompute + * `mintingStorageTag` from its `uri(0)`. Run once at `init()` + * (after the registry builds — see init ordering note) and again + * whenever Hub fires a `NewAssetStorage`/`AssetStorageChanged` + * event (Codex round 3, Comment 2), so a runtime KC storage + * rotation doesn't leave the adapter parsing/minting against the + * stale contract. + * + * Source of truth (Codex round 4, Comment 4): + * - PREFERRED: `kcStorageRegistry.getDefault()` — the storage + * whose `uriBase === "did:dkg"`. By RFC §5.2 there is exactly + * one default per Hub, and its `uriBase` is the protocol- + * authoritative way to identify the canonical KC storage. A + * future deploy that registers V11+ as the new default (under + * ANY Hub name) will be picked up correctly. + * - FALLBACK: `Hub.getAssetStorageAddress("KnowledgeCollectionStorage")` — + * legacy Hub-name resolve. Used only when the registry hasn't + * been built yet (init bootstrap before `buildKCStorageRegistry()` + * completes) or when the registry build itself failed/found no + * default storage. Preserves pre-RFC-40 behaviour for clean + * upgrades from a working V10 deploy. * - * The tag is the suffix of `uri(0)` past the `did:dkg` prefix; - * empty for the canonical V10 default storage. Failure to read or - * parse the uriBase falls back to "" (the legacy 3-segment UAL - * form), which is the most-conservative behaviour: it preserves - * every bit of UAL output produced before this RFC. + * Atomicity (Codex round 4, Comment 3): the cached + * `this.contracts.knowledgeCollectionStorage` handle and + * `mintingStorageTag` MUST update as a unit. Any failure during + * resolution preserves the previous pair rather than silently + * downgrading the tag to `''` while leaving the handle pointed at + * a different contract — that combination would stamp default- + * storage UALs against data minted on some other contract and + * permanently mis-route the collection on chain. On first init, + * "previous pair" is the class-default `(undefined, '')`, and the + * adapter is expected to fail at mint time rather than silently + * legacy-mint. */ private async refreshMintingKCStorage(): Promise { - let storage: Contract; - try { - storage = await this.resolveAssetStorage('KnowledgeCollectionStorage'); - } catch (err) { - console.warn( - `[EVMChainAdapter] failed to resolve KnowledgeCollectionStorage from Hub: ${err instanceof Error ? err.message : String(err)}; ` + - `keeping previously-bound storage handle (mintingStorageTag="${this.mintingStorageTag}")`, + let nextStorage: Contract; + let nextTag: string; + + const defaultEntry = this.kcStorageRegistry?.getDefault(); + if (defaultEntry) { + // Registry view: address is already validated via uri(0) read + // and the tag is already derived. Trust both and build a + // contract handle bound to this signer for write paths. + nextStorage = new Contract( + defaultEntry.address, + loadAbi('KnowledgeCollectionStorage'), + this.signer, ); - return; - } - this.contracts.knowledgeCollectionStorage = storage; - try { - const uriBase: string = await storage.uri(0); + nextTag = defaultEntry.tag; + } else { + // Legacy fallback: Hub-name resolve + uri(0) read + + // deriveStorageTag. Preserve previous {handle, tag} pair on + // ANY error here rather than silently downgrading. + let storage: Contract; + try { + storage = await this.resolveAssetStorage('KnowledgeCollectionStorage'); + } catch (err) { + console.warn( + `[EVMChainAdapter] failed to resolve KnowledgeCollectionStorage from Hub: ${err instanceof Error ? err.message : String(err)}; ` + + `keeping previously-bound {handle, tag} pair (mintingStorageTag="${this.mintingStorageTag}")`, + ); + return; + } + let uriBase: string; + try { + uriBase = await storage.uri(0); + } catch (err) { + console.warn( + `[EVMChainAdapter] failed to read uri(0) from KC storage at ${await storage.getAddress()}: ${err instanceof Error ? err.message : String(err)}; ` + + `keeping previously-bound {handle, tag} pair (mintingStorageTag="${this.mintingStorageTag}")`, + ); + return; + } const tag = deriveStorageTag(uriBase); if (tag === null) { console.warn( `[EVMChainAdapter] KC storage at ${await storage.getAddress()} ` + - `returned malformed uriBase "${uriBase}"; falling back to default storage tag (3-segment UALs)`, + `returned malformed uriBase "${uriBase}"; keeping previously-bound {handle, tag} pair (mintingStorageTag="${this.mintingStorageTag}")`, ); - this.mintingStorageTag = ''; - } else { - this.mintingStorageTag = tag; + return; } - } catch (err) { - console.warn( - `[EVMChainAdapter] failed to read uri(0) from KC storage: ${err instanceof Error ? err.message : String(err)}; ` + - `falling back to default storage tag (3-segment UALs)`, - ); - this.mintingStorageTag = ''; + nextStorage = storage; + nextTag = tag; } + + // Atomic commit: both fields update together or neither does. + this.contracts.knowledgeCollectionStorage = nextStorage; + this.mintingStorageTag = nextTag; } /** @@ -3344,20 +3403,45 @@ export class EVMChainAdapter implements ChainAdapter { return cgs; } - async getLatestMerkleRoot(kcId: bigint): Promise { + async getLatestMerkleRoot(kcId: bigint, opts?: { storageContract?: string }): Promise { await this.init(); - const kcs = this.requireKCStorage(); + const kcs = this.kcStorageForRead(opts?.storageContract); const rootHex: string = await kcs.getLatestMerkleRoot(kcId); return ethers.getBytes(rootHex); } - async getMerkleLeafCount(kcId: bigint): Promise { + async getMerkleLeafCount(kcId: bigint, opts?: { storageContract?: string }): Promise { await this.init(); - const kcs = this.requireKCStorage(); + const kcs = this.kcStorageForRead(opts?.storageContract); const count: bigint = BigInt(await kcs.getMerkleLeafCount(kcId)); return Number(count); } + /** + * OT-RFC-40 §7.5 (Codex review on PR #718, round 4) — return either + * the adapter's bound default `KnowledgeCollectionStorage` or a + * one-shot Contract handle bound to a caller-supplied storage + * address. Used by the prover to dispatch reads to the specific + * `kcs-ack-based` storage that minted a challenged KC, so a V11+ + * tagged storage gets its root/leafCount read against its own + * contract rather than the (possibly stale-default) bound handle. + * + * Caller is responsible for ensuring `storageContract` belongs to + * a `kcs-ack-based` family (V10 default + future extensions); the + * adapter does NOT re-validate the auth mode here, that gating + * lives in the prover and in `verifyPublisherOwnsRange`. + */ + private kcStorageForRead(storageContract: string | undefined): Contract { + if (storageContract === undefined) { + return this.requireKCStorage(); + } + return new Contract( + storageContract, + loadAbi('KnowledgeCollectionStorage'), + this.signer, + ); + } + async getLatestMerkleRootPublisher(kcId: bigint): Promise { await this.init(); const kcs = this.requireKCStorage(); diff --git a/packages/chain/src/mock-adapter.ts b/packages/chain/src/mock-adapter.ts index 5554f8ab6..ee4fe4e5a 100644 --- a/packages/chain/src/mock-adapter.ts +++ b/packages/chain/src/mock-adapter.ts @@ -1503,13 +1503,18 @@ export class MockChainAdapter implements ChainAdapter { // `createKnowledgeAssetsV10` and `__registerKC`. // ===================================================================== - async getLatestMerkleRoot(kcId: bigint): Promise { + async getLatestMerkleRoot(kcId: bigint, _opts?: { storageContract?: string }): Promise { + // OT-RFC-40 §7.5: mock adapter doesn't model multiple storage + // instances (in-memory `collections` map is single-storage). The + // optional `storageContract` arg is accepted for ChainAdapter API + // parity but ignored — tests that need multi-storage behaviour + // use the EVM adapter against a hardhat fork. const entry = this.collections.get(kcId); if (!entry) throw new Error(`Mock: unknown kcId ${kcId}`); return entry.merkleRoot; } - async getMerkleLeafCount(kcId: bigint): Promise { + async getMerkleLeafCount(kcId: bigint, _opts?: { storageContract?: string }): Promise { const entry = this.collections.get(kcId); if (!entry) throw new Error(`Mock: unknown kcId ${kcId}`); return entry.merkleLeafCount; diff --git a/packages/chain/test/evm-adapter.test.ts b/packages/chain/test/evm-adapter.test.ts index 7084203ae..306a07f1f 100644 --- a/packages/chain/test/evm-adapter.test.ts +++ b/packages/chain/test/evm-adapter.test.ts @@ -121,6 +121,51 @@ describe('EVMChainAdapter integration', () => { expect(owns).toBe(true); }, 30_000); + it('refreshMintingKCStorage prefers kcStorageRegistry.getDefault() as the mint target (Codex review on PR #718, Comment 4 of round 4)', async () => { + // The legacy `Hub.getAssetStorageAddress("KnowledgeCollectionStorage")` + // resolve only matches by exact Hub registration name. RFC §5.2's + // source of truth for "the canonical default storage" is the + // storage's `uriBase === "did:dkg"`, not its Hub name. A future + // V11 deploy registered under a different name (e.g. + // `KnowledgeCollectionStorageV11`) but with `uriBase === "did:dkg"` + // is the new default by RFC, even though the legacy Hub-name + // resolve would still return whatever is bound to the original + // `KnowledgeCollectionStorage` slot. Pin that + // `refreshMintingKCStorage` consults the registry first so the + // PR's promised additive cutover actually delivers. + const adapter = new EVMChainAdapter(makeAdapterConfig(ctx.rpcUrl, ctx.hubAddress, HARDHAT_KEYS.DEPLOYER)); + await adapter.verifyPublisherOwnsRange(adapter.getSignerAddress(), 1n, 1n); + const realLegacyAddr = await (adapter as unknown as { + contracts: { knowledgeCollectionStorage: { getAddress(): Promise } }; + }).contracts.knowledgeCollectionStorage.getAddress(); + // Simulate a future deploy: V11 sits at a different address but + // claims the canonical `did:dkg` uriBase (i.e. it IS the new + // default per RFC §5.2 rule 3). The registry surfaces it; the + // legacy slot still points at V10's address. + const v11Addr = '0x0000000000000000000000000000000000004444'; + adapter.kcStorageRegistry = new KCStorageRegistry( + { + getAllAssetStorages: async () => [ + { name: 'KnowledgeCollectionStorageV11', addr: v11Addr }, + ], + }, + { + readUriBase: async () => 'did:dkg', + }, + ); + await adapter.kcStorageRegistry.refresh(); + const internals = adapter as unknown as { + refreshMintingKCStorage(): Promise; + contracts: { knowledgeCollectionStorage: { getAddress(): Promise } }; + }; + await internals.refreshMintingKCStorage(); + expect(adapter.mintingStorageTag).toBe(''); + const addrAfter = await internals.contracts.knowledgeCollectionStorage.getAddress(); + // ethers normalises addresses to checksummed form; compare lowercase. + expect(addrAfter.toLowerCase()).toBe(v11Addr.toLowerCase()); + expect(addrAfter.toLowerCase()).not.toBe(realLegacyAddr.toLowerCase()); + }, 30_000); + it('refreshMintingKCStorage re-resolves the storage handle and tag (Codex review on PR #718, Comment 2 of round 3)', async () => { // The Hub event listener calls `refreshMintingKCStorage` on every // `NewAssetStorage`/`AssetStorageChanged` event so a runtime diff --git a/packages/chain/test/mock-adapter-parity.test.ts b/packages/chain/test/mock-adapter-parity.test.ts index ed4d8fd38..0bba83c4c 100644 --- a/packages/chain/test/mock-adapter-parity.test.ts +++ b/packages/chain/test/mock-adapter-parity.test.ts @@ -145,6 +145,14 @@ const MOCK_EXEMPT_FROM_EVM = new Set([ // already covers the equivalent affordance for tests, so no // mock-side parity is needed. 'refreshMintingKCStorage', + // OT-RFC-40 round-4 review fix: TS-private helper used by + // `getLatestMerkleRoot` / `getMerkleLeafCount` to instantiate a + // one-shot Contract bound to a caller-supplied storage address + // (kcs-ack-based future versions). Mock adapter doesn't model + // multiple storage instances at all — its in-memory `collections` + // map is single-storage, so the equivalent affordance is just + // ignoring the optional `opts.storageContract` argument. + 'kcStorageForRead', ]); const NO_CHAIN_EXEMPT_FROM_EVM = new Set([ diff --git a/packages/core/src/constants.ts b/packages/core/src/constants.ts index c778c86b4..dcba20f84 100644 --- a/packages/core/src/constants.ts +++ b/packages/core/src/constants.ts @@ -259,15 +259,26 @@ export interface ParsedUal { * checks keep firing on those suffixed subjects instead of silently * skipping (Codex review on PR #718). * - * Disambiguation between the 3- and 4-segment forms when more than 3 - * segments are present: + * Disambiguation when more than 3 segments are present (the trick is + * that real CAIP-2 chainIds always contain a `:` — e.g. `base:84532`, + * `eip155:1` — so they NEVER match `STORAGE_TAG_PATTERN`. Conversely, + * a real storage tag like `v9` or `v11` MUST match + * `STORAGE_TAG_PATTERN` and CANNOT contain `:`. That's why we can + * disambiguate without ambiguity by inspecting `segments[0]` first): * - * - If `segments[0]` matches `STORAGE_TAG_PATTERN` AND `segments[2]` - * matches `PUBLISHER_ADDRESS_PATTERN` → tagged form (and - * `segments[3]` is the local id). - * - Else if `segments[1]` matches `PUBLISHER_ADDRESS_PATTERN` → - * default-storage form (`segments[0]` chainId, `segments[2]` - * local id, anything beyond that is the per-KA suffix). + * - If `segments[0]` matches `STORAGE_TAG_PATTERN` (i.e. it looks + * like a tag, not a chainId), the input MUST be the 4-segment + * tagged form. We require `segments[2]` to match + * `PUBLISHER_ADDRESS_PATTERN`. If it doesn't, the input is a + * malformed tagged UAL and we return `null` rather than silently + * falling back to the default-storage interpretation, because + * misclassifying a malformed `did:dkg:v9/0xPub/123/1` as a + * default-storage UAL (chainId="v9") would skip the V9 range + * check in publish-handler (Codex review on PR #718, round 4). + * - Else if `segments[1]` matches `PUBLISHER_ADDRESS_PATTERN`, + * it's the default-storage form (`segments[0]` chainId, + * `segments[2]` local id, anything beyond that is the per-KA + * suffix and is discarded). * - Else → null (unrecognised shape). * * The `PUBLISHER_ADDRESS_PATTERN` check is what disambiguates real @@ -294,21 +305,29 @@ export function parseUal(ual: string | undefined | null): ParsedUal | null { let publisherAddress: string; let localIdSegment: string; - // Disambiguation order: tagged form is checked first because its - // detector is strictly stronger (requires segments[0] to match - // STORAGE_TAG_PATTERN, which CAIP-2 chainIds like `base:84532` fail - // because they contain ':'). Default-storage detection is the - // fallback. - if ( - segments.length >= 4 && - STORAGE_TAG_PATTERN.test(segments[0]) && - PUBLISHER_ADDRESS_PATTERN.test(segments[2]) - ) { + // Inspect segments[0] first: real CAIP-2 chainIds contain `:` and + // thus NEVER match STORAGE_TAG_PATTERN, while real storage tags + // ALWAYS match it. This makes the choice unambiguous and lets us + // reject malformed tagged UALs hard rather than silently demoting + // them to default-storage form (Codex review on PR #718, round 4). + if (STORAGE_TAG_PATTERN.test(segments[0])) { + // Tagged form. Require segments[2] to be a publisher address. + // If it isn't, the input is malformed (e.g. + // `did:dkg:v9/0xPub/123/1` is missing the chainId between the + // tag and publisher) — return null rather than fall back to the + // default-storage path, which would parse chainId="v9" and bypass + // the V9 range check downstream. + if (segments.length < 4 || !PUBLISHER_ADDRESS_PATTERN.test(segments[2])) { + return null; + } storageTag = segments[0]; chainId = segments[1]; publisherAddress = segments[2]; localIdSegment = segments[3]; } else if (PUBLISHER_ADDRESS_PATTERN.test(segments[1])) { + // Default-storage form. segments[0] is the chainId (must contain + // `:` per CAIP-2; that's why STORAGE_TAG_PATTERN.test rejected it + // above). storageTag = ''; chainId = segments[0]; publisherAddress = segments[1]; diff --git a/packages/core/test/constants.test.ts b/packages/core/test/constants.test.ts index f0cbf262c..1f38ccac1 100644 --- a/packages/core/test/constants.test.ts +++ b/packages/core/test/constants.test.ts @@ -241,6 +241,30 @@ describe('parseUal (OT-RFC-40 §5.2 — handles both 3- and 4-segment forms)', ( expect(parseUal('did:dkg:base:special/base:84532/0xPub/123')).toBeNull(); }); + it('rejects malformed tagged UALs without silently demoting to default form (Codex review on PR #718, round 4)', () => { + // `did:dkg:v9//` LOOKS like a tagged form because + // segments[0]="v9" matches STORAGE_TAG_PATTERN, but it's missing + // the chainId between the tag and the publisher. Pre-fix, this + // would silently fall back to the default-storage path and parse + // as `{chainId: "v9", storageTag: "", ...}`, which the publisher + // would then route through the V10 default range check instead + // of the V9 KAS pre-reserved-range check. Reject hard instead. + expect( + parseUal( + 'did:dkg:v9/0xA1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1/123', + ), + ).toBeNull(); + expect( + parseUal( + 'did:dkg:v9/0xA1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1/123/1', + ), + ).toBeNull(); + // Also rejects 4-segment shapes where segments[2] is plainly not + // a publisher address (random number, random string). + expect(parseUal('did:dkg:v11/base:84532/notapublisher/1')).toBeNull(); + expect(parseUal('did:dkg:v11/base:84532/12345/1')).toBeNull(); + }); + it('returns startKAId: null for non-numeric local-id slot (tentative publish form)', () => { // dkg-publisher.ts's tentative path uses `t${publishOperationId}` // — the parser must not reject these since they are valid in-flight diff --git a/packages/random-sampling/src/prover.ts b/packages/random-sampling/src/prover.ts index a68046eaa..0f6606a38 100644 --- a/packages/random-sampling/src/prover.ts +++ b/packages/random-sampling/src/prover.ts @@ -311,41 +311,42 @@ export class RandomSamplingProver { const kcId = challenge.knowledgeCollectionId; const chunkId = challenge.chunkId; - // OT-RFC-40 §7.5 + Codex review on PR #718 (Comment 1 of round 3): - // the storage-tag resolution MUST happen before any storage-bound - // chain read. `getKCContextGraphId`, `getLatestMerkleRoot`, and - // `getMerkleLeafCount` in the EVM adapter all hardcode the V10 - // default `KnowledgeCollectionStorage`; calling them for a tagged - // challenge would either return the wrong CG/root/leafCount or - // throw, both of which would land before this fail-closed branch - // could fire. Resolve the tag now — and skip the period — if the - // adapter's registry says the challenge belongs to anything other - // than the V10 default. + // OT-RFC-40 §7.5 + Codex review on PR #718: + // + // Round 3 Comment 1: storage-tag resolution MUST happen before + // any storage-bound chain read so a tagged challenge can't + // reach `getLatestMerkleRoot` / `getMerkleLeafCount` against + // the wrong contract. + // + // Round 4 Comment 1: tagged challenges MUST NOT be permanently + // unprovable. Future `kcs-ack-based` storages (V11+ extending + // the V10 family) share the V10 method shapes and ABI, so we + // route the chain reads to the challenge's specific storage + // address via `opts.storageContract`. Pre-reserved-range V9 + // KAS challenges remain fail-closed (its method shapes diverge + // from V10's; full V9 RS support is a follow-up RFC). // // Adapters without a registry at all (`kcStorageRegistry` is - // `undefined`) still get the legacy unfiltered behaviour because - // they explicitly opted out of tag-aware proving by not exposing - // a registry. + // `undefined`) get the legacy unfiltered behaviour because they + // explicitly opted out of tag-aware proving. let expectedStorageTag: string | undefined; + let storageContractForReads: string | undefined; if (this.chain.kcStorageRegistry) { - expectedStorageTag = this.chain.kcStorageRegistry.tagFor( - challenge.knowledgeCollectionStorageContract, - ); - if (expectedStorageTag === undefined) { + const registry = this.chain.kcStorageRegistry; + let entry = registry.getByAddress(challenge.knowledgeCollectionStorageContract); + if (entry === undefined) { // Stale cache / Hub event race / partial init failure — try a // single refresh before fail-closing. try { - await this.chain.kcStorageRegistry.refresh(); + await registry.refresh(); } catch (err) { this.log.warn('rs.tick.kcs-registry-refresh-failed', { err: err instanceof Error ? err.message : String(err), }); } - expectedStorageTag = this.chain.kcStorageRegistry.tagFor( - challenge.knowledgeCollectionStorageContract, - ); + entry = registry.getByAddress(challenge.knowledgeCollectionStorageContract); } - if (expectedStorageTag === undefined) { + if (entry === undefined) { const storageAddress = challenge.knowledgeCollectionStorageContract; this.log.warn('rs.tick.kcs-registry-miss', { kcId: kcId.toString(), @@ -363,40 +364,69 @@ export class RandomSamplingProver { }, }), ); - // No reliable cgId yet (existing-challenge branch hasn't - // queried it because the read is storage-bound). Use the chain- - // side cgId from the createChallenge path when available, - // otherwise 0 to signal "unknown CG". return { kind: 'kc-not-synced', kcId, cgId: cgIdFromCreate ?? 0n }; } - if (expectedStorageTag !== '') { - // The chain reads below are bound to the V10 default KC - // storage. Tagged-storage RS routing requires those reads to - // grow a tag/address parameter — tracked as a follow-up to - // RFC-40. Until then, a non-empty tag is a hard fail-closed. - this.log.warn('rs.tick.tagged-storage-rs-unsupported', { - kcId: kcId.toString(), - storageContract: challenge.knowledgeCollectionStorageContract, - expectedStorageTag, - }); - await this.wal.append( - makeWalEntry(periodKey, 'failed', { + expectedStorageTag = entry.tag; + // Route chain reads explicitly only when the storage isn't the + // adapter's bound default. The default-tag path keeps using the + // implicit (no-opts) read so existing tests + adapters that + // don't model the opts argument continue to work bit-for-bit. + if (entry.tag !== '') { + if (entry.authMode === 'kcs-ack-based') { + // V10 family: stable method shapes across versions, route by + // the challenge's concrete storage address. + storageContractForReads = entry.address; + } else if (entry.authMode === 'kas-pre-reserved-range') { + // V9 KAS: read-side method shapes (publishCount + per-batch + // root) differ from V10. RS support tracked as a follow-up. + this.log.warn('rs.tick.kas-rs-unsupported', { kcId: kcId.toString(), - chunkId: chunkId.toString(), - error: { - code: 'tagged-storage-rs-unsupported', - message: - `Random-sampling chain reads are bound to the V10 default KC storage; ` + - `challenge storage tag "${expectedStorageTag}" is not yet routed end-to-end`, - }, - }), - ); - return { kind: 'kc-not-synced', kcId, cgId: cgIdFromCreate ?? 0n }; + storageContract: challenge.knowledgeCollectionStorageContract, + expectedStorageTag, + }); + await this.wal.append( + makeWalEntry(periodKey, 'failed', { + kcId: kcId.toString(), + chunkId: chunkId.toString(), + error: { + code: 'kas-rs-unsupported', + message: + `V9 KnowledgeAssetsStorage random-sampling reads are not yet supported; ` + + `challenge storage tag "${expectedStorageTag}" deferred to a follow-up RFC`, + }, + }), + ); + return { kind: 'kc-not-synced', kcId, cgId: cgIdFromCreate ?? 0n }; + } else { + // unknown authMode: fail closed. + this.log.warn('rs.tick.tagged-storage-unknown-auth', { + kcId: kcId.toString(), + storageContract: challenge.knowledgeCollectionStorageContract, + expectedStorageTag, + hubName: entry.hubName, + }); + await this.wal.append( + makeWalEntry(periodKey, 'failed', { + kcId: kcId.toString(), + chunkId: chunkId.toString(), + error: { + code: 'tagged-storage-unknown-auth', + message: + `KC storage "${entry.hubName}" at ${entry.address} has an unknown auth ` + + `mode; refusing to prove without an explicit policy`, + }, + }), + ); + return { kind: 'kc-not-synced', kcId, cgId: cgIdFromCreate ?? 0n }; + } } } - // Storage tag is the default V10 (or registry isn't present). The - // existing chain reads are now safe to use. + // Storage tag is either the default V10 or a `kcs-ack-based` + // future extension (which shares the V10 read API). Either way the + // chain reads below are safe — the optional storageContract arg is + // undefined for the default path and the resolved entry address + // for the tagged path. cgId = cgIdFromCreate ?? (await this.chain.getKCContextGraphId(kcId)); await this.wal.append( @@ -404,6 +434,7 @@ export class RandomSamplingProver { kcId: kcId.toString(), cgId: cgId.toString(), chunkId: chunkId.toString(), + storageTag: expectedStorageTag ?? '', }), ); @@ -418,8 +449,12 @@ export class RandomSamplingProver { return { kind: 'cg-not-found', kcId }; } - const expectedRoot = await this.chain.getLatestMerkleRoot(kcId); - const expectedLeafCount = await this.chain.getMerkleLeafCount(kcId); + const expectedRoot = await this.chain.getLatestMerkleRoot(kcId, { + storageContract: storageContractForReads, + }); + const expectedLeafCount = await this.chain.getMerkleLeafCount(kcId, { + storageContract: storageContractForReads, + }); let leaves: Uint8Array[]; try { diff --git a/packages/random-sampling/test/prover.test.ts b/packages/random-sampling/test/prover.test.ts index ae2f6cf42..a5942da14 100644 --- a/packages/random-sampling/test/prover.test.ts +++ b/packages/random-sampling/test/prover.test.ts @@ -54,6 +54,14 @@ interface FakeChainState { * "registry-less" mode where `expectedStorageTag` is undefined and * the extractor falls back to its legacy unfiltered behaviour. */ kcStorageRegistry?: ChainAdapter['kcStorageRegistry']; + /** OT-RFC-40 §7.5 (Codex review on PR #718, round 4): optional + * override hooks for the chain reads that take the new optional + * `opts.storageContract` argument. Tests that need to assert the + * routing parameter is passed correctly inject custom spies; tests + * that don't care use the default `state.expectedRoot` / + * `state.expectedLeafCount` constant returns. */ + getLatestMerkleRoot?: ChainAdapter['getLatestMerkleRoot']; + getMerkleLeafCount?: ChainAdapter['getMerkleLeafCount']; } function makeChain(state: FakeChainState): ChainAdapter { @@ -64,8 +72,8 @@ function makeChain(state: FakeChainState): ChainAdapter { getActiveProofPeriodStatus: vi.fn(async () => state.status), getNodeChallenge: vi.fn(async () => state.challengeForNode), createChallenge: vi.fn(state.createChallenge), - getLatestMerkleRoot: vi.fn(async () => state.expectedRoot), - getMerkleLeafCount: vi.fn(async () => state.expectedLeafCount), + getLatestMerkleRoot: state.getLatestMerkleRoot ?? vi.fn(async () => state.expectedRoot), + getMerkleLeafCount: state.getMerkleLeafCount ?? vi.fn(async () => state.expectedLeafCount), getKCContextGraphId: vi.fn(async () => state.cgIdForKc), submitProof: vi.fn(state.submitProof), }; @@ -738,8 +746,8 @@ describe('RandomSamplingProver — multi-storage routing (OT-RFC-40, Codex revie // The "stale cache" recovery attempt fails to discover the // challenged storage. The prover must escalate to fail-closed. }); - const tagFor = vi.fn(() => undefined); - const fakeRegistry = { refresh, tagFor } as unknown as NonNullable; + const getByAddress = vi.fn(() => undefined); + const fakeRegistry = { refresh, getByAddress } as unknown as NonNullable; const challengeStorage = '0x000000000000000000000000000000000000beef'; const submitProof = vi.fn(async () => ({ hash: '0xunused', blockNumber: 1, success: true })); @@ -768,9 +776,9 @@ describe('RandomSamplingProver — multi-storage routing (OT-RFC-40, Codex revie expect(outcome).toEqual({ kind: 'kc-not-synced', kcId: fixture.kcId, cgId: fixture.cgId }); // Refresh was attempted exactly once before fail-closing. expect(refresh).toHaveBeenCalledTimes(1); - // tagFor consulted twice: once before refresh, once after. - expect(tagFor).toHaveBeenCalledTimes(2); - expect(tagFor).toHaveBeenCalledWith(challengeStorage); + // getByAddress consulted twice: once before refresh, once after. + expect(getByAddress).toHaveBeenCalledTimes(2); + expect(getByAddress).toHaveBeenCalledWith(challengeStorage); expect(submitProof).not.toHaveBeenCalled(); const failed = (await wal.readAll()).find((e) => e.status === 'failed'); expect(failed?.error?.code).toBe('kcs-registry-miss'); @@ -795,14 +803,26 @@ describe('RandomSamplingProver — multi-storage routing (OT-RFC-40, Codex revie const { root, leafCount } = await seedKC(store, fixture); const refresh = vi.fn(async () => {}); + const challengeStorage = '0x000000000000000000000000000000000000face'; // The default-storage tag round-trip confirms the registry path // is wired without changing observable behaviour vs the // registry-less mode (legacy 3-segment UALs end up filtered by - // `expectedStorageTag === ''`, which matches the seed UAL). - const tagFor = vi.fn(() => ''); - const fakeRegistry = { refresh, tagFor } as unknown as NonNullable; + // `expectedStorageTag === ''`, which matches the seed UAL). The + // registry returns a full entry now that the prover routes by + // authMode (Codex review on PR #718, round 4). + const defaultEntry = { + hubName: 'KnowledgeCollectionStorage', + address: challengeStorage, + uriBase: 'did:dkg', + tag: '', + authMode: 'kcs-ack-based' as const, + }; + const getByAddress = vi.fn(() => defaultEntry); + const fakeRegistry = { refresh, getByAddress } as unknown as NonNullable; const submitProof = vi.fn(async () => ({ hash: '0xresolved', blockNumber: 1, success: true })); + const getLatestMerkleRoot = vi.fn(async () => root); + const getMerkleLeafCount = vi.fn(async () => leafCount); const chain = makeChain({ status: { activeProofPeriodStartBlock: 1000n, isValid: true }, challengeForNode: null, @@ -810,7 +830,7 @@ describe('RandomSamplingProver — multi-storage routing (OT-RFC-40, Codex revie challenge: makeChallenge({ knowledgeCollectionId: fixture.kcId, chunkId: 0n, - knowledgeCollectionStorageContract: '0x000000000000000000000000000000000000face', + knowledgeCollectionStorageContract: challengeStorage, }), contextGraphId: fixture.cgId, hash: '0x', blockNumber: 1, success: true, @@ -820,43 +840,175 @@ describe('RandomSamplingProver — multi-storage routing (OT-RFC-40, Codex revie cgIdForKc: fixture.cgId, submitProof: submitProof as never, kcStorageRegistry: fakeRegistry, + getLatestMerkleRoot: getLatestMerkleRoot as never, + getMerkleLeafCount: getMerkleLeafCount as never, }); const prover = new RandomSamplingProver({ chain, store, identityId: IDENTITY_ID }); const outcome = await prover.tick(); expect(outcome.kind).toBe('submitted'); - // Registry was hit once (no refresh needed because tagFor returned - // a defined value on the first call). - expect(tagFor).toHaveBeenCalledTimes(1); + // Registry was hit once (no refresh needed because getByAddress + // returned a defined entry on the first call). + expect(getByAddress).toHaveBeenCalledTimes(1); expect(refresh).not.toHaveBeenCalled(); expect(submitProof).toHaveBeenCalledTimes(1); + // Default-tag path: chain reads called WITHOUT a storageContract + // routing arg (i.e. opts.storageContract === undefined), so the + // adapter uses its bound default. Round 4 rule: only tagged + // storages override the implicit default-read path. + expect(getLatestMerkleRoot).toHaveBeenCalledWith(fixture.kcId, { storageContract: undefined }); + expect(getMerkleLeafCount).toHaveBeenCalledWith(fixture.kcId, { storageContract: undefined }); await prover.close(); }); - it('skips period when the registry resolves a non-default-storage tag (Codex review on PR #718, Comment 1 of round 2)', async () => { - // The chain reads `getKCContextGraphId`, `getLatestMerkleRoot`, - // `getMerkleLeafCount` are all bound to the V10 default KC - // storage in the EVM adapter today — they don't accept a storage - // tag/address argument. Filtering the local-store extractor by a - // non-empty tag would only close the disambiguation gap on the - // read side; the chain reads above would still query V10 KCS, - // verifying our local data against the WRONG contract's expected - // root and leaf count. Fail-closed with an explicit - // `tagged-storage-rs-unsupported` reason until those reads grow - // a tag/address parameter (tracked as RFC-40 follow-up). + it('routes chain reads through opts.storageContract for kcs-ack-based tagged challenges (V11+) (Codex review on PR #718, Comment 1 of round 4)', async () => { + // The PR's promise: future KC storage versions go live without + // operator action. For random sampling specifically, that means a + // V11 (or any future `kcs-ack-based`) storage's KC must be + // provable end-to-end — the prover routes `getLatestMerkleRoot` + // and `getMerkleLeafCount` to the challenge's specific storage + // address via the new `opts.storageContract` parameter, instead + // of querying the adapter's bound default contract (which would + // return root/leafCount for a different KC, mismatching local + // data and slashing the node). const store = new OxigraphStore(); const fixture: KCFixture = { cgId: 11n, kcId: 7n, - ual: 'did:dkg:hardhat:31337/0xA1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1/7', + ual: 'did:dkg:v11-future/hardhat:31337/0xA1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1/7', rootEntities: ['urn:e:1'], publicTriples: [{ subject: 'urn:e:1', predicate: 'urn:p:k', object: '"a"' }], }; const { root, leafCount } = await seedKC(store, fixture); + const v11Address = '0x00000000000000000000000000000000000000b1'; const refresh = vi.fn(async () => {}); - const tagFor = vi.fn(() => 'v9'); - const fakeRegistry = { refresh, tagFor } as unknown as NonNullable; + const v11Entry = { + hubName: 'KnowledgeCollectionStorageV11', + address: v11Address, + uriBase: 'did:dkg:v11-future', + tag: 'v11-future', + authMode: 'kcs-ack-based' as const, + }; + const getByAddress = vi.fn(() => v11Entry); + const fakeRegistry = { refresh, getByAddress } as unknown as NonNullable; + + const submitProof = vi.fn(async () => ({ hash: '0xv11-proven', blockNumber: 1, success: true })); + const getLatestMerkleRoot = vi.fn(async (_kcId: bigint, _opts?: { storageContract?: string }) => root); + const getMerkleLeafCount = vi.fn(async (_kcId: bigint, _opts?: { storageContract?: string }) => leafCount); + const chain = makeChain({ + status: { activeProofPeriodStartBlock: 1000n, isValid: true }, + challengeForNode: null, + createChallenge: async () => ({ + challenge: makeChallenge({ + knowledgeCollectionId: fixture.kcId, + chunkId: 0n, + knowledgeCollectionStorageContract: v11Address, + }), + contextGraphId: fixture.cgId, + hash: '0x', blockNumber: 1, success: true, + }), + expectedRoot: root, + expectedLeafCount: leafCount, + cgIdForKc: fixture.cgId, + submitProof: submitProof as never, + kcStorageRegistry: fakeRegistry, + getLatestMerkleRoot: getLatestMerkleRoot as never, + getMerkleLeafCount: getMerkleLeafCount as never, + }); + + const prover = new RandomSamplingProver({ chain, store, identityId: IDENTITY_ID }); + const outcome = await prover.tick(); + expect(outcome.kind).toBe('submitted'); + expect(submitProof).toHaveBeenCalledTimes(1); + // Critical assertion: the routing arg was passed through to the + // chain reads as the V11 storage's address, not the bound default. + expect(getLatestMerkleRoot).toHaveBeenCalledWith(fixture.kcId, { storageContract: v11Address }); + expect(getMerkleLeafCount).toHaveBeenCalledWith(fixture.kcId, { storageContract: v11Address }); + await prover.close(); + }); + + it('skips period as kas-rs-unsupported when the challenge resolves to a V9 KAS (kas-pre-reserved-range) tagged storage', async () => { + // V9 KAS read APIs (publishCount + per-batch root) diverge from + // V10's, so we don't yet route the chain reads against KAS. + // Fail-closed with `kas-rs-unsupported` until a follow-up RFC + // adds explicit V9 RS support. + const store = new OxigraphStore(); + const fixture: KCFixture = { + cgId: 11n, + kcId: 7n, + ual: 'did:dkg:v9/hardhat:31337/0xA1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1/7', + rootEntities: ['urn:e:1'], + publicTriples: [{ subject: 'urn:e:1', predicate: 'urn:p:k', object: '"a"' }], + }; + const { root, leafCount } = await seedKC(store, fixture); + + const refresh = vi.fn(async () => {}); + const v9Entry = { + hubName: 'KnowledgeAssetsStorage', + address: '0x00000000000000000000000000000000000000a9', + uriBase: 'did:dkg:v9', + tag: 'v9', + authMode: 'kas-pre-reserved-range' as const, + }; + const getByAddress = vi.fn(() => v9Entry); + const fakeRegistry = { refresh, getByAddress } as unknown as NonNullable; + + const submitProof = vi.fn(async () => ({ hash: '0xunused', blockNumber: 1, success: true })); + const chain = makeChain({ + status: { activeProofPeriodStartBlock: 1000n, isValid: true }, + challengeForNode: null, + createChallenge: async () => ({ + challenge: makeChallenge({ + knowledgeCollectionId: fixture.kcId, + chunkId: 0n, + knowledgeCollectionStorageContract: v9Entry.address, + }), + contextGraphId: fixture.cgId, + hash: '0x', blockNumber: 1, success: true, + }), + expectedRoot: root, + expectedLeafCount: leafCount, + cgIdForKc: fixture.cgId, + submitProof: submitProof as never, + kcStorageRegistry: fakeRegistry, + }); + + const wal = new InMemoryProverWal(); + const prover = new RandomSamplingProver({ chain, store, identityId: IDENTITY_ID, wal }); + const outcome = await prover.tick(); + expect(outcome).toEqual({ kind: 'kc-not-synced', kcId: fixture.kcId, cgId: fixture.cgId }); + expect(submitProof).not.toHaveBeenCalled(); + const failed = (await wal.readAll()).find((e) => e.status === 'failed'); + expect(failed?.error?.code).toBe('kas-rs-unsupported'); + await prover.close(); + }); + + it('skips period as tagged-storage-unknown-auth when authMode is unknown', async () => { + // Defence-in-depth: if a future Hub registers a KC-class storage + // under a name pattern the registry doesn't have a policy for, + // the prover MUST refuse to attest a proof rather than implicit- + // routing to the V10 reader. + const store = new OxigraphStore(); + const fixture: KCFixture = { + cgId: 11n, + kcId: 7n, + ual: 'did:dkg:experimental/hardhat:31337/0xA1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1/7', + rootEntities: ['urn:e:1'], + publicTriples: [{ subject: 'urn:e:1', predicate: 'urn:p:k', object: '"a"' }], + }; + const { root, leafCount } = await seedKC(store, fixture); + + const refresh = vi.fn(async () => {}); + const unknownEntry = { + hubName: 'ExperimentalStorage', + address: '0x00000000000000000000000000000000000000ee', + uriBase: 'did:dkg:experimental', + tag: 'experimental', + authMode: 'unknown' as const, + }; + const getByAddress = vi.fn(() => unknownEntry); + const fakeRegistry = { refresh, getByAddress } as unknown as NonNullable; const submitProof = vi.fn(async () => ({ hash: '0xunused', blockNumber: 1, success: true })); const chain = makeChain({ @@ -866,7 +1018,7 @@ describe('RandomSamplingProver — multi-storage routing (OT-RFC-40, Codex revie challenge: makeChallenge({ knowledgeCollectionId: fixture.kcId, chunkId: 0n, - knowledgeCollectionStorageContract: '0x00000000000000000000000000000000000000a9', + knowledgeCollectionStorageContract: unknownEntry.address, }), contextGraphId: fixture.cgId, hash: '0x', blockNumber: 1, success: true, @@ -884,7 +1036,7 @@ describe('RandomSamplingProver — multi-storage routing (OT-RFC-40, Codex revie expect(outcome).toEqual({ kind: 'kc-not-synced', kcId: fixture.kcId, cgId: fixture.cgId }); expect(submitProof).not.toHaveBeenCalled(); const failed = (await wal.readAll()).find((e) => e.status === 'failed'); - expect(failed?.error?.code).toBe('tagged-storage-rs-unsupported'); + expect(failed?.error?.code).toBe('tagged-storage-unknown-auth'); await prover.close(); }); }); From 2773ae81941b35a14c245bb0c66df680390b8cc5 Mon Sep 17 00:00:00 2001 From: Branimir Rakic Date: Wed, 27 May 2026 03:42:07 +0200 Subject: [PATCH 12/12] fix(rfc-40): address Codex review on PR #718 round 5 (2 substantive findings) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Round 5 closes the regressions the round-4 fixes introduced. 1) packages/core/src/constants.ts (round-5 comment 1): the round-4 parseUal disambiguation rule "if segments[0] matches STORAGE_TAG_PATTERN, the input MUST be the tagged form" was too strict. It rejected 3-segment UALs with no-colon chainIds, breaking `did:dkg:none//` round-tripping for the no-chain `DKGPublisher` mode (`chain.chainId === 'none'` emits exactly that shape; tests in `publisher-no-random-wallet.test.ts` regex-match it). The round-4 round-trip rule actually only applies to 4+ segment inputs — a 3-segment UAL is unambiguously default-storage form regardless of what segments[0] looks like, because the tagged form REQUIRES 4 segments to fit (tag, chainId, publisher, localId). Restructure the disambiguation: * 3 segments → ALWAYS default form. Accepts no-colon chainIds like `none`, `1`, etc. * 4+ segments AND segments[0] matches STORAGE_TAG_PATTERN → MUST be tagged form. segments[2] not a publisher → reject (preserves round-4 fix). * 4+ segments AND segments[0] has `:` (real CAIP-2 chainId) → default form with per-KA suffix. Test changes split the previous "rejects 3- and 4-segment malformed tagged UALs" test into the still-valid 4+ segment rejection plus a new positive test pinning `did:dkg:none/...` round-trips for both confirmed and tentative (`t`) local ids. 2) packages/random-sampling/src/prover.ts (round-5 comment 2): the round-4 prover refactor added a `storageTag` diagnostics field to the `challenge` WAL entry, but `ProverWalEntry` in `random-sampling/src/wal.ts` doesn't define `storageTag`. The field was a TS excess-property write through `Omit`; vitest's esbuild transformer doesn't typecheck so runtime tests passed, but `tsc -p packages/random-sampling/tsconfig.json` rejects the build. Drop the field — the WAL `failed` entries already capture `storageTag` indirectly via the `kas-rs-unsupported` / `tagged-storage-unknown-auth` / `kcs-registry-miss` error codes, which is enough for diagnostics. Adding the field to `ProverWalEntry` was the alternative; left for a future schema bump if a use case emerges. All four packages green: chain (424), core (955), random-sampling (57, e2e excluded), publisher (1051). `tsc --noEmit` also passes on all four. Co-authored-by: Cursor --- packages/core/src/constants.ts | 79 ++++++++++++++------------ packages/core/test/constants.test.ts | 54 +++++++++++++----- packages/random-sampling/src/prover.ts | 1 - 3 files changed, 83 insertions(+), 51 deletions(-) diff --git a/packages/core/src/constants.ts b/packages/core/src/constants.ts index dcba20f84..13a5bb960 100644 --- a/packages/core/src/constants.ts +++ b/packages/core/src/constants.ts @@ -259,27 +259,31 @@ export interface ParsedUal { * checks keep firing on those suffixed subjects instead of silently * skipping (Codex review on PR #718). * - * Disambiguation when more than 3 segments are present (the trick is - * that real CAIP-2 chainIds always contain a `:` — e.g. `base:84532`, - * `eip155:1` — so they NEVER match `STORAGE_TAG_PATTERN`. Conversely, - * a real storage tag like `v9` or `v11` MUST match - * `STORAGE_TAG_PATTERN` and CANNOT contain `:`. That's why we can - * disambiguate without ambiguity by inspecting `segments[0]` first): + * Disambiguation: * - * - If `segments[0]` matches `STORAGE_TAG_PATTERN` (i.e. it looks - * like a tag, not a chainId), the input MUST be the 4-segment - * tagged form. We require `segments[2]` to match + * - 3 segments → ALWAYS default-storage form (`segments[0]` + * chainId, `segments[1]` publisher, `segments[2]` local id). The + * chainId can be any non-empty token, including no-colon + * sentinels like `none` (used by `DKGPublisher` in no-chain + * mode) or `1` (a hypothetical bare-numeric chainId). Codex + * review on PR #718, round 5 surfaced this — an over-strict + * rule that treated a tag-shaped `segments[0]` as proof-of- + * tagged-form broke `did:dkg:none//` round-tripping for + * locally published KCs. + * - 4+ segments AND `segments[0]` matches `STORAGE_TAG_PATTERN` + * (i.e. no colon — real CAIP-2 chainIds like `base:84532` + * ALWAYS contain `:` and so NEVER match the pattern) → MUST be + * the tagged form. We require `segments[2]` to match * `PUBLISHER_ADDRESS_PATTERN`. If it doesn't, the input is a * malformed tagged UAL and we return `null` rather than silently * falling back to the default-storage interpretation, because - * misclassifying a malformed `did:dkg:v9/0xPub/123/1` as a - * default-storage UAL (chainId="v9") would skip the V9 range - * check in publish-handler (Codex review on PR #718, round 4). - * - Else if `segments[1]` matches `PUBLISHER_ADDRESS_PATTERN`, - * it's the default-storage form (`segments[0]` chainId, - * `segments[2]` local id, anything beyond that is the per-KA - * suffix and is discarded). - * - Else → null (unrecognised shape). + * misclassifying `did:dkg:v9/0xPub/123/1` as a default-storage + * UAL (chainId="v9") would skip the V9 range check in publish- + * handler (Codex review on PR #718, round 4). + * - 4+ segments AND `segments[0]` does NOT match + * `STORAGE_TAG_PATTERN` (has `:`, e.g. `base:84532`) → default- + * storage form. Trailing segments after `segments[2]` are + * discarded (per-KA suffix tolerance — see PR #718 round 1). * * The `PUBLISHER_ADDRESS_PATTERN` check is what disambiguates real * UALs from same-prefix CG data URIs (`did:dkg:context-graph:...`). @@ -305,29 +309,34 @@ export function parseUal(ual: string | undefined | null): ParsedUal | null { let publisherAddress: string; let localIdSegment: string; - // Inspect segments[0] first: real CAIP-2 chainIds contain `:` and - // thus NEVER match STORAGE_TAG_PATTERN, while real storage tags - // ALWAYS match it. This makes the choice unambiguous and lets us - // reject malformed tagged UALs hard rather than silently demoting - // them to default-storage form (Codex review on PR #718, round 4). - if (STORAGE_TAG_PATTERN.test(segments[0])) { - // Tagged form. Require segments[2] to be a publisher address. - // If it isn't, the input is malformed (e.g. - // `did:dkg:v9/0xPub/123/1` is missing the chainId between the - // tag and publisher) — return null rather than fall back to the - // default-storage path, which would parse chainId="v9" and bypass - // the V9 range check downstream. - if (segments.length < 4 || !PUBLISHER_ADDRESS_PATTERN.test(segments[2])) { - return null; - } + // 3-segment form is unambiguously default-storage. This includes + // no-chain `did:dkg:none//` UALs from DKGPublisher's + // local-only path (Codex review on PR #718, round 5) and any + // future bare-numeric chainId like `did:dkg:1//`. The + // distinction between "v9 as chainId" and "v9 as storage tag" + // can only be made when 4+ segments are present. + if (segments.length === 3) { + if (!PUBLISHER_ADDRESS_PATTERN.test(segments[1])) return null; + storageTag = ''; + chainId = segments[0]; + publisherAddress = segments[1]; + localIdSegment = segments[2]; + } else if (STORAGE_TAG_PATTERN.test(segments[0])) { + // 4+ segments AND segments[0] looks tag-shaped (no colon — real + // CAIP-2 chainIds with `:` never match STORAGE_TAG_PATTERN). MUST + // be the tagged form. If segments[2] is not a publisher this is + // a malformed tagged UAL — reject hard rather than silently fall + // back to the default-storage interpretation, which would skip + // the V9 range check downstream (Codex review on PR #718, round + // 4 example: `did:dkg:v9/0xPub/123/1` is missing the chainId). + if (!PUBLISHER_ADDRESS_PATTERN.test(segments[2])) return null; storageTag = segments[0]; chainId = segments[1]; publisherAddress = segments[2]; localIdSegment = segments[3]; } else if (PUBLISHER_ADDRESS_PATTERN.test(segments[1])) { - // Default-storage form. segments[0] is the chainId (must contain - // `:` per CAIP-2; that's why STORAGE_TAG_PATTERN.test rejected it - // above). + // 4+ segments with a colon-containing chainId in segments[0] — + // default-storage form with a per-KA suffix beyond segments[2]. storageTag = ''; chainId = segments[0]; publisherAddress = segments[1]; diff --git a/packages/core/test/constants.test.ts b/packages/core/test/constants.test.ts index 1f38ccac1..6d84f980a 100644 --- a/packages/core/test/constants.test.ts +++ b/packages/core/test/constants.test.ts @@ -241,30 +241,54 @@ describe('parseUal (OT-RFC-40 §5.2 — handles both 3- and 4-segment forms)', ( expect(parseUal('did:dkg:base:special/base:84532/0xPub/123')).toBeNull(); }); - it('rejects malformed tagged UALs without silently demoting to default form (Codex review on PR #718, round 4)', () => { - // `did:dkg:v9//` LOOKS like a tagged form because - // segments[0]="v9" matches STORAGE_TAG_PATTERN, but it's missing - // the chainId between the tag and the publisher. Pre-fix, this - // would silently fall back to the default-storage path and parse - // as `{chainId: "v9", storageTag: "", ...}`, which the publisher - // would then route through the V10 default range check instead - // of the V9 KAS pre-reserved-range check. Reject hard instead. - expect( - parseUal( - 'did:dkg:v9/0xA1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1/123', - ), - ).toBeNull(); + it('rejects malformed 4+-segment tagged UALs without silently demoting to default form (Codex review on PR #718, round 4)', () => { + // `did:dkg:v9/0xPub/123/1` is the canonical malformed tagged + // shape: segments[0]="v9" looks tag-shaped, but the chainId + // between tag and publisher is missing, so segments[2]="123" is + // not a publisher address. Pre-fix, this would silently fall + // back to the default-storage path with chainId="v9" and bypass + // the V9 KAS range check. Reject instead. expect( parseUal( 'did:dkg:v9/0xA1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1/123/1', ), ).toBeNull(); - // Also rejects 4-segment shapes where segments[2] is plainly not - // a publisher address (random number, random string). + // Other malformed 4-segment shapes where segments[2] is plainly + // not a publisher address. expect(parseUal('did:dkg:v11/base:84532/notapublisher/1')).toBeNull(); expect(parseUal('did:dkg:v11/base:84532/12345/1')).toBeNull(); }); + it('preserves no-chain `did:dkg:none//` round-tripping (Codex review on PR #718, round 5)', () => { + // DKGPublisher emits `did:dkg:none//` in no-chain mode + // (`chain.chainId === 'none'`). Despite "none" matching + // STORAGE_TAG_PATTERN, a 3-segment UAL is unambiguously the + // default-storage form per RFC §5.2 — the tag-vs-chainId + // disambiguation only kicks in for 4+ segments. The previous + // round-4 fix was too aggressive and broke this round-trip. + const parsed = parseUal( + 'did:dkg:none/0xA1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1/12345', + ); + expect(parsed).toEqual({ + chainId: 'none', + storageTag: '', + publisherAddress: '0xA1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1', + startKAId: 12345n, + }); + // Tentative no-chain UALs (synthetic `t` local id) also + // round-trip — DKGPublisher's no-random-wallet test pins this + // exact shape via regex. + const tentative = parseUal( + 'did:dkg:none/0xA1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1/tabcd-1', + ); + expect(tentative).toEqual({ + chainId: 'none', + storageTag: '', + publisherAddress: '0xA1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1', + startKAId: null, + }); + }); + it('returns startKAId: null for non-numeric local-id slot (tentative publish form)', () => { // dkg-publisher.ts's tentative path uses `t${publishOperationId}` // — the parser must not reject these since they are valid in-flight diff --git a/packages/random-sampling/src/prover.ts b/packages/random-sampling/src/prover.ts index 0f6606a38..b096078b0 100644 --- a/packages/random-sampling/src/prover.ts +++ b/packages/random-sampling/src/prover.ts @@ -434,7 +434,6 @@ export class RandomSamplingProver { kcId: kcId.toString(), cgId: cgId.toString(), chunkId: chunkId.toString(), - storageTag: expectedStorageTag ?? '', }), );