383 changes: 383 additions & 0 deletions docs/RFC40_MULTI_STORAGE_KC_URI_SCHEME.md

Large diffs are not rendered by default.

176 changes: 176 additions & 0 deletions docs/STORAGE_VERSION_TAGS.md
@@ -0,0 +1,176 @@
# Storage Version Tags — operator guide

**Status:** stable as of OT-RFC-40 (PR-1 → PR-5)
**Companion:** [`docs/RFC40_MULTI_STORAGE_KC_URI_SCHEME.md`](./RFC40_MULTI_STORAGE_KC_URI_SCHEME.md), [`docs/TESTNET_RESET.md`](./TESTNET_RESET.md)

## TL;DR

Every Knowledge Collection UAL embeds the **storage instance** that
minted it. That tag is what makes V9 and V10 KCs coexist on the same
Hub today, and it is what will make V11, V12, and any future storage
upgrade additive (no chain-state wipe, no data loss) instead of
destructive.

You will see two valid UAL forms in the wild and in your own logs:

```
Default storage (legacy, V10 today): did:dkg:{chainId}/{publisher}/{startKAId}
Tagged storage (V9 today, V11+...): did:dkg:{tag}/{chainId}/{publisher}/{startKAId}
```

`{tag}` matches `[a-z0-9-]+`. A KC's tag is **forever**: once a KC is
minted, its UAL is never re-tagged.

## What the tag is, in one sentence

A storage instance's `uri(0)` ERC-1155 view returns its `uriBase`
(`did:dkg`, `did:dkg:v9`, `did:dkg:v11`, …). The agent strips the
common `did:dkg` prefix and the `:` separator after it; whatever
remains is the tag (empty for the default storage, whose `uriBase` is
exactly `did:dkg`). The tag is the storage's URI suffix, so you can
read it straight from the contract at any time.
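
The derivation described above can be sketched in a few lines. This is an illustration only, not the code in `packages/chain/src/kc-storage-registry.ts`: the name `deriveTagSketch` and the choice to return `null` for a malformed `uriBase` are assumptions made for the example.

```typescript
// Illustrative sketch; names and null-on-error behaviour are assumptions.
const URI_BASE_ROOT = "did:dkg";     // uriBase of the default storage
const TAG_PATTERN = /^[a-z0-9-]+$/;  // tag grammar stated in this guide

/**
 * Maps a storage's uriBase to its tag.
 * Returns "" for the default storage and null for a malformed uriBase.
 */
function deriveTagSketch(uriBase: string): string | null {
  if (uriBase === URI_BASE_ROOT) return "";
  const prefix = URI_BASE_ROOT + ":";
  if (!uriBase.startsWith(prefix)) return null;
  const tag = uriBase.slice(prefix.length);
  return TAG_PATTERN.test(tag) ? tag : null;
}
```

Under these assumptions, `deriveTagSketch("did:dkg")` yields the empty tag and `deriveTagSketch("did:dkg:v11")` yields `"v11"`.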

## Why this matters to operators

Three things change on the operator-visible surface:

1. **Logs now show 4-segment UALs for V9 KAs.** This is normal:
`did:dkg:v9/base:84532/0xA1.../12345`. Treat it identically to a
V10 UAL — the daemon already routes correctly.
2. **The `chain-reset-wipe` hook is no longer the right answer for
storage-only redeploys.** Bumping `network/<env>.json#chainResetMarker`
still works, but on any network where a new KC storage version was
deployed alongside the old one, the right cutover is "register the
new storage with a new `uriBase`" — old data keeps resolving to old
storage, new data uses new storage, no wipe needed. Reach for the
marker only when actual chain entities (CG ids, identity ids) are
redeployed too. See "When to bump the marker" below.
3. **There is no per-operator action for the multi-storage scheme to
take effect.** PR-3's registry runs at agent boot; it discovers
every registered KC-class storage on the Hub automatically and
refreshes on `Hub.NewAssetStorage` / `AssetStorageChanged` events.

## Reading a UAL

```
did:dkg:base:84532/0xA1.../12345
        └── chainId        └── tokenId on the default storage's contract

did:dkg:v9/base:84532/0xA1.../12345
        └── tag               └── tokenId on the V9 KAS contract
           └── chainId
```

Helper in code: `parseUal(ual)` from `@origintrail-official/dkg-core`.
It returns `{ chainId, storageTag, publisherAddress, startKAId }` or
`null` for malformed input. Both 3- and 4-segment forms parse without
ambiguity.
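
To make the two forms concrete, here is a minimal sketch of a parser with the same return shape. It is not the `parseUal` shipped in `@origintrail-official/dkg-core`: `parseUalSketch`, the regex names, and the requirement that the id segment be decimal digits are assumptions for illustration. The tag and publisher patterns are the ones quoted in this guide.

```typescript
// Illustrative sketch only; the real helper is parseUal() in dkg-core.
interface ParsedUalSketch {
  chainId: string;
  storageTag: string; // "" for the default storage
  publisherAddress: string;
  startKAId: bigint;
}

const UAL_PREFIX = "did:dkg:";
const TAG_RE = /^[a-z0-9-]+$/;
const PUBLISHER_RE = /^0x[0-9a-fA-F]{40}$/;
const KA_ID_RE = /^[0-9]+$/; // assumption: ids are decimal integers

function parseUalSketch(ual: string): ParsedUalSketch | null {
  if (!ual.startsWith(UAL_PREFIX)) return null;
  const segments = ual.slice(UAL_PREFIX.length).split("/");

  // A 4-segment UAL carries the tag first; a 3-segment UAL has no tag.
  let storageTag = "";
  if (segments.length === 4) {
    storageTag = segments.shift() ?? "";
    if (!TAG_RE.test(storageTag)) return null;
  } else if (segments.length !== 3) {
    return null;
  }

  const chainId = segments[0] ?? "";
  const publisherAddress = segments[1] ?? "";
  const id = segments[2] ?? "";
  if (chainId.length === 0) return null;
  if (!PUBLISHER_RE.test(publisherAddress) || !KA_ID_RE.test(id)) return null;

  return { chainId, storageTag, publisherAddress, startKAId: BigInt(id) };
}
```

The forms cannot be confused because a chain id such as `base:84532` contains a `:`, which the tag pattern forbids, and the publisher segment must be a 40-hex-character address.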

## How a new storage tag goes live

For maintainers and contract deployers (operators do nothing).

1. Deploy the new storage (e.g. `KnowledgeCollectionStorageV11`) with
a fresh `uriBase` of the form `did:dkg:<tag>` — e.g. `did:dkg:v11`.
The tag MUST match `[a-z0-9-]+` (no spaces, no slashes, no `:`).
2. Register it on Hub via `Hub.setAssetStorageContract(name, addr)` or
`Hub.setAndReinitializeContracts(...)`. The Hub emits
`NewAssetStorage` (or `AssetStorageChanged` if rotating an existing
slot).
3. Every running daemon refreshes its `KCStorageRegistry` on those
events. No restart needed — the next publish or random-sampling
challenge against the new storage just works.
4. Subsequent mints into the new storage produce
`did:dkg:<tag>/{chainId}/{publisher}/{startKAId}` UALs
automatically — the publisher reads `uri(0)` once at init and
stamps every UAL with that tag.

The old storage stays registered; old data keeps resolving. There is
no migration step.

## What never changes about a tag

- **The default storage's tag is empty, forever.** A storage whose
`uriBase` is exactly `did:dkg` produces 3-segment UALs; we will not
promote a different storage to be "the new default" because that
would require rewriting every legacy UAL. Keep the V10 storage at
`did:dkg`, and add V11, V12, … as tagged peers.
- **Tags are not retired.** If you need to take a storage out of
rotation (e.g. for safety), keep it deployed and just stop minting
into it. Existing UALs against that tag must continue to resolve.
- **Two storages MUST NOT share a tag.** The registry indexes by tag;
collision means one of them is unreachable. The deploy pipeline
should reject duplicate `uriBase` values; if you ever observe a
collision in `dkg_query`, that's a contracts-side bug, not a daemon
bug.

## How resolution works (for the curious)

```
Publish path (mint) Resolution path (read / verify)
────────────────── ──────────────────────────────
publisher.publish() kc-extractor.extractV10KC(...)
↓ ↓
ChainAdapter.mintingStorageTag parseUal(ual).storageTag
(read from KCS.uri(0) ↓
once at init, cached) KCStorageRegistry.getByTag(tag)
↓ ↓
kcUal(chainId, pub, id, tag) storage contract address
↓ ↓
4-segment if tag != '' query that contract
3-segment otherwise (publisher / merkle root / range)
```
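
The last step of the publish path is simple enough to show directly. The sketch below is an assumption-laden illustration, not the `kcUal()` in `packages/core/src/constants.ts`; it only follows the argument order shown in the diagram and the 3-segment / 4-segment rule.

```typescript
// Illustrative sketch; kcUalSketch is a hypothetical stand-in for kcUal().
function kcUalSketch(
  chainId: string,
  publisher: string,
  startKAId: bigint,
  tag: string = "",
): string {
  const tail = `${chainId}/${publisher}/${startKAId}`;
  // Empty tag: legacy 3-segment form. Non-empty tag: 4-segment form.
  return tag === "" ? `did:dkg:${tail}` : `did:dkg:${tag}/${tail}`;
}
```

Keeping the empty tag as the 3-segment form is what lets every pre-RFC UAL stay valid without rewriting.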

Random sampling is the most subtle case: the chain-side
`Challenge.knowledgeCollectionStorageContract` is the address of the
storage that holds the challenged KC. The prover passes that address
through `KCStorageRegistry.tagFor(addr)` to get the storage tag, then
filters its meta-graph lookup by that tag — so a prover holding two
KCs with the same `kcId` (one V9, one V10) attests to the right one.
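
A rough sketch of that lookup follows. The method names `getByTag` and `tagFor` come from this guide, but everything else (the class name, the `register` method, the `LocalKC` shape, `pickChallengedKC`, and the fail-closed behaviour for unknown storages) is hypothetical and only meant to show why filtering by tag disambiguates two KCs with the same `kcId`.

```typescript
// Illustrative sketch; not the real KCStorageRegistry implementation.
class KCStorageRegistrySketch {
  private byTag = new Map<string, string>();  // tag -> lowercased address
  private byAddr = new Map<string, string>(); // lowercased address -> tag

  register(tag: string, address: string): void {
    const key = address.toLowerCase();
    const existing = this.byTag.get(tag);
    if (existing !== undefined && existing !== key) {
      // Two storages must not share a tag: one would be unreachable.
      throw new Error(`duplicate storage tag "${tag}"`);
    }
    this.byTag.set(tag, key);
    this.byAddr.set(key, tag);
  }

  getByTag(tag: string): string | undefined {
    return this.byTag.get(tag);
  }

  tagFor(address: string): string | undefined {
    return this.byAddr.get(address.toLowerCase());
  }
}

interface LocalKC {
  kcId: bigint;
  storageTag: string; // "" for the default storage
}

/** Picks the locally held KC that matches both the kcId and the challenged storage. */
function pickChallengedKC(
  registry: KCStorageRegistrySketch,
  storageContract: string,
  kcId: bigint,
  local: LocalKC[],
): LocalKC | undefined {
  const tag = registry.tagFor(storageContract);
  if (tag === undefined) return undefined; // unknown storage: assumed fail-closed
  return local.find((kc) => kc.kcId === kcId && kc.storageTag === tag);
}
```

Note that `tagFor` returning `""` (the default storage) is a valid result and must not be confused with `undefined` (storage not registered), which is why the check above compares against `undefined` explicitly.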

## When to bump `chainResetMarker` after this RFC

Bump the marker when **non-storage** chain entities have been replaced
and the daemon's per-node state (publish journal, random-sampling WAL)
references entities that no longer exist. Case by case:

- Hub redeploy (new chain identity altogether): **bump.**
- IdentityStorage / Profile redeploy: **bump.**
- Context graph storage redeploy *with new ownership*: **bump.**
- KC storage v→v+1 redeploy (new `uriBase`, both registered): **don't
bump.** Old data is still valid; the multi-storage scheme handles it.
- KC storage same-tag redeploy (a "fix-forward" that reuses
`uri(0)`): **bump if and only if existing data on the old contract
is being abandoned.** Otherwise prefer redeploying as a new tag.

The auto-wipe hook itself is unchanged by this RFC — it remains the
right tool when chain identity actually changes.
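
For deploy tooling or review checklists, the cases above can be written down as a small decision function. This is a sketch: the `ChainChange` type, its `kind` labels, and `shouldBumpMarkerSketch` are hypothetical names invented here, and only the cases listed in this section are modelled.

```typescript
// Illustrative sketch; type and function names are hypothetical.
type ChainChange =
  | { kind: "hub-redeploy" }
  | { kind: "identity-or-profile-redeploy" }
  | { kind: "context-graph-storage-redeploy-new-ownership" }
  | { kind: "kc-storage-new-tag" } // new uriBase, old and new both registered
  | { kind: "kc-storage-same-tag"; oldDataAbandoned: boolean };

function shouldBumpMarkerSketch(change: ChainChange): boolean {
  switch (change.kind) {
    case "hub-redeploy":
    case "identity-or-profile-redeploy":
    case "context-graph-storage-redeploy-new-ownership":
      return true;
    case "kc-storage-new-tag":
      return false; // old data stays valid under its old tag
    case "kc-storage-same-tag":
      return change.oldDataAbandoned; // bump if and only if old data is abandoned
  }
}
```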

## Failure modes & how to debug them

| Symptom | First check |
|---|---|
| Publish gets a "publisher does not own UAL range …" rejection on a default-storage UAL | The receiver is on a Hub without a V9 KAS deployment. The V10 default path defers to ACK auth and should NOT be hitting the V9 `getPublisherRange*` API. If it is, the receiver hasn't picked up RFC-40 PR-5; upgrade. |
| Random-sampling proof fails with "no KC found" but the KC is in the local store | The prover may be looking for the wrong storage's KC. Check that `KCStorageRegistry.tagFor(challenge.knowledgeCollectionStorageContract)` returns a non-undefined tag. If undefined, the challenged storage isn't in the registry — refresh by triggering a `Hub.NewAssetStorage` event or restarting the daemon. |
| `parseUal()` returns `null` for what looks like a valid UAL | The publisher address must match `^0x[0-9a-fA-F]{40}$` — the parser intentionally rejects "UALs" that are actually CG data URIs (`did:dkg:context-graph:...`) or other DID forms. If your UAL doesn't have a valid 40-hex-character publisher segment, it's not a KC UAL. |
| Daemon log shows a UAL with an unfamiliar tag | This is fine — tags are forever. Look it up: `dkg query 'PREFIX hub: <…> SELECT ?addr WHERE { ?addr code:uriBase "did:dkg:<tag>" }'` or read `Hub.getAssetStorageAddress(...)` directly. |

## Cross-references

- [`packages/core/src/constants.ts`](../packages/core/src/constants.ts):
`kcUal()`, `parseUal()`, `publisherAddressFromUal()`,
`STORAGE_TAG_PATTERN`.
- [`packages/chain/src/kc-storage-registry.ts`](../packages/chain/src/kc-storage-registry.ts):
`KCStorageRegistry`, `deriveStorageTag()`.
- [`packages/chain/src/evm-adapter.ts`](../packages/chain/src/evm-adapter.ts):
`mintingStorageTag` field, `kcStorageRegistry` field, Hub event
listeners that refresh the registry, tag-aware
`verifyPublisherOwnsRange()`.
- [`packages/random-sampling/src/kc-extractor.ts`](../packages/random-sampling/src/kc-extractor.ts):
`extractV10KCFromStore({ expectedStorageTag })`.
- [`packages/cli/src/daemon/chain-reset-wipe.ts`](../packages/cli/src/daemon/chain-reset-wipe.ts):
the auto-wipe hook (unchanged by this RFC; still the right tool for
non-storage chain-identity changes).
- [`docs/TESTNET_RESET.md`](./TESTNET_RESET.md): full reset runbook
(Phases A-D).
- [`docs/RFC40_MULTI_STORAGE_KC_URI_SCHEME.md`](./RFC40_MULTI_STORAGE_KC_URI_SCHEME.md):
the design rationale, alternatives considered, and PR sequencing.
18 changes: 18 additions & 0 deletions docs/TESTNET_RESET.md
@@ -6,6 +6,19 @@ contract layout shipped in PR #357. Covers the three roles involved
actually has to do — most of the operator-facing pain is handled by the
daemon's built-in auto-update + supervised-restart.

> **Storage-only redeploys do NOT need this runbook.**
> OT-RFC-40 (see [`docs/STORAGE_VERSION_TAGS.md`](./STORAGE_VERSION_TAGS.md)
> and [`docs/RFC40_MULTI_STORAGE_KC_URI_SCHEME.md`](./RFC40_MULTI_STORAGE_KC_URI_SCHEME.md))
> formalises the URI scheme that lets V9, V10, V11, … KC storages
> coexist on the same Hub. Adding a new KC storage version is **not**
> a chain reset: register the new storage with a fresh `uriBase` (e.g.
> `did:dkg:v11`), keep the old one deployed, and existing data keeps
> resolving without wiping anyone's `store.nq`. Use this runbook only
> when actual chain entities (Hub, IdentityStorage, the V10 default
> KC storage being abandoned) are being replaced. See
> STORAGE_VERSION_TAGS.md "When to bump `chainResetMarker` after this
> RFC" for the full decision table.

The reset is the simplest cutover path because it lets us drop V8
`Staking` + `DelegatorsInfo` + the dual-store coupling completely
instead of running a wholesale state migration. Tradeoff: any node-side
@@ -295,3 +308,8 @@ staking + RS pipeline in under a minute.
- `scripts/devnet-test-random-sampling.sh` — the smoke test invoked
in Phase D (works against any RPC + auth token, not devnet-only).
- `docs/RELEASE.md` — the npm + GitHub release process used in Phase A.
- `docs/STORAGE_VERSION_TAGS.md` — operator-facing summary of the
multi-storage URI scheme that makes storage-only redeploys
non-destructive (and therefore non-runbook events).
- `docs/RFC40_MULTI_STORAGE_KC_URI_SCHEME.md` — the design rationale
behind the URI scheme.
72 changes: 69 additions & 3 deletions packages/chain/src/chain-adapter.ts
@@ -1,4 +1,5 @@
import type { ethers } from 'ethers';
import type { KCStorageRegistry } from './kc-storage-registry.js';

export interface IdentityProof {
publicKey: Uint8Array;
@@ -559,6 +560,37 @@ export interface OperationalWalletRegistrationResult {
export interface ChainAdapter {
chainType: 'evm' | 'solana';
chainId: string;
/**
* OT-RFC-40 §5.2 storage tag of the KC storage instance this adapter
* mints into. Empty string ("") means the default storage and produces
* the legacy 3-segment UAL form `did:dkg:{chainId}/{publisher}/{id}`.
* Tagged storages (e.g. V9 KAS at `did:dkg:v9`) produce the 4-segment
* form `did:dkg:{tag}/{chainId}/{publisher}/{id}` so the resolver can
* route to the correct storage instance.
*
* Optional on the adapter surface so adapters that pre-date the RFC
* default to "" (preserving the legacy form bit-for-bit). EVM adapters
* populate it during `init()` by reading the storage's `uri(0)` view;
* mock adapters expose a setter for tests.
*/
mintingStorageTag?: string;

/**
* OT-RFC-40 §5.3 — registry of every KC-class storage on the Hub
* this adapter is bound to, keyed by storage tag and address.
*
* Resolution-time consumers (random-sampling prover, async-lift
* verifier, replication ack verifier) use this to map a UAL or a
* `Challenge.knowledgeCollectionStorageContract` address to the
* storage instance that minted the data, instead of assuming a
* single named-`KnowledgeCollectionStorage` exists.
*
* Optional on the adapter surface: adapters that pre-date the RFC
* (or only support a single storage instance) leave it `undefined`,
* in which case callers fall back to "treat everything as the
* default storage" — bit-for-bit pre-RFC behaviour.
*/
kcStorageRegistry?: KCStorageRegistry;
/**
* Stable identifier for the SPECIFIC deployment this adapter is
* bound to (not just the chain). `chainId` alone is too coarse —
@@ -608,8 +640,26 @@ export interface ChainAdapter {
/**
* Verify that a publisher address owns the UAL range [startKAId, endKAId] on-chain.
* Used by receiving nodes to reject PublishRequests with spoofed publisher/range.
*
* OT-RFC-40 §7.5: optional 4th `storageTag` argument routes the
* range query to the correct KC storage instance. Empty string /
* undefined means the V10 default storage (which does not pre-reserve
* publisher ranges; auth happens at the publish ACK layer, so this
* call returns `true` to defer to that). `"v9"` routes to V9 KAS,
* which does carry per-publisher reserved ranges. Unknown tags
* return `false` conservatively.
*
* Adapters that pre-date the RFC ignore the parameter and preserve
* their existing behaviour; mock adapters keep their test-fixture
* range bookkeeping. The publish-handler derives the tag from the
* incoming request's UAL via `parseUal(request.ual).storageTag`.
*/
- verifyPublisherOwnsRange?(publisherAddress: string, startKAId: bigint, endKAId: bigint): Promise<boolean>;
+ verifyPublisherOwnsRange?(
+ publisherAddress: string,
+ startKAId: bigint,
+ endKAId: bigint,
+ storageTag?: string,
+ ): Promise<boolean>;

// Block height (used by ChainEventPoller to seed the scan cursor)
getBlockNumber?(): Promise<number>;
@@ -967,16 +1017,32 @@ export interface ChainAdapter {
* `kcId` is unknown to the chain or the V10 storage contract is not
* deployed on this Hub. Optional so non-V10 / no-chain adapters can
* stub the prover surface.
*
* OT-RFC-40 §7.5 (Codex review on PR #718, round 4): the optional
* `opts.storageContract` argument routes the read to a specific KC
* storage address. Used by the random-sampling prover when a
* challenge resolves to a `kcs-ack-based` tagged storage (V10
* default + future V11+ extensions) so the verification reads run
* against the contract that minted the data, not against the
* adapter's bound default. When omitted, the adapter MUST query the
* default storage as before. V9 KAS (`kas-pre-reserved-range`) is
* intentionally NOT supported via this routing — V9 RS is tracked
* as a follow-up and prover callers fail-closed before reaching
* here.
*/
- getLatestMerkleRoot?(kcId: bigint): Promise<Uint8Array>;
+ getLatestMerkleRoot?(kcId: bigint, opts?: { storageContract?: string }): Promise<Uint8Array>;

/**
* V10 flat-KC merkle leaf count (sorted + deduped) recorded on-chain
* for `kcId`. Used by the prover to (a) validate the local extraction
* matches the published shape before building a proof, and (b) sanity
* check the on-chain `chunkId = leafIndex` falls within the tree.
*
* OT-RFC-40 §7.5 (Codex review on PR #718, round 4): see
* `getLatestMerkleRoot` for the `opts.storageContract` routing
* rationale.
*/
- getMerkleLeafCount?(kcId: bigint): Promise<number>;
+ getMerkleLeafCount?(kcId: bigint, opts?: { storageContract?: string }): Promise<number>;

/**
* Address that signed the latest merkle root for `kcId` (the EOA that