
Benchmarking json schemas #1154

Open
JordanMaples wants to merge 5 commits into main from jordanmaples/benchmark_schema

Conversation

@JordanMaples

Contributor

To make the available options for our JSON inputs easier to discover, Mark and I discussed walking the ASTs of the JSON body with schemars and rendering a breakdown of the possible options. This PR adds JSON Schema documentation for benchmark inputs, exposed through the --schema and --field options.

Summary

Adds schemars-based JSON Schema generation and a custom tree-style terminal renderer so users can discover benchmark input fields without reading source code.

Usage

Full schema for an input type

cargo run --release -p diskann-benchmark -- inputs graph-index-build --schema

Drill into a specific field

cargo run --release -p diskann-benchmark -- inputs graph-index-build --field source.start_point_strategy
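A dotted path such as source.start_point_strategy can be resolved by descending one segment at a time. The following is a minimal sketch of that idea only: the `Node` type and `resolve` function are hypothetical stand-ins and are not taken from the PR, whose resolver works on generated JSON Schema values.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a schema node (assumption, not the PR's type).
#[derive(Debug, PartialEq)]
enum Node {
    /// An object with named child fields.
    Object(BTreeMap<String, Node>),
    /// A leaf described only by its type name.
    Leaf(&'static str),
}

/// Follow a dotted path one segment at a time.
/// On failure, the error names the segment that could not be resolved.
fn resolve<'a>(root: &'a Node, path: &str) -> Result<&'a Node, String> {
    let mut current = root;
    for segment in path.split('.') {
        match current {
            Node::Object(fields) => {
                current = fields
                    .get(segment)
                    .ok_or_else(|| format!("unknown field `{segment}`"))?;
            }
            // A leaf has no children, so no further segment can match.
            Node::Leaf(_) => return Err(format!("`{segment}` has no parent object")),
        }
    }
    Ok(current)
}

fn main() {
    let source = Node::Object(BTreeMap::from([
        ("start_point_strategy".to_string(), Node::Leaf("enum")),
        ("max_degree".to_string(), Node::Leaf("integer")),
    ]));
    let root = Node::Object(BTreeMap::from([("source".to_string(), source)]));

    println!("{:?}", resolve(&root, "source.start_point_strategy"));
    println!("{:?}", resolve(&root, "source.missing"));
}
```

Reporting the failing segment keeps the error actionable when a user mistypes one level of a long path.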

What's included

  • schemars 1.2 added to workspace; JsonSchema derived on all input types
  • Custom renderer (diskann-benchmark-runner/src/schema.rs) with:
      - Colored terminal output (bold field names, cyan types, yellow enum variants)
      - Multi-line description alignment
      - Handles internally/externally-tagged enums and $ref newtypes
      - MAX_DEPTH guard against recursive schemas
  • Manual JsonSchema impls for custom-serde types:
      - NonNegativeFinite (number with minimum)
      - StartPointStrategyRef (externally-tagged enum, with drift test)
      - QuantizationTypeSchema proxy (keeps schemars out of diskann-disk)
  • JsonSchema bound added to Input::Raw trait
  • README documentation for the new --schema and --field options
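The MAX_DEPTH guard listed above can be illustrated with a small depth-limited walk. This is a sketch under stated assumptions: the limit of 4, the `Node` type, and the `render` and `chain` functions are all hypothetical, and the PR's actual limit and data structures are not shown in this conversation.

```rust
/// Hypothetical limit; the PR names a MAX_DEPTH guard but its value is not stated here.
const MAX_DEPTH: usize = 4;

/// Minimal stand-in for a schema node with child nodes (assumption).
struct Node {
    name: String,
    children: Vec<Node>,
}

/// Render `node` with indentation, refusing to descend past MAX_DEPTH.
/// With self-referential schemas, a guard like this bounds the recursion.
fn render(node: &Node, depth: usize, out: &mut Vec<String>) {
    let indent = "    ".repeat(depth);
    if depth >= MAX_DEPTH {
        out.push(format!("{indent}… (max depth reached)"));
        return;
    }
    out.push(format!("{indent}{}", node.name));
    for child in &node.children {
        render(child, depth + 1, out);
    }
}

/// Build a chain of `levels` nested nodes (levels must be at least 1)
/// to simulate a deeply nested schema.
fn chain(levels: usize) -> Node {
    let mut node = Node { name: format!("level{}", levels - 1), children: vec![] };
    for i in (0..levels - 1).rev() {
        node = Node { name: format!("level{i}"), children: vec![node] };
    }
    node
}

fn main() {
    let mut out = Vec::new();
    render(&chain(10), 0, &mut out);
    for line in &out {
        println!("{line}");
    }
}
```

Emitting a visible marker at the cutoff, rather than stopping silently, tells the reader that the tree was truncated.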

Sample output

Full schema

cargo run -p diskann-benchmark --all-features -- inputs --schema graph-index-build-bftree-spherical-quantization

Schema for "graph-index-build-bftree-spherical-quantization":

├── build: object
    ├── alpha: number
    │
    ├── backedge_ratio: number
    │
    ├── data: string
    │   # A file that is used as an input to for a benchmark.
    │
    ├── data_type: one of ["float64", "float32", "float16", "uint8", "uint16", "uint32", "uint64", "int8", "int16", "int32", "int64", "bool"]
    │   # An enum representation for common DiskANN data types.
    │
    ├── distance: one of ["squared_l2", "inner_product", "cosine", "cosine_normalized"]
    │
    ├── insert_retry (optional): (any of)
        ├── num_insert_attempts: integer (≥1)
        │
        ├── retry_threshold: number
        │
        └── saturate_inserts: boolean
    │
    ├── l_build: integer (≥0)
    │
    ├── max_degree: integer (≥0)
    │
    ├── multi_insert (optional): (any of)
        ├── batch_parallelism: integer (≥1)
        │
        ├── batch_size: integer (≥1)
        │
        └── intra_batch_candidates: (one of)
            # A one-to-one correspondence with [`diskann::index::config::IntraBatchCandidates`].
            ├─ "none" — No intra-batch candidates will be considered.
            ├─ "max"
            └─ "all" — Consider all elements in the batch for intra-batch candidates.
    │
    ├── num_threads: integer (≥0)
    │
    ├── save_path: any (optional)
    │
    └── start_point_strategy: (one of)
        # Strategy for selecting graph start points.
        ├─ "medoid" — Use the medoid as the starting point.
        ├─ "first_vector" — Use the first vector in the dataset.
        ├─ "random_vectors" — Randomly select vector(s) with given norm.
        │  ├── norm: number
        │  ├── nsamples: integer (≥1)
        │  └── seed: integer
        ├─ "random_samples" — Sample data from the dataset.
        │  ├── nsamples: integer (≥1)
        │  └── seed: integer
        └─ "latin_hyper_cube" — Use Latin Hypercube sampling.
           ├── nsamples: integer (≥1)
           └── seed: integer
│
├── neighbor_store_config (optional): (any of)
    ├── cache_only: any (optional)
    │   # If true, only use the in-memory circular buffer (no disk pages).
    │
    ├── cb_copy_on_access_ratio: any (optional)
    │   # Ratio of buffer used before copy-on-access kicks in.
    │
    ├── cb_max_record_size: any (optional)
    │   # Maximum record size that can be stored in the circular buffer.
    │
    ├── cb_min_record_size: any (optional)
    │   # Minimum record size for the circular buffer.
    │
    ├── cb_size_byte: integer (≥0)
    │   # Size of the circular buffer (in-memory write cache) in bytes.
    │
    ├── leaf_page_size: integer (≥0)
    │   # Size of leaf pages in bytes.
    │
    ├── read_promotion_rate: any (optional)
    │   # Probability (0-100) of promoting a read record to the front of the buffer.
    │
    ├── read_record_cache: any (optional)
    │   # Whether to cache full pages on read.
    │
    └── scan_promotion_rate: any (optional)
        # Probability (0-100) of promoting a scanned record to the front of the buffer.
│
├── num_bits: integer (≥1)
│
├── pre_scale (optional): (any of)
    ├─ one of ["none", "reciprocal_mean_norm"]
    └─ "some"
│
├── quant_store_config (optional): (any of)
    ├── cache_only: any (optional)
    │   # If true, only use the in-memory circular buffer (no disk pages).
    │
    ├── cb_copy_on_access_ratio: any (optional)
    │   # Ratio of buffer used before copy-on-access kicks in.
    │
    ├── cb_max_record_size: any (optional)
    │   # Maximum record size that can be stored in the circular buffer.
    │
    ├── cb_min_record_size: any (optional)
    │   # Minimum record size for the circular buffer.
    │
    ├── cb_size_byte: integer (≥0)
    │   # Size of the circular buffer (in-memory write cache) in bytes.
    │
    ├── leaf_page_size: integer (≥0)
    │   # Size of leaf pages in bytes.
    │
    ├── read_promotion_rate: any (optional)
    │   # Probability (0-100) of promoting a read record to the front of the buffer.
    │
    ├── read_record_cache: any (optional)
    │   # Whether to cache full pages on read.
    │
    └── scan_promotion_rate: any (optional)
        # Probability (0-100) of promoting a scanned record to the front of the buffer.
│
├── search_phase: (one of)
    ├─ "topk"
    │  ├── groundtruth: string
    │  ├── num_threads: array of integer (≥1)
    │  ├── queries: string
    │  ├── reps: integer (≥1)
    │  └── runs: array of any
    ├─ "range"
    │  ├── groundtruth: string
    │  ├── num_threads: array of integer (≥1)
    │  ├── queries: string
    │  ├── reps: integer (≥1)
    │  └── runs: array of any
    ├─ "topk-beta-filter"
    │  ├── beta: number
    │  ├── data_labels: string
    │  ├── groundtruth: string
    │  ├── num_threads: array of integer (≥1)
    │  ├── queries: string
    │  ├── query_predicates: string
    │  ├── reps: integer (≥1)
    │  └── runs: array of any
    ├─ "topk-multihop-filter"
    │  ├── data_labels: string
    │  ├── groundtruth: string
    │  ├── num_threads: array of integer (≥1)
    │  ├── queries: string
    │  ├── query_predicates: string
    │  ├── reps: integer (≥1)
    │  └── runs: array of any
    └─ "topk-inline-filter"
       ├── adaptive_l: (union) (optional)
       ├── data_labels: string
       ├── groundtruth: string
       ├── num_threads: array of integer (≥1)
       ├── queries: string
       ├── query_predicates: string
       ├── reps: integer (≥1)
       └── runs: array of any
│
├── seed: integer (≥0)
│
├── transform_kind: (one of)
    ├─ one of ["null"]
    ├─ "padding_hadamard"
    ├─ "random_rotation"
    └─ "double_hadamard"
│
└── vector_store_config (optional): (any of)
    ├── cache_only: any (optional)
    │   # If true, only use the in-memory circular buffer (no disk pages).
    │
    ├── cb_copy_on_access_ratio: any (optional)
    │   # Ratio of buffer used before copy-on-access kicks in.
    │
    ├── cb_max_record_size: any (optional)
    │   # Maximum record size that can be stored in the circular buffer.
    │
    ├── cb_min_record_size: any (optional)
    │   # Minimum record size for the circular buffer.
    │
    ├── cb_size_byte: integer (≥0)
    │   # Size of the circular buffer (in-memory write cache) in bytes.
    │
    ├── leaf_page_size: integer (≥0)
    │   # Size of leaf pages in bytes.
    │
    ├── read_promotion_rate: any (optional)
    │   # Probability (0-100) of promoting a read record to the front of the buffer.
    │
    ├── read_record_cache: any (optional)
    │   # Whether to cache full pages on read.
    │
    └── scan_promotion_rate: any (optional)
        # Probability (0-100) of promoting a scanned record to the front of the buffer.

Example:

{
  "content": {
    "build": {
      "alpha": 1.2000000476837158,
      "backedge_ratio": 1.0,
      "data": "path/to/data",
      "data_type": "float32",
      "distance": "squared_l2",
      "insert_retry": null,
      "l_build": 50,
      "max_degree": 32,
      "multi_insert": {
        "batch_parallelism": 32,
        "batch_size": 128,
        "intra_batch_candidates": "none"
      },
      "num_threads": 1,
      "save_path": null,
      "start_point_strategy": "medoid"
    },
    "neighbor_store_config": null,
    "num_bits": 1,
    "pre_scale": null,
    "quant_store_config": null,
    "search_phase": {
      "groundtruth": "path/to/groundtruth",
      "num_threads": [
        1,
        2,
        4,
        8
      ],
      "queries": "path/to/queries",
      "reps": 5,
      "runs": [
        {
          "recall_k": 10,
          "search_l": [
            10,
            20,
            30,
            40
          ],
          "search_n": 10
        }
      ],
      "search-type": "topk"
    },
    "seed": 42,
    "transform_kind": "null",
    "vector_store_config": null
  },
  "type": "graph-index-build-bftree-spherical-quantization"
}

Single Field

cargo run -p diskann-benchmark --all-features -- inputs --schema graph-index-build-bftree-spherical-quantization --field build.start_point_strategy


Schema for "graph-index-build-bftree-spherical-quantization".build.start_point_strategy:

├─ "medoid" — Use the medoid as the starting point.
├─ "first_vector" — Use the first vector in the dataset.
├─ "random_vectors" — Randomly select vector(s) with given norm.
│  ├── norm: number
│  ├── nsamples: integer (≥1)
│  └── seed: integer
├─ "random_samples" — Sample data from the dataset.
│  ├── nsamples: integer (≥1)
│  └── seed: integer
└─ "latin_hyper_cube" — Use Latin Hypercube sampling.
   ├── nsamples: integer (≥1)
   └── seed: integer

Example:

"medoid"

Copilot AI left a comment

Contributor


Pull request overview

This PR adds JSON Schema generation (via schemars) and a tree-style terminal renderer so diskann-benchmark users can discover benchmark input fields/variants using --schema and --field, rather than reading source.

Changes:

  • Add schemars::JsonSchema coverage across benchmark input/tolerance DTOs and generate per-input JSON Schemas from Input::Raw.
  • Introduce diskann-benchmark-runner::schema to render schemas (and drill into sub-fields) as human-readable CLI documentation.
  • Add schema/serialization drift tests and wire new CLI flags + README documentation.

Reviewed changes

Copilot reviewed 26 out of 27 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| diskann-disk/src/build/configuration/quantization_types.rs | Adds a test intended to guard drift across QuantizationType variants/serialization. |
| diskann-disk/Cargo.toml | Adds serde_json as a dev-dependency for new tests. |
| diskann-benchmark/src/utils/mod.rs | Derives JsonSchema for SimilarityMeasure used in inputs. |
| diskann-benchmark/src/inputs/multi_vector.rs | Derives JsonSchema for multi-vector input types. |
| diskann-benchmark/src/inputs/graph_index.rs | Derives JsonSchema broadly; adds manual JsonSchema for StartPointStrategyRef + drift test; annotates schema override for the remote-serde field. |
| diskann-benchmark/src/inputs/filters.rs | Derives JsonSchema for filter-related inputs. |
| diskann-benchmark/src/inputs/exhaustive.rs | Derives JsonSchema for exhaustive-benchmark inputs. |
| diskann-benchmark/src/inputs/disk.rs | Adds schema proxy for QuantizationType (to avoid schemars dependency in diskann-disk) and derives JsonSchema for disk-index inputs. |
| diskann-benchmark/src/inputs/bftree.rs | Derives JsonSchema for bf_tree inputs. |
| diskann-benchmark/src/backend/multi_vector/driver.rs | Derives JsonSchema for multi-vector tolerance input. |
| diskann-benchmark/src/backend/disk_index/benchmarks.rs | Derives JsonSchema for disk-index tolerance input. |
| diskann-benchmark/README.md | Documents --schema and --field usage. |
| diskann-benchmark/Cargo.toml | Adds schemars dependency for input schema generation. |
| diskann-benchmark-simd/src/lib.rs | Derives JsonSchema for SIMD input/tolerance types. |
| diskann-benchmark-simd/Cargo.toml | Adds schemars dependency. |
| diskann-benchmark-runner/src/utils/num.rs | Implements JsonSchema for NonNegativeFinite. |
| diskann-benchmark-runner/src/utils/datatype.rs | Derives JsonSchema for DataType. |
| diskann-benchmark-runner/src/test/typed.rs | Updates test inputs/tolerances to derive JsonSchema. |
| diskann-benchmark-runner/src/test/dim.rs | Updates test inputs/tolerances to derive JsonSchema. |
| diskann-benchmark-runner/src/schema.rs | Adds schema renderer + path resolver + unit tests. |
| diskann-benchmark-runner/src/lib.rs | Exposes the new schema module. |
| diskann-benchmark-runner/src/input.rs | Requires Input::Raw: JsonSchema and adds Registered::schema() plumbing. |
| diskann-benchmark-runner/src/files.rs | Derives JsonSchema for InputFile. |
| diskann-benchmark-runner/src/app.rs | Adds --schema/--field CLI flags and wiring to render schema docs + example. |
| diskann-benchmark-runner/Cargo.toml | Adds colored + schemars dependencies. |
| Cargo.toml | Adds schemars to workspace dependencies. |
| Cargo.lock | Locks new schemars/colored (and transitive) dependencies. |


Comment on lines +289 to +292
/// Ensures the manual `JsonSchema` impl stays in sync with actual variants.
/// If a variant is added to `QuantizationType`, this match will fail to compile.
#[test]
fn schema_covers_all_quantization_variants() {
Comment thread diskann-benchmark/src/inputs/graph_index.rs Outdated
Comment thread diskann-benchmark/src/inputs/graph_index.rs Outdated
Comment thread diskann-benchmark/src/inputs/graph_index.rs Outdated
Comment thread diskann-benchmark-runner/src/schema.rs Outdated
}
s
}
Some("number") => "number".to_string(),
Comment thread diskann-benchmark-runner/src/input.rs Outdated
let generator =
    schemars::generate::SchemaSettings::default().into_generator();
let schema = generator.into_root_schema_for::<T::Raw>();
serde_json::to_value(schema).unwrap_or_default()
@codecov-commenter

codecov-commenter commented Jun 12, 2026


Codecov Report

❌ Patch coverage is 65.10574% with 231 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.77%. Comparing base (b5ebac2) to head (14c1318).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| diskann-benchmark-runner/src/schema.rs | 74.74% | 124 Missing ⚠️ |
| diskann-benchmark/src/inputs/graph_index.rs | 41.66% | 49 Missing ⚠️ |
| diskann-benchmark-runner/src/app.rs | 20.93% | 34 Missing ⚠️ |
| diskann-benchmark-runner/src/utils/num.rs | 0.00% | 9 Missing ⚠️ |
| diskann-benchmark-runner/src/input.rs | 0.00% | 8 Missing ⚠️ |
| diskann-benchmark/src/inputs/post_processor.rs | 0.00% | 6 Missing ⚠️ |
| ...disk/src/build/configuration/quantization_types.rs | 95.23% | 1 Missing ⚠️ |
Additional details and impacted files


@@            Coverage Diff             @@
##             main    #1154      +/-   ##
==========================================
- Coverage   90.99%   89.77%   -1.22%     
==========================================
  Files         489      490       +1     
  Lines       93130    93785     +655     
==========================================
- Hits        84746    84199     -547     
- Misses       8384     9586    +1202     
| Flag | Coverage Δ |
| --- | --- |
| miri | 89.77% <65.10%> (-1.22%) ⬇️ |
| unittests | 89.43% <65.10%> (-1.53%) ⬇️ |


| Files with missing lines | Coverage Δ |
| --- | --- |
| diskann-benchmark-runner/src/files.rs | 100.00% <ø> (ø) |
| diskann-benchmark-runner/src/test/dim.rs | 89.79% <ø> (ø) |
| diskann-benchmark-runner/src/test/typed.rs | 97.10% <ø> (ø) |
| diskann-benchmark-runner/src/utils/datatype.rs | 100.00% <ø> (ø) |
| diskann-benchmark-simd/src/lib.rs | 83.03% <ø> (ø) |
| diskann-benchmark/src/inputs/disk.rs | 1.50% <ø> (ø) |
| diskann-benchmark/src/inputs/exhaustive.rs | 26.83% <ø> (ø) |
| diskann-benchmark/src/inputs/filters.rs | 67.74% <ø> (ø) |
| diskann-benchmark/src/inputs/multi_vector.rs | 19.67% <ø> (ø) |
| diskann-benchmark/src/utils/mod.rs | 83.33% <ø> (ø) |

... and 7 more

... and 40 files with indirect coverage changes


JordanMaples force-pushed the jordanmaples/benchmark_schema branch from 83b159e to f764227 on June 15, 2026 20:38
JordanMaples and others added 4 commits June 22, 2026 08:56
Adds schemars-based JSON Schema generation and a custom tree-style
terminal renderer for benchmark input types. Users can run
`inputs <name> --schema` to see field documentation with types,
optionality, enum variants, and descriptions — followed by the
example JSON.

Implementation:
- Add schemars 1.2 to workspace; derive JsonSchema on all input types
- Custom renderer in diskann-benchmark-runner/src/schema.rs with:
  - Colored output (field names bold, types cyan, variants yellow)
  - Multi-line description alignment
  - Handles internally/externally-tagged enums, newtypes with $ref
  - MAX_DEPTH guard against recursive schemas
- Manual JsonSchema impls for custom-serde types:
  - NonNegativeFinite (number with minimum)
  - StartPointStrategyRef (externally-tagged enum, with drift test)
  - QuantizationTypeSchema proxy (keeps schemars out of diskann-disk)
- JsonSchema bound added to Input::Raw trait

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
JordanMaples force-pushed the jordanmaples/benchmark_schema branch from f764227 to 14c1318 on June 22, 2026 16:01

@hildebrandmw hildebrandmw left a comment

Contributor


Thanks Jordan. I'm very supportive of a feature along these lines. Especially when enums are involved, obtaining the set of valid representations is currently an exercise in guess-and-check.

That said, there are several aspects of this particular approach that make me hesitant:

  1. Testing: diskann-benchmark-runner has a pretty extensive UX test suite for comparing app input and output. This makes it easier to write tests, observe the expected output, and monitor changes as they are made. This PR does not use this methodology, so many paths in the schema renderer are not covered by the checked-in tests.
  2. The example representation for enums seems sub-par with only a single variant displayed. I don't think this is fixable with schemars alone.
  3. The hand-written schemas (e.g. StartPointStrategyRef) worry me from a maintenance standpoint. The tests don't actually protect against drift here.
  4. The renderer is fairly inscrutable, doesn't display nesting separators particularly well, and would at the very least benefit greatly from UX tests.
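On the nesting separators mentioned in item 4: one common way to draw a tree is to pass down a prefix string and an is-last flag, so that each level knows whether to continue the vertical bar of its ancestors. The sketch below illustrates that general technique only. The `Node` type and this `render_node` signature are hypothetical and do not reproduce the PR's renderer.

```rust
/// Hypothetical tree node; not the PR's schema representation.
struct Node {
    label: String,
    children: Vec<Node>,
}

/// Render `node` and its descendants, one line per node.
/// `prefix` carries the vertical bars of ancestors that still have siblings below;
/// `is_last` selects the connector and decides what the children inherit.
fn render_node(node: &Node, prefix: &str, is_last: bool, out: &mut Vec<String>) {
    let connector = if is_last { "└── " } else { "├── " };
    out.push(format!("{prefix}{connector}{}", node.label));

    // Children of a last node get blank padding; otherwise the bar continues.
    let child_prefix = format!("{prefix}{}", if is_last { "    " } else { "│   " });
    let count = node.children.len();
    for (i, child) in node.children.iter().enumerate() {
        render_node(child, &child_prefix, i + 1 == count, out);
    }
}

/// Convenience constructor for a node without children.
fn leaf(label: &str) -> Node {
    Node { label: label.to_string(), children: vec![] }
}

fn main() {
    let build = Node {
        label: "build: object".to_string(),
        children: vec![leaf("alpha: number"), leaf("data: string")],
    };
    let roots = vec![build, leaf("seed: integer (≥0)")];

    let mut out = Vec::new();
    let count = roots.len();
    for (i, node) in roots.iter().enumerate() {
        render_node(node, "", i + 1 == count, &mut out);
    }
    for line in &out {
        println!("{line}");
    }
}
```

Because the output is collected into a vector of lines, it can be compared against a checked-in expected output, which fits a UX-style test.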

This is a good first step, but I would like to see several things ironed out first:

  1. Enable examples for all enum variants. This may require changes to how Input works - that's fine.
  2. Address the potential for schema drift. There are some options here. When the type is a mirror (e.g. StartPointStrategyRef), derive JsonSchema on the mirror directly. In any case, please add tests that validate an input's example type matches its generated schema.
  3. Use the UX test framework.
  4. Please revisit the style of the renderer. For example, render_node has an unused _is_last argument.
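The test requested in item 2, validating that an input's example matches its generated schema, could take roughly the following shape. This is a deliberately reduced sketch: `Value`, `Schema`, `validate`, `sample_schema`, and `sample_example` are hypothetical, and the schema fragment is only loosely modeled on a few fields from the sample output. A real test would more likely validate the serialized example against the schemars-generated schema.

```rust
use std::collections::BTreeMap;

/// Heavily reduced JSON value (hypothetical stand-in).
enum Value {
    Null,
    Int(i64),
    Str(String),
    Object(BTreeMap<String, Value>),
}

/// Heavily reduced schema (hypothetical; not the schemars representation).
enum Schema {
    Integer { minimum: Option<i64> },
    OneOf(Vec<&'static str>),
    Optional(Box<Schema>),
    Object(BTreeMap<&'static str, Schema>),
}

/// Check `value` against `schema`, reporting the path of the first mismatch.
fn validate(value: &Value, schema: &Schema, path: &str) -> Result<(), String> {
    match (schema, value) {
        (Schema::Optional(_), Value::Null) => Ok(()),
        (Schema::Optional(inner), v) => validate(v, inner, path),
        (Schema::Integer { minimum }, Value::Int(n)) => match minimum {
            Some(min) if n < min => Err(format!("{path}: {n} is below minimum {min}")),
            _ => Ok(()),
        },
        (Schema::OneOf(variants), Value::Str(s)) => {
            if variants.iter().any(|v| *v == s.as_str()) {
                Ok(())
            } else {
                Err(format!("{path}: unknown variant \"{s}\""))
            }
        }
        (Schema::Object(fields), Value::Object(map)) => {
            // Keys present in the example but absent from the schema indicate drift.
            for key in map.keys() {
                if !fields.contains_key(key.as_str()) {
                    return Err(format!("{path}.{key}: not in schema"));
                }
            }
            for (name, field_schema) in fields {
                let child_path = format!("{path}.{name}");
                match map.get(*name) {
                    Some(v) => validate(v, field_schema, &child_path)?,
                    None if matches!(field_schema, Schema::Optional(_)) => {}
                    None => return Err(format!("{child_path}: missing required field")),
                }
            }
            Ok(())
        }
        _ => Err(format!("{path}: type mismatch")),
    }
}

/// Simplified schema fragment (assumption, not the generated schema).
fn sample_schema() -> Schema {
    Schema::Object(BTreeMap::from([
        ("num_bits", Schema::Integer { minimum: Some(1) }),
        ("seed", Schema::Integer { minimum: Some(0) }),
        ("transform_kind", Schema::OneOf(vec!["null", "padding_hadamard"])),
        ("pre_scale", Schema::Optional(Box::new(Schema::OneOf(vec!["none"])))),
    ]))
}

/// Example value with a configurable `num_bits`, to simulate a drifted example.
fn sample_example(num_bits: i64) -> Value {
    Value::Object(BTreeMap::from([
        ("num_bits".to_string(), Value::Int(num_bits)),
        ("seed".to_string(), Value::Int(42)),
        ("transform_kind".to_string(), Value::Str("null".to_string())),
        ("pre_scale".to_string(), Value::Null),
    ]))
}

fn main() {
    let schema = sample_schema();
    println!("{:?}", validate(&sample_example(1), &schema, "content"));
    println!("{:?}", validate(&sample_example(0), &schema, "content"));
}
```

Checking both directions (example keys missing from the schema, and required schema fields missing from the example) is what lets such a test catch drift on either side.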

- Clarify quantization schema test docs: the JsonSchema impl lives in the
  QuantizationTypeSchema proxy (diskann-benchmark), not in diskann-disk
- Add minimum: 0 to u64 seed fields in StartPointStrategyRef schema
- Render min/max constraints for number types in type_summary, matching integers
- Fail loudly if RootSchema serialization fails instead of defaulting silently
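The min/max rendering described in the commit message above might look like the following sketch. The signature is an assumption: the PR's type_summary operates on schema values, whereas this hypothetical version takes the type name and bounds directly, and the comma-separated format for two bounds is a guess rather than the renderer's confirmed output.

```rust
/// Hypothetical helper: summarize a numeric type with optional bounds.
/// The PR's `type_summary` reads these from the schema; here they are passed in.
fn type_summary(type_name: &str, minimum: Option<f64>, maximum: Option<f64>) -> String {
    let mut constraints = Vec::new();
    if let Some(min) = minimum {
        constraints.push(format!("≥{min}"));
    }
    if let Some(max) = maximum {
        constraints.push(format!("≤{max}"));
    }
    if constraints.is_empty() {
        type_name.to_string()
    } else {
        // Joining with ", " is an assumed format for types with both bounds.
        format!("{type_name} ({})", constraints.join(", "))
    }
}

fn main() {
    println!("{}", type_summary("integer", Some(1.0), None));
    println!("{}", type_summary("number", Some(0.0), Some(100.0)));
    println!("{}", type_summary("number", None, None));
}
```

Routing integers and numbers through the same helper keeps their constraint formatting consistent.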

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
