Specimen complements the Plausible property-based testing library by automatically deriving generators, enumerators, and checkers for inductive relations.
Specimen's design is heavily inspired by Coq/Rocq's QuickChick library and the following papers:
- Testing Theorems, Fully Automatically (OOPSLA 2025)
- Computing Correctly with Inductive Relations (PLDI 2022)
- Generating Good Generators for Inductive Relations (POPL 2018)
Specimen is a testing and verification tool: it is designed to help find bugs during development, not to serve as a security guarantee or correctness proof for production or enterprise workloads. Its intended uses are development-time property-based testing, rapid prototyping of invariants, and pre-proof exploration of conjectures.
Like QuickChick, Specimen uses the following typeclasses:
- Arbitrary: unconstrained random generators for inhabitants of algebraic data types. This is imported from Plausible
- ArbitrarySuchThat: constrained generators which only produce random values that satisfy a user-supplied inductive relation
- ArbitraryFueled, ArbitrarySizedSuchThat: versions of the two typeclasses above where the generator's size parameter is made explicit (the former is imported from Plausible)
- Enum, EnumSuchThat, EnumSized, EnumSizedSuchThat: like their Arbitrary counterparts but for deterministic enumerators instead
- DecOpt: checkers (partial decision procedures that return Except GenError Bool) for inductive propositions

Specimen provides various top-level commands which automatically derive generators for Lean inductives (the file Specimen/README.md has more details):
1. Deriving unconstrained generators/enumerators
An unconstrained generator produces random inhabitants of an algebraic data type, while an unconstrained enumerator enumerates (deterministically) these inhabitants.
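To make the enumerator half of this distinction concrete, here is a small hand-written sketch in plain Lean 4. It does not use Specimen or its Enum typeclass; the Tree type, allNodes, and enumTrees are hypothetical names chosen for illustration, and the code Specimen actually derives may look quite different.

```lean
namespace EnumSketch

-- Hypothetical tree type, defined here only for illustration.
inductive Tree where
  | leaf : Tree
  | node : Tree → Tree → Tree
  deriving Repr

-- All `node l r` with `l` drawn from `xs` and `r` from `ys`, in a fixed order.
def allNodes (xs ys : List Tree) : List Tree :=
  xs.foldr (fun l (acc : List Tree) => ys.map (fun r => Tree.node l r) ++ acc) []

-- Hand-written enumerator: every tree of depth at most `size`, each listed once.
-- The result depends only on `size`, which is what "deterministic" means here.
def enumTrees : Nat → List Tree
  | 0 => [Tree.leaf]
  | size + 1 =>
    let smaller := enumTrees size
    Tree.leaf :: allNodes smaller smaller

#eval (enumTrees 2).length  -- 5

end EnumSketch
```

A generator for the same type would instead pick one of these inhabitants at random, with the size parameter bounding how large the result can get.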
Users can write deriving Arbitrary and/or deriving Enum after an inductive type definition, e.g.
inductive Foo where
...
deriving Arbitrary, Enum

Alternatively, users can also write deriving instance Arbitrary for T1, ..., Tn (or deriving instance Enum ...) as a top-level command to derive Arbitrary / Enum instances for types T1, ..., Tn simultaneously. This also works for mutually recursive types:
mutual
inductive MutEven where
| zero : MutEven
| succOdd : MutOdd → MutEven
inductive MutOdd where
| succEven : MutEven → MutOdd
end
deriving instance Enum for MutEven, MutOdd

To sample from a derived unconstrained generator, users can simply call runArbitrary, specify the type for the desired generated values and provide some Nat to act as the generator's size parameter (10 in the example below):
#eval runArbitrary (α := Tree) 10

Similarly, to return the elements produced from a derived enumerator, users can call runEnum like so:
#eval runEnum (α := Tree) 10

2. derive_mutual: the recommended command for constrained derivation
derive_mutual is the primary command for deriving constrained generators, enumerators, and checkers. It supersedes the older derive_generator/derive_enumerator/derive_checker commands by providing:
- Automatic dependency discovery (derives instances for sub-relations)
- Multi-output generation (a single hypothesis step can produce multiple existential variables)
- True mutual recursion (multiple specs compiled into a shared mutual block)
- Quality scoring and schedule search with branch-and-bound optimization

Syntax:
set_option specimen.autoDeriveDeps true
set_option specimen.multiOutput true
-- Derive a constrained generator (default sort is `generator`)
derive_mutual
(fun n => ∃ (t : BinaryTree), balancedTree n t)
-- Derive multiple specs at once (they can call each other)
derive_mutual
(fun G t => ∃ (e : term), typing G e t)
-- Derive with explicit sort keywords
derive_mutual
generator (fun lo hi => ∃ (t : BinaryTree), BST lo hi t),
checker (fun lo hi t => BST lo hi t)
-- Derive an enumerator
derive_mutual enumerator
(fun n => ∃ (t : BinaryTree), balancedTree n t)
-- Multi-output: generate all existentials at once
derive_mutual
(∃ (Γ : List type) (e : term) (τ : type), typing Γ e τ)

Each entry can be prefixed with generator (default), enumerator, or checker. When specimen.autoDeriveDeps is true, Specimen automatically discovers and derives instances for sub-relations referenced in the constructors. When specimen.multiOutput is true, the scheduler can produce multiple existential outputs in a single hypothesis step.
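The specs above refer to BinaryTree and BST without showing their definitions. The following is a minimal sketch of what such definitions could look like, assuming Nat keys and strict bounds; the actual definitions in the Specimen test suite may differ. It uses only core Lean 4, so it can be checked without Specimen.

```lean
namespace BSTSketch

-- Hypothetical binary tree with Nat keys.
inductive BinaryTree where
  | leaf : BinaryTree
  | node : Nat → BinaryTree → BinaryTree → BinaryTree

-- `BST lo hi t`: `t` is a search tree whose keys lie strictly between `lo` and `hi`.
inductive BST : Nat → Nat → BinaryTree → Prop where
  | leaf {lo hi : Nat} : BST lo hi BinaryTree.leaf
  | node {lo hi x : Nat} {l r : BinaryTree} :
      lo < x → x < hi → BST lo x l → BST x hi r →
      BST lo hi (BinaryTree.node x l r)

-- A one-node tree with key 5 satisfies the bounds 0 and 10.
example : BST 0 10 (BinaryTree.node 5 BinaryTree.leaf BinaryTree.leaf) :=
  BST.node (by decide) (by decide) BST.leaf BST.leaf

end BSTSketch
```

With a relation of this shape, the spec (fun lo hi => ∃ (t : BinaryTree), BST lo hi t) treats lo and hi as inputs and t as the value to produce, while the checker spec (fun lo hi t => BST lo hi t) takes all three as inputs.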
To sample from a generator derived via derive_mutual:
#eval runSizedGen (ArbitrarySizedSuchThat.arbitrarySizedST (fun t => balanced 5 t)) 10

3. derive_generator / derive_enumerator: single-spec constrained derivation
These commands derive a constrained generator or enumerator for a single specification. They are still supported and useful for quick one-off derivations:
derive_generator (fun n => ∃ t, balanced n t)
derive_enumerator (fun n => ∃ t, balanced n t)

In the command derive_generator (fun x1 ... xn => ∃ x, P x1 ... x ... xn):
- P must be an inductively defined relation
- x is the value to be generated (bound by ∃)
- x1 ... xn are input parameters (bound by fun)
- Multiple existential outputs are supported:
derive_generator (fun n => ∃ a b, Split n a b)
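The Split relation in the command above is not defined in this document. As an illustration only, one relation that fits the spec is sketched below in core Lean 4, with n as the input and a, b as the two outputs; the name and constructors are assumptions, not Specimen's own definition.

```lean
namespace SplitSketch

-- Hypothetical relation: `Split n a b` holds exactly when `n = a + b`.
inductive Split : Nat → Nat → Nat → Prop where
  | zero : Split 0 0 0
  | left {n a b : Nat} : Split n a b → Split (n + 1) (a + 1) b
  | right {n a b : Nat} : Split n a b → Split (n + 1) a (b + 1)

-- 3 splits into 2 and 1: one `right` step followed by two `left` steps.
example : Split 3 2 1 := Split.left (Split.left (Split.right Split.zero))

end SplitSketch
```

A generator derived for (fun n => ∃ a b, Split n a b) would then need to produce both a and b for a given n, which is the multi-output case described above.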
To sample from the derived producer:
#eval runSizedGen (ArbitrarySizedSuchThat.arbitrarySizedST (fun t => balanced 5 t)) 10
#eval runSizedEnum (EnumSizedSuchThat.enumSizedST (fun t => balanced 3 t)) 3

4. derive_checker: partial decision procedures
A checker for an inductively-defined Prop is a Nat -> Except GenError Bool function, which
takes a Nat argument as fuel and returns an error if it can't decide whether the Prop holds (e.g. it runs out of fuel),
and otherwise returns ok true / ok false depending on whether the Prop holds.
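The three possible outcomes can be seen in a hand-written checker for a toy relation. This sketch uses core Lean 4 only; GenError here is a local stand-in with a single hypothetical constructor, and IsEven and checkIsEven are illustrative names rather than anything Specimen defines or derives.

```lean
namespace CheckerSketch

-- Local stand-in for the error type; the real GenError may have other constructors.
inductive GenError where
  | outOfFuel : GenError
  deriving Repr

-- Toy inductive relation: `IsEven n` holds when `n` is even.
inductive IsEven : Nat → Prop where
  | zero : IsEven 0
  | step {n : Nat} : IsEven n → IsEven (n + 2)

-- `checkIsEven n` has the shape `Nat → Except GenError Bool`: the second argument is fuel.
def checkIsEven : Nat → Nat → Except GenError Bool
  | _, 0 => .error .outOfFuel                    -- fuel exhausted: cannot decide
  | 0, _ + 1 => .ok true                         -- matches `IsEven.zero`
  | 1, _ + 1 => .ok false                        -- no constructor applies
  | n + 2, fuel + 1 => checkIsEven n fuel        -- peel off one `IsEven.step`

#eval checkIsEven 4 10  -- evaluates to `.ok true`
#eval checkIsEven 3 10  -- evaluates to `.ok false`
#eval checkIsEven 4 1   -- evaluates to `.error .outOfFuel`

end CheckerSketch
```

With Specimen, a checker of this shape is derived from the relation rather than written by hand: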
derive_checker (fun n t => balanced n t)

5. Options
| Option | Default | Description |
|---|---|---|
| specimen.autoDeriveDeps | false | Automatically derive dependency instances for sub-relations in derive_mutual |
| specimen.multiOutput | false | Allow multi-output production steps (multiple ∃ vars generated per hypothesis) |
| specimen.scoreType | "Scoring.DefaultScore" | Scoring strategy for schedule quality evaluation (see below) |
| specimen.fuel | 10000 | Fuel (termination budget) for derived generators/enumerators/checkers |
| specimen.richOutput | true | Emit rich HTML widget output in the Lean infoview |
| specimen.textOutput | 0 | Plain-text output verbosity (0=off, 1=summary, 2=problems, 3=full) |
| specimen.searchLimit | 200000 | Max hypothesis orderings to evaluate per constructor during schedule search |

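As a configuration sketch, the options in the table could be adjusted with Lean's standard set_option command before a derivation. The snippet assumes that the library is imported under the module name Specimen (an assumption) and that each option accepts a value of the type implied by its default; the specific values are arbitrary examples.

```lean
import Specimen  -- assumed module name

-- Illustrative values only; the defaults are listed in the table above.
set_option specimen.fuel 5000           -- smaller termination budget
set_option specimen.searchLimit 50000   -- evaluate fewer hypothesis orderings
set_option specimen.richOutput false    -- no HTML widget in the infoview
set_option specimen.textOutput 1        -- plain-text summary instead
```

Lean also allows scoping an option to a single command by writing set_option ... in before it, which can be handy for raising the search limit on one expensive derivation only.
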
Scoring strategies control how Specimen evaluates and selects among candidate schedules during derivation. The specimen.scoreType option selects the active strategy:
| Strategy | Option value | Description |
|---|---|---|
| Default | "Scoring.DefaultScore" | Sum of (checks, length, unconstrained): the original heuristic. Minimizes total checking work. |
| Worst-leaf | "Scoring.WorstLeafScore" | Takes the max (not sum) across coverage-trie leaves, penalizing worst-case input paths. |
| Density | "Scoring.DensityScore" | Categorical density classification (Total/Partial/Backtracking/Checking) from Section 4 of Testing Theorems, Fully Automatically. Prefers schedules that avoid backtracking. |

For example, to use the density scoring strategy from the Testing Theorems paper:
set_option specimen.scoreType "Scoring.DensityScore"
derive_mutual
(fun lo hi => ∃ (t : BinaryTree), BST lo hi t)

See ScheduleQualityRegressionTest.lean for a comparison of all three strategies on the same relation.
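The difference between a sum-based and a worst-leaf score can be sketched in a few lines of core Lean 4. This is only one possible reading of the descriptions in the table, not Specimen's implementation: LeafCost, sumScore, and worstLeafScore are hypothetical names, and collapsing the three components into a single Nat is a simplifying assumption.

```lean
namespace ScoreSketch

-- Hypothetical per-leaf cost with the three components named in the table.
structure LeafCost where
  checks : Nat
  length : Nat
  unconstrained : Nat

def LeafCost.total (c : LeafCost) : Nat :=
  c.checks + c.length + c.unconstrained

-- Sum-style aggregation: every leaf contributes to the score.
def sumScore (leaves : List LeafCost) : Nat :=
  leaves.foldl (fun acc (c : LeafCost) => acc + c.total) 0

-- Worst-leaf-style aggregation: only the most expensive leaf matters.
def worstLeafScore (leaves : List LeafCost) : Nat :=
  leaves.foldl (fun acc (c : LeafCost) => max acc c.total) 0

def exampleLeaves : List LeafCost :=
  [⟨1, 2, 0⟩, ⟨4, 1, 1⟩, ⟨0, 3, 0⟩]

#eval sumScore exampleLeaves        -- 12
#eval worstLeafScore exampleLeaves  -- 6

end ScoreSketch
```

Under the sum, a schedule with one expensive leaf can still score well if its other leaves are cheap; under the max, that single leaf determines the score.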
Building & compiling:
- To compile, run lake build from the top-level repository.
- To run snapshot tests, run lake test.
- To run linter checks, run lake lint.
  - This invokes the linter provided via the Batteries library.

Typeclass definitions:
- ArbitrarySizedSuchThat.lean: The ArbitrarySuchThat & ArbitrarySizedSuchThat typeclasses for constrained generators, adapted from QuickChick
- DecOpt.lean: The DecOpt typeclass for partially decidable propositions, adapted from QuickChick
- Enumerators.lean: The Enum, EnumSized, EnumSuchThat, EnumSizedSuchThat typeclasses for constrained & unconstrained enumeration

Combinators for generators & enumerators:
- GeneratorCombinators.lean: Extra combinators for Plausible generators (e.g. analogs of the sized and frequency combinators from Haskell QuickCheck)
- EnumeratorCombinators.lean: Combinators over enumerators

Algorithm for deriving constrained producers & checkers (adapted from the QuickChick papers):
- UnificationMonad.lean: The unification monad described in Generating Good Generators
- DeriveConstrainedProducer.lean: Algorithm for deriving constrained generators, including the derive_mutual command for multi-spec mutual derivation
- MExp.lean: An intermediate representation for monadic expressions (MExp), used when compiling schedules to Lean code
- MakeConstrainedProducerInstance.lean: Auxiliary functions for creating instances of typeclasses for constrained producers (ArbitrarySuchThat, EnumSuchThat)
- DeriveChecker.lean: Deriver for automatically deriving checkers (instances of the DecOpt typeclass)
- Schedules.lean: Type definitions for generator schedules
- DeriveSchedules.lean: Algorithm for deriving generator schedules
- SearchTree.lean: Dependency-aware hypothesis ordering via lazy search tree with branch-and-bound pruning

Schedule scoring & quality analysis:
- Score.lean: Type-erased score values used by the modular scoring framework
- Scoring.lean: Modular scoring framework with pluggable strategies (DefaultScore, WorstLeafScore, DensityScore) for evaluating schedule quality
- PatternCoverage.lean: Pattern coverage trie that partitions the input space of an inductive relation, identifies weak spots, and annotates leaves with constructor coverage

Derivers for unconstrained producers:
- DeriveArbitrary.lean: Deriver for unconstrained generators (instances of the Arbitrary / ArbitrarySized typeclasses), including support for mutually recursive and parameterized types
- DeriveEnum.lean: Deriver for unconstrained enumerators (instances of the Enum / EnumSized typeclasses), including nested and mutually recursive types

Miscellany:
- TSyntaxCombinators.lean: Combinators over TSyntax for creating monadic do-blocks & other Lean expressions via metaprogramming
- LazyList.lean: Implementation of lazy lists (used for enumerators)
- LazyRoseTree.lean: Lazy rose tree data structure
- Idents.lean: Utilities for dealing with identifiers / producing fresh names
- Utils.lean: Other miscellaneous utils
- Debug.lean: Debug tracing and option flags for Specimen

Overview of test corpus:
- The SpecimenTest subdirectory contains snapshot tests (aka expect tests) for the derivation commands.
- Run lake test to check that the derived generators in SpecimenTest typecheck, and that the code for the derived generators matches the expected output.
- Key test directories:
  - DeriveArbitrarySuchThat/: constrained generators (BST, balanced tree, STLC, regex, permutations, multi-output, mutual recursion)
  - DeriveEnumSuchThat/: constrained enumerators
  - DeriveDecOpt/: checkers
  - DeriveArbitrary/: unconstrained generators (parameterized types, mutually recursive types, structures)
  - DeriveEnum/: unconstrained enumerators (nested recursion, mutual recursion)
  - CedarExample/: real-world application: well-typed Cedar policy expression generators
  - ArithCompiler/: end-to-end example: compiler correctness testing

See CONTRIBUTING for more information.
This project is licensed under the Apache-2.0 License.