Skip to content

kimmingul/samplesize-copilot

Repository files navigation

samplesize-copilot — Sample-size and power calculations for clinical and applied research

CI License: Apache 2.0 Python Methods Tests

A Python package + Claude Code plugin implementing 234 sample-size and power-calculation methods validated against worked examples from established statistical references.

Status

v0.1 — 234 methods implemented and validated, 819 worked-example fixture tests passing. Doctor passes 9/9 integrity checks across registry, callables, plugin manifest, and reporting templates. Roadmap in docs/ROADMAP.md; live coverage matrix in docs/METHOD_COVERAGE.md.

Layout

samplesize-copilot/
├── samplesize/             # Python package — pure-Python calculators
│   ├── core/               # distributions, effect sizes, adjustments
│   ├── tests/              # per-method calculator modules
│   ├── reporting/          # plots, tables, protocol text, audit, R/SAS export
│   │   └── templates/      # i18n templates (protocol.en.yaml, protocol.ko.yaml, ...)
│   ├── registry/           # methods.json — categorical metadata only
│   ├── cli.py              # `python -m samplesize ...`
│   └── doctor.py           # `samplesize doctor` integrity checks
├── plugin/                 # Claude Code plugin
│   ├── .claude-plugin/plugin.json
│   ├── skills/             # design / calculate / report / validate
│   ├── commands/           # /ss-design, /ss-calc, /ss-power, /ss-curve, /ss-report
│   └── agents/             # methodologist, calculator, validator
├── reference/              # Local-only knowledge base (gitignored, user-supplied)
│   └── ...                 # Validation reference material — not bundled in repo
├── tests/                  # pytest suites
│   ├── validation/         # worked-example regression tests
│   └── unit/               # registry / doctor / signature parity
└── docs/                   # ARCHITECTURE, ROADMAP, METHOD_COVERAGE, COOKBOOK, TROUBLESHOOTING

Installation

pip install -e ".[dev]"

Quick start

samplesize list                                          # available methods
samplesize show two_sample_t_equal_var                   # full metadata + kwargs
samplesize calc two_sample_t_equal_var \
  --json-args '{"mean1":10,"mean2":0,"sd":20,"alpha":0.05,"power":0.80,"sides":2}'
# → n1=64, n2=64, achieved_power=0.8015; audit JSON saved

# follow-ups on the audit just printed
AUDIT=$(ls -t .samplesize/audit/*.json | head -1)
samplesize report "$AUDIT" --kind power-curve --out curve.png
samplesize report "$AUDIT" --kind protocol --lang en
samplesize report "$AUDIT" --kind sensitivity --vary "sd=15,20,25,30"
samplesize report "$AUDIT" --kind r-code        # pwr::pwr.t.test(...) equivalent
samplesize report "$AUDIT" --kind sas-code      # PROC POWER equivalent

# sanity gate
samplesize doctor

More recipes. docs/COOKBOOK.md has 15 worked study scenarios (RCT, NI, equivalence, survival, Cox, McNemar, χ², ANOVA, correlation). Hit an error? docs/TROUBLESHOOTING.md.

Using inside Claude Code (plugin)

Two ways to make the slash commands and skills available:

Ephemeral — load for one session:

claude --plugin-dir /path/to/samplesize-copilot/plugin

Persistent — register the marketplace and install:

claude plugin marketplace add kimmingul/samplesize-copilot   # from GitHub
# …or from a local clone (repo root): claude plugin marketplace add /path/to/samplesize-copilot
claude plugin install samplesize-copilot@samplesize-copilot  # requires CC ≥ 2.2

Once loaded, these commands work inside Claude Code:

  • /samplesize-copilot:ss-design <study description> — pick the right test
  • /samplesize-copilot:ss-calc <method> ... — run a calculation
  • /samplesize-copilot:ss-power ... — solve for power at fixed N
  • /samplesize-copilot:ss-curve — emit a power-curve PNG for the latest result
  • /samplesize-copilot:ss-report — generate ICH E9 protocol / grant text
  • /samplesize-copilot:ss-validate <method?> — run worked-example validation tests

Coverage

234 methods across:

  • Means (one-sample, two-sample, paired, non-inferiority, equivalence, superiority-by-margin)
  • Proportions (one, two, McNemar, NI/equivalence variants)
  • Correlation (Pearson exact and Fisher-z)
  • ANOVA / GLM (one-way F, chi-square)
  • Survival (logrank Freedman, Cox regression Hsieh-Lavori)
  • Group-sequential (O'Brien-Fleming, Pocock alpha-spending)
  • Cluster-randomized (two means, two proportions, Donner-Klar)
  • Cross-over (2×2 design)
  • Phase II (Simon two-stage)
  • ROC / diagnostic
  • And more — see docs/METHOD_COVERAGE.md

Validation

819 fixture tests passing. Methods are validated against worked examples from established statistical software references. Reference content itself is user-supplied (see reference/ — not bundled in this repository).

Fixtures live under tests/validation/fixtures/<method_id>.yaml.

pytest tests/validation/

License

Apache License 2.0 — see LICENSE.

Acknowledgments

Method implementations draw on the primary statistical literature, including:

  • Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences (2nd ed.)
  • Donner, A. & Klar, N. (1996). Statistical considerations in the design and analysis of community intervention trials.
  • Hsieh, F. Y. & Lavori, P. W. (2000). Sample-size calculations for the Cox proportional hazards regression model with nonbinary covariates.
  • Schoenfeld, D. (1981). The asymptotic properties of nonparametric tests for comparing survival distributions.
  • Bonett, D. G. & Wright, T. A. (2000). Sample size requirements for estimating Pearson, Kendall and Spearman correlations.
  • Hanley, J. A. & McNeil, B. J. (1982). The meaning and use of the area under a receiver operating characteristic (ROC) curve.
  • Simon, R. (1989). Optimal two-stage designs for phase II clinical trials.
  • Wang, S. K. & Tsiatis, A. A. (1987). Approximately optimal one-parameter boundaries for group sequential trials.
  • Flack, V. F. et al. (1988). Sample size determinations for the two rater kappa statistic.

Releases

No releases published

Packages

 
 
 

Contributors

Languages