Add harm bound to AHR designs by LittleBeannie · Pull Request #640 · Merck/gsDesign2

LittleBeannie · 2026-06-10T19:44:58Z

To solve issue #618.

@jdblischak: In addition to "Efficacy" and "Futility" bound, I added "Harm" bound, which sequentially leads to some changes in gs_bound_summary(). The changes in gs_bound_summary() is suggested by GPT5.5. Could you please review if these AI-suggested changes looks good to you?

This reverts commit 9b59c45.

The helper-support-as_rtf.R is auto-sourced by testit, so the explicit source() call failed during R CMD check. Also update as_gt and as_rtf snapshots to reflect the new bound ordering. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Run roxygenize to sync Rd files with code (fixes WARNING) - Remove test-independent-as_rtf.R (the .md snapshot is sufficient) - Replace all() with bare logical vector in test assertions Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…alone Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

yihui · 2026-06-11T20:14:10Z

Note on snapshot test changes

The .md snapshot diffs look large but are caused by a single behavioral change: as_gt() now sorts bounds by factor level order (Efficacy, Futility, Harm) via arrange(x2, Analysis, Bound), whereas previously it used arrange(x2, Analysis) which preserved the summary() order (Futility before Efficacy, from desc(bound) alphabetical sorting).

This means every analysis section in both test-independent-as_gt.md and test-independent-as_rtf.md has its Efficacy and Futility rows swapped. The actual numeric values are unchanged.

The other visible change in as_gt.md is lrrrrr → crrrrr (first column alignment changed from left to center).

yihui · 2026-06-11T20:16:07Z

Correction on the alignment change: the lrrrrr → crrrrr (first column left → center) is not from this PR's code. It's from gt 1.3.0 changing its default LaTeX alignment for the first data column in row-grouped tables. The old snapshot was generated with an older gt version. Since CI also installs gt 1.3.0, the updated snapshot is correct.

LittleBeannie · 2026-06-16T13:23:05Z

Thank you @yihui for helping with the testit snapshot issues!

jdblischak · 2026-06-16T15:37:17Z

+  harm_bound <- x |> dplyr::filter(bound == "harm") |> dplyr::pull(z)
+  futility_bound <- x |> dplyr::filter(bound == "lower") |> dplyr::pull(z)
+  (harm_bound <= futility_bound)
+})


Please instruct the AI to write unit tests for every function that it edits

jdblischak · 2026-06-16T15:38:28Z


  # One-sided design should not include Futility column
  if (all(is.na(out[["Futility"]]))) out[["Futility"]] <- NULL
+  if (all(is.na(out[["Harm"]]))) out[["Harm"]] <- NULL


Changes look good when I tested locally, but this needs some unit tests for long-term maintenance. For example, confirm that a one-sided design does not return the column Harm

jdblischak · 2026-06-16T15:59:18Z

+  upper = gs_b, upar = -qnorm(0.025), test_upper = TRUE,
+  lower = gs_b, lpar = -1, test_lower = TRUE,
+  harm = gs_b, hpar = -2, test_harm = TRUE
+)


Does it make sense to compute and return a futility or harm bound when there is only a single analysis?

If I run this with test_harm = FALSE, it only returns an efficacy bound for the single analysis. But with test_harm = TRUE, it returns all 3 bounds (efficacy, futility, harm).

Could you please provide your code example?

I was using the example immediately above. Here is more minimal code to reproduce what I observe:

library("gsDesign2") # Returns upper bound gs_design_ahr(analysis_time = 40)$bound # Returns upper and lower bounds gs_design_ahr(analysis_time = 40, test_harm = TRUE)$bound # Returns upper, lower, and harm bounds gs_design_ahr(analysis_time = 40, hpar = -2, test_harm = TRUE)$bound

This matters because the bounds return by gs_design_ahr() are directly used and reported by gs_bound_summary(). Does it make sense to include columns for Futility and Harm for a trial with a single analysis time?

x <- gs_design_ahr(analysis_time = 40, hpar = -2, test_harm = TRUE) gs_bound_summary(x) ## Analysis Value Efficacy Futility Harm ## 1 Final Z 1.9600 1.9444 -2.0000 ## 2 N: 428 p (1-sided) 0.0250 0.0259 0.9772 ## 3 Events: 280 ~HR at bound 0.7911 0.7925 1.2702 ## 4 Month: 40 P(Cross) if HR=1 0.0250 0.9741 0.0228 ## 5 P(Cross) if AHR=0.68 0.9000 0.0973 0.0000

These are really good questions. I confirmed with Keaven:

When it is a fixed design, there is no harm bound.

When it is a group sequential design, the harm bound and lower bound are relatively independent. However, when the lower bound exists, the harm bound is always below the harm bound.

I will test the these 2 bullets in a coming commit!

The commit 9117f53 gives error messages when users test harm bound in a fixed design with one single analysis.

Besides, I added your 3 examples as developer tests in the commit 55bbe26.

yihui

I only looked at the test part as I don't have expertise on the design part.

jdblischak · 2026-06-19T16:14:46Z

+
+```{r}
+candidate_harm_bounds <- lapply(astar_candidates, function(astar_candidate) {
+  gs_harm <- gsDesign::gsSurv(


The pkgdown workflow failed with the error ! unused arguments (sfharm = gsDesign::sfHSD, sfharmparam = -2) because this new article requires the dev version of {gsDesign}

jdblischak · 2026-06-19T16:18:55Z

  }
+  if (n_analysis == 1 && test_harm) {
+    stop("gs_design_ahr() harm bound cannot be tested if there is only one analysis.")
+  }


What about gs_design_npe() and gs_power_npe()? They still return the harm bound for a fixed design when test_harm = TRUE.

library("gsDesign2") gs_design_npe() ## # A tibble: 1 × 10 ## analysis bound z probability probability0 theta info info0 info1 info_frac ## <dbl> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> ## 1 1 upper 1.96 0.9 0.025 0.1 1051. 1051. 1051. 1 gs_design_npe(test_harm = TRUE) ## analysis bound z probability probability0 theta info_frac info info0 info1 ## 1 1 upper 1.959964 0.9 0.025 0.1 1 1050.742 1050.742 1050.742 ## 2 1 lower -Inf 0.0 0.000 0.1 1 1050.742 1050.742 1050.742 ## 3 1 harm -Inf 0.0 0.000 0.1 1 1050.742 1050.742 1050.742 gs_power_npe() ## analysis bound z probability theta theta1 info_frac info info0 info1 ## 1 1 upper 1.959964 0.03144531 0.1 0.1 1 1 1 1 ## 2 1 lower -Inf 0.00000000 0.1 0.1 1 1 1 1 > gs_power_npe(test_harm = TRUE) ## analysis bound z probability theta theta1 info_frac info info0 info1 ## 1 1 upper 1.959964 0.03144531 0.1 0.1 1 1 1 1 ## 2 1 lower -Inf 0.00000000 0.1 0.1 1 1 1 1 ## 3 1 harm -Inf 0.00000000 0.1 0.1 1 1 1 1

jdblischak · 2026-06-19T16:23:35Z

  (filter(x$bound, bound == "lower")$spending_time %==% expected)
 })
+
+assert("Harm bound is not provided when it is a fixed design", {


I'm confused. All of the calls below use the default of test_harm = FALSE. Thus the harm bound is not returned because of this argument, and has nothing to do with the fact that these are fixed designs.

jdblischak · 2026-06-19T16:24:10Z

+                      upper = gs_b, upar = qnorm(1 - 0.025), test_upper = TRUE,
+                      lower = gs_b, lpar = -Inf, test_lower = FALSE)
+
+  (all.equal(x1$bound[abs(x1$z) != Inf], "upper"))


all.equal() should be used for numeric comparisons. I think == should be sufficient for comparing strings. Alternatively you could use identical().

jdblischak · 2026-06-19T16:27:15Z

+  (all.equal(x1$bound[abs(x1$z) != Inf], "upper"))
+  (all.equal(x2$bound[abs(x2$z) != Inf], "upper"))
+  (all.equal(x3$bound$bound[abs(x3$bound$z) != Inf], "upper"))
+  (all.equal(x4$bound$bound[abs(x4$bound$z) != Inf], "upper"))


Using != Inf seems like cheating here. Don't you want to confirm that the bound table does not include a row for the harm bound at all?

Only x1 (produced by gs_power_npe()) returns a lower bound with a z of -Inf. Given that test_lower = FALSE, shouldn't the function gs_power_npe() instead be updated to not return this lower bound?

jdblischak · 2026-06-19T16:30:36Z

+  (all(x2$bound$z[x2$bound$bound == "lower"] - x2$bound$z[x2$bound$bound == "harm"] >= 0))
+})
+
+assert("Harm bound is always below the lower bound in a group sequential design", {


How is this test different than the one above? The comment used to describe them is identical.

jdblischak · 2026-06-19T16:31:29Z

+                      upper = gs_spending_bound, 
+                      upar = list(sf = gsDesign::sfLDOF, total_spend = 0.025), 
+                      test_upper = TRUE,
+                      test_lower = FALSE,


No lower bound is returned because test_lower = FALSE

jdblischak · 2026-06-19T16:31:41Z

+                      harm = gs_spending_bound,
+                      hpar = list(sf = gsDesign::sfHSD, total_spend = 0.1, param = -4),
+                      test_harm = TRUE)
+  x3 <- gs_design_ahr(analysis_time = c(12, 24, 36), info_frac = NULL,


This x3 object is never used

jdblischak · 2026-06-19T16:34:21Z

+                      test_harm = TRUE)
+
+  (all(x1$bound$z[x1$bound$bound == "lower"] - x1$bound$z[x1$bound$bound == "harm"] >= 0))
+  (all(x2$bound$z[x2$bound$bound == "lower"] - x2$bound$z[x2$bound$bound == "harm"] >= 0))


These tests are meaningless. Because test_lower = FALSE, there is no lower bound.

x1$bound ## analysis bound probability probability0 z ~hr at bound nominal p spending_time ## 1 1 upper 0.002264189 5.380432e-05 3.872763 0.4485851 5.380432e-05 0.3080415 ## 2 2 upper 0.550377366 9.209304e-03 2.357870 0.7299828 9.190059e-03 0.7407917 ## 3 3 upper 0.899999844 2.500000e-02 2.009598 0.7938368 2.223685e-02 1.0000000 x1$bound$z[x1$bound$bound == "lower"] ## numeric(0) x1$bound$z[x1$bound$bound == "harm"] ## numeric(0) x1$bound$z[x1$bound$bound == "lower"] - x1$bound$z[x1$bound$bound == "harm"] >= 0 ## logical(0) all(x1$bound$z[x1$bound$bound == "lower"] - x1$bound$z[x1$bound$bound == "harm"] >= 0) ## [1] TRUE

jdblischak · 2026-06-19T16:35:45Z

+    test_lower = c(TRUE, TRUE, FALSE),
+    harm = gs_spending_bound,
+    hpar = list(sf = gsDesign::sfHSD, total_spend = 0.1, param = -4, timing = NULL),
+    test_harm = c(TRUE, TRUE, TRUE)


Does it make sense for test_harm to be TRUE for the 3rd analysis but test_lower be FALSE?

LittleBeannie added 4 commits June 10, 2026 15:41

Add harm bound to gs_power_npe and gs_design_npe

489ed2a

Add harm bound to gs_xxx_ahr

5e415a4

Update summary functions when harm bound is added

568d492

add developer tests

399cd2a

LittleBeannie self-assigned this Jun 10, 2026

LittleBeannie linked an issue Jun 10, 2026 that may be closed by this pull request

Add harm bound #618

Open

LittleBeannie and others added 9 commits June 11, 2026 10:31

Add as_rtf test file and update snapshot

f2bcae3

Merge remote-tracking branch 'origin/main' into 618-add-harm-bound

f585c45

Rename test file to match new naming convention

9b59c45

Revert "Rename test file to match new naming convention"

b577e09

This reverts commit 9b59c45.

Rename testing file

6d1efe7

Clarify test skill: helpers are auto-sourced, .md snapshots are stand…

fa84635

…alone Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Fix parse error in gs_power_ahr examples (missing # before Example 5)

b561720

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

LittleBeannie requested review from jdblischak and keaven June 16, 2026 13:25

jdblischak reviewed Jun 16, 2026

View reviewed changes

LittleBeannie added 5 commits June 18, 2026 16:01

Add an error message when users add harm bound to a fixed design

9117f53

Restrict harm bound only shows when futility is tested

b0da6ff

Add more developer tests

55bbe26

Update the Rd files

c642a6b

Address CICD checking errors

31d59a0

LittleBeannie requested review from jdblischak and yihui June 18, 2026 21:08

yihui approved these changes Jun 18, 2026

View reviewed changes

Add harm bound Schoenfeld vignette

0a65c16

jdblischak reviewed Jun 19, 2026

View reviewed changes

Conversation

LittleBeannie commented Jun 10, 2026 • edited by jdblischak Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

yihui commented Jun 11, 2026

Note on snapshot test changes

Uh oh!

yihui commented Jun 11, 2026

Uh oh!

LittleBeannie commented Jun 16, 2026

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

yihui left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

LittleBeannie commented Jun 10, 2026 •

edited by jdblischak

Loading