Comparison of Lakatos & Lan (1992), nQuery, East and R packages

Authors: Arocena Thomas (Oncostat), Dan Chaltiel, Anne Lourdessamy

Published: March 13, 2026

This analysis started from a simple question:

nQuery cites Lakatos and Lan (1992) as a reference, but which method from that article does nQuery actually implement?

Because Lakatos and Lan (1992) is often cited in a generic way, it’s easy to forget that the paper actually compared three distinct methods for calculating sample sizes under proportional hazards:

  1. Freedman – a simple exponential approximation that estimates the required number of events for a log-rank test under proportional hazards.
  2. RGS (Rubinstein–Gail–Santner) – an improved approximation that refines the variance calculation of the log-rank test and better accounts for the timing of events.
  3. Lakatos – a more general piecewise-exponential approach that models accrual, follow-up, and hazard rates over time, providing more realistic sample size calculations.
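
As a rough illustration of the simplest of the three methods, the sketch below computes the required number of events from the commonly cited form of Freedman's formula for 1:1 allocation, d = (z_{1-α/2} + z_{1-β})² (1 + HR)² / (1 - HR)². The function name freedman_events and the use of a two-sided α are assumptions made for this example; they are not taken from the article or from any of the tools compared here.

```python
from math import ceil
from statistics import NormalDist


def freedman_events(hazard_ratio, alpha=0.05, beta=0.1):
    """Events required by Freedman's formula.

    Assumptions (not stated in the source): 1:1 allocation and a
    two-sided significance level alpha.
    """
    if hazard_ratio <= 0 or hazard_ratio == 1:
        raise ValueError("hazard_ratio must be positive and different from 1")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value of the test
    z_beta = NormalDist().inv_cdf(1 - beta)        # quantile for power 1 - beta
    return (z_alpha + z_beta) ** 2 * (1 + hazard_ratio) ** 2 / (1 - hazard_ratio) ** 2


# Hazard ratios used in the scenarios of this comparison
for hr in (0.667, 0.5, 0.25):
    print(hr, ceil(freedman_events(hr)))
```

This gives a number of events, not a number of participants. Converting to a sample size also requires the probability of observing an event during the study, which depends on survival, accrual, and follow-up. For that reason these figures are not directly comparable to the sample sizes reported in the Results.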

To understand which method is implemented by each contemporary tool, we reproduced the set of scenarios from the 1992 article and compared the sample sizes obtained from five tools (rpact, rashnu, gsDesign2, East, and nQuery) against the benchmark values for Lakatos, Freedman, and RGS. We rely on rashnu::LakatosSampleSize as an R implementation of the original Lakatos method, allowing a direct comparison between our results and the classical Lakatos benchmarks.
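
To make the scenario inputs concrete, here is a minimal sketch of how one scenario (a 10-year survival probability and a hazard ratio) could be turned into per-arm quantities. It assumes the survival value refers to the control arm and, for the hazard rates only, that hazards are constant over time (exponential survival). The helper scenario_parameters is hypothetical and is not part of any package mentioned here.

```python
from math import log

STUDY_DURATION_YEARS = 10  # study duration used in the scenarios


def scenario_parameters(control_survival, hazard_ratio, duration=STUDY_DURATION_YEARS):
    """Hypothetical helper: derive per-arm quantities for one scenario.

    Assumes control_survival is the control-arm survival probability
    at `duration` years.
    """
    # Under proportional hazards, S_exp(t) = S_ctrl(t) ** HR.
    experimental_survival = control_survival ** hazard_ratio
    # Extra assumption: constant hazards, so S(t) = exp(-hazard * t).
    control_hazard = -log(control_survival) / duration
    experimental_hazard = hazard_ratio * control_hazard
    return {
        "experimental_survival": experimental_survival,
        "control_hazard": control_hazard,
        "experimental_hazard": experimental_hazard,
    }


params = scenario_parameters(control_survival=0.8, hazard_ratio=0.5)
for name, value in params.items():
    print(f"{name}: {value:.4f}")
```

Each tool has its own input conventions (event rates, medians, or hazards), so a conversion of this kind is typically needed before the same scenario can be entered everywhere.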

Results

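The table below compares the sample sizes calculated by the five tools (blue columns: n_rpact, n_rashnu, n_gsdesign2, n_east, n_nquery) with those reported in the article (brown columns: n_L for Lakatos, n_F for Freedman, n_RGS for RGS).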

Lakatos & Lan (1992) comparison with East, nQuery and R packages
Survival at 10 years; Study duration = 10 years; α = 0.05; β = 0.1
Survival     HR  Accrual  n_rpact  n_rashnu  n_gsdesign2  n_east  n_nquery   n_L   n_F  n_RGS
0.8       0.667        1     1588      1620         1610    1593      1640  1617  1628   1640
0.8       0.667        5     1979      2022         2007    1984      2046  2017  2024   2046
0.8       0.667        9     2671      2728         2710    2679      2764  2724  2709   2764
0.8       0.500        1      601       638          625     604       664   638   649    664
0.8       0.500        5      749       798          781     753       832   798   807    831
0.8       0.500        9     1011      1078         1055    1017      1124  1079  1081   1124
0.8       0.250        1      181       230          214     182       270   230   241    269
0.8       0.250        5      225       290          267     227       340   289   299    338
0.8       0.250        9      304       392          362     306       460   392   401    459
0.2       0.667        1      361       362          362     362       364   360   370    363
0.2       0.667        5      414       416          416     415       420   414   419    418
0.2       0.667        9      527       530          530     528       534   528   509    534
0.2       0.500        1      133       134          135     134       138   134   144    138
0.2       0.500        5      154       156          157     155       162   156   164    161
0.2       0.500        9      196       200          201     197       208   200   200    207
0.2       0.250        1       40        44           44      40        50    43    53     48
0.2       0.250        5       46        52           51      47        58    51    61     58
0.2       0.250        9       59        68           66      60        78    66    74     76
Blue: Computed sample sizes
Brown: Sample sizes from the 1992 article
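  • The modern software outputs (blue columns) form a tight cluster when the effect is modest and survival is low: at survival 0.2 and HR 0.667, the five values span at most 7 participants. The spread widens as the hazard ratio moves away from 1, reaching 156 participants (304 to 460) at survival 0.8, HR 0.25, accrual 9.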

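  • Within the brown block, RGS is the largest value in 12 of the 18 scenarios and Freedman in the remaining 6 (all at survival 0.2). Lakatos is the smallest, or tied for smallest, in 16 of the 18 scenarios.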

  • nQuery closely matches RGS: the two columns are identical in 8 of the 18 scenarios and never differ by more than 2 participants.

  • rashnu matches the Lakatos method, as expected: the values are identical in 7 scenarios and differ by at most 5 participants (2022 versus 2017).

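  • rpact, gsDesign2, and East do not match any single historical method. rpact and East agree within 8 participants in every scenario, while gsDesign2 is somewhat larger at survival 0.8 (for example, 214 versus 181 and 182 at HR 0.25, accrual 1). All three generally fall closer to the Lakatos values than to Freedman or RGS.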
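
The observations above can be checked directly from the table. The following sketch transcribes the 18 rows and computes, for each tool, the benchmark with the smallest mean absolute difference, along with a few pairwise maximum differences. The helper names (column, max_abs_diff, mean_abs_diff) are illustrative, and mean absolute difference is only one possible way to measure closeness.

```python
from collections import Counter

COLUMNS = ["survival", "hr", "accrual", "n_rpact", "n_rashnu", "n_gsdesign2",
           "n_east", "n_nquery", "n_L", "n_F", "n_RGS"]

# Rows transcribed from the results table
ROWS = [
    (0.8, 0.667, 1, 1588, 1620, 1610, 1593, 1640, 1617, 1628, 1640),
    (0.8, 0.667, 5, 1979, 2022, 2007, 1984, 2046, 2017, 2024, 2046),
    (0.8, 0.667, 9, 2671, 2728, 2710, 2679, 2764, 2724, 2709, 2764),
    (0.8, 0.500, 1, 601, 638, 625, 604, 664, 638, 649, 664),
    (0.8, 0.500, 5, 749, 798, 781, 753, 832, 798, 807, 831),
    (0.8, 0.500, 9, 1011, 1078, 1055, 1017, 1124, 1079, 1081, 1124),
    (0.8, 0.250, 1, 181, 230, 214, 182, 270, 230, 241, 269),
    (0.8, 0.250, 5, 225, 290, 267, 227, 340, 289, 299, 338),
    (0.8, 0.250, 9, 304, 392, 362, 306, 460, 392, 401, 459),
    (0.2, 0.667, 1, 361, 362, 362, 362, 364, 360, 370, 363),
    (0.2, 0.667, 5, 414, 416, 416, 415, 420, 414, 419, 418),
    (0.2, 0.667, 9, 527, 530, 530, 528, 534, 528, 509, 534),
    (0.2, 0.500, 1, 133, 134, 135, 134, 138, 134, 144, 138),
    (0.2, 0.500, 5, 154, 156, 157, 155, 162, 156, 164, 161),
    (0.2, 0.500, 9, 196, 200, 201, 197, 208, 200, 200, 207),
    (0.2, 0.250, 1, 40, 44, 44, 40, 50, 43, 53, 48),
    (0.2, 0.250, 5, 46, 52, 51, 47, 58, 51, 61, 58),
    (0.2, 0.250, 9, 59, 68, 66, 60, 78, 66, 74, 76),
]

TOOLS = COLUMNS[3:8]       # computed sample sizes (blue)
BENCHMARKS = COLUMNS[8:]   # values from the 1992 article (brown)


def column(name):
    """Return all values of one table column."""
    idx = COLUMNS.index(name)
    return [row[idx] for row in ROWS]


def max_abs_diff(a, b):
    """Largest absolute row-wise difference between two columns."""
    return max(abs(x - y) for x, y in zip(column(a), column(b)))


def mean_abs_diff(a, b):
    """Mean absolute row-wise difference between two columns."""
    return sum(abs(x - y) for x, y in zip(column(a), column(b))) / len(ROWS)


# Closest benchmark for each tool, by mean absolute difference
closest = {tool: min(BENCHMARKS, key=lambda b: mean_abs_diff(tool, b)) for tool in TOOLS}

# How often each benchmark is the largest of the three in a row
largest = Counter(max(BENCHMARKS, key=lambda b: row[COLUMNS.index(b)]) for row in ROWS)

for tool, bench in closest.items():
    print(f"{tool}: closest to {bench} (max abs diff {max_abs_diff(tool, bench)})")
print("rpact vs East, max abs diff:", max_abs_diff("n_rpact", "n_east"))
print("largest benchmark counts:", dict(largest))
```

Absolute differences give more weight to the scenarios with large sample sizes. A relative difference would be a reasonable alternative if the small-sample rows matter more.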

References

Lakatos, Edward, and K. K. Lan. 1992. “A Comparison of Sample Size Methods for the Logrank Statistic.” Statistics in Medicine 11 (2): 179–91. https://doi.org/10.1002/sim.4780110205.