BRAR Simulation Results
  • Patient Benefit
  • Parameter Estimation
  • Hypothesis Testing
  • Convergence
  • Condition-method-level data
  • Method-aggregated data
  • Reproducibility
  • Rate of successes
  • Allocations to treatment group 1
  • Sample size difference
  • Sample size imbalance
  • Extreme randomization probabilities
Mean rate of successes (i.e., the number of successes in a study divided by its sample size, averaged over all simulations). The maximum MCSE is 0.047%.

Rate of allocations to treatment group 1 averaged across simulations. Treatment group 1 is the ‘best’ treatment group in all conditions apart from those with \(\text{RD}_1 = \theta_1 - \theta_C = 0\), where the remaining treatments are the most beneficial. The maximum MCSE is 0.27%.

Mean difference between the sample size of treatment group 1 and the average sample size in the remaining groups (control and other treatments). Error bars indicate the 2.5% and 97.5% quantiles.

Proportion of simulations in which the average sample size of the groups other than treatment group 1 exceeds the sample size of treatment group 1 by more than 10% of the total sample size. The maximum MCSE is 0.5%.

Mean proportion of randomization probabilities either less than 0.1 or greater than 0.9. The maximum MCSE is 0.33%.

  • Bias
  • Coverage
Empirical bias of the estimate of the risk difference \(\text{RD}_1\) between the first treatment group and the control group. The maximum MCSE is 0.0013.

Empirical coverage of the 95% Wald confidence interval for the risk difference \(\text{RD}_1\) between the first treatment group and the control group. The maximum MCSE is 0.38%.

  • Type I error rate
  • Power
Empirical type I error rate of the Wald test of \(\text{RD}_1 = 0\). Error bars denote MCSEs (Monte Carlo Standard Errors).

Empirical power of the Wald test of \(\text{RD}_1 = 0\). The maximum MCSE is 0.5%.

Mean convergence
Mean convergence rate within a study averaged across simulations.

Below is information on the computational environment of the server where the simulation study was run.

  • utils::sessionInfo
  • sessioninfo::session_info
R version 4.5.0 (2025-04-11)
Platform: x86_64-pc-linux-gnu
Running under: Debian GNU/Linux 13 (trixie)

Matrix products: default
BLAS:   /usr/local/lib/R/lib/libRblas.so 
LAPACK: /usr/local/lib/R/lib/libRlapack.so;  LAPACK version 3.12.1

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=de_CH.UTF-8        LC_COLLATE=en_GB.UTF-8    
 [5] LC_MONETARY=de_CH.UTF-8    LC_MESSAGES=en_GB.UTF-8   
 [7] LC_PAPER=de_CH.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=de_CH.UTF-8 LC_IDENTIFICATION=C       

time zone: Europe/Zurich
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] brar_0.1         SimDesign_2.20.0

loaded via a namespace (and not attached):
 [1] vctrs_0.6.5         progressr_0.15.1    cli_3.6.3          
 [4] rlang_1.1.5         generics_0.1.3      glue_1.8.0         
 [7] future.apply_1.20.0 listenv_0.9.1       brio_1.1.5         
[10] tibble_3.2.1        lifecycle_1.0.4     beepr_2.0          
[13] compiler_4.5.0      dplyr_1.1.4         codetools_0.2-20   
[16] sessioninfo_1.2.3   testthat_3.2.1.1    pkgconfig_2.0.3    
[19] pbapply_1.7-2       future_1.67.0       R.oo_1.27.1        
[22] R.utils_2.13.0      digest_0.6.35       R6_2.5.1           
[25] tidyselect_1.2.1    pillar_1.10.1       parallelly_1.45.1  
[28] parallel_4.5.0      magrittr_2.0.3      R.methodsS3_1.8.2  
[31] tools_4.5.0         globals_0.18.0      audio_0.1-11       
─ Session info ───────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.5.0 (2025-04-11)
 os       Debian GNU/Linux 13 (trixie)
 system   x86_64, linux-gnu
 ui       X11
 language (EN)
 collate  en_GB.UTF-8
 ctype    en_GB.UTF-8
 tz       Europe/Zurich
 date     2025-08-24
 pandoc   NA
 quarto   NA

─ Packages ───────────────────────────────────────────────────────────────────
 package      * version date (UTC) lib source
 audio          0.1-11  2023-08-18 [1] CRAN (R 4.5.0)
 beepr          2.0     2024-07-06 [1] CRAN (R 4.5.0)
 brar         * 0.1     2025-08-22 [1] local (/home/spawel/simulation-brar/brar/package/out/brar_0.1.tar.gz)
 brio           1.1.5   2024-04-24 [1] CRAN (R 4.4.0)
 cli            3.6.3   2024-06-21 [1] CRAN (R 4.4.0)
 codetools      0.2-20  2024-03-31 [2] CRAN (R 4.5.0)
 digest         0.6.35  2024-03-11 [1] CRAN (R 4.3.3)
 dplyr          1.1.4   2023-11-17 [1] CRAN (R 4.3.3)
 future         1.67.0  2025-07-29 [1] CRAN (R 4.5.0)
 future.apply   1.20.0  2025-06-06 [1] CRAN (R 4.5.0)
 generics       0.1.3   2022-07-05 [1] CRAN (R 4.3.3)
 globals        0.18.0  2025-05-08 [1] CRAN (R 4.5.0)
 glue           1.8.0   2024-09-30 [1] CRAN (R 4.4.1)
 lifecycle      1.0.4   2023-11-07 [1] CRAN (R 4.3.3)
 listenv        0.9.1   2024-01-29 [1] CRAN (R 4.5.0)
 magrittr       2.0.3   2022-03-30 [1] CRAN (R 4.3.3)
 parallelly     1.45.1  2025-07-24 [1] CRAN (R 4.5.0)
 pbapply        1.7-2   2023-06-27 [1] CRAN (R 4.3.3)
 pillar         1.10.1  2025-01-07 [1] CRAN (R 4.4.1)
 pkgconfig      2.0.3   2019-09-22 [1] CRAN (R 4.3.3)
 progressr      0.15.1  2024-11-22 [1] CRAN (R 4.5.0)
 R.methodsS3    1.8.2   2022-06-13 [1] CRAN (R 4.5.0)
 R.oo           1.27.1  2025-05-02 [1] CRAN (R 4.5.0)
 R.utils        2.13.0  2025-02-24 [1] CRAN (R 4.5.0)
 R6             2.5.1   2021-08-19 [1] CRAN (R 4.3.3)
 rlang          1.1.5   2025-01-17 [1] CRAN (R 4.4.1)
 sessioninfo    1.2.3   2025-02-05 [1] CRAN (R 4.5.0)
 SimDesign    * 2.20.0  2025-07-16 [1] CRAN (R 4.5.0)
 testthat       3.2.1.1 2024-04-14 [1] CRAN (R 4.4.0)
 tibble         3.2.1   2023-03-20 [1] CRAN (R 4.3.3)
 tidyselect     1.2.1   2024-03-11 [1] CRAN (R 4.3.3)
 vctrs          0.6.5   2023-12-01 [1] CRAN (R 4.3.3)

 [1] /home/spawel/lib/R
 [2] /usr/local/lib/R/library

──────────────────────────────────────────────────────────────────────────────

This dashboard presents detailed results from the simulation study reported in Pawel and Held (2025, https://github.com/SamCH93/brar). Below is a brief description of the design and analysis of the study, following the ADEMP reporting structure (Morris et al., 2019, https://doi.org/10.1002/sim.8086).

Aims

To investigate the design characteristics of different BRAR (Bayesian Response-Adaptive Randomization) methods.

Data-generating mechanism

In each repetition, a data set with \(n\) binary outcomes is simulated. Adaptive randomization is performed: patient \(i\) is randomly allocated to the control group or one of the \(K\) treatment groups based on randomization probabilities computed from the preceding outcomes \(1, \dots, i - 1\). Depending on the allocation, the outcome is simulated from a Bernoulli distribution with probability \(\theta_C\) in the control group, \(\theta_1\) in the first treatment group, and \(\theta_2\) in the remaining treatment groups (only present if \(K > 1\)).
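The study itself is implemented in R with the brar and SimDesign packages; purely as an illustration of this data-generating mechanism, the loop can be sketched in Python (the function names here are hypothetical and not part of the study code):

```python
import random

def simulate_trial(n, theta, randomize):
    """Simulate one adaptively randomized study with binary outcomes.

    theta: true success probabilities [theta_C, theta_1, ..., theta_K].
    randomize: rule mapping the preceding (group, outcome) pairs of
        patients 1, ..., i - 1 to randomization probabilities for patient i.
    """
    history = []
    for _ in range(n):
        probs = randomize(history)           # uses preceding outcomes only
        group = random.choices(range(len(theta)), weights=probs)[0]
        outcome = 1 if random.random() < theta[group] else 0  # Bernoulli draw
        history.append((group, outcome))
    return history

# equal randomization as a placeholder allocation rule (K = 2 treatments)
equal = lambda history: [1 / 3, 1 / 3, 1 / 3]
data = simulate_trial(200, [0.25, 0.35, 0.3], equal)
```

Any BRAR method fits this template by replacing the placeholder `randomize` rule with one that computes probabilities from the accumulated history.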

Parameters are chosen similarly to a previous simulation study (Robertson et al., 2023, https://doi.org/10.1214/22-STS865). We vary the sample size \(n \in \{200, 654\}\) to represent low- and high-powered studies, the number of treatment groups \(K \in \{1, 2, 3\}\), and the success probability in the first treatment group \(\theta_1 \in \{0.25, 0.35, 0.45\}\). The probabilities in the control group and the remaining groups are always fixed at \(\theta_C = 0.25\) and \(\theta_2 = \theta_3 = 0.3\), respectively. All these parameters are varied fully factorially, leading to \(2 \times 3 \times 3 = 18\) parameter conditions.
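The fully factorial grid can be written down explicitly; a minimal sketch (variable names are illustrative, not from the study code):

```python
from itertools import product

ns = [200, 654]                  # sample sizes (low / high power)
Ks = [1, 2, 3]                   # number of treatment groups
theta1s = [0.25, 0.35, 0.45]     # success probability in treatment group 1
theta_C, theta_rest = 0.25, 0.3  # fixed: control and remaining groups

conditions = [{"n": n, "K": K, "theta1": t1}
              for n, K, t1 in product(ns, Ks, theta1s)]
# len(conditions) == 2 * 3 * 3 == 18
```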

Since the treatment allocation determines from which true probability an outcome is simulated, data generation is directly influenced by the BRAR methods described below. These come with additional parameters that are, however, considered method tuning or hyperparameters rather than true underlying parameters.

Estimands and other targets

The primary targets of interest are the design characteristics of the compared BRAR methods in terms of patient benefit, parameter estimation, and hypothesis testing. For the latter two, the estimand of interest is the risk difference \(\text{RD}_1 = \theta_1 - \theta_C\) and the corresponding targeted null hypothesis is \(\text{RD}_1 = 0\).

Methods

We consider the BRAR methods described in Pawel and Held (2025, https://github.com/SamCH93/brar). These methods introduce a point hypothesis \(H_0\) postulating that all treatments are equally effective as the control. The prior probability of \(H_0\) is a tuning parameter and controls the variability of the randomization probabilities. Setting \(\Pr(H_0) = 1\) produces equal randomization, whereas \(\Pr(H_0) = 0\) produces Thompson sampling. We consider values \(\Pr(H_0) \in \{0, 0.25, 0.5, 0.75, 1\}\), as well as the normal approximation and exact binomial version of BRAR. For approximate normal BRAR, a normal prior with mean 0, variance 1, and, in case \(K > 1\), a correlation of 0.5 is used, whereas independent uniform priors are assigned for binomial BRAR. Log odds ratios along with their covariance are estimated with logistic regression and then used as inputs for the normal BRAR method, while the exact method uses counts and sample sizes only. In case a method fails to converge, equal randomization is applied.
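To build intuition for how \(\Pr(H_0)\) interpolates between equal randomization and Thompson sampling, here is a rough Python sketch for the binomial case. It is a deliberate simplification, not the brar implementation: the actual method updates \(\Pr(H_0)\) with the data, whereas here the prior probability is used directly as a fixed mixing weight.

```python
import random

def thompson_probs(successes, trials, draws=10_000, seed=1):
    """Posterior probability that each arm has the highest success
    probability under independent uniform (Beta(1, 1)) priors,
    estimated by Monte Carlo."""
    rng = random.Random(seed)
    wins = [0] * len(successes)
    for _ in range(draws):
        samples = [rng.betavariate(1 + s, 1 + n - s)
                   for s, n in zip(successes, trials)]
        wins[samples.index(max(samples))] += 1
    return [w / draws for w in wins]

def brar_probs(successes, trials, pr_h0):
    """Mixture of equal randomization (weight pr_h0) and Thompson
    sampling (weight 1 - pr_h0); a simplification of the actual method."""
    k1 = len(successes)  # K + 1 groups including control
    ts = thompson_probs(successes, trials)
    return [pr_h0 / k1 + (1 - pr_h0) * t for t in ts]
```

With `pr_h0 = 1` every group receives probability \(1/(K+1)\); with `pr_h0 = 0` the probabilities are pure Thompson sampling probabilities.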

We also consider some modifications of these methods: In some conditions, a “burn-in” phase is carried out during which patients are always randomized with equal probability \(1/(K + 1)\) to each group. For Thompson sampling (\(\Pr(H_0) = 0\)), we consider two further modifications: (i) “capping” of randomization probabilities to \([10\%, 90\%]\), that is, setting randomization probabilities to 10% or 90% if they fall outside this interval, and (ii) power transformations of randomization probabilities, i.e., if \(\pi_k\) is the randomization probability of group \(k\), we take \(\pi_k^* = \pi_k^c / \sum_{j \in \{C,1,\dots,K\}} \pi_j^c\). After capping, randomization probabilities are re-normalized to sum to one (Wathen and Thall, 2017, https://doi.org/10.1177/1740774517692302). This re-normalization is only applied to randomization probabilities greater than 0.1, as capped probabilities would otherwise be pushed below 10% again. If a re-normalized probability nevertheless falls below 10%, it is also capped at 10% and a second re-normalization is performed. For equal randomization (\(\Pr(H_0) = 1\)), no burn-in, capping, or power transformation conditions are simulated, as these manipulations have no effect on equal randomization.
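The capping-with-re-normalization logic and the power transformation can be sketched as follows (an illustrative Python sketch under the description above, not the study's R code):

```python
def power_transform(probs, c):
    """Power transformation pi_k^c, re-normalized to sum to one.
    c = 0 gives equal randomization, c = 1 leaves probs unchanged."""
    powered = [p ** c for p in probs]
    total = sum(powered)
    return [p / total for p in powered]

def cap_and_renormalize(probs, lo=0.1, hi=0.9, tol=1e-12):
    """Cap randomization probabilities to [lo, hi], then re-normalize only
    the uncapped (> lo) probabilities so the vector sums to one again;
    probabilities pushed below lo are capped and re-normalization repeated."""
    probs = [min(max(p, lo), hi) for p in probs]
    for _ in range(10):  # a few passes suffice in practice
        excess = sum(probs) - 1
        if abs(excess) <= tol:
            break
        free = [i for i, p in enumerate(probs) if p > lo + tol]
        if not free:
            break  # all probabilities capped at lo; nothing to adjust
        free_total = sum(probs[i] for i in free)
        for i in free:
            probs[i] -= excess * probs[i] / free_total
        probs = [max(p, lo) for p in probs]  # second capping if needed
    return probs
```

For example, the vector `[0.95, 0.03, 0.02]` is first capped to `[0.9, 0.1, 0.1]` and then only the first entry is reduced so that the probabilities sum to one.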

Performance measures

The following performance measures were used:

Patient benefit
  • The mean rate of successes per study
  • The mean rate of allocations to treatment group 1 (the best group in all conditions where \(\theta_1 > \theta_C\))
  • The mean sample size difference between treatment group 1 and the average sample size in the remaining groups
  • The rate of simulations with 10% sample size imbalance in favor of inferior treatments, i.e., \(\hat{S}_{0.1} = \Pr(\frac{n - n_1}{K} - n_1 > 0.1n)\). For \(K = 1\), this reduces to the \(\hat{S}_{0.1}\) measure from Robertson et al. (2023, https://doi.org/10.1214/22-STS865), which in turn was inspired by the performance evaluation in the simulation study from Thall, Fox, and Wathen (2015, https://doi.org/10.1093/annonc/mdv238).
  • The rate of simulations with extreme randomization probabilities (less than 10% or greater than 90%).
Parameter estimation performance
  • Empirical bias of the estimate of the risk difference between the first treatment group and the control group
  • Empirical coverage of the 95% Wald confidence intervals of the risk difference between the first treatment group and the control group
Hypothesis testing performance
  • Empirical type I error rate and power related to the Wald test regarding the risk difference between the first treatment group and the control group.
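The \(\hat{S}_{0.1}\) imbalance measure compares the average sample size of the \(K\) groups other than treatment group 1 (i.e., \((n - n_1)/K\)) with \(n_1\) itself. A minimal sketch of the per-study indicator (a hypothetical helper, not part of the study code; groups are indexed with the control first):

```python
def imbalance_indicator(n_by_group, frac=0.1):
    """Indicator that one simulated study shows a sample size imbalance of
    more than frac * n in favor of the groups other than treatment group 1:
    (n - n_1) / K - n_1 > frac * n, with n_by_group = [n_C, n_1, ..., n_K]."""
    n = sum(n_by_group)
    n1 = n_by_group[1]  # treatment group 1 (index 0 is the control group)
    others = [m for i, m in enumerate(n_by_group) if i != 1]
    return sum(others) / len(others) - n1 > frac * n
```

Averaging this indicator over all simulation repetitions of a condition yields \(\hat{S}_{0.1}\).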

Each condition was simulated 10’000 times. This ensures an MCSE (Monte Carlo Standard Error) for the type I error rate and power of at most 0.5%. MCSEs were calculated using the formulae from Siepe et al. (2024, https://doi.org/10.1037/met0000695) and are provided for all performance measures in the corresponding Figures and Tables.
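The at-most-0.5% bound follows from the standard binomial MCSE formula for an estimated proportion, \(\sqrt{p(1 - p)/n_{\text{sim}}}\), which is maximized at \(p = 0.5\):

```python
import math

def mcse_proportion(p, nsim):
    """Monte Carlo standard error of an estimated proportion (e.g., type I
    error rate or power) based on nsim independent simulation repetitions."""
    return math.sqrt(p * (1 - p) / nsim)

# worst case p = 0.5 with 10'000 repetitions gives 0.005, i.e., 0.5%
worst_case = mcse_proportion(0.5, 10_000)
```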

Computational aspects

The simulation study was run on a server running Debian GNU/Linux 13 (trixie) and R version 4.5.0 (2025-04-11). The SimDesign R package was used to organize and run the simulation study (https://CRAN.R-project.org/package=SimDesign). The brar package was used to perform BRAR. Session information outputs with more computational details are available under the “Reproducibility” tab. Code and data to reproduce this simulation study are available at https://github.com/SamCH93/brar. The brar package can also be installed from the repository and will be submitted to CRAN in the near future.

Citation

Cite this dashboard as

Pawel, S., Held, L. (2025). Results from Simulation Study on Point Null Bayesian Response-Adaptive Randomization Methods. https://samch93.github.io/brar/