--- title: "Package 'ROCModels'" date: "`r format(Sys.Date(), '%B %d, %Y')`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{'ROCModels'} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- **Type** Package **Title** ROCModels: ROC Models and AUC Estimation for different models **Version** 1.0.0 **Date** `r format(Sys.time(), "%Y-%m-%d")` **Encoding** UTF-8 **Depends** R (>= 2.14) **Imports** ggplot2, kedd, dplyr, survival, nleqslv, HDInterval, MASS, doParallel, foreach, pbivnorm, nor1mix, parallel **Description** The receiver operating characteristic (ROC) curve is one of the most widely used tools for evaluating diagnostic and prognostic biomarkers across diverse scientific fields, particularly in medicine. Despite its ubiquity, ROC estimation and testing methods differ substantially in their assumptions and resulting curve properties. This package provides a unified framework for constructing, visualizing, and comparing parametric, nonparametric, semiparametric, and Bayesian ROC curves. 'ROCModels' helps researchers identify and implement ROC inference methods most suitable for their data. 
**License** GPL **NeedsCompilation** yes **Author** Ruhul Ali Khan [cre, aut] (ORCID: ), Raja Nakka [aut], Musie Ghebremichael [aut] **Maintainer** Ruhul Ali Khan <>, <> **Repository** CRAN **Date/Publication** `r format(Sys.time(), "%Y-%m-%d %H:%M:%S UTC")` **Contents** - [ROCModels-package](#ROCModels-package) - [Data format](#Data-format) - [DMDmodified](#DMDmodified) - [AUC](#AUC) - [empirical](#empirical-empirical-roc) - [order](#order-order-restricted-roc) - [norm_silver](#norm_silver-norm_ucv-bi_silver-bi_ucv-kernel-densitybased-smooth-roc) - [norm_ucv](#norm_silver-norm_ucv-bi_silver-bi_ucv-kernel-densitybased-smooth-roc) - [bi_silver](#norm_silver-norm_ucv-bi_silver-bi_ucv-kernel-densitybased-smooth-roc) - [bi_ucv](#norm_silver-norm_ucv-bi_silver-bi_ucv-kernel-densitybased-smooth-roc) - [binormal](#binormal-binormal-roc-curve) - [biweibull](#biweibull-constant-shape-bi-weibull-roc-curve) - [bigamma](#bigamma-bi-gamma-roc-curve) - [lehmann](#lehmann) - [bayesbiweibull](#bayesbiweibull-bayesian-bi-weibull-roc-curve) - [dpm](#dpm-roc-curve) - [BB](#bb-bayesian-bootstrap-roc-curve)

```{r setup, include=FALSE} knitr::opts_chunk$set( echo = TRUE, message = FALSE, warning = FALSE, fig.width = 7, fig.height = 5 ) ``` # {#ROCModels-package .unnumbered} ------------------------------------------------------------------------ `ROCModels-package` ROCModels ------------------------------------------------------------------------ **Description** The receiver operating characteristic (ROC) curve is a fundamental tool for evaluating diagnostic and prognostic biomarkers, particularly in medical research. However, ROC estimation methods differ substantially in their underlying assumptions, statistical properties, and inferential objectives. The `ROCModels` package offers a unified framework for constructing, visualizing, and comparing ROC curves using a wide range of modeling approaches: **Nonparametric Methods** - **[Empirical](#empirical-empirical-roc)**\ - **[Order-restricted](#order-order-restricted-roc)**\ - **[Biweight kernel with Silverman's bandwidth](#norm_silver-norm_ucv-bi_silver-bi_ucv-kernel-densitybased-smooth-roc)**\ - **[Biweight kernel with unbiased cross-validation (UCV) bandwidth](#norm_silver-norm_ucv-bi_silver-bi_ucv-kernel-densitybased-smooth-roc)**\ - **[Gaussian kernel with Silverman's bandwidth](#norm_silver-norm_ucv-bi_silver-bi_ucv-kernel-densitybased-smooth-roc)**\ - **[Gaussian kernel with UCV bandwidth](#norm_silver-norm_ucv-bi_silver-bi_ucv-kernel-densitybased-smooth-roc)** **Parametric Methods** - **[Binormal](#binormal-binormal-roc-curve)**\ - **[Bi-Weibull](#biweibull-constant-shape-bi-weibull-roc-curve)**\ - **[Bi-Gamma](#bigamma-bi-gamma-roc-curve)** **Semiparametric Method** - **[Lehmann model](#lehmann)** **Bayesian Methods** - **Parametric Bayesian** ([Bayesian Bi-Weibull](#bayesbiweibull-bayesian-bi-weibull-roc-curve)) - **Semiparametric Bayesian** ([Dirichlet process mixture of normals](#dpm-roc-curve)) - **Nonparametric Bayesian** ([Bayesian Bootstrap ROC Curve](#bb-bayesian-bootstrap-roc-curve)) **Except for the 
empirical and order-restricted estimators, all other methods produce smooth ROC curves.**

This package helps researchers identify and implement inference methods most appropriate for their data, promoting transparent, reproducible, and methodologically rigorous ROC analysis.

The implemented methods draw on the following works (full entries are listed under **References**): Alonzo, T. A., and Pepe, M. S. (2002); Andrews, D. F., and Herzberg, A. M. (1985); Bamber, D. (1975); Cox, D. R. (1972); Cox, D. R. (1975); DeLong, E. R., DeLong, D. M., and Clarke-Pearson, D. L. (1988); Dorfman, D. D., and Alf, E. (1969); Dorfman, D. D., Berbaum, K. S., and Metz, C. E. (1997); Erkanli, A., Sung, L., and Stamey, J. D. (2006); Faraggi, D., and Reiser, B. (2002); Ghebremichael, M., and Habtemicael, S. (2018); Ghebremichael, M., and Michael, H. (2024); Ghebremichael, M., Michael, H., Tubbs, J., and Paintsil, E. (2019); Gönen, M., and Heller, G. (2010); Gopalakrishnan, V., Bose, E., Nair, U., Cheng, Y., and Ghebremichael, M. (2020); Green, D. M., and Swets, J. A. (1966, ISBN:0471324205); Gu, J., and Ghosal, S. (2009); Gu, Y., Ghosal, S., and Roy, A. (2008); Guidoum, A. C. (2020); Guo, B. (2015); Hanley, J. A., and McNeil, B. J. (1982); Hsieh, F., and Turnbull, B. W. (1996); Hussain, E. (2012); Ishwaran, H., and James, L. F. (2002); Jokiel-Rokita, A., and Topolnicki, R. (2020); Krzanowski, W. J., and Hand, D. J. (2009); Kundu, D., and Gupta, R. D. (2006); Lloyd, C. J. (1998); Lehmann, E. L. (1953); Metz, C. E., Herman, B. A., and Shen, J. H. (1998); Pepe, M. S. (2003); Pundir, S., and Amala, R. (2014); Silverman, B. W. (2018); Yeo, I. K., and Johnson, R. A. (2000); Zhou, X. H., McClish, D. K., and Obuchowski, N. A. (2009); Zou, K. H., Hall, W. J., and Shapiro, D. E. (1997).

**Details**

The core functionality of the `ROCModels` package centers around the `AUC()` function, which computes the area under the ROC curve (AUC), its confidence interval (CI), and generates the corresponding ROC curve.
Users can choose from a wide variety of modeling approaches, as outlined in the description above. These include parametric, nonparametric, semiparametric, and Bayesian methods. Within each modeling framework, the package supports multiple options for constructing ROC curves and selecting appropriate confidence interval techniques. Subsequent sections of this documentation provide detailed mathematical formulations, implementation specifications, and code examples for each modeling approach and supported CI method. This flexibility allows researchers to tailor ROC estimation and inference to the specific characteristics of their data and scientific objectives, promoting transparent, reproducible, and methodologically sound analysis. **Authors** Ruhul Ali Khan, Raja Nakka, Musie Ghebremichael. Maintainer: Ruhul Ali Khan <>, <> **Abbreviations** The following abbreviations are employed extensively in this package: - **ROC**: receiver operating characteristic - **AUC**: area under the ROC curve - **TPR**: True Positive Rate or Sensitivity - **FPR**: False Positive Rate or 1–Specificity - **PDF**: Probability Density Function - **CDF**: Cumulative Distribution Function - **SE**: Standard Error - **CI**: Confidence Interval (frequentist) or Credible Interval (Bayesian) - **KDE**: Kernel Density Estimation - **HPD**: Highest Posterior Density - **BB**: Bayesian Bootstrap - **DPM**: Dirichlet Process Mixture - **IG**: Inverse-Gamma distribution - **PH**: Proportional Hazards model - **MCMC**: Markov Chain Monte Carlo - **MLE**: Maximum Likelihood Estimation - **MH**: Metropolis–Hastings - **Boot-p**: Bootstrap percentile **References** Alonzo, T. A., & Pepe, M. S. (2002). Distribution-free ROC analysis using binary regression techniques. *Biostatistics*, **3**(3), 421–432. [https://doi.org/10.1093/biostatistics/3.3.421](https://doi.org/10.1093/biostatistics/3.3.421) Andrews, D. F., & Herzberg, A. M. (1985). 
*Data: A Collection of Problems from Many Fields for the Student and Research Worker*. Springer-Verlag, Berlin. [https://doi.org/10.1007/978-1-4612-5098-2](https://doi.org/10.1007/978-1-4612-5098-2) Bamber, D. (1975). The area above the ordinal dominance graph and the area below the receiver operating characteristic graph. *Journal of Mathematical Psychology*, **12**(4), 387–415. [https://doi.org/10.1016/0022-2496(75)90001-2](https://doi.org/10.1016/0022-2496(75)90001-2) Cox, D. R. (1972). Regression models and life-tables. *Journal of the Royal Statistical Society: Series B*, **34**, 187–220. [https://doi.org/10.1111/j.2517-6161.1972.tb00899.x](https://doi.org/10.1111/j.2517-6161.1972.tb00899.x) Cox, D. R. (1975). Partial likelihood. *Biometrika*, **62**, 269–276. [https://doi.org/10.1093/biomet/62.2.269](https://doi.org/10.1093/biomet/62.2.269) Daniel, F., Ooi, H., Calaway, R., Microsoft, & Weston, S. (2022). *doParallel: Foreach Parallel Adaptor for the 'parallel' Package*. R package version 1.0.17. [https://CRAN.R-project.org/package=doParallel](https://CRAN.R-project.org/package=doParallel) Daniel, F., Ooi, H., Calaway, R., Microsoft, & Weston, S. (2022). *foreach: Provides Foreach Looping Construct for R*. R package version 1.5.2. [https://CRAN.R-project.org/package=foreach](https://CRAN.R-project.org/package=foreach) DeLong, E. R., DeLong, D. M., & Clarke-Pearson, D. L. (1988). Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. *Biometrics*, **44**(3), 837–845. [https://doi.org/10.2307/2531595](https://doi.org/10.2307/2531595) Dorfman, D. D., & Alf, E. (1969). Maximum-likelihood estimation of parameters of signal-detection theory and determination of confidence intervals-rating method data. *Journal of Mathematical Psychology*, **6**, 487–496. [https://doi.org/10.1016/0022-2496(69)90019-4](https://doi.org/10.1016/0022-2496(69)90019-4) Dorfman, D. D., Berbaum, K. S., & Metz, C. E. (1997). 
Proper receiver operating characteristic analysis: The bigamma model. *Academic Radiology*, **4**, 138–149. [https://doi.org/10.1016/s1076-6332(97)80013-x](https://doi.org/10.1016/s1076-6332(97)80013-x) Erkanli, A., Sung, L., & Stamey, J. D. (2006). Bayesian semi-parametric ROC curve estimation. *Statistics in Medicine*, **25**, 3905–3928. [https://doi.org/10.1002/sim.2496](https://doi.org/10.1002/sim.2496) Faraggi, D., & Reiser, B. (2002). Estimation of the area under the ROC curve. *Statistics in Medicine*, **21**, 3093–3106. [https://doi.org/10.1002/sim.1228](https://doi.org/10.1002/sim.1228) Ghebremichael, M., & Habtemicael, S. (2018). Effect of tuberculosis on immune restoration among HIV-infected patients receiving antiretroviral therapy. *Journal of Applied Statistics*, **45**(13), 2357–2364. [https://doi.org/10.1080/02664763.2017.1420758](https://doi.org/10.1080/02664763.2017.1420758) Ghebremichael, M., & Michael, H. (2024). Comparison of the binormal and Lehmann receiver operating characteristic curves. *Communications in Statistics—Simulation and Computation*, **53**(2), 772–785. [https://doi.org/10.1080/03610918.2022.2032159](https://doi.org/10.1080/03610918.2022.2032159) Ghebremichael, M., Michael, H., Tubbs, J., & Paintsil, E. (2019). Comparing the diagnostic accuracy of CD4+ T-lymphocyte count and percent as surrogate markers of pediatric HIV disease. *Journal of Mathematics and Statistics*, **15**(1), 55–64. [https://doi.org/10.3844/jmssp.2019.55.64](https://doi.org/10.3844/jmssp.2019.55.64) Gönen, M., & Heller, G. (2010). Lehmann family of ROC curves. *Medical Decision Making*, **30**(4), 509–517. [https://doi.org/10.1177/0272989X09360067](https://doi.org/10.1177/0272989X09360067) Gopalakrishnan, V., Bose, E., Nair, U., Cheng, Y., & Ghebremichael, M. (2020). Pre-HAART CD4+ T-lymphocytes as biomarkers of post-HAART immune recovery in HIV-infected children with or without TB co-infection. *BMC Infectious Diseases*, **20**, 1–8. 
[https://doi.org/10.1186/s12879-020-05458-w](https://doi.org/10.1186/s12879-020-05458-w) Green, D. M., & Swets, J. A. (1966). *Signal Detection Theory and Psychophysics*, Vol. 1., ISBN:0471324205, Wiley, New York. [https://www.semanticscholar.org/paper/b11fa6f41f9bbc17bfe1b94e857ee76b6f0bd7f5](https://www.semanticscholar.org/paper/b11fa6f41f9bbc17bfe1b94e857ee76b6f0bd7f5) Gu, J., & Ghosal, S. (2009). Bayesian ROC curve estimation under binormality using a rank likelihood. *Journal of Statistical Planning and Inference*, **139**(6), 2076–2083. [https://doi.org/10.1016/j.jspi.2008.09.014](https://doi.org/10.1016/j.jspi.2008.09.014) Gu, Y., Ghosal, S., & Roy, A. (2008). Bayesian bootstrap for ROC curve estimation. *Bayesian Analysis*, **3**(3), 659–676. [https://doi.org/10.1002/sim.3366](https://doi.org/10.1002/sim.3366) Guidoum, A. C. (2020). *kedd: Kernel Estimator and Bandwidth Selection for Density and Its Derivatives*. R package. CRAN DOI: [10.32614/CRAN.package.kedd](https://CRAN.R-project.org/package=kedd) arXiv preprint: [https://doi.org/10.48550/arXiv.2012.06102](https://doi.org/10.48550/arXiv.2012.06102) Guo, B. (2015). *On the effect of improperness of binormal ROC curves for estimating full area under the curve*. PhD Thesis, University of Pittsburgh. [https://d-scholarship.pitt.edu/23590/1/Guo_Ben_thesis_12-2014.pdf](https://d-scholarship.pitt.edu/23590/1/Guo_Ben_thesis_12-2014.pdf) Hanley, J. A., & McNeil, B. J. (1982). The meaning and use of the area under a receiver operating characteristic (ROC) curve. *Radiology*, **143**(1), 29–36. [https://doi.org/10.1148/radiology.143.1.7063747](https://doi.org/10.1148/radiology.143.1.7063747) Hasselman, B. (2022). *nleqslv: Solve Systems of Nonlinear Equations*. R package version 3.3.5. [https://CRAN.R-project.org/package=nleqslv](https://CRAN.R-project.org/package=nleqslv) Hsieh, F., & Turnbull, B. W. (1996). Nonparametric and semiparametric estimation of the ROC curve. *Annals of Statistics*, **24**(1), 25–40. 
[https://doi.org/10.1214/aos/1033066197](https://doi.org/10.1214/aos/1033066197) Hussain, E. (2012). The Bi-Gamma ROC Curve in a Straightforward Manner. *Journal of Basic and Applied Sciences*, **8**(2). [https://doi.org/10.6000/1927-5129.2012.08.02.09](https://doi.org/10.6000/1927-5129.2012.08.02.09) Ishwaran, H., & James, L. F. (2002). Approximate Dirichlet process computing in finite normal mixtures. *Journal of Computational and Graphical Statistics*, **11**(3), 508–532. [https://doi.org/10.1198/106186002411](https://doi.org/10.1198/106186002411) Jokiel-Rokita, A., & Topolnicki, R. (2020). Estimation of the ROC curve from the Lehmann family. *Computational Statistics & Data Analysis*, **142**, 106820. [https://doi.org/10.1016/j.csda.2019.106820](https://doi.org/10.1016/j.csda.2019.106820) Kenkel. B., Genz, A. (2015). *pbivnorm: Vectorized Computation of the Bivariate Normal Probabilities*. R package version 0.6.0. [https://CRAN.R-project.org/package=pbivnorm](https://CRAN.R-project.org/package=pbivnorm) Krzanowski, W. J., & Hand, D. J. (2009). *ROC Curves for Continuous Data*. CRC Press. [https://doi.org/10.1201/9781439800225](https://doi.org/10.1201/9781439800225) Kundu, D., & Gupta, R. D. (2006). Estimation of $ P[Y < X] $ for Weibull distributions. *IEEE Transactions on Reliability*, **55**(2), 270–280. [https://doi.org/10.1109/TR.2006.874918](https://doi.org/10.1109/TR.2006.874918) Lloyd, C. J. (1998). Using smoothed receiver operating characteristic curves to summarize and compare diagnostic systems. *Journal of the American Statistical Association*, **93**(444), 1356–1364. [https://doi.org/10.1080/01621459.1998.10473797](https://doi.org/10.1080/01621459.1998.10473797) Lehmann, E. L. (1953). The power of rank tests. *Annals of Mathematical Statistics*, **24**, 23–43. [https://doi.org/10.1214/aoms/1177729080](https://doi.org/10.1214/aoms/1177729080) Maechler, M. (2024). *nor1mix: Normal Mixture Models with One Unknown Component*. R package version 1.2-3. 
[https://CRAN.R-project.org/package=nor1mix](https://CRAN.R-project.org/package=nor1mix) Metz, C. E., Herman, B. A., & Shen, J. H. (1998). Maximum likelihood estimation of receiver operating characteristic (ROC) curves from continuously-distributed data. *Statistics in Medicine*, **17**, 1033–1053. [https://doi.org/10.1002/(SICI)1097-0258(19980515)17:9%3C1033::AID-SIM784%3E3.0.CO;2-Z](https://doi.org/10.1002/(SICI)1097-0258(19980515)17:9%3C1033::AID-SIM784%3E3.0.CO;2-Z) Ngumbang, J., Meredith, M., & Kruschke, J. K. (2023). *HDInterval: Highest (Posterior) Density Intervals*. R package version 0.2.5. [https://CRAN.R-project.org/package=HDInterval](https://CRAN.R-project.org/package=HDInterval) Pepe, M. S. (2003). *The Statistical Evaluation of Medical Tests for Classification and Prediction*. Oxford University Press. [https://doi.org/10.1093/oso/9780198509844.001.0001](https://doi.org/10.1093/oso/9780198509844.001.0001) Pundir, S., & Amala, R. (2014). Evaluation of area under the constant shape bi-Weibull ROC curve. *Journal of Modern Applied Statistical Methods*, **13**(1), 20. [https://doi.org/10.22237/jmasm/1398917940](https://doi.org/10.22237/jmasm/1398917940) R Core Team (2023). *parallel: Support for Parallel Computation in R*. Part of R base distribution. [https://stat.ethz.ch/R-manual/R-devel/library/parallel/doc/parallel.pdf](https://stat.ethz.ch/R-manual/R-devel/library/parallel/doc/parallel.pdf) Silverman, B. W. (2018). *Density Estimation for Statistics and Data Analysis*. Chapman & Hall/CRC. [https://doi.org/10.1201/9781315140919](https://doi.org/10.1201/9781315140919) Therneau, T. M. (2023). *A Package for Survival Analysis in R*. R package version 3.5-7. [https://CRAN.R-project.org/package=survival](https://CRAN.R-project.org/package=survival) Wickham, H. (2016). *ggplot2: Elegant Graphics for Data Analysis*. Springer-Verlag, New York. ISBN 978-3-319-24277-4. 
[https://ggplot2.tidyverse.org](https://ggplot2.tidyverse.org) Wickham, H., François, R., Henry, L., Müller, K. & Vaughan, D. (2023). *dplyr: A Grammar of Data Manipulation*. R package version 1.1.3. [https://CRAN.R-project.org/package=dplyr](https://CRAN.R-project.org/package=dplyr) Yeo, I. K., & Johnson, R. A. (2000). A new family of power transformations to improve normality or symmetry. *Biometrika*, **87**(4), 954–959. [https://doi.org/10.1093/biomet/87.4.954](https://doi.org/10.1093/biomet/87.4.954) Zhou, X. H., McClish, D. K., & Obuchowski, N. A. (2009). *Statistical Methods in Diagnostic Medicine*. John Wiley & Sons. [https://doi.org/10.1002/9780470906514](https://doi.org/10.1002/9780470906514) Zou, K. H., Hall, W. J., & Shapiro, D. E. (1997). Smooth nonparametric receiver operating characteristic (ROC) curves for continuous data. *Statistics in Medicine*, **16**, 2143–2156. [https://doi.org/10.1002/(SICI)1097-0258(19971015)16:19%3C2143::AID-SIM655%3E3.0.CO;2-3](https://doi.org/10.1002/(SICI)1097-0258(19971015)16:19%3C2143::AID-SIM655%3E3.0.CO;2-3) **Installation** To install the `ROCModels` package, ensure your R session is connected to the internet. Then, run the following command in the R console: ```{r install, eval=FALSE} install.packages("ROCModels") ``` Then load the package in R: ```{r load-package, eval=FALSE} library(ROCModels) ``` From this point, the examples below assume that **ROCModels** is loaded. # {#Data-format .unnumbered} ------------------------------------------------------------------------ `Data Format` Preparing Your Dataset for Use with `ROCModels` ------------------------------------------------------------------------ Before using the package, it is essential to format your dataset according to the following guidelines. The main function, `AUC()`, requires a data frame named `data` that contains exactly **two columns**: - `biomarker`: Numeric values representing the diagnostic marker. 
- `status`: Disease status encoded as a character or factor with two levels:
    - `"0"` for **non-diseased** individuals (controls)
    - `"1"` for **diseased** individuals (cases)

These column names and coding conventions must be followed precisely to ensure compatibility with the package’s functions.

# {#DMDmodified .unnumbered}

------------------------------------------------------------------------

`DMDmodified` Default Dataset

------------------------------------------------------------------------

The package includes a built-in dataset so that every method can be tried immediately. It comprises 209 records from women assessed for potential carrier status of Duchenne Muscular Dystrophy (DMD); 75 are identified as carriers and 134 as non-carriers. The dataset includes demographic information and measurements of four serum markers commonly used in clinical screening, which may be elevated in carriers even in the absence of symptoms.

**Variables**

- **OBS**: Observation index
- **HospID**: Hospital identification code
- **AGE**: Age in years
- **M**: Month of examination
- **Y**: Year of examination
- **CK**: Creatine kinase (default biomarker)
- **H**: Hemopexin
- **PK**: Pyruvate kinase
- **LD**: Lactate dehydrogenase
- **Class**: Diagnostic label with two levels: “carrier” (positive) and “normal” (negative)

For demonstration purposes, the original dataset has been filtered to focus on the **CK** biomarker. In this modified version, **CK** is treated as the biomarker and the **Class** column serves as the status indicator with **levels `"0"` and `"1"`**: `"0"` denotes **normal** individuals (non-diseased, controls) and `"1"` denotes **carriers** (diseased, cases). This curated dataset is included in the package under the name `DMDmodified` for illustration purposes, and it follows the required data format for the package.

**Reference**

* Andrews, D. F., & Herzberg, A. M. (1985).
*Data: A Collection of Problems from Many Fields for the Student and Research Worker*. Springer-Verlag, Berlin. [https://doi.org/10.1007/978-1-4612-5098-2](https://doi.org/10.1007/978-1-4612-5098-2)

# {#AUC .unnumbered}

------------------------------------------------------------------------

`AUC` Compute the Area Under the ROC Curve and Plot the ROC Curve

------------------------------------------------------------------------

**Details**

The `AUC()` function is the central component of the `ROCModels` package. It calculates the area under the ROC curve (AUC), estimates its confidence interval (CI), and produces the corresponding ROC plot.

**Usage**

```r
AUC(
  data,
  method,
  ci = TRUE,
  ci_method = "delong",
  siglevel = 0.05,
  boot_iter = 1000,
  seed = 1691
)
```

**Arguments**

- `data` A data frame containing two columns:
    - `biomarker`: numeric values representing the diagnostic marker
    - `status`: character or factor with levels `"0"` (controls) and `"1"` (cases)
- `method` A character string specifying the ROC/AUC modeling approach. Supported options include:
    - `"empirical"` – empirical ROC,
    - `"order"` – ROC curve under **stochastic order constraints**,
    - `"norm_silver"` – kernel ROC with **normal kernel** and **Silverman** bandwidth,
    - `"norm_ucv"` – kernel ROC with **normal kernel** and **UCV** bandwidth,
    - `"bi_silver"` – kernel ROC with **biweight kernel** and **Silverman** bandwidth,
    - `"bi_ucv"` – kernel ROC with **biweight kernel** and **UCV** bandwidth,
    - `"binormal"` – classical binormal ROC model,
    - `"biweibull"` – parametric **bi‑Weibull** ROC,
    - `"bigamma"` – parametric ROC assuming gamma distributions,
    - `"lehmann"` – ROC under the Lehmann alternative,
    - `"bayesbiweibull"` – **Bayesian bi‑Weibull** ROC (MCMC‑based),
    - `"BB"` – **Bayesian bootstrap** ROC,
    - `"dpm"` – **Dirichlet process mixture** based ROC.

    Method names are case-sensitive and must match exactly. Each method is described in detail in later sections.
- `ci` Logical.
If `TRUE` (default), the function computes a confidence interval for the AUC (a credible interval for the Bayesian methods).
- `ci_method` Specifies the type of interval estimation:
    - `"delong"` – DeLong’s variance-based normal approximation
    - `"bootstrap"` – nonparametric bootstrap interval
    - `"hm"` – Hanley–McNeil variance-based interval
    - `"mle"` – likelihood-based interval
    - `"all"` – computes all applicable interval types for the selected method

    Not all CI methods are compatible with every model. Each method has a default CI approach, and compatibility is discussed in the corresponding documentation sections.
- `siglevel` Significance level $\alpha$ for the confidence interval. The corresponding confidence level is $1 - \alpha$. For example, `siglevel = 0.05` yields a 95% interval.
- `boot_iter` Number of bootstrap resamples. It is used only when `ci_method = "bootstrap"` or `ci_method = "all"`. Larger values give more stable intervals but increase computation time.
- `seed` Integer seed for the random number generator, used for reproducibility.

**Value**

The primary behavior of the `AUC()` function is to:

1. Display the **AUC estimate**
2. Print one or more **confidence intervals**
3. Return the ROC curve for the selected method as a **ggplot object**

The exact structure of the returned object may vary depending on the chosen model. For typical usage, with the result stored as `auc <- AUC(...)`:

- Use `auc$summary` to access the printed output
- Use `auc$plot` to retrieve the ROC curve visualization

**Examples**

```r
# Load the pre-formatted dataset shipped with the package
data(DMDmodified)

# Compute the AUC summary and the ROC plot
auc <- AUC(
  data = DMDmodified,
  method = "empirical",
  ci = TRUE
)

# Get the AUC summary
message(paste(auc$summary))

# Get the ROC plot
auc$plot
```

Next we describe, at a high level, the methods invoked by the `method` argument.
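Before the method-specific sections, the sketch below shows one way to bring user data into the two-column format that `AUC()` expects (`biomarker` and `status` with levels `"0"` and `"1"`). The raw column names `ck_level` and `group` and the values are hypothetical and serve only as an illustration; the block uses base R and does not call the package.

```r
# Hypothetical raw data: the column names and values are illustrative only
raw <- data.frame(
  ck_level = c(52, 20, 28, 30, 40, 167, 104, 30),
  group    = c("normal", "normal", "normal", "normal", "normal",
               "carrier", "carrier", "carrier"),
  stringsAsFactors = FALSE
)

# Recode into the layout described under "Data Format":
# a numeric `biomarker` column and a two-level `status` factor
data <- data.frame(
  biomarker = as.numeric(raw$ck_level),
  status    = factor(ifelse(raw$group == "carrier", "1", "0"),
                     levels = c("0", "1"))
)

# Basic checks before passing `data` to AUC()
stopifnot(
  identical(names(data), c("biomarker", "status")),
  is.numeric(data$biomarker),
  identical(levels(data$status), c("0", "1")),
  !any(is.na(data))
)
```

A data frame that passes these checks should match the documented input format; whether `AUC()` performs additional validation is not stated here.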
# {#empirical-empirical-roc .unnumbered}

------------------------------------------------------------------------

`empirical` Empirical ROC

------------------------------------------------------------------------

To apply this method, set `method = "empirical"` in the `AUC()` function. The following options are available for `ci_method`:

- `"delong"` – DeLong’s variance-based normal approximation
- `"bootstrap"` – nonparametric bootstrap percentile method
- `"hm"` – Hanley–McNeil variance-based interval
- `"all"` – computes all applicable interval types for the selected method

**Usage**

```r
AUC(
  data = data,
  method = "empirical",
  ci = TRUE,
  ci_method = "delong",
  siglevel = 0.05,
  boot_iter = 1000
)
```

**Description**

The empirical ROC method is a fully nonparametric approach that makes **no assumptions about the underlying distribution** of the biomarker in either group. It is based on the Mann–Whitney U statistic, including adjustments for tied values, and provides a widely accepted estimate of both the ROC curve and the AUC.

The empirical ROC curve is defined as:

\[ ROC_{\text{emp}}(t) = 1 - G_n\left(F_m^{-1}(1 - t)\right), \quad \text{for } 0 < t < 1 \]

where $F_m$ and $G_n$ are the empirical distribution functions of the biomarker in controls and cases, respectively. The corresponding AUC estimator is:

\[ \widehat{\mathrm{AUC}}_{\text{emp}} = \frac{1}{mn} \sum_{i=1}^m \sum_{j=1}^n \left[ I(X_i < Y_j) + \frac{1}{2} I(X_i = Y_j) \right] \]

where $X_1, \dots, X_m$ are biomarker values from controls and $Y_1, \dots, Y_n$ are from cases.

**This method produces a jagged, step-like ROC curve. For small datasets, the curve may appear more irregular and less stable.**

**Example**

```r
# Load the formatted dataset
data(DMDmodified)

# Compute AUC summary and ROC plot
auc <- AUC(
  data = DMDmodified,
  method = "empirical",
  ci = TRUE,
  ci_method = "delong",
  siglevel = 0.05,
  boot_iter = 1000
)

# Display AUC summary
message(paste(auc$summary))

# Display ROC plot
auc$plot
```

**References**

* Bamber, D. (1975).
The area above the ordinal dominance graph and the area below the receiver operating characteristic graph. *Journal of Mathematical Psychology*, **12**(4), 387–415. [https://doi.org/10.1016/0022-2496(75)90001-2](https://doi.org/10.1016/0022-2496(75)90001-2) * DeLong, E. R., DeLong, D. M., & Clarke-Pearson, D. L. (1988). Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. *Biometrics*, **44**(3), 837–845. [https://doi.org/10.2307/2531595](https://doi.org/10.2307/2531595) * Hanley, J. A., & McNeil, B. J. (1982). The meaning and use of the area under a receiver operating characteristic (ROC) curve. *Radiology*, **143**(1), 29–36. [https://doi.org/10.1148/radiology.143.1.7063747](https://doi.org/10.1148/radiology.143.1.7063747) * Hsieh, F., & Turnbull, B. W. (1996). Nonparametric and semiparametric estimation of the ROC curve. *Annals of Statistics*, **24**(1), 25–40. [https://doi.org/10.1214/aos/1033066197](https://doi.org/10.1214/aos/1033066197) * Pepe, M. S. (2003). *The Statistical Evaluation of Medical Tests for Classification and Prediction*. Oxford University Press. [https://doi.org/10.1093/oso/9780198509844.001.0001](https://doi.org/10.1093/oso/9780198509844.001.0001) # {#order-order-restricted-roc .unnumbered} ------------------------------------------------------------------------ # `order` Order-Restricted ROC ------------------------------------------------------------------------ To apply this method, set `method = "order"` in the `AUC()` function. 
The following options are available for `ci_method`:

* `"bootstrap"` – nonparametric bootstrap percentile interval (recommended default)
* `"delong"` – DeLong’s variance-based normal approximation (large samples)
* `"hm"` – Hanley–McNeil variance-based interval (large samples)
* `"all"` – computes all applicable interval types for the selected method

**Usage**

```r
AUC(
  data = data,
  method = "order",
  ci = TRUE,
  ci_method = "bootstrap",
  siglevel = 0.05,
  boot_iter = 1000
)
```

**Description**

For a useful binary classifier, the true positive rate (TPR) should be greater than or equal to the false positive rate (FPR) across all thresholds. Geometrically, this places the ROC curve on or above the diagonal, with the AUC lying between 0.5 (random allocation) and 1 (perfect classification). In practice, however, the empirical distribution functions $F_m$ and $G_n$ obtained from finite samples may not respect this order because of sampling variability.

To address this, order-restricted ROC methods enforce the constraint $\overline{G}_n(u) \ge \overline{F}_m(u)$ for all $u$, where $\overline{F}_m = 1 - F_m$ and $\overline{G}_n = 1 - G_n$ are the corresponding survival functions. The constraint is biologically and theoretically reasonable, and it leads to smoother, more stable ROC curves and more accurate AUC estimates.

Jokiel-Rokita and Topolnicki (2020) extended the order-restriction methodology to ROC estimation. Let $F_m$ and $G_n$ be the empirical distribution functions of the controls and cases, respectively. Define the empirical distribution function of the combined sample

\[ P_{mn}(t) = \frac{m}{m+n} F_m(t) + \frac{n}{m+n} G_n(t), \]

and the order-restricted estimators

\[ F_{mn}(t) = \max\{ F_m(t), P_{mn}(t) \}, \qquad G_{mn}(t) = \min\{ G_n(t), P_{mn}(t) \}. \]

The order-restricted ROC curve is then defined by

\[ ROC_{\text{or}}(t) = 1 - G_{mn}\left( F_{mn}^{-1}(1 - t) \right), \quad 0 < t < 1, \]

where $F_{mn}^{-1}$ denotes the (generalized) inverse of $F_{mn}$.
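The definitions above can be traced step by step in base R. The following is a minimal sketch, not the package's internal implementation: the function names `order_restricted_roc()` and `trapezoid()`, the evaluation grid, the use of $\inf\{x : F_{mn}(x) \ge p\}$ as the inverse, and the small data vectors are all assumptions made for illustration. The Mann–Whitney form of the empirical AUC is computed alongside for comparison.

```r
# Order-restricted ROC from the definitions above (illustrative sketch, base R only)
order_restricted_roc <- function(x, y, t_grid = seq(0.001, 0.999, by = 0.001)) {
  m <- length(x)                       # number of controls
  n <- length(y)                       # number of cases
  z <- sort(unique(c(x, y)))           # jump points of all step functions

  Fm  <- ecdf(x)(z)                    # empirical CDF of controls
  Gn  <- ecdf(y)(z)                    # empirical CDF of cases
  Pmn <- (m * Fm + n * Gn) / (m + n)   # pooled empirical CDF

  Fmn <- pmax(Fm, Pmn)                 # order-restricted control CDF
  Gmn <- pmin(Gn, Pmn)                 # order-restricted case CDF

  # Generalized inverse: first jump point at which Fmn reaches p = 1 - t
  idx <- vapply(1 - t_grid,
                function(p) which(Fmn >= p - 1e-12)[1],
                integer(1))
  data.frame(fpr = t_grid, tpr = 1 - Gmn[idx])
}

# Trapezoidal rule for the area under a piecewise-linear curve
trapezoid <- function(u, v) sum(diff(u) * (head(v, -1) + tail(v, -1)) / 2)

# Hypothetical biomarker values (controls x, cases y)
x <- c(1.2, 2.5, 3.1, 0.8, 2.0, 4.4)
y <- c(2.9, 3.8, 1.9, 5.2, 4.7)

roc_or <- order_restricted_roc(x, y)
auc_or <- trapezoid(c(0, roc_or$fpr, 1), c(0, roc_or$tpr, 1))

# Empirical (Mann-Whitney) AUC with the tie correction, for comparison
auc_emp <- mean(outer(x, y, "<") + 0.5 * outer(x, y, "=="))
```

Because the curve is evaluated on a finite grid and integrated numerically, `auc_or` is only an approximation of the integral of the step function; a finer grid reduces the discretization error.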
The area under the order-restricted ROC curve is defined as

\[ \widehat{\mathrm{AUC}}_{\text{or}} = \int_0^1 ROC_{\text{or}}(t) \, dt, \]

where $ROC_{\text{or}}(t)$ is the estimated order-restricted ROC curve.

Under suitable regularity conditions, the asymptotic distributions of $\widehat{\mathrm{AUC}}_{\text{or}}$ and $\widehat{\mathrm{AUC}}_{\text{emp}}$ are equivalent. Consequently, for **large sample sizes**, variance approximations developed for the empirical AUC, such as those of Hanley & McNeil or DeLong, can also be used for the order-restricted AUC as a large-sample approximation.

This nonparametric method still produces a jagged, step-like ROC curve, but one that is slightly smoother than the empirical ROC curve. It is particularly useful when:

* There is prior knowledge that the classifier should satisfy the usual stochastic dominance;
* More stable AUC estimates are desired in small to moderate sample sizes.

**Example**

```r
# Load the formatted dataset
data(DMDmodified)

# Compute order-restricted AUC summary and ROC plot
auc <- AUC(
  data = DMDmodified,
  method = "order",
  ci = TRUE,
  ci_method = "bootstrap",
  siglevel = 0.05,
  boot_iter = 1000
)

# Display AUC summary
message(paste(auc$summary))

# Display order-restricted ROC plot
auc$plot
```

**References**

* Jokiel-Rokita, A., & Topolnicki, R. (2020). Estimation of the ROC curve from the Lehmann family. *Computational Statistics & Data Analysis*, **142**, 106820.
[https://doi.org/10.1016/j.csda.2019.106820](https://doi.org/10.1016/j.csda.2019.106820)

# {#norm_silver-norm_ucv-bi_silver-bi_ucv-kernel-densitybased-smooth-roc .unnumbered}

------------------------------------------------------------------------

# `norm_silver`, `norm_ucv`, `bi_silver`, `bi_ucv` Kernel Density–Based Smooth ROC

------------------------------------------------------------------------

To apply a kernel density–based ROC method, set `method` to one of the following options in the `AUC()` function:

* `"norm_silver"` – Gaussian kernel with Silverman bandwidth
* `"norm_ucv"` – Gaussian kernel with unbiased cross-validation (UCV) bandwidth
* `"bi_silver"` – Biweight kernel with Silverman bandwidth
* `"bi_ucv"` – Biweight kernel with UCV bandwidth

Each combination defines both the kernel function $K(\cdot)$ and the bandwidth selection rule used to smooth the ROC curve.

The following options are available for `ci_method`:

* `"delong"` – DeLong’s variance-based normal approximation
* `"bootstrap"` – nonparametric bootstrap percentile method
* `"all"` – computes all applicable interval types for the selected method

**Usage**

```r
AUC(
  data = data,
  method = "norm_silver",
  ci = TRUE,
  ci_method = "bootstrap",
  siglevel = 0.05,
  boot_iter = 1000
)
```

**Description**

To obtain a smoother and more interpretable nonparametric estimate, **kernel density estimation (KDE)** can be used to estimate the underlying distribution functions of the marker in each group. The resulting ROC curve is continuous and differentiable, offering both interpretability and visual smoothness.

**Kernel Density Estimation**

Let $X_1, \ldots, X_m$ denote marker values from controls and $Y_1, \ldots, Y_n$ from cases.
The kernel density estimators are given by
\[
\hat{f}(x) = \frac{1}{m h_m} \sum_{i=1}^m K\left(\frac{x - X_i}{h_m}\right), \quad
\hat{g}(x) = \frac{1}{n h_n} \sum_{i=1}^n K\left(\frac{x - Y_i}{h_n}\right),
\]
where $h_m$ and $h_n$ are the bandwidths controlling smoothness, and $K(\cdot)$ is a kernel function that integrates to one. The corresponding cumulative distribution estimators are
\[
\hat{F}(t) = \int_{-\infty}^{t} \hat{f}(x) \, dx, \qquad \hat{G}(t) = \int_{-\infty}^{t} \hat{g}(x) \, dx.
\]
Then the **kernel-smoothed ROC curve** is defined as
\[
\label{kde_roc}
\widehat{ROC}_{kde}(t) = 1 - \hat{G}\left(\hat{F}^{-1}(1 - t)\right), \quad 0 < t < 1,
\]
where $\hat{F}^{-1}(t) = \inf\{x : \hat{F}(x) \ge t\}$.

**Bandwidth and Kernel Selection**

Choosing an appropriate bandwidth and kernel is crucial for balancing bias and variance:

* **Small bandwidths** yield curves that closely follow the empirical data (high variance but low bias).
* **Large bandwidths** produce overly smooth curves (low variance but potentially biased).

Two bandwidth selection methods are available:

* **Silverman’s rule** of thumb, providing a quick, general-purpose bandwidth.
* **Unbiased cross-validation (UCV)**, which chooses the bandwidth that minimizes a cross-validation estimate of the integrated squared error of the density estimate.

Available kernel functions include:

* **Gaussian kernel:** $K(t) = (2\pi)^{-1/2} e^{-t^2/2}, \quad t \in (-\infty, \infty)$
* **Biweight kernel:** $K(t) = \tfrac{15}{16}(1 - t^2)^2, \quad t \in [-1, 1]$

There is no universally optimal kernel, but the Gaussian and biweight kernels are widely used and perform robustly across diverse data conditions.

**AUC Estimation**

The area under the kernel-smoothed ROC curve is given by
\[
\label{auc_kde}
\widehat{\mathrm{AUC}}_{kde} = \int_0^1 \widehat{ROC}_{kde}(t) \, dt,
\]
which is evaluated numerically using the **trapezoidal rule**.
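The definitions above can be sketched directly in base R for the Gaussian kernel, where the smoothed CDF is an average of normal CDFs. This is a minimal sketch under stated assumptions: the data are simulated, the helper names (`F_hat`, `G_hat`, `F_inv`) are hypothetical, and `stats::bw.nrd0()` is used as one common implementation of Silverman's rule of thumb, which may differ from the bandwidth the package computes internally.

```r
# Sketch only: simulated data; helper names are hypothetical.
set.seed(2)
x <- rnorm(50)              # controls
y <- rnorm(40, mean = 1.2)  # cases

# Rule-of-thumb bandwidths (assumption: bw.nrd0 stands in for Silverman's rule)
hx <- bw.nrd0(x)
hy <- bw.nrd0(y)

# Gaussian-kernel CDF estimators: averages of normal CDFs centred at the data
F_hat <- function(t) vapply(t, function(u) mean(pnorm((u - x) / hx)), numeric(1))
G_hat <- function(t) vapply(t, function(u) mean(pnorm((u - y) / hy)), numeric(1))

# Numerical inverse of F_hat by root finding on a wide bracket
F_inv <- function(p) {
  lower <- min(x) - 10 * hx
  upper <- max(x) + 10 * hx
  vapply(p, function(q) {
    uniroot(function(u) F_hat(u) - q, lower = lower, upper = upper, tol = 1e-8)$root
  }, numeric(1))
}

# Kernel-smoothed ROC on a grid of false positive rates
t_grid <- seq(0.001, 0.999, length.out = 199)
roc_kde <- 1 - G_hat(F_inv(1 - t_grid))

# Trapezoidal rule, anchoring the curve at (0, 0) and (1, 1)
tt <- c(0, t_grid, 1)
rr <- c(0, roc_kde, 1)
auc_kde <- sum(diff(tt) * (head(rr, -1) + tail(rr, -1)) / 2)
auc_kde
```

A finer grid reduces the trapezoidal error at the cost of more root-finding calls.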
The variance of $\widehat{\mathrm{AUC}}_{kde}$ can be estimated using **bootstrap resampling**, which accounts for uncertainty in both the kernel estimation and the sampling process. Zou et al. (1997) stated that the smoothing introduced by KDE has a negligible effect on the first-order variance of $\widehat{\mathrm{AUC}}_{kde}$. To ensure confidence intervals remain within the $(0,1)$ range, a log-transformation is recommended:
\[
-\log(1 - \widehat{\mathrm{AUC}}_{kde}),
\]
with intervals constructed on this transformed scale and then back-transformed for interpretation.

As a nonparametric method, the **kernel-smoothed ROC curve** provides several advantages:

* Reduces sampling irregularities inherent in the empirical ROC.
* Maintains a nonparametric framework, with no specific distributional assumptions.
* Produces a smoother curve.

While flexible, kernel-based ROC methods also have several limitations:

1. **Boundary bias:** KDE performs poorly near FPR values close to 0 or 1.
2. **Bandwidth sensitivity:** Requires careful tuning of $h_m$ and $h_n$.
3. **Computational cost:** Bootstrapping smooth ROC curves can be intensive for large datasets.

**Example**

```r
# Load formatted dataset
data(DMDmodified)

# Compute smooth ROC using Gaussian kernel and Silverman bandwidth
auc <- AUC(
  data = DMDmodified,
  method = "norm_silver",
  ci = TRUE,
  ci_method = "bootstrap",
  siglevel = 0.05,
  boot_iter = 1000
)

# Display AUC summary
message(paste(auc$summary))

# Display smooth ROC plot
auc$plot
```

**References**

* Zou, K. H., Hall, W. J., & Shapiro, D. E. (1997). Smooth nonparametric receiver operating characteristic (ROC) curves for continuous data. *Statistics in Medicine*, **16**, 2143–2156. [https://doi.org/10.1002/(sici)1097-0258(19971015)16:19%3C2143::aid-sim655%3E3.0.co;2-3](https://doi.org/10.1002/(sici)1097-0258(19971015)16:19%3C2143::aid-sim655%3E3.0.co;2-3)
* Lloyd, C. J. (1998). Using smoothed receiver operating characteristic curves to summarize and compare diagnostic systems.
*Journal of the American Statistical Association*, **93**(444), 1356–1364. [https://doi.org/10.1080/01621459.1998.10473797](https://doi.org/10.1080/01621459.1998.10473797)
* Silverman, B. W. (2018). *Density Estimation for Statistics and Data Analysis*. Chapman & Hall/CRC. [https://doi.org/10.1201/9781315140919](https://doi.org/10.1201/9781315140919)

# {#binormal-binormal-roc-curve .unnumbered}

------------------------------------------------------------------------

# `binormal` Binormal ROC Curve

------------------------------------------------------------------------

To apply this method, set `method = "binormal"` in the `AUC()` function. The following options are available for `ci_method`:

* `"mle"` – likelihood-based interval
* `"bootstrap"` – parametric bootstrap percentile interval
* `"all"` – computes both likelihood-based and bootstrap intervals

**Usage**

```r
AUC(
  data = data,
  method = "binormal",
  ci = TRUE,
  ci_method = "bootstrap",
  siglevel = 0.05,
  boot_iter = 1000
)
```

**Description**

The **bi-normal ROC model** is one of the most widely used **parametric approaches** to ROC analysis. It assumes that biomarker values for both the non-diseased population (distribution $F$) and the diseased population (distribution $G$) are normally distributed, with potentially **different means and variances**. The model assumes:
\[
F(x) = \Phi\left( \frac{x - \mu_0}{\sigma_0} \right), \quad G(y) = \Phi\left( \frac{y - \mu_1}{\sigma_1} \right),
\]
where $\Phi(\cdot)$ is the standard normal CDF, and $\mu_0, \sigma_0^2, \mu_1, \sigma_1^2$ are the means and variances for the two groups. Defining
\[
a = \frac{\mu_1 - \mu_0}{\sigma_1}, \qquad b = \frac{\sigma_0}{\sigma_1},
\]
the **bi-normal ROC curve** can be expressed as
\[
ROC_{\text{Bin}}(t) = \Phi\left( a + b\,\Phi^{-1}(t) \right),
\]
where $\Phi^{-1}(\cdot)$ is the quantile function of the standard normal distribution. The corresponding AUC is given by
\[
AUC_{\text{Bin}} = \Phi\left( \frac{a}{\sqrt{1 + b^2}} \right).
\]

The parameters $\mu_0, \sigma_0, \mu_1, \sigma_1$ are estimated using **maximum likelihood estimation (MLE)**, assuming normality. MLE provides asymptotically efficient estimates and allows for likelihood-based confidence intervals on the AUC. Because the AUC has a closed-form expression, confidence intervals can be obtained using either:

* **Analytical variance formulas** (`ci_method = "mle"`) derived from the delta method (via MLE covariance estimates), or
* **Bootstrap resampling** (`ci_method = "bootstrap"`) for more robust inference under small samples or mild deviations from normality.

**Example**

```r
# Load formatted dataset
data(DMDmodified)

# Compute bi-normal AUC summary and ROC plot
auc <- AUC(
  data = DMDmodified,
  method = "binormal",
  ci = TRUE,
  ci_method = "mle",
  siglevel = 0.05,
  boot_iter = 1000
)

# Display AUC summary
message(paste(auc$summary))

# Display bi-normal ROC plot
auc$plot
```

**References**

* Dorfman, D. D., & Alf, E. (1969). Maximum-likelihood estimation of parameters of signal-detection theory and determination of confidence intervals—rating method data. *Journal of Mathematical Psychology*, **6**, 487–496. [https://doi.org/10.1016/0022-2496(69)90019-4](https://doi.org/10.1016/0022-2496(69)90019-4)
* Hsieh, F., & Turnbull, B. W. (1996). Nonparametric and semiparametric estimation of the ROC curve. *Annals of Statistics*, **24**(1), 25–40. [https://doi.org/10.1214/aos/1033066197](https://doi.org/10.1214/aos/1033066197)
* Metz, C. E., Herman, B. A., & Shen, J. H. (1998). Maximum likelihood estimation of receiver operating characteristic (ROC) curves from continuously-distributed data. *Statistics in Medicine*, **17**, 1033–1053. [https://doi.org/10.1002/(sici)1097-0258(19980515)17:9%3C1033::aid-sim784%3E3.0.co;2-z](https://doi.org/10.1002/(sici)1097-0258(19980515)17:9%3C1033::aid-sim784%3E3.0.co;2-z)
* Faraggi, D., & Reiser, B. (2002). Estimation of the area under the ROC curve. *Statistics in Medicine*, **21**, 3093–3106.
[https://doi.org/10.1002/sim.1228](https://doi.org/10.1002/sim.1228)
* Yeo, I. K., & Johnson, R. A. (2000). A new family of power transformations to improve normality or symmetry. *Biometrika*, **87**(4), 954–959. [https://doi.org/10.1093/biomet/87.4.954](https://doi.org/10.1093/biomet/87.4.954)

# {#biweibull-constant-shape-bi-weibull-roc-curve .unnumbered}

------------------------------------------------------------------------

# `biweibull` Constant-shape bi-Weibull ROC Curve

------------------------------------------------------------------------

To apply this method, set `method = "biweibull"` in the `AUC()` function. The following options are available for `ci_method`:

* `"mle"` – likelihood-based interval
* `"bootstrap"` – parametric bootstrap percentile interval
* `"all"` – computes both likelihood-based and bootstrap intervals

**Usage**

```r
AUC(
  data = data,
  method = "biweibull",
  ci = TRUE,
  ci_method = "mle",
  siglevel = 0.05,
  boot_iter = 1000
)
```

**Description**

The **constant-shape bi-Weibull model** is a flexible **parametric model** for ROC analysis. Let $X$ and $Y$ denote biomarker values for non-diseased and diseased subjects, respectively. Assume that both follow Weibull distributions with a **common shape parameter** $\alpha$, but possibly different **scale parameters** $\theta_0$ and $\theta_1$. Then the **bi-Weibull ROC curve** is
\[
ROC_{\text{Biw}}(t) = t^{\frac{\theta_0}{\theta_1}}, \quad t \in (0,1).
\]
The corresponding AUC has a simple **closed-form expression**, given by
\[
AUC_{\text{Biw}} = \frac{\theta_1}{\theta_0 + \theta_1}.
\]
The model parameters $\alpha, \theta_0, \theta_1$ are typically estimated via **maximum likelihood estimation (MLE)**, which provides consistent and efficient estimators under the Weibull assumption.
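The closed-form expressions above are easy to check numerically. The sketch below uses hypothetical parameter values and assumes the parameterization in which the Weibull survivor function is $\exp(-x^{\alpha}/\theta)$, which is the form consistent with the ROC and AUC formulas given here; under that assumption the `scale` argument of `rweibull()` corresponds to $\theta^{1/\alpha}$.

```r
# Sketch only: parameter values and object names are hypothetical.
alpha  <- 2   # common shape parameter
theta0 <- 1   # scale parameter, non-diseased group
theta1 <- 3   # scale parameter, diseased group

# Closed-form ROC curve and AUC
roc_biw <- function(t) t^(theta0 / theta1)
auc_closed <- theta1 / (theta0 + theta1)

# Numerical integration of the ROC curve over (0, 1)
auc_numeric <- integrate(roc_biw, lower = 0, upper = 1)$value

# Monte Carlo check of P(Y > X), assuming survivor function exp(-x^alpha / theta);
# rweibull() uses exp(-(x / scale)^shape), so scale = theta^(1 / alpha)
set.seed(5)
x <- rweibull(1e5, shape = alpha, scale = theta0^(1 / alpha))
y <- rweibull(1e5, shape = alpha, scale = theta1^(1 / alpha))
auc_mc <- mean(y > x)

c(closed = auc_closed, numeric = auc_numeric, monte_carlo = auc_mc)
```

The three values should agree up to integration and simulation error, and the AUC does not depend on the common shape $\alpha$.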
Confidence intervals are then calculated using

* **Analytical variance formulas** (`ci_method = "mle"`) via asymptotic normality of MLEs, or
* **Bootstrap resampling** (`ci_method = "bootstrap"`) for more robust inference under small samples.

The **bi-Weibull model** assumes that biomarker values for the non-diseased and diseased populations each follow Weibull distributions. Owing to its adaptable shape parameter, the Weibull family can approximate several common distributions (including the **exponential**, **Rayleigh**, and even **log-normal-like** forms), making it particularly effective for modeling **skewed or heavy-tailed biomedical data**, where symmetric distributions such as the normal may perform poorly. When the Weibull distributional assumption holds, this parametric model can also perform well for small sample sizes.

**Example**

```r
# Load formatted dataset
data(DMDmodified)

# Compute bi-Weibull AUC summary and ROC plot
auc <- AUC(
  data = DMDmodified,
  method = "biweibull",
  ci = TRUE,
  ci_method = "mle",
  siglevel = 0.05,
  boot_iter = 1000
)

# Display AUC summary
message(paste(auc$summary))

# Display bi-Weibull ROC plot
auc$plot
```

**References**

* Pundir, S., & Amala, R. (2014). Evaluation of area under the constant shape bi-Weibull ROC curve. *Journal of Modern Applied Statistical Methods*, **13**(1), 20. [https://doi.org/10.22237/jmasm/1398917940](https://doi.org/10.22237/jmasm/1398917940)
* Kundu, D., & Gupta, R. D. (2006). Estimation of $P[Y < X]$ for Weibull distributions. *IEEE Transactions on Reliability*, **55**(2), 270–280. [https://doi.org/10.1109/TR.2006.874918](https://doi.org/10.1109/TR.2006.874918)
* Khan, R. A., & Ghebremichael, M. (2025). Comparing estimation methods for the area under the bi‐Weibull ROC curve. *Pharmaceutical Statistics*, **24**(5), e70038.
[https://doi.org/10.1002/pst.70038](https://doi.org/10.1002/pst.70038)

# {#bigamma-bi-gamma-roc-curve .unnumbered}

------------------------------------------------------------------------

# `bigamma` Bi-Gamma ROC Curve

------------------------------------------------------------------------

To apply this method, set `method = "bigamma"` in the `AUC()` function. The option `ci_method = "bootstrap"` refers to the computation of the parametric bootstrap percentile interval, which is the only available option for the bi-Gamma model.

**Usage**

```r
AUC(
  data = data,
  method = "bigamma",
  ci = TRUE,
  ci_method = "bootstrap",
  siglevel = 0.05,
  boot_iter = 1000
)
```

**Description**

In the **bi-Gamma ROC model**, both populations are assumed to follow independent Gamma distributions but with potentially different **shape** and **scale** parameters, allowing for flexible modeling of skewness and dispersion. Let

* $X \sim \mathrm{Gamma}(k_1, \theta_1)$: biomarker values in the **non-diseased** group, and
* $Y \sim \mathrm{Gamma}(k_2, \theta_2)$: biomarker values in the **diseased** group,

where $k_1, k_2$ are shape parameters and $\theta_1, \theta_2$ are scale parameters. The probability density functions (PDFs) are:
\[
f(x; k_1, \theta_1) = \frac{1}{\Gamma(k_1)\theta_1^{k_1}}x^{k_1-1} e^{-x / \theta_1}, \quad x > 0,
\]
\[
g(y; k_2, \theta_2) = \frac{1}{\Gamma(k_2)\theta_2^{k_2}}y^{k_2-1} e^{-y / \theta_2}, \quad y > 0.
\]
The **bi-Gamma ROC curve** is given by
\[
ROC_{\text{Big}}(t) = 1 - \frac{\gamma\left(k_2, \frac{\theta_1}{\theta_2}\gamma^{-1}(k_1, 1 - t)\right)}{\Gamma(k_2)},
\]
where $\gamma(a, \cdot)$ is the lower incomplete Gamma function and $\gamma^{-1}(k_1, p)$ denotes its inverse on the probability scale, that is, the value $z$ satisfying $\gamma(k_1, z)/\Gamma(k_1) = p$.
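In R, the bi-Gamma ROC curve can be evaluated without handling incomplete Gamma functions explicitly, because `qgamma()` and `pgamma()` already provide the quantile and distribution functions. The sketch below uses hypothetical parameter values, integrates the curve numerically, and compares the result with the F-distribution form of $P(X < Y)$ for independent Gamma variables; the object names are assumptions made for this example.

```r
# Sketch only: parameter values and object names are hypothetical.
k1 <- 2; theta1 <- 1     # shape and scale, non-diseased group
k2 <- 3; theta2 <- 1.5   # shape and scale, diseased group

# ROC(t) = 1 - G(F^{-1}(1 - t)) with Gamma distribution functions F and G
roc_big <- function(t) {
  cutoff <- qgamma(1 - t, shape = k1, scale = theta1)
  1 - pgamma(cutoff, shape = k2, scale = theta2)
}

# Area under the curve by numerical integration
auc_numeric <- integrate(roc_big, lower = 0, upper = 1)$value

# P(X < Y) via the F distribution with 2 * k1 and 2 * k2 degrees of freedom
auc_f <- pf((k2 * theta2) / (k1 * theta1), df1 = 2 * k1, df2 = 2 * k2)

c(numeric = auc_numeric, f_distribution = auc_f)
```

Agreement between the two values is a quick sanity check on the parameter ordering (shape versus scale, diseased versus non-diseased).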
The area under the bi-Gamma ROC curve $(AUC_{\text{Big}})$ can be expressed as
\[
\label{AUC_gam_F}
AUC_{\text{Big}} = F_F\left(\frac{k_2 \theta_2}{k_1 \theta_1};\, 2k_1, 2k_2\right),
\]
where $F_F(\cdot; 2k_1, 2k_2)$ denotes the CDF of an **F-distributed** random variable with $2k_1$ and $2k_2$ degrees of freedom.

The model parameters $(k_1, \theta_1, k_2, \theta_2)$ are estimated by **maximum likelihood estimation (MLE)** based on independent samples from the non-diseased and diseased groups. The **parametric percentile bootstrap** (`ci_method = "bootstrap"`) is recommended for constructing percentile-based confidence intervals, especially when sample sizes are small to moderate or the data deviate from ideal Gamma assumptions.

The **bi-Gamma ROC model** is a **parametric ROC framework** and is suitable for data that are **positively skewed** or have **heavy right tails**, characteristics commonly observed in biomedical and reliability studies.

**Example**

```r
# Load formatted dataset
data(DMDmodified)

# Compute bi-Gamma AUC summary and ROC plot
auc <- AUC(
  data = DMDmodified,
  method = "bigamma",
  ci = TRUE,
  ci_method = "bootstrap",
  siglevel = 0.05,
  boot_iter = 1000
)

# Display AUC summary
message(paste(auc$summary))

# Display bi-Gamma ROC plot
auc$plot
```

**References**

* Dorfman, D. D., Berbaum, K. S., & Metz, C. E. (1997). Proper receiver operating characteristic analysis: The bigamma model. *Academic Radiology*, **4**, 138–149. [https://doi.org/10.1016/s1076-6332(97)80013-x](https://doi.org/10.1016/s1076-6332(97)80013-x)
* Hussain, E. (2012). The Bi-Gamma ROC Curve in a Straightforward Manner. *Journal of Basic and Applied Sciences*, **8**(2). [http://dx.doi.org/10.6000/1927-5129.2012.08.02.09](http://dx.doi.org/10.6000/1927-5129.2012.08.02.09)
* Guo, B. (2015). *On the effect of improperness of binormal ROC curves for estimating full area under the curve*. PhD Thesis, University of Pittsburgh.
[http://d-scholarship.pitt.edu/id/eprint/23590](http://d-scholarship.pitt.edu/id/eprint/23590)

# {#lehmann .unnumbered}

------------------------------------------------------------------------

# `lehmann` Semiparametric ROC Curve under the Lehmann Model

------------------------------------------------------------------------

To apply this method, set `method = "lehmann"` in the `AUC()` function. The option `ci_method = "ple"` refers to the computation of the confidence interval by the partial likelihood-based method (via the proportional hazards model), which is the only available option for the Lehmann model.

**Usage**

```r
AUC(
  data = data,
  method = "lehmann",
  ci = TRUE,
  ci_method = "ple",
  siglevel = 0.05,
  boot_iter = 1000
)
```

**Description**

The **Lehmann model** provides a semiparametric framework for ROC curve estimation that assumes a simple power relationship between the survivor functions of the diseased and non-diseased populations:
\[
\overline{G}(t) = [\overline{F}(t)]^{\delta}, \qquad 0 < \delta \le 1,
\]
where $\overline{F}(t)$ and $\overline{G}(t)$ are the survivor functions of the biomarker for the non-diseased and diseased groups, respectively, and $\delta$ is a single **diagnostic accuracy parameter**. Smaller values of $\delta$ correspond to stronger discriminatory ability of the biomarker.

Under this assumption, the ROC curve and its corresponding area have simple analytical forms:
\[
ROC_{\text{le}}(t) = t^{\delta}, \qquad t \in [0, 1],
\]
and
\[
AUC_{\text{le}} = \int_0^1 t^{\delta} \, dt = \frac{1}{1 + \delta}.
\]
This produces a **smooth, monotonic ROC curve** that is both interpretable and computationally efficient. The single parameter $\delta$ controls the shape of the ROC and directly determines the AUC.

The Lehmann assumption is equivalent to the **proportional hazards (PH)** formulation in survival analysis, where the ratio of hazard functions for the diseased and non-diseased groups is constant:
\[
\frac{h_Y(t)}{h_X(t)} = e^{\beta}.
\]
Here, the Lehmann parameter and PH coefficient are linked by $\delta = e^{\beta}$. Thus, $\hat{\beta}$ is estimated by fitting a Cox proportional hazards model using the `survival` package, and consequently
\[
\hat{\delta} = e^{\hat{\beta}}, \quad \widehat{AUC}_{\text{le}} = \frac{1}{1 + \hat{\delta}}.
\]
The confidence interval for $AUC_{\text{le}}$ can then be derived using the delta method, based on the estimated variance of $\hat{\beta}$.

This method naturally accommodates covariates through the Cox proportional hazards framework, allowing for adjusted ROC analysis that accounts for additional variables. The parameter $\delta = e^{\beta}$ offers direct clinical interpretability as a hazard ratio, making the results meaningful in applied biomedical contexts. It is computationally simple, relying on standard Cox regression routines without requiring complex optimization procedures or Bayesian sampling. The approach is also robust, maintaining statistical efficiency without imposing distributional assumptions on the biomarker data.

The **Lehmann ROC model** bridges parametric and nonparametric approaches by imposing a simple, interpretable relationship between sensitivity and specificity while leaving the biomarker distributions unspecified. This balance of **robustness, flexibility, and efficiency** makes it particularly suitable for **heterogeneous biomedical datasets**, especially where biomarkers are influenced by covariates or measured repeatedly over time.

**Example**

```r
# Load formatted dataset
data(DMDmodified)

# Compute semiparametric ROC under the Lehmann assumption
auc <- AUC(
  data = DMDmodified,
  method = "lehmann",
  ci = TRUE,
  ci_method = "ple",
  siglevel = 0.05,
  boot_iter = 1000
)

# Display AUC summary
message(paste(auc$summary))

# Display Lehmann ROC plot
auc$plot
```

**References**

* Lehmann, E. L. (1953). The power of rank tests. *Annals of Mathematical Statistics*, **24**, 23–43.
[https://doi.org/10.1214/aoms/1177729080](https://doi.org/10.1214/aoms/1177729080)
* Cox, D. R. (1972). Regression models and life-tables. *Journal of the Royal Statistical Society: Series B*, **34**, 187–220. [https://www.jstor.org/stable/2985181](https://www.jstor.org/stable/2985181)
* Cox, D. R. (1975). Partial likelihood. *Biometrika*, **62**, 269–276. [https://doi.org/10.1093/biomet/62.2.269](https://doi.org/10.1093/biomet/62.2.269)
* Gönen, M., & Heller, G. (2010). Lehmann family of ROC curves. *Medical Decision Making*, **30**(4), 509–517. [https://doi.org/10.1177/0272989X09360067](https://doi.org/10.1177/0272989X09360067)
* Ghebremichael, M., & Habtemicael, S. (2018). Effect of tuberculosis on immune restoration among HIV-infected patients receiving antiretroviral therapy. *Journal of Applied Statistics*, **45**(13), 2357–2364. [https://doi.org/10.1080/02664763.2017.1420758](https://doi.org/10.1080/02664763.2017.1420758)
* Ghebremichael, M., & Michael, H. (2024). Comparison of the binormal and Lehmann receiver operating characteristic curves. *Communications in Statistics—Simulation and Computation*, **53**(2), 772–785. [https://doi.org/10.1080/03610918.2022.2032159](https://doi.org/10.1080/03610918.2022.2032159)
* Ghebremichael, M., et al. (2019). Comparing the diagnostic accuracy of CD4+ T-lymphocyte count and percent as surrogate markers of pediatric HIV disease. *Journal of Mathematics and Statistics*, **15**(1), 55–64. [https://doi.org/10.3844/jmssp.2019.55.64](https://doi.org/10.3844/jmssp.2019.55.64)
* Jokiel-Rokita, A., & Topolnicki, R. (2020). Estimation of the ROC curve from the Lehmann family. *Computational Statistics & Data Analysis*, **142**, 106820.
[https://doi.org/10.1016/j.csda.2019.106820](https://doi.org/10.1016/j.csda.2019.106820)

# {#bayesbiweibull-bayesian-bi-weibull-roc-curve .unnumbered}

------------------------------------------------------------------------

# `bayesbiweibull` Bayesian Bi-Weibull ROC Curve

------------------------------------------------------------------------

To apply this method, set `method = "bayesbiweibull"` in the `AUC()` function. The option `ci_method = "mcmc"` refers to the computation of the MCMC-based highest posterior density (HPD) credible interval, which is the only available option for this method. The `boot_iter` option is inactive for this method, as the number of MCMC iterations is fixed at 11,000 (comprising 1,000 burn-in and 10,000 retained samples).

**Usage**

```r
AUC(
  data = data,
  method = "bayesbiweibull",
  ci = TRUE,
  ci_method = "mcmc",
  siglevel = 0.05,
  boot_iter = 1000
)
```

**Description**

The **Bayesian Bi-Weibull ROC curve** is a **parametric Bayesian extension** of the constant-shape Bi-Weibull model. In the Bayesian paradigm, the unknown model parameters are treated as **random variables** with prior distributions that reflect prior knowledge or beliefs about their possible values. These priors are updated with the observed data through **Bayes’ theorem**, yielding **posterior distributions** for both the ROC curve and its area under the curve (AUC). Posterior summaries (such as the posterior mean or credible intervals) serve as Bayesian estimates of ROC quantities.

As described in the frequentist Bi-Weibull section, we assume the biomarker values for the **non-diseased** and **diseased** populations follow Weibull distributions with a shared **shape parameter** $\alpha$, but distinct **scale parameters** $\theta_0$ and $\theta_1$.
The parameters $\theta_0$, $\theta_1$, and $\alpha$ are treated as random variables with the following prior distributions:
\[
\theta_j \sim \mathrm{IG}(a_j, b_j), \quad j = 0, 1,
\]
\[
\alpha \sim \mathrm{Gamma}(k, \beta),
\]
where $a_j, b_j, k, \beta > 0$. Here, $\mathrm{IG}(a, b)$ denotes the **inverse-gamma** distribution with density
\[
\pi_{1j}(\theta_j) = \frac{b_j^{a_j}}{\Gamma(a_j)} \, \theta_j^{-(a_j + 1)} e^{-b_j / \theta_j},
\]
and all priors are assumed **independent**.

Given data $x_1, \dots, x_m$ (controls) and $y_1, \dots, y_n$ (cases), the likelihood function is:
\[
L(\alpha, \theta_0, \theta_1 \mid \text{data}) \propto \alpha^{m+n} \, \theta_0^{-m} \, \theta_1^{-n}
\left( \prod_{i=1}^{m} x_i^{\alpha - 1} \right) \left( \prod_{j=1}^{n} y_j^{\alpha - 1} \right)
\exp\left(-\frac{\sum_{i=1}^{m} x_i^{\alpha}}{\theta_0}\right) \exp\left(-\frac{\sum_{j=1}^{n} y_j^{\alpha}}{\theta_1}\right).
\]
The **posterior distribution** is proportional to the product of the likelihood and priors:
\[
p(\alpha, \theta_0, \theta_1 \mid \text{data}) \propto L(\alpha, \theta_0, \theta_1 \mid \text{data}) \, \pi_{10}(\theta_0)\pi_{11}(\theta_1)\pi(\alpha).
\]
Because the posterior distribution cannot be evaluated analytically, parameter estimation is performed using **Markov Chain Monte Carlo (MCMC)**, typically through **Gibbs sampling** with a **Metropolis–Hastings** step for $\alpha$. At each iteration $s$, new samples $\theta_0^{(s)}$, $\theta_1^{(s)}$, and $\alpha^{(s)}$ are drawn from their respective conditional posterior distributions. From these samples, the AUC for iteration $s$ is computed as:
\[
AUC^{(s)} = \frac{\theta_1^{(s)}}{\theta_0^{(s)} + \theta_1^{(s)}}.
\]
The **posterior mean AUC** and its **95% highest posterior density (HPD)** credible interval are then estimated as:
\[
\widehat{AUC}_{\text{Biw}}^{\text{Bayes}} = \frac{1}{S} \sum_{s=1}^{S} AUC^{(s)}, \quad CI_{95\%} = \mathrm{HPD}_{0.95}\{AUC^{(1)}, \dots, AUC^{(S)}\}.
\]

For this implementation, **11,000 MCMC iterations** were performed, discarding the first **1,000 iterations as burn-in** and retaining the remaining **10,000 samples** for posterior inference. The AUC and its 95% HPD credible intervals were computed using **non-informative priors**, with $a_0 = a_1 = b_0 = b_1 = 0$ and $\alpha \sim \text{Gamma}(0.1, 1)$. Note that with these hyperparameter values the priors for $\theta_0$ and $\theta_1$ are **improper**, meaning they do not integrate to one, but they still yield proper posteriors when combined with the likelihood.

The **Bayesian Bi-Weibull ROC approach** offers a flexible and robust framework for ROC analysis by combining prior knowledge with observed data. It generates full posterior distributions for the ROC curve and AUC through MCMC simulation, providing direct quantification of uncertainty without relying on asymptotic approximations. Averaging over posterior draws yields smooth and stable ROC estimates, even in small samples. The 95% highest posterior density (HPD) credible intervals are computed using the **`HDInterval`** package in R.

**Example**

```r
# Load formatted dataset
data(DMDmodified)

# Bayesian estimation of the Bi-Weibull AUC and ROC
auc <- AUC(
  data = DMDmodified,
  method = "bayesbiweibull",
  ci = TRUE,
  ci_method = "mcmc",
  siglevel = 0.05,
  boot_iter = 1000
)

# Display Bayesian AUC summary
message(paste(auc$summary))

# Display posterior ROC plot
auc$plot
```

**References**

* Kundu, D., & Gupta, R. D. (2006). Estimation of $P[Y < X]$ for Weibull distributions. *IEEE Transactions on Reliability*, **55**(2), 270–280. [https://doi.org/10.1109/TR.2006.874918](https://doi.org/10.1109/TR.2006.874918)
* Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., & Rubin, D. B. (2020). *Bayesian Data Analysis* (4th ed.). CRC Press. [https://doi.org/10.1201/9780429258480](https://doi.org/10.1201/9780429258480)
* Meredith, M., & Kruschke, J. K. (2022). *HDInterval: Highest (Posterior) Density Intervals*.
R package version 0.2.4. Available at: [https://CRAN.R-project.org/package=HDInterval](https://CRAN.R-project.org/package=HDInterval)

# {#dpm-roc-curve .unnumbered}

------------------------------------------------------------------------

# `dpm` Bayesian Semiparametric ROC (Dirichlet Process Mixture of Normals)

------------------------------------------------------------------------

To apply this method, set `method = "dpm"` in the `AUC()` function. The option `ci_method = "dpm"` refers to the computation of the posterior credible interval from the MCMC samples of the Dirichlet process mixture model, which is the only available option for this method. The `boot_iter` option is inactive for this method, as the number of MCMC iterations is fixed at 500 (comprising 100 burn-in and 400 retained samples).

**Usage**

```r
AUC(
  data = data,
  method = "dpm",
  ci = TRUE,
  ci_method = "dpm",
  siglevel = 0.05,
  boot_iter = 1000
)
```

**Description**

The **Bayesian semiparametric ROC** approach combines the interpretability of parametric models with the flexibility of nonparametric inference by assigning **infinite-dimensional priors**, such as **Dirichlet processes (DPs)**, to the biomarker distributions. This allows the model to capture complex data features (including **multimodality**, **skewness**, and **heterogeneity**) that cannot be adequately represented by simple single-component parametric models.

A key implementation of this framework is the **Dirichlet Process Mixture (DPM)** of normal distributions proposed by Erkanli et al. (2006). In this model, the biomarker distributions for the non-diseased and diseased populations are represented as mixtures of normal components, with random mixing distributions drawn from independent DPs.

Let $\{X_i\}_{i=1}^m \sim F$ and $\{Y_i\}_{i=1}^n \sim G$ denote biomarker measurements from the non-diseased and diseased groups, respectively.
Then,
\[
F(x) = \sum_{l=1}^{L} p_l \Phi(x \mid \mu_l, \sigma_l^2), \qquad
G(y) = \sum_{l=1}^{L'} p_l' \Phi(y \mid \mu_l', \sigma_l'^2),
\]
where $\Phi(\cdot \mid \mu, \sigma^2)$ is the normal cumulative distribution function, and $L, L'$ are truncation levels that approximate the infinite Dirichlet process mixture. The mixture weights $\{p_l\}$ follow a **stick-breaking process**:
\[
p_l =
\begin{cases}
R_1, & l = 1, \\
R_l \prod_{r=1}^{l-1}(1 - R_r), & l = 2, \dots, L - 1, \\
\prod_{r=1}^{L-1}(1 - R_r), & l = L,
\end{cases}
\]
with a similar construction for $\{p_l'\}$ in the diseased group. The priors are specified as follows:
\[
R_r \sim \mathrm{Beta}(1, \alpha), \quad \alpha \sim \mathrm{Gamma}(a, b),
\]
\[
\mu_l \sim N(m_0, S_0), \quad \sigma_l^{-2} \sim \mathrm{Gamma}(c, d),
\]
where $\alpha$ controls the number of mixture components and the model complexity. This finite truncation approach, following Ishwaran and James (2002), yields a **computationally tractable** approximation to the Dirichlet process.

Given $F$ and $G$, the ROC curve is defined as
\[
ROC_{\text{DPM}}(p) = 1 - G(F^{-1}(1 - p)), \qquad 0 < p < 1,
\]
and the corresponding AUC is
\[
AUC_{\text{DPM}} = \int_0^1 [1 - G(F^{-1}(1 - p))] \, dp.
\]
Both quantities are evaluated numerically at each iteration of the MCMC algorithm using the current mixture parameter draws.

Posterior inference for the ROC curve and AUC is obtained using **Gibbs sampling** with **Metropolis–Hastings** updates when needed. At each iteration, the algorithm updates mixture component parameters and stick-breaking weights for both groups, computes the corresponding ROC curve and AUC, and stores these posterior samples. The posterior mean ROC curve is then obtained by averaging across the retained MCMC iterations, and 95% credible intervals are obtained from the same posterior samples.

In this implementation, the semiparametric Bayesian estimator $\widehat{\text{AUC}}_{\text{DPM}}$ is computed using posterior samples from the Dirichlet process mixture model.
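To make the truncated stick-breaking construction concrete, the sketch below draws one set of weights for each group, forms the two normal-mixture CDFs, and evaluates the implied ROC curve and AUC for that single draw. It does not perform any MCMC: the component means and standard deviations are arbitrary hypothetical values standing in for one iteration's parameter draws, and the helper names (`stick_break`, `mix_cdf`, `mix_quantile`) are assumptions made for this example.

```r
# Sketch only: one hypothetical parameter draw, not the package's sampler.
set.seed(4)
L <- 10      # truncation level
alpha <- 1   # DP concentration parameter

# Stick-breaking weights: p_1 = R_1, p_l = R_l * prod_{r<l}(1 - R_r),
# and p_L = prod_{r<L}(1 - R_r), so the weights sum to one
stick_break <- function(L, alpha) {
  R <- rbeta(L - 1, 1, alpha)
  c(R, 1) * c(1, cumprod(1 - R))
}
p0 <- stick_break(L, alpha)   # non-diseased group
p1 <- stick_break(L, alpha)   # diseased group

# Hypothetical component means and standard deviations
mu0 <- rnorm(L, 0, 1);   sd0 <- runif(L, 0.5, 1.5)
mu1 <- rnorm(L, 1.5, 1); sd1 <- runif(L, 0.5, 1.5)

# Mixture CDF and its numerical inverse
mix_cdf <- function(u, p, mu, sd) {
  vapply(u, function(v) sum(p * pnorm(v, mu, sd)), numeric(1))
}
mix_quantile <- function(prob, p, mu, sd) {
  lower <- min(mu - 10 * sd)
  upper <- max(mu + 10 * sd)
  vapply(prob, function(q) {
    uniroot(function(v) mix_cdf(v, p, mu, sd) - q,
            lower = lower, upper = upper, tol = 1e-9)$root
  }, numeric(1))
}

# ROC(p) = 1 - G(F^{-1}(1 - p)) on a grid, then trapezoidal AUC
t_grid <- seq(0.001, 0.999, length.out = 199)
roc_dpm <- 1 - mix_cdf(mix_quantile(1 - t_grid, p0, mu0, sd0), p1, mu1, sd1)
tt <- c(0, t_grid, 1)
rr <- c(0, roc_dpm, 1)
auc_dpm <- sum(diff(tt) * (head(rr, -1) + tail(rr, -1)) / 2)
auc_dpm
```

In a full sampler this evaluation would be repeated for every retained iteration, giving one ROC curve and one AUC value per posterior draw.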
Weakly informative priors are used to ensure flexibility and stability: $\alpha \sim \text{Gamma}(1, 1)$, $\mu_l \sim \mathcal{N}(0, 100)$, and $\tau_l = \sigma_l^{-2} \sim \text{Gamma}(0.1, 0.1)$, providing vague yet regularized estimates. Stick-breaking variables follow $R_r \sim \text{Beta}(1, \alpha)$, and both groups are modeled identically. The truncation level is fixed at $L = L' = 10$, which balances computational efficiency with representational power. Posterior inference is based on **500 MCMC iterations** (100 burn-in and 400 retained samples), yielding stable AUC estimates across replications.

Although $\widehat{\text{AUC}}_{\text{DPM}}$ often produced narrower credible intervals than competing methods, it was somewhat sensitive to prior choices in small-sample scenarios. Despite a higher computational cost, this approach remains highly flexible and robust, and is particularly valuable when the biomarker distributions are **skewed**, **heavy-tailed**, or **multimodal**.

**Example**

```r
# Load formatted dataset
data(DMDmodified)

# Bayesian semiparametric ROC using Dirichlet process mixture of normals
auc <- AUC(
  data = DMDmodified,
  method = "dpm",
  ci = TRUE,
  ci_method = "dpm",
  siglevel = 0.05,
  boot_iter = 1000
)

# Display AUC summary (posterior mean and credible interval)
message(paste(auc$summary))

# Display DPM-based ROC plot (posterior mean ROC with bands)
auc$plot
```

**References**

* Erkanli, A., Sung, L., & Stamey, J. D. (2006). Bayesian semi-parametric ROC curve estimation. *Statistics in Medicine*, **25**, 3905–3928. [https://doi.org/10.1002/sim.2496](https://doi.org/10.1002/sim.2496)
* Ishwaran, H., & James, L. F. (2002). Approximate Dirichlet process computing in finite normal mixtures. *Journal of Computational and Graphical Statistics*, **11**(3), 508–532.
[https://doi.org/10.1198/106186002411](https://doi.org/10.1198/106186002411)

# {#bb-bayesian-bootstrap-roc-curve .unnumbered}

------------------------------------------------------------------------

# `BB` Bayesian Bootstrap ROC Curve

------------------------------------------------------------------------

To apply this method, set `method = "BB"` in the `AUC()` function. The option `ci_method = "bb"` requests the Bayesian Bootstrap credible interval, which is the only interval option available for Bayesian Bootstrap inference. The argument `boot_iter` sets the number of bootstrap replications.

**Usage**

```r
AUC(
  data = data,
  method = "BB",
  ci = TRUE,
  ci_method = "bb",
  siglevel = 0.05,
  boot_iter = 1000
)
```

**Description**

The **Bayesian Bootstrap (BB)**, introduced by Rubin (1981), provides a **fully nonparametric Bayesian** method for estimating smooth ROC curves and AUC values. Unlike the classical bootstrap, which resamples data points, BB assigns **random Dirichlet weights** to the observed data, generating a posterior distribution over ROC curves that reflects uncertainty without relying on large-sample approximations or bandwidth selection.

In empirical ROC estimation, each observation contributes equally (weights of $1/m$ for controls and $1/n$ for cases). BB replaces these fixed weights with random draws from a **Dirichlet(1, ..., 1)** distribution. Averaging across replicates yields a smooth posterior mean ROC curve, and the variation among replicates quantifies uncertainty in the AUC.

Let $X = (X_1, \dots, X_m)$ be controls and $Y = (Y_1, \dots, Y_n)$ be cases. For each bootstrap replicate $b = 1, \dots, B$:

1. Draw $(p_1, \dots, p_m) \sim \text{Dirichlet}(1, \dots, 1)$, or equivalently $p_i = w_i / \sum_j w_j$ with $w_i \sim \text{Exponential}(1)$.

   Define the weighted empirical CDF:

   \[ F^{(b)}(u) = \sum_{i=1}^{m} p_i \mathbf{1}(X_i \le u) \]

   Compute the placement values:

   \[ U_j^{(b)} = 1 - F^{(b)}(Y_j), \quad j = 1, \dots, n \]

2. Draw $(q_1, \dots, q_n) \sim \text{Dirichlet}(1, \dots, 1)$.

   Construct the ROC curve:

   \[ \text{ROC}_{m,n}^{(b)}(t) = \sum_{j=1}^{n} q_j \mathbf{1}(U_j^{(b)} \le t) \]

   Estimate the AUC numerically:

   \[ \text{AUC}^{(b)} = \int_0^1 \text{ROC}_{m,n}^{(b)}(t) \, dt \]

3. Combine the results from all $B$ replicates to produce the posterior mean estimates:

   \[ \widehat{\text{ROC}}_{\text{BB}}(t) = \frac{1}{B} \sum_{b=1}^B \text{ROC}_{m,n}^{(b)}(t), \quad \widehat{\text{AUC}}_{\text{BB}} = \frac{1}{B} \sum_{b=1}^B \text{AUC}^{(b)}. \]

The posterior variance is estimated by

\[ \text{Var}(\widehat{\text{AUC}}_{\text{BB}}) = \frac{1}{B - 1} \sum_{b=1}^{B} \left(\text{AUC}^{(b)} - \widehat{\text{AUC}}_{\text{BB}}\right)^2 \]

A $100(1-\alpha)\%$ credible interval for the AUC is obtained by taking the $\alpha/2$ and $1-\alpha/2$ quantiles of the empirical distribution $\{\text{AUC}^{(1)}, \dots, \text{AUC}^{(B)}\}$.

The Bayesian Bootstrap generates smooth ROC curves by averaging over randomly weighted distributions, avoiding kernel smoothing and parametric assumptions. It is especially useful for small or irregular samples, offering robust, data-driven inference with direct posterior uncertainty quantification. The method is also computationally efficient, relying on simple reweighting rather than full MCMC.

**Example**

```r
# Load formatted dataset
data(DMDmodified)

# Bayesian Bootstrap ROC and AUC estimation
auc <- AUC(
  data = DMDmodified,
  method = "BB",
  ci = TRUE,
  ci_method = "bb",
  siglevel = 0.05,
  boot_iter = 1000
)

# Display posterior AUC summary and credible interval
message(paste(auc$summary))

# Display smooth Bayesian Bootstrap ROC plot
auc$plot
```

**References**

* Rubin, D. B. (1981). The Bayesian bootstrap. *Annals of Statistics*, **9**, 130–134. [https://doi.org/10.1214/aos/1176345338](https://doi.org/10.1214/aos/1176345338)
* Gu, J., Ghosal, S., & Roy, A. (2008). Bayesian bootstrap estimation of ROC curve. *Statistics in Medicine*, **27**, 5407–5420. [https://doi.org/10.1002/sim.3366](https://doi.org/10.1002/sim.3366)
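**Illustrative sketch**

As a rough illustration of the Bayesian Bootstrap steps described in this section, the algorithm can be sketched in a few lines of base R. This is a minimal sketch and not the package's implementation: the function `bb_auc_sketch()`, its arguments, and the simulated inputs are hypothetical, Dirichlet(1, ..., 1) weights are drawn as normalized Exponential(1) variables, and the AUC of each replicate uses the closed form $\sum_j q_j (1 - U_j^{(b)})$, which is the exact integral of the step-function ROC over $[0, 1]$.

```r
# Illustrative sketch only: hypothetical function, not part of 'ROCModels'.
set.seed(2024)

bb_auc_sketch <- function(x, y, B = 1000, level = 0.95,
                          grid = seq(0, 1, length.out = 201)) {
  m <- length(x)
  n <- length(y)
  roc_draws <- matrix(NA_real_, nrow = B, ncol = length(grid))
  auc_draws <- numeric(B)

  for (b in seq_len(B)) {
    # Step 1: Dirichlet(1, ..., 1) weights on controls, then placement values
    w <- rexp(m)
    p <- w / sum(w)
    U <- 1 - vapply(y, function(yj) sum(p[x <= yj]), numeric(1))

    # Step 2: Dirichlet(1, ..., 1) weights on cases, weighted ROC on the grid
    v <- rexp(n)
    q <- v / sum(v)
    roc_draws[b, ] <- vapply(grid, function(t) sum(q[U <= t]), numeric(1))

    # Exact integral of the step function ROC^(b)(t) over [0, 1]
    auc_draws[b] <- sum(q * (1 - U))
  }

  # Step 3: posterior mean, variance, and equal-tailed credible interval
  a <- 1 - level
  list(
    grid = grid,
    roc  = colMeans(roc_draws),
    auc  = mean(auc_draws),
    var  = var(auc_draws),
    ci   = unname(quantile(auc_draws, c(a / 2, 1 - a / 2)))
  )
}

# Simulated controls and cases (assumed data, for illustration only)
x <- rnorm(40)
y <- rnorm(30, mean = 1)
fit <- bb_auc_sketch(x, y, B = 500)
fit$auc
fit$ci
```

Because each replicate only reweights the observed data, the loop involves no model fitting, which is why the approach is inexpensive compared with MCMC-based estimators.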