Note to self (10/29/19) !!!DPC!!!

The FOM and DeLong method implementations need checking with a toy dataset.

Introduction

  • For an ROI dataset StSignificanceTesting() automatically defaults to method = "OR", covEstMethod = "DeLong" and FOM = "ROI".

  • The covariance estimation method is based on the original DeLong method (DeLong, DeLong, and Clarke-Pearson 1988), which is valid only for the trapezoidal AUC, i.e. ROC data, as extended by (Obuchowski 1997) to ROI data, see formula below.

  • The essential differences from conventional ROC analyses are in the definition of the ROI figure of merit, see below, and the procedure developed by (Obuchowski 1997) for estimating the covariance matrix. Once the covariances are known, method = "OR" can be applied to perform significance testing, as described in (Obuchowski and Rockette 1995) and (Chakraborty 2017, Chapter 10).

The ROI figure of merit

Let \(X_{kr}\) denote the rating for the \(r\)th lesion-containing ROI in the \(k\)th case and let \(n_{k}^{L}\) be the total number of lesion-containing ROIs in the \(k\)th case. Similarly, let \(Y_{kr}\) denote the rating for the \(r\)th lesion-free ROI in the \(k\)th case and \(n_{k}^{N}\) denote the total number of lesion-free ROIs in the \(k\)th case. Let \(N_{L}\) denote the total number of lesion-containing ROIs in the image set and \(N_{N}\) denote the total number of lesion-free ROIs. These are given by \(N_{L}=\sum\nolimits_{k}{n_{k}^{L}}\) and \(N_{N}=\sum\nolimits_{k}{n_{k}^{N}}\). The ROI figure of merit \(\theta\) is defined by:

\[\theta =\frac{1}{N_{L}N_{N}}\sum\limits_{k}\sum\limits_{k'}\sum\limits_{r=1}^{n_{k}^{L}}\sum\limits_{r'=1}^{n_{k'}^{N}}\psi (X_{kr},Y_{k'r'})\]

The kernel function \(\psi(X,Y)\) is defined by:

\[\psi (X,Y)=\begin{cases} 1.0 & \text{if}\ Y<X \\ 0.5 & \text{if}\ Y=X \\ 0.0 & \text{if}\ Y>X \end{cases}\]

The ROIs are effectively regarded as mini-cases and one calculates the FOM as the Wilcoxon statistic considering the mini-cases as actual cases. The correlations between the ratings of ROIs on the same case are accounted for in the analysis.
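The definition above can be sketched directly in R. This is a minimal illustration on hypothetical toy ratings, not the RJafroc implementation; the names psi, roiFom, X and Y are assumptions introduced here. Each list element holds the ROI ratings of one case, and pooling them treats every ROI as a mini-case.

```r
# Kernel: 1 if the lesion-free rating y is below the lesion-containing
# rating x, 0.5 if they are tied, 0 otherwise
psi <- function(x, y) (y < x) + 0.5 * (y == x)

# Hypothetical toy data: one list element per case
X <- list(c(4, 5), c(3), numeric(0))      # lesion-containing ROI ratings
Y <- list(c(1, 2), c(2, 3, 1), c(1, 4))   # lesion-free ROI ratings

roiFom <- function(X, Y) {
  x <- unlist(X)   # pool lesion-containing ROIs over cases (N_L values)
  y <- unlist(Y)   # pool lesion-free ROIs over cases (N_N values)
  # sum the kernel over all (lesion-containing, lesion-free) pairs
  sum(outer(x, y, psi)) / (length(x) * length(y))
}

theta <- roiFom(X, Y)
```

For these toy ratings the kernel values sum to 19 over 3 x 7 = 21 pairs, so theta is 19/21, about 0.905.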

Calculation of the ROI figure of merit

UtilFigureOfMerit(datasetROI, FOM = "ROI")
#>           rdr1      rdr2      rdr3      rdr4      rdr5
#> trt1 0.9057239 0.8842834 0.8579279 0.9350207 0.8352103
#> trt2 0.9297186 0.9546035 0.8937652 0.9531716 0.8770076
fom <- UtilFigureOfMerit(datasetROI, FOM = "ROI")
  • If the correct FOM is not supplied, it defaults to FOM = "ROI".
  • This is a 2-treatment 5-reader dataset.
  • For treatment 1, reader 1 the figure of merit is 0.9057239.
  • For treatment 2, reader 5 the figure of merit is 0.8770076.
  • Etc.
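The reader-averaged figure of merit for each treatment follows from the table above by averaging over readers. The sketch below re-enters the printed values by hand so that it runs without the package; the names fomTable, trtMeans and trtMeanDiff are assumptions introduced here.

```r
# FOM values copied from the printed table (rows: treatments, columns: readers)
fomTable <- matrix(
  c(0.9057239, 0.8842834, 0.8579279, 0.9350207, 0.8352103,
    0.9297186, 0.9546035, 0.8937652, 0.9531716, 0.8770076),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("trt1", "trt2"), paste0("rdr", 1:5))
)

# Average over readers within each treatment
trtMeans <- rowMeans(fomTable)

# Treatment 1 minus treatment 2
trtMeanDiff <- trtMeans[["trt1"]] - trtMeans[["trt2"]]
```

The averages come out near 0.884 and 0.922, with a difference near -0.038.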

Significance testing

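When dataset$dataType == "ROI" the FOM defaults to "ROI" (meaning the above formula) and the covariance estimation method defaults to covEstMethod = "DeLong". Any other value supplied by the user is overridden, as the message printed by the following call shows.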

ret <- StSignificanceTesting(datasetROI, FOM = "Wilcoxon")
#> ROI dataset: forcing method = `ORH`, covEstMethod = `DeLong` and FOM = `ROI`.
str(ret)
#> List of 5
#>  $ FOMs :List of 3
#>   ..$ foms        :'data.frame': 2 obs. of  5 variables:
#>   .. ..$ rdr1: num [1:2] 0.906 0.93
#>   .. ..$ rdr2: num [1:2] 0.884 0.955
#>   .. ..$ rdr3: num [1:2] 0.858 0.894
#>   .. ..$ rdr4: num [1:2] 0.935 0.953
#>   .. ..$ rdr5: num [1:2] 0.835 0.877
#>   ..$ trtMeans    :'data.frame': 2 obs. of  1 variable:
#>   .. ..$ Estimate: num [1:2] 0.884 0.922
#>   ..$ trtMeanDiffs:'data.frame': 1 obs. of  1 variable:
#>   .. ..$ Estimate: num -0.038
#>  $ ANOVA:List of 4
#>   ..$ TRanova      :'data.frame':    3 obs. of  3 variables:
#>   .. ..$ SS: num [1:3] 0.003614 0.010223 0.000827
#>   .. ..$ DF: num [1:3] 1 4 4
#>   .. ..$ MS: num [1:3] 0.003614 0.002556 0.000207
#>   ..$ VarCom       :'data.frame':    6 obs. of  2 variables:
#>   .. ..$ Estimates: num [1:6] 0.001082 0.000153 0.000247 0.000187 0.000154 ...
#>   .. ..$ Rhos     : num [1:6] NA NA 0.74 0.561 0.463 ...
#>   ..$ IndividualTrt:'data.frame':    2 obs. of  4 variables:
#>   .. ..$ DF         : num [1:2] 4 4
#>   .. ..$ msREachTrt : num [1:2] 0.00153 0.00123
#>   .. ..$ varEachTrt : num [1:2] 0.000412 0.000255
#>   .. ..$ cov2EachTrt: num [1:2] 0.00023 0.000144
#>   ..$ IndividualRdr:'data.frame':    5 obs. of  4 variables:
#>   .. ..$ DF         : num [1:5] 1 1 1 1 1
#>   .. ..$ msTEachRdr : num [1:5] 0.000288 0.002472 0.000642 0.000165 0.000874
#>   .. ..$ varEachRdr : num [1:5] 0.000269 0.000227 0.000481 0.000168 0.000522
#>   .. ..$ cov1EachRdr: num [1:5] 0.000216 0.000122 0.000345 0.000125 0.000424
#>  $ RRRC :List of 3
#>   ..$ FTests         :'data.frame':  2 obs. of  4 variables:
#>   .. ..$ DF   : num [1:2] 1 12.8
#>   .. ..$ MS   : num [1:2] 0.00361 0.00037
#>   .. ..$ FStat: num [1:2] 9.76 NA
#>   .. ..$ p    : num [1:2] 0.00817 NA
#>   ..$ ciDiffTrt      :'data.frame':  1 obs. of  7 variables:
#>   .. ..$ Estimate: num -0.038
#>   .. ..$ StdErr  : num 0.0122
#>   .. ..$ DF      : num 12.8
#>   .. ..$ t       : num -3.12
#>   .. ..$ PrGTt   : num 0.00817
#>   .. ..$ CILower : num -0.0643
#>   .. ..$ CIUpper : num -0.0117
#>   ..$ ciAvgRdrEachTrt:'data.frame':  2 obs. of  6 variables:
#>   .. ..$ Estimate: num [1:2] 0.884 0.922
#>   .. ..$ StdErr  : num [1:2] 0.0232 0.0197
#>   .. ..$ DF      : num [1:2] 12.2 10.1
#>   .. ..$ CILower : num [1:2] 0.833 0.878
#>   .. ..$ CIUpper : num [1:2] 0.934 0.966
#>   .. ..$ Cov2    : num [1:2] 0.00023 0.000144
#>  $ FRRC :List of 5
#>   ..$ FTests              :'data.frame': 2 obs. of  4 variables:
#>   .. ..$ MS   : num [1:2] 0.003614 0.000218
#>   .. ..$ Chisq: num [1:2] 16.6 NA
#>   .. ..$ DF   : num [1:2] 1 NA
#>   .. ..$ p    : num [1:2] 4.58e-05 NA
#>   ..$ ciDiffTrt           :'data.frame': 1 obs. of  6 variables:
#>   .. ..$ Estimate: num -0.038
#>   .. ..$ StdErr  : num 0.00933
#>   .. ..$ z       : num -4.08
#>   .. ..$ PrGTz   : num 4.58e-05
#>   .. ..$ CILower : num -0.0563
#>   .. ..$ CIUpper : num -0.0197
#>   ..$ ciAvgRdrEachTrt     :'data.frame': 2 obs. of  5 variables:
#>   .. ..$ Estimate: num [1:2] 0.884 0.922
#>   .. ..$ StdErr  : num [1:2] 0.0163 0.0129
#>   .. ..$ DF      : num [1:2] 89 89
#>   .. ..$ CILower : num [1:2] 0.852 0.896
#>   .. ..$ CIUpper : num [1:2] 0.916 0.947
#>   ..$ ciDiffTrtEachRdr    :'data.frame': 5 obs. of  6 variables:
#>   .. ..$ Estimate: num [1:5] -0.024 -0.0703 -0.0358 -0.0182 -0.0418
#>   .. ..$ StdErr  : num [1:5] 0.01025 0.01448 0.01648 0.00928 0.01398
#>   .. ..$ z       : num [1:5] -2.34 -4.86 -2.17 -1.96 -2.99
#>   .. ..$ PrGTz   : num [1:5] 1.93e-02 1.20e-06 2.97e-02 5.05e-02 2.79e-03
#>   .. ..$ CILower : num [1:5] -0.0441 -0.0987 -0.0681 -0.0363 -0.0692
#>   .. ..$ CIUpper : num [1:5] -3.90e-03 -4.19e-02 -3.53e-03 3.88e-05 -1.44e-02
#>   ..$ IndividualRdrVarCov1:'data.frame': 5 obs. of  2 variables:
#>   .. ..$ varEachRdr : num [1:5] 0.000269 0.000227 0.000481 0.000168 0.000522
#>   .. ..$ cov1EachRdr: num [1:5] 0.000216 0.000122 0.000345 0.000125 0.000424
#>  $ RRFC :List of 3
#>   ..$ FTests         :'data.frame':  2 obs. of  4 variables:
#>   .. ..$ DF: num [1:2] 1 4
#>   .. ..$ MS: num [1:2] 0.003614 0.000207
#>   .. ..$ F : num [1:2] 17.5 NA
#>   .. ..$ p : num [1:2] 0.0139 NA
#>   ..$ ciDiffTrt      :'data.frame':  1 obs. of  7 variables:
#>   .. ..$ Estimate: num -0.038
#>   .. ..$ StdErr  : num 0.00909
#>   .. ..$ DF      : num 4
#>   .. ..$ t       : num -4.18
#>   .. ..$ PrGTt   : num 0.0139
#>   .. ..$ CILower : num -0.0633
#>   .. ..$ CIUpper : num -0.0128
#>   ..$ ciAvgRdrEachTrt:'data.frame':  2 obs. of  5 variables:
#>   .. ..$ Estimate: num [1:2] 0.884 0.922
#>   .. ..$ StdErr  : num [1:2] 0.0175 0.0157
#>   .. ..$ DF      : num [1:2] 4 4
#>   .. ..$ CILower : num [1:2] 0.835 0.878
#>   .. ..$ CIUpper : num [1:2] 0.932 0.965
  • ret is a list with 5 members (FOMs, ANOVA, RRRC, FRRC and RRFC), each of which is itself a list of data frames. Their meanings should be clear from the notation. As an example:

  • The variance components are given by:

ret$ANOVA$VarCom

RRRC analysis

ret$RRRC$FTests
  • The F-statistic is 9.76, with ndf = 1 and ddf = 12.8, which yields a p-value of 0.00817 (values as rounded in the str(ret) listing above).

  • The confidence interval for the reader averaged difference between the two treatments is given by:

ret$RRRC$ciDiffTrt
  • The FOM difference (treatment 1 minus 2) is -0.038, which is significant, p-value = 0.00817, F-statistic = 9.76, ddf = 12.8. The confidence interval is (-0.0643, -0.0117).
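As a rough consistency check, the RRRC p-value and confidence interval can be recomputed from the rounded values in the str(ret) listing above using base R distribution functions. This is only a sketch: the variable names are assumptions, and because the inputs are rounded the results agree with the reported ones to about two or three significant digits.

```r
# Rounded values taken from the str(ret) listing ($RRRC)
fStat <- 9.76     # FStat
ndf   <- 1        # numerator degrees of freedom
ddf   <- 12.8     # denominator degrees of freedom
est   <- -0.038   # FOM difference, treatment 1 minus 2
se    <- 0.0122   # StdErr of the difference

# Upper-tail probability of the F distribution
pValue <- pf(fStat, ndf, ddf, lower.tail = FALSE)

# 95% confidence interval based on the t distribution with ddf degrees of freedom
halfWidth <- qt(0.975, ddf) * se
ci <- c(est - halfWidth, est + halfWidth)
```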

FRRC analysis

ret$FRRC$FTests
  • The test statistic is reported as a chi-square statistic, 16.6, with 1 degree of freedom (equivalent to an F-statistic with ndf = 1 and ddf = Inf), which yields a p-value of 4.58e-05 (values as rounded in the str(ret) listing above).

  • The confidence interval for the reader averaged difference between the two treatments is given by:

ret$FRRC$ciDiffTrt
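The FRRC numbers in the str(ret) listing above can be checked in the same hedged way, this time with the chi-square and normal distributions. The variable names are assumptions, and the inputs are the rounded values from the listing, so agreement is approximate.

```r
# Rounded values taken from the str(ret) listing ($FRRC)
chisq <- 16.6      # Chisq
dof   <- 1         # DF
est   <- -0.038    # FOM difference, treatment 1 minus 2
se    <- 0.00933   # StdErr of the difference

# Upper-tail probability of the chi-square distribution
pValue <- pchisq(chisq, dof, lower.tail = FALSE)

# z statistic and 95% confidence interval based on the normal distribution
z  <- est / se
ci <- est + c(-1, 1) * qnorm(0.975) * se
```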

RRFC analysis

ret$RRFC$FTests
  • The F-statistic is 17.5, with ndf = 1 and ddf = 4, which yields a p-value of 0.0139 (values as rounded in the str(ret) listing above).

  • The confidence interval for the reader averaged difference between the two treatments is given by:

ret$RRFC$ciDiffTrt
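The RRFC result in the str(ret) listing above admits the same kind of check, now with ddf = 4. Again this is a sketch on the rounded values, with variable names that are assumptions.

```r
# Rounded values taken from the str(ret) listing ($RRFC)
fStat <- 17.5      # F
ndf   <- 1         # numerator degrees of freedom
ddf   <- 4         # denominator degrees of freedom
est   <- -0.038    # FOM difference, treatment 1 minus 2
se    <- 0.00909   # StdErr of the difference

# Upper-tail probability of the F distribution
pValue <- pf(fStat, ndf, ddf, lower.tail = FALSE)

# 95% confidence interval based on the t distribution with ddf degrees of freedom
ci <- est + c(-1, 1) * qt(0.975, ddf) * se
```

With only 4 denominator degrees of freedom the t quantile is about 2.78, so this interval is wider than the FRRC one even though the standard errors are similar.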

Summary

TBA

References

Chakraborty, Dev P. 2017. Observer Performance Methods for Diagnostic Imaging - Foundations, Modeling, and Applications with R-Based Examples. Boca Raton, FL: CRC Press.

DeLong, E. R., D. M. DeLong, and D. L. Clarke-Pearson. 1988. “Comparing the Areas Under Two or More Correlated Receiver Operating Characteristic Curves: A Nonparametric Approach.” Biometrics 44: 837–45.

Obuchowski, Nancy A. 1997. “Nonparametric Analysis of Clustered ROC Curve Data.” Biometrics 53: 567–78.

Obuchowski, N. A., and H. E. Rockette. 1995. “Hypothesis Testing of the Diagnostic Accuracy for Multiple Diagnostic Tests: An ANOVA Approach with Dependent Observations.” Communications in Statistics: Simulation and Computation 24: 285–308.