Compares standalone CAD with at least two radiologists interpreting the same cases. Standalone CAD means that all the designer-level mark-rating pairs generated by the CAD algorithm are available to the analyst, not just the one or two marks per case displayed to the radiologist (the displayed marks are those whose ratings exceed a pre-selected threshold). At a minimum, location-level information, as in the LROC paradigm, should be used; ideally, the FROC paradigm should be used. Using the ROC paradigm incurs a severe statistical power penalty. See the Standalone CAD vs Radiologists chapter, available via the download link at https://github.com/dpc10ster/RJafrocBook/blob/gh-pages/RJafrocBook.pdf

StCadVsRad(
  dataset,
  FOM,
  FPFValue = 0.2,
  method = "1T-RRRC",
  alpha = 0.05,
  plots = FALSE
)

Arguments

dataset

The dataset to be analyzed; it must be a single-modality dataset with at least three readers, where the first reader is CAD.

FOM

The desired FOM: for ROC data it must be "Wilcoxon"; for FROC data it can be any valid FOM, e.g., "HrAuc", "wAFROC", etc.; for LROC data it must be "Wilcoxon", "PCL" or "ALROC".

FPFValue

Only needed for LROC data with FOM = "PCL" or "ALROC"; the FPF value at which the partial-curve-based figure of merit is evaluated. The default is 0.2.

method

The desired analysis: "1T-RRFC", "1T-RRRC" (the default) or "2T-RRRC"; see manuscript for details.

alpha

Significance level of the test, defaults to 0.05.

plots

Flag, default is FALSE, i.e., a plot is not displayed. If TRUE, it displays the appropriate operating characteristic for all readers and CAD.
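For orientation, the "Wilcoxon" FOM is the empirical area under the ROC curve, which can be computed directly from ratings. The base-R sketch below is illustrative only: the function name wilcoxon_fom and the ratings are hypothetical, and this is not the RJafroc implementation.

```r
# Hypothetical helper (not part of RJafroc): Wilcoxon statistic, i.e. empirical ROC AUC.
wilcoxon_fom <- function(non_diseased, diseased) {
  # Score every (diseased, non-diseased) pair:
  # 1 if the diseased rating is higher, 0.5 if tied, 0 otherwise.
  scores <- outer(diseased, non_diseased,
                  function(d, n) (d > n) + 0.5 * (d == n))
  mean(scores)
}

# Made-up ratings on a 5-point scale.
non_diseased <- c(1, 2, 2, 3)
diseased <- c(2, 3, 4, 5)
fom <- wilcoxon_fom(non_diseased, diseased)
print(fom)  # 0.84375
```

Ties contribute one half, so perfectly separated ratings give 1 and identical rating distributions give 0.5.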

Value

If method = "1T-RRRC" the return value is a list with the following elements:

fomCAD

The observed FOM for CAD.

fomRAD

The observed FOM array for the readers.

avgRadFom

The average FOM of the readers.

avgDiffFom

The mean of the difference FOM, RAD - CAD.

ciAvgDiffFom

The 95-percent CI of the average difference, RAD - CAD.

varR

The variance of the radiologists.

varError

The variance of the error term in the single-modality multiple-reader OR model.

cov2

The covariance of the error term.

tstat

The observed value of the t-statistic; its square is equivalent to an F-statistic.

df

The degrees of freedom of the t-statistic.

pval

The p-value for rejecting the null hypothesis (NH).

Plots

If argument plots = TRUE, a ggplot object containing empirical operating characteristics corresponding to specified FOM. For example, if FOM = "Wilcoxon" an ROC plot object is produced where reader 1 is CAD. If an LROC FOM is selected, an LROC plot is displayed.

If method = "2T-RRRC" the return value is a list with the following elements:

fomCAD

The observed FOM for CAD.

fomRAD

The observed FOM array for the readers.

avgRadFom

The average FOM of the readers.

avgDiffFom

The mean of the difference FOM, RAD - CAD.

ciDiffFom

A data frame containing the statistics associated with the average difference, RAD - CAD.

ciAvgRdrEachTrt

A data frame containing the statistics associated with the average FOM in each "modality".

varR

The variance of the pure reader term in the OR model.

varTR

The variance of the modality-reader interaction term in the OR model.

cov1

Cov1 of the error term: same reader, different modalities.

cov2

Cov2 of the error term: different readers, same modality.

cov3

Cov3 of the error term: different readers, different modalities.

varError

The variance of the pure error term in the OR model.

FStat

The observed value of the F-statistic.

ndf

The numerator degrees of freedom of the F-statistic.

df

The denominator degrees of freedom of the F-statistic.

pval

The p-value for rejecting the NH.

Plots

See above.
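The relation between a t-statistic and an F-statistic noted above can be checked numerically. The following base-R sketch assumes a two-sided t-test and uses made-up values for the statistic and its degrees of freedom; it illustrates the general identity, not the internals of StCadVsRad.

```r
# Illustrative values only (hypothetical, not taken from a specific analysis).
t_stat <- 2.5
df <- 8

# Two-sided p-value from the t distribution with df degrees of freedom.
p_from_t <- 2 * pt(-abs(t_stat), df)

# p-value of the squared statistic from the F distribution with (1, df) degrees of freedom.
p_from_f <- pf(t_stat^2, df1 = 1, df2 = df, lower.tail = FALSE)

# The two p-values agree up to floating-point error.
same_p <- isTRUE(all.equal(p_from_t, p_from_f))
print(same_p)
```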

Details

  • PCL is the probability of a correct localization.

  • The LROC is the plot of PCL (ordinate) vs. FPF.

  • For LROC data, FOM = "PCL" means the interpolated PCL value at the specified FPFValue.

  • For FOM = "ALROC" the trapezoidal area under the LROC from FPF = 0 to FPF = FPFValue is used.

  • If method = "1T-RRRC" the first reader is assumed to be CAD.

  • If method = "2T-RRRC" the first modality is assumed to be CAD.

  • The null hypothesis (NH) is that the FOM of CAD equals the average FOM of the readers.

  • The method = "1T-RRRC" analysis uses an adaptation of the single-modality multiple-reader Obuchowski-Rockette (OR) model described in Hillis (2007), Section 5.3. The model is characterized by three parameters, VarR, Var and Cov2; the latter two are estimated using the jackknife.

  • For method = "2T-RRRC" the analysis replicates the CAD data as many times as necessary to form one "modality" of an MRMC pairing, the other "modality" being the radiologists. Standard ORH analysis is then applied. The method is described in Kooi et al. (2016). It gives exactly the same final results (F-statistic, denominator degrees of freedom and p-value) as "1T-RRRC", but the intermediate quantities are meaningless.
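The "PCL" and "ALROC" definitions above can be sketched in base R. The operating points and the helper names pcl_at and alroc_upto are hypothetical, chosen for illustration; the RJafroc implementation may differ in detail.

```r
# Hypothetical empirical LROC operating points (FPF, PCL); not from any RJafroc dataset.
fpf <- c(0, 0.1, 0.3, 1)
pcl <- c(0, 0.4, 0.6, 0.7)

# PCL at a given FPF value, by linear interpolation between operating points.
pcl_at <- function(fpf_value) approx(fpf, pcl, xout = fpf_value)$y

# Trapezoidal area under the LROC from FPF = 0 to FPF = fpf_value.
alroc_upto <- function(fpf_value) {
  keep <- fpf < fpf_value
  x <- c(fpf[keep], fpf_value)          # points left of the cut, plus the cut itself
  y <- c(pcl[keep], pcl_at(fpf_value))  # interpolated PCL closes the last trapezoid
  sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)
}

pcl_value <- pcl_at(0.2)        # 0.5
alroc_value <- alroc_upto(0.2)  # 0.065
print(c(PCL = pcl_value, ALROC = alroc_value))
```

With these points, the interpolated PCL at FPF = 0.2 lies midway between 0.4 and 0.6, and the partial area is the sum of two trapezoids, 0.02 + 0.045.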

References

Hillis SL (2007) A comparison of denominator degrees of freedom methods for multiple observer ROC studies, Statistics in Medicine. 26:596-619.

Chakraborty DP (2017) Observer Performance Methods for Diagnostic Imaging - Foundations, Modeling, and Applications with R-Based Examples, CRC Press, Boca Raton, FL. https://www.routledge.com/Observer-Performance-Methods-for-Diagnostic-Imaging-Foundations-Modeling/Chakraborty/p/book/9781482214840

Hupse R, Samulski M, Lobbes M, et al (2013) Standalone computer-aided detection compared to radiologists' performance for the detection of mammographic masses, Eur Radiol. 23(1):93-100.

Kooi T, Gubern-Merida A, et al. (2016) A comparison between a deep convolutional neural network and radiologists for classifying regions of interest in mammography. Paper presented at: International Workshop on Digital Mammography, Malmo, Sweden.

Examples

ret1M <- StCadVsRad(dataset09,
                    FOM = "Wilcoxon", method = "1T-RRRC")

StCadVsRad(datasetCadLroc,
           FOM = "Wilcoxon", method = "1T-RRFC")
#> $fomCAD
#> [1] 0.8169271
#> 
#> $fomRAD
#> [1] 0.8415625 0.8411979 0.8997396 0.8381250 0.8563542 0.8786979 0.8583854
#> [8] 0.7970312 0.8268750
#> 
#> $avgRadFom
#> [1] 0.8486632
#> 
#> $CIAvgRadFom
#> [1] 0.8258894 0.8714370
#> 
#> $avgDiffFom
#> [1] 0.03173611
#> 
#> $CIAvgDiffFom
#> [1] 0.008962347 0.054509875
#> 
#> $varR
#> [1] 0.0008777927
#> 
#> $Tstat
#> [1] 3.213505
#> 
#> $df
#> [1] 8
#> 
#> $pval
#> [1] 0.01235909
#> 

retLroc1M <- StCadVsRad(datasetCadLroc,
                        FOM = "PCL", method = "1T-RRRC", FPFValue = 0.05)

## test with fewer readers
dataset09a <- DfExtractDataset(dataset09, rdrs = 1:7)
ret1M7 <- StCadVsRad(dataset09a,
                     FOM = "Wilcoxon", method = "1T-RRRC")

datasetCadLroc7 <- DfExtractDataset(datasetCadLroc, rdrs = 1:7)
ret1MLroc7 <- StCadVsRad(datasetCadLroc7,
                         FOM = "PCL", method = "1T-RRRC", FPFValue = 0.05)

# \donttest{
## takes longer than 5 sec on OSX
## retLroc2M <- StCadVsRad(datasetCadLroc,
##   FOM = "PCL", method = "2T-RRRC", FPFValue = 0.05)

## ret2MLroc7 <- StCadVsRad(datasetCadLroc7,
##   FOM = "PCL", method = "2T-RRRC", FPFValue = 0.05)
# }
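
As a cross-check, the numbers printed for the "1T-RRFC" example above are consistent with a one-sample t-test of the reader FOMs against the CAD FOM. The base-R sketch below uses only the rounded values copied from that printed output; it is an independent check under that assumption, not a description of how StCadVsRad is implemented.

```r
# FOM values copied from the printed 1T-RRFC output (rounded to 7 digits).
fomCAD <- 0.8169271
fomRAD <- c(0.8415625, 0.8411979, 0.8997396, 0.8381250, 0.8563542,
            0.8786979, 0.8583854, 0.7970312, 0.8268750)

# One-sample t-test: readers treated as random, CAD FOM treated as a fixed value.
tt <- t.test(fomRAD, mu = fomCAD, conf.level = 0.95)

avg_diff <- mean(fomRAD) - fomCAD          # compare with $avgDiffFom
var_r <- var(fomRAD)                       # compare with $varR
t_stat <- unname(tt$statistic)             # compare with $Tstat
df <- unname(tt$parameter)                 # compare with $df
p_val <- tt$p.value                        # compare with $pval
ci_diff <- as.numeric(tt$conf.int) - fomCAD  # compare with $CIAvgDiffFom

print(list(avgDiff = avg_diff, varR = var_r, t = t_stat, df = df,
           p = p_val, ciDiff = ci_diff))
```

Small discrepancies in the last printed digits are expected because the inputs are rounded.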