Bins continuous (i.e., floating point) or quasi-continuous (e.g., integers 0-100) ratings in a dataset and returns the corresponding binned dataset, in which the ratings are integers 1, 2, ..., with higher values representing greater confidence in the presence of disease.

DfBinDataset(dataset, desiredNumBins = 7, opChType)

Arguments

dataset

The dataset to be binned, with structure as in RJafroc-package.

desiredNumBins

The desired number of bins. The default is 7.

opChType

The operating characteristic relevant to the binning operation: "ROC", "FROC", "AFROC", or "wAFROC".

Value

The binned dataset, in which the ratings are integers 1, 2, ....

Details

For small datasets the number of bins may be smaller than desiredNumBins. The algorithm needs to know the type of operating characteristic relevant to the binning operation. For ROC the bins are FP and TP counts, for FROC the bins are NL and LL counts, for AFROC the bins are FP and LL counts, and for wAFROC the bins are FP and wLL counts.

Binning is generally employed prior to fitting a statistical model to the data, e.g., by maximum likelihood. This version chooses the cutoffs so as to maximize the empirical AUC; this yields a unique choice of cutoffs, which gives the reader the maximum deserved credit.
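The effect of binning on the empirical AUC can be illustrated with a minimal base-R sketch. It is not the RJafroc implementation: the helper names `wilcoxon_auc` and `bin_ratings`, the simulated ratings, and the quantile-based cutoffs are all assumptions made for illustration. In particular, the cutoffs here are chosen arbitrarily and are not optimized to maximize the empirical AUC.

```r
# Hypothetical illustration in base R; NOT the algorithm used by DfBinDataset.
set.seed(1)
fp <- rnorm(50)              # simulated ratings for non-diseased cases
tp <- rnorm(70, mean = 1.5)  # simulated ratings for diseased cases

# Empirical (Wilcoxon) AUC: P(tp > fp) + 0.5 * P(tp == fp)
wilcoxon_auc <- function(fp, tp) {
  mean(outer(tp, fp, ">") + 0.5 * outer(tp, fp, "=="))
}

# Map ratings to integer bins 1, 2, ..., length(cutoffs) + 1
bin_ratings <- function(x, cutoffs) {
  findInterval(x, sort(unname(cutoffs))) + 1L
}

# Arbitrary cutoffs (quartiles of the pooled ratings), giving 4 bins
cutoffs <- quantile(c(fp, tp), probs = c(0.25, 0.5, 0.75))
fp_b <- bin_ratings(fp, cutoffs)
tp_b <- bin_ratings(tp, cutoffs)

auc_raw    <- wilcoxon_auc(fp, tp)      # AUC from the continuous ratings
auc_binned <- wilcoxon_auc(fp_b, tp_b)  # AUC from the binned ratings
```

Because binning preserves the ordering of the ratings but introduces ties, `auc_binned` generally differs from `auc_raw`, and the size of the difference depends on where the cutoffs are placed.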

References

Miller GA (1956) The Magical Number Seven, Plus or Minus Two: Some limits on our capacity for processing information, The Psychological Review 63, 81-97

Chakraborty DP (2017) Observer Performance Methods for Diagnostic Imaging - Foundations, Modeling, and Applications with R-Based Examples, CRC Press, Boca Raton, FL. https://www.routledge.com/Observer-Performance-Methods-for-Diagnostic-Imaging-Foundations-Modeling/Chakraborty/p/book/9781482214840

Examples

# \donttest{
binned <- DfBinDataset(dataset02, desiredNumBins = 3, opChType = "ROC")
binned <- DfBinDataset(dataset05, desiredNumBins = 4, opChType = "ROC")
binned <- DfBinDataset(dataset05, desiredNumBins = 4, opChType = "AFROC")
binned <- DfBinDataset(dataset05, desiredNumBins = 4, opChType = "wAFROC")
binned <- DfBinDataset(dataset05, opChType = "wAFROC", desiredNumBins = 1)
binned <- DfBinDataset(dataset05, opChType = "wAFROC", desiredNumBins = 2)
binned <- DfBinDataset(dataset05, opChType = "wAFROC", desiredNumBins = 3)
## etc.
# }
 
# \donttest{
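## takes longer than 5 sec on OSX
## Simulate an ROC dataset, bin it into 7 bins, then compute the Wilcoxon
## figure of merit for the original and the binned datasets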
dataset <- SimulateRocDataset(I = 2, J = 5, K1 = 50, K2 = 70, a = 1, b = 0.5, seed = 123)
datasetB <- DfBinDataset(dataset, desiredNumBins = 7, opChType = "ROC")
fomOrg <- as.matrix(UtilFigureOfMerit(dataset, FOM = "Wilcoxon"))
##print(fomOrg)
fomBinned <- as.matrix(UtilFigureOfMerit(datasetB, FOM = "Wilcoxon"))
##print(fomBinned)
##cat("mean, sd = ", mean(fomOrg), sd(fomOrg), "\n")
##cat("mean, sd = ", mean(fomBinned), sd(fomBinned), "\n")
# }