DfBinDataset.Rd
Bins continuous (i.e., floating point) or quasi-continuous (e.g., integers 0-100) ratings in a dataset and returns the corresponding binned dataset, in which the ratings are integers 1, 2, ..., with higher values representing greater confidence in the presence of disease.
DfBinDataset(dataset, desiredNumBins = 7, opChType)
The dataset to be binned, with structure as in RJafroc-package.
The desired number of bins. The default is 7.
The operating characteristic relevant to the binning operation: "ROC", "FROC", "AFROC", or "wAFROC".
The binned dataset
For small datasets the number of bins may be smaller than desiredNumBins.
The algorithm needs to know the type of operating characteristic
relevant to the binning operation. For ROC the bins are FP and TP counts, for
FROC the bins are NL and LL counts, for AFROC the bins are FP and LL counts,
and for wAFROC the bins are FP and wLL counts. Binning is generally
employed prior to fitting a statistical model to the data, e.g., by maximum likelihood.
This version chooses cutoffs so as to maximize empirical AUC (this yields a
unique choice of cutoffs which gives the reader the maximum deserved credit).
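The idea of choosing cutoffs to maximize empirical AUC can be illustrated with a minimal base R sketch for the ROC case. This is not the RJafroc implementation: the helper names `wilcoxon_auc` and `bin_ratings`, the simulated ratings, and the brute-force search over midpoints are all assumptions made for illustration only.

```r
# Illustrative sketch only; these helpers are hypothetical, not part of RJafroc.

# Empirical (Wilcoxon) AUC: P(pos > neg) + 0.5 * P(pos == neg)
wilcoxon_auc <- function(neg, pos) {
  d <- outer(pos, neg, FUN = "-")
  mean((d > 0) + 0.5 * (d == 0))
}

# Map ratings to integer bins 1, 2, ...; higher rating gives a higher bin
bin_ratings <- function(x, cutoffs) {
  findInterval(x, sort(cutoffs)) + 1L
}

set.seed(1)
neg <- rnorm(20)              # simulated non-diseased ratings (assumption)
pos <- rnorm(25, mean = 1.5)  # simulated diseased ratings (assumption)

# Candidate cutoffs: midpoints between adjacent distinct pooled ratings
pooled <- sort(unique(c(neg, pos)))
candidates <- (pooled[-length(pooled)] + pooled[-1]) / 2

# Brute-force search over every pair of cutoffs, i.e. 3 bins
pairs <- combn(length(candidates), 2)
aucs <- apply(pairs, 2, function(idx) {
  ct <- candidates[idx]
  wilcoxon_auc(bin_ratings(neg, ct), bin_ratings(pos, ct))
})

best_cutoffs <- candidates[pairs[, which.max(aucs)]]
auc_binned <- max(aucs)          # AUC of the best 3-bin dataset
auc_raw <- wilcoxon_auc(neg, pos)  # AUC of the unbinned ratings
```

Comparing `auc_raw` with `auc_binned` shows how much empirical AUC changes when the ratings are reduced to three bins. An exhaustive search like this is only practical for a small number of bins and cases; it is shown here because it makes the selection criterion explicit.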
Miller GA (1956) The Magical Number Seven, Plus or Minus Two: Some Limits on Our Capacity for Processing Information, The Psychological Review 63, 81-97.
Chakraborty DP (2017) Observer Performance Methods for Diagnostic Imaging - Foundations, Modeling, and Applications with R-Based Examples, CRC Press, Boca Raton, FL. https://www.routledge.com/Observer-Performance-Methods-for-Diagnostic-Imaging-Foundations-Modeling/Chakraborty/p/book/9781482214840
# \donttest{
binned <- DfBinDataset(dataset02, desiredNumBins = 3, opChType = "ROC")
binned <- DfBinDataset(dataset05, desiredNumBins = 4, opChType = "ROC")
binned <- DfBinDataset(dataset05, desiredNumBins = 4, opChType = "AFROC")
binned <- DfBinDataset(dataset05, desiredNumBins = 4, opChType = "wAFROC")
binned <- DfBinDataset(dataset05, opChType = "wAFROC", desiredNumBins = 1)
binned <- DfBinDataset(dataset05, opChType = "wAFROC", desiredNumBins = 2)
binned <- DfBinDataset(dataset05, opChType = "wAFROC", desiredNumBins = 3)
## etc.
# }
# \donttest{
## takes longer than 5 sec on OSX
dataset <- SimulateRocDataset(I = 2, J = 5, K1 = 50, K2 = 70, a = 1, b = 0.5, seed = 123)
datasetB <- DfBinDataset(dataset, desiredNumBins = 7, opChType = "ROC")
fomOrg <- as.matrix(UtilFigureOfMerit(dataset, FOM = "Wilcoxon"))
##print(fomOrg)
fomBinned <- as.matrix(UtilFigureOfMerit(datasetB, FOM = "Wilcoxon"))
##print(fomBinned)
##cat("mean, sd = ", mean(fomOrg), sd(fomOrg), "\n")
##cat("mean, sd = ", mean(fomBinned), sd(fomBinned), "\n")
# }