Ch00Vig4DataFormatFrocSp.Rmd
## `Truth` worksheet

* The `Truth` worksheet contains 6 columns: `CaseID`, `LesionID`, `Weight`, `ReaderID`, `ModalityID` and `Paradigm`.
* Each row with a non-zero `LesionID` corresponds to a lesion.
* `CaseID`: unique integers, one per case, representing the cases in the dataset.
* `LesionID`: integers 0, 1, 2, etc., with each 0 representing a non-diseased case, 1 representing the first lesion on a diseased case, 2 representing the second lesion on a diseased case, if present, and so on.
* `ReaderID`: in this example, the non-diseased cases interpreted by the first reader, with `ReaderID` value `0`, are labeled `1`, `2`, `3`, while the diseased cases interpreted by this reader are labeled `70`, `71`, `72`, `73` and `74`, with `LesionID` values ranging from 1 to 3.
* The second reader, with `ReaderID` value `1`, interprets three non-diseased cases labeled `4`, `5` and `6`, each with `LesionID` value `0`, and five diseased cases labeled `80`, `81`, `82`, `83` and `84`, with `LesionID` values ranging from 1 to 3.
* The third reader, with `ReaderID` value `2`, interprets three non-diseased cases labeled `7`, `8` and `9`, each with `LesionID` value `0`, and five diseased cases labeled `90`, `91`, `92`, `93` and `94`, with `LesionID` values ranging from 1 to 3.
* `Weight`: floating point values that add up to unity for each diseased case, as required for FROC data.
* `ModalityID`: a comma-separated listing of modalities, each represented by a unique integer. In this example each cell has the value `0, 1`. Each cell has to be text formatted.
* `Paradigm`: in this example, the contents are `FROC` and `split-plot`.

The example described above corresponds to the Excel file `inst/extdata/toyFiles/FROC/frocSp.xlsx` in the project directory.
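The column rules above can be checked on a small data frame. The following is a minimal sketch in base R, not a copy of `frocSp.xlsx`: the rows cover only part of the first reader's cases, and the specific weights, the zero weight on non-diseased cases and the placement of the `Paradigm` cells are assumptions made for illustration.

```r
# Illustrative subset of a split-plot FROC Truth worksheet.
# All values are assumptions for this sketch, not the contents of frocSp.xlsx.
truth <- data.frame(
  CaseID     = c(1, 2, 3, 70, 70, 71),
  LesionID   = c(0, 0, 0, 1, 2, 1),
  Weight     = c(0, 0, 0, 0.3, 0.7, 1),   # assumed: 0 on non-diseased cases
  ReaderID   = rep("0", 6),               # text cell: the interpreting reader
  ModalityID = rep("0, 1", 6),            # text cell: comma-separated modalities
  Paradigm   = c("FROC", "split-plot", NA, NA, NA, NA),  # assumed placement
  stringsAsFactors = FALSE
)

# Rows with a non-zero LesionID describe lesions on diseased cases
lesionRows <- truth[truth$LesionID > 0, ]

# Weights must add up to unity within each diseased case
weightSums <- tapply(lesionRows$Weight, lesionRows$CaseID, sum)
weightsOK <- all(abs(weightSums - 1) < 1e-6)

# Case counts follow from the number of 0's and 1's in LesionID
nNonDiseased <- sum(truth$LesionID == 0)
nDiseased <- sum(truth$LesionID == 1)
```

Counting the 1's rather than all non-zero `LesionID` values avoids counting a diseased case once per lesion.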
```r
library(RJafroc)

frocSp <- system.file("extdata", "toyFiles/FROC/frocSp.xlsx",
                      package = "RJafroc", mustWork = TRUE)
x <- DfReadDataFile(frocSp, newExcelFileFormat = TRUE)
str(x)
#> List of 3
#>  $ ratings     :List of 3
#>   ..$ NL   : num [1:2, 1:3, 1:24, 1:3] 1.02 2.89 -Inf -Inf -Inf ...
#>   ..$ LL   : num [1:2, 1:3, 1:15, 1:3] 5.28 5.2 -Inf -Inf -Inf ...
#>   ..$ LL_IL: logi NA
#>  $ lesions     :List of 3
#>   ..$ perCase: int [1:15] 2 1 3 2 1 2 1 3 2 1 ...
#>   ..$ IDs    : num [1:15, 1:3] 1 1 1 1 1 1 1 1 1 1 ...
#>   ..$ weights: num [1:15, 1:3] 0.3 1 0.333 0.1 1 ...
#>  $ descriptions:List of 7
#>   ..$ fileName     : logi NA
#>   ..$ type         : chr "FROC"
#>   ..$ name         : logi NA
#>   ..$ truthTableStr: num [1:2, 1:3, 1:24, 1:4] 1 1 NA NA NA NA 1 1 NA NA ...
#>   ..$ design       : chr "SPLIT-PLOT"
#>   ..$ modalityID   : Named chr [1:2] "0" "1"
#>   .. ..- attr(*, "names")= chr [1:2] "0" "1"
#>   ..$ readerID     : Named chr [1:3] "0" "1" "2"
#>   .. ..- attr(*, "names")= chr [1:3] "0" "1" "2"
```
* `newExcelFileFormat` must be set to `TRUE` for split-plot data.
* `x` is a `list` variable with 3 members: `ratings`, `lesions` and `descriptions`.
* The `x$descriptions$type` member is `"FROC"` and the `x$descriptions$design` member is `"SPLIT-PLOT"`.
* There are 15 diseased cases (the number of 1's in the `LesionID` column of the `Truth` worksheet) and 9 non-diseased cases (the number of 0's in the `LesionID` column).
* The `x$lesions$perCase` member is a vector of length 15 giving the number of lesions in each of the 15 diseased cases.
* The `x$lesions$IDs` member is a 15 x 3 array labeling the lesions in the dataset.
* The `x$lesions$weights` member is a 15 x 3 array holding the lesion weights.

```r
x$lesions$perCase
x$lesions$IDs
x$lesions$weights
```
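The `truthTableStr` array listed in the `str(x)` output records which modality, reader, case and lesion combinations were actually interpreted. The sketch below rebuilds an array of the same shape in base R from the split-plot assignment described for the `Truth` worksheet. It illustrates the layout and is not the package's implementation. The names `tts`, `readerOfCase` and `nLesions` are hypothetical. The first ten lesion counts are taken from the `perCase` values printed above, and the last five are an assumption that the pattern repeats.

```r
I <- 2                                    # modalities
J <- 3                                    # readers
caseIDs <- c(1:9, 70:74, 80:84, 90:94)    # 9 non-diseased + 15 diseased = 24
K <- length(caseIDs)
maxLesions <- 3

# Reader index (1-based) assigned to each case in the split-plot design
readerOfCase <- rep(c(1, 2, 3, 1, 2, 3), times = c(3, 3, 3, 5, 5, 5))

# Lesions per diseased case; last five values are assumed for illustration
nLesions <- rep(c(2, 1, 3, 2, 1), times = 3)

# NA means "no interpretation"; 1 means "interpretation exists"
tts <- array(NA, dim = c(I, J, K, maxLesions + 1))
for (k in seq_len(K)) {
  j <- readerOfCase[k]
  if (k <= 9) {
    tts[, j, k, 1] <- 1                   # normal case: lesionID 0 sits at index 1
  } else {
    tts[, j, k, 1 + seq_len(nLesions[k - 9])] <- 1   # lesionID l sits at index l + 1
  }
}

dim(tts)
tts[1, 1, 1, 1]     # first modality, first reader, first normal case
tts[1, 1, 4, 1]     # first reader did not interpret the fourth case
tts[1, 2, 4, 1]     # second reader did
tts[1, 1, 10, 3]    # first reader, tenth case, lesionID 2
```

Both modalities are filled for every assigned case because each `ModalityID` cell lists `0, 1`; only the reader dimension is split.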
* The `x$descriptions$truthTableStr` member is a `2 x 3 x 24 x 4` array, i.e., I x J x K x (maximum number of lesions per case plus 1). The `plus 1` is needed to accommodate normal cases with `lesionID` = 0.
* Each entry in this array is either `1`, meaning the corresponding interpretation exists, or `NA`, meaning the corresponding interpretation does not exist.
* `x$descriptions$truthTableStr[1,1,1,1]` is `1`. This means that an interpretation exists for the first treatment (`modalityID` = 0), first reader (`readerID` = 0) and first (normal) case, with `caseID` = 1 and `lesionID` = 0. This example corresponds to row 2 in the `Truth` worksheet.
* `x$descriptions$truthTableStr[1,1,4,1]` is `NA`, which means an interpretation does not exist for the first treatment, first reader and fourth (normal) case.
* `x$descriptions$truthTableStr[1,2,4,1]` is `1`, which means an interpretation does exist for the first treatment, second reader and fourth (normal) case. This example corresponds to row 5 in the `Truth` worksheet.
* `x$descriptions$truthTableStr[1,1,10,3]` is `1`, which means an interpretation does exist for the first treatment, first reader, tenth (abnormal) case and `lesionID` = 2. This example corresponds to row 12 in the `Truth` worksheet.
* `x$descriptions$truthTableStr` summarizes the structure of the data in the `Truth` worksheet.

The false positive (FP) ratings are found in the `FP` or `NL` worksheet, whose columns are described below.
* `ReaderID`: the reader labels: these must be `0`, `1` or `2`, as declared in the `Truth` worksheet.
* `ModalityID`: the modality labels: `0` or `1`, as declared in the `Truth` worksheet.
* `CaseID`: the labels of non-diseased cases. Each `CaseID`, `ModalityID`, `ReaderID` combination must be consistent with that declared in the `Truth` worksheet.
* `FP_Rating`: the floating point ratings of non-diseased cases. Each row of this worksheet yields a rating corresponding to the values of `ReaderID`, `ModalityID` and `CaseID` for that row. Each `CaseID`, `ModalityID`, `ReaderID` combination must be consistent with that declared in the `Truth` worksheet.

```r
x$ratings$NL[,1,1:9,1]
x$ratings$NL[,2,1:9,1]
x$ratings$NL[,3,1:9,1]
```
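In a split-plot design each reader interprets a disjoint set of cases, so within the first nine case positions of the NL array each reader's ratings occupy only that reader's block. A minimal base R sketch of that layout follows. The names `rated` and `caseReader` are hypothetical, and the logical matrix stands in for "a rating may be present" rather than holding actual ratings.

```r
readerIDs <- c("0", "1", "2")
nlCaseIDs <- 1:9                                   # non-diseased CaseIDs
caseReader <- rep(readerIDs, each = 3)             # reader assigned to each case

# rated[j, k] is TRUE when reader j interprets non-diseased case k
rated <- outer(readerIDs, caseReader, "==")
dimnames(rated) <- list(ReaderID = readerIDs,
                        CaseID = as.character(nlCaseIDs))

# Case positions that can hold ratings for each reader
blockReader0 <- unname(which(rated["0", ]))
blockReader1 <- unname(which(rated["1", ]))
blockReader2 <- unname(which(rated["2", ]))
```

Positions where `rated` is `FALSE` correspond to entries that hold `-Inf` in the ratings array.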
* `CaseID`s `1`, `2`, `3` (indexed 1, 2, 3 and appearing in the first three columns) are interpreted by the first reader (`ReaderID` `0`).
* `CaseID`s `4`, `5`, `6` (indexed 4, 5, 6 and appearing in the next three columns) are interpreted by the second reader (`ReaderID` `1`).
* `CaseID`s `7`, `8`, `9` (indexed 7, 8, 9 and appearing in the final three columns) are interpreted by the third reader (`ReaderID` `2`).
* Positions in `x$ratings$NL` that hold no rating are filled with `-Inf`.

The true positive (TP) ratings are found in the `TP` or `LL` worksheet, whose columns are described below.
* `ReaderID`: the reader labels: these must be `0`, `1` or `2`, as declared in the `Truth` worksheet.
* `ModalityID`: the modality labels: `0` or `1`, as declared in the `Truth` worksheet.
* `CaseID`: the labels of diseased cases. Each `CaseID`, `ModalityID`, `ReaderID` combination must be consistent with that declared in the `Truth` worksheet.
* `TP_Rating`: the floating point ratings of diseased cases. Each row of this worksheet yields a rating corresponding to the values of `ReaderID`, `ModalityID` and `CaseID` for that row. Each `CaseID`, `ModalityID`, `ReaderID` combination must be consistent with that declared in the `Truth` worksheet.

```r
x$ratings$LL[,1,1:15,1]
x$ratings$LL[,2,1:15,1]
x$ratings$LL[,3,1:15,1]
```
* `CaseID`s `70`, `71`, `72`, `73`, `74` (indexed 1, 2, 3, 4, 5 and appearing in the first five columns) are interpreted by the first reader (`ReaderID` `0`).
* `CaseID`s `80`, `81`, `82`, `83`, `84` (indexed 6, 7, 8, 9, 10 and appearing in the next five columns) are interpreted by the second reader (`ReaderID` `1`).
* `CaseID`s `90`, `91`, `92`, `93`, `94` (indexed 11, 12, 13, 14, 15 and appearing in the final five columns) are interpreted by the third reader (`ReaderID` `2`).
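The correspondence between diseased `CaseID` labels and positions along the case dimension of the LL array can be written out directly. The following base R sketch assumes, as the indexing above indicates, that diseased cases are ordered 70 to 74, 80 to 84, 90 to 94. The helper `readerOfIndex` is hypothetical and only encodes the five-cases-per-reader layout of this toy file.

```r
diseasedCaseIDs <- c(70:74, 80:84, 90:94)

# Position of a CaseID along the case dimension of the LL array
caseIndex <- match(c(70, 82, 94), diseasedCaseIDs)

# ReaderID (0, 1 or 2) that interprets the case at a given position,
# assuming five consecutive diseased cases per reader
readerOfIndex <- function(k) (k - 1) %/% 5

readers <- readerOfIndex(caseIndex)
```

Using `match()` keeps the lookup valid even though the `CaseID` labels are not consecutive integers.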