Ch00Vig3DataFormatRocSp.Rmd
## The `Truth` worksheet

* The `Truth` worksheet contains 6 columns: `CaseID`, `LesionID`, `Weight`, `ReaderID`, `ModalityID` and `Paradigm`.
* `CaseID`: unique integers, one per case, representing the cases in the dataset.
* `LesionID`: integer `0` for non-diseased cases and `1` for diseased cases.
* The non-diseased cases interpreted by the first reader, with `ReaderID` value `1`, are labeled `6`, `7`, `8`, `9` and `10`, each with `LesionID` value `0`, while the diseased cases interpreted by this reader are labeled `16`, `17`, `18`, `19` and `20`, each with `LesionID` value `1`. Note that the `ReaderID` for the above cases has the single value `1`, unlike the crossed design where all readers interpret all cases.
* The second reader, with `ReaderID` value `4`, interprets five non-diseased cases labeled `21`, `22`, `23`, `24` and `25`, each with `LesionID` value `0`, and five diseased cases labeled `36`, `37`, `38`, `39` and `40`, each with `LesionID` value `1`.
* The third reader, with `ReaderID` value `5`, interprets five non-diseased cases labeled `46`, `47`, `48`, `49` and `50`, each with `LesionID` value `0`, and five diseased cases labeled `51`, `52`, `53`, `54` and `55`, each with `LesionID` value `1`.
* `Weight`: floating point value 0; this is not used for ROC data.
* `ModalityID`: a comma-separated listing of modalities, each represented by a unique integer. In this example each cell has the value `1, 2`. Each cell has to be text formatted.
* `Paradigm`: in this example, the contents are `ROC` and `split-plot`.
* The example described above corresponds to Excel file `inst/extdata/toyFiles/ROC/rocSp.xlsx` in the project directory.
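The worksheet layout described above can be sketched in base R. This is only an illustration under stated assumptions: the data frame `truth` and the helper names are hypothetical, the row order is assumed, the `Paradigm` column is omitted, and nothing here is read from the actual Excel file.

```r
# Hypothetical reconstruction of the Truth worksheet from the case labels
# listed in the text (not read from rocSp.xlsx; row order is an assumption).
nd <- c(6:10, 21:25, 46:50)    # non-diseased CaseIDs for readers 1, 4, 5
d  <- c(16:20, 36:40, 51:55)   # diseased CaseIDs for readers 1, 4, 5

truth <- data.frame(
  CaseID     = c(nd, d),
  LesionID   = rep(c(0L, 1L), each = 15),
  Weight     = 0,                                    # unused for ROC data
  ReaderID   = rep(rep(c("1", "4", "5"), each = 5), times = 2),
  ModalityID = "1, 2",                               # text-formatted cell
  stringsAsFactors = FALSE
)

# Split-plot property: every case is interpreted by exactly one reader.
readers_per_case <- tapply(truth$ReaderID, truth$CaseID,
                           function(r) length(unique(r)))

# Each reader sees 5 non-diseased and 5 diseased cases.
cases_per_reader <- table(truth$ReaderID, truth$LesionID)

# A ModalityID cell is a comma-separated list; split it into labels.
modalities_in_cell <- trimws(strsplit(truth$ModalityID[1], ",")[[1]])
```

In a crossed design the same check would fail, because each `CaseID` would appear with every `ReaderID`.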
```r
library(RJafroc)  # provides DfReadDataFile() and the toy Excel files

rocSp <- system.file("extdata", "toyFiles/ROC/rocSp.xlsx",
                     package = "RJafroc", mustWork = TRUE)
x <- DfReadDataFile(rocSp, newExcelFileFormat = TRUE)
str(x)
#> List of 3
#>  $ ratings     :List of 3
#>   ..$ NL   : num [1:2, 1:3, 1:30, 1] 1 1 -Inf -Inf -Inf ...
#>   ..$ LL   : num [1:2, 1:3, 1:15, 1] 5 2.3 -Inf -Inf -Inf ...
#>   ..$ LL_IL: logi NA
#>  $ lesions     :List of 3
#>   ..$ perCase: int [1:15] 1 1 1 1 1 1 1 1 1 1 ...
#>   ..$ IDs    : num [1:15, 1] 1 1 1 1 1 1 1 1 1 1 ...
#>   ..$ weights: num [1:15, 1] 1 1 1 1 1 1 1 1 1 1 ...
#>  $ descriptions:List of 7
#>   ..$ fileName     : logi NA
#>   ..$ type         : chr "ROC"
#>   ..$ name         : logi NA
#>   ..$ truthTableStr: num [1:2, 1:3, 1:30, 1:2] 1 1 NA NA NA NA 1 1 NA NA ...
#>   ..$ design       : chr "SPLIT-PLOT"
#>   ..$ modalityID   : Named chr [1:2] "1" "2"
#>   .. ..- attr(*, "names")= chr [1:2] "1" "2"
#>   ..$ readerID     : Named chr [1:3] "1" "4" "5"
#>   .. ..- attr(*, "names")= chr [1:3] "1" "4" "5"
```
* `newExcelFileFormat` must be set to `TRUE` for split-plot data.
* The dataset object `x` is a `list` variable with 3 members: `ratings`, `lesions` and `descriptions`.
* There are 15 diseased cases in the dataset (the number of 1's in the `LesionID` column of the `Truth` worksheet) and 15 non-diseased cases (the number of 0's in the `LesionID` column).
* The `x$ratings$NL` member, with dimension [2, 3, 30, 1], contains the ratings of normal cases. The extra values in the third dimension, filled with `-Inf`, are needed for compatibility with FROC datasets.
* The `x$ratings$LL` member, with dimension [2, 3, 15, 1], contains the ratings of abnormal cases.
* The `x$lesions$perCase` member is a vector with 15 ones representing the 15 diseased cases in the dataset.
* The `x$lesions$IDs` member is an array with 15 ones (this member is needed for compatibility with FROC datasets).
* The `x$lesions$weights` member is an array with 15 ones (this member is needed for compatibility with FROC datasets).
* The `x$descriptions$type` member is `"ROC"`, which specifies the data collection method ("ROC", "FROC", "LROC" or "ROI").
* The `x$descriptions$modalityID` member is a vector with two elements `"1"` and `"2"`, naming the two modalities.
* The `x$descriptions$readerID` member is a vector with three elements `"1"`, `"4"` and `"5"`, naming the three readers.
* The `x$descriptions$design` member is `"SPLIT-PLOT"`; it specifies the dataset design, which can be either "CROSSED" or "SPLIT-PLOT".
* The `x$normalCases` member lists the names of the normal cases.
* The `x$abnormalCases` member lists the names of the abnormal cases.
* The `x$descriptions$truthTableStr` member quantifies the structure of the dataset, as explained next. It is used in the `DfReadDataFile()` function to check for data entry errors.
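## The `truthTableStr` member

* This is a 2 x 3 x 30 x 2 array, i.e., I x J x K x (maximum number of lesions per case plus 1), where I, J and K are the numbers of modalities, readers and cases. The `plus 1` is needed to accommodate normal cases with `lesionID` = 0.
* Each entry in this array is either `1`, meaning the corresponding interpretation exists, or `NA`, meaning the corresponding interpretation does not exist.
* For example, `x$descriptions$truthTableStr[1,1,1,1]` is `1`. This means that an interpretation exists for the first treatment (`modalityID` = 1), first reader (`readerID` = 1) and first (normal) case (`caseID` = 6 and `lesionID` = 0). This example corresponds to row 2 in the `TRUTH` worksheet.

```r
x$descriptions$truthTableStr[,1,1:15,1]
```

* The entries selected next are `NA` because normal cases correspond to `lesionID` = 0, which occupies index 1, not index 2, of the fourth dimension.

```r
x$descriptions$truthTableStr[,1,1:15,2]
```

```r
# second and third readers, normal cases
x$descriptions$truthTableStr[,2,1:15,1]
x$descriptions$truthTableStr[,3,1:15,1]
# first reader, abnormal cases
x$descriptions$truthTableStr[,1,16:30,2]
```

* The entries selected next are `NA` because abnormal cases correspond to `lesionID` = 1, which occupies index 2, not index 1, of the fourth dimension.

```r
x$descriptions$truthTableStr[,1,16:30,1]
```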
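The indexing rules above can be illustrated with a small base R sketch that builds an array of the same shape from the case labels given for the `Truth` worksheet. It is an assumption-laden illustration, not the code `DfReadDataFile()` uses: the names `tts`, `caseID` and `readerOfCase` are hypothetical, and cases are assumed to be ordered with the 15 normal cases first.

```r
# Sketch of a 2 x 3 x 30 x 2 truth-table array (hypothetical names; this is
# not RJafroc's internal implementation).
readers  <- c("1", "4", "5")
caseID   <- c(6:10, 21:25, 46:50,     # normal cases, case indices 1..15
              16:20, 36:40, 51:55)    # abnormal cases, case indices 16..30
lesionID <- rep(c(0L, 1L), each = 15)
readerOfCase <- rep(rep(readers, each = 5), times = 2)

tts <- array(NA_real_, dim = c(2, 3, 30, 2))
for (k in seq_along(caseID)) {
  j <- match(readerOfCase[k], readers)
  # both modalities are interpreted; the 4th index is lesionID + 1
  tts[, j, k, lesionID[k] + 1] <- 1
}

tts[1, 1, 1, 1]                  # interpretation exists: modality 1, reader 1, caseID 6
all(is.na(tts[, 1, 1:15, 2]))    # normal cases never use the lesionID = 1 slot
all(is.na(tts[, 1, 16:30, 1]))   # abnormal cases never use the lesionID = 0 slot
```

Under these assumptions the first ten elements of `tts` in storage order are `1 1 NA NA NA NA 1 1 NA NA`, the same leading pattern that `str(x)` reports for `truthTableStr`.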
The ratings of non-diseased cases are found in the `FP` or `NL` worksheet, whose columns are described below.

* `ReaderID`: the reader labels: these must be from `1`, `4` or `5`, as declared in the `Truth` worksheet.
* `ModalityID`: the modality labels: `1` or `2`, as declared in the `Truth` worksheet.
* `CaseID`: the labels of non-diseased cases. Each `CaseID` - `ReaderID` combination must be consistent with that declared in the `Truth` worksheet.
* `NL_Rating`: the floating point ratings of non-diseased cases. Each row of this worksheet yields a rating corresponding to the values of `ReaderID`, `ModalityID` and `CaseID` for that row.

```r
x$ratings$NL[,1,1:15,1]
x$ratings$NL[,2,1:15,1]
x$ratings$NL[,3,1:15,1]
```

* The first array holds the ratings of the non-diseased cases with `CaseID`s `6,7,8,9,10` (indexed 1, 2, 3, 4, 5 and appearing in the first five columns) interpreted by the first reader (`ReaderID` 1).
* The second array holds the ratings of the non-diseased cases with `CaseID`s `21,22,23,24,25` (indexed 6, 7, 8, 9, 10 and appearing in the next five columns) interpreted by the second reader (`ReaderID` 4).
* The third array holds the ratings of the non-diseased cases with `CaseID`s `46,47,48,49,50` (indexed 11, 12, 13, 14, 15 and appearing in the final five columns) interpreted by the third reader (`ReaderID` 5).
* The values `x$ratings$NL[,,16:30,1]`, which are there for compatibility with FROC data, are all filled with `-Inf`.

The ratings of diseased cases are found in the `TP` or `LL` worksheet, whose columns are described below.
* `ReaderID`: the reader labels: these must be from `1`, `4` or `5`, as declared in the `Truth` worksheet.
* `ModalityID`: the modality labels: `1` or `2`, as declared in the `Truth` worksheet.
* `CaseID`: the labels of diseased cases. Each `CaseID` - `ReaderID` combination must be consistent with that declared in the `Truth` worksheet.
* `LL_Rating`: the floating point ratings of diseased cases. Each row of this worksheet yields a rating corresponding to the values of `ReaderID`, `ModalityID` and `CaseID` for that row.

```r
x$ratings$LL[,1,1:15,1]
x$ratings$LL[,2,1:15,1]
x$ratings$LL[,3,1:15,1]
```

* The first array holds the ratings of the diseased cases with `CaseID`s `16,17,18,19,20` (indexed 1, 2, 3, 4, 5 and appearing in the first five columns) interpreted by the first reader (`ReaderID` 1).
* The second array holds the ratings of the diseased cases with `CaseID`s `36,37,38,39,40` (indexed 6, 7, 8, 9, 10 and appearing in the next five columns) interpreted by the second reader (`ReaderID` 4).
* The third array holds the ratings of the diseased cases with `CaseID`s `51,52,53,54,55` (indexed 11, 12, 13, 14, 15 and appearing in the final five columns) interpreted by the third reader (`ReaderID` 5).
* The first dimension of the `x$ratings$NL` or `x$ratings$LL` list members is the total number of modalities, 2 in the current example.
* The second dimension of the `x$ratings$NL` or `x$ratings$LL` list members is the total number of readers, 3 in the current example.
* The third dimension of `x$ratings$NL` is the total number of cases, 30 in the current example. The first 15 positions hold the ratings of non-diseased cases; the remaining 15 positions, which in FROC data account for `NL` marks on diseased cases, are filled with `-Inf`.
* The third dimension of `x$ratings$LL` is the total number of diseased cases, 15 in the current example.
* The fourth dimension of `x$ratings$NL` is determined by the case (diseased or non-diseased) with the most `NL` marks; for ROC data it is 1, as in the current example.
* The fourth dimension of `x$ratings$LL` is determined by the diseased case with the most lesions; for ROC data it is 1, as in the current example.
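How worksheet rows map onto the four array dimensions can be sketched in base R. The data frame `fp` below is a hypothetical stand-in for the `FP` worksheet with invented ratings, and the filling loop is an illustration of the indexing only, not the code used by `DfReadDataFile()`.

```r
# Hypothetical long-format FP/NL worksheet; the ratings are invented.
readers    <- c("1", "4", "5")
modalities <- c("1", "2")
ndCases    <- c(6:10, 21:25, 46:50)   # non-diseased CaseIDs, indices 1..15

fp <- expand.grid(ModalityID = modalities, CaseID = ndCases,
                  stringsAsFactors = FALSE)
# each block of five cases belongs to a single reader (split-plot)
fp$ReaderID <- rep(readers, each = 5)[match(fp$CaseID, ndCases)]
set.seed(1)
fp$NL_Rating <- round(runif(nrow(fp), min = 1, max = 5), 1)

# modalities x readers x all cases x 1; unused cells stay at -Inf
NL <- array(-Inf, dim = c(2, 3, 30, 1))
for (r in seq_len(nrow(fp))) {
  i <- match(fp$ModalityID[r], modalities)
  j <- match(fp$ReaderID[r], readers)
  k <- match(fp$CaseID[r], ndCases)
  NL[i, j, k, 1] <- fp$NL_Rating[r]
}

NL[, 1, 1:15, 1]   # reader 1: finite ratings in columns 1-5, -Inf elsewhere
```

An `LL` array would be filled the same way from the `TP` worksheet, with a third dimension of 15 (diseased cases only).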