Predicting response to chemoradiotherapy in rectal cancer via visual morphologic assessment and staging on baseline MRI

Table 3. Diagnostic performance and effect of reader experience level

| | Sensitivity | Specificity | PPV | NPV | Accuracy | AUC |
|---|---|---|---|---|---|---|
| 5-point confidence score | | | | | | |
| Average (all readers) | 49% | 73% | 66% | 62% | 61% | 0.71 |
| Expert readers | 43% | 83% | 72% | 62% | 63% | 0.75 |
| Non-expert readers | 51% | 70% | 64% | 62% | 61% | 0.69 |
| Effect size (+ 95% CI) | -0.08 (-0.32;0.16) | 0.13 (-0.06;0.31) | 0.08 (-0.01;0.15) | 0.00 (-0.08;0.08) | 0.03 (-0.02;0.07) | N/A |
| Level of significance (p) | p=0.49 | p=0.17 | p=0.03 | p=0.99 | p=0.27 | p=0.16 |
| 4-point risk score | | | | | | |
| Average (all readers) | 57% | 71% | 67% | 65% | 64% | 0.74 |
| Expert readers | 59% | 74% | 69% | 67% | 67% | 0.76 |
| Non-expert readers | 57% | 70% | 66% | 64% | 64% | 0.74 |
| Effect size (+ 95% CI) | 0.02 (-0.17;0.21) | 0.04 (-0.11;0.20) | 0.02 (-0.05;0.09) | 0.03 (-0.04;0.10) | 0.03 (-0.03;0.09) | N/A |
| Level of significance (p) | p=0.81 | p=0.59 | p=0.51 | p=0.42 | p=0.28 | p=0.38 |
| Dichotomized (2-point) risk score | | | | | | |
| Average (all readers) | 59% | 68% | 64% | 64% | 64% | 0.72 |
| Expert readers | 60% | 71% | 67% | 66% | 66% | 0.74 |
| Non-expert readers | 58% | 68% | 64% | 64% | 63% | 0.71 |
| Effect size (+ 95% CI) | 0.01 (-0.13;0.16) | 0.04 (-0.08;0.15) | 0.03 (-0.01;0.08) | 0.02 (-0.04;0.07) | 0.02 (-0.02;0.07) | N/A |
| Level of significance (p) | p=0.86 | p=0.51 | p=0.15 | p=0.49 | p=0.24 | p=0.39 |

Note: results were calculated using a (near-)complete response (TRG1-2) as the positive outcome and incomplete response as the negative outcome. Expert readers (n=5) were MRI experts with ≥10 years of dedicated experience in rectal MRI; non-expert readers (n=17) were abdominal radiologists or general radiologists with a specific interest in abdominal imaging. Effect sizes for reader experience level, including 95% confidence intervals and levels of significance, were assessed using mixed model linear regression.
Optimal cut-off values were derived from the results of the ROC-analysis: the 5-point confidence score by van Griethuysen was dichotomized between 4-5 (positive, indicative of (near-)CR) and 1-3 (negative, indicative of incomplete response); the 4-point risk score was dichotomized between 0-1 (positive, indicative of (near-)CR) and 2-4 (negative, indicative of incomplete response).
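As an illustration of how the cut-offs described above translate into the reported per-reader measures, the following is a minimal sketch in Python. It is not the authors' analysis code: the function names (`confidence_is_positive`, `risk_is_positive`, `diagnostic_metrics`) and the toy data are hypothetical, the score ranges are taken as written in the note, and the sketch covers a single reader only (no averaging across readers, no AUC, and no mixed model regression).

```python
from typing import Dict, Sequence


def confidence_is_positive(score: int) -> bool:
    """5-point confidence score: 4-5 is positive ((near-)CR), 1-3 is negative."""
    if not 1 <= score <= 5:
        raise ValueError(f"confidence score out of range: {score}")
    return score >= 4


def risk_is_positive(score: int) -> bool:
    """Risk score: 0-1 is positive ((near-)CR), 2-4 is negative (ranges as stated in the note)."""
    if not 0 <= score <= 4:
        raise ValueError(f"risk score out of range: {score}")
    return score <= 1


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN instead of raising when a measure is undefined (empty denominator).
    return numerator / denominator if denominator else float("nan")


def diagnostic_metrics(predicted: Sequence[bool], actual: Sequence[bool]) -> Dict[str, float]:
    """Compute 2x2 diagnostic measures; True means (near-)complete response."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    fp = sum(1 for p, a in pairs if p and not a)
    fn = sum(1 for p, a in pairs if not p and a)
    tn = sum(1 for p, a in pairs if not p and not a)
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
        "accuracy": _ratio(tp + tn, len(pairs)),
    }


# Hypothetical toy data for one reader (not study data).
toy_scores = [5, 4, 2, 1, 3, 4]
toy_truth = [True, False, True, False, False, True]
toy_metrics = diagnostic_metrics(
    [confidence_is_positive(s) for s in toy_scores], toy_truth
)
```

The same `diagnostic_metrics` helper can be applied to the risk score by mapping scores through `risk_is_positive` instead, since for that score the lower values are the positive (favourable) category.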