Comparison of MRI response evaluation methods in rectal cancer: a multicentre and multireader validation study

Results

Baseline characteristics

Baseline patient and study reader data are provided in Table 2. Fifty-two patients (58%) were male, and mean age was 65 ± 11 years. Twenty-seven patients (30%) were complete responders. The 22 study readers originated from fourteen different countries.

Diagnostic performance and effects of reader experience and image quality

Table 3 shows the average diagnostic performance of the four response methods in discerning complete responders from patients with residual tumor, including sub-analyses comparing expert versus non-expert readers and scans with optimal versus below-average image quality. The mrTRG showed the lowest specificity (64% vs. 79–82% for the other methods; p < 0.001) but the highest sensitivity (57% vs. 36–40%; p < 0.001). For mrTRG, NPV was significantly higher (p = 0.04) and overall accuracy significantly lower (p < 0.001) than for the other methods. Overall accuracy ranged from 62% to 68% and was higher (70–74%) for the expert readers, except for the split scar sign, for which no significant difference was observed.

The area under the ROC curve (with 95% confidence interval) was 0.72 (0.60–0.83) for mrTRG, 0.69 (0.57–0.91) for modified mrTRG, 0.68 (0.55–0.81) for DWI patterns, and 0.74 (0.63–0.85) for the split scar sign; differences between the four methods were not statistically significant (p = 0.17–0.94).

Below-average image quality had a negative impact on diagnostic performance. Detailed effect sizes and levels of significance are provided in Supplement 2. Selected imaging examples demonstrating the effects of reader experience and image quality are provided in Figs. 2 and 3.
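For readers less familiar with the measures reported above, the following minimal sketch shows how sensitivity, specificity, PPV, NPV and overall accuracy follow from a single 2 × 2 table in which a complete response is treated as the positive class. The names `ConfusionCounts` and `diagnostic_metrics` and the counts in the usage lines are hypothetical and purely illustrative; they are not study data and do not reproduce the statistical analysis used in this study. The sketch also assumes that none of the denominators is zero.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """2 x 2 table with complete response as the positive class (hypothetical helper)."""
    tp: int  # complete responders rated as complete response
    fn: int  # complete responders rated as residual tumor
    fp: int  # residual tumor rated as complete response
    tn: int  # residual tumor rated as residual tumor


def diagnostic_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Return the standard diagnostic measures; assumes non-zero denominators."""
    total = c.tp + c.fn + c.fp + c.tn
    return {
        "sensitivity": c.tp / (c.tp + c.fn),  # share of complete responders detected
        "specificity": c.tn / (c.tn + c.fp),  # share of residual tumors detected
        "ppv": c.tp / (c.tp + c.fp),          # predicted complete responses that are correct
        "npv": c.tn / (c.tn + c.fn),          # predicted residual tumors that are correct
        "accuracy": (c.tp + c.tn) / total,    # all correct ratings over all ratings
    }


# Illustrative counts only (not taken from the study).
example = ConfusionCounts(tp=6, fn=4, fp=4, tn=16)
metrics = diagnostic_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Because sensitivity and specificity are computed within the responder and residual tumor groups respectively, they do not depend on the proportion of complete responders, whereas PPV, NPV and accuracy do.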