Thesis

Chapter 6

Table 2 Continued

Reader characteristics      N=     %
Country
  The Netherlands            4    18%
  United Kingdom             4    18%
  Italy                      2     9%
  Switzerland                2     9%
  India                      1     5%
  Israël                     1     5%
  Denmark                    1     5%
  Germany                    1     5%
  Portugal                   1     5%
  France                     1     5%
  Canada                     1     5%
  Brazil                     1     5%
  Chile                      1     5%
  Georgia                    1     5%

1 Based on histology after surgery in 21 patients and on a sustained clinical complete response during W&W with >2 years of clinical follow-up in the remaining 6 patients.

Diagnostic performance and effects of reader experience and image quality

Table 3 shows the average diagnostic performance of the four response methods in discerning complete responders from patients with residual tumor, including sub-analyses comparing expert versus non-expert readers and scans with optimal versus below-average image quality. The mrTRG showed the lowest specificity (64% vs. 79–82% for the other methods; p < 0.001) but the highest sensitivity (57% vs. 36–40%; p < 0.001). Compared to the other methods, NPV was significantly higher (p = 0.04) and overall accuracy was significantly lower (p < 0.001) for mrTRG. Overall accuracy ranged between 62 and 68%, with higher accuracy (70–74%) for the expert readers, except for the split scar sign, where no significant differences were observed. The area under the ROC curve (incl. 95% confidence interval) was 0.72 (0.60–0.83) for mrTRG, 0.69 (0.57–0.91) for modified mrTRG, 0.68 (0.55–0.81) for DWI patterns, and 0.74 (0.63–0.85) for the split scar; differences between the four techniques were not statistically significant (p = 0.17–0.94). Scans with below-average image quality had a negative impact on diagnostic performance. Detailed effect sizes and levels of significance are provided in Supplement 2. Selected imaging examples demonstrating the effects of reader experience and image quality are provided in Figs. 2 and 3.
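For readers less familiar with the measures reported above, the following minimal sketch shows how sensitivity, specificity, PPV, NPV and accuracy follow from a 2x2 table of reader calls against the reference standard. It is illustrative only: the class and function names are hypothetical, the counts are invented and are not taken from this study, and it assumes that "complete response" is treated as the positive class, which the text does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """2x2 table; ASSUMPTION: 'complete response' is the positive class."""
    tp: int  # complete responders called complete response
    fp: int  # residual tumor called complete response
    fn: int  # complete responders called residual tumor
    tn: int  # residual tumor called residual tumor


def diagnostic_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Standard definitions; assumes no row or column of the table is empty."""
    total = c.tp + c.fp + c.fn + c.tn
    return {
        "sensitivity": c.tp / (c.tp + c.fn),  # true positives among all positives
        "specificity": c.tn / (c.tn + c.fp),  # true negatives among all negatives
        "ppv": c.tp / (c.tp + c.fp),          # correct among positive calls
        "npv": c.tn / (c.tn + c.fn),          # correct among negative calls
        "accuracy": (c.tp + c.tn) / total,    # all correct calls
    }


# Hypothetical counts for illustration, NOT data from the study.
example = ConfusionCounts(tp=4, fp=5, fn=3, tn=15)
metrics = diagnostic_metrics(example)

for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```

Because sensitivity and specificity are computed from different rows of the table, a method can gain sensitivity while losing specificity, which is the pattern of trade-off described for mrTRG above.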
