Comparison of MRI response evaluation methods in rectal cancer: a multicentre and multireader validation study

known limitation of DWI [3, 12]. When relying on DWI for clinical decision-making, steps should be taken to optimize DWI image quality, such as giving patients a preparatory micro-enema or adapting acquisition protocols to make the DWI sequence less susceptible to artefacts [17, 23, 24, 25].

Out of the four investigated methods, the mrTRG has been studied the most in previous literature. In a recent meta-analysis including six studies and a total of 916 patients, the pooled sensitivity for diagnosing a complete response using an mrTRG score of 1–2 was somewhat higher than in our current report (70% vs. 57%) [7]. Interestingly, sensitivity for mrTRG in our study was higher than for the other three methods (57% vs. 36–40%), suggesting that mrTRG performs better in identifying complete responders, with a lower risk of overcalling the presence of residual tumor. The specificity of 62% for mrTRG in our study was comparable to that reported in the previous meta-analysis (64%) [7], but lower than for the other three methods under evaluation (specificity 79–82%), indicating a higher risk of missing residual tumor. Notably, the mrTRG, despite probably being the best known of the four methods, was selected as the preferred response method by only 18% of our study readers.

The fourth method under evaluation was the split scar sign, proposed by Santiago et al [14]. The split scar sign describes a particular morphologic appearance of the tumor bed (scar) after CRT, which gives the rectal wall a characteristic layered appearance. The original publication, which included two readers, reported a sensitivity of 52–64%, compared to an average sensitivity of only 36% for the 22 readers in our current study. The average specificity in our current study was 79%, versus 97% in the original publication.
Overall accuracy for the split scar sign in our current study was similar to that of the other three methods. However, it was clearly the least preferred scoring method amongst the study readers. In up to 20% of cases, our readers experienced difficulties in assessing the split scar sign, and a positive split scar sign was recognized in only a very small minority of cases. Several of our readers furthermore noted that the split scar sign was not applicable in cases with a complete response without any visible fibrosis. Santiago et al stated explicitly in their publication that high-resolution T2W imaging is required for the evaluation of the split scar sign. A substantial number of scans in our cohort were acquired with a slice thickness of > 3 mm and/or limited in-plane resolution. This suggests that, out of the four response methods, the split scar sign may be the most influenced by T2W scan quality and therefore more challenging to reproduce in a heterogeneous clinical dataset with less optimized acquisition protocols.

With respect to interobserver agreement, results were comparable for the mrTRG, modified TRG, and DWI pattern approach, with median kappa values ranging between 0.41 and 0.48 (highest for the DWI pattern approach). Agreement for the split