Rectal MRI staging: a web-based approach for teaching and research NAJIM EL KHABABI


ISBN: 978-94-6473-350-1
Cover design: Vera den Das
Design & lay-out: Douwe Oppewal
Printed by: Ipskamp printing, Enschede

©2023 N. El Khababi
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without prior written permission of the author.

Rectal MRI staging: a web-based approach for teaching and research

DISSERTATION

To obtain the degree of Doctor at Maastricht University, on the authority of the Rector Magnificus, Prof. dr. Pamela Habibović, in accordance with the decision of the Board of Deans, to be defended in public on Monday 15 April 2024 at 10.00 hours

by

Najim el Khababi

Supervisors:
Prof. dr. Regina G.H. Beets-Tan (Universiteit Maastricht / Antoni van Leeuwenhoek, Amsterdam)
Prof. dr. Geerard L. Beets (Universiteit Maastricht / Antoni van Leeuwenhoek, Amsterdam)

Co-supervisor:
Dr. Doenja M.J. Lambregts (Universiteit Maastricht / Antoni van Leeuwenhoek, Amsterdam)

Assessment Committee:
Prof. dr. Laurents Stassen (Chair)
Prof. dr. Corrie Marijnen (Universiteit Leiden / Antoni van Leeuwenhoek, Amsterdam)
Dr. Jarno Melenhorst
Prof. dr. Harm Rutten
Prof. dr. Vincent Vandecavaye (Universitair Ziekenhuis Leuven)

Contents

Chapter 1 Introduction, aims and outline of the thesis

Chapter 2 El Khababi et al. “Pearls and pitfalls of structured staging and reporting of rectal cancer on MRI: an international multireader study”
Published in BJR 2023; 96(1150):20230091

Chapter 3 El Khababi et al. “Outcomes and potential impact of a virtual hands-on training program on MRI staging confidence and performance in rectal cancer”
Published in Eur Radiol 2023 (ePub ahead of print)

Chapter 4 Bogveradze, El Khababi et al. “Evolutions in rectal cancer MRI staging and risk stratification in the Netherlands”
Published in Abdom Radiol (NY) 2022; 47:38-47

Chapter 5 El Khababi et al. “Sense and nonsense of yT-staging on MRI after chemoradiotherapy in rectal cancer”
Published in Colorectal Disease 2023; 25(9):1878-1887

Chapter 6 El Khababi et al. “Comparison of MRI response evaluation methods in rectal cancer: a multicentre and multireader validation study”
Published in Eur Radiol 2023; 33(6):4367-4377

Chapter 7 El Khababi et al. “Predicting response to chemoradiotherapy in rectal cancer via visual assessment on baseline staging MRI: a multicentre and multireader study”
Published in Abdom Radiol (NY) 2023; 48(10):3039-3049

Chapter 8 General Discussion

Appendices
Summary
Samenvatting (summary in Dutch)
Scientific impact
List of publications
Dankwoord (acknowledgements)
Curriculum Vitae


CHAPTER 1
Introduction, aims and outline of the thesis

Introduction

The treatment landscape in rectal cancer

Colorectal cancer is the third most prevalent type of cancer in the world and the second in terms of mortality [1, 2]. Approximately one-third of all colorectal cancers are rectal cancers. Historically, the management of rectal cancer primarily involved surgical intervention, typically using total mesorectal excision (TME). While surgery remains a fundamental component of rectal cancer treatment, the treatment landscape has evolved into a multidisciplinary approach that encompasses chemotherapy, radiation therapy, targeted therapies, and innovative surgical techniques.

In recent years, there has been a specific paradigm shift in treatment towards “organ preservation”. The Brazilian group of Habr-Gama et al. was one of the first to show that approximately one quarter of locally advanced rectal cancers show a complete response after neoadjuvant chemoradiation (CRT), and that surgery could be avoided in these patients without compromising oncological outcomes [3]. Since the introduction of this ‘watch-and-wait’ (W&W) approach, research groups from around the world have confirmed that, in carefully selected patients, W&W can be a safe alternative to surgical resection with an important positive impact on quality of life [4-6]. At the same time, studies are now focusing on maximizing neoadjuvant treatment effects to improve response rates and offer more patients the option of organ preservation. For example, the RAPIDO trial compared standard-of-care chemoradiation to short-course radiotherapy followed by chemotherapy and showed that the percentage of complete responders could be doubled after short-course radiotherapy and chemotherapy [7]. Several studies have also shown promising results for ‘total neoadjuvant therapy’ (TNT) that incorporates chemotherapy with CRT. With TNT, studies have shown that a complete response may be reached in up to half of the patients [8].
Finally, studies such as the STAR-TREC and OPERA trials are now investigating the benefit of giving ‘neoadjuvant’ therapy to earlier-stage tumours that would otherwise be primarily operated on, with the aim of achieving organ preservation [9, 10].

Role of imaging and the radiologist

Magnetic Resonance Imaging (MRI) has developed into the main diagnostic tool for the primary local staging of rectal cancer and plays a pivotal role in stratifying patients into different risk and treatment groups. Based on the MRI assessment of key risk factors such as the local tumour (T) stage, lymph node (N) stage and tumour extension into the anticipated surgical resection plane, patients are stratified as low, intermediate or high risk. This risk stratification determines whether they may undergo immediate surgery or first require neoadjuvant radiation and/or chemotherapy to downstage the tumour.

The increased focus on organ preservation has greatly increased the importance of the radiologist in monitoring patients undergoing neoadjuvant treatment. MRI nowadays has an important role (next to endoscopy) in evaluating the local tumour response after neoadjuvant treatment. This ‘restaging’ helps determine whether patients still require routine surgical resection, or may be managed with more conservative treatment alternatives such as W&W [11]. Considering that rectal cancer treatment is increasingly tailored to a patient’s individual needs, the radiologist now plays an important role as a sparring partner in the multidisciplinary team (MDT) to help determine these needs and push personalized medicine forwards.

To support radiologists in fulfilling this role, various imaging guidelines, reporting guides and staging templates have been developed for rectal cancer [12-14]. In addition, new diagnostic grading systems have recently been published to assist radiologists in assessing the local tumour response after neoadjuvant therapy [15-18]. An important goal of these templates and grading systems is to enhance uniformity in radiological reporting and thereby promote consistent and evidence-based patient management. However, there are several pitfalls and challenges when it comes to the application and clinical implementation of such tools. First of all, several reports have shown that the experience level of the radiologist has a significant impact on the diagnostic performance and, consequently, on MDT decisions and treatment outcomes [19-22]. New diagnostic methods, such as response-grading systems, have often only been tested by small groups of expert radiologists before being adopted into guidelines [15-18, 22]. Whether such tools are also reproducible and accurate in the hands of radiologists in everyday clinical practice is often poorly examined. An important reason for this is that it is challenging to set up studies to test and validate diagnostic methods on a large scale, i.e. with multiple radiologists and using data from different clinical centres.
Furthermore, limited data are available on how well staging guides and templates, such as those published by the European Society of Gastrointestinal and Abdominal Radiology (ESGAR), are adopted in daily clinical routine, what their clinical impact is and whether there are any practical limitations that need to be addressed and investigated further.

Required research infrastructure

An important missing link to address the challenges described above is the availability of a user-friendly platform to enable large-scale diagnostic validation studies where images from different clinical centres can easily be shared and evaluated by multiple readers. Though recent “image biobank” and “data repository” initiatives do offer platforms to share research data, these platforms are often primarily focused on the development of artificial intelligence (AI) tools [23, 24]. Digital Imaging and Communications in Medicine (DICOM) viewers embedded within these platforms – if any – are typically very basic and mainly offer tools for AI annotation. Functionality dedicated to visual diagnostic image interpretation is limited and these platforms are therefore poorly equipped to support more clinically oriented diagnostic studies. Proper validation of visual diagnostic methods is critical, considering the direct impact on patient management outlined above. Availability of a platform and infrastructure to facilitate research in this direction would thus be an important step forward. Moreover, it would create new opportunities for training and teaching, thereby promoting effective clinical dissemination and implementation of study outcomes.

Aims of this thesis

The overall aim of this thesis was to develop a practical web-based tool and research infrastructure to investigate and validate available diagnostic methods and staging templates on a large scale within the clinical context of rectal cancer. The main study goals and questions addressed in this thesis – using this newly developed web platform – are as follows:

1. To investigate the clinical applicability and main pitfalls of the ESGAR structured reporting template for rectal cancer (Chapter 2), and to explore to what extent structured reporting is already successfully embedded in daily radiological reporting practice in the Netherlands (Chapter 4)
2. To test the use of our web platform as a tool for virtual training and teaching and its potential impact on rectal cancer staging performance (Chapter 3)
3. To validate the diagnostic performance and reproducibility of different visual diagnostic grading systems to evaluate and predict response to neoadjuvant chemoradiotherapy (Chapters 5, 6, 7)

Outline of this thesis

Chapter 2 investigates the reproducibility and diagnostic performance in staging using the ESGAR structured reporting template, and aims to establish its main limitations and areas for improvement.

Chapter 3 describes the outcomes and explores the impact of a virtual hands-on training course on MRI staging confidence and performance.

Chapter 4 evaluates how the MRI reporting of rectal cancer has evolved in the Netherlands following guideline updates and assesses how well structured reporting templates have been adopted into clinical routine.

Chapter 5 evaluates the pearls and pitfalls of MRI for yT-staging after chemoradiotherapy.

Chapter 6 compares four different MRI grading systems for rectal tumour response evaluation after chemoradiotherapy in a multicentre and multireader validation study.

Chapter 7 focuses on the comparison of different methods to predict response before the start of chemoradiotherapy via visual assessment on baseline staging MRI.

References

1. Paty PB, Garcia-Aguilar J (2022) Colorectal cancer. J Surg Oncol 126:881–887. https://doi.org/10.1002/JSO.27079
2. Rawla P, Sunkara T, Barsouk A (2019) Epidemiology of colorectal cancer: incidence, mortality, survival, and risk factors. Gastroenterology Review/Przegląd Gastroenterologiczny 14:89–103.
3. Habr-Gama A, Perez RO, Nadalin W, et al (2004) Operative versus nonoperative treatment for stage 0 distal rectal cancer following chemoradiation therapy: long-term results. Ann Surg 240(4):711–718.
4. Appelt AL, Pløen J, Harling H, et al (2015) High-dose chemoradiotherapy and watchful waiting for distal rectal cancer: a prospective observational study. Lancet Oncol 16(8):919–927.
5. Araujo RO, Valadão M, Borges D, et al (2015) Nonoperative management of rectal cancer after chemoradiation opposed to resection after complete clinical response. A comparative study. Eur J Surg Oncol 41(11):1456–1463.
6. van der Valk MJM, Hilling DE, Bastiaannet E, et al; IWWD Consortium (2018) Long-term outcomes of clinical complete responders after neo-adjuvant treatment for rectal cancer in the International Watch & Wait Database (IWWD): an international multicentre registry study. Lancet 391(10139):2537–2545.
7. Bahadoer RR, Dijkstra EA, van Etten B, et al (2021) Short-course radiotherapy followed by chemotherapy before total mesorectal excision (TME) versus preoperative chemoradiotherapy, TME, and optional adjuvant chemotherapy in locally advanced rectal cancer (RAPIDO): a randomised, open-label, phase 3 trial [published correction appears in Lancet Oncol. 2021 Feb;22(2):e42]. Lancet Oncol 22(1):29–42.
8. Garcia-Aguilar J, Patil S, Gollub MJ, et al (2022) Organ Preservation in Patients With Rectal Adenocarcinoma Treated With Total Neoadjuvant Therapy. J Clin Oncol 40(23):2546–2556. doi:10.1200/JCO.22.00032
9. Bach SP; STAR-TREC Collaborative (2022) Can we Save the rectum by watchful waiting or TransAnal surgery following (chemo)Radiotherapy versus Total mesorectal excision for early REctal Cancer (STAR-TREC)? Protocol for the international, multicentre, rolling phase II/III partially randomized patient preference trial evaluating long-course concurrent chemoradiotherapy versus short-course radiotherapy organ preservation approaches. Colorectal Dis 24(5):639–651.
10. Gerard JP, Barbet N, Schiappa R, et al (2023) Neoadjuvant chemoradiotherapy with radiation dose escalation with contact x-ray brachytherapy boost or external beam radiotherapy boost for organ preservation in early cT2-cT3 rectal adenocarcinoma (OPERA): a phase 3, randomised controlled trial. Lancet Gastroenterol Hepatol 8(4):356–367.
11. Haak HE, Maas M, Lahaye MJ, et al (2020) Selection of Patients for Organ Preservation After Chemoradiotherapy: MRI Identifies Poor Responders Who Can Go Straight to Surgery. Ann Surg Oncol 27:2732–2739.
12. Beets-Tan RGH, Lambregts DMJ, Maas M, et al (2018) Magnetic resonance imaging for clinical management of rectal cancer: Updated recommendations from the 2016 European Society of Gastrointestinal and Abdominal Radiology (ESGAR) consensus meeting. Eur Radiol 28:1465–1475.
13. Kassam Z, Lang R, Arya S, et al (2022) Update to the structured MRI report for primary staging of rectal cancer: Perspective from the SAR Disease Focused Panel on Rectal and Anal Cancer. 47:3364–3374.
14. Nougaret S, Rousset P, Gormly K, et al (2022) Structured and shared MRI staging lexicon and report of rectal cancer: A consensus proposal by the French Radiology Group (GRERCAR) and Surgical Group (GRECCAR) for rectal cancer. Diagn Interv Imaging 103:127–141. https://doi.org/10.1016/J.DIII.2021.08.003
15. Bhoday J, Smith F, Siddiqui MR, et al (2016) Magnetic Resonance Tumor Regression Grade and Residual Mucosal Abnormality as Predictors for Pathological Complete Response in Rectal Cancer Postneoadjuvant Chemoradiotherapy. Dis Colon Rectum 59:925–933. https://doi.org/10.1097/DCR.0000000000000667
16. Lee MA, Cho SH, Seo AN, et al (2017) Modified 3-point MRI-based tumor regression grade incorporating DWI for locally advanced rectal cancer. American Journal of Roentgenology 209:1247–1255.
17. Lambregts DMJ, Pizzi AD, Lahaye MJ, et al (2018) A pattern-based approach combining tumor morphology on MRI with distinct signal patterns on diffusion-weighted imaging to assess response of rectal tumors after chemoradiotherapy. Dis Colon Rectum 61:328–337. https://doi.org/10.1097/DCR.0000000000000915
18. Santiago I, Barata M, Figueiredo N, et al (2020) The split scar sign as an indicator of sustained complete response after neoadjuvant therapy in rectal cancer. Eur Radiol 30:224–238. https://
19. Bregendahl S, Bondeven P, Grønborg TK, et al (2022) Training of radiology specialists in local staging of primary rectal cancer on MRI: a prospective intervention study exploring the impact of various educational elements on the interpretive performance. BMJ Open Qual 11:. https://
20. Sluckin TC, Hazen SMJA, Horsthuis K, et al (2022) Significant improvement after training in the assessment of lateral compartments and short-axis measurements of lateral lymph nodes in rectal cancer. Eur Radiol.
21. Wang S, Li XT, Zhang XY, et al (2019) MRI evaluation of extramural vascular invasion by inexperienced radiologists. Br J Radiol 92:.
22. Siddiqui MRS, Gormly KL, Bhoday J, et al (2016) Interobserver agreement of radiologists assessing the response of rectal cancers to preoperative chemoradiation using the MRI tumour regression grading (mrTRG). Clin Radiol 71:854–862. CRAD.2016.05.005
23. Marcus DS, Olsen TR, Ramaratnam M, Buckner RL (2007) The extensible neuroimaging archive toolkit: An informatics platform for managing, exploring, and sharing neuroimaging data. Neuroinformatics 5:11–33.
24. Ziegler E, Urban T, Brown D, et al (2020) Open Health Imaging Foundation Viewer: An Extensible Open-Source Framework for Building Web-Based Imaging Applications to Support Cancer Research. JCO Clin Cancer Inform 4:336–345.


CHAPTER 2
Pearls and pitfalls of structured staging and reporting of rectal cancer on MRI: an international multireader study

N. el Khababi, R.G.H. Beets-Tan, L. Curvo-Semedo, R. Tissier, J. Nederend, M.J. Lahaye, M. Maas, G.L. Beets, D.M.J. Lambregts, on behalf of the rectal MRI study group

Other authors in rectal MRI study group: F.C.H. Bakers, P. Barros, F. Bauer, S.H. de Bie, S. Ballantyne, J. Brayner Dutra, N. Bogveradze, G.P.T. Bosma, A. Mirela Calin-Vainak, V.C. Cappendijk, F. Castagnoli, A. Chandramohan, S. Charalampos, A. Delli Pizzi, S. Evans, R.W.F. Geenen, J.J.M. van Griethuysen, J. Maclachlan, V. Mahajan, S. Malekzadeh, P.A. Neijenhuis, M. de Oliveira Taveira, G.M. Peterson, I. Pieters, R. Popita, N.W. Schurink, C. Sofia, S. Swerkersson, C.J. Veeken, R.F.A. Vliegen, A. Zeina

Published in BJR 2023; 96(1150):20230091

Abstract

Objectives
To investigate uniformity and pitfalls in structured radiological staging of rectal cancer.

Methods
Twenty-one radiologists (12 countries) staged 75 rectal cancers on MRI using a structured reporting template. Interobserver agreement (IOA) was calculated as the percentage agreement between readers (categorical variables) and Krippendorff’s alpha (continuous variables). Agreement with an expert consensus served as a surrogate standard of reference to estimate diagnostic accuracy. Polychoric correlation coefficients were used to assess correlations between diagnostic confidence and accuracy (=agreement with expert consensus).

Results
Uniformity to diagnose low-risk (≤cT3ab) versus high-risk (≥cT3cd) cT-stage, cN0 versus cN+, lateral nodes and tumour deposits, MRF and sphincter involvement, and solid versus mucinous tumours was high, with IOA >80% in the majority of cases (and >80% agreement with expert consensus). Results for assessing extramural vascular invasion, cT-stage (cT1-2/cT3/cT4a/cT4b), cN-stage (cN0/N1/N2), relation to the peritoneal reflection, extent of sphincter involvement (internal/intersphincteric/external) and morphology (annular/semi-annular/polypoid) were considerably poorer. IOA was high (α=0.72-0.84) for tumour height/length and extramural invasion depth, but low for tumour-MRF distance and number of (suspicious) nodes (α=0.05-0.55). There was a significant positive correlation between diagnostic confidence and accuracy (=agreement with expert consensus) (p<0.001-p=0.003).

Conclusions
- Several staging items lacked sufficient reproducibility.
- Results for cT- and cN-staging improved when using a dichotomized stratification.
- There was a significant correlation between diagnostic confidence and accuracy (=agreement with expert consensus).

Introduction

MRI is the main diagnostic technique for local tumour staging in rectal cancer.
Primary goals of MRI are to establish the presence of key risk factors associated with local recurrence to determine the need for neoadjuvant treatment, and to assess the local invasion of tumours into surrounding organs and structures to help guide the surgical strategy. To ensure that these key elements of staging are adequately represented in radiological reports, various organizations such as the European Society of Gastrointestinal and Abdominal Radiology (ESGAR), Society of Abdominal Radiology (SAR), Radiological Society of North America (RSNA), and different national radiological societies, have introduced standardized (structured) reporting templates that are typically largely based on the Tumour Node Metastasis (TNM) staging system proposed by the American Joint Committee on Cancer (AJCC) / Union for International Cancer Control (UICC) [1–4]. Recent studies have shown that radiologists have increasingly adopted these structured reporting templates for routine clinical reporting [3, 5–7], which has led to enhanced completeness of reporting and improved satisfaction levels of referring clinicians [5, 7–10].

Another goal of structured reporting is to increase the overall level of uniformity in radiological reporting. To what extent this is successfully accomplished is not well documented. We know from previous literature that MRI has its limitations, for example when it comes to lymph node staging [11]. In addition, there are several other pitfalls and controversies that may lead to inconsistencies in reporting despite the availability of standardized reporting templates [12, 13].

This multicentre study aims to investigate the level of uniformity in the radiological staging of rectal cancer using a structured reporting template, and establish its main limitations and areas for improvement on a large scale, by testing the reproducibility among a group of more than 20 radiologists from different nationalities and with different clinical expertise levels.

Methods

Study design

This study concerns a retrospective multicentre diagnostic study approved by the local institutional review board of the principal investigating centre.
Informed consent was waived.

Patient and study reader accrual

Twenty-one radiologists from 12 countries participated as study readers. Readers were accrued via an open call to the ESGAR membership (in particular members with a known interest in rectal cancer imaging). The study included n=75 patients who were treated for newly diagnosed rectal cancer in one of 10 centres in the Netherlands (1 university hospital, 8 teaching hospitals and 1 comprehensive cancer centre). Patients were selected from an existing and previously published multicentre study database [7, 14, 15] based on the following inclusion criteria: (1) biopsy-proven rectal carcinoma, and (2) availability of diagnostic quality primary staging MRI including at least T2-weighted sequences in three planes (sagittal, coronal, transversal). To ensure a clinically representative sample of cases, patients were selected semi-randomly so that data from all 10 study centres were represented in the study cohort with sufficient variety in terms of clinical tumour stage.

MR imaging

MRI exams were carried out following the local protocols of the participating study centres at the time of inclusion. From the full scan protocols, we selected 2D T2-weighted sequences in sagittal, oblique-axial (perpendicular to the tumour axis), and oblique-coronal (parallel to the tumour axis) planes, as these sequences are the minimum requirement recommended for primary rectal cancer staging as outlined in recent guidelines [1]. Slice thickness ranged between 3 and 5 mm and in-plane resolution ranged between 0.35×0.35 and 0.94×0.94 mm.

Image evaluation

Images were evaluated using a web-based platform (iScore) designed by one of the authors (NEK) that combines the Open Health Imaging Foundation (OHIF) DICOM viewing platform [16] with customizable electronic case report forms (eCRFs). For the current study, the structured reporting template for primary rectal cancer staging published by ESGAR was converted into an eCRF comprising twenty staging items: fourteen categorical staging variables and six continuous variables (listed in detail in Table 1). Readers were asked to complete the staging eCRF for each study case. For the variables cT-stage, cN-stage, MRF involvement, EMVI, and presence of sphincter invasion, a confidence level was included in the staging (see Table 1). Links to relevant background information, including TNM staging definitions and the ESGAR consensus guidelines on MRI for rectal cancer, were provided in iScore [1, 4, 17]. Readers were blinded to each other’s results and to the outcomes of surgery and histopathology.
Statistical analysis and standard of reference

Statistical analyses were performed using R statistics version 4.1.0 (2021) and IBM SPSS version 27 (2020). Group interobserver agreement (IOA) for the continuous variables was calculated using Krippendorff’s alpha. For the categorical variables, the percentage agreement between study readers was calculated and grouped into items with suboptimal agreement (<60%), moderate agreement (60%-80%) and good agreement (>80%). Correlation with histopathology was only available for a minority of patients (and for only a few of the studied staging variables) since most patients underwent neoadjuvant treatment prior to surgery. As such, two rectal MRI experts (DL and LC; each with >10 years of dedicated experience in rectal MRI and with a background in rectal cancer research and teaching) staged all study cases to establish a surrogate standard of reference. The two experts staged the cases independently. In case of any disagreement, a third independent expert (ML, with similar >10 years of dedicated experience in rectal MRI) was consulted to reach final expert consensus. IOA and diagnostic accuracy (= agreement with the expert reference) were calculated for the 21 study readers. For staging items 3, 5, 8, 9 and 15 (see Table 1), confidence scores were dichotomized to calculate diagnostic accuracy, with the cut-off between equivocal and probably involved/positive (i.e. only ‘probably’ and ‘definitely’ involved/positive scores counted as positive). Polychoric correlation coefficients were calculated between the various confidence level scores and diagnostic accuracy. Pearson’s Chi-squared test was used to calculate p-values. The level of significance was set at p<0.05.

Table 1 Staging variables derived from the ESGAR structured reporting template that were included in the electronic case report form in iScore

Main staging variables used for risk stratification

| No. | Type | Variable | Answer options |
|---|---|---|---|
| 1 | Categorical | cT-stage | cT0, cT1-2, cT3, cT4a, cT4b. Level of confidence: very unconfident, unconfident, equivocal, confident, very confident |
| 2* | Categorical | In case of cT3 tumour, cT3 sub-stage | cT3ab (≤5 mm extramural invasion depth), cT3cd (>5 mm extramural invasion depth). Level of confidence: very unconfident, unconfident, equivocal, confident, very confident |
| 3 | Categorical | Dichotomous cT-stage classification | Definitely low risk (cT1-2-3ab); Probably low risk (cT1-2-3ab); Equivocal (possibly low risk, possibly high risk); Probably high risk (cT3cd-4ab); Definitely high risk (cT3cd-4ab) |
| 4 | Categorical | cN-stage | cN0 (no suspicious nodes), cN1 (1-3 suspicious nodes), cN2 (≥4 suspicious nodes) |
| 5 | Categorical | Dichotomous cN-stage classification | Definitely cN0; Probably cN0; Equivocal (possibly cN0, possibly cN+); Probably cN+; Definitely cN+ |
| 6 | Categorical | Lateral nodes | Yes, presence of ≥1 suspicious lateral lymph node(s); No, absence of suspicious lateral lymph nodes |
| 7 | Categorical | Tumour deposits | Yes, presence of ≥1 tumour deposit(s); No, absence of tumour deposits |
| 8 | Categorical | MRF involvement | Definitely MRF- (tumour-MRF distance >1 mm); Probably MRF- (tumour-MRF distance >1 mm); Equivocal (possibly MRF-, possibly MRF+); Probably MRF+ (tumour-MRF distance ≤1 mm); Definitely MRF+ (tumour-MRF distance ≤1 mm) |
| 9 | Categorical | EMVI | Definitely EMVI-; Probably EMVI-; Equivocal (possibly EMVI-, possibly EMVI+); Probably EMVI+; Definitely EMVI+ |
| 10 | Continuous | Total number of visible mesorectal nodes | Number |
| 11 | Continuous | Total number of suspicious mesorectal nodes | Number |
| 12 | Continuous | Tumour-MRF distance | … mm |

Other staging variables

| No. | Type | Variable | Answer options |
|---|---|---|---|
| 13 | Categorical | Morphology (shape) | Annular, semi-annular, polypoid |
| 14 | Categorical | Morphology (composition) | Solid, mixed, mucinous |
| 15 | Categorical | Sphincter invasion | Definitely no sphincter invasion; Probably no sphincter invasion; Equivocal (possibly no sphincter invasion, possibly sphincter invasion); Probably sphincter invasion; Definitely sphincter invasion |
| 16# | Categorical | In case of sphincter involvement, level of involvement | Internal sphincter only, including intersphincteric plane, including external sphincter. Level of confidence: very unconfident, unconfident, equivocal, confident, very confident |
| 17 | Categorical | Relation to peritoneal reflection | Completely above, completely below, partially above/below |
| 18 | Continuous | Tumour length | … mm |
| 19 | Continuous | Tumour height | … mm |
| 20 | Continuous | Extramural invasion depth (in cT3-4 tumours) | … mm |

* T3 substage was not evaluated for all patient cases, but only when readers assigned a T3 stage.
# Level of sphincter involvement was not evaluated for all patient cases, but only when readers indicated a suspicion for sphincter involvement.

Results

Patient and study reader characteristics

Forty-six patients (61%) were male; median age was 64 (range 40-82) years. The study readers had a median of 11 years of experience as a radiologist after completion of residency training (range 2-28 years). Further patient and study reader characteristics are provided in Table 2.

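Table 2 Patient and study reader characteristics

| Characteristic | N | % |
|---|---|---|
| Patients | | |
| Total | 75 | 100% |
| Median age | 64 years (range 40-82) | |
| Sex: male | 46 | 61% |
| Sex: female | 29 | 39% |
| Clinical tumour stage and treatment: surgery only (early stage) | 10 | 13% |
| Clinical tumour stage and treatment: short course radiotherapy (intermediate stage) | 7 | 9% |
| Clinical tumour stage and treatment: long course (chemo)radiotherapy* (advanced stage) | 58 | 77% |
| Study readers | | |
| Total | 21 | 100% |
| Years after completion of residency training: <5 years | 5 | 24% |
| Years after completion of residency training: 5-10 years | 6 | 29% |
| Years after completion of residency training: >10 years | 10 | 48% |
| Workplace: comprehensive cancer centre | 8 | 35% |
| Workplace: university hospital | 4 | 19% |
| Workplace: general hospital | 7 | 33% |
| Workplace: other | 2 | 10% |
| Rectal cancer cases reported on a yearly basis: <50 | 1 | 5% |
| Rectal cancer cases reported on a yearly basis: 50-100 | 5 | 24% |
| Rectal cancer cases reported on a yearly basis: >100 | 15 | 71% |
| Country of origin: United Kingdom | 5 | 24% |
| Country of origin: Brazil | 2 | 10% |
| Country of origin: India | 2 | 10% |
| Country of origin: Italy | 2 | 10% |
| Country of origin: Romania | 2 | 10% |
| Country of origin: Switzerland | 2 | 10% |
| Country of origin: The Netherlands | 2 | 10% |
| Country of origin: Chile | 1 | 5% |
| Country of origin: Georgia | 1 | 5% |
| Country of origin: Germany | 1 | 5% |
| Country of origin: Israel | 1 | 5% |
| Country of origin: Sweden | 1 | 5% |

* Including 57 cases undergoing long course chemoradiation and 1 case undergoing 5x5 Gy radiotherapy with a prolonged waiting interval to surgery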

Interobserver agreement

Table 3 shows the mean IOA for the categorical staging variables (as well as the diagnostic accuracy estimates based on the expert standard of reference). For most of the categorical staging items, good (>80%) agreement was reached for the majority of patient cases. For three items (cT-stage, cN-stage, and morphology/shape), IOA was <80% in the majority of cases, with the poorest results for cN-staging, where good IOA was achieved in only 28% of the cases. For the six continuous variables, IOA (Krippendorff’s alpha) was α=0.72 for tumour length, α=0.84 for tumour height, α=0.33 for the tumour-MRF distance, α=0.73 for extramural invasion depth, α=0.55 for the total number of visible mesorectal nodes, and α=0.05 for the total number of suspicious mesorectal nodes. Good IOA was reached for the majority (71%) of cases to detect the presence of any high-risk feature (i.e. ≥cT3cd stage, cN+, tumour deposits, or EMVI).

Agreement with expert reference

Good (>80%) agreement with the expert standard of reference (range 80-91%) was achieved for the dichotomized assessment of low- versus high-risk cT-stage and cN0 versus cN+ disease, and for assessing the presence of lateral nodal metastases, tumour deposits, MRF involvement, and sphincter invasion. Results for multi-categorical cT-staging and cN-staging were considerably lower (69% and 60% agreement with expert reference), as were results for EMVI (77%), tumour morphology (67-79%) and relation to the anterior peritoneal reflection (77%). In patients with suspected sphincter involvement, mean agreement with the expert reference to assess the level of involvement (internal sphincter, intersphincteric plane, external sphincter) was 51%. In patients staged as cT3, accuracy for cT3 subclassification into low risk (cT3ab) versus high risk (cT3cd) was 73%.
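The two agreement measures reported above can be illustrated with a short Python sketch. This is not the analysis code used in the study (which was run in R and SPSS). The function names are hypothetical, per-case percentage agreement is assumed here to be the share of readers who chose the most frequent answer, and Krippendorff’s alpha is assumed to use the interval (squared-difference) metric; the source does not specify either detail.

```python
from collections import Counter
from itertools import permutations


def percent_agreement(ratings):
    """Per-case agreement: % of readers giving the most common answer.

    Assumption: the source does not define the exact formula; the modal
    share is used here purely for illustration.
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    modal_count = Counter(ratings).most_common(1)[0][1]
    return 100.0 * modal_count / len(ratings)


def agreement_band(pct):
    """Bands from the Methods: <60% suboptimal, 60-80% moderate, >80% good."""
    if pct < 60:
        return "suboptimal"
    if pct <= 80:
        return "moderate"
    return "good"


def krippendorff_alpha_interval(units):
    """Krippendorff's alpha for interval data.

    units: one list of numeric readings per case; None marks a missing
    reading. Cases with fewer than two readings are not pairable and
    are dropped.
    """
    cleaned = [[v for v in unit if v is not None] for unit in units]
    cleaned = [unit for unit in cleaned if len(unit) >= 2]
    values = [v for unit in cleaned for v in unit]
    n = len(values)
    if n < 2:
        raise ValueError("need at least two pairable readings")

    # Observed disagreement: squared differences within each case,
    # each case weighted by 1 / (number of readings - 1).
    d_obs = sum(
        sum((a - b) ** 2 for a, b in permutations(unit, 2)) / (len(unit) - 1)
        for unit in cleaned
    ) / n

    # Expected disagreement: squared differences across all readings,
    # regardless of which case they belong to.
    d_exp = sum((a - b) ** 2 for a, b in permutations(values, 2)) / (n * (n - 1))
    if d_exp == 0:
        raise ValueError("alpha is undefined when all readings are identical")
    return 1.0 - d_obs / d_exp
```

For example, if 17 of 21 readers assign the same cT-stage to a case, `percent_agreement` returns about 81% and `agreement_band` labels the case as "good". For the continuous items, each inner list passed to `krippendorff_alpha_interval` would hold the readers' measurements (for instance tumour height in mm) for one case.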
Correlation with diagnostic confidence

A significant positive correlation between diagnostic confidence and agreement with the expert reference was found for all staging items for which diagnostic confidence scores were available (see Table 1), with polychoric correlation coefficients ranging from 0.18 to 0.66 (p values ranging from <0.001 to 0.003). The strongest effects (correlation coefficient >0.50) were found for presence of sphincter involvement, low versus high-risk cT-stage, cN0 vs N+ stage, EMVI and MRF involvement. Confidence scores were lowest for assessing the level of sphincter involvement.

Main problem areas

Morphology (shape)

As shown in Table 3, readers showed relatively poor IOA and agreement with the expert reference when differentiating between annular, semi-annular and polypoid tumours. Readers were mainly inconsistent in discerning semi-annular from polypoid tumours (accuracy 62-64%) (see Figure 1).

Table 3 Mean diagnostic accuracy and interobserver agreement for the main categorical staging variables

| Staging variables* | IOA suboptimal (<60%), % of cases | IOA moderate (60-80%), % of cases | IOA good (>80%), % of cases | Agreement with expert reference | Reference standard (i.e. results of expert consensus reading; total n=75) |
|---|---|---|---|---|---|
| I – Main staging variables used for risk stratification | | | | | |
| cT-stage (cT0/1/2/3/4a/4b) | 16% | 40% | 44% | 69% | 13 cT1-2, 39 cT3, 10 cT4a, 13 cT4b |
| dichotomized cT-stage (≤cT3ab vs. ≥cT3cd) | 17% | 23% | 60% | 81% | 33 low risk, 42 high risk |
| cN-stage (cN0/1/2) | 40% | 32% | 28% | 60% | 21 cN0, 26 cN1, 28 cN2 |
| dichotomized cN-stage (cN0/cN+) | 9% | 33% | 57% | 80% | 21 cN0, 54 cN+ |
| lateral N+ nodes (yes/no) | 5% | 19% | 76% | 88% | 16 Yes, 59 No |
| tumour deposits (yes/no) | 7% | 19% | 75% | 86% | 13 Yes, 62 No |
| MRF involvement (yes/no) | 13% | 23% | 64% | 82% | 28 Yes, 47 No |
| EMVI (yes/no) | 9% | 36% | 55% | 77% | 30 Yes, 45 No |
| Ia – Assessing the presence of high risk disease | | | | | |
| Presence of any (≥1) high risk feature (≥T3cd, N+, tumour deposits, or EMVI) | 9% | 20% | 71% | 84% | 60 Yes, 15 No |
| II – Other staging variables | | | | | |
| morphology – shape (annular/semi-annular/polypoid) | 29% | 31% | 40% | 67% | 29 annular, 36 semi-annular, 10 polypoid |
| morphology – composition (solid/mucinous/mixed) | 8% | 27% | 65% | 79% | 58 solid, 5 mucinous, 12 mixed |
| relation to peritoneal reflection (above/below/straddling) | 12% | 23% | 65% | 77% | 10 above, 41 below, 24 straddling |
| sphincter invasion (yes/no) | 5% | 16% | 79% | 91% | 16 Yes, 59 No |

Note: diagnostic accuracies represent the averages for all cases and readers combined.
* The categorical staging variables cT3 substage (cT3ab vs cT3cd) and level of sphincter involvement (internal, intersphincteric plane, external sphincter) are not included in this table, as these were only available for varying subsets of patients depending on assigned cT-stage and presence of sphincter invasion.
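The three agreement bands used in Table 3 can be illustrated with a short sketch. It assumes that per-case agreement is taken as the share of readers choosing the most frequently assigned category; this definition is not spelled out in this section, so it should be read as one plausible interpretation. The reader counts for Case A of Figure 1 (6 annular, 10 polypoid, 5 semi-annular) are back-calculated from the reported percentages, assuming all 21 readers scored the case, and the function names are hypothetical.

```python
from collections import Counter


def case_agreement(scores):
    """Share of readers who chose the most frequent category for one case."""
    counts = Counter(scores)
    return counts.most_common(1)[0][1] / len(scores)


def agreement_band(share):
    """Map a per-case agreement share onto the bands used in Table 3."""
    if share > 0.80:
        return "good"
    if share >= 0.60:
        return "moderate"
    return "suboptimal"


# Figure 1, Case A: counts derived from 29% / 48% / 24% of 21 readers (assumption).
case_a = ["annular"] * 6 + ["polypoid"] * 10 + ["semi-annular"] * 5
share = case_agreement(case_a)
print(f"{share:.0%} -> {agreement_band(share)}")  # 48% -> suboptimal
```

Under this assumption, a case such as Case A, in which no single category attracted even half of the readers, falls into the suboptimal band.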

Figure 1. Example of two cases that were scored as semi-annular by the two expert readers. There was large variation among the study readers. Case A was scored by 29% as annular, by 48% as polypoid, and by 24% as semi-annular. Case B was scored by 10% as annular, by 43% as polypoid, and by 48% as semi-annular.

cT-staging

Agreement with the expert reference for cT-staging was 69%, which was considerably lower than for dichotomous evaluation of low risk (cT1-2-3ab) versus high risk (cT3cd-4) cT-stage (agreement 81%). IOA was also better for the dichotomous evaluation. Results were poorest for assessment of cT4a (54% agreement with the expert reference) and cT4b tumours (59% agreement), with mean understaging rates of 30-34% (example shown in Figure 2).
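The effect of dichotomizing the cT-stage on agreement with the expert reference can be illustrated with a small sketch. The low versus high-risk split follows the definition given above (cT1-2-3ab versus cT3cd-4); the reader scores, the reference stage and all function names are hypothetical and serve only to show the mechanics.

```python
# High risk is >= cT3cd; every other stage counts as low risk.
# A cT3 score therefore needs its substage (a-d) to be assigned a risk group.
HIGH_RISK = {"cT3c", "cT3d", "cT4a", "cT4b"}


def main_stage(stage):
    """Collapse cT3 substages to match the categories cT0/1/2/3/4a/4b."""
    return "cT3" if stage.startswith("cT3") else stage


def risk_group(stage):
    """Dichotomize a cT-stage into low risk (<= cT3ab) or high risk (>= cT3cd)."""
    return "high" if stage in HIGH_RISK else "low"


def agreement(reader_scores, reference, collapse):
    """Share of readers whose collapsed score matches the collapsed reference."""
    target = collapse(reference)
    return sum(collapse(s) == target for s in reader_scores) / len(reader_scores)


# Hypothetical case: expert reference cT4a, five hypothetical reader scores.
reference = "cT4a"
readers = ["cT4a", "cT3d", "cT3c", "cT4b", "cT3b"]
multi = agreement(readers, reference, main_stage)
dichotomized = agreement(readers, reference, risk_group)
print(f"multicategorical: {multi:.0%}, dichotomized: {dichotomized:.0%}")
# prints "multicategorical: 20%, dichotomized: 80%"
```

Note that dichotomization does not raise agreement in every case: a reader scoring cT3b against a cT3cd reference agrees on the main stage (cT3) but disagrees on the risk group.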

Figure 2. Example of a case that was staged as cT4a by expert consensus because of involvement of the peritoneum on the left anterior side, above the level of the anterior peritoneal reflection (arrowheads in A). There is simultaneous focal involvement of the MRF below the level of the anterior peritoneal reflection (arrow in B). The study readers unanimously scored this case as MRF+ (100%), but 81% of readers scored it as cT3 and only 19% as cT4a.

cN-staging

Agreement with the expert reference for dichotomized assessment of cN0/N+ stage was 80%, which was considerably higher than for multicategorical cN-staging as cN0/1/2 (agreement 60%). The number of detected positive mesorectal lymph nodes varied widely, as reflected by the very low group agreement in defining the number of suspicious mesorectal nodes (α=0.05).

Figure 3. Sagittal image (A) showing a stenosing tumour in the mid-rectum with corresponding cross-section (B) at the mid-tumour level. Adjacent to the tumour there is some desmoplastic stranding as well as several small vessels that radiate outward from the edge of the muscularis propria into the perirectal fat (arrows in B). Because no tumour signal extends into, interrupts or expands these vessels, this case was scored as EMVI-negative by expert consensus. There was, however, considerable variation among the study readers: 57% considered this case as EMVI+, 33% as EMVI- and 10% assigned an equivocal score.

EMVI

IOA for EMVI was relatively low: >80% consensus between readers was reached in only 55% of the study cases, and mean agreement with the expert reference was 77%. Agreement with the expert reference was poorer for the diagnosis of EMVI+ tumours (69%) than for the diagnosis of EMVI- tumours (82%; see Figure 3).
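The relation between the subgroup figures and the overall EMVI agreement can be checked with simple arithmetic. The sketch below pools the agreement for EMVI+ and EMVI- cases, weighted by the number of cases in each group according to the expert reference (30 and 45, see Table 3). It assumes that every reader scored every case, so that each case carries equal weight; the function name is hypothetical.

```python
def pooled_agreement(groups):
    """Case-weighted mean agreement.

    groups: list of (number_of_cases, mean_agreement) tuples.
    """
    total_cases = sum(n for n, _ in groups)
    return sum(n * agreement for n, agreement in groups) / total_cases


# 30 EMVI+ cases with 69% agreement, 45 EMVI- cases with 82% agreement.
emvi_overall = pooled_agreement([(30, 0.69), (45, 0.82)])
print(f"{emvi_overall:.0%}")  # 77%
```

The pooled value is consistent with the overall agreement of 77% reported for EMVI, and it shows that the overall figure is driven largely by the more numerous EMVI- cases.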

Level of sphincter involvement

IOA for the presence of sphincter involvement (yes/no) was high, but results for the level of sphincter involvement were considerably poorer, with an agreement of only 51% with the expert reference. This was probably (partly) related to suboptimal image angulation: on re-review of all cases with suspected sphincter involvement, a dedicated coronal sequence angled parallel to the anal canal was available in only 38% of these cases (see Figure 4).

Figure 4. Example of a case with suspected sphincter involvement. The coronal sequence is angled parallel to the distal rectum (rather than to the anal canal), making it more difficult to assess the level of sphincter involvement. The two experts assessed this case as suspicious for intersphincteric involvement, mainly based on the transverse sequence (arrow). Results of the study readers were highly variable: 19% considered it as no sphincter invasion, 19% as internal sphincter invasion only, 19% as extending into the intersphincteric space, and 43% as extending into the external sphincter.

Tumour-MRF distance

IOA (and agreement with the expert reference) for an involved MRF was high, but there was large variation in measuring the tumour-MRF distance, with an IOA of only α=0.33 (two examples shown in Figure 5).

Figure 5. Examples of two cases where there was substantial variation in tumour-MRF distance among readers. The upper row shows the sagittal (A) and transverse (B) images of a male patient with an upper rectal tumour, situated above the level of the anterior peritoneal reflection. Six out of 21 readers (29%) erroneously interpreted the anterior invasion of the peritoneum as MRF invasion and measured a tumour-MRF distance of 0 mm, while other readers (and the expert reference) measured the distance from the tumour to the dorsal MRF as >10 mm. The bottom row (sagittal C, transverse D) shows a male patient with a mid-rectal tumour. Some readers interpreted the anterior rectal wall as involved, while other readers (and the expert reference) believed this tumour was confined to the left dorsolateral wall, resulting in a tumour-MRF distance of >10 mm.

Discussion

In this study we evaluated uniformity in structured reporting of rectal cancer among an international group of 21 radiologists. When looking at the main staging variables used for risk and treatment stratification, good results were achieved for dichotomized assessment of high versus low-risk cT-stage and cN0 vs N+ stage, lateral nodes and tumour deposits, and MRF involvement. These variables resulted in good (>80%) interobserver agreement in the majority of patient cases. Less favourable results were found for multicategorical cT- and cN-staging, and for assessing EMVI.

We observed a clear, significant positive correlation between diagnostic confidence and the agreement of study readers with an expert standard of reference (which was used as a surrogate standard to estimate diagnostic accuracy, considering the lack of pathologic correlation for most study cases and variables). This correlation is an interesting finding considering that template reports tend to force radiologists to assign a specific stage or category even if they are unsure about their diagnosis, which is a potential drawback of structured reporting. Our results show that this uncertainty can lead to substantial variations in staging. Given the potential impact on treatment, it might be better to incorporate the level of confidence into structured reporting templates to better inform risk-based clinical decision making during multidisciplinary team discussions, at least for those items that are poorly reproducible and strongly influenced by variations in diagnostic confidence.
Interestingly, the dichotomized classification of cT-stage into high risk (≥cT3cd) versus low risk (≤cT3ab) resulted in better IOA than the multicategorical cT-stage classification as defined in the TNM staging manual [4]. Some guidelines, like those from the National Comprehensive Cancer Network (NCCN) [18], use cT3 disease as a key risk factor indicating a need for neoadjuvant treatment. However, several studies have shown that MRI often overstages T2 tumours as T3, resulting in potential overtreatment [19–21]. Other guidelines consider early stage T3 disease (T3ab with ≤5 mm extramural invasion) as low risk, with prognostic outcomes and treatment implications comparable to those of T2 tumours [22,23]. Our results show that this dichotomous classification results in better agreement and is therefore perhaps better suited to guide risk and treatment stratification. Assessment of T4a and T4b disease remains a problem area with poor IOA. As illustrated in Figure 2, there were several cT4a cases with simultaneous MRF involvement that were understaged as T3 MRF+. Apparently, readers tend to choose either MRF or peritoneal (cT4a) invasion, rather than acknowledging that the two may co-occur. Moreover, some readers may mistake anterior invasion of the peritoneum above the level of the peritoneal reflection for MRF invasion (Figure 5AB). These issues were also identified as pitfalls that require further teaching in a recent international survey study [12]. It could be helpful to include the anatomical boundaries between MRF and peritoneum, and the distinction between cT3 MRF+ and cT4a MRF+ disease, as specific staging options in template reports, aiming to increase radiologists’ awareness and ultimately uniformity in staging. Future studies should focus on establishing the benefit of further training and teaching in reducing interobserver variability in such pitfall cases.

Similar to cT-staging, results for cN-staging were also better when dichotomized (into cN0 vs cN+). Considering the known limitations of MRI, radiologists should perhaps refrain from detailed cN-staging and limit themselves to estimating the risk for cN0 or cN+ disease, including a level of confidence to support their findings.

Results for assessment of lateral nodes and tumour deposits were remarkably good, especially when considering the concerns voiced in recent reports on the lack of validated imaging criteria [12]. These good results can likely be partly explained by the low prevalence of positive lateral nodes and tumour deposits in our cohort and the fact that agreement with our expert standard of reference was particularly high (88-93%) for diagnosing these negative cases.

Results for assessment of EMVI were lower than expected (>80% interobserver agreement reached in only half of the study cases). Although image-detected EMVI has been acknowledged as an important prognostic factor for some time, it was introduced into reporting guides and templates more recently and is still less routinely reported than other risk factors such as TN-stage and MRF [1,7]. Some readers may therefore still be going through a learning curve and experience difficulties with less straightforward cases such as the one in Figure 3. In line with a previous report that showed relatively low sensitivity and PPV (62-67%) and high specificity and NPV (88-89%) [17], radiologists in our study were also better at assessing EMVI- than EMVI+ tumours.
Further teaching, and inclusion of specific published grading systems for diagnosing EMVI+ disease in reporting templates, could help to further improve uniformity in the staging of EMVI.

When looking at the other staging items, tumour morphology (in particular the distinction between annular, semi-annular and polypoid tumours) and level of sphincter involvement showed the poorest results. Describing the tumour morphology is mainly relevant because polypoid tumours typically have a better prognosis than (semi-)annular tumours with a more extensive invasive margin [24]. Disease-focused panel recommendations from the Society of Abdominal Radiology (SAR) define a polypoid tumour as a tumour with a pedicle or stalk [25]. A later report by Golia Pernicka et al. suggested redefining polypoid tumours as having ≤¼ wall circumference attachment and a visible pedicle [24]. Future guidelines and reporting templates should perhaps adopt such more specific definitions to improve uniformity in the radiological assessment of tumour morphology. Moreover, radiologists should be properly informed about the morphology (but also size and location) of tumour lesions from the endoscopy reports to allow proper clinically-informed image interpretation and reporting.

Concerning the level of sphincter involvement, recent guidelines recommend that in distal tumours a coronal sequence parallel to the anal canal should be included to properly assess the relation between tumour and anal sphincter [1]. In our current cohort, such a sequence was available in only 38% of the cases with suspected sphincter involvement. Our cohort dates back as far as 2010, which means that some scans were obtained using outdated study protocols, including suboptimal sequence angulation. Though a more in-depth analysis of the impact of image quality was beyond the scope of this paper, we acknowledge this as a limitation that may have introduced bias.

There are some other limitations to our study design. First and foremost, as mentioned above, accuracy figures could only be estimated and were calculated using expert consensus as a standard of reference, considering the lack of histological confirmation for the majority of patients. Our study cohort was skewed towards advanced disease (77%) and direct correlation with pathology was only available for 17 cases. Moreover, not all staging variables were routinely reported in the pathology reports. Diagnostic confidence scores were furthermore missing for some of the staging variables. Image evaluation was performed using only T2-weighted sequences. Though DWI sequences are not routinely recommended for primary staging (but mainly for restaging) [1], they are commonly included in primary staging protocols and might have aided tumour detection and delineation. MRI scans in our cohort originate from one single country.
Still, we believe that the sample derived from 10 different centres is representative of general clinical routine, with representation of different vendors and common variations in clinical protocols. Finally, the number of patient cases assessed in this study was relatively small (n=75). This number was chosen as a minimum requirement to allow meaningful statistical analyses [26] while at the same time keeping it feasible for a large number of radiologists to complete all the scans within an acceptable timeframe.

In conclusion, this study shows that, although structured reporting aims to accomplish uniformity in staging, there are still some pitfalls that need to be acknowledged as they may result in insufficient staging reproducibility. Suggestions for improvement include more simplified, dichotomized risk stratification of cT- and cN-stage, adoption of confidence scores for items with low reproducibility, embedding more specific definitions for image interpretation into staging templates, and ensuring state-of-the-art image protocols.

References

1. Beets-Tan RGH, Lambregts DMJ, Maas M, Bipat S, Barbaro B, Curvo-Semedo L, et al. Magnetic resonance imaging for clinical management of rectal cancer: Updated recommendations from the 2016 European Society of Gastrointestinal and Abdominal Radiology (ESGAR) consensus meeting. Eur Radiol 2018; 28(4):1465–75.
2. Kassam Z, Lang R, Arya S, Bates DDB, Chang KJ, Fraum TJ, et al. Update to the structured MRI report for primary staging of rectal cancer: Perspective from the SAR Disease Focused Panel on Rectal and Anal Cancer. Abdom Radiol (NY) 2022; 47:3364–74.
3. Nougaret S, Rousset P, Gormly K, Lucidarme O, Brunelle S, Milot L, et al. Structured and shared MRI staging lexicon and report of rectal cancer: A consensus proposal by the French Radiology Group (GRERCAR) and Surgical Group (GRECCAR) for rectal cancer. Diagn Interv Imaging 2022; 103(3):127–41.
4. Jessup MJ, Goldberg RM, Asare EA, et al. Colon and rectum. In: Amin MB, Edge S, Greene F, Byrd DR, Brookland RK, Washington MK, et al. (eds). AJCC Cancer Staging Manual (8th edition). Springer; 2017. p. 251–273.
5. Brown PJ, Rossington H, Taylor J, Lambregts DMJ, Morris E, West NP, et al. Standardised reports with a template format are superior to free text reports: the case for rectal cancer reporting in clinical practice. Eur Radiol 2019; 29(9):5121–8.
6. Zhao AH, Matalon SA, Shinagare AB, Lee LK, Boland GW, Khorasani R. Improving the completeness of structured MRI reports for rectal cancer staging. Abdom Radiol (NY) 2021; 46(3):885–93.
7. Bogveradze N, el Khababi N, Schurink NW, van Griethuysen JJM, de Bie S, Bosma G, et al. Evolutions in rectal cancer MRI staging and risk stratification in The Netherlands. Abdom Radiol (NY) 2022; 47(1):38–47.
8. Alvfeldt G, Aspelin P, Blomqvist L, Sellberg N. Radiology reporting in rectal cancer using MRI: adherence to national template for structured reporting. Acta Radiol 2022; 63(12):1603–1612.
9. Nörenberg D, Sommer WH, Thasler W, D’Haese J, Rentsch M, Kolben T, et al. Structured Reporting of Rectal Magnetic Resonance Imaging in Suspected Primary Rectal Cancer: Potential Benefits for Surgical Planning and Interdisciplinary Communication. Invest Radiol 2017; 52(4):232–9.
10. Tersteeg JJC, Gobardhan PD, Crolla RMPH, Kint PAM, Niers-Stobbe I, Boonman–de Winter LJM, et al. Improving the Quality of MRI Reports of Preoperative Patients With Rectal Cancer: Effect of National Guidelines and Structured Reporting. AJR Am J Roentgenol 2018; 210(6):1240–4.
11. Brouwer NPM, Stijns RCH, Lemmens VEPP, Nagtegaal ID, Beets-Tan RGH, Fütterer JJ, et al. Clinical lymph node staging in colorectal cancer; a flip of the coin? Eur J Surg Oncol 2018; 44(8):1241–6.
12. Lambregts DMJ, Bogveradze N, Blomqvist LK, Fokas E, Garcia-Aguilar J, Glimelius B, et al. Current controversies in TNM for the radiological staging of rectal cancer and how to deal with them: results of a global online survey and multidisciplinary expert consensus. Eur Radiol 2022; 32(7):4991–5003.
13. Gollub MJ, Lall C, Lalwani N, Rosenthal MH. Current Controversy, Confusion and Imprecision in the Use and Interpretation of Rectal MRI. Abdom Radiol (NY) 2019; 44(11):3549.
14. Schurink NW, van Kranen SR, Roberti S, van Griethuysen JJMM, Bogveradze N, Castagnoli F, et al. Sources of variation in multicenter rectal MRI data and their effect on radiomics feature reproducibility. Eur Radiol 2022; 32(3):1506–16.
15. el Khababi N, Beets-Tan RGH, Tissier R, Lahaye MJ, Maas M, Curvo-Semedo L, et al. Comparison of MRI response evaluation methods in rectal cancer: a multicentre and multireader validation study. Eur Radiol 2022; doi: 10.1007/s00330-022-09342-w.
16. Ziegler E, Urban T, Brown D, Petts J, Pieper SD, Lewis R, et al. Open Health Imaging Foundation Viewer: An Extensible Open-Source Framework for Building Web-Based Imaging Applications to Support Cancer Research. JCO Clin Cancer Inform 2020; 4(4):336–45.