
We kept track of how many times the tool was used and labelled the results as belonging to Time Point 1, Time Point 2, and so forth. The Impact! items can be found in Appendix A. Items 1 to 15 are answered on a 4-point Likert scale (0 = totally disagree; 1 = disagree; 2 = agree; 3 = totally agree). Item 16 is an open-ended question in which students can enter their answers (the answers to this question were not used in this study). Items 6, 8, and 11 included an extra option, not applicable, because these items do not apply to every lesson. The items of the Impact! questionnaire were originally scored 0, 1, 2, or 3. However, the lowest answer category was chosen too rarely, which had a negative effect on the stability of the analyses. Therefore, the two lowest categories were combined into one category in the analyses, so that the scores became 0, 1, and 2 (a schematic sketch of this recoding is given at the end of Section 3.3.5). The extra answer option not applicable was registered as a missing value. The data have a multilevel structure pertaining to teachers, students, and time points: students’ responses were nested within teachers and collected at different time points.

3.3.4 Variables at the student, teacher, and classroom levels

Table 3.1 presents an overview of all variables included in the analysis at the student, classroom, and teacher levels. Except for the timing of the questionnaire administration (morning or afternoon), which was recorded automatically by the digital Impact! tool, all data were collected by means of paper-based questionnaires administered prior to the start of the study.

3.3.5 Analyses

To answer the research question, a combined item response theory (IRT) and generalizability theory (GT) model was developed. This type of analysis, introduced by Fox and Glas (2001, 2003), provides a theoretically well-founded framework for educational measurement. It has become the standard statistical tool for supporting the construction of instruments and for evaluating test bias and differential item functioning (e.g., Lord, 1980). The methodology is also used in the evaluation of high-stakes tests and examinations, for instance in large-scale international educational surveys such as the Programme for International Student Assessment (PISA), the Trends in International Mathematics and Science Study (TIMSS), the Progress in International Reading Literacy Study (PIRLS), the International Computer and Information Literacy Study (ICILS), and the Programme for the International Assessment of Adult Competencies (PIAAC).
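As an illustration of how such a combined model can be set up, the following is a schematic sketch only; it assumes a normal-ogive graded response model as the IRT measurement layer and a simple three-way variance decomposition as the GT layer, and it is not necessarily the exact specification used in this study (for which the reader is referred to Fox and Glas, 2001, 2003). For student $i$ of teacher $j$ at time point $t$, let $Y_{ijtk}$ denote the recoded response (0, 1, or 2) to item $k$. The measurement layer could then be written as

$$P\left(Y_{ijtk} \geq c \mid \theta_{ijt}\right) = \Phi\left(a_k \theta_{ijt} - b_{kc}\right), \qquad c = 1, 2,$$

with discrimination parameter $a_k$ and ordered thresholds $b_{k1} < b_{k2}$ per item, while the latent perceived teaching quality $\theta_{ijt}$ is decomposed into variance components in the spirit of generalizability theory:

$$\theta_{ijt} = \mu + \tau_j + \nu_{ij} + \varepsilon_{ijt}, \qquad \tau_j \sim N\!\left(0, \sigma^2_{\tau}\right), \quad \nu_{ij} \sim N\!\left(0, \sigma^2_{\nu}\right), \quad \varepsilon_{ijt} \sim N\!\left(0, \sigma^2_{\varepsilon}\right),$$

where $\tau_j$ is a teacher effect, $\nu_{ij}$ a student-within-teacher effect, and $\varepsilon_{ijt}$ an occasion-specific deviation. The estimated variance components play the role of the facets in a generalizability study and quantify how much of the variation in perceived teaching quality is attributable to teachers, to students within teachers, and to measurement occasions.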

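Purely as an illustration of the data preparation described in Section 3.3.3, the recoding of the item responses could be carried out as in the sketch below. The file name, column names, and storage format are assumptions made for the example; only the recoding rule itself (treating not applicable as missing and collapsing the two lowest categories) is taken from the text.

import numpy as np
import pandas as pd

# Hypothetical data layout: one row per student response, with Items 1-15
# in columns item_1 ... item_15, originally scored 0-3 (file name is illustrative).
df = pd.read_csv("impact_responses.csv")
items = [f"item_{k}" for k in range(1, 16)]

# The extra option "not applicable" (Items 6, 8, and 11) is registered as missing.
df[items] = df[items].replace({"not applicable": np.nan})
df[items] = df[items].apply(pd.to_numeric, errors="coerce")

# Collapse the two lowest categories (totally disagree, disagree) into one,
# so that the recoded item scores run 0, 1, 2.
df[items] = df[items].replace({0: 0, 1: 0, 2: 1, 3: 2})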