
The reliability and construct validity of student perceptions of teaching quality

included in Impact! and each item addressed only one aspect of teaching. Four answer options were given to avoid central tendency in responding to the items (Weisberg, 1992). Students were asked to rate a lesson on scientifically proven characteristics of effective lessons (based on the above-mentioned literature review) rather than, for example, being asked whether they liked the lesson, and all items concerned one single lesson rather than the teacher’s lessons in general. The items were all formulated in a teacher-centred way (e.g., “The teacher created a safe atmosphere during the lesson”) to determine the teacher’s contribution to lesson quality. Because students were supposed to give their own opinion when answering the items, “I” was used instead of “our class” (e.g., “The teacher clearly indicated what I was going to learn”). Based on feedback from fellow researchers, teachers and students, a final set of 16 items was included in the Impact! questionnaire. The questionnaire can be found in Appendix A.

1.6 THIS DISSERTATION

To investigate the main topics introduced in this chapter – the validity and impact of student perceptions of teaching quality – four studies were conducted that form the core of this dissertation. Each of these studies is introduced in the following sections.

1.6.1 The reliability and validity of student perceptions of teaching quality

The first study answered the following research questions: Is there support for the construct validity of the Impact! questionnaire? How reliable are student perceptions of teaching quality as measured by means of the Impact! tool? We investigated the student Impact! ratings for 26 mathematics teachers (717 students) using a multi-level modelling approach.
A combined item response theory (IRT) and generalizability theory (GT) model (unidimensional model, latent score) was used to model and measure the scores. Three models were examined (Glas, 1999, 2016). The models differed systematically: 1) a full model, where differences between students, teachers, measurement timings and interactions between these variance components were included; 2) a model where the interaction between time and teachers was removed; 3) a linear regression model, to analyse teachers’ growth curves over time.
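
To make the role of the variance components concrete, the following minimal sketch shows how, in a much simpler design than the one used in the study, a generalizability coefficient for a teacher's mean rating follows from two variance components. It assumes a one-facet design with students nested within teachers and observed (not latent) scores; it is not the combined IRT and GT model described above. The function name and the variance values are hypothetical and serve only as illustration.

```python
def generalizability_coefficient(var_teacher: float,
                                 var_residual: float,
                                 n_students: int) -> float:
    """Reliability of a teacher's mean rating averaged over n_students.

    Simplified one-facet GT design (students nested within teachers):
        rho = var_teacher / (var_teacher + var_residual / n_students)
    """
    if n_students < 1:
        raise ValueError("n_students must be at least 1")
    if var_teacher < 0 or var_residual < 0:
        raise ValueError("variance components must be non-negative")
    # Averaging over students shrinks the error variance by 1 / n_students.
    error_variance = var_residual / n_students
    total_variance = var_teacher + error_variance
    if total_variance == 0:
        raise ValueError("total variance is zero; coefficient is undefined")
    return var_teacher / total_variance


# Hypothetical variance components, NOT estimates from the study.
VAR_TEACHER = 0.2   # variance between teachers
VAR_RESIDUAL = 0.8  # variance between students within a teacher, plus error

rho_single = generalizability_coefficient(VAR_TEACHER, VAR_RESIDUAL, 1)
rho_class = generalizability_coefficient(VAR_TEACHER, VAR_RESIDUAL, 25)
print(f"one student: {rho_single:.3f}, class of 25: {rho_class:.3f}")
```

Under these assumed values, the coefficient rises as more students rate the same teacher, which is why class-level averages can be reliable even when individual ratings are noisy. The full model in the study additionally separates measurement timings and interactions, which this sketch leaves out.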
