
item response theory (IRT) and generalizability theory (GT) model, a good fit was found for a model that included all aspects of teaching quality and all interaction effects, as the absolute differences between the estimated data (based on the IRT-GT model) and the observed data were smaller than 0.1 (Glas, 2016). The results thus showed that the construct validity of the Impact! questionnaire is good, as all items loaded on one scale measuring teaching quality. In chapter 3, the same IRT-GT modelling approach as in chapter 2 was used to model the factors potentially associated with differences in student ratings (content validity). Many of those factors did not seem to bias students’ ratings of teaching quality. Student and teacher gender, teacher age, teachers’ initial teaching quality score, the timing of the measurement (morning or afternoon), class size, and the ethnic diversity of the class were not associated with differences in students’ ratings of teaching quality. High-performing students did rate their teacher’s teaching quality higher, on average, than low-performing and middle-performing students. More likeable and more experienced teachers also received higher teaching quality ratings from their students, and the higher the class’s average math grade, the higher the students rated their teachers. Thus, even though the construct validity of the Impact! tool was supported in our study, students’ ratings may not be fully valid measures, as some additional factors were associated with differences between student ratings. The reasons why these four factors were correlated with students’ ratings of teaching quality are, however, unknown (see chapter 3 of this dissertation).
For example, we do not know whether more likeable teachers teach better, or whether they receive higher teaching quality ratings because they are nice, funny, and so forth, but do not actually teach better than other teachers. Furthermore, one cannot tell a low-performing girl in a classroom who is rating her teacher’s teaching quality that her ratings should actually be higher because her ratings are influenced by her own level of performance. This girl might rate the quality of her instructor’s teaching lower than a high-performing boy or girl does because she does not like her teacher, or she really might find him to be a poor teacher because she does not understand his explanations during the lessons. There is, however, no indication that student perceptions should not be used in educational practice. Naturally, the ratings that students give are not based on a standardized norm, as students are not trained observers who use specific scoring rules. Students spend more time with teachers in the classroom
