
The reliability and construct validity of student perceptions of teaching quality

Considering the content of the items, the results are quite plausible. For example, the local reliability of item 2 (“The teacher explained the subject matter in such a way that I understood it well”) was high. It makes sense that this item reflects teaching quality from a student perspective well. Students were also able to discriminate well between the Impact! items (high discriminant validity), in that the discrimination parameters were all high. This is in line with the results of other recent studies in which support was found for the discriminant validity of student perceptions of teaching quality, for example, Kunter and Baumert (2006), Fauth et al. (2014), Kane et al. (2010), and van der Scheer et al. (2019).

2.6.2 Discussion and recommendations for future research

The results of the current study support the construct validity of the Impact! questionnaire. The model with one underlying construct (hypothesized to be teaching quality, based on the literature review) was tested, and good fit was found for a model containing all items. However, teaching is a complex activity that includes multiple dimensions, which could also be modelled as multiple constructs (Feldman, 1997; Marsh & Roche, 1997; Roche & Marsh, 2000; Sammons et al., 1995). An example is the ICALT questionnaire (Maulana et al., 2015; van der Lans et al., 2015), in which six aspects of teaching quality are distinguished. Kyriakides and Creemers (2016), Fauth et al. (2014), and Praetorius et al. (2018) distinguished three generic aspects of teaching quality. In these examples, more than one aspect of teaching quality was modelled as a separate construct. How, then, should the construct validity results of the current study be interpreted? We found that the expected scores of the model with one construct fit the observed scores well.
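To make the idea of item discrimination concrete, the following sketch computes corrected item-total correlations, a simple classical-test-theory proxy for the IRT discrimination parameters discussed above. This is an illustration only, not the estimation procedure used in the study; the response matrix and item count are entirely hypothetical.

```python
# Illustrative sketch (not the study's actual IRT estimation): corrected
# item-total correlations as a rough proxy for item discrimination.
# All response data below are synthetic and hypothetical.
import statistics

def corrected_item_total(responses, item):
    """Correlate one item with the sum of the remaining items."""
    item_scores = [row[item] for row in responses]
    rest_scores = [sum(row) - row[item] for row in responses]
    mi, mr = statistics.mean(item_scores), statistics.mean(rest_scores)
    cov = sum((x - mi) * (y - mr) for x, y in zip(item_scores, rest_scores))
    var_i = sum((x - mi) ** 2 for x in item_scores)
    var_r = sum((y - mr) ** 2 for y in rest_scores)
    return cov / (var_i ** 0.5 * var_r ** 0.5)

# Hypothetical 4-point Likert responses (rows: students, columns: items).
data = [
    [4, 4, 3, 4],
    [2, 1, 2, 1],
    [3, 3, 3, 2],
    [1, 2, 1, 2],
    [4, 3, 4, 4],
]

for i in range(4):
    print(f"item {i + 1}: r_it = {corrected_item_total(data, i):.2f}")
```

An item with a high corrected item-total correlation separates high- and low-scoring students well, which is the same intuition behind a high discrimination parameter in an IRT model.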
We could have used a model with multiple aspects of teaching quality; however, the model would then have been less parsimonious, which leads to more unexplained variance. So, for the present purpose, that is, to investigate the reliability of the measurements and to determine whether a principal component underlying the responses could support validity, a unidimensional model proved sufficient.

Scores from students evaluating teaching quality vary. It is reasonable that 36.6% of the variance is explained by differences between teachers, because students might perceive differences in the quality of teachers’ teaching. However, students could also rate teachers differently based on teachers’ gender, teaching experience, or age, or because some teachers are more popular. Furthermore, 24.4% of the variance in the scores was explained by differences in student characteristics, for example students’ gender, performance level, or age. In future
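The variance shares reported above come from a multilevel decomposition of the rating variance. A minimal sketch of the underlying idea, using synthetic data and a crude between/within split rather than the study's multilevel model, looks like this (teacher labels and all numbers are hypothetical):

```python
# Illustrative sketch of a variance decomposition: the share of rating
# variance attributable to differences between teachers. Synthetic data;
# the study itself used a multilevel model, not this crude split.
import statistics

# Hypothetical student ratings grouped by teacher (equal group sizes).
ratings_by_teacher = {
    "teacher_a": [2.0, 3.5, 4.0, 2.5],
    "teacher_b": [1.5, 3.0, 2.0, 3.5],
    "teacher_c": [3.0, 4.0, 2.5, 4.5],
}

all_ratings = [r for group in ratings_by_teacher.values() for r in group]
total_var = statistics.pvariance(all_ratings)

# Between-teacher variance: spread of the teacher means around the grand
# mean (exact for equal group sizes, by the law of total variance).
group_means = [statistics.mean(g) for g in ratings_by_teacher.values()]
between_var = statistics.pvariance(group_means)

share_teachers = between_var / total_var
print(f"share of variance between teachers: {share_teachers:.1%}")
# prints: share of variance between teachers: 21.1%
```

The remaining variance is within teachers, i.e. attributable to students and residual noise; a real analysis would add student-level predictors to separate those, as the study does.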
