Investigation of Rater Tendencies and Reliability in Different Assessment Methods with Many Facet Rasch Model

Duygu Koçak


Open-ended items are among the most commonly used methods for measuring higher-order thinking skills such as problem solving and written expression. Three main approaches are used to evaluate responses to open-ended items: general evaluation, rating scales, and rubrics. To measure and improve students' problem-solving skills, the measurement process must first be free of error. Rater-induced error is a common problem in the evaluation of open-ended items. Such errors, including bias and a tendency to score too high or too low, adversely affect the accuracy of the decisions based on the scores. This study evaluates raters' tendencies under the general evaluation, rating scale, and rubric conditions used to score open-ended items. Rater behavior under each assessment method and the raters' opinions about the methods were determined. The participants were 12 mathematics teachers, and the analyses were based on the Many Facet Rasch Model. The scoring reliability of each method was estimated. The raters' scoring tendencies were found to be more homogeneous when the rating scale was used. In addition, the majority of raters stated that they prefer to use the rubric, and the raters also identified the method they found most difficult to use.




