Published
March 23, 2025
| Pages: 237-251
Abstract
The National Assessment of Educational Progress (NAEP), often referred to as The Nation’s Report Card, offers a window into the state of the U.S. K-12 education system. In 2017, NAEP transitioned to digital assessments, opening research opportunities that were previously impossible. Process data track students’ interactions with the assessment and help researchers explore students’ decision-making processes. Response change is one behavior that can be observed and analyzed with the help of process data. Response change research typically focuses on multiple-choice items, where response changes are readily evident in process data; to our knowledge, however, response change behavior has not been analyzed in constructed response items. This study presents a framework for such analyses: a dimensional schema that detects what kinds of response changes students make and, by integrating an automated scoring mechanism, relates those changes to student performance. Results show that students change the grammar, structure, and meaning of their responses. Results also reveal that while most students maintained their initial score across attempts, among those whose score did change, meaning changes were more likely to improve scores than grammar or structure changes. This study shows how combining automated item scoring with dimensional response changes can be used to investigate how response change patterns may impact student performance.
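The dimensional schema described above classifies each revision of a constructed response along dimensions such as grammar, structure, and meaning. A minimal sketch of that idea, using a token-level diff between two drafts, might look like the following. This is an illustration only, not the authors’ implementation: the dimension names follow the abstract, but the diff heuristic and the 0.8 similarity threshold are assumptions made for the example.

```python
# Illustrative sketch (not the study's actual method): classify the
# difference between two drafts of a constructed response with a simple
# token-level diff from the standard library.
from difflib import SequenceMatcher

def classify_change(initial: str, final: str) -> str:
    a, b = initial.split(), final.split()
    if a == b:
        return "no_change"
    # Same word multiset in a different order -> structural reordering.
    if sorted(a) == sorted(b):
        return "structure"
    # Small, localized edits (a word form tweaked) -> grammar-like change.
    # The 0.8 threshold is an arbitrary choice for this sketch.
    if SequenceMatcher(None, a, b).ratio() >= 0.8:
        return "grammar"
    # Otherwise treat the revision as a change in meaning.
    return "meaning"

print(classify_change("the cat sat", "the cat sat"))    # no_change
print(classify_change("sat the cat", "the cat sat"))    # structure
print(classify_change("the cat sits on mats here",
                      "the cats sits on mats here"))    # grammar
print(classify_change("the cat sat", "dogs run fast"))  # meaning
```

In a real pipeline, each classified change would then be paired with the automated score of each draft to ask whether a given change type tends to raise or lower the score, mirroring the comparison reported in the abstract.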
Keywords
Response Change, Process Data, Constructed Response, Automated Scoring, Writing Behavior
Affiliations
Congning Ni
Vanderbilt University
Bhashithe Abeysinghe
American Institutes for Research
Juanita Hicks
American Institutes for Research