Keywords
Ability; cumulative score; Rasch; dichotomous responses
Document Type
Article
Abstract
In a test, a method is required to estimate an individual's ability from their responses. Typically this is done by summing the correct responses into a cumulative score; an alternative is the Rasch model. This study examines whether an individual's position based on cumulative-score estimates remains the same or changes when ability is instead estimated with the Rasch model on dichotomous responses. The study uses publicly available data from the 2018 Programme for International Student Assessment (PISA) of the Organisation for Economic Co-operation and Development (OECD), covering 317 Indonesian students. Ability in the Mathematics and Reading domains was estimated using both cumulative scores and the Rasch model on dichotomous responses. Data were analyzed with Rasch modeling, paired-samples t-tests, and descriptive statistics: the cumulative-score and Rasch results were tested against each other with a paired-samples t-test, and the two sets of estimates were then compared descriptively. The results indicate that individual positions differ between ability estimates based on cumulative scores and those based on the Rasch model. These differences stem from variations in scoring, so even two individuals with the same cumulative score may receive different Rasch estimates.
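The abstract's central point, that two test-takers with identical cumulative scores can receive different Rasch ability estimates, can be illustrated when respondents answer different item subsets, as happens with rotated test booklets. The sketch below is illustrative only (it is not the OECD's scaling procedure): it uses hypothetical item difficulties and a simple Newton-Raphson maximum-likelihood solver for the dichotomous Rasch model.

```python
import math

def rasch_theta(responses, difficulties, iters=50):
    """ML ability estimate under the dichotomous Rasch model.

    Solves sum_i P(theta, b_i) = raw score for theta, where
    P(theta, b) = 1 / (1 + exp(-(theta - b))).
    """
    r = sum(responses)          # raw (cumulative) score
    n = len(responses)
    if r == 0 or r == n:
        raise ValueError("zero/perfect scores have no finite ML estimate")
    theta = math.log(r / (n - r))  # logit of proportion correct as start value
    for _ in range(iters):
        p = [1.0 / (1.0 + math.exp(-(theta - b))) for b in difficulties]
        f = sum(p) - r                            # expected minus observed score
        info = sum(pi * (1.0 - pi) for pi in p)   # test information at theta
        theta -= f / info                         # Newton-Raphson step
    return theta

# Two hypothetical booklets with different item difficulties (logits)
booklet_a = [-1.0, 0.0, 1.0, 2.0]   # harder booklet on average
booklet_b = [-2.0, -1.0, 0.0, 1.0]  # easier booklet on average

# Both respondents answer 2 of 4 items correctly: identical cumulative score,
# yet different Rasch ability estimates because the items differ.
theta_a = rasch_theta([1, 1, 0, 0], booklet_a)
theta_b = rasch_theta([1, 1, 0, 0], booklet_b)
print(f"theta_a = {theta_a:.3f}, theta_b = {theta_b:.3f}")
```

With these (assumed) difficulties the estimates come out symmetric around zero (about +0.5 vs. -0.5 logits), showing how the Rasch model credits the same raw score differently depending on which items were attempted. Note that on a single common item set, the raw score is a sufficient statistic for Rasch ability, so divergent orderings arise only when response patterns span different items.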
First Page
79
Last Page
93
Issue
1
ISSN
2338-6061
Volume
28
Digital Object Identifier (DOI)
10.21831/pep.v28i1.71661
DOI Link
http://dx.doi.org/10.21831/pep.v28i1.71661
Recommended Citation
Khair, Muhammad Dhiyaul and Marianti, Sukaesi (2024) "Individual ability on high-stakes test: Choosing cumulative score or rasch for scoring model," Jurnal Penelitian dan Evaluasi Pendidikan: Vol. 28: Iss. 1, Article 6. DOI: 10.21831/pep.v28i1.71661
Available at: https://scholarhub.uny.ac.id/jpep/vol28/iss1/6