Comparing item-total correlation and item-theta correlation in test item selection: A simulation and empirical study

Sukaesi Marianti, Universitas Brawijaya, Indonesia
Ana Rufaida, Universitas Brawijaya, Indonesia
Nur Hasanah, Universitas Brawijaya, Indonesia
Sofia Nuryanti, Universitas Brawijaya, Indonesia

Document Type

Article

Abstract

Item selection is one of the important steps in evaluating the psychometric properties of a test. It usually relies on a widely used technique, item-total correlation. This study describes the item-total correlation technique and compares it with a similar technique, item-theta correlation. Both techniques are applied in simulation studies under several conditions that vary test length and sample size. Empirical data are then analyzed to illustrate the simulation results. The results show that the two approaches lead to different item selections: item-theta correlation detects more items with weak discrimination power than item-total correlation does. The difference is more noticeable when the cutoff point used for item selection is low (.20).
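The comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: the 2PL generating model, the parameter ranges, the sample size and test length, the uncorrected item-total correlation, and the crude Rasch-based EAP ability estimate are all assumptions chosen for illustration, and every function name is hypothetical. Only the .20 cutoff comes from the abstract.

```python
import numpy as np


def simulate_responses(n_persons, n_items, rng):
    """Simulate 0/1 responses under a 2PL model (assumed generating model)."""
    theta = rng.normal(0.0, 1.0, size=n_persons)   # person abilities
    a = rng.uniform(0.2, 2.0, size=n_items)        # discriminations (assumed range)
    b = rng.uniform(-1.5, 1.5, size=n_items)       # difficulties (assumed range)
    p = 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b)))
    x = (rng.uniform(size=p.shape) < p).astype(int)
    return x, theta


def item_total_correlation(x):
    """Pearson correlation of each item with the total score (item included)."""
    total = x.sum(axis=1)
    return np.array([np.corrcoef(x[:, j], total)[0, 1] for j in range(x.shape[1])])


def eap_theta(x, n_quad=41):
    """Crude EAP ability estimate under a Rasch model (illustrative stand-in).

    Item difficulties are approximated by centered logits of the item
    p-values; a real analysis would estimate them with an IRT package.
    """
    p = np.clip(x.mean(axis=0), 0.01, 0.99)
    b = -np.log(p / (1.0 - p))
    b -= b.mean()
    nodes = np.linspace(-4.0, 4.0, n_quad)         # quadrature grid for theta
    prior = np.exp(-0.5 * nodes**2)                # unnormalized N(0, 1) prior
    logit = nodes[:, None] - b[None, :]            # shape (n_quad, n_items)
    log_p = -np.log1p(np.exp(-logit))              # log P(correct)
    log_q = -np.log1p(np.exp(logit))               # log P(incorrect)
    loglik = x @ log_p.T + (1 - x) @ log_q.T       # shape (n_persons, n_quad)
    post = np.exp(loglik) * prior
    post /= post.sum(axis=1, keepdims=True)
    return post @ nodes                            # posterior mean per person


def item_theta_correlation(x, theta_hat):
    """Pearson correlation of each item with the estimated ability."""
    return np.array([np.corrcoef(x[:, j], theta_hat)[0, 1] for j in range(x.shape[1])])


rng = np.random.default_rng(2023)
x, _ = simulate_responses(n_persons=500, n_items=20, rng=rng)

r_total = item_total_correlation(x)
theta_hat = eap_theta(x)
r_theta = item_theta_correlation(x, theta_hat)

cutoff = 0.20  # the low cutoff mentioned in the abstract
flag_total = r_total < cutoff
flag_theta = r_theta < cutoff

print("Items flagged by item-total correlation:", np.flatnonzero(flag_total))
print("Items flagged by item-theta correlation:", np.flatnonzero(flag_theta))
```

Because the ability estimate here comes from a Rasch-type model, it is a monotone function of the total score, so the two indices will be close in this sketch. The flagged counts it prints should not be read as a reproduction of the study's findings.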

First Page

133

Last Page

145

Issue

2

Volume

27

Digital Object Identifier (DOI)

10.21831/pep.v27i2.61477

Recommended Citation

Marianti, Sukaesi; Rufaida, Ana; Hasanah, Nur; and Nuryanti, Sofia (2023) "Comparing item-total correlation and item-theta correlation in test item selection: A simulation and empirical study," Jurnal Penelitian dan Evaluasi Pendidikan: Vol. 27: Iss. 2, Article 1.
DOI: 10.21831/pep.v27i2.61477
Available at: https://scholarhub.uny.ac.id/jpep/vol27/iss2/1

References

Andrich, D., & Marais, I. (2019). A course in Rasch measurement theory: Measuring in the educational, social and health sciences. Springer. https://doi.org/10.1007/978-981-13-7496-8

Azwar, S. (1994). Seleksi item dalam penyusunan skala psikologi. Buletin Psikologi, 2(2), 26-33.

Bae, J., Lee, J. H., Choi, M., Jang, Y., Park, C. G., & Lee, Y. J. (2023). Development of the clinical reasoning competency scale for nurses. BMC Nursing, 22(1), 138. https://doi.org/10.1186/s12912-023-01244-6

Baker, F. B. (2001). The basics of item response theory (2nd ed.). ERIC Clearinghouse on Assessment and Evaluation. https://eric.ed.gov/?id=ED458219

Baker, F. B., & Kim, S.-H. (2017). The basics of item response theory using R. Springer. https://doi.org/10.1007/978-3-319-54205-8

Bandalos, D. L. (2018). Measurement theory and applications for the social sciences. Guilford Press. www.guilford.com/MSS

Bock, R. D., & Gibbons, R. D. (2021). Item response theory (1st ed.). John Wiley & Sons.

Bulut, O., & Sünbül, Ö. (2017). R programlama dili ile madde tepki kuramında Monte Carlo simülasyon çalışmaları. Eğitimde ve Psikolojide Ölçme ve Değerlendirme Dergisi, 8(3), 266-287. https://doi.org/10.21031/epod.305821

Crocker, L. M., & Algina, J. (2008). Introduction to classical and modern test theory. Cengage Learning.

De Ayala, R. J. (2022). The theory and practice of item response theory (2nd ed.). The Guilford Press.

Dichoso, A. A., Joy, R., & Cabauatan, M. (2020). Test item analyzer using point-biserial correlation and P-values. International Journal of Scientific & Technology Research, 9(4), 2122-2126. https://www.ijstr.org/final-print/apr2020/Test-Item-Analyzer-Using-Point-biserial-Correlation-And-P-values.pdf

Feinberg, R. A., & Rubright, J. D. (2016). Conducting simulation studies in psychometrics. Educational Measurement: Issues and Practice, 35(2), 36-49. https://doi.org/10.1111/EMIP.12111

Finch, W. H., & French, B. F. (2018). Educational and psychological measurement. Routledge. https://doi.org/10.4324/9781315650951

Guo, H., Lu, R., Johnson, M. S., & McCaffrey, D. F. (2022). Alternative methods for item parameter estimation: From CTT to IRT. ETS Research Report Series, 2022(1), 1-16. https://doi.org/10.1002/ets2.12355

Hu, Z., Lin, L., Wang, Y., & Li, J. (2021). The integration of classical testing theory and item response theory. Psychology, 12(9), 1397-1409. https://www.scirp.org/journal/paperinformation.aspx?paperid=111936

Jacobs, P., & Viechtbauer, W. (2017). Estimation of the biserial correlation and its sampling variance for use in meta-analysis. Research Synthesis Methods, 8(2), 161-180. https://doi.org/10.1002/JRSM.1218

Karakaya, N., & Kiliç, M. (2021). Turkish adaptation of the workplace breastfeeding support scale: A validity and reliability study. Samsun Sağlık Bilimleri Dergisi, 6(3), 721-736. https://doi.org/10.47115/jshs.1029188

Kesici, A., & Tunç, N. F. (2018). The development of the digital addiction scale for the university students: Reliability and validity study. Universal Journal of Educational Research, 6(1), 91-98. https://doi.org/10.13189/ujer.2018.060108

Luecht, R. M., & Hambleton, R. K. (2021). Item response theory: A historical perspective and brief introduction to applications. The history of educational measurement (1st ed.), pp. 232-262. Routledge. https://doi.org/10.4324/9780367815318-11

Macdonald, P., & Paunonen, S. V. (2002). A Monte Carlo comparison of item and person statistics based on item response theory versus classical test theory. Educational and Psychological Measurement, 62(6), 921-943. https://doi.org/10.1177/0013164402238082

Paek, I., & Cole, K. (2019). Using R for item response theory model applications. Routledge.

Robitzsch, A., Kiefer, T., & Wu, M. (2022). TAM: Test analysis modules. https://cran.r-project.org/web/packages/TAM/TAM.pdf

Sabbag, A. G. (2019). Handbook of educational measurement and psychometrics using R. The American Statistician, 73(4), 415-416. https://doi.org/10.1080/00031305.2019.1676110

Supratiknya, A. (2014). Pengukuran psikologis (1st ed.). Universitas Sanata Dharma.

Wu, M., Tam, H. P., & Jen, T.-H. (2016). Educational measurement for applied researchers: Theory into practice. Springer. https://doi.org/10.1007/978-981-10-3302-5

Xie, D., & Cobb, C. L. (2020). Item analysis. In B. J. Carducci & C. S. Nave (Eds.), The Wiley encyclopedia of personality and individual differences: Models and theories, pp. 159-163. https://doi.org/10.1002/9781118970843.CH97

Yilmaz, M. L., & Keskin, H. A. (2020). Is a universal model of a "good" national education system that brings economic returns emerging? Anadolu Üniversitesi Sosyal Bilimler Dergisi, 20, 61-72. https://doi.org/10.18037/ausbd.725563

Included in

Educational Assessment, Evaluation, and Research Commons
