Jurnal Inovasi Pendidikan IPA


Rasch model, R program, characteristics of chemistry items




Chemistry is one of the subjects taught in high school. Students' understanding of chemistry over one semester can be assessed with a test, and the tests used must be of good quality. This study aims to provide information about the characteristics of chemistry test items using the Rasch model. A descriptive exploratory design was used. The subjects were tenth-grade students at Xaverius Senior High School who took the final semester examination in chemistry. The objects of the research were the test items and the students' answer sheets. Data were collected through documentation, and the student answer sheets were analyzed using the R program. The results showed that the reliability of the test items ranged from 0.3 to 0.54, which falls in the medium category. A total of 28 items had a good level of difficulty. In addition, the average student ability was 0.008, with a minimum of -2.309 and a maximum of 2.233. The item characteristic curves (ICC) and item information curves (IIC) obtained are very accurate in predicting students' abilities. The chemistry items used in the final semester examination can be used by teachers as an item bank for evaluating students' abilities. However, the difficulty level of two items needs to be revised to produce good questions.
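As an illustration of the quantities named in the abstract, the following is a minimal sketch of the standard Rasch item characteristic function, P(θ) = 1 / (1 + exp(-(θ - b))), and the corresponding item information, I(θ) = P(θ)(1 - P(θ)). The study itself used the R program; Python is used here only for illustration. The function names and the item difficulty value are hypothetical, and the three ability values are the minimum, mean, and maximum reported above.

```python
import math


def rasch_probability(theta, b):
    """Probability of a correct response under the Rasch model
    for a student of ability theta on an item of difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def item_information(theta, b):
    """Item information at ability theta under the Rasch model: P * (1 - P)."""
    p = rasch_probability(theta, b)
    return p * (1.0 - p)


if __name__ == "__main__":
    # Hypothetical item difficulty (not taken from the study).
    difficulty = 0.5
    # Minimum, mean, and maximum abilities reported in the abstract.
    for theta in (-2.309, 0.008, 2.233):
        p = rasch_probability(theta, difficulty)
        info = item_information(theta, difficulty)
        print(f"theta={theta:+.3f}  P(correct)={p:.3f}  information={info:.3f}")
```

Evaluating `rasch_probability` over a range of abilities traces the ICC of one item, and evaluating `item_information` over the same range traces its IIC. Information peaks where ability equals item difficulty, which is why items that are much too easy or too hard for the group tell the teacher little.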





