REID (Research and Evaluation in Education)


discriminating power; Indonesian language test; item difficulty; school examinations; standardized test




This study aimed to determine the characteristics of the Indonesian language test used in the national standardized school examinations (Ujian Sekolah Berstandar Nasional, USBN). We used the responses of 218 students from a public senior high school in the Special Region of Yogyakarta, Indonesia, to a test consisting of two packages, A and B, administered in the 2018/2019 academic year. We analyzed the data quantitatively using classical test theory (CTT) and the one-parameter logistic item response theory (1-PL IRT) model to characterize the test and its items in terms of difficulty and discriminating power. Under CTT, most items in both package A and package B fell into the easy category, and no more than 10% of items fell into the difficult category. In addition, while the majority of items in both packages demonstrated good discriminating power, package B contained a large share of items with poor discriminating power (47.5%). Under 1-PL IRT, items of moderate difficulty dominated both packages. Beyond difficulty levels, the study showed that the items classified as difficult under CTT and IRT concerned the following topics: types of conjunction; spelling, grammar, and sentence structure; job application letters; and observation report text and its structure. The results are expected to help improve the quality of the Indonesian language test and the quality of instruction on the topics that the difficult items indicate students struggle with.
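For readers unfamiliar with the two item statistics named in the abstract, the following is a minimal sketch of how they are commonly computed. It is not the authors' procedure or software. It assumes dichotomously scored responses (1 = correct, 0 = incorrect), takes CTT difficulty as the proportion correct, and takes discriminating power as the corrected item-total (point-biserial) correlation. The function names and the sample data are hypothetical, and the study's cutoffs for the easy, moderate, and difficult categories are not reproduced because the abstract does not state them.

```python
import math


def item_difficulty(responses):
    """CTT difficulty index per item: proportion of examinees answering correctly."""
    n_students = len(responses)
    n_items = len(responses[0])
    return [sum(row[j] for row in responses) / n_students for j in range(n_items)]


def _pearson(x, y):
    """Pearson correlation; returns 0.0 when either variable has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def item_discrimination(responses):
    """Corrected item-total correlation: item score vs. total score excluding that item."""
    n_items = len(responses[0])
    result = []
    for j in range(n_items):
        item_scores = [row[j] for row in responses]
        rest_scores = [sum(row) - row[j] for row in responses]
        result.append(_pearson(item_scores, rest_scores))
    return result


def rasch_probability(theta, b):
    """1-PL IRT: probability of a correct response given ability theta and item difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


# Hypothetical data: 6 examinees (rows) by 3 items (columns), illustration only.
sample = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
    [1, 1, 0],
    [1, 0, 1],
]
print(item_difficulty(sample))
print(item_discrimination(sample))
print(rasch_probability(theta=0.5, b=0.0))
```

Note that the two frameworks read "difficulty" in opposite directions: a higher CTT proportion correct means an easier item, whereas a higher 1-PL parameter b means a harder one. That is one reason the same item can be categorized differently under CTT and IRT.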












Adams, R. J., & Khoo, S. T. (1996). ACER Quest: The interactive test analysis system (Version 2.1.) [Computer software]. Australian Council for Educational Research. https://research.acer.edu.au/measurement/3/

Allen, M. J., & Yen, W. M. (2002). Introduction to measurement theory. Waveland Press.

Anggraini, D., & Suyata, P. (2014). Karakteristik soal UASBN mata pelajaran bahasa Indonesia di Daerah Istimewa Yogyakarta pada tahun pelajaran 2008/2009 [Characteristics of UASBN test on Indonesian language subjects in the Special Region of Yogyakarta in the academic year of 2008/2009]. Jurnal Prima Edukasia, 2(1), 57-65. https://doi.org/10.21831/jpe.v2i1.2644

Arruarte, J., Larrañaga, M., Arruarte, A., & Elorriaga, J. A. (2021). Measuring the quality of test-based exercises based on the performance of students. International Journal of Artificial Intelligence in Education, 31, 585-602. https://doi.org/10.1007/s40593-020-00208-0

Bichi, A. A., & Embong, R. (2018). Evaluating the quality of Islamic civilization and Asian civilizations examination questions. Asian People Journal, 1(1), 93-109.

Costello, E., Holland, J., & Kirwan, C. (2018). The future of online testing and assessment: Question quality in MOOCs. International Journal of Educational Technology in Higher Education, 15(1), 1-14. https://doi.org/10.1186/s41239-018-0124-z

Ebel, R. L., & Frisbie, D. A. (1991). Essentials of educational measurement. Prentice-Hall.

Evers, A., Muñiz, J., Hagemeister, C., Høstmælingen, A., Lindley, P., Sjöberg, A., & Bartram, D. (2013). Assessing the quality of tests: Revision of the EFPA review model. Psicothema, 25(3), 283-291. https://doi.org/10.7334/psicothema2013.97

Gronlund, N. E., Miller, M. D., & Linn, R. L. (2009). Measurement and assessment in teaching (10th ed.). Pearson Education.

Hambleton, R. K., Swaminathan, H., & Rogers, D. J. (1991). Fundamentals of item response theory. Sage.

Huber, S. G., & Skedsmo, G. (2017). Standardization and assessment practices. Educational Assessment, Evaluation and Accountability, 29, 1-3. https://doi.org/10.1007/s11092-017-9257-1

Kosasih, E. (2017). Buku siswa bahasa Indonesia untuk kelas SMP/MTs kelas VIII [Indonesian language student book for junior high school/MTs class VIII]. Kementerian Pendidikan dan Kebudayaan.

Maharani, A. V., & Putro, N. H. P. S. (2020). Item analysis of English final semester test. Indonesian Journal of EFL and Linguistics, 5(2), 491-504. https://doi.org/10.21462/ijefl.v5i2.302

Mardapi, D. (1999). Estimasi kesalahan pengukuran dalam bidang pendidikan dan implikasinya pada ujian nasional [Estimation of measurement error in education and implications for national examinations]. Universitas Negeri Yogyakarta.

Moses, T. (2017). A review of developments and applications in item analysis. In R. E. Bennett & M. von Davier (Eds.), Advancing human assessment (pp. 19-46). Springer. https://doi.org/10.1007/978-3-319-58689-2_2

Mugianto, M., Ridhani, A., & Arifin, S. (2017). Pengembangan perencanan pembelajaran menulis teks laporan hasil observasi model pembelajaran berbasis proyek siswa kelas X SMA [Development of lesson plan for writing observation report text using project-based learning model for grade X high school students]. Ilmu Budaya: Jurnal Bahasa, Sastra, Seni dan Budaya, 1(4), 353-366. http://doi.org/10.30872/jbssb.v1i4.769

Muhson, A. (2017). AnBuso (Version 8.0) [Computer software].

National Education Standards Board. (2018). Prosedur operasional standar penyelenggaraan ujian sekolah berstandar nasional [Standard operating procedure of the administration of the national-standardized school examination]. https://bsnpindonesia.org/2018/12/bsnp-tetapkan-pos-usbn-dan-un-2019/

Nurgiyantoro, B. (2016). Penilaian pembelajaran bahasa berbasis kompetensi [Competency-based assessment on language learning]. BPFE Yogyakarta.

Osterlind, S. J. (1998). Constructing test items: Multiple-choice, constructed-response, performance, and other formats (2nd ed.). Kluwer Academic Publishers.

Pardede, T., Santoso, A., Retnawati, H., Rafi, I., Apino, E., & Rosyada, M. N. (2023). Gaining a deeper understanding of the meaning of the carelessness parameter in the 4PL IRT model and strategies for estimating it. REID (Research and Evaluation in Education), 9(1), 86-117. http://doi.org/10.21831/reid.v9i1.63230

Rafi, I., Retnawati, H., Apino, E., Hadiana, D., Lydiati, I., & Rosyada, M. N. (2023). What might be frequently overlooked is actually still beneficial: Learning from post national-standardized school examination. Pedagogical Research, 8(1), em0145. https://doi.org/10.29333/pr/12657

Retnawati, H. (2014). Teori respons butir dan penerapannya [Item response theory and its applications]. Nuha Medika.

Retnawati, H., Kartowagiran, B., Arlinwibowo, J., & Sulistyaningsih, E. (2017). Why are the mathematics national examination items difficult and what is teachers' strategy to overcome it? International Journal of Instruction, 10(3), 257-276. https://doi.org/10.12973/iji.2017.10317a

Retnawati, H., Kartowagiran, B., Hadi, S., & Hidayati, K. (2011). Identifikasi kesulitan peserta didik dalam belajar matematika dan sains di sekolah dasar [Identifying learners' difficulties in learning math and science in primary schools]. Jurnal Kependidikan, 41(2), 162-174. https://doi.org/10.21831/jk.v41i2.1930

Şahin, A., & Anıl, D. (2017). The effects of test length and sample size on item parameters in item response theory. Educational Sciences: Theory & Practice, 17(1), 321-335. http://doi.org/10.12738/estp.2017.1.0270

Suherli, S., Suryaman, M., Septiaji, A., & Istiqomah, I. (2017). Buku guru bahasa Indonesia untuk kelas SMA/MA/SMK/MAK kelas XI [Indonesian language book for teachers of SMA/MA/SMK/MAK class XI]. Kementerian Pendidikan dan Kebudayaan.

Suryaman, M., Suherli, S., & Istiqomah, I. (2018). Bahasa Indonesia untuk SMA/MA/SMK/MAK kelas XII [Indonesian for SMA/MA/SMK/MAK grade XII]. Kementerian Pendidikan dan Kebudayaan.

Suryani, Y. E. (2017). Pemetaan kualitas empirik soal ujian akhir semester pada mata pelajaran bahasa Indonesia SMA di Kabupaten Klaten [Empirical quality mapping of end-of-semester exam questions on Indonesian language subjects in senior high schools in Klaten district]. Jurnal Penelitian dan Evaluasi Pendidikan, 21(2), 142-152. https://doi.org/10.21831/pep.v21i2.10725