Keywords
Teori respon butir, statistika ekonomi, ujian akhir semester, bank soal, item response theory, economic statistic, final semester exam, item banks
Document Type
Article
Abstract
Penelitian ini bertujuan untuk mendeskripsikan kualitas butir soal ujian akhir semester mata kuliah statistika ekonomi yang dikembangkan oleh Universitas Terbuka (UT) sebagai dasar dalam mengembangkan bank soal yang terkalibrasi menggunakan pendekatan Teori Respons Butir. Penelitian ini merupakan penelitian deskriptif kuantitatif. Sumber data penelitian ini adalah pola jawaban mahasiswa UT yang telah mengikuti ujian akhir semester (UAS) mata kuliah statistika ekonomi selama enam masa ujian, dengan ukuran sampel sebanyak 23334 mahasiswa. Hasil penelitian ini menunjukkan bahwa butir-butir soal ujian akhir semester mata kuliah statistika ekonomi yang dikembangkan UT: (1) terbukti valid secara konstruk, yakni hanya mengukur satu faktor dominan, yaitu kemampuan statistika ekonomi; (2) memiliki kehandalan yang baik dengan nilai koefisien reliabilitas empiris lebih dari 0,70 (koefisien reliabilitas empiris = 0,7335); (3) dari 140 butir soal yang dikalibrasi terdapat 108 butir soal (25 butir soal berkualitas baik atau tanpa revisi dan 83 butir soal berkualitas kurang baik atau perlu revisi) yang layak disimpan dalam bank soal, sedangkan 32 butir soal berkualitas tidak baik; dan (4) mampu memberikan informasi akurat terkait kemampuan statistika ekonomi mahasiswa pada level kemampuan yang tinggi (-1,3 sampai +4,0).
Quality of statistical test bank items (Case study: Final exam instrument of statistics courses in Universitas Terbuka)
Abstract
This study aims to describe the quality of the final semester exam items for the economic statistics course developed by Universitas Terbuka (UT), as a basis for developing a calibrated item bank using Item Response Theory. The research uses a quantitative descriptive approach. The data are the answer patterns of UT students who took the final semester exam (UAS) in the economic statistics course over six exam periods, with a sample size of 23,334 students. The results indicate that the final semester exam items for the economic statistics course developed by UT: (1) are construct-valid, i.e., they measure only one dominant factor, namely economic statistics ability; (2) have good reliability, with an empirical reliability coefficient above 0.70 (empirical reliability coefficient = 0.7335); (3) of the 140 calibrated items, 108 are suitable for storage in the item bank (25 of good quality, needing no revision, and 83 of lesser quality, needing revision), while 32 are of poor quality; and (4) provide accurate information about students' economic statistics ability at high ability levels (-1.3 to +4.0).
Page Range
165-176
Issue
2
Volume
6
Digital Object Identifier (DOI)
10.21831/jrpm.v6i2.28900
Source
https://journal.uny.ac.id/index.php/jrpm/article/view/28900
Recommended Citation
Santoso, A., Kartianom, K., & Kassymova, G. K. (2019). Kualitas butir bank soal statistika (Studi kasus: Instrumen ujian akhir mata kuliah statistika Universitas Terbuka). Jurnal Riset Pendidikan Matematika, 6(2), 165-176. https://doi.org/10.21831/jrpm.v6i2.28900
References
Attali, Y., & Bar-Hillel, M. (2003). Guess where: The position of correct answers in multiple-choice test items as a psychometric variable. Journal of Educational Measurement, 40(2), 109-128. https://doi.org/10.1111/j.1745-3984.2003.tb01099.x
Barnard-Brak, L., Lan, W. Y., & Yang, Z. (2018). Differences in mathematics achievement according to opportunity to learn: A 4pL item response theory examination. Studies in Educational Evaluation, 56, 1-7. https://doi.org/10.1016/j.stueduc.2017.11.002
Crocker, L., & Algina, J. (2008). Introduction to classical and modern test theory. Cengage Learning.
Firmansyah, M. A. (2017). Analisis hambatan belajar mahasiswa pada mata kuliah statistika. Jurnal Penelitian Dan Pembelajaran Matematika, 10(2). https://doi.org/10.30870/jppm.v10i2.2036
Hambleton, R. K., Swaminathan, H., & Rogers, H. J. (1991). Fundamentals of item response theory. Sage.
Hulin, C. L., Drasgow, F., & Parsons, C. K. (1983). Item response theory: Application to psychological measurement. Dow Jones-Irwin.
Istiyono, E., Mardapi, D., & Suparno, S. (2014). Pengembangan tes kemampuan berpikir tingkat tinggi fisika (pysthots) peserta didik SMA. Jurnal Penelitian Dan Evaluasi Pendidikan, 18(1), 1-12. https://doi.org/10.21831/pep.v18i1.2120
Kartianom, K., & Mardapi, D. (2018). The utilization of junior high school mathematics national examination data: Conceptual error diagnosis. REiD (Research and Evaluation in Education), 3(2). https://doi.org/10.21831/reid.v3i2.18120
Kartianom, K., & Ndayizeye, O. (2017). What's wrong with the Asian and African Students' mathematics learning achievement? The multilevel PISA 2015 data analysis for Indonesia, Japan, and Algeria. Jurnal Riset Pendidikan Matematika, 4(2), 200-210. https://doi.org/10.21831/jrpm.v4i2.16931
Keeves, J. P., & Alagumalai, S. (1999). New approaches to measurement. Advances in Measurement in Educational Research and Assessment, 23-42.
Kien-Kheng, F., & Idris, N. (2010). A comparative study on statistics competency level using TIMSS data: Are we doing enough? Journal of Mathematics Education, 3(2), 126-138.
Mardapi, D. (2012). Pengukuran penilaian dan evaluasi pendidikan. Nuha Medika.
Mills, J. D., & Holloway, C. E. (2013). The development of statistical literacy skills in the eighth grade: Exploring the TIMSS data to evaluate student achievement and teacher characteristics in the United States. Educational Research and Evaluation, 19(4), 323-345. https://doi.org/10.1080/13803611.2013.771110
Muslim, M., Suhandi, A., & Nugraha, M. G. (2017). Development of reasoning test instruments based on TIMSS framework for measuring reasoning ability of senior high school student on the physics concept. Journal of Physics: Conference Series, 812(1), 012108. https://doi.org/10.1088/1742-6596/812/1/012108
Pey Tee, O., & Subramaniam, R. (2018). Comparative study of middle school students' attitudes towards science: Rasch analysis of entire TIMSS 2011 attitudinal data for England, Singapore and the U.S.A. as well as psychometric properties of attitudes scale. International Journal of Science Education, 40(3), 268-290. https://doi.org/10.1080/09500693.2017.1413717
Ramos, J. L. S., Dolipas, B. B., & Villamor, B. B. (2013). Higher order thinking skills and academic performance in physics of college students: A regression analysis. International Journal of Innovative Interdisciplinary Research, 4, 48-60.
Retnawati, H. (2013). Evaluasi program pendidikan. Universitas Terbuka.
Retnawati, H. (2016). Validitas reliabilitas dan karakteristik butir. Parama Publishing.
Retnawati, H. (2017). Diagnosing the junior high school students' difficulties in learning mathematics. International Journal on New Trends in Education and Their Implications, 8(1), 33-50. http://www.ijonte.org/FileUpload/ks63207/File/04.heri_retnawati.pdf
Retnawati, H., & Hadi, S. (2014). Sistem bank soal daerah terkalibrasi untuk menyongsong era desentralisasi. Jurnal Ilmu Pendidikan, 20(2), 183-193. https://doi.org/10.17977/jip.v20i2.4615
Rindermann, H., & Baumeister, A. E. E. (2015). Validating the interpretations of PISA and TIMSS tasks: A rating study. International Journal of Testing, 15(1), 276-296. https://doi.org/10.1080/15305058.2014.966911
Rogers, H. J. (1999). Guessing in multiple choice tests. In Advances in measurement in educational research and assessment (pp. 235-243). Pergamon Press, New York.
Wibawa, S. (2017). Tri Dharma Perguruan Tinggi (Pendidikan dan pengabdian kepada masyarakat). In Disampaikan dalam Rapat Perencanaan Pengawasan Proses Bisnis Perguruan Tinggi Negeri. Yogyakarta (Vol. 29).
Wu, M., Tam, H. P., & Jen, T.-H. (2016). Educational measurement for applied researchers. Springer Singapore. https://doi.org/10.1007/978-981-10-3302-5