Keywords
comparison of DIF detection methods; differential item functioning; unidimensional IRT
Document Type
Article
Abstract
This research aims to detect Differential Item Functioning (DIF) in the 2014/2015 National Examination mathematics questions for junior high schools and equivalent-level schools, with the Yogyakarta region as the reference group and the South Kalimantan region as the focal group, using three methods: the Likelihood Ratio Test (LRT), Raju's area measure, and Lord's method. A sensitivity analysis was conducted to determine the most sensitive method. The data consisted of 5,465 National Examination papers from students in the two regions who worked on type A questions. A sample of 1,000 exam papers per region was drawn by simple random sampling (SRS) to avoid sample-size effects. The results showed significant DIF in 36 items under the LRT method, in 32 items under Raju's area measure, and in all items under Lord's method. Lord's method is therefore the most sensitive of the three, because it flagged the most items as showing DIF.
Page Range
99-113
Issue
1
Volume
10
Digital Object Identifier (DOI)
10.21831/reid.v10i1.73270
Source
https://journal.uny.ac.id/index.php/reid/article/view/73270
Recommended Citation
Setiawan, A., Kassymova, G. K., Mbazumutima, V., & Agustyani, A. D. (2024). Differential Item Functioning of the region-based national examination equipment. REID (Research and Evaluation in Education), 10(1). https://doi.org/10.21831/reid.v10i1.73270
References
Akour, M., Sabah, S., & Hammouri, H. (2015). Net and global differential item functioning in PISA polytomously scored science items. Journal of Psychoeducational Assessment, 33(2), 166–176. https://doi.org/10.1177/0734282914541337
Alfarizi. (2019). Meningkatkan mutu pendidikan di Indonesia melalui MESUPPEN “Maksimalkan pendekatan supervisi pendidikan.” Tugas Kuliah Administrasi dan Supervisi Pendidikan Jurusan Matematika Universitas Negeri Padang, 1–5. http://dx.doi.org/10.31227/osf.io/tmyz7
Azis, A. (2015). Conceptions and practices of assessment: A case of teachers representing improvement conception. TEFLIN Journal - A Publication on the Teaching and Learning of English, 26(2), 129–154. https://doi.org/10.15639/teflinjournal.v26i2/129-154
Başman, M. (2023). A comparison of the efficacies of differential item functioning detection methods. International Journal of Assessment Tools in Education, 10(1), 145–159. https://doi.org/10.21449/ijate.1135368
Berrío, Á. I., Herrera, A. N., & Gómez-Benito, J. (2019). Effect of sample size ratio and model misfit when using the difficulty parameter differences procedure to detect DIF. The Journal of Experimental Education, 87(3), 367–383. https://doi.org/10.1080/00220973.2018.1435502
Çelik, M., & Özkan, Y. Ö. (2020). Analysis of differential item functioning of PISA 2015 Mathematics subtest subject to gender and statistical regions. Eğitimde ve Psikolojide Ölçme ve Değerlendirme Dergisi, 11(3), 283–301. https://doi.org/10.21031/epod.715020
Center for Educational Assessment. (2020). Laporan hasil ujian nasional - Capaian nasional. Pusat Penilaian Pendidikan, Kementerian Pendidikan dan Kebudayaan. https://hasilun.pusmenjar.kemdikbud.go.id/#2019!smp!capaian_nasional!99&99&999!T&T&T&T&1&!1!&
Cho, S., Suh, Y., & Lee, W. (2016). An NCME instructional module on latent DIF analysis using mixture item response models. Educational Measurement: Issues and Practice, 35(1), 48–61. https://doi.org/10.1111/emip.12093
Delgado, A. R., Burin, D. I., & Prieto, G. (2018). Testing the generalized validity of the emotion knowledge test scores. PLOS ONE, 13(11), e0207335. https://doi.org/10.1371/journal.pone.0207335
Desjardins, C. D., & Bulut, O. (2017). Handbook of educational measurement and psychometrics using R. Chapman and Hall/CRC. https://doi.org/10.1201/b20498
Effendi, E. (2011). Detecting crossing differential item functioning (CDIF): Based on item response theory. Jurnal Evaluasi Pendidikan, 2(2), 147–158. https://dx.doi.org/10.21009/JEP.022.03
Effiom, A. P. (2021). Test fairness and assessment of differential item functioning of mathematics achievement test for senior secondary students in Cross River state, Nigeria using item response theory. Global Journal of Educational Research, 20(1), 55–62. https://doi.org/10.4314/gjedr.v20i1.6
French, B. F., Finch, W. H., & Immekus, J. C. (2019). Multilevel Generalized Mantel-Haenszel for differential item functioning detection. Frontiers in Education, 4, 47. https://doi.org/10.3389/feduc.2019.00047
Gaberson, K. B. (1997). Measurement reliability and validity. AORN Journal, 66(6), 1092–1094. https://doi.org/10.1016/S0001-2092(06)62551-9
Galli, S., Chiesi, F., & Primi, C. (2011). Measuring mathematical ability needed for “non mathematical” majors: The construction of a scale applying IRT and differential item functioning across educational contexts. Learning and Individual Differences, 21(4), 392–402. https://doi.org/10.1016/j.lindif.2011.04.005
Hadi, S., Basukiyatno, B., & Susongko, P. (2021). Differential item functioning national examination on device test mathematics high school in Central Java. Proceedings of the 1st International Conference on Social Science, Humanities, Education and Society Development, ICONS 2020, 30 November, Tegal, Indonesia. https://doi.org/10.4108/eai.30-11-2020.2303726
Hadi, S., Puspita, F., Ati, A. P., & Widiyarto, S. (2020). Penyuluhan dan pembelajaran karakter melalui pelaksanaan Idul Adha pada siswa SMA. Jurnal Pemberdayaan: Publikasi Hasil Pengabdian Kepada Masyarakat, 4(2), 205–210. https://doi.org/10.12928/jp.v4i2.1833
Hambleton, R. K., Swaminathan, H., & Rogers, H. J. (1991). Fundamentals of item response theory. Sage Publications, Inc.
Hidajad, A. (2019). Pendidikan Indonesia: Ramai di dapur, sepi di panggung (Sebuah tinjauan perkembangan). GETER: Jurnal Seni Drama, Tari dan Musik, 2(2), 1–11. https://doi.org/10.26740/geter.v2n2.p1-11
Huang, X., Wilson, M., & Wang, L. (2016). Exploring plausible causes of differential item functioning in the PISA science assessment: Language, curriculum or culture. Educational Psychology, 36(2), 378–390. https://doi.org/10.1080/01443410.2014.946890
Ihsan, H. (2016). Validitas isi alat ukur penelitian konsep dan panduan penilaiannya. PEDAGOGIA Jurnal Ilmu Pendidikan, 13(2), 266–273. https://doi.org/10.17509/pedagogia.v13i2.3557
James, G., James, R. C., & Davis, P. J. (1959). Mathematics dictionary. Physics Today, 12(10), 50–52. https://doi.org/10.1063/1.3060526
Jusmirad, M., Angraeini, D., Faturrahman, M., Syukur, M., & Arifin, I. (2023). Implementasi literasi dan numerasi pada program MBKM dan dampaknya terhadap siswa SMP Datuk Ribandang. Jurnal Pendidikan Indonesia, 4(03), 303–310. https://doi.org/10.59141/japendi.v4i03.1687
Kane, M. T. (2013). Validating the interpretations and uses of test scores. Journal of Educational Measurement, 50(1), 1–73. https://doi.org/10.1111/jedm.12000
Langer, M. M. (2008). A reexamination of Lord’s Wald test for differential item functioning using item response theory and modern error estimation. Dissertation, The University of North Carolina. https://doi.org/10.17615/chn0-dz45
Leiner, J. E. M., Scherndl, T., & Ortner, T. M. (2018). How do men and women perceive a high-stakes test situation? Frontiers in Psychology, 9. https://doi.org/10.3389/fpsyg.2018.02216
Ozdemir, B., & Alshamrani, A. H. (2020). Examining the fairness of language test across gender with IRT-based differential item and test functioning methods. International Journal of Learning, Teaching and Educational Research, 19(6), 27–45. https://doi.org/10.26803/ijlter.19.6.2
Patricia, D. C., & Araújo, L. (2012). Differential item functioning (DIF): What functions differently for immigrant students in PISA 2009 reading items? JRC Publications Repository. European Union. https://doi.org/10.2788/60811
Raju, N. S. (1990). Determining the significance of estimated signed and unsigned areas between two item response functions. Applied Psychological Measurement, 14(2), 197–207. https://doi.org/10.1177/014662169001400208
Retnawati, H. (2013). Pendeteksian keberfungsian butir pembeda dengan indeks volume sederhana berdasarkan teori respons butir multidimensi. Jurnal Penelitian dan Evaluasi Pendidikan, 17(2), 275–286. https://doi.org/10.21831/pep.v17i2.1700
Scott, N. W., Fayers, P. M., Aaronson, N. K., Bottomley, A., de Graeff, A., Groenvold, M., Gundy, C., Koller, M., Petersen, M. A., & Sprangers, M. A. G. (2009). A simulation study provided sample size guidance for differential item functioning (DIF) studies using short scales. Journal of Clinical Epidemiology, 62(3), 288–295. https://doi.org/10.1016/j.jclinepi.2008.06.003
Siegrist, M., Connor, M., & Keller, C. (2012). Trust, confidence, procedural fairness, outcome fairness, moral conviction, and the acceptance of GM field experiments. Risk Analysis, 32(8), 1394–1403. https://doi.org/10.1111/j.1539-6924.2011.01739.x
Sinha, R., van den Heuvel, W. A., & Arokiasamy, P. (2013). Validity and reliability of MOS short form health survey (SF-36) for use in India. Indian Journal of Community Medicine, 38(1), 22–26. https://doi.org/10.4103/0970-0218.106623
Sitepu, V. V., & Rahmawati, F. (2022). Analisis pusat pertumbuhan dan sektor ekonomi dalam mengurangi ketimpangan pendapatan. AKUNTABEL: Jurnal Akuntansi dan Keuangan, 19(1), 1–12. https://download.garuda.kemdikbud.go.id/article.php?article=3275677&val=11261&title=Analisis%20pusat%20pertumbuhan%20dan%20sektor%20ekonomi%20dalam%20mengurangi%20ketimpangan%20pendapatan
Soysal, S., & Koğar, E. Y. (2021). An investigation of item position effects by means of IRT-based differential item functioning methods. International Journal of Assessment Tools in Education, 8(2), 239–256. https://doi.org/10.21449/ijate.779963
Sudaryono, S. (2017). Sensitivity of differential item functioning (DIF) detection method. Jurnal Evaluasi Pendidikan, 3(1), 82–94. https://doi.org/10.21009/JEP.031.07
Thissen, D., Steinberg, L., & Gerrard, M. (1986). Beyond group-mean differences: The concept of item bias. Psychological Bulletin, 99(1), 118–128. https://doi.org/10.1037/0033-2909.99.1.118
Thissen, D., Steinberg, L., & Wainer, H. (1988). Use of item response theory in the study of group differences in trace lines. In Test validity (pp. 147–172). Lawrence Erlbaum Associates, Inc. https://doi.org/10.1037/14047-004
Turang, D. A. O. (2017). Pendekatan model ontologi untuk pencarian lembaga pendidikan (Studi kasus lembaga pendidikan provinsi Daerah Istimewa Yogyakarta). Jurnal Ilmiah Teknologi Infomasi Terapan, 3(3), 175–182. https://doi.org/10.33197/jitter.vol3.iss3.2017.134
Uğurlu, S., & Atar, B. (2020). Performances of MIMIC and logistic regression procedures in detecting DIF. Eğitimde ve Psikolojide Ölçme ve Değerlendirme Dergisi, 11(1), 1–12. https://doi.org/10.21031/epod.531509
Ukanda, F., Othuon, L., Agak, J., & Oleche, P. (2019). Effectiveness of Mantel-Haenszel and logistic regression statistics in detecting differential item functioning under different conditions of sample size, ability distribution and test length. American Journal of Educational Research, 7(11), 878–887. https://www.sciepub.com/EDUCATION/abstract/11217
Whynes, D. K., Sprigg, N., Selby, J., Berge, E., & Bath, P. M. (2013). Testing for differential item functioning within the EQ-5D. Medical Decision Making, 33(2), 252–260. https://doi.org/10.1177/0272989X12465016
Yamin, M., & Syahrir, S. (2020). Pembangunan pendidikan merdeka belajar (Telaah metode pembelajaran). Jurnal Ilmiah Mandala Education, 6(1), 126–136. https://doi.org/10.36312/jime.v6i1.1121
Yildirim, O. (2019). Detecting gender differences in PISA 2012 mathematics test with differential item functioning. International Education Studies, 12(8), 59–71. https://doi.org/10.5539/ies.v12n8p59
Zampetakis, L. A., Bakatsaki, M., Litos, C., Kafetsios, K. G., & Moustakis, V. (2017). Gender-based differential item functioning in the application of the theory of planned behavior for the study of entrepreneurial intentions. Frontiers in Psychology, 8, 451. https://doi.org/10.3389/fpsyg.2017.00451
Zukmadini, A. Y., Karyadi, B., & Rochman, S. (2021). Peningkatan kompetensi guru melalui workshop model integrasi terpadu literasi sains dan pendidikan karakter dalam pembelajaran IPA. Publikasi Pendidikan, 11(2), 107–116. https://doi.org/10.26858/publikan.v11i2.18378