Jurnal Penelitian dan Evaluasi Pendidikan

Keywords

Accuracy, Calibration, Fixed Parameter, Algorithm, Expectation-Maximization

Document Type

Article

Abstract

This study aims to (1) identify the characteristics of the test items in the 2009/2010 junior high school (SMP) Mathematics national examination, calibrated with fixed parameter calibration methods, and (2) determine the most accurate fixed parameter calibration method among NWU-OEM (no prior weights updating and one expectation-maximization cycle), NWU-MEM (no prior weights updating and multiple expectation-maximization cycles), OWU-OEM (one prior weights updating and one expectation-maximization cycle), OWU-MEM (one prior weights updating and multiple expectation-maximization cycles), and MWU-MEM (multiple weights updating and multiple expectation-maximization cycles). The study uses a descriptive quantitative approach. The research subjects are the response data from the 2009/2010 junior high school Mathematics national examination in the province of DI Yogyakarta. The accuracy criteria are the value of the test information function and the measurement error. The results are as follows. (1) The item parameter statistics for the 2009/2010 junior high school Mathematics national examination show a mean item discrimination index in the interval [1.07, 1.14], a mean item difficulty index in the interval [-0.35, -0.20], and a mean pseudo-guessing parameter below 0.25. The theta (ability) values at which the item information functions reach their maximum show that the function curves of the five fixed parameter calibration methods nearly coincide. (2) OWU-OEM is the most accurate method for estimating item parameters on the 2009/2010 Mathematics national examination.

Keywords: accuracy, calibration, fixed parameter, algorithm, Expectation-Maximization

THE ACCURACY OF THE FIXED PARAMETER CALIBRATION METHOD: STUDY OF MATHEMATICS NATIONAL EXAMINATION TEST

Abstract

This study aimed to (1) identify the characteristics of the test items on the mathematics test of the national examination, calibrated with fixed parameter calibration methods, and (2) reveal the most accurate fixed parameter calibration method among NWU-OEM (no prior weights updating and one expectation-maximization cycle), NWU-MEM (no prior weights updating and multiple expectation-maximization cycles), OWU-OEM (one prior weights updating and one expectation-maximization cycle), OWU-MEM (one prior weights updating and multiple expectation-maximization cycles), and MWU-MEM (multiple weights updating and multiple expectation-maximization cycles). The study used a descriptive quantitative approach. The subjects are the testees' responses to the junior high school mathematics national examination in 2009/2010. The accuracy criteria are the test information function (TIF) and the standard error of measurement (SEM). The research results are as follows. (1) The item parameter statistics for the 2009/2010 mathematics national examination show a mean item discrimination in the interval [1.07, 1.14], a mean item difficulty in the interval [-0.35, -0.20], and a mean pseudo-guessing parameter c < 0.25. The theta (ability) values at which the item information functions reach their maximum show that the curves of the five fixed parameter calibration methods almost coincide. (2) The OWU-OEM method is the most accurate in estimating the item parameters of the 2009/2010 mathematics national examination.

Keywords: Accuracy, Calibration, Fixed Parameter, Algorithm, Expectation-Maximization

First Page

188

Last Page

201

Issue

2

Volume

18

Digital Object Identifier (DOI)

10.21831/pep.v18i2.2860
