Jurnal Penelitian dan Evaluasi Pendidikan

Keywords

ability estimation; maximum likelihood method; Bayes method

Document Type

Article

Abstract

This study aims to compare the accuracy of latent trait (ability) estimation in the logistic model under the joint maximum likelihood (ML) method and the Bayes method. The study uses a Monte Carlo simulation, with data modeled on the junior high school (SMP) national examination in mathematics. The simulation variables are test length and number of examinees. Data were generated with SAS/IML over 40 replications, and each data set was estimated with both ML and Bayes. The estimates were then compared with the true abilities by computing the mean squared error (MSE) and the correlation between the true latent abilities and the estimates. The method with the smaller MSE is considered the better estimation method. The results show that, for tests of 15, 20, 25, and 30 items with 500 and 1,000 examinees, the MSE results are not yet stable, but with 1,500 examinees the ML and Bayes methods yield nearly the same estimation accuracy. For tests of 15 and 20 items with 500, 1,000, and 1,500 examinees, the MSE results are not yet stable, whereas for tests of 25 and 30 items the ML method gives more accurate results at all three sample sizes.

Keywords: ability estimation, maximum likelihood method, Bayes method

THE COMPARISON OF ESTIMATION OF LATENT TRAITS USING MAXIMUM LIKELIHOOD AND BAYES METHODS

Abstract

This study aimed to compare the accuracy of latent ability (latent trait) estimation in the logistic model using the maximum likelihood (ML) and Bayes methods. The study used a quantitative approach, namely a Monte Carlo simulation with students' responses to the national examination as the data model. The simulation variables were test length and number of participants. The data were generated using SAS/IML with 40 replications, and each data set was then estimated by both ML and Bayes. The estimates were compared with the true abilities by calculating the mean squared error (MSE) and the correlation between the true abilities and the estimates. The method with the smaller MSE is considered the better one. The study shows that, for tests of 15, 20, 25, and 30 items with 500 and 1,000 participants, the results were not yet stable, but with 1,500 participants the ML and Bayes methods reached similar estimation accuracy. With 15 and 20 items and 500, 1,000, or 1,500 participants, the results were not yet stable, while with 25 and 30 items the ML method gave more accurate results at all three sample sizes.

Keywords: ability estimation, maximum likelihood method, Bayes method
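The comparison described in the abstract can be illustrated with a minimal sketch of a single replication. This is not the authors' SAS/IML code and it rests on several assumptions: a two-parameter logistic model (the abstract says only "logistic model"), item parameters treated as known rather than estimated jointly, a grid search for the ML estimate, and an expected a posteriori (EAP) estimate with a standard normal prior standing in for the Bayes method. All function names and parameter ranges are hypothetical.

```python
import math
import random

# Ability grid used by both estimators (assumption: abilities lie in [-4, 4]).
GRID = [-4.0 + 0.1 * k for k in range(81)]


def p_correct(theta, a, b):
    """Probability of a correct response under an assumed 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def simulate_responses(thetas, items, rng):
    """Draw one 0/1 response per examinee-item pair."""
    return [[1 if rng.random() < p_correct(t, a, b) else 0 for a, b in items]
            for t in thetas]


def log_likelihood(theta, responses, items):
    """Log-likelihood of one response pattern at a given ability."""
    total = 0.0
    for u, (a, b) in zip(responses, items):
        p = p_correct(theta, a, b)
        total += math.log(p) if u == 1 else math.log(1.0 - p)
    return total


def estimate_ml(responses, items):
    """Grid-search ML estimate; perfect or zero scores land on a grid bound."""
    return max(GRID, key=lambda t: log_likelihood(t, responses, items))


def estimate_eap(responses, items):
    """Bayes (EAP) estimate: posterior mean under a standard normal prior."""
    log_post = [log_likelihood(t, responses, items) - 0.5 * t * t for t in GRID]
    peak = max(log_post)  # subtract the maximum for numerical stability
    weights = [math.exp(lp - peak) for lp in log_post]
    return sum(t * w for t, w in zip(GRID, weights)) / sum(weights)


def mse(true, est):
    """Mean squared error between true and estimated abilities."""
    return sum((t - e) ** 2 for t, e in zip(true, est)) / len(true)


def correlation(x, y):
    """Pearson correlation between two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def run_replication(n_persons, n_items, seed):
    """Generate one data set and score both estimators against the truth."""
    rng = random.Random(seed)
    thetas = [rng.gauss(0.0, 1.0) for _ in range(n_persons)]
    # Hypothetical item parameters: discrimination a, difficulty b.
    items = [(rng.uniform(0.8, 2.0), rng.uniform(-2.0, 2.0))
             for _ in range(n_items)]
    data = simulate_responses(thetas, items, rng)
    ml = [estimate_ml(r, items) for r in data]
    eap = [estimate_eap(r, items) for r in data]
    return {
        "mse_ml": mse(thetas, ml),
        "mse_bayes": mse(thetas, eap),
        "r_ml": correlation(thetas, ml),
        "r_bayes": correlation(thetas, eap),
    }


result = run_replication(n_persons=200, n_items=20, seed=1)
print(result)
```

To approximate the study design, one would call `run_replication` with 40 different seeds for each combination of test length (15, 20, 25, 30) and sample size (500, 1,000, 1,500) and average the MSE values. Because this sketch fixes the item parameters, its results need not match those reported in the article.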

First Page

145

Last Page

155

Issue

2

Volume

19

Digital Object Identifier (DOI)

10.21831/pep.v19i2.5575
