Jurnal Penelitian dan Evaluasi Pendidikan






Abstract

The primary purpose of this study was to investigate the robustness of the one-, two-, and three-parameter logistic models (1-PLM, 2-PLM, and 3-PLM) against violations of the local item independence (LII) assumption, based on cut-off scores and score categories. The investigation used simulated data generated with 40 items, 500 simulees, and 10 replications for each model. The cut-off scores were built on violations of the LII assumption ranging from 0% to 100%, introduced using 1 to 40 item clusters. The score categories were built on the impact of those violations on the structure of the data matrix. The results showed that the most robust model was the 1-PLM, with a cut-off score of 31.71% (heavy violation category), followed by the 2-PLM, with a cut-off score of 12.1% (moderate violation category), and the 3-PLM, with a cut-off score of 7.68% (moderate violation category).

Keywords: item response models, robustness, local item independence
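The simulation design summarized in the abstract (40 items, 500 simulees, three logistic models) can be illustrated with a minimal Python sketch. The abstract does not state the parameter distributions or the exact mechanism used to introduce dependence within item clusters, so the function names `simulate_responses` and `add_dependence`, the distributions, and the copy-the-first-item rule below are all hypothetical choices for illustration, not the study's procedure.

```python
import numpy as np

def simulate_responses(n_persons=500, n_items=40, model="3PL", seed=0):
    """Simulate a 0/1 response matrix under the 1-, 2-, or 3-parameter logistic model."""
    rng = np.random.default_rng(seed)
    theta = rng.standard_normal(n_persons)   # abilities (assumed standard normal)
    b = rng.standard_normal(n_items)         # difficulties (assumed standard normal)
    # Discrimination varies only in the 2PL and 3PL; guessing only in the 3PL.
    a = rng.uniform(0.5, 2.0, n_items) if model in ("2PL", "3PL") else np.ones(n_items)
    c = rng.uniform(0.0, 0.25, n_items) if model == "3PL" else np.zeros(n_items)
    # Probability of a correct response: c + (1 - c) / (1 + exp(-a * (theta - b)))
    p = c + (1 - c) / (1 + np.exp(-a * (theta[:, None] - b)))
    return (rng.random((n_persons, n_items)) < p).astype(int)

def add_dependence(responses, cluster_size, n_clusters):
    """Hypothetical LII violation: each item in a cluster copies the cluster's first item."""
    out = responses.copy()
    for k in range(n_clusters):
        start = k * cluster_size
        out[:, start + 1 : start + cluster_size] = out[:, [start]]
    return out

data = simulate_responses(model="1PL")
dep = add_dependence(data, cluster_size=4, n_clusters=2)
```

Here `dep` differs from `data` only in the first eight columns, where items 2 to 4 and 6 to 8 duplicate the first item of their cluster. Repeating this over several seeds and comparing the fit of each model on `data` and `dep` would mimic the replication structure, again under the assumptions stated above.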



