REID (Research and Evaluation in Education)


assessment instrument, higher order thinking, JHS mathematics

This research and development study aims to produce an instrument for assessing junior high school (JHS) students' higher order thinking skills (HOTS) in mathematics. The procedure consists of nine steps: (1) constructing the test specification; (2) writing test items; (3) analyzing test items; (4) conducting the first tryout; (5) analyzing the results of the first tryout; (6) revising the test; (7) assembling the test; (8) conducting the second tryout; and (9) analyzing the results of the second tryout. Content validity was established through a focus group discussion (FGD) forum and the Delphi technique. Construct validity was examined through analysis of the tryout data. The instrument was tried out twice, with 264 participants in the first tryout and 821 participants in the second. The results indicate that the instrument for assessing JHS students' HOTS in mathematics meets the validity and reliability criteria. The content validity analysis shows that the instrument is valid, supported by item validity indices above 0.79. The construct validity analysis shows that the instrument is valid, as indicated by χ² = 67.69 with p-value = 0.10 and Root Mean Square Error of Approximation (RMSEA) = 0.03, supported by a Goodness of Fit Index (GFI) of 0.97, a Normed Fit Index (NFI) of 0.95, and an Adjusted Goodness of Fit Index (AGFI) of 0.95. The instrument reliability is 0.88. The developed instrument consists of 12 items, all of the essay type, with difficulty indices in the range 0.30 ≤ Pi ≤ 0.7.
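As a rough illustration of the item statistics reported above, the sketch below computes a difficulty index for essay items and a reliability coefficient on made-up data. It rests on assumptions not stated in the abstract: the difficulty index of a polytomous item is taken as the mean score divided by the maximum score, and reliability is estimated with Cronbach's alpha, which may not be the coefficient the authors used. The function names and the score matrix are hypothetical.

```python
from statistics import mean, pvariance


def difficulty_index(item_scores, max_score):
    """Difficulty of an essay item, assumed here to be mean score / maximum score."""
    return mean(item_scores) / max_score


def in_moderate_range(p, low=0.30, high=0.70):
    """Check whether a difficulty index lies in the range reported in the abstract."""
    return low <= p <= high


def cronbach_alpha(score_matrix):
    """Cronbach's alpha (an assumed choice of reliability coefficient).

    score_matrix holds one row per student and one column per item.
    """
    k = len(score_matrix[0])
    item_vars = [pvariance([row[j] for row in score_matrix]) for j in range(k)]
    total_var = pvariance([sum(row) for row in score_matrix])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical data: 5 students, 3 essay items, each scored from 0 to 4.
scores = [
    [4, 3, 2],
    [3, 3, 1],
    [2, 1, 1],
    [1, 2, 0],
    [3, 4, 3],
]
MAX_SCORE = 4

p_values = [
    difficulty_index([row[j] for row in scores], MAX_SCORE)
    for j in range(len(scores[0]))
]
alpha = cronbach_alpha(scores)

for j, p in enumerate(p_values, start=1):
    print(f"item {j}: P = {p:.2f}, moderate = {in_moderate_range(p)}")
print(f"alpha = {alpha:.3f}")
```

With real tryout data, the same functions would be applied to the full student-by-item score matrix for the 12 items; an item whose index falls outside the 0.30 to 0.70 range would be a candidate for revision.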
