Evaluating the Reliability Index of an Entrance Exam Using Item Response Theory
Jeffrey Imer Salim
Discipline: Education
Abstract:
Measurement is a crucial tool in the conduct of daily human activities. In education, measurement
instruments such as achievement tests are used to assess the psychological capabilities of students,
so it is important that sound test constructs are used if results are to be reliable. This study
evaluated the reliability index of the Mindanao State University – Tawi-Tawi College of Technology
and Oceanography Senior High School Entrance Examination, which is given annually to prospective
students. The examination covers four subjects: English, Science, Mathematics, and Aptitude, with
75, 30, 40, and 25 questions, respectively. The study employed a descriptive quantitative research
design. Stratified sampling was applied to the full set of scored answer sheets to obtain a sample
of 200 examinees. The reliability index was computed using the Statistical Package for the Social
Sciences (SPSS). The study concluded that the test constructs of the examination were highly
reliable, with an overall reliability index of 0.968 at the 0.1 level of significance. The
per-subject reliability indices were 0.965 for Language (English), 0.983 for Science, 0.967 for
Mathematics, and 0.974 for Aptitude. The study recommends that the examination be further evaluated
and updated.
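The abstract does not name the specific reliability coefficient that SPSS reported; the sketch below assumes a Cronbach's alpha-type internal-consistency index computed from dichotomously scored (0/1) answer sheets. The simulated 200-examinee by 75-item response matrix and the Rasch-style generating model are illustrative assumptions only, not the study's actual data or procedure.

```python
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """Cronbach's alpha for an (examinees x items) matrix of 0/1 item scores."""
    scores = np.asarray(scores, dtype=float)
    n_items = scores.shape[1]
    item_variances = scores.var(axis=0, ddof=1)        # variance of each item
    total_variance = scores.sum(axis=1).var(ddof=1)    # variance of examinee total scores
    return (n_items / (n_items - 1)) * (1 - item_variances.sum() / total_variance)

# Illustrative data: 200 examinees answering 75 items (e.g., the English subtest),
# generated from a simple one-parameter (Rasch-type) response model.
rng = np.random.default_rng(0)
ability = rng.normal(size=(200, 1))                    # latent ability per examinee
difficulty = rng.normal(size=(1, 75))                  # difficulty per item
prob_correct = 1 / (1 + np.exp(-(ability - difficulty)))
responses = rng.random((200, 75)) < prob_correct       # dichotomous 0/1 responses

print(round(cronbach_alpha(responses), 3))             # internal-consistency estimate
```

With real answer-sheet data, the same function would be applied to the scored item matrix of each subtest (English, Science, Mathematics, Aptitude) and to the full examination to obtain the per-subject and overall indices.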