Evaluating the Reliability Index of an Entrance Exam Using Item Response Theory
Jeffrey Imer Salim
Discipline: Education
Abstract:
Measurement is a crucial tool in the conduct of daily human activities. In education, measurement
instruments such as achievement tests are used to assess the psychological capabilities of students,
so it is important that sound test constructs are used if results are to be reliable. This study
evaluated the reliability index of the Mindanao State University – Tawi-Tawi College of Technology
and Oceanography Senior High School Entrance Examination, which is given annually to prospective
students. The examination covers four subjects: English, Science, Mathematics, and Aptitude, with
75, 30, 40, and 25 questions, respectively. The study employed a descriptive quantitative research
design. Stratified sampling was applied to the full set of scored answer sheets to obtain a sample
of 200 examinees. The reliability index was computed using the Statistical Package for the Social
Sciences (SPSS). The study concluded that the test constructs of the examination were highly
reliable, with an overall reliability index of 0.968 at the 0.1 level of significance. The
per-subject reliability indices were 0.965 for Language (English), 0.983 for Science, 0.967 for
Mathematics, and 0.974 for Aptitude. The study recommends that the examination be further evaluated
and updated.
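The abstract does not name the specific reliability coefficient that SPSS reported; the sketch below assumes a Cronbach's alpha-type internal-consistency index computed from dichotomously scored (0/1) answer sheets. The simulated 200-examinee by 75-item response matrix and the Rasch-style generating model are illustrative assumptions only, not the study's actual data or procedure.

```python
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """Cronbach's alpha for an (examinees x items) matrix of 0/1 item scores."""
    scores = np.asarray(scores, dtype=float)
    n_items = scores.shape[1]
    item_variances = scores.var(axis=0, ddof=1)        # variance of each item
    total_variance = scores.sum(axis=1).var(ddof=1)    # variance of examinee total scores
    return (n_items / (n_items - 1)) * (1 - item_variances.sum() / total_variance)

# Illustrative data: 200 examinees answering 75 items (e.g., the English subtest),
# generated from a simple one-parameter (Rasch-type) response model.
rng = np.random.default_rng(0)
ability = rng.normal(size=(200, 1))                    # latent ability per examinee
difficulty = rng.normal(size=(1, 75))                  # difficulty per item
prob_correct = 1 / (1 + np.exp(-(ability - difficulty)))
responses = rng.random((200, 75)) < prob_correct       # dichotomous 0/1 responses

print(round(cronbach_alpha(responses), 3))             # internal-consistency estimate
```

With real answer-sheet data, the same function would be applied to the scored item matrix of each subtest (English, Science, Mathematics, Aptitude) and to the full examination to obtain the per-subject and overall indices.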