Examining the reliability and validity of biology and Persian literature tests in the final exams of the secondary education system

Number of pages: 195 File Format: word File Code: 29897
Year: 2014 University Degree: Master's degree Category: Educational Sciences
  • Summary of Examining the reliability and validity of biology and Persian literature tests in the final exams of the secondary education system

    Dissertation for a master's degree in assessment and measurement

    Research abstract

    The aim of this research was to investigate the validity and reliability of the biology and Persian literature tests in the final exams of the secondary education system. The population comprised all items of the third-year high school final exams in the experimental sciences track held in June 2013. To examine the psychometric properties of these items, the performance of 600 students from district one of Khorramabad city in the two subjects was analyzed. In this mixed-methods study, validity was assessed by evaluating the exams in terms of structure, content, distribution of items across cognitive levels, and coverage of textbook content, using the opinions of subject-matter experts collected through researcher-made questionnaires. Reliability was assessed using measurement designs from generalizability theory (GT), and the items were analyzed with classical test theory (CTT) and item response theory (IRT) models. Descriptive statistics and the independent-samples t-test were used to examine students' pass status and to compare their performance by gender.

    In Persian literature, almost 95% of the items had no structural flaws, which indicates a very favorable situation. The most common structural flaw in biology was a mismatch between an item's point value and its difficulty and importance. Based on the permissive criterion (0.6), 26.31% of the items in Persian literature and 67.19% in biology were judged essential. Almost 60% of the items in both courses belonged to the knowledge level. Both student scores and items showed high reliability, but the absolute generalizability coefficient for scorers was below 0.70, which indicates unsatisfactory scoring in both courses. The relative generalizability coefficients for scorers showed that scorers' grading practices differed in Persian literature and were similar in biology. Cronbach's alpha gave estimated reliability coefficients of 0.97 for biology and 0.96 for Persian literature. According to the CTT and IRT models, both tests had favorable psychometric properties. Girls' mean scores were higher than boys' in both subjects. Empirical evidence indicated that the second scoring played a neutral role in final exams.

    Keywords: final exams, validity, reliability, classical model, IRT models, generalizability theory.

    Abstract

    The purpose of the present research was to assess the validity and reliability of the biology and literature tests in the final exams of the high school education system. The studied population was all items of the third-year high school final exams in the science field, held in June 1390. To assess the psychometric properties of these items, the performance of 600 students in these two courses in district one of Khorramabad city was analyzed. In this mixed-methods research, validity was assessed by examining the exams in terms of structure, content, distribution of items across cognitive levels, and coverage of book content, using the views of subject-matter experts gathered through researcher-made questionnaires. The reliability of the exams was assessed using measurement designs in GT. The exam items were also analyzed with IRT and classical models. Descriptive measures and the independent t-test were used to assess the pass status of the subjects and to compare their performance by gender.

    In the literature course, approximately 95% of items had no structural flaws, which represents an entirely satisfactory condition. Most structural flaws were in the biology course and involved item scores that did not match the difficulty and importance of the item. According to the permissive criterion (0.6), 26.31% of items in literature and 67.19% in biology were identified as essential. Approximately 60% of items in both courses belonged to the knowledge level. Both students' scores and items had high reliability. On the other hand, the absolute G coefficient for scorers was less than 0.70, which demonstrates unsatisfactory scoring in both courses. The relative G coefficients for scorers show that scoring methods differed in literature and were similar in biology. The reliability coefficients estimated by Cronbach's alpha were 0.97 for biology and 0.96 for literature. According to the CTT and IRT models, the courses had desirable psychometric properties. The mean score of girls was higher than that of boys in both courses. Empirical evidence indicates a neutral role for the second scorer in final exams.
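The reliability indices named in the abstract can be illustrated with a small sketch. This is not the thesis's analysis: it assumes a fully crossed persons-by-scorers design with one score per cell, estimates variance components from two-way ANOVA mean squares, and uses hypothetical function names (`cronbach_alpha`, `g_coefficients`) and made-up data. It shows why an absolute generalizability coefficient can fall below a relative one when scorers differ in severity.

```python
from statistics import mean, variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a persons x items table (list of equal-length rows)."""
    k = len(scores[0])
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def g_coefficients(scores, n_scorers=None):
    """Relative and absolute G coefficients for a crossed persons x scorers table.

    scores[p][r] is the score that scorer r gave to person p (one score per cell).
    n_scorers is the number of scorers assumed in the decision study; it
    defaults to the number of scorers observed (an assumption of this sketch).
    """
    n_p, n_r = len(scores), len(scores[0])
    n_scorers = n_scorers or n_r
    grand = mean([x for row in scores for x in row])
    p_means = [mean(row) for row in scores]
    r_means = [mean([row[r] for row in scores]) for r in range(n_r)]

    # Mean squares from a two-way ANOVA without replication.
    ms_p = n_r * sum((m - grand) ** 2 for m in p_means) / (n_p - 1)
    ms_r = n_p * sum((m - grand) ** 2 for m in r_means) / (n_r - 1)
    ms_res = sum(
        (scores[p][r] - p_means[p] - r_means[r] + grand) ** 2
        for p in range(n_p)
        for r in range(n_r)
    ) / ((n_p - 1) * (n_r - 1))

    # Variance components; negative estimates are set to zero by convention.
    var_res = ms_res
    var_p = max((ms_p - ms_res) / n_r, 0.0)
    var_r = max((ms_r - ms_res) / n_p, 0.0)

    # Relative error ignores scorer severity; absolute error includes it.
    relative = var_p / (var_p + var_res / n_scorers)
    absolute = var_p / (var_p + (var_r + var_res) / n_scorers)
    return relative, absolute


# Hypothetical data: scorer 2 is always 2 points more lenient than scorer 1.
example = [[10, 12], [14, 16], [18, 20]]
relative, absolute = g_coefficients(example)
print(round(relative, 2), round(absolute, 2))  # 1.0 0.94
```

In this made-up example the two scorers rank the students identically, so the relative coefficient is 1.0, while the constant 2-point difference in severity lowers the absolute coefficient to about 0.94.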

  • Contents of Examining the reliability and validity of biology and Persian literature tests in the final exams of the secondary education system

    Table of Contents

    Chapter One: Research Overview

    Introduction

    Statement of the problem

    Research objectives

    General objective

    Specific objectives

    Research questions

    Importance and necessity of the research

    Definitions of terms

    Chapter Two: Research Literature

    Introduction

    The process of evaluating academic achievement

    Types of questions

    Types of exams used at the level of the Ministry of Education

    How final exam questions are designed

    How final exams are scored

    Theoretical foundations

    Classical test theory (CTT)

    Assumptions of classical test theory

    Limitations of classical test theory

    Generalizability theory (GT)

    Concepts and terms in GT

    Types of studies

    Considerations for G and D studies

    Universe of admissible observations and G studies

    Universe of generalization and D studies

    Random and mixed models with infinite and finite universes of generalization

    Generalizability designs

    Types of decisions and error variances

    Types of coefficients

    Item response theory (IRT)

    Assumptions of item response theory

    Unidimensionality

    Local independence

    Basic concepts in item response theory

    Item characteristic curve (ICC)

    Item difficulty parameter

    Item discrimination parameter

    Item guessing parameter

    Test parameter

    Parameter invariance property

    Models in item response theory

    Logistic item response models for dichotomous data

    One-parameter model

    Two-parameter model

    Three-parameter model

    Generalizability theory versus classical test theory

    Classical test theory and generalizability theory versus item response theory

    The concept of reliability

    Statistical definitions of reliability

    Methods for estimating reliability

    Methods for estimating the reliability of criterion-referenced tests

    Factors affecting test reliability

    Standard error of measurement

    The concept of reliability in IRT

    Definition and concept of validity

    History of validity

    Types of validity

    The relationship between validity and reliability

    Content

    Content analysis

    Research conducted inside and outside Iran

    Chapter Three: Research Method

    Introduction

    Research method

    Statistical population

    Sample and sampling method

    Data collection method

    Research procedure

    Data analysis method

    Chapter Four: Data Analysis

    Statistical analysis

    Chapter Five: Discussion and Conclusion

    Introduction

    Discussion and conclusion

    Ancillary findings on the scoring of final exam papers

    Research limitations

    Suggestions for future research

    References

    Appendices

     

