'item response theory' Search Results
Measurement Invariance of the Student Personal Perception of Classroom Climate Scale (SPPCC) in the Turkish Context
gender invariance personal perception elementary education classroom climate...
Among school psycho-social factors with considerable effect on student outcomes are both school and classroom climate. Because how students perceive the classroom climate strongly predicts achievement, measuring classroom climate gains importance and the need for testing the existing results across cultures persists. In this study, we assessed the validity and measurement invariance of the Turkish adaptation of the Student Personal Perception of Classroom Climate Scale (SPPCC) developed in English (US). Confirmatory factor analyses (CFA) and measurement invariance (MI) analyses by sex were performed on 629 students’ data. CFA results confirmed the factorial structure of the SPPCC. Results of the MI analyses showed that the SPPCC measures the same construct for females and males in a non-English context. Latent mean comparisons revealed girls perceived the classroom climate more positively than boys. We concluded that this study in the Turkish context is a further step in developing evidence of the extent to which the SPPCC provides psychometrically sound scores.
The Development of an Instrument to Measure the Higher Order Thinking Skill in Physics
higher order thinking skill physics instrument...
This study developed a diagnostic test to measure the higher-order thinking skills (HOTS) of first-grade senior high school students in Bima district, West Nusa Tenggara. The instrument was developed using a modified Oreondo model comprising two activities: test design and test trials. Content validity was analysed with the Aiken formula, classical test theory statistics with Iteman 4.3, the Rasch model with Winsteps, and reliability with SPSS. The study concludes that the developed instrument has the characteristics of a useful instrument and meets the requirements for measurement. This is supported by the analysis results, which confirm that the instrument achieved content validity through expert judgment and obtained empirical support under both classical test theory and the Rasch model.
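The Aiken formula named in this abstract can be illustrated with a minimal sketch. It assumes the standard definition of Aiken's V, V = Σ(r − lo) / (n(c − 1)), for n expert ratings on a scale with c categories. The function name `aiken_v`, the 1 to 5 rating scale, and the example ratings are illustrative assumptions, not values from the study.

```python
def aiken_v(ratings, lo=1, hi=5):
    """Aiken's V for one item, given expert ratings on a lo..hi scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    n = len(ratings)
    # Sum of each rating's distance above the lowest possible category.
    s = sum(r - lo for r in ratings)
    # Divide by the maximum possible sum, n * (c - 1), where c - 1 = hi - lo.
    return s / (n * (hi - lo))


# Hypothetical ratings from four experts on a 1-5 relevance scale.
example_v = aiken_v([4, 5, 4, 5])
print(f"Aiken's V = {example_v:.3f}")
```

Values close to 1 indicate that the experts rated the item near the top of the scale; the cut-off used to call an item valid depends on the number of raters and categories.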
A Comparison of Score Equating Conducted Using Haebara and Stocking Lord Method for Polytomous
equating polytomous graded data...
The purposes of this research are: 1) to compare two test equating procedures conducted with the Haebara and Stocking-Lord methods; 2) to describe the characteristics of each equating method using the Windows program IRTEQ. This research employs a participatory approach, as the data are collected through questionnaires based on the National Examination Administration of 2018. The samples are classified into group A and group B, with 449 and 502 respondents respectively. This paper discusses how to equate shared items using the anchor method with a set of instruments consisting of 35 questionnaire items and 6 shared items. In addition, the researcher uses PARSCALE to estimate each respondent’s ability and each item’s characteristics. The shared items are then equated using the IRTEQ program. The results show a significant difference between the Haebara method (0.592), which produces the bigger mean-sigma value, and the Stocking & Lord method (0.00213). Thus, the results show that the shared test items may improve respondents’ discrimination and increase the difficulty level (parameter b). Because shared items are available, it is appropriate to equate two different tests across different ability (theta) levels.
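As a point of reference for the linking step, the following is a minimal sketch of mean-sigma linking, the simple moment-based approach whose name appears in this abstract. It is not the Haebara or Stocking-Lord procedure and not the IRTEQ implementation. The function names and the anchor-item difficulty values are hypothetical.

```python
from statistics import mean, pstdev


def mean_sigma_constants(b_source, b_target):
    """Return slope A and intercept B so that A * b_source + B is on the target scale."""
    if len(b_source) != len(b_target) or len(b_source) < 2:
        raise ValueError("need matching lists of at least two anchor items")
    # The slope matches the spread of the anchor difficulties on the two forms.
    a_const = pstdev(b_target) / pstdev(b_source)
    # The intercept matches their means after rescaling.
    b_const = mean(b_target) - a_const * mean(b_source)
    return a_const, b_const


def rescale_difficulty(b, a_const, b_const):
    """Place a source-form difficulty on the target-form scale."""
    return a_const * b + b_const


# Hypothetical difficulties of the same anchor items estimated in two groups.
A, B = mean_sigma_constants([-1.0, 0.0, 1.0], [-1.5, 0.5, 2.5])
print(A, B)
```

Characteristic-curve methods such as Haebara and Stocking-Lord instead choose A and B by minimising differences between item or test response curves, which generally requires numerical optimisation.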
Implementation of the Omega (ω) Index to Detect Large-Scale Systematic Cheating
answer-copying indices item response theory pirls cheating detection standardized testing test integrity...
Cheating detection is an important issue in standardized testing, especially in large-scale settings. Statistical approaches are often computationally intensive and require specialised software to conduct. We present a two-stage approach that quickly filters suspected groups using statistical testing on an IRT-based answer-copying index. We also present an approach to mitigate data contamination and improve the performance of the index. The computation of the index was implemented through a modified version of an open source R package, thus enabling wider access to the method. Using data from PIRLS 2011 (N=64,232) we conduct a simulation to demonstrate our approach. Type I error was well-controlled and no control group was falsely flagged for cheating, while 16 (combined n=12,569) of the 18 (combined n=14,149) simulated groups were detected. Implications for system-level cheating detection and further improvements of the approach were discussed.
Developing of Computerized Adaptive Testing to Measure Physics Higher Order Thinking Skills of Senior High School Students and its Feasibility of Use
computerized adaptive testing hots partial credit model item response theory...
Computers are now used widely in education, including in teaching and learning, testing, and evaluation. This research aimed to develop a computerized adaptive test (CAT) to measure physics higher-order thinking skills (HOTS), namely PhysTHOTS-CAT. The development used the 4-D model developed by Thiagarajan, with its four phases of define, design, development, and dissemination. The testing instrument selects test items based on the student’s ability. The research phases include (1) needs analysis and definition, (2) development design, (3) development of the CAT and assembly of the test items into the CAT, (4) validation by experts, and (5) feasibility try-out. The findings show that PhysTHOTS-CAT is valid for measuring the physics HOTS of 10th-grade senior high school students, according to 82.28% of teachers’ and students’ assessment of PhysTHOTS-CAT content and media. Therefore, it can be concluded that PhysTHOTS-CAT is feasible for measuring the physics HOTS of 10th-grade senior high school students.
The Effect of Emotional Intelligence, Self-Efficacy, Subjective Well-Being and Resilience on Student Teachers’ Perceived Practicum Stress: A Malaysian Case Study
emotional intelligence self-efficacy subjective well-being resilience perceived practicum stress...
Stress is inevitable in the world of teaching and practicum training, and student teachers therefore naturally incur a certain level of stress due to the demands on them to use various knowledge and skills in real school and classroom environments. Hence, practicum stress needs to be addressed accordingly. The central focus of this study is the use of partial least squares structural equation modeling to explore the inter-relationships among the student teachers’ personal resources to mitigate practicum stress. A sample of 200 student teachers, selected by purposive sampling from teacher education institutions in Sabah, Malaysia, was used in this study. Data were collected via survey methods using a questionnaire developed from several existing scales. Findings showed that emotional intelligence, self-efficacy, and subjective well-being were able to explain resilience with good predictive accuracy and relevance, but explained practicum stress poorly. These findings suggest the need to include additional constructs to better explain perceived practicum stress in future exploratory research.
The Development of Computerized Economics Item Banking for Classroom and School-Based Assessment
item banking cbt assessment economics...
Advances in information and technology have changed conventional test methods. The weaknesses of the paper-based test can be minimized using the computer-based test (CBT). The development of a CBT requires a computerized item bank. This study aimed to develop a computerized item bank for classroom and school-based assessments. A research and development method was used, consisting of four phases: planning, item development, system development, and field testing. Data were collected through documentation, expert judgment, and field testing, and were analyzed using descriptive statistics and item response theory. The sample consisted of teachers and high school students in West Sumatera province selected using purposive random sampling techniques. The results of the study are as follows. 1) The computerized item bank has excellent quality based on expert validation. 2) The 120 items entered into the item bank system have moderate difficulty and a good discrimination index based on item response theory. 3) Field testing indicated that the computerized economics item bank is highly usable, useful for teachers, and feasible for classroom and school-based assessment.
Implementation of Four-Tier Multiple-Choice Instruments Based on the Partial Credit Model in Evaluating Students’ Learning Progress
learning progress four-tier change of state of matter partial-credit model...
One of the issues that hinder students’ learning progress is the inability to construct an epistemological explanation of a scientific phenomenon. A four-tier multiple-choice (hereinafter, 4TMC) instrument and the Partial-Credit Model were employed to elaborate on the diagnosis of this problem. This study aimed to develop and implement the four-tier multiple-choice instrument with the Partial-Credit Model to evaluate students’ learning progress in explaining the conceptual change of state of matter. The research followed a development design based on the test development model by Wilson. The data were obtained through development and validation techniques on 20 4TMC items administered to 427 students. On each item, the study applied diagnostic-summative assessment and the certainty response index. The students’ conceptual understanding level was categorized based on the combination of their answer choices; the measurement generated Partial-Credit Model data of the 1-parameter logistic (1PL) type. Differences by student class level were analysed using one-way Analysis of Variance (ANOVA). This study resulted in 20 valid and reliable 4TMC items. The results revealed that the integration of the 4TMC test and the Partial-Credit Model was effective as an instrument for measuring students’ learning progress. The one-way ANOVA indicated differences in students’ competence by academic level. In addition, low-ability students showed slow progress due to a lack of knowledge as well as misconceptions in explaining the concept of change of state of matter. Overall, the research found that diagnostic information is necessary for teachers in developing learning strategies and evaluating science learning.
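To make the Partial-Credit Model concrete, here is a minimal sketch of its category probabilities under the standard formulation, in which the probability of scoring in category k depends on the cumulative sum of (θ − δ_j) over the first k step difficulties. The function name `pcm_probs` and the step values are illustrative assumptions and are not taken from the study.

```python
import math


def pcm_probs(theta, deltas):
    """Category probabilities for one item under the Partial-Credit Model.

    theta  : person ability
    deltas : step difficulties delta_1..delta_m (categories run 0..m)
    """
    # Cumulative logits: category 0 has an empty sum, so it starts at 0.
    cum = [0.0]
    for d in deltas:
        cum.append(cum[-1] + (theta - d))
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(cum)
    weights = [math.exp(c - top) for c in cum]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical item with three steps (four score categories).
print(pcm_probs(0.5, [-1.0, 0.0, 1.2]))
```

With a single step difficulty the function reduces to the dichotomous Rasch model, which is why the Partial-Credit Model is usually grouped with the 1PL family.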
A Rasch Model Analysis of the Psychometric Properties of the Student-Teacher Relationship Scale among Middle School Students
student-teacher relationship psychometric properties rasch analysis adolescents middle and high school...
The current study investigated Student-Teacher Relationship Measure (STRM) psychometric properties using Rasch analysis in a sample of middle school female students (N = 995). Rasch Principal Components Analysis revealed psychometric support of two subscales (i.e., Academic and Social Relations). Summary statistics showed good psychometric properties. The category structure and individual statistics (i.e., items and person infit and outfit) were not ideal. Category structure showed that the distances between adjacent thresholds were lower than optimal criteria. Even though findings indicated that items mean square statistics (MNSQ) were optimal, standardized fit statistics (i.e., ZSTD) reflected many misfit persons and items in each subscale. After eliminating the misfit persons and items, the two subscales met the Rasch optimal criteria. The updated short 22-item scale had good psychometric properties, high item and person separation, and good item and person reliability for the two subscales and can be used as a reliable and valid scale.
Construct Exploration of Teacher Readiness as an Assessor of Vocational High School Competency Test
competency test construct exploration readiness instruments vocational high school...
Teachers who can adapt and are ready for change can also help increase the competence of vocational high school students. This also holds when teachers become assessors in student competency tests. The objectives of this study were to produce an instrument measuring the readiness of teachers as assessors, to determine the item reliability, to determine the characteristics of the instrument, and to determine the difficulty level of the items. The method used in this research is instrument development. Respondents were vocational school teachers who were candidates for competency test assessors. Data were collected using a questionnaire. Construct validity was analysed using Confirmatory Factor Analysis, reliability using Cronbach’s alpha, and the instrument items using the Rasch model. The results show that the readiness instrument for vocational teachers as assessors has 19 indicators grouped into 5 factors, with consistency values falling within the same construct (proven construct validity). The reliability of the instrument is 0.852, which means that the reliability coefficient is high. Two items, numbers 24 and 18, do not meet the overall item fit criteria. Items 8 and 6 have a difficulty score of more than 2, which indicates that they have a high difficulty level.
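For readers unfamiliar with the reliability coefficient reported here, the sketch below computes Cronbach's alpha from its textbook definition, α = k/(k − 1) · (1 − Σ item variances / variance of total scores). The function name `cronbach_alpha` and the small response matrix are hypothetical and unrelated to the study's data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of the respondents' total scores.
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: three respondents, two items.
print(cronbach_alpha([[1, 2], [2, 1], [3, 3]]))
```

The function assumes complete data and a non-zero total-score variance; real questionnaires with missing responses would need additional handling.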
Measuring Self-Regulated Learning in the STEM Framework: A Confirmatory Factor Analysis
confirmatory factor analysis reliability self-regulated learning stem validity...
Within the context of Self-Regulated Learning (SRL), a process of directing oneself to facilitate individual learning more effectively, the SRL instrument development is deemed necessary to measure students’ self-reliance in learning mathematics in the science, technology, engineering, and mathematics (STEM) framework. The research aims to develop and test the validity and reliability of an SRL instrument, namely a 14-item SRL questionnaire accommodating four aspects, namely planning, self-monitoring, self-controlling, and evaluation. The study involved 420 junior high school students in East Java, Central Java, and Yogyakarta Special Region. The results show that the questionnaire was developed as planned and that planning, monitoring, controlling, and evaluating aspects can reflect the SRL variable in a valid, reliable, and significant way supported by each aspect's behavior indicator. The SRL variable theoretical model corresponds (good fit) with the empirical data, and all of the items are likely valid and reliable to assess student's mathematics SRL in the STEM framework. The SRL questionnaire was also found suitable for use by teachers to measure junior high school students’ self-reliance in SRL.
The Development of Historical Thinking Assessment to Examine Students’ Skills in Analyzing the Causality of Historical Events
causality historical events historical thinking skills...
This research aimed to develop a historical thinking assessment for students' skills in analyzing the causality of historical events. The development process of Gall and colleagues and Rasch analysis models were used to develop an assessment instrument through two processes: the analysis of the framework of cause and consequence, and validity, reliability, and difficulty testing. This research involved 150 senior high school students, with data collected using the validation sheet, tests, and scoring rubric. The result was an essay test consisting of six indicators of analyzing cause and consequence. The instruments were valid, reliable, and suitable for assessing students’ skills in analyzing the causality of historical events. The developed instruments were paired with a historical thinking skills assessment to improve the accuracy of the information about students' level of historical thinking skills in history learning.
Developing Assessment Instrument Using Polytomous Response in Mathematics
assessment instrument classical and modern theory vocational school polytomous responses...
This developmental research aims to develop a good mathematical test instrument using polytomous responses based on classical and modern theories. The research design uses the Plomp model, which consists of five stages: (1) preliminary investigation, (2) design, (3) realization/construction, (4) revision, and (5) implementation (testing). The study was conducted in three vocational schools in Lampung Province, Indonesia. The study involved 413 students, consisting of 191 male and 222 female students. The data were collected through a questionnaire and a test. The questionnaire was used to identify the assessment instruments currently employed by teachers and to be validated by the experts of mathematics and educational evaluation. The test was an open polytomous response test of 40 items. The data were analyzed using both classical and modern theories. The results show that (1) the open polytomous response test falls in the good category according to classical and modern theory, although the discrimination power of the test items under classical theory needs several revisions; and (2) the assessment instrument using the polytomous response of open multiple choice can guarantee information on the actual competence of students. This is shown by the agreement between the analysis results obtained from classical and modern theory and the students' arguments when giving reasons for their choices. Therefore, the open polytomous response test can be used as an alternative for learning assessment.
Teaching and Student Evaluation Tasks: Cross-Cultural Adaptation, Psychometric Properties and Measurement Invariance of Work Tasks Motivation Scale for Teachers
student and teacher evaluation work task motivation scale wellbeing in higher education cross-cultural adaptation...
The present research aimed to test an Amharic version of the multi-dimensional Work Task Motivation Scale for Teachers (WTMST), which measures the five pillars of university instructors’ motivation toward teaching and student evaluation tasks based on self-determination theory (SDT). The WTMST offers the first instrument to measure all five motivational elements, and today it is one of the most applicable instruments to assess teachers’ motivation. An Amharic version of the WTMST for teaching and student evaluation tasks was adopted and assessed in large-scale data (N=1,117). Our findings demonstrate excellent reliability and construct validity (convergent, discriminant, divergent and factorial). In addition, the model comparisons showed that, of the four theoretically competing models (single-order factor, correlated factor, higher-order factor and bi-factor models), the bi-factor model fitted best and was used for testing measurement invariance across various groups. Results also suggest that the factor structure of the WTMST for both teaching and student evaluation tasks is consistent across gender (men, women), university types (research, applied, and general university), age and experience in teaching. Therefore, the WTMST for teaching and student evaluation tasks may be valid in Ethiopian higher education settings.
The Development of a Four-Tier Diagnostic Test Based on Modern Test Theory in Physics Education
developing test four-tiers diagnostic test modern test theory...
Diagnostic tests are generally two or three-tier and based on classical test theory. In this research, the Four-Tier Diagnostic Test (FTDT) was developed based on modern test theory to determine understanding of physics levels: scientific conception (SC), lack of knowledge (LK), misconception (MSC), false negatives (FN), and false positives (FP). The goals of the FTDT are to (a) find FTDT constructs, (b) test the quality of the FTDT, and (c) describe students' conceptual understanding of physics. The development process was conducted in the planning, testing, and measurement phases. The FTDT consists of four-layer multiple-choice with 100 items tested on 700 high school students in Yogyakarta. According to the partial credit models (PCM), the student's responses are in the form of eight categories of polytomous data. The results of the study show that (a) FTDT is built on the aspects of translation, interpretation, extrapolation, and explanation, with each aspect consisting of 25 items with five anchor items; (b) FTDT is valid with an Aiken's V value in the range of 0.85-0.94, and the items fit PCM with Infit Mean Square (INFIT MNSQ) of 0.77-1.30, item difficulty index of 0.12-0.38, and the reliability coefficient of Cronbach's alpha FTDT is 0.9; (c) the percentage of conceptual understanding of physics from large to small is LK type 2 (LK2), FP, LK type 1 (LK1), FN, LK type 3 (LK3), SC, LK type 4 (LK4), and MSC. The percentage sequence of MSC based on the successive material is momentum, Newton's law, particle dynamics, harmonic motion, work, and energy. In addition, failure to understand the concept sequentially is due to Newton's law, particle dynamics, work and energy, momentum, and harmonic motion.
Development and Validation of a Concept Inventory for Interpreting Kinematics Graphs in the Tanzanian Context
concept inventory kinematics graphs physics teachers tanzania context...
This paper discusses the development and validation of a concept inventory for interpreting kinematics graphs in the Tanzanian context. The study involved 61 participants comprising physics pre-service teachers, secondary school teachers, diploma college tutors, and a university lecturer from Tanzania. We developed 25 multiple-choice questions for interpreting kinematics graphs. The different steps in the development process used are selecting the topic, setting objectives, constructing questions, validating questions, and reliability testing. We carried out descriptive and inferential statistical analysis by using Statistical Package for Social Science (SPSS) version 22 followed by item analysis for pre-and post-piloting. Findings revealed normal distribution scores with a mean and standard deviation of 39.28±10.893 for pre-piloting and 40.16±8.08 for post-piloting. It also revealed no significant difference between pre-and post-piloting results with a p-value of 0.414. In addition, correlation coefficients for test re-test reliability were .783 and .878 for single and average measures respectively. Moreover, item analysis in terms of difficulty index, discrimination index, and distractor efficiency agreed with the published standards. Based on these findings, the study recommends the use of developed and validated kinematics graphs concept inventory by physics educators in both research and classroom instructions in the Tanzanian context.
Study Item Parameters of Classical and Modern Theory of Differential Aptitude Test: Is it Comparable?
classical test theory differential aptitude test item parameter modern test theory...
This study aimed to find the Classical Test Theory (CTT) and Modern Test Theory (MTT) item parameters of the Differential Aptitude Test (DAT) and to examine their comparability. The item parameters studied are difficulty level and discrimination index. A total of 5,024 DAT sub-test result records, documented by the Department of Psychology and Guidance and Counselling bureau, were used. Item parameters under classical and modern test theory were estimated and correlated to examine their comparability. The results show that there is a significant correlation between item parameter estimates. The Rasch and IRT 1-PL models have the highest correlation with CTT for the item difficulty level. In contrast, the 2-PL model has the highest correlation with CTT for the item discrimination index. Overall, the study concluded that CTT and MTT were comparable in estimating item parameters of DAT and thus could be used independently or complementarily in developing DAT.
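The two families of item parameters compared in this abstract can be sketched briefly. The code assumes common textbook definitions: CTT difficulty as the proportion correct, CTT discrimination as the correlation between an item and the rest score, and the 2-PL item response function with discrimination a and difficulty b. The function names and the response matrix are hypothetical, and the study may have used different indices.

```python
import math


def item_difficulty(responses, item):
    """CTT difficulty: proportion of examinees answering the item correctly."""
    return sum(row[item] for row in responses) / len(responses)


def item_discrimination(responses, item):
    """CTT discrimination: Pearson correlation of the item with the rest score."""
    x = [row[item] for row in responses]
    # The rest score excludes the item itself to avoid inflating the correlation.
    y = [sum(row) - row[item] for row in responses]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def p_correct_2pl(theta, a, b):
    """2-PL probability of a correct response; fixing a = 1 gives the Rasch form."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical scored responses: four examinees, three dichotomous items.
data = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(item_difficulty(data, 1), item_discrimination(data, 1))
```

Note that CTT difficulty rises as an item gets easier, while the IRT b parameter rises as an item gets harder, so the two are expected to correlate negatively.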
Course Dropout Intention Scale: Development and Validation of a New Brief Measure in Academic College Context
brief measure college student course dropout dropout intention dropout studies...
University students may encounter situations where they perform poorly in a course and contemplate dropping out. This intention to drop out of a course manifests not only in thoughts or ideas but also in a cognitive self-evaluation of their performance and skills, enabling them to reflect on the possibility of dropping out. In this sense, there is a shortage of instruments that evaluate the intention to drop out of a course, so the aim was to develop and validate the Course Dropout Intention Scale (CDIS). Data from two samples (N1 = 198; N2 = 675) were used; the first was for the EFA, and the second was for the CFA, GRM, and SEM. The one-factor model was derived from the EFA and confirmed in the second sample, exhibiting appropriate goodness-of-fit indices. Similarly, the GRM obtained adequate fit indices; all items discriminated adequately, and the difficulty parameter had a monotonic increase. The SEM model of the effect of satisfaction with studies on the CDIS showed a negative and statistically significant effect. Thus, it was demonstrated that the CDIS is a robust instrument in its psychometric properties and empirical evidence with other variables.
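The graded response model (GRM) used in this study can be illustrated with a minimal sketch of Samejima's formulation, in which the probability of responding in category k or higher is a logistic function of a(θ − b_k) and category probabilities are differences of adjacent cumulative probabilities. The function name `grm_probs` and the parameter values are illustrative assumptions, not estimates from the CDIS.

```python
import math


def grm_probs(theta, a, thresholds):
    """Category probabilities for one item under the graded response model.

    theta      : person trait level
    a          : item discrimination (assumed positive)
    thresholds : ordered category thresholds b_1 < b_2 < ... < b_m
    """
    if any(lo >= hi for lo, hi in zip(thresholds, thresholds[1:])):
        raise ValueError("thresholds must be strictly increasing")
    # Cumulative probabilities P(X >= k), bounded by 1 below and 0 above.
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cum.append(0.0)
    # Each category probability is the drop between adjacent cumulative curves.
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]


# Hypothetical five-category Likert item.
print(grm_probs(0.0, 1.5, [-2.0, -0.5, 0.7, 1.9]))
```

Strictly increasing thresholds keep every category probability positive, which is the property behind a monotonic increase in the difficulty parameters.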
Developing Creative Thinking in Preschool Children: A Comprehensive Review of Innovative
comprehensive review creative thinking early childhood...
The ability to think creatively has a vital role in the development of preschool children. This research provides a comprehensive review of innovative approaches and strategies for developing creative thinking in preschool children based on current trends and methodologies used in educational settings. The review covers three significant areas: (a) creative thinking skills in preschool children, (b) an in-depth look at the factors influencing creative thinking skills, and (c) innovative strategies and approaches to stimulate creative thinking abilities in preschool children. This research uses a literature study method assisted by the Publish or Perish application to find reference sources related to creative thinking abilities in preschool children. Studies show that creative thinking abilities in preschool children enable them to find innovative solutions, help them adapt to challenges, foster self-confidence and courage, and enrich their experience and knowledge of the world around them. Meanwhile, preschool children's creative thinking abilities are influenced by collaboration from the external environment (parents, teachers, and society); providing support and examples for children to develop and stimulate their creative thinking skills is very important.