'item measurement' Search Results
Adaptation of Distributed Leadership Scale into Turkish: The Validity and Reliability Study
distributed leadership distributed leadership scale validity and reliability...
The purpose of this study was to adapt the “Distributed Leadership Scale”, originally developed by Davis, into Turkish. A total of 386 participants, including teachers employed in high schools in Tokat, took part in the study. Exploratory Factor Analysis (EFA) and Confirmatory Factor Analysis (CFA) were performed to test the structural validity of the scale. EFA results showed that the adapted scale consisted of seven factors. In line with the original scale form, these factors were named “School Organization”, “School Vision”, “School Culture”, “Instructional Program”, “Artifacts”, “Teacher Leadership”, and “Principal Leadership”. The scale consisted of 34 items, and reliability coefficients for the subscales ranged from .75 to .92. The results indicated that the Distributed Leadership Scale-Turkish Adapted Form is a valid and reliable measurement tool for describing distributed leadership behaviors in schools.
Measuring Teaching Best Practice in the Induction Years: Development and Validation of an Item-Level Assessment
teacher induction development assessment rasch modeling teaching internship teacher attrition...
Schools and teacher induction programs around the world routinely assess teaching best practice to inform accreditation, tenure/promotion, and professional development decisions. Routine assessment is also necessary to ensure that teachers entering the profession get the assistance they need to develop and succeed. We introduce the Item-Level Assessment of Teaching practice (I-LAST) as a flexible framework-based approach for quantitative evaluation of teaching best practice in the induction stages. We based the I-LAST on a novel framework for teaching best practice, and used Fuller’s scale as a framework for understanding the potential of the I-LAST in providing longitudinal measures for growth. Using the context of a year-long teacher induction program in the Midwestern United States, we collected data through an online survey from 46 teaching supervisors who were asked to evaluate their interns. We used the Rasch partial credit model as a criterion for construct validity, and measured dimensionality and reliability from both Rasch and classical frameworks. The I-LAST was found to be a unidimensional, valid, and reliable measure for teaching best practice. It demonstrated the ability to provide reliable scores for specific sub-dimensions of best practice, including those which manifest at various stages along Fuller’s scale. Potential uses of the I-LAST to advance understanding of the role of teacher induction programs in fostering productive growth in new teachers are discussed.
A Reflection on Distorted Views of Science and Technology in Science Textbooks as Obstacles to the Improvement of Students’ Scientific Literacy
scientific literacy knowing about science science inquiry technology society environment textbooks conceptions...
Scientific literacy has been increasingly considered a major goal of science education. While textbooks remain the most widespread tools for pursuing this goal within classrooms, they have been slow to adapt to the most recent epistemological paradigms, often still conveying distorted views of science and technology. Accordingly, we present herein a theoretical framework specifically intended to highlight the potential of textbooks to promote students’ scientific literacy. It is additionally argued that, often, the misconceptions conveyed by textbooks represent obstacles to the acquisition of a fair image of science and, therefore, to the acquisition of scientific literacy. Finally, a textbook analysis is suggested.
Development of a Scale to Measure Educators’ Practice in Teaching Self-Determination
scale development self-determination teachers validity and reliability...
The purpose of this study was to develop a scale for assessing teachers’ self-determination instruction and to test the validity and reliability of this tool. The subjects included 315 teachers recruited from elementary and junior high schools nationwide in Taiwan. The Teaching Self-Determination Scale (TSDS) developed in this study aimed at assessing the extent to which educators teach students knowledge and skills related to self-determination. The 24-item TSDS is comprised of four subscales including Self-Realization, Psychological Empowerment, Self-Regulation, and Autonomy. Data collected were analyzed using descriptive statistics, correlation analyses, t tests, and factor analyses. Findings showed that the TSDS has satisfactory psychometric properties. The internal consistency reliability coefficients (Cronbach’s α) ranged from .76 to .93, while the test-retest coefficients ranged from .71 to .87. Findings of the exploratory factor analysis showed that the four TSDS subscale factors can be reasonably extracted, which can explain 59.7% of the total item variance. The confirmatory factor analysis results further indicated a good fit between the measurement model and the sample data (GFI = .96, AGFI = .91, RMSEA = .08, NFI = .97, RFI = .93, IFI = .98, TLI = .95, CFI = .98). Suggestions are provided for future research.
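As a rough illustration of the internal consistency coefficient reported above, the following minimal Python sketch computes Cronbach's α from a respondents-by-items score matrix. The function names and the five-respondent, three-item data set are hypothetical and are not taken from the TSDS study.

```python
def sample_variance(values):
    """Unbiased sample variance (divides by n - 1)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(scores[0])  # number of items
    item_vars = [sample_variance([row[i] for row in scores]) for i in range(k)]
    total_var = sample_variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical Likert responses: 5 respondents x 3 items (illustration only).
responses = [
    [4, 5, 4],
    [3, 4, 3],
    [5, 5, 4],
    [2, 3, 2],
    [4, 4, 5],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

Values closer to 1 indicate that the items vary together; the sketch assumes complete data with no missing responses.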
Opinions and Suggestions of Teacher Candidates on the Teaching of the Reading Skill in French Language: The Example of Uludag University
teacher candidates of french reading-comprehension skill student opinions content analysis...
In this work, we analyzed the opinions of teacher candidates of French as a foreign language on the method that should be used in teaching reading comprehension, one of the main linguistic skills. At the end of the Fall Semester of the 2011-2012 academic year, a survey consisting of three open-ended questions was carried out among teacher candidates studying in the French Language Teaching Programme of the Faculty of Education of Uludag University. Of the 120 students enrolled in the Programme, 64 participated in the survey voluntarily and expressed their opinions. Students were informed about the study in advance, and a link to the survey, prepared via Google Docs, was sent to them by e-mail, in which they were asked to complete the survey in Turkish. Within the scope of the present work, only student opinions relating to option b of the survey's second question, i.e. on the method that should be used in teaching the reading skill in French, were taken into consideration. In this qualitative work based on a case study design, opinions were first sorted and classified through content analysis; they were then compared with and discussed in the light of opinions and suggestions already existing in the literature. Lastly, the findings were interpreted and presented as conclusions.
Measurement Invariance of the Student Personal Perception of Classroom Climate Scale (SPPCC) in the Turkish Context
gender invariance personal perception elementary education classroom climate...
School and classroom climate are among the school psycho-social factors with a considerable effect on student outcomes. Because how students perceive the classroom climate strongly predicts achievement, measuring classroom climate is important, and the need to test existing results across cultures persists. In this study, we assessed the validity and measurement invariance of the Turkish adaptation of the Student Personal Perception of Classroom Climate Scale (SPPCC) developed in English (US). Confirmatory factor analyses (CFA) and measurement invariance (MI) analyses by sex were performed on 629 students’ data. CFA results confirmed the factorial structure of the SPPCC. Results of the MI analyses showed that the SPPCC measures the same construct for females and males in a non-English context. Latent mean comparisons revealed that girls perceived the classroom climate more positively than boys. We concluded that this study in the Turkish context is a further step in developing evidence of the extent to which the SPPCC provides psychometrically sound scores.
The Development of an Instrument to Measure the Higher Order Thinking Skill in Physics
higher order thinking skill physics instrument...
This study was conducted to develop a diagnostic test that can be used to measure the higher-order thinking skills (HOTS) of first-grade senior high school students in Bima district, West Nusa Tenggara. The instrument was developed using a modified version of the Oreondo model, which comprises two activities: test design and test trials. Content validity was analysed using the Aiken formula, classical test theory analysis used the software Iteman 4.3, Rasch model analysis used the software Winsteps, and reliability was analysed using SPSS. It is concluded that the developed instrument has the characteristics of a useful instrument and fulfils the requirements for measurement. This is supported by the analysis results, which confirm that the instrument achieved content validity through expert judgment and obtained empirical evidence under both classical test theory and the Rasch model.
A Comparison of Score Equating Conducted Using Haebara and Stocking Lord Method for Polytomous
equating polytomous graded data...
The purposes of this research are: 1) to compare two test equating procedures conducted with the Haebara and Stocking-Lord methods; 2) to describe the characteristics of each equating method using the Windows program IRTEQ. This research employs a participatory approach, as the data are collected through questionnaires based on the National Examination Administration of 2018. The samples are classified into group A and group B, with 449 and 502 respondents respectively. This paper discusses how to equate shared items using the anchor method with a set of instruments consisting of 35 questionnaire items and 6 shared items. In addition, the researcher uses PARSCALE to estimate each respondent’s ability and each item’s characteristics. The shared items are then equated using the IRTEQ program. The results show a significant difference between the equating conducted using the Haebara method (0.592), which produces a larger mean-sigma value, and that conducted using the Stocking-Lord method (0.00213). The results also suggest that the shared test items may improve discrimination among respondents and increase the difficulty level (parameter b). Given the availability of shared items, it is appropriate to equate two different tests across different ability (theta) levels.
Psychometric Assessment and Cross-Cultural Adaptation of the Grit-S Scale among Omani and American Universities’ Students
grit psychometric properties achievement goal orientations cross-cultural study...
The current study aimed to adapt the Grit-S and assess its psychometric properties and measurement invariance among Omani and American students (N = 487) using Exploratory Factor Analysis (EFA) and Multi-Group Confirmatory Factor Analysis (CFA). The scale’s construct validity was estimated by investigating its associations with achievement goal orientations (AGOs). EFA results suggested that a two-factor solution (i.e., perseverance of effort [G_PE] and consistency of interest [G_CI]) was the best factorial structure, explaining 47.74% and 51.02% of the variance in the Omani and American samples, respectively. The factors had good reliability coefficients in the two samples. Regarding intercultural differences, G_PE explained more variance among Omanis (31.02%) than in the American sample, whereas G_CI explained a larger proportion of variance among Americans (36.86%) than in the Omani sample. The first level of measurement invariance, configural invariance, was not supported, necessitating the investigation of the other levels of measurement invariance using a new sample. Grit correlated positively with mastery and performance-approach goals (r = .29 and .12, respectively) and negatively with avoidance goals (r = -.25), supporting the scale’s construct validity. These findings showed that the Grit-S scale can be used as a valid and reliable tool to assess student interest and perseverance in the academic context in Arabic/Omani and American cultures.
Implementation of the Omega (ω) Index to Detect Large-Scale Systematic Cheating
answer-copying indices item response theory pirls cheating detection standardized testing test integrity...
Cheating detection is an important issue in standardized testing, especially in large-scale settings. Statistical approaches are often computationally intensive and require specialised software to conduct. We present a two-stage approach that quickly filters suspected groups using statistical testing on an IRT-based answer-copying index. We also present an approach to mitigate data contamination and improve the performance of the index. The computation of the index was implemented through a modified version of an open source R package, thus enabling wider access to the method. Using data from PIRLS 2011 (N=64,232) we conduct a simulation to demonstrate our approach. Type I error was well-controlled and no control group was falsely flagged for cheating, while 16 (combined n=12,569) of the 18 (combined n=14,149) simulated groups were detected. Implications for system-level cheating detection and further improvements of the approach were discussed.
Developing of Computerized Adaptive Testing to Measure Physics Higher Order Thinking Skills of Senior High School Students and its Feasibility of Use
computerized adaptive testing hots partial credit model item response theory...
Computers are now used widely in education, including in teaching and learning, testing, and evaluation. This research aimed to develop a computerized adaptive test (CAT) to measure physics higher-order thinking skills (HOTS), named PhysTHOTS-CAT. The development followed the 4-D model developed by Thiagarajan, which comprises four phases: define, design, development, and dissemination. The instrument presents test items based on the student’s ability. The research phases include (1) needs analysis and definition, (2) development design, (3) development of the CAT and assembly of the test items into it, (4) validation by experts, and (5) a feasibility try-out. The findings show that PhysTHOTS-CAT is valid for measuring the physics HOTS of 10th-grade senior high school students, according to 82.28% of teacher and student assessments of PhysTHOTS-CAT content and media. Therefore, it can be concluded that PhysTHOTS-CAT is feasible for measuring the physics HOTS of 10th-grade senior high school students.
The Effect of Emotional Intelligence, Self-Efficacy, Subjective Well-Being and Resilience on Student Teachers’ Perceived Practicum Stress: A Malaysian Case Study
emotional intelligence self-efficacy subjective well-being resilience perceived practicum stress...
Stress is inevitable in the world of teaching and practicum training, and student teachers naturally incur a certain level of stress due to the demands on them to use various knowledge and skills in real school and classroom environments. Hence, practicum stress needs to be addressed accordingly. The central focus of this study is the use of partial least squares structural equation modeling to explore the inter-relationships among student teachers’ personal resources for mitigating practicum stress. A sample of 200 student teachers, selected by purposive sampling from teacher education institutions in Sabah, Malaysia, was used in this study. Data were collected via a survey using a questionnaire developed from several existing scales. Findings showed that emotional intelligence, self-efficacy, and subjective well-being explained resilience with good predictive accuracy and relevance but explained practicum stress poorly. These findings suggest the need to include additional constructs to better explain perceived practicum stress in future exploratory research.
The Development of Computerized Economics Item Banking for Classroom and School-Based Assessment
item banking cbt assessment economics...
Advances in information technology have changed conventional test methods. The weaknesses of paper-based tests can be minimized by using computer-based tests (CBT). Developing a CBT requires a computerized item bank. This study aimed to develop a computerized item bank for classroom and school-based assessments. A research and development method was used, consisting of four phases: planning, item development, system development, and field testing. Data were collected through documentation, expert judgment, and field testing, and were analyzed using descriptive statistics and item response theory. The sample consisted of teachers and high school students in West Sumatera province, selected using a purposive random sampling technique. The results of the study are as follows. 1) The computerized item bank has excellent quality based on expert validation. 2) The 120 items entered into the item bank system have moderate difficulty and a good discrimination index based on item response theory. 3) The field testing indicated that the computerized economics item bank is highly usable, useful for teachers, and feasible for classroom and school-based assessment.
Multiple Intelligences-based Creative Curriculum: The Best Practice
model assessment curriculum multiple intelligences kindergarten...
The purposes of this research are: 1) to develop a model and produce an assessment for a creative curriculum learning program based on multiple intelligences (MI), and 2) to identify the characteristics and impacts of the developed product model. The multi-year research used the R&D (Research and Development) method in two phases. The first phase comprised: 1) a preliminary survey stage, 2) a definition stage, 3) a design stage, 4) a trial stage, and 5) a development stage. The second phase comprised: 1) an instrument design stage through a Forum Group Discussion, 2) a product trial with 100 children in Sleman Regency, 3) wide-scale implementation with 200 children in Yogyakarta Province, 4) an evaluation stage with construct analysis and the performance achievement of the research subjects, and 5) a stage measuring the effectiveness of the product through user perception. The subjects comprised 200 early childhood children and 20 kindergarten teachers in 10 kindergartens in Yogyakarta Province, Indonesia, and the study followed the approach of Reflective Measurement Theory (RMT). The results showed that: 1) the MI-based creative curriculum assessment model meets the criteria of validity, reliability, and fit to the empirical data, and 2) the implementation of the assessment model fulfilled the feasibility requirements on the following criteria: (a) the assessment of children using creative instruments based on multiple intelligences yielded "very good" results, (b) teacher readiness for learning fell in the "good" category, (c) teacher performance appraisal fell in the "very good" category, and (d) the benefits of the developed product fell in the "very good" category. It was concluded that the developed product had been tested empirically and practically and is useful for learning in early childhood.
Design and Validation of Mathematical Literacy Instruments for Assessment for Learning in Indonesia
instruments mathematics literacy content validity construct validity construct reliability...
This study aims to design mathematical literacy instruments that have evidence of content and construct validity and are reliable for use in Assessment for Learning. The research involved eight experts as instrument validators and 273 eighth-grade junior high school students in Yogyakarta Province. The results showed that the ten mathematical literacy items developed had Aiken's V coefficients ranging from 0.781 to 0.906 (> 0.75). Sampling adequacy testing with KMO and Bartlett's test gave a Bartlett chi-square of 608.608 with p-value < 0.05 and a KMO value of 0.781 (> 0.5). Testing of the measurement model with Confirmatory Factor Analysis (CFA) produced a Root Mean Square Error of Approximation (RMSEA) of 0.049 (≤ 0.08), a chi-square of 33.92 (< 2df), and a p-value of 0.05004 (≥ 0.05). Nine of the ten items had a t-value > 1.96 and a Standardized Loading Factor (SLF) greater than the critical limit (> 0.3), and Construct Reliability (CR) was 0.78 (> 0.7). It can be concluded that the developed mathematical literacy instrument measures what it is intended to measure, that nine items significantly reflect the construct or latent variable, and that the scores show a good level of consistency.
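The content validity index mentioned above can be sketched as follows. This is a minimal Python illustration of Aiken's V for one item, V = Σ(r − lo) / (n(c − 1)), where n is the number of raters and c the number of rating categories. The eight ratings and the 1-to-5 rating scale are hypothetical assumptions; the abstract does not state the scale the experts used.

```python
def aiken_v(ratings, lowest, highest):
    """Aiken's V for one item rated by several experts.

    ratings: list of integer ratings, one per expert
    lowest, highest: bounds of the rating scale (assumed here, e.g. 1 and 5)
    """
    n = len(ratings)
    categories_minus_one = highest - lowest  # c - 1
    s = sum(r - lowest for r in ratings)     # distance of each rating from the floor
    return s / (n * categories_minus_one)


# Hypothetical ratings from eight experts on an assumed 1-5 scale.
expert_ratings = [4, 4, 5, 4, 5, 4, 4, 5]
v = aiken_v(expert_ratings, lowest=1, highest=5)
print(v)  # 27 / 32 = 0.84375
```

V ranges from 0 (all experts give the lowest rating) to 1 (all give the highest), so an item is usually retained when V exceeds a chosen cutoff such as the 0.75 quoted in the abstract.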
Ethnic Differences in Students’ Attitudes to the Arts: Providing Validity Evidence to Make Comparisons
attitude to arts measurement invariance ethnicity...
Previous research suggests that non-cognitive factors play an important role in promoting success at school and beyond, aligning with the multifaceted goals of education. Enhancing students’ attitudes to learning in school is expected to have positive impacts on various schooling outcomes. To date, very few studies have focused on measuring and understanding students’ attitude to the arts. This study aims to address a gap in current research in this area by introducing instruments designed to measure attitude to dance, drama, music and visual arts. Confirmatory factor analysis and measurement invariance analyses are employed to examine the factorial validity and measurement equivalence of the scales of attitude to the arts disciplines for different ethnic groups in New Zealand. Findings support the utility of the scales as valid measures of attitude to dance, drama, music and visual arts. Noticeable differences are reported among New Zealand European, Maori, Pasifika and Asian students regarding their attitudes to dance, drama, music and visual arts.
The Effects of Principal’s Decision-making, Organizational Commitment and School Climate on Teacher Performance in Vocational High School Based on Teacher Perceptions
organizational commitment principal’s decision-making school climate teacher performance...
This quantitative research aims to analyze the effects of the principal's decision-making, organizational commitment and school climate on teacher performance in vocational high schools. The research sample comprised 160 vocational school teachers in North Minahasa Regency, selected by simple random sampling. The data were collected using a Likert scale questionnaire with 25 statements. The data analysis was performed using simple linear regression and multiple linear regression. The results showed that the principal's decision-making, organizational commitment and school climate had a positive and significant effect on the performance of vocational school teachers, both partially and simultaneously. The results of this research can be an important reference for educational administrators at the vocational high school level in designing school strategies and policies that encourage improved teacher performance and better school productivity.
The Effect of Negative Peace in Mind to Aggressive Behavior of Students in Indonesia
aggressive behavior peace education peace of mind...
This ex-post facto research aims to identify the negative influence of peace of mind on students' aggressive behavior. Aggressive behavior among students remains a problem that has not been fully resolved and is increasingly complex. One model of education that seeks to build students' peace of mind is the peace education model. The use of this educational model can suppress students' urge to show aggressive behavior. The research data were collected using the peace of mind scale (PoMS) and the aggressive behavior scale (ABS). The research sample was drawn using a cluster random technique, with a total of 1263 students from the western part of Indonesia (East Java, the Special Region of Yogyakarta, and Lampung), the central part of Indonesia (West Nusa Tenggara and Central Sulawesi), and the eastern part of Indonesia (North Maluku). Data were analyzed using simple linear regression. The analysis concluded that negative peace of mind has an effect of 62.9% on aggressive behavior committed by students. It is recommended that future researchers develop peaceful thinking training programs to reduce students' aggressive behavior.
Agreement Levels of Kindergarten Principals and Teachers to Determine Teaching Competencies and Performance
agreement level competence kindergarten teacher teaching performance...
This research aimed to analyze the levels of agreement between kindergarten teachers and principals in assessing teachers’ teaching competencies and performance. The study used a quantitative survey design with a non-probability, purposive sampling technique. The sample comprised 173 kindergarten teachers and 101 principals in Semarang District, Indonesia, for a total of 274 respondents. The data were collected through a questionnaire and analyzed using Cohen’s Kappa coefficient to measure the levels of agreement between raters; the Pearson Chi-Square test was also used to determine differences in perceptions between principals and teachers. The findings showed that the levels of agreement between raters fell, on average, in the no agreement category, implying differences in perceptions between teachers and principals. The use of a multi-rater strategy in such research is rare, especially at the Early Childhood Education (ECE) level in Indonesia. Research on teaching competencies and performance generally involves only a single rater, either teachers or principals judging their own competencies and performance, so the results tend to be subjective. In conclusion, teaching competencies related to cognitive abilities were assessed through a test comprising subjective questions and case analysis to evaluate the teachers’ skills based on their performance and self-description. Both personal and social assessments used self-assessment forms or autobiographies, which were completed on specific themes. Meanwhile, performance was assessed by observation using the assessment rubric and by comparison with the learning process carried out by the individual educator.
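For readers unfamiliar with the agreement statistic named above, the sketch below computes Cohen's Kappa for two raters as κ = (p_o − p_e) / (1 − p_e), where p_o is the observed proportion of agreement and p_e the agreement expected by chance. It is a minimal Python illustration; the function name and the ratings are hypothetical and do not come from the study's data.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who rated the same cases."""
    if len(rater_a) != len(rater_b):
        raise ValueError("both raters must rate the same number of cases")
    n = len(rater_a)
    # Observed agreement: share of cases where both raters gave the same label.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal proportions per label.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    labels = set(rater_a) | set(rater_b)
    p_expected = sum(counts_a[label] * counts_b[label] for label in labels) / n**2
    if p_expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical competence ratings (1 = competent, 0 = not yet) for four cases.
teacher_view = [1, 1, 0, 0]
principal_view = [1, 0, 1, 0]
kappa = cohen_kappa(teacher_view, principal_view)
print(kappa)  # 0.0: agreement no better than chance
```

A value near 0 corresponds to agreement no better than chance, which is the kind of result the abstract describes as falling in the no agreement category.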
Assessment of the Validity and Reliability of Mental Health Instruments of High School Student in Indonesia
validity reliability mental health item measurement adolescence indonesia...
This study aims to develop a standard instrument for measuring mental health among urban adolescents in Indonesia. The objective is to produce valid and reliable school adolescent mental health instruments to be used by agencies or schools to identify students' mental health. The survey was conducted in Jakarta and South Tangerang with a total of 1007 respondents divided into two trials: the first was conducted on 597 students and the second on 410 students. Measurements were made using a Likert scale questionnaire. Instrument testing began with a theoretical validity test by 4 experts and 20 panelists, who examined the instrument material in terms of construction, content and language. The experts analyzed and corrected the instrument qualitatively. The instrument was then reviewed and analyzed quantitatively by the panelists using the Aiken index. At this stage, 44 items, 9 indicators and 3 variable dimensions were obtained. Validity was then tested empirically by analyzing the measurement model using Confirmatory Factor Analysis (CFA) with the LISREL 8.80 Full Version program. Using the criteria of an SLF value ≥ 0.30 and a t-value ≥ 1.96, and calculating reliability with construct reliability (CR) at the level > 0.70, the results of the second trial showed that 35 items were valid. The Goodness-of-Fit test showed a fit between the theoretical model and the empirical model for the mental health instruments in this study.
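The construct reliability criterion referred to above is commonly computed from standardized loadings as CR = (Σλ)² / ((Σλ)² + Σ(1 − λ²)). The following Python sketch assumes that formula and uses hypothetical loadings; the abstract does not list the study's loadings or state which CR formula LISREL output was used with.

```python
def construct_reliability(loadings):
    """Construct (composite) reliability from standardized factor loadings.

    Assumes each item's error variance is 1 - loading**2, which holds for
    standardized loadings with uncorrelated errors.
    """
    sum_loadings_sq = sum(loadings) ** 2
    sum_error_var = sum(1 - lam**2 for lam in loadings)
    return sum_loadings_sq / (sum_loadings_sq + sum_error_var)


# Hypothetical standardized loadings for a three-item indicator.
example_loadings = [0.7, 0.7, 0.7]
cr = construct_reliability(example_loadings)
print(round(cr, 3))  # 4.41 / 5.94, about 0.742
```

Under this formula, items with loadings near the 0.30 cutoff contribute large error variances, so a construct needs several stronger loadings to pass the > 0.70 threshold.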