European Journal of Educational Research

EU-JER is a peer-reviewed, online academic research journal.


Publisher (HQ)

Eurasian Society of Educational Research
Christiaan Huygensstraat 44, Zipcode:7533XB, Enschede, THE NETHERLANDS

'item response theory' Search Results



...

Among school psycho-social factors with considerable effect on student outcomes are both school and classroom climate. Because how students perceive the classroom climate strongly predicts achievement, measuring classroom climate gains importance and the need for testing the existing results across cultures persists. In this study, we assessed the validity and measurement invariance of the Turkish adaptation of the Student Personal Perception of Classroom Climate Scale (SPPCC) developed in English (US). Confirmatory factor analyses (CFA) and measurement invariance (MI) analyses by sex were performed on 629 students’ data. CFA results confirmed the factorial structure of the SPPCC. Results of the MI analyses showed that the SPPCC measures the same construct for females and males in a non-English context. Latent mean comparisons revealed girls perceived the classroom climate more positively than boys. We concluded that this study in the Turkish context is a further step in developing evidence of the extent to which the SPPCC provides psychometrically sound scores.

DOI: 10.12973/eu-jer.7.1.113
Pages: 113-120
Article Metrics: Views 927, Downloads 1565, Citations 4 (Crossref)

The Development of an Instrument to Measure the Higher Order Thinking Skill in Physics

higher order thinking skill physics instrument

Syahrul Ramadhan , Djemari Mardapi , Zuhdan Kun Prasetyo , Heru Budi Utomo


...

This study developed a diagnostic test to measure the higher-order thinking skills (HOTS) of first-grade senior high school students in Bima district, West Nusa Tenggara. The instrument was developed using a modified Oreondo model comprising two activities: test design and test trials. Content validity was analysed with the Aiken formula, classical test theory analysis used Iteman 4.3, Rasch model analysis used Winsteps, and reliability analysis used SPSS. The study concludes that the developed instrument has the characteristics of a useful instrument and meets the requirements for measurement. This is supported by the analysis results, which confirm that the instrument achieved content validity by expert judgment and obtained empirical evidence under both classical test theory and the Rasch model.
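
The abstract names the Aiken formula for content validity but does not show it. As a minimal sketch, assuming the standard Aiken's V definition, V = S / (n(c - 1)), where S sums each rater's score minus the lowest scale point, n is the number of raters, and c is the number of scale categories, the computation for one item might look like this. The function name and the ratings are hypothetical illustrations, not data from the study.

```python
def aiken_v(ratings, lo, hi):
    """Aiken's V for one item rated by several experts.

    ratings: expert scores on an integer scale from lo to hi (assumed).
    Returns a value between 0 (all raters chose lo) and 1 (all chose hi).
    """
    n = len(ratings)
    if n == 0 or hi <= lo:
        raise ValueError("need at least one rating and hi > lo")
    # S: total distance of the ratings above the lowest category
    s = sum(r - lo for r in ratings)
    # hi - lo equals c - 1 when the scale has c consecutive categories
    return s / (n * (hi - lo))


# Hypothetical ratings from five experts on a 1-5 relevance scale
example_v = aiken_v([4, 5, 4, 5, 3], lo=1, hi=5)
```

Values closer to 1 indicate stronger agreement among raters that the item is relevant; the cut-off applied in the study is not stated in the abstract.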

DOI: 10.12973/eu-jer.8.3.743
Pages: 743-751
Article Metrics: Views 1161, Downloads 1485, Citations 20 (Crossref), 36 (Scopus)

...

The purposes of this research are: 1) to compare two test equating methods, Haebara and Stocking-Lord; 2) to describe the characteristics of each equating method using the IRTEQ program for Windows. This research employs a participatory approach, as the data are collected through questionnaires based on the National Examination Administration of 2018. The sample comprises group A and group B, with 449 and 502 respondents respectively. This paper discusses how to equate shared items using the anchor method with a set of instruments consisting of 35 questionnaire items and 6 shared items. In addition, the researcher uses PARSCALE to estimate each respondent’s skills and each item’s characteristics. The shared items are then equated using the IRTEQ program. The results show a significant difference between the Haebara method (0.592), which produces the larger mean-sigma value, and the Stocking & Lord method (0.00213). Thus, the results show that the shared test items may improve discrimination among respondents and increase the difficulty level (parameter b). Given the availability of shared items, it is appropriate to equate two different tests on different theta skills.

DOI: 10.12973/eu-jer.8.4.1071
Pages: 1071-1079
Article Metrics: Views 630, Downloads 1107, Citations 3 (Crossref), 2 (Scopus)

...

Cheating detection is an important issue in standardized testing, especially in large-scale settings. Statistical approaches are often computationally intensive and require specialised software to conduct. We present a two-stage approach that quickly filters suspected groups using statistical testing on an IRT-based answer-copying index. We also present an approach to mitigate data contamination and improve the performance of the index. The computation of the index was implemented through a modified version of an open source R package, thus enabling wider access to the method. Using data from PIRLS 2011 (N=64,232) we conduct a simulation to demonstrate our approach. Type I error was well-controlled and no control group was falsely flagged for cheating, while 16 (combined n=12,569) of the 18 (combined n=14,149) simulated groups were detected. Implications for system-level cheating detection and further improvements of the approach were discussed.

DOI: 10.12973/eu-jer.8.4.1307
Pages: 1307-1322
Article Metrics: Views 489, Downloads 929, Citations 0 (Crossref), 0 (Scopus)

...

Computers are now used widely in education, including in teaching and learning, testing, and evaluation. This research aimed to develop a computerized adaptive test (CAT) to measure physics higher-order thinking skills (HOTS), namely PhysTHOTS-CAT. The research and development used the 4-D model developed by Thiagarajan, which comprises four phases: define, design, develop, and disseminate. The instrument selects test items according to the student’s ability. The research phases include (1) needs analysis and definition, (2) development design, (3) development of the CAT and assembly of the test items into the CAT, (4) validation by experts, and (5) feasibility try-out. The findings show that PhysTHOTS-CAT is valid for measuring the physics HOTS of 10th-grade senior high school students according to 82.28% of teachers’ and students’ assessment of PhysTHOTS-CAT content and media. It can therefore be concluded that PhysTHOTS-CAT is usable and feasible for measuring the physics HOTS of 10th-grade senior high school students.
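
The abstract says the test presents items according to the student's ability but does not state the IRT model or the selection rule used by PhysTHOTS-CAT. As a rough sketch of one common approach, assuming a Rasch (1PL) model and maximum-information selection, the next item could be chosen as follows. All names and the item difficulties are hypothetical.

```python
import math


def rasch_probability(theta, b):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def item_information(theta, b):
    """Fisher information of a Rasch item at ability theta: P * (1 - P)."""
    p = rasch_probability(theta, b)
    return p * (1.0 - p)


def select_next_item(theta, difficulties, administered):
    """Return the index of the unused item that is most informative at theta."""
    candidates = [i for i in range(len(difficulties)) if i not in administered]
    if not candidates:
        raise ValueError("no items left in the bank")
    return max(candidates, key=lambda i: item_information(theta, difficulties[i]))


# Hypothetical item bank (difficulty parameters in logits)
bank = [-1.5, -0.4, 0.3, 1.2, 2.0]
first_pick = select_next_item(0.5, bank, administered=set())
```

Under the Rasch model, information peaks where item difficulty equals ability, so this rule amounts to picking the unused item whose difficulty is closest to the current ability estimate.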

DOI: 10.12973/eu-jer.9.1.91
Pages: 91-101
Article Metrics: Views 1323, Downloads 1570, Citations 24 (Crossref), 30 (Scopus)

...

Stress is inevitable in the world of teaching and practicum training, and student teachers therefore naturally incur a certain level of stress due to the demands on them to use various knowledge and skills in real school and classroom environments. Hence, practicum stress needs to be addressed accordingly. The central focus of this study is the use of partial least squares structural equation modeling to explore the inter-relationships among student teachers’ personal resources for mitigating practicum stress. A sample of 200 student teachers, selected by purposive sampling from teacher education institutions in Sabah, Malaysia, was used in this study. Data were collected via survey methods using a questionnaire developed from several existing scales. Findings showed that emotional intelligence, self-efficacy, and subjective well-being explained resilience with good predictive accuracy and relevance but explained practicum stress poorly. These findings suggest the need to include additional constructs to better explain perceived practicum stress in future exploratory research.

DOI: 10.12973/eu-jer.9.1.277
Pages: 277-291
Article Metrics: Views 2405, Downloads 2774, Citations 31 (Crossref), 30 (Scopus)

The Development of Computerized Economics Item Banking for Classroom and School-Based Assessment

item banking cbt assessment economics

Friyatmi , Djemari Mardapi , Haryanto , Elvi Rahmi


...

The advancement of information and technology has changed conventional test methods. The weaknesses of the paper-based test can be minimized using the computer-based test (CBT). The development of a CBT requires a computerized item bank. This study aimed to develop a computerized item bank for classroom and school-based assessments. A research and development method was used, consisting of four phases: planning, item development, system development, and field testing. Data were collected through documentation, expert judgment, and field testing, and were analyzed using descriptive statistics and item response theory. The sample consisted of teachers and high school students in West Sumatera province, selected using purposive random sampling techniques. The results of the study are as follows. 1) The computerized item bank has excellent quality based on expert validation. 2) The 120 items entered into the item bank system have moderate difficulty and a good discrimination index based on item response theory. 3) The field testing indicated that the computerized economics item bank is highly usable, useful for teachers, and feasible for classroom and school-based assessment.

DOI: 10.12973/eu-jer.9.1.293
Pages: 293-303
Article Metrics: Views 686, Downloads 974, Citations 2 (Crossref), 1 (Scopus)

Implementation of Four-Tier Multiple-Choice Instruments Based on the Partial Credit Model in Evaluating Students’ Learning Progress

learning progress four-tier change of state of matter partial-credit model

Lukman Abdul Rauf Laliyo , Syukrul Hamdi , Masrid Pikoli , Romario Abdullah , Citra Panigoro


...

One of the issues that hinder students’ learning progress is the inability to construct an epistemological explanation of a scientific phenomenon. A four-tier multiple-choice (hereinafter, 4TMC) instrument and the Partial-Credit Model were employed to elaborate on the diagnosis of this problem. This study aimed to develop and implement the four-tier multiple-choice instrument with the Partial-Credit Model to evaluate students’ learning progress in explaining the conceptual change of state of matter. The research applied a development design following the test development model by Wilson. The data were obtained through development and validation techniques on 20 4TMC items administered to 427 students. On each item, the study applied diagnostic-summative assessment and the certainty response index. The students’ level of conceptual understanding was categorized based on the combination of their answer choices; the measurement generated Partial-Credit Model data for the 1-parameter logistic (1PL) model. Differences by student class level were analysed using one-way Analysis of Variance (ANOVA). The study resulted in 20 valid and reliable 4TMC items. The results revealed that the integration of the 4TMC test and the Partial-Credit Model was effective as an instrument for measuring students’ learning progress. The one-way ANOVA indicated differences in students’ competence by academic level. In addition, low-ability students showed slow progress due to a lack of knowledge as well as misconceptions in explaining the concept of change of state of matter. Overall, the research found that diagnostic information is necessary for teachers in developing learning strategies and evaluating science learning.
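
The abstract refers to the Partial-Credit Model without giving its form. As an illustrative sketch, assuming the usual PCM parameterisation in which the probability of scoring in category k is proportional to exp of the sum over the first k steps of (theta - delta_j), the category probabilities for one item could be computed as below. The function name and step difficulties are hypothetical and are not taken from the study.

```python
import math


def pcm_probabilities(theta, steps):
    """Category probabilities for one item under the Partial-Credit Model.

    theta: person ability in logits.
    steps: step difficulties delta_1..delta_m; the item has m + 1 categories.
    """
    # Cumulative sums of (theta - delta_j); category 0 has an empty sum of 0
    cumulative = [0.0]
    for delta in steps:
        cumulative.append(cumulative[-1] + (theta - delta))
    # Subtract the maximum before exponentiating for numerical stability
    top = max(cumulative)
    weights = [math.exp(c - top) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical item with three steps, hence four score categories
probs = pcm_probabilities(0.5, [-1.0, 0.0, 1.0])
```

With a single step the model reduces to the dichotomous Rasch model, which is a convenient sanity check on the implementation.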

DOI: 10.12973/eu-jer.10.2.825
Pages: 825-840
Article Metrics: Views 425, Downloads 1019, Citations 6 (Crossref), 5 (Scopus)

...

The current study investigated the psychometric properties of the Student-Teacher Relationship Measure (STRM) using Rasch analysis in a sample of middle school female students (N = 995). Rasch Principal Components Analysis revealed psychometric support for two subscales (i.e., Academic and Social Relations). Summary statistics showed good psychometric properties. The category structure and individual statistics (i.e., item and person infit and outfit) were not ideal. The category structure showed that the distances between adjacent thresholds were lower than the optimal criteria. Even though findings indicated that item mean square statistics (MNSQ) were optimal, standardized fit statistics (i.e., ZSTD) reflected many misfitting persons and items in each subscale. After eliminating the misfitting persons and items, the two subscales met the Rasch optimal criteria. The updated short 22-item scale had good psychometric properties, high item and person separation, and good item and person reliability for the two subscales, and can be used as a reliable and valid scale.

DOI: 10.12973/eu-jer.10.2.957
Pages: 957-973
Article Metrics: Views 607, Downloads 1166, Citations 2 (Crossref), 2 (Scopus)

...

Teachers who can adapt and be ready for change can also help increase the competence of vocational high school students. The same holds when teachers serve as assessors in student competency tests. The objectives of this study were to produce an instrument measuring the readiness of teachers as assessors, to establish its item reliability, to describe the characteristics of the instrument, and to determine the difficulty level of its items. The method used is instrument development. Respondents were vocational school teachers who were candidates for competency test assessors. Data were collected using a questionnaire. Construct validity was analysed using confirmatory factor analysis, reliability using Cronbach’s alpha, and the instrument items using the Rasch model. The resulting readiness instrument for vocational teachers as assessors has 19 indicators grouped into 5 factors, with consistency values indicating that they belong to the same construct (supporting construct validity). The reliability of the instrument is 0.852, which is a high reliability coefficient. Two items, numbers 24 and 18, do not meet the overall item fit criteria. Items 8 and 6 have a difficulty score of more than 2, indicating a high difficulty level.
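
The abstract reports a Cronbach's alpha of 0.852 without showing the calculation. A minimal sketch, assuming the standard formula alpha = k/(k - 1) * (1 - sum of item variances / variance of total scores), is given below. The function name and the small score matrix are hypothetical and unrelated to the study's data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    if len(scores) < 2 or len(scores[0]) < 2:
        raise ValueError("need at least two respondents and two items")
    k = len(scores[0])
    # Sample variance of each item across respondents
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical questionnaire data: 4 respondents x 3 items
alpha = cronbach_alpha([[3, 4, 3], [4, 5, 4], [2, 2, 3], [5, 5, 5]])
```

Alpha approaches 1 as the items covary more strongly relative to their individual variances.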

DOI: 10.12973/eu-jer.10.3.1471
Pages: 1471-1485
Article Metrics: Views 385, Downloads 944, Citations 5 (Crossref), 7 (Scopus)

...

Within the context of Self-Regulated Learning (SRL), a process of directing oneself to facilitate individual learning more effectively, the SRL instrument development is deemed necessary to measure students’ self-reliance in learning mathematics in the science, technology, engineering, and mathematics (STEM) framework. The research aims to develop and test the validity and reliability of an SRL instrument, namely a 14-item SRL questionnaire accommodating four aspects, namely planning, self-monitoring, self-controlling, and evaluation. The study involved 420 junior high school students in East Java, Central Java, and Yogyakarta Special Region. The results show that the questionnaire was developed as planned and that planning, monitoring, controlling, and evaluating aspects can reflect the SRL variable in a valid, reliable, and significant way supported by each aspect's behavior indicator. The SRL variable theoretical model corresponds (good fit) with the empirical data, and all of the items are likely valid and reliable to assess student's mathematics SRL in the STEM framework. The SRL questionnaire was also found suitable for use by teachers to measure junior high school students’ self-reliance in SRL.

DOI: 10.12973/eu-jer.10.4.2067
Pages: 2067-2077
Article Metrics: Views 753, Downloads 1173, Citations 3 (Crossref), 4 (Scopus)

...

This research aimed to develop a historical thinking assessment for students' skills in analyzing the causality of historical events. The development process of Gall and colleagues and Rasch analysis models were used to develop an assessment instrument through two processes: analysis of the framework of cause and consequence, and testing of validity, reliability, and difficulty. This research involved 150 senior high school students, with data collected using a validation sheet, tests, and a scoring rubric. The result was an essay test consisting of six indicators of analyzing cause and consequence. The instruments were valid, reliable, and suitable for assessing students’ skills in analyzing the causality of historical events. The developed instruments were paired with a historical thinking skills assessment to improve the accuracy of the information about students' level of historical thinking skills in learning history.

DOI: 10.12973/eu-jer.11.2.609
Pages: 609-619
Article Metrics: Views 1556, Downloads 1911, Citations 2 (Crossref), 1 (Scopus)

...

This developmental research aims to develop a good mathematical test instrument using polytomous responses based on classical and modern test theories. The research design uses the Plomp model, which consists of five stages: (1) preliminary investigation, (2) design, (3) realization/construction, (4) revision, and (5) implementation (testing). The study was conducted in three vocational schools in Lampung Province, Indonesia. It involved 413 students, consisting of 191 male and 222 female students. The data were collected through a questionnaire and a test. The questionnaire was used to identify the assessment instruments currently employed by teachers and to be validated by experts in mathematics and educational evaluation. The test was an open polytomous response test of 40 items. The data were analyzed using both classical and modern theories. The results show that (1) the open polytomous response test falls in the good category according to classical and modern theory, although the discrimination power of test items under classical theory needs several revisions, and (2) the assessment instrument using the polytomous response of open multiple choice can provide information on the actual competence of students. This is supported by the agreement between the analysis results obtained from classical and modern theory and the students' arguments when giving reasons for their choices. Therefore, the open polytomous response test can be used as an alternative for learning assessment.

DOI: 10.12973/eu-jer.11.3.1441
Pages: 1441-1462
Article Metrics: Views 486, Downloads 1138, Citations 0 (Crossref), 0 (Scopus)

...

The present research aimed to test an Amharic version of the multi-dimensional Work Task Motivation Scale for Teachers (WTMST), which measures the five pillars of university instructors’ motivation toward teaching and student evaluation tasks based on self-determination theory (SDT). The WTMST offers the first instrument to measure all five motivational elements, and today it is one of the most applicable instruments for assessing teachers’ motivation. An Amharic version of the WTMST for teaching and student evaluation tasks was adopted and assessed in large-scale data (N=1,117). Our findings demonstrate excellent reliability and construct validity (convergent, discriminant, divergent and factorial). In addition, the model comparisons showed that, of the four theoretically competing models (single-order factor, correlated factor, higher-order factor and bi-factor models), the bi-factor model was the best-fitting one and was used for measurement invariance across various groups. Results also suggest that the factor structure of the WTMST for both teaching and student evaluation tasks demonstrates consistency across gender (men, women), university types (research, applied, and general university), age and experience in teaching. Therefore, the WTMST for teaching and student evaluation tasks may be valid in Ethiopian higher education settings.

DOI: 10.12973/eu-jer.11.4.2243
Pages: 2243-2263
Article Metrics: Views 950, Downloads 2198, Citations 2 (Crossref), 1 (Scopus)

The Development of a Four-Tier Diagnostic Test Based on Modern Test Theory in Physics Education

developing test four-tiers diagnostic test modern test theory

Edi Istiyono , Wipsar Sunu Brams Dwandaru , Kharisma Fenditasari , Made Rai Suci Shanti Nurani Ayub , Duden Saepuzaman


...

Diagnostic tests are generally two or three-tier and based on classical test theory. In this research, the Four-Tier Diagnostic Test (FTDT) was developed based on modern test theory to determine understanding of physics levels: scientific conception (SC), lack of knowledge (LK), misconception (MSC), false negatives (FN), and false positives (FP). The goals of the FTDT are to (a) find FTDT constructs, (b) test the quality of the FTDT, and (c) describe students' conceptual understanding of physics. The development process was conducted in the planning, testing, and measurement phases. The FTDT consists of four-layer multiple-choice with 100 items tested on 700 high school students in Yogyakarta. According to the partial credit models (PCM), the student's responses are in the form of eight categories of polytomous data. The results of the study show that (a) FTDT is built on the aspects of translation, interpretation, extrapolation, and explanation, with each aspect consisting of 25 items with five anchor items; (b) FTDT is valid with an Aiken's V value in the range of 0.85-0.94, and the items fit PCM with Infit Mean Square (INFIT MNSQ) of 0.77-1.30, item difficulty index of 0.12-0.38, and the reliability coefficient of Cronbach's alpha FTDT is 0.9; (c) the percentage of conceptual understanding of physics from large to small is LK type 2 (LK2), FP, LK type 1 (LK1), FN, LK type 3 (LK3), SC, LK type 4 (LK4), and MSC. The percentage sequence of MSC based on the successive material is momentum, Newton's law, particle dynamics, harmonic motion, work, and energy. In addition, failure to understand the concept sequentially is due to Newton's law, particle dynamics, work and energy, momentum, and harmonic motion.

DOI: 10.12973/eu-jer.12.1.371
Pages: 371-385
Article Metrics: Views 594, Downloads 1011, Citations 0 (Crossref), 0 (Scopus)

...

This paper discusses the development and validation of a concept inventory for interpreting kinematics graphs in the Tanzanian context. The study involved 61 participants comprising physics pre-service teachers, secondary school teachers, diploma college tutors, and a university lecturer from Tanzania. We developed 25 multiple-choice questions for interpreting kinematics graphs. The different steps in the development process used are selecting the topic, setting objectives, constructing questions, validating questions, and reliability testing. We carried out descriptive and inferential statistical analysis by using Statistical Package for Social Science (SPSS) version 22 followed by item analysis for pre-and post-piloting. Findings revealed normal distribution scores with a mean and standard deviation of 39.28±10.893 for pre-piloting and 40.16±8.08 for post-piloting. It also revealed no significant difference between pre-and post-piloting results with a p-value of 0.414.  In addition, correlation coefficients for test re-test reliability were .783 and .878 for single and average measures respectively. Moreover, item analysis in terms of difficulty index, discrimination index, and distractor efficiency agreed with the published standards. Based on these findings, the study recommends the use of developed and validated kinematics graphs concept inventory by physics educators in both research and classroom instructions in the Tanzanian context.

DOI: 10.12973/eu-jer.12.2.673
Pages: 673-693
Article Metrics: Views 460, Downloads 749, Citations 2 (Crossref), 1 (Scopus)

Study Item Parameters of Classical and Modern Theory of Differential Aptitude Test: Is it Comparable?

classical test theory differential aptitude test item parameter modern test theory

Farida Agus Setiawati , Rizki Nor Amelia , Bambang Sumintono , Edi Purwanta


...

This study aimed to find the Classical Test Theory (CTT) and Modern Test Theory (MTT) item parameters of the Differential Aptitude Test (DAT) and to examine their comparability. The item parameters studied are difficulty level and discrimination index. The data comprised 5,024 DAT sub-test results documented by the Department of Psychology and the Guidance and Counselling bureau. The classical and modern item parameters were estimated and then correlated to examine their comparability. The results show that there is a significant correlation between item parameter estimates. The Rasch and IRT 1-PL models have the highest correlation with CTT regarding the item difficulty level. In contrast, the 2-PL model has the highest correlation with CTT in the item discrimination index. Overall, the study concluded that CTT and MTT were comparable in estimating item parameters of the DAT and thus could be used independently or in a complementary way in developing the DAT.
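
The two classical item parameters named in the abstract are not defined there. As a hedged sketch, assuming the common CTT definitions (difficulty as the proportion of correct responses, discrimination as the correlation between the item score and the rest score, i.e. the total excluding that item), they could be computed as follows. The study may have used a different discrimination index; the function names and the response matrix are hypothetical.

```python
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation; assumes neither list is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / (pstdev(x) * pstdev(y))


def ctt_item_stats(responses):
    """Return (difficulty, discrimination) per item for 0/1 scored responses."""
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    stats = []
    for i in range(n_items):
        item = [row[i] for row in responses]
        # Rest score: total excluding the item, to avoid inflating the correlation
        rest = [t - x for t, x in zip(totals, item)]
        stats.append((mean(item), pearson(item, rest)))
    return stats


# Hypothetical responses: 4 examinees x 3 dichotomous items
item_stats = ctt_item_stats([[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]])
```

Note that a higher CTT difficulty value means an easier item, the opposite direction to the IRT b parameter, so correlations between the two are expected to be negative.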

DOI: 10.12973/eu-jer.12.2.1097
Pages: 1097-1107
Article Metrics: Views 456, Downloads 1227, Citations 2 (Crossref), 1 (Scopus)

Course Dropout Intention Scale: Development and Validation of a New Brief Measure in Academic College Context

brief measure college student course dropout dropout intention dropout studies

Daniel E. Yupanqui-Lorenzo , Lizbeth Angela Jara-Osorio , Carlos Carbajal-León , Tomás Caycho-Rodríguez , Manuel Antonio Cardoza Sernaqué , Kerly Stefanny Duran Quispe


...

University students may encounter situations where they perform poorly in a course and contemplate dropping out. This intention to drop out of a course manifests not only in thoughts or ideas but also in a cognitive self-evaluation of their performance and skills, enabling them to reflect on the possibility of dropping out. In this sense, there is a shortage of instruments that evaluate the intention to drop out of a course, so the aim was to develop and validate the Course Dropout Intention Scale (CDIS). Data from two samples (N1 = 198; N2 = 675) were used; the first was for the EFA, and the second was for the CFA, GRM, and SEM. The one-factor model was derived from the EFA and confirmed in the second sample, exhibiting appropriate goodness-of-fit indices. Similarly, the GRM obtained adequate fit indices; all items discriminated adequately, and the difficulty parameter had a monotonic increase. The SEM model of the effect of satisfaction with studies on the CDIS showed a negative and statistically significant effect. Thus, it was demonstrated that the CDIS is a robust instrument in its psychometric properties and empirical evidence with other variables.

DOI: 10.12973/eu-jer.13.1.103
Pages: 103-113
Article Metrics: Views 547, Downloads 1110, Citations 0 (Crossref), 0 (Scopus)

Developing Creative Thinking in Preschool Children: A Comprehensive Review of Innovative

comprehensive review creative thinking early childhood

Novita Eka Nurjanah , Elindra Yetti , Mohamad Syarif Sumantri


...

The ability to think creatively has a vital role in the development of preschool children. This research provides a comprehensive review of innovative approaches and strategies for developing creative thinking in preschool children based on current trends and methodologies used in educational settings. The review covers three significant areas: (a) creative thinking skills in preschool children, (b) an in-depth look at the factors influencing creative thinking skills, and (c) innovative strategies and approaches to stimulate creative thinking abilities in preschool children. This research uses a literature study method assisted by the Publish or Perish application to find reference sources related to creative thinking abilities in preschool children. Studies show that creative thinking abilities in preschool children enable them to find innovative solutions, help them adapt to challenges, foster self-confidence and courage, and enrich their experience and knowledge of the world around them. Meanwhile, preschool children's creative thinking abilities are influenced by collaboration from the external environment (parents, teachers, and society); providing support and examples for children to develop and stimulate their creative thinking skills is very important.

 

DOI: 10.12973/eu-jer.13.3.1303
Pages: 1303-1319
Article Metrics: Views 493, Downloads 1730, Citations 0 (Crossref), 0 (Scopus)

...