The Spanish Version of the Teacher Efficacy for Inclusive Practice (TEIP) Scale: Adaptation and Psychometric Properties

The Teacher Efficacy for Inclusive Practice (TEIP) scale is an instrument created by Sharma et al. to assess teachers' efficacy for instruction in inclusive settings. Despite its increasing use, the TEIP has not been validated with a Spanish teacher population. The aims of this study were to: (1) analyze the psychometric properties and factor structure of the TEIP scale in a sample of Spanish preservice teachers (N = 475; 80% female, 20% male), and (2) examine the level of self-efficacy for inclusive practices that teachers experience when they graduate from their training programs. Exploratory and confirmatory factor analyses indicate that a Spanish version (TEIP-ES) consisting of 15 items with a three-factor structure explains 64.65 percent of total variance. Item-total correlations ranged from .574 to .715, and factor loadings from .521 to .774. Convergent validity with measures of quality of teacher education (TE) programs and self-report of preparedness to teach in inclusive settings was good. In contrast, self-efficacy for inclusive practices was rated moderately low. Overall, these findings support the construct and convergent validity of the TEIP-ES and suggest that it is a useful instrument to measure self-efficacy for inclusion in Spanish preservice teacher populations. This manuscript reports the findings, discusses the implications for the improvement of TE programs, and suggests possible avenues for future related research.


Introduction
The movement towards inclusion is having an important impact on education systems worldwide (UNESCO [United Nations Educational, Scientific, and Cultural Organization], 1994, 2009). What began as a trend to educate students with disabilities in regular instead of special schools now seeks to establish an educational system responsive to the needs of all students, particularly those who require more educational support than usual. The right to an inclusive education has been encouraged internationally by the United Nations (2006), with many governments in recent decades introducing legislation to promote more inclusive educational systems. Yet, despite the changes in national policies, the necessary developments in classroom practices do not seem to have been achieved. Several studies indicate that while the majority of teachers are highly committed to the principles of equity, social justice, and inclusion, many of them feel unprepared to teach in today's diverse classrooms, expressing concerns with respect to their ability to teach all students (Cardona-Molto, 2009; Scruggs & Mastropieri, 1996; de Boer et al., 2011; Zagona et al., 2017). Teachers' lack of knowledge, skills, support, and resources impacts the implementation of inclusive practices and can also have a negative effect on attitudes (Chiner & Cardona-Molto, 2013; Forlin et al., 2010; Hecht et al., 2017). Preservice teacher preparation is the best time not only to address concerns, attitudes, and perceptions about diversity and inclusion, but also to develop the diversity competencies teachers need to succeed in today's diverse inclusive classrooms (Ball & Tyson, 2011).

Teacher Education for Inclusion
Given the current trend towards inclusion, all prospective teachers must be prepared to teach in highly diverse inclusive settings. This means that they need to acquire competence in a wide variety of areas during their preparation period. In practice, however, although many teacher education programs identify themselves as inclusive, their quality and emphasis on inclusion differ considerably (Cardona-Molto et al., 2018; Florian & Rouse, 2009; Kim, 2011; Tirri & Laine, 2017). Teacher education programs differ not only with respect to their curricula, but also in their assumptions about the type of teachers they desire to train. Although some programs require their students to take a specified number of special education credits in addition to core general education courses, others have combined the curricula (regular and special). As a result, teachers with the same degree can graduate with different levels of preparedness and, consequently, with varying levels of knowledge, attitudes, and perceptions of self-efficacy about teaching in inclusive settings (Kim, 2011).
In the European Union, the European Agency for Special Needs and Inclusive Education (EASNIE) (2013) addressed the question of how general education teachers are prepared through their initial education to be 'inclusive.' The question is not easy to answer. In Europe, not all countries use the term inclusion in the same way. In Sweden, 'school for all' is the terminology used to refer to inclusive education, while in Spain it is 'attention to diversity.' Other countries simply continue to use the term 'integration.' These differences represent a real challenge both for researchers and for those involved in teacher preparation programs (Florian & Rouse, 2009). In addition, although there is common ground regarding requirements and routes of preparation to enter the teaching profession across Europe, in practice, the competencies and standards related to inclusion that have been established differ (EASNIE, 2013). Overall, the key areas viewed as crucial to be able to teach inclusively are: (1) collaboration with professionals and parents; (2) capacity to value diversity and address cultural, social, linguistic, and academic needs in regular education settings; (3) competence in the use of a variety of 'inclusive' teaching methods and learning approaches (e.g., differentiated instruction); and (4) ability and skills to plan curricula and content that engage all learners (Bhroin & King, 2020; Florian & Black-Hawkins, 2011).
The conclusion seems clear: despite efforts to design more inclusive TE programs that better prepare prospective teachers to provide instruction in diverse settings, it remains uncertain how many programs achieve this goal and meet the training needs of undergraduates (Acedo, 2010; Kim, 2011). The sense of self-efficacy that teachers possess with respect to implementing inclusive education is therefore key if they are to commit to inclusive practices and make an effort to support this process.

Teachers' Self-Efficacy Beliefs for Inclusion
Teacher self-efficacy is the belief that a teacher holds about their own ability to effectively manage the tasks related to their professional activity. The concept of self-efficacy is derived from Bandura's social learning theory (Bandura, 1997), which assumes that people learn through observation and that the sense of self-efficacy affects emotions, thoughts, and behaviors. Therefore, individuals' expectations about their own efficacy determine whether a given behavior will be initiated, maintained, and executed (Bandura, 1997). In the field of inclusive education, self-efficacy translates into better work management and engagement, as well as greater security in one's ability to teach all students, including those with special educational needs. Teacher self-efficacy, according to Bandura (1997), is a context-specific construct that occurs within the boundaries of a particular situation (see Bulut & Topdemir, 2018; Mutlu et al., 2019), but is at the same time a multidimensional construct. The number of components identified ranges from three to six, including teacher instruction, classroom management, motivating and engaging students, and collaboration with teachers and parents (Bandura, 1997; Klassen et al., 2011; Skaalvik & Skaalvik, 2007). The literature on teachers' self-efficacy consistently reveals that perceived self-efficacy is positively related to high quality instructional processes, student achievement, and teachers' well-being (Akbari et al., 2009; Ashton & Webb, 1986; Brownell & Pajares, 1999; Zee & Koomen, 2016). Teachers with high self-efficacy tend to believe that they have the ability to make a difference in student achievement and trust their students' abilities significantly more than those with low self-efficacy. This belief translates into a higher level of planning and organization, engagement in instructional tasks, and willingness to follow effective teaching methods to better meet students' educational needs (Mergler & Tangen, 2010; Tschannen-Moran & Hoy, 2001).
A wide range of variables can have a significant impact on preservice teacher self-efficacy. For the purpose of inclusive education, some of the most notable are: years of training, curriculum content, gender, and degree. Romi and Leyser (2006) conducted a study with 1155 preservice teachers in Israel and found that student teachers in the third and fourth year of study had significantly higher levels of perceived self-efficacy than those in the first and second year. Curriculum content was also found to be significantly related to preservice teachers' self-efficacy for inclusive practices, as reported by Lancaster and Bain (2010), who embedded inclusion-related content in teacher education programs. Similarly, Brown et al. (2008) incorporated special education components into the programs, resulting in a rise in preservice teachers' self-efficacy regarding inclusion. Studies by Erdem and Demirel (2007), Romi and Leyser (2006), and Leyser et al. (2011) found that preservice female teachers expressed a higher degree of perceived teaching efficacy than their male counterparts. In addition, the work of Woodcock (2011) and Forlin et al. (2010) revealed that preservice teachers at the secondary level had a lower level of self-efficacy than elementary preservice teachers.
Teacher self-efficacy has been shown to be a powerful predictor of attitudes towards inclusive education. Several studies found positive correlations between teacher self-efficacy and inclusive attitudes with r values of .40 or above for in-service teachers (Kuittinen, 2017; Savolainen et al., 2012; Yada & Savolainen, 2017), and values of .33 or less for preservice teachers (Ashan et al., 2012; Hecht et al., 2017; Kim, 2011; Malinen et al., 2013; Saloviita, 2015). Kim (2011) examined the attitudes and self-efficacy of 110 student teachers from ten US institutions with combined, separate, and general TE programs and found that preservice teachers with high levels of self-efficacy had better attitudes and dispositions regarding inclusive education. Preservice teachers from combined programs had significantly more positive attitudes toward inclusion than those from separate and general TE programs. However, there were no significant differences in self-efficacy among preservice teachers from these types of teacher preparation programs. Ashan et al. (2012), in a study involving 1623 preservice teachers from 16 training institutions in Bangladesh, found that preservice teachers with high levels of self-efficacy had more positive attitudes towards inclusive education and lower levels of concern about this practice. Length of training, gender, interaction with students with disabilities, and knowledge about inclusion were also significantly related to teacher self-efficacy and attitudes towards inclusive education (Ashan et al., 2012; Leyser et al., 2011). In another study, with 552 preservice teachers from three Chinese universities, Malinen et al. (2013) identified a strong association between teacher self-efficacy and attitudes, with results suggesting that teachers who feel more capable of teaching in a classroom with diverse learners have more positive perceptions of and attitudes towards inclusion. More recently, Hecht et al. (2017), in an exploratory comparative study of 221 Italian and 143 Austrian secondary school preservice teachers, found that in both groups attitudes and teacher self-efficacy were high, with the Italian sample scoring higher than the Austrian. Attitudes and self-efficacy for inclusive practices correlated significantly with efficacy in inclusive instruction and collaboration, but not with efficacy in managing behavior.
All the above studies underline the importance of studying self-efficacy and its correlates as key elements in determining the success or failure of teachers in effectively implementing inclusive practices.

Research on Teacher-Efficacy for Inclusive Practices using the TEIP scale
Until very recently, self-efficacy research relied on general teacher efficacy scales (e.g., Bandura, 1997; Gibson & Dembo, 1984; Schwarzer et al., 1999; Skaalvik & Skaalvik, 2007; Tschannen-Moran & Hoy, 2001) due to: (a) the difficulty of developing a measurement tool able to capture the essential facets of inclusive education self-efficacy adequately, and (b) the absence of specific instruments to assess teacher efficacy for inclusive practices. Sharma et al. (2012) were the first authors to develop a specific instrument, the Teacher Efficacy for Inclusive Practice (TEIP) scale, designed to investigate teacher self-efficacy in the context of inclusion. Based on the results of an exploratory factor analysis (EFA) and using data collected in Canada, Australia, Hong Kong, and India, they identified three factors with 18 out of the 20 items that made up the initial version of the scale: (1) Efficacy in Using Inclusive Instruction (EII), (2) Efficacy in Collaboration (EC), and (3) Efficacy in Managing Behaviour (EMB), which showed high internal consistency (Cronbach's Alpha = .93, .85, and .85, respectively). In another study, Loreman and colleagues (Loreman et al., 2013) used the TEIP to explore differences in self-efficacy for inclusive practices as a function of demographic variables in a sample of 380 preservice teachers from Canada, Australia, Hong Kong, and Indonesia. Once again, the TEIP showed good internal consistency (Cronbach's Alpha = .89 for the total scale), and content and discriminant validity, in spite of the differences between countries.
Numerous studies using confirmatory factor analysis (CFA) have also provided support for the factor structure of the TEIP (e.g., Aiello et al., 2017; Alnahdi, 2019; Malinen et al., 2013; Park et al., 2016; Savolainen et al., 2012; Tanriverdi & Ozokcu, 2018). Malinen et al. (2013) tested the factor structure of the scale with 550 Mainland Chinese student teachers. Their findings gave support to the three correlated but separate factors of self-efficacy, showing again a high reliability (alpha coefficients of .90 for the whole scale, and coefficients from .75 to .85 for the three subscales). Correlations between factors were moderate to strong (.53 to .60). Savolainen et al. (2012) studied the self-efficacy of Finnish and South African elementary and secondary in-service teachers. They found the three expected factors (EII, EC, and EMB), but two items (Item 12 "I can make my expectations clear about student behavior", and Item 6 "I am confident in my ability to get students to work together in pairs or in small groups") were eliminated because their loadings on the respective factors were low. The same TEIP factor structure was found in the sample from South Africa except for one item ("Designing individualized learning tasks") that cross-loaded on two factors. Reliability of the scale was good in both countries (Finland alpha = .88; South Africa alpha = .91). Park et al. (2016) also examined the TEIP scale for dimensionality with 134 US preservice teachers from a regional university in Kentucky in the context of interdisciplinary early childhood education. The authors found that the TEIP is basically a unidimensional scale composed of one dominant latent factor and the originally found three specific factors that represent unique dimensions of the dominant factor, teacher self-efficacy for inclusive practices. Findings also showed that the TEIP is a reliable instrument (alpha = .97 for total scale, and .93, .95, and .94 for each factor, respectively).
Furthermore, TEIP scores correlated highly with attitudes toward inclusion and other demographic variables including gender, grades, students' plan to teach, and experience with people with disabilities as they leave their respective faculties of education (Specht et al., 2016) suggesting that: (a) female pre-service teacher trainees compared to their male counterparts, and (b) trainees planning to teach in lower versus higher grades had more favorable attitudes about inclusive education.
Recently, Alnahdi (2019), using CFA and Rasch validation procedures, examined the psychometric properties of the Arabic version of the TEIP with a sample of in-service and preservice teachers in Saudi Arabia. The author found evidence that the 18-item Arabic version of the TEIP preserves the three-factor structure of the original and showed good internal consistency (alpha > .80). TEIP scale validation studies have been done in a variety of other languages including Turkish (Tanriverdi & Ozokcu, 2018), Japanese (Yoshitoshi, 2014), Portuguese (Martins & Chacon, 2020), Polish (Narkun & Smogorzewska, 2019), Italian, and German (Hecht et al., 2017). All the above studies provide support for the construct validity of the TEIP as well as evidence that its items are equally appropriate for implementation in different languages (e.g., English, Chinese, Japanese, Finnish, French, Italian, German, Polish, Portuguese, Turkish, Arabic) and cultural contexts (e.g., the United States, Canada, Australia, Finland, South Africa, China, Japan, Italy, Austria, Poland, Brazil, Turkey, Saudi Arabia). Despite a large body of research supporting its use, the TEIP scale has not been adapted and validated in languages other than those mentioned. To date, there is no Spanish version suitable for use with Spanish-speaking teachers, despite the relevance of the inclusion movement within Spanish communities. Throughout Europe, and particularly in Spain, the rapid rise in the number of education programs and degrees that have recently been remodeled in order to adapt to the Bologna Process has raised questions about how well these programs are preparing preservice teachers to work effectively with diverse learners in inclusive settings (Spanish Ministry of Education, Culture, and Science, 2007).
In addition, given the increasing use of the TEIP scale to analyze the readiness of teachers to teach inclusively (Alnahdi, 2019; Martins & Chacon, 2020; Narkun & Smogorzewska, 2019; Specht et al., 2016; Tanriverdi & Ozokcu, 2018; Yada & Savolainen, 2017), it is crucial to provide rigorous scale cross-validation data. Countries may have much to learn from one another, and comparisons of differences that may exist between them can generate awareness of issues that need to be addressed.

Research Goal
The purpose of the present study was to test the psychometric properties of the Spanish adaptation of the TEIP scale with undergraduate student teachers, and to ascertain: (1) the factor structure (the three-factor structure identified in previous research), reliability, and convergent validity of the TEIP-ES; and (2) the level of self-efficacy of student teachers when they graduate. Two research questions guided the study:
Research Question 1: What is the underlying factor structure, reliability, and convergent validity of the Spanish version of the TEIP scale?
Research Question 2: What level of self-efficacy for inclusive practices do student teachers have when they graduate? Does this level differ across undergraduate degrees?

Participants and Context
The study took place in an urban teacher education institution in the Valencian Community, Spain, chosen for its accessibility for data collection. The urban setting contains only one public institution of higher education with TE programs within the province limits, with approximately 24625 students. The college of education at this institution had a total student enrollment of 3426 undergraduates (27% males and 73% females), 99% Spaniards, majoring in Early Childhood, Elementary, and Physical Education (UA, 2017). The present study included student teachers enrolled in two of the three accredited teacher preparation programs, namely those seeking initial licensure in Early Childhood and Elementary Education. For initial certification, a teacher at this university must complete a 4-year undergraduate program composed of general subject studies, studies on specific didactics, practicum, and electives plus a final project. Emphasis is placed on diversity and inclusion, but there is only one compulsory 9-credit course on special needs and inclusion in the Early Childhood Education program, and two courses, 6 credits each, in the Elementary Education program related to students with special educational needs.
Convenience sampling was used to select students seeking a degree in Early Childhood Education and in Elementary Education. It was decided to focus on third year students because those in their fourth year spend most of their time off campus at practicum sites. To guarantee the representativeness of the sample, all third year students of these two programs were invited to participate. Data were gathered during the Spring term of the 2016-2017 academic year from a potential pool of 707 third year early childhood and elementary student teachers from both study programs. A total of 475 student teachers representing 70.68% of the cohort completed the survey (70.60% Early Childhood and 70.76% Elementary Education). The average age of the participants was 22.19 years (SD = 3.68, range 20-52) for the entire sample, and 22.69 (SD = 3.73, range 20-41) and 21.77 (SD = 3.60, range 20-52) for Early Childhood and Elementary Education student teachers, respectively. Nineteen percent (19.40%) were male and 80.60% female. Linguistically, the sample was diverse: only 12.20% of the participants were monolingual while the other 87.80% were bilingual or multilingual. Ninety-seven percent were full-time students.

Data Collection
A Spanish translation of the original 20-item TEIP scale (TEIP-ES) was used in this study. The original TEIP scale assesses, in a self-report format, the self-efficacy of preservice teachers with respect to implementing inclusive practices. The Spanish translation and adaptation of the TEIP scale was created following standard test adaptation guidelines (Hambleton, 2005; International Test Commission, 2005). Due to our concern that the original TEIP scale does not provide a full measure of teachers' self-efficacy for inclusive practices (the scale refers specifically to students with disabilities), we changed the wording from 'students with disabilities' to 'students with special educational needs' (SEN), in line with the Spanish policy on inclusion (Spanish Government, 2006, 2013) and the worldwide movement of education for all (UNESCO, 1994). During the translation process, TEIP items were first translated from English into Spanish by the first author, who is a native Spanish speaker; then the items were translated back into English by two bilingual native English-Spanish speaking teacher educators. The original TEIP and the back-translated TEIP-ES items were then compared. Finally, the translated version was reviewed by six experts in inclusive education, educational measurement, and curriculum to assess item content validity, based on their professional experience. The 20-item Spanish version of the TEIP, which included issues of assessment, classroom management, instruction, working in teams, and professional issues, was answered using a six-point Likert scale: 1 (Strongly Disagree), 2 (Disagree), 3 (Somewhat Disagree), 4 (Agree Somewhat), 5 (Agree), and 6 (Strongly Agree). A high score (close to 6) indicates a strong sense of self-efficacy for teaching inclusively. The translated Spanish version of the original 20-item TEIP scale is included in Appendix 1.
The survey instrument, in addition to the TEIP-ES scale, included three additional items designed to assess convergent validity. These one-item measures asked student teachers to indicate an overall rating (1 to 6) of their perception of: (1) TE program commitment to diversity and inclusion, (2) the opportunity they had during course work to learn to teach inclusively, and (3) preparedness to teach in diverse inclusive settings after graduation. A score of 1 was indicative of a strong feeling of unpreparedness to teach in inclusive environments, a lack of opportunity to learn to teach inclusively, and a lack of TE program commitment to diversity teaching, while a score of 6 reflected just the opposite.
The TEIP-ES version was delivered to all students present in class on the day the survey was administered after obtaining permission from instructors. Participants were informed about the voluntary nature of their participation and, after signing the consent form, were asked to complete the survey instrument anonymously. They received no incentive for taking part in the study. Students who did not want to participate returned blank surveys. The survey took no more than 10 minutes to complete. The study was deemed exempt from review by the university Ethics Committee.
Data Analysis
Data analysis entailed several phases. First, SPSS version 24.0 was used for descriptive analyses, reliability, construct and convergent validity, and comparison of means. Reliability was estimated via Cronbach's Alpha coefficient. The factor structure of the scale was checked through an EFA (Exploratory Factor Analysis) using the method of principal components with Promax oblique rotation. Convergent validity was examined by assessing the association (Pearson correlation coefficients) between the TEIP-ES and participants' overall ratings of self-competence for inclusion, opportunity to learn to teach inclusively, and perception of TE program commitment to diversity. Finally, to compare respondents' self-efficacy means by degree, a series of t-tests for independent samples was performed. Second, to substantiate the construct validity of the TEIP-ES, a CFA was run using AMOS version 22. Due to the continuous nature and normal distribution of the data, the Maximum Likelihood procedure was used. To assess model fit, the following indexes were calculated: (1) the Chi-square statistic (χ²) and the root mean square error of approximation (RMSEA) as absolute measures of fit; (2) the Tucker-Lewis index (TLI) and the comparative fit index (CFI) as incremental fit measures; and (3) the χ²/df ratio as a measure of parsimonious fit.
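As an illustration of the reliability estimate mentioned above, the following is a minimal sketch of Cronbach's Alpha computed from a respondents-by-items score matrix. It is not the SPSS procedure used in the study; the function name `cronbach_alpha` and the example responses are hypothetical and are not study data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's Alpha for a list of respondents, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(rows[0])
    if k < 2:
        raise ValueError("at least two items are required")
    items = list(zip(*rows))  # transpose: one tuple of scores per item
    item_variance_sum = sum(variance(col) for col in items)
    total_variance = variance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical responses on the 1-6 agreement scale (illustration only).
responses = [
    [5, 5, 4, 5],
    [3, 4, 3, 3],
    [6, 5, 6, 6],
    [2, 3, 2, 2],
    [4, 4, 5, 4],
]
alpha = cronbach_alpha(responses)
print(f"Cronbach's Alpha = {alpha:.3f}")
```

The sketch uses sample variances (n - 1 in the denominator) for both the items and the total score, which is the usual convention; any consistent choice gives the same ratio.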
Adequacy of the hypothesized model was checked using the following cutoff criteria. For the χ²/df ratio, a value of ≤ 2 indicates a good fit, and a value of ≤ 3 an acceptable fit. For RMSEA, values of less than 0.05 reflect a close fit and values up to 0.08 indicate reasonable approximation errors (Browne & Cudeck, 1993). CFI and TLI are considered appropriate with values ≥ 0.90. However, Hu and Bentler (1999) call for more rigorous cutoff criteria for goodness of fit indexes, such as 0.95 for CFI and TLI, and 0.06 for RMSEA.
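To make the fit indexes concrete, the sketch below computes χ²/df, RMSEA, CFI, and TLI from the chi-square values of a fitted model and of the baseline (independence) model, using commonly cited textbook formulas. Software packages differ in details (for example, N versus N - 1 in the RMSEA denominator), so these functions are an assumption-laden illustration and not a reproduction of AMOS output. All input values are hypothetical.

```python
import math


def rmsea(chi2, df, n):
    """RMSEA in one common form, using N - 1 in the denominator."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_m, df_m, chi2_b, df_b):
    """Comparative fit index from model (m) and baseline (b) chi-squares."""
    d_model = max(chi2_m - df_m, 0.0)
    d_base = max(chi2_b - df_b, d_model)
    return 1.0 if d_base == 0 else 1.0 - d_model / d_base


def tli(chi2_m, df_m, chi2_b, df_b):
    """Tucker-Lewis index from model (m) and baseline (b) chi-squares."""
    base_ratio = chi2_b / df_b
    return (base_ratio - chi2_m / df_m) / (base_ratio - 1.0)


def fit_summary(chi2_m, df_m, chi2_b, df_b, n):
    """Collect the four indexes in a dictionary."""
    return {
        "chi2/df": chi2_m / df_m,
        "RMSEA": rmsea(chi2_m, df_m, n),
        "CFI": cfi(chi2_m, df_m, chi2_b, df_b),
        "TLI": tli(chi2_m, df_m, chi2_b, df_b),
    }


def meets_strict_cutoffs(summary):
    """Stricter criteria attributed to Hu and Bentler (1999) in the text."""
    return (summary["CFI"] >= 0.95 and summary["TLI"] >= 0.95
            and summary["RMSEA"] <= 0.06)


# Hypothetical chi-square values (not the study's results).
summary = fit_summary(chi2_m=180.0, df_m=87, chi2_b=3000.0, df_b=105, n=475)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
print("meets stricter cutoffs:", meets_strict_cutoffs(summary))
```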

Research Question 1: What Is the Underlying Factor Structure of the Spanish Version of the TEIP Scale?
Exploratory Factor Analysis. The factor structure of the 20-item version of the TEIP-ES was examined first through an EFA. Before proceeding with the analysis, data were explored to check suitability for factor analysis. The indicators of sample suitability were optimal: the Kaiser-Meyer-Olkin (KMO) value for sampling adequacy was .932, indicating that there was a considerable proportion of common variance and that the analysis of principal components was viable. Bartlett's test of sphericity was also highly significant (p < .001), showing that there was systematic covariance between the items that make up the Spanish version of the TEIP. The EFA initially showed a single dimension, with all the items collapsing onto the same factor. After inspection of the scree plot, it was decided that the optimal solution would be that of three components with individual eigenvalues higher than one. An EFA of the correlation matrix using the principal component method (Promax oblique rotation) was then performed, looking for the clearest possible association of each of the variables with the corresponding factor.
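For readers unfamiliar with Bartlett's test of sphericity, the sketch below shows how its statistic is commonly computed from the item correlation matrix R: χ² = -(n - 1 - (2p + 5)/6) ln|R| with p(p - 1)/2 degrees of freedom, where n is the number of respondents and p the number of items. This is a plain-Python illustration, not the SPSS routine used in the study; the helper names and the example data are hypothetical, and the p-value (which requires the chi-square distribution) is omitted.

```python
import math


def correlation_matrix(rows):
    """Pearson correlation matrix of the columns (items) of rows."""
    n = len(rows)
    cols = list(zip(*rows))
    devs = []
    for col in cols:
        mean = sum(col) / n
        devs.append([x - mean for x in col])
    norms = [math.sqrt(sum(d * d for d in dev)) for dev in devs]
    p = len(cols)
    return [[sum(a * b for a, b in zip(devs[i], devs[j])) / (norms[i] * norms[j])
             for j in range(p)] for i in range(p)]


def log_det(matrix):
    """Log of |det| via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    size = len(a)
    total = 0.0
    for c in range(size):
        pivot = max(range(c, size), key=lambda r: abs(a[r][c]))
        if abs(a[pivot][c]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        a[c], a[pivot] = a[pivot], a[c]
        total += math.log(abs(a[c][c]))
        for r in range(c + 1, size):
            factor = a[r][c] / a[c][c]
            for k in range(c, size):
                a[r][k] -= factor * a[c][k]
    return total


def bartlett_sphericity(rows):
    """Return (chi-square statistic, degrees of freedom)."""
    n, p = len(rows), len(rows[0])
    chi2 = -(n - 1 - (2 * p + 5) / 6) * log_det(correlation_matrix(rows))
    return chi2, p * (p - 1) // 2


# Hypothetical responses (6 respondents, 3 items); not study data.
data = [[5, 5, 4], [3, 4, 3], [6, 5, 6], [2, 3, 2], [4, 4, 5], [5, 6, 4]]
chi2, df = bartlett_sphericity(data)
print(f"Bartlett chi-square = {chi2:.2f}, df = {df}")
```

A large statistic relative to its degrees of freedom indicates that the correlation matrix departs from an identity matrix, which is the condition the text describes as systematic covariance between the items.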
Based on the initial EFA results, Item 1 ("I can use a variety of assessment strategies") and Item 14 ("I can improve the learning of a student who is failing") were eliminated because they did not fit the original model well. The EFA was conducted again with the remaining 18 items. Results are shown in Table 1. As can be seen, all items grouped around three dimensions without cross-loadings. The three dimensions explained 61.52% of the variance. Analysis of the items indicated that dimension 1 was most closely associated with Efficacy in Managing Classroom Behavior (EMB), dimension 2 with Efficacy in Implementing Inclusive Instruction (EII), and dimension 3 with Efficacy in Collaboration (EC).
.806 3. I am confident in designing learning tasks so that the individual needs of students with special support needs are accommodated.
.805 2. I am able to provide an alternate explanation or example when students are confused.
.722 5. I can provide appropriate challenges for very capable students.
.705 6. I am confident in my ability to get students to work together in pairs or in small groups.
.639 20. I am confident in adapting school-wide or state-wide assessment so that students with all special needs can be assessed.

SEN = Special Educational Needs
We also analyzed the pattern of the TEIP-ES factor structure as a function of participants' degree through disaggregating the sample by subgroups. For this purpose, an EFA was performed separately for early childhood and elementary student teachers. By means of this analysis, we could confirm that the instrument functions similarly for each group. More specifically, the psychometric properties of the TEIP-ES do not vary for pre-service trainees in early childhood and elementary education. The EFA extracted the same three factors identified previously in the whole sample of participant teachers. The KMO = .92 and .93 for early childhood and elementary preservice teachers, respectively, supported again an adequate structure of the instrument explaining 62.27% and 61.91% of the total variance.

Confirmatory Factor Analyses
The first model tested was the 18-item unidimensional model configured after the EFA was performed. Table 2 shows the estimates of goodness of fit. As shown in the table, RMSEA (.111) suggested poor model fit, and CFI (.845) and TLI (.842) were considered unsatisfactory. An inspection of the modification indices showed that model fit could be improved by eliminating Items 12, 13, and 19. Factor loadings for the original model ranged from .48 to .87. The revised model, composed of 15 items, resulted in better goodness of fit (see parameters in Table 2) as well as better internal consistency. Removing Items 12, 13, and 19 produced a model with a significantly lower Chi-square estimate (4.17) and RMSEA (.090) that indicated acceptable model fit. Additionally, CFI (.864) and TLI (.861) were in the range of appropriate model fit. The factor loadings of the revised model ranged from .76 to .88. These results revealed that a three-factor model is confirmed. The first dimension obtained with CFA has five items (Item 7, Item 8, Item 9, Item 10, and Item 11) and corresponds to Efficacy in Managing Classroom Behavior. It covers statements such as "I can control disruptive behavior in the classroom" (Item 8), "I am able to get children to follow classroom rules" (Item 10), and "I am confident when dealing with students who are physically aggressive" (Item 11). The second dimension involves six items (Item 2, Item 3, Item 4, Item 5, Item 6, and Item 20) for Efficacy in Using Inclusive Instruction and covers statements such as "I am confident in designing learning tasks to accommodate individual needs" (Item 3), "I am confident in my ability to get students to work together in pairs or in small groups" (Item 6), and "I am confident in adapting school-wide or state-wide assessment" (Item 20).
The third dimension, Efficacy in Collaboration, has four items (Item 15, Item 16, Item 17, and Item 18) covering statements like "I am able to work jointly with other professionals and staff to teach students with SEN in the classroom" (Item 15), and "I am confident in my ability to get parents involved in school activities of their children with SEN" (Item 16).
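The item-to-dimension assignment described above can be sketched as a simple scoring routine. This is only an illustrative sketch: it assumes that subscale and total scores are computed as item means on the 1-6 response scale, and the function name, dictionary keys, and example ratings are hypothetical rather than taken from the study.

```python
from statistics import mean

# Item-to-subscale mapping as reported for the 15-item TEIP-ES
# (original TEIP item numbers; Items 12, 13, and 19 were removed).
SUBSCALES = {
    "EMB": [7, 8, 9, 10, 11],    # Efficacy in Managing Classroom Behavior
    "EII": [2, 3, 4, 5, 6, 20],  # Efficacy in Using Inclusive Instruction
    "EC": [15, 16, 17, 18],      # Efficacy in Collaboration
}


def score_teip_es(responses):
    """Return subscale means and the total mean for one respondent.

    `responses` maps original item numbers to ratings on the 1-6 scale.
    Mean scoring is an assumption made for this sketch.
    """
    scores = {
        name: mean(responses[item] for item in items)
        for name, items in SUBSCALES.items()
    }
    all_items = [item for items in SUBSCALES.values() for item in items]
    scores["total"] = mean(responses[item] for item in all_items)
    return scores


# Hypothetical respondent (illustrative ratings, not study data).
example = {item: 4 for item in [2, 3, 4, 5, 6, 20, 15, 16, 17, 18]}
example.update({7: 3, 8: 4, 9: 5, 10: 4, 11: 4})
result = score_teip_es(example)
print(result)
```

Keeping the original item numbers as keys makes it easier to compare scores with studies that used the full 18-item TEIP.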
Next, the three-factor model was tested for both subsamples (early childhood and elementary student teachers) using multi-group CFA, and a better fit to the data was found, particularly in the sample of early childhood students (Table 3). The RMSEA values obtained for both groups were satisfactory, whereas CFI and TLI were close to acceptable, with a relatively worse fit in the case of elementary students. We then tested configural, weak, and strong measurement invariance for the two groups. The ΔCFI between the constrained and the unconstrained models was below .01, indicating that strong invariance was supported according to the recommendations of Cheung and Rensvold (2002). Measurement equivalence was therefore supported across both groups of students. Correlations between the three subscales were positive and statistically significant (r = .83 between EMB and EC, p < .01; r = .90 between EMB and EII, p < .01; and r = .74 between EC and EII, p < .01), coefficients that according to Cohen (1988) can be considered strong correlations.
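The ΔCFI decision rule referred to above can be illustrated with a short sketch. The CFI values used below are hypothetical placeholders, since the CFI of each constrained model is not reported in this section; only the .01 cutoff comes from the text.

```python
def supports_invariance(cfi_unconstrained, cfi_constrained, threshold=0.01):
    """Apply the delta-CFI rule of thumb to two nested multi-group models.

    Returns the drop in CFI and whether it stays below the threshold,
    in which case the more constrained model is retained.
    """
    delta_cfi = cfi_unconstrained - cfi_constrained
    return delta_cfi, delta_cfi < threshold


# Hypothetical CFI values for a less and a more constrained model.
delta, supported = supports_invariance(0.864, 0.858)
print(f"delta CFI = {delta:.3f}, invariance supported: {supported}")

# A larger (hypothetical) drop would not support invariance.
delta_bad, supported_bad = supports_invariance(0.864, 0.840)
print(f"delta CFI = {delta_bad:.3f}, invariance supported: {supported_bad}")
```

In practice the comparison is repeated at each step (configural vs. weak, weak vs. strong), so the function would be called once per pair of nested models.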

Convergent Validity
We assessed the convergent validity of the instrument by analyzing the relationships between the TEIP-ES scores (entire scale and subscales) and other related constructs: (1) respondents' perception of TE program commitment to diversity and inclusion, (2) perception of opportunity to learn to teach in inclusive settings, and (3) self-perceived preparedness to teach in diverse inclusive classrooms. The correlations are shown in Table 4. Considering the scale as a whole, the highest correlations were observed between the TEIP-ES (total scores) and perception of opportunity to learn to teach inclusively (r = .57, p < .01), and between the TEIP-ES and self-reported preparedness to teach in diverse inclusive settings (r = .47, p < .01). The association between teacher efficacy and ratings of program commitment with regard to diversity, equity, and inclusion, although smaller than the other two, was also positive and statistically significant (r = .23, p < .01). By subscales, the results showed significant positive correlations between the three TEIP-ES factors and participants' perception of opportunity to learn to teach inclusively, self-perception of preparedness for inclusion, and perception of TE program commitment to diversity and equity (see Table 4).
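As an illustration of how such convergent validity coefficients are obtained, the following sketch computes a Pearson correlation between TEIP-ES total scores and a criterion rating. The data are invented for the example and do not reproduce the study's values; the variable names are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


# Hypothetical scores for six respondents (not study data).
teip_total = [3.2, 4.1, 4.5, 3.8, 5.0, 4.4]   # TEIP-ES total mean scores
opportunity = [2, 3, 4, 3, 5, 3]              # perceived opportunity to learn

r = pearson_r(teip_total, opportunity)
print(f"r = {r:.2f}")
```

The same function can be applied to each subscale score in turn to build a correlation table of the kind summarized in Table 4.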

Research Question 2: What Level of Self-Efficacy for Inclusive Practices Do Respondents Have?
Overall, the mean of total scores on the TEIP-ES was 4.22 (SD = 0.73) out of a maximum possible score of 6 (scale midpoint of 3.50). For the entire group of respondents, mean scores by subscales (see Table 5) were 4.04 (SD = 0.92), 4.21 (SD = 0.75), and 4.47 (SD = 0.93) for Efficacy in Managing Classroom Behavior, Efficacy in Using Inclusive Instruction, and Efficacy in Collaboration, respectively, indicating that the sense of efficacy to teach in inclusive settings was rated moderately low. By subgroups of respondents, preservice teachers pursuing an early childhood degree scored significantly higher in Efficacy in Managing Classroom Behavior (M = 4.15 vs. 3.95, p < .05) and in Efficacy in Collaboration (M = 4.58 vs. 4.40, p < .05) than student teachers pursuing a degree in elementary education, but not in Efficacy in Using Inclusive Instruction (M = 4.16 vs. 4.25, p > .05).

Note (Table 5). Scale range 1-6 (1 = Strongly Disagree; 2 = Disagree; 3 = Somewhat Disagree; 4 = Agree Somewhat; 5 = Agree; 6 = Strongly Agree); *Statistically significant at the .05 level; SEN = Special Educational Needs.

Comparison of individual items revealed that early childhood student teachers reported significantly higher self-efficacy than student teachers pursuing a degree in elementary education in being able to get children to follow classroom rules (M = 4.61 vs. 4.25) [t(467) = 3.83, p < .001], in dealing with students who are physically aggressive (M = 4.13 vs. 3.86) [t(465) = 2.33, p = .020], and in their ability to collaborate with colleagues to teach students with specific support needs in the regular classroom (M = 4.80 vs. 4.46) [t(469) = 3.52, p < .001]. In contrast, preservice elementary teachers scored significantly higher than early childhood preservice teachers on Item 6, which measured the ability to get students to work together in pairs or in small groups (M = 4.40 vs. 4.68) [t(471) = -3.22, p = .001].
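The reported item-level comparisons can be sanity-checked from the t statistics alone. The sketch below is a rough check, not the authors' procedure: it approximates the two-sided p value with the standard normal distribution, which is reasonable only because the degrees of freedom here exceed 460, and an exact t distribution would give slightly larger p values. The function name is hypothetical.

```python
from statistics import NormalDist


def two_sided_p_normal_approx(t_value):
    """Approximate two-sided p value for a t statistic with large df."""
    return 2 * (1 - NormalDist().cdf(abs(t_value)))


# t statistics as reported in the text for the four item comparisons.
reported_t = {
    "follow classroom rules": 3.83,
    "physically aggressive students": 2.33,
    "collaboration with colleagues": 3.52,
    "pairs or small groups (Item 6)": -3.22,
}

approx_p = {label: two_sided_p_normal_approx(t) for label, t in reported_t.items()}
for label, p in approx_p.items():
    print(f"{label}: p is approximately {p:.4f}")
```

All four approximate values fall below .05, in line with the significance pattern described in the text.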

Discussion and Conclusion
Spain is considered to be one of the most inclusive countries in Europe (Spanish Ministry of Education, Culture, and Science, 2018). Since 1982, it has had strong legislation that guarantees a quality education for all students in regular schools and classrooms with adequate supports. Although the Salamanca Statement (UNESCO, 1994) reinforced the notion of quality education for all more than two decades ago, it still cannot be fully affirmed that inclusive quality education for all has been achieved. Teachers, who are the main agents of inclusion, do not always possess, or finish their preparation with, the necessary competence, motivation, and attitudes to face the challenges posed by an inclusive education system. In this context, self-efficacy for inclusive teaching becomes an essential factor in examining teachers' ability to implement inclusive practices. Given the absence of valid and reliable instruments in Spanish to measure teacher self-efficacy for inclusion, this study sought to: (1) adapt the TEIP scale into Spanish and validate its factor structure in a sample of Spanish preservice teachers, and (2) explore Spanish preservice teachers' self-efficacy for inclusive practices toward the end of their program of study.
With respect to the first inquiry, our results support the theoretical structure of three factors suggested by the authors of the original study, confirmed later in the studies by Malinen et al. (2013), Park et al. (2016), Hecht et al. (2017), Tanriverdi and Ozokcu (2018), or Alnahdi (2019) with Chinese, US, Italian, German, Turkish, and Arab preservice teachers, respectively. The findings of our study suggest a Spanish version of the TEIP consisting of 15 items and a construct with three dimensions that explained 64.65% of the total variance and was equivalent across the two degree programs. Although the three-factor model provides a modest fit to the data, the instrument has acceptable psychometric properties, with good reliability (alpha coefficients higher than those reported by Sharma et al., 2012; Malinen et al., 2013; or Narkun & Smogorzewska, 2019) and adequate construct validity, as highlighted by the correlations between factors, which were all statistically significant. These findings lend support to previous TEIP validation studies (Alnahdi, 2019; Malinen et al., 2013; Martins & Chacon, 2018; Park et al., 2016; Tanriverdi & Ozokcu, 2018) that found the same factor structure of the TEIP, thus contributing to the idea that teacher self-efficacy for inclusive practices is a multidimensional and universal construct identified in different countries and cultures across the world. In addition, we provided preliminary evidence of previously unexplored relationships of the TEIP-ES subscales with several quality indicators of teacher training programs (convergent validity), highlighting the fact that program quality is positively related to higher self-efficacy to teach diverse students in inclusive settings (Cardona-Molto et al., 2018). In fact, in this study, perceived opportunity to learn to teach inclusively, preparedness for inclusion, and program commitment to diversity had a moderate to strong positive association with self-efficacy for inclusive practices.
These results are congruent with the findings of studies by Ashan et al. (2012), Brown et al. (2008) or Lancaster and Bain (2010) demonstrating that the incorporation of special educational needs and/or inclusion-related content into training programs contributes to raising preservice teachers' sense of self-efficacy for inclusive practices and the quality of programs. The findings of this study give us the confidence to consider the Spanish version of the TEIP a useful tool to explore preservice teachers' self-efficacy for inclusion in Spanish contexts.
Regarding the second inquiry, the results indicate mild to moderate self-efficacy beliefs among respondents, with efficacy in managing behaviour and efficacy in collaboration being significantly higher among participants pursuing a degree in early childhood education than among those in elementary education. This finding concurs with earlier studies (e.g., Forlin et al., 2010; Hecht et al., 2017; Specht et al., 2016; Woodcock, 2011) that suggest lower levels of teacher self-efficacy for inclusive education as grade level increases. It may be explained not only by participants' inexperience in addressing diversity and inclusion (Milem et al., 2005), but also by insufficient institutional compliance with the necessary TE reform, as reflected in the lack of alignment of study programs with diversity standards (Spanish Ministry of Education, Culture, and Science, 2007).

Suggestions
The movement towards inclusive education poses important challenges to initial teacher preparation programs. It is imperative that future teachers leave the system with the necessary knowledge and skills to become inclusive educators. This study made a contribution to the field by adapting and validating the TEIP to measure preservice teachers' self-efficacy for inclusive practices in Spanish teacher populations. The findings should provide useful information to institutions designing new teacher education programs or to those who are evaluating or revising programs to incorporate strategies for addressing diversity and inclusion.
Future research should incorporate samples from a larger variety of teacher education programs and institutions. Because different types of TE programs have different impacts on preservice teacher preparation (Kim, 2011), researchers should examine the impact of different kinds of programs (general, inclusive, or separate programs) alongside other institutional characteristics (e.g., sensitivity and commitment to diversity, approaches to teaching diversity and inclusion, course content, or field practicum) on preservice teacher self-efficacy for inclusive practices. Interviewing university educators who teach required courses in the programs would also provide rich information about the ways they are preparing future teachers for inclusive education in their respective courses.

Limitations
The aforementioned findings, although promising, should be considered in light of several limitations. First, the TEIP-ES scale is a self-report measure of self-efficacy for inclusive practices and may be subject to social desirability bias. Therefore, future research should address this by carrying out more studies of an observational, longitudinal, and qualitative nature, as has been repeatedly suggested (Hecht et al., 2017; Henson, 2002; Klassen et al., 2011; Mintz, 2019; Tschannen-Moran et al., 1998). Tschannen-Moran et al. (1998) suggested that qualitative approaches help in gaining an understanding of how teacher beliefs about self-efficacy work. Henson (2002) commented that a greater diversity of methodologies would lead to the growth of teacher self-efficacy research. Second, the use of a cross-sectional (transversal) design, in which data are collected at a single point in time, does not allow us to assess how participants' self-efficacy evolves after graduation. Hence, its stability and sensitivity to change after participants' initial experiences as teachers need to be analyzed, as Mintz (2019) also recommends. Third, the sample is limited to two convenience cohorts of third-year preservice teachers from two certification programs (early childhood and elementary education) at only one of the three TE institutions in the Valencian Community. This limits the generalizability of the results, which cannot be transferred to other institutions in Spain and/or to other Hispanic/Latin-American countries and cultures without additional validation research.
Finally, although adapting instruments from other languages and cultural contexts represents an additional limitation, we believe our findings are valuable and strongly recommend the extensive use of the Spanish version of the TEIP for assessing teachers' self-efficacy for inclusive practices in preservice and in-service early childhood and elementary education, as well as in secondary education with additional validation.

Teacher Efficacy for Inclusive Practice scale (Spanish version)
This survey is designed to help us understand the nature of factors influencing the success of routine classroom activities in creating an inclusive classroom environment. Please circle the number that best represents your opinion about each of the statements. Please attempt to answer each question.

Notes. The word disability (original TEIP version) has been replaced with the term 'Necesidades educativas especiales' [Special Educational Needs] in the Spanish TEIP version. *Items excluded from the original TEIP version after exploratory and confirmatory factor analysis, so that 15 items composed the final version of the TEIP-ES scale.