Adaptation of Teachers’ Teaching Thinking Practices Scale into English

This study aims to adapt the Teachers’ Classroom Practices for Teaching Thinking scale from Turkish into English. The scale includes 21 items, each rated on a 5-point Likert-type scale. It has 4 factors: Effectiveness of Teaching Thinking, Loyalty to Curriculum, Teacher Dependence and Encouraging Thinking. In the first step, statistical analyses were conducted to establish linguistic equivalence; for this purpose, data were collected from 30 teachers of English at a 20-day interval. In the second step, data collected from 148 native English-speaking teachers were analysed with Confirmatory Factor Analysis. Good fit indices were obtained at the end of this process. The Cronbach's alpha coefficient was .90. For convergent and discriminant validity, which are aspects of construct validity, correlations between sub-dimensions and average variance extracted values were calculated and found to be at sufficient levels. Items in the scale were discriminating. The study concludes that the English translation of the scale is statistically valid and reliable.


Introduction
The common idea that only clever people can think is no longer valid (Dilekli, 2015, 2016; Hashim, 2004; Siegel, 2010), because many studies (Costa, 2001; Beyer, 2010; Onosko, 1991; Nair & Ngang, 2012) have shown that thinking skills can be taught to anyone. Furthermore, the question of which cognitive skills should be taught to create better thinkers has been answered. Thinking skills can be classified as searching for meaning (analytical thinking), critical thinking, creative thinking, problem solving and decision making. To teach searching for meaning, teachers should teach comparing and contrasting, classifying data, determining part-whole relationships, sequencing, and finding reasons and conclusions. To teach critical thinking, the skills of generating possibilities, determining the reliability of sources, using evidence for causal explanation, prediction, conditional reasoning and deduction should be taught. For creative thinking, the skills of generating possibilities and creating metaphors should be developed. To teach decision making and problem solving, generating possibilities and comparing and contrasting should again be taught (Dilekli, 2019; McGuinness, Eakin, Curry, Bunting & Sheehy, 2003; Swartz & Parks, 1994).
To develop all these skills, curricula are built on three main approaches. The first is the general thinking skills program (e.g., the Instrumental Enrichment Programme by Feuerstein, the Cognitive Research Trust [CoRT] by De Bono). These programs are not based on a particular curriculum or discipline, and they are criticized because the skills are difficult to transfer to specific disciplines. For this reason, discipline-specific programs were launched (e.g., Cognitive Acceleration through Science Education [CASE] by Adey & Shayer, Philosophy for Children [P4C] by Lipman). These programs try to teach thinking skills within specific domains; all activities are planned around a chosen domain such as literature, geography or mathematics. However, even these programs cannot solve the problem of transferring the skills to other fields. Hence, the infused approach was launched (e.g., by Swartz & Parks). In this approach, all curricula are revised or completely changed for teaching thinking. Subject matter becomes a means to teach the skills, and how you teach becomes more important than what you teach. Thus, teaching techniques and classroom practices become the major elements of the teaching thinking process (Beyer, 2010; Dilekli & Tezci, 2015, 2016; Nispet, 1990). According to Fisher (1995), three main factors affect teaching thinking: (1) learning and facilitating teachers, (2) thinking learners, and (3) a supportive learning atmosphere. In thinking classrooms, the teacher is the facilitator of learning. As facilitators, teachers help students understand what is really being said or what the writer intends, and support students' creativity by letting them voice different ideas or asking them to produce different ideas. In addition, teachers create problem situations in which students establish cause-and-effect relationships.
Furthermore, in thinking classrooms, students work together to solve real-life problems or carry out projects, and they evaluate their solutions from a critical point of view (Avargil, Herscovitz & Dori, 2011; Dilekli & Tezci, 2015, 2016; Goelz, 2004; Kline, 2002; Lipman, Sharp & Oscanyan, 1980). Many studies (Keyser & Broadbear, 1999; Nair & Ngang, 2012; Ritchhart & Perkins, 2000; Sternberg, 1992) show that some habits and classroom practices hinder teaching thinking. Among the most common teacher habits that inhibit teaching thinking are giving more importance to conveying pure knowledge, being dependent on the curriculum, believing that only clever students can learn to think, and seeing the process of thinking as time consuming. Moreover, the expectations that students and their parents have of schools, perennial curricula, and central examinations such as university entrance exams are listed as barriers to teaching thinking (Dilekli, 2015; Dilekli & Tezci, 2016; Dilekli & Tezci, 2018). Because teaching thinking takes a long time, its results cannot be seen in the short term. For this reason, some teachers and parents are not in favor of teaching these abilities (Dilekli, 2015; Oberski, 1991).
Teachers' teaching approaches are a key factor in raising a generation of thinkers. In this context, many studies have identified similar fundamental classroom practices of teachers (Dilekli, 2019; McGuinness et al., 2003). Thus, how you teach becomes more important than what you teach.

Research Goal
This study aims to adapt the scale developed by Dilekli and Tezci (2015) in Turkish culture into English culture. Teaching thinking practices are similar regardless of culture. Furthermore, there are a limited number of scales for teachers' practices in teaching thinking skills (e.g., Doganay & Sari, 2012; Williams, 1999). English was selected intentionally because it is one of the most commonly used languages in science. In this way, researchers from other cultures can use the scale by translating it from English into their own languages.

Participants
This research was conducted with two different groups of participants. The first group consisted of 30 teachers of English and was used to check the appropriateness of the translation. They were between 28 and 52 years old; 18 (60%) were female and 12 (40%) were male. Ten had 1-5 years of experience, 9 had 6-10 years, 7 had 11-15 years and 4 had 16-20 years. The second group consisted of 148 native English-speaking teachers from 11 different disciplines. Of these, 96 (64.9%) were female and 52 (35.1%) were male. Sixteen (10.8%) had 1-5 years of experience, 32 (21.6%) had 6-10 years, 28 (18.9%) had 11-15 years, 25 (16.9%) had 16-20 years and 47 (31.8%) had 21 or more years. Teachers were invited by email to participate in an online survey. The data from this group were used for CFA and reliability analysis.

Data Collection Tool
The scale was used with a Turkish sample in five different cities with the permission of the Ministry of National Education (MoNE). To grant this permission, a MoNE council analyzed the instrument and confirmed that it meets ethical requirements (MoNE approval number: 70297673/100/3578931). As this study aimed at a cultural adaptation, the same scale was used. The data collection tool consists of two parts. The first part collects demographic information about the participants: teaching field, professional seniority, age and graduation. The second part contains the scale items. The scale, developed by Dilekli and Tezci (2015), consists of 21 items and 4 factors. The first factor consists of 9 items and is labelled Effectiveness of Teaching Thinking; its items are related to teaching practices in the classroom. The second factor has 5 items and is called Loyalty to Curriculum; it covers teachers' dependence on the curriculum. The third factor, containing 4 items, is called Teacher Dependence; its items are related to classroom climate. The last factor, called Encouraging Thinking, has 4 items and is related to behaviours that students are encouraged to perform. On this 5-point scale, 5 stands for Always and 1 for Never. The scale was previously used and statistically analysed by Dilekli and Tezci (2015). Exploratory Factor Analysis (EFA) showed that the scale has 4 factors and explains 56.431% of the total variance, and the reliability values of the scale ranged between .73 and .88. The paths between latent and observed variables were significant, with factor loadings between .55 and .74, and the correlations between latent variables were positive and significant. Furthermore, item discrimination values and item-total correlations were calculated for the Turkish version of the scale.
All items were significant, and item-total correlations ranged from .511 (item 12) to .964 (item 7). Confirmatory Factor Analysis (CFA) showed that the scale has acceptable fit indices for the defined factor structure.

Analyses
To ensure the appropriateness of the translation, the scale was administered to 30 bilingual teachers at a 20-day interval. Of these, 22 were Turkish teachers of English and 8 were native English speakers who knew Turkish. After that, the agreement between the two forms was checked with correlation analyses. In the next step, reliability and item-total correlations were analyzed. Because the factor structure of the scale, which was developed in Turkish culture, was being tested in English culture, confirmatory factor analysis (CFA) was applied to the data obtained from 148 teachers. Joreskog (1969) indicated that CFA should be used to establish construct validity. CFA allows researchers to assess how well the factor structure of a model consisting of observable variables fits the collected data (Brown, 2006). CFA provides several indices for evaluating the factor structure of a scale, such as Chi-Square Goodness of Fit (χ²), Goodness of Fit Index (GFI), Adjusted Goodness of Fit Index (AGFI), Root Mean Square Error of Approximation (RMSEA), Root Mean Square Residuals (RMR), Standardized Root Mean Square Residuals (SRMR), and Normed and Non-Normed Fit Index (NFI & NNFI) (Brown, 2006; Hu & Bentler, 1999; Joreskog & Sorbom, 2004; Tabachnick & Fidell, 2007). Furthermore, Average Variance Extracted (AVE) and Composite Reliability (CR) were analysed for convergent validity. An AVE of .50 is acceptable; when CR is higher than .60, an AVE higher than .40 can also be accepted (Fornell & Larcker, 1981). To determine the internal reliability of the scale, Cronbach's alpha was calculated. In addition, item discrimination indices were examined with the upper and lower 27% groups technique, using an independent samples t-test for each item.
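The AVE and CR criteria described in this section can be checked directly from standardized factor loadings. The following is a minimal sketch, assuming the usual Fornell and Larcker formulas in which each item's error variance is taken as one minus its squared standardized loading. The function names and the example loadings are hypothetical illustrations and are not values from this study.

```python
def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized loadings of one factor."""
    return sum(l ** 2 for l in loadings) / len(loadings)


def composite_reliability(loadings):
    """CR: (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances).

    Assumption: the error variance of each item is 1 - loading^2,
    which holds for standardized loadings.
    """
    squared_sum = sum(loadings) ** 2
    error_sum = sum(1 - l ** 2 for l in loadings)
    return squared_sum / (squared_sum + error_sum)


def meets_convergent_criteria(ave, cr):
    """Thresholds quoted in the text: AVE >= .50, or AVE > .40 when CR > .60."""
    return ave >= 0.50 or (cr > 0.60 and ave > 0.40)


# Hypothetical standardized loadings for a four-item factor.
example_loadings = [0.80, 0.70, 0.60, 0.75]
ave = average_variance_extracted(example_loadings)
cr = composite_reliability(example_loadings)
print(f"AVE={ave:.3f}, CR={cr:.3f}, acceptable={meets_convergent_criteria(ave, cr)}")
```

In practice the loadings would be taken from the standardized CFA solution for each of the four factors, and the check would be repeated per factor.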

Translation Process and Linguistic Equivalence
Firstly, the scale was translated by the two researchers and four bilinguals, two of whom were native English speakers. These two experts worked in a department of English Language and Literature, and the other two were native Turkish speakers who had earned a Ph.D. in the USA. The native English speakers translated the form into English, and this translated form was translated back into Turkish by the native Turkish speakers and the researchers. After the back-translation process, the forms were compared, and no corrections were made because the translations had the same meaning. To check linguistic equivalence, the Turkish version of the scale was administered first, and the English translation was then administered to the same 30 teachers of English 20 days later. A t-test and correlation analyses were conducted, and the results are shown in Table 1. According to the t-test results, there were no significant differences (p>.05) between the scores of the two administrations, which suggests that the English and Turkish versions of the scale are semantically equivalent. The lowest correlation was seen in item 10 (r=.492, p<.05) and the highest in item 21 (r=.881, p<.05). These results showed that the translation was coherent with the original form.
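The equivalence check described above combines two statistics per item: a Pearson correlation between the two administrations and a paired-samples t statistic on their differences. The sketch below shows both calculations in plain Python. The scores are hypothetical, the function names are illustrative, and the p-value lookup (which needs the t distribution with n - 1 degrees of freedom) is left to a statistics package.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def paired_t(x, y):
    """Paired-samples t statistic and its degrees of freedom.

    Assumes the differences are not all identical (non-zero variance).
    """
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    sd_d = math.sqrt(sum((d - mean_d) ** 2 for d in diffs) / (n - 1))
    return mean_d / (sd_d / math.sqrt(n)), n - 1


# Hypothetical scores of eight teachers on one item (1 = Never, 5 = Always).
turkish_form = [4, 5, 3, 4, 2, 5, 3, 4]
english_form = [4, 4, 3, 5, 2, 5, 3, 4]

r = pearson_r(turkish_form, english_form)
t_value, df = paired_t(turkish_form, english_form)
print(f"r={r:.3f}, t={t_value:.3f}, df={df}")
```

A high r together with a non-significant t is the pattern the study reads as evidence of linguistic equivalence.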

CFA Results
CFA was conducted with the data collected from 148 native English-speaking teachers. It was used to test whether the factor structure of the Turkish version of the scale holds for the English translation, and whether the relationships between the latent and observed variables differ between the Turkish and English versions. The analysis yielded χ²/df = 402.96/180 = 2.24 and RMSEA = .089, which was within the acceptable range, but some indices reflecting the parsimony of the model (GFI=.79; AGFI=.74) were low. Better fit indices were obtained after the proposed modifications were made between items 5 and 11, 5 and 14, 6 and 13, and 7 and 14 in the Effectiveness of Teaching Thinking dimension, and between items 3 and 4 in the Teacher Dependence dimension. The results are shown in Table 2.

Figure 1: Parameters of Standardized CFA
In the path diagram, the highest value in the Effectiveness of Teaching Thinking factor was .85, in item 9 (t=12.54, p<.05) and item 6 (t=12.51, p<.05), and the lowest was .80 in item 14. In the Loyalty to Curriculum factor, the highest value was .80 (t=11.09) in item 18 and the lowest was .66 (t=8.45, p<.05) in item 19. In the Teacher Dependence factor, the highest value was .83 (t=12.79, p<.05) in item 12 and the lowest was .56 (t=6.63, p<.05) in item 4. For the last factor, Encouraging Thinking, the highest value was .84 (t=10.75, p<.05) in item 17 and the lowest was .64 (t=7.82, p<.05) in item 21.

Convergent and Discriminant Validity
Although CFA is an analysis for construct validity, Campbell and Fiske (1959) indicated two other ways of testing construct validity: convergent and discriminant validity. Convergent validity is the degree of confidence that a property is well measured by its indicators, while discriminant validity is the degree to which measures of different properties are unrelated to each other. The results of these analyses are shown in Table 3 (Fornell & Larcker, 1981; Peterson, 2000). After the five proposed modifications based on error variances, better fit indices were obtained. The RMSEA value decreased from .089 to .060 and NFI increased from .91 to .93. RMR decreased from .069 to .065 and SRMR decreased from .060 to .057, which are acceptable values. After the proposed modifications, excellent fit indices were found: NNFI=.97, CFI=.98, IFI=.98, RFI=.95. Furthermore, AGFI, which reflects the parsimony of the model, increased from .74 to .81, and GFI increased from .79 to .85. The paths from the latent variables to the observed ones were significant. The standardized path diagram is shown in Figure 1.

Reliability and Discriminant Analysis
Cronbach's alpha, omega, item-total correlations and item discrimination indices were calculated to assess the internal consistency and reliability of the scale. The results are shown in Table 4. Corrected item-total correlations ranged from 0.770 to 0.827 in Effectiveness of Teaching Thinking, 0.614 to 0.727 in Loyalty to Curriculum, 0.589 to 0.728 in Teacher Dependence and 0.556 to 0.694 in Encouraging Thinking. These values showed that the scale items were moderately to highly correlated with the total scores of their respective dimensions. The t values comparing the top 27 percent and the bottom 27 percent of the participants were significant and ranged from 5.145 to 11.907. These results showed that the scale items have discriminating power.
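The reliability and discrimination steps reported here can be illustrated with a short calculation. The sketch below computes Cronbach's alpha from a respondents-by-items matrix and then compares the upper and lower 27% groups (ranked by total score) on one item with a pooled-variance independent samples t statistic. The response matrix and the function names are hypothetical; omega and p-values are not computed here.

```python
import math
from statistics import mean, variance


def cronbach_alpha(rows):
    """rows: one list of item scores per respondent."""
    k = len(rows[0])
    item_variances = [variance([r[i] for r in rows]) for i in range(k)]
    total_variance = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def extreme_groups(rows, fraction=0.27):
    """Return (upper, lower) groups of respondents ranked by total score."""
    ranked = sorted(rows, key=sum)
    n = max(2, round(len(ranked) * fraction))
    return ranked[-n:], ranked[:n]


def independent_t(a, b):
    """Pooled-variance t statistic for two independent groups.

    Assumes at least one group has non-zero variance.
    """
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))


# Hypothetical responses of ten teachers to three items (1 = Never, 5 = Always).
responses = [
    [5, 5, 4], [5, 4, 5], [4, 4, 4], [4, 5, 3], [3, 3, 3],
    [3, 4, 2], [2, 3, 3], [2, 2, 2], [1, 2, 2], [2, 1, 1],
]

alpha = cronbach_alpha(responses)
upper, lower = extreme_groups(responses)
t_item1 = independent_t([r[0] for r in upper], [r[0] for r in lower])
print(f"alpha={alpha:.3f}, t(item 1)={t_item1:.3f}")
```

An item is usually treated as discriminating when the upper group scores significantly higher than the lower group, which corresponds to the significant t values reported for every item of the scale.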

Discussion and Conclusion
This study aimed to adapt the Teachers' Teaching Thinking Practices (TTTP) scale, originally in Turkish, into English. Firstly, the scale was translated into English, and both versions were administered to 30 teachers of English at a 20-day interval to check the clarity of the translation. A paired samples t-test and correlation analysis were applied to the data from this process. Medium to high correlations were found between the original and translated versions of the scale, and the paired samples t-test showed no significant difference between them. As a result, the English version of the scale is understandable and clear, which means that the Turkish and English versions have similar meanings. In the second phase, the English version of the scale was administered to 148 native English-speaking teachers. CFA indicated that the scale has acceptable values for RMSEA, RMR and SRMR. On the other hand, the GFI and AGFI values were not at acceptable levels. Therefore, the proposed corrections based on error variances were applied, after which AGFI reached .81 and NFI reached .93, which indicate good levels of fit. RMSEA decreased from .089 to .060, which is within the acceptable range. Consequently, the English and Turkish versions of the TTTP scale have the same factor structure. This finding is similar to that of Dilekli and Tezci (2015), who developed the scale. However, Dilekli and Tezci (2015) did not make any modifications based on error variances in the Turkish version of the scale. Convergent and discriminant validity analyses showed that all factor loadings and AVE values were higher than .50 for each dimension. These findings support the scale's convergent validity (Fornell & Larcker, 1981; Peterson, 2000). Doganay and Sari (2012), in their study 'A Study of Developing the Thinking-Friendly Classroom Scale', developed a scale consisting of 32 items. The scale has three factors: teachers' behaviours promoting thinking, students' behaviours promoting thinking, and behaviours that prevent thinking.
Its Cronbach's alpha internal consistency coefficients are .89 for the first factor, .82 for the second factor, .69 for the last factor and .89 for the whole scale, and the total explained variance was 42.36%. While the TTTP scale addresses teachers' practices, the Thinking-Friendly Classroom Scale is intended for students and has been applied in only one culture, to Turkish fifth-grade students. This study will help practitioners in teaching thinking. In addition, because the scale is based on teachers' teaching behaviours, it can be adapted to different disciplines. Furthermore, because this study is not based on a specific discipline, the results can help other researchers who want to develop discipline-based teaching thinking scales. As both versions of the scale showed a similar construct, it can be concluded that the latent components of thinking skills are culture-free (Guilford, 1959; Tebbs, 2000; Tishman, Perkins & Jay, 1995), and being culture-free may affect the research results. For example, in Bloom's taxonomy (1976), evaluation, analysis and synthesis are the cognitive constructs that direct the cognitive process, and these steps are culture-free; yet the product at the end of these processes may differ because of culture. Similarly, Alvino (1990) indicated that higher-order thinking processes have the same construct, yet cultural factors may affect the results. Although teaching thinking may change from school to school, and even from teacher to teacher, every student inevitably needs thinking skills (Boyer, 1995; Costa, 1991; de Bono, 1976; Ontario Council of Regents, 1990). The results of this study will help teachers evaluate their teaching practices.

Suggestions/Limitations
The sample group consisted of 148 native English-speaking teachers. Therefore, the scale should be applied to a larger sample of teachers from different socio-cultural backgrounds. Furthermore, this study aimed only at scale adaptation; other factors affecting teaching thinking practices may be studied with different samples. Another limitation of this study was the lack of native English speakers in the back-translation process. In this study, 4 bilinguals participated in the translation and back-translation process. It would be better to work with a greater number of bilinguals during back translation.