Validating Student’s Green Character Instrument Using Factor and Rasch Model

Many researchers have separately developed instruments to measure environmental characteristics such as attitudes, values, and knowledge. However, no single comprehensive instrument measures all of these aspects together. This study aims to develop and validate a green character instrument that reveals students’ behavior and awareness of the environment. The instrument consists of 40 statement items covering 5 aspects, namely private pro-environmental behavior, public pro-environmental behavior, environmental knowledge, environmental values, and environmental attitudes. It was administered to 1,398 students from 15 universities in Indonesia. Content validity was assessed by three experts using the content validity index (CVI). Construct validity was analyzed using exploratory factor analysis, confirmatory factor analysis, and RASCH analysis. The content validity results yielded CVI scores between 0.8 and 0.9, in the good category, while item reliability was in a fairly good category with a high separation index. Construct validation resulted in 34 items (4 items were eliminated after exploratory and confirmatory factor analysis, and 2 items were eliminated after RASCH analysis) spread over five constructs, namely environmental behavior, environmental knowledge, environmental values, environmental attitudes, and environmental habits. The resulting instrument has a good level of item difficulty, a response set that respondents understand easily, and no item bias. Therefore, it can be used to measure the green character of both male and female students.


Introduction
Character as a part of humanity (Pradhan, 2009), in the form of values, beliefs, good and bad behavior (Rahman et al., 2020; Ryan, 2013), and morality (Sari et al., 2021), guides how people think and behave (Maisardi, 2017). It needs to be formed because it cannot arise spontaneously (Muharlisiani et al., 2019). Therefore, character needs to be instilled in the younger generation through continuous learning, examples, and practices (Rahmawati et al., 2020). People with character have good morals (Asrial et al., 2021) and consciously control their every action and behavior (Maisardi, 2017).
Good character is needed in all aspects of life, including the environment. An example of good character towards the environment is an attitude of caring for the environment (Pane & Patriana, 2016; Sanjaya, 2021). The character of caring for the environment must also be made habitual (Arent et al., 2020; Masturoh & Ridlo, 2020), and it is important to develop because the environment has an impact on human existence (Yunesa, 2019). Environmental care character creates positive behavior towards the environment (Asrial et al., 2021; Sukri et al., 2020a) and reduces the negative impact of human behavior on the environment (Palupi & Sawitri, 2018; Sukri et al., 2020b). In addition, concern for the environment is very important because most environmental damage is caused by human behavior (El Faisal et al., 2018; Sukri et al., 2018).
The term green character in this study refers to a person's behavior and awareness of the environment. Behavior refers to human activities to protect the environment, also called pro-environmental behavior (Stern, 2000), while awareness refers to knowledge (Raymond et al., 2010), values (Thompson & Barton, 1994), and attitudes towards the environment (Dunlap et al., 2000). Therefore, an attitude of caring for the environment is part of a green character. The term green character was chosen to describe all positive behaviors and awareness of the environment. Frasz (2016) describes environmental character as feelings, sentiments, and virtues towards the environment. The term green is also used by Chankrajang and Muttarak (2017) to describe one aspect of attitude towards the environment, namely pro-environmental behavior. The term green character covers all behaviors, attitudes, knowledge, values, and anything else with a positive impact on the environment, which makes the term more universal.
Currently, it is difficult to find an instrument that fully accommodates all aspects of environmental behavior and awareness. Stern (2000) only developed an instrument to measure pro-environmental behavior, while Raymond et al. (2010) focused on the knowledge aspect. In addition, Thompson and Barton (1994) and Dunlap et al. (2000) only focused on the values and attitudes aspects. The only similar research was conducted by Fu et al. (2018), which has two weaknesses: (1) it is limited to the behavior and awareness of the campus academic community and is not generally applicable to the wider community, and (2) most of its statement items do not match the conditions, context, and socio-cultural setting prevailing in many countries, such as Indonesia. According to He and Filimonau (2020) and Chwialkowska et al. (2020), a person's socio-cultural background influences his or her behavior towards the environment. For example, the statement item "I believe I know environmental issues well" presented by Fu et al. (2018) is not a concrete statement and does not fit the conditions of society in several countries with similar culture and conditions, especially Indonesia. The statement becomes understandable if it is transformed into real environmental issues occurring in the community, for example "Illegal logging can result in the loss of clean water sources and natural disasters" and "Throwing garbage in rivers can cause damage to marine ecosystems".
Therefore, this research is important for producing an instrument that accommodates all aspects of environmental behavior and awareness. The resulting instrument can be used to measure not only knowledge, values, and attitudes towards the environment, but also behavior reflected in pro-environmental attitudes. The results of this study can serve as a reference for researchers in other countries with cultural and socioeconomic conditions similar or identical to those of Indonesia, which makes the instrument more contextual and precise for measuring the "green character" of students.

Contribution to the Literature
• Some of the instruments developed by previous researchers were limited to certain aspects and did not cover all aspects of environmental behavior and awareness.
• Instruments to measure green character have not been disclosed and have not been validated, especially in Indonesia.
• The instrument validated in this study can be used to measure students' green character precisely because it is contextual and in accordance with the conditions experienced by students.

Methodology
This research aims to develop and validate the green character instrument. The development was conducted in three steps: 1) analyzing the supporting literature and arranging the items, 2) content validation, and 3) construct validation through Exploratory Factor Analysis (EFA), Confirmatory Factor Analysis (CFA), and RASCH analysis (Saefi et al., 2020).

Literature Review and Item Arrangement
The literature review was carried out to determine the representative variables for the green character instrument. The literature analysis was based on studies published in reputable international journals, such as Stern (2000), Raymond et al. (2010), Thompson and Barton (1994), and Dunlap et al. (2000). Based on the results of the review, a draft green character instrument with 40 items was prepared. The draft consists of private pro-environmental behavior (Stern, 2000) with 11 items; public pro-environmental behavior (Stern, 2000) with 8 items; environmental knowledge (Raymond et al., 2010) with 6 items; environmental values (Thompson & Barton, 1994) with 8 items; and environmental attitudes (Dunlap et al., 2000) with 7 items. Students responded using five answer choices: 1 = strongly disagree, 2 = disagree, 3 = indifferent, 4 = agree, and 5 = strongly agree.

Content Validation
Content validity is evidence of the extent to which the elements of an assessment instrument are relevant to and representative of the construct targeted for a particular assessment objective (Almanasreh et al., 2019). Content validity includes four criteria: relevance, clarity, simplicity, and ambiguity (Yaghmaei, 2003). The content of the green character questionnaire was validated by lecturers, practitioners, and researchers in the environmental field, acting as experts in their respective fields. In conducting the assessment, the validators rated each item on a four-point scale (1 = not relevant, 2 = somewhat relevant, 3 = quite relevant, 4 = very relevant), with the wording adjusted to each of the 4 content validation criteria. The four-point ratings were then converted into dichotomous data to compute the content validity index (Polit & Beck, 2006), with the provisions that CVI values > 0.79 were accepted, CVI values of 0.70-0.79 were revised, and CVI values < 0.70 were rejected (Devon et al., 2007).
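The CVI procedure described above can be sketched as follows. This is a minimal illustration, assuming that ratings of 3 or 4 count as "relevant" when the four-point scale is dichotomized (the usual convention for the item-level CVI); the function names and the example ratings are hypothetical and are not taken from the study data.

```python
def item_cvi(ratings):
    """Item-level CVI: share of experts who rated the item 3 or 4 (assumed cut)."""
    if not ratings:
        raise ValueError("at least one expert rating is required")
    relevant = sum(1 for r in ratings if r >= 3)
    return relevant / len(ratings)


def cvi_decision(cvi):
    """Apply the decision thresholds stated in the text."""
    if cvi > 0.79:
        return "accept"
    if cvi >= 0.70:
        return "revise"
    return "reject"


# Hypothetical ratings of one item by five experts on the 1-4 scale.
example_ratings = [4, 3, 4, 2, 4]
cvi = item_cvi(example_ratings)
print(cvi, cvi_decision(cvi))
```

The same function can be applied separately to the relevance, clarity, simplicity, and ambiguity ratings of each item.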

Research Sample
This study involved 1,398 students as respondents from 15 universities in Indonesia, selected through random sampling (Endo et al., 2016). Respondents consisted of 972 women (69.53%) and 426 men (30.47%) aged 19 to 22 years. Respondents came from western, central, and eastern Indonesia and from various majors such as social science, science, science education, engineering, humanities, and business. The sample size of 1,398 people met the ideal limits for factor analysis (Tabachnick & Fidell, 2014) and RASCH analysis (Hagell & Westergren, 2016).

Data Analysis
The initial stage of the analysis was an exploratory factor analysis (Williams et al., 2010). Prerequisite analyses, namely the Kaiser-Meyer-Olkin (KMO) measure and Bartlett's Test of Sphericity, were performed prior to EFA (Chan & Idris, 2017). EFA used the varimax rotation method (Osborne, 2015) and maximum likelihood estimation (Kassim et al., 2013) with the criteria of Eigenvalue > 1 (Yong & Pearce, 2013) and a minimum factor loading of 0.3 (Prasetyo et al., 2019). CFA was conducted to confirm the EFA results, with model fit criteria based on the Root Mean Square Error of Approximation (RMSEA, cutoff 0.06), Goodness of Fit Index (GFI, cutoff 0.95), Comparative Fit Index (CFI, cutoff 0.95), Tucker-Lewis Index (TLI, cutoff 0.95), and χ²/df < 3.00 (Sun, 2005). The RASCH analysis measures the validity of the instrument's construct in terms of content and consequential aspects (Susongko, 2016). Since the sample is larger than 500 (Sumintono & Widhiarso, 2015), item fit was judged on the infit and outfit mean-square values (MNSQ, between 0.6 and 1.5) and the point-measure correlation coefficient (PTMEA Corr, between 0.3 and 0.7) (Linacre, 2018). Items that meet one of these criteria are designated as valid items, while items that do not meet the criteria are deleted from the instrument. Furthermore, accepted reliability values lie between 0.65 and 0.83 (Sumintono & Widhiarso, 2015), with separation index values of 1 and > 2 (Ismail et al., 2020). In addition to reliability, Wright map analysis was performed to determine the items' level of difficulty (Scoulas et al., 2021), followed by rating scale analysis to evaluate the clarity and ease of interpretation of the response set in the instrument (Kim & Kyllonen, 2006). Finally, to check for bias in the instrument, a Differential Item Functioning (DIF) analysis was conducted to compare the responses of male and female students (Iseppi et al., 2021).
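The item fit rule stated above can be sketched in a few lines. This is a minimal illustration, assuming "meets one of these criteria" means that at least one of infit MNSQ, outfit MNSQ, and PTMEA Corr falls inside its range; the function names and the example values are hypothetical.

```python
MNSQ_RANGE = (0.6, 1.5)   # acceptable infit/outfit mean-square range from the text
PTMEA_RANGE = (0.3, 0.7)  # acceptable point-measure correlation range from the text


def in_range(value, bounds):
    """True when value lies inside the closed interval given by bounds."""
    low, high = bounds
    return low <= value <= high


def item_is_valid(infit_mnsq, outfit_mnsq, ptmea):
    """An item is kept when at least one fit criterion is satisfied (assumed reading)."""
    checks = [
        in_range(infit_mnsq, MNSQ_RANGE),
        in_range(outfit_mnsq, MNSQ_RANGE),
        in_range(ptmea, PTMEA_RANGE),
    ]
    return any(checks)


# Hypothetical fit statistics for two items.
print(item_is_valid(1.02, 1.10, 0.45))  # all three criteria met
print(item_is_valid(1.80, 1.95, 0.21))  # no criterion met
```

A stricter reading that requires all three criteria would replace `any` with `all`.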

Content Validation
The results of the CVI analysis on the 40 green character instrument items show that the CVI values range from 0.8 to 0.9 for all aspects. Based on these results, all items in the instrument met the validity criteria, which were reviewed in terms of relevance, clarity, simplicity, and ambiguity.

Exploratory Factor Analysis (EFA)
Factor analysis reduces a set of variables to a smaller number of factors that summarize the relationships between the variables (Goldberg & Velicer, 2006). The first assumption in factor analysis is the adequacy of the sample (UI Hadia et al., 2016). Sample adequacy is measured by the Kaiser-Meyer-Olkin (KMO) value, which must be greater than 0.5 (Hair et al., 2010). The second assumption is that the variables within the factors are related (Matore et al., 2019), which is indicated by a p-value for Bartlett's Test of Sphericity (BTS) of less than 0.05 (Chan & Idris, 2017). The results of the KMO and BTS analysis are shown in Table 1. The KMO value is 0.917, which is in the very good category (UI Hadia et al., 2016), while the BTS p-value is <.001. Both EFA assumptions are therefore met, and the data are acceptable for further analysis (Field, 2000).
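For reference, the two prerequisite statistics can be written out in their standard textbook form. The symbols are assumptions introduced here for illustration: R is the correlation matrix of the p items, n is the sample size, r_ij are the observed correlations, and q_ij are the partial correlations.

```latex
% Bartlett's Test of Sphericity (standard form)
\chi^2 = -\left(n - 1 - \frac{2p + 5}{6}\right)\ln\lvert R\rvert,
\qquad
df = \frac{p(p-1)}{2}

% Kaiser-Meyer-Olkin measure of sampling adequacy (overall MSA)
\mathrm{KMO} = \frac{\sum_{i \neq j} r_{ij}^2}
                    {\sum_{i \neq j} r_{ij}^2 + \sum_{i \neq j} q_{ij}^2}
```

With p = 40 items, df = 40 × 39 / 2 = 780, which agrees with the degrees of freedom reported in Table 1.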

Table 1. Kaiser-Meyer-Olkin and Bartlett's Test of Sphericity
Kaiser-Meyer-Olkin (Overall MSA) | Bartlett's Test of Sphericity: χ² | df | p
0.917 | 18800.609 | 780 | <.001

After the EFA assumptions were met, the next step was a factor analysis of the 40 instrument items using the varimax rotation method (Osborne, 2015) and maximum likelihood estimation (Kassim et al., 2013). The number of factors was determined by parallel analysis (Çokluk & Koçak, 2016). The results in Figure 1 show that the inflection point occurs after five factors, so five constructs were retained from the factor analysis. Each retained item has a factor loading of more than 0.3. The minimum factor loading of 0.3 used in this study indicates that the formed factor has met the fit criteria (Prasetyo et al., 2019). The factor loadings are shown in Table 2. Based on Table 2, items A7, A8, A11, and A34 were eliminated from the analysis because they had factor loadings of less than 0.3. In summary, the analysis of 40 items resulted in 5 factors. The five factors were then grouped and named according to the shared characteristics of their items, as follows: factor 1, environmental behavior; factor 2, environmental knowledge; factor 3, environmental value; factor 4, environmental attitude; and factor 5, environmental habits. The results are supported by the Eigenvalue, variance, interitem correlation, and Cronbach's alpha values presented in Table 3.
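The loading-based screening described above can be sketched as follows. This is a minimal illustration, assuming that each item is assigned to the factor on which it has the largest absolute loading and is dropped when that loading is below 0.3; the item labels and loading values are hypothetical and do not reproduce Table 2.

```python
MIN_LOADING = 0.3  # minimum factor loading used in the study


def assign_items(loadings, min_loading=MIN_LOADING):
    """Return (retained, dropped): retained maps item -> 1-based factor number."""
    retained, dropped = {}, []
    for item, row in loadings.items():
        # Index of the factor with the largest absolute loading for this item.
        best = max(range(len(row)), key=lambda k: abs(row[k]))
        if abs(row[best]) < min_loading:
            dropped.append(item)
        else:
            retained[item] = best + 1
    return retained, dropped


# Hypothetical rotated loadings for three items on three factors.
example = {
    "X1": [0.62, 0.10, 0.05],
    "X2": [0.12, -0.48, 0.20],
    "X3": [0.21, 0.15, 0.28],
}
retained, dropped = assign_items(example)
print(retained)  # {'X1': 1, 'X2': 2}
print(dropped)   # ['X3']
```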

Confirmatory Factor Analysis (CFA)
The CFA model was estimated using Diagonally Weighted Least Squares (DWLS), which is considered more suitable than maximum likelihood for data that are not normally distributed (Nye & Drasgow, 2011). The results of the CFA model fit and the final measurement model are shown in Table 4 and Figure 2.

Figure 2. CFA Final Measurement Model
To strengthen the results of the EFA and CFA, a RASCH analysis was performed to determine the validity and reliability of the instrument following Messick's validity framework, which includes several aspects, namely content, substance, structure, external, and consequential (Susongko, 2016). This research is limited to the content and consequential aspects.
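As background, a Rasch model for Likert-type responses can be written as below. The text does not state which polytomous model was fitted, so the rating scale formulation shown here is an assumption; B_n, D_i, and F_k are the conventional symbols for person measure, item difficulty, and category threshold, all in logits.

```latex
% Rating scale model (assumed formulation): log-odds of person n choosing
% category k rather than category k-1 on item i
\ln\!\left(\frac{P_{nik}}{P_{ni(k-1)}}\right) = B_n - D_i - F_k
```

Under this formulation all items share one set of thresholds F_k, which is why a single category probability curve can describe the whole five-point response set.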
The following describes the results of the RASCH analysis on the green character instrument.

Green Character Instruments Reliability
The results of the measurement of reliability and separation of the item and person indices of the instrument are shown in Table 5.

Fit Analysis of Green Character Statistic Instrument
The results of the item fit analysis of the green character instrument are shown in Table 6.

Wright Map
Wright map analysis was performed to determine the level of difficulty of the items (Saefi et al., 2020; Scoulas et al., 2021). The Wright map is shown in Figure 3. The next stage in instrument testing is rating scale diagnostics, which evaluates the clarity and ease of interpretation of the response set in the instrument (Kim & Kyllonen, 2006). The results of the rating scale diagnostics are shown in Figure 4.

Figure 4. Probability Category Curve of the Green Character Instrument

Differential Item Functioning (DIF) Analysis
DIF analysis was conducted to determine whether different subgroups, in this case gender, responded to items differently (Iseppi et al., 2021). The results of the DIF analysis are shown in Figure 5.

Discussion
This study tested the green character instrument consisting of 40 items coded from A1 to A40. The first step in testing the relationship between variables in the instrument was factor analysis. The EFA results in Table 3 show that the Eigenvalues are greater than 1 (ranging from 1.54 to 4.77). The Eigenvalue is a measure used to determine the number of factors formed (Larsen & Warne, 2010). Based on the Eigenvalues, the 5 formed constructs are fit. This is in accordance with Yong and Pearce (2013), who state that an Eigenvalue of more than 1 indicates that the factor has met the fit criteria. Table 3 also shows the variance explained by each factor (ranging from 3.80% to 11.90%), with a cumulative variance of 38.10%. The cumulative variance is relatively small, as the cumulative variance for humanities research usually ranges from 50% to 60% (Pett et al., 2011). However, the resulting variance is still acceptable because the other EFA criteria have been met. The low variance is thought to be caused by the maximum likelihood extraction method used. According to Costello and Osborne (2005), extraction by principal component analysis (PCA) produces a greater variance than the maximum likelihood (ML) method. This happens because PCA does not separate unique variance from shared variance, so it sets all item communalities at 1.0, whereas ML estimates the level of shared variance for the items, which ranged from 0.39 to 0.70.
The average interitem correlation within the factors ranges from 0.31 to 0.60 (Table 3). This indicates a strong relationship between the items in the same factor. According to Tabachnick and Fidell (2014), an interitem correlation that exceeds 0.3 indicates good factorability in the EFA. Table 3 also shows that the average interfactor correlation, which ranges from 0.02 to 0.07, is smaller than the average interitem correlation within factors. This indicates that the instrument has good specificity, meaning that it can distinguish each factor from the others based on the correlation values (Trumpower et al., 2010). The Cronbach's alpha results in Table 3 show reliability values ranging from 0.74 to 0.85, so the instrument has good reliability. A reliability value above 0.7 indicates that the instrument is reliable and acceptable (Yu & Richardson, 2015).
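The Cronbach's alpha values mentioned above follow the standard formula alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). The sketch below is a minimal illustration of that formula; the function name and the response matrix are hypothetical and are not the study data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondent rows (one score per item)."""
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item (column) across respondents.
    item_var_sum = sum(variance(col) for col in zip(*responses))
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical answers of five respondents to three items on the 1-5 scale.
example = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 5],
    [2, 3, 2],
    [4, 4, 3],
]
print(round(cronbach_alpha(example), 3))
```

In practice the function would be applied separately to the items of each of the five factors.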
To test the consistency of the formed factors, a confirmatory factor analysis was performed (Tomé-Fernández et al., 2020). CFA was conducted on 5 factors and 36 items. The factors are Environmental Behavior (EnB), Environmental Knowledge (EnK), Environmental Value (EnV), Environmental Attitude (EnA), and Environmental Habits (EnH). The model fit criteria are based on the Root Mean Square Error of Approximation (RMSEA), Goodness of Fit Index (GFI), Comparative Fit Index (CFI), Tucker-Lewis Index (TLI), and χ²/df (Sun, 2005). The results of the CFA in Table 4 show that all fit criteria have been met by the model. The obtained values are RMSEA = 0.036, CFI = 0.952, TLI = 0.948, GFI = 0.957, and χ²/df = 2.802. All of these values meet the model fit criteria (Hidayat et al., 2018; Nye & Drasgow, 2011; Prudon, 2014). The final measurement model was then used to examine item validity and reliability with the RASCH model (Susongko, 2016). The analysis using the RASCH model covers (1) instrument reliability, (2) instrument item quality, (3) level of difficulty of the items, (4) clarity of the items, and (5) item bias.
Instrument reliability was examined for the five constructs, namely environmental behavior, knowledge, values, attitudes, and habits. The item reliability values for each domain ranged from 0.99 to 1.00, with item separation values ranging from 9.63 to 24.44. A reliability value above 0.9 indicates that the instrument's reliability is in the good category (Saefi et al., 2020), while a separation index of > 2.0 indicates that the RASCH measurement can distinguish the instrument into several different groups or domains (Ismail et al., 2020). In addition, person reliability ranged from 0.65 to 0.83, which is in the fairly good category (Sumintono & Widhiarso, 2015), with separation index values ranging from 1 to above 2. These results indicate that the instrument can distinguish respondents' abilities, that is, respondents with high and low performance (Ismail et al., 2020).
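Reliability and separation in Rasch measurement are linked by the standard relationship R = G² / (1 + G²), where G is the separation index. The sketch below is a minimal illustration of that conversion; the function names are hypothetical. It suggests that the reported person reliabilities of 0.65 and 0.83 correspond to separations of roughly 1.36 and 2.21, and that item separations of 9.63 and 24.44 correspond to reliabilities that round to 0.99 and 1.00.

```python
import math


def separation_from_reliability(reliability):
    """Separation index G implied by a reliability R, using G = sqrt(R / (1 - R))."""
    if not 0 <= reliability < 1:
        raise ValueError("reliability must be in [0, 1)")
    return math.sqrt(reliability / (1 - reliability))


def reliability_from_separation(separation):
    """Reliability R implied by a separation index G, using R = G^2 / (1 + G^2)."""
    return separation ** 2 / (1 + separation ** 2)


# Person reliabilities reported in the text.
for r in (0.65, 0.83):
    print(r, round(separation_from_reliability(r), 2))

# Item separation values reported in the text.
for g in (9.63, 24.44):
    print(g, round(reliability_from_separation(g), 2))
```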
The fit index indicates the quality of the items in the instrument by revealing how accurately the data fit the model (Scoulas et al., 2021). The fit references used in this study are the infit/outfit MNSQ values and the PTMEA, while the infit/outfit ZSTD values are ignored because the sample used in this research is larger than 500 (Sumintono & Widhiarso, 2015). The MNSQ value is used as an indicator of item discrepancy in the RASCH model (Ismail et al., 2020), while the PTMEA is used to determine whether the instrument can distinguish respondents according to their response level (Saefi et al., 2020).
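For readers unfamiliar with the mean-square statistics, their standard definitions are given below. The notation is an assumption introduced here: x_ni is the observed response of person n to item i, E_ni is its model-expected value, W_ni is its model variance, and N is the number of persons.

```latex
% Standardized residual
z_{ni} = \frac{x_{ni} - E_{ni}}{\sqrt{W_{ni}}}

% Outfit mean-square for item i: unweighted average of squared residuals
\mathrm{Outfit\ MNSQ}_i = \frac{1}{N}\sum_{n=1}^{N} z_{ni}^2

% Infit mean-square for item i: information-weighted average
\mathrm{Infit\ MNSQ}_i
  = \frac{\sum_{n=1}^{N} W_{ni}\, z_{ni}^2}{\sum_{n=1}^{N} W_{ni}}
  = \frac{\sum_{n=1}^{N} (x_{ni} - E_{ni})^2}{\sum_{n=1}^{N} W_{ni}}
```

Both statistics have an expected value of 1 when the data fit the model, which is why the acceptance range is centered near 1.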
The results of the item fit analysis in Table 6 show that two items do not meet the fit index criteria: EnA5 in the environmental attitude construct and EnH2 in the environmental habits construct. The infit/outfit MNSQ and PTMEA values for each of these items are outside the predetermined ranges (Bond & Fox, 2007; Linacre, 2018). In this study, item acceptance was determined by three criteria, namely infit MNSQ, outfit MNSQ, and PTMEA. If an item meets one of the predetermined fit index criteria, the item can be accepted (Sumintono & Widhiarso, 2015). This result differs from the results of the exploratory and confirmatory factor analyses, in which the factor loadings for items EnA5 and EnH2 are 0.464 and 0.721, respectively (Table 2). These factor loadings are quite large and acceptable (Prasetyo et al., 2019), but based on the item fit analysis using RASCH, both items do not meet the criteria and are declared invalid. This study therefore found a discrepancy between the results of the CFA and the RASCH model. According to Scoulas et al. (2021), the RASCH model can detect potential measurement problems, such as item bias or local item dependencies, that may arise when measuring with classical validation methods such as factor analysis. On this basis, the researchers eliminated both items as invalid.
The analysis of item difficulty through the Wright map in Figure 3 showed that only 4 items, namely EnB9, EnV7, EnV1, and EnH2, were considered difficult by respondents. No items in the environmental knowledge component were categorized as difficult for respondents to understand. Overall, the statements in the instrument can be easily understood by respondents. This shows that the green character instrument has met the criteria for a good item difficulty level.
The rating scale visualization in Figure 4 shows that the probabilities of the response categories in the green character instrument follow the recommended pattern. Each category has a distinct peak at some point along the scale, as expected (Scoulas et al., 2021). Thus, it can be concluded that the response set of the green character instrument is functioning properly (Saefi et al., 2020). The final stage of item testing used the DIF test to detect item bias. DIF analysis was used specifically to compare how male and female students answered each item and thereby find out whether the items were biased. Biased items are indicated by differences in the ability to answer between male and female students. To overcome such bias, Iseppi et al. (2021) suggested making two separate items, one for men and another for women. The results of the DIF analysis in Figure 5 show that there is no bias, as evidenced by the male and female response curves staying close to the normal line (green). This indicates that the items in the instrument are free from bias and can be used to reveal green character for both male and female respondents.
The final green character instrument consists of five constructs with a total of 34 items (4 items were eliminated after EFA and CFA, and 2 items were eliminated after RASCH). The five constructs, namely Environmental Behavior (EnB), Environmental Knowledge (EnK), Environmental Value (EnV), Environmental Attitude (EnA), and Environmental Habits (EnH), were confirmed through the CFA and met the goodness-of-fit criteria (Table 4). These results indicate that the construct validity of the instrument has been met. This finding is in line with the theories that underlie this research, namely those tested by Stern (2000) regarding environmental behavior, Raymond et al. (2010) regarding environmental knowledge, Thompson and Barton (1994) regarding environmental values, and Dunlap et al. (2000) regarding attitudes towards the environment. Based on the results of the content validity analysis, which includes the item fit test, person-item map, and rating scale diagnostics, and the consequential validity analysis, which includes the DIF analysis, the green character instrument is declared eligible and has met the predetermined standard criteria. However, this study revealed that one of the constructs, Environmental Habits (EnH), was reduced to only one statement item. The researchers regard this small number of items as a limitation of the instrument. Nevertheless, based on the empirical results of the EFA, CFA, and RASCH analyses, the questionnaire has met the standards of instrument development, so it can be used to measure students' green character.

Conclusion
This study showed that the green character instrument met the criteria for item validity and reliability using the EFA, CFA, and RASCH models. The EFA showed factor loadings ranging from approximately 0.314 to 0.772, with initial eigenvalues between 1.54 and 4.77. The CFA confirmed a good model fit, with χ²/df, RMSEA, GFI, CFI, and TLI all in the good category. The EFA and CFA resulted in 36 items after 4 substandard items were eliminated. A further RASCH analysis of the 36 items left 34 items; 2 items were deleted because they did not reach the standard infit/outfit MNSQ and PTMEA values. The final 34 items fit the EFA, CFA, and RASCH models. This instrument can reveal knowledge, behavior, values, attitudes, and habits towards the environment. Although discrepancies were found between the results of the factor analyses and the RASCH analysis, these three types of validity measurement should be used together so that they can complement one another.

Recommendations
Further research can be conducted to test the precision of the instruments that have been produced in revealing the students' green character in various demographic conditions. In addition, to obtain more comprehensive results, further research can be carried out at lower levels of education such as elementary, junior high and high school. For teachers, the green character instrument can be applied through a modified instrument for suitable materials and topics.

Limitations
The environmental habits construct has too few items, which may leave parts of the construct unmeasured. Therefore, further research can develop more items so that they represent the construct and yield more valid and reliable results.