The Academic Buoyancy Scale: Measurement Invariance across Culture and Gender in Egyptian and Omani Undergraduates

The Academic Buoyancy Scale (ABS) is one of the most widely used instruments for measuring academic buoyancy. To obtain meaningful and valid comparisons across groups using the ABS, however, measurement invariance should be established first. To that end, we examined its measurement invariance, validity evidence based on relations to other variables, and score reliability using categorical omega across culture and gender among Egyptian and Omani undergraduates. Participants were 345 college students: an Egyptian sample (n = 191) and an Omani sample (n = 154). To assess measurement invariance across culture and gender, multiple-group confirmatory factor analysis was performed with four successive invariance models: (a) configural, (b) metric, (c) scalar, and (d) residual. Results revealed that the unidimensional baseline model had adequate fit to the data in the full sample. Moreover, measurement invariance held across culture but not across gender; consequently, the ABS can be used to make valid cross-cultural comparisons between Egyptian and Omani students, but it cannot be used to support valid inferences about gender differences within each culture. Validity evidence based on relations to other variables was supported by moderate, statistically significant correlations between ABS scores and academic achievement (GPA; r = .435 and r = .457, p < .01) for the Egyptian and Omani samples, respectively. With regard to score reliability, categorical omega coefficients were moderate in both samples. Educational and psychological implications, limitations, and suggestions for improving the scale are discussed.


Introduction
It is well documented that stress is a part of students' academic life (e.g., Reddy et al., 2018). Specifically, approximately 80% of students encounter problematic levels of academic setbacks (Walker et al., 2005). It is therefore necessary to understand why certain students bounce back strongly. This explains the recent shift in emphasis from students' disruptive behaviors in academic settings to motivational factors, such as buoyancy, that affect students' ability to face challenges (Martin & Marsh, 2006). Buoyancy appeared in the literature to distinguish students who could manage difficult situations from those who could not, so that interventions could be implemented (Kanevsky et al., 2008). Academic buoyancy has recently become an evolving topic in educational and school psychology (Martin & Marsh, 2009; Olendo et al., 2019). Martin (2013) argued that without academic buoyancy, motivated students' gains might be lost. Academic buoyancy has also been found to be a significant factor influencing students' ability to face academic challenges (Khalaf, 2014; Martin, 2013; Martin & Marsh, 2008b). Relatedly, Martin (2014) concluded that "there is merit in widely promoting and fostering academic buoyancy among students" (p. 86). Furthermore, it was found to be positively correlated with emotional and behavioral engagement, and academically buoyant students were active participants in academic tasks (Datu & Yang, 2018). Despite the significance of academic buoyancy, there is a paucity of applied research on establishing measurement invariance of commonly used instruments, particularly for conducting cross-cultural research. Thus, the overall objective of the current study is to assess measurement invariance of the Academic Buoyancy Scale (hereafter referred to as ABS; Martin & Marsh, 2008a) across culture and gender among Egyptian and Omani undergraduates, given that the scale was first developed and initially validated in Australia.
A second main objective of the study is to collect validity evidence based on relations to other variables (e.g., academic achievement). Third, there is a critical need to use congeneric reliability estimators (e.g., categorical omega) to estimate score reliability of the ABS, given the serious limitations of alpha (more information is provided in the data analysis section).

Conceptualization of Academic Buoyancy
Academic buoyancy can be defined as students' ability to overcome challenges that negatively impact their educational progress (Martin & Marsh, 2006). Generally, "it reflects more of an everyday academic resilience and it is distinct from the more traditional academic resilience construct" (Martin & Marsh, 2008a, p. 169). Academic resilience is mainly concerned with the relevance of resilience in educational settings (Cassidy, 2016). We argue that a common component of the previous definitions is students' ability to overcome academic challenges to achieve the intended learning outcomes.

Initial Development and Validation of the ABS
Given the significance of academic buoyancy in students' academic success, Martin and Marsh (2008a) developed the ABS to measure students' ability to deal effectively with setbacks, challenges, stress, and pressure that occur in the ordinary course of academic life. To do that, the authors developed an item pool, conducted several rounds of revision with subject matter experts, and selected four items to compose the scale. To psychometrically validate the four-item ABS, Martin and Marsh (2008a) administered it to 3,450 high school students in Australia. The authors conducted confirmatory factor analysis (hereafter referred to as CFA), which supported its unidimensionality as hypothesized. Regarding score reliability, the test-retest coefficient was .67, and alpha coefficients were .80 (time 1) and .82 (time 2).

Psychometric Evaluation of the ABS
Following the initial development and validation of the ABS, various studies have been conducted to validate the intended score interpretation in other populations. For instance, Kapikiran (2012) assessed the validity and score reliability of the ABS in a Turkish sample (192 females and 186 males). Results of exploratory factor analysis (hereafter referred to as EFA) and CFA supported its unidimensionality. With regard to its convergent and discriminant validity evidence, statistically significant correlations were found with intrinsic motivation (r = .48, p < .01), test anxiety (r = -.26, p < .01), family social support (r = .72, p < .01), self-esteem (r = .20, p < .01), and GPA (r = .24). As for its score reliability, the respective values for Cronbach's alpha and test-retest coefficients were .83 and .82. Despite utilizing EFA and CFA, the author did not test measurement invariance, which limited the usefulness of the results for valid subgroup comparisons.
Using data from 190 Egyptian undergraduates, Khalaf (2014) conducted EFA for the academic resilience scale, the base measure used to develop the ABS. The author extracted only one factor that accounted for 42.91% of the variance, with an alpha coefficient of .74. Convergent validity evidence was supported by the significant correlation with academic achievement (r = .57, p < .01). Despite using a closely related scale in an Egyptian sample, the author did not conduct CFA, test measurement invariance, or estimate congeneric reliability. Putwain et al. (2015) used a sample of 705 students in grade 11 in the UK. The authors reported a significantly negative correlation between academic buoyancy and performance on a high-stakes test, and a positive correlation between academic buoyancy and the score obtained from the general certificate of secondary education. The alpha coefficient was .77. In a sample of 402 Filipino university students, Datu and Yang (2018) supported the unidimensionality of the ABS. They also investigated its validity evidence based on relations to other variables and found a positive association between ABS scores and behavioral and emotional engagement. Kumar and Sharma (2020) administered the ABS to 400 senior secondary school students from different districts of Punjab, India. Results of EFA and CFA supported its unidimensionality. Putwain et al. (2020a) reported alpha for the total score (.80). However, no validity evidence was reported in this study. Due to its significance, Verrier et al. (2018) developed a teacher-reported version of the ABS called TABS. Using responses of 50 teachers and 100 tenth grade students, results of CFA partially supported the unidimensional structure of the scale, with RMSEA exceeding the recommended thresholds. Score reliability using alpha was .84. Validity evidence based on relations to other variables was supported since ABS positively correlated with academic achievement (r = .40, p < .01) and task management (r = .21, p < .05), but negatively correlated with disengagement (r = -.16, p < .05) and self-sabotage (r = -.16, p < .05).
In a recent score reliability study, Martin and Marsh (2019) estimated McDonald's omega coefficients across time points and reported .85 and .81 for the first and second time point, respectively. This is the only study in which McDonald's omega was reported, but the coefficient was estimated for continuous data, which is not consistent with the ordinal data collected by means of rating scales.
Before reviewing psychometric studies conducted to evaluate measurement invariance of the ABS, it is necessary to illustrate its importance for obtaining valid subgroup comparisons and, consequently, study-based inferences. Vandenberg and Lance (2000) argue that failure to establish measurement invariance is similar to failure to assess validity and reliability. Milfont and Fischer (2010) state that the establishment of measurement invariance is a prerequisite for meaningful comparisons across groups. Testing measurement invariance is likewise a prerequisite for obtaining valid cross-cultural comparisons. Abulela and Davenport (2020) argue that "testing for measurement invariance helps researchers ascertain that the instrument measures the same construct across all subgroups" (p. 35).
Previous researchers concluded that it is unfair to compare males and females on academic buoyancy without first assessing gender invariance of the ABS. It has been repeatedly argued that the contradictory gender differences reported on academic buoyancy may be due in part to failure to establish measurement invariance before conducting subgroup comparisons (e.g., Fong & Kim, 2021; Khalaf, 2014). Put differently, when subgroup differences are reported without ascertaining measurement invariance, it is difficult to attribute the observed differences to true differences in the underlying construct. These claims emphasize the persistent need to establish measurement invariance whenever scores are to be interpreted across subgroups (i.e., when conducting subgroup comparisons).
When the scale was initially developed, Martin and Marsh (2008a) used multi-group CFA (hereafter referred to as MGCFA) to establish measurement invariance of the ABS across gender and age. The authors concluded that the scale scores were invariant across gender and age groups. Martin et al. (2017) tested the cross-cultural measurement invariance of the ABS in three country groups: the USA and Canada to represent North America (n = 989), the UK (n = 1182), and China (n = 3617). Results of MGCFA supported its measurement invariance across the three groups. Putwain et al. (2015) and Datu and Yang (2018) established its measurement invariance between two time points and gender subgroups, respectively.

The ABS Validity Evidence based on Relations to Other Variables
In addition to the previous results reported in the psychometric evaluation of the scale, other studies have reported a positive correlation between academic buoyancy and other variables: (a) achievement (Miller et al., 2013), (b) positive motivational perspectives (Collie et al., 2017), and (c) higher levels of engagement in school work (Yu & Martin, 2014). Specifically, academic buoyancy was found to be a strong predictor of undergraduates' GPA (Bowen, 2010; Martin, 2014; Putwain & Daly, 2013). In sum, previous results indicate that the correlation between academic buoyancy and academic achievement, or students' GPA, can be used to establish validity evidence based on relations to other variables.

Rationale for the Current Study
Through a comprehensive and rigorous review of the literature, we found some critical limitations that may affect the utility of the published studies. First, despite the need to use the ABS with undergraduates (Bowen, 2010), most studies were conducted with student populations other than undergraduates (e.g., Kumar & Sharma, 2020; Putwain et al., 2015; Verrier et al., 2018). Second, score reliability coefficients were mostly reported by means of coefficient alpha despite the strict assumptions of the essentially tau-equivalent model. Third, most of the reported results were not discussed within the framework of the intended interpretation and uses of scores (American Educational Research Association [AERA] et al., 2014). Fourth, some authors recommended conducting more investigations to evaluate the psychometric properties of the ABS in other non-western populations (Datu & Yang, 2018). According to Bernardo (2011), it is not logical to assume that all measurement instruments function equally across cultural subgroups. Thus, research on the cross-cultural measurement invariance of the ABS is still in its infancy compared to validity and reliability studies.
To our knowledge, no study to date has attempted to validate the intended interpretation and uses of the ABS in either an Egyptian or an Omani context. To that end, the overall objective of the current study is to assess the measurement invariance of the ABS across culture (Egypt vs. Oman) and gender (males vs. females). Second, there is a need to establish validity evidence based on relations to other variables. Third, given that all but one of the published studies reported alpha or test-retest coefficients for estimating score reliability, we estimated categorical omega given its robustness with ordinal data. Results of the study have the potential to enrich the psychometric literature on the ABS and consequently provide more evidence about its validity and measurement invariance for cross-cultural subgroup comparisons.

Research Design
The current study is a psychometric investigation intended to utilize advanced psychometric methods to validate the intended interpretations and uses of the ABS among undergraduates.

Sample and Data Collection
Participants were 191 Egyptian undergraduates (101 females, 90 males) and 154 Omani undergraduates (72 females, 82 males). Age means were 21.49 (SD = 1.17) and 22.32 (SD = 1.39) for the Egyptian and Omani samples, respectively. All participants were recruited from two large public universities in North Upper Egypt and Muscat Governorate, Sultanate of Oman. Recent methodological and psychometric research (e.g., Finch et al., 2018) illustrated that measurement invariance analyses could be conducted when sample size was above 50 per group, and consequently sample size was not a concern in the current study.
Before data collection, the first author completed all logistics required to conduct the study, particularly data collection. After obtaining the required approvals, a paper-and-pencil academic buoyancy scale was distributed to Egyptian and Omani college students at the beginning of Fall 2019. They completed an adapted Arabic version of the ABS by rating each statement on a 7-point rating scale ranging from 1 (strongly disagree) to 7 (strongly agree). Upon their participation, they received five points towards their course credit.

Measures

Academic Buoyancy Scale (ABS)
As illustrated earlier, the final version of the ABS included four items measuring students' ability to deal with daily academic adversities. The ABS was originally developed in English. In the context of the present study, the scale was rigorously translated into Arabic using the back-translation technique adopted in previous cross-cultural studies (Brislin, 1970). Several successive steps were implemented to apply the back-translation technique properly and ensure the accuracy of translation, a key factor when using measurement instruments validated within different linguistic populations. First, two bilingual professors independently translated the four items of the scale, focusing on standard Arabic so that the items could be easily understood by the Egyptian and Omani students. Then, the two bilingual professors met and compared the two translated versions of the scale. Because the scale had only four items, they readily agreed on a single translated version, which was used in the later steps. Next, a third bilingual professor translated the Arabic version of the scale back to English (back-translation). Last, the authors compared the original English version of the scale with the back-translated version and found them consistent. This consistency is evidence for the accuracy of translation, which is the core of the back-translation technique.

Achievement Score
Achievement was operationally defined as undergraduates' cumulative GPA scores. Therefore, students were asked to report their GPA scores while responding to the ABS. These scores were then used in the correlational analysis between academic buoyancy and students' GPA to collect validity evidence based on relations to other variables.

Data Analysis
Consistent with the objectives of the study, data analyses were conducted in three steps: (a) assessing measurement invariance for the ABS across culture (Egyptian vs. Omani) and gender (males vs. females) within each culture separately, (b) collecting validity evidence based on relations to other variables by computing the correlation between ABS scores and participants' GPA as a proxy for academic achievement, and (c) estimating score reliability using categorical omega, with bootstrapping to obtain confidence intervals.

Testing Measurement Invariance
According to the Standards for Educational and Psychological Testing (American Educational Research Association [AERA] et al., 2014), researchers should collect validity evidence that supports the intended interpretations and uses of scores. If test or scale scores are to be interpreted across multiple groups, there should be validity evidence that the scale functions similarly across subgroups or, more technically, that it has measurement invariance.
To assess measurement invariance, there should be an established baseline model for the hypothesized factor structure before moving to the MGCFA. The unidimensional baseline model was tested for the two cultural groups (Egyptian vs. Omani) as well as the two gender groups (males vs. females) within each culture. The MGCFA has four hierarchical invariance steps, with additional constraints added at each step: (a) configural, (b) metric, (c) scalar, and (d) residual. In configural invariance, the hypothesized factor structure is constrained to be equal across groups. Put simply, only the pattern of loadings is constrained to be equal across groups, regardless of their magnitudes. In metric invariance, factor loadings are constrained to have equal magnitudes across groups. In scalar invariance, item intercepts are also constrained to be equal across groups. It is worth noting that establishing scalar invariance is a prerequisite for valid comparisons of group means on the underlying latent variable. Last, residual invariance involves constraining the error variances to be equal across groups.
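The nesting of the four invariance models can be made explicit with a short sketch. This is only an illustration of how each step inherits the constraints of the step before it; the names `INVARIANCE_STEPS` and `cumulative_constraints` are hypothetical and are not part of the analysis software used in the study.

```python
# Equality constraints ADDED at each step, in the order described in the text.
# (Hypothetical names, for illustration only.)
INVARIANCE_STEPS = [
    ("configural", []),                    # same factor pattern, nothing equated
    ("metric", ["loadings"]),              # + equal factor loadings
    ("scalar", ["intercepts"]),            # + equal item intercepts
    ("residual", ["residual variances"]),  # + equal error variances
]


def cumulative_constraints(steps=INVARIANCE_STEPS):
    """Return, for each model, every cross-group equality constraint in force."""
    in_force = []
    result = {}
    for name, added in steps:
        in_force = in_force + added  # each model keeps all earlier constraints
        result[name] = list(in_force)
    return result


for model, constraints in cumulative_constraints().items():
    print(f"{model:<10} -> {constraints}")
```

Because each model is nested within the previous one, a loss of fit at a given step can be attributed to the constraints added at that step.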
In the context of the present study, we followed the recommendations of Hu and Bentler (1999) for evaluating the fit of the baseline model. We utilized the χ² test statistic, the Comparative Fit Index (CFI), the Tucker-Lewis Index (TLI), the Standardized Root Mean Square Residual (SRMR), and the Root Mean Square Error of Approximation (RMSEA). The χ² test statistic should be nonsignificant to indicate that there are no significant differences between the observed and reproduced covariance matrices. However, as sample size increases, the test becomes more powerful in detecting minor differences, which can lead to incorrectly rejecting correctly hypothesized factor structures (Kline, 2005). Thus, since the χ² test statistic is sensitive to sample size, other descriptive fit indices are used. The CFI and TLI should be ≥ 0.95 to indicate the superiority of the hypothesized model over the null model, or the worst fitting model. The SRMR should be ≤ 0.08 to indicate that the standardized difference between the sample covariance matrix and the covariance matrix implied by the hypothesized model is not substantial. The RMSEA should be ≤ 0.08 to indicate little discrepancy between the hypothesized factor structure and the population covariance matrix.
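As a minimal sketch, the descriptive cutoffs stated above can be applied mechanically to a set of fit indices. The function name `check_fit` and the example values are assumptions made for illustration; they are not results from the study.

```python
# Cutoffs as stated in the text: CFI and TLI >= 0.95, SRMR and RMSEA <= 0.08.
CUTOFFS = {
    "cfi": (">=", 0.95),
    "tli": (">=", 0.95),
    "srmr": ("<=", 0.08),
    "rmsea": ("<=", 0.08),
}


def check_fit(indices):
    """Return True/False per index, according to the cutoffs above."""
    verdict = {}
    for name, (direction, cutoff) in CUTOFFS.items():
        value = indices[name]
        verdict[name] = value >= cutoff if direction == ">=" else value <= cutoff
    return verdict


# Made-up fit indices, NOT values reported in the study.
example = {"cfi": 0.97, "tli": 0.96, "srmr": 0.04, "rmsea": 0.09}
print(check_fit(example))
```

In this made-up case, three indices meet their cutoffs and the RMSEA does not, a mixed pattern that leaves the overall judgment to the analyst.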
Measurement invariance was evaluated using the goodness-of-fit criteria recommended in the relevant literature (Chen, 2007; Cheung & Rensvold, 2002; Meade et al., 2008). In addition to the goodness-of-fit indices described above, specific criteria were used: the χ² difference test, a decrease in CFI of no more than .01 (∆CFI ≥ -.01), and an increase in RMSEA of no more than .015 (∆RMSEA ≤ .015). The χ² difference test should be nonsignificant to indicate that the model still fits adequately after the additional constraints are imposed. The changes in CFI and RMSEA are computed as follows:
$$\Delta\mathrm{CFI} = \mathrm{CFI}_{\mathrm{Constrained}} - \mathrm{CFI}_{\mathrm{Unconstrained}}$$
$$\Delta\mathrm{RMSEA} = \mathrm{RMSEA}_{\mathrm{Constrained}} - \mathrm{RMSEA}_{\mathrm{Unconstrained}}$$
where ∆CFI is the change in the comparative fit index, and CFI Constrained and CFI Unconstrained are the comparative fit indices for the models with more and fewer constraints, respectively. Similarly, ∆RMSEA is the change in the root mean square error of approximation, and RMSEA Constrained and RMSEA Unconstrained are the values of that index for the more and less constrained models, respectively. All measurement invariance analyses were conducted using the lavaan package (Rosseel, 2012) available in R (R Development Core Team, 2020).
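The change-in-fit criteria can be sketched as a small decision rule. The sketch assumes that each change is computed as the more constrained model minus the less constrained model, and it reads the CFI criterion as a drop of no more than .01 and the RMSEA criterion as a rise of no more than .015. The function name `compare_nested` and all fit values are hypothetical.

```python
def compare_nested(less_constrained, more_constrained,
                   cfi_floor=-0.01, rmsea_ceiling=0.015):
    """Compare two nested invariance models on change in CFI and RMSEA.

    Changes are computed as (more constrained) - (less constrained) and
    rounded to three decimals to avoid floating-point noise at the cutoffs.
    """
    delta_cfi = round(more_constrained["cfi"] - less_constrained["cfi"], 3)
    delta_rmsea = round(more_constrained["rmsea"] - less_constrained["rmsea"], 3)
    return {
        "delta_cfi": delta_cfi,
        "delta_rmsea": delta_rmsea,
        "invariance_supported": delta_cfi >= cfi_floor and delta_rmsea <= rmsea_ceiling,
    }


# Made-up fit indices, NOT values reported in the study.
configural = {"cfi": 0.975, "rmsea": 0.050}
metric = {"cfi": 0.970, "rmsea": 0.055}
scalar = {"cfi": 0.940, "rmsea": 0.080}

print(compare_nested(configural, metric))  # small changes in both indices
print(compare_nested(metric, scalar))      # large drop in CFI and rise in RMSEA
```

In practice, these descriptive changes are read alongside the χ² difference test rather than in place of it.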

Validity Evidence Based on Relations to other Variables
If the scale is expected to correlate empirically with other variables of interest, validity evidence based on relations to other variables should be collected and reported. As illustrated earlier, academic buoyancy has repeatedly been found to correlate significantly with measures of academic achievement. Accordingly, we estimated the correlation between students' ABS scores and their academic achievement, operationalized by their GPA scores, within each culture.
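For readers who want to see the computation behind this step, the following is a minimal sketch of a Pearson correlation between scale totals and GPA. The data are invented for illustration (totals on a four-item, 7-point scale range from 4 to 28), and `pearson_r` is a hypothetical helper, not the routine used in the study.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares around the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Invented scores, NOT data from the study.
abs_total = [8, 12, 15, 20, 22, 25]
gpa = [2.1, 2.8, 2.5, 3.2, 3.0, 3.7]

print(round(pearson_r(abs_total, gpa), 3))
```

Computing the coefficient separately for each cultural group, as described above, amounts to calling the function once per group.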

Score Reliability: Categorical Omega
There has recently been a relative consensus among measurement specialists and psychometricians to report coefficient omega rather than alpha, given that the latter is sensitive to violations of the essentially tau-equivalent model, which lead to underestimation of score reliability (Dunn et al., 2014; Panayides, 2013; Raykov, 2001). Categorical omega was estimated using the MBESS package (Kelley, 2007) available in R (R Development Core Team, 2020). Estimating categorical omega is a substantial contribution to the existing literature on the ABS, since this is the first study to report categorical omega estimates with confidence intervals for ABS score reliability.
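The underestimation noted above can be illustrated with a small sketch that compares alpha and omega on the covariance matrix implied by a one-factor model. Two caveats apply: the loadings are invented, and the sketch uses the basic omega for continuous indicators, not the categorical omega estimated in the study, which additionally works from polychoric correlations and item thresholds. All function names are hypothetical.

```python
def implied_covariance(loadings, uniquenesses):
    """Covariance matrix implied by a one-factor model with factor variance 1."""
    k = len(loadings)
    return [
        [loadings[i] * loadings[j] + (uniquenesses[i] if i == j else 0.0)
         for j in range(k)]
        for i in range(k)
    ]


def coefficient_omega(loadings, uniquenesses):
    """Omega: true-score variance of the sum score over its total variance."""
    true_variance = sum(loadings) ** 2
    return true_variance / (true_variance + sum(uniquenesses))


def coefficient_alpha(cov):
    """Coefficient alpha computed from a covariance matrix."""
    k = len(cov)
    total = sum(sum(row) for row in cov)
    trace = sum(cov[i][i] for i in range(k))
    return k / (k - 1) * (1 - trace / total)


def compare(loadings):
    # Standardized items: uniqueness = 1 - loading squared
    uniquenesses = [1 - l ** 2 for l in loadings]
    cov = implied_covariance(loadings, uniquenesses)
    return coefficient_alpha(cov), coefficient_omega(loadings, uniquenesses)


# Invented standardized loadings for a four-item scale, NOT ABS estimates.
alpha_equal, omega_equal = compare([0.6, 0.6, 0.6, 0.6])      # tau-equivalent
alpha_unequal, omega_unequal = compare([0.3, 0.5, 0.7, 0.9])  # congeneric

print(round(alpha_equal, 3), round(omega_equal, 3))
print(round(alpha_unequal, 3), round(omega_unequal, 3))
```

With equal loadings the two coefficients coincide; with unequal loadings alpha falls below omega, which is the pattern that motivates reporting omega for congeneric scales.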

Validity Evidence Based on Relations to other Variables
Results of the correlational analysis revealed that academic buoyancy was positively and significantly correlated with students' GPA scores (r = .435 and r = .457, p < .01) in the Egyptian and Omani samples, respectively. These values indicate that academic buoyancy was moderately correlated with students' GPA scores.

Score Reliability: Categorical Omega
Estimates of categorical omega for score reliability were .703, .724, and .656 for the total, Egyptian, and Omani samples, respectively. We argue that the moderate magnitude of reliability estimates is in part a function of the small number of items in the ABS. Table 3 shows categorical omega estimates with their associated standard errors and 95% confidence intervals.
Note. ABS = Academic Buoyancy Scale, SE = standard error, CI = confidence interval.

Discussion
In the current study, results of CFA supported the unidimensionality of the ABS in the Egyptian and Omani samples, consistent with previous research (Kapikiran, 2012; Kumar & Sharma, 2020; Martin & Marsh, 2008a). Results of measurement invariance were in line with Putwain et al. (2015), who supported the ABS cross-cultural measurement invariance in a British population. Conversely, the lack of measurement invariance across gender groups was not consistent with some previous research in which the authors concluded that the ABS functions equally across males and females (Datu & Yang, 2018; Kapikiran, 2012; Martin & Marsh, 2008a). The moderate, statistically significant correlation between the ABS and students' GPA reported in the present study was in agreement with previously published investigations (Bowen, 2010; Kapikiran, 2012; Khalaf, 2014; Martin, 2014; Putwain & Daly, 2013; Verrier et al., 2018). Regarding score reliability, omega coefficients were somewhat low compared to previous research. This was likely due to the limited variability in students' responses.
To explain why measurement invariance may not hold across gender, we need to consider gender differences in motivational and other construct-irrelevant factors associated with academic challenges. Braten and Olaussen (1998) found that females were more highly motivated than males, which could influence their responses to motivation scales. Relatedly, Martin et al. (2015) stated that motivation factors could differ between males and females. Such motivational factors were found to play a significant role in students' capacity to deal effectively with academic challenges and setbacks. In contrast, Olendo et al. (2019) found no gender differences in academic buoyancy.
Some recently published investigations have supported the use of other constructs in collecting validity evidence based on relations to other variables. For instance, Olendo et al. (2019) found that self-efficacy was a significant predictor of academic buoyancy. Given the association between academic achievement and self-efficacy, this may explain the moderate positive association obtained in the current study, which supported validity evidence based on relations to other variables. Putwain et al. (2020b) also found that academic buoyancy could protect students from underachievement due to negative achievement emotions. This is another potential reason underlying the positive association obtained in the current study between academic buoyancy and students' achievement. Relatedly, Ursin et al. (2021) found that academic buoyancy was correlated with emotional and cognitive engagement. These results suggest that student engagement can be used in future validation studies of the ABS to collect validity evidence based on relations to other variables.

Conclusion
Given the need to assess undergraduates' academic buoyancy due to its association with their academic achievement, the present study was an attempt to assess measurement invariance of the ABS, collect validity evidence based on relations to other variables, and estimate its congeneric score reliability using categorical omega. Results of measurement invariance analyses indicated that the ABS functions equivalently across the Egyptian and Omani cultures but not across gender subgroups. Validity evidence based on relations to other variables was also established through the moderate correlation between ABS scores and undergraduates' GPA scores.
Categorical omega coefficients provided moderate support for ABS score reliability in the two studied populations. To conclude, the current study adds to the educational research literature by investigating the psychometric properties of the ABS in a non-western population. Additionally, using categorical omega to estimate score reliability is a substantial contribution, given its appropriateness for ordinal data and the less biased estimates it yields. This psychometric methodology for estimating score reliability can be applied in future validation studies not only of the ABS but also of other educational and psychological scales.

Recommendations
Findings of the current study present robust evidence about the measurement invariance of the ABS in a non-western educational context. It is the first study that addressed measurement invariance of the ABS across culture and gender in both countries (Egypt and Oman). These results have the potential to set the stage for conducting more research within the two populations. Specifically, researchers at the teaching and learning centers affiliated with Arab universities can utilize the ABS to conduct cross-cultural research and intervene accordingly. Since academic buoyancy was found to be positively correlated with students' GPA, these findings highlight the importance of integrating academic buoyancy boosters (high motivators), reducing mufflers (constrained motivators) and avoiding guzzlers (reduced motivators) among undergraduates.
The current psychometric investigation could serve as the basis for future research in the Arab world. Specifically, more research is needed to cross-validate the results obtained in the study at different educational institutions. Future research should include additional measures of achievement such as teacher ratings or school records. A possible direction for future research is to compare students' self-appraisal of academic buoyancy with their teachers' appraisal. Regarding the lack of measurement invariance across gender, more research is needed to qualitatively examine the four items of the ABS and look for sources of construct-irrelevant variance that may underlie the lack of invariance based on gender within each culture. Finally, given that we are in the era of COVID-19, future research may address the association between academic buoyancy and cognitive load in online learning environments.

Limitations
Although this study highlights the importance of cross-culturally validated measurement instruments, results should be interpreted in light of some limitations. The first and probably the most critical limitation is that the ABS is a short self-report measure, and therefore response bias is an inherent constraint. Another important challenge encountered in this study was students' reluctance to disclose their GPA, which reduced the number of participants' scores available for the correlational analysis. Despite the establishment of measurement invariance of the ABS between Egyptian and Omani undergraduates, such results should be generalized with caution, given the necessity of validating the scale across other Arab countries.

Authorship Contribution Statement
The two authors worked collaboratively on the manuscript. Both authors actively worked on the conceptualization, design, data collection and analysis, and drafting of the manuscript. They conducted critical revision of the manuscript and approved all content.