Exploring the relationship among soccer-related knowledge, attitude, practice, and self-health in Chinese campus soccer education

Summary
China has promoted campus soccer for over a decade because of its potential health benefits. This study explored soccer knowledge (SK), soccer attitude (SA), soccer practice (SP), and health status among Chinese freshman and sophomore undergraduates who had received campus soccer education. Of the 7,419 participants, 1,069 provided valid responses and were included in the analysis. Structural equation modeling (SEM) results indicated that SK was positively associated with SA (p < 0.001) but negatively associated with SP (p < 0.01). SA was positively linked to SP (p < 0.001). SK indirectly affected SP through SA (Z = 13.677). A random forest tuned with tree-structured Parzen estimators (RF-TPE) and interpreted with SHAP indicated that SP holds primary importance, with a strong negative impact on health. Additionally, the importance rankings of SK, SA, and SP differed between gender groups and between urban and rural groups. These results reveal that current campus soccer education is suboptimal for health promotion.


SEM fit tests
The fit performance of the SEM is an important criterion for evaluating its validity. The better the model fits, the more explanatory validity the model has for the data. Based on previous studies,28 evaluating model fit involves a rigorous assessment through multiple indicators, each addressing a different aspect of fit between the model and the observed data. The chi-square to degrees of freedom ratio gauges the model's overall discrepancy per degree of freedom, with lower values indicating a better fit. The root-mean-square error of approximation considers the approximation error per degree of freedom, with lower values indicating a better fit. Comparative indicators such as the normed fit index, relative fit index, incremental fit index, Tucker-Lewis index, and comparative fit index contrast the proposed model against a baseline model, with values nearing 1 signifying superior fit. Finally, the goodness of fit index and adjusted goodness of fit index reflect the proportion of variance explained by the model, with higher values preferred. Collectively, these indicators provide a multifaceted evaluation of the SEM's fit: meeting or exceeding the respective thresholds across these metrics indicates a robust and valid model that adequately represents the observed data. Table 1 demonstrates that these indicators meet the established criteria, indicating that the model exhibits good fit performance.
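As a rough illustration of how several of the indicators described above are derived, the following sketch computes them from the chi-square statistics of a fitted model and its baseline (independence) model. The chi-square values and degrees of freedom below are hypothetical placeholders, not the values behind Table 1; only the sample size of 1,069 comes from the study. The RMSEA here uses N - 1 in the denominator, whereas some software uses N. GFI and AGFI are omitted because they require the observed and implied covariance matrices.

```python
import math


def fit_indices(chi2_m, df_m, chi2_b, df_b, n):
    """Common SEM fit indices from model (m) and baseline (b) chi-square statistics."""
    d_m = max(chi2_m - df_m, 0.0)  # noncentrality estimate, proposed model
    d_b = max(chi2_b - df_b, 0.0)  # noncentrality estimate, baseline model
    return {
        "CMIN/DF": chi2_m / df_m,
        "RMSEA": math.sqrt(d_m / (df_m * (n - 1))),
        "NFI": (chi2_b - chi2_m) / chi2_b,
        "RFI": 1.0 - (chi2_m / df_m) / (chi2_b / df_b),
        "IFI": (chi2_b - chi2_m) / (chi2_b - df_m),
        "TLI": (chi2_b / df_b - chi2_m / df_m) / (chi2_b / df_b - 1.0),
        "CFI": 1.0 - d_m / max(d_m, d_b) if max(d_m, d_b) > 0 else 1.0,
    }


# Hypothetical chi-square inputs (assumptions for illustration only).
indices = fit_indices(chi2_m=85.0, df_m=32, chi2_b=4200.0, df_b=45, n=1069)
for name, value in indices.items():
    print(f"{name}: {value:.3f}")
```

Each index can then be compared against the threshold adopted for it in the study's fit summary.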

Structural results
Table 2 and Figure 2 show the structural results. The results show that two of the hypotheses proposed in this study are supported and four are rejected. The direct and positive effect of knowledge on attitude is statistically significant (standardized direct effect b = 0.621, p < 0.001), so H1 is supported. The direct and negative effect of knowledge on practice is statistically significant (standardized direct effect b = -0.076, p < 0.001). However, because the path coefficient is negative, H2 is rejected. The direct and positive effect of attitude on practice is statistically significant (standardized direct effect b = 0.957, p < 0.001), so H4 is supported. There are no significant direct effects of attitude on health (standardized direct effect b = 0.053, p > 0.05), practice on health (standardized direct effect b = 0.09, p > 0.05), or knowledge on health (standardized direct effect b = 0.065, p > 0.05). Thus, H3, H5, and H6 are rejected.
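To see how the direct paths combine, the short calculation below applies the usual path-tracing rule to the standardized coefficients reported above: an indirect effect is the product of the paths along the route, and the total effect is the direct effect plus the indirect effect. This is only a back-of-the-envelope sketch; the variable names are illustrative, and the bootstrapped estimates in the paper's tables may differ from these products.

```python
# Standardized direct effects reported in the structural results.
k_to_a = 0.621   # knowledge -> attitude (H1)
a_to_p = 0.957   # attitude -> practice (H4)
k_to_p = -0.076  # knowledge -> practice (H2)

# Path tracing: the indirect effect is the product of the coefficients on the route.
indirect_k_to_p = k_to_a * a_to_p
total_k_to_p = k_to_p + indirect_k_to_p

print(f"indirect K -> A -> P: {indirect_k_to_p:.3f}")
print(f"total K -> P:         {total_k_to_p:.3f}")
```

Under this rule, the small negative direct path from knowledge to practice is outweighed by the positive route through attitude.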

Mediating effect analysis
There are many methods for analyzing mediation effects, such as the causal-step method and the Sobel test. However, previous studies29,30 indicated that mediation analysis with bootstrapping is more accurate than these two methods. Thus, this study adopted bootstrapping to analyze the mediating effects, using 5,000 bootstrap samples and 95% confidence intervals. The asymptotic critical ratio (Z) and the lower and upper bounds of the confidence intervals (95% BC, 95% percentile) were used to test whether the indirect effects were significant. An indirect effect is statistically significant when Z > 1.96 and the 95% confidence interval does not contain zero. Table 3 shows that the indirect effect of K -> A -> P is statistically significant (Z > 1.96, 95% confidence interval does not contain zero), whereas the indirect effects of K -> A -> H, K -> A -> P -> H, and K -> P -> H are not significant.
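The decision rule above can be sketched with a plain percentile bootstrap. This is a simplified stand-in, not the study's procedure: it uses ordinary least squares on synthetic observed scores rather than a latent-variable SEM, it reports only the percentile interval (not the bias-corrected one), and all data, function names, and coefficients are assumptions made for illustration. The demo uses 1,000 resamples to keep the run short, whereas the study used 5,000.

```python
import random
import statistics


def path_coefficients(k, a, p):
    """Return (a_path, b_path): slope of A on K, and slope of P on A controlling for K."""
    n = len(k)
    mk, ma, mp = sum(k) / n, sum(a) / n, sum(p) / n
    skk = sum((x - mk) ** 2 for x in k)
    saa = sum((y - ma) ** 2 for y in a)
    ska = sum((x - mk) * (y - ma) for x, y in zip(k, a))
    skp = sum((x - mk) * (z - mp) for x, z in zip(k, p))
    sap = sum((y - ma) * (z - mp) for y, z in zip(a, p))
    a_path = ska / skk
    b_path = (skk * sap - ska * skp) / (skk * saa - ska ** 2)
    return a_path, b_path


def bootstrap_indirect(k, a, p, n_boot=5000, seed=0):
    """Percentile bootstrap for the indirect effect a_path * b_path."""
    rng = random.Random(seed)
    n = len(k)
    a_hat, b_hat = path_coefficients(k, a, p)
    draws = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        a_s, b_s = path_coefficients(
            [k[i] for i in idx], [a[i] for i in idx], [p[i] for i in idx]
        )
        draws.append(a_s * b_s)
    draws.sort()
    se = statistics.stdev(draws)  # bootstrap standard error
    ci = (draws[round(0.025 * n_boot)], draws[round(0.975 * n_boot) - 1])
    indirect = a_hat * b_hat
    return {"indirect": indirect, "z": indirect / se, "ci": ci}


# Synthetic scores (assumed data, not the survey responses).
data_rng = random.Random(42)
knowledge = [data_rng.gauss(0, 1) for _ in range(200)]
attitude = [0.6 * x + data_rng.gauss(0, 0.8) for x in knowledge]
practice = [0.9 * y - 0.1 * x + data_rng.gauss(0, 0.5)
            for x, y in zip(knowledge, attitude)]

result = bootstrap_indirect(knowledge, attitude, practice, n_boot=1000)
significant = result["z"] > 1.96 and not (result["ci"][0] <= 0 <= result["ci"][1])
print(result, significant)
```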

Machine learning model performance comparison
Table 4 shows the testing performance of the four machine learning models in 10-fold cross-validation. The results show that RF-TPE has the best performance on all four key metrics, that is, recall, precision, accuracy, and F1. Thus, it is appropriate to select RF-TPE for further analysis with SHAP.
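The comparison protocol (10-fold cross-validation summarized as mean and standard deviation of four metrics) can be sketched as follows. The classifier here is a deliberately trivial one-feature threshold rule standing in for the tree-based models, and the data are synthetic; both are assumptions for illustration and say nothing about the performance reported in Table 4.

```python
import random
import statistics


def classification_metrics(y_true, y_pred):
    """Recall, precision, accuracy, and F1 from binary labels (1 = healthy)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, q in pairs if t == 1 and q == 1)
    tn = sum(1 for t, q in pairs if t == 0 and q == 0)
    fp = sum(1 for t, q in pairs if t == 0 and q == 1)
    fn = sum(1 for t, q in pairs if t == 1 and q == 0)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"recall": recall, "precision": precision,
            "accuracy": (tp + tn) / len(pairs), "f1": f1}


def fit_threshold(xs, ys):
    """Toy stand-in model: midpoint between the two class means of one feature."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return (statistics.mean(pos) + statistics.mean(neg)) / 2


def cross_validate(xs, ys, k=10, seed=0):
    """Return {metric: (mean, standard deviation)} over k held-out folds."""
    order = list(range(len(xs)))
    random.Random(seed).shuffle(order)
    folds = [order[i::k] for i in range(k)]
    per_fold = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in order if i not in held_out]
        threshold = fit_threshold([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        preds = [1 if xs[i] > threshold else 0 for i in test_idx]
        per_fold.append(classification_metrics([ys[i] for i in test_idx], preds))
    return {name: (statistics.mean([f[name] for f in per_fold]),
                   statistics.stdev([f[name] for f in per_fold]))
            for name in per_fold[0]}


# Synthetic data (assumed): the positive class has a higher feature value on average.
rng = random.Random(1)
labels = [rng.randint(0, 1) for _ in range(300)]
feature = [rng.gauss(1.5 * y, 1.0) for y in labels]

summary = cross_validate(feature, labels, k=10)
for name, (mean, sd) in summary.items():
    print(f"{name}: {mean:.3f} ± {sd:.3f}")
```

Running the same loop for each candidate model and comparing the summaries mirrors the selection step described above.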

Global analysis
Figure 3A shows SHAP's global interpretation of the RF-TPE model on all data. The three features are sorted according to their importance: practice ranks top, followed by knowledge and attitude. As shown in Figures 3B and 3C, the SHAP summary plot provides more detail about the relationship between features and outputs. The result shows that the higher the values of practice, the more likely students are to be unhealthy, indicating that practice has a strong negative impact on health.
The x axis and left y axis in Figures 4C and 4D show that knowledge values have a nonlinear relationship with their SHAP values, with a trend shift at a knowledge value of approximately 0.8. Figure 4A shows that attitude values exhibit a nonlinear relationship with their SHAP values, with a trend shift at an attitude value of approximately 0.7. Figure 4B shows that the relationship between practice values and their SHAP values is more random across the range of 0-1.
According to the x axis and right y axis, Figures 4A-4D show the relationships between the features, all of which are nonlinear. Figures 4A and 4B show that the higher the attitude, the more likely the practice is to be higher. Figures 4C and 4D show that the higher the knowledge, the more likely the attitude and practice are to be higher.
As shown in Table 5, the two gender groups differed significantly in knowledge (p = 0.014), attitude (p = 0.000), and practice (p = 0.000). The city and village groups differed significantly in knowledge (p = 0.004) and practice (p = 0.036) but not in attitude.
Figure 5 shows the local analytical results in the gender group. The results in the male group show that practice ranks first, followed by attitude and knowledge, indicating that practice has a strong negative impact on health. The results in the female group show that knowledge ranks top, followed by practice and attitude, indicating that knowledge has a strong positive impact and that practice has a negative impact on health. Figure 6 shows the local analytical results in the city vs. village group. The results in the city group indicate that practice ranks top, followed by knowledge and attitude, which is consistent with the global analytical results. The results in the village group show that knowledge ranks top, followed by attitude and practice.
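The global and group-wise rankings above rest on Shapley values: each feature's contribution to a prediction is averaged over all orders in which features could be added, and global importance is the mean absolute contribution across samples. The sketch below computes exact Shapley values by enumerating feature subsets for a three-feature toy scoring function. It is not the TreeSHAP algorithm applied to the fitted random forest; `toy_model`, its coefficients, and the sample rows are invented for illustration and are not estimates from the study.

```python
from itertools import combinations
from math import factorial

FEATURES = ("knowledge", "attitude", "practice")


def toy_model(x):
    """Hypothetical scoring function standing in for the fitted classifier."""
    k, a, p = x
    return 0.5 + 0.2 * k + 0.1 * a - 0.4 * p + 0.1 * k * a


def coalition_value(model, x, background, subset):
    """Mean output when features in `subset` take x's values and the rest vary over background rows."""
    total = 0.0
    for row in background:
        mixed = [x[j] if j in subset else row[j] for j in range(len(x))]
        total += model(mixed)
    return total / len(background)


def shapley_values(model, x, background):
    """Exact Shapley values by enumerating every subset of the other features."""
    m = len(x)
    phi = []
    for j in range(m):
        others = [i for i in range(m) if i != j]
        contribution = 0.0
        for size in range(m):
            weight = factorial(size) * factorial(m - size - 1) / factorial(m)
            for subset in combinations(others, size):
                with_j = coalition_value(model, x, background, set(subset) | {j})
                without_j = coalition_value(model, x, background, set(subset))
                contribution += weight * (with_j - without_j)
        phi.append(contribution)
    return phi


# Assumed Min-Max-scaled rows (knowledge, attitude, practice); illustrative only.
rows = [(0.2, 0.3, 0.1), (0.8, 0.7, 0.9), (0.5, 0.5, 0.4), (0.9, 0.2, 0.6)]
base_value = sum(toy_model(r) for r in rows) / len(rows)
all_phi = [shapley_values(toy_model, r, rows) for r in rows]

# Global importance: mean absolute Shapley value per feature.
global_importance = {
    name: sum(abs(phi[j]) for phi in all_phi) / len(all_phi)
    for j, name in enumerate(FEATURES)
}
print(global_importance)
```

Restricting `rows` to one subgroup (for example, one gender or one residence group) before averaging gives the kind of group-level ranking reported for Figures 5 and 6.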

DISCUSSION
This study is an inquiry into the effects of campus soccer education activities on young people's health based on the KAP hypothesis model with an added health component.22 It aims to identify the key challenges that have arisen in more than a decade of implementing campus soccer education. The strong correlation between exercise and health has been well established by academic evidence. Hence, it was plausible to include health in the proposed framework. This study confirms through SEM that the KAP model adequately explains campus soccer behavior. Moreover, it uses both SEM and machine learning methods to assess, from linear and nonlinear perspectives, how the knowledge, attitudes, and behaviors acquired by young people through campus soccer education influence their health outcomes. This study reveals that soccer knowledge has a significant impact on the attitudes of young people toward soccer, which is consistent with Dishman's findings.31 He discovered that as soccer knowledge increases, so does positive attitude. Meanwhile, Vanttinen et al. contend that better soccer knowledge fosters a favorable outlook on soccer.32 This study also demonstrates that attitude influences behavior considerably, in line with previous findings on this topic.33 A previous study verified that attitude is a crucial predictor of individual behavior.33 Regarding soccer, Schei argues that an optimistic attitude enhances one's performance and conduct.34
Contrary to previous research, this study shows that soccer knowledge does not directly promote positive soccer practice. Fabrigar indicated that individuals with superior soccer knowledge exhibit better soccer practice.35 Moreover, higher soccer knowledge implies more involvement in soccer activities.36 However, these studies overlooked the effects of attitude. A previous study indicated that knowledge and practice are only weakly linked.35 Based on the KAP hypothesis model, individual behavior results from a sequential process of acquiring knowledge, forming beliefs, and acting accordingly. Knowledge may be necessary but not sufficient for practice. Hence, this study employed SEM to examine the mediating effect of attitude. The findings revealed that soccer knowledge significantly influences soccer practice through soccer attitude.
8][39][40][41][42] Previous studies have shown that individuals with positive attitudes toward sports tend to enjoy better mental health.40 Jordan argued that favorable sports attitudes lead to sports participation,43 which in turn enhances individual well-being. Moreover, soccer-related activities have been demonstrated to boost physical and mental health effectively.44 The researchers attribute this discrepancy to the serious challenges of implementing soccer activities in campus soccer education. This study also investigates three indirect pathways from soccer knowledge to health. The results suggest that soccer knowledge does not improve individual health by fostering positive attitudes toward soccer, nor does it improve well-being by indirectly stimulating soccer engagement. This corroborates the authors' hypothesis that young people struggle to translate the attitudes and practices acquired from campus soccer education into better health outcomes.
SEM is considered one of the most effective techniques for removing bias caused by measurement error and for investigating the relationships between observed and latent variables.45 However, previous studies have shown that knowledge, attitude, practice, and health may have a nonlinear relationship.46 SEM is a suitable approach for testing hypotheses about linear relationships, but it cannot address nonlinear relationships.45 To analyze the nonlinear relationships among soccer knowledge, attitude, practice, and health, we therefore selected the best-performing machine learning algorithm. Previous studies have also combined SEM with machine learning or deep learning algorithms to examine complex social problems,47 which is consistent with our approach. Moreover, we applied the SHAP method to visualize and interpret the global and local importance of each feature. Our results showed that soccer practice was the least important factor for individual health, corroborating the findings from the SEM method.
The main purpose of soccer education and soccer activities in schools is to achieve health promotion through soccer practice.48 Therefore, practice should be a relatively important part of the sustainable development of physical and mental health. Meanwhile, a previous study showed that the health, fitness, and other benefits of soccer participation and practice are well recognized,49 and these benefits are essential to promoting health. However, the results indicate that soccer practice plays only a slight role in health in current Chinese soccer education. There are three possible reasons for these results.
(1) Unscientific soccer practice may reduce the importance of practice for health promotion. Many previous studies have shown that exercise and sports participation can indeed serve a health-promoting function,50 but incorrect movements, an inappropriate amount of exercise, and similar problems have a negative effect on health. In the past decade, although campus soccer has been strongly promoted by the national government, there is still a serious shortage of professional soccer physical education teachers on campuses, leading to the formation of incorrect technical movements during exercise, which can cause sports injuries. At the same time, campus soccer is primarily taught in large classes, and it is difficult for teachers to pay attention to all students, which may also lead to athletic injuries. Regarding students, a previous study has shown that the obesity rate and cardiorespiratory fitness of Chinese students are worsening annually,51 and such physical fitness may make it difficult for some students to withstand the load of soccer exercise, which may lead students to reject the sport psychologically and may have negative effects on mental health. Meanwhile, obese students are more likely to suffer joint injuries during more strenuous sports.52
(2) Lower social support may reduce the importance of practice for health promotion. The researchers argue that campus soccer education could enhance soccer knowledge among youths, foster their interest in soccer, and increase their engagement in soccer practices. The frequency and duration of soccer participation significantly affect individual health53; however, there is very little social support for joining soccer activities in China,54 as society, schools, and families prioritize academic success over everything else. This leads to a substantial gap between the intended and actual outcomes of young people's involvement in soccer.55 Moreover, a top-down social system may have led some leaders to neglect students' health and to run soccer practice in campus soccer education as a mere formality. This misguided approach resulted in many students taking part in soccer practice half-heartedly, hindering the goal of promoting health. This accounts for the disappointing results regarding campus soccer over the past decade.
(3) The lack of sustainability and balance in resource allocation for campus soccer across the stages of education means that soccer practice is unlikely to become a lasting behavior. In China's education system, students often receive campus soccer education and practice in primary school but not after they move on to middle school.56 Meanwhile, a related study indicated that appropriate sport or practice has a health-promoting effect when it becomes a continuous behavior.57 When practice is interrupted or stopped, the health-promoting function is gradually reduced.
Finally, multigroup analysis showed significant differences in the results between the gender groups and between the city and village groups. As shown in Table 5, the soccer knowledge, attitude, and practice scores of males were significantly higher than those of females, which indicates that the male group was more willing to participate in campus soccer activities. In addition, when comparing the city and village groups, this study found that the soccer knowledge, attitude, and practice scores of the city group were higher than those of the village group, with significant differences in knowledge and practice scores.
In addition, the SHAP local interpretation of the machine learning model showed that the importance of the features influencing health differs between males and females. Soccer practice had a greater impact on males' physical health, while knowledge had a greater impact on females' health. Previous studies have shown that, owing to physiological characteristics, males are more willing than females to participate in physical activities and to engage in risky behaviors for stimulation.58 However, in the current campus soccer environment, this tendency often leads to sports injuries during practice, aggravating the negative impact of practice on health. In contrast, females practice less than males, which is consistent with the World Health Organization's survey results showing that females' physical activity level is significantly lower than that of males.59 This may reduce the occurrence of injuries and similar problems, reducing the negative effects of practice. Previous studies have shown that physical knowledge has a significant impact on physical and mental health,37,38,60-63 which may explain why females' soccer knowledge, applied to practices or activities outside campus soccer, has a health-promoting effect.
In addition, in the analysis of city vs. village students, the study found that practice was the most important feature for city students but the least important feature for village students. This may also reflect the uneven distribution of campus soccer resources. Related studies have shown that China is a typical large country with significant urban-rural differences and uneven regional economic development,64 which is considered an important influencing factor for the results of this study. Campus soccer education first started in city areas and then spread to village areas. Campus soccer activities were more frequent in city areas, and city areas had more well-equipped soccer fields, resulting in more soccer participation among city residents. Therefore, at this stage of campus soccer activities, individuals from cities are more likely to experience negative health effects from soccer participation. In contrast, village residents participate less in soccer practice but obtain soccer-related sports nutrition and health knowledge.
In conclusion, this study strongly indicates that the current implementation of campus soccer education has not effectively contributed to the overall health improvement of young students. Despite fostering student participation in soccer, concerns and shortcomings in the implementation process have hindered its ability to promote the physical and mental well-being of adolescents. In essence, the past decade of campus soccer implementation can be considered somewhat unsuccessful. It is essential to address these issues to ensure that sports programs genuinely contribute to the holistic health of youth.

Limitations of the study
Because the present study was cross-sectional, causality could not be established. In other words, malnutrition may lead to impaired physical functioning, or one's own health may be the cause of soccer-related knowledge. Also, although this study surveyed youth across the country, resource constraints and other factors may have resulted in an uneven regional distribution of the sample. Over the long process of soccer teaching in schools, some individuals may not have participated in soccer consistently, which may also affect the results of the study. In addition, the structural equation modeling used in this study may not handle nonlinear data well. Therefore, machine learning algorithms capable of handling nonlinear data were also used for additional analyses.

STAR★METHODS
Detailed methods are provided in the online version of this paper and include the following:

Machine learning modeling
Before applying data to the models, the study used Min-Max scaling. This study compared four machine learning algorithms: decision tree (DT), random forest (RF), light gradient boosting machine (LightGBM), and gradient boosting with categorical features (CatBoost). All of them applied a Bayesian optimization algorithm based on tree-structured Parzen estimators (TPEs) to optimize the hyperparameters, and the search spaces and final hyperparameters of the four algorithms are shown in Tables S2-S5. In the performance comparison among models, recall, accuracy, precision, and F1 were used in this study, as shown in Equations 1, 2, 3, and 4. 9][80] The local interpretations have two perspectives: (1) gender (male, female) and (2) city vs. village. All data processing and modeling were performed in Python 3.8. The whole process of modeling and SHAP analysis is shown in the figure below.
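For reference, the scaling step and the four comparison metrics named above have the following standard definitions. These are the textbook forms, written under the assumption that the paper's Equations 1-4 follow them; the notation (TP, TN, FP, and FN for true positives, true negatives, false positives, and false negatives, and x_min and x_max for a feature's observed range) is introduced here and is not taken from the source.

```latex
\begin{align*}
x' &= \frac{x - x_{\min}}{x_{\max} - x_{\min}} \\
\mathrm{Recall} &= \frac{TP}{TP + FN} \\
\mathrm{Accuracy} &= \frac{TP + TN}{TP + TN + FP + FN} \\
\mathrm{Precision} &= \frac{TP}{TP + FP} \\
\mathrm{F1} &= \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}
\end{align*}
```

Min-Max scaling maps every feature to the range 0-1, which is why the feature values on the SHAP dependence plots lie between 0 and 1.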

Decision tree (DT)
DT is a non-parametric supervised learning algorithm utilized for both classification and regression tasks. Comprising a hierarchical tree structure, it consists of a root node, branches, internal nodes, and leaf nodes.
The process of modeling and SHAP analysis

Figure 2. Results of the structural equation model and hypothesis testing
CMIN/DF, chi-square to degrees of freedom ratio; RMSEA, root-mean-square error of approximation; NFI, normed fit index; IFI, incremental fit index; CFI, comparative fit index.

Figure 4. The impact of features on model output (healthy = 1, unhealthy = 0)
(A) The relationship among attitude, practice, and SHAP values for attitude.
(B) The relationship among practice, attitude, and SHAP values for practice.
(C) The relationship among knowledge, practice, and SHAP values for knowledge.
(D) The relationship among knowledge, attitude, and SHAP values for knowledge.

Table 1. Model fit summary for the proposed research model

Table 4. The testing performance of four machine learning models in 10-fold cross-validation (mean ± standard deviation)

TABLE
● RESOURCE AVAILABILITY
  ○ Lead contact
  ○ Materials availability
  ○ Data and code availability
● METHOD DETAILS

SS, soccer story; SE, soccer equipment; SR, soccer rules; NK, nutrition knowledge; SH, sports health; EA, emotional attitude; CA, cognitive attitude; PA, practical attitude; DP, direct practice; INP, indirect practice; KQn, knowledge question n; AQn, attitude question n; PQn, practice question n.