Development of the Perceived Physical Literacy Questionnaire (PPLQ) for the adult population

Background/objective: In physical literacy (PL) research, instruments for the adult population that cover all relevant domains are currently lacking in the German language. Therefore, the Perceived Physical Literacy Questionnaire (PPLQ) was developed as an assessment instrument of PL for the adult population. The purpose of this study is to describe the multistage development process and to evaluate the psychometric properties of the PPLQ.
Methods: Based on established questionnaires (subscales) operationalizing the six defined PL domains (motivation, confidence, physical competence, knowledge, understanding, and physical activity behavior), we generated a large item pool. Exploratory analyses of survey data (n = 506), complemented by an expert panel, served to identify the best-fitting items. Cognitive interviews (n = 7) and a language certification process (level A2) helped to enhance the content validity of the items. Finally, we assessed the hypothesized factor structure of the PPLQ and its convergent validity with the Physical Activity-related Health Competence (PAHCO) questionnaire in a second independent sample.
Results: Valid data of 417 adults (66% women, 48 ± 16 years) entered the confirmatory factor analysis. We found empirical support for a theory-compatible 24-item version, after reducing complexity (i.e., domain subscales). Additionally, the six domains could be subsumed under an overall factor for PL (χ²(247) = 450.70, χ²/df = 1.83, CFI Robust = 0.895, RMSEA Robust = 0.074 [CI 90 = 0.063–0.085], SRMR = 0.064). Factor loadings, composite reliability, and discriminant validity were sufficient, while acceptable convergent validity was achieved for the total PL score and three domains.
Conclusion: The 24-item version of the PPLQ is appropriate for assessing PL among adults. However, some items (especially in the knowledge domain) can benefit from refinement in further studies.


Introduction
The concept of physical literacy (PL) has gained worldwide attention over the last two decades 1,2 as it is broadly understood as the gateway for sustained lifelong participation in physical activity (PA). 3,4 Pioneered by Margaret Whitehead, 5 the International Physical Literacy Association (IPLA) has defined PL as "the motivation, confidence, physical competence, knowledge and understanding to value and take responsibility for engagement in physical activities for life". 6 Notably, the multidimensional concept of PL follows a mind-body integrated, holistic approach to PA and is framed as a lifelong "journey" that every individual takes throughout their life. 2,7 Given its ultimate goal of lifelong PA participation, 8 PL is conceptually linked to a wide array of health outcomes. 9 Most importantly, by considering several theory-based PA correlates of well-established constructs (e.g., confidence [i.e., self-efficacy] and motivation derived from social cognitive theory or self-determination theory, respectively), [10][11][12] the PL concept may present an innovative framework to promote PA in different populations and settings. 13 Particularly in German-speaking countries, such as Austria, where PL has attracted only little attention so far, 14,15 the integration of the PL approach could therefore be a promising strategy to reduce the high prevalence of physically inactive adults. 16 [17][18][19] To date, the majority of research initiatives have aimed at children, adolescents and, more recently, adults in educational contexts, 17,[20][21][22][23] even though PL is in its philosophical essence a concept for all people regardless of age, gender, ethnicity, and socioeconomic status. 24,25 This is an important research gap, as it bears the risk of uncoupling PL from its original conception as a lifelong journey. 26
As the consideration of PL might act as an essential strategy for promoting PA in adults, reliable and valid PL instruments are required for multiple professions and in multiple contexts. 27 For example, outcomes may enable researchers and practitioners to understand adults' PL levels so that they can implement or evaluate an appropriate intervention. 20 Furthermore, the results can be used at the policy level to promote and allocate resources to PL initiatives. 27 Therefore, PL assessment instruments for adults are needed that support the conceptual and theoretical assumptions underlying the PL construct. 9 In the long term, instruments can help to accumulate empirical findings, which are at the moment, to a large degree, only available for children. 17
At present, there are only a few PL assessment instruments available for adults. 28 Among these, some have not yet been validated 3,29 or do not reflect the holistic nature of the concept because they focus solely on physical aspects. 30 Only studies on the College Student Physical Literacy Questionnaire (CSPLQ), 31 the Perceived Physical Literacy Instrument (PPLI), 32 as well as its adapted versions (e.g., for Simplified Chinese, 33 Spanish, 34 Turkish 35 or Persian 36 ) have reported satisfactory results for reliability levels and global model fits. However, the target population for both the CSPLQ and PPLI was restricted to specific subgroups of adults within the Chinese culture, namely college students and physical education teachers. 31,32 All PPLI adaptations relied on the same 18-item pool used to develop the original version (see Table 7 in Mendoza-Muñoz et al. 34 ), but each extracted different items nested within different factor structures in its final version. Moreover, the PPLI and its adaptations have a thematic focus on sport and consider only three domains (factors), which may be a narrow interpretation of PL given the complexity of the concept. In fact, the authors of the Spanish PPLI version 34 suggested that further studies should go beyond the three-factor structure of the PPLI by additionally examining other domains in the assessment of PL. For example, the PPLI does not consider PA behavior, even though it is acknowledged as a valuable indicator for the operationalization of PL. 37 Not only in the PPLI, but also in several other existing PL instruments, two distinct theoretical constructs such as motivation and confidence were combined into one domain within the operationalized factor structure (see e.g., 32,34,37 ). This combination may limit both the interpretation of the domain score 38 and the implementation of an appropriate PL-based intervention according to the assessment results. 20 For instance, a limited level of motivation requires similar but still different intervention strategies than a limited level of confidence (i.e., self-efficacy) and vice versa. 39,40 It should be noted that no existing instrument for adults provides an overall PL score that allows quantification of an individual's "PL level", a term referenced very frequently in the literature.
The current state of published literature led us to conclude that, to date, there is no available PL assessment instrument for adults that can be adapted to the German language and cultural setting. Most notably, no existing instrument allows for a comprehensive and differentiated measurement of all key attributes according to the IPLA definition (motivation, confidence, physical competence, knowledge, understanding and PA behavior) and integrates a composite score for PL. Hence, our overall aim was to develop the Perceived Physical Literacy Questionnaire (PPLQ), a PL assessment instrument for the general adult population (i.e., aged 18-65 years with diverse socioeconomic and PA background), which covers all these six domains separately and additionally subsumes them under an overall PL factor. In this sense, the instrument should be easy to understand (i.e., at least A2 language level) 41 and easy to administer, as well as cost- and time-efficient in terms of applications within large-scale assessments. The specific aims in this multistage process were to (i) assemble a comprehensive item pool, with items suitable for assessment among the adult population, and (ii) identify the most appropriate items based on data from an online survey. Subsequently, we aimed to evaluate the content validity of the remaining items (iii) with cognitive interviews in a target group and (iv) by external comprehensibility inspection. In the last stage (v), we evaluated how well the shortened version fitted a hypothesized measurement model in a second, independent sample and undertook further improvements, if indicated.

Development of the PPLQ
The development of the PPLQ was conducted in Austria, in the German language. The development process started in 2019 and included five stages; the main text of this paper focuses on stage 5. The aims, methods and results of the preceding stages are summarized below. A detailed description of each stage can be found in Appendix A.

Stage 1
The development of the PPLQ is based on a questionnaire used in previous PL-intervention trials. 42,43 This instrument already consisted, to a large extent, of a conglomerate of established questionnaires (subscales) for the respective PL domains. Yet, their psychometric properties have not been investigated for application in an integrative model and multidimensional measurement framework. Accordingly, it is crucial to evaluate whether the underlying constructs for the domains justify separate treatment and are distinct from each other (i.e., sufficient discriminant validity). Moreover, we identified methodological and content-related shortcomings during the trials at the domain and global level of the instrument (e.g., reliability of the domain knowledge; scoring procedure; no measurement of physical competence, which is, however, an important part of current PL definitions 6 ). In autumn 2019 we revised this initial questionnaire according to our proposed six-domain PL measurement model (see above). Since the development of a questionnaire based on a multidimensional model is always a challenge, 44,45 the aim of stage 1 was to generate a large item pool for our underlying PL domains. Within this procedure, we retained questionnaires (subscales) from our previous instrument, if considered appropriate, and added newly validated questionnaires (subscales) identified in a literature review. We could not identify any established questionnaires (subscales) for understanding and knowledge fitting the PL framework, and the subscales in our former questionnaire were found to be inappropriate (i.e., some items in the understanding domain were thematically focused on motivation and confidence, and the knowledge domain was operationalized by open-ended questions). 42,43 Therefore, we additionally adjusted item wordings and constructed new items for these domains. In total, the questionnaire version from stage 1 (referred to as PPLQ version 1) consisted of 61 items (see Appendix A, pages 2 to 4).

Stage 2
The aim of this stage was to identify the most suitable items from the item pool generated in stage 1. For this purpose, we conducted a cross-sectional study in winter 2019, in which a sample of 506 students and university staff members (72% female, 27.31 ± 10.11 years), none of whom participated in any of the subsequent development stages, completed the PPLQ version 1. The subsequent item selection procedure followed a two-step approach, each step under the premise of having a maximum of two subscales per domain with a minimum of three items within a subscale. In a first step, we conducted exploratory factor and reliability analyses for each latent domain, except for PA behavior, as it is not operationalized by a reflective measurement model. The subscales/items with the most appropriate measurement properties (i.e., factor loadings and Cronbach's alpha coefficients) were selected. 46 In a second step, we critically discussed the results with seven experts from research and practice (three senior and four early-stage researchers, all working in the field of PA and/or public health and familiar with the PL concept) to select the items/subscales with the highest content validity. If applicable, we used the content validity index (CVI) as a well-established method to calculate content validity quantitatively. 47 Applying the CVI, the experts rated the subscales/items of one domain either as "relevant (1)" or "non relevant (0)", whereby a CVI of at least 0.83 (i.e., at least six out of seven experts gave a rating of "relevant") was considered satisfactory. 48 In total, this procedure resulted in a PPLQ version 2 of 34 items. Appendix A (pages 5 to 13) provides a detailed description of the methods and results of stage 2.
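The CVI rule described above can be illustrated with a minimal Python sketch. It assumes the CVI is computed as the proportion of experts rating an item or subscale as "relevant (1)", which matches the "six out of seven" example in the text; the function names are hypothetical and not part of the study's materials.

```python
CVI_THRESHOLD = 0.83  # cutoff stated in the text (at least six of seven experts)


def content_validity_index(ratings):
    """Return the share of experts who rated the item as relevant (1)."""
    if not ratings:
        raise ValueError("at least one expert rating is required")
    if any(r not in (0, 1) for r in ratings):
        raise ValueError("ratings must be 0 (non relevant) or 1 (relevant)")
    return sum(ratings) / len(ratings)


def is_satisfactory(ratings, threshold=CVI_THRESHOLD):
    """Check whether the CVI reaches the threshold used in stage 2."""
    return content_validity_index(ratings) >= threshold


# Six of seven experts rate the item as relevant: CVI = 6/7, about 0.857
six_of_seven = [1, 1, 1, 1, 1, 1, 0]
# Five of seven experts rate the item as relevant: CVI = 5/7, about 0.714
five_of_seven = [1, 1, 1, 1, 1, 0, 0]
```

With seven raters, six "relevant" ratings pass the 0.83 cutoff and five do not, which is why the text equates the threshold with "at least six out of seven experts".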

Stage 3
Within this stage, we inspected the remaining items from the PPLQ version 2 using cognitive interviewing, which is a method for assessing the content clarity and interpretation consistency of self-reported items based on respondents' thought processes while answering the items. 49 Adapting a theoretical sampling strategy, 50 we conducted two rounds with a total of seven cognitive interviews (n1 = 3, n2 = 4; 71% female, 54.71 ± 2.29 years). The analysis of the interviews from round two indicated saturation (i.e., no new information was provided). All results were discussed with the expert committee from stage 2 to achieve a consensus on which changes should be made (see Appendix A, pages 13 to 15).

Stage 4
Subsequently, the PPLQ version 3 was subjected to an external comprehensibility check by capito (www.capito.eu), a certified organization for barrier-free and comprehensible information and communication. As part of this process, the PPLQ was modified (see Appendix A, Table A.12) and certified to A2 language level, which allows the PPLQ to be classified as an easy-to-understand questionnaire (i.e., 96% of the Austrian population should be capable of understanding it). 41 Overall, the modifications conducted within the four stages resulted in a PPLQ version 4 with 31 items and six domains (see Appendix A, pages 16 to 21). Table 1 shows the item labels, the English item wordings, and assigned response scales for each item of the PPLQ version 4. The full PPLQ version 4 in the original German version can be found in Appendix B.

Stage 5
The focus of the current study relates to this stage, in which the aim was to evaluate the hypothesized factor structure and psychometric properties of the PPLQ version 4 with an independent sample (see next paragraph). We tested the theoretically assumed factor structure with confirmatory factor analyses (CFAs), initially for the entire model and subsequently for each domain separately. In cases of misfit, the model was modified based on content-related and statistical considerations. Moreover, we examined the convergent validity of the PPLQ by employing the Physical Activity-related Health Competence (PAHCO) 51,52 questionnaire. The corresponding procedure is described in more detail in the results section.

Design, participants, and procedures
The following descriptions exclusively refer to stage 5 as the empirical core process of the entire PPLQ development. Within this stage, a cross-sectional study was conducted, including adults who were (i) aged between 18 and 65 years and (ii) fluent in the German language. We collected data between February 2021 and April 2022. Eligible participants completed the paper-pencil version of the PPLQ version 4, with a subsample also completing the PAHCO 51,52 questionnaire to assess convergent validity. In addition, we gathered information on gender, age, and education level. Participants were recruited from two settings: (i) via four primary health care centers in Styria, a province of Austria (participants of this subsample completed both the PPLQ and PAHCO), and (ii) via personal contact during "physical activity education events", which were held in six rural Styrian communities as part of the project "MOVEluencer". 53 The latter subsample only completed the PPLQ, since the time frame of the events did not allow completion of both the PPLQ and PAHCO. The study was approved by the Research Ethics Committee of the University of Graz, Styria, Austria (GZ.39/101/63 ex 2019/20), and informed consent was obtained from all participants.

Scoring of the PPLQ
Domain scores for each of the six domains and a total PL score are calculated as follows. The domain scores range between 0 and 100, with higher values representing greater domain proficiency. Acknowledging the holistic nature of the PL concept, the total PL score represents a composite calculation in which each of the six domains is weighted equally with 16.67%. This composite score also ranges between 0 and 100, with higher values indicating a better PL. [54][55][56] A detailed description of the scoring procedure can be found in Appendix C, both for the PPLQ version 4 (Table C.1) and for the final version resulting from stage 5 (PPLQ version 5; Table C.2).
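The equal weighting of the six domains can be sketched in Python as follows. This is only an illustration under stated assumptions: the actual item-to-domain scoring rules are given in Appendix C and are not reproduced here, so the linear rescaling of raw responses to 0-100 and all names below (`rescale_to_100`, `domain_score`, `total_pl_score`) are hypothetical. Only the 0-100 range and the equal domain weights come from the text.

```python
from statistics import mean

# The six PPLQ domains named in the text (identifiers are illustrative).
DOMAINS = (
    "motivation",
    "confidence",
    "physical_competence",
    "knowledge",
    "understanding",
    "pa_behavior",
)


def rescale_to_100(value, lowest, highest):
    """Linearly map a raw response from [lowest, highest] onto 0-100.

    Assumption: a simple min-max transformation; Appendix C may differ.
    """
    if highest <= lowest:
        raise ValueError("highest must be greater than lowest")
    return 100.0 * (value - lowest) / (highest - lowest)


def domain_score(responses, lowest, highest):
    """Average the rescaled item responses of one domain (assumed rule)."""
    return mean(rescale_to_100(r, lowest, highest) for r in responses)


def total_pl_score(domain_scores):
    """Composite PL score: each of the six domains weighted equally (1/6)."""
    missing = set(DOMAINS) - set(domain_scores)
    if missing:
        raise ValueError(f"missing domain scores: {sorted(missing)}")
    return mean(domain_scores[d] for d in DOMAINS)


# Hypothetical respondent with already computed domain scores (0-100).
example_scores = {
    "motivation": 80.0,
    "confidence": 70.0,
    "physical_competence": 60.0,
    "knowledge": 50.0,
    "understanding": 90.0,
    "pa_behavior": 40.0,
}
```

Because every domain contributes one sixth, a respondent with a low PA behavior score but high scores elsewhere still receives a mid-range total, which reflects the holistic weighting the text describes.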

Convergent validity
We employed the PAHCO 51,52 questionnaire to assess the convergent validity of the PPLQ. Conceptualized as a multidimensional framework at the intersection of PL and health literacy, the underlying PAHCO model is theoretically and conceptually closely related to the PL approach. 57 More specifically, the PAHCO model assumes that three interrelated sub-competencies are required for a healthy, physically active lifestyle: (i) movement competence, (ii) control competence, and (iii) self-regulation competence. Within the PAHCO questionnaire, these three competencies are posited as second-order factors bundling 10 first-order factors with a total of 42 items. 51,52 In collaboration with a member of the PAHCO development team (JC), we defined the expectation that corresponding PAHCO scores would be highly correlated with the PPLQ scores (ρ ≥ 0.5) in the case of sufficient evidence for convergent validity. With regard to the PAHCO, we only used the total scores of the three second-order factors for determining convergent validity. For instance, we expected a high correlation between the PAHCO score for the second-order factor "movement competence" and the domain score "physical competence" of the PPLQ. The theoretically assumed correlations, as well as the whole intercorrelation matrix (for exploratory purposes), are presented in the corresponding results section.
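The convergent validity criterion (ρ ≥ 0.5) can be illustrated with a small, dependency-free Python sketch of Spearman's rank correlation. It is not the authors' analysis code (the study used IBM SPSS for these analyses); the function names and the example scores are hypothetical, and ties are handled with average ranks.

```python
def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    return pearson(average_ranks(x), average_ranks(y))


def supports_convergent_validity(x, y, threshold=0.5):
    """Apply the rho >= 0.5 criterion stated in the text."""
    return spearman_rho(x, y) >= threshold


# Hypothetical scores for five respondents (not study data).
pplq_physical_competence = [55.0, 70.0, 40.0, 90.0, 65.0]
pahco_movement_competence = [3.1, 3.8, 2.5, 4.6, 3.0]
```

A constant score vector has zero rank variance and would make the correlation undefined, so real data should be checked for this before the criterion is applied.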

Statistical analysis
Response scale: open response categories (unit for part a of each PAB-item: "days per week"; unit for part b of each PAB-item: "minutes per day"). For each PAB-item there was also an additional response category: "no walking", "no vigorous-intensity activity" and "no moderate-intensity activity", respectively.
Label: Items (in order of appearance)
PAB1: a. During the last 7 days, on how many days did you walk for at least 10 minutes at a time? This includes walking distances at home or at work, walking to get from one place to another, and all other walking for recreation, exercise, or leisure. b. How much time did you usually spend walking on one of those days?
PAB2: a. Think only about those physical activities that you did for at least 10 minutes at a time. During the last 7 days, on how many days did you do vigorous physical activities like aerobics, running, fast cycling or fast swimming? b. How much time did you usually spend doing vigorous physical activities on one of those days?
PAB3: a. Think again only about those physical activities that you did for at least 10 minutes at a time. During the last 7 days, on how many days did you do moderate physical activities like carrying light loads, bicycling at a regular pace, or swimming at ordinary speed? Caution: This does not include walking! b. How much time did you usually spend doing moderate physical activities on one of those days?
Note: * item excluded in the final 24-item version from stage 5 (i.e., PPLQ version 5); a in the analysis procedure each response category is counted as one single item with "breast cancer" = KNO_KB1a, "dementia" = KNO_KB1b and "hypertension" = KNO_KB1c [excluded in the final 24-item version]; b in the analysis procedure each response category is counted as one single item with "sugar disease" = KNO_KB2a, "Parkinson's disease" = KNO_KB2b, "joint wear" = KNO_KB2c and "heart failure" = KNO_KB1d.

The collected data were screened, and participants with implausible and missing values were excluded. More specifically, we excluded only participants with missing values for all items in one or more domains (PPLQ) or first-order factors (PAHCO). The remaining missing values were imputed using multiple imputation, whereby the results of five imputed data sets were averaged to obtain one single complete data set. 58 We applied CFAs for factorial validity assessment. The domain PA behavior entered the CFA with a single indicator given the formative nature of this measurement model 59 and the categorical character of the variable with eleven levels (linear transformation from the MVPA score). The multiple-choice items of the knowledge domain were partitioned into separate dichotomous items (true/false). We used the mean and variance adjusted weighted least squares (WLSMV) estimator, which is recommended for models with binary and categorical data. For the assessment of the global model fit, we relied on the following indices: comparative fit index (CFI), root mean square error of approximation (RMSEA) and standardized root mean square residual (SRMR). 60 Since fit indices 60 under the WLSMV estimator have a profound tendency to mask model misspecification, 61 we relied on the robust versions of CFI and RMSEA proposed by Savalei. 62 Furthermore, we adopted a variant which considers the underlying degrees of freedom (χ²/df), as the chi-square (χ²) test is highly sensitive to model complexity and sample size. 62 We applied the following rules of thumb as a first indication of adequate model fit. A good model fit was considered when χ²/df ≤ 2.0, CFI Robust ≥ 0.95, RMSEA Robust ≤ 0.05 and SRMR ≤ 0.05, and an acceptable model fit when χ²/df ≤ 3.0, CFI Robust ≥ 0.90, RMSEA Robust ≤ 0.08 and SRMR ≤ 0.10. 63 Factor loadings of ≥ 0.5 for newly developed items (knowledge and understanding) and of ≥ 0.6 for established items (motivation, confidence and competence) were treated as adequate. 64,65 We inspected the models in more detail on the local level using modification indices, residual correlations, and item difficulties.
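The rules of thumb for global model fit stated in the statistical analysis can be written down as a small lookup, sketched here in Python. The cutoffs are taken from the text; the per-index labels, the dictionary keys and the function names are illustrative assumptions, and the example values are hypothetical rather than study results.

```python
# Cutoffs from the text: "good" and "acceptable" fit per index.
GOOD = {"chi2_df": 2.0, "cfi_robust": 0.95, "rmsea_robust": 0.05, "srmr": 0.05}
ACCEPTABLE = {"chi2_df": 3.0, "cfi_robust": 0.90, "rmsea_robust": 0.08, "srmr": 0.10}

# CFI improves as it increases; the other indices improve as they decrease.
HIGHER_IS_BETTER = {"cfi_robust"}


def classify_index(name, value):
    """Label one fit index as 'good', 'acceptable' or 'below acceptable'."""
    if name not in GOOD:
        raise KeyError(f"unknown fit index: {name}")

    def meets(cutoff):
        if name in HIGHER_IS_BETTER:
            return value >= cutoff
        return value <= cutoff

    if meets(GOOD[name]):
        return "good"
    if meets(ACCEPTABLE[name]):
        return "acceptable"
    return "below acceptable"


def classify_fit(indices):
    """Classify every supplied fit index separately."""
    return {name: classify_index(name, value) for name, value in indices.items()}


# Hypothetical fit indices of some model (not taken from the study).
example_fit = {"chi2_df": 1.5, "cfi_robust": 0.93, "rmsea_robust": 0.09, "srmr": 0.04}
example_labels = classify_fit(example_fit)
```

Classifying each index separately, rather than returning one overall verdict, mirrors the text's framing of these cutoffs as a first indication that is followed by local inspection of the model.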
Composite reliability was assessed by means of McDonald's omega (ω), with values exceeding 0.7 indicating an acceptable level of internal consistency. 66 We derived evidence of discriminant validity based on the Fornell-Larcker criterion. 67 Accordingly, discriminant validity is established if the average variance extracted (AVE) by a construct (i.e., latent variable) is greater than the squared correlation between the construct and any other construct. For the convergent validity assessment between the PPLQ and the PAHCO, Spearman correlation coefficients (ρ) were applied. The CFAs and the reliability analyses were conducted in R (4.2.2) 68 via the packages lavaan (version 0.6-13) 69 and semTools (version 0.5-6). 70 For all other statistical analyses, we used IBM SPSS 27.
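The Fornell-Larcker check can be made concrete with a short Python sketch. It assumes standardized factor loadings, in which case the AVE of a construct equals the mean of its squared loadings; the study itself computed these quantities in R with lavaan and semTools, so the functions, construct names and numbers below are purely illustrative.

```python
def average_variance_extracted(loadings):
    """AVE as the mean squared standardized loading of a construct."""
    if not loadings:
        raise ValueError("at least one loading is required")
    return sum(l * l for l in loadings) / len(loadings)


def fornell_larcker_holds(ave_by_construct, correlations):
    """Check that each construct's AVE exceeds its squared correlation
    with every other construct.

    correlations maps a pair of construct names to their latent correlation.
    """
    for (first, second), r in correlations.items():
        squared = r * r
        if squared >= ave_by_construct[first] or squared >= ave_by_construct[second]:
            return False
    return True


# Hypothetical standardized loadings for two constructs (not study data).
loadings = {
    "construct_a": [0.8, 0.7, 0.6],
    "construct_b": [0.9, 0.8, 0.7],
}
ave = {name: average_variance_extracted(values) for name, values in loadings.items()}

# Hypothetical latent correlation between the two constructs.
latent_correlations = {("construct_a", "construct_b"): 0.6}
discriminant_ok = fornell_larcker_holds(ave, latent_correlations)
```

In this example the squared correlation (0.36) is smaller than both AVE values (about 0.50 and 0.65), so the criterion is met; a correlation of 0.75 between the same constructs would violate it.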

Participants
A total of 476 participants completed the PPLQ version 4, of whom a subsample of 244 participants also completed the PAHCO. Regarding the PPLQ, 43 participants were excluded as they had missing values for all items in one or more domains. Another 16 participants were excluded as they reported implausible (contradictory) data within the PA behavior domain (e.g., 0 days and 5 h of moderate PA per week). Hence, the final sample for the factorial validity assessment of the PPLQ consisted of 417 participants. The percentage of imputed missing values for the PPLQ was 0.26% (39 values).
Participants excluded from the PPLQ were also excluded from all analyses related to the PAHCO (n = 40). Additionally, we excluded 11 participants as they had missing values for all items in one or more first-order factors and another 35 participants because of implausible (contradictory) data. Hence, for the convergent validity assessment (PPLQ vs. PAHCO), the data of 158 participants were used. The percentage of imputed missing values for the PAHCO was 0.89% (74 values). Participant characteristics for the total sample and the subsample that also completed the PAHCO are shown in Table 2, with group differences only observed for education level.

Factorial validity
First, we tested our assumed 31-item model of the PPLQ version 4. As visualized in Figure A.1 (Appendix A), this model was specified as a hierarchical model with a third-order g-factor for overall PL. With regard to the IPLA definition, 6 this structure may provide the most interpretable solution as well as the closest conceptual representation of PL. Using the WLSMV estimator, however, the CFA could not be computed. Subsequently, further CFAs were conducted separately for the respective PL domains. Heywood cases were observed, related to potential autocorrelations between indicators (i.e., observed variables) in the model. 71 Therefore, the model was modified based on content and statistical considerations (i.e., fit indices, factor loadings, modification indices and item difficulties). Redundant and inadequate items were removed, and sub-domains were merged into one factor per domain. The final modified models on the domain level revealed acceptable model fits (see Appendix D, Table D.1). This procedure resulted in an overall 24-item second-order model with PL as a g-factor (referred to as PPLQ version 5). The global model fit of the final model was acceptable (χ²(247) = 450.70, p < 0.001, χ²/df = 1.83, CFI = 0.974, CFI Robust = 0.895, RMSEA = 0.045 [CI 90 = 0.038-0.051], RMSEA Robust = 0.074 [CI 90 = 0.063-0.085], SRMR = 0.064). Even so, there were still some minor statistical discrepancies, with six modification indices greater than 20 and residual correlations greater than 0.2 between the items PCO_ST1 and PAB_MVPA, KNO_KB2a and KNO_HM3, as well as KNO_KB2a and KNO_KB1a. Notably, the correlated residuals were not permitted in the final model. For the final 24-item model, all factor loadings were found to be highly significant (p < 0.001) and acceptable, except for KNO_HM3 (0.393) and KNO_KB2a (0.404; see Fig. 1). Considering the internal consistency, all coefficients for the domains (subscales) ranged from sufficient to good (0.70 ≤ ω ≤ 0.90; see Table 3). Additionally, the omega coefficient for the g-factor (i.e., overall PL) was very satisfactory (ω = 0.90). Except for the domain knowledge (AVE = 0.39), the AVEs for all other domains were in an acceptable range (0.56 ≤ AVE ≤ 0.73). Moreover, the AVE values of all domains exceeded the corresponding highest squared correlation estimates, indicating sufficient discrimination between the domains. Descriptive statistics for the final 24-item version of the PPLQ are presented in Tables 4 and 5, showing mean, standard deviation, minimum, maximum, skewness, kurtosis and, for the knowledge domain, item difficulty.

Convergent validity
The PL total score and the domain scores for physical competence, motivation, and confidence showed sufficient correlations with the corresponding second-order factors of PAHCO. Yet, the correlation coefficients for the domains understanding, PA behavior and knowledge did not reach an acceptable level of convergent validity (see Table 6 for detailed correlations; the whole intercorrelation matrix, which also includes the first-order factors from the PAHCO, can be found in Appendix E, Table E.1).

Discussion
The purpose of this paper was to describe the development process and first validity insights for the PPLQ as an instrument to measure PL in the German-speaking adult population. From an initial large item pool for the underlying PL domains, the most suitable items were selected based on exploratory results of a cross-sectional survey and expert ratings. Subsequently, content validity was enhanced by conducting cognitive interviews and an external comprehensibility check.
In stage 5 of the study, the hypothesized factor structure of the PPLQ and its convergent validity with the PAHCO questionnaire 51,52 were assessed. We found empirical support for a theory-compatible 24-item version covering all six domains of the IPLA definition, 6 while not releasing residual correlations and while applying robust estimators and goodness-of-fit indices. 62 The six domains could be subsumed under an overall factor for PL, with the domains being distinct from each other (i.e., sufficient discriminant validity) and therefore allowing for independent meaningful interpretation. Evidence of adequate internal consistency was established at the domain level and for the overall factor. An acceptable level of convergent validity was achieved for the total PL score and the domains physical competence, motivation, and confidence but not for understanding, PA behavior and knowledge.

Fig. 1. 24-item second-order PPLQ model with PL as a g-factor. PL: physical literacy; PCO: domain physical competence; UND: domain understanding; MOT: domain motivation; CON: domain confidence (i.e., self-efficacy); KNO: domain knowledge; the full wording of the corresponding item-labels can be retrieved from Table 1.
To the best of our knowledge, the PPLQ is the first PL assessment instrument for adults that provides a comprehensive and differentiated measurement of all key attributes of the IPLA definition. 6 Specifically, we operationalized each theoretical PL attribute separately, leading to the identification of six domains: (i) motivation, (ii) confidence, (iii) physical competence, (iv) knowledge, (v) understanding and (vi) PA behavior (a description of the meaning of each domain can be found in Table 7). Each domain was equally weighted. From this perspective, the PPLQ may occupy a middle position on a continuum between an idealistic and a pragmatic approach to assess PL. 45,72 By contrast, several adaptations of the PPLI used different items in varying factor structures (see Table 7 in Mendoza-Muñoz et al. 34 ). For example, in the Spanish adaptation, 34 motivation and confidence have been merged into a single domain. Similarly, in the simplified Chinese adaptation, 33 physical competence and confidence have been grouped together. In summary, such procedures entail the risk that a domain may not be recognized as unique and therefore may not be interpreted clearly and consistently. 38 In younger age groups (adolescents), the authors of the Portuguese PL Assessment (PPLA) argued for a combination of scales under the psychological factor but for differentiated operationalization and interpretation (e.g., motivation and confidence). 73,74 In the adult population, apart from the PPLQ, only the CSPLQ 31 seems to satisfy the meaningful separation of domains. However, the CSPLQ was developed for Chinese college students, while the entire development process of the PPLQ was focused on the general adult population (i.e., aged 18-65 years with diverse socioeconomic and PA background) within the German-speaking culture. Overall, the PPLQ may provide a valuable approach to 'translate' the PL domains into measurable and clearly interpretable latent constructs. In this respect, we encourage other researchers to further develop and adapt the PPLQ (e.g., in other languages).
Based on the results of our study, we endorse an application of the PPLQ in research and practice.However, caution is warranted since some domains may need a refinement in future studies.Regarding the knowledge domain, we observed an insufficient AVE of less than 0.5, while two items had factor loadings below the critical value of 0.5. 64,65

Note:
The full wording of the corresponding item-labels can be retrieved from Table 1.

P. Holler et al.
The preceding exploratory analyses (from stage 2) also revealed insufficient properties for this domain, which is partially compensated by the higher number of items. When interpreting these findings, it is necessary to consider that knowledge in the context of PL is (and was) constructed as a broad concept. 75,76 This procedure may have resulted in a set of items with comparatively moderate factor loadings. 77 At this point, further studies may conclude that the knowledge domain should be operationalized via a formative rather than a reflective measurement model. In reflective models, the latent construct influences the manifest indicators (i.e., items), which share a common theme and are desirably highly correlated. In formative models, in contrast, indicators uniquely define the latent construct, with each contributing to a distinct aspect of the construct, and are not necessarily correlated. 78 In fact, assessing PA knowledge per se is an understudied area across all age groups, 21,79,80 with no validated scale currently available for adults. 77 Accordingly, there is no clear consensus on the content of the knowledge domain required to implement a formative measurement approach. In summary, further studies are necessary to address these issues in more detail. Moreover, further studies are needed to build on the discussion of whether PA behavior is an outcome or a domain of PL. For the development of the PPLQ, we adopted the CAPL developers' interpretation of the IPLA definition: "When people value and take responsibility for engaging in physical activity, they will demonstrate this by being physically active". 37 In this context, PA behavior is envisioned as a domain of the PL framework. However, this view seems to contradict the current operational approaches to PL in adults. Longitudinal studies on the causal relationship are needed to substantiate the assumption.
The present study has several limitations that should be considered. In our convenience sample, women, middle-aged adults, and higher educated individuals were overrepresented in comparison to the general population. In addition, nearly one third of the study participants (32%) reported over 300 min of MVPA. This points to a possible selection bias: our recruitment strategy may have disproportionately attracted participants with a special interest in PA and exercise. This might also have affected the results of other domains, especially those with an affective characteristic (e.g., motivation). Consequently, the external validity of our results should be interpreted with caution, and further studies with a large representative sample of adults are necessary to generalize the study findings. Although residual correlations were not allowed in the final model, the non-trivial covariances should be critically re-examined in future research to determine whether they reflect recurring misspecification or mere sampling error. Future investigations should also seek to establish test-retest reliability and measurement invariance across gender and age groups, as this was beyond the scope of the present study. So far, we cannot provide evidence that the PPLQ functions equivalently across the heterogeneous group of adults (aged 18-65 years) for whom it was developed, which would be required to ensure that any observed differences are due to the underlying construct being measured. In addition, the self-reported nature of the PPLQ presents a limitation per se, as it makes the instrument susceptible to recall and social desirability bias. Examining associations with objective measurements in the physical domain could strengthen the validity and explanatory power of the instrument. The convergent validity comparisons with the PAHCO did not reach acceptable levels for knowledge, understanding, and PA behavior; these domains should be thoroughly readdressed and cross-validated with closely related instruments for the corresponding domain. Finally, unlike previous studies assessing PL (e.g., 73,81,82 ), we did not consider a social domain for the operationalization of PL, since it is not a key attribute of the IPLA definition. 6

Conclusion
The present study provides first evidence of the reliability and validity of the PPLQ as an instrument to measure PL and its domains in a general adult population in German-speaking countries. Compared to other PL instruments for adults, the PPLQ offers a comprehensive measurement model of all essential attributes according to the PL definition of the IPLA, 6 with the domains being distinct from each other and, therefore, allowing for independent, meaningful interpretation. Moreover, the six domains derived from the model can be combined to calculate an overall PL score, facilitating the quantification of an individual's 'PL-level'. This renders the PPLQ a valuable instrument for researchers, but also for practitioners and trainers, for charting domain and overall PL levels over time and for providing feedback to individuals regarding their progress along their PL journey. However, caution is warranted since specific items, especially in the knowledge domain, need to be refined and perhaps approached through formative indicators in further studies. Future large-scale studies with well-stratified samples should also investigate whether the PPLQ model remains invariant across the diverse adult population.

Note: n = 158 and p < 0.001 for all correlations. Non-expected correlations, including those which were significant, were masked to avoid overinterpretations of convergent validity. The full intercorrelation matrix can be found in Appendix E, Table E.1. PCO: physical competence; UND: understanding; MOT: motivation; CON: confidence (i.e., self-efficacy); PAB: physical activity behavior; KNO: knowledge; PL: physical literacy; PPLQ: Perceived Physical Literacy Questionnaire; PAHCO: Physical Activity-related Health Competence (questionnaire).

Table 7
Description of the meaning of each domain of PPLQ version 5 (i.e., 24-item version).

Physical competence
Refers to a person's perception of his/her own fitness and ability to perform various strength and endurance related physical activities.

Understanding
Refers to a person's grasp of the value of physical activity for lifelong health and well-being.

Motivation
Refers to a person's inherent satisfaction and pleasure to engage in regular physical activity.

Confidence
Refers to a person's situational belief in his/her capabilities to adopt and maintain a physically active lifestyle.

Knowledge
Refers to a person's knowledge of health-enhancing physical activities and how to perform them. In addition, this refers to a person's knowledge of the health benefits of being physically active.

Physical activity behavior

Table 1
Item labels and wordings of the PPLQ version 4 (i.e., 31-item version).

Table 1 (continued): 31-item version of the PPLQ from stage 4 (i.e., PPLQ version 4).
Domain: Knowledge [KNO]
Two subfactors: how to move [KNO_HM] & knowledge of the benefits [KNO_KB]
Response scale: closed response categories (since the scales are different for each item, they are reported below together with the item description; italics, correct answer in bold)
According to the Austrian physical activity guidelines, at least how many minutes per week should you perform activities that involve a slight increase in breathing and pulse rate e.

Table 2
Participant characteristics.
Note: a missing data for one participant, both in the total sample and in the PAHCO-subsample; b missing data for 23 participants in the total sample and 19 in the PAHCO-subsample; c missing data for two participants in the total sample and no missing data in the PAHCO-subsample. * significantly different at the α =

Table 3
Analyses of reliability and discriminant validity.
Note: * variable ranging between 0 and 100 (see section 2.3 in the manuscript for detailed information); M: Mean; SD: Standard deviation; Min: Minimum; Max: Maximum; Skew: Skewness; Kurt: Kurtosis. The full wording of the corresponding item-labels can be retrieved from Table 1.