Physical‐activity support for people with intellectual disabilities: development of a tool to measure behavioural determinants in direct support professionals

Abstract
Background: Physical‐activity approaches for people with intellectual disabilities (ID) are more likely to be effective and sustainable if they also target direct support professionals' behaviour. However, no tools are yet available to measure the behavioural determinants for direct support professionals. This study aims to construct a self‐report tool to measure direct support professionals' behavioural determinants in physical‐activity support for people with ID and to analyse its psychometric properties.
Methods: The tool's sub‐scales and items corresponded with a proposed conceptual model. A pilot study was carried out to investigate and improve content validity. Construct validity and measurement precision were examined using item response theory models with data from a convenience sample of 247 direct support professionals in the support of people with ID.
Results: The results supported the three theory‐driven behaviour scales and indicated reasonable to good construct validity. The marginal reliability for the scales ranged from 0.84 to 0.87, and adequate measurement precision along the latent continua was found.
Conclusions: The tool appears to be promising for measuring the behavioural determinants of direct support professionals for the physical‐activity support of people with ID and has potential as a tool for identifying areas to focus on for interventions and policies in the future.


Background
There is growing recognition that interventions aimed at promoting the participation in physical activity of people with intellectual disabilities (ID) should also target the physical and social environment of these people (Peterson et al. 2008; Heller et al. 2011; Bergström et al. 2013; Kuijken et al. 2016; Bossink et al. 2017; Steenbergen et al. 2017). A large and essential part of this physical and social environment can be attributed to the quality and content of the support provided by direct support professionals (Buntinx & Schalock 2010). The content of the support received from direct support professionals has turned out to predict the physical-activity participation in adults with mild to moderate ID (Peterson et al. 2008). Moreover, support from others, such as direct support professionals, is often indicated as being an important factor that influences whether people with mild to moderate ID participate in physical activity (Kuijken et al. 2016; Bossink et al. 2017). Although these findings were biased towards the support of people with mild to moderate ID, it is known that engaging people with a combination of profound intellectual and severe motor disabilities in physical activities requires intensive effort and support from others (Nakken & Vlaskamp 2007; Van der Putten et al. 2017).
Targeting and influencing the support of direct support professionals, however, requires a thorough understanding of their perspective. Recently, a theory-informed qualitative study explored the perspective of direct support professionals as regards physical-activity support for people with ID (Bossink et al. 2019). Underpinned by valid theoretical frameworks for behaviour and behavioural change (Michie et al. 2011; Cane et al. 2012), various influences on the behaviour of direct support professionals were explored as related to the three essential sources of the nature of behaviour (i.e. capability, opportunity and motivation). A conceptual model was proposed comprising the influential factors that facilitate or impede physical-activity support related to the capability, to the opportunities afforded and, subsequently, to the motivation of direct support professionals in terms of engaging in physical-activity support (Bossink et al. 2019). Another important finding included in this conceptual model concerns those characteristics of people with ID that affect direct support professional behaviour vis-à-vis physical-activity support.
Because the perspectives presented in the qualitative research findings were wide ranging (Bossink et al. 2019), an additional step is needed to accurately measure the differences in direct support professional behaviour in order to promote physical-activity participation in people with ID. To our knowledge, no validated tools exist to measure the behavioural determinants of direct support professionals in the context of the physical-activity support for people with ID. This study will therefore attempt to develop a validated tool based on the theoretical knowledge of behaviour and behaviour changes in direct support professionals regarding physical-activity support for people with ID. This study's main focus is on the initial evaluation of the tool's psychometric properties. This tool can subsequently be used to investigate direct support professional behaviour regarding their support in promoting physical activity and to identify areas for future interventions and policies.

Study design and participant selection
A cross-sectional approach was used. The inclusion criteria for the participants were as follows: (1) being a professional supporting a group of people with ID in a living unit and/or activity centre and (2) being directly in contact with people with ID for most of the working time. No reward or incentive was offered for participation. The participants were mainly recruited from 10 residential facilities in the Netherlands. Each facility was allowed to decide how to internally distribute the invitation for participation in this study. An indication of the overall response rate was obtained by calculating the response rate for the four participating facilities that invited professionals to participate by email (21.4% response rate). Awareness of this study was raised by online advertising in the other six facilities. In addition, participants were also recruited via a national information platform for direct support professionals and by social media.
In total, 395 potential participants visited the online application that introduced the tool (260 from the facilities and 135 from social media/national information platforms). Of these, 363 chose to participate and completed the screening questions (i.e. the inclusion criteria for this study). A total of 28 did not meet our inclusion criteria. A further 50 did meet our inclusion criteria but exited the questionnaire after the screening, and another 38 completed less than half of the items (<21 items).
A convenience sample of 247 participants was used in this study. Table 1 shows the characteristics of the participants.
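The participant flow described above can be checked with a few lines of arithmetic. The sketch below uses only the counts reported in the text; the variable names are illustrative.

```python
# Participant flow, using the counts reported in the text.
visited = 260 + 135             # facilities + social media/national platforms
screened = 363                  # chose to participate and completed the screening
not_eligible = 28               # did not meet the inclusion criteria
exited_after_screening = 50     # eligible, but left after the screening
under_half_completed = 38       # completed fewer than 21 of the 41 items

analysed = screened - not_eligible - exited_after_screening - under_half_completed
print(visited, analysed)  # 395 247
```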

Development of the capability, opportunity and motivation sub-scales
The tool's sub-scales and items correspond to a proposed conceptual model for understanding direct support professional behaviour in their physical-activity support for people with ID (Bossink et al. 2019) and were supplemented with the results of a systematic review identifying barriers and facilitators of physical activity in people with ID. The sub-scale Capability represents the professionals' psychological and physical ability to enact a behaviour, which includes having the necessary knowledge and skills. Opportunity is defined as any circumstance in the physical or social environment that influences a behaviour: all factors that are external to the professional. Motivation represents all those brain processes that energise and direct the behaviour of the professional (Michie et al. 2011). These sources (i.e. the three sub-scales) interact to generate the behaviour of interest (i.e. direct support professional behaviour regarding their support in promoting physical activity) (Michie et al. 2011). The influencing factors facilitating or impeding physical-activity support known in the literature were, for this study, compiled into items that were presumed to be reflective indicators of the three different sources of direct support professional behaviour. Lower item scores reflect an influencing factor that acts as a barrier; higher scores indicate a facilitator. The item distribution among sub-scales is based on the number of influences on the underlying construct known in the literature. The item selection and design process was discussed during regular meetings with the research group. Item-writing guidelines were used (Mellenbergh 2011, pp. 73-78; Van Sonderen et al. 2013). In addition, a five-point Likert scale (from 0 'disagree' to 4 'agree') was used for the different response categories of each item (Krosnick & Fabrigar 1997, as cited in Mellenbergh 2011).
Two content experts were involved to improve content validity and to assess the applicability for current practice in the work of direct support professionals. One expert worked as a physiotherapist in a large-scale residential facility, and the other worked as a movement scientist. Both experts have experience with developing questionnaires for research and professional purposes. After feedback from the expert panel, the first draft of the tool was developed comprising 41 items: 8 for the sub-scale 'capability', 15 for the sub-scale 'opportunity' and 18 for the sub-scale 'motivation'.
With this tool, a pilot study was carried out with a convenience sample of 10 direct support professionals, who were not enrolled in this study's sample. Each direct support professional was asked to complete the first draft of the tool, to fill out a demographic questionnaire and to finish a retrospective evaluation form, all online. The demographic questionnaire included questions about the characteristics of the people with whom they work (e.g. age, level of ID and additional impairments), their own characteristics (e.g. age, gender, profession and employment years) and characteristics of their organisations. The evaluation form included questions about the time needed to complete the tool, the clarity and completeness of the instructions at the start and in the course of completing the tool, the clarity and applicability of individual items and their response options in the tool and the completeness of the tool in terms of the physical-activity support topic. The proposed tool, a demographic questionnaire, and an evaluation form were made available online using Qualtrics research software.
The pilot results were discussed with the research group and two field experts, which resulted in some adjustments. One item on education was removed from the sub-scale 'capability' and translated into an organisational characteristic about whether or not they were trained in physical-activity support and what sort of education they had received, which was then relocated in the demographic questionnaire. Another item on practical support was added to the tool and was attributed to the sub-scale 'opportunity'. Based on the pilot results, we also added 'expected time costs' to the introduction section and screening questions. Furthermore, we decided to add a question to the demographic questionnaire about the role of the physiotherapist in their organisation. A final 41-item self-reported tool was proposed, with seven items covering the capability construct, 16 the opportunity construct and 18 the motivational construct. Qualtrics research software was again used to make both the adapted demographic questionnaire and the proposed tool available online. The psychometric properties of the tool were examined in this study.

Statistical analyses
The descriptive statistics were computed first. Raw item scores were described by their mean (standard deviation), and the frequencies of the response options were given. For further analyses, response categories were collapsed when too few participants had chosen a response option (fewer than 12 ratings for that option).
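The collapsing rule can be illustrated with a short sketch. The text states only the threshold of 12 ratings; the merge direction used here (a sparse category is merged into the next higher one, or into the next lower one when it is the highest) and the function name `collapse_sparse_categories` are assumptions made for illustration, not the authors' documented procedure.

```python
from collections import Counter

MIN_RATINGS = 12  # threshold stated in the text


def collapse_sparse_categories(responses, min_count=MIN_RATINGS):
    """Merge sparse ordered categories into a neighbour and recode to 0..K-1.

    Assumption: a sparse category joins the next higher category, or the
    next lower one when it is already the highest. Missing values are None.
    """
    counts = Counter(r for r in responses if r is not None)
    groups = [[c] for c in sorted(counts)]  # each group lists merged categories

    def size(group):
        return sum(counts[c] for c in group)

    i = 0
    while len(groups) > 1 and i < len(groups):
        if size(groups[i]) >= min_count:
            i += 1
        elif i + 1 < len(groups):
            groups[i + 1] = groups[i] + groups[i + 1]  # merge upwards
            del groups[i]                              # re-check the merged group
        else:
            groups[i - 1] = groups[i - 1] + groups[i]  # highest: merge downwards
            del groups[i]
            i -= 1

    mapping = {c: new for new, group in enumerate(groups) for c in group}
    recoded = [None if r is None else mapping[r] for r in responses]
    return recoded, mapping


# Hypothetical item: only 3 of 247 respondents chose 'disagree' (0).
responses = [0] * 3 + [1] * 10 + [2] * 40 + [3] * 90 + [4] * 104
recoded, mapping = collapse_sparse_categories(responses)
print(mapping)  # {0: 0, 1: 0, 2: 1, 3: 2, 4: 3}
```

In this hypothetical item, category 0 is merged with category 1, leaving four ordered categories and therefore three thresholds to estimate.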
The psychometric properties were analysed using an item response theory (IRT) model separately for the three sub-scales proposed. IRT is a statistical theory consisting of mathematical models describing the relationships between the properties of single items of a tool, the underlying construct that a tool proposes to measure and respondents' answers to any item (Kline 2005). Compared with classical test theory, IRT models generate much richer item-level information and greater detail on the tool's reliability (Nguyen et al. 2014). Based on the underlying theory, unidimensionality for the three sub-scales was warranted. The different sub-scales were then calibrated under a polytomous item response model using the R mirt package version 1.27.1 (Chalmers et al. 2018) in the open-source software environment R version 3.4.3 (R Development Core Team 2017). Marginal maximum likelihood estimation was used to estimate item parameters (Bock & Aitkin 1981). Samejima's (1969) graded response models were estimated, which are suitable when item response options are ordered categories. Samejima's model is a polytomous extension of the two-parameter logistic model for dichotomous item responses and was chosen over the more restricted model of Muraki (1990), because Samejima's model does not require the item response options to be the same across items (Kline 2005, pp. 131-137).
For Samejima's model, the item characteristic curve that relates the probability of an item response to the underlying construct (denoted θ), measured by the item set, is characterised by two types of parameter: a slope parameter (denoted α) and the category threshold parameters (denoted β). α describes how well an item can differentiate along θ and, similar to factor loadings, how well the item relates to the construct measured. A reasonable range for α is from 0.5 to 3.0 (Baker, as cited in Toland 2014). β defines the point on θ at which 50% of the respondents would choose the designated response category or higher. Every respondent has a 100% probability of choosing the lowest category or higher, so there are (number of response categories − 1) β parameters for each item (Kline 2005, pp. 131-132). β generally ranges from −2 to 2, but it is not uncommon for this parameter to range between −3 and 3 (Toland 2014).
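The roles of α and β can be made concrete with a minimal sketch of the graded response model in its standard logistic form, where the probability of responding in category k or higher is 1 / (1 + exp(−α(θ − β_k))). The function name and the item parameters below are hypothetical and are not taken from Table 3.

```python
import math


def grm_category_probabilities(theta, alpha, betas):
    """Category probabilities under a graded response model.

    theta: position on the latent construct
    alpha: slope parameter of the item
    betas: ordered category thresholds (number of categories minus one)
    """
    # Cumulative probabilities P(X >= k); the lowest category has probability 1.
    cumulative = [1.0]
    cumulative += [1.0 / (1.0 + math.exp(-alpha * (theta - b))) for b in betas]
    cumulative.append(0.0)  # nothing lies above the highest category
    # Each category probability is the difference of adjacent cumulative ones.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(betas) + 1)]


# Hypothetical item with five response categories (0..4), hence four thresholds.
alpha = 1.5
betas = [-2.0, -1.0, 0.0, 1.0]
probs = grm_category_probabilities(0.0, alpha, betas)
print([round(p, 3) for p in probs])
```

At θ = 0, which equals the third threshold of this hypothetical item, the probabilities of categories 3 and 4 sum to 0.5, matching the interpretation of β as the point at which half of the respondents choose the designated category or higher.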
The information functions (Toland 2014) are the IRT equivalent of reliability. Each item has its own item information function (IIF) shaped by its item parameters. The IIFs were used to assess the precision of each item at a particular location or across a range on θ (Toland 2014), and to see how much information an item adds to the entire scale and where that information occurs along θ (Toland 2014). For each scale, IIFs were combined into a test information function illustrating the precision of this scale for each score level of θ. Moreover, marginal reliability was estimated, representing a value that summarises the precision for the entire range of a scale (similar to traditional reliability; Green et al. 1984). Finally, the IRT score estimates (θ for each respondent on the scale) and their standard errors were assessed.
Table 2 presents the average item scores and frequency scores of the response options for items within the different sub-scales. The participants, for the most part, agreed or partly agreed with the items in the capability scale, especially on the items covering their awareness, knowledge and skills (mean score > 3.0). Within the opportunity scale, the response options partly agree and agree were, on average, used slightly more often by the participants (56% of responses), although only the mean score of the item covering social influence by colleagues was higher than 3.0. The mean score of the item covering unforeseen things was the only one in the direction of the disagree point along the continuum (mean score < 2.0). Participants also responded, on average, more frequently with partly agree or agree to the items in the motivation scale (70% of responses). Ten out of 18 items had a mean score higher than 3.0. Three out of 18 had a mean score lower than 2.0.

Psychometric properties of the capability scale
The calibrated graded response model for the capability scale explained 50% of the data variance. Factor loadings ranged from 0.56 to 0.82. The estimated slope parameters for the items in the capability scale range from 1.14 to 2.41 (Table 3) and confirm that estimating a unique α for each item was reasonable. This also indicates that all the items have satisfactory discriminating power. The category threshold parameters range from −2.08 to 1.99. Within each item, the distance between the lowest and highest category threshold parameters is 1.74 to 4.07 units, which means that the capability construct is well covered. In addition, as shown in Table 3, the standard errors for the estimated IRT parameters indicate that they are estimated with good precision. The estimated IRT scores for the participants range from −2.57 to 1.99, which are on the same metric as the category thresholds. Two participants have estimated IRT scores lower than −2.08.
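The reported slope parameters and factor loadings can be related to each other. The sketch below assumes the conventional conversion from a logistic slope to a standardised loading, with the scaling constant 1.702; whether the authors' software applied exactly this conversion is an assumption, although it reproduces the reported range for the capability scale.

```python
import math

D = 1.702  # conventional scaling constant between logistic and normal-ogive metrics


def slope_to_loading(alpha):
    """Convert a logistic IRT slope into a standardised factor loading."""
    a = alpha / D
    return a / math.sqrt(1.0 + a * a)


# Lowest and highest capability slopes reported in the text.
for alpha in (1.14, 2.41):
    print(alpha, round(slope_to_loading(alpha), 2))
# 1.14 0.56
# 2.41 0.82
```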
In Fig. 1, the test information function for the capability scale demonstrates that most of the test information is below the middle ranges of the capability construct and that the precision of the capability scale peaked near −1.2. The IIFs for the capability items are provided in the Appendix. Direct support professionals in the capability construct between −2.2 and 1.2 are likely to be measured with the greatest reliability (>0.8; see also Fig. 1). Marginal reliability for the capability scale is 0.84.

Psychometric properties of the opportunity scale
The calibrated graded response model for the opportunity scale explained a proportional variance of 0.31, where factor loadings ranged from 0.41 to 0.75. The estimated slope parameters for the items in the opportunity scale range from 0.77 to 1.94, which indicates that all the items have satisfactory discriminating power (Table 3). The category threshold parameters range from −4.01 to 3.99. Within each item, the distance between the lowest and highest category threshold parameters is 2.06 to 5.47 units. The opportunity scale covers the underlying construct well. The standard errors for the estimated IRT parameters are reasonably small (0.15 to 0.33) and indicate that the parameters were estimated with suitable precision. The estimated IRT scores for the 247 participants range from −2.73 to 2.21, which are on the same metric as the category thresholds.
The test information function indicates that most of the information is found around the middle ranges of the opportunity construct (Fig. 1). The IIFs for the opportunity items are provided in the Appendix. Direct support professionals in the opportunity construct between −2.2 and 1.8 are likely to be measured with the greatest reliability (>0.8; Fig. 1). Marginal reliability for the opportunity scale is 0.87.

Psychometric properties of the motivation scale
The calibrated graded response model for the motivation scale explained a proportional variance of [...]. Within each item, the distance between the lowest and highest category threshold parameters is 1.291 to 9.351 units, which means that the motivation construct is broadly covered. The parameters for the motivation scale are estimated with satisfactory precision, apart from a standard error of 0.40 for the slope parameter of affinity. The estimated IRT scores for the 247 participants range from −3.20 to 2.81, which are on the same metric as the category thresholds of the motivation scale.
The test information function, as shown in Fig. 1, indicates that there is more information below the middle ranges of the motivation construct. The IIFs for the motivation items are provided in the Appendix. Direct support professionals on the motivation construct between −3.0 and 1.3 are likely to be measured with the greatest reliability (>0.8; Fig. 1). Marginal reliability for the motivation scale is 0.87.
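The marginal reliabilities reported for the three scales summarise precision over the whole latent range. One common formulation, sketched below, subtracts the average squared standard error of the θ estimates from the variance of θ and divides by that variance. The unit variance of θ, the function name and the standard errors are assumptions for illustration; the estimator used by the authors' software may differ.

```python
def marginal_reliability(standard_errors, theta_variance=1.0):
    """Marginal reliability from the standard errors of the theta estimates.

    Assumption: theta has variance `theta_variance` (1.0 by convention).
    """
    mean_error_variance = sum(se ** 2 for se in standard_errors) / len(standard_errors)
    return (theta_variance - mean_error_variance) / theta_variance


# Hypothetical standard errors for five respondents.
hypothetical_ses = [0.35, 0.38, 0.40, 0.42, 0.45]
print(round(marginal_reliability(hypothetical_ses), 2))
```

As a point of orientation, a constant standard error of 0.40 on a unit-variance scale would correspond to a marginal reliability of 0.84 under this formulation.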

Discussion
The aim of this study was to develop and validate a tool to measure the behaviour of direct support professionals in terms of their physical-activity support for people with ID. The development of the tool was theoretically well founded, and experts were involved to ensure its content validity. The study's main objective was to evaluate the psychometric properties of the tool to facilitate research in the field.
With IRT models, we analysed the construct validity and reliability of the three theory-driven behaviour scales for direct support professionals of people with ID. In addition, the IRT models allowed the performance of individual items to be evaluated.
The results demonstrate good construct validity for the capability and opportunity scales and reasonable construct validity for the motivation scale. In the motivation scale, two of the items relate less to the construct measured (i.e. slope parameters were unsatisfactory). These items, however, did not correlate with items from the capability and opportunity scales. Their retention in the motivation scale is warranted as long as the IRT score estimates, which take into account item properties, are used. Furthermore, removing items is only allowed when it does not destroy content validity (Toland 2014). The results also indicate that the capability, opportunity and motivation scales are reliable, with good measurement precision along the continua. Additionally, the ranges of the threshold parameters ensured that all of the scale levels were represented in the current scale items. The scales, in their current stage, can distinguish satisfactorily between direct support professionals over the entire range of capability, opportunity and motivation levels.
This study is not without limitations. Content experts were involved in the development of the different sub-scales. Content experts' feedback can be subjective; thus, the study may be subject to bias arising from the views of these two experts. However, the potential participants were also asked to suggest other items for the tool, which helped minimise this limitation. Additionally, a number of potential participants (n = 38) exited the online tool before completion, and this study's design did not allow the reasons for quitting to be identified. It might be that these direct support professionals did not agree with the content of the tool. Future studies should record the reasons for non-completion. The same applies to the percentage of missing data for some items (range: 0 to 6.4%). However, it can be assumed that these limitations did not significantly affect the results presented in the current study. In IRT models, because of the invariance property, a non-random sample from the population of interest can be used (De Mars 2010). Furthermore, IRT models are well suited to handling data with missing values.
Based on the results found, the tool is potentially useful in assessing direct support professional behaviour vis-à-vis their support of physical activity; this study's data can already be used to identify areas and target groups for future interventions and policies. Additionally, based on this study's data, we can recommend minor changes to the scales before being used in practice, along with further psychometric research.
The content in terms of the difficulty of some of the items could be adjusted. For example, the category threshold estimates for the response options of partly agree and agree for the item 'unforeseen things' were extremely high. It is expected that only respondents who score very high on the opportunity continuum will answer this item positively. In contrast, the category threshold estimates for the item 'family expectations' were extremely low. Respondents with both low and high levels in terms of the opportunities afforded will respond neutrally or positively to this item. The same applies to a number of items in the motivation scale (e.g. success experiences or worriless). Changing the content in terms of difficulty of these items could also contribute to the scale's construct validity.
The scales in their current state are particularly reliable in determining those who score on the lower levels of capability, opportunity and motivation. To improve the distinctiveness and reliability of the scales, we recommend adding more items with category thresholds above 1.2 to the capability scale, above 1.8 to the opportunity scale and above 1.3 to the motivation scale. However, additional items are not necessary when the intention is to use the scales in the clinical field to principally identify those direct support professionals who can benefit from an intervention or change in policy.
Another recommendation for practical purposes may be to shorten the tool, especially for the opportunity and motivation scales. In reference to the study results, some items both reflect the same concept and have overlapping IIFs (Appendix). In the context of a critical look at content validity, one might consider removing one of the items or merging them. For example, although various aspects were addressed, there are multiple items covering the concept of organisation. Policymakers might choose to merge the items for organisational support, time provided and budget. Alternatively or in addition, policymakers might choose between the item on family expectations and the one on family support, because both function in a similar way in this study's data. However, the psychometric properties would then have to be re-examined, which can be carried out in close collaboration with researchers.
Future psychometric research on the tool should incorporate participant-centred research methods, such as interviews and behavioural observations. Interviews that investigate the perspectives of direct support professionals for different positions on the continua or with striking combinations will contribute to validation of the tool. Accordingly, this can help to improve our understanding of direct support professional behaviour. Behavioural observations allow researchers to measure the tool's correlation with the actual physical-activity support for people with ID. In addition, future research should assess the tool's intra-rater reliability and its sensitivity to change over time. This will enable the use of this tool to monitor and evaluate intervention functions and organisational policy change focused on improving the physical-activity support.

Conclusions
This study focused on the development of a tool to measure the behaviour of direct support professionals and has provided preliminary evidence on its content validity, construct validity and reliability. The tool can be used to measure the capability, motivations and opportunities afforded to carry out physical-activity support among direct support professionals who support people with ID. The tool can also be used to measure differences between direct support professionals in terms of their own characteristics, the diversity of the people with whom they work and their environmental context. Moreover, this study's results have provided theoretical support for the model of direct support professional behaviour in the physical-activity support for people with ID.