Introduction

Neurogenic lower urinary tract dysfunction (NLUTD), also referred to as neurogenic bladder, is defined as “lower urinary tract dysfunction due to disturbance of the neurological control mechanism” and is a common and disruptive condition for individuals with spinal cord injury and disease (SCI/D) including multiple sclerosis (MS) and spina bifida [1]. Clinically, NLUTD has been classified in different ways, including based on urodynamics, neurological outcomes, or bladder function [2]. Urinary tract infection (UTI) is a common occurrence in individuals with NLUTD and leads to significant additional morbidity and mortality [3,4,5,6,7], beyond that of the NLUTD. Despite the frequency of UTI occurrence and its negative impact, NLUTD-associated UTI definitions and diagnostic criteria have not been standardized [8,9,10,11,12]. A key criterion of UTI diagnosis is the presence of symptoms, and as such, individuals with NLUTD due to SCI/D have identified urinary symptoms that are not included in, or are explicitly excluded from, UTI diagnostic consideration [13].

Available instruments relevant to bladder function among people with NLUTD have focused on the assessment of function [14] and the evaluation of quality of life [15, 16], and thus cannot be utilized in identifying UTI or signs and symptoms that may precede UTIs. We have previously reported a clinical trial focused on treating bothersome urinary symptoms [17, 18] for individuals with NLUTD and SCI/D, supporting a shift towards more clinical research into treatments for these specific symptoms. Our research team focused on patient-centered patient reported outcomes featuring the bothersome urinary signs and symptoms of NLUTD experienced by individuals with SCI/D. Specifically, we created and piloted Urinary Symptom Questionnaires for people with Neurogenic Bladder (USQNB), based on focus groups and interviews with individuals recruited according to bladder management methods. We have previously discussed our reliability evidence for the USQNBs for those using intermittent catheterization (IC) in Tractenberg et al. 2018 [19] together with its validity evidence; and reliability evidence for the USQNB instruments specific to those using indwelling catheter (IDC, inclusive of both indwelling urethral and suprapubic) and voiding (V) in Tractenberg et al., in review [20]. The USQNB instruments are intended to aid in UTI diagnosis for individuals with SCI/D, as well as support research and patient self-management with respect to urinary symptoms potentially attributable to NLUTD-related UTI, depending on bladder management. Throughout this work, we have followed the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) [21] to guide our design for prospective testing of these instruments, which were created following a patient-centered patient reported outcome development model we published in 2017 [13]. These standards represent key measurement properties arising from psychometrics and education. 
One of these properties is reliability, defined as either the level of association between equivalent forms of a single assessment or the level of consistency among successive administrations (replications) of the assessment [22]. Mokkink et al. elaborate that “reliability” relates to “(t)he degree to which the measurement is free from measurement error” (p. 108) [21]. The other of these is validity, which describes the extent to which evidence and theory support the use/interpretation of the summary of the assessment, often a total score (or subscore) [22]. Here we focus attention on the validity evidence for using the USQNB-IDC and USQNB-V, our two newest instruments, continuing our objectives to promote and support research into new interventions for UTI and bothersome urinary signs and symptoms in the presence of NLUTD.

Formal (psychometrically-defined) validity is a multi-dimensional continuum that comprises the evidence supporting the use of an assessment for a specific (decision-making) purpose; this has been extensively discussed and studied for the cognitive and achievement assessment of students/in educational settings [23]. Geisinger discusses the modern perception of validity as “…a unitary concept for which there are five types of evidence to gather to justify the use of a particular measure (p. 631).” [23] These five types include content; relations with other variables of interest; internal structure of the instrument; what the respondent (must be doing) as they answer; and the consequences of the use of the instrument. These features, representing the technical definitions of validity evidence from the field of psychometrics, are consistent with—and go beyond—the COSMIN evaluation approach outlined in Prinsen et al. 2018 (Phase B, step 5 (content validity) and step 7 (criterion validity)) [24].

In this paper we explore the psychometric validity of patient responses on the two newest USQNBs, for indwelling catheter users and voiders, using Geisinger’s set of evidence types [23] to ensure we meet and exceed the COSMIN standards [21, 24]. While there is no ‘gold standard’ for assessing urinary symptoms attributable to UTI, this manuscript describes evidence of the validity of the two newest USQNBs for use by clinicians, researchers, and by individuals to self-monitor and self-manage these symptoms.

Methods

All USQNBs were developed following our model for patient-centered patient reported outcomes [13], separately to elicit patient-centered reports of urinary symptoms by individuals with NLUTD according to bladder management method. Approval for the studies was received from the MedStar National Rehabilitation Hospital Institutional Review Board (V: IRB# 2016-212, IDC: IRB# 2016-088). As discussed below, COSMIN definitions of “validity” were augmented with other standards [22] to ensure that decisions made by clinicians, researchers, and patients would be supported by the items on these instruments. That is, potential users of these instruments need evidence that they will generate information about the respondent’s experiences that is appropriate for the uses to which it will be put [25]. To ensure content validity as well as convergent and divergent validity, we recruited responses from national samples within and outside of our targeted respondents specifically for the purposes of establishing the validity (this paper) and reliability (Tractenberg et al., in review [20]) of these two instruments. Our analyses focused on endorsement of the items, because the impact, frequency, and severity ratings increase the difficulty of generalizing results.

Recruitment of target participants

As we have reported elsewhere for this sample [20], participants in the patient groups these instruments target were recruited via direct email, social media, and with the assistance of advocacy organizations. Participants were recruited in the United States by English-language advertising through Facebook, via email, and with the assistance of the national (U.S.) advocacy networks in spinal cord injury and multiple sclerosis. All of these outreach and recruitment efforts were advertisements seeking respondents with NLUTD who use the specified bladder management method to visit the URL we established for data collection. “Voiding” was defined as “primarily voids to empty their bladder; that individual may intermittently (no more than once per day) use an intermittent catheter to empty his/her bladder”. “Indwelling catheter” was defined to include suprapubic and indwelling urethral types. These were essentially the inclusion criteria for our target groups.

These recruitment efforts targeted individuals with SCI/D who have NLUTD or neurogenic bladder. Partner advocacy groups announced study recruitment. Participants were recruited using a SurveyMonkey link sent as a direct e-mail or posted on our website and Facebook. No identifying information was gathered to be able to identify or re-contact those who completed the survey. We included questions in the survey about time since diagnosis (of SCI or MS) and whether respondents had received a diagnosis of neurogenic bladder or NLUTD or if they did not know if they had this diagnosis. We also asked about respondent experience of UTI (“how many urinary tract infections (UTIs) have you been diagnosed with in your lifetime?”), but we did not ask for details about any UTI diagnosis. The responses to these ancillary questions were used to ensure we obtained responses from our target population (SCI/D with NLUTD). We then sought responses from participants in our validation (divergent/convergent) populations, described below. No personal or identifying data were collected from any respondent, and initiating responses on the survey was deemed sufficient consent by the IRB.

Divergent validity participant recruitment

As with the USQNB-IC [19], we sought divergent validity data by having people who did not meet our inclusion criteria—i.e., having no experience with UTI, NLUTD, or both—complete the instruments. Individuals with NLUTD who self-report a history of no UTIs (“how many UTIs have you been diagnosed with in your lifetime?”= none), individuals with chronic mobility impairments (but without SCI or MS) and without NLUTD, and individuals with no mobility impairment, no NLUTD, and no history of UTIs were recruited largely from an inner-city rehabilitation hospital to complete the USQNB-V and USQNB-IDC, according to their bladder management method, if relevant. We sought a minimum of 30 respondents in each of our “divergent validity evidence” groups, planning to close the survey once we reached 50 in any of these groups. We planned a 6-month data collection period, and over time we were able to determine whether we were on track to reach our minimum group size. We determined that, if we could not get even 30 respondents in a subgroup, we would explore collapsing smaller subgroups, e.g., we started with 0 UTIs, 1-2 UTIs, 3–5 UTIs, and >5 UTIs as subgroups for voiders; however, we anticipated these might be too rare to allow us to reach our minimum response targets for each divergent validity group. Thus, we planned to collapse over the UTI-count subgroups, as needed (but not over groups based on other characteristics, like mobility or NLUTD status).
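The subgroup-collapsing rule described above can be illustrated with a minimal Python sketch. The counts, the function name `collapse_uti_subgroups`, and the decision to merge every non-zero UTI-count subgroup as soon as any one of them falls below the minimum are assumptions made for illustration; they are not the study's data or its analysis code.

```python
MIN_GROUP_SIZE = 30  # minimum respondents per divergent validity group (from the text)


def collapse_uti_subgroups(counts: dict[str, int],
                           min_size: int = MIN_GROUP_SIZE) -> dict[str, int]:
    """Merge the non-zero UTI-count subgroups into one 'history of UTI' group
    when any of them has fewer than min_size respondents.

    The '0 UTIs' subgroup is always kept separate. Subgroup labels are
    hypothetical strings chosen for this sketch.
    """
    history = {label: n for label, n in counts.items() if label != "0 UTIs"}
    if all(n >= min_size for n in history.values()):
        return dict(counts)  # every subgroup is large enough; nothing to collapse
    collapsed = {}
    if "0 UTIs" in counts:
        collapsed["0 UTIs"] = counts["0 UTIs"]
    collapsed["history of UTI"] = sum(history.values())
    return collapsed


# Made-up recruitment counts for illustration only.
counts = {"0 UTIs": 41, "1-2 UTIs": 18, "3-5 UTIs": 22, ">5 UTIs": 35}
print(collapse_uti_subgroups(counts))  # {'0 UTIs': 41, 'history of UTI': 75}
```

Only the UTI-count dimension is collapsed here, consistent with the plan not to collapse over groups defined by mobility or NLUTD status.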

All respondents were asked about each symptom they endorsed within the past year, with instructions, “when considering a symptom, you should first identify what is a normal experience for you. After identifying your normal experience, report any CHANGE from that. If you report “Yes”, you will then be asked follow-up questions regarding that particular symptom”. For symptoms that were endorsed, respondents were asked to describe whether they attributed that symptom “to a urinary tract infection (UTI)”. We did not anticipate that “attributed to a UTI” would be consistently interpretable [19], given that symptoms may have occurred more than one time over the year and that there exists a wide range of diagnostic criteria for “UTI”. Although we included attribution options of “always”, “sometimes”, “rarely”, and “never”, we planned to only use the response “never attributed to a UTI” to characterize the tendency to experience both the specific symptom and a UTI. Thus, contributing to our evidence of convergent and divergent validity, we computed the % of each responding group who did endorse a symptom and never attributed it to a UTI. Convergent validity evidence comes from similar groups having similar response patterns to those in the target group, while divergent validity evidence comes from dissimilar groups having different response patterns from the target group’s patterns. Both of these represent Geisinger’s “evidence based on relations with other variables”, with additional convergent validity evidence coming from lower rates of endorsement of a symptom with never attributing it to a UTI within our target group and validity subgroups with similar UTI lifetime experience, as compared to those with lower levels of UTI experience.
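As a minimal sketch of the attribution summary described above, the following Python computes, for one item and one responding group, the percentage of endorsers who never attributed the symptom to a UTI. The record layout, the function name `pct_never_attributed`, and the sample responses are hypothetical. Using endorsers as the denominator is an assumption that follows the caption of Fig. 2. This is not the study's analysis code.

```python
from typing import Optional


def pct_never_attributed(responses: list[dict]) -> Optional[float]:
    """Percent of endorsers of one item who chose 'never' for UTI attribution.

    Each response is a dict with hypothetical keys:
      'endorsed'    -> bool, yes/no endorsement of the item in the past year
      'attribution' -> 'always' | 'sometimes' | 'rarely' | 'never' | None
    Returns None when nobody in the group endorsed the item.
    """
    endorsers = [r for r in responses if r["endorsed"]]
    if not endorsers:
        return None
    never = sum(1 for r in endorsers if r["attribution"] == "never")
    return 100.0 * never / len(endorsers)


# Made-up group of five respondents answering a single item.
group = [
    {"endorsed": True, "attribution": "never"},
    {"endorsed": True, "attribution": "sometimes"},
    {"endorsed": True, "attribution": "always"},
    {"endorsed": True, "attribution": "never"},
    {"endorsed": False, "attribution": None},
]
print(pct_never_attributed(group))  # 50.0
```

Only the “never” response is counted, which reflects the stated plan to avoid interpreting the intermediate attribution options.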

Materials

For each instrument (and following the same method as was reported in Tractenberg et al. [19]), bladder management-specific focus groups and individual interviews were convened to discuss the items that represent the patient experience of urinary symptoms associated with NLUTD. These were reviewed and revised iteratively by clinical subject matter experts, patient experts, and our research team, leading to the instruments reported in this paper.

Once each instrument was finalized, national samples of people with NLUTD who manage their bladders with indwelling catheters (IDC) or by voiding (V) were recruited to complete the new (relevant) instrument (USQNB-IDC or USQNB-V) online, using SurveyMonkey. Participants anonymously filled out one instrument—specific to their management type, by following a URL to the survey site. Each item was presented in English as a query about whether the respondent had experienced it during the past year (yes/no). These instruments contain 26 different items each (plus one item, “other”), and all items are phrased such that endorsement indicates either a greater level than is normal, or more intense experience of something than is usual (e.g., darker urine than normal).

For all instruments, the symptoms have been categorized into one of four different symptom types: those that are clinically actionable (A, “actionable”); those that represent bladder-specific signs and symptoms (B1, “bladder actionable”); those representing characteristics of urine (B2, “urine quality”), and all of the other items (C, “other”) that were retained from the initial patient focus groups and interviews, following evaluation by clinicians and investigators. The instruments and classifications are the subject of a new paper currently in preparation; the symptoms appear in Tables 3 and 4, sorted according to the classifications [12] shown in Fig. 1A, B in the Results section.

Fig. 1: Percentage of those in each responding group endorsing items on the USQNB-IDC and USQNB-V.

A Endorsement of items on the USQNB-IDC across groups. B Endorsement of items on the USQNB-V across groups.

The analyses described below for each instrument were focused on the endorsement (yes/no) of the items on each instrument. Tables 3 and 4 show the symptom that individuals were asked to endorse if it had been experienced within the previous 12 months. All responses were given “for the past year”, even though we intend for the instruments to be used in a much shorter time frame. This yearlong time frame was featured to ensure that the preliminary validity evidence we collected from our national sample on each instrument was as inclusive as possible.

Validity evidence

These analyses were planned to generate evidence of instrument validity. We defined content validity as the reflection of the construct to be measured (urinary symptoms potentially attributable to UTI for NLUTD, depending on bladder management); face validity as the recognizability of the items as representing the construct to be measured (symptoms that were generated by subject matter experts and similar patients); and internal or structural validity as the extent to which the instrument captures recognizable dimensions of the construct to be measured. However, because there is no “gold standard” for urinary symptoms of UTI in this population, we were unable to generate evidence of criterion validity, defined by association with a gold standard. Instead, we used the patterns of “never attributed to a UTI” among the target respondent groups, so that lower levels of never attributing the symptom to a UTI would suggest association with UTI.

In addition, we collected evidence on convergent validity (people around the country with similar bladder management and a history of urinary symptoms should experience the symptoms our focus groups identified) and divergent validity (people around the country who do not share the target groups’ NLUTD or UTI history should NOT experience, or should experience much less frequently, the symptoms our focus groups identified). Divergent validity data came from individuals with NLUTD who do not have a history of UTIs, individuals with chronic mobility impairments (no SCI/D) and without NLUTD, and those with no mobility impairment, no NLUTD, and no history of UTIs. COSMIN criteria [21] define construct and criterion validity as including divergent and convergent validity; including both divergent and convergent validity data conforms to Geisinger’s “relations with other variables of interest”. We also used the patterns of “never attributed to a UTI” among the divergent validity respondent groups, so that higher levels of never attributing the symptom to a UTI would suggest an important lack of association with UTI. Given the challenges in diagnosing UTI arising from variable criteria and the reliance on self-report, the clearest criterion signal comes from the “never attributed” response.

As we argued in our exploration of the psychometric properties of the USQNB-IC [19], structural validity was not expected to yield interpretable results for either of the two new instruments, because this analytic approach to validity is consistent with neither our approach to developing the patient-centered patient reported outcomes nor the one-year time frame of our national sample surveys. However, as we have done before, we confirmed this suspicion with exploratory common factor analysis (using principal axis factoring) to model shared covariance among items, and with an inferred causal model using Bayesian networks to model the shared Shannon information [20].

Table 1 below presents the COSMIN validity criteria together with the methods by which evidence of these properties for the instruments was obtained.

Table 1 COSMIN validity criteria and how they were assessed in this study.

Data analysis

Statistical analyses to generate the validity evidence described above/listed in Table 1 were carried out using SPSS v. 24 for descriptive statistics; these analyses were based on the yes/no endorsement of all items for the samples by instrument. We identified an endorsement rate minimum cut-off of 10% [19], below which we would want to reconsider the item’s inclusion due to low national endorsement over the course of a year, suggesting lower-than-acceptable content or face validity. We previously reported the results of similar analyses of the USQNB-IC [19], and reliability evidence for the USQNB-IDC and USQNB-V [20].
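The endorsement-rate screen described above can be sketched in a few lines of Python. The study's analyses were run in SPSS. The item names, the sample of 20 respondents, and the helper names `endorsement_rates` and `items_to_reconsider` are hypothetical; only the 10% cut-off comes from the text.

```python
ENDORSEMENT_CUTOFF = 10.0  # percent; the minimum endorsement cut-off stated in the text


def endorsement_rates(responses: list[dict[str, bool]]) -> dict[str, float]:
    """Percent of respondents answering 'yes' to each item.

    Each response maps an item name to True (endorsed) or False.
    All responses are assumed to contain the same items.
    """
    if not responses:
        raise ValueError("no responses supplied")
    n = len(responses)
    items = responses[0].keys()
    return {item: 100.0 * sum(r[item] for r in responses) / n for item in items}


def items_to_reconsider(rates: dict[str, float],
                        cutoff: float = ENDORSEMENT_CUTOFF) -> list[str]:
    """Items whose endorsement rate falls below the cut-off."""
    return sorted(item for item, rate in rates.items() if rate < cutoff)


# Made-up sample: 12 of 20 respondents endorse one item, 1 of 20 endorses another.
sample = [{"cloudy_urine": i < 12, "rare_item": i < 1} for i in range(20)]
rates = endorsement_rates(sample)
print(rates)                       # {'cloudy_urine': 60.0, 'rare_item': 5.0}
print(items_to_reconsider(rates))  # ['rare_item']
```

In this sketch an item below the cut-off is only flagged for reconsideration, matching the text's wording; it is not dropped automatically.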

Results

Demographics

We checked to ensure respondents in each group (target, convergent, divergent) met the criteria for that group and bladder management method. All respondents on the USQNB-IDC had SCI (n = 306) or MS (n = 8) and were included. All respondents on the USQNB-V with SCI met criteria for NLUTD, although the survey item asked specifically about neurogenic bladder (which might be more familiar to respondents), and were included as “voiders with NLUTD” (n = 103). Of the 447 individuals with MS who were recruited to complete the voider instrument, four indicated they had an MS diagnosis in the demographics questions but, in another question (“MS flare up”), responded that they did not have MS; another person stopped responding after completing fewer than half of the USQNB-V. Of the remaining 442 MS voider responders, 33 indicated that they did not have neurogenic bladder (NB) but instead had “some other bladder problem”, and so were excluded from consideration. Of the remaining 409, 185 indicated that they had NB, and 224 reported that they did not have, or did not know if they had, a diagnosis of NB. Of these 224, 220 individuals were included as “voiders with NB” because they had a diagnosis of MS or SCI and indicated that they experienced at least one bladder- or urine-specific (B1 or B2) symptom on the USQNB-V (thus endorsing an NLUTD-causing SCI/D condition plus some experience of symptoms associated with NLUTD). The remaining four, who did not endorse any bladder- or urine-specific symptoms, were excluded because their NLUTD status was ambiguous, leaving 405 MS voiders with NLUTD in the sample.
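The participant flow for MS voiders reported above can be checked with simple arithmetic. The following Python sketch uses only the counts stated in the text; the variable names are hypothetical labels chosen for readability.

```python
# Counts taken from the text; variable names are illustrative labels only.
recruited_ms_voiders = 447
later_denied_ms = 4          # reported MS in demographics, denied it in a later item
stopped_early = 1            # completed fewer than half of the USQNB-V

remaining = recruited_ms_voiders - later_denied_ms - stopped_early
other_bladder_problem = 33   # no NB, "some other bladder problem"; excluded
eligible = remaining - other_bladder_problem

reported_nb = 185
no_or_unknown_nb = 224       # did not have, or did not know if they had, NB
included_by_symptoms = 220   # endorsed at least one B1 or B2 symptom
excluded_ambiguous = no_or_unknown_nb - included_by_symptoms

ms_voiders_with_nlutd = reported_nb + included_by_symptoms

print(remaining, eligible, excluded_ambiguous, ms_voiders_with_nlutd)
# 442 409 4 405
```

Each intermediate total matches the corresponding figure in the text (442, 409, 4 excluded, and 405 retained).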

Table 2A, B show the descriptive statistics for our target groups (IDC or V with NB and the specific bladder management), as well as the divergent validity groups. Two individuals in one voider validity group indicated that they had no history of UTI, but when asked if they attributed any USQNB-V symptom that they endorsed to a UTI, one responded “always” on all endorsed items and the other responded “occasionally” on most endorsed items. Their responses (on all items) were excluded from all analyses.

Table 2 A. Descriptive statistics, indwelling catheterization (IDC) target and validity groups. B Descriptive statistics—voiders target (MS, SCI) and validity groups.

The divergent validity groups included people with MS or SCI who managed their bladders with catheters and who reported their lifetime history of UTIs in one of three categories: no UTIs, 1–5 UTIs, or >5 UTIs. This information was self-reported by respondents, and we were unable to confirm category membership. We were also forced to combine the 1–5 and >5 UTIs voider groups into one “history of UTI” group in order to reach our target of 50 (or more) in this group. For the IDC instrument we were able to find individuals in sufficient numbers (at least 50) who self-identified as having lifetime histories of 1–5 UTIs and >5 UTIs, but this was not possible during our recruitment period for the voiders divergent validity groups.

Validity evidence

Tables 3 and 4 present the percentages of respondents endorsing the symptom within the past year in the target and our convergent/divergent validity groups. Table 3 describes the IDC endorsements across all groups, while Table 4 describes endorsement for Voiders.

Table 3 USQNB-IDC endorsement rates by group (IDC, and convergent/divergent groups).
Table 4 USQNB-V endorsement rates by group (Voiders with NLUTD, and convergent/ divergent groups).

The endorsement rates for each item on both instruments were all >20% for our target respondents. The levels of endorsement by the target groups support claims of face validity for both instruments for the target population; all items were endorsed in the national sample. These patterns are clearer in Fig. 1A, B, reflecting the endorsement rates for all groups on the instrument for IDC (1A) and V (1B), below.

Fig. 1A, B show that the target groups endorse all items on their respective instruments sufficiently to suggest that the items on the instrument have face and content validity; people with history of UTI endorsed items at a greater frequency than those with a history of no UTI on both instruments.

Our final analysis of divergent and convergent validity was an examination of association of the endorsed symptoms “to a UTI”. As noted, the figures show the percentage of respondents in each group that did endorse each item, but never attributed it to a UTI (whether or not they reported ever having had a UTI before). Fig. 2A shows results for IDC, and 2B shows them for V.

Fig. 2: Percentage of those endorsing each item who never attributed it to a UTI.

A Attribution (never attributed to a UTI) of items on the USQNB-IDC across groups. B Attribution (never attributed to a UTI) of items on the USQNB-V across groups.

Values shown in Fig. 2A, B falling below 50% indicate that the specific symptom was attributed to a UTI (however “UTI” might have been determined by the respondent) at least 50% of the time. Respondents in divergent validity groups without history of UTI were expected to choose “never attributed to a UTI” 100% of the time, and this was observed for both the IDC and V samples. The figures suggest that, while prevalent, many clinically actionable (A) and “other” (C) type symptoms are not typically attributed to having a UTI when they are experienced by our target groups. For the IDC target group, both bladder (B1) and urine quality (B2) type symptoms are usually attributed to UTIs, while for voiders this is true for most urine quality (B2) but not all bladder (B1) type symptoms. In the divergent validity groups (non-target respondents), the rates tended to indicate less attribution to UTI if these were in their inclusion criteria/history, as compared to the target groups; the responding groups we recruited with histories of UTI naturally tended to have higher rates of attribution to UTI (i.e., lower points on these plots showing the percentage in each group that never attributed each symptom to a UTI).

Discussion

The purpose of this study was to describe the psychometric validity evidence supporting implementation of two new patient-centered patient reported outcome instruments for urinary signs and symptoms in people with NLUTD and who use either indwelling catheterization to manage their bladder (USQNB-IDC), or who void (USQNB-V). Validity is a continuum, and with so many sources it is impossible to coherently summarize all the evidence quantitatively. Evidence of content and face validity comes from the instrument design process and the endorsement rates by national samples. Criterion validity is challenging to document because there is no accepted gold standard for the patients’ lived experience, but we were able to estimate this using respondent attribution of each symptom to experience with UTIs. Convergent and divergent validity evidence were obtained both from diverse respondent groups, and from the “never attributed to UTI” item on the survey we administered. Specifically, attribution of USQNB items to UTI tended to be greater for those with greater experience with UTIs, resulting in lower rates of “never attributed to a UTI” as well as higher endorsement rates when compared to respondents with no history of UTI. Convergence across those with similar UTI and bladder management experience was observed in both the endorsement and the attribution, while divergent validity evidence is similarly presented for those without, or with lower levels, of UTI experience and with bladder management that does not involve catheters. Structural validity evidence was found for these instruments with the target groups, and has been reported elsewhere [20], which is relevant despite the lack of a measurement (causal) model in the instrument’s development. Our validity evidence is summarized below:

Face, content, convergence validity evidence

All items—which were generated by patients themselves in our focus groups, and then integrated with clinician and researcher input—were recognized and endorsed by at least 20% of the national samples using each bladder management method (most were endorsed by at least 50%). We also observed that endorsement rates were highest for all groups who reported a history of UTIs, as compared to corresponding groups reporting no UTIs. Further, the symptoms on the USQNBs are typical of individuals with NLUTD who experience UTIs, but it is not clear whether the USQNB items are solely specific to NLUTD-related UTI or that they might extrapolate to other populations. Our validity evidence suggests that the items are generally more commonly endorsed by individuals with experience of UTIs than those without a history of UTIs.

Divergent validity

Few items were endorsed by even 50% of any of the control groups (without NLUTD or UTI history), although those with a self-reported history of UTIs tended to endorse at higher rates while those without a history of UTIs endorsed at the lowest rates. These results suggest that the items are more descriptive for our target user group and less descriptive of the experiences of those outside that group, particularly those without histories of urinary symptoms and UTIs. It is important to note that information about UTI history was self-reported, and we were unable to confirm category membership. We were also forced to combine the 1–5 and >5 UTIs voider groups into one “history of UTI” group in order to reach our target of 50 in this group, while for the IDC instrument we were able to identify individuals for the divergent validity groups in sufficient numbers who reported having lifetime histories of 1–5 UTIs and >5 UTIs. The convergent and divergent validity results tend to support our intention that the USQNB instruments reflect urinary (and other) symptoms that are associated with UTI.

As with our previous report of intermittent catheter users [21], the national samples queried participants about each item within the previous year (e.g., “Did you experience, in the past year, increased cloudiness of urine?”), even though the intended use of the instrument is for a much shorter time window, such as one or two weeks. While we have observed different endorsement rates by week over an 18-month period for the USQNB-IC items [18], and over a 12-month period when assessed biweekly for the USQNB-V [26] and USQNB-IDC [27], the analysis and interpretability of these fluctuations is not yet clear. We have reported [18, 28] different methods of summarization (i.e., scoring) of the USQNB instruments that can be used together with our reports of the validity and reliability of all three instruments for continuing implementation of these surveys and their use by patients, clinicians, and investigators.

Although our national samples completing the USQNB-IDC and -V, as with the USQNB-IC [19], support conclusions of validity and reliability [20], these national surveys attracted respondents who are quite heterogeneous. That is, respondents were not excluded if they had any of the following conditions: (1) known genitourinary pathology beyond neurogenic bladder (e.g., vesicoureteral reflux, bladder or kidney stones); (2) use of prophylactic antibiotics; (3) instillation of intravesicular agents to reduce UTI (i.e., gentamicin); (4) psychologic or psychiatric conditions influencing the ability to follow instructions; and (5) participation in another study in which results would be confounded. Thus, all three instruments derive validity and reliability evidence from national samples that are possibly more similar to a typical clinical sample than to a typical clinical trial/efficacy study sample. Importantly, these characteristics were all exclusion criteria for the focus groups from whom the items on all instruments were originally obtained; since it is possible that our national sample included respondents with some of these potentially confounding factors, it tends to strengthen the argument from this evidence of the face, content, and convergent validity of the instruments. However, everyone who agreed to participate was self-selected (just as our focus groups were in developing the instruments in the first place). We did not inquire about facilitated voiding techniques (Valsalva, Credé, etc.), as there is no standardized way to interpret patient report of these techniques. Similarly, we did not specify a type of indwelling catheter, and responses on the USQNB-IDC from the target group would have included individuals with both indwelling urethral and suprapubic catheters.
While research applications of these instruments would have possibly more homogeneous respondents than our samples were, the validity evidence comes from heterogeneous users that are more consistent with clinical practice.

With the validity (this paper) and reliability [20] evidence supporting the further use and implementation of these instruments in clinical, research, and self-management contexts, greater depth of study into the existence of bias is possible, including recall bias. This study used a 12-month recall period to ensure we could capture real face validity evidence from our national target sample. However, the instruments are intended for use over a much shorter time frame (1–2 weeks), which will limit recall bias.

Our approach to validity evidence leverages formal psychometric criteria [23] and consensus-based approaches [21] to the documentation of validity in clinical assessment. Both include considerations of associations with gold standards (COSMIN criterion validity) or decisions (Geisinger’s “consequences of testing”), and because the purpose of the USQNB instruments is to strengthen the representation of the patient’s experience of urinary signs and symptoms in the diagnosis of UTI (and possibly, sub-clinical treatable levels of symptoms) and improve our understanding of NLUTD, neither of these aspects of validity could be included in this study. We used the same methods [13] to develop the two USQNB instruments described here as were used for the first USQNB, for intermittent catheter users (USQNB-IC [19]). For these three independent studies, similar sources of validity evidence were obtained, even though the instruments have different items. Our team continues to analyze these rich data sets in hopes of continuing to move research and clinical care forward.

Conclusions

The patient-centered patient reported outcomes discussed here represent our efforts at “valuing the patient perspective” and maintaining a “culture of patient centeredness in research”. It is important to keep in mind that formal measurement properties that are of greatest interest [21, 24] in health outcomes (reliability and validity) are defined based on the uses to which the instrument will be put [22, 23, 25, 29]. Our results suggest that, as we reported for the USQNB-IC [19], the USQNB-V and USQNB-IDC provide a valid, coherent, and comprehensive view of the patient’s experience of urinary symptoms along the continuum of NLUTD. The three instruments were developed independently but have yielded the same level and types of validity evidence, which strengthens our confidence in the results. A new paper outlining all three USQNB instruments, together with COSMIN-appropriate scoring information, is in preparation. These instruments are clinically relevant because patient-reported urinary symptoms are bothersome and common, but not the same as the symptoms identified by authoritative guidelines for diagnosis of UTI [13]. With an instrument that allows for measurement of both patient- and clinician-determined symptoms that are potentially related to UTI for individuals with NLUTD specifically, and depending on bladder management, we can begin to differentiate those symptoms which are definitely, probably, or unlikely to be related to UTI among those with NLUTD. Our team continues to work to improve treatment and research into UTI and bothersome urinary symptoms (e.g., Tractenberg et al. 2020 [18]), as well as to help promote antibiotic stewardship by focusing treatments where antibiotics would be most likely to be effective.