Patterns of joint involvement in juvenile idiopathic arthritis and prediction of disease course: A prospective study with multilayer non-negative matrix factorization

Background Joint inflammation is the common feature underlying juvenile idiopathic arthritis (JIA). Clinicians recognize patterns of joint involvement currently not part of the International League of Associations for Rheumatology (ILAR) classification. Using unsupervised machine learning, we sought to uncover data-driven joint patterns that predict clinical phenotype and disease trajectories. Methods and findings We analyzed prospectively collected clinical data, including joint involvement using a standard 71-joint homunculus, for 640 discovery patients with newly diagnosed JIA enrolled in a Canada-wide study who were followed serially for five years, treatment-naïve except for nonsteroidal anti-inflammatory drugs (NSAIDs) and diagnosed within one year of symptom onset. Twenty-one patients had systemic arthritis, 300 oligoarthritis, 125 rheumatoid factor (RF)-negative polyarthritis, 16 RF-positive polyarthritis, 37 psoriatic arthritis, 78 enthesitis-related arthritis (ERA), and 63 undifferentiated arthritis. At diagnosis, we observed global hierarchical groups of co-involved joints. To characterize these patterns, we developed sparse multilayer non-negative matrix factorization (NMF). Model selection by internal bi-cross-validation identified seven joint patterns at presentation, to which all 640 discovery patients were assigned: pelvic girdle (57 patients), fingers (25), wrists (114), toes (48), ankles (106), knees (283), and indistinct (7). Patterns were distinct from clinical subtypes (P < 0.001 by χ2 test) and reproducible through external data set validation on a 119-patient, prospectively collected independent validation cohort (reconstruction accuracy Q2 = 0.55 for patterns; 0.35 for groups). Some patients matched multiple patterns. To determine whether their disease outcomes differed, we further subdivided the 640 discovery patients into three subgroups by degree of localization—the percentage of their active joints aligning with their assigned pattern: localized (≥90%; 359 patients), partially localized (60%–90%; 124), or extended (<60%; 157). Localized patients more often maintained their baseline patterns (P < 0.05 for five groups by permutation test) than nonlocalized patients (P < 0.05 for three groups by permutation test) over a five-year follow-up period. We modelled time to zero joints in the discovery cohort using a multivariate Cox proportional hazards model considering joint pattern, degree of localization, and ILAR subtype. Despite receiving more intense treatment, 50% of nonlocalized patients had zero joints at one year compared to six months for localized patients. Overall, localized patients required less time to reach zero joints (partial: P = 0.0018 versus localized by log-rank test; extended: P = 0.0057). Potential limitations include the requirement for patients to be treatment naïve (except NSAIDs), which may skew the patient cohorts towards milder disease, and the validation cohort size precluded multivariate analyses of disease trajectories. Conclusions Multilayer NMF identified patterns of joint involvement that predicted disease trajectory in children with arthritis. Our hierarchical unsupervised approach identified a new clinical feature, degree of localization, which predicted outcomes in both cohorts. Detailed assessment of every joint is already part of every musculoskeletal exam for children with arthritis. Our study supports both the continued collection of detailed joint involvement and the inclusion of patterns and degrees of localization to stratify patients and inform treatment decisions. This will advance pediatric rheumatology from counting joints to realizing the potential of using data available from uncovering patterns of joint involvement.


Methods and findings
We analyzed prospectively collected clinical data, including joint involvement using a standard 71-joint homunculus, for 640 discovery patients with newly diagnosed JIA enrolled in a Canada-wide study who were followed serially for five years, treatment-naïve except for nonsteroidal anti-inflammatory drugs (NSAIDs) and diagnosed within one year of symptom onset. Twenty-one patients had systemic arthritis, 300 oligoarthritis, 125 rheumatoid factor (RF)-negative polyarthritis, 16 RF-positive polyarthritis, 37 psoriatic arthritis, 78 enthesitisrelated arthritis (ERA), and 63 undifferentiated arthritis. At diagnosis, we observed global hierarchical groups of co-involved joints.
Some patients matched multiple patterns. To determine whether their disease outcomes differed, we further subdivided the 640 discovery patients into three subgroups by degree of localization-the percentage of their active joints aligning with their assigned pattern: localized (�90%; 359 patients), partially localized (60%-90%; 124), or extended (<60%; 157). Localized patients more often maintained their baseline patterns (P < 0.05 for five groups by permutation test) than nonlocalized patients (P < 0.05 for three groups by permutation test) over a five-year follow-up period.
We modelled time to zero joints in the discovery cohort using a multivariate Cox proportional hazards model considering joint pattern, degree of localization, and ILAR subtype. Despite receiving more intense treatment, 50% of nonlocalized patients had zero joints at one year compared to six months for localized patients. Overall, localized patients required less time to reach zero joints (partial: P = 0.0018 versus localized by log-rank test; extended: P = 0.0057).
Potential limitations include the requirement for patients to be treatment naïve (except NSAIDs), which may skew the patient cohorts towards milder disease, and the validation cohort size precluded multivariate analyses of disease trajectories.

Conclusions
Multilayer NMF identified patterns of joint involvement that predicted disease trajectory in children with arthritis. Our hierarchical unsupervised approach identified a new clinical feature, degree of localization, which predicted outcomes in both cohorts. Detailed assessment of every joint is already part of every musculoskeletal exam for children with arthritis. Our study supports both the continued collection of detailed joint involvement and the inclusion of patterns and degrees of localization to stratify patients and inform treatment decisions. This will advance pediatric rheumatology from counting joints to realizing the potential of using data available from uncovering patterns of joint involvement.

Author summary
Why was this study done?
• Juvenile idiopathic arthritis (JIA) is a heterogeneous childhood disease for which joint inflammation is the common denominator.
• The current classification system, the International League of Associations for Rheumatology (ILAR) subtypes, categorize patients according to the number of active joints (four or fewer versus five or more affected joints in the first six months of disease) with little evidence to support this cut-off.
• Grouping frequently co-involved joints into joint patterns may help to better classify patients and predict disease course.

Introduction
Juvenile idiopathic arthritis (JIA) is the most common chronic inflammatory rheumatic disease in childhood, characterized by joint inflammation of unknown etiology lasting at least six weeks starting at <16 years of age. The International League of Associations for Rheumatology (ILAR) criteria classify JIA into seven subtypes [1]. Clinicians acknowledge that individual joints or groups of joints influence outcome. Distinct joint involvement patterns such as dactylitis, sacroiliac joint involvement, or tarsitis are clearly recognized in patients with JIA [1]. However, despite efforts to characterize distinct disease entities, these patterns remain heterogeneous in clinical presentation and consequently disease course and response to treatment. In addition, individual joints such as the hip, cervical spine, ankle, or wrist may be indicators of poor outcome [2][3][4][5] and have therefore been considered in the 2011 American College of Rheumatology (ACR) treatment recommendations as indicator joints for poor prognosis [6][7][8]. However, existing classification subtypes and treatment recommendations do not take affected joint patterns into account [1,6]. Moreover, these findings only apply to a subset of JIA patients and are based on a small number of joints [9,10]. A significant gap remains in our understanding of the relationship between patterns of joint involvement and disease outcome.
Systematically collected data from the Research in Arthritis in Canadian Children, Emphasizing Outcomes (ReACCh-Out) study have enabled us to address this knowledge gap. ReACCh-Out is a prospective inception cohort study of children with newly diagnosed JIA [11] that has rigorously collected detailed longitudinal clinical data, including information on joint inflammation. A standard 71-joint homunculus precludes the use of traditional statistical-or supervised learning-techniques to associate individual joints, let alone combinations of joints, to descriptors of outcome in an unbiased manner due to multiple hypothesis testing in the context of small patient numbers, a common challenge in rare diseases. To address this challenge, unsupervised learning can help identify a small number of patterns of arthritis in a principled, logical, and expressive manner, whose outcomes we can then explore. We can draw from previous successes applying these approaches in JIA analyzing clinical and biomarker data [12,13]. These patterns not only provide possible predictive features of outcome but also help to organize the patients into homologous groups that may underlie further disease research.
In this study, we pursued a data-driven strategy in a novel data domain, joint involvements, to identify two easily measurable clinical variables-joint pattern and degree of localizationin children with newly diagnosed JIA. We then determined the relationship of these measures to longitudinal outcomes, including pattern stability and response to treatment.

Methods
Our analysis, outlined as a flow chart in S1 Fig, can be divided into the following phases: data collection and filtering, dimensionality reduction, clustering, and postclustering analysis. As indicated in our transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD) checklist (S1 Checklist), our postclustering analysis plan was developed after we completed the clustering and observed that some patients defied identification with a single joint grouping. This observation prompted us to include a variable describing this in our multivariate prediction model.

Study design
For the discovery cohort, patients enrolled in ReACCh-Out were included if they satisfied the classification criteria for any of the seven ILAR JIA subtypes [1]. Enrolled patients were included within one year of JIA diagnosis and were treatment naïve for medications except for nonsteroidal anti-inflammatory drugs (NSAIDs  [11]. The nonoverlapping independent validation cohort comprised prospectively collected patients with newly diagnosed chronic childhood arthritis and detailed information about joint involvement recruited between 1980 and 2007 from two tertiary Canadian sites: Royal University Hospital of Saskatoon, Saskatchewan and Health Sciences Centre of Winnipeg, Manitoba [14]. These sites were chosen because they had standardized collection of prospective data by the same personnel over the study timeframe. For this study, all patients were treatment naïve except for NSAIDs. Prior to the introduction of the ILAR classification criteria, patients in the validation cohort were identified as satisfying ACR criteria for juvenile rheumatoid arthritis (JRA).
Research ethics boards at each participating center approved the study protocols. Informed written consent for participation was obtained from parents, and informed consent or assent was obtained from patients as appropriate.

Demographic, clinical, and laboratory data
For the discovery cohort, detailed demographic, clinical, and laboratory data were collected prospectively at study entry using standardized clinical reporting forms. Collected information included key features of the ILAR classification [1], the ACR pediatric core set measures of disease activity [15], standard laboratory markers, and physician assessments of global disease activity. All forms were subsequently validated. Musculoskeletal information for 71 joints as well as treatment regimens were serially collected at six-month intervals for the first 18 months and yearly thereafter for five years after study enrollment.
In the validation cohort, the features of the ILAR classification were collected as well as joint involvements at all subsequent clinical visits.

Hardware and software
All analyses were conducted using R version 3.3.0 (Vienna, Austria) and Python version 3.6 and higher (Wilmington, Delaware, US) on Apple computers running Mac OS X 10.10 and higher, as well as at supercomputing facilities at the High Performance Facility at SickKids (Toronto, Ontario, Canada) and the Terrence Donnelly Centre for Cellular and Biomolecular Research at the University of Toronto (Toronto, Ontario, Canada).

Joint co-involvements
Initially, joint co-involvements-which describe pairs of joints that are active together-were analyzed by calculating joint co-involvement frequencies and conditional joint co-involvement frequencies. A joint co-involvement frequency, P(x,y), for a reference joint x and a co-involved joint y is the proportion of patients having both joints involved. The conditional joint coinvolvement frequencies, P yjx ð Þ ¼ Pðy;xÞ PðxÞ , are the fraction of all patients with x involved who also have y involved.
To investigate whether joint co-involvement frequencies were asymmetric, i.e., higher for joints on the same side of the body, we developed a measure of same-side versus opposite-side skew in the co-involvement frequencies computed across the discovery cohort. A "joint type" is the pair of corresponding joints on opposite sides of the body, e.g., the knee joint type consists of the left and right knee joints. To compute skew for a reference joint type x and a coinvolved joint type y, we counted the number of patients for whom a reference joint was paired with a co-involved joint on the same side (n same ) and on the opposite side (n opposite ) and used these to compute a z-score to measure skew for the joint type pair by setting z ¼ mÀ 0:5 s , such that m ¼ n same þ1 n same þ1þn opposite þ1 and s ¼ ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi mð1À mÞ n q . If the joint type pair co-involvement was symmetric, z would be close to zero, whereas if co-involved joints tended to occur on the same side of the body, z would be large and positive. To assess the significance of the skew, χ 2 tests were conducted using n same and n opposite under the null hypothesis that both quantities were equal. P values were translated to false discovery rates (FDRs) to account for multiple hypothesis testing. FDR < 0.1 was considered significant.

Identifying patterns of joint involvement
We selected an analysis strategy suitable for the positive joint counts that would be able to identify sparse patterns of joint involvement if they were present in the data. Sparse patterns better support interpretation and application of these patterns by clinicians. Also, we sought a mixed membership model so that patients could belong to multiple groups. Finally, our preliminary analysis suggested a hierarchical structure to the joint co-involvement patterns, with some tightly coupled joint groups (e.g., toe joints) that often co-occur along with one or more broader patterns of co-activation of these tight groups. As such, we adapted a form of multilayer non-negative matrix factorization (NMF) [16] to learn hierarchical, sparse representations (S1C Fig). Multilayer NMF progressively applies NMF to the joint patterns. NMF is a dimensionality reduction method that fits summary indicators or "factors" that group frequently co-involved joints together [17]. Broadly, NMF decomposes joint involvements X into a basis/loading matrix W, describing the contributions of joints to factors, and a coefficient/score matrix H that scores patients on factors. Multiplying these two matrices approximates the input data (X � WH). NMF produces an intuitive partsbased representation in which reconstructing patient data involves only adding groups of joints in factors, similar to adding parts of faces (different eyes or noses) to reconstruct facial representations [17]. Unlike other dimensionality reduction techniques like principal component analysis (PCA), NMF constrains the elements of W and H to be positive. This constraint often causes many elements of the two matrices to be zero, a trend that we supplement by setting small elements to zero (a technique called "sparsifying"), as described in S1 Text. NMF can be used to cluster patients by interpreting the nonzero elements of H for each patient as assignments to clusters represented by the factors in W [18].
Multilayer NMF was conducted as described in S1 Text. Briefly, NMF was applied to the joint involvement data, and W was then sparsified to produce low-level factors that correspond to tight joint groupings. NMF was then applied to the H matrix to identify tight joint groupings that often co-occur; the term "high-level factors" refers to these frequently co-occurring tight joint groupings. Through matrix multiplication, joint involvements can be recovered from the high-level factors. "Key joints" for each high-level factor were those appearing with nonzero contributions. For both the high-and low-level factors, patients were assigned into patient groups ("[x]") corresponding to their highest-scoring factor [19].
The degree of overlap was assessed between patient groups at both levels of the analysis. Patient factor scores were normalized patient-wise to the highest factor score and then zscore-transformed factor-wise. One-sided z-tests determined which patient groups had higher scores than expected on individual factors. FDRs were calculated from P values to account for multiple hypothesis testing. Relationships between patient groups and factors were significant if their FDR was <0.1.

Relationships between patient groups and ILAR subtypes
Relationships were visualized between patient groups and ILAR subtypes through a circular figure built using Circos 0.63 [20]. To identify enriched relationships between patient groups and ILAR subtypes, a χ 2 test was conducted. Relationships were enriched if their standardized residual was �1.96 (i.e., P < 0.05).

Localization of joint involvements
To describe how closely patients aligned with their associated high-level factors or patient groups, patients were further stratified by the "degree of localization" of their active joints. Patients with "localized" involvement had �90% of active joints being key joints in the highlevel factor underlying their patient group. For patients with "partially localized" involvement, this range was �60% and <90%. All other patients had "extended" involvement. S2 Text describes how these boundaries were determined.
To determine which patient groups skewed towards any localization, χ 2 tests were conducted to compare the distribution of localizations within a single patient group against the global distribution. P values were Bonferroni-adjusted for multiple hypothesis testing.

Associations of patient groups with treatment decisions
For each medication at six-month and one-year visits, multivariable logistic regression was conducted to predict medication status as an outcome from both patient groups and degrees of localization. Model significance was assessed using likelihood-ratio tests. A model for a medication and visit was significant if P < 0.05 after a Bonferroni adjustment for multiple hypothesis testing.

Patient group trajectories and disease trajectories
To track how joint involvement changed over subsequent visits, high-level patient factor scores and patient group assignments were calculated where possible at six-month, one-year, 18-month, two-year, three-year, four-year, and five-year visits. From baseline patient group assignments and localizations, frequencies of transitioning between two patient groups at any time between six-month and five-year visits were calculated. Significantly overrepresented transitions were determined by a 2,000× permutation test.
A multivariate Cox proportional hazards analysis, as implemented in the survival R package, version 2.41 [21], was conducted to identify which patient groups, ILAR subtypes, and degrees of localization experienced zero joint involvement more quickly. The assumption of proportional hazards was assessed visually and tested using the "cox.zph" function in survival. Hazard ratios (HRs) and 95% confidence intervals (CIs) were computed and patient groups, ILAR subtypes, and degrees of localization were compared using log-rank tests.

Validation
To determine whether the identified patterns of joint involvement could generalize beyond the discovery cohort, validation cohort joint involvement data were projected onto high-level factors and patient groups. At each level, the same scaling parameters were applied to joint involvement data or low-level patient factor scores. Patients were assigned to patient groups based on their highest-scoring high-level factors as above. Patterns of joint involvement were also identified independently of discovery joint patterns using the multilayer NMF framework described above.

Patient characteristics
We included 640 patients with newly diagnosed JIA in the discovery cohort and 119 in the validation cohort. Table 1 outlines demographic data for these cohorts respectively. Discovery cohort patients were diagnosed at a median age of 7.7 years, with a range of 0.57 to 16.6 years, whereas validation cohort patients were diagnosed at a younger age (median: 5.7 years; range: 0.5-18 years). The most highly represented subtype in both cohorts was oligoarthritis (discovery: 47%; validation: 56%), and most patients were female (discovery: 65%; validation: 71%).

Joints were co-involved in distinct patterns across the entire discovery cohort
To investigate patterns of joint involvement and co-involvement, we computed overall frequencies of individual joint involvement and pair co-involvement in the discovery cohort. Fig 1 depicts overall joint involvement frequencies. Knees, ankles, and wrists had the highest rate of involvement. When we considered conditional co-involvement frequencies partitioned by side of body (Fig 2), 3,271 of 5,041 (65%) of these probabilities were significant (P < 0.05 after Bonferroni adjustment). Examining the heat maps for same-side joints in the top-left and bottom-right quadrants, we observed hierarchical patterns of joint co-involvement along the vertical axis. Joints closer on the vertical body axis were generally more often co-involved joints. For example, index finger joints and middle finger joints were likely to be co-involved. However, there also appeared to be broad, nonlocal patterns of co-involvement, e.g., finger and toe joints were frequently co-involved. In contrast, the heat maps for opposite-side involvement in the top-right and bottom-left quadrants appeared nearly identical (Frobenius norm = 3.7; P < 0.001 by permutation test), indicating little proximity preference along the horizontal axis. For example, left index finger joints were as likely to be co-involved with right middle finger joints as those on the left middle finger. This observation was consistent with widespread symmetric joint involvement in JIA.
Asymmetric joint involvement is associated with more severe forms of JIA [22]. To study the prevalence of asymmetric co-involvements, we determined which joint types were more likely to be co-involved with joints on the same side of the body. Consistent with our initial impression of Fig 2, and different than what we expected based on previous reports in JIA, we found few statistically significant asymmetric joints: only ankle, midfoot, and subtalar joints (with ankle: χ 2 = 14, P FDR = 0.0097; with subtalar joints: χ 2 = 11, P FDR = 0.048) had statistically significant same-side co-involvement (S2 Fig).

Multilayer NMF identified seven distinct patterns of joint involvement
To characterize hierarchical patterns of joint co-involvements, we applied multilayer NMF on discovery cohort joint involvements to identify groups of frequently co-involved joints. To identify this hierarchical structure to the co-involvements, we conducted a second round of conventional NMF on low-level patient scores, which identified seven groupings of groups we deem "high-level factors", <A-G>. S3 Text describes the process of deriving these high-level factors. As expected from our preliminary investigations (Fig 2), each high-level factor combined low-level factors into broader, symmetric groupings covering partially overlapping areas of the vertical body axis (S6 Fig). We named these high-level factors <A pelvic girdle>, <B fingers>, <C wrists>, <D toes>, <E ankles>, <F knees>, and <G sternoclavicular joints>. Finger distal interphalangeal (DIP) joints distinguished <B fingers> from <C wrists>, as <B fingers> skewed towards finger DIPs and <C wrists> towards shoulders and elbows. <A pelvic girdle> included sacroiliac joints and/or hips, and <E ankles> included ankles, subtalar, and midfoot joints.
We classified each patient into one of seven groups ("[x]") corresponding to the high-level factors based on their highest-scoring high-level factor. Fig 3 shows  Overall, these patient groups corresponded to logical patterns of joint involvement reported at the bedside [23].

Patterns of joint involvement subdivided the ILAR subtypes
Six of seven patient groups associated with at least one ILAR subtype, despite patient groups comprising different stratifications of patients from the ILAR subtypes (χ 2 = 313; P < 0.001). Conversely, six of seven ILAR subtypes associated with patient groups. More opaque ribbons in the Circos figure (Fig 4), linking patients common to patient groups and ILAR subtypes, represent these enriched associations encompassing more patients than expected through standardized residuals from the above χ 2 test (S2 Table). These standardized residuals quantify how far the number of patients shared by a patient group and ILAR subtype deviated from expectation.

Some patients had nonlocalized joint involvement
Although the high-level groupings completely encapsulated joint involvements for most patients (56%), a small group of patients (25%) had more non-key joints involved than key joints (S9 Fig). We deemed these patients as having extended involvement. Patients with 90% of their joints as key were localized, and those with 60% to 90% were partially localized. S4 Text describes how we determined these thresholds. To determine the clinical significance of these subcategories, we compared the clinical attributes of patients therein. Patient

Treatment decisions were associated with degrees of localization
To determine whether treatment decisions were associated with patient groups and degrees of localization, we conducted multivariable logistic regression. S10A Fig depicts the number of patients in each group and localization with medication data at six-month and one-year visits. S10B Fig depicts the proportion of patients in each patient group, divided by localization, who received biologics, disease-modifying antirheumatic drugs (DMARDs), joint injections, and systemic corticosteroids prior to these visits. Patient groups differed by DMARD usage, with higher than expected utilization in children in [C wrists] and [E ankles] with localized involvement and [F knees] with partially localized and extended involvement, and lower than expected utilization in [F knees] with localized involvement (S4 Table). In terms of systemic corticosteroid usage, patients in [F knees] with localized involvements were less likely to receive such treatment at six-month visits. Therefore, patient joint group and localization influenced treatment.

Patient groups remained stable throughout initial disease course
To observe how the data-driven joint patterns evolved throughout disease course, we traced patient group assignments longitudinally. Fig 5 depicts transition probabilities from baseline patient groups, divided by degree of localization, to groups at any visit up to five years. While patients often transitioned to zero joint involvement, several transitions were enriched (P < 0.05; Holm-Bonferroni-adjusted), with little movement between patient groups given the lack of filled circles outside the diagonals. Among patients with localized involvements (Fig  5A) (Fig 5B), relationships between patient groups (left; colored wedges in the second-innermost ring) and ILAR subtypes (right; grey wedges in the second-innermost ring). Heat maps (outer rings; bottom legend) display high-level patient factor scores, scaled to 0% to 100%, starting with <A pelvic girdle> moving outwards to factor <G sternoclavicular joints>. Each set of arcs along the radial axis, aligned along a ray from the center, represents an individual discovery cohort patient. Ribbons link groups of patients between patient groups and ILAR subtypes. Enriched relationships, as determined from a χ 2 test, appear more opaque. ILAR, International League of Associations for Rheumatology; RF, rheumatoid factor.

Localizations predicted time to zero joint involvement
Having observed a strong tendency for some groups towards zero joint involvement (disease control/inactive disease), we asked whether the groups differed by the rate in which they achieved this outcome. We constructed a Cox proportional hazards model predicting time to zero joint involvement from patient groups, localizations, and ILAR subtypes. The resulting model did not deviate from the proportional hazards assumption (χ 2 = 22.9, P = 0.062) and identified localizations and ILAR subtypes that reached zero joint involvement at different rates than others (R 2 = 0.093, P < 0.001). At least half of the patients followed in [F knees  (HR = 0.63, Z = −2.8, P = 0.0057) (Fig 5D). Among diagnoses, patients with RF-positive polyarthritis (HR = 0.42, Z = 2.4, P = 0.016) and undifferentiated arthritis (HR = 0.64, Z = −2.8, P = 0.0059) reached this outcome more slowly. S5 Table contains additional statistics for the Cox proportional hazards model.

Patient groups generalized to an independent validation cohort
To determine whether factors and patient groups were generalizable beyond the discovery cohort, we projected an independent validation cohort of JIA patients to the joint patterns. These patients' joint involvements were reconstructed by low-level factors with Q 2 = 0.81, high-level factors with Q 2 = 0.55, and patient groups with Q 2 = 0.48 (S1 Table), comparing favorably against ILAR subtypes, with Q 2 = 0.43. Projected patient groups presented with similar joint involvement frequencies (S12 Fig). We then calculated de novo factors and patient groups, which we detail in S4 Text. Validation joint involvements were reconstructed by lowlevel factors with Q 2 = 0.84, high-level factors with Q 2 = 0.55, and patient groups with Q 2 = 0.35. De novo factors described similar joint groupings as discovery factors (S13C, S13F and S13G Fig).
We then asked whether the projected groups could also predict time to zero joint involvement. Considering individual univariate models due to limited power, patient groups (χ 2 = 17, P = 0.0072) and localizations (χ 2 = 8.9, P = 0.012; S14 Fig) themselves predicted this outcome. Patients with extended involvement took significantly longer to reach zero joint involvement than patients with localized involvement (HR = 0.53, Z = −2.5, P = 0.011).

Discussion
We explored patterns of articular involvement in JIA using unsupervised data-driven pattern recognition techniques. We initially observed joints co-occurring in logical and localized groupings without same-side skewing (Figs 1 and 2 and S2 Fig). To better understand these signals in a clinically applicable manner, we conducted a modified version of multilayer NMF, identifying seven high-level groupings of joints ( S6 Fig and Fig 3) describing arthritis foci anchored by distinct subsets of joints. The resulting seven patient groups subdivided the ILAR subtypes into distinct subgroups based on patterns of arthritis (Fig 4). Patients with localized involvement often remained in the same patient group after baseline visit (Fig 5A, 5B and 5C) and reached zero joint involvement faster than patients with nonlocalized involvement ( Fig  5D). These patterns generalized to an independent validation cohort, supporting their applicability beyond the discovery cohort (S12 Fig and S13 Fig).
Our study is the first to provide a detailed, data-driven description of heterogeneously coinvolved joints in JIA. Previous studies have focused on individual joints [2][3][4][5][6][7][8] in specific individual ILAR subtypes [9,10], whereas we identified joint groupings in a data-driven fashion independently of these ILAR subtypes. For example, our approach identified differences in presentation among patients with polyarticular involvement, separating these patients based on joint patterns.  (Fig 5). Wrist involvement has been associated with poor prognosis and decisions to treat more aggressively [6]. Our results reflected this trend as patients in [C wrists] had less defined disease trajectories and longer times to zero joint involvement (S11 Fig) despite more commonly receiving DMARDs and systemic glucocorticoids. Similar findings were observed in patients in [B fingers] which tended to transition into [C wrists], with higher use of (biologic) DMARDs and systemic glucocorticoids at diagnosis. These findings were notable as patients in [B fingers] tended to transition to [C wrists] when they had partially localized involvement.
The stability of the patient groups for up to five years after diagnosis supports how meaningful these patterns are. As patients lose active joints over the course of treatment, we expected their joint patterns to shift as patient factor scores represent weighted sums of individual joints. With fewer active joints, we expected patterns to become more sensitive to the specific joints involved. However, patients remaining in their same groups in at least one subsequent visit suggested that patients with residual joint involvement had it in their group's key joints. Therefore, joint patterns may represent robust core groupings of joints much like the indicator joints of poor outcome [2][3][4][5][6][7].
Our study is also among the first to identify the degree of active joint localization with outcomes through an easily measurable clinical variable, the degree of localization. Patients had worse disease outcomes if their active joints did not align strictly with a single pattern. This has been instinctively recognized at the bedside by clinicians, as this aspect has clearly influenced treatment decisions (S4 Table and S10 Fig), but is not included in any clinical guidelines. Furthermore, classifying patients by the degree of localization predicted disease course and time to zero joint involvement. For patients with recognizable patterns of joint involvement, treatment decisions appear effective. However, patients with partially localized or extended joint involvement had the poorest outcomes, taking a longer time to reach zero joint involvement despite receiving stronger medications (e.g., more DMARDs). Patients with nonlocalized joint involvement may therefore represent a high-risk group who require early aggressive therapy.
Our pattern recognition approach is well suited for analyzing joint involvements. Factors supported signals that were apparent based on overall co-involvements. Low-level factors grouped knees, subtalar joints, and midfoot joints on separate sides of the body into separate factors, demonstrating that our approach identifies both asymmetrical and symmetrical patterns of arthritis. As our approach identified patterns with little overlap outside the fingers, patient groups described clinically homogeneous entities. Extending NMF into a multilayer approach bears some similarity to other hierarchical modelling techniques such as deep autoencoders [24], in which each successive layer identifies increasingly broad representations of the data, or Gaussian process latent variable models [25]. However, multilayer NMF provides directly interpretable latent representations. This representation differs from PCA, which produces patterns that are orthogonal to each other, a feature with implications with respect to joint involvements. Furthermore, PCA patterns are less intuitive as joints contribute both positively and negatively to them. To reconstruct a patient's joint involvements, we would have to add and subtract groups of joints, whereas with NMF, we would only add groups of joints.
Our study has a number of limitations. First, both discovery and validation cohorts were based on sample sizes of convenience. Because discovering joint patterns involved an unsupervised analysis, a priori power analyses were not done. However, the proximity of joints within patterns along the vertical axis and our ability to identify useful clinical measures demonstrated the potential of conducting such an analysis retrospectively. Secondly, the small size of the validation cohort limited our ability to test our findings in a multivariate model in a validation cohort, although we successfully validated our predictors in univariate tests. Furthermore, our validation cohort strongly supported these clinical measures, demonstrating that this concern is one of statistical power rather than approach. Lastly, we required patients to be treatment naïve except NSAIDs, which may have potentially skewed our patient cohorts towards individuals with milder forms of disease.
The identified joint patterns appear to have important prognostic implications. They are conceptually simple to apply at the bedside as they represent an easily computed weighted sum of active joints. Further classifying patients by the degree of localization may help clinicians further tailor treatment decisions as patients with less strongly defined phenotypes may require early aggressive therapy. Patterns of joint involvement may be among key components of a new disease classification for JIA in addition to other data domains, including antinuclear antibody (ANA) status [26], biological measures [13], and other musculoskeletal features such as enthesitis. Beyond JIA, our approach may be a transferrable template for application in rheumatoid arthritis (RA) and spondyloarthropathies (SpAs). Previous efforts in RA have attempted to define a representative pattern of "core joints" as indicators instead of using the full complement of joints [27,28]. This reductionist approach still counts joints. Alternatively, utilizing the rich data for pattern recognition may identify predictors of outcome.
Using multilayer NMF, we identified patterns of joint involvement predictive of disease trajectory in children with arthritis. Our results are consistent with previous observations pointing to key individual joints as predictors of poor outcome. Our hierarchical unsupervised learning approach allowed us to identify a new clinical variable, localization of joint involvement, as a key feature associated with poor outcomes in both our discovery and validation cohorts. Detailed bedside assessment of every joint is already part of every musculoskeletal exam for children with arthritis. Our study supports not only the continued collection of detailed information about joint involvement but also the inclusion of these patterns together with localization data (i.e., how closely affected children fit these patterns) to stratify patients and inform treatment decisions. Our findings will move the field of pediatric rheumatology out of infancy, from joint counts to realizing the potential of using data available from patterns of joints involvement.
Supporting information S1 Checklist. TRIPOD checklist. TRIPOD, transparent reporting of a multivariable prediction model for individual prognosis or diagnosis. (DOCX) S1 Fig. Analysis workflow. (A) Overall analysis workflow for the discovery and validation cohorts to identify factors and patient groups from joint involvement data. (B) Factor identification workflow, which considers input data (left) comprising joint involvements or low-level factor scores and identifies factors described by coefficient/score matrices and basis/loading matrices (right). (C) The overall multilayer NMF scheme for this study. Boxes represent layers vertically scaled to the number of dimensions, which are, from left to right, the number of joints, the number of low-level factors, and the number of high-level factors. Circles represent patient groups. NMF, non-negative matrix factorization. (TIF) S2 Fig. Few pairings of joint types were asymmetrically involved. Heat maps of z-scores (colors; right legend), for co-occurring joint types (x-axis) on reference joints (y-axis). Pairings whose absolute z-score was <1 were zeroed. Pairings with FDR < 0.1 are outlined. FDR, false discovery rate. bootstraps. (B) Same as panel A but divided into patient groups (subpanels). (C) Bar graph of the percent of patients in each group with each localization (shades of grey; right legend). Up arrows denote enriched combinations of patient groups and localizations, and down arrows denote depleted or rarer combinations. ��� P < 0.001 by χ 2 test with Bonferroni correction for multiple hypothesis testing. (TIF) S10 Fig. Treatment decisions associated with degrees of localizations. (A) Bar plots of the number of patients followed (y-axis) per patient group (columns) per visit (shades of grey; bottom legend). (B) Bar plots showing, for each treatment (rows), the proportion of patients (yaxes) per patient group (columns) and degree of localization (shades of grey; bottom legend) prescribed that treatment within six-month windows prior to six-month and one-year visits (x-axes). Up arrows denote, for each patient group, visit, and patient group, enriched localizations that occur more often than expected, and down arrows denote depleted localizations that occur less often than expected.  Table. Associations between patient groups and ILAR subtypes. Standardized residuals � generated from a χ 2 test on a contingency table counting patients at the intersection of each patient group and ILAR subtype. � P < 0.05, �� P < 0.01, ��� P < 0.001. ILAR, International League of Associations for Rheumatology; RF, rheumatoid factor. (DOCX) S3 Table. Associations between patient groups and degrees of localization. � P < 0.05, �� P < 0.01, ��� P < 0.001. (DOCX) S4 Table. Associations between patient groups and degrees of localization and treatment decisions prior to subsequent visits. Associations are enriched, or observed more often than expected, when the coefficient β > 0, and depleted when β < 0. DMARD, disease-modifying antirheumatic drug. (DOCX) S5 Table. Coefficients from Cox proportional hazards modelling for time to zero joint involvement. � P < 0.05, �� P < 0.01. CI, confidence interval. Yeung.