Genome-wide mRNA profiling in urinary extracellular vesicles reveals stress gene signature for diabetic kidney disease

Summary Urinary extracellular vesicles (uEV) are a largely unexplored source of kidney-derived mRNAs with potential to serve as a liquid kidney biopsy. We assessed ∼200 uEV mRNA samples from clinical studies by genome-wide sequencing to discover mechanisms and candidate biomarkers of diabetic kidney disease (DKD) in Type 1 diabetes (T1D) with replication in Type 1 and 2 diabetes. Sequencing reproducibly showed >10,000 mRNAs with similarity to kidney transcriptome. T1D DKD groups showed 13 upregulated genes prevalently expressed in proximal tubules, correlated with hyperglycemia and involved in cellular/oxidative stress homeostasis. We used six of them (GPX3, NOX4, MSRB, MSRA, HRSP12, and CRYAB) to construct a transcriptional “stress score” that reflected long-term decline of kidney function and could even identify normoalbuminuric individuals showing early decline. We thus provide workflow and web resource for studying uEV transcriptomes in clinical urine samples and stress-linked DKD markers as potential early non-invasive biomarkers or drug targets.


INTRODUCTION
Diabetes is a multifactorial disease with diverse complications causing a global health burden and >1.3 million annual deaths. 1 One of the most severe complications is diabetic kidney disease (DKD)-a leading cause of end-stage renal disease. 2 Kidney biopsy is considered as the gold standard to observe the DKDlinked structural changes and progression stages. 3 However, in practice, the diagnosis is usually based on albuminuria and decline in kidney function measured as glomerular filtration rate. 4,5 Despite their wide clinical applications, these traditional markers perform poorly in detecting early stages of DKD, for example hyperfiltration and in prognosticating disease progression. 6 Transcriptomics of the kidney tissue has revealed association of kidney mRNA markers with DKD related structural changes like fibrosis 7,8 and led to development of biomarkers including epidermal growth factor as a chronic kidney disease marker. 9 However, the risks associated with kidney biopsies, in particular at later stages of DKD, have limited kidney transcriptome-based approaches in DKD precision medicine. 10 This has turned the interest to extracellular RNAs in urine that might mirror the pathological changes of kidney transcriptome and provide an alternative non-invasive way for diagnosing, prognostication, and monitoring DKD. 11 Extracellular vesicles (EV) are lipid bilayer contained particles secreted by cells in varying sizes ($30-1000 nm). They carry active biomolecules (proteins, lipids, metabolites, DNA, and RNA) and contribute to many biological processes in health and disease e.g., through their role in cell-to-cell signaling. [12][13][14] EV are found in all body fluids, where their membrane helps to protect the cargo from proteases and RNases. 15 There has been a global effort to map RNAs in EV from various human body fluids; 16,17 however, the predominant focus has been on small RNAs. [18][19][20] Comprehensive characterization of long RNAs (like mRNAs) in EV from clinical samples is lacking because of technical challenges posed by in replication diabetes cohorts. We defined these signs as a ''shoulder'' of the main small RNA peak toward the marker peak or a slight tail toward bigger RNA sizes ( Figure 1G). Similar profiles are characteristic to e.g., formalin-fixed paraffin-embedded samples known to contain degraded (cellular) RNA (Illumina technical document 26 ) or total urine RNA samples ( Figure S1). To look deeper into these quality criteria, we first assessed pairs of urine RNA containing and matching uEV RNA lacking the degradation signs (examples in Figure S1B). Messenger RNA sequencing of six such pairs of urine and their matching uEV (HUB study, n = 5 male, 1 female) revealed that the worse bioanalyzer profiles from urines associated with significantly reduced mapped read numbers as compared with the uEV (>3-fold, p < 0.001, Figure S1C). A similar difference was obtained when comparing the mapped reads between uEV samples with good and bad RNA profiles from a subset of T1D women's cohort (>2-fold, p < 0.01, Figure S1D). We additionally noted that $27% of this subset contained >5% of intergenic reads (10 kb upstream or downstream of coding regions) that may stem from DNA. Majority (85%) of these DNA containing samples showed also signs of RNA degradation by bioanalyzer. Thus, to avoid problems caused by degraded RNA or DNA, we excluded samples based on bioanalyzer and >5% of intergenic reads (when information was available). The diminished (F) Representative electropherograms of RNA isolated from uEV and analyzed with bioanalyzer using Agilent Pico kit. An RNA peak between 25 and 200 nt was detected in all samples and with this characteristic shape passed quality control. (G) Examples showing signs of nucleic acid degradation (arrows) in uEV RNA samples and not passing quality control. 24 h (24h); albumin excretion rate (AER); blood pressure (BP); body-mass index (BMI); estimated glomerular filtration rate (eGFR); waist-to-hip ratio (WHR); Genotype-Tissue Expression (GTEx) project; glycated hemoglobin (HbA1 C ); Lymph Node Carcinoma of the prostate (LnCaP), macroalbuminuria (Macro); microalbuminuria (Micro); non-diabetic control (Control); normoalbuminuria (Normo); messenger RNA sequencing (mRNAseq); Podocalyxin (PDX), type 1 or type 2 diabetes (T1D, T2D); urinary extracellular vesicles (uEV); principal component analysis (PCA); single-nucleus RNA sequencing (snRNA-seq).
Quality, robustness, and reproducibility of uEV mRNA profiling by next generation sequencing We performed next generation sequencing (NGS) of uEV mRNAs using methods accepting low RNA input. For discovery, we used both T1D cohorts (total n = 72 men, 24-h or ON urine collection) ( Figure 1) that produced a median (IQR) of 11.5 million (7.4-16.3) raw sequencing reads ( Figure S2) , and <0.1% mapped to intronic or intergenic regions (10 kb upstream or downstream of coding regions) ( Figure S2B). We detected robust expression of 10,596 protein coding genes in uEV from the T1D discovery cohorts that were expressed in >90% of samples (with count R1, Figure S2E).
We further tested the robustness of uEV mRNA profiles using pairs of samples for three different technical variables, 1) urine collection protocol: 24-h vs. ON, 2) effect of centrifuging the urine samples prior to freezing (Gcentrifugation), and 3) reproducibility by processing aliquoted urine samples through the whole uEV mRNAseq pipeline-from uEV isolation to sequencing-at different time points (replicates) ( Figure 1 and Table S5). In these three technical comparisons, similar to the main T1D study cohorts, we observed (median-IQR) 11.5 (8.6-13.5) million raw sequencing reads that were mostly [91.6% (88.2-93.9)] uniquely mapped to human genome, particularly to the exonic gene regions ( Figures S3A and S3B). We detected the expression of 10,449 protein coding genes (with count R1) present in R90% of samples used for the three technical comparisons. The read count data for these commonly expressed genes (10,449) showed clustering (based on correlation) for most technical replicates ( Figure S3C). We performed Euclidean distance-based clustering of expression profiles for all 12 sample pairs/triplicates using all common detected protein coding genes ( Figure 2A, n = 10,449) or ''kidney-enriched genes'' ( Figure 2B, n = 233 present in uEV)-a subset of protein coding genes known to be enriched in kidney (see STAR Methods) and expressed in our technical uEV sample sets. Pairs or triplicates of replicate urine samples and those with or without centrifugation showed high correlation (r = 0.81 to 0.95) and most of these technical pairs clustered together (9 out of 12) both when all or only kidney enriched genes were considered regardless of their donor's diabetic and non-diabetic status (Figures 2A and 2B). All 24-h and ON pairs did not cluster together, but no systematic difference was observed for the urine sample pairs either from T1D individuals or non-diabetic individuals regarding all commonly expressed genes ( Figure S4) or kidney-enriched genes (Figure 2C). When comparing discovery population samples with technical triplicates, we observed 1.6 to 2-fold higher variability (coefficient of variation -CV %) in all expressed genes (49% vs. 30%) or kidney-enriched genes (71% vs. 34%) ( Figures 2D and 2E) in population samples. Most variability in all samples was mainly contributed by low-expressed genes ( Figures 2D and 2E).
Finally, we also tested any sex driven global differences in uEV mRNA profiles by clustering of three samples from women with all the men from the T1D cohorts (as mentioned in Figure 1A, n = 72). Men and women clustered mainly separately based on sex chromosome genes and all robustly expressed uEV genes (n = 10,596 genes), but sexes did not separate based on kidney-enriched genes (n = 247 present in uEV out of 413, seeSTAR Methods) ( Figure S4).
Together, these results showed the expression of >10,000 protein coding genes in uEV from different clinical cohorts. It is noteworthy that this is the first study that successfully combined a relatively large number of samples from different urine collections highlighting the comparability and reproducibility of the uEV mRNA profile and the used low input technologies.

Similarity between uEV and kidney transcriptomes and kidney cell type deconvolution analysis
The uEV could be derived from several tissues-thus, to study the origin of the uEV mRNAs, we compared the uEV transcriptome to individual organ-specific gene expression signatures. We obtained sets of known ''kidney-enriched genes'' (413 genes with higher expression in kidney than in other tissues, see STAR Methods) and ''kidney-depleted genes'' ( iScience Article T1D uEV (with the total of n = 10,596 genes). We detected the expression of 59.8% (247 out of 413) of kidney-enriched genes compared to only 4.3% (200 out of 4,631) of kidney-depleted genes in the T1D uEV. We further looked at more ''specific kidney-enriched genes''-a subset of kidney-enriched genes that are reported to be detected in only few human tissues other than kidney (see STAR Methods) and detected the expression of 40.5% of such genes (77 out of 190) ( Figure S5A).
We next compared the expression profile of uEV mRNAs with the profiles of 17 other human tissues obtained from the GTEx database (https://www.gtexportal.org/home/datasets) (all male, Figures 3A-3C).
To infer proximity between the profiles, we performed principal component analysis (PCA) analysis using expression levels of three sets of genes; 1) all protein coding genes expressed in both uEV and the 17 Clustering of all technical replicates (from diabetic and non-diabetic individuals) using uEV expression levels across A-all commonly expressed genes (n = 10,449) and B-kidney enriched genes (n = 233). Technical replicates included 1) aliquots of individual samples (24h urine collection) processed through the uEV mRNA sequencing pipeline twice (R1 and R2, n = 6 donors) or thrice (R1, R2 and R3, n = 2) at different time points and 2) urine samples (24h collection) processed with (_C) and without centrifugation before freezing (n = 4).
(C) Clustering based on expression levels of kidney enriched genes in uEV for all technical samples from ON vs. 24h urine collections. Samples from individual donors were collected from the same day (n = 12 donors). Samples S1-S9 were from individuals with T1D and S10-S13 from non-diabetic individuals. The gene expression values (log2CPM) were inverse normally transformed and converted to Z score before analysis for A-C. Clustering analysis (A to C) was performed using Euclidean distance-based method.

OPEN ACCESS
iScience 26, 106686, May 19, 2023 5 iScience Article reference human tissues (n = 10,350), 2) ''kidney-depleted genes'' as a negative control, and 3) ''specific kidney-enriched genes'' (n = 77, as shown in Figure S5A). PCA analysis using gene expression data of all common gene detected in uEV showed that uEV samples clustered together with the kidney cortex tissue samples and away from other tissues suggesting close similarity of the mRNA expression patterns between uEV and kidney ( Figure 3A). The PCA analysis of kidney-depleted genes showed proximity of prostate samples with uEV samples ( Figure 3B). We observed a similar pattern, clustering of kidney cortex and uEV samples, when using specific kidney-enriched genes in the PCA analysis ( Figure 3C). The deconvolution analysis of uEV mRNAs using adult kidney single-nucleus sequencing reference data 25 showed that we could infer cellular fraction for all major kidney cell types with highest fraction for proximal tubular cells ( Figure 3D) using cell type marker data as described in Sc-Type database. 27 In summary, we detected expression of a high number of kidney-enriched genes in uEV and generally a lack of expression of genes not known to be expressed in kidney. The overall expression pattern of protein coding genes in uEV also showed close similarity with kidney and particularly proximal tubule, suggesting that they contribute majority of mRNAs detected in uEV.
Genome-wide uEV mRNA analysis revealed novel candidate markers for T1D DKD in discovery phase The distribution of gene expression of all the genes appeared similar for both discovery T1D cohorts despite the differences in urine collection protocols ( Figure S5B). To further test for differences in uEV mRNA profiles, we performed PCA of all (n = 10,596) robustly expressed genes. We observed no systematic iScience Article differences as 67 out of 72 samples clustered together ( Figure S5C). However, the five outliers were removed from downstream case-control analysis.
Having shown a difference between normo-and macroalbuminuric individuals, we then studied the expression of the identified 13 genes also in the microalbuminuric group (n = 13, mRNAseq PCA quality passed). The gene expression tended to increase from normo-to micro-to macroalbuminuria ( Figure 4B). While the difference between normo-and microalbuminuric individuals was not statistically significant for all the genes, the combined micro-and macroalbuminuria group differed significantly from the normoalbuminuric group (Table S6).
We also reanalyzed the uEV mRNA profile of the top 13 candidate genes by stratifying the same T1D individuals based on CKD (chronic kidney disease) stages, irrespective of albuminuria, using retrospective eGFR data (>5 years) from routine laboratory follow-up testing (Table S4). This confirmed significant (p < 0.004) upregulation of all 13 differentially expressed genes also in T1D individuals with CKD stage R3 compared with CKD stage %2 (Table S6).
To technically validate the differential expression results from NGS, we studied the expression of five of the top differentially expressed genes (MAP7, GPX3, IL32, NOX4, HRSP12) using Taqman quantitative PCR (qPCR) assays in the same set of samples. The qPCR-and mRNAseq-based expression values were mostly highly correlated ( Figure S6). Similar to the mRNAseq results, the qPCR-based expression levels were significantly (p % 0.005) upregulated in macroalbuminuric T1D individuals and somewhat less upregulated in microalbuminuric subjects compared to those with normoalbuminuria ( Figure S6).
Differentially expressed uEV mRNAs correlated with hyperglycemia and showed coexpression pattern similar to kidney samples We studied whether the expression level of the 13 differentially expressed genes, considered as candidate marker genes for T1D DKD, correlated with clinical parameters contributing to DKD progression (hyperglycemia and hypertension) at or close to the time of urine collection ( Figure 4C).
Most of the differentially expressed candidate genes (11 out of 13), except MAP7 and CAPN3, showed a positive correlation with HbA 1c (p < 0.05, Figure 4C). Of note, the strongest correlation (p % 0.001) was observed for the candidate genes known to be highly expressed in kidney (TINAG, GPX3, and RBP5) compared to other human tissues ( Figure 5A). In a subanalysis that included samples from normoalbuminuric individuals only (n = 37), we also saw a trend of positive correlation between expression levels of 10 candidate genes and HbA 1c suggesting that the observed correlations in the whole T1D cohort were not only driven by cases with clinically confirmed DKD ( Figure S5D).
We next explored the origins of the differential uEV transcripts by inspecting their co-expression patterns in uEV and GTEx-an external human tissue expression reference database (https://www.gtexportal.org/ home/datasets). Considering the urogenital organs previously established to be important contributors of uEV (Figure 3), the expression level of 8 out of 12 candidate marker genes (present in GTEx dataset) were higher in kidney cortex compared to prostate, bladder, and testis samples in the GTEx datasets (male only, Figure 5A). By pairwise correlation analysis, the co-expression pattern of the candidate genes in uEV resembled the pattern in kidney cortex when focusing on most of the kidney-enriched genes (MSRA, RBP5, GPX3, CXCL14, TINAG, NOX4, IL32, and CRYAB), but also in case of many of the non-kidney enriched candidate genes (MSRB1, CAPN3, TMEM9, and MAP7) (Figures 5B-5E). Further, we utilized single nuclei RNAseq data (>23,000 nuclei) from human diabetic kidney tissue, 28 to look up the expression levels of the candidate genes in different cell types of the kidney. Most of the candidate genes (6 out of 9 genes with cell type specific expression data available showed highest cellular percentage for the proximal convoluted tubule (PCT) cells expressing these genes ( Figure 5F). In summary, expression analysis of candidate marker genes for T1D DKD suggested that hyperglycemia may contribute to their higher expression in uEV samples. The main contributors of the candidate marker genes to uEV could be the kidney PCT cells.  iScience Article uEV mRNA-based stress score and its association with DKD in multiple cohorts All or most of the 13 differentially expressed candidate marker genes for T1D DKD showed common characteristics: 1) upregulation in DKD individuals ( Figures 4B), 2) positive correlation with hyperglycemia (Figure 4C), and 3) higher expression level in kidney cortex and proximal tubular cells ( Figure 5). This suggested that also the underlying disease mechanism responsible for these common characteristics might be common. A literature search indicated that six of the main 13 DKD candidate genes executed functions in cellular stress responses, namely 1) oxidative stress related responses (GPX3, NOX4, MSRB1, MSRA) and 2) protection of cellular proteins during stress (HRSP12 and CRYAB) (Table S7). In addition, the expression level of these six genes showed a positive correlation with HbA 1c ( Figure 4C) and co-expression with each iScience Article other ( Figure 5C). Hyperglycemia-induced stress in kidney cells including the PCT is considered as an important disease mechanism contributing to DKD progression. 29 Thus, we speculated that these six genes may be involved in hyperglycemia-induced stress responses of the kidney and that their expression level in uEV could reflect change in kidney function of diabetic individuals. To test this hypothesis, we derived a transcriptional stress score based on average normalized expression level of those six genes in uEV (GPX3, NOX4, MSRB1, MSRA, HRSP12, and CRYAB, see STAR Methods for statistics) to test association with DKD (defined by albuminuria). These six genes belong to group of top 13 transcriptome-wide significant DKD candidate genes as shown in Figure 4A. As expected, the stress score had a clear  iScience Article increasing trend from normo-to macroalbuminuria ( Figure 6A) and showed a positive correlation with HbA 1c (r = 0.41) measured at the time of urine sample collection in T1D discovery study but not with eGFR, systolic and diastolic blood pressure (SBP and DBP) measured at the same time ( Figure S7A). In support to the pathophysiological roles of the stress score genes, pathway enrichment (gene ontology) analysis of all nominally significant differentially expressed genes between T1D macro-and normoalbuminuria groups also suggested changes in oxidation-reduction as well as transporter and EV-linked activities (p < 0.05, logFC>1, n = 130 genes, Table S8).
We attempted to replicate the association of the stress score with albuminuria status-as seen in discovery phase-in three independent replication cohorts including one T1D (women only) and two T2D cohorts (men and women) ( Figures 6B-6E, Table S3). All three replication cohorts showed higher expression level of the stress score in diabetic individuals with albuminuria compared to normoalbuminuria (p = 0.046 to 0.0068) (Figures 6B-6D). Combining all three replication cohorts yielded a significant difference between groups ( Figure 6E, n = 75, p = 7.6 ✕10 À4 ) following a similar pattern as in discovery phase ( Figure 6A, n = 67, p = 9.5 ✕10 À6 ). Using discovery and validation cohorts, we did not see any difference in the stress score levels between 1) T1D normoalbuminuria groups treated with ACE (angiotensin-converting enzyme) inhibitors and/or Beta blockers vs. not treated (Figure 6F) nor between 2) T1D and T2D normoalbuminuria groups vs. non-diabetic controls ( Figure 6G).
uEV mRNA-based stress score as early marker for long-term eGFR decline  (Figure 7A). In contrast, the annual eGFR decline was >2-fold steeper in the highest quartile (75-100; À1.55 G 0.27). Interestingly, the 25-50 percentile group-the group with a moderate increase in stress core-showed an even faster decline (À1.61 G 0.17), which was derived from a higher eGFR at the retrospective starting points (visits 1-3) but no comparative difference in the recent years (visits [10][11][12][13][14][15]. This suggests the mid stress group may have undergone hyperfiltration unlike the low stress group. As the stress score was defined based on the change in the macroalbuminuria group, we performed a subanalysis of the long-term decline in eGFR excluding them. Hence, among the normoalbuminuric T1D individuals, we compared those within the lowest stress score quartile with those within the three higher quartiles (0-25 and >25) as well as with all microalbuminuric T1D individuals (regardless of the stress score) as a positive control, i.e., the group having early clinically confirmed T1D DKD. The normoalbuminuric group with the lowest stress score (0-25) showed only a minimal annual decline (À0.59 G 0.19) in the eGFR compared to those with mid-to-high stress score (>25; À1.34 G 0.20), which did not markedly differ from the microalbuminuria group (À1.08 G 0.16 ( Figure 7B). Similar analysis of longitudinal HbA 1c % (13 years) did not show notable differences between the normoalbuminuria groups having low or midto-high stress score (0-25 and >25) ( Figure 7C), but the most recent HbA 1c level measured at the time of uEV collection was higher for the mid-to-high (>25) stress score group similar to the microalbuminuria group ( Figure S7B). In T1D women's replication cohort, the eGFR again declined fast (À1.16 G 0.22) in the mid-to-high stress score (>25) group as in the microalbuminuria group (À2.03 G 0.28), whereas we observed no significant eGFR decline (0.26 G 0.29) in the normoalbuminuric low stress group (0-25) (Figure 7D). Further, an ROC (receiver operating characteristic) analysis for predicting at least % -0.5 annual eGFR decline in the normoalbuminuric individuals (n = 13) from the same T1D replication cohort revealed that the stress score was more informative (AUC = 0.83) compared to other clinical parameters ( Figure S7C).
We also looked how stress score correlated with the eGFR decline in groups with clinically established DKD. The long-term eGFR data was available for one T2D DKD replication cohort (n = 17 out of 22). When separating the T2D cohort into macro-and microalbuminuria groups, the respective lowest stress groups with sufficient patient numbers for analysis showed a minimal decline (T2D-micro, 0-25, À0.05 G 0.57; T2D-macro, 26-74, À0.98 G 0.37) compared to their respective higher stress groups (T2D-micro, 26-100, À1.80 G 0.16; T2D-macro, 75-100, À3.55 G 0.29) ( Figure 7E). Further, an ROC analysis for predicting fastest decliners (%-3.07) in all DKD replication cohorts (for available T1D and T2D) revealed that the stress score was most informative (AUC = 0.79) compared to other clinical parameters ( Figure S7D). Overall, in the combined analysis-of discovery and replication cohorts regardless of diabetes type and albuminuria status-the stress score was associated (p = 2.01✕10 À5 ) with eGFR decline rather than eGFR at the time of uEV collection, which remained significant (beta = À0.37 G 0.09 sd unit, p = 0.0003) after adjusting for other known clinical parameters (Figures 7F-7H). The study showed that a lower uEV stress score, i.e., a lower level of the candidate mRNAs in uEV samples, associates with a minimal long-term decline of kidney function. Conversely, a higher secretion of stress markers via uEV and thus a higher stress score, associates with a faster decline of kidney function in diabetic individuals. This suggests that these genes could be putative candidate DKD marker genes involved in the response of the kidney to the stress caused by hyperglycemic conditions.

DISCUSSION
Extracellular RNAs originating from the kidney and passed to urine inside EV could serve as a non-invasive proxy for a biopsy, a liquid kidney biopsy. Overall, we detected expression of >10,000 genes despite focusing on the most commonly expressed genes (>90% samples). They define the most stable EV mRNA transcriptome in population scale covering particularly male T1D population. The expression of large gene numbers in uEV is in line with results from some recent smaller sequencing studies. [30][31][32][33] Interestingly, irrespective of the degree of albuminuria, the uEV showed only a small subset of transcripts (<5%) not known to be expressed in the kidney compared to a high proportion (>50%) of kidney-enriched genes and a genome-wide expression pattern similar to kidney tissue. This suggests that kidney is the main contributor of uEV and that the glomerular filtration barrier limits the access of EV from non-urinary tract organs into the urine even during DKD. Nevertheless, as we did detect some non-kidney expressed genes, we cannot totally exclude the possibility of a low degree EV transfer from circulation. 34 Our results also agree with previous data showing uEV mRNAs from other urogenital organs including prostate. 17 EV field is still in its infancy with non-standardized workflows for urine, particularly when considering EV mRNAs compared to miRNAs in DKD. 35 Thus, we needed to test several pre-analytical and analytical factors. Specifically, we assessed the impact of the urine collection methods used in the clinic, 24-h or ON, which can affect the uEV concentration or cargo and thereby the detection sensitivity. We confirmed that ON urine, which is easier to collect, produced an almost similar mRNA profile and expression level of kidney enriched genes as the 24-h urine. Additionally, our whole uEV pipeline was reproducible, as it gave similar mRNA transcriptomes from the technical replicates processed at 1-5 months intervals through the pipeline. This suggests that urine storage at À80 C preserves uEV mRNAs well. Further, the degree of variation due to technical steps that are often assumed to cause problems, such as uEV isolation by ultracentrifugation 36 or sequencing in different batches, 37 was not significant here. We also validated the mRNAseq results with qPCR for the top differential genes.
Despite our focus on men's samples in an attempt to reduce heterogeneity in the discovery phase, we also explored the differences in the uEV RNA between men and women. In uEV RNA samples from women, we observed more frequently signs of RNA degradation and DNA contamination than in samples from men. In agreement with our results, it has been reported that DNA and RNA quantities are higher in uEV samples derived from women compared to men. 38,39 Further, while urinary EV DNA appears to be mainly located outside of the EV, 40 it is known that RNA outside of the EV is prone to degradation. 41 Thus, a plausible explanation for our findings is that the samples from women contained more often non-vesicular degraded RNA and DNA e.g. from remnants of epithelial cells. However, we overcome this difficulty with women's samples by introducing the new quality control criteria (Figures 1 and S1). We also confirmed that the differences in uEV transcriptomes were minimal between sexes when focusing on kidney-enriched genes and the stress score (Figures 6 and S4). This technical information is provided as a workflow (methods) and should assist in standardization efforts from biobanking activities of urine to future large-scale uEV mRNA research.
The study revealed putative marker genes that are prevalently expressed in PCT for DKD in T1D individuals. The important roles of PCT in kidney dysfunction (DKD and chronic kidney disease) is also supported by kidney tissue bulk 7 and single cell RNA sequencing 28,42 and genetic studies. 43 Remarkably, now uEV provide an opportunity to assess the expression of some highly PCT-enriched genes (e.g. TINAG, CXCL14, and RBP5) ( Figure 5) non-invasively in T1D individuals progressing toward DKD.
Based on the known functions of the DKD candidate marker genes, it appears that many associate with kidney diseases, and our stress score genes, particularly GPX3, HRSP12, MSRA, MSRB1, and CRYAB, protect the cell and/or cellular proteins during stress (Table S7). They thus may have a reno-protective role by mitigating hyperglycemia-induced stress in the kidney. GPX3 is one such antioxidant enzyme mainly produced in PCT and involved in relieving oxidative stress by neutralizing hydrogen peroxide. 44 The GPX3 and MSRB1 belong to the selenoprotein class. 45  iScience Article cells. 46 Recently, enhancing the GPX3 activity by selenium supplementation was shown to rescue kidney function in T2D human subjects. 47 Gene NOX4, also one of the stress score genes, has many roles in DKD ranging from detrimental to protective and its protein level was recently shown to be decreased in PCT during CKD. 48,49 Increased EV secretion during various types of stress, including oxidative stress (mechanical stretch, hypoxia, hyperglycemia, inflammation, acidosis) have been reported and thus could contribute to the ''leaking'' of these reno-protective kidney transcripts with putative roles in managing hyperglycemic stress in diabetic kidney. [50][51][52] One mechanistic explanation for increased EV secretion could be that the impaired intracellular EV degradation via defective autophagy in DKD would lead to increased EV secretion, 48,53 as the two systems appear closely linked, which is supported by accumulating data from cell culture studies. [54][55][56] Alternatively, or additionally, the increased transcript levels in uEV could be caused by lack of uptake and disturbed intra-renal signaling. 13,57 Overall, further studies would help to confirm whether increased transcript secretion via uEV in diabetic individuals causes a decrease in the level of reno-protective proteins in the kidney or whether the secretion is just a consequence of an ongoing compensatory mechanism in a hyperglycemic kidney.
Clinically, the most interesting finding in our study was that the uEV-based stress score correlated with long-term decline of kidney function (eGFR) in diabetic individuals even before manifestation of albuminuria ( Figures 7B-7D). The stress score thus has potential for non-invasive early identification of individuals with declining kidney function and a greater risk of developing DKD. Indeed, our study suggests uEV may be sensitive indicators of the DKD development due to the coupling of their biogenesis to stress -hyperglycemic stress causing oxidative cellular stress -and possibility for tracking their rich transcriptomic information to the cell-of-origin, notably to the PCT.

Limitations of the study
The main limitations of our study were 1) low sample size of individual groups, 2) focusing on male subjects in the discovery phase (due to expected higher heterogeneity in women, which could have had an impact considering the low sample size). However, given the novelty of the methods-whole transcriptome EV mRNAseq-this is the largest study until now with inclusion of two different T1D cohorts, two T2D cohorts, healthy controls, and detailed long-term retrospective clinical data. Additionally, we define a robust uEV transcriptome of >10,000 genes, show the position of uEV on tissue map, and provide evidence that uEV pipelines can be reproducible and applicable for high-impact large-scale biomarker studies. Our data resource should further help researchers in studying the mRNA landscape and kidney function in clinical uEV samples.
In summary, the study showed the utility of uEV transcriptomics in capturing the gene expression patterns of the kidney and discovered T1D DKD candidate marker genes including putative reno-protective genes.
Because these genes showed initial promise for T2D DKD as well, they appear suitable for development as broadly applicable novel biomarkers or therapeutic targets for DKD. Thus, we conclude that uEV offer a promising liquid kidney biopsy of DKD.

STAR+METHODS
Detailed methods are provided in the online version of this paper and include the following: