Treatment effects of soluble guanylate cyclase modulation on diabetic kidney disease at single-cell resolution

Summary Diabetic kidney disease (DKD) is the most common cause of renal failure. Therapeutics development is hampered by our incomplete understanding of animal models on a cellular level. We show that ZSF1 rats recapitulate human DKD on a phenotypic and transcriptomic level. Tensor decomposition prioritizes proximal tubule (PT) and stroma as phenotype-relevant cell types exhibiting a continuous lineage relationship. As DKD features endothelial dysfunction, oxidative stress, and nitric oxide depletion, soluble guanylate cyclase (sGC) is a promising DKD drug target. sGC expression is specifically enriched in PT and stroma. In ZSF1 rats, pharmacological sGC activation confers considerable benefits over stimulation and is mechanistically related to improved oxidative stress regulation, resulting in enhanced downstream cGMP effects. Finally, we define sGC gene co-expression modules, which allow stratification of human kidney samples by DKD prevalence and disease-relevant measures such as kidney function, proteinuria, and fibrosis, underscoring the relevance of the sGC pathway to patients.

Correspondence ksusztak@pennmedicine.upenn.edu In brief Balzer et al. show how ZSF1 rats recapitulate human diabetic kidney disease (DKD), the most common cause of renal failure, on a phenotypic and single-cell transcriptomic level. Their study prioritizes disease-relevant cell types and demonstrates the benefits of pharmacological modulation of soluble guanylate cyclase, a promising DKD drug target.

INTRODUCTION
Chronic kidney disease (CKD) is the fourth fastest growing cause of death, affecting >850 million people worldwide. 1 Patients with CKD have 3-to 5-fold increased mortality. 2 The survival rate for kidney failure (end-stage renal disease [ESRD]) is often worse than for many solid tumors, underscoring the importance and urgency of the disease. 3 Therapies to prevent progression of CKD are mostly based on inhibition of the renin-angiotensin-aldosterone system, introduced >20 years ago, and on blockade of a sodium glucose transporter, introduced recently. Although these therapies clearly slow progression, not all CKD patients benefit to the same degree and even non-responders are emerging. CKD remains a major unmet medical need, for which therapeutics are desperately needed.
One critical limitation has been that animal models poorly recapitulate human diabetic kidney disease (DKD). Most strains of mice when made diabetic (e.g., by streptozotocin injection) do not develop phenotypes observed in patients with DKD, such as mesangial expansion, glomerular basement membrane thickening, tubulointerstitial damage, and endothelial hyalinosis. 4 Animal models often do not show progressive kidney function decline and other microvascular complications of diabetes such as hypertension and heart failure. DKD is associated with reduced nitric oxide (NO) bioavailability and endothelial dysfunction, similar to other cardiovascular disorders such as hypertension, heart failure, and metabolic syndrome. 5,6 NO-soluble guanylate cyclase-cyclic guanosine monophosphate (NO-sGC-cGMP) signaling plays a critical role in regulating renal function. 7,8 Defects in NO availability (e.g., endothelial NO synthase [eNOS] deletion) can lead to severe kidney function deterioration and CKD. 8 Endogenous NO is generated from L-arginine by eNOS. After release from the endothelium, NO binds to sGC, which is a heterodimeric enzyme consisting of an a and b subunit carrying an N-heme-NO binding domain. The major isoforms are sGCa1b1 and sGCa2b1 encoded by GUCY1A1, GUCY1B1, and GUCY1A2, respectively. NO-dependent sGC stimulation triggers formation of cGMP, which is the cellular second messanger. 9 sGC is therefore a key signal transducer of NO-mediated organ effects. NO-sGC-cGMP signaling can be impaired by increased reactive oxygen species production, scavenging of NO via the reaction of NO and O 2À to form peroxynitrite, or direct scavenging by free hemoglobin but also by oxidation of Fe 2+ (ferrous) sGC to its NO-insensitive Fe 3+ (ferric) state. 10,11 Moreover, sGC transcription and mRNA stability are affected by oxidative stress. 12 Agonizing sGC directly has become a promising therapeutic approach. Current sGC agonists are categorized into two distinct classes based on their molecular mode of action. sGC stimulators (sGCstim) propel cGMP formation by binding to sGC, allosterically to the N-NOX domain, NO-independently and synergistically with NO. However, their efficacy depends on the ferrous state of the prosthetic heme group. In contrast, sGC activators (sGCact), by binding to the H-NOX domain directly, induce cGMP production preferentially at the oxidized/heme-free apo form of the enzyme, which is no longer responsive to NO and sGC stimulators 9,13 ( Figure 1A). Hence, maintaining sGC heme in the ferrous state is essential for sGC-cGMP signaling via NO and sGCstim, whereas sGCact can act independently of the ferrous heme group, bound to the b1 subunit (encoded by GUCY1B1), potentially explaining higher sGCact activity under pathophysiological and high oxidative stress conditions, such as DKD, compared with sGCstim. 14 Both sGCstim and sGCact have shown kidney-protective effects in preclinical CKD and DKD models 8,[15][16][17][18] and have been advanced to clinical studies. 9 Despite the positive effect of sGC modulation on clinical outcomes, the target cell types and molecular mechanism of action for sGC are poorly understood. Here, we studied the pharmacological effects of sGCstim and sGCact in the ZSF1 rat as a representative DKD model at the single-cell level in the kidney. Through unbiased tensor decomposition analysis, we prioritize podocytes, proximal tubule (PT) cells, and stromal cells as most disease-relevant cell types in DKD and describe the latter two as sGC-expressing cells. We highlight a continuous transcriptional lineage relationship of PT and stromal cells, starting with differentiated PT cells to injured (PTinj) and profibrotic PT (ProfibPT) states toward mesenchymal cells (Mesench). Finally, we use unbiased weighted gene correlation network analysis (WGCNA) to build a score, which successfully stratified 991 human kidney bulk RNA sequencing (RNA-seq) samples by DKD prevalence, and functional (degree of proteinuria, glomerular filtration rate) and structural (fibrosis) kidney impairment.

RESULTS
Diabetic ZSF1 rats recapitulate phenotypic changes of DKD with marked disease improvement by sGC activators It has been suggested that the obese ZSF1 rat model exhibits many of the phenotypic characteristics of human DKD, such as proteinuria, structural renal lesions, hyperglycemia, dyslipidemia, hypertension, oxidative stress, and obesity. [18][19][20][21][22] We analyzed ZSF1 obese diabetic rats at 25-26 weeks of age (Figure 1B). In line with previous publications, diabetic ZSF1 rats demonstrated marked obesity, hypercholesterolemia, hyperglycemia, elevated hemoglobin A1c (HbA1c), and hypertension (Figures 1C and S1A; Table S1), reflecting the pronounced metabolic disturbances reminiscent of the metabolic syndrome in humans. 23,24 Obese ZSF1 rats demonstrated impaired kidney function, as measured by elevated serum creatinine and urea, as well as marked proteinuria and albuminuria (Figures 1D and S1B). We noted higher levels of circulating kidney injury markers such as kidney injury molecule 1 (KIM-1) 25 and neutrophil gelatinase-associated lipocalin (NGAL) 26 ( Figure 1E) in diabetic ZSF rats. We performed explorative proteomics analysis of 92 plasma proteins using a multiplexed proximity extension assay (Olink) (Data S1). Plasma proteins showing higher levels in diabetic rats included Delta-like 1 (DLL1) and ectodysplasin A2 receptor (EDA2R) ( Figure S1C), both of which were recently found in a human proteomics study analyzing four independent cohorts of individuals with type 1 and type 2 diabetes and early and late DKD to be associated with progression to kidney failure. 27 Of the 46 proteins that Kobayashi et al. reported to be strongly associated with progression to kidney failure, 27 eight were included in our Olink panel. Interestingly, the levels of all eight proteins (100%) were significantly higher in diabetic ZSF1 rats (Data S1), again underscoring the similarities of the ZSF1 rat model to human DKD. These proteins had diverse biological functions including development (DLL1, MATN2), inflammation (EDA2R, IL17F, CCL5, TNFSF12), and transforming growth factor b (TGFb) signaling (FSTL3, TGFBR3). Functional impairment in diabetic ZSF1 rats was mirrored by renal histopathological changes such as increased interstitial fibrosis, tubular degeneration, hyaline cast formation, and glomerulopathy ( Figures 1F and 1G). 28,29 Next, we aimed to characterize the effects of sGCact and sGCstim on renal and metabolic parameters of ZSF1 rats. While sGCact significantly alleviated metabolic changes, sGCstim did not ( Figure 1C). The degree of renal function improvement was similar (as measured by urea) or greater (as measured by Figure 1. Diabetic ZSF1 rats recapitulate phenotypic changes of DKD with marked disease improvement by sGC activators (A) Representation of the importance of heme-containing (native) sGC and heme-free (dysfunctional) form of sGC and its redox equilibrium. sGC stimulator efficacy depends on the ferrous, Fe(II), state of the heme group at the b subunit of sGC, while sGC activators bind directly to oxidized, Fe(III), or heme-free apo form of sGC. Similar to other cardiovascular disorders, DKD is associated with reduced NO bioavailability, increased oxidative stress, and endothelial dysfunction. cGMP, cyclic guanosine monophosphate; DKD, diabetic kidney disease; NO, nitric oxide; NOS, nitric oxide synthase; O 2 Article ll OPEN ACCESS creatinine, albuminuria, and proteinuria) upon treatment with sGCact than with sGCstim ( Figure 1D). Increased kidney injury and kidney disease progression markers were largely rescued by sGCact but not by sGCstim ( Figures 1E, 1F, S1B, and S1C). Both tubulointerstitial and glomerular histopathological changes were markedly lower upon sGCact treatment, whereas we only saw a reduction in glomerulopathy upon sGCstim treatment ( Figures 1F and 1G).
In summary, we found that obese, diabetic ZSF1 rats recapitulated functional and renal histopathological changes of human DKD. Pharmacological sGCact ameliorated functional and histological changes of DKD, while sGCstim had modest effects on ZSF rats.
Single-cell transcriptomic landscape of the diabetic ZSF1 rat kidney To elucidate key cell types and DKD driver pathways, we next performed single-nuclei RNA-seq (snRNA-seq) on three rat kidney samples per group. After stringent quality control of each individual sample, including ambient RNA correction, doublet removal, nuclei filtering based on UMI, counts, and mitochondrial percentage (STAR Methods and Figures S2A-S2D), we integrated transcriptomes of high-quality single cells into a single dataset following batch correction (STAR Methods) and retained 217,132 high-quality single kidney nuclei (Figures 2A and S2E). Unsupervised clustering indicated 25 cell clusters ( Figures 2B  and S2E). After cluster-specific differential gene-expression analysis ( Figure S2f, Data S2), we grouped clusters into coarse-grained, high-level cell types: podocytes (Podo), endothelial cells (Endo), stroma cells (Stroma), proximal tubule cells (Prox tub, PT), non-proximal tubule cells (Non-prox tub) such as loop of Henle (LOH), distal convoluted tubule (DCT), connecting tubule (CNT), collecting duct principal cells (PC), and collecting duct intercalated cells (IC), as well as immune cells (Immune) (Figures 2A and 2C). Each cell type was present in every sample and in all groups, indicating the lack of major batch effect and negligible within-group heterogeneity ( Figures S3A-S3C). Differential proportion analysis showed significant differences in cell fractions between groups for almost all cell types (p < 0.001 for obese vs. lean comparisons in Endo, Stroma, Prox tub, Nonprox tub, and IC; p < 0.05 for Immune; not significant for Podo) ( Figure S3D). Next, we identified differentially expressed genes (DEGs) between disease states and treatment groups. PT and stromal cells showed the highest number of DEGs between treatment groups (Figures S4A-S4C; Data S3, S4, and S5). Importantly, individual cell-cluster transcriptomes in our ZSF1 rat DKD model demonstrated strong correlation with corresponding cell-cluster transcriptomes in two independent human single-cell DKD datasets 30 including the Kidney Precision Medicine Project (KPMP) 31 and served as an excellent reference on which all cell types present in the human DKD query dataset could be projected with high prediction accuracy (Figures 2D and S5A-S5D).
As the disease state was associated with important differences in both cell fractions and cell-type-specific gene expression, we used tensor decomposition analysis on our singlenuclei dataset ( Figure 2E; Data S6, S7, and S8) for an unbiased determination of critically important cell types associated with phenotypic changes. High-level cell-type identity, histopathological, and proteinuria metadata served as input for this unsupervised analysis that allowed us to retrieve main factors associated with phenotypic outcomes of the respective samples in an unbiased manner. Factor 1 explained by far the most (48.7%) transcriptomic variation across all samples and was significantly associated with interstitial fibrosis, tubular degeneration, hyaline cast formation, glomerulopathy, and proteinuria. Consistently, untreated ZSF1 obese samples had the lowest factor 1 scores, the highest proteinuria levels, and most severe histological damage. Factor 2 explained 19.0% of variation and was associated with rat genotype (ZSF1 lean vs. ZSF1 obese), suggesting that (E) Tensor decomposition analysis heatmap (center left) representing factor loading score of rat kidney (RK) samples (rows) onto tensor factors (rows). Degree of explained variance (exp_var) in the whole dataset is displayed on the bottom left. Significance level (Àlog 10 (p value)) of tensor factor association with clinical (uPCR, urinary protein/creatinine ratio in mg/mmol) and histopathology outcome measures (interst_fibrosis, interstitial fibrosis; tub_degen, tubular degeneration; mononuc_infiltr, mononuclear infiltration, glomerulopathy, each scored from 0 to 4) is displayed on the top left. Sample rows are color-annotated by outcome data, genotype (lean vs. obese), and treatment status (sGCm, sGC modulator treatment, or no treatment). (F) Heatmap representing factor 1 loading scores by cell type (columns) and genes (rows) (left). Explained variance is colored in shades of gray (top left), significance levels are shown on the right. The top five significant genes for every cell cluster are annotated. Article ll OPEN ACCESS the phenotype was a more important determinant of gene expression than the genotype. More detailed analysis of factor 1 loadings revealed that the majority of genes loading onto factor 1 were specific to three cell types: podocytes, PT, and stromal cells ( Figure 2F). Gene ontology (GO) analysis of factor 1-loading genes in PT and stromal cells revealed that repair (e.g., wound healing, regeneration, negative regulation of cell adhesion) and electron transport processes (e.g., mitochondrial respiratory chain complex assembly, electron transport chain, ATP metabolic process, proton transmembrane transport) were the top enriched pathways ( Figure S6A).
As we sought to study effects of pharmacological sGC modulation, we were reassured to find sGC genes (Gucy1a1, Gucy1a2, Gucy1b1, Gucy1b2) to be expressed almost exclusively in PT and stromal cells ( Figures 2G, S6B, and S6C). PT and stroma specificity for sGC and downstream cGMP effectors such as PDE3A and PDE5A was also observed in recent human and mouse DKD snRNA-seq datasets 30,32 ( Figure S6D), suggesting conservation across species. In addition, microdissected kidney tubule RNA-seq samples from human individuals with advanced DKD showed higher sGC mRNA expression ( Figure 2H). Finally, reanalysis of a human DKD snRNA-seq dataset 30 confirmed increased sGC expression in DKD stroma compared with control stroma (annotated as ''MES,'' mesangial, by the authors) as well as expression in PTinj ( Figure 2I).
Pharmacological sGC modulation improves gene expression in multiple cell types As our unsupervised tensor decomposition analysis prioritized podocytes, PT, and stromal cells as key cell types for improved structural and functional outcome, we focused on these cells. After three iterative rounds of clustering, we subset 2,065 podocyte nuclei that formed five clusters, establishing a continuous trajectory ( Figures S7A and S7B). Pathway enrichment analysis demonstrated that the start of the trajectory (Podo1-Podo3) was defined by nephrin, glomerular epithelium, glomerular development, or actin filament pathways, which is typical for healthy podocytes. Cluster Podo4 was specifically enriched for, e.g., FAK, p53, and apical junction pathways, whereas cluster Podo5 (at the end of the trajectory) was enriched specifically for, e.g., oxidative phosphorylation, ribosomal, and glutathione metabolism pathways ( Figure S7C). Clusters Podo1-Podo4 positively correlated with each other ( Figure S7D). Diabetic ZSF1 obese rats showed considerably lower fractions of differentiated Podo1 nuclei, which was rescued by sGCact but not by sGCstim ( Figure S7E). Vice versa, Podo5 was lowest in sGCact-treated rats and was enriched for oxidative phosphorylation ( Figures S7E and S7F).
Next, we analyzed the number of DEGs ''normalized'' (their expression changed to healthy level) by sGCact and sGCstim treatment (Data S10, S11, and S12). We found that the fractions of DEGs normalized by sGC modulation were highly variable and cell-type specific ( Figure 3C). The expression of a larger number of genes returned to baseline (healthy state) upon sGCact (n = 8,240) compared with sGCstim treatment (n = 7,885), which was consistent with the improved structural and functional outcome upon sGCact treatment ( Figures 1C-1G). Obese ZSF1 rats (compared with lean) showed the highest numbers of DEGs in mitoPT, PT(Spp1 + ), PST, and DediffPT_1. The largest numbers of genes returning to healthy control level by sGCact treatment were observed in DediffPT_1 (73% rescue), PST (72%), and Int (56%). For sGCstim treatment the highest percentages of rescue were observed in PST (59%), Mesench (48%), DediffPT_1 (48%), and PTinj (47%). While sGCact treatment was associated with a larger number of genes returning to baseline level than sGCstim treatment, we did not observe strong cell-type-specific differences between the two drugs, suggesting a class effect of action ( Figures 3C and S8H). Finally, the effect size of top DEG normalization via sGCact was similar to that of a control (lean) genotype ( Figure 3D), which again underlined the high effectiveness of sGCact.
We noticed a marked reduction of cGMP signaling in obese rats (compared with lean), which was restored more successfully by sGCact than sGCstim ( Figure S9A). DKD is associated with endothelial dysfunction and increased oxidative stress due to NO depletion, leading to oxidized and heme-free sGC, which neither NO nor sGCstim can target. 9 We were therefore intrigued to find that sGCact preserved gene expression of markers associated with negative regulation of oxidative stress better than sGCstim treatment ( Figure S9B). This could potentially explain-at least in part-the observed treatment benefits of sGCact over sGCstim. Along those lines, the negative oxidative stress regulation and cGMP effect sizes correlated positively ( Figure S9C). We further validated these findings in an external bulk kidney RNA-seq dataset: 36 advanced DKD kidneys had significantly lower cGMP effects scores than early DKD kidneys (p = 0.019), while controls and early DKD kidneys were not different (p = 0.085) ( Figure S9D). Similarly, kidneys from advanced DKD patients exhibited the lowest scores for negative regulation of oxidative stress compared with early DKD and control cases ( Figure S9E). Again we found a positive correlation between oxidative stress response and cGMP effect size ( Figure S9F) in human kidneys.
Our single-cell gene-expression data, consistent with prior immunostaining and in situ hybridization studies, 37 indicated sGC expression in stromal cells. To better understand sGC expression in the renal stroma, we subclustered the stromal cells. Unbiased clustering revealed two mesangial cell clusters (Mesang, Itga8, Gata3); juxtaglomerular apparatus (JGA, Ren); multiple fibroblast (Fib) clusters with previously described marker genes, such as Mgp, Apoe, B2m, Serpine1, Pdgfra, Cxcl10, Igfbp3, Xkr4, Igfbp5 or with an immune cell signature (Immune Fib, Ikzf1, Ptprc); clusters with PT marker genes (PT1, PT2); glomerular endothelial cells (GEC, Flt1, Ptprb); myofibroblasts (Myofib, Tnfrsf11b, Acta2); vascular smooth muscle cells (VSMC, Col14a1, Ntrk3); a mixture of the latter two (VSMC/Myofib, Ntrk3, Myh11, Synpo2); and pericytes (Peri, Rgs5, Notch3) ( Figures 3E-3G and Data S13). We noted the following patterns of expression of sGC pathway genes ( Figures 3H, S9G, and S9H). Serpine1 + Fib, Cxcl10 + Fib, and Igfbp3 + Fib expressed both Gucy1a1 and Gucy1b1, Mesang 2 and JGA enriched mainly for Gucy1a1, while Gucy1a2 was mainly expressed in Mesang 1 and Peri. sGC expression was largely absent from VSMC and Myofib, although downstream effectors such as Pde3a and Pde5a were expressed in these cell types. This is largely consistent with prior analyses that have highlighted mesangial cells, JGA, Fib throughout the cortical labyrinth, and Peri as main sites of sGC expression. 37 Taken together, DEG analysis with a focus on the proportion of rescued genes suggested high variability between cell types but with a larger number of genes returning to baseline with sGCact in comparison with sGCstim. Negative regulation of oxidative stress and downstream cGMP effects were better preserved upon sGCact treatment compared with sGCstim. We found that oxidative stress and cGMP effects correlated with clinical outcomes in both ZSF1 rats and DKD patients. Finally, we showed that sGC genes were mainly expressed in Mesang, JGA, Peri, and different Fib subsets, underscoring the importance of multiple stromal cells for sGC.

Trajectory analysis highlights dynamic changes of PT cells toward profibrotic and mesenchymal cell states
Our analysis consistently highlighted PT and stromal cells as potential disease-driving cell types ( Figure 4A). Dimension reduction after diffusion mapping revealed two consecutive trajectories. Lineage 1 originated from the healthy root state (PST), via PTinj toward ProfibPT ( Figure 4B). DEGs specifically higher along this trajectory (Data S14) enriched for typical PT functions such as organic anion transport, small molecule, and amino acid metabolism at the start of the lineage (PST) toward pathways associated with adherens junctions, extracellular matrix (ECM)-receptor interaction, focal adhesion, and epithelial-tomesenchymal transition (EMT) at the end of the lineage (ProfibPT) (Figures 4C and 4D; Data S15). The second trajectory followed a path from ProfibPT, enriching for tight junction, TGFb signaling, and adherens junction signaling, via a second PTinj cluster and DediffPT toward Int and Mesench, enriched, e.g., for EMT, cell adhesion, collagen fibril organization, and wound healing ( Figures 4B-4D and Data S15). To understand the stability and reproducibility of this trajectory, we used two different orthogonal methods (monocle2, monocle3) and obtained similar cell-trajectory profiles ( Figures S9I-S9N). Interestingly, DEG analysis revealed that PTinj cells separated into two  Figure 4E). These two subgroups showed a large number of non-overlapping, i.e., individually unique, DEGs ( Figure 4F and Data S16), indicating separate transcriptomic states. Genes in the transcriptomic state of PTinj_1 were enriched for pathways such as cellular amino acid metabolism, oxidative phosphorylation, and organic anion transport ( Figure S9O). These genes are essential for healthy PT function. PTinj_2 was enriched for VSMC migration, positive regulation of cell adhesion, and external encapsulating structure organization, and hence was more similar to a stromal identity of lineage 2 ( Figure S9P). Jaccard similarity analyses of cluster-specific DEGs confirmed the highest overlap of PTinj_1 with healthy PST, whereas overlap of PTinj_2 was highest with ProfibPT, respectively ( Figure 4G), again highlighting stark transcriptional differences between these two PTinj states. This was validated by clear separation of GO terms enriched in PTinj_1 and PTinj_2, respectively, in latent semantic space ( Figure S9Q and Data S17).
In summary, we demonstrate the close transcriptional relationship of PT and stromal cell types. Upon injury in the diabetic ZSF1 rat model, PT cells adopted a profibrotic and mesenchymal transcriptome.
Cell-cell communication analysis identifies a secretory phenotype of profibrotic PT As epithelial/stromal interplay has previously been shown to be implicated in kidney disease development, we next performed ligand-receptor analysis in PT and stromal subclusters (Figure 5A). ProfibPT and Mesench clusters presented with the highest interaction weights ( Figure 5B) and showed the highest ECM signaling ligand expression ( Figure 5C). ProfibPT and PTinj_2 expressed the most ECM receptors (Figures 5C, S10A, and S10B). The captured ligand-receptor network was functionally diverse ( Figure 5D) and we could attribute separate patterns ( Figure S10C): ProfibPT exhibited a strong secretory phenotype ( Figure 5E) and scored highly for secreted ECM factors ( Figure 5F), such as Pdgfb, Tgfb2, Fgf12, Hbegf, Il19, and Il24 (Figures 5G, S11A, and S11B). Moreover, Mesench was associated with the strongest ECM-associated outgoing signal (ligand expression) ( Figure 5H) and scored highest for the core matrisome ( Figure 5I), as reflected by high expression of Col1a1, Col3a1, Bgn, Prelp, Fbln5, and Fn1 (Figures 5J, S11C, and S11D).
Gene-regulatory network analysis highlights cell-typespecific transcription factors driving the PT-to-Mesench trajectory and prioritizes cell types of action for sGC modulation To gain insight into putative driver transcription factors (TFs) of the PT-to-Mesench cell trajectory, we performed gene-regulatory network (GRN) analysis ( Figure 6A). The root state and endpoints of lineages 1 and 2 showed the highest regulon density (Figures 6B, S12A, and S12B), again underscoring the richness within the transcriptomic states of differentiated PST, ProfibPT, and Mesench, respectively. The GRN logic that we inferred from cis-regulatory motif analysis clearly demonstrated that binarized regulon activity was able to independently differentiate and cluster all cells along the trajectory ( Figure 6C), suggesting high data quality and validating our prior clustering and trajectory analysis results. For example, we found highly specific regulons for PST (Gcm1, Stat5a, Bcl6, Lmx1b, Trps1), ProfibPT (Tead2, Bach2, Stat3, Gli3, Fosl2), and Mesench (Gli2, Gata6, Fli1, Tcf7l2, Hoxc6) ( Figures 6D, S12C, and S12D; Data S18). Taken together, our GRN analysis confirmed many known key TFs important for kidney disease development and attributed specific cell types to them. We also found novel cell-type-specific TFs, such as Nr1h4 for PTinj_1, Nfyc for DediffPT_1, and Foxn3 for PTinj_2, which are interesting candidates for studying their roles in renal disease development and warrant validation in future studies.
Finally, we asked whether we could infer specific cell types of action for sGC modulator treatment from our GRN. To this effect, we filtered for TFs that were predicted to target sGC genes (Figure 6E). We found marked enrichment of these regulons in PST, ProfibPT, and Mesench, with the highest regulon specificity scores for TFs Gcm1, Zmiz1, and Srebf2, respectively, targeting either Gucy1a2 or Gucy1b2 ( Figure 6F). Reassuringly, Zmiz1 was the top specific TF for ProfibPT. We have shown in a recent expression quantitative trait loci meta-analysis in human microdissected kidneys that ZMIZ1 is an eGene associated with several genome-wide association study variants significantly associated with kidney function. 38 ZMIZ1 has already been identified to be strongly associated with ESRD attributed to type 1 39 and type 2 diabetes. 40 Along those lines, regulon cell-type specificity tracked TF expression: Gcm1 enriched only in healthy PST, Zmiz1 in ProfibPT, and Srebf2 in Mesench (Figures 6G and  S12E). We found sGC genes to be expressed in PST, PTinj_1,  Figure 6H), further highlighting these probable cell types of action for pharmacological sGC modulation. To our knowledge, our report is the first to attribute celltype specificity of ZMIZ1 to ProfibPT, linking genetic discoveries with functional studies, and warrants validation in future studies.
WGCNA-derived sGC co-expression modules correlate with human DKD outcome Next, we sought to understand cell-type-specific changes in gene groups. To this end, we used WGCNA to identify modules correlating with sGC expression. We first created a WGCNAcompatible metanuclei dataset ( Figure 7A and STAR Methods), from which we retrieved seven gene modules (Figures 7B and S13A; Data S19). Some of these modules showed high kME values for sGC genes ( Figures 7C and S13B), indicating sGC genes as important hub genes for their corresponding coexpression modules. We noticed overall high specificity of modules for cell clusters along the PT-Mesench trajectory ( Figures 7D, S13C, and S13D). For example, the turquoise module was enriched in healthy PST. Blue, yellow, and black modules were enriched in ProfibPT. Green and red modules were enriched in Mesench. Module-specific phenotypes were consistent with Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway and GO term analysis, suggesting pathway enrichment representing healthy PT function in the turquoise model, profibrotic processes in blue, yellow, and black models, dedifferentiation to non-proximal tubule in the brown model, and ECM/Mesench processes in green and red modules, respectively ( Figure S14A and Data S20). Based on cell-type-specific sGC expression, we created a composite sGC co-expression WGCNA score of those gene modules demonstrating the highest co-expression with sGC in non-healthy PT and stromal cells along the trajectory ( Figure S14B). This WGCNA score showed highest enrichment in ProfibPT and Mesench ( Figure 7E). We were reassured that ProfibPT and Mesench clusters had the highest overlap of this composite WGCNA score with clusterspecific DEGs ( Figure 7F), suggesting that these two cell identities were most associated with sGC gene expression in non-healthy injury states.
Finally, we wanted to understand the relevance of sGC-associated changes to patient samples. We asked whether we could leverage this unbiased orthogonal dataset to a group with 991 microdissected kidney tubule RNA-seq samples from human individuals with and without DKD to infer disease-relevant param-eters ( Figure 7G and Data S21). Indeed, WGCNA scores were significantly higher in individuals with high albuminuria (Figure 7H), low glomerular filtration rate (GFR) (Figure 7I), and high degrees of fibrosis on histopathological examination ( Figure 7J). Furthermore, unsupervised clustering analysis of 991 patient samples was able to clearly group patients into two clusters based on sGC co-expression-derived WGCNA score ( Figure 7K). Reassuringly, although these two groups were matched with respect to clinical characteristics such as age, gender, race, blood pressure, and-most importantly-prevalence of diabetes, samples with high WGCNA score had significantly higher prevalence of DKD, albuminuria, fibrosis, and glomerulosclerosis than samples with low WGCNA score, as well as significantly lower GFR ( Figure 7L). These results suggest that the WGCNA sGC co-expression score was able to stratify subjects by clinical disease-relevant parameters. Accordingly, principal component analysis plots stratifying human samples into control, early DKD, and advanced DKD largely overlapped with WGCNA scoring (Figures S14C and S14D). We also validated these results in an independent human kidney bulk RNA-seq dataset including individuals with early and advanced DKD 36 ( Figure 7M).
Next, we built several multiple regression models to estimate the relative contribution of WGCNA score to disease-relevant parameters. Multiple linear regression models demonstrated that the WGCNA score estimated fibrosis (b = 1.144, p < 0.001), glomerulosclerosis (b = 0.616, p < 0.001), and GFR (b = À0.546, p < 0.001) independent of other clinical variables (Data S22). Ordinal logistic regression showed the WGCNA score to independently estimate albuminuria (odds ratio = 1.045, p < 0.001) (Data S22), such that a high WGCNA score was associated with albuminuria ( Figure S14E).
Having established the sGC co-expression WGCNA score as a valuable tool for assessing kidney outcomes relevant to DKD, we finally asked how pharmacological sGC modulation in the ZSF1 rat model would influence WGCNA score. Indeed, we observed lower WGCNA scores for rats treated with sGCact compared with untreated obese diabetic rats, while sGCstim had little effect ( Figure 7N), and we observed a negative correlation between WGCNA score and cGMP effects ( Figure S14F), suggesting that WGCNA score was a useful measure for estimating treatment effect size following sGC modulation. Our observations might also partially explain the improved kidney functional and structural outcome seen following sGCact treatment when compared with sGCstim.

DISCUSSION
Here we present the first comprehensive single-cell resolution atlas of DKD in the ZSF1 rat model. Not only does the ZSF1 rat recapitulate human DKD phenotypically, it also exhibits excellent correlation of cell-type-specific transcriptomes with that of human DKD, 30,31 underscoring the value of the ZSF1 rat for human DKD translational and pharmacological studies. To the best of our knowledge, we are the first to present a single-cell resolution head-to-head comparison of a sGCstim and sGCact treatment. In our model, we find superiority of sGCact over sGCstim in attenuating functional and structural DKD. We highlight key cell types; podocytes, PT, and mesenchymal cells showed the largest changes in the single-cell data, coinciding with the expression of sGC pathway genes. We demonstrate that sGC co-expression gene modules can be successfully used to stratify patient kidney samples by DKD renal outcome measures such as GFR, albuminuria, glomerulosclerosis, and interstitial fibrosis, indicating the relevance of the sGC pathway to patients.
Animal models play a key role in human disease understanding. While recently gene and pathway discovery approaches have heavily focused on patient samples, animal models remain critical for pharmacological gene and pathway modulation and proof-of-concept studies. Here we provide a comprehensive phenotypic, histological, biochemical, and single-cell geneexpression description of the ZSF1 rat model. Comparison of rat samples with human DKD shows very strong similarities but also differences, indicating that the model is useful to analyze specific disease manifestations. Detailed single-cell and omics analysis of animal models will be critical for therapeutics discovery. We present our data for our users via an easy-to-use interface at http://www.susztaklab.com/ZSF1_sGC_snRNA/. Furthermore, we present here an important tool for examining therapeutic effectiveness, target cell types, and mechanism of action via single-cell sequencing. We have been lacking a detailed understanding of individual sGC modulation effects on a cellular level, despite consistent kidney phenotypic improve-ment by sGC agonists in preclinical DKD models. 8,[15][16][17][18] Singlecell transcriptomics with an unbiased tensor decomposition approach highlighted PT and stromal cells as key target cell types of sGC-cGMP-mediated effects. This is mostly consistent with the cell-type expression of sGC pathway genes. Furthermore, we robustly demonstrate the close transcriptomic relationship between PT and stromal cells: During diabetic injury, formerly healthy PT cells transition via several cell states (PTinj, ProfibPT) toward a Mesench phenotype. Numerous studies have implicated EMT in renal fibrosis; 41-44 however, a potential connection to NO-sGC-cGMP signaling has not been described so far and mechanistic animal studies will be needed for future validation. In summary, while multiple cell types show changes in disease state and following drug treatment, novel single-cell tools are still able to identify key disease-driving cell types.
Furthermore, we identified important differences between sGCstim and sGCact. Studies have established the role of reactive oxygen species production, oxidative stress coupled with compromised NO bioactivity, and endothelial dysfunction. [45][46][47] To this effect, it is important to note that sGCstim and sGCact differ in their ability to generate cGMP under pathophysiological conditions such as the high oxidative stress state in DKD. 9 While sGCstim depend on a reduced iron (Fe 2+ ) state of sGC, sGCact preferentially target sGC at the heme-free or oxidized, NO-unresponsive sGC enzyme, explaining their higher pharmacological activity under conditions of high oxidative stress and NO depletion. 14 Our ZSF1 rat model results corroborated this hypothesis, demonstrating stronger attenuation of the DKD phenotype, such as increased kidney function, reduced kidney injury markers, glomerulosclerosis, proteinuria, and interstitial fibrosis, upon sGCact treatment compared with sGCstim. Moreover, we confirmed (in kidneys from both diabetic ZSF1 rats and human subjects with advanced DKD) that lower cGMP effects, a proxy of decreased downstream sGC action, correlated positively with the inability to negatively regulate oxidative stress. Most importantly, cGMP effects measured on a transcriptomic level were restored to a larger extent by sGCact than sGCstim. These results indicate that single-cell gene-expression analysis is able to identify not only disease-driving cell types but also diseasecritical pathways and drug mechanisms of action.
In summary, we present the first single-cell resolution atlas for the ZSF1 rat DKD model and a head-to-head comparison of sGCact and sGCstim effects in DKD. Our single-cell analysis was able to highlight key disease-driving cell types (podocyte, PT, and stromal cells) and disease-driving mechanisms. Finally, we show the potential relevance of animal model observations to patient samples and show that sGC co-expression can be used to stratify human DKD kidney samples by parameters relevant for kidney functional and structural outcomes.
Limitations of the study Our study has some limitations. We provide consistent data demonstrating higher efficacy of sGCact compared with sGCstim on several readouts, but we lack data on dose-response relationships of sGCact and sGCstim, which might be important for finetuning of the efficacy parameters. To avoid potential bias by blood pressure reduction through sGCact, we used doses that had no or minimal effects on blood pressure, while kidney-protective effects of sGCstim in ZSF1 rats are known to require higher dosages that are active on blood pressure. Future studies should address this limitation by comparing sGCact and sGCstim with a wider dose range and a more extensive set of physiological readouts. Additionally, the renoprotective effects in our ZSF1 model may be partially attributed to improvements in glycemia, but it is difficult to estimate the extent of this contribution. 48 Furthermore, given the specific expression of sGC subunit mRNAs in stromal and PT cells, it is difficult to distinguish direct sGC agonism in specific kidney cells from indirect effects such as changes in renal hemodynamics and glucose/lipid metabolism, which warrant further mechanistic studies.

STAR+METHODS
Detailed methods are provided in the online version of this paper and include the following:  Technologies #5067-1513) was used to check RNA quality. All samples with an RNA integrity number (RIN) >6 were used for cDNA preparation. Strand specific RNA-seq libraries were generated using TruSeq RNA library prep kit v2 (#RS-122-2001) following the manufacturer's protocol. RNA-seq libraries were sequenced to a depth of 20 million 2 3 150 pair end reads. Single-nuclei RNA-seq data analysis Individual sample alignment, ambient RNA correction, and doublet removal. Raw fastq files were aligned and quantified with CellRanger using a custom pre-mRNA GTF built from the ENSEMBL rn6 genome to include intronic regions.
Seurat was used for data quality control, preprocessing, and dimensional reduction. In short, for every sample, a separate genecell data matrix was generated and poor-quality cells with <200 or >3,000 expressed genes and mitochondrial gene percentages >15 were excluded. Remaining barcodes of high-quality nuclei were log-normalized and the top 3,000 highly variable genes were identified with the vst method. After data scaling, linear dimension reduction was performed using principal component analysis (PCA). A shared nearest neighbor network was created based on Euclidian distances between cells in multidimensional PC space using the first 15 dimensions before clustering using FindClusters and dimension reduction using RunUMAP functions, respectively.
Doublet-like cells were identified using DoubletFinder. 52 Assuming no ground truth to facilitate an unbiased approach, pK was identified using paramSweep_v3 function with PCs=1:15. Homotypic doublet proportion was estimated with function modelHomotypic using above clustering information. Finally, function doubletFinder_v3 was run with pN=0.25, pK and nExp as identified by the functions above and Uniform Manifold Approximation and Projections (UMAPs) were manually inspected for singlet/doublet status. Sample integration and batch correction. After determining high quality ambient RNA-corrected singlet barcodes for every sample individually, 10X filtered output matrices of all 12 samples were again corrected for ambient RNA and subset to singlet barcodes, as determined above, before merging of Seurat objects. The Seurat preprocessing pipeline was then rerun on the merged object (normalization, identification of highly variable genes, scaling, linear dimension reduction), regressing out nCount_RNA during scaling. Harmony 53 was used to correct for potential batch effects. The first 30 Harmony-corrected principal components were used for nearest neighbor network creation, clustering, and dimension reduction. A clustering resolution of 0.9 was chosen to best reflect separate cell identities without artificial over-clustering. Identification of marker genes and differentially expressed genes. Differentially expressed genes in cell clusters were identified in Seurat using FindAllMarkers function with parameters test.use=MAST, min.pct=0.1 and logfc.threshold=0.2 and a manually curated list of marker genes from prior publications 54-60 was used for manual annotation of the 24 resulting cell clusters in the final dataset including 217,132 rat kidney nuclei. Genes differentially expressed between experimental groups were determined with function Find-Markers for each cell type separately with the same thresholds. Integration of ZSF1 rat DKD with human DKD dataset. The ZSF1 rat DKD snRNA-seq dataset was integrated with two independent external human DKD datasets 30,31 using the FindTransferAnchors function in Seurat with n=30 dimensions. Nuclei from the human query dataset were projected onto the unimodal ZSF1 rat UMAP with function MapQuery. Before correlation analysis of average cluster expression genes from the human DKD dataset were converted to corresponding rat orthologues. Tensor decomposition. To study the effects of sample stratification across treatment groups and gain a deeper understanding of multicellular gene expression patterns, we used scITD 61 to employ tensor decomposition analysis on our single-nuclei dataset. The SoupX-corrected merged count matrix of the final dataset was provided as input, along with histopathological (interstitial fibrosis, tubular degeneration, hyaline cast, mononuclear infiltration, glomerulopathy scores), functional (proteinuria, as measured by urine protein-to-creatinine ratio), genotype (ZSF1 lean vs. obese), and pharmacological treatment (no treatment, sGC modulator treatment) metadata. Function form_tensor was used with parameters donor_min_cells=5, scale_factor=10,000, vargenes_method=-norm_var_pvals, vargenes_thresh=0.1, and var_scale_power=2. The number of factors was determined using function determine_r-anks_tucker with 10 iterations and stability analysis demonstrated mean donor scores correlation >0.9 for all 4 factors. Tucker tensor decomposition was performed using function run_tucker_ica with rotation_type=hybrid. Finally, genes significantly associated with each factor were determined with function get_lm_pvals. Gene ontology and pathway analysis. Gene ontology and pathway analyses for gene lists of interest were performed with package clusterProfiler 62 using functions enrichGO and compareCluster. HALLMARK, GO:BP, C2:KEGG, and C2:CP:PID C2:CP:BIOCARTA gene sets were retrieved through Molecular Signatures Database (MSigDB) v7. For some analyses, GO terms were reduced using package rrvgo functions calculateSimMatrix and reduceSimMatrix. Reduction of GO terms in semantic space was performed with package GOFigure. PT and stroma cell subclustering. The whole Seurat pipeline was repeated with the object subset to those barcodes of cells annotated as PT and stroma cells. The same settings were used for the pipeline as stated above. Differential gene expression analysis and subsequent manual annotation revealed that 3 cell identities represented contamination by scattered endothelial and mixed identity clusters and were thus removed from further analyses. scRNA-seq trajectory analysis Slingshot. To construct single-cell pseudotime cell trajectories and to identify genes whose expression changed as the cells underwent transition, package Slingshot 63 was applied to a random sample of the following subclusters from the PT and stroma cell dataset, for which UMAP inspection and differential gene expression analysis suggested close transcriptomic proximity: proximal straight tubule (PST), injured PT (PTinj), profibrotic PT (ProfibPT), dedifferentiated PT (DediffPT), interstitial (Int), and mesenchymal cells (Mesench), resulting in a total of 4,821 cells. After Seurat to SingleCellExperiment object conversion, genes were filtered for cell type markers with at least 3 reads in at least 10 cells. Next, counts were normalized and dimensionality was reduced using Cell Reports Medicine 4, 100992, April 18, 2023 e4 Article ll OPEN ACCESS diffusion maps with package destiny. 64 Slingshot functions getLineages and getCurves were used to calculate trajectories. To identify temporally differentially expressed genes, generalized additive modelling (GAM) was applied with a locally estimated scatterplot smoothing (LOESS) term for pseudotime. The top genes were picked based on p value and their expression over pseudotime was visualized in heatmaps after binning pseudotime into quantiles. Genes differentially expressed over pseudotime were input into pathway analysis using package clusterProfiler and pathways specifically enriched over pseudotime bins, as determined by q value calculation, were visualized in heatmaps.
Monocle2 & Monocle3. Slingshot-derived pseudotime trajectories were validated with Monocle2 65 and Monocle3 66 packages using the same cells as input. Genes for ordering cells were selected if they were expressed in R10 cells, their mean expression value was R0.05 and dispersion empirical value was R2. Highly variable genes along pseudotime were identified using differentialGeneTest function of Monocle2 with q<0.01. Individual branches were analyzed using BEAM and plot_genes_branched_heatmap functions. In Monocle3 cells were re-clustered using a resolution of 3e -4 . The trajectory was produced using default parameters of function learn_graph. Cluster centers of samples from differentiated PST cells were set as root node before ordering cells along pseudotime with function order_cells. Gene set/pathway scoring. Gene expression of lists or sets of genes was scored in single-cell data as described previously for the cell cycle 67 and other gene sets, 68 using normalized gene expression of a gene set/pathway of interest as input and setting the gene correlation value to 0.1. Jaccard similarity index. Single-cell cluster stability of PT and stroma cells was evaluated by comparing cluster-specific DEG lists and calculating Jaccard similarity indices according to the following formula: where J is the Jaccard similarity index and A and B represent DEG lists of two respective clusters to be compared. Ligand-receptor interactions. To assess cellular crosstalk between different cell types, we used CellChat 69 to infer cell-cell communication networks from single-cell transcriptome data. For the lack of a rat-specific ligand-receptor interaction database, we used orthologous mapping to facilitate usage of the Cellchat-curated mouse database. We followed the authors' tutorial for comparison analysis of multiple datasets (https://htmlpreview.github.io/?https://github.com/sqjin/CellChat/blob/master/tutorial/ Comparison_analysis_of_multiple_datasets.html), filtering communication with parameter min.cells=10. We used all inferred cellcell communications at the level of ligands/receptors and later repeated the analysis focusing on secreted factors and ECMreceptor interactions. Outgoing and incoming interaction weights of pairs of cell types were inferred using functions computeCommunProbPathway and aggregateNet. Dominant senders and receivers were visualized using functions netAnalysis_signalingRole_heatmap and netAnalysis_signalingRole_scatter. Structural and functional similarities of signaling pathways were visualized using function netVisual_embedding. Gene regulatory network inference. To identify TFs and characterize cell states, we employed cis-regulatory analysis using SCENIC, 70 which infers the GRN based on co-expression and DNA motif analysis. In short, TFs were identified using GENIE3 and compiled into modules (regulons), which were subsequently subjected to cis-regulatory motif analysis using RcisTarget with two gene-motif rankings: 10 kb around the TSS and 500 bp upstream. Regulon activity in every cell was then scored using AUCell. Finally, binarized regulon activity was projected onto diffusion map-embedded trajectories. Weighted gene coexpression network analysis (WGCNA). We applied WGCNA to our scRNA-seq dataset using the R package WGCNA, as described previously. 71,72 First, to circumvent the sparsity of single-cell data we constructed metanuclei with a bootstrapped aggregation process to single-cell transcriptomes and pooled nuclei within the same cell type to retain these metadata for WGCNA. We then created a similarity matrix, in which the similarity between genes reflects the sign of the correlation of their expression profiles. To emphasize strong correlations and reduce the emphasis of weak correlations on an exponential scale, we raised the signed similarity matrix to power b. The resulting adjacency matrix was transformed into a topological overlap matrix. Modules were defined using the following specific module-cutting parameters: module size=50 genes, deepSplit score=4, threshold of correlation=0.2. Modules with a correlation of >0.8 were . The first principal component of the module, the module eigengene (ME), was used to correlate with cell type. Hub genes were defined using intra-modular connectivity (kME) parameters of the WGCNA package. Bulk RNA-seq data analysis Quality control and alignment. Adaptor and lower-quality bases were trimmed with Trim-galore. Reads were aligned to the human genome (hg19/GRCh37) using STAR. Gene and isoform expression levels (TPM) were estimated using RSEM. Principal component analysis was performed to identify outliers. Hierarchical clustering analysis. To identify potential clustering of microdissected human kidney tubule bulk RNA-seq samples based on gene co-expression with sGC, hierarchical clustering was performed on the scaled TPM matrix of 991 microdissected human kidney tubules with the composite WGCNA score as input. Ward's method with Euclidean distances was used to cluster the datasets using function hclust. The optimal number of clusters k was determined by average silhouette method. After clustering, a cluster dendrogram was computed. Clinical and histopathological variables were compared between the clustered samples using Wilcoxon-Mann-Whitney, t or Fisher's exact test, as applicable. To exclude random clustering effects and demonstrate validity of the composite WGCNA score as input for clustering analysis, we repeated the above procedure with n=3 randomly generated gene sets with an equal number of genes. In each instance, patient samples were clustered into 2 main clusters based e5 Cell Reports Medicine 4, 100992, April 18, 2023 Article ll OPEN ACCESS on gene expression but failed to demonstrate significant differences of DKD prevalence and outcome measures (e.g., proteinuria, GFR, interstitial fibrosis, glomerulosclerosis). Multiple linear and ordinal logistic regression. To estimate the relative contribution of WGCNA score to disease-relevant parameters in bulk RNA-seq data from human kidney samples, we built separate multiple regression models with fibrosis, glomerulosclerosis, and GFR as dependent variable. Age, gender, race, systolic blood pressure, prevalence of diabetes, BMI, and HgbA1c were put into the models as independent variables. Independent variables were then reduced depending on whether they were informative for the model or not. In multiple regression models, b coefficients and F statistics were calculated. For albuminuria, we performed ordinal logistic regression using the MASS package with function polr and computed odds ratios as well as predicted probabilities. The proportional odds assumption was confirmed.

QUANTIFICATION AND STATISTICAL ANALYSIS
Data are expressed as means ± SEM unless otherwise stated. Statistical analyses are indicated in the respective methods sections and figure legends. Appropriate parametric or non-parametric tests were performed as per normality distribution. P < 0.05 was considered to be statistically significant. No statistical method was used to predetermine sample size. No data were excluded from the analyses.