Quantifying stimulus-response specificity to probe the functional state of macrophages

Immune sentinel macrophages initiate responses to pathogens via hundreds of immune response genes. Each immune threat demands a tailored response, suggesting that the capacity for stimulus-specific gene expression is a key functional hallmark of healthy macrophages. To quantify this property, termed "stimulus-response specificity" (SRS), we developed a single-cell experimental workflow and analytical approaches based on information theory and machine learning. We found that the response specificity of macrophages is driven by combinations of specific immune genes that show low cell-to-cell heterogeneity and are targets of separate signaling pathways. The "response specificity profile," a systematic comparison of multiple stimulus-response distributions, was distinctly altered by polarizing cytokines, and it enabled an assessment of the functional state of macrophages. Indeed, the response specificity profile of peritoneal macrophages from old and obese mice showed characteristic differences, suggesting that SRS may be a basis for measuring the functional state of innate immune cells. A record of this paper's transparent peer review process is included in the supplemental information.


INTRODUCTION
Macrophages reside in almost all tissues of the body, where they are sensors for injury, infection, or disease. 1 The many functions they perform require them to respond appropriately to pathogens (pathogen-associated molecular patterns [PAMPs]), injury and danger (danger-associated molecular patterns [DAMPs]), and to cytokines. Immune response genes code for potent bioactivities that are not constitutively expressed because they may be detrimental to the host. Thus, immune response gene programs should only be deployed as needed and to the extent necessary. In fact, the precise deployment of immune response genes is critical for preventing abnormal immune sequelae. Both weak or overactive immune responses may arise from the failure of immune sentinel cells to respond with appropriate specificity to immune threats, resulting in poor health outcomes. [2][3][4][5] Since immune sentinel cells function as individuals in initiating and coordinating immune responses, measuring their capacity to respond specifically to diverse pathogens requires consideration of their substantial cell-to-cell heterogeneity. 6 Prior population-level macrophage transcriptome profiling studies identified a common immune response gene program, 7 as well as gene programs that were in fact highly stimulus specific. 8-10 A plethora of population-level studies have shown that specific responses arise from the signaling and epigenetic networks downstream of receptor-ligand interactions. [11][12][13] In addition, several studies measuring time-dependent stimulus responses using transcription factor knockouts have delineated the regulation of stimulus-dependent gene programs. 8,13 However, these studies did not quantify how specific macrophages responses actually are, as this requires single-cell measurements to determine the breadth and overlap of single-cell response distributions.
Macrophage heterogeneity itself has recently been studied via single-cell measurements in multiple contexts, such as during polarization or differentiation. For instance, the heterogeneity of macrophage populations after exposure to polarizing cytokines was shown to be greater if simultaneous conflicting cues were provided. 14 Single-cell heterogeneity has also been profiled during monocyte-to-macrophage differentiation in different tissues under different infection conditions, uncovering distinct activation paths in vivo that are altered by microenvironment and disease. 15 Importantly, macrophages must function as responders to immune threats, and thus recent studies have investigated the heterogeneity of single-cell macrophage or monocyte responses to single stimuli, after polarization 16 or after immune training. 17 However, these studies have not yet addressed how the heterogeneity of macrophage gene expression affects the specificity of responses to different to stimuli, and how the resulting stimulus-response specificity may be affected by different microenvironmental contexts.
The response of immune sentinels to stimuli is known to be a function of the microenvironmental context. 18 Polarizing cytokines enhance and diminish the activation of specific immune genes in an immune stimulus-specific manner by altering signaling networks and epigenetic landscapes. [19][20][21][22] Thus, aberrant cytokine contexts that are associated with inflammatory or immunological disease may affect the macrophage's capacity to mount stimulus-specific responses. Indeed, as macrophages circulate and patrol the body, profiling their capacities to mount stimulus-specific responses may report on the inflammatory state of their donor. However, in the absence of a workflow that measures response distributions and quantifies their overlap, the opportunity to leverage the macrophage's sensitivity to immune cytokines for reporting on the function of an individual's innate immune system has not been explored.
Here, we developed the necessary single-cell experimental and computational approaches to assess the functional states of macrophages, via quantification of the specificity of their responses to immune threats. Using information theoretic and machine learning approaches, we found that stimulus-response specificity (SRS) was driven by genes with narrow response distributions that in combination distinguished different ligands. Mechanistically, high SRS was associated with stimulus-specific activation of interferon regulatory factor (IRF) or MAPKp38 pathways, or with differences in the dynamical profiles of NF-kB signaling. We found that SRS was affected by microenvironmental cytokines in a gene-specific manner. This realization prompted the development of the response specificity profile, involving systematic pairwise comparisons, to reveal important functional distinctions in macrophages polarized in vitro or conditioned in vivo by inflammatory conditions of age or obesity.

RESULTS
Despite single-cell heterogeneity, macrophages produce highly specific gene expression responses to diverse immune stimuli To quantify the degree of response specificity in macrophages, we developed an experimental workflow using a targeted mRNA sequencing approach 23 that was both cost-effective and reduced the technical noise of genome-wide single-cell RNA sequencing (scRNA-seq) approaches ( Figure 1A). We selected a set of 500 macrophage genes via principal-component analysis (PCA) on available macrophage response RNAseq data, which profiled macrophage responses across a time series at 0, 1, 3, and 8 h in response to 14 different viral and bacterial immune components or whole pathogens. 8 The gene loadings of the resulting PCA identified the most stimulus-specific genes in an unsupervised manner ( Figure S1A; Table S1; STAR Methods). This bulked RNA-seq dataset. The selected set of 500 genes showed greater enrichment of NF-kB, IRF, and AP1 motifs than 1,502 stimulus-induced genes, which suggested that these three signaling pathways are the primary drivers of response specificity ( Figure S1B).
To identify immune stimuli that would best represent response specificity, we further analyzed bulk RNA-seq data from macrophages responding to 14 different pathogen or cytokine ligands to determine the ligands that induce diverse macrophage responses ( Figure S1A). 8 Tensor component analysis 24 allowed for an integrated decomposition of the data across all measured time points and stimuli, and showed that each ligand occupied a non-redundant location, indicating a distinct transcriptomic response at the population level. However, some ligands such as CpG and Pam3CSK (P3C) sat closely adjacent in the tensor-decomposed space ( Figure S1C), indicating a greater similarity in their time-dependent expression profiles. From the 14 stimuli, we selected 6 with distinct tensor component weights that represented a spectrum of Gram-positive bacteria (P3C), Gram-negative bacteria (lipopolysaccharide [LPS]), bacterial DNA (CpG), viral nucleic acids (poly(I:C) [PIC]), and host cytokines activating either the interferon (IFN-b) or NF-kB (TNF) signaling pathways.
Because gene programs are induced dynamically, the quantification of SRS may depend on the time point at which gene expression measurements are taken. To compare different time points of stimulation, we calculated pairwise stimuli distances at 1, 3, and 8 h. Ligand-pair distances were most distinct at 3 h ( Figure S1D), reflecting less impact from secondary signaling mechanisms than at the 8-h time point, and thus we chose the 3-h time point for our analysis of response specificity. A heatmap comparison of the newly generated single-cell data showed good concordance with the published bulk RNA-seq data while revealing substantial cell-to-cell heterogeneity in expression ( Figure 1B).
A comparison of the scRNA-seq data from our targeted approach (Rhapsody) to the genome-wide approach (10x Genomics) showed that both had similar concordance in their means to bulk data ( Figure S2A) and had comparable distributions to each other ( Figure S2B). Notably, the majority of genes captured only by the genome-wide approach were poorly expressed, further supporting the utility of sequencing just the most informative genes through the targeted approach ( Figure S2C). For $90% of the genes measured by both approaches, the targeted approach also had a smaller percentage of cells with drop-out data (i.e., genes with zero reads), as each transcript could be sequenced more deeply ( Figure S2D). To avoid heterogeneity in macrophage populations due to different progenitors present in bone marrow, we used a clonal HoxB4-immortalized myeloid progenitor cell line that was differentiated into macrophages with macrophage colony-stimulating factor (M-CSF)-containing medium. The resulting cell populations showed similar bulk 25 and single-cell stimulus-response transcriptomic profiles as bone marrow-derived macrophages ( Figures S3A and S3B). Taken together, these results suggest that the targeted sequencing approach provides a reproducible ( Figures S3C and S3D; STAR Methods), cost-effective means for measuring heterogeneous single-cell macrophage gene expression in response to diverse immune ligands, making it suitable for quantifying SRS.
To determine how much single-cell heterogeneity affected the stimulus specificity of responses, we next sought to assess the overlap in gene expression response distributions. PCA revealed that IFN-b, LPS, and PIC-response distributions were best distinguished, with minimal overlap of their 95% confidence regions on the first two components (44% variance explained), while TNF, CpG, and P3C appeared overlapping (Figure 1C). Uniform manifold approximation and projection (UMAP) on the top 20 components clarified that TNF could be separated from CpG and P3C on lower components ( Figure 1D), potentially because the latter activate stronger MAPK and non-oscillatory NF-kB, while the MyD88-mediated CpG and P3C response distributions remained largely indistinguishable ( Figure 1E). We trained a random forest classifier to determine how well the stimulus could be predicted given a single cell's response transcriptome ( Figures 1F, S4A, and S4B). We found that IFN-b could be perfectly predicted (F1 score = 100%), while the bacterial ligands CpG and P3C were confused with each other and were thus more poorly classified (F1 = 85% and 74%, respectively). The overall prediction accuracy of 88% across all stimuli indicated a high degree of stimulus specificity despite cellto-cell heterogeneity in macrophage responses to ligands ( Figure 1F).
High stimulus specificity is determined by combinations of individual genes that alone can distinguish only subsets of stimulus pairs To identify genes that may be driving the observed stimulus specificity, we next employed an information theoretic approach. 26, 27 We considered ligand information as transmitted through a channel comprising cell signaling and gene regulatory networks, both affected by pre-existing biological heterogeneity and the stochasticity of biochemical reactions, to produce heterogeneous gene expression responses ( Figure 2A). 28,29 Under this framework, the maximum mutual information (max MI) describes the certainty about the ligand input given the transcriptome output. One bit equates to perfect distinguishability of 2 ligand response distributions (2 1 = 2), 2 bits for 4 ligands (2 2 = 4), and 2.58 bits for 6 ligands (2 2.58 = 6). Because MI is maximized over all possible input distributions (max MI), this metric provides a comparable absolute quantity for the biological characteristic of response specificity that is independent of the underlying type of data distribution or the mechanisms by which the information is encoded.
We found that 95% of the measured genes on their own conveyed no more than 1 bit of information ( Figure 2B; Table S2), which would indicate the ability to distinguish two conditions. A heatmap of the top-ranking individual genes indicated that rather than each gene having several levels of Bottom left: heatmap of all genes induced with log 2 (FC) > 2 in any stimulus from bulk RNA-seq data. 8  expression that would separate ligands, most genes displayed an on-or-off expression pattern across single cells, which thus distinguishes only two groups of ligands ( Figure 2C). The single gene most informative for distinguishing stimuli was Cmpk2, a mitochondria-associated gene with reported antiviral and homeostatic functions, 30,31 which allowed for a max MI of 1.5 bits ( Figure 2C; Table S2), corresponding to expression distributions sufficiently narrow to define approximately 3 kinds of stimulusspecific responses. How then can macrophages achieve high response specificity? One possibility is that a combination of genes that shows low cell-to-cell heterogeneity may specify stimulus-specific re-sponses to multiple stimuli. To assess this possibility, we employed the same information theoretic framework to calculate the max MI provided by the best-performing combinations of genes. Indeed, the best combination of two genes (Cmpk2 and Nfkbiz) already allowed for a max MI of almost 2 bits, with the gain in max MI plateauing to $2.25 bits as larger combinations were tested ( Figure 2D). Interestingly, the majority of genes within the top gene combinations were intracellular proteins controlling nucleotide metabolism (Cmpk2), 32 antiviral activity and cell death (Ifit3, Ifit1, and Mx2), [33][34][35] ubiquitination (Peli1), 36 mRNA half-life of inflammatory genes (Zc3h12a), 37 or phagocytosis (Swap70) 38 ( Figure 2E). Cytokine genes commonly  (Table S3). Machine learning classification using different gene combination sizes confirmed these conclusions. Even just the top two genes performed reasonably well at 70% accuracy, the top 5 genes further improved classification accuracy to 78%, and the top 15 genes performed almost as well as all genes at 85% accuracy (vs. 88% for all) ( Figure 2F cf. Figure 1F). Gene combinations that worked well together did so by distinguishing complementary ligand pairs ( Figure 2G). For example, Nfkbiz alone contributed 0.75 bits to distinguishing CpG and PIC, complementing the inability of Cmpk2 expression to distinguish these ligands (0 bits).
The response specificity of cytokine genes is limited by high cell-to-cell heterogeneity, despite having distinct expression means High response specificity requires not only that mean population gene expression is distinct but also that the cell-to-cell heterogeneity is low, such that distributions of single-cell responses have limited overlap. Hence, we investigated whether SRS was limited by small differences in means or wide distributions. We found that mean-square deviation (MSD), which summarizes differences in means across ligands, correlated more strongly to response specificity (r = 0.8) than the average Fano factor, which summarizes average gene heterogeneity (r = À0.5) ( Figure 3A). However, a few outliers were evident: some genes with a high mean difference (e.g., Ccl5 or Cxcl10) showed unimpressive response specificity, with similar max MI to genes with a low mean difference (e.g., Ifi205) ( Figure 3A). Plotting the average Fano factor revealed that these outliers differed in their dispersion: Cxcl10 and Ccl5 displayed much higher heterogeneity (i.e., high avg. Fano factors) ( Figure 3B). By contrast, metabolic and non-secreted genes such as Cmpk2, Ifi205, and Ifit3 had tight distributions (i.e., unusually low avg. Fano factors) and thus high response specificity.
Interestingly, a pattern emerged where the Fano factors of cytokine/chemokine genes were higher than expected, diminishing their response specificity despite distinct mean expression levels ( Figure 3C). For example, directly plotting the response distributions of cytokines Ccl5, Tnf, and Cxcl10 ( Figure 3D), as well as of the non-cytokine gene Cmpk2, showed that Ccl5 was induced with high variance expression distributions in response to PIC, P3CSK, and CpG, while Cxcl10 and Tnf distributions also had Fano factors (variance/mean ratio) close to or above one for multiple stimuli, which meant that their distributions were broad in relation to mean expression. By contrast, the Fano factors across stimuli for the non-cytokine gene Cmpk2 were well below one for all stimuli ( Figure 3E). These results suggest that single-cell cytokine/chemokine expression may have limited predictive value in specifying distinct ligand responses due to single-cell heterogeneity.
The stimulus specificities of immune response genes are due to the selective deployment of IRF and p38 pathways and NF-kB dynamical features In macrophage responses, four signaling pathways and their downstream gene regulatory factors are combinatorially activated and are responsible for transmitting information about extracellular ligands to the nucleus ( Figure 4A). Their target genes have been categorized into five gene regulatory strategies, namely, AP1, NF-kB, IRF, NF-kB|p38, and NF-kB|IRF. 8 We asked which gene regulatory strategies may mediate the high SRS of particular genes, as measured by max MI. We assigned each gene to a regulatory logic through matching mathematical model simulations of possible regulatory logics to available stimulus-response datasets, followed by a curation of prior literature ( Figure S5A; Table S4; STAR Methods). The appropriate motifs were enriched for genes assigned to each of the clusters ( Figure S5B). We then calculated the max MI of every gene for ligand pairs ( Figures 4B-4D). For example, contrasting TNF vs. IFN-b, genes within every regulatory logic group showed high specificity ( Figure 4B), a reflection of the fact that each group is activated by only one of the ligands: AP1 and NF-kB and NF-kB|p38 are activated by TNF but not IFN-b, whereas IRF genes are activated by IFN-b but not TNF. By contrast, for the P3CSK vs. TNF stimulus pair, only NF-kB| p38 target genes showed specificity because both stimuli activate AP1 and NF-kB, fail to activate IRF, and differ in the extent of p38 activation ( Figure 4C). Similarly, for P3CSK vs. PIC, NF-kB|p38 targets showed specificity because PIC does not activate p38. 8 However, for P3CSK vs. PIC, unlike P3CSK vs. TNF, IRF target genes also contributed to SRS, since PIC activates IRF but P3CSK does not ( Figure 4D). Of note, the max MI of IRF target genes was on average lower for P3CSK vs. PIC than for TNF vs. IFN-b, potentially due to highly heterogeneous activation of IRF by the TRIF signaling pathway. 40 In addition to combinatorial pathway control, the dynamics of NF-kB activation also specify gene expression. 41 Stimulusspecific NF-kB temporal dynamics involve six NF-kB signaling codons that convey information about the stimulus to target genes: ''speed,'' ''peak amplitude,'' ''oscillations,'' ''duration,'' ''total activity,'' and ''early vs. late'' activity ( Figure 4E). 39 To determine which signaling codons may be associated with the stimulus specificity of NF-kB target genes, we correlated pairwise specificities of NF-kB signaling codons with the pair-wise specificities of NF-kB target genes ( Figure S5C). We found that NF-kB target genes differed in with which signaling codons they were associated ( Figures 4F and S5D). These distinctions may be reflective of genes employing distinct gene regulatory mechanisms such as an incoherent feedforward loop that decodes peak amplitude, 42 long mRNA halflife or slow chromatin opening steps that decode duration, 43 or the requirement for de novo enhancers that distinguishes oscillations. 44 Cytokine polarization modulates the response specificity of specific genes to specific stimuli Macrophages show remarkable functional pleiotropy that is dependent on microenvironmental context. 45 Thus, polarization by prior cytokine exposure may alter their capacity for stimulusspecific responses. To test this hypothesis, we polarized macrophages into M1(IFN-g) and M2(IL-4) states, which represent opposing ends of the macrophage polarization spectrum (Figure S6A), 46 and generated single-cell stimulus-response data for the six ligands ( Figure 5A). Polarized M1(IFN-g) and M2(IL-4) macrophages expressed macrophage marker Adgre1 ( Figure S6B) and the appropriate polarization markers ( Figures S6C and S6D). PCA and UMAP projections of response distributions revealed that stimulus responses for polarized macrophages were distinct ( Figures 5B and S6E), but M1(IFN-g) response distributions were more overlapping than those for M0 naive macrophages.
We found that SRS was reduced in both polarization states for every set of the best gene combinations, as calculated by max MI, indicating that polarized macrophages may function more as specialized effectors and less as sentinels that serve a primary role of distinguishing immune threats ( Figure 5C; Table S3). Interestingly, we further observed that the genes within each of the best gene sets were different for each polarization state ( Figure 5D). This was corroborated by examining one gene, Cxcl10, which was included in the best gene combinations in M0 and M2 (IL-4) conditions but not in M1 (IFN-g). Indeed, in M1 (IFN-g) macrophages, Cxcl10 was promiscuously rather than stimulus-specifically activated and no longer carried stimulus-specific information about any pairs of stimuli ( Figure 5E). This change in SRS could be ascribed to the IRF pathway. Several NF-kB|IRF target genes (Cxcl10, Cmpk2, Ifit3, and Trim21) lost specificity in M1(IFN-g) macrophages (Figures 5F and S6F), reflecting the fact that in the presence of IFN-g conditioning, such genes only require activation of NF-kB to be induced. In fact, using the random forest machine learning model trained on M0 naive macrophages to predict the stimuli seen by M1 or M2 cells, we found that M1 responses were all more likely to be predicted as LPS or PIC, which are IRF-activating stimuli ( Figure 5G). Gene ontology attributed this loss of SRS to the biological pathways ''response to virus,'' ''response to LPS,'' and ''response to IFN-b'' ( Figure S6G; Table S5). Motif enrichment analysis also identified IRF target genes as responsible ( Figure 5H).
While a global analysis indicated that polarization diminished macrophage response specificity, the genes most affected differed for each macrophage state. In addition, a smaller set of largely NF-kB response genes also showed increased rather than diminished response specificities under Please cite this article in press as: Sheu et al., Quantifying stimulus-response specificity to probe the functional state of macrophages, Cell Systems (2022), https://doi.org/10.1016/j.cels.2022.12.012 each polarization condition ( Figure 5F). Such nuanced findings preclude the use of a single gene or even a single set of genes to quantify SRS over multiple macrophage states. The response specificity of macrophages thus may not be appropriately characterized by a single number, but rather by a higher dimensional profile. The response specificity profile of stimulus pairs assesses the functional state of macrophages and readily distinguishes M0 vs. M1 vs. M2 macrophages We noted that while different macrophage states could be identified by profiling their steady-state transcriptomes ( Figure S6A), their responses to stimuli were even more distinguishable (Figure 6A). Quantifying the distance between both the stimulated and unstimulated distributions, we found that in all comparisons, specific stimuli revealed differences that were not as evident from steady-state measurements. This was especially evident for M0 vs. M2 macrophages, whose response distributions to P3C and CpG were particularly more distinct than could be as-certained from steady-state distributions ( Figure 6B). This illustrates that for assessing the functional state of macrophages, measurements of multiple stimulus responses provide nonredundant information.
To quantitatively assess the response specificity provided by all 6 stimuli, we developed an approach to profile the max MI of all 15 stimulus pairs within the macrophage response landscape characterized by a PCA projection of all single-cell stimulus-response transcriptomes ( Figure 6C; Table S6; STAR Methods). The resulting response specificity profile of M0, M1(IFN-g), and M2(IL-4) macrophages showed specific differences for some of the stimulus pairs tested ( Figure 6C). In both polarization states, CpG-P3CSK specificity, which had been the least distinguishable stimulus pair for M0 macrophages, was enhanced. Calculating the difference in max MI from the M0 state, and thereby the delta response specificity profile, provided a characteristic signature of the functional state of the macrophage ( Figure 6D). This profile could also be summarized succinctly in a single number, the delta response specificity index (DRSI), which provided a clearer indication of differences among polarization states than the MI calculated on all stimuli together (Fig-ure 6E). We found that the overall DRSI of M2(IL-4) macrophages was greater than that of M1(IFN-g) macrophages, as a result of M2(IL-4) macrophages showing a greater loss of specificity in distinguishing host vs. bacterial ligands. Thus, the response specificity profile and DRSI revealed a signature of pairwise SRS scores associated with the function of each macrophage type, whether naive, M1(IFN-g), or M2(IL-4).
Peritoneal macrophages from old and obese mice show distinctive alterations in their response specificity profiles Next, we tested whether the response specificity profile might reveal aberrations in macrophages derived from mice with conditions associated with inflammatory disease. We took  Figure 7A). Visualization of the resulting single-cell data suggested that old mice had aberrant IFN-b-response distributions, while obese mice had aberrant PIC-response distributions ( Figure 7B). We calculated the response specificities for the data ( Figure 7C) and noted that the PMs from young, healthy mice had an overall DRSI most similar to naive M0 macrophages ( Figure 7D). A closer inspection of the response specificity profile showed that PMs from old mice showed the most diminished specificity for interferon-activating stimulus pairs (e.g., LPS-PIC), akin to what we observed with M1(IFN-g) macrophages. PMs from high-fatdiet, obese mice had decreased specificity across all stimulus pairs, but particularly in distinguishing cytokine vs. viral (TNF-PIC) responses. The differences observed through calculation of the response specificity profile on stimulus subsets suggest an importance to comparing multiple subsets of stimuli in evaluating innate immune function.
To identify individual genes that showed particularly high losses of SRS in these in-vivo-conditioned macrophages, we compared max MI values for each gene ( Figure 7E). We found that macrophages from both old and obese mice lost specificity in the cytokine Tnf, but also in the metabolic gene Acod1, 47,48 and in upstream TLR signaling network proteins such as Peli1, 36 Nfkbiz, 49,50 and Phlda1. 51 Normal function of these genes has been implicated in resistance to septic shock, macrophage response to atherosclerosis, and protection from autoimmune disease. Interestingly, a couple of genes showed higher stimulus specificity in these dysregulated microenvironments. In old mice, genes with the highest gains in specificity were the cytokine Cxcl10 and cytokine regulator Aw112010, which are required for mucosal immunity, 52

Article
Slamf8, which play roles in macrophage differentiation and migration. 53,54 This suggests that macrophage responses in unhealthy mice deviate from those in healthy mice through both losses and gains of stimulus specificity in individual genes: losses indicating responses that are too promiscuous, potentially causing conditions for autoimmunity to arise; and gains indicating responses that are too restricted to a particular stimulus, potentially diminishing core functions of macrophages in innate defenses or resolution of tissue inflammation and damage. As PMs of mice of different ages may be composed of different subpopulations, we next evaluated to what extent the changes in SRS may be due to subpopulations of macrophages differentially present in diseased vs. healthy mice. Based on scRNAseq of steady-state populations, 55 we found that all subclusters of macrophages were found in both young and old mice ( Figures S7A-S7E), although in slightly different proportions (Figure S7F). Using marker genes to match these clusters to the stimulus-response data, we likewise found these subgroups represented across healthy, aged, and obese mice, although again in different proportions ( Figure S7G).
Taken together, quantifying the SRS of macrophages revealed that this functional hallmark of immune sentinel cells is affected not only by polarizing cytokines used in pre-conditioning regimes in vitro but also by the microenvironments in vivo that are evidently distinct in obese and old mice. It is possible that response specificity profiles of PMs capture altered responses of both distinct subpopulations of macrophages that are differentially represented in inflammatory conditions and distinct functional states of the same subpopulation. Regardless, the observed differences in SRS suggest that quantifying post-stimulation single-cell response distributions could be valuable for assessing innate immune function.

DISCUSSION
Mounting stimulus-appropriate immune responses is a key property of healthy macrophage function. 7,8,[56][57][58] Macrophage response specificity is a function of the stimulus-specific engagement of signaling pathways and may be diminished by molecular network noise that results in cell-to-cell heterogeneity. While response specificity of the macrophage NF-kB signaling pathway has been characterized, 39,59 the response specificity of immune gene expression responses arising from all macrophage signaling pathways has not yet been quantified. By developing the experimental and computational tools to do so, we found that the high gene expression specificity observed was generated by sufficiently narrow response distributions in combinations of genes that respond with distinct patterns across stimuli, but the contribution of often-measured cytokine genes was limited by high cell-to-cell heterogeneity of expression. Mechanistically, we found that SRS is generated by stimulusspecific activation of IRF or MAPKp38 signaling, or by differences in NF-kB dynamics. Loss of IRF gene specificity by microenvironmental polarization was the key driver in altering response specificity profiles. Given that response specificity is context-dependent, we profiled PMs from old and obese mice, revealing highly specific changes in the response specificity profile that correlated with a different health status. These findings may prompt further studies to investigate whether macrophage response specificity could be a means to characterize the innate immune health of human donors.
Our ability to measure and subsequently quantify response specificity was enabled by a quantitative assay for costeffective, reliable scRNA-seq. 23,60 The targeted sequencing approach we pursued here resulted in less technical noise than genome-wide approaches, due to improved reverse-transcriptase efficiency and increased sequencing depth per gene. 23 However, even with technical improvements, the remaining measurement noise 61 still may result in underestimates of the true response specificity. Future efforts on smaller gene lists may allow for the use of even less noisy measurement approaches like single-molecule fluorescence in situ hybridization (smFISH), 62,63 which capitalize on the smaller sets of informative genes identified in this study.
Measuring the responses from macrophages rather than their steady-state transcriptomes provided additional levels of information. First, we found that stimulus-response transcriptomes may reveal differences in macrophage populations that are not apparent in the steady state. This may be because exposure histories and cytokine contexts change signaling pathways and chromatin states that are not reflected in steady-state mRNA abundances. Measuring stimulus responses also allowed us to evaluate multiple distributions for each macrophage type, rather than the single distribution provided by profiling the steady state. Emergent from these multiple distributions is response specificity, which evaluates the relative extent of distribution overlap and reports on the ability of each macrophage type to distinguish among immune threats. Not only is response specificity an important biological hallmark property of macrophages, 58 but also from the workflow perspective, being based on multiple measurements that may be compared with one another, it makes this analysis metric more resistant to technical noise or batch effect, potentially enabling better comparisons across studies.
The information theoretic approach we used here to quantify SRS has been previously employed to quantify the information transmission capacity of signaling pathways. 28,39,59,[64][65][66][67][68][69] In fact, we found that SRS of individual genes can be traced either to a single pathway or by multiple pathways, 8 for example, the NF-kB target gene Tnf whose mRNA half-life is regulated by stimulus-induced MAPKp38. 70 This combinatorial control explains why in principle some single genes like Tnf can hold more information than available from a single signaling pathway. But even single pathway genes may show response specificity owing to their ability to distinguish different dynamical characteristics, such as NF-kB signaling codons. Interestingly, we observed that such genes correlated strongly to identifiable signaling codons, indicating that their gene regulatory strategies are able to decode the information present in the stimulus-specific deployment of the signaling codon.
Theoretically, information loss from signaling to gene expression is minimal without noise, 71,72 while with noise, maintaining minimal information loss is only possible under select optimal promoter or chromatin conditions. [73][74][75] Specifically, to achieve high gene expression response specificity, signaling information must be interpreted by gene regulatory strategies 76,77 without amplifying the cell-to-cell heterogeneity in signaling activity, 78 and also without introducing further heterogeneity through pre-existing chromatin heterogeneity 79,80 or molecular noise. [81][82][83][84] In this context, it is not surprising that the SRS of most individual genes was low. However, as macrophages do not rely on only a single gene to mount a biological response to a specific immune threat, even with information loss at each particular promoter, the overall SRS observed through combinations of a few genes from complementary pathways was still high.
Within the body, macrophages are exposed to polarizing microenvironments in physiological scenarios, 85 as well as in pathological inflammatory contexts such as aging or obesity. [86][87][88] As circulating cells they are potentially reporters of even localized infections. 89 Indeed, profiling the transcriptome or epigenome of circulating cells or macrophages has revealed molecular markers or signatures that are prognostic for therapeutic efficacy 90,91 or alternative disease courses, such as in persistent infectious or inflammatory diseases 15,92 or in cardiovascular and autoimmune diseases. [93][94][95][96][97] However, as seen in human COVID-19 studies, it can be unclear which individuals have poor or vigorous immune health until they are challenged by infection. [98][99][100][101][102] Here, we considered that macrophage functions are deployed in response to immune threats and that stimulus responses are a function not only of the steady-state transcriptome or epigenome but also the dynamics of signaling complexes, membranes, and transport rates. We reasoned that macrophage responses to different stimuli reveal a functional pleiotropy not evident at steady state and that the stimulus-specific deployment of functions is key to healthy immunity. Indeed, we found that macrophages conditioned in vitro by defined polarizing cytokines, as well as macrophages isolated from obese or old mice, showed distinctly altered response specificity profiles due to both cytokine genes and regulators within macrophage metabolic and signaling pathways.
In developing the response specificity profile, we found that analyzing single genes and pairs of stimuli provided more insight than aggregating the data together. For example, response specificity for each pair of stimuli differed for each polarization condition, but this information is lost when calculating MI for all stimuli at once. Instead, an aggregate score of alterations in the response specificity profile (DRSI) provides the first indication of differences and the full response specificity profile pinpointed aberrancies in select stimulus pairs that may be a diagnostic for a specific condition, such as macrophages from obese mice confusing TNF and PIC responses. For initial surveys of response specificity profiles, measuring the expression of a large number of genes in response to multiple stimuli is important, as the most informative genes and stimulus comparisons are different across various macrophage states. 103 We employed a PCA approach to characterize the response landscape and identify genes that are important across conditions, using the gene weights in principal components. Indeed, the response specificity of individual genes differed greatly between the two disease models tested here, emphasizing the importance of gene-by-gene analysis in characterizing the response specificity profile to make biologically meaningful predictions.
Characterizing the macrophage response specificity profile may prove useful in clinical scenarios. Many steady-state metrics of immune health exist, such as the complete blood count that is a mainstay of clinical lab tests. However, tests for func-tional immune responses, already used in select clinical scenarios such as tuberculosis testing or allergy testing, are also rapidly emerging. 104 Multiple such assays rely on ex vivo stimulation of extracted clinical samples to diagnose immunosuppression 105 or phagocytic ability 106 and have included transcriptomic profiling studies of stimulus responses on peripheral blood, identifying inter-individual variation among healthy donors and the genes driving those differences. 107 As seen with existing assays, the extent to which stimulus-response measurements provide more reliably prognostic information than steady-state molecular profiling may depend on the health condition being studied.
Our assay of single-cell macrophage responses may identify outlier response cells within each donor sample through the response specificity profile, which may be associated with risk for aberrant inflammatory responses or diminished innate immune defenses, or which may be reflective of an ongoing inflammatory or infectious condition that is not otherwise presented. The response specificity profile allows for new samples to be compared readily with a healthy range, a property that could be the basis for a clinically deployable measure of innate immune health. The stimulus-response data may also identify specific genes with aberrant response distributions in patient cohorts or in individual donors. Identifying such genes may provide cost-effective prognostic markers for specific cohorts or may point toward the underlying etiology of poor immune function. To realize this promise, large-scale clinical studies will be required to establish connections between response specificity profiles and risk for disease. However, as a quantifiable property of macrophage function that changes with conditioning cytokines or states of health and disease, the macrophage response specificity profile may be a viable approach for measuring the health of innate immune function in the clinic.

STAR+METHODS
Detailed methods are provided in the online version of this paper and include the following: extracted by injecting 10mL PBS +1% FBS into the peritoneal space, shaking gently, and then pulling out as much fluid as possible, typically $8ml. Macrophages were plated in DMEM +10%FBS and allowed to rest 24hrs. Floating cells after that time were washed away, and remaining adherent macrophages were stimulated with the same ligand concentrations as for iMPDMs: 100ng/mL lipopolysaccharide (LPS, Sigma Aldrich), 10ng/mL murine TNF (R&D), 50mg/mL low molecular weight polyinosine-polycytidylic acid (Poly(I:C)), 500U/ml IFNb, or media only Untreated control. Cells were washed once with cold PBS after 3hrs of stimulation and lifted into suspension for the Rhapsody scRNAseq assay. The investigators were not blinded the identity of the animal models during the experiments or outcome assessment.

Gene Panel Selection Algorithm
To select genes for single-cell targeted gene profiling, we analyzed existing bulk transcriptomic profiling of macrophage responses. Bulk RNAseq data from Cheng et al. 8 was obtained from GEO GSE68318. Counts were converted to counts per million (cpm) using the package edgeR, 110 and genes with cpm>4 in at least three samples were retained. Induced genes were gathered by calculating fold changes at each of the 14 stimulus conditions available in the dataset, at each timepoint, against the unstimulated controls. Genes were retained as induced genes if they met the threshold of log 2 (fold change)>2 and p-value < 10 -5 , which resulted in 1502 genes. Because PCA identifies a new basis that maximizes variance within the rotated data, it was ideal for identifying genes that varied in expression level across different stimuli. PCA was performed centered and unscaled on the induced genes across all time points for the 14 stimuli in the dataset. The loadings matrix obtained from the PCA was used to calculate a rank score for each gene. The rank was computed as the radial distance of each gene j from the origin, over the top 20 PCs: score j = P 20 where PCx j is the component x loadings value for gene j. The top 480 ranked genes were included in the panel, and the remaining 20 genes were manually selected to add genes such as cell type markers, macrophage polarization markers, and transcription factors (Table S1). As a visual confirmation of the approach, k-means clustering was performed on all induced genes and loadings were colored by cluster. As expected, genes with the highest absolute loadings values in each principal component tended to be spread across different clusters, and the top genes in each principal component exhibited distinct patterns across stimuli.

Selection of Stimuli for Response Specificity Assay
To identify an optimal set of the most distinct stimuli from the set of 14 used in the bulk transcriptomic data, tensor components analysis (TCA) was performed. TCA is a higher dimensional parallel of PCA -whereas PCA is performed on a genes3sample matrix, TCA is performed on a higher-order tensor by folding the gene expression matrix into a genes3stimuli3timepoint tensor. The bulk RNAseq data consisted of N genes over S stimuli with T timepoints per stimulus, which formed a third-order tensor X of dimensions N3S3T (a three-dimensional array). Tucker decomposition 115 was performed using the package rTensor. This decomposes the tensor into a core tensor G of dimensions R 1 3 R 2 3 R 3 , multiplied by a matrix U ðiÞ along each mode: The first five components, which explained 92% of the variance in the data, were retained. The stimulus loadings matrix U ð2Þ of dimensions S 3 R 2 was hierarchically clustered on the first five components, and stimuli that each occupied separate branches of the hierarchical tree were selected. Stimuli that represented whole bacterial organisms or viruses were not selected, in favor of isolated bacterial or viral components.

Rhapsody scRNAseq
To collect the adherent macrophages for scRNAseq using the Rhapsody platform, macrophage cells were washed 1x with cold PBS, then lifted into suspension by incubating at 37C for 5 minutes with Accutase, which resulted in cell viability typically >85%. Cells were centrifuged at 4C, 400g for 5 minutes, and resuspended in PBS + 2% FBS. Cells were hash-tagged with anti-CD45hashtags (BD Rhapsody # 633793) and loaded onto the cartridge following manufacturer's instructions (BD Rhapsody # 633771), with the following modifications, which helped ensure sufficient cell viability for the subsequent steps: Incubation with hashtags was performed for 30mins on ice, instead of 20mins at room temperature; only two washes were performed after hashtag incubation to minimize cell loss. Each cartridge was then loaded with a total of $36k cells across 12 hash-tagged samples ($3k cells/sample). Libraries were prepared according to manufacturer's instructions (BD Rhapsody # 633771) and sequenced 2x100 on Novaseq 6000.

Motif Enrichment
Motif enrichment of induced genes and selected genes was performed using HOMER, 111  Please cite this article in press as: Sheu et al., Quantifying stimulus-response specificity to probe the functional state of macrophages, Cell Systems (2022), https://doi.org/10.1016/j.cels.2022.12.012 motifs). To summarize the overall enrichment of particular transcription factor families within the gene sets, the average -ln(p value) of motifs in each category was calculated, and a second log transform was taken for plot visualization.

Gene Ontology
Gene ontology on selected genes versus unselected genes was performed using clusterProfiler 113 against a background of all genes. Cutoff values of p-value < 0.01, Benjamini-Hochberg q-value <0.05, and minimum gene set size >5 were used. Ontologies were grouped if they had a similarity proportion greater than 0.7. The top three Biological Processes ontology terms for each group were plotted.
scRNAseq Data Processing Raw fastq files were processed using the BD RhapsodyÔ Targeted Analysis Pipeline (version v1.0) 23 hosted on Seven Bridges Genomics. Distribution-Based Error Correction (DBEC)-adjusted UMI counts (molecules per cell) were used in the downstream analysis. Multiplets, cells with undetermined barcodes, and cells with less than 80 features were removed from the analysis. Due to the selected 500 gene panel comprised of largely inducible genes, the assumption that the total number of RNAs per cell is constant does not hold. Counts were therefore normalized using the package ISnorm, 114 rather than the more standard approach of dividing by total counts per cell. PCA was performed centered and unscaled using the R function prcomp, and UMAP and tSNE were performed on the top 20 PCs.

Assignment of Genes to Regulatory Mechanisms
We first pursued a data-driven approach for the assignment, by examining time-series cell population average data from macrophages stimulated with the six stimuli. We then simulated the 7 gene regulatory logics identified by Cheng et al., 8 collapsing short and long mRNA half-life clusters. Each of the ordinary differential equations used for the 7 regulatory logics followed the same general form: where for single transcription factor logic gates 8 : fðtÞ = ð1 À k 0 Þ Ã ðK D Ã ½TFðtÞÞ n 1 + ðK D Ã ½TFðtÞÞ n + k 0 and for two transcription factor OR logic gates 8 : We matched regulatory logics to each gene by assigning the GRS with the lowest RMSD between model and experimental data (see https://github.com/signalingsystemslab/ResponseSpecificity; Cheng et al. 8 and Wang et al. 116 ). We then manually curated the model assignments based on evidence provided by the literature, resulting in assignments given in Table S4.
Machine Learning Models and Feature Importance Machine learning classification models were trained using scRNAseq data from naïve macrophages. The data was split 70%/30% into a training group and a testing group. Using only the training data, a random forest model was trained using repeated 10-fold cross validation, with 3 repeats. The parameter mtry, which is the number of variables randomly selected as candidate features for each decision tree split, was set to Oðtotal number of features). As alternative models, weighted k-nearest neighbors and neural network model were also trained on the same dataset. Random forest, weighted kNN, and neural network models were implemented using the R package caret (classification and regression training). 108 After the model was trained, the remaining held-out data was tested, with each cell assigned a soft probability prediction for each ligand. The highest probability ligand was the final prediction. Ensemble modeling was performed by majority voting, taking the predicted stimuli from each of the individual models and choosing the most common stimulus predicted for each cell among the different machine learning models. Macrophages of other polarization conditions, M1(IFNg) and M2(IL4), were tested using the model trained on M0 naïve macrophages.
Feature importance was extracted from the trained random forest model, which is calculated by how much information is lost at each node/split of the decision trees. We calculated this value based on Gini impurity: P C i = 1 ðfreq i 3 ð1 À freq i ÞÞ, across all unique category labels C. The feature importance is then the product (decrease in node impurity) * (probability of reaching that node), scaled so the top feature has a value of 100.

Mutual Information Analyses
An information theoretic approach was used to identify either individual genes or combinations of genes providing the highest maximum mutual information between ligand identity and gene expression. Error bars on mutual information calculations were done using 10 bootstraps on 50% of the data. Estimation of maximum mutual information was implemented using the R package Please cite this article in press as: Sheu et al., Quantifying stimulus-response specificity to probe the functional state of macrophages, Cell Systems (2022), https://doi.org/10.1016/j.cels.2022.12.012 SLEMI, 66 which uses a statistical learning-based approach to more accurately and more efficiently calculate maximum mutual information for data types with higher dimensional outputs. The max MI was first calculated for each gene individually, using all stimuli (highest theoretical max MI = 2.58bits), as well as for all pairs of stimuli (highest theoretical max MI = 1.0bits). To relate the max MI to the either the mean and variance of each gene, max MI was plotted against either the average Fano factor (avg: ).
To estimate the maximum mutual information of the best combination of 1; 2; 3; .; N genes, we first started from a list of the top 20 genes that individually had the best max MI value. For each of these single dimension channels, we scanned every combination of two genes, and again ranked the best combination of two genes and retained the top 20. This process was repeated for each additional gene until the gain in max MI for each additional gene leveled off. Retaining only the top 20 sets at each dimension made the calculation more computationally feasible, while still allowing the possibility for gene combinations that are not simply additive of the previous dimension's highest max MI combination.
For gene-specific pairwise calculation of MI, max MI between pairs of stimuli was calculated for each gene at 3hrs using the singlecell RNAseq data. Max MI was plotted against gene regulatory groups between the two stimuli. Correlation coefficients were calculated using the max MI values for signaling pairs for every NFkB signaling codon, and the max MI values for the corresponding gene expression pairs for every NFkB target gene. Genes without any correlation p-value < 0.25 across the six codons were removed from the display. The Pearson's coefficient was plotted on the heatmap, and genes were hierarchically clustered using complete linkage.

Response Specificity Profile
The Response Specificity Profile is a collection of values that collectively summarize the distinguishability of macrophage responses to different stimuli. The response landscape on which the Response Specificity Profile is calculated was obtained first by principal component analysis performed centered and unscaled on all stimuli across M0, M1(IFNg), and M2(IL4) cells. This initial matrix can be written as = N 3 P, where N is the total number of measured genes and P is the stimulated single cells of different macrophage subtypes. This matrix rotation of M by PCA represents the landscape of physiological macrophage responses. Calculation of maximum mutual information using the PC scores was then performed for all possible pairs of stimuli. The advantage of using PC scores to perform mutual information analyses lies in the reduction of noise that would otherwise result in overfitting. Overfitting due to the large number of features was observed to result in saturation of the maximum mutual information values to the theoretical maximum for all pairs. Max MI was therefore calculated on the top three PCs (capturing 42% of the variance in the data), using a truncation based on the PCA scree plot, as each subsequent component after Component 5 added only $1% more to the variance explained. All error bars were generated by 50 iterations of 50% bootstrap resampling of the complete single cell dataset.
The Response Specificity Profile of new samples were evaluated by projection of the new gene expression data onto the dimensionality-reduced space. Letting the initial PCA be defined as S = W T 3 M, where S is the r3 P scores matrix, and W is the N3r loadings matrix, the scores of the new projected data is then given by Since cells from disease models are projected into the same basis, scores from any new projected data now sit in the same lowerdimensional space and can be compared to the Response Specificity of samples within initially defined response landscape. For new samples, the maximum mutual information between ligand and transcriptomic output was again calculated for all available pairs of ligands using the PC scores. The summary score delta Response Specificity Index (DRSI) was generated by the following:       A) Distribution of prediction probabilities across cells for the test data. Based on its transcriptome, each cell was given a probability of having encountered a particular ligand, and the highest probability ligand was assigned as the prediction. Prediction probabilities helped distinguish weak versus strong assignments: for instance, correct IFNβ predictions consistently had prediction probabilities close to 1, whereas correct CpG predictions had prediction probabilities of ~0.6-0.7 with the second-best choice being P3C. B) Comparison of overall accuracy for three different model types (neural net, random forest, k-nearest neighbors) show similar overall accuracies, resulting in a similar prediction accuracy from ensemble-based majority voting from all three models. C) Gene expression values for the top gene combinations providing the greatest ligand discrimination, as quantified by max MI, for M0 macrophages using 1, 2, or 3 genes. Units are log 2 (normalized counts + 1).   Table S4, including mathematically modeled gene-assignments (first row), motif analyses, assignments made in prior literature, and assignments used in the present paper (last row). GRS assignments are not definitive but remain working hypotheses. Wang et al, 2021 made multiple possible assignments per gene, and gray bars in the "Wang 2021 BMDMs" row indicate additional assignments. B) Motif enrichment for genes in each of the assigned clusters. Only motifs significantly enriched at α < 0.001 in at least one cluster are shown. C) Correlation between max MI provided by features of NFκB signaling dynamics and max MI retained in gene expression at 3hrs after stimulation. Stimulus information in Cxcl10 expression best correlates with information provided by pairwise differences in NFκB Total activity, Tnf with Oscillations, and Clec4e with Speed. D) As counterexamples, Cxcl10 does not correlate with pairwise differences in NFκB Oscillations or Duration. Colors indicate categories of genes that increase or decrease in Specificity in M1 or M2 vs. M0. G) Biological processes that lose specificity in M1 or M2 macrophages compared to M0 naïve macrophages.