Abundance of ADAM9 transcripts increases in the blood in response to tissue damage [version 1; peer review: 3 approved with reservations]

Background: Members of the ADAM (a disintegrin and metalloprotease domain) family have emerged as critical regulators of cell-cell signaling during development and homeostasis. ADAM9 is consistently overexpressed in various human cancers, and has been shown to play an important role in tumorigenesis. However, little is known about the involvement of ADAM9 during immune-mediated processes. Results: Mining of an extensive compendium of transcriptomic datasets led to the discovery of gaps in knowledge for ADAM9 that reveal its role in immunological homeostasis and pathogenesis. The abundance of ADAM9 transcripts in the blood was increased in patients with acute infection but changed very little after in vitro exposure to a wide range of pathogen-associated molecular patterns (PAMPs). Furthermore, it was found to increase significantly in subjects as a result of tissue injury or tissue remodeling, in the absence of infectious processes. Conclusions: Our findings indicate that ADAM9 may constitute a valuable biomarker for the assessment of tissue damage, especially in clinical situations where other inflammatory markers are confounded by infectious processes.


Introduction
"ADAM metallopeptidase 9 (ADAM9) is a member of the ADAM (a disintegrin and metalloprotease domain) family. Members of this family are membrane-anchored proteins structurally related to snake venom disintegrins, and have been implicated in a variety of biological processes involving cell-cell and cell-matrix interactions, including fertilization, muscle development, and neurogenesis. The protein encoded by this gene interacts with SH3 domain-containing proteins, binds mitotic arrest deficient 2 beta protein, and is also involved in TPA-induced ectodomain shedding of membrane-anchored heparin-binding EGF-like growth factor. Several alternatively spliced transcript variants have been identified for this gene." (Quoted from RefSeq 1 ).
Human ADAM9 protein cleaves and releases collagen XVII from the surface of skin keratinocytes 2 . This activity is enhanced in the presence of reactive oxygen species. Mouse ADAM9 protein cleaves and releases epidermal growth factor (EGF) and fibroblast growth factor receptor 2IIIb (FGFR2IIIb) from the surface of prostate epithelial cells 3 . Following LPS treatment, ADAM9 protein catalytic domain cleaves Angiotensin-I converting enzyme (ACE) from the surface of endothelial cells 4 . Human ADAM9 protein disintegrin-cysteine-rich domain binds integrins and thus mediates cell adhesion 5 . Human ADAM9 protein enhances adhesion and invasion of non-small cell lung tumors, which mediates tumor metastasis 6 . Mouse ADAM9 protein enhances tissue plasminogen activator (TPA)-mediated cleavage of CUB domain-containing protein 1 (CDCP1) 7 . This activity mediates lung tumor metastasis. Human ADAM9 protein mediates cell-cell contact interaction between stromal fibroblasts and melanoma cells at the tumor-stroma border, thus contributing to proteolytic activities required during invasion of melanoma cells 8 .
ADAM9 expression and regulation. ADAM9 has been reported as being expressed in various cell populations including monocytes 9 , activated macrophages 10 , epithelial cells, activated vascular smooth muscle cells, fibroblasts 8 , keratinocytes and tumor cells. The abundance of ADAM9 RNA measured by RT-PCR is decreased in vitro in human melanoma cells after culture with collagen type I or with Interleukin 1 alpha (IL1α) compared to mock stimulated conditions 11 .
ADAM9 has been implicated in disease processes including cancer, cone rod dystrophy and atherosclerosis. Homozygous mutation of the human ADAM9 gene results in severe cone rod dystrophy and cataract 12 . Mutation of the mouse ADAM9 gene results in no major abnormalities during development and adult life 13 . The abundance of ADAM9 RNA and protein measured by immunostaining and RT-PCR is increased in vivo in human prostate tumors compared to normal tissue 14 . This increase is predictive of Prostate Specific Antigen (PSA) relapse. The abundance of ADAM9 RNA measured by microarray and RT-PCR is increased in vivo in human advanced atherosclerotic plaque macrophages compared to normal tissue 15 .

ADAM9 top functions include cellular adhesion, protein cleavage and shedding (Supplementary Figure 1).
It is known that ADAM9 is upregulated in some tumor cells during pathologic processes and also contributes to the formation of multinucleate giant cells from monocytes and macrophages 10 . However, little is known about the activities of ADAM9 in regulating physiologic or pathologic processes, especially during acute infection or in response to tissue damage.

ADAM9 bibliography screening and literature profiling
Existing knowledge pertaining to ADAM9 was retrieved using the PubMed search engine of NCBI's National Library of Medicine with a query that included the official gene symbol and name as well as aliases: ADAM9 OR ADAM-9 OR "ADAM metallopeptidase domain 9" OR MCMP OR MDC9 OR CORD9. As of January 2015, this query returned 287 papers. Keywords identified while reviewing this literature were classified under six categories corresponding to cell types, diseases, functions, tissues, molecules or processes. Frequencies of these keywords were then determined for the ADAM9 bibliography, as shown in Supplementary Figure 1. This literature screen identified and prioritized existing knowledge about the ADAM9 gene. It was used to prepare the background section of this manuscript and provided the necessary perspective for the interpretation of ADAM9 profiles across other large-scale datasets.
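The literature profiling step described above can be illustrated with a minimal Python sketch. It assembles the query string from the gene symbol, name and aliases quoted in the text, then counts how many abstracts mention each keyword within a category. The keyword lists and the function names (`build_query`, `keyword_frequencies`) are hypothetical and serve only as an illustration: the manuscript names the six categories but does not list the keywords, and the sketch does not reproduce how the authors actually retrieved or scored the 287 papers.

```python
import re
from collections import Counter

# Gene symbol, official name and aliases, as quoted in the query above.
TERMS = ["ADAM9", "ADAM-9", "ADAM metallopeptidase domain 9", "MCMP", "MDC9", "CORD9"]


def build_query(terms):
    """Join terms with OR, quoting multi-word terms so they are searched as phrases."""
    return " OR ".join(f'"{t}"' if " " in t else t for t in terms)


# Hypothetical keywords for three of the six categories (illustration only).
CATEGORIES = {
    "cell types": ["monocyte", "macrophage", "keratinocyte"],
    "diseases": ["cancer", "atherosclerosis"],
    "functions": ["adhesion", "shedding"],
}


def keyword_frequencies(abstracts, categories):
    """Count, per category, the number of abstracts mentioning each keyword at least once."""
    counts = {category: Counter() for category in categories}
    for text in abstracts:
        lowered = text.lower()
        for category, keywords in categories.items():
            for keyword in keywords:
                # Prefix match at a word boundary, so "monocytes" counts for "monocyte".
                if re.search(r"\b" + re.escape(keyword), lowered):
                    counts[category][keyword] += 1
    return counts
```

Applied to a list of abstract strings, `keyword_frequencies` returns one `Counter` per category, which could then be plotted as keyword frequencies in the manner of Supplementary Figure 1.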

Interactive data browsing application
We employed a resource that is described in detail in a separate manuscript (submitted) and is publicly available: https://gxb.benaroyaresearch.org/dm3/landing.gsp. Briefly, we have assembled and curated a collection of 172 datasets that are relevant to human immunology, representing a total of 12,886 unique transcriptome profiles. These sets were selected among studies currently available in NCBI's Gene Expression Omnibus (GEO, http://www.ncbi.nlm.nih.gov/geo/). The custom software interface provides the user with a means to easily navigate and filter the compendium of available datasets (https://gxb.benaroyaresearch.org/dm3/geneBrowser/list). Datasets of interest can be quickly identified either by filtering on criteria from pre-defined lists on the left or by entering a query term in the search box at the top of the dataset navigation page.
Clicking on one of the studies listed in the dataset navigation page opens a viewer designed to provide interactive browsing and graphic representations of large-scale data in an interpretable format. This interface is designed to navigate ranked gene lists and display expression results graphically in a context-rich environment. Selecting a gene from the rank ordered list on the left of the data-viewing interface will display its expression values graphically in the screen's central panel. Directly above the graphical display, drop-down menus give users the ability:
a) To change how the gene list is ranked; this allows the user to change the method used to rank the genes, or to include only genes that are selected for specific biological interest.
b) To change sample grouping (Group Set button); in some datasets, a user can switch between groups based on cell type to groups based on disease type, for example.
c) To sort individual samples within a group based on associated categorical or continuous variables (e.g. gender or age).
d) To toggle between the histogram view and a box plot view, with expression values represented as a single point for each sample. Samples are split into the same groups whether displayed as a histogram or box plot.
e) To provide a color legend for the sample groups.
f) To select categorical information that is to be overlaid at the bottom of the graph. For example, the user can display gender or smoking status in this manner.
g) To provide a color legend for the categorical information overlaid at the bottom of the graph.
h) To download the graph as a jpeg image.
Measurements have no intrinsic utility in the absence of contextual information. It is this contextual information that makes the results of a study or experiment interpretable. It is therefore important to capture, integrate and display information that will give users the ability to interpret data and gain new insights from it. We have organized this information under different tabs directly above the graphical display. The tabs can be hidden to make more room for displaying the data plots, or revealed by clicking on the blue "show info panel" button on the top right corner of the display. Information about the gene selected from the list on the left side of the display is available under the "Gene" tab. Information about the study is available under the "Study" tab. Information available about individual samples is provided under the "Sample" tab. Rolling the mouse cursor over a histogram bar while displaying the "Sample" tab lists any clinical, demographic, or laboratory information available for the selected sample. Finally, the "Downloads" tab allows advanced users to retrieve the original dataset for analysis outside this tool. It also provides all available sample annotation data for use alongside the expression data in third-party analysis software.

Statistical analyses
All statistical analyses were performed using GraphPad Prism software version 6 (GraphPad Software, San Diego, CA). All primary data presented in this manuscript are provided as data files. Detailed legends for each data file can be found in the text file 'Description of GSE datasets'.
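The figure legends report significance from the Mann-Whitney U test. As an illustration of what that test computes, the following Python sketch derives the U statistic for two groups of expression values and a two-sided p-value from the normal approximation. It is a sketch under stated assumptions and not a reproduction of the GraphPad Prism analysis: the function names are hypothetical, the variance is not corrected for ties, and the normal approximation is poor for very small groups, where an exact test is generally preferable.

```python
import math


def mann_whitney_u(x, y):
    """Return (U_x, U_y); U_x counts pairs where a value in x exceeds a value in y (ties count 0.5)."""
    u_x = 0.0
    for a in x:
        for b in y:
            if a > b:
                u_x += 1.0
            elif a == b:
                u_x += 0.5
    return u_x, len(x) * len(y) - u_x


def two_sided_p_normal(x, y):
    """Two-sided p-value from the normal approximation with continuity correction.

    Assumption: no tie correction is applied to the variance.
    """
    n1, n2 = len(x), len(y)
    u = min(mann_whitney_u(x, y))
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    # The smaller U lies at or below the mean, so z is clamped at 0 after the correction.
    z = min((u + 0.5 - mean_u) / sd_u, 0.0)
    # For z <= 0, 2 * Phi(z) equals erfc(-z / sqrt(2)).
    return min(1.0, math.erfc(-z / math.sqrt(2.0)))
```

For example, `mann_whitney_u(controls, patients)` with two lists of per-sample ADAM9 expression values returns both U statistics, and `two_sided_p_normal(controls, patients)` returns the approximate p-value.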

Knowledge gap assessment
The seminal discovery was made while examining RNAseq transcriptional profiles. A knowledge gap was exposed when those results were interpreted in light of existing knowledge reported in the literature. Next, the initial observation was validated and further extended by examining profiles of the gene of interest, ADAM9, across a large number of independent publicly available transcriptome datasets. The completion of these tasks was aided by a custom data browsing application loaded with a curated compendium of 172 datasets relevant to human immunology sourced from the National Center for Biotechnology Information's (NCBI) Gene Expression Omnibus (GEO) (https://gxb.benaroyaresearch.org/dm3/landing.gsp, manuscript submitted). Briefly, the ADAM9 transcript was identified as a potential early stage discovery while browsing RNA-sequencing profiles of blood leukocyte populations (https://gxb.benaroyaresearch.org/dm3/geneBrowser/show/396), with the genes being ranked in alphabetical order. In this particular dataset, whole blood samples were obtained from healthy donors, patients during acute infections (meningococcal sepsis, E. coli sepsis, C. difficile colitis), multiple sclerosis patients pre- and post-interferon treatment, patients with Type 1 diabetes and patients with amyotrophic lateral sclerosis (ALS), and monocytes, neutrophils, CD4 T cells, CD8 T cells, B cells and NK cells were isolated prior to profiling via RNA sequencing 16 . The abundance of ADAM9 RNA measured by RNA-seq in human blood neutrophil and monocyte samples from subjects with sepsis was found to be markedly increased as compared to uninfected controls (Figure 1; [iFigure/GSE60424] 16 ). By comparison, levels of abundance of ADAM9 RNA in lymphocytes and Natural Killer (NK) cells were low, and no changes were observed in subjects with sepsis in these cell populations.
Despite the small number of septic subjects included in the study (N=3), the robust increase in abundance that was observed prompted attempts to validate and further extend this initial observation in independent public datasets that were part of the compendium.

The abundance of ADAM9 increases during infection
Our data browsing tool allows the assessment of expression profiles across transcriptome datasets (https://gxb.benaroyaresearch.org/dm3/geneBrowser/list). In order to validate and extend our original observation we looked up ADAM9 transcriptome profiles for all 172 available datasets (https://gxb.benaroyaresearch.org/dm3/geneBrowser/crossProject?probeID=ENSG00000168615&geneSymbol=ADAM9&geneID=8754studies). These results are shown in Table 1. Altogether these data indicate that an increase in abundance of ADAM9 can be detected in blood leukocytes, including monocyte and neutrophil fractions, during bacterial and viral infection.
The abundance of ADAM9 increases only marginally following treatment with pathogen-associated molecules
Next, we investigated the regulation of ADAM9 transcription following leukocyte exposure to pathogens and pathogen-associated molecules. These results are shown in Table 2. Taken together, these results showed that the abundance of ADAM9 was not changed or changed only marginally after stimulation with purified molecules bearing Pathogen Associated Molecular Patterns (PAMPs). These findings raised the question as to whether ADAM9 transcription might be activated instead by host-derived Damage-Associated Molecular Pattern molecules (DAMPs) 25,26 .
Figure legend (datasets profiled during infection):
❶ GSE34205: In this study gene expression profiles were obtained from the whole blood of critically ill pediatric patients 19 . Children hospitalized with acute RSV and influenza virus infection were offered study enrollment after microbiologic confirmation of the diagnosis. Blood samples were collected within 42-72 hours of hospitalization. Median age of subjects was 2.4 months (range 1.5-8.6). Uninfected subjects of similar demographics were recruited in the study and served as controls. Children with suspected or proven polymicrobial infections, with underlying chronic medical conditions (i.e. congenital heart disease, renal insufficiency), with immunodeficiency, or those who received systemic steroids or other immunomodulatory therapies were excluded. More details are available via the interactive data browsing application under the "study" tab. https://gxb.benaroyaresearch.org/dm3/miniURL/view/Ka
❷ GSE19439: Whole blood was collected from patients with different spectra of tuberculosis (TB) disease and healthy controls 21 . All patients were sampled prior to the initiation of any anti-mycobacterial therapy. Active Pulmonary TB: all patients were confirmed by isolation of Mycobacterium tuberculosis on culture of sputum or bronchoalveolar lavage fluid. Latent TB: all patients were positive by tuberculin skin test (>14mm if BCG vaccinated, >5mm if not vaccinated) and were also positive by Interferon-Gamma Release assay (IGRA). https://gxb.benaroyaresearch.org/dm3/miniURL/view/Kb
❸ GSE29536: Whole blood was collected from culture positive patients meeting criteria for sepsis enrolled in two independent cohorts (Sepsis 1 and Sepsis 2) 18 . Uninfected controls recruited in this study were of similar demographics. https://gxb.benaroyaresearch.org/dm3/miniURL/view/Jl
❹ GSE60424: Whole blood samples were obtained from healthy donors, patients during acute infections (meningococcal sepsis, E. coli sepsis, C. difficile colitis), multiple sclerosis patients pre- and post-interferon treatment, patients with Type 1 diabetes and patients with ALS, and monocytes, neutrophils, CD4 T cells, CD8 T cells, B cells and NK cells were isolated prior to profiling via RNA sequencing 17 . https://gxb.benaroyaresearch.org/dm3/miniURL/view/Kc
Statistical significance was determined using the Mann-Whitney U test. ns, not significant, * p < 0.05, *** p < 0.001 and **** p < 0.0001. The horizontal lines indicate mean ± standard errors (SE).

The abundance of ADAM9 increases during tissue remodeling
Our dataset screen revealed in addition that changes in abundance of ADAM9 could be associated with tissue remodeling. The abundance of ADAM9 RNA measured by microarrays in human skin biopsy samples from subjects with lepromatous leprosy was significantly increased as compared to control samples from subjects with tuberculoid leprosy [iFigure/GSE17763] 27 . The abundance of ADAM9 RNA measured by microarrays in human blood samples was significantly increased as compared to controls in pregnant subjects [iFigure/GSE17449] 28 . The abundance of ADAM9 RNA measured by microarrays in human blood monocyte samples from subjects with filariasis was significantly increased as compared to uninfected controls [iFigure/GSE2135] 29 . These results are shown in Table 3, Figure 4 and Supplementary Figure 4. A common thread between these different states is that they involve extensive tissue remodeling, whether it involves the skin (leprosy), placental tissue (pregnancy) or lymphatic tissues (filariasis).
The abundance of ADAM9 increases following tissue injury and sterile inflammation
Changes in ADAM9 transcript abundance were observed in additional datasets. The abundance of ADAM9 RNA measured by microarrays in human blood samples was significantly increased as compared to healthy controls in subjects with sarcoidosis [iFigure/GSE34608] 22 , in subjects after severe blunt trauma [iFigure/GSE11375] 30 , in subjects with chronic kidney disease [iFigure/GSE15072] 31 , and in subjects who have undergone elective thoracic or abdominal surgery [iFigure/GSE28750] 17 . The abundance of ADAM9 RNA measured by microarrays in human blood samples from subjects treated with localized external beam radiation therapy for 42 days was significantly increased as compared to baseline samples [iFigure/GSE30174] 32 . The abundance of ADAM9 RNA measured by microarrays in human blood monocyte samples from obese subjects was significantly increased as compared to lean controls [iFigure/GSE32575] 33 . Finally, the abundance of ADAM9 RNA measured by microarrays in human blood monocyte samples from subjects after severe trauma was significantly increased as compared to healthy controls [iFigure/GSE5580] 34 . These results showed that an increase in ADAM9 transcript abundance was associated with tissue injury and sterile inflammation (Table 4, Figure 5 and Supplementary Figure 5). They are thus consistent with the observations reported above associating an increase in ADAM9 RNA with responses to Damage-Associated Molecular Pattern molecules (DAMPs) and tissue remodeling.

Conclusions
This study is the first report describing the modulation of levels of ADAM9 transcripts in human whole blood and showing restriction of its expression to neutrophils and monocytes. In addition, we observed that the abundance of ADAM9 was increased during acute infection but did not change after stimulation with pathogen-derived molecules. It was not changed in vivo following administration of synthetic double stranded RNA (polyIC), a treatment that mimics viral exposure. Notably, it was not increased either in patients during the early acute phase of HIV infection, when an intense immunological response is detected in the absence of clinical symptoms [iFigure/GSE29536] 18 . However, ADAM9 transcript abundance was increased in the blood of patients as a result of tissue damage, sterile inflammation and tissue remodeling. Therefore, in addition to its widely reported role in the pathogenesis of cancer, the constellation of findings that we are reporting points towards the involvement of ADAM9 in immune-mediated processes and suggests that ADAM9 may constitute a valuable marker for assessing tissue damage, whether it occurs as a result of acute infection, traumatic injury or medical procedures such as surgery or radiation therapy. Indeed, these findings may be of especially high significance in the context of acute infections: unlike "generic" markers of inflammation, which could also be used to assess tissue injury in other settings, ADAM9 would not be confounded by the host responses to the pathogen and may therefore accurately reflect damage to the patient's tissues or organs (Figure 6). Thus ADAM9 blood transcript levels, or possibly levels of circulating proteins, could potentially be employed for triage of patients presenting with symptoms of infection in the emergency room or for monitoring of patients in intensive care units.
❶ GSE17763: Skin biopsies were obtained from patients with leprosy classified as tuberculoid leprosy (controlled disease, few skin lesions) or lepromatous leprosy (uncontrolled disease, widespread lesions) 27 . All tuberculoid and lepromatous specimens were taken at the time of diagnosis before treatment, and reversal reaction biopsies (labeled as "reaction") were taken upon follow-up from patients originally diagnosed with borderline lepromatous leprosy. https://gxb.benaroyaresearch.org/dm3/miniURL/view/Ke
❷ GSE17449: Peripheral Blood Mononuclear Cells were isolated from the blood of 12 women (7 MS patients and 5 healthy controls) followed during their pregnancy 28 . Samples were obtained before pregnancy and at 9 months.
The flow chart indicates how data were generated. Diamonds indicate supporting data and, in the interactive version, are hyperlinked to context-rich interactive plots. Links to these plots are also provided below:
❶ GSE34608: Blood was collected from patients with active tuberculosis and sarcoidosis as well as uninfected controls 22 .

Data availability
All primary data presented in this manuscript can be accessed along with contextual information via the data browsing application described above and are also available in NCBI's GEO public repository. GEO accession numbers (starting with GSE) are provided where appropriate throughout this manuscript along with the primary reference associated with the GEO record.

Author contributions
DR and DC designed the analytic approach, mined the data, prepared figures and drafted the manuscript. CK, BK and GL participated in the mining of the dataset compendium. All authors read and approved the manuscript.

Competing interests
No competing interests were disclosed.

Grant information
The author(s) declared that no grants were involved in supporting this work.
Figure 3. These plots show data supporting the notion that the abundance of ADAM9 increases only marginally following treatment with pathogen-associated molecules as presented in Figure 3 of the manuscript, and are accessible online in an interactive format:

The design, methods and analysis of the results from the study:
The methods and design have been explained, and the analyses are appropriate for the topic being studied. The results show impressive increases in ADAM9 gene expression in blood leukocytes in some disease states. However, there are issues regarding the design/content of the study:
1. It is not clear from reading only the manuscript whether the controls and disease groups are matched with respect to age, sex, and/or race/ethnicity, and/or whether the disease groups studied have co-morbidities that might have contributed to the differences in ADAM9 gene expression observed between the groups. It would be necessary to read many of the cited papers in order to obtain this information.
2. The microarray results do not appear to have been validated by performing real-time qPCR studies (for example) on any of the samples.
3. In the in vitro studies, details on the concentrations and incubation times for the agonists have not been provided in the methods, text or legends. It is possible that the concentrations and time points studied were not optimal for detecting increases in ADAM9 gene expression.

Data presentation:
All of the results have been presented in the manuscript. However, in general more details about the experimental conditions in the figure legends would have been helpful to the reader.

Discussion and conclusions:
The discussion section could be expanded to include a discussion of the limitations of the study. The discussion could have included a section on how the changes in ADAM9 gene expression detected in blood leukocytes might influence the pathogenesis or progression of the diseases that were studied, based upon the known activities of this proteinase.

Competing Interests:
No competing interests were disclosed.
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.

Author Response 11 Oct 2016
Damien Chaussabel, Sidra Medical and Research Center, Doha, Qatar
We thank the reviewer for the valuable feedback and suggestions to improve our manuscript.

The title and abstract of the manuscript:
Both are appropriate.

The design, methods and analysis of the results from the study:
The methods and design have been explained, and the analyses are appropriate for the topic being studied. The results show impressive increases in ADAM9 gene expression in blood leukocytes in some disease states. However, there are issues regarding the design/content of the study: 1. It is not clear from reading only the manuscript whether the controls and disease groups are matched with respect to age, sex, and/or race/ethnicity, and/or whether the disease groups studied have co-morbidities that might have contributed to the differences in ADAM9 gene expression observed between the groups. It would be necessary to read many of the cited papers in order to obtain this information.

Mechanisms exist that should help ensure that the study design and the selection of control subjects are appropriate, at least in most studies:
1. One is IRB review, which to some extent will evaluate study design elements such as inclusion and exclusion criteria for case and control groups and will help ensure that the results of the study will be meaningful and justify the risk to the study population.
2. The second mechanism is peer review. Having conducted such studies ourselves and reviewed the submissions of others, we find that concerns often come up regarding factors that might potentially confound analyses and that would need to be addressed before publication.
In addition, the process of loading dataset, sample and study information in GXB, as well as QC checks, provides an additional opportunity to identify "faulty" designs.

3. In the in vitro studies, details on the concentrations and incubation times for the agonists have not been provided in the methods, text or legends. It is possible that the concentrations and time points studied were not optimal for detecting increases in ADAM9 gene expression.

Data presentation:
All of the results have been presented in the manuscript. However, in general more details about the experimental conditions in the figure legends would have been helpful to the reader.

Authors:
As requested by the reviewer we have added details in each dataset as shown in Figure  legends through

Discussion and conclusions:
The discussion section could be expanded to include a discussion of the limitations of the study. The discussion could have included a section on how the changes in ADAM9 gene expression detected in blood leukocytes might influence the pathogenesis or progression of the diseases that were studied, based upon the known activities of this proteinase.

Authors:
A new section has been added in the conclusion specifically to discuss limitations of the study (see additions mentioned above). We are as yet unsure of the functional significance of the elevation in levels of ADAM9. On one hand, it may be beneficial in mediating tissue repair; on the other hand, the fact that ADAM9 protein or transcript levels are found elevated in blood may be an indication of extensive tissue damage and be associated with poor outcome. Indeed, we now report, for instance, in the context of GSE11375 (profiling of responses in the blood of trauma patients) that the abundance of ADAM9 in patients who did not survive was significantly higher than in those who survived. In another dataset, GSE34205/GSE38900 (viral infections), we now show that the abundance of ADAM9 is correlated with the degree of severity in pediatric viral infection (RSV, influenza and HRV infection). Moreover, levels of ADAM9 transcript in patients who were ventilated were significantly higher than in those who were not ventilated. We have added these statements in the discussion.

2. Methods: The description is very nice but could be adapted to journal style and shortened. The possibilities offered by the software could be summarized in a table.

7. The diagrams within the figures are very redundant especially due to the detailed description within the text. This space should be used to present more original data sets.

8. The tables should be summarized within one table. Further, the table should include all datasets analysed and mentioned in the text.

9. Figure 3: It would be helpful to mark the values for the different individuals, maybe by different colours to avoid the impression of a general outlier. Otherwise, changes after PAMP treatment could be possible.

10. Conclusion: To address the point of infection the authors include a stimulation of blood samples. However, this is not sufficient to draw the conclusion of a biomarker for tissue damage (also as a result of infection). Experiments with tissue cells, including scratch assays, stimulations with cytokines, and conditioned media from blood samples would provide further information and address the tissue damage effect in comparison to the infection effect.
Competing Interests: No competing interests were disclosed.
We confirm that we have read this submission and believe that we have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however we have significant reservations, as outlined above.

Author Response 11 Oct 2016
Damien Chaussabel, Sidra Medical and Research Center, Doha, Qatar
We thank the reviewers for their valuable feedback and suggestions to improve our manuscript.
Rinchai and co-workers nicely present a re-analysis of existing genomic datasets, demonstrating a useful tool for quick establishment of functional hypotheses. By this, they suggest a novel function of ADAM9 as biomarker for tissue damage. The article is well written, but several concerns should be addressed before indexing.
1. Introduction: The authors give a nice review of the current literature. However, a link to their own study is missing. The introduction should include the motivation ("knowledge gap assessment"). Otherwise, readers could expect a detailed physiologic analysis of ADAM9 in tissue damage.
Authors: As suggested, we added a paragraph at the beginning of the introduction section to present the rationale behind the data mining approach that was employed. We experienced issues with one of our servers at some time. We checked the links and references provided in the introduction and they seem to be working fine now.
2. Methods: The description is very nice but could be adapted to journal style and shortened. The possibilities offered by the software could be summarized in a table.

Authors:
We also took care of this. Since the description of the GXB tool has now been published and the code made openly available on GitHub, we now point readers to these resources and have shortened the paragraph describing the features of the software accordingly (Speake C, et al., J Transl Med 2015; Rinchai D, et al., F1000R 2016). A link to a tutorial video has also been added to the methods (https://www.youtube.com/playlist?list=PLtx3tvfIzJ9XkRKUz6ISEJpAhqKyuiCiD).
3. Figure 1: The colour scheme in the figure and the legend are sorted differently. In general, the figure is overloaded and the colour scheme not helpful. Differences were only observed for monocytes and neutrophils. These results should be included in figure 1, whereas the other results should be included as supplementary figure.
Authors: Thank you for pointing this out. In the original version of the Figure we used the graphic exported directly from GXB. However, we agree that it is difficult to read, especially without the interactive features that allow overlay of sample information, sorting and mouse-overs. Another reviewer also suggested retaining only the neutrophil and monocyte data in the plot for Figure 1, and we have made these changes accordingly.
Figure 2 to 5: I don't see any additional information by this second plot type.
Supplementary Figures 2-5 represent the data exactly how they can be visualized interactively in the GXB. We felt this might be helpful given the fact that we provide links throughout the manuscript that lead to interactive versions of these plots. We are now providing this rationale in the legend of the supplementary figures.
Table. Readers can access the data for each study by clicking the associated hyperlinks. All the studies mentioned in the text are also represented on the graphical abstract.

Authors:
7. The diagrams within the figures are very redundant especially due to the detailed description within the text. This space should be used to present more original data sets.

Authors:
We did not properly communicate the purpose of these diagrams, which constitute graphical legends and allow presentation of the data in a semi-structured format that is both human and machine readable. We are now providing a rationale (see below) and have repositioned them at the bottom of the Figure, which will hopefully work better.
Rationale: "Diagrams have been incorporated within each Figure. These have a dual purpose: first, they provide readers with a graphical summary of the findings, and second, they constitute an attempt at structuring information for future computational applications. Indeed, an important limitation of communicating biomedical knowledge in the form of research articles is that it consists of unstructured information (free text)."

9. Figure 3: It would be helpful to mark the values for the different individuals, maybe by different colours to avoid the impression of a general outlier. Otherwise, changes after PAMP treatment could be possible.

Authors:
As suggested by the reviewers, we labeled the values of different individuals using different colors in the PAMPs treatment dataset (GSE30101). We found that ADAM9 levels did not show a significant outlier response, with the exception of the green subject, who shows a low response to HKSA in comparison to the other subjects. This could be explained by donor-specific variation in the subject's ability to respond. Overall, the magnitude of response to such stimuli remains low, especially when compared to CXCL10, which served as a positive control, and did not reach significance. Donor information was not available for GSE32862.

10. Conclusion: To address the point of infection the authors include a stimulation of blood samples. However, this is not sufficient to draw the conclusion of a biomarker for tissue damage (also as a result of infection). Experiments with tissue cells, including scratch assays, stimulations with cytokines, and conditioned media from blood samples would provide further information and address the tissue damage effect in comparison to the infection effect.
1. Figure 1: I find it very difficult to color match the Cell type on the x-axis of Figure 1 especially when it appears legend colors are sorted differently. A plot with seven smaller panels (one for each cell type) or even just 2 panels (neutrophils and monocytes) might be clearer. Can you add GSE60424 to title of Figure 1? Table 1 also talks about three datasets but includes SOJIA vs Control and HIV vs Control from GSE29536.
b) I find the process diagrams (top half of figures) distracting and redundant with text and legend. This space could be used to incorporate the other studies. I suggest incorporating the cell type and measurement type after the study names on plot (e.g. GSE34205 \n microarray on whole blood; GSE29536 \n RNA-seq on neutrophils). Legend is well described.
c) The column for "Avg A - Avg B" is meaningless especially when comparing different platforms. The fold change (Avg A / Avg B) is more meaningful and would be worth stating to two decimal points.
d) If possible, combine Tables 1 -4 into one page, possibly a large table with subheadings for during infection, after treatment with PAMPs, during tissue remodelling etc ... T

3.
The authors might also find such a plot on their webtool useful in the long run but this is beyond the scope of the current paper.

Authors: We would like to thank the reviewer for the kind comments and suggestions to improve our manuscript.
Rinchai et al. suggest a novel role for ADAM9 by mining existing datasets. This clever re-use of existing datasets is a demonstration of how scientists can test new hypotheses quickly, inexpensively and with more robustness. They also provide a web tool based on 172 curated datasets (https://gxb.benaroyaresearch.org/dm3/geneBrowser), which makes it a practical resource.
All sections of the article are extremely well written and I strongly recommend the article be indexed subject to the following comments.

Introduction:
The introduction starts with the Refseq definition of ADAM9 and a thorough review of existing literature on gene function of ADAM9. It left me wondering what motivated them to ADAM9 until the first section of Results (Knowledge gap assessment). It would be useful to the reader if a brief sentence or two on the motivation to study this gene was at the beginning of the Introduction section.
Authors: Thank you for raising this point. We agree that it would be better to start with such a description, and have now added a paragraph explaining the "Collective data to knowledge" approach as the first paragraph of the introduction section.
2. Figure 1: I find it very difficult to color match the Cell type on the x-axis of Figure 1 especially when it appears legend colors are sorted differently. A plot with seven smaller panels (one for each cell type) or even just 2 panels (neutrophils and monocytes) might be clearer. Can you add GSE60424 to title of Figure 1?

Authors:
We agree with this suggestion. We initially wanted to use the plot as it would appear to the reader when accessing the GXB via the link provided, but it is rather difficult to interpret without the interactive features built into the software tool (overlay of sample information, sample sorting, pop-ups, etc.). Also, per your and another reviewer's suggestion, we changed the plot of Figure 1 to show only neutrophil and monocyte data.

b) I find the process diagrams (top half of figures) distracting and redundant with text and legend. This space could be used to incorporate the other studies. I suggest incorporating the cell type and measurement type after the study names on plot (e.g. GSE34205 \n microarray on whole blood; GSE29536 \n RNA-seq on neutrophils). Legend is well described.

General comment on
Authors: This point has been raised by another reviewer as well and is obviously important. We did not properly communicate the purpose of these diagrams, which are meant as "graphical figure legends". We aimed at structuring the information communicated and at helping readers navigate the many findings that are reported, while providing links to interactive figures and making details regarding study design more readily accessible (which a third reviewer deemed particularly important). In addition to providing the rationale for including those graphical figure legends, we also moved them to the bottom of each figure, which is really the most logical spot for them.
Rationale: Diagrams have been incorporated within each figure. These have a dual purpose: first, they provide readers with a graphical summary of the findings, and second, they constitute an attempt at structuring information for future computational applications. Indeed, an important limitation of communicating biomedical knowledge in the form of research articles is that it consists of unstructured information (free text). This type of information is notoriously difficult to extract by computational means [Chaussabel D. Am J Pharmacogenomics 2004;4: 383-93]. Standardized graphical summaries such as the ones provided in this manuscript constitute structured information that is both human readable and computationally tractable. The need for such solutions will become more pressing as the biomedical literature continues to grow exponentially, to such scales that it can only be very narrowly apprehended by research investigators.
c) The column for "Avg A -Avg B" is meaningless especially when comparing different platforms. The fold change (Avg A / Avg B) is more meaningful and would be worth stating to two decimal points.

Authors:
We agree that this value cannot be compared across platforms, which we did not intend to do: rather than a meta-analysis, our approach consists of a "meta-interpretation" across publicly available datasets. However, it is a good indication of the robustness of the changes that are measured. We have used this criterion for many years to weed out genes that show a high fold change but at expression levels close to background, which we have found to be poorly reproducible. For example, for a fold change of 3, if A = 30 and B = 10 the difference is 20, which might be about twice the background intensity of the chip; whereas if A = 300 and B = 100, A/B is still 3 but A - B is 200, or twenty times the background intensity. Having this information can therefore help decide whether the observed changes are likely to be robust.
Tables 1 -4 into one page, possibly a large table with subheadings
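
The fold change versus difference argument in the response above can be illustrated with a minimal Python sketch. This is only an illustration of the arithmetic and not the computation performed in GXB: the function name `fold_change_and_difference` and the background intensity of 10 are assumptions, the latter inferred from the statement that a difference of 20 is about twice the background.

```python
def fold_change_and_difference(avg_a, avg_b, background):
    """Return (fold change, difference, difference in background units).

    The fold change is rounded to two decimal places, as the reviewer suggests.
    """
    fold_change = round(avg_a / avg_b, 2)
    difference = avg_a - avg_b
    return fold_change, difference, difference / background


# Hypothetical background intensity, implied by the worked numbers above.
BACKGROUND = 10.0

# Same fold change, very different distance from background.
low = fold_change_and_difference(30.0, 10.0, BACKGROUND)
high = fold_change_and_difference(300.0, 100.0, BACKGROUND)

print(low)   # (3.0, 20.0, 2.0)
print(high)  # (3.0, 200.0, 20.0)
```

Both cases report a fold change of 3, but only the second sits well above the assumed background, which is why the difference column can complement the fold change when judging robustness.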