Mass spectrometry database of lacrimal gland adenoid cystic carcinoma and normal lacrimal gland tissue identifies extracellular matrix remodeling in these tumors

Adenoid cystic carcinoma of the lacrimal gland (LGACC) is a slow-growing but aggressive orbital malignancy. Due to the rarity of LGACC, it is poorly understood, which makes diagnosing, treating, and monitoring disease progression difficult. The aim is to understand the molecular drivers of LGACC further to identify potential targets for treating this cancer. Mass spectrometry was performed on LGACC and normal lacrimal gland samples to examine the differentially expressed proteins to understand this cancer's proteomic characteristics. Downstream gene ontology and pathway analysis revealed the extracellular matrix is the most upregulated process in LGACC. This data serves as a resource for further understanding LGACC and identifying potential treatment targets. This dataset is publicly available.



Value of the Data
• This data provides differential protein expression between LGACC and normal lacrimal glands. It provides a deeper understanding of this cancer's molecular composition and processes.
• This data is beneficial to researchers studying LGACC and those interested in orbital malignancies.
• This data provides a dataset of potential protein targets for advancing the clinical management of LGACC.

Objective
We conducted mass spectrometry analysis of adenoid cystic carcinoma of the lacrimal gland (LGACC) and normal lacrimal gland tissue to identify differential proteomic expression signatures between cancerous and healthy tissue. Little is known about the molecular signatures of this rare cancer, and we aimed to add to the field by characterizing its proteome.

Data Description
The findings of differential protein expression between the normal lacrimal gland (n = 4) and LGACC (n = 7) are presented. Mass spectrometry was performed on primary tumor samples to identify the proteomic differences between cancerous and normal tissue. The raw values processed in the MaxQuant software are in the data repository Mendeley Data (Title: Mass Spectrometry of lacrimal gland adenoid cystic carcinoma and normal lacrimal gland) in the file Raw_MassSpec_Data. The R package limma was used to determine the differentially expressed proteins between normal and cancerous samples. The output of the limma analysis, including the raw values and the statistical analysis outputs, is in the data repository file MassSpect_limma_output. Three hundred eighty-three (383) differentially expressed proteins between LGACC and normal lacrimal glands were identified: 111 upregulated and 272 downregulated. The differentially expressed proteins (n = 383), including the raw values and the p-values, can be found in the data repository file MassSpec_Analyzed_DEGs. Fig. 1A shows the heatmap of the log2-transformed raw data of all samples. Fig. 1B depicts a volcano plot of all proteins that are downregulated (red), upregulated (blue), and not differentially expressed (green) in LGACC compared to normal lacrimal gland, with the -log2 of the raw p-value on the y-axis and the log fold change on the x-axis.
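The volcano plot layout described above can be illustrated with a minimal Python sketch. It is not the R code used to draw Fig. 1B: the function name `volcano_point` and the `significant` flag are hypothetical, and the caller is assumed to have already applied the significance criteria given in the methods.

```python
import math


def volcano_point(log_fc, p_value, significant):
    """Return (x, y, colour) for one protein on a volcano plot.

    `significant` is supplied by the caller (assumption: it reflects
    the cutoffs stated in the methods section).
    """
    x = log_fc                   # log fold change on the x-axis
    y = -math.log2(p_value)      # -log2 of the raw p-value on the y-axis
    if not significant:
        colour = "green"         # not differentially expressed
    elif log_fc > 0:
        colour = "blue"          # upregulated in LGACC
    else:
        colour = "red"           # downregulated in LGACC
    return x, y, colour


# Hypothetical values for illustration only, not taken from the dataset.
print(volcano_point(3.0, 0.25, True))
```

Passing the significance flag in separately keeps the plotting step independent of whichever cutoffs are chosen upstream.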

Sample preparation
Normal lacrimal gland and LGACC tissue samples were collected during surgery. Flash-frozen tissue samples were sent to MS Bioworks (Ann Arbor, MI) for sample preparation and mass spectrometry using label-free analysis with 4 h of LC/MS/MS per sample.

Protein identification and differential gene expression analysis
Samples were analyzed with a Waters NanoAcquity HPLC system and a ThermoFisher Fusion Lumos mass spectrometer. Data were processed with MaxQuant software v1.6.2.3. UniProt names were converted to gene names using the UniProt website (4,024 genes). Duplicate gene names were removed by filtering on the primary gene name, leaving 3,995 genes. PCA, heatmap, and volcano plots were made in R Studio. For PCA, a value of 1 was added to all raw values, and the log2 of all values was taken. Further downstream analysis was performed with the R package limma (Version 3.54.0) [1]. The limma analysis was used to calculate the p-value, adjusted p-value, and log fold change [2]. Proteins with a p-value < 0.05 and a logFC > 2 or < -2 were considered significant.
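The preprocessing and filtering steps above can be sketched in Python as follows. This is an illustration only, not the authors' R/limma workflow. It assumes that the intended fold-change cutoff is logFC > 2 or logFC < -2, that the raw (unadjusted) p-value is the one filtered on, and that the first row seen for each primary gene name is the one kept. The function names and the `"gene"` key are hypothetical.

```python
import math


def log2_plus_one(values):
    """Add 1 to each raw value and take log2, as described for the PCA input."""
    return [math.log2(v + 1) for v in values]


def dedupe_by_primary_gene(rows):
    """Keep one row per primary gene name.

    Assumption: the first occurrence is kept; the source does not say
    which duplicate was retained.
    """
    seen = set()
    kept = []
    for row in rows:
        gene = row["gene"]  # hypothetical column name
        if gene not in seen:
            seen.add(gene)
            kept.append(row)
    return kept


def is_significant(p_value, log_fc, p_cutoff=0.05, fc_cutoff=2.0):
    """True when p < 0.05 and logFC > 2 or logFC < -2 (strict inequalities)."""
    return p_value < p_cutoff and abs(log_fc) > fc_cutoff


# Hypothetical rows for illustration only.
rows = [
    {"gene": "GENE_A", "p": 0.01, "logFC": 2.5},
    {"gene": "GENE_B", "p": 0.20, "logFC": 4.0},
    {"gene": "GENE_A", "p": 0.03, "logFC": -3.0},
]
unique_rows = dedupe_by_primary_gene(rows)
hits = [r["gene"] for r in unique_rows if is_significant(r["p"], r["logFC"])]
print(hits)
```

Adding 1 before the log keeps zero intensities finite, since log2(0 + 1) = 0.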

Gene ontology and pathway analysis
The differentially expressed proteins found in the mass spectrometry data (n = 383) were submitted to gene ontology and pathway analysis using the DAVID Functional Annotation Tool [3,4]. Ontology figures were made using the -log10 of the p-value for each ontology.
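The -log10 transform used for the ontology figures can be illustrated with a short Python sketch. It does not call DAVID and is not the authors' plotting code; the function name `ontology_scores` and the example terms and p-values are hypothetical.

```python
import math


def ontology_scores(term_p_values):
    """Map each ontology term to -log10(p), most significant first.

    `term_p_values` is a dict of term name -> p-value (each p must be > 0).
    """
    scored = [(term, -math.log10(p)) for term, p in term_p_values.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical terms and p-values for illustration only.
example = {"term_a": 0.01, "term_b": 0.001, "term_c": 0.04}
for term, score in ontology_scores(example):
    print(f"{term}\t{score:.2f}")
```

On this scale a smaller p-value gives a longer bar, so the most enriched terms appear first in the figure.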

Ethics Statement
The University of Miami Institutional Review Board approved this study (IRB 20090524). The study was conducted in compliance with the Health Insurance Portability and Accountability Act of 1996.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data Availability
Mass Spectrometry of lacrimal gland adenoid cystic carcinoma and normal lacrimal gland (Mendeley Data)