Whole proteome copy number dataset in primary mouse cortical neurons

The functional diversity of neurons is specified through their proteome, which gives rise to elaborate and tightly regulated protein interaction networks and signalling that regulate neuronal processes. Dysregulation of these dynamic networks in development or in adulthood leads to neurodevelopmental or neurological disorders, respectively. Over the past few decades, mass spectrometry has become a powerful tool for resolving and quantifying any proteome, including that of complex tissues such as the brain, with technological advances leading to higher resolution and throughput than traditional biochemical techniques. In this article, we provide a proteomic reference dataset generated to identify proteins and quantify their level of expression in primary mouse cortical neurons. It represents a summary analysis of previously published data (Antico et al., 2021). Mouse cortical neurons were isolated from E16.5 C57BL/6J mice and cultured for 21 days in vitro (DIV). We employed the mitochondrial uncouplers AntimycinA/Oligomycin (AO) to induce mitochondrial depolarisation, a well-established paradigm for assessing mitophagic signalling. Total lysates from mouse primary cortical neurons were subjected to label-free quantitative proteomic analysis using both data-dependent acquisition (DDA) and data-independent acquisition (DIA) modes. DDA proteomic analysis identified a total of 9367 proteins in mouse cortical neurons, and the absolute abundance of proteins was calculated as copy numbers per cell. The DDA dataset was also processed to generate a reference spectral library used to match and quantify MS spectra acquired in DIA mode. Quantitative DIA analysis identified more than 6000 protein groups, and statistical comparison of the two analysed groups (untreated and AO-treated) revealed that the neuronal proteome was largely unchanged after 5 hours of mitochondrial depolarisation.
To our knowledge, these files represent the most comprehensive DDA and DIA reference datasets of fully functional, mature mouse primary cortical neurons and serve as a valuable resource for further investigating the role of specific proteins involved in neurobiology and in neurological disorders such as Alzheimer's disease (AD), Parkinson's disease (PD) and Autism Spectrum Disorders (ASD).



Value of the Data
• These data present a valuable and in-depth resource of the mouse primary cortical neuronal proteome and allow for the identification of proteins expressed in terminally differentiated neurons.
• The data provide absolute copy number quantification of core neuronal proteins as well as of major signalling pathways associated with neurodegenerative disorders.
• These data represent a comprehensive catalogue of mouse neuronal proteins that researchers can interrogate to identify proteins of interest in a neuronal context and their relative expression levels, and to further examine the network complexity of molecular pathways involved in neuronal signalling and/or pivotal mechanisms linked to neuronal stress.
• These proteomic data can be used to (1) check expression of drug discovery targets and encourage pilot hypothesis-driven studies in primary neurons; (2) identify protein biomarkers; and (3) study neuronal pathways in physiological conditions and pathological disorders.
• These data also determine protein concentrations at endogenous levels, information that will be critical for understanding the stoichiometric relations between different proteins involved in the same pathway or mechanism, and for studying physiological processes in vitro.

Objective
The fundamental objective was to produce a proteome-derived resource dataset that quantified expression of proteins derived from terminally differentiated mouse cortical neuronal cultures.
This dataset will enable quantitative analysis of protein expression in mouse cortical neurons and allow comparative analysis with different cell types derived from other cultured primary cells.
This deep proteomic analysis of mouse cortical neurons was undertaken as part of a project published in [1] on the Parkinson's-linked proteins PINK1 and Parkin. The dataset provides a general resource for Parkinson's research, as it defines the expression of proteins encoded by other Parkinson's-linked genes, including SNCA, PARK7/DJ1, VPS35, VPS13C, ATP13A2 and LRRK2, that will be of interest to the Parkinson's field more widely. It will also be a resource for related brain disorders including Alzheimer's disease and Autism Spectrum Disorder.

Data Description
The data presented in this article profile the proteins expressed in cultured, terminally differentiated mouse cortical neurons. Mature primary neuronal cultures model the physiology of cells in vivo and therefore represent a tractable system to study molecular mechanisms related to the physiological and pathophysiological functions of neuronal networks. Neuronal progenitors were isolated from E16.5 mouse cortices (C57BL/6J) and cultured for 21 days in vitro. To investigate whether mitochondrial stress induces changes in the neuronal proteome, mouse cortical neurons were stimulated at 21 DIV with a combination of Antimycin/Oligomycin (10 μM / 1 μM) for 5 hours to induce mitochondrial depolarisation. Three biological replicates (in technical duplicate) per condition were trypsin digested using S-Trap assisted sample preparation, followed by LC-MS/MS analysis. The workflow used to isolate and culture mouse cortical neurons, and for sample preparation, data acquisition and analysis, is outlined in Fig. 1.
Fig. 1. Mouse cortical neurons were generated from E16.5 mouse cortices (C57BL/6J) and cultured for 21 DIV until terminal maturation. Neurons were treated with AO or DMSO for 5 hours to induce mitochondrial depolarisation (three biological replicates, in technical duplicate, for each condition). Protein lysates were prepared in SDS buffer and neuronal peptides were generated by trypsin digestion. Total cortical neuron peptides were fractionated into 45 fractions and subjected to LC-MS/MS analysis on an Orbitrap Exploris. The neuronal proteome was analysed with two different acquisition modes: data-dependent acquisition (DDA) and data-independent acquisition (DIA). A spectral library was generated from the DDA data and used for quantification of DIA proteins.
Shotgun label-free quantification (LFQ) proteomics was performed, and 45 high-pH peptide fractions were analysed on an Orbitrap Exploris 480 mass spectrometer in DDA mode. MS raw data were searched using the MS-Fragger software suite (version 3.2) [2], which identified a total of 9367 proteins in mouse cortical neurons. The data were classified into several cellular subsets of proteins relevant to biological pathways in neurons and used to determine the specific level of protein abundance. Further, the data were processed using the Perseus software suite [3] to calculate protein copy numbers with the histone-based proteomic ruler method [4].
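The idea behind the histone-based proteomic ruler can be sketched in a few lines. This is a minimal illustration, not the Perseus implementation: the function names, the haploid genome size, the mean base-pair mass and the toy intensities are all assumptions made for the example. The principle is that the summed histone signal scales with the DNA mass of a cell, so each protein's share of the signal relative to histones can be converted to a mass per cell and then to a copy number.

```python
AVOGADRO = 6.02214076e23  # molecules per mole

def dna_mass_per_cell_g(haploid_genome_bp, ploidy=2, mean_bp_mass_da=650.0):
    """Approximate DNA mass per cell in grams.

    haploid_genome_bp and mean_bp_mass_da are assumptions supplied by the
    caller; 650 Da per base pair is a commonly used round figure.
    """
    return ploidy * haploid_genome_bp * mean_bp_mass_da / AVOGADRO

def copies_per_cell(intensity, mw_da, histone_intensity_sum, dna_mass_g):
    """Proteomic ruler estimate of protein copies per cell.

    The protein mass per cell is taken as the DNA mass scaled by the ratio
    of the protein MS intensity to the summed histone MS intensity.
    """
    protein_mass_g = intensity / histone_intensity_sum * dna_mass_g
    return protein_mass_g * AVOGADRO / mw_da

# Toy numbers for illustration only (not values from the dataset).
dna_mass = dna_mass_per_cell_g(haploid_genome_bp=2.7e9)
toy_copies = copies_per_cell(
    intensity=1.0e9,            # MS intensity of the protein of interest
    mw_da=50_000.0,             # molecular weight in daltons
    histone_intensity_sum=1.0e9,  # summed intensity of all histones
    dna_mass_g=dna_mass,
)
print(f"DNA mass per cell: {dna_mass * 1e12:.2f} pg")
print(f"Estimated copies per cell: {toy_copies:.3e}")
```

In practice the histone intensity sum would be taken over all identified histone protein groups in the same run, and the genome size and ploidy should match the sample.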
Proteins were grouped into 10 classes based on function, subcellular localisation, and gene risk-associated diseases (Table 1). This classification enabled us to identify 316 kinases, 209   Table 2 . The dataset also contained proteins involved in glycosylation (183 proteins) and metabolic pathways (1737 proteins). The dataset is also amenable to subcellular localisation expression profiling: for example, we identified 403 lysosomal proteins and 931 mitochondrial proteins (Table 1), and this analysis can be extended to other organelles, such as the peroxisome, Golgi apparatus and endoplasmic reticulum. This multifaceted analysis was also expanded to classify proteins linked to genetic disease risk for neurological disorders. In mouse cortical neurons, distinct sets of proteins encoded by disease risk genes included 47 AD-related genes, 22 PD-related genes and 85 ASD-related genes (Table 1). The top 15 proteins encoded by risk genes for each disease are reported with their relative copy number intensity and concentration (Table 3).
This dataset also enables proteomic snapshots of neuronal processes. As neurons have a highly compartmentalized signalling network and synaptic transmission is one of the key neuronal processes, a systematic examination was carried out for synaptic proteins. The most representative synaptic proteins identified in this dataset were clustered according to the different stages of neurotransmission and are shown in Fig. 2. The dataset also enabled analysis of the stoichiometry and relative proportions of synaptic proteins connected in the same pathway or phase of a neuronal process. This analysis can be extended to other cellular processes, such as neuronal metabolism and neuronal protein turnover.
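As a rough illustration of how copy numbers support stoichiometry analysis, the sketch below expresses the copy numbers of proteins in one pathway relative to a chosen reference protein and converts a copy number to a molar concentration. The protein names, copy numbers and the cell volume are hypothetical placeholders, not values taken from the dataset, and the cell volume in particular is an assumption the user must supply.

```python
AVOGADRO = 6.02214076e23  # molecules per mole

def molar_ratios(copy_numbers, reference):
    """Return each protein's copy number divided by that of the reference."""
    ref = copy_numbers[reference]
    return {name: copies / ref for name, copies in copy_numbers.items()}

def concentration_nM(copies, cell_volume_litres):
    """Convert copies per cell to a nanomolar concentration.

    cell_volume_litres is an assumed value; it is not given in the text.
    """
    return copies / (AVOGADRO * cell_volume_litres) * 1e9

# Hypothetical copy numbers for three proteins in the same pathway.
pathway = {"protein_A": 200_000.0, "protein_B": 50_000.0, "protein_C": 25_000.0}
ratios = molar_ratios(pathway, reference="protein_C")
print(ratios)

# Hypothetical cell volume of 1 picolitre.
conc = concentration_nM(pathway["protein_A"], cell_volume_litres=1e-12)
print(f"protein_A: {conc:.1f} nM")
```

Ratios computed this way are only as reliable as the underlying copy number estimates, so they are best read as approximate proportions rather than exact stoichiometries.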
A DDA spectral library was also generated from 45 high-pH fractions of mouse cortical neuronal extracts using the MS-Fragger search algorithm. In parallel, samples were acquired and analysed in data-independent acquisition (DIA) mode (the DIA acquisition scheme is reported in the Supplementary table). DIA spectra were identified and quantified using the spectral library generated from the DDA data.

Mouse cortical neuronal preparation from C57BL/6J mice
C57BL/6J mice (RRID:IMSR_JAX:000664) were obtained from Charles River Laboratories (Kent, UK) and housed in a pathogen-free facility with temperature-controlled rooms at 21 °C and 45 to 65% relative humidity, under 12-hour light/12-hour dark cycles, with food and water supplied ad libitum.
A detailed protocol describing the preparation of primary cortical mouse neurons has been published (http://dx.doi.org/10.17504/protocols.io.bswanfae). In brief, cortices were isolated from E16.5 embryos and tissue digestion was performed by incubation with trypsin-EDTA at 37 °C for 30 min. Cortical neurons were plated at a density of 5.0 × 10⁵ cells per well on poly-L-lysine coated six-well plates and cultured in neuronal medium: Neurobasal medium, 1X B27 supplement, 1X GlutaMAX, and 1X penicillin-streptomycin. Neurons were cultured in a water-saturated incubator at 37 °C and 5% CO₂ for 21 days, and one third of the total medium volume was replaced every 5 days. The mass spectrometry workflow and methods used in this study are described in detail in dx.doi.org/10.17504/protocols.io.bs3tngnn, dx.doi.org/10.17504/protocols.io.busynwfw and [16].

Sample preparation for copy number proteomics
An S-Trap protocol was used for either 50 μg of protein from each individual AO- or DMSO-treated sample or 300 μg of pooled cortical neurons. Briefly, samples were reduced with 10 mM TCEP and alkylated with 40 mM iodoacetamide. Samples were loaded onto S-Trap micro and mini columns and purified by washing four times with S-Trap wash buffer [100 mM TEABC (pH 7.2) in 90% methanol]. Peptide digestion was done using Lys-C ± trypsin at a 1:20 ratio and incubated at 47 °C for 1.2 hours following overnight incubation at room temperature. Peptides were sequentially eluted using 50 mM TEABC buffer, 0.15% formic acid (v/v), and 80% acetonitrile (ACN) in 0.15% formic acid (v/v), and then vacuum-dried. For the pooled cortical neurons, the peptides generated by trypsin digestion were subjected to high-pH RPLC fractionation to generate 45 fractions, which were used for data-dependent acquisition (DDA) analysis. AO- and DMSO-treated samples were dissolved in LC buffer [3% ACN in 0.1% formic acid (v/v)], and 2 μg of peptides was injected for DIA analysis.

Copy number total proteomic analysis using DDA and DIA
The copy number analysis and DDA method are described in [4]. An Orbitrap Exploris 480 mass spectrometer coupled in line with a Dionex 3000 RSLC nano liquid chromatography (LC) system was used to analyse the 45 high-pH fractions. Samples were injected onto a trap column (Acclaim PepMap 2 cm, 3 μm particle), separated on a 50-cm analytical column at 300 nl/min (ES803; 50 cm, C18 2 μm particle) and directly electrosprayed into the mass spectrometer using an EASY nanoLC source.
Data were acquired in DDA mode, with full MS scans acquired at 60,000 resolution at a mass/charge ratio (m/z) of 200 and measured using the Orbitrap mass analyzer. MS2 data were acquired at top speed for 2 s to collect as many data-dependent scans as possible, using a 1.2-Da isolation window on the quadrupole mass filter and fragmentation by higher-energy collisional dissociation (HCD) at 30% normalized collision energy; the MS2 fragment ions were measured at 15,000 resolution at 200 m/z using the Orbitrap mass analyzer. Automatic gain control (AGC) targets were set at 300% for MS1 and 100% for MS2, with maximum ion-injection accumulation times of 25 and 80 ms, respectively.
For the DIA analysis, peptides from each of the AO-treated and DMSO-treated cortical neuron samples were analysed on an Orbitrap Exploris 480 mass spectrometer. Peptides were loaded onto a trap column and eluted on an analytical column using a 120-min nonlinear gradient within a total run time of 145 min. MS1 data were acquired at 120,000 resolution at 200 m/z and measured using the Orbitrap mass analyzer. A variable DIA scheme was used, with the quadrupole mass filter covering the mass range of 400 to 1500 m/z. A total of 45 variable isolation windows were employed per duty cycle, and peptide precursor ions were fragmented using normalized stepped HCD collision energies (26, 28, and 30) and measured at 30,000 resolution at 200 m/z using the Orbitrap mass analyzer. AGC targets were set at 300% for MS1 and 3000% for MS2, with maximum ion-injection accumulation times of 25 and 80 ms, respectively. The complete variable DIA window schemes and instrument settings are provided in the Supplementary table and have been deposited at Zenodo (doi: 10.5281/zenodo.8023364).
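The window boundaries actually used are those listed in the Supplementary table. Purely as an illustration of what a variable-window scheme means, the sketch below shows one common design strategy, which is an assumption here and not necessarily the authors' approach: placing window edges at quantiles of the precursor m/z distribution so that each window contains roughly the same number of precursors. The function name and the simulated precursor distribution are hypothetical.

```python
import numpy as np

def variable_windows(precursor_mz, n_windows=45, mz_min=400.0, mz_max=1500.0):
    """Split [mz_min, mz_max] into n_windows contiguous isolation windows
    holding approximately equal numbers of precursors."""
    mz = np.asarray(precursor_mz, dtype=float)
    mz = mz[(mz >= mz_min) & (mz <= mz_max)]
    # Quantile edges give equal occupancy; dense regions get narrow windows.
    edges = np.quantile(mz, np.linspace(0.0, 1.0, n_windows + 1))
    # Pin the outer edges to the requested mass range.
    edges[0], edges[-1] = mz_min, mz_max
    return [(float(lo), float(hi)) for lo, hi in zip(edges[:-1], edges[1:])]

# Simulated precursor m/z values (hypothetical, skewed towards low m/z).
rng = np.random.default_rng(1)
simulated_mz = 400.0 + rng.gamma(shape=2.0, scale=150.0, size=20_000)

windows = variable_windows(simulated_mz)
widths = [hi - lo for lo, hi in windows]
print(f"{len(windows)} windows, narrowest {min(widths):.1f} m/z, "
      f"widest {max(widths):.1f} m/z")
```

With a real precursor list, for example one taken from the DDA spectral library, the same function would produce narrow windows where peptide precursors are dense and wide windows at the sparse high-m/z end.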
MS1 quantification was performed using MS-Fragger version 3.2 with the built-in IonQuant algorithm (https://github.com/Nesvilab/IonQuant; doi: 10.5281/zenodo.8098825), with match between runs enabled. A 1% false discovery rate (FDR) at the peptide-spectrum match (PSM), peptide, and protein levels was applied to the final output files. The protein group table was further processed using the Perseus software suite (v1.6.15.0; http://www.perseus-framework.org; RRID:SCR_015753) to estimate copy numbers using the histone proteomic ruler [3,4]. The DDA data were used to generate a spectral library using the Pulsar search engine in Spectronaut version 15 (Biognosys; https://biognosys.com/software/spectronaut/) [18]. This library was used for the library-based search of the DIA data, using the default search parameters and enabling cross-run normalization. The resulting protein group table was exported and processed using Perseus for further analysis. Statistical analysis was performed using Student's t-tests with a 1% permutation-based FDR to identify differentially regulated proteins [3].
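The logic of a two-sample t-test with permutation-based FDR control can be sketched as follows. This is a simplified illustration and not the Perseus algorithm: it omits the s0 fudge factor and missing-value handling, enumerates every balanced relabelling of the samples rather than drawing random ones, and runs on simulated log-intensities. The group size of six per condition and all function names are assumptions made for the example.

```python
from itertools import combinations

import numpy as np

def t_statistics(x, in_group_a):
    """Pooled-variance Student t statistic for every row (protein) of x."""
    a, b = x[:, in_group_a], x[:, ~in_group_a]
    na, nb = a.shape[1], b.shape[1]
    pooled_var = ((na - 1) * a.var(axis=1, ddof=1)
                  + (nb - 1) * b.var(axis=1, ddof=1)) / (na + nb - 2)
    return (a.mean(axis=1) - b.mean(axis=1)) / np.sqrt(pooled_var * (1 / na + 1 / nb))

def permutation_q_values(x, in_group_a):
    """Estimate a q-value per protein from relabelled (permuted) samples."""
    n_cols, n_a = x.shape[1], int(in_group_a.sum())
    observed = np.abs(t_statistics(x, in_group_a))
    null_stats = []
    for cols in combinations(range(n_cols), n_a):  # every relabelling
        mask = np.zeros(n_cols, dtype=bool)
        mask[list(cols)] = True
        null_stats.append(np.abs(t_statistics(x, mask)))
    n_perm = len(null_stats)
    null_sorted = np.sort(np.concatenate(null_stats))
    obs_sorted = np.sort(observed)
    # Counts of null and observed statistics at or above each threshold.
    n_null_ge = null_sorted.size - np.searchsorted(null_sorted, observed, side="left")
    n_obs_ge = observed.size - np.searchsorted(obs_sorted, observed, side="left")
    fdr = np.minimum(1.0, (n_null_ge / n_perm) / n_obs_ge)
    # q-value: smallest FDR over all thresholds at or below the statistic.
    order = np.argsort(observed)
    q = np.empty_like(fdr)
    q[order] = np.minimum.accumulate(fdr[order])
    return q

# Simulated log2 intensities: 500 proteins, 6 + 6 samples, 10 true changes.
rng = np.random.default_rng(0)
x = rng.normal(loc=20.0, scale=0.2, size=(500, 12))
x[:10, :6] += 3.0
in_group_a = np.array([True] * 6 + [False] * 6)

q = permutation_q_values(x, in_group_a)
print("proteins with q <= 0.01:", int((q <= 0.01).sum()))
```

With very few samples per group the number of distinct relabellings is small, which limits the lowest FDR that can be estimated; this is one reason implementations differ in how permutations are drawn and how the test statistic is stabilised.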

Ethics Statements
All animal studies were conducted in accordance with the Animals (Scientific Procedures) Act 1986 and with Directive 2010/63/EU of the European Parliament and of the Council on the protection of animals used for scientific purposes (2010, no. 63). Experiments and breeding were approved by the University of Dundee Ethical Review Committee and were further subject to study plans approved by the Named Veterinary Surgeon and Compliance Officer.

Declaration of Competing Interests
The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: M.M.K.M. is a member of the Scientific Advisory Board of Mitokinin Inc. and Scientific Consultant to Stealth Biotherapeutics Inc. and Merck & Co Inc.

Data Availability
COPY NUMBER ANALYSIS OF NEURONS (Original data) (PRIDE).