COVID-19 patients and Dementia: Frontal cortex transcriptomic data

Given the association of SARS-CoV-2 infection with Nervous System (NS) manifestations, we performed RNA-sequencing analysis in the Frontal Cortex of COVID-19 positive and negative individuals, with or without Dementia. We examined gene expression differences in individuals with COVID-19 and Dementia compared to Dementia-only patients by collecting transcript counts in each sample and performing Differential Expression analysis. We found eleven genes satisfying our significance criteria, all of them protein-coding. These data are suitable for integration with additional samples and for analysis under different classifications of the individuals. Differential expression evaluation may also be applied to other scientific purposes, such as the study of unannotated genes, mRNA splicing and gene isoforms. The analysis of differentially expressed genes in COVID-19 positive patients compared to non-COVID-19 patients is published in: S. Gagliardi, E.T. Poloni, C. Pandini, M. Garofalo, F. Dragoni, V. Medici, A. Davin, S.D. Visonà, M. Moretti, D. Sproviero, O. Pansarasa, A. Guaita, M. Ceroni, L. Tronconi, C. Cereda, Detection of SARS-CoV-2 genome and whole transcriptome sequencing in frontal cortex of COVID-19 patients, Brain. Behav. Immun. (2021). https://doi.org/10.1016/j.bbi.2021.05.012.

Value of the Data
• We used Next Generation Sequencing to provide transcriptomic profiles of the Frontal Cortex of COVID-19 positive and negative individuals, with or without Dementia. These screenings are important for studying the impact of the current infectious disease on the Central Nervous System, so-called NeuroCOVID-19, and on diverse comorbidities of the elderly, such as Dementia. The aim was to collect information on RNA alterations in the prefrontal cortex, given its contribution to hemodynamic responses.
• These data can help in the study of the molecular features of SARS-CoV-2 in the brain. Moreover, the dysregulation of specific pathways can be extrapolated from transcriptomic data, making them a source of biomarkers.
• The versatility of both raw and analysed RNA-sequencing data lies in their suitability for several purposes, such as gene expression analysis, discovery of unannotated genes, investigation of mRNA splicing and study of gene isoforms. In addition, data in standard formats, such as FastQ and BAM files, as well as gene expression tables reporting raw counts, FPKM and TPM values, can be easily re-used and integrated with additional samples, or exploited to refine the analysis with a different classification of the individuals.

Data Description
A summary of the demographic and clinical features of the cases included in the transcriptomic investigation is reported in Table 1. Six individuals had Dementia only, seven had both Dementia and COVID-19, two had neither Dementia nor COVID-19, and two had COVID-19 but not Dementia.
In Supplementary Table 1, the counts of each gene (specified as Ensembl ID) are indicated for each sample submitted to sequencing.
The number of coding and non-coding counts was evaluated for each sample and, as visible in Fig. 1, coding counts were the most abundant. This result is in accordance with current knowledge about non-coding transcripts, which are globally less expressed than coding ones within the cell [1, 2]. Sample BB109 was nonuniform in terms of count abundance and did not pass the quality check, so it was excluded from further analysis. A differential expression analysis of genes was then performed. We compared the group of individuals with COVID-19 and Dementia (n = 7) with those with Dementia only (n = 6). To evaluate the clustering resulting from this analysis, all the deregulated genes are represented in the heatmap in Fig. 2. We found 11 dysregulated genes, 4 up-regulated and 7 down-regulated, all of them protein-coding. The list of genes considered significant in this analysis is available in Supplementary Table 2, which reports the Ensembl ID, base mean, log2FoldChange, lfcSE, stat, P-value, adjusted P-value, gene name, gene biotype and gene source.
We also performed a differential expression analysis of genes comparing COVID-19 patients without Dementia (n = 2) with COVID-19 negative individuals without Dementia (n = 1), but we found no significantly deregulated genes under our filtering criteria, as reported in Supplementary Table 2.
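The split into up- and down-regulated genes described above can be sketched as follows. This is a minimal illustration, not the pipeline actually used: it assumes a results table with `log2FoldChange` and `padj` columns (the names follow the column list of Supplementary Table 2, with `padj` standing for the adjusted P-value) and applies the thresholds stated in the Methods (|log2FC| ≥ 1, FDR ≤ 0.1). The gene names and values are hypothetical and are not taken from the dataset.

```python
import math

LFC_THRESHOLD = 1.0  # |log2 fold change| >= 1, as stated in the Methods
FDR_THRESHOLD = 0.1  # adjusted P-value (FDR) <= 0.1, as stated in the Methods


def classify_gene(log2_fold_change, padj):
    """Return 'up', 'down' or 'ns' (not significant) for a single gene."""
    # A missing or NaN adjusted P-value is treated as not significant.
    if padj is None or math.isnan(padj) or padj > FDR_THRESHOLD:
        return "ns"
    if log2_fold_change >= LFC_THRESHOLD:
        return "up"
    if log2_fold_change <= -LFC_THRESHOLD:
        return "down"
    return "ns"


def summarize(rows):
    """Count genes per class over an iterable of result rows."""
    counts = {"up": 0, "down": 0, "ns": 0}
    for row in rows:
        counts[classify_gene(row["log2FoldChange"], row["padj"])] += 1
    return counts


# Hypothetical rows for illustration only (not values from Supplementary Table 2).
example_rows = [
    {"gene": "GENE_A", "log2FoldChange": 1.8, "padj": 0.03},   # up
    {"gene": "GENE_B", "log2FoldChange": -2.1, "padj": 0.08},  # down
    {"gene": "GENE_C", "log2FoldChange": 0.4, "padj": 0.01},   # effect too small
    {"gene": "GENE_D", "log2FoldChange": -1.5, "padj": 0.40},  # FDR too high
    {"gene": "GENE_E", "log2FoldChange": 3.0, "padj": None},   # no adjusted P-value
]

print(summarize(example_rows))
```

A comparison that yields no significant genes, as in the second analysis, would simply report zero in both the "up" and "down" classes.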
The volcano plot in Fig. 3 shows statistical significance (P-value) versus magnitude of change (fold change) of differentially expressed (DE) genes in individuals with COVID-19 and Dementia (n = 7) versus individuals with Dementia only (n = 6). The number of genes with |log2(fold change)| ≥ 1 that are also statistically significant is low.
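As an illustration of how a volcano plot such as the one described above is usually built, the sketch below maps each gene to a point with x = log2(fold change) and y = -log10(P-value). It is a generic sketch rather than the code used for Fig. 3. The `p_floor` parameter is an assumption introduced here to avoid taking the logarithm of zero, and the input values are hypothetical.

```python
import math


def volcano_coordinates(log2_fold_change, p_value, p_floor=1e-300):
    """Return the (x, y) position of one gene on a volcano plot.

    x is the log2 fold change; y is -log10 of the P-value.
    p_floor is a hypothetical safeguard: P-values reported as 0 are
    clamped to it so that the logarithm is defined.
    """
    y = -math.log10(max(p_value, p_floor))
    return log2_fold_change, y


def exceeds_fold_change(log2_fold_change, cutoff=1.0):
    """True when |log2FC| reaches the cutoff (1 in the Methods)."""
    return abs(log2_fold_change) >= cutoff


# Hypothetical genes for illustration only: (log2 fold change, P-value).
for lfc, p in [(2.0, 0.001), (-0.3, 0.5), (-1.4, 0.0)]:
    x, y = volcano_coordinates(lfc, p)
    print(f"x={x:+.2f}  y={y:.2f}  beyond cutoff: {exceeds_fold_change(lfc)}")
```

Points far from x = 0 and high on the y axis correspond to genes with both a large change and strong statistical support.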

Experimental Design, Materials and Methods
Human brain samples obtained at autopsy were used to collect these data. RNA was isolated with Trizol reagent (Life Science Technologies, Italy) according to the manufacturer's instructions and processed as described in Gagliardi et al. [1].
The quality of individual sequences was evaluated using MultiQC software (https://multiqc.info/) after adapter trimming with cutadapt software. UMI sequences were marked and deduplicated with UMI-tools software [2]. Per base sequence quality plots, showing the mean quality value across each base position in the read, are shown in Fig. 4. Gene and transcript intensities and differential expression analysis for mRNA and non-coding RNAs were computed as in Gagliardi et al. [1]. The human genome reference used for the alignment was GRCh38 (Gencode release 36), containing up-to-date records for both coding and non-coding RNAs. Coding and non-coding genes were considered differentially expressed and retained for further analysis with |log2(disease sample/healthy control)| ≥ 1 and an FDR ≤ 0.1. Inter- and intra-group variability was assessed and is shown in Fig. 5. On average, 29.2 M reads were available for each sample and 22.7 M reads were aligned against the reference genome (average overall alignment rate: 77.9%). The number of input reads, average read length, number of aligned reads and alignment rate are reported in Table 2 for each sample. Transcripts with a count value of at least 5 were retained for differential expression analysis. On average, 16734.8 coding genes and 5370.6 non-coding genes were expressed in each sample.

Table 2. For each sample indicated in the "Sample_name" column, the total number of input reads, the average read length, the number of reads uniquely mapped to the reference genome and the overall alignment rate are reported.
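The alignment statistics above can be summarized in two ways, and the sketch below shows both under stated assumptions. The mean of the per-sample alignment rates and the pooled rate (total aligned reads over total input reads) generally differ. For example, the ratio of the two reported averages, 22.7 M / 29.2 M, is roughly 77.7%, which need not coincide with the reported average rate of 77.9%. The sample names and read numbers below are hypothetical and are not taken from Table 2.

```python
def alignment_rate(input_reads, aligned_reads):
    """Alignment rate of one sample, as a percentage."""
    return 100.0 * aligned_reads / input_reads


def mean_of_rates(samples):
    """Average of the per-sample alignment rates (each sample weighs equally)."""
    rates = [alignment_rate(s["input_reads"], s["aligned_reads"]) for s in samples]
    return sum(rates) / len(rates)


def pooled_rate(samples):
    """Total aligned reads over total input reads (deeper samples weigh more)."""
    total_input = sum(s["input_reads"] for s in samples)
    total_aligned = sum(s["aligned_reads"] for s in samples)
    return alignment_rate(total_input, total_aligned)


# Hypothetical samples for illustration only (not rows of Table 2).
samples = [
    {"Sample_name": "S1", "input_reads": 30_000_000, "aligned_reads": 24_000_000},
    {"Sample_name": "S2", "input_reads": 20_000_000, "aligned_reads": 14_000_000},
]

print(f"mean of per-sample rates: {mean_of_rates(samples):.1f}%")
print(f"pooled rate:              {pooled_rate(samples):.1f}%")
```

When re-using the data, it is worth stating which of the two summaries is being reported, since they only agree when all samples have the same sequencing depth.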

Ethics Statement
The study protocol was approved by the Ethics Committee of the University of Pavia on October 6th, 2009 (Committee report 3/2009). For deceased subjects, consent is not required, as the samples had in any case been taken for clinical/forensic purposes and it is not possible to contact the next of kin in such circumstances. The reference law is authorization no. 9/2016 of the Italian data protection authority (the privacy guarantor), later replaced by Regulation (EU) 2016/679 of the European Parliament and of the Council.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships which have or could be perceived to have influenced the work reported in this article.