INTRODUCTION

Infectious pneumonia is an acute inflammatory process of the lung tissue caused by bacterial, viral, or fungal pathogens, including streptococci, staphylococci, influenza viruses, and coronaviruses, among others (the most common causative agents of pneumonia are discussed in detail in reviews [1–5]).

Viral and bacterial pneumonias often have a similar clinical presentation. In a clinical laboratory, it is not always possible to determine the etiology of pneumonia caused by coronaviruses and influenza viruses in a timely manner, which can lead to errors in the choice of treatment strategy. During epidemics, the situation is aggravated by the spread of nosocomial bacterial infections (coinfections) under conditions of mass infection [6, 7]. Accelerated differential identification of the pathogen is therefore very important: it allows quick and accurate diagnosis and, consequently, timely and correct therapy [8].

Today, clinical laboratories identify infectious agents using culture methods [9] and test systems based on PCR or serological tests, which in most cases are designed to identify a single type of pathogen [10, 11]. Often, diagnosis and the prescription of therapy are based only on the clinical presentation of the disease [12] or, in some cases, on the clinical presentation supplemented by radiological data [13, 14].

Multiplex PCR is a promising tool for molecular biological research and clinical diagnostics. The use of multiplex PCR for the simultaneous detection of several causative agents of human pneumonia in one sample is extremely important, given the complexity of determining the etiology of the disease by classical clinical methods, including the identification of causative agents of respiratory infections [15–17]. In clinical laboratories, RNA viruses are detected using reverse transcription (RT) [18, 19].

Multiplex RT is used to detect RNA-containing viruses that cause respiratory infections [20–23]. Multiplex RT-PCR is used for the simultaneous detection of RNA and DNA-containing viruses [24]. The simultaneous detection of viroids (RNA) and eubacteria (DNA) using a technique that combines RT and PCR has been described [25]. RT-PCR followed by hybridization analysis on a biological microchip is considered in [26, 27].

Direct multiplex RT-PCR on a biological microchip can increase the productivity, sensitivity and reliability of detection of nucleic acids of bacterial and viral infectious agents in a sample. This method fits well with the scale of testing performed by clinical laboratories.

In this work, we propose an accelerated method for determining the pathogen in a sample using multiplex solid-phase RT-PCR. The method combines a high speed of analysis with the simultaneous detection of a number of pathogenic agents. The prototype of the developed diagnostic system is resistant to cross-contamination and compatible with standard in situ amplifiers and fluorescent signal detectors designed for biological microchips. The prototype has an open architecture, which makes it possible to expand the range of detected pathogenic agents, provided that the compatibility condition of the multi-primer system is met.

EXPERIMENTAL

Strains. We used decontaminated genomic DNA of bacterial strains from the collection of the State Research Center for Applied Microbiology and Biotechnology (Obolensk) and viral RNA from the collection of the Institute of Vaccines and Serums (Moscow). Work with clinical isolates and live cultures was carried out at these institutions.

DNA from cultures was isolated using the CTAB method [28]. A bacterial culture suspension was prepared in 1× TE buffer. The cells were lysed using a lysozyme solution (10 mg/mL). Proteinase K (Thermo Fisher, United States) and CTAB/NaCl solution were used to degrade proteins and separate proteins from DNA. Proteins, cellular elements, and DNA were separated using a chloroform/isoamyl alcohol solution in a ratio of 24 : 1. DNA was separated from the rest of the solution using isopropyl alcohol and washed from the remaining reagents with 70% ethanol (reagent grade). Washed and dried DNA was dissolved in distilled water.

The quality and quantity of DNA were determined by electrophoresis in 1% agarose gel and spectrophotometrically (GeneQuant Pro RNA/DNA Calculator, Amersham Pharmacia Biotech, Great Britain).

Decontaminated DNA of the following strains was obtained: Staphylococcus aureus ATCC 25923, S. aureus ATCC 43300, Haemophilus influenzae ATCC 49247, Legionella pneumophila ATCC 33152, Pseudomonas aeruginosa 10662 NCTC ATCC 25668, Klebsiella pneumoniae 9633 NCTC ATCC 13883, Streptococcus pneumoniae ATCC 49619.

The SARS-CoV-2 virus was obtained by producing viral particles in a Vero cell culture (ATCC, USA) from a clinical sample obtained from a patient with COVID-19. The presence of SARS-CoV-2 RNA in the viral material was analyzed by real-time RT-PCR with primers to the N gene [29]. The taxonomic affiliation of the isolate to Severe acute respiratory syndrome-related coronavirus (clade GH SARS-CoV-2) was established by sequencing the S gene (GenBank ID MW161041.1) and the complete genome (GenBank ID MW514307.1), followed by phylogenetic analysis. The sequence of the S gene in the Dubrovka strain had 99.2% similarity with the Wuhan-Hu-1 strain (NC_045512.2). A feature of the Dubrovka strain is a deletion of 27 nucleotides in the S gene (encoding the 9 amino acid residues YMSLGPMVL at positions 68–76 of the S protein), which explains the relatively high level of difference (0.8%) between these strains.

Influenza viruses A (strain A/Panama/2007/99 H3N2) and B (B/Leningrad/179/86) were obtained from the Collection of microorganisms of III and IV pathogenicity groups of the Research Institute of Vaccines and Serums named after I.I. Mechnikov.

The viral material was inactivated in a lysis solution (RNA isolation kit Magno-Sorb, Russia) containing the chaotropic agent, guanidine isothiocyanate. Inactivation of the SARS-CoV-2 virus was checked by the presence or absence of a cytopathic effect in a sensitive Vero cell culture.

Viral genomic RNA was isolated from 140 μL of lysate using a commercial QIAamp Viral RNA Mini Kit (Qiagen, Germany) according to the manufacturer’s instructions. With this kit, purified viral RNA was eluted from the membrane of QIAamp Mini Spin columns with 60 μL of RNase-free water and stored at –70°C until use.

Viral RNA was also extracted using a modified CTAB-method, which needs to be optimized to be used simultaneously for both RNA and DNA-containing pathogenic agents.

Primers. The nucleotide sequences of genomic targets were aligned using the ClustalW algorithm (www.clustal.org). Primers were designed using the tools available at www.idtdna.com. The physicochemical characteristics of each primer were determined, including tests for intra- and intermolecular secondary structures. The specificity was analyzed using the BLAST algorithm (NIH, USA).

Multiplex PCR in volume. The reaction mixture (30 μL) contained 1.5 units of Taq polymerase (Thermo Scientific, USA) in the buffer from the same manufacturer, natural deoxynucleoside triphosphates (dNTPs) at a concentration of 200 μM each, primers at a concentration of 5 μM, and a whole-genome bacterial DNA template (or a mixture of bacterial DNA). The reaction was carried out on a MiniCycler DNA amplifier (MJ Research, USA). The temperature-time profile of PCR consisted of preliminary denaturation at 95°C for 5 min, followed by 30 cycles of 95°C (DNA denaturation) for 20 s, 66°C (primer annealing) for 30 s, and 72°C (primer extension) for 30 s, and a final incubation at 72°C for 5 min.
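The amounts of substance implied by these concentrations can be worked out directly. The short sketch below does the unit arithmetic for the 30 μL reaction; it assumes that the stated 5 μM applies to each primer individually, which the text does not say explicitly.

```python
# Amount of substance per component of the 30 µL in-volume PCR mix.
# Concentrations and volume come from the text; treating 5 µM as a
# per-primer value is an assumption.

REACTION_VOLUME_UL = 30.0


def amount_pmol(conc_um: float, volume_ul: float) -> float:
    """µM x µL = pmol (1e-6 mol/L x 1e-6 L = 1e-12 mol)."""
    return conc_um * volume_ul


dntp_pmol = amount_pmol(200.0, REACTION_VOLUME_UL)   # each dNTP
primer_pmol = amount_pmol(5.0, REACTION_VOLUME_UL)   # each primer (assumed)

print(f"each dNTP:   {dntp_pmol / 1000:.1f} nmol")
print(f"each primer: {primer_pmol:.0f} pmol")
```

Under these assumptions, each dNTP is present at 6 nmol and each primer at 150 pmol per reaction.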

Gradient PCR and determination of the sensitivity of the system using real-time PCR were performed on an IQ5 amplifier (Bio-Rad).

Horizontal electrophoresis for monitoring RT and PCR products. PCR products were separated in a 4% agarose gel (Agarose LE, Helicon, Russia); ethidium bromide was used for staining.

Immobilization of primers, biological microarrays. After synthesis and purification, primers were dissolved in water (Milli Q), the concentration was adjusted to 8 mM, mixed with the components of the gel, and applied to a chip, which is a treated silicate glass or a polymer substrate. Further processing of the substrate and the procedure for manufacturing the chip were carried out according to [30].

Multiplex solid-phase RT-PCR. RT-PCR was performed using MMLV reverse transcriptase and other components of the REVERTA-L kit (Federal Budget Institution of Science, Central Research Institute of Epidemiology, Rospotrebnadzor, Russia) and Hot Start Taq polymerase (Thermo Scientific) in an appropriate buffer, or using the OneStep RT-PCR Kit (Qiagen). The mixture contained natural dNTPs (400 µM each) and primers: 5–10 µM forward and 0.5–1.0 µM reverse (the concentrations differed between primer pairs as a result of optimization). Cy5-dUTP [31] at a concentration of 8 μM was used as a fluorescent substrate for the polymerase. The mixture was placed on a chip and sealed using Frame-Seal 25 μL (Bio-Rad, United States). The reactions were carried out on a TGradient Thermocycler DNA amplifier for in situ PCR (Biometra, USA). RT was performed for 30 min at 42°C, after which PCR was performed under the following conditions: 95°C for 3 min (initial denaturation); 36 cycles of 20 s at 95°C, 30 s at 64°C, and 40 s at 72°C; final incubation for 5 min at 72°C.
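Since speed of analysis is one of the stated goals, it may help to add up the programmed hold times of the two temperature-time profiles given in this section. The sketch below does only that: the step durations are taken from the text, while the `Step` and `Profile` classes are illustrative names, and ramp times between temperatures (which depend on the instrument) are deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int


@dataclass(frozen=True)
class Profile:
    pre: tuple     # steps run once before cycling
    cycle: tuple   # steps repeated `cycles` times
    cycles: int
    post: tuple    # steps run once after cycling

    def hold_seconds(self) -> int:
        """Sum of programmed hold times; ramping is not included."""
        once = sum(s.seconds for s in self.pre + self.post)
        return once + self.cycles * sum(s.seconds for s in self.cycle)


# Multiplex PCR in volume (30 cycles, annealing at 66 C).
in_volume = Profile(
    pre=(Step("initial denaturation", 95, 5 * 60),),
    cycle=(Step("denaturation", 95, 20),
           Step("annealing", 66, 30),
           Step("extension", 72, 30)),
    cycles=30,
    post=(Step("final incubation", 72, 5 * 60),),
)

# Multiplex solid-phase RT-PCR on the chip (RT step, then 36 cycles).
solid_phase = Profile(
    pre=(Step("reverse transcription", 42, 30 * 60),
         Step("initial denaturation", 95, 3 * 60)),
    cycle=(Step("denaturation", 95, 20),
           Step("annealing", 64, 30),
           Step("extension", 72, 40)),
    cycles=36,
    post=(Step("final incubation", 72, 5 * 60),),
)

for label, profile in (("in volume", in_volume), ("solid phase", solid_phase)):
    print(f"{label}: {profile.hold_seconds() / 60:.0f} min of hold time")
```

The hold times come to 50 min for PCR in volume and 92 min for solid-phase RT-PCR; actual run times will be longer by the cumulative ramp time.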

The sensitivity of RT-PCR on a chip was determined by titrating the DNA/RNA of the analyzed samples in the range of 10^1–10^5 copies per reaction volume (25 μL).
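To give a sense of scale, copy numbers in this range can be converted to molar concentrations in the 25 μL reaction volume. The sketch below assumes ten-fold dilution steps across the stated range; the text gives only the end points, so the step size is an assumption.

```python
AVOGADRO = 6.02214076e23   # entities per mole (exact SI value)
REACTION_VOLUME_L = 25e-6  # 25 µL, from the text


def copies_to_molar(copies: float, volume_l: float = REACTION_VOLUME_L) -> float:
    """Convert a number of template copies in the reaction to mol/L."""
    return copies / AVOGADRO / volume_l


# Assumed ten-fold series covering the stated range of copies per reaction.
titration = [10 ** k for k in range(1, 6)]

for copies in titration:
    attomolar = copies_to_molar(copies) * 1e18
    print(f"{copies:>7d} copies per reaction = {attomolar:10.2f} aM")
```

At the low end of the range, 10 copies in 25 μL corresponds to less than 1 aM of template.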

Determination of the elongation of immobilized primers, interpretation of the analysis. The fluorescent signal from the chip after the extension of the immobilized primers was read according to [32] using a Chip Detector analyzer (IMB, Russia). The signal intensity was determined using ImaGeWare v. 3.50 (IMB, Russia).
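One simple way to turn cell intensities into positive or negative calls is to compare each cell with an empty gel cell. The sketch below illustrates that idea only; it is not the algorithm used by ImaGeWare, and the threshold of 3.0, the cell labels, and the intensity values are all hypothetical.

```python
from statistics import mean


def call_cells(signals: dict, background_cells: tuple = ("Gel",),
               min_ratio: float = 3.0) -> dict:
    """Flag cells whose signal-to-background ratio reaches min_ratio.

    signals maps a cell label to its fluorescence intensity. The
    background is the mean intensity of the empty gel cells. The
    default threshold is an illustrative assumption.
    """
    background = mean(signals[c] for c in background_cells)
    return {
        cell: value / background >= min_ratio
        for cell, value in signals.items()
        if cell not in background_cells
    }


# Hypothetical intensities (arbitrary units) for a few chip cells.
example = {"Gel": 100.0, "Sta1": 950.0, "Sta2": 720.0, "Str1": 130.0, "Cov1": 110.0}
calls = call_cells(example)
positives = sorted(cell for cell, positive in calls.items() if positive)
print(positives)
```

With these made-up numbers, only the two cells with ratios well above the threshold are called positive. A real threshold would have to be set from replicate measurements.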

RESULTS

Earlier, we reported on the development of multiplex PCR to determine the types of bacteria that cause human pneumonia [17, 33]. Further, based on primers specific to S. aureus and St. pneumoniae, we developed a multiplex PCR on a chip with the incorporation of the label into the DNA during its extension. A fluorescent label covalently inserted into the sequence of an immobilized primer during PCR allowed a “hard” wash of the chip without the risk of signal loss and, accordingly, without reducing the sensitivity of the system [32].

The set of specific primers has been expanded. It now covers six main types of bacteria that cause pneumonia, as well as two RNA viruses: influenza A and the new type of coronavirus that causes COVID-19 (Table 1), which made the use of RT necessary. The specificity and intraspecific conservatism of the selected regions of the genetic targets were taken into account as necessary conditions for primer design. To make it easier to identify the pathogen while optimizing the multi-primer system “in the total volume,” primer pairs were designed to yield PCR products of different lengths. When designing multiplex PCR, we were guided by the requirement of primer compatibility, i.e., the absence of intermolecular interactions among all primers of the system, as well as by the standard requirements for close melting temperatures of duplexes and the absence of intramolecular interactions (hairpins). BLAST analysis was used to select species-specific conserved regions of the target genes used for species identification of pathogenic microorganisms and viruses. Table 1 shows the sequences flanking the selected regions, as well as the lengths of the resulting PCR products. To create a microchip, either the reverse primer of each pair or a so-called “nested” primer, lying inside the amplified region flanked by the forward and reverse primers, was immobilized. The choice was determined by comparative testing of primers during system optimization. Table 1 shows the final primers obtained after system optimization.
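The requirement for close melting temperatures can be illustrated with the Wallace rule, Tm = 2(A + T) + 4(G + C) in degrees Celsius. This is a rough rule of thumb for short oligonucleotides and is used here only as a stand-in; the text does not state which Tm model the design tools applied, and the two sequences below are made up for the example, not primers from Table 1.

```python
def wallace_tm(primer: str) -> int:
    """Rough melting temperature (deg C) by the Wallace rule."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("only unambiguous A/C/G/T bases are supported")
    return 2 * at + 4 * gc


def tm_spread(primers: dict) -> int:
    """Difference between the highest and lowest estimated Tm in a set."""
    tms = [wallace_tm(seq) for seq in primers.values()]
    return max(tms) - min(tms)


# Hypothetical 20-mers used only to exercise the functions.
demo = {
    "demo_f": "ATGCGTACGTTAGCCTAGGA",
    "demo_r": "TTGACCGTAGGCATCAGTCG",
}

for name, seq in demo.items():
    print(name, wallace_tm(seq))
print("spread:", tm_spread(demo))
```

A small spread across the whole primer set is what allows a single annealing temperature to serve all targets in one reaction.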

Table 1.   Primers for the species-specific determination of bacterial and viral pathogens of human pneumonia

An important condition for the species-specific identification of pathogenic agents is the intraspecific conservatism of the selected targets. Ideally, these should be genes encoding proteins that are characteristic only for the studied microorganisms or viruses, or genes whose sequences differ sharply from the sequences of homologous genes of related species that are not pathogenic for humans. It is preferable to use genes encoding pathogenic factors. Thus, most of the selected targets determine the virulence of the identified pathogenic agents.

Table 1 shows the genetic targets used to identify pathogens. The targets were selected on the basis of the proven possibility of their use for species identification of the corresponding pathogen.

The St. pneumoniae gene lytA encodes one of the virulence factors, autolysin, which is involved in a number of cellular processes [34]; the cpsB gene (tyrosine-specific protein phosphatase B) of streptococcus is involved in the regulation of the biosynthesis of capsular polysaccharides [35]. The ebpS gene (elastin-binding protein S) is an S. aureus gene that encodes proteins involved in the binding of molecules on the cell surface [36]. The fucK gene, used for H. influenzae identification [37], is a carbohydrate metabolism gene. The oprL gene encodes a peptidoglycan-associated protein that is involved in membrane invagination during cell division in pseudomonads, in particular in P. aeruginosa. This gene is important for maintaining membrane integrity [38]. For species identification of L. pneumophila and K. pneumoniae, the virulence determinants SidA and RmpA, respectively, are used [39, 40].

In the case of the RNA viruses SARS-CoV-2 and influenza A virus, the E gene (encoding the envelope protein [41]) and segment 7 of matrix protein 2 (M2) [42] were used, respectively.

Multiple sequence alignments of each selected region from the archive of the NIH (USA) were constructed and searched for conserved regions for the design of flanking primers. In the case of RNA viruses with high variability, degenerate nucleotide positions were used. The theoretical specificity of the primers was checked using the BLAST algorithm (NIH, USA).
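A primer with degenerate positions stands for a pool of concrete sequences. The sketch below expands a primer written with standard IUPAC nucleotide codes into that pool and counts its degeneracy; the example string is a toy sequence, not one of the primers used in this work.

```python
from itertools import product

# Standard IUPAC nucleotide codes and the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand(primer: str) -> list:
    """List every unambiguous sequence covered by a degenerate primer."""
    pools = [IUPAC[base] for base in primer.upper()]
    return ["".join(bases) for bases in product(*pools)]


def degeneracy(primer: str) -> int:
    """Number of distinct sequences in the degenerate pool."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


print(expand("ARY"))
print(degeneracy("ACGTRYN"))
```

Degeneracy grows multiplicatively with each ambiguous position, so the effective concentration of any single sequence in the pool falls accordingly.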

When developing systems with an open architecture, the compatibility of the entire primer pool is of great importance. Compatibility is understood as the absence of any complementary interactions in the entire set of primers. The expansion of the system leads to the need to check the absence of such interactions between the newly introduced primers and the already optimized system. This is possible even in manual mode.
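A check of this kind can also be scripted. The sketch below is a deliberately crude screen, assuming that a perfect match between the last few 3′ bases of one primer and any stretch of another is enough to flag a pair; real design tools score such interactions thermodynamically. The window of 5 bases and the three primer sequences are hypothetical.

```python
from itertools import combinations_with_replacement

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def three_prime_clash(a: str, b: str, k: int = 5) -> bool:
    """True if the last k bases of a can pair perfectly with a stretch of b."""
    return revcomp(a[-k:]) in b.upper()


def incompatible_pairs(primers: dict, k: int = 5) -> list:
    """Names of primer pairs (including self-pairs) with a 3'-end clash."""
    clashes = []
    for (name_a, a), (name_b, b) in combinations_with_replacement(primers.items(), 2):
        if three_prime_clash(a, b, k) or three_prime_clash(b, a, k):
            clashes.append((name_a, name_b))
    return clashes


# Hypothetical primers; p1's 3' end is complementary to a stretch of p2.
pool = {
    "p1": "ACGTTGCAAGGCTACCGGAT",
    "p2": "TTGATCCGAACTGGTCAAGC",
    "p3": "GGAGGAGAAGAAGGAGAAGG",
}

print(incompatible_pairs(pool))
```

When a new primer is added to an already optimized set, only its pairs with the existing primers and with itself need to be screened, so the cost of extending the system grows linearly with the size of the pool.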

At the optimization stage, it is useful to test primers using PCR in one volume, followed by electrophoretic separation of the reaction products. In addition, primers immobilized in different cells are deprived of the possibility of interaction with each other, which facilitates the task.

Figure 1 shows a schematic diagram of the detection of one genetic target using immobilized primers.

Fig. 1.   Analysis scheme using immobilized primers. Letters f and r designate the forward and reverse flanking primers; r1 and r2, nested immobilized primers (carrying amino groups at the 5′ end); dU*, cyanine dye-labeled deoxyuridine inserted during primer extension.

The system is designed so that several specific primers are present on the chip for each pathogen. In some cases the reverse flanking primer “r” was immobilized, and in others the nested primers (“r1” and “r2” in the scheme); the choice was made according to the results of preliminary optimization (testing in the total volume) and was repeated for each pathogen.

At the initial stage of the process of optimizing the multiplex system, several specific primers were tested for each region. The corresponding electrophoregrams are shown in Fig. 2.

Fig. 2.   Experimental selection of specific primers during system optimization based on the results of analysis of PCR products. (a): (1) dsDNA fragment length marker DNA Ladder 50 bp (bold bands of the marker correspond to lengths of 250 and 500 bp of dsDNA); (2) F1 + R1 (163 bp); (3) F2 + R1 (78 bp); (4) F3 + R1 (180 bp); (5) F1 + R2 (115 bp); (6) F2 + R2 (30 bp); (7) F3 + R2 (132 bp). (b): (1) Ladder 50 bp; (2) F4 + R3 (144 bp); (3) F5 + R3 (131 bp); (4) F6 + R3 (169 bp); (5) F4 + R4 (87 bp); (6) F5 + R4 (74 bp); (7) F6 + R4 (112 bp); (8) F4 + R5 (126 bp); (9) F5 + R5 (113 bp); (10) F6 + R5 (151 bp).

From Fig. 2 it can be seen that the lengths of the PCR products in each of the wells of the gel correspond to the theoretically expected lengths. However, there are clear differences in the yield of the product, its uniformity (in some cases, by-products are visible) or the “primer-dimer” effect, which is observed, for example, in well 10 (Fig. 2b). These effects were eliminated during the system optimization process.
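The theoretical lengths listed in the caption of Fig. 2 can be checked for internal consistency. If each primer has a fixed position on the target, the product length for any pair is the reverse primer end minus the forward primer start plus one, so the lengths of all combinations must fit one set of positions. The sketch below tests this using only the lengths from the caption; the coordinates it derives are relative (the first forward primer is placed at position 1) and are not genomic positions.

```python
def is_additive(lengths: dict) -> bool:
    """Check that length(F, R) == end(R) - start(F) + 1 for every pair.

    lengths maps (forward, reverse) primer names to product length in bp
    and must contain every forward/reverse combination.
    """
    f0, r0 = next(iter(lengths))
    # Place the first forward primer at position 1; products made with it
    # then give the end position of each reverse primer directly.
    r_end = {r: n for (f, r), n in lengths.items() if f == f0}
    # Products made with the first reverse primer give each forward start.
    f_start = {f: r_end[r0] - n + 1 for (f, r), n in lengths.items() if r == r0}
    return all(r_end[r] - f_start[f] + 1 == n for (f, r), n in lengths.items())


panel_a = {
    ("F1", "R1"): 163, ("F2", "R1"): 78, ("F3", "R1"): 180,
    ("F1", "R2"): 115, ("F2", "R2"): 30, ("F3", "R2"): 132,
}
panel_b = {
    ("F4", "R3"): 144, ("F5", "R3"): 131, ("F6", "R3"): 169,
    ("F4", "R4"): 87, ("F5", "R4"): 74, ("F6", "R4"): 112,
    ("F4", "R5"): 126, ("F5", "R5"): 113, ("F6", "R5"): 151,
}

print(is_additive(panel_a), is_additive(panel_b))
```

Both panels pass the check, which supports the listed theoretical lengths being mutually consistent.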

In the case of RNA viruses, the first stage of optimization included the RT stage. Figure 3 shows the results of RT-PCR of SARS-CoV-2 and influenza A virus RNA.

Fig. 3.   RT-PCR analysis of RNA from coronavirus type 2 and influenza A virus isolated from a clinical specimen. (a): (1) dsDNA fragment length marker GeneRuler 50 bp (bold bands of the marker correspond to lengths of 250 and 500 bp dsDNA); (2) F7 + R6, amount of initial RNA 10^3 copies per reaction tube; (3) F7 + R6, amount of initial RNA 10^4 copies per reaction tube; (4) negative control (primers are visible). Primers Ef and Er were used; the theoretical product length was 143 bp, corresponding to that observed on the electrophoregram. (b): Separate RT and PCR during optimization (testing using RT for both forward and reverse primers). (1) DNA product length marker GeneRuler 50 bp; (2) RT F8, PCR F8 + R7; (3) RT R7, PCR F8 + R7; (4) RT F8, PCR F9 + R7; (5) RT R7, PCR F9 + R8. Primer R8 is experimental, modified (by introducing degenerate positions) from the one proposed by WHO [42]; the rest of the primers were designed specifically for the proposed test system.

The first stage was followed by system optimization using all selected primer pairs, including gradient PCR and varying the concentration of system components, including primers.

The sensitivity and specificity of the test system were determined, including when the analyzed targets were used in pairs on the same biological microchip. According to the results of the experiments, some immobilized primers were replaced, as well as free flanking primers in the reaction mixture.

The system showed a higher signal to background ratio compared to using the labeled primer in solution and determining the primer extension by hybridization after PCR. The advantages of the proposed approach and the analysis scheme are given in [32].

The temperature-time profile of the reaction was optimized using gradient PCR; the concentrations of each reagent in the mixture were optimized, except for the components of the reaction buffer, which was used in accordance with the manufacturer’s recommendations. The specificity of primers to targets was determined first for each pair of primers with the corresponding target, then for the same pair of primers with other targets (to exclude pseudospecific interactions), and only after that was the multi-primer system tested. The experiments used full-length preparations of bacterial genomic DNA, as well as isolated and purified RNA of SARS-CoV-2 and influenza A.

After optimization of the multi-primer system in PCR mode, testing and optimization in RT-PCR mode were carried out, which required the replacement of some primers.

The primers for immobilization were subjected to the same multi-step testing procedure, after which they were immobilized and tested in various modes, similar to that described for the flanking primers. Testing was performed on decontaminated samples of isolated genomic DNA/RNA.

The resulting sequences of the immobilized primers, selected after optimization of the system, are shown in Table 2.

Table 2.   Immobilized primers for RT-PCR on a biochip

A diagram of a specialized biological microchip and examples of sample analysis are shown in Fig. 4.

Fig. 4.   Specialized biological microchip for species identification of pneumonia pathogens by multiplex RT-PCR. (a) Scheme: M, fluorescent marker for automatic grid overlay in the software that calculates the signal intensity of the cells; Gel, empty gel cell; Sta, Staphylococcus aureus (here and in all subsequent cases, the numbers “1” and “2” denote two different primers within the sequence flanked by the edge primers, and the letter “a” denotes a primer at half the concentration used in the chip cell without a letter, which is necessary to improve the reliability of automatic analysis); Str, Streptococcus pneumoniae; Leg, Legionella pneumophila; Hae, Haemophilus influenzae; Pse, Pseudomonas aeruginosa; Kle, Klebsiella pneumoniae; Cov, SARS-CoV-2; Inf, influenza A virus; Con1, internal control of reverse transcription (reserve; currently not used); Con2, internal control of PCR. (b–f) Results of species identification (examples), respectively: S. aureus, St. pneumoniae, H. influenzae, SARS-CoV-2, influenza A. For convenience, photos with inverted signals (negatives) are shown.

The system provides two internal controls: one for the RT stage and one for the PCR stage. The sensitivity of the system was determined by titration of each of the analyzed DNA/RNA samples. Depending on the target, the sensitivity of the system with immobilized primers ranged from 10^2 to 10^4 copies of nucleic acid per sample (only for the available samples in the collection; statistics are currently insufficient). The increase in sensitivity was facilitated by the inclusion of the label in the growing chain of the immobilized primer. This makes it possible to covalently attach the label and remove all components of the mixture that increase the background noise and reduce the sensitivity of the system. The proposed approach is described in more detail in [32]; it is an improvement on the classical methodology for the use of enzymatic reactions on hydrogel chips, which uses a labeled primer in solution.

The system can be expanded to analyze a wider range of pathogens by adding immobilized primers, provided that they are checked for compatibility with the multiplex RT-PCR developed.

DISCUSSION

The first work describing the performance of PCR on hydrogel biochips was published in 2000 [43] by a scientific group of the IMB RAS (Russia). Soon, PCR was used to detect rifampicin-resistant strains of the causative agent of tuberculosis [44, 45]; a methodical work [46] was devoted to modifying the PCR method on a chip.

Original approaches to carrying out enzymatic reactions in the solid phase have been proposed to create compact specialized biochips [47, 48], which have been developed for isothermal amplification, bridge PCR, and, ultimately, new generation sequencing (NGS). Because of the technical difficulties of PCR on a chip in comparison with hybridization analysis, the current “gold standard” of diagnostic systems based on specialized hydrogel biochips is hybridization analysis [49–53]. Based on this approach, a number of certified systems for clinical diagnostics have been created (TB-Biochip, SI-Biochip, and others).

In most methods based on RT, the reaction is carried out in the liquid phase (without using biochips), and only the subsequent hybridization is performed on the chip [26, 27].

The classical scheme of signal detection after an enzymatic reaction on a biochip relies on hybridization of a fluorescently labeled primer to the extended immobilized primer. This makes it impossible to carry out the reaction in real time. Because hybridization is a quasi-equilibrium system, detection is very sensitive to the conditions, and the risk of washing away the hybridized primer is increased. Another drawback is an increased background signal and a relatively low ratio of signals of perfect and imperfect duplexes due to the need to use fluorescently labeled primers.

One of the improvements in the approach proposed in this work is inclusion of labeled nucleotides into the growing immobilized DNA strand, which allows complete removal of all components of the reaction mixture, thereby sharply reducing the background signal. When this approach is combined with microfluidics technology, solutions can be exchanged after each amplification cycle and reactions can be performed in real time. The combination of RT with PCR allows the analysis of both RNA and DNA-containing pathogenic agents (viruses and bacteria).

Differences in the chemical nature of nucleic acids complicate the development of a universal extraction method (in particular, different chloroform/isoamyl alcohol ratios are used to obtain the maximum yield of DNA or RNA, and RNA isolation requires buffer solutions treated with diethyl pyrocarbonate, etc.). To create a universal method for detecting RNA- and DNA-containing pathogenic agents, it is necessary to unify the method for isolating nucleic acids. We used the CTAB method because it is one of the gentlest methods for isolating total DNA and RNA from samples containing both bacterial and eukaryotic cells and viral particles [54, 55].

Scaling of the proposed system is hampered by differences in the nature of the tested samples, the presence of inhibitors, contaminants, and satellite microorganisms. The presence of total DNA and RNA in the extract can complicate the analysis and require a more careful approach to the design of species-specific primers.

Further improvement of the prototype of the proposed system involves statistical analysis, as well as determination of the sensitivity and specificity of the analysis using a wide range of samples.

CONCLUSIONS

Highly specific primers have been designed to identify six types of bacteria and two viruses that cause pneumonia. Solid-phase RT-PCR and a biological microchip for the simultaneous detection of several pathogens in the test sample have been developed and optimized. The fluorescent signal accumulates through the incorporation of labeled nucleotides into the growing DNA strand. The developed prototype has an open architecture, which allows the spectrum of analyzed pathogenic agents to be expanded. The system is aimed at clinical laboratories, where a large number of samples need to be tested in parallel and a quick result is important for developing a timely and adequate patient treatment strategy.