Developments and Applications of Functional Protein Microarrays

Protein microarrays are crucial tools for studying proteins in an unbiased, high-throughput manner, as they allow up to thousands of individually purified proteins to be characterized in parallel. The adaptability of this technology has enabled its use in a wide variety of applications, including the study of proteome-wide molecular interactions, analysis of post-translational modifications, identification of novel drug targets, and examination of pathogen-host interactions. The technology has also proven useful for profiling antibody specificity and for discovering novel biomarkers, especially for autoimmune diseases and cancers. In this review, we summarize the developments made in protein microarray technology in both basic and translational research over the past decade. We also introduce a novel membrane protein array, the GPCR-VirD array, and discuss the future directions of functional protein microarrays.

family microarrays have a relatively small number of proteins, the expression system can be tailored for desired qualities and quantities. For example, the GPCR array is developed using Virion Display (VirD) technology (19) to maintain the seven transmembrane structure and to obtain the best GPCR expression in several mammalian cell lines, including Vero, HEL, HeLa, and 293T cells (14).
Alternatively, protein domain microarrays can be designed to analyze certain regions, domains, or epitopes within the proteins. These arrays often involve the careful design of desired gene sequences before entering the protein production pipeline. The cell-free protein/peptide microarray is designed to display a short peptide or full-length protein using a cell-free system. Cell-free expression is designed to bypass the expensive and often tedious work of cell-based protein production. To construct
representative studies based on the research applications illustrated in Figure 1. Here, we review research studies based on the proteomes immobilized on microarrays.
Zhu et al. constructed the very first proteome microarray, the yeast proteome microarray, and used it to investigate protein-protein and protein-lipid interactions. The array was probed with biotinylated calmodulin, and 33 new calmodulin-binding proteins with new common motifs were identified (3). In the same study, the yeast proteome array was probed with fluorescently labeled liposomes carrying various phosphatidylinositides, and more than 150 phospholipid-binding proteins were identified. The Zhu lab further demonstrated the utility of proteome arrays by performing covalent enzymatic reactions on the arrays. They were the first to establish protein acetylation reactions on the array using the yeast NuA4 complex, which led to the discovery of two parallel signaling pathways in yeast aging (32). The array has also been applied to determine the substrates of Rsp5, a HECT domain ubiquitin E3 ligase (33). These studies demonstrate the usefulness of the yeast proteome microarray in basic research.

Downloaded from https://www.mcponline.org by guest on May 4, 2020

Application of E. coli Proteome Microarrays in Basic Research
Chen et al. established a purified E. coli proteome microarray in 2008, comprising 4,256 unique proteins, and applied it to identify potential new players in the DNA damage response. The E. coli proteome microarray was probed with several short DNA probes containing mismatched base pairs or abasic sites, and two DNA repair proteins were identified: YbaZ and YbcN (8). In another study, the same array was used to detect proteins that bind the promoter of type 1 fimbriae, and Spr was identified as a phase switch for type 1 fimbria expression (34). Ho et al. probed the E. coli proteome array with several antimicrobial peptides and identified many intracellular targets. Among the four antimicrobial peptides, they identified both shared and unique targets and suggested a synergistic effect between LfcinB and Bac7, as well as between LfcinB and PR-39 (35).
Hsiao et al. probed the E. coli proteome array with four glycosaminoglycans that are common on host cells and identified a hundred protein targets. They further validated ycbS as a bacterial factor for cell entry (36). Xu et al. probed the E. coli proteome array with an important bacterial second messenger, cyclic di-GMP, and identified CobB as a strong binder. Since CobB is a deacetylation enzyme, they subsequently found that cyclic di-GMP inhibits its enzymatic activity, forming a novel feedback loop on cyclic di-GMP production (37). Feng et al. used the E. coli proteome microarray to investigate protein-cell interactions by probing the array with human brain microvascular endothelial cells (HBMEC). They identified 23 target proteins and validated YojI as a protein for E. coli invasion. Moreover, they purified YojI, used it to probe HuProt, and further identified interferon-alpha receptor as a host receptor for YojI (38). Besides various binding assays, the E. coli proteome microarray has also been applied to identify substrates, including substrates of glycoproteins (39), tyrosine sulfation (40), and ClpYQ protease (41). As demonstrated by these representative works, the E. coli proteome microarray is widely used to study bacterial physiology as well as host-microbial interactions.

Application of Human Proteome Microarrays in Basic Research
The human proteome microarray is the most widely used array in basic research, translational research, and the pharmaceutical industry. There are three popular human proteome microarrays: HuProt, ProtoArray, and NAPPA. HuProt contains ~21,000 individually purified full-length human proteins, which is by far the most comprehensive human proteome collection. ProtoArray contained ~9,000 human proteins purified from insect cells, but was discontinued commercially in 2018. NAPPA is an in vitro expression system that has been applied to express 10,000 human proteins.
required for hepatitis C virus (HCV) replication, they further validated the target hnRNP K as a repressor for HCV replication (43). Therefore, the human proteome microarray is a valuable tool to study the complex regulatory networks of protein-DNA and protein-RNA interactions (Figure 1F and 1G).
The human proteome microarray is also useful for the analysis of protein-protein interactions, especially for determining players involved in pathogen-host interactions (52). Overall, the human proteome microarray serves as an unbiased platform for studying many different kinds of binding events and enzyme-substrate relationships.
The two major pharmaceutical applications of the human proteome microarray are drug target identification (Figure 1D) and specificity tests for monoclonal antibodies (mAbs) (Figure 1H). HuProt was used to identify the targets of arsenic, a cancer drug, and 360 potential binders were identified. Hexokinase was validated to bind arsenic, and this binding event was further shown to result in the inhibition of glycolysis (53).

Application of Functional Protein Microarrays in Translational Research
Serological biomarkers are valuable tools for diagnosis, prognosis and companion diagnosis in various autoimmune diseases, cancers, and infectious diseases (59,60).
One of the early applications of functional protein microarrays was to discover new serological biomarkers for autoimmune diseases because they can serve as antigen surveying platforms to detect subtle changes in antibody composition. In a dysregulated immune system, the antibodies that are generated by humoral immunity and react with self-antigens are referred to as autoantibodies (AAbs). When a functional protein array covers the majority of the human proteome (e.g., HuProt), a specific AAb signature can be readily detected by probing the array with a diluted patient serum/plasma sample.
When this approach is used to profile AAb signatures for a large cohort, subsequent statistical analysis can reveal potential biomarkers associated with a disease of interest (Table 3). This approach has three major advantages. First, patient samples are easy to obtain and store because they are mostly in the form of serum, plasma, or other body fluids. Second, detection of AAbs on a human proteome array is very sensitive and quantitative, requiring only several microliters of sample. Finally, AAbs can be detected before symptoms appear, making early diagnosis possible.
Human proteome microarrays have been used to identify diagnostic AAbs for more
identification, it is necessary to include the most comprehensive human proteome collection for unbiased screening, and to validate candidate biomarkers using an additional cohort to avoid overfitting. These requirements often result in a high price tag for biomarker research. Song et al. developed a strategy to overcome this issue by dividing the process into two phases. In phase I, also known as the biomarker discovery or screening step, they used the HuProt array to survey AAbs in a smaller cohort of sera from 22 autoimmune hepatitis (AIH) patients and 30 healthy controls. In this phase, they narrowed down thousands of human proteins to 11 candidate autoantigens.
In phase II, also known as the biomarker verification or validation step, they fabricated a focused antigen array with the 11 candidate antigens to survey AAbs in a much larger cohort composed of sera from 44 AIH patients, 50 healthy controls, and 184 patients suffering from other autoimmune diseases as a disease comparison group. With this two-phase strategy, they identified and validated three new antigens, RPS20, Alba-like, and dUTPase, as highly specific biomarkers for AIH (61).
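The logic of such a two-phase design can be illustrated with a minimal Python sketch. It is not the analysis used by Song et al.: the one-sided Fisher exact test, the significance cutoff, the sensitivity and specificity thresholds, the antigen names (AG1 to AG3), and all positivity counts are assumptions chosen for illustration. Only the phase I and phase II cohort sizes for patients and healthy controls are taken from the text above, and the disease comparison group is omitted for brevity.

```python
from math import comb


def fisher_one_sided(pos_cases, n_cases, pos_controls, n_controls):
    """One-sided Fisher exact p-value for enrichment of positives in cases."""
    total_pos = pos_cases + pos_controls
    denom = comb(n_cases + n_controls, total_pos)
    p = 0.0
    # Sum hypergeometric probabilities of tables at least as extreme.
    for k in range(pos_cases, min(n_cases, total_pos) + 1):
        p += comb(n_cases, k) * comb(n_controls, total_pos - k) / denom
    return p


def screen_candidates(case_calls, control_calls, alpha=0.01):
    """Phase I: keep antigens whose AAb positivity is enriched in patients."""
    candidates = []
    for antigen, cases in case_calls.items():
        controls = control_calls[antigen]
        p = fisher_one_sided(sum(cases), len(cases), sum(controls), len(controls))
        if p < alpha:
            candidates.append(antigen)
    return candidates


def validate(candidates, case_calls, control_calls, min_sens=0.3, min_spec=0.95):
    """Phase II: re-test candidates only, on an independent cohort."""
    validated = {}
    for antigen in candidates:
        cases, controls = case_calls[antigen], control_calls[antigen]
        sens = sum(cases) / len(cases)
        spec = 1 - sum(controls) / len(controls)
        if sens >= min_sens and spec >= min_spec:
            validated[antigen] = (sens, spec)
    return validated


def calls(n_pos, n):
    """Synthetic per-sample AAb calls: n_pos positives out of n samples."""
    return [True] * n_pos + [False] * (n - n_pos)


# Phase I (synthetic counts): 22 patients and 30 healthy controls.
p1_cases = {"AG1": calls(12, 22), "AG2": calls(3, 22), "AG3": calls(10, 22)}
p1_controls = {"AG1": calls(1, 30), "AG2": calls(4, 30), "AG3": calls(2, 30)}
candidates = screen_candidates(p1_cases, p1_controls)

# Phase II (synthetic counts): 44 patients and 50 healthy controls.
p2_cases = {"AG1": calls(20, 44), "AG3": calls(5, 44)}
p2_controls = {"AG1": calls(2, 50), "AG3": calls(6, 50)}
validated = validate(candidates, p2_cases, p2_controls)

print("Phase I candidates:", candidates)
print("Phase II validated:", validated)
```

In this synthetic example, one antigen passes screening but fails in the independent cohort, which is the kind of overfitting the second phase is meant to catch.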
In translational cancer research, it is important to identify early diagnosis markers to allow for earlier treatment and intervention. Human proteome arrays are widely used to profile the AAbs in 10 cancer types, including ovarian cancer (74,75), glioma (76)
(14). We demonstrated that the GPCR-VirD array is useful for profiling the specificity of mAbs (Figure 2A). Among the 20 commercial mAbs tested, only 10 were determined to be ultra-specific. The rest either failed to show specificity entirely or had several off-targets. Interestingly, all four mAbs with reported neutralization activity were shown to be ultra-specific on the GPCR-VirD array. Next, we performed specificity tests with known ligands (Figure 2B) and revealed several off-targets for a peptide hormone, somatostatin-14. Two selected off-targets, along with the canonical GPCR, were validated with virion nano-oscillators for real-time and label-free detection (105) and showed significant binding affinities. Lastly, we probed the GPCR-VirD array with a clinical strain involved in neonatal meningitis (Group B Streptococcus K79) and identified five potential GPCR targets (Figure 2C). CysLTR1 was further validated in vitro and in vivo as a host receptor for K79 invasion. We believe that the VirD array is a robust platform to profile many different kinds of membrane protein interactions.
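One way to make the three mAb outcomes described above concrete is the following Python sketch. The scoring method is an assumption and is not taken from the GPCR-VirD study: signals are converted to robust z-scores (median and median absolute deviation), and a cutoff of 5 defines a hit. The receptor names (GPCR_A to GPCR_H) and all signal values are synthetic placeholders.

```python
from statistics import median


def robust_z(signals):
    """Robust z-score per arrayed protein, using the median and scaled MAD."""
    values = list(signals.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)
    scale = 1.4826 * mad  # makes the MAD comparable to a standard deviation
    if scale == 0:
        raise ValueError("signals are too uniform to estimate a spread")
    return {name: (v - med) / scale for name, v in signals.items()}


def classify_mab(signals, intended_target, z_cutoff=5.0):
    """Classify one mAb by which arrayed proteins score as hits.

    Returns a label and the sorted list of off-target hits.
    """
    scores = robust_z(signals)
    hits = {name for name, z in scores.items() if z >= z_cutoff}
    if intended_target not in hits:
        return "no specific binding", sorted(hits)
    off_targets = sorted(hits - {intended_target})
    if off_targets:
        return "off-targets", off_targets
    return "ultra-specific", []


# Synthetic background signals shared by the three hypothetical mAbs.
background = {"GPCR_C": 105, "GPCR_D": 95, "GPCR_E": 110,
              "GPCR_F": 90, "GPCR_G": 102, "GPCR_H": 98}

mab_1 = {"GPCR_A": 2000, "GPCR_B": 100, **background}   # binds target only
mab_2 = {"GPCR_A": 2000, "GPCR_B": 1500, **background}  # target plus one more
mab_3 = {"GPCR_A": 101, "GPCR_B": 100, **background}    # no binding above noise

result_1 = classify_mab(mab_1, "GPCR_A")
result_2 = classify_mab(mab_2, "GPCR_A")
result_3 = classify_mab(mab_3, "GPCR_A")

for name, result in (("mab_1", result_1), ("mab_2", result_2), ("mab_3", result_3)):
    print(name, result)
```

The median and MAD are used here instead of the mean and standard deviation so that a few very strong spots do not inflate the background estimate; this is a design choice of the sketch, not a reported detail of the study.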

Future Directions
Membrane proteins are one of the most important protein categories, as they play important roles in many biological processes, such as signal transduction, cell recognition, cell-cell communication, transport, and anchorage, to name a few. It is highly desirable to develop a high-content and high-throughput platform for functional membrane proteins to enable meaningful screening for ligands, biologicals, and small molecule drugs. To date, many methods have been developed to maintain the native conformation of membrane proteins, including nanodiscs (106), macrodiscs (107), Salipro nanoparticles (108), virus-like particles (109), and VirD (14,19). Unlike VirD, the other methods are not easy to scale up for multiplexed, highly parallel screening while maintaining the flexibility of massive production of the reagents from various mammalian cell lines. When the VirD array is coupled with nano-oscillator technology (105), an entire membrane protein family can be screened simultaneously with candidate drugs or biologicals in a label-free, real-time fashion, and binding specificity and kinetics can be obtained in a single experiment. We envision that VirD array technology can expand to all kinds of membrane protein families and holds promise for discovering biologicals, drugs, and receptor interactions. Beyond VirD, which is tailored for membrane proteins, all other human proteins also need a proper expression system for optimal folding and PTMs. For this reason, it would be desirable to use a mammalian expression system. In combination with transfection, transformation, and CRISPR knock-in technologies (110), it is possible to generate a human proteome microarray from human cells and accelerate research, potentially leading to the discovery of novel drugs or biologicals.