Naturally occurring protein nano compartments: basic structure, function, and genetic engineering

Compartmentalization of reactions is key to efficient metabolism in living organisms. Living systems have evolved compartments that address issues related to slow turnover rates, competing parallel pathways and the management of toxic intermediates. Several of these compartments are expressed conditionally, only when required, while others are permanently present in cells. Most physiological compartments have a complex makeup comprising phospholipids, glycans and proteins. However, certain compartments have a simple composition consisting of only one component. Such compartments are easy to study and to explore for in vitro applications. In this article we review the structure and function of entirely protein-based natural prokaryotic and eukaryotic nanocompartments and how they have been modified or engineered to achieve a particular function in vitro.


Introduction
Protein self-assembly and compartmentalization underlie several physiological processes. Even primitive bacteria rely on protein compartmentalization to isolate metabolic processes. Although less well known than membrane-based compartmentalization, protein compartments have their own advantages. Most importantly, these compartments are genetically encoded and can be manipulated or altered for biotechnological uses such as drug delivery, vaccine development and nanoreactors. There are several paradigms of compartmentalization in biology that involve only proteins (Table 1). These are widespread in prokaryotes and eukaryotes. In this article, we review the different kinds of naturally occurring protein compartments and how they have been modified or edited for different applications.
The most widely studied protein compartments in eukaryotes are the ferritins. Ferritins act as iron storage proteins and regulate iron availability and homeostasis. Although most common in eukaryotes, ferritins are also found in bacteria [13]. Ferritin was first isolated and crystallized by Laufberger in 1937 [14,15]. The protein isolated from horse spleen contained close to 20% iron by dry weight, which is reported to correspond to almost one iron atom per amino acid. Follow-up studies in the decades since have shed light on the structure and function of this protein [16][17][18]. A typical ferritin is a closed spherical compartment with external and internal diameters of ~12 nm and ~8 nm, respectively [18]. Inside this spherical polypeptide shell (apoferritin) lies a spherical inorganic core of the hydrated iron oxide ferrihydrite (5Fe2O3·9H2O) [19]. The ferritin shell has pores that can be used to encapsulate a desired cargo [20]. One feature that makes ferritins an attractive model for cargo encapsulation and delivery is their stability over a wide range of pH and temperature [21,22]. Another class of eukaryotic compartments is the vaults. They are the largest known ribonucleoprotein particles and have a barrel-like structure that provides sufficient space for the encapsulation of heterologous cargo [23]. They play a role in multidrug resistance in cancer cell lines, and their other functions are yet to be explored [24]. 
The vault particle is composed of major vault protein (MVP) plus three minor components: two MVP-associated proteins, the vault poly(ADP-ribose) polymerase (VPARP) and telomerase-associated protein (TEP1), and several small non-coding RNAs …

Table 1
Compartment [reference] | Occurrence | Native function | Applications
… [3] | Bacteria and Archaea | Protection in oxidative stress | Nanoreactor and drug delivery
Bacterial microcompartment [4][5][6] | Prokaryotes | Carbon fixation and metabolism of pentose sugars | Nanoreactors and catalysis; agriculture
Virus-like particles [7] | Viruses | - | Vaccine development and catalysis
Lumazine synthase [8] | Plants, fungi and most microorganisms | Biosynthesis of riboflavin | Vaccine development, drug delivery
E2 [9] | Prokaryotes and eukaryotes | Glycolysis | Vaccine development
Chaperonin [10] | Bacteria, Archaea, eukaryotes and viruses | Protein homeostasis | Carrier, nanoreactor and nanosensor
DmrB [11] | Archaea, few bacteria | Redox enzyme | Not explored yet
Pyrenoid [12] | Algae, non-vascular plants | Carbon fixation | Enhance photosynthesis in higher plants

… desired protein. DmrB (dihydromethanopterin reductase) is a protein cage-forming enzyme present in archaea and a few bacteria that is involved in redox reactions. Its assembly properties have been reported, but its applications are yet to be explored [11,47]. Beyond compartments of prokaryotic and eukaryotic origin, virus-like particles have been derived from various plant and animal viruses by removing their genetic material. The viral coat, known as the capsid, acts as a nanocompartment. These nanocompartments are also stable and biocompatible, so they can be used for biological applications [37]. The pyrenoid is a proteinaceous organelle found in the chloroplasts of almost all algae and of a group of non-vascular plants. Its role, CO2 fixation, is similar to that of carboxysomes. These protein bodies are currently being genetically engineered into higher plants to improve photosynthesis [12]. 
These naturally occurring, all-protein compartments have been explored for several biotechnological uses by means of genetic engineering and biochemical manipulation. The following sections review these developments.

Engineering of natural protein nano compartments for different applications
The native protein nanocompartments found in vivo have been adapted in several instances for in vitro applications (Figure 1). In the following sections we detail the major proposed applications of protein nanocompartments and discuss the biomolecular engineering required to achieve such goals.

Drug delivery
To act as a drug carrier, a molecule should be able to encapsulate cargo and should allow surface modification to make it target-specific. Proteinaceous nanocompartments fulfil these criteria: they are small, have a hollow core, are biocompatible, and can be modified genetically as well as chemically. Among the nanocompartments discussed, ferritins and recombinant vaults meet the above criteria and are therefore considered 'ideal' drug delivery agents [1,48]. Ferritins have the special property of disassembling and reassembling under certain conditions, e.g. very low or very high pH or a high concentration of urea (Figure 2). By taking advantage of these assembly properties, drugs can be encapsulated within them [49,50]. In addition, passive diffusion of drugs and the use of high hydrostatic pressure have been shown to aid drug encapsulation within ferritins [51,52]. With regard to target specificity, ferritins have an innate tendency to bind certain receptors found on cancer cells: the heavy chain of ferritin binds TfR1 and the light chain binds SCARA5. This specific attachment is followed by endocytosis. There are reports of TfR1- and SCARA5-based drug delivery systems using ferritins [53,54]. However, these receptors are also expressed on normal cells, so specificity needs to be improved further. To this end, target-specific peptides have been genetically added to the surface of ferritins; for example, the RGD peptide, which has high affinity for integrin (a tumor angiogenesis biomarker up-regulated in tumor endothelium), was added to target specific tumors [55]. Similarly, He et al. recently reported a ferritin-based therapy for osteoarthritis, in which they added a cartilage-targeting peptide to the exposed N-terminus of ferritin and loaded it with the anti-inflammatory drug metformin. 
This CT-Fn/Met prolonged the retention time of the drug and thereby helped to reduce inflammation [56]. Peptides can be added using approaches such as sortase tagging, histidine tagging and the SpyTag/SpyCatcher system [57][58][59][60]. The same peptide conjugation approach can be extended to other specific targets.
Turning to the vault, INT-MVP (the major vault protein interaction domain), discussed earlier, has been reported to serve as an encapsulation peptide. INT can be genetically fused to the desired cargo [62]. The first study based on this approach reported the fusion of luciferase and of a GFP variant to INT. These INT-based conjugates were directed into the vault, where they retained their catalytic and fluorescent properties. This work supports the further development of biocompatible nanocapsules based on protein compartments [63].
One disadvantage of using vaults in drug delivery is their inability to encapsulate hydrophobic drugs, owing to their water-soluble nature [64]. To overcome this problem, Buehler et al. reported the formation of a lipid bilayer nanodisk within the vault. The nanodisk was derived from a truncated form of apolipoprotein A-I (Apo-AI, amino acids 44-200) containing a series of amphipathic helices that encircle the disk's circumference in a belt-like manner. With its rich lipophilic domain, the nanodisk tends to absorb hydrophobic compounds, which makes the vault a versatile carrier for drug delivery [64]. In vaults, the C-terminus of MVP is exposed and has been used for targeting (Figure 4). Targeting moieties can be attached to this exposed C-terminus; for example, anti-EGFR antibodies were attached to it. These antibodies, which have affinity for EGFR, were reported to target A431 carcinoma cells, in which this receptor is overexpressed [65]. Further, to improve vault penetration into target cells and endosomal escape, Han et al. fused a membrane-lytic peptide derived from adenovirus protein VI (pVI) to the N-terminus of MVP to form pVI-vaults. These pVI-vaults disrupt the endosomal membrane and thus enhance overall transfection efficiency [30]. Besides these two nanocompartments, other PNCs such as virus-like particles, encapsulins and enzyme-derived nanocages have also shown cargo encapsulation and specific cell-targeting ability [7,66,67].

Vaccine development
Vaccine development is both important and challenging. Expressing an epitope or antigen on a protein shell, or encapsulating it within one, is the primary step. In the context of protein nanocompartments, genetic engineering has been used to modify the protein shells so that they display or encapsulate a particular epitope or antigen. Virus-like particles (VLPs) are one type of nanocompartment that has been used extensively for vaccine development because they mimic the structure of viruses and induce an immune response [7]. Their repetitive amino acid sequence helps them bind B cell receptors, their small size (20-200 nm) favours uptake by antigen-presenting cells (APCs), and VLPs containing host nucleic acid activate Toll-like receptors on APCs [68]. The very first recombinant peptide-VLP vaccine was developed by attaching a poliovirus type 3 epitope to the C-terminus of the Tobacco mosaic virus (TMV) coat protein. This hybrid (TMVCPpolio 3) induced poliovirus-neutralizing antibodies following injection into rats [69]. In a recent study, Singh et al. developed two VLP vaccines based on Pfs48/45, which is expressed on the surface of Plasmodium falciparum sexual stages. They attached the antigens to the Acinetobacter phage AP205 VLP using the SpyTag/SpyCatcher system. The resulting Pfs48/45-VLP conjugates were immunogenic and are therefore candidates for clinical use [70]. Similarly, other antigens have been added to the surface of various VLPs using different conjugation methods, e.g. the sortase tag (Figure 3), to elicit immune responses [61,71,72].
Ferritins also have potential as agents for vaccine development. Kanekiyo et al. developed a vaccine against the influenza virus using Helicobacter pylori non-haem ferritin. They inserted haemagglutinin at the interface of adjacent subunits of the ferritin shell so that eight trimeric viral spikes formed on the surface of the ferritin. This complex was reported to produce a high titre of haemagglutination-inhibition antibodies, and thus provides a foundation for building vaccines against emerging influenza viruses and other pathogens [73]. Although vaults are non-immunogenic, they have been shown to act as adjuvants. Kar et al. used full-length ovalbumin as the antigen because it is highly immunogenic. They fused the antigen to INT using overlap PCR and then encapsulated it inside the vault. These Ova-vaults were effective in generating immunity [74]. The first study of vaccine development using the more recently discovered protein particles, the encapsulins, was reported by Choi et al., who inserted OT-1 (amino acids 257-264 of ovalbumin) at three different positions of the encapsulin subunit. This OT-1 was processed by phagosomes and elicited an immune response [75]. In another study, Lagoutte et al. displayed the matrix protein 2 ectodomain (M2e) of influenza A virus on the surface of encapsulin. They inserted the M2e gene between the genes of the encapsulin shell proteins. Along with M2e, they also encapsulated GFP using a flippase tag. As a result, they observed antibodies against both M2e and GFP [76]. These few examples illustrate how a wide range of antigens can be displayed on or encapsulated within PNCs to elicit an immune response, with potential for clinical use.

Nanoreactors
As described in the previous sections, protein nanocompartments can be modified genetically both by displaying desired proteins on their surface and by encapsulating cargo within them. Together, these approaches can be used to build a simple metabolic pathway for the bulk production of chemicals. A desired enzyme can be encapsulated within a PNC using an encapsulation peptide, or it can be displayed on the surface by fusing its gene to the genes encoding the shell proteins. Substrates and products can then diffuse in and out through the pores of the shell proteins. Because they are highly organized and have an innate role in metabolic processes, bacterial microcompartments (BMCs; carboxysomes and metabolosomes) can be modified into nanoreactors. A BMC is encoded by a single operon, so the overall BMC structure can be changed by mutating genes in that operon. One such nanoreactor was recently reported by Li et al., who reprogrammed a carboxysome for hydrogen production. They modified the operon encoding the α-carboxysome proteins of the chemoautotrophic bacterium Halothiobacillus neapolitanus. They first produced empty α-carboxysomes using the essential shell proteins and identified the encapsulation peptide. Using this peptide, the enzymes responsible for hydrogen production were incorporated into the empty compartment to create the nanoreactor (Figure 4) [5]. There are also reports on modulating the pore size of shell proteins. One report shows that mutating specific amino acids near the pore of the PduA shell protein impairs substrate permeability [77]. Another shows that substrate permeability can be enhanced by making chimeras of shell proteins [78]. These reports established an understanding of how to control the pore size of shell proteins. Overall, by modulating pore size, the influx of substrates can be controlled and a better metabolic pathway can be established. 
Altogether, BMCs are amenable to modification and can be used in similar studies for the production of various other important chemicals.
Whereas the BMC is a very complex compartment, the simpler encapsulin has also been used as a multi-enzyme nanoreactor. One such nanoreactor was reported by Jenkins et al., who simultaneously displayed one enzyme on the surface of encapsulin and encapsulated another inside it. They used the SpyTag/SpyCatcher system for display and an encapsulation peptide for encapsulation. In this type of nanoreactor, the product of the first reaction is used in the second, establishing the concept of cascade reactions using encapsulins [79]. Another encapsulin-based nanoreactor was demonstrated by Diaz et al., who constructed a light-responsive encapsulin nanoreactor for 'on-demand' production of reactive oxygen species (ROS). To do so, they fused the encapsulation peptide to a mini-singlet oxygen generator (miniSOG), which converts molecular oxygen into ROS in response to blue light. Because ROS are toxic to tumor cells, this nanoreactor could be used in photodynamic therapy, which points to a significant use of encapsulin in cancer treatment [80]. As in BMCs, increasing the pore size of the encapsulin shell protein also enhances the influx of substrate into the encapsulin. To modulate pore size, Williams et al. explored a six-amino-acid loop region that forms one of the native pores at a vertex of encapsulin. They showed that deletions and substitutions of up to seven residues did not interfere with the structure and that these alterations enlarged the pore diameter of encapsulins [81]. We can conclude that both very complex and simple protein compartments can be used to carry out metabolic reactions. Besides hosting synthetic metabolic reactions, these nanocompartments have been used in the synthesis and encapsulation of biomaterials. For example, Giessen and Silver demonstrated size-dependent production of silver nanoparticles in encapsulin and showed ROS generation [82]. 
They fused a silver precipitation sequence (AG4) to the N-terminus of encapsulin; AG4 acts as a nucleation site at which Ag+ precipitates from solution inside the encapsulin [83]. The silver nanoparticles have antimicrobial activity, most likely due to the release of Ag+ ions, which promote the generation of ROS [84]. Künzle et al., in turn, demonstrated the encapsulation of gold nanoparticles in the core of encapsulins. They were the first to use a cargo-loading peptide (CLP) to encapsulate inorganic cargo [85]. Similarly, ferritins have been used in the synthesis of gold and silver nanoparticles. Moglia et al. demonstrated a photochemical reduction method for silver nanoparticle synthesis [86], and Pulsipher et al. demonstrated that gold nanoparticles within ferritin grow in size upon addition of gold ions and a reducing agent [87].

Catalysis
Protein nanocompartments used as nanoreactors can also be explored for catalytic reactions. For example, Bari et al. fabricated gold nanoparticles on the surface of a BMC, with the surface acting as a scaffold. The gold nanoparticles of this hybrid showed standard catalysis while the activity of the BMC enzymes was retained (Figure 5) [6]. Beyond the complete BMC, nanoflowers made in vitro from the shell protein also show catalytic activity. These nanoflowers were reported to show better oxidase and peroxidase activity than morphologically different nanoflowers made from a globular protein [40]. Other PNCs help enzymes maintain their catalytic activity. For instance, Schoonen et al. encapsulated T4 lysozyme (T4L) within the cowpea chlorotic mottle virus (CCMV) capsid to maintain its catalytic activity under physiological conditions. They used a sortase-based conjugation system for T4L encapsulation, in which LPETG was conjugated to T4L and sortase A to CCMV. This conjugation led to the formation of capsids filled with four T4L enzymes, which were reported to be stable and active [88]. Virus-like particles have also been reported as scaffolds for the attachment of enzymes. Roder et al. attached Trichoderma reesei endoglucanase Cel12A to the surface of potato virus X (PVX) using the SpyTag/SpyCatcher system. The attached enzyme converted more substrate than the free enzyme and was thus reported to show better overall catalytic activity [89].

Plant biotechnology
Carboxysomes evolved in cyanobacteria as a consequence of the increasing oxygen concentration in the ancient atmosphere. Oxygen is a competitive substrate for the enzyme RuBisCO, and carboxysomes compartmentalize RuBisCO together with carbonic anhydrase. This compartmentalization raises the local CO2 concentration and therefore enhances carbon fixation. This cyanobacterial adaptation could be engineered into plants to increase photosynthetic capacity and, in turn, agricultural productivity. With this in mind, studies have been carried out to express β-carboxysome genes in the chloroplast, which established the feasibility of expressing carboxysome proteins there [90]. Recently, Long et al. demonstrated the production of simplified carboxysomes from Cyanobium within the tobacco chloroplast. They replaced the endogenous RuBisCO large subunit gene with cyanobacterial Form-1A RuBisCO large and small subunit genes, along with genes for two key α-carboxysome structural proteins. This work showed that carboxysomes can be produced from a minimal set of genes [91]. Another approach, the construction of a carboxysome mimic, was demonstrated by Frey et al., who co-encapsulated two carboxysomal proteins, RuBisCO and carbonic anhydrase, within the protein cage of lumazine synthase (Figure 6). Although no significant kinetic effect on the enzymes was observed, the study demonstrates the use of genetic engineering to build an artificial organelle containing a metabolic pathway intended to enhance photosynthesis [92]. Similarly, the encapsulin protein compartment has been used to encapsulate carboxysomal enzymes, creating a simple carbon-fixation machinery in chloroplasts [4]. The approach of synthesizing a carboxysome, or an alternative to it, inside the chloroplast remains to be explored further. It needs meticulous engineering to make the artificially designed metabolic pathway functional.

Imaging
The amenability of PNCs to modification has been exploited for target-specific imaging. For instance, a magnetic resonance imaging (MRI) contrast agent may be incorporated inside the nanocompartment while a target-specific protein is expressed on its surface. For example, Hu et al. synthesized a bimodal contrast agent using a VLP: they loaded a dysprosium (Dy3+) complex and the near-infrared fluorescence (NIRF) dye Cy7.5 into the cavity of Tobacco mosaic virus (TMV). An Asp-Gly-Glu-Ala (DGEA) peptide, together with a polyethylene glycol linker, was added to the surface of TMV to target integrin α2β1. The modified TMV was stable and reported to be suitable for multiscale MRI scanning [93]. Ferritin, being the primary intracellular iron storage protein, can be used as a reporter to monitor disease progression in vivo with MRI [94]. Taking into consideration the role of ferritin as an MR reporter, Choi et al. overexpressed H-ferritin in tumor cells and used this model for in vivo tumor imaging and for monitoring lymph node metastasis [95]. Likewise, an imbalance of iron homeostasis and different levels of ferritin can be observed using MRI [96].

Other applications
So far, the major applications of PNCs have been discussed. PNCs are also continuously being explored for further applications. Briefly, VLPs have been reported to deliver molecules, e.g. siRNA therapeutics delivered using cowpea chlorotic mottle virus-like particles [97]. Ferritins have also been used in cancer therapy with TRAIL (tumor necrosis factor [TNF]-related apoptosis-inducing ligand) as the anti-cancer agent; TRAIL activates the extrinsic apoptotic pathway by binding to its receptors DR4 or DR5. Yoo et al. displayed TRAIL in a trimer-like conformation to increase its stability, together with IL4rP for specificity. TRAIL-ATNCIL4rP was reported to show enhanced agonistic activity and can thus act as an anti-tumor agent [98]. Jeon et al. demonstrated immunotherapy using a ferritin nanocage. The T-cell response is inhibited by the interaction between programmed cell death 1 ligand 1 (PD-L1) and its receptor, programmed cell death 1 (PD-1). Earlier studies show that monoclonal antibodies are effective in blocking this interaction, but only in some types of cancer. To block the interaction, they displayed 24 PD-L1 peptides on the ferritin surface (PpNF) and added PpNF to co-cultures of T cells and cancer cells, where it was reported to inhibit PD-1/PD-L1 interactions and restore T cell activity [99]. One more application of ferritins, the removal of phosphate from water, was demonstrated by Jacobs et al. They used ferric iron nanoparticles encapsulated in P. furiosus ferritin (PfFrt) as a sorbent for phosphate [100]. These applications show the importance and versatility of protein nanocompartments such as ferritin.
Interestingly, encapsulin has been used to show dual functionality, for example by adding one target-specific affibody protein and one fluorescent protein to its surface. This was done using the SpyTag/SpyCatcher bacterial superglue system. The resulting dual-functional protein compartment can be used for target-specific imaging [101]. Lee et al. explored the role of encapsulin in the production of antimicrobial peptides (AMPs), which are a convenient alternative to antibiotics but whose microbial production has been limited by their bactericidal nature. They used HBCM2, an α-helical AMP that is a hybrid of the cecropin and melittin peptides [102]. The fusion was overexpressed in E. coli and could thus be used for the commercial production of AMPs [103]. Vaults, on the other hand, can be used in the biodegradation of organic contaminants using enzymes such as manganese peroxidase (MnP). INT-conjugated MnP was encapsulated in vaults and reported to show better phenol degradation than the free enzyme, which opens up a new way to use these compartments in a wide range of applications [104]. Another protein cage, the GroEL nanocage (a bacterial chaperonin), was reported for use in a colorimetric assay. Wang et al. demonstrated that Hemin-GroEL has peroxidase-like activity, i.e. it oxidizes 3,3′,5,5′-tetramethylbenzidine (TMB), a chromogenic substrate, in the presence of H2O2. Glucose can therefore be detected by adding glucose oxidase to the sample: the H2O2 produced by the oxidation of glucose drives the oxidation of TMB, and the resulting colour change can be observed [105].
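The coupled glucose assay described above can be summarized as two consecutive reactions. The scheme below is only an illustrative sketch and is not taken from the cited study: the first step is the standard glucose oxidase reaction, the second is written schematically without balanced stoichiometry, and the labels "colourless" and "coloured" are ours.

```latex
\begin{align*}
\text{glucose} + \mathrm{O_2}
  &\xrightarrow{\ \text{glucose oxidase}\ }
  \text{gluconolactone} + \mathrm{H_2O_2} \\
\mathrm{H_2O_2} + \text{TMB (colourless)}
  &\xrightarrow{\ \text{Hemin-GroEL}\ }
  \text{oxidized TMB (coloured)} + \mathrm{H_2O}
\end{align*}
```

Because the H2O2 consumed in the second step is produced only in the first, the colour change reports indirectly on the glucose present in the sample.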

Conclusion and future perspectives
Protein compartments can have a profound effect on translational biotechnology if harnessed properly. With the ease of genetic engineering, proteins can be modified to a desired shape and morphology. An interesting feature of proteins is that they can form intricate structures that can be readily manipulated because their synthesis is genetically directed. The combined efforts of genetic engineering and synthetic biology reviewed above will lead to the in vitro or biogenic production of protein-based materials with smart and tunable properties. Other applications include tailored biochemical reactions for the generation of energy-efficient green chemicals, organic-inorganic hybrid structures for bio-nanotechnology and bio-electronics, and bioremediation strategies for the management of biological and chemical wastes. These protein containers are well suited to the development of theranostic devices through meticulous manipulation of their properties by synthetic biology and genetic means.