Introduction

β-barrel proteins are among the most significant and abundant structural classes in both membrane and cytoplasmic proteins, alongside helical domains. These β-barrel structures are composed of a repertoire of β-strand segments joined by loops. Each strand is hydrogen bonded to its neighbor in an antiparallel or parallel arrangement, thus forming β sheets. Such β sheets twist together to form a closed or open cylinder-like structure. Although membrane and cytoplasmic β-barrel proteins differ in their amino acid propensities at the sequence level, both share similar structural folds. In addition, β-barrel proteins carry out a variety of biological functions, which has gained them widespread attention1,2,3,4,5. Most transmembrane β-barrel (β-barrelsTM) structures are closed and act as pores or channels, representing uniquely shaped Outer Membrane Proteins (OMPs)6,7, while the cytoplasmic β-barrel structures (β-barrelscytoplasm) are a mixture of open and closed structures that function as catalytic and ligand-binding units with diverse topologies8,9,10. Most OMPs have an even number of strands, ranging from about 4 to 26 strands11,12,13,14. On the other hand, most β-barrelscytoplasm proteins are diverse and have between 4 and 10 strands, with fewer proteins having up to 14 strands15. OMP β-barrels are monomeric with a single-chain barrel, whereas barrels of cytoplasmic proteins often show complicated topologies in which more than one barrel occurs within a single chain16. Together, both β-barrelsTM and β-barrelscytoplasm structures are significant for therapeutic targeting because they resemble the channeling system of helical proteins17,18,19.

While β-barrelscytoplasm structures are well determined, β-barrelsTM structure determination is still progressing under ever-increasing challenges related to TM protein purification and crystallization20,21,22. Only a few hundred unique β-barrelsTM structures have been identified, and these constitute only a small fraction of the known structures deposited in the Protein Data Bank (PDB), behind the membrane helical proteins from the genomes sequenced so far. Complementing experimental procedures, computational modeling has been accelerating our understanding of β-barrel proteins23,24,25. However, such analyses have not yet fully illuminated the nature of β-barrel structures because of the complexity of their conformational distributions. Comparison of the distinctive features of β-barrelsTM and β-barrelscytoplasm structures could be simplified by using simple representations. Several descriptions of protein structures have been proposed to show the relationship between secondary structural elements (SSEs) and the essential parameters that govern the structure and its constraints26,27,28,29,30. Elucidation of geometric features using detailed atomic and Cα residue representations is a common approach to characterizing helical and β-sheet packing31,32,33. In addition, various levels of coarse-grained models have been suggested to reduce the complexity34,35,36. However, a basic model that relates to the structural arrangement of both β-barrelsTM and β-barrelscytoplasm structures at the macroscopic level is required to define the similarities and dissimilarities in their connectivity. Because β-strand structural arrangements play a critical role in the function and organization of β-barrel proteins, models that utilize the β-strand conformations would be more useful. Furthermore, such a model would facilitate the understanding of how β-barrel structural organizations endure both membrane and water-soluble environments while utilizing their β sheet arrangements, overall topology, and conformational heterogeneity. Therefore, methods that identify the macroscopic principles underlying β-barrel protein folding and the design factors in the overall structure, function, dynamics, and interactions are promising.

In the present study, we performed a systematic comparison of β-barrelsTM and β-barrelscytoplasm geometries using the joint-based description approach37,38. For comparison, high-resolution, non-homologous β-barrelsTM and β-barrelscytoplasm protein structures were obtained from the PDB and grouped based on their geometric analyses. We examined both β-barrel types in terms of the topological patterns that exhibit diverse structural and functional relationships. We believe that β-barrel structural organizations can be statistically quantified based on β-strand arrangements and conformational distributions. Thus, both β-barrelsTM and β-barrelscytoplasm structures were analyzed in terms of two dihedral angles: β and γ. Our results demonstrate that TM β-strands are predominantly right-handed and antiparallel, as reported previously39,40, and that their conformational flexibility is comparable to that of TM helices40,41,42,43. In contrast, structures of β-barrelscytoplasm are short, mixed, and varied in their arrangements with one another. All the observed conformational distributions and adjacent frequencies were associated with the arrangement of β-strands and loops in the context of how β-barrel structures have adapted to membrane and water-soluble environments, respectively. Despite the overall uniformity, we observed some considerable differences between β-barrelsTM and β-barrelscytoplasm structures that share an equal number of β-strands. The comparative analysis of β-barrel proteins indicates that the joint-based description approach is a powerful descriptor that can be used to improve the geometrical characterization of β-barrels at both local and global levels. Our joint-based descriptor approach may be an asset in β-barrel structure prediction, as well as in the in-silico design, modeling, and validation of topologies with given β-strand and loop segments.

Results

Macroscopic descriptor for β-barrel structures

The present study builds on the recently developed joint-based description of protein structures. To understand the macroscopic geometric features of both TM and cytoplasmic β-barrel proteins, such as their conformational distribution and the arrangements of β-strands, we utilize the joint constraint principles that were previously developed to represent TM helices37. These β-barrel proteins always show consecutive elements of β-strands and loops linked to one another, as described in Fig. 1. As with TM helical proteins, a set of joints connecting the individual β-strands and loops was selected to describe the β-barrel geometry based on the joint-based description approach. Specifically, the Cα coordinates of the first and last residues of each β-strand were considered as structural joining points, as they represent the basic elements of protein structures37,38. The spatial arrangement of such joint points was measured by the dihedral angles between them. For example, for a protein composed of eight β-strands (S1 to S8) and seven loops (L1 to L7), a group of 16 joints (P1 to P16) can be assigned (Fig. 1). The dihedral angle involving the first four joints P1, P2, P3, and P4 can then be determined by measuring the angle between the two planes defined by P1, P2, P3 and by P2, P3, P4. The second dihedral angle is obtained from the joints P2, P3, P4, and P5, the third from P3, P4, P5, and P6, and so on. Here, we propose two new types of dihedral angles, β and γ, for β-strands and loops. The first and third dihedral angles correspond to type β and are denoted β1 and β2, respectively. Similarly, the second and fourth dihedral angles correspond to type γ and are denoted γ1 and γ2, respectively. Thus, the β-strand conformations of both β-barrelsTM and β-barrelscytoplasm proteins can be represented by the set of joints (P1, P2, P3…) and two types of dihedral angles (β1, γ1, β2, γ2, β3…) at the macroscopic level. Clockwise and counter-clockwise angles were assigned a positive value (from 0 to 180 degrees) or a negative value (from −180 to 0 degrees), respectively (Fig. 1).
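
The joint-based angles described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `dihedral` and `joint_dihedrals` are hypothetical, the joints are assumed to be given as (x, y, z) tuples ordered P1, P2, P3, ..., and the sign follows the common IUPAC-style torsion convention (clockwise positive when viewed along the central bond), which is assumed here to match the convention of Fig. 1.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p1, p2, p3, p4):
    """Signed angle (degrees, -180..180) between planes (p1,p2,p3) and (p2,p3,p4)."""
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)   # normals of the two planes
    x = _dot(n1, n2) * math.sqrt(_dot(b2, b2))
    y = _dot(b2, _cross(n1, n2))
    # Note: collinear joints give atan2(0, 0) = 0, which is not meaningful.
    return math.degrees(math.atan2(y, x))

def joint_dihedrals(joints):
    """Label consecutive dihedrals as beta1, gamma1, beta2, gamma2, ...

    Assumption: joints alternate strand start / strand end, so windows
    starting at odd-numbered joints (P1, P3, ...) are of type beta and
    windows starting at even-numbered joints (P2, P4, ...) are of type gamma.
    """
    labelled = []
    for k in range(len(joints) - 3):
        kind = "beta" if k % 2 == 0 else "gamma"
        labelled.append((f"{kind}{k // 2 + 1}", dihedral(*joints[k:k + 4])))
    return labelled
```

For a barrel with eight strands (16 joints), `joint_dihedrals` would return 13 labelled angles, seven of type β and six of type γ.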

Figure 1

Joint-based description of β-barrel proteins with eight strands and seven loops. (a) Assignment of the β type and γ type dihedral angles. S1 to S8 are β-strands, L1 to L7 are loops, and P1 to P16 are joint points. Type β dihedral angles, such as β1, are defined by the four joint points in the Strand-Loop-Strand, such as P1, P2, P3, and P4. The γ-type dihedral angles are defined by the four joint points in the Loop-Strand-Loop, such as P2, P3, P4, and P5. (b) Assignment of the positive and negative signs for dihedral angles. The positive (+) sign and negative (−) signs represent the clockwise and counter-clockwise angles, respectively, in the projections for the dihedral angles. The figures present the projections for β1 and γ1.

Dataset collection of target β-barrel proteins

For β-barrelsTM, 29 proteins were collected as an initial dataset: proteins with an even number of TM β-strands from 4 to 26, as well as a protein with an odd number (19) of TM β-strands. For β-barrelscytoplasm, a total of 51 proteins from several superfamilies were chosen. The numbers of TM and cytoplasmic β-barrel proteins with their various topologies, β-strands, and loop numbers are listed in Table 1 and Table 2, respectively. These datasets were obtained directly from the Protein Data Bank (PDB) and grouped into β-barrelsTM and β-barrelscytoplasm proteins. The dataset collection procedure is described in the Methods section. Briefly, in both cases, (i) we applied a 30% sequence criterion and filtered the high-resolution X-ray crystal structures from the PDB, (ii) we separated the unique monomeric β-barrel proteins, and (iii) the non-homologous monomeric chains were grouped by their number of β-strands as the final target proteins, as shown in Fig. 2(a,b). Finally, the target datasets of 29 and 51 β-barrel structures were analyzed, and all the resulting β and γ angles were grouped and classified (Supporting Information, SI Tables 1, 2, 3 and 4).

Table 1 Selected non-homologous β-barrelsTM protein structures (29), their PDB IDs, and the total number of β type and γ type dihedral angles used in this study.
Table 2 Selected non-homologous β-barrelscytoplasm protein structures (51), their PDB IDs, and the total number of β type and γ type dihedral angles used in this study.
Figure 2

Total number of β-barrel proteins for (a) membrane and (b) cytoplasmic cases. (a) β-barrelsTM target dataset of 29 structures was grouped based on the TM β-strand numbers and (b) 51 β-barrelscytoplasm structures were also classified based on their β-strand numbers. Note: β-barrelscytoplasm include both closed and open structures.

Comparison of β and γ distributions and overall arrangements of both β-barrelsTM and β-barrelscytoplasm

In this section, we compare the β-γ plots for β-barrelsTM and β-barrelscytoplasm at the macroscopic level, in analogy with the Ramachandran ϕ-ψ plot at the atomistic level, to determine the distribution density of the dihedral angles. Like the Ramachandran map of peptide distributions, our β-γ plot shows favorable and unfavorable regions for β-strands and loops, as given in Fig. 3(a,b). The overall distribution can help in understanding the similarities and dissimilarities in their structural arrangement patterns. The overall β and γ distributions are restricted to certain regions for β-barrelsTM, indicating conformationally similar arrangements, as observed in the Ω-λ plot for TM helical proteins37. On the other hand, the β and γ distributions for β-barrelscytoplasm are sparsely spread over larger regions, signifying more dissimilar arrangement patterns than in their membrane counterparts. Conventionally, the relative conformations of adjacent structural elements, i.e. neighboring elements at both the residue and secondary structure levels, influence each other, and this can be identified from the distribution of their dihedral angles44,45,46. Our new joint-derived dihedral angles, β and γ, were related to the structural arrangement of β-strands in both β-barrelsTM and β-barrelscytoplasm architectures, just as previously demonstrated for TM helical proteins using the Ω and λ dihedral angles37. The dihedral angle between the joints is associated not only with the arrangements of the individual β-strands and loops but also with the dependency of adjacent secondary structural element orientations. If β-strands and loops are simplified as shown in Fig. 1, β-strand arrangements can be explained by the β type dihedral angles between the ith β-strand (Si) and its neighboring (i + 1)th β-strand (Si+1). Similarly, the type γ angle can represent the arrangement of the loop segments between the ith loop (Li) and its neighboring (i + 1)th loop (Li+1). In addition, the type γ dihedral angle also provides the relative arrangement between the ith β-strand (Si) and the (i + 2)th β-strand (Si+2). Figure 4 shows specific relations between the dihedral angles and the β-strand (or loop) arrangements. When the dihedral angle βi is close to 0 degrees, β-strand Si is antiparallel to the adjacent β-strand Si+1 (Fig. 4(a)). Similarly, when βi is close to ±180 degrees, β-strand Si is parallel to the adjacent β-strand Si+1 (Fig. 4(b)). When the dihedral angle γi is close to 0 degrees, loop Li is antiparallel to the adjacent loop Li+1, and β-strands Si and Si+2 are on the same side with respect to β-strand Si+1 (Fig. 4(c)). Meanwhile, when γi is close to ±180 degrees, loop Li is parallel to the adjacent loop Li+1, and β-strands Si and Si+2 are on opposite sides with respect to β-strand Si+1 (Fig. 4(d)). Here, we compare the geometric characteristics of β-barrelsTM and β-barrelscytoplasm through the arrangements of consecutive β-strands and loops by investigating the distribution of the β and γ dihedral angles.
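
The limiting cases above (angles near 0° versus near ±180°) can be turned into a simple classifier. The sketch below is illustrative only: the function names and the tolerance `tol` of 45° are assumptions introduced here, since the text says only "close to" 0° or ±180° and gives no numerical cutoff.

```python
def angular_distance(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = (a - b) % 360.0
    return min(d, 360.0 - d)

def classify_beta(beta_deg, tol=45.0):
    """Arrangement of strands S_i and S_i+1 from a beta-type angle.

    tol is a hypothetical cutoff, not a value from the study.
    """
    if angular_distance(beta_deg, 0.0) <= tol:
        return "antiparallel"
    if angular_distance(beta_deg, 180.0) <= tol:
        return "parallel"
    return "intermediate"

def classify_gamma(gamma_deg, tol=45.0):
    """Position of S_i+2 relative to S_i (with respect to S_i+1) from a gamma-type angle."""
    if angular_distance(gamma_deg, 0.0) <= tol:
        return "same side"
    if angular_distance(gamma_deg, 180.0) <= tol:
        return "opposite side"
    return "intermediate"
```

Using a circular distance means that −170° and +170° are treated alike, both being close to ±180°.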

Figure 3

Distribution of the β type and γ type dihedral angles in the β-barrel proteins. (a) The β-γ distribution plot for the β-barrelsTM proteins. All β and γ type dihedral angles in the 29 non-homologous β-barrelsTM proteins are plotted together in the 2-D scatter plot. (b) The β-γ distribution plot for the β-barrelscytoplasm proteins. All β and γ type dihedral angles in the 51 non-homologous β-barrelscytoplasm proteins are plotted.

Figure 4

Arrangements of the β-strands and loops depending on the β and γ dihedral angles. The central figure presents the front view of three consecutive β-strands in both β-barrelsTM and β-barrelscytoplasm proteins. S(n) represents the β-strands and L(n) represents the loops. (a) Front view of the arrangement of two adjacent β-strands, Si and Si+1, when βi = 0°, (b) Front view of the arrangement of two adjacent β-strands, Si and Si+1, when βi = ±180°, (c) Top view of the arrangement of two adjacent loops, Li and Li+1, and three adjacent β-strands, Si, Si+1 and Si+2, when γi = 0°, and (d) Top view of the arrangement of two adjacent loops, Li and Li+1, and three adjacent β-strands, Si, Si+1 and Si+2, when γi = ±180°.

The overall distribution plots for all β and γ angles of the β-barrelsTM (29) and β-barrelscytoplasm (51) non-homologous protein structures are shown in Fig. 5. In both cases, the β type dihedral angles were mostly confined to the range of −30° to +30°. Figure 5(a) shows that more than 80% of the β distribution for β-barrelsTM lies between 0° and +30°, where the frequency of clockwise (positive) angles is clearly higher than that of counter-clockwise (negative) angles. The most favored region for TM β dihedral angles was between +10° and +20°, reflecting their restricted distribution space. Meanwhile, β type distributions for β-barrelscytoplasm were observed over almost the entire range of −150° to +150°, with a strong preference for 0° to +30°, as given in Fig. 5(b). In contrast, type γ dihedral angles were distributed over the entire possible range of −180° to +180° for both β-barrelsTM and β-barrelscytoplasm proteins. However, the TM γ distribution shows dominant peaks in the +90° to +180° region (Fig. 5(c)), with the strongest preference between +100° and +150°, whereas type γ dihedral angles for β-barrelscytoplasm are widely distributed with roughly equal preference (Fig. 5(d)). Interestingly, TM γ type dihedral angles occur more frequently in clockwise than in counter-clockwise orientations, suggesting that adjacent β-strands tend to adopt a right-handed arrangement. Together, the topological arrangements of TM β-strands differ from those of cytoplasmic β-strands in the overall β-barrel architecture, i.e. both the β and γ dihedral angle distributions differ between the two groups. Membrane β-strands tend to pack within a smaller conformational space, whereas cytoplasmic β-strand arrangements are more varied. The propensities of β-strands for antiparallel and right-handed orientations in TM and soluble proteins have been reported in several protein packing studies47,48. Overall, β-barrelsTM and β-barrelscytoplasm share a related conformational distribution space, with significant variations observed in the cytoplasmic structures.
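
Statements such as "more than 80% of the β distribution lies between 0° and +30°" correspond to simple counting over the collected angles. The following sketch shows one way such a fraction and a binned histogram might be computed; the function names and the 30° bin width are assumptions made for illustration, and no data from the study are reproduced.

```python
import math
from collections import Counter

def fraction_in_range(angles, low, high):
    """Fraction of angles (degrees) with low <= angle <= high."""
    if not angles:
        raise ValueError("no angles given")
    return sum(low <= a <= high for a in angles) / len(angles)

def histogram(angles, bin_width=30):
    """Count angles per bin, keyed by the lower edge of each bin.

    Bins cover -180..180; an angle of exactly +180 is placed in the last bin.
    bin_width is assumed to be an integer divisor of 180.
    """
    counts = Counter()
    for a in angles:
        start = math.floor(a / bin_width) * bin_width
        counts[min(start, 180 - bin_width)] += 1
    return dict(sorted(counts.items()))
```

Applied to all β angles of one dataset, `fraction_in_range(betas, 0, 30)` would give the kind of percentage quoted for Fig. 5(a), and `histogram` the bar heights of a coarse version of the plots in Fig. 5.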

Figure 5

The overall distribution of β and γ dihedral angles. (a) Histogram showing the distribution of β dihedral angles for β-barrelsTM proteins. (b) Histogram showing the distribution of β dihedral angles for β-barrelscytoplasm proteins. (c) Histogram showing the distribution of γ dihedral angles for β-barrelsTM proteins. (d) Histogram showing the distribution of γ dihedral angles for β-barrelscytoplasm proteins.

In terms of arrangements, Fig. 5(a,b) shows that the β type dihedral angles prefer a narrow range as the main accessible region, in which two neighboring β-strands (Si and Si+1) are most likely to be arranged antiparallel, as depicted in Fig. 4. It should also be noted that cytoplasmic β-strands show antiparallel, parallel, and mixed arrangements in the β-barrel proteins. Despite the sparse distribution over the entire possible range of −180° to +180°, TM γ-type dihedral angles were predominantly distributed in the 30° to +180° region according to Fig. 4(c). However, the overall orientations of Li and Li+1 indicate that neighboring loops of β-barrelsTM appear to exist in both antiparallel and parallel arrangements. On the other hand, cytoplasmic γ-type dihedral angles do not show a strong preference for any region, consistent with all possible antiparallel, parallel, and mixed arrangements. Thus, cytoplasmic β-strands Si+2 are freely arranged on either the same side as or the opposite side from β-strand Si.

Comparison of local patterns

A comparison of the local patterns of TM β-barrels and cytoplasmic β-barrel geometries was carried out by analyzing the nearest-neighbor frequencies of β and γ-type dihedral angles. Measuring the dihedral angle βi enables the prediction of the arrangement of neighboring β-strands (Si and Si+1), and the relative positions of the two β-strands Si and Si+2 can be determined by measuring the dihedral angle γi. Measurements of consecutive dihedral angles therefore allow us to predict how the β-strands in β-barrel proteins are successively arranged and to what extent they are twisted and varied. For instance, βi and βi+1 together determine the arrangement of Si, Si+1, and Si+2, and γi and γi+1 may allow the prediction of the positions of Si+2 and Si+3 relative to Si and Si+1. Here, to study the β-strand arrangements in both β-barrelsTM and β-barrelscytoplasm systems, we examined the local patterns of consecutive dihedral angle clusters such as βi − βi+1, γi − γi+1, βi − βi+1 − βi+2 and γi − γi+1 − γi+2. Most of the β-strand structures in β-barrelsTM showed (+, +) and (+, +, +) patterns for the dyad and triad clusters, respectively, and the same trend was found in the signature patterns of β-barrelscytoplasm as well. Interestingly, the (+, +) signature pattern indicates a strong bias toward right-handed topology.
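
The dyad and triad signatures discussed here amount to counting sign patterns over sliding windows of consecutive angles. A minimal sketch follows, with hypothetical function names; it assumes that an angle of exactly 0° is counted as positive, a boundary choice made here for illustration, and uses the ASCII "-" for the negative sign.

```python
from collections import Counter

def sign(angle):
    """'+' for clockwise (0 to +180), '-' for counter-clockwise (below 0)."""
    return "+" if angle >= 0 else "-"

def pattern_counts(angles, n):
    """Count sign patterns over every window of n consecutive angles.

    n=2 gives dyads such as ('+', '+'); n=3 gives triads such as ('+', '+', '+').
    angles must all be of one type (all beta or all gamma) from one protein.
    """
    signs = [sign(a) for a in angles]
    return Counter(tuple(signs[i:i + n]) for i in range(len(signs) - n + 1))
```

Summing the counters over all proteins of a dataset would give frequencies of the kind compared across the four dyad and eight triad patterns.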

Dihedral angles of β and γ-type were categorized into two groups: clockwise, with positive values (from 0 to +180 degrees), and counter-clockwise, with negative values (from −180 to 0 degrees). Then, the dihedral angles (β and γ) for both β-barrelsTM (29) and β-barrelscytoplasm (51) structures were interpreted as combinations of signatures (positive and negative signs) (SI Tables 5, 6, 7 and 8). First, the dyad signature (i.e. two adjacent/consecutive units) distribution patterns of βi − βi+1 and γi − γi+1 were examined for both datasets, and they indicated a strong tendency towards the (+, +) pattern (Fig. 6). For the βi − βi+1 cluster, the frequency of the (+, +) pattern was clearly higher than that of the others for both β-barrelsTM and β-barrelscytoplasm (Fig. 6(a,b)). The frequency of the (+, +) pattern was also higher than that of the others for the TM γi − γi+1 cluster (Fig. 6(c)). Only the cytoplasmic γi − γi+1 cluster showed a moderately different trend across its groups (Fig. 6(d)). For example, the (−, −) pattern shows considerable frequency there, whereas it is the least common pattern in all other dyad clusters. Second, we examined the triad sequential signature patterns (i.e. three adjacent/consecutive units) of β type and γ type dihedral angles for both β-barrelsTM and β-barrelscytoplasm structures. The βi − βi+1 − βi+2 and γi − γi+1 − γi+2 clusters were predominantly of the (+, +, +) pattern among the eight possible combinations (Fig. 7(a–c)). Patterns such as (−, +, +), (+, +, −), and (+, −, +) were also observed at noticeable frequencies. However, the cytoplasmic γi − γi+1 − γi+2 cluster (Fig. 7(d)) showed a relatively different tendency across its signature patterns. Finally, the γ-type dihedral angles of TM and cytoplasmic proteins were compared. Both γ-type distributions were found over the entire range from −180° to +180°. Subsequently, the dihedral angle space was divided into four quadrants, (I) +A: 0° to +90°, (II) −A: −90° to 0°, (III) +B: +90° to +180°, and (IV) −B: −180° to −90°, for further analysis of the adjacent frequency distribution. Within these quadrants, the pattern of the γi − γi+1 cluster was studied in more detail. Among all 16 possible combinations of the γi − γi+1 cluster, the most abundant dihedral angle distribution was observed in the range of (+90° to +180°, +90° to +180°) for β-barrelsTM (Fig. 8(a)). There is no such clear preference for β-barrelscytoplasm. However, a few positive regions, (AA), (BB), (AB) and (BA), with relatively higher propensities were commonly observed in both cases. The observed β-strand and loop angles are strongly positive, corresponding to a right-handed twist orientation. Altogether, the βi − βi+1 and γi − γi+1 signature analysis revealed that the positive sign is clearly dominant for adjacent β-strands, which can be correlated with the right-handed twist.
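
The four-region split of the γ angle space and the resulting 16 dyad combinations can be sketched as follows. The region names follow the text (+A, −A, +B, −B, written with an ASCII "-" in code); how angles lying exactly on a boundary (0°, ±90°) are assigned is not stated in the text, so the choice below is an assumption, as are the function names.

```python
from collections import Counter

def region(angle):
    """Map a gamma-type angle (degrees) to one of '+A', '-A', '+B', '-B'.

    Assumed boundaries: +A is [0, 90), +B is [90, 180],
    -A is [-90, 0), -B is [-180, -90).
    """
    if not -180 <= angle <= 180:
        raise ValueError("angle must lie in [-180, 180]")
    if angle >= 90:
        return "+B"
    if angle >= 0:
        return "+A"
    if angle >= -90:
        return "-A"
    return "-B"

def region_dyads(gammas):
    """Count (region_i, region_i+1) pairs over consecutive gamma angles."""
    labels = [region(g) for g in gammas]
    return Counter(zip(labels, labels[1:]))
```

In this notation the most abundant TM combination reported for Fig. 8(a) corresponds to the key `("+B", "+B")`.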

Figure 6

Frequencies of the dyad patterns for consecutive β or γ type dihedral angles when β or γ type angles are categorized as (+) and (−). The bar diagrams show the observed numbers of (a) four different patterns of two consecutive β type angles, βi − βi+1 for β-barrelsTM proteins. (b) β-barrelscytoplasm proteins showing four different patterns of two consecutive β type angles, βi − βi+1 (c) Four different patterns of two consecutive γ type angles, γi − γi+1, for β-barrelsTM proteins. (d) β-barrelscytoplasm proteins showing four different patterns of two consecutive γ type angles, γi − γi+1.

Figure 7

Frequencies of the triad patterns for consecutive β or γ type dihedral angles when β or γ type angles are categorized as (+) and (−). The bar diagrams show the observed numbers of (a) eight different patterns of three consecutive β type angles, βi − βi+1 − βi+2, for β-barrelsTM proteins. (b) β-barrelscytoplasm proteins eight different patterns of three consecutive β type angles, βi − βi+1 − βi+2. (c) Eight different patterns of three consecutive γ type angles, γi − γi+1 − γi+2, for β-barrelsTM proteins. (d) β-barrelscytoplasm proteins eight different patterns of three consecutive γ type angles, γi − γi+1 − γ i+2.

Figure 8

Frequencies of the dyad patterns for consecutive γ type dihedral angles when the γ type dihedral angle was split into four regions. The bar diagram shows the observed numbers of the 16 different patterns of two consecutive γ type angles, γi − γi+1. Here, all γ type angles were split into four regions, i.e. +A (0° to +90°), +B (+90° to +180°), −A (−90° to 0°), and −B (−180° to −90°). (a) Out of the 16 patterns from the combinations of two consecutive γ type angles, γi − γi+1, the [BB] pattern was abundant for β-barrelsTM proteins (29), and (b) for β-barrelscytoplasm proteins, there is no obvious preference among the patterns in the 51-protein non-homologous dataset.

The twisted property of β-strands can be explained by both the conformation of individual β-strands and the relative orientations of adjacent β-strands using the β and γ type dihedral angles. Figure 9 shows a conventional image of β-barrel proteins with several β-strands, of the kind previously used to portray adjacent TM helical arrangements. Figure 9(a,b) shows the schematic arrangement of three consecutive β-strands, i.e. Si, Si+1, and Si+2, depending on the pattern of the βi − βi+1 cluster, in the front and side views, respectively. As described for Fig. 4, the four different patterns (+, +), (+, −), (−, +) and (−, −) determine four different types of arrangement between Si and Si+2 in parallel (Fig. 9(b)). The dominance of the (+, +) and (+, +, +) patterns in the βi − βi+1 and βi − βi+1 − βi+2 clusters indicates that β-barrels favor a twisted pattern. Figure 9(c) shows a schematic picture of β-barrel proteins with several β-strands in top view. It shows how the β-strands extend depending on the pattern of the γi − γi+1 cluster. The (−, −) and (+, +) patterns suggest that β-strands are set in one direction with a distorted pattern such that they form a twisted topology. The (+, −) and (−, +) patterns, however, show that β-strands are arranged such that they are tightly packed. The clear preference for the (+, +) and (+, +, +) patterns in the γi − γi+1 and γi − γi+1 − γi+2 clusters suggests that β-strands in the membrane proteins favor arrangements with a right-handed twisted pattern. Similarly, the dominant dihedral angle distribution of (90° to 180°, 90° to 180°) among the 16 possible patterns of the γi − γi+1 cluster (Fig. 8) implies that TM β-strands have a stronger angle preference in their twisted arrangement than cytoplasmic ones. These dihedral angle preferences and the right-handed properties of the β-strands are strongly interrelated with the overall β-barrel architectures. The specific conformations of the βi − βi+1 and γi − γi+1 clusters suggest that adjacent positive angles may provide continued stability; thus, β-strand twisting possibly controls the conformational space.

Figure 9

Relation between the dihedral angle patterns and β-strand arrangements or extensions. (a) Front view of the linearly ordered β-strands in β-barrel proteins with β-strands and loops. (b) Side view of the arrangement of three consecutive β-strands Si − Si+1 − Si+2 depending on the four different patterns of two consecutive β type angles, βi − βi+1. (c) Top view configuration of four consecutive β-strands, Si − Si+1 − Si+2 − Si+3, for the four different patterns of two consecutive γ type angles, γi − γi+1.

The right-handed orientation and antiparallel arrangement of β-barrelsTM can be related to the distributions of β type and γ-type dihedral angles found between +10° and +20° and between +100° and +150°, respectively. A few TM γ-type dihedral angles show a local trend of relatively small values for short intracellular loops and larger values for long extracellular loops, whereas no such bias was observed for β-barrelscytoplasm proteins. In general, conformational energetics strongly favors antiparallel β-strand arrangements, which have an advantage in stability in terms of packing interactions49. It is also known that antiparallel barrel folds are particularly stable when the strand order is consecutively right-handed50. The combined orientations of individual strands and adjacent strands produce an overall closed barrel or a partially opened or distorted barrel. Structurally, right-handed connectivity is common to all β-strands embedded in the finite hydrophobic membrane, which prefer antiparallel arrangements. Meanwhile, no specific trends were found for β-barrelscytoplasm proteins in terms of their γ-type angles. Both TM and cytoplasmic adjacent loops were observed to lie on opposite sides, with no direct interaction between them. Although loops are arranged in random orientations, which can maximize entropy, they adopt a few recurrent conformations (adjacent loop orientations can be specific to protein families). We subsequently quantified the respective dihedral angle distributions in relation to the number and position of β-strands for both β-barrelsTM and β-barrelscytoplasm proteins.

Comparison of the effects of β-strand number and position in both β-barrels

We also examined the similarities and dissimilarities between β-barrelsTM and β-barrelscytoplasm in terms of the number and position of β-strands. As the number of β-strands varies, the arrangements of β-strands and loops may change because of changes in the interaction energy between β-strands and in the entropy of the flexible loop regions. The distributions of β and γ type dihedral angles were plotted according to their relative positions in the β-strands for the TM fold (Fig. 10(a,c)) and the cytoplasmic fold (Fig. 10(b,d)), respectively. β-barrelsTM proteins are generally larger than β-barrelscytoplasm proteins, so fewer β-strands are observed in cytoplasmic folds. The β and γ type dihedral angle distributions were also plotted according to the number of β-strands for TM configurations (Fig. 11(a,c)) and cytoplasmic configurations (Fig. 11(b,d)), respectively. For β-barrelsTM proteins with 4, 10, 12, 14 and 19 β-strands, the γ type distribution is restricted to specific accessible regions, whereas a rather sparse distribution was noted for proteins with 16, 18, 22 and 26 β-strands (Fig. 11(c)). For β-barrelscytoplasm, proteins with 5, 6, 7 and 8 β-strands were found to be the major groups determining the distributions (Fig. 11(b)). In both cases, the distributions of β and γ type dihedral angles were quite different from the overall distributions shown in Fig. 5. These results suggest that the dihedral angles are significantly affected by the relative position of β-strands and loops as the number of β-strands varies in both groups. Further, the distribution patterns of the βi − βi+1 and γi − γi+1 clusters were analyzed for three different size groups in both cases.
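
The two views used here, angles grouped by their position index i and angles grouped by the number of β-strands in the protein, are both simple regroupings of the per-protein angle lists. A minimal sketch follows; the function names and the input layout are assumptions chosen for illustration, not the authors' data format.

```python
from collections import defaultdict

def group_by_position(angle_lists):
    """Collect the i-th angle of every protein under key i (1-based).

    angle_lists: iterable of per-protein lists, e.g. [beta1, beta2, ...].
    """
    by_position = defaultdict(list)
    for angles in angle_lists:
        for i, angle in enumerate(angles, start=1):
            by_position[i].append(angle)
    return dict(by_position)

def group_by_strand_count(entries):
    """Pool angles of all proteins that share the same number of strands.

    entries: iterable of (n_strands, angles) pairs, one per protein.
    """
    by_count = defaultdict(list)
    for n_strands, angles in entries:
        by_count[n_strands].extend(angles)
    return dict(by_count)
```

The first grouping corresponds to the x-axis of scatter plots such as Fig. 10, the second to that of Fig. 11.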

Figure 10

Dihedral angle distributions in the β-barrelsTM and β-barrelscytoplasm proteins depending on their configurations. (a) Scatter plot of the β type dihedral angles of β-barrelsTM proteins depending on their configurations. (b) Scatter plot of β type dihedral angles of β-barrelscytoplasm proteins depending on their configurations. (c) Scatter plot of the γ type dihedral angles of β-barrelsTM proteins depending on their configurations. (d) Scatter plot of γ type dihedral angles of β-barrelscytoplasm proteins depending on their configurations. For (a–d), all the ith dihedral angles (βi or γi values) of both β-barrelsTM (29) and β-barrelscytoplasm (51) non-homologous β-barrel proteins were collected and plotted against βi or γi on the x-axis.

Figure 11

Dihedral angle distributions in the β-barrelsTM and β-barrelscytoplasm proteins depending on the number of β-strands. (a) Scatter and line plot of the β type dihedral angles of β-barrelsTM proteins depending on the number of β-strands. (b) Scatter and line plot of the β type dihedral angles of β-barrelscytoplasm proteins depending on the number of β-strands. (c) Scatter and line plot of the γ type dihedral angles of β-barrelsTM proteins depending on the number of β-strands. (d) Scatter and line plot of the γ type dihedral angles of β-barrelscytoplasm proteins depending on the number of β-strands. For (a–d), all the ith dihedral angles (βi or γi values) of both the β-barrelsTM (29) and β-barrelscytoplasm (51) non-homologous β-barrel proteins are given on the y-axis and plotted against the number of β-strands on the x-axis.

To examine the size effect, membrane β-barrel structures were grouped into smaller (4–10 TM β-strands), medium (12–16 TM β-strands) and larger (18–26 TM β-strands) β-barrel proteins (Fig. 12(a,b)). Cytoplasmic β-barrel structures were grouped into 4–6N, 7–8N and 10–14N β-strands, where N denotes the number of β-strands (Fig. 13(a,b)). As shown in Fig. 12(a), the (+, +) pattern of the βi − βi+1 cluster appears most frequently for proteins with 4–10 TM β-strands. The pattern remains the same as the number of TM β-strands increases from 4–10 to 12–16 and to 18–26. However, the (−, +), (+, −) and (−, −) patterns also appear notably in the 18–26 TM β-strands group compared with the other groups. The dihedral angles β17, β20 and β21, which belong to β-barrel proteins with 18–24 TM β-strands, are sparsely distributed at negative angles (−60° to 0°). Even so, the (+, +) pattern remains the dominant one in all groups. For the γi − γi+1 cluster, the (+, +) pattern likewise appears most frequently for proteins with 4–10 TM β-strands, and it becomes more prevalent in the 12–16 and 18–26 TM β-strands groups as the number of TM β-strands increases (Fig. 12(b)). The (+, +) pattern is therefore abundant in all TM β-strands groups. Although the (+, +) pattern is dominant, the (−, +), (+, −) and (−, −) patterns are also observed to a considerable extent for both the βi − βi+1 and γi − γi+1 clusters of the larger TM structures (18–26 TM β-strands).

Cytoplasmic β-barrel proteins, on the other hand, show slightly different trends. Although the cytoplasmic βi − βi+1 cluster follows the same preferred (+, +) pattern in all groups, as observed for TM proteins (Fig. 13(a)), the trends for the γi − γi+1 cluster are different. As shown in Fig. 13(b), for the γi − γi+1 cluster, the (+, −) pattern is the least common pattern in the small proteins with 4–6N β-strands. However, the pattern shifts as the number of β-strands (N) increases from 4–6N to 7–8N, where the (+, −) pattern becomes dominant, although the (+, +), (−, +) and (−, −) patterns are also observed substantially. For 10–14N β-strands, the (+, +), (+, −), (−, +) and (−, −) patterns are all equally dominant. Despite the overall uniformity of their structural arrangements, these results imply that the β-strands of both β-barrel structures prefer to be positioned in specific patterns for efficient packing as the number of β-strands changes. As with the overall distribution, the dihedral angles (β and γ) and the signature preferences for different β-strand numbers and positions differ between β-barrel proteins of membrane and cytoplasmic origin.
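The size grouping used above can be written down compactly. The following Python sketch is only an illustration of the binning by β-strand number; the dictionary labels and the function name `size_group` are hypothetical and are not taken from the analysis software.

```python
# Strand-number ranges as stated in the text (inclusive bounds).
# The labels themselves are illustrative names, not the authors' identifiers.
TM_GROUPS = {"smaller": (4, 10), "medium": (12, 16), "larger": (18, 26)}
CYTO_GROUPS = {"4-6N": (4, 6), "7-8N": (7, 8), "10-14N": (10, 14)}


def size_group(n_strands, groups):
    """Return the label of the group containing n_strands, or None."""
    for label, (low, high) in groups.items():
        if low <= n_strands <= high:
            return label
    return None  # strand number falls outside every stated range


if __name__ == "__main__":
    for n in (8, 14, 22):
        print(n, size_group(n, TM_GROUPS))
```

Strand numbers that fall between the stated ranges (for example 11 TM β-strands) are deliberately left unassigned rather than forced into a neighboring group.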

Figure 12

Frequencies of the patterns for two consecutive β or γ type dihedral angles depending on the β-barrelsTM proteins with a different number of β-strands. All the 29 non-homologous β-barrelsTM proteins were categorized into three groups, i.e., proteins with 4 to 10 TM β-strands, 12 to 16 TM β-strands, and 18 to 26 TM β-strands. The bar diagrams show the observed numbers of (a) four different patterns of two consecutive β type angles, βi − βi+1, in each group, and (b) four different patterns of two consecutive γ type angles, γi − γi+1, in each group.

Figure 13

Frequencies of the patterns for two consecutive β or γ type dihedral angles depending on the β-barrelscytoplasm proteins with a different number of β-strands. All the 51 non-homologous β-barrelscytoplasm proteins were categorized into three groups, i.e., proteins with 4 to 6N β-strands, 7 to 8N β-strands, and 10 to 14N β-strands. The bar diagrams show the observed numbers of (a) four different patterns of two consecutive β type angles, βi − βi+1, in each group, and (b) four different patterns of two consecutive γ type angles, γi − γi+1, in each group.

Discussion

The present study provides a systematic comparison of the macroscopic features of β-barrelsTM and β-barrelscytoplasm proteins, offering a simpler way to understand the structural organization of complex proteins. Furthermore, our analysis helps to differentiate their β-strand arrangements by examining the overall conformational space, the signature patterns, and the positions and numbers of β-strands. For the comparative analysis of TM β-barrels and cytoplasmic structures, the joint-derived dihedral angles of β-strands were used. The β and γ dihedral angles are macroscopic descriptors of protein structure based on a joint-based approach37,38. In both cases, specific allowed and disallowed regions of conformational space were found. Here, we used non-homologous datasets of β-barrelsTM and β-barrelscytoplasm proteins for the comparison; the approach can be extended to compare the conformational heterogeneities and geometric features of other polymers. It is well known that adjacent β-strands in the membrane are mostly arranged antiparallel, owing to confinement inside the membrane that favors stability and conformational energetics49,50. The overall dihedral angle distributions of the β and γ types suggest that two neighboring β-strands adopt an antiparallel orientation in TM proteins, as shown in Fig. 3(a). Cytoplasmic β-strands, on the other hand, have greater conformational flexibility and can adopt antiparallel, parallel and mixed arrangements. In addition, the analysis of adjacent dihedral signature patterns, βi − βi+1 and γi − γi+1, strongly supports the view that TM β-strands favor right-handed orientations, with the (+, +) pattern clearly dominant. For cytoplasmic proteins, the same (+, +) signature pattern is abundant, again indicating a preference for right-handed orientations. Overall, the (+, +) pattern was dominant over the other signatures for β-strand arrangements. Although the overall γ-type distribution differs between the two cases, both share the (+, +) pattern for loop-loop arrangements. Based on these results, we propose that the right-handed orientation is closely related to the stereochemistry of the interactions between β-strands at the atomic level. Presumably, the successive combination of (+, +) patterns for βi − βi+1 (orientations of β-strands) and γi − γi+1 (orientations of loops) is advantageous for TM β-strand packing. With their relatively greater conformational space, cytoplasmic proteins tend to have slightly more loosely packed β-strands. Together, these results indicate both similarities and dissimilarities in packing between β-barrelsTM and β-barrelscytoplasm proteins.

The differences in the overall β and γ type dihedral angle distributions according to the number of β-strands reveal additional specifics. Our results support the view that, as the number of TM β-strands changes, the positioning of β-strands in β-barrel proteins is restricted more strongly inside the membrane lipid bilayer than in the cytoplasmic environment. They may also help to explain the differences in overall structure, which are reflected in the variability of the β and γ type dihedral angle distributions at the macroscopic level. This implies that the local arrangement of β-strands and loops is affected by, and varies with, the number of β-strands. The γi − γi+1 cluster analysis also shows a clear preference for a “right-handed” conformation as the number of β-strands varies, implying that a right-handed orientation is the optimized form required for efficient TM β-strand packing. In contrast, the (+, −) pattern appears marginally more often than the (+, +) pattern in cytoplasmic proteins with 4–6N β-strands. In cytoplasmic proteins, the arrangements of β-strand orientations were clearly diverse and depended on the β-strand positions. Taken together, our results indicate that the β-barrel structure in the lipid membrane environment is more tightly constrained than cytoplasmic β-barrel structures at both the local and global levels.

Such macroscopic geometric analyses provide useful clues for improving in silico protein structure and function prediction, modeling and validation. The relationship between the dihedral angles and β-strand arrangements can inform β-barrel engineering and design principles. In general, any rule relating to such structural properties helps in comparing and classifying protein functions. Here, antiparallel arrangements and right-handed orientations are common to most of the β-strand structures, while mixed loop arrangements are also essential for β-barrel formation. Non-adjacent and other combinations of β and γ type dihedral angles might further be used to distinguish interacting from non-interacting parts of the β-strands. In future, combining the macroscopic joint description of protein structures at the coarse-grained level with all-atom or Cα residue level descriptions may improve our understanding of large structural rearrangements and the dynamic behavior of complex structures.

Methods

Datasets as target β-barrel proteins used in this study

We first selected all membrane proteins from the Protein Data Bank using the keyword “Membrane proteins” and then, using the selection mode, separated the X-ray crystal structures of β-barrel proteins. In total, 837 structures were found to belong to TM β-barrel proteins. Of these, 592 β-barrel structures with less than 90% sequence identity were identified as unique proteins, comprising both homologous and non-homologous protein chains. From this set, a non-homologous dataset was derived using the following criteria: (1) sequence identity below 30%; (2) resolution ≤3.0 Å; (3) for homomeric structures with more than one chain, a single monomeric chain was chosen manually; and (4) removal of remote homologs, i.e. when more than one structure was available for a superfamily, only a single structure was kept to represent that superfamily in order to avoid structural redundancy. Finally, 29 proteins satisfying all the above criteria were grouped as non-homologous membrane proteins. These selection steps were carried out with the help of the PDB and the PISCES server51,52. The resulting TM β-barrel proteins, with 4 to 26 TM β-strands, were classified using the PPM server together with the OPM53 and PDBTM54 databases. Next, the Protein Data Bank was used to obtain a list of all available structures of cytoplasmic β-barrels. From all beta proteins, only closed or partly open β-barrel structures were chosen using CATH and SCOP annotations55,56,57. To keep the analyses comparable, only X-ray structures with a resolution better than 3.0 Å and a sequence identity below 30% were considered, again with the help of the PISCES server. From the resulting list of 117 β-barrel structures, we removed remote homologs and obtained 51 structures as the final cytoplasmic dataset. Secondary structure assignments for β-strands and loops were extracted from the selected structures using the STRIDE server58 and, for cytoplasmic β-barrel proteins, were cross-referenced with the SHEET records in the PDB header information.
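The resolution cutoff and the one-structure-per-superfamily rule described above can be illustrated with a short Python sketch. It assumes that sequence-identity culling has already been done upstream (for example by PISCES) and that each entry carries hypothetical `id`, `resolution` and `superfamily` fields. The text does not state which structure was kept for each superfamily, so choosing the best-resolved one is purely an assumption of this example.

```python
def select_representatives(entries, max_resolution=3.0):
    """Keep one entry per superfamily among those within the resolution cutoff.

    entries: iterable of dicts with hypothetical keys
             'id', 'resolution' (in angstroms) and 'superfamily'.
    Assumption: the best-resolved entry represents its superfamily.
    """
    best = {}
    for entry in entries:
        if entry["resolution"] > max_resolution:
            continue  # fails the <= 3.0 angstrom criterion
        key = entry["superfamily"]
        if key not in best or entry["resolution"] < best[key]["resolution"]:
            best[key] = entry
    # Sort by identifier so the output order is reproducible.
    return sorted(best.values(), key=lambda e: e["id"])


if __name__ == "__main__":
    # Made-up placeholder records, not real PDB entries.
    demo = [
        {"id": "demo1", "resolution": 2.1, "superfamily": "sfA"},
        {"id": "demo2", "resolution": 1.8, "superfamily": "sfA"},
        {"id": "demo3", "resolution": 3.4, "superfamily": "sfB"},
        {"id": "demo4", "resolution": 3.0, "superfamily": "sfC"},
    ]
    print([e["id"] for e in select_representatives(demo)])
```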

Joint-based structural joint points and β and γ-type dihedral measurements

The Cα atoms of the first and last residues of each TM and cytoplasmic secondary structural (SS) segment were taken as structural joint points for the dihedral calculations. Joint selections and β-strand boundaries were based on the annotations given in OPM and the STRIDE server, respectively. A previously developed in-house program was used for the joint-derived dihedral angle measurements. Note that the total number of β and γ type dihedral angles for a protein structure depends entirely on the total number of β-strands and loops present in the structure, i.e. on the total number of SS segments and joint coordinates. Linking all the joint points yields a macroscopic description of the overall protein structure.
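As an illustration of the joint-based measurement described above, the following Python sketch computes signed dihedral angles from consecutive joint points. It is not the in-house program used in this study. The function names, the input layout (one pair of first and last Cα coordinates per β-strand, in chain order) and the convention that β type angles start at a strand's first joint while γ type angles start at its last joint are assumptions made for illustration only.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points."""
    b0, b1, b2 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1 = _cross(b0, b1)  # normal of the plane through p0, p1, p2
    n2 = _cross(b1, b2)  # normal of the plane through p1, p2, p3
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b1, b1)) * _dot(b0, n2)
    return math.degrees(math.atan2(y, x))


def joint_dihedrals(segments):
    """Return (beta_angles, gamma_angles) for a chain of strands.

    segments: list of (first_CA_xyz, last_CA_xyz) tuples, one per strand.
    Assumed convention: windows of four joints starting at a strand's first
    joint give beta type angles (strand-loop-strand); windows starting at a
    strand's last joint give gamma type angles (loop-strand-loop).
    """
    joints = [point for segment in segments for point in segment]
    betas, gammas = [], []
    for k in range(len(joints) - 3):
        angle = dihedral(*joints[k:k + 4])
        (betas if k % 2 == 0 else gammas).append(angle)
    return betas, gammas
```

Under this convention, a chain with N strands yields N − 1 β type and N − 2 γ type angles, which is consistent with the statement that the number of angles depends only on the number of SS segments and joints.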

Statistical analyses of dihedral angle patterns

To identify the preferred relative orientations, a conformational analysis was performed on the calculated dihedral angles, and the dominant signature patterns among the various combinations of consecutive dihedral angles were determined. The selected 29 TM and 51 cytoplasmic structures are each represented by the βn − γn − βn+1 − γn+1 dihedral angle set, as summarized in Supplementary Information Tables S1, S2, S3 and S4, where n denotes the β-strand number in the protein structure. The calculated dihedral angles were converted to positive (+ve) and negative (−ve) signatures to represent the conformations, as given in Supplementary Information Tables S5, S6, S7 and S8. A consecutive β-β pattern was selected for each fold as βn − βn+1. Grouped βn − βn+1 angles must form a consecutive, adjacent set with no fixed order, whereas non-consecutive βn − βn+2 pairs were not considered. For example, the β-β pattern angles were selected from β11- β22- β33 to βnn as any consecutive βnn. To obtain more clearly defined distribution patterns, the consecutive βn − βn+1 and βn − βn+1 − βn+2 sets were also tested.
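The sign conversion and the counting of consecutive patterns can be sketched as follows. This Python example is a minimal illustration, not the procedure used to generate the Supplementary tables; the function names are hypothetical, an angle of exactly 0° is arbitrarily treated as positive, and the ASCII character "-" stands in for the minus sign.

```python
from collections import Counter


def to_signature(angles):
    """Convert dihedral angles in degrees to '+' or '-' signatures.

    Assumption: an angle of exactly zero is counted as '+'.
    """
    return ["+" if angle >= 0 else "-" for angle in angles]


def consecutive_patterns(angles):
    """Count the sign patterns of adjacent pairs (angle_n, angle_n+1).

    Only consecutive pairs are considered; non-consecutive pairs such as
    (angle_n, angle_n+2) are ignored, as described in the text.
    """
    signs = to_signature(angles)
    return Counter(zip(signs, signs[1:]))


if __name__ == "__main__":
    # Illustrative values only, not data from the study.
    beta_angles = [30.0, 45.0, -20.0, 10.0]
    for pattern, count in sorted(consecutive_patterns(beta_angles).items()):
        print(pattern, count)
```

The same function can be applied separately to the β type and the γ type angle lists of each protein, and the resulting counters summed within a size group to obtain pattern frequencies of the kind shown in the bar diagrams.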