Molecular dynamics simulation of the synthesis of protein-like copolymers via conformation-dependent design

We propose a computer model to describe the synthesis of protein-like copolymers in a polar solvent via irreversible polymerization of hydrophobic and hydrophilic monomers with simultaneous globule formation. In the model, growing chain macroradical, polymerizing monomers, and the preferential absorption of hydrophobic monomers in the core of the globule are taken into account explicitly. The effect of monomer concentrations and reaction rate on the conformational properties and primary copolymer sequences is investigated. We find that, under certain conditions, the resulting copolymers can form protein-like globules with hydrophobic core wrapped in a hydrophilic envelope. Also, a gradient structure of primary sequences of the copolymers is revealed. Such sequences are formed due to both a change in the chain conformation and a continuous redistribution of comonomers between the globule and the solution in the course of the polymerization.


Introduction
Heteropolymers, which comprise two or more covalently bonded sequences of chemically distinct monomer species, show a surprisingly rich behaviour both in solution and in bulk [1]- [3]. The simplest examples of two-letter copolymers made up of two different monomer species, denoted by letters A and B, are AB diblock copolymers and regular multiblock copolymer with repeating structure (AB) x . In general, repeated-block copolymers can be written as (A i B j A k B l . . .) x , where the indices i, j, k, l, . . . denote the number of the corresponding monomer units. More sophisticated distributions of chemically different groups along the chain are characteristic for different random and random-block copolymers, including uncorrelated or ideal random copolymers and the so-called correlated random copolymers. In the former class, the corresponding primary sequences are uncorrelated and correspond to Bernoulli or zeroth-order Markov processes. In the latter class, the correlation in the sequences of both types of A and B segments is defined by means of a first-order Markov process. It is important to emphasize that in both cases the correlations characterizing the distribution of monomer units along copolymer chains decay exponentially. There are, however, such copolymers for which this is not the case.
Recently, the sequence design scheme to generate special primary structures of two-letter quasi-random copolymers with quenched disorder has been proposed in [4,5]. The idea behind this scheme can be explained as follows. We start with a homopolymer globule stabilized by the attraction between all monomer units. Then a 'colouring' procedure is performed. This procedure can be considered as a chemical reaction of a monomer unit with a low-molecularweight reagent (polymer-analogous transformation). The units in the core of the globule remain 'red' and are called hydrophobic (H units), while the units at the surface of the globule are 'coloured' in 'green' and are called hydrophilic, or polar (P units). As a result, monomer units on the surface, instead of being lyophobic, acquire lyophilic properties. Once this primary structure is fixed, attraction between monomer units is removed, and we obtain an HP copolymer. It is clear that the properties of the resulting copolymer, including its HP sequence, depend on the bare globular conformation. Such a copolymer was referred to in [4] as a 'proteinlike' copolymer, because it mimics one of the important features of real globular proteins, namely the possibility of formation of a compact hydrophobic core stabilized by hydrophilic Computer-generated movie illustrating the synthesis of a 256-unit protein-like copolymer via polymer-analogous reaction ('colouring') from an initially completely hydrophobic homopolymer globule. The polymer chain is depicted by red and green sticks and low-molecular-weight reactant is shown as small yellow spheres. The polymer-analogous reaction between the reactant and the hydrophobic chain units leads to changes in the colour of the chain units. See [4,5] for more detail. envelope in a globular state. The scheme of the conformation-dependent molecular design described above is illustrated in figure 1, where we present a computer movie generated during simulation.
The statistical properties of protein-like HP sequences were analysed in [6]. It has been shown, both by exact analytical theory and by numerical calculations, that these sequences exhibit long-range correlations of the so-called Levy-flight type [7]; therefore they acquire a certain degree of complexity of the bare globular conformation as a result of a one-step 'colouring' procedure [4,5]. Very recently, it has also been found that the statistics of the arrangement of monomer units along protein-like copolymer chains is distinctly non-Markovian [8] and differ very significantly from the copolymers with random and random-block sequences.
The procedure outlined above was first realized in computer experiments [4,5]. After the concept of conformation-dependent sequence design was formulated, several possible ways of experimental realization of this concept were proposed. In the literature, the following two main approaches were formulated: (i) chemical modification ('colouring') of reactive pendant groups in certain homopolymers and (ii) the formation of a designed protein-like copolymer directly in the course of its synthesis from appropriate comonomers.
The role of 'colouring' in real experiments is played by the reaction of a monomer unit with a low-molecular-weight reagent, which converts a hydrophobic unit to a charged or polar group, H → P. In [9], hydrophilization was achieved by grafting of short poly(ethylene oxide) chains to the thermosensitive poly(isopropylacrylamide) backbone. It was shown that grafting to the more compact conformation (which occurs mainly from the surface leading to a kind of surface 'colouring') is more efficient than random grafting to a coil conformation. Other attempts along the same line were also performed, but in general this method was shown to be rather unreliable, because of the impossibility of stabilizing dense globules in the solution for the time sufficient to implement a polymer-analogous transformation.
The synthesis of protein-like copolymers via copolymerization with simultaneous globule formation was first achieved in [10]. Then similar results were reported in [11,12] for other pairs of comonomers, thus showing the generality of the results obtained. This approach turns out to be much more robust when compared with the 'colouring' procedure. The main idea behind this approach can be explained as follows. Let us assume that we are performing a copolymerization of moderately hydrophobic (thermosensitive) and hydrophilic monomers in aqueous medium. The conditions for copolymerization (e.g. temperature) should be chosen in such a way that when the emerging chain is long enough it forms a globule. Globule formation will lead to a redistribution of monomers: hydrophobic ones should be predominantly concentrated within the globules, whereas hydrophilic monomers will be mainly located in the outside solution. Therefore, when a growing chain end is inside the globular core, mainly hydrophobic units will be added, whereas when the end is located in the outer shell of a globule, the addition of hydrophilic units becomes more preferable. In this way, the protein-like structure with predominantly hydrophobic core and hydrophilic outer envelope should emerge automatically. However, the theoretical foundation of this method is still lacking: the considerations of preferential solvation of monomers presented above are very qualitative, and more exact quantitative investigation is needed. This is just the aim of this paper.
Recently, the experimental approach proposed in [10] has been realized by Berezkin et al [13] in their Monte Carlo simulation study where it was shown that the statistical properties of the sequences obtained as a result of copolymerization in a selective solvent exhibit longrange correlations. The most important advantage of using the computational polymerization procedure introduced in [13] is that it provides detailed information about the composition and conformation of a growing copolymer chain, varying with reaction time, monomer concentrations or their solubility. In this simulation study, however, the polymerizing monomers were taken into account implicitly by the probabilities of attachment of P and H monomers to the growing-chain end. This scheme is a typical Monte Carlo approach.
In the present work, to simulate the irreversible copolymerization of two types of monomers having different solubility and adsorption properties, we will use the method of molecular dynamics. In the model, growing chain macroradical, polymerizing monomers and the preferential absorption of hydrophobic monomers inside the growing globule will be taken into account explicitly.
The rest of the paper is organized as follows. In section 2 we discuss our simulation approach and growth model. We then present our key results and their discussion in section 3. In section 4 we summarize our findings.

Model and simulation technique
The growth of polymers from monomers is a complex and fascinating problem. Several factors govern the structure evolution during polymerization. Some of them are reaction kinetics, thermal motion, topological constraints, interactions between polymerizing species, etc. Moreover, the effect of each factor can change with the evolving polymer conformation. Whether or not two functional units can react is dictated by the thermal energy of the system and reaction chemistry. When functional units collide during random thermal motion, if they have proper orientation and sufficient energy to overcome the activation barrier, they can react to form a single new chemical bond. Two opposite effects control the process of structure buildup during polymerization: diffusion limitations and reaction kinetics. From this point of view, the two limiting cases of polymerization process are well documented, depending on the mobility of monomers.
The polymerization reaction corresponding to the diffusion-controlled limit may be realized, e.g. in a dilute polymer solution with very high viscosity where the chain relaxation time τ Rel far exceeds the characteristic time to grow a sufficiently large macromolecule. The simplest computer model, which describes such a polymerization process, is the so-called kinetic growth walk (KGW) model introduced by Stanley and co-workers [14] (see also [15]- [18]).
For the polymerization corresponding to a kinetically controlled regime, the thermal motion of the molecules is implied fast enough so that all reacting species have in principle complete accessibility to each other. For this regime, Flory formulated the principle of equal reactivity of all species participating in the polymerization process [19] (see also [20]- [22]). In reality, however, strong deviations from the principle of equal reactivity are possible [22]. The case of unequal reactivity can be either chemical or physical. Chemical unequal reactivity can be due to the fact that the reactivity of an atomic group depends on the reactive state of other groups belonging to the same molecule. Physical unequal reactivity can arise for a number of different reasons. In particular, when the polymerizing system is composed of two different types of monomers, the interactions between these monomers result in unequal reactivities. Next, spatial constraints may be another reason for unequal reactivity. Finally, diffusion limitations may be a cause for physical unequal reactivity. They can be due to the preferential absorption of one of the types of polymerizing comonomers having different solubility or different affinity to the chemically different sections of a growing copolymer chain. This phenomenon occurs when the conformation of the growing copolymer changes in the course of the polymerization process [22]. This factor will be especially emphasized in the present study.
We simulate the irreversible growth process in the dilute regime under good solvent conditions for hydrophilic (or polar) monomers and under poor solvent conditions for hydrophobic monomers. The polymerization is modelled as a step-by-step chemical reaction of the addition of monomer units to the growing copolymer chain. For the polymerization process, H and P monomers are assumed to be bifunctional and the orientation of individual functional units on the monomers is not distinguished. If a free monomer comes within the prescribed distance R min ('reaction radius') of the end monomer unit of a growing chain, a bond can be formed between the two. Note that, for a radically polymerizing linear polymer, only one of the two ends of the chain is active, and is ready for reacting with free monomers. Also, no polymer rings are allowed to form.
We consider a continuous (off-lattice) model of a copolymer chain that involves two types of monomer units, H (hydrophobic) and P (hydrophilic or polar). To simulate the dynamics of the chain and polymerizing monomers, we employ the method of molecular dynamics (MD). The polymer is treated as a pearl necklace consisting of N segments (monomeric beads), with adjacent beads connected by a rigid rod of fixed length b. No additional restrictions are imposed on bond angles (i.e. the angle between two successive bond vectors) other than those naturally arising from the interaction of beads separated by two successive bonds.
Excluded volume interactions between all particles (monomers and chain beads) are modelled via the repulsive part of a Lennard-Jones potential where ε is the energy parameter, σ the diameter of the particle, and r the distance between particles. Effective hydrophobic attraction is taken into account via the Yukawa-type potential where ε αβ ∈ {ε HH , ε HP , ε PP }; r c is the cutoff distance for attractive interactions (r c = 2.8σ). In the model, this potential describes the solvent-mediated short-range hydrophobic interactions. The nature of hydrophobic forces is complicated, but for the purpose of the present study it is sufficient to introduce a non-specific short-range attraction using potential (2) in which the (positive) parameter ε HH is chosen to promote the preferential absorption of hydrophobic monomers in the core of the globule and the collapse of the copolymer chains. In all the calculations, we set ε HH = 1.61 (all the energies are measured in units of ε). For simplicity, we assume that the solvent molecules surrounding the growing chain are identical to the hydrophilic beads so that ε PP = 0 and ε HP = √ ε HH ε PP = 0. It is clear that for ε αβ = 0, there is no additional interaction between non-bonded particles except that corresponding to the excluded volume potential (1). Potentials (1) and (2) describe the behaviour of both polymerizing monomers and connected monomer units.
Explicitly, no solvent particles are included in the simulations; i.e. the solvent is represented by a dielectric continuum. To simulate solvation effects and the time evolution of the solution in contact with a heat bath of temperature T, we augment the equations of motion by Langevin uncorrelated noise terms [23]. In the MD simulations, the equations of motion of a polymer system in the presence of fixed constraints (the bead-rod model) are with the constraints of fixed bond lengths where N is the number of beads in the chain (N = N H + N P ), n denotes the number of non-bonded monomers in the system (n = n H + n P ), m i = 1 is the mass of the chain bead or monomer i with position vector r i , −∇ i U(r) represents the total non-bonded forces on the particle i (U is the total potential energy), b = σ is the bond length, λ s 's are the Lagrange multipliers, i describes the random force of the heat bath acting on particle i, and takes into account the viscosity of the solvent. The summation in equation (3) is performed over all the (N − 1) bonds of an N -unit polymer chain. This term represents the forces, ∇ i N −1 s=1 λ s G s (r 1 , . . . , r N ), which are due to the bond reactions. Each of the sth geometrical constraint (bond) is associated with exactly one unknown λ s . In equation (3), i and i are connected through the fluctuationdissipation theorem, R αi (0) · R αi (t) = 2 i k B Tδ(t), (α = x, y, z; T is the reference temperature) and ensures that the temperature is kept constant. The equations of motion (3) in conjunction with equation (4) are solved iteratively using a Newtonian iteration procedure with the time step t = 0.01σ √ m/ε (see [23]- [25] for more technical details). The unit of time in the simulation is defined by τ = σ √ m/ε. In the following, the reference temperature T (measured in units of We take the parameter to be dependent on solvent-accessible surface areas (SASA). To find the values of SASA for a given configuration, we perform an analytical computation of the surface areas A i for each specified particle [26].
where A max is the maximum solvent-accessible surface area of a particle for the model under study and the reference value of 0 is taken to be equal to 0.5τ −1 . The weighting factor A i /A max represents the degree of exposure of the particle i to the solvent. When the value of SASA for a given particle is zero, the frictional and random forces are zero and the Langevin equation (3) reduces to Newton's equation of motion. For the chain in a globular conformation, this typically happens when the monomer unit is located in the core of a globule. In contrast, a monomer unit located at the globular surface is strongly solvated; it means that A i should be close to A max and, as a result, the value of i is close to its reference value 0 .
Our algorithm simulates a copolymerization reaction in a three-dimensional (3D) cube utilizing periodic boundary conditions. Initially, n freely diffusing monomers (hydrophobic and hydrophilic) are placed randomly in the cube. The total number density of the monomers is fixed at ρ = 0.01. After a monomer is attached to the growing chain, a new free monomer of the same type is placed randomly in the box, checked for overlaps with already placed particles. Then this monomer gets a velocity of a given absolute value with a randomly chosen direction. The concentration of monomers in solution does not depend on their conversion and, therefore, always remains a constant. In most of the runs discussed below, the model system involved 20 and 80% H and P monomers, respectively. The reaction radius was set to R min = b ± δ, where δ = 0.01σ, i.e. a monomer was allowed to be attached to the end bead of the growing chain when the corresponding distance r was in the range b − δ r b + δ. Note that for the chain lengths simulated in this study (N 512), we in fact dealt with a dilute polymer solution.
The usual radical polymerization process consists of the following three phases: initiation, chain propagation and termination. Since we put only one radical initiator into the reaction mixture, spontaneous chain-terminating reactions, such as radical combination and disproportionation, were excluded. This situation takes place when the concentration of radical species in a polymerization reaction is very small compared with other reactants. The polymerization is modelled as a sequence of the alternating steps: (i) the addition of a new monomer unit to the growing chain and (ii) the relaxation of the chain with the current length N τ . Polymerization occurs without crossing energy barriers and is determined only by the local concentration of reactive monomers near the 'active' end of the macroradical. In the course of the chain relaxation, the average number of H and P monomers, n H (τ) and n P (τ), around the chain end in the reaction volume 4 3 πR 3 min are calculated. Since the probability of a reaction is proportional to its rate, the probability for H and P monomer binding can be written as Thus, during the simulation, we find the current position of the 'active' end of the chain [x N (τ), y N (τ), z N (τ)] and add to it a new monomer unit, according to the relative probability for the occurrence of the monomer H or monomer P. It is clear that, in our polymerization model, the values of v P and v H should depend on the position of the growing chain end: if it is located in the region occupied mainly by hydrophobic species, the value of v H dominates; otherwise, v P dominates. A random number uniformly distributed between 0 and 1 is generated and then compared with the probability v P : if this random number is less than the value of v P , the P monomer unit is added to the chain end; otherwise the H monomer unit is added. During this and subsequent stages of the simulation, the new terminal bead interacts with intramolecular potentials (1) and (2), depending on the kind of the bead, H or P. After that, the relaxation of the growing N τ -unit macroradical begins; this takes τ R MD steps. The time τ R that is allowed for this 'annealing' between attachments is used as a control parameter to produce characteristically different chain conformations as discussed below. Note that since in our model the concentration of the monomers is fixed, the quantity τ R does not depend on the concentration.
The value of τ R can be considered as a typical reaction time that, for simplicity, is considered as a constant. In reality, however, this time is a stochastic variable determined by random number η distributed uniformly over the unit interval: where w is the reaction rate. Generally speaking, the assumption of a constant reaction time does not affect the polymer properties as long as τ R is greater than typical times of other processes taking place in the system [13]. There are two such processes: monomer diffusion with a typical time scale τ D and chain relaxation with a typical time τ Rel . For a kinetically controlled polymerization, the condition τ R > τ D is sufficient. However, the conformation-dependent design imposes an additional condition: τ R > τ Rel [8,13]. In this case, the reaction time should exceed not only the diffusion time (τ D ) but also the relaxation time (τ Rel ) of the N -unit chain, which by itself could be much higher than τ D for sufficiently long chains. In other words, we should set [8,13] τ D < τ Rel < τ R .
The value of τ R was varied from 100 to 20 000 integration steps. During the reaction, information was tabulated at various checkpoints for later analysis. Copolymerization was ended when the chain length reached N = 512 units. After the termination of the growing chains, their average conformational properties were calculated during 10 4 MD time steps. Also, the primary structures of the synthesized copolymer chains were analysed. The distributions of monomer units are also calculated as a function of the distance from the centre of mass. To gain better statistics, approximately 10 3 copolymer chains for each set of the parameters were generated and the required average characteristics were found. The typical CPU time needed for generating one copolymer chain of 512 monomer units on a 1.8 GHz Pentium IV processor is approximately 48 h at τ R = 10 4 . Throughout the simulation, snapshots of the system were collected; then using these snapshots, we generated a movie illustrating the copolymerization process simulated in this work.

General trends in chain growth
It should be emphasized that, for all samples that are considered in the present study, the resulting copolymer chains tended to be compact due to strong attraction between the H monomer units. To give a visual impression of the simulated system, figure 2 presents a typical snapshot picture of an 512-unit copolymer chain synthesized at τ R = 10 4 . Figure 3 presents the corresponding movie showing the evolution of the growing chain. At this stage we are interested in qualitative effects only.
Already from these data one can easily see that, using the polymerization mechanism proposed in this work, we can indeed end up with a protein-like copolymer having a dense hydrophobic core surrounded by a hydrophilic shell. In general, the same core-shell structure is observed for protein-like globules obtained through the 'colouring' procedure (see figure 1 and [4,5]).

Reaction time (τ R )
Strictly speaking, our kinetic model gives correct results only in the limit, when τ R τ D and τ R τ Rel [8,13]. By varying τ R , one can find such conditions, when this limiting regime is In particular, we calculated the average fractions of H monomer units in the resulting copolymer, φ H , its mean-square radius of gyration, R 2 g , and the partial mean-square radii of gyration, R 2 gH and R 2 gP , found separately for H and P segments. All these quantities are shown in figures 4 and 5 as a function of τ R .
As seen from figures 4 and 5, at small τ R , the average composition and the partial meansquare radii of gyration correspond to a random copolymer for which, obviously, one has: In this case, the probabilities of attachment of H and P monomers are determined only by the concentrations of the monomers in the reaction mixture and the polymerization process is governed by the principle of equal reactivity of Flory [19]. On the other hand, as the τ R value is increased, these quantities deviate from those expected for a random copolymerization and asymptotically approach constant values. Such a behaviour corresponds to the conformation-dependent design regime for which inequality (7) becomes true [13]. Therefore, for sufficiently large reaction times, we actually deal with the conformation-dependent copolymer design.
As seen from figure 4, the fraction of H monomer units in the resulting copolymer increases with τ R . This behaviour has a simple explanation. At small τ R , the copolymer chain grows very fast so that chain conformation cannot relax completely and, as a result, H units cannot form the hydrophobic core which absorbs H monomers from the solution. In this case, therefore, only usual random copolymerization is possible. On the other hand, for sufficiently large τ R , owing to the preferential absorption of H monomers from the solution by the hydrophobic core of the growing globule, the possibility of adding H monomer units increases and their fraction in the   Figure 5. Mean-square radius of gyration, R 2 g , and partial mean-square radii of gyration, R 2 gH and R 2 gP , found separately for H and P segments as a function of reaction time.
copolymer also increases. Visible differences between the partial mean-square radii of gyration for H and P units (figure 5) reveal the fact that distinct segregation of the two types of monomer units in the core-shell conformation takes place. In particular, the value of R 2 gH characterizing the relative size of hydrophobic core becomes smaller than the analogous value R 2 gP corresponding to the hydrophilic envelope. Such a behaviour corresponds to the conformation-dependent design regime for which we observe a strong deviation from the principle of equal reactivity of Flory. Indeed, although the concentration of H monomers in solution is lower when compared with P monomers and the reactivities of these monomers are equal, the copolymer synthesized in this regime is preferentially enriched by hydrophobic monomer units. The deviation from Flory's principle is due to the interplay between monomer mobilities and hydrophobic attraction, on the one hand, and steric hindrances caused by the polymer chain in a globular conformation, on the other [22]. If monomers react much faster than their diffusion around the reaction chamber, they almost always react in small neighbourhoods, and usually this leads to random connections, i.e. to the formation of a trivial statistical copolymer. However, if monomers react very slowly, the growing copolymer chain has time to attach a sufficiently large number of hydrophobic monomers and to form a compact conformation with a dense hydrophobic core; as a result, an interface arises between two regions in which either hydrophobic species or hydrophilic species dominate. Spatial constraints (or steric hindrances) at the interface and inside the hydrophobic region limit the diffusion, thus preventing the intermix of the species. Due to this, when the growing chain radical is located inside the hydrophobic globular core, it attaches mainly H monomers strongly absorbed here; being outside the core, it gathers mainly P monomers. In a sense this process is similar to a random walk near an infinite plane when the walk crossing the plane changes its 'colour'. It is well known that the 'colour distribution' in this walk demonstrates a non-trivial (Levy-flight) statistics [6,7]. We would also like to emphasize that, in our model, the type of the unit attached to the growth centre during the simulation, under kinetic control, is determined by the conformation and primary structure of the growing chain as a whole, and not by the local concentration of reactive monomers near the 'active' end of macroradical. Such a co-operativity is an additional cause for the physical unequal reactivity of monomers.
It is necessary to stress that, in principle, the synthesis of random and protein-like copolymers proceeds simultaneously. However, the relative contribution of each of these processes significantly depends on the balance between the velocity of the chain relaxation (which is closely related to copolymer composition) and a typical reaction time. This statement is illustrated by the results shown in figure 6, where we present the fraction of different chains, W(φ P ), as a function of chemical composition, φ P = N P /(N H + N P ), for a few values of τ R . It is seen that at small and large times τ R , there is only one type of copolymer, which can be attributed to random and protein-like copolymers, respectively. In other words, the W(φ P ) function is unimodal. On the other hand, at intermediate times (τ R = 5 × 10 3 time steps), there are two types of copolymers, one of which is preferentially a hydrophobic protein-like copolymer and the other is preferentially a polar random copolymer.
From the simulations we can conclude that there is a strong correlation between the composition of the growing copolymer, which determines the chain conformation, and the type of the monomer, which is attached. It is clear that the chain with high content of H segments can form compact globular structures with absorbed H monomers more easily, and this further facilitates the reactions with H monomers. In contrast, the conformation of the chain with high content of polar units is more expanded and in this case the reaction with P monomers is more probable because their solution concentration is higher.
Apparently, the critical reaction time, when the chain relaxation competes with the copolymerization, is τ R = 5 × 10 3 . At τ R > 5 × 10 3 , conformation-dependent design proceeds preferably when compared with random copolymerization.

Copolymer structure at large τ R
For synthesized copolymers, we have studied the variation of the average HP composition at different content of H and P monomers in the reaction mixture. Figure 7 shows the fraction of connected P units, φ P as a function of the composition of the initial monomer mixture φ 0 P = n P /(n H + n P ) at τ R = 10 4 . It is clear that the equality φ P = φ 0 P corresponds to a random (statistical) copolymer. As seen from figure 7, the chemical composition of the resulting copolymer strongly deviates from the statistical composition. Also, we can conclude that a 1:1 HP composition is obtained for φ 0 P ≈ 0.8; i.e. the synthesis of such copolymers is possible only at large excess of polar monomers in the reaction mixture and the acceptable 1:1 HP comonomer composition is very narrow. Thus the change in the solution concentration of monomers allows a variation of the composition of synthesized copolymers over a wide range. Below, we will discuss only the results obtained at φ 0 P = 0.8. In figure 8 we present the radial distribution of H and P monomer units in the globules synthesized at τ R = 10 4 , where the centre of mass of the globular core is taken as the origin. The radial distribution of free H and P monomers around the centre of the globule at the same τ R value is shown in figure 9.
The inspection of figure 8 demonstrates that the copolymer chains form a core-shell structure in which the H and P segments are well segregated. An analogous conclusion has been drawn from the data presented in figures 2 and 3. From figure 9 it is seen that the distribution of free H and P monomers inside the globule is also inhomogeneous. In accordance with our suppositions, an excess of hydrophobic species in the globular core (at r 5σ) arises (figures 8 and 9), while in the solution (at r > 6σ) an excess of P monomers is observed ( figure 9). Hence, at the centre of the globule, the active chain radical reacts with H monomers, while P monomers are joined preferentially near the surface of the globule. It is necessary to note that at the core-shell boundary (at r ≈ 4.5σ) H monomers form an adsorption layer. The same feature has been observed in [13], for the monomer distribution where we have carried out the simulation of a homopolymer globule enclosed in a periodic box with hydrophobic monomers.

Analysis of copolymer sequences
Let us now consider some statistical properties of the primary sequences of the obtained copolymers. To this end, we calculated the average numbers of H and P blocks and the average lengths of these blocks, L H and L P . By the average block length we mean here the average number of H (or P) monomer units in a chemically homogeneous chain segment built up from these units. The values of L H and L P are shown in figure 10 as a function of the reaction time τ R .
As seen, the length of hydrophobic blocks increases with τ R while the value of L P is a decreasing function of τ R . To explain this effect, it should be kept in mind that if τ R is very small, when conformational equilibrium is lost, chain growth resembles a random walk connecting free monomers occurring in the capture radius of the growth centre. In other words, in this case the growing chain segment does not have enough time to get a compact conformation. This enhances the probability of chemical reactions in the polar regions of the forming copolymer and, at the same time, decreases the probability of penetration of the chain end-radical into hydrophobic regions. As a result, hydrophilic monomers occur in the reaction volume more frequently.
For comparison, we show in the same figure the data obtained for a purely random 1:1 sequence. The average block lengths for the synthesized copolymers are found to be greater than those for a random copolymer with 1:1 HP composition for which L H = L P = 2. This fact can be explained by 'tuning' the sequence of the growing chain to globule size. The changes in average block lengths are explained mainly by the change in the copolymer composition. Note also that the average block length distributions for the synthesized copolymers are significantly broader than those for a random copolymer.
We found that, owing to the specific mechanism of the chain propagation, the synthesized copolymers have a gradient primary structure, i.e. the HP composition changes along the growing chain during copolymerization. This result follows from the analysis of the distribution of monomer units along the chain. In figure 11, we present the average fraction of H units in the chain, φ H (i), as a function of monomer unit number, i, for a few values of τ R . It is seen that the function φ H (i) is practically constant at small τ R when a random copolymer is synthesized. On the other hand, when the value of τ R becomes sufficiently large, the local HP composition changes along the growing chain during copolymerization. Short chains do not contain a sufficiently large number of hydrophobic units to form a core-shell conformation and the growth of these chains is very similar to that observed for random copolymerization. The composition of these chains is also close to the composition of the initial monomer mixture. Then, as the chain becomes longer and the number of connected H units increases, a dense hydrophobic core can be formed. This core can intensively absorb H monomers and their fraction in the resulting copolymer increases. In this case, the probabilities of monomer addition expressed by equation (5) are also not constant due to the changes in the ratio between the volumes of the hydrophobic core and the polar shell during the chain growth. The gradient structure becomes more pronounced on increasing the reaction time.

Conclusion
In this work, we have carried out molecular dynamics simulations to describe the process of irreversible solution copolymerization with simultaneous globule formation. The presence of polymerizing monomers in the reaction mixture and the preferential sorption of hydrophobic monomers in the core of globule was explicitly taken into account. Our approach is similar to that first experimentally realized by Lozinsky et al [10] (see also [11,12]). Although in this study we have only dealt with the simplest model of copolymerization in a selective solvent, nevertheless, we believe that the results obtained from this simulation will be helpful for understanding the nature of the experimentally observed features and for providing ideas for further experiments. In particular, we have monitored the formation of a two-letter quasirandom copolymer with quenched primary structure, which demonstrates several non-trivial features. The effect of monomer concentrations and reaction rate on the conformational properties and primary copolymer sequences has been investigated in detail. We have found that, under certain conditions, the resulting copolymers can form protein-like globules with hydrophobic core wrapped in a hydrophilic envelope. An analogous conclusion has been drawn in our previous theoretical study [13] where the copolymerization in a selective solvent was simulated without explicit monomers.Also, we have found that the synthesized copolymers have a gradient structure of primary sequences. From our simulation it follows that such sequences are formed due to both a change in the chain conformation and a continuous redistribution of comonomers between the globule and the solution in the course of the polymerization. Finally, for the system under study we have observed a strong deviation from the principle of equal reactivity of Flory [19]. Although the concentration of hydrophobic monomers in the simulated system was lower when compared with hydrophilic monomers and the reactivities of both monomers were equal, the irreversible copolymerization with simultaneous globule formation leads to a copolymer that is preferentially enriched by hydrophobic monomer units. The basis for this effect is the interplay between monomer mobilities and hydrophobic attraction, on the one hand, and steric hindrances caused by polymer chain in a globular conformation, on the other.