Crystal Polymorph Selection Mechanism of Hard Spheres Hidden in the Fluid

Nucleation plays a critical role in the birth of crystals and is associated with a vast array of phenomena, such as protein crystallization and ice formation in clouds. Despite numerous experimental and theoretical studies, many aspects of the nucleation process, such as the polymorph selection mechanism in the early stages, are far from being understood. Here, we show that the hitherto unexplained excess of particles in a face-centered-cubic (fcc)-like environment, as compared to those in a hexagonal-close-packed (hcp)-like environment, in a crystal nucleus of hard spheres can be explained by the higher order structure in the fluid phase. We show using both simulations and experiments that in the metastable fluid phase, pentagonal bipyramids, clusters with fivefold symmetry known to be inhibitors of crystal nucleation, transform into a different cluster, Siamese dodecahedra. These clusters are closely similar to an fcc subunit, which explains the higher propensity to grow fcc than hcp in hard spheres. We show that our crystallization and polymorph selection mechanism is generic for crystal nucleation from a dense, strongly correlated fluid phase.

Nucleation plays a critical role in the birth of crystals and is associated with a vast array of phenomena such as protein crystallization and ice formation in clouds. Despite numerous experimental and theoretical studies, many aspects of the nucleation process, like the polymorph selection mechanism in the early stages, are far from being understood. Here, we show that the excess of particles in a face-centred-cubic (fcc)-like environment with respect to those in a hexagonal-close-packed (hcp)-like environment in a crystal nucleus of hard spheres, as observed in simulations and experiments [1-6], can be explained by the higher order structure in the fluid phase. We show using both simulations and experiments that, in the metastable fluid phase, fivefold symmetry clusters, pentagonal bipyramids (PBs), known to be inhibitors of crystal nucleation [7,8], transform into a different cluster, Siamese dodecahedra (SDs). Due to their geometry, these clusters form a bridge between the fivefold symmetric fluid and the fcc crystal, thus lowering its interfacial free energy with respect to the hcp crystal, and shedding new light on the polymorph selection mechanism.
Understanding nucleation is important in many research fields, such as determining the molecular structure of proteins through crystallization, drug design in the pharmaceutical industry, ice crystal formation in clouds (the largest unknown in the earth's radiative balance and thus crucial in the context of climate change and weather forecasts), and crystallization of colloidal and nanoparticle suspensions with application perspectives in catalysis, opto-electronics, and plasmonics [9,10].
However, nucleation is extremely challenging to study in molecular systems: it is a stochastic and rare process, the crystal nuclei are often rather small, and the nuclei grow out extremely fast once they exceed their critical size. An additional obstacle is that, for most substances, different crystal polymorphs may compete during nucleation. This phenomenon is of key importance in pharmaceutical sciences and applications, as the crystallization of the "undesired" polymorph may, for instance, lead to neurodegenerative disorders such as Alzheimer's disease or eye cataract, or to reduced solubility/efficacy and even toxicity of certain drug compounds [11,12].
Recently, impressive strides have been made in the experimental observation of early-stage crystal nucleation by using atomic-resolution in situ electron microscopy, revealing different nucleation pathways to different crystal polymorphs of proteins [13], pre-nucleation clusters in metal organic frameworks [14], early-stage nucleation pathways of FePt nanocrystals that go beyond classical nucleation theory and nonclassical scenarios [15], amorphous precursors in protein crystallization [16], featureless and semi-ordered clusters of NaCl nanocrystals [17], and reversible disorder-order transitions of gold crystals [18]. These recent observations differ from the current nucleation models and call for better theoretical insight into the crystallization pathways at the earliest stages of nucleation, when particles start to order from the metastable fluid phase and select the emerging crystal polymorphs.
Colloidal suspensions are suitable experimental systems to probe locally heterogeneous phenomena such as early-stage nucleation: the larger sizes and slower time scales of colloidal particles enable direct observation of the nucleation mechanisms [1,19]. However, even for hard spheres (HSs), undoubtedly one of the simplest colloidal model systems, the polymorph selection mechanism is yet to be revealed. In a HS system, the hcp crystal is metastable with respect to the fcc, but the free-energy difference between the two structures is tiny (of order $10^{-3}\,k_B T$ per particle) [20,21]. One therefore might expect to find an approximately 50% occurrence of fcc- and hcp-like particles in the crystal nucleus of hard spheres. However, this prediction is realised neither in experiments [1,3,6,19,22] nor in simulations [5,23-26], which both show a hitherto unexplained predominance of fcc particles in the final crystal phase. In this Letter, we investigate, using Molecular Dynamics (MD) simulations and particle-resolved studies of colloids, the early stages of nucleation of hard spheres in order to shed light on the selection mechanism of the crystal polymorph. We study the structural transformations in the supersaturated fluid phase that finally lead to crystal nucleation. We find that the crystal embryo shows a preference towards fcc-like stacking, because of its striking similarity with local clusters present in the fluid phase. We also demonstrate that this purely geometric argument for a higher propensity to nucleate fcc is incorporated in thermodynamics by a lower interfacial free energy of fcc with respect to hcp crystals.

Siamese Dodecahedra and Pentagonal Bipyramids
We perform MD simulations to study crystal nucleation in a supersaturated fluid of hard spheres. We generate many nucleation events and analyse the trajectories using two different methodologies. To follow the nucleation process, we first identify the solid-like particles, i.e. particles with a local solid-like (ordered) environment, by calculating the averaged bond order parameters [27], which are based on spherical harmonics $Y_{lm}$ and measure the arrangement of the neighbours around a particle. In particular, we identify particle $i$ as solid-like if the sixfold rotational invariant $\bar{q}_6(i) \geq 0.31$, and we colour these particles blue in Fig. 1c-f. We note that this classification scheme acts on a single-particle level.
To investigate the structure of the fluid, we determine the topologies of various particle clusters present in the system using the Topological Cluster Classification (TCC) algorithm [28]. We identify local clusters of 3 up to 13 particles consisting of not only rings of three, four, and five particles with and without additional neighbouring particles, but also compounds of these basic clusters. In total, we distinguish 41 topological clusters.
We focus our attention on a specific cluster geometry, the Siamese Dodecahedron (SD), due to its unique behaviour in the early stages of nucleation. In addition, we consider the Pentagonal Bipyramid (PB) because of its abundance in the fluid phase and its geometric similarity with the SD cluster. The SD cluster consists of particles that occupy four out of the five vertices of a pentagonal planar ring, which we refer to as ring particles (as denoted by the red particles in Fig. 1a). The missing particle of the pentagonal ring is replaced by two particles (denoted by the blue particles in Fig. 1a), which are shifted up and down with respect to the pentagonal planar ring. We refer to these particles as shifted particles. Finally, two spindle particles (gold particles in Fig. 1a) are placed above and below the pentagonal ring. The PB cluster is composed of ring particles (red particles in Fig. 1b), which form a pentagonal ring, with two spindle particles (gold particles in Fig. 1b) similar to the Siamese dodecahedron.
For each particle in the system, we calculate the number of SD (PB) clusters that a particle belongs to. In Fig. 1c and 1d, we colour the fluid-like particles with different shades of pink (purple), depending on the number of SD (PB) clusters they are part of, according to the scale bar on the left (right). Even though the density of SD (PB) clusters is high throughout the fluid, Fig. 1c and 1d show that the density of SD (PB) clusters is spatially heterogeneous. Specifically, we observe that the crystal nucleus is surrounded by a high density of SD clusters, whereas the opposite trend is found for the PB clusters, which are depleted near the surface of the crystal nucleus. Remarkably, the density of PB clusters seems to be anti-correlated with the density of SD clusters.
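The per-particle count used for the colouring can be sketched in a few lines of Python. This is only an illustration: the function name `clusters_per_particle` and the toy cluster list are hypothetical, and the clusters are assumed to be available as tuples of particle indices (for example, as output of a TCC analysis).

```python
from collections import Counter

def clusters_per_particle(clusters, n_particles):
    """Return, for every particle index, the number of clusters containing it."""
    # set(cluster) guards against an index being listed twice in one cluster.
    counts = Counter(p for cluster in clusters for p in set(cluster))
    return [counts.get(i, 0) for i in range(n_particles)]

# Hypothetical toy input: three 8-particle SD clusters that share members.
sd_clusters = [
    (0, 1, 2, 3, 4, 5, 6, 7),
    (4, 5, 6, 7, 8, 9, 10, 11),
    (6, 7, 10, 11, 12, 13, 14, 15),
]
membership = clusters_per_particle(sd_clusters, n_particles=17)
```

The same function applies unchanged to PB clusters; the resulting integers would then be mapped onto the pink or purple colour scale.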
In Fig. 1e and 1f we perform the same analysis on an experimental sample, showing a similar heterogeneous structure consisting of high- and low-density regions of SD and PB clusters in the fluid phase, and a crystal nucleus that is surrounded by a high density of SD clusters and a low density of PB clusters, in excellent agreement with our simulations. We refer the reader to the Methods Section for more details on the experiments.
The incompatibility of the fivefold clusters with crystalline order rationalises the depletion of PB clusters near the surface of the crystal nucleus. It is tempting to speculate that the SD clusters surrounding the crystal nucleus play a transient role in the formation of the crystal phase, which will be investigated in more detail below. To better understand the role of the PB and SD clusters in the crystallization mechanism of hard spheres, we plot in Fig. 2a the fraction of particles belonging to SD (pink line) and PB (purple line) clusters as a function of time during an exemplary spontaneous nucleation event, along with the fraction of crystalline particles (blue line) for comparison. Fig. 2a shows that the fraction of crystalline particles is approximately zero in the metastable fluid phase at the beginning of this trajectory until it starts to rise when crystallization sets in. We also observe that the populations of particles in both the SD and PB clusters are already high before crystallization sets in, showing that the metastable fluid exhibits strong spatial correlations due to packing constraints. More surprisingly, we find an increase in the number of SD clusters during the early stages of crystallization, which decreases to a lower value at the end of the crystallization process since SD clusters are not present in the fcc structure. In addition, the fraction of PB clusters decreases at the onset of crystallization.
To investigate the anti-correlation between SD and PB clusters, we also measure the combined fraction of particles belonging to either SD or PB clusters as a function of time (red line in Fig. 2a). The combined fraction is not only constant in the metastable fluid phase, but also shows smaller fluctuations than the individual fractions of SD and PB clusters. More surprisingly, we observe that the combined fraction remains constant during the early stages of crystallization, thereby demonstrating that the increase in SD clusters is a consequence of a decrease of PB clusters. The constant combined fraction of SD and PB clusters and the much smaller fluctuations suggest that there is a reversible conversion between PB and SD clusters. To test this, we calculate the probability that a PB cluster transforms into an SD cluster within a time interval $\Delta t^* = 10$ by only taking into account the subcell of the system where the first nucleus appeared. In Fig. 2b, we plot the conversion rate as a function of time. We find that the conversion rate of PB into SD clusters is constant in the metastable fluid, and increases when crystallization sets in.
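A minimal Python sketch of such a conversion probability is given below. The text does not spell out how a PB cluster at time $t^*$ is matched to an SD cluster at $t^* + \Delta t^*$, so the matching rule here (at least `min_shared` common particles) is purely an assumption, as are the function and variable names. The rule of returning zero when fewer than 10 PB clusters are present follows the caption of Fig. 2.

```python
def pb_to_sd_probability(pb_clusters_t, sd_clusters_later, min_shared=6, min_count=10):
    """Fraction of PB clusters at time t that match an SD cluster at t + dt.

    Matching rule (assumption): a PB cluster counts as converted if some later
    SD cluster shares at least `min_shared` of its particles.
    """
    n_pb = len(pb_clusters_t)
    if n_pb < min_count:
        # Too few PB clusters in the subcell for meaningful statistics.
        return 0.0
    sd_sets = [frozenset(sd) for sd in sd_clusters_later]
    converted = sum(
        any(len(frozenset(pb) & sd) >= min_shared for sd in sd_sets)
        for pb in pb_clusters_t
    )
    return converted / n_pb

# Hypothetical toy data: ten 7-particle PB clusters, three of which overlap
# with an 8-particle SD cluster in six particles at the later time.
pbs = [tuple(range(10 * k, 10 * k + 7)) for k in range(10)]
sds = [tuple(range(10 * k + 1, 10 * k + 9)) for k in range(3)]
probability = pb_to_sd_probability(pbs, sds)
```

In practice the two cluster lists would first be restricted to the subcell around the nucleus before calling the function.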
We thus observe that the supersaturated fluid exhibits a heterogeneous structure of high- and low-density regions of PB and SD clusters with a continuous conversion between the two clusters. In addition, we find that the early stages of crystallization are signalled by a higher conversion rate of PB into SD clusters, resulting in an increased fraction of SDs as shown in Fig. 2a. Subsequently, the number of SDs decreases when the crystal nucleus grows further, thereby demonstrating that the SD clusters represent an intermediate stage in the attachment of fluid-like particles to the crystal nucleus.

The nucleation mechanism
To understand the role of SD clusters in the fluid-solid transformation, we note that the four particles of the pentagonal ring of an SD cluster form a trapezoidal arrangement with two acute and two obtuse angles, see Fig. 3d. Interestingly, in the case that these particles form a square arrangement (Fig. 3c), the SD cluster can be identified as a subunit of an fcc crystal, as illustrated in Fig. 3a, where the particles are denoted with the same colours to facilitate the comparison. Given this topological similarity, we speculate that the attachment of fluid-like particles to the solid nucleus proceeds via SD clusters where the four particles in the pentagonal ring transform from a trapezoidal to a square arrangement such that it becomes part of the fcc cluster.
To investigate this conjecture, we measure the distribution of the four angles of the trapezoidal arrangement of the four particles in the pentagonal ring of the SD clusters, at four different times during the crystallization process. Fig. 3b shows that, in the fluid phase ($t^* = 100$), the distribution is bimodal with one peak below and one peak above 90°, representing the trapezoidal arrangement. As crystallization progresses, the distribution becomes unimodal with a single peak around 90°, indicating a square pattern.
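The angle measurement can be illustrated with a short Python sketch. It assumes that the four ring particles are supplied in cyclic order; the function name `ring_angles` and both test geometries are illustrative choices, not data from the simulations. Four consecutive vertices of a regular pentagon serve as an idealised trapezoid.

```python
import numpy as np

def ring_angles(points):
    """Interior angle (degrees) at each vertex of a ring given in cyclic order."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    angles = []
    for i in range(n):
        a = pts[i - 1] - pts[i]          # bond to the previous ring particle
        b = pts[(i + 1) % n] - pts[i]    # bond to the next ring particle
        cos_angle = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
        angles.append(np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0))))
    return np.array(angles)

# Idealised square arrangement (fcc-like subunit).
square = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]

# Idealised trapezoid: four consecutive vertices of a regular pentagon.
theta = 2 * np.pi * np.arange(4) / 5
trapezoid = np.column_stack([np.cos(theta), np.sin(theta), np.zeros(4)])

square_angles = ring_angles(square)
trapezoid_angles = ring_angles(trapezoid)
```

For the idealised pentagon-derived trapezoid the angles are 72° and 108°, i.e. two acute and two obtuse angles, whereas the square gives four right angles; a histogram of such angles over all SD clusters would give a distribution of the kind described for Fig. 3b.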
Our results provide strong support that the trapezoidal arrangement of the four particles in the pentagonal ring of the SD cluster transforms into a square arrangement corresponding to a subunit of the fcc crystal. This transition is also illustrated in Fig. 3c and 3d, showing two representative SD clusters after and before the transformation, respectively. The key finding of our study is that the fivefold PB clusters, known to be inhibitors of crystal nucleation and abundant in the fluid phase, transform into SD clusters, and that the SD cluster-mediated attachment of particles to the growing nucleus proceeds via a simple rearrangement of particles into fcc subunits. The rearrangement of SD clusters into hcp is less straightforward and involves an additional displacement by one of the shifted particles (see Supplementary Information). Hence, the propensity to grow fcc is higher than hcp, revealing that the polymorph selection mechanism in hard spheres is already hidden in the higher order structure of the fluid phase.

The minimum free-energy pathway for nucleation
This finding raises the crucial question of whether the polymorph selection mechanism as identified here has a kinetic or thermodynamic origin. In other words, does an fcc crystal have a lower interfacial free energy, and hence a lower Gibbs free-energy barrier, than an hcp crystal in a metastable fluid phase? To answer this question, we calculate the Gibbs free energy $\beta\Delta G(n_{\mathrm{fcc}}, n_{\mathrm{hcp}})$ for the formation of a crystal cluster consisting of $n_{\mathrm{fcc}}$ fcc-like particles and $n_{\mathrm{hcp}}$ hcp-like particles using the Umbrella Sampling (US) technique; see the Methods section for the technical details.
In Fig. 4a, we plot $\beta\Delta G$ of a crystalline nucleus composed of $n_{\mathrm{fcc}}$ fcc-like particles and $n_{\mathrm{hcp}}$ hcp-like particles. The lowest free-energy path on this surface shows that the crystal nucleus has an excess of fcc-like particles in the early stages of nucleation and that the critical nucleus consists of about 70% fcc-like particles. We show two exemplary configurations of a nearly fcc-like and hcp-like cluster in Figs. 4b and 4c, respectively, demonstrating the effectiveness of our umbrella sampling method to bias towards nuclei with a certain composition.

Conclusions
In conclusion, we unveil the crystallization and polymorph selection mechanism in a fluid of hard spheres by analysing the early stages of nucleation in MD simulations. We show that the supersaturated fluid is highly dynamic, as there is a reversible conversion between fivefold Pentagonal Bipyramid and Siamese Dodecahedron clusters. The Siamese Dodecahedra have a stunning similarity with an fcc subunit, thereby explaining the hitherto unexplained higher propensity of fcc compared to hcp in hard spheres. Finally, we show that the polymorph selection mechanism has not only a geometric origin, which is hidden in the higher-order correlations of the fluid phase, but also a thermodynamic one, as the lowest free-energy path proceeds via a higher number of fcc-like particles with respect to hcp-like particles in the early stages of nucleation. This insight suggests ways to control the nucleation pathways and the crystal polymorphs.

MD Simulations
In order to generate trajectories in which we observe spontaneous nucleation, we conduct MD simulations in the isothermal-isobaric (NPT) ensemble with a constant number $N = 13500$ of nearly-hard spheres. The particles interact via a Weeks-Chandler-Andersen (WCA) pair potential, which can straightforwardly be employed in MD simulations and which reduces to the hard-sphere potential in the limit that the temperature $T \to 0$. The WCA potential $u(r_{ij})$ reads [29]
$$u(r_{ij}) = \begin{cases} 4\epsilon\left[\left(\dfrac{\sigma}{r_{ij}}\right)^{12} - \left(\dfrac{\sigma}{r_{ij}}\right)^{6} + \dfrac{1}{4}\right], & r_{ij} < 2^{1/6}\sigma, \\ 0, & r_{ij} \geq 2^{1/6}\sigma, \end{cases} \tag{1}$$
with $r_{ij} = |\mathbf{r}_i - \mathbf{r}_j|$ the centre-of-mass distance between particles $i$ and $j$, $\mathbf{r}_i$ the position of particle $i$, $\epsilon$ the interaction strength, and $\sigma$ the diameter of each sphere. The steepness of the repulsion between the particles can be tuned by the temperature $k_B T/\epsilon$. We set $k_B T/\epsilon = 0.025$, which has been used extensively in previous simulation studies to mimic hard spheres [4,30-32]. The temperature $T$ and pressure $P$ are kept constant via the Martyna-Tobias-Klein (MTK) integrator [33], with the thermostat and barostat coupling constants $\tau_T = 1.0\,\tau_{MD}$ and $\tau_P = 1.0\,\tau_{MD}$, respectively, where $\tau_{MD} = \sigma\sqrt{m/\epsilon}$ is the MD time unit. The time step is set to $\Delta t = 0.004\,\tau_{MD}$, which is small enough to ensure stability of the simulations. We ran the simulations for $10^9\,\tau_{MD}$ time steps, unless specified otherwise. The simulation box is cubic and periodic boundary conditions are applied in all directions.
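As a sketch, the standard WCA pair potential can be evaluated in Python as follows. The function name `wca` and the choice of reduced units ($\epsilon = \sigma = 1$ by default) are illustrative assumptions; this is not the HOOMD-blue implementation used for the simulations.

```python
import numpy as np

def wca(r, epsilon=1.0, sigma=1.0):
    """Weeks-Chandler-Andersen potential: shifted, purely repulsive Lennard-Jones."""
    r = np.asarray(r, dtype=float)
    r_cut = 2.0 ** (1.0 / 6.0) * sigma          # position of the LJ minimum
    sr6 = (sigma / r) ** 6
    u = 4.0 * epsilon * (sr6 ** 2 - sr6 + 0.25)  # LJ shifted up by epsilon
    return np.where(r < r_cut, u, 0.0)           # zero beyond the cutoff

# In units of k_B T, the choice k_B T / epsilon = 0.025 corresponds to
# beta * epsilon = 40, so the repulsion at contact r = sigma is 40 k_B T.
beta_epsilon = 1.0 / 0.025
beta_u_contact = beta_epsilon * wca(1.0)
```

The steep rise of $\beta u$ just below $r = 2^{1/6}\sigma$ at this low reduced temperature is what makes the WCA fluid a reasonable stand-in for hard spheres.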
We select the pressure values in a region of metastability that allow us to observe nucleation phenomena on reasonable time scales. Specifically, the reduced pressure varies in the range $\beta P\sigma^3 \in [13.40, 16.00]$, which results in numerous spontaneous crystallization events. All MD simulations are performed using the HOOMD-blue (Highly Optimised Object-oriented Many-particle Dynamics) software [34].
In order to calculate an effective packing fraction for the WCA systems, we use the mapping described in Refs. [4,31,32], which results in each particle having an effective diameter $\sigma_{\mathrm{eff}} \simeq 1.097\sigma$.
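The relation between number density and effective packing fraction, $\phi_{\mathrm{eff}} = \frac{\pi}{6}\rho\sigma_{\mathrm{eff}}^3$ (as quoted in the caption of Fig. 1), can be checked with a few lines of Python. The function names are hypothetical, and the effective diameter is assumed to be expressed in units of $\sigma$.

```python
import math

def effective_packing_fraction(number_density, sigma_eff):
    """phi_eff = (pi / 6) * rho * sigma_eff^3, with rho in units of sigma^-3."""
    return math.pi / 6.0 * number_density * sigma_eff ** 3

def density_for_packing(phi_eff, sigma_eff):
    """Invert the relation above to obtain the reduced number density."""
    return 6.0 * phi_eff / (math.pi * sigma_eff ** 3)

# Reduced density corresponding to phi_eff = 0.541 for sigma_eff = 1.097 sigma.
rho = density_for_packing(0.541, 1.097)
```

With these inputs the reduced density $\rho\sigma^3$ comes out at roughly 0.78, which is lower than the value a bare diameter $\sigma$ would give because each WCA particle is assigned a slightly larger effective size.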

Experiments
We used polymethyl methacrylate (PMMA) particles of diameter 2.00 μm with a polydispersity of 4.0%, as determined by static light scattering, which were fluorescently labelled with Rhodamine dye. The particles were dispersed in a density-matching mixture of cis-decalin and cyclohexyl bromide. Tetrabutyl ammonium bromide salt was used to screen the electrostatic charges. The resulting dispersions were imaged using a Leica SP5 confocal microscope. Due to the residual electrostatic interactions, the effective hard sphere diameter is 1.02 times that of the physical diameter and thus we quote experimental values in effective packing fraction. Further details are available in [35].

Bond Order Parameters
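To describe the local environment of a particle, we employ the standard bond-orientational order parameters introduced by Steinhardt et al. [36]. We first define the complex vector $q_{lm}(i)$ for each particle $i$,
$$q_{lm}(i) = \frac{1}{N_b(i)} \sum_{j=1}^{N_b(i)} Y_{lm}\big(\theta(\mathbf{r}_{ij}), \phi(\mathbf{r}_{ij})\big), \tag{2}$$
where $N_b(i)$ is the number of neighbours of particle $i$, $Y_{lm}(\theta(\mathbf{r}_{ij}), \phi(\mathbf{r}_{ij}))$ denotes the spherical harmonics, $m \in [-l, l]$, $\theta(\mathbf{r}_{ij})$ and $\phi(\mathbf{r}_{ij})$ are the polar and azimuthal angles of the distance vector $\mathbf{r}_{ij} = \mathbf{r}_j - \mathbf{r}_i$, and $\mathbf{r}_i$ denotes the position of particle $i$.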
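The averaged $\bar{q}_{lm}(i)$ is defined as
$$\bar{q}_{lm}(i) = \frac{1}{\tilde{N}_b(i)} \sum_{k=1}^{\tilde{N}_b(i)} q_{lm}(k), \tag{3}$$
where $\tilde{N}_b(i)$ is the number of neighbours including particle $i$ itself, and the sum runs over these particles. The rotationally invariant quadratic and cubic averaged bond order parameters are defined as
$$\bar{q}_l(i) = \sqrt{\frac{4\pi}{2l+1} \sum_{m=-l}^{l} |\bar{q}_{lm}(i)|^2} \tag{4}$$
and
$$\bar{w}_l(i) = \frac{\displaystyle\sum_{m_1+m_2+m_3=0} \begin{pmatrix} l & l & l \\ m_1 & m_2 & m_3 \end{pmatrix} \bar{q}_{lm_1}(i)\,\bar{q}_{lm_2}(i)\,\bar{q}_{lm_3}(i)}{\left(\displaystyle\sum_{m=-l}^{l} |\bar{q}_{lm}(i)|^2\right)^{3/2}}, \tag{5}$$
where the term in parentheses is the Wigner 3-j symbol. To identify the neighbours of particle $i$ we employ the parameter-free solid-angle-based nearest-neighbour (SANN) algorithm of Van Meel [37]. This algorithm assigns a solid angle to every potential neighbour $j$ of $i$, and defines the neighbourhood of particle $i$ to consist of the $N_b(i)$ particles nearest to $i$ for which the sum of solid angles equals $4\pi$.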
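As a rough illustration of the quadratic invariant, the Python sketch below evaluates the (non-averaged) Steinhardt $q_l$ of a single particle. Instead of summing spherical harmonics explicitly, it uses the spherical-harmonic addition theorem, which gives $q_l^2 = N_b^{-2}\sum_{j,k} P_l(\hat{\mathbf{r}}_{ij}\cdot\hat{\mathbf{r}}_{ik})$ with $P_l$ the Legendre polynomial. This is a swapped-in but equivalent route, not necessarily the one used in the paper. The function name `steinhardt_q` is hypothetical, and a fixed ideal fcc neighbour shell stands in for the SANN neighbour search.

```python
import numpy as np
from numpy.polynomial import legendre

def steinhardt_q(bonds, l=6):
    """Steinhardt q_l of one particle from its bond vectors (addition theorem)."""
    u = np.asarray(bonds, dtype=float)
    u = u / np.linalg.norm(u, axis=1, keepdims=True)   # unit bond vectors
    cosines = np.clip(u @ u.T, -1.0, 1.0)              # cos(angle) for all bond pairs
    coeffs = np.zeros(l + 1)
    coeffs[l] = 1.0                                    # pick out P_l in the series
    p_l = legendre.legval(cosines, coeffs)
    # Mean over all N_b^2 pairs; clip tiny negative round-off before the root.
    return float(np.sqrt(max(p_l.mean(), 0.0)))

# Assumed neighbour list: the 12 nearest neighbours of a site in ideal fcc.
signs = (1, -1)
fcc_shell = (
    [(a, b, 0) for a in signs for b in signs]
    + [(a, 0, b) for a in signs for b in signs]
    + [(0, a, b) for a in signs for b in signs]
)
q6_fcc = steinhardt_q(fcc_shell, l=6)
q4_fcc = steinhardt_q(fcc_shell, l=4)
```

In a perfect lattice every particle has the same $q_{lm}$, so the averaged $\bar{q}_l$ coincides with $q_l$; the ideal fcc value of $q_6$ obtained this way lies well above the solid-like threshold of 0.31 used in the main text.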

Topological Cluster Classification
In order to perform an analysis which is not solely based on local symmetries, we require an algorithm that is capable of successfully finding different topological clusters in a metastable fluid. To this end, we employ the Topological Cluster Classification (TCC) algorithm [28]. The bonds between particles are detected using a modified Voronoi construction method. The free parameter $f_c$, controlling the amount of asymmetry that a four-membered ring can show before being identified as two three-membered rings, is set to 0.82.

Umbrella Sampling Simulations
To investigate the thermodynamic propensity towards fcc-like or hcp-like ordering during the nucleation process, we use Umbrella Sampling [38] to calculate the nucleation barrier of a system of hard spheres at a pressure of $\beta P\sigma^3 = 17.0$. Similar to previous literature [25,39], we identify the nucleus by using the dot product
$$d_l(i,j) = \frac{\sum_{m=-l}^{l} q_{lm}(i)\, q_{lm}^{*}(j)}{\left(\sum_{m=-l}^{l} |q_{lm}(i)|^2\right)^{1/2} \left(\sum_{m=-l}^{l} |q_{lm}(j)|^2\right)^{1/2}} \tag{6}$$
with $l = 6$ to define solid-like bonds as those bonds between particle pairs $(i, j)$ for which $d_6(i, j) > 0.7$, and define solid-like particles as those that have at least 7 of such solid-like bonds. Particle neighbours are defined using a distance cutoff of $r_c = 1.4\sigma$. The nucleus is then the largest set of solid-like particles that are connected by solid-like bonds. To disentangle fcc-like and hcp-like order, we subsequently classify solid-like particles as fcc-like and hcp-like based on their value of the Steinhardt bond order parameter $w_4$: particles with $w_4 < 0$ are fcc-like and those with $w_4 \geq 0$ are hcp-like [27]. The numbers of such particles are $n_{\mathrm{fcc}}$ and $n_{\mathrm{hcp}}$, respectively, and we use these to define the US biasing potential:
$$U_{\mathrm{bias}}(n_{\mathrm{fcc}}, n_{\mathrm{hcp}}) = \frac{\lambda_{\mathrm{fcc}}}{2}\left(n_{\mathrm{fcc}} - n_{\mathrm{fcc}}^{0}\right)^2 + \frac{\lambda_{\mathrm{hcp}}}{2}\left(n_{\mathrm{hcp}} - n_{\mathrm{hcp}}^{0}\right)^2, \tag{7}$$
where both coupling constants $\lambda_{\mathrm{fcc}}$ and $\lambda_{\mathrm{hcp}}$ are set to an equal value of $\beta\lambda_{\mathrm{fcc}} = \beta\lambda_{\mathrm{hcp}} = 0.05$. This allows us to sample the two-dimensional Gibbs free-energy difference $\beta\Delta G(n_{\mathrm{fcc}}, n_{\mathrm{hcp}})$, that is, the nucleation barrier as a function of the number of fcc-like and hcp-like ordered particles.
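The nucleus identification described above can be sketched in Python as follows. The sketch assumes that the $d_6(i,j)$ values of all neighbour pairs and the $w_4$ value of every particle have already been computed; the function names, the dictionary-based input format and the toy data are hypothetical, and a simple union-find replaces whatever cluster algorithm was actually used.

```python
def largest_nucleus(d6_bonds, n_particles, d_cut=0.7, min_bonds=7):
    """Largest set of solid-like particles connected by solid-like bonds.

    d6_bonds: dict mapping neighbour pairs (i, j) to their d6 value.
    """
    solid_bonds = [pair for pair, d6 in d6_bonds.items() if d6 > d_cut]
    n_solid_bonds = [0] * n_particles
    for i, j in solid_bonds:
        n_solid_bonds[i] += 1
        n_solid_bonds[j] += 1
    solid = {i for i in range(n_particles) if n_solid_bonds[i] >= min_bonds}

    # Union-find over solid-like particles joined by solid-like bonds.
    parent = {i: i for i in solid}

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path compression
            i = parent[i]
        return i

    for i, j in solid_bonds:
        if i in solid and j in solid:
            parent[find(i)] = find(j)

    clusters = {}
    for i in solid:
        clusters.setdefault(find(i), set()).add(i)
    return max(clusters.values(), key=len, default=set())

def split_by_w4(nucleus, w4):
    """Count fcc-like (w4 < 0) and hcp-like (w4 >= 0) particles in the nucleus."""
    n_fcc = sum(1 for i in nucleus if w4[i] < 0)
    return n_fcc, len(nucleus) - n_fcc

# Hypothetical toy system: particles 0-7 are mutually bonded (7 solid-like
# bonds each); particle 8 has a single solid-like bond; bond (8, 9) is weak.
bonds = {(i, j): 0.9 for i in range(8) for j in range(i + 1, 8)}
bonds[(0, 8)] = 0.9
bonds[(8, 9)] = 0.5
nucleus = largest_nucleus(bonds, n_particles=10)
n_fcc, n_hcp = split_by_w4(nucleus, [-0.1] * 5 + [0.05] * 5)
```

The pair `(n_fcc, n_hcp)` returned here is the two-dimensional order parameter on which the umbrella bias acts.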
We initialise each US window $(n_{\mathrm{fcc}}^{0}, n_{\mathrm{hcp}}^{0})$ from a configuration with a nucleus with approximately $n_{\mathrm{fcc}} \approx n_{\mathrm{fcc}}^{0}$ and $n_{\mathrm{hcp}} \approx n_{\mathrm{hcp}}^{0}$. For very small nuclei up to $n = n_{\mathrm{fcc}} + n_{\mathrm{hcp}} \sim 20$ we measure the full cluster size distribution instead of only the size of the largest cluster, as the probability of multiple small nuclei appearing simultaneously can be significant. We implement the US scheme by adding Monte Carlo bias moves that accept or reject trajectories based on the bias potential of Eq. (7) on top of a hard-particle Monte Carlo (HPMC) simulation implemented using HOOMD-blue's HPMC module [34,40]. Bias moves are performed every MC cycle in order to also sample regions of the free-energy landscape where the gradient is large. Finally, we reconstruct the nucleation barrier by using the Weighted Histogram Analysis Method (WHAM) [41], specifically by using the algorithm provided by Ref. [42].
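A bias move of this kind can be sketched as a Metropolis test on the change in bias energy between the start and end of a short unbiased trajectory. The Python below is a minimal sketch under stated assumptions: the bias is taken to be harmonic with a prefactor of 1/2 in each order parameter, energies are expressed in units of $k_B T$, and the function names are hypothetical rather than part of HOOMD-blue.

```python
import math
import random

def bias_energy(n_fcc, n_hcp, n_fcc0, n_hcp0,
                beta_lambda_fcc=0.05, beta_lambda_hcp=0.05):
    """Harmonic umbrella bias in units of k_B T (assumed functional form)."""
    return (0.5 * beta_lambda_fcc * (n_fcc - n_fcc0) ** 2
            + 0.5 * beta_lambda_hcp * (n_hcp - n_hcp0) ** 2)

def accept_bias_move(old, new, target, rng=random):
    """Metropolis acceptance for a trajectory taking (n_fcc, n_hcp) from old to new."""
    delta = bias_energy(*new, *target) - bias_energy(*old, *target)
    # Always accept moves that lower the bias energy; otherwise accept
    # with probability exp(-delta).
    return delta <= 0.0 or rng.random() < math.exp(-delta)

# Example: window centred at (30, 10); a trajectory moves the nucleus
# composition from (35, 10) back to the window centre.
accepted = accept_bias_move(old=(35, 10), new=(30, 10), target=(30, 10))
```

If a move is rejected, the configuration stored before the trajectory would be restored; the histograms of `(n_fcc, n_hcp)` collected in each window are then what WHAM combines into the free-energy surface.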

DATA AVAILABILITY
The data associated with this research is available upon reasonable request.

SUPPLEMENTARY NOTE: MORE ON BOND ORDER PARAMETERS
A. Distinction between ring, shifted, and spindle particles
The analysis conducted in the main text demonstrates that SD clusters play a transient role in the attachment of fluid particles to a crystal nucleus. To investigate this transformation further, we calculate the bond order parameters (BOPs) for the particles belonging to SD clusters. In Fig. S3a we show the BOP values for the particles composing the SD clusters in the $\bar{q}_4$-$\bar{q}_6$ plane for a typical configuration in the early stage of nucleation, along with the BOPs of a typical fluid and crystal configuration for comparison. Even though a large fraction of the SD particles (bright pink points) shows a higher than average degree of fourfold and sixfold symmetry, the large spread in $\bar{q}_4$ and $\bar{q}_6$ values shows that there is no clear correlation between the SD clusters and their BOP values in the fluid phase.
In order to clarify the transition process from a PB cluster to an SD cluster, we divided in the main text the particles composing an SD cluster into several categories: ring, shifted, and spindle particles. This classification is not only useful in describing the topology of the SD clusters, but also reveals additional information regarding the nucleation mechanism. By computing the probability distribution of $\bar{q}_6$ for the three different particle types of the SD clusters at the onset of crystallization, we find that the shifted particles show slightly higher $\bar{q}_6$ values (blue curve in Fig. S3b), suggesting that the crystallization process is largely initiated by the shifted particles.
In the main text we described how the particles rearrange during the transformation of PB clusters into SD clusters. To be more specific, we showed that the transformation is initiated by the appearance of two shifted particles. Recalling that the disappearance of PB clusters and the subsequent excess of SD clusters enables the start of the crystallization phenomenon, it is to be expected that the shifted particles of the SD cluster are indeed the first to crystallize.

B. Polymorph detection with different classification schemes
The results presented in this work are all based on a combined use of bond order parameters (BOPs) and a Topological Cluster Classification (TCC) analysis. In particular, when using BOPs there are several choices to make, which affect the classification of the particles. These choices are related to the identification of the local neighbourhood of a particle, to the distinction of fluid-like and solid-like particles, and to the further distinction of the different crystal polymorphs. In this section, we select different criteria for all these sub-tasks, and check the robustness of the classification outcome. Specifically, we check that all methods record a predominance of fcc-like with respect to hcp-like ordering in the growing nucleus.
To this end, we start by using a total of three different techniques to find the local neighbourhood of a particle.The first two are based on a simple cutoff radius equal to 1.4σ and 1.5σ with σ the diameter of the particles, while the third is based on the solid-angle nearest-neighbour (SANN) algorithm.
Furthermore, in order to classify particles as solid-like or fluid-like, we use two different criteria. One is implemented via the criterion $\bar{q}_6 > 0.31$, as in the main text, while the second is based on dot-products of the $q_{6m}$, as implemented in the Umbrella Sampling calculations (see Methods).
Finally, in order to distinguish the different crystal polymorphs, we again use different strategies. In the first method, we employ the $w_4$ value, following the scheme proposed in our Umbrella Sampling calculations (see Methods). Alternatively, we can also use the averaged bond order parameter $\bar{w}_4$.
Using all possible combinations for detecting the local environment, distinguishing between solid-like and fluid-like particles, and classifying different polymorphs, we obtain a total of 12 classification algorithms. We use all of these to analyse the spontaneous nucleation trajectory shown in the main text and compute the fraction of fcc-like and hcp-like particles in the largest crystal nucleus during a nucleation trajectory. In Fig. S4a (Fig. S4b), we show the ratio of fcc-like (hcp-like) particles computed via each classification scheme. We note that the first part of the nucleation event is noisy as the denominator, i.e. the number of particles belonging to the main cluster, is very small.
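The count of 12 schemes follows from combining three neighbour definitions, two solid-like criteria and two polymorph criteria (3 × 2 × 2). A small Python sketch makes the enumeration explicit; the string labels are hypothetical shorthand for the methods named in the text.

```python
from itertools import product

# Labels are illustrative shorthand for the criteria described in the text.
neighbour_methods = ("cutoff_1.4_sigma", "cutoff_1.5_sigma", "SANN")
solid_criteria = ("averaged_q6_threshold", "q6m_dot_product")
polymorph_criteria = ("w4", "averaged_w4")

# Every combination of one choice per sub-task is one classification scheme.
schemes = list(product(neighbour_methods, solid_criteria, polymorph_criteria))
n_schemes = len(schemes)
```

Looping over `schemes` and recomputing the fcc-like and hcp-like fractions of the largest nucleus for each entry would reproduce the kind of comparison shown in Fig. S4.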
We clearly observe that our results are robust with respect to the choice of classification scheme, thereby providing confidence that the crystal nuclei are predominantly composed of fcc-like particles (60%-80%).

FROM SIAMESE DODECAHEDRA TO HCP
In Fig. 3 of the main text, we show that particles arranged in an SD cluster have a high propensity to transform to fcc due to the topological similarity between the SD cluster and a subunit of fcc. In particular, the transition from the SD cluster into an fcc subunit proceeds by changing the trapezoidal arrangement of the four ring particles into a square arrangement. Here, we show that such a simple transformation does not hold for the transition from an SD cluster to hcp, which involves an additional displacement by one of the shifted particles.
This transformation is sketched in Fig. S5. In Fig. S5a, we show the unit cell of an hcp crystal, with an additional particle on the right belonging to an imaginary adjacent unit cell. By including the latter particle, it is possible to find a pattern that resembles the SD cluster described in the main text. We therefore colour the particles with the usual colour coding, i.e. ring particles are coloured red, spindle particles are coloured gold, and shifted particles are coloured blue. As one of the shifted particles of this cluster is displaced with respect to an actual SD cluster, we denote this cluster as a defective SD. Note that particles have been reduced in size so that the whole unit cell is visible.
In Fig. S5b, we isolate only the particles belonging to this defective SD cluster and picture them with the actual colloid size. From the viewpoint shown in this figure, and comparing it with Fig. 3c and 3d of the main text, it is evident that one of the shifted (blue) particles is displaced with respect to its position in the SD cluster.
FIG. S5: Geometrical relationship between an SD cluster and the hcp unit cell. (a) Unit cell of the hcp crystal with an additional particle belonging to the adjacent unit cell. A defective SD cluster is identified in the unit cell of an hcp phase and the particles belonging to this defective SD cluster are coloured following the colour coding explained in the text. This SD cluster is termed defective as one of the shifted particles (in blue) is displaced with respect to an actual SD cluster. (b) The defective SD cluster resulting from the pattern in (a). From this viewpoint, it can be seen that one of the shifted (blue) particles is displaced, as a result of the "a-b-a-b" stacking of hcp, differently from what is observed in the fcc case, where the stacking of the hexagonal planes follows the "a-b-c-a-b-c" pattern.
Hence, the transition from an SD cluster to a defective SD cluster, which resembles a subunit of hcp, involves an additional displacement of one of the shifted particles. We therefore conclude that this additional displacement makes the propensity of fluid-like particles to crystallize into hcp lower than into fcc.
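
The stacking difference behind this extra displacement can be made concrete with a minimal sketch. It is not part of the analysis above; it only assumes ideal close-packed hexagonal layers of spheres with unit diameter, and the names `OFFSETS`, `LAYER_SPACING`, `layer_origins` and `lateral_shift` are hypothetical helpers introduced for illustration.

```python
import math

# Assumption: ideal close-packed layers of spheres with diameter 1.
# Lateral positions of the three possible layer registries (a, b, c).
OFFSETS = {
    "a": (0.0, 0.0),
    "b": (0.5, math.sqrt(3.0) / 6.0),
    "c": (0.0, math.sqrt(3.0) / 3.0),
}
# Vertical distance between adjacent close-packed layers.
LAYER_SPACING = math.sqrt(2.0 / 3.0)


def layer_origins(pattern, n_layers):
    """Return one reference sphere position per layer for a stacking pattern."""
    origins = []
    for k in range(n_layers):
        x, y = OFFSETS[pattern[k % len(pattern)]]
        origins.append((x, y, k * LAYER_SPACING))
    return origins


def lateral_shift(p, q):
    """In-plane distance between two reference positions."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


hcp = layer_origins("ab", 3)   # a-b-a stacking
fcc = layer_origins("abc", 3)  # a-b-c stacking

# In hcp the third layer sits directly above the first one,
# whereas in fcc it is laterally shifted to the third registry.
print("hcp first-to-third lateral shift:", lateral_shift(hcp[0], hcp[2]))
print("fcc first-to-third lateral shift:", lateral_shift(fcc[0], fcc[2]))
```

In this idealized picture the first and third layers coincide laterally in hcp but not in fcc, which is the geometric origin of the displaced shifted particle in the defective SD cluster.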

FIG. 1: Typical configuration of a crystal nucleation event of hard spheres. (a-b) Rendering of the arrangement of particles in (a) a Siamese Dodecahedron (SD) and (b) a Pentagonal Bipyramid (PB) cluster. The colour coding is explained in the text. (c-d) Cut-through image of an early-stage nucleation event generated by MD simulations. Crystal-like particles are coloured blue, while fluid-like particles are coloured following the scale bar on the left (c) or right (d) depending on the number of SD (c) or PB (d) clusters each particle belongs to. The system is simulated at constant pressure $\beta P \sigma^3 = 13.80$, i.e. starting from a fluid configuration at average effective packing fraction $\phi_{\mathrm{eff}} = \frac{\pi}{6} \rho \sigma_{\mathrm{eff}}^3 = 0.541$, where $\rho$ is the average number density of the system and $\sigma_{\mathrm{eff}}$ is the effective diameter of the particles (see Methods for the calculation of the effective packing fraction). (e-f) Experimental configuration at effective packing fraction $\phi_{\mathrm{eff}} = \frac{\pi}{6} \rho \sigma_{\mathrm{eff}}^3 = 0.541$.
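
As a quick illustration of the packing-fraction relation quoted in this caption, the following sketch evaluates $\phi_{\mathrm{eff}} = \frac{\pi}{6}\rho\sigma_{\mathrm{eff}}^3$ and its inverse. The function names are hypothetical, and the value $\sigma_{\mathrm{eff}} = 1$ is an assumption made purely for the example; how $\sigma_{\mathrm{eff}}$ is actually obtained is described in the Methods and is not reproduced here.

```python
import math


def effective_packing_fraction(number_density, sigma_eff):
    """phi_eff = (pi / 6) * rho * sigma_eff**3."""
    return math.pi / 6.0 * number_density * sigma_eff ** 3


def number_density_from_phi(phi_eff, sigma_eff):
    """Invert the relation above to obtain the number density rho."""
    return 6.0 * phi_eff / (math.pi * sigma_eff ** 3)


# Assumption for illustration only: lengths measured in units of sigma_eff.
sigma_eff = 1.0
rho = number_density_from_phi(0.541, sigma_eff)
print("rho * sigma_eff^3 =", rho)
print("phi_eff =", effective_packing_fraction(rho, sigma_eff))
```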

FIG. 2: Behaviour of Siamese Dodecahedron (SD) and Pentagonal Bipyramid (PB) clusters during nucleation. (a) Fraction of particles belonging to SD (pink), PB (purple), and combined SD or PB (red) clusters, along with the fraction of solid-like particles (blue), as a function of time during an exemplary spontaneous nucleation event. Note that a particle can be part of an SD cluster and a PB cluster at the same time, and therefore the corresponding fractions can add up to a value higher than one. Also, a particle can be classified as crystal-like independently of whether it is also part of an SD cluster. The average values in the metastable fluid are shown by dashed lines. (b) Probability that a given PB cluster transforms into an SD cluster within a time interval of $\Delta t^* = 10$ during this nucleation event, calculated in a subcell of the system that is centred around the centre of mass of the biggest crystalline cluster. We set this probability to zero when the denominator, i.e. the number of PB clusters in the considered subcell of the system, is lower than 10, because of poor statistics. A sketch of this conversion is shown as an inset in (b).
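
The thresholding rule used in panel (b) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name and arguments are hypothetical, and the counts of PB clusters and of PB clusters that converted into SD clusters within the time interval are taken as given inputs, since the cluster detection itself is not reproduced here.

```python
def pb_to_sd_probability(n_pb, n_converted, min_pb=10):
    """Fraction of PB clusters in a subcell that convert into SD clusters.

    n_pb        -- number of PB clusters in the subcell at the start of the interval
    n_converted -- number of those PB clusters found as SD clusters after the interval
    min_pb      -- below this number of PB clusters the estimate is set to zero
    """
    if n_pb < 0 or n_converted < 0 or n_converted > n_pb:
        raise ValueError("counts must satisfy 0 <= n_converted <= n_pb")
    if n_pb < min_pb:
        # Too few PB clusters for a meaningful estimate: report zero.
        return 0.0
    return n_converted / n_pb


# Hypothetical counts, for illustration only.
print(pb_to_sd_probability(40, 6))  # enough statistics: ratio is reported
print(pb_to_sd_probability(9, 3))   # fewer than 10 PB clusters: set to zero
```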

FIG. 3: Transition from a Siamese Dodecahedron (SD) cluster to an fcc subunit. (a) Arrangement of the particles in an fcc unit cell. Red, blue and golden particles correspond to an SD cluster, while the remaining particles are coloured in lilac. (b) Probability distribution of the four angles $\theta$ of the trapezoidal arrangement of the four particles (red) in the pentagonal ring of all SD clusters in the system, as computed at four different times during the crystallization process. The typical arrangements of particles in SD clusters after and before nucleation are shown in (c) and (d), respectively, where the black lines connecting the centres of the ring particles help to better understand the transition.
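
One way to obtain the four angles $\theta$ of a ring arrangement is sketched below. This is an assumption about how such angles could be measured, not a statement of the procedure used for panel (b): the function `interior_angles` is hypothetical and expects the four ring-particle centres in cyclic order. The two test shapes are idealized, namely four consecutive vertices of a regular pentagon (a trapezoid) and a square.

```python
import math


def interior_angles(ring):
    """Angles in degrees at each vertex of a closed, convex, ordered ring of 3D points."""
    n = len(ring)
    angles = []
    for i in range(n):
        centre = ring[i]
        previous = ring[i - 1]
        following = ring[(i + 1) % n]
        u = [a - b for a, b in zip(previous, centre)]
        v = [a - b for a, b in zip(following, centre)]
        dot = sum(a * b for a, b in zip(u, v))
        norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(a * a for a in v))
        # Clamp to guard against rounding slightly outside [-1, 1].
        cosine = max(-1.0, min(1.0, dot / norm))
        angles.append(math.degrees(math.acos(cosine)))
    return angles


# Idealized trapezoid: four consecutive vertices of a regular pentagon.
trapezoid = [
    (math.cos(2.0 * math.pi * k / 5.0), math.sin(2.0 * math.pi * k / 5.0), 0.0)
    for k in range(4)
]
# Idealized square, as in an fcc subunit.
square = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]

print("trapezoid angles:", [round(a, 1) for a in interior_angles(trapezoid)])
print("square angles:", [round(a, 1) for a in interior_angles(square)])
```

In these idealized shapes the trapezoid has two angles of 72 degrees and two of 108 degrees, while all four angles of the square are 90 degrees, so a distribution of $\theta$ that narrows around 90 degrees signals the trapezoid-to-square change described in the caption.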

FIG. 4: Thermodynamic propensity towards fcc-like particles in the early stages of crystal nucleation of hard spheres. (a) Gibbs free-energy barrier as a function of the number of fcc and hcp particles in the crystal nucleus, as recognised by the classification scheme described in the Methods section. (b) A typical configuration of a nearly pure fcc crystal nucleus and (c) of a nearly pure hcp crystal nucleus, as obtained from US simulations, where fcc-like particles are coloured blue, hcp-like particles are red, and fluid-like particles are lilac.

FIG. S2: Subcell analysis. (a) Pearson correlation $r(n_{\mathrm{SD}}, n_x)$ as a function of time $t^*$, as defined in the main text. (b) Fraction of crystal-like particles $n_x$ as a function of the fraction of particles belonging to at least one SD cluster $n_{\mathrm{SD}}$, calculated for each subcell in the system. The data points are from three snapshots obtained from independent MD simulations, which correspond to the maximum of the Pearson correlation function $r(n_{\mathrm{SD}}, n_x)$. The vertical (horizontal) line indicates the average value of $n_{\mathrm{SD}}$ ($n_x$).
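
For reference, a Pearson correlation of the kind plotted in panel (a) can be computed from per-subcell fractions as in the sketch below. It uses the standard textbook definition of the Pearson coefficient; the function name and the sample fractions are hypothetical and are not data from the simulations.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-subcell fractions, for illustration only.
n_sd = [0.10, 0.15, 0.22, 0.30, 0.41]  # fraction of particles in at least one SD cluster
n_x = [0.01, 0.02, 0.05, 0.09, 0.16]   # fraction of crystal-like particles
print("r(n_SD, n_x) =", pearson_r(n_sd, n_x))
```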

FIG. S3: Correlation between bond order parameters (BOPs) and Siamese Dodecahedron (SD) clusters. (a) Projection of the BOP values of particles belonging to SD clusters in the early stages of nucleation on the $q_4$-$q_6$ plane. Particles belonging to SD clusters show BOP values that are very similar to those of the fluid phase, which are therefore not useful in describing the behaviour of the SD particles. (b) Probability distribution $P(q_6)$ of the ring, shifted, and spindle particles using the same configuration as in (a). The inset shows that the shifted particles show slightly higher $q_6$ values, revealing that the nucleation process is largely initiated by the shifted particles.

FIG. S4: Crystal polymorphs in the growing crystalline cluster. Fraction of (a) fcc-like, $n_{\mathrm{fcc}}/N$, and (b) hcp-like, $n_{\mathrm{hcp}}/N$, particles in the growing nucleus consisting of $N$ particles, as identified by 12 different classification schemes described in the text.