Multi-scale model for the structure of hybrid perovskites: Analysis of charge migration in disordered MAPbI$_3$ structures

We have developed a multi-scale model for organic-inorganic hybrid perovskites (HPs) that applies quantum mechanical (QM) calculations of small HP supercell models to large coarse-grained structures. With a mixed quantum-classical hopping model, we have studied the effects of cation disorder on charge mobilities in HPs, which is a key feature to optimize their photovoltaic performance. Our multi-scale model parametrizes the interaction between neighboring methylammonium cations (MA$^+$) in the prototypical HP material, methylammonium lead triiodide (CH$_3$NH$_3$PbI$_3$, or MAPbI$_3$). For the charge mobility analysis with our hopping model, we solved the QM site-to-site hopping probabilities analytically and computed the nearest-neighbor electronic coupling energies from the band structure of MAPbI$_3$ with density-functional theory. We investigated the charge mobility in various MAPbI$_3$ supercell models of ordered and disordered MA$^+$ cations. Our results indicate a structure-dependent mobility, in the range of 50$-$66 cm$^2$V$^{-1}$s$^{-1}$, with the highest observed in the ordered tetragonal phase.


Introduction
Hybrid organic-inorganic perovskites (HPs) are the primary candidates to substitute silicon as the cheap and efficient solar cell material of the future [1]. Methylammonium lead triiodide CH 3 NH 3 PbI 3 (shortened to MAPbI 3 hereafter) is the spearhead of this novel photovoltaic (PV) materials class with an unprecedented development in its power conversion efficiency (PCE), from ca.10% in 2012 to the present 23%, approaching the best silicon-based single-junction devices. However, the most efficient HPs are currently unstable due to structural degradation in atmospheric water [2]. Conversely, stable HPs are still too inefficient in their PCE. New computational models, developed in parallel with experimental research, aim to improve the PCE of HPs via detailed understanding of the atomic structure and expedite the introduction of HPs into consumer-grade solar devices. While several mixed-halide HP compositions have recently entered the field, MAPbI 3 has maintained its status as the archetypal HP in the latest research.
MAPbI 3 combines the key properties of a good PV material-a suitable band gap of ca.1.6 eV [3], good visible-light photon absorption [4] and long diffusion length [5]. At present, the nature of the charge-carrier mobility in MAPbI 3 remains unclear. Recent studies suggest that the mobility is fairly modest in comparison to typical inorganic semiconductors, less than 100 cm 2 V −1 s −1 for electrons and holes [6,7]. This indicates that the observed long diffusion length and the resulting high PV performance likely arise from a long charge-carrier lifetime. To optimize the charge mobility, we need detailed insight into the relationship between the mobility and the corresponding MAPbI 3 structure.
Recent research has focused on solving the atomistic origins of these properties, yet the structural complexity of MAPbI 3 and other HPs has hampered progress. Structural investigations have focused on the role of the organic methylammonium (MA + ) cations, specifically the rotational dynamics of the cations and their interplay with the inorganic PbI 3 lattice [8][9][10][11][12][13][14][15][16]. Depending on the cation orientation, coupling with the lattice via hydrogen bonding distorts the framework geometry and affects the electronic properties, for example by altering the nearest-neighbor (NN) atomic orbital overlap and the lattice vibrational modes. The exact relationship between this dynamical structure and the material's PV properties remains debated. Suggested mechanisms include for example charge-carrier pathways formed by ferroelectric domains of ordered MA + cations [16], the effects of the lattice distortion on charge localization [17] and potential fluctuation effects on the charge-carrier mobility due to dynamical MA + disorder [18]. However, detailed computational studies of these processes require electronically accurate models, which significantly restricts the size of the studied systems.
To describe properties that derive from the atomic and electronic structure with sufficient accuracy, we need to incorporate quantum mechanical (QM) information. Methods such as density-functional theory (DFT) [20] deliver this QM insight, but are limited to relatively small structures (<100 unit cells). However, due to the orientational freedom of the MA + cations in each unit cell, HPs will be disordered and require modeling on much larger length scales than accessible by DFT. We approach this accuracy-size conundrum with a new multiscale model, which we have developed specifically for HP structures. Our model is based on a DFT parameterization of small supercell models and then applicable to large MAPbI 3 structures.
In our previous MAPbI 3 unit-cell energy-optimization study [13] we recognized two stable low-energy structures, corresponding to diagonal and face-to-face MA + orientations in the unit cell. We then parametrized the MAPbI 3 atomic structure by studying the MA + s in neighboring unit cell pairs [14]. Each pair of the NN MA + s share a common face of the cubic unit cell. We focused on the MA + orientations, describing them by a C-N dipole vector, and defined the dipole pair configurations as 'pair modes' (PMs hereafter, see section 2.1). We analyzed the effect of energy optimization on the PM distribution in 33 4×4×4 MAPbI 3 supercells. In the optimized structures, we observed an absence of the diagonal dipoles and a notable change in the PM distribution, for example an increase in the number of perpendicular PMs. As a good approximation, the observed changes were also independent of the initial structures before optimization. These findings motivated our description of low-energy MAPbI 3 structures by a specific 'optimized' PM distribution. Moreover, it enabled us to generate new large-scale structures using the optimized PM distribution for further analyses, such as charge migration.
In this paper, we study the charge migration in large-scale MAPbI 3 structures with a semi-classical site-tosite hopping model. We generated the structures based on our previous observations in our supercell DFT studies, applying the PM description and a self-developed structure-generation algorithm. We then analyzed the effect of dipole orientations on the charge migration velocity, by comparing the generated structures to various reference systems, for example ordered dipole configurations (corresponding to tetragonal and orthorhombic phases), parallel and antiparallel dipoles and randomly oriented dipoles. We also analyzed the effect of structural defects (such as planar defects and precipitates) on the charge velocity. Finally, we estimated the charge mobility in our model based on an experimental reference value for the scattering time at room temperature [7].
This paper is organized as follows. In section 2 we review the PM concept and introduce methods for the MAPbI 3 structure generation and the charge migration study. In section 3 we introduce and discuss the results, and then conclude our findings in section 4.

Computational models and methods
To analyze the large-scale MA + cation order (or disorder), we generated new systems based on the PM distribution of the energy-optimized MAPbI 3 supercell models from our previous study [14]. We developed a semi-classical site-to-site hopping model to study charge migration in MAPbI 3 structures and compared the statistical migration velocities in various dipole configurations. In the following, we will first review the PM concept and then introduce the methods for the large-scale structure generation (see section 2.2) and the charge migration study (see section 2.3).

Pair mode description
We developed the PM concept based on our previous studies in MAPbI 3 structure optimization, using the allelectron numeric-atom-centered orbital DFT code FHI-aims [20][21][22]. In these studies we applied the Perdew-Burke-Ernzerhof (PBE) generalized gradient approximation [24] with Tkatchenko-Scheffler van der Waals (vdW) interactions [25] as the exchange-correlation functional. Based on our single-unit-cell structure optimization study [13], the MA + cation has 32 discrete stable orientations in a unit cell-8 diagonal and 24 face-to-face with a small deviation angle to the lattice plane. In these low-energy states the ammonium end of the cation forms hydrogen bonds with the lattice I − atoms, pulling them towards the cation. The amount of displacement is the largest in the face-to-face MA + orientations, which subsequently has ca.21 meV lower energy than the diagonal orientation. We have parametrized the structural energies of this framework distortion by describing the MA + cation as a dipole vector in the C-N bond direction and studying the dipole configurations in neighboring unit cell pairs, the PMs (see figure 1(a)). The displacement of the I − atom depends not only on the dipole orientation in a single unit cell, but also on the orientation in the neighboring cell. By considering the rotational and reflectional symmetries of the dipole pairs, we were able to reduce the amount of 32 2 =1024 pairs to only 86 unique PMs. Furthermore, by approximating the face-to-face orientations in each unit cell to be identical (i.e. four-fold degenerate without the lattice-plane deviation angle), the amount of unique PMs in this 14-orientation framework (with 8 diagonal and 6 face-to-face orientations) was reduced to 25. In our recent study [14] we optimized 33 4×4×4 MAPbI 3 supercells (see figure 1(b)) with DFT and studied the effect of optimization on the PM distribution (see figure 2). We observed that PMs 1-17, in which one or both dipoles are diagonal, are extremely rare in the optimized structures; much rarer than in an 'intrinsic' random distribution (see red bars in figure 2). This is an obvious consequence of the higher energy of the diagonal structure. Comparing our optimized PM distribution to then only the face-to-face 'intrinsic' PM probabilities of a random system (green data in figure 2), we observe a notable increase in the frequency of PMs 20 and 22. These correspond to perpendicular-aligned dipole pairs. Accordingly, the frequency of the other PMs  decrease. To study the dipole order in MAPbI 3 on a large scale and its effect on the charge migration properties, we then generate new structures based on this optimized PM distribution.

Large-scale structure generation
We generated the structures using our own algorithm, in which the dipoles are reoriented one-by-one in a random order to reach the given PM distribution (the 'target' distribution, hereafter). In this study, the target distribution corresponds to the optimized PM distribution, shown in figure 2. The process is started from an initial state, in which the dipoles are oriented randomly in the 24 face-to-face orientations. These orientations correspond to the minimum-energy structures found in our previous study [13]. New dipole orientations are then determined by evaluating the change that any of the other orientations would induce in the PM distribution. By subtracting the present distribution from the target distribution, the algorithm calculates a difference value for each PM, which tells if the PM is too rare (positive value) or too frequent (negative value) in the present system. Each new dipole orientation is then evaluated by calculating the sum of the difference values for the 6 NNs. The most favorable orientation is the one that produces the largest sum, i.e. provides the largest increase in the deficient PMs. If a more favorable orientation is found, the dipole is rotated to the new orientation, after which the algorithm proceeds to the next random dipole. Via this process, the system evolves towards the target distribution, until a specified error threshold between the distributions has been reached. The error is calculated as the root mean square error (RMSE) of the PM frequency between the present and the target distribution. Due to a noisy variation in the present distribution, the threshold is evaluated with respect to the mean of the 10 previous RMSEs. In practice, the structure generation algorithm consists of an iteration loop, in which the following steps are executed.
(i) Select a random dipole.
(ii) Rotate the dipole through all 32 orientations to see, which one (if any) produces the most favorable change towards the target PM distribution.
(iii) If another orientation is more favorable, reorient the dipole to the most favorable orientation.
(iv) Calculate the present PM distribution.
(v) Calculate the RMSE between the present and target distributions. Stop if the specified accuracy is reached.
In this process, we also analyzed the electrostatic dipole-dipole interaction energy of the structure. The energy is in which p i and p j are the electric dipole moments of dipoles i and E j, ij is the electric field at dipole i by dipole r j, ij is the distance vector from dipole j to dipole i, ò 0 is the vacuum permittivity and ò r is the dielectric constant. The results of the structure generation are presented in section 3.1.

Hopping model for charge migration
We analyzed the effect of dipole order on the charge migration using a self-developed algorithm (see figures 3 and 4), in which a fully localized charge-carrier executes a classical random walk in a simulation box, based on PM-specific NN hopping probabilities. The QM behavior of the charge-carrier is evaluated at each hop based on an analytical solution of the hopping probabilities and the time step (see section 2.3.1), without the need to compute the time evolution of the QM wave-function explicitly. The analytical solution applies the PM-specific electronic coupling energies between the NNs, which are pre-computed transfer integrals (TIs) from the DFT band structure of MAPbI 3 (see section 2.3.2). Importantly, the analytical solution results in an extremely low computational simulation cost. This facilitates the calculation of a large number of migration paths in large systems. For example, in this study, each sample set consists of 10 4 paths, which were calculated in systems of 10 6 unit cells and above.
Each hopping path was started from a random initial location at the bottom plane (z = 0) of the system (see figure 4(a)). To increase the number of the paths that migrate through the system and not exit from the sides, the initial sites were selected from the middle half of the bottom plane (in range [ ] d d 4, 3 4 , in which d is the dimension of the system in x or y direction). The hopping was then executed (see figure 4(b)) until the path exited the system, from any of the 6 sides of the system (see figure 4(c)). To simulate the force of an external electric field on the charge-carrier, we applied a constant bias multiplier to the hopping probabilities in the direction of the electric field, which was set to +z. The multiplier increased the number of the paths that migrated through the system, which was essential in obtaining a sufficiently large set of migration times for the statistical charge velocity. For this model, we first derived the analytical solution for the NN hopping probabilities (see section 2.3.1). The solution applies the electronic coupling energies V n between the NNs, which we calculated from the DFT band structure of MAPbI 3 (see section 2.3.2).

Analytical solution for hopping probabilities
The NN hopping probabilities and the time step have an analytical QM solution (see appendix A for details). The Hamiltonian consists of the NN coupling energies V n between the initial site 0 and the neighboring sites (i.e. the 6 NNs that share a common face of the cubic unit cell with the initial site) n ä [1,6], such that  We defined the squared sum of the coupling values as From these, we derived the time evolution of the wave function and the probabilities for the charge to be at the initial site or at a neighboring site. The probability for the charge to be at the initial site at time t, starting from a fully localized state of the wave function, is Importantly, the probability P(0; t) is a simple function of V and does not require solving the QM time evolution of the wave function explicitly in each hopping step, which is computationally expensive. As can be seen from the cosine function, the probability for the charge to be at the initial site starts to decrease as time progresses. Subsequently, the charge has a non-zero probability to exist at 7 different sites, i.e. at the initial site or at any of the 6 neighboring sites. Thus, we approximated that the charge will hop, on average, once its initial-site probability has decreased below 1/7. The hopping time is therefore At the time t hop , the probability of the charge to be at site n is We then applied the bias multiplier to the probability of the neighbor n in the direction of the electric field and determined the site to which the charge hopped according to the thus-obtained probability distribution. Next, we will introduce the calculation of the electronic coupling energies V n from the DFT band structure of MAPbI 3 .

Electronic coupling between neighboring unit cells
The DFT band structure of MAPbI 3 (see figures 6(a) and C1) exhibits a cosine-type dispersion relation, which is known from the tight-binding (TB) theory [26]. Based on this, we adopted the TB model in the calculation of the electronic coupling between the neighboring unit cells. The coupling energies correspond to the TIs, from which we calculated the NN hopping probabilities in the charge-migration model via the analytical solution.
In the TB approximation, the wave function Ψ k of a particle in a periodic lattice, in which k is the momentum quantum number, is written as a linear combination of the atomic wave functions f n , in which n denotes the site index. In the one-dimensional single-band TB model: . This constant can be set to 0 without loss of generality.
(iii) Only the couplings between the NNs are non-zero, which we will denote by −Λ, i The one-dimensional TB Hamiltonian can thus be written as n n n n n 1 1 1 In practice, we will work on finite systems, in which the total number of sites is N. For this, we will apply periodic boundary conditions by coupling the first and the last (Nth) site with −Λ, such that The solutions to this eigenvalue problem are Bloch waves Ψ k , and the corresponding eigenvalues ε k are cosine functions: in which a is the distance between the NN sites. With (10), we can calculate the TIs as the coupling energies −Λ from the amplitude of the conduction band in the DFT band structure of the material. The effective mass * m of the charge-carrier can be calculated from the curvature of the conduction band (see for example [27]) as Using the cosine function of the energy bands (10) we get in which we have applied ka=0 in the conduction-band minimum. With the effective mass, we can then calculate the mobility of the charge-carrier as in which q is the charge of the charge-carrier and τ is the scattering time. The processes that affect the scattering time in MAPbI 3 , for example the electron-phonon coupling, are outside the scope of our charge-migration model. Thus, we calculated the charge-carrier mobility based on an experimental reference value for the scattering time in the normal-temperature tetragonal phase. We obtained the mobilities for the other analyzed structures by scaling the computed velocities in our charge-migration model linearly in relation to the tetragonal-phase mobility. The results of this analysis are presented in section 3.3.

Computational details
We performed the band-structure DFT calculations with the all-electron numeric-atom-centered orbital code FHI-aims [20][21][22][23]. For all calculations we used tier 2 basis sets and the PBE exchange-correlation functional [24] augmented with long-range vdW corrections based on the Tkatchenko-Scheffler method of Hirshfeld electron-density partitioning [25]. In addition, scalar relativistic effects were included by means of the zeroorder regular approximation [28]. A Γ-centered 4×4×4 k-point mesh was used for the supercell models a and b (see figure C1), and a 8×8×8 k-point mesh for the primitive-cell model c (see figure 6(a)). The results of the band-structure calculations are available from the Novel Materials Discovery (NoMaD) repository [29].

Results and discussion
3.1. Large-scale dipole structures First, we studied the performance of the structure-generation algorithm by generating 50 structures of 20×20×20 unit cells (see figure 5(b)). With the PM distribution of the 33 energy-optimized MAPbI 3 4×4×4 supercells (see figure 2) as the target distribution, we consider the generated structures to describe a low-energy MAPbI 3 structure on a large scale. In the process, we followed the electrostatic dipole-dipole interaction energy (1) within a cutoff radius of 3 unit cells (see figure 5(a)), using the experimental dielectric constant of 25.7 [30] and the dipole moment of 2.3D [9]. Since the electrostatic dipole-dipole energy decreases with respect to the distance between the dipoles as r −3 , we consider the dipole-dipole interactions to be negligible in this study beyond the radius of 3 unit cells (ca.19 Å). We evaluated the convergence of the solution by following the RMSE between the present and the optimized PM distributions. The moderately small system size allowed us to update the PM distribution after each dipole reorientation. The RMSE was updated once in every 100 reorientations, and the system generation was terminated once the mean value of the 10 previous RMSEs decreased below 10 −7 .
The results show that the optimized PM distribution is reached well within 10 4 generation steps (i.e. dipole reorientations). The electrostatic energy, which is initially ca.0 in a system of randomly oriented dipoles, decreases in the generation to ca.−0.08meV per unit cell.
The generated structures suggest that, instead of the polarized domains (see e.g. Frost et al [16]), the dipole order in MAPbI 3 on a large scale may predominantly consist of non-aligned dipoles, which can be in perpendicular formations, corresponding to PMs 20 and 22, and the out-of-plane PM 25. This resembles the tetragonal-phase MAPbI 3 structure (see for example [12]), which however is not static at normal temperatures due to dipole reorientations [8,10]. The dynamic structure may exhibit the tetragonal-phase dipole order in local domains on a small scale, while remaining macroscopically disordered.

Transfer integrals
To obtain the NN hopping probabilities in the charge-migration model, we calculated the TIs from the DFT band structure of MAPbI 3 . We analyzed the charge migration in structures, which consist of the face-to-face PMs 18-25, thus excluding the rare diagonal dipoles. We obtained the TIs for these PMs from the band structures of one primitive-cell model and two 2×2×2 supercell models. We calculated the band structure of MAPbI 3 in the primitive-cell model (see figure 6(a)) using lattice parameters = = = Å a b c 6.313 [31]. In the relaxed structure, the dipole is oriented approximately in the +y direction. Here we use the usual notation 2 for the reciprocal lattice of a simple cubic crystal system, in which the conduction-band minimum is located. In addition, we denote . The band dispersion along R-M x/y/z reflects the effects of the inter-site coupling along x/y/z. As shown in figure 6(a), they all exhibit the typical cosine character and can be described with the one-dimensional TB theory. We can thus calculate the TI from the amplitude of the conduction band using(10). For example, the TI of the extending PM 18 can be calculated from the primitive-cell model along R-M y as  in which E R and E M y are the conduction band energies at points R and M y , respectively. Also, in these points, ka=0 and ka=π in (10), respectively. To obtain the TI for the parallel-aligned PM 23, we calculated the mean value of the TIs in the primitive-cell model along R-M x and R-M z . For the remaining PMs, we calculated the TIs from the DFT band structures of two cubic 2×2×2 supercell models, which mimic the dipole orientations in the low-temperature orthorhombic phase and in the mid-temperature tetragonal phase (see figure C1). In these supercell models, we used the lattice constant of = Å a 2 12.626 . In both models, the conduction band minimum is located at Γ ≡ (0, 0, 0), which is different from the primitive-cell model due to band folding. Instead of the common notation for the high-symmetry points in the Brillouin zone, we denote here º º ( ) ( ) X Y , 0, 0 , 0, , 0 and º ( ) Z 0, 0, 1 2 . From the conduction band dispersion along Γ-X and Γ-Y, we calculated the TIs associated with the perpendicular PMs 20 and 22. Since both the head-perpendicular and the tail-perpendicular PMs exist in both models and cannot be singled out in the band structure, we took the mean value of the calculated TIs and assigned the same value to both PMs. Finally, the antiparallel PM 24 and the out-of-plane PM 25 correspond to the conduction band dispersion along Γ-Z in models a and b, respectively. The opposite-aligned PMs 19 and 21 are extremely rare in the energy-optimized structures (see figure 2) and difficult to obtain via structural relaxation in DFT. Therefore, they were excluded from this TI analysis and assigned a coupling value of 0 eV in the charge-migration model. The TIs for all the calculated PMs are presented in figure 6(b).

Charge migration analysis
With our hopping model, we simulated the charge migration in various MAPbI 3 structures and analyzed the effect of dipole configurations on the charge mobility. We estimated the charge mobility based on the computed charge velocities, the effective mass of the charge-carrier calculated from the DFT band structure (see (12) and figure 6(a)), and an experimental reference value for the scattering time. Since we computed the velocities from the average charge-migration time through a finite-sized systems, they were dependent on the system size. Therefore, we first analyzed the velocities in systems of different sizes to determine the required size for a good approximation of a bulk material.
We generated dipole configurations of 10-130 unit cells with cubic system dimensions, which are analogous to (dis)ordered MAPbI 3 structures and known crystal phases in the coarse-grained model. The structures (see figures 7 and B1) included the following.
(i) The low-temperature orthorhombic phase in two different orientations with respect to the charge migration direction (i.e. the direction of the bias multiplier in the NN hopping probabilities). Figure 7. 5×5 unit-cell models of the basic structure types in the charge migration study. The models show a single xy plane in the structure, with charge-migration in the +z direction. A complete list of the structures is presented in figure B1.
(ii) The mid-temperature tetragonal phase, in two different orientations, as above.
(iii) Parallel and antiparallel dipoles in z and x directions, which correspond to the parallel and perpendicular orientations with respect to the charge migration direction, respectively.
(v) Optimized dipole configurations, generated with the structure-generation method based on the optimized PM distribution (see section 2.2).
In all the charge-migration calculations, we set the bias multiplier to 5, which resulted in ca.80% of the migration paths going through the system. Since the paths were started within the center half of the bottom plane, they rarely exited at the side-faces of the cubic system, particularly in large systems (see figure 8). Typically, a path was terminated due to exiting through the bottom plane during the first few hops in the migration. We computed 10 4 paths in each of the structures, which based on test calculations on larger sample sets, provided a sufficient accuracy for the statistical velocity calculation. The results (see figure 9(a)) show that in all the analyzed structures, a good approximation of a bulk system was obtained with a system dimension of ca.100 unit cells or larger. We observed the highest velocities in the mid-temperature tetragonal phase in the xy direction. With respect to the charge migration in the +z direction, the tetragonal xy structure consists of the out-of-plane PMs 25, which have the strongest electronic coupling, i.e. the highest TI value (see figure 6(b)). Conversely, the tetragonal xz structures had the lowest computed velocities. This can be attributed to the low electronic coupling of the perpendicular PMs 20 and 22 in the direction of the charge migration. In this case, the highly coupled out-of-plane PMs in the x and y directions effectively deviate the charge sideways, which results in a prolonged migration time through the system and thus a low velocity. The same deviation effect can be seen in the parallel and antiparallel z structures, in which the higher coupling of the parallel PM 23 results in more deviation in the sideways direction and thus a lower velocity than in the antiparallel structure. With respect to the disordered structures, the velocities were slightly higher in the generated optimized structures than in the fully random structures. This likely results from the reduced number of the non-coupling PMs 19 and 21 in the optimized structures.
We analyzed the effect of structural defects on the charge-migration velocity by introducing volumes of randomly oriented dipoles in the tetragonal xy structures of 100×100×100 unit cells. We studied the effect of two different types of defects: planar defects and precipitates (see figure 10(a)). In the planar defects, we introduced a system-wide plane in the middle of the system, such that the orientation of the plane (xy) is perpendicular to the charge migration direction (+z). We then varied the thickness of the plane and analyzed the charge velocity with respect to the defect volume in the system, i.e.~Ṽ V defect system . In the precipitates, we introduced 25 randomly located non-overlapping cubic volumes of randomly oriented dipoles in the system. To reduce the noise that originates from the randomly located precipitates, we generated 20 systems of each defect concentration in 0%-20% volume range, and computed the velocities as their mean value. The results (see figure 10(b)) show that the velocity decreases fairly linearly with defect concentration. By extrapolating the linear velocity decrease, which starts from the tetragonal xy phase velocity of ca.7.4 Å fs −1 at 0% defect concentration, the velocity approaches the fully random system with a velocity of ca.5.8 Å fs −1 . The linear velocity decrease is qualitatively similar in both defect types, with the precipitates having slightly lower velocities.
Finally, we calculated the charge mobility in the analyzed structures, based on (i) the computed velocities in the largest computed structures of size 130×130×130 unit cells, (ii) the effective mass of the charge-carrier, calculated from the tetragonal-phase DFT band-structure (12), and iii) an experimental reference value of the scattering time in the tetragonal-phase MAPbI 3 . Karakuset al [7] have studied the electron-phonon scattering in MAPbI 3 with time-resolved terahertz spectroscopy, measuring a scattering time of ca.4fs in the tetragonal phase at 300K. We associated this value with the best-performing tetragonal xy structure in our charge-  in which q e is the charge of an electron. The resulting mobility is more than twice the experimental reference mobility of ca.27 cm 2 V −1 s −1 , measured by Karakuset al [7]. The difference may arise e.g. from the calculation of the effective mass, which we have derived from the DFT band structure of the tetragonal-phase 2×2×2 supercell model with the TB approximation. In reality, the atomic structure of MAPbI 3 is likely disordered on a macroscopic scale at room temperature. Despite the simplified model in this study, the result is of the correct order of magnitude. We obtained the mobilities for the other structures by relating the computed velocities to the tetragonalphase mobility in (15). First, we associated the tetragonal-phase mobility μ tet with the computed velocity of the corresponding structure v tet . Then, we assigned mobilities for the other structures in relation in which μ i is the charge mobility in structure i, and v i is its computed velocity in the charge-migration model. The results (see figure 9(b)) show that the mobilities vary in a range of ca.50-66 cm 2 V −1 s −1 . The relative difference between the best-and worst-performing structures is ca.30%. Also, the variation of 20 cm 2 V −1 s −1 is of the order of the experimentally measured charge mobility in the reference study by Karakuset al. The result suggests that the dipole order has a significant effect on the charge mobility in MAPbI 3 . Interestingly, the highest and the lowest charge mobilities arise from the same tetragonal-phase structure, the difference being the direction of the charge migration. In MAPbI 3 preparation, correct orientation of the structure is therefore important to achieve optimal PV performance.

Computation time
Overall, the computation times in this study are extremely low, measured in seconds or minutes. In comparison, calculations with the conventional DFT methods can typically take hours or days. The structure-generation algorithm (see section 2.2) scales approximately as ( ) n 2 with respect to the number of unit cells n. The scaling factor arises non-trivially from the way that the system size relates to reaching the target PM distribution within the set RMSE threshold. Since the PM distribution of the whole system is updated at regular intervals during the generation process, the computation time depends heavily on the system size. As an example, a system of a cubic dimension of 40 unit cells takes approximately 1 min to generate, whereas a system of a dimension of 50 unit cells requires ca.3.5 min.
The charge-migration algorithm (see section 2.3) relies on the analytical solution for the NN hopping probabilities (see section 2.3.1). Due to this, the process is extremely rapid, and the system size affects directly only on the path length in the migration direction. In large systems, in which the overhead of initiating the process is negligible, the algorithm scales approximately linearly as ( ) n with respect to the number of unit cells. For a given system, the charge-migration analysis can be executed in a fraction of the time that it takes to generate the system. For example, computing 10 5 migration paths in a system of a cubic dimension of 50 unit cells (with the optimized PM distribution) takes ca.2 s.
The computation times presented here were calculated on a single core of a modern workstation processor at the time of this research.

Conclusion
In this work, we have investigated large-scale MAPbI 3 structures with a multi-scale model that applies the PM concept in the parametrization of the complex HP atomic structure. With the optimized PM distribution, we generated new coarse-grained MAPbI 3 structures to study the dipole configurations on a large scale. Different to the previously suggested polarized dipole order in MAPbI 3 [16], our results indicate that the structure may exhibit local tetragonal-phase domains at room temperature, while being disordered on a macroscopic scale due to reorienting dipoles.
We applied our coarse-grained model in the charge-migration study of MAPbI 3 with a semi-classical hopping model. We estimated the average charge-carrier velocities in various (dis)ordered dipole configurations, applying our analytical solution for the QM hopping probabilities and the electronic NN coupling energies from the DFT band-structure. We related the computed velocities to charge mobilities via an experimental reference value for the scattering time and the effective mass calculated from the DFT bandstructure.
We obtained charge mobilities in range of 50-66cm 2 V −1 s −1 , which are comparable to the experimental mobility of 27-150 cm 2 V −1 s −1 [7]. Importantly, both the highest and the lowest mobilities were obtained in the same tetragonal phase, the difference being the charge-migration direction (i.e. in the tetragonal-phase structures xy and xz, respectively). The relative variation of ca.30% in the charge mobility suggests that the orientation of the dipoles has a substantial effect on the charge mobility in MAPbI 3 . In the defected tetragonalphase MAPbI 3 structures we observed a linear velocity relation with respect to the defect concentration, with little difference between the planar defects and the precipitates.
With our model, we have aimed to advance the present understanding on the order of the organic cations in HPs and their effect on the charge mobility. In addition to MAPbI 3 , our model is applicable to other HP compositions with an appropriate analysis of the PMs of the organic cations. Thus, the model provides a costefficient method to analyze the cation order and the charge mobility in numerous HPs. With the acquired knowledge, new HP compositions can be offered for experimental research, and modifications to the present materials can be suggested to boost their performance.
in which states ñ |0 and ñ |n correspond to the charge being located at the initial site or at any of the 6 neighboring sites, respectively (see figure A1). Here we consider that these 7 states are orthonormal and form a complete set, so that the Hamiltonian and the time-dependent wave function Ψ(t) can be represented in terms of them: Since we have a freedom of choosing an appropriate basis to represent the system, we can construct a new delocalized basis as a superposition of the localized states ñ ñ {| | } n 0 ; . The new basis set (also orthonormal) is We define the coefficients α n to be   In the first term, the sum over a n 2 equals 1 (A.10). The second term vanishes due to the orthogonality of the states å å å å å Therefore,Ĥ in the delocalized basis consists of only two states The eigenstates ofĤ are Figure A1. Localized charged states, with ñ |0 corresponding to the initial site before hopping and ñ -ñ | | 1 6 to the neighboring sites. Next, we calculate the time evolution of the quantum state of the system, described by Y ñ | ( ) t . Initially, at time t=0, the charge is fully localized at the initial site, which corresponds to the state Y ñ = ñ | ( ) | 0 0. The time evolution of the system is determined by the time evolution operator In general, the probability P of a QM system to be in a state xñ | is calculated as the squared norm of the wave function amplitude .
A . The probability of the charge to be located at the initial site is then A.24 2 2 Similarly, we get the probability of the charge to be located at any of the neighboring sites n as . From this result we see that the probability of the charge to be located at the initial site, or any of the neighboring sites, oscillates at frequency V/ÿ. The initial-site probability P(0; t) (A.24) is 1 at time t=0, which corresponds to a fully localized charge at the initial site. The probability then starts to decay according to the squared cosine function, as the wave function propagates to the neighboring sites. Concurrently, site n probabilities P(n; t) (A.25) increase according to the squared sine function, so that the higher the coupling energy V n is, the more rapid is the increase.
Due to the probabilistic nature of the hopping, we cannot precisely determine the time period that the charge spends at each site. However, we can approximate the average time t hop that it takes for the charge to hop to another site, based on the fact that the charge has 7 possible sites to be located at. Therefore, on average, we approximate that the charge will hop once its probability to be located at the initial site has decreased below 1/7. The hopping time can be solved from (A.24) as The probability of the charge to be located at site n at the time of a hop can then be calculated from (A.25) as P(n; t hop ) ≡ P n . Based on these probabilities, we can then determine the site that the charge hops into. For this, we first form a cumulative array with values 1 and see, between which values it lands in the array. This then determines the site that the charge hops into, by correctly accounting for different probabilities at different sites. Finally, after each hop, the simulation time is advanced by the time step t hop .