Introduction

Pb-free 2D hybrid organic–inorganic perovskites (HOIPs) have attracted considerable attention due to several advantages over their 3D counterparts, including enhanced structural stability, improved moisture resistance, and a widened band gap1,2,3,4,5,6,7. In addition to typical photovoltaic (PV) and light-emitting diode (LED) applications, 2D perovskites have applications in other optoelectronic devices, such as photoelectric detectors, field-effect transistors, spintronics, waveguides, photocatalysis and nanolasers8,9,10,11,12,13. The present investigation focuses solely on the discovery of Pb-free 2D HOIPs for use in PV and LED applications. Although there are many reasons for the scarcity of promising Pb-free 2D HOIPs in the field, a primary concern is the lack of feasible preliminary candidates that at least fulfill band gap matching. Once a large pool of preliminary candidates has been secured, it will be much easier to identify suitable candidates that meet the remaining stricter conditions required for PV and LED applications. The search for such candidates, however, is not an easy task, as experimental high-throughput screening for HOIPs is still in its early stages14. One of the most efficient approaches in the search for appropriate candidates is employing density functional theory (DFT) calculations, which may lead to tangible discoveries. Rather than a pure DFT calculation-based approach, a data-driven machine learning (ML) approach in conjunction with DFT calculations has recently been used as an alternative, more efficient material discovery strategy15,16,17,18,19,20,21 and has already been applied in the field of perovskite research22,23,24,25,26,27,28,29,30,31,32,33,34.

We propose a different type of data-driven strategy for the discovery of virtual Pb-free 2D HOIPs based on DFT calculations and nominate a number of virtual Pb-free 2D HOIPs suitable for PV and LED applications. The alternative data-driven strategy that we suggest is titled ‘stepwise optimization’, which does not use ML but rather involves metaheuristic35,36 and Bayesian optimization (BO)37. Similar approaches, such as closed-loop optimization15 or active learning16, have recently been reported, mainly based on BO, despite the fact that ML was still in use. The underlying principle for stepwise optimization is similar to that of data-driven approaches15,16, but the stepwise optimization approach differs from other data-driven approaches in that we do not use a ML algorithm or parameter estimation. the stepwise optimization has already been utilized and allowed us to discover many inorganic functional materials over the last two decades38,39,40,41,42.

We use this strategy to discover novel Ruddlesden–Popper (RP) phase perovskites with a fixed number of inorganic octahedron layers (n = 2)43,44, which are regarded as representative 2D perovskites and are among the most commonly studied, although there are equally important Dion–Jacobson phases45,46,47 and alternating cations in the interlayer space (ACI) phases48,49. There should be a tremendously large number of virtual RP phase perovskites by adopting many possible organic spacer molecules, smaller molecules residing in 3D octahedron slabs, octahedron-forming elements (except Pb) and several halide elements. There is no way to identify such a large number of candidates, even though we limited ourselves only to RP structures with n = 2. We need an optimization strategy to reduce the computational burden. We introduce a tandem optimization strategy that combines an elitism-reinforced nondominated sorting genetic algorithm (NSGA-II)50,51,52,53 and a multiobjective Bayesian optimization (MOBO) algorithm54,55,56. Our major aim is to suggest hypothetical Pb-free HOIPs in a limited search space (RP with n = 2). Other larger search space including other phases with other n values will be tracked down separately.

The key point of the proposed approach is the establishment of preset specific target properties (in this case, band gap and effective mass). The preset targets are achieved through a step-by-step improvement of the decision variables. A large amount of data is not required since no ML model training process is required in this approach. Only a small amount of data is required to instantaneously evaluate the objective function values in each step of the NSGA-II and MOBO iterations. The NSGA-II and MOBO algorithms outperform well-known gradient-based optimization algorithms for small problems with at most a few decision variables. The NSGA-II and MOBO algorithms that we adopted for the HOIP discovery process considerably reduced the computational costs of DFT calculations. Although we have previously reported many similar NSGA-II-driven approaches for the discovery of various materials38,39,40,41,42, a NSGA-II and MOBO tandem strategy has never been attempted. BO has been used for simple single-objective optimization problems in some cases57,58,59,60. However, it has never been used for a multiobjective optimization problem involving material discovery, despite the fact that MOBO has been well developed in other disciplines54,55,56.

Results and discussion

Problem setting

We independently implemented two closed-loop stepwise optimization processes: one for the discovery of HOIPs for LED applications and the other for PV applications. Different objective function settings with the same decision variable setting were used to investigate both targets, i.e., two different band gaps were set as targets and optimized independently in the same search space. Five generations (rounds or steps) were made for the NSGA-II/MOBO optimization process. Conventional NSGA-II (or MOBO) computations are iterated thousands of times until a complete optimization is achieved. In contrast, our NSGA-II/MOBO processes were completed after the fifth round due to the high-cost DFT calculation-based evaluation of the objective functions. However, the limited iterations up to at best the fifth round are acceptable by considering the fact that the most drastic improvement was always achieved during the initial stage of iterations, regardless of what metaheuristic-based optimization algorithms were used. The main goal of our NSGA-II/MOBO processes was not to achieve complete convergence on the global optimum but to provide a greater number of plausible HOIP candidates than other knowledge-based trial-and-error approaches.

Objective function selection is critical for the optimization process. There is no single objective function that can incorporate all the properties of interest. We chose objective functions with two critical properties: band gap and effective mass. The formation energy, decomposition energy and energy above the convex hull, which represent thermodynamic stability, were omitted since complex molecule inclusion made it impossible to obtain proper information on the constituent compounds and their convex hull. It is inevitable to remove the thermodynamic stability from the NSGA-II/MOBO process since DFT calculations are unavailable for binary and ternary compounds that involve organic molecules. Almost none of the binary and ternary compounds involving A’- and A-site organic molecules have been discovered yet. It would be irrational to involve DFT calculations for the non-existent structures. Most plausible approach at the current status of development is to adopt hypothetical binary and ternary structures by introducing existing simple-metal-based binary and ternary compounds as a template structure for the non-existent organic-molecule-involved structures23,61,62,63. We also adopted this type of approach in our previous report34, wherein we were dealing with relatively small molecules only. In the present investigation, however, we are dealing with much bigger A’ site molecules than ever before. It would be a complete non-sense to use such a simple-metal-based binary or ternary compound as a template for such a complicated large-organic-molecule-based binary or ternary compound that could scarcely be actually synthesizable.

In addition, the DFT calculation-driven thermodynamic stability represented by formation energy, decomposition energy and energy above the convex hull, etc. would seem to play no significant role in judging if a hypothetical HOIP of concern is useful for the standard solar energy harvesting or light emitting applications, while both the band gap and effective mass play a crucial role. In fact, the decomposition energy had a much smaller variance than the band gap and effective mass in our previous report34. The variance for the min-max-normalized decomposition energy, band gap and effective mass for 266 hypothetical HOIPs were estimated to be 0.008, 0.032, and 0.014, respectively. The variance for the decomposition energy was much smaller in comparison to those for band gap and effective mass. This clearly confirmed that the decomposition energy would not play a key role as an objective function in the NSGA-II/MOBO process.

The decision variables that we adopted were A’-site molecules (85 molecules), A-site molecules (41 molecules), B-site atoms (2 atoms), and X-site atoms (3 atoms) in the RP structure (n = 2), which yielded 20,910 possible HOIP candidates. Figure 1 shows the chosen candidates for A’- and A-site molecules. The careful selection of organic spacer materials, i.e., the A’-site molecule, plays a crucial role in determining the HOIP performance. We introduced all possible candidates as A’-site molecules, and some of the molecules were also selected as A-site molecules based on the size: 41 smaller molecules were simultaneously nominated as A’- and A-site molecule candidates. The B-site atoms were Sn and Ge (Pb was excluded), and the X-site atoms were Cl, Br, and I. It would be practically impossible to determine all the candidates via DFT and ab initio molecular dynamics (AIMD) calculations because the calculation time would be prohibitively long for HOIPs with relatively large cells involving many atoms per cell (>100) when using conventional lab-scale computational capacity. The partial mixing of molecules or atoms at their A’, A, B and X sites was not allowed for the same reason.

Fig. 1: Eighty-four molecule candidates to be inserted in the A’-site of the RP structure (n = 2), among which 41 relatively smaller molecules (located on the upper-left side) were shared with the A-site.
figure 1

1. Ammonium, 2. Hydrazinium(HY), 3. Methylammonium(MA), 4. Hydroxylammonium, 5. Aziridinium, 6. Oxiran-1-ium, 7. Azetidinium(AZ), 8. Oxetan-1-ium, 9. Dimethylammonium(DiMA), 10. Pyrrolidinium(PY), 11. 2-Propanaminium, 12. Tetrahydrofuranium, 13. Ethylammonium(EA), 14. Formamidinium(FA), 15. Trimethylammonium(TriMA), 16. Tetramethylammonium(TetraMA), 17. Cyclopropylammonium, 18. Thiazolium, 19. Iodoformamidinium, 20. 1-Aminoethaniminium, 21. 1-Aminoethaniminium, 22. Guanidinium (GA), 23. 1H-Pyrrole-1-ium, 24. Imidazolium(IM), 25. Protonated thiourea cation, 26. Heteroatom substituted alkylammonium, 27. eEthylammonium bromide, 28. Piperazin-1-ium(PIP), 29. Tetrahydropyran-1-ium, 30. Piperidinium, 31. Ehthylammonium iodide, 32. Cyclobutylammonium, 33. Oxepan-1-ium, 34. Cyclopentylammonium, 35. isobutyl-ammonium(IBA), 36. Azepanium, 37. propan-1-ammonium, 38. Diethylammonium(DiEA), 39. Phenyltrimethylammonium(PTA), 40. 3-iodopyridinium, 41. Cyclohexylammonium, 42. Tropylium, 43. Propylammonium hydroxide, 44. 2-(methylthio)ethylamine(MTEA), 45. Propylammonium bromide, 46. Propylammonium iodide, 47. Cyclohexylphosphonium, 48. But-3-yn-1-ammonium(BYA), 49. Allylammonium(ALA), 50. Ammonium 4-butyric acid, 51. Butan-1-ammonium(BA), 52. Benzimidazolium, 53. Benzylammonium(BZA), 54. 2-thienylmethylammonium(ThMA), 55. Cyclohexylmethylammonium, 56. Butylammonium hydroxide, 57. Butylammonium bromide, 58. Butylammonium iodide, 59. 2-(2-thienyl)ethanaminium, 60. Penta-1-ammonium, 61. 2-chlorophenethylammonium, 62. 2-bromophenethylammonium, 63. Phenylethylammonium(PEA), 64. 4-methylbenzylammonium, 65. 2-(1-cyclohexenyl)ethylammonium, 66. 2-fluorophenethylammonium(FPEA), 67. Pentylammonium hydroxide, 68. 4-fluorophenylethylammonium(F-PEA), 69. Perfluorophenethylammonium(F5-PEA), 70. Pentylammonium bromide, 71. Pentylammonium iodide, 72. 1-(2-naphthyl)methanammonium(NMA), 73. 5-ammonium valeric acid(5-AVA), 74. Hexane-1-ammonium, 75. Naphthalene-O-ethylammonium, 76. Propyl phenyl ammonium(PPA), 77. (carboxy)Cyclohexylmethylammonium(TRA), 78. Hexylammonium bromide, 79. Hexylammonium hydroxide, 80. Hexylammonium iodide, 81. 4-methoxyphenethylammonium(MeO-PEA), 82. Heptan-1-ammonium, 83. 2-(2-naphthyl)ethanammonium(NEA), 84. Octan-1-ammonium, 85. Pyrene-O-ethylammonium.

Figure 2 shows a schematic of the entire NSGA-II/MOBO optimization process, elucidating the closed-loop optimization strategy that differs from conventional ML approaches. The objective functions and decision variables are schematically described in Fig. 2. NSGA-II was hybridized with the MOBO algorithm in such a way that every generation included two MOBO-selected entries. A brief description of the hybridized optimization process is also schematically provided in Fig. 2, and a more detailed description of the process is given in the Methods section. Two parallel optimizations were carried out independently: one for PV applications and one for LED applications. The band gap objective function for PVs (Eg_PV) was set as the energy difference from 0.74 eV, which was determined by considering the GGA underestimation34,64,65,66 of the Shockley–Queisser efficiency limit, which is 1.34 eV67. The band gap objective function for LEDs (Eg_LED) was set as the energy difference from the GGA-based 1.98 eV, which corresponds to the GGA underestimation of a typical blue range peaking at 2.76 eV, since our goal is to pursue blue PeLEDs.

Fig. 2: Schematic representation of the entire computational process involving the DFT-calculated-data-based NSGA-II/MOBO process.
figure 2

The search space based on the RP structure (n = 2) is schematically summarized on the leftmost side. The hybridized NSGA-II and MOBO algorithm are shown in the middle. All the operators for NSGA-II and MOBO are listed in the box. The NSGA-II/MOBO process led to 24 HOIP candidates for LEDs, which were reduced to seven candidates with stricter selection criterion, and in parallel, seven candidates for solar cell applications were directly selected from the NSGA-II/MOBO process.

The effective mass objective function (me/h) for both the PV- and LED-targeted NSGA-II/MOBO optimization processes was defined as the greater effective mass value of either an electron or a hole. In principle, a lower value should be preferred in PV and LED applications because it represents the charge transport potential. Nonetheless, the maximum value was chosen as an objective function because it was minimized in the NSGA-II/MOBO optimization process. It is logically more favorable to minimize the maximum rather than simply pursuing the minimum in an attempt to achieve minimization while simultaneously ensuring a balance between the electron and hole effective masses.

The population size for each generation was 20 for NSGA-II, and two entries recommended by the MOBO algorithm were inserted into each generation except for the randomly chosen first generation, yielding 22 HOIP entries for each generation after the first generation. This results in a total of 216 entries (108 for each of the PV and LED applications). The MOBO-recommended entries at each generation were derived from all previous generations, not just the immediately preceding generation. The NSGA-II/MOBO hybridized optimization process was implemented until the fifth round (generation). Although five NSGA-II generations, along with MOBO, may seem insufficient for complete optimization, it is not an issue because it is well known that the most significant improvement always occurs during the initial stage of optimization, as discussed above. In addition, it should be noted that the goal of the NSGA-II/MOBO hybridized optimization process is to obtain preliminary, rough guidance rather than complete optimization.

Search for 2D HOIPs for use in LED applications

The current development status of blue perovskite light-emitting diodes (PeLEDs) lags far behind those of their red and green counterparts. A desirable blue light-emitting 2D HOIP has yet to be discovered, in contrast to well-established green and red light-emitting perovskites that achieve external quantum efficiencies beyond 20%68,69,70. Blue HOIPs can be fabricated by reducing the dimension of the perovskites, and phases with smaller n values typically have larger band gaps71. However, a strong exciton–phonon coupling of 2D perovskites would limit the external quantum efficiency of 2D blue PeLEDs72. Although modifying the bromide/chloride composition is a well-known method for achieving blue emission, the practical implementation of this strategy has been challenging due to color instability, phase separation caused by electric fields in LED applications, and severe photoluminescence quenching73. Using mixed-dimensional perovskites or colloidal perovskite nanocrystals, bandgap engineering has been attempted74,75,76. However, the combination of 2D and 3D perovskites, known as quasi-2D perovskites, is not applicable for DFT calculations due to our computational capacity. Therefore, we changed the template structure to a 2D RP (n = 2) structure and searched for novel blue light-emitting HOIPs within this structure. In this regard, the decision variables involve each constituent molecule/element at the A’, A, B, and X sites in the fixed 2D RP (n = 2) structure, with no co-occupancy allowed at any site.

Figure 3 shows the summarized result obtained from the NSGA-II/MOBO-hybridized optimization process. Evidence for the intended evolution was clearly detected. Specifically, the NSGA-II/MOBO process drove the HOIP entries to minimize both the objective functions, Eg_LED and me/h. The bad gap objective function (Eg_LED) differs from the band gap itself (Eg). Accordingly, the band gap shifted toward the intended region (GGA-translated blue region) by minimizing Eg_LED, and the effective mass continued to decrease as the NSGA-II/MOBO iteration progressed. Figure 3a shows the evidence for the evolution in terms of the number of promising entries with band gaps between 1.83 and 2.13 eV (GGA-translated blue region) and effective masses less than the rest mass of a free electron, mo. Figure 3b also shows the number of promising entries that fulfill both requirements simultaneously. The number of promising entries increased as the NSGA-II/MOBO iteration continued until the fifth generation. The average objective function values for the first Pareto frontier are also plotted as a function of the generation number in Fig. 3c, with both Eg_LED and me/h decreasing as the NSGA-II/MOBO iteration progresses. The expected hypervolume improvement (EHVI) is also plotted as a function of the generation number, with a clearly increasing trend shown in Fig. 3d. To confirm the evolution (improvement in both the band gap and effective mass) in more detail, the top three Pareto frontiers are plotted for 5 generations, as shown in Fig. 3e. It is worthwhile to focus on the promising area below and to the left of the red lines. The population in these areas becomes denser as the NSGA-II/MOBO iteration progresses. This is clear evidence that the NSGA-II/MOBO optimization algorithm was successful. In particular, the MOBO-nominated entries are specially marked as stars in Fig. 3e, indicating that relatively promising entries were nominated by the MOBO algorithm in later generations. It is clear that the NSGA-II/MOBO hybridization was successful. All entries that appear during the NSGA-II/MOBO iteration are listed in Supplementary Table 1. The density of states (DOS), band structure, and relaxed structure schematics are presented in Supplementary Figs. 1 and 2.

Fig. 3: The NSGA-II/MOBO screening result targeting HOIPs for LED applications.
figure 3

a The numbers of HOIP candidates exhibiting 1.83 < GGA_Eg < 2.13 eV and me/h < mo. b The number of promising entries simultaneously fulfilling the above conditions. c The band gap and effective mass objective functions averaged from the first Pareto frontier are plotted as a function of the NSGA-II/MOBO round. Note that the band gap objective function does not represent the pristine band gap itself, and the effective mass objective function is the maximum effective mass out of electron and hole masses. d The EHVI is plotted as a function of the MOBO round. The increasing trend is indicative of a certain degree of intended evolution. e Pareto-sorted entries were collected from the top three Pareto frontiers of each generation for the NSGA-II/MOBO iteration (the MOBO-nominated entries are separately marked as pentacles).

We collected 24 entries that meet the predefined conditions of 1.83 < Eg < 2.13 eV and me/h < mo, indicating that they could be promising HOIP candidates for LED applications. We finally identified seven candidates from the 24 entries based on the much stricter criterion of 1.91 < Eg < 2.04 eV and me/h < 0.24mo to maintain a balance with the solar cell HOIP search case. These final candidates were validated using HSE06-based DOS and band structure, as well as AIMD. The results from the HSE06-driven DFT and the AIMD calculations are discussed in more detail in a later subsection.

Search for 2D HOIPs for use in PV applications

It is generally known that 2D perovskites are not a good choice as absorbers for solar cell applications because of their wide optical bandgap and the limited charge transport associated with the 2D structure77. In this regard, a 3D/2D mixed-dimensionality approach is recommended for PV applications6,78,79,80,81,82. Low-dimensional HOIPs and phases with smaller n values are known to have larger band gaps and exciton binding energies, so RP phase perovskites (n = 2) may be more suitable for LED applications83. Therefore, the use of RP phase perovskites (n = 2) as a template structure would appear to be a challenge if solar cell applications were the ultimate goal. Despite this challenge, however, we aimed to search the 2D RP structure pool for 2D perovskites suitable for solar cell applications. Although it would be difficult to find candidates with reduced band gaps and improved charge mobilities in the RP structure pool, the search effort was worthwhile because the NSGA-II/MOBO approach could facilitate the search process. It should be noted that the adoption of closed-loop optimization strategies based on NSGA-II and MOBO is motivated by this harsh and seemingly insurmountable challenge. This type of data-driven material discovery strategy could achieve unconventional results beyond the traditional knowledge-based approach.

Evidence for premeditated evolution was also clearly detected in solar cell application-targeted NSGA-II/MOBO implementation. Analogous to the previous case of LED application, NSGA-II/MOBO drove the nominated HOIPs to minimize both the objective functions, Eg_PV and me/h. Accordingly, the band gap approached 0.74 eV (the GGA-translated Shockley–Queisser efficiency limit), and the effective mass continued to decrease as the NSGA-II/MOBO iteration progressed. Figure 4a shows the evidence for the evolution in terms of the number of promising entries with band gaps between 0.61 and 0.87 eV and effective masses less than mo. The number of entries that meet the band gap requirement was lower than the number obtained from the LED-targeted HOIP search (Fig. 3a), although it showed an increasing trend as the NSGA-II/MOBO iteration progressed. Figure 4b also shows the number of promising entries that fulfill both requirements simultaneously. As a result, seven entries remain that fulfill 0.61 < GGA_Eg < 0.87 eV and me/h < 0.24mo. The number of promising entries discovered in the solar cell-targeted HOIP search was less than the number discovered in the LED-targeted HOIP search, where NSGA-II/MOBO nominated 24 promising HOIP entries (Fig. 3b). This reduced number supports the discipline-specific common knowledge that RP phase perovskites (n = 2) are more suitable for LED applications than solar cell applications83. It should be noted, however, that the NSGA-II/MOBO strategy allowed us to find seven candidates even in the RP phase perovskite pool, which is promising evidence that the NSGA-II/MOBO strategy was successful. However, the experimental realization of these hypothetical candidates has yet to be achieved, since the aim of the present investigation was limited to computational suggestions. The suggested HOIPs would be a good starting point for experimental investigators.

Fig. 4: NSGA-II/MOBO screening results targeting HOIPs for solar cell applications.
figure 4

a The numbers of HOIP candidates exhibiting 0.61 < GGA_Eg < 0.87 eV and me/h < mo. b The number of promising entries simultaneously fulfilling the above conditions. c The band gap and effective mass objective functions averaged from the first Pareto frontier are plotted as a function of the NSGA-II/MOBO round. Note that the band gap objective function does not represent the pristine band gap itself, and the effective mass objective function is the maximum effective mass out of electron and hole masses. d The EHVI is plotted as a function of the MOBO round. e Pareto-sorted entries were collected from the top three Pareto frontiers of each generation for the NSGA-II/MOBO iteration (the MOBO nominated entries are separately marked as pentacles).

The average objective function values for the first Pareto frontier are also plotted as a function of the generation number in Fig. 4c, with both Eg_PV and me/h decreasing as the NSGA-II/MOBO iteration progresses. The EHVI is also plotted as a function of the generation number, with a clearly increasing shown in Fig. 4d. The top three Pareto frontiers are plotted for five generations, as shown in Fig. 4e. It is worthwhile to focus on the area below and to the left of the red lines. Analogous to the previous case of LED applications, the population in these areas became denser as the NSGA-II/MOBO iteration progressed. This is clear evidence that the NSGA-II/MOBO was successful for the solar-cell-targeted HOIP search. The MOBO-nominated entries are specially marked as stars in Fig. 4e, indicating that relatively promising entries were nominated by the MOBO algorithm in later generations. All entries that appear during the NSGA-II/MOBO iteration are listed in Supplementary Table 2. The DOS, band structure, and relaxed structure schematics for all entries appearing during the NSGA-II/MOBO iteration are presented in Supplementary Figs. 3 and 4.

NSGA-II/MOBO-nominated HOIPs

We identified 14 HOIP candidates from the solar cell- and LED-targeted HOIP search processes. Since 24 candidates were identified in the LED-targeted HOIP search, we reduced the number of candidates to seven candidates using a stricter criterion that the band gap range should be between 1.91 and 2.04 eV and the effective mass should be less than 0.24mo. As a result, 14 nominated HOIPs in the RP structure pool (seven for each of the PV and LED applications) are listed in Table 1. These nominated HOIP entries were recalculated using the HSE06 exchange correlation functional. Figure 5 shows the HSE06-based band structure, DOS, and schematic structure for the 14 HOIP candidates. The HSE06 band gap fell into realistic value range; 2.68 < Eg < 2.89 eV for LED applications and 1.19 < Eg < 1.49 eV for solar cell applications. We did not include SOC for HSE06 calculation. However, we examined the effect of SOC on the calculation. Supplementary Table 3 shows the effect of SOC on the HSE06 calculation result. The band gap values for HSE06 and HSE06-SOC calculations are similar. Although the effective mass differs between the HSE06 and HSE06-SOC calculations, it does not matter because the effective mass for HSE06-SOC is similar to that for the PBEsol-SOC that we used for the entire NSGA-II/MOBO process. It turned out that the SOC does not affect the band gap but greatly affect the effective mass.

Table 1 Details for the 14 hypothetical HOIPs.
Fig. 5: Fourteen HOIPs chosen from the NSGA-II/MOBO screening results (seven for LED applications and seven for solar cell applications), including the HSE06-based band structure and DOS results; the band structure and DOS were drawn based on the original Bravais lattice.
figure 5

The AIMD total energy plot as a function of time for 5 ps is also presented on the bottom.

Additionally, AIMD was implemented to test the phase stability, and it was revealed that the AIMD total energy never changed significantly as a function of time for 5 ps for any of the nominated HOIP candidates. However, this does not guarantee the stability of the HOIP candidates. The AIMD result shown in Fig. 5 is a weak indicator of the stability of the selected HOIP candidates. Neither the AIMD total energy nor the DFT-driven thermodynamic energy preference can directly indicate the phase stability. The AIMD just shows the possibility of dynamic stability, and indicates neither formability nor synthesizability. However, these could be a rough indicator of the stability. The thermodynamic energy preference and synthesizability of the nominated candidates should be validated experimentally, which would be the future step in the discovery of novel HOIPs.

Although the GGA-band gap is significantly underestimated, it is not an issue to use it as an objective function since the NSGA-II/MOBO iteration operates based on the relative difference in the objective function rather than the absolute value of the objective function. As long as the relative differences among the GGA-based band gaps were consistent with those among the real values, the optimization would not be affected. More realistic band gap values can be easily obtained by referring to a correction curve. It is well known that the HSE06-based band gap is closer to the true value, and a linear relationship exists between the GGA- and HSE06-based band gaps. Thus, we constructed a correction curve using previous data23,34. Figure 6 shows the typical linear relationship between GGA- and HSE06-based band gaps for HOIPs. We superimposed the present 14 HOIP data points on the well-fitted plot. The new data never deviates from the plot.

Fig. 6: Plot of the GGA vs. HSE06 band gap for the 14 HOIP candidates (green circles) along with a linear regression fit from our previous investigation33.
figure 6

The GGA-based band gap can be translated to an HSE06-based calculated version and vice versa by using this linear fit result.

There is no overlap between the 14 NSGA-II/MOBO-nominated HOIP candidates and the well-known existing HOIPs listed in Table 2. Table 2 exhibits only a few HOIPs that meet the selection criterion adopted in the present investigation, as marked as LED and PV in the rightmost column. Six HOIPs are suitable for LED applications and only a HOIP is found useful in PV applications. The lack of PV-targeted HOIPs is rationalized by the commonsense that RP-based 2D perovskites are not suitable for the PV application. The reason for the lack of RP-based 2D perovskites for PV applications were already described above. It should be noted, however, that the seven hypothetical RP-based 2D perovskites for PV applications, which were nominated by the NSGA-II/MOBO strategy, are valuable discoveries in comparison to the scarcity in Table 2, although the synthesizability is yet to be accomplished. The six known HOIPs for use in LED applications are all based on Pb and well-known small molecules for both A’- and A-sites. This means that the NSGA-II/MOBO-nominated, Pb-free HOIPs deserve attention.

Table 2 Details for well-known existing HOIPs based on an RP structure with n = 2.

In summary, we implemented both PV- and LED-targeted HOIP search procedures using a NSGA-II/MOBO hybridized optimization algorithm. The DFT-calculated band gap and effective mass were used as objective functions to be optimized by the NSGA-II/MOBO algorithm, while the choice of molecules and atoms in the RP structure (n = 2) were taken as decision variables. We identified 14 promising, Pb-free HOIPs in the RP structure (n = 2), seven candidates for each of the PV and LED applications. The nominated LED-targeted HOIPs have a HSE06-based band gap between 2.68–2.89 eV (GGA-translated 1.91–2.04 eV) and an effective mass below 0.24mo, while the nominated PV-targeted HOIPs have a band gap between 1.19–1.49 eV (GGA-translated 0.61–0.87 eV) and an effective mass below 0.24mo. The phase stability of the nominated HOIPs was validated using AIMD calculations.

Methods

NSGA-II and MOBO

The elitism-reinforced nondominated sorting genetic algorithm, known as NSGA-II50,51,52,53, has been widely used for multiobjective optimization problems in many fields. Multiobjective optimization problems deal with objectives that trade off with one another, meaning that while one objective increases, the other decreases. Of course, NSGA-II also works in non-trade-off problems. There is no unique global solution, but rather a set of solutions known as the Pareto frontier. The following mathematical forms represent a typical multiobjective optimization problem:

$${\rm{Minimize}}/{\rm{Maximize}}:f_m\left( {{{\mathbf{X}}}} \right),\,m = 1,2,.........,\,M$$

A solution is a vector of n decision variables:

$$\begin{array}{l}{{{\mathbf{X}}}} = \left( {x_1,x_2,.........,x_n} \right)^T\,{{{\mathrm{bounded}}}}\,{{{\mathrm{such}}}}\,{{{\mathrm{that}}}}\,{{S}}:x_i^{(L)} \le x_i \le x_i^{\left( U \right)},\\\,i = 1,2,.........,\,n\end{array}$$

In our specific problem, HOIP discovery was subjected to no constraints. The set of all feasible solutions belong to the feasible region, or S. The term “search space” is also used to refer to the feasible region. The objective space consists of the possible values of the M objective functions for all solutions in S. M is usually 2 in NSGA-II. If M > 2, then the more recently developed NSGA-III would work better34,84. NSGA-II enables us to obtain a nondominated set of the entire feasible search space S: the globally Pareto-optimal set, also called the Pareto frontier. NSGA-II drives the solutions to the Pareto frontier by employing two important principles, elitism and crowding distance, and shares all the other general GA principles, such as crossover, mutation, and tournament selection. NSGA-II uses an elitist principle on a large scale, in contrast to the limited elitism in normal GA, i.e., it performs nondominated sorting in the combined parent and offspring populations, i.e., the solutions are sorted based on the nondominance. NSGA-II also uses an explicit diversity-preserving mechanism (crowding distance) and thereby pursues evenly distributed nondominated solutions.

MOBO54,55,56 has scarcely been used in the materials research domain, although single-objective BO has been used for materials research57,58,59,60. MOBO is still in its infancy, and its application in the materials research domain is rare. We hybridized MOBO and NSGA-II for the purpose of discovering virtual HOIPs in the present investigation and thereby were able to reduce the risk from insecure MOBO. We adopted a state-of-the-art MOBO algorithm that incorporated a parallel q-expected hypervolume improvement acquisition function (qEHVI), enabling efficient and effective gradient-based optimization methods60. BO is powerful for optimization tasks that require time-consuming evaluations of objective functions. For instance, our DFT-calculation-driven HOIP discovery task took the band gap and effective mass as objective functions to be optimized. Therefore, at least several hours (sometimes more than a day) were required for the band gap and effective mass calculations for a single HOIP candidate under our computational capacity. The hybridization of MOBO with NSGA-II would be a useful option to reduce the computational burdens for the current optimization task. Rather than a hard-to-handle objective function, an acquisition function that was easy-to-handle, cheap-to-evaluate, and easily optimizable (with exact gradients computed) was used to determine the next samples to be evaluated. It was sufficient to evaluate the objective function of only a limited number of samples.

Although many blindly trust the capabilities of BO, we do not. It is our opinion that BO capability has been slightly exaggerated due to its outstanding merit, namely low-cost optimization with a relatively small number of samples to be evaluated. Even a single-objective BO execution does not always yield an acceptable result when using a simple benchmark function as a single objective function. Moreover, when compared with the benchmark results for single-objective BO execution, a worse performance can be expected for MOBO. Hybridization with the well-known NSGA-II would be a good option to compensate for the shortcomings of MOBO. It should be noted that our hybridization is not a typical use of the NSGA-II or NSGA-III algorithms for acquisition function optimization since qEHVI is differentiable. We introduced q = 2 for the MOBO process. Two independent samples were created for each step by the MOBO algorithm using all the existing samples previously introduced by both NSGA-II and MOBO. We monitored those generated by MOBO separately for comparison with the NSGA-II-recommended samples.

DFT and AIMD calculations

First-principle calculations were performed using conventional lab-scale computational inventories based on the Vienna Ab initio Simulation Package (VASP5.4)85,86,87,88,89. The generalized gradient approximation (GGA) parameterized by Perdew, Burke, and Ernzerhof85, along with the van der Waals (vdW) functional (DFT-D3)90, was employed for the NSGA-II iterations. Projector-augmented wave potentials91,92, a cutoff energy of 500 eV and a k-mesh interval maintained below 0.3 eV were adopted using the Monkhorst–Pack scheme for all input model structures, which ensured that total energy converged at less than 10−5 eV. We also incorporated spin-orbit coupling93 for all entries in the NSGA-II/MOBO iterations. All structural aspects (atomic position, lattice size, and shape) were allowed to relax until the final force on all relaxed atoms was less than 0.01 eV/Å. Hybrid DFT calculations within the Heyd–Scuseria–Ernzerhof (HSE) formalism with a 25% Hartree–Fock exchange94,95 were employed for some promising candidate compounds elicited from the NSGA-II/MOBO iteration results to perform linear regression fitting for a GGA band gap vs. an HSE06 band gap. To examine the room-temperature structural stability of the selected candidates, AIMD96,97 calculations were implemented at 300 K within the vdW correction using the DFT-D3 method. The AIMD simulation lasted 5 ps, with a time step of 1 fs, and was based on the canonical ensemble (NVT) and Nosé-Hoover thermostat algorithm.