Influence of local fields on the dynamics of four-wave mixing signals from 2D semiconductor systems

In recent years the physics of two-dimensional semiconductors was revived by the discovery of the class of transition metal dichalcogenides. In these systems excitons dominate the optical response in the visible range and open many perspectives for nonlinear spectroscopy. To describe the coherence and polarization dynamics of excitons after ultrafast excitation in these systems, we employ the Bloch equation model of a two-level system extended by a local field describing the exciton-exciton interaction. We calculate four-wave mixing signals and analyze the dependence of the temporal and spectral signals as a function of the delay between the exciting pulses. Exact analytical results obtained for the case of ultrafast ($\delta$-shaped) pulses are compared to numerical solutions obtained for finite pulse durations. If two pulses are used to generate the nonlinear signal, characteristic spectral line splittings are restricted to short delays. When considering a three-pulse excitation the line splittings, induced by the local field effect, persist for long delays. All of the found features are instructively explained within the Bloch vector picture and we show how the exciton occupation dynamics govern the different four-wave mixing signals.


Introduction
With the observation of an extraordinarily strong light emission from monolayers of transition metal dichalcogenides (TMDCs) in 2010 [1,2], this material class came into the focus of semiconductor optics. Since then, the research on the fundamental physics of these materials and the potential applications of them has flourished and is still growing. The optical properties of TMDCs are strongly dominated by excitons which, due to the strong Coulomb interaction in these materials, exhibit binding energies on the order of 500 meV [3], which is orders of magnitude larger than in typical III-V or II-VI semiconductors. While initially the observed spectral lines were rather broad, it turned out that by encapsulating the monolayer in hexagonal boron nitride the inhomogeneity of the structure could be strongly reduced resulting in linewidths of the excitonic transitions approaching the homogeneous limit [4,5,6]. Among other techniques, also four-wave mixing (FWM) spectroscopy has been applied to these materials [7,5,8], which gave access to the coherence and density dynamics of the excitons after ultrafast excitation. These studies revealed FWM signals even for negative delay times [8], which are clear hints for contributions to the signals resulting from exciton-exciton interactions. These features revived the interest in local field models for the description of FWM signals.
Exciton dynamics in semiconductor nanostructures have been thoroughly explored over many years using tools of nonlinear spectroscopy [9,10,11], like differential transmission [12,13], spectral hole burning [14,15] or FWM [16,17]. In the corresponding theoretical description it is often sufficient to apply two-or few-level models [18,19]. For systems like quantum dots (QDs), in particular in samples with a low QD density, it is generally accepted that each excitonic few-level system can be treated individually [20,21] and a collection of QDs is then treated by ensemble averaging [22]. In spatially extended systems like GaAs quantum wells, however, many body effects become relevant [23]. It turned out that an effective mean field treatment of the exciton-exciton interaction successfully reproduces the experiments in such samples [24,25]. This model describes the influence of all other excitons on the two-level system (TLS) of a single exciton in the form of a so-called local field, which allows one to analyze the resulting FWM signals in the lowest contributing order of the exciting fields, i.e., the χ (3) -regime [23]. This local field contribution to a TLS has subsequently been derived from more fundamental theoretical approaches. Starting from a microscopic density matrix description, it can be interpreted as an interaction with the exciton-exciton scattering continuum [26,27]. In the case of a resonant excitation of the 1s exciton, such local field contributions can also be obtained from a simplification of the semiconductor Bloch equations [28]. Recently, also in TMDC systems simplified models derived from a microscopic theory have been found to show terms which can be interpreted as local field effects [29,30].
Motivated by these theoretical works and the experiments in Ref. [8], but not limited to TMDC systems, we build on the original model and thoroughly study the influence of the local field effect on the spectral dynamics of different FWM signals. In practice we focus on different excitation scenarios, namely a two-and a three-pulse excitation scheme, which allows us to probe different aspects of the system dynamics [20]. We show that an analytic treatment in the limit of ultrafast laser pulses is possible even without restrictions to a perturbative treatment, which allows us in particular to study the dependence of the FWM signals on the intensities of the exciting pulses. An instructive explanation of the involved signal dynamics based on the Bloch vector description clarifies their physical interpretation. In turn, the numerical simulation of FWM signals generated by laser pulses with realistic durations in the sub-picosecond range allows us to make predictions for actual experiments and to study the effect of temporally overlapping pulses.
The paper is organized as follows: After the introduction to the model in Sec. 2 an analytic solution of the equations of motion for ultrafast optical excitations is given in Sec. 3. Two-pulse FWM signals are discussed in Sec. 4 and three-pulse signals in Sec. 5, where first the δ-pulse limit and then non-vanishing pulse durations are analyzed. Finally, in Sec. 6 we draw some conclusions.

Model
The optical driving of a collection of two-level systems, each consisting of the ground state |g and the excited state |x separated by an energy ω 0 , is in general described by the optical Bloch equations [18], The central quantities are the polarization p = |g x| and the occupation n = |x x| . Transitions between the two states are induced by the classical light field E(t), expressed here in terms of the instantaneous Rabi frequency Ω(t) = M · E(t)/ . M is the dipole matrix element and E(t) is taken to be in resonance with the transition energy ω 0 . Therefore, both E(t) and Ω(t) are described only by their envelopes. Note that we have applied the standard rotating wave approximation (RWA) and the equations are given in the frame rotating with the transition energy ω 0 .The RWA is well justified for a resonant or close-to-resonant excitation with pulse durations in the hundred femtosecond range and moderate pulse powers [31]. In the context of TMDC monolayers, the restriction to a two-level model is expected to be applicable, when the dynamics are restricted to excitons in a single valley, which is the case in a resonant co-circular excitation scheme. For large exciton densities the scattering into other states might become relevant [30]. Nevertheless, we will study the behavior of the TLS also in the regime of larger pulse areas beyond the third order regime, keeping in mind that in actual experiments effects from other excitons might contribute to the signals. In this way, when comparing to measured signals, our results may act as a reference to estimate at which excitation powers the TLS model loses its validity. We have included a phenomenological dephasing rate β and a decay rate of the excited state Γ.
In the local field model the optical field is supplemented by a contribution due to the field generated by the polarization of the TLSs themselves [23]. The full optical field then reads Ω(t) = Ω ext (t) + V p, with the external optical laser field Ω ext (t) and the coupling parameter V that results from the Coulomb interaction among the excitons and that determines the strength of the self-interaction [28,26]. This substitution leads to the Bloch equations with a local field contribution, Note that the local field is characterized by a real value of V [23], therefore it does not appear in Eq. (2b). Considering an imaginary part for V in Eq. (2a) also describes an excitation induced dephasing (EID) effect [32], as can be derived from a microscopic theory including exciton-exciton scattering [33]. Note, however, that Eq. (2b) remains unchanged because exciton-exciton scattering does not lead to generation or recombination of excitons and therefore does not change the exciton occupation. To include both features, in the following we will treat V as a complex constant. For simplicity we will continue to refer to V p as a local field effect. When discussing the analytical results we will explicitly address the influence of real and imaginary part of V . The local field obviously leads to an additional nonlinearity ∼ np in the optical equations. By reformulating Eq. (2a), this term can be interpreted as an effective occupation-dependent shift of the transition energy from ω 0 to ω eff = ω 0 + ω loc (n) and an effective occupation-dependent dephasing β eff = β + β loc (n) according to with When the system is in its ground state with n = p = 0, the effective transition frequency is reduced to ω eff = ω 0 − Re(V ). It increases linearly with growing occupation, reaching ω eff = ω 0 + Re(V ) in the excited state n = 1, p = 0. Consequently, for n = 1/2 the frequency remains unchanged ω eff = ω 0 . To obtain EID, i.e., a dephasing which increases with increasing density Im(V ) < 0 is required. This, in turn, requires β > −Im(V ) to still have a dephasing in the linear regime.

Solution for ultrashort pulses
Despite the nonlinearity, Eqs. (2) can be solved analytically in the limit of ultrashort pulses, i.e., for δ-pulses. Note that while mathematically we use δ-pulses, physically the ultrashort pulse limit is reached if the pulse duration is much shorter than the characteristic timescale of the system's dynamics. This is still well within the validity of the RWA. As will be seen later, the obtained results will be helpful for understanding the specific behavior of the considered FWM signals. To reach the δ-pulse limit, we assume a rectangularly shaped, resonant pulse of duration ∆t centered around the time t 0 with total pulse area θ and phase φ as and at the end perform the limit ∆t → 0. To solve the differential equations, we use a parametrization of the instantaneous pulse area [34], instead of the time t. Clearly, A is restricted to the interval [0, θ] corresponding to the beginning of the pulse and its end, where the total pulse area is reached. Utilizing the pulse area instead of the time avoids divergences in Eqs. (2) connected with an ultrashort optical pulse. The transformed equations during the pulse read In the limiting case ∆t → 0 all contributions from the local field, dephasing and decay of the TLS vanish and Eqs. (8) have the same solution as the ordinary optical Bloch equations given by [34], where p + , n + (p − , n − ) denote the polarization and occupation directly after (before) the pulse, respectively.
In the absence of an external field, Ω ext = 0, the dynamics of the occupation are only subject to the decay. For an initial occupation n 0 at t = 0 it simply drops exponentially This time dependence contributes to the polarization as a time-dependent phase shift introduced by the local field coupling which therefore reads In most semiconductor systems, the polarization dephases on a much shorter timescale than the occupation decays [20]. In this case we can take Γ → 0 and the polarization dynamics simplify to where the polarization simply rotates with the effective transition frequency ω eff (n 0 ) = ω 0 − (1 − 2n 0 )Re(V ) and it decays with the effective dephasing rate β eff = β + (1 − 2n)Im(V ), as given in Eq. (3). However, note that for completeness in the analytical results presented in the following this limit has not been performed and the parameter Γ is still included.

Two-pulse FWM
The most basic FWM experiment utilizes the two-pulse sequence depicted in Fig. 1. Two resonant laser pulses are applied with a variable delay τ 12 with respect to each other. The pulse areas θ 1 and θ 2 of the first and second pulse are kept fixed while the pulse phases φ 1 and φ 2 are varied during the repetition of the experiment. In the simulation the FWM signal is extracted by a phase-selection scheme from the microscopic exciton polarization as explained later. The second arriving pulse generates the FWM signal and sets the time t = 0.

δ-pulse limit
Based on the analytic results for a single ultrashort pulse derived in Eqs. (9) and (10), we start our discussion by calculating the two-pulse FWM signal analytically in that limit. The polarization after the second pulse depends on the time t after that pulse and the delay τ 12 within the pulse pair and reads Note, that the polarization is not oscillating with the transition frequency because we are working in the rotating frame of the laser field. Directly after the second pulse at t = 0, the polarization consists of three parts depending on the phases φ 1 , φ 2 and 2φ 2 − φ 1 . In the field-free propagation after the second pulse, the polarization receives an additional contribution from the occupation via the local field coupling. The corresponding term is the last exponential function in Eq. (12a), which also contributes to the phase dependence via the difference φ 2 − φ 1 and its complex conjugate phase. This will be discussed in detail below. From Eqs. (12) we find that in the limit Γ → 0 an imaginary part of the local field parameter only gives rise to an additional damping, depending on the pulse areas and the delay time. Thus, we expect that EID manly gives rise to a faster decay of the signals and a broadening of the corresponding spectra. Since here we are mainly interested in characteristic features in the signals and the spectra in the following we will neglect an imaginary part and assume that V is a real parameter.
In the Appendix B we will show results of calculations including EID, which will confirm that this expectation is indeed fulfilled. The FWM dynamics are obtained by filtering the polarization in Eq. (12a) with respect to the inverse FWM phase −φ Hence, the FWM dynamics consist only of that part of the polarization which carries the FWM phase as all other phase-dependencies vanish. This procedure models the typical heterodyne detection in FWM experiments [11] Without local field interaction there is no additional phase in the free propagation. Then the FWM polarization consists only of the term ∼ c in Eq. (12a) which reads With local field interaction, the resulting FWM dynamics are We find that each term of the polarization contributes to the FWM dynamics with a Bessel function of a different order. The full expression, after inserting Eqs. (12b)-(12g) into Eq. (15), becomes quite involved, but we will now have a more detailed look on the creation of the FWM signal to understand the origins of its different contributions. A flow chart of the pulse sequence and phase selection resulting in the FWM signal is depicted in Fig. 2. Initially the system is in the ground state with n = p = 0. The first pulse creates an occupation 0 n + 1 and a polarization with the phase of the first pulse φ 1 p + 1 , marked as green wave for a laser pulse induced step. For clarity, we give the phase dependence of each quantity on the left side of the symbol and the number of the pulse on the right side, where + is directly after and − directly before the respective pulse. During the field-free propagation before the second pulse two things happen. On the one hand both quantities simply propagate in time indicated by black arrows. On the other hand the occupation contributes to the phase through the local field coupling [Φ in Eq. (12f)], indicated by a violet dotted arrow. The second pulse creates an occupation consisting of a part independent of the pulse phases 0 n + 2 and a part depending on the pulses' phase difference φ 2 −φ 1 φ 1 −φ 2 n + 2 . Because the occupation is a real quantity, this quantity has to depend on the phase difference and its conjugate, therefore two terms appear on the left side of the symbol. As mentioned above, the polarization created by the last pulse consists of parts depending on the phases of the exciting pulses φ 1 p + 2 , φ 2 p + 2 and the FWM phase 2φ 2 −φ 1 p + 2 . After the second pulse, φ 2 −φ 1 φ 1 −φ 2 n + 2 contributes to the polarization via the local field resulting in additional wave-mixing possibilities (violet dotted arrows). We illustrate this by one example: The simplest wave-mixing process between the occupation φ 2 −φ 1 φ 1 −φ 2 n + 2 and the polarization φ 2 p + 2 that results in the FWM phase is where the occupation enters once. It is possible to generalize this to larger numbers of with k ∈ N, where the occupation is used in total 2k − 1 times. In the flow chart this order of the phase difference mixing is annotated on the violet dotted lines. Considering the limit of weak local field coupling |V | β, i.e., |η| 1 the mixing orders enter as η 2k−1 . This directly determines the order of the corresponding Bessel function, which is one in this case. Applying this procedure also to the other polarizations we find the Bessel functions of 0th, 1st, and 2nd order.
In the case of a negative delay, the pulse alignment is inverted as depicted in Fig. 3. The chronologically first (second) pulse has the pulse area θ 2 (θ 1 ) and the phase φ 2 (φ 1 ). To obtain the FWM signal for negative delays, Eq. (12) can be re-used where the pulse areas, pulse phases and the sign of the delay have to be switched In contrast to the FWM dynamics with positive delay, the orders of the Bessel functions are different as the wave-mixing orders of φ 2 −φ 1 φ 1 −φ 2 n + 2 change. In particular, as both pulses change their ordering, in the pure TLS only the polarization 2φ 1 −φ 2 p + 2 is created, which results in a vanishing FWM signal. Therefore, to obtain a signal with the FWM phase 2φ 2 − φ 1 at least a third order mixing via the local field is necessary. Thus, the lowest order of wave-mixing is one, which is the contribution of φ 2 p + 2 mixing with the phase difference φ 2 − φ 1 of the occupation once.

Simulations for non-vanishing pulse durations
To study FWM signals with non-vanishing pulse durations ∆t > 0, the Bloch equations with local field coupling in Eqs. (2) and the phase filtering are evaluated numerically [35].
The optical field is modeled as resonant Gaussian pulses centered at t = −τ 12 and t = 0, for τ 12 ≷ 0 .
To account for the inverted pulse ordering, for τ 12 > 0 the upper indices in θ1 2 and φ1 2 are used, while the lower ones refer to τ 12 < 0. For the following discussion we keep as many system parameters as possible fixed to keep the analysis as instructive as possible.
For the exciton dephasing we choose β = 0.5 ps −1 which is a typical value for hBNencapsulated TMDC monolayer [36]. The exciton decay time is usually much longer than the dephasing [37] and we choose Γ = 0. For a non-vanishing value the impact of the local field effect would additionally decay on time in a non-trivial way. To avoid this complication we disregard the decay. For the pulse duration we assume ∆t = 0.1 ps as a typical value for pulse trains in FWM experiments [8]. In addition we set θ 2 = 2θ 1 which is usually applied in FWM experiments [38]. As local field coupling we consider V = 6 ps −1 which is an order of magnitude larger than the dephasing rate β. In the literature similar proportions have been used to describe semiconductor quantum well systems [25,24]. In monolayer TMDCs, the dephasing and decay rates differ between samples. Additionally, the strength of the local field is generally not known and may depend on the ambient conditions. For simplicity, it is fixed to a value such that local field effects are clearly visible. Future comparisons to experiments will help to find the correct values for the local field strength. We begin the discussion by analyzing the influence of the local field coupling on FWM signals in the regime of small pulse areas. Exemplarily, in Fig. 4, the FWM dynamics for θ 1 = 0.05π of the pure TLS with V = 0 and with a local field coupling V = 6 ps −1 are shown in (a) and (b), respectively.
In the pure TLS without local field interaction, the FWM dynamics in Fig. 4(a) directly represent the polarization dynamics which are given by exponential decays in real time t and delay time τ 12 with the dephasing rate β [Eq. (14)]. In the direction of negative delays the dynamics drop sharply when going to τ 12 < 0 as there is no FWM polarization created when the pulse with phase φ 2 arrives before the pulse with phase φ 1 [20]. The corresponding FWM spectrum, defined as is a Lorentzian line for positive delays, i.e., a single peak at the bare transition energy ω = ω 0 (not shown). Upon including the local field interaction, the FWM dynamics in Fig. 4(b) change qualitatively. On the one hand, the signal extends to τ 12 < 0 and on the other hand, the maximum of the signal is shifted to larger times t > 0. Both aspects can be understood  with the analytic solution for δ-pulses. For low excitation powers, the lowest order contributing to the FWM signal is called the χ (3) -regime which for t ≥ 0 reads [23] p with κ 2 = −iθ 1 θ 2 2 /8 and the Heaviside function Θ(t). For positive delays, the dynamics consist of a part decaying as e −β(t+τ 12 ) which reflects simply the FWM dynamics without local field interaction and a part that rises for short times ∼ V te −β(t+|τ 12 |) . This contribution results from the mixing of the polarization φ 2 p + 2 with the phase-dependent occupation φ 2 −φ 1 φ 1 −φ 2 n + 2 , i.e., mixing order 1 in Fig. 2. In contrast, for negative delays, only the last term contributes, as the FWM polarization is not created in the pure TLS but the polarization φ 2 p + 2 can still mix with the occupation. Here, the pulse with the phase φ 2 arrives first, therefore both φ 2 p + 2 and φ 2 −φ 1 φ 1 −φ 2 n + 2 are reduced due to dephasing before the second pulse with φ 1 . This results in the additional decay of the signal with e −β|τ 12 | in Eq. (21). So, in total, the signal decays twice as fast for negative delays than for positive ones. The FWM spectrum also consists of a single Lorentzian peak that is shifted by − V due to the energy renormalization from the local field coupling (not shown).
For larger pulse areas beyond the χ (3) -regime, the FWM dynamics change remarkably. In Fig. 5(a) the FWM dynamics for θ 1 = 0.2π are shown. As a first striking difference, they exhibit oscillations, which stem from the Bessel functions known from the δ-pulse solution in Eq. (15). For increasing absolute delays, the minima of this modulation shift to larger t. Their functional evolution τ (t) in the plot can be estimated from the equations. It is determined by the curve for which the argument η of the Bessel functions [Eq. (12g)] is constant, i.e., its differential dη vanishes:  This curve is included as dashed line in Fig. 5(a). Therefore, the exponential loss of coherence between the pulses leads to the logarithm-shaped evolution of the minima.
In Fig. 5(b), the corresponding FWM spectra are shown as a function of τ 12 . While for small absolute delays |τ 12 | 1/β, the spectrum is broad it becomes narrow for large delays, which can be explained by considering the FWM dynamics. Because the oscillations vanish for large τ 12 , the single maximum at t > 0 leads to a single line in the spectrum. This behavior of the FWM spectra is similar for positive and negative delays and the spectral positions of large |τ 12 | coincide. However for negative delays the signal is weaker and decays faster, as already seen in the χ (3) -regime in Eq. (21). The physical meaning of the spread and position of the spectrum will be discussed below.
Also the FWM dynamics for higher pulse areas of θ 1 = 0.4π, shown in Fig. 5(c), exhibit similarities with the previous example like the oscillatory behavior. However, one difference is that the signal decays much faster. In addition, when looking at small delays |τ 12 | ≈ ∆t, the minima deviate from the logarithmic shape. This indicates, that there is a difference between the action of two separate and two simultaneous pulses. The reason is that the pulse area and duration have a non-trivial impact on the resulting state when the system deviates from the pure TLS [38,39]. Therefore, the pulse area theorem [18] does not hold for extended pulses anymore [39] and the pulse area θ does not agree with the rotation angle of the Bloch vector. This makes the dynamics of the signal more involved. Details about this pulse area renormalization are given in Appendix A. In addition, during the pulse overlap complex dynamics stemming from Rabi oscillations contribute to the FWM signal for large enough pulse areas [40]. Qualitatively also the FWM spectra, shown in Fig. 5(d), have a similar form as the ones at lower pulse area. The most striking difference is however, that the final peak energy for long delays moves from ω < ω 0 to ω > ω 0 . A discussion of the differences between the simulations and the δ-pulse limit is presented in Appendix B, where the respective signals in the ultrashort pulse limit are shown. As expected, some deviations between the simulations with non-vanishing pulse duration and δ-pulses appear for delays that are shorter or of the order of the pulse duration, i.e., in the regime τ 12 200 fs, where the overlap has a significant influence on the generated signal. For longer delays, the δ-pulse limit turns out to be a very accurate approximation. In the following we analyze the spread and position of the FWM spectra at small and large delays in detail. A systematic study for small delays is shown in Fig. 6(a), where the spectra are depicted for a continuous variation of the pulse area θ 1 . The delay is fixed to twice the pulse duration τ 12 = 2∆t, such that the two pulses are mainly separated. At low pulse areas the spectrum is very narrow as we have discussed for the χ (3) -regime. We have already seen in the exemplary FWM spectra that the spectral width increases with growing pulse area. This spread of the spectra can be traced back to the fact, that the transition energy is modified by the occupation [see Eq. (3)]. So, in order to develop an intuitive picture which occupations can be reached in the FWM experiment, we consider the Bloch vector representation of the systems state in Fig. 6(b). The Bloch vector is given by its coordinates [2Re(p), 2Im(p), 2n − 1] and it points to the surface of the Bloch sphere for pure states and is shorter than unity when dephasing happens. Laser pulse excitations lead to rotations of the Bloch vector around an axis that depends on the phase of the pulse and, in general, also on the laser pulse detuning. However, here we assume a resonant excitation throughout this study.
Keeping in mind that all possible combinations of pulse phases φ 1 and φ 2 that match the FWM phase are realized during the repetition of the experiment, we search for the smallest and largest possible occupation after the two pulses. Following Eq. (3) the corresponding renormalized energies can be calculated to ω eff which connects the spread of the FWM spectrum with the extent to which the system can be addressed by its coherences. We can interpret this reasoning as a kind of coherent control experiment [41].
In particular, such a manipulation of the occupation by varying the phases of optical pulses is performed in Ramsey fringe experiments [42].
As schematically shown in the picture the largest occupation N The corresponding energies ω eff (N ± ) are marked in Fig. 6(a) as dashed black lines. They describe the boundaries of the spectrum very well. But we find a slight mismatch between the dashed line and the edge of the spectrum. This deviation can be traced back to a renormalization of the pulse area due to the local field interaction, as we have mentioned before. While the pulse area is defined via its rotation angle in the pure TLS, the local field interaction changes this rotation angle in a non-trivial way (see Appendix A).
The FWM spectra for large delays are depicted in Fig. 6(c). As we have already seen in Figs. 5(b, d), the spectrum becomes a single narrow line. Its energy depends on the pulse area and is the same for positive and negative delays. These findings can again be traced back to the behavior of the Bloch vector depicted in Fig. 6(d). After the first Rabi rotation the polarization dephases as indicated by a dotted line, i.e., the Bloch vector moves onto the Bloch sphere's z-axis. Afterwards, the second pulse leads to another Rabi rotation, which is reduced because the length of the Bloch vector is smaller than one. The spectral position of the FWM signal is then determined by the occupation after the second pulse N (2) ∞ . For ultrafast pulses the final occupation reads and is the same for both positive and negative delays as also shown in the schematic. This is the reason why the spectra lie at the same position for positive and negative delays. Because N (2) ∞ does not depend on the pulse phases anymore the spectrum consists of a single sharp line. In the context of coherent control this means that for delays much longer than the dephasing time the system cannot be addressed coherently anymore. Naturally, the loss of coherence also leads to a decay of the signal strength.

Three-pulse FWM
While in the bare TLS the two-pulse FWM signal carries information on the coherence of the system, it is possible to measure the occupation dynamics in a three-pulse configuration with φ [20]. As for the influence of the local field the occupation plays a crucial role we expect interesting features to appear for this pulse sequence.

δ-pulse limit
With three pulses there are two delays, τ 12 between the first and second pulse and τ 23 between the second and third pulse. Like in two-pulse FWM, the pulse areas θ 1,2,3 are kept fixed while the pulse phases φ 1,2,3 are scanned to generate the FWM signal. Usually, the delay τ 12 is kept small, such that the pulses slightly overlap, and it stays fixed while the delay τ 23 is varied.
As explained before in the limit of ultrashort pulses all properties of the system can be calculated analytically. Following the results in Eq. (12) and applying Eqs. (9) we end up with the following phase dependencies immediately after the third pulse: The occupation n + 3 carries the phases φ 2 − φ 1 , φ 3 − φ 1 , φ 3 − 2φ 2 + φ 1 and their complex conjugates, the polarization p + 3 carries the phases φ 1 , φ 2 , φ 3 , 2φ 2 −φ 1 , 2φ 3 −φ 2 , 2φ 3 −φ 1 , and 2φ 3 − 2φ 2 + φ 1 . In the following free propagation additional wave mixing between the occupation and the polarization can contribute to the FWM signal. As this large number of possible phase combinations complicates the situation massively, we will only analyze the limiting case of large delays τ 23 1/β analytically. So we assume that before the interaction with the third pulse the polarization has dephased entirely and we can set p − 3 = 0. Choosing further τ 12 = 0, the polarization after the three pulses reads with Note that these equations are valid for a complex local field parameter V , i.e., EID is included. However, in the following we will again concentrate on a real parameter V . Using the FWM phase φ A flow chart of the single steps leading to the FWM signal is shown in Fig. 7. Analogous to Fig. 2 the first and second pulse create an occupation which consists of a part independent of the pulse phases 0 n + 2 and a part dependent on the pulses' phase difference These two terms remain even for long delays because we assume that the excited state decay is much slower than the dephasing. The polarization which consists of φ 1 p + 2 , φ 2 p + 2 and 2φ 2 −φ 1 p + 2 vanishes due to the long delay τ 23 which is indicated by black dotted arrows. As a consequence the last pulse creates polarizations exclusively from the occupations before this pulse. Therefore, the final polarization consists of a part with the phase of the third pulse φ 3 p + 3 , indicated by the green wave from 0 n + 2 , and a part φ 3 +φ 2 −φ 1 φ 3 −φ 2 +φ 1 p + 3 which adapts the phase difference from the occupation φ 2 −φ 1 φ 1 −φ 2 n + 2 . There is also the possibility that the occupation does not receive an additional phase dependence and simply becomes φ 2 −φ 1 φ 1 −φ 2 n + 3 . Without the local field coupling, only φ 3 −φ 2 +φ 1 p + 3 contributes to the FWM signal and we see that it carries the information of the previous occupation. Through the local field coupling, the polarization mixes with φ 2 −φ 1 φ 1 −φ 2 n + 3 in the following field-free propagation which is marked by violet dotted arrows. This again results in Bessel functions describing the dynamics of the FWM signal. In direct analogy to the discussion of two-pulse FWM, the minimal number of mixing processes involved determines the order of the Bessel function. The possible mixing orders are again annotated on the violet arrows.
Finally, in case of long delays τ 23 1/β, the FWM dynamics read The dynamics have a similar form as the analytical result from the two-pulse FWM in Eq. (15). This is because the occupation n + 2 has the same phase dependence as in the two-pulse case and lowest mixing orders are also 0, 1, and 2 for φ 3 +φ 2 −φ 1 p + 3 , φ 3 p + 3 , and φ 3 −φ 2 +φ 1 p + 3 , respectively.

Simulations for non-vanishing pulse durations
To discuss non-vanishing pulse durations ∆t > 0, we simulate the FWM signals numerically with Gaussian laser pulses. We set all pulse areas to the same value θ 1 = θ 2 = θ 3 and fix the delay between the first two pulses to τ 12 = 0.2 ps. The other parameters agree with the two-pulse case, i.e., β = 0.5 ps −1 , Γ = 0, and ∆t = 0.1 ps. Beginning with small pulse areas, the FWM dynamics for θ 1 = 0.05π without local field interaction V = 0 and with a local field coupling of V = 6 ps −1 are shown in Figs. 8(a) and (b), respectively. For V = 0 in (a) the FWM signal decays in real time t as the FWM polarization dephases. The signal does not change when varying positive delays τ 23 because we did assume a vanishing decay rate and the signal is probing the occupation dynamics. For negative delays however, the pulse with φ 3 arrives first. After this pulse only the polarization is phase-dependent and the next two pulses with φ 1 and φ 2 create the FWM signal. Therefore the FWM signal reflects the polarization dynamics. This quantity decays ∼ e −β|τ 23 | due to the dephasing in the field-free propagation after the single pulse.
With the local field interaction in Fig. 8(b), the signal changes qualitatively. Similar to two-pulse FWM, the maximum of the FWM dynamics is shifted to t > 0 due to the wave-mixing in the free propagation after all three pulses. In addition the signal is strongest for short delays τ 23 1/β. These aspects can be explained by calculating the   FWM dynamics in the χ (3) -regime to with κ 3 = −iθ 1 θ 2 θ 3 /4. Independent of the sign of the delay, the energy is renormalized. Hence, the FWM spectrum consists of a single line at ω = ω 0 − V (not shown). For positive delays the first term in the curly brackets in Eq. (28) contributes. There, the first contribution simply stems from the pure TLS without local field coupling. The second contribution is created by the interaction with the local field and consists of two parts. The first one does not depend on the delay and stems from the mixing between the polarization after the third pulse φ 3 p + 3 and the occupation φ 2 −φ 1 n + 3 . The second part decays with e −2βτ 23 and stems from the polarization φ 2 p + 3 which mixes with φ 3 −φ 1 n + 3 . As both of these quntities stem from the polarization after pulses 1 and 2, they suffer from the dephasing in the field-free propagation after the pulse pair. Therefore this contribution decays with twice the dephasing rate.
For negative delays the second term in the curly brackets in Eq. (28) contributes. The pulse with phase φ 3 arrives first and after the delay τ 23 , the two pulses with φ 1 , φ 2 excite almost simultaneously. Consequently the only contributing quantity after the first excitation is φ 3 p + 3 and the entire signal decays with e −β|τ 23 | . The FWM dynamics consist of a contribution from the pure TLS and a part from the local field interaction. The latter consists of the mixing between φ 3 p + 3 and φ 2 −φ 1 n + 3 and between φ 2 p + 3 and φ 3 −φ 1 n + 3 resulting in the term ∼ V t.
Going to larger pulse areas, the FWM spectra first change slightly. For an intermediate pulse area of θ 1 = 0.25π in Fig. 9(a) the FWM spectrum is initially significantly broadened and becomes narrower for larger delays but it does not evolve into a single symmetric line and remains broader than the Lorentzian in the χ (3) -regime. A detailed discussion of the linewidth at large delays will be carried out later.
At still higher pulse areas, e.g., θ 1 = 0.45π depicted in Fig. 9(b), the FWM spectra change significantly. Initially, the FWM spectrum is broad and evolves into two separate lines for large τ 23 which is a striking difference to the previous picture. Both lines are  separated by a minimum of the signal at ω = ω 0 . This characteristic behavior will be explained in the context of coherent control later. For negative delays τ 23 < 0 the details of the FWM spectra change significantly when comparing Figs. 8(a) and (b) and become quite involved. However, in all cases the intensity decays rapidly due to the dephasing as discussed for the χ (3) -regime.
While the spectra are all broadened and quite involved for small delays τ 23 ≈ 1 ps they become very clean for long delays τ 23 ≈ 5 ps. The reason for this behavior is the multitude of different possible phase combinations for short delays and the vanishing influence of the polarizations for long delays as explained before. We find pulse areas that result in a single or in two separate lines. This effect is further highlighted in Fig. 10(a), where the FWM spectra for large delays τ 23 1/β are shown as a function of the pulse area θ. For small pulse areas in the χ (3) -regime, the spectrum appears as a single line at ω = ω 0 − V . Then the spectrum broadens for an increasing pulse area. At θ 1 ≈ π/4, the spectrum splits into two parts with a pronounced minimum at ω = ω 0 in agreement with Fig. 9(b). For θ 1 ≈ π/2, the two lines merge again into a single peak at ω = ω 0 . Increasing the pulse area further, the single line splits into two lines again.
For the explanation of this behavior we consider the Bloch vector picture again. Figure 10(b) shows the schematic construction of the smallest and largest possible occupation after the three pulses and a full dephasing after pulse two. The first Rabi rotations (blue and orange) can act destructively, i.e., the respective rotations of the Bloch vector compensate each other. The state of the TLS after the second pulse then points to the south pole. The third pulse rotates from there and reaches the final occupation which is the smallest possible. Note, that strictly speaking this phase combination leads to a vanishing FWM signal because both, polarization and occupation are zero after the second pulse. But at this point we are just interested in the hypothetically reached shift of the transition energy. In the other extreme case the two first pulses act constructively and the pulse areas essentially add up. Before the third pulse the state dephases entirely, The effective transition energies corresponding to these two limiting cases are depicted in Fig. 10(a) as dashed black lines. We find that they again very accurately follow the boundaries of the calculated spectra. The slight deviation again stems from the pulse area renormalization due to the local field coupling (see Appendix A). Also the reduction of the signal at ω = ω 0 (seen most clearly for pulse areas around θ 1 = 0.4π) can be explained in this picture. We consider all possible combinations of φ 1 and φ 2 that make the Bloch vector point to the equator after the second pulse. The following dephasing ends in the center of the sphere, which is the balanced mixture of ground and excited state. In this point the system is called transparent because the third pulse cannot create a polarization from there. Consequently also no FWM signal can be generated.
For the special case θ 1 = 0.5π, where the spectral peaks in Fig. 10(a) merge at ω = ω 0 , we remember that after the full dephasing for long delays τ 23 all possible Bloch vectors lie on the z-axis of the Bloch sphere. From there the Bloch vector is rotated by the third pulse into the equatorial plane no matter where on the z-axis it was. Therefore the only possible occupation is n + 3 = 1/2 which results in a renormalized energy of ω loc = 0 [Eq. (3)], i.e., a single line in the spectrum at ω = ω 0 . Note, that in this scenario the equator is reached after the third and not the second pulse, so the argument is not in conflict with the previous discussion of the spectral minimum at ω = ω 0 .

Conclusions
Motivated by recent nonlinear FWM measurements on TMDC monolayers we revived the Bloch equation model extended by a local field effect. This effect takes exciton-exciton interactions into account on a mean field level. After introducing analytic solutions for the optical driving and the field-free propagation between two laser pulses in the limit of ultrashort excitations we calculated the FWM signal after a two-pulse excitation in this limit. This result was later used to explain the numerically simulated spectral dynamics for excitations with laser pulses of non-vanishing duration. We found that the local field effect leads to oscillations in the FWM signal dynamics which translate into a spectral broadening and even a line splitting for short pulse delays. We have explained these aspects with the pulses' action on the TLS's occupation, visualized by means of the Bloch vector, which determines the FWM signal via the local field. In the case of three-pulse FWM signals, the occupation dynamics plays the central role and we found that line splittings even persist for long delays between pulses two and three. Utilizing the illustrative Bloch vector again, we showed that the variations of the FWM spectral dynamics strongly depend on the applied pulse areas.
Overall, this work develops a basic understanding of nonlinear optical signals from systems where exciton-exciton interaction can be modeled by a local field approximation. Motivated by the first experiments on TMDCs indicating local field effects, this work proposes how the fundamental physics can be explored by investigating the spectral dynamics in different FWM scenarios. Especially promising is the utilization of the local field model for situations where funneling effects lead to increased exciton densities and therefore to pronounced exciton-exciton interaction.

Appendix B. FWM signals for ultrashort pulses
In the main text we discuss FWM signals from non-vanishing pulse durations and interpreted the signals with the help of calculations in the δ-pulse limit. To get a clearer view on the impact of the pulse duration, the FWM-dynamics and spectra from Fig. 5 are calculated for δ-pulses via Eqs. (15) and (18). Regarding the case of θ 1 = 0.2π, whose FWM dynamics and spectra are depicted in Figs. B1(a) and (b), respectively, the analytical solution matches the numerically simulated signal almost perfectly for both positive and negative delays.
For the larger pulse area θ 1 = 0.4π the dynamics shown in Figs. B1(c) and (d)  exhibit more prominent differences with respect to Figs. 5(c) and (d). On the one hand, the oscillating signal follows the derived logarithmic behavior also for small delays because the influence of the pulse overlap vanishes for δ-pulses. On the other hand, the transition from positive to negative delays is very abrupt. Nevertheless, the qualitative aspects of the spectra, i.e., width, line splitting, and narrowing are well-reproduced. This demonstrates that overall an interpretation of the spectra via the Bloch vector retrieved in the δ-pulse limit is well justified.
For all the results presented up to now we have assumed a real-valued local field, i.e., a real value of V . As discussed in the main text, an imaginary part can be added to model the phenomenon of excitation-induced dephasing (EID). This gives rise to an additional exponential decay of the polarization depending on the pulse areas of the exciting pulses, as discussed in connection with Eq. (12). In Fig. B2 we show the same two-pulse FWM signals as in Fig. B1, but now with EID described by an imaginary part of Im(V ) = −0.5 ps −1 . To obtain the same linear dephasing, we set β + Im(V ) = 0.5 ps −1 . In the case of weak excitation θ 1 = 0.2π in Figs. B2 (a) and (b), we observe a slightly enhanced decay both as a function of the real time t and the delay time τ 12 and correspondingly a slightly broader spectrum. However, the differences are rather small. This changes strongly when increasing the pulse area to θ 1 = 0.4π in Figs. B2 (c) and (d). Now the decay is indeed much faster and there is a considerable additional broadening of the spectrum. The enhancement of the decay is particularly strong in the case of negative delay times, which can be understood from the fact the now the stronger pulse with θ 2 = 0.8π arrives first and leads to a higher occupation between the pulses than in the case of positive delays.
The results depicted in Fig. B2 also show that despite the faster decay of the signals and the corresponding broadening of the spectra, the overall shape of the signals and spectra remains remarkably unaffected. In particular, the minima in the signals as a function of real time and delay time (panels (a) and (c) in Figs. B1 and B2) are essentially unchanged, except for a reduction in the contrast due to the faster decay. This confirms our motivation to neglect EID in the main text, where we concentrated on the characteristic temporal and spectral features introduced by a real local field.
Let us briefly comment on the order of magnitude of the parameters chosen in our studies. Our value of V = 6 ps −1 leads to shifts in the spectra of the order of a few millielectronvolts. In Ref. [29], based on a microscopic theory, local fields of this order have been obtained for exciton densities of the order of 10 12 cm −2 . According to Ref. [43] the threshold for lasing in TMDC materials, which is related to the presence of an inversion, is expected for carrier densities of the order of a few times 10 13 cm −2 , indicating that densities of several 10 12 cm −2 are indeed already close to inversion. The EID parameter has been calculated to be in the range of about 1 meV for densities of the order of 10 12 cm −2 [33]. Therefore we conclude that our parameters have a realistic order of magnitude for TMDC materials.