SUPER excitation of quantum emitters is a multi-photon process

The swing-up of quantum emitter population (SUPER) scheme makes it possible to populate the excited state of a quantum emitter with near-unity fidelity using two red-detuned laser pulses. Its off-resonant, yet fully coherent nature has attracted significant interest in quantum photonics as a valuable tool for preparing single-photon sources in their excited state on demand, while simultaneously ensuring straightforward spectral filtering of the laser. However, a physical understanding of this mechanism in terms of energy exchange between the electromagnetic field and the emitter is still lacking. Here, we present a fully quantized model of the swing-up excitation and demonstrate that it is in fact a multi-photon process, in which one of the modes loses two or more photons while the other gains at least one. Our findings provide an unexpected physical interpretation of the SUPER scheme and unveil a new non-linear interaction between single emitters and multiple field modes.


Introduction
Efficient sources of indistinguishable single photons are essential components of optical quantum computers and simulators [1,2]. Whereas spontaneous parametric downconversion allows for straightforward generation of highly indistinguishable heralded photons, its probabilistic nature leads to low single-photon purity and collection efficiency, which can only be improved with complex multiplexing schemes [3,4]. On the other hand, quantum emitters based on semiconductor quantum dots now yield highly indistinguishable single photons with high purity and a collection efficiency of the order of 0.6 [2,5,6]. Activating such sources on demand calls for a fast and coherent optical excitation protocol that prepares the emitter in the desired excited state.
In the so-called swing-up of quantum emitter population (SUPER) scheme [7], two red-detuned laser pulses interacting simultaneously with the emitter cause fast oscillations of the excited state population. For a proper choice of the laser parameters (frequency, amplitude, and duration of each pulse), the emitter is finally left in the excited state with fidelity close to 100%. The technological advantage of this technique, as compared for example with resonance fluorescence, is that laser rejection is straightforwardly achieved via spectral filtering due to the red-detuned nature of the pump [8,9]. Moreover, it populates the excited state directly with a fast and coherent process, as opposed for example to pumping into higher energy levels (which requires incoherent internal relaxation to create the exciton). This attractive scenario has stimulated intense theoretical and experimental research, including the prospect of applying this technique to cavity-coupled quantum dots [10,11], the analysis of the important role played by phonon scattering [12-14], and the possibility of achieving a similar effect with hybrid acousto-optical methods [15]. While the SUPER scheme was originally devised for semiconductor quantum dots, it relies in fact on general properties of quantum emitters coupled to the electromagnetic field and has now been investigated in different material platforms, such as tin vacancies in diamond [16] and emitters in the layered transition metal dichalcogenide WSe$_2$ [13].
In spite of this active interest, the physical origin of this effect is not yet fully understood. Bracht et al. [17] have explained the swing-up effect with a dressed state picture of the emitter-field system. While one of the laser pulses dresses the bare emitter states and effectively modifies their energy splitting, the other one drives transitions between the dressed states. However, a fundamental interpretation in terms of energy exchange between the emitter and the electromagnetic field is still missing. For example, whereas resonance fluorescence is easily interpreted in terms of absorption of one single photon from the laser pump, such a simple picture is currently unavailable for the SUPER scheme. This is further complicated by the off-resonant nature of this protocol and by the presence of two simultaneous laser fields, which makes the physical interpretation less intuitive.
arXiv:2406.17540v2 [quant-ph] 18 Sep 2024
In this work, we show, surprisingly, that the SUPER scheme excitation involves multiple photon exchanges between the emitter and the laser. We achieve this by developing a quantum-mechanical model for an emitter coupled to two quantized field modes, which allows us to count the number of photons in each mode before and after interaction with the emitter. Specifically, we find that two photons are typically absorbed from one of the modes during the swing-up process. Of these two excitations, one is stored in the emitter by raising its population from the ground to the excited state, while the remaining one is absorbed into the second field mode. Interestingly, we find that population inversion may still occur even when one of the field modes is in the vacuum state, provided that the other one is populated with at least two photons.
This article is organized as follows. First, we describe the model and the methodology in Section 2. Adopting a Fock state assumption for the electromagnetic field, we analyze the dynamics of the SUPER and the red-and-blue dichromatic scheme [12,18] in the large photon number regime in Sections 3 and 4, respectively. We then focus on the few-photon regime in Section 5, where we consider the case where one of the field modes is in the vacuum state. In Section 6, we describe the field in terms of coherent states, and compare the results with the ones obtained under a Fock state assumption. Then, we touch upon the effect of dissipation and decoherence in Section 7. We finally discuss our findings in Section 8, before drawing our conclusions in Section 9.

Model
We model the quantum emitter as a two-level system with ground state $|G\rangle$ and excited state $|X\rangle$. Its Hamiltonian is $H_X = \hbar\omega_X \sigma^\dagger\sigma$, with $\sigma^\dagger = |X\rangle\langle G|$ the raising operator. The emitter is coupled to two quantized electromagnetic field modes with Hamiltonian $H_j = \hbar\omega_j a_j^\dagger a_j$, $j \in \{1, 2\}$, where $a_j$ is the annihilation operator for mode $j$. Using the rotating wave approximation, the emitter-field coupling is modeled as a two-mode Jaynes-Cummings Hamiltonian,

$$\hbar \sum_{j=1,2} g_j\, e^{-t^2/t_p^2} \left( a_j^\dagger \sigma + a_j \sigma^\dagger \right). \qquad (1)$$

The coupling constants $g_j$ in Eq. (1) are modulated by a time-dependent Gaussian envelope $\exp(-t^2/t_p^2)$ to simulate pulsed excitation. The Gaussian modulation ensures a smooth switch-on and switch-off of the interaction and removes additional sidebands from the electromagnetic field spectrum, which would add unwanted features to the system response and hinder the visibility of the SUPER mechanism (see Supplement 1 for details).
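We thus calculate the dynamics numerically by solving the Schrödinger equation $i\hbar\,\partial_t |\psi(t)\rangle = H_{\mathrm{int}}(t)\,|\psi(t)\rangle$ in the interaction picture with respect to $H_X + H_1 + H_2$, with

$$H_{\mathrm{int}}(t) = \hbar \sum_{j=1,2} g_j\, e^{-t^2/t_p^2} \left( a_j^\dagger \sigma\, e^{i\delta_j t} + a_j \sigma^\dagger e^{-i\delta_j t} \right), \qquad (2)$$

where $g_j$ are the coupling constants, $t_p$ sets the width of the Gaussian envelope, and $\delta_j = \omega_j - \omega_X$ is the detuning of each mode with respect to the emitter frequency. Specifically, the state is advanced in discrete time steps $\mathrm{d}t$, and we note that the norm of the propagated state can exceed 1 after a step even for $\langle\psi(t)|\psi(t)\rangle = 1$. Therefore, we take care of normalizing the quantum state to 1 after each step in the calculation. This is not strictly necessary to achieve convergence, as the lack of normalization may be compensated with a finer time step $\mathrm{d}t$. However, it reduces the computation time by up to a factor of 10 with respect to the case without state normalization.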
For the numerical implementation, the state of each field mode $j$ is expanded on a truncated number state basis $|n_j\rangle$, with $n_j \in (n_j^{\min}, \ldots, n_j^{\max})$, and operators acting on mode $j$ are represented as $N_j \times N_j$ matrices, with $N_j = n_j^{\max} - n_j^{\min} + 1$. We make sure that results are converged with respect to the choice of $n_j^{\max}$ and $n_j^{\min}$. This can be checked, for instance, by ensuring that the expectation value of the excitation number $N = \sigma^\dagger\sigma + a_1^\dagger a_1 + a_2^\dagger a_2$ is conserved throughout the simulation, as required by the fact that $[H_{\mathrm{int}}(t), N] = 0$.
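The procedure described in this section can be illustrated with a minimal, dependency-free sketch. It is not the code used for the results of this paper. It assumes the interaction-picture Hamiltonian $\hbar\sum_j g_j e^{-t^2/t_p^2}(a_j^\dagger\sigma\,e^{i\delta_j t} + \mathrm{h.c.})$, sets $\hbar = 1$, measures time in units of $t_p$ (so that `g` and `delta` hold the dimensionless products $g_j t_p$ and $\delta_j t_p$), and uses a fourth-order Runge-Kutta step followed by renormalization as a stand-in for the unspecified time-stepping scheme. The function names (`apply_h`, `combine`, `evolve`, `observables`) and the sparse dictionary representation are illustrative choices.

```python
import cmath
import math


def apply_h(state, t, g, delta):
    """Return H_int(t)|state> (hbar = 1, time in units of t_p).

    `state` maps (e, n1, n2) -> amplitude, with e = 0 (ground) or 1 (excited).
    """
    envelope = math.exp(-t * t)  # Gaussian modulation exp(-t^2 / t_p^2)
    out = {}
    for (e, n1, n2), amp in state.items():
        for j, n in enumerate((n1, n2)):
            if e == 0 and n > 0:
                # a_j sigma^dagger: absorb one photon from mode j
                new_n, rate = n - 1, math.sqrt(n)
                phase = cmath.exp(-1j * delta[j] * t)
            elif e == 1:
                # a_j^dagger sigma: emit one photon into mode j
                new_n, rate = n + 1, math.sqrt(n + 1)
                phase = cmath.exp(1j * delta[j] * t)
            else:
                continue
            key = (1 - e, new_n, n2) if j == 0 else (1 - e, n1, new_n)
            out[key] = out.get(key, 0j) + g[j] * envelope * rate * phase * amp
    return out


def combine(x, y, a):
    """Return x + a * y for two sparse state vectors."""
    out = dict(x)
    for key, val in y.items():
        out[key] = out.get(key, 0j) + a * val
    return out


def evolve(state, g, delta, dt=1e-3, t_i=-3.0, t_f=3.0):
    """Integrate i d|psi>/dt = H_int(t)|psi> with RK4, renormalizing after each step."""
    def rhs(psi, t):
        return {k: -1j * v for k, v in apply_h(psi, t, g, delta).items()}

    for step in range(round((t_f - t_i) / dt)):
        t = t_i + step * dt
        k1 = rhs(state, t)
        k2 = rhs(combine(state, k1, dt / 2), t + dt / 2)
        k3 = rhs(combine(state, k2, dt / 2), t + dt / 2)
        k4 = rhs(combine(state, k3, dt), t + dt)
        for k, w in ((k1, 1), (k2, 2), (k3, 2), (k4, 1)):
            state = combine(state, k, w * dt / 6)
        norm = math.sqrt(sum(abs(v) ** 2 for v in state.values()))
        state = {key: v / norm for key, v in state.items()}
    return state


def observables(state):
    """Return (<sigma^dagger sigma>, <n_1>, <n_2>)."""
    p_x = sum(abs(v) ** 2 for (e, _, _), v in state.items() if e == 1)
    n1 = sum(n * abs(v) ** 2 for (_, n, _), v in state.items())
    n2 = sum(n * abs(v) ** 2 for (_, _, n), v in state.items())
    return p_x, n1, n2


if __name__ == "__main__":
    # Illustrative run: emitter in |G>, two photons in mode 1, mode 2 in vacuum.
    final = evolve({(0, 2, 0): 1.0 + 0j}, g=(5.0, 5.0), delta=(-4.06, -15.96))
    p_x, n1, n2 = observables(final)
    print(p_x, n1 - 2, n2)
```

Because the Hamiltonian only connects states with the same excitation number, the dictionary never grows beyond the sector reachable from the initial state, and the sum of the emitter population and the two photon-number variations stays at zero by construction. The sketch has not been benchmarked against the figures of this paper.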

Dynamics of the SUPER scheme
We begin by considering the system dynamics when the field modes are both red-detuned with respect to the emitter and initialized in large-number Fock states $|n_j^{\mathrm{i}}\rangle$. This is motivated by the fact that optical excitation of a quantum emitter involves laser pulses with a large number of photons. Although Fock states are generally inaccurate to describe the quantum field of a laser, their use allows for a substantial numerical simplification of the problem in terms of the number $N_j$ of states needed. The case of coherent states, which represent a more accurate description of the laser field, will be examined later in Section 6, where we find good qualitative agreement between the two pictures. We note that the dynamics is governed by the dimensionless detunings $\Delta_j = \delta_j t_p$ and the dimensionless couplings $g_j t_p$, and we will discuss the results in terms of these parameters.
In Fig. 1, we explore the population $\langle\sigma^\dagger\sigma\rangle$ of the $|X\rangle$ state and of both field modes at time $t_{\mathrm{f}} = 3t_p$, i.e. after interaction has occurred. Following the literature [8,9,12,13], we fix the parameters for the first mode (dimensionless detuning $\Delta_1 = \delta_1 t_p$ and initial photon number $n_1^{\mathrm{i}}$), and scan the corresponding parameters for the second mode. Note that the terms "first" and "second" identify the mode with smaller and larger detuning in absolute value, respectively. In the SUPER scheme literature based on semiclassical models of light-matter interaction, results are typically shown as a function of the electric field amplitude [7,12], which is proportional to the square root of the photon number. Therefore, we choose to plot the results on an axis proportional to $\sqrt{n_2^{\mathrm{i}}}$. The dimensionless coupling is fixed here to $g_1 t_p = g_2 t_p = 0.1$. The case of stronger coupling $g_1 t_p = g_2 t_p = 5$ will be discussed later in Section 5.
In Fig. 1a, which is taken at fixed $\sqrt{n_1^{\mathrm{i}}} = 62.83$ ($n_1^{\mathrm{i}} = 3947$), we identify a region of near-unity exciton preparation fidelity, with a maximum of $\langle\sigma^\dagger\sigma\rangle = 0.99$. This is accompanied by additional resonances at smaller detuning with incomplete population inversion, of the order of 0.4 at maximum. In contrast, by increasing the initial photon number in mode 1 to $\sqrt{n_1^{\mathrm{i}}} = 157.08$, we observe multiple resonances with $\langle\sigma^\dagger\sigma\rangle \geq 0.99$ (see Fig. 1b). Similar features have been observed in the literature using a classical description of the field and represent the signature of the SUPER scheme. We conclude that our model of a two-level emitter interacting with two quantized field modes captures the physics of the SUPER swing-up effect satisfactorily.
Unlike the semiclassical description, a quantized model gives access to the exact number of photons in each mode. We thus calculate the difference $\langle\Delta n_j\rangle = \langle a_j^\dagger a_j\rangle(t_{\mathrm{f}}) - \langle a_j^\dagger a_j\rangle(t_{\mathrm{i}})$ of the photon number in each mode before and after interaction, which is plotted in Fig. 1c-f. In correspondence with each resonance of $\langle\sigma^\dagger\sigma\rangle$, we observe variations of the photon number in both modes. Remarkably, mode 1 always shows a decrease in photon number, whereas mode 2 shows an unexpected increase. Exploring the dynamics in correspondence of $\langle\sigma^\dagger\sigma\rangle \approx 1$ (see Fig. 1g-j), we find that exactly two photons are subtracted from mode 1, while mode 2 gains one photon in the process. This demonstrates, surprisingly, that the SUPER scheme is a multi-photon process involving a redistribution of photons between the field modes. We also identify a higher-order process leading to large but incomplete population inversion $\langle\sigma^\dagger\sigma\rangle > 0.85$ (see Fig. 1k), where almost three photons ($\langle\Delta n_1\rangle = -2.76$) are subtracted from mode 1, while mode 2 gains almost two photons ($\langle\Delta n_2\rangle = 1.90$). We hypothesize that a process leading to $\langle\Delta n_1\rangle = -3$ and $\langle\Delta n_2\rangle = 2$ exactly might be found with a thorough investigation of the parameter space, which is not performed here. Despite these variations in the photon number, we observe that the total excitation number $N$ is indeed conserved in the process, i.e. $\langle\sigma^\dagger\sigma\rangle + \langle\Delta n_1\rangle + \langle\Delta n_2\rangle = 0$. The photon number distribution (plotted in Fig. 2 for two exemplary cases) shows that the final state of modes 1 and 2 is to a very good approximation a Fock state with a well defined photon number.
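As a short worked check of the excitation-number bookkeeping, the values quoted above can be inserted into the conservation condition. The symbol $\Delta N$ for the change of the total excitation number is introduced here only for convenience.

```latex
\begin{align}
\Delta N &= \langle\sigma^\dagger\sigma\rangle + \langle\Delta n_1\rangle + \langle\Delta n_2\rangle = 0, \\
\text{fundamental resonance:}\quad & 1 + (-2) + 1 = 0, \\
\text{higher-order resonance:}\quad & \langle\sigma^\dagger\sigma\rangle = -\left(-2.76 + 1.90\right) = 0.86 .
\end{align}
```

The second case gives an emitter population of 0.86, consistent with the incomplete inversion above 0.85 reported for Fig. 1k.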
In contrast with the semiclassical description of the SUPER scheme dynamics, which shows fast, large-amplitude oscillations in time of the $|X\rangle$ state population within one pulse (of the order of 10-30 oscillations), the leftmost resonance in Fig. 1a-b is characterized by a smooth and monotonic dynamics. Higher order resonances at smaller detuning and larger photon number display a few oscillations in the emitter population and photon number of the field modes (of the order of 2-3).

Dynamics of the red-and-blue dichromatic scheme
In the red-and-blue dichromatic scheme, two laser pulses are symmetrically detuned to the red and blue side of the spectrum with respect to the emitter. It has been demonstrated that specific configurations involving pulses of different amplitude generate complete population inversion, whereas symmetric configurations with identical pulse amplitude lead necessarily to $\langle\sigma^\dagger\sigma\rangle = 0$ [18]. Exciton preparation with the red-and-blue scheme occurs through a similar swing-up effect of the population [12,18].
To model this scenario, we constrain the detuning to $\Delta_1 = -\Delta_2 = 6$ and explore the dependence of the final exciton population on the initial photon number in each mode. Once again, we initialize the field modes in Fock states $|n_j^{\mathrm{i}}\rangle$. As shown in Fig. 3a, we obtain several resonances featuring $\langle\sigma^\dagger\sigma\rangle \approx 1$, in quantitative agreement with results based on a semiclassical model [12]. Indeed, no resonance is observed along the diagonal $n_1^{\mathrm{i}} = n_2^{\mathrm{i}}$, in agreement with the requirement of different pulse amplitudes.
In contrast with the SUPER scheme, we observe that this process involves photon exchange with a single mode, see Figs. 3b and 3c. For example, at $(\sqrt{n_1^{\mathrm{i}}}, \sqrt{n_2^{\mathrm{i}}}) = (16.00, 61.25)$ and $(29.21, 89.84)$, exactly one photon is subtracted from mode 1 (the one at positive detuning) to populate the excited state, while no photon exchange occurs with mode 2 (negative detuning). The opposite holds true by exchanging $n_1^{\mathrm{i}}$ with $n_2^{\mathrm{i}}$. There, one photon is subtracted from mode 2 with no gain or loss in mode 1. We notice, however, that in both cases the mode with $\langle\Delta n_j\rangle = 0$ participates actively in the dynamics, as demonstrated by oscillations of the average photon number in time reported in Fig. 3d-g. Similarly to the SUPER scheme, we notice that the fundamental resonance (namely, the one occurring at smaller photon number) shows a monotonic increase in $\langle\sigma^\dagger\sigma\rangle$ with no oscillations (Fig. 3d-g). Higher order resonances display a few small oscillations.

Strong emitter-field coupling: "vacuum" swing-up
So far, the emitter dynamics under two-color excitation with a large number of photons is consistent with a semiclassical approach where the electromagnetic field is treated as a classical object. We now explore a truly quantum regime where the field modes are populated by a few photons only.
The strength of the emitter-field coupling is governed by the coupling constants $g_j$ and the electric field amplitude, which scales as $\sqrt{n_j^{\mathrm{i}}}$, as one can readily verify by calculating the matrix elements of $H_{\mathrm{int}}$. Therefore, to maintain the same effective light-matter coupling while scaling down the field occupation to the few-photon regime, we simultaneously increase the value of the dimensionless coupling constants $g_j t_p$. In experiments, this can be done by increasing the emitter-field coupling $g_j$ via optical engineering of the electromagnetic environment, or by increasing the pulse duration $t_p$.
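The scaling argument can be made concrete with a few lines of arithmetic. The sketch below assumes that the relevant figure of merit is the product of the dimensionless coupling $g_j t_p$ and $\sqrt{n}$, i.e. the Jaynes-Cummings matrix element between $|G, n\rangle$ and $|X, n-1\rangle$ in units of $1/t_p$. The helper name `effective_coupling` is illustrative, and the inputs are the mode-1 values quoted in the text for Figs. 1a, 4a and 4b.

```python
import math


def effective_coupling(g_tp: float, n_photons: int) -> float:
    """Dimensionless coupling g_j * t_p multiplied by sqrt(n)."""
    return g_tp * math.sqrt(n_photons)


# (label, g_j * t_p, initial photon number in mode 1) as quoted in the text
cases = [("Fig. 1a", 0.1, 3947), ("Fig. 4a", 5.0, 5), ("Fig. 4b", 5.0, 2)]
for label, g_tp, n in cases:
    print(f"{label}: g*t_p*sqrt(n) = {effective_coupling(g_tp, n):.2f}")
# prints 6.28, 11.18 and 7.07 respectively
```

The three products are of the same order of magnitude, which illustrates how a fifty-fold increase of $g_j t_p$ compensates for the reduction from thousands of photons to a handful.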
Starting with the SUPER scheme, we explore this scenario in Fig. 4 by increasing the coupling to $g_1 t_p = g_2 t_p = 5$ and scaling the initial number of photons in mode 1 down to $n_1^{\mathrm{i}} = 5$ and $n_1^{\mathrm{i}} = 2$ (panels a and b, respectively). Interestingly, we observe that the typical resonances of the SUPER scheme are still clearly visible, and only a few photons are needed to achieve full population inversion. Quite surprisingly, exciton preparation is still possible even when the second field mode is prepared in the vacuum state, $n_2^{\mathrm{i}} = 0$. We observe $\langle\sigma^\dagger\sigma\rangle = 0.97$ for $(n_1^{\mathrm{i}}, n_2^{\mathrm{i}}) = (5, 0)$ and $\langle\sigma^\dagger\sigma\rangle = 0.88$ for $(n_1^{\mathrm{i}}, n_2^{\mathrm{i}}) = (2, 0)$, where the incomplete population inversion $\langle\sigma^\dagger\sigma\rangle < 1$ is explained by the fact that we have chosen an arbitrary value of $\Delta_1 = -6$. By fixing the initial number of photons in each mode and continuously varying both detunings $\Delta_j$, we indeed obtain full population inversion with $\langle\sigma^\dagger\sigma\rangle = 1.00$ at $(n_1^{\mathrm{i}}, n_2^{\mathrm{i}}, \Delta_1, \Delta_2) = (5, 0, -7.56, -28.56)$, and $\langle\sigma^\dagger\sigma\rangle = 0.99$ at $(n_1^{\mathrm{i}}, n_2^{\mathrm{i}}, \Delta_1, \Delta_2) = (2, 0, -4.06, -15.96)$, see Figs. 4c and 4d. The corresponding time-dependent dynamics is plotted in Figs. 4e and 4f, demonstrating that full population inversion with exchange of multiple photons is indeed possible even when one of the field modes is in the vacuum state. For the case of Fig. 4f, where the field is initialized in $(n_1^{\mathrm{i}}, n_2^{\mathrm{i}}) = (2, 0)$, we notice that mode 1 is left in the vacuum state, while mode 2 is in the single-photon Fock state $|n_2^{\mathrm{f}}\rangle = |1\rangle$ after interaction. It should be noted that the Gaussian modulation of the interaction plays a minor role in the case of a vacuum state, i.e. we can still explain the vacuum swing-up qualitatively if we remove the $\exp(-t^2/t_p^2)$ factor from $g_2$ (see Supplement 1 for details). We observe no population inversion when $(n_1^{\mathrm{i}}, n_2^{\mathrm{i}}) = (1, 0)$, for any value of the detunings. This demonstrates that the minimum excitation number to activate the SUPER mechanism is $N = 2$, consistent with the fact that the SUPER scheme cannot be explained with the exchange of one single photon between the emitter and the field.
Applying similar considerations, we find signatures of the red-and-blue dichromatic scheme in the few-photon regime as well. Despite the fact that dichromatic excitation in the large photon number regime is explained with the subtraction of a single photon from one of the modes, we do not observe population inversion for initial state $(n_1^{\mathrm{i}}, n_2^{\mathrm{i}}) = (1, 0)$ or $(0, 1)$, indicating that the requirement on the excitation number is still $N \geq 2$. On the other hand, we find that preparing the field modes in two single-photon states, $(n_1^{\mathrm{i}}, n_2^{\mathrm{i}}) = (1, 1)$, at detuning $\Delta_1 = -\Delta_2 = 3.12$ leads to population of the exciton state with fidelity $\langle\sigma^\dagger\sigma\rangle = 0.96$, see Fig. 4g. Here, the system is found in a state proportional to $|X, 1, 0\rangle + |X, 0, 1\rangle$ after interaction, meaning that half a photon has been subtracted from both modes on average. Finally, it is worth noting that the multi-photon resonances of the SUPER scheme work in the reverse fashion as well. In other words, it is possible to completely depopulate an initially excited emitter by subtracting one excitation from mode 1, while increasing the population of the other field mode by two excitations. This may be used to engineer non-linear interaction between field modes at different frequencies. For example, as shown in Fig. 4h, a single-photon state at detuning $\Delta_1 = -15.68$ is converted into a two-photon state at $\Delta_2 = -3.78$, accompanied by relaxation of the emitter to the ground state.

Coherent states
So far we have considered Fock states of the field modes for simplicity. However, the quantum state of a laser pulse is best described by a coherent state $|\alpha\rangle = e^{-|\alpha|^2/2} \sum_n \frac{\alpha^n}{\sqrt{n!}} |n\rangle$. In Fig. 5, we compare the dynamics for a case where the field is initialized in the multi-mode Fock state $|n_1^{\mathrm{i}}, n_2^{\mathrm{i}}\rangle$ with the case where the initial field state is $|\alpha_1^{\mathrm{i}}, \alpha_2^{\mathrm{i}}\rangle$, with $|\alpha_j^{\mathrm{i}}\rangle$ representing coherent states containing the same number of photons on average. For sufficiently high excitation number (see Fig. 5a-c), we observe that the two situations are qualitatively similar. Indeed, the description based on coherent states confirms that the SUPER scheme excitation takes place by removing approximately two photons from mode 1 while simultaneously increasing the population of mode 2 by one photon. However, whereas the Fock state description yields a monotonic dynamics, the coherent state model reproduces the typical swing-up pattern that characterizes the SUPER scheme, resulting in fast oscillations of the exciton population. Despite this remarkable qualitative difference, the attained value of $\langle\sigma^\dagger\sigma\rangle$ using coherent states is quantitatively similar to the one obtained using Fock states. An inspection of the photon number distribution, see Fig. 5g-j, confirms the agreement between the two pictures. In both cases, the distribution of mode 1 (2) is rigidly shifted towards smaller (larger) values by the same amount. This also shows that both Fock and coherent states maintain their characteristics in terms of number distribution.
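To make the difference between the two initial states tangible, the following sketch tabulates the photon-number distribution of a coherent state, $p_n = e^{-\bar n}\,\bar n^{\,n}/n!$ with $\bar n = |\alpha|^2$, next to that of a Fock state with the same mean. The function names and the truncation `n_max` are illustrative, and the mean photon number of 5 is just an example value.

```python
import math


def coherent_distribution(mean_n: float, n_max: int) -> list[float]:
    """Poisson weights p_n = exp(-mean_n) * mean_n**n / n! for n = 0..n_max (mean_n > 0)."""
    # Work with logarithms so that large n does not overflow the factorial.
    return [math.exp(-mean_n + n * math.log(mean_n) - math.lgamma(n + 1))
            for n in range(n_max + 1)]


def fock_distribution(n0: int, n_max: int) -> list[float]:
    """A Fock state |n0> has all its weight on a single photon number."""
    return [1.0 if n == n0 else 0.0 for n in range(n_max + 1)]


p = coherent_distribution(5.0, 40)
mean = sum(n * w for n, w in enumerate(p))
variance = sum((n - mean) ** 2 * w for n, w in enumerate(p))
print(f"coherent: mean = {mean:.3f}, variance = {variance:.3f}")
print(f"weight on n < 2: {p[0] + p[1]:.4f}")
```

A coherent state with five photons on average has a variance equal to its mean and a few percent of its weight on components with fewer than two photons, whereas the Fock state has zero variance. This spread is also why a coherent-state simulation needs a larger truncated basis than a Fock-state one.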
However, the agreement breaks down in the deep quantum regime of low excitation number. As shown in Figs. 5d-f, a model using coherent states predicts $\langle\sigma^\dagger\sigma\rangle = 0.32$ for the initial field state considered there and $(\Delta_1, \Delta_2) = (-6, -19.81)$, while the value predicted with a Fock state assumption is $\langle\sigma^\dagger\sigma\rangle = 0.96$. We conclude that SUPER scheme excitation in the low photon number regime requires few-photon Fock states rather than coherent states of small amplitude. Whereas the latter can be realized, for example, with attenuated laser pulses, the former are fundamentally different and more challenging to realize [22-24].

Dissipation and decoherence
Before concluding, we briefly examine the effect of dissipation and decoherence on the multi-photon scattering process. We limit the discussion to (i) spontaneous decay and (ii) pure dephasing of the two-level emitter. A more detailed modeling of noise and decoherence, including for example phonon scattering beyond the pure dephasing approximation, lies outside the scope of this work.
In both cases, the dynamics becomes non-unitary and an open quantum system approach is necessary. We thus solve the master equation for the system density operator $\rho(t)$, in which spontaneous decay at rate $\Gamma$ and pure dephasing at rate $\gamma$ are included. Here, the density operator is represented as a $2N_1N_2 \times 2N_1N_2$ matrix on the same truncated number state basis $|n_j\rangle$ as before.
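For orientation, a master equation of this type is commonly written in Lindblad form. The expression below is a sketch under stated assumptions, not necessarily the exact equation solved here: the symbol $\mathcal{L}$, the choice of $\sigma$ and $\sigma^\dagger\sigma$ as jump operators, and the prefactors in front of $\Gamma$ and $\gamma$ follow one widespread convention and may differ by factors of 2 from the one adopted in this work.

```latex
\begin{equation}
\frac{\mathrm{d}\rho}{\mathrm{d}t}
  = -\frac{i}{\hbar}\left[ H_{\mathrm{int}}(t), \rho \right]
  + \Gamma\, \mathcal{L}[\sigma]\rho
  + \gamma\, \mathcal{L}[\sigma^\dagger\sigma]\rho,
\qquad
\mathcal{L}[O]\rho = O \rho O^\dagger - \tfrac{1}{2}\left\{ O^\dagger O, \rho \right\}.
\end{equation}
```

With this convention the excited-state population decays at rate $\Gamma$, while the coherence between $|G\rangle$ and $|X\rangle$ decays at rate $(\Gamma + \gamma)/2$.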
In Fig. 6a-c, we consider the configuration previously shown in Fig. 4e (i.e. $n_1^{\mathrm{i}} = 5$ and $n_2^{\mathrm{i}} = 0$, with $g_1 t_p = g_2 t_p = 5$) and we add spontaneous emission with increasing rate $\Gamma$. Here, we fix the pulse duration to $t_p = 1$ ps and calculate the corresponding detunings $\delta_j$ and coupling strengths $g_j$ to obtain the same values of $\Delta_j = \delta_j t_p$ and $g_j t_p$ as in Fig. 4e. We observe that the SUPER resonance is progressively damped with increasing $\Gamma$. Specifically, the dynamics is unaffected when $\Gamma t_p = 0.01$, i.e. when the cavity-emitter interaction occurs on a timescale $t_p$ that is much smaller than the exciton lifetime $1/\Gamma$. On the other hand, no population inversion occurs in the opposite limit of $\Gamma t_p = 10$, where the exciton lifetime is smaller than the interaction time and dissipation dominates. Here, we observe that the cavity population is depleted almost entirely, see the inset of Fig. 6c.
We observe similar features when considering pure dephasing at rate $\gamma$, see Fig. 6d-f. The dynamics is almost unaffected if $\gamma t_p \ll 1$, whereas the multi-photon scattering is hindered for larger values of $\gamma t_p$ (although it is still partially visible at $\gamma t_p = 10$).

Discussion
We have investigated the physics of two-color excitation schemes in terms of energy exchanges between the emitter and the laser pump. To do so, we have developed a model for a two-level emitter interacting with two quantized field modes, where the interaction is modulated in time with Gaussian pulses of duration $t_p$. Both the SUPER and the red-and-blue dichromatic schemes have been considered.
We have observed that exciton population under the SUPER scheme, where both modes are red-detuned with respect to the emitter, is accompanied by variations in the photon number of both modes, unveiling the multi-photon nature of this process. We find clear signatures of events with $(\langle\sigma^\dagger\sigma\rangle, \langle\Delta n_1\rangle, \langle\Delta n_2\rangle) = (+1, -2, +1)$ and $(+1, -3, +2)$, where multiple excitations are subtracted from the first field mode (the one with smaller detuning, in absolute value). While one excitation is used to raise the emitter population to the excited state, the remaining ones are absorbed by the second field mode at larger detuning. Interestingly, this holds true in the regime where the field modes are populated with only a few photons, and even when one of the field modes is in the vacuum state. This last scenario requires a precisely tailored nanophotonic environment that supports the two modes of interest at the required detuning, to avoid accidental interaction with other unoccupied modes. We speculate that events involving the exchange of a higher number of photons, i.e. $(\langle\sigma^\dagger\sigma\rangle, \langle\Delta n_1\rangle, \langle\Delta n_2\rangle) = (1, -k, k - 1)$ with $k > 3$, might be found at larger total power, although we have not observed such a case in this work.
Interestingly, we find that energy conservation is violated by the multi-photon scattering event. Before interaction, considering an initial state $|G, n_1^{\mathrm{i}}, n_2^{\mathrm{i}}\rangle$, the total energy is

$$E_{\mathrm{i}} = \hbar\omega_1 n_1^{\mathrm{i}} + \hbar\omega_2 n_2^{\mathrm{i}}.$$

Following a variation of $\langle\Delta n_j\rangle$ in the average photon numbers, the final energy is

$$E_{\mathrm{f}} = \hbar\omega_X \langle\sigma^\dagger\sigma\rangle + \hbar\omega_1 \left( n_1^{\mathrm{i}} + \langle\Delta n_1\rangle \right) + \hbar\omega_2 \left( n_2^{\mathrm{i}} + \langle\Delta n_2\rangle \right).$$

Using $\omega_j = \omega_X + \delta_j$, it follows that the difference $\Delta E = E_{\mathrm{f}} - E_{\mathrm{i}}$ is

$$\Delta E = \hbar \left( \delta_1 \langle\Delta n_1\rangle + \delta_2 \langle\Delta n_2\rangle \right),$$

independently of $\omega_X$, with $\hbar\omega_X \left( \langle\Delta n_1\rangle + \langle\Delta n_2\rangle + \langle\sigma^\dagger\sigma\rangle \right)$ vanishing because the excitation number $N$ is conserved. For the fundamental resonance $(\langle\sigma^\dagger\sigma\rangle, \langle\Delta n_1\rangle, \langle\Delta n_2\rangle) = (+1, -2, +1)$, we thus have $\Delta E = \hbar(\delta_2 - 2\delta_1)$. In our work, we always find that $|\delta_2| > 2|\delta_1|$ is needed to obtain population inversion close to 1, consistently with the original prediction formulated within a classical approximation for the electromagnetic field [7,17]. It follows necessarily that $\Delta E = \hbar(\delta_2 - 2\delta_1) \neq 0$. This violation might be explained by the time-dependent nature of the Hamiltonian. Indeed, energy conservation requires a Hamiltonian that is translationally invariant in time, in the same way as momentum conservation requires a Hamiltonian that is translationally invariant in space. This finding is in stark contrast with the paradigm of resonant excitation, where the emitter exchanges one energy quantum with a single field mode and energy is overall conserved. We have indeed verified that our model reproduces this scenario by removing the second field mode (i.e. setting $g_2 = 0$) and moving the first mode into resonance with the emitter ($\Delta_1 = 0$). By scanning the initial number of photons $n_1^{\mathrm{i}}$ in mode 1, we have found clear signatures of Rabi oscillations. For configurations resulting in $\langle\sigma^\dagger\sigma\rangle = 1$, we have observed that one single photon is removed from the field. A similar scenario is expected to take place in phonon-assisted excitation [25-27], although we have not included phonon coupling in this work.
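The size of the energy mismatch can be evaluated directly from the detunings and photon-number variations. Since a photon in mode $j$ carries energy $\hbar(\omega_X + \delta_j)$ and the excitation number is conserved, the mismatch reduces to $\Delta E = \hbar \sum_j \delta_j \langle\Delta n_j\rangle$. The short sketch below expresses it in units of $\hbar/t_p$ using the dimensionless detunings $\Delta_j = \delta_j t_p$. The helper name `energy_mismatch` is illustrative, and the example values are those quoted in the text for Fig. 4f.

```python
def energy_mismatch(detunings, photon_changes):
    """Return Delta E * t_p / hbar = sum_j Delta_j * <Delta n_j>."""
    return sum(d * dn for d, dn in zip(detunings, photon_changes))


# Fig. 4f: (Delta_1, Delta_2) = (-4.06, -15.96); two photons leave mode 1,
# one photon enters mode 2.
mismatch = energy_mismatch((-4.06, -15.96), (-2, +1))
print(f"Delta E = {mismatch:.2f} hbar / t_p")
```

For these values the mismatch is about $-7.84\,\hbar/t_p$, i.e. the combined emitter-field system ends up with less energy than it started with, in line with the condition $|\delta_2| > 2|\delta_1|$.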
Reversing the working principle of the SUPER scheme, our model shows that it is possible to stimulate emission from an excited emitter with a simultaneous redistribution of photons among the field modes. For instance, a system initialized in the state $|X, n_1, n_2\rangle$ is found in the state $|G, n_1 - 1, n_2 + 2\rangle$ after interaction with two red-detuned modes. Once again, this is valid in the case $n_2 = 0$ where mode 2 is in the vacuum state. Quite interestingly, this observation challenges the typical paradigm of stimulated emission, which states that the emitted photon must have the same frequency as the photons in the incident wave. These findings may be used to engineer a non-linear interaction where one single photon stimulates the emission of two identical photons at a different frequency.
Considering the red-and-blue dichromatic excitation, where one of the field modes is moved to the blue side of the spectrum, we have observed that it involves energy exchange with only one of the field modes. However, our results demonstrate that the presence of a second field mode is necessary to engineer full population inversion, even when no variation in its photon number is recorded. An exception to this paradigm is the case of dichromatic excitation for an initial state $|G, 1, 1\rangle$, where we find that half a photon is subtracted on average from both modes.
It is worth noting that we have obtained a good qualitative agreement with the description of the SUPER and red-and-blue dichromatic scheme under a semiclassical model for light-matter interaction, where the field is not quantized (see also Supplement 1). However, the advantage of the quantum model presented here is that it makes it possible to calculate the variation in photon number for each mode. We have furthermore observed that a description of the laser field in terms of coherent states is needed to reproduce the typical swing-up behavior of the excited state population under SUPER scheme excitation, which is characterized by fast oscillations. However, we have surprisingly found that a description in terms of Fock states yields qualitatively similar results for the final population inversion, and demands fewer computational resources. For example, the Fock state dynamics shown in Fig. 5a is fully converged when using $N_1 = N_2 = 11$ states in the field Hilbert space, whereas $N_1 = N_2 = 91$ states are required for full convergence of the corresponding coherent state simulation. The qualitative agreement between the two descriptions is however limited to the large photon number regime. For small photon numbers of the order of a few units, fundamental differences in the behavior of Fock and coherent states emerge.
Finally, we have used values in the range 0.1-5 for the dimensionless coupling constants $g_j t_p$. Considering short pulses of the order of $t_p = 2$-$3$ ps, a value $g_j t_p = 0.1$ is obtained with a light-matter coupling strength of the order of $g_j = 0.03$-$0.05$ THz. The latter corresponds to typical values attained in high-Q micropillars [12,28] or open tunable microcavities [6], and similar values are found elsewhere in the literature [26]. Two strategies may be pursued to attain the stronger dimensionless coupling $g_j t_p = 5$. On one hand, the pulse duration may be increased by more than one order of magnitude, to $t_p \approx 100$ ps. We notice, however, that the system in this regime is more prone to environment-induced decoherence, as shown in Fig. 6. Alternatively, a light-matter coupling strength $g_j \approx 2$ THz is needed to obtain $g_j t_p = 5$ with the same short pulses. Whereas such a strong coupling may be challenging to realize, we notice that nanocavities with extreme photon confinement well below the diffraction limit in dielectrics have been designed and fabricated in silicon [29]. Here, ultra-strong light-matter coupling is expected due to the extremely low mode volume.
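The numbers in this paragraph follow from the definition of the dimensionless coupling as the product of coupling rate and pulse duration. The sketch below reproduces them, assuming that 1 THz is treated as 1 ps$^{-1}$ (no factor of $2\pi$), which is the convention that makes the quoted values mutually consistent. The helper names are illustrative.

```python
def coupling_rate_thz(g_tp: float, t_p_ps: float) -> float:
    """Coupling g_j in THz needed to reach the dimensionless value g_j * t_p."""
    return g_tp / t_p_ps


def pulse_duration_ps(g_tp: float, g_thz: float) -> float:
    """Pulse duration t_p in ps needed to reach g_j * t_p at a given coupling rate."""
    return g_tp / g_thz


print(coupling_rate_thz(0.1, 2.0), coupling_rate_thz(0.1, 3.0))  # about 0.05 and 0.033 THz
print(pulse_duration_ps(5.0, 0.05))   # about 100 ps at fixed coupling
print(coupling_rate_thz(5.0, 2.5))    # 2 THz at fixed pulse duration of 2.5 ps
```

Either route multiplies the product $g_j t_p$ by a factor of 50 with respect to the weak-coupling case.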
During the writing of this manuscript, we became aware of a related work which also demonstrates the multi-photon nature of the SUPER scheme [30].

Conclusions
We have studied the exchange of energy between a two-level emitter and two quantized field modes under SUPER scheme excitation. Surprisingly, we have found that one mode loses two or more photons while the other gains at least one, proving that the SUPER scheme is a multi-photon process involving a redistribution of photons between the field modes. Our results unveil a novel and unexpected off-resonant interaction of light and matter, and open new possibilities for manipulating the quantum state of atoms using only a few off-resonant photons.

SUPER excitation of quantum emitters is a multi-photon process: supplemental document

1. Role of the time-dependent Gaussian envelope of the coupling constants
In this Supplementary Note, we investigate the role of the time-dependent Gaussian envelope $\exp(-t^2/t_p^2)$ of the coupling constants in Eq. (1) of the main text. Light-matter interaction between a single two-level emitter and a quantized cavity mode is typically modeled with a time-independent Jaynes-Cummings Hamiltonian. In a frame rotating at the bare exciton frequency $\omega_X$, and considering two quantized modes of the electromagnetic field, the Hamiltonian reads

$$H = \hbar \sum_{j=1,2} \delta_j\, a_j^\dagger a_j + \hbar \sum_{j=1,2} g_j \left( a_j^\dagger \sigma + a_j \sigma^\dagger \right),$$

where $\sigma^\dagger$ raises the emitter population from the ground ($|G\rangle$) to the excited ($|X\rangle$) state, $a_j^\dagger$ creates one photon with energy $\hbar(\omega_X + \delta_j)$ in mode $j$, and we make use of the rotating wave approximation.
In Eq. (1) of the main text, we perform the substitution $g_j \to g_j \exp(-t^2/t_p^2)$ to introduce a time-dependent Gaussian modulation of the coupling constants, with identical shape and length $t_p$ of the pulse for each mode. The Gaussian envelope is chosen to simulate the scenario of pulsed laser excitation, as typically done when the electromagnetic field is treated as a classical (i.e. non-quantized) object [1-3]. Moreover, the smooth envelope avoids an abrupt switch-on and switch-off of the interaction, which introduces additional features in the system response as we demonstrate below. In Fig. S1a and Fig. S1b, we plot the final excited state population obtained with and without the time-dependent envelope, respectively. Apart from removing the time-dependent modulation, all parameters in Fig. S1b are identical to Fig. S1a. It should be noted that the removal of $\exp(-t^2/t_p^2)$ has the same effect as replacing the Gaussian pulse with a square pulse of duration $6t_p$, since we run calculations from the initial time $t_{\mathrm{i}} = -3t_p$ to $t_{\mathrm{f}} = 3t_p$. We observe that the typical SUPER resonance of Fig. S1a is replaced by a rich landscape in Fig. S1b characterized by fast oscillations, where it is difficult to identify a clear resonance.
It is instructive to compare this scenario to the results obtained under a classical approximation of the electromagnetic field. We thus introduce the semi-classical Hamiltonian

Fig. 1. (a-f): Quantum state of the coupled emitter-field system after a SUPER pulse, as a function of the detuning $\Delta_2 = t_p \delta_2$ and the initial number of photons $n_2^i$ in mode 2. The system is initialized at time $t_i$ in the state $|G, n_1^i, n_2^i\rangle$, with $n_j^i$ representing Fock states. The quantities plotted are the final excited state population $\langle \sigma^\dagger \sigma \rangle$ (panels a, b) and the variation $\Delta n_j = \langle a_j^\dagger a_j \rangle (t_f) - n_j^i$ of the average photon number in mode 1 (c, d) and 2 (e, f). (g-k): Evolution of $\langle \sigma^\dagger \sigma \rangle$, $\langle \Delta n_1 \rangle$, and $\langle \Delta n_2 \rangle$ for different configurations corresponding to local maxima in panels a and b, see red arrows. Parameters for the first mode are fixed at $\Delta_1 = 6$ (everywhere), $\sqrt{n_1^i} = 62.83$ (panels a, c, e), and $\sqrt{n_1^i} = 157.08$ (panels b, d, f). The dimensionless interaction strength is 0.1 for both modes (everywhere).

Fig. 2. Photon number distribution of the initial and final state corresponding to Fig. 1g (panels a, b) and Fig. 1j (panels c, d). Here, the population of each number state $|n\rangle$ is plotted.

Fig. 3. (a, b, c): Quantum state of the coupled emitter-field system after a red-and-blue dichromatic pulse, as a function of the initial number of photons $n_j^i$ in each mode $j$. The system is initialized in the state $|G, n_1^i, n_2^i\rangle$, with $n_j^i$ representing Fock states. The quantities plotted are the final $|X\rangle$ state population $\langle \sigma^\dagger \sigma \rangle$ (panel a) and the variations $\Delta n_j = \langle a_j^\dagger a_j \rangle (t_f) - n_j^i$ of the average photon numbers in modes 1 (b) and 2 (c). (d-g): Evolution of $\langle \sigma^\dagger \sigma \rangle$, $\langle \Delta n_1 \rangle$, and $\langle \Delta n_2 \rangle$ in time for different configurations corresponding to the red arrows in panel a. In all panels, the mode detunings are fixed at $\Delta_1 = -\Delta_2 = 6$, and the dimensionless interaction strength is 0.1 for both modes.

Fig. 4. (a-d): Final $|X\rangle$ state population $\langle \sigma^\dagger \sigma \rangle$ after a SUPER pulse in the low photon number regime. The system is initialized in the state $|G, n_1^i, n_2^i\rangle$, with $n_j^i$ representing Fock states. In (a) and (b), the detuning $\Delta_1$ and initial number of photons $n_1^i$ in mode 1 are fixed as reported in each panel, $\Delta_2$ is varied continuously along the $x$-axis, and $n_2^i$ is varied in integer steps along the $y$-axis. In (c) and (d), the initial photon numbers $n_j^i$ are fixed and the detunings $\Delta_j$ are varied continuously. (e, f): Evolution of $\langle \sigma^\dagger \sigma \rangle$, $\langle \Delta n_1 \rangle$, and $\langle \Delta n_2 \rangle$ in time for different configurations corresponding to the red arrows in panels c and d. (g): Evolution of $\langle \sigma^\dagger \sigma \rangle$, $\langle n_1 \rangle$, and $\langle n_2 \rangle$ in time for a red-and-blue dichromatic configuration with $\Delta_1 = -\Delta_2 = 3.12$. (h): Evolution of $\langle \sigma^\dagger \sigma \rangle$, $\langle n_1 \rangle$, and $\langle n_2 \rangle$ in time for a SUPER configuration with $\Delta_1 = -15.68$, $\Delta_2 = -3.78$, and where the emitter is initially in the excited state $|X\rangle$. Note that panels g and h report the absolute photon numbers $n_j$ rather than the variation $\Delta n_j$. In all panels, the dimensionless interaction strength is 5 for both modes.

Fig. 5. (a-f): Evolution of $\langle \sigma^\dagger \sigma \rangle$, $\langle n_1 \rangle$, and $\langle n_2 \rangle$ in time for different initial states of the electromagnetic field. The field modes are initialized either in Fock states containing $n_1^i$ and $n_2^i$ photons respectively (dashed lines), or in coherent states with the same average number of photons (solid lines). Detunings $\Delta_j$ and initial photon numbers $n_j^i$ are indicated in the top panel title. The dimensionless interaction strength is 1 for both modes (a-c) and 5 for both modes (d-f). (g, h): Initial and final photon number distribution in mode 1 corresponding to panel b for Fock and coherent states, respectively. (i, j): Initial and final photon number distribution in mode 2 corresponding to panel c for Fock and coherent states, respectively.

Fig. 6. Effect of (a-c) spontaneous emission at rate $\Gamma$ and (d-f) pure dephasing on the multi-photon scattering. The pulse duration is $t_p = 1$ ps. Other parameters are as in Fig. 4e.

Funding. This work received support from the European Research Council (ERC-CoG "Unity", grant no. 865230) and the European Union's Horizon 2020 Research and Innovation Programme under the Marie Skłodowska-Curie Grant (Agreement No. 861097).

Disclosures. The authors declare no conflicts of interest.

Fig. S1. (a-d): Final population of the $|X\rangle$ state after a SUPER pulse predicted by a quantized (a, b) and a classical (c, d) model of the field. Parameters for the first pulse are fixed as reported in each panel. (e, f): Frequency spectrum for Gaussian (e) and square (f) pulses in a classical model. Parameters are $\Theta_1 = 7\pi$, $\Theta_2 = 10\pi$, $\Delta_1 = -6$, $\Delta_2 = -22$, corresponding roughly to the maximum in panel c.
2 $\approx 10^{-4}$, we initialize the system in the chosen initial state at time $t_i = -3t_p$ and we calculate its evolution with a 4th-order Runge-Kutta algorithm until the final time $t_f = +3t_p$, where the population of the emitter and each field mode is inspected. Here, the use of the interaction picture is beneficial to avoid fast oscillating terms, especially when dealing with field states with a large number of photons.
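The integration procedure described above can be sketched as follows. This is a minimal illustration and not the authors' code. It assumes a two-mode Jaynes-Cummings coupling with a Gaussian envelope, written in the interaction picture with dimensionless time $\tau = t/t_p$, dimensionless detunings $\delta_j t_p$, and dimensionless couplings $g_j t_p$. The function names `destroy` and `simulate`, the step size, and the parameter values in the usage lines are hypothetical choices for a small photon number and are not tuned to any resonance of the paper.

```python
import numpy as np

def destroy(dim):
    """Annihilation operator truncated to `dim` Fock levels."""
    return np.diag(np.sqrt(np.arange(1, dim)), k=1).astype(complex)

def simulate(n1, n2, coupling, detuning, dt=2e-3):
    """Evolve |G, n1, n2> from tau = -3 to tau = +3 with 4th-order Runge-Kutta.

    Returns (excited-state population, change in <n1>, change in <n2>, norm).
    """
    # The RWA conserves the total excitation number, so n1 + n2 + 1 levels
    # per mode are enough to hold the full dynamics from this initial state.
    dim = n1 + n2 + 1
    sm = np.array([[0, 1], [0, 0]], dtype=complex)  # sigma = |G><X|, basis (G, X)
    eye2, eyen = np.eye(2), np.eye(dim)
    sigma = np.kron(sm, np.kron(eyen, eyen))
    a1 = np.kron(eye2, np.kron(destroy(dim), eyen))
    a2 = np.kron(eye2, np.kron(eyen, destroy(dim)))
    v1 = sigma.conj().T @ a1  # excite the emitter, remove one photon from mode 1
    v2 = sigma.conj().T @ a2  # excite the emitter, remove one photon from mode 2

    def rhs(tau, psi):
        # Interaction-picture Hamiltonian (units of hbar / t_p), Gaussian envelope
        v = (coupling[0] * np.exp(-1j * detuning[0] * tau) * v1
             + coupling[1] * np.exp(-1j * detuning[1] * tau) * v2)
        h = np.exp(-tau**2) * (v + v.conj().T)
        return -1j * (h @ psi)

    # Initial state |G, n1, n2>; index = emitter*dim^2 + m1*dim + m2
    psi = np.zeros(2 * dim * dim, dtype=complex)
    psi[n1 * dim + n2] = 1.0

    tau = -3.0
    for _ in range(int(round(6.0 / dt))):
        k1 = rhs(tau, psi)
        k2 = rhs(tau + dt / 2, psi + dt / 2 * k1)
        k3 = rhs(tau + dt / 2, psi + dt / 2 * k2)
        k4 = rhs(tau + dt, psi + dt * k3)
        psi = psi + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        tau += dt

    def expect(op):
        return float(np.real(np.vdot(psi, op @ psi)))

    pop_x = expect(sigma.conj().T @ sigma)
    dn1 = expect(a1.conj().T @ a1) - n1
    dn2 = expect(a2.conj().T @ a2) - n2
    return pop_x, dn1, dn2, float(np.linalg.norm(psi))

if __name__ == "__main__":
    # Illustrative parameters only (assumed, not taken from the figures)
    pop_x, dn1, dn2, norm = simulate(3, 2, coupling=(1.0, 1.0), detuning=(-8.0, -3.0))
    print(f"<X> = {pop_x:.4f}, dn1 = {dn1:+.4f}, dn2 = {dn2:+.4f}, norm = {norm:.6f}")
```

Because the rotating wave approximation conserves the total number of excitations, the returned values should satisfy `pop_x + dn1 + dn2 ≈ 0`, which gives a simple consistency check on the step size alongside the norm of the final state.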