Two-photon imaging of soliton dynamics

Optical solitary waves (solitons) that interact in a nonlinear system can bind and form a structure similar to a molecule. The rich dynamics of this process have created a demand for rapid spectral characterization to deepen the understanding of soliton physics with many practical implications. Here, we demonstrate stroboscopic, two-photon imaging of soliton molecules (SM) with completely unsynchronized lasers, where the wavelength and bandwidth constraints are considerably eased compared to conventional imaging techniques. Two-photon detection enables the probe and tested oscillator to operate at completely different wavelengths, which permits mature near-infrared laser technology to be leveraged for rapid SM studies of emerging long-wavelength laser sources. As a demonstration, using a 1550 nm probe laser we image the behavior of soliton singlets across the 1800–2100 nm range, and capture the rich dynamics of evolving multiatomic SM. This technique may prove to be an essential, easy-to-implement diagnostic tool for detecting the presence of loosely-bound SM, which often remain unnoticed due to instrumental resolution or bandwidth limitations.

I do have two small comments: (1) In figure 1 (d), the frame rate for EFXC is listed as kHz. I understand this is accurate for low repetition rate pulsed laser. However, in microcavity [reference 3], the frame rate can be as high as 50 MHz. So I wonder if there is a better way to illustrate frame rate that includes both results in conventional mode-locked laser and the microcavity solitons.
(2) I don't really understand why the frame rate of the two-photon method can be higher than the optical coherent sampling. Is it because optical coherent sampling is phase sensitive and thus strictly limited by the Nyquist condition, while two-photon cross-correlation is not phase-sensitive and thus can violate the Nyquist condition to some extent? I think it will be better to clearly show what the frame rate limit is with a mathematical expression.

Reviewer #2 (Remarks to the Author):
In the paper titled "Two-photon imaging of soliton dynamics" by Łukasz A. Sterczewski et al., the authors demonstrate a way to imaging soliton molecules (SMs) based on non-interferometric intensity cross-correlation (IXC) technique. The local oscillator (1550nm) is a separate optical probe pulse stream generated at a repetition rate that is close to that of the SMs (1150-2200nm), and the small difference in these rates causes a pulse-to-pulse temporal shift of the probe pulses relative to the SM pulses. The coincidence of the two pulse streams on a photodetector with a large bandgap triggers two-photon absorption which does not require matching of wavelength or polarization. Evolutions of soliton molecules are captured as a proof-of-concept demonstration.
Overall, this mechanism is novel and should find applications in characterizing ultrafast pulses. I would recommend its publication in Nature Communications. Nevertheless, this technology also faces some practical challenges, which I request the authors to clarify in the revision.
1. Provided the low efficiency of the 2PA process, it is difficult to sample and reconstruct pulses with low pulse energy. Therefore, for high-rate pulses (like those generated in microresonators), this technology requires unpractically high average power that may damage the PD. The authors should evaluate the required pulse energy and repetition rate that gives a satisfactory SNR in their current setup.
2. Even if the average power of the pulse stream is kept low, aliasing of the detected signal would arise due to saturation of the photodetector induced by high peak powers of the pulses. This is a well-known limitation of dual-comb-based technologies. Please provide the actual power limitations of the detection scheme.
Besides, I have some questions: 1. The authors mentioned that " Since the ultimate goal is almost always high imaging speed to fully capture the SM evolution trajectory, …, as shown in Fig. 3a. ". The analysis is not complete since the frame rates should exceed the evolution rate of the solitons. Otherwise, considerable details of soliton evolution would be lost.
2. The authors also mentioned that " In this context, worth studying is also a single-frame IXC temporal resolution limit, …,the IXC signal due to the LPF lowers the peak contrast. ". Although ∆f_r/f_r^2 does represent the ultimate limit of temporal resolution, the limit imposed by the pulse width of the local oscillator is also significant in many cases (especially when the difference between the repetition rates is small). This should be discussed in the main text.
3. In Figure 1d, the frame rate of EFXC can exceed 50 MHz with temporal resolution a few hundred femtoseconds [See Ref 3].

Reviewer #3 (Remarks to the Author):
The manuscript by L. A. Sterczewski and J. Sotor titled: "Two-photon imaging of soliton dynamics" reports stroboscopic two-photon imaging of the soliton molecules in a mode-locked laser. The authors demonstrate a simple method of detecting the shape of the laser solitons based on the intensity cross-correlation. More explicitly, they used two unsynchronized pulsed laser sources - one as a local oscillator and another one as a laser under test - operating at different wavelengths (1550 nm and 1800-2100 nm, respectively) that after a low pass filter were directed to a conventional photodiode operating at 400-1100 nm.
However, the publication of this manuscript in Nature Communication is not recommended for the following reasons: 1) The motivation of the study has several questionable statements.
a) About solitons in optics: "This is because optical solitons do not spread out during propagation and exhibit robustness against perturbations; therefore they frame the core concept in optical pulse generation". This statement does not fully reflect the motivation. Indeed, optics, first of all, provides an exceptional degree of control over the parameters and low propagation loss that made possible to generate solitons described by nearly integrable equations. This triggered the interest to create a soliton telecommunication line [1].
b) "Despite advances in mathematical modeling, the understanding of these complex inter-soliton interactions still appears to be in infancy." Even though it can be true for some advanced and complex systems, for the examples presented in the manuscript, the study of the formation of soliton molecules is definitely not in its infancy.
Authors do not differentiate different platforms and put them into one basket (i) conservative solitons described by an integrable equation such as NLSe, (ii) dissipative solitons in a passive system such as microresonators governed by the Lugiato-Lefever equation, (iii) and dissipative solitons in an active system governed by the Ginzburg-Landau equation - the case experimentally investigated in the present manuscript. Soliton physics in these systems is very different. Indeed, in conservative systems, soliton molecules can be fully described analytically [2]. In passive resonators, this study of dissipative soliton interaction has been done in the 90s [3]. The literature on the soliton molecules and soliton interaction is extensive in the latter case as well [4].
c) One of the key motivation statements is: "To bypass these limitations and unlock the kHz- to sub-MHz rate imaging potential, in this Article we adapt the non-interferometric intensity cross-correlation". Implying that the dual-comb technique is incapable of achieving such rates, which contradicts the data presented in a table in the method section of a recent paper by Caldwell et al. [5]
2) Authors do not discuss several powerful ultrafast measurement techniques, such as temporal imaging and its extensions [6]. These techniques have been employed not only for the single-shot detection of the temporal profile [7] but also for the full field characterization [8]. The last one is the prominent example of the study of soliton (as well as solitons ensembles) build-up in a passively mode-locked laser. Also, the possibility of implementing this technique in the free space optics, makes it insensitive to the fiber transparency window [9].
3) The scheme is simple yet powerful. However, it is a natural extension of previously known techniques which obscures the novelty of the research. This method is a superposition of the conventional nonlinear cross-correlation technique extended by the TPA. As a result, very similar experimental techniques have been proposed only a few years after the discovery of the first laser, in 1968 by M. A. Duguay and J. W. Hansen [10]. Also, a similar approach using TPA has been used in spectroscopy [11].
4) Importantly, the phase reconstruction - in contrast to the paper that inspired this research (Ref. [29] of the manuscript) - is not shown in the manuscript, which makes it of limited interest to the community.
5) The same concerns the laser soliton dynamics. Effects described there have been reported and observed previously, mainly using DFT [12].
Minor comment: a large number of unnecessary and unconventional acronyms makes the paper difficult to read.
Concluding, the manuscript presents a study of a well-known problem using an original but anticipated technique which is a slight modification of well-known results. Thus, I confirm that the paper is publishable in a scientific journal but does not meet the novelty criteria in this particular case.
Overall the paper is well written, the idea is very novel, and it could have high impacts on nonlinear optics study. Therefore, I strongly recommend this paper for publication in Nature Communications.
I do have two small comments: (1) In figure 1 (d), the frame rate for EFXC is listed as kHz. I understand this is accurate for low repetition rate pulsed laser. However, in microcavity [reference 3], the frame rate can be as high as 50 MHz. So I wonder if there is a better way to illustrate frame rate that includes both results in conventional mode-locked laser and the microcavity solitons.
We fully agree with the need to better illustrate the obtainable frame rate Δfr irrespective of the repetition rate fr. One of the big difficulties is that the permissible Δfr does not scale linearly with the repetition rate, and additionally depends on the probed optical bandwidth. To satisfy the Reviewer's requirement, we propose a figure-of-merit (FOM) in units of Hz$^2$/Hz$^2$ derived from the aliasing criterion.
In conventional EFXC, the frame rate must meet the condition $\Delta f_r \le f_r^2/(2\Delta\nu)$, where Δν is the probed optical bandwidth (practically defined for a -20 dB drop in intensity). To describe the "speed" of the technique, we propose to calculate how far above the Nyquist criterion one can image the soliton trajectory, i.e. $\mathrm{FOM} = 2\Delta\nu\,\Delta f_r/f_r^2$ (this applies only to sampling techniques, to which DFT does not belong).
In the case of the setup presented here (Δν = 7.5 THz, fr = 100.8 MHz, Δfr = 670 Hz for a theoretical Nyquist-limited frame rate), we obtain an FOM of 0.99 if the experiment were performed in EFXC mode. However, in the above-Nyquist regime presented in the paper, when Δfr varies between 1 kHz and 100 kHz, the corresponding FOMs are 2.11 and 147.6. This is possible because the FOM no longer has to be lower than or equal to one. Of course, such high rates come at the expense of lowered temporal resolution, in which case distinguishing closely spaced soliton pulses may be difficult or even impossible.
For non-sampling techniques, the obtainable frame rate reaches the laser repetition rate, Δfr = fr, so that

$\mathrm{FOM_{NS}} = 2\Delta\nu / f_r$

Then, the FOM approaches infinity as the bandwidth increases.
We believe this addresses the requirement for dimensionless comparison of the obtainable frame rates. The above discussion has been added to the caption of Fig. 1.
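The figure of merit discussed above can be checked with a few lines of Python. This is a minimal sketch, assuming the FOM is the ratio of the actual frame rate to the Nyquist-limited one, i.e. 2ΔνΔfr/fr²; the function and variable names are illustrative only and do not come from the manuscript.

```python
def sampling_fom(delta_nu_hz, f_rep_hz, delta_f_rep_hz):
    """Assumed FOM for sampling techniques: 2*dnu*dfr / fr^2 (1.0 = Nyquist limit)."""
    return 2.0 * delta_nu_hz * delta_f_rep_hz / f_rep_hz ** 2


def non_sampling_fom(delta_nu_hz, f_rep_hz):
    """FOM for non-sampling techniques, where the frame rate equals fr."""
    return 2.0 * delta_nu_hz / f_rep_hz


DELTA_NU_HZ = 7.5e12   # probed optical bandwidth quoted in the response
F_REP_HZ = 100.8e6     # repetition rate quoted in the response

fom_nyquist = sampling_fom(DELTA_NU_HZ, F_REP_HZ, 670.0)   # Nyquist-limited case
fom_fast = sampling_fom(DELTA_NU_HZ, F_REP_HZ, 100e3)      # 100 kHz frame rate
fom_ns = non_sampling_fom(DELTA_NU_HZ, F_REP_HZ)           # shot-to-shot technique

print(f"FOM at 670 Hz:  {fom_nyquist:.2f}")
print(f"FOM at 100 kHz: {fom_fast:.1f}")
print(f"FOM, non-sampling: {fom_ns:.3g}")
```

Under this assumed definition the 670 Hz and 100 kHz cases reproduce the quoted values of 0.99 and 147.6.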
(2) I don't really understand why the frame rate of the two-photon method can be higher than the optical coherent sampling. Is it because optical coherent sampling is phase sensitive and thus strictly limited by the Nyquist condition, while two-photon cross-correlation is not phase-sensitive and thus can violate the Nyquist condition to some extent? I think it will be better to clearly show what the frame rate limit is with a mathematical expression.
Per Reviewer's request, we have added a page-long mathematical formulation of how the conventional Nyquist criterion can be violated by means of dispersion (pulse elongation given a fixed optical bandwidth), and prior assumptions about the pulse. Please see sections Violation of the Nyquist criterion, and Violation of the Nyquist criterion in the presence of prior assumptions in the Methods section for details. For Reviewer's convenience it is also provided below:

Violation of the Nyquist criterion.
To provide a mathematical formulation of violating the conventional (DCS) Nyquist limit, we will consider a linearly chirped Gaussian pulse with a spectral width Δν, which closely approximates a sech$^2$ optical pulse encountered in many laser systems. The pulse has an electric field and intensity where a relates to the pulse's full width at half maximum (FWHM) τp through In Eq. 5, b is the chirp parameter defining the sweep rate of the instantaneous frequency: In conventional DCS, the (mutual) spectral width between the LO and LUT ultimately defines the maximum frame rate and hence Δfr. The presence of chirp matters little here from an aliasing standpoint. Assuming both spectral widths Δν are equal, the conventional aliasing limit is This, however, changes greatly for two-photon detection. Because of chirp, the optical spectral width Δν no longer implies a pulse width. Instead, it only imposes a lower bound on τp known as the Fourier transform limit:

In other words, the presence of chirp leading to pulse broadening to duration τd can be seen as equivalent to optical band-pass filtering in conventional DCS. It reduces the occupied electrical bandwidth B in the measured signal by
Propagation of a transform-limited pulse through a medium with group delay dispersion D2 causes the pulse to elongate according to Consequently, the conventional aliasing condition due to the Fourier pulse width limit is replaced in IXC with From a practical standpoint, chirping the pulse is easier in implementation than selective optical bandpass filtering. It should be noted, however, that this operation lowers the obtainable temporal resolution in IXC studies.
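For orientation, the standard textbook relations for a linearly chirped Gaussian pulse that correspond to the quantities named in this section can be sketched as follows. This is a hedged reconstruction rather than the authors' numbered equations: the phase convention $\omega_0 t + bt^2$, the time-bandwidth constant $K$ (whose value depends on how Δν is defined) and the scaling of the IXC limit by $\tau_d/\tau_p$ are assumptions, the last one inferred from the numerical example in this section.

```latex
\begin{align}
E(t) &= E_0 \exp(-a t^2)\,\exp\!\left[i\left(\omega_0 t + b t^2\right)\right],
\qquad I(t) \propto \exp(-2 a t^2), \\
a &= \frac{2\ln 2}{\tau_p^2},
\qquad \omega(t) = \omega_0 + 2 b t, \\
\Delta f_r &\le \frac{f_r^2}{2\Delta\nu}
\qquad \text{(conventional DCS aliasing limit)}, \\
\tau_p &\ge \frac{K}{\Delta\nu}
\qquad \text{(Fourier transform limit)}, \\
\tau_d &= \tau_p \sqrt{1 + \left(\frac{4\ln 2\, D_2}{\tau_p^2}\right)^{2}}
\qquad \text{(broadening by group delay dispersion } D_2\text{)}, \\
\Delta f_r &\le \frac{f_r^2}{2\Delta\nu}\,\frac{\tau_d}{\tau_p}
\qquad \text{(assumed IXC aliasing limit for the chirped pulse)}.
\end{align}
```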
To better illustrate this effect, we will provide a numerical example. Consider a 60-nm wide optical spectrum centered at 1550 nm emitted by a laser with fr = 100 MHz, similar to that presented here (Δν ≈ 7.5 THz). The Fourier-limited Gaussian pulse width is ~68 fs. Using an equally broad LO comb in conventional DCS implies Δfr ≤ 667 Hz. However, guiding the pulse through a 55-cm long piece of single-mode fiber (PM-1550 XP) ($D_2 \approx -1.2\times10^{4}$ fs$^2$) increases its duration to 0.5 ps. This in turn relaxes the IXC Nyquist criterion ~7 times, yielding the maximum unaliased Δfr = 4.67 kHz.
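The arithmetic of this example can be retraced with a short script. It is a sketch under stated assumptions: the 68 fs transform-limited width is taken from the text rather than computed, the broadening uses the standard Gaussian dispersion formula, and the relaxation of the aliasing limit is assumed to scale as the ratio of broadened to transform-limited duration. All names are illustrative.

```python
import math

C_M_PER_S = 299_792_458.0  # speed of light in vacuum


def bandwidth_hz(center_wavelength_m, width_m):
    """Convert a wavelength span to a frequency span, dnu = c * dlambda / lambda^2."""
    return C_M_PER_S * width_m / center_wavelength_m ** 2


def nyquist_delta_f_rep_hz(f_rep_hz, delta_nu_hz):
    """Conventional aliasing limit, dfr <= fr^2 / (2 * dnu)."""
    return f_rep_hz ** 2 / (2.0 * delta_nu_hz)


def broadened_fwhm_s(tau_p_s, gdd_s2):
    """FWHM of a transform-limited Gaussian pulse after group delay dispersion."""
    return tau_p_s * math.sqrt(1.0 + (4.0 * math.log(2.0) * gdd_s2 / tau_p_s ** 2) ** 2)


delta_nu = bandwidth_hz(1550e-9, 60e-9)          # ~7.5 THz
dfr_dcs = nyquist_delta_f_rep_hz(100e6, delta_nu)  # ~667 Hz

tau_p = 68e-15            # transform-limited width quoted in the text
gdd = -1.2e4 * 1e-30      # -1.2e4 fs^2 expressed in s^2
tau_d = broadened_fwhm_s(tau_p, gdd)             # ~0.5 ps

relaxation = tau_d / tau_p                       # assumed scaling of the limit
dfr_ixc = dfr_dcs * relaxation

print(f"dnu = {delta_nu / 1e12:.2f} THz, DCS limit = {dfr_dcs:.0f} Hz")
print(f"tau_d = {tau_d * 1e15:.0f} fs, relaxation = {relaxation:.2f}x")
print(f"IXC limit = {dfr_ixc / 1e3:.2f} kHz")
```

The unrounded ratio comes out slightly above 7, so the script lands a little higher than the 4.67 kHz quoted in the text, which corresponds to rounding the factor down to 7.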

Violation of the Nyquist criterion in the presence of prior assumptions.
Another reason why the IXC technique is more tolerant to operation above the Nyquist limit relates to the pulse observability criterion. If the IXC trace is to be used only for diagnostic purposes and not for accurate pulse width characterization, one can accept some degree of amplitude modulation of peak intensity no matter at what time instance it is detected. Note this is true for a quasi-stationary process, when the frame rate is much faster than pulse trajectory evolution. Accepting this apparent signal distortion due to amplitude underestimation can be seen as a form of controlled aliasing.
In the extreme case, we require the IXC signal to contain only one point when the pulses coincide (with some tolerable temporal offset that causes the modulation). This requires us to find a relative delay between the pulses that produces a signal weaker than for perfect overlap by factor η. Again, we will consider two Gaussian pulses with intensity profiles $I_1(t)=\exp(-2at^2)$ and $I_2(t)=\exp(-2bt^2)$. A cross-correlation (★) of the two as a function of lag τ, due to the shape symmetry, is equivalent to a convolution (✱)

This result follows from the closure of Gaussians under convolution: a convolution of two zero-mean Gaussians with variances $\sigma_1^2$ and $\sigma_2^2$ is a zero-mean Gaussian whose variance is the sum of the individual variances.
If we now ignore the proportionality constant, we define a new function η(τ), which provides a fraction of the maximum of the cross-correlation between the pulses at a given time lag. We will also relate it to the pulse widths τa, τb: For two arbitrary amplitude underestimation factors, $\eta_{0.5}=\eta(\tau_{0.5})=0.5$ and $\eta_{0.1}=\eta(\tau_{0.1})=0.1$, we obtain the following analytical expressions These quantities can be intuitively understood as the relative delay between the pulses that yields a given fraction (0.5 or 0.1) of the theoretical cross-correlation maximum at zero delay. Now we will recall that the asynchronous interaction of the pulses occurs at discrete time intervals increasing by $\Delta f_r/f_r^2$. Therefore, ensuring pulse detection (even in the case of a multi-pulse SM) with a relative amplitude not lower than η requires the temporal separation to advance between consecutive cavity round-trips by The factor of 2 in τη relates to the fact that the pulse shape is symmetric, and either a positive or negative delay will produce the required response. For delays advancing by less than τη, one would get two points above the predefined threshold. The condition defined in Eq. 18 works for pulse trajectory evolution occurring at rates much smaller than the frame rate Δfr.
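The expressions referred to in this derivation can be worked out from the two Gaussian intensity profiles given above. The following is a hedged sketch, not the authors' numbered equations: it assumes $a = 2\ln 2/\tau_a^2$ and $b = 2\ln 2/\tau_b^2$ (which follows from the FWHM of $\exp(-2at^2)$), and the last line is one reading of the detection condition in which the per-round-trip delay step must not exceed $2\tau_\eta$.

```latex
\begin{align}
(I_1 \star I_2)(\tau) &= \int_{-\infty}^{\infty} e^{-2 a t^2}\, e^{-2 b (t+\tau)^2}\,\mathrm{d}t
  = \sqrt{\frac{\pi}{2(a+b)}}\,\exp\!\left(-\frac{2ab}{a+b}\,\tau^2\right), \\
\eta(\tau) &= \exp\!\left(-\frac{2ab}{a+b}\,\tau^2\right)
  = \exp\!\left(-\frac{4\ln 2\;\tau^2}{\tau_a^2+\tau_b^2}\right), \\
\tau_{0.5} &= \tfrac{1}{2}\sqrt{\tau_a^2+\tau_b^2}, \\
\tau_{0.1} &= \sqrt{\frac{\ln 10}{4\ln 2}}\,\sqrt{\tau_a^2+\tau_b^2}
  \approx 0.91\,\sqrt{\tau_a^2+\tau_b^2}, \\
\frac{\Delta f_r}{f_r^2} &\le 2\,\tau_\eta .
\end{align}
```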
It is important to note that the Gaussian pulse shape is only an approximation. Sech$^2$ pulses, which are more difficult to treat analytically, have much longer wings, and even further relax the conventional aliasing condition, particularly in the $\tau_{0.1}$ case.
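The difference between the two pulse shapes can be illustrated numerically. The sketch below cross-correlates two identical pulses of unit FWHM on a plain Riemann grid and finds, by bisection, the delay at which the correlation falls to a fraction η of its peak. The unit-FWHM pulses, grid settings, and function names are assumptions made for illustration only.

```python
import math

GAUSS_K = 4.0 * math.log(2.0)                   # exp(-GAUSS_K * t^2) has FWHM = 1
SECH2_K = 2.0 * math.acosh(math.sqrt(2.0))      # sech^2(SECH2_K * t) has FWHM = 1


def gaussian(t):
    """Gaussian intensity profile with unit FWHM."""
    return math.exp(-GAUSS_K * t * t)


def sech2(t):
    """sech^2 intensity profile with unit FWHM."""
    return 1.0 / math.cosh(SECH2_K * t) ** 2


def cross_correlation(shape, lag, span=12.0, step=0.02):
    """Riemann-sum estimate of the correlation of `shape` with itself at `lag`."""
    n = int(round(2.0 * span / step))
    return step * sum(
        shape(-span + i * step) * shape(-span + i * step + lag) for i in range(n + 1)
    )


def delay_for_fraction(shape, eta):
    """Delay (in units of the pulse FWHM) where the correlation drops to eta * peak."""
    peak = cross_correlation(shape, 0.0)
    lo, hi = 0.0, 6.0
    for _ in range(30):  # bisection; the correlation decreases monotonically with lag
        mid = 0.5 * (lo + hi)
        if cross_correlation(shape, mid) / peak > eta:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


gauss_01 = delay_for_fraction(gaussian, 0.1)
sech_01 = delay_for_fraction(sech2, 0.1)
print(f"tau_0.1 (Gaussian): {gauss_01:.3f} FWHM")
print(f"tau_0.1 (sech^2):   {sech_01:.3f} FWHM")
```

For equal pulse widths, the sech$^2$ delay at η = 0.1 comes out noticeably larger than the Gaussian one, in line with the remark about the longer wings.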
Another thing worth mentioning is that in the weakly-chirped regime, measuring the IXC high above the aliasing limit requires one to ensure an integer ratio k = fr / Δfr to avoid signal scalloping and Vernier-like filtering effects. Because only discrete effective time intervals spaced by 1/(kfr) are probed, features far away from this temporal grid may not be sampled, and hence overlooked. On the other hand, non-integer k ratios offer enhanced temporal resolution due to scanning all possible LUT-probe relative delays like in a sampling oscilloscope. That said, acquiring such high-resolution scans takes multiple Δfr periods. Clearly, different trade-offs must be taken into account when violating the aliasing limit.
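The effect of integer versus non-integer k on the probed delay grid can be illustrated with exact rational arithmetic. This is a toy sketch with deliberately small, hypothetical ratios (k = 10 and k = 10.5); it assumes the LUT runs at fr + Δfr while the probe runs at fr, and the function name is illustrative.

```python
from fractions import Fraction


def sampled_delays(ratio, n_pulses):
    """Distinct probe-to-LUT delays, as fractions of the LUT period.

    ratio    -- f_r / delta_f_r as a Fraction (probe rate over rate difference)
    n_pulses -- number of consecutive probe pulses to consider
    """
    probe_period = Fraction(1)            # time measured in probe periods
    lut_period = ratio / (ratio + 1)      # LUT assumed faster by delta_f_r
    delays = {(n * probe_period) % lut_period for n in range(n_pulses)}
    return sorted(d / lut_period for d in delays)


integer_grid = sampled_delays(Fraction(10), 200)          # k = 10
non_integer_grid = sampled_delays(Fraction(21, 2), 200)   # k = 10.5

print(len(integer_grid), "distinct delays, step", integer_grid[1])
print(len(non_integer_grid), "distinct delays, step", non_integer_grid[1])
```

With an integer ratio the same k delays repeat in every frame, whereas the non-integer ratio fills in a finer grid, but only after more than one Δfr period has elapsed.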
In the paper titled "Two-photon imaging of soliton dynamics" by Łukasz A. Sterczewski et al., the authors demonstrate a way to imaging soliton molecules (SMs) based on non-interferometric intensity cross-correlation (IXC) technique. The local oscillator (1550nm) is a separate optical probe pulse stream generated at a repetition rate that is close to that of the SMs (1150-2200nm), and the small difference in these rates causes a pulse-to-pulse temporal shift of the probe pulses relative to the SM pulses. The coincidence of the two pulse streams on a photodetector with a large bandgap triggers two-photon absorption which does not require matching of wavelength or polarization. Evolutions of soliton molecules are captured as a proof-of-concept demonstration.
Overall, this mechanism is novel and should find applications in characterizing ultrafast pulses. I would recommend its publication in Nature Communications. Nevertheless, this technology also faces some practical challenges, which I request the authors to clarify in the revision.
1. Provided the low efficiency of the 2PA process, it is difficult to sample and reconstruct pulses with low pulse energy. Therefore, for high-rate pulses (like those generated in microresonators), this technology requires unpractically high average power that may damage the PD. The authors should evaluate the required pulse energy and repetition rate that gives a satisfactory SNR in their current setup.
We are thankful for this suggestion as it has encouraged us to search for more sensitive 2-photon detectors and explore optical power constraints of the current Si detector.
First, we found that a combined optical average power as high as 200 mW does not damage the silicon photodiode. It was the highest power we could obtain using our erbium-doped fiber amplifier. After decreasing the power to the original level, the detector responds identically as prior to the high-power illumination.
However, what we find much more appealing to the readers, and potentially to the Reviewer, is that using an InGaAsP 1.3 μm Fabry-Perot telecom laser diode structure (operating here as a 2PA detector, see Reference 42, listed below), we have successfully probed 1.5 μm laser pulses using the IXC technique at LUT power levels as low as 9 μW. This represents a sensitivity improvement by three orders of magnitude. The only downside is that in the current implementation, a short piece of fiber attached to the laser adds some dispersion; it will be eliminated in a future free-space setup and is not a limitation of the technique per se. The following paragraph along with Fig. 6 have been added to satisfy the Reviewer's comment and reflect this major addition to the manuscript.

Fig. 6 (see Methods for device details). Whereas for the Si photodiode the sensitivity, defined as a product of the peak LUT power and average LO power (42), amounts to ~$4.4\times10^{6}$ mW$^2$, using an unbiased laser diode as a detector yields an improvement to $2.4\times10^{3}$ mW$^2$. In other words, we can probe μW average power level pulses instead of mW with corresponding fJ pulse energies. This is shown in Fig. 6a, where 9 μW, 0.5 ps-long pulses are probed by a 15 mW LO. This major sensitivity improvement offered by higher detector nonlinearities should unlock the IXC imaging potential of multi-GHz fr sources like microresonators with typical sub-pJ pulse energies and sub-W peak powers. Greater sensitivity obviously translates into better signal-to-noise (SNR) performance, particularly when the probed laser operates with mW-level average power (Fig. 6b).

Fig. 6 - InGaAsP quantum well laser as a two-photon photodetector. a IXC signal in a soliton singlet state when the LUT average power was 9 μW. b IXC signal when the LUT average power was 6 mW. The LO power was 15 mW, and 23 mW, respectively.
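The sensitivity figure for the laser-diode detector can be approximately retraced from the quoted operating point. The sketch below rests on assumptions that are not stated in the response: a LUT repetition rate of 100.8 MHz and a sech$^2$ pulse shape (peak power ≈ 0.88 × energy / FWHM). Function names are illustrative.

```python
import math

# Peak-power factor of a sech^2 pulse: P_peak = SECH2_PEAK_FACTOR * E / tau_FWHM
SECH2_PEAK_FACTOR = math.acosh(math.sqrt(2.0))  # ~0.881


def pulse_energy_j(avg_power_w, f_rep_hz):
    """Energy per pulse for a given average power and repetition rate."""
    return avg_power_w / f_rep_hz


def peak_power_w(energy_j, fwhm_s):
    """Peak power of a sech^2 pulse (assumed shape) from its energy and FWHM."""
    return SECH2_PEAK_FACTOR * energy_j / fwhm_s


def sensitivity_mw2(lut_avg_w, lo_avg_w, f_rep_hz, fwhm_s):
    """Product of peak LUT power and average LO power, in mW^2."""
    peak_mw = peak_power_w(pulse_energy_j(lut_avg_w, f_rep_hz), fwhm_s) * 1e3
    return peak_mw * lo_avg_w * 1e3


F_REP_HZ = 100.8e6  # assumed LUT repetition rate (not stated for this measurement)

energy = pulse_energy_j(9e-6, F_REP_HZ)                       # 9 uW LUT
sensitivity = sensitivity_mw2(9e-6, 15e-3, F_REP_HZ, 0.5e-12)  # 15 mW LO, 0.5 ps

print(f"pulse energy: {energy * 1e15:.0f} fJ")
print(f"sensitivity:  {sensitivity:.3g} mW^2")
```

Under these assumptions the result is within a few percent of the quoted $2.4\times10^{3}$ mW$^2$, with pulse energies in the fJ range as stated.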

(42): Reid, D. et al. Commercial semiconductor devices for two photon absorption autocorrelation of ultrashort light pulses. Applied optics 37, 8142-8144 (1998)
Assuming that a pulsed laser with a repetition rate in the MHz range has a 1000× lower peak intensity than a GHz-rate laser at the same pulse duration, the improved detection sensitivity offered by laser diode structures should allow microresonators (or semiconductor laser frequency combs in general) to be probed using the IXC technique. It is only a question of a suitable two-photon detector. Our work may trigger efforts to push the 2PA sensitivity to new values.
2. Even if the average power of the pulse stream is kept low, aliasing of the detected signal would arise due to saturation of the photodetector induced by high peak powers of the pulses. This is a well-known limitation of dual-comb-based technologies. Please provide the actual power limitations of the detection scheme.
The reviewer has raised the important concern of photodetector saturation. Indeed, in conventional dual-comb spectroscopy optical powers as low as 100 μW can distort the interferogram and lead to improper absorption line shapes. Currently, special algorithms taking into account the generation of spectra at harmonic frequencies with respect to the carrier are being developed by the community to In the case of the two-photon detection process, we do not observe the clipping or distortion behavior of the cross-correlation signal as in dual-comb spectroscopy. What we find a much bigger concern is the necessity of using high-rejection-ratio optical long-pass filters (or a cascade of those). Residual light in the one-photon absorption range weakens the 2PA effect and produces a significant amount of noise. This obviously leads to a behavior different from the expected quadratic nonlinearity.
Nevertheless, we do observe saturation effects in the case of the Fabry-Perot diode detector, which, unexpectedly, are caused by the electronics. At elevated optical powers (tens of mW), when the transimpedance amplifier gain is set to $10^{6}$ or higher, the amplifier clips the signal once its output reaches the volt level.
To satisfy the Reviewer's requirement, we have added the following to the Methods section:
High average optical power can potentially damage the photodetector. In the case of the Si photodiode, illumination with a combined optical power of ~200 mW did not damage it (this was the highest optical power obtainable using our erbium-doped fiber amplifier). It is therefore uncertain how much power the detector can withstand. Another issue that may arise in the experiment is the limitation of the transimpedance amplifier. At high transimpedance gains, clipping of the electrical signal has been observed when the incident optical power exceeded 10 mW for the FP laser detector. Therefore, a combination of moderate transimpedance gain with a high-vertical-resolution digitizer may be needed for optimal detection performance.
Besides, I have some questions: 1. The authors mentioned that " Since the ultimate goal is almost always high imaging speed to fully capture the SM evolution trajectory, …, as shown in Fig. 3a. ". The analysis is not complete since the frame rates should exceed the evolution rate of the solitons. Otherwise, considerable details of soliton evolution would be lost.
The Reviewer is absolutely right that details of soliton evolution will be lost if the frame rate is too low. This problem is particularly challenging for typical fiber or solid-state lasers, which (as mentioned earlier) have a conventional Nyquist limit in the sub-kHz range corresponding to ~100 000 cavity round trips to produce a single frame. While the IXC technique will never reach the imaging speed of DFT or time lens because it does not operate on a shot-to-shot basis, breaking the Nyquist limit tens or even a hundred times is already a major step towards capturing the rapid SM dynamics in such low-repetition-rate sources.
GHz-rate sources like microcavity resonators have the unique capability of being imaged at rates of tens of MHz due to favorable scaling of imaging speed with the square of the repetition rate. In such scenarios, the value of the IXC technique lies in probing sources in more challenging spectral regions using mature telecom-grade optical modulators at 1.5 μm, rather than in a strong violation of the Nyquist criterion.
The manuscript now includes an extra sentence that talks about the incomplete picture of SM evolution at insufficiently high frame rates:
We need to underline, however, that if the soliton evolution rate exceeds the frame rate, one has to resort to single-shot imaging techniques like DFT (12).
2. The authors also mentioned that " In this context, worth studying is also a single-frame IXC temporal resolution limit, …,the IXC signal due to the LPF lowers the peak contrast. ". Although ∆f_r/f_r^2 does represent the ultimate limit of temporal resolution, the limit imposed by the pulse width of the local oscillator is also significant in many cases (especially when the difference between the repetition rates is small). This should be discussed in the main text.
We are thankful for this comment since the Reader may have a false perception of the obtainable resolution. For instance, despite a large temporal magnification factor, when the frame rate is low, probing features with tens of femtoseconds of width using a near-ps long LO pulse will smear out the details. Although in principle a deconvolution technique can be used to restore the original shape, it always comes at the expense of increased noise.
Just like in any sampling technique, the obtained temporal resolution will be the RMS sum of individual widths. This discussion has been now added to the text.
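It should also be noted that the ultimate resolution limit of the technique, considering the sampling step $\sigma_{s} = \Delta f_r/f_r^2$, the LO pulse width $\sigma_{\mathrm{LO}}$, and its jitter $\sigma_J$, should obey the root mean square sum law, i.e. the obtainable resolution will be $\sigma = \sqrt{\sigma_{s}^2 + \sigma_{\mathrm{LO}}^2 + \sigma_J^2}$.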
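A quick numerical illustration of the root-sum-square law is sketched below. It assumes that the three contributions are the sampling step Δfr/fr², the LO pulse width, and the timing jitter; the 0.5 ps LO width and the 0.1 ps jitter are hypothetical values chosen only to show the arithmetic, and the function names are illustrative.

```python
import math


def sampling_step_s(delta_f_rep_hz, f_rep_hz):
    """Pulse-to-pulse delay increment, delta_f_r / f_r^2."""
    return delta_f_rep_hz / f_rep_hz ** 2


def combined_resolution_s(*widths_s):
    """Root-sum-square combination of independent broadening contributions."""
    return math.sqrt(sum(w * w for w in widths_s))


step = sampling_step_s(1e3, 100.8e6)      # 1 kHz frame rate, 100.8 MHz laser
lo_width = 0.5e-12                        # hypothetical LO pulse width
jitter = 0.1e-12                          # hypothetical timing jitter

resolution = combined_resolution_s(step, lo_width, jitter)
print(f"sampling step: {step * 1e15:.1f} fs")
print(f"combined resolution: {resolution * 1e15:.0f} fs")
```

With these numbers the LO pulse width dominates, which matches the Reviewer's point that at low frame rates the sampling step alone does not set the resolution.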
3. In Figure 1d, the frame rate of EFXC can exceed 50 MHz with temporal resolution a few hundred femtoseconds [See Ref 3].
We fully agree with the Reviewer. Please see our response to the same concern raised by Reviewer #1, Comment 1.

Reviewer #3 (Remarks to the Author):
The manuscript by L. A. Sterczewski and J. Sotor titled: "Two-photon imaging of soliton dynamics" reports stroboscopic two-photon imaging of the soliton molecules in a mode-locked laser. The authors demonstrate a simple method of detecting the shape of the laser solitons based on the intensity cross-correlation. More explicitly, they used two unsynchronized pulsed laser sources - one as a local oscillator and another one as a laser under test - operating at different wavelengths (1550 nm and 1800-2100 nm, respectively) that after a low pass filter were directed to a conventional photodiode operating at 400-1100 nm.
However, the publication of this manuscript in Nature Communication is not recommended for the following reasons: 1) The motivation of the study has several questionable statements.
a) About solitons in optics: "This is because optical solitons do not spread out during propagation and exhibit robustness against perturbations; therefore they frame the core concept in optical pulse generation". This statement does not fully reflect the motivation. Indeed, optics, first of all, provides an exceptional degree of control over the parameters and low propagation loss that made possible to generate solitons described by nearly integrable equations. This triggered the interest to create a soliton telecommunication line [1]. b) "Despite advances in mathematical modeling, the understanding of these complex inter-soliton interactions still appears to be in infancy." Even though it can be true for some advanced and complex systems, for the examples presented in the manuscript, the study of the formation of soliton molecules is definitely not in its infancy. Authors do not differentiate different platforms and put them into one basket (i) conservative solitons described by an integrable equation such as NLSe, (ii) dissipative solitons in a passive system such as microresonators governed by the Lugiato-Lefever equation, (iii) and dissipative solitons in an active system governed by the Ginzburg-Landau equation -the case experimentally investigated in the present manuscript. Soliton physics in these systems is very different. Indeed, in conservative systems, soliton molecules can be fully described analytically [2]. In passive resonators, this study of dissipative soliton interaction has been done in the 90s [3]. The literature on the soliton molecules and soliton interaction is extensive in the latter case as well [4].
In our introduction we tried to encompass the diverse landscape of soliton phenomena. The picture we draw will inevitably be subjective and can only briefly cover the exciting underlying physics given the manuscript word count constraints. It may therefore seem incomplete from a soliton physics expert's point of view.
Regarding the maturity of soliton molecule understanding, we tend to agree with the Reviewer that much work has been done in the early days of computer-aided numerical modeling. Nevertheless, novel soliton molecule states are still observed (despite the large number of available simulation packages) even in simple fiber resonators (i.e. molecular complexes or interaction between soliton molecules in different polarizations, as governed by the Complex Ginzburg-Landau equation).
c) One of the key motivation statements is: "To bypass these limitations and unlock the kHz-to sub-MHz rate imaging potential, in this Article we adapt the non-interferometric intensity crosscorrelation". This implies that the dual-comb technique is incapable of achieving such rates, which contradicts the data presented in a table in the method section of a recent paper by Caldwell et al. [5].
In the manuscript we do not claim that interferometric cross-correlation is incapable of reaching such rates. In fact, the attainable imaging rate depends on the laser repetition rate and scales with its square. Please see our response to Reviewer 1, point #1, where we highlight the difficulties that fiber lasers with MHz repetition rates face to be properly diagnosed/imaged.
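The quadratic scaling can be seen from the geometry of asynchronous sampling. The following is only a rough sketch of that argument and not an expression quoted from the manuscript: two pulse trains at repetition rates f_rep and f_rep + Δf slip against each other by Δf / (f_rep (f_rep + Δf)) ≈ Δf / f_rep² per pulse pair, and one full scan (frame) completes every 1/Δf. Assuming the only requirement is that this slip stays below a chosen temporal step, the highest frame rate is approximately the step multiplied by f_rep². The function names and all numerical values below are hypothetical.

```python
def slip_per_pulse(f_rep, delta_f):
    """Temporal slip (s) between consecutive pulse pairs of two trains
    running at f_rep and f_rep + delta_f (both in Hz).
    Equal to 1/f_rep - 1/(f_rep + delta_f), written without cancellation."""
    return delta_f / (f_rep * (f_rep + delta_f))


def max_frame_rate(f_rep, time_step):
    """Largest repetition-rate difference (the frame rate, in Hz) for which
    the slip per pulse does not exceed time_step (s).
    Obtained by solving slip_per_pulse(f_rep, delta_f) = time_step for delta_f."""
    if time_step * f_rep >= 1.0:
        raise ValueError("time_step must be shorter than the pulse period")
    return time_step * f_rep**2 / (1.0 - time_step * f_rep)


# Hypothetical example: 100 fs sampling step at two repetition rates.
step = 100e-15                      # s
for f_rep in (50e6, 100e6):         # Hz
    rate = max_frame_rate(f_rep, step)
    print(f"f_rep = {f_rep / 1e6:.0f} MHz -> frame rate up to ~{rate:.0f} Hz")
```

Doubling the repetition rate in this sketch raises the admissible frame rate roughly fourfold, which is the square-law dependence referred to above. Other practical limits (detector bandwidth, signal-to-noise ratio) are deliberately ignored here.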
2) Authors do not discuss several powerful ultrafast measurement techniques, such as temporal imaging and its extensions [6]. These techniques have been employed not only for the single-shot detection of the temporal profile [7] but also for the full field characterization [8]. The last one is the prominent example of the study of soliton (as well as solitons ensembles) build-up in a passively modelocked laser. Also, the possibility of implementing this technique in the free space optics, makes it insensitive to the fiber transparency window [9].
We agree with the Reviewer that temporal imaging was overlooked. It is now discussed in the manuscript with suitable references:

Temporal imaging with a time lens (26) has also gained an established position as a tool for single-shot laser pulse diagnostics (27,28).
It should be noted, however, that the temporal imaging technique is not universal and often requires specialized equipment like phase modulators for the relevant spectral region or fibers with low four-wave mixing thresholds. Therefore, it is hampered by the same limitations as the DFT technique, albeit with the primary advantage of providing shot-to-shot laser diagnostics information. The mentioned free-space implementation of a time lens requires a suitable nonlinear crystal (that requires phase matching) along with a pump laser at a completely different wavelength, which adds many layers of complexity compared to the fiber realization. As always, different trade-offs must be taken into account.
Still, we do inform the Reader that in some cases there is no other option but to use DFT or a time lens:

We need to underline, however, that if the soliton evolution rate exceeds the frame rate, one has to resort to single-shot imaging techniques like DFT (12), time-lens (26) or even a combination of both (28) to capture all details of the soliton evolution trajectory.
3) The scheme is simple yet powerful. However, it is a natural extension of previously known techniques which obscures the novelty of the research. This method is a superposition of the conventional nonlinear cross-correlation technique extended by the TPA. As a result, very similar experimental techniques have been proposed only a few years after the discovery of the first laser, in 1968 by M. A. Duguay and J. W. Hansen [10]. Also, a similar approach using TPA has been used in spectroscopy [11].
In the manuscript we do not claim that the use of nonlinear cross-correlation is unprecedented. We properly acknowledge prior advances in the field. However, early demonstrations relied on nonlinear crystals that require phase matching, which may make sampling lasers at dissimilar wavelengths impossible. Additionally, besides wavelength agility and polarization insensitivity, the technique enables probing of femtojoule optical pulses, as shown in the revised manuscript per the Reviewers' suggestions. It is therefore unfair to compare it with Ref. 11, which exploits mJ-level optical pulses with UV coverage to produce a cross-correlation signal.
4) Importantly, the phase reconstruction, in contrast to the paper that inspired this research (Ref. [29] of the manuscript), is not shown in the manuscript, which makes it of limited interest to the community.
The intensity cross-correlation measurement in Ref. 29 resembles a conventional intensity autocorrelation setup with the notable difference that the laser pulse is analyzed by an unbalanced Michelson interferometer instead of a balanced one. One of the arms includes a dispersive element, which causes the measured cross-correlogram to become asymmetric and hence carry phase information. The same laser (only one source, not two) is measured twice, and a phase reconstruction technique can be used if the dispersion of the added element is known, and the optical spectrum is measured. The technique is referred to as PICASO.
This is in stark contrast to the much more difficult case considered here, in which two optical pulses with unknown temporal profiles produce the intensity cross-correlation signal. The number of unknown parameters to be estimated (and hence the solution space) is much larger. In principle, accompanying spectral measurements and a supplementary intensity autocorrelation for each of the interacting lasers can be used to guide a multidimensional optimization algorithm to reconstruct the phases. However, a suitable convergence analysis must be performed first. We are thankful for this suggestion, as it sets new research directions for developing the technique. Nevertheless, given the problem complexity, which goes far beyond the single-source case, we believe that our demonstrations are still of practical relevance to the community, in the same way that intensity autocorrelation plays an important role even without providing phase information.
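To make the last point concrete, a minimal numerical sketch (hypothetical Gaussian pulses on an arbitrary picosecond grid, not our measured data) illustrates why an intensity cross-correlation trace by itself leaves the phase undetermined: two test fields that share the same intensity envelope but differ in chirp produce identical traces, even though their spectra differ. The function name and all pulse parameters are assumptions made for illustration only.

```python
import numpy as np


def intensity_xcorr(field_a, field_b):
    """Intensity cross-correlation of two complex fields on the same time grid.
    Only |field|^2 enters, so any temporal phase is discarded."""
    ia = np.abs(field_a) ** 2
    ib = np.abs(field_b) ** 2
    return np.correlate(ia, ib, mode="full")


t = np.linspace(-5.0, 5.0, 1001)                      # time axis, ps (hypothetical)

# Short probe pulse and a two-pulse test envelope (a doublet-like intensity profile).
probe = np.exp(-t**2 / (2 * 0.1**2)).astype(complex)
envelope = (np.exp(-(t - 0.5)**2 / (2 * 0.3**2))
            + 0.5 * np.exp(-(t + 0.8)**2 / (2 * 0.3**2)))

unchirped = envelope.astype(complex)                  # flat temporal phase
chirped = envelope * np.exp(1j * 4.0 * t**2)          # same envelope, quadratic phase

trace_unchirped = intensity_xcorr(unchirped, probe)
trace_chirped = intensity_xcorr(chirped, probe)

# The two traces coincide, while the spectral magnitudes of the fields do not.
same_trace = np.allclose(trace_unchirped, trace_chirped)
same_spectrum = np.allclose(np.abs(np.fft.fft(unchirped)),
                            np.abs(np.fft.fft(chirped)))
print("identical traces:", same_trace, "| identical spectra:", same_spectrum)
```

Any phase retrieval therefore has to rely on additional measurements, such as the optical spectra mentioned above, rather than on the cross-correlation trace alone.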
5) The same concerns the laser soliton dynamics. Effects described there have been reported and observed previously, mainly using DFT [12].
This manuscript does not claim the discovery of new phenomena or soliton physics in fiber oscillators. In fact, the opposite is true: we image well-known or recently discovered SM states because they allow a clear analogy to be drawn between existing pulse diagnostic techniques and the wavelength-agile IXC technique. Agreement with prior studies confirms the validity of the idea.
Minor comment: a large number of unnecessary and unconventional acronyms makes the paper difficult to read.
We have revised the text to keep only relevant, well-established acronyms. An example of an unnecessary (and overused) acronym was PD, which has now been removed from the text. The same holds for IGM (interferogram) and many others. For convenience, we have included a red-lined version of the manuscript to track changes in the text.
Concluding, the manuscript presents a study of a well-known problem using an original but anticipated technique which is a slight modification of well-known results. Thus, I confirm that the paper is publishable in a scientific journal but does not meet the novelty criteria in this particular case.
We are thankful to the Reviewer for recognizing the novelty of our idea. As the Reviewer says, the technique is original, and therefore new. It has not been demonstrated to date. The fact that it is a modification of existing techniques, and that it can be anticipated, should make it more appealing to broader audiences. As recognized by Reviewers #1 and #2, it could have a high impact on the study of nonlinear optics because of the large number of practical difficulties and fundamental limitations that hamper existing imaging methodologies. Therefore, along with Reviewers #1 and #2, we believe this work is suitable for publication in Nature Communications.
pulsed sources operating in the emerging 2-micron region (demonstrated here) even at highly dissimilar repetition rates (from a conventional dual-comb perspective) is highly attractive. Also an extension to the mid-IR region, urgently needed by the laser community, is quite straightforward via simple frequency translation of a 1.5 micron laser through soliton self-frequency shift merged with a suitable-bandgap detector (i.e. extended InGaAs).
Thanks to our work, it is likely that some of the early references suggested by the Reviewer will be rediscovered when put in a different context like laser pulse diagnostics. The application of the 2PA process to study soliton molecules, which the Reviewer agrees is unprecedented, fulfills the novelty criterion of the journal.
We hope that this satisfactorily addresses the concerns raised by the Reviewers and we thank them for their constructive comments. Sincerely,