Uni-directional optical pulses, temporal propagation, and spatial and temporal dispersion

I derive a temporally propagated uni-directional optical pulse equation valid in the few cycle limit. Temporal propagation is advantageous because it naturally preserves causality, unlike the competing spatially propagated models. The exact coupled bi-directional equations that this approach generates can be efficiently approximated down to a uni-directional form in cases where an optical pulse changes little over one optical cycle. They also permit a direct term-to-term comparison of the exact bi-directional theory with its corresponding approximate uni-directional theory. Notably, temporal propagation handles dispersion in a different way, and this difference serves to highlight existing approximations inherent in spatially propagated treatments of dispersion. Accordingly, I emphasise the need for future work in clarifying the limitations of the dispersion conversion required by these types of approaches; since the only alternative in the few cycle limit may be to resort to the much more computationally intensive full Maxwell equation solvers.


Introduction
The importance of having simple and robust methods for the propagation of optical pulses has attracted increasing attention in recent years. This is due to the multitude of applications [1] in which ever shorter pulses are used to act like a strobe-lamp that takes snapshots of ultrafast processes [2,3], or subwavelength electric field profiles are created [4][5][6] to achieve detailed control over atomic or molecular responses. Alternatively, strong nonlinearity can be used to construct pulses that are both wide-band and temporally extended, such as (white light) supercontinnua [7][8][9]. Such nonlinearities can even be used to generate sub-structure that is instead temporally confined, as in optical rogue waves [10]; or even the temporally and spatially localised filamentation processes [11][12][13].
This progress towards achieving ever shorter pulse durations, with their associated larger spectral bandwidths, and higher pulse intensities, has been pushing traditional pulse propagation models to their limits, or breaking them. If we want to avoid the computational expense of always resorting to high resolution Maxwell's equations solvers coupled to detailed material response models, we need to be confident that our simpler, less demanding approaches still work. In particular, we need to have a clear idea of the physics that may have been removed, and what the sideeffects of those approximations are. In this paper, I use a directional approach, whose relatively simple and straightforward derivation allows easy comparison of both the approximate and the exact propagation equations, whilst still resulting in the analytical and numerical convenience of a first-order wave equation.
However, unlike perhaps the most common approach in nonlinear optics, I consider propagation in time rather than along some chosen spatial axis [14,15]. The most significant distinctions between temporal and spatial propagation approaches are summarised on figures 1 and 2 respectively. Notice that the initial conditions (starting states) are completely different, and that only the temporal propagation model is going to provide causal solutions in a straightforward way. The temporally propagated evolution equations for the spatial wave profiles do not give us access to the frequency spectrum [16], but instead we have the wavevector spectrum, as well as a causally appropriate time evolution [17]. This requires an alternate approach to the temporal response of the propagation medium, which highlights the existing and often somewhat poorly characterised approximations inherent in spatially propagated treatments of dispersion. Of course, full finite element and/or FDTD [18,19] pulse propagation methods can also be used, but here my aim is to simplify the time propagated approach in a directional approach in line with common spatially propagated methods.
In section 2, I derive a customised second order wave equation from Maxwell's equations, and reorganise it to define the material properties appropriately and set up the factorisation stage. In section 3, I use a factorisation method [14] that allows us to construct an explicitly bi-directional model, and which is then approximated to the popular unidirectional limit in section 4. Section 4.4 remarks on commonly used modifications that can be applied to the equations given in sections 3 and 4, typically in order to clarify their properties, simplify them, or compare them to existing models. Having derived this main result-the propagation equations and their various specialisations-I turn in section 5 to the consequences of temporal propagation and the perspective it provides on the handling of mixed spatial and temporal dispersive effects. This is then followed in section 6 by a specific comparison between the ordinary spatially propagated nonlinear Schrödinger (NLS) equation-perhaps the most widely investigated equation in nonlinear opticsand its temporally propagated counterpart. Finally, in section 7, I present my conclusions.

Second order wave equation
My starting point is the standard macroscopic Maxwell's equations, where I aim for a solution based on a uniform and source free dielectric medium, but still intending to allow material properties that are as general as possible. A typical approach to this is to construct a second order wave equation for the electric field E, as results from the substitution of the H D J t ´= ¶ + Maxwell's equation into E B t ´= - ¶ (see e.g. [21]). Here, however, I want to follow the displacement field D (not E) since it naturally occurs in conjuction with a time derivative-likewise, the other important field is B rather than H. Thus, we rearrange the curl Maxwell's equations to emphasise both their causal properties [17] and their time derivatives, as Temporal propagation of waves, where disturbances (or pulses) evolve either forward or backward in space. At any point in the propagation, we know the spatial behaviour of our wave field at all points in space, as indicated by the pale vertical lines. The pale shaded triangles indicate the past light cones (i.e. the causal past) of a wave element (black circles) at selected times along the path of the disturbance. In a temporally propagated numerical simulation which has a maximum wave speed, the causal past matches the computational past. A notional interface has been added to the diagram to show how a reflection would behave. Reproduced with permission from [16]. The light shaded triangles indicates the computational past of a wave element at particular points (black circles) along the path of the disturbance. Note that unlike the temporally propagated case shown in figure 1, the computational past of a spatially propagated system is not the same as the causal past. A notional interface has been added to the diagram to show how reflections behave-i.e. in an unexpected way [20]. This is because a reflection should have been put in the initial conditions, but was not due to an (assumed) lack of knowledge of that future behaviour. Reproduced with permission from [16].
Note that for both of these equations, we will also (as usual) want to specify the material response, whether dielectric or magnetic, in terms of a dielectric polarisation P D E 0  =and a magnetisation M B H 0 m = -. Either of these equations given above might be either in the spatial domain (with argument r r r r , , x y z = ( ) ), or the wavevector domain (with argument k k k k , , x y z = ( )). Now, we define the total magnetisation M in terms of a component M L linearly dependent on B, and another component M B containing the rest. As a result, the excess magnetisation M M B º m , which will typically include any nonlinearity, is also a function of B and not H. In traditional H-based spatially-propagated approaches, the temporal response ('dispersion') can be incorporated naturally, here it cannot. We have that since after a Fourier transform from r into k, the products become spatial convolutions (''); which emphasises that anything wavevector dependent is necessarily a non-local concept. Note that this is the converse of the usual situation where a temporal response is described using time-domain convolutions, and these become products in the frequency domain. Now we define the total polarisation P in terms of a component P L linearly dependent on D, and another component P D containing the rest. As a result, the excess polarisation P P D  º , which will typically include any nonlinearity, is also a function of D. We have that P r P r P r r D r P r , 2.5 which in the spatial frequency (wavevector k) domain is t ¶ , and for M and P set up, we can now proceed with the derivation of a second order wave equation for D - Remember that in this picture, all field properties are specified as functions of our chosen primary field D; notably, the magnetic field B, which is usually the main argument for M B , must be determined from the calculated D using the relevant Maxwell's equation (i.e. equation (2.2)).

Simple case: isotropic and homogeneous
Since trying to treat all the details of equation (2.15) correctly leads to excessively complicated expressions, I will first simplify it down into the case where (a) the material's reference properties a m , a  do not vary in space, and (b) the effective monopole current is zero (K 0 = ). Since (magnetic) monoples do not exist, and all non reference properties can still be encoded in P D , M B , or J, this simplification does not (of itself) impose any approximation. However, the first simplification means that our reference properties cannot incorporate any exact knowledge about the spatial structure of the propagation medium 1 . This restriction on obtaining the best possible match between the reference behaviour and the exact behaviour of the propagation medium will typically affect later approximations, where we assume deviations from the reference behaviour are small. One additional advantage of selecting constant reference parameters a m and  a is that we can easily replace them with matrices to allow for e.g. birefringence and cross-polarisation couplings.
The simplified second order wave equation for changes in the displacement field D r ( ) as it propagates forward in time is Note that the retention of the D  · and P D  · terms allows us to include charge effects, as is needed when discussing high-power nonlinear situations such as filamentation [22]. 1 This differs from the situation present in the complementary spatially propagated approach [14], where the medium's linear temporal response typically can be subsumed into the reference behaviour. Nevertheless, in neither case can the spatial properties be incorporated exactly into the reference behaviour; although approximations that assume a particular k dependence can be made.
In the spatial Fourier 'wavevector' regime, we replace ık   , so that k k k 2 = · , and reorganise slightly. The equation for changes in the displacement field D k ( ) as it propagates forward in time is c a a k c a k ı c a a c a where we have defined a reference frequency Ω for the propagation, based on the reference material properties, i.e.
We could-if we believed we understood the spatial properties of the propagation medium sufficiently well-simply alter this definition by incorporating an alternate k dependence that mimics an assumed spatial dispersion. However, it needs to be understood that such a step is distinctly ad hoc, since it lacks the convolutions over k necessary for any accurate representation of the material structure. But, that said, such an approach may still have considerable practical utility; and indeed is functionally equivalent to how spatial dispersion can be handled as a non-local (and hence, strictly speaking, a non-causal) process: e.g. those that import a waveguide dispersion calculated from the (spatial) modes given by the guide's transverse cross-section. To allow for this possibility, in the following, I will replace the physically rigorous magnitude k argument for Ω with k, its vector counterpart 2 . This allows for not only more complicated assumed spatial dispersions, but also additional orientation dependence. However, these benefits (only) arise as a result of approximation. Strictly speaking, both on the grounds covered above, and ones taking stricter view of causality [16], many simplistic notions of spatial dispersion involving adding some convenient k dependence is incompatible with locality 3 . Unlike for the spatially propagated scheme in [14], we have not included temporal response of the propagation medium in our definition of the directional fields, since it is incompatible with our temporal propagation. To include this we must model it explicitly as is done in Maxwell solvers such as FDTD [18,19]; or approximate it as if it were a spatial effect (see e.g. [24], or the discussions in the later section 5).

Factorisation
I now factorise [14,25] the second order wave equation for D. This process neatly avoids some of the approximations necessary in traditional approaches, and takes its name from the fact that the lhs of equation (2.17) or equation (2.19) is a simple difference of squares which might be factorised. Indeed, this was done by Blow and Wood in 1989 [27], albeit in a rather ad hoc-but nevertheless effective-fashion. Since the factors are just of the form ık t ¶  ( ), taken individually they each look like a simple forward (or backward) directed first order wave equation. The factorisation method therefore allows us to define a pair of counter-propagating Green's functions, which divides the original second order wave equation into a pair of oppositely directed first order wave equations that are coupled together. In addition, it allows us to straightforwardly compare the exact bi-directional and approximate uni-directional theories term by term, which is in distinct contrast to other approaches, where the backward or unwanted parts are approximated away piecemeal, and may not be suitable for direct comparison. A short summary of the factorisation method, adapted to this new context from earlier work [14], is given in appendix A.
An important point to remember is that the choice of a a ,  m and therefore k W( ) in equation (2.19) defines the specific Greens functions used. This means that it also defines the basis upon which we will then propagate the displacement field D.
In this section and the next, the derivation will closely follow the proceedure, language, and terminology used in earlier work [14]. However, despite the mathematical similarities, it is important not to lose sight of the distinct physical differences. That previous work focussed on propagation of optical fields along a spatial axis, whereas here I instead consider propagation in time. Since my aim is to compare and contrast the two approaches, as discussed at length in sections 5 and 6, it is useful to enable comparisons not only at the endpoint of the derivation, but also at each step along the way.

Bi-directional wave equations
A pair of bi-directional wave equations suggests similarly bidirectional fields, so I split the electric displacement field into forward (D + ) and backward (D -) directed parts, with D D D = + + -. In the following, remember that P D , M B , etc.
all depend on the full field D, and not only one partial field (e.g. only D + ). This is an important point since we see that they then drive both the forward and backward evolution equations equally. The coupled bi-directional first order wave equations for the directed fields D  are propagated forward in time whilst 2 In some cases, depending on the approximate model chosen, this new k W( ) definition may also need to be convolved with other parts of the expression, e.g.
although such an additional complication is likely to prevent us obtaining much conceptual or mathematical benefit.
Consequently, in what follows, I do not allow for such convolutions. 3 Of course, it is possible to treat spatial dispersion in a properly causal way; a notable example being the hydrodynamic plasmon model [23]. This takes propagation in the electron fluid into account, giving rise to an additional k 2 dependance.
being evolved forward and backward in space are Here, I have replaced the partial time derivatives on the rhs with over-dots, this being to emphasise that if our expression is to remain strictly causal [17], then our models for current J and magnetisation M B should return their time-derivatives as explicit functions of known quantities (here, typically D or t).
For example, a current model specification such as J D s = would mean that J D t = ¶ , giving a D t ¶ on both sides of (3.2) and confusing the causal interpretation.
Note that factorised equations can be rebuilt into a single second-order equation by taking the sum and difference, then substituting one into the other with the assistance of a time derivative (see section IV.B of [25]).
A remaining clarification required is to understand the meaning of 'forward' and 'backward' in the context of three dimensional space-this being less straightforward than in the spatially propagated case, where one spatial axis is selected for propagation 4 , and forward and backward refers to time t. Here, we should understand 'forward' and 'backward' as being with respect to a given choice of k.

Propagation, evolution, and directed fields
Since we have chosen to enforce propagation towards later times t on our solutions of the wave equations, the fields t D  ( ) are directed forwards and backwards in space; these fields then evolve forwards and/or backwards in space as t increases. I use this terminology (propagated, directed, evolved) throughout this paper to mean these three specific things. This usage is consistent with that used in the alternative spatially propagated approach [14], but in that case the fields z E  ( ) are directed in time, and evolve forwards and/or backwards in time as z increases.
The wave equation (3.2) which evolves the directed fields D  as they propagate forward in t, has two types of terms on the rhs: and I call these the 'reference' and 'residual' parts [14,28].
Reference evolution (or 'underlying evolution') is that given by ı k D  W  ( ) term, and is determined by our chosen a  and a m , and any additional ad hoc refinements. By itself, it would describe an ordinary oscilliatory evolution where the field oscillations in D r  ( ) would move forward (+) or backwards (−) in space. This is analogous to the choice of reference when constructing directional fields [29], but here is done for temporal rather than spatial propagation [20]. Residual evolution accounts for the discrepancy between the true evolution and the underlying/reference evolution. It consists of all parts of the material response not included in a a ,  m (and hence k W( )); i.e. it contains all of the rest of the terms on the rhs of equation (3.2). These will usually consist of any nonlinear polarisation, or spatially or orientationally dependent linear terms; they are analogous to the correction terms used in methods based on directional fields. Alternatively, such contributions are called 'source' terms by Ferrando et al [25]. Generally, we will hope they only provide a weak perturbation, so that we might make the (desirable) uni-directional approximation discussed later; but the factorisation procedure itself is valid regardless of their strength.
3.3. Reference evolution: choice of Ω and the resulting D 7 I now consider how our choice of Ω will affect the relative sizes of the forward and backward directed D + and D -. I therefore now consider the example of a simple medium in which the field is known to propagate with frequency ω; but instead we choose a reference evolution determined by a frequency Ω that is different from ω. For example, for a linear isotropic medium we could exactly define ; 2 2 2 w = W + D but in general we would just have some residual (source) term Q. As a consequence, the definitions of forward and backward directed fields do not correspond exactly to what the wave equation actually will evolve forward and backward, as we propagate towards later times.
The second order wave equation is D Now if we choose the situation where our field D only evolves forward, we know that ı t D D exp This tells us how much Dwe need to combine with D + so that our pulse evolves forward; the Dis strongly coupled to D + , and so will be dragged forward against its usual preference. This interdependence of D  is generic-no matter what the origin of the discrepancy between Ω and the true evolution of the field (i.e. the residual/source terms such as mismatched linear behaviour, nonlinearity, etc.): some nonzero backward directed field Dmust exist but still evolve forwards with D + . Comparable behaviour is also seen in the directional fields approach of Kinsler et al [29].
Typically we will hope that this residual Dcontribution is small enough so as to be negligible. If we assume D 0 - , then we find that Further, if we choose k W = W( ) with a wave-vector dependence, then we see that the source-like terms inherit that dispersion; and this is almost inevitable, since typically ck W~. This means, for example, that even if we start with model of nonlinear polarisation without any k dependence, our factorised equations will have a k dependence on the nonlinear terms. This is the complementary process to what happens in the spatially propagated approaches of Kinsler et al [14,29], where choosing a dispersive reference has consquences for the behaviour of the correction terms.

Uni-directional wave equations
We can now make a single, well defined class of approximation, and simplify the exact coupled bi-directional evolution of D  down to just one uni-directional first order wave equation. As with the complementary spatially propagated theory [14], this does not require a moving frame, a smooth envelope, or to assume inconvenient second order derivatives are somehow negligible, as was necessary in traditional treatments. The approximation assumes that the residual terms are weak in comparison to the (reference) ı D  W terme.g. weak nonlinearity or orientation dependence. This means that I can assert that if we start with D 0 = -, then Dwill stay negligible. In this context, 'weak' means that no significant change in the backward field is generated in a time shorter than one period ('slow evolution'); and that small effects do not build up gradually over propagation times of many periods ('no accumulation').
Slow evolution is where the size of the residual terms is much smaller than that of the underlying linear evolutioni.e. smaller than D W . This allows us to write down straightforward inequalities which need to be satisfied. It is important to note the close relationship between these and a good choice of Ω, as discussed in section 3.3. If Ω is not a good enough match, there always be significant contributions from both forward and backward directed fields; and even if nothing ends up evolving backwards, an ignored backward directed field will result in (e.g.) miscalculated nonlinear effects, since the total field D D D = + + will be wrong.
No accumulation occurs when the evolution of any backward directed field Dis dominated by its coupling via the residual terms to the forward directed field D ; + and not by its preferred underlying backward evolution. No accumulation means that forward evolving field components do not couple to field components that evolve backwards; this the typical behaviour since the frequency mismatch between forward evolving and backward evolving components is 2 ; W in essence it is comparable to the common rotating wave approximation (RWA). This rapid relative oscillation means that backward evolving components never accumulate, as each new addition will be out of phase with the previous one; it is not quite a 'no reflection' approximation, but one that asserts that the many micro-reflections do not combine to produce something significant. An estimate of the terms required to break this approximation are given in appendix B; generally speaking this is a much more robust approximation than the slow evolution one. Of course, a periodic temporal modulation of the medium will give periodic residual terms, and these can be engineered to force phase matching. In spatially propagated analyses, where we talk of spatial (wavevector) phase matching and not this temporal (frequency) phase matching [30], this would be a periodicity based on a relatively small phase mismatch (see e.g. quasi phase matching in Boyd [31]); but might even go as far as matching the backward wave (see e.g. [32]).
Note, however, that small forward perturbations from the residual terms can accumulate on the forward evolving field components, as indeed can backward perturbations accumulate on the backward evolving field components. Despite the fact that the residual terms acting on the forward and backward field evolutions are the same size, forward evolving components of the residuals can accumulate on the forward evolving field because they are phase matched; whereas backward residuals are not, and rapidly average to zero.

Residual terms
The handling and criteria for different types of residual terms has been discussed already [14] for the spatially propagated case. While it would be straightforward to repeat that level of detail here, the basic methodology is virtually identical except for the different roles played by the t and r arguments. Accordingly, I will consider only the most commonly considered residuals-the dielectric scalar and vector terms as well as currents. Those interested in magnetic or magnetoelectric effects can adapt the process to their own systems in a similar manner.
The relevant total polarisation P D can then be decomposed into pieces before seeing how each might satisfy the slow evolution criteria. Thus we write The scalar part is represented by a  , and might contain (nonreference) linear parts and time response (a L ), perhaps a convolution (' * ') over the past, or angle dependence; it might also include nonlinear contributions a N such as a third order nonlinearity with a D D D D N µ ( · ) . The vector part U D , if present, could be due to something like a second order nonlinearity, which couples the ordinary and extra-ordinary field polarisations. This description of the material parameters is not restrictive, might e.g. include any order of nonlinearity.

Slow evolution?
Having categorised the relevant residual terms, it now remains to treat each one individually and determine the conditions under which the oppositely directed field can nevertheless remain negligible: i.e., for D  , we have that D 0   . where the scalar a L contains the linear response of the material that is both isotropic and lossless (or gain-less); since here it is a time-response function, it is convolved with the electric displacement field D. Note that the field vector D, and indeed the material parameter a L are all functions of time t and space x y z r , , ; = ( ) the polarisation P D and its components a  , U  are a functions of time t, space r, and the field D.
Below, the field vector D will be split into components D i }. Like we also have wavevector components k i from k k k k , , x y 2 2 2 = + and further, k j is also used as a substitute symbol to represent any one of k x , k y , or k z .
Any corrections due to imperfect choice of Ω will appear in the P D polarisation correction term. However, unlike in the spatially propagated case, there is no diffraction-like termhere, that is quite naturally part of the reference behaviour.
Firstly, we consider the scalar polarisation terms a, which could be either linear (a k L ( )) or nonlinear (a k D, N ( )). These might represent e.g. the time-response of the medium (and hence its dispersion), or be some nonlinear response such as the third-order Kerr nonlinearity. Since these are locally correlated in space, in the wavevector picture they include convolutions over k. The criterion therefore is In the important linear case, a a L º is independent of D, which means that the pulse properties play no role, and only the material parameters are constrained. In the nonlinear case, there is a further constraint, which is on the peak intensity of the pulse, but still there are no smoothness assumptions or bandwidth restrictions.
Secondly, we consider the linear and nonlinear vector terms from U. This criterion is similar to the the scalar case in equation (4.2), but with U replacing aD. Thus for i x y z , , Î { }, we can write down constraints for each component U i , which are In the linear case, U U L º , and the linear relationship between U L and D again means that the criterion only constrains the material parameters contained in U L , not the field profile or amplitude. The nonlinear case is similar to the scalar one, and the peak pulse intensity is restricted; e.g. for a 2 c ( ) medium, U D Fourthly, we have the divergence terms kk D · , kk P D · , which amount to charge density contributions either from D or P D , with k a k P · . Thus, if the wavevector is oriented along a unit vector u, then is the relevant criterion.
To summarise, these various criteria assert that modulations away from the reference evolution must be weak. Notably, weak nonlinearity is invariably guaranteed by material damage thresholds [33]. However, and in contrast to the usually favoured (but strictly speaking less causal) spatially propagated methods, the lack of any rigorous way to incorporate dispersion into the reference frequency k W( ) makes it harder to satisfy the criteria for the ordinary linear response.

Uni-directional equation for D +
In a situation in which all of the above slow-evolution criteria hold, we can be sure that the backward directed field Dis only driven by a negligible amount, and if the no-accumulation condition also holds, then neither will there be any build up of backward evolving contributions to the field. Consequently, we can be sure that an initially negligible Dremains so, and equation where now the polarisation P D , magnetisation, and divergence are solely dependent on the forward directed field D + .

Modifications
Just as for the comparable spatially propagated derivation [14] we might also apply some of the strategies used in other approaches to get a more tractable and/or simpler evolution equation. Unlike in 'traditional' derivations [34][35][36][37], none of these are required, but they nevertheless may be useful. Briefly, a co-moving frame might be added, using z z tv f ¢ = + . This is a simple linear process that causes no extra complications; the leading rhs ıcD + term is replaced ) . Note that setting c v = will freeze the phase velocity of the pulse, not the group velocity. Also, a carrier-envelope separation could be implemented using defining the envelope A(z) with respect to wavevector k 1 and carrier frequency 1 w . Multiple envelopes centred at different wavevectors and carrier frequencies (k i , i w ) could also be used to split the field up. This is typically done when the field principally comprises a number of distinct narrowband components (e.g. [31,38,39]). The wave equation can then be separated into one equation for each piece, coupled by the appropriate wavevector-matched polarisation terms (see [30]).
Typically, such modifications not only make the model easier to understand, but can also reduce the computational requirements when propagating the fields. This is in addition to the advantage given by the unidirectional model having only one first order differential equation for the field, in comparison to two in the case of FDTD. However, these advantages for field propagation may still be affected by numerical issues of stability and accuracy, and must be placed in the context of other auxiliary equations which may also be being solved, such as for temporal dispersion, an electron density, or other dynamic effects.

Dispersion
The distinct contrast between the physical meaning of this temporally propagated wave equation and spatially propagated ones [14,27,31] now gives us an opportunity to consider the respective roles of temporal and spatial dispersion. This is because the temporally propagated picture holds all spatial information at any given moment in time, and so is a natural arena in which to treat spatial dispersion; whereas the spatially propagated picture, as we know, holds all temporal information at any given point in space, and so is a natural arena in which to treat temporal dispersion. However, the comparison is not as straightforward as we would like, so, in what follows, I restrict the discussion to the context of the simple slab waveguide.
Nevertheless, we should always keep in mind that the only way to do temporal dispersion correctly in a temporally propagated model is to explicitly model the time-response of the medium, just as you would do in (e.g.) FDTD methods [19].
A dielectric slab waveguide is a high index planar sheet (the core) of high index material clad on both sides by semiinfinite volumes of lower index material. Here, the waveguide is taken to have thickness d in the perpendicular x direction, propagation in the z direction, and with core and cladding permittivities 1  and 0  . In the continuous wave (CW) limit, the spatial structure of the waveguide gives rise to frequency dependent properties that are usually called 'spatial dispersion'. Even for this simple slab design, the matching of the boundary conditions between core and cladding regions gives the dispersion relation a non trivial form; in this case, it is given by the solution to a transcendental equation [40].
Typically, this is written in a way implying we want to calculate k x and k z b = from ω, i.e. for the transverse electric (TE) field modes we have However, we can now notice that this expression looks like one where we know know ω, and from that have to calculate the matching wavevector properties-i.e. that this is an implicitly spatially propagated representation. Therefore, for the temporally propagated case that is the focus of this paper, where k is the known quantity, equation (5.2) should be rewritten. Accordingly, we use k z for β (since wavevector is no longer a propagation reference), and Ω for ω (since frequency now is a propagation reference); thus, we have Here, we see that the natural (reference) propagation frequency is defined solely by the transverse wavevector k x , as is the matching evolution wavevector k z . While apparently simpler than the traditional form, since Ω is directly given by k x , we usually want to neglect any direct knowledge of the transverse properties. This means that what we really want is not k x W( ), but k z W( ), the inverse of the traditionally preferred b w ( ). Despite these complications, which are in any case rather similar to those we see in the traditional spatially propagated picture, we can still find the same waveguide modes and their dispersions, but this time given as functions of k z , not ω. To simplify the presentation, I now assume that for some nth mode of the waveguide, we can expand about some central evolution wavelength k zn , which corrresponds to a reference frequency k n z n W = W( ). This means we can write where c 1 1 g g = +¯has two contributions-that due to the vacuum (just c), and the estimated modifications due to spatial effects ( 1 ḡ ). Now we arrive at the crucial point in the argument, where we also decide to add the effects of whatever temporal response the material might have to our propagation calculation; but also decide that an approximate model where we modify the expression above in equation (5.5) is sufficient. This is the converse of the traditional procedure, where the waveguide dispersion is added to whatever b w ( ) is appropriate for the medium's time response. Consequently, to proceed further we only need to make very similar assumptions: (a) that the two sources of dispersion are weak and so can simply be added, (b) that the temporal response is assumed to have negligible effect on the waveguide's propagating modes, and (c) that the temporal dispersion is the same at all points in the waveguide.
The temporal response of the waveguide is quite naturally written in terms of a wavevector dependent on a specified frequency, and here is taken to have been simplified to a quadratic. Following the recommendations of [28], I also exclude imaginary components (e.g. loss or gain terms) in either the wavevector reference β or the frequency, intending to incorporate such effects as necessary at some later stage. Consequently, for the nth mode, we have where c 1 1 1 k k = + -¯h as two contributions-that due to the vacuum (just c 1 ), and the modification due to the temporal response ( 1 k ). In order to map this into the temporally propagated picture, with a reference frequency being dependent on a wavevector; we replace ω with Ω, and β with k z . Since we have chosen a quadratic form, the resulting expression can then be solved 5 to find Ω. With ck n z n W = , this gives which we could expand into a power series for small k k zn z -. Since 1 k includes the vacuum, and dispersion 2 k is assumed to be small, we should make the top sign choice since it gives the physically relevant small frequency corrections. The linear term in the series has the coeffient 1 . We now combine the two dispersive effects-the spatial/ geometric/waveguide ones based on parameters i g , and converted temporal ones based on i k , to give a total reference frequency based on k z of Note that we have to avoid incorporating the vacuum contribution twice when summing the dispersive effects, which means that 1 ḡ appears instead of 1 g . In the complementary case, useful for spatial propagation, we can follow the converse-and indeed the usual procedure-to convert the CW effect of the waveguide's spatial structure k z W( ) into a frequency domain form [40,41]. This then can be combined with temporal response of b w ( ) to get 1 .

Discussion
We might have hoped to find a less approximate comparison between the spatially propagated and the temporally propagated approaches to dispersion what has just been discussed. Further, although qualitative comparisons can also be made, ideally we would like a specific and quantitative comparison as done for e.g. testing the unidirectional approximation [33].
That this is not as simple as it might sound can be demonstrated as follows: write two mathematically identical but physically and notationally distinct propagation equations, i.e.
For simplicity, this comparison is done in the 1+1D limit and for unidirectional equations; the residual ω or z behaviour has been merged with the reference behaviour to give single propagation parameters k w ( ) or z W( ). Superficially, the comparison between equations (5.11) and (5.12) looks promising. However, the vacuum part of k w ( ) is c w , but the vaccum part of z W( ) is in contrast a spatial derivative, i.e. c z ¶ . Even if it were possible to match up the functional forms of the non-vacuum temporal response of c k w w -( ) with a spatial structure giving z c z W - ¶ ( ) , the vacuum parts would remain unmatched. Alternatively, we could compare the z-propagated ω-domain to the t-propagated k-domain. However, although in this case, the reference behaviours match, but the k-domain model now has a convolution that does not appear in the ω-domain one.
As a result, it is unclear how well approximating temporal dispersion as spatial properties should work; but then it is equally unclear how well approximating spatial properties as temporal dispersion should work either. Nevertheless, this latter process is so ubiquitous in optics as to be effectively invisible, and-more importantly-does not seem to give rise to significant problems. Removing either of the mismatches discussed above-either in reference behaviour or convolution-requires a narrowband limit, a process which necessarily obscures the parameter regime where interesting tests and comparisons can (and should) be made-i.e. the short pulse, large bandwidth limit. Indeed, the stringent nature of the approximation would defeat the entire point of the comparison.
Nevertheless, it may be possible to design numerical simulations containing structures and temporal responses whose scales are carefully graduated so that the mundane bandwidth issues can be kept separate from the interesting propagation ones. Such a design would, as desired, allow the propagation approximations (only) to be compared and contrasted by numerical simulation. However, this is a non-trival task, and one which I leave for later work.

Pulse propagation
As an example, I will now consider pulse propagation in a simple optical fibre model [21,26,40,41], where the 5 In general-if there are more terms in the expansion, or β has some specific and non-trivial analytic form-this step is more complicated. dominant interesting effect is the third order nonlinearity present in silica. Here I compare 1+1D propagation such a waveguide, in order to elucidate the differences between the standard spatially propagated picture (as discussed in [14]) and the temporally propagated picture derived here. This comparison is made much clearer by not only the use of the factorisation method in both cases, but also because it enabled both derivations to closely follow the same steps and approximations.
The assumptions made are those of transverse fields, weak dispersive corrections, and weakly nonlinear response; these all allow us to decouple the forward and backward wave equations. This decoupling means that without using any extra approximations, we can simplify our description and treat forward only pulse propagation. The specific example chosen here is for an instantaneous cubic nonlinearity, but it is easily generalised to non-instantaneous cases or even other scalar nonlinearities.

Spatially propagated NLSE
The commonly used spatially propagated NLS for E x w + ( ) [31,34,42] based on a directional factorisation [14] can be written where ... F[ ] is the Fourier transform that converts the nonlinear polarisation term into its frequency domain form. Here we have chosen a fixed, non-dispersive reference wavevector β, the temporal response of the propagation medium is encoded in the coefficients j k . For the temporal (time response) of the material, this is an in-principle exact rerepresentation of that response as a Taylor series expansion; although in practise the series is usually truncated after a few terms and the propagating fields restricted to within a bandwidth where only those terms are significant.
The important point to remember is that in this spatially propagated picture we do know the full time history, at least in a computational sense-i.e. as we compute a solution to the propagation equation. This 'computational past' [16,43], comprising all previously computed values of the field properties and material state means that a full frequency spectrum and response is known. As a result, the time response of the material can be directly encoded in an accurate and useful dispersion relation, and then be approximated as a Taylor series in ω. A depiction of this spatial propagation scheme was shown in figure 2.
Since this propagation is assumed to be taking place in a waveguide, we also have its geometric-wavelength sensitive -properties to consider. However, as discussed in the previous section we typically just re-represent those as if they were instead generated by a temporal response (see e.g. [40,41]). Thus the coefficients j k usually combine both the effect of time response and geometric properties into a single total dispersion; this is the very basis of dispersion compensation in optical fibres, since the material (temporal) dispersion is offset by the waveguide design (spatial dispersion) [40].

Temporally propagated NLSE
Using the approach derived and presented in this paper, we can develop a time propagated counterpart to equation (6.1). In a purely mathematical sense, it is, apart from notation, identical to equation (6.1). However, in physical terms, the change is not trivial, since the physical meaning and boundary conditions differ significantly. Starting from equation (4.6), we can get a propagation equation in wavevector space for D k where ... G[ ] is the Fourier transform that converts the spatially localised nonlinear polarisation into its wavevector domain form. Here, all dispersive behaviour has been combined into the coefficients j g , which mimic the common notion of spatial dispersion. Such terms can arise directly from a Taylor expansion in k of a relevant CW dispersion relation about some suitable reference wavevector; however, although we can use the expansion to represent the spatial structure of the propagation medium (e.g. a waveguide), the conversion from spatial structure to k expansion is only relatively straightforward in the CW limit.
If the propagation medium also has a temporal response (i.e. temporal dispersion), and if we do not want to model it explicitly, then we can use the approach described in the previous section, where the temporal response of the material was (approximately) represented as if it were instead due to spatial properties.
Finally, note that in ordinary temporally propagated simulations, the computational past and the physical (temporal) past are identical; this is not the case for spatially propagated simulations, as mentioned above. A depiction of the temporal propagation scheme was shown in figure 1.

Discussion
Now that we have seen and compared the two complementary versions of unidirectional NLS equations, it is worthwhile to emphasise some of their features. In a spatial propagation model, the back-evolving wave is distinctly problematic, because it is travelling backwards in time. If our boundary conditions are strictly initial (starting) conditions as in figure 2, and we disallow sources of reverse-propagating waves at the endpoint of our propagation-then the backevolving wave is non-causal. This means we can argue it needs to be removed (or guaranteed negligible) on purely physical grounds. In contrast, in temporal propagation, the back-evolving wave is simply evolving in the opposite (spatial) direction to our pulse, and there is no argument for removing it on purely physical grounds. Nevertheless, it is convenient to do so, as long as the criterial discussed in subsection 4.2 and appendix B hold, in order to derive the simple equation (6.2).
Note that if the third order nonlinearity is preserved accurately (as for the instantaneous 3 c ( ) in equations (6.1) and (6.2)), the dispersion is correctly accounted for, and the necessary constraints hold, pulses of any bandwidth can be evolved correctly by either form (see e.g. [26] for the spatial case). However, if we introduce a carrier-envelope picture to get a simpler NLS equation, we can drop the third harmonic generation contribution and retain only the self-phase modulation nonlinear term. In spatial propagation, such an approximation restricts the allowed pulse bandwidth to one third of the carrier frequency, but in temporal propagation this is instead a restriction on the wavevector bandwidth. This constraint also helps ensure that the spectrum does not extend onto the negative frequency or wavevector range.

Conclusion
I have derived a general first order wave equation for unidirectional pulse propagation in time. The scope was kept as general as possible, allowing for arbitrary dielectric polarisation, magnetisation, diffraction, and free electric charge and currents. This propagation equation has a utility all of its own, allowing efficient unidirectional propagation in a strictly causal manner (i.e. time directed). However, it requires comparable approximations to those used in deriving spatially propagated unidirectional propagation equations, although the specific tradeoffs are different.
The propagation equation was derived by first factorising the second order wave equation for D into an exact bidirectional model. Next, I applied the same type of well defined type of approximation to all non-trivial effects (e.g. nonlinearity, diffraction), and so reduced the bi-directional propagation equations down to a simpler first order unidirectional wave equation. One feature of this factorisation method is that it provides an easy way to do a term-to-term comparison of the exact bi-directional theory and approximate uni-directional descriptions.
In addition to its use where the causal simulation of a propagation problem is of strict concern, it also illuminates the process of handling material response and spatial structure in waveguides by constructing a single combined dispersion model. As discussed in section 5, we can now see that this simple addition of temporal and spatial dispersions in an ad hoc manner stands on rather poor foundations, particularly in the few-cycle pulse limit. Further, it is hard to see any straightforward way of testing the limitations of this common procedure, which poses an interesting challenge for the future. That said, in most applications, we expect that a combined dispersion will remain a perfectly adequate approximation, since the pulses are rarely so short, and it seems reasonable to assume that the character of the pulse evolution should remain similar, even if details might differ. Lastly, the lessons learnt from this investigation could also be applied to other types of wave propagation, most notably acoustic systems (see e.g. [20]).
) is a forward-like (Green's function) propagator for the field, but note that in my terminology, it evolves the field. The complementary backward-like propagator is 1 w + W -( ) . As already described in the main text, we now write D D D = + + -, and split the two sides up to get