Efficient unitary method for simulation of driven quantum dot systems

Density matrices evolved according to the von Neumann equation are commonly used to simulate the dynamics of driven quantum systems. However, computational methods based on density matrices are often too slow to explore the large parameter spaces of solid state quantum systems. Here we develop a unitary computation method to quickly perform simulations for closed quantum systems, in which dissipation to the environment can be ignored. We use three techniques to optimize simulations, apply them to six time-dependent pulses for a semiconductor quantum dot qubit system, and predict the dynamic evolutions. We compare the computational times of our unitary method and the density matrix method for a variety of image sizes. As an example, we implement our unitary method for a realistic four-state system [Z. Shi et al., Nat. Commun. 5, 3020 (2014)], and find that it is over two orders of magnitude faster than the corresponding density matrix method implemented in the popular quantum simulation software QuTiP.


I. INTRODUCTION
The evolution of multi-state quantum systems with time-dependent driving is a fundamental problem in quantum information. While it is simple to specify an initial state, a Hamiltonian, and its time dependence, determining the final state from these specifications is challenging. The standard method is to use density matrices [1-7]. While density matrix methods are robust and extensible, their numerical implementations are computationally expensive.
Two research methodologies depend on the simulation of driven quantum systems. In the first, experiments are performed and their results are then backed by simulations. In the second, simulations are conducted and experiments seek to realize their behavior. Yet, both methodologies have challenges. In the first, the challenge is to determine the Hamiltonian that reproduces the experimental results. Finding this Hamiltonian requires a parameter space exploration. In the second, finding evolutions that warrant experimental study requires a parameter space exploration, and the experimental investigation may be challenging.
The parameter spaces of solid state quantum systems are usually very large. For example, consider a charge based quantum dot qubit in silicon. An ideal qubit is a two-state system, yet in real solids such as silicon, realizations of qubits are more complex. The realization of a charge qubit involves the creation of two quantum dots, where the position of the charge encodes the quantum state [8-13]. Yet in real solids there are additional pseudospin/valley states [5,14,15]. In silicon there are six valley states; considering the lowest two states turns the simulation of a qubit system into the simulation of a four-state system.
Here we develop three techniques for efficient unitary evolution and present this method as a faster alternative to density matrix methods for the exploration of parameter spaces. Previous studies of unitary evolution and related methods have justified the decomposition of the unitary evolution operator into a product of exponentials of time-independent Hamiltonians [16-23]. Meanwhile, preconditioned adaptive step sizes for Runge-Kutta solutions to the Schrödinger equation were introduced by J. C. Tremblay et al. in Ref. 24. In addition, we apply our method to the simulation of a realistic four-state system realized in a charge based quantum dot qubit system in a semiconductor by Z. Shi et al. in Ref. 7.

A. Coherent Oscillations
Coherent oscillations in the probability of a state arise during time evolution as a result of quantum tunneling or spin precession. While coherent oscillations in simple systems, such as the Larmor precession of a spin-1/2 particle in a magnetic field, are easy to visualize theoretically, the situation for multi-state systems is less clear.
Consider a two-state system with eigenstates |L⟩ and |R⟩, corresponding to two quantum dots with potentials V_L and V_R, and a tunnel coupling ∆ between the two dots. With the detuning defined as ε ≡ V_R − V_L [25-27], the Hamiltonian for this system is then:

H = [ [−ε/2, ∆], [∆, ε/2] ].

With unitary evolution at constant ∆ and ε, this leads to coherent oscillations with a frequency of ω = E/ℏ = √(∆² + ε²/4)/ℏ. Such coherent oscillations are most clearly visualized in an image of probability amplitude versus detuning plateau and evolution time. In Fig. 1, oscillations occur at a frequency ω, and the peak amplitudes depend on the detuning [28]. Images of coherent oscillations can be determined for real systems. Experimentally, this involves fabricating a system of quantum dots and electrodes. The electrodes are then connected to a pulse generator which sets the electric potentials on the quantum dots [25,29,30].
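The eigenenergy E = √(∆² + ε²/4) behind this oscillation frequency can be checked numerically. The following is a minimal NumPy sketch (not the authors' code), using a hypothetical two-state Hamiltonian in the {|L⟩, |R⟩} basis and arbitrary energy units:

```python
import numpy as np

def two_state_hamiltonian(eps, delta):
    """Hypothetical double-dot Hamiltonian in the {|L>, |R>} basis."""
    return np.array([[-eps / 2.0, delta],
                     [delta, eps / 2.0]])

eps, delta = 3.0, 2.0  # arbitrary energy units
H = two_state_hamiltonian(eps, delta)
evals = np.linalg.eigvalsh(H)             # ascending eigenvalues, +/- E
E = np.sqrt(delta ** 2 + eps ** 2 / 4.0)  # E = sqrt(delta^2 + eps^2/4) from the text
```

The eigenvalues of this H are ±E, consistent with the stated oscillation frequency.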
arXiv:1909.02532v2 [physics.comp-ph] 19 Apr 2020

[Figure: Coherent Oscillations in a Two-State System]

In modeling these systems, the detuning is time dependent, and experimentally relevant time dependencies may be considered; see Fig. 2 for some examples. For example, a perfectly square pulse cannot be generated experimentally, so a trapezoid pulse approximates the finite rise and fall time characteristics of the pulse generator. Additionally, dots may have on-site splittings.
These factors result in new physics [5-7,14,15], and necessitate numerical solutions to accurately model the time evolution. Computationally, states may be evolved using density matrix methods or, if there is no dissipation, unitary evolution methods. We do this by selecting an initial state, evolving it through the pulse, and conducting post-pulse averaging for each pulse duration and detuning amplitude in some range. The final state is taken to be the average of the state at the minimum detuning over a readout time following the pulse.

B. Density Matrix Evolution
Currently, density matrix methods are the standard method used to model the time evolution of quantum information systems [1-7]. While these methods are accurate, and can create images from known parameters, they are too slow to explore the parameter space of real quantum dot systems.
Density matrices, ρ, encode state information in a matrix of the form ρ = Σⱼ pⱼ |ψⱼ⟩⟨ψⱼ|, with positive real pⱼ and tr(ρ) = 1. Expectation values of observables Q are computed as ⟨Q⟩ = tr(Qρ). Density matrices evolve according to a differential equation such as the von Neumann equation [31]:

ρ̇ = −(i/ℏ)[H, ρ].

Density matrix methods then implement numeric solutions to this differential equation. Two such methods are the 4th-order Runge-Kutta method (RK4) and the Adams linear multistep method (as used in the QuTiP library [32,33]).
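For concreteness, a fixed-step RK4 integrator for the von Neumann equation can be sketched as follows. This is an illustrative toy (natural units with ℏ = 1 and a hypothetical 2 × 2 tunneling Hamiltonian), not QuTiP's implementation:

```python
import numpy as np

HBAR = 1.0  # natural units (assumption for this sketch)

def vn_rhs(H, rho):
    """Right-hand side of the von Neumann equation: -(i/hbar)[H, rho]."""
    return (-1j / HBAR) * (H @ rho - rho @ H)

def rk4_step(H, rho, dt):
    """One classical 4th-order Runge-Kutta step (H held constant over dt)."""
    k1 = vn_rhs(H, rho)
    k2 = vn_rhs(H, rho + 0.5 * dt * k1)
    k3 = vn_rhs(H, rho + 0.5 * dt * k2)
    k4 = vn_rhs(H, rho + dt * k3)
    return rho + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

# hypothetical two-state example: pure tunneling, starting in |L><L|
H = np.array([[0.0, 1.0], [1.0, 0.0]], dtype=complex)
rho = np.array([[1.0, 0.0], [0.0, 0.0]], dtype=complex)
dt = 1e-3
for _ in range(1000):  # evolve to t = 1
    rho = rk4_step(H, rho, dt)
```

For this H, the exact left-dot population is cos²(t), which the integrator reproduces to high accuracy while preserving the trace.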

C. Unitary Evolution
As an alternative, we present unitary evolution. While unitary evolution cannot describe dissipation or coupling to the environment, the methods are, after optimization, fast and effective ways to approximate state evolution and to develop a theoretical understanding of real systems.
Here we develop three techniques to optimize unitary evolution methods for determining images of coherent oscillations in the time evolution of quantum-dot systems driven by detuning pulses. When the particle number ⟨ψ|ψ⟩ is conserved, quantum states evolve with a unitary time-evolution operator:

|ψ(t)⟩ = U(t, 0)|ψ(0)⟩.

If we know the time-evolution unitary U(t, 0) and the initial state |ψ(0)⟩, then we are done. Yet it is often challenging to find U(t, 0). In particular, if the state is a mixed state entangled with the environment, or if the Hamiltonian has a probabilistic dependence, then it is hard to find the corresponding unitary, and it is simpler to use the density matrix formalism for time evolution. Otherwise, so long as the Hamiltonians at different times commute, the time-evolution unitary may be found using the formula [21]:

U(t, 0) = exp( −(i/ℏ) ∫₀ᵗ H(t′) dt′ ).

We may then approximate the integral by expressing the unitary evolution as a product of N steps of constant H at times τₙ ∈ {τ₀, τ₁, . . . , τ_N} ⊂ [0, t], with τ₀ = 0 and τ_N = t:

U(t, 0) ≈ ∏ₙ₌₀^{N−1} exp( −(i/ℏ) H(τₙ)(τₙ₊₁ − τₙ) ).

Carefully selecting the time intervals τₙ, and how the product is evaluated and stored, leads to substantial improvements in computational time. Here N is the total number of time steps in the simulation of a system, and we fix num as the image size num × num.
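The product-of-exponentials approximation can be sketched directly with SciPy's matrix exponential. The Hamiltonian function, step count, and units below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np
from scipy.linalg import expm

HBAR = 1.0  # natural units (assumption for this sketch)

def evolve_product(H_of_t, psi0, t, N):
    """Approximate U(t, 0)|psi0> as a time-ordered product of N constant-H steps."""
    dt = t / N
    psi = psi0.astype(complex)
    for n in range(N):
        Hn = H_of_t((n + 0.5) * dt)  # sample H at the midpoint of each step
        psi = expm(-1j * Hn * dt / HBAR) @ psi
    return psi

# constant-H sanity check: a pure tunneling Hamiltonian rotates |L> into |R>
H = np.array([[0.0, 1.0], [1.0, 0.0]])
psi = evolve_product(lambda t: H, np.array([1.0, 0.0]), np.pi / 2.0, 200)
```

For a constant Hamiltonian the product is exact, so after t = π/2 under pure tunneling the state has been transferred entirely to |R⟩.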
In particular, in the diagonal basis {|aⱼ(τₙ)⟩} at each time τₙ, the unitary operator is the time-ordered (T) matrix product of the sums over the states:

U(t, 0) ≈ T ∏ₙ Σⱼ exp( −(i/ℏ) Eⱼ(τₙ)(τₙ₊₁ − τₙ) ) |aⱼ(τₙ)⟩⟨aⱼ(τₙ)|.
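A single diagonal-basis step can be built from an eigendecomposition rather than a general matrix exponential. This NumPy sketch (ℏ = 1 and a hypothetical Hamiltonian assumed) illustrates the idea:

```python
import numpy as np

HBAR = 1.0  # natural units (assumption for this sketch)

def step_unitary_diag(H, dt):
    """Build exp(-i H dt / hbar) from the eigendecomposition H = V diag(E) V^dag."""
    E, V = np.linalg.eigh(H)
    # columns of V are scaled by the eigenphases: V diag(exp(-i E dt)) V^dag
    return (V * np.exp(-1j * E * dt / HBAR)) @ V.conj().T

H = np.array([[1.0, 0.5], [0.5, -1.0]])  # hypothetical Hamiltonian
U = step_unitary_diag(H, 0.1)
```

Each eigenvector simply acquires a phase exp(−iEⱼ dt/ℏ), and the resulting operator is unitary by construction.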

D. Charge Based Semiconductor Quantum Dot Qubits
Multi-state quantum systems are commonly realized in semiconductors. Of particular interest are silicon qubits, where quantum dots encode state information. State information may be stored as spin, charge, or on-site valley splitting states. In this paper, we consider examples of charge qubits composed of quantum dots with valley splittings. The coherent oscillations of the electronic states in quantum dots are of key interest for the encoding and measurement of quantum information [6,26,34,35].

E. This Paper
This paper aims to answer the question: "how can we computationally explore the dynamics of driven quantum systems in the large parameter space of solid state quantum systems?" Our answer is: unitary evolution with optimizations for realistic driving pulse characteristics.
In Section II, we develop optimization techniques for unitary evolution. These optimization techniques depend on assuming the form of the time dependence, but these assumptions are usually experimentally applicable.
In Section III, we present six pulses: square, trapezoidal, ramp, sine, arc, and noise, and discuss optimization techniques for unitary evolution methods with these detuning pulses.
In Section IV, we compare the computational times and time-complexities of the optimized unitary methods with density matrix methods implemented in QuTiP for the three-state system developed by Schoenfield et al. in Ref. 6.
In Section V, we apply an optimized unitary method for a trapezoidal pulse to the realistic four-state system developed by Shi et al. in Ref. 7.

II. OPTIMIZATION TECHNIQUES
A simple implementation of unitary evolution for images of coherent oscillations in a state's evolution is to use two nested for loops: one over the detunings and one over the pulse times, with a fixed step size for unitary evolution. However, this is not particularly efficient. If we assume that the only time dependence of the Hamiltonian comes from the detuning, then we may optimize the evolution method. In particular, we develop three optimization techniques: evolution over constant intervals, extraction of repeated features, and extension of prior computations.
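The naive nested-loop baseline might look like the following sketch (hypothetical two-state Hamiltonian, arbitrary units, ℏ = 1); each pixel of the image is evolved from scratch with a fixed step size:

```python
import numpy as np

HBAR = 1.0  # natural units (assumption for this sketch)

def hamiltonian(eps, delta=2.0):
    # hypothetical two-state double-dot Hamiltonian
    return np.array([[-eps / 2.0, delta], [delta, eps / 2.0]])

def evolve(psi, H, t, dt=1e-2):
    """Evolve |psi> under constant H for time t with fixed step dt."""
    E, V = np.linalg.eigh(H)
    U_step = (V * np.exp(-1j * E * dt / HBAR)) @ V.conj().T
    for _ in range(int(round(t / dt))):
        psi = U_step @ psi
    return psi

num = 16  # small num x num image of P_R versus (detuning, pulse time)
image = np.zeros((num, num))
for i, eps in enumerate(np.linspace(-5.0, 5.0, num)):     # loop over detunings
    for j, t_p in enumerate(np.linspace(0.0, 3.0, num)):  # loop over pulse times
        psi = evolve(np.array([1.0, 0.0], dtype=complex), hamiltonian(eps), t_p)
        image[i, j] = abs(psi[1]) ** 2
```

Every pixel repeats work that its neighbors have already done, which is exactly the waste the three optimization techniques below remove.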

A. Constant time dependence
For unitary evolution with the diagonal-time-dependence Hamiltonians that arise in quantum dot systems, no accuracy is gained by subdividing intervals over which the Hamiltonian is time independent. Therefore a unitary evolution method may be preconditioned to evaluate the evolution of such an interval in one step:

U(t_b, t_a) = exp( −(i/ℏ) H (t_b − t_a) ), for H constant on [t_a, t_b].

This is useful for square pulses, nearly square pulses, and pulses with steps.
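A small numerical check of this point: for a constant Hamiltonian, one exponential over the whole interval reproduces the subdivided product exactly. The Hamiltonian and units below are illustrative assumptions:

```python
import numpy as np
from scipy.linalg import expm

HBAR = 1.0  # natural units (assumption for this sketch)
H = np.array([[0.0, 1.5], [1.5, 0.5]])  # hypothetical constant Hamiltonian
t_plateau = 2.0

# one-step evolution over the entire constant-H interval
U_one = expm(-1j * H * t_plateau / HBAR)

# subdivided evolution gives the same result at N times the cost
N = 100
U_many = np.linalg.matrix_power(expm(-1j * H * (t_plateau / N) / HBAR), N)
```

Because the exponentials of a single H commute, the two results agree to floating-point precision, so the one-step form is strictly cheaper.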

B. Repeated features
Repeated features need only be calculated once. In particular, with the time-ordering operator T, if the time dependence on the interval [t_b, t_b + τ] repeats the feature on [t_a, t_a + τ], so that H(t_b + s) = H(t_a + s) for s ∈ [0, τ], then

U(t_b + τ, t_b) = T exp( −(i/ℏ) ∫₀^τ H(t_a + s) ds ) = U(t_a + τ, t_a),

and the unitary for the repeated feature may be reused. This is useful for experimentally generated pulses, which have fixed rise and fall times that are independent of pulse times, and for other pulses with features that are repeated.

C. Extension of previous computations
If a new pulse is an extension of an old pulse, then by the separability of Eq. 6:

U(t_new, 0) = U(t_new, t_old) U(t_old, 0),

so only the interval (t_old, t_new] must be newly evolved. This is useful for all pulses that progress uniformly, or that have sections in between repeated features which progress uniformly.
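A sketch of this technique for a uniformly progressing pulse, assuming a hypothetical two-state Hamiltonian and ℏ = 1: each new pulse duration reuses the accumulated unitary of the previous one, so an entire row of pulse durations costs a single pass through the time grid:

```python
import numpy as np
from scipy.linalg import expm

HBAR = 1.0  # natural units (assumption for this sketch)

def hamiltonian(eps, delta=1.0):
    # hypothetical two-state double-dot Hamiltonian
    return np.array([[-eps / 2.0, delta], [delta, eps / 2.0]])

# ramp pulse: eps(t) runs uniformly from eps_min to eps_max over t_max
eps_min, eps_max, t_max = -2.0, 2.0, 2.0
dt = 1e-2
N = int(round(t_max / dt))

# U(t_{n+1}, 0) = U(t_{n+1}, t_n) U(t_n, 0): each step extends the old unitary,
# so probabilities for every pulse duration come from a single pass
U = np.eye(2, dtype=complex)
psi0 = np.array([1.0, 0.0], dtype=complex)
prob_R = np.empty(N + 1)
prob_R[0] = 0.0
for n in range(N):
    eps = eps_min + (eps_max - eps_min) * (n + 0.5) * dt / t_max
    U = expm(-1j * hamiltonian(eps) * dt / HBAR) @ U
    prob_R[n + 1] = abs((U @ psi0)[1]) ** 2
```

Compared with the naive approach of restarting the evolution for each pulse duration, this reduces the cost of a row from O(N²) steps to O(N).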

D. Combining techniques
Which techniques are appropriate for a given time dependence must be considered carefully, and it should be noted that multiple techniques may be used for a single detuning pulse.

III. PULSES
Here, we present six detuning pulse patterns, as seen in Fig. 2, describe methods for optimizing unitary evolution with such time dependencies, and discuss computational times. See Appendix C for pseudocode implementations of these methods.

A. Square
Consider a pulse at a constant detuning, as in Fig. 2(a). An expression for a pulse from time t = 0 to t = t_p with detunings ε_min and ε_max is:

ε(t) = ε_max for 0 ≤ t ≤ t_p, and ε(t) = ε_min otherwise.

For this pulse, the unitary for the evolution may be calculated in one step as in Sec. II A.

B. Trapezoid
A more experimentally realistic pulse is a trapezoid pulse with fixed rise and fall times, as in Fig. 2(b). An expression for a pulse from t = 0 to t = t_p with detunings from ε_min to ε_max, rise time t_r, and fall time t_f is:

ε(t) = ε_min + (ε_max − ε_min) t/t_r for 0 ≤ t < t_r,
ε(t) = ε_max for t_r ≤ t < t_p − t_f,
ε(t) = ε_min + (ε_max − ε_min)(t_p − t)/t_f for t_p − t_f ≤ t ≤ t_p.

For this pulse, the unitary for the evolution may be broken into three sections: the rise and fall unitaries are calculated once per pulse height as in Sec. II B, and the plateau is calculated in one step as in Sec. II A.
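Combining Secs. II A and II B for a trapezoid pulse might look like the following sketch (hypothetical two-state Hamiltonian, arbitrary units, ℏ = 1); the rise and fall unitaries are computed once and reused for every plateau duration:

```python
import numpy as np
from scipy.linalg import expm

HBAR = 1.0  # natural units (assumption for this sketch)

def hamiltonian(eps, delta=1.0):
    # hypothetical two-state double-dot Hamiltonian
    return np.array([[-eps / 2.0, delta], [delta, eps / 2.0]])

def ramp_unitary(eps_a, eps_b, t_ramp, steps=400):
    """Time-ordered product for a linear detuning sweep from eps_a to eps_b."""
    dt = t_ramp / steps
    U = np.eye(2, dtype=complex)
    for n in range(steps):
        eps = eps_a + (eps_b - eps_a) * (n + 0.5) / steps
        U = expm(-1j * hamiltonian(eps) * dt / HBAR) @ U  # later steps on the left
    return U

eps_min, eps_max, t_rise = -2.0, 2.0, 0.25

# rise and fall depend only on the pulse height: compute once (Sec. II B)
U_rise = ramp_unitary(eps_min, eps_max, t_rise)
U_fall = ramp_unitary(eps_max, eps_min, t_rise)

def trapezoid_unitary(t_plateau):
    # the plateau is time independent: one step suffices (Sec. II A)
    U_plateau = expm(-1j * hamiltonian(eps_max) * t_plateau / HBAR)
    return U_fall @ U_plateau @ U_rise

U = trapezoid_unitary(1.0)
```

Only the cheap one-step plateau factor changes as the pulse duration is scanned, so a whole row of the image reuses the two expensive ramp products.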

C. Ramp
A pulse which progresses uniformly from t = 0 to t = t_p with detunings from ε_min to ε_max, with a discontinuity at t = t_p, as in Fig. 2(c), is:

ε(t) = ε_min + (ε_max − ε_min) t/t_p for 0 ≤ t < t_p.

For this pulse, the unitary for the evolution may be computed iteratively as in Sec. II C.

D. Sine
Another experimentally realistic pulse is a sine pulse. To ensure that the pulse is continuous, select an angular frequency ω such that ω = jπ/t_p for an integer j. Here, however, we consider a sine wave with a discontinuity at t = t_p, as in Fig. 2(d). An expression for a pulse of frequency ω from t = 0 to t = t_p with detunings centered on ε_center with amplitude ε_amp is:

ε(t) = ε_center + ε_amp sin(ωt) for 0 ≤ t < t_p.

For this pulse, the unitary for the evolution may be computed iteratively as in Sec. II C. While in principle it would be possible to store the unitary for one period, U_period, and decompose

U(t, 0) = U(t, kT) (U_period)ᵏ for kT ≤ t < (k + 1)T, with period T = 2π/ω,

this method is discouraged because it yields only modest improvements over iterative time dependence, demands large memory usage, and is relatively difficult to implement and modify.
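The periodicity property motivating the U_period decomposition can be verified numerically. In this sketch (hypothetical two-state Hamiltonian, ℏ = 1), the step grid is aligned with the drive period, so the single-period unitary repeats exactly:

```python
import numpy as np
from scipy.linalg import expm

HBAR = 1.0  # natural units (assumption for this sketch)

def hamiltonian(eps, delta=1.0):
    # hypothetical two-state double-dot Hamiltonian
    return np.array([[-eps / 2.0, delta], [delta, eps / 2.0]])

def sine_unitary(t_total, eps_center=0.0, eps_amp=2.0,
                 omega=2.0 * np.pi, steps_per_period=400):
    """Accumulate U(t_total, 0) for eps(t) = eps_center + eps_amp sin(omega t)."""
    period = 2.0 * np.pi / omega
    dt = period / steps_per_period  # step grid aligned with the drive period
    U = np.eye(2, dtype=complex)
    for n in range(int(round(t_total / dt))):
        eps = eps_center + eps_amp * np.sin(omega * (n + 0.5) * dt)
        U = expm(-1j * hamiltonian(eps) * dt / HBAR) @ U
    return U

U_one = sine_unitary(1.0)    # one full period (T = 1 for omega = 2*pi)
U_three = sine_unitary(3.0)  # three periods, accumulated step by step
```

The three-period unitary equals the cube of the one-period unitary, which is the identity the (U_period)ᵏ decomposition relies on.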
Note that QuTiP can implement unitary evolution for periodic driving through its Floquet methods. For a 200 × 200 image with no averaging during readout, QuTiP's mesolve method took 6420 sec, QuTiP's Floquet methods took 383 sec, and the diagonal-basis unitary method took 3.6 sec.

E. Arc
An experimentally unrealistic pulse for which the detuning has neither constant time dependence, repeated features, nor extensions of previous computations is a parabolic arc from t = 0 to t = t_p with detunings from ε_min to ε_max, as in Fig. 2(e). One might imagine that the computation could be optimized via U_new = (U_old)^(t_new/t_old), as for scalars, yet the matrices do not commute, [H_new, H_old] ≠ 0, so this method does not work. This type of pulse is therefore an example of the worst-case scenario for the unitary evolution methods.
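The failure of the scalar-style shortcut is easy to demonstrate numerically. With a hypothetical two-state Hamiltonian evaluated at two detunings, the commutator is nonzero and the exponentials do not combine like scalar exponentials:

```python
import numpy as np
from scipy.linalg import expm

def hamiltonian(eps, delta=1.0):
    # hypothetical two-state double-dot Hamiltonian
    return np.array([[-eps / 2.0, delta], [delta, eps / 2.0]])

H_old = hamiltonian(1.0)
H_new = hamiltonian(3.0)

# the Hamiltonians at different detunings do not commute ...
commutator = H_old @ H_new - H_new @ H_old

# ... so, unlike for scalars, exp(A)exp(B) != exp(A + B):
lhs = expm(-1j * (H_old + H_new))
rhs = expm(-1j * H_old) @ expm(-1j * H_new)
```

By the Baker-Campbell-Hausdorff expansion, the two products differ by terms built from the nonzero commutator, so there is no power-law shortcut between pulse durations.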

F. Noise
Consider a pulse composed entirely of noise; one such pulse is shown in Fig. 2(f). The pulse is generated by specifying ε(0), the range of permissible noise, x ∈ (−a, a), and a roughness factor r, and building the detuning stepwise from random draws x = random(−a, a). This evolution has no constant time dependence and no repeated features that we can know a priori. Such a pulse could, however, be defined to have extensions of previous computations as in Sec. II C.
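A sketch of evolution through such a noise trace follows (the random-walk noise model, two-state Hamiltonian, and ℏ = 1 are illustrative assumptions, not the paper's definition). Once the trace is fixed, partial unitaries can be stored so that longer pulses extend shorter ones as in Sec. II C:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(seed=0)  # fixed seed so the trace is reproducible
HBAR = 1.0  # natural units (assumption for this sketch)

def hamiltonian(eps, delta=1.0):
    # hypothetical two-state double-dot Hamiltonian
    return np.array([[-eps / 2.0, delta], [delta, eps / 2.0]])

def noise_trace(eps0, a, r, steps):
    """Hypothetical noise model: bounded random kicks scaled by roughness r."""
    eps = np.empty(steps)
    eps[0] = eps0
    for n in range(1, steps):
        eps[n] = eps[n - 1] + r * rng.uniform(-a, a)
    return eps

dt, steps = 1e-2, 300
trace = noise_trace(0.0, 1.0, 0.5, steps)

# accumulate U(t_n, 0); storing the partial products lets longer pulses
# extend shorter ones once the noise trace is fixed (Sec. II C)
U = np.eye(2, dtype=complex)
partial = {}
for n in range(steps):
    U = expm(-1j * hamiltonian(trace[n]) * dt / HBAR) @ U
    partial[n + 1] = U
```

Each stored partial product is itself unitary, and any shorter pulse duration can be read off without re-evolving from t = 0.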

IV. COMPARISON OF METHODS
Here we compare the speed of density matrix and optimized unitary methods for determining image arrays of the time evolution. In particular, we consider the evolution of a two-quantum-dot system with valley splitting on the left dot, |L⟩ → |L₁⟩, |L₂⟩ (a three-state system), subject to a trapezoidal detuning pulse, as developed by Schoenfield et al. in Ref. 6. The Hamiltonian is the three-state Hamiltonian of Eq. 19. The parameters used for the comparison are: fixed rise and fall times of 100 ps, time at the detuning plateau of 0 to 3 ns, detunings of −200 to 1200 µeV, and ∆₁ = 26.5 µeV, ∆₂ = 56.2 µeV, δ = 23.0 µeV. Following the pulses, all systems were evolved at readout for 1 ns, and then the state measurement was averaged over that 1 ns.
For a speed test, we selected values of the image size num. For each size, we used (1) the QuTiP method to generate a num × num image of coherent oscillations with a trapezoidal detuning pulse, followed by evolution and post-pulse averaging at ε_min, and (2) the optimized unitary methods to generate num × num images of coherent oscillations with trapezoidal, sinusoidal, and square detuning pulses, each followed by evolution and post-pulse averaging at ε_min. We timed each of the computations and plot the results in Fig. 3. For comparison, computations were completed using Python on a single core of an Intel i7-7700HQ processor. For QuTiP, the reporting step size was 10⁻¹⁰ sec, and for the optimized unitary methods, the step size was 10⁻¹² sec (which introduced errors of less than 1.5%). See Appendix A and Appendix B for details on these numbers and on how the accuracy and error bounds depend on step size.
In Fig. 3, we observe that while all solutions are computationally intensive for large systems, the optimized unitary method is much faster than the density matrix method at the same size. This is because step sizes and the distribution of computations are more flexible for unitary methods than for differential equation solvers.
The best-fit lines to the data in Fig. 3

V. APPLICATION TO A FOUR-STATE SYSTEM
While exploring the parameter space using density matrices is infeasible (a 4 × 4 Hermitian matrix has 10 free parameters, and determining one clear image (num ≥ 100) takes roughly one hour in QuTiP), such an investigation is possible using optimized unitary methods. Note that using optimized unitary methods, three-state systems and four-state systems compute in the same time [36]. For examples of four-state parameter space evolution, see the supplementary material.

Realistic four-state Hamiltonian
Using the parameters in Ref. 7, we demonstrate that the optimized unitary method can produce the same results as density matrix methods for a four-state Hamiltonian, assuming no decoherence or noise. The results are displayed in Fig. 4. We find that the optimized unitary method is two orders of magnitude faster than the density matrix method.

VI. CONCLUSIONS
We developed three optimization techniques for unitary evolution methods for determining images of coherent oscillations in driven quantum systems. These methods make assumptions on the form of the time dependence, here assumed to be a detuning pulse, in order to optimize computations. The methods act by taking advantage of experimentally relevant forms of detuning pulses: time-independent intervals of detuning, repeated detuning features, or subsequent detuning pulses that are extensions of previous detuning pulses. These assumptions allow the optimization of unitary evolution methods for pulses that are commonly realized experimentally including trapezoidal and sinusoidal detuning pulses. However, some pulses cannot be optimized using these techniques; for example, pulses which are dilations, and pulses which are not known a priori. We found that optimized unitary methods are much faster than the corresponding density matrix methods for all image sizes. In addition, we found that for the unitary methods, three-state systems and four-state systems evaluate in roughly the same time. We then applied an optimized unitary method and a QuTiP density matrix method to a realistic four-state system, and found that the optimized unitary method is over two orders of magnitude faster than the corresponding density matrix method.

We believe that these techniques of unitary optimization can enable the computational exploration of the parameter spaces of coherent oscillations in quantum dot and multi-state systems with time dependencies such as the detuning pulses presented here. We expect such explorations to lead to a better theoretical understanding of driven multi-state systems, and they may lead to the discovery of new physical behaviors.

Appendix A: Accuracy of Methods
For the comparison of computational methods, it is necessary to ensure that each method attains the same accuracy. Thus, we are interested in establishing a bound on the error of the optimized unitary methods, and finding computational parameters such that this error is less than a specified tolerance.
To establish such a bound, we use the three-state Hamiltonian in Eq. 19 with the parameters in the main text, and consider a trapezoid pulse with a rise time of 0.1 ns, a fall time of 0.1 ns, and a plateau time of 0.8 ns, with detunings ε_min = −200 µeV and ε_plateau = 1200 µeV. The rise and fall in this pulse have the largest time dependence (diabaticity) of all pulses, and they therefore set a bound on the error.
To set an "exact" probability, we used a sixth-order Runge-Kutta density matrix method with a step size of τₙ = 10⁻¹⁵ sec on the rise/fall, and ρ(0) = |R⟩⟨R|, to calculate the probability P_exact ≡ ⟨R|ρ(1 ns)|R⟩ [37]. From comparison with a fourth-order Runge-Kutta density matrix method, we believe P_exact to be accurate to within 10⁻¹⁰. Then, with the optimized unitary method, for a range of step sizes, we computed P = ⟨ψ(1 ns)|R⟩⟨R|ψ(1 ns)⟩. We then took the magnitude of the difference |P − P_exact|. The results are plotted in Fig. 5.
Then we selected a tolerance on the accuracy. We used the Parula colormap to plot images of coherent oscillations; this colormap has 64 colors, so we specify the tolerance as 1/64. In Fig. 5, the optimized unitary method is accurate to within 1/64 for a step size of 10⁻¹³ sec.
However, the maximum error is not the only reasonable measure of accuracy. We may instead consider the average error. For an image array A, the average error is mean|A − A_exact|. By computing A_exact as above, and A with a step size of τₙ = 10⁻¹² sec, we find that mean|A − A_exact| < 1/64, so we use this step size. Note that in QuTiP there is no direct control over the step size. Instead, QuTiP uses adaptive step sizes to ensure an accuracy of 10⁻³ to 10⁻⁴, and reports expectation values at a series of time steps. Therefore we select the largest reporting step size that converges: 10⁻¹⁰ sec.

Appendix B: Further Optimizations
The optimized unitary methods presented in this paper used fixed step sizes, τ n , for the time-dependent intervals of the Hamiltonian. However, it is possible to adapt the step size, either for each row of detuning, or for individual evolutions.
For rows, we calculated the step sizes required for an accuracy of 1/64 for a variety of pulse heights (diabaticities); the results are plotted in Fig. 6. Within individual evolutions, adaptive step sizes may be implemented in the same spirit as the Runge-Kutta-Fehlberg Methods 38 . We expect that adaptive step sizes could further accelerate computations by approximately one order of magnitude (while also increasing code complexity).
Noting that each row of probability P(ε) is computed independently from every other row, the for loop over detunings may be parallelized. This is expected to lead to performance gains in direct proportion to the degree of parallelization, up to a num-fold improvement. Other performance considerations include the choice of programming language (including Cythonization) and computational hardware.
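A sketch of this row-level parallelism (hypothetical two-state Hamiltonian, arbitrary units, ℏ = 1); a thread pool is used here for simplicity, while process pools or lower-level languages would give true CPU parallelism for this compute-bound loop:

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

HBAR = 1.0  # natural units (assumption for this sketch)

def row_of_probabilities(eps, delta=1.0, num=32, t_max=3.0):
    """One detuning row P_R(eps, t); independent of every other row."""
    H = np.array([[-eps / 2.0, delta], [delta, eps / 2.0]])
    E, V = np.linalg.eigh(H)  # diagonalize once per row
    psi0 = np.array([1.0, 0.0], dtype=complex)
    row = np.empty(num)
    for j, t in enumerate(np.linspace(0.0, t_max, num)):
        U = (V * np.exp(-1j * E * t / HBAR)) @ V.conj().T
        row[j] = abs((U @ psi0)[1]) ** 2
    return row

detunings = np.linspace(-5.0, 5.0, 32)
with ThreadPoolExecutor() as pool:  # rows are independent: embarrassingly parallel
    rows = list(pool.map(row_of_probabilities, detunings))
image = np.vstack(rows)
```

Because no row reads or writes another row's data, the map requires no synchronization beyond collecting the results.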