QuESTlink—Mathematica embiggened by a hardware-optimised quantum emulator

We introduce QuESTlink,pronounced ‘quest link’, an open-source Mathematicapackage which efficiently emulates quantum computers. By integratingwith the Quantum Exact Simulation Toolkit (QuEST), QuESTlink offers ahigh-level, expressive and usable interface to a high-performance, hardware-accelerated emulator. Requiring no installation, QuESTlink streamlines the powerful analysis capabilities of Mathematica into the study of quantum systems, even utilising remote multi-core and GPU hardware. We demonstrate the use of QuESTlink to concisely and efficiently simulate several quantum algorithms, and present some comparative benchmarking against core QuEST.


I. INTRODUCTION
Classical emulation is crucial in the design of quantum computers and algorithms.Despite the recent demonstration of quantum supremacy [1], today's quantum computers are of insufficient quality to run and test many interesting algorithms.Even precise quantum computers of tomorrow may provide limited help in writing new algorithms, since unlike emulators, they offer limited information about the evolving quantum state.Furthermore, some algorithms, particularly those for noisy intermediate-scale quantum (NISQ) devices [2] like the variational class of algorithms [3], admit limited analytic treatment.Hence, the value of classical emulation is undeniable.
The research community needs high-level usable tools that are easy to deploy, offer rapid numerical study, and integrate with other established software.However, the exponentially-growing cost of classically simulating a quantum device makes emulation of even NISQ computers very resource intensive.Emulators must therefore make good use of classical high-performance computing techniques, like multithreading and GPU parallelisation, and be written in low-level performant languages like C. This requirement is at odds with the need for usable tools, which can be used by non-expert programmers and the wider quantum community.
Within this context we have developed QuESTlink: a high-performance Mathematica package for numerically emulating quantum computers, by off-loading expensive computation to remote accelerated hardware, running QuEST [4].Mathematica is both a language and computational tool, prevalent among physicists, which offers a convenient interactive interface (through notebooks), and an extremely comprehensive and powerful set of utilities.Although the most widely used tool for calculations in the physical sciences [5,6], Mathematica does not have an intrinsic toolset specifically dedicated to quantum computing emulation.In contrast, QuESTlink offers a usable Mathematica interface, without compromising the excellent performance and simulation capacity of QuEST.
From a laptop environment, a user can symbolically specify a circuit in an intuitive high-level operator format akin to how they appear in the literature.Behind a platform-agnostic Mathematica interface, QuESTlink sends the circuit to a C backend, either locally or on remote high-performance hardware, where it is efficiently emulated.QuESTlink offers multithreaded and GPU simulation of state vectors and density matrices, multiqubit multi-controlled general unitaries, general noise processes, circuit drawing, and a wide range of stan-dard gates.Like QuEST, QuESTlink is free, open-source and can be used stand-alone; furthermore, without setup, compilation or installation of any kind.

A. QuESTlink facilities
While the forthcoming sections give a thorough overview of QuESTlink's facilities, we here provide a short summary to acquaint the reader with the essentials.QuESTlink is, offers, features or enables: • Multithreaded and GPU-accelerated emulation of quantum computers.
• State-vector and density matrix simulation.
• Off-loading of simulation to remote hardware, through a backend-agnostic interface.
• Seamless integration with Mathematica's powerful and comprehensive toolset, and interactive notebook programming style.
• Rapid development with Mathematica's concise, functional language.
• A concise but expressive circuit language, akin to the symbolic description of circuits in the literature.
• Stand-alone with no required external downloading, compiling, or installation of any kind whatsoever.
• Through Mathematica, is compatible with all major operating systems.
• Rendering of circuit diagrams, which can be exported to any file format through Mathematica.
• A suite of functions for higher-level calculations, like computing Hamiltonian expectation values, and derivatives of unitary circuits.
• A comprehensive suite of unitary gates and decoherence channels, including multi-controlled multi-qubit general unitaries, and multi-qubit Kraus maps.

II. TECHNICAL SUMMARY
This section provides an overview of the inner workings of QuESTlink, and the technologies used to build it.Readers intending to use QuESTlink right away may wish to skip to Section III B.

A. Architecture
QuESTlink consists of both a Mathematica package (QuESTlink.m)and an underlying C program (quest link.c), which interface using C/Link; an implementation of the Wolfram Symbolic Transfer Protocol (WSTP) [8] (formerly called MathLink [9]).QuESTlink emulates quantum computers using the Quantum Exact Simulation Toolkit (QuEST) [4], which is an open-source, high-performance emulator written in C. QuEST is multithreaded, GPU-accelerated and distributed, and the first two of these facilities are made available to QuESTlink.
C/Link facilitates conversion of Mathematica types, utilised in the user's notebook, into C types, and maps Mathematica-callable functions to C functions.The executable quest link (quest link.ccompiled with WSTP) sits below as a "backend" process, performing intermediate processing and invoking QuEST's API.This software stack is visualised in Figure 1.
Through this stack, QuESTlink offers a high-level Mathematica interface to quantum emulation which is numerically performed in C (a significantly faster language than Mathematica), using memory persistent in the C process, and potentially using accelerated hardware.The expensive numerical representations of the quantum states reside only in the C backend (and through QuEST, may also be persistent also in GPU memory), and each are identified by a unique ID.Quantum circuits are represented symbolically in their entirety in Mathematica, and sent to the C backend (at a small runtime overhead) only at emulation-time, compactly encoded into arrays of real numbers.The circuit language, and details of the QuESTlink interface, are presented in Section III B.

B. Deployment
QuESTlink can be obtained and deployed entirely within a Mathematica notebook, without any installation or configuration.This is done by online hosting (currently at qtechtheory.org) of the Mathematica package The protocol for stand-alone deployment of QuESTlink.First, a user calls Mathematica's Import[url] function to fetch a copy of the QuESTlink.mpackage, hosted at qtechtheory.org.This provides the QuEST' namespace, and defines functions to connect to a quest link backend; this backend can exist on the calling machine, or remotely as here pictured.More details on so called "QuEST environments" are presented in detail in the proceeding sections.Note the featured code snippets are simplified here for clarity, and their syntax is reviewed in Section II.
code, and dependent quest link executable.Figure 2 presents the process of serving the package to a user's Mathematica kernel.

C. Remote computation
QuESTlink even enables quantum emulation using remote computing resources.
Through WSTP, the quest link process can run on a remote machine (e.g. a supercomputer) and communicate with the user's local Mathematica kernel via TCP/IP.The remote machine can employ more powerful hardware than available on the user's machine, and potentially simulate larger quantum states than can fit in the user's local memory.The protocol of off-loading a circuit simulation to remote hardware is outlined in Figure 3.The tools and documentation for setting up a remote QuESTlink server are provided between the Github repo [7], and questlink.qtechtheory.org.
In this remote configuration, Mathematica calls to QuESTlink functions will involve network communication with the remote environment, and hence incur overheads; this is worthwhile if the remote hardware sufficiently accelerates a large quantum simulation which otherwise dominates runtime.Despite QuESTlink's effort to minimise the communication cost, this network overhead will be prohibitively costly for some applications.For example, when large simulated quantum states undergo processing by the Mathematica kernel, and hence need to be copied back and forth between the local kernel and the remote environment.To mitigate this slowdown whilst still running QuESTlink remotely, one can launch their Mathematica kernel on the remote machine, using Mathematica's remote kernel facilities [10].QuESTlink can also be used inside other Mathematica packages, and so be launched

III. USER GUIDE A. Mathematica review
Before continuing, we offer a quick review of the Mathematica syntax used in this manuscript.
Evaluation of function f with input a is denoted by f[a], or equivalently f @ a.The expression g[f[a]] can be formed using prefix notation as g @ f @ a, or postfix notation as a //f //g.Matrix multiplication between (possibly complex) matrices a and b is denoted by a.b, and a † denotes the conjugate transpose of a. Expressions with a trailing ; suppress their otherwise displayed result.While f[]=a denotes immediate evaluation of a and assignment to f[], the syntax f[]:=a denotes delayed assignment, whereby later invoking f[] will evaluate a (which may have changed Finally, for those copying code into Mathematica, subscripts can be quickly entered into a notebook with keyboard shortcut Ctrl & -.
In the code snippets featured in this manuscript, context should make clear what is input to the Mathematica notebook, and what is a rendered output.

B. QuESTlink overview
The QuESTlink package requires no installation, and can be downloaded directly from within Mathematica: The user then has a choice of several "QuEST environments" in which to perform quantum emulation, all of which provide an identical user experience.

CreateDownloadedQuESTEnv[os]
The first, CreateDownloadedQuESTEnv[os], enables simulation on the user's machine by directly downloading a serial QuESTlink executable from qtechtheory.org,compatible with the Operating System os (e.g."MacOs").This requires no apriori setup whatsoever.
CreateLocalQuESTEnv[fn] will attempt to launch an existing local QuESTlink executable, located at fn.This allows local simulation using serial, multi-core or GPU resources, depending on how the executable was compiled.Users can compile QuESTlink for their platform using the tools on the Github repo [7].
CreateRemoteQuESTEnv[ip, port1, port2] connects to a remote QuESTlink environment at the given ip address and ports.The remote machine can use serial, multithreading or GPU-acceleration to emulate quantum systems.The facilities to setup a remote QuESTlink server are provided in the Github repo [7].
Once connected to a QuEST environment, a full list of the supported QuESTlink facilities and gate symbols can be obtained by evaluating ?QuEST`* and the documentation of a particular function or operator obtained similarly using ?.

? M
M is a destructive measurement gate which measures the indicated qubits in the Z basis.Emulation begins by creating quantum registers (a "Qureg"), each represented by a state-vector or densitymatrix.
These functions return a unique ID for each Qureg, the memory for which is stored in the QuEST environment.Once created, QuESTlink provides a few functions to initialise Quregs: or directly modify them (here, creating a random density matrix):

SetQuregMatrix[ρ, m];
A quantum circuit u can be applied to a Qureg, agnostic of whether it is a state vector (effecting u |ψ ) or a density matrix (effecting u ρ u † ), using ApplyCircuit[u, qureg].QuESTlink features a concise and expressive language for specifying gates and decoherence processes, where target qubits are denoted with subscript integers to gate symbols, and control gates "wrap" the base gate.The below example makes use of the Circuit function, which disables commutation and allows a concise product representation of the circuit.In combination, this enables a syntax akin to how circuits are denoted in the quantum computing literature.
The result of Circuit[] is just a list of operators, allowing easy circuit manipulation.Mathematica features many tools for such lists allowing the user to easily extend, alter, join, compare, etc, their quantum circuit.For example: These circuits can be swiftly visualised with DrawCircuit[u], which supports saving rendered circuit diagrams to raster and vector images.However, it is typically more efficient to calculate desired quantities, like the expectation value of a Hamiltonian in the Pauli-basis, in the QuEST environment itself.

DrawCircuit[u[θ]]
At any time, Quregs can be individually (or, alltogether) destroyed to free up memory in the QuEST environment.

DestroyQureg[ρ]
DestroyAllQuregs[] With these facilities, QuESTlink offers a seamless integration with Mathematica's comprehensive range of computational and graphical tools, as illustrated in the following demonstrations.

IV. DEMONSTRATIONS
To demonstrate the concision [11] offered by QuESTlink, we provide several examples of somewhat sophisticated computations written in only several lines, and efficiently simulated.For users wishing to run these demonstrations directly, they are compiled into a single notebook at questlink.qtechtheory.org/paper_demos.nb.

A. Decoherence
To begin, we very compactly demonstrate the effect of two-qubit depolarising noise on the expected measurement of a simple Hamiltonian h.Starting in a random pure state ψ, a depolarising channel with probability 0.1 of any Pauli error occurring is repeatedly applied, in total 100 times.depolarising Note that undisclosed variable opts contained additional code for customising the plot.

B. Variational imaginary-time
In this demonstration, we emulate the quantum variational imaginary-time simulation routine [12] to approximate the ground-state of a molecular Hamiltonian.
We first download a reduced 6-qubit representation of the electronic structure Hamiltonian of Lithium Hydride (LiH).
We create a simple 6-qubit ansatz circuit featuring 39 parameters, denoted with variables θ = {θ 1 , . . ., θ 39 }.This ansatz circuit consists of one and two qubit rotation gates; exp(−iθ j σ/2), for σ ∈ {X, Y, Z, X ⊗ X, Y ⊗ Y, Z ⊗ Z}, which in QuESTlink are denoted by Rx q [θ] (etc.) and R[θ,X q1 X q2 ].In the diagram below, each such paired two-qubit rotation is marked by a vertical link.ψ will be maintained as the output state of the ansatz circuit, and hψ will store the result of applying the Hamiltonian h to ψ. φ will merely provide intermediate work-space for calculations, and dψ will store a Qureg for each parameter in the ansatz (a total of nθ).

DrawCircuit[ansatz, nQb]
Next, we choose a random initial assignment of the ansatz parameters, and measure the energy of the resulting quantum state.The variational imaginary-time algorithm [12] involves repeatedly measuring a matrix and vector of quantities, and iteratively updating the parameters under which we now concisely emulate for nt iterations.The energy of the output quantum state, as produced by the reached assignment of the parameters, is close to the true groundstate found through matrix diagonalisation.

CalcExpecPauliSum[ψ, h, ϕ] -7.87037
Min @ Eigenvalues @ CalcPauliSumMatrix @ h -7.88074 Interestingly, a significantly slower but direct minimisation in Mathematica reveals the energy reached by imaginary-time was not quite the best possible of our chosen ansatz.We invite the interested reader to compare the Mathematica code above to a native C implementation of imaginary time evolution, as hosted on Github [13].

C. Noisy Trotterisation
In this demonstration, we emulate Trotterisation of a spin-ring Hamiltonian, with and without noise, and compare it to direct numerical solving of the Schrödinger equation.
Our Hamiltonian is the one-dimensional nearest-neighbour (periodic boundary conditions) nQbspin Heisenberg model with a random magnetic field in the z direction, Real-time simulation of this model to time t = nQb is nominated by Childs et al. [14] as an early practical application of a quantum computer.In this example, we'll study a five-spin chain Hamiltonian h.
To simulate evolution on a quantum computer, we will emulate circuits formed by the Suzuki-Trotter decompositions [15] of the unitary evolution operator of varying order n.Below, r is the number of repetitions of the order-n circuit to perform, to ultimately reach time t.Note the original ordering of the Hamiltonian terms is arbitrary, and should ideally be optimised into commuting groups, or at least randomly shuffled.In particular, Childs et al.'s uniform randomisation scheme [16], whereby each repetition sees a random Hamiltonian ordering, is trivially implemented as: childsify[h_, n_, r_, t_] := Flatten @ Table [ symmetrize[RandomSample @ h, t / r, n], r] Even Campbell's qDRIFT routine [17] can be given a compact implementation.However, we opt instead to utilise only deterministic Trotterization for clarity.For example, the first-order (n=1) single-repetition (r=1) Trotter circuit (arbitrarily to time t=1) has the form: We compare the states produced from Trotter circuits with the "true" (to numerical precision) time evolution, found by numerically solving the Schrödinger equation using Mathematica's in-built NDSolveValue routine.In the proceeding solutions, the Qureg ψ0 will store the initial state (a random pure state), ψt will store the true state, and ψ will store outputs of the Trotter circuits.We can now commence the computation by generating circuits of a varying Trotter order and number of repetitions, emulating their execution, and computing the fidelity of their output state with the true state.We then plot the result below, using some undisclosed parameters opts to tweak the plot style.The horizontal axis is the total number of gates in the Trotter circuit.We can even imitate an noisy quantum device by inserting decoherence operators into our circuit.In this demonstration, we will follow each Trotter-prescribed gate with one or two qubit depolarising operators, which effect the channel

ListLogLinearPlot[fids, opts]
(noting σ = σ † for Paulis) and similarly for the two-qubit analogue.We choose an error rate of p = 10 −4 , comfortably below the rates recently demonstrated in Google's Sycamore machine [1].Below is a visualisation of the first five unitary gates of the resulting noisy circuit.As one might expect, the accuracy of the Trotter circuits have waned, and eventually decrease with increased circuit depth, due to the opportunity for additional errors to accrue.Complete notebooks demonstrating some extended computations in QuESTlink are available at questlink.qtechtheory.org.

V. BENCHMARKING
We now benchmark QuESTlink's local emulation of some arbitrary quantum circuits, and compare it to benchmarks of direct simulation in C, using QuEST.This helps elucidate the runtime overheads incurred by Mathematica integration, and whether QuESTlink is the right emulation tool for the user's target simulation regime.
We nominate a simple circuit consisting of X, Y , and Z axis rotations of random angles on each qubit, followed by tessellated controlled rotations on every pair of neighbouring qubits.This pattern is repeated in the circuit, and is illustrated in Figure 4.These benchmarks will emulate 15-qubit circuits, with between 1 to 50 repetitions (an upper bound of 4350 gates), and each will be Core QuEST is profiled directly in C, through precise timing of each full circuit execution.In contrast, QuESTlink is profiled through a notebook using Mathematica's AbsoluteTiming[] function; its runtime will include Mathematica evaluation, QuESTlink.mpreprocessing and circuit encoding, the C/Link and WSTP overheads, quest link.c'sdecoding of the circuit, and ultimately QuEST's simulation.
Benchmarking is performed on a 12-core Xeon W-2133 3.6GHz CPU.At most 8 threads will be employed in multithreaded mode, so as not to interfere with threads used by the Mathematica kernel.GPU-accelerated testing will employ a 24 GB NVIDIA Quadro P6000 in the same machine.
The results of benchmarking are presented in Figure 5, and are for the most part, as expected.The Mathematica overhead of invoking serial QuEST through QuESTlink is small (a factor of ≈ 1.1).Multithreading introduces additional variation in QuESTlink's runtime, most likely due to dynamic reallocation of threads between the Math-ematica kernel and the backend QuEST process, during execution.Otherwise, the multithreading overhead is similarly small (a factor ≈ 1.08).At first glance, GPU QuESTlink appears anomalously slow; on average 7.2 times slower than core GPU QuEST.However, in the largest serial simulation, QuESTlink was on average 5 × 10 −2 seconds slower than QuEST.Treating this as an overhead unaffected by GPU-acceleration, GPU QuEST's mean edge of 4 × 10 −2 s over QuESTlink is less surprising.That is, while likely that multithreading benefited the Mathematica kernel in the pre-processing stage (e.g. in circuit encoding), GPU acceleration exemplified the expense of this overhead by contrast, rather than by enhancing it.
While not measured presently, use of a remote QuEST environment is expected to add a constant overhead to QuESTlink's performance, due to network latency.Though in principle the data cost of communicating a circuit from Mathematica to a remote backend scales linearly with the emulated circuit depth, this cost should be overshadowed by the exponentially growing cost of quantum emulation, and other network overheads.The size of an encoded circuit is upper bounded by 8 (#gates + #ctrls + #targs + #params) bytes, where #gates is the total number of present gates, #ctrls and #targs are the total number of control and target qubits (respectively) aggregate over all gates, and #params is the total number of present parameters (e.g.angles of rotations).A circuit would need to contain approximately 30 million single-qubit gates to saturate a 1 GB bandwidth network.

VI. FUTURE WORK
QuESTlink is currently under active development, with a growing list of planned work and new features.This list includes: • Integration with a GPU-accelerated linear algebra library, to perform fast numerical routines directly on the simulated quantum state.For example, diagonalisation of a density matrix enables rapid calculation of the Trace distance, which currently requires expensive communication with the Mathematica kernel.
• Support for emulation on remote, distributed hardware.
• Routines for symbolically evaluating quantum circuits, in lieu of numerical emulation, in order to study them analytically.
• A QASM [18] parser and generator, to and from QuESTlink's circuit specification language.
We caution that QuESTlink is still in an early form and likely to change, both in interface and architecture.

VII. CONCLUSION
This manuscript introduced QuESTlink, a Mathematica package for emulating quantum circuits, state-vectors and density matrices.QuESTlink offers both high-level symbolic manipulation of quantum circuits, and rapid simulation using possibly remote hardware, such as multicore and GPU-accelerated supercomputers.We presented a broad technical overview of QuESTlink, including its protocol for stand-alone installation-free deployment.We then demonstrated the concision and flexibility possible of QuESTlink, by stepping through several examples of otherwise sophisticated simulations.These examples should enable an interested reader to begin using QuESTlink immediately.Lastly, we performed some simple benchmarks of QuESTlink to estimate the overhead over core QuEST, across its parallelisation modes.QuESTlink is open-source, and accessible at questlink.qtechtheory.org,or on Github [7]

FIG. 1 .
FIG.1.The QuESTlink software stack, from the user's code (user.nb) to the driving QuEST[4] framework.This stack can run entirely on a user's local machine, achieving an interactive Mathematica interface to a C-based emulator.Note the distinction between WSTP (the protocol for communicating between Mathematica and an external process) and C/Link (an implementation for C) is unimportant, and hereafter disregarded.
FIG. 2.The protocol for stand-alone deployment of QuESTlink.First, a user calls Mathematica's Import[url] function to fetch a copy of the QuESTlink.mpackage, hosted at qtechtheory.org.This provides the QuEST' namespace, and defines functions to connect to a quest link backend; this backend can exist on the calling machine, or remotely as here pictured.More details on so called "QuEST environments" are presented in detail in the proceeding sections.Note the featured code snippets are simplified here for clarity, and their syntax is reviewed in Section II.

FIG. 3 .
FIG. 3. Protocol for off-loading an emulation task onto a remote QuEST environment.Simulation structures declared in Mathematica are actually allocated remotely, possibly even in accelerated hardware memory.Quantum circuits, by first being encoded into arrays of numbers, are communicated to the server via TCP/IP.Note the syntax used in the code snippets in this diagram are reviewed in Section II ) each time.Elements of a list x = {a, b, c} are accessed as x[[i]] where index i ≥ 1, and x[[m;;n]] returns the sublist spanning indices m to n.The shortcut ex /.a -> b replaces subexpressions of ex which match pattern a, with b. Import["https://qtechtheory.org/QuESTlink.m"] , qureg2] returns the Hilbert-Schmidt distance (Frobenius norm of the difference) between the given density matrices.
After applying a unitary circuit, one can of course opt to obtain the entire state vector or density matrix, as for example: ψ h ψ -Tr (h ρ)

FIG. 4 .FIG. 5 .
FIG.4.A 5-qubit 2-repetition example of the arbitrary circuit used for comparative benchmarking of QuESTlink and QuEST.Every gate involves a rotation of a uniformly random angle between 0 and 4π, which generate all rotations.The 15-qubit circuits employed for benchmarking feature 87 gates per repetition.