Qiskit Pulse: Programming Quantum Computers Through the Cloud with Pulses

The quantum circuit model is an abstraction that hides the underlying physical implementation of gates and measurements on a quantum computer. Precise control of real quantum hardware requires the ability to execute pulse- and readout-level instructions. To that end, we introduce Qiskit Pulse, a pulse-level programming paradigm implemented as a module within Qiskit-Terra [1]. To demonstrate the capabilities of Qiskit Pulse, we calibrate both un-echoed and echoed variants of the cross-resonance entangling gate with a pair of qubits on an IBM Quantum system accessible through the cloud. We perform Hamiltonian characterization of both the single-pulse and two-pulse variants for varying pulse amplitudes. We then transform these calibrated sequences into a high-fidelity CNOT gate by applying local pre- and post-rotations to the qubits, achieving average gate fidelities of $F=0.981$ and $F=0.979$ for the un-echoed and echoed variants, respectively. This is comparable to the standard backend CNOT fidelity of $F_{CX}=0.984$. Furthermore, to illustrate how users can access their results at different levels of the readout chain, we build a custom discriminator to investigate qubit readout correlations. Qiskit Pulse allows users to explore advanced control schemes such as optimal control theory, dynamical decoupling, and error mitigation that are not available within the circuit model.


I. INTRODUCTION
In quantum computing, information is stored and processed according to the laws of quantum mechanics [2]. The primary quantum programming paradigm is the circuit model. In this model, the underlying dynamics of the physical system implementing the quantum computer are abstracted as a sequence of unitary gate operations and projective measurements applied to a set of qubits. Gates manipulate the states of qubits, while measurements extract classical information in the form of bitstrings, which encode the outcome of projective measurements of the qubits in a particular measurement basis.
Qiskit is an open-source quantum computing framework designed to enable research on near-term quantum computers and their applications. It provides tools for creating, manipulating and running quantum programs on quantum systems independent of their underlying technology and architecture. The standard programming abstraction for a quantum circuit is a quantum assembly language (QASM) such as OpenQASM [3], which Qiskit supports [1]; many similar languages have been described in the literature [4,5]. However, hardware cannot natively implement quantum instructions and must compose these operations from the classical stimulus available to the control hardware.
At the hardware level, the time-dependent dynamics of a quantum system interacting with applied control fields is described by its Hamiltonian and the Schrödinger equation. Through careful engineering of applied classical control fields, a quantum system may be steered through a desired unitary evolution [6]. Superconducting transmon qubits, for example, encode a qubit in a nonlinear oscillator formed by a parallel circuit consisting of a Josephson junction and a capacitor, and may be manipulated by applying shaped microwave control pulses [7]. Implementing the quantum circuit model on such an architecture requires compiling circuit instructions to a set of microwave control instructions, or pulses, which enact the desired state transformations and/or measurements. In the circuit domain, an atomic circuit instruction is agnostic to its pulse-level implementation on hardware. Extracting the highest performance out of quantum hardware requires the ability to craft a pulse-level instruction schedule, which cannot be done within the standard circuit model. To enable pulse-level programming, an instruction set, OpenPulse [8], was developed to describe quantum programs as a sequence of pulses, scheduled in time. We present within this paper a Python implementation of OpenPulse, Qiskit Pulse, which adds to the Qiskit compilation pipeline the capability to schedule a quantum circuit into a pulse program intermediate representation, perform analysis and optimizations, and then compile to OpenPulse object code to execute on a quantum system.

* These two authors contributed equally; Corresponding author: talexander@ibm.com
The various hardware architectures used for current-day quantum computing systems create a need for a pulse-level instruction set that may address most platforms at an abstract level, compatible with both commercial off-the-shelf and proprietary control instruments, including arbitrary waveform generators (AWG), signal generators, filters, amplifiers and digitizers [9]. Programming such systems at the pulse level in a hardware-independent manner requires the user-level instruction set to be target-compiled to the underlying system hardware components, each of which may have a unique instruction set and programming model. Recent efforts to construct microarchitectures that conform to classical computer engineering paradigms [10][11][12] have programming semantics closely tied to the underlying microarchitecture. Compiling directly from the circuit model to target hardware obfuscates the underlying pulses that manipulate the hardware, removing a powerful degree of control.
With Qiskit Pulse we enable the development of a common and reusable suite of technology-independent quantum control techniques [6] that operate at the level of analog stimulus which may be remotely retargeted to cloud-based quantum computing systems.
Paper outline: In Sec. II we present our pulse programming model. We demonstrate the capabilities of Qiskit Pulse in Sec. III where we show Hamiltonian characterization of the two-qubit cross-resonance interaction, and calibration of a high-fidelity entangling gate, on a cloud-based quantum computer available on the IBM Quantum Experience. We discuss how the readout of quantum computers is incorporated in Qiskit Pulse in Sec. IV and conclude in Sec. V. The source code and data for the experiments within this paper have been made available online [13].

II. QISKIT PULSE PROGRAMMING MODEL
In the standard quantum circuit model, the time elapsed between operations is irrelevant as long as the order of non-commuting gates is preserved [14]. However, when controlling quantum hardware at the pulse level, properly timing and synchronizing instructions is crucial for accurately enacting quantum operations. For instance, users may create new gate definitions, characterize and correct for crosstalk on qubits neighboring interacting qubits, implement optimal control techniques such as GRAPE [15] or mitigate errors through Richardson extrapolation [16][17][18].
We envision that a classical microprocessor with an embedded pulse coprocessor will be responsible for controlling and measuring the quantum device. Within this work we only focus on describing a virtual execution model and limited set of instructions for the pulse coprocessor which can be compiled to the instruction set architecture (ISA) of the underlying control hardware. Qiskit Pulse's position in the predicted quantum computing compilation pipeline is demonstrated in Fig. 1. We expect that as quantum hardware continues to be refined these abstractions will be extended.
Qiskit Pulse provides an open source, front-end implementation of the OpenPulse interface [8]. Third parties can fully integrate with Qiskit Pulse by implementing their own Qiskit provider [1] which is responsible for translating Qiskit Pulse programs to executable programs on the provider-specific hardware which might include components such as AWGs and digitizers. Qiskit Pulse programs are composed of pulses, channels, and instructions which we present in the following subsections. QASM programs may be built and optimized with information of the system topology, native gates, and error rates, and then are scheduled into pulse programs by using calibrated native gate definitions. Pulse programs are compiled to a processor-specific ISA through a target code generation procedure. The typical user is expected to program at the circuit level, whereas Qiskit Pulse enables advanced users to control at the pulse level.

A. Pulses
A pulse is a time-series of complex-valued amplitudes with a maximum unit norm, $[d_0, \ldots, d_{n-1}]$. Each $d_j$, $j \in \{0, \ldots, n-1\}$, is called a sample. Every system specifies a cycle-time $dt$, which is the finest time-resolution exposed on the pulse coprocessor and is typically defined by the sample rate of the coprocessor's waveform generators. Each sample in a pulse is output for one cycle, a timestep. All pulse durations and timesteps are defined and discretized dimensionlessly with respect to $dt$. The ideal output signal has amplitude $D_j = \mathrm{Re}\left[e^{i(2\pi f j\,dt + \phi)}\, d_j\right]$ (1) at time $j\,dt$, where $f$ and $\phi$ are a modulation frequency and a phase. The pulse samples describe only the envelope of the produced signal, which is then mixed up in hardware with a carrier signal defined by a frequency and a phase. To reduce encoding sizes we also allow hardware providers to define parametric pulse shapes. For example, one parametric pulse supported by IBM backends and made available through the Pulse library is the Gaussian pulse. It takes three parameters: an integer duration in terms of $dt$, a complex amplitude amp, and a standard deviation sigma. This parametric pulse can be instantiated within Qiskit from these three parameters.

TABLE I (fragment, Acquire instruction): Trigger the channel to collect data for the given duration, and store the measurement result in a register.
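As a minimal, library-independent sketch of the definitions in this subsection (not the Qiskit implementation), the following Python snippet samples a Gaussian envelope from a duration, an amplitude and a standard deviation, and then applies a modulation with frequency f and phase φ to obtain the real-valued output signal. The function names, the centering convention, and the example values of dt, f and φ are assumptions made for illustration only.

```python
import cmath
import math


def gaussian_samples(duration, amp, sigma):
    """Complex envelope samples d_j of a Gaussian centered in the pulse window."""
    center = (duration - 1) / 2  # assumed centering convention
    return [amp * math.exp(-0.5 * ((j - center) / sigma) ** 2)
            for j in range(duration)]


def modulate(samples, f, phi, dt):
    """Assumed ideal output: D_j = Re[exp(i(2*pi*f*j*dt + phi)) * d_j]."""
    return [(cmath.exp(1j * (2 * math.pi * f * j * dt + phi)) * d).real
            for j, d in enumerate(samples)]


dt = 0.222e-9  # cycle-time in seconds (example value)
amp = 0.2 + 0.1j
envelope = gaussian_samples(duration=160, amp=amp, sigma=40)
signal = modulate(envelope, f=100e6, phi=0.0, dt=dt)

# Every sample must respect the unit-norm constraint on pulse amplitudes.
assert all(abs(d) <= 1.0 for d in envelope)
```

The envelope is stored once and is independent of f and φ, which is why a change of frame only requires updating two numbers rather than recomputing the samples.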

B. Channels
Hardware components are modeled with Channels. Channels label signal lines that either transmit or receive signals between the control electronics and the quantum device. Each channel executes instructions from a first-in, first-out (FIFO) queue as outlined in subsection II C.
Channels are constrained at target code generation time to target hardware components, e.g., an AWG. The calibrated parameters of a channel, such as its frequency, and the pulses played on that channel, depend on the physical properties of the targeted qubit. Therefore, channels are not interchangeable at the pulse abstraction layer, i.e., permuting channels over qubits will not give equivalent results. This highlights another difference between circuit and pulse instructions: the parameters of gates in a quantum circuit do not depend on the physical properties of the targeted qubits. The qubits in a quantum circuit can therefore be interchanged without affecting the computational result as long as the topology of the device is taken into account and gate imperfections are ignored.
There are several different channel types, and each may support a different instruction set. A summary of channels and their descriptions is provided in Table I. The channel type, and thus the supported instructions, is determined by the effect of the channel on the quantum device. For example, a PulseChannel models the output of a control field $\alpha_k(t)$ in a system Hamiltonian $H(t) = \hat{H}_{\mathrm{sys}} + \sum_k \alpha_k(t)\hat{H}_k$, which is composed of time-independent system and time-dependent control terms. The Hamiltonian term for a given channel, $\hat{H}_k$, may in general be arbitrary, but is typically associated with the subtype of the respective pulse channel that is assigned at system configuration time.
The sub-types of PulseChannels are DriveChannels, MeasureChannels, and ControlChannels. Each pulse channel maintains an instruction-writable frequency $f$ and phase $\phi$, which modify the channel output as per Eq. (1). Tracking the phase in this way enables the implementation of virtual Z-gates [19,20]. Qubit drive and readout pulses are respectively assigned to DriveChannels and MeasureChannels, see e.g., the drive pulses on drive channels d0 and d1 in Fig. 2. Their index is trivially mapped to the address of the target qubit. The ControlChannel implements any remaining control fields, such as coupler drives or two-qubit drives as depicted by u2 in Fig. 2. The backend hardware may choose to map multiple PulseChannels to the same control unit in the system, which enables tracking a unique phase for each channel. For example, in the IBM Quantum systems used to perform the experiments within this paper, every DriveChannel may share an AWG with multiple ControlChannels, each of which has a frequency and phase adjusted to track that of its respectively coupled qubit, enabling the implementation of two-qubit gates as demonstrated in Sec. III.
The AcquireChannel is used to communicate to the system when qubit readout data must be acquired. It is not associated with a control term in the Hamiltonian and does not output stimulus to the quantum system. Data collected on these channels are used to determine the qubit state, see Sec. IV for more details.

C. Instructions and Execution Model
Instructions may be scheduled on Channels to manipulate the quantum system. Pulse instructions have as operands channels and instruction-dependent constants. All pulse instructions have a fixed, deterministic duration, which may be specified either implicitly or explicitly. Instructions are executed with an allocation and trigger timing model in which instructions are loaded into a FIFO queue unique to each channel and then execution is initialized synchronously across all channels with an external trigger signal. Consequently the absolute start and end of every pulse instruction may be scheduled at compile-time across channels with hard real-time deadlines relative to the external trigger signal. Instructions may have multiple channels as operands causing an execution dependency. In this case channels stall execution until all operand channels are available. We now outline the different types of instructions, which are summarized in Table I. Every channel supports a Delay instruction which has as operands a duration which is specified as a number of cycles, and a target channel. The channel will idle for the duration of the instruction.
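The allocation-and-trigger timing model described above can be sketched in a few lines of plain Python. This is an illustrative model only, not part of Qiskit Pulse: the function name, the channel labels and the durations are hypothetical. Each channel is treated as a FIFO queue, and an instruction with several operand channels stalls until all of them are available, so every start time is known before execution.

```python
def resolve_start_times(program):
    """program: list of (name, channels, duration) in issue order; durations in units of dt."""
    free_at = {}   # channel -> first cycle at which its queue is idle again
    timeline = []
    for name, channels, duration in program:
        # An instruction starts once every operand channel has drained its queue.
        start = max(free_at.get(ch, 0) for ch in channels)
        for ch in channels:
            free_at[ch] = start + duration
        timeline.append((start, name, channels))
    return timeline


program = [
    ("play_x", ("d0",), 160),
    ("delay", ("d1",), 100),
    ("two_channel_op", ("d0", "d1"), 50),  # hypothetical multi-channel instruction
    ("play_y", ("d1",), 160),
]
timeline = resolve_start_times(program)
for start, name, channels in timeline:
    print(start, name, channels)
```

Because all durations are fixed and deterministic, the start times depend only on the program and not on any runtime state, which is what permits scheduling relative to a single external trigger.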
The Play instruction allows users to play a pulse on a target PulseChannel with a frequency and phase set with the ShiftPhase and SetFrequency instructions. The ShiftPhase instruction has an implicit duration of zero and accepts an input float phase and a PulseChannel. The ShiftPhase will shift the phase $\phi$ of the target channel by phase radians. This relative shift persists on the channel from the time of the instruction, allowing the phase of a channel to be accumulated throughout an experiment. The SetFrequency instruction has an implicit duration of zero and accepts an input float frequency and a PulseChannel. This instruction will set the frequency $f$ of all subsequent pulses on the target channel to frequency Hz. Like any other instruction, SetFrequency can be used on a single channel multiple times within a schedule, subject to the instantaneous bandwidth of the hardware. It can therefore be used, for example, to measure the anharmonicity of a transmon qubit.
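The persistence of phase shifts and frequency updates can be illustrated with a small stand-alone model. The class below is hypothetical and is not the Qiskit Pulse API; it only tracks the two frame parameters of one channel and shows that successive phase shifts accumulate and affect every pulse played afterwards. The numerical values are examples.

```python
import cmath
import math


class ChannelFrame:
    """Hypothetical model of the frequency and phase tracked by one PulseChannel."""

    def __init__(self, frequency):
        self.frequency = frequency  # Hz
        self.phase = 0.0            # radians

    def shift_phase(self, phase):
        # Zero duration: adds to the phase seen by all later pulses.
        self.phase += phase

    def set_frequency(self, frequency):
        # Zero duration: replaces the frequency used by all later pulses.
        self.frequency = frequency

    def play(self, samples):
        # Only the phase rotation of the envelope is modeled here.
        rotation = cmath.exp(1j * self.phase)
        return [rotation * d for d in samples]


frame = ChannelFrame(frequency=4.857e9)
before = frame.play([0.5])
frame.shift_phase(math.pi / 2)
frame.shift_phase(math.pi / 2)  # shifts accumulate: the total phase is now pi
after = frame.play([0.5])
frame.set_frequency(4.857e9 - 320.2e6)  # example retuning of the channel frequency
```

No waveform is touched by either update, which is why these instructions can have zero duration.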
The Acquire instruction has as operands a duration, an AcquireChannel, and a classical register in which to store the observed result. This instruction signals to the measurement unit to begin acquiring data, and for how long. Each Acquire instruction should align with a corresponding measurement stimulus Play instruction to induce a measurement of the target qubit. An acquisition channel outputs an unsigned integer value N into the result register. For the standard two-level qubit, this will be a single bit in {0, 1}.
If a measurement stimulus pulse measures multiple qubits, as is typical for multiplexed measurement schemes [21,22], an acquisition instruction must be synchronously scheduled for each of the measured qubits. In hardware, each AcquireChannel is constrained to a measurement chain which usually includes data acquisition, filtering, kerneling, and state discrimination. To accommodate the heterogeneous readout schemes encountered in hardware we defined three levels of readout data and how to convert between them, see Sec. IV.
The set of operations in Qiskit Pulse should have sufficient generality to program a pulse coprocessor for an arbitrary quantum computing system in the time-domain and be embedded within a larger instruction set that might include both classical control flow and a traditional gate-level description of a quantum program with instructions being implemented by a lowering procedure to pulse instructions. The benefit of this approach is that a single software stack may provide the middle-end for the rapidly developing heterogeneous quantum computing platforms.

D. The Pulse Schedule
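The pulse Schedule is the representation of a pulse program in Qiskit Pulse and is an ordered collection of scheduled pulse instructions. The pulse schedule is equivalent to a basic block [23] in a classical computation with deterministic instruction durations. To construct a pulse schedule, Instructions may be appended as demonstrated in the example below, which prepares qubit 0 in the $|1\rangle$ state and then measures it:

```python
# Imports follow Qiskit Terra at the time of writing; module paths may differ in other versions.
# The pulse parameters (x_dur, x_amp, ...) and backend are assumed to be defined by the user.
from qiskit import execute
from qiskit.pulse import (Schedule, Play, Acquire, Drag, GaussianSquare,
                          DriveChannel, MeasureChannel, AcquireChannel, MemorySlot)

# Create a pulse schedule.
sched = Schedule(name='excited_state')
# Create gate and measurement pulses.
x180 = Drag(x_dur, x_amp, x_sigma, x_beta)
measure = GaussianSquare(m_dur, m_amp, m_sigma, m_square_width)
# += appends an Instruction to a Schedule.
sched += Play(x180, DriveChannel(0))
# Measure qubit 0. Appending is resolved per channel, so the measurement
# is shifted by x_dur to start after the gate pulse on the drive channel.
sched += Play(measure, MeasureChannel(0)) << x_dur
# Determine the state of qubit 0 and store it
# in a persistent MemorySlot register which
# will be returned in the program result.
sched += Acquire(m_dur, AcquireChannel(0), MemorySlot(0)) << x_dur
# Run the schedule and get the result.
counts = execute(sched, backend).result().get_counts()
```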

E. Scheduling
Quantum circuits and pulse schedules are both representations of a quantum program. The Qiskit transpiler optimizes quantum circuits according to the properties of the targeted quantum system such as the device topology, the native gate set, and the gate fidelities. A scheduler compiles a circuit program to a pulse program, as depicted in Fig. 1. Scheduling requires system dependent information, most notably the definitions of the native gates in terms of scheduled pulse instructions. The scheduler therefore requires a quantum circuit to be transpiled to the native gate set of the target system prior to scheduling. Furthermore, during scheduling it is crucial to maintain the relative timing of groups of pulse instructions calibrated to implement an element of the native gate set. For instance, the cross-talk cancellation tone of a cross-resonance gate applied to the target qubit must be aligned with the pulse that drives the control qubit at the frequency of the target qubit [24].
The input circuit provides only implicit topological timing constraints, which allows the scheduler to arbitrarily resolve the remaining free time-alignment parameters in the output schedule. The scheduler's behavior in resolving free parameters is set by specifying a scheduling method or policy. By default the Qiskit scheduler follows an "as-late-as-possible" scheduling method [1]. This will schedule individual gates as late as possible while minimizing the dead time between instructions on the same channel. This scheduling routine mitigates $T_1$ and $T_2$ decay errors by maximizing the time that qubits will spend in their initial ground state prior to the first pulse, while also minimizing the time between the last pulse and the measurement. Fig. 2 provides a code snippet for scheduling a quantum circuit into a pulse schedule using Qiskit, and visually demonstrates the correspondence between the circuit instructions input to the scheduler and the calibrated output pulse sequences. This output is easily generated for both the QuantumCircuit and Schedule with the draw method.
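The "as-late-as-possible" policy can be sketched without reference to any particular scheduler implementation. The function below is a simplified illustration, not the Qiskit scheduler: it treats each gate as a block with a hypothetical fixed duration (in units of dt), walks the circuit backwards, and places every gate so that it ends exactly when the next gate on any of its qubits begins.

```python
def alap_schedule(gates):
    """gates: list of (name, qubits, duration) in circuit order.

    Returns a sorted list of (start, name, qubits, duration).
    """
    next_start = {}  # qubit -> start time of the following gate (times <= 0)
    placed = []
    for name, qubits, duration in reversed(gates):
        stop = min(next_start.get(q, 0) for q in qubits)
        start = stop - duration
        for q in qubits:
            next_start[q] = start
        placed.append((start, name, qubits, duration))
    # Shift the timeline so that the earliest instruction starts at t = 0.
    offset = -min(start for start, *_ in placed)
    return sorted((s + offset, n, q, d) for s, n, q, d in placed)


# Hypothetical gate durations; qubit 1 has more single-qubit work than qubit 0.
circuit = [
    ("x", (1,), 160),
    ("x", (1,), 160),
    ("h", (0,), 160),
    ("cx", (0, 1), 1000),
]
schedule = alap_schedule(circuit)
for entry in schedule:
    print(entry)
```

In this example the Hadamard on qubit 0 is delayed to start at 160 dt instead of 0, so qubit 0 remains in its ground state until immediately before the two-qubit gate.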
Qiskit Pulse users may create pulse programs to replace the default pulse programs of the native gate set provided by the backend and pass them as an argument to the scheduler. This gives users low-level control over the gate definitions used at scheduling time. They may also specify their own scheduling policies to dynamically aggregate gates and generate composite pulse sequences such as would be required to implement the compilation techniques described by Shi et al. [25].

FIG. 2: (a) Qiskit code to construct a quantum circuit that prepares and measures a Bell state and then schedules the circuit to produce an equivalent pulse schedule. Here, h is a Hadamard gate, cx a CNOT gate and backend is a description of a quantum system received from a hardware provider. (b) and (c) Visualization of the mapping between circuit instructions (b) and the composite pulse sequences that will implement the circuit elements (c). Pulse envelopes filled with bright and dark colors respectively represent the real (in-phase) and imaginary (quadrature-phase) components of the input control waveform. The circular arrows represent a phase shift. The gray shadow on a0 and a1 indicates the data acquisition trigger for the ADC which is synchronized with the measurement stimulus pulse. These mappings are automatically provided by the hardware backend, but may be overridden by the user as we demonstrate in Sec. III. The scheduler will align the gates in time according to the selected scheduling policy.

III. DEMONSTRATION OF A CROSS-RESONANCE ENTANGLING GATE
To highlight how Qiskit Pulse can enable tasks that cannot be done in the circuit model we perform standard quantum process tomography (QPT) [26] of both echoed and un-echoed cross-resonance (CR) [27] pulses for varying amplitudes on a cloud-based quantum computer. We use the tomography data to calculate the coefficients of the effective CR Hamiltonian as a function of the pulse amplitude, and show how to implement a high-fidelity Controlled-NOT (CNOT) gate based on the calibrated CR pulse.

A. The Cross-Resonance Interaction
The CR gate is a microwave-only two-qubit entangling gate for fixed-frequency dispersively coupled qubits [27]. It is physically realized by driving the control qubit with microwave pulses at the frequency of the target qubit to stimulate the evolution of an effective ZX interaction Hamiltonian, where Z and X are the Pauli-Z and X operators of the driven control qubit and the target qubit, respectively.
The two-transmon system driven by the CR pulse is described by a time-dependent Hamiltonian $H_{CR}(t)$ which, in the absence of noise, results in the unitary evolution $U_{CR}$. We can approximate the evolution as being generated by a time-independent Hamiltonian $H_{CR}$ using the perturbative method presented in Ref. [28], which showed good agreement with experimental results for single-pulse and echoed CR gates [24]. This technique approximates the time-dependent control qubit drive pulse with a constant amplitude pulse and block-diagonalizes the resulting Hamiltonian to second order. Using this approach, the CR evolution is approximated by the evolution under the effective Hamiltonian of Eq. (2). If the ZX term could be isolated, the resulting unitary gate would be the two-qubit rotation $U_{ZX}(\theta_{ZX}) = \exp\left(-i \frac{\theta_{ZX}}{2} ZX\right)$ (3), where the rotation angle $\theta_{ZX}$ depends on the strength and duration of the pulse applied to the control qubit. The unitary gate $U_{ZX}(\pi/2)$ is a perfect entangler: it can be used to generate a maximally entangled state from a separable input state and is locally equivalent to a CNOT gate [29]. Therefore, combined with arbitrary single-qubit operations, it is sufficient for universal quantum computation.
The terms in addition to $\omega_{ZX} ZX$ in $H_{CR}$ lead to coherent errors and divergences from the ideal target unitary in Eq. (3). Characterizing the strength of these terms and designing pulse sequences that suppress them is necessary to create high-fidelity entangling operations. The standard techniques used to suppress these terms are multi-pulse echoes and cancellation tones [24].
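The statement that the ideal ZX rotation by π/2 is a perfect entangler can be checked numerically with a short sketch. The code below assumes the convention $U_{ZX}(\theta) = \exp(-i\theta ZX/2)$ with amplitudes ordered as |control target⟩, and uses the concurrence $C = 2|a_{00}a_{11} - a_{01}a_{10}|$ of a pure two-qubit state as the entanglement measure (C = 1 for a maximally entangled state, C = 0 for a product state). The function names are illustrative.

```python
import math


def apply_u_zx(theta, state):
    """Apply exp(-i*theta/2 * ZX) to amplitudes ordered as 00, 01, 10, 11."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    a00, a01, a10, a11 = state
    # ZX flips the target bit and adds a minus sign when the control is |1>.
    zx_state = [a01, a00, -a11, -a10]
    return [c * a - 1j * s * b for a, b in zip(state, zx_state)]


def concurrence(state):
    a00, a01, a10, a11 = state
    return 2 * abs(a00 * a11 - a01 * a10)


r = 1 / math.sqrt(2)
plus_zero = [r, 0, r, 0]   # control in (|0> + |1>)/sqrt(2), target in |0>
zero_zero = [1, 0, 0, 0]   # both qubits in |0>

c_plus = concurrence(apply_u_zx(math.pi / 2, plus_zero))
c_zero = concurrence(apply_u_zx(math.pi / 2, zero_zero))
print(c_plus, c_zero)
```

The separable input with the control qubit in a superposition is mapped to a maximally entangled state, whereas the input |00⟩ stays a product state: for a fixed control state the gate only rotates the target qubit.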

B. Constructing and Calibrating a Cross-Resonance Gate
The experiments presented within this section are executed on the twenty-qubit IBM Quantum system ibmq_almaden to take advantage of higher resolution waveforms with a cycle-time $dt = 0.222$ ns afforded by infrastructure under test on that system at the time of writing. We use qubit 1 and qubit 0 as the control and target qubits, respectively. The resonance frequency and anharmonicity of the control qubit are $f_1 = 4.972$ GHz and $\delta_1 = -319.7$ MHz, and $f_0 = 4.857$ GHz and $\delta_0 = -320.2$ MHz for the target qubit.
We implement both a single-pulse (CR1) and an echoed two-pulse (CR2) variant of the CR gate without a cross-talk cancellation tone on the target qubit [30]. The CR pulse envelope is a GaussianSquare pulse, i.e., a square pulse with Gaussian-shaped rising and falling edges. The pulse has a square amplitude $A$, a phase $\phi = -0.166$ rad, discussed in Appendix B, and a total duration $t_{CR} = 848\,dt = 188.4$ ns. The square portion of the pulse has a duration of $720\,dt$ and the Gaussian rising and falling edges each last $64\,dt$ and have a $32\,dt$ standard deviation. The pulse duration is chosen so that a $\pi/2$ rotation angle can be achieved within the weak driving regime.
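The pulse geometry quoted above can be cross-checked with a short sketch that assembles a flat-top envelope from a 720 dt square portion and two 64 dt Gaussian edges with a 32 dt standard deviation. This is an illustration, not the Qiskit GaussianSquare implementation, which may additionally shift or rescale the edges; the function name, the exact cycle-time of 2/9 ns (which rounds to the quoted 0.222 ns) and the example amplitude are assumptions.

```python
import math


def gaussian_square_samples(amp, flat, edge, sigma):
    """Flat-top envelope with Gaussian rising and falling edges (units of dt)."""
    rise = [amp * math.exp(-0.5 * ((j - edge) / sigma) ** 2) for j in range(edge)]
    fall = rise[::-1]
    return rise + [amp] * flat + fall


DT_NS = 2 / 9  # assumed exact cycle-time in ns; rounds to 0.222 ns
samples = gaussian_square_samples(amp=0.229, flat=720, edge=64, sigma=32)
duration_dt = len(samples)
duration_ns = duration_dt * DT_NS
print(duration_dt, round(duration_ns, 1))
```

The edges and the square portion add up to 64 + 720 + 64 = 848 samples, i.e. roughly 188 ns at this cycle-time.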
The CR1 sequence is a single CR pulse on the ControlChannel u1, see Fig. 3(a). The CR2 sequence consists of two CR pulses with opposite phases on u1, and two additional single-qubit pulses on the DriveChannel d1, one after each CR pulse, see Fig. 3(b). This echo sequence refocuses unwanted terms in the interaction Hamiltonian [24]. The following code exemplifies how to build the CR2 schedule in Qiskit Pulse.
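As a library-independent illustration of the timing of this echo sequence (a plain-Python stand-in, not a Qiskit Pulse listing), the sketch below lays out the four pulses back to back. The CR pulse duration is the 848 dt quoted in the text; the duration of the single-qubit π pulse and all function and label names are hypothetical.

```python
X_DUR = 160   # hypothetical duration of the single-qubit pi pulse, in units of dt
CR_DUR = 848  # CR pulse duration quoted in the text, in units of dt


def echoed_cr_timeline(cr_amp, cr_dur=CR_DUR, x_dur=X_DUR):
    """Return ([(start, channel, label, duration), ...], total_duration)."""
    sequence = [
        ("u1", f"CR(+{cr_amp})", cr_dur),
        ("d1", "X(pi)", x_dur),
        ("u1", f"CR(-{cr_amp})", cr_dur),  # opposite phase: sign-flipped amplitude
        ("d1", "X(pi)", x_dur),
    ]
    timeline, t = [], 0
    for channel, label, duration in sequence:
        timeline.append((t, channel, label, duration))
        t += duration  # all pulses address the control qubit, so they run back to back
    return timeline, t


cr2_timeline, cr2_duration = echoed_cr_timeline(cr_amp=0.098)
for entry in cr2_timeline:
    print(entry)
```

Under these assumptions the echoed sequence is more than twice as long as a single CR pulse, which is the price paid for refocusing the unwanted Hamiltonian terms.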

C. Quantum Process Tomography of the CR Gate
FIG. 3: CNOT pulse schedules implemented by the CR gates with the local fidelity optimization. Local operations are realized by $X_{\pm\pi/2}$ pulses with three virtual-Z gates before and after the CR gates on the DriveChannels d0 and d1. The CR1 and CR2 gates are surrounded by red boxes. (a) CR1-based CNOT gate composed of a single CR pulse $CR_{\pi/2}$ on the ControlChannel u1. (b) CR2-based CNOT gate composed of two CR pulses $CR_{\pm\pi/4}$ on u1 with echo pulses $X_{+\pi}$ applied on d1. Measurement and acquisition pulses are not shown. The numbers below the channel aliases show an amplitude scaling factor used for plotting. The 12 circular arrows topped by floating point numbers represent phase shifts in units of radians and each phase shift corresponds to an optimization parameter $\Theta_i$. Note that phase shifts on u1 reflect those in d0 to synchronize the frame of both channels; they are automatically inserted by the pulse scheduler.

To study the dynamics of the CR pulse, we perform standard QPT [26] of the CR1 and CR2 pulse sequences for a range of CR pulse amplitudes using the tomography module of Qiskit Ignis [31]. Given a d-dimensional noisy quantum channel $\mathcal{E}$, QPT reconstructs the Choi matrix $\Lambda_{\mathcal{E}}$, which is the positive-semidefinite matrix defined by $\Lambda_{\mathcal{E}} = \sum_{i,j=0}^{d-1} |i\rangle\langle j| \otimes \mathcal{E}(|i\rangle\langle j|)$. The QPT circuits, shown in Fig. 4, apply single-qubit gates before and after the CR pulse to prepare the required input states and to select the measurement bases, respectively. We prepare each qubit in the states $|0\rangle$, $|1\rangle$, $\frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$, $\frac{1}{\sqrt{2}}(|0\rangle + i|1\rangle)$ and measure in the X, Y and Z bases. Our amplitude-dependent CR pulse is inserted into the QPT circuits in Fig. 4 as a custom gate that the Qiskit pulse scheduler maps to a pulse program, see Appendix A 1.

Each of the 144 two-qubit QPT pulse schedules is executed 2048 times to estimate the measurement outcome probabilities of each qubit. The details of the readout process are presented in Sec. IV. We correct for measurement errors using the readout error mitigation techniques [33] implemented in Qiskit Ignis. Readout error mitigation for two qubits requires four additional schedules, which were interleaved with the QPT schedules. The reconstructed Choi matrix $\mathcal{E}_{CR}(A)$ for the noisy gate was obtained from the convex-optimization QPT fitter in Qiskit Ignis for each value of the CR pulse amplitude $A$. This fitter uses maximum likelihood estimation to find the completely-positive trace-preserving process that is most likely to fit the measured data after correction for readout errors. We use the fitted Choi matrices $\mathcal{E}_{CR}(A)$ to compute estimates of the coefficients of the effective CR Hamiltonian in Eq. (2). The method is described in Appendix C. We then fit these coefficients to a third-order model to find the CR pulse amplitude that implements a $\theta_{ZX} = \pi/2$ rotation, see Appendix C. The estimated amplitudes, marked by the stars in Fig. 5, were $0.229 \pm 0.019$ and $0.098 \pm 0.005$ for CR1 and CR2, respectively.
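The number of tomography schedules follows directly from the choice of four preparation states and three measurement bases per qubit. The short enumeration below reproduces this count for two qubits; the string labels are illustrative, and the shot count is the one quoted in the text.

```python
from itertools import product

PREPARATIONS = ["0", "1", "+", "+i"]  # |0>, |1>, (|0>+|1>)/sqrt(2), (|0>+i|1>)/sqrt(2)
MEASUREMENTS = ["X", "Y", "Z"]
SHOTS = 2048

# One setting = a preparation state and a measurement basis for each of the two qubits.
settings = [
    (prep, meas)
    for prep in product(PREPARATIONS, repeat=2)
    for meas in product(MEASUREMENTS, repeat=2)
]
n_schedules = len(settings)        # 4**2 * 3**2
qpt_shots = n_schedules * SHOTS    # QPT shots per CR amplitude value
print(n_schedules, qpt_shots)
```

The count grows as $12^n$ with the number of qubits $n$, which is why full process tomography is practical only for small subsystems such as a single two-qubit gate.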

D. Optimizing CNOT Fidelity with Local Operations
To estimate the highest fidelity of a maximally entangling gate that the CR gate can be transformed into, we optimize the average gate fidelity $F$ over all single-qubit pre- and post-rotation angles $\Theta$ on both the control and target qubits, see Appendix D. The optimized fidelities for the measured CR1 and CR2 process maps are $F_{\max} = 0.992$ and $F_{\max} = 0.994$, respectively.
We then use the Qiskit transpiler and pulse scheduler to optimize and build a CNOT gate from the calibrated CR gate and the device-calibrated single-qubit gates [34] which implement the optimal local rotation parameters $\Theta$ from Eq. (D1). The optimized CNOT schedules are shown in Fig. 3. The average gate fidelities of the calibrated CNOT gates are measured with randomized benchmarking (RB) [35]. The details of the pulse program applying the local rotations and setting up the RB measurements are given in Appendix A 2. The RB experiments estimate an average gate fidelity of $F = 0.981$ and $F = 0.979$ for the CR1 and CR2 gates, respectively. These fidelities are comparable to the measured fidelity of $F = 0.984$ of the standard CNOT gate provided by ibmq_almaden, which is implemented using a highly-tuned calibration process including an echo sequence, cancellation tone, and closed-loop amplitude calibration [24]. It is worth noting that the CNOT gates demonstrated in this paper have no cancellation tone and all parameters are obtained with open-loop calibration. In the same way, we can create custom basis gates which may enable hardware-efficient implementations of quantum algorithms.
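For the ideal, noise-free case, the effect of local rotations on the fidelity can be worked out explicitly. The sketch below is an idealized illustration and not the numerical optimization over measured process maps used in this work: it assumes the convention $R_P(\theta) = \exp(-i\theta P/2)$, the ordering |control target⟩, and the standard expression $F = (|\mathrm{Tr}(V^\dagger U)|^2 + d)/(d(d+1))$ for the average gate fidelity between two unitaries. It shows that $U_{ZX}(\pi/2)$ alone has a low fidelity with respect to the CNOT gate, while adding a Z rotation on the control and an X rotation on the target, each by $-\pi/2$, reproduces the CNOT up to a global phase.

```python
import math

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]
CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]


def kron(a, b):
    """Kronecker product of two 2x2 matrices (first factor acts on the control)."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def rotation(theta, pauli):
    """exp(-i*theta/2 * P) for an operator P that squares to the identity."""
    n = len(pauli)
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c * (i == j) - 1j * s * pauli[i][j] for j in range(n)]
            for i in range(n)]


def average_gate_fidelity(u, target):
    """F = (|Tr(target^dagger u)|^2 + d) / (d (d + 1)) for unitaries u and target."""
    d = len(u)
    overlap = sum(target[i][j].conjugate() * u[i][j]
                  for i in range(d) for j in range(d))
    return (abs(overlap) ** 2 + d) / (d * (d + 1))


u_zx = rotation(math.pi / 2, kron(Z, X))
local = matmul(kron(rotation(-math.pi / 2, Z), I2),
               kron(I2, rotation(-math.pi / 2, X)))
cnot_from_zx = matmul(local, u_zx)

f_bare = average_gate_fidelity(u_zx, CNOT)
f_local = average_gate_fidelity(cnot_from_zx, CNOT)
print(round(f_bare, 3), round(f_local, 3))
```

In this ideal case the two local rotations commute with the ZX rotation, so they may be applied before or after it; for a measured process containing additional Hamiltonian terms, the rotation angles have to be found by optimization.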

IV. READOUT AT THE PULSE LEVEL
Readout is the process through which the qubit state is projected onto $|0\rangle$ or $|1\rangle$ and a corresponding classical bit is obtained. This process is modeled by a readout chain in which the observed signal undergoes a series of successive transformations. Qiskit Pulse supports returning the output data of each measurement layer to the programmer. The lowest level accessible to the user, level-zero or raw data, typically corresponds to a digitized time-series signal. A kernel method applied to the signal data removes its time dependency and results in a complex value which encodes the qubit state (level-one kerneled data). Finally, the classified qubit state (level-two discriminated data) is obtained by applying a discriminator to the kerneled data. For a superconducting qubit processor, the time traces are complex vectors representing the digitized readout signals reflected or transmitted from the readout resonators [9]. The kernel method, such as the boxcar integrator used within this paper, outputs points in the IQ plane which a discriminator may use to classify the qubit's state.
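The three readout levels can be mimicked with a few lines of plain Python. The sketch below uses a boxcar (uniform average) kernel and, for brevity, a nearest-centroid rule in place of a fitted discriminator; this is a deliberately simpler stand-in and not the discriminator used in this work. The traces are synthetic, noise-free values chosen for illustration, and the function names are hypothetical.

```python
def boxcar_kernel(trace):
    """Level 0 -> level 1: average a complex time trace into a single IQ point."""
    return sum(trace) / len(trace)


def nearest_centroid(iq_point, centroid_0, centroid_1):
    """Level 1 -> level 2: assign the state whose calibration centroid is closest."""
    return 0 if abs(iq_point - centroid_0) <= abs(iq_point - centroid_1) else 1


# Synthetic level-0 calibration traces (hypothetical values).
trace_ground = [1.0 + 0.2j, 0.8 + 0.1j, 1.2 + 0.3j, 1.0 + 0.2j]
trace_excited = [-0.9 + 0.1j, -1.1 + 0.3j, -1.0 + 0.2j, -1.0 + 0.2j]

centroid_0 = boxcar_kernel(trace_ground)
centroid_1 = boxcar_kernel(trace_excited)

# A new shot passes through the same chain: raw trace -> IQ point -> bit.
new_trace = [0.7 + 0.0j, 0.9 + 0.4j, 1.1 + 0.2j, 0.9 + 0.2j]
iq_point = boxcar_kernel(new_trace)
bit = nearest_centroid(iq_point, centroid_0, centroid_1)
print(iq_point, bit)
```

Exposing the intermediate IQ point is what allows the last step to be replaced by a user-defined discriminator without changing the acquisition itself.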
Qiskit users are now able to retrieve data from different levels in the readout chain by specifying the readout data level, i.e., zero, one, or two. For example, users of IBM Quantum processors may request the kerneled data in the form of IQ points so that they may implement their own discriminator. Qiskit users who do not wish to implement their own kernels and discriminators can instead request level-two data, thereby using the built-in readout scheme. The readout methodology that we implemented reflects the typical data flow during readout in hardware and should allow users to test novel readout schemes [36] as well as accommodate different quantum computing architectures. Aside from counting qubit states, discriminators may also be used to infer properties of the system and benchmark it.
We illustrate this by investigating spurious correlations in the qubit readout of the IBM Quantum system ibmq_singapore, selected based on its availability at the time of the experiment. From ibmq_singapore we randomly selected qubits 16 and 17 to study, as they are a neighboring pair of qubits on the chip. Kerneled data was measured for four calibration schedules, named cal_ij with $i, j \in \{0, 1\}$. Here, to prepare the state $|ij\rangle$, a π-pulse is applied to qubit $i$ if $i = 1$ and simultaneously to qubit $j$ if $j = 1$. The single-qubit pulses are followed by the measurement stimulus pulses and acquisition instructions for all qubits. For each qubit we fit two discriminators based on linear discriminant analysis [36]. For qubit $i$, one discriminator is fitted to a subset of the calibration data in which the other qubit is always in state $|0\rangle$, shown in Fig. 6, while the other discriminator is fitted using all four calibration schedules. We expect that the discriminator that uses all the calibration data will perform best if there is measurement cross-talk.
The fidelities of the fitted discriminators, shown in Tab. II, suggest that there is no significant cross-talk between the qubits that we measured. This is verified by t-tests on sixteen Pearson correlation coefficients $r_j(\mathrm{ES}_{i,X}, \mathrm{GS}_{i,Y})$ between $\mathrm{ES}_{i,X}$ and $\mathrm{GS}_{i,Y}$, which correspond to the $X, Y \in \{I, Q\}$ data of qubit $i$ in state $j \in \{0, 1\}$ when the other qubit is in the excited state (ES, i.e., $|1\rangle$) and ground state (GS, i.e., $|0\rangle$), respectively. These correlation coefficients are sensitive to cross-talk between the two qubits. With 1024 degrees of freedom, i.e., measurement shots, we do not observe any statistically significant correlation at the 95% confidence level. This implies that it is sufficient to fit discriminators using only a ground and an excited reference schedule for each qubit. Consequently, only $2n$ calibration schedules are required for $n$ qubits when there is no cross-talk, rather than the $2^n$ calibration schedules that would be required with all-to-all measurement cross-talk.
TABLE II caption (beginning not recovered): ... [36]. For example, the Single-Q16 discriminator was fitted with the calibration schedules cal_00 and cal_01, while the discriminator Both-Q16 was fitted with all four calibration schedules: cal_00, cal_01, cal_10 and cal_11. The confidence intervals were obtained using the Jeffreys interval at a 95% confidence level.
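The significance test on a single correlation coefficient can be sketched as follows. This is an illustrative sketch under stated assumptions: it uses the textbook statistic $t = r\sqrt{(n-2)/(1-r^2)}$ with $n-2$ degrees of freedom and approximates the two-sided 95% threshold by the normal value 1.96, which is adequate for roughly a thousand shots. The helper names and the two synthetic data sets are hypothetical and stand in for measured I/Q quadratures.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic for the null hypothesis of zero correlation (requires |r| < 1)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

def is_significant(r, n, z_crit=1.96):
    """Two-sided test at the 95% level, using the large-n normal approximation."""
    return abs(t_statistic(r, n)) > z_crit

shots = 1024
# Synthetic, exactly uncorrelated quadratures (stand-in for the no-cross-talk case).
gs = [1.0 if k % 2 == 0 else -1.0 for k in range(shots)]
es = [1.0 if (k // 2) % 2 == 0 else -1.0 for k in range(shots)]
# Synthetic correlated quadratures (stand-in for a cross-talk case).
es_leaky = [a + b for a, b in zip(gs, es)]

r_clean = pearson_r(es, gs)
r_leaky = pearson_r(es_leaky, gs)
clean_flag = is_significant(r_clean, shots)
leaky_flag = is_significant(r_leaky, shots)
```

In the experiment described above, the same test would be repeated for each of the sixteen coefficients.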

V. CONCLUSIONS AND FUTURE WORK
Rapid development in quantum computing has led to publicly available quantum computers with an increasing number of qubits, improved connectivity, and greater control. Prior to this work, publicly available quantum programming frameworks for cloud-based quantum computers have operated either at the relatively high level of the circuit model or have been implementation-specific, thus limiting their application. In the near term, pulse-level control is desired to extract as much quantum volume as possible from the hardware by experimenting with novel control and characterization schemes [37][38][39].
We have introduced Qiskit Pulse, an implementation of the virtual pulse-level programming model, OpenPulse [8]. We have demonstrated that the Qiskit circuit scheduler can target pulse instructions and that physical superconducting qubit hardware can interpret these instructions to execute useful programs. By embedding our pulse programming instruction set in Qiskit we have integrated gate-level quantum programs and classical pulse stimulus, exposing a new level of hardware control to Qiskit users. The benefit that pulse control provides quantum programmers was demonstrated by calibrating a cross-resonance pulse on a cloud-based quantum computer, embedding it as a gate within the standard circuit programming model, and characterizing this user-defined gate using quantum process tomography.
Giving users pulse-level access to current-day quantum computers will allow them to explore techniques such as error mitigation and dynamical decoupling schemes that cannot be investigated at the circuit level. In the future we will explore embedding the pulse programming model as a coprocessor within a classical virtual instruction set architecture that supports classical arithmetic and control-flow [40]. We will also investigate extensions to the pulse programming model such as defining special purpose registers to track phase across multiple channels which would reduce the number of required PulseChannels and enable simpler tracking of shared phase for composite gates. We would then use these capabilities to explore the implementation of active error-correcting codes, and promising variational quantum-classical algorithms such as the variational quantum eigensolver [41][42][43].
    # Pulse parameters determined during calibration.
    ...
    sched += Play(cr1_pulse, ControlChannel(0))

    # Add the CR1 instruction to basis_gates and inst_map.
    basis_gates += [gate_name]
    inst_map.add(gate_name, [1,0] ...

The above example uses the mock backend FakeAlmaden for the IBM Quantum system ibmq_almaden, which can be substituted for the real backend. The pulse envelope of CR1, cr1_pulse, is created with a flat-topped Gaussian pulse. The pulse schedule sched of CR1 is then added to the basis gates and to the circuit-instruction-to-pulse-schedule mapping (inst_map) for qubits one and zero with the name cr1. The basis_gates list defines the primitive circuit instructions available in the system, and the inst_map defines a lookup table of calibrated pulse schedules for each basis gate on each qubit. The circuit object of the CR1 sequence is created with a custom gate cr1_gate. The QPT circuits are then assembled by calling the process_tomography_circuits function in Qiskit-Ignis. This prepends state preparation circuits and appends measurement circuits to the circuit. The returned qpt_circuits is a list of quantum circuits containing each possible combination of input states and measurement bases. The qpt_circuits are then mapped to the backend in question by Qiskit's transpiler, taking into account the extended set of basis gates. Finally, we call the pulse scheduler with the custom inst_map containing CR1 to create the QPT pulse schedules. The QPT program for the CR2 sequence is created with the same procedure.

Performing Randomized Benchmarking
The code example below demonstrates how the standard randomized benchmarking (RB) schedules with the CR1-CNOT used in Sec. III D are created in the qiskit.pulse module.

    ...
    inst_map.add(gate_name, [1,0] ...
    ...
    rb_schedules_seeds.append(rb_schedules_seed)

As shown in Sec. A 1, the pulse schedules are programmed with the aid of the QuantumCircuit class to apply device-calibrated single-qubit gates around the CR1 pulse sequence abstracted by cr1_gate. The local rotation parameters can be obtained by the optimization routine shown in Sec. D. It should be noted that in a two-qubit standard RB sequence the CNOT gate can use either qubit 0 or qubit 1 as the control qubit. Because the CNOT gate is not invariant under the exchange of the control and target qubits, we need to prepare pulse schedules for both qubit arrangements. Then, the default CNOT instruction in the inst_map is overwritten by the pulse schedules based on the calibrated CR1 sequence. Finally, the RB circuits are generated by a call to the randomized_benchmarking_seq function in Qiskit-Ignis. The returned rb_circuits_seeds is a list of RB circuits for each random seed. These RB circuits are then independently transpiled and scheduled to create RB pulse programs. The RB programs for the CR2 sequence are created with the same procedure.
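The need for two schedules follows from the CNOT matrix itself, as the short check below illustrates. It is a plain-Python sketch that does not use Qiskit; the basis ordering $|ab\rangle \mapsto 2a + b$ and the names `CNOT_AB`, `CNOT_BA`, and `matmul` are choices made for this example only.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Basis order |ab> -> index 2a + b. Control a, target b.
CNOT_AB = [[1, 0, 0, 0],
           [0, 1, 0, 0],
           [0, 0, 0, 1],
           [0, 0, 1, 0]]

SWAP = [[1, 0, 0, 0],
        [0, 0, 1, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1]]

# Exchanging the roles of the two qubits conjugates the gate by SWAP.
CNOT_BA = matmul(SWAP, matmul(CNOT_AB, SWAP))

gates_differ = CNOT_BA != CNOT_AB
```

Since `CNOT_BA` differs from `CNOT_AB`, a schedule calibrated for one control-target assignment cannot simply be reused for the other.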

Appendix B: Cross Resonance Phase Calibration
In the twenty-qubit IBM Quantum system ibmq_almaden, microwave pulses programmed with Qiskit Pulse are generated by waveform generators at room temperature and travel through coaxial cables to the qubits [44]. The transfer function between the room temperature electronics and the qubits can cause a phase offset $\phi_0$ in Eq. (1), resulting in an error in the rotation axis of the target qubit. The Hamiltonian may therefore have an unwanted ZY interaction term, which we eliminate by adjusting the phase $\phi$ of the CR pulse. We perform this calibration with the CR2 schedule since its approximate time-independent Hamiltonian has fewer terms than the CR1 Hamiltonian due to the echo. In this approximation, $\Omega$ is the strength of the CR drive as a function of its amplitude $A$ and its phase $\phi$, while $\epsilon$ represents the small interaction terms which are not fully refocused by the echo sequence. First, we initialize the qubits in the $|00\rangle$ state. We sweep the amplitude $A$ and measure the target qubit in the Pauli-Z basis to find the pulse amplitude $A_{\mathrm{opt}} = 0.108$, which creates an equal superposition of $|0\rangle$ and $|1\rangle$. If the offset $\phi_0$ is zero, this transformation is a π/2-rotation around the X-axis so that the target qubit, measured in the Y-basis, yields $\mathrm{Tr}(\sigma_y \rho) = \pm 1$ with the sign depending on the state of the control qubit. We thus measure the readout signal at $A_{\mathrm{opt}}$ in the Y-basis for both initial states of the control qubit, $|10\rangle$ and $|00\rangle$, while sweeping the phase $\phi$. The calibrated phase that maximizes $|\mathrm{Tr}(\sigma_y \rho)|$ is $\phi_{\mathrm{opt}} = -0.166$ rad.
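The logic of the phase sweep can be illustrated with a toy model of the target qubit. The sketch below is not the experiment's analysis code. It assumes an idealized π/2 rotation of the target about an axis in the XY plane at angle $\phi + \phi_0$ from X, with the rotation sense set by the control state, and it ignores the residual terms $\epsilon$. The sign convention for $\phi_0$, the value `PHI0 = 0.166`, and the function names are hypothetical choices made so that the sweep lands near the value quoted above.

```python
import math

def expect_y(phi, phi0, control_state, theta=math.pi / 2):
    """<sigma_y> of the target, starting in |0>, after the toy CR rotation.

    Rotating the Bloch vector (0, 0, 1) by angle theta about the axis
    (cos(alpha), sin(alpha), 0) gives a y-component of -sin(theta) * cos(alpha).
    """
    alpha = phi + phi0                           # assumed convention: phases add
    sign = 1.0 if control_state == 0 else -1.0   # ZX flips the rotation sense
    return -math.sin(sign * theta) * math.cos(alpha)

def calibrate_phase(phi0, n_points=2001):
    """Sweep phi over [-pi/2, pi/2] and return the phase maximizing |<sigma_y>|."""
    best_phi, best_score = None, -1.0
    for k in range(n_points):
        phi = -math.pi / 2 + math.pi * k / (n_points - 1)
        # Average the signal magnitude over both control preparations.
        score = sum(abs(expect_y(phi, phi0, c)) for c in (0, 1)) / 2
        if score > best_score:
            best_phi, best_score = phi, score
    return best_phi, best_score

PHI0 = 0.166  # hypothetical offset in rad, for illustration only
phi_opt, contrast = calibrate_phase(PHI0)
```

In this model the two control preparations give signals of opposite sign, so the magnitude rather than the signed value is maximized.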
We use the fitted Choi matrices $\mathcal{E}_{CR}(A)$ to compute estimates of the coefficients of the effective CR Hamiltonian in Eq. (2). Since a real CR pulse will have noise, the resulting process is not unitary. Noisy quantum evolution for a time-independent Hamiltonian in the presence of Markovian noise may be described by the Lindblad equation with generator $\mathcal{G} = \mathcal{L} + \mathcal{D}$, where the operator $\mathcal{L}$ is the generator of the unitary evolution and $\mathcal{D}$ is the generator of the non-unitary dissipative evolution. As with unitary evolution, the Lindblad equation can be solved as a matrix differential equation, obtaining $|\rho(t)\rangle\rangle = S_{\mathcal{E}} |\rho(0)\rangle\rangle$, where $|A\rangle\rangle$ denotes a column-vectorized matrix $A$, and $S_{\mathcal{E}} = \exp(t S_{\mathcal{G}})$ is the superoperator representation of the quantum process $\mathcal{E}$ [32].
For a Hamiltonian $H$ we note that the operators $B_{ij} = \frac{1}{2} P_i \otimes P_j$, where $P_i$ are single-qubit Pauli operators, define an orthonormal basis for two-qubit operators, i.e., $\mathrm{Tr}[B_{ij} B_{kl}^\dagger] = \delta_{ik}\delta_{jl}$. Hence for a Hamiltonian given by $H = \sum_{ij} \omega_{ij} B_{ij}$, we can extract the coefficients via $\omega_{ij} = \mathrm{Tr}[B_{ij}^\dagger H]$. The superoperator for the Hamiltonian component of $\mathcal{G}$ is given by Eq. (C4). We can use the fact that the superoperators of the Hamiltonian basis terms $S_{\mathcal{L}_{B_{ij}}}$ are also an orthogonal (but not normalized) basis for $S_{\mathcal{L}_H}$ and, importantly, are orthogonal to the dissipative part of the generator ($\mathrm{Tr}[S_{\mathcal{L}_{B_{ij}}}^\dagger S_{\mathcal{D}}] = 0$) when the dissipator only involves Pauli and $T_1$ and $T_2$ relaxation terms. This allows us to extract the coefficients from the Lindblad superoperator generator as given in Eq. (C5).
To compute the superoperator generator $S_{\mathcal{G}}$, we first obtain the Choi-matrix estimate for a channel $\mathcal{E}$ from quantum process tomography and then convert it to the superoperator representation $S_{\mathcal{E}}$. For additional details on the superoperators and converting between superoperators and the Choi-matrix representation obtained from tomography, see [32]. Next, we take the matrix logarithm to obtain the generator $S_{\mathcal{G}} = t^{-1} \log(S_{\mathcal{E}})$, from which we estimate $\omega_{ij}$ for our two-qubit system using Eq. (C5). The process fidelities of the estimated CR Hamiltonian using this technique and the experimentally obtained Choi matrix are 99.4% and 98.6% on average for the CR1 and CR2 experiments, respectively.
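For orientation, the two relations referred to as Eqs. (C4) and (C5) can be sketched from the definitions in the text. This is a reconstruction under assumptions, not a quotation of the original equations: it takes $\mathcal{L}_H(\rho) = -i[H, \rho]$, the column-stacking identity $|AXB\rangle\rangle = (B^{T} \otimes A)|X\rangle\rangle$, and $S_{\mathcal{G}} = S_{\mathcal{L}_H} + S_{\mathcal{D}}$.

```latex
\begin{align}
S_{\mathcal{L}_H}
  &= -i\left( I \otimes H - H^{T} \otimes I \right)
   = \sum_{ij} \omega_{ij}\, S_{\mathcal{L}_{B_{ij}}} , \\
\omega_{ij}
  &= \frac{\mathrm{Tr}\!\left[ S_{\mathcal{L}_{B_{ij}}}^{\dagger}\, S_{\mathcal{G}} \right]}
          {\mathrm{Tr}\!\left[ S_{\mathcal{L}_{B_{ij}}}^{\dagger}\, S_{\mathcal{L}_{B_{ij}}} \right]} .
\end{align}
```

The second line follows by inserting the first into $S_{\mathcal{G}}$ and using the two orthogonality properties stated above. The normalization in the denominator is needed because the $S_{\mathcal{L}_{B_{ij}}}$ are orthogonal but not normalized. The identity term $B_{00} \propto I \otimes I$ commutes with every state, so its superoperator vanishes and its coefficient cannot be recovered this way.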
We find that as predicted only the terms ZX, ZY , ZZ, ZI, IX, IY , and IZ, shown in Fig. 5, are significant for CR1 and CR2 while all other remaining Pauli terms are negligible. In both CR sequences the ZY term is suppressed by the calibrated CR phase φ opt and a monotonic increase of the desired ZX term is observed as the pulse amplitude A increases. The CR1 pulse without echoing has large contributions from the IX, IY and ZI terms, see Fig. 5(a). Such unwanted interactions, except for ZZ, are removed by the echo sequence in CR2, compare Fig. 5(a) and (b). The effect of these unwanted interactions can be reduced by applying single-qubit gates before and after the CR pulse to correct for local coherent errors as discussed in Sec. III D.
We now find the CR pulse amplitude that creates a maximally entangling gate, i.e., $\theta_{ZX} = \pi/2$ in Eq. (3). Due to the Gaussian edges of our CR pulses, we relate the drive strength $\Omega$ to the time-averaged pulse amplitude $A$ through a linear response $\Omega = \lambda A$. The measured ZX interaction strengths are fit by the third-order expansion of the CR Hamiltonian [28]
$$\frac{\omega_{ZX}(A)}{2} = -\frac{J \lambda A}{\Delta}\,\frac{\delta_1}{\delta_1 + \Delta} + \frac{J (\lambda A)^3\, \delta_1^2 \left(3\delta_1^3 + 11\delta_1^2 \Delta + 15\delta_1 \Delta^2 + 9\Delta^3\right)}{4\Delta^3 (\delta_1 + \Delta)^3 (\delta_1 + 2\Delta)(3\delta_1 + 2\Delta)} \qquad \text{(C6)}$$
where $J$ is the coupling strength, $\delta_1$ is the anharmonicity of the control qubit, and $\Delta$ is the frequency difference between the control and target qubits. In this model, we have a pair of fit parameters $J$ and $\lambda$. The coupling strength obtained from the fit was $J = 1.87 \pm 0.046$ MHz and $1.79 \pm 0.033$ MHz for the CR1 and CR2 data, respectively. The $\lambda$ coefficient was $-271.2 \pm 12.2$ MHz and $-288.9 \pm 10.2$ MHz for the CR1 and CR2 data, respectively. These fit parameters should be independent of the pulse sequence, and the two results nearly agree within the error range. The small mismatch between the fit values may be caused by imperfections in the Hamiltonian reconstructed from the tomography data. These fit curves yield the controlled rotation angle as a function of the CR pulse amplitude, $\theta_{ZX}(A) = n_{CR}\, \omega_{ZX}(A)\, t_{CR}$, where $n_{CR} = 1$ for CR1 and $n_{CR} = 2$ for CR2. Finally, we can find the pulse amplitudes $A$ for $\theta_{ZX} = \pi/2$. The estimated amplitudes were $0.229 \pm 0.019$ and $0.098 \pm 0.005$ for CR1 and CR2, respectively. These are marked by the stars in Fig. 5. Due to the nonlinearity between the ZX term and the average pulse amplitude $A$, see Eq. (C6), the estimated drive amplitude of CR1 is slightly larger than double the drive amplitude of CR2.
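The last step, inverting $\theta_{ZX}(A) = \pi/2$ for the amplitude, can be sketched numerically. The example below evaluates Eq. (C6) and solves for $A$ by bisection. It is an illustration under assumptions: $J$ and $\lambda$ are taken from the CR1 fit quoted above, while the anharmonicity `DELTA1`, the detuning `DETUNING`, the pulse duration `T_CR`, and the unit convention (all rates in mutually consistent units) are hypothetical placeholders, so the resulting amplitudes are not the paper's values.

```python
import math

def omega_zx(amp, coupling, lam, delta1, detuning):
    """ZX rate omega_ZX(A) from the third-order model of Eq. (C6)."""
    drive = lam * amp  # Omega = lambda * A
    linear = -(coupling * drive / detuning) * delta1 / (delta1 + detuning)
    poly = (3 * delta1**3 + 11 * delta1**2 * detuning
            + 15 * delta1 * detuning**2 + 9 * detuning**3)
    denom = (4 * detuning**3 * (delta1 + detuning)**3
             * (delta1 + 2 * detuning) * (3 * delta1 + 2 * detuning))
    cubic = coupling * drive**3 * delta1**2 * poly / denom
    return 2.0 * (linear + cubic)  # Eq. (C6) gives omega_ZX / 2

def theta_zx(amp, n_cr, t_cr, params):
    """Controlled rotation angle theta_ZX = n_CR * omega_ZX(A) * t_CR."""
    return n_cr * omega_zx(amp, *params) * t_cr

def solve_amplitude(n_cr, t_cr, params, lo=0.0, hi=0.3, target=math.pi / 2):
    """Bisection for theta_ZX(A) = target on a bracket [lo, hi]."""
    f = lambda a: theta_zx(a, n_cr, t_cr, params) - target
    if f(lo) * f(hi) > 0:
        raise ValueError("target angle is not bracketed by [lo, hi]")
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

J_FIT, LAMBDA_FIT = 1.87, -271.2   # CR1 fit values quoted in the text
DELTA1, DETUNING = -330.0, 100.0   # hypothetical anharmonicity and detuning
T_CR = 0.6                         # hypothetical pulse duration
PARAMS = (J_FIT, LAMBDA_FIT, DELTA1, DETUNING)

amp_cr1 = solve_amplitude(1, T_CR, PARAMS)  # single pulse
amp_cr2 = solve_amplitude(2, T_CR, PARAMS)  # two pulses in the echo
```

With these placeholder parameters the cubic term has the opposite sign to the linear term, so `amp_cr1` comes out larger than `2 * amp_cr2`, the same qualitative behavior as noted in the text.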