Miniaturizing neural networks for charge state autotuning in quantum dots

A key challenge in scaling quantum computers is the calibration and control of multiple qubits. In solid-state quantum dots, the gate voltages required to stabilize quantized charges are unique for each individual qubit, resulting in a high-dimensional control parameter space that must be tuned automatically. Machine learning techniques are capable of processing high-dimensional data - provided that an appropriate training set is available - and have been successfully used for autotuning in the past. In this paper, we develop extremely small feed-forward neural networks that can be used to detect charge-state transitions in quantum dot stability diagrams. We demonstrate that these neural networks can be trained on synthetic data produced by computer simulations, and robustly transferred to the task of tuning an experimental device into a desired charge state. The neural networks required for this task are sufficiently small as to enable an implementation in existing memristor crossbar arrays in the near future. This opens up the possibility of miniaturizing powerful control elements on low-power hardware, a significant step towards on-chip autotuning in future quantum dot computers.

I. INTRODUCTION
Solid-state quantum dots (QDs) are one of several promising candidates for qubits, the basic building blocks of quantum computers [1][2][3][4][5][6][7][8]. They are engineered in semiconductor devices by the electrostatic confinement of single charge carriers (electrons or holes), and precisely tuned to a few-carrier regime where quantum effects dominate. Even single-dot devices require the tuning of multiple gates for the control of reservoirs, dots, and tunnel barriers [2]. The relationship between applied gate voltages and physical properties of a QD is highly complex and device-specific, requiring significant calibration and tuning. Thus, a key challenge in scaling up QD architectures to act as multi-qubit devices will be tuning within the high-dimensional space of gate voltages. This is a highly non-trivial control problem that cannot be accomplished without significant automation [9].
An automated process of finding a range of gate voltages in which a QD is in a specific carrier configuration is called autotuning. Due to the variability inherent in different QD devices, autotuning naturally benefits from a data-driven approach. Several compelling machine learning strategies have recently been introduced for from-scratch QD tuning [10], coarse tuning into charge regimes [11][12][13][14][15], fine tuning couplings between multiple dots [16,17], or performing autonomous measurements [18,19]. These studies have demonstrated robust effectiveness in identifying electronic states and charge configurations, and in automating the precise tuning of gates. Common algorithms in supervised learning, such as support vector machines and deep or convolutional neural networks, are sufficiently powerful for the complex characterization tasks involved in this automation. Coupled with the advent of community training data sets for tasks such as state recognition [20] and charge transition identification [21], the enterprise of QD autotuning may well be the first demonstration of a large-scale qubit control problem tamed by machine learning.
Like many elements of modern microprocessor design, an important consideration in quantum computers will be integrating control technologies "on-chip" -i.e. on or near the physical qubits inside the cryostat. This requires consideration of energy budgets to limit thermal dissipation in the various control tasks, the performance costs of transferring data, and the benefits of miniaturizing various control elements [22][23][24][25]. In autotuning of QDs, one of the simplest control tasks involves the identification of charge state transition lines in two-dimensional stability diagrams. While previous works use image analysis algorithms [26] or deep (convolutional) neural networks [11,15], in this paper we explore whether this task can be performed by extremely small feed-forward neural networks. Mixed-signal vector-matrix multiplication accelerators based on crossbar arrays of emerging memory technologies (e.g. memristors [27]) provide the possibility to implement sufficiently small neural networks on miniaturized hardware with low power consumption [28][29][30][31][32][33].
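As a rough illustration of why such accelerators suit this task, the following NumPy sketch shows how a signed weight matrix can be mapped onto two non-negative conductance matrices of a hypothetical crossbar, so that a single analog read-out performs the vector-matrix multiplication. The differential encoding scheme shown here is a common textbook construction, not a description of any specific device.

```python
import numpy as np

def encode_differential(W, g_max=1.0):
    """Encode a signed weight matrix into two non-negative conductance
    matrices (G_plus, G_minus), as conductances cannot be negative."""
    scale = g_max / np.abs(W).max()
    G = W * scale
    G_plus = np.clip(G, 0.0, None)
    G_minus = np.clip(-G, 0.0, None)
    return G_plus, G_minus, scale

def crossbar_vmm(G_plus, G_minus, v, scale):
    """One analog vector-matrix multiplication: each output current is
    the difference of the currents through the paired columns."""
    return (G_plus @ v - G_minus @ v) / scale

rng = np.random.default_rng(0)
W = rng.normal(size=(10, 25))   # e.g. a 25-input, 10-neuron layer
v = rng.normal(size=25)
G_plus, G_minus, scale = encode_differential(W)
assert np.allclose(crossbar_vmm(G_plus, G_minus, v, scale), W @ v)
```

In a physical array the multiply-accumulate happens in the analog domain via Ohm's and Kirchhoff's laws; the sketch only mirrors the arithmetic.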
We find that neural networks with input layers as small as 5 × 5 pixels are capable of classifying small patches of the stability diagram with sufficient accuracy to identify charge state transitions. The patches can then be shifted around the stability diagram in order to tune a quantum dot into a desired charge state starting from an arbitrary position. One can increase the success rate of this autotuning procedure by considering arrays of connected patches. By finding the minimal array size that provides high success rates, we show that the experimental measurement costs can be significantly reduced by covering only small regions of the stability diagram. The number of parameters in the feed-forward neural networks required for this autotuning procedure is sufficiently small to make their implementation possible in present-day memristor crossbar arrays [28,29,33]. Further, we demonstrate that these parameters can be trained on synthetic (simulated) stability diagrams, while achieving excellent performance in classifying transition lines from real experimental silicon metal-oxide-semiconductor quantum dot devices. Our work thus opens the possibility of taking advantage of the high speed and energy efficiency of memristor crossbar arrays [32,33], which could be integrated as part of a larger on-chip control system for QD autotuning in the near future.

II. SUPERVISED LEARNING FOR STABILITY DIAGRAMS
In order to pursue a machine learning approach to the problem of quantum dot (QD) autotuning, we need to define training and testing data sets, and an appropriate neural network architecture. Our end goal is to use supervised learning to classify transition lines in small patches of the stability diagrams of silicon metal-oxide-semiconductor QD devices. Since experimental data for such stability diagrams is expensive to obtain, we aim to train our neural network architectures on synthetic data, obtained from a simulation package developed by M.-A. Genest [21]. This synthetic data approximates experimental diagrams with noise effects after some signal processing, and provides "ground truth" labels for the charge stability sectors which can be used for training.
The machine learning architecture that we propose is a simple feed-forward neural network (FFNN), with a single output neuron that indicates whether a transition line is found in the input patch of data or not. In the following, we explore FFNNs with a very limited number of trainable parameters, roughly commensurate with the number of memristors expected in high-density crossbar arrays available in the near future.

A. Data: experimental and synthetic
Our experimental data is obtained from a silicon metal-oxide-semiconductor QD device, as previously discussed in [34]. The upper panel of Fig. 1(a) shows the corresponding setup of gate electrodes, creating a potential landscape along the green arrow. Two plunger gates at the QD and a connected reservoir (R) are denoted as G1 and G2. The resulting potential landscape for the carriers (electrons or holes) is sketched in the lower panel of Fig. 1(a), where the low-potential island corresponds to the QD in which carriers are trapped. We consider a unipolar QD device, which can only trap either electrons or holes. While all of the following statements hold equally for holes, we focus on electrons throughout the paper. The number of electrons trapped in the single QD defines its charge state, which can be tuned most efficiently via the gate G1 controlling the depth of the QD potential. Neighboring gates, such as G2 which defines the electron reservoir connected to the QD, can additionally affect the QD charge state through cross-capacitance effects [34].
Transitions between different charge states can be measured via a single-electron transistor (SET) which is tuned such that a measured current I SET is sensitive to potential changes. Changing the charge state of the QD by adding or removing an electron causes abrupt jumps in I SET . The current I SET can be measured as a function of the voltages at the QD plunger gate (V 1 applied at gate G1) and the electron reservoir gate (V 2 applied at gate G2), while keeping all other gate voltages fixed, providing two-dimensional charge stability diagrams [2]. Transitions between different charge states appear as distinct lines in the stability diagram and must be identified in order to tune the QD [11,15,26,34].
The upper panel of Fig. 1(b) shows a measured stability diagram where an additional oscillating background, caused by cross-capacitance effects of G1 and G2 acting on the SET, impedes detection of the transition lines. This background is specific for the considered measurement setup [34] and caused by the absence of a compensation procedure such as dynamic feedback control [35]. It can be removed via a signal processing algorithm as discussed in [26], transforming the stability diagram into a binary image, see lower panel of Fig. 1(b).
After the signal processing, noise effects still appear in the transformed diagrams. This noise is due to imperfections in the experimental setup, such as the SET coupling to external charges which are not part of the quantum dot [26,34]. It is these noisy transition lines, which lie in the stability diagrams after signal processing, that we aim to detect with the FFNN of the next section. Detecting the transition lines in the binary image is a general approach and can be applied to any experimental setup after a setup-specific signal processing algorithm has been applied to the raw measurement outcome.
It would be preferable to have a large database of labelled experimental stability diagrams with which to train supervised machine learning strategies such as the considered FFNN. Since a suitable database is not available, however, we propose to train the networks on synthetic data. We use the numerical algorithms discussed in [21] to simulate post-signal-processed stability diagrams, which include noise effects similar to those found in experimental measurements, see App. A for details. This enables us to create a large data set of synthetic stability diagrams, which include the clear presence of transition lines [see Fig. 1(c)]. This can then be transformed into a training set for supervised learning on small patches of pixels, which include the ground-truth labels corresponding to the presence or absence of a transition line in the patch. To discuss this further, we must first examine in more detail the specific neural network architecture used for patch classification.

FIG. 1. (a) The QD is controlled via the electrostatic gate G1 and connected to an electron reservoir (R) whose electronic density is controlled via gate G2. The SET is located next to the QD and enables the detection of transition lines. Along the green arrow a potential landscape is created which is tuned to trap carriers in the QD. A sketch of the potential is shown in the lower panel, where the potential at the QD and at the reservoir can be tuned via gates G1 and G2, respectively, influencing the number of carriers trapped in the QD. (b) The experimentally measured stability diagram (upper panel) contains a strong oscillating background, where transition lines can be seen as sudden jumps. By applying a signal processing algorithm, the background can be removed and a binary stability diagram shows the transition lines with additional noise effects (lower panel). (c) A numerical algorithm is used to create synthetic stability diagrams which simulate the experimental diagrams after the signal processing. Realistic noise effects are applied to transition lines in the diagram (upper panel), while the ground truth data (lower panel) can be extracted containing the transition lines without noise. (d) A small feed-forward neural network is trained on patches of the synthetically created stability diagrams to detect transition lines and is then applied on patches of experimental stability diagrams after signal processing. The network input is given by the pixels in the small patch and the output is a single neuron indicating whether a transition line is detected in the patch or not. We consider different numbers of hidden neurons in one or two layers between input and output, with the intention of miniaturizing the neural network and thereby its computational complexity.

B. Neural network training strategy
Motivated by on-chip integration of the autotuning inside the cryostat, we wish to explore the performance of supervised learning tasks on the smallest artificial neural networks possible. Therefore, in this section we restrict ourselves to small FFNNs with one or two hidden layers between the input and the output layer, as illustrated in Fig. 1(d). We furthermore restrict the number of hidden neurons to small values, which puts limitations on the complexity of the network. In order to know which small network architectures are useful, we explore their performance for a simple classification task on the smallest possible patches of the binary stability diagram after signal processing. The input neurons of the network correspond to pixels in the patch, and a single output neuron is used for a binary classification of whether a transition line is detected or not.
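A minimal NumPy sketch of such an architecture, assuming a 5 × 5 input patch and a single hidden layer of ten neurons (the layer sizes here are one representative choice; the architectures actually studied are listed in Tab. I), could look as follows:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class TinyFFNN:
    """5x5 input patch -> one hidden layer -> single sigmoid output."""
    def __init__(self, n_in=25, n_hidden=10, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.5, size=(n_hidden, n_in))
        self.b1 = np.zeros(n_hidden)
        self.w2 = rng.normal(0.0, 0.5, size=n_hidden)
        self.b2 = 0.0

    def forward(self, patch):
        x = np.asarray(patch, dtype=float).ravel()   # flatten the patch
        h = sigmoid(self.W1 @ x + self.b1)
        return sigmoid(self.w2 @ h + self.b2)        # P(transition line)

    def n_params(self):
        return self.W1.size + self.b1.size + self.w2.size + 1

net = TinyFFNN()
p = net.forward(np.zeros((5, 5)))
# 25*10 + 10 + 10 + 1 = 271 learnable parameters in total
```

The parameter count of such a network, a few hundred weights, is the quantity that must fit within a crossbar array.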
As experimental stability diagrams contain large areas without transition lines [see Fig. 1(b)], a data set of randomly extracted patches contains many more empty patches than patches with transition lines. To compensate for this imbalance, we weight patches with transition lines more strongly in the training procedure. When evaluating the performance of the FFNN classifier in the next section, we consider the total accuracy on the full test set, as well as the accuracies for correctly classifying only patches with and without transition lines.
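The class weighting described above can be realized, for example, as a weighted binary cross-entropy loss; the weight value below is an arbitrary illustration, not the one used in our training:

```python
import numpy as np

def weighted_bce(p, y, pos_weight=4.0, eps=1e-12):
    """Binary cross-entropy with an extra weight on the rare positive
    class (patches containing a transition line)."""
    p = np.clip(p, eps, 1.0 - eps)
    per_sample = -(pos_weight * y * np.log(p)
                   + (1.0 - y) * np.log(1.0 - p))
    return per_sample.mean()

y = np.array([1.0, 0.0, 0.0, 0.0])  # one line patch among empty ones
p = np.array([0.3, 0.1, 0.1, 0.1])  # the positive is badly misclassified
# with pos_weight > 1 the misclassified line patch dominates the loss
loss_weighted = weighted_bce(p, y, pos_weight=4.0)
loss_plain = weighted_bce(p, y, pos_weight=1.0)
```

The same effect is available in PyTorch via the `pos_weight` argument of `BCEWithLogitsLoss`.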
Finally, by shifting patches across the diagram via an algorithm driven by the classification outcome, the quantum dot can be tuned into any desired charge state [15,26]. As only charge transition lines can be detected, but not the absolute charge value, this shifting algorithm needs to find the regime where the quantum dot is empty, which is reached when no more transition lines can be found. From this empty reference point the QD can be filled with the desired number of electrons by crossing the corresponding number of transition lines. We discuss this algorithm as a step towards on-chip autotuning in Sec. III B.

FIG. 2. The total test accuracy (blue) is considered together with the accuracy for correctly classifying empty patches (orange) or patches including a transition line (green). Five feed-forward neural networks with a single hidden layer of 10 neurons are trained over 500 epochs for each data point. Accuracies are averaged over training epochs 450 to 500 to ensure convergence for larger patch sizes. The results of all five networks are averaged, with error bars denoting standard deviations.

III. RESULTS
In this section, we begin by discussing the detection of transition lines in small patches of the stability diagram using small FFNNs. In what follows, we train the network on a data set of 80000 patches which are extracted from 800 numerically created synthetic stability diagrams at random positions. For the training we use the Adam optimizer [36] on a cross-entropy loss function, applying sigmoid activation functions in the hidden and output layers. The output corresponds to the probability of detecting a transition line, and we round it to zero or one to obtain a binary outcome. The trained network is then tested on 2700 patches extracted from random positions of 27 experimentally measured stability diagrams after signal processing. These neural network classifications are then used to define a shifting algorithm for small patches, capable of autotuning a single QD into a desired charge state when starting from a random position in the stability diagram.
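For illustration, the following self-contained sketch trains such a 25-10-1 network on toy patches. It substitutes plain gradient descent for Adam and a crude random-patch generator for the simulator of [21], so it reproduces the training procedure only schematically:

```python
import numpy as np

rng = np.random.default_rng(1)

def make_patch(has_line, L=5, noise=0.05):
    """Toy stand-in for the simulator in [21]: a line patch has one
    fully bright column, an empty patch only sparse noise pixels."""
    patch = (rng.random((L, L)) < noise).astype(float)
    if has_line:
        patch[:, rng.integers(L)] = 1.0
    return patch.ravel()

def make_set(n):
    y = rng.integers(0, 2, size=n).astype(float)
    X = np.stack([make_patch(bool(t)) for t in y])
    return X, y

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# 25 -> 10 -> 1 network with sigmoid activations, trained on the
# cross-entropy loss (plain gradient descent replaces Adam here)
W1 = rng.normal(0.0, 0.5, (10, 25)); b1 = np.zeros(10)
w2 = rng.normal(0.0, 0.5, 10);       b2 = 0.0

X, y = make_set(400)
lr = 0.5
for _ in range(3000):
    H = sigmoid(X @ W1.T + b1)           # hidden activations
    p = sigmoid(H @ w2 + b2)             # line probability
    d2 = (p - y) / len(y)                # dL/dz2 for sigmoid + BCE
    dH = np.outer(d2, w2) * H * (1 - H)  # backprop to hidden layer
    W1 -= lr * dH.T @ X; b1 -= lr * dH.sum(0)
    w2 -= lr * H.T @ d2; b2 -= lr * d2.sum()

Xt, yt = make_set(400)   # fresh patches as a held-out test set
acc = np.mean((sigmoid(sigmoid(Xt @ W1.T + b1) @ w2 + b2) > 0.5) == yt)
```

On this easily separable toy data the tiny network reaches high test accuracy; the accuracies quoted in the text refer to the real synthetic/experimental data sets, not to this sketch.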

A. Detection of transition lines
To begin our supervised learning procedure, we find a suitable patch size by analyzing the accuracy reached by a network consisting of a single hidden layer with ten hidden neurons. Fig. 2 shows the accuracies reached when classifying patches of N = L × L pixels with varying L. The total accuracy and the accuracy for correctly classifying empty patches always reach ∼ 96%. In contrast, the accuracy for correctly detecting transition lines is lower and shows a strong dependence on the patch size. We find the best performance for L = 5, where all three accuracies are high. Therefore, we focus on this patch size, which dictates the size of the input layer of the FFNNs in the following.
With the size of the input and output layers thus fixed, the total number of learnable parameters in the FFNN is dictated by the number of hidden units. In Fig. 3(a) we analyze the dependence of the classification accuracy on different hidden unit architectures. We consider different network setups, as summarized in Tab. I, each trained on the synthetic training set.
As illustrated in Fig. 3(a), the total accuracy and the accuracy for correctly classifying empty patches reaches ∼ 96% for all architectures in Tab. I. However, a clear dependence on the network architecture is observed in the accuracy for classifying patches with transition lines. Interestingly, optimal results are found for networks with small numbers of learnable parameters [(1) and (2)], affirming that we are not limited by the restricted number of hidden neurons. The accuracy decreases for larger network setups with more learnable parameters.
In addition, we compare the accuracies reached by the neural network classifier with results from a simple classifier based on the amount of bright pixels in the patch. If the amount of bright pixels crosses a certain threshold defined via a pixel number, the patch is classified as containing a transition line. Fig. 3(b) shows the accuracies reached with this pixel classifier as a function of the threshold pixel number. Good performances are found when choosing the threshold at 2 or 3 pixels. In Fig. 3(a) we add the results of the pixel classifier for a direct comparison and observe that small networks, especially architectures (1) and (2), outperform the pixel classifier. This further justifies the use of FFNNs to learn to detect structure in the small patches.
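The pixel classifier amounts to a single threshold on the bright-pixel count; the threshold value below follows the well-performing choice discussed above:

```python
import numpy as np

def pixel_classifier(patch, threshold=3):
    """Classify a binary patch as containing a transition line when the
    number of bright pixels reaches the threshold."""
    return int(np.count_nonzero(patch) >= threshold)

line_patch = np.zeros((5, 5)); line_patch[:, 2] = 1    # 5 bright pixels
empty_patch = np.zeros((5, 5)); empty_patch[0, 0] = 1  # 1 noise pixel
```

Since this baseline ignores the spatial arrangement of the bright pixels, it cannot distinguish a line from an equally bright cluster of noise, which is where the FFNNs gain their advantage.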
Finally, to confirm that training the networks for 200 epochs is sufficient, Fig. 3(c) shows the accuracies achieved as a function of the number of training epochs, for network architectures (2) and (6) trained on a 5 × 5 pixel patch. We find convergence after less than 100 training epochs, justifying our choice.
To conclude this section, we refer again to Fig. 3(a) to emphasize that optimal performance is consistent with network architecture (2). As illustrated there, using this FFNN on a 5 × 5 pixel input patch, we reach ∼ 96% for all testing accuracies. Given the fact that this high test accuracy for classifying experimental data occurs for a small and simple network structure, particularly one that is trained on simulated synthetic data, we argue that the results are quite promising. Therefore, we emphasize this architecture in the next section, to define a shifting algorithm for small patches, with the end goal of autotuning a QD into a desired charge state in the stability diagram.

B. Tuning the device with a patch shifting algorithm
In this section we consider a shifting algorithm which we develop for the small patches considered above. With this algorithm we tune a single QD into a desired charge state when starting at a random position in the stability diagram. The full algorithm is discussed in detail in App. B and depends on the classification outcome of the FFNN for the patch at each position. As the transition lines only provide information about changes in the charge state but not the absolute charge of the QD, we first need to find a reference point in the diagram [11,15,26,34]. Similar to [15,26] we use the empty QD as a reference point, which is reached when no more transition lines are detected while shifting the patch to the left.
From this empty reference point, the QD can be filled with the desired number of electrons by crossing the corresponding number of transition lines, as identified by the neural network strategy discussed in the previous section. Below, we focus on finding the single-electron regime, where the QD can be interpreted as a qubit.
An example of the shifting algorithm finding the single-electron regime is shown in Fig. 4(a), where a patch of 5 × 5 pixels is classified with an FFNN with architecture (2) (see Tab. I). A patch is placed at a randomly chosen initial position (red patch, upper panel) and shifted towards the empty regime (gray patches, center panel). The empty reference point is found in the upper left corner of the diagram. From this reference point the patch is shifted to the right until the first transition line is detected, filling the dot with a single electron (green patches, lower panel). The patch follows the line for a few steps to avoid misclassifications due to noise effects. The target point in the single-electron regime is indicated by a green cross in the lower panel of Fig. 4(a) and is chosen to the right of the detected transition line.
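A heavily simplified version of the shifting logic can be sketched as follows. It assumes perfectly vertical transition lines and uses the bright-pixel threshold as a stand-in for the FFNN, whereas the actual algorithm (App. B) copes with noisy, tilted lines and line-following:

```python
import numpy as np

def has_line(patch, threshold=3):
    # stand-in for the FFNN classifier of Sec. III A
    return np.count_nonzero(patch) >= threshold

def tune_to_single_electron(diagram, L=5, start_col=None):
    """Simplified 1D version of the shifting algorithm: slide an LxL
    patch left to find the empty reference point, then right until the
    first transition line is crossed."""
    rows, cols = diagram.shape
    row = (rows - L) // 2
    col = cols - L if start_col is None else start_col
    patch = lambda c: diagram[row:row + L, c:c + L]

    # 1) shift left; remember the leftmost column where a line was seen
    leftmost_line = None
    for c in range(col, -1, -1):
        if has_line(patch(c)):
            leftmost_line = c
    if leftmost_line is None:
        return None                      # no line found: tuning failed
    # empty reference point: just left of the leftmost line
    c = max(leftmost_line - L, 0)
    # 2) shift right until the first transition line is detected again
    while c + L <= cols and not has_line(patch(c)):
        c += 1
    return c + L                         # target just right of the line

# toy diagram: vertical transition lines at columns 20 and 30
diagram = np.zeros((40, 40))
diagram[:, 20] = 1
diagram[:, 30] = 1
target = tune_to_single_electron(diagram)
```

On this toy diagram the returned target column lies just to the right of the leftmost transition line, i.e. inside the single-electron sector.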
To analyze the performance of the shifting algorithm, we consider 25 experimentally measured stability diagrams after signal processing where the true single-electron regime can be clearly identified. For each diagram we choose ten random initial positions and check how successfully the shifting algorithm finds the empty reference point and the single-electron regime.
We perform this analysis on all the different FFNN architectures listed in Tab. I. Fig. 4(b) shows the success rates averaged over the ten initial positions. Error bars denote the standard deviation when averaging over the initial positions, and show that the dependence on the latter is small. The empty reference point is found successfully in ∼ 98% of the diagrams, while the success rate for finding the single-electron regime is rather small with ∼ 53%. Finding the empty QD regime is easier as in most cases the left corner of the diagram is reached, where the algorithm terminates. This is a specific effect of considering finite measured stability diagrams and is not expected to appear when directly tuning the experimental device according to the shifting algorithm. The fact that no clear dependence on the network architecture can be observed suggests that the performance of the shifting algorithm is the limiting factor.
The above results demonstrate that the simple control task of identifying charge state transition lines can be parlayed into a successful algorithm for tuning to a specific charge sector of the QD stability diagram, while using very small neural networks suitable, e.g., for implementation on vector-matrix multiplication accelerators. However, an obvious caveat of choosing the small patch size can be seen in the diagrams in Fig. 4(a) and Fig. 1(b). The transition lines are interrupted by gaps caused by the experimental measuring procedure, see App. A for details. These gaps are larger than the classified patch, so that transition lines can be missed. While striving to keep the input of the FFNN small, this issue can be overcome, for example, by extending the algorithm to couple adjacent patches. To illustrate this, we consider an array of K × K patches, which reduces the risk of the patch translation algorithm hitting a gap in the transition lines, while only slightly increasing the computational costs and preserving the high accuracy reached in the classification of small patches. Each patch in the array is classified individually by our small FFNN, and the shifting algorithm in this case now depends on whether a transition line is found in any of the individual patches, see App. B for details.

FIG. 5. (b) Success rate for finding the empty (blue) and the single-electron regime (orange) using an array of 5 × 5 pixel patches which are classified with a feed-forward neural network with structure (2) (see Tab. I), as a function of array size K. Arrays are square arrangements of patches, with K stating the number of patches in one direction. We consider 25 experimentally measured stability diagrams after signal processing and average the success rates over ten start points for each diagram, with error bars denoting standard deviations. Initial points are chosen randomly but are the same for all array sizes. Empty reference points are always found with good accuracy, while the success rates for the single-electron regime show a strong dependence on the array size.

Fig. 5(a) illustrates the shifting algorithm for an array of 4 × 4 patches with 5 × 5 pixels per patch on three different stability diagrams. The whole array is shifted to first find the empty reference point and then detect the first transition line, leading to a target point in the single-electron regime indicated by the green cross. Fig. 5(b) shows the success rate of the shifting algorithm as a function of the array size K, where the same data as in Fig. 4(b) is used. An FFNN with architecture (2) is used for the classification of the 5 × 5 pixel patches. The empty reference point is found successfully in ∼ 99% of the diagrams, while finding the single-electron regime shows a dependence on the array size. An increase in the success rate can be observed already when going from K = 1 (single patch) to K = 2, and a further increase is found for larger K. At K = 4 the success rate reaches its maximum and saturates, before it appears to decrease again for K ≥ 8. The tuning into the single-electron regime is successful in ∼ 75% of the diagrams for K = 4, which is a relatively large success rate.
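The any-patch detection rule, and how a K × K array bridges a gap that a single patch would miss, can be illustrated schematically (again with the bright-pixel count standing in for the FFNN):

```python
import numpy as np

def line_in_patch(patch, threshold=3):
    return np.count_nonzero(patch) >= threshold   # FFNN stand-in

def line_in_array(diagram, row, col, K=2, L=5, threshold=3):
    """A transition line counts as detected if ANY of the KxK patches
    of LxL pixels reports one."""
    return any(
        line_in_patch(diagram[row + i * L: row + (i + 1) * L,
                              col + j * L: col + (j + 1) * L], threshold)
        for i in range(K) for j in range(K)
    )

# transition line at column 10, interrupted by a gap larger than a patch
diagram = np.zeros((30, 30))
diagram[:, 10] = 1
diagram[8:13, 10] = 0          # 5-pixel gap (rows 8..12)

single = line_in_patch(diagram[8:13, 8:13])  # patch sits on the gap
array = line_in_array(diagram, row=8, col=8, K=2)  # array bridges it
```

The single patch lands exactly on the gap and misses the line, while one of the four patches in the 2 × 2 array still covers a bright segment.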
For different patch sizes L with similarly high classification accuracies (see Fig. 2), we find similar success rates if we adapt the array size K such that the total amount of (L × K) 2 pixels in the array is kept constant. We hope this result will encourage the further exploration of small FFNNs to autotune single QD devices.

IV. DISCUSSION AND OUTLOOK
We have demonstrated that the complex task of autotuning a quantum dot (QD) into a target charge state can be robustly achieved by harnessing the power of very small feed-forward neural networks (FFNNs). Such FFNNs are the workhorses of supervised machine learning, and enable a data-driven approach akin to transfer learning, where networks trained on simulated data can be used for classification of experimental QD data. We have shown in particular that a classification approach, where such small neural networks are trained to detect the presence or absence of charge transition lines in small patches of stability diagrams, can be transferred with excellent accuracy to experimental stability diagrams obtained from silicon metal-oxide-semiconductor quantum dot devices. Further, by combining this classification approach with an algorithm that shifts arrays of small input patches, we have shown that a single QD can successfully be autotuned into the single-electron regime, when starting from a random configuration of gate voltages. With our small FFNNs, we reached classification accuracies, as well as final success rates, which are comparable to previous results with deep (convolutional) neural networks [11,15]. The success of small input patches further suggests that the experimental cost of measuring the stability diagram can be significantly reduced. Indeed, with our shifting algorithm we are able to consider smaller regions than in comparable works [15,26].
We have found that the FFNNs used for this task require a very small number of learnable parameters to achieve a maximal classification accuracy. This suggests that such neural networks could be implemented on state-of-the-art memristor crossbar arrays [28,29,33]. Hence, our work provides a first step towards developing an energy-efficient autotuning device, which could conceivably be implemented in an on-chip control system for QDs. In order to further pursue this idea, the performance of real memristor crossbar arrays, which is expected to be influenced by imperfections [37,38], will need to be carefully analyzed on the classification task at hand.
It is particularly important to emphasize that throughout this work we have trained the small FFNNs on input patches extracted from synthetic stability diagrams generated with a numerical simulation package [21]. Due to the relative expense involved in obtaining experimental data on QDs, the success of this transfer learning approach is an important step in further developing our machine learning strategy. It also exposes a clear opportunity for creating a larger community data set for training neural networks to detect charge transition lines in QDs, similar to [20]. Eventually, such data sets - whether synthetic or experimental - will play an important role in standardizing and benchmarking machine-learning-based QD control and autotuning.
Finally, we remark that in this work we applied our charge transition line detection algorithm on stability diagrams that have already undergone significant signal processing according to [26]. With our successful machine learning approach, this signal processing step becomes computationally more expensive than the classification task. It would therefore be interesting to further refine our data-driven strategy, to eventually enable a similar neural-network based classification to occur directly on raw experimental data. It is feasible that significant miniaturization of such tasks could lead to a highly efficient on-chip control system for autotuning quantum dots in the near future.

DATA AND IMPLEMENTATION
The code to create synthetic stability diagrams is obtainable at GitHub 1 , and the data set used for training is available at GitHub 2 . The feed-forward neural networks are implemented and trained using PyTorch [39] and NumPy [40]; figures are created using Matplotlib [41].

Appendix A: Creation of synthetic stability diagrams

The training of the feed-forward neural network to detect transition lines in patches of stability diagrams requires a large training data set. As it is hard to create such a large set of experimentally measured stability diagrams, we create the training set numerically. However, we obtain the reported classification accuracies by applying the trained network to an experimentally measured test data set.
To obtain stability diagrams from the experimental device, the quantum dot is coupled to a single-electron transistor (SET). A current I SET running through the SET is measured while the voltages at two gate electrodes are swept to obtain two-dimensional diagrams. The remaining gate electrodes are kept at fixed voltages; two-dimensional diagrams bring many advantages for the auto-tuning procedure compared to one-dimensional single-gate scans [26].
The current I SET shows an oscillatory behavior in which transition lines appear as jumps in the oscillations [26,34]. Since the transition lines are hard to extract from this raw data, we separate them from the oscillatory current via a signal processing algorithm as discussed in [26]. In this algorithm, the raw signal is first sent through a high-pass filter to remove background effects. Afterwards, the frequency of the oscillations is extracted via a Hilbert transform. At charge state transitions, which appear as jumps in the oscillations, the frequency shows negative peaks that can be identified by a threshold method, with the threshold adapted to the obtained frequency distribution. This algorithm provides a binary mapping of the stability diagrams in which the transition lines are extracted from the raw data.
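The pipeline described above can be sketched as follows. This is a minimal illustration, not the exact implementation of [26]; the function name and the parameters `cutoff` and `n_sigma` are illustrative assumptions:

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def extract_transitions(i_set, fs=1.0, cutoff=0.02, n_sigma=3.0):
    """Sketch: high-pass filter, instantaneous frequency via the
    Hilbert transform, then a threshold on negative frequency peaks."""
    # High-pass filter to remove slow background effects.
    b, a = butter(3, cutoff, btype="high", fs=fs)
    filtered = filtfilt(b, a, i_set)
    # Instantaneous frequency from the analytic signal's phase.
    phase = np.unwrap(np.angle(hilbert(filtered)))
    freq = np.diff(phase) * fs / (2.0 * np.pi)
    # Charge transitions show up as negative frequency peaks; the
    # threshold is adapted to the observed frequency distribution.
    threshold = freq.mean() - n_sigma * freq.std()
    return freq < threshold  # binary mask: True marks a transition
```

Applied line by line to the raw measurement, such a mask yields the binary stability diagrams discussed below.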
While those binary diagrams already contain the extracted charge transition lines which are needed to tune the device into a desired charge state, the diagrams also contain noisy pixels due to imperfections in the experimental measurement procedure and in the signal processing algorithm. The main imperfections that appear are the following:
• The SET couples to an external charge: an additional line appears in the diagram which is not a transition line.
• The quantum dot couples to an external charge: the transition lines experience a sudden jump and continue at a shifted position.
• Measurements are performed too fast: if the tunneling rate is too low, the electrons do not get through the barrier during the measurement, leading to a spreading and fading of transition lines at low voltages.
• When the derivative of the oscillating signal is close to zero, which is the case at the top and bottom of the oscillations, the negative peaks in the frequency do not appear: quasi-periodic gaps are found in the transition lines.
• The measurement sensitivity can decrease: spots of noisy pixels appear.
All these effects make the detection of transition lines in the binary diagrams harder, which is why we use the feed-forward neural network to detect them. The network performance is tested on experimental diagrams after signal processing, but we train it on numerically created synthetic diagrams that simulate the noise effects discussed above. The synthetic training data can be generated far more efficiently than experimental data and additionally comes with ground truth data containing only the transition lines without noise. This ground truth data provides the labels required for the supervised training of the network.
To create the synthetic training data, we use the algorithm discussed in [21]. First, ideal noiseless diagrams are created which contain the transition lines simulating a given experimental setup. The positions of the transitions are calculated via the Thomas-Fermi approximation and hence show a realistic orientation and spacing [11,12,21]. This simulation directly provides binary diagrams which are similar to the experimental diagrams after signal processing. The noise effects discussed above are then added numerically to the ideal data, so that synthetic diagrams close to the experimental ones are created. Additionally, we add uniform background noise by turning each pixel bright with probability 0.01.
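As a minimal sketch, the background-noise step can be implemented as follows (the function name is illustrative):

```python
import numpy as np

def add_background_noise(diagram, p=0.01, rng=None):
    """Turn each pixel of a binary stability diagram bright with
    probability p; pixels that are already bright stay bright."""
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.random(diagram.shape) < p
    return diagram | noise
```

The same element-wise approach extends to the other noise effects, which act on lines or regions instead of single pixels.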
Keeping track of the transition line transformations when applying the noise effects yields the ground truth data. We take into account that some noise effects, such as gaps appearing in the lines or distortions of lines, also apply to the ground truth data, while other effects, such as additional lines, noisy spots, or the background noise, do not affect the ground truth.
With this algorithm we create 20 ideal stability diagrams, with the parameters of the simulated experimental setup chosen randomly within a specific regime. We then apply 20 randomly chosen combinations of the discussed noise effects to each ideal diagram, yielding 400 synthetic stability diagrams. As the concentration of transition lines is high in this synthetic data, we additionally create 400 stability diagrams from empty diagrams without lines, to which we randomly apply the noise effects of the SET coupling to an external charge or the measurement losing its sensitivity, together with the background noise. This yields a total of 800 synthetic stability diagrams from which patches can be extracted to train the feed-forward neural network.
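The extraction of training patches from the synthetic diagrams can be sketched as follows; the labelling rule (a patch is positive if its ground-truth counterpart contains any line pixel) and all names are illustrative assumptions, not the exact procedure of the main text:

```python
import numpy as np

def extract_patches(diagram, ground_truth, patch_size, stride):
    """Cut a diagram into patch_size x patch_size patches and label
    each one by whether its ground truth contains a transition line."""
    patches, labels = [], []
    h, w = diagram.shape
    for i in range(0, h - patch_size + 1, stride):
        for j in range(0, w - patch_size + 1, stride):
            patches.append(diagram[i:i + patch_size, j:j + patch_size])
            labels.append(ground_truth[i:i + patch_size, j:j + patch_size].any())
    return np.stack(patches), np.array(labels)
```

Because the noise effects are applied to the diagram but only partly to the ground truth, the labels remain clean even for heavily corrupted patches.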

Appendix B: Shifting algorithm
The quantum dot (QD) device is tuned into the single-electron regime by shifting the classified small patch over the stability diagram. The algorithm for the shifting procedure depends on the classification outcome of the considered patch and can be divided into two parts. First, the empty reference point of the QD needs to be found, from which the device can be tuned into the single-electron regime in a second step. This algorithm is similar to [15,26], but needs to be adapted for the small patch sizes.
In the following we generally talk about an array of K × K patches as considered in the main text, where the special case of a single patch corresponds to K = 1. The patch has a size of L × L pixels.
To find the empty regime we start with a patch array at a random position in the stability diagram. We then shift the array K × L pixels to the left and K × L pixels upwards as long as no transition line is detected. As long as the upper corner of the diagram has not been reached, we apply periodic boundary conditions in the x-direction when searching for the first transition line.
If a line is detected in one patch of the array, we follow the line by shifting the array one pixel to the left and L pixels upwards. If the upper boundary of the diagram is reached, we shift the patch L pixels to the left and lose the line. To avoid ending up in a wrong regime due to misclassifications of noise, we only declare the transition line as found if the same patch detects the line ten steps in a row. Otherwise, if the line is not detected anymore, we continue shifting the patch diagonally with periodic boundary conditions until the next transition line is detected.
When the first transition line is found, we continue following the line until it fades out and is not detected anymore. We then shift the array K × L pixels to the left until any patch detects the next line, which we follow again until it fades out. When no line is detected after moving 40/K steps of K × L pixels to the left, no transition lines are found anymore and the dot is empty. We define this position as the empty reference point.
During the whole procedure of finding the empty-dot regime, the process is terminated if the upper left corner of the diagram is reached and this corner is defined as being in the empty regime. All shifts are only applied as long as the boundaries of the stability diagram are not reached. If a shift would cross one of these boundaries, it is not applied in this direction.
Having found the empty-dot regime, the next task is to fill the dot with a single electron, which is done by shifting the array until one transition line is crossed. Since the empty reference point is to the left of all transition lines, the array is shifted K × L pixels to the right and two pixels down as long as no line is detected.
If any patch in the array detects a transition line, we follow the line by shifting the array one pixel to the right and L pixels downwards. When the bottom boundary of the diagram is reached, we continue shifting the patch L pixels to the right. We apply the same procedure as before to avoid ending up in the wrong regime due to misclassified noise and only declare detecting a line when a single patch detects it five times in a row. If this is not the case, we continue shifting the array until the next line is detected. Since the shifting towards the reference point moves the patch to the upper part of the diagram, the chances of hitting the bottom boundary before finding a transition line are very low.
When the first transition line is found starting from the empty regime, we move the array diagonally 2K × L pixels to the right and downwards, thereby finding a position in the single-electron regime. If a different charge state is desired, the procedure can be iterated analogously until the desired number of transition lines is crossed. The shifting process for filling the dot is terminated whenever a corner of the diagram is reached, and the position at this corner is defined as the single-electron regime.
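The two phases above can be sketched for the special case of a single patch (K = 1) as follows. This simplified sketch omits the periodic boundary conditions, the confirmation counter against misclassified noise, and the corner terminations; the patch classifier is replaced by a stand-in that simply checks the binary diagram, and all names are illustrative:

```python
import numpy as np

def patch_has_line(diagram, x, y, L):
    """Stand-in for the trained FFNN classifier: check whether the
    L x L patch at column x, row y contains a bright pixel."""
    return diagram[y:y + L, x:x + L].any()

def find_empty_reference(diagram, x, y, L, max_steps=1000):
    """Phase one: move left and up past every detected line until
    none remains; the upper-left corner counts as empty."""
    for _ in range(max_steps):
        if x == 0 and y == 0:
            return x, y
        if patch_has_line(diagram, x, y, L):
            # Follow the line: one pixel left, L pixels up.
            x, y = max(x - 1, 0), max(y - L, 0)
        else:
            # No line: large diagonal step to the left and upwards.
            x, y = max(x - L, 0), max(y - L, 0)
    return x, y

def fill_single_electron(diagram, x, y, L, max_steps=1000):
    """Phase two: from the empty reference point, move right (and
    slightly down) until a line is found, then step past it."""
    h, w = diagram.shape
    for _ in range(max_steps):
        if patch_has_line(diagram, x, y, L):
            # Cross the line with a diagonal step of 2L pixels.
            return min(x + 2 * L, w - L), min(y + 2 * L, h - L)
        x, y = min(x + L, w - L), min(y + 2, h - L)
    return x, y
```

For a diagram with a single vertical transition line, the first function terminates to the left of the line (or in the corner) and the second crosses it once, landing in the single-electron regime.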
Note that the terminations upon reaching corners, which we apply in both steps of the shifting algorithm, are specific to the case of having finite pre-measured stability diagrams. Generally, when directly tuning the experimental device according to the shifting algorithm, the voltage limitations are not expected to be reached.