Unsupervised Learning of Rydberg Atom Array Phase Diagram with Siamese Neural Networks

We introduce an unsupervised machine learning method based on Siamese neural networks (SNNs) to detect phase boundaries. The method is applied to Monte Carlo simulations of Ising-type systems and Rydberg atom arrays; in both cases the SNN reveals phase boundaries consistent with prior research. By combining the power of feed-forward neural networks with unsupervised learning and the ability to detect multiple phases without prior knowledge of their existence, the method provides a powerful tool to explore new and unknown phases of matter.


Introduction
Machine learning (ML) algorithms enable computers to learn from experience and generalize their gained knowledge to previously unknown settings. It is perhaps the most transformative technology of the early 21st century. The ability to recognize objects in images [1] or translate languages [2] without being explicitly programmed for the task highlights the enormous potential of machine learning.
In recent years the physical sciences have adopted machine learning based algorithms to explore complex questions. Many methods have been designed to solve problems beyond the scope of data science, and now have the potential to revolutionize physics. The most prominent examples of promising tasks that have been tackled include finding phase transitions [3,4,5,6,7,8,9,10,11,12,13], reconstructing or simulating quantum systems [14,15,16,17,18,19,20] and rediscovering physical concepts [21,22,23,24,25,26,27,28,29,30]. These advances are summarized in review articles directed at different audiences: a didactic introduction to the most modern techniques can be found in [31], a review focused on applications of machine learning to quantum matter in [32], and a broad overview across different physical disciplines in [33].
The current manuscript contributes to the development of methods that automate the calculation of phase diagrams with little to no prior human knowledge about the nature of the underlying phases. The subfield of automated phase recognition can be subdivided into two categories: supervised and unsupervised phase recognition.
In the first case, the operating scientists are aware of the possible phases and have a rough estimate of where these phases are positioned in the phase diagram; however, they are unsure about the exact location of the phases and the transitions between them. Supervised learning of phase transitions can be based on different machine learning algorithms. It was initially introduced using convolutional neural networks [3], which are to this day the most powerful and robust tools to learn accurate physical phase boundaries. There are hybrid methods that build upon purposely mislabelling phase classes [4], methods built upon support vector machines [24], and other powerful frameworks [7,34,35,36,37,38,39]. These methods have demonstrated success across a wide range of physical systems, from simple spin lattices over strongly correlated quantum systems to lattice gauge theories [8,10,11,13,23,40,41,42,43,44,45,46,47,48,49,50].
The second category contains unsupervised phase recognition algorithms. These algorithms are useful when the researcher who is employing these tools is unaware of the underlying phase structure, meaning they do not know about the existence or location of certain phases and thus cannot supply this information to the machine learning algorithm. The simplest unsupervised phase recognition scheme is based on principal component analysis [5] and the most widely used unsupervised scheme that leverages the power of artificial neural networks is based on variational autoencoders [6]. These methods have been examined, enhanced [9,51,52,53] and successfully applied to many systems in physics and materials science [54,55,56,26,57,58,59,60,61,62,63]. Compared to supervised algorithms, unsupervised methods usually have the drawback of lacking accuracy in determining phase boundaries [6], or restricting the kinds of order parameters that can be learned [5].
We introduce an unsupervised machine learning method to discover phase transitions based on Siamese neural networks (SNN). Siamese networks were initially introduced for fingerprint and signature identification [64,65]. Instead of predicting a certain class, Siamese networks predict whether two inputs belong to the same class. Hence, these networks can be used for multi- or infinite-class classification by comparison to anchor data points whose label is known. Although Siamese networks are very powerful, they have seen little use in the physical sciences; so far they have been employed to discover symmetry invariants and conserved quantities [25]. While Siamese neural networks are supervised machine learning algorithms, our proposed phase recognition method is unsupervised, in the sense that it does not require any phase information. This apparent contradiction is reconciled in Section 3.4.
While we initially present our phase recognition method using the example of two stacked Ising models exhibiting four different phases, we demonstrate the power of this method by examining the phase diagram of the Rydberg atom array. Rydberg atom arrays are a powerful platform for experimental realizations of quantum many-body phenomena [66,67,68]. Neutral atoms are typically arranged via optical tweezers to construct various physical lattices at varying interaction strengths which give rise to rich phase diagrams [69]. Such systems have already been examined with the help of machine learning algorithms. The phase diagram has been revealed by a combined effort of unsupervised and supervised methods [70]. Experimental states have been reconstructed using neural network based tomography [71]. Ground states have been calculated [72] and simulated measurement data has been used for pre-training variational wave functions [73].
The paper is structured in the following order: we first introduce the models we are examining with our new Siamese network based framework. These models include a stacked Ising model and the Rydberg atom array. Subsequently, we describe how we prepare the input data using Monte Carlo simulations. We describe how Siamese neural networks are constructed and trained, and develop our framework to do unsupervised learning of phase transitions. Then we apply our method to both models and present the results in the form of the predicted underlying phase structure. Finally, we summarize our findings and place our method into the broader context of machine learning tools for phase recognition.

Stacked Ising Model
The Ising model on the square lattice is a simple, well-studied, and exactly solvable model from statistical physics that exhibits a phase transition. Thus, it provides the ideal starting point for benchmarking the performance of a phase recognition algorithm. Its Hamiltonian is a function of spin configurations $S = (s_1, \ldots, s_n)$,
$$H(S) = -J \sum_{\langle i,j \rangle} s_i s_j - h \sum_i s_i .$$
In the following, we focus on the ferromagnetic Ising model $J = 1$ and set the external field $h = 0$. SNN phase recognition is an algorithm that is able to detect multiple phases in an unsupervised manner. Hence, we trivially combine two Ising models by overlaying them on the same lattice, where we can tune each temperature $T$, $\tilde{T}$, or in other words the ratios $J/(k_B T)$ and $J/(k_B \tilde{T})$, independently. The combined Hamiltonian acts on lattices containing two spins per site, $S = ((s_1, \tilde{s}_1), \ldots, (s_n, \tilde{s}_n))$. Since there is no interaction between the two layers, phase transitions trivially occur along the lines $(T, \tilde{T}) \approx (\,\cdot\,, 2.269)$ and $(T, \tilde{T}) \approx (2.269, \,\cdot\,)$ in the phase diagram. In the accompanying figure, red sites indicate spin-up and grey sites spin-down; the two lattice configurations are independently sampled at temperatures $T$ and $\tilde{T}$ to produce a single stacked configuration at $(T, \tilde{T})$.
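As a concrete illustration, a stacked configuration can be assembled from two independently sampled lattices. The following numpy sketch (function and variable names are ours, not from the paper) checks that the energies of the two layers simply add, since the layers do not interact:

```python
import numpy as np

rng = np.random.default_rng(0)

def ising_energy(s, J=1.0):
    """Energy of a periodic square-lattice Ising configuration s (entries +/-1),
    counting each nearest-neighbour bond once via shifted copies of the lattice."""
    return -J * np.sum(s * (np.roll(s, 1, axis=0) + np.roll(s, 1, axis=1)))

# A stacked configuration pairs two independent lattices on the same sites.
L = 20
s1 = rng.choice([-1, 1], size=(L, L))    # layer sampled at temperature T
s2 = rng.choice([-1, 1], size=(L, L))    # layer sampled at temperature T~
stacked = np.stack([s1, s2], axis=-1)    # shape (L, L, 2): two spins per site

# Without inter-layer coupling, the total energy is the sum of the layer energies.
total = ising_energy(stacked[..., 0]) + ising_energy(stacked[..., 1])
assert np.isclose(total, ising_energy(s1) + ising_energy(s2))
```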

Rydberg Array
A common Hamiltonian that can be implemented by Rydberg arrays has a form similar to that of a transverse field Ising model, meaning it is sign-problem free and thus amenable to simulation on classical computers. The Hamiltonian acts on a collection of atoms which individually act like 2-level systems, having a ground state $|g\rangle \equiv |0\rangle$ and an excited, so-called Rydberg state $|r\rangle \equiv |1\rangle$. The atoms are subject to a long-range van der Waals (vdW) interaction of the form $V_{ij} \sim |\mathbf{r}_i - \mathbf{r}_j|^{-6}$ that penalizes atoms that are simultaneously in the Rydberg state [74].
Additionally, the atoms are subject to coherent laser fields: a detuning $\delta$, which acts like a chemical potential driving atoms into their Rydberg states, and a Rabi oscillation with frequency $\Omega$, which excites ground-state atoms and de-excites atoms in Rydberg states. The resulting Hamiltonian reads
$$H = \frac{\Omega}{2} \sum_i \sigma^x_i - \delta \sum_i n_i + \sum_{i<j} V_{ij}\, n_i n_j ,$$
where $n_i = |1\rangle\langle 1|_i$ is the occupation (number) operator and $\sigma^x_i = |1\rangle\langle 0|_i + |0\rangle\langle 1|_i$. The interaction strength is typically parametrized in terms of a Rydberg blockade radius $R_b$, which describes an effective radius within which two simultaneous Rydberg excitations are heavily penalized:
$$V_{ij} = \Omega \left( \frac{R_b}{|\mathbf{r}_i - \mathbf{r}_j|} \right)^{6} .$$
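The blockade behaviour of this parametrization can be sketched numerically (a minimal illustration assuming the convention $V_{ij} = \Omega\,(R_b/|\mathbf{r}_i - \mathbf{r}_j|)^6$; the helper name is ours):

```python
import numpy as np

def vdw_interaction(ri, rj, Rb, Omega=1.0):
    """Van der Waals interaction V_ij = Omega * (Rb / |ri - rj|)**6."""
    r = np.linalg.norm(np.asarray(ri, dtype=float) - np.asarray(rj, dtype=float))
    return Omega * (Rb / r) ** 6

# Inside the blockade radius the penalty dwarfs the Rabi scale Omega;
# far outside it, the rapid r^-6 decay renders it negligible.
assert vdw_interaction((0, 0), (0, 0.5), Rb=1.2) > 1.0
assert vdw_interaction((0, 0), (0, 3.0), Rb=1.2) < 0.01
```

At separation $|\mathbf{r}_i - \mathbf{r}_j| = R_b$ the interaction equals $\Omega$ by construction, which is exactly what makes $R_b$ a natural unit for the phase diagram axes used later.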

Monte-Carlo Simulation
The well-known single-spin-flip Metropolis algorithm is used to generate importance sampled Monte Carlo configurations of the Ising model on a square lattice of size 20 × 20 with periodic boundary conditions [77]. After initializing a random lattice, we evolve the simulation for 7168 MC steps between drawing samples. It is important to note that neural networks can pick up on any residual correlations, thus relying on conventional auto-correlation measures to determine the independence of lattice configurations is not enough. We produce 92 independent configurations at each of 100 temperatures ranging from T min = 1.53 to T max = 3.28. This naturally translates to 92 × 92 = 8464 samples for the stacked Ising model at each temperature pair (T,T ).
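A compact sketch of the single-spin-flip Metropolis update described above (our own minimal implementation; for a quick check it starts from the ordered state and uses far fewer equilibration steps than the paper):

```python
import numpy as np

def metropolis_sweep(s, T, J=1.0, rng=None):
    """One sweep (L*L attempted flips) of single-spin-flip Metropolis updates
    on a periodic square lattice of +/-1 spins."""
    if rng is None:
        rng = np.random.default_rng()
    L = s.shape[0]
    for _ in range(L * L):
        i, j = rng.integers(L, size=2)
        # Energy change of flipping spin (i, j): dE = 2*J*s_ij*(sum of 4 neighbours)
        nb = (s[(i + 1) % L, j] + s[(i - 1) % L, j]
              + s[i, (j + 1) % L] + s[i, (j - 1) % L])
        dE = 2.0 * J * s[i, j] * nb
        # Accept the flip with the Metropolis probability min(1, exp(-dE/T))
        if dE <= 0 or rng.random() < np.exp(-dE / T):
            s[i, j] = -s[i, j]
    return s

rng = np.random.default_rng(1)
s = np.ones((20, 20), dtype=int)       # ordered start for a quick check
for _ in range(200):                   # equilibration (the paper uses far longer runs)
    metropolis_sweep(s, T=1.5, rng=rng)
magnetization = abs(s.mean())          # stays large below T_c ~ 2.269
```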
For the Rydberg system, we make use of a recent Quantum Monte Carlo method [76] to generate occupation basis samples of the Rydberg Hamiltonian. The QMC simulation is based on a power-iteration scheme which projects out the ground state of the Hamiltonian,
$$|\psi_0\rangle \propto \lim_{M \to \infty} (C - H)^M\, |+\rangle^{\otimes N} ,$$
where $|+\rangle$ is the positive eigenstate of $\sigma^x$, $N$ is the number of lattice sites, $C$ is a constant energy shift used to cure the sign problem emerging from the diagonal part of the Hamiltonian, and $M$ is called the projection length. We perform our simulations on a 16×16 square lattice with open boundaries at various parameter values. Unlike previous DMRG-based studies [75,69] we do not impose a truncation on the vdW interaction. We take our projection length $M$ to be 100,000, which we found was more than enough to accurately converge to the ground state over the parameter sets which were simulated. For our simulations we fixed $\Omega = 1$ and performed scans over $R_b \in \{1.0, 1.1, \ldots, 1.8\}$ and $\delta \in \{0.5, 0.6, \ldots, 2.9\}$.
To generate the occupation basis data for the SNNs, we first perform 100,000 Monte Carlo update steps to allow the chain to reach equilibrium. We then record one sample every 10,000 steps in order to eliminate any possible autocorrelation between successive samples. Each Monte Carlo step consists of a diagonal update step, followed by a cluster update step in which all possible line-clusters are built deterministically and flipped independently according to a Metropolis condition; see [76] for further details. Additionally, at each point in parameter space $(R_b, \delta)$ we run 3 independent Markov chains. The chains are allowed to evolve until each has generated 400 samples, giving a total of 1200 independent samples for each parameter pair.

Neural Networks
Artificial neural networks are directed graphs that have the ability to learn an approximation to any smooth function $f(x) = y$ given sufficiently many parameters. A neural network is built by successively applying matrix multiplications characterized by weights $w^L_{ij}$ that are offset by biases $b^L_i$ ($i, j$ are neuron indices in different layers $L$). Between subsequent matrix multiplications there is a non-linear activation function, common choices of which are sigmoids or rectified linear units. A neural network is trained by applying it to a data set and optimizing the network parameters to minimize a certain objective function using gradient descent.

Siamese Neural Networks
Siamese neural networks (SNN) were introduced to solve an infinite-class classification problem as it occurs in fingerprint recognition or signature verification [64,65]. Instead of assigning a class label to a data instance, the SNN compares two data points and determines their similarity. A solution to the infinite-class classification problem is obtained by calculating the predicted similarity between labelled anchor data points and a new unlabelled data instance.
An SNN $F(S_i, S_j)$ (Fig. 3) consists of two identical sub-networks $f$ (Fig. 2) which project an input pair into a latent embedding space. The similarity of two inputs is determined based on their distance in embedding space.
Possible distance metrics $d$ must be chosen according to the problem at hand; in our case we chose the half squared Euclidean distance on the unit sphere, which is equivalent to the cosine distance:
$$d(f_1, f_2) = \frac{1}{2} \sum_{i=1}^{N} (f_{1i} - f_{2i})^2 = 1 - \cos(\theta) .$$
Here $f_{1i}, f_{2i}$ denote the components of the two embeddings in an $N$-dimensional embedding space. Instead of training the SNN on paired training data, an effective way to train Siamese neural networks is to minimize a contrastive loss function involving triplets,
$$L = \max\big( d(f(S_a), f(S_+)) - d(f(S_a), f(S_-)) + \alpha,\ 0 \big) ,$$
where the hyperparameter $\alpha$ is chosen such that the neural network is prevented from learning trivial embeddings. The intuition behind $\alpha$ is its interpretation as a margin that encourages separation between the anchor and the negative embedding in the embedding space [78]. Minimizing $L$ requires minimizing the distance between the anchor and positive sample, $d(f(S_a), f(S_+))$, while maximizing the distance between the anchor and negative sample, $d(f(S_a), f(S_-))$. The $\max(\cdot, 0)$ prevents the neural network from learning runaway embeddings $f(S_i) \to \infty$.
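The distance metric and triplet loss can be sketched as follows (a toy numpy illustration with hypothetical unit-norm embeddings; the margin $\alpha = 0.4$ matches the value quoted later in the text):

```python
import numpy as np

def cosine_distance(f1, f2):
    """1 - cos(theta); on unit-norm embeddings this equals half the
    squared Euclidean distance between them."""
    return 1.0 - np.dot(f1, f2)

def triplet_loss(fa, fp, fn, alpha=0.4):
    """Contrastive triplet loss with margin alpha."""
    return max(cosine_distance(fa, fp) - cosine_distance(fa, fn) + alpha, 0.0)

# Toy embeddings on the unit circle: anchor close to positive, far from negative.
fa = np.array([1.0, 0.0])
fp = np.array([np.cos(0.1), np.sin(0.1)])   # nearly aligned with the anchor
fn = np.array([0.0, 1.0])                   # orthogonal to the anchor

# Margin satisfied: no gradient signal. Swapping roles violates the margin.
assert triplet_loss(fa, fp, fn) == 0.0
assert triplet_loss(fa, fn, fp) > 0.0

# Equivalence check: half squared Euclidean distance equals 1 - cos(theta).
assert np.isclose(0.5 * np.sum((fa - fp) ** 2), cosine_distance(fa, fp))
```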

Model Architecture
The explicit model architecture for $f$ (Fig. 2) depends on the underlying data set. In the case of the Ising model, $f$ consists of two 2-D convolutional layers, both with stride (1, 1), a kernel size of (3, 3), and with 6 and 10 filters, respectively. Each layer is fed into a ReLU activation function. The resulting image dimensions are (16, 16). This is followed by a (16, 16) average pooling layer. The output is flattened and fed to a dense layer with 10 neurons. Subsequently, we feed this output to the embedding layer, which also contains 10 neurons and a sigmoid activation function. The embedding is normalized to unit length under the Euclidean norm.
The model architecture for examining the Rydberg system is similar to the Ising model. In this case, we begin with two 2-D convolutional layers, both with stride (1, 1), a kernel size of (3, 3), and with 6 and 4 filters, respectively. The resulting image dimensions are (12,12). As before, we subsequently apply an average pooling layer, but this time of dimension (12,12). The remainder of the architecture is identical to before.
We use the Adam optimizer to train our neural network. Furthermore, our training procedure involves an early stopping callback. This technique involves training as long as the loss is decreasing; if the loss has not decreased for $m$ epochs, training is stopped. The value of $m$ is known as the patience. We set the maximum number of epochs to 150, which is enough to allow the callback to decide when to terminate the training process. We observe that small values of patience, around $m = 1$-$3$ and $m = 6$, are sufficient for training on the Ising model and the Rydberg system, respectively. Additionally, we use a learning rate of $\alpha_{lr} = 0.001$ for the Ising model and $\alpha_{lr} = 0.0005$ for the Rydberg system. Another hyperparameter is the margin for the contrastive loss, where we use $\alpha = 0.4$. We employ the TensorFlow and Keras libraries to implement the network, training, and callbacks.

Phase Boundaries from Siamese Networks
3.4.1. Supervised Learning of Phase Transitions: Since the proposal to use supervised machine learning algorithms for calculating phase diagrams, the most powerful method remains using feed-forward neural networks for the binary classification of Monte Carlo samples [3]. In this case a neural network is trained on configurations from known parts of the phase diagram, labelled by their phase. Denoting the phases with binary labels $y \in \{0, 1\}$, a neural network $f$ is trained to predict the phase label of a configuration $S$. After training, this neural network is applied to samples from unknown parts of the phase diagram. Since these networks intrinsically learn the underlying physical features characterizing the phases, like order parameters and other thermodynamic quantities [23], the predictions of the neural network flip from one label to the other at the position of the phase transition.
The principle that guides us through the development of an unsupervised Siamese network-based scheme for phase recognition is to leverage the power of neural networks for phase classification in the supervised setting. This is done by reformulating the task of predicting phases by a neural network. A Siamese neural network takes a pair of input configurations and predicts if they are similar or different with respect to a metric imposed by the objective function during training.
In this formulation a SNN can be used as a supervised algorithm for multi-phase recognition. In order to do unsupervised learning of phases with SNNs the training data cannot be supplied with phase labels. In order to enable a SNN to learn from unlabelled data we need to understand how data affects the gradients while training neural networks.

Gradient Manipulation:
To develop an unsupervised framework, it is important to understand how gradients update the neural network. Let us discuss this using the example of a general neural network $h$ trained in a supervised setting to minimize the mean squared error loss on a labelled data set $(X, Y)$; the discussion extends to any network in this manuscript. The effect of a single training example $(x, y) \in (X, Y)$ on the loss function is
$$L(x, y) = (h(x) - y)^2 .$$
The neural network switches its prediction at the decision boundary $h(x) = 0.5$. Neural networks are trained using backpropagation of gradients. Let us focus on the gradient signal on an example weight $w$ out of the millions of parameters characterizing a neural network, and further assume we have two identical training configurations $x$ with opposite labels $y \in \{0, 1\}$. Each update step involves the product of the learning rate $\eta$ and the negative of the derivative of the loss function with respect to $w$:
$$\Delta w = -\eta\, \frac{\partial L}{\partial w} = -2\eta\, \big(h(x) - y\big)\, \frac{\partial h(x)}{\partial w} .$$
The update depends on the label $y$: for $y = 0$ and $y = 1$ the factors $(h(x) - 0)$ and $(h(x) - 1)$ have opposite signs whenever $0 < h(x) < 1$. Thus, by supplying the neural network with two similar training samples but opposite labels in the same training step, their effects on the weights of the neural network approximately cancel each other out. While this combined training signal forces the neural network to be more uncertain, $h(x) \to 0.5$, it will never change the prediction itself.
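This cancellation can be checked numerically. The sketch below (a hypothetical one-parameter model, not from the paper) trains on the same input with both labels in every step; the parameter drifts until the prediction sits at the decision boundary, without ever flipping sides:

```python
import numpy as np

def h(w, x):
    """Toy one-parameter 'network': sigmoid of w * x."""
    return 1.0 / (1.0 + np.exp(-w * x))

def grad(w, x, y):
    """d/dw of the squared error (h(x) - y)^2 for one labelled example."""
    p = h(w, x)
    return 2.0 * (p - y) * p * (1.0 - p) * x

w, x, eta = 2.0, 1.0, 0.5        # start confidently predicting h(x) > 0.5
for _ in range(2000):
    # Same input with opposite labels in the same step: updates nearly cancel,
    # and the surviving signal only drives the prediction toward uncertainty.
    w -= eta * (grad(w, x, 0.0) + grad(w, x, 1.0))
prediction = h(w, x)             # converges to the decision boundary 0.5
```

The combined gradient is proportional to $(2h(x) - 1)$, so it vanishes exactly at $h(x) = 0.5$ and always points toward it, matching the argument above.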

Unsupervised Learning of Phase Transitions:
In each update step the Siamese network is trained on triplets $(S_a, S_+, S_-)$, where $S_a$ is called the anchor, $S_+$ the positive comparison, and $S_-$ the negative comparison. In an unsupervised setting we do not have the true phase labels at hand. However, we know that two samples from the exact same point in the phase diagram must have the same phase. Thus, we create training batches where $S_a$ and $S_+$ are from the same coordinates in the phase diagram, while $S_-$ is sampled randomly from anywhere in the phase diagram.
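This triplet construction can be sketched as follows (array shapes and names are ours, chosen to mirror the data set sizes quoted earlier; no phase labels enter anywhere):

```python
import numpy as np

def sample_triplets(data, n_triplets, rng):
    """data: array of shape (n_points, n_samples, ...) indexed by the point in
    the phase diagram. Returns unsupervised triplets: anchor and positive drawn
    from the same point, negative drawn from a uniformly random point (which
    may, by chance, lie in the same phase as the anchor)."""
    n_points, n_samples = data.shape[:2]
    triplets = []
    for _ in range(n_triplets):
        pa = rng.integers(n_points)                       # anchor coordinate
        ia, ip = rng.choice(n_samples, size=2, replace=False)
        pn, im = rng.integers(n_points), rng.integers(n_samples)
        triplets.append((data[pa, ia], data[pa, ip], data[pn, im]))
    return triplets

# Toy data set: 100 points on a 1-D phase diagram, 92 samples each, 20x20 lattices.
rng = np.random.default_rng(2)
data = rng.choice([-1, 1], size=(100, 92, 20, 20))
batch = sample_triplets(data, n_triplets=8, rng=rng)
```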
Building on the discussion of gradient signals: if $S_+$ and $S_-$ stem from the same phase, the training signals should approximately cancel each other out, such that the remaining noise is subleading compared to the signal obtained when $S_+$ and $S_-$ are from different phases. In a physical context, the noise might stem from thermal fluctuations, and the leading signal from relevant thermodynamic quantities.
Since there is still no comprehensive theory of neural network training dynamics, the above discussion lacks mathematical rigor. Hence, in line with all other machine learning based phase recognition methods, the only way to convince ourselves of the capabilities of the proposed method is an empirical study, applying the method to physical systems.

Results
For the purpose of calculating phase diagrams, we train SNNs as outlined in the previous paragraphs for both the stacked Ising model and the Rydberg atom array. We create training triplets (S a , S + , S − ) containing a randomly sampled anchor configuration S a , a positive configuration S + sampled from the same point in the phase diagram and a negative configuration S − sampled from any other point in the phase diagram. After having successfully trained a SNN it is employed to perform pairwise comparisons on configurations along a specified one-dimensional slice through the phase diagram.

Adjacency Comparisons
In this scheme, a configuration corresponding to a point in the phase diagram is compared to its neighbors within a certain distance. Using the example of a single Ising model, this means a configuration at temperature $T_n$ is compared to a configuration at $T_{n+k}$, where $n$ is the temperature list index and $k$ is a constant shift; the value of $k$ can be treated as a hyperparameter. A comparison between configurations at $T_n$ and $T_{n+k}$ is assigned a temperature of $\frac{1}{2}(T_n + T_{n+k})$. Our similarity measure of choice is the normalized cosine dissimilarity scaled to the range $[0, 1]$, where 1 corresponds to the empirically found maximal dissimilarity and 0 to the empirically found minimal dissimilarity. The cosine dissimilarity is $1 - \cos(\theta)$, where
$$\cos(\theta) = \frac{f(S_n) \cdot f(S_{n+k})}{\|f(S_n)\|_2\, \|f(S_{n+k})\|_2} .$$
The cosine distance is equivalent to the Euclidean distance when acting on the SNN embedding space normalized to the unit sphere ($\|f(S_i)\|_2 = 1$):
$$\|f(S_n) - f(S_{n+k})\|_2^2 = 2\,\big(1 - \cos(\theta)\big) .$$
Since the neural network $f$ is designed to output positive values $f(S_n)_i \in [0, 1]$, the cosine dissimilarity takes on its minimum $1 - \cos(\theta) = 0$ for similar embeddings and its maximum $1 - \cos(\theta) = 1$ if the embeddings are dissimilar. The result of scanning across a phase transition is depicted schematically in Fig. 4(c); the highest peak indicates the SNN prediction of the phase transition.
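A minimal numpy sketch of the adjacency comparison on a toy slice (the embedding values below are synthetic stand-ins; in practice they come from the trained SNN):

```python
import numpy as np

def adjacency_scan(embeddings, k=1):
    """embeddings: (n_points, dim) SNN embeddings along a 1-D slice of the
    phase diagram. Returns cosine dissimilarities between points n and n+k,
    empirically rescaled to [0, 1] as described in the text."""
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    d = 1.0 - np.sum(e[:-k] * e[k:], axis=1)      # 1 - cos(theta) per pair
    return (d - d.min()) / (d.max() - d.min())    # rescale to [0, 1]

# Toy slice: embeddings switch abruptly between indices 49 and 50 ("transition").
rng = np.random.default_rng(3)
emb = np.vstack([np.tile([1.0, 0.0], (50, 1)), np.tile([0.0, 1.0], (50, 1))])
emb += 0.01 * rng.normal(size=emb.shape)          # small thermal-like noise

scan = adjacency_scan(emb, k=1)
peak = int(np.argmax(scan))                       # predicted transition index
```

The peak lands at index 49, i.e. the comparison straddling the synthetic transition, mirroring how the highest peak of a scan is read off as the SNN's phase boundary prediction.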

Ising Model
We examine the results of applying SNN unsupervised phase recognition to the 20 × 20 stacked Ising model. Analytically, the phase boundaries of the stacked Ising model in the thermodynamic limit are $(T, \tilde{T}) = (2.269, \cdot)$ and $(T, \tilde{T}) = (\cdot, 2.269)$. The phase transition temperature is prone to finite-size effects as the lattice becomes smaller. If the phase transition were calculated using the magnetization, finite-size effects would shift the phase boundaries to $(T, \tilde{T}) \approx (2.36, \cdot)$ and $(T, \tilde{T}) \approx (\cdot, 2.36)$; however, it is important to note that the quantities a neural network learns might differ from the magnetization and thus experience different finite-size scalings [55].
In order to calculate the stacked Ising model phase boundaries, we choose to perform vertical and horizontal scans across the phase diagram, where each scan is repeated five times and the standard deviation of this ensemble is displayed as the uncertainty. As examples, the results of two of these scans are shown in Fig. 5; the location of these scans is depicted in (c). Panel (a) reveals the phase transition from (ferromagnetic, paramagnetic) to (paramagnetic, paramagnetic) (FP to PP), while (b) displays the phase transition from PF to PP. Collecting 21 vertical and 21 horizontal scans reveals the phase boundaries of the stacked Ising model, as seen in Fig. 6. By comparing the horizontal (green dotted line) and vertical scans (red dotted line) with the Ising model phase boundaries in the thermodynamic limit (black dashed lines), one can observe a clear difference. The SNN predicts critical temperatures of $T_c \approx 2.352$, consistent with the finite-size correction of the magnetization, which indicates a phase transition at $T_c \approx 2.36$.
In order to reveal the power of our SNN-based phase recognition scheme, we have to scan across diagonal slices of the phase diagram that contain more than one phase transition. For this purpose we scan diagonally across the phase diagram as depicted in Fig. 7. The two diagonal scans reveal that this technique can identify more than one phase transition. The first diagonal scan is performed along a line through the center of the lattice; in this case, we see a single phase transition. A closer examination is found in Fig. D2.
Furthermore, the embeddings shed some light on how the neural network encodes the aforementioned phase transitions. In Fig. 8 we see embeddings which encode phase information for the diagonal slice. These embeddings clearly separate the two underlying phases: embedding 1 only spikes in the PP phase, while embedding 10 activates in FF regimes. A switch between both embeddings occurs at the phase transition. Other embeddings can be found in Fig. C1. In the case of a single embedding, noise may be contained in the same embedding neuron that captures the phase behaviour; with additional embeddings, the noise may be relegated to other embedding neurons, isolating the phase transition information. Fig. 8(c) counts the number of instances in which a point lies above the halfway line on the y-axis of panels (a) and (b). There is a clear separation in these histograms between the two phase regions, FF and PP: according to (c), embedding 1 (blue) preferentially encodes information in the PP region, while embedding 10 (green) encodes the FF region.
The embedding space of the shifted diagonal scan is depicted in Fig. 9. We observe three crucial behaviours in the embeddings: each of the three histograms shows maximum activity in a different region. Embedding 2 encodes the FF phase, embedding 4 the PP phase, and embedding 7 encodes configurations from the FP phase.

Rydberg Array
The Rydberg atom array phase diagram is not as well studied as the Ising model, so we compare our SNN phase boundaries to recent papers [69,75] and evaluate the order parameters on the QMC data ourselves. In order to identify the approximate phase boundaries, we compute predictors for the phases of interest. Each phase corresponds to characteristic peaks in the absolute value of the Fourier transform (FT) of the one-point function,
$$\tilde{n}(\mathbf{k}) = \frac{1}{N} \left| \sum_j e^{i \mathbf{k} \cdot \mathbf{r}_j} \langle n_j \rangle \right| ,$$
where $n_j$ is the Rydberg occupation of the $j$th site. Furthermore, we symmetrize the FT by averaging over permutations of the momentum axes,
$$\tilde{n}_{\mathrm{sym}}(k_x, k_y) = \frac{1}{2} \big( \tilde{n}(k_x, k_y) + \tilde{n}(k_y, k_x) \big) .$$
The peaks occur for the checkerboard phase at $(\pi, \pi)$, for the striated phase at $(\pi, 0)$ and $(\pi, \pi)$, and for the star phase at $(\pi, 0)$ and $(\pi/2, \pi)$ [75]. We compute, for each point in parameter space $(R_b, \delta)$, the symmetrized FT at these specific momenta, averaged over the full 1200-sample data set given to the SNNs. The Rydberg atom array phase diagram is examined in regimes where the checkerboard, striated and star phases are present. Thus, guided by [69], we focus on the region $R_b \in \{1.0, \ldots, 1.8\}$ and $\delta \in \{0.5, \ldots, 2.9\}$. The training of the SNN on Rydberg QMC data is performed similarly to the Ising model, on each horizontal and vertical slice. Inference is performed by adjacency comparisons with step size $k = 1$. Because the Rydberg phase diagram is coarser than the Ising model phase diagram (25 points between $\delta = 0.5$ and $\delta = 2.9$ for the Rydberg array, compared to 100 points between $T = 1.53$ and $T = 3.28$ for the Ising model), there is no reason to use $k > 1$.
Figure 11: Phase transition lines are superposed on the Rydberg phase diagram. These lines result from connecting the peaks from Fig. 10 and Fig. C4 in the smoothest possible way. Doing so results in three distinct phase boundaries.
In (a)-(c) we overlay these phase lines on the diagram generated via our QMC simulation, whereas (d)-(f) use the phase diagram produced by [69].
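The symmetrized FT predictor can be sketched as follows (our own minimal implementation; the $1/N$ normalization is an assumption). It is verified on an ideal checkerboard occupation pattern, which should peak at $(\pi, \pi)$ and vanish at $(\pi, 0)$:

```python
import numpy as np

def structure_factor(n_mean, kx, ky):
    """|FT| of the one-point function <n_j> on an L x L lattice at momentum
    (kx, ky), symmetrized by averaging over the two momentum-axis orderings.
    n_mean: (L, L) array of mean Rydberg occupations."""
    L = n_mean.shape[0]
    x, y = np.meshgrid(np.arange(L), np.arange(L), indexing="ij")

    def ft(a, b):
        return abs(np.sum(n_mean * np.exp(1j * (a * x + b * y)))) / L**2

    return 0.5 * (ft(kx, ky) + ft(ky, kx))

# Ideal checkerboard: occupation alternates site by site -> peak at (pi, pi).
L = 16
n_cb = (np.indices((L, L)).sum(axis=0) % 2).astype(float)
peak_pipi = structure_factor(n_cb, np.pi, np.pi)   # large for checkerboard
peak_pi0 = structure_factor(n_cb, np.pi, 0.0)      # vanishes for checkerboard
```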

Testing
All horizontal and vertical scans depicted in Fig. 10 and Fig. C4, respectively, are the result of an ensemble average over 5 different runs per slice; the error bars represent the standard deviation between runs. The vertical lines in each plot in Fig. 10 represent the approximate phase transition at each $R_b$ based on [75]. Our observations differ from their result in that we consistently observe two phase transitions in all slices (although some of these phase transition peaks are weak, as shown in Fig. 10). However, we also confirm that in all cases the SNN predicts a phase transition that coincides with the results from [75]; there is only a slight deviation at $R_b = 1.4$ in the horizontal scanning direction. Furthermore, the results in Fig. C4 reveal the striated phase in between the star and checkerboard phases, which is also consistent with [75].
In Fig. 11(a)-(c) we compare the SNN predictions with the results from evaluating order parameters on our QMC data. For each of the checkerboard, star, and striated phases we create a separate plot. The QMC data is shown as the background, while in the foreground we connect the phase transition signals from the SNN in the smoothest way possible, in the form of blue, red, orange, pink and yellow lines. We first observe perfect agreement for the checkerboard phase transitions seen in (a). The striated and star phases in panels (b) and (c) are very elusive; however, the SNN tends to capture clearer phase information than what is suggested by the evaluation of the order parameters. The red line captures the striated phase for $R_b > 1.3$. There are two distinct differences between the SNN and QMC order parameter results: there is no QMC order parameter signal to explain the orange line, and the red line continues well within the checkerboard phase, where none of the three order parameters signals a phase transition.
We also compare our results to those obtained by [69] in Fig. 11(d)-(f). Again the SNN prediction of the checkerboard phase transition is in good agreement with the DMRG results. Further, the red line for $1.3 < R_b < 1.6$ is similar to their striated phase boundary, and the pink and yellow lines also bound the striated phase. In addition, our orange line corresponds very well to the DMRG star phase. The DMRG star phase is much larger than the QMC star phase; however, the SNN results should mimic what is present in the QMC data. This contradiction might be resolved by two different explanations: 1) the neural network is able to extract features which are a stronger indicator for the star phase than Fourier modes; 2) there is another phase present in the QMC data that is responsible for the orange phase boundary, for which [75] suggests possible candidates in the form of rhombic, banded or staggered order.
Figure 13: Latent space embeddings of the SNN applied to configurations of the Rydberg system belonging to the horizontal slice along $R_b = 1.20$ in the checkerboard region. Of the 10 embedding neurons, some encode phase information along $R_b = 1.20$. Embeddings 1, 2 and 8 are such neurons, revealing a phase transition near $\delta = 1.20$, which is marked by the black dashed line. The phase information is markedly partitioned by this line, as can be seen in (a) and the corresponding red embedding in (d). This phase transition corresponds to the first peak in Fig. 10(c).
The full independent SNN prediction of the Rydberg array phase diagram is constructed in Fig. 12, where (a) depicts the raw peaks of the horizontal and vertical scans in Fig. 10 and Fig. C4, respectively. The vertical scan slices between $\delta \in \{0.50, \ldots, 1.00\}$ do not reveal any explicit phase transition, as expected, and are therefore not included in the phase diagram. Fig. 12(b) is constructed by connecting markers as smoothly as possible.
This results in 7 phase regions, labelled according to Fig. 11. Fig. 13 provides insight into how the SNN encodes phase information in the Rydberg system: certain embeddings correspond to very specific regions of the phase diagram. Embeddings 1 and 2 separate in the vicinity of the first phase transition, and where embeddings 2 and 8 separate, the SNN indicates a second phase transition.
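The construction of Fig. 12(a) from raw scan peaks can be sketched as a simple local-maximum search over the dissimilarity signal of a single slice. Below is a minimal sketch, assuming the scan is available as a 1-D array of normalized dissimilarities; the function name and the threshold value are illustrative, not from the paper:

```python
import numpy as np

def dissimilarity_peaks(signal, threshold=0.1):
    """Return indices of local maxima of a 1-D dissimilarity scan.

    A point i is flagged when it exceeds both neighbours and the
    threshold; each flagged index marks a candidate phase boundary
    at that position along the slice.
    """
    s = np.asarray(signal, dtype=float)
    return [i for i in range(1, len(s) - 1)
            if s[i] > s[i - 1] and s[i] > s[i + 1] and s[i] > threshold]
```

In practice the markers from all horizontal and vertical slices would then be collected and connected by hand or by interpolation, as described above for Fig. 12(b).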
It is surprising that the SNN is able to reveal clearer phase information from the QMC data than a direct evaluation of order parameters might indicate. One reason might be that the neural network is able to compute other thermodynamic quantities relevant to the underlying physics, such as energies or susceptibilities. Further, the neural network might deal better with domain boundaries. This observation encourages us to predict that neural networks might reveal phases where conventional order parameter evaluations on MC configurations have trouble doing so. In that sense the SNN predicts the occurrence of two phases that are not evident in the QMC order parameter analysis, the blue and orange regions in Fig. 12. These findings guide us to examine these regions more closely, with a keen eye on revealing previously unknown physics.

Conclusion
We have introduced a Siamese neural network (SNN) based method to detect phase boundaries in an unsupervised manner. This method does not require any physical knowledge about the nature or existence of the underlying phases. SNN based phase detection combines the power of a) feed-forward neural networks, which have been the most powerful machine learning algorithms applied to reveal physical phases, and b) certain unsupervised methods which can learn multiple phases without knowing about their existence. This method is shown to reproduce phase diagrams when trained on Monte-Carlo configurations of the corresponding physical system. We introduced the method using the example of a model consisting of two stacked Ising models exhibiting a phase diagram with four phases. Further, we used this method to calculate the phase diagram of a Rydberg atom array, which is largely consistent with prior results and shows additional signatures of unknown and coexistence phases. Furthermore, in some regimes the SNN tends to be better at picking up phase information than order parameters applied to QMC data. As is typical for neural network based phase recognition schemes, we do not have insight into what features the neural network learns in order to calculate the phase boundaries. Such quantities have been shown to be related to order parameters and other physically relevant quantities. Explicitly revealing them is difficult, but possible using methods like [23,79].
With this work we have contributed to the zoo of machine learning methods for phase diagrams. While it remains to be seen whether one of these methods will reveal a completely unknown phase, we believe SNN based phase detection is a top contender, given its ability to detect multiple phases in an unsupervised manner.

Acknowledgements
We thank Roger Melko for helpful discussions. We thank the National Research Council of Canada for their partnership with Perimeter on the PIQuIL. Research at Perimeter Institute is supported in part by the Government of Canada through the Department of Innovation, Science and Economic Development Canada and by the Province of Ontario through the Ministry of Colleges and Universities.

Appendix A. Normalized Cosine Dissimilarity
In this section we display the calculation of the cosine dissimilarity in Fig. A1.

Figure A1: (a) We begin with the standard cosine similarity plot. (b) We then compute 1 − cos(θ) to obtain the cosine dissimilarity. (c) We normalize to guarantee that the plot ranges from 0 to 1.

Figure B1: We train on Rydberg array configurations belonging to the slice R_b = 1.50, employing early stopping. (a) The blue loss curve reveals that a single peak, shown in (c), occurs when training falls prey to a local minimum. Training ceases at around 12 epochs in this case due to the early-stopping callback. Furthermore, we observe that the red loss curve succeeds in escaping this minimum and converges to a lower loss. This lower loss corresponds to dual peaks, shown in (b).
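The three steps in Fig. A1 can be sketched in a few lines. The paper does not state the exact rescaling used in step (c), so we assume a division by 2, which maps 1 − cos(θ) ∈ [0, 2] onto [0, 1]:

```python
import numpy as np

def normalized_cosine_dissimilarity(u, v):
    """Normalized cosine dissimilarity between two embedding vectors.

    Step (a): cosine similarity cos(theta) = u.v / (|u| |v|).
    Step (b): dissimilarity 1 - cos(theta), which ranges over [0, 2].
    Step (c): divide by 2 (assumed normalization) so the result lies in [0, 1].
    """
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    cos_theta = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return (1.0 - cos_theta) / 2.0
```

With this convention, parallel embeddings give 0, orthogonal embeddings give 0.5, and antiparallel embeddings give 1.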
set. Once again, we use these to produce stacked configurations at random for both the training and testing sets. However, this time we produce 3500 stacked configurations for the training set and 500 for the testing set. Because the diagonal slice does not contain many points from any of the three phases it crosses, we increase the amount of training data to compensate.

Figure C2: Latent space embeddings of the SNN applied to configurations of the stacked Ising model belonging to the shifted diagonal slice. This slice contains three regions, FF, PF, and PP, as depicted in (a) and (b). Some neurons belonging to these embeddings encode phase information, albeit redundantly, while other neurons do not appear to encode anything at all. For instance, (e) and (g) both encode the FF phase almost identically. On the other hand, (a), (c), (i), and (k) do not encode any phase information.

Figure C3: Latent space embeddings of the SNN applied to configurations of the Rydberg system belonging to the horizontal slice along R_b = 1.20 in the checkerboard region. Most neurons encode redundant information. For instance, (a) and (c) encode information related to the same phase, while (c) and (f) exhibit identical phase structures. We also observe that embedding 9 in (m) encodes no phase information.

Figure C4: We apply the SNN phase recognition method to the Rydberg array in the region R_b ∈ {1.0, . . . , 1.8} and δ ∈ {0.5, . . . , 2.9}. We scan along each vertical slice by choosing a fixed δ and traversing all R_b ∈ {1.00, . . . , 1.80}. We choose k = 1 to perform adjacency comparisons within these scans. The dissimilarity peaks when the configurations at R_{b,i} and R_{b,i+1} belong to different phases.

Appendix D. Additional Adjacency Comparisons
This section further examines the effect of the hyperparameter k on the adjacency comparison. Higher values of k appear to smooth out noise. For instance, the FP phase in (a) is characterized by noise, which gradually lessens through to (c).
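An adjacency comparison with general k can be sketched as follows: each latent embedding along a scan is compared against the embedding k steps further along, using the normalized cosine dissimilarity of Appendix A. This is a minimal sketch under assumed conventions (row-wise stacked embeddings, division-by-2 normalization); the function name and shapes are illustrative:

```python
import numpy as np

def adjacency_scan(embeddings, k=1):
    """Dissimilarity between embeddings k steps apart along a scan.

    embeddings: array of shape (n_points, n_features), one latent
    vector per parameter value along the slice. Returns an array of
    length n_points - k. Larger k compares points further apart,
    which tends to smooth out noise, as observed in Appendix D.
    """
    e = np.asarray(embeddings, dtype=float)
    norms = np.linalg.norm(e, axis=1)
    # Row-wise dot products between each point and its k-th neighbour.
    cos = np.einsum('ij,ij->i', e[:-k], e[k:]) / (norms[:-k] * norms[k:])
    return (1.0 - cos) / 2.0
```

For k = 1 this reduces to the nearest-neighbour comparison used in Fig. C4; a peak in the returned signal marks a candidate phase boundary between the two compared points.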