Quantum Extreme Learning of molecular potential energy surfaces and force fields

Quantum machine learning algorithms are expected to play a pivotal role in quantum chemistry simulations in the immediate future. One such key application is the training of a quantum neural network to learn the potential energy surface and force field of molecular systems. We address this task by using the quantum extreme learning machine paradigm. This particular supervised learning routine allows for resource-efficient training, consisting of a simple linear regression performed on a classical computer. We have tested a setup that can be used to study molecules of any dimension and is optimized for immediate use on NISQ devices with a limited number of native gates. We have applied this setup to three case studies: lithium hydride, water, and formamide, carrying out both noiseless simulations and actual implementation on IBM quantum hardware. Compared to other supervised learning routines, the proposed setup requires minimal quantum resources, making it feasible for direct implementation on quantum platforms, while still achieving a level of predictive accuracy close to that of noiseless simulations. These encouraging results pave the way towards future applications to more complex molecules, since the proposed setup is scalable.


Introduction
The simulation of complex phenomena involving large biochemical systems, such as protein folding or enzyme activation, requires accurate predictions of energies and inter-atomic forces for different molecular conformations. Standard techniques such as molecular dynamics (MD) require computing, on the fly, the energy and inter-atomic forces of the target biochemical system at each step of the simulation. In most MD-like schemes, the Potential Energy Surface (PES) and Force Fields (FF) are determined on empirical grounds or from databases. In contrast, a higher degree of accuracy demands an ab initio calculation of the PES, typically using density functional theory (DFT) [1]. The use of DFT on the fly leads to a much longer simulation running time and, due to its huge computational cost, it turns out to be a viable route only for short-time simulations (picoseconds) of small molecules.
A precise knowledge of the functional relation between the molecular geometry and the corresponding energy on the PES, as predicted by DFT, would dramatically speed up ab initio MD simulations. As Behler and Parrinello observed in [2], this functional relation may be well approximated by neural networks (NNs) after appropriate training on pairs of molecular conformations and their DFT energies (or forces). The generation of the PES and FFs using various machine learning (ML) paradigms has been further investigated in [3][4][5][6][7][8][9][10][11][12][13][14][15][16]. Given the intrinsic quantum nature of the PES, it is natural to believe that quantum algorithms may indeed help [17,18]. Through the implementation of a supervised learning setup where the role of the classical NN is played by a parameterized quantum circuit (PQC), the authors of [19,20] provided the first evidence that quantum machine learning (QML) routines may offer improved and more accurate PES predictions.
Since PQCs can be implemented on NISQ devices, they have been extensively utilized in recent times for different purposes. However, they are not the only viable QML paradigm for supervised learning. An attractive alternative is provided by quantum extreme learning machines (QELMs). This setup may be advantageous for various reasons. First, it requires reduced training efforts: in the QELM paradigm no parameters in the quantum platform need to be tuned, and the training consists only of a simple classical linear regression. Second, quantum extreme learning machines can be implemented on different physical platforms [21][22][23][24][25][26][27][28] and not necessarily on gate-based quantum devices. We propose to use QELM to speed up ab initio molecular dynamics computations. We show that a QELM trained on small sets of geometry-energy pairs is able to efficiently predict the energy of a set of test geometries with great accuracy. The QELM setup that we propose significantly reduces the quantum resources required for the training process compared to other supervised learning paradigms such as VQE (as discussed in [19]). By quantum resources, we refer both to the depth of the quantum circuits employed, as reported in Table 1 and Table 2, and to the number of runs on the quantum platform. In fact, in QELM the number of runs only depends on the size of the training set and the chosen shot statistics; the optimization stage consists only of a simple linear regression on the output of the quantum circuit, and no parameter on the quantum device needs to be optimized. In VQE, instead, the number of runs also depends on the number of iterations necessary for the optimization to converge and on the number of parameters in the quantum circuit to be optimized.
Section 2 contains a self-contained review of the QELM paradigm. The details about our particular realization of QELM adapted to PES and FF generation are collected in section 3.1 and section 3.2. Finally, in section 3.3, we collect all the results about the simulations and quantum-hardware realization of the QELM training in the case of lithium hydride, water, and formamide.

QELM as supervised training of a POVM outcome
The QELM paradigm with classical input data is a supervised learning model. Its task is to learn a function f : R^X → R^Y that well approximates the actual functional relation between the inputs and outputs of some training set {x_i^tr, y_i^tr}_{i=1}^{M_tr}. The quality of the training is assessed on a separate test set of previously unseen pairs {x_i^test, y_i^test}_{i=1}^{M_test}.
In the case at hand, the training set consists of the coordinates parameterizing the conformations of a given molecular system and the corresponding energy and forces, computed classically with standard quantum chemistry routines via the Hartree-Fock (HF) approximation or DFT. While a possible accuracy threshold may be the chemical accuracy of 1.6 × 10^-3 Hartree, it should be emphasized that chemical accuracy plays a role only in the comparison with experimental results, where energy differences related to chemical processes are measured, and not in the evaluation of the absolute energy of a molecular geometry obtained through calculations. Even if chemical accuracy is a good target precision to keep in mind, the quality of the training should actually be assessed against the typical fluctuations of the variable y in the test dataset. If the error is well below such typical fluctuations, we have a good chance of accurately predicting the real difference between the energies or the forces of two given sampled geometries.
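The criterion above can be phrased as a short numerical check: the prediction error should be compared with the spread of the outputs in the test set, not with a fixed threshold. A minimal sketch with purely illustrative numbers (not taken from the actual datasets):

```python
import numpy as np

def relative_test_error(y_pred, y_test):
    """Ratio between the RMSE of the predictions and the typical
    fluctuations (standard deviation) of the test outputs.
    A ratio well below 1 indicates a useful model."""
    rmse = np.sqrt(np.mean((np.asarray(y_pred) - np.asarray(y_test)) ** 2))
    return rmse / np.std(y_test)

# Illustrative numbers, not from the paper's datasets:
rng = np.random.default_rng(0)
y_test = rng.normal(0.0, 2e-2, size=600)            # ~2e-2 Ha fluctuations
y_pred = y_test + rng.normal(0.0, 1e-3, size=600)   # small prediction error
print(relative_test_error(y_pred, y_test))          # well below 1
```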
In a QELM, the classical data x are encoded in the quantum state of some input qubits, initially prepared in a state ρ_0. The encoding is performed by applying a parameterized unitary transformation:

ρ_x = U(x) ρ_0 U(x)† . (1)

After such encoding, a POVM measurement µ = {µ_a}_{a=1}^Σ is performed on the input qubits. Each outcome a is measured with probability

p_a(x) = Tr(µ_a ρ_x) . (2)

In the QELM protocol, the function f is approximated as

f(x) ≈ W p(x) , (3)

where p(x) is the vector of outcome probabilities and W is a Y × Σ matrix. The optimization of the unknown matrix W in order to fit the training data can be performed using standard techniques and consists of a simple linear regression:

W = Y_tr P_tr^+ , (4)

where P_tr is the Σ × M_tr matrix whose columns are the probability vectors p(x_i^tr), Y_tr is the Y × M_tr matrix whose columns are the training outputs y_i^tr, and P_tr^+ denotes the pseudoinverse. The quality of the trained QELM is quantified by the root mean square error on the test set:

ϵ_QELM = sqrt( (1/M_test) Σ_{i=1}^{M_test} |W p(x_i^test) − y_i^test|^2 ) . (5)

This is an estimate of the ability of the trained QELM to provide accurate predictions on a new set of test input data. As already mentioned, the root mean square error alone is not a good quantifier: it should be compared with the typical fluctuations of the output values y when two random elements are sampled from the dataset. This scale is measured by the variance

var(y^test) = (1/M_test) ||Y_test − Ȳ_test||^2 , (6)

where Y_test is the Y × M_test matrix of test outputs and Ȳ_test is the Y × M_test matrix whose columns are all equal to the mean value ȳ_test. If ϵ_QELM is much smaller than this root variance, we will be able to predict the variation of the variable y with great precision.

Let us point out that it is always possible to rewrite (3) in the following way:

f(x) ≈ Tr(μ̃ ρ_x) , (7)

where μ̃ = W µ is a vector of Y operators acting on the Hilbert space of the input qubits. This means that we are actually approximating the function f as a vector of expectation values. On real quantum platforms, the probabilities (2) are estimated by performing a finite number of shots (finite statistics). This implies that a statistical error is always present, and how to minimize it is an open problem of crucial relevance [29][30][31][32]. Furthermore, the choice of the embedding unitaries U(x) and of the POVM µ strongly affects the performance of the QELM at both infinite and finite statistics.
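The classical post-processing described above reduces to a single pseudoinverse. A toy sketch (with stand-in probability matrices and a target that is exactly linear in the probabilities, so the recovery is exact):

```python
import numpy as np

# Toy QELM post-processing: the quantum device only supplies the probability
# matrices; the training is the single regression W = Y_tr @ pinv(P_tr).
rng = np.random.default_rng(42)
Sigma, M_tr, M_test, Y = 16, 40, 60, 1   # outcomes, train/test sizes, output dim

# Stand-in probability matrices (columns = outcome distributions p(x_i)).
P_tr = rng.random((Sigma, M_tr)); P_tr /= P_tr.sum(axis=0)
P_test = rng.random((Sigma, M_test)); P_test /= P_test.sum(axis=0)

# Target that is exactly linear in the probabilities, so recovery is exact.
W_true = rng.normal(size=(Y, Sigma))
y_tr = W_true @ P_tr
y_test = W_true @ P_test

W = y_tr @ np.linalg.pinv(P_tr)          # training: one linear regression
rmse = np.sqrt(np.mean((W @ P_test - y_test) ** 2))
print(rmse)                              # numerically zero for this linear target
```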

Implementing informationally complete POVMs with a reservoir
As we have pointed out, QELM is equivalent to estimating the expectation value of some observable. Looking at (7), this is possible only if such an observable can be decomposed as a linear combination of the effects of the POVM, and this is possible for any operator only if the POVM is informationally complete (IC). From an experimental point of view, implementing an IC-POVM is in principle a hard task. To overcome this issue, one may let the input qubits interact with some ancillae (the reservoir) prepared in a fixed state ρ_R. Then, an easy-to-implement measurement ν is performed. For instance, some qubits can be measured in the computational basis. The string of probabilities obtained with this procedure can be written as:

p_b(x) = Tr(ν_b Φ(ρ_x)) = Tr(Φ†(ν_b) ρ_x) , (8)

where we defined the quantum channel Φ(ρ) = U (ρ ⊗ ρ_R) U† and Φ† denotes its dual.
Comparing this expression with (2), it is evident that the role of the interaction of the input qubits with the reservoir is to generate an effective IC-POVM μ̃ = Φ†(ν) on the input qubits. The reconstruction performance of the QELM is affected by the choice of ν and by the scrambling properties of the interaction U between the input qubits and the ancillae [33]. It is common in the literature to assume that the measurement ν acts only on the reservoir. However, it is evident from our discussion that this is not necessary. In the following, we will assume the full set of input qubits and ancillae to be measured in the computational basis. In this way, the number of ancillae required to generate an effective IC-POVM is minimized.
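A minimal numerical sketch of the construction above, for one input qubit and one reservoir ancilla in |0⟩: a random joint unitary followed by a computational-basis measurement on both qubits induces four effects on the input qubit alone, which sum to the identity and are generically informationally complete.

```python
import numpy as np

# Effective POVM on the input qubit induced by a reservoir ancilla.
rng = np.random.default_rng(7)
U, _ = np.linalg.qr(rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4)))

# Isometry V|psi> = U(|psi> x |0>): with qubit ordering (input, reservoir),
# |i> x |0> is basis state 2i, i.e. columns 0 and 2 of U.
V = U[:, [0, 2]]

effects = []
for a in range(4):                          # nu_a = |a><a| on the two qubits
    nu = np.zeros((4, 4), dtype=complex)
    nu[a, a] = 1.0
    effects.append(V.conj().T @ nu @ V)     # effective mu_a on the input qubit

print(np.allclose(sum(effects), np.eye(2)))  # completeness: sum_a mu_a = I
rank = np.linalg.matrix_rank(np.array([m.flatten() for m in effects]))
print(rank)                                  # 4 => informationally complete
```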

Expressivity and QELM
Different choices for the embedding of the classical input may influence the capacity of the QELM to reconstruct the functional relation between the elements of the training set. In order to understand how this choice may affect the performance, it is useful to observe that the QELM paradigm shows some similarities with the quantum kernel method, a feature that seems to be common to all quantum supervised learning routines [33,34].
In other words, we embed the classical data x into an element of a higher-dimensional space, ρ_x ∈ D. The input Hilbert space is endowed with an inner product defining the kernel of the model:

κ(x, y) = Tr(ρ_x ρ_y) . (9)

Different kernels κ(x, y) allow one to approximate different kinds of functions. For instance, a popular choice is the rotation encoding:

U(x) = Π_i exp(−i x_i σ_x^(i) / 2) , (10)

where σ_x^(i) is the Pauli matrix σ_x acting on the i-th input qubit. The corresponding kernel is the translation-invariant squared cosine kernel [34]:

κ(x, y) = Π_i cos^2((x_i − y_i)/2) . (11)

Figure 1. The QELM setup used for PES prediction with 4 input qubits. The mixing unitary W is made of three blocks, each of the same form as W^(1), with random rotation angles drawn from the same ensemble. The two-qubit entangling gates in each layer are ECR gates.
The rotation embedding can be considered as a particular instance of the more general Fourier embedding:

U(x) = ⋯ W^(3) e^{−i x_2 G} W^(2) e^{−i x_1 G} W^(1) , (12)

where W^(1), W^(2), . . . are random unitary matrices acting on the input qubits, and G is a Hermitian operator, for instance a Pauli string. The name of this embedding comes from the peculiar form of the associated kernel:

κ(x, y) = Σ_{n,m} c_{m,n} e^{i(m·x − n·y)} , (13)

where the c_{m,n} are complex coefficients and n, m ∈ R^X. The Fourier frequencies m are proportional to the gaps λ_i − λ_j, where the λ_{i,j} are eigenvalues of G [35]. The number of distinct Fourier frequencies is bounded from above by 2^{2N−1} − 1, where N is the number of encoding qubits. As the number of frequencies increases, the model becomes more and more expressive. For this reason, studying the kernel associated with the embedding is a useful way to assess expressiveness [33,34,36,37]. As shown in [35], the Fourier encoding models are universal approximators for appropriate choices of the generator G.
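The frequency content of the model can be enumerated directly from the spectrum of the generator. As a sketch, for the generator G = (1/2) Σ_i Z_i used later in our setup, the eigenvalue gaps give exactly the integers from −N to N:

```python
import numpy as np
from itertools import product

def frequencies(num_qubits):
    """Distinct Fourier frequencies of an encoding with generator
    G = (1/2) * sum_i Z_i: the gaps lambda_i - lambda_j of its spectrum."""
    # Eigenvalues of G are sums of +/- 1/2, one term per qubit.
    lams = [sum(bits) for bits in product(*[(-0.5, 0.5)] * num_qubits)]
    gaps = {round(a - b, 9) for a in lams for b in lams}
    return sorted(gaps)

for N in (2, 3, 4):
    print(N, frequencies(N))   # integers from -N to N
```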

PES predictions with QELM: simulations
In this section we train a QELM to learn the PES and the FF for three different molecules of increasing complexity: lithium hydride (LiH), water (H2O) and formamide (HCONH2). The datasets used during the training are generated in the same way for all the investigated molecules, and their generation is discussed in greater detail in section 3.1.
The QELM model used can be adapted to molecules with an arbitrary number of atoms and is presented in section 3.2.

Datasets
The conformations of any molecule are represented in terms of generalized coordinates, either bond lengths or bond angles, as usually employed in the Z-matrix description of a chemical species. The lithium hydride conformations are described in terms of a single coordinate, the bond length (Figure 2). The conformations of a water molecule are parameterized by three coordinates: the two OH bond lengths r_1, r_2 and the HOH bond angle ϕ (Figure 3). The parameterization of formamide deserves a few more words. This organic molecule is a particular instance of the amides, the smallest molecules presenting a peptide bond. Amides have particular biological relevance: proteins are polymers whose units are connected by amide groups. The distinctive strength of the peptide (double) bond forces formamide to be a planar molecule in its ground state configuration. For this reason, we parameterized the formamide molecule assuming the geometry to be planar and the two NH bond lengths to be equal. The conformations are then parameterized in terms of 8 generalized coordinates (instead of 12), whose choice is shown in Figure 4.
The training and test configurations are then obtained by sampling random tuples of generalized coordinates in given intervals. In the three cases under consideration, these intervals are centered around the ground state configuration in order to focus on the geometries that are typically explored during a molecular dynamics simulation. In fact, for our three molecular species, the PESs are characterized by a deep potential well which is hard to escape. Given that the PES of LiH is actually a one-dimensional curve, in this case we explore a wider part of the configuration landscape, sampling the bond lengths from a larger interval than for H2O or HCONH2. The electronic structure calculations performed to generate the datasets were conducted using the PySCF software as interfaced in the PennyLane package (LiH) or the Gaussian 16 software [38] (H2O and HCONH2). Since the main focus of this work is to assess the agreement of the QELM with generic PES data obtained from traditional quantum chemistry methods, we are not restricted to a specific theoretical framework. For this reason, the level of theory chosen to investigate the different chemical species changes as a consequence of both the complexity of the molecule and the necessity of a properly sized dataset.
In the case of lithium hydride and water, calculations were performed using DFT with the M06 exchange-correlation functional [39] and the STO-3G (cc-pVDZ) basis set. For LiH, the dataset is generated by performing 170 single-point calculations on as many geometries, obtained by choosing the bond length values randomly in the range 0.9 Å ≤ r ≤ 4.5 Å. The QELM is trained over 50 geometries, while 120 configurations are left for testing. For H2O, the dataset is generated by performing 900 single-point calculations on as many geometries. Each geometry is obtained by choosing each value of the three internal coordinates at random, in a range centered on their respective equilibrium values, i.e. 0.964 ± 0.2 Å for distances, and 102.792° ± 13.0° for the HOH angle. The QELM is trained over 300 elements of the whole dataset, while 600 configurations are left for testing.
In the formamide case, the dataset consists of approximately 6500 structures (even if far fewer are actually used for the training). Each of the geometries constituting the dataset was generated by randomly sampling values for the aforementioned bonds and angles within specific intervals. In particular, these intervals are centered on the values of the respective equilibrium internal coordinates, with ranges of: i) ±0.10 Å for bonds involving a hydrogen atom; ii) ±0.15 Å for the remaining bonds; and iii) ±8.00° for all angles. For each structure, a calculation was performed at the HF/SVP level of theory to obtain the electronic energy and atomic force values.
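The sampling procedure described above can be sketched in a few lines. The intervals below follow the water setup stated in the text (0.964 ± 0.2 Å for the bond lengths, 102.792° ± 13.0° for the angle); the uniform distribution is an assumption, since the text only specifies random sampling within intervals:

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_water_geometries(n):
    """Sample n water geometries uniformly in intervals centred on the
    equilibrium values (uniform sampling is an assumption)."""
    r1 = rng.uniform(0.964 - 0.2, 0.964 + 0.2, size=n)          # O-H bond (Angstrom)
    r2 = rng.uniform(0.964 - 0.2, 0.964 + 0.2, size=n)          # O-H bond (Angstrom)
    phi = rng.uniform(102.792 - 13.0, 102.792 + 13.0, size=n)   # H-O-H angle (deg)
    return np.stack([r1, r2, phi], axis=1)

geoms = sample_water_geometries(900)
print(geoms.shape)   # (900, 3): one row per single-point calculation
```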

QELM setup
The gate parameters in quantum hardware are usually rotation angles. For this reason, we need to perform a rescaling of the coordinates, with particular attention to radial coordinates. Each bond length is rescaled with respect to a reference one, r̄. The reference distances are chosen to be r̄_LiH = 6 Å and r̄_H2O = r̄_HCONH2 = 2 Å. The bond angles are instead rescaled by a factor 2. The input data of water are thus of the following form:

x = ( r_1/r̄_H2O , r_2/r̄_H2O , ϕ/2 ) , (14)

with analogous expressions for lithium hydride and formamide. The string x is fed into the state of N qubits using a Fourier encoding (12) with:

G = (1/2) Σ_{i=1}^N σ_z^(i) . (15)

This amounts to acting on each input qubit with an R_z gate. Observe that the number of input qubits of the encoding is not fixed and can be increased at will in order to improve the performance. The random unitaries W^(1), W^(2), . . . in the Fourier encoding are made of one layer of the form depicted in Figure 1, where the θ_i are random rotation angles uniformly drawn in the interval (0, π/2). All the input qubits are measured in the computational basis, with no addition of ancillae. Let us also stress that the Fourier encoding as presented in (12) can be adapted to any molecule or system of molecules with an arbitrary number of generalized coordinates.
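The rescaling prescription above, in code. This is a minimal sketch: bond lengths are divided by the reference distance and angles by a factor 2, as stated in the text; any additional normalization used in the actual setup is not reproduced here.

```python
import numpy as np

# Reference distances (Angstrom) quoted in the text.
R_REF = {"LiH": 6.0, "H2O": 2.0, "HCONH2": 2.0}

def encode_water(r1, r2, phi):
    """Map the three internal coordinates of water to the rescaled
    angle vector fed to the Rz encoding gates."""
    return np.array([r1 / R_REF["H2O"], r2 / R_REF["H2O"], phi / 2.0])

angles = encode_water(0.964, 0.964, 102.792)
print(angles)
```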
The kernel associated with the choice of generator (15) is a Fourier kernel (13) with integer frequencies −N, −N + 1, . . . , N − 1, N. Even if the number of frequencies scales only linearly with the number of qubits, it must be stressed that the choice of embedding and gates in the setup is driven by the wish to minimize the depth of the quantum circuit once it is implemented on real quantum hardware. In fact, as will become clear in the following, the accuracy of the PES predictions grows as the number of encoding qubits increases, reflecting the wider spectrum of Fourier frequencies of the model (see Figure 5).
In the following, we will present the actual experimental implementation of the proposed setup on the IBM BRISBANE device, a quantum processor with 127 superconducting transmon qubits. Such a device admits three native single-qubit gates (X, √X and R_z(ϕ) for any angle) and a single two-qubit gate, the echoed cross-resonance gate (ECR):

ECR = (1/√2) (IX − XY) . (16)

The ECR gate is equivalent to the CNOT gate up to single-qubit rotations. Any other gate must be decomposed as a combination of native gates, and this may cause a huge increase in the circuit depth. Since the R_y gates are slightly cheaper in terms of native gates than other rotations, they should be preferred in order to minimize the circuit depth. For this reason, our circuit is built out of R_z, R_y and ECR gates only.

Results
For each molecular species that we considered, we performed three different kinds of analysis. First, we simulated the performance of the QELM in the ideal case of infinite statistics and absence of noise; we refer to this case as the statevector simulation. Then we reduced the level of approximation, simulating the QELM training with finite statistics but in noiseless conditions. To address this task, we used the QASM simulator of the Qiskit package [40]. Finally, we implemented the QELM setup on IBM BRISBANE, thus also taking into account the actual noise present on NISQ devices. It must be stressed that we do not perform any error mitigation on the quantum-hardware output data.
The statevector simulation is the ideal playground to test the achievable precision of the QELM predictions when the parameters of the setup are varied. The only freedom left in our configuration is the number of encoding qubits and the dimension of the training set. In Figure 5 we plot the behavior of the statevector RMSE for the energy for all the molecules under investigation when both N and M_tr are varied. As one can observe, as the number of qubits increases, the performance of the QELM improves. This is consistent with the idea that a larger reservoir can be used to approximate a wider range of functions [33]. What is interesting is that the RMSE always reaches a plateau above a certain dimension of the training set. Above this threshold value, any further information does not significantly affect the accuracy of the predictions on the test set. The reason lies in the linear regression that is carried out in post-processing in the QELM paradigm. As explained in section 2, the training consists of a single linear regression once the matrix

W = Y_tr P_tr^+ (19)

is determined. The reservoir dependence of the training is all contained in the probability matrix P_tr. It is well known that some information about the accuracy of the training can be extracted by looking at the singular values of the probability matrix only [29,41]. However, the number of nonzero singular values of P_tr is always at most min(2^N, M_tr). This implies that, as long as the number of qubits is kept fixed, the larger the value of M_tr is compared to 2^N, the less any new training element will impact performance. The training-set sizes at which the RMSE plateau is reached can be read off Figure 5. In the first row of each sub-table of Table 1, we collect the results of the training for the three molecular species for a specific choice of N. The RMSE of the energy is in all three cases well below the typical energy fluctuations in the dataset,
estimated by the square root of the variance, var(E_tr), where E_tr is the vector whose elements are the training energies. In all three cases, LiH, H2O and HCONH2, the variance is of order 2 × 10^-2 Ha. For the force associated with bond lengths, instead, var(F_r) ≈ 10^-1 Ha/Å.
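The plateau argument can be illustrated numerically: the probability matrix has 2^N rows (one per measurement outcome), so its rank, and hence the number of informative singular values entering the pseudoinverse, is bounded by min(2^N, M_tr). A sketch with stand-in random probability matrices:

```python
import numpy as np

rng = np.random.default_rng(3)
N = 4                                     # encoding qubits
ranks = {}
for M_tr in (8, 16, 64, 256):
    P = rng.random((2 ** N, M_tr))
    P /= P.sum(axis=0)                    # columns are probability vectors
    ranks[M_tr] = np.linalg.matrix_rank(P)
print(ranks)                              # rank saturates at 2^N = 16
```

Once the rank saturates, enlarging the training set adds columns that are linearly dependent on the existing ones, which is why the RMSE stops improving.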
It is interesting to compare these results with the performance (in the infinite-statistics, noiseless ideal case) of the VQE routine proposed in [19], where an analogous study of LiH and H2O is carried out; the comparison is reported in Table 2, together with the results of the classical DNN and BPNN networks [42]. The number of parameters to be optimized is taken to be the same for the VQE and the DNN for each molecular species, while it is greater for the BPNN. As is evident, the VQE too can outperform the classical networks. The QELM not only improves on the VQE performance, but does so while keeping the quantum computational cost minimal.
In realistic implementations, the probability matrices P_tr and P_test are estimated with finite precision, due to the fact that the probabilities are estimated by performing a limited number of shots on a quantum device. This implies that the accuracy of the QELM predictions is in general lower than in the ideal case. The results of the training, both in the noiseless simulation and in the real-hardware implementation on IBM BRISBANE, are again collected in Table 1, where they can be compared with their statevector counterparts. One can see that, because of the finite statistics, the RMSE increases by a factor that depends on the specific molecule. However, the RMSEs in the QASM simulations and in the IBM BRISBANE implementation are always of the same order of magnitude, meaning that the noise does not significantly affect the performance and the noiseless simulations are reliable. The dependence of the RMSE on the number of training elements follows the same argument as above. Moreover, looking at Figure 6, it is evident that another effect of the finite statistics is that the RMSEs for different numbers of qubits are much closer to each other on the plateau.
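The effect of finite statistics can be mimicked classically by replacing each exact outcome distribution with a multinomial estimate from a finite number of shots. A toy sketch with illustrative sizes (not the paper's data):

```python
import numpy as np

rng = np.random.default_rng(5)
Sigma, M_tr, M_test = 16, 40, 60
P_tr = rng.random((Sigma, M_tr)); P_tr /= P_tr.sum(axis=0)
P_test = rng.random((Sigma, M_test)); P_test /= P_test.sum(axis=0)
W_true = rng.normal(size=(1, Sigma))
y_tr, y_test = W_true @ P_tr, W_true @ P_test

def sampled(P, shots):
    """Replace each exact distribution by a finite-shot multinomial estimate."""
    return np.stack([rng.multinomial(shots, p) / shots for p in P.T], axis=1)

# Regression on exact vs shot-sampled probability matrices.
W_exact = y_tr @ np.linalg.pinv(P_tr)
rmse_exact = np.sqrt(np.mean((W_exact @ P_test - y_test) ** 2))
W_finite = y_tr @ np.linalg.pinv(sampled(P_tr, 1024))
rmse_finite = np.sqrt(np.mean((W_finite @ sampled(P_test, 1024) - y_test) ** 2))
print(rmse_exact < rmse_finite)   # finite shots increase the RMSE
```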
In order to have a visual perspective on the accuracy of our predictions at finite statistics (either in the QASM simulation or in the IBM BRISBANE realization), we plot in Figure 7 the value of the force computed with classical methods against the respective QELM prediction (in the LiH case we rather plot the value of the forces in the r-F plane).

Conclusions
In this work, we have demonstrated a first application of QELM to quantum chemistry.
The QELM is trained to learn the functional dependence of the potential energy surface and force fields of a molecule on a set of generalized coordinates that parameterize its possible conformations. The predictions of the QELM can then be exploited to speed up ab initio molecular dynamics computations that require the knowledge of the PES and FF at HF or DFT precision.
We have studied in detail three case studies, LiH, H2O and HCONH2, but the proposed setup is scalable, in the sense that it can be applied to molecules with any number of atoms and degrees of freedom. The setup is also optimized for practical implementation on quantum devices, with IBM superconducting transmon quantum computers in mind. However, by changing the set of native gates, the setup can be easily adapted to other platforms, e.g. trapped-ion devices. The precision of the predictions can be arbitrarily increased by utilizing more encoding qubits. The number of qubits in turn fixes the dimension of the training set that should be used.
The accuracy of the QELM predictions is first tested in noiseless simulations at infinite statistics, finding excellent agreement. We then investigate the performance at finite statistics, both with noiseless simulations and with a practical implementation on the IBM BRISBANE quantum device. In both cases we find encouraging results, suggesting that the proposed method can become effectively competitive in the near future. It must be stressed that the QELM is particularly efficient in terms of the quantum resources needed (it does not require any optimization of quantum circuit parameters via gradient descent) and in terms of classical post-processing (we do not perform any error mitigation).
We leave to future work the application to more complicated molecules, in order to understand the ultimate reach of our proposal, and the study of efficient methods to overcome the limitations deriving from finite statistics.
r y P R i 0 c 0 8 k i A b H q H B i b M P j L T a 0 I 2 f I B f 4 V V P 3 o x X P 8 O D / + I u c l C w T p W q 7 n R 1 e Z G S h m z 7 0 1 p a X l l d W 8 9 t 5 D e 3 t n d 2 C 3 v 7 D R P G W m B d h C r U L Q 8 M K h l g n S Q p b E U a w f c U N r 3 R d e Y 3 H 1 A b G Q b 3 N I 6 w 6 8 M g k H 0 p g F L J L R S T j v b 5 n T u e l D s 0 r 3 / j C 5 Z s 5 0 i + w P r 4 x v B s Z j I < / l a t e x i t > R y (✓ 1 ) < l a t e x i t s h a 1 _ b a s e 6 4 = " S 0 6 j 9 z d e P / l 6 U v r 7 6 9 n d s 1 S 0 o P 0 = " > A A A C B 3 i c b V D L T g J B E J z F F + I L 5 e h l I j H B C 9 k l v o 5 E L x 7 R y C M B s u k d G p g w + 8 h M r w n Z 8 A F + h V c 9 e T N e / Q w P / o sL c l C w T p W q 7 n R 1 e Z G S h m z 7 0 8 q s r K 6 t b 2 Q 3 c 1 v b O 7 t 7 + f 2 D h g l j L b A u Q h X q l g c G l Q y w T p I U t i K N 4 H s K m 9 7 o e u o 3 H 1 A b G Q b 3 N I 6 w 6 8 M g k H 0 p g F L J z R e S j v b 5 n T u e l D o 0 R A K 3 c u L m i 3 b Z n o E v E 2 d O i m y O m p v / 6 v R C E f s Y k F B g T N u x I + o m o E k K h Z N c J z Y Y g R j B A N s p D c B H 0 0 1 m 4 S f 8 O D Z A I Y 9 Q c 6 n 4 T M T f G w n 4 x o x 9 L 5 3 0 g Y Z m 0 Z u K / 3 n t m P q X 3 U Q G U U w Y i O k h k g p n h 4 z Q M m 0 F e U 9 q J I J p c u Q y 4 A I 0 E K G W H I R I x T i t K Z f 2 4 S x + v0 w a l b J z X j 6 7 P S 1 W r + b N Z N k h O 2 I l 5 r A L V m U 3 r M b q T L A x e 2 L P 7 M V 6 t F 6 t N + v 9 Z z R j z X c K 7 A + s j 2 / D Q Z j J < / l a t e x i t > R y (✓ 2 ) < l a t e x i t s h a 1 _ b a s e 6 4 = " y t A l + R O d R I L i 9 r d c c F T h x S x a 1 B s = " > A A A C B 3 i c b V D L T g J B E J z F F + I L 5 e h l I j H B C 9 n 1 r 9 g E c e + M g l 0 7 p r m S H a M V M o u I R J q R d p C B k f s S F 0 E + o z D 7 Q d Z 6 k n d C / S D A M a g q J C 0 k y E 3 x s x 8 7 Q e e 2 4 y 6 T G 8 1 d N e K v 7 n d S M c n N q x 8 M M I w e f p I RQ S s k O a K 5 H U A b Q v F C C y N D l Q 4 V P O F E M E J S j j P B G j p J 9 S 0 o c 1 / f 0 s a R 3 U r O P 
a 0 e V h p X 6 W N 1 M k O 2 S X V I l F T k i d X J A G a R J O F H k i z + T F e D R e j T f j / W e 0 Y O Q 7 2 + Q P j I 9 v X W S W b A = = < / l a t e x i t > R z (x 1 )< l a t e x i t s h a 1 _ b a s e 6 4 = " a 9 G / i j p j o Y G 5 0 y b I D u l y 2 X c q 4 l A = " > AA A B + 3 i c b V C 7 T s N A E D y H V w i v A C X N i Q i J K r I Rr z K C h j J I 5 I G S K D p f N u G U O 9 u 6 W y N F l r + C F i o 6 R M v H U P A v n I 0 L S J h q N L O r n R 0 / k s K g 6 3 4 6 p a X l l d W 1 8 n p l Y 3 N r e 6 e 6 u 9 c 2 Y a w 5 t H g o Q 9 3 1 m Q E p A m i h Q A n d S A N T v o S O P 7 3 O / M 4 j a C P C 4 A 5 n E Q w U m w R i L D h D K 9 3 3 U c g R J J 1 0 W K 2 5 d T c H X S R e Q W q k Q H N Y / e q P Q h 4 r C J B L Z k z P c y M c J E y j 4 B L S S j 8 2 E D E + Z R P o W R o w B W a Q 5 I F T e h Q b h i G N Q F M h a S 7 C 7 4 2 E K W N m y r e T i u G D m f c y 8 T + v F + P 4 c p C I I I o R A p 4 d s g 9 C f s h w L W w T Q E d C A y L L k g M V A e V M M 0 T Q g j L O r R j b a i q 2 D 2 / + + 0 X S P q l 7 5 / W z 2 9 N a 4 6 p o p k w O y C E 5 J h 6 5 I A 1 y Q 5 q k R T h R 5 I k 8 k x c n d V 6 d N + f 9 Z 7 T k F D v 7 5 A + c j 2 / Z C J U g < / l a t e x i t > W < l a t e x i t s h a 1 _ b a s e 6 4 = " Y 0 f h U C S y Z O 6 T d n V 6 9 4 9 E g E D m 8 9 Y = " > A M H M + 9 l 4 n 9 e L 8 b R p Z c I F c U I i m e H U E j I D x m u R d o D 0 K H Q g M i y 5 E C F o p x p h g h a U M Z 5 K s Z p M Z W 0 D 2 f + + 0 X S O W k 4 5 4 2 z 2 9 N a 8 6 p o p k w O y C G p E 4 d c k C a 5 I S 3 S J p x M y B N 5 J i 9 W Y r 1 a b 9 b 7 z 2 j J K n b 2 y R 9 Y H 9 8 a x Z N + < / l a t e x i t > W (1) < l a t e x i t s h a 1 _ b a s e 6 4 = " v C 2 0 d 4 b r A Q y 4 0 w a w l s m i D S H J g 9 k o D s e M y C F g i Y K l N A J N T D f k 9 D 2 x h e Z 3 3 4 A b U S g b n A S g u u z k R J D w R m m 0 i 3 c x e K x 7 1 w m / U r V r t k 5 6 C x x C l I l B R r 9 y l d v E P D I B 4 V c M m O 6 j h 2 i G z O N g k t I y r 3 I Q M j 4 m I 2 g m 1 L F 
f D B u n A d O 6 H 5 k G A Y 0 B E 2 F p L k I v z d i 5 h s z 8 b 1 0 0 m d 4 b 6 a 9 T P z P 6 0 Y 4 P H N j o c I I

< l a t e x i t s h a 1 _Figure 2 .
Figure 2. Parameterization of the geometry of a single molecule of lithium hydride.

u 5 7
C t v e + C r z 2 w + o j Q y D O 5 p E 6 P p 8 F M i h F J x S 6 V b 3 n X 6 l a t f s H G y e O A W p Q o F G v / L V G 4 Q i 9 j E g o b g x X c e O y E 2 4 J i k U T s u 9 2 G D E x Z i P s J v S g P t o 3 C S P O m W H s e E U s g g 1 k 4 r l I v 7 e S L h v z M T 3 0 k m f 0 7 2 Z 9 T L x P 6 8 b 0

1 <
R e r T f r / W d 0 w S p 2 9 u A P r I 9 v r B G S K Q = = < / l a t e x i t > r l a t e x i t s h a 1 _ b a s e 6 4 = " 0 + x x Z M Y g G a n U z q d t 6 9 / Y b 9 6 o e V 4 = "

2 <Figure 3 .
Figure 3. Parameterization of the geometry of a single molecule of water.

Figure 4. Parameterization of the geometry of a single molecule of formamide. The two N-H bond lengths are taken equal and parameterized by the same coordinate r4.

Figure 5. Behaviour of the energy RMSE in statevector simulations for different numbers of encoding qubits and different sizes of the training set.

Figure 6. Behaviour of the energy RMSE in QASM simulations and IBM BRISBANE realizations for different numbers of encoding qubits and different sizes of the training set.

Figure 7.
tr i ) of the training set. The superscript + denotes the Moore-Penrose pseudoinverse, and Y_tr is a Y × M_tr matrix whose columns are the outputs y^tr of the training set. The matrix W is thus the one minimizing the root mean square error (RMSE) on the training set, ε_tr = ||Y_tr − W P_tr||_F / √M_tr, where ||·||_F is the Frobenius norm. What is relevant is the root mean square error on the test set:
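The training step described above reduces to a single linear regression. A minimal sketch with NumPy, using synthetic placeholder data (the dimensions and the random probability vectors are illustrative assumptions, not figures from the paper):

```python
import numpy as np

# Sketch of the QELM readout training: W = Y_tr @ pinv(P_tr), where the
# columns of P_tr are the measured probability vectors of the training
# inputs and the columns of Y_tr are the target outputs (e.g. energies
# and force components). All data below are synthetic placeholders.

rng = np.random.default_rng(0)

n_features = 16   # length of each probability vector (2^n for n qubits)
n_targets = 4     # e.g. 1 energy + 3 force components
M_tr = 100        # number of training geometries

P_tr = rng.random((n_features, M_tr))            # columns: p(x_i^tr)
W_true = rng.normal(size=(n_targets, n_features))
Y_tr = W_true @ P_tr                             # noiseless toy targets

# One-shot training via the Moore-Penrose pseudoinverse: no gradient
# descent and no repeated circuit executions are needed at this stage.
W = Y_tr @ np.linalg.pinv(P_tr)

# Training RMSE, eps_tr = ||Y_tr - W P_tr||_F / sqrt(M_tr)
eps_tr = np.linalg.norm(Y_tr - W @ P_tr, ord="fro") / np.sqrt(M_tr)
print(eps_tr)  # essentially zero on this noiseless toy data
```

Since the toy targets are generated exactly by a linear map, the regression recovers it and the training RMSE vanishes up to floating-point error; with real measurement data the residual reflects shot noise and model mismatch.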

Table 1. Summary of the performances of the QELM trained to learn the PESs and FFs of LiH, H2O and HCONH2 molecules. The RMSE is measured in Ha for the energy and Ha/Å for the force.

Table 2. Performances (statevector simulation) of the VQE routine proposed in [19] to learn the PES and FF of LiH and H2O, together with the classical machine learning routines used as benchmark.
In both cases, the QELM outperforms the VQE, not only in terms of RMSE but also in terms of the resources needed. In fact, even though the VQE routine of [19] uses a smaller circuit (3 qubits), it requires a circuit depth approximately 200 times larger than its QELM counterpart. Moreover, the VQE circuit must be run many more times, both to obtain the gradients of the parameters (see Appendix A) and to perform multiple training epochs. In
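The resource gap can be made concrete with a back-of-the-envelope count of circuit executions. The sketch below assumes the VQE gradients are obtained with the parameter-shift rule (two evaluations per variational parameter per training point per epoch); the numbers plugged in are illustrative placeholders, not values from the paper:

```python
# Rough comparison of the number of (repeated-shot) circuit executions
# needed by QELM versus a gradient-based VQE routine. Assumption: the
# VQE gradient uses the parameter-shift rule.

def qelm_runs(n_train):
    # QELM: one circuit execution per training geometry; the regression
    # itself is done classically afterwards.
    return n_train

def vqe_runs(n_train, n_params, n_epochs):
    # Parameter-shift: 2 evaluations per parameter for each gradient,
    # for every training point, repeated at every epoch.
    return 2 * n_params * n_train * n_epochs

# Illustrative placeholder values:
print(qelm_runs(100))          # 100
print(vqe_runs(100, 20, 50))   # 200000
```

Even with modest placeholder values, the multiplicative factors from gradients and epochs make the VQE run count several orders of magnitude larger, which is the point made in the text.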