Analysis of the Hopfield Model with Discrete Coupling

Growing demand for high-speed hardware dedicated to Ising computing has created a need to determine how computational accuracy depends on a hardware implementation with physically limited resources. For instance, in digital hardware such as field-programmable gate arrays, reducing the number of bits that represent the coupling strength increases the density of integrated Ising spins and the computing speed, but lowers the calculation accuracy. To optimize this accuracy-efficiency trade-off, we have to estimate how the performance of an Ising computing machine changes with the number of bits representing the coupling strength. In this study, we tackle this issue by focusing on the Hopfield model, a canonical Ising computing model, with discrete coupling. Previous studies have used statistical-mechanics methods to analyze the effect of a few nonlinear functions (e.g., sign) for mapping the coupling strength in the Hopfield model, but they have not examined the discretization of the coupling strength in detail. Here, we derive the order parameter equations of the Hopfield model with discrete coupling by using the replica method and clarify the relationship between the number of bits representing the coupling strength and the critical memory capacity. Specifically, we apply the replica method for the Hopfield model with general nonlinear coupling (Sompolinsky (1986)) to the model with a multi-bit discrete coupling strength, and we derive, for the first time, the de Almeida-Thouless line of the model with general nonlinear coupling.


Introduction
[10][11] Many important problems belong to the nondeterministic polynomial time (NP)-hard complexity class and, for typical instances, require a computation time that scales exponentially with the problem size. Many of these problems can be translated into problems of finding the ground states of an Ising model.12) The Hamiltonian of an Ising model is written as
$$H = -\frac{1}{2}\sum_{i \neq j} J_{ij} S_i S_j, \quad (1)$$
where $S_1, \ldots, S_N$ are Ising variables, which take either $-1$ or $+1$, and $J_{ij}$ expresses the coupling strength between the $i$th and $j$th Ising variables. The coupling strength is symmetric, i.e., $J_{ij} = J_{ji}$. The mapping of many combinatorial optimization problems onto an Ising model has motivated the development of machines dedicated to searching for the ground state.
Many such machines have been proposed in the past decade.19) Utsunomiya et al. proposed a coherent Ising machine (CIM) that executes Ising computations in an injection-locked laser network.[23][24][25][26][27][28] Goto proposed a quantum adiabatic computation algorithm based on a nonlinear oscillator network that searches for the ground state of an Ising model.29,30) This algorithm is implemented as a superconducting circuit31) or two-photon-driven Kerr parametric oscillators.32) Goto also proposed a simulated bifurcation algorithm, a classical approximation of quantum adiabatic computation using a nonlinear oscillator network, which is implemented in field-programmable gate arrays (FPGAs).33) Other examples of such machines include electromechanical resonators,34) nano-magnet arrays,35) electronic oscillators,36) and laser networks.37) There are also machines based on simulated annealing (SA), implemented in complementary metal-oxide-semiconductor (CMOS) circuits,[38][39][40][41] FPGAs,[42][43][44][45][46] and magnetic devices.47)
Many of these Ising computers have hardware restrictions on the implementation of their algorithms. For example, the superconducting quantum annealing processor13,48) restricts the graph topology to a chimera graph. CMOS annealing38,39) restricts the graph topology to a three-dimensional lattice built from two-layer two-dimensional lattices. Direct mapping of most combinatorial optimization problems onto Ising models requires all-to-all couplings. Thus, all-to-all coupled Ising models have to be translated into equivalent Ising models with other graph topologies that are implementable on these machines. Some translation techniques have been proposed.[49][50][51][52] As other examples, the measurement-feedback type of CIM[23][24][25][26][27][28] and the Digital Annealer44) require that the coupling strength take discrete values. Because calculating the local field requires a large share of the computing resources of digital circuits (e.g., FPGAs), the number of bits representing the coupling strength is the main factor determining the number of implemented spins, the processing speed, and the development cost. The number of bits representing the coupling strength should therefore be made as small as possible while maintaining performance as much as possible.
Therefore, there is a growing demand for evaluating the effect of such hardware restrictions on the performance of Ising computers.[55][56][57] The Hopfield model with a two-bit coupling strength, named "clipping synapses", was analyzed using the replica method54) and self-consistent signal-to-noise analysis (SCSNA).57) Moreover, the perceptron with an up-to-four-bit discrete coupling strength has been analyzed.58,59) On the other hand, Mimura et al. analyzed the Hopfield model with a multi-bit discrete coupling strength in which the discretization intervals were non-uniformly optimized to maximize the memory capacity.56) However, the multi-bit discretization scheme proposed by Mimura et al. differs from the integer and fixed-point representations used in practice in recently developed Ising-computing-specific systems. Thus, an evaluation of systems that use such a practical multi-bit discretization scheme for the coupling strength is needed.
The Hopfield model shares many statistical-mechanical pictures with other Ising models. Therefore, by analyzing the Hopfield model with a practical multi-bit discretization of the coupling strength, we expect to be able to estimate how many bits are needed to represent coupling strengths and, at the same time, to infer the performance of other Ising models with such discrete couplings. In this paper, we use the replica method for the Hopfield model with general nonlinear coupling54,55) to analyze the model with a multi-bit discrete coupling strength, and we derive, for the first time, the de Almeida-Thouless (AT) line of the model with general nonlinear coupling. A further novelty, as mentioned above, is that we theoretically evaluate the performance of Ising-computing-specific systems that use a practical discrete representation of the coupling strength.

Model
In the original Hopfield model, the coupling strength is determined using the Hebb rule.53,60) In this study, we determine the coupling strength restricted to discrete values by using the following modified Hebb learning rule,54,57)
$$J_{ij} = \frac{\sqrt{p}}{N} f\left(\frac{1}{\sqrt{p}} \sum_{\mu=1}^{p} \xi_i^{\mu} \xi_j^{\mu}\right), \quad (2)$$
where $\boldsymbol{\xi}^{\mu} = (\xi_1^{\mu}, \ldots, \xi_N^{\mu})^{\mathrm{T}} \in \{1, -1\}^N$ is the $\mu$-th memory pattern, $p$ is the number of patterns, $N$ is the system size, and $f$ is a function that discretizes the coupling strength. The memory patterns are generated according to the probability distribution
$$P(\xi_i^{\mu} = \pm 1) = \frac{1}{2}. \quad (3)$$
The original Hopfield model corresponds to the linear function $f(x) = x$. The two-bit coupling strength, called the clipping synapse, is determined using the signum function $f(x) = \mathrm{sgn}(x)$. To determine the multi-bit coupling strength, we define the discretization function $f$ as in Eq. (4), where $n$ represents the number of bits, $\lfloor \cdots \rfloor$ represents the floor function, and $\lceil \cdots \rceil$ represents the ceiling function.
In addition, we introduce a loading rate $\alpha$, defined as $\alpha = p/N$, and a local field $h_i$ at the $i$-th site, defined as
$$h_i = \sum_{j \neq i} J_{ij} S_j. \quad (5)$$
Equation (4) discretizes the coupling strength in the range of $-1$ to $1$. To examine how the phase diagram changes as the range of the discretization function changes, we modified Eq. (2) as follows:
$$J_{ij} = \frac{\sqrt{p}}{N} n_\sigma f\left(\frac{1}{n_\sigma \sqrt{p}} \sum_{\mu=1}^{p} \xi_i^{\mu} \xi_j^{\mu}\right), \quad (6)$$
where $n_\sigma$ is a parameter that sets the range of the discretization function. The function $n_\sigma f(x/n_\sigma)$ in Eq. (6) discretizes the value of $x$ in the range of $-n_\sigma$ to $n_\sigma$. For example, when $n_\sigma = 2$, this function discretizes values in the range of $-2$ to $2$.
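Figure 1 shows the profiles of the discretization functions. Figure 1(a) shows the linear function used in the definition of the original Hopfield model, and Figs. 1(b) and 1(c) show the functions that discretize the coupling strength into two-bit and multi-bit values, respectively.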
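The coupling rule of this section can be sketched numerically as follows. This is a minimal illustration, not the authors' implementation: the function names `discretized_hebb` and `uniform_quantizer` are hypothetical, and because the exact multi-bit function of Eq. (4) is not reproduced here, `uniform_quantizer` is an assumed stand-in that clips its input to the range from -1 to 1 and rounds it to evenly spaced levels.

```python
import numpy as np

def uniform_quantizer(x, n_levels):
    """Assumed stand-in for Eq. (4): clip to [-1, 1], round to n_levels evenly spaced values."""
    step = 2.0 / (n_levels - 1)
    return np.round((np.clip(x, -1.0, 1.0) + 1.0) / step) * step - 1.0

def discretized_hebb(patterns, f, n_sigma=1.0):
    """Coupling J_ij = (sqrt(p)/N) * n_sigma * f(T_ij / n_sigma) with zero diagonal.

    patterns: array of shape (p, N) with entries +1 or -1.
    f: elementwise nonlinearity (identity, np.sign, or a quantizer).
    """
    p, N = patterns.shape
    T = patterns.T @ patterns / np.sqrt(p)   # T_ij = p^{-1/2} sum_mu xi_i^mu xi_j^mu
    J = np.sqrt(p) / N * n_sigma * f(T / n_sigma)
    np.fill_diagonal(J, 0.0)                 # no self-coupling
    return J

rng = np.random.default_rng(0)
xi = rng.choice([-1, 1], size=(5, 64))        # p = 5 random patterns, N = 64 spins
J_linear = discretized_hebb(xi, lambda x: x)  # original Hebb rule, f(x) = x
J_clipped = discretized_hebb(xi, np.sign)     # clipping synapses, f(x) = sgn(x)
# Assumed 16-level quantizer with the range widened to [-2, 2] via n_sigma = 2.
J_multi = discretized_hebb(xi, lambda x: uniform_quantizer(x, 16), n_sigma=2.0)
```

With the identity function the sketch reduces to the ordinary Hebb rule, and with `np.sign` every off-diagonal coupling has the same magnitude, which is the clipping-synapse case described above.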

Hebbian-Glassy Coupling Effectively Equivalent to Discretized Coupling
As a first step, by performing a naive signal-to-noise (S/N) analysis, we derive a Hebbian-glassy coupling that is effectively equivalent to Eq. (6). When the spin state coincides with a memory pattern, the local field in Eq. (5) separates into two parts, as in Eq. (7). The first part of Eq. (7) is the signal term, and the second part is the noise term. In the limit $N \to \infty$, the signal term can be rewritten as in Eq. (8). Because the argument of the discretization function obeys a Gaussian distribution by the central limit theorem, Eqs. (9) and (10) can be obtained. On the other hand, the mean and the variance of the noise term are evaluated with $\langle \cdots \rangle$, which denotes averaging over all of the random memory patterns $\{\xi_i^{\mu}\}$. The results of the S/N analysis for the pattern $\xi_i^{k}$ hold for any $k = 1, \ldots, p$.
By adding and subtracting the same term $J\sqrt{p}\,T_{ij}/N$, with $T_{ij} = p^{-1/2}\sum_{\mu=1}^{p} \xi_i^{\mu}\xi_j^{\mu}$, to and from Eq. (6), the coupling strength $J_{ij}$ defined in Eq. (6) can be rewritten as follows:
$$J_{ij} = \frac{J\sqrt{p}}{N} T_{ij} + \frac{\sqrt{p}}{N}\left[ n_\sigma f\left(\frac{T_{ij}}{n_\sigma}\right) - J T_{ij} \right]. \quad (14)$$
The first and second parts of Eq. (14) correspond to the signal term and the noise term in Eq. (7), respectively. Assuming that a single condensed pattern exists, the first and second parts in Eq. (14) can be considered to be statistically independent. Thus, according to the central limit theorem, the second part in Eq. (14) can be replaced by a Gaussian random variable with zero mean and variance $\alpha(\tilde{J}^2 - J^2)/N$ in the limit $N \to \infty$. Thus, we get
$$J_{ij} = \frac{J}{N}\sum_{\mu=1}^{p} \xi_i^{\mu}\xi_j^{\mu} + \eta_{ij}, \quad (15)$$
where the glassy coupling part $\eta_{ij}$ ($i \neq j$) has been shown to be an independently and identically distributed Gaussian random variable with zero mean and variance $J^2\Delta^2/N$, independent of the Hebbian rule part $(J/N)\sum_{\mu=1}^{p} \xi_i^{\mu}\xi_j^{\mu}$.54) Note that $\eta_{ij} = \eta_{ji}$ (symmetry) and $J_{ii} = \eta_{ii} = 0$. Equating the two expressions for the variance, $\Delta$ is defined as
$$\Delta = \frac{\sqrt{\alpha(\tilde{J}^2 - J^2)}}{J}. \quad (16)$$
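The effective parameters of this section can be evaluated numerically for a given nonlinearity. The sketch below assumes the Gaussian-average forms $J = \langle z\, g(z) \rangle$ and $\tilde{J}^2 = \langle g(z)^2 \rangle$ with $g(x) = n_\sigma f(x/n_\sigma)$ and $z$ a standard normal variable. These forms are an assumption on our part (Eqs. (9) and (10) are not reproduced here), although they are consistent with the noise variance $\alpha(\tilde{J}^2 - J^2)/N$ quoted above. The names `gaussian_average` and `effective_parameters` are hypothetical.

```python
import numpy as np

def gaussian_average(g, half_width=10.0, num=400001):
    """Average of g(z) over a standard normal z, using a Riemann sum on a fine grid."""
    z = np.linspace(-half_width, half_width, num)
    dz = z[1] - z[0]
    weights = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi) * dz
    return float(np.sum(g(z) * weights))

def effective_parameters(f, alpha, n_sigma=1.0):
    """Return (J, J_tilde, Delta) for the nonlinearity g(x) = n_sigma * f(x / n_sigma).

    Assumed forms: J = <z g(z)>, J_tilde^2 = <g(z)^2>,
    Delta^2 = alpha * (J_tilde^2 - J^2) / J^2.
    """
    g = lambda z: n_sigma * f(z / n_sigma)
    J = gaussian_average(lambda z: z * g(z))
    J_tilde_sq = gaussian_average(lambda z: g(z) ** 2)
    # Guard against a tiny negative difference caused by round-off.
    delta = np.sqrt(alpha * max(J_tilde_sq - J**2, 0.0)) / J
    return J, float(np.sqrt(J_tilde_sq)), float(delta)

alpha = 0.1
J_lin, Jt_lin, d_lin = effective_parameters(lambda x: x, alpha)    # linear coupling
J_sgn, Jt_sgn, d_sgn = effective_parameters(np.sign, alpha)        # clipping synapses
J_sgn2, Jt_sgn2, d_sgn2 = effective_parameters(np.sign, alpha, n_sigma=2.0)
```

Under these assumptions the linear function gives $J = \tilde{J} = 1$ and $\Delta = 0$, while the signum function gives $J = n_\sigma\sqrt{2/\pi}$ and a $\Delta$ that does not depend on $n_\sigma$, in line with the statements made for the two-bit case.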

Replica Method
In this subsection, we analyze the model using the replica method, following the recipe of previous studies.54,61) We introduce the temperature $T = \beta^{-1}$ and define the partition function as follows:
$$Z = \mathrm{Tr}_{\boldsymbol{S}} \exp(-\beta H). \quad (17)$$
Applying the replica trick, we derive the average free energy per spin $f$. The details are given in Appendix A. Assuming replica symmetry, we obtain the free energy given in Eq. (18).
Here, $[\cdots]$ denotes averaging over the glassy coupling part, and $\langle \cdots \rangle_{\xi^1}$ denotes averaging over the random pattern $\xi^1$. $m^1$ is an order parameter called the macroscopic overlap, defined as the correlation between the spin state and a condensed pattern $\{\xi_i^1\}$, where $\langle \cdots \rangle_T$ represents the thermal average. $q$ is the Edwards-Anderson order parameter, and $r$ is the mean square of the overlaps with the uncondensed patterns. Extremizing $f$ with respect to $q$, $r$, and $m^1$, we obtain the saddle-point equations, Eq. (22).
Equation (22) has a trivial solution $m^1 = q = r = 0$, which is called the paramagnetic (PARA) phase. Besides this phase, there are two other phases. The phase with $m^1 \neq 0$ and $q \neq 0$ is termed the ferromagnetic (FM) phase or retrieval phase. The phase with $m^1 = 0$ and $q \neq 0$ is termed the spin-glass (SG) phase. Note that Eq. (22) is the same as the saddle-point equations of the original Hopfield model in the case of $J = 1$ and $\tilde{J} = 1$, which gives $\Delta = 0$.61) Figure 2 shows the phase diagram, which plots the critical temperatures of the SG phase and the FM phase as a function of $\alpha$. In each figure, the dashed line shows the transition temperature $T_g$ to the SG phase, the solid line shows the temperature $T_M$ at which the FM phase first appears, and the dotted line shows the AT line. We examined two-bit, four-bit, and eight-bit coupling strengths and plotted these transition temperatures for $n_\sigma = 1$, $2$, and $3$. The tricritical point of the FM, SG, and PARA phases is at $T = J$ and $\alpha = 0$.

SG Phase
The transition from the PARA phase to the SG phase is of second order. To find the transition temperature $T_g$, we expand $q$ and $r$ in Eq. (22) under the assumption of a fixed $m^1 = 0$ and obtain a leading-order equation, which yields Eq. (24), the equation that determines the transition temperature $T_g$.
As shown in Fig. 2(a), $T_g$ increases as $J$ increases. In the case of a two-bit coupling strength, $J$ is proportional to $n_\sigma$, and $\Delta$ is constant with respect to $n_\sigma$. Thus, $T_g$ obeying Eq. (24) increases in proportion to $n_\sigma$. On the other hand, in the case of a four-bit or eight-bit coupling strength, $T_g$ is no longer proportional to $n_\sigma$ (Figs. 2(b) and 2(c)), because both $J$ and $\Delta$ depend on $n_\sigma$.

FM Phase
The FM phase is defined by $m^1 \neq 0$. Above $T = J$, there are no FM solutions for any value of $\alpha$. For $T < T_g$ and $\alpha < \alpha_c$, one finds the line $T_M(\alpha)$, below which the FM phase appears. Here, $\alpha_c$ is the critical memory capacity at $T \to 0$ (the details are described below). In the FM phase, the macroscopic overlap $m^1$ becomes $O(1)$, which means retrieval of the condensed pattern $\{\xi_i^1\}$. In the case of a two-bit coupling strength, $T_M$ is proportional to $n_\sigma$, as is $T_g$. In the case of four-bit and eight-bit coupling strengths, $T_M$ increases with $n_\sigma$ but saturates for $n_\sigma$ larger than two at each value of $\alpha$. In particular, for a four-bit coupling strength and $\alpha > 0.12$, $T_M$ is maximized at about $n_\sigma = 2$. In each case, as $\alpha$ approaches $\alpha_c$, the $T_M$ line asymptotically approaches the $T = 0$ axis.
To confirm the accuracy of the saddle-point equations obtained by the replica method, we performed Markov chain Monte Carlo (MCMC) simulations with Gibbs sampling for the system size $N = 2{,}000$ in the case of $n_\sigma = 1$. Figure 3 shows the theoretical results obtained from the saddle-point equations, Eq. (22), and the numerical results obtained by the MCMC simulations. In Figs. 3(a)-3(c), the solid line in each subfigure shows the transition temperature $T_M$ obtained by solving Eq. (22) numerically, and the color plot shows the value of $m^1$ obtained by the MCMC simulations. In Figs. 3(d)-3(f), the solid lines in each subfigure show the values of $m^1$ as a function of $T$ for various $\alpha$, obtained by solving Eq. (22) in the case of $n_\sigma = 1$, and the symbols and error bars show the means and standard deviations of the values of $m^1$ obtained by the MCMC simulations. The phase transition points obtained from Eq. (22) coincided with those of the MCMC simulations for the two-bit, four-bit, and eight-bit coupling strengths in many regions. However, as the number of bits was decreased and the loading rate $\alpha$ was increased, the transition temperature $T_M$ estimated with the saddle-point equations, Eq. (22), did not match that of the MCMC simulations very well (see Figs. 3(d) and 3(b)). As suggested by Eqs. (10), (13), and (15), the relative strength of the effective glassy coupling part increases as the number of bits decreases and $\alpha$ increases. Thus, we suspect that the effective glassy coupling part produces many metastable states, so that the relaxation time in the MCMC simulations becomes longer. This discrepancy became more pronounced at lower temperatures (see Figs. and 3(b)), which supports this suspicion.
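The kind of simulation described above can be sketched with a heat-bath (Gibbs) update of single spins. This is a toy illustration under stated assumptions rather than the setup used for Fig. 3: it uses the continuous Hebb coupling with $J = 1$, a small system ($N = 200$, $p = 4$), a short run, and a hypothetical function name `gibbs_overlap`; the source instead used $N = 2{,}000$ and discretized couplings.

```python
import numpy as np

def gibbs_overlap(J, pattern, T, sweeps=20, rng=None):
    """Run Gibbs sampling from the pattern and return the mean overlap m^1.

    The overlap is averaged over the second half of the sweeps.
    J must be symmetric with a zero diagonal.
    """
    rng = np.random.default_rng() if rng is None else rng
    N = pattern.size
    S = pattern.astype(float)          # start from the condensed pattern (a copy)
    beta = 1.0 / T
    overlaps = []
    for _ in range(sweeps):
        for i in rng.permutation(N):   # one sweep visits every spin once
            h = J[i] @ S               # local field h_i = sum_j J_ij S_j
            p_up = 1.0 / (1.0 + np.exp(-2.0 * beta * h))
            S[i] = 1.0 if rng.random() < p_up else -1.0
        overlaps.append(pattern @ S / N)
    return float(np.mean(overlaps[sweeps // 2:]))

rng = np.random.default_rng(1)
N, p = 200, 4                          # loading rate alpha = 0.02
xi = rng.choice([-1, 1], size=(p, N))
J = xi.T @ xi / N                      # continuous Hebb rule (f(x) = x)
np.fill_diagonal(J, 0.0)
m_cold = gibbs_overlap(J, xi[0], T=0.2, rng=rng)   # deep inside the retrieval phase
m_hot = gibbs_overlap(J, xi[0], T=5.0, rng=rng)    # far above T = J
```

At the low temperature the overlap stays close to one, and at the high temperature it decays towards zero. Longer runs and larger $N$ would be needed near the transition line, where, as noted above, relaxation may become slow.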

Critical Memory Capacity
From Eq. (22), we obtain the zero-temperature equations, Eq. (26), by taking the limit $T = \beta^{-1} \to 0$, where $U = J\beta(1 - q)$. These equations are identical to those obtained by SCSNA.57) Since $q \neq 0$, the PARA phase no longer appears in the limit $T \to 0$. These equations have a non-trivial solution with overlap $m^1 \neq 0$ when $\alpha < \alpha_c$. However, when $\alpha > \alpha_c$, only the trivial solution with $m^1 = 0$ exists. Figure 4(a) shows the critical memory capacity $\alpha_c$ as a function of the number of bits in the case of $n_\sigma = 1$. The increase in the critical memory capacity saturates once the number of bits reaches eight. Figure 4(b) shows the critical memory capacity $\alpha_c$ as a function of the range of the discretization function $n_\sigma$ in the case of two-bit, three-bit, four-bit, and eight-bit coupling strengths. In the case of a two-bit coupling strength, $\alpha_c$ remains constant with respect to $n_\sigma$ since $\Delta$ is independent of the value of $n_\sigma$. In the case of a three-bit coupling strength, the critical memory capacity is maximized when $n_\sigma \approx 2.1$, and it decreases when $n_\sigma$ exceeds this value. In the case of a four-bit coupling strength, the critical memory capacity is maximized when $n_\sigma \approx 2.5$, and it decreases when $n_\sigma$ exceeds this value. In the case of an eight-bit coupling strength, the critical memory capacity increases until $n_\sigma$ approaches 3.83, and it decreases slowly as $n_\sigma$ increases further. The $\alpha_c$ obtained numerically from Eq. (26) is almost equal to the value in the original Hopfield model for $n_\sigma \approx 3.83$.
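How a critical capacity follows from zero-temperature equations can be sketched as below. Because Eq. (26) is not reproduced here, the sketch assumes the standard zero-temperature replica-symmetric equations of a Hopfield model with Gaussian static synaptic noise, in units where $J = 1$: $m = \mathrm{erf}(y)$ with $y = m/\sqrt{2(\alpha r + \Delta^2)}$, $C = \sqrt{2/[\pi(\alpha r + \Delta^2)]}\, e^{-y^2}$, and $r = (1 - C)^{-2}$. Eliminating $m$ and $r$ gives $\alpha$ as a function of $y$, and $\alpha_c$ is its maximum. The function names and the parameter `noise_ratio` $= \tilde{J}^2/J^2 - 1 = \Delta^2/\alpha$ are hypothetical.

```python
import math

def alpha_of_y(y, noise_ratio):
    """Loading rate at which a retrieval solution with parameter y exists.

    Derived from the assumed zero-temperature equations by eliminating m and r:
    alpha = E^2 / (2 * (E^2 / (E - g)^2 + noise_ratio)),
    with E = erf(y)/y and g = (2/sqrt(pi)) * exp(-y^2).
    """
    E = math.erf(y) / y
    g = 2.0 / math.sqrt(math.pi) * math.exp(-y * y)
    return 0.5 * E * E / (E * E / (E - g) ** 2 + noise_ratio)

def critical_capacity(noise_ratio=0.0, y_min=0.05, y_max=5.0, num=50000):
    """alpha_c is the largest alpha that still admits a solution with m != 0."""
    step = (y_max - y_min) / num
    return max(alpha_of_y(y_min + k * step, noise_ratio) for k in range(num + 1))

# Linear coupling: no glassy part, noise_ratio = 0.
alpha_c_linear = critical_capacity(0.0)
# Clipping synapses: J = sqrt(2/pi), J_tilde = 1, so noise_ratio = pi/2 - 1.
alpha_c_clipped = critical_capacity(math.pi / 2.0 - 1.0)
```

Under these assumptions the scan gives roughly 0.138 for linear coupling and roughly 0.10 for clipping synapses, matching the values quoted in the Discussion. Reproducing the multi-bit curves of Fig. 4 would additionally require the discretization function of Eq. (4).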

The de Almeida-Thouless Line
To determine whether or not a replica-symmetric solution of the FM phase is stable against replica symmetry breaking (RSB), we calculated the Hessian matrix of the free energy. The details are given in Appendix B. The de Almeida-Thouless (AT) line62) is obtained by solving Eq. (27). Figure 5 shows an enlarged view of the AT lines $T_R(\alpha)$ in Fig. 2. These lines were obtained by numerically solving Eq. (27). In the case of a two-bit coupling strength, $T_R$ is proportional to $n_\sigma$, since $\Delta$ is independent of $n_\sigma$. In the cases of four-bit and eight-bit coupling strengths, the variation in $T_R$ with $n_\sigma$ was smaller than in the two-bit case.

Discussion
We succeeded in deriving the saddle-point equations for the Hopfield model with discrete coupling by using the replica method and used them to obtain the critical memory capacity of the model for different numbers of bits and ranges of the discretization function. In the original Hopfield model, the critical memory capacity is 0.138.63) On the other hand, the critical memory capacity in the Hopfield model with clipping synapses becomes $\alpha_c = 0.1$.54) In Ref. 54, Sompolinsky showed that the critical memory capacity $\alpha_c$ and the overlap $m^1$ increase when $\Delta$ approaches 0. This implies that $\alpha_c$ increases when the nonlinear function $f$ is tuned so that $\Delta$ becomes smaller. It was reported that the critical memory capacity is $\alpha_c \approx 0.12$ when the three-level coupling strength taking $-1$, $0$, or $1$ was tuned such that $\Delta$ becomes the smallest.55) In the case of up-to-three-bit discretization, the memory capacity was reported to be maximized by optimizing the intervals non-uniformly.56) In this study, the memory capacity has been shown to be maximized by adjusting the range of the discretization function, $n_\sigma$, depending on the number of bits in an integer or fixed-point number representation.
As shown in Fig. 4(a), as the number of bits increases, the critical memory capacity $\alpha_c$ increases monotonically and saturates at 0.1287 around eight bits when $n_\sigma = 1$. This result means that eight bits are sufficient to represent the coupling strength and achieve almost the same performance as in the continuous case. However, in the case of $n_\sigma = 1$, the critical memory capacity does not approach 0.138 even with numerous bits. Thus, we also have to adjust the range of the discretization function. As shown in Fig. 4(b), there is an optimal value of $n_\sigma$, which depends on the number of bits, that maximizes the critical memory capacity. In particular, in the case of eight bits, the critical memory capacity is maximized around $n_\sigma = 3$, and it is almost the same as 0.138. This result shows that the model with an eight-bit coupling strength and the range $n_\sigma = 3$ achieves almost the same performance as the original Hopfield model. Moreover, in the case of a four-bit coupling strength with the range $n_\sigma = 2$, $\alpha_c$ is degraded by about 2% compared with the original Hopfield model. In the case of a three-bit coupling strength, the maximum value of $\alpha_c$ was 0.1295 at around $n_\sigma \approx 2.1$. This maximum memory capacity was lower than that of the Hopfield model with the optimal three-bit non-uniform discretization of the coupling strengths ($\alpha_c = 0.135$).56) On the other hand, in the case of a two-bit coupling strength, $\alpha_c$ is invariant with respect to $n_\sigma$, and thus the performance cannot be improved by adjusting $n_\sigma$ in this case.
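The capacity values quoted in this section can be put on a common scale by expressing each as a relative loss with respect to the original Hopfield value of 0.138. The short sketch below only redoes this arithmetic on numbers stated in the text; the helper name `relative_loss` and the dictionary labels are hypothetical.

```python
ALPHA_C_HOPFIELD = 0.138   # critical capacity of the original Hopfield model

def relative_loss(alpha_c, reference=ALPHA_C_HOPFIELD):
    """Fractional capacity loss relative to the reference value."""
    return (reference - alpha_c) / reference

# Capacity values quoted in the text.
reported = {
    "2-bit (clipping synapses)": 0.1,
    "3-bit, n_sigma ~ 2.1": 0.1295,
    "8-bit, n_sigma = 1": 0.1287,
    "3-bit, non-uniform intervals (Ref. 56)": 0.135,
}

for label, alpha_c in reported.items():
    print(f"{label}: alpha_c = {alpha_c:.4f}, loss = {100 * relative_loss(alpha_c):.1f}%")
```

On this scale, eight bits with $n_\sigma = 1$ lose between 6% and 7% of the capacity, which illustrates why the range $n_\sigma$ has to be tuned in addition to the number of bits.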
We expect that the results obtained here suggest how many bits are needed to represent coupling strengths while maintaining the performance of other Ising models, because the Hopfield model shares many statistical-mechanical pictures with other Ising models. We surmise that the performance of other models deteriorates slightly under the four-bit condition with $n_\sigma = 2$, whereas other models under the eight-bit condition with $n_\sigma = 3$ achieve almost the same performance as the original ones.

Conclusion
We investigated the properties of the Hopfield model with discrete coupling. Using the replica method, we estimated the effect of discretizing the coupling strength on the critical memory capacity of the model. As a result, we found that the critical memory capacity increases as the number of bits increases. In addition, we showed the relationship between the critical memory capacity and the range of the discretization function $n_\sigma$, and that the critical memory capacity is maximized at an optimal discretization parameter in the cases of three-bit, four-bit, and eight-bit coupling strengths. In particular, the critical memory capacity in the case of an eight-bit coupling strength and $n_\sigma = 3$ is almost the same as that of the original Hopfield model. Moreover, the critical memory capacity in the case of a four-bit coupling strength deteriorates by about 2% in comparison with the original Hopfield model when the range of the discretization function is optimal. The Hopfield model shares many statistical-mechanical pictures with other Ising models. Thus, as discussed above, we expect that the results obtained here suggest how many bits are needed to represent coupling strengths while maintaining the performance of other Ising models. To achieve an efficient digital hardware implementation of Ising computing, the number of bits representing the coupling strength should be made as small as possible while maintaining performance as much as possible. Our results provide reference values for designing a numerical data processor for calculating the local field.
This work is supported by the Japan Science and Technology Agency through its ImPACT program, NTT Research Inc., and the National Science Foundation of the United States of America.

Appendix A: Derivation of the Free Energy
In this appendix, we derive the free energy using the replica method. Using the "replica trick," the average free energy per spin can be written as in Eq. (A·1). Here, $Z$ is the partition function defined in Eq. (17). Following the recipe of the replica method, we calculate $[Z^n]$, which is physically equivalent to the average of the partition function of $n$ replicas, by substituting Eqs. (1) and (15) into Eq. (17). First, we take the average over the glassy coupling part $\eta_{ij}$, using the fact that the $\eta_{ij}$ are independently and identically distributed Gaussian random variables with zero mean and variance $J^2\Delta^2/N$. Next, using the standard technique of the replica method for the original Hopfield model,64) we take the quenched average over the uncondensed patterns $\{\xi_i^{\mu}\}_{\mu > 1}$.
The result has the prefactor
$$[Z^n] \propto e^{-J\beta np/2 - J^2\Delta^2\beta^2(n^2 - nN)/4} \cdots,$$
where $\langle \cdots \rangle_{\xi^1}$ denotes the average over the pattern $\xi^1$, $I_n$ denotes an $n$-dimensional identity matrix, and $Q$ is a matrix whose off-diagonal elements are $q_{\rho\sigma}$ and whose diagonal elements are zero. We apply the saddle-point method to the integral in Eq. (A·4) in the thermodynamic limit $N \to \infty$. Accordingly, the average free energy per spin in Eq. (A·1) can be rewritten in terms of the saddle-point values. Taking the replica-symmetric ansatz,
$$m^1_\rho = m, \quad q_{\rho\sigma} = q, \quad r_{\rho\sigma} = r, \quad (\mathrm{A}\cdot 7)$$
we obtain Eq. (18).
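The identity behind the replica trick, $[\ln Z] = \lim_{n \to 0} ([Z^n] - 1)/n$, can be checked numerically on synthetic data. The sketch below is only an illustration of that limit, not a step of the derivation: the exact form of Eq. (A·1) is not reproduced here, the samples of $\ln Z$ are drawn from an arbitrary Gaussian, and the function name `replica_estimate` is hypothetical.

```python
import numpy as np

def replica_estimate(log_z_samples, n):
    """Estimate the average of ln Z as ([Z^n] - 1) / n for a small replica number n.

    np.expm1 computes exp(x) - 1 accurately when x is close to zero.
    """
    return float(np.mean(np.expm1(n * log_z_samples)) / n)

rng = np.random.default_rng(0)
log_z = rng.normal(loc=3.0, scale=1.0, size=10_000)   # synthetic samples of ln Z

direct = float(np.mean(log_z))                        # the quenched average [ln Z]
for n in (1.0, 0.1, 1e-4):
    print(n, replica_estimate(log_z, n), direct)
```

As $n$ decreases, the estimate approaches the direct average, whereas at $n = 1$ it reduces to the annealed quantity $[Z] - 1$, which is dominated by rare large values of $Z$.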

Appendix B: Derivation of the AT Line
In this appendix, we derive Eq. (27). The Hessian matrix of the free energy with respect to $q_{\rho\sigma}$ and $r_{\rho\sigma}$ is an $n(n-1) \times n(n-1)$ matrix around the replica-symmetric solution and has a block structure whose elements include
$$A_{\rho\sigma,\rho\tau} = A_{\rho\rho} A_{\rho\sigma} + A_{\rho\sigma}^2, \quad (\mathrm{B}\cdot 3\mathrm{b})$$
$$A_{\rho\sigma,\tau\upsilon} = 2A_{\rho\sigma}^2. \quad (\mathrm{B}\cdot 3\mathrm{c})$$

Fig. 1. Examples of the discretization function $f$. (a): Linear function $f(x) = x$, which is used in the definition of the original Hopfield model. (b): Signum function $f(x) = \mathrm{sgn}(x)$, which is used for the two-bit coupling strength, called clipping synapses. (c): Multi-bit discretization function defined in Eq. (4) in the case of $n = 4$.

Fig. 2. Plots of the critical temperatures of the SG and FM states as a function of $\alpha$. (a): Case of two-bit coupling strength. (b): Case of four-bit coupling strength. (c): Case of eight-bit coupling strength. In each panel, the range of the discretization function varies as $n_\sigma = 1, 2, 3$. The dashed line shows the transition temperature $T_g$ to the SG state. The solid line shows the temperature $T_M$ at which the FM states first appear. Replica symmetry is broken below the dotted line $T_R$.

Fig. 3. Plots of theoretical results obtained from the saddle-point equations, Eq. (22), and numerical results obtained by MCMC simulations. (a), (d): Case of two-bit coupling strength. (b), (e): Case of four-bit coupling strength. (c), (f): Case of eight-bit coupling strength. In (a)-(c), the solid line shows the transition temperature $T_M$ obtained by solving Eq. (22) numerically in the case of $n_\sigma = 1$, and the color plot shows the value of $m^1$ obtained by MCMC simulations. In (d)-(f), the solid line shows the value of $m^1$ obtained by solving Eq. (22) numerically in the case of $n_\sigma = 1$, and the symbols and error bars show the means and standard deviations of the value of $m^1$ obtained by MCMC simulations. In (d)-(f), the loading rate $\alpha$ varies as 0.005, 0.05, 0.1, and 0.15.

Fig. 4. (a): Critical memory capacity $\alpha_c$ as a function of the number of bits in the limit $T \to 0$ when $n_\sigma = 1$. The subfigure at the bottom right shows $\alpha_c$ over a wider range of bits. (b): Critical memory capacity $\alpha_c$ as a function of $n_\sigma$ in the limit $T \to 0$. The subfigure at the bottom right shows $\alpha_c$ in the case of an eight-bit coupling strength over a wider range of $n_\sigma$. Each figure was obtained by solving Eq. (26) numerically.

Fig. 5. Enlarged view of the AT lines in Fig. 2. (a): Two-bit coupling strength. (b): Four-bit coupling strength. (c): Eight-bit coupling strength. Note that the scale of the vertical axis in the case of the two-bit coupling strength is different from those of the four-bit and eight-bit coupling strengths.