Photonic circuits for iterative decoding of a class of low-density parity-check codes

Photonic circuits in which stateful components are coupled via guided electromagnetic fields are natural candidates for native implementation of iterative stochastic algorithms based on propagation of information around a graph. Conversely, such message passing algorithms suggest novel circuit architectures for signal processing and computation that are well matched to nanophotonic device physics. Here we construct and analyze a quantum optical model of a photonic circuit for iterative decoding of a class of low-density parity-check (LDPC) codes called expander codes. Our circuit can be understood as an open quantum system whose autonomous dynamics map straightforwardly onto the subroutines of an LDPC decoding scheme, with several attractive features: it can operate in the ultra-low power regime of photonics in which quantum fluctuations become significant, is robust to noise and component imperfections, achieves comparable performance to known iterative algorithms for this class of codes, and provides an instructive example of how nanophotonic cavity quantum electrodynamic components can enable useful new information technology even if the solid-state qubits on which they are based are heavily dephased and cannot support large-scale entanglement.


I. INTRODUCTION
Recent advances in the realization of nanoscale optical devices have shown the potential for ultra-low power integrated photonic circuits for classical information processing that would have significant advantages over electronic circuits in terms of heat generation and interconnect density [1,2]. In parallel, theoretical and computational tools have been developed for modeling the dynamics of photonic devices that have switching energies in the deeply sub-femtojoule, few-photon regime and are thus subject to quantum fluctuations [3]. These developments present an opportunity to consider the conventional (as opposed to quantum entanglement-enhanced) computational potential of such quantum noise-limited systems and to begin to consider architectural approaches that naturally accommodate noisy, low-power components interacting via coherent signal fields.
An intriguing source of architectural guidance is the broad and growing field of iterative, graph-based algorithms used today for computational tasks such as error correction, probabilistic inference, optimization and signal processing [4]. Such algorithms, including variants of message-passing schemes like belief propagation, have the flavor of nodes repeatedly exchanging information locally with their neighbors until global convergence. This picture invites an analogy to the dynamics of a network of photonic components, each of which has some internal degree of freedom (e.g., an 'atomic' state), coupled via continuous interaction with propagating coherent fields. Thus photonic information processing systems could provide a native hardware platform for the implementation of iterative graph-based algorithms that are currently executed using electronic computers with incommensurate (though universal) circuit architectures that simulate message passing inefficiently.
* Electronic address: dmitrip@stanford.edu
† Electronic address: hmabuchi@stanford.edu
Here we develop an instance of this direct mapping of a graph-based algorithm to a photonic circuit design for a simple and practically useful task: iterative decoding of expander codes, a class of low-density parity-check (LDPC) error-correcting codes for communication over a noisy channel. We work in the setting of linear coding theory in which every codeword is required to satisfy a set of parity check constraints, i.e., sums modulo 2 of subsets of its bits. The assignments (0 or 1) of the codeword bits and the values of their parity check sums correspond to the states (|0⟩ or |1⟩) of a collection of two-state systems.
Here we have in mind that |0⟩ and |1⟩ ideally should correspond to orthogonal quantum states of an atom-like elementary physical degree of freedom, to facilitate ultralow energy scales for switching, but our circuit does not require coherent superpositions or entanglement. For decoding a possibly corrupted channel output, we consider a simple iterative decoding procedure for the expander LDPC codes [5,6]: flip any bit (i.e., 0 ↔ 1) that appears in more unsatisfied than satisfied parity check constraints; repeat until no more flips occur. We map this decoding procedure onto a closed-loop feedback circuit: a simple sub-circuit is engineered to encode parity check sum values in the state of an optical field, and another sub-circuit is designed to route feedback optical fields such that the states of certain components are flipped (i.e., |0⟩ ↔ |1⟩) at a rate that grows with the number of unsatisfied parity check constraints.
The proposed circuit is autonomous, continuous-time and asynchronous. No external controller, measurement system or clock signal is required, so the circuit can be realized as a single photonic device whose only required inputs are stationary coherent optical fields that drive the computational dynamics (i.e., supply power) [7]. This follows the spirit of the systems we have designed in previous work on autonomous quantum memories [8,9]. In contrast to our earlier work, the decoding circuit in the present proposal is straightforwardly extensible to the long block lengths (thousands of bits) used in practical LDPC implementations, as it involves a simpler feedback circuit architecture [10].
Our circuit requires a collection of two-state latch systems coupled to input and output field modes. Here we consider designs based on the attojoule nanophotonic relay proposed in [11], which is based on ideas of cavity quantum electrodynamics (cavity QED), but any photonic system that functions as a latch potentially could be used in our circuit, e.g., [12]. Moreover, our scheme tolerates noisy components (e.g., spontaneous switching of a latch between the 0 and 1 states), can compensate for this noise with increased input optical power, and actually performs optimally (in terms of bits decoded per second) when the components "misbehave" at some nonzero rate. The graceful change in performance with increasing component imperfection and with varying optical input power is important for the practical usefulness of such a circuit. In our circuit design there is no real distinction between power and signal, as the power carried by the optical signal fields drives all the computational dynamics of the components, and it will be shown in Fig. 9 that simply increasing the optical input power reduces the error correction latency with fixed hardware. Our circuit tolerates a wide range of input powers with a constant performance as measured by bits corrected per joule.
This paper is organized as follows: We first briefly review linear error-correcting codes and an iterative decoding scheme for expander LDPC codes. We then describe in an intuitive way the operating principles of our photonic circuit implementation of an iterative LDPC decoder. The subsequent section gives a more detailed picture of our circuit in terms of open quantum systems theory. We then present some numerical tests of our system and conclude with a discussion. The appendices describe circuit composition rules for open quantum systems, discuss the details of our numerical simulations, and derive some bounds for a parameter regime in which we expect our scheme to work.

II. LINEAR CODES AND ITERATIVE DECODING
We briefly review and set up notation for block binary linear error-correcting codes and an iterative decoding procedure for expander LDPC codes.

A. Linear Codes
We work with binary bits transmitted in blocks of length n through the binary symmetric channel (BSC) that with some fixed probability independently flips (i.e. 0 → 1, 1 → 0) the transmitted bits. To protect from errors, the sender restricts the possible channel inputs to the set of codewords, a subset of all $2^n$ possible inputs. The decoder attempts to find the nearest codeword to the possibly corrupted output of the channel. Equivalently, the bits are stored in memory that accumulates errors with time; the sender/decoder attempt to minimize losses through redundancy in the encoded memory bits.
Linear codes require each codeword $x^n = (x_1, \ldots, x_n)$ to satisfy m parity check constraints. A parity check constraint c is a subset of the n message bits whose sum is constrained to equal 0 modulo 2:

$$\sum_{i \in c} x_i \equiv 0 \pmod{2}. \qquad (1)$$

A vector $x^n$ is a codeword if and only if it satisfies every constraint. The rate R of the code is the ratio of the number of non-redundant bits to the total number of bits per transmission, R = (n − m)/n. It is useful to think of a code as an undirected bipartite graph, the Tanner graph [13], whose n 'variable' nodes correspond to the message bits and whose m 'check' nodes correspond to the constraints. Edges connect variable nodes and the constraints that include them.
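As an illustration of the parity check constraints, the following minimal sketch tests whether a bit vector is a codeword by computing its parity check sums. The matrix `H` (rows are checks, columns are bits) and the toy code are hypothetical examples chosen for brevity and are not taken from this paper; the rate expression assumes the m checks are linearly independent.

```python
import numpy as np

def syndrome(H, x):
    """Return the m parity check sums (mod 2) of the bit vector x.

    H is an m-by-n 0/1 matrix; H[c, i] = 1 means bit i is included in check c.
    """
    return (H @ x) % 2

def is_codeword(H, x):
    """A vector is a codeword iff every parity check sum is 0."""
    return not syndrome(H, x).any()

# Toy code (hypothetical): n = 4 bits, m = 2 checks.
H = np.array([[1, 1, 0, 0],
              [0, 0, 1, 1]])
n_bits, n_checks = H.shape[1], H.shape[0]
rate = (n_bits - n_checks) / n_bits  # R = (n - m)/n

good = np.array([1, 1, 0, 0])  # satisfies both checks
bad = np.array([1, 0, 0, 0])   # violates the first check
```

The Tanner graph of this toy code has an edge wherever `H` has a 1.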

B. Linear ⊃ LDPC ⊃ expander codes
Low-density parity-check (LDPC) codes are linear codes introduced by Gallager in 1962 [14,15] and are among the first known near capacity-achieving efficiently decodable codes. The parity checks of an (n, l, k) LDPC code all include k bits, and each bit is included in l parity checks (in the Tanner graph, each variable node has degree l and each check node has degree k). The codes are "low-density" because the total number of variable-check pairs is ln, linear in the block length n (rather than quadratic in n for a dense graph); the Tanner graph is sparse. The rate of the code is R = (n−m)/n = (k−l)/k, since counting the edges of the Tanner graph from either side gives nl = mk. Fig. 1 shows the Tanner graph for a particular (n = 8, l = 3, k = 4) LDPC code, where variable (check) nodes are drawn as circles (squares), and we have highlighted a particular parity check constraint. This graph would look sparse for larger n.
LDPC codes shine because they can be decoded efficiently by iterative algorithms that have good performance in practice and in theory. These schemes include those in Gallager's original work [15], as well as message-passing algorithms and belief propagation; for a theoretical analysis of their performance see [16-19]. These schemes all have the flavor of variable and check nodes repeatedly exchanging information about the most likely codeword given the observed channel output and differ from each other in how that information is represented (e.g. binary or real-valued messages) and how new messages are computed from old.
Expander codes are a class of LDPC codes, introduced by Sipser and Spielman [5,6], for which a particularly simple iterative decoding procedure exists and which are easy to construct at random. Expander codes require the Tanner graph to be a good expander graph, meaning that the number of check nodes neighboring any small enough subset V of the variable nodes grows linearly with |V|, with a large enough proportionality constant. For our purposes it suffices to note that a randomly sampled bipartite graph with fixed variable and check node degree (a regular LDPC code) probably makes a good expander code [6].
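One common way to sample such a random regular bipartite graph is to shuffle "sockets", sketched below. This is a minimal illustration under our own assumptions, not necessarily the sampling procedure used for the simulations in this paper: the function name is hypothetical, and the socket model can produce repeated edges (a variable appearing twice in one check), which this sketch does not remove.

```python
import numpy as np

def sample_regular_code(n, l, k, rng):
    """Sample a random (n, l, k) bipartite graph as an m-by-k array.

    Row c lists the k variable indices included in check c. Each variable
    index appears exactly l times overall. Repeated edges are possible.
    """
    if (n * l) % k != 0:
        raise ValueError("n*l must be divisible by k")
    m = n * l // k                        # number of check nodes, from nl = mk
    sockets = np.repeat(np.arange(n), l)  # l edge endpoints per variable
    rng.shuffle(sockets)                  # random matching to check sockets
    return sockets.reshape(m, k)

rng = np.random.default_rng(0)
checks = sample_regular_code(n=8, l=3, k=4, rng=rng)
degrees = np.bincount(checks.ravel(), minlength=8)  # variable node degrees
```

For the (n = 8, l = 3, k = 4) parameters of Fig. 1 this yields m = 6 checks.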

C. Iterative decoding of expander codes
The iterative decoding procedure that is our focus in this work is the sequential decoder of Sipser and Spielman [6]. The variable bits are initially assigned to 0 or 1, equal to the observed output of the channel (we work with a binary symmetric channel that flips incoming bits with probability less than 1/2). The initial assignment of the variables may fail to satisfy all parity check constraints due to errors. The decoding procedure is as follows:
• Flip (i.e. 0 ↔ 1) any variable that is included in more unsatisfied than satisfied constraints.
• Repeat until no more variables are flipped.
Each iteration reduces the total number of unsatisfied constraints, so the procedure terminates when either there are 0 unsatisfied constraints (successfully outputting a codeword) or it gets stuck and declares failure to decode. While this procedure could be applied to any binary linear code, [6] prove that for expander codes this procedure removes a constant fraction of errors and, if the initial fraction of errors is low enough, is guaranteed to succeed. For the expander LDPC codes, each variable participates in l constraints, so we flip the variable's assignment if the number of unsatisfied constraints that include it is greater than l/2.
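The flipping rule above can be sketched in a few lines. This is an illustrative implementation under our own assumptions rather than the code of [6]: the function name is hypothetical, the parity checks are given as a dense 0/1 matrix `H`, and when several variables qualify we arbitrarily flip the one with the lowest index.

```python
import numpy as np

def flip_decode(H, y, max_flips=10_000):
    """Sequential bit-flipping decoder.

    Repeatedly flips one variable that is included in more unsatisfied than
    satisfied checks. Returns (x, success), where success means that all
    parity checks are satisfied.
    """
    x = y.copy()
    deg = H.sum(axis=0)                  # number of checks per variable
    for _ in range(max_flips):
        s = (H @ x) % 2                  # s[c] = 1 marks an unsatisfied check
        unsat = H.T @ s                  # unsatisfied checks per variable
        candidates = np.flatnonzero(2 * unsat > deg)
        if candidates.size == 0:
            break                        # done, or stuck
        x[candidates[0]] ^= 1            # each flip lowers the unsatisfied count
    return x, not ((H @ x) % 2).any()

# Toy example (hypothetical): 3-bit repetition code with pairwise checks.
H = np.array([[1, 1, 0],
              [0, 1, 1],
              [1, 0, 1]])
received = np.array([1, 0, 0])           # all-0 codeword with one bit flipped
decoded, ok = flip_decode(H, received)
```

In the toy example only the corrupted bit sits in a majority of unsatisfied checks, so a single flip restores the all-0 codeword.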
Importantly for our work, in [6]'s numerical experiments, it was found that permitting the algorithm to make some amount of backwards progress (sometimes increasing the total number of unsatisfied constraints) increased the probability of success. This suggests the procedure is robust to noise affecting the computation. In our approximate implementation of this iterative algorithm, described below, backwards progress is unavoidable and the hardware itself is noisy, so this robustness of the decoding procedure to noise is desirable.
This procedure is not technically a message-passing algorithm in the sense of [20], in that information flow from a check to a variable node (a possible "flip" instruction) does not exclude information received by the check node from that variable node (the bit state). Nonetheless it is convenient to discuss the error-correcting dynamics, as [6] do, in terms of variable nodes receiving "flip messages" from check nodes.

III. A PHOTONIC DECODING CIRCUIT: OVERVIEW
We give an intuitive description of the operation of our expander code decoder circuit before giving a more precise description in terms of open quantum systems in the Section that follows.

A. The idea
Our circuit consists of a collection of two-state (|0⟩ or |1⟩) systems, one for each of the n variable and m check nodes in the Tanner graph for an error-correcting code. Information exchange between the variable and check systems is mediated by coherent fields interacting with these systems (e.g. a beam scattering from one atom-cavity system into another). There are two crucial interactions:
• Fields outgoing from a system can encode that system's state (perform a measurement).
• Fields incoming to a system can drive that system into a desired state (apply a control).
These two interactions allow us to construct a closed-loop, autonomous measurement and feedback circuit that achieves:
• Parity checks/Measurements: A field scattered (e.g. a beam reflected) from the set of all variable bit systems included in some parity check constraint encodes their sum modulo 2. This field then drives the check system into the |satisfied⟩ or |unsatisfied⟩ states (|0⟩ or |1⟩, respectively).
• Error correction/Feedback: A field scattered from the set of all check systems that include a particular variable has an amplitude that increases with the number of unsatisfied checks involving that variable. This field then drives the variable system to flip between the |0⟩ and |1⟩ state at a rate proportional to the squared magnitude of the field amplitude (the field power). The more unsatisfied parity checks, the faster the flipping occurs.
The time evolution of this circuit is modeled as a continuous time Markov jump process [21]. The jumps are changes in state (|0⟩ ↔ |1⟩) and the jump rates depend on amplitudes of fields interacting with the two-state systems. The circuit is autonomous and asynchronous in that there is no external clock signal or external controller to process the parity measurement outcomes and to create an appropriate feedback field.
We note that the iterative decoding algorithm of [6] that our circuit emulates, summarized in Section II C, can be cast in terms of a continuous time Markov jump process as well: if a variable is included in more unsatisfied than satisfied constraints, set the rate for "flipping" it to $R_{\mathrm{flip}} > 0$, otherwise set $R_{\mathrm{flip}} = 0$. In our implementation, the value of $R_{\mathrm{flip}}$ scales with the number of unsatisfied constraints in a different way (and is never 0; see Section IV C 2), but we attain comparable empirical performance in simulation.
Finally, we note that our circuit is essentially classical in its operation, even though we utilize quantum stochastic differential equations (QSDEs) to describe the dynamics of the components and their interactions in order to obtain a circuit model that is valid in the ultra-low power regime of significant quantum fluctuations (photon shot noise). Entanglement between different subsystems is insignificant and is not exploited, and thus does not need to be protected from interactions with the outside environment.

IV. A PHOTONIC DECODING CIRCUIT: CONSTRUCTION
We briefly review open quantum systems connected into circuits, describe the photonic component subsystems that make up our circuit, and specify their interconnection to form our iterative decoder circuit. We give an intuitive description of our circuit's dynamics and defer a more detailed description to Appendices A and B.

A. Open quantum systems and circuits
We work in the framework developed by Gough and James [22,23] for modeling open quantum systems interacting via coherent fields [24-27]. The basic component model (shown in Fig. 10 of Appendix A 1) comprises a system with internal degrees of freedom coupled to incoming and outgoing field modes. The system is parametrized by its Hamiltonian H, by the coupling of the external modes to the internal degrees of freedom (n by 1 operator-valued vector L), and by the way the incoming external field modes scatter into outgoing external field modes (n by n operator-valued unitary matrix S). The density matrix ρ for the system's internal degrees of freedom evolves in time according to the master equation:

$$\frac{d\rho}{dt} = -i[H, \rho] + \sum_{i=1}^{n} \left( L_i \rho L_i^\dagger - \frac{1}{2} L_i^\dagger L_i \rho - \frac{1}{2} \rho L_i^\dagger L_i \right), \qquad (2)$$

where $L_i$ is the i-th component of the external field mode coupling vector L (in units with ħ = 1). See Appendix A for a more detailed discussion. The Gough-James circuit algebra allows us to compute new (S, L, H) triplets in terms of old for two systems connected in series, in parallel, or for one system self-connected through feedback. These composition rules are given in Appendix A 2. A systematic, automated approach for specifying and simulating such circuits in software is presented in [28,29].

[Fig. 2 caption, partial: ... The latch approximated as a two-state continuous time Markov jump process after adiabatically eliminating the excited states |e⟩ and |s⟩ (see [11] for this derivation). (d) The latch routes the input fields into output fields, switching them if its internal state is driven to ...]
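To make the role of the coupling operators concrete, the sketch below evaluates the right-hand side of a master equation of the standard Lindblad form (with ħ = 1) for a toy two-state system. The two jump operators and their rates are hypothetical stand-ins for a "set" and a "reset" drive; they are our own illustration and are not the (S, L, H) model of Appendix B.

```python
import numpy as np

def lindblad_rhs(rho, H, Ls):
    """Return d(rho)/dt = -i[H, rho] + sum_i (L rho L^† - {L^† L, rho}/2)."""
    out = -1j * (H @ rho - rho @ H)
    for L in Ls:
        LdL = L.conj().T @ L
        out = out + L @ rho @ L.conj().T - 0.5 * (LdL @ rho + rho @ LdL)
    return out

# Hypothetical rates (arbitrary units) for the two jump channels.
r_set, r_reset = 2.0, 0.5
ket0 = np.array([[1.0], [0.0]])
ket1 = np.array([[0.0], [1.0]])
L_set = np.sqrt(r_set) * (ket1 @ ket0.T)      # drives |0> -> |1>
L_reset = np.sqrt(r_reset) * (ket0 @ ket1.T)  # drives |1> -> |0>
H = np.zeros((2, 2))                          # diagonal (here zero) Hamiltonian

rho = np.diag([0.7, 0.3]).astype(complex)     # populations p0 = 0.7, p1 = 0.3
drho = lindblad_rhs(rho, H, [L_set, L_reset])
```

For a diagonal ρ the populations obey the classical rate equation dp0/dt = −r_set·p0 + r_reset·p1, which is the two-state Markov jump picture used in the rest of the paper.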

B. Photonic circuit components
The basic component of our circuit, used to represent both variable and check node assignments (|0⟩ and |1⟩), is a photonic latch, shown in Fig. 2, that behaves like the set-reset latch in electronics. There are several proposals for implementing latching behavior in nanophotonic circuits [11,12,30-32]. One such system, a coupled atom-cavity system [11], is shown in Fig. 2 (panel (b)). Our circuit construction is defined without reference to a particular physical system and assumes that the latch system that is used implements the following protocol.
The latch has a discrete internal degree of freedom (e.g. an atomic state) coupled to two external field modes, labeled "set" and "reset." A signal incoming to the "set" ("reset") input drives the latch into the |1⟩ (|0⟩) state. When neither the set nor reset input is powered, the latch maintains its current state. Usefully for us, driving both the set and reset inputs simultaneously (an undefined condition for the electronic set-reset latch) results in astable behavior, with the latch state repeatedly jumping between the |0⟩ and the |1⟩ state with exponentially distributed jump times.
The latch routes two input channels ($\mathrm{in}_1$ and $\mathrm{in}_2$) into two output channels ($\mathrm{out}_1$ and $\mathrm{out}_2$). When the latch is in the |0⟩ state, the outputs match the inputs ($\mathrm{out}_{1,2} = \mathrm{in}_{1,2}$); when the latch is in the |1⟩ state, the outputs are switched ($\mathrm{out}_{1,2} = \mathrm{in}_{2,1}$).
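The latch protocol just described can be summarized by a small classical model. The sketch below is only an abstraction of that protocol: the class and method names are hypothetical, and the proportionality constant between input power and jump rate is set to 1 as an assumption.

```python
from dataclasses import dataclass

@dataclass
class Latch:
    """Two-state latch: state 0 or 1, with set/reset drives and signal routing."""
    state: int = 0

    def jump_rate(self, p_set, p_reset):
        """Rate of leaving the current state.

        Power on 'set' drives 0 -> 1 and power on 'reset' drives 1 -> 0.
        With no power on the relevant input the latch holds its state.
        """
        return p_set if self.state == 0 else p_reset

    def route(self, in1, in2):
        """Pass the inputs straight through in state 0, swap them in state 1."""
        return (in1, in2) if self.state == 0 else (in2, in1)

straight = Latch(state=0).route("a", "b")
swapped = Latch(state=1).route("a", "b")
hold_rate = Latch(state=1).jump_rate(p_set=3.0, p_reset=0.0)
astable_rates = (Latch(0).jump_rate(1.0, 1.0), Latch(1).jump_rate(1.0, 1.0))
```

When both inputs are powered, the jump rate is nonzero in either state, which corresponds to the astable behavior noted above.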
In addition to the latch, our circuit uses beamsplitters with some fixed transmission and reflection coefficient. Proposals for integrated nanophotonic beamsplitting devices include [33,34]. The Gough-James (S, L, H) description of these components connected to each other and driven by coherent fields is provided in Appendix B.

C. Circuit construction
We describe how the latches, beamsplitters, and coherent inputs are used to form our expander code decoding circuit. There are two kinds of interactions to implement between the variable and check systems: parity check sums and feedback to "flip" the variable nodes.

1. Parity checks
Each check latch $Q^{\mathrm{check}}_c$ computes the parity check sum $\bigoplus_{i \in c} x_i$ of its constraint c (here ⊕ denotes addition modulo 2). The current assignment (0 or 1) of the variables included in c is represented by the states (|0⟩ or |1⟩) of the variable latches; the check latch's state is meant to represent the sum of these assignments modulo 2. As shown in Fig. 3(b), the variable latches share two common optical paths for their $\mathrm{in}_1$ and $\mathrm{in}_2$ inputs and outputs. An input field with amplitude α is incident to input port $\mathrm{in}_1$ of the first variable latch along the path. Each time a |1⟩ state is encountered at a variable latch along the beam path, the latch switches the beam path between the upper and lower branches. If the output power of the final latch is in the upper (lower) branch, then the parity of the variable assignment is odd (even), and the SET (RESET) port of the check latch receives power, driving the check latch into the |unsatisfied⟩ = |1⟩ (|satisfied⟩ = |0⟩) state. The rate at which the check latch is driven to the appropriate state is proportional to the input field power $|\alpha|^2$ in units of photons per second.
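The beam-path picture of the parity computation can be traced with a few lines of code. This is a classical sketch of the routing only: the function name is hypothetical, and we assume that the probe enters on the lower branch, so that an even number of swaps leaves the power on the RESET side.

```python
def parity_drive(variable_states, probe_power):
    """Trace probe power through a chain of variable latches.

    Each latch in state 1 swaps the upper and lower branches. Power that
    ends in the upper branch feeds SET (odd parity, check unsatisfied);
    power that ends in the lower branch feeds RESET (even parity).
    """
    upper, lower = 0.0, probe_power      # assumed: probe starts in lower branch
    for s in variable_states:
        if s == 1:
            upper, lower = lower, upper  # latch in |1> switches the beam path
    return {"set": upper, "reset": lower}

odd = parity_drive([1, 0, 1, 1], probe_power=2.0)   # three 1s: odd parity
even = parity_drive([1, 1, 0], probe_power=2.0)     # two 1s: even parity
```

In both cases all of the probe power reaches exactly one port of the check latch, so the check latch is driven toward the state that matches the parity at a rate set by the probe power.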
The check latch $Q^{\mathrm{check}}_c$ in turn routes fields that participate in the feedback circuit described in the next Section.

2. Feedback to variables
As shown in Fig. 4(c), the check latches share a common optical path. An input field with amplitude β is incident to input port $\mathrm{in}_1$ of latch $Q^{\mathrm{check}}_{v(1)}$. Subsequently, for each check latch $Q^{\mathrm{check}}_{v(i)}$, 1 ≤ i ≤ l, the second output is fed back into the second input of the same latch after passing through an attenuator (e.g. a beamsplitter) that dumps (e.g. reflects out of the beam path) a fraction γ < 1 of incident power and transmits a fraction 1−γ of the power back into the beam path.
Each time an unsatisfied parity check constraint state (|1⟩ state) is encountered at a check latch along the beam path, the power reaching the next check latch in the path is attenuated by a factor of γ. The output of the final check latch in the path $Q^{\mathrm{check}}_{v(l)}$ is routed to drive both the SET and RESET inputs of the variable latch $Q^{\mathrm{var}}_v$, causing it to "flip" between the |0⟩ and |1⟩ states.
Once a flip of variable v occurs, the parity check system discussed in the previous Section updates the states of the check systems that include this variable, resulting in an updated value of the flipping rate for variable v. If the power in the measurement circuit used to perform the parity check computation is low enough, the feedback circuit may induce multiple flips of the same variable before the measurement system reacts. We consider this situation in the numerical results Section below.
The rate at which the variable latch $Q^{\mathrm{var}}_v$ flips is proportional to the attenuated power outgoing from the final latch in the beam path:

$$R_{\mathrm{flip}}(v) \propto \gamma^{\,l - u_v} |\beta|^2, \qquad (3)$$

where $u_v$ denotes the number of unsatisfied parity check constraints that include v. If all l parity constraints that include a variable v are unsatisfied, the state of variable latch $Q^{\mathrm{var}}_v$ flips with the maximum rate proportional to $|\beta|^2$. If all l constraints are satisfied, the variable is flipped with non-zero rate proportional to $\gamma^l |\beta|^2$. Thus our circuit can induce errors. For γ ≪ 1, a single induced error should be quickly corrected since the rate for correcting it is a factor of $1/\gamma^l \gg 1$ larger than the rate for inducing it. Our circuit corrects errors that are involved in i parity check violations on a timescale proportional to $1/\gamma^{l-i}$. The smaller we make the attenuation factor γ, the fewer induced errors there are, but the longer the decoding takes to complete. We derive some bounds on the maximum value of γ in terms of the code parameters such that our procedure is likely to succeed in Appendix C. We guess that the attenuation factor γ should not be too small, since the decoding probability may increase when some induced errors are permitted, as observed in [6]. This intuition is consistent with our observations in the numerical results Section below.
Fig. 5 shows both the measurement and feedback subcircuits for a fragment of our decoder circuit corresponding to a fragment of the Tanner graph of an error-correcting code. There is one such fragment for each of the nl edges in the Tanner graph of the code. Fig. 6 shows a portion of a simulated trajectory for a fragment of the code. The top panel shows the state (|0⟩ or |1⟩) of a latch corresponding to a variable bit (blue) and the three latches corresponding to the three parity checks that include this bit (dark red). At time 0, an error causes the variable bit latch (blue) to flip state (perhaps the component malfunctioned or the feedback system induced the error). The three check latches corresponding to this bit then turn on (enter the unsatisfied, |1⟩ state) after some exponentially distributed waiting time (the mean of the waiting time is set by the input probe power used to perform the parity check sum computation). For each check latch that enters the unsatisfied |1⟩ state, the feedback power reaching the variable bit grows by a factor of 1/γ, where γ is the attenuation constant. Around time 1.25, the feedback induces the bit to flip back to the |1⟩ state. After an additional random waiting time, the three latch systems return to the satisfied |0⟩ state. Note that the feedback power reaching the bit is never 0, but reaches a minimum when all parity check constraints are satisfied.
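The dependence of the flipping rate on the number of unsatisfied checks can be tabulated directly. The sketch below assumes the rate takes the form γ^(l−u)·|β|², where u is the number of unsatisfied checks that include the variable; this is the form that interpolates between the two limits quoted in the text (|β|² when all l checks are unsatisfied, γ^l·|β|² when all are satisfied). The function name and the proportionality constant of 1 are our assumptions.

```python
import math

def flip_rate(num_unsat, l, gamma, beta_sq=1.0):
    """Assumed flip rate: each satisfied check attenuates the power by gamma."""
    return gamma ** (l - num_unsat) * beta_sq

l, gamma = 5, 0.01
rates = {u: flip_rate(u, l, gamma) for u in range(l + 1)}
timescales = {u: 1.0 / r for u, r in rates.items()}

for u in range(l, -1, -1):
    print(f"unsatisfied checks = {u}: rate = {rates[u]:.1e}, "
          f"timescale = {timescales[u]:.1e}")
```

With γ = 0.01 each additional satisfied check slows the flipping by a factor of 100, so errors involved in 5, 4, and 3 violations are corrected on timescales separated by two decades each.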

E. Fan-in/Fan-out
Our decoder circuit requires each variable latch component to participate in multiple (l) parity check constraints, and requires each parity constraint latch component to feed back to multiple (k) variables. Since the latch described in Section IV B (and in greater detail in Appendix B 3) can switch only a single pair of signal inputs, it is not on its own sufficient for our needs. We can augment our latch to achieve the desired fan-in/fan-out (and avoid the difficulty of having multiple beam paths access a single structure in a planar circuit) by breaking up each latch into a set of subsystems, each responsible for routing a single in/out signal pair. The subsystems are yet more latches, but correspond to the single pair of in/out signals latch description of Section IV B. This augmented latch is used implicitly in our circuit description above and is described in Appendix D.

V. NUMERICAL EXPERIMENTS

Our circuit evolves according to the master equation (2). Rather than solve this equation for the density matrix ρ for our system, we sample multiple trajectories of the system wavefunction |ψ⟩ and average observed quantities over these trajectories. Simulation of quantum trajectories given a master equation in the form of (2) is computationally easier than integrating the master equation and is discussed in detail in [35]. One way to perform such simulations is to sample exponentially distributed jump times for each component of the system L vector (the rate for the i-th component is $\langle\psi| L_i^\dagger L_i |\psi\rangle$), apply the nearest-in-time jump to the system wavefunction, and resample all of the jump times given the new wavefunction. In general, there is a smooth Hamiltonian evolution occurring between jumps as well, but our decoder circuit's Hamiltonian is diagonal in the {|0⟩, |1⟩} state basis, and this basis is fixed by the components of L (the jump terms), so we can ignore the smooth evolution and treat the system as a continuous time Markov jump process.
We prefer the trajectory approach in part because we want to average over different random instances of the expander code (with different network connectivities each time) and because it is useful to examine the time evolution of individual trajectories for an intuitive view of the circuit.
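The sampling scheme described above (draw an exponential waiting time for every jump channel, apply the earliest jump, then resample) can be sketched for a generic classical jump process. The function and argument names are hypothetical, and the example system is a single latch with both inputs driven; this illustrates the sampling idea only and is not the simulation code used for the figures.

```python
import numpy as np

def simulate_jumps(state, rates_fn, apply_fn, t_max, rng):
    """First-reaction sampling of a continuous time Markov jump process.

    rates_fn(state) returns one jump rate per channel; apply_fn(state, i)
    returns the state after a jump on channel i. Returns a list of
    (time, state) pairs.
    """
    t, history = 0.0, [(0.0, state)]
    while True:
        rates = np.asarray(rates_fn(state), dtype=float)
        active = rates > 0
        if not active.any():
            break                                   # no jump can occur
        waits = np.full(rates.shape, np.inf)
        waits[active] = rng.exponential(1.0 / rates[active])
        i = int(np.argmin(waits))                   # nearest-in-time jump
        t += waits[i]
        if t > t_max:
            break
        state = apply_fn(state, i)                  # then resample all times
        history.append((t, state))
    return history

# Example: one latch with set (channel 0) and reset (channel 1) both driven.
r_set, r_reset = 1.0, 1.0                           # assumed rates, arb. units
rng = np.random.default_rng(0)
history = simulate_jumps(
    0,
    lambda s: [r_set * (s == 0), r_reset * (s == 1)],
    lambda s, i: 1 if i == 0 else 0,
    t_max=20.0,
    rng=rng,
)
```

The recorded latch state alternates between 0 and 1 with exponentially distributed dwell times, matching the astable behavior described in Section IV B.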

B. Trajectories
We uniformly randomly sample 30 bits to corrupt from the initial all-0 codeword of length n = 1000 for a randomly sampled LDPC code with l = 5, k = 10, and track the remaining number of errors in time. The code is generated by randomly sampling a bipartite graph with 1000 variable nodes each with degree 5, 500 check nodes each with degree 10. We take the feedback attenuation parameter γ = 0.01, set the feedback power to 1 (arbitrary units), the probe power to something much larger ($10^5$), and set the rate for spontaneous component flips η = 0. Fig. 7 shows the number of errors remaining as a function of time averaged over 999 trajectories, and for three individual trajectories. 999 of 1000 trajectories decoded successfully (converged to the all-0 codeword). The one that did not is not included in the average.
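A much simplified version of this experiment can be sketched as follows. The sketch makes several assumptions that are ours and not the paper's: the probe is taken to be infinitely fast (check states track the true parities instantly), η = 0, the flip rate is γ^(l−u) with unit feedback power, the graph is sampled by shuffling sockets (repeated edges are allowed), and the run stops as soon as all checks are satisfied, whereas the real circuit keeps running. The function name is hypothetical.

```python
import numpy as np

def simulate_decoder(n=1000, l=5, k=10, n_err=30, gamma=0.01,
                     t_max=1e6, seed=0):
    """Simulate error correction as a Markov jump process (fast-probe limit).

    Returns a list of (time, number of bits that differ from all-0).
    """
    rng = np.random.default_rng(seed)
    m = n * l // k
    sockets = np.repeat(np.arange(n), l)
    rng.shuffle(sockets)
    checks = sockets.reshape(m, k)            # row c: variables in check c
    x = np.zeros(n, dtype=np.int64)
    x[rng.choice(n, size=n_err, replace=False)] = 1   # corrupt all-0 codeword
    t, history = 0.0, [(0.0, int(x.sum()))]
    while True:
        s = x[checks].sum(axis=1) % 2         # 1 marks an unsatisfied check
        if not s.any():
            break                             # every parity check satisfied
        unsat = np.bincount(checks.ravel(), weights=np.repeat(s, k),
                            minlength=n)      # unsatisfied checks per variable
        rates = gamma ** (l - unsat)          # assumed flip rate per variable
        total = rates.sum()
        t += rng.exponential(1.0 / total)     # waiting time to the next flip
        if t > t_max:
            break
        v = rng.choice(n, p=rates / total)    # which variable flips
        x[v] ^= 1
        history.append((t, int(x.sum())))
    return history

small_run = simulate_decoder(n=100, n_err=3, t_max=1e3, seed=1)
```

Calling `simulate_decoder()` with the default arguments mirrors the parameters quoted above; averaging the returned error counts over many seeds would give a curve analogous to the mean trace, under the stated simplifications.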
We point out two features of the trajectory simulations. One is that (e.g. the red trajectory in Fig. 7) the number of errors remaining sometimes increases in the course of a simulation. As discussed in our circuit description in Section IV C 2, the circuit induces errors at some non-zero rate and then corrects the induced errors. Errors are most likely to be induced for variables that are involved in some, but not a majority of parity check violations. When the attenuation constant γ is too high (too little attenuation), the circuit may induce errors faster than they are corrected, resulting in a failure to decode. On the other hand, as γ is decreased, the circuit corrects errors at a lower rate, suggesting an optimal value of γ in terms of a performance vs. decoding time tradeoff. This tradeoff is considered in the next Section.
Second, the empirical mean of 999 trajectories (black trace in Fig. 7) exhibits three shoulders (alternates between being locally convex and concave) in its decay toward 0. The shoulders are spaced by a factor of approximately 1/γ = 100 in time (two decades on the logarithmic time axis), corresponding to the correction of errors that are involved in 5, 4, and 3 parity check violations, respectively. The mean number of errors remaining first declines significantly at time $t \sim 10^0$, consistent with feedback at maximal rate (no attenuation) $|\alpha_{\mathrm{fb}}|^2 = 1$ flipping variables all l = 5 of whose corresponding parity check constraints are initially unsatisfied.

C. Performance vs. initial number of errors
We simulate our decoding circuit using the same code parameters as [6]: a (n = 40000, l = 5, k = 10) expander code, generated by randomly sampling a bipartite graph with 40000 variable nodes, 20000 check nodes, and degree 5 and 10 at the variable and check nodes, respectively. The performance of our decoder in simulation for these parameters is shown in Fig. 8. This performance (top panel) is somewhat better than that of [6]'s scheme and somewhat worse than their version of the scheme permitting some backwards progress, i.e., occasionally allowing the total number of parity constraint violations to increase.
We see in Fig. 8 (top) that the decoder's performance in terms of block error rate appears to saturate as the attenuation parameter γ decreases. At the same time, the median time [36] to successfully decode grows as γ decreases (bottom), since the rate to flip bits scales as a power of γ (eq. (3)). Thus we could set γ to the highest achievable value for a given channel error probability, desired mean decoding time, and probability to decode successfully.

D. Performance vs. input power with noisy circuit components
We consider the decoder's performance as a function of applied input power in terms of probability to decode, decoding rate (bits/s), and decoding energy (bits/J). Additionally, we set some non-zero rate η at which the circuit components undergo spontaneous flips (|0⟩ ↔ |1⟩). This noise affects both the variable and the check latches and in turn both the measurement and feedback parts of the circuit. Fig. 9 shows our numerical results for fixed component noise rate η, LDPC code parameters, initial number of errors, and attenuation parameter γ (see caption for parameter values).
We see (top panel of Fig. 9) that to decode successfully most of the time, the feedback power needs to be large enough to overcome the errors induced by noise in the circuit components, but not much larger than the probe power. When the feedback power is much larger than the probe power, the probe circuit is too slow to turn off the feedback once an error is corrected and too slow to turn on the feedback for new errors (induced by either the feedback or spontaneous flips), so the feedback system may induce more errors than it corrects.
For the bottom panel of Fig. 9 we fixed the probe to feedback power ratio at 1 and plotted the mean decoding rate and energy versus input power in bit/s and bit/J, respectively [37]. We defined the decoding rate as the reciprocal of the mean decoding time, conditioned on successfully decoding, and the decoding energy as the decoding rate divided by the input power. We see that for large enough input power, the decoding rate is proportional to the input power, while the energy cost per decoded bit is constant.
[Figure caption, partial: (bottom, solid lines) 90% interval for time to decode successfully. We did not track these quantities past 1875 initial errors due to low successful decoding probability. The code parameters are the same as in [6]: (n = 40000, l = 5, k = 10). We set α_probe = 10^3, α_fb = 10, η = 10^−80. We sampled 3000 trajectories for each data point.]
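The scaling observed at large input power can be written out in one line. The following is a sketch under a simplifying assumption, not a result taken from the simulations: the mean decoding time is taken to be exactly inversely proportional to the input power P, with a hypothetical constant c, and the symbols T̄, R, and E are introduced here only for illustration.

```latex
% Assumption: mean decoding time inversely proportional to input power P
\bar{T}(P) = \frac{c}{P}
\quad\Longrightarrow\quad
R(P) = \frac{1}{\bar{T}(P)} = \frac{P}{c},
\qquad
E(P) = \frac{R(P)}{P} = \frac{1}{c}.
```

Under this assumption the decoding rate R is linear in P, while the decoding energy E (rate per unit input power) does not depend on P.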

VI. DISCUSSION
We have described a photonic circuit that implements an iterative decoding scheme for expander LDPC codes. This circuit consists of a collection of optical latching relays, whose interactions via coherent fields map naturally onto the subroutines of the iterative decoder.
This circuit is autonomous: it is powered by the same optical signals that it acts upon to implement the decoding procedure, and it requires no external controller, measurement system, or clock signal. It operates robustly in the low-power limit in which quantum fluctuations of the optical fields are significant. The feedback-induced latch state fluctuations provide a natural source of randomness to drive the decoding algorithm. Crucially for the feasibility of such a system, our circuit's performance, as measured by decoding time and error rate, can be tuned smoothly by varying the optical input power. Tuning the input power can be done without loss in efficiency, as our circuit decodes a constant number of bits per Joule at a rate linear in the input power. Thus, noise that acts on the circuit components and potentially disrupts the computation can be overcome by increasing input power until the circuit works. Our construction highlights the computational utility of cavity QED-based nanophotonic components for ultralow power classical information processing, and points to the utility of the probabilistic graphical model framework in engineering autonomous optical systems that operate robustly in the quantum noise regime.

Appendix A: Gough-James circuit algebra
We briefly review the Gough-James treatment of open quantum systems and circuits composed of such systems [22,23]. We give sufficient detail for the reader to reproduce our numerical simulations.

Open quantum systems
In the Gough-James circuit algebra for modeling open quantum systems, a system coupled to n external fields is parametrized by an (S, L, H) triplet, where the scattering matrix S is n by n unitary with operator-valued entries, the coupling vector L is n by 1 with operator-valued entries, and H is the system's Hamiltonian. Fig. 10 summarizes this picture. The density matrix ρ for the system's internal degrees of freedom evolves in time according to the master equation (eq. (2)):
$$\frac{d\rho}{dt} = -i[H, \rho] + \sum_{j=1}^{n}\left(L_j \rho L_j^\dagger - \tfrac{1}{2}\{L_j^\dagger L_j, \rho\}\right),$$
where [A, B] = AB − BA, {A, B} = AB + BA, and † denotes Hermitian conjugation. The scattering matrix S does not appear in (2), but appears when we interconnect such systems below.
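As a minimal numerical sketch of the master equation, the function below evaluates its right-hand side for a finite-dimensional system, assuming the standard Lindblad form with ħ = 1. The function name `lindblad_rhs` and the example (a two-level latch with spontaneous flips at a hypothetical rate `eta = 0.1`) are illustrative assumptions, not the simulation code used for the figures.

```python
import numpy as np

def lindblad_rhs(rho, H, Ls):
    """Right-hand side of the master equation for density matrix rho.

    H is the Hamiltonian and Ls is a list of coupling operators L_j,
    all given as square complex matrices of the same dimension.
    """
    drho = -1j * (H @ rho - rho @ H)            # -i[H, rho]
    for L in Ls:
        Ld = L.conj().T                          # L_j^dagger
        LdL = Ld @ L
        drho += L @ rho @ Ld - 0.5 * (LdL @ rho + rho @ LdL)
    return drho

# Illustrative example (assumed parameters): spontaneous flips |0> <-> |1>.
eta = 0.1
sigma_01 = np.array([[0, 1], [0, 0]], dtype=complex)   # |0><1|
sigma_10 = sigma_01.conj().T                            # |1><0|
H = np.zeros((2, 2), dtype=complex)
Ls = [np.sqrt(eta) * sigma_01, np.sqrt(eta) * sigma_10]

rho0 = np.array([[1, 0], [0, 0]], dtype=complex)        # latch starts in |0>
drho = lindblad_rhs(rho0, H, Ls)
```

For a latch prepared in |0⟩, population leaves |0⟩ and enters |1⟩ at rate η, and the trace of ρ is conserved.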

Circuits
The Gough-James circuit algebra allows us to compute new (S, L, H) triplets in terms of old for two systems connected in series, in parallel, or for one system self-connected through feedback. We briefly state these circuit composition rules.
The series product takes two open quantum systems G_1 = (S_1, L_1, H_1), G_2 = (S_2, L_2, H_2) coupled to an equal number of external modes and returns the system G_2 ◁ G_1 obtained by feeding the outputs of G_1 into the inputs of G_2:
$$G_2 \triangleleft G_1 = \left(S_2 S_1,\; L_2 + S_2 L_1,\; H_1 + H_2 + \mathrm{Im}\{L_2^\dagger S_2 L_1\}\right),$$
where Im{A} = (A − A†)/(2i).
The concatenation product takes two open quantum systems G_1 and G_2, coupled to n_1 and n_2 modes, respectively, and returns the system G_2 ⊞ G_1 obtained by considering the two systems as one system coupled to n_1 + n_2 modes and introducing no interactions between them:
$$G_2 \boxplus G_1 = \left(\begin{pmatrix} S_2 & 0 \\ 0 & S_1 \end{pmatrix},\; \begin{pmatrix} L_2 \\ L_1 \end{pmatrix},\; H_2 + H_1\right).$$
The feedback product takes a single open quantum system coupled to n modes and returns the system [G]_{k→l} obtained by feeding back the k-th output mode to the l-th input mode, coupled to n − 1 external modes. The form of this product is given in [22] (Section 5) (and in the notation used here in [28], Appendix A).
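The series and concatenation rules can be sketched numerically for the simplest case, components with no internal degrees of freedom, so that S is an ordinary matrix, L an ordinary vector, and H a number. The sketch assumes the standard Gough-James forms (S_2 S_1, L_2 + S_2 L_1, H_1 + H_2 + Im{L_2† S_2 L_1}) for the series product and a block-diagonal stacking for concatenation. The function names and the example beamsplitter matrix are assumptions made for illustration.

```python
import numpy as np

def series(G2, G1):
    """Series product G2 <| G1: the outputs of G1 feed the inputs of G2."""
    S2, L2, H2 = G2
    S1, L1, H1 = G1
    S = S2 @ S1
    L = L2 + S2 @ L1
    # For c-number vectors, L2^dagger S2 L1 is the scalar np.vdot(L2, S2 @ L1).
    H = H1 + H2 + np.imag(np.vdot(L2, S2 @ L1))
    return S, L, H

def concatenate(Ga, Gb):
    """Concatenation product: stack two systems with no interaction."""
    Sa, La, Ha = Ga
    Sb, Lb, Hb = Gb
    na, nb = Sa.shape[0], Sb.shape[0]
    S = np.block([[Sa, np.zeros((na, nb))],
                  [np.zeros((nb, na)), Sb]])
    L = np.concatenate([La, Lb])
    return S, L, Ha + Hb

# Example: two single-mode coherent drives, concatenated, then fed into an
# assumed 50/50 beamsplitter (sign convention chosen for illustration).
alpha, beta = 1.0 + 0.0j, 0.5j
drive_a = (np.eye(1, dtype=complex), np.array([alpha]), 0.0)
drive_b = (np.eye(1, dtype=complex), np.array([beta]), 0.0)
drive = concatenate(drive_a, drive_b)

S_bs = np.array([[1, 1], [-1, 1]], dtype=complex) / np.sqrt(2)
beamsplitter = (S_bs, np.zeros(2, dtype=complex), 0.0)

S_out, L_out, H_out = series(beamsplitter, drive)
```

Here `L_out` holds the two output amplitudes, each a mixture of α and β. Components with internal states would need operator-valued entries, which this sketch does not cover.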

Appendix B: Components
We describe the components we need for our decoder circuit in terms of an (S, L, H) triplet, focusing on an intuitive input-output picture.

Beamsplitter
To give an intuition for these systems and to specify the components we need, we first describe the beamsplitter as an open Markov quantum system. A 50/50 beamsplitter has two input and two output ports and is parametrized by:
$$(S, L, H) = \left(\frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix},\; 0,\; 0\right). \tag{B1}$$
By examining the scattering matrix, we see that for a field incident into input port 1, half the power is transmitted into output port 1 and half is reflected into output port 2 with a π phase shift. The beamsplitter has no internal degrees of freedom that concern us here, so L = 0 and H = 0. The scattering matrix for a beamsplitter that transmits a fraction γ < 1 of incident power (our attenuation component) is a 2 by 2 rotation matrix with angle arccos √γ.
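The attenuation component can be checked numerically. The sketch below builds a 2 by 2 rotation matrix with angle arccos √γ and confirms that a fraction γ of the power incident on port 1 is transmitted to output port 1. The sign convention (the reflected field at output port 2 picks up the π phase shift) is an assumption chosen to match the description of the 50/50 case, and the function name is hypothetical.

```python
import numpy as np

def beamsplitter_scattering(gamma):
    """Rotation-matrix scattering matrix transmitting a power fraction gamma."""
    theta = np.arccos(np.sqrt(gamma))
    c, s = np.cos(theta), np.sin(theta)
    # Assumed sign convention: reflection into output port 2 carries a minus sign.
    return np.array([[c, s],
                     [-s, c]])

# 50/50 case: a field of unit amplitude incident on input port 1.
S_half = beamsplitter_scattering(0.5)
out = S_half @ np.array([1.0, 0.0])
powers = np.abs(out) ** 2          # output powers at ports 1 and 2

# Attenuator transmitting 10% of the incident power.
S_att = beamsplitter_scattering(0.1)
transmitted = np.abs(S_att[0, 0]) ** 2
```

For γ = 0.5 the rotation angle is π/4, so the power splits equally and the amplitude at output port 2 is negative.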

Coherent input field
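A coherent field input is modeled as a Weyl operator W_α, which displaces n vacuum inputs into coherent states |α_1⟩, …, |α_n⟩ with amplitudes α_1, …, α_n:
$$W_\alpha = \left(I_n,\; (\alpha_1, \ldots, \alpha_n)^T,\; 0\right). \tag{B2}$$
For example, driving the beamsplitter above with |α⟩ in the first input and |β⟩ in the second input results in the series connection:
$$(S, 0, 0) \triangleleft W_{(\alpha,\beta)} = \left(S,\; S\begin{pmatrix}\alpha \\ \beta\end{pmatrix},\; 0\right), \tag{B3}$$
where S is the beamsplitter scattering matrix, resulting in the mixing of the two inputs in the two outputs, as we expect.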

Latch
In terms of an (S, L, H) triplet, the latch is given by the concatenation (parallel product) of two systems: Q_set-reset accepts the set and reset inputs and drives the latch into the |0⟩ or |1⟩ state, and Q_in-out routes the input fields in_{1,2} into the output fields out_{1,2}. We have Q = Q_set-reset ⊞ Q_in-out, with both triplets specified in eq. (B5), where Π_0 = |0⟩⟨0| and Π_1 = |1⟩⟨1| are projection operators onto the |0⟩ and |1⟩ states and σ_01 = |0⟩⟨1|, σ_10 = |1⟩⟨0| switch |0⟩ and |1⟩. Conditional on the state of the latch, either $S_\text{in-out} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ or $S_\text{in-out} = \begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix}$, thus either switching or not switching the input fields. This is the same latch model as that used in our earlier work [8].
A possible physical system that achieves this desired behavior is shown in Fig. 2 and was first proposed in [11]. The |0⟩ and |1⟩ states are degenerate ground states of an atom in a cavity. Set, reset, and input fields are resonant with transitions to one of two excited states (|e⟩ and |s⟩), from which the atom then decays back into one of the ground states. In a regime of strong atom-cavity coupling, the limiting behavior of the switch system is obtained by using the QSDE limit theorem [38] to adiabatically eliminate the excited state dynamics. An alternate proposal for such a switch using a Kerr cavity is found in [12].
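The state-dependent routing can be illustrated with a small numerical sketch. It assumes the in-out scattering matrix has the form Π_0 ⊗ I + Π_1 ⊗ X, where X is the swapping matrix with entries −1 off the diagonal, so that the fields are switched when the latch is in |1⟩ (as stated in Appendix D). Treating the field amplitudes as a classical vector tensored with the latch state is a simplification made for illustration, and the helper `route` is hypothetical.

```python
import numpy as np

P0 = np.diag([1.0, 0.0])                    # projector |0><0|
P1 = np.diag([0.0, 1.0])                    # projector |1><1|
I2 = np.eye(2)
X = np.array([[0.0, -1.0], [-1.0, 0.0]])    # swaps the two ports, with a sign

# Ordering convention (assumed): port index (x) latch index.
S_in_out = np.kron(I2, P0) + np.kron(X, P1)

def route(latch_state, amplitudes):
    """Output amplitudes at ports 1, 2 for a latch fixed in |latch_state>."""
    ket = np.zeros(2)
    ket[latch_state] = 1.0
    vec = np.kron(np.asarray(amplitudes, dtype=complex), ket)
    out = S_in_out @ vec
    # Rows index the port, columns the latch state; keep the occupied column.
    return out.reshape(2, 2)[:, latch_state]

passed = route(0, [1.0, 2.0])     # latch in |0>: fields pass straight through
switched = route(1, [1.0, 2.0])   # latch in |1>: fields are swapped
```

The combined matrix is unitary because the two projectors are orthogonal and each branch is unitary on the port space.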

Appendix C: Bounds on nonlinearity of feedback
Consider an (n, l, k) LDPC code, and suppose there is only one variable v that needs to be flipped to return to a codeword. This variable participates in l parity check constraints, all of which are violated, so it flips at some maximal rate r. These l parity check constraints together include at most l(k − 1) variables other than v (at most because they may have some in common), each of which is involved in at least one parity constraint violation, and so flips with rate at least rγ^{l−1}. The total rate for erroneously flipping any variable other than v is then
$$R_\mathrm{err} = l(k-1)\, r\gamma^{l-1}.$$
In order to flip v before errors accumulate, we set r > R_err and find
$$\gamma < \left[l(k-1)\right]^{-1/(l-1)}.$$
In our numerical tests we have used l = 5, k = 10, yielding γ < 0.38. Numerically we found that our decoder mostly fails to decode already for γ = 0.1 (see Fig. 8), but this is an upper bound assuming only one total error.
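The numerical value quoted for l = 5, k = 10 can be reproduced with a short calculation. The sketch assumes the bound takes the form γ < [l(k − 1)]^{−1/(l−1)}, which follows from requiring r > l(k − 1) r γ^{l−1}. The function name is hypothetical.

```python
def gamma_upper_bound(l, k):
    """Upper bound on the attenuation parameter for an (n, l, k) code.

    Assumes the single-error argument: the correct flip rate r must exceed
    the total erroneous flip rate l * (k - 1) * r * gamma**(l - 1).
    """
    return (l * (k - 1)) ** (-1.0 / (l - 1))

bound = gamma_upper_bound(5, 10)   # code parameters used in the simulations
```

With l = 5 and k = 10 there are at most 45 neighboring variables, and the bound evaluates to roughly 0.386, consistent with the quoted γ < 0.38.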

Appendix D: Fan-in/Fan-out
As discussed in Section IV C, our circuit requires a latch component Q^var_v corresponding to variable bit v to participate in multiple (l) parity check constraints, and a latch Q^check_c corresponding to parity check c to feed back to multiple (k) variables. Since the latch described in Section IV B and Appendix B routes only two input and two output ports (in/out_{1,2}), it is insufficient for our needs: we need a latch that routes multiple in/out_{1,2} signal pairs, switching each pair if and only if the latch state is |1⟩ (see upper panel of Fig. 11). We can augment our latch to achieve the desired fan-in/fan-out in two ways.

Routing multiple signals
One way is to simply add extra input/output ports to the latch system depicted in Fig. 2: we could have input pairs in^(i)_{1,2} for i ∈ {1, …, N} and the corresponding outputs (in addition to the two set/reset ports) for some integer N, all coupled to the same latch state. This would be difficult to achieve in a nanophotonic system, if only due to constraints of geometry: it would be difficult to have multiple beam paths access a single structure in a planar circuit.
An alternate scheme is depicted in Fig. 11. The idea is to break up each latch into a set of N subsystems Q^(1)_route, …, Q^(N)_route, each responsible for routing a single in/out signal pair, and a single subsystem Q_sr responsible for accepting the set/reset inputs (see Fig. 11). Each of the N + 1 subsystems is another latch, but one that routes only a single in/out signal pair and fits the description of Section IV B. The set/reset subsystem Q_sr routes power (orange path in Fig. 11) to the set/reset ports of the N routing subsystems Q^(i)_route, driving the state of each routing subsystem to match the state of the set/reset subsystem. Thus the N routing subsystems Q^(i)_route all mirror the overall system state, defined as the state of the set/reset subsystem Q_sr.
This construction introduces a delay in distributing the state of the set/reset subsystem to the N routing subsystems, due both to the waiting time for a routing subsystem to switch and to the time for a signal to propagate around a circuit (we do not model the latter source of delay for this circuit). The construction also introduces extra circuit components that could be subject to noise (e.g. spontaneously changing their state). We thus need to use high-enough input power (orange path in Fig. 11) to make this construction useful.

Accepting multiple set/reset inputs
We note that we can use a similar construction to make a latch system that accepts multiple set/reset inputs in addition to routing multiple in/out signal pairs; though such a system does not appear in our decoder circuit, it may be useful for other purposes. When there are multiple set/reset input pairs for a device, these inputs lose their interpretation as "set" and "reset" for the electronic latch. We can instead associate each set/reset pair with an internal state and define an overall state as the sum modulo 2 of these internal states, so that changing any of the internal states changes the overall state. This behavior could be useful if we are interested in having a circuit component with multiple "flip" control inputs.
The idea is to break up the set/reset latch subsystem described above into a set of M subsystems Q^(1)_sr, …, Q^(M)_sr, each accepting one set/reset input pair, with the overall state defined as the sum modulo 2 of their states, so flipping the state of any of them changes the overall state. The sum modulo 2 is performed as for the parity check circuit described in Section IV C 1 and shown in Fig. 3 (b). The probe beam path (black path in Fig. 3 (b)) would now access each of the Q^(i)_sr subsystems in sequence before driving the set or reset port of each of the Q^(j)_route subsystems, as described in the previous subsection (orange path in Fig. 11).
[Figure caption, partial: implementation of latch in inset using only the single in/out signal pair latches described in Section IV B. The routing latches Q^(i)_route are each responsible for routing a single in/out signal pair. The set/reset latch Q_sr accepts external set/reset inputs and is responsible for distributing its state, the overall latch state, to the routing latches.]
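The logic of the multi-input latch can be sketched at the level of bits, leaving aside the optical implementation. The overall state is the sum modulo 2 (XOR) of the internal states, so flipping any single internal state flips the overall state. The function names and the three-input example are assumptions made for illustration.

```python
from functools import reduce
from operator import xor

def overall_state(internal_states):
    """Overall latch state: sum modulo 2 of the internal set/reset states."""
    return reduce(xor, internal_states, 0)

def flip(internal_states, i):
    """Return a copy of the internal states with the i-th state flipped."""
    new_states = list(internal_states)
    new_states[i] ^= 1
    return new_states

# Illustrative example with M = 3 internal states.
states = [1, 0, 1]
before = overall_state(states)
after_each_flip = [overall_state(flip(states, i)) for i in range(len(states))]
```

Each entry of `after_each_flip` differs from `before`, which is the "flip" control behavior described above.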