Ordered Reliability Bits Guessing Random Additive Noise Decoding

Error correction techniques traditionally focus on the co-design of restricted code-structures in tandem with code-specific decoders that are computationally efficient when decoding long codes in hardware. Modern applications are, however, driving demand for ultra-reliable low-latency communications (URLLC), rekindling interest in the performance of shorter, higher-rate error correcting codes, and raising the possibility of revisiting universal, code-agnostic decoders. To that end, here we introduce a soft-detection variant of Guessing Random Additive Noise Decoding (GRAND) called Ordered Reliability Bits GRAND that can accurately decode any moderate redundancy block-code. It is designed with efficient circuit implementation in mind, and determines accurate decodings while retaining the original hard detection GRAND algorithm's suitability for a highly parallelized implementation in hardware. ORBGRAND is shown to provide excellent soft decision block error performance for codes of distinct classes (BCH, CA-Polar and RLC) with modest complexity, while providing better block error rate performance than CA-SCL, a state of the art soft detection CA-Polar decoder. ORBGRAND offers the possibility of an accurate, energy efficient soft detection decoder suitable for delivering URLLC in a single hardware realization.

I. INTRODUCTION
Shannon's pioneering work [1] established that the highest code-rate that a channel can support is achieved as the code becomes long. Since 1978, however, it has been known that optimally accurate Maximum Likelihood (ML) decoding of linear codes is an NP-complete problem [2]. Taken together, those results have driven the engineering paradigm of co-designing significantly restricted classes of linear code-books in tandem with code-specific decoding methods that exploit the code-structure to enable computationally efficient approximate-ML decoding [3] for long, high-redundancy codes. Examples include Bose-Chaudhuri-Hocquenghem (BCH) codes with hard detection Berlekamp-Massey decoding [4], [5], Turbo codes with soft detection iterative decoders [6], Low Density Parity Check Codes (LDPCs) [7] with soft detection belief propagation decoding [8], [9], and the recently proposed CRC-Assisted Polar (CA-Polar) codes, which will be used for all control channel communications in 5G New Radio [10], with soft detection CRC-Assisted Successive Cancellation List (CA-SCL) decoding [11]-[20] and some alternate candidates [21]-[23].
Contemporary applications, including augmented and virtual reality, vehicle-to-vehicle communications, machine-type communications, and the Internet of Things, have driven demand for Ultra-Reliable Low-Latency Communication (URLLC) [24]-[28]. As realizing these technologies requires shorter codes, the computational complexity issues associated with long codes will be vacated in delivering URLLC, offering the opportunity to revisit the possibility of creating high-accuracy near-optimal universal decoders.
The development of practical universal decoders would open up a massively larger palette of potential code-books that can be decoded with a single algorithmic instantiation, greatly reducing hardware footprint, future-proofing devices against the introduction of new codes, and enabling the flexibility for each application to select the most suitable code-book.
Key to unlocking that promise is the development of algorithms that are inherently suitable for efficient implementation in circuits. One potential approach is the recently introduced Guessing Random Additive Noise Decoding (GRAND). Originally established for hard decision demodulation systems [29], [30], GRAND provides ML decodings for any moderate redundancy block-code construction. It does so by sequentially removing putative noise-effects, ordered from most likely to least likely based on a statistical channel model, from the demodulated received sequence and querying if what remains is in the code-book. The first instance where a code-book member is found is the decoding. Pseudo-code for GRAND can be found in Algorithm 1.
Algorithm 1 Guessing Random Additive Noise Decoding. Inputs: a demodulated channel output $y^n$; a code-book membership function such that $C(y^n) = 1$ if and only if $y^n$ is in the code-book; and optional statistical noise characteristics or soft information, $\Phi$. Output: decoded element $c^{n,*}$; and the number of code-book queries made, $D$, a measure of confidence in the decoding.
Consider an arbitrary block code of rate $R = k/n$ consisting of $M = 2^k$ binary strings of length $n$, $\mathcal{C}_n = \{c^{n,1}, \ldots, c^{n,M}\}$. With $c^n$ being a transmitted code-word, $y^n$ denoting the hard decision demodulation, $Z^n$ denoting an independent binary additive noise-effect on the binary sequence, and $\oplus$ denoting addition in $\mathbb{F}_2$, we have $y^n = c^n \oplus Z^n$. A maximum likelihood decoding satisfies
$$c^{n,*} = \arg\max_{i \in \{1,\ldots,M\}} P(y^n \mid c^{n,i}) = \arg\max_{i \in \{1,\ldots,M\}} P(Z^n = y^n \oplus c^{n,i}).$$
Even for short codes, naïve brute force identification of such a $c^{n,*}$ is not possible as it requires $M = 2^k$ computations for each decoding.
GRAND instead rank-orders putative noise-effects in decreasing order of likelihood, breaking ties arbitrarily, i.e. it determines the sequences $\{z^{n,i} \in \{0,1\}^n\}$ such that $P(Z^n = z^{n,i}) \ge P(Z^n = z^{n,j})$ for all $i < j$. It subtracts them from the demodulated received sequence in that order and queries whether what remains, $y^n \oplus z^{n,i}$, is in the code-book. The first noise-effect $z^{n,*}$ for which the query succeeds gives an ML decoding, $c^{n,*} = y^n \oplus z^{n,*}$, so long as noise-effects are queried in decreasing order of likelihood, even for channels with memory in the absence of interleaving [30], [31]. The simplicity of GRAND's hard detection operation and the evident parallelizability of its code-book queries have already resulted in the proposal [32], [33] and realization [34] of efficient circuit implementations. The VLSI designs in [32], [33] focus on maximizing throughput and minimizing worst-case latency by parallelization. The taped-out realization [34] provides a universal 128-bit hard decoder chip with class-leading measurements of precision, latency and energy per bit.
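The query loop described above can be sketched in a few lines. The following is a minimal illustration rather than the implementation used in the cited hardware designs: it assumes a binary symmetric channel with bit-flip probability below 1/2, so that increasing Hamming weight is a valid likelihood order, and it assumes a linear code described by a parity-check matrix. The function names `in_codebook` and `grand_hard` and the [7,4] Hamming code are hypothetical choices made for the example.

```python
from itertools import combinations

def in_codebook(word, H):
    """Return True if word satisfies every parity check in H (zero syndrome)."""
    return all(sum(h * b for h, b in zip(row, word)) % 2 == 0 for row in H)

def grand_hard(y, H, max_queries=None):
    """Query noise patterns in increasing Hamming weight.

    Returns (decoding, number_of_queries); decoding is None if the
    query budget is exhausted before a code-book member is found.
    """
    n = len(y)
    queries = 0
    for weight in range(n + 1):
        for flips in combinations(range(n), weight):
            queries += 1
            candidate = list(y)
            for i in flips:
                candidate[i] ^= 1          # remove the putative noise effect
            if in_codebook(candidate, H):
                return candidate, queries
            if max_queries is not None and queries >= max_queries:
                return None, queries
    return None, queries

# Parity-check matrix of a [7,4] Hamming code (illustrative choice only).
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]
codeword = [0, 0, 0, 0, 0, 0, 0]
received = [0, 0, 0, 0, 1, 0, 0]           # one bit flipped by the channel
decoded, queries = grand_hard(received, H)
```

The returned query count plays the role of $D$ in Algorithm 1. The optional `max_queries` argument is an assumption of this sketch that shows where an abandonment threshold could be placed.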
GRAND algorithms have two core components: a code-book membership checker and a sequential putative noise-effect sequence generator. The former is common to all GRAND variants. If the code is unstructured and stored in a dictionary, a code-book query corresponds to a tree-search with a complexity that is logarithmic in the code-length. If the code is a Cyclic Redundancy Check (CRC) code, which is typically only used for error detection, checking for code-book membership requires a simple polynomial calculation. If the code is linear in any finite field, code-book membership can be determined by a matrix multiplication and comparison. It is instead the putative noise-effect sequence generator that differs with each variant, in light of statistical or per-realization information on channel characteristics.
Incorporating soft detection information into decoding decisions is known to significantly improve accuracy [35]- [37]. Doing so requires that additional quantized soft information be passed from the receiver to the decoder and, for GRAND, the development of an appropriate noise-effect pattern generator that can accurately and efficiently create noise-effect sequences in order of decreasing likelihood in light of that soft information.
Symbol Reliability GRAND (SRGRAND) [38], [39] is a variant that avails of the most limited quantized soft information, where one additional bit tags each demodulated symbol as being reliably or unreliably received. SRGRAND retains the desirable parallelizability of the original algorithm, is readily implementable in hardware, and provides a 0.5 to 0.75 dB gain over hard-detection GRAND [39]. At the other extreme, Soft GRAND (SGRAND) [40] is a variant that uses real-valued soft information per demodulated bit to build a dedicated noise-effect query order for each received signal. Using dynamic max-heap data structures, it is possible to create a semi-parallelizable implementation in software and, being a true soft-ML decoder, it provides a benchmark for optimal decoding accuracy performance. However, SGRAND's execution is algorithmically involved and does not lend itself to hardware implementation.
Here we develop Ordered Reliability Bits GRAND (ORBGRAND), which bridges the gap between SRGRAND and SGRAND by obtaining the decoding accuracy of the latter in an algorithm that is, by design, suitable for implementation in circuits. A preliminary version of ORBGRAND that provides near-ML performance for arbitrary length, moderate-redundancy codes and block error rates (BLER) greater than $10^{-3}$ was presented at IEEE ICASSP in 2021 [41]. Its promise for a highly parallelized hardware realization has already resulted in VLSI architectures being proposed [42]-[44] and it has been used to investigate the suitability of both existing and non-traditionally structured codes for use in URLLC [45], [46]. Here we explain the rationale behind ORBGRAND's design and expand on the preliminary conference version to generate near-ML performance for higher SNR. In the process, we describe an efficient algorithm that is suitable for hardware implementation, and establish performance.
The rest of this paper is organized as follows. Section II provides a brief overview of practical short codes and other approaches to universal soft detection decoding. Section III presents the rationale behind ORBGRAND and its practical implementation, which leads to the basic and full versions of ORBGRAND. Performance evaluation results that demonstrate ORBGRAND's effectiveness are presented in Section IV. Section V closes with final remarks.

II. RELATED WORK
In the quest to identify short code solutions, new low-latency applications have placed renewed focus on conventional codes [47]-[49] such as Reed-Solomon Codes [50] and BCH codes [51]. Soft detection decoders offer a non-trivial decoding performance gain over hard decoders [3], which will be especially necessary for short, high-rate codes. However, many traditional codes do not have corresponding soft decoders. Some state-of-the-art codes with dedicated soft decoders, such as Turbo and LDPC codes, can reach near Shannon-capacity performance with long codes, but their performance degrades when used with short, high-rate codes.
Notably, Polar codes, which were the first non-random codes that were mathematically established to be capacity-achieving [52], have received significant attention. Owing to their poor performance at practical block-lengths [53]-[55], however, they have not been adopted on their own. Instead, a concatenated design has been proposed where a CRC is first added to the data, which is then Polar coded, resulting in CA-Polar codes. These codes are usually decoded with a list decoding approach where a collection of candidate Polar code-words is first determined, and then a code-word that satisfies the CRC is selected [11]-[14]. As they can be constructed at short block-lengths and have an efficient soft detection decoder, CA-Polar codes have been adopted for use for all control channel communications in the 5G New Radio standard [10]. Considered as a single code, a CA-Polar code is itself a linear code, albeit one that has no dedicated decoder. As a result, decoding with GRAND algorithms has previously established that there is additional performance left to be squeezed out of them [46].
An alternate approach to designing code-specific decoders is to instead develop a universal decoder. One class of soft detection decoders that can decode any binary linear code, which works on a list-decoding principle, has been substantially investigated [56]- [62]. In Ordered Statistics Decoding (OSD), rather than compute the conditional likelihood of the received signal for all members of the code-book, instead the computation is done for a restricted list of candidate code-words that is hoped to contain the transmitted one. The algorithm permutes the columns of the parity check matrix in a manner that depends on the received signal reliability and Gaussian elimination is then performed to rewrite the generator matrix in systematic format, subject to checks that ensure a basis is identified, so that the systematic element of the code is based on the most reliable bits. Treating the code as a hash, a candidate list of code-words is determined by placing a ball of fixed Hamming distance around the reliable bits, and completing them with the hash. Transforming elements of this list back into the original basis, maximum likelihood decoding is performed on the restricted list. To achieve approximate-ML decoding performance, multiple stages of reprocessing are required, making it a challenge to implement the algorithm efficiently in hardware, especially for high throughput designs [45].
ORBGRAND inherits GRAND's potential for a highly parallelized implementation suitable for either high throughput applications or ultra-low power for use in battery-operated devices. Leaving the code-book checker unchanged, core to ORBGRAND is a new noise-effect pattern generator that incorporates per-realization soft information in a manner that lends itself to efficient hardware implementation, as explained in the following sections.

III. ORBGRAND
We first introduce the principle behind ORBGRAND's design, before explaining how the basic and full variants are implemented.

A. ORBGRAND Principles
An $n$-bit binary block code-word $c^n \in \{0,1\}^n$ is modulated to $\mathrm{mod}(c^n) \in \{-1,1\}^n$ by $\mathrm{mod}(c_i) = 2c_i - 1$, transmitted, and impacted by independent continuous additive noise, $N^n \in \mathbb{R}^n$, resulting in a random received signal $Y^n = \mathrm{mod}(c^n) + N^n$, from which the hard decision sequence $y^n = \mathrm{demod}(Y^n)$, an estimate of $c^n$, is obtained. The noise effect is the difference between the transmitted binary code-word and the demodulated received signal, $Z^n = c^n \oplus y^n$. All GRAND algorithms make queries to identify the noise effect, $Z^n$, rather than the original continuous noise on the channel, $N^n$. With the log-likelihood ratio defined as
$$\mathrm{LLR}(Y_i) = \log \frac{P(Y_i \mid c_i = 1)}{P(Y_i \mid c_i = 0)},$$
the reliability of the hard decision $y_i$ is $|\mathrm{LLR}(Y_i)|$. While there are many ways to quantitatively capture the soft information in $Y^n$, for ORBGRAND it is instructive to first represent it as a sequence, $B^n = (B_1, B_2, \ldots, B_n)$, where $B_i$ is the a posteriori probability that the hard decision bit $y_i$ is in error, which can be written as
$$B_i = P(y_i \ne c_i \mid Y_i),$$
and so, for equally likely transmitted bits, $B_i$ can be expressed in terms of the bit reliabilities as
$$B_i = \frac{e^{-|\mathrm{LLR}(Y_i)|}}{1 + e^{-|\mathrm{LLR}(Y_i)|}},$$
where $B_i$ is monotonically decreasing with $|\mathrm{LLR}(Y_i)|$. From $B^n$ we can evaluate the a posteriori probability of a binary noise-effect sequence $z^n$,
$$P(Z^n = z^n \mid Y^n) = \prod_{i: z_i = 0} (1 - B_i) \prod_{i: z_i = 1} B_i = \prod_{i=1}^{n} (1 - B_i) \prod_{i: z_i = 1} \frac{B_i}{1 - B_i} \propto \exp\left(-\sum_{i=1}^{n} |\mathrm{LLR}(Y_i)|\, z_i\right).$$
Therefore, up to a constant shared by all sequences, the likelihood of a putative noise effect sequence $z^n$ is determined by the sum of the reliabilities of the hard-detected bits being flipped,
$$\mathrm{Rel}(z^n) = \sum_{i=1}^{n} |\mathrm{LLR}(Y_i)|\, z_i.$$
To rank order putative noise sequences, $z^n$, in decreasing likelihood, it is, therefore, sufficient to rank order them by increasing reliability sum, $\mathrm{Rel}(z^n)$.
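As a small numerical check of the statement that ordering by decreasing a posteriori probability coincides with ordering by increasing reliability sum, the sketch below enumerates all patterns for a short block. It assumes equally likely transmitted bits and independent per-bit errors, so that the probability of a wrong hard decision is $1/(1 + e^{|\mathrm{LLR}|})$. The LLR values are made up for illustration, and the function names are hypothetical.

```python
import math
from itertools import product

def error_probability(llr):
    """Posterior probability that the hard decision on a bit with this LLR is wrong."""
    return 1.0 / (1.0 + math.exp(abs(llr)))

def pattern_probability(z, llrs):
    """Posterior probability of noise-effect pattern z, assuming independent bits."""
    p = 1.0
    for zi, llr in zip(z, llrs):
        b = error_probability(llr)
        p *= b if zi else (1.0 - b)
    return p

def reliability_sum(z, llrs):
    """Rel(z): sum of |LLR| over the positions that z flips."""
    return sum(abs(llr) for zi, llr in zip(z, llrs) if zi)

llrs = [0.3, -2.1, 1.2, -0.7]                      # made-up soft values
patterns = list(product([0, 1], repeat=len(llrs)))
by_probability = sorted(patterns, key=lambda z: -pattern_probability(z, llrs))
by_reliability = sorted(patterns, key=lambda z: reliability_sum(z, llrs))
```

For these values all sixteen reliability sums are distinct, so the two orderings can be compared element by element.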
If no soft information is available, by defining $|\mathrm{LLR}(Y_i)|$ to be an arbitrary positive constant for all $i$, $\mathrm{Rel}(z^n)$ is proportional to the Hamming weight of $z^n$, $w_H(z^n) = \sum_{i=1}^{n} z_i$. In this case, putative noise sequences would be rank ordered in increasing Hamming weight, as used in the original hard detection GRAND for a binary symmetric channel. SRGRAND filters $|\mathrm{LLR}(Y_i)|$, setting it to $+\infty$ if it is above a threshold and to a positive constant if it is below that threshold, resulting in putative noise sequences being rank ordered in increasing Hamming weight within the masked region of finite reliability bits. Armed with $\{|\mathrm{LLR}(Y_i)| : i \in \{1, \ldots, n\}\}$, true soft ML decoding is achieved by SGRAND [40] using a dynamic algorithm that recursively generates a max-heap for each set of reliabilities to generate $z^n$ with increasing $\mathrm{Rel}(z^n)$. Our goal with ORBGRAND is to obtain comparable performance with an algorithm that is amenable to efficient implementation by design.
For notational simplicity, we shall assume that the reliabilities, {|LLR(Y i )| : i ∈ {1, . . . , n}}, happen to be received in increasing order of bit position, so that |LLR(Y i )| ≤ |LLR(Y j )| for i ≤ j. In practice, for each received block we sort the reliabilities and store the permutation, π n = (π 1 , . . . , π n ), such that π i records the received order index of the i th least reliable bit. The permutation π n enables us to map all considerations back to the original order that the bits were received in.
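The sorting step and the mapping back to received order can be illustrated as follows. This is a minimal sketch with hypothetical function names; it uses 0-based indices, whereas the text counts bit positions from 1, and the reliability values are made up.

```python
def reliability_permutation(reliabilities):
    """Return pi, where pi[i] is the received index of the (i+1)-th least reliable bit."""
    return sorted(range(len(reliabilities)), key=lambda i: reliabilities[i])

def to_received_order(z_sorted, pi):
    """Map a pattern expressed in rank order back to received bit positions."""
    z = [0] * len(pi)
    for rank, bit in enumerate(z_sorted):
        z[pi[rank]] = bit
    return z

reliabilities = [1.4, 0.2, 2.0, 0.9]               # made-up |LLR| values
pi = reliability_permutation(reliabilities)
# Flip the two least reliable bits, expressed in rank order.
z = to_received_order([1, 1, 0, 0], pi)
```

Only `pi` depends on the received block; the rank-order patterns themselves can be generated without reference to the soft values.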
The core of the approach underlying ORBGRAND is the development of statistical models of the non-decreasing sequence $\{|\mathrm{LLR}(Y_i)| : i \in \{1, \ldots, n\}\}$ that are accurate, robust, and lead to computationally efficient algorithms for generating rank ordered putative noise sequences. The approach can be most readily understood with the example of a channel using BPSK modulation that is subject to Additive White Gaussian Noise (AWGN), where $\mathrm{LLR}(Y) \propto Y$. As constants of proportionality will prove to have no impact on ORBGRAND's order, from here on we will refer to $L_i = |Y_i|$ as the reliability of the $i$-th bit. Sample rank ordered reliability values $\{L_i : i \in \{1, \ldots, n\}\}$ are plotted in Fig. 1 for various SNRs. At lower SNR, the reliability curve is near linear with a zero intercept, while for high SNR the intercept is non-zero and there is notable curvature, particularly for the least reliable bits, which are most significant for generating an accurate query order. Different levels of approximation to the reliability curve lead to distinct decoding complexity and performance, as will be explored in the following sections.

B. Basic ORBGRAND -The Low SNR Model
The simplest statistical model, $\lambda^n$, for the reliability curve is a line through the origin with slope $\beta > 0$,
$$\lambda_i = \beta i, \quad i \in \{1, \ldots, n\}. \qquad (2)$$
This model is illustrated by the dashed line in Fig. 1, where it can be seen to provide a good approximation at lower SNR. For the zero-intercept linear model,
$$\mathrm{Rel}(z^n) \approx \sum_{i=1}^{n} \lambda_i z_i = \beta \sum_{i=1}^{n} i z_i = \beta\, w_L(z^n),$$
where we define $w_L(z^n) = \sum_{i=1}^{n} i z_i$, the sum of the positions that are flipped in $z^n$, to be the Logistic Weight of the binary sequence $z^n$. Thus, in this model the likelihoods of putative noise effect sequences are ordered in increasing logistic weight and hence the value of $\beta$ need not be estimated. As a result, for any $\beta$, the first putative error sequence always corresponds to no bits being flipped, which has $w_L = 0$. The second query corresponds to $z^n$ with the least reliable bit flipped, having $w_L = 1$. The third corresponds to only the second least reliable bit of $z^n$ flipped, which has $w_L = 2$. The next query is either the noise-effect where only the third least reliable bit is flipped, or the one where the least reliable and second least reliable bits are both flipped, both having $w_L = 3$, with the tie broken arbitrarily. The ordering proceeds in that fashion as illustrated in Fig. 2, which describes the noise-effect sequence generator in basic ORBGRAND [41]. Thus, for its operation, ORBGRAND based on this statistical model only requires the permutation recording the positions of the rank ordered reliabilities of the received bits, $\pi^n$, from which the algorithm proceeds deterministically. Note that for a binary string of length $n$, the maximum logistic weight is achieved by the sequence of all 1s, giving $w_L(1, \ldots, 1) = n(n+1)/2$, and so what remains to do for basic ORBGRAND is to develop an efficient algorithm that sequentially generates putative noise sequences in terms of increasing logistic weight. To do so, we must be able to identify all allowable noise-effect sequences for each given logistic weight $W \in \{0, \ldots, n(n+1)/2\}$,
$$S_W = \{z^n \in \{0,1\}^n : w_L(z^n) = W\}.$$
That objective can be fractionated by conditioning on the Hamming weight, $w$, of the sequences, giving
$$S_W = \bigcup_{w:\, w(w+1)/2 \le W} \{z^n : w_H(z^n) = w,\ w_L(z^n) = W\}, \qquad (6)$$
where the upper-bound on the union stems from the fact that if the Hamming weight of $z^n$ is $w$, the smallest logistic weight that $z^n$ can have is from the sequence with flipped bits in the first $w$ positions of $z^n$, giving a logistic weight of $w_L(z^n) = w(w+1)/2 \le W$. Consider a single set in the union in Eq. (6) for Hamming weight $w$. Determining its elements is equivalent to finding all integer vectors $v^w = (v_1, \ldots, v_w)$ such that
$$1 \le v_1 < v_2 < \cdots < v_w \le n \quad \text{and} \quad \sum_{i=1}^{w} v_i = W, \qquad (7)$$
where $v^w$ contains the indices of the flipped bits in $z^n$, which amounts to finding all integer partitions of $W$ of size $w$ with non-repeating positive parts subject to a maximum value of $n$.
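The query order implied by the zero-intercept model can be made concrete with a simple recursive enumeration. The sketch below is illustrative only: it is not the hardware-oriented generator described in the following section, and the function names are hypothetical. Positions are 1-based ranks, with 1 denoting the least reliable bit.

```python
def distinct_partitions(total, parts, largest):
    """Yield increasing tuples of `parts` distinct integers in 1..largest summing to total."""
    if parts == 0:
        if total == 0:
            yield ()
        return
    # v is the largest part; the remaining parts must be smaller and distinct.
    for v in range(min(largest, total), 0, -1):
        for rest in distinct_partitions(total - v, parts - 1, v - 1):
            yield rest + (v,)

def logistic_weight_patterns(n):
    """Yield tuples of flipped ranks in order of increasing logistic weight."""
    for W in range(n * (n + 1) // 2 + 1):
        w = 0
        # A pattern of Hamming weight w has logistic weight at least w(w+1)/2.
        while w <= n and w * (w + 1) // 2 <= W:
            yield from distinct_partitions(W, w, n)
            w += 1

first_six = []
for pattern in logistic_weight_patterns(4):
    first_six.append(pattern)
    if len(first_six) == 6:
        break
```

For $n = 4$ the first six patterns are the empty pattern, then ranks (1), (2), (3), (1, 2) and (4), matching the order described in the text, with the tie at logistic weight 3 broken by Hamming weight in this sketch.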

By setting
$$u_i = v_i - i, \quad i \in \{1, \ldots, w\}, \qquad (8)$$
it is possible to reformulate the set in Eq. (7) in one final way in terms of the $u_i$, as the integer partitions of $W' = W - w(w+1)/2$ into $w$ not-necessarily distinct, non-negative parts no larger than $n' = n - w$ [63]. That is, determining all the elements in the set in Eq. (7) is equivalent to finding all integer vectors $u^w = (u_1, \ldots, u_w)$ such that
$$0 \le u_1 \le u_2 \le \cdots \le u_w \le n' \quad \text{and} \quad \sum_{i=1}^{w} u_i = W'.$$
Here we introduce an efficient algorithm for determining all sequences that are in the partition, which is suitable for implementation in hardware. It will form an essential component of the full ORBGRAND, which uses a more sophisticated model than described in Eq. (2).

C. Integer Partition Pattern Generator
Integer partitions can be represented by diagrams [64] as illustrated in Fig. 3, where each column represents an integer part with its value, $u_i$, equal to the number of cells in the column, and the total number of cells in the diagram equal to the integer to be partitioned, $\sum_i u_i$. Here we use a mirror ... Fig. 3 (c) and (e). Fig. 3 (a) represents an extreme case in which the minimum number of non-zero integer parts is achieved by pushing cells to the right with part values maximized. The other extreme case is that cells are spread over the maximum number of parts, achieving the minimum number of rows or, equivalently, satisfying $D(1) \le 1$, as illustrated in Fig. 3 (h). All partitions for the setting of $W' = 8$, $w = 4$ and $n' = 4$ are obtained in the migration procedure from (a) to (h), which can be accomplished with the Landslide algorithm presented in Algorithm 2.
The algorithm heavily relies on the Build-mountain routine, in which a partial partition is performed to push unallocated cells to the right-most parts, akin to building the steepest, highest mountain allowable on the right side of the diagram. For example, in Fig. 3 (e), u 1 = 1 is determined from step (d), the remaining 7 cells are to be assigned to u 2 , u 3 and u 4 . The assignment can be accomplished by first making u 2 , u 3 and u 4 identical to u 1 = 1, and then assigning the remaining 4 cells to the right-most parts, with u 4 maximized and u 3 increased by 1.
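In general, when the values of $\{u_1, u_2, \ldots, u_k\}$ have been specified, the allocation of the remaining cells to $\{u_{k+1}, u_{k+2}, \ldots, u_w\}$, or the Build-mountain routine, is carried out as in the example above: each of $u_{k+1}, \ldots, u_w$ is first set equal to $u_k$, and the cells that remain are then assigned to the right-most parts, filling each part to its maximum value before moving one part to the left. The initial partition in Fig. 3 (a) is obtained with the same method by simply assuming a dummy part $u_0 = 0$. With the Build-mountain routine explained, the Landslide algorithm is described in Algorithm 2.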

Algorithm 2 The Landslide Algorithm
Input: $W'$, $w$, $n'$
Output: $\{u^{w,j}, j = 1, 2, \ldots\}$
1: Build-mountain for initial partition
2: $j \leftarrow 1$
3: $u^{w,j} \leftarrow u^w$
4: Update $D(i)$ for $1 \le i \le w$
5: while $D(1) \ge 2$ do
...
9: Update $D(i)$ for $1 \le i \le w$
10: $j \leftarrow j + 1$
11: $u^{w,j} \leftarrow u^w$
12: end while
13: Return $u^{w,1}, u^{w,2}, u^{w,3}, \ldots$
For the example in Fig. 3, the procedure of the Landslide algorithm is illustrated in Fig. 4, along with the mapping from partition $u^{w,i}$ to $v^{w,i}$ according to Eq. (8). The diagram indicates the potential for efficient implementation of the Landslide algorithm. While one routine generates partitions for one Hamming weight $w$ at a time, multiple parallel routines can generate partitions for different Hamming weights, providing sufficient noise-effect sequences for highly-parallelized code-book checking.
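One way to realize the migration from the steepest partition to the flattest one is sketched below. It rests on assumptions that are not stated in the surviving text: $D(i)$ is taken to be the drop $u_w - u_i$, and each pass of the loop is taken to raise by one the right-most part that sits at least two cells below the last part, followed by Build-mountain on the parts to its right. Under those assumptions the sketch reproduces the eight partitions of the example with 8 cells, 4 parts and a maximum part value of 4. The function names are hypothetical.

```python
def build_mountain(u, k, total, n_max):
    """Reset parts u[k:] to the value of the part before them, then pile
    the leftover cells onto the right-most parts, each capped at n_max."""
    base = u[k - 1] if k > 0 else 0     # dummy part u_0 = 0 for the initial partition
    for i in range(k, len(u)):
        u[i] = base
    left = total - sum(u)
    i = len(u) - 1
    while left > 0:
        add = min(left, n_max - u[i])
        u[i] += add
        left -= add
        i -= 1

def landslide(total, w, n_max):
    """Return all non-decreasing w-tuples with entries in 0..n_max summing to total."""
    if w == 0 or not 0 <= total <= w * n_max:
        return [()] if (w == 0 and total == 0) else []
    u = [0] * w
    build_mountain(u, 0, total, n_max)
    partitions = [tuple(u)]
    while u[-1] - u[0] >= 2:            # assumed meaning of D(1) >= 2
        # Right-most part at least two cells below the last part.
        k = max(i for i in range(w) if u[-1] - u[i] >= 2)
        u[k] += 1
        build_mountain(u, k + 1, total, n_max)
        partitions.append(tuple(u))
    return partitions

def to_flip_positions(u):
    """Map a partition u to distinct flip positions via v_i = u_i + i (1-based i)."""
    return tuple(ui + i for i, ui in enumerate(u, start=1))

example = landslide(8, 4, 4)
```

In this sketch the fifth partition produced is (1, 1, 2, 4), which agrees with the description of Fig. 3 (e) in the text. Each step touches only a suffix of the parts, which is the property that makes the procedure attractive for a circuit.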

D. The full ORBGRAND Algorithm
The zero-intercept, linear statistical model for rank-ordered bit reliabilities that underpins basic ORBGRAND in Eq. (2) requires no input beyond a rank ordering of received hard-detection bits by increasing reliability and provides a good approximation to the reliability curve in low SNR conditions. As Fig. 1 shows, however, it is a poor description at higher SNR. That mismatch results in basic ORBGRAND's query order diverging from true likelihood order at higher SNR, with corresponding performance loss. By expanding the statistical model used to describe the reliability data to a piece-wise linear one for full ORBGRAND, we retain the algorithmic efficiencies of generating integer partition sequences while improving block error rate performance at higher SNR.
As illustrated in Fig. 5, with $I_0 = 0$ and $I_m = n$, the $m$-segment statistical model curve is represented as
$$\lambda_j = J_{i-1} + \beta_i (j - I_{i-1}) \quad \text{for } I_{i-1} < j \le I_i, \qquad (10)$$
where $1 \le i \le m$ is the segment index. The anchor indices $\{I_i : i \in \{0, 1, \ldots, m\}\}$ define the domain of each segment, while $J_{i-1} \in \mathbb{Z}$ and $\beta_i \in \mathbb{N}$, respectively, determine the initial value and slope of the $i$-th segment. That $J_{i-1}$ and $\beta_i$ are restricted to being integers is crucial to enabling efficient algorithmic implementation producing rank ordered putative noise sequences, and results will demonstrate that no loss in performance results from this constraint. The model used for basic ORBGRAND, Eq. (2), is a special case of Eq. (10) with $m = 1$, $I_1 = n$ and $J_0 = 0$.
The approximate reliability sum of $z^n$, namely the reliability weight, based on the full model is then
$$w_R(z^n) = \sum_{j=1}^{n} \lambda_j z_j,$$
and the likelihood of noise effect sequences decreases with increasing reliability weight. With this new approximation, the set of noise-effect sequences for a weight of $W$ becomes
$$S_W = \{z^n : w_R(z^n) = W\} = \bigcup_{W^m \in \Xi_W} \Psi^1_{W_1} \times \cdots \times \Psi^m_{W_m}, \qquad (11)$$
where
$$\Psi^i_{W_i} = \Big\{ (z_{I_{i-1}+1}, \ldots, z_{I_i}) : \sum_{j=I_{i-1}+1}^{I_i} \lambda_j z_j = W_i \Big\}$$
for $i \in \{1, \ldots, m\}$ and $\times$ represents Cartesian product. Thus, to generate all elements of $S_W$ in Eq. (11), we identify the set of all possible splitting patterns of $W$, denoted by
$$\Xi_W = \Big\{ W^m = (W_1, \ldots, W_m) : W_i \ge 0,\ \sum_{i=1}^{m} W_i = W \Big\}, \qquad (12)$$
using Algorithm 3, explained later. For a given $W^m = (W_1, \ldots, W_m) \in \Xi_W$, consider the generation of the partial sequence set $\Psi^i_{W_i}$ defined in Eq. (11). Recalling Eq. (10), each partial sequence must satisfy
$$\sum_{j=I_{i-1}+1}^{I_i} \big(J_{i-1} + \beta_i (j - I_{i-1})\big) z_j = W_i, \qquad (13)$$
which, defining $w_i = w_H(z_{I_{i-1}+1}, \ldots, z_{I_i})$ and with $v_k = j_k - I_{i-1}$ being the relative indices of the flipped bits, is equivalent to
$$\sum_{k=1}^{w_i} v_k = \frac{W_i - w_i J_{i-1}}{\beta_i}. \qquad (14)$$
Eq. (14) indicates that, with the partial reliability weight $W_i$ and Hamming weight $w_i$ specified for the $i$-th segment, the partial noise-effect sequence generation reduces to the integer partition problem that is efficiently solved by the Landslide algorithm in Section III-C. Splitting a reliability weight value of $W$ into $m$ parts, as defined in Eq. (12), is a distinct integer partition problem, which we call the integer splitting problem for differentiation.
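To make the piece-wise linear model concrete, the sketch below evaluates the model curve and the reliability weight of a pattern. It assumes that segment $i$ assigns the value $J_{i-1} + \beta_i (j - I_{i-1})$ to rank $j$ for $I_{i-1} < j \le I_i$, which is consistent with the description of $J_{i-1}$ as the initial value and $\beta_i$ as the slope. The parameter values and function names are hypothetical.

```python
def piecewise_lambda(anchors, offsets, slopes):
    """Return [lambda_1, ..., lambda_n] for anchors I_0..I_m,
    offsets J_0..J_{m-1} and slopes beta_1..beta_m (all integers)."""
    lam = []
    for i in range(1, len(anchors)):
        start, end = anchors[i - 1], anchors[i]
        for j in range(start + 1, end + 1):
            lam.append(offsets[i - 1] + slopes[i - 1] * (j - start))
    return lam

def reliability_weight(z, lam):
    """Sum of the model values lambda_j over the flipped ranks of z."""
    return sum(l for zj, l in zip(z, lam) if zj)

# Two segments over n = 8 ranks (illustrative parameters only).
lam = piecewise_lambda(anchors=[0, 3, 8], offsets=[2, 5], slopes=[1, 2])
weight = reliability_weight([1, 0, 0, 1, 0, 0, 0, 0], lam)

# One segment with zero offset reduces to the basic model, beta * j.
basic = piecewise_lambda(anchors=[0, 4], offsets=[0], slopes=[3])
```

Because every model value is an integer, the reliability weight takes integer values and patterns can be generated weight by weight, as in the basic algorithm.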
The difference here lies in that the same group of parts in different orders constitutes distinct splitting patterns. A common approach to finding all splitting patterns in $\Xi_W$ is given in Algorithm 3, which starts with sweeping $W_1$ from 0 to $W$. For a given value of $W_1$, $W_2$ is swept from 0 to $W - W_1$. For each fixed $W_1$ and $W_2$, $W_3$ is swept, and the nested loop continues until it reaches $W_{m-1}$. Then $W_m$ is computed as $W - \sum_{i=1}^{m-1} W_i$.
Eq. (14) indicates that the actual number of valid splitting patterns is, however, much smaller than $\xi_W$, the total number of splitting patterns, owing to the requirement that each non-zero element $W_i$ of a valid $W^m$ must, for some partial Hamming weight $w_i \ge 1$, satisfy all of:
$$(W_i - w_i J_{i-1}) \bmod \beta_i = 0, \qquad \frac{w_i (w_i + 1)}{2} \le \frac{W_i - w_i J_{i-1}}{\beta_i} \le \frac{w_i \big(2 (I_i - I_{i-1}) - w_i + 1\big)}{2}. \qquad (15)$$
Therefore, any non-zero element $W_i$ in $W^m$ must be associated with a non-empty set of partial Hamming weights $\{w_i\}$ such that Eq. (15) is satisfied. Otherwise $W^m$ is invalid and should be discarded. The associated set for $W_i$ can be obtained with Algorithm 4. A FAIL return from Algorithm 4 invalidates $W_i$ as well as the whole splitting pattern $W^m$. In Algorithm 3, each new value of $W_i$ is checked with Algorithm 4. A return of FAIL discards the current value of $W_i$ and forces the loop to jump to the next iteration with a new value of $W_i$. Only when $W_i$ is validated can the follow-up nested loop over $W_{i+1}$ continue. Each returned set of partial Hamming weights $\{w_{i,k}, k = 1, 2, \ldots\}$ should also be saved for the later generation of partial noise-effect sequences.

Algorithm 4 The collection algorithm for valid partial Hamming weights
In addition to the validation from Algorithm 4, more measures are available for further reduction of the set size of Ξ W . For example, after the initial value of 0, W i can jump to J i−1 + β i omitting all values in between. Generally, due to the small segment number m in practice, the generation of splitting patterns W m has limited impact on the overall efficiency of the ORBGRAND algorithm, which is instead dominated by the efficient Landslide algorithm.
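The validation of a partial reliability weight can be sketched as follows. This is not a transcription of Algorithm 4, whose listing is not reproduced here. It assumes the segment model $\lambda_j = J_{i-1} + \beta_i (j - I_{i-1})$, so that a partial Hamming weight $w$ is usable when $W_i - w J_{i-1}$ is a multiple of $\beta_i$ and the quotient lies between the smallest and largest sums of $w$ distinct relative indices. An empty list plays the role of the FAIL return. The function name and parameter values are hypothetical.

```python
def valid_partial_hamming_weights(W_i, J_prev, beta, seg_len):
    """Return the Hamming weights w for which a segment of seg_len ranks,
    with offset J_prev and slope beta, can carry partial weight W_i.
    An empty list means the partial weight is invalid (FAIL)."""
    if W_i == 0:
        return [0]                       # only the all-zero partial pattern
    valid = []
    for w in range(1, seg_len + 1):
        rest = W_i - w * J_prev
        if rest % beta != 0:
            continue
        target = rest // beta            # required sum of relative indices
        smallest = w * (w + 1) // 2      # indices 1..w
        largest = w * (2 * seg_len - w + 1) // 2   # the w largest indices
        if smallest <= target <= largest:
            valid.append(w)
    return valid

# Segment with model values 3, 4, 5 (J_prev = 2, beta = 1, three ranks).
ok = valid_partial_hamming_weights(7, J_prev=2, beta=1, seg_len=3)
fail = valid_partial_hamming_weights(1, J_prev=2, beta=1, seg_len=3)
```

In the example, a partial weight of 7 can only be formed by flipping two bits (3 + 4), while a partial weight of 1 cannot be formed at all, so the corresponding splitting pattern would be discarded.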
A significant complexity reduction is, however, available if $J_{i-1}$ is divisible by $\beta_i$. In this case, $W_i$ must also be divisible by $\beta_i$ for Eq. (15) to be satisfied. This can be achieved by sweeping $W_i$ in steps of size $\beta_i$. The validation of a partial Hamming weight $w_i$ is then straightforward, removing the need for Algorithm 4. The extra restriction on $J_{i-1}$ logically leads to a potential performance loss. As demonstrated by later simulations, that loss is minor and is justified by the complexity reduction.
Given parameters of the statistical model in Eq. (10), all the components necessary to create the full ORBGRAND algorithm have been described. The likelihood order of generated noise-effect sequences is governed by the increasing value of reliability weight. For each specified weight value $W$, Algorithm 3 (or its optimized version) is used to generate $\Xi_W$, the set of valid splitting patterns. Each splitting pattern $W^m \in \Xi_W$ has its element (or partial reliability weight) $W_i$ assigned to the $i$-th segment. In each segment, Eq. (14) indicates that the Landslide algorithm can efficiently generate $\Psi^i_{W_i}$, the set of all possible partial noise-effect patterns, as defined in Eq. (11). The Cartesian product over partial sequence sets, as shown in Eq. (11), is performed to create the set of noise-effect sequences for the current splitting pattern $W^m$. Finally, the union operation in Eq. (11) forms $S_W$, the full set of noise-effect sequences for the reliability weight of $W$. Parallel implementation can be achieved at several levels, such as jointly generating partial sequences for multiple segments, or concurrently generating noise-effect sequences for multiple splitting patterns. What remains now is determining the parameters in the piece-wise linear model.

E. Piece-wise linear fitting and quantization
Key to ORBGRAND's practical complexity is that it operates on λ n = (λ 1 , . . . , λ n ), an approximation to the original rank ordered reliability curve for a given received code block (L 1 , . . . , L n ). The approximation level determines the trade-off between algorithmic complexity and the decoding performance. The simplest statistical model is a line through the origin, which only requires rank ordering of the received bits by their reliability, but results in degraded performance at higher SNR scenarios. The model underlying the full ORBGRAND necessitates two stages: piece-wise linear fitting and quantization. While there are numerous approaches for either higher accuracy or lower complexity, here we introduce a method with moderate algorithmic complexity that serves as a reference design and demonstrates the robustness of ORBGRAND.
Given an independent and identically distributed set of random variables, $\{A_i : i \in \{1, \ldots, n\}\}$, drawn from a cumulative distribution $F_A$, results from the theory of Order Statistics [65] tell us that rank ordering from least to greatest, so that $A_{(i)}$ is the $i$-th smallest value, leads to $A_{(i)} \approx F_A^{-1}(i/n)$ for $1 \le i \le n$ and large $n$. $F_A^{-1}(\cdot)$ is a monotonically increasing function and serves as the functional mean of rank ordered ensembles of observations $\{A_i : i \in \{1, \ldots, n\}\}$. For rank ordered reliabilities of blocks of bits received from the channel, $(L_1, \ldots, L_n)$, this serves as guidance for a fitting procedure for the statistical model.
We can, therefore, use the edge point at index $I_{a,0} = 1$ and the center point at index $I_{a,1} = n/2$ on $L^n$ as the initial set of anchor points from which other anchor points for segmentation can be found, as illustrated in Fig. 6. A straight line is drawn linking the anchor points at $I_{a,0}$ and $I_{a,1}$. The maximum vertical gap between the straight line and the reliability curve determines the location of the new anchor point, with its index marked as $I_{a,2}$. New anchor points can be found between adjacent anchor points in the same way. A rule of thumb is that more points should be located in the high-curvature area near the edge. The indices of anchor points define the segmentation of the reliability curve, and the line segments linking adjacent anchor points form a piece-wise linear fitting to the reliability curve.
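The anchor-selection rule can be sketched as a single pass over the points between two existing anchors. The function name and the sample curve are hypothetical, and the code uses 0-based indices, so the edge anchor at index 1 in the text corresponds to index 0 here.

```python
def next_anchor(L, a, b):
    """Return (index, gap) for the point strictly between anchors a and b
    whose vertical distance to the chord joining them is largest."""
    slope = (L[b] - L[a]) / (b - a)
    best, best_gap = None, -1.0
    for i in range(a + 1, b):
        gap = abs(L[i] - (L[a] + slope * (i - a)))
        if gap > best_gap:
            best, best_gap = i, gap
    return best, best_gap

# Made-up rank ordered reliabilities with curvature near the left edge.
L = [0.0, 0.5, 0.8, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5]
index, gap = next_anchor(L, 0, 4)
```

Applying the same function between each pair of adjacent anchors yields further anchors. A small returned gap is a natural signal that another segment would add little.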

[Fig. 6: reliability versus rank order.]
If the curve $L^n$ is close to a straight line between two anchor points, an additional segment is unnecessary: a superfluous segment has little impact on performance, but adds some overhead in the splitting of the logistic weight. The same fitting technique can be applied to the high reliability area near the right edge; however, as shown in Fig. 6, we choose to extend the central line to cover that area. In low SNR cases, the extended straight line by itself is a good approximation, and in high SNR cases, high reliability bits have little influence on the generation order of noise-effect sequences. This assertion has been verified with simulations.
From Eq. (10), the piece-wise linear approximating curve is defined by three sets of non-negative integer parameters: $I_i \in \mathbb{Z}^+$, $0 \le i \le m$, the indices for segmentation; $J_i \in \mathbb{Z}$, $0 \le i \le m-1$, the offset of each linear segment; and $\beta_i \in \mathbb{N}$, $1 \le i \le m$, the slope of each segment. When $m+1$ anchor points on $L^n$ have been obtained, their indices are used as the segmentation indices and are denoted $I_i$, $i = 0, 1, 2, \ldots, m$, where $I_0 = 0$ and $I_m = n$. We further use the smallest slope of the fitted lines to quantize the parameters, where the slope of the first segment is treated specially due to the lack of $L_0$. The quantized line parameters are then computed by rounding, where $[\,\cdot\,]$ is the rounding operation. Again, $\beta_1$ and $J_0$ are treated specially for the first segment. A complexity reduction technique in Section III-D requires $J_{i-1}$ to be an integer multiple of $\beta_i$, which can be easily achieved with the operation $[J_{i-1}/\beta_i]\,\beta_i$. The segmentation method in Fig. 6 and the line parameters obtained from Eq. (17) complete the piece-wise linear fitting and quantization.
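The quantization step described above can be sketched as follows. The specific formulas are an assumed reading of the prose, not the authors' equations: the quantization step is taken to be the smallest fitted slope, each $\beta_i$ is the rounded ratio of its segment slope to that step, each offset is the rounded, scaled reliability at the start of its segment, and $J_0$ is obtained by stepping back one position from $L_1$. The fitted curve is assumed to have the form $\lambda_i = J_{j-1} + \beta_j (i - I_{j-1})$ for $I_{j-1} < i \le I_j$. All function names are hypothetical.

```python
import math


def round_half_up(x):
    """Rounding operation [x]: nearest integer, ties rounded up."""
    return math.floor(x + 0.5)


def quantize_fit(L, anchors, snap_offsets=False):
    """Quantize a piece-wise linear fit. L[k - 1] holds L_k; `anchors` are
    1-based, increasing, and start at index 1.
    Returns (I, J, beta): segmentation indices, offsets and slopes."""
    n = len(L)
    # Slope of each fitted line between adjacent anchor points. The first
    # line starts at L_1 because L_0 does not exist.
    slopes = [(L[b - 1] - L[a - 1]) / (b - a) for a, b in zip(anchors, anchors[1:])]
    q = min(slopes)  # assumed quantization step; must be strictly positive
    beta = [max(1, round_half_up(s / q)) for s in slopes]
    # First offset: quantized L_1 moved back one position to index 0.
    J = [round_half_up(L[0] / q) - beta[0]]
    # Remaining offsets: quantized reliability at each interior anchor.
    J += [round_half_up(L[a - 1] / q) for a in anchors[1:-1]]
    if snap_offsets:
        # Force J_{i-1} to be an integer multiple of beta_i.
        J = [round_half_up(j / b) * b for j, b in zip(J, beta)]
    # I_0 = 0 and I_m = n; the last line is extended to cover the right edge.
    I = [0] + list(anchors[1:-1]) + [n]
    return I, J, beta


def fitted_curve(I, J, beta):
    """Evaluate lambda_i = J_{j-1} + beta_j * (i - I_{j-1}), I_{j-1} < i <= I_j."""
    lam = []
    for j in range(1, len(I)):
        lam += [J[j - 1] + beta[j - 1] * (i - I[j - 1])
                for i in range(I[j - 1] + 1, I[j] + 1)]
    return lam


if __name__ == "__main__":
    L = [3.0, 5.0, 7.0, 8.0, 9.0]  # toy curve that is exactly two lines
    I, J, beta = quantize_fit(L, [1, 3, 5])
    print(I, J, beta, fitted_curve(I, J, beta))
    print(quantize_fit(L, [1, 3, 5], snap_offsets=True))
```

With `snap_offsets=True` the offsets move by less than one slope unit each, which is the kind of small perturbation whose effect on performance is examined in Section IV.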

IV. PERFORMANCE EVALUATION
As stated in the motivation, ORBGRAND is particularly well suited to low to moderate redundancy. Such redundancy regimes can be achieved with short codes, or with longer codes of sufficiently high rate that the number of redundancy bits is low to moderate. Commonly used codes have structures that limit their operating range, but Random Linear Codes (RLCs) have no such limitation and can be constructed for any length and rate. As an illustration, Fig. 7 shows a heat map of block error rates (BLERs) for different code lengths and rates for RLCs decoded using basic ORBGRAND.
With the range of suitable rates and lengths for ORBGRAND in mind, we can now explore the performance. A key feature of all GRAND algorithms is that they provide excellent decoding performance for all moderate redundancy codes, regardless of length or structure, and so can be used to identify the best code structures. Our first comparison is naturally with CA-Polar codes, which are the state-of-the-art codes that are both of moderate length and high rate, and which are designed for decoding with soft information by dedicated CA-SCL decoders. We consider CA-Polar[256, 234], which has 22 parity bits and uses the 11-bit CRC specified for 5G NR up-link control channels. The CA-SCL list size is set to 16, and we use the CA-SCL decoder from the AFF3CT toolbox [66] as our performance reference. Non-standard codes include BCH codes, which can be well designed for low to moderate redundancy but are not designed for decoding with soft information, and CRCs, which are designed for error detection rather than correction but are being considered for error correction using GRAND [46], [67]. CRCs offer desirably low complexity in encoding and code-book checking. Finally, we also consider RLCs, whose use with GRAND is also being explored [31], [34], [41], [45], [46].
Our comparison is in Fig. 8, where the 3-line version of ORBGRAND is used to decode. All ORBGRAND algorithms in our figures below have been set to abandon searching and record a block error if no code-book element is identified within $5 \times 10^6$ code-book queries. For a BLER of $10^{-4}$ or below, CA-SCL outperforms the basic variant of ORBGRAND, as the latter's model does not produce putative noise sequences in near-ML order at higher SNR. ORBGRAND with 1-line fitting provides an observable, but limited, improvement over the basic version because their only difference is that the 1-line version starts from the quantized value of $L_1$ instead of the origin. With the 2-line version, the curvature in the low reliability region is captured by the two fitted lines, essentially eliminating the performance loss and leaving little room for improvement by the 3-line version, whose performance in turn overlaps with that of the 4-line version.
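The abandonment rule used in these simulations can be illustrated with a minimal guessing loop. This is a sketch under stated assumptions rather than the simulated decoder: the noise patterns are generated in increasing Hamming weight as a simple stand-in for ORBGRAND's reliability-based ordering, the code is a toy [7, 4] Hamming code, and all function names are hypothetical. Only the query cap mirrors the text.

```python
from itertools import combinations


def syndrome_is_zero(H, word):
    """Code-book membership check: H * word = 0 over GF(2)."""
    return all(sum(h * w for h, w in zip(row, word)) % 2 == 0 for row in H)


def hamming_weight_patterns(n):
    """Yield tuples of flipped positions in increasing Hamming weight.
    Stand-in ordering; ORBGRAND would order patterns using reliabilities."""
    for weight in range(n + 1):
        yield from combinations(range(n), weight)


def grand_with_abandonment(y, H, patterns, max_queries):
    """Test putative noise patterns in order and stop at the first code-book
    hit. Returns (codeword, queries), or (None, queries) on abandonment,
    which a simulation would record as a block error."""
    queries = 0
    for flips in patterns:
        if queries >= max_queries:
            break  # abandon the search
        queries += 1
        candidate = list(y)
        for pos in flips:
            candidate[pos] ^= 1  # remove the guessed noise effect
        if syndrome_is_zero(H, candidate):
            return candidate, queries
    return None, queries


if __name__ == "__main__":
    # Parity-check matrix of a [7, 4] Hamming code (toy example).
    H = [[1, 0, 1, 0, 1, 0, 1],
         [0, 1, 1, 0, 0, 1, 1],
         [0, 0, 0, 1, 1, 1, 1]]
    y = [0, 0, 0, 0, 1, 0, 0]  # all-zero codeword with one bit flipped
    print(grand_with_abandonment(y, H, hamming_weight_patterns(7), 5 * 10**6))
    print(grand_with_abandonment(y, H, hamming_weight_patterns(7), 3))
```

The cap bounds the worst-case number of code-book queries per block; every abandoned block counts as an error, so the cap trades a small BLER penalty for bounded decoding effort.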
Similar observations can be made from the simulation results for CA-Polar[512, 490] and CA-Polar[1024, 1002] codes in Figs. 11, 12 and Figs. 13, 14, respectively. For these longer codes with the same number of parity bits, the loss of performance of the basic version occurs at a higher BLER, as shown in Fig. 13, where CA-SCL surpasses the basic version before a BLER of $10^{-3}$.
Up to now, the simulations have demonstrated the effectiveness of multi-line ORBGRAND in maintaining its performance advantage over state-of-the-art CA-SCL decoders. Next, we evaluate how performance is affected by complexity control methods, which can bring significant advantages for ORBGRAND in practical implementations.
An example complexity control measure is to have $J_{i-1}$ in Eq. (14) be an integer multiple of $\beta_i$. As discussed in Section III-D, the advantage is that Algorithm 4 is no longer needed, improving the efficiency of Algorithm 3. As shown in Fig. 10, Fig. 12 and Fig. 14, with $J_{i-1}$ constrained to be an integer multiple of $\beta_i$, there is negligible change in performance between decoders with corresponding segmentation, demonstrating the robustness of ORBGRAND.

V. DISCUSSION
With an abundance of new applications requiring low latency and high reliability for their operation, finding and decoding short, high-rate codes is attracting substantial attention. Old and new candidate codes along with their standard decoders have been explored and recognized to have imperfections in either the decoder or the code itself. We have introduced ORBGRAND, a practical soft detection variant of guessing random additive noise decoding, with which it is possible to decode any moderate redundancy code with near optimal performance.
ORBGRAND offers a range of design complexities, with its basic version being the simplest and requiring the least soft information. The core algorithm of the basic ORBGRAND generates integer partitions, for which we proposed the Landslide algorithm, which is suitable for efficient hardware and real-time implementation. That algorithm is an essential component of the full ORBGRAND, which has higher design complexity, but can better exploit soft information at higher SNRs for additional decoding gains. Simulation results show that the performance of ORBGRAND depends directly on how well the reliability curve is approximated, and the basic ORBGRAND adopts the simplest approximation. Inspired by this finding, we proposed the piece-wise linear approximation to the reliability curve, which optimizes ORBGRAND across all SNRs. The ORBGRAND algorithm, its curve fitting techniques, and its robustness to complexity reduction are established with simulations. The decoding performance depends on ORBGRAND's design complexity, but the 3-line version is capable of maintaining close-to-optimal performance in most scenarios. The proposed complexity control method is demonstrated to have little impact on performance, illustrating the robustness of ORBGRAND and anticipating the potential for further complexity reduction measures to facilitate VLSI implementation.