Identifying codewords in general Reed-Muller codes and determining their weights

Abstract: Determining the weight distribution of all Reed-Muller codes is a huge and exciting problem that has been around since the sixties. Some progress has been made very recently, but we are still far from a solution. In this paper, we addressed the subproblem of determining as many codeword weights as possible in Reed-Muller codes of any lengths and any orders, which is decisive for determining their weight spectra (i.e., the lists of all possible weights in these codes). New approaches seem necessary for both the main problem and the subproblem. We first studied the difficulties and the limits of the approach, which consisted of using the usual primary and secondary constructions of Boolean functions for the purpose of determining as many weights as possible in Reed-Muller codes. We then introduced a way, different from the usual constructions, to generate Boolean functions in $n$ variables having an algebraic degree bounded from above, without any restriction on $n$, and whose Hamming weights can be determined. This provided weights in Reed-Muller codes of any lengths $2^n$ and any orders, allowing us to reach potentially new values in the weight spectra of Reed-Muller codes (as we illustrate with all Reed-Muller codes of lengths up to $2^{21}$), with the related codewords being given with their supports and their algebraic normal forms being mathematically derived.


Introduction
For all nonnegative integers $r, n$ such that $r \le n$, the Reed-Muller code $RM(r, n)$ of length $N = 2^n$ and order $r$ equals the vector space over $\mathbb{F}_2$ of $n$-variable Boolean functions of algebraic degree at most $r$. Recall that each $n$-variable Boolean function $f : \mathbb{F}_2^n \to \mathbb{F}_2$ admits a representation in the form of a multivariate polynomial over $\mathbb{F}_2$ of a particular shape:
$$f(x) = \sum_{I \subseteq \{1, \dots, n\}} a_I \prod_{i \in I} x_i, \qquad a_I \in \mathbb{F}_2 \tag{1.1}$$
(the sum being calculated modulo 2). Such a representation is unique for each function and is called its algebraic normal form (ANF). The global degree $\max\{|I| ;\ a_I = 1\}$ of the ANF is called the algebraic degree of $f$. Since a binary block code needs to be a subset of $\mathbb{F}_2^N$ for some $N$, each Boolean function is identified with the list of its $N = 2^n$ values, some order on $\mathbb{F}_2^n$ being previously chosen. When we speak of codewords of Reed-Muller codes, we shall not distinguish between an $n$-variable Boolean function and the corresponding vector of length $N$.
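As a minimal illustration of the ANF and of the algebraic degree, the following Python sketch computes the coefficients $a_I$ of a Boolean function from its truth table by the binary Möbius transform. The encoding is an assumption made for this example only: a point $x$ is stored as an integer whose bit $i-1$ is $x_i$, an index set $I$ is stored as a bitmask, and the names `anf_coefficients` and `algebraic_degree` are illustrative.

```python
def anf_coefficients(truth_table):
    """Binary Moebius transform: truth table -> ANF coefficients a_I.

    Index x of the input is the point (bit i-1 of x is x_i); index I of the
    output is the bitmask of the index set of the monomial.
    """
    coeffs = list(truth_table)
    n = len(coeffs).bit_length() - 1  # the table has 2^n entries
    for i in range(n):
        step = 1 << i
        for x in range(len(coeffs)):
            if x & step:
                coeffs[x] ^= coeffs[x ^ step]
    return coeffs


def algebraic_degree(truth_table):
    """Largest |I| with a_I = 1; returns -1 for the zero function (a convention)."""
    coeffs = anf_coefficients(truth_table)
    return max((bin(mask).count("1") for mask, a in enumerate(coeffs) if a),
               default=-1)


# Example: f(x1, x2, x3) = x1*x2 + x3, a codeword of RM(2, 3)
n = 3
f = [((x & 1) & ((x >> 1) & 1)) ^ ((x >> 2) & 1) for x in range(1 << n)]
monomials = [mask for mask, a in enumerate(anf_coefficients(f)) if a]
print(monomials, algebraic_degree(f), sum(f))
```

Here `sum(f)` is the Hamming weight $w_H(f)$, since the truth table is exactly the codeword of length $2^n$.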
Reed-Muller codes were introduced in 1954 by David Muller in [27], and their decoding algorithm was given the same year by Irving Reed in [29]. These codes originally played an important role in the theory of error correcting codes, as well as in their applications. It is well known that the Reed-Muller code $RM(1,5)$ was used in the sixties for correcting the errors of transmission of the first photographs of Mars by Mariner. These photographs were in black and white. Every codeword corresponded to the level of brightness of a pixel. There were 64 different levels since there are 64 codewords in $RM(1,5)$, and the minimum distance of this code was equal to 16, with up to $\lfloor (16-1)/2 \rfloor = 7$ errors that could be corrected in the transmission of each codeword.
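The parameters quoted above for $RM(1,5)$ can be checked by brute force. The sketch below enumerates all affine functions in 5 variables; the helper name `affine_codeword` and the bit encoding of points are assumptions made for this illustration.

```python
from itertools import product

n = 5
points = range(1 << n)


def affine_codeword(c0, coeffs):
    """Truth table of x -> c0 + sum_i coeffs[i] * x_i (mod 2)."""
    return tuple(
        (c0 + sum(c * ((x >> i) & 1) for i, c in enumerate(coeffs))) % 2
        for x in points
    )


# RM(1, 5) is the set of all affine functions in 5 variables
codewords = {affine_codeword(c0, cs)
             for c0 in (0, 1) for cs in product((0, 1), repeat=n)}
weights = sorted({sum(cw) for cw in codewords})

# For a linear code, the minimum distance is the minimum nonzero weight
min_distance = min(w for w in weights if w > 0)
correctable = (min_distance - 1) // 2
print(len(codewords), weights, min_distance, correctable)
```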
Reed-Muller codes were also used in the 3rd generation (3G) of mobile phones (starting in 2000). Reed-Muller codes intervened in the initial "handshake" between the mobile device and the base station, whose role was to inform the receiver of what type of communication would come next. Again, $RM(1,5)$ was initially used for this purpose, and it was later replaced by a punctured subcode of the second-order Reed-Muller code $RM(2,5)$, which had a dimension of 10 and a minimum distance of 12.
The parameters of Reed-Muller codes are not so good, except for the first order, but they contain optimal codes such as the Kerdock codes [19]. They still play an important role nowadays, thanks to their specific properties (see, e.g., [2,13]) and their roles with respect to new problems, such as locally correctable codes [20], low degree testing, private information retrieval, and compressed sensing. The interest in Reed-Muller codes has also been renewed because of polarization (see, e.g., [24]). At various block-lengths and rates, Reed-Muller codes can be superior to polar codes [25], even for 5G [14]. A nice survey on Reed-Muller codes can be found in [1].
We can easily generate the ANF (1.1) of (infinite classes of) codewords in any Reed-Muller code, but in most cases, it is impossible to calculate (mathematically) their Hamming weight $w_H(f) = |\{x \in \mathbb{F}_2^n ;\ f(x) = 1\}|$. Determining Hamming weights (if possible, all weights of codewords, and, if possible, the whole weight distribution) in Reed-Muller codes has always been considered very important; see, e.g., the papers [4, 5, 7, 12, 15-18, 23, 26, 30], the data in [31], and the books [22,28]. The weight distributions of the Reed-Muller codes of length $2^n$ and orders $0, 1, 2, n-2, n-1, n$ are known. The weights in these codes equal $0$ and $2^n$ for the order 0, with additionally $2^{n-1}$ for the order 1, and additionally $2^{n-1} \pm 2^i$, where $\lceil n/2 \rceil - 1 \le i \le n-2$, for the order 2; see, e.g., [22]. The weights in $RM(n, n)$ are all integers between $0$ and $2^n$ since $RM(n, n) = \mathbb{F}_2^{2^n}$; the weights in $RM(n-1, n)$ are all even integers between $0$ and $2^n$; the weights in $RM(n-2, n)$ are all even integers between $0$ and $2^n$ except $2$ and $2^n - 2$. For all these codes, the weight distributions are known (thanks to the MacWilliams identity for the orders $n-2$ and $n-1$ [21,22], since the dual of $RM(r, n)$ equals $RM(n-r-1, n)$). The weight distributions of some Reed-Muller codes $RM(r, n)$ have been determined thanks to heavy computations, for $n$ small enough; they are reported in [31].
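For very small parameters, the weight spectra recalled above can be confirmed by exhaustive enumeration. The following sketch spans $RM(r, n)$ from the truth tables of the monomials of degree at most $r$; the function name `rm_weight_spectrum` and the encoding of a codeword as a Python integer (bit $x$ holds the value at point $x$) are choices made for this example, and the approach is only feasible when the dimension of the code is small.

```python
from itertools import combinations


def rm_weight_spectrum(r, n):
    """Set of all codeword weights of RM(r, n), by exhaustive enumeration."""
    size = 1 << n
    # Generator rows: truth tables of the monomials of degree <= r
    rows = []
    for d in range(r + 1):
        for subset in combinations(range(n), d):
            mask = sum(1 << i for i in subset)
            rows.append(sum(1 << x for x in range(size) if (x & mask) == mask))
    # Build the whole span by successive doubling
    span = [0]
    for row in rows:
        span += [cw ^ row for cw in span]
    return sorted({bin(cw).count("1") for cw in span})


n = 4
spectrum_order1 = rm_weight_spectrum(1, n)   # 0, 2^(n-1), 2^n
spectrum_order2 = rm_weight_spectrum(2, n)   # here also the order n - 2
spectrum_order3 = rm_weight_spectrum(3, n)   # the order n - 1
print(spectrum_order1, spectrum_order2, spectrum_order3)
```

For $n = 4$, the order 2 coincides with the order $n - 2$, so the second spectrum illustrates both descriptions at once: it consists of $0$, $8$, $16$ and $8 \pm 2^i$ for $i = 1, 2$, that is, all even integers between $0$ and $16$ except $2$ and $14$.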
The weights in $RM(n-3, n)$ have been recently determined in [12]. They are all the even integers in $\{0, 2, 4, \dots, 2^n\} \setminus \{2, 4, 6, 10, 2^n-10, 2^n-6, 2^n-4, 2^n-2\} = \{0, 8, 12+2i, 2^n-12, 2^n-8, 2^n\}$, where $i$ ranges over the consecutive integers from $0$ to $2^{n-1}-13$. They have been obtained by an induction. The MacWilliams identity does not allow us to determine the weight distribution, which is still unknown despite the fact that the weight distribution of the dual code $RM(2, n)$ is known, because the expression of the number of codewords of Hamming weight $2^{n-1}$ in $RM(2, n)$ is too complex. The induction does not allow us to determine the weight distribution either, and new ideas seem necessary for obtaining it. However, determining the weight spectrum ‡ of $RM(n-3, n)$ is already a step forward.
For general Reed-Muller codes, bounds are known on the weight enumerators, which are useful for studying the capacity of Reed-Muller codes on the binary erasure channel and the binary symmetric channel (see [1, Chapter 4]), but our knowledge on the weights themselves is limited.
McEliece's theorem [23] shows that the weights in $RM(r, n)$ are divisible by $2^{\lfloor (n-1)/r \rfloor}$, and Kasami-Tokura's result (that we shall recall in Section 2) and Kasami-Tokura-Azumi's results [17] give the weights of $RM(r, n)$ that lie between the minimum distance $d = 2^{n-r}$ and 2.5 times $d$. It is conjectured in [12] that for every constant $c$ and for $n$ large enough, the weight spectrum of $RM(n-c, n)$ is made of $0$ and $2^n$ and all the weights between the minimum distance $2^c$ and its complement to the length $2^n$ that are authorized by McEliece's theorem and Kasami-Tokura-Azumi's results. This means, in particular, that every even number between 2.5 times the minimum distance and its complement to $2^n$ would be a weight in $RM(n-c, n)$. This conjecture § is verified by the weight spectra of $RM(n-5, n)$, $RM(n-4, n)$ and $RM(n-3, n)$. The method used in [9,12] for handling these three weight spectra is the same: a corollary in [30], which can easily be proved directly, says that the weight spectrum of $RM(r, n)$ includes $A + A$, where $A$ is the weight spectrum of $RM(r-1, n-1)$. This allows us to address the weight spectrum of $RM(n-c, n)$ by an induction on $n$, starting from a value $n_0$ such that the weight spectrum of $RM(n_0-c, n_0)$ is already generic, which means that, according to McEliece's theorem, its weights are divisible by 2 and not by a larger power of 2. This means that we need to start from $n_0 \ge 2c$. Indeed, according to McEliece's theorem, all the weights in $RM(c-1, 2c-1)$ are divisible by 4, while those in $RM(c, 2c)$ are divisible by 2.
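The two ingredients of this induction are easy to tabulate. The sketch below is illustrative only (the names `mceliece_exponent` and `sumset` are not from the paper): it evaluates the McEliece divisibility exponent $\lfloor (n-1)/r \rfloor$, checks the two divisibility claims that justify starting at $n_0 \ge 2c$, and forms $A + A$ for the small case where $A$ is the weight spectrum of $RM(1, 3)$.

```python
def mceliece_exponent(r, n):
    """Exponent e such that all weights of RM(r, n) are divisible by 2^e."""
    return (n - 1) // r  # floor((n-1)/r), equal to ceil(n/r) - 1 for r >= 1


def sumset(spectrum):
    """The set A + A of all sums of two elements of A."""
    return sorted({a + b for a in spectrum for b in spectrum})


# Divisibility by 4 in RM(c-1, 2c-1) versus divisibility by 2 in RM(c, 2c)
exponents = [(mceliece_exponent(c - 1, 2 * c - 1), mceliece_exponent(c, 2 * c))
             for c in range(3, 9)]

# A = weight spectrum of RM(1, 3); A + A must be contained in that of RM(2, 4)
A = [0, 4, 8]
A_plus_A = sumset(A)
spectrum_rm_2_4 = [0, 4, 6, 8, 10, 12, 16]  # even integers except 2 and 14
print(exponents, A_plus_A)
```

In this small case, $A + A$ misses the weights 6 and 10 of $RM(2, 4)$, which illustrates why the induction has to start from a spectrum that is already generic.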
We know from [6] that McEliece's divisibility bound is tight in the sense that there is at least one codeword in every $RM(r, n)$ code with a weight congruent to $2^{\lfloor (n-1)/r \rfloor}$ modulo $2^{\lfloor (n-1)/r \rfloor + 1}$. We can try to see whether the weights obtained from $A + A$, where $A$ is the weight spectrum of $RM(c, 2c)$, allow us to reach all the weights authorized by McEliece's theorem and Kasami-Tokura-Azumi's result. The first difficulty is then to reach all weights in $RM(c, 2c)$. In the case of $c = 3, 4$, this has been rather easy, but proving the conjecture recalled above for $c = 5$ with this method, which needs to start the induction with $n = 10$ (a value much larger than what can be reached with the heavy computations made by M. Terada, J. Asatani, and T. Koumoto and reported in [31]), has led to the construction of functions in 10 variables with an algebraic degree of at most 5 and having all possible even weights between 2.5 times the minimum distance 32, that is, 80, and $2^{10} - 80$. The next step $c = 6$ needs to address the code $RM(6, 12)$, which has huge parameters $[4096, 2510]$ (while the largest reached currently are $[512, 256]$ and $[512, 382]$). It is shown in [9] how determining the weight spectrum of $RM(6, 12)$ requires determining whether some specific values (such as 166), which are "holes" after general methods were applied, are the weights of codewords. This may not be as hard as expected for $c = 6$, but addressing larger values of $c$ will probably lead to more of such "holes". Hence, being able to build as many weights as possible in Reed-Muller codes is of great importance, in particular reaching weights that are not obtained by classic constructions.
‡ In coding theory, contrary to Boolean function theory, the spectrum does not include the multiplicities of the values (when these multiplicities are taken into account, we speak of weight distribution).
§ It seems a little risky to present this as a conjecture, and in [9] it is then presented as an open question.
Providing weights can indeed be tried by investigating the known (primary and secondary) constructions of Boolean functions and deducing functions whose weight can be determined, as was done in [9]. Some weights are easily reached this way, but we can expect that these constructions will not suffice for addressing the weights in $RM(n-c, n)$ for larger values of $c$.
Note that the codes $RM(n-c, n)$ considered above, being such that $n \ge 2c$, are of the form $RM(r, n)$ with $r \ge n/2$. Another case where more weights in Reed-Muller codes $RM(r, n)$ are useful information is when $r < n/2$. Recall that when Boolean functions in $n$ variables are given, for instance, by their ANF, with $n$ ranging over $\mathbb{N}$, it is rarely possible to mathematically evaluate their Hamming weights. Of course, it is always possible when the function is affine (belonging then to the Reed-Muller code of order 1), but this provides only three weights for each $n$. When the function is quadratic (i.e., belongs to the second-order Reed-Muller code), there are methods for determining its weight (see a survey in [8, Chapter 4]). However, these methods allow us to concretely address only a few cases (even the first step, which consists of determining the linear kernel of the function, is impossible to complete systematically). The weights of quadratic functions are very specific. The indicators of affine spaces (flats) are also addressable, but their weights are minimal in the Reed-Muller codes to which they belong. Studying the weights of Boolean functions obtained by the constructions evoked above, which we shall describe in Section 2, requires specific work, as nothing automatic exists.
The problem we want to address in this paper is not as hard as determining the weight of any given Boolean function: We only want to find as many weights as possible in general Reed-Muller codes.However, it is not so easy to provide codewords of Reed-Muller codes whose weights can be determined.
For finding more weights, methods complementary to the usual constructions are needed. In the present paper, we give such a method to automatically generate codewords in Reed-Muller codes of any lengths $2^n$. These codewords depend on the number of variables $n$, the order $r$, a parameter $t$, and the choice of $t$ vectors $a_i$. We have, thanks to a property of the corresponding functions, an upper bound on their algebraic degree (but determining the degree exactly would be difficult, and even trying to directly show this upper bound by working on the ANF of the functions seems quite hard). The weights of these functions can be evaluated or at least bounded from above, because when these Boolean functions are given as the sums (modulo 2) of atomic ones, the only limitation for evaluating their weights is to determine the number of these atomic functions that appear an odd number of times in the expression. There is a case (when the vectors $a_i$ involved in the construction are linearly independent) where we can ensure that all these atomic functions are distinct, which allows us to exactly calculate the Hamming weight. This provides information on the weight spectra of Reed-Muller codes when they are unknown (that is, currently, for the orders from 3 to $n-6$). For instance, we shall see in the tables provided that our method gives weights in $RM(r, n)$ that are much larger than twice the minimum distance and have low valuation.
The case mentioned above, where the vectors a i are linearly independent, provides at most n 2 distinct weights for each Reed-Muller code, and this is not much.We then investigate two cases where the vectors are linearly dependent.We do not cover all the cases where the vectors are linearly dependent (it seems impossible to do so), but other cases could be similarly investigated.
We also study the weights of the sums of the designed functions, in a case where we know they have disjoint supports.This provides many more weights.
The paper is structured as follows. In Section 2, we recall the state of the art in the determination of weights in Reed-Muller codes by using the classic constructions (Maiorana-McFarland, etc.). We show the difficulties presented by this method and why it is better suited to low orders. In Section 3, we introduce our new construction of Reed-Muller codewords and we study some particular cases. We determine the weights under a condition that is rather general (namely, that the vectors $a_i$ involved in the construction are linearly independent), and we also study two cases where this condition is not satisfied; this provides a list of weights for each Reed-Muller code, which is longer for larger orders. We then show that more weights (a huge number when the order is large enough) can be obtained as the sums of some of these weights. To conclude this section, we determine the ANF of the constructed functions when the vectors $a_i$ are linearly independent. We conclude with some observations on future work.

State of the art on the Hamming weights of Reed-Muller codewords
It is well-known that the minimum nonzero Hamming weight of $RM(r, n)$ equals $2^{n-r}$ (see [22, Chapter 13], and see [8, Chapter 4] for a more direct proof), and that the minimum weight codewords in this code are the indicators of the $(n-r)$-dimensional affine subspaces of $\mathbb{F}_2^n$. All the low Hamming weights are known in all Reed-Muller codes, and there are very few: Berlekamp and Sloane [4] (see the Addendum in that paper) and Kasami and Tokura [16] have shown that, for $r \ge 2$, the only Hamming weights in $RM(r, n)$ occurring in the range $[2^{n-r}, 2^{n-r+1})$ are of the form $2^{n-r+1} - 2^{n-r+1-i}$, where $i \le \max\left(\min(n-r, r), \frac{n-r+2}{2}\right)$. The latter have completely characterized the codewords: the corresponding functions are affinely equivalent to one of two explicit types of functions (see [16]). The functions whose Hamming weights are strictly less than 2.5 times the minimum distance $2^{n-r}$ have later been studied in [17].
Recall that, on the contrary, the general weights in $RM(r, n)$ can be rather diverse, as soon as $r \ge 3$ and $n$ is large enough. Indeed, as shown in [7], for every Boolean function $f$ on $\mathbb{F}_2^n$, there exist an integer $m$ and a Boolean function $g$ of algebraic degree at most 3 on $\mathbb{F}_2^{n+2m}$ such that $2^{n+2m} - 2w_H(g) = 2^m (2^n - 2w_H(f))$. Hence, the Hamming weight of $f$ is related in a simple way to the Hamming weight of a cubic function (in a number of variables that can be exponentially larger). This shows that the weights in $RM(3, n)$ (that is, the distances) can be complex, contrary to those in $RM(2, n)$. Unfortunately, this result does not provide an efficient method for finding weights in third-order Reed-Muller codes: trying to find new weights in these codes by starting with Boolean functions $f$ of any degree in fewer variables and applying the result does not work well, because $m$ in this result is exponentially large compared to $n$.
The possible weights of the codewords in the Reed-Muller codes of orders $3, \dots, n-6$ whose values lie between $2.5\,d$ and $2^n - 2.5\,d$ are unknown ¶, except for some functions that we shall describe, and which hardly provide non-peculiar weights for general Reed-Muller codes:
• Quadratic functions given in the form $f = l_1 l_2 + \dots + l_{2k-1} l_{2k} + l$, where $l$ is a linear function, possibly added with the constant 1 (that is, complemented), when we are able to ensure that the linear functions $l_1, \dots, l_{2k}$ are linearly independent. Then $f$ equals the function $x_1 x_2 + \dots + x_{2k-1} x_{2k}$ composed on the right by a linear or an affine automorphism (we say that such a function is linearly, respectively affinely, equivalent to $x_1 x_2 + \dots + x_{2k-1} x_{2k}$), added with an affine function (we say then that the function is extended affine equivalent to $x_1 x_2 + \dots + x_{2k-1} x_{2k}$), and we can evaluate its Hamming weight. This provides weights $0$, $2^{n-1}$, $2^{n-1} \pm 2^i$, where $\lceil n/2 \rceil - 1 \le i \le n-1$, which are all the weights in $RM(2, n)$ (all being easy to produce), but are rather peculiar in the larger Reed-Muller codes. We can also calculate the weights of the concatenations of such functions, of course, whose weights are a little more general (but the algebraic degree then needs to be determined).
• Indicators of flats (and their concatenations as well), that is, minimum nonzero weight codewords in Reed-Muller codes (see [22, Chapter 13]), in the form $\prod_{i \in I} (a_i \cdot x + \epsilon_i)$, where $a_i \in \mathbb{F}_2^n$, $\epsilon_i \in \mathbb{F}_2$, when we are able to ensure that the vectors $a_i$ are linearly independent. This provides weights $2^i$, where $i = 0, \dots, n$, which are also easy to produce but are peculiar, too. Note that this class of functions is (as the previous one) preserved by affine equivalence.
• Functions whose weight is smaller than 2.5 times the minimum distance $d$ of the Reed-Muller codes to which they belong. We have recalled above what these weights are when they are smaller than $2d$; between $2d$ and $2.5d$, the weights (determined in [17]) are too numerous to be recalled here. They are easy to produce, but we encounter the same difficulty as for quadratic functions if we want to exhibit all functions with such weights: we know that they are affine equivalent to some particular functions, but ensuring such affine equivalence is not mathematically possible in an exhaustive way. Anyway, this strong result by Kasami, Tokura, and Azumi allows us to reach in Reed-Muller codes all weights smaller than 2.5 times the minimum distance (and their complements to $2^n$). The question is then to find as many other weights as possible.
• Some functions obtained by using the classic primary constructions of Boolean functions, in particular, Maiorana-McFarland, Niho, and $PS_{ap}$-like constructions; see [8, Chapter 4]. This allows us to reach some weights, but numerous subclasses of functions have to be separately investigated to cover enough weights. Finding the weights that are reachable often poses technical issues, to be overcome for each subclass, such as solving equations, which can be done in some cases but not in general. To give an example, the weights of those particular Maiorana-McFarland functions of the form $f(x, y) = x \cdot \phi(y)$, where $x \in \mathbb{F}_2^t$, $y \in \mathbb{F}_2^{n-t}$, $\phi : \mathbb{F}_2^{n-t} \to \mathbb{F}_2^t$, and "$\cdot$" is an inner product, are deduced from the relation $\sum_{x \in \mathbb{F}_2^t,\, y \in \mathbb{F}_2^{n-t}} (-1)^{f(x,y)} = 2^t\, |\phi^{-1}(0)|$, which theoretically makes the study of the weights of these particular functions simpler. However, this replaces the difficulty of determining the weights of the functions having algebraic degrees of at most $r$ by that of determining the possible values of the size $|\phi^{-1}(0)|$ when $\phi$ has an algebraic degree of at most $r-1$, that is, when all its coordinate functions have algebraic degrees of at most $r-1$. This latter problem, which is interesting to study for its own sake, may be hard since it amounts to determining the possible numbers of solutions of nonlinear systems of equations. Denoting the coordinate functions of $\phi$ by $\phi_1, \dots, \phi_t$, the solutions of the equation $\phi(y) = 0$ are the elements of the support of the Boolean function $\prod_{i=1}^t (\phi_i(y) + 1)$, which has an algebraic degree of at most $t(r-1)$. In the case $t = 1$, we only get that the weights in $RM(r-1, n-1)$ are also weights in $RM(r, n)$ (which is clear since, denoting $x_i = y_{i-1}$ for $i = 2, \dots, n$, the $n$-variable function $x_1\, g(x_2, \dots, x_n)$ has the same Hamming weight as the $(n-1)$-variable function $g$), and as soon as $t \ge 2$, the situation becomes complex. For instance, for $r = 3$ and $t = 2$, we arrive in general at the determination of the support of a function of degree 4, which increases the degree instead of reducing it. Moreover, the weights that are easier to obtain correspond to a large value of $t$ and are then not quite general, since they have a valuation of at least $t-1$. The same kind of situation happens with the general Maiorana-McFarland, Niho, and $PS_{ap}$-like constructions. Hence, even if it is possible to try using these classic constructions to reach weights in Reed-Muller codes, it is necessary, for reaching many weights, to have other approaches posing fewer problems; this is the purpose of the present paper.
¶ But when $n = 2r + 1$, they are known in some cases by using invariant theory, because the code is then self-dual; see [22,28].
• Direct sums of monomials and threshold functions (see a complete study of the cryptographic parameters of these functions in [10]). These are two cases where we can give the Hamming weights. Since the character sum $\sum_{x \in \mathbb{F}_2^t,\, y \in \mathbb{F}_2^{n-t}} (-1)^{f(x,y)}$ of a direct sum $f(x, y) = f_1(x) + f_2(y)$ of functions $f_1, f_2$ is the product of the character sums $\sum_{x \in \mathbb{F}_2^t} (-1)^{f_1(x)}$ and $\sum_{y \in \mathbb{F}_2^{n-t}} (-1)^{f_2(y)}$ of these functions, the Hamming weight of the direct sum $\prod_{i \in I_1} x_i + \dots + \prod_{i \in I_k} x_i$ of monomials (where the index sets $I_1, \dots, I_k$ are disjoint and nonempty subsets of $\{1, \dots, n\}$) equals $2^{n-1} - 2^{\,n-1-\sum_{j=1}^k |I_j|} \prod_{j=1}^k \left(2^{|I_j|} - 2\right)$. The Hamming weight of the function whose support equals the set of all vectors of Hamming weight at least $k$ equals $\sum_{i=k}^n \binom{n}{i}$. We find in both cases rather peculiar weights and, in the latter case, the algebraic degree needs to be determined.
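Several of the weights listed above can be cross-checked numerically. The sketch below is a brute-force verification under stated assumptions: the Maiorana-McFarland functions are taken in the form $f(x, y) = x \cdot \phi(y)$ with $x \in \mathbb{F}_2^t$ and $y \in \mathbb{F}_2^m$, whose character sum $2^t |\phi^{-1}(0)|$ gives the weight $2^{t+m-1} - 2^{t-1} |\phi^{-1}(0)|$; the weight of a direct sum of monomials is derived from the product of the character sums $2^{|I_j|} - 2$ of the monomials; the map `phi` is random and purely hypothetical, and all function names are illustrative.

```python
import random
from math import comb, prod


def mm_weight(t, m, phi):
    """Brute-force weight of f(x, y) = x . phi(y), x in F_2^t, y in F_2^m."""
    return sum(bin(x & phi[y]).count("1") % 2
               for y in range(1 << m) for x in range(1 << t))


def monomial_sum_weight(n, index_sets):
    """Brute-force weight of a direct sum of monomials in n variables."""
    masks = [sum(1 << i for i in index_set) for index_set in index_sets]
    return sum(sum((x & mk) == mk for mk in masks) % 2 for x in range(1 << n))


def monomial_sum_weight_formula(n, sizes):
    """Weight deduced from the product of the character sums 2^|I_j| - 2."""
    char_sum = 2 ** (n - sum(sizes)) * prod(2 ** k - 2 for k in sizes)
    return (2 ** n - char_sum) // 2


def threshold_weight(n, k):
    """Number of vectors of F_2^n of Hamming weight at least k."""
    return sum(bin(x).count("1") >= k for x in range(1 << n))


# Maiorana-McFarland check with a random (hypothetical) map phi
rng = random.Random(1)
t, m = 3, 4
phi = [rng.randrange(1 << t) for _ in range(1 << m)]
mm_expected = 2 ** (t + m - 1) - 2 ** (t - 1) * phi.count(0)

# x1 x2 x3 + x4 x5 viewed as a function of 6 variables
brute = monomial_sum_weight(6, [(0, 1, 2), (3, 4)])
closed = monomial_sum_weight_formula(6, [3, 2])
print(mm_weight(t, m, phi), mm_expected, brute, closed, threshold_weight(5, 3))
```

The random `phi` has no degree constraint, so this only checks the weight relation itself, not membership in a given Reed-Muller code.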
There also exist secondary constructions of Boolean functions:
• The direct sum, already recalled above in the particular context of monomials, consists of adding functions whose sets of variables are disjoint. It gives weights that are a little peculiar: we have recalled above that if $f$ is the direct sum of a $t$-variable function $f_1$ and an $(n-t)$-variable function $f_2$, then the character sum of $f$ equals the product of the character sums of $f_1$ and $f_2$, and this implies: $2^n - 2w_H(f) = \left(2^t - 2w_H(f_1)\right)\left(2^{n-t} - 2w_H(f_2)\right)$. This construction is interesting because it does not need particular precautions about the algebraic degree of $f$, which equals the maximum of the algebraic degrees of $f_1$ and $f_2$. Hence, for every weight $w_1$ in $RM(r, t)$ and every weight $w_2$ in $RM(r, n-t)$, the number $w$ such that $2^n - 2w = (2^t - 2w_1)(2^{n-t} - 2w_2)$ is a weight in $RM(r, n)$. When $t \le r$, the function $f_1$ can be any element of $RM(t, t)$ (and can then have as weight any integer between $0$ and $2^t$). With this construction, there is a systematic way of building weights in $RM(r, n)$ from weights in $RM(r, t)$ and $RM(r, n-t)$.
• The indirect sum (see [8, Sections 6.1.16 and 7.1.9]) also deals with functions whose sets of variables are disjoint, but in a more complex way: we have two functions $f_1, f_2$ on the same set of $t$ variables, two functions $g_1$ and $g_2$ on the same set of $n-t$ variables, disjoint from the previous one, and $f(x, y) = f_1(x) + g_1(y) + (f_1 + f_2)(x)\,(g_1 + g_2)(y)$. We have $\sum_{x,y} (-1)^{f(x,y)} = \frac{1}{2} \sum_x (-1)^{f_1(x)} \left[\sum_y (-1)^{g_1(y)} + \sum_y (-1)^{g_2(y)}\right] + \frac{1}{2} \sum_x (-1)^{f_2(x)} \left[\sum_y (-1)^{g_1(y)} - \sum_y (-1)^{g_2(y)}\right]$ and, therefore: $w_H(f) = 2^{n-1} - \left(2^{t-1} - w_H(f_1)\right)\left(2^{n-t} - w_H(g_1) - w_H(g_2)\right) - \left(2^{t-1} - w_H(f_2)\right)\left(w_H(g_2) - w_H(g_1)\right)$. The algebraic degree of $f$ is not automatically bounded by $r$ from above, unless we take the initial functions $f_1, f_2$ in $RM(s, t)$ with $s \le r$ and the initial functions $g_1, g_2$ in $RM(r-s, n-t)$, but this does not provide interesting weights. If we take $f_1, f_2$ in $RM(r, t)$ and $g_1, g_2$ in $RM(r, n-t)$, this construction provides weights that are possibly less peculiar than with the direct sum, but in a much less systematic way, because we need to take care of the algebraic degree.
• The sum without extension of the number of variables (see [8, Sections 6.1.16 and 7.1.9]) takes three $n$-variable Boolean functions $f_1, f_2, f_3$ and defines the Boolean function $f = f_1 f_2 + f_1 f_3 + f_2 f_3$. We have: $\sum_x (-1)^{f(x)} = \frac{1}{2} \left[\sum_x (-1)^{f_1(x)} + \sum_x (-1)^{f_2(x)} + \sum_x (-1)^{f_3(x)} - \sum_x (-1)^{(f_1 + f_2 + f_3)(x)}\right]$. This secondary construction has been introduced because of the nice behavior of its Walsh transform, but it has the same drawback as the indirect sum about the algebraic degree of $f$.
• The so-called $(u|u+v)$-construction (see [22]) allows us to construct all of $RM(r, n)$ from $RM(r-1, n-1)$ and $RM(r, n-1)$. It corresponds to the fact that an $n$-variable Boolean function $f(x_1, \dots, x_n)$ can be written in the form $f_0(x_1, \dots, x_{n-1}) + x_n f_1(x_1, \dots, x_{n-1})$ and has an algebraic degree of at most $r$ if and only if $f_0$ has an algebraic degree of at most $r$ and $f_1$ has an algebraic degree of at most $r-1$. The corresponding codeword is the concatenation of the codewords in $RM(r, n-1)$ associated to $f_0$ and $f_0 + f_1$, and its Hamming weight is the sum of the Hamming weights of these two functions. The pairs $(f_0, f_0 + f_1)$, when $f_0$ ranges over $RM(r, n-1)$ and $f_1$ ranges over $RM(r-1, n-1)$, do not provide all possible pairs of codewords in $RM(r, n-1)$ because of the restriction that $f_1$ has an algebraic degree of at most $r-1$, but if we impose that $f_0$ itself ranges over $RM(r-1, n-1)$, then the weights of the resulting codewords of $RM(r, n)$ range over the sums of two weights in $RM(r-1, n-1)$. This leads to a result given in [30] and used in [12]: for all pairs of integers $(r, n)$ with $0 \le r \le n$, the weight spectrum of $RM(r, n)$ includes as a subset $S + S$, where $S$ is the weight spectrum of $RM(r-1, n-1)$. This result has allowed us to obtain the weight spectra of infinite classes of Reed-Muller codes, but only for orders that are very close to $n$.
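The character-sum relations behind these secondary constructions can be tested on random functions. The sketch below assumes the usual definitions (direct sum $f_1(x) + f_2(y)$, indirect sum $f_1(x) + g_1(y) + (f_1 + f_2)(x)(g_1 + g_2)(y)$, sum without extension $f_1 f_2 + f_1 f_3 + f_2 f_3$, and the $(u|u+v)$ concatenation); the helper names and the random inputs are illustrative, and no constraint is put on the algebraic degrees.

```python
import random


def char_sum(table):
    """Character sum of a truth table, equal to 2^n - 2 w_H(f)."""
    return sum(1 - 2 * v for v in table)


def direct_sum(f1, f2):
    """f(x, y) = f1(x) + f2(y); x is stored in the low-order bits."""
    return [a ^ b for b in f2 for a in f1]


def indirect_sum(f1, f2, g1, g2):
    """f(x, y) = f1(x) + g1(y) + (f1 + f2)(x) (g1 + g2)(y)."""
    return [f1[x] ^ g1[y] ^ ((f1[x] ^ f2[x]) & (g1[y] ^ g2[y]))
            for y in range(len(g1)) for x in range(len(f1))]


def sum_without_extension(f1, f2, f3):
    """f = f1 f2 + f1 f3 + f2 f3 (the majority of the three values)."""
    return [(a & b) ^ (a & c) ^ (b & c) for a, b, c in zip(f1, f2, f3)]


def u_u_plus_v(f0, f1):
    """Truth table of f0 + x_n f1: concatenation of f0 and f0 + f1."""
    return f0 + [a ^ b for a, b in zip(f0, f1)]


rng = random.Random(2)


def random_function(k):
    return [rng.randrange(2) for _ in range(1 << k)]


f1, f2 = random_function(3), random_function(3)
g1, g2 = random_function(2), random_function(2)
h1, h2, h3 = random_function(4), random_function(4), random_function(4)
h123 = [a ^ b ^ c for a, b, c in zip(h1, h2, h3)]

s_direct = char_sum(direct_sum(f1, g1))
s_indirect = char_sum(indirect_sum(f1, f2, g1, g2))
s_majority = char_sum(sum_without_extension(h1, h2, h3))
w_concat = sum(u_u_plus_v(h1, h2))
```

Each relation is an identity valid for arbitrary Boolean functions, so the checks do not depend on the particular random seed.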
A completely different way of evaluating weights in Reed-Muller codes relies on the fact that every Boolean function $f$ of algebraic degree at most $r$ satisfies $\sum_{x \in \mathbb{F}_2^n} (-1)^{f(x)} = 2^n - 2w_H(f)$: if we can obtain the absolute value of this character sum, then, since every Reed-Muller code is invariant under the complementation of its codewords, this provides two weights if $\sum_{x \in \mathbb{F}_2^n} (-1)^{f(x)} \ne 0$. However, this method, which is clearly more efficient for low orders $r$, is better suited for determining some specific weights than for systematically finding new weights in infinite classes of Reed-Muller codes.
It is then useful to find a new way, as systematic as possible, for providing weights (hopefully previously unknown) and codewords having such weights.

A new construction of Boolean functions with an algebraic degree bounded from above
In this section, we present our construction. It comes from a formula that is satisfied by all Boolean functions of an algebraic degree bounded from above by some number $s$ (and therefore by all vectorial functions $F : \mathbb{F}_2^n \to \mathbb{F}_2^m$ of an algebraic degree of at most $s$). This formula was originally found and used (in [11]) in the framework of countermeasures against side channel attacks, a domain of applied cryptography. It also corresponds to what we call zero-sum sets, a notion used in the cryptanalysis of block ciphers. It could seem rather unrelated to coding theory in general and to the determination of weights in Reed-Muller codes in particular, but it is not, as we shall see. This formula depends on parameters (that are elements of $\mathbb{F}_2^n$) and will lead to numerous Boolean functions $f$ of algebraic degree bounded from above and, since the Hamming weights of these functions can be determined, to numerous weights in Reed-Muller codes.

Degree-s zero-sum sets as Reed-Muller codewords
A set $S \subseteq \mathbb{F}_2^n$ is called degree-$s$ zero-sum if we have $\sum_{x \in S} f(x) = 0$ for every $n$-variable Boolean function $f$ of an algebraic degree of at most $s$ (and then $\sum_{x \in S} F(x) = 0$ for every vectorial function $F$ in $n$ variables of an algebraic degree of at most $s$).
The degree-$s$ zero-sum sets are then the supports of the codewords in the dual code of $RM(s, n)$. The dual of $RM(s, n)$ equals $RM(r, n)$, where $r = n - s - 1$ [22], and the degree-$s$ zero-sum sets are then the supports of the $n$-variable Boolean functions of an algebraic degree of at most $r$, that is, of the codewords of $RM(r, n)$. Hence, determining the possible sizes of degree-$s$ zero-sum sets is directly related to determining the weights in Reed-Muller codes.
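This duality can be observed on a toy case. The sketch below, with $n = 4$ and $s = 1$ (so that $r = n - s - 1 = 2$), checks that the support of the quadratic function $x_1 x_2 + x_3 x_4$ is a degree-1 zero-sum set, while the support of the cubic monomial $x_1 x_2 x_3$ is not. The function name `is_degree1_zero_sum` and the choice of examples are illustrative.

```python
from itertools import product

n = 4


def is_degree1_zero_sum(support):
    """True if every affine function sums to 0 (mod 2) over the given set."""
    for c0 in (0, 1):
        for coeffs in product((0, 1), repeat=n):
            total = sum(
                (c0 + sum(c * ((x >> i) & 1) for i, c in enumerate(coeffs))) % 2
                for x in support
            )
            if total % 2:
                return False
    return True


# Support of x1 x2 + x3 x4, a codeword of RM(2, 4)
support_quadratic = [
    x for x in range(1 << n)
    if ((x & 1) & ((x >> 1) & 1)) ^ (((x >> 2) & 1) & ((x >> 3) & 1))
]
# Support of x1 x2 x3, which has algebraic degree 3 > r
support_cubic = [x for x in range(1 << n) if (x & 0b0111) == 0b0111]

print(len(support_quadratic), is_degree1_zero_sum(support_quadratic))
print(len(support_cubic), is_degree1_zero_sum(support_cubic))
```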

A construction of Boolean functions with bounded algebraic degree
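We know from [11] that, for all integers $s < t$, every vectorial function $F$ in $n$ variables of an algebraic degree of at most $s$ and all elements $a_1, \dots, a_t$ of $\mathbb{F}_2^n$ satisfy
$$F\Big(\sum_{i=1}^t a_i\Big) = \sum_{J \subseteq \{1, \dots, t\};\ |J| \le s} \mu_{t,s}(|J|)\; F\Big(\sum_{i \in J} a_i\Big), \tag{3.1}$$
where $\mu_{t,s}(j) = \binom{t-j-1}{s-j} \bmod 2$ for every $j \le s$, with the conventions $\binom{l}{0} = 1$ for every $l$ and $\sum_{i \in \emptyset} a_i = 0$. According to (3.1), the set of all the elements $a$ of $\mathbb{F}_2^n$ that appear an odd number of times as $a = \sum_{i=1}^t a_i$, or $a = \sum_{i \in J} a_i$ where $J$ has size at most $s$ and $\mu_{t,s}(|J|) = 1$, is a degree-$s$ zero-sum set. We then have the following result, in which, for every $a \in \mathbb{F}_2^n$, we denote by $\delta_a$ the Boolean function over $\mathbb{F}_2^n$ that takes value 1 at $a$ and 0 everywhere else (such a function can be called an atomic, or Dirac, or Kronecker function):
Theorem 1. Let $n, s \ge 0$ and $t \ge 1$ be integers such that $s < t$ and $s < n$. Given any elements $a_1, \dots, a_t$ of $\mathbb{F}_2^n$, the Boolean function
$$f^{(s)}_{a_1, \dots, a_t} = \delta_{\sum_{i=1}^t a_i} + \sum_{j=0}^{s} \mu_{t,s}(j) \sum_{J \subseteq \{1, \dots, t\};\ |J| = j} \delta_{\sum_{i \in J} a_i} \tag{3.2}$$
(where $\mu_{t,s}(j) = \binom{t-j-1}{s-j} \bmod 2 = \binom{t-j-1}{t-s-1} \bmod 2$) has an algebraic degree of at most $r = n - s - 1$.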
(1) $f^{(s)}_{a_1, \dots, a_t}$ is in general not a symmetric function (that is, its value changes when we permute its input bits) despite the fact that its expression (3.2) is symmetric with respect to $a_1, \dots, a_t$ (i.e., its value does not change when we permute the $a_i$'s).
(2) For all positive integers $n, s, t$ such that $s < n$ and $s < t$, and all $a_1, \dots, a_t$ in $\mathbb{F}_2^n$, all the functions $f^{(s)}_{a_1, \dots, a_t}, f^{(s+1)}_{a_1, \dots, a_t}, \dots, f^{(t-1)}_{a_1, \dots, a_t}$ have algebraic degrees of at most $r$.
(3) Suppose that for some $n, s, t$, the function $f^{(s)}_{a_1, \dots, a_t}$ has an algebraic degree $r' < n - s - 1$. Then it is orthogonal to every codeword of the Reed-Muller code $RM(n - r' - 1, n)$ with $n - r' - 1 > s$, and it is, therefore, orthogonal to the Reed-Muller code $RM(s+1, n)$, whose elements then satisfy Relation (3.1). Most often, there seem to exist codewords of $RM(s+1, n)$ that do not satisfy Relation (3.1). We deduce that, most often, $f^{(s)}_{a_1, \dots, a_t}$ has in fact an algebraic degree of $r = n - s - 1$ exactly. Examples 1 and 2 will illustrate this, but there are also examples where $f^{(s)}_{a_1, \dots, a_t}$ has a strictly smaller algebraic degree; see, for instance, Proposition 1.
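The construction of Theorem 1 is straightforward to implement, which gives an experimental check of the degree bound. The sketch below builds the truth table of $f^{(s)}_{a_1, \dots, a_t} = \delta_{\sum_i a_i} + \sum_{j \le s} \mu_{t,s}(j) \sum_{|J| = j} \delta_{\sum_{i \in J} a_i}$ with $\mu_{t,s}(j) = \binom{t-j-1}{s-j} \bmod 2$, and computes the algebraic degree by a Möbius transform. The names `mu`, `construct_f` and `algebraic_degree`, the encoding of vectors as integers, and the random choices of the $a_i$ are assumptions of this illustration, which is a numerical experiment and not a proof.

```python
import random
from itertools import combinations
from math import comb


def mu(t, s, j):
    """Coefficient mu_{t,s}(j) = binom(t-j-1, s-j) mod 2."""
    return comb(t - j - 1, s - j) % 2


def construct_f(n, s, a):
    """Truth table of f^(s)_{a_1..a_t}; vectors of F_2^n are integers < 2^n."""
    t = len(a)
    assert 0 <= s < t and s < n
    table = [0] * (1 << n)

    def toggle(indices):
        # Add (mod 2) the atomic function at the sum of the selected a_i
        point = 0
        for i in indices:
            point ^= a[i]
        table[point] ^= 1

    toggle(range(t))
    for j in range(s + 1):
        if mu(t, s, j):
            for subset in combinations(range(t), j):
                toggle(subset)
    return table


def algebraic_degree(table):
    """Degree of the ANF via the binary Moebius transform (-1 for zero)."""
    coeffs = list(table)
    for i in range(len(coeffs).bit_length() - 1):
        for x in range(len(coeffs)):
            if x & (1 << i):
                coeffs[x] ^= coeffs[x ^ (1 << i)]
    return max((bin(m).count("1") for m, c in enumerate(coeffs) if c), default=-1)


rng = random.Random(3)
n = 6
# Largest observed value of degree - (n - s - 1) over random choices of the a_i
max_excess = max(
    algebraic_degree(construct_f(n, s, [rng.randrange(1 << n) for _ in range(t)]))
    - (n - s - 1)
    for s in range(4) for t in range(s + 1, 9) for _ in range(20)
)
print(max_excess)
```

A nonpositive `max_excess` is what the theorem predicts; for linearly independent $a_i$ one can also observe cases where the bound $n - s - 1$ is attained, for example `construct_f(3, 1, [1, 2])`, which is the indicator of a 2-dimensional subspace.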

3.2.1. Linear equivalence between the constructed functions when $a_1, \dots, a_t$ are linearly independent
We say that two $n$-variable Boolean functions $f, g$ are linearly (resp., affinely) equivalent if there exists a linear automorphism (resp., an affine automorphism) $L$ of $\mathbb{F}_2^n$ such that $g = f \circ L$; then $f$ and $g$ have the same Hamming weight and the same algebraic degree. All the functions in the same equivalence class then contribute the same weight to the weight spectrum of the corresponding Reed-Muller code. When we find a function with a known algebraic degree and weight, we are then interested in knowing whether it is inequivalent to previously found functions. For $t \le n$, two choices $a_1, \dots, a_t$ and $a'_1, \dots, a'_t$ of linearly independent elements give linearly equivalent functions $f^{(s)}_{a_1, \dots, a_t}$ and $f^{(s)}_{a'_1, \dots, a'_t}$, because there exists a linear automorphism $L$ mapping $a_1, \dots, a_t$ to $a'_1, \dots, a'_t$, respectively, and, therefore, mapping $\sum_{i \in J} a_i$ to $\sum_{i \in J} a'_i$ for every $J$. We then have $f^{(s)}_{a_1, \dots, a_t} = f^{(s)}_{a'_1, \dots, a'_t} \circ L$.
3.2.2. Studying some particular cases of $(t, s)$ when $a_1,\dots,a_t$ are not necessarily linearly independent
For two choices $a_1,\dots,a_t$ and $a'_1,\dots,a'_t$ of linearly dependent elements, the corresponding functions $f^{(s)}_{a_1,\dots,a_t}$ and $f^{(s)}_{a'_1,\dots,a'_t}$ may not be affine equivalent. Of course, if $a_1,\dots,a_t$ and $a'_1,\dots,a'_t$ satisfy exactly the same linear relations over $\mathbb{F}_2$, then there is again a linear automorphism mapping $a_1,\dots,a_t$ to $a'_1,\dots,a'_t$, respectively (indeed, the two families have the same rank $k$; we can choose in each family $k$ elements generating the other elements of the family by the same relations and deduce such a linear automorphism), but if not, then the functions $f^{(s)}_{a_1,\dots,a_t}$ and $f^{(s)}_{a'_1,\dots,a'_t}$ may be inequivalent.
Before seeing an example where $f^{(s)}_{a_1,\dots,a_t}$ and $f^{(s)}_{a'_1,\dots,a'_t}$ are not affine equivalent, let us systematically visit the first possible values of $s$ (for any $t > s$):
• Case $s = 0$: For $t \ge 1$, we have $f^{(0)}_{a_1,\dots,a_t} = \delta_{\sum_{i=1}^t a_i} + \delta_0$, which can have a weight of either 0 or 2; we then get only the two smallest weights of $RM(n-1, n)$;
• Case $s = 1$: For $t \ge 2$, we have $f^{(1)}_{a_1,\dots,a_t} = \delta_{\sum_{i=1}^t a_i} + (t-1)\,\delta_0 + \sum_{i=1}^t \delta_{a_i}$ (we omit the "mod 2"); if $t$ is even, then we get $\delta_{\sum_{i=1}^t a_i} + \delta_0 + \sum_{i=1}^t \delta_{a_i}$, which has an even weight of at most $t + 2$, and if $t$ is odd, then we get $\delta_{\sum_{i=1}^t a_i} + \sum_{i=1}^t \delta_{a_i}$, which has an even weight as well, of at most $t + 1$. Since $t$ is not bounded above, we get all possible weights of $RM(n-2, n)$ (and this case is then very different from the previous one): we can easily check that the weights 2 and $2^n - 2$ are impossible and all other even weights between 0 and $2^n$ are possible; for instance, weight 4 is achieved by taking either $t = 2$ and $a_1, a_2$ nonzero and distinct (i.e., linearly independent over $\mathbb{F}_2$) or $t = 3$ and $a_1, a_2, a_3$ distinct;
• Case $s = 2$: For $t \ge 3$, we have $f^{(2)}_{a_1,\dots,a_t} = \delta_{\sum_{i=1}^t a_i} + \binom{t-1}{2}\,\delta_0 + (t-2) \sum_{i=1}^t \delta_{a_i} + \sum_{1\le i<j\le t} \delta_{a_i+a_j}$; hence, if all the sums $a_i + a_j$ and the $a_i$ are nonzero and distinct, we have a function of a Hamming weight in […];
• Case $s = 3$: For $t \ge 4$, we have $f^{(3)}_{a_1,\dots,a_t} = \delta_{\sum_{i=1}^t a_i} + \binom{t-1}{3}\,\delta_0 + \binom{t-2}{2} \sum_{i=1}^t \delta_{a_i} + (t-3) \sum_{1\le i<j\le t} \delta_{a_i+a_j} + \sum_{1\le i<j<k\le t} \delta_{a_i+a_j+a_k}$; hence, if all the sums $a_i + a_j + a_k$ are distinct, we have a function of a Hamming weight of at least $t$ […].
Since, for the same value of $n$ and the same value of $t$, $f^{(1)}_{a_1,\dots,a_t} = \delta_{\sum_{i=1}^t a_i} + (t-1)\,\delta_0 + \sum_{i=1}^t \delta_{a_i}$ can have different Hamming weights according to the values of the $a_i$'s when they are linearly dependent, we have an example where $f^{(s)}_{a_1,\dots,a_t}$ and $f^{(s)}_{a'_1,\dots,a'_t}$ are not affine equivalent, even if $a_1,\dots,a_t$ are distinct, as well as $a'_1,\dots,a'_t$.
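The case $s = 1$ above can be replayed on concrete vectors. The sketch below is only an illustration: it assumes the general form $\delta_{\sum_i a_i} + \sum_{j=0}^{s}\mu_{t,s}(j)\sum_{|J|=j}\delta_{\sum_{i\in J}a_i}$ with $\mu_{t,s}(j) = \binom{t-j-1}{s-j} \bmod 2$ (which specializes to the expressions listed for $s = 0, \dots, 3$), encodes vectors of $\mathbb{F}_2^n$ as integers, and uses a hypothetical helper `support`. The families `independent` and `dependent` are arbitrary choices with $t = 4$.

```python
from itertools import combinations
from math import comb


def support(vectors, s):
    """Support of f^(s)_{a_1..a_t}, under the assumed general form."""
    t = len(vectors)
    supp = set()
    total = 0
    for a in vectors:
        total ^= a
    supp ^= {total}
    for j in range(s + 1):
        if comb(t - j - 1, s - j) % 2:
            for J in combinations(vectors, j):
                p = 0
                for a in J:
                    p ^= a
                supp ^= {p}
    return supp


# Weight 4 with t = 2 (independent) and with t = 3 (distinct vectors).
print(len(support([1, 2], 1)), len(support([1, 2, 3], 1)))

# Same t = 4, both families made of distinct vectors, different weights:
independent = [1, 2, 4, 8]   # weight t + 2 = 6
dependent = [1, 2, 3, 4]     # a_1 + a_2 + a_3 = 0, so two atomic functions cancel
weights = (len(support(independent, 1)), len(support(dependent, 1)))
print(weights)
```

Since affinely equivalent functions have the same Hamming weight, two different weights for the same $t$ are enough to conclude inequivalence in this toy case.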
Let us now systematically visit the first possible values of $t > s$ (for any $s$):
• For $t = s + 1$, we have $\mu_{t,s}(j) = \binom{s-j}{s-j} \bmod 2 = 1$ for all $j \le s$. Note that this was expected since Relation (3.1) expresses, in particular, that for a function of degree at most $s$, the sum of the values of the function taken over any $(s+1)$-dimensional affine space equals 0. The Hamming weight of $f^{(s)}_{a_1,\dots,a_{s+1}}$ is at most $w_{s+1,s} = 1 + \sum_{j=0}^{s} \binom{t}{j} = 2^{s+1}$. Hence, since $2^{s+1}$ equals the minimum distance of $RM(r, n)$, the Hamming weight of $f^{(s)}_{a_1,\dots,a_{s+1}}$ is either zero or $2^{s+1}$ (depending on the choice of $a_1,\dots,a_{s+1}$). More precisely:
Proposition 1. For every $s \ge 0$ and every linearly independent $a_1,\dots,a_{s+1}$ in $\mathbb{F}_2^n$, $f^{(s)}_{a_1,\dots,a_{s+1}}$ is the minimum weight codeword in $RM(r, n)$ whose support equals $\langle a_1,\dots,a_{s+1}\rangle$, the vector space over $\mathbb{F}_2$ generated by $a_1,\dots,a_{s+1}$. If $a_1,\dots,a_{s+1}$ are linearly dependent, then $f^{(s)}_{a_1,\dots,a_{s+1}}$ equals the zero function.
Proof. We have $f^{(s)}_{a_1,\dots,a_{s+1}} = \sum_{J \subseteq \{1,\dots,s+1\}} \delta_{\sum_{i\in J} a_i}$. If $a_1,\dots,a_{s+1}$ are linearly independent, then $f^{(s)}_{a_1,\dots,a_{s+1}}$ equals the indicator of the vector space generated by $a_1,\dots,a_{s+1}$ (and we obtain with the functions $f^{(s)}_{a_1,\dots,a_t}$ all the minimum weight codewords in $RM(r, n)$). If $a_1,\dots,a_{s+1}$ are linearly dependent, then the Hamming weight of $f^{(s)}_{a_1,\dots,a_t}$ (with $t = s + 1$) is strictly less than the minimum distance of $RM(r, n)$, and it is then 0. Note that, assuming (without loss of generality, thanks to the invariance of $f^{(s)}_{a_1,\dots,a_t}$ when permuting the $a_i$'s) that $a_t = a_1 + \cdots + a_k$ for some $k < t$, it is easily seen that each Dirac function obtained after replacing $a_t$ by its value in the expression of $f^{(s)}_{a_1,\dots,a_{s+1}}$ appears an even number of times. This implies that this expression cancels.
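Proposition 1 lends itself to a direct check on small examples. The following sketch assumes only the expression used in the proof, $f^{(s)}_{a_1,\dots,a_{s+1}} = \sum_{J \subseteq \{1,\dots,s+1\}} \delta_{\sum_{i\in J} a_i}$, with vectors encoded as integers; the names `f_all_subsets` and `span` are hypothetical.

```python
from itertools import combinations


def f_all_subsets(vectors):
    """Support of the sum over all J of delta_{sum_{i in J} a_i} (case t = s + 1)."""
    supp = set()
    for j in range(len(vectors) + 1):
        for J in combinations(vectors, j):
            p = 0
            for a in J:
                p ^= a
            supp ^= {p}   # a point hit an even number of times cancels
    return supp


def span(vectors):
    """Vector space over F_2 generated by the given vectors."""
    pts = {0}
    for a in vectors:
        pts |= {p ^ a for p in pts}
    return pts


# s = 2: three independent vectors give the indicator of their span (weight 2^(s+1) = 8).
print(f_all_subsets([1, 2, 4]) == span([1, 2, 4]))
# Three dependent vectors (1 + 2 = 3) give the zero function.
print(f_all_subsets([1, 2, 3]))
```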
[…], and we have: $\delta_{\sum_{i\in J} a_i}$.

On the weights of the constructed functions
The interest of Theorem 1 is that it is possible to calculate mathematically, under some conditions, the Hamming weight of $f^{(s)}_{a_1,\dots,a_t}$, and that the weights obtained do not look peculiar.
Proposition 3. Let $n, s \ge 0$ and $t \ge 1$ be integers such that $s < t$ and $s < n$. For any elements $a_1,\dots,a_t$ of $\mathbb{F}_2^n$, let $f^{(s)}_{a_1,\dots,a_t}$ be the Boolean function given by (3.2). If $a_1,\dots,a_t$ are linearly independent over $\mathbb{F}_2$, then $f^{(s)}_{a_1,\dots,a_t}$ has Hamming weight:
$$w_{t,s} = 1 + \sum_{j \in \{0,\dots,s\};\ \mu_{t,s}(j) = 1} \binom{t}{j}, \qquad (3.3)$$
where $\mu_{t,s}(j) = \binom{t-j-1}{s-j} \bmod 2$, and otherwise, it has a Hamming weight of at most $w_{t,s}$.
Indeed, the former assertion comes from the fact that, for any two distinct $J$, the corresponding elements $\sum_{i\in J} a_i$ are distinct, since $a_1,\dots,a_t$ are linearly independent over $\mathbb{F}_2$, and the latter is obvious. Note that the Hamming weight of $f^{(s)}_{a_1,\dots,a_t}$ has necessarily the same parity as $w_{t,s}$ since the atomic functions involved in (3.2) cancel by pairs, but since we already know that this weight is even because $r = n - s - 1$ is strictly smaller than $n$, this only tells us that $w_{t,s}$ is even (while it may not always be a weight in $RM(r, n)$ when $t > n$). Note also that $w_{t,s} \ge 1 + \binom{t}{s}$ since $\mu_{t,s}(s) = 1$ (and then the weight of $f^{(s)}_{a_1,\dots,a_t}$ cannot equal $w_{t,s}$ if $\binom{t}{s} \ge 2^n$), and that if $t - s$ is odd, then $w_{t,s} \ge 1 + \binom{t}{s-1} + \binom{t}{s}$, since $\mu_{t,s}(s-1) = t - s$ (and then the weight of $f^{(s)}_{a_1,\dots,a_t}$ cannot equal $w_{t,s}$ if $\binom{t}{s-1} + \binom{t}{s} \ge 2^n$).
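The closed form of Proposition 3 can be compared with a brute-force count. The sketch below is an illustration under assumptions: it reads (3.3) as $w_{t,s} = 1 + \sum_{j \le s,\ \mu_{t,s}(j)=1} \binom{t}{j}$ and (3.2) as $\delta_{\sum_i a_i} + \sum_{j=0}^{s}\mu_{t,s}(j)\sum_{|J|=j}\delta_{\sum_{i\in J}a_i}$, encodes vectors as integers, and uses the hypothetical names `w` and `support`.

```python
from itertools import combinations
from math import comb


def mu(t, s, j):
    return comb(t - j - 1, s - j) % 2


def w(t, s):
    """w_{t,s} under the assumed reading of (3.3)."""
    return 1 + sum(comb(t, j) for j in range(s + 1) if mu(t, s, j))


def support(vectors, s):
    """Support of f^(s)_{a_1..a_t}, under the assumed form of (3.2)."""
    t = len(vectors)
    supp = set()
    total = 0
    for a in vectors:
        total ^= a
    supp ^= {total}
    for j in range(s + 1):
        if mu(t, s, j):
            for J in combinations(vectors, j):
                p = 0
                for a in J:
                    p ^= a
                supp ^= {p}
    return supp


# Independent case: the weight equals w_{t,s} for the canonical basis vectors.
agree = all(
    len(support([1 << i for i in range(t)], s)) == w(t, s)
    for t in range(1, 7)
    for s in range(t)
)
print(agree)

# Dependent case: the weight is at most w_{t,s}.
print(len(support([1, 2, 3, 4], 1)), "<=", w(4, 1))
```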
Example 2. Let us take $n = 12$, $r = 8$. We can check that $f^{(s)}_{a_1,\dots,a_t}$ can reach weight 166 in two cases where $a_1,\dots,a_t$ are linearly independent over $\mathbb{F}_2$. Indeed, for having $r = n - s - 1 = 8$, we need to take $s = 3$. For the weight $w_{t,s} = 1 + \sum_{j\in\{0,\dots,s\};\ \mu_{t,s}(j)=1} \binom{t}{j}$ given by Proposition 3 to equal 166, we need to take $t \in \{10, 11\}$. Recall that all these functions are affine equivalent, for a fixed value of $t$. Denoting by $(e_1,\dots,e_{12})$ the canonical basis of $\mathbb{F}_2^{12}$ (made of all weight 1 vectors), we then obtain two classes of functions, which are respectively affine equivalent to $f^{(3)}_{e_1,\ldots}$ […] $x^I$.
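The arithmetic of Example 2 can be reproduced in a few lines. This is a sketch assuming the weight formula quoted in the example, $w_{t,s} = 1 + \sum_{j \le s,\ \mu_{t,s}(j)=1}\binom{t}{j}$ with $\mu_{t,s}(j) = \binom{t-j-1}{s-j} \bmod 2$; the function name `w` is hypothetical.

```python
from math import comb


def w(t, s):
    """w_{t,s} as quoted in Example 2."""
    return 1 + sum(
        comb(t, j) for j in range(s + 1) if comb(t - j - 1, s - j) % 2
    )


n, r = 12, 8
s = n - r - 1  # s = 3
# Linearly independent families exist only for s < t <= n.
hits = [t for t in range(s + 1, n + 1) if w(t, s) == 166]
print(hits)  # [10, 11]
```

For $t = 10$ only $j \in \{2, 3\}$ contribute, giving $1 + 45 + 120$, and for $t = 11$ only $j = 3$ contributes, giving $1 + 165$.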
It is interesting to notice that $w_{t,s}$, defined in Relation (3.3), does not depend on $n$ (we only have the condition that $n \ge t$). Of course, $n$ plays a role through the value of $r$.
We can see that the weights provided by Proposition 3 are few for low orders (since t ranges from n − r to n) and a little more numerous for large orders.
We now observe a property of $w_{t,s}$ that seems easier to show by considering Relation (3.3) than to infer directly from the way $f^{(s)}_{a_1,\dots,a_t}$ was derived:
Lemma 1. For every $s, i \ge 0$, we have $w_{s+2i+1,s} = w_{s+2i+2,s} \le w_{s+2i+3,s}$, and this latter inequality is strict for $s > 0$.
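Lemma 1 can be spot-checked numerically over a small range of parameters. The sketch assumes that Relation (3.3) reads $w_{t,s} = 1 + \sum_{j \le s,\ \mu_{t,s}(j)=1}\binom{t}{j}$ with $\mu_{t,s}(j) = \binom{t-j-1}{s-j} \bmod 2$; `w` and `lemma1_holds` are hypothetical names, and a finite check is of course not a proof.

```python
from math import comb


def w(t, s):
    """w_{t,s} under the assumed reading of (3.3)."""
    return 1 + sum(
        comb(t, j) for j in range(s + 1) if comb(t - j - 1, s - j) % 2
    )


def lemma1_holds(s, i):
    a, b, c = w(s + 2 * i + 1, s), w(s + 2 * i + 2, s), w(s + 2 * i + 3, s)
    # equality of the first two, then <= (strict when s > 0)
    return a == b and (b < c if s > 0 else b <= c)


checked = all(lemma1_holds(s, i) for s in range(6) for i in range(5))
print(checked)
print([w(t, 3) for t in range(4, 13)])  # values come in equal consecutive pairs
```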
In the next corollary, we call the weight spectrum of $RM(r, n)$ the list of all possible weights in $RM(r, n)$.
Corollary 1. Whatever the positive integers $n$ and $r < n$ are, the weight spectrum of $RM(r, n)$ contains all the numbers: . . .
Indeed, for every $t \le n$, there exist $t$ linearly independent elements. In Table 1, we give for $n \le 21$ and for all $r = 1, \dots, n-1$, the list in regular roman ** of the values $w_{t,n-r-1}$ where $t$ ranges from $n - r$ to $n$. All these numbers are weights in $RM(r, n)$, and all the lists displayed for the input pairs $(n, r), (n, r-1), \dots, (n, 1)$ provide weights in $RM(r, n)$. We can check on these lists that Lemma 1 is verified, that is, the numbers go by pairs of consecutive equal values and the lists are nondecreasing.
We can find in Table 1 many numbers which were not known before as weights in $RM(r, n)$, such as 3004 or 6436 in $RM(6, 14)$.
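A row of such a table can be regenerated from the weight formula. The sketch below assumes that Relation (3.3) reads $w_{t,s} = 1 + \sum_{j \le s,\ \mu_{t,s}(j)=1}\binom{t}{j}$ with $\mu_{t,s}(j) = \binom{t-j-1}{s-j} \bmod 2$; `w` and `table_row` are hypothetical names, and the output is only as reliable as that reading of the formula.

```python
from math import comb


def w(t, s):
    """w_{t,s} under the assumed reading of (3.3)."""
    return 1 + sum(
        comb(t, j) for j in range(s + 1) if comb(t - j - 1, s - j) % 2
    )


def table_row(n, r):
    """Values w_{t,n-r-1} for t = n-r, ..., n (linearly independent a_1..a_t)."""
    s = n - r - 1
    return [w(t, s) for t in range(n - r, n + 1)]


row = table_row(14, 6)
print(row)
print(3004 in row, 6436 in row)
```

With these assumptions, the two values quoted for $RM(6, 14)$ appear for $t \in \{12, 13\}$ and $t = 14$, respectively.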
** The values in bold will be obtained below in Subsection 3.3.1.

AIMS Mathematics
Volume 9, Issue 5, 10609-10637.

We have seen that restricting ourselves to the case where $a_1,\dots,a_t$ are linearly independent over $\mathbb{F}_2$ reduces the number of the weights that can be found by using Theorem 1, because $t$ is then necessarily in the range $\{n-r, \dots, n\}$ and since, for fixed $n$ and $r$ (i.e., for fixed $n$ and $s$), all the obtained functions corresponding to the same $t$ have the same Hamming weight. In the present section, we investigate two cases where $a_1,\dots,a_t$ are linearly dependent. We shall see that the first does not provide more weights but the second does.
Case where two elements are equal: We study this case out of curiosity, to check whether, with $t$ elements $a_1,\dots,a_t$, it is identical to the case of $t - 2$ elements or not (the formulas are different but the functions and/or the weights may be the same).
To ease the comparison, we start with $t + 2$ elements $a_1,\dots,a_{t+2}$ such that (without loss of generality) $a_{t+2} = a_{t+1}$. Then, for every $J \subseteq \{1,\dots,t+2\}$, we have that $\sum_{i\in J} a_i$ equals […]. We get the same atomic function (which then cancels) if exactly one element among $\{t+1, t+2\}$ belongs to $J$, whether we choose $t + 1$ or $t + 2$. We deduce that: $f^{(s)}_{a_1,\dots,a_{t+2}} = \delta_{\sum_{i=1}^{t} a_i} +$ […]
[…] to find other ways to provide more weights. One is very simple. Since all the functions $f^{(s)}_{a_1,\dots,a_t}$ have an algebraic degree of at most $r = n - s - 1$, we can sum, for every choice of $s$, some of the functions $f^{(s)}_{a_1,\dots,a_t}, f^{(s+1)}_{a_1,\dots,a_t}, \dots, f^{(t-1)}_{a_1,\dots,a_t}$ for different choices of $t > s$ and of $a_1,\dots,a_t$. The difficulty is to evaluate the Hamming weight of the resulting functions, but there is a case where the weight is easily determined: when we take disjoint families of vectors $a_i$ whose union is made of linearly independent vectors.
In the simplest case, we have (globally) $t$ linearly independent vectors $a_1,\dots,a_t$ in $\mathbb{F}_2^n$ (with $t \le n$), and we partition $\{1,\dots,t\}$ in two subsets (without loss of generality, we can take these subsets equal to $\{1,\dots,l\}$ and $\{l+1,\dots,t\}$). Then two functions $f^{(s)}_{a_1,\dots,a_l}$ and $f^{(s')}_{a_{l+1},\dots,a_t}$ with $s < l$ and $s' < t - l$ have algebraic degrees of at most $r = n - s - 1$ and $r' = n - s' - 1$, respectively, and they have disjoint supports. Their sum then has an algebraic degree of at most $\max(r, r')$ and has for Hamming weight the sum of their Hamming weights, that is, $w_{l,s} + w_{t-l,s'}$.
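The additivity of the weights can be observed on a small instance. The sketch below is illustrative only: it assumes that (3.2) reads $\delta_{\sum_i a_i} + \sum_{j=0}^{s}\mu_{t,s}(j)\sum_{|J|=j}\delta_{\sum_{i\in J}a_i}$ with $\mu_{t,s}(j) = \binom{t-j-1}{s-j} \bmod 2$, encodes vectors as integers, and uses the hypothetical names `support` and `w`. The parameters ($n = t = 6$, $l = 3$, $s = s' = 1$) are an arbitrary choice, and disjointness of the two supports is checked explicitly rather than taken for granted.

```python
from itertools import combinations
from math import comb


def mu(t, s, j):
    return comb(t - j - 1, s - j) % 2


def w(t, s):
    return 1 + sum(comb(t, j) for j in range(s + 1) if mu(t, s, j))


def support(vectors, s):
    """Support of f^(s)_{a_1..a_t}, under the assumed form of (3.2)."""
    t = len(vectors)
    supp = set()
    total = 0
    for a in vectors:
        total ^= a
    supp ^= {total}
    for j in range(s + 1):
        if mu(t, s, j):
            for J in combinations(vectors, j):
                p = 0
                for a in J:
                    p ^= a
                supp ^= {p}
    return supp


basis = [1 << i for i in range(6)]           # a_1..a_6 = e_1..e_6 in F_2^6
first, second = basis[:3], basis[3:]         # l = 3, t - l = 3
f1, f2 = support(first, 1), support(second, 1)

# For these parameters mu(3, 1, 0) = 0, so the zero vector is in neither support.
disjoint = f1.isdisjoint(f2)
sum_weight = len(f1 ^ f2)                    # support of the sum f1 + f2
print(disjoint, sum_weight, w(3, 1) + w(3, 1))
```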
Of course, Proposition 5 can be generalized to sums of more than two numbers (taking more than two families partitioning {a 1 , . . ., a t }).
Remark 4. The conditions $r \ge r_1, r_2$ and $r_1 + r_2 \ge 2n - t$ imply that $2r \ge 2n - t$, that is, $t \ge 2n - 2r$, and since $t$ cannot be larger than $n$, this means that Proposition 5 can be used only if $n \ge 2n - 2r$, that is, $r \ge \frac{n}{2}$. Our results are then unfortunately limited to those of Proposition 3 (that is, those of Table 1) for the orders in the lower half of $[0, n]$ (those for which the table provides the fewest values), in particular for the smallest order for which the weight spectrum is unknown: $r = 3$. We shall see below that, on the contrary, we can derive a very large number of weights as soon as $r$ is large enough. Note that if we partition $\{a_1,\dots,a_t\}$ in three families $\{a_1,\dots,a_l\}$, $\{a_{l+1},\dots,a_k\}$ and $\{a_{k+1},\dots,a_t\}$, we get the weight $w_{l,n-r\ldots}$ […], and the condition $r \ge \frac{n}{2}$ becomes $r \ge \frac{2n}{3}$.
For each $n \le 23$, the weights provided by Proposition 5 can be obtained by adding in Table 1 any number located in any row $r_1 \le r$ at the $k$-th position (by taking $k = l - (n - r_1) + 1$ so that it starts at position 1 in the list given by Table 1), where $k \le t - n + r_2 - (n - r_1) + 1 = t - 2n + r_1 + r_2 + 1$, and the number located in any row $r_2 \le r$ such that $r_1 + r_2 \ge 2n - t$, at the position corresponding to $t - l$, that is, at the position […] 1 as explained above.
For $n = 12$ and $r = 6$, the condition $n \ge t \ge 2n - 2r$ reads $t = 12$. The conditions $r_1, r_2 \le r$ and $r_1 + r_2 \ge 2n - t$ allow only one possibility: $(r_1, r_2) = (6, 6)$, which provides only one weight (since the number $k \in \{1, \dots, t - 2n + r_1 + r_2 + 1\}$ in the description we gave can take value 1 only): 128, which is already there.
- For $t = 12$, the conditions $r_1, r_2 \le r$ and $r_1 + r_2 \ge 2n - t$ allow: $(r_1, r_2) = (7, 7)$, which provides the weight 64, also already obtained.
We could continue by visiting $RM(8, 14)$ (which is the first case where we obtain a weight not divisible by 4: 3474), etc., but with this example, we see the huge difference between low and high orders. We leave as an open problem the determination of more weights in $RM(6, 12)$ (and in particular, some that are not divisible by 4), which will probably require a method other than exploiting Relation (3.1).
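The bookkeeping of Remark 4 (admissible pairs $(r_1, r_2)$ and the sums of table entries they select) can be sketched as follows. This is an illustration under assumptions, not the paper's procedure: it reads (3.3) as $w_{t,s} = 1 + \sum_{j \le s,\ \mu_{t,s}(j)=1}\binom{t}{j}$, takes the conditions $r_1, r_2 \le r$, $r_1 + r_2 \ge 2n - t$, $n - r_1 \le l \le t - n + r_2$ from the remark, and only forms the candidate sums $w_{l,n-r_1-1} + w_{t-l,n-r_2-1}$; the names `admissible_pairs` and `candidate_sums` are hypothetical.

```python
from math import comb


def w(t, s):
    """w_{t,s} under the assumed reading of (3.3)."""
    return 1 + sum(
        comb(t, j) for j in range(s + 1) if comb(t - j - 1, s - j) % 2
    )


def admissible_pairs(n, r, t):
    """Unordered pairs (r1 <= r2 <= r) with r1 + r2 >= 2n - t."""
    return [
        (r1, r2)
        for r1 in range(r + 1)
        for r2 in range(r1, r + 1)
        if r1 + r2 >= 2 * n - t
    ]


def candidate_sums(n, r, t):
    """Sums w_{l,n-r1-1} + w_{t-l,n-r2-1} over the admissible (r1, r2) and l."""
    sums = set()
    for r1, r2 in admissible_pairs(n, r, t):
        # l = k + (n - r1) - 1 with 1 <= k <= t - 2n + r1 + r2 + 1
        for l in range(n - r1, t - n + r2 + 1):
            sums.add(w(l, n - r1 - 1) + w(t - l, n - r2 - 1))
    return sorted(sums)


print(admissible_pairs(12, 6, 12))  # [(6, 6)]
print(candidate_sums(12, 6, 12))    # [128]
```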
3.4. The ANF of the constructed functions when $a_1,\dots,a_t$ are linearly independent over $\mathbb{F}_2$
We have seen that, for $t \le n$, two choices "$a_1,\dots,a_t$" and "$a'_1,\dots,a'_t$" of linearly independent elements give linearly equivalent functions, which then have the same weight and the same algebraic degree.
Let us then determine the ANF of $f^{(s)}_{e_1,\dots,e_t}$, where $t \le n$. We shall need the following lemma:
Lemma 2. Let $n \ge 1$, $t \ge 1$, $j \ge 0$ be integers such that $j \le t \le n$ and let $e_1,\dots,e_t$ be the $t$ first elements of the canonical basis of $\mathbb{F}_2^n$. The Boolean function: […] $x^I$, where $\mu_{t,s}(j) = \binom{t-j-1}{s-j} \bmod 2 = \binom{t-j-1}{t-s-1} \bmod 2$. This is straightforward since $f^{(s)}_{e_1,\dots,e_t} = h_{t,e_1,\dots,e_t} + \sum_{j=0}^{s} \mu_{t,s}(j)\, h_{j,e_1,\dots,e_t}$.
Open problem: Determine the exact algebraic degree of $f^{(s)}_{a_1,\dots,a_t}$ in terms of $s$, $n$ and $a_1,\dots,a_t$.
Subproblem: Determine the algebraic degree of $f^{(s)}_{a_1,\dots,a_t}$ in terms of $s$ and $n$ when $a_1,\dots,a_t$ are linearly independent.
Still more complex is the following:
Open problem: Determine what the ANF of $f^{(s)}_{a_1,\dots,a_t}$ can be when $a_1,\dots,a_t$ are linearly dependent.

Conclusions
We have introduced a novel way of constructing Reed-Muller codewords. It consists of exploiting relations satisfied by all $n$-variable Boolean or vectorial functions $F$ of an algebraic degree of at most $s$ (corresponding, when $F$ is Boolean, to codewords in $RM(s, n)$), these relations being interpretable in terms of the orthogonality between some Boolean function, say $f$, and (the coordinate functions of) all such $F$. Function $f$ then belongs to $RM(r, n)$, where $r = n - s - 1$. This construction depends on $n$, $s$ (or $r$), a parameter $t$ and the choice of $t$ vectors $a_i$. We showed how it allows us to determine weights in Reed-Muller codes that are not accessible by other methods, as far as we know, and in a simpler way. As a matter of fact, our method for determining weights in Reed-Muller codes is complementary to the classic method, which consists of using the known constructions, since the latter is more efficient for low orders and our method is more efficient for large orders. In any case, the method using the known constructions poses technical problems (and provides a number of weights that is small compared to the amount of work needed), while ours provides weights with fewer difficulties. Functions having the weights we can derive with our method can be deduced, as well as a general form of their ANF when the vectors $a_i$ are linearly independent, but determining mathematically their exact algebraic degree seems difficult. This is one of the open problems we proposed. We also found more weights by considering cases where the vectors are linearly dependent. We could also identify that, for some of the constructed functions having disjoint supports, the weights of the sums are equal to the sums of the weights; this provided, for each Reed-Muller code of a sufficiently large order, a very large number of new weights.
More work is possible in many directions, for instance, by investigating as many cases of functions as possible where the vectors $a_i$ are linearly dependent and by studying sums of such functions as well. Moreover, there may be other relations to find that are interpretable in terms of orthogonality, leading to more codewords and weights in Reed-Muller codes. This may provide an avenue for further results, with the ultimate goal of determining all the weight spectra of Reed-Muller codes (starting with those of high orders when they are still unknown, since they seem to be more accessible than those of low orders larger than 2), and, better still, their weight distributions.

Use of AI tools declaration
The author declares he has not used Artificial Intelligence (AI) tools in the creation of this article.

Table 1. Lists of values of $w_{n-r,n-r-1}, \dots, w_{n,n-r-1}$; […] $w_{t,n-r-1}$.
The weights in some cases where $a_1,\dots,a_t$ are linearly dependent
It is for large orders that our […], 136, 208, 256, 384, 496, 512, 628, 736, 784, 992, 1024, 1420. We can add the weights obtained by adding weights from $RM(r_1, 12)$ and $RM(r_2, 12)$ in Table […]