A novel systematic byte substitution method to design strong bijective substitution box (S-box) using piece-wise-linear chaotic map

Cryptography deals with designing practical mathematical algorithms built on the two primitive elements of confusion and diffusion. The security of encrypted data depends heavily on these two elements and on the key. The S-box is the nonlinear component of a symmetric encryption algorithm that provides confusion. A cryptographically strong bijective S-box in a cryptosystem ensures near-optimal resistance against cryptanalytic attacks by providing the uncertainty and nonlinearity needed for high confidentiality. The nonlinearity of an S-box depends heavily on how the S-box disperses the input data. The cryptographic performance criteria of chaos-based S-boxes, especially differential probability, are worse than those of algebraic S-box design methods. This article reports a novel approach to designing an 8 × 8 S-box using chaos and randomization based on the dispersion property, with the aim of improving the S-box's cryptographic properties, especially differential probability. The dispersion-based randomization is introduced within the design loop to achieve a differential uniformity that is as low as possible. Two steps are involved in generating the proposed S-box. In the first step, a piecewise linear chaotic map (PWLCM) is utilized to generate initial S-box positions. Generally, the dispersion property is a post-processing technique that measures maximum nonlinearity in a given random sequence. In the second step, however, the concept is carefully reverse engineered, and the dispersion property is used within the design loop for systematic dispersal of the input substituting sequence. The proposed controlled randomization changes the probability distribution statistics of the S-box's differentials. The proposed methodology systematically substitutes the S-box positions that cause output differences to recur for a given input difference.
The proposed S-box is analyzed using the well-established statistical cryptographic criteria of nonlinearity, strict avalanche criterion (SAC), bit independence criterion (BIC), differential probability, and linear probability. Further, the S-box's boomerang connectivity table (BCT) is generated to analyze its strength against the boomerang attack, a relatively new attack framework for cryptosystems. The proposed S-box is compared with the latest related state-of-the-art publications. Results show that the proposed S-box achieves an upper bound of cryptographic properties, especially differential probability. This work hypothesizes that highly dispersive Hamming distances at the output difference generate a systematic S-box. The mixing property of chaos-generated trajectories is utilized for decimal mapping. To test the randomness of the generated chaotic trajectories, a cryptographically secure pseudo-random sequence was generated using a chaotic map and tested using the National Institute of Standards and Technology (NIST) NIST-800-22 test suite.


INTRODUCTION
Cryptography aids individual users and corporate organizations in protecting their digital data and information. With the prevalence of cryptography (Paar & Pelzl, 2009), digital data transmission over insecure networks has significantly improved. This rapid increase in transmission has entailed a significant enhancement of information security. New standards for data communication and information technology have been developed with the requirement of a specific mechanism to resist cryptographic attacks (Standaert, Piret & Quisquater, 2003; National Institute of Standards and Technology, 2001; Biryukov, 2011; Daemen & Rijmen, 2002). With his paper on the communication theory of secrecy, Shannon laid the foundation of the modern era of cryptography (Shannon, 1949). Symmetric-key and asymmetric-key cryptographic algorithms operating at the byte/word level or bit level are used to secure and protect digital information transmitted over insecure channels. In the light of the previous discussion, this paper attempts to design a systematic S-box with improved cryptographic properties, especially DP.
Data confidentiality in cryptography is related to the encryption of digital data. Modern block ciphers, including DES (National Institute of Standards & Technology, 1999) and variants of DES, Blowfish (Schneier, 1993), Camellia (Aoki et al., 2001), Kasumi (ETSI, 2001), RC5 (Rivest, 1995), RC6 (Handschuh, 2011), PRESENT (Bogdanov et al., 2007), and AES (Daemen & Rijmen, 2002), are based on Shannon's principles of confusion and diffusion. Confusion is a technique that obscures the relationship between the key and the ciphertext, thus making it difficult for an attacker to guess the key while wiretapping. An S-box, a nonlinear auxiliary table, is used in the encryption algorithm as the confusion component.
An S-box is a bijective mapping $S: \{0,1\}^n \to \{0,1\}^n$, where the equality exhibits that input and output bits are the same, hence an asymmetric S-box. An S-box ensures nonlinear propagation of plaintext through the rounds of an encryption algorithm to achieve confusion and prevent an attacker from recovering the correct key. After the introduction of differential cryptanalysis (Biham & Shamir, 1991; Langford & Hellman, 1994), an expanded set of S-box design criteria was proposed (Dawson & Tavares, 1991; Yi, Cheng & You, 1997; Nyberg, 1991). It was revealed in the early '90s that the known structure acts as a basis to mount differential cryptanalysis. Therefore, an S-box based on the given criteria preferably leads to near-optimal resistance against differential and linear attacks.
Differential cryptanalysis is an effective chosen-plaintext attack on block ciphers. To mount this attack, a cryptanalyst first chooses an input differential Δx of plaintext pairs (x, x′), examines its propagation, and finds the output differential pairs through encryption. In this attack, a cryptanalyst uses an S-box to compute the complete set of output differences (Δy) for all given input differences (Δx). Subsequently, the input/output differences are tabulated as a difference distribution table (DDT). The attacker then searches for highly probable output pairs for a given Δx through differential analysis of the cipher. Thus, a differential attack marks weaknesses within the cipher and achieves desirable results on the part of an attacker (Heys, 2002; Biham & Shamir, 1991). The following definitions will help understand the concept of DDT to measure DP (Biryukov & Perrin, 2015; Biham & Shamir, 1991).
Definition 1: Let $S: \{0,1\}^n \to \{0,1\}^m$, where $m = n$, be a substitution function. The DP of the differential (Δx, Δy) is the number of pairs with input difference Δx and output difference Δy, divided by the total number of pairs with input difference Δx:

$$DP(\Delta x \to \Delta y) = \frac{\#\{x \in \{0,1\}^n : S(x) \oplus S(x \oplus \Delta x) = \Delta y\}}{2^n} \tag{1}$$

The DP is considered a stochastic variable and can only take the value 0 or a multiple of $2^{1-n}$.

The Rijndael S-box was a two-step algebraic design based on AES's GF(256) inverse and an affine transformation. It was based on the NIST criteria, inspired by the linear and differential attacks (Biham & Shamir, 1991; Matsui, 1996). The introduction of AES established the basis for designing strong cryptosystems. In the same era, Kocarev (2001) laid an excellent foundation for chaos-based cryptography and summarized the similarities and differences between chaotic maps and cryptographic algorithms. For example, chaotic maps are defined on a subset of the real numbers, whereas cryptographic algorithms are defined on finite sets. The parameters of a map may represent the key of an encryption algorithm. Encryption rounds in a cryptographic algorithm fulfill the desired confusion and diffusion properties, and the iterations of a chaotic map fulfill the ergodicity property. Chaos has deterministic dynamics and has properties such as a positive Lyapunov exponent, mixing, and ergodicity. These properties are favored in cryptography and have an advantage over algebraic designs due to their lower computational complexity, ease of implementation, sensitive dependence on initial conditions, and hardware efficiency.
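As an illustration of Definition 1, the following minimal Python sketch tabulates the DDT of an S-box and reads off its differential uniformity and maximum DP. The function names `ddt` and `max_dp`, and the use of the 4-bit PRESENT S-box as a small test input, are choices made for this illustration only; they are not part of the proposed design method. The sketch assumes the S-box length is a power of two.

```python
def ddt(sbox):
    """Difference distribution table: table[dx][dy] = #{x : S(x) ^ S(x ^ dx) == dy}."""
    size = len(sbox)  # assumed to be a power of two (2^n)
    table = [[0] * size for _ in range(size)]
    for dx in range(size):
        for x in range(size):
            table[dx][sbox[x] ^ sbox[x ^ dx]] += 1
    return table


def max_dp(sbox):
    """Return (differential uniformity, maximum DP) over nonzero input differences."""
    table = ddt(sbox)
    uniformity = max(max(row) for row in table[1:])  # skip the trivial dx = 0 row
    return uniformity, uniformity / len(sbox)


# 4-bit S-box of the PRESENT cipher, used here only as a small example input.
PRESENT_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
                0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

uniformity, dp = max_dp(PRESENT_SBOX)
print(uniformity, dp)
```

Every DDT entry is even because x and x ⊕ Δx always contribute to the same cell, which is why DP is restricted to 0 or a multiple of $2^{1-n}$.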
Step 3 utilizes a chaotic sine map to generate a permutation of integer matrix S(16 × 16). Tian & Lu (2017) proposed a method based on a 1-D logistic map and optimized using the bacterial foraging optimization method. An algebraic method using the cubic fractional transformation was proposed by Zahid, Arshad & Ahmad (2019). Wang et al. (2020) proposed an S-box design method based on a logistic map and a genetic algorithm. Their methodology has two steps: first, a chaotic logistic map generates the initial pool of S-boxes; second, a genetic algorithm is applied to obtain the final S-box. Shakiba (2020) proposed a simple chaotic S-box based on the 1-D Chebyshev map. Artuğer & Özkaynak (2020) proposed a method to analyze chaotic S-box design using the zigzag mapping technique, in which various discrete and continuous maps are chosen and integer mapping is performed using the zigzag transformation approach. Ahmad et al. (2020) proposed a hybrid approach to designing a bijective S-box. First, key-dependent improved S-boxes are generated using a 1-D sine-powered chaotic map and a heuristic search technique. Second, the chaotic features of the obtained S-boxes are improved using the action of an algebraic group. Khan & Jamal (2021) proposed an S-box design based on the composition of chaotic maps for lightweight design. Zahid et al. (2021) presented a method to design an S-box based on a heuristic evolutionary strategy and modular operations. Hua et al. (2021) proposed an S-box method using an improved logistic map and a bijective matrix: the chaotic logistic map is iterated to generate a Latin matrix, which is then randomized to obtain the final S-box. Zhu et al. (2020) proposed a dynamic S-box design method in which the final S-box is obtained by applying a fitness function to a proposed static S-box generated by iterating the logistic-tent system. Solami et al. (2018) proposed an S-box based on the mixing property of a higher-dimensional map, in which a 5-D hyperchaotic system is used to obtain the final S-box. Alhadawi et al. (2021) obtained an S-box utilizing a cuckoo search algorithm and a 1-D discrete-space chaotic map. Jiang & Ding (2021) generated an 8 × 8 S-box using chaotic bent functions.
The discussed S-boxes achieved strong cryptographic properties, as analyzed using the performance criteria. However, the maximum DP value of these S-boxes is 0.03906. Moreover, several methodologies have been proposed that utilize mathematical transformations, such as the linear fractional transform combined with the symmetric group, elliptic curves, coset diagrams, etc., and these also have a DP value of 0.03906 (Siddiqui, Naseer & Ehatisham-ul-Haq, 2021; Nizam Chew & Ismail, 2020; Beg et al., 2020; Zahid, Arshad & Ahmad, 2019; Farwa, Shah & Idrees, 2016; Hussain et al., 2013a, 2013b; Ahmad et al., 2020; Hayat, Azam & Asif, 2018; Khan, Ahmed & Saleem, 2019; Aboytes-González et al., 2018; Hussain et al., 2018; Siddiqui et al., 2020). However, few methodologies have generated an S-box with a differential probability of 0.156 (Aboytes-González et al., 2018; Siddiqui et al., 2020; Nizam Chew & Ismail, 2020; Ahmad et al., 2020; Cui & Cao, 2007; Tran, Bui & Duong, 2008). Additionally, recent research on chaos-based S-boxes (Özkaynak, 2020) shows that an S-box based on the mixing property of a chaotic map has high differential uniformity and nonlinearity. It was observed that existing hybrid S-box methodologies improve the nonlinearity property by combining chaos with optimization or heuristics; however, the differential uniformity of these S-boxes is still high. In these methods, the nonlinearity property is used in the fitness function as the improvement criterion, and the heuristic and optimization-based techniques form an added layer on top of chaotic mapping to achieve highly nonlinear S-boxes. The nonlinearity of an S-box reflects its resistance against linear cryptanalysis. Chaos-based S-boxes have better LP than algebraic S-boxes. Despite the high differential uniformity of chaos-based S-boxes, the DP property cannot be considered an improvement criterion. The DP property is, however, a good criterion for systematic S-box design.
Input/output difference information is required to understand the confusion component for systematic design; such a component, together with strong diffusion and key-mixing components, makes differential attacks such as chosen-plaintext/ciphertext attacks infeasible. A recent and notable contribution on chaos-based S-boxes that uses the mixing property of a chaotic map and the DDT within the design loop to improve the DP value is given in Khan et al. (2018) and Khan, Jeoti & Manzoor (2012). It is still challenging to improve the DP value of a chaos-based S-box. It is hypothesized that systematic methodologies, that is, designs that use knowledge of cryptographic attacks and cryptographic properties as tools within the design methodology, are required to generate an S-box with a strong structure. For example, chaos-based S-boxes have a higher DP property value than algebraic S-boxes. The observations of this study are as follows:
1. An S-box is a nonlinear component in an encryption algorithm that provides confusion.
2. An S-box provides uncertainty that obscures the relationship between plaintext and ciphertext, and a strong encryption algorithm makes chosen-plaintext/ciphertext attacks infeasible.
3. The low DP value of an S-box indicates high dispersion among Δy.
4. A strong S-box must have an upper bound of cryptographic performance criteria.
5. Chaos-based S-boxes have poor cryptographic criteria compared to algebraic S-boxes.
6. The cryptographic criterion of DP has remained high in chaos-based S-boxes.
7. A systematic chaos-based S-box with a solid structure, designed with a good understanding of cryptanalytic attacks, may lead to a strong S-box with improved cryptographic performance criteria, especially differential uniformity.

Problem statement
An S-box based on mathematical transformations has near-optimal cryptographic performance criteria compared to a chaos-based S-box. However, a chaos-based S-box can have better immunity against various side-channel attacks (Özkaynak, 2020). Designing a chaos-based S-box with performance criteria comparable to those of an algebraic S-box is still challenging. Further, it was established that the mapping technique (continuum to integer) used to produce an S-box structure is more important than the properties of the chaotic system (Artuğer & Özkaynak, 2020). Therefore, a solid S-box structure with improved performance criteria can be designed with a good understanding of cryptanalytic attacks, such as linear and differential attacks (Kocarev, 2001).

Contributions
The contribution of this paper is the use of the dispersion property as a new tool to design an S-box. This section explains the use of the dispersion property within the design loop to achieve the results, and mainly focuses on presenting the research hypothesis, which is later proved in the results section. The design of an S-box is critically essential to resist all known attacks, especially differential cryptanalysis. A cryptosystem whose S-box has a high DP property value may be prone to chosen-plaintext attacks, which use plaintext-ciphertext pairs to mount differential cryptanalysis with the aim of recovering information without knowledge of the key. With the help of the cryptosystem's S-box, an attacker tabulates the pairs (Δx, Δy) in the DDT and finds the DP using Eq. (1). The attacker looks for the pairs with the maximum count in the DDT to measure the differential uniformity of the given S-box. The dispersion property is employed as a tool to design the proposed S-box, as an added layer within the design loop.
For a given n-bit S-box $S: \{0,1\}^n \to \{0,1\}^n$, the dispersion property computes all pairs $(\Delta S_x, \Delta S_y)$, where $\Delta S_x$ is the input spread and $\Delta S_y$ is the output spread. Similar to the DDT, these pairs are tabulated in a dispersion matrix (DM). The total number of dispersion pairs and the number of pairs that recur in DM are used to measure the normalized dispersion, a value between 0 and 1. The normalized dispersion is computed from $d_{total}(\pi)$, the total pair count in DM; $d_R(\pi)$, the total count of recurring pairs in DM; and T, the total number of S-box positions. A normalized value of 0 stands for no dispersion, and a value of 1 entails high dispersion among $\Delta S_y$. Further, a normalized value close to 1 shows that the S-box substituted the input sequence with high randomness, which entails efficient decorrelation among the substituted sequence; hence, the S-box exhibits high nonlinearity. A normalized value of 1 requires $d_R(\pi) = 0$, that is, all pairs in DM must be distinct. The occurrence of recurring pairs is due to the relative positions of elements in the auxiliary table of an S-box. It can be hypothesized that systematic selection and positioning of elements in the S-box may control $d_R(\pi)$ in DM.
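The following Python sketch illustrates how recurring spread pairs might be counted. It is a sketch under stated assumptions, not the exact formula of this work: the spread pair is taken as (j − i, π(j) − π(i)) for positions i < j, a recurring pair is any pair beyond the first occurrence of the same (ΔS_x, ΔS_y), and the value is normalized by the total pair count T(T − 1)/2. The names `dispersion_pairs` and `normalized_dispersion` are hypothetical.

```python
from collections import Counter


def dispersion_pairs(perm):
    """Count each spread pair (ds_x, ds_y) = (j - i, perm[j] - perm[i]) for i < j.

    Assumption: integer subtraction is used for both spreads.
    """
    counts = Counter()
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            counts[(j - i, perm[j] - perm[i])] += 1
    return counts


def normalized_dispersion(perm):
    """Fraction of spread pairs that are not repeats of an earlier pair (assumed form)."""
    counts = dispersion_pairs(perm)
    d_total = sum(counts.values())   # T(T-1)/2 pairs in total
    d_recur = d_total - len(counts)  # pairs repeating an already-seen (ds_x, ds_y)
    return (d_total - d_recur) / d_total


print(normalized_dispersion(list(range(8))))  # identity: many recurring pairs
print(normalized_dispersion([0, 2, 1]))       # all three pairs are distinct
```

Under these assumptions the value equals 1 exactly when no pair recurs, which matches the condition $d_R(\pi) = 0$ stated above.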
In the light of the discussion in previous sections, this article attempts to design a systematic S-box using the dispersion property within the design loop. The proposed methodology works iteratively in layers. The discretized PWLCM generates the initial S-box positions that fill the S-box table. The dispersion property is then used as an added layer that systematically decides the relative position of each S-box element in the S-box table. The proposed method is an incremental design approach: starting with an initial pool of S-box positions, the DM is dynamically generated using Eq. (3). New S-box positions are approved after checking the recurrence of pairs in DM. Because the S-box is generated dynamically and systematically, design conditions are proposed under which the DM is dynamically generated and the relative locations of S-box positions are chosen. The recurrence of pairs in DM is closely monitored, and positions that entail a high recurrence of pairs in DM are regenerated before being placed in the S-box table. The high-level flow diagram is given in Fig. 1. The ergodic and mixing behavior inherent in chaos generates all S-box positions in a reasonable time. In the added layer of DM generation, before a position is confirmed as a final S-box position, all of its pairs are added to DM and checked for recurrence. The time complexity TC of this added layer is closely approximated as $O(2^{2n}) < TC < O(2^{3n})$, where n is the cardinality of the proposed S-box. On the other hand, the choice of chaotic map is also critical in cryptosystem design. In this work, a multi-dimensional PWLCM is chosen to generate the initial trajectories. A multi-dimensional map is crucial in resisting key-related attacks in secure chaotic communication systems (Liu, Xiang & Liu, 2020). Initially, a random number generator (RNG) design is proposed using the PWLCM. The random numbers generated using the PWLCM are cryptographically secure and are statistically analyzed using the National Institute of Standards and Technology (NIST) criteria.
This paper is organized as follows: "Random Number Generation Using PWLCM" performs a randomness test of the PWLCM; "Materials and Methods" and "Dispersion Matrix Generation" present the proposed methodology. With the understanding that the DDT is used in differential cryptanalysis to find weaknesses in the S-box structure, this research proposes a systematic S-box design. "Proposed Systematic S-box Application in Image Encryption" evaluates the performance of the proposed S-boxes. "Boomerang Connectivity Table" and "Feistel counterpart of BCT (FBCT)" analyze the BCT and FBCT of the proposed S-box. The performance criteria of the chaos-based S-box are not optimal compared to algebraic S-boxes. However, the S-box's differential uniformity is certainly improved compared to recently proposed S-boxes.

Random number generation using PWLCM
The randomness of the PWLCM is evaluated using the NIST-800-22 statistical test suite, which includes 15 different types of tests. A bitstream from the random bitstream pool must pass all of these tests to be accepted as a secure key for encryption. A bitstream of one million bits is required for the NIST-800-22 statistical tests. The PWLCM equation is defined as:

$$x_{n+1} = \begin{cases} \dfrac{x_n}{p}, & 0 \le x_n < p \\ \dfrac{x_n - p}{0.5 - p}, & p \le x_n < 0.5 \\ \dfrac{1 - p - x_n}{0.5 - p}, & 0.5 < x_n < 1 - p \end{cases}$$

where $x_0 \in [0, 1)$ is the initial value and $p \in (0, 0.5)$ is the control factor. Any arbitrarily chosen initial condition can be used. It is well established that the randomness of the numbers produced by an RNG directly affects the security of encryption applications, so it is of crucial importance. A bitstream selected as an encryption key should therefore have a uniform probability distribution of 1's and 0's, meaning that the numbers of 1's and 0's in the bitstream should be equal or nearly equal. A PWLCM generates floating-point numbers in the range [0, 1), so PWLCM trajectories can supply an unlimited number of real values in this range. A suitable threshold value is set on the continuous-valued output of the RNG. Bearing in mind that the output range of the RNG values is [0, 1), this paper chooses the median value as the threshold, i.e., $s = 0.5$. The steps for generating a random bitstream using the proposed RNG are as follows:
Step 1: The initial condition ($x_0 = 0.78$) and parameter ($p = 0.16$) are provided as input to the PWLCM for generating random floating-point numbers in the range [0, 1).
Step 2: The PWLCM is iterated $10^6$ times to generate 1 million random floating-point values.
Step 3: Thresholding is applied to the floating-point values obtained after step 2 to generate a random bitstream of 0's and 1's. Each floating-point value $x_i$ (where $1 \le i \le 10^6$) is mapped to either '0' or '1' according to the following criterion: if $x_i \ge s$, it is mapped to bit '1'; otherwise, it is mapped to bit '0'. In this way, a bitstream of 1 million bits is generated using the proposed RNG.
Step 4: In the last phase, NIST tests are applied to the bitstream obtained in step 3 to assess the bitstream's randomness. The test results are evaluated based on a calculated test statistic value, i.e., P-value, which is a function of the data. The P-value reveals the strength of the randomness of a bit sequence. A P-value of 1 means the sequence is entirely random, whereas a P-value of 0 indicates entirely non-random. For each test, if P-value obtained is greater than or equal to the significance level 'α,' the test is considered successful. The significance level lies in the range [0.001-0.01]. We used the default parameters for all tests to test our proposed RNG using the NIST test. The value of α was chosen equal to 0.01, which means that for a test to be successful, the P-value obtained must be greater than or equal to 0.01. The random bit stream obtained from the proposed RNG using PWLCM passed all NIST tests presented in Table 1.
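Steps 1 to 4 can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the map is completed on [1 − p, 1) by the usual symmetric fold x → 1 − x (an assumption, since that branch is not written out above), and only the frequency (monobit) test of NIST SP 800-22 is shown in place of the full 15-test suite.

```python
import math


def pwlcm(x, p):
    """One PWLCM iteration.

    Assumption: for x >= 0.5 the map is evaluated at 1 - x (symmetric form),
    which reproduces the third branch above and also covers [1 - p, 1).
    """
    if x >= 0.5:
        x = 1.0 - x
    return x / p if x < p else (x - p) / (0.5 - p)


def pwlcm_bits(x0=0.78, p=0.16, count=10**6, threshold=0.5):
    """Steps 1-3: iterate the map and threshold each value to a bit."""
    bits, x = [], x0
    for _ in range(count):
        x = pwlcm(x, p)
        bits.append(1 if x >= threshold else 0)
    return bits


def monobit_p_value(bits):
    """NIST SP 800-22 frequency (monobit) test: P = erfc(|S_n| / sqrt(2n))."""
    s_n = sum(2 * b - 1 for b in bits)  # map 0/1 to -1/+1 and sum
    return math.erfc(abs(s_n) / math.sqrt(2 * len(bits)))


bits = pwlcm_bits(count=100_000)   # shorter stream than 10^6, for a quick check
p_value = monobit_p_value(bits)
print(p_value, p_value >= 0.01)    # compare against significance level alpha = 0.01
```

A full assessment would pass the complete one-million-bit stream to all tests of the suite; the monobit test alone only checks the balance of 0's and 1's.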

Dispersion matrix generation
Generally, the dispersion property is a post-processing technique used to measure the randomness in a sequence. The proposed novel methodology uses the dispersion property within the design loop for systematic S-box design. The dispersion property can be defined as follows.
Definition 2: The dispersion measures the irregularity in the output spread $\Delta S_y$ for a given input spread $\Delta S_x$. For a given substitution π, the list of dispersion pairs of π is defined in terms of the pairs $(\Delta S_x, \Delta S_y)$, where $\Delta S_x$ and $\Delta S_y$ are the input and output spread. As described in the contribution subsection, counting $d_{total}(\pi)$ and $d_R(\pi)$ yields the normalized dispersion. Utilizing the dispersion property within the design loop under the proposed design conditions requires an understanding of how the DM is computed. Figure 2 shows a three-column vector consisting of the input information, the S-box used to substitute the input information, and the substituted input information, respectively.
The dispersion matrix is filled with the spread pairs $(\Delta S_x, \Delta S_y)$. The input spread is measured using the input differential with spread variable $C \in (0, 255)$. Figure 2 shows the process of measuring $\Delta S_x$ and $\Delta S_y$. Further, Table 2 demonstrates the process of selecting the input spread using the input spread variable to measure $\Delta S_y$. In Table 2, the C = 0 column entails 0 and is hence not considered herein. Finally, Table 3 shows the dispersion matrix. The DM is quite straightforward compared to the DDT, which requires a complete S-box for its generation. It is further hypothesized that improving the recurrence of pairs in DM may improve the count of output differences in the DDT. Therefore, the proposed method systematically substitutes S-box elements with a low DP value, which seems impossible using a typical chaos-based algorithm (Özkaynak, 2020).
Table 2: Selection of input differentials using input spread variable C.
Steps to design proposed S-box
1. Variable initialization: The first step is to initialize the variables used during the proposed design, such as the initial condition of the map $x_n$, the final position of the map $x_{n+1}$, and the position vector PV that stores the final S-box.

2. S-box position mapping: The behavior of the generated chaotic trajectories is vetted using the Lyapunov exponent. The nonlinear behavior of the chaotic map is preserved when it is mapped to the decimal domain. The domain in the range [0.1, 0.9] is divided into 256 equal intervals, and the intervals are sequentially labeled with the position counter PC. In doing so, the generated S-box positions acquire the nonlinear behavior of the chaotic trajectories.

3. Chaotic trajectories decimal mapping: The PWLCM can be iterated from an arbitrarily chosen initial seed $x_n$; we use $x_n = 0.346$ to generate the proposed S-box. Each resulting value $x_{n+1}$ is checked to find the interval of [0.1, 0.9] in which it falls, and the associated interval (subdomain) number PC is marked if that interval has not yet been visited. This PC is an S-box element and is stored in the position vector. The S-box's bijective property is assured by ignoring any output value that falls in an already visited subdomain, that is, one whose PC is already stored in PV, which ensures that only distinct positions are generated. The chaotic decimal mapping yields an initial S-box.

4. Systematic byte substitution using dispersion matrix: This step substitutes the weak S-box positions that degrade the performance parameters of the final S-box. The inherent structure of chaotic trajectories habitually includes these weak positions as part of the S-box. Therefore, this work proposes a dispersion-matrix-based systematic byte substitution method to generate a near-optimal S-box. The flow graph is presented in Fig. 3. The dispersion matrix is generated within the loop of the proposed S-box design by tabulating the output differential $\Delta S_y$ between the PV[PC]-th and PV[PC − Δx[i]]-th entries in the dispersion matrix. Because of the dynamic nature of the S-box design, the dispersion matrix is filled column-wise until each row has a distinct output differential. "Dispersion Matrix Generation" of the proposed methodology details the generation of the dispersion matrix. The design conditions are:
1. If the output difference is repeated in any column of the dispersion matrix, the corresponding S-box position is ignored and regenerated.
2. All output differences of the S-box are tabulated in the dispersion matrix for all given input differences.
3. Regeneration of S-box positions due to repeated output differences is attempted within an arbitrarily chosen time budget; once it is exhausted, repetition is allowed so that the S-box is generated in a reasonable amount of time.
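The design steps and conditions above can be sketched in Python as follows. This is an illustrative sketch under several labeled assumptions rather than the authors' implementation: the control parameter p = 0.16 is borrowed from the RNG section, the map is completed by the symmetric fold x → 1 − x, spreads are taken as integer differences (PC − i, PV[PC] − PV[i]), and the "arbitrarily chosen time budget" is modeled by a hypothetical `retries` count per position. All function names are hypothetical.

```python
def pwlcm(x, p):
    """One PWLCM step; folding x >= 0.5 onto 1 - x is an assumed symmetric form."""
    if x >= 0.5:
        x = 1.0 - x
    return x / p if x < p else (x - p) / (0.5 - p)


def chaotic_positions(x0, p, size, lo=0.1, hi=0.9, max_iter=10**7):
    """Yield the interval number (PC) of every orbit point that falls in [lo, hi)."""
    width = (hi - lo) / size
    x = x0
    for _ in range(max_iter):
        x = pwlcm(x, p)
        if lo <= x < hi:
            yield min(int((x - lo) / width), size - 1)
    raise RuntimeError("iteration budget exhausted before the S-box was complete")


def initial_sbox(x0=0.346, p=0.16, size=256):
    """Step 3: keep each interval number the first time the orbit visits it."""
    pv, used = [], set()
    for pc in chaotic_positions(x0, p, size):
        if pc not in used:
            used.add(pc)
            pv.append(pc)
            if len(pv) == size:
                return pv


def systematic_sbox(x0=0.346, p=0.16, size=256, retries=32):
    """Step 4: accept a candidate only if none of its spread pairs recurs in the DM."""
    pv, used, dm = [], set(), set()
    tried = 0
    for cand in chaotic_positions(x0, p, size):
        if cand in used:
            continue  # keeps the S-box bijective
        tried += 1
        pc = len(pv)
        pairs = {(pc - i, cand - pv[i]) for i in range(pc)}
        if pairs & dm and tried < retries:
            continue  # condition 1: a pair recurs, so regenerate this position
        pv.append(cand)  # no recurrence, or budget spent (condition 3)
        used.add(cand)
        dm |= pairs      # condition 2: tabulate all pairs of the accepted position
        tried = 0
        if len(pv) == size:
            return pv


sbox = systematic_sbox()
print(sbox[:16])
```

The sketch only demonstrates the control flow (generate, check recurrence, regenerate or accept); it makes no claim to reproduce the S-box or the cryptographic scores reported in this paper.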

RESULTS
This section evaluates the cryptographic properties of the proposed systematic S-box and presents the results in detail. The performance of the proposed S-boxes is tested and evaluated based on the following parameters: bijection, nonlinearity (NL) (Meier & Staffelbach, 1990), strict avalanche criterion (SAC) (Webster & Tavares, 1986), bit independence criterion (BIC) (Webster & Tavares, 1986; Farwa, Shah & Idrees, 2016), maximum expected linear and differential probability (Heys, 2002; Matsui, 1996; Hong et al., 2000), and boomerang differential probability (Wagner, 1999; Cid et al., 2018). Numerous researchers have presented tools for verifying an S-box (Wang et al., 2009; Özkaynak, 2019; Picek et al., 2014). The numerical results for the proposed S-box given in Table 4 are presented in the results and discussion sections and were verified using the S-box tool. Furthermore, these results are compared with existing chaos-based S-boxes, algebraic S-boxes, and other recently proposed methodologies. The following sections briefly explain the S-box testing parameters and discuss the numerical results obtained for the proposed S-box.

Bijective
The bijection test evaluates the uniqueness of the output of an S-box. If an S-box fulfills the bijection criterion, its output values are unique and non-repeating in the interval $[0, 2^n - 1]$, and there is a one-to-one mapping between each input and output value. The proposed S-box satisfies the bijection test: each S-box produces unique output values in the interval [0, 255], and there is a one-to-one mapping between every input and output.

Nonlinearity
The nonlinearity (NL) test measures the smallest Hamming distance of the reference function from all the affine functions (Meier & Staffelbach, 1990; Webster & Tavares, 1986; Farwa, Shah & Idrees, 2016). It represents the number of bits that must be altered in the truth table of a Boolean function to reach the nearest affine function. Mathematically, the nonlinearity of a Boolean function g is defined as follows:

$$N_g = 2^{n-1}\left(1 - 2^{-n} \max_{w \in GF(2^n)} \left| S_{(g)}(w) \right|\right)$$

where $S_{(g)}(w)$ represents the Walsh spectrum, which is defined as:

$$S_{(g)}(w) = \sum_{x \in GF(2^n)} (-1)^{g(x) \oplus x \cdot w}$$

The maximum possible nonlinearity value in $GF(2^n)$ is $N = 2^{n-1} - 2^{\frac{n}{2}-1}$ (Nyberg, 1991). Hence, for n = 8, the maximum achievable nonlinearity is 120. The nonlinearity values achieved for the proposed S-boxes with different initial conditions are given in Table 5. The proposed S-box provides a minimum and maximum nonlinearity of 100 and 108, respectively. The average nonlinearity achieved with the proposed S-box is between 103.5 and 105.5, which is considered good nonlinearity.
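The nonlinearity computation can be illustrated with the following Python sketch, which evaluates the Walsh spectrum with a fast Walsh-Hadamard transform. The function names are hypothetical, and the 4-bit PRESENT S-box is used only as a small example input; it is assumed that nonlinearity is reported per output bit (coordinate function), with the minimum over all nonzero output masks given as an additional figure.

```python
def walsh_spectrum(truth_table):
    """Walsh spectrum of a Boolean function via the fast Walsh-Hadamard transform."""
    w = [1 - 2 * bit for bit in truth_table]  # (-1)^g(x)
    h = 1
    while h < len(w):
        for i in range(0, len(w), 2 * h):
            for j in range(i, i + h):
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return w


def nonlinearity(truth_table):
    """N_g = 2^(n-1) - max|S_g(w)| / 2, equivalent to the definition above."""
    return len(truth_table) // 2 - max(abs(v) for v in walsh_spectrum(truth_table)) // 2


def component(sbox, mask):
    """Truth table of the Boolean function x -> parity(S(x) & mask)."""
    return [bin(y & mask).count("1") & 1 for y in sbox]


def coordinate_nonlinearities(sbox):
    """Nonlinearity of each output bit of the S-box."""
    n = len(sbox).bit_length() - 1
    return [nonlinearity(component(sbox, 1 << j)) for j in range(n)]


def sbox_nonlinearity(sbox):
    """Minimum nonlinearity over all nonzero output masks."""
    return min(nonlinearity(component(sbox, m)) for m in range(1, len(sbox)))


PRESENT_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
                0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
print(coordinate_nonlinearities(PRESENT_SBOX), sbox_nonlinearity(PRESENT_SBOX))
```

For an 8 × 8 S-box the same functions apply unchanged to a 256-entry table.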

Strict avalanche criterion
The strict avalanche criterion (SAC) (Webster & Tavares, 1986) measures how many output bits of a function change when a single input bit is altered. If a function satisfies the SAC, each output bit should change with a probability of one-half whenever a single input bit is complemented. In other words, changing a single input bit should change about one-half of the output bits. For an S-box to be ideal, the SAC value should be equal to 0.5. The SAC and SAC offset values of the S-box generated with the proposed scheme (Table 6) show an average SAC value approximately equal to 0.5. Additionally, the SAC values obtained are comparable to those of existing S-boxes, which shows that the proposed S-boxes satisfy the SAC test.
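A minimal Python sketch of the SAC dependency matrix is given below. The function names `sac_matrix` and `average_sac` are hypothetical, and the sketch assumes the S-box length is a power of two; entry (i, j) estimates the probability that output bit j flips when input bit i is complemented.

```python
def sac_matrix(sbox):
    """SAC dependency matrix: m[i][j] = P(output bit j flips | input bit i flipped)."""
    size = len(sbox)
    n = size.bit_length() - 1
    counts = [[0] * n for _ in range(n)]
    for i in range(n):
        for x in range(size):
            diff = sbox[x] ^ sbox[x ^ (1 << i)]  # avalanche vector for input bit i
            for j in range(n):
                counts[i][j] += (diff >> j) & 1
    return [[c / size for c in row] for row in counts]


def average_sac(sbox):
    """Mean of all entries of the SAC matrix (ideal value: 0.5)."""
    m = sac_matrix(sbox)
    return sum(sum(row) for row in m) / (len(m) * len(m))


# The identity mapping is a worst case: flipping input bit i flips only output bit i.
print(sac_matrix(list(range(4))))
```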

Bit independence criteria
The output bit independence criterion (BIC) is a crucial property for any cryptographic system and was introduced by Webster & Tavares (1986) to analyze the behavior of bit patterns at the output. To investigate the BIC, a single plaintext bit is altered, and the output binary vectors are analyzed for independence. To satisfy the BIC, all avalanche variables must be pair-wise independent for a given set of avalanche vectors generated by complementing a single plaintext bit. The correlation between an input-output pair measures the amount of independence among all avalanche pairs. For two variables A and B, the correlation is expressed in mathematical form as follows:

$$\rho\{A, B\} = \frac{\mathrm{cov}\{A, B\}}{\sigma\{A\}\,\sigma\{B\}}$$

where $\rho\{A, B\}$ and $\mathrm{cov}\{A, B\}$ are the correlation coefficient and covariance of A and B, respectively, and $\sigma\{\cdot\}$ denotes the standard deviation. The proposed S-boxes each achieve an average BIC value of 108, which is equal to or better than that of most existing S-boxes, as shown in Tables 7 and 8. Thus, the proposed S-box successfully fulfills the BIC.

Linear approximation probability
The linear approximation probability (LP) measures the maximum imbalance between input and output bits (Aboytes-González et al., 2018). Mathematically, the linear approximation probability of an S-box is defined as:

$$LP = \max_{\Gamma_x, \Gamma_y \neq 0} \left| \frac{\#\{x \in X : x \cdot \Gamma_x = S(x) \cdot \Gamma_y\}}{2^n} - \frac{1}{2} \right|$$

where $\Gamma_x$ and $\Gamma_y$ are the input and output masks, respectively, X is the set of all possible input values, and $2^n$ is the number of S-box elements. The LP value of the proposed S-box follows from its linear approximation table (LAT), given in Table 9. Further, the histogram of the LAT of the proposed S-box is given in Fig. 4. As a result, the S-boxes generated using the proposed method are resilient to linear cryptanalysis.
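The LP definition can be checked numerically with the following brute-force Python sketch. The function names are hypothetical, and the 4-bit PRESENT S-box serves only as a small example input; for an 8 × 8 S-box the triple loop is slow in pure Python but still feasible.

```python
def parity(value):
    """Parity of the set bits of value, i.e. the dot product over GF(2)."""
    return bin(value).count("1") & 1


def max_lp(sbox):
    """LP = max over nonzero masks of |#{x : x.a == S(x).b} / 2^n - 1/2|."""
    size = len(sbox)
    worst = 0
    for a in range(1, size):      # input mask
        for b in range(1, size):  # output mask
            matches = sum(parity(x & a) == parity(sbox[x] & b) for x in range(size))
            worst = max(worst, abs(matches - size // 2))
    return worst / size


PRESENT_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
                0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
print(max_lp(PRESENT_SBOX))
```

The quantity `matches - size // 2` is the LAT entry for the mask pair (a, b), so the same loop can be used to tabulate the full LAT if needed.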

Differential approximation probability
The differential approximation probability (DP) exhibits the differential uniformity of an S-box (Aboytes-González et al., 2018; Hong et al., 2000), which is mathematically defined as

$$DP = \max_{\Delta x \neq 0,\, \Delta y} \left( \frac{\#\{x \in X \mid S(x) \oplus S(x \oplus \Delta x) = \Delta y\}}{2^n} \right), \tag{8}$$

where $\Delta x$ and $\Delta y$ are the input and output differentials, respectively, $X$ is the set of possible input values, and $2^n$ is the number of S-box elements. An S-box with lower differential uniformity is considered cryptographically more secure. This research aims to propose a systematic S-box methodology that improves the differential uniformity. The DDT of the proposed S-box is shown in Table 10. The proposed S-box has a differential uniformity of 8 and a maximum DP value of 8/256 = 0.03125. The obtained maximum DP value of the proposed S-box is compared with existing related S-box methodologies and tabulated in Table S2. For a chaos-based S-box, the maximum DP value of 0.03125 is considered near-optimal compared to most existing S-boxes. The frequency of occurrence of $\Delta y$ in the DDT is shown in Fig. 5. Further, the histogram of the DDT of the proposed S-box is given in Fig. 6. It shows that the proposed S-box improves the occurrence of $\Delta y$ in the DDT and that 98% of the $\Delta y$ occur with the probability of 0.234. Hence, the results support the hypothesis that controlling $d_R(p)$ under the given design conditions, through systematically chosen S-box positions, improves the occurrence of $\Delta y$ in the DDT. Therefore, it is concluded that the proposed scheme ably generates S-boxes that are core security components in encryption algorithms and provide strong security to resist cryptanalytic attacks.

Table 9: LAT of proposed S-box.
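The DDT and the quantities derived from it can be computed as in the following minimal sketch. The function names and the seeded random stand-in S-box are assumptions for illustration; the stand-in is not the proposed S-box, so its differential uniformity will generally differ from the value of 8 reported above.

```python
import random

def ddt(sbox, n=8):
    """table[dx][dy] = #{x : S(x) XOR S(x XOR dx) == dy}."""
    size = 1 << n
    table = [[0] * size for _ in range(size)]
    for dx in range(size):
        for x in range(size):
            table[dx][sbox[x] ^ sbox[x ^ dx]] += 1
    return table

def differential_uniformity(table):
    """Largest DDT entry over all nonzero input differences."""
    return max(max(row) for row in table[1:])

# Stand-in S-box (assumption): a seeded random permutation of 0..255.
demo_sbox = random.Random(0).sample(range(256), 256)
table = ddt(demo_sbox)
du = differential_uniformity(table)
dp = du / 256  # maximum DP, as in Eq. (8)
print(f"differential uniformity = {du}, maximum DP = {dp:.5f}")
```

With this convention, a differential uniformity of 8 corresponds to a maximum DP of 8/256 = 0.03125.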

Correlation analysis: sensitivity among S-boxes
To study the randomness among S-boxes, the correlation coefficient is measured. It quantifies the similarity among S-boxes generated with a slight change in the initial condition. The correlation coefficient, ρ, between two S-boxes $S_i$ and $S_j$ is measured as

$$\rho = \frac{\sum_{k=1}^{N} \left(S_{i,k} - E(S_i)\right)\left(S_{j,k} - E(S_j)\right)}{\sqrt{\sigma(S_i)\,\sigma(S_j)}},$$

where $E(S_i) = \frac{1}{N}\sum_{k=1}^{N} S_{i,k}$, $\sigma(S_i) = \sum_{k=1}^{N} \left(S_{i,k} - E(S_i)\right)^2$, $S_{i,k}$ denotes the k-th element of $S_i$, and $N = 2^n$, where n is the size of the S-box. For the analysis, the initial condition is changed at the 4th decimal digit, and 500 S-boxes are generated. Figure 7 shows the correlation among the proposed S-boxes. The x-axis shows the number of inputs, the y-axis shows the number of S-boxes, and the z-axis shows the values of the correlation coefficients. The achieved correlation coefficients range from −0.2139 to 0.2667. It is evident from Fig. 7 that the proposed S-boxes have very low correlation coefficient values. The correlation of an S-box with itself is 1, as shown by the diagonal bar in Fig. 7. The differential uniformity of all generated S-boxes is measured and plotted in Fig. 8. These results support the hypothesis that the method retains good DP values and that the inherent design technique results in highly uncorrelated S-boxes. Therefore, the proposed S-box method is highly suitable for designing key-based S-boxes.
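The pairwise correlation among a family of S-boxes can be sketched as below. This illustration only shows the correlation computation: the S-boxes here are seeded random permutations used as stand-ins (an assumption), whereas the analysis above generates them by perturbing the initial condition of the chaotic map. The function name `sbox_correlation` is likewise hypothetical.

```python
import random

def sbox_correlation(s_i, s_j):
    """Correlation coefficient between two S-boxes viewed as sequences."""
    n_items = len(s_i)
    mean_i = sum(s_i) / n_items
    mean_j = sum(s_j) / n_items
    numerator = sum((a - mean_i) * (b - mean_j) for a, b in zip(s_i, s_j))
    spread_i = sum((a - mean_i) ** 2 for a in s_i)
    spread_j = sum((b - mean_j) ** 2 for b in s_j)
    return numerator / (spread_i * spread_j) ** 0.5

# Stand-in family (assumption): five seeded random permutations of 0..255.
rng = random.Random(2024)
demo_sboxes = [rng.sample(range(256), 256) for _ in range(5)]
matrix = [[sbox_correlation(a, b) for b in demo_sboxes] for a in demo_sboxes]

for row in matrix:
    print(" ".join(f"{value:+.4f}" for value in row))
```

The diagonal of the printed matrix is 1 (each S-box compared with itself), while small off-diagonal values indicate weakly correlated S-boxes.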

Boomerang connectivity table (BCT)
The boomerang attack, proposed by Wagner (1999), is a popular cryptanalytic technique used to analyze the security of a block cipher. The boomerang connectivity table (BCT), proposed by Cid et al. (2018), is an efficient and simple method that accurately measures the connection probability for a boomerang-style attack. Like the DDT, the BCT provides useful information for analyzing an S-box for a cryptosystem. Therefore, the strength of an S-box as a confusion component can be measured using the BCT. For a given input difference $\Delta_i$ and output difference $\nabla_o$, the BCT counts, over all values of the input x, how often the boomerang returns with difference $\Delta_i$. The BCT is computed for all pairs $(\Delta_i, \nabla_o)$ using the following equation,

$$BCT(\Delta_i, \nabla_o) = \#\{x \in \{0,1\}^n \mid S^{-1}(S(x) \oplus \nabla_o) \oplus S^{-1}(S(x \oplus \Delta_i) \oplus \nabla_o) = \Delta_i\},$$

where $S^{-1}$ is the inverse of the S-box, and $\Delta_i$ and $\nabla_o$ are the input and output differences, respectively. Dividing an entry by $2^n$ gives the corresponding boomerang probability of $\Delta_i$. There is a deep relationship between the BCT and the DDT (Cid et al., 2018). Each entry of the BCT is greater than or equal to the corresponding entry of the DDT, with the proportion given in Song, Qin & Hu (2019). The BCT of the proposed S-box is given in Table 11. The histogram of the BCT of size 256 × 256 of our proposed S-box is given in Fig. 9. The entries in the first row and first column of the BCT are all 256. For a better illustration of the internal structure of the BCT, Fig. 9 does not include the first row and first column. The frequency of each entry in the BCT and DDT of our proposed S-box is summarized in Table 12. Due to the inherent generation structure of the BCT, the differential uniformity in the BCT is 16 with 21 entries. In comparison, the differential uniformity in the DDT is 8 with 20 entries. The number of BCT and DDT entries of the proposed S-box can be visualized in Fig. 10.
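The BCT definition of Cid et al. (2018) can be evaluated by direct enumeration, as in the sketch below. It is a straightforward cubic-time illustration rather than an optimized routine: the function names are hypothetical, and a 4-bit seeded random permutation is used as a stand-in (an assumption) so the triple loop runs quickly. The DDT is computed alongside to show the entrywise relation between the two tables.

```python
import random

def ddt(sbox, n):
    """table[dx][dy] = #{x : S(x) XOR S(x XOR dx) == dy}."""
    size = 1 << n
    table = [[0] * size for _ in range(size)]
    for dx in range(size):
        for x in range(size):
            table[dx][sbox[x] ^ sbox[x ^ dx]] += 1
    return table

def bct(sbox, n):
    """table[di][no] = #{x : S^-1(S(x)^no) ^ S^-1(S(x^di)^no) == di}."""
    size = 1 << n
    inverse = [0] * size
    for x, y in enumerate(sbox):
        inverse[y] = x
    table = [[0] * size for _ in range(size)]
    for di in range(size):
        for no in range(size):
            table[di][no] = sum(
                inverse[sbox[x] ^ no] ^ inverse[sbox[x ^ di] ^ no] == di
                for x in range(size)
            )
    return table

# Stand-in S-box (assumption): a seeded random 4-bit permutation.
toy_sbox = random.Random(1).sample(range(16), 16)
toy_bct = bct(toy_sbox, 4)
toy_ddt = ddt(toy_sbox, 4)

# Boomerang uniformity: largest entry outside the first row and column.
boomerang_uniformity = max(max(row[1:]) for row in toy_bct[1:])
print("boomerang uniformity of the toy S-box:", boomerang_uniformity)
```

For an n-bit S-box, the first row and first column of the table equal 2^n, which matches the value of 256 noted above for the 8 × 8 case.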
A detailed analysis is provided in Cid et al. (2018) of the desired BCT differential uniformity of 4 × 4 and 8 × 8 S-boxes to resist the boomerang attack, and later Boura & Canteaut (2018) provided the best possible differential uniformity of the BCT for 4 × 4 S-boxes. However, the best possible differential uniformity of the BCT for 8 × 8 S-boxes is still an open problem. A related extension of the BCT for ciphers following the Feistel construction was proposed by Boukerrou et al. (2020). For a given S-box, the Feistel counterpart of the BCT (FBCT) is defined as:

$$FBCT(\Delta_i, \nabla_o) = \#\{x \in \{0,1\}^n \mid S(x) \oplus S(x \oplus \Delta_i) \oplus S(x \oplus \nabla_o) \oplus S(x \oplus \Delta_i \oplus \nabla_o) = 0\}. \tag{12}$$

Given $(\Delta_i, \nabla_o)$, the number of values of x for which (12) holds, and hence the corresponding probability, is computed and tabulated in the FBCT. Some direct properties of the FBCT are: (1) Symmetry: for all $0 \le \Delta_i, \nabla_o \le 2^n - 1$, $FBCT(\Delta_i, \nabla_o) = FBCT(\nabla_o, \Delta_i)$; (2) Fixed values: for all $0 \le \Delta_i, \nabla_o \le 2^n - 1$, $FBCT(\Delta_i, 0) = FBCT(0, \nabla_o) = FBCT(\Delta_i, \Delta_i) = 2^n$. For the proposed S-box, see Figs. 11 and 12. The FBCT entries in the first row, first column, and diagonal are $2^n$. The entries of $2^n$ on the diagonal of the FBCT are called the Feistel switch. The F-boomerang uniformity ($\beta_F$), the highest value in the FBCT ignoring the first row, first column, and diagonal, satisfies $\beta_F \ge 4$. The F-boomerang uniformity of the proposed S-box is $\beta_F = 12$. In the FBCT, the number of entries of each value of 256, 12, 8, 4, and 0 is 766, 51,990, 11,274, 1,392, and 114, respectively.

Table 12: The number of entries for each value in DDT and BCT of the proposed S-box and AES S-box.
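A direct enumeration of the FBCT and of the F-boomerang uniformity can be sketched as follows. The function names and the 4-bit seeded random stand-in S-box are assumptions chosen to keep the example small; the code follows the four-term XOR condition of Boukerrou et al. (2020) and checks nothing specific to the proposed S-box.

```python
import random

def fbct(sbox, n):
    """table[di][no] = #{x : S(x)^S(x^di)^S(x^no)^S(x^di^no) == 0}."""
    size = 1 << n
    table = [[0] * size for _ in range(size)]
    for di in range(size):
        for no in range(size):
            table[di][no] = sum(
                sbox[x] ^ sbox[x ^ di] ^ sbox[x ^ no] ^ sbox[x ^ di ^ no] == 0
                for x in range(size)
            )
    return table

def f_boomerang_uniformity(table):
    """Largest entry ignoring the first row, first column, and diagonal."""
    size = len(table)
    return max(
        table[di][no]
        for di in range(1, size)
        for no in range(1, size)
        if di != no
    )

# Stand-in S-box (assumption): a seeded random 4-bit permutation.
toy_sbox = random.Random(1).sample(range(16), 16)
toy_fbct = fbct(toy_sbox, 4)
beta_f = f_boomerang_uniformity(toy_fbct)
print("F-boomerang uniformity of the toy S-box:", beta_f)
```

The table produced this way is symmetric, and its first row, first column, and diagonal all equal 2^n, which is why those positions are excluded when the uniformity is taken.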

Proposed systematic S-box application in image encryption
The suitability of the proposed S-box is evaluated through an application in image encryption. Image encryption, which measures the strength and robustness of the proposed S-box, is performed using majority logic criteria (MLC) (Hussain et al., 2012; Shah et al., 2011). It is presented here only to showcase the capability of the proposed S-box, not as a complete cipher. We used a standard gray-level San Diego aerial image of size 512 × 512 as plaintext to perform the substitution (Weber, 1981). This image can be used freely for research purposes. The image was substituted using the proposed S-box and the AES S-box individually. Each pixel value of the image is replaced with the corresponding value in the S-box. The ciphertext is the scrambled image that hides the visual information contained in the plaintext. We performed a single round of image substitution and then carried out statistical analyses on the plain and encrypted images, namely histogram, entropy, energy, correlation, contrast, and homogeneity analysis. It can be observed from Table 13 that the proposed systematic S-box efficiently disperses the correlated pixels, which provides effective image substitution. The results show that the parameters are largely comparable to those of the AES S-box. The entropy value obtained using the proposed systematic S-box is 7.4060, near the ideal value of 8. The entropy value indicates the randomness in an image. Hence, the proposed S-box is designed to provide near-optimal decorrelation between input and output elements in the image, amplifying randomness. The energy value of the plain image is 0.0780. When the substitution is applied to the plain image, we achieved an energy value of 0.0161, the same as the AES S-box energy value. The achieved energy value is small, which indicates efficient image encryption performance of the proposed S-box. The correlation shows the linear dependence between the plain and encrypted images.
A coefficient value of approximately 0 indicates no or weak correlation between images. The correlation value of the proposed S-box is 0.0398, close to 0 and comparable with that of the AES S-box. The proposed S-box enhances the spread and dispersion among input and output pixels and thus results in a weak correlation among pixel values. Further, the proposed S-box enhances the modern encryption properties of confusion and diffusion. The contrast value of the proposed S-box is 9.9895. A constant image has a contrast value of 0, and a high contrast value indicates randomness in the image. Due to the systematic nonlinear mapping of the proposed S-box, objects in the plain image are dispersed completely. Therefore, we achieved a high contrast value in the encrypted image, which indicates strong encryption. The homogeneity parameter measures the closeness of the distribution of the elements of the gray-level co-occurrence matrix (GLCM) to the GLCM diagonal. The achieved homogeneity results using the proposed S-box and the AES S-box are comparable and show strong encryption. Using majority logic, the image substitution analysis shows that the results of the proposed S-box are comparable to the state-of-the-art results in Table 13.
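The substitution step and some of the statistics discussed above can be sketched as follows. This is a minimal illustration under several assumptions: a small synthetic gradient image replaces the San Diego image, a seeded random permutation replaces the proposed S-box, and the GLCM is built from horizontally adjacent pixels over all 256 gray levels. The settings used to produce Table 13 may differ, so the numbers printed here are not comparable with it.

```python
import math
import random
from collections import Counter

def substitute(image, sbox):
    """Replace every pixel value with the corresponding S-box value."""
    return [[sbox[pixel] for pixel in row] for row in image]

def entropy(image):
    """Shannon entropy (bits per pixel) of the gray-level histogram."""
    counts = Counter(pixel for row in image for pixel in row)
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def glcm_features(image):
    """Contrast, energy, and homogeneity of a normalized GLCM built from
    horizontally adjacent pixel pairs (assumed offset)."""
    pairs = Counter(
        (row[c], row[c + 1]) for row in image for c in range(len(row) - 1)
    )
    total = sum(pairs.values())
    contrast = energy = homogeneity = 0.0
    for (a, b), count in pairs.items():
        p = count / total
        contrast += p * (a - b) ** 2
        energy += p * p
        homogeneity += p / (1 + abs(a - b))
    return contrast, energy, homogeneity

# Synthetic plain image (assumption): a smooth 64 x 64 diagonal gradient.
plain = [[(r + c) % 256 for c in range(64)] for r in range(64)]
# Stand-in S-box (assumption): a seeded random permutation of 0..255.
demo_sbox = random.Random(0).sample(range(256), 256)
cipher = substitute(plain, demo_sbox)

for name, img in (("plain", plain), ("substituted", cipher)):
    contrast, energy, homogeneity = glcm_features(img)
    print(f"{name}: entropy={entropy(img):.4f} contrast={contrast:.2f} "
          f"energy={energy:.4f} homogeneity={homogeneity:.4f}")
```

A constant image gives a contrast of 0 with this definition, consistent with the remark above, while the substituted gradient shows a much higher contrast than the smooth plain image.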
A visual demonstration of the plain image (Fig. 13) substituted using the proposed and AES S-boxes is also shown. Fig. 14 shows the histogram of the plain image. The images substituted using the proposed and AES S-boxes are shown in Figs. 15 and 16, respectively. It is evident from the figures that the proposed S-box hides all visual information contained in the image.

DISCUSSION
We proposed a systematic S-box to achieve near-optimal cryptographic properties of bijectivity, nonlinearity, SAC, BIC, DP, and LP. The generated S-box is given in Table 1. The design approach is to reverse engineer the attack scenario of differential cryptanalysis, which exploits a high DP value to mount an attack. Accordingly, the dispersion property, which measures the randomness among output differentials, is used within the loop to generate S-box positions systematically. It is hypothesized that high dispersion among the S-box's output differentials entails improved differential uniformity. The proposed S-box achieved a maximum DP value of 0.03125, which reflects the maximum value in the DDT (Hussain et al., 2018; Siddiqui et al., 2020). Thus, as a confusion component in the cryptosystem, the proposed systematic S-box entails strong resistance against differential cryptanalysis. The achieved cryptographic properties of our proposed S-box are summarized in Table 14. This article also considered a relatively new cryptanalysis technique known as the boomerang attack. The BCT is generated to individually analyze the differential characteristics of cryptosystem components. The BCT provides more versatility than the DDT and finds the nonzero pairs $(\Delta_i, \nabla_o)$ using the switching technique where $DDT(\Delta_i, \Delta_o) = 0$. The differential uniformity of our proposed S-box is 16 in the BCT. Further, the FBCT of the proposed S-box is also generated. The FBCT is a variant of the BCT for cryptosystems employing the Feistel structure. The differential uniformity in the FBCT (the F-boomerang uniformity) of the proposed S-box is 12. Chaotic maps inherently provide nonlinear trajectories in the real domain. Therefore, an efficient mapping of the chaotic domain retains the nonlinear behavior.
Our proposed S-box has a maximum and average nonlinearity of 108 and 106, respectively, which is well above the required bound and comparable to existing chaos-based S-boxes given in Siddiqui, Naseer & Ehatisham-ul-Haq (2021), Hussain et al. (2018), and Siddiqui et al. (2020). Highly nonlinear mappings in encryption algorithms are considered vital to resist linear cryptanalysis. The proposed S-box method of dispersion-based randomization inadvertently achieves an excellent linear probability value of 0.1028, better than most existing methods (Aboytes-González et al., 2018; Siddiqui et al., 2020; Nizam Chew & Ismail, 2020; Ahmad et al., 2020; Cui & Cao, 2007; Tran, Bui & Duong, 2008). Cryptographically secure vectorial Boolean functions have avalanche and propagation properties for unpredictable substitutions. The average SAC value of the proposed S-box is 0.5123, which is very close to the ideal value of 0.5. Thus, we can say that our proposed S-box is highly nonlinear and behaves in an unpredictable manner in any symmetric encryption algorithm. The proposed S-box generates a balanced output, which can be validated using the bijective property.
Further, the cryptographically secure pseudo-random numbers (PRN) generated by the proposed nonlinear real-to-decimal mapping were tested using the NIST-approved test suite. All of the tests in the NIST suite were passed, and the P-values were well within the accepted range (0.01 < P-value < 1.00). The frequency and block frequency tests of the NIST suite also validate the balance properties. We also employed our S-box in an image encryption algorithm and performed various statistical tests to investigate the performance and suitability of the proposed S-box in image encryption applications. The proposed S-box shows excellent statistical results for entropy, energy, correlation, contrast, and homogeneity.

CONCLUSIONS
A novel method to generate a near-optimal S-box is proposed. A chaotic multilevel map is employed for the initial chaotic trajectories. The given PWLCM generates cryptographically secure PRN, vetted through the NIST test suite. Under the given design conditions, the dispersion matrix is systematically employed within the proposed design loop. The proposed design criteria efficiently substitute weak S-box positions to obtain a robust S-box structure and near-optimal results. The proposed S-box also exhibits high dispersion in its design, which is critical to achieving the notion of confusion. The proposed S-boxes were evaluated based on expanded S-box design criteria and were comparable to recently published state-of-the-art S-box designs in the field. Our results demonstrate that the proposed S-box has excellent cryptographic properties. The nonlinearity value is in the range of 100 to 108, and the differential uniformity is 8. A systematic and robust methodology for chaos-based S-boxes is required to achieve a DP in the range of 4 to 10.
The strength of our proposed S-box was also tested against the newer boomerang cryptanalysis. To this end, the BCT and FBCT of the proposed S-box were generated to find the maximum BCT/FBCT differential probability. The proposed S-box had a maximum BCT and FBCT differential probability of 0.0625 and 0.0468, respectively. The BCT/FBCT analysis provides a new insight into designing and analyzing S-boxes for cryptosystems. Our proposed S-box shows an upper-bound LP value of 0.1028. It is evident from the results presented in this paper that our S-box achieves cryptographic properties close to their upper bounds. To validate the suitability of the proposed S-box as a confusion component in image encryption algorithms, substitution-based statistical tests of entropy, energy, correlation, contrast, and homogeneity were performed, achieving values of 7.358, 0.016, 0.033, 10.11, and 0.406, respectively. Our S-box shows excellent performance in these tests and is suitable for image encryption applications.
In the future, this work can be extended to design key-based S-boxes. Such S-boxes would be based on chaotic parameters and generated dynamically in each round of encryption to obtain a more secure cryptosystem. Further, the differential uniformity of the BCT/FBCT of chaos-based S-boxes will be analyzed to study the resistance against the boomerang attack. Furthermore, the applications of the proposed S-boxes in image encryption and watermarking can be investigated.

ADDITIONAL INFORMATION AND DECLARATIONS

Funding
This work was supported by Universiti Tun Abdul Rehman. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Grant Disclosures
The following grant information was disclosed by the authors: Universiti Tun Abdul Rehman.
Muhammad Asif Khan conceived and designed the experiments, performed the experiments, analyzed the data, prepared figures and/or tables, authored or reviewed drafts of the paper, and approved the final draft. Ramesh Kumar Ayyasamy performed the experiments, analyzed the data, prepared figures and/or tables, performed the NIST tool analysis, and approved the final draft. Muhammad Wasif conceived and designed the experiments, analyzed the data, prepared figures and/or tables, authored or reviewed drafts of the paper, and approved the final draft.

Data Availability
The following information was supplied regarding data availability: The source code files are available in the Supplemental Files. The raw data is the standard image of San Diego 2.1.02.tiff, available from: https://sipi.usc.edu/database/database.php?volume=aerials.