On the Cryptanalysis of a Latin Cubes-Based Image Cryptosystem

Based on orthogonal Latin cubes, an image cryptosystem with confusion–diffusion–confusion cipher architecture has been proposed recently (Inf. Sci. 2019, 478, 1–14). However, we find that there are four fatal vulnerabilities in this image cryptosystem, which leave open doors for cryptanalysis. In this paper, we propose a reference-validation inference algorithm and design screening-based rules to efficiently break the image cryptosystem. Compared with an existing cryptanalysis algorithm, the proposed method requires fewer pairs of chosen plain-cipher images, and behaves stably since different keys, positions of chosen bits and contents of plain images will not affect the cryptanalysis performance. Experimental results show that our cryptanalysis algorithm only requires $\sqrt[3]{8 \times H \times W} + 3$ pairs of chosen plain-cipher images, where H×W represents the image’s resolution. Comparative studies demonstrate the effectiveness and superiority of the proposed cryptanalysis algorithm.


Introduction
One image is worth more than ten thousand words. How to protect image contents from illegal accesses has become a crucial security issue for the practical applications such as virtual meeting, video surveillance or telemedicine, especially when we have entered the era of big data. Cryptography is a cornerstone in the field of information security. Conventional data encryption techniques, e.g., DES (data encryption standard), AES (advanced encryption standard) and IDEA (international data encryption algorithm), are inappropriate for image encryption applications because there are high redundancies and strong correlation among adjacent pixels [1,2]. Permutation-and-diffusion cipher architecture, which alternately shuffles pixel positions and changes pixel values, has the capability to reduce the redundancy and the correlation, thereby playing a central role in many image encryption algorithms [3][4][5][6].
Naturally, an image can be encoded as a three-dimensional (3D) bit matrix, in which bits are the smallest elements representing digital information. Some methods [7][8][9][10][11][12][13][14] have extended the permutation-and-diffusion cipher architecture to 3D version, in which the bit-level permutation not only shuffles the bit positions but also changes the pixel values at the same time. Zhu et al. [7] employed Arnold cat map for bit-level permutation and Logistic map for diffusion. Zhang et al. [8] invented a collision-free random bidirectional visiting mechanism by coupling Chen chaotic system with Arnold cat map, and developed a new hybrid 3D permutation rule. In the image cryptosystem [9], multiple chaotic maps were used to control the bit-level row/column-wise permutation. Cai et al. [10] proposed a plaintext-related random-access mechanism that scrambles the 3D bit matrix based on a mixture of three chaotic maps. Gan et al. [11] obtained a random mapping sequence from Chen chaotic system, and designed a tailor-made multilevel quantizer for

Review of Target Image Cryptosystem
In this section, we briefly introduce the TIC (target image cryptosystem) [13]. Overall, it consists of three encryption phases based on Latin cubes, namely pre-permutation, diffusion, and post-permutation.
For better readability, we use bold uppercase symbols (e.g., P) and bold lowercase symbols (e.g., p) to represent cubes and sequences, respectively. Let P(x, y, z) denote the element of P at the coordinate (x, y, z). Let p(n) denote the nth element of p. Non-bold italic font (e.g., n and N) denotes scalars, and Greek letters (e.g., μ) stand for the keys of an image cryptosystem. Bold calligraphic capital letters (e.g., $\mathcal{N}$) are used to represent sets. Let $\mathbb{Z}$ represent the ring of integers.
Consider an 8-bit plain image with H × W resolution, where the total number of bits is 8 × H × W. Let N denote the side length of a bit-cube, whose value equals $\sqrt[3]{8 \times H \times W}$. For simplicity, Xu and Tian [13] assume that the image sizes H and W are appropriate values to ensure that the side length N is an integer. Reshape the plain image into the plain bit-cube, denoted by P.
In the TIC [13], the Logistic map is adopted to generate a chaotic sequence, denoted by r = [r(0), r(1), ..., r(N − 1)]. The Logistic map is defined as
$$r(n) = \mu \cdot r(n-1) \cdot \bigl(1 - r(n-1)\bigr), \tag{1}$$
where n = 0, 1, ..., N − 1, and κ = r(−1) is the initial condition. In Equation (1), μ is the system parameter. When its value lies in the interval (3.573815, 4], the system exhibits chaotic properties, including ergodicity and high sensitivity to the initial conditions [13]. Figure 1 shows diagrams of bifurcation, Lyapunov exponent, and information entropy of the Logistic map. More details can be found in [23].
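As an illustration only, the generation of the chaotic sequence can be sketched in Python as follows. The function name `logistic_sequence` and the sample values of κ and μ are assumptions made for this sketch; they are not taken from the TIC [13].

```python
def logistic_sequence(kappa, mu, length):
    """Iterate the Logistic map, starting from the initial condition r(-1) = kappa.

    Returns [r(0), r(1), ..., r(length - 1)].
    """
    r = []
    prev = kappa
    for _ in range(length):
        prev = mu * prev * (1.0 - prev)  # one Logistic-map step
        r.append(prev)
    return r


# Hypothetical key values chosen only for demonstration.
r = logistic_sequence(kappa=0.3, mu=3.99, length=8)
```

Every element stays inside the open interval (0, 1) as long as κ lies in (0, 1) and μ does not exceed 4.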
Figure 1. Diagrams of the Logistic map: bifurcation, Lyapunov exponent, and (c) information entropy. Note that when calculating the information entropy, we simply discretize r(n) through the formula $\lfloor r(n) \cdot 10 \rfloor$.
The chaotic sequence r is first sorted in ascending order, and then the sorted result is used to form a random mapping sequence, denoted by s = [s(0), s(1), ..., s(N − 1)]. Specifically, s represents the random mapping relations between r and its sorted counterpart. Clearly, s(n) is an integer lying in the interval [0, N − 1]. By using the random mapping sequence s, three orthogonal Latin cubes, denoted by $L_1$, $L_2$ and $L_3$, are generated according to Equation (2), where $L_1(x, y, z)$, $L_2(x, y, z)$ and $L_3(x, y, z)$ are the elements of $L_1$, $L_2$ and $L_3$ at the coordinate (x, y, z), respectively. In Equation (2), the addition operator "+" and the multiplication operator "×" are both defined on the ring $\mathbb{Z}/N\mathbb{Z}$. Note that, in the TIC [13], N is fixed to 128 so that the addition and multiplication operators are originally defined on the finite field $GF(2^7)$. However, it is too strict to consider a constant side length. In a practical image cryptosystem, N may not be a power of a prime number. Therefore, in this paper, we define the two operators on the ring $\mathbb{Z}/N\mathbb{Z}$ rather than on a finite field.
The three control parameters, namely α, β, and γ, in Equation (2), must be different nonzero numbers within the ring $\mathbb{Z}/N\mathbb{Z}$. This is a necessary and sufficient condition for the three Latin cubes to be mutually orthogonal. The orthogonality property means that each triple $(L_1(x, y, z), L_2(x, y, z), L_3(x, y, z))$ appears only once after traversing all possible ones, where x, y, and z take values from [0, N − 1]. The orthogonality property ensures that all the mapping relations between (x, y, z) and $(L_1(x, y, z), L_2(x, y, z), L_3(x, y, z))$ are one-to-one correspondences without collision, so that the orthogonal Latin cubes can be directly used for permuting a bit-cube. It is worth clarifying here that the initial condition κ, the system parameter μ, and the three control parameters α, β, γ collectively serve as keys in the TIC [13].
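The following Python sketch illustrates the two ideas above: deriving a mapping sequence from a sorted chaotic sequence, and checking the one-to-one property of three cubes. Two points are assumptions rather than facts from the source: the convention that s(n) is the index in r of the nth smallest value, and the quadratic form k²·s(x) + k·s(y) + s(z) used in the hypothetical helper `latin_triple`, which is chosen only because it is consistent with the properties stated in this paper. A small prime side length N = 5 is used so that the arithmetic is over a field.

```python
from itertools import product


def mapping_sequence(r):
    # Assumed convention: s(n) is the position in r of the n-th smallest value.
    return sorted(range(len(r)), key=lambda i: r[i])


def latin_triple(s, x, y, z, keys, N):
    # Assumed quadratic form k^2*s(x) + k*s(y) + s(z) (mod N), one value per key.
    return tuple((k * k * s[x] + k * s[y] + s[z]) % N for k in keys)


N = 5
r = [0.62, 0.11, 0.95, 0.47, 0.30]   # toy stand-in for a chaotic sequence
s = mapping_sequence(r)              # a permutation of 0..N-1
keys = (1, 2, 3)                     # three different nonzero control parameters

# Collect every triple; orthogonality means no triple is produced twice.
triples = {latin_triple(s, x, y, z, keys, N)
           for x, y, z in product(range(N), repeat=3)}
is_one_to_one = len(triples) == N ** 3
```

Under the assumed form, the coordinates with s(x) = s(y) = 0 yield three identical cube values equal to s(z), which is the special case exploited later in the cryptanalysis.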
The diffusion phase, which is executed after the pre-permutation phase, aims to flip a part of the bits in the pre-permutated bit-cube U under the control of a random bit sequence. In the TIC [13], the random bit sequence is extracted from $L_1$ through the binarization in Equation (4), in which N/2 is the threshold. The binarized bit-cube is denoted by B = [B(x, y, z) | x, y, z = 0, 1, ..., N − 1]. Reshape U and B into two one-dimensional (1D) bit sequences according to the same scanning order. Here, the 3D-to-1D coordinate transformation can be explicitly formulated as
$$n = x \times N^2 + y \times N + z. \tag{5}$$
The two resulting 1D bit sequences are denoted by u and b, respectively, and are combined through the diffusion rule in Equation (6), where $v = [v(0), \cdots, v(n), \cdots, v(N^3 - 1)]$ is the diffused bit sequence, and v(−1) is initialized to 0. In Equation (6), the sign "⊕" represents the bit-wise exclusive or (XOR) operator.
In the last post-permutation phase, a CPV (cipher-parity value), denoted by t, is first defined by Equation (7). Then, reshape the 1D diffused bit sequence v into a diffused bit-cube V. Here, the 1D-to-3D coordinate transformation for each bit position can be explicitly expressed as
$$x = \lfloor n / N^2 \rfloor, \quad y = \lfloor (n \,\%\, N^2) / N \rfloor, \quad z = n \,\%\, N, \tag{8}$$
where the floor sign "$\lfloor \cdot \rfloor$" rounds the enclosed number down to the nearest integer, while "%" represents the remainder operator. The three Latin cubes are reused in Equation (9) to permute the diffused bit-cube V. In Equation (9), C = [C(x, y, z) | x, y, z = 0, 1, ..., N − 1] represents the resulting bit-cube of the post-permutation phase. Finally, reshape C into an 8-bit cipher image with H × W resolution. Decryption is composed of the inverse encryption operations, which are organized in reverse order.
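To make the diffusion behavior concrete, a minimal Python sketch is given below. It rests on three assumptions that are not confirmed by the text above: the chaining rule v(n) = u(n) ⊕ b(n) ⊕ v(n − 1), a CPV equal to the XOR of all diffused bits, and a scanning order in which z varies fastest. The helper names are hypothetical.

```python
from functools import reduce
from operator import xor


def diffuse(u, b):
    # Assumed chaining rule: v(n) = u(n) XOR b(n) XOR v(n-1), with v(-1) = 0.
    v, prev = [], 0
    for un, bn in zip(u, b):
        prev = un ^ bn ^ prev
        v.append(prev)
    return v


def cpv(v):
    # Assumed CPV: parity of the number of 1s in the diffused sequence.
    return reduce(xor, v, 0)


def to_1d(x, y, z, N):
    # 3D-to-1D transformation with z as the fastest-varying coordinate.
    return x * N * N + y * N + z


def to_3d(n, N):
    # Inverse 1D-to-3D transformation using floor division and remainder.
    return n // (N * N), (n % (N * N)) // N, n % N


u = [1, 0, 1, 1, 0, 0, 1, 0]
b = [0, 1, 1, 0, 1, 0, 0, 1]
v = diffuse(u, b)

u_mod = u.copy()
u_mod[3] ^= 1                      # modify a single input bit
v_mod = diffuse(u_mod, b)
flipped = [i for i, (p, q) in enumerate(zip(v, v_mod)) if p != q]
```

In this toy run, the single modified bit flips every diffused bit from its own position to the end of the sequence, which is the propagation pattern that the cryptanalysis in the next section relies on.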

Cryptanalysis
In this section, we first summarize the existing vulnerabilities of the TIC [13] from four aspects. Then, inspired by the chosen-plaintext attack, we propose an efficient reference-validation inference algorithm to break the random mapping sequence, and design screening-based rules to determine the keys of the orthogonal Latin cubes. In total, our cryptanalysis algorithm requires only $\sqrt[3]{8 \times H \times W} + 3$ pairs of chosen plain-cipher images to break the TIC [13], where H and W represent the image's height and width, respectively.

Vulnerability Analysis
Although Xu and Tian [13] conducted various security tests, there still exist four fatal vulnerabilities in their image cryptosystem. First, the process of generating the chaotic sequence r is independent of the plain image, so that an attacker can arbitrarily fabricate desired plain images without influencing r. Second, the diffusion phase, as shown in Equation (6), fundamentally inherits from the traditional CBC mode. This means that modifying one bit in the plain image may only affect a part of the diffused bits. In such circumstances, an attacker can mine useful information, for example the number of unchanged bits, to infer the random mapping sequence s. Third, the initial value of the diffusion phase is set to v(−1) = 0 without introducing any cryptographic mechanism. This flaw somewhat eases the cryptanalysis task. Fourth, the post-permutation phase, as shown in Equation (9), fails to conceal the statistical information of V. This enables an attacker to bypass the post-permutation phase and to calculate the equivalent CPV directly from the cipher images.
Kerckhoffs's principle, which lies at the core of cryptanalysis, sets forth that the security of a cryptosystem relies only on the keys, rather than on the details of the encryption/decryption algorithm. In other words, the encryption/decryption details, e.g., the coordinate transformations between 1D and 3D as shown in Equations (5) and (8), are all open knowledge. In summary, the task of breaking the TIC [13] is equivalent to reconstructing the key-based information, including the random mapping sequence s (which is controlled by the keys κ and μ), and the three orthogonal Latin cubes $L_1$, $L_2$, and $L_3$ (which are controlled by the keys α, β, and γ).
Hereafter, we act as an attacker and use the existing vulnerabilities to break the TIC [13]. In Section 3.2, we ascertain what controls the CPV and propose a reference-validation inference algorithm to reconstruct s. In Section 3.3, we turn to design screening-based rules to determine the keys α, β, and γ.

Simplify the Pre-Permutation Phase
As discussed in [22], some special coordinates in the plain bit-cube P can be used to simplify the pre-permutation phase. When substituting s(x) = s(y) = 0 into Equation (2), we have $L_1(x, y, z) = L_2(x, y, z) = L_3(x, y, z) = s(z)$. Then, the pre-permutation phase, as shown in Equations (2) and (3), can be rewritten as
$$U(x, y, z) = P(s(z), s(z), s(z)), \tag{10}$$
in which s(z) must be an integer lying in the interval [0, N − 1]. Here, we introduce a new notation $s^{-1}(\cdot)$ to represent the inverse of $s(\cdot)$, and further reformulate Equation (10), giving
$$U(s^{-1}(0), s^{-1}(0), s^{-1}(n)) = P(n, n, n), \tag{11}$$
where the coordinate (n, n, n) is located at the diagonal of P. In other words, as long as we visit the diagonal bits of P, the original pre-permutation phase can be simplified to Equation (11). This paves the way for inferring the mapping relations between $s^{-1}(n)$ and n. Once these relations are known, reconstructing s is trivial since $s^{-1}(\cdot)$ is a bijective mapping.
To this end, the first step is to choose the plain images, which can simplify the pre-permutation phase according to Equation (11). We create N plain bit-cubes by modifying the diagonal bits of P in turn. This procedure is described as follows
$$P_n(x, y, z) = \begin{cases} P(x, y, z) \oplus 1, & x = y = z = n, \\ P(x, y, z), & \text{otherwise}, \end{cases} \tag{12}$$
where n = 0, 1, ..., N − 1. The notation $P_n = [P_n(x, y, z) \mid x, y, z = 0, 1, \cdots, N - 1]$ represents the nth created plain bit-cube. Clearly, the only different bit between P and $P_n$ is located at the coordinate (n, n, n). As discussed above, modifying bit values in P will not influence the chaotic behavior of the Logistic map, so that all these plain bit-cubes, namely P and $P_n$, share the same s. This provides us with the opportunity to break the random mapping sequence s by the chosen-plaintext attack.
For convenience of presentation, the intermediate encryption results of $P_n$ carry the same index n. For example, the symbols $U_n$, $V_n$, $C_n$ correspond to the pre-permutated, the diffused and the post-permutated version of $P_n$, respectively. Feeding $P_n$ into the TIC [13], we can obtain the post-permutated bit-cube $C_n$. The new CPV, denoted by $t_n$, can be directly computed from $C_n$ even without knowing $V_n$ because the post-permutation phase does not affect any bit values. According to whether $t_n = t$ or $t_n \neq t$, the N pairs of chosen plain-cipher images can be divided into CPV-preserving and CPV-changing groups. The two groups will be tackled through different strategies, as will be presented below.

CPV-Preserving Group
When $t_n = t$, the bit-cubes C and $C_n$ have undergone the same post-permutation phase, as shown in Equation (9), so that we can directly apply the bit-wise XOR operator to them. Calculate a differential bit-cube $D_n = C \oplus C_n$, and count the total number of 1s in $D_n$. Let $d_n$ be the counting result. Since the diffusion phase, as shown in Equation (6), belongs to the weak CBC mode, a modified bit in $u_n$ will affect the diffused sequence $v_n$ starting from the landmark position and propagating the influence of the modification forward to the end, as illustrated in Figure 2.
Here, the landmark position, denoted by $lp_n$, plays a dual role. On one hand, it corresponds to the flipped bit in $u_n$ and to the modified bit in $P_n$ at the coordinate (n, n, n), as indicated by the dotted arrow in Figure 2. Hence, the landmark position $lp_n$ can be represented by
$$lp_n = N^2 \times s^{-1}(0) + N \times s^{-1}(0) + s^{-1}(n). \tag{13}$$
Equation (13) is a 3D-to-1D coordinate transformation, which instantiates Equation (5) by setting $x = y = s^{-1}(0)$ and $z = s^{-1}(n)$. On the other hand, the landmark position can reflect the number of flipped bits in $v_n$, denoted by $h_n$, taking the following form:
$$h_n = N^3 - lp_n, \tag{14}$$
where $N^3$ is the total number of bits. As illustrated in Figure 2, when $t_n = t$ holds, the number of flipped bits $h_n$ can be exposed by counting the number of 1s in $D_n$, namely that the equation $h_n = d_n$ holds. Substitute for $lp_n$ in Equation (13) using Equation (14) and apply the 1D-to-3D coordinate transformation, giving
$$s^{-1}(n) = (N^3 - h_n) \,\%\, N. \tag{15}$$
Equation (15) establishes the relationship between $s^{-1}(n)$ and $h_n$. Based on this relationship, we can readily determine the mapping relations $s^{-1}(\cdot)$ due to the one-to-one correspondence between $h_n$ and n.
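For the CPV-preserving group, the recovery of $s^{-1}(n)$ from a counted difference can be sketched as follows. The sketch assumes that landmark positions are counted from 0 and that the scanning order places z as the fastest-varying coordinate; the helper names and the toy numbers (N = 4, $d_n$ = 24) are hypothetical.

```python
def landmark_from_count(d_n, N):
    # h_n = d_n in the CPV-preserving case, and lp_n = N^3 - h_n.
    return N ** 3 - d_n


def to_3d(n, N):
    # 1D-to-3D transformation: floor division and remainder.
    return n // (N * N), (n % (N * N)) // N, n % N


N = 4
d_n = 24                              # number of 1s counted in D_n (toy value)
lp_n = landmark_from_count(d_n, N)
x, y, z = to_3d(lp_n, N)              # x = y = s^{-1}(0), z = s^{-1}(n)
```

In this toy instance the first two coordinates coincide, as expected for a modified diagonal bit, and the last coordinate is the sought value of $s^{-1}(n)$.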

What Controls the CPV
It is worth ascertaining what controls the CPV and how many pairs of chosen plain-cipher images fall into the CPV-preserving and CPV-changing groups, respectively. The following three propositions comprehensively assert that the last term in Equation (13), namely $s^{-1}(n)$, controls the CPV.

Proposition 1. If the number of flipped bits, namely $h_n$, is even, then $t_n = t$. Otherwise $t_n \neq t$.

Proof. Clearly, the flipped bits in $v_n$, as marked by red ink in Figure 2, are the sources of changing the CPV. Hence, in this proof, we only focus on the flipped part that consists of $h_n$ bits. Suppose that $h_n = h_n^{(1 \to 0)} + h_n^{(0 \to 1)}$, where $h_n^{(1 \to 0)}$ (and $h_n^{(0 \to 1)}$) denotes the number of 1s (and 0s) being flipped to 0 (and 1) caused by the one-bit modification at the landmark position.
If $h_n$ is even, we will encounter one of the following two cases: (1) both $h_n^{(1 \to 0)}$ and $h_n^{(0 \to 1)}$ are even; (2) both $h_n^{(1 \to 0)}$ and $h_n^{(0 \to 1)}$ are odd. In both cases, $h_n^{(1 \to 0)}$ and $h_n^{(0 \to 1)}$ share the same parity, so that flipping $h_n$ bits will not change the CPV.
On the contrary, if $h_n$ is odd, the two cases become: (1) $h_n^{(1 \to 0)}$ is odd while $h_n^{(0 \to 1)}$ is even; (2) $h_n^{(1 \to 0)}$ is even while $h_n^{(0 \to 1)}$ is odd. In both cases, $h_n^{(1 \to 0)}$ and $h_n^{(0 \to 1)}$ possess opposite parity. This means that the number of 1s in the to-be-flipped part of v will be changed from an odd integer to an even integer for case (1), and from an even integer to an odd integer for case (2). Consequently, if $h_n$ is odd, $t_n$ is necessarily opposite to t. □

Proposition 2. If the landmark position $lp_n$ is even, then $t_n = t$. Otherwise $t_n \neq t$.

Proof. Equation (14) establishes the relationship between $h_n$ and $lp_n$, in which $N^3 = W \times H \times 8$ is even because it is a multiple of 8. Due to this evenness, Equation (14) forces $h_n$ and $lp_n$ to share the same parity. Consequently, the landmark position $lp_n$, just like $h_n$, controls the CPV. □

Proposition 3. If $s^{-1}(n)$ is even, then $t_n = t$. Otherwise $t_n \neq t$.

Proof. Since $N = 2 \cdot (W \times H)^{1/3}$ and $N^2 = 4 \cdot (W \times H)^{2/3}$ are multiples of even integers, they must be even. Similarly, the first and second terms on the right-hand side of Equation (13) must be even as well because they take N and $N^2$ as multipliers. The remaining two terms, namely $lp_n$ and $s^{-1}(n)$, are therefore forced to share the same parity. Consequently, $s^{-1}(n)$ plays exactly the same role as $lp_n$ and $h_n$ in controlling the CPV. □
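The parity argument behind Proposition 1 can be checked exhaustively on short sequences. The sketch below assumes that the CPV is the parity of the number of 1s in the diffused sequence and that the flipped bits form the tail of the sequence, as described for the landmark position; the function names are hypothetical.

```python
from itertools import product


def parity(bits):
    # Parity of the number of 1s, used here as a stand-in for the CPV.
    return sum(bits) % 2


def flip_tail(bits, h):
    # Flip the last h bits, mimicking a run that starts at the landmark position.
    keep = len(bits) - h
    return bits[:keep] + [bit ^ 1 for bit in bits[keep:]]


def proposition_1_holds(length):
    # Check every bit sequence of the given length and every run length h.
    for candidate in product((0, 1), repeat=length):
        bits = list(candidate)
        for h in range(length + 1):
            changed = parity(flip_tail(bits, h)) != parity(bits)
            if changed != (h % 2 == 1):
                return False
    return True


check = proposition_1_holds(6)
```

The check confirms on all 64 sequences of length 6 that the parity changes exactly when the number of flipped bits is odd.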
Since $s^{-1}(\cdot)$ must be an integer lying in the interval [0, N − 1], half of the CPVs $t_n$ remain unchanged, namely when $s^{-1}(n)$ is even, while the remaining half are opposite to t, namely when $s^{-1}(n)$ is odd. When $t_n = t$, we determine $h_n$ by counting the number of 1s in $D_n$, and then calculate $s^{-1}(n)$ by using Equation (15). When $t_n \neq t$, V and $V_n$ will be permuted in different ways, as shown in Equation (9), so that it is meaningless to calculate the bit-wise XOR of C and $C_n$. To solve this problem, we propose a reference-validation inference algorithm that requires only two additional pairs of chosen plain-cipher images to determine the mapping relations between $s^{-1}(n)$ and n for the CPV-changing group. See details in the next three subsections.
For future convenience, we define two index sets $\mathcal{N}_{=}$ and $\mathcal{N}_{\neq}$ with the same cardinality N/2. The former consists precisely of the indices n for which $t_n = t$ is true. Members of the latter are the indices n for which $t_n \neq t$ is true. Properties of the CPV-preserving and CPV-changing groups are summarized in Table 1 for ease of comparison.

Table 1. Properties of the CPV (cipher-parity value)-preserving and CPV-changing groups.

Group                   CPV            h_n     lp_n    s^{-1}(n)    Size
CPV-preserving group    t_n = t        even    even    even         N/2
CPV-changing group      t_n ≠ t        odd     odd     odd          N/2

Pair of Reference Plain-Cipher Images
When $t_n \neq t$, $D_n = C \oplus C_n$, where $n \in \mathcal{N}_{\neq}$, is no longer informative because the number of 1s in $D_n$ does not reflect the number of flipped bits $h_n$. We shall resort to another means to measure $h_n$. The reference-validation inference algorithm fulfills this need through the following three steps. [step-1]: using a pair of reference plain-cipher images to detect the leftmost and rightmost landmark positions for the CPV-changing group (discussed in this subsection); [step-2]: using a pair of validation plain-cipher images to determine the index $n \in \mathcal{N}_{\neq}$ whose $lp_n$ corresponds to the leftmost landmark position (discussed in Section 3.2.5); [step-3]: using the leftmost landmark position to measure $h_n$, where $n \in \mathcal{N}_{\neq}$ (discussed in Section 3.2.6).
To achieve step-1, we create the reference plain bit-cube, denoted by $P^{ref}$, by simultaneously modifying three diagonal bits of P
$$P^{ref}(x, y, z) = \begin{cases} P(x, y, z) \oplus 1, & x = y = z \in \{n_{left}, n_{mid}, n_{right}\}, \\ P(x, y, z), & \text{otherwise}, \end{cases} \tag{16}$$
where the three indices satisfy two requirements: (i) $n_{left}, n_{right} \in \mathcal{N}_{\neq}$ and $n_{mid} \in \mathcal{N}_{=}$; (ii) the corresponding three landmark positions satisfy the inequality $lp_{n_{left}} < lp_{n_{mid}} < lp_{n_{right}}$.
The requirement (i) implies that $P^{ref}$ provides a bridge between the CPV-preserving and CPV-changing groups. The requirement (ii) ensures that the new CPV, denoted by $t^{ref}$, equals t. The reason for this is presented below. Since $lp_{n_{left}} < lp_{n_{mid}} < lp_{n_{right}}$, the bit modifications will affect $v^{ref}$ starting from $lp_{n_{left}}$ and ending at $lp_{n_{mid}}$. The flipped bits appear once again starting from $lp_{n_{right}}$ until the last bit of $v^{ref}$. As illustrated in Figure 3, the number of flipped bits in $v^{ref}$, denoted by $h^{ref}$, is
$$h^{ref} = \bigl(lp_{n_{mid}} - lp_{n_{left}}\bigr) + \bigl(N^3 - lp_{n_{right}}\bigr), \tag{17}$$
where $N^3$ is an even integer. Moreover, both $lp_{n_{left}}$ and $lp_{n_{right}}$ are odd because the indices $n_{left}$ and $n_{right}$ belong to $\mathcal{N}_{\neq}$. Conversely, the landmark position $lp_{n_{mid}}$ is even since $n_{mid} \in \mathcal{N}_{=}$. As shown in Equation (17), $h^{ref}$ is then the sum of two odd terms and is therefore even, so that $t^{ref} = t$ follows from Proposition 1.
To meet the two requirements, we design the following procedure for selecting the three indices. First, define a new differential bit-cube $D_{m,n} = C_m \oplus C_n$, where both m and n come from $\mathcal{N}_{\neq}$, and we have $t_m = t_n \neq t$. The number of 1s in $D_{m,n}$, denoted by $d_{m,n}$, is informative in providing the distance between $lp_m$ and $lp_n$, giving
$$d_{m,n} = |lp_m - lp_n|. \tag{18}$$
The two indices in $\mathcal{N}_{\neq}$ that maximize this distance, denoted by $m_{\neq}$ and $n_{\neq}$, are then sought
$$(m_{\neq}, n_{\neq}) = \arg\max_{m, n \in \mathcal{N}_{\neq}} d_{m,n}, \tag{19}$$
where we stipulate that $m_{\neq} < n_{\neq}$. Next, traverse all indices $n \in \mathcal{N}_{=}$ and select the one, denoted by $n_{=}$, whose landmark position is the median (Equation (20)). It is worth mentioning that the traversal operations used above are computationally feasible for $\mathcal{N}_{\neq}$ and $\mathcal{N}_{=}$. This is because the cardinality of $\mathcal{N}_{\neq}$ (and $\mathcal{N}_{=}$) equals N/2, which is a negligible number compared with the size of the key space.
Clearly, $n_{mid}$ is set to $n_{=}$. However, there exist two candidate settings for $n_{left}$ and $n_{right}$, which, in this paper, are called the null and alternative hypotheses, respectively. The null hypothesis is that $n_{left} = m_{\neq}$ and $n_{right} = n_{\neq}$. The alternative hypothesis is that $n_{left} = n_{\neq}$ and $n_{right} = m_{\neq}$. Which of these two hypotheses is true will be determined by using the pair of validation plain-cipher images, as will be discussed later. At this moment, we break the tie by taking the null hypothesis as a provisional setting.
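With these preparations, we detect the leftmost and rightmost landmark positions as follows. First, the equality $t^{ref} = t$ enables us to define a reference differential bit-cube $D^{ref}$ by applying the bit-wise XOR operator to C and $C^{ref}$, namely $D^{ref} = C \oplus C^{ref}$. Let $d^{ref}$ denote the number of 1s in $D^{ref}$, which counts the two flipped runs delimited by the three landmark positions
$$d^{ref} = \bigl(lp_{n_{mid}} - lp_{n_{left}}\bigr) + \bigl(N^3 - lp_{n_{right}}\bigr), \tag{21}$$
in which the middle landmark position $lp_{n_{mid}}$ is known, whose value has been calculated in the last subsection. Further reformulate Equation (21) as follows
$$lp_{n_{left}} + lp_{n_{right}} = lp_{n_{mid}} + N^3 - d^{ref}, \tag{22}$$
where all unknowns are gathered on the left-hand side while the terms on the right-hand side are all accessible. Then, the distance between $lp_{n_{right}}$ and $lp_{n_{left}}$ can be calculated according to Equation (18), giving
$$lp_{n_{right}} - lp_{n_{left}} = d_{n_{left}, n_{right}}, \tag{23}$$
where the absolute value sign has been removed because $lp_{n_{right}}$ must be greater than $lp_{n_{left}}$. Combining Equations (22) and (23), we can obtain the solutions for $lp_{n_{left}}$ and $lp_{n_{right}}$.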
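Solving for the two outer landmark positions amounts to a pair of linear equations: a sum and a difference. The Python sketch below assumes that the reference difference counts two flipped runs, one from the left landmark up to the middle one and one from the right landmark to the end of the sequence. The function name and the toy numbers are hypothetical.

```python
def solve_outer_landmarks(d_ref, d_left_right, lp_mid, N):
    """Return (lp_left, lp_right) from the reference and pairwise differences."""
    total = lp_mid + N ** 3 - d_ref      # equals lp_left + lp_right
    gap = d_left_right                   # equals lp_right - lp_left
    lp_right = (total + gap) // 2
    lp_left = (total - gap) // 2
    return lp_left, lp_right


# Toy instance with N = 4: true landmarks 41 < 42 < 43.
N = 4
d_ref = (42 - 41) + (N ** 3 - 43)        # two flipped runs, 22 bits in total
lp_left, lp_right = solve_outer_landmarks(d_ref, d_left_right=2, lp_mid=42, N=N)
```

Because the sum and the gap always have the same parity, the two integer divisions are exact.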

Pair of Validation Plain-Cipher Images
In this subsection, we conduct the hypothesis testing to determine the true values of $n_{left}$ and $n_{right}$ from the two candidates $m_{\neq}$ and $n_{\neq}$. The null hypothesis is that $n_{left} = m_{\neq}$ and $n_{right} = n_{\neq}$, which has been considered as a provisional setting. The alternative hypothesis is that $n_{left} = n_{\neq}$ and $n_{right} = m_{\neq}$. To this end, we create the validation plain bit-cube, denoted by $P^{val}$, by simultaneously modifying three bits at the diagonal of P
$$P^{val}(x, y, z) = \begin{cases} P(x, y, z) \oplus 1, & x = y = z \in \{n_{left}, n_{mid}, n_{right}\}, \\ P(x, y, z), & \text{otherwise}, \end{cases} \tag{24}$$
where $P^{val}(x, y, z)$ is the element of $P^{val}$ at the coordinate (x, y, z). Accordingly, a letter with the "val" superscript, such as $V^{val}$ or $C^{val}$, signifies the intermediate encryption result of $P^{val}$. The indices $n_{left}$ and $n_{mid}$ are set as before, namely that $n_{left} = m_{\neq}$ and $n_{mid} = n_{=}$. However, $n_{right}$ is set by Equation (25). Recall that Equation (19) seeks the two indices $m_{\neq}$ and $n_{\neq}$ that maximize $d_{m,n}$. Here, Equation (25) leaves $m_{\neq}$ unchanged and seeks a new index in $\mathcal{N}_{\neq}$ so that the distance between its landmark position and $lp_{m_{\neq}}$ reaches the second largest value.
Since the new indices $n_{left}$, $n_{mid}$ and $n_{right}$ still satisfy the requirements (i) and (ii), $C^{val}$ and $C^{ref}$ are constrained to share the same CPV. We have $t^{val} = t^{ref} = t$. This allows us to define a differential bit-cube $D^{val} = C \oplus C^{val}$. Let $d^{val}$ denote the number of 1s in $D^{val}$. As done before, we can get the solutions for the leftmost and rightmost landmark positions, where $d^{val}$ instead of $d^{ref}$ is used in Equation (22). As illustrated in Figure 4, if the leftmost landmark position solved from the validation pair equals the one solved from the reference pair, while the two rightmost landmark positions differ, accept the null hypothesis. Otherwise, accept the alternative hypothesis.

CPV-Changing Group
From the CPV-changing group, the leftmost landmark position $lp_{n_{\text{left}}}$ plays a key role in determining the mapping relations between $s^{-1}(n)$ and $n$. Specifically, the index $n_{\text{left}}$ satisfies
$$n_{\text{left}} = \arg\min_{n} lp_n, \quad n \in \mathbb{N}_{\neq}. \tag{26}$$
Equipped with this property, we can remove the absolute value sign in Equation (18) and represent the unknown landmark positions as
$$lp_n = lp_{n_{\text{left}}} + d_{n_{\text{left}},n}, \tag{27}$$
where $d_{n_{\text{left}},n}$, the number of 1s in $D_{n_{\text{left}},n}$, has been calculated in Equation (19).
Traverse each member $n \in \mathbb{N}_{\neq}$, look up $d_{n_{\text{left}},n}$ in turn, and use Equation (27) to get $lp_n$. Doing so exposes all landmark positions for the parity-changing group. Further, exploit Equations (14) and (15) to determine the mapping relations between $s^{-1}(n)$ and $n$, where $n \in \mathbb{N}_{\neq}$. At this point, all $s^{-1}(n)$, where $n = 0, 1, \ldots, N - 1$, have been obtained, enabling us to reconstruct the random mapping sequence $s$ even without knowing the keys $\kappa$ and $\mu$.
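The traversal described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: bit-cubes are held as NumPy arrays of 0/1, the leftmost landmark position is already known, and the helper names `hamming_count` and `landmark_positions` as well as the dictionary layout are hypothetical. The sketch covers Equation (27) only, not the mapping of Equations (14) and (15).

```python
import numpy as np

def hamming_count(cube_a, cube_b):
    """Number of 1s in the differential bit-cube cube_a XOR cube_b."""
    return int(np.count_nonzero(np.bitwise_xor(cube_a, cube_b)))

def landmark_positions(lp_left, cipher_cubes, n_left):
    """Apply lp_n = lp_{n_left} + d_{n_left,n} to every member of the group.

    cipher_cubes maps an index n to its cipher bit-cube C_n
    (a hypothetical container chosen for this sketch).
    """
    reference = cipher_cubes[n_left]
    return {n: lp_left + hamming_count(reference, cube)
            for n, cube in cipher_cubes.items()}

# Toy 2 x 2 x 2 cubes; the contents are made up for illustration only.
toy = {n: np.zeros((2, 2, 2), dtype=np.uint8) for n in (3, 5, 7)}
toy[5].flat[:2] = 1   # differs from cube 3 in two bits
toy[7].flat[:5] = 1   # differs from cube 3 in five bits
positions = landmark_positions(lp_left=10, cipher_cubes=toy, n_left=3)
```

Because every distance is measured from the leftmost member, no sign ambiguity arises in this step.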

Screening-Based Rules
The pre-permutation phase is based on the three orthogonal Latin cubes shown in Equation (2), which are essentially quadratic equations over the ring $\mathbb{Z}/N\mathbb{Z}$. The keys, namely α, β, and γ, can be regarded as unknown variables from the perspective of cryptanalysis. Once s has been reconstructed, the three shared factors, namely s(x), s(y), and s(z), can be exposed by the chosen-plaintext attack. In other words, s(x), s(y), and s(z) are viewed as known variables of the quadratic equations in this subsection. However, there may exist multiple groups of the shared factors due to the uncertainty of the CPV. We design screening-based rules to eliminate this uncertainty, paving the way for breaking α, β, and γ.
Feed $P_m$ into the TIC [13] and obtain $C_m$ and $t_m$, where $m = 0, 1$. The number of flipped bits in $C_m$, denoted by $h_m$, is the key information for exposing $s(x_m)$, $s(y_m)$ and $s(z_m)$. Instantiating Equation (8) with $h_m$ gives Equation (28), whose results can be readily converted into $s(x_m)$, $s(y_m)$ and $s(z_m)$ by using the knowledge of $s$.
However, obtaining $h_m$ is not trivial. We first check whether $t_m$ equals $t$. If $t_m = t$, calculate the differential bit-cube $D_m = C \oplus C_m$ and count the number of 1s in $D_m$, denoted by $d_m$. Similar to the CPV-preserving group, we have $h_m = d_m$. If $t_m \neq t$, we select an index $n$ from $\mathbb{N}_{\neq}$, so that $t_m = t_n \neq t$. In this case, $D_m$ should be defined as $D_m = C_n \oplus C_m$, and $d_m$ represents the distance between $lp_m$ and $lp_n$. Similar to Equation (18), we have
$$d_m = \left| lp_m - lp_n \right|, \tag{29}$$
where the absolute value sign is necessary for the following reason.
In this subsection, the coordinates no longer lie on the diagonal of $P$, meaning that the pre-permutation phase cannot be simplified. As a result, the statement that $lp_{n_{\text{left}}}$ ($lp_{n_{\text{right}}}$) is located at the leftmost (rightmost) side is no longer true. For an arbitrarily selected $n \in \mathbb{N}_{\neq}$, its landmark position $lp_n$ may be located on the left or the right side of $lp_m$. According to Equation (14), $lp_m$ and $lp_n$ can be expressed as $N^3 - h_m$ and $N^3 - h_n$, respectively. Based on this expression, we can rewrite Equation (29) in the form
$$h_m = h_n \pm d_m, \tag{30}$$
where both $h_n$ and $d_m$ are accessible. Equation (30) means that, when $t_m \neq t$, the calculation of $h_m$ involves uncertainty, resulting in two possible values. Accordingly, by using Equation (28) and the knowledge of $s$, we may obtain two groups of the shared factors, denoted by $\{s^{+}(x_m), s^{+}(y_m), s^{+}(z_m)\}$ and $\{s^{-}(x_m), s^{-}(y_m), s^{-}(z_m)\}$, respectively. Here, the superscripts signify which of the two operators "+" and "−" is used during the calculation of $h_m$.
Since it is difficult to forecast the CPV at the stage of creating $P_0$ and $P_1$, we have to consider the following three cases and design screening-based rules separately to determine the real solutions of α, β, and γ. The first case, indicated by "case 1" in Figure 5, occurs when $t_0 = t$ and $t_1 = t$. In this case, we obtain $h_0 = d_0$ and $h_1 = d_1$ without uncertainty, meaning that the shared factors $\{s(x_0), s(y_0), s(z_0)\}$ and $\{s(x_1), s(y_1), s(z_1)\}$ are both authentic. Therefore, the quadratic equations can be written as Equation (31), where $\chi_0$ and $\chi_1$ represent the unknown variables in Equation (2). Solve Equation (31) over the ring $\mathbb{Z}/N\mathbb{Z}$ and obtain $\chi_0 = \{\chi_0(0), \chi_0(1)\}$ and $\chi_1 = \{\chi_1(0), \chi_1(1)\}$, each of which must contain two real solutions due to the authenticity of the shared factors [24].
For this case, the screening-based rule is to assign $\beta = \chi_0 \cap \chi_1$, $\alpha = \chi_0 \setminus \{\beta\}$, and $\gamma = \chi_1 \setminus \{\beta\}$, which directly breaks the keys.
Figure 5. A toy example that illustrates the screening-based rules for the three cases. Suppose that the keys α, β, and γ take values 1, 2, and 3, respectively. The bold dotted boxes in different colors highlight the scope of each case. The gray tick and cross indicate that the group of solutions will be accepted and discarded, respectively.

The third case, indicated by "case 3" in Figure 5, occurs when $t_0 \neq t$ and $t_1 \neq t$. In this case, both $h_0$ and $h_1$ have two possible values, each of which generates two groups of the shared factors. Hence, the quadratic equations take one of four forms, one for each combination of the candidate shared factors, from which we explicitly obtain four groups of solutions, denoted by $\{\chi_0^{+}, \chi_1^{+}\}$, $\{\chi_0^{+}, \chi_1^{-}\}$, $\{\chi_0^{-}, \chi_1^{+}\}$, and $\{\chi_0^{-}, \chi_1^{-}\}$, respectively. Inspecting each group, the screening-based rule makes the following judgement. If the intersection of the two solutions is empty, the corresponding group is discarded. Otherwise, it must be the authentic one $\{\chi_0, \chi_1\}$. For example, if only the second group has a non-empty intersection, namely $\chi_0^{+} \cap \chi_1^{-} \neq \emptyset$, then we set $\chi_0 = \chi_0^{+}$ and $\chi_1 = \chi_1^{-}$. By doing so, the double uncertainty is eliminated as well.
Figure 5 shows a toy example that illustrates the screening-based rules for the three cases. Assume that $n = 5$ is selected from $\mathbb{N}_{\neq}$, and that the pair of chosen plain-cipher bit-cubes, namely $P_5$ and $C_5$, plays a role when $t_m \neq t$. Note that, since the shared factors may be incorrect, a solution may be empty, such as $\chi_1^{+}$ in this example. An empty solution can be discarded directly.
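The screening logic can be sketched with Python sets. The sketch takes the candidate solution sets as given, since solving the quadratic equations over the ring is outside its scope. The function names `screen` and `break_keys` are hypothetical, and the spurious candidate {7, 9} is invented purely for illustration; only the key values 1, 2, and 3 follow the toy example of Figure 5.

```python
from itertools import product

def screen(chi0_candidates, chi1_candidates):
    """Keep the (chi0, chi1) groups whose solution sets intersect.

    Empty solutions and disjoint groups are discarded, as the rule states.
    """
    return [(chi0, chi1)
            for chi0, chi1 in product(chi0_candidates, chi1_candidates)
            if chi0 & chi1]

def break_keys(chi0, chi1):
    """beta is the shared solution; alpha and gamma are the leftovers.

    Assumes each set holds exactly two solutions sharing exactly one.
    """
    (beta,) = chi0 & chi1
    (alpha,) = chi0 - {beta}
    (gamma,) = chi1 - {beta}
    return alpha, beta, gamma

# Case 3 with keys alpha, beta, gamma = 1, 2, 3 (toy values).
chi0_plus, chi0_minus = {1, 2}, {7, 9}      # {7, 9} is a made-up spurious group
chi1_plus, chi1_minus = set(), {2, 3}       # an empty solution is allowed
survivors = screen([chi0_plus, chi0_minus], [chi1_plus, chi1_minus])
keys = break_keys(*survivors[0])
```

Case 1 corresponds to passing a single candidate for each of χ0 and χ1, so the same two functions cover all cases in this sketch.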

Performance Analysis
In total, the proposed cryptanalysis algorithm requires $2 \times \sqrt[3]{H \times W} + 3$ pairs of chosen plain-cipher images to break the TIC [13]. Compared with Zhang's work [22], which requires $2.5 \times \sqrt[3]{H \times W} + 6$ pairs, our method is more efficient, especially when the image has a high resolution. This merit is particularly useful when the number of admissible accesses to a TIC is limited.
To reconstruct s, the proposed reference-validation inference algorithm requires $2 \times \sqrt[3]{H \times W} + 1$ pairs of chosen plain-cipher bit-cubes. First, we need to create $2 \times \sqrt[3]{H \times W} - 1$ plain bit-cubes $P_n$ using Equation (12), where $n = 1, 2, \cdots, N - 1$. In practice, the plain bit-cube $P_0$ is unnecessary because $s^{-1}(0)$ can be immediately derived from Equation (15) regardless of the value of n. Second, to tackle the CPV-changing group, we create a reference plain bit-cube $P_{\text{ref}}$ using Equation (16) and a validation counterpart $P_{\text{val}}$ using Equation (24).
To break the keys α, β, and γ, Zhang's work [22] requires two, four or six pairs of chosen plain-cipher images, respectively, to deal with the three cases described in the last section. When evaluating the performance of a cryptanalysis algorithm, we always consider the worst case, which incurs the highest cost. From this perspective, six pairs are needed in Zhang's work [22]. By contrast, the designed screening-based rules eliminate the uncertainty in $h_m$ by fully exploiting the exclusion and intersection relationships between the group-wise solutions. Therefore, in our work, only two pairs of chosen plain-cipher images are sufficient to break the keys α, β, and γ, whichever case we encounter in practice.
Furthermore, the number of necessary pairs of chosen plain-cipher images is constant with respect to the keys, the positions of chosen bits and the contents of plain images. This merit allows an attacker to accurately estimate the cost before launching the attacks. The experimental results in Section 4.3 will corroborate the above claims.
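The counts quoted in this section can be checked with a short calculation. The sketch below assumes 8-bit pixels, so that the side length is N = ∛(8 × H × W) = 2 × ∛(H × W); the function names are hypothetical, and the figure for Zhang's work [22] is simply the worst-case bound stated in the text.

```python
def side_length(height, width, bits_per_pixel=8):
    """Side length N of the bit-cube, N = cbrt(bits_per_pixel * H * W)."""
    total_bits = height * width * bits_per_pixel
    n = round(total_bits ** (1.0 / 3.0))
    if n ** 3 != total_bits:
        raise ValueError("image does not fill an N x N x N bit-cube")
    return n

def pairs_proposed(height, width):
    """(N - 1) cubes P_n, plus P_ref and P_val, plus two pairs for the keys."""
    n = side_length(height, width)
    return (n - 1) + 2 + 2

def pairs_zhang_worst(height, width):
    """Worst-case bound 2.5 * cbrt(H * W) + 6 = 1.25 * N + 6 quoted above."""
    n = side_length(height, width)
    return 1.25 * n + 6

for h, w in [(512, 512), (1024, 2048), (2048, 3456), (4096, 4096)]:
    print(h, w, side_length(h, w), pairs_proposed(h, w), pairs_zhang_worst(h, w))
```

For a 512 × 512 image this gives N = 128, hence 131 pairs for the proposed method against a worst case of 166 for the bound attributed to [22].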

Experimental Results
In this section, we conduct simulation experiments and comparative studies to demonstrate the effectiveness and the superiority of the proposed cryptanalysis algorithm. The first experiment exhibits the cryptanalysis results for five standard grayscale images. The second experiment tests the cryptanalysis performance for a camera-based natural scene image, showing the prospects for practical applications. The third experiment is devoted to the comparative studies, which demonstrate that our cryptanalysis algorithm requires fewer pairs of chosen plain-cipher images than Zhang's work [22]. Following the setting in [13], we specify that the default keys are κ = 0.12345678912341, µ = 3.99999, α = 1, β = 2, and γ = 3. Unless explicitly stated, the TIC [13] is governed by the default keys. All experiments are implemented on a desktop computer with a 2.90 GHz Intel i7-10700 central processing unit and 16.00 GB of memory. The programming environment for simulations is Matlab R2017a installed on the Windows 10 operating system.

Results of the Cryptanalysis Algorithm
Figure 6 shows the results for the first experiment. The first and second columns display the plain images and the corresponding cipher images obtained from the TIC [13], respectively. The five plain images are "Lena", "Baboon", "Testpat", "Wedge", and "Black", all of which have the size of 512 × 512 × 1 (grayscale), giving the side length $N = \sqrt[3]{512 \times 512 \times 8} = 128$. In this experiment, we first perform the proposed cryptanalysis algorithm to reconstruct s and break the keys α, β, and γ. Then, for each cipher image, we perform the decryption algorithm of [13] governed by the broken information. In the third column of Figure 6, we provide the intermediate results, in which the cipher images are partially decrypted to counteract the post-permutation and diffusion phases (but not the pre-permutation phase). The rightmost column shows the completely decrypted images.
We see that these completely decrypted images are exactly the same as the plain images in the first column without any visual loss. For "Black", however, the partial decryption can entirely reveal the visual information because "Black" is immune to the pre-permutation phase. As expected, the third column of Figure 6e is an all zero-valued image, which is the same as the plain image. This provides us with a new perspective to verify the correctness of our cryptanalysis algorithm on the basis of the intermediate results.
In the upper panel of each image, we append some auxiliary information, intended to provide qualitative and quantitative indicators for monitoring the progress and validating the correctness of our cryptanalysis algorithm. The auxiliary information includes CSBPs (cross-sectional bit-planes), LBP (local binary pattern) histograms, and corresponding entropy values calculated from the LBP histograms.
For qualitative comparisons, we select the front CSBP P(0, :, :), the middle CSBP P(N/2 − 1, :, :), and the back CSBP P(N − 1, :, :) out of a given bit-cube P, where the colon operator returns a regularly-spaced vector [0, 1, …, N − 1]. Intuitively, the CSBP can reflect the bit correlation in a given bit-cube. Observing the CSBPs of the plain images, we see that there exist regular LBPs, especially for "Testpat" and "Wedge". In contrast, the cipher images' CSBPs consist of pseudorandom LBPs, meaning that the regularity has been eliminated by the TIC [13]. Most importantly, by comparative observations, we find that the same regular LBPs reappear for the completely decrypted images, thereby verifying that the proposed cryptanalysis algorithm is capable of breaking the TIC [13].
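As a rough illustration of how the three CSBPs could be extracted, the following Python sketch unpacks an 8-bit grayscale image into a bit-cube and slices it. The bit ordering and the reshape layout are assumptions of this sketch and may differ from the encoding used by the TIC [13]; the function names are hypothetical.

```python
import numpy as np

def image_to_bit_cube(image):
    """Unpack an 8-bit grayscale image into an N x N x N bit-cube.

    Row-major pixel order with most-significant bit first is assumed here.
    """
    bits = np.unpackbits(np.asarray(image, dtype=np.uint8).ravel())
    n = round(bits.size ** (1.0 / 3.0))
    if n ** 3 != bits.size:
        raise ValueError("image does not fill an N x N x N bit-cube")
    return bits.reshape(n, n, n)

def cross_sections(cube):
    """Front, middle and back CSBPs: P(0,:,:), P(N/2-1,:,:), P(N-1,:,:)."""
    n = cube.shape[0]
    return cube[0, :, :], cube[n // 2 - 1, :, :], cube[n - 1, :, :]

# An 8 x 8 toy image holds 8 * 8 * 8 = 512 bits, i.e. an 8 x 8 x 8 cube.
toy_image = np.arange(64, dtype=np.uint8).reshape(8, 8)
toy_cube = image_to_bit_cube(toy_image)
front, middle, back = cross_sections(toy_cube)
```

A 512 × 512 image processed the same way yields a 128 × 128 × 128 cube, matching the side length used in the experiments.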
To better characterize the regularity, we define a histogram that reflects the probability distribution of the eight LBPs occurring in a CSBP. As shown in Figure 7, the horizontal axis of the histogram lists the eight LBPs, where three adjacent bits with different binary combinations are viewed as patterns. The vertical axis corresponds to the probability values. For clarity, we omit the captions of the axes when displaying the LBP histograms in the upper panel. Furthermore, to measure the regularity quantitatively, we calculate the entropy value from each LBP histogram. By comparison, we find that cipher images possess flatter LBP distributions and greater entropy values. This supports the statement that the LBPs of the cipher images have lower regularity. The LBP histograms in the fourth column of Figure 6 share the same shapes as those in the first column, and the entropy values are equal. These results demonstrate that the TIC [13] has been successfully broken by the proposed cryptanalysis algorithm.
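A possible way to compute such a histogram and its entropy is sketched below in Python. Several details are assumptions, since the text does not fix them: the three adjacent bits are taken along each row with overlapping windows, and the entropy is the Shannon entropy in bits. The function names are hypothetical.

```python
import numpy as np

def lbp_histogram(plane):
    """Probability of each of the eight 3-bit patterns in a bit-plane.

    Patterns are read from three horizontally adjacent bits using
    overlapping windows (an assumption of this sketch).
    """
    plane = np.asarray(plane, dtype=np.int64)
    codes = 4 * plane[:, :-2] + 2 * plane[:, 1:-1] + plane[:, 2:]
    counts = np.bincount(codes.ravel(), minlength=8)
    return counts / counts.sum()

def entropy_bits(prob):
    """Shannon entropy in bits; zero-probability patterns contribute nothing."""
    nonzero = prob[prob > 0]
    return float(-(nonzero * np.log2(nonzero)).sum())

# A constant plane has a single pattern; this row contains each pattern once.
flat_plane = np.zeros((4, 10), dtype=np.uint8)
mixed_plane = np.array([[0, 0, 0, 1, 0, 1, 1, 1, 0, 0]], dtype=np.uint8)
flat_entropy = entropy_bits(lbp_histogram(flat_plane))
mixed_entropy = entropy_bits(lbp_histogram(mixed_plane))
```

Under these assumptions the entropy ranges from 0 for a perfectly regular plane to 3 bits for a uniform distribution over the eight patterns, which is in line with the flatter histograms reported for the cipher images.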
Moreover, we conduct correlation analysis. The correlation coefficient can be viewed as a numerical indicator reflecting the consistency between the plain image and the completely decrypted image. In this experiment, we randomly select 10,000 pairs of adjacent pixels in the horizontal, vertical, and diagonal directions from each image, and calculate the correlation coefficient $r_{pq}$ as follows
$$r_{pq} = \frac{\sum_{l=1}^{10000} \left(p_l - E(p)\right)\left(q_l - E(q)\right)}{\sqrt{\sum_{l=1}^{10000} \left(p_l - E(p)\right)^2 \sum_{l=1}^{10000} \left(q_l - E(q)\right)^2}},$$
where $p_l$ and $q_l$ form the $l$th pair of adjacent pixels, and $E(p)$ stands for the expectation of $p = \{p_1, p_2, \cdots, p_{10000}\}$.
The correlation coefficient $r_{pq}$ lies in the interval [−1, 1], where both 1 and −1 indicate the highest correlation and 0 indicates no correlation. In particular, we stipulate that $r_{pq}$ = NaN when $p_l = q_l = c$, where c is a constant, holds for all values of l. Table 2 lists the quantitative results. We find that the plain image and the completely decrypted image share the same correlation coefficients. For "Black", the correlation coefficient of the partially decrypted image equals NaN because counteracting the post-permutation and diffusion phases is sufficient to recover the zero-valued pixels, giving $p_l = q_l = 0$. These results also demonstrate the correctness of the proposed cryptanalysis algorithm.
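The correlation analysis can be sketched as follows in Python, assuming the standard sample Pearson coefficient. The function names, the random seed and the restriction to horizontal neighbours in the sampler are choices made for this sketch only. The NaN branch is slightly broader than the stipulation in the text: it triggers whenever either sample is constant, which includes the case $p_l = q_l = c$.

```python
import numpy as np

def correlation(p, q):
    """Sample Pearson correlation of paired pixel values; NaN if undefined."""
    p = np.asarray(p, dtype=np.float64)
    q = np.asarray(q, dtype=np.float64)
    dp, dq = p - p.mean(), q - q.mean()
    denom = np.sqrt((dp ** 2).sum() * (dq ** 2).sum())
    if denom == 0:
        return float("nan")   # constant samples, e.g. the "Black" image
    return float((dp * dq).sum() / denom)

def sample_horizontal_pairs(image, count=10_000, seed=0):
    """Draw `count` random pairs of horizontally adjacent pixels."""
    rng = np.random.default_rng(seed)
    img = np.asarray(image)
    rows = rng.integers(0, img.shape[0], size=count)
    cols = rng.integers(0, img.shape[1] - 1, size=count)
    return img[rows, cols], img[rows, cols + 1]

black = np.zeros((16, 16), dtype=np.uint8)
r_black = correlation(*sample_horizontal_pairs(black))   # NaN by stipulation
```

Vertical and diagonal directions follow the same pattern with the row offset, or both offsets, set to one.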

Efficiency of the Cryptanalysis Algorithm
In this experiment, the goal is to demonstrate the effectiveness of our cryptanalysis algorithm under a real-life application scenario. To this end, we take a landscape photograph of our university campus as the plain image. It has three color channels with the spatial resolution of 1024 × 2048. Under different settings, the TIC [13] yields three cipher images, as shown in Figure 8b,d,f, respectively. In the first setting, the three channels of the plain image are separately encrypted with the same default keys. In the second setting, we introduce a tiny change into the default keys, and separately encrypt the three channels. In the third setting, the TIC [13] is governed by three different keys, and encrypts the three channels in turn. In our computing environment, the proposed cryptanalysis algorithm takes 86.43 s, 87.15 s and 267.58 s (about 89.19 s for each channel, on average), respectively, to complete the cryptanalysis task under the three settings. Figure 8c,e,g exhibit the cryptanalysis results, in which the decrypted images are the same as the plain image.
We summarize the following two points from the above experimental results. First, the proposed cryptanalysis algorithm is highly efficient for the camera-based natural scene image, showing the feasibility of deployment in practical systems. Second, the efficiency of the proposed cryptanalysis algorithm is relatively stable. The timing data show that different keys have almost no effect on the efficiency.

Comparative Studies
Zhang's work [22] also focuses on attacking the same TIC [13]. Compared with this work [22], the proposed cryptanalysis algorithm requires fewer pairs of chosen plain-cipher images. We conduct two comparative studies as follows.
In the first comparative study, we count how many pairs of chosen plain-cipher images are necessary to break the TIC [13]. The plain images are the same as those used in Figure 6. The experimental protocol consists of the following steps. Randomly set two coordinates (L 1 (x k , y k , z k ), L 2 (x k , y k , z k ), L 3 (x k , y k , z k )), where k = 0, 1, according to the conditions mentioned in Section 3.3. Perform Zhang's cryptanalysis algorithm [22] and the proposed one, respectively. Record the number of necessary pairs of chosen plain-cipher images. For each plain image, we repeat the above steps ten times, and finally calculate the average number of necessary pairs. Moreover, we consider the default keys and a new set of keys, in order to examine whether the performance of the cryptanalysis algorithms is sensitive to the keys. The new keys are κ = 0.12345678912343, µ = 3.99998, α = 2, β = 1, and γ = 4. Numerical results are listed in Table 3. For the plain images with the size of 512 × 512, the proposed cryptanalysis algorithm only requires 131 pairs of chosen plain-cipher images. By contrast, on average, 163.88 pairs are necessary for Zhang's work [22]. Furthermore, as we see in Table 3, the numerical results in the columns titled "Ours" are all the same. This demonstrates that different keys, positions of chosen bits and contents of plain images do not affect the performance of our cryptanalysis algorithm. Thus, given only the image resolution, an attacker can accurately estimate the cost of our cryptanalysis algorithm before launching the attacks.
In the second comparative study, we aim to verify that the advantage of the proposed cryptanalysis algorithm grows when dealing with larger images. The experimental protocol consists of the following steps. Resize the plain images to 1024 × 2048, 2048 × 3456, and 4096 × 4096, respectively, using bicubic interpolation. Accordingly, the enlarged plain bit-cubes have the side length N = 256, 384, and 512, respectively. For each cipher image, perform Zhang's cryptanalysis algorithm [22] and the proposed one, respectively. Record the number of necessary pairs of chosen plain-cipher images. For the five plain images in Figure 6, we calculate the average number of necessary pairs. For comparison, the numerical results are plotted in a bar chart, as shown in Figure 9. Regardless of the resolution, the proposed cryptanalysis algorithm consistently outperforms Zhang's work [22], and the superiority becomes more obvious for larger images.

Conclusions
In this paper, we investigate Xu's image cryptosystem [13], and summarize security loopholes from four aspects. On this basis, we propose a reference-validation inference algorithm and design screening-based rules to efficiently break Xu's image cryptosystem [13]. Compared with an existing work [22], our cryptanalysis algorithm requires fewer pairs of chosen plain-cipher images. Only $\sqrt[3]{8 \times H \times W} + 3$ pairs, where H × W represents the image's resolution, are sufficient to complete the cryptanalysis task. Moreover, the performance of the proposed cryptanalysis algorithm is highly stable since different keys, positions of chosen bits and contents of the plain images will not influence the number of necessary pairs. This merit enables an attacker to pre-estimate and allocate the required resources before launching the attacks.
