A New Cryptographic Scheme Utilizing the Difficulty of Big Boolean Satisfiability

A search problem may be identified as one that requires an actual “search” for an answer or a solution. Such a problem may have no obvious method for determining a solution other than to search intelligently through all candidate solutions, which constitute the search space, until a satisfactory one is found. Typically, we may have an efficient way of checking whether a candidate solution is actually correct, but no efficient way of finding a correct solution. There are many such search problems, both theoretically and practically motivated, but they all have these difficulties in common. Consider the example of RSA cryptanalysis, where we are given an integer $n$ which is the product of two large prime numbers $a$ and $b$, and we need to factor $n$ into $a$ and $b$. This can be achieved by trial division, i.e., by attempting to divide $n$ by every prime between 2 and $\sqrt{n}$ (such primes being obtainable, e.g., via the sieve of Eratosthenes), and hence it is a special kind of search problem in itself. Several efforts in the past aimed to translate various encryption and hashing schemes into Boolean satisfiability (SAT). The SAT problem is a computationally intractable (NP-hard) problem, but relatively efficient SAT-Solvers have been built with a computational complexity of $2^{k(1-\epsilon)}$, where $0 < \epsilon < 1$, and thus they can prune the search space significantly. Guided by the above concepts, we propose herein a scheme that encrypts a message by using a ‘big’ Boolean function, which produces an equation that cannot be solved by conventional SAT-Solvers and leads to a dramatic increase in the search space from $2^n$ to $(2^{2^m})^n$ in the worst case. Logical cryptanalysis indicates that the proposed scheme is very hard to break. To the best of our knowledge, the adversary cannot reduce or prune the search space (except for shortening the task needed at every node), and is forced to traverse the whole search space. The adversary might arrive at several candidate solutions, and then has to search for clues as to which of them is the correct solution.


Introduction
In the last two decades, remarkable progress has been achieved in automated solvers of the Boolean satisfiability problem, henceforth referred to as SAT-Solvers (Marques-Silva, 1999; Moskewicz et al., 2001; Eén and Sörensson, 2003). These solvers employ highly refined algorithms for solving SAT problems. An extensive amount of research has been done during the last two decades on the SAT techniques of DLL search, conflict-driven clause learning, and logic simplification to increase the efficiency of SAT-Solvers (Selman et al., 1993; Marques-Silva, 1999; Zhang, 1997; Zhang and Malik, 2002; Eén and Sörensson, 2003; Moskewicz et al., 2001; Ryan, 2004). The improvement in modern SAT-Solvers opens new horizons for many real-world practical applications, belonging to the traditional fields of artificial intelligence and formal verification, that can be successfully solved by SAT-Solvers (Prasad et al., 2005; Boyan and Moore, 1998; Hughes and Bultan, 2008). Modern SAT-Solvers have done a tremendous job solving practical-application problems of model checking (Biere et al., 1999), scheduling, test-pattern generation in digital systems (Larrabee, 1992), design debugging and diagnosis (Smith et al., 2005), identification of functional dependencies in Boolean functions, technology mapping in logic synthesis (Safarpour et al., 2006), circuit delay computation, power optimization (Sagahyroon et al., 2011), FPGA design (Hu et al., 2007), network intrusion (Kim and

Boolean Satisfiability and SAT-Solvers
Boolean satisfiability is the problem of deciding whether a propositional logic formula can be satisfied given suitable value assignments to the variables of the formula. Generally, the problem is represented by a propositional formula consisting of a conjunction (ANDing) of clauses (alterms), each of which comprises a disjunction (ORing) of literals, where a literal is a variable in un-complemented form $X$ or in complemented form $\bar{X}$. This representation of the propositional formula is called a Conjunctive Normal Form (CNF). A CNF is also known as a product-of-sums (pos) expression. A literal $X$ is said to be true if a value of 1 is assigned to $X$, while a literal $\bar{X}$ is said to be true if a value of 0 is assigned to $X$. So, a clause, which is a disjunction of literals, is said to be satisfied if at least one of its literals is assigned a true value. If every literal in a clause is assigned a false value, then the clause is unsatisfied, and hence the whole formula is unsatisfied under that assignment. The formula is unsatisfiable if every possible assignment leaves at least one of its clauses unsatisfied.
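The definitions above can be made concrete with a minimal sketch. It assumes a DIMACS-style encoding that is not part of the original text: a literal is a non-zero integer, where `+i` stands for the variable X_i and `-i` for its complement. The function names are illustrative only.

```python
# A CNF formula is a list of clauses; a clause is a list of non-zero integers.
# Assumed encoding: +i is the literal X_i, -i is its complement.

def literal_is_true(lit, assignment):
    """X_i is true when X_i = 1; its complement is true when X_i = 0."""
    value = assignment[abs(lit)]
    return value == 1 if lit > 0 else value == 0

def clause_is_satisfied(clause, assignment):
    """A clause (disjunction) needs at least one true literal."""
    return any(literal_is_true(lit, assignment) for lit in clause)

def formula_is_satisfied(cnf, assignment):
    """A CNF (conjunction) needs every clause satisfied."""
    return all(clause_is_satisfied(clause, assignment) for clause in cnf)

# (X1 or not X2) and (X2 or X3) and (not X1 or not X3)
cnf = [[1, -2], [2, 3], [-1, -3]]
good = {1: 1, 2: 1, 3: 0}   # satisfies all three clauses
bad = {1: 1, 2: 0, 3: 0}    # falsifies the clause (X2 or X3)
print(formula_is_satisfied(cnf, good), formula_is_satisfied(cnf, bad))
```

Checking a given assignment is linear in the size of the formula; the difficulty lies in finding a satisfying assignment.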
Many SAT-Solvers have been developed to solve the Boolean satisfiability problem. Today most modern SAT-Solvers (Moskewicz et al., 2001; Eén and Sörensson, 2003; Pipatsrisawat and Darwiche, 2007a; 2007b) are based on the original DPLL algorithm (Davis and Putnam, 1960; Davis et al., 1962), which is composed of branching, unit propagation (a recursive form of what is known as Boolean constraint propagation) and backtrack searching. The DPLL algorithm performs a search process that traverses the space of variable assignments until a satisfying assignment is found, or the search space is exhausted without finding any satisfying assignment. The first major enhancement to DPLL was introduced in the GRASP solver (Marques-Silva and Sakallah, 1997; 1999), which introduced a new mechanism for learning from conflicting assignments. The GRASP solver performs non-chronological backtracking by learning clauses from conflicting assignments and adding the learned clauses to the formula. The second main enhancement in DPLL SAT-Solvers was introduced by Gomes et al. (1998) by implementing restarts in the search algorithm. The search is restarted from the root level once a certain number of conflicts has been reached. The limit on the number of conflicts varies among SAT-Solvers. The most common restarting policies include the optimal speedup policy (Luby et al., 1993) and the adaptive restart strategy (Biere, 2008). The third main enhancement was the efficient implementation of Boolean Constraint Propagation (BCP), one of the main features of DPLL. The Chaff solver implements two-literal watching, which efficiently reduces the overhead of BCP (Moskewicz et al., 2001). Additional enhancements which further improve the performance of modern SAT-Solvers include Conflict-Based Adaptive Branching such as Dynamic Largest Individual Sum (DLIS) (Marques-Silva, 1999) and Variable State Independent Decaying Sum (VSIDS) (Pipatsrisawat and Darwiche, 2007a; 2007b).
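The three ingredients of the basic DPLL procedure (branching, unit propagation, and backtracking) can be sketched as follows. This is a textbook-style illustration under an assumed DIMACS-style encoding (`+i` for a variable, `-i` for its complement). It omits clause learning, restarts, watched literals, and branching heuristics, so it does not represent GRASP, Chaff, or any other solver cited above.

```python
def simplify(cnf, lit):
    """Make `lit` true: drop satisfied clauses, delete the falsified literal."""
    result = []
    for clause in cnf:
        if lit in clause:
            continue                      # clause already satisfied
        result.append([l for l in clause if l != -lit])
    return result

def dpll(cnf, assignment=None):
    """Return a (possibly partial) satisfying assignment, or None if unsatisfiable."""
    assignment = dict(assignment or {})
    # Unit propagation: a one-literal clause forces the value of its variable.
    while True:
        if any(len(clause) == 0 for clause in cnf):
            return None                   # empty clause: conflict, backtrack
        units = [clause[0] for clause in cnf if len(clause) == 1]
        if not units:
            break
        lit = units[0]
        assignment[abs(lit)] = 1 if lit > 0 else 0
        cnf = simplify(cnf, lit)
    if not cnf:
        return assignment                 # every clause satisfied
    # Branching: try a literal, then its complement (chronological backtracking).
    lit = cnf[0][0]
    for choice in (lit, -lit):
        extended = {**assignment, abs(choice): 1 if choice > 0 else 0}
        result = dpll(simplify(cnf, choice), extended)
        if result is not None:
            return result
    return None

model = dpll([[1, -2], [2, 3], [-1, -3]])
no_model = dpll([[1], [-1]])
print(model, no_model)
```

Variables absent from the returned dictionary are unconstrained and may take either value.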

Cryptography
A cryptosystem is one of the most fundamental cryptographic protocols used in data security (Talbot and Welsh, 2006). The sender has a message text (Mesg) and an encryption function $e$ that can be used to encrypt the message into a cryptogram or cipher text (Ciph), as depicted by the equation $\mathrm{Ciph} = e(\mathrm{Mesg})$ (1). The encrypted message is then sent on the (insecure) communication channel to the receiver. The receiver collects the message and decrypts it by using a decryption function $d$ to recover the message, namely $\mathrm{Mesg} = d(\mathrm{Ciph})$ (2). Fig. 1 shows a typical insecure communication channel with three players: a sender, a receiver, and an adversary. The channel employs a SAT cryptosystem. The essence of this system is that deciphering the encrypted message is very easy (even trivial) for the intended receiver (just straightforward substitution), while the same task is terribly difficult, costly and time-consuming for the adversary (amounting to solving an NP-hard SAT problem).
Usually, cryptosystems can be categorized into two types: symmetric (private-key) cryptosystems and asymmetric (public-key) cryptosystems. In a symmetric cryptosystem, an otherwise secret key is shared between the sender and the receiver. The encryption function and the decryption function both depend on this shared key. Such a cryptosystem is described by the quintuple
$\{M, K, C, e(\cdot, \cdot), d(\cdot, \cdot)\}$ (3), where $M$ is the set of all possible messages, $K$ is the set of all possible keys, $C$ is the set of all possible cipher texts, the function $e$ maps the message into a cipher text, while the function $d$ maps the cipher text back into the original message. These two functions are described by $e: M \times K \rightarrow C$ (4) and $d: C \times K \rightarrow M$ (5). Crypto-1, DES, and AES are among the most popular cryptographic techniques that work on the principle of symmetric cryptography (Afianti and Barmawi, 2015). All of these cryptosystems can be attacked by using algebraic cryptanalysis (Kamal and Youssef, 2010), in which an adversary tries to retrieve the secret key by solving a system of polynomial equations through the use of linearization, Gröbner bases or SAT-Solvers. For a SAT-Solver, the equations are encoded in CNF, and the adversary must use an efficient SAT-Solver so as to decrypt the message, which is a very difficult task, especially when required in real time.
On the other hand, the public-key cryptosystem does not have any secret key that is shared solely between the sender and the receiver. Instead, the key is public and the cryptosystem employs the concept of a one-way function, i.e., a function which is very easy to compute in the forward direction used for encryption and signature verification, but very hard to compute in the reverse direction used for decryption and signature generation. The larger the size of the (public) key is, the greater is the difference between the computations required in the forward direction and in the reverse direction.
Fig. 1. An insecure communication channel employing a SAT cryptosystem. Receiver: knows a particular solution that is easily (almost trivially) substituted in the function to get 1 and 0. Adversary: intercepts the function and determines (with difficulty) if it is satisfiable to 0 or 1.
Multiplication and factorization, discrete exponentiation and logarithm, and cryptographic hash functions such as SHA-1 and MD4/5 are well-known examples of one-way functions used in public-key cryptographic systems. Although all these functions used in cryptosystems are said to be one-way functions, there is no theoretical proof that any of them offers perfect security or unbreakable encryption. Hence, it might be possible to discover algorithms that can attack a cryptosystem based on any of them, and hence retrieve the secret information from it. For example, one might use logical cryptanalysis, in which a cryptosystem based on factorization, discrete logarithm, or the SHA-1 and MD4/5 hash functions (Jovanović and Janičić, 2005; Legendre et al., 2014) is encoded as a SAT problem and then, with the help of a SAT-Solver, is decrypted by an adversary (Faizullin et al., 2009).
The (Boolean/propositional) satisfiability problem is known to be NP-complete, and hence NP-hard. In fact, it is the first problem ever proven to be so (Rushdi and Ahmad, 2016). In the worst case, solving an instance in $n$ variables takes a traversal of $2^n$ value assignments. Now, we propose a scheme for symmetric-key cryptography, in which we encrypt the message by using a 'big' Boolean algebra rather than the two-valued one. The advantage of this scheme is that it increases the difficulty of the problem, which cannot be solved by a conventional SAT-Solver without introducing appropriate modifications to it.
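As a worked comparison of the two search spaces, the following uses the worst-case counts quoted in the abstract, with $n$ denoting the number of variables and $m$ the number of generators of the free Boolean algebra, so that each variable ranges over $2^{2^m}$ elements. The numerical instances are illustrative choices, not figures from the paper, apart from the count of 256 that appears in the toy example later on.

```latex
\begin{align*}
\text{two-valued SAT:}\quad & \left|\{0,1\}^n\right| = 2^n, \\
\text{`big' Boolean SAT:}\quad & \left(2^{2^m}\right)^n = 2^{\,n\,2^m}. \\[4pt]
n = 2,\ m = 2:\quad & 2^2 = 4 \quad\text{versus}\quad 16^2 = 256, \\
n = 10,\ m = 3:\quad & 2^{10} = 1024 \quad\text{versus}\quad 2^{80} \approx 1.2 \times 10^{24}.
\end{align*}
```

The exponent grows from $n$ to $n\,2^m$, so each additional generator doubles the exponent of the worst-case search space.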

Methodology
As usual, our cryptosystem consists of two parts: encryption and decryption. In this scheme, a multi-bit message consisting of 1's and 0's can be encrypted by a set of big Boolean functions of value 0 and a similar set of functions of value 1. These Boolean functions appear in equations of the following form: $f_i(X) = 0$, $i = 1, 2, \ldots, r$ (6) and $g_i(X) = 1$, $i = 1, 2, \ldots, r$ (7),
where $r$ represents the number of each of the $f$-type and $g$-type functions, and $X = [X_1\ X_2\ \ldots\ X_n]$ is an $n$-tuple with $X_j$ being a variable belonging to a 'big' Boolean algebra $B_{2^p}$, where $p > 1$. Such an algebra is atomic with $p$ atoms and has $2^p$ partially-ordered elements forming a complemented distributive lattice. Fig. 2 shows the lattice structure of $B_{16}$, which is the free Boolean algebra $FB(a, b)$ created by two generators $a$ and $b$. Note that the elements of $B_{16}$ are therefore all the switching functions of the two "variables" $a$ and $b$. Therefore, a message bit of 0 is encoded by randomly choosing a function from the set of the $f$-type functions $f_i(X)$, and a message bit of 1 is encoded by randomly choosing a function from the set of the $g$-type functions $g_i(X)$. For example, the 4-bit message 0010 can be encoded by the sequence $f_3(X)\ f_2(X)\ g_4(X)\ f_3(X)$, where the functions $f_i(X)$ and $g_i(X)$ are arbitrarily or randomly selected from an available pool of 0-valued and 1-valued functions.
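The 16-element free Boolean algebra on two generators can be modelled concretely. In the sketch below, each element is stored as a 4-bit integer (the truth-table column of a switching function of the two generators), so that AND, OR, and complement become bitwise operations. The names `A`, `B`, `NOT`, `leq`, and `generated_algebra` are assumptions of this illustration.

```python
ONE, ZERO = 0b1111, 0b0000     # top and bottom elements of the lattice
A, B = 0b1100, 0b1010          # assumed encodings of the two generators

def NOT(x):
    """Complement within the 4-bit carrier."""
    return x ^ ONE

def leq(x, y):
    """Lattice order: x <= y exactly when x AND (NOT y) is the bottom element."""
    return x & NOT(y) == ZERO

def generated_algebra(generators):
    """Close a set of generators under AND, OR, and complement."""
    elements = {ZERO, ONE, *generators}
    while True:
        new = set(elements)
        for x in elements:
            new.add(NOT(x))
            for y in elements:
                new.add(x & y)
                new.add(x | y)
        if new == elements:
            return elements
        elements = new

elements = generated_algebra([A, B])
# An atom is a non-zero element with nothing strictly between it and zero.
atoms = sorted(
    x for x in elements
    if x != ZERO and all(y in (ZERO, x) for y in elements if leq(y, x))
)
print(len(elements), [format(x, "04b") for x in atoms])
```

Two generators yield four atoms (the four minterms of the generators) and hence 2^4 = 16 elements, each being a join of a subset of the atoms.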
The sent message now consists of a sequence of functions, of the forms appearing in (6) and (7), encrypting the original bits. Each equation in the pool of available equations (6) and (7) has a number of particular solutions, which we deliberately make arbitrarily large. However, we make sure that there is a single common particular solution for all the equations involving the 0-valued functions $f_i(X)$ and the 1-valued functions $g_i(X)$. This common solution is entrusted securely to the intended receiver. The job of the receiver is to trivially substitute this solution into the sequence of functions received, thereby converting it into the original sequence of 1's and 0's. However, it is a totally different story for the adversary, who needs to intercept the ciphered message in the first place and then try to evaluate the functions received at an arbitrarily selected value of $X$. Actually, the adversary will possibly have to substitute every $X$ in $B_{2^p}^n$ (the set of $n$-tuples over the algebra of $2^p$ elements) to evaluate functional values each belonging to $B_{2^p}$, and identify those values of $X$ for which all the functional values belong to $\{0, 1\}$ as candidate answers. If the adversary obtains a unique such value, it is the correct answer. Otherwise, a choice among the obtained values should be based on other considerations such as plausibility and context. Anyhow, the search space traversed always consists of $2^{pn}$ points. Typically, $p = 2^m$ ($m = 1, 2, 3, \ldots$), where $m$ is the number of generators used to generate the 'big' Boolean algebra as the free Boolean algebra $FB(a_1, a_2, a_3, \ldots, a_m)$, so that the search space has $(2^{2^m})^n$ points. The search space encountered by the adversary is much larger than that in a conventional SAT problem.
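The contrast between the receiver's single substitution and the adversary's exhaustive substitution can be sketched for the 16-element algebra with two variables. Everything specific here is hypothetical: the 4-bit encoding of elements, the key `KEY`, and the functions `f1`, `f2`, `g1` are invented for illustration and are not the functions of the paper's Tables 1 and 2.

```python
from itertools import product

ONE, ZERO = 0b1111, 0b0000     # the elements 1 and 0 of the 16-element algebra
A, B = 0b1100, 0b1010          # assumed 4-bit encodings of the generators a and b

def NOT(x):
    return x ^ ONE

KEY = (A, NOT(B))              # hypothetical common particular solution (X1, X2)

# Hypothetical pool members; XOR is the Boolean function (x AND NOT y) OR (NOT x AND y).
def f1(x1, x2): return (x1 ^ A) & x2                     # equals 0 at KEY
def f2(x1, x2): return (x2 ^ NOT(B)) & NOT(x1 & B)       # equals 0 at KEY
def g1(x1, x2): return NOT(x1 ^ A) | x2                  # equals 1 at KEY

CIPHER = [f1, f2, g1, f1]      # one possible encoding of the message 0010

def to_bits(values):
    return [1 if v == ONE else 0 for v in values]

def receiver_decrypt(cipher, key):
    """One substitution of the shared key recovers the bits."""
    return to_bits([fn(*key) for fn in cipher])

def adversary_search(cipher):
    """Try all 16 * 16 = 256 points; keep those giving only 0's and 1's."""
    candidates = {}
    for x in product(range(16), repeat=2):
        values = [fn(*x) for fn in cipher]
        if all(v in (ZERO, ONE) for v in values):
            candidates[x] = to_bits(values)
    return candidates

message = receiver_decrypt(CIPHER, KEY)
candidates = adversary_search(CIPHER)
distinct_readings = {tuple(bits) for bits in candidates.values()}
print(message, len(candidates), len(distinct_readings))
```

The receiver performs four function evaluations in total, whereas the adversary performs four evaluations at each of the 256 points and may still be left with several candidate readings to choose from.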

Computational Illustration
We construct a toy cryptosystem having a pool of four $f$-type (0-valued) functions and four $g$-type (1-valued) functions. Table 1 shows these eight functions, characterized by their complete sets of particular solutions, which are of variable cardinalities and have an intersection consisting of a single common particular solution (the key). The contents of Table 1 are kept totally secret and are not entrusted to anybody other than the sender. The explicit expressions or formulas of the functions are obtained as shown in Table 2, and a sequence comprising a few of these is sent on the (insecure) communication channel so as to encode a specific message. Along with these, the key is shared between the sender and receiver (once and for all) via some sort of secure channel. Now, we need to explore the jobs required of the sender, receiver, and adversary.
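A pool of this kind can be generated mechanically, as the following sketch suggests. It does not reproduce Tables 1 and 2 and does not use the inverse-problem method employed in the paper. Instead, it swaps in a simpler shortcut: each function is stored by the coefficients of its minterm expansion, and the coefficients are masked so that the function takes the required value at an assumed key. The key, the 4-bit encoding, and all names are hypothetical, and the sketch does not enforce that the key is the only common solution.

```python
import random
from itertools import product

ONE, ZERO = 0b1111, 0b0000
A, B = 0b1100, 0b1010              # assumed 4-bit encodings of the generators a and b

def NOT(x):
    return x ^ ONE

def minterm(point, x):
    """Minterm of the variables, e.g. point (1, 0) gives X1 AND NOT X2."""
    value = ONE
    for bit, xi in zip(point, x):
        value &= xi if bit else NOT(xi)
    return value

def evaluate(coeffs, x):
    """Minterm expansion: OR over all points of coefficient AND minterm."""
    result = ZERO
    for point, c in coeffs.items():
        result |= c & minterm(point, x)
    return result

def random_function(key, target, rng):
    """Random coefficients, masked so that the function equals `target` at `key`."""
    coeffs = {}
    for point in product((0, 1), repeat=len(key)):
        m = minterm(point, key)
        r = rng.randrange(16)
        # For target 0 each coefficient must lie below NOT m; for target 1 above m.
        coeffs[point] = (r & NOT(m)) if target == ZERO else (r | m)
    return coeffs

rng = random.Random(0)
KEY = (A, NOT(B))                  # hypothetical shared key
f_pool = [random_function(KEY, ZERO, rng) for _ in range(4)]
g_pool = [random_function(KEY, ONE, rng) for _ in range(4)]

message = [0, 0, 1, 0]
cipher = [rng.choice(g_pool if bit else f_pool) for bit in message]
decoded = [1 if evaluate(c, KEY) == ONE else 0 for c in cipher]

def solutions(coeffs, target):
    """Complete set of particular solutions, found by exhaustive substitution."""
    return {x for x in product(range(16), repeat=2) if evaluate(coeffs, x) == target}

common = set.intersection(*[solutions(c, ZERO) for c in f_pool],
                          *[solutions(c, ONE) for c in g_pool])
print(decoded, len(common))
```

The masking works because the minterms evaluated at any fixed point are pairwise disjoint and their join is 1, so forcing each coefficient below the complement of its minterm (or above its minterm) fixes the value of the whole expansion at the key.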

Job Required of the Sender
The sender needs to replace Table 1 by Table 2, i.e., to solve the inverse problem of Boolean equations (Rushdi and Albarakati, 2014). We now demonstrate this task by considering one of the $f$-type functions and one of the $g$-type functions. Let us consider the $f$-type Boolean equation $f_1(X) = 0$, where the function $f_1$ is specified according to Table 1 by its set of particular solutions of (6). The consistency condition associated with these particular solutions is assumed to be the identity (0 = 0). The function $f_1$ is expressed via the method of Rushdi and Albarakati (2014), and is then simplified as shown by the sequence of natural maps (VEKMs) in Figs. 3(a)-3(c).
The final expression for $f_1$ could be its minimal sum $f_1(X_1, X_2)$ = 1 ̅̅̅̅ ∨ ̅ 1 ∨ 2 (11), but we deliberately write it using the (more involved but easier to transmit) minterm expansion. Now let us consider the $g$-type Boolean equation $g_1(X) = 1$, where the function $g_1$ is specified by its set of particular solutions of (7). Again, under the assumption of a (0 = 0) consistency condition, the function $g_1$ is expressed via the method of Rushdi and Albarakati (2014).

Fig. 3. Three consecutive instances of the natural maps (VEKMs) of $f_1$.
which might be abbreviated as

Job Required of the Receiver
The receiver knows the key, i.e., the common particular solution for $X$, and receives the sequence of functions (17) (possibly abbreviated as in (18)) over the insecure channel. For the receiver, decryption is just a matter of a trivial single substitution of this value of $X$ in (17), which recovers the original message 0010.

Job Required of the Adversary
The adversary might try to substitute each of the 256 elements of $B_{16}^2$. The adversary will obtain at least a single 4-tuple of only 0's and 1's. Table 3

Discussion and Conclusions
Cryptography is the science of encrypting and decrypting data so as to allow secure transfer of information over space (transmission over a communication channel) or over time (storage within a computer memory). This paper serves as a first step towards an automated novel cryptosystem that is based on the utilization of a 'big' Boolean algebra, i.e., a finite (atomic) Boolean algebra other than the conventional two-valued one. The basic idea is to dramatically extend the search space needed in SAT-based cryptography. The adversary will not only be obliged to traverse a search space (that can be arbitrarily huge), but might end up with several candidate answers, all of which are wrong except one.
The paper demonstrated the feasibility of the proposed cryptosystem as well as the relative difficulty of breaking it. An automated version of the proposed cryptosystem is currently being built, and will be subsequently tested by subjecting it to various sorts of attacks. A forthcoming sequel of this paper will report the implementation of the proposed cryptosystem in arbitrarily big Boolean algebras, as well as techniques and results of cryptanalysis of this system.