Flexible and Efficient Multi-Keyword Ranked Searchable Attribute-Based Encryption Schemes

Abstract: Currently, cloud computing has become increasingly popular, and thus many people and institutions choose to put their data into the cloud instead of local environments. Given the massive amount of data and the fidelity of cloud servers, adequate security protection and efficient retrieval mechanisms for stored data have become critical problems. Attribute-based encryption brings the ability of fine-grained access control and can achieve a direct encrypted data search when combined with searchable encryption algorithms. However, most existing schemes support only single-keyword search or provide no ranked search results, which can be inflexible and inefficient in satisfying real-world needs. We propose a flexible multi-keyword ranked searchable attribute-based scheme using search trees to overcome the above-mentioned problems, allowing users to combine their fuzzy searching keywords with AND-OR logic gates. Moreover, our enhanced scheme not only improves privacy protection but also goes a step further to apply a semantic search to boost the flexibility and the searching experience of users. With the proposed index-table method and the tree-based searching algorithm, we demonstrate the efficiency and security of our schemes through a series of analyses and experiments.


Motivations
Cloud and IoT [1] services have become increasingly popular because of the rise in streaming services [2] and the development of machine learning, especially in the era of COVID-19. Outsourcing data to the cloud saves local storage space and lets users access and share their data without space or time limitations. However, since cloud service providers, or cloud servers for short, are not fully trustworthy, directly uploading sensitive data to the cloud is dangerous and undermines user privacy. Encrypting data before uploading them seems a safer approach. Nevertheless, in many situations, traditional public key encryption (PKE) [3] schemes achieve secrecy but lack proper access control. For example, in some cases, we want to authorize files to only a specified group of people. Under PKE, we must copy the files and encrypt each copy separately for every recipient. Moreover, the management of secret keys becomes increasingly cumbersome and difficult. This challenge is especially severe for medical and financial data because users have the right to decide who can review their sensitive medical and financial records. With attribute-based encryption (ABE) [4][5][6][7][8][9], we can make fine-grained access control much more manageable by allowing only people with specified attributes (i.e., conditions) to access and view the files.
In addition to access control, how to rapidly fetch the required data from the massive data stored in the cloud is also a critical issue. Downloading and decrypting all the data and then performing a search can reach the target, but it is not feasible because a massive amount of computation and storage is required on the user end. Apart from the excessive time overhead, these operations may be unsafe. Searchable encryption (SE) algorithms [10][11][12][13][14][15] bring reasonable solutions to this problem. Going a step further, combining the ABE and SE schemes gives users both fine-grained access control and searching capabilities over encrypted data.
Many searchable attribute-based encryption schemes (ABS) [16][17][18][19][20][21][22][23][24] have provided fine-grained access control, dynamic updates, and attribute revocations. However, the searching capabilities of most schemes are not powerful enough to fulfill actual needs. Usually, they can embed only a single keyword into ciphertexts, which is inconvenient and makes searching more cumbersome. Although some schemes allow for combining multiple keywords and provide ranked search results, users can only fetch files containing all the keywords. More complicated relationships between keywords, such as the disjunctive logic "OR", usually cannot be expressed. In addition, some advanced designs in searchable encryption algorithms have rarely been implemented on such systems. We summarize the standard advanced searching modes in Figure 1. The basic search mode is the keyword rank search, which does the exact match of single or multiple keywords. However, in practice, the user's input commonly contains typos or uses synonyms. As a result, two high-level search modes, fuzzy search and semantic search, are introduced to allow users to obtain the results without using the exact keyword.
To tackle the problems listed above, we proposed two flexible and efficient multi-keyword ranked searchable attribute-based encryption schemes (FEMRSABE), which are especially suitable for E-health applications. In our basic scheme, we designed a search tree data structure to enhance the expressiveness of the search, as shown in Figure 2. The server matches the trapdoors in leaf nodes with index files, traverses the tree, and derives the searching results of parent nodes by union or intersection. Finally, the aggregated search result of the root node is the final result, sorted according to the associated relevance score. The cloud server can only read the user-inputted logic structure but knows nothing about what users have searched.
In addition, inspired by [25], we built an index table to boost search efficiency. We replaced the symmetric key encryption mechanism with pure attribute-based encryption. Data owners do not need to exchange keys with users in advance, making the scheme more realistic. Our experiments show that searching with the index table is much faster than searching without it. We also provide fuzzy keyword searching ability by calculating the fingerprints of keywords. We follow the generating method and the similarity score in [11] to ensure the search range is manageable.
Figure 2. This work uses a tree-based data structure and AND-OR gates to complete a complicated keyword search task in the encryption domain. This is an example of an E-health use case.
Moreover, in our enhanced scheme, we reorganized the system architecture to minimize possible data leakages, such as the logical structure of search trees and the file list of a particular keyword. We further implemented the semantic search functionality with WordNet's help [26]. As a consequence, we considered the actual semantics of the keywords. Users only need to express their intention of searching without considering the constraints on the data owners' actual keywords and their perfect spellings. These advanced search modes make the search procedure more flexible and easier to use. The functionality comparisons in later sections show that our scheme has more desirable searching capabilities than other benchmarking searchable attribute-based encryption schemes.
Flexible and efficient multi-keyword ranked searchable attribute-based encryption schemes (FEMRSABE) target the E-health use case. Users equipped with IoT devices that collect body data, such as heart rate and body temperature, can upload their data to the server in encrypted form. Doctors or other healthcare professionals can access the data with appropriate permission. Most importantly, the system does not require the accessor to input exactly the same keyword that was used for encryption. With the benefit of fuzzy and semantic search, FEMRSABE can automatically match and discover the intended meaning of a query. This brings flexibility in that IoT providers and healthcare professionals do not need to negotiate the keyword beforehand, and different IoT devices can, over time, also pick a suitable keyword rather than being limited to a previous choice.
In the security aspect, our FEMRSABE system can defend against selective ciphertext-policy and chosen-plaintext attack (IND-SCP-CPA) by building it on the general bilinear map cryptographic techniques and the associated assumptions.

Contributions of This Work
We proposed a flexible and efficient scheme, FEMRSABE. The possible contributions of this work include the following:
Flexible Access and Searching Structure: We used linear secret-sharing schemes (LSSS) to build the basic data access structure, allowing data owners to express their data access policy by combining AND-OR logic gates as they wish. Furthermore, the conventional multi-keyword scheme can only find documents containing all the searching keywords. We designed a much more flexible tree structure so that users can express what they want to search by both conjunctive and disjunctive logic.
Ranked Searching Results: Following the techniques presented in [25], we built an index-table structure that can diminish the searching time and make ordered searching results possible. Users can obtain the most desired search results as soon as possible, avoiding unnecessary file decryption or filtering among many matched results.
Fuzzy and Semantic Search Mode: We further included advanced search mechanisms into the enhanced scheme, such as fuzzy and semantic search, by integrating with fingerprints introduced by [11]. Query keywords can now be inaccurate or have spelling errors, making it easier for users to obtain what they want.
Multi-Authority: Allowing the central authority to take over all the jobs of generating user keys is neither efficient nor secure. If the central authority shuts down, the whole system will be affected, which is called the "single-point" failure. We set up multiple attribute authorities to spread the traffic and generate intermediate user keys to solve this problem and shorten the key-generating time.

Organization
This paper is organized as follows. We review some related attribute-based and searchable encryption schemes in Section 2. Some preliminaries and cryptography backgrounds are addressed in Section 3. Section 4 defines the problem formally and depicts the proposed architecture, while Section 5 addresses our concrete constructions in detail. We present our schemes' performances and security levels through a series of experiments in Section 6. Finally, Section 7 concludes this write-up.

Attribute-Based Encryption
Attribute-based encryption (ABE) is a technique that allows data owners to declare their access policies such as: "(Doctor OR Researcher) AND (Chest OR Surgery)". Only data users who meet the policy's attribute requirements are qualified to access the files. For instance, users with the attributes "Doctor and Surgery" can read the text, but ones with "Doctor and Researcher" cannot. Most ABE schemes can be categorized into the following two classes: ciphertext-policy attribute-based encryption (CP-ABE) and key-policy attribute-based encryption (KP-ABE). Wang et al. [27] proposed a constant-size ciphertext KP-ABE scheme, while Waters et al. [4] proposed the first practical CP-ABE scheme. The main difference between KP-ABE and CP-ABE is that CP-ABE puts the access policy into ciphertexts while KP-ABE puts it into the users' secret keys. In CP-ABE schemes, data owners can easily decide who can access the files, so it is more suitable for cloud storage applications. Hence, we adopted it to construct our systems. Over time, more powerful ABE schemes have been developed. Li [7] proposed an attribute-revocable scheme, and Chi et al. [5] proposed a policy-hiding scheme to protect data owners' privacy further. In addition, most ABE schemes involve bilinear pairing operations, which are very time-consuming, especially for resource-restricted devices such as mobiles and IoT devices. Han et al. [6] proposed a decentralized scheme to reduce the burden of data users by outsourcing the corresponding computational tasks.
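The policy example above can be made concrete with a small sketch. It only evaluates the Boolean access policy against an attribute set in plaintext; it performs no encryption and is not the construction of any cited scheme. The tuple-based policy encoding and the function name `satisfies` are assumptions made for illustration.

```python
def satisfies(policy, attributes):
    """Return True if the attribute set meets the AND-OR policy tree.

    A policy is either an attribute name (str) or a tuple
    (gate, child, child, ...) with gate in {"AND", "OR"}.
    This encoding is hypothetical and used only for illustration.
    """
    if isinstance(policy, str):
        return policy in attributes
    gate, *children = policy
    results = (satisfies(child, attributes) for child in children)
    if gate == "AND":
        return all(results)
    if gate == "OR":
        return any(results)
    raise ValueError(f"unknown gate: {gate}")


# "(Doctor OR Researcher) AND (Chest OR Surgery)"
POLICY = ("AND", ("OR", "Doctor", "Researcher"), ("OR", "Chest", "Surgery"))

can_read = satisfies(POLICY, {"Doctor", "Surgery"})        # qualified user
cannot_read = satisfies(POLICY, {"Doctor", "Researcher"})  # unqualified user
```

In a CP-ABE scheme this check is enforced cryptographically: decryption succeeds only when the attributes in the user key satisfy the policy embedded in the ciphertext.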

Searchable Encryption
The main characteristic of searchable encryption (SE) is that it allows users to search over a large amount of encrypted data without decrypting the documents in the dataset. At a high level, data owners extract keywords from plaintext files to build a "Secure Index" and then encrypt the plaintexts with symmetric encryption schemes. Data owners transform searching keywords into corresponding trapdoors afterward. Finally, cloud servers match the Secure Indexes with the trapdoors to produce search results containing the target keywords the user is looking for.
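The index-and-trapdoor flow described above can be sketched as follows. This is a deliberately simplified illustration, assuming a keyed hash (HMAC-SHA256) as a stand-in for the trapdoor function; it is not the construction of this paper or of any cited scheme, and deterministic tokens of this kind leak search and access patterns. All names (`trapdoor`, `build_secure_index`, `server_search`) are hypothetical.

```python
import hashlib
import hmac


def trapdoor(key: bytes, keyword: str) -> str:
    """Keyed hash of a keyword; stands in for a real trapdoor."""
    return hmac.new(key, keyword.lower().encode("utf-8"), hashlib.sha256).hexdigest()


def build_secure_index(key: bytes, documents: dict) -> dict:
    """Map each document id to the set of keyed hashes of its keywords."""
    return {
        doc_id: {trapdoor(key, word) for word in words}
        for doc_id, words in documents.items()
    }


def server_search(secure_index: dict, token: str) -> list:
    """The server compares opaque tokens only; it never sees a keyword."""
    return sorted(doc_id for doc_id, tokens in secure_index.items() if token in tokens)


key = b"demo-key-for-illustration-only"
documents = {
    "rec1": ["fever", "cough"],
    "rec2": ["fracture", "surgery"],
    "rec3": ["fever", "surgery"],
}
index = build_secure_index(key, documents)
matches = server_search(index, trapdoor(key, "fever"))
```

A trapdoor generated under a different key matches nothing, which is the property that keeps the server from testing keywords of its own choosing.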
For this purpose, there are many ways to build the Secure Index described above. Most of the existing SE schemes involve calculating the term frequency-inverse document frequency (TF-IDF) values of keywords. Cao et al. [28] and Tzouramanis et al. [12] both use the K-nearest neighbors (KNN) method to build the Secure Index. It is effective; however, the associated neighborhood-related matrix becomes too large, and therefore the associated operations become time-consuming, when too many keywords are involved in the system. Other methods include secure random masking, tree-based, and secure linked-list ones. The scheme proposed by Zhang et al. [25] used the secure linked-list method to build an index table, which we also adopted in our work for its efficiency.
Many functional search schemes have been developed to provide a more powerful search capability. For example, Wang et al. [13] proposed a tree-based method to provide range search. It is especially suitable for numerical datasets such as financial records. Aritomo et al. [29] and Fu et al. [10] both achieved semantic-based searching, while Zhang et al. [15] provided an efficient predicate search. Liu et al. [11] proposed a robust scheme combining semantic and fuzzy searches using fingerprint methods, which will also be adopted in our schemes. However, this scheme did not take any access control mechanism into account. They used fully homomorphic encryption (FHE) schemes [30][31][32] to encrypt the index table instead. Due to complexity considerations, we have not considered FHE schemes in our current system implementation. However, FHE schemes have considerable potential for constructing effective ABE schemes if the required complexity can be handled properly. An FHE-based ABE approach is promising and can reduce the storage requirement of ciphertexts. We leave it to our future investigations.
Cryptography 2023, 7, 28

Searchable ABE Schemes
Many ABE schemes have searching abilities. For this kind of scheme, it is crucial to allow only the qualified files to be searched. Otherwise, malicious users may launch keyword attacks to guess the contents of files and breach privacy. In addition, users waste time on failed attempts to decrypt unqualified files. Sun et al. [22] proposed a well-known searchable attribute-based scheme (ABKS) that hides the access policy. However, they use AND gates as the access structure for policy hiding, which limits the access policy's expressiveness. Wang et al. [23] proposed a scheme that is aimed at E-health applications. They achieve a constant computational overhead, constant storage overhead, and policy hiding by hashing user attributes and keywords. However, the flexibility of both the access policy and the search is restricted by the scheme's data structures. Moreover, they directly embed keyword hashes into ciphertexts, so matching search results takes much time when there are many files in the dataset, and only a single keyword can be used at a time. Miao et al. [21] and Sun et al. [33] proposed ABKS schemes with the ability for attribute revocations. Nevertheless, the searching capabilities of these schemes are weak because users can only use a single keyword at a time without modifying the protocols.

Bilinear Pairing
Following the definitions in [33], let $G$ and $G_T$ be two multiplicative cyclic groups of prime order $p$, let $g$ be a generator of $G$, and let $e: G \times G \to G_T$ be a map. The map $e$ is a bilinear pairing if the following properties hold.

1. Bilinearity: For all $x, y \in G$ and all $s, t \in Z_p$, $e(x^s, y^t) = e(x, y)^{st}$ holds. That is, the exponentiation operations inside pairings can be moved outside directly.

2. Non-degeneracy: $e(g, g) \neq 1$.

3. Computability: For all $x, y \in G$, $e(x, y)$ and any additive or multiplicative operations on it can be efficiently computed.
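The bilinearity property can be checked numerically on a toy model. The sketch below is an assumption-laden illustration, not a usable pairing: elements of $G$ are stored as their discrete logarithms (so the model is completely insecure), $G_T$ is the order-1019 subgroup of $Z_{2039}^*$, and the names `g_pow` and `pair` are hypothetical. Real schemes use pairing-friendly elliptic curves instead.

```python
P = 1019        # prime order of the toy groups G and G_T
Q = 2 * P + 1   # 2039 is prime, so Z_Q^* contains a subgroup of order P
GT_GEN = 4      # 4 is a square mod Q and not 1, hence it has order P


def g_pow(x: int, s: int) -> int:
    """x^s in the toy group G, where x is stored as its discrete log."""
    return (x * s) % P


def pair(x: int, y: int) -> int:
    """Toy pairing: e(g^a, g^b) = GT_GEN^(a*b) in Z_Q^*."""
    return pow(GT_GEN, (x * y) % P, Q)


x, y = 17, 305   # two elements of G (as discrete logs)
s, t = 123, 456  # two exponents in Z_P

lhs = pair(g_pow(x, s), g_pow(y, t))   # e(x^s, y^t)
rhs = pow(pair(x, y), s * t, Q)        # e(x, y)^(s*t)
non_degenerate = pair(1, 1) != 1       # e(g, g) != 1
```

Both sides reduce to `GT_GEN` raised to `x*y*s*t mod P`, which is exactly the statement that exponents inside the pairing can be moved outside.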

Access Structure
By definition in [4]: Let $\{P_1, P_2, \ldots, P_n\}$ be a set of parties. A collection $A \subseteq 2^{\{P_1, P_2, \ldots, P_n\}}$ is monotone if $\forall B, C$: if $B \in A$ and $B \subseteq C$, then $C \in A$. An access structure (respectively, monotone access structure) is a collection (respectively, monotone collection), $A$, of non-empty subsets of $\{P_1, P_2, \ldots, P_n\}$, i.e., $A \subseteq 2^{\{P_1, P_2, \ldots, P_n\}} \setminus \{\emptyset\}$. The sets in $A$ are called the authorized sets, and the sets not in $A$ are called the unauthorized sets.
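The monotonicity condition can be checked by brute force on a small example. The sketch below enumerates every subset of the parties, so it is only suitable for a handful of parties; the helper names `powerset` and `is_monotone` and the party labels are illustrative assumptions.

```python
from itertools import combinations


def powerset(parties):
    """Yield every subset of the parties as a frozenset."""
    items = sorted(parties)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def is_monotone(collection, parties):
    """True if every superset (within parties) of an authorized set is authorized."""
    authorized = set(collection)
    return all(
        c in authorized
        for b in authorized
        for c in powerset(parties)
        if b <= c
    )


parties = {"P1", "P2", "P3"}

# Authorized sets for "P1 AND (P2 OR P3)": adding parties never revokes access.
monotone_structure = [
    frozenset({"P1", "P2"}),
    frozenset({"P1", "P3"}),
    frozenset({"P1", "P2", "P3"}),
]

# Not monotone: {P1} is authorized but its superset {P1, P2} is not.
non_monotone_structure = [frozenset({"P1"})]

ok = is_monotone(monotone_structure, parties)
bad = is_monotone(non_monotone_structure, parties)
```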

Linear Secret-Sharing Schemes
We choose the linear secret-sharing schemes as our access structure due to their full expressiveness in the access policy. Some papers [16,18,19,22,23] use the AND gate to bring efficiencies and policy-hiding capabilities. However, they do not apply to disjunctive operators. Thus, the flexibility of the access policy is quite limited.
The definition of a linear secret-sharing scheme can be found in [34]:

Definition 1. Linear Secret-Sharing Schemes (LSSS)
A secret-sharing scheme $\Pi$ over a set of parties $P$ is called linear over $Z_p$ if

1. The shares for each party form a vector over $Z_p$.

2. There exists a matrix $M$ with $l$ rows and $n$ columns, called the share-generating matrix for $\Pi$. For the $i$-th row of $M$, $i = 1, \ldots, l$, we let the function $\rho$ define the party labeling row $i$ as $\rho(i)$. When we consider the column vector $v = (s, r_2, \ldots, r_n)$, where $s \in Z_p$ is the secret to be shared, and $r_2, \ldots, r_n \in Z_p$ are randomly chosen, then $M \cdot v$ is the vector representing the $l$ shares of the secret $s$ according to the scheme $\Pi$. The share $(M \cdot v)_i$ belongs to party $\rho(i)$.
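As a worked illustration of Definition 1, the sketch below shares a secret under the policy "A AND (B OR C)" and reconstructs it from an authorized set. The matrix, the labeling `RHO`, the modulus, and the reconstruction coefficients are hand-picked assumptions for this one policy; a real implementation would derive the matrix from the policy and solve for the coefficients.

```python
import secrets

P = 2**61 - 1  # a Mersenne prime, standing in for the group order p

# Share-generating matrix M (l = 3 rows, n = 2 columns) for  A AND (B OR C).
M = [
    [1, 1],      # row 0, labeled rho(0) = "A"
    [0, P - 1],  # row 1, labeled rho(1) = "B"  (P - 1 is -1 mod P)
    [0, P - 1],  # row 2, labeled rho(2) = "C"
]
RHO = ["A", "B", "C"]


def make_shares(secret, matrix=M, p=P):
    """Compute M * v mod p with v = (s, r_2, ..., r_n) and random r_i."""
    n = len(matrix[0])
    v = [secret % p] + [secrets.randbelow(p) for _ in range(n - 1)]
    return [sum(m * x for m, x in zip(row, v)) % p for row in matrix]


def reconstruct(shares, coefficients, p=P):
    """Recombine shares with coefficients {row_index: omega_i} chosen so that
    sum_i omega_i * M_i = (1, 0, ..., 0)."""
    return sum(w * shares[i] for i, w in coefficients.items()) % p


secret = 123456789
shares = make_shares(secret)

# {A, B} is authorized: (1, 1) + (0, -1) = (1, 0).
from_a_b = reconstruct(shares, {0: 1, 1: 1})
# {A, C} is authorized for the same reason.
from_a_c = reconstruct(shares, {0: 1, 2: 1})
```

The set {B, C} is unauthorized because no combination of the rows (0, -1) and (0, -1) yields (1, 0), so those two shares alone reveal nothing about the secret.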

Relevance Score
We use the TF×IDF measurement, which has been widely adopted in many data mining and searchable encryption schemes, to express the relevance between the keyword, w, and the document, F. Term frequency (TF) represents the frequency of a keyword in the file. Nevertheless, TF values alone are insufficient because some common words, such as prepositions, are usually not what users want to search for, even if they have high occurrence frequencies in the text. Inverse document frequency (IDF) brings the solution. Interested readers can find the definitions of TF and IDF in [11].
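The effect of the IDF term can be seen in a short sketch. It assumes one common variant of the measure, TF = (occurrences of w in F) / (length of F) and IDF = ln(N / number of documents containing w); the exact definitions used in this work are those of [11] and may differ. The function names and the sample corpus are illustrative.

```python
import math


def term_frequency(word, doc):
    """Fraction of the tokens in doc that equal word."""
    return doc.count(word) / len(doc) if doc else 0.0


def inverse_document_frequency(word, corpus):
    """ln(N / df), where df is the number of documents containing word."""
    df = sum(1 for doc in corpus if word in doc)
    return math.log(len(corpus) / df) if df else 0.0


def relevance(word, doc, corpus):
    """TF x IDF relevance score of word for doc (one common variant)."""
    return term_frequency(word, doc) * inverse_document_frequency(word, corpus)


corpus = [
    ["the", "heart", "rate", "of", "the", "patient"],
    ["the", "body", "temperature"],
    ["the", "heart", "surgery"],
]

score_the = relevance("the", corpus[0], corpus)      # appears in every document
score_heart = relevance("heart", corpus[0], corpus)  # appears in two documents
score_rate = relevance("rate", corpus[0], corpus)    # appears in one document
```

Although "the" is the most frequent token in the first document, its score is zero because it occurs in every document, while the rarer keyword "rate" scores highest.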

Threat Model
There are several players (or parties) in the investigated systems. Their role and the threat model are listed below.
Central Authority (CA): The central authority (CA) sets up the system and verifies intermediate user keys obtained from attribute authorities. After that, the CA produces the final user keys based on the master key generated by itself. In addition, the CA delivers the public key to the other parties. Notice that the CA is believed to be entirely trustworthy in most schemes and our systems.
Attribute Authority (AA): An attribute authority (AA) is equipped with the necessary cryptographic techniques and accepts requests from data users to generate user keys. AAs verify the attributes the data users provide and generate intermediate user keys accordingly. They are also assumed to be honest: they do not misbehave in the process of KeyGen and will not collude with data users.
Data Owner (DO): Data owners may be patients in a medical application. They extract some keywords from their medical records to build the Secure Index. After that, they upload encrypted data and the Secure Index to the cloud server. We also assume that DOs are fully credible. They will correctly extract keywords and perform succeeding encryption to the accessible files themselves.
Cloud Server (CSP): The cloud server provides storage to the encrypted files and performs encryption-domain searches. Their threat model is assumed to be honest but curious once again. They will honestly execute protocols but may attempt to obtain documents and keywords in plaintext form through statistical analyses. They are also interested in finding trapdoors uploaded by users, trying to guess what users are searching for, and tracing their search records.
Data User (DU): Data users may be doctors or researchers in an E-health application scenario. They request the encrypted files by transforming the searching keywords into respective trapdoors to perform searching. They may want to access or guess the contents of unqualified data by selective keyword attacks. However, they do not leak decrypted data to other unauthorized users.

System Architecture
Figure 3 shows the players, the functional blocks, and the detailed information flow of the proposed system. As Figure 3 shows, nine polynomial-time algorithms (PTAs), as listed below, compose our system. Table 1 lists the symbols used in this write-up.
Setup $(1^K, U) \to (PK, MK)$: The CA runs the setup algorithm and generates the master key pair. It delivers the public key, $PK$, to the other parties and keeps the master key, $MK$, for itself.

AuthoritySetup $(aaid, MK) \to (MK_{Auth}, PK_{Auth})$: The CA executes the authority setup algorithm to set up all the AAs. It grants the authority master key, $MK_{Auth}$, and the authority public key, $PK_{Auth}$, to each AA.
IntermediateKeyGen $(PK, uid, S, MK_{Auth}, PK_{Auth}) \to ik$: The AA verifies the user attribute set, $S$, and runs the intermediate key generation algorithm to generate the intermediate user secret key, $ik$, using its authority key pair.
KeyGen $(PK, MK, S, ik) \to uk$: The CA verifies the validity of the intermediate user key, $ik$, and then generates the final user secret key, $uk$, by the key generation algorithm.
BuildIndex $(PK, W) \to (Ind, TF_P)$: DOs build an index table for each keyword, $w$, in the keyword set, $W$. In addition, they run a fingerprint generation algorithm to support fuzzy matching and build a fingerprint lookup table as one of the outputs. Figure 4 shows the data structure used to construct our index table.
Encrypt $(PK, P, W, sk_f, sk_t) \to Ct$: DOs extract keywords from the plaintext to obtain the keyword list, $W$, and then input the public key, $PK$, the access policy, $P$, and the session key, $sk_f$, to the encryption algorithm to generate the ciphertext. Finally, the algorithm encrypts the tables with $sk_t$. DUs recover the session keys and decrypt the files and tables associated with this ciphertext.
GenTrapdoor $(PK, Str_{Search}, uk) \to Td$: DUs use the user key, $uk$, the public key, $PK$, and the search condition, $Str_{Search}$, to generate the trapdoor, $Td$, with the trapdoor-generating algorithm. This algorithm has two phases. In the first phase, DUs obtain the hash values of the most proper keywords using the fingerprint-matching algorithm. A search tree, $Tree_p$, is constructed according to $Str_{Search}$ and the hash values. Each keyword, $w$, in $Tree_p$ is converted into a corresponding trapdoor, $Td$. In the second phase, all leaf nodes in $Tree_p$ are replaced by $Td$ to become an encrypted search tree, $Tree_e$.
Search $(Tree_e, Ind, k) \to SR_{ranked}$: The CSP parses the encrypted search tree, $Tree_e$, and executes the search algorithm to match $Td$ with $Ind$ to obtain the searching result, $SR$. The CSP sorts $SR$ and outputs the top-$k$ files as the final search result, $SR_{ranked}$. In our enhanced scheme, the CSP only matches the trapdoor, leaving the jobs of traversing search trees and ranking to DUs to ensure better data privacy.
Decrypt $(uk, Ct, SR_{ranked}) \to PF$: DUs input their user key, $uk$, the ciphertext, $Ct$, and the ranked searching result, $SR_{ranked}$, to the decryption algorithm to obtain the plaintext files, $PF$s.
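The tree traversal performed by the Search algorithm can be sketched on plaintext stand-ins. In the sketch below, leaves hold opaque trapdoor strings, the index maps each trapdoor to per-file relevance scores, AND nodes intersect and OR nodes unite the children's file sets. Summing the children's scores at each gate is an assumption made for illustration, since the aggregation rule is not stated in this section, and all names and values are hypothetical.

```python
def evaluate(node, index):
    """Return {file_id: score} for a search-tree node.

    A node is either a trapdoor string (leaf) or a tuple
    (gate, child, child, ...) with gate in {"AND", "OR"}.
    """
    if isinstance(node, str):
        return dict(index.get(node, {}))
    gate, *children = node
    results = [evaluate(child, index) for child in children]
    file_sets = [set(result) for result in results]
    if gate == "AND":
        files = set.intersection(*file_sets)
    elif gate == "OR":
        files = set.union(*file_sets)
    else:
        raise ValueError(f"unknown gate: {gate}")
    # Assumption: a file's score at a gate is the sum of its child scores.
    return {f: sum(result.get(f, 0.0) for result in results) for f in files}


def search(tree, index, k):
    """Rank the root result by descending score and keep the top k files."""
    scores = evaluate(tree, index)
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:k]


# Index table: trapdoor -> {file_id: relevance score} (illustrative values).
index = {
    "td_fever": {"f1": 0.5, "f2": 0.2},
    "td_cough": {"f2": 0.4, "f3": 0.1},
    "td_chest": {"f1": 0.3, "f2": 0.3},
}

# (fever OR cough) AND chest, expressed over trapdoors only.
tree = ("AND", ("OR", "td_fever", "td_cough"), "td_chest")
top_files = [file_id for file_id, _ in search(tree, index, k=2)]
```

File f3 matches the OR branch but is dropped by the AND gate because it lacks the third trapdoor, which shows how intersection at a parent node narrows the children's results.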

Security Model
The security model of the proposed system is built on general bilinear map cryptographic techniques and the associated assumptions. As addressed in the following paragraphs, we designed a security game to explore our system's security level. It shows that our system can defend against selective ciphertext-policy and chosen-plaintext attack (IND-SCP-CPA).
The Ciphertext-domain Keyword Privacy Game.
Init: First, $\mathcal{A}$ delivers the challenge access matrix $A^*$ to $\mathcal{B}$.
Setup: $\mathcal{B}$ runs the same Setup algorithm as in the keyword privacy game.
Phase I: $\mathcal{B}$ provides an oracle, $O_{SK}$, for queries and builds a secret key list, $Lst_{SK}$, to hold the query results. The oracle functions as follows:
− $O_{SK}(uid, S)$: $\mathcal{A}$ submits uid and the user attribute set, S, to obtain the corresponding user key, $SK_{uid,S}$. Notice that the set S sent by $\mathcal{A}$ cannot satisfy the access structure, $A^*$. If $SK_{uid,S}$ is already in the secret key list, $Lst_{SK}$, $\mathcal{B}$ looks it up and returns the result directly. Otherwise, $\mathcal{B}$ executes the key-generating algorithm and inserts the result into the list.
Challenge: $\mathcal{A}$ prepares two equal-length messages, $m_0$ and $m_1$, for the challenge. $\mathcal{B}$ then picks a random bit, $b \in \{0, 1\}$, and encrypts $m_b$ under $A^*$. Finally, $\mathcal{B}$ sends the ciphertext, $CT^*$, back to $\mathcal{A}$.
Phase II: $\mathcal{A}$ can continue to issue queries after receiving $CT^*$. The operation is the same as in Phase I.
Guess: $\mathcal{A}$ outputs a guess, $b'$, of the bit b. If $b' = b$, $\mathcal{A}$ wins the security game.
The advantage of $\mathcal{A}$ in winning the security game is $Adv_{\mathcal{A}} = \Pr[b' = b] - \frac{1}{2}$. Our system is IND-SCP-CPA secure if every polynomial-time adversary has at most a negligible advantage in the security game above.
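The challenge-and-guess structure of the game can be illustrated with a small Monte Carlo experiment. The sketch below is not the paper's scheme: it omits the Init, Setup, and key-query phases and uses two toy stand-in ciphers (`otp_encrypt` and a deliberately broken `leaky_encrypt`, both hypothetical names) only to show how the advantage $|\Pr[b' = b] - 1/2|$ is estimated.

```python
import secrets

def play_game(encrypt, adversary_guess, trials=2000):
    """Estimate |Pr[b' = b] - 1/2| for a simplified challenge/guess game."""
    wins = 0
    for _ in range(trials):
        m0, m1 = b"report-A", b"report-B"   # two equal-length challenge messages
        b = secrets.randbelow(2)            # the challenger's hidden bit
        ciphertext = encrypt(m1 if b else m0)
        wins += int(adversary_guess(m0, m1, ciphertext) == b)
    return abs(wins / trials - 0.5)

def otp_encrypt(message):
    """Toy one-time pad with a fresh pad per call (stand-in for a secure scheme)."""
    pad = secrets.token_bytes(len(message))
    return bytes(x ^ y for x, y in zip(message, pad))

def leaky_encrypt(message):
    """Deliberately broken toy scheme: the last byte is sent in the clear."""
    return otp_encrypt(message[:-1]) + message[-1:]

def last_byte_adversary(m0, m1, ciphertext):
    """Guess 1 if the ciphertext ends like m1, otherwise guess 0."""
    return 1 if ciphertext[-1:] == m1[-1:] else 0
```

Against the leaky cipher this adversary always wins, so its estimated advantage is 1/2, while against the one-time pad the estimate stays close to 0, which is the behavior the security definition demands of every polynomial-time adversary.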

Construction of the Basic FERMSABE Scheme
With the pre-described nine PTAs, the basic FERMSABE system can be constructed as follows.
Step 1. The CA sets up the security parameter, K, and the global parameters $(G_1, G_T, e)$, where $e: G_1 \times G_1 \to G_T$ is the pairing operation. Then, the CA generates three generators, $g$, $g_0$, and $g_1$, of the finite group $G_1$. The Setup algorithm randomly chooses $a_0$, $a_1$, $b_0$, and $x$ from $Z_p$ and chooses $v_x$ for each attribute in the universe. The rest of the public and the master keys are organized as follows.
After that, the CA publishes the master key pair to other parties. The CA further defines a hash function, H(x) : {0, 1} * → Z p , to map keywords into elements of Z p .
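A hash function of the form $H: \{0, 1\}^* \to Z_p$ can be sketched as follows. The paper does not say how H is instantiated, so the use of SHA-256 followed by a modular reduction, the function name `hash_to_zp`, and the stand-in prime `P` (a 127-bit Mersenne prime rather than the 160-bit order used in the experiments) are all assumptions made for illustration.

```python
import hashlib

# Stand-in prime group order (2^127 - 1 is a Mersenne prime).
# This is an assumption for illustration, not the paper's 160-bit p.
P = 2**127 - 1

def hash_to_zp(keyword: str, p: int = P) -> int:
    """Map an arbitrary keyword string to an element of Z_p."""
    digest = hashlib.sha256(keyword.encode("utf-8")).digest()
    # The 256-bit digest is much longer than p, so the bias introduced
    # by the modular reduction is negligible.
    return int.from_bytes(digest, "big") % p
```

The mapping is deterministic, so a DO and a DU who hash the same keyword obtain the same element of $Z_p$ without exchanging the keyword itself.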
Step 2. The CA sets up each AA and grants the authority key pair, $PK_{Auth}$ and $MK_{Auth}$, to the authority with the identifier aaid. The AuthoritySetup algorithm generates a random element, t, from $Z_p$, and the authority key pair comprises $PK_{Auth} = g^t$ and $MK_{Auth} = t$.
Step 3. When a user requests a user key, the corresponding AA runs the IntermediateKeyGen algorithm to generate the intermediate user key using its authority key pair. The AA randomly picks $\alpha$ from $Z_p$ and sends this value to the CA. The intermediate user key, ik, is generated as $ik = (K_0 = g^{t \cdot a_0}, K_1 = g^{t \cdot \alpha})$. The AA then sends ik to the CA to generate the final user key.
Step 4. The CA verifies the validity of the intermediate user key, ik, and then uses it to run the KeyGen algorithm to generate the final user secret key set, uk, which is composed of seven components. The CA chooses $\mu_0$ and $u$ from $Z_p$. The first six components of uk are: $(1/\alpha) \cdot u$, $K_2 = g^{\mu_0}$, $K_3 = u$, $K_4 = g^{x_2/u}$, and $K_5 = g_0^{a_0} \cdot g_1^{\mu_0}$. Notice that $x_1$ and $x_2$ are random elements taken from $Z_p$ such that $x_1 + x_2 = x$. The CA generates $K_x$ for each attribute in S, that is, $K_x = H_x^{\mu_0}$. The final user key, $uk = (K_0, K_1, K_2, K_3, K_4, K_5, \{K_x\}_{x \in S})$, is sent back to the data user.
Step 5. DOs build an index table, Ind, based on keywords extracted from plaintext files. Our BuildIndex algorithm follows the approach presented in [35]. Figure 4 depicts the data structure of our index table, where the fields in each block of the linked list represent:
− $Id_{F_j}$: The identifier of the file, j, which contains the keyword, i.
− $S_{ij}$: The relevance score of the keyword, i, and the file, j. Notice that the blocks are deliberately not sorted by this score, so that their order does not reveal the ranking.
− $r_{ij}$: Random strings of the same length. We use this field to prevent producing two identical blocks.
− Padding values: We add padding values to every linked list to make them of the same size. This setting implies that some linked lists composed of all padding values may also be appended to the table.
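The block layout described above can be sketched as a small data structure. This is only an illustrative plaintext model under stated assumptions: the names `Block` and `build_index`, the 16-byte nonce length, and the explicit `is_padding` flag are hypothetical, and in the actual scheme the table is encrypted so that padding cannot be told apart from real blocks.

```python
import secrets
from dataclasses import dataclass

@dataclass(frozen=True)
class Block:
    file_id: str              # Id_Fj, or a dummy value for padding blocks
    score: float              # S_ij, relevance of keyword i to file j
    nonce: bytes              # r_ij, fixed-length random string
    is_padding: bool = False  # visible here only because this is a plaintext model

def build_index(postings, list_len, dummy_lists=0, nonce_len=16):
    """postings maps a keyword hash to a list of (file_id, score) pairs."""
    def pad_block():
        return Block("PAD", 0.0, secrets.token_bytes(nonce_len), True)

    rng = secrets.SystemRandom()
    table = {}
    for keyword_hash, entries in postings.items():
        if len(entries) > list_len:
            raise ValueError("list_len is smaller than a posting list")
        blocks = [Block(fid, s, secrets.token_bytes(nonce_len)) for fid, s in entries]
        blocks += [pad_block() for _ in range(list_len - len(blocks))]
        rng.shuffle(blocks)   # blocks are not kept in score order
        table[keyword_hash] = blocks
    for _ in range(dummy_lists):
        # Lists made only of padding, stored under random dummy keys.
        table[secrets.randbits(64)] = [pad_block() for _ in range(list_len)]
    return table
```

Every list ends up with exactly `list_len` blocks, so the list length alone does not show how many files contain a given keyword.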
Furthermore, DOs build a fingerprint table to support fuzzy search. Figure 5 illustrates the structure of our fingerprint table, and the corresponding generation algorithm can be found in [15]. We store the hash value of a keyword instead of the keyword itself to prevent DUs from learning the keywords of DOs directly. The hash value alone is enough for the subsequent matching and searching tasks.
In addition to these tables, the DO needs to put some extra data into the headers of Ind to allow the cloud server to perform matching. These header values are the components $I_0$, $I_{1,x}$, and $I_2$ that the CSP checks in Step 8. Finally, the DO uploads the encrypted Ind and ciphertexts to the cloud server.
Step 6. DOs extract keywords from the plaintext files, PF, to build the keyword list, W, and input the public key, PK, the access policy, P, and the session keys, $sk_f$ and $sk_t$, to the Encrypt algorithm. The former is used to encrypt PF, and the latter is used to encrypt $T_{FP}$ and Ind by a symmetric encryption algorithm such as AES. They choose two elements, $s$ and $s'$, from $Z_p$ for secret sharing and, respectively, build the secret sharing vectors, $\lambda_x$ and $\lambda'_x$, for $x \in \rho(i)$ by an LSSS scheme. They further compute $C_0 = sk_f \cdot e(g, g)^{x \cdot s}$, $C_1 = g^s$, $C_x = \{g^{a_0 \cdot \lambda_x}\}_{x \in \rho(i)}$, $C_2 = sk_t \cdot e(g, g)^{x \cdot s'}$, and $C_3 = g^{s'}$. Finally, they send the ciphertext pack, which includes $Enc_{sk_t}(T_{FP})$ and $Enc_{sk_t}(Ind)$, to the CSP.
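The LSSS sharing used in this step can be illustrated with a minimal sketch over $Z_p$. The share matrix for a two-attribute AND policy, the helper names `lsss_shares` and `reconstruct`, and the stand-in prime `P` are assumptions chosen for illustration; the paper does not give its concrete matrices.

```python
import secrets

P = 2**127 - 1  # stand-in prime order (assumption, not the paper's p)

def lsss_shares(matrix, secret, p=P):
    """Compute lambda_i = M_i . v for v = (secret, y_2, ..., y_n) with random y_k."""
    n = len(matrix[0])
    v = [secret] + [secrets.randbelow(p) for _ in range(n - 1)]
    return [sum(m * x for m, x in zip(row, v)) % p for row in matrix]

def reconstruct(shares, omegas, p=P):
    """Recombine shares with reconstruction coefficients omega_i."""
    return sum(w * lam for w, lam in zip(omegas, shares)) % p

# Policy "A AND B": rows (1, 1) and (0, -1) sum to (1, 0), so omega = (1, 1).
M_AND = [[1, 1], [0, -1]]
# Policy "A OR B": each row is (1), so every share equals the secret.
M_OR = [[1], [1]]
```

For the AND policy the two shares are $s + y_2$ and $-y_2$; neither reveals $s$ alone, but their sum does, which mirrors how only a user whose attributes satisfy the policy can recombine the exponents in the ciphertext.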
Step 7. DUs first download the ciphertext pack from the CSP and decrypt Ind and $T_{FP}$ with uk by the Decrypt algorithm. If DUs own the right user key, $sk_t$ can be recovered to decrypt these tables correctly. Otherwise, the algorithm halts. Using the fuzzy matching algorithm presented in [15], DUs find the fingerprint that best matches the fingerprint of each input keyword. We additionally set a matching threshold of 0.7: if the relevance score between the best-matched fingerprint and the query fingerprint is lower than this threshold, the match is discarded, and the corresponding leaf node is removed to prevent fetching unrelated documents. Second, DUs look up $T_{FP}$ to obtain the best-matching hash value, $H(w')$. After that, DUs parse $Str_{Search}$ to build a search tree, as shown in Figure 6. Finally, DUs choose a random element, $\gamma_u$, from $Z_p$ to blind all the values on the leaf nodes. That is, using the GenTrapdoor algorithm, DUs compute the two trapdoor values, $T_0$ and $T_1$, for each leaf keyword.
DUs replace the plaintext-domain keywords to be searched with these two values at the corresponding locations to produce $T_W$ for searching. Eventually, DUs provide $T_W$ and the decrypted Ind to the cloud server.
Figure 6. The query keyword tree in plaintext form. This tree is generated for the condition (breath OR fever) AND (pressure OR acute). Notice that this figure is for demonstration purposes only. In actuality, DUs need not know which keywords they have precisely matched.
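The threshold rule in Step 7 can be sketched as follows. The fingerprint algorithm of [15] is not reproduced here; a character-bigram Jaccard similarity is swapped in purely as a stand-in, and the names `similarity` and `best_match` are hypothetical. The sketch also compares plaintext keywords for readability, whereas in the scheme DUs only handle fingerprints and hash values.

```python
MATCH_THRESHOLD = 0.7  # threshold value stated in the text

def bigrams(word):
    """Set of adjacent character pairs of a lower-cased word."""
    w = word.lower()
    return {w[i:i + 2] for i in range(len(w) - 1)}

def similarity(a, b):
    """Jaccard similarity of bigram sets (stand-in for the fingerprint score)."""
    x, y = bigrams(a), bigrams(b)
    return len(x & y) / len(x | y) if (x | y) else 0.0

def best_match(query, keywords, threshold=MATCH_THRESHOLD):
    """Return the best-matching keyword, or None if the leaf should be removed."""
    best = max(keywords, key=lambda k: similarity(query, k), default=None)
    if best is None or similarity(query, best) < threshold:
        return None   # below threshold: discard the match and drop the leaf node
    return best
```

With the keyword set from Figure 6, the misspelled query "presure" still resolves to "pressure" (similarity 5/6), while an unrelated query such as "zebra" falls below the threshold and its leaf is dropped.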
Step 8. The CSP first parses the encrypted search tree, $Tree_e$. Then, it matches each Td in $Tree_e$ against the header information of each entry in Ind. In other words, it checks whether $I_2 \cdot e(T_0, I_0 \cdot \prod_{x \in S} I_{1,x}) = e(C_1, T_1)$. If an index entry satisfies this condition, all the document indexes stored in the corresponding linked list are appended to the tree node. Notice that we only need to compute the right-hand side once because it is fixed. Therefore, our Search algorithm is quite efficient. After all the leaf nodes are searched, the CSP takes the intersection or union of the child nodes' search results as the search result of each parent node, depending on whether the parent is an AND node or an OR node. Finally, the CSP sorts the searching results, SR, in the root node and outputs the top-k files as the final searching result, $SR_{ranked}$. Then, $SR_{ranked}$ is sent back to the DUs.
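The bottom-up combination of leaf results and the final top-k ranking can be sketched as below. The tuple-based tree encoding, the function names, and the choice to add the relevance scores of the children when combining nodes are assumptions for illustration; the text specifies only intersection for AND, union for OR, and sorting at the root. The pairing check that produces the leaf hits is assumed to have already run.

```python
def evaluate(node, leaf_hits):
    """Return {file_id: score} for a node.

    A node is either a leaf label (str) or a pair (op, children) with
    op in {"AND", "OR"} and a non-empty list of children.
    leaf_hits maps a leaf label to the {file_id: score} found for it.
    """
    if isinstance(node, str):
        return dict(leaf_hits.get(node, {}))
    op, children = node
    results = [evaluate(child, leaf_hits) for child in children]
    id_sets = [set(r) for r in results]
    if op == "AND":
        ids = set.intersection(*id_sets)
    elif op == "OR":
        ids = set.union(*id_sets)
    else:
        raise ValueError(f"unknown gate: {op}")
    # Assumption: a file's score at a gate is the sum of its child scores.
    return {fid: sum(r.get(fid, 0.0) for r in results) for fid in ids}

def top_k(scores, k):
    """File ids sorted by descending score (ties broken by id), truncated to k."""
    return sorted(scores, key=lambda fid: (-scores[fid], fid))[:k]

# Example mirroring Figure 6: (breath OR fever) AND (pressure OR acute).
TREE = ("AND", [("OR", ["breath", "fever"]), ("OR", ["pressure", "acute"])])
HITS = {
    "breath": {"f1": 0.5, "f2": 0.2},
    "fever": {"f2": 0.4},
    "pressure": {"f2": 0.3, "f3": 0.9},
    "acute": {"f1": 0.1},
}
```

In this example the file f3 matches only the right-hand OR node, so the AND at the root removes it, and the ranked result contains f2 followed by f1.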
Step 9. In the final phase, DUs use their user keys, uk, to match with the ciphertext, Ct, to recover the decryption keys. DUs compute $E = \prod_{x \in S} e(C_x, K_1)^{\omega_x} / e(C_1, K_4)$, where $\omega_x$ are the LSSS reconstruction coefficients. Using E, they further compute $R = C_0 \cdot E^{K_3} / e(g^s, K_0)$. If the user key satisfies the access policy, R is identical to the final decryption key, $sk_f$. Finally, DUs can use this key to decrypt the encrypted data retrieved in the previous step and obtain the plaintext files. We will present the correctness proofs of searching and decryption in the next Section.

Security Analyses
In this section, we present the proofs for the security model described above and for other functional modules of our system. Theorem 1: Assume that the q-parallel bilinear Diffie-Hellman exponent (q-BDHE) assumption holds in the groups $G$ and $G_T$. Then no polynomial-time adversary, $\mathcal{A}$, can break the security of our schemes with a non-negligible advantage.
Proof: Assume that the advantage of $\mathcal{A}$ in distinguishing a valid ciphertext from a random element is $\varepsilon_1 = Adv^{IND\text{-}sCP\text{-}CPA}$. We build a simulator, $\mathcal{B}$, that can break the q-BDHE assumption with a non-negligible advantage $\varepsilon_1/2$.
The q-BDHE challenger, $\mathcal{C}$, first selects random elements $a, s, b_1, \ldots, b_q$ from $Z_p$ and sets
$\phi = (g, g^s, \ldots, g^{a^q}, g^{a^{q+2}}, \ldots, g^{a^{2q}}, g^{s \cdot b_j}, g^{a/b_j}, \ldots, g^{a^q/b_j}, g^{a^{q+2}/b_j}, \ldots, g^{a^{2q}/b_j}, g^{a \cdot s \cdot b_i/b_j}, \ldots, g^{a^q \cdot s \cdot b_i/b_j})$.
According to the definition of q-BDHE, it is still hard to distinguish $e(g, g)^{a^{q+1} \cdot s}$ from a random element of $G_T$ even given $\phi$. Then, $\mathcal{C}$ chooses a random bit, $\gamma \in \{0, 1\}$. If $\gamma = 0$, $\mathcal{C}$ sets $T = e(g, g)^{a^{q+1} \cdot s}$. Otherwise, T is set to a random element in $G_T$.
Init: The simulator, $\mathcal{B}$, receives a q-BDHE challenge instance $(\phi, T)$. The adversary, $\mathcal{A}$, announces a challenge access structure $(M^*, \rho^*)$ and sends it to $\mathcal{B}$, where $M^*$ is an $l^* \times n^*$ matrix and $l^*, n^* < q$.
Setup: $\mathcal{B}$ selects an element, $x'$, in $Z_p$ at random and sets $e(g, g)^x = e(g^a, g^{a^q}) \cdot e(g, g)^{x'}$, which implicitly makes $x = x' + a^{q+1}$. In addition, $\mathcal{B}$ initializes $v_x$ for each attribute by choosing $v_x \in Z_p$ at random, and also randomly selects an element, $b_0$, from $Z_p$. Finally, $\mathcal{B}$ sets $H_x = g^{b_0 \cdot v_x}$ and gives the partial public key parameters to $\mathcal{A}$.
Phase I: $\mathcal{B}$ keeps a list of tuples (uid, S, SK), denoted $Lst_{SK}$. Initially, the list is empty. $\mathcal{A}$ can query the following oracle a polynomial number of times:
− $O_{SK}(uid, S)$: Assume that $\mathcal{B}$ receives a secret key query for (uid, S), in which S does not satisfy the access structure $(M^*, \rho^*)$. $\mathcal{B}$ performs the following operations: if $\mathcal{A}$ has previously asked for S, $\mathcal{B}$ retrieves SK from the list, $Lst_{SK}$, directly and returns it to $\mathcal{A}$.
Otherwise, $\mathcal{B}$ chooses a vector, $\gamma = (\gamma_1, \ldots, \gamma_{n^*}) \in Z_p^{n^*}$, such that $\gamma_1 = -1$ and $M^*_i \cdot \gamma = 0$ for all i with $\rho^*(i) \in S$. Such a vector must exist according to the properties of LSSS. Then, $\mathcal{B}$ randomly picks $\sigma \in Z_p$ and implicitly defines t as $t = \sigma + \gamma_1 a^q + \gamma_2 a^{q-1} + \ldots + \gamma_{n^*} a^{q+1-n^*}$. $\mathcal{B}$ further selects $x'_1, x'_2 \in Z_p$ at random, such that $x'_1 + x'_2 = x' \bmod p$, and sets $x_1 = x'_1 + a^{q+1}$ and $x_2 = x'_2$. Then, $\mathcal{B}$ calculates $K_1$ and $K_4$, respectively. Notice that $g^{at}$ contains a term $g^{-a^{q+1}}$, which cancels with the unknown term $g^{a^{q+1}}$ in $g^{x_1}$ when calculating $K_0$. That is, $\mathcal{B}$ computes $K_0$ as $K_0 = g^{x'_1} \cdot g^{a\sigma} \cdot \prod_{i=2,\ldots,n^*} (g^{a^{q+2-i}})^{\gamma_i} = g^{x_1} \cdot g^{at}$. Notice that $K_5$ and $K_x$ are irrelevant to t, $x_1$, and $x_2$, so we omit their generation here. Finally, $\mathcal{B}$ puts $SK = (K_0, K_1, K_2, K_3, K_4, K_5, \{K_x\}_{x \in S})$ into $Lst_{SK}$ and sends the keys to $\mathcal{A}$.
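The cancellation used to compute $K_0$ can be written out explicitly. The derivation below only expands the definitions of $t$, $x_1$, and $\gamma_1 = -1$ given in the proof; the primed symbol $x'_1$ denotes the random value chosen by the simulator, as reconstructed from the surrounding text.

```latex
\begin{align*}
a \cdot t &= a\sigma + \gamma_1 a^{q+1} + \sum_{i=2}^{n^*} \gamma_i a^{q+2-i}
           = a\sigma - a^{q+1} + \sum_{i=2}^{n^*} \gamma_i a^{q+2-i}, \\
x_1 + a \cdot t &= \left(x'_1 + a^{q+1}\right) + a\sigma - a^{q+1}
           + \sum_{i=2}^{n^*} \gamma_i a^{q+2-i}
           = x'_1 + a\sigma + \sum_{i=2}^{n^*} \gamma_i a^{q+2-i}, \\
K_0 = g^{x_1} \cdot g^{at} &= g^{x'_1} \cdot (g^{a})^{\sigma}
           \cdot \prod_{i=2}^{n^*} \left(g^{a^{q+2-i}}\right)^{\gamma_i}.
\end{align*}
```

Every power $a^{q+2-i}$ with $2 \le i \le n^*$ has an exponent of at most $q$, so the simulator never needs the missing element $g^{a^{q+1}}$.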
Challenge: $\mathcal{A}$ prepares two equal-length messages, $m_0$ and $m_1$, for the challenge. $\mathcal{B}$ then picks a random bit, $b \in \{0, 1\}$, and encrypts $m_b$ under $M^*$. $\mathcal{B}$ computes $C^*_0$ as $C^*_0 = m_b \cdot T \cdot e(g^s, g^{x'})$, and $C^*_1$ is generated as $C^*_1 = g^s$. It is hard for $\mathcal{B}$ to simulate $C^*_x$ since it includes terms of the form $g^{a^j s}$. To overcome this difficulty, $\mathcal{B}$ splits the secret to eliminate these terms. That is, $\mathcal{B}$ selects $y_2, \ldots, y_{n^*} \in Z_p$ randomly, and then shares the secret using the vector $V = (s, sa + y_2, sa^2 + y_3, \ldots, sa^{n^*-1} + y_{n^*}) \in Z_p^{n^*}$. For $i \in [1, l]$, we define $Q_i$ as the set of all $k \neq i$ such that $\rho^*(i) = \rho^*(k)$. $\mathcal{B}$ calculates $C^*_x$ as: $C^*_x = \prod_{i=2,\ldots,n^*} (g^a)^{M^*_{i,k} \cdot y_k} \cdot \prod_{x \in Q_l, k=1,\ldots,n^*} g^{\cdots}$. We produce $C^*_2$, $C^*_3$, and $D^*_x$ in a similar way. Finally, $\mathcal{B}$ returns the challenge ciphertext to $\mathcal{A}$.
Phase II: $\mathcal{A}$ continues to make queries as in Phase I.
Guess: $\mathcal{A}$ outputs $b'$, which is a guess of b. If $b' = b$, $\mathcal{B}$ returns $\gamma' = 0$ to guess that $T = e(g, g)^{a^{q+1} \cdot s}$. Otherwise, $\mathcal{B}$ returns $\gamma' = 1$, indicating that T is a random element chosen from $G_T$. When $\gamma = 0$, $\mathcal{A}$ plays the real security game and receives a valid ciphertext, so $\Pr[b' = b \mid \gamma = 0] = 1/2 + \varepsilon_1$. Conversely, when $\gamma = 1$, $\mathcal{A}$ cannot obtain any information about b from the ciphertext; thus, $\Pr[b' = b \mid \gamma = 1] = 1/2$. In conclusion, the advantage of $\mathcal{B}$ in solving the q-BDHE problem is $Adv_{\mathcal{B}} = \frac{1}{2}\left(\frac{1}{2} + \varepsilon_1\right) + \frac{1}{2} \cdot \frac{1}{2} - \frac{1}{2} = \frac{\varepsilon_1}{2}$. Since $\mathcal{B}$ has only a negligible advantage in solving the q-BDHE problem, $\varepsilon_1$ must be negligible; hence, no polynomial-time adversary, $\mathcal{A}$, can break the security of our schemes with a non-negligible advantage.
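As a sanity check on the Challenge phase of the proof, the simulated component $C^*_0$ can be compared with a real ciphertext component of the form $C_0 = m \cdot e(g, g)^{x \cdot s}$. The short derivation below assumes the relation $x = x' + a^{q+1}$ from the Setup phase and treats the case where $T$ is the real q-BDHE value.

```latex
\begin{align*}
C^*_0 &= m_b \cdot T \cdot e\!\left(g^{s}, g^{x'}\right) \\
      &= m_b \cdot e(g, g)^{a^{q+1} \cdot s} \cdot e(g, g)^{x' \cdot s} \\
      &= m_b \cdot e(g, g)^{\left(x' + a^{q+1}\right) \cdot s}
       = m_b \cdot e(g, g)^{x \cdot s}.
\end{align*}
```

When $T$ is instead a uniformly random element of $G_T$, the product $m_b \cdot T$ is itself uniformly distributed, so $C^*_0$ carries no information about $b$, which is the case $\gamma = 1$ of the argument.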
As for keyword privacy, we show that no polynomial-time adversary, $\mathcal{A}$, can guess the input keyword, w, from the Secure Index, I, or forge it.
First, the secret value, s, masks the term $I_3 = g^{(b_0 \cdot s)/H(w)}$. Even if $\mathcal{A}$ has produced the value $g^{1/H(w)}$ on its own, the only other term that contains $b_0$ is $I_{1,x}$, and $\mathcal{A}$ cannot obtain the value $v_x$, because it is one of the components of the master key, MK. Hence, $\mathcal{A}$ can neither distinguish nor forge the Secure Indices. To change the keyword of a trapdoor, $\mathcal{A}$ would need to modify the keyword-dependent component of the trapdoor. However, this is hard due to the difficulty of solving the discrete log problem. In summary, the unmalleability of the index and trapdoor of our scheme has now been proved.

Functional Comparisons
We compared some existing ABKS schemes with ours in terms of access control, keyword search, multi-keyword, ranked result, fuzzy search, and semantic search capabilities, as shown in Table 2. The symbol "✓" means that the scheme has the indicated function, while the symbol "-" represents the lack of it. As the table shows, our scheme is the most functional, providing fine-grained access control and supporting multi-keyword ranked search results with various powerful search modes.

Computational Complexity Analyses
Table 3 compares the theoretical computation costs of some recent ABKS schemes with ours. Let |U| denote the universe size and |S| the size of the user attributes, while |L| represents the number of attributes the DO used in the access policy. P symbolizes pairing operations, and E and $E_t$ represent the exponentiation operations in the groups $G$ and $G_T$. Hash functions are excluded from our comparison because they are much more efficient than exponentiation and pairing operations. The table shows that our scheme is the most efficient one most of the time, especially for searching.
Table 4 summarizes our theoretical storage costs compared with the above-mentioned schemes. $|G|$, $|G_T|$, and $|Z_p|$ are the bit lengths required to store an element of the respective group. Our theoretical storage costs are similar to those of the MABKS [17] scheme. However, our scheme has lower constant terms and depends little on the user attribute size. Furthermore, our trapdoor size is quite reasonable compared with the other schemes. We put extra data into ciphertexts to eliminate the need for DOs to exchange keys with DUs. Even so, the space complexity of ciphertexts is still acceptable in actual cases.

Experimental Analyses
We designed a series of experiments to evaluate the actual performance of our schemes. We used the real Enron email dataset [35] for testing. We tested our schemes on a Windows machine with an Intel(R) Core(TM) i7-1165G7 CPU @ 2.80 GHz and 8 GB RAM. We used JPBC (Java Pairing-Based Cryptography) 2.0.0 as the pairing library and executed the programs on Java SDK 17. Following the most popular setting, we set $|Z_p|$ = 160 bits and $|G| = |G_T|$ = 1024 bits, and picked the Type-A elliptic curve $y^2 = x^3 + x$. For practical uses, the universe size lies in the range [20, 100], and the user attribute size lies in the range [3, 100]. In the subsequent experiments, we assumed that there is at least one authorized document for DUs to retrieve.
Figure 7a-d shows the simulation results of our basic scheme compared with others, using the universe and user attribute sizes mentioned above. Because some of these schemes do not support multi-keyword ranked search, we only examined one document and one searching condition for ease of simulation. However, this is sufficient to demonstrate the effectiveness of the proposed scheme. Figure 7a shows the setup time, which depends linearly on the size of the system attributes, whereas the encryption time is independent of the size of the system attributes, as shown in Figure 7b. Our setup time is similar to that of the other benchmarked schemes, but our encryption time is much shorter. Our scheme also shows superiority in decryption and user-key generation time, as demonstrated in Figure 7c,d. Notice that the required key-generation time is proportional to the size of the user attributes rather than that of the universe. Clearly, our scheme has more advantages when massive user attributes are required. Our approaches are the most efficient compared to the MABKS [21] and MSDVABE [33] schemes.
We constructed a practical system for the implementation of our enhanced scheme. Figure 8a-e shows this system's actual data retrieval and index-table building times. For ease of simulation, we realized the same extensions on the other benchmarked schemes to support more powerful searching modes. In these experiments, we set the universe size to 27 and the user attribute size to 3 to simulate real scenarios. These attributes are categorized into position, subject, and level classes. This setting does not affect the experiment results in any case. In Figure 8a, we fixed the size of the keywords: we set the number of keywords provided by DOs to 30 and the number of search conditions selected by DUs to 5. Furthermore, we set the size of the document database to vary from 20 to 100. In this circumstance, our search time is almost constant and is similar to that of MABKS [21]; both are better than that of MSDVABE [33]. In Figure 8b, the keyword size varies from 20 to 100 while the database size and the number of searching conditions are fixed to 100 and 5, respectively. Our searching time is linearly proportional to the size of the keywords, while that of the MSDVABE [33] scheme varies more dramatically than ours. The same conclusion can be drawn from Figure 8. When the number of search conditions increases from 5 to 30, our scheme performs better than the others.
Although the MSDVABE [33] scheme takes the shortest time in the index-table building experiment, it performs poorly on searching. As in MABKS [21], we perform one pairing operation in the index-building phase to avoid performing too many pairing operations in the searching phase. Therefore, some of the performance on building index tables is sacrificed. However, data owners usually build index tables only once, whereas data users may search the database many times. Therefore, our schemes are the most realistic and practical in actual use. Furthermore, without our extensions, these two schemes take much more time and cannot even perform fuzzy and semantic ranked searches combined with multiple keywords. We showed that our schemes are efficient and flexible, and that our extensions can be applied to other performance-oriented AMKS schemes.

Conclusions
In this paper, we showed that the proposed FEMRSABE scheme has a powerful search capability that can satisfy most users' needs. Even if the user inputs do not fully match the keywords set up by the DO or contain minor spelling errors, users can still obtain the desired and most-related documents. As the performance analyses in the previous Section show, our basic protocol is competitive with the state-of-the-art schemes, which take much more time to search and do not support fuzzy and semantic keyword ranked searches, the main contribution of our work. The enhanced scheme brings many more functionalities with a slight efficiency loss, which is tolerable in real-world scenarios. Moreover, we proved that our scheme is secure under the IND-SCP-CPA and the IND-CKA security requirements.
However, our system also has some limitations. For example, the attributes of users may vary frequently in the real world, so fine-grained attribute revocation and updating mechanisms are needed, but they are not currently included in our work. Furthermore, we tackle the single-point failure problem by setting up multiple attribute authorities, but malicious attribute authorities may still compromise users' privacy through improper operations. We plan to add the attribute revocation and verification mechanisms mentioned above to make the system more robust and secure.