Improve Parallel Resistance of Hashcash Tree

Denial of Service (DoS) attacks remain a persistent threat to online systems, necessitating continual innovation in defense mechanisms. In this work, we present an improved algorithm for mitigating DoS attacks through the augmentation of client puzzle protocols. Building upon the foundation of hashcash trees, a recently proposed data structure combining hashcash and Merkle trees, we introduce a new version of the data structure that enhances resistance against parallel computation (a common tactic employed by attackers). By incorporating the labels of the children and of the next node in a breadth-first traversal into the hash function, we establish a sequential processing order that inhibits parallel node evaluation. The added dependency on the next node significantly elevates the complexity of constructing hashcash trees, introducing a linear number of synchronization points and fortifying resilience against potential attacks. Empirical evaluation demonstrates the efficacy of our approach, showcasing its ability to accurately control puzzle difficulty while bolstering system security against DoS threats.


Introduction
Denial of service (DoS) attacks target resources such as websites, applications, and servers, rendering them unavailable for their intended purpose and posing a significant threat to the integrity of the Internet community [1]. These attacks result in substantial losses [2,3] and have spurred the development of sophisticated detection methods [4][5][6][7][8]. Dwork and Naor [9] proposed the use of proof-of-work (PoW) mechanisms to counter the proliferation of spam emails, requiring a computational stamp to access a service, for example, forwarding a message in the case of emails. PoW, a cryptographic proof method, entails one party (the prover) demonstrating to others (the verifiers) that a specified computational effort has been expended, which verifiers can subsequently validate with minimal effort on their part [10]. PoW schemes are asymmetrical in favor of verifiers, posing moderate computational challenges for provers but facilitating easy solution verification; when applied to DoS mitigation, clients act as provers seeking access to a service, while servers serve as verifiers providing the required service.
Client puzzle protocols constitute a cornerstone in the realm of DoS mitigation, leveraging PoW mechanisms to thwart malicious attacks. These protocols, categorized into challenge-response and solution-verification paradigms, mandate clients to engage in computational tasks to access services facilitated by servers. In challenge-response scenarios, servers proffer puzzles for clients to decipher, while in solution-verification contexts, clients autonomously formulate puzzles based on service specifications. Hashcash [11], a prominent cryptographic PoW algorithm, finds ubiquitous application across both protocol types. In challenge-response configurations employing hashcash, servers challenge clients to identify string extensions with predetermined hash properties, while in solution-verification scenarios, hashcash operates on service descriptors. These protocols epitomize an asymmetrical allocation of computational effort, requiring iterative attempts from the prover while imposing minimal verification overhead on the verifier. However, a notable limitation of hashcash lies in its dependency on the length of the required all-zeros prefix to dictate puzzle difficulty. Coelho [12] introduced a solution-verification protocol based on hash trees, offering superior control over puzzle difficulty. Hash tree structures assign labels to leaves by concatenating hashed leaf indices with service descriptions, while internal nodes receive labels by hashing concatenated child labels. While effective in regulating prover effort, practical implementation of this protocol necessitates handling large hash trees, posing storage and computational challenges.
In a recent study [13], researchers posed three research questions aimed at advancing client puzzle protocols within the domain of cybersecurity. The first question sought to ascertain the feasibility of crafting a client puzzle protocol founded on hashcash, albeit with enhanced control over puzzle difficulty. Concurrently, another question delved into the prospect of formulating a client puzzle protocol rooted in hash trees, yet requiring more compact tree structures compared to those previously proposed by Coelho [12]. These research questions were comprehensively addressed, unveiling promising avenues for protocol refinement and optimization. However, the investigation encountered partial resolution concerning the final question about the resistance of the protocol to parallel computation. While substantial progress was made in elucidating the efficacy of the protocol in mitigating parallel computational threats, further exploration remains imperative to achieve comprehensive insights into its resilience under varied operational conditions. In fact, the protocol features a logarithmic number of synchronization points relative to the total number of nodes.
In a nutshell, the primary contribution of this recent study involves the development of a modified challenge-response protocol, structured into two distinct phases (Figure 1). In the first phase, the prover tackles a puzzle generated by the verifier and submits a commitment upon puzzle resolution. The subsequent phase requires the prover to furnish proof of solution validity. This protocol introduces a novel puzzle mechanism centered on constructing a tree with specific labeling criteria, enabling precise control over puzzle difficulty by adjusting parameters such as the length of the all-zeros prefix and the number of nodes. This new data structure, named hashcash tree, merges elements from hashcash and hash trees to enhance puzzle complexity management and solution verification efficiency. Partial parallelization is possible in the construction of hashcash trees due to the interdependence of node label computation on the labels of child nodes, from which the logarithmically many synchronization points arise. Further improvements on the parallel resistance of the protocol were left as a future research line.
In this article, we actively engage with this critical research line. Parallel computation represents a sophisticated technique wherein multiple computational tasks are executed concurrently, posing a formidable challenge to the efficacy and integrity of the protocol. To directly address this significant challenge, our research introduces synchronization points into the computational framework. These synchronization points serve as pivotal checkpoints strategically positioned within the computational process, facilitating the alignment and coordination of diverse computations. By incorporating synchronization points, we aim to foster a seamless and harmonized progression of computational tasks, thereby enhancing the ability of the protocol to effectively manage and control concurrent operations. Our approach entails substantial modifications to the underlying data structure, constituting a fundamental transformation aimed at augmenting the resilience of the protocol to parallel computation threats. Whereas previous iterations of the protocol relied on a number of synchronization points that scaled logarithmically with the total number of nodes, our new strategy entails a significant escalation in the number of synchronization points. This amplification, now scaling linearly with the number of nodes, provides an exponential increase over the previous logarithmic count. Importantly, this change is transparent to the end-user; just as the end-user did not directly interact with or perceive the hashcash computations before, they remain unaware of these processes now.

Background
This section summarizes the main notions involved in the definition of the two-phase challenge-response protocol based on hashcash trees [13].

Hash Functions and Hashcash
The term string is commonly used to refer to binary strings, that is, sequences of elements from the set {0, 1}, representing binary digits or bits; the set of all binary strings is denoted by {0, 1}*. The empty string is denoted by ϵ. The term prefix describes the initial segment of a string s, specifically the first n bits where n ∈ N; this segment is represented as prefix(s, n). The concatenation of two strings s, s′ is denoted by s||s′. The expression s||x is sometimes used informally to represent the concatenation of a string s with the binary string representation of a non-negative integer x. We define the concatenation of a string s with itself n ∈ N times, denoted as s^n, through the following definition: s^0 := ϵ; s^{n+1} := s^n||s for n ≥ 0. The i-th element of a string s, and in general of a sequence or tuple, is denoted by s|_i.
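The notation above maps directly onto code. The following sketch (helper names are ours, not from the paper's implementation) models binary strings as Python strings over the characters '0' and '1':

```python
EPSILON = ""  # the empty string ϵ

def prefix(s: str, n: int) -> str:
    """prefix(s, n): the first n bits of s."""
    return s[:n]

def concat(s: str, t: str) -> str:
    """s||t: concatenation of two binary strings."""
    return s + t

def repeat(s: str, n: int) -> str:
    """s^n: s concatenated with itself n times (s^0 = ϵ)."""
    return s * n
```

For instance, repeat("010", 4) yields 010010010010, reproducing the example in the text.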

Example 1. According to the definitions above, (010)^4 becomes 010010010010. ■
A hash function is a function that transforms strings of any length into strings of a consistent length. If m ∈ N+ is a positive integer representing this fixed length, the signature of the hash function is h : {0, 1}* → {0, 1}^m. For a string s ∈ {0, 1}*, the result h(s) generated by the hash function h is called the hash value of s; s is called the message. Additionally, cryptographic hash functions ensure the following properties: (i) the probability of a particular hash value for a message is 2^{-m}; (ii) finding a message that matches a given hash value is unfeasible (preimage resistance); (iii) finding a second message that matches the hash value of a given message is unfeasible (second-preimage resistance); (iv) finding two different messages that yield the same hash value is unfeasible (collision resistance).
Hashcash is a (nondeterministic) function that depends on a hash function h and a positive integer k. It associates every input string s ∈ {0, 1}* with any pair ⟨h(s||x), x⟩ such that x is a natural number and prefix(h(s||x), k) = 0^k. Let h_k(s) represent any acceptable output ⟨h(s||x), x⟩ of the hashcash function configured with parameters h and k. Hashcash essentially requires identifying partial hash collisions within the prefix of zeros of length k, and the most efficient method for achieving this is brute force [14]; see Algorithm 1. If we assume that the brute force algorithm starts with x = 0, obtaining a valid output value ⟨h(s||x′), x′⟩ necessitates computing x′ + 1 hash values. Overall, hashcash incurs an unbounded probabilistic cost, implying that, theoretically, the brute force algorithm could iterate indefinitely; however, the probability of not finding a solution diminishes rapidly toward zero. Conversely, confirming that ⟨h(s||x), x⟩ constitutes a valid output entails computing just one hash value, as shown in Algorithm 2.
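The behavior of Algorithms 1 and 2 as described above can be sketched as follows, assuming SHA-256 for the hash function h and a plain string encoding of s||x (both are our choices for illustration, not mandated by the text):

```python
import hashlib

def hash_bits(message: str) -> str:
    """h(message) rendered as a 256-bit binary string (SHA-256 is our choice for h)."""
    digest = hashlib.sha256(message.encode()).digest()
    return bin(int.from_bytes(digest, "big"))[2:].zfill(256)

def hashcash(s: str, k: int) -> tuple:
    """Algorithm 1 (sketch): try x = 0, 1, 2, ... until prefix(h(s||x), k) = 0^k."""
    x = 0
    while True:
        hv = hash_bits(s + str(x))
        if hv.startswith("0" * k):
            return hv, x  # an acceptable output ⟨h(s||x), x⟩ of h_k(s)
        x += 1

def hashcash_verify(s: str, hv: str, x: int, k: int) -> bool:
    """Algorithm 2 (sketch): a single hash evaluation suffices to check the pair."""
    return hv.startswith("0" * k) and hash_bits(s + str(x)) == hv
```

The asymmetry is visible directly: the search loop runs on average 2^k times, while verification recomputes exactly one hash.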

Hashcash Trees
A labeled binary tree T can be described as either an empty set ∅ or a quadruple ⟨r, ℓ, L, R⟩, where r represents the root node (of T), ℓ denotes the label associated with r, and L and R are labeled binary trees referred to as the left child and the right child, respectively, of both T and r; r, L, and R are represented as root(T), left(T), and right(T), respectively. (In the subsequent context, the term tree denotes labeled binary trees.) Nodes of a tree are defined inductively: nodes(∅) := ∅; nodes(⟨r, ℓ, L, R⟩) := {r} ∪ nodes(L) ∪ nodes(R). When both L and R are empty, the node r is also referred to as a leaf. Leaves of a tree are defined inductively: leaves(∅) := ∅; leaves(⟨r, ℓ, ∅, ∅⟩) := {r}; leaves(⟨r, ℓ, L, R⟩) := leaves(L) ∪ leaves(R) otherwise. A tree is considered perfect if every internal node has nonempty children and all leaves are situated at the same level. (For the subsequent discussions, only perfect trees are under consideration.) The height of a tree corresponds to the depth at which its leaves are situated. The order of a node v within T is determined using breadth-first search (BFS), or level-order search, as follows: order(r, T) := 1; for each occurrence of ⟨v, ℓ′, L′, R′⟩ in T, order(root(L′), T) := 2·order(v, T) and order(root(R′), T) := 2·order(v, T) + 1. Note that T can be efficiently represented using a one-based array arr of size |nodes(T)|, where arr[i] = label(v, T) whenever order(v, T) = i.
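With the one-based array representation, navigation becomes purely arithmetic: the node of order i has children of orders 2i and 2i+1 and parent of order ⌊i/2⌋. A minimal sketch (helper names are ours):

```python
def children(i: int, n: int) -> tuple:
    """Orders of the children of node i in a perfect tree with n nodes (None if absent)."""
    left, right = 2 * i, 2 * i + 1
    return (left if left <= n else None, right if right <= n else None)

def parent(i: int):
    """Order of the parent of node i (None for the root, whose order is 1)."""
    return i // 2 if i > 1 else None

def is_leaf(i: int, n: int) -> bool:
    """Node i is a leaf exactly when it has no children, i.e., when 2i > n."""
    return 2 * i > n
```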
A hash tree, also known as a Merkle tree, is a tree structure where each leaf node contains the hash value of a data block, and each internal node contains the hash value computed from the concatenation of the hash values of its child nodes.In this context, a data block refers to any piece of information within a commitment scheme, representing concealed data that remain unalterable.Moreover, sharing the hash value linked to the root ensures the integrity of the entire hash tree, preventing any modification to its labels.
To ascertain that a particular data block s_i (where i ∈ [1..2^n]) within a sequence s_1, . . ., s_{2^n} associated with the hash tree T (for n ∈ N) has not been altered, it is adequate to examine the nodes along the path from the leaf labeled with h(s_i) to the root. The verification process entails examining the labels of these nodes and the hash values of their child nodes; in formulas, if v_o denotes the node of T such that order(v_o, T) = o, and o is the order of the leaf labeled with h(s_i), the labels involved in the verification of s_i are those of the nodes v_{⌊o·2^{-j}⌋}, for j ∈ [0..n], together with those of their children.

The definition of the hashcash tree is based on fixed parameters: a hash function h and a positive integer k. Remember that h_k(s) denotes any pair ⟨h(s||x), x⟩ where x ∈ N and prefix(h(s||x), k) = 0^k. Consider a string s and a positive integer n ∈ N+. Define the (nondeterministic) sequence of labels (ℓ_i)_{i∈N+} as follows:

ℓ_i := h_k(s || i || ℓ_{2i}|_1 || ℓ_{2i+1}|_1),   (2)

where ℓ_j|_1 is taken to be ϵ whenever j > n. The hashcash tree of size n for the string s is a perfect tree T with a height of ⌈log_2(n)⌉. In this tree, for each node v, if its order in the tree T is i, then the label of the node v is set to ℓ_i. To construct a hashcash tree of size n (Algorithm 3), the process begins by calculating the labels of the nodes at the highest level (i.e., the leaves), which corresponds to ⌈log_2(n)⌉. Subsequently, the algorithm iteratively computes the labels of nodes at preceding levels until reaching the root node. Overall, the algorithm executes n hashcash computations.
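Under our reading of the label definition above (in particular, the assumption that the message for node i is s||i followed by the children's hash values, with a plain string encoding; these details are ours, not fixed by the text), the construction can be sketched as follows:

```python
import hashlib

def h(msg: str) -> str:
    """h as a 256-bit binary string (SHA-256 is our illustrative choice)."""
    digest = hashlib.sha256(msg.encode()).digest()
    return bin(int.from_bytes(digest, "big"))[2:].zfill(256)

def h_k(s: str, k: int) -> tuple:
    """Brute-force hashcash (Algorithm 1): ⟨h(s||x), x⟩ with k leading zero bits."""
    x = 0
    while True:
        hv = h(s + str(x))
        if hv.startswith("0" * k):
            return hv, x
        x += 1

def hashcash_tree(s: str, n: int, k: int) -> list:
    """One-based label array: labels[i] = ℓ_i = ⟨hash value, witness⟩."""
    labels: list = [None] * (n + 1)
    for i in range(n, 0, -1):  # leaves first, root (i = 1) last
        left = labels[2 * i][0] if 2 * i <= n else ""        # ℓ_{2i}|_1, ϵ if 2i > n
        right = labels[2 * i + 1][0] if 2 * i + 1 <= n else ""
        labels[i] = h_k(s + str(i) + left + right, k)
    return labels

tree = hashcash_tree("service-descriptor", n=7, k=3)
sol = tree[1][0]  # label(root(T))|_1, the commitment later sent to the verifier
```

Note that within one level all h_k calls are independent, which is precisely the partial parallelizability discussed in the Introduction.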
Similar to a hash tree, verifying a leaf i within a hashcash tree T entails examining the nodes along the path from i to the root of T, as well as their children; see Algorithm 4. More precisely, the algorithm validates the prefixes of all hash values associated with these nodes (lines 8-9), recalculating hash values only for nodes along the path from i to the root of T (lines 2-6).

Client Puzzle Protocols
A client puzzle protocol (CPP) consists of two participants: a verifier V and a prover P. P must utilize a resource of V that is computationally expensive. Before handling the request from P, V challenges P to solve a puzzle. The two-phase challenge-response CPP (see Figure 1) based on Algorithms 3 and 4 involves the following procedures:

1. At the beginning, the verifier V generates and retains a master key denoted as mk.

2. To initiate the process, the prover P must transmit a request req to V. To accomplish this, P sends Req := h(req) to V.

3. Based on its current workload, V determines the difficulty parameter n ∈ N+, generates a timestamp t indicating the deadline for completing the protocol, calculates s := h(mk||Req||n||t), and transmits ⟨s, n, t⟩ to P.

4. P computes the hashcash tree T of size n for s using Algorithm 3 and subsequently transmits ⟨sol, s, n, t, Req⟩ to V. Here, sol is label(root(T))|_1.

5. V selects the order i of a leaf of the tree to be verified, computes I := h(mk||Req||sol||i), and transmits ⟨I, i⟩ to P.

6. P sends ⟨req, S, I, i, s, n, t, Req⟩ to V, where S comprises the labels associated with nodes along the path from i to the root as well as their child nodes; in formulas, S represents the sequence containing ℓ_1 and, for j ∈ [0..⌈log_2(n)⌉), ℓ_{⌊i·2^{-j}⌋}, ℓ_{2⌊i·2^{-j}⌋}, and ℓ_{2⌊i·2^{-j}⌋+1}. (As an optimization, integer witnesses for labels that are not on the path from i to the root can be excluded.)

7. V checks all the following conditions: s = h(mk||Req||n||t); t is in the future; I = h(mk||Req||sol||i), where sol is ℓ_1|_1 in S (i.e., the hash value linked to the root of the partial hashcash tree transmitted by P); S is valid; Req = h(req). If all criteria are satisfied, the request req is handled.

At step 7, the validation of S involves verifying that each label ℓ_j along the path from i to the root is computed according to (2); Algorithm 4 is used for this purpose.
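The stateless verifier-side bookkeeping in steps 3 and 7 can be sketched as follows; the field encodings and the absence of separators are our simplifying assumptions, and a production implementation would need unambiguous serialization of the concatenated fields:

```python
import hashlib
import time

def h(msg: bytes) -> bytes:
    """The hash function h (SHA-256 is our illustrative choice)."""
    return hashlib.sha256(msg).digest()

def make_challenge(mk: bytes, Req: bytes, n: int, t: int) -> bytes:
    """Step 3: s := h(mk||Req||n||t), binding request digest, difficulty, and deadline."""
    return h(mk + Req + str(n).encode() + str(t).encode())

def check_challenge(mk: bytes, Req: bytes, n: int, t: int, s: bytes) -> bool:
    """Part of step 7: recompute s from the returned fields and check t is in the future."""
    return s == make_challenge(mk, Req, n, t) and t > int(time.time())
```

Because s is recomputable from mk and the fields echoed back by P, the verifier keeps no per-client state between the two phases.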

Hashcash Tree with Improved Parallel Resistance
In order to improve the parallel resistance of hashcash trees, we modify the way labels of nodes are computed. In the refined hashcash tree, the label of each node is generated by computing the hashcash of a string that encompasses not only the labels of the children nodes but also the label of the subsequent node in a BFS. This slight alteration in the label generation process introduces a significant enhancement by adding synchronization points to the computational framework. By including the label of the subsequent node along with the labels of the children nodes, each node label becomes intertwined with that of its successor. This creates more frequent points within the computation where different tasks must align, effectively introducing more synchronization points into the process. As a result, the protocol gains heightened resistance to parallel computation, as labels are forced to be computed sequentially.
Formally, the new label of the i-th node is defined as follows:

ℓ_i := h_k(s || i || ℓ_{2i}|_1 || ℓ_{2i+1}|_1 || ℓ_{i+1}|_1),   (3)

where ℓ_j|_1 is taken to be ϵ whenever j > n. Accordingly, line 7 of Algorithm 3 is modified so that the message passed to the hashcash function also includes the hash value of the next node in BFS. Regarding the verification procedure reported in Algorithm 4, two changes are needed. First of all, line 5 is modified by including the label of the following node in BFS in the recomputed message. Finally, it is required to verify the prefix of the following node; therefore, line 8 is modified to also check the prefix of the hash value of the next node in BFS. Activity diagrams of the modified versions of Algorithms 3 and 4 are shown in Figures 2 and 3. It can be observed that the label of each node depends on those of the children and of the next node in BFS. In the construction of a hashcash tree (Figure 2), the number of hashcash computations stays the same as in the previous version, as the only difference is the message that is passed to the hashcash function. In the verification of a hashcash tree (Figure 3), the dependency on the next node in BFS ensures that the labels have been computed sequentially, from the last node in BFS to the root, while the dependency on the children allows for restricting the nodes to verify to those on the path from a leaf to the root. The CPP presented in Section 2 is adapted to the modified algorithms as follows (we report only the modified steps):
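Under the same illustrative encoding assumptions as before (SHA-256, plain string concatenation; both ours), the modified construction differs only in the message passed to hashcash. The extra dependency on ℓ_{i+1}|_1 forces the loop to run strictly in the order n, n-1, ..., 1:

```python
import hashlib

def h(msg: str) -> str:
    """h as a 256-bit binary string (SHA-256 is our illustrative choice)."""
    digest = hashlib.sha256(msg.encode()).digest()
    return bin(int.from_bytes(digest, "big"))[2:].zfill(256)

def h_k(s: str, k: int) -> tuple:
    """Brute-force hashcash (Algorithm 1)."""
    x = 0
    while True:
        hv = h(s + str(x))
        if hv.startswith("0" * k):
            return hv, x
        x += 1

def hashcash_tree_sequential(s: str, n: int, k: int) -> list:
    """Modified labels: each message also includes ℓ_{i+1}|_1 (ϵ for i = n)."""
    labels: list = [None] * (n + 1)
    for i in range(n, 0, -1):  # strictly sequential: node i needs node i+1 done
        left = labels[2 * i][0] if 2 * i <= n else ""
        right = labels[2 * i + 1][0] if 2 * i + 1 <= n else ""
        nxt = labels[i + 1][0] if i < n else ""  # the synchronization point
        labels[i] = h_k(s + str(i) + left + right + nxt, k)
    return labels
```

No two iterations of the loop can be executed concurrently, since each consumes the hash value produced by the previous one; this is exactly the linear chain of synchronization points claimed above.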

4'. P computes the hashcash tree T of size n for s using the modified version of Algorithm 3 shown in Figure 2; subsequently, it transmits ⟨sol, s, n, t, Req⟩ to V. Here, sol is label(root(T))|_1.

6'. P sends ⟨req, S, I, i, s, n, t, Req⟩ to V, where S comprises the labels associated with nodes along the path from i to the root as well as their child nodes and subsequent nodes in BFS; in formulas, S represents the sequence containing ℓ_1 and, for j ∈ [0..⌈log_2(n)⌉), ℓ_{⌊i·2^{-j}⌋}, ℓ_{2⌊i·2^{-j}⌋}, ℓ_{2⌊i·2^{-j}⌋+1}, and ℓ_{⌊i·2^{-j}⌋+1}. (As an optimization, integer witnesses for labels that are not on the path from i to the root can be excluded.)

7'. V checks all the following conditions: s = h(mk||Req||n||t); t is in the future; I = h(mk||Req||sol||i), where sol is ℓ_1|_1 in S (i.e., the hash value linked to the root of the partial hashcash tree transmitted by P); S is valid; Req = h(req). Here, the validation of S involves verifying that each label ℓ_j along the path from i to the root is computed according to (3); the modified version of Algorithm 4 shown in Figure 3 is used for this purpose. If all criteria are satisfied, the request req is handled.

The original version of the CPP based on hashcash trees provides partial resistance to parallel computation: indeed, it was shown that any parallel implementation of the original HashcashTree(s, n, h, k) includes at least ⌈log_2(n)⌉ sequential hashcash computations (i.e., calls to Algorithm 1; Theorem 3 in [13]). The result is derived from the observation that the hashcash tree exhibits resistance to parallel computation along its paths. Specifically, the labels of internal nodes are determined by computing hashcash output values for strings that encompass the hash values linked to their respective child nodes. This methodology reveals that the height of the hashcash tree serves as an indicator of its resistance to parallel computation. The addition of the label of the next node in BFS to the message passed to hashcash exponentially improves this indicator.
Theorem 1. Any parallel implementation of the modified HashcashTree(s, n, h, k) (Figure 2) includes at least n sequential calls to Algorithm 1.
Proof. The computation of the label of every node in the hashcash tree other than the last in BFS (i.e., node v_n), leaves included, is a synchronization point: it cannot start before the label of the next node in BFS has been completed. It follows that n sequential calls to hashcash are required, and the proof is complete.
Thanks to the presence of the labels of the child nodes, other computational properties of the CPP are preserved. In particular, Theorems 1 and 2 in [13] are unaffected:

1. HashcashTree(s, n, h, k) terminates after computing N + n hash values with a probability that rapidly approaches 1 as N grows (for all N ∈ N), and the average number of unused hash values computed by the algorithm remains bounded.

2. HashcashTreeVerify(T, i, s, n, h, k) terminates after computing at most ⌈log_2(n)⌉ hash values.

Implementation and Experiment
In this section, we present the implementation and evaluation of the new version of the hashcash tree data structure. We leveraged the proof-of-concept code for the hashcash tree presented in [13] as a foundation. This code is available on GitHub (https://github.com/alviano/hashcash-tree; accessed on 1 July 2024). To facilitate a performance comparison, we also implemented a parallel algorithm to construct the original version of the hashcash tree. This implementation utilizes the ProcessPoolExecutor class from Python's concurrent.futures library (https://docs.python.org/dev/library/concurrent.futures.html; accessed on 1 July 2024) to execute tasks asynchronously using a pool of worker processes.
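The level-by-level parallel construction exploits the fact that, in the original data structure, all hashcash calls within one level are independent; the levels themselves remain the only (logarithmically many) synchronization points. A sketch of the pattern (function names are ours, not the repository's):

```python
import hashlib
from concurrent.futures import ProcessPoolExecutor

def h_k(args: tuple) -> tuple:
    """Brute-force hashcash on one message; takes a (message, k) pair so it maps cleanly."""
    s, k = args
    x = 0
    while True:
        digest = hashlib.sha256((s + str(x)).encode()).digest()
        hv = bin(int.from_bytes(digest, "big"))[2:].zfill(256)
        if hv.startswith("0" * k):
            return hv, x
        x += 1

def parallel_level(messages: list, k: int, workers: int = 4) -> list:
    """Compute the hashcash labels of one tree level concurrently."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(h_k, [(m, k) for m in messages]))
```

Each level is dispatched as one batch; the pool must drain before the next (shallower) level can start, which is why speedups only materialize when the per-node work (governed by k) outweighs the inter-process overhead.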
To assess the performance and properties of the newly developed data structure, we integrated benchmark code within the implemented codebase. These benchmarks aim to answer the following specific research questions:

Q1 (Overhead of the New Data Structure): This question investigates the additional processing time or resource consumption introduced by the new data structure compared to the original version. We measure the overhead by comparing the execution times of equivalent operations on both data structures for various input sizes and configurations.

Q2 (Parallel Resistance Weakness): This question explores the vulnerability of the original hashcash tree to parallelized construction attempts. We identify parameter settings (e.g., tree size and hash function choice) under which the effectiveness of the original data structure deteriorates when utilizing parallel processing for construction. This involves running the parallel construction algorithm for the original data structure with various parameter combinations and analyzing the impact on execution time and security properties.
By answering these questions, we aim to gain insights into the trade-offs between performance and security offered by the new data structure and to identify potential weaknesses arising from the parallelizability of the construction of the original hashcash tree.
Our experiments were conducted on a high-performance computing system equipped with two AMD EPYC 7313 processors. Each CPU offers 16 cores, resulting in a total of 32 processing cores available for parallel execution. The system provides 2 TB of RAM, ample memory for data processing. To ensure consistent results and prevent resource exhaustion on the host machine, we imposed a time limit of 600 s (10 min) per experiment run and limited memory usage to a maximum of 16 GB per run.
Our benchmarks focused on evaluating the performance of both the original and new hashcash tree data structures under varying conditions.We manipulated three key parameters:

• Prefix Length (k): This parameter controls the difficulty of the proof-of-work mechanism within the hashcash tree. We varied k to assess its impact on construction time.

• Number of Nodes (n): This parameter determines the size of the hashcash tree, directly influencing the number of hashing operations required during construction, and therefore providing fine-grained control over the difficulty of the proof-of-work mechanism within the hashcash tree. We experimented with different tree sizes to understand the scalability of both data structures.

• Number of Parallel Workers (w): This parameter is specific to the parallel construction algorithm for the original hashcash tree. It controls the number of worker processes utilized for concurrent execution. By varying w, we aimed to identify potential weaknesses arising from the parallelizability of the construction of the original hashcash tree.
We measured the runtime required to build both the original and new hashcash tree data structures.This involved running multiple experiments for various combinations of the three parameters mentioned above.For the original data structure, we recorded the runtime of the serial construction algorithm as well as the parallel construction algorithm with different numbers of worker processes.
The detailed results of our experiments are presented in Figures 4-6. As an initial observation, consistent across all plots, the construction time of the new data structure closely matches the runtime of the original hashcash tree built using the serial algorithm. This finding directly addresses research question Q1: the new data structure does not introduce any significant overhead in terms of processing time compared to the previous version. This is a promising result, suggesting that the new data structure offers enhanced functionality without compromising performance on the core construction operation.

Figure 4 sheds light on research question Q2, which explores the vulnerability of the original hashcash tree to parallelization. Here, we observe a critical interplay between the prefix length (k) and the effectiveness of parallel construction (parameter w). For short prefix lengths (k = 4), the computation associated with each node in the hashcash tree is relatively fast. This swift computation time renders parallelization inefficient due to the overhead introduced by communication and synchronization between worker processes. This inefficiency is particularly evident when comparing the serial and parallel construction times for k = 4: the serial construction finishes significantly faster than the parallel version with just one worker. As the prefix length increases (k = 5), the workload for each node becomes more intensive. This allows the parallel approach to demonstrate some performance gains, but these gains are limited: with k = 5, some speedup is observed up to w = 4 workers. However, the trend changes for even longer prefixes (k = 6, 7, and 8, as shown in Figures 5 and 6). Here, the increased workload per node due to the longer prefix makes parallelization more advantageous. We observe a clear reduction in construction time when utilizing a larger number of parallel workers (w = 8 for k = 6 or 7, and w = 16 for k = 8).
Based on these observations, we can conclude that the original hashcash tree exhibits a weakness in its partial parallel resistance when the prefix length surpasses k = 4 zeros.This finding is particularly relevant considering the recommendation of k = 4 or 5 zeros for prefix length in the previous work [13].Our analysis provides evidence that k = 4 remains the preferred choice for the original hashcash tree to maintain its parallel resistance.For scenarios requiring a longer prefix (e.g., k = 5), the new data structure emerges as a more suitable option, likely due to its inherent design inhibiting parallelization of node construction.This highlights the advantage of the new data structure in situations where stronger proof-of-work guarantees are necessary.
In order to test the puzzle protocol under varying conditions, we used the stress-ng utility (https://github.com/ColinIanKing/stress-ng; accessed on 1 July 2024) to simulate increasing workloads. In particular, we used the command stress-ng --cpu n, where n was assigned values between 1 and 32, to spawn n workers, each of which uses a CPU at 100%. The prefix length was fixed to 4, and the size of the hashcash tree was increased from 15 to 255 nodes. We measured the CPU usage (user time + system time) of both server and client to address 1000 (one thousand) requests, as well as the elapsed real (wall clock) time used by the process. Results are shown in Table 1, where we report the wall clock time and the percentage of CPU time used by the server (with respect to the total CPU time used by both the server and the client; we recall here that the time of the server includes the CPU usage for running FastAPI). It can be observed that even for hashcash trees of small size (15 or 31 nodes), the server has to address an easier computational task than the one addressed by the client. Finally, we observe that the effort of the server is not significantly affected by its workload, which anyhow is expected to be monitored to determine the size of hashcash trees: in case the server is overloaded, access to resources is granted only after the client computes a hashcash tree of relatively large size, so as to delay the rate of incoming requests.

(Figures 4-6 plot the runtime, in seconds, against the number of nodes.)

Related Work
Detecting DoS attacks remains a significant challenge, often requiring sophisticated techniques like machine learning approaches [4][5][6] (see also the survey by de Neira et al. [7]). As a proactive measure, prevention techniques offer valuable protection for critical services and organizational assets [15,16]. This work focuses on client puzzle protocols as a preventative technique specifically designed to mitigate DoS attacks.
The concept of client puzzles originated with Juels and Brainard [17], who proposed their use to prevent DoS attacks. The core principle behind client puzzles is that they are solvable by a legitimate user within a reasonable time frame (polynomial time) while requiring a minimal amount of computational resources. This allows the server to grant access to its resources only upon receiving a valid solution to a newly presented client puzzle. Similar concepts were explored by Dwork and Naor [9] with the introduction of pricing functions to combat junk emails, and by Rivest, Shamir, and Wagner [18] with timed-lock puzzles for timed-release cryptography. Effective client puzzles are expected to possess several key properties:

• Unforgeability: it should be computationally infeasible for an attacker to forge a valid solution without actually solving the puzzle [19].

• Difficulty: solving a client puzzle should require a sufficient amount of computational effort to deter attackers from launching resource-intensive DoS attacks.

• Determinable Difficulty: ideally, the server should be able to determine the difficulty level of the puzzle beforehand [20].

• Parallel Computation Resistance: the puzzle's difficulty should not significantly decrease when attackers attempt to solve it using parallel processing techniques [20].
Client puzzles can be broadly categorized into two main classes based on the computational resource they primarily rely on during the solving process: CPU-bound and memory-bound puzzles. This work elaborates on the Client Puzzle Protocol (CPP) introduced in [13]. It is a CPU-bound CPP that leverages a combination of hashcash [11] and hash trees [27]. The primary challenge of using hashcash alone stems from its unbounded, probabilistic cost. The only parameter that controls the difficulty of a hashcash puzzle is the prefix length (k); however, both the expected difficulty and the variance of the puzzle grow exponentially with k. To address this limitation and gain finer control over puzzle difficulty, Juels and Brainard (1999) proposed a CPP that essentially involves solving multiple sub-puzzles [17]. However, the solutions to all such sub-puzzles have to be transmitted and verified, which is practically infeasible. In contrast, verifying a hashcash tree only requires a logarithmic number of solution verifications. Coelho (2008) introduced a CPP based on hash trees [12]. This CPP can be viewed as a variant of the solution-verification version of the CPP given in Section 2.3, but with a key difference: in Coelho's CPP, the prefix length is fixed at k = 0 (effectively disabling hashcash) [11]. The tree is then constructed from the service description, and a selection of leaves is chosen for verification based on the root hash value. While this approach offers high precision in determining puzzle difficulty, it suffers from the potential for rapid growth in hash tree size. This poses a significant challenge, as the prover needs to store (or recompute) the entire hash tree to provide verification labels, which are only discovered after computing the root hash. In contrast, the CPP we adopted leverages smaller trees. This is achieved by enabling hashcash, which allows the difficulty to be controlled by adjusting the length of the required prefix for each individual node within the tree.
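To make the difficulty-control issue concrete, the following sketch solves a plain hashcash puzzle: find a nonce such that the hash of the service string concatenated with the nonce starts with k zero bits. This is an illustration under assumptions (SHA-256 as the hash function, 8-byte big-endian nonces; the function names are ours, not from [11]). Since each attempt succeeds with probability 2^-k, the expected number of iterations is 2^k, which is why difficulty can only be tuned in exponential steps when hashcash is used alone.

```python
import hashlib


def solve_hashcash(s: bytes, k: int) -> int:
    """Return a nonce such that sha256(s || nonce) has k leading zero bits.

    Expected number of iterations is 2**k, so both the mean and the
    variance of the solving cost grow exponentially with k.
    """
    nonce = 0
    while True:
        digest = hashlib.sha256(s + nonce.to_bytes(8, "big")).digest()
        if int.from_bytes(digest, "big") >> (256 - k) == 0:
            return nonce
        nonce += 1


def verify_hashcash(s: bytes, k: int, nonce: int) -> bool:
    """Verification is asymmetric: it needs a single hash evaluation."""
    digest = hashlib.sha256(s + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") >> (256 - k) == 0
```

The asymmetry is visible here: the prover loops roughly 2^k times, while the verifier computes exactly one hash, which is what makes hashcash-style stamps cheap to check on the server side.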
This work addresses a potential weakness identified in the original hashcash tree data structure: its susceptibility to a parallel processing attack. In the original construction, building a hashcash tree required minimal synchronization between processing units, allowing an attacker to distribute the workload across multiple processors and potentially solve the puzzles significantly faster, which could render the DoS prevention mechanism less effective. To address this issue, we introduced strategic synchronization points within the construction process of the hashcash tree. These synchronization points act as dependencies, essentially forcing different processing units (e.g., CPU cores) to wait for each other at specific stages. This deliberate introduction of dependencies disrupts the attacker's ability to fully parallelize the workload, hindering any attempt to exploit the system's resources. We further validated this approach through empirical studies. Our experiments confirm that the original version of the hashcash tree exhibited a significant weakness in its parallel resistance, whereas the modified version incorporating synchronization points demonstrated a substantial improvement in this respect. This improvement translates to a more robust defense against DoS attacks, as parallelization becomes less advantageous for attackers.
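The sequential dependency can be sketched as follows. This is a minimal illustration of the idea, not the exact Algorithm 3 of the paper: we assume a complete binary tree stored in BFS (heap) order, SHA-256, and the hashcash search from [11]. Because the label of node i is hashed together with the labels of its children (indices 2i+1 and 2i+2) and of node i+1, the next node in BFS order, the nodes must be labeled strictly from the last one back to the root, yielding a linear chain of synchronization points.

```python
import hashlib


def hashcash_label(data: bytes, k: int) -> tuple[int, bytes]:
    """Find a nonce so that sha256(data || nonce) has k leading zero bits."""
    nonce = 0
    while True:
        digest = hashlib.sha256(data + nonce.to_bytes(8, "big")).digest()
        if int.from_bytes(digest, "big") >> (256 - k) == 0:
            return nonce, digest
        nonce += 1


def build_tree(s: bytes, n: int, k: int):
    """Label a hashcash tree of n nodes laid out in BFS order.

    Node i's input includes the labels of its children (2i+1, 2i+2)
    and of node i+1, the next node in BFS order, so the loop must run
    strictly from n-1 down to 0: no two nodes can be hashed in parallel.
    """
    labels: list[bytes] = [b""] * n
    nonces: list[int] = [0] * n
    for i in range(n - 1, -1, -1):
        deps = b""
        for j in (2 * i + 1, 2 * i + 2):  # children in the BFS layout
            if j < n:
                deps += labels[j]
        if i + 1 < n:                     # next node in BFS order
            deps += labels[i + 1]
        nonces[i], labels[i] = hashcash_label(
            s + i.to_bytes(4, "big") + deps, k
        )
    return nonces, labels
```

In the original data structure the label of node i depended only on its children, so disjoint subtrees could be processed concurrently; adding the dependency on node i+1 collapses the available parallelism to a single sequential chain.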
In addition to the improvements introduced by the hashcash tree method, it is important to compare it with other DoS mitigation techniques to provide a comprehensive understanding of its relative advantages and limitations. In particular, we consider here memory-bound puzzles and machine-learning-based anomaly detection. Memory-bound puzzles are designed to be intensive in terms of memory access rather than CPU usage, which makes them particularly effective against attackers using specialized hardware optimized for CPU-bound tasks. Among their strengths, memory-bound puzzles are difficult to solve with specialized hardware, ensuring a more level playing field between attackers and legitimate users. On the other hand, these puzzles can be inefficient on devices with limited memory, potentially denying service to legitimate users as well. Thus, while memory-bound puzzles are effective in certain scenarios, they cannot be employed to protect services accessed by a general audience, who may use devices with limited memory. Machine-learning techniques for DoS mitigation involve training models to detect anomalies in network traffic that may indicate an ongoing attack. Among their strengths, machine-learning models can adapt to new attack patterns and have the potential for high accuracy in detecting sophisticated attacks. On the other hand, these techniques require extensive datasets for training, are complex to implement, and can suffer from false positives and negatives. While machine learning offers adaptability and accuracy, hashcash trees provide a simpler and more deterministic approach: they are easier to implement, do not require large datasets for training, and can be deployed in real-time scenarios with predictable performance, unlike machine-learning models, which might require periodic retraining and tuning. Nonetheless, it is also important to note that a more advanced mitigation method can be obtained by combining machine-learning-based anomaly detection systems and hashcash trees, essentially by determining the parameters that control the difficulty of constructing the hashcash tree from the anomaly level detected by the machine-learning algorithms.
As a final remark, our work builds upon the concept of PoW for securing decentralized systems. Recent research by Baniata and Kertesz (2024) [28] analyzed the security of PoW-based blockchains, identifying a critical difficulty level at which attackers can exploit the system with minimal resources; this emphasizes the need for robust security measures alongside increasing difficulty. Additionally, Agarwal (2024) [29] proposed a PoW algorithm using Corona Graphs as an alternative to SHA-based hashing in blockchains, an approach that allows for scalable difficulty based on graph size. While our work does not directly address PoW for blockchains, it contributes to the broader field of securing systems through cryptographic techniques. Our focus lies on leveraging PoW for client-side puzzles to prevent DoS attacks, offering a complementary approach to securing centralized systems.

Conclusions
In this work, we presented an improved version of the hashcash tree data structure designed to mitigate Denial-of-Service (DoS) attacks. We addressed a key limitation of the original design: its weakness against parallel processing attacks. Our approach involved strategically introducing synchronization points within the construction process, effectively disrupting an attacker's ability to exploit parallelization for faster puzzle solving. We conducted benchmark experiments to evaluate the effectiveness of our modifications. The results confirmed that the original hashcash tree exhibited a significant vulnerability to parallelization, depending on the adopted prefix length. In contrast, the improved version incorporating synchronization points demonstrated a substantial improvement in parallel resistance, which translates to a more robust defense against DoS attacks. Our work highlights the importance of considering potential weaknesses in cryptographic constructions, particularly when parallelization is a possibility. The proposed modifications offer a valuable enhancement to the hashcash tree, making it a more reliable and secure solution for DoS prevention. Finally, it is important to observe that our puzzle protects the application layer, while lower layers need different protection mechanisms to prevent, among other attacks, volumetric attacks (e.g., UDP floods) and protocol attacks (e.g., SYN floods).

Figure 2 .
Figure 2. Activity diagram of the modified Algorithm 3. It builds a hashcash tree of size n for the string s using hash function h and prefix length k. The label of each node depends on those of the children and the next node in BFS.

Figure 3 .
Figure 3. Activity diagram of the modified Algorithm 4. It verifies node i of a hashcash tree T of size n for the string s using hash function h and prefix length k. Recall that v_o denotes the node of T such that order(v_o, T) = o for o ∈ [1..n]. Note that the label of each node depends on those of the children and the next node in BFS.

Figure 4 .
Figure 4. Runtime to build hashcash trees of increasing sizes, using up to 32 parallel workers for the original data structure. Prefix length fixed to 4 (upper plot) and 5 (bottom plot).

Figure 5 .
Figure 5. Runtime to build hashcash trees of increasing sizes, using up to 32 parallel workers for the original data structure. Prefix length fixed to 6 (upper plot) and 7 (bottom plot).

Figure 6 .
Figure 6. Runtime to build hashcash trees of increasing sizes, using up to 32 parallel workers for the original data structure. Prefix length fixed to 8.

Table 1 .
Stress test of the puzzle protocol. Times are expressed in seconds.