Enhancing Ethereum Smart-Contracts Static Analysis by Computing a Precise Control-Flow Graph of Ethereum Bytecode

The immutable nature of Ethereum transactions, and consequently Ethereum smart-contracts, has stimulated the proliferation of many approaches aiming at detecting defects and security issues before thedeploymentofsmart-contractsontheblockchain.Indeed,theactionsperformedbysmart-contractsinstantiatedontheblockchain,possiblyinvolvingsubstantialﬁnancialvalue,cannotbeundone. Unfortunately,smart-contractssourcecodeisnotalwaysavailable,henceapproachesbasedonstaticanalysishaveveryoftentofacetheproblemofinspectingthecompiledEthereumVirtual Machine(EVM)bytecode,retrieveddirectlyfromtheblockchain.However,duetotheintrinsiccomplexityofEVMbytecode(especiallyinjumpsaddressresolution),thestate-of-the-artstatic analysis-basedsolutionshavepooraccuracyintheautomateddetectionofEthereumsmart-contractsprogrammingdefectsandvulnerabilities. Thispaperpresentsanovelapproachbasedon symbolic execution of the EVM operands stack that allows to resolve jumps address in the EVM bytecode and to construct a precise Control-Flow Graph (CFG) of compiled smart-contracts. Many static analysis techniques are based on a CFG-based representation of the smart-contract to validate, and would therefore beneﬁt from our approach. We have implemented the CFG reconstruction algorithm in a tool called EtherSolve . Then, we have validated the tool on a large dataset of real-world Ethereum smart-contracts, showing that EtherSolve extracts more precise CFGs, w.r.t. state-of-the-art available approaches. Finally, we have extended EtherSolve with two detectors for two of the most prominent Ethereum smart-contracts vulnerabilities (Reentrancy and Tx.origin). Experimental results show that exploiting the proposed CFG reconstruction static analysis, leads to more accurate vulnerabilities detection, w.r.t. state-of-the-art security tools.


Introduction
Smart-contracts extend blockchain-based cryptocurrencies (e.g., Ethereum), allowing to store programs transparently on the blockchain, that are eventually executed by the distributed network of miners.In other words, a smart-contract is a self-executing program that runs on a blockchain.
The peculiarity of a smart-contract is that the program and all the actions it performs (e.g., money transactions) are immutable: once written on the blockchain, they cannot be modified nor deleted, even in case of programming defects identified after deployment.Programming defects and vulnerabilities might result in frauds or in financial values that are frozen indefinitely [44].Moreover, smartcontracts security is becoming a crucial point due to the advent of the so called Decentralised Financial (De-Fi): a range of semi-familiar financial products re-skinned for the cryptocurrency age [10].DeFi sees a great involvement of new smart-contracts, reaching 3.1 million contract calls in a single day [12] with $88.5 billion total value locked [45], and these numbers are still growing.
For these reasons, code review, possibly supported by automated analysis tools, is crucial, in order to detect programming defects and vulnerabilities before smart-contracts deployment or before erroneous transactions committed (and permanently stored) in the blockchain.This is especially critical for closed-source smart-contracts, whose source code cannot be inspected by end-users, and the only available representation is the compiled EVM bytecode on the blockchain [38].
The precision of the static analysis is one of the key points to promptly deploy and run correct smart-contracts.However, state-of-the-art tools for detecting programming defects and vulnerabilities in Ethereum smart-contracts experience substantial limitations: analysis results contain lots of false positives (false alarms) and false negatives (overlooked issues) [31].A possible explanation for such poor performance could be the intrinsic difficulty of analyzing the EVM bytecode.In fact, despite the bytecode is easy to parse (fixed length opcodes) its semantics and control-flow graph (CFG) are difficult to reconstruct due to the following EVM design choices.
• Jumps destination is not an opcode parameter.A jump opcode assumes the destination address to be available on the stack, dynamically computed by previous code.
• There is no opcode for returning from functions: the "return" is implemented by pushing the return address on the stack, and then performing a jump.
• Functions are removed by the compiler.Intra-contract function calls are replaced by jumps.Inter-contract function calls are resolved by a "dispatcher" placed at the smart-contract entry-point, that decides what address to jump to, depending on the call actual parameters.
• The smart-contract constructor is executed only when the contract is deployed on the blockchain, and then discarded.Thus, its bytecode is not available in the blockchain.
The most effective static analysis algorithms rely heavily on the particular representation of the code, that is usually based on the control-flow graph.Indeed, CFG precision is a key point of every analysis, and when research tools base their analysis on a partial or imprecise CFG, the final tool results are also imprecise.Since it is still quite difficult to precisely compute a precise CFG, often, tools use alternative program representations (e.g., trace tree [40] or three address code [8]), sacrificing analysis accuracy.
In this paper we go in the opposite direction: we aim at extracting a precise CFG from the EVM bytecode.To this end, we propose a static analysis approach, called symbolic stack execution, that resolves jumps destination based on the symbolic execution of the operands stack.After the jumps destination is resolved, an accurate CFG can be built.The approach has been implemented and open sourced as EtherSolve 1 , a fully automated tool to compute a precise CFG starting from EVM bytecode.This result represents a beneficial starting point for any subsequent sophisticated static analysis meant to identify programming defects or vulnerabilities in Ethereum smart-contracts.To quantify the improved precision of this CFG, we have extended Ether-Solve adding analyses aiming at detecting two of the most prominent Ethereum smart-contracts vulnerabilities [23], namely Reentrancy and Tx.origin.These detectors turn out to perform better than state-of-the-art security EVM bytecode scanning tools, with results comparable to analysis tools that can also inspect smart-contracts source code.
The present paper is a deeply revised and extended version of the companion conference paper [16].In particular, we better explained the problem and improved the overall presentation of the proposed methodology.Furthermore, we extended EtherSolve with a mechanism to identify functions entry-point and a specific analysis detecting Tx.origin vulnerabilities.Finally, we extended the experimental validation, adding an assessment of the accuracy of EtherSolve in identifying functions entry-point and a comparison of vulnerabilities detection efficacy (Reentrancy and Tx.origin) with the state-of-the-art source code level analysis tools.
The paper is structured as follows.After covering the background of smart-contracts in Section 2, the static analysis reconstructing the CFG of Ethereum smart-contracts is described in Section 3. In Section 4, the static analysis is extended in order to detect Reentrancy and Tx.origin vulnerabilities.Section 5 presents our empirical validation and comparison with state-of-the-art tools.Then, Section 6 discusses the related work and Section 7 closes the paper.

Ethereum
Ethereum is an open-source platform for decentralized applications, based on the blockchain technology.On the Ethereum network, it is possible to write simple programs, called smart-contracts [22,4], that (semi-)automatically manage the underlying network cryptocurrency, called Ether (ETH).The actions that can be performed in Ethereum are transactions, i.e., transfer of funds or data between different ETH accounts.Every new transaction is irreversible and it is permanently added in a new block that updates the blockchain [22,4].
In the Ethereum network each principle has an account identified by an address (a sequence of 20 bytes).There are two types of accounts: Externally Owned Accounts (EOA) and contract accounts [9].The former is a simple address that does not point to any code: it can only emit and receive transactions (similarly to Bitcoin wallets [5]).The latter is the identifier of a smart-contract deployed in the network, which is run whenever a transaction is sent to its address [4].
Ethereum uses a proof-of-work (POW) system as a consensus mechanism [55,6].Participants of the network, called miners, use their time, computational power and crypto currency assets to compete.The miner that succeeds is allowed to add blocks to the Ethereum blockchain and gets a reward [9].Rewards are paid by the users who invoke the execution of a smart-contract or simply want to transfer funds to other accounts.In fact, every operation in the network has a cost expressed in the unit of Gas and the price per unit is expressed in Wei, a fraction of an Ether.In the near future, a switch to a proof-of-stake (POS) consensus mechanism will be performed, yielding to Ethereum 2.0.In a POS setting, miners are replaced by stackers, that validate transactions based on the amount of cryptocurrecny they have staked, instead of the computational power they are able to employ.

The Solidity Language
Smart-contracts for Ethereum can be written with different high-level programming languages, but Solidity [25] is indubitably the most wide-spread [28].Solidity is a Turingcomplete object-oriented language and smart-contracts are basically objects with functions and fields.
The example in Listing 1 reports a smart-contract written in Solidity to implement a bank.The field balances stores the internal state of the smart-contract.It is a key-value map that associates every address to an integer value representing the funds owned by the address account.The functions deposit and deposit100 allow the user to deposit currency into its virtual account.The former allows to deposit an arbitrary amount, the latter is a special case which allows to transfer exactly 100 Wei.The withdraw function allows the user to get back a certain amount of Ether previously deposited.Solidity provides different primitives to interact with the blockchain environment, for instance: transfer, that sends Ether to a certain address; revert, that makes the transaction fail and rolls-back to the state preceding the transaction; require, that enforces a certain boolean condition and in case the condition is not met a revert is performed.

Compiling Solidity into EVM Bytecode
In order to actually run a smart-contract on the Ethereum blockchain, the Solidity source code needs to be compiled into EVM bytecode, in order to be executed by the Ethereum Virtual Machine [29].
Given a smart-contract, the Solidity official compiler solc generates the creation code and the runtime code.The former is the constructor of the smart-contract, that performs the initial operations and deploys the runtime code on the blockchain (the constructor code is then discarded and not stored in the blockchain [26]).The latter is the actual bytecode deployed on the blockchain and it is divided in three main segments.The first segment contains the opcodes that the EVM executes; the second segment is optional and contains static data (e.g., strings or constant arrays); the last segment contains the metadata.In particular, the metadata segment contains compilation information, such as the compiler version and the (hashed) sources used, in order to verify its source code [25].The metadata segment is hashed and appended to the contract bytecode.
In addition, solc also produces the Application Binary Interface (ABI), a file containing the list of the functions in the smart-contracts that can be called by a user, together with the type and number of parameters.Functions are not identified by their name but by the hash of their signature.The ABI file is not deployed on the blockchain, it is distributed separately to all parties that aim to interact with the smartcontract.§ ¤ Concerning the EVM execution, the main memory of a smart-contract consists in a stack, namely a volatile LIFO queue with 1024 blocks of 32 bytes [25,29].The execution relies heavily on it, as arithmetic and logic operations follow the reverse polish notation, where the data are loaded into the stack before the operation [48].For instance, the (hexadecimal) bytecode string 6005600301 translates to opcodes list PUSH1 0x05 PUSH1 0x03 ADD, and the EVM execution will: (i) push a byte to the stack containing the value 0x05; (ii) push the value 0x03; and (iii) execute the addition operation, which consumes two elements from the stack and produces their sum as result, leaving the final stack with the value 0x08 only.
The EVM bytecode is composed by EVM opcodes that can be grouped in categories, including arithmetic and logic operations, control flow operations, stack operations, environmental and block information, memory and storage operations and system operations.The complete list of opcodes with their semantics is defined in the Ethereum yellow paper [53], and there can be little variations among different EVM versions.Listing 2 shows a portion of a Solidity smart-contract compiled into EVM bytecode, while Listing 3 shows the translation of the bytes into EVM opcodes.Bytecode can be easily parsed into opcodes, which are the minimum instructions that the EVM can execute and are identified with bytes.
Every opcode pushes or pops a certain number of elements from/to the stack, and it can access memory, get information about the execution environment or interact with other blockchain smart-contracts.The only opcodes with a parameter are those in the PUSH family: the value that the EVM pushes into the stack is taken directly from the bytes following the opcode.There are different variants of PUSH, depending on the number of bytes that needs to be pushed to the stack, varying from PUSH1 (1 byte is pushed) to PUSH32 (32 bytes are pushed) [53,29].
A portion of the code can be used as read-only data; in fact with the CODECOPY opcode the execution can copy a portion of the code to the memory and then treat it as data [8].Thus, parsing this segment of memory as code might generate spurious results, including invalid opcodes and wrong jumps destination.
The control-flow of the smart-contract is also managed by means of the stack.In fact, in order to jump among different portions of the code both the JUMP and the JUMPI opcodes (unconditional and conditional jumps, respectively) have to read the jump destination from the stack.Jumps destination is not indicated with a label, but with the offset w.r.t. the next instruction in the code [8].Unlike x86 Assembly, in the EVM there is not the concept of function: everything is managed through jumps (there are no opcodes for function calls nor for call returns).The only return available is for function calls coming from external smart-contracts.These design choices make the EVM bytecode quite difficult to analyze statically.In particular, since jumps destination is computed at run-time, the CFG cannot be reconstructed without a sort of stack simulation, whose accuracy directly affects the precision of extracted CFG.

D R A F T
When a transaction starts the execution of a smartcontract, it can send both funds and information as call data.In order to transfer the control to the code corresponding to the intended function, the compiler adds a dispatcher at the beginning of the contract code.When a caller is willing to execute a certain contract function, it sends a transaction that contains the hash of the function signature, so that the dispatcher can compare it with all the hashes of the smartcontract functions and then move the execution to the begin of the corresponding function code.Instead, if no call data are supplied or none of the hashes matches, the dispatcher moves the execution to the beginning of the fallback function.This function has not parameters nor return values [25], and it is called automatically when a money transfer is performed (on the destination contract).
Every function call in the Solidity contract is translated by the compiler into a sequence of PUSH opcodes, followed by a JUMP and a JUMPDEST.This sequence loads into the stack the return address of the calling context, the (optional) actual parameters and the address of the function to call.Then JUMP executes the function body, that eventually consumes the parameters from the stack, leaving the return address which, once executed, moves the execution to JUMPDEST, resulting in an actual return statement.

Control-Flow Graph Reconstruction
As already introduced, the main goal of our approach is to reconstruct a precise Control-Flow Graph (CFG) of Solidity smart-contracts, starting from the EVM bytecode only (no ABI nor source code is needed).The CFG is a directed graph representing the smart-contract flow of execution: nodes are the contract basic blocks (sequence of opcodes with no jumps), while edges connect potential successive basic blocks.In this section we will describe in details how we retrieve a CFG from EVM bytecode.

Approach Overview
The CFG reconstruction algorithm is composed of the following incremental steps.

Bytecode Parsing
The binary representation of the bytecode is split in the actual smart-contract code and in the metadata.Then, the code part is further parsed to identify opcodes.

Basic Blocks Identification
Opcodes are grouped in basic blocks and the explicit jump destinations between basic blocks are computed.

Symbolic Stack Execution
Symbolic execution is applied to the execution stack, in order to resolve non-trivial jump destinations.

Static Data Separation
The static data segment is separated from the actual executable code.

CFG Decoration
The obtained CFG is decorated to highlight the dispatcher and to identify the entry-point of the fallback function.

Entry-points Detection
Functions entry-point is detected inspecting the dispatcher blocks.
In the following, we describe these steps in detail, referring to the deposit100 function of the SimpleBank smart-contract of Listing 1.

Bytecode Parsing
The analysis starts with the binary representation of the EVM bytecode.The metadata section is identified by finding the corresponding header reported in the official Solidity documentation [25].In case of metadata with experimental features, the header is different and not documented.Indeed, as stated in the Solidity documentation [25], some features (such as the pragmas ABIEncoderV2 and SMTChecker) are not enabled by default.Still, they can be enabled by using D R A F T the experimental pragma modifier.We inferred the header structure of non documented experimental cases by manually inspecting the bytecode of some contracts.The version of the Solidity compiler used to generate the bytecode is extracted from the metadata.
The metadata are then dropped and the remaining bytes are considered as the actual smart-contract code to be further parsed.An example of how the bytecode is parsed into opcodes is shown in Listing 2 and Listing 3, where every two characters of the bytecode are translated into the corresponding opcodes (e.g., 0x6080 becomes PUSH1 0x80).Each opcode is unequivocally identified by its offset address, i.e., the position of the opcode in the bytecode.

Basic Blocks Identification and Pushed Jumps
A basic block is a sequence of opcodes which are executed consecutively between a jump target and a jump instruction, without any other instruction that alters the flow of control.Thus, opcodes that alter the control flow of the program divide the code into basic blocks.Opcodes JUMP, JUMPI, STOP, REVERT, RETURN, INVALID, SELFDESTRUCT mark the end of a basic block, whereas JUMPDEST marks the beginning of a new basic block.Every basic block is uniquely identified by its offset, i.e., the position of its first opcode in the bytecode.In Fig. 1 we can see the basic blocks of the code in Listing 3 extracted following this procedure.Indeed, each basic block either starts with a JUMPDEST or ends with an opcode which alters the control flow.
Once the code is divided into basic blocks, we proceed with the computation of the CFG edges.This operation is not always trivial as the jump destination is not an opcode parameter, but rather it is available on top of the stack at execution time.We identified two types of jumps: pushed jumps and orphan jumps.A pushed jump is immediately preceded by a PUSH opcode, so that its target is easy to resolve, just by looking at the value in the preceding PUSH opcode.Instead, orphan jumps are not preceded by a PUSH and their target is not immediate to compute.In the example CFG of Fig. 1 the block (starting at offset) 53 ends with an orphan jump, whereas the remaining jumps are pushed jumps.
We start by computing the edges resulting from pushed jumps, then the edges resulting from orphan jumps will be computed in a subsequent phase.To this end, each basic block is analyzed according to its last opcode as follows.
JUMP preceded by a PUSH The argument of the push is the destination offset of the jump and the corresponding edge is added to the CFG.
JUMPI preceded by a PUSH The false branch goes to the next block (in offset order), while the true branch is the argument of the push interpreted as destination offset for the JUMPI.The two corresponding edges are then added to the CFG.
JUMP not immediately preceded by a PUSH The resolution of the jump is not trivial and it needs to be  resolved through symbolic stack execution, described in Subsection 3.4.

Others Opcodes like STOP, REVERT, RETURN, IN-
VALID and SELFDESTRUCT have no successors, as the control flow is interrupted.
At the end of this phase we have extracted a partial CFG where the edges related to orphan jumps are still unresolved.For example, the extraction of the CFG of the code in Listing 1 at this point is depicted by the basic blocks and continuous edges of the CFG in Fig. 1, while the outgoing edge from the basic block 53 has not been resolved yet.

Symbolic Stack Execution and Orphan Jumps
The most challenging step in the CFG reconstruction is the resolution of the destination addresses for orphan jumps.These jumps are very common: for instance, the Solidity compiler uses them to return from function calls.Indeed, between the function entry-point and the function exit-point (i.e., the return), the stack is heavily used by the function body to implement all the desired features (arithmetic operations, calls to other functions or transfer of funds).
The analysis consists in executing the stack symbolically: the algorithm walks the partially built CFG executing only the opcodes that interact with jump addresses, updating the state of the stack accordingly, in such a way that orphan jump destinations can be found on the symbolic stack.Indeed, the symbolic stack execution considers only the opcodes in the PUSH, DUP and SWAP families, together with the AND and POP opcodes.For every other opcode the symbolic stack pops and pushes unknown elements, as they do not deal with the jump addresses.
In the example depicted in Fig. 2 there is a simple piece of code that has been executed symbolically to highlight the procedure.In particular, the ADD is not modelled in full D R A F T Other .() times, .ℎ(unknown) times details (we do not need to): it simply consumes two elements of the stack and then generates a single unknown value to be pushed.The jump address is loaded before arithmetic operations, but it persists until the actual JUMP, so it can be resolved.
The symbolic stack execution handles the opcodes according to the rules represented in the look-up Table 1, where denotes a stack that can contain numeric or the unknown values.We indicate the top of the stack with the position 0. The third and fourth columns of the table represent, respectively, the number of pop and push operations executed on the stack, according to the language documentation [53].The last column describes the effect of an opcode on the symbolic stack, where ′ is the symbolic stack after the modification.The last row of the table considers a generic opcode not involving jump addresses, that performs an arbitrary number of push and pop operations on the stack.Note that, can be only 0 or 1 since, at the moment, there are no operations that push more than one element in the stack.
The algorithm walks through the CFG using a Depth-First Search (DFS) keeping a snapshot of the stack state for each basic block.The following constraints have been introduced in order to avoid infinite loops: an edge cannot be analyzed more than once with the same symbolic stack state; and there is a limit on the number of elements to compare when checking for stack equivalence.
Another important aspect to note is the fact that a function can be called from different points of the code, resulting in different symbolic stacks and different paths of the CFG.Indeed, blocks can be traversed during the symbolic execution by multiple paths.A path is said to be infeasible when a real execution, given a particular state, would never take it.An example is a path that contains branches that are guarded by contrasting conditions, e.g., > 0 in the first branch and ≤ 0 in the second branch.Even if the two branches are not dead code, they can not be taken by the same execution when is an input.In order to avoid infeasible paths, when the DFS visit encounters a block ending with JUMP, only its destination block (obtained from the symbolic stack) is added to the DFS queue.The detailed algorithm for resolving orphan jumps is shown in Algorithm 1.It starts at Line 2 by initializing the variable , that stores the edges that have already been analyzed using stack equivalence as described before (lines 20 and 27).An edge is also labelled with the symbolic stack that has been used for its symbolic execution (lines 19 and 26).Then, the queue used for the DFS is initialized at Line 5: it contains pairs with a block and a symbolic stack.The first pair ⟨ , ⟩ contains the first block and an empty stack.Then, the algorithm proceeds with the symbolic stack execution by iteratively repeating the following steps until is empty.Lines 9-11 symbolically execute the opcodes of the basic block and update the state of the symbolic stack according to the look-up Table 1.
Lines 12-16 resolve orphan jumps destination with the newly updated symbolic stack.The target block is added as a successor of the basic block under analysis.
Lines 17-31 handle the update of the queue .If the edge from the analyzed basic block to the target one has not been already analyzed using the same stack, then the successor blocks are added to .If the last opcode is a JUMP then only its target block is added to .
An example of the symbolic stack execution for the resolution of orphan jumps is shown in Fig. 3  portion of the contract shown in Listing 3 (the corresponding CFG is depicted in Fig. 1).The symbolic execution starts at the offset 36, which loads into the stack the value 0x29 after the value 0x27.Then, our approach symbolically executes the JUMP opcode that, according to Table 1, consumes a value.Next, the symbolic execution of JUMPDEST leaves the stack unchanged and then the value 0x01 is loaded.The execution proceeds until the opcode at offset 129 is reached, which leaves an unknown value on the stack, that is then removed by the POP.Finally, the opcode at offset 131 contains the orphan jump, which can be resolved with the value pushed into the stack back at offset 34.At this point, the symbolic stack execution can detect that the successor basic block is the one at offset 39.
Eventually, we have resolved the target of all branches in the CFG, so the dashed edge in Fig. 1 is added at the end of this phase.

Static Data Separation
The proposed approach proceeds with the removal of static data (if present).The first basic block containing the instruction 0xFE, which is the designated opcode for an invalid instruction, is identified.In fact, the Solidity compiler uses this opcode to mark the end of the executable code section and the beginning of the static data section.All the subsequent opcodes are removed from the graph, and considered as static data and not as code.Then, the algorithm proceeds by removing from the graph any basic block that is not connected to the main graph, if any.
The static data are usually strings contained in the source code or child contracts which are instantiated by the main one through the opcodes CREATE and CALL.Even if, in principle, we cannot be sure that the removed data is actually data and not code, our experimental validation indicates that this assumption is reasonable.

Control-Flow Graph Decoration
In order to provide to the CFG additional information potentially useful to an analyst or for a static analysis (e.g., for vulnerabilities detection), our approach tries to highlight some relevant code portions, such as the dispatcher, the fallback function and the last basic block of the contract.
The dispatcher is the entry-point of the smart-contract, so it is at the beginning of the bytecode.The dispatcher directs the execution to the intended Solidity function and it manages parameters and return values.The fact that the dispatcher manages return values is the key used for its detection.In fact, the only basic blocks that contain instructions such as RETURN and STOP are part of the dispatcher.These opcodes cannot be present in other locations as they would manage return values outside the dispatcher.So, the algorithm considers as dispatcher every block with an address lower than the address of these opcodes.In the example of Fig. 1, the dispatcher blocks are highlighted in gray.
This approach is effective in identifying both the linear dispatchers, used in the older versions of the Solidity compiler, and the tree dispatchers, introduced in the latest versions of Solidity in order to improve performances.
The detection of the fallback function entry-point is more difficult, because the dispatcher structure has been changing continuously across different versions of the Solidity compiler.In Figure 4 we report a portion of the CFG retrieved from a simple contract with declared fallback function.In the picture, white nodes are code nodes, while gray nodes are dispatcher nodes (we reported only some code and dispatcher nodes for simplicity, outgoing edges to not reported nodes have been omitted).As you can see from the picture, the dispatcher is quite complex and it is not trivial to identify automatically that the node at offset 0x97 is the first block of the fallback function.
The first check of the dispatcher is the presence of call data and, if missing, it moves the execution to the fallback function.Hence, the currently implemented technique starts from the entry block searching for the highest successor offset.The successor with the highest offset is considered as fallback only if it does not end with a REVERT.Indeed, that would mean that the fallback function has not been declared or has been declared with only the REVERT statement.However, this approach does not work with some versions of solc, due to different compilation patterns.
The last step of the CFG decoration is the addition of an artificial unique exit-point, for all the basic blocks with no successor.This could be useful for many static analyses techniques.This particular basic block in the example in Fig. 1 is the number 132.

Functions Entry-point Identification
In this final step we try to extract the hashes of the functions in the dispatcher blocks.As mentioned before, in order to pass the control to a desired function, the dispatcher matches the call data with the hash of the functions signatures.Hence, the algorithm navigates the dispatcher blocks looking for the opcodes dealing with such hash comparisons, D R A F T  obtaining a set of likely function hashes.The fallback function has no associated hash, and it is called in case no hash matches the call data.
In the example in Fig. 1, the function with hash 909dd8f6 is identified by the block starting at offset 12 (the fallback function is not present).

Smart-Contract Vulnerabilities
To demonstrate the usefulness of a precise CFG representation of Ethereum smart-contracts, we defined two static analyses for detecting smart-contracts vulnerabilities on top of EtherSolve, and compared their efficacy w.r.t.state-of-the-art detection tools.In particular, EtherSolve has been pipelined with two subsequent static analyses meant to detect cases of Reentrancy and Tx.origin vulnerabilities.Note that, we selected two security-related vulnerabilities, but any generic CFG-based analysis would benefit from our approach (e.g., data-flow analyses or under/over-flow detection mechanisms).
In this section, we first describe Reentrancy and Tx.origin vulnerabilities and then we explain the approach used to check such bugs in the EVM bytecode, based on the CFG extracted by EtherSolve.In the next section we will validate the efficacy of the proposed detectors on a dataset of realworld smart-contracts.

Reentrancy
One of the most prominent and dangerous vulnerabilities in Solidity (and, hence, in Ethereum smart-contracts) consists in the mishandling of possibly reentrant code.It has been made famous due to the catastrophic DAO incident [44], that caused the loss of a large amount of money and serious consequences to the whole Ethereum network.This vulnerability consists in reentering a paying function multiple times while the contract is in an inconsistent state, thus causing possible leak of funds [31].In the general case, the "reentrance" exploits the fact that the vulnerable contract calls primitives, such as money transfer, that the malicious contract can redefine, in such a way to reenter the vulnerable contract.If a money transfer occurs at each iteration of this loop, the process can be repeated and used to drain all resources from the attacked contract.Indeed, Reentrancy is consequence of an abuse of dynamicity in Solidity: the semantics of money transfer is dynamic and can be redefined.
A simpler, yet quite common, programming error that may lead to Reentrancy attacks consists in updating the contract state after (instead of before) executing a fund send primitive (i.e., a call).An example of a Reentrancy vulnerable contract following this pattern is shown in Listing 4, where the call statement at line 7 precedes the update of the variable balances at line 8.In Listing 5 we have a (malicious) contract that exploits the reentrancy vulnerability of the ReentrantContract.In particular, the attack is launched calling the withdraw method on the vulnerable contract (line 6).When the call primitive is executed in the withdraw method, performing the money transfer, the fallback function of the contract MaliciousContract is fired.The latter contains a recursive call to the vulnerable contract, again on the withdraw method (this is the reentrant code).Since the balances variable in the vulnerable contract is updated after the money transfer, the second withdraw can be legitimately called, and a second money transfer is performed.The process is repeated until all money is drained from the vulnerable contract.
Reentrancy Detector.Reentrancy vulnerabilities be exploited when a contract state update is performed after calling a, possibly unsafe, paying function.The proposed approach consists in traversing the CFG in order to detect potential flows of execution where a SSTORE opcode (which updates the contract state) is executed after a CALL opcode.This pattern is considered unsafe if the contract address where funds are transferred to by the CALL cannot be statically determined by the symbolic stack execution.Indeed, in this case the funds destination could be controlled by an attacker who can mount an attack to exploit a Reentrancy vulnerability (e.g., with an particularly crafted fallback function).Note that, Solidity statements to send funds are translated with the CALL opcode, that has the callee address hard-coded into the bytecode.
In particular, the detector traverses the contract CFG in order to detect blocks that refer to a CALL opcode, namely blocks that may call a paying function.For each such block, the detector checks if the block is unsafe.To do so, a symbolic stack execution on the block is fired, retrieving the pushed element on the stack when the CALL is executed.If the element is an unknown address the block is unsafe.Finally, for each unsafe blocks the detector checks D R A F T

Tx.origin
Another quite dangerous vulnerability is related to the misuse of the tx.origin command, that returns the address of the (first) EOA in the callers path ended up in the current contract instance.Indeed, during a contract execution, a chain of calls may happen: an EOA calls a contract method that, in turn calls another contract, and so on.In this case, tx.origin returns the address of the EOA performing the first transaction.Instead, the command msg.sender returns the address of the closest parent caller, that could be either a EOA or a contract.
Among the first instructions of a contract method, very often we can find an authorization check: the contract verifies that the caller is authorized to perform the subsequent operations.Indeed, a contract can execute operations on behalf of the caller, comprising funds transfer.The latter check should be performed using the msg.sender instead of the tx.origin, since the actual contract caller may be different from the initial EOA starting the first transaction.
We have a Tx.origin vulnerability in a smart-contract when into a method of the contract the command tx.origin is used incorrectly (i.e., in place of the command msg.sender), as we can see in the example described in Listing 6.In the example, we can see that tx.origin is used in the method sendTo to perform the caller authentication (line 2).To exploit the vulnerability of the method sendTo, another contract may use the code described in Listing 7. In this case, the fallback function (line 6) of the (malicious) contract MaliciousContract is able to call the vulnerable method sendTo (line 7) and to perform operations on behalf of the user that have called the contract.In particular, when the victim calls the malicious contract, the latter transfers all money of the victim to a specific address (of the attacker that have crafted the malicious contract).The execution of this contract calls chain is performed without errors on the blockchain.
The vulnerability arises because when an EOA calls a contract, all the recursive calls performed by that contract have the address of the EOA as tx.origin, potentially transferring the EOA credentials, as we have seen the example.
Tx.origin Detector.Concerning Tx.origin vulnerabilities, they can be exploited when we perform an authorization check on the contract caller using tx.origin instead of msg.sender, namely when we have a line like require(tx.origin==msg.sender) in the vulnerable contract.The proposed approach consists in traversing the CFG in order to detect the blocks resulting from the translation of that kind of line of code in the corresponding EVM bytecode.In particular, the block is identified by the opcode ORIGN, namely the translation of tx.origin, and it ends with a JUMPI, namely a conditional jump based on the equality concluding the block.Considering that the authorization check is usually performed at the beginning of the function, we can assume that the block is completely demanded to the translation of the require line.To identify this kind of blocks, EtherSolve scans the contract CFG in order to find the following common patterns: § ¤ ORIGIN PUSH20 AND EQ PUSH2 JUMPI ORIGIN PUSH20 AND EQ ISZERO ISZERO PUSH2 JUMPI ¦ ¥ Unfortunately, patterns may miss potential vulnerability instances.Indeed, some particular compiler versions may translate the require line to slightly different bytecode.To mitigate the problem, EtherSolve performs, in addition to the pattern matching check, a simple taint analysis on the block containing the opcode ORIGIN.In particular, EtherSolve performs a symbolic execution in order to propagate in the stack the value inserted by the ORIGIN and to verify whether the value inserted influences the JUMPI at the end of the block or not.

Experimental Validation
In this section we present the results of our empirical validation of the CFGs computed by EtherSolve.The following five research questions guide the definition of our experimental validation.The first research question investigates the extent to which EtherSolve can process instances of existing smartcontracts with no errors.We are interested in verifying this on a wide range of smart-contracts, directly taken from the Ethereum blockchain.In the second research question we assess the accuracy of EtherSolve in finding functions entry-point, by comparing entry-points found with Ether-Solve with the actual entry-points present in smart-contracts source code.The third research question compares our approach with the state-of-the-art EVM bytecode static analysis tools, w.r.t.success rate and CFG reconstruction precision.The fourth research question compares the results of the Reentrancy vulnerability detection based on Ether-Solve with state-of-the-art vulnerability detection tools.Finally, the fifth research question compares the results of the Tx.origin vulnerability detection based on EtherSolve with state-of-the-art vulnerability detection tools.Note that, the last two research questions are quite challenging since Ether-Solve works at the bytecode level while the considered tools work at source code level.Indeed, having the possibility to inspect the source code enhances considerably the analysis performance.We considered tools working at source code level since, to the best of our knowledge, there are no tools that detect vulnerabilities based on bytecode only 2 .

The Smart-Contracts Datasets
Our empirical validation has been conducted using two datasets of smart-contracts.The first one, dubbed Etherscan dataset, is used to answer 1 , 2 and 3 , and it is obtained from the list of verified contracts published by Etherscan 3 [27].They are publicly available open-source smartcontracts with information about compilation, deployment and transactions.From this list we have randomly extracted 1,000 contracts, using the standard random function of Python 3 (that samples from a uniform distribution) Using the APIs provided by Etherscan, both the EVM bytecode and the relevant information have been downloaded, obtaining for each smart-contract its name, address, hash, deployment date, bytecode length and compiler version.Note that, we do not use the Etherscan information during the analysis, we use it only to validate the information retrieved by EtherSolve.For instance, the compiler version is automatically retrieved by EtherSolve from the contract metadata, so the compiler version obtained from Etherscan is only used to check whether the version extracted from the metadata is correct or not.We enforced the uniqueness of smart-contracts bytecode, in order to avoid duplicates in the dataset.Indeed, it is common practice to reuse existing smart-contracts, especially libraries and interfaces, and deploy them multiple times in the blockchain, at different accounts.For this reason, we computed the hash of the bytecode of each smartcontract, which has been used to delete the duplicates.
Figure 5 shows some demographics of the dataset, computed on the source code available on Etherscan.It reports 2 Some tools work at bytecode level but use the information contained in the contract ABI.We do not consider such tools as bytecode-only. 3Contracts have been download on June 10, 2020 the histograms of the length of contracts solidity source code (left-hand side) and the number of public functions in their ABI (right-hand side).On average, contracts contain 1,374 lines of code, with the largest one with 11,743 lines.They, on average, contain 19 named functions (i.e., excluding the fallback function, when defined), with the largest contract having 108 functions.The average bytecode length of the smart contracts in the dataset is 7,351 bytes, with a maximum length of 24,570 bytes (65 contracts exceeded the length of 20 Kbytes).The average number of transactions of the smart contracts in the dataset is 337, with 27 contracts having 0 transactions.Note that, the Etherscan APIs allow to download at most 1,000 transactions, hence for some contracts the count has been truncated (13 smart contracts in the dataset have more than 1,000 transactions).The most diffused EVM version in the dataset is Petersburg (340 contracts), followed by Istanbul (210 contracts), Byzantium (114 contracts) and Costantinople (4 contracts).For some contracts compiled with old versions of solc (v0.4.26 and before) EtherScan does not provide the precise target EVM version, it just signals that the Default version is used.Unfortunately, such solc versions are not documented and, hence, it is not possible to deduce which is the corresponding default target EVM version.This happens for 331 contracts in the dataset.The average balance of the smart contracts in the dataset is 9.6 × 10 17 Wei.As shown in Fig. 6(left), the compiler version is quite variable (with a majority of old versions).This datum is crucial in order to assess that EtherSolve does not assume a specific Solidity version (the compiler often underwent dramatic changes).
The second set of smart-contracts, dubbed SolidiFI dataset, consists in 50 Ethereum smart-contracts collected by Ghaleb and Pattabiraman [31].They used the dataset to test their tool SolidiFI, that injects various vulnerabilities into Solidity code.We will use these smart-contracts, together with their injected versions, in order to answer 4 and 5 .

1 : Success Rate
In this research question we evaluate the number of contacts in the Etherscan dataset that EtherSolve is able to analyze without crashing, i.e., the Success Rate.The Success Rate is defined as the ratio of the smart-contracts analyzed without critical errors to the size of the dataset.EtherSolve managed to analyze all the smart-contracts in the dataset except three, obtaining a Success Rate of 0.997 (99.7%).The reason for these (very few) failures is that these three smartcontracts do not match the most common EVM bytecode patterns generated by the Solidity compiler.Indeed, one of them was written in Vyper rather than Solidity, another one was empty (with length of 0 bytes) and the last one presented an unusual begin section, that the tool failed to parse.The compiler version of Solidity is correctly identified by EtherSolve for all the analyzed contracts.
We have also performed a naive performance assessment of EtherSolve, w.r.t.analysis time.We measured the time  spent by EtherSolve to parse the EVM bytecode, generate the basic blocks, solve orphan jumps and decorate the CFG.The average time is about 3 seconds per smart-contract.For 930 smart-contracts, the analysis took less than 1 second each, while 7 smart-contracts required more than 1 minute of computation (with a maximum of 10 minutes), due to their big dimension and the consequent large number of edges.

Answer to 1
The Succes Rate of EtherSolve when analyzing real-world smart-contracts compiled with different versions of the Solidity compiler is very high, since it was able to analyze without errors the 99.7% of the samples in the considered smartcontracts dataset.

2 : Functions Entry-point Identification
We evaluate the accuracy of EtherSolve in detecting functions entry-point by means of well-known performance metrics such as Precision, Recall and F-measure [42].These metrics are computed comparing the functions recognized in the EVM bytecode by EtherSolve w.r.
On a practical point of view, we have computed the metrics as follows.Once the functions entry-point and hashes has been identified by EtherSolve, we have downloaded the smart-contracts ABI (this can be done, because the smartcontracts in the dataset are open-source).We then proceeded with the comparison, computing Precision and Recall for each contract.When EtherSolve retrieves no function, or the ABI contains no function, we consider Precision and Recall to be 0.This analysis, however, included only 996 contracts due to internal errors of the tool during the function identification phase.
In the left of Fig. 7 we report the distribution of Precision w.r.t.our dataset, where we aggregate the contracts with the same value for Precision.The average Precision is 98.5%, and for 948 contracts the precision is 100%, meaning that the EtherSolve is able to identify all the functions for the majority of the contracts.In fact, 30 contracts have a value below 90% and only 6 contracts have a score of 0%.
In the middle of Fig. 7 we report the distribution of Recall w.r.t.our dataset, where we aggregate the contracts with the same value for Recall.The Recall is slightly lower than Precision, with an average score of 98.1% and with 811 contracts scoring 100%.This is due to the fact that the fallback detector is not perfect and sometimes fails.Nevertheless, only 32 contracts have a score below 90% and only 6 have a score of 0%.
We should note that the smart-contracts considered in the experiments have 19.2 functions on average, meaning that even a single miss in the detection would have a significant impact on Precision and Recall.Finally, in the right of Fig. 7 we compute F-measure, as a summary indicator.Overall, the results of EtherSolve are very good, with an average F-measure of 98.2% and with 803 contracts scoring 100%.Indeed, 31 contracts have a score below 90% and only 6 have a score of 0%.

Answer to 2
The accuracy of EtherSolve in identifying functions entrypoint is very high, since it is able to correctly find the majority of functions declared in the smart-contracts dataset, with a very high number of true positives and very few false negatives.Indeed, EtherSolve obtained very good results for the adopted performance indicators, such as Precision (P = 0.985 on average), Recall (R = 0.981 on average) and Fmeasure (F 1 = 0.982 on average).

3 : Precision of the CFG Reconstruction
To asses the precision of the CFG generated by Ether-Solve, we compare our tool with the state-of-the-art analysis tools.We have selected for the comparison the tools respecting the following constraints: (i) perform a static analysis at the EVM bytecode level (no information from source code or ABI file); (ii) emit the CFG as output, or an alternative representation easily to compare with a CFG.These criteria led us to consider the following tools.
EthIR EthIR extends the Oyente framework and performs a high-level analysis of EVM bytecode.Oyente builds a CFG, to detect different kinds of vulnerabilities [3,36].Due to the facts that Oyente is very old (and discontinued) and that EthIR is built on top of Oyente, we consider EthIR only for the CFGs comparison.Indeed, since EthIR is an improved version of Oyente, we can assume that the CFGs retrieved by EthIR are at least precise as those retrieved by Oyente.
Octopus Octopus is an analysis framework for EVM bytecode.It produces a CFG to support reverse engineering and understand the internal behavior of smartcontracts [52].
Mythril Mythril is a security analysis tool for EVM bytecode that detects security problems in smart-contracts.It does not build a CFG, but rather it generates a trace tree given by symbolic execution and Satisfiability Modulo Theories (SMT) solving [40,14].
Vandal Vandal is a static analysis framework for smartcontracts that decompiles the EVM bytecode to an intermediate representation that includes the code control flow [8,51].
Gigahorse Gigahorse is a decompiler that transforms EVM bytecode into a high-level 3-address code representation.The tool does not require the Solidity source code [32,33,34].
We discarded other analysis tools for the following reasons: Securify [50], Ethersplay [18], Manticore [7] and Slither [20] because they analyze Solidity source code instead of EVM bytecode; evm_cfg_builder [19] because we did not find an easy way to make it emit the CFG; Jeb [46] and MythX [15] because they are paid tools; Porosity because it requires the ABI as input and it is discontinued; Panoramix [30] because it decompiles the code without building a CFG and it is discontinued.Finally, we did not consider the official Solidity compiler solc, that optionally outputs a CFG, since it exploits information contained in the source code in order to reconstruct the CFG, while EtherSolve is purely based on the bytecode.When a smartcontract is compiled to bytecode, some information useful to reconstruct the CFG is irremediably lost.Hence, it would be unfair to compare the CFGs obtained by solc with those computed by EtherSolve.
To confront EtherSolve with the state-of-the-art EVM bytecode static analysis tools we first compared the approaches in terms of Success Rate.We ran the tools on D R A F T Enhancing Ethereum Smart-Contracts Static Analysis by Computing a Precise CFG of EVM Bytecode the smart-contracts taken from the Etherscan dataset and counted the successful executions (without crash) and the non-empty CFGs given in output.We also set a reasonable timeout of 10 minutes per contract, for all tools.
As shown in Fig. 6(right) and Table 2, EtherSolve and Mythril were able to analyze almost all the smart-contracts (997 out of 1,000), immediately followed by Vandal with 978 smart-contracts and by Gigahorse with 951 smart-contracts.Octopus and EthIR, instead, reached an error state on many smart-contracts, computing a CFG only for 504 and 212 smart-contracts, respectively.Furthermore, we inspected the CFGs emitted by the tools, starting with an automatic numerical comparison of nodes and edges.Then, we proceeded with a manual inspection of anomalous cases.
For each smart-contract in the dataset we adopt a common representation for the outputs of the different tools.The chosen representation is a JSON file that contains a list of nodes (which represent the basic blocks), identified by their offset, and a list of edges, identified by a pair of offsets.For each smart-contract we count the number of nodes, the number of edges and the differences in the numbers of nodes and edges among the CFGs generated by different tools.To understand if there are portions of the EVM bytecode which are actually data (static data, child contracts or compiler metadata) but that the candidate tool wrongly interprets as code, we calculate the number of nodes in the CFG generated by a candidate tool that have a higher offset than the highest offset node obtained by EtherSolve.
While success rate represents a quantitative result to compare tools, the other columns will only be used for qualitative comments, rather than direct comparison.In fact, the average number of nodes and edges depend on which contracts could be successfully analyzed.For instance, a tool able to analyze only large contracts would have an average value inflated.
It is worth observing that the different tools cannot be compared w.r.t.analysis time, since the candidate tools have been executed in a Docker [24] container (for compatibility reasons), instantiating a new process for each smart-contract.Whereas, EtherSolve has been executed directly with a batch of 1,000 contracts.
In the manual analysis we focused on the smart-contracts for which the automatic analysis reported uncommon or anomalous results.These are relatively few contracts with sensitive differences in the number of nodes/edges between the CFG extracted by EtherSolve and by the compared tool.To support the manual analysis we implemented a script that generates a diff graph, where the two CFGs are combined and the nodes/edges that are present only in the first or in the second graph are highlighted in a different color.
Table 2 contains a summary of the automatic analysis, with the smart-contracts successfully analyzed, the average number of both nodes and edges and the average number of basic blocks in the static data segment.The results of both automatic and manual analysis are discussed in the following paragraphs, one for each tool.

EthIR
In most cases, EthIR finds more blocks than EtherSolve, that have a high offset.
Considerations: Nodes that are found by EthIR only have a very high offset, so they are probably static data or metadata interpreted as code (Table 2).Instead, when nodes match then edges match as well.
Manual analysis: Because of its very low Success Rate (21%), we deemed not so interesting to continue with a manual analysis of this tool results.

Octopus
Similarly to EthIR, also Octopus finds more blocks with a high offset than EtherSolve.
Considerations: Similarly to the previous tool, also for Octopus we speculate that the additional nodes found are probably data or metadata and not code (Table 2).In the majority of cases, however, Octopus finds very few edges, sometimes even 0 edges.
Manual analysis: During the manual analysis, we discovered that Octopus misses some patterns for the metadata separation, so metadata are parsed as if it were code.Moreover, in many cases, Octopus does not detect edges that should be present according to source code.
As an example, in Fig. 8 we have an excerpt of the diff graph for the smart-contract ZipmexTokenP (address: 0xaa602de53347579f86b996d2add74bb6f79462b2).In this case, Octopus exhibits a lack of edges, found by EtherSolve, that should be present according to the Solidity source code.They are often return-edges, i.e., edges that go from the last block of a function to the block that follows the call.In the EVM, these types of edges originate from orphan jumps, meaning that Octopus cannot always resolve them.
In some other contracts, Octopus evaluates the bytecode as creation code, thus it analyzes only the second part, starting with 60806040 (the most common begin sequence of a contract bytecode).However, this part of the code is a child contract and not the main one, so the computed graphs are completely different.

Mythril
This tool does not extract a CFG, but a trace tree from dynamic/symbolic analysis, in order to detect vulnerabilities.The output of this execution trace is not directly comparable with the CFG computed by EtherSolve.Nonetheless, an indirect comparison can be performed by checking if the EtherSolve CFG misses nodes/edges that are found by the Mythril dynamic analysis.This case would correspond to imprecision in the CFG elaborated by EtherSolve.
Manual analysis: In the 4% of the contracts, Mythril finds a bunch of edges which are not detected by EtherSolve.In some cases, Mythril adds some artificial basic blocks containing "0: STOP" which are placeholders that do not come from the analyzed code, but they indicate the end of the execution trace.Sometimes, basic blocks are not split when there is a JUMPDEST in the middle, so there is a little discrepancy in the edges, but the CFG is definitely compatible.In some other cases, Mythril finds new basic blocks and edges that are not part of the main contract, but that are opcodes of a child contract created by the main one.The child contract code is computed at runtime, so the EtherSolve static analysis simply considers those bytes as part of the static data segment.

Vandal
The CFGs generated by Vandal have always one node more than EtherSolve, which is the ending node with the INVALID opcode.In the 36% of the analyzed contracts the edges match, but in the remaining cases there are differences that we analyzed manually.In some contracts the basic blocks do not match because Vandal does not support two opcodes that have been added in the most recent versions of the EVM [47].Indeed, the SELFBALANCE opcode is considered as invalid by Vandal, obtaining a basic block break with no outgoing edges.In many other cases, Vandal detects a huge amount of edges.
Considerations: Probably, when Vandal is not able to correctly compute the destination address of a jump, it conservatively assumes all basic blocks as possible successors.
Manual analysis: This hypothesis is supported by the diff graph, which shows too many outgoing edges for basic blocks that occur at the end of functions, probably because the return address could not be computed accurately.Indeed, as we can see in Fig. 9, where we have an excerpt of the diff graph for the smart-contract BCN20 (address: 0x1964f2f3ce45ac518b18ef4aa4265f8aadcef4ae), Vandal finds too many outgoing edges from basic blocks which are the end of functions (e.g., 770, 468 and 1402).
However, the number of call sites (that should correspond to the number of return edges) is much smaller, according to the Solidity source code of these contracts.All in all, as shown in Table 2, the average number of edges found by Vandal is significantly higher than EtherSolve, whereas the basic blocks are reconstructed in a similar way.

Gigahorse
Gigahorse tries to identify the private functions inside the code, and the computed CFG reflects this objective.Because of this strategy, the conversion into the intermediate representation is tricky.Often, Gigahorse CFGs contain artificial blocks with the special "CALLPRIVATE" statement, introduced by Gigahorse to mark private function calls.Nevertheless, the data collected by the automatic comparison for Gigahorse and EtherSolve is very similar, so we proceeded with the manual inspection.
Manual analysis: For some contracts, EtherSolve computes a set of basic blocks that are unreachable from the contract entry-point, that might represent dead code.However, these unreachable blocks are not present in the Gigahorse CFGs.A dual case happens on some other contracts, where EtherSolve identifies a set of blocks that are not reachable from the contract entry-point whereas, according to the Gigahorse CFG, these blocks are reachable.
Considerations: Our speculation is that such basic blocks (which have a high offset) belong to a child contract or to an internal library, not a proper part of the main contract (e.g., called via STATICCALL), and thus they are skipped by EtherSolve (that only analyzes intra-contract calls).

Answer to 3
The precision of the CFG reconstruction performed by Ether-Solve is very high, since the CFGs reconstructed by Ether-Solve have a comparable, and sometimes higher, precision w.r.t. the CFGs reconstructed by the state-of-the-art EVM bytecode static analysis tools.Furthermore, EtherSolve is the approach with the highest Success Rate (99.7%), on par with Mythrill, among the state-of-the-art analysis tools.

4 : Reentrancy Vulnerabilities Detection
In order to validate the efficacy of the vulnerability detector built on top of EtherSolve, we compared it with specialized tools aiming at discovery Reentrancy vulnerabilities.The comparison is performed on the SolidiFI dataset, that consists in 50 Ethereum smart-contracts whose source code has been injected, using the tool SolidiFI [31], with Reentrancy vulnerabilities.In total, 1343 bugs have been injected, coming from 42 code snippets.This benchmark has been used by Ghaleb and Pattabiraman [31]  comparison was performed at source code level, EtherSolve targets the EVM bytecode, so these injected contracts have been compiled before applying our analysis.Even if the original dataset consisted of 50 files, each source file could contain more than one contract and injected vulnerabilities could multiply in the compiled contracts, because of the use of inheritance that caused vulnerable code to be cloned from abstract contracts to concrete ones.Additionally, abstract contracts do not produce bytecode as they are not executable.Hence, to simplify the analysis, we opted to analyze only one compiled contract per source file (the largest compiled contract), assuming that the remaining contracts were only supporting libraries or abstract contracts, whose CFGs were disconnected from the main one.
Another problematic aspect, acknowledged by Ghaleb and Pattabiraman, is that the dataset already contained vulnerabilities before the SolidiFI injection, but they were not documented.To gather comparable results, we considered only vulnerabilities added by the SolidiFI injection, by running EtherSolve before and after injection, and by keeping only those new vulnerabilities that are detected by the second analysis and not by the first one.We have taken an analogous approach for the other tools: we ran each tool on the original dataset (without injections) and on the injected contracts, then we have considered as Reentrancy bugs those present in the injected contracts but not in the original contracts.
Concerning tools selection for the comparison, we have considered: Mythril [40], Securify [50], Slither [20], Vandal [8] and SmartCheck [49].They are the state-of-theart static analysis tools that are able to detect Reentrancy vulnerabilities.We had to exclude from the comparison Oyente [39] and Manticore [7] since we did not manage to make the tools working (they are very old and discontinued).Recall that, the considered approaches inspect the Solidity source code of smart-contracts, not the EVM bytecode, as happens for EtherSolve.Fig. 10 shows the results of the comparison, subdivided in six plots, one for each tool under comparison.In each plot we have the number of bugs injected per contract (numbered from 1 to 50) and the number of bugs found by a analysis tool per contract.In other words, for each tool we plot the distribution of bugs found in the smart-contracts, compared with the baseline distribution of injected bugs.The efficacy of a tool increases as the distance between its plot and the baseline distribution decreases.Furthermore, the points above the baseline distribution indicate potential false positives (since the tool find more bugs than the ones that have been injected), while the points below the baseline distribution indicate potential false negatives (since the tool find less bugs than the ones that have been injected).
The results highlight that EtherSolve is the second-best analysis tool, immediately after Slither.Indeed, the plot for EtherSolve follows very closely the baseline distribution, except for few contracts where EtherSolve misses some bugs (i.e., we have false negatives).Slither, instead, performs very well since its plot coincide exactly with the baseline distribution, meaning that Slither do not have false positives nor false negatives.Conversely, SmartCheck have the worst plot, indeed it is not able to find any of the injected bugs (this has been highlighted also by the experimental evaluation of Ghaleb and Pattabiraman [31]).
Even if EtherSolve exhibits slightly lower results w.r.t.Slither, we have to recall that it is the only tool using EVM bytecode only, thus having less information.Inspecting the smart-contracts source code is of great help in improving bug detection results.So, EtherSolve results are even more valuable: it is able to detect Reentrancy vulnerabilities even in closed-source smart-contracts, with very high efficacy.Its results are comparable, and very often superior, to analysis tools that can also exploit smart-contracts source code.

Answer to 4
The efficacy of the Reentrancy vulnerability detector built on top of EtherSolve is very high when compared with state-ofthe-art tools, because it is the second-best, in terms of detected Reentrancy bugs, among the state-of-the-art analysis tool, even if it is the only one that does not exploit smartcontracts source code.

5 : Tx.origin Vulnerabilities Detection
Similarly to the previous section, in order to validate the efficacy of the vulnerability detector built on top of EtherSolve we compared it with specialized tools aiming at discovery Tx.origin vulnerabilities, again using the SolidiFI dataset.For this research question, the 50 Ethereum smartcontracts has been injected with Tx.origin vulnerabilities.In total, 1336 bugs have been injected, coming from 40 code snippets.Also in this case, we had to compile the injected contracts before applying the analysis, and we used one compiled contract per source file (the largest one).Furthermore, we considered only vulnerabilities added by the SolidiFI injection, by running EtherSolve before and after injection, and by keeping only those new vulnerabilities that are detected by the second analysis and not by the first one.The same applies for the other tools.Concerning tools selection for the comparison, we have considered: Mythril [40], Slither [20], SmartCheck [49] and Vandal [8].They are the state-of-the-art static analysis tools that are able to detect Tx.origin vulnerabilities.Note that, tools such as Oyente [39], Manticore [7] and Securify [50] are not designed to find Tx.origin vulnerabilities.Also in this case, the considered approaches inspect the Solidity source code of the smart-contracts, not the EVM bytecode, as happens for EtherSolve.Fig. 11 shows the results of the comparison, subdivided in five plots, one for each tool under comparison.Analogously to the previous section, for each tool we plot the distribution of bugs found in the smart-contracts, compared with the baseline distribution of injected bugs.The efficacy of a tool increases as the distance between its plot and the baseline distribution decreases.Again, points above the baseline distribution are considered false positives, while points below the baseline distribution are considered false negatives.

D R A F T
Also in this case, the results highlight that EtherSolve is the second-best analysis tool, again immediately after Slither.Indeed, the plot for EtherSolve follows very closely the baseline distribution, except for few contracts where EtherSolve misses some bugs (i.e., we have false negatives).Slither, instead, performs very well, since its plot coincide exactly with the baseline distribution, meaning that Slither do not have false positives nor false negatives.The worst results are obtained by SmartCheck, that is not able to detect the majority of the injected bugs.Vandal results are quite good in general, except for some contracts for which it signals either lots of false positives or lots of false negatives.
Even if EtherSolve exhibits slightly lower results w.r.t.Slither, we recall that it is the only tool using EVM bytecode only.So, EtherSolve results are even more valuable: it is able to detect Tx.origin vulnerabilities even in closed-source smart-contracts, with very high efficacy.

Answer to 5
The efficacy of the Tx.origin vulnerability detector built on top of EtherSolve is very high, since the number of false positives and false negatives is very low.Indeed, it is the second-best, in terms of detected Tx.origin bugs, among the state-of-the-art analysis tool, even if it is the only one that does not exploit smart-contracts source code.

Discussion
The results obtained in the experimental validation suggest that EtherSolve reconstructs very precise CFGs: it is able to work on a wide range of Solidity versions and in almost all cases it computes an exhaustive graph.The key point of our approach is the simplicity of the symbolic stack execution, which is limited to only a tiny set of opcodes, but capable of resolving the destination address of orphan jumps.However, there are particular cases of very complex smart-contracts with peculiar structures for which Ether-Solve is not able to identify certain edges.For instance, the new try-catch feature introduced in the Solidity 0.6.xversions, is translated in a particular bytecode structure.EtherSolve is able to deal with such bytecode, but its particular structure deceives the tool, inducing it to take wrong execution paths, causing little internal exceptions.Among the compared tools, only Gigahorse showed a CFG precision similar to EtherSolve.However, they seems to be complementary, because each one could precisely represent cases that the other one could not.
The results of the vulnerability detectors suggest that EtherSolve is a powerful tool, and that can be easily extended to support accurate subsequent static analyses based on a precise CFG.Indeed, the experimental evaluation highlights that EtherSolve analysis results, in terms of false positives and false negatives, are second only to Slither results, for both Reentrancy and Tx.origin vulnerabilities.This is a very valuable, since Slither works at source code level while EtherSolve works at bytecode level, hence with way less information about the smart-contracts under analysis.
EtherSolve is affected by some limitations.One of them is in the way we analyze loops.In fact, to prevent the analysis to iterate forever, we pose a limit to the number of iterations by enforcing a maximum size to the symbolic stack that we propagate.This might in principle cause the final result to be imprecise.However, in our experience, in those cases when the size limit is reached, a larger limit would not help in delivering a better result, so a larger limit would not improve the precision of the tool.A possible improvement that may mitigate the issue consists in developing an analysis to detect when a loop recursively increases the symbolic stack size, and cut-off such paths.

Related Work
Comparison with the conference paper.The present paper is a deeply revised and extended version of the companion conference paper [16].In particular, we better explained the problem (Section I of [16]) and we better described the background material (Section II of [16]).We also added the description of the Tx.origin vulnerability (with examples) in the Background.Furthermore, we improved the overall presentation of the proposed static analysis that reconstructs the smart-contracts CFG (Section III of [16]).We revised the experimental validation (Section IV of [16]), performing a more detailed assessment of the Reentrancy detector built on top of EtherSolve, comparing its Reentrancy detection efficacy with the state-of-the-art source code level analysis tools.As novel contributions, we extended EtherSolve with a mechanism to identify functions entry-point and a specific analysis detecting Tx.origin vulnerabilities.Consequently, we extended the experimental validation, adding an assessment of the accuracy of EtherSolve in identifying functions entry-point and a comparison of Tx.origin detection efficacy with the state-of-the-art source code level analysis tools.

Related approaches.
In the last years, many tools have been developed in order to analyze Ethereum smart-contracts, with different approaches and objectives.A quite recent and detailed survey of them has been written by Praitheeshan et al. [43].Some tools analyze directly the EVM bytecode, often trying to build a CFG.Among these tools we can find Oyente [39], EthIR [3] and Octopus [52].Their approaches

Fig. 4 .
Fig. 4. Example of CFG with complex dispatcher (outgoing edges to not reported nodes omitted).

D
R A F T Enhancing Ethereum Smart-Contracts Static Analysis by Computing a Precise CFG of EVM Bytecode
t. the functions defined in the original Solidity source code, that are listed in the ABI of each smart-contract.The smartcontracts are taken from the Etherscan dataset.Given the set  EtherSolve of function hashes identified by EtherSolve and the set  ABI of hashes of the functions declared in the smart-contract ABI, we define: the true positives = ℎ | ℎ ∈  EtherSolve ∧ ℎ ∈  ABI , namely the functions declared in the ABI that EtherSolve is able to correctly identify; the false positives = ℎ | ℎ ∈  EtherSolve ∧ ℎ ∉  ABI , namely the functions incorrectly detected by EtherSolve, since not present in the ABI; and the false negatives = ℎ | ℎ ∉  EtherSolve ∧ ℎ ∈  ABI , namely the functions incorrectly not detected by EtherSolve, since present in the ABI.Performance metrics are computed based on the correlation between true positives and false positives/negatives.

DFig. 8 .
Fig. 8.An excerpt of the diff graph for the ZipmexTokenP smart-contract used for manual inspection.In red we highlight nodes/edges found by EtherSolve but not by Octopus; in blue we highlight nodes/edges found by Octopus but not by EtherSolve; in black we highlight nodes/edges found by both tools.Basic blocks are identified by the offset of their first instruction.

Fig. 9 .
Fig. 9.An excerpt of the diff graph for the BCN20 smart-contract used for manual inspection.In red we highlight nodes/edges found by EtherSolve but not by Vandal; in blue we highlight nodes/edges found by Vandal but not by EtherSolve; in black we highlight nodes/edges found by both tools.Basic blocks are identified by the offset of their first instruction.

Fig. 10 .
Fig. 10.Reentrancy analysis results for the different tools (smart-contracts on the x-coordinate).

DFig. 11 .
Fig. 11.Tx.origin analysis results for the different tools (smart-contracts on the x-coordinate).
Enhancing Ethereum Smart-Contracts Static Analysis by Computing a Precise CFG of EVM Bytecode § ¤

Table 1
Enhancing Ethereum Smart-Contracts Static Analysis by Computing a Precise CFG of EVM Bytecode Look-up table for the executeOpcode function of Algorithm 1.
Enhancing Ethereum Smart-Contracts Static Analysis by Computing a Precise CFG of EVM Bytecode Enhancing Ethereum Smart-Contracts Static Analysis by Computing a Precise CFG of EVM Bytecode 1 function sendTo ( address payable dest , uint amount ) public { 2 require ( tx .origin == owner ) ; SSTORE opcode can be reached.In case a SSTORE is reachable the detector signals a reentrancy vulnerability.
if a

Table 2
Comparison with state-of-the-art tools.
Enhancing Ethereum Smart-Contracts Static Analysis by Computing a Precise CFG of EVM Bytecode