Table of Links
2 Background and 2.1 Blockchain
4 Computing Transaction Processing Times
5 Data Collection and 5.1 Data Sources
6 Results
6.1 RQ1: How long does it take to process a transaction in Ethereum?
7 Can a simpler model be derived? A post-hoc study
11 Conclusion, Disclaimer, and References
A. COMPUTING TRANSACTION PROCESSING TIMES
B. RQ1: GAS PRICE DISTRIBUTION FOR EACH GAS PRICE CATEGORY
B.1 Sensitivity Analysis on Block Lookback
C. RQ2: SUMMARY OF ACCURACY STATISTICS FOR THE PREDICTION MODELS
D. POST-HOC STUDY: SUMMARY OF ACCURACY STATISTICS FOR THE PREDICTION MODELS
A. COMPUTING TRANSACTION PROCESSING TIMES
Computing the processing time of a given transaction depends on obtaining the pending timestamp and the processed timestamp (Figure 1). However, obtaining accurate values for these timestamps is considerably challenging.
In the following, we describe the approaches that we employed to obtain and evaluate the accuracy of the pending timestamp (Section A.1) and the processed timestamp (Section A.2).
A.1 Pending timestamp
Despite the ledger nature of a blockchain, Ethereum does not record any data regarding the timestamp at which a transaction was first seen in the network (i.e., pending timestamp). Discovering such a timestamp is challenging. First, as described in Section 2.2, each miner node has its own pending pool (i.e., there is no unified, centralized pending pool). Second, the pending pool of a given miner is rarely exposed to the outside world. Third, even if we were to set up our own nodes in the network, there are only so many nodes that we would be able to deploy. Given the peer-to-peer architecture of the blockchain network and the associated broadcasting of transactions, our nodes would likely take a long time to become aware of pending transactions and thus our obtained pending timestamps would not be accurate. As a reference, Ethermine[26], one of the largest Ethereum mining pools, has more than 300k nodes distributed across the globe as of October 2020.
In Section A.1.1, we describe how we overcome the aforementioned challenges to obtain the pending timestamp of transactions. In Section A.1.2, we evaluate the accuracy of our obtained timestamps.
A.1.1 Obtaining the pending timestamp. We obtain an approximation of the pending timestamp of a transaction. More specifically, we rely on Etherscan and equate the pending timestamp of a transaction t with the instant at which an Etherscan node first sees t (instead of the instant at which t is first seen in the network). Our approach is summarized in Figure 15. The figure highlights the various statuses that a transaction t undergoes until we can retrieve its pending timestamp. In the following, we describe each step.
(Status 1) Pre-submitted. First, the transaction issuer builds the transaction t using an Ethereum client tool. We note that the Ethereum client is connected to the transaction issuer’s node in the Ethereum peer-to-peer network.
(Status 2) Submitted. The transaction is submitted by the Ethereum client.
(Status 3) Broadcasted. The issuer’s node broadcasts the transaction to its neighbour nodes, which in turn broadcast the transaction to their neighbour nodes and so on.
(Status 4) Seen by Etherscan. The transaction eventually reaches some node belonging to Etherscan. This is the instant at which Etherscan becomes aware of the transaction t. From the perspective of Etherscan, such an instant corresponds to the pending timestamp.
(Status 5) Shown in Pending Txs. Page. Some Etherscan node (likely the one that first saw the pending transaction t) communicates with Etherscan’s Pending Transactions webpage[27] to signal the existence of t. The pending transaction t is then added to the list of pending transactions in the Pending Transactions webpage.
(Status 6) Detected by our Monitor. We built a monitor that watches the Pending Transactions webpage for updates. Our monitor detects t in the table of pending transactions.
(Status 7) Recorded by our Monitor. Our monitor accesses the Transaction Details webpage associated with the pending transaction t. This webpage includes a field called “Time Last Seen”, which contains two pieces of information: (a) a live “stopwatch” that increases second by second and (b) the pending timestamp in parentheses. We record the pending timestamp. In Figure 16, we display the transaction details page of a real-world transaction t.
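Statuses 6 and 7 amount to a polling pass that compares the current list of pending transactions against the hashes already recorded. The following is a minimal sketch in Python, not the monitor's actual code. The callables `fetch_pending_hashes` and `fetch_pending_timestamp` are hypothetical stand-ins for reading the Pending Transactions and Transaction Details webpages; they are not Etherscan APIs, and timestamps are assumed to be expressed in seconds.

```python
from typing import Callable, Dict, Iterable, Set


def poll_once(
    seen: Set[str],
    fetch_pending_hashes: Callable[[], Iterable[str]],
    fetch_pending_timestamp: Callable[[str], float],
    records: Dict[str, float],
) -> None:
    """One monitoring pass over the list of pending transactions.

    Both fetch_* callables are hypothetical placeholders for page reads.
    """
    for tx_hash in fetch_pending_hashes():
        if tx_hash in seen:
            continue  # already recorded in an earlier pass
        seen.add(tx_hash)  # Status 6: transaction detected
        records[tx_hash] = fetch_pending_timestamp(tx_hash)  # Status 7: recorded
```

Calling `poll_once` repeatedly with the same `seen` set and `records` dictionary records each transaction only once, at the first pass in which it appears.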
When transactions take a long time to be processed (days in our practical experience), Etherscan changes the transaction details page. More specifically, a new field called “Time First Seen” is added and the “Time Last Seen” field is updated (Figure 17). Although Etherscan does not publish any documentation describing these fields in detail, we manually observed that (i) the “Time First Seen” is a fixed timestamp, (ii) such a timestamp is older than that of the “Time Last Seen” (the portion in parentheses), and (iii) the “Time Last Seen” stopwatch resets to zero and restarts. The new meaning of the “Time Last Seen” field is not clear to us. Since the “Time First Seen” field contains an older timestamp, we use that as the pending timestamp when such a field is shown.
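The selection rule above can be sketched as a small Python function. The function name and its parameters are illustrative assumptions, and both fields are assumed to have already been parsed into `datetime` values; the raw text format of the fields is not modeled here.

```python
from datetime import datetime
from typing import Optional


def select_pending_timestamp(
    time_last_seen: datetime,
    time_first_seen: Optional[datetime] = None,
) -> datetime:
    """Pick the pending timestamp from the two fields of the details page.

    "Time First Seen" only appears for long-pending transactions; when it
    is present it holds the older timestamp, so it takes precedence.
    """
    if time_first_seen is not None:
        return time_first_seen
    return time_last_seen
```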
A.1.2 Evaluating the accuracy of the collected pending timestamps. We designed an experiment to evaluate the accuracy of our collected pending timestamps. In a nutshell, we send transactions to Ethereum, record the timestamp at which they were sent (submitted timestamp), record their pending timestamp, and compare these two timestamps for every transaction. Our goal is to determine the delta between the two timestamps and to determine whether (i) the delta is large and (ii) the delta changes much from transaction to transaction.
The detailed experiment design is as follows. We set up a program to submit transactions. Since transaction processing times vary based on gas price, we decided to submit transactions using various prices. More specifically, our program sends five transactions every hour, one in each of the following gas price categories: very cheap, cheap, regular, expensive, and very expensive. As explained in Section 6.1, the ranges of gas prices for these categories are determined dynamically using a quintile approach over the gas prices of transactions residing in the 120 most recent mined blocks. We began the experiment on November 20, 2019 and executed it for 40 hours straight. Hence, a total of 200 transactions were sent (40 in each gas price category).
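As an illustration of a quintile-based categorization, the Python sketch below derives four cut points from a list of recent gas prices and maps a price to one of the five categories. It is a sketch under stated assumptions, not the procedure of Section 6.1: the function names are hypothetical, the quantile method is the default of `statistics.quantiles`, and the treatment of prices that fall exactly on a cut point is our choice.

```python
import bisect
import statistics
from typing import List, Sequence

CATEGORIES = ["very cheap", "cheap", "regular", "expensive", "very expensive"]


def quintile_cut_points(recent_gas_prices: Sequence[float]) -> List[float]:
    """Return the four cut points that split the observed prices into quintiles.

    recent_gas_prices is assumed to hold the gas prices of the transactions
    in the lookback window (e.g., the 120 most recent mined blocks).
    """
    return statistics.quantiles(recent_gas_prices, n=5)


def categorize(gas_price: float, cut_points: Sequence[float]) -> str:
    """Map a gas price to one of the five categories given four cut points."""
    # bisect_left counts the cut points strictly below gas_price (0 to 4).
    return CATEGORIES[bisect.bisect_left(cut_points, gas_price)]
```

Recomputing the cut points before each submission keeps the categories aligned with current network conditions, which is what makes the ranges dynamic.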
Right before sending a transaction t, the program saves the submitted timestamp. Next, the pending timestamp for that transaction is obtained using the approach described in Figure 15. Once the pending timestamp is obtained and recorded, the program proceeds to send the next transaction in the next gas price category. Finally, we compare the submitted timestamp with the pending timestamp of each transaction.
Results. Etherscan becomes aware of pending transactions in 1 to 2 seconds in 79.5% of the cases. The results that we obtained are depicted in Figure 18. Analysis of the figure reveals that the lag between the submitted timestamp and the pending timestamp is small (3 seconds at most) and stable (in the range of 1 to 2 seconds in 79.5% of the cases). Indeed, since Etherscan is a real-time dashboard of Ethereum, Etherscan needs to have “many” nodes in the network in order to quickly and accurately capture the state of the network. Consequently, we conclude that the pending timestamps that we collected using the approach described in Section A.1.1 are a good approximation of the original, unknown pending timestamps of transactions.
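The comparison behind these figures reduces to computing, for each transaction, the difference between its pending timestamp and its submitted timestamp, and then summarizing the differences. A minimal Python sketch follows. The function name and the returned keys are hypothetical, and timestamps are assumed to be expressed in seconds.

```python
from typing import Dict, Sequence


def summarize_lags(sent: Sequence[float], pending: Sequence[float]) -> Dict[str, float]:
    """Summarize pending-minus-submitted lags (in seconds) over paired timestamps."""
    if len(sent) != len(pending) or not sent:
        raise ValueError("need two non-empty sequences of equal length")
    lags = [p - s for s, p in zip(sent, pending)]
    # Count the transactions whose lag falls in the closed range [1, 2] seconds.
    within = sum(1 for lag in lags if 1 <= lag <= 2)
    return {
        "min_lag": min(lags),
        "max_lag": max(lags),
        "share_1_to_2s": within / len(lags),
    }
```

For example, with made-up inputs `sent=[0, 10, 20, 30]` and `pending=[1, 12, 23, 30]` (not data from our experiment), the lags are 1, 2, 3, and 0 seconds, so half of the transactions fall in the 1 to 2 second range.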
Authors:
(1) MICHAEL PACHECO, Software Analysis and Intelligence Lab (SAIL) at Queen’s University, Canada;
(2) GUSTAVO A. OLIVA, Software Analysis and Intelligence Lab (SAIL) at Queen’s University, Canada;
(3) GOPI KRISHNAN RAJBAHADUR, Centre for Software Excellence at Huawei, Canada;
(4) AHMED E. HASSAN, Software Analysis and Intelligence Lab (SAIL) at Queen’s University, Canada.
[26] https://ethermine.org
[27] The URL of the Pending Transactions webpage is https://etherscan.io/txsPending. There is no public information regarding how Etherscan’s nodes in the network and Etherscan’s Pending Transactions page communicate with each other (Etherscan is not open source). We conjecture that a publish-subscribe model is implemented.