Run, boy, run! They’re trying to catch you.
In the previous episode of the Steem Pressure series, we checked how much time we needed to get a fully functional steem consensus node from scratch.
We also took a closer look at the steemd version v0.19.2 resync and replay performance for 20’000’000 blocks.
It was back in the pre-appbase era with the HardFork 19 rules.
Time flies, and so do blocks.
Recently, our chain has grown beyond 30’000’000 blocks, and this offers a good opportunity to look at the performance once again, especially as we have also seen many important changes in the code.

Time needed for replay
We won’t be measuring --resync performance this time, because we want to avoid the bias introduced by external factors such as network latency and the performance of third-party nodes.
We will focus on the --replay part, using a block_log file, which we can either get from a previous --resync operation or download from another instance or a public source.
If you want to find out more about the difference between --resync and --replay, feel free to read the previous episode.
Test setup
We have exactly the same machine as in the previous episodes:
An entry-level dedicated machine with Intel(R) Xeon(R) CPU E3-1245 V2 @ 3.40GHz on an Ivy Bridge with 32GB DDR3 1333MHz RAM and 3x 120GB SSD.
That’s not the configuration I use for my witness nodes, because as a consensus witness that takes part in every round, I’m not going to risk losing blocks sporadically due to high latency.
However, that is a perfect test configuration to check the performance of steemd for the purposes of public seed nodes, backup witnesses, state providers, private RPC endpoints for wallets, or even exchanges, as the account_history plugin with just one or two tracked accounts has a negligible impact on performance.
But why?
If we have all the blocks downloaded to our disk, we have all the data about all transactions that were performed on the Steem blockchain. But at this point we still can’t say how much Steem Power Alice has or how many upvotes Bob’s most recent post about cute kittens received.
We can’t until we check (replay) every block since The Genesis.
That’s the only way to validate the current state of Steem.
Every three seconds, one of the witnesses signs a new block (head block) with new transactions, so the chain keeps growing and a full replay takes longer and longer.
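To get a feeling for how fast the chain grows, here is a minimal sketch of the arithmetic behind the three-second block interval mentioned above. The function names are made up for illustration, and the sketch assumes that no block is ever missed.

```python
# Assumption: one block every 3 seconds, with no missed blocks.
BLOCK_INTERVAL_SECONDS = 3
SECONDS_PER_DAY = 24 * 60 * 60


def blocks_per_day(block_interval_seconds: int = BLOCK_INTERVAL_SECONDS) -> int:
    """Number of blocks produced in one day at the given block interval."""
    return SECONDS_PER_DAY // block_interval_seconds


def days_to_produce(blocks: int, block_interval_seconds: int = BLOCK_INTERVAL_SECONDS) -> float:
    """Days the network needs to produce the given number of blocks."""
    return blocks * block_interval_seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    print(blocks_per_day())                        # 28800
    print(round(days_to_produce(10_000_000), 1))   # 347.2
```

In other words, roughly 28’800 new blocks have to be processed every day, and another 10’000’000 blocks pile up in just under a year.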
Speed does matter
When you can’t replay fast enough to catch up with the current head block in a reasonable time, you are doomed.
Of course, you can rely on your state providers, snapshots, etc., but whenever you run out of reliable copies of the reasonably recent state (due to software or hardware failures), or whenever new features are introduced, you have no other choice but to replay.
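To put "fast enough" into numbers, here is a rough sketch of a catch-up estimate. It is a simplified model, not how steemd works internally: it assumes a constant replay rate and treats the blocks produced during the replay as if they were processed at the same rate. The function name and the example rate of 1000 blocks per second are purely illustrative.

```python
def catch_up_seconds(
    head_block: int,
    replay_blocks_per_second: float,
    block_interval_seconds: float = 3.0,
) -> float:
    """Estimate the seconds needed to reach a head block that keeps moving.

    While we replay, the network adds one block every block_interval_seconds,
    so the gap closes at (replay rate - production rate) blocks per second.
    """
    production_rate = 1.0 / block_interval_seconds
    if replay_blocks_per_second <= production_rate:
        raise ValueError("replay is not faster than block production; it never catches up")
    return head_block / (replay_blocks_per_second - production_rate)


if __name__ == "__main__":
    # Hypothetical average rate, chosen only to illustrate the formula.
    seconds = catch_up_seconds(30_000_000, 1000.0)
    print(f"{seconds / 3600:.2f} hours")
```

As long as the replay rate is hundreds of blocks per second, the moving target barely matters. The real danger is a rate that keeps dropping as the state grows.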
When something unexpected happens
Back in September, after an unexpected event caused by a bug, we were forced to replay with the fixed code.
That’s where being able to replay the whole blockchain from scratch in less than 4 hours, and having multiple high-end servers at your service, pays off. Especially when you need to do that multiple times.
I started producing again at block 26038153 (one hell of a block!), soon after we got back to a fully operational status.
There were no transactions going on between 12:47:00 and 19:56:51 UTC.
And it could have been much worse.
Benchmarks
v0.19.2 - pre-appbase, 20M blocks, 2018-02-19
In the previous episode, we were able to replay 20M blocks in 145 minutes.

v0.19.12 - appbase, 25M blocks, 2018-08-12
There were a number of optimizations to improve reindex performance.
With appbase-based steemd v0.19.12, we were able to replay 20M blocks in 121 minutes and reach the 25M block mark in less than 5 hours.

v0.20.9 - Current version, 30M blocks, 2019-02-02
The current version of steemd is even faster. Although “Reconstructing Block Log Index..” itself takes over 6 minutes longer, we were able to replay 20M blocks in 115 minutes, reach the 25M block mark 25 minutes earlier than v0.19.12, and replay 30M blocks in less than 8 hours (7h38m).

Replay speed

Replay speed, measured at 100k block intervals, never drops below 324 blocks per second, and the last 1M blocks replay at an average speed of 454 blocks per second.
Summary
Replay | v0.19.2 | v0.19.12 | v0.20.9 |
---|---|---|---|
20 M blocks | 145 min | 121 min | 115 min |
25 M blocks | n/a | 297 min | 272 min |
30 M blocks | n/a | n/a | 458 min |
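The table can also be turned into average speeds per segment. The sketch below uses only the numbers from the summary table; the variable and function names are made up for illustration, and since the times are rounded to full minutes, the results are approximate.

```python
# Minutes needed to reach each block mark, copied from the summary table.
REPLAY_MINUTES = {
    "v0.19.2": {20_000_000: 145},
    "v0.19.12": {20_000_000: 121, 25_000_000: 297},
    "v0.20.9": {20_000_000: 115, 25_000_000: 272, 30_000_000: 458},
}


def segment_speeds(marks):
    """Return (start_block, end_block, blocks_per_second) for each segment."""
    speeds = []
    prev_block, prev_minutes = 0, 0
    for block, minutes in sorted(marks.items()):
        seconds = (minutes - prev_minutes) * 60
        speeds.append((prev_block, block, (block - prev_block) / seconds))
        prev_block, prev_minutes = block, minutes
    return speeds


if __name__ == "__main__":
    for version, marks in REPLAY_MINUTES.items():
        for start, end, bps in segment_speeds(marks):
            print(f"{version}: {start:>10,} -> {end:>10,}: {bps:7.0f} blocks/s")
```

For v0.20.9 this gives roughly 2899 blocks per second for the first 20M blocks, 531 between 20M and 25M, and 448 between 25M and 30M. The recent, much busier blocks are several times more expensive to replay than the early ones.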
Paying attention pays off
By paying attention to performance, we are able to identify and resolve issues that, if underestimated, may prove silent killers and endanger the reliability of our platform in the long run.
One such issue, which I detected during appbase development, was a witness plugin that caused significant resync performance degradation.
Upcoming improvements
We are nearing 50GB of state file size for a consensus node.
As you can see, with the current architecture, that is difficult for servers with “only” 32GB of RAM.
I’m looking forward to the MIRA release.
My preliminary benchmarks of MIRA branches are far from encouraging, but Steemit Inc team updates say that MIRA is in a heavy development / optimization phase. A lot of effort is being put into making it replay in a reasonable time, and the payoff would be robust access to the state in low-memory conditions.
Previous episodes of Steem Pressure series
Introducing: Steem Pressure #1
Steem Pressure #2 - Toys for Boys and Girls
Steem Pressure #3 - Steem Node 101
Steem Pressure: The Movie ;-)
Steem Pressure #4 - Need for Speed
Stay tuned for the next episodes of Steem Pressure :-)
Bonus: Inspiration for the title

If you believe I can be of value to Steem, please vote for me (gtg) as a witness on Steemit's Witnesses List or set (gtg) as a proxy that will vote for witnesses for you.
Your vote does matter!
You can contact me directly on steem.chat, as Gandalf

Steem On