The new ATX node!
Finally revealing what I bought. For some, the fact that I went with Intel might be an instant disappointment, but I had a reason (two, actually) for this strategy.
First, Intel CPUs are still faster for some "front end" (web) workloads. For mixed workloads it's more of a debate, but Intel can still win in some groups of them. Power-wise it's OK... most real-world usage is fine and efficient. Yes, AMD mostly wins at games, but for things that need clock speed and a broad set of architecture features, Intel covers more ground.
The second reason was the price! I will explain why below...
For IO and storage, definitely AMD! On IOPS and overall latency versus power, AMD rocks... so many lanes, a very good connection to IO, and, on the bigger enterprise models, extra memory channels.
But I bought this one with a vision already... and that is: buying an AMD system later for the IO part, and leaving this one for the front end in the future, as Intel CPUs can easily sustain long periods of support for many codebases without becoming too deprecated.
Anyhow, apart from this and some other tests I can't reveal due to NDAs, I decided to procure the Intel 14th gen CPU. Also because, in NZ, this CPU was quite a bit cheaper than AMD (which is in high demand). Warranty is something I have to balance as well; it is a pain in the ass when I buy stuff from outside the country, because any time I have a problem, I incur more costs! Hence why I default to buying in-country when importing isn't worth the pennies...
Either way, the CPU is quite good (I would probably buy this one for games too, but AMD for anything else, at least while AMD demand stays this high; otherwise, AMD first). And I have yet to reveal some power usage numbers... but compared with my other AMD Zen 2 6-core gear, this one beats it easily! Idle power is super nice if you are running things very lightly at the start... and it can scale up quickly too.
Zen 5 is going to be fricking crazy! Intel has some stuff up its sleeve too... things are changing with the introduction of HBM and NVIDIA coming into play with ARM. There is so much activity this year, it's a clusterF... of news!
Either way, here is the guy! Enjoy the benchmarks and let me know if you have any questions...
Results Disclaimer
Please be aware, especially for the VM results, that numbers may vary (due to other workloads, although I have tried to minimize their impact as much as I could) or due to specific conditions I might have missed or failed to detect. So please take these results AS IS.
The reason I do these runs is to be sure of what I will be counting on, and to establish baselines for when troubleshooting is needed in the future. You don't need to do this yourself if you don't want to. For me, this is also a mix of keeping up to date and enjoyment, given my professional background.
Hive Compile
- 6-core (VM) Intel 10th Gen, DDR4 2933 MHz, NVMe 2TB PCIe 3.0
Ubuntu 23.10
time cmake -DCMAKE_BUILD_TYPE=Release -DBUILD_HIVE_TESTNET=OFF -GNinja ../../hive/
# (3.7 seconds)
time ninja
# (real 18m9.183s) 6 threads
- 24-core Intel 14th Gen, DDR5 5600 MHz, NVMe 1TB Samsung 990 PRO PCIe 4.0
Ubuntu 22.04
time cmake -DCMAKE_BUILD_TYPE=Release -DBUILD_HIVE_TESTNET=OFF -GNinja ../../hive/
# (1.3 seconds)
time ninja
# (real 2m53.608s) 32 threads (system consumption of 150-440 Watts)
Less than 3 minutes 😱
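By the way, in case you want to reproduce the build, here is a minimal sketch of the whole sequence as I would run it (the repository URL and the directory layout are assumptions on my part; adjust them to your own checkout):

# sketch: assumes the hive checkout sits two levels above the build directory
git clone --recurse-submodules https://github.com/openhive-network/hive.git
mkdir -p build/hive && cd build/hive
time cmake -DCMAKE_BUILD_TYPE=Release -DBUILD_HIVE_TESTNET=OFF -GNinja ../../hive/
time ninja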
Hive Live Sync
Unless otherwise mentioned, these were the seed nodes I used:
p2p-seed-node = api.hive.blog:2001 rpc.ecency.com:2001 seed.openhive.network:2001 anyx.io:2001 hiveseed-fin.privex.io:2001 hive-seed.arcange.eu:2001 seed.liondani.com:2016 hived.splinterlands.com:2001 node.mahdiyari.info:2001 seed.roelandp.nl:2001 hiveseed-se.privex.io:2001 hive-seed.lukestokes.info:2001
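If you are replicating this, that line goes into hived's config.ini inside the data directory (the exact path below is an assumption and depends on your setup); all seeds sit space-separated on a single line, exactly as listed above:

# sketch: <data-dir>/config.ini (often ~/.hived/config.ini)
p2p-seed-node = api.hive.blog:2001 rpc.ecency.com:2001 seed.openhive.network:2001
# (append the remaining seeds from the list above to that same line)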
Partial
- 6-core (VM) Intel 10th Gen, DDR4 2933 MHz, NVMe 2TB PCIe 3.0
first 2M blocks live sync, Ubuntu 22.04
real 6m3.114s
user 9m59.928s
sys 3m2.908s
- 6-core (VM) Intel 10th Gen, DDR4 2933 MHz, NVMe 2TB PCIe 3.0
first 2M blocks live sync, Ubuntu 23.10
real 5m52.383s
user 10m12.387s
sys 3m11.303s
- 6-core (VM) Intel 10th Gen, DDR4 2933 MHz, NVMe 2TB PCIe 3.0
first 2M blocks live sync, Ubuntu 23.10
p2p-seed-node = localnode:2001
real 6m23.933s
user 10m37.310s
sys 3m29.787s
p2p-seed-node = localnode:2001 api.hive.blog:2001 rpc.ecency.com:2001 seed.openhive.network:2001 anyx.io:2001 hiveseed-fin.privex.io:2001 hive-seed.arcange.eu:2001 seed.liondani.com:2016 hived.splinterlands.com:2001 node.mahdiyari.info:2001 seed.roelandp.nl:2001 hiveseed-se.privex.io:2001 hive-seed.lukestokes.info:2001
real 6m33.510s
user 11m4.258s
sys 3m47.537s
- 24-core Intel 14th Gen, DDR5 5600 MHz, NVMe 1TB Samsung 990 PRO PCIe 4.0
first 2M blocks live sync, Ubuntu 22.04
real 2m6.406s
user 4m10.067s
sys 0m51.402s
Full (blocks indicated)
- 24-core Intel 14th Gen, DDR5 5600 MHz, NVMe 1TB Samsung 990 PRO PCIe 4.0
full live sync, Ubuntu 22.04
2023-12-24T00:04:31.632 application.cpp:174 load_logging_config ] Logging thread started
...
2023-12-24T19:22:59.666 chain_plugin.cpp:1176 accept_block ] Syncing Blockchain --- Got block: #81310000 time: 2023-12-24T16:40:12 producer: guiltyparties --- 979.81µs/block, avg. size 10553, avg. tx count 39
2023-12-24T19:23:03.017 chain_plugin.cpp:498 operator() ] entering live mode
2023-12-24T19:23:03.017 chain_plugin.cpp:519 operator() ] waiting for work: 0.01%, waiting for locks: 0.00%, processing transactions: 0.00%, processing blocks: 99.95%, unknown: 0.04%
2023-12-24T19:23:03.331 block_log_artifacts.cpp:443 flush_header ] Attempting to write header (pack_size: 56) containing: git rev: 0acce16829c0702ac09d9e51beb57b9fac157c22, format version: 1.1, head_block_num: 81313230, tail_block_num: 1, generating_interrupted_at_block: 0, dirty_close: 1
19h18m 😱 - 81313230 blocks
And I have observed the same performance with just a local node (p2p-seed-node = localnode:2001). Using more than one local node could have advantages, but likely very little to no improvement, as most of the time spent here is buffered/hidden (assuming enough resources, obviously).
(after firmware update)
No difference. I am guessing here, but possibly because there is too much hidden overlap between IO access and the time it takes to process things; combined with the download time, it adds up to no difference. And since most of the disk writing is cached, the impact is very little to none (or hidden, as previously mentioned).
Hive Local Replay
Partial
- 6-core (VM) Intel 10th Gen, DDR4 2933 MHz, NVMe 2TB PCIe 3.0
first 10M blocks replay, Ubuntu 22.04
real 19m24.695s
user 23m27.612s
sys 2m21.661s
- 6-core (VM) Intel 10th Gen, DDR4 2933 MHz, NVMe 2TB PCIe 3.0
first 10M blocks replay, Ubuntu 23.10
real 19m36.713s
user 24m6.525s
sys 1m52.247s
- 24-core Intel 14th Gen, DDR5 5600 MHz, NVMe 1TB Samsung 990 PRO PCIe 4.0
first 10M blocks replay, Ubuntu 22.04
real 6m35.121s
user 7m25.173s
sys 0m36.348s
I tried overclocking the memory to 5800 and 6000 MHz and it gave worse timings: 6m41s and 6m40s respectively (voltage raised to stabilize, but the memory controller frequency ratios might have needed to go higher to compensate). I didn't spend enough time on this to reach better conclusions. Timings were also calculated by the motherboard, so they might not have been ideal.
Somewhere along these benchmarks, I ran into a problem with the 990 PRO NVMe while it was running the 3B2QJXD7 firmware, so I updated it to the most recent at the time (4B2QJXD7). I shared some notes on how you can do this here.
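If your distro ships fwupd and the drive's firmware is published on LVFS (which seems to be the case for the 990 PRO), one way to do it from Linux looks like the sketch below; this is not necessarily the method from my notes:

# sketch: assumes fwupd is installed and the firmware is on LVFS
sudo fwupdmgr refresh       # fetch the latest firmware metadata
sudo fwupdmgr get-updates   # check whether the NVMe has a pending update
sudo fwupdmgr update        # apply it (a reboot may be required)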
(after firmware update)
- 24-core Intel 14th Gen, DDR5 5600 MHz, NVMe 1TB Samsung 990 PRO PCIe 4.0
first 10M blocks replay, Ubuntu 22.04
real 5m51.730s
user 6m42.006s
sys 0m35.455s
Full (blocks indicated)
- 24-core Intel 14th Gen, DDR5 5600 MHz, NVMe 1TB Samsung 990 PRO PCIe 4.0
full replay, Ubuntu 22.04
2023-12-24T20:44:35.362 application.cpp:174 load_logging_config ] Logging thread started
...
2023-12-25T04:34:16.905 block_log.cpp:837 operator() ] Exiting the queue thread
2023-12-25T04:34:17.386 block_log.cpp:887 for_each_block ] Attempting to join queue_filler_thread...
2023-12-25T04:34:17.386 block_log.cpp:889 for_each_block ] queue_filler_thread joined.
2023-12-25T04:34:17.386 database.cpp:460 reindex ] Done reindexing, elapsed time: 28181.39933899999959976 sec
2023-12-25T04:34:17.386 chain_plugin.cpp:1101 plugin_startup ] P2P enabling after replaying...
2023-12-25T04:34:17.386 chain_plugin.cpp:768 work ] Started on blockchain with 81314881 blocks, LIB: 81314862
2023-12-25T04:34:17.386 chain_plugin.cpp:774 work ] Started on blockchain with 81314881 blocks
7h50m 😱 - 81314881 blocks
- 24-core Intel 14th Gen, DDR5 5600 MHz, NVMe 1TB Samsung 990 PRO PCIe 4.0
full replay, Ubuntu 23.10
2024-01-03T04:54:41.502 application.cpp:174 load_logging_config ]
...
2024-01-03T12:39:03.301 chain_plugin.cpp:498 operator() ] entering live mode
2024-01-03T12:39:03.305 block_log_artifacts.cpp:443 flush_header ] Attempting to write header (pack_size: 56) containing: git rev: 0acce16829c0702ac09d9e51beb57b9fac157c22, format version: 1.1, head_block_num: 81592963, tail_block_num: 1, generating_interrupted_at_block: 0, dirty_close: 1
The difference compared with Ubuntu 22.04 (per 10M-block segment: elapsed time, the same in seconds, its ratio to the first segment, and the delta versus the 22.04 run):
2024-01-03T05:00:27.502 10000000 of 81583660 blocks = 12.26% +5m46s 346 x1 -4s
2024-01-03T05:36:49.577 20000000 of 81583660 blocks = 24.51% +36m22s 2182 x6.31 -35s
2024-01-03T06:44:02.082 30000000 of 81583660 blocks = 36.77% +1h07m13s 4033 x11.66 -71s
2024-01-03T07:35:10.006 40000000 of 81583660 blocks = 49.03% +51m08s 3068 x8.87 -55s
2024-01-03T08:08:33.446 50000000 of 81583660 blocks = 61.29% +33m23s 2003 x5.79 -9s
2024-01-03T09:07:53.024 60000000 of 81583660 blocks = 73.54% +59m20s 3560 x10.29 -49s
2024-01-03T10:38:08.463 70000000 of 81583660 blocks = 85.80% +1h29m16s 5356 x15.48 -189s
2024-01-03T12:25:57.193 80000000 of 81583660 blocks = 98.06% +1h42m49s 6169 x17.83 -491s
2024-01-03T12:38:13.505 81500000 of 81583660 blocks = 99.90% +12m16s
7h45m22s 😱 - 81592963 blocks
Same system, filesystem, etc.
(after firmware update)
Not much noticeable difference! As expected, because most of this code depends on low latency rather than raw throughput.
Hive-Engine
Light DB (restore)
- 6-core (VM) Intel 10th Gen, DDR4 2933 MHz, NVMe 2TB PCIe 3.0
mongo 7, Ubuntu 23.10, zfs recordsize=128k
time mongorestore --gzip --archive=./hsc_12-27-2023_b81403927.archive --drop
real 40m36.832s
user 9m0.805s
sys 3m9.786s
- 24-core Intel 14th Gen, DDR5 5600 MHz, NVMe 1TB Samsung 990 PRO PCIe 4.0
mongo 7, Ubuntu 22.04, zfs recordsize=128k
time mongorestore --gzip --archive=./hsc_12-27-2023_b81403927.archive --drop -j 8
real 19m17.922s
user 7m16.488s
sys 0m46.082s
Note: using 8 parallel collections (-j 8, i.e. numParallelCollections) did not make any difference against the default of 4 for this DB dump, on the same system.
- 24-core Intel 14th Gen, DDR5 5600 MHz, NVMe 1TB Samsung 990 PRO PCIe 4.0
mongo 7, Ubuntu 23.10, zfs recordsize=128k
time mongorestore --gzip --archive=./hsc_12-27-2023_b81403927.archive --drop -j 8
real 26m47.608s
user 8m47.959s
sys 0m58.855s
vs -j 4
real 26m42.191s
user 8m44.144s
sys 1m3.569s
All the above had ashift=9 (512-byte sectors), which has a huge impact on NVMe wear... and it's not worth the performance, given the amount of writing you will be doing. So if you are using ZFS, make sure you increase the ashift to at least 12 (4 KiB sectors).
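Keep in mind ashift is fixed per vdev at creation time, so you cannot change it on an existing pool; it has to be set when the pool is created. A sketch (pool and device names are placeholders):

# sketch: pool/device names are placeholders
sudo zpool create -o ashift=12 tank /dev/nvme0n1
sudo zpool get ashift tank   # verify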
- 24-core Intel 14th Gen, DDR5 5600 MHz, NVMe 1TB Samsung 990 PRO PCIe 4.0
mongo 7.0.5, Ubuntu 22.04, zfs recordsize=4k, NVMe with ashift=12
time mongorestore --gzip --archive=./hsc_01-10-2024_b81806836.archive --drop -j 16
real 21m16.251s
user 7m25.942s
sys 0m42.511s
- 24-core Intel 14th Gen, DDR5 5600 MHz, NVMe 1TB Samsung 990 PRO PCIe 4.0
mongo 7.0.5, Ubuntu 22.04, zfs recordsize=128k, NVMe with ashift=12
time mongorestore --gzip --archive=./hsc_01-10-2024_b81806836.archive --drop -j 16
real 21m10.244s
user 7m37.555s
sys 0m42.051s
And it looks like 32k-aligned IO is still more performant.
- 24-core Intel 14th Gen, DDR5 5600 MHz, NVMe 1TB Samsung 990 PRO PCIe 4.0
mongo 7.0.5, Ubuntu 22.04, zfs recordsize=32k, NVMe with ashift=12
time mongorestore --gzip --archive=./hsc_01-10-2024_b81806836.archive --drop -j 16
real 20m24.926s
user 7m29.396s
sys 0m41.792s
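Unlike ashift, recordsize can be changed per dataset at any time, but it only applies to files written after the change, so set it before the restore. A sketch (dataset name is a placeholder):

# sketch: only newly written files pick up the new recordsize
sudo zfs set recordsize=32k tank/mongo
sudo zfs get recordsize tank/mongo   # verify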
Full DB (restore)
- 24-core Intel 14th Gen, DDR5 5600 MHz, NVMe 1TB Samsung 990 PRO PCIe 4.0
mongo 7.0.4, Ubuntu 23.10, zfs recordsize=128k
time mongorestore --gzip --archive=./hsc_12-27-2023_b81403927.archive
This one was cancelled due to some sort of performance bug between Mongo 7 and Ubuntu 23.10:
2024-01-08T00:50:33.938+1300 hsc.chain 43.6GB
2024-01-08T00:50:33.938+1300 hsc.transactions 42.0GB
2024-01-08T00:50:33.938+1300
2024-01-08T00:50:34.122+1300 signal 'interrupt' received; attempting to shut down
2024-01-08T00:50:34.202+1300 terminating read on hsc.transactions
2024-01-08T00:50:34.239+1300 hsc.transactions 42.0GB
2024-01-08T00:50:34.239+1300 finished restoring hsc.transactions (539924017 documents, 0 failures)
2024-01-08T00:50:34.239+1300 Failed: hsc.transactions: error restoring from archive './hsc_01-05-2024_b81662950.archive': received termination signal
2024-01-08T00:50:34.239+1300 539924017 document(s) restored successfully. 0 document(s) failed to restore.
real 1583m42.338s
user 67m41.998s
sys 8m37.995s
And then, because I saw that there was an update to Mongo, I did another run...
- 24-core Intel 14th Gen, DDR5 5600 MHz, NVMe 1TB Samsung 990 PRO PCIe 4.0
mongo 7.0.5-rc0, Ubuntu 23.10, zfs recordsize=128k
time mongorestore --gzip --archive=./hsc_12-27-2023_b81403927.archive -j 16
2024-01-10T00:46:54.603+1300 727706502 document(s) restored successfully. 0 document(s) failed to restore.
real 1480m42.037s
user 124m20.456s
sys 14m11.413s
Slightly more than a full day... and I am not sure why it can't do more IO; it likely needs lower latency rather than more parallel IO. The next step is to increase the number of NVMes and see if this scales.
And because this one was done with -j 16, I am now wondering whether the first run would actually have finished, since there is a part of the restore where the speed drops to a crawl. In this run, I also increased Mongo's cache from 8 to 16 GB, as I saw it being a limiting factor at 8 GB.
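For reference, the WiredTiger cache size lives in mongod.conf; a sketch of the change described above (the path assumes a default packaged install, and mongod needs a restart afterwards):

# sketch: /etc/mongod.conf (default packaged path assumed)
storage:
  wiredTiger:
    engineConfig:
      cacheSizeGB: 16
# then: sudo systemctl restart mongod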
- 24-core Intel 14th Gen, DDR5 5600 MHz, NVMe 1TB Samsung 990 PRO PCIe 4.0
mongo 7.0.5-rc0, Ubuntu 23.10, zfs recordsize=32k
time mongorestore --gzip --archive=./hsc_12-27-2023_b81403927.archive -j 16
2024-01-11T04:27:19.628+1300 727706502 document(s) restored successfully. 0 document(s) failed to restore.
real 905m53.218s
user 113m19.469s
sys 12m29.928s