Performance

ADAvault uses a combination of Prometheus and Grafana to collect data from node instances and to monitor the active and passive clusters that we operate. Delegators have complete transparency on pool operation and performance, with access to the same real-time monitoring dashboard that our Ops team uses.
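
By way of illustration, here is a minimal sketch of the kind of check our dashboards are built on, assuming the node exposes its Prometheus metrics endpoint on the default 127.0.0.1:12798; the metric names shown are examples and vary between node releases.

```python
# Minimal sketch: scrape the cardano-node Prometheus endpoint directly.
# Assumes the default 127.0.0.1:12798/metrics endpoint (hasPrometheus in the
# node config); metric names vary between releases, so adjust as required.
import urllib.request

METRICS_URL = "http://127.0.0.1:12798/metrics"
WATCHED = ("cardano_node_metrics_blockNum_int",
           "cardano_node_metrics_slotInEpoch_int")

def fetch_metrics(url: str = METRICS_URL) -> dict:
    """Return the node's Prometheus metrics as a {name: value} dict."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        text = resp.read().decode()
    metrics = {}
    for line in text.splitlines():
        if line.startswith("#") or not line.strip():
            continue
        parts = line.split()
        if len(parts) < 2:
            continue
        try:
            metrics[parts[0]] = float(parts[1])
        except ValueError:
            continue
    return metrics

if __name__ == "__main__":
    snapshot = fetch_metrics()
    for name in WATCHED:
        print(f"{name}: {snapshot.get(name, 'not exported by this release')}")
```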

Grafana Dashboard

Stake pool availability is 100% (allowing for cluster switch time), and nodes are on the latest release. Performance tuning is carried out periodically for each release, with a number of improvements already in place for time synchronisation and file systems (migrating from ext4 to xfs).

You will notice that the ‘Node Uptime’ and ‘Blocks Forged’ metrics will periodically reset when we move from the active to the passive cluster. We perform this switch when we need to patch the nodes or upgrade them to a new Cardano release. By running multiple separate clusters we can minimise downtime to seconds while the switch occurs.
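
For illustration, a hedged sketch of the kind of pre-switch check involved, assuming both clusters expose the metrics endpoint described above; the hostnames and lag tolerance are placeholders rather than our production values.

```python
# Sketch of a pre-switch check: only promote the passive cluster once its
# chain tip is within a small tolerance of the active cluster's tip.
# Hostnames are placeholders; both nodes are assumed to expose /metrics
# as in the previous example.
import urllib.request

TIP_METRIC = "cardano_node_metrics_blockNum_int"
MAX_LAG_BLOCKS = 2  # tolerated difference before we allow the switch

def block_height(host: str, port: int = 12798) -> float:
    with urllib.request.urlopen(f"http://{host}:{port}/metrics", timeout=5) as resp:
        for line in resp.read().decode().splitlines():
            if line.startswith(TIP_METRIC):
                return float(line.split()[1])
    raise RuntimeError(f"{TIP_METRIC} not found on {host}")

def safe_to_switch(active: str = "active.example.internal",
                   passive: str = "passive.example.internal") -> bool:
    lag = block_height(active) - block_height(passive)
    return abs(lag) <= MAX_LAG_BLOCKS

if __name__ == "__main__":
    print("Passive cluster in sync:", safe_to_switch())
```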

Resource utilisation for memory, network and CPU is low (even with the spike at Epoch boundaries), and there is significant capacity on the existing hardware for growth in transactions.

The Cardano node software has proven to be very stable to date; however, we always allow for a comprehensive soak test on the passive nodes when compiling new releases.

Historic Performance

Checking historical performance using the slot leader election algorithm allows us to confirm that the ADAvault ADV stake pool has minted all blocks allocated since it started operation*. Expected blocks are based on stake per Epoch, and the recent history of minted blocks (actual/statistical) is shown below.

Epoch  Lead  Ideal  Luck  Confirmed  Miss/Ghost/Stolen/Invalid
271    72    62     116%  -          -
270    60    62     97%   -          -
269    63    62     102%  61         2
268    60    62     97%   60         -
267    72    61     117%  71         1
266    75    60     124%  73         2
265    61    61     100%  60         1
264    62    61     101%  58         4
263    57    62     92%   56         1
262    58    61     94%   55         3
261    69    60     115%  66         3
260    64    61     104%  61         2 + 1
* There were stolen and ghosted blocks in some Epochs where a predicted block was minted by another pool. This is known as a “slot battle” and is a standard part of the Ouroboros Praos protocol; the outcome is decided by a random function using each pool’s VRF key. On these occasions ADV lost. See the discussion of the implementation here: https://github.com/input-output-hk/ouroboros-network/issues/2014
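
To illustrate the mechanism, here is a simplified sketch of how such a tie might be resolved, assuming the tie-break discussed in the linked issue (the block carrying the lower VRF output wins); the pool names and VRF values below are made up for illustration.

```python
# Simplified illustration of a "slot battle": two pools are elected leader
# for the same slot and each forges a valid block. Assuming the tie-break
# discussed in the linked issue (the block with the lower VRF output wins),
# only one block survives on the chain.
# The VRF values below are fabricated for illustration only.

def resolve_slot_battle(block_a: dict, block_b: dict) -> dict:
    """Return the block that wins the tie-break (lower VRF output)."""
    return block_a if block_a["vrf_output"] < block_b["vrf_output"] else block_b

adv_block   = {"pool": "ADV",        "slot": 12_345_678, "vrf_output": 0x7F3A}
other_block = {"pool": "other pool", "slot": 12_345_678, "vrf_output": 0x2C91}

winner = resolve_slot_battle(adv_block, other_block)
print(f"Slot {winner['slot']} battle won by {winner['pool']}")
# In the Epochs marked above, ADV was on the losing side of this comparison.
```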

As you can see in the table above, there is naturally occurring random variation in the blocks allocated per Epoch (some are higher, some lower). Over time this averages out, and pools that produce their allocated blocks return around 5-6% annual Return on ADA (ROA).
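
As a rough illustration of where the per-Epoch expectation comes from, here is a back-of-the-envelope sketch using mainnet parameters (432,000 slots per Epoch, active slot coefficient 0.05); the stake figures are illustrative rather than our actual numbers.

```python
# Back-of-the-envelope expected blocks per Epoch for a pool, using mainnet
# parameters (432,000 slots per Epoch, active slot coefficient f = 0.05).
# The stake figures are illustrative, not ADV's actual numbers.

SLOTS_PER_EPOCH = 432_000
ACTIVE_SLOT_COEFF = 0.05                                 # fraction of slots that carry a block
BLOCKS_PER_EPOCH = SLOTS_PER_EPOCH * ACTIVE_SLOT_COEFF   # ~21,600 blocks per Epoch

pool_active_stake  = 65_000_000        # ADA, illustrative
total_active_stake = 23_000_000_000    # ADA, illustrative

sigma = pool_active_stake / total_active_stake   # pool's share of active stake
expected_blocks = BLOCKS_PER_EPOCH * sigma

allocated_blocks = 72                  # e.g. slots led in one Epoch
luck = allocated_blocks / expected_blocks

print(f"Expected blocks this Epoch: {expected_blocks:.1f}")
print(f"Luck: {luck:.0%}")
```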

Historic performance for ADV2 is shown in the table below.

Epoch  Lead  Ideal  Luck  Confirmed  Miss/Ghost/Stolen/Invalid
271    50    57     87%   -          -
270    66    58     113%  4          -
269    51    58     88%   51         -
268    65    59     111%  64         1
267    64    58     110%  60         4
266    47    57     82%   46         1
265    55    58     96%   54         1
264    58    56     103%  57         1
263    62    54     116%  62         -
262    54    52     103%  53         1 + 1
261    38    50     76%   34         4
260    57    50     114%  55         1 + 1
* ADV2 has lost a similar ratio of slot battles to date, increasing as saturation has increased. There is no intrinsic technical difference between the pools; clashes are simply less probable when a pool is less saturated. From this point we would expect to see equivalence between ADV and ADV2 maintained, given roughly the same stake.

We have added detailed information for ADV3 now that it has produced blocks for 10 Epochs. Performance is more variable as the pool is smaller; overall performance is slightly below 97% at this point.

Historic performance for ADV3 is shown in the table below.

Epoch  Lead  Ideal  Luck  Confirmed  Miss/Ghost/Stolen/Invalid
271    6     10     59%   -          -
270    6     10     58%   -          -
269    13    10     125%  13         -
268    12    15     78%   12         -
267    13    16     82%   13         -
266    13    16     81%   13         -
265    15    16     92%   15         -
264    13    15     87%   13         -
263    12    14     85%   12         -
262    10    14     73%   10         -
261    14    13     105%  13         1
260    15    12     127%  15         -
259    10    12     87%   10         -
258    15    10     153%  15         -
257    9     6      160%  9          -
* ADV3 has lost one slot battle to date. There is no intrinsic technical difference between the pools; clashes are simply less probable when pools are less saturated. As the amount staked increases, we would expect the number of stolen blocks to increase.

Latency Measurements

ADAvault has very low latency measurements for connectivity to peers. This is important as it ensures that the stake pools are able to mint all blocks allocated. Real-time details are available from PoolTool.io, which measures propagation delays across all registered pools.

ADAvault propagation delay for slot leader to broadcast to all network nodes
ADAvault propagation delay receiving from connected network nodes

The important number here is not the average (or median) latency at ~300ms, but the mode (the most frequently occurring latency), which is ~100ms. The average is less relevant because it is skewed by a very infrequent long tail. That tail could be removed by connecting only to geographically close peers, but doing so would make the network weaker and reduce global block propagation.
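
A small sketch of why the mode is the more informative summary for a long-tailed distribution; the delay sample below is synthetic, chosen only to mimic the shape described above.

```python
# Synthetic example of a long-tailed propagation delay distribution:
# most deliveries arrive quickly, a small minority take much longer.
# The numbers are fabricated to mimic the shape described above.
from statistics import mean, median, mode

delays_ms = [100] * 40 + [300] * 38 + [450] * 17 + [1500] * 5

print(f"mean   : {mean(delays_ms):.0f} ms")   # dragged up by the long tail
print(f"median : {median(delays_ms):.0f} ms")
print(f"mode   : {mode(delays_ms):.0f} ms")   # the typical delivery time
```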

For comparison, it takes approximately 200ms to send a ping request halfway around the world, so 100ms for the mode is towards the lower theoretical bound for latency.
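
A rough check of that figure, assuming signals travel through fibre at roughly two-thirds of the speed of light:

```python
# Rough lower bound on round-trip time to the far side of the world,
# assuming signals travel through fibre at roughly 2/3 of the speed of light.
SPEED_OF_LIGHT_KM_S = 300_000
FIBRE_FRACTION = 2 / 3
HALF_EARTH_CIRCUMFERENCE_KM = 20_000   # ~40,000 km circumference / 2

one_way_ms = HALF_EARTH_CIRCUMFERENCE_KM / (SPEED_OF_LIGHT_KM_S * FIBRE_FRACTION) * 1000
print(f"One-way          : {one_way_ms:.0f} ms")      # ~100 ms
print(f"Round trip (ping): {2 * one_way_ms:.0f} ms")  # ~200 ms
```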

We will be regularly reviewing global peer connectivity in the coming months, and are confident the pools have excellent performance characteristics for reliable block production.