ADAvault uses a combination of Prometheus and Grafana to collect data from node instances and to monitor the active and passive clusters that we operate. Delegators have complete transparency on pool operation and performance, with access to the same real-time monitoring dashboard that our Ops team uses.
Stake pool availability is at 100% (allowing for cluster switch time). Nodes are running release 1.24.2, which has much lower resource demands at Epoch boundaries than earlier releases. Performance tuning for this release is complete, with a number of improvements in place for time sync and file systems.
You will notice that the ‘Node Uptime’ and ‘Blocks Forged’ metrics periodically reset when we move from the active to the passive cluster. We perform this switch when we need to patch the nodes or upgrade them to a new Cardano release. By running two completely separate clusters we can limit downtime to the few seconds the switch takes.
Resource utilisation for memory, network and CPU is low (even with the spike at Epoch boundaries) and there is significant capacity on the existing hardware for growth in transactions.
The Cardano node software has proven to be very stable to date. However, we always allow for a comprehensive soak test on the passive cluster when compiling new releases.
Checking historical performance using the pool algorithm allows us to confirm that the ADAvault ADV stake pool has minted all blocks allocated since it started operation*. Expected blocks are based on stake per Epoch, with the recent history of minted blocks (actual/statistical) below.
As you can see in the table above, there is a naturally occurring random variation in blocks allocated per Epoch (some are higher, some lower). Over time, for pools that produce all of their allocated blocks, this averages out to around 5-6% annual Return on ADA (ROA).
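To illustrate how an expected block count and its Epoch-to-Epoch variation can be estimated, here is a minimal sketch. It assumes the Shelley mainnet values of 432,000 slots per Epoch and an active slot coefficient of 0.05, scales by (1 - d) for the decentralisation parameter d, and approximates the spread with a Poisson distribution. The stake figures and the value of d in the usage lines are hypothetical, not ADAvault's actual numbers, and the function names are our own for illustration.

```python
import math

SLOTS_PER_EPOCH = 432_000   # assumed Shelley mainnet Epoch length in slots
ACTIVE_SLOT_COEFF = 0.05    # assumed fraction of slots that carry a block


def expected_blocks(pool_stake, total_stake, d=0.0):
    """Approximate statistical expectation of blocks for one Epoch.

    d is the decentralisation parameter: the share of blocks still
    reserved for federated nodes (0.0 means fully decentralised).
    """
    return SLOTS_PER_EPOCH * ACTIVE_SLOT_COEFF * (1.0 - d) * pool_stake / total_stake


def poisson_pmf(k, mean):
    """Probability of exactly k blocks, using a Poisson approximation."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def prob_between(low, high, mean):
    """Probability of being allocated between low and high blocks inclusive."""
    return sum(poisson_pmf(k, mean) for k in range(low, high + 1))


if __name__ == "__main__":
    # Hypothetical inputs: 10M ADA in the pool, 20B ADA staked overall, d = 0.5
    mean_blocks = expected_blocks(10_000_000, 20_000_000_000, d=0.5)
    print(f"expected blocks per Epoch: {mean_blocks:.1f}")
    print(f"chance of 3 to 8 blocks:   {prob_between(3, 8, mean_blocks):.0%}")
```

With a small expected count, swings of several blocks either way between Epochs are normal under this approximation, which is why returns are best judged over many Epochs rather than one.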
ADAvault has very low latency measurements for connectivity to peers. This is important as it ensures that the stake pool is able to mint all blocks allocated. Real-time details are available from PoolTool.io, which measures propagation delays across all registered pools.
The important number here is not the average (or median) latency at 479ms, but the mode (the most frequently occurring latency), which is 100ms. The average is less relevant because it is skewed by a very infrequent long tail. That tail could be removed by connecting only to geographically close peers, but doing so would have the side effect of weakening the network and reducing global block propagation.
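A small sketch shows how a long tail pulls the average well away from the typical value. The delay samples are made up for illustration (they are not PoolTool.io data), and the 50ms bucketing is simply an assumption to make the mode meaningful for measurements that rarely repeat exactly.

```python
from statistics import mean, median, mode

# Hypothetical propagation delays in milliseconds, with a long tail
delays_ms = [100, 100, 100, 100, 150, 200, 300, 400, 900, 2500]


def summarise(samples, bucket_ms=50):
    """Return mean, median and mode; the mode is taken over rounded buckets."""
    bucketed = [bucket_ms * round(s / bucket_ms) for s in samples]
    return {
        "mean": mean(samples),
        "median": median(samples),
        "mode": mode(bucketed),
    }


if __name__ == "__main__":
    for name, value in summarise(delays_ms).items():
        print(f"{name:>6}: {value} ms")
```

In this made-up sample the two slowest measurements account for most of the mean, while the mode stays at the delay that the majority of blocks actually see.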
For comparison, it takes approximately 200ms for a ping request to travel halfway around the world and back. We can therefore see that a mode of 100ms is towards the lower theoretical bound for latency.
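The 200ms figure can be checked with a back-of-the-envelope calculation. This sketch assumes the signal travels the whole distance in optical fibre along the shortest surface path, with a typical refractive index of about 1.47. Real routes are longer and add switching delay, so it gives a lower bound rather than a prediction.

```python
EARTH_CIRCUMFERENCE_KM = 40_075      # equatorial circumference
LIGHT_SPEED_VACUUM_KM_S = 299_792    # speed of light in vacuum
FIBRE_REFRACTIVE_INDEX = 1.47        # assumed typical value for optical fibre


def round_trip_ms(distance_km, refractive_index=FIBRE_REFRACTIVE_INDEX):
    """Minimum round-trip time in milliseconds over an ideal fibre path."""
    speed_km_s = LIGHT_SPEED_VACUUM_KM_S / refractive_index
    return 2 * distance_km / speed_km_s * 1000


if __name__ == "__main__":
    halfway_km = EARTH_CIRCUMFERENCE_KM / 2
    print(f"halfway round the world and back: {round_trip_ms(halfway_km):.0f} ms")
```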
We will be regularly reviewing global peer connectivity in the coming months, and are confident the pool has excellent performance characteristics for reliable block production.