ADAvault uses a combination of Prometheus and Grafana to collect data from node instances and to monitor the active and passive clusters that we operate. Delegators have complete transparency into pool operation and performance, with access to the same real-time monitoring dashboard that our Ops team uses.
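For readers unfamiliar with how Prometheus collects data, each monitored instance exposes its metrics as plain text, which Prometheus scrapes on a schedule. The following is a minimal sketch of reading that text format, not our production tooling. The metric names are hypothetical placeholders, and the parser assumes simple lines with no timestamps and no spaces inside labels.

```python
# Hypothetical sample of the Prometheus text exposition format.
# The metric names below are illustrative only.
SAMPLE = """\
# HELP example_node_uptime_seconds Seconds since the node process started.
# TYPE example_node_uptime_seconds gauge
example_node_uptime_seconds 86400
# TYPE example_blocks_forged_total counter
example_blocks_forged_total 12
"""


def parse_metrics(text: str) -> dict[str, float]:
    """Return a mapping of metric name to value, skipping comments and blanks."""
    metrics: dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # HELP/TYPE comment lines and blank lines carry no samples
        parts = line.split()
        metrics[parts[0]] = float(parts[1])
    return metrics


if __name__ == "__main__":
    for name, value in parse_metrics(SAMPLE).items():
        print(f"{name} = {value}")
```

Grafana then reads the stored samples from Prometheus to draw the dashboard panels.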
Stake pool availability is at 100% (allowing for cluster switch time). Nodes are running release 1.21.1, which has much lower resource demands than 1.18.0 at Epoch boundaries. Performance tuning for this release is complete, with a number of improvements in place for time sync and file systems.
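To put the cluster switch allowance in perspective, here is a rough worked example. It assumes a five-day Epoch (432,000 seconds) and a hypothetical 10-second switch; the 10-second figure is an illustration, not a measured value.

```python
EPOCH_SECONDS = 5 * 24 * 60 * 60  # five-day Epoch = 432,000 seconds


def availability_percent(period_seconds: float, downtime_seconds: float) -> float:
    """Percentage of the period during which the pool was available."""
    return 100.0 * (period_seconds - downtime_seconds) / period_seconds


if __name__ == "__main__":
    # Hypothetical: one 10-second cluster switch during a single Epoch.
    print(f"{availability_percent(EPOCH_SECONDS, 10):.4f}%")
```

Under these assumptions a single switch costs only a few thousandths of a percent of one Epoch.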
You will notice that the ‘Node Uptime’ and ‘Blocks Forged’ metrics reset periodically when we move from the active to the passive cluster. We perform this switch when we need to patch the nodes or upgrade them to a new Cardano release. By running two completely separate clusters we can limit downtime to the few seconds the switch takes.
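If you want a running total across these resets, the usual approach with counter-style metrics is to treat any drop in value as a restart from zero and keep adding the increases. The sketch below illustrates that idea with made-up sample values; it is not how the dashboard itself is computed.

```python
def total_increase(samples: list[float]) -> float:
    """Sum the increases in a counter series, treating any drop as a reset to zero."""
    if not samples:
        return 0.0
    total = 0.0
    previous = samples[0]
    for current in samples[1:]:
        if current >= previous:
            total += current - previous
        else:
            # The counter restarted (for example after a cluster switch),
            # so everything counted since the restart is new.
            total += current
        previous = current
    return total


if __name__ == "__main__":
    # Hypothetical 'Blocks Forged' readings with one reset after the value 5.
    readings = [0, 2, 5, 0, 1, 3]
    print(total_increase(readings))
```

With the hypothetical readings above, five blocks are counted before the reset and three after it.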
Resource utilisation for memory, network and CPU is low (even with the spike at Epoch boundaries) and there is significant capacity on the existing hardware for growth in transactions.
The Cardano node software has proven very stable to date, with one exception (1.18.1). That exception is why we always allow time for a comprehensive soak test on the passive cluster when compiling new releases.
If you are interested in receiving more information, let us know. We are open to implementing anonymous* alerting options for delegators.
*Anonymous since we specifically do not want to hold any personally identifiable information on delegators to ADAvault.