GlusterFS / VG Metrics in Prometheus (OCP)

We had a requirement to gather LVM (VG) metrics via Prometheus to alert when GlusterFS is running low on ‘brick’ storage space. Currently, within Openshift 3.9 the only metrics seem to relate to mounted FS. A ‘heketi exporter module’ exists but this only reports space within allocated blocks. There doesn’t appear to be any method to pull metrics from the underlying storage.

We solved this by using a Prometheus pushgateway. Metrics are pushed from Gluster hosts using curl (via cron) and then pulled using a standard Prometheus scrape configuration (via prometheus configmap in OCP). Alerts are then pushed via alertmanager and eventually Cloudforms.

Import the pushgateway image:

oc import-image openshift/prom-pushgateway --from= docker.io/prom/pushgateway --confirm

Create pod and expose route. Then, add scrape config to prometheus configmap:-


- job_name: openshift-pushgateway
scrape_interval: 30s
scrape_timeout: 30s
metrics_path: /metrics
scheme: http
static_configs:
- targets:
- pushgateway-route.example.com

On GlusterFS hosts we then gather metrics in whatever way we like and push to the gateway via curl. Example:-

echo "VG_Gluster 42" | curl --data-binary @- http://pushgateway-route.example.com/metrics/job/pv_mon/pv/vg"

The metrics are then visible via prometheus UI / Grafana and alerts via alertmanager and CFME respectively.