prometheus liveness probe sidecar

Signed-off-by: Daniel-Pivonka <dpivonka@redhat.com>
2025-06-13 02:33:34 +00:00 · 2019-06-20 15:30:40 -04:00
parent 2ca575b99d
commit d621a58207
30 changed files with 2483 additions and 55 deletions
--- a/docs/deploy-cephfs.md
+++ b/docs/deploy-cephfs.md
@@ -1,3 +1,4 @@
+
 # CSI CephFS plugin

 The CSI CephFS plugin is able to both provision new CephFS volumes
@@ -42,18 +43,22 @@ that should be resolved in v14.2.3.

 **Available command line arguments:**

-| Option              | Default value               | Description                                                                                                                                                                                                                                                                            |
-| ------------------- | --------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `--endpoint`        | `unix://tmp/csi.sock`       | CSI endpoint, must be a UNIX socket                                                                                                                                                                                                                                                    |
-| `--drivername`      | `cephfs.csi.ceph.com`       | Name of the driver (Kubernetes: `provisioner` field in StorageClass must correspond to this value)                                                                                                                                                                                     |
-| `--nodeid`          | _empty_                     | This node's ID                                                                                                                                                                                                                                                                         |
-| `--type`            | _empty_                     | Driver type `[rbd | cephfs]` If the driver type is set to  `rbd` it will act as a `rbd plugin` or if it's set to `cephfs` will act as a `cephfs plugin`                                                                                                                                |
+| Option              | Default value               | Description                                                                                                                                                                                                                                                                               |
+| ------------------- | --------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `--endpoint`        | `unix://tmp/csi.sock`       | CSI endpoint, must be a UNIX socket                                                                                                                                                                                                                                                       |
+| `--drivername`      | `cephfs.csi.ceph.com`       | Name of the driver (Kubernetes: `provisioner` field in StorageClass must correspond to this value)                                                                                                         |
+| `--nodeid`          | _empty_                     | This node's ID                                                                                                                                                                                                                                                                            |
+| `--type`            | _empty_                     | Driver type `[rbd | cephfs]` If the driver type is set to  `rbd` it will act as a `rbd plugin` or if it's set to `cephfs` will act as a `cephfs plugin`                                                                                                               |
 | `--volumemounter`   | _empty_                     | Default volume mounter. Available options are `kernel` and `fuse`. This is the mount method used if volume parameters don't specify otherwise. If left unspecified, the driver will first probe for `ceph-fuse` in system's path and will choose Ceph kernel client if probing failed. |
 | `--mountcachedir`   | _empty_                     | Volume mount cache info save dir. If left unspecified, the dirver will not record mount info, or it will save mount info and when driver restart it will remount volume it cached.                                                                                                     |
 | `--instanceid`      | "default"                   | Unique ID distinguishing this instance of Ceph CSI among other instances, when sharing Ceph clusters across CSI instances for provisioning                                                                                                                                             |
 | `--pluginpath`      | "/var/lib/kubelet/plugins/" | The location of cephcsi plugin on host                                                                                                                                                                                                                                                 |
 | `--metadatastorage` | _empty_                     | Points to where older (1.0.0 or older plugin versions) metadata about provisioned volumes are kept, as file or in as k8s configmap (`node` or `k8s_configmap` respectively)                                                                                                            |
 | `--pidlimit`        | _0_                         | Configure the PID limit in cgroups. The container runtime can restrict the number of processes/tasks which can cause problems while provisioning (or deleting) a large number of volumes. A value of `-1` configures the limit to the maximum, `0` does not configure limits at all.   |
+| `--livenessport`    | `8080`                      | TCP port for liveness requests                                                                                                                                                                                                                                                            |
+| `--livenesspath`    | `/metrics`                  | Path of prometheus endpoint where metrics will be available                                                                                                                                                                                                                               |
+| `--polltime`        | `60s`                       | Time interval in between each poll                                                                                                                                                                                                                                                        |
+| `--timeout`         | `3s`                        | Probe timeout in seconds                                                                                                                                                                                                                                                                  |

 **Available environmental variables:**

@@ -163,11 +168,11 @@ After successfully completing the steps above, you should see output similar to
 ```bash
 $ kubectl get all
 NAME                                 READY     STATUS    RESTARTS   AGE
-pod/csi-cephfsplugin-provisioner-0   3/3       Running   0          25s
-pod/csi-cephfsplugin-rljcv           2/2       Running   0          24s
+pod/csi-cephfsplugin-provisioner-0   4/4       Running   0          25s
+pod/csi-cephfsplugin-rljcv           3/3       Running   0          24s

 NAME                                   TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)     AGE
-service/csi-cephfsplugin-provisioner   ClusterIP   10.101.78.75     <none>        12345/TCP   26s
+service/csi-cephfsplugin-provisioner   ClusterIP   10.101.78.75     <none>        8080/TCP    26s
 ...
 ```

--- a/docs/deploy-rbd.md
+++ b/docs/deploy-rbd.md
@@ -1,3 +1,4 @@
+
 # CSI RBD Plugin

 The RBD CSI plugin is able to provision new RBD images and
@@ -36,6 +37,10 @@ make image-cephcsi
 | `--instanceid`      | "default"             | Unique ID distinguishing this instance of Ceph CSI among other instances, when sharing Ceph clusters across CSI instances for provisioning                                   |
 | `--metadatastorage` | _empty_               | Points to where legacy (1.0.0 or older plugin versions) metadata about provisioned volumes are kept, as file or in as k8s configmap (`node` or `k8s_configmap` respectively) |
 | `--pidlimit`        | _0_                   | Configure the PID limit in cgroups. The container runtime can restrict the number of processes/tasks which can cause problems while provisioning (or deleting) a large number of volumes. A value of `-1` configures the limit to the maximum, `0` does not configure limits at all.   |
+| `--livenessport`    | `8080`                | TCP port for liveness requests                                                                                                                                               |
+| `--livenesspath`    | `/metrics`            | Path of Prometheus endpoint where metrics will be available                                                                                                                  |
+| `--polltime`        | `60s`                 | Time interval in between each poll                                                                                                                                           |
+| `--timeout`         | `3s`                  | Probe timeout in seconds                                                                                                                                                     |

 **Available volume parameters:**

@@ -126,11 +131,11 @@ After successfully completing the steps above, you should see output similar to
 ```bash
 $ kubectl get all
 NAME                              READY     STATUS    RESTARTS   AGE
-pod/csi-rbdplugin-fptqr           2/2       Running   0          21s
-pod/csi-rbdplugin-provisioner-0   4/4       Running   0          22s
+pod/csi-rbdplugin-fptqr           3/3       Running   0          21s
+pod/csi-rbdplugin-provisioner-0   5/5       Running   0          22s

 NAME                                TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)     AGE
-service/csi-rbdplugin-provisioner   ClusterIP   10.104.2.130   <none>        12345/TCP   23s
+service/csi-rbdplugin-provisioner   ClusterIP   10.104.2.130   <none>        8080/TCP    23s
 ...
 ```

--- a/docs/metrics.md
+++ b/docs/metrics.md
@@ -0,0 +1,32 @@
+# Metrics
+
+CSI deploys a sidecar container that is responsible for collecting metrics.
+
+## Liveness
+
+Liveness metrics are intended to be collected by Prometheus but can also be accessed
+through a GET request to a specific pod IP.
+
+For example:
+`curl -X GET http://[pod ip]:[liveness-port][liveness-path] 2>/dev/null | grep csi`
+
+The expected output is:
+
+```bash
+[root@worker2 /]# curl -X GET http://10.109.65.142:8080/metrics 2>/dev/null | grep csi
+# HELP csi_liveness Liveness Probe
+# TYPE csi_liveness gauge
+csi_liveness 1
+```
+
+Prometheus can be deployed through the Prometheus Operator described [here](https://coreos.com/operators/prometheus/docs/latest/user-guides/getting-started.html).
+The [service-monitor](../examples/service-monitor.yaml) tells Prometheus how
+to pull metrics out of CSI.
+
+Each CSI pod has a service that exposes the endpoint to Prometheus. By default, RBD
+pods use port 8080 and CephFS pods use port 8081.
+These ports can be changed if desired. If multiple Ceph clusters are deployed,
+the additional CSI pods will use additional ports.
+
+You may need to open the ports used in your firewall depending on how your
+cluster is set up.
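
The manual `curl | grep` check described in `docs/metrics.md` could also be scripted. The following is a minimal sketch, not part of this commit: the helper names `parse_gauge` and `fetch_liveness` are hypothetical, and the default port, path and timeout are assumptions taken from the flag defaults documented above. It reads the Prometheus text output and returns the value of the `csi_liveness` gauge.

```python
import urllib.request


def parse_gauge(exposition_text, metric_name="csi_liveness"):
    """Return the value of an unlabelled metric from Prometheus text output, or None."""
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) >= 2 and parts[0] == metric_name:
            return float(parts[1])
    return None


def fetch_liveness(pod_ip, port=8080, path="/metrics", timeout=3.0):
    """Hypothetical helper: GET the metrics endpoint of one pod and parse csi_liveness."""
    url = f"http://{pod_ip}:{port}{path}"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_gauge(resp.read().decode("utf-8"))


# Sample taken from the expected output shown in docs/metrics.md.
SAMPLE = """# HELP csi_liveness Liveness Probe
# TYPE csi_liveness gauge
csi_liveness 1
"""

if __name__ == "__main__":
    print(parse_gauge(SAMPLE))
```

Calling `fetch_liveness("10.109.65.142")` from a host that can reach the pod network would perform the same request as the `curl` example; a return value of `1.0` corresponds to the `csi_liveness 1` line.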