mirror of
https://github.com/ceph/ceph-csi.git
synced 2024-11-30 02:00:19 +00:00
826f7126cd
updated upgrade documentation to remove the snapshot created by alpha driver before upgrade of CSI driver as beta snapshot is not backward compatible with the alpha snapshot. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
416 lines
15 KiB
Markdown
416 lines
15 KiB
Markdown
# Ceph-csi Upgrade
|
|
|
|
- [Ceph-csi Upgrade](#ceph-csi-upgrade)
|
|
- [Pre-upgrade considerations](#pre-upgrade-considerations)
|
|
- [RBD CSI Snapshot Incompatibility](#rbd-csi-snapshot-incompatibility)
|
|
- [Upgrading from v1.2 to v2.0](#upgrading-from-v12-to-v20)
|
|
- [Upgrading from v2.0 to v2.1](#upgrading-from-v20-to-v21)
|
|
- [Upgrading CephFS](#upgrading-cephfs)
|
|
- [1. Upgrade CephFS Provisioner resources](#1-upgrade-cephfs-provisioner-resources)
|
|
- [1.1 Update the CephFS Provisioner RBAC](#11-update-the-cephfs-provisioner-rbac)
|
|
- [1.2 Update the CephFS Provisioner deployment](#12-update-the-cephfs-provisioner-deployment)
|
|
- [2. Upgrade CephFS Nodeplugin resources](#2-upgrade-cephfs-nodeplugin-resources)
|
|
- [2.1 Update the CephFS Nodeplugin RBAC](#21-update-the-cephfs-nodeplugin-rbac)
|
|
- [2.2 Update the CephFS Nodeplugin daemonset](#22-update-the-cephfs-nodeplugin-daemonset)
|
|
- [2.3 Manual deletion of CephFS Nodeplugin daemonset pods](#23-manual-deletion-of-cephfs-nodeplugin-daemonset-pods)
|
|
- [Upgrading RBD](#upgrading-rbd)
|
|
- [3. Upgrade RBD Provisioner resources](#3-upgrade-rbd-provisioner-resources)
|
|
- [3.1 Update the RBD Provisioner RBAC](#31-update-the-rbd-provisioner-rbac)
|
|
- [3.2 Update the RBD Provisioner deployment](#32-update-the-rbd-provisioner-deployment)
|
|
- [3.3 Update snapshot CRD from Alpha to Beta](#33-update-snapshot-crd-from-alpha-to-beta)
|
|
- [4. Upgrade RBD Nodeplugin resources](#4-upgrade-rbd-nodeplugin-resources)
|
|
- [4.1 Update the RBD Nodeplugin RBAC](#41-update-the-rbd-nodeplugin-rbac)
|
|
- [4.2 Update the RBD Nodeplugin daemonset](#42-update-the-rbd-nodeplugin-daemonset)
|
|
- [4.3 Manual deletion of RBD Nodeplugin daemonset pods](#43-manual-deletion-of-rbd-nodeplugin-daemonset-pods)
|
|
- [Handling node reboot hangs due to existing network mounts](#handling-node-reboot-hangs-due-to-existing-network-mounts)
|
|
|
|
## Pre-upgrade considerations
|
|
|
|
In some scenarios there is an issue in the CSI driver that will cause
|
|
application pods to be disconnected from their mounts when the CSI driver is
|
|
restarted. Since the upgrade would cause the CSI driver to restart if it is
|
|
updated, you need to be aware of whether this affects your applications. This
|
|
issue will happen when using the Ceph fuse client or rbd-nbd:
|
|
|
|
CephFS: If you are provision volumes for CephFS and have a kernel less than
|
|
version 4.17, The CSI driver will fall back to use the FUSE client.
|
|
|
|
RBD: If you have set the mounter: rbd-nbd option in the RBD storage class, the
|
|
NBD mounter will have this issue.
|
|
|
|
If you are affected by this issue, you will need to proceed carefully during
|
|
the upgrade to restart your application pods. The recommended step is to modify
|
|
the update strategy of the CSI nodeplugin daemonsets to OnDelete so that you
|
|
can control when the CSI driver pods are restarted on each node.
|
|
|
|
To avoid this issue in future upgrades, we recommend that you do not use the
|
|
fuse client or rbd-nbd as of now.
|
|
|
|
This guide will walk you through the steps to upgrade the software in a cluster
|
|
from v2.0 to v2.1
|
|
|
|
### RBD CSI Snapshot Incompatibility
|
|
|
|
CSI snapshot is moved from Alpha to Beta and is not backward compatible. The snapshots
|
|
created with the Alpha version must be deleted before the upgrade.
|
|
|
|
Check if we have any `v1alpha1` CRD created in our kubernetes cluster, If there
|
|
is no `v1alpha1` CRD created you can skip this step.
|
|
|
|
```bash
|
|
[$]kubectl get crd volumesnapshotclasses.snapshot.storage.k8s.io -o yaml |grep v1alpha1
|
|
- name: v1alpha1
|
|
- v1alpha1
|
|
[$]kubectl get crd volumesnapshotcontents.snapshot.storage.k8s.io -o yaml |grep v1alpha1
|
|
- name: v1alpha1
|
|
- v1alpha1
|
|
[$]kubectl get crd volumesnapshots.snapshot.storage.k8s.io -o yaml |grep v1alpha1
|
|
- name: v1alpha1
|
|
- v1alpha1
|
|
```
|
|
|
|
- List all the volumesnapshots created
|
|
|
|
```bash
|
|
[$] kubectl get volumesnapshot
|
|
NAME AGE
|
|
rbd-pvc-snapshot 22s
|
|
```
|
|
|
|
- Delete all volumesnapshots
|
|
|
|
```bash
|
|
[$] kubectl delete volumesnapshot rbd-pvc-snapshot
|
|
volumesnapshot.snapshot.storage.k8s.io "rbd-pvc-snapshot" deleted
|
|
```
|
|
|
|
- List all volumesnapshotclasses created
|
|
|
|
```bash
|
|
[$] kubectl get volumesnapshotclass
|
|
NAME AGE
|
|
csi-rbdplugin-snapclass 86s
|
|
```
|
|
|
|
- Delete all volumesnapshotclasses
|
|
|
|
```bash
|
|
[$] kubectl delete volumesnapshotclass csi-rbdplugin-snapclass
|
|
volumesnapshotclass.snapshot.storage.k8s.io "csi-rbdplugin-snapclass" deleted
|
|
```
|
|
|
|
*Note:* The underlying snapshots on the storage system will be deleted by ceph-csi
|
|
|
|
## Upgrading from v1.2 to v2.0
|
|
|
|
Refer
|
|
[upgrade-from-1.2-v2.0](https://github.com/ceph/ceph-csi/blob/v2.0.1/docs/ceph-csi-upgrade.md)
|
|
to upgrade from cephcsi 1.2 to v2.0
|
|
|
|
## Upgrading from v2.0 to v2.1
|
|
|
|
**Ceph-csi releases from master are expressly unsupported.** It is strongly
|
|
recommended that you use [official
|
|
releases](https://github.com/ceph/ceph-csi/releases) of Ceph-csi. Unreleased
|
|
versions from the master branch are subject to changes and incompatibilities
|
|
that will not be supported in the official releases. Builds from the master
|
|
branch can have functionality changed and even removed at any time without
|
|
compatibility support and without prior notice.
|
|
|
|
git checkout v2.1.0 tag
|
|
|
|
```bash
|
|
[$] git clone https://github.com/ceph/ceph-csi.git
|
|
[$] cd ./ceph-csi
|
|
[$] git checkout v2.1.0
|
|
```
|
|
|
|
**Note:** While upgrading please Ignore warning messages from kubectl output
|
|
|
|
```console
|
|
Warning: kubectl apply should be used on resource created by either kubectl create --save-config or kubectl apply
|
|
```
|
|
|
|
### Upgrading CephFS
|
|
|
|
Upgrading cephfs csi includes upgrade of cephfs driver and as well as
|
|
kubernetes sidecar containers and also the permissions required for the
|
|
kubernetes sidecar containers, lets upgrade the things one by one
|
|
|
|
#### 1. Upgrade CephFS Provisioner resources
|
|
|
|
Upgrade provisioner resources include updating the provisioner RBAC and
|
|
Provisioner deployment
|
|
|
|
##### 1.1 Update the CephFS Provisioner RBAC
|
|
|
|
```bash
|
|
[$] kubectl apply -f deploy/cephfs/kubernetes/csi-provisioner-rbac.yaml
|
|
serviceaccount/cephfs-csi-provisioner configured
|
|
clusterrole.rbac.authorization.k8s.io/cephfs-external-provisioner-runner configured
|
|
clusterrole.rbac.authorization.k8s.io/cephfs-external-provisioner-runner-rules configured
|
|
clusterrolebinding.rbac.authorization.k8s.io/cephfs-csi-provisioner-role configured
|
|
role.rbac.authorization.k8s.io/cephfs-external-provisioner-cfg configured
|
|
rolebinding.rbac.authorization.k8s.io/cephfs-csi-provisioner-role-cfg configured
|
|
```
|
|
|
|
##### 1.2 Update the CephFS Provisioner deployment
|
|
|
|
```bash
|
|
[$]kubectl apply -f deploy/cephfs/kubernetes/csi-cephfsplugin-provisioner.yaml
|
|
service/csi-cephfsplugin-provisioner configured
|
|
deployment.apps/csi-cephfsplugin-provisioner configured
|
|
```
|
|
|
|
wait for the deployment to complete
|
|
|
|
```bash
|
|
[$]kubectl get deployment
|
|
NAME READY UP-TO-DATE AVAILABLE AGE
|
|
csi-cephfsplugin-provisioner 3/3 1 3 104m
|
|
```
|
|
|
|
deployment UP-TO-DATE value must be same as READY
|
|
|
|
#### 2. Upgrade CephFS Nodeplugin resources
|
|
|
|
Upgrading nodeplugin resources include updating the nodeplugin RBAC and
|
|
nodeplugin daemonset
|
|
|
|
##### 2.1 Update the CephFS Nodeplugin RBAC
|
|
|
|
```bash
|
|
[$]kubectl apply -f deploy/cephfs/kubernetes/csi-nodeplugin-rbac.yaml
|
|
serviceaccount/cephfs-csi-nodeplugin configured
|
|
clusterrole.rbac.authorization.k8s.io/cephfs-csi-nodeplugin configured
|
|
clusterrole.rbac.authorization.k8s.io/cephfs-csi-nodeplugin-rules configured
|
|
clusterrolebinding.rbac.authorization.k8s.io/cephfs-csi-nodeplugin configured
|
|
```
|
|
|
|
If you determined in [Pre-upgrade considerations](#pre-upgrade-considerations)
|
|
that you were affected by the CSI driver restart issue that disconnects the
|
|
application pods from their mounts, continue with this section. Otherwise, you
|
|
can skip to step 2.2
|
|
|
|
```console
|
|
vi deploy/cephfs/kubernetes/csi-cephfsplugin.yaml
|
|
```
|
|
|
|
```yaml
|
|
kind: DaemonSet
|
|
apiVersion: apps/v1
|
|
metadata:
|
|
name: csi-cephfsplugin
|
|
spec:
|
|
selector:
|
|
matchLabels:
|
|
app: csi-cephfsplugin
|
|
updateStrategy:
|
|
type: OnDelete
|
|
template:
|
|
metadata:
|
|
labels:
|
|
app: csi-cephfsplugin
|
|
spec:
|
|
serviceAccount: cephfs-csi-nodeplugin
|
|
```
|
|
|
|
in the above template we have added `updateStrategy` and its `type` to the
|
|
daemonset spec
|
|
|
|
##### 2.2 Update the CephFS Nodeplugin daemonset
|
|
|
|
```bash
|
|
[$]kubectl apply -f deploy/cephfs/kubernetes/csi-cephfsplugin.yaml
|
|
daemonset.apps/csi-cephfsplugin configured
|
|
service/csi-metrics-cephfsplugin configured
|
|
```
|
|
|
|
##### 2.3 Manual deletion of CephFS Nodeplugin daemonset pods
|
|
|
|
If you determined in [Pre-upgrade considerations](#pre-upgrade-considerations)
|
|
that you were affected by the CSI driver restart issue that disconnects the
|
|
application pods from their mounts, continue with this section. Otherwise, you
|
|
can skip this section.
|
|
|
|
As we have set the updateStrategy to OnDelete the CSI driver pods will not be
|
|
updated until you delete them manually. This allows you to control when your
|
|
application pods will be affected by the CSI driver restart.
|
|
|
|
For each node:
|
|
|
|
- Drain your application pods from the node
|
|
- Delete the CSI driver pods on the node
|
|
- The pods to delete will be named with a csi-cephfsplugin prefix and have a
|
|
random suffix on each node. However, no need to delete the provisioner
|
|
pods: csi-cephfsplugin-provisioner-* .
|
|
- The pod deletion causes the pods to be restarted and updated automatically
|
|
on the node.
|
|
|
|
we have successfully upgraded cephfs csi from v2.0 to v2.1
|
|
|
|
### Upgrading RBD
|
|
|
|
Upgrading rbd csi includes upgrade of rbd driver and as well as kubernetes
|
|
sidecar containers and also the permissions required for the kubernetes sidecar
|
|
containers, lets upgrade the things one by one
|
|
|
|
#### 3. Upgrade RBD Provisioner resources
|
|
|
|
Upgrading provisioner resources include updating the provisioner RBAC and
|
|
Provisioner deployment
|
|
|
|
##### 3.1 Update the RBD Provisioner RBAC
|
|
|
|
```bash
|
|
[$]kubectl apply -f deploy/rbd/kubernetes/csi-provisioner-rbac.yaml
|
|
serviceaccount/rbd-csi-provisioner configured
|
|
clusterrole.rbac.authorization.k8s.io/rbd-external-provisioner-runner configured
|
|
clusterrole.rbac.authorization.k8s.io/rbd-external-provisioner-runner-rules configured
|
|
clusterrolebinding.rbac.authorization.k8s.io/rbd-csi-provisioner-role configured
|
|
role.rbac.authorization.k8s.io/rbd-external-provisioner-cfg configured
|
|
rolebinding.rbac.authorization.k8s.io/rbd-csi-provisioner-role-cfg configured
|
|
```
|
|
|
|
##### 3.2 Update the RBD Provisioner deployment
|
|
|
|
```bash
|
|
[$]kubectl apply -f deploy/rbd/kubernetes/csi-rbdplugin-provisioner.yaml
|
|
service/csi-rbdplugin-provisioner configured
|
|
deployment.apps/csi-rbdplugin-provisioner configured
|
|
```
|
|
|
|
wait for the deployment to complete
|
|
|
|
```bash
|
|
[$]kubectl get deployments
|
|
NAME READY UP-TO-DATE AVAILABLE AGE
|
|
csi-rbdplugin-provisioner 3/3 3 3 139m
|
|
```
|
|
|
|
deployment UP-TO-DATE value must be same as READY
|
|
|
|
##### 3.3 Update snapshot CRD from Alpha to Beta
|
|
|
|
As we are updating the snapshot resources from `Alpha` to `Beta` we need to
|
|
delete the old `alphav1` snapshot CRD created by external-snapshotter sidecar container
|
|
|
|
Check if we have any `v1alpha1` CRD created in our kubernetes cluster
|
|
|
|
```bash
|
|
[$]kubectl get crd volumesnapshotclasses.snapshot.storage.k8s.io -o yaml |grep v1alpha1
|
|
- name: v1alpha1
|
|
- v1alpha1
|
|
[$]kubectl get crd volumesnapshotcontents.snapshot.storage.k8s.io -o yaml |grep v1alpha1
|
|
- name: v1alpha1
|
|
- v1alpha1
|
|
[$]kubectl get crd volumesnapshots.snapshot.storage.k8s.io -o yaml |grep v1alpha1
|
|
- name: v1alpha1
|
|
- v1alpha1
|
|
```
|
|
|
|
As we have `v1alpha1` CRD created in our kubernetes cluster,we need to delete
|
|
the Alpha CRD
|
|
|
|
```bash
|
|
[$]kubectl delete crd volumesnapshotclasses.snapshot.storage.k8s.io
|
|
customresourcedefinition.apiextensions.k8s.io "volumesnapshotclasses.snapshot.storage.k8s.io" deleted
|
|
[$]kubectl delete crd volumesnapshotcontents.snapshot.storage.k8s.io
|
|
customresourcedefinition.apiextensions.k8s.io "volumesnapshotcontents.snapshot.storage.k8s.io" deleted
|
|
[$]kubectl delete crd volumesnapshots.snapshot.storage.k8s.io
|
|
customresourcedefinition.apiextensions.k8s.io "volumesnapshots.snapshot.storage.k8s.io" deleted
|
|
```
|
|
|
|
**Note**: Its kubernetes distributor responsibility to install new snapshot
|
|
controller and snapshot beta CRD. more info can be found
|
|
[here](https://github.com/kubernetes-csi/external-snapshotter/tree/master#usage)
|
|
|
|
#### 4. Upgrade RBD Nodeplugin resources
|
|
|
|
Upgrading nodeplugin resources include updating the nodeplugin RBAC and
|
|
nodeplugin daemonset
|
|
|
|
##### 4.1 Update the RBD Nodeplugin RBAC
|
|
|
|
```bash
|
|
[$]kubectl apply -f deploy/rbd/kubernetes/csi-nodeplugin-rbac.yaml
|
|
serviceaccount/rbd-csi-nodeplugin configured
|
|
clusterrole.rbac.authorization.k8s.io/rbd-csi-nodeplugin configured
|
|
clusterrole.rbac.authorization.k8s.io/rbd-csi-nodeplugin-rules configured
|
|
clusterrolebinding.rbac.authorization.k8s.io/rbd-csi-nodeplugin configured
|
|
```
|
|
|
|
If you determined in [Pre-upgrade considerations](#pre-upgrade-considerations)
|
|
that you were affected by the CSI driver restart issue that disconnects the
|
|
application pods from their mounts, continue with this section. Otherwise, you
|
|
can skip to step 4.2
|
|
|
|
```console
|
|
vi deploy/rbd/kubernetes/csi-rbdplugin.yaml
|
|
```
|
|
|
|
```yaml
|
|
kind: DaemonSet
|
|
apiVersion: apps/v1
|
|
metadata:
|
|
name: csi-rbdplugin
|
|
spec:
|
|
selector:
|
|
matchLabels:
|
|
app: csi-rbdplugin
|
|
updateStrategy:
|
|
type: OnDelete
|
|
template:
|
|
metadata:
|
|
labels:
|
|
app: csi-rbdplugin
|
|
spec:
|
|
serviceAccount: rbd-csi-nodeplugin
|
|
```
|
|
|
|
in the above template we have added `updateStrategy` and its `type` to the
|
|
daemonset spec
|
|
|
|
##### 4.2 Update the RBD Nodeplugin daemonset
|
|
|
|
```bash
|
|
[$]kubectl apply -f deploy/rbd/kubernetes/csi-rbdplugin.yaml
|
|
daemonset.apps/csi-rbdplugin configured
|
|
service/csi-metrics-rbdplugin configured
|
|
```
|
|
|
|
##### 4.3 Manual deletion of RBD Nodeplugin daemonset pods
|
|
|
|
If you determined in [Pre-upgrade considerations](#pre-upgrade-considerations)
|
|
that you were affected by the CSI driver restart issue that disconnects the
|
|
application pods from their mounts, continue with this section.
|
|
Otherwise, you can skip this section.
|
|
|
|
As we have set the updateStrategy to OnDelete the CSI driver pods will not be
|
|
updated until you delete them manually. This allows you to control when your
|
|
application pods will be affected by the CSI driver restart.
|
|
|
|
For each node:
|
|
|
|
- Drain your application pods from the node
|
|
- Delete the CSI driver pods on the node
|
|
- The pods to delete will be named with a csi-rbdplugin prefix and have a
|
|
random suffix on each node. However, no need to delete the provisioner pods
|
|
csi-rbdplugin-provisioner-* .
|
|
- The pod deletion causes the pods to be restarted and updated automatically
|
|
on the node.
|
|
|
|
we have successfully upgraded RBD csi from v2.0 to v2.1
|
|
|
|
### Handling node reboot hangs due to existing network mounts
|
|
|
|
With prior versions of ceph-csi a node reboot may hang the reboot process. This
|
|
is due to existing cephfs or rbd network mounts not carrying the `_netdev`
|
|
mount option. The missing mount option is fixed, and future mounts using
|
|
ceph-csi would automatically carry the required flags needed for a clean
|
|
reboot.
|
|
|
|
It is suggested to upgrade to the latest ceph-csi release and drain application
|
|
pods on all the nodes so that new mount option `_netdev` can be added to all
|
|
the mountpoints.
|