ceph-csi

mirror of https://github.com/ceph/ceph-csi.git synced 2024-12-29 08:20:20 +00:00

Author	SHA1	Message	Date
Prasanna Kumar Kalever	09a8e5e9e6	rbd: unset cluster Name metadata unsets the cluster name metadata key and value on the RBD image Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2022-06-08 16:23:59 +00:00
Prasanna Kumar Kalever	2880c25fd6	rbd: set cluster Name as metadata on the image This change helps read the cluster name from the cmdline args, the provisioner will set the same on the RBD images. Fixes: #2973 Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2022-06-08 16:23:59 +00:00
Prasanna Kumar Kalever	deb003e605	cleanup: use prefix instead of hardcoding csiParameterPrefix Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2022-06-08 16:23:59 +00:00
Madhu Rajanna	1952a9b4b3	ci: fix all linter errors found in golangci-lint Fixing all the linter errors found in golang-ci lint v1.46.2 Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2022-06-03 12:55:54 +00:00
Madhu Rajanna	c9943320ac	cephfs: skip NetNamespaceFilePath if the volume is pre-provisioned In case of a pre-provisioned volume, the clusterID is not set in the volume context. As the clusterID is missing, we cannot extract the NetNamespaceFilePath from the configuration file. For static volume and dynamically provisioned volume the clusterID is set. Note:- This is a special case to support mounting PV without clusterID parameter. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2022-06-03 07:25:25 +00:00
Rakshith R	7688306f87	rbd: use vaultAuthPath variable name in error msg Before the change, the error msg was the following: ``` failed to set VAULT_AUTH_MOUNT_PATH in Vault config: path is empty ``` `vaultAuthPath` is the actual variable name set by the user. The error message will now be the following: ``` failed to set "vaultAuthPath" in vault config: path is empty ``` Signed-off-by: Rakshith R <rar@redhat.com>	2022-05-26 07:37:48 +00:00
Rakshith R	894c20f792	nfs: add support for pvc-pvc clone This commit adds support for pvc-pvc clone. Only capability needed to be advertised, the underlying support is already provided by cephfs backend. Signed-off-by: Rakshith R <rar@redhat.com>	2022-05-24 18:13:02 +00:00
Rakshith R	24515b509f	nfs: add support for create & delete snapshot This commit adds support for creation and deletion of nfs snapshots based on cephfs. Signed-off-by: Rakshith R <rar@redhat.com>	2022-05-24 18:13:02 +00:00
Prasanna Kumar Kalever	6470cf3343	rbd: fix bug handling GetKrbdSupportedFeatures() continue running rbd driver when /sys/bus/rbd/supported_features file is missing, do not bailout. Fixes: #2678 Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2022-05-15 15:10:08 +00:00
Prasanna Kumar Kalever	83cc1b0e58	rbd: handle when krbdFeatures is zero krbdFeatures is set to zero when kernel version < 3.8, i.e. in case where /sys/bus/rbd/supported_features is absent and we are unable to prepare the krbd attributes based on kernel version. When krbdFeatures is set to zero fallback to NBD only when autofallback is turned ON. Fixes: #2678 Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2022-05-15 15:10:08 +00:00
Prasanna Kumar Kalever	e53fd87154	rbd: prepare krbd feature attrs if supported_features file is absent Upstream /sys/bus/rbd/supported_features is part of Linux kernel v4.11.0 Prepare the attributes and use them in case if /sys/bus/rbd/supported_features is missing. Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2022-05-15 15:10:08 +00:00
Prasanna Kumar Kalever	27f503c144	rbd: unset parent PVC metadata on CreateVolume From Volume Unset the parent PVC metadata on the temp clone rbd image Fixes: #2970 Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2022-05-12 15:54:09 +00:00
Prasanna Kumar Kalever	e0f34a6d60	rbd: unset snapshot metadata on CreateVolume From snapshot Unset the snapshot metadata from the rbd image created from the snapshot Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2022-05-12 15:54:09 +00:00
Prasanna Kumar Kalever	d89c5fb39f	rbd: unset PVC metadata on CreateSnapshot Unset the PVC metadata on the rbd image created for the snapshot Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2022-05-12 15:54:09 +00:00
Prasanna Kumar Kalever	bac33262ae	rbd: add unset volume/snapshot metadata utility functions Added GetVolumeMetadataKeys(), GetSnapshotMetadataKeys(), unsetVolumeMetadata() and unsetSnapshotMetadata() functions. Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2022-05-12 15:54:09 +00:00
Prasanna Kumar Kalever	1fd5277b3c	cleanup: simplify setVolumeMetadata and rename it Move k8s.GetVolumeMetadata() out of setVolumeMetadata() and rename it to setAllMetadata() so that the same can be used for setting volume and snapshot metadata. Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2022-05-12 15:54:09 +00:00
Niels de Vos	36e51402cb	nfs: support ExpandVolume CSI procedure There is not much the NFS-provisioner needs to do to expand a volume, everything is handled by the CephFS components. NFS does not need a resize on the node, so only ControllerExpandVolume is required. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2022-05-10 17:43:59 +00:00
Madhu Rajanna	70674565df	rbd: consider rbd as default mounter if not set For the default mounter, the mounter option is not set in the storageclass, and because it is not available in the storageclass it is not set in the volume context either. Because of this the mapOptions are getting discarded. If the mounter is not set, assume it is the rbd mounter. Note:- If the mounter is not set in the storageclass we could set it in the volume context explicitly. The check is done in the node server instead, to support existing volumes; the check is minimal and we are not altering the volume context. fixes: #3076 Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2022-05-09 20:00:11 +00:00
Marcus Röder	a95a6213eb	util: support systems using the new cgroup v2 structure With cgroup v2, the location of the pids.max file changed and so did the /proc/self/cgroup file. New /proc/self/cgroup file: ` 0::/user.slice/user-500.slice/session-14.scope ` Old file: ` 11:pids:/user.slice/user-500.slice/session-2.scope 10:blkio:/user.slice 9:net_cls,net_prio:/ 8:perf_event:/ ... ` There is no directory per subsystem (e.g. /sys/fs/cgroup/pids) any more, all files are now in one directory. fixes: https://github.com/ceph/ceph-csi/issues/3085 Signed-off-by: Marcus Röder <m.roeder@yieldlab.de>	2022-05-07 20:38:48 +00:00
Rakshith R	f1ccc4eced	rbd: support pvc-pvc clone with different sc & encryption This commit makes modification so as to allow pvc-pvc clone with different storageclass having different encryption configs. This commit also modifies `copyEncryptionConfig()` to include a `isEncrypted()` check within the function. Signed-off-by: Rakshith R <rar@redhat.com>	2022-05-06 10:32:21 +00:00
Rakshith R	bd57feb26e	rbd: use `vaultAuthPath` variable name in error msg Before the change, the error msg was the following: ``` failed to set VAULT_AUTH_MOUNT_PATH in Vault config: path is empty ``` `vaultAuthPath` is the actual variable name set by the user. The error message will now be the following: ``` failed to set "vaultAuthPath" in vault config: path is empty ``` Signed-off-by: Rakshith R <rar@redhat.com>	2022-05-05 05:49:31 +00:00
Niels de Vos	9d7faf850f	nfs: delete the CephFS volume when the export is already removed In case the NFS-export has already been removed from the NFS-server, but the CSI Controller was restarted, a retry to remove the NFS-volume will fail with an error like: > GRPC error: ....: response status not empty: "Export does not exist" When this error is reported, assume the NFS-export was already removed from the NFS-server configuration, and continue with deleting the backend volume. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2022-05-04 21:31:06 +00:00
Madhu Rajanna	d2bc9743f7	cephfs: add netNamespaceFilePath for CephFS as same host directory is not shared between the cephfs and the rbd plugin pod. we need to keep the netNamespaceFilePath separately for both cephfs and rbd. CephFS plugin will use this path to execute mount -t commands. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2022-04-19 12:28:46 +00:00
Madhu Rajanna	eb4bfb7326	cleanup: use block comment for ClusterInfo example Adjusted the mix of tabs and the spaces and also used block comment for better readability. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2022-04-19 12:28:46 +00:00
Madhu Rajanna	b4acbd08a5	rbd: move radosNamespace to RBD section radosNamespace is specific to RBD rather than to the general ceph configuration. Now that a new RBD section has been introduced for RBD specific options, move the radosNamespace to the RBD section, while still keeping the radosNamespace under the global ceph level configuration for backward compatibility. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2022-04-19 12:28:46 +00:00
Madhu Rajanna	766346868e	util: Add RBD specific options in clusterInfo As the netNamespaceFilePath can be separate for both cephfs and rbd adding the netNamespaceFilePath path for RBD, This will help us to keep RBD and CephFS specific options separately. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2022-04-19 12:28:46 +00:00
Niels de Vos	2b71aac752	nfs: return gRPC status from CephFS CreateVolume failure The NFS Controller returns a non-gRPC error in case the CreateVolume call for the CephFS volume fails. It is better to return the gRPC-error that the CephFS Controller passed along. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2022-04-19 08:23:16 +00:00
Humble Chirammal	fcd0f4713a	cleanup: correct typos in test description and source code This commit corrects typos in various places. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2022-04-18 10:29:08 +00:00
Humble Chirammal	4c4879ba8b	cleanup: remove import alias for fence library This commit removes the unneeded import alias of the fence library from the network_fence test. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2022-04-18 10:29:08 +00:00
Madhu Rajanna	c245436ec4	util: fix logging in ExecuteCommandWithNSEnter log the nsenter and its argument after executing the command with the nsenter CLI. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2022-04-14 12:17:21 +00:00
Niels de Vos	28369702d2	nfs: use go-ceph API for creating/deleting exports Recent versions of Ceph allow calling the NFS-export management functions over the go-ceph API. This seems incompatible with older versions that have been tested with the `ceph nfs` commands that this commit replaces. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2022-04-14 08:01:45 +00:00
Madhu Rajanna	d886ab0d66	rbd: use leases for leader election use leases for leader election instead of the deprecated configmap based leader election. This PR is making leases as default leader election refer https://github.com/kubernetes-sigs/controller-runtime/pull/1773, default from configmap to configmap leases was done with https://github.com/kubernetes-sigs/controller-runtime/pull/1144. Release notes https://github.com/kubernetes-sigs/controller-runtime/releases/tag/v0.7.0 Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2022-04-14 06:46:50 +00:00
Madhu Rajanna	64a9b1fa59	rbd: consider remote image health for primary To consider the image is healthy during the Promote operation currently we are checking only the image state on the primary site. If the network is flaky or the remote site is down the image health is not as expected. To make sure the image is healthy across the clusters check the state on both local and the remote clusters. some details: https://bugzilla.redhat.com/show_bug.cgi?id=2014495 Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2022-04-13 08:37:23 +00:00
Madhu Rajanna	dffb6e72c2	rbd: check nbd tool features only for rbd driver calling setRbdNbdToolFeatures inside an init gets called in main.go for both cephfs and rbd driver. instead of calling it in init function calling this in rbd driver.go as this is specific to rbd. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2022-04-11 21:18:27 +00:00
Humble Chirammal	959df4dbac	doc: correct typos in struct field comments and release.md corrected strings in the release guide and util server. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2022-04-11 06:23:25 +00:00
Prasanna Kumar Kalever	41fe2c7dda	rbd: set metadata on the snapshot Set snapshot-name/snapshot-namespace/snapshotcontent-name details on RBD backend snapshot image as metadata on snapshot Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2022-04-08 15:43:14 +00:00
Prasanna Kumar Kalever	0ef79c6fc0	rbd: set metadata on restart of provisioner pod Make sure to set metadata when image exist, i.e. if the provisioner pod is restarted while createVolume is in progress, say it created the image but didn't yet set the metadata. Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2022-04-08 15:43:14 +00:00
Prasanna Kumar Kalever	ae5925f04c	rbd: update PV/PVC metadata on a reattach of PV For example, if a PVC was deleted with `persistentVolumeReclaimPolicy` set to `Retain` on the PV, and the PV is reattached to a new PVC, we make sure to update the PV/PVC image metadata on the PV reattach. Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2022-04-08 15:43:14 +00:00
Prasanna Kumar Kalever	0119d69ab2	rbd: set PV/PVC details on the image as metadata on create This helps Monitoring solutions without access to Kubernetes clusters to display the details of the PV/PVC/NameSpace in their dashboard. Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2022-04-08 15:43:14 +00:00
Prasanna Kumar Kalever	4d750ed0e5	rbd: add set/Get VolumeMetadata() utility function Define and use PV and PVC metadata keys used by external provisioner. The CSI external-provisioner (v1.6.0+) introduces the --extra-create-metadata flag, which automatically sets map<string, string> parameters in the CSI CreateVolumeRequest. Add utility functions to set/Get PV/PVC/PVCNamespace metadata on image Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2022-04-08 15:43:14 +00:00
Madhu Rajanna	7b2aef0d81	util: add support for the nsenter add support to run rbd map and mount -t commands with the nsenter. complete design of pod/multus network is added here https://github.com/rook/rook/blob/master/design/ceph/multus-network.md#csi-pods Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2022-04-08 10:23:21 +00:00
Prasanna Kumar Kalever	d760d0ab6d	rbd: check for cookie support from kernel Currently we only check if the rbd-nbd tool supports cookie feature. This change will also defend cookie addition based on kernel version Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2022-04-04 09:51:13 +00:00
Madhu Rajanna	f8bbd2f60f	cephfs: fix omap deletion in DeleteSnapshot The omap is stored with the requested snapshot name not with the subvolume snapshotname. This fix uses the correct snapshot request name to cleanup the omap once the subvolume snapshot is deleted. fixes: #2974 Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2022-03-31 13:46:03 +00:00
Niels de Vos	1da19680b4	nfs: support new and old NFS-management commands The `ceph nfs export ...` commands have changed in recent Ceph releases. Use the most recent command as a default, fall back to the older command when an error is reported. This should make the NFS-provisioner work on any current Ceph version. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2022-03-31 11:28:40 +00:00
Madhu Rajanna	f90408be4d	rbd: increase force promote timeout to 2 minutes Increase the timeout to 2 minutes to give enough time for rollback to complete. As rollback is performed by the force-promote command it, at times, may take more than a minute (based on dirty blocks that need to be rolled back approximately) to rollback. The added extra 1 minute is useful though to avoid multiple calls to complete the rollback and in extremely corner cases to avoid failures in the first instance of the call when the mirror watcher is not yet removed (post scaling down the RBD mirror instance) Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2022-03-30 13:46:27 +00:00
Thibaut Blanchard	e874c9c11b	rbd: fix topology snapshot pool Restoring a snapshot with a new PVC results with a wrong dataPoolName in case of initial volume linked to a storageClass with topology constraints and erasure coding. Signed-off-by: Thibaut Blanchard <thibaut.blanchard@gmail.com>	2022-03-30 04:40:30 +00:00
Niels de Vos	885295fcc9	nfs: store the NFS-cluster name in the journal Signed-off-by: Niels de Vos <ndevos@redhat.com>	2022-03-28 11:23:17 +00:00
Niels de Vos	3b4d193ca8	journal: add StoreAttribute/FetchAttribute Signed-off-by: Niels de Vos <ndevos@redhat.com>	2022-03-28 11:23:17 +00:00
Niels de Vos	010fd816dd	nfs: store the calling Context in NFSVolume NFSVolume instances are short lived, they only exist for a certain gRPC procedure. It is easier to store the calling Context in the NFSVolume struct, than to pass it to some of the functions that require it. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2022-03-28 11:23:17 +00:00
Niels de Vos	6d83df9cc9	nfs: add basic provisioner with create/delete procedures These NFS Controller and Identity servers are the base for the new provisioner. The functionality is currently extremely limited, follow-up PRs will implement various CSI procedures. CreateVolume is implemented with the bare minimum. This makes it possible to create a volume, and mount it with the kubernetes-csi/csi-driver-nfs NodePlugin. DeleteVolume unexports the volume from the Ceph managed NFS-Ganesha service. In case the Ceph cluster provides multiple NFS-Ganesha deployments, things might not work as expected. This is going to be addressed in follow-up improvements. Lots of TODO comments need to be resolved before this can be declared "production ready". Unit- and e2e-tests are missing as well. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2022-03-28 11:23:17 +00:00
Robert Vasek	f6ae612003	util: added reference tracker RT, reference tracker, is a key-based implementation of a reference counter. Unlike an integer-based counter, RT counts references by tracking unique keys. This allows accounting in situations where idempotency must be preserved. It guarantees there will be no duplicate increments or decrements of the counter. Signed-off-by: Robert Vasek <robert.vasek@cern.ch>	2022-03-27 19:24:26 +00:00
Rakshith R	40de75e0db	rbd: modify oidc token file path according to FHS 3.0 OIDC token file path has been modified from `/var/run/secrets/token` to `/run/secrets/tokens`. This has been done to ensure compliance with FHS 3.0. refer: https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch05s13.html Signed-off-by: Rakshith R <rar@redhat.com>	2022-03-23 13:29:35 +00:00
Madhu Rajanna	8c5e414d53	rbd: do not read pvc namespace from volume attributes Below are the 3 different cases where we need the PVC namespace for encryption * CreateVolume:- Read the namespace from the createVolume parameters and store it in the omap * NodeStage:- Read the namespace from the omap not from the volumeContext * Regenerate:- Read the pvc namespace from the claimRef not from the volumeAttributes. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2022-03-21 08:54:43 +00:00
Madhu Rajanna	77011fbc61	cephfs: remove kubernetes csi prefixed parameters remove kubernetes csi prefixed parameters from the volumeContext as we do not want to store them in the PV VolumeAttributes. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2022-03-21 08:54:43 +00:00
Madhu Rajanna	a7315a04c1	rbd: remove kubernetes csi prefixed parameters remove kubernetes csi prefixed parameters from the volumeContext as we do not want to store them in the PV VolumeAttributes. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2022-03-21 08:54:43 +00:00
Madhu Rajanna	366c2ace31	util: add helper to get pvcnamespace from input added helper function to return the pvc namespace name from the input parameters. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2022-03-21 08:54:43 +00:00
Madhu Rajanna	772fe8d6c8	util: add helper function to strip kube parameters added helper function to strip the kubernetes specific parameters from the volumeContext as volumeContext is stored in the PV volumeAttributes Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2022-03-21 08:54:43 +00:00
Rakshith R	a56f9a0c05	rbd: flatten datasource image before creating volume This commit ensures that parent image is flattened before creating volume. - If the data source is a PVC, the underlying image's parent is flattened(which would be a temp clone or snapshot). hard & soft limit is reduced by 2 to account for depth that will be added by temp & final clone. - If the data source is a Snapshot, the underlying image is itself flattened. hard & soft limit is reduced by 1 to account for depth that will be added by the clone which will be restored from the snapshot. Flattening step for resulting PVC image restored from snapshot is removed. Flattening step for temp clone & final image is removed when pvc clone is being created. Fixes: #2190 Signed-off-by: Rakshith R <rar@redhat.com>	2022-03-18 10:27:27 +00:00
Madhu Rajanna	d357bebbc2	cephfs: disallow creating small volumes from snapshot/volume As per the CSI standard the size is an optional parameter. As we allow cloning to a bigger size today, we need to block cloning to a smaller size, as it can have side effects like data corruption. Note:- Even though this check is present in the kubernetes sidecar, CSI is CO independent, so the check is added here as well. fixes: #2718 Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2022-03-17 05:07:26 +00:00
Humble Chirammal	525ff5d97f	rbd: remove unimplemented responses for node operations These RPCs (nodestage, unstage, volumestats) are implemented RPCs for our drivers atm. This commit removes the `unimplemented` responses from the common/default server initialization routines. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2022-03-16 15:27:48 +00:00
Humble Chirammal	66e7f3525f	cleanup: remove unimplemented controller expand,snapshot RPCs These RPCs (controller expand, create and delete snapshots) are no longer unimplemented and we do not have to declare them with `unimplemented` states. This commit removes the same. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2022-03-16 15:27:48 +00:00
Rakshith R	4f0bb2315b	rbd: add `aws-sts-metadata` encryption type With Amazon STS, when the kubernetes cluster is configured with an OIDC identity provider, credentials to access Amazon KMS can be fetched using the oidc-token (serviceaccount token). Each tenant/namespace needs to create a secret with aws region, role and CMK ARN. Ceph-CSI will assume the given role with the oidc token and access aws KMS, with the given CMK to encrypt/decrypt the DEK which will be stored in the image metadata. Refer: https://docs.aws.amazon.com/STS/latest/APIReference/welcome.html Resolves: #2879 Signed-off-by: Rakshith R <rar@redhat.com>	2022-03-16 07:29:56 +00:00
Prasanna Kumar Kalever	3eb0fa5e21	rbd: fix parsing mapOptions Currently, we support mapOption: "krbd:v1,v2,v3;nbd:v1,v2,v3" - By omitting `krbd:` or `nbd:`, the option(s) apply to rbdDefaultMounter which is krbd. - A user can _override_ the options for a mounter by specifying `krbd:` or `nbd:`. mapOption: "v1,v2,v3;nbd:v1,v2,v3" is effectively the same as the 1st example. - Sections are split by `;`. - If users want to specify common options for both `krbd` and `nbd`, they should mention them twice. But if the krbd or nbd specific options contain `:` within them, the parsing currently fails. E0301 10:19:13.615111 7348 utils.go:200] ID: 63 Req-ID: 0001-0009-rook-ceph-0000000000000001-fd37c41b-9948-11ec-ad32-0242ac110004 GRPC error: badly formatted map/unmap options: "krbd:read_from_replica=localize,crush_location=zone:zone1;" This patch fixes the above case where the options themselves contain the `:` delimiter, ex: "krbd:v1,v2,v3=v31:v32;nbd:v1,v2,v3" Please note, if you are using such options which contain the `:` delimiter, then it is mandatory to specify the mounter-type. Fixes: #2910 Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2022-03-14 15:21:25 +00:00
Madhu Rajanna	78ec859dc6	cleanup: remove unwanted print Removing unwanted print from the code Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2022-03-11 05:40:32 +00:00
Robert Vasek	80dda7cc30	cephfs: detect corrupt ceph-fuse mounts and try to remount Mounts managed by ceph-fuse may get corrupted by e.g. the ceph-fuse process exiting abruptly, or its parent container being terminated, taking down its child processes with it. This commit adds checks to NodeStageVolume and NodePublishVolume procedures to detect whether a mountpoint in staging_target_path and/or target_path is corrupted, and remount is performed if corruption is detected. Signed-off-by: Robert Vasek <robert.vasek@cern.ch>	2022-03-10 06:05:52 +00:00
Robert Vasek	aa6297e164	cleanup: refactor helper functions in nodeserver.go Refactored a couple of helper functions for easier reuse. * Code for building store.VolumeOptions is factored out into a separate function. * Changed args of getCredentailsForVolume() and NodeServer.mount() so that instead of passing in whole csi.NodeStageVolumeRequest, only necessary properties are passed explicitly. This is to allow these functions to be called outside of NodeStageVolume() where NodeStageVolumeRequest is not available. Signed-off-by: Robert Vasek <robert.vasek@cern.ch>	2022-03-10 06:05:52 +00:00
Rakshith R	3a64ee48c3	rbd: return unimplemented error for block-mode reclaimspace req blkdiscard cmd discards all data on the block device which is not desired. Hence, return unimplemented code if the volume access mode is block. Signed-off-by: Rakshith R <rar@redhat.com>	2022-03-03 19:00:49 +00:00
Niels de Vos	1f012004a6	util: configure tenants vaultAuthNamespace if not set When a tenant provides a configuration that includes the `vaultNamespace` option, the `vaultAuthNamespace` option is still taken from the global configuration. This is not wanted in all cases, as the `vaultAuthNamespace` option defaults to the `vaultNamespace` option which the tenant may want to override as well. The following behaviour is now better defined: 1. no `vaultAuthNamespace` in the global configuration: A tenant can override the `vaultNamespace` option and that will also set the `vaultAuthNamespace` option to the same value. 2. `vaultAuthNamespace` and `vaultNamespace` in the global configuration: When both options are set to different values in the global configuration, the tenant `vaultNamespace` option will not override the global `vaultAuthNamespace` option. The tenant can configure `vaultAuthNamespace` with a different value if required. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2022-03-02 08:36:33 +00:00
Madhu Rajanna	d5c98f81a2	rbd: make image features as optional parameter Makes the rbd image features in the storageclass optional so that the default image features of librbd can be used, while keeping the option for the user to specify the image features in the storageclass. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2022-02-28 13:10:03 +00:00
Madhu Rajanna	fb3835691f	rbd: add support for deep-flatten image feature as deep-flatten is long supported in ceph and its enabled by default in the librbd, providing an option to enable it in cephcsi for the rbd images we are creating. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2022-02-28 13:10:03 +00:00
Madhu Rajanna	e9802c4940	cephfs: refactor cephfs core functions This commit refactors the cephfs core functions with interfaces. This helps in better code structuring and writing the unit test cases. update #852 Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2022-02-22 20:39:23 +00:00
Madhu Rajanna	46378f3bfc	rbd: log stderror when running modprobe logging the error is not user-friendly and it contains system error message. Log the stderr which is user-friendly error message for identifying the problem. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2022-02-14 15:03:31 +00:00
Sébastien BERNARD	ee8fb3f05f	rbd: Fix dataPool in createVolumeResponse Return the dataPool used to create the image instead of the default one provided by the createVolumeRequest. In case of topologyConstrainedDataPools, they may differ. Don't add datapool if it's not present Signed-off-by: Sébastien Bernard <sebastien.bernard@sfr.com>	2022-02-10 11:44:22 +00:00
Humble Chirammal	8f6a7da538	cephfs: dont set explicit permissions on the volume At present we are node staging with worldwide permissions which is not correct. We should allow the CO to take care of it and make the decision. This commit also remove `fuseMountOptions` and `KernelMountOptions` as they are no longer needed Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2022-02-09 17:30:29 +00:00
Madhu Rajanna	2943555904	cephfs: fix omap deletion in DeleteSnapshot the omap is stored with the requested snapshot name not with the subvolume snapshotname. This fix uses the correct snapshot request name to cleanup the omap once the subvolume snapshot is deleted. fixes: #2832 Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2022-02-08 20:37:53 +00:00
Humble Chirammal	ad6a3d7575	rbd: remove kp-metadata register functions of HPCS/Key Protect This commit removes `kp-metadata` registration from existing HPCS or Key Protect code as per the plan. Fix #2816 Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2022-02-08 18:27:03 +00:00
Humble Chirammal	1c3baa0722	rbd: add AAD(additionalAuthData) while unwrapping the DEK As we are using optional additional auth data while wrapping the DEK, we have to send the same additionally while unwrapping. Error: ``` failed to unwrap the DEK: kp.Error: ..(INVALID_FIELD_ERR)', reasons='[INVALID_FIELD_ERR: The field `ciphertext` must be: the original base64 encoded ciphertext from the wrap operation ``` Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2022-02-08 03:06:30 +00:00
Niels de Vos	f6894909d7	util: use `vaultNamespace` if `vaultAuthNamespace` is not set When a tenant configures `vaultNamespace` in their own ConfigMap, it is not applied to the Vault configuration, unless `vaultAuthNamespace` is set as well. This is unexpected, as the `vaultAuthNamespace` usually is something configured globally, and not per tenant. The `vaultAuthNamespace` is an advanced option, that is often not needed to be configured. Only when tenants have to configure their own `vaultNamespace`, it is possible that they need to use a different `vaultAuthNamespace`. The default for the `vaultAuthNamespace` is now the `vaultNamespace` value from the global configuration. Tenants can still set it to something else in their own ConfigMap if needed. Note that Hashicorp Vault Namespaces are only functional in the Enterprise version of the product. Therefore this cannot be tested in the Ceph-CSI e2e with the Open Source version of Vault. Fixes: https://bugzilla.redhat.com/2050056 Reported-by: Rachael George <rgeorge@redhat.com> Signed-off-by: Niels de Vos <ndevos@redhat.com>	2022-02-07 08:20:48 +00:00
Rakshith R	3203673d17	cleanup: remove ceph.conf WA options which are already fixed This commit removes ceph.conf WA options: ``` # Workaround for http://tracker.ceph.com/issues/23446 fuse_set_user_groups = false # ceph-fuse which uses libfuse2 by default has write buffer size of 2KiB # adding 'fuse_big_writes = true' option by default to override this limit # see https://github.com/ceph/ceph-csi/issues/1928 fuse_big_writes = true ``` Since they are already fixed. Refer: https://tracker.ceph.com/issues/44885 Refer: https://tracker.ceph.com/issues/23446 Closes: #2825 Signed-off-by: Rakshith R <rar@redhat.com>	2022-02-04 15:42:32 +00:00
Madhu Rajanna	28fef9b379	cleanup: remove thick provisioning code This commit removes the thick provisioning code as thick provisioning is deprecated in cephcsi 3.5.0. fixes: #2795 Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2022-01-28 11:17:15 +00:00
Humble Chirammal	4ee4fdfebd	rbd: unexport SecretsKMS from KMS implementation This commit unexports SecretsKMS from the KMS implementation. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2022-01-28 06:55:12 +00:00
Humble Chirammal	4058246637	rbd: unexport vaultTokenSA struct from KMS implementation This commit unexports the vaultTokenSA struct from the Vault KMS implementation. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2022-01-28 06:55:12 +00:00
Humble Chirammal	b75c562217	rbd: Unexport VaultTenantSA struct from KMS implementation This commit unexports the VaultTenantSA struct from the Vault KMS implementation. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2022-01-28 06:55:12 +00:00
Humble Chirammal	c8a3b9352e	rbd: Unexport SecretsMetadataKMS struct This commit unexports the SecretsMetadataKMS struct from the KMS implementation. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2022-01-28 06:55:12 +00:00
Humble Chirammal	3f18d6e4b4	rbd: Unexport IntegratedDEK struct from kms This commit unexports the IntegratedDEK struct from the KMS implementation. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2022-01-28 06:55:12 +00:00
Humble Chirammal	6141aabcd2	rbd: unexport KeyProtect kms struct At present the KMS structs are exported and ideally we should be able to work without exporting the same. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2022-01-28 06:55:12 +00:00
Humble Chirammal	a86121f756	rbd: unexport aws kms structs At present the KMS structs are exported and ideally we should be able to work without exporting the same. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2022-01-28 06:55:12 +00:00
Madhu Rajanna	992d257530	cephfs: fix error logging in filesystem.go fix error message logging in filesystem.go Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2022-01-27 14:31:12 +00:00
Madhu Rajanna	14c008c419	cleanup: use interface in filesystem.go Currently, we are using methods, and all of the methods make a network call to fetch details from the Ceph cluster. It is difficult to write test cases for these functions; if we move to interfaces, we can use mocks to write unit tests for the caller functions. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2022-01-27 14:31:12 +00:00
Humble Chirammal	f822600689	rbd: change the keyprotect metadata name to `ibmkeyprotect` To be consistent with other components and also to explicitly state that it belongs to the `ibm keyprotect` service, introducing this change. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2022-01-26 02:28:05 +00:00
Humble Chirammal	7ff048bf1e	e2e: add podsecuritycontext fsgroup for normal user validation Considering the pod runs as a normal user, the fsgroup is also set to the same. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2022-01-25 16:25:11 +00:00
Humble Chirammal	bf4ba0ec84	rbd: dont attempt explicit permission mod change from the RBD driver currently we are overriding the permission to `0o777` at time of node stage which is not the correct action. That said, this permission change causes an extra permission correction at time of nodestaging by the CO while the FSGROUP change policy has been set to `OnRootMismatch`. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2022-01-25 16:25:11 +00:00
Madhu Rajanna	8096dd47e4	cleanup: remove unwanted type declaration removed unwanted int64 type declaration to fix style check. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2022-01-24 05:25:11 +00:00
Madhu Rajanna	9c841c83d4	cleanup: rename errorPair to pairError to fix the errname check renaming the struct. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2022-01-24 05:25:11 +00:00
Madhu Rajanna	4938fc2ff4	cleanup: use 0o600 instead of 0600 As we are using 0o600 in multiple files, use the same in all files, which also fixes a go lint issue. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2022-01-24 05:25:11 +00:00
Madhu Rajanna	c67bacdb11	cleanup: use %s instead of %w for t.Errorf As t.Errorf does not support error-wrapping directive using %s. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2022-01-24 05:25:11 +00:00
Madhu Rajanna	813f6c30cc	cleanup: use WriteString instead of Write use WriteString instead of Write for the temp files. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2022-01-24 05:25:11 +00:00
Madhu Rajanna	aba6979d29	cleanup: use os.ReadFile to read file as ioutil.ReadFile is deprecated and suggestion is to use os.ReadFile as per https://pkg.go.dev/io/ioutil updating the same. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2022-01-24 05:25:11 +00:00
Madhu Rajanna	562dff0d19	cleanup: use os.WriteFile to write files as ioutil.WriteFile is deprecated and suggestion is to use os.WriteFile as per https://pkg.go.dev/io/ioutil updating the same. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2022-01-24 05:25:11 +00:00
Madhu Rajanna	ba5809e191	rbd: make rbdImage the receiver for internal methods Currently most of the internal methods have rbdVolume as the receiver. As these methods are completely internal and require only the fields of rbdImage, use rbdImage as the receiver instead of rbdVolume. updates #2742 Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2022-01-17 12:15:21 +00:00
Madhu Rajanna	2daf2f9f0c	cephfs: log error message if clone fails During CreateVolume from snapshot/volume, it is difficult to identify whether the clone has failed and a new clone is created. In case of clone failure, log the error message for better debugging. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2022-01-17 09:43:09 +00:00
Madhu Rajanna	d293d91c07	rbd: disallow creating small size volume from volume As per the CSI standard, the size is an optional parameter. As we allow cloning to a bigger size today, we need to block cloning to a smaller size, as it can have side effects like data corruption. Note:- Even though this check is present in the kubernetes sidecar, CSI is CO independent, so the check is added here as well. updates: #2718 Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2022-01-17 07:00:00 +00:00
Madhu Rajanna	ceafca6ddf	rbd: disallow creating small size volume from snapshot As per the CSI standard, the size is an optional parameter. As we allow restoring to a bigger size today, we need to block restoring to a smaller size, as it can have side effects like data corruption. Note:- Even though this check is present in the kubernetes sidecar, CSI is CO independent, so the check is added here as well. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2022-01-17 07:00:00 +00:00
Madhu Rajanna	ef14ea7723	cephfs: resize cloned, restored volume if required Currently, as a workaround, we are calling the resize volume on the cloned, restore volumes to adjust the cloned, restored volumes. With this fix, we are calling the resize volume only if there is a size mismatch with requested and the volume from which the new volume needs to be created. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2022-01-12 10:44:11 +00:00
Humble Chirammal	4a69378698	rbd: introduce a helper function to detect multi writer,block & rwofile SINGLE_NODE_WRITER capability ambiguity has been fixed in csi spec v1.5 which allows the SP drivers to declare more granular WRITE capability in form of SINGLE_NODE_SINGLE_WRITER or SINGLE_NODE_MULTI_WRITER. These are not really new capabilities rather capabilities introduced to get the desired functionality from CO side based on the capabilities SP driver support for various CSI operations, these new capabilities also help to address new access mode RWOP (readwriteoncepod). This commit adds a helper function which identifies whether the request is of multiwriter mode and also validates whether it is filesystem mode or block mode. Based on the inspection, it rejects multi write requests for filesystem mode and only allows multi write requests against block mode. This commit also adds unit tests for isMultiWriterBlock function which validates various accesstypes and accessmodes. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2022-01-11 19:40:22 +00:00
Humble Chirammal	68350e8815	cephfs: add SINGLE_NODE_{SINGLE/MULTI}_WRITER capability SINGLE_NODE_WRITER capability ambiguity has been fixed in csi spec v1.5 which allows the SP drivers to declare more granular WRITE capability. These are not really new capabilities rather capabilities introduced to get the desired functionality from CO side based on the capabilities SP driver support for various CSI operations. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2022-01-11 19:40:22 +00:00
Humble Chirammal	3730a462f4	rbd: add SINGLE_NODE{SINGLE_MULTI}_WRITER capabilities Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2022-01-11 19:40:22 +00:00
Humble Chirammal	bc354b6fb5	rbd: add BaseURL and tokenURL configuration This commit adds optional BaseURL and TokenURL configuration to key protect/hpcs configuration and client connections, if not provided default values are used. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2022-01-11 21:12:56 +05:30
Yug Gupta	9d34809425	rbd: add NetworkFence operation Signed-off-by: Yug Gupta <yuggupta27@gmail.com>	2022-01-07 14:48:12 +00:00
Yug Gupta	fa5866deec	ci: add unit test for NetworkFence grpc calls Signed-off-by: Yug Gupta <yuggupta27@gmail.com>	2022-01-07 14:48:12 +00:00
Yug Gupta	29782bf377	rbd: implement UnfenceClusterNetwork Implement the UnfenceClusterNetwork grpc call, which unblocks access to a CIDR block by removing it from the network fence. Signed-off-by: Yug Gupta <yuggupta27@gmail.com>	2022-01-07 14:48:12 +00:00
Yug Gupta	ebd8a762f0	rbd: implement FenceClusterNetwork Implement the FenceClusterNetwork grpc call, which blocks access to a CIDR block by creating a network fence. Signed-off-by: Yug Gupta <yuggupta27@gmail.com>	2022-01-07 14:48:12 +00:00
Yug Gupta	ab15053fef	ci: add unit test for networkfencing util Signed-off-by: Yug Gupta <yuggupta27@gmail.com>	2022-01-07 14:48:12 +00:00
Yug Gupta	7d5879ad81	rbd: add network fencing utils Convert the CIDR block into a range of IPs, and then add network fencing via "ceph osd blocklist" for each IP in that range. Signed-off-by: Yug Gupta <yuggupta27@gmail.com>	2022-01-07 14:48:12 +00:00
Rakshith R	384ab42ae7	cleanup: use %q instead of %s for logging Signed-off-by: Rakshith R <rar@redhat.com>	2022-01-06 12:28:18 +00:00
Rakshith R	c19264e996	rbd: add function (cc *ClusterConnection) GetTaskAdmin() This function returns new go-ceph TaskAdmin to add tasks on rbd volumes. Signed-off-by: Rakshith R <rar@redhat.com>	2022-01-06 12:28:18 +00:00
Rakshith R	420aa9ec57	rbd: remove redundant rbdVol.getTrashPath() function This commit removes rbdVol.getTrashPath() function since it is no longer being used due to introduction of go-ceph rbd admin task api for deletion. Signed-off-by: Rakshith R <rar@redhat.com>	2022-01-06 12:28:18 +00:00
Rakshith R	9adb25691c	rbd: remove redundant util.Credentials arg from flattenRbdImage() With introduction of go-ceph rbd admin task api, credentials are no longer required to be passed as cli cmd is not invoked. Signed-off-by: Rakshith R <rar@redhat.com>	2022-01-06 12:28:18 +00:00
Rakshith R	7b0f051fd4	rbd: remove redundant rbdVolume.connect() in flattenRbdImage() This commit removes `rv.Connect(cr)` since the rbdVolume should have an active connection in this stage of the function call. `rv.getCloneDepth(ctx)` will work after a connect to the cluster. Signed-off-by: Rakshith R <rar@redhat.com>	2022-01-06 12:28:18 +00:00
Rakshith R	ad3c334a3a	rbd: use go-ceph rbd admin task api instead of cli This commit adds support to go-ceph rbd task api `trash remove` and `flatten` instead of using cli cmds. Fixes: #2186 Signed-off-by: Rakshith R <rar@redhat.com>	2022-01-06 12:28:18 +00:00
Humble Chirammal	5aa1e4d225	rbd: change the configmap of HPCS/KP key names to reflect the IBM string considering IBM has different crypto services (ex: SKLM) in place, its good to keep the configmap key names with below format `IBM_KP_...` instead of `KP_..` so that in future, if we add more crypto services from IBM we can keep similar schema specific to that specific service from IBM. Ex: `IBM_SKLM_...` Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2022-01-05 06:08:19 +00:00
Niels de Vos	8eaf1abbdc	util: add common logging to csi-addons gRPC Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-12-23 17:43:23 +00:00
Niels de Vos	bb5d3b7257	cleanup: refactor gRPC middleware into NewMiddlewareServerOption Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-12-23 17:43:23 +00:00
Niels de Vos	e574c807f0	rbd: expose CSI-Addons ReclaimSpace operations Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-12-23 17:43:23 +00:00
Niels de Vos	c274649b80	rbd: implement NodeReclaimSpace By calling fstrim/blkdiscard on the volume, space consumption should get reduced. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-12-23 17:43:23 +00:00
Niels de Vos	7d36c5a9d1	rbd: implement CSI-Addons ControllerReclaimSpace The CSI Controller (provisioner) can call `rbd sparsify` to reduce the space consumption of the volume. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-12-23 17:43:23 +00:00
Madhu Rajanna	e4b7943bac	rbd: add workaround for force promote Use ExecCommandWithTimeout with a timeout of 1 minute for the promote operation. If the command does not return an error/response within 1 minute, the process will be killed and an error will be returned to the user. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-23 13:36:21 +00:00
Madhu Rajanna	95e9595c1f	util: add helper ExecCommandWithTimeout function Added the ExecCommandWithTimeout helper function to execute commands with a timeout option. If the command does not return any response within the timeout, the process will be terminated and an error will be returned to the user. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-23 13:36:21 +00:00
Madhu Rajanna	9499e73b93	rbd: correct logging in createBackingImage after creating the rbd image log the image details corresponding for the request along with the request name. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-23 03:47:00 +00:00
Madhu Rajanna	549bfedc94	rbd: remove extra logging from createBackingImage we are already logging the rbd image details and the snapshot details after creating the clone. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-23 03:47:00 +00:00
Madhu Rajanna	8c9105f09e	rbd: remove extra getImageInfo API call as getImageInfo is already called inside cloneRbdImageFromSnapshot function right after creating the clone. remove the extra API call to get the details again. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-23 03:47:00 +00:00
Madhu Rajanna	ff91b7edbd	rbd: get image details after creating clone after creating the clone get the current image details like size, creationTime, imageFeatures etc from the ceph cluster. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-23 03:47:00 +00:00
Madhu Rajanna	edcb2b529b	rbd: move core fields to rbdImage struct moved ParentName, ParentPool and ImageFeatureSet fields to the rbdImage struct as these are the first citizens on the rbdImage. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-23 03:47:00 +00:00
Madhu Rajanna	c6b288779a	rbd: correct logging for clone log the rbdVolume and the rbdSnapshot after creating the clone from snapshot. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-23 03:47:00 +00:00
Madhu Rajanna	3169c8e23a	rbd: expand filesystem during NodeStageVolume If a volume with a bigger size is created from a snapshot or from another volume, we need to expand the filesystem in the csidriver as well, because a nodeExpand request is not triggered for this case. During NodeStageVolume we can expand the filesystem after checking whether the filesystem needs expansion. If it is an encrypted device, check the device size of the rbd device and the LUKS device; if required, the device will be expanded before expanding the filesystem. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-23 03:47:00 +00:00
Madhu Rajanna	69ae19e0cb	rbd: resize the volume created from snapshot If the requested volume size is greater than the snapshot size, resize the cloned volume after creating a clone from a snapshot. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-23 03:47:00 +00:00
Madhu Rajanna	a28a4a4285	rbd: resize the volume created from volume If the requested volume size is greater than the parent volume size, resize the cloned volume after creating a final clone from a parent volume. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-23 03:47:00 +00:00
Madhu Rajanna	f7f662678a	rbd: consider ErrImageNotFound during DeleteSnapshot added a check to consider ErrImageNotFound error during DeleteSnapshot operation, if the error is ErrImageNotFound we need to ensure that image is removed from the trash and also the rados OMAP data is removed. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-23 03:47:00 +00:00
Madhu Rajanna	da60d221df	rbd: update size for rbdSnapshot struct we need actual size of the rbdVolume created for the snapshot, as we are not storing the size of the snapshot in OMAP we need to fetch the size from ceph cluster and update the same on rbdSnapshot struct. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-23 03:47:00 +00:00
Madhu Rajanna	6a82baf5d3	rbd: remove SizeBytes from rbdSnapshot struct as we are moving the VolSize to rbdImage struct we should reuse the same instead of maintaining one more field in rbdSnapshot struct. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-23 03:47:00 +00:00
Madhu Rajanna	b1a0bb4714	rbd: move VolSize to rbdImage struct move the Volsize to the rbdImage struct as size is more applicable for rbdImage as rbdImage is used for both rbdVolume and rbdSnapshot. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-23 03:47:00 +00:00
Madhu Rajanna	a0829e9e93	rbd: remove json tag from rbdVolume struct as we are no longer supporting the v1.x version of cephcsi. removing the json tag used to store rbd volume details in configmap. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-23 03:47:00 +00:00
Madhu Rajanna	124281519f	rbd: add RequestedVolSize to rbdVolume struct when doing the internal operation to get the latest details the rbd image size is also getting updated and this will update the volume size also without actual requested size we cannot do the resize operation for bigger clones. This commit adds a new field called RequestedVolSize to rbdVolume struct to hold the user requested size. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-23 03:47:00 +00:00
Madhu Rajanna	22365ab77f	cleanup: add cleanup helper for incorrect thick volume added a new helper function called cleanupThickClone to cleanup the snapshot and clone if the thick provisioning is not fully completed. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-23 03:47:00 +00:00
Madhu Rajanna	ca29328554	csi: remove size check when creating volume Remove the bigger size validation when creating a volume from a snapshot or when creating a clone from a volume, as we resize the volume after cloning. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-23 03:47:00 +00:00
Humble Chirammal	b9a8d37c3d	rbd: enable expand operation for intree volumes This commit enable the resize operation[1] for in-tree volumes. new helper has been introduced here to aid the enablement or to make it clean with existing code base. [1] https://github.com/ceph/ceph-csi/blob/devel/docs/design/proposals/intree-migrate.md?plain=1#L66 Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-12-22 19:33:05 +00:00
Madhu Rajanna	810e285c50	rbd: reset dummy image id The dummy image rbdVolume struct is derived from the actual rbdVolume of the volumeID sent in the EnableVolumeReplication request, so the dummy rbdVolume struct contains the image id of the actual volume. Because of that, when we are repairing the dummy image, the image is sent to trash but not deleted due to the wrong image ID. Resetting the image id makes sure the image id is fetched from the ceph cluster and the same image id is used for the manager operation. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-21 17:39:07 +00:00
Humble Chirammal	b904c446d6	rbd: add kms unit test for key protect server Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-12-21 17:09:50 +00:00
Humble Chirammal	9200bc7a00	rbd: Implement Key Protect KMS integration for Ceph CSI This commit adds the support for HPCS/Key Protect IBM KMS service to Ceph CSI service. EncryptDEK() and DecryptDEK() of RBD volumes are done with the help of key protect KMS server by wrapping and unwrapping the DEK and by using the DEKStoreMetadata. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-12-21 17:09:50 +00:00
Madhu Rajanna	12e8e46bcf	revert: remove explicit size setting of cloned volume The ceph changes are done on both the server and the client side, so this change is not enough to remove setting the size of cloned volumes. This caused regressions like #2719 #2720 #2721 #2722. This reverts commit `3565a342d5`. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-21 14:15:46 +00:00
Humble Chirammal	88911eb4e9	rbd: add migration secret support to controllerserver functions This commit adds the migration secret request validation to expand, create controller functions. Ref # https://github.com/ceph/ceph-csi/issues/2509 Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-12-20 07:34:43 +00:00
Niels de Vos	30333378ef	cleanup: add IsBlockMultiNode() helper IsBlockMultiNode() is a new helper that takes a slice of VolumeCapability objects and checks if it includes multi-node access and/or block-mode support. This can then easily be used in other services that need checking for these particular capabilities, and preventing multi-node block-mode access. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-12-17 07:31:55 +00:00
Madhu Rajanna	50d6ea825c	rbd: remove retrieving volumeHandle from PV annotation we have added clusterID mapping to identify the volumes in case of a failover in Disaster recovery in #1946. with #2314 we are moving to a configuration in configmap for clusterID and poolID mapping. and with #2314 we have all the required information to identify the image mappings. This commit removes the workaround implementation done in #1946. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-17 03:38:29 +00:00
Niels de Vos	203920d8f4	rbd: move driver component into the rbd/driver package The rbd package contains several functions that can be used by CSI-Addons Service implementations. Unfortunately it is not possible to do this, as the rbd-driver needs to import the csi-addons/rbd package to provide the CSI-Addons server. This causes a circular import when services use the rbd package: - rbd/driver.go import csi-addons/rbd - csi-addons/rbd import rbd (including the driver) By moving rbd/driver.go into its own package, the circular import can be prevented. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-12-10 07:35:26 +00:00
Niels de Vos	44d69502bc	rbd: export HexStringToInteger() HexStringToInteger() used to return a uint64, but everywhere else uint is used. Having HexStringToInteger() return a uint as well makes it a little easier to use when setting it with SetGlobalInt(). Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-12-10 07:35:26 +00:00
Niels de Vos	8b531f337e	rbd: add functions for initializing global variables When the rbd-driver starts, it initializes some global (yuck!) variables in the rbd package. Because the rbd-driver is moved out into its own package, these variables can not easily be set anymore. Introduce SetGlobalInt(), SetGlobalBool() and InitJournals() so that the rbd-driver can configure the rbd package. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-12-10 07:35:26 +00:00
Niels de Vos	3eeac3d36c	rbd: export RunVolumeHealer() so that rbd/driver can start it The rbd-driver calls rbd.runVolumeHealer() which is not available outside the rbd package. By moving the rbd-driver into its own package, RunVolumeHealer() needs to be exported. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-12-10 07:35:26 +00:00
Niels de Vos	5baf9811f9	rbd: export NodeServer.mounter outside of the rbd package NodeServer.mounter is internal to the NodeServer type, but it needs to be initialized by the rbd-driver. The rbd-driver is moved to its own package, so .Mounter needs to be available from there in order to set it. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-12-10 07:35:26 +00:00
Niels de Vos	8d09134125	rbd: export GenVolFromVolID() for consumption by csi-addons genVolFromVolID() is used by the CSI Controller service to create an rbdVolume object from a CSI volume_id. This function is useful for CSI-Addons Services as well, so rename it to GenVolFromVolID(). Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-12-10 07:35:26 +00:00
Niels de Vos	e76bffe353	cleanup: import k8s.io/mount-utils instead of k8s.io/utils/mount k8s.io/utils/mount has moved to k8s.io/mount-utils, and Ceph-CSI uses that already in most locations. Only internal/util/util.go still imports the old path. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-12-09 17:58:34 +00:00
Madhu Rajanna	8081ac8251	rbd: add new image features for dummy image The dummy image will be created with 1MiB size. During the snapshot transfer operation the 1MiB will be transferred even if the dummy image does not contain any data. Adding the new image features `fast-diff,layering,obj-map,exclusive-lock` on the dummy image will ensure that only the diff is transferred to the remote cluster. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-07 17:34:14 +00:00
Madhu Rajanna	9a4533e549	rbd: create 1MiB size dummy image We added a workaround for rbd scheduling by creating a dummy image in #2656. With that fix we are creating a dummy image of the size of the first actual rbd image which is sent in the EnableVolumeReplication request; if the actual rbd image size is 1TiB we are creating a dummy image of 1TiB, which is not good. Even though it is a thin provisioned rbd image, this is causing issues for the transfer of the snapshot during the mirroring operation. This commit recreates the rbd image with 1MiB size, which is the smallest supported size in rbd. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-07 17:34:14 +00:00
Konstantin Shalygin	7411773f73	rbd: added RBD features support for krbd Added support for `object-map, fast-diff` Signed-off-by: Konstantin Shalygin <k0ste@k0ste.ru>	2021-12-07 07:38:24 +00:00
Madhu Rajanna	64ce5e0949	rbd: check local image state during promote operation rbd mirroring CLI calls are async and do not wait for the operation to be completed. ex:- `rbd mirror image enable` will enable the mirroring on the image, but it doesn't ensure that the image is mirroring enabled and a healthy primary. The same goes for promoting the volume. This commit adds a check in PromoteVolume to make sure the image is in a healthy state, i.e. `up+stopped`. note:- not considering any intermediate states to make sure the image is completely healthy before responding success to the RPC call. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-01 20:19:05 +00:00
Prasanna Kumar Kalever	e7d8834149	rbd: enable journal based mirroring Journal-based RADOS block device mirroring ensures point-in-time consistent replicas of all changes to an image, including reads and writes, block device resizing, snapshots, clones, and flattening. Journaling-based mirroring records all modifications to an image in the order in which they occur. This ensures that a crash-consistent mirror of an image is available. When configured in journal mode, mirroring will utilize the RBD journaling image feature to replicate the image contents. If the RBD journaling image feature is not yet enabled on the image, it will be automatically enabled. Fixes: #2018 Co-authored-by: Madhu Rajanna <madhupr007@gmail.com> Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-12-01 14:12:30 +00:00
Niels de Vos	ab76459e87	rbd: implement CSI-Addons Identity Service Depending on the way Ceph-CSI is deployed, the capabilities will be configured for the GetCapabilities procedure. The other procedures are more straight-forward. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-12-01 06:31:09 +00:00
Niels de Vos	20727bd41a	cleanup: reduce complexity of rbd.Driver.Run() After adding the new CSI-Addons Server, golang-ci complains that driver.Run() is too complex. By moving the profiling checks and starting of the go-routines in their own function, golang-ci is happy again. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-11-30 11:48:40 +00:00
Niels de Vos	b3910f2b4a	rbd: enable CSI-Addons Server and Identity Service Add a new endpoint for the CSI-Addons Service and enable the Identity Service for the RBD plugin. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-11-30 11:48:40 +00:00
Niels de Vos	0f8bbaa217	rbd: add framework for CSI-Addons Identity Service Add a new CSI-Addons Server and empty Identity Service for the RBD plugin. The implementation of the Identity Service procedure calls will be done in other PRs. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-11-30 11:48:40 +00:00
Madhu Rajanna	f0b2ea6a6d	rbd: repair imageid after resync During resync operation the local image will get deleted and a new image is recreated by the rbd mirroring. The new image will have a new imageID. Once resync is completed update the imageID in the OMAP to get the image removed from the trash during DeleteVolume. Before resyncing ``` sh-4.4# rbd info replicapool/csi-vol-0c25bdd3-485f-11ec-bd30-0242ac110004 rbd image 'csi-vol-0c25bdd3-485f-11ec-bd30-0242ac110004': size 1 GiB in 256 objects order 22 (4 MiB objects) snapshot_count: 1 id: 1efcc6b7a769 block_name_prefix: rbd_data.1efcc6b7a769 format: 2 features: layering op_features: flags: create_timestamp: Thu Nov 18 11:02:40 2021 access_timestamp: Thu Nov 18 11:02:40 2021 modify_timestamp: Thu Nov 18 11:02:40 2021 mirroring state: enabled mirroring mode: snapshot mirroring global id: 9c4c236d-8a47-4779-b4f6-94e05da70dbd mirroring primary: true ``` ``` sh-4.4# rados listomapvals csi.volume.0c25bdd3-485f-11ec-bd30-0242ac110004 --pool=replicapool csi.imageid value (12 bytes) : 00000000 31 65 66 63 63 36 62 37 61 37 36 39 \|1efcc6b7a769\| 0000000c csi.imagename value (44 bytes) : 00000000 63 73 69 2d 76 6f 6c 2d 30 63 32 35 62 64 64 33 \|csi-vol-0c25bdd3\| 00000010 2d 34 38 35 66 2d 31 31 65 63 2d 62 64 33 30 2d \|-485f-11ec-bd30-\| 00000020 30 32 34 32 61 63 31 31 30 30 30 34 \|0242ac110004\| 0000002c csi.volname value (40 bytes) : 00000000 70 76 63 2d 32 36 38 39 33 66 30 38 2d 66 66 32 \|pvc-26893f08-ff2\| 00000010 62 2d 34 61 30 66 2d 61 35 63 33 2d 38 38 34 62 \|b-4a0f-a5c3-884b\| 00000020 37 32 30 66 66 62 32 63 \|720ffb2c\| 00000028 csi.volume.owner value (7 bytes) : 00000000 64 65 66 61 75 6c 74 \|default\| 00000007 ``` After Resyncing ``` sh-4.4# rbd info replicapool/csi-vol-0c25bdd3-485f-11ec-bd30-0242ac110004 rbd image 'csi-vol-0c25bdd3-485f-11ec-bd30-0242ac110004': size 1 GiB in 256 objects order 22 (4 MiB objects) snapshot_count: 1 id: 10b183a48a97 block_name_prefix: rbd_data.10b183a48a97 format: 2 
features: layering, non-primary op_features: flags: create_timestamp: Thu Nov 18 11:09:39 2021 access_timestamp: Thu Nov 18 11:09:39 2021 modify_timestamp: Thu Nov 18 11:09:39 2021 mirroring state: enabled mirroring mode: snapshot mirroring global id: 9c4c236d-8a47-4779-b4f6-94e05da70dbd mirroring primary: false sh-4.4# rados listomapvals csi.volume.0c25bdd3-485f-11ec-bd30-0242ac110004 --pool=replicapool csi.imageid value (12 bytes) : 00000000 31 30 62 31 38 33 61 34 38 61 39 37 \|10b183a48a97\| 0000000c csi.imagename value (44 bytes) : 00000000 63 73 69 2d 76 6f 6c 2d 30 63 32 35 62 64 64 33 \|csi-vol-0c25bdd3\| 00000010 2d 34 38 35 66 2d 31 31 65 63 2d 62 64 33 30 2d \|-485f-11ec-bd30-\| 00000020 30 32 34 32 61 63 31 31 30 30 30 34 \|0242ac110004\| 0000002c csi.volname value (40 bytes) : 00000000 70 76 63 2d 32 36 38 39 33 66 30 38 2d 66 66 32 \|pvc-26893f08-ff2\| 00000010 62 2d 34 61 30 66 2d 61 35 63 33 2d 38 38 34 62 \|b-4a0f-a5c3-884b\| 00000020 37 32 30 66 66 62 32 63 \|720ffb2c\| 00000028 csi.volume.owner value (7 bytes) : 00000000 64 65 66 61 75 6c 74 \|default\| 00000007 ``` Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-11-25 09:22:13 +00:00
Madhu Rajanna	027b68ab39	rbd: operate on dummy image after adding scheduling Currently we first operate on the dummy image to refresh the pool and then add the scheduling. The scheduling should be added first, and then the pool should be refreshed. This way, all the existing schedules will be considered by the scheduler. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-11-23 11:04:42 +00:00
Madhu Rajanna	211ca9b5a7	rbd: do deep copy for dummyVol struct With a shallow copy of rbdVol to dummyVol, the image name update of the dummyVol is reflected on the rbdVol, which we don't want. Do a deep copy to avoid this problem. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-11-23 11:04:42 +00:00
Prasanna Kumar Kalever	bdcf3273b5	rbd: provide a way to supply mounter specific mapOptions from sc Uses the below schema to supply mounter specific map/unmapOptions to the nodeplugin, based on the discussion at https://github.com/ceph/ceph-csi/pull/2636. This should be especially helpful with `tryOtherMounters` set to true, i.e. with the fallback mechanism turned ON. mapOption: "krbd:v1,v2,v3;nbd:v1,v2,v3" - By omitting `krbd:` or `nbd:`, the option(s) apply to rbdDefaultMounter, which is krbd. - A user can _override_ the options for a mounter by specifying `krbd:` or `nbd:`. mapOption: "v1,v2,v3;nbd:v1,v2,v3" is effectively the same as the 1st example. - Sections are split by `;`. - If users want to specify common options for both `krbd` and `nbd`, they should mention them twice. Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-11-23 08:54:37 +00:00
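The schema in the commit above can be illustrated with a minimal Go sketch. This is not the ceph-csi implementation: `parseMapOptions` and `defaultMounter` are hypothetical names, and the only rules applied are the ones the commit message states (sections split by `;`, an optional `krbd:` or `nbd:` prefix, unprefixed options going to the default mounter krbd, and a later section for the same mounter overriding an earlier one).

```go
package main

import (
	"fmt"
	"strings"
)

// defaultMounter follows the commit message: options without a prefix
// apply to the default mounter, which is krbd.
const defaultMounter = "krbd"

// parseMapOptions is a hypothetical helper (not ceph-csi's own code) that
// splits a value such as "krbd:v1,v2;nbd:v3" into one option string per
// mounter.
func parseMapOptions(raw string) map[string]string {
	opts := map[string]string{}
	for _, section := range strings.Split(raw, ";") {
		section = strings.TrimSpace(section)
		if section == "" {
			continue
		}
		mounter, values := defaultMounter, section
		// Only the two known prefixes are treated as mounter names, so an
		// option value that itself contains ':' is left untouched.
		for _, name := range []string{"krbd", "nbd"} {
			if strings.HasPrefix(section, name+":") {
				mounter = name
				values = strings.TrimPrefix(section, name+":")
				break
			}
		}
		// A later section for the same mounter overrides an earlier one.
		opts[mounter] = values
	}
	return opts
}

func main() {
	fmt.Println(parseMapOptions("v1,v2,v3;nbd:v1,v2,v3"))
}
```

Under these assumptions both example strings from the commit message yield the same per-mounter result, which matches the statement that they are effectively equivalent.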
Shyamsundar Ranganathan	d1c21eece9	rbd: Update sequence of operations on dummy mirror image The dummy mirror image needs to be disabled and then reenabled for mirroring, to ensure a newly promoted primary is now starting to schedule snapshots. Signed-off-by: Shyamsundar Ranganathan <srangana@redhat.com>	2021-11-19 09:38:59 +05:30
Madhu Rajanna	517ad8c644	rbd: use dummy image to workaround rbd scheduling bug Currently there is a bug in the rbd mirror scheduling module. After a failover and failback, the scheduling is not updated and the mirroring snapshots are not created periodically as per the scheduling interval. This PR works around the bug with the following operations: * Create a dummy (unique) image per cluster; this image should be easily identifiable. * During the Promote operation on any image, enable mirroring on the dummy image. When mirroring is enabled on the dummy image, the pool gets updated and the scheduling is reconfigured. * During the Demote operation on any image, disable mirroring on the dummy image. The disable is needed so that mirroring can be enabled again when a promote request arrives to make the image primary. * When DR is no longer needed, this image needs to be cleaned up manually for now, as we don't want to add a check for deleting the dummy image in the existing DeleteVolume code path, since it would impact the performance of the DeleteVolume workflow. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-11-19 09:38:59 +05:30
Madhu Rajanna	d05fc1e8e5	util: add helper to get the cluster ID added helper function to get the cluster ID. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-11-19 09:38:59 +05:30
Madhu Rajanna	e4e0f397a6	rbd: run schedule during promote operation Moved adding the scheduling to the promote operation, as scheduling needs to be added when the image is promoted; this is the correct way of adding the scheduling so that it takes effect. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-11-19 09:38:59 +05:30
Madhu Rajanna	7bbd2ea284	rbd: use small case of error message The error message should not start with a capital letter; changing the case as per the standard. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-11-18 10:44:12 +00:00
Madhu Rajanna	51998a5f4a	cleanup: log the image name and pool name Instead of logging the volumeID and the pool name, log the pool name and image name for better debugging. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-11-18 10:44:12 +00:00
Madhu Rajanna	0f0cda49a7	rbd: log stdError for cryptosetup command If we hit any error while running the cryptosetup commands, we log only the error message. With only the error message it is difficult to analyze the problem; logging the stdError will help us check what the problem is. updates: #2610 Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-11-18 02:17:15 +00:00
Niels de Vos	7e22180125	rbd: call undoStagingTransaction() when NodeStageVolume() fails On line 341 a `transaction` is created. This is passed to the deferred `undoStagingTransaction()` function when an error in the `NodeStageVolume` procedure is detected. So far, so good. However, on line 356 a new `transaction` is returned. This new `transaction` is not used for the defer call. By removing the empty `transaction` that is used in the defer call, and calling `undoStagingTransaction()` on an error of `stageTransaction()`, the code is a little simpler, and the cleanup of the transaction should be done correctly now. Updates: #2610 Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-11-17 23:58:00 +00:00
Prasanna Kumar Kalever	e6fa392df1	rbd: fix mapOptions passing with rbd-nbd mounter This was a regression introduced by: https://github.com/ceph/ceph-csi/pull/2556 Fixes: #2610 Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-11-16 10:12:46 +00:00
Prasanna Kumar Kalever	50e9dfa5c5	cleanup: fix log level This log line is seen frequently in the logs and its better to be at Warning loglevel rather than Error based on its severity E1109 08:30:45.612395 38328 util.go:247] kernel 4.19.202 does not support required features Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-11-10 10:54:29 +00:00
Prasanna Kumar Kalever	3686b6da8b	rbd: utilize cookie support from rbd for nbd Problem: On remap/attach of device (i.e. nodeplugin restart), there is no way for rbd-nbd to defend if the backend storage is matching with the initial backend storage. Say, if an initial map request for backend "pool1/image1" got mapped to /dev/nbd0 and the userspace process is terminated (on nodeplugin restart). A next remap/attach (nodeplugin start) request within reattach-timeout is allowed to use /dev/nbd0 for a different backend "pool1/image2" For example, an operation like below could be dangerous: $ sudo rbd-nbd map --try-netlink rbd-pool/ext4-image /dev/nbd0 $ sudo blkid /dev/nbd0 /dev/nbd0: UUID="bfc444b4-64b1-418f-8b36-6e0d170cfc04" TYPE="ext4" $ sudo pkill -15 rbd-nbd <-- nodeplugin terminate $ sudo rbd-nbd attach --try-netlink --device /dev/nbd0 rbd-pool/xfs-image /dev/nbd0 $ sudo blkid /dev/nbd0 /dev/nbd0: UUID="d29bf343-6570-4069-a9ea-2fa156ced908" TYPE="xfs" Solution: rbd-nbd/kernel now provides a way to keep some metadata in sysfs to identify between the device and the backend, so that when a remap/attach request is made, rbd-nbd can compare and avoid such dangerous operations. With the provided solution, as part of the initial map request, backend cookie (ceph-csi VOLID) can be stored in the sysfs per device config, so that on a remap/attach request rbd-nbd will check and validate if the backend per device cookie matches with the initial map backend with the help of cookie. At Ceph-csi we use VOLID as device cookie, which will be unique, we pass the VOLID as cookie at map and use the same at the time of attach, that way rbd-nbd can identify backends and their matching devices. Requires: https://github.com/ceph/ceph/pull/41323 https://lkml.org/lkml/2021/4/29/274 Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-11-04 03:20:59 +00:00
Prasanna Kumar Kalever	793b22cf27	rbd: check for nbd cookie support Change checkRbdNbdTools() to setRbdNbdToolFeatures() Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-11-04 03:20:59 +00:00
Prasanna Kumar Kalever	9a3170bf77	rbd: provide a way to disable the auto fallback to nbd mounter This change allows the user to choose not to fallback to NBD mounter when some ImageFeatures are absent with krbd driver, rather just fail the NodeStage call. Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-11-01 08:17:36 +00:00
Prasanna Kumar Kalever	bfc24f6f12	cleanup: generalize the parseBool function Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-11-01 08:17:36 +00:00
Prasanna Kumar Kalever	84ec797dda	rbd: detect krbd features in runtime and fallback to nbd Currently, we recognize and warn for the provided image features based on our prior intelligence at ceph-csi (i.e based on supportedFeatures map and validateImageFeatures) at image/PV creation time. It might be very much possible that the cluster is heterogeneous i.e. the PV creation and application container might both be on different nodes with different kernel versions (krbd driver versions). This PR adds a mechanism to check for the supported krbd features during mount time, if the krbd driver doesn't have the specified image feature then it will fall back to rbd-nbd mounter. Fixes: #478 Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-11-01 08:17:36 +00:00
Niels de Vos	c852f487a5	util: set defaults for Vault config before converting When using UPPER_CASE formatting for the HashiCorp Vault KMS configuration, a missing `VAULT_DESTROY_KEYS` will cause the option to be set to "false". The default for the option is intended to be "true". This is a difference in behaviour between the `vaultDestroyKeys` and `VAULT_DESTROY_KEYS` options. Both should use a default of "true" when the configuration does not set the option explicitly. By setting the default options in the `standardVault` struct before unmarshalling the configuration in it, the default values will be retained for the missing configuration options. Reported-by: Rachael George <rgeorge@redhat.com> Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-10-28 14:41:53 +00:00
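The technique described in the commit above (fill in defaults first, then unmarshal on top of them) can be sketched in Go as follows. The `vaultConfig` type is a hypothetical, trimmed-down stand-in for the `standardVault` struct, and its field set is an assumption; the sketch relies only on the documented behaviour of `encoding/json`, which leaves struct fields untouched when the corresponding key is absent from the input.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// vaultConfig is a hypothetical stand-in for the standardVault struct
// mentioned in the commit message; only two illustrative fields are kept.
type vaultConfig struct {
	Address     string `json:"VAULT_ADDR"`
	DestroyKeys string `json:"VAULT_DESTROY_KEYS"`
}

// parseVaultConfig sets the defaults before unmarshalling, so that options
// missing from the input keep their default value.
func parseVaultConfig(data []byte) (vaultConfig, error) {
	cfg := vaultConfig{DestroyKeys: "true"} // default, per the commit message
	err := json.Unmarshal(data, &cfg)
	return cfg, err
}

func main() {
	cfg, err := parseVaultConfig([]byte(`{"VAULT_ADDR": "https://vault.example.com"}`))
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	fmt.Println("destroy keys:", cfg.DestroyKeys)
}
```

An explicit `"VAULT_DESTROY_KEYS": "false"` in the input still overrides the default, since unmarshalling writes every key that is present.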
Humble Chirammal	6aec858cba	rbd: parse migration secret and set fields for nodestage operations This commit makes use of the migration request secret parsing and sets the required fields for further nodestage operations. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-10-27 18:35:00 +00:00
Humble Chirammal	5621f2cfca	rbd: split the parsing and deletion logic to its own functions. parseAndDeleteMigratedVolume() previously combined parsing the migration volume handle with deleting the volume. This commit splits the logic in two: parsing is done in parseMigrationVolID(), and DeleteMigratedVolume() deletes the backend volume. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-10-27 18:35:00 +00:00
Humble Chirammal	ff0911fb6a	rbd: add unittests for IsMigrationSecret and ParseAndSetSecretMapFromMigSecret This commit adds unit tests for newly introduced migration specific functions. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-10-27 18:35:00 +00:00
Humble Chirammal	b49bf4b987	rbd: parse migration secret and set it for controller server operations This commit adds a couple of helper functions to parse the migration request secret and set it for further csi driver operations. More details: The in-tree secret has a data field called "key" which is the base64 admin secret key. The ceph CSI driver currently expects the secret to contain the data field "UserKey" for the equivalent. The CSI driver also expects the "UserID" field, which is not available in the in-tree secret by default. This missing userID will be filled (if the username differs from 'admin') in the migration secret as the 'adminId' field in the migration request. This commit adds the logic to parse this migration secret as below: the "key" field value will be picked up from the migration secret into the "UserKey" field. The "adminId" field value will be picked up from the migration secret into the "UserID" field; if the `adminId` field is nil or not set, the `UserID` field will be filled with the default value, i.e. `admin`. The above logic is activated only when the secret is a migration secret; otherwise it is skipped and the normal workflow applies as today. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-10-27 18:35:00 +00:00
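The field mapping described in the commit above can be sketched as follows. The function names and the detection rule (a non-empty "key" field marks a migration secret) are assumptions made for illustration, not the signatures of the helpers the commit adds; only the mapping itself ("key" to "UserKey", "adminId" to "UserID" with `admin` as fallback) comes from the commit message.

```go
package main

import (
	"errors"
	"fmt"
)

// defaultUserID is the fallback named in the commit message.
const defaultUserID = "admin"

// isMigrationSecret is a hypothetical check. Assumption: a non-empty "key"
// field identifies an in-tree (migration) secret.
func isMigrationSecret(secret map[string]string) bool {
	return secret["key"] != ""
}

// convertMigrationSecret is a hypothetical helper that maps the in-tree
// fields onto the fields the CSI driver expects.
func convertMigrationSecret(secret map[string]string) (map[string]string, error) {
	if !isMigrationSecret(secret) {
		return nil, errors.New(`not a migration secret: "key" is missing or empty`)
	}
	converted := map[string]string{
		"UserKey": secret["key"],
		"UserID":  defaultUserID,
	}
	// Use adminId only when it is actually set.
	if id := secret["adminId"]; id != "" {
		converted["UserID"] = id
	}
	return converted, nil
}

func main() {
	converted, err := convertMigrationSecret(map[string]string{"key": "c2VjcmV0"})
	fmt.Println(converted, err)
}
```

Keeping the detection separate from the conversion mirrors the commit's statement that the logic applies only to migration secrets and leaves the normal workflow untouched.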
Niels de Vos	b132696e54	rbd: note that thick-provisioning is deprecated Thick-provisioning was introduced to make accounting of assigned space for volumes easier. When thick-provisioned volumes are the only consumer of the Ceph cluster, this works fine. However, it is unlikely that this is the case. Instead, accounting of the requested (thin-provisioned) size of volumes is much more practical as different types of volumes can be tracked. OpenShift already provides cluster-wide quotas, which can combine accounting of requested volumes by grouping different StorageClasses. In addition to the difficult practise of allowing only thick-provisioned RBD backed volumes, the performance makes thick-provisioning troublesome. As volumes need to be completely allocated, data needs to be written to the volume. This can take a long time, depending on the size of the volume. Provisioning, cloning and snapshotting become noticeably slower and, because of the additional time consumption, more prone to failures. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-10-27 06:54:07 +00:00
Madhu Rajanna	0838845c6a	cleanup: remove FIXME from ResyncVolume as the complexity of ResyncVolume is reduced removing the FIXME which is not valid anymore. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-26 12:00:36 +00:00
Madhu Rajanna	2017b8c621	rbd: log mirror daemon state for replication Log the mirror daemon state in the local and remote cluster for better debugging. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-26 12:00:36 +00:00
Madhu Rajanna	7472338334	rbd: remove unwanted const for comparing the image states Use the states defined in go-ceph to avoid creating duplicate consts in cephcsi. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-26 12:00:36 +00:00
Madhu Rajanna	b92a6f5ccb	rbd: log the remote site details during resync logging the remote site details during resyncing for better debugging. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-26 12:00:36 +00:00
Madhu Rajanna	1fd2f28fee	rbd: check local image state for resyncing Below are the local states of the mirrored image: "unknown" -> if the image is in an unknown state, the data is completely synced "error" -> if the image is in an error state, it needs a resync "syncing" "starting_replay" "replaying" "stopping_replay" "stopped" If the resync is started successfully, the image will be in the "replaying" state, so "replaying" can be used to report that the resync is in progress. The intermediate states "syncing", "starting_replay" and "stopping_replay" are discarded. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-26 12:00:36 +00:00
Rakshith R	12cd05a408	rbd: add EnsureImageCleanup to snapshot deletion Signed-off-by: Rakshith R <rar@redhat.com>	2021-10-20 18:25:31 +00:00
Rakshith R	1849076aab	rbd: add EnsureImageCleanup to ensure image cleanup from trash After moving the image to trash, if the `trash remove` step fails, then external-provisioner will issue subsequent requests, in which the image will be absent in the pool (it will be in trash) and omap cleanup will be done with a stale image left in trash with no `trash remove` step on it. To avoid this scenario, list trash images, find the corresponding id for the given image name, and add a task to flatten when we encounter an ErrImageNotFound. Fixes: #1728 Signed-off-by: Rakshith R <rar@redhat.com>	2021-10-20 18:25:31 +00:00
Niels de Vos	6d3e25f069	util: NodeGetVolumeStatsResponse.Usage may not contain negative values Following the CSI specification, values that are included in the VolumeUsage MUST NOT be negative. However, CephFS seems to return -1 for the number of inodes that are available. Instead of returning a negative value, set it to 0 so that it will not get included in the encoded JSON response. Updates: #2579 See-also: `5b0d454015/spec.md (L2477-L2487)` Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-10-20 07:18:48 +00:00
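The clamping described in the commit above can be sketched as follows. `volumeUsage` is a hypothetical, trimmed-down stand-in for the CSI VolumeUsage message, and the `omitempty` JSON tags are an assumption that models why a zero value drops out of the encoded response; `nonNegative` and `encodeUsage` are illustrative names, not functions from ceph-csi.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// volumeUsage is a hypothetical stand-in for the CSI VolumeUsage message.
// The omitempty tags mean zero values are left out of the JSON encoding.
type volumeUsage struct {
	Total     int64 `json:"total,omitempty"`
	Available int64 `json:"available,omitempty"`
	Used      int64 `json:"used,omitempty"`
}

// nonNegative replaces a negative value (CephFS may report -1 for the
// available inodes) with 0, because the CSI spec forbids negative values.
func nonNegative(v int64) int64 {
	if v < 0 {
		return 0
	}
	return v
}

// encodeUsage returns the JSON encoding of the usage as a string.
func encodeUsage(u volumeUsage) (string, error) {
	out, err := json.Marshal(u)
	return string(out), err
}

func main() {
	inodes := volumeUsage{Total: 1024, Used: 24, Available: nonNegative(-1)}
	encoded, err := encodeUsage(inodes)
	if err != nil {
		fmt.Println("encoding failed:", err)
		return
	}
	fmt.Println(encoded)
}
```

With the clamp in place the `available` key is absent from the encoded output instead of carrying a negative number.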
Madhu Rajanna	0d51f6d833	rbd: check local image description for split-brain In some corner case like `re-player shutdown` the local image will not be in error state. It would be also worth considering `description` field to make sure about split-brain. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-18 11:22:03 +00:00
Humble Chirammal	c584fa20da	rbd: use clusterID from volumeContext at nodestage Previously we were retrieving the clusterID using the monitors field in the volume context in the node stage code path. However, it is possible to retrieve and use the clusterID directly from the volume context. This commit also removes the getClusterIDFromMigrationVolume() function, which was used previously, and its tests. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-10-11 10:06:30 +00:00
Humble Chirammal	4e61156dc4	rbd: change iteration variable name in the migration test to be specific we reuse or overload the variable name in the test execution at present. This commit use a different variable name as initialized in each run Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-10-11 10:06:30 +00:00
Madhu Rajanna	90ecd2d7e8	rbd: use go-ceph to get mirroring info use go-ceph api to get image mirroring info. closes #2558 Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-07 08:02:06 +00:00
Madhu Rajanna	8ebc0659ab	rbd: perform resize of file system for static volume For a static volume, the user manually mounts an already existing image as a volume to the application pods. As it is an rbd image, if the PVC is of type fileSystem, the image will be mapped, formatted and mounted on the node. If the user resizes the image on the ceph cluster, the filesystem created on the rbd image is not resized automatically. Even if the user deletes and recreates the kubernetes objects, the new size will not be visible on the node. With this change, during NodeStageVolumeRequest the nodeplugin checks the size of the mapped rbd image on the node using the devicePath, and also the rbd image size on the ceph cluster. If the sizes do not match, it resizes the file system on the node as part of the NodeStageVolumeRequest RPC call. The user needs to perform the below operations to see the new size: * Resize the rbd image in the ceph cluster. * Scale down all the application pods using the static PVC. * Make sure no application pod using the static PVC is running on a node. * Scale up all the application pods. Validate the new size in the application pod's mounted volume. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-06 13:15:00 +00:00
Madhu Rajanna	fe9020260d	rbd: move flattening to helper function In the NodeStage operation we flatten the image to support mounting on older clients. This commit moves the flattening to a helper function to reduce code complexity. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-06 13:15:00 +00:00
Madhu Rajanna	cda2abca5d	rbd: use NewMetricsBlock to get size instead of lsblk command use NewMetricsBlock function from the kubernetes package to get the size. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-06 13:15:00 +00:00
Rakshith R	ded75eb099	rbd: copyEncryptionConfig for thickProvisioned snap restore too This commit adds bugfix to copy encryption passphrase for thick provisioned PVC restored from snapshot. Signed-off-by: Rakshith R <rar@redhat.com>	2021-10-05 07:46:57 +00:00
Rakshith R	59b7a26175	rbd: modify copyEncryptionConfig to accept copyOnlyPassphrase arg During PVC snapshot/clone both the kms config and the passphrase need to be copied, while for PVC restore only the passphrase needs to be copied to the dest rbdvol, since the destination storageclass may have another kms config. Signed-off-by: Rakshith R <rar@redhat.com>	2021-10-05 07:46:57 +00:00
Humble Chirammal	3c9d7e3cd5	rbd: detect migration volID in DeleteVolume() and delete rbd image This commit adds the logic to detect whether a passed-in volumeID is a migrated volume ID; if yes, the driver connects to the backend cluster and cleans/deletes the image. The logic is applied only if it is a migration volume ID. The migration volume ID carries information such as mons, pool and image name, which is enough for the driver to identify and connect to the backend cluster for its operations. migration volID format: <mig>_mons-<monsHash>_image-<imageUID>_<poolHash> Details on the hash values: * MonsHash: this carries a hash value (md5sum) which acts as the `clusterID` for the operations in this context. * ImageUID: this is the unique UUID generated by kubernetes for the created volume. * PoolHash: this is an encoded string of the pool name. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-10-04 16:06:31 +00:00
Madhu Rajanna	34a21cdbe3	cleanup: move mount functions to new pkg moved fuse and kernel mount functions to a new package. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-09-23 06:39:37 +00:00
Madhu Rajanna	b1ef842640	cleanup: move core functions to core pkg As we are refactoring the cephfs code, move all the core functions to a new folder /pkg called core. This will make things easier to implement. From now onwards, all the core functionalities will be added to the core package. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-09-23 06:39:37 +00:00
Humble Chirammal	4804f47b18	e2e: Add e2e for rbd migration static pvc This commit adds e2e for rbd migration static PVCs Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-09-20 09:54:54 +00:00
Humble Chirammal	2e8e8f5e64	rbd: fill clusterID if its a migration nodestage request The migration nodestage request does not carry the 'clusterID'; only monitors are available in the volumeContext. The volume context flags 'migration=true' and 'static=true' allow us to fill 'clusterID' from the passed-in monitors into the volume context, so that the rest of the static operations on nodestage can proceed as we treat static volumes today. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-09-20 09:54:54 +00:00
Humble Chirammal	1f5963919f	util: get clusterID for the passed in mon string As part of migration support, the clusterID has to be fetched from the passed-in mon, because the in-tree RBD storage class only supports the monitor parameter and not `clusterID`. In CSI, however, the SC supports the `clusterID` parameter but not mon. Due to that, we have to fetch the clusterID from the config file for the passed-in mon and use it in our operations. This adds a helper function to retrieve the clusterID from the passed-in mon string. Updates https://github.com/ceph/ceph-csi/issues/2509 Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-09-20 09:54:54 +00:00
Prasanna Kumar Kalever	c9cc36d8db	rbd: provide alternatives to preserve the ceph log files Currently, we delete the ceph client log file on unmap/detach. This patch provides additional alternatives for users who would like to persist the log files. Strategies: ----------- `remove`: delete log file on unmap/detach `compress`: compress the log file to gzip on unmap/detach `preserve`: preserve the log file in text format Note that the default strategy will be remove on unmap, and these options can be tweaked from the storage class Compression size details example: On Map: (with debug-rbd=20) --------- $ ls -lh -rw-r--r-- 1 root root 526K Sep 1 18:15 rbd-nbd-0001-0024-fed5480a-f00f-417a-a51d-31d8a8144c03-0000000000000003-d2e89c87-0b4d-11ec-8ea6-160f128e682d.log On unmap: --------- $ ls -lh -rw-r--r-- 1 root root 33K Sep 1 18:15 rbd-nbd-0001-0024-fed5480a-f00f-417a-a51d-31d8a8144c03-0000000000000003-d2e89c87-0b4d-11ec-8ea6-160f128e682d.gz Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-09-16 13:55:15 +00:00
Prasanna Kumar Kalever	10bbb049f7	cleanup: passing pointers to larger type Log: internal/rbd/rbd_attach.go:424:2: hugeParam: dArgs is heavy (88 bytes); consider passing it by pointer (gocritic) Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-09-16 13:55:15 +00:00
Prasanna Kumar Kalever	ad2c6d2851	util: add gzip helper function Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-09-16 13:55:15 +00:00
Shyamsundar Ranganathan	47dc9cf28d	rbd: Report errors when a resync maybe in progress Currently we return a !ready status if an image is not found when a replication resync is issued. We also return a !ready just post issuing a resync. The change is to ensure we return errors in these cases for the caller to retry the operation till we can determine we are actually resyncing, and then return !ready with nil errors. Part of addressing: https://github.com/csi-addons/volume-replication-operator/issues/101 Signed-off-by: Shyamsundar Ranganathan <srangana@redhat.com>	2021-09-15 15:59:22 +00:00
Rakshith R	82d09d81cf	util: modify GetMonsAndClusterID() to take clusterID instead of options This commit: - modifies GetMonsAndClusterID() to take clusterID instead of options. - moves out validation of clusterID is set or not out of GetMonsAndClusterID(). - defines ErrClusterIDNotSet new error for reusability. - add GetClusterID() to obtain clusterID from options. Signed-off-by: Rakshith R <rar@redhat.com>	2021-09-14 08:39:57 +00:00
Rakshith R	9d1e98ca60	rbd: check for clusterid mapping in genVolFromVolumeOptions() This commit adds the capability to genVolFromVolumeOptions() to fetch the mapped cluster-id & mon ips for a mirrored PVC on a secondary cluster which may have a different cluster-id. This is required for NodeStageVolume(). We also don't need to check for mapping during volume create requests, so it can be disabled by passing the bool checkClusterIDMapping as false. GetMonsAndClusterID() is modified to accept the bool checkClusterIDMapping, based on which the cluster mapping is checked to fetch the mapped cluster-id and mon-ips. Signed-off-by: Rakshith R <rar@redhat.com>	2021-09-14 08:39:57 +00:00
Humble Chirammal	4be53a27d3	cleanup: replace parentName to snapParentName in checkReservation At present, even though checkReservation works for both volumes and snapshots, the arg parentName makes sense only for snapshot cases. Renaming that arg to something more appropriate. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-09-14 05:32:54 +00:00
Humble Chirammal	1fee3ec460	cleanup: correct checkReservation return description it wrongly mention that the return is imageUUID string where actually it is the imageData struct Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-09-14 05:32:54 +00:00
Rakshith R	0a7a7f4866	util: call WriteCephConfig() in cephcsi.go This commit calls WriteCephConfig() in cephcsi.go to create ceph.conf and keyring if it is not mounted to be used by all cli calls and conn cmds. Before this change, rbd-controller/omap-generator did not create ceph.conf on startup. Signed-off-by: Rakshith R <rar@redhat.com>	2021-09-08 16:05:27 +00:00
Madhu Rajanna	8c8f34cf7a	rbd: set vaultAuthNamespace to vaultNamespace if empty When we read the csi-kms-connection-details configmap, vaultAuthNamespace might not be set. When we do the conversion, vaultAuthNamespace might be set to an empty key. This commit checks for an empty value of vaultAuthNamespace and sets vaultAuthNamespace to vaultNamespace. The empty value for vaultAuthNamespace was set due to Marshalling at https://github.com/ceph/ceph-csi/blob/devel/internal/kms/vault_tokens.go#L136-L139. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-09-08 11:18:03 +00:00
Rakshith R	e99dd3dea4	util: read ceph.conf by calling conn.ReadConfigFile(CephConfigPath) The configurations in ceph.conf are not picked up by the rados connection automatically, hence we need to call conn.ReadConfigFile before calling Connect(). Signed-off-by: Rakshith R <rar@redhat.com>	2021-09-07 16:50:12 +00:00
Madhu Rajanna	76f1b42498	cephfs: correct comment for validateExpandVolumeRequest corrected the function comment for validateExpandVolumeRequest. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-09-07 14:33:02 +00:00
Madhu Rajanna	9fd51d9bec	cephfs: add comment for validateCreateVolumeRequest added function comment for validateCreateVolumeRequest Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-09-07 14:33:02 +00:00
Madhu Rajanna	8caeb409bb	cephfs: add comment for validateDeleteVolumeRequest added function comment for the validateDeleteVolumeRequest function. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-09-07 14:33:02 +00:00
Madhu Rajanna	be7749c90e	cleanup: move volumeID to the volumeoptions volumeID can be moved to the volumeOptions as most of the volume related helper functions are available on the volumeoptions.go Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-09-07 14:33:02 +00:00
Madhu Rajanna	da70ed50dc	cleanup: move execCommandErr to volumemounter Moved execCommandErr to the volumemounter.go which is the only caller of this function and moving the execCommandErr helps in reducing the util file. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-09-07 14:33:02 +00:00
Madhu Rajanna	31696a6ce0	cleanup: move genSnapFromOptions to volumeoptions moved genSnapFromOptions function to volumeoptions.go which is more appropriated than util. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-09-07 14:33:02 +00:00
Madhu Rajanna	73e2ffe8b8	cleanup: move cephfs csi spec validation to validator moved the cephfs related validation like validating the input parameters sent in the GRPC request to a new file. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-09-07 14:33:02 +00:00
Humble Chirammal	4efcc5bf97	cleanup: simplify checkStaticVolume function and remove unwanted vars checkStaticVolume() in the reconcilePV function has been unwantedly introducing variables to confirm the pv spec is static or not. This patch simplify it and make a smaller footprint of the functions. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-09-07 12:51:30 +00:00
Humble Chirammal	df2d9548ae	cephfs: no need to check for zero volume size At present there is a 'todo' to check for zero volume size in the createVolume request, which is unwanted, i.e. the pvc creation with size 0 fails at the kubernetes api validation itself. For ex: ``` ..spec.resources[storage]: Invalid value: "0": must be greater than zero ``` so we don't need any extra check in the controller server. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-09-07 04:49:24 +00:00
Prasanna Kumar Kalever	9e55f015de	rbd: avoid supplying map options on unmap Thanks to the random unmap failure on my local machine: I0901 17:08:37.841890 2617035 cephcmds.go:55] ID: 11 Req-ID: 0001-0024-fed5480a-f00f-417a-a51d-31d8a8144c03-0000000000000003-024983f3-0b47-11ec-8fcb-e671f0b9f58e an error (exit status 22) occurred while running rbd args: [unmap rbd-pool/csi-vol-024983f3-0b47-11ec-8fcb-e671f0b9f58e --device-type nbd --options try-netlink --options reattach-timeout=300 --options io-timeout=0] Noticed the map args are also getting passed to/as unmap args, which is not correct. We have separate things for mapOptions and unmapOptions. This PR makes sure that the map args are not passed at the time of unmap. Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-09-06 15:59:30 +00:00
Humble Chirammal	3f31ca8a3a	cleanup: introduce populateVolOptions(), to fill rbdVol from stage req At present nodeStageVolume() handles much of the logic of filling the rbdvol struct based on the request received, and this method is complex to follow. With this patch, filling or populating volOptions has been segregated into its own function, which makes the stage function's job easier. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-09-06 07:49:03 +00:00
Humble Chirammal	f0b8a3f626	rbd: use String() method of MirrorImageState in return error MirrorImageState (type C.rbd_mirror_image_state_t) has a string method which can be used while returning error in the replication controller. Previously, we were using int return in the error which is not the proper usage. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-09-03 16:02:53 +00:00
Madhu Rajanna	4865061ab9	util: create ceph configuration files if not present Create the ceph.conf and keyring files if they are not present in the /etc/ceph/ path. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-09-03 14:14:43 +00:00
Humble Chirammal	1d94c12cd6	cleanup: add checkErrAndUndoReserve() for error check,unreserve omap All the error check scenarios of genVolFromVolID(), and unreserving omap entries based on the error, made the deleteVolume method complex. This patch creates a new function which handles the error check, unreserves omap entries accordingly, and finally returns the response to the deleteVolume caller. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-09-03 12:20:04 +00:00
Niels de Vos	60c2afbcca	util: NewK8sClient() should not panic on non-Kubernetes clusters When NewK8sClient() detects an error, it used to call FatalLogMsg() which causes a panic. There are additional features that can be used on Kubernetes clusters, but these are not a requirement for most functionalities of the driver. Instead of causing a panic, returning an error should suffice. This allows using the driver on non-Kubernetes clusters again. Fixes: #2452 Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-09-02 11:22:14 +00:00
Humble Chirammal	247795517f	cephfs: remove explicit size setting of cloned volume The CephFS CSI driver explicitly sets the size of the cloned volume to the size of the parent volume, as cephfs mgr was lacking this functionality previously. However, this has been addressed in cephfs, so we don't need explicit size setting. Ref#https://tracker.ceph.com/issues/46163 Supported Ceph releases: Ceph versions equal or above - v16.0.0, v15.2.9, v14.2.12 Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-09-01 09:32:29 +00:00
Madhu Rajanna	b383af20b4	cleanup: move cephfs errors to new util package As part of the refactoring, moving the cephfs errors file to a new package. Updates: #852 Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-09-01 06:50:16 +00:00
Rakshith R	99168dc822	rbd: check for clusterid mapping in RegenerateJournal() This commit adds fetchMappedClusterIDAndMons() which returns monitors and clusterID info after checking cluster mapping info. This is required for regenerating omap entries in mirrored cluster with different clusterID. Signed-off-by: Rakshith R <rar@redhat.com>	2021-08-31 14:30:06 +00:00
Rakshith R	496bcba85c	rbd: move GetMappedID() to util package This commit moves getMappedID() from rbd to util package since it is not rbd specific and exports it from there. Signed-off-by: Rakshith R <rar@redhat.com>	2021-08-31 14:30:06 +00:00
Niels de Vos	4a3b1181ce	cleanup: move KMS functionality into its own package A new "internal/kms" package is introduced, it holds the API that can be consumed by the RBD components. The KMS providers are currently in the same package as the API. With later follow-up changes the providers will be placed in their own sub-package. Because of the name of the package "kms", the types, functions and structs inside the package should not be prefixed with KMS anymore: internal/kms/kms.go:213:6: type name will be used as kms.KMSInitializerArgs by other packages, and that stutters; consider calling this InitializerArgs (golint) Updates: #852 Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-08-30 16:31:40 +00:00
Niels de Vos	778b5e86de	cleanup: move k8s functions to the util/k8s package By placing the NewK8sClient() function in its own package, the KMS API can be split from the "internal/util" package. Some of the KMS providers use the NewK8sClient() function, and this causes circular dependencies between "internal/util" -> "internal/kms" -> "internal/util", which are not allowed in Go. Updates: #852 Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-08-30 16:31:40 +00:00
Humble Chirammal	8ea495ab81	rbd: skip volumeattachment processing if pv marked for deletion If the volumeattachment has been fetched but the PV is marked for deletion, the nbd healer should not process this PV any further. This patch adds a check for whether the PV is marked for deletion and, if so, makes the healer skip processing it. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-08-26 15:04:19 +00:00
Niels de Vos	6d00b39886	cleanup: move log functions to new internal/util/log package Moving the log functions into its own internal/util/log package makes it possible to split out the humongous internal/util packages in further smaller pieces. This reduces the inter-dependencies between utility functions and components, preventing circular dependencies which are not allowed in Go. Updates: #852 Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-08-26 09:34:05 +00:00
Niels de Vos	68588dc7df	util: fix unit-test for GetClusterMappingInfo() Unit-testing often fails due to a race condition while writing the clusterMappingConfigFile from multiple go-routines at the same time. Failures from `make containerized-test` look like this: === CONT TestGetClusterMappingInfo/site2-storage_cluster-id_mapping cluster_mapping_test.go:153: GetClusterMappingInfo() = <nil>, expected data &[{map[site1-storage:site2-storage] [map[1:3]] [map[11:5]]} {map[site3-storage:site2-storage] [map[8:3]] [map[10:5]]}] === CONT TestGetClusterMappingInfo/site3-storage_cluster-id_mapping cluster_mapping_test.go:153: GetClusterMappingInfo() = <nil>, expected data &[{map[site3-storage:site2-storage] [map[8:3]] [map[10:5]]}] --- FAIL: TestGetClusterMappingInfo (0.01s) --- PASS: TestGetClusterMappingInfo/mapping_file_not_found (0.00s) --- PASS: TestGetClusterMappingInfo/mapping_file_found_with_empty_data (0.00s) --- PASS: TestGetClusterMappingInfo/cluster-id_mapping_not_found (0.00s) --- FAIL: TestGetClusterMappingInfo/site2-storage_cluster-id_mapping (0.00s) --- FAIL: TestGetClusterMappingInfo/site3-storage_cluster-id_mapping (0.00s) --- PASS: TestGetClusterMappingInfo/site1-storage_cluster-id_mapping (0.00s) By splitting the public GetClusterMappingInfo() function into an internal getClusterMappingInfo() that takes a filename, unit-testing can use different files for each go-routine, and testing becomes more predictable. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-08-25 16:08:48 +00:00
Prasanna Kumar Kalever	4f40213d8e	rbd: fix rbd-nbd io-timeout to never abort With the tests at CI, it kind of looks like that the IO is timing out after 30 seconds (default with rbd-nbd). Since we have tweaked reattach-timeout to 300 seconds at ceph-csi, we need to explicitly set io-timeout on the device too, as it doesn't make any sense to keep io-timeout < reattach-timeout Hence we set io-timeout for rbd nbd to 0. Specifying io-timeout 0 tells the nbd driver to not abort the request and instead see if it can be restarted on another socket. Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com> Suggested-by: Ilya Dryomov <idryomov@redhat.com>	2021-08-24 17:09:09 +00:00
Prasanna Kumar Kalever	3bf17ade7a	doc: update code comments about available timeout options Adding some code comments to make them readable and easy to understand. Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-08-24 17:09:09 +00:00
Prasanna Kumar Kalever	ea3def0db2	rbd: remove per volume rbd-nbd logfiles on detach - Update the meta stash with logDir details - Use the same to remove logfile on unstage/unmap to be space efficient Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-08-24 07:15:30 +00:00
Prasanna Kumar Kalever	d67e88ccd0	cleanup: embed args into struct and pass it to detachRBDImageOrDeviceSpec Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-08-24 07:15:30 +00:00
Prasanna Kumar Kalever	474100c1f1	rbd: add a unit test for getCephClientLogFileName() Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-08-24 07:15:30 +00:00
Prasanna Kumar Kalever	682b3a980b	rbd: rbd-nbd logging the ceph-CSI way - One logfile per device/volume - Add ability to customize the logdir, default: /var/log/ceph Note: if user customizes the hostpath to something else other than default /var/log/ceph, then it is his responsibility to update the `cephLogDir` in storageclass to reflect the same with daemon: ``` cephLogDir: "/var/log/mynewpath" ``` Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-08-24 07:15:30 +00:00
Humble Chirammal	9ac1391d0f	util: correct interface name and remove redundancy ContollerManager had a typo in it, and if we correct it, linter will fail and suggest not to use controller.ControllerManager as the interface name and package name is redundant, keeping manager as the interface name which is the practice and also address the linter issues. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-08-19 04:19:42 +00:00
Humble Chirammal	edf511a833	cephfs: make use of subvolumeInfo.state to determine quota https://github.com/ceph/go-ceph/pull/455/ added `state` field to subvolume info struct which helps to identify the snapshot retention state in the caller. This patch make use of the same Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-08-18 04:50:46 +00:00
Humble Chirammal	66fa5891b2	cephfs: correct typos in cephfs driver code Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-08-18 04:50:46 +00:00
Humble Chirammal	5089a4ce5d	doc: correct some source code comments in rbd driver code Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-08-17 06:57:09 +00:00
Madhu Rajanna	5562e46d0f	rbd: Cleanup OMAP data for secondary image If the image is in a secondary state and is up+replaying, it is a healthy secondary: the image is primary somewhere in the remote cluster and the local image is getting replayed. Delete the OMAP data generated, as we cannot delete the secondary image. When the image on the primary cluster gets deleted or has mirroring disabled, the image on all the remote (secondary) clusters will get auto-deleted. This helps in garbage collecting the OMAP, PVC and PV objects after a failback operation. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-08-16 17:38:25 +00:00
Madhu Rajanna	fc0d6f6b8b	rbd: return success if image is healthy secondary If the image is in a secondary state and is up+replaying, it is a healthy secondary: the image is primary somewhere in the remote cluster and the local image is getting replayed. Return success for disabling mirroring, as we cannot disable mirroring on an image in the secondary state. When mirroring is disabled for the image on the remote site, the image on all the remote (secondary) clusters will get auto-deleted. This helps in garbage collecting the volume replication Kubernetes artifacts. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-08-16 17:38:25 +00:00
Madhu Rajanna	35324b2e17	rbd: add helper function to get local state added helper function to check the local image state is up+replaying. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-08-16 17:38:25 +00:00
Humble Chirammal	87beaac25b	rbd: add ReadWriteOncePod in accessModeStrToInt() conversion function Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-08-12 09:55:50 +00:00
Rakshith R	f05ac2b25d	rbd: extract kmsID from volumeAttributes in RegenerateJournal() This commit adds functionality of extracting encryption kmsID, owner from volumeAttributes in RegenerateJournal() and adds utility functions ParseEncryptionOpts and FetchEncryptionKMSID. Signed-off-by: Rakshith R <rar@redhat.com>	2021-08-10 09:17:59 +00:00
Rakshith R	b960e3633a	rbd: extract volumeNamePrefix in RegenerateJournal() Signed-off-by: Rakshith R <rar@redhat.com>	2021-08-10 09:17:59 +00:00
Rakshith R	b9b4b1e34e	rbd: refactor RegenerateJournal() to take in volumeAttributes This commit refactors RegenerateJournal() to take in volumeAttributes map[string]string as argument so it can extract required attributes internally. Signed-off-by: Rakshith R <rar@redhat.com>	2021-08-10 09:17:59 +00:00
Rakshith R	39d6752fc1	rbd: use `CSIInstanceID` var instead of "default" in RegenerateJournal() Signed-off-by: Rakshith R <rar@redhat.com>	2021-08-10 09:17:59 +00:00
Ben Ye	9cd8326bb2	cleanup: allocate slice with known size As the input capabilities size is known, it is better to allocate slice with a specified size. Signed-off-by: Ben Ye <ben.ye@bytedance.com>	2021-08-10 05:39:44 +00:00
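The preallocation idea in commit 9cd8326bb2 can be illustrated with a minimal sketch. The `capability` type and `modes` function below are hypothetical stand-ins and are not taken from the ceph-csi source; only the `make` call with a known capacity reflects the technique the commit describes.

```go
package main

import "fmt"

// capability is a hypothetical stand-in for a CSI volume capability.
type capability struct{ Mode string }

// modes collects the access mode of every capability.
func modes(caps []capability) []string {
	// Length 0 and capacity len(caps): append never needs to grow
	// the backing array because the final size is known up front.
	out := make([]string, 0, len(caps))
	for _, c := range caps {
		out = append(out, c.Mode)
	}
	return out
}

func main() {
	m := modes([]capability{{Mode: "ReadWriteOnce"}, {Mode: "ReadOnlyMany"}})
	fmt.Println(m, len(m), cap(m))
}
```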
Madhu Rajanna	6cc37f0a17	cleanup: use different file name for testing For clusterMappingConfigFile using different file name so that multiple unit test cases can work without any data race. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-08-09 13:37:25 +00:00
Madhu Rajanna	3c85219962	rbd: consider empty mirroring mode consider the empty mirroring mode when validating the snapshot interval and the scheduling time. Even if the mirroring Mode is not set validate the snapshot scheduling details as cephcsi sets the mirroring mode to default snapshot. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-08-09 11:05:05 +00:00
Rakshith R	825211730c	rbd: fix snapshot id idempotency issue This commit fixes snapshot id idempotency issue by always returning an error when flattening is in progress and not using `readyToUse:false` response. Signed-off-by: Rakshith R <rar@redhat.com>	2021-08-09 07:28:43 +00:00
Rakshith R	859d696279	cleanup: refactor checkCloneImage to reduce if nesting This commit refactors the checkCloneImage function to address a nestif linter issue. Signed-off-by: Rakshith R <rar@redhat.com>	2021-08-09 07:28:43 +00:00
Madhu Rajanna	a5a8952716	rbd: fix clone problem This commit fixes a bug in checkCloneImage() which was caused by checking cloned image before checking on temp-clone image snap in a subsequent request which lead to stale images. This was solved by checking temp-clone image snap and flattening temp-clone if needed. This commit also fixes comparison bug in flattenCloneImage(). Signed-off-by: Rakshith R <rar@redhat.com> Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-08-09 07:28:43 +00:00
Madhu Rajanna	916c97b4a8	rbd: copy creds when copying the connection rbd flatten functions is a CLI call and it expects the creds as the input and copying of creds is required when we generate the temp clone image. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-08-09 07:28:43 +00:00
Rakshith R	08728b631b	rbd: fix vol.VolID in cloneFromSnapshot() Volume generated from snap using genrateVolFromSnap already copies volume ID correctly, therefore removing `vol.VolID = rbdVol.VolID` which wrongly copies parent Volume ID instead leading to error from copyEncryption() on parent and clone volume ID being equal. Signed-off-by: Rakshith R <rar@redhat.com>	2021-08-09 07:28:43 +00:00
Niels de Vos	b5d2321d57	cleanup: use vaultDefaultCAVerify to set default value Golang-ci complains about the following: internal/util/vault_tokens.go:99:20: string `true` has 4 occurrences, but such constant `vaultDefaultDestroyKeys` already exists (goconst) v.VaultCAVerify = "true" ^ This occurrence of "true" can be replaced by vaultDefaultCAVerify to address the warning. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-08-06 12:19:18 +00:00
Niels de Vos	f584db41e6	util: add vaultDestroyKeys option to destroy Vault kv-v2 secrets Hashicorp Vault does not completely remove the secrets in a kv-v2 backend when the keys are deleted. The metadata of the keys will be kept, and it is possible to recover the contents of the keys afterwards. With the new `vaultDestroyKeys` configuration parameter, this behaviour can now be selected. By default the parameter will be set to `true`, indicating that the keys and contents should completely be destroyed. Setting it to any other value will make it possible to recover the deleted keys. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-08-06 12:19:18 +00:00
Madhu Rajanna	2782878ea2	rbd: log LastUpdate in UTC format This Commit converts the LastUpdate from int to the UTC format and logs it for better debugging. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-08-06 10:18:51 +00:00
Madhu Rajanna	92ad2ceec9	rbd: read clusterID and PoolID from mapping Whenever Ceph-CSI receives a CSI/Replication request it will first decode the volumeHandle and try to get the required OMAP details if it is not able to retrieve, receives a `Not Found` error message and Ceph-CSI will check for the clusterID mapping. If the old volumeID `0001-00013-site1-storage-0000000000000001 -b0285c97-a0ce-11eb-8c66-0242ac110002` contains the `site1-storage` as the clusterID, now Ceph-CSI will look for the corresponding clusterID `site2-storage` from the above configmap. If the clusterID mapping is found now Ceph-CSI will look for the poolID mapping ie mapping between `1` and `2`. Example:- pool with name exists on both the clusters with different ID's Replicapool with ID `1` on site1 and Replicapool with ID `2` on site2. After getting the required mapping Ceph-CSI has the required information to get more details from the rados OMAP. If we have multiple clusterID mapping it will loop through all the mapping and checks the corresponding pool to get the OMAP data. If the clusterID mapping does not exist Ceph-CSI will return an `Not Found` error message to the caller. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-08-05 16:07:51 +00:00
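The lookup order described in commit 92ad2ceec9 (clusterID mapping first, then poolID mapping, looping over all mapping entries) might be sketched as follows. The `clusterMapping` struct, its field names, and the `lookup` function are assumptions made for illustration; the real ceph-csi types and the direction of the mapping may differ.

```go
package main

import "fmt"

// clusterMapping is a hypothetical, simplified shape of one mapping entry.
type clusterMapping struct {
	ClusterIDs map[string]string   // clusterID in the volumeHandle -> mapped clusterID
	PoolIDs    []map[string]string // poolID in the volumeHandle -> mapped poolID
}

// lookup walks every mapping entry and returns the mapped clusterID and
// poolID. The final bool is false when no entry matches, in which case a
// caller would report "Not Found".
func lookup(mappings []clusterMapping, clusterID, poolID string) (string, string, bool) {
	for _, m := range mappings {
		target, ok := m.ClusterIDs[clusterID]
		if !ok {
			continue // this entry does not mention the clusterID
		}
		for _, pools := range m.PoolIDs {
			if mapped, ok := pools[poolID]; ok {
				return target, mapped, true
			}
		}
	}
	return "", "", false
}

func main() {
	mappings := []clusterMapping{{
		ClusterIDs: map[string]string{"site1-storage": "site2-storage"},
		PoolIDs:    []map[string]string{{"1": "2"}},
	}}
	fmt.Println(lookup(mappings, "site1-storage", "1"))
}
```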
Madhu Rajanna	ac11d71e19	util: add helper function to read clusterID mapping added helper function to read the clusterID mapping from the mounted file. The clusterID mapping contains below mappings * ClusterID mappings (to cluster to which we are failingover and from which cluster failover happened) * RBD PoolID mapping of between the clusters. * CephFS FscID mapping between the clusters. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-08-05 16:07:51 +00:00
Yug Gupta	1dc032e554	doc: update comments in voljournal Update spell errors and comments in voljournal.go Signed-off-by: Yug Gupta <yuggupta27@gmail.com>	2021-08-05 08:11:15 +00:00
Niels de Vos	4859f2dfdb	util: allow configuring VAULT_AUTH_MOUNT_PATH for Vault Tenant SA KMS The VAULT_AUTH_MOUNT_PATH is a Vault configuration parameter that allows a user to set a non default path for the Kubernetes ServiceAccount integration. This can already be configured for the Vault KMS, and is now added to the Vault Tenant SA KMS as well. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-08-05 06:02:57 +00:00
Niels de Vos	f2d5c2e0df	util: add vaultAuthNamespace option for Vault KMS The new `vaultAuthNamespace` configuration parameter can be set to the Vault Namespace where the authentication is setup in the service. Some Hashicorp Vault deployments use sub-namespaces for their users/tenants, with a 'root' namespace where the authentication is configured. This requires passing of different Vault namespaces for different operations. Example: - the Kubernetes Auth mechanism is configured for in the Vault Namespace called 'devops' - a user/tenant has a sub-namespace called 'devops/website' where the encryption passphrases can be placed in the key-value store The configuration for this, then looks like: vaultAuthNamespace: devops vaultNamespace: devops/homepage Note that Vault Namespaces are a feature of the Hashicorp Vault Enterprise product, and not part of the Open Source version. This prevents adding e2e tests that validate the Vault Namespace configuration. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-08-04 18:20:45 +00:00
Niels de Vos	83167e2ac5	util: correct error message when connecting to Vault fails Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-08-04 18:20:45 +00:00
Alexandre Lossent	5cba04c470	cephfs: support selinux mount options - mount host's /etc/selinux in node plugins - process mount options in all code paths for cephfs volume options Signed-off-by: Alexandre Lossent <alexandre.lossent@cern.ch>	2021-08-04 12:59:34 +00:00
Artur Troian	16ec97d8f7	util: getCgroupPidsFile produces stripped path when extra : present This commit uses `strings.SplitN` instead of `strings.Split`. The path for pids.max has extra `:` symbols in it, due to which getCgroupPidsFile() splits the string into 5 tokens instead of 3, leading to loss of part of the path. As a result, the below error is reported: `Failed to get the PID limit, can not reconfigure: open /sys/fs/cgroup/pids/system.slice/containerd.service/ kubepods-besteffort-pod183b9d14_aed1_4b66_a696_da0c738bc012.slice/pids.max: no such file or directory` SplitN takes an argument n and splits the string into at most n tokens, which helps us to get the desired file path. Fixes: #2337 Co-authored-by: Yati Padia <ypadia@redhat.com> Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-08-03 06:03:10 +00:00
Madhu Rajanna	8f185bf7b2	rbd: use rados namespace for manager command Currently we have a bug: we are not using the rados namespace when adding the ceph manager command to remove the image from the trash. This commit adds the missing rados namespace when adding the ceph manager task. Without the fix, the image will be moved to the trash and no task will be added to remove it from the trash. It then becomes Ceph's responsibility to remove the image when it cleans up the trash. Workaround: manually purge the trash. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-07-28 03:48:33 +00:00
Niels de Vos	ec6703ed58	rbd: rename encryption metadata keys to enable mirroring RBD image metadata keys that start with '.rbd' are expected to be internal to RBD itself and are not mirrored to remote sites. Renaming the keys (dropping the '.' prefix) and using the new MigrateMetadata() function now makes the keys available on remote sites too. Closes: #2219 Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-07-26 11:49:56 +00:00
Niels de Vos	607129171d	rbd: move image metadata key migration to its own function The new MigrateMetadata() function can be used to get the metadata of an image with a deprecated and new key. Renaming metadata keys can be done easily this way. A default value will be set in the image metadata when it is missing completely. But if the deprecated key was set, the data is stored under the new key and the deprecated key is removed. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-07-26 11:49:56 +00:00
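The migration rules stated in commit 607129171d (use the deprecated key's value if it was set, otherwise store a default) could be sketched like this. The `metadataStore` map is a hypothetical in-memory stand-in for RBD image metadata, the key names are invented for the example, and giving precedence to an already-present new key is an assumption that the commit message does not spell out.

```go
package main

import "fmt"

// metadataStore is a hypothetical in-memory stand-in for image metadata.
type metadataStore map[string]string

// migrateMetadata returns the value stored under newKey, migrating it
// from deprecatedKey or storing defaultValue when needed.
func migrateMetadata(md metadataStore, deprecatedKey, newKey, defaultValue string) string {
	// Assumption: an existing value under the new key wins.
	if v, ok := md[newKey]; ok {
		return v
	}
	// Deprecated key was set: store the data under the new key
	// and remove the deprecated key.
	if v, ok := md[deprecatedKey]; ok {
		md[newKey] = v
		delete(md, deprecatedKey)
		return v
	}
	// Missing completely: set the default value.
	md[newKey] = defaultValue
	return defaultValue
}

func main() {
	// Hypothetical key names, with and without the '.' prefix.
	md := metadataStore{".rbd.example/state": "encrypted"}
	v := migrateMetadata(md, ".rbd.example/state", "rbd.example/state", "none")
	fmt.Println(v, md)
}
```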
Yati Padia	6691951453	rbd: use go-ceph for getImageMirroringStatus Currently, getImageMirroringStatus() is using RBD CLI. This commit converts RBD CLI to go-ceph API. Fixes: #2120 Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-07-26 06:37:40 +00:00
Prasanna Kumar Kalever	526ff95f10	rbd: add support to expand encrypted volume Previously in ControllerExpandVolume() we had a check for encrypted volumes and we use to fail for all expand requests on an encrypted volume. Also for Block VolumeMode PVCs NodeExpandVolume used to be ignored/skipped. With these changes, we add support for the expansion of encrypted volumes. Also for raw Block VolumeMode PVCs with Encryption we call NodeExpandVolume. That said, With LUKS1, cryptsetup utility doesn't prompt for a passphrase on resizing the crypto mapper device. This is because LUKS1 devices don't use kernel keyring for volume keys. Whereas, LUKS2 devices use kernel keyring for volume key by default, i.e. cryptsetup utility asks for a passphrase if it detects volume key was previously passed to dm-crypt via kernel keyring service, we are overriding the default by --disable-keyring option during cryptsetup open command. So that at the time of crypto mapper device resize we will not be prompted for any passphrase. Fixes: #1469 Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-07-23 10:00:23 +00:00
Prasanna Kumar Kalever	4fa05cb3a1	util: add helper functions for resize of encrypted volume such as: ResizeEncryptedVolume() and LuksResize() Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-07-23 10:00:23 +00:00
Prasanna Kumar Kalever	572f39d656	util: fix log level in OpenEncryptedVolume() Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-07-23 10:00:23 +00:00
Prasanna Kumar Kalever	812003eb45	util: fix bug in DeviceEncryptionStatus() With Luks1 device: $ cryptsetup status /dev/mapper/crypto-rbd0 /dev/mapper/crypto-rbd0 is active and is in use. type: LUKS1 cipher: aes-xts-plain64 keysize: 512 bits key location: dm-crypt device: /dev/rbd0 sector size: 512 offset: 4096 sectors size: 4190208 sectors mode: read/write With Luks2 device: $ cryptsetup status /dev/mapper/crypto-rbd0 /dev/mapper/crypto-rbd0 is active and is in use. type: LUKS2 cipher: aes-xts-plain64 keysize: 512 bits key location: dm-crypt device: /dev/rbd0 sector size: 512 offset: 32768 sectors size: 4161536 sectors mode: read/write This could lead to failures with unmap in the NodeUnstageVolume path for the encrypted volumes. Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-07-23 10:00:23 +00:00
Yati Padia	1ae2afe208	cleanup: modifies the error caused due to merged PRs This commit modifies the error of godot, cyclop, paralleltest linter caused due to merged PRs. Updates: #1586 Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-07-22 18:15:48 +00:00
Yati Padia	172b66f73f	cleanup: resolves cyclop linter issue This commit adds `// nolint:cyclop` for the functions whose complexity is above 20. Updates: #1586 Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-07-22 18:15:48 +00:00
Humble Chirammal	abe6a6e5ac	util: remove deleteLock test as it is enforced by the controller Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-07-22 15:07:49 +00:00
Humble Chirammal	c42d4768ca	util: remove the deleteLock acquisition check for clone and snapshot At present, while acquiring the deleteLock on the volume, we check for ongoing clone and snapshot creation operations on the same volume. Considering that the snapshot and clone controllers do not allow parent volume deletion during these operations, we can drop this extra check. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-07-22 15:07:49 +00:00
Niels de Vos	82557e3f34	util: allow configuring VAULT_BACKEND for Vault connection It seems that the version of the key/value engine can not always be detected for Hashicorp Vault. In certain cases, it is required to configure the `VAULT_BACKEND` (or `vaultBackend`) option so that a successful connection to the service can be made. The `kv-v2` is the current default for development deployments of Hashicorp Vault (what we use for automated testing). Production deployments default to version 1 for now. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-07-22 13:02:47 +00:00
Rakshith R	43f753760b	cleanup: resolve nlreturn linter issues nlreturn linter requires a new line before return and branch statements except when the return is alone inside a statement group (such as an if statement) to increase code clarity. This commit addresses such issues. Updates: #1586 Signed-off-by: Rakshith R <rar@redhat.com>	2021-07-22 06:05:01 +00:00
Yati Padia	3469dfc753	cleanup: resolve errorlint issues This commit resolves errorlint issues which checks for the code that will cause problems with the error wrapping scheme. Updates: #1586 Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-07-19 13:31:29 +00:00
Yati Padia	bfda5fa57f	cleanup: resolve revive linter issue revive linter checks for var-declaration format. For example: "e2e/rbd_helper.go:441:36: var-declaration: should drop = nil from declaration of var noPVCValidation; it is the zero value (revive) var noPVCValidation validateFunc = nil" Updates: #1586 Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-07-19 08:39:32 +00:00
Humble Chirammal	bd947bbe31	util: remove deleteLock check while acquiring snapshot createLock snapshot controller make sure the pvc which is the source for the snapshot request wont get deleted while snapshot is getting created, so we dont need to check for any ongoing delete operation here on the volume. Subjected code path in snapshot controller: ``` pvc, err := ctrl.getClaimFromVolumeSnapshot(snapshot) . .. pvcClone.ObjectMeta.Finalizers = append(pvcClone.ObjectMeta.Finalizers, utils.PVCFinalizer) _, err = ctrl.client.CoreV1().PersistentVolumeClaims(pvcClone.Namespace).Update(..) ``` Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-07-17 10:23:13 +00:00
Prasanna Kumar Kalever	78f740d903	rbd: improve healer to run multiple NodeStageVolume req concurrently This will bring down the healer run time by a great factor. Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-07-16 16:30:58 +00:00
Prasanna Kumar Kalever	b6a88dd728	rbd: add volume healer Problem: For rbd-nbd userspace mounter backends, after a restart of the nodeplugin all the mounts will start seeing IO errors. This is because, for rbd-nbd backends, there is a userspace mount daemon running per volume, and after a restart of the nodeplugin pod there is no way to bring the daemons back to life. Solution: The volume healer is a one-time activity that is triggered at the startup time of the rbd nodeplugin. It navigates through the list of volume attachments on the node and acts accordingly. For now, it is limited to nbd type storage only, but it is flexible and can be extended in the future for other backend types as needed. From a high level: This solves a severe problem for nbd backed csi volumes. While going through the list of volume attachments on the node, if the healer finds a volume that is in attached state and is of type nbd, it will attempt to fix the rbd-nbd volume by sending a NodeStageVolume request with the required volume attributes, such as secrets, device name and image attributes, which will finally help start the required rbd-nbd daemons in the nodeplugin csi-rbdplugin container. This will allow reattaching the backend images with the right nbd device, thus allowing the applications to perform IO without any interruptions even after a nodeplugin restart. Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-07-16 16:30:58 +00:00
Prasanna Kumar Kalever	6007fc9bfe	cleanup: move static volume check to helper function Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-07-16 16:30:58 +00:00
Prasanna Kumar Kalever	6d24080851	rbd: update per volume metadata stash-file with devicePath As part of stage transaction if the mounter is of type nbd, then capture device path after a successful rbd-nbd map. Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-07-16 16:30:58 +00:00
Prasanna Kumar Kalever	70998571aa	cleanup: change variable name from path to metaDataPath path is used by standard package. Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-07-16 16:30:58 +00:00
Humble Chirammal	94c5c5e119	util: remove deleteLock while we acquire clone operation lock clone controller make sure there is no delete operation happens on the source PVC which has been referred as the datasource of clone PVC, we are safe to operate without looking at delete operation lock in this case. Subjected code in the controller: ... if claim.Spec.DataSource != nil && rc.clone { err = p.setCloneFinalizer(ctx, claim) ... } if !checkFinalizer(claim, pvcCloneFinalizer) { claim.Finalizers = append(claim.Finalizers, pvcCloneFinalizer) _, err := p.client.CoreV1().PersistentVolumeClaims(claim.Namespace).Update(..claim..) } Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-07-16 12:32:28 +00:00
Humble Chirammal	e088e8fd2e	cephfs: Get rid of locking at nodepublish Considering kubelet makes sure the stage and publish operations are serialized, we don't need any extra locking in nodePublish. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-07-16 07:18:56 +00:00
Humble Chirammal	61bf49a4f5	rbd: Get rid of locking at nodePublish Considering kubelet makes sure the stage and publish operations are serialized, we don't need any extra locking in nodePublish. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-07-16 07:18:56 +00:00
Humble Chirammal	ced3a0922f	cephfs: Get rid of locking at nodeUnpublish call Considering kubelet makes sure the unstage and unpublish operations are serialized, we don't need any extra locking in nodeUnpublish. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-07-16 07:18:56 +00:00
Humble Chirammal	ef852cc93d	rbd: Get rid of locking at nodeUnpublish call Considering kubelet makes sure the unstage and unpublish operations are serialized, we don't need any extra locking in nodeUnpublish. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-07-16 07:18:56 +00:00
Yati Padia	f36d611ef9	cleanup: resolves gofumpt issues of internal codes This PR runs gofumpt for internal folder. Updates: #1586 Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-07-14 19:50:56 +00:00
Yati Padia	299979fc14	ci: add unit test for toError() This commit adds unit test for the func converting cephFSCloneState to error. Fixes: #2259 Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-07-14 15:02:12 +00:00
Yati Padia	c66872c3c6	cleanup: ineffective assignment This commit resolves ineffective assignment of snap. Updates: #1586 Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-07-14 12:39:17 +00:00
Yati Padia	f210d5758b	cleanup: spell check getImageMirroingStatus This commit corrects the spelling for getImageMirroingStatus() -> getImageMirroringStatus Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-07-14 07:32:01 +00:00
Niels de Vos	d941e5abac	util: make parseTenantConfig() usable for modular KMSs parseTenantConfig() only allowed configuring a defined set of options, and KMSs were not able to re-use the implementation. Now, the function parses the ConfigMap from the Tenants Namespace and returns a map with options that the KMS supports. The map that parseTenantConfig() returns can be inspected by the KMS, and applied to the vaultTenantConnection type by calling parseConfig(). Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-07-13 17:16:35 +00:00
Niels de Vos	3d7d48a4aa	util: VaultTenantSA KMS implementation This new KMS uses a Kubernetes ServiceAccount from a Tenant (Namespace) to connect to Hashicorp Vault. The provisioner and node-plugin will check for the configured ServiceAccount and use the token that is located in one of the linked Secrets. Subsequently the Vault connection is configured to use the Kubernetes token from the Tenant. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-07-13 17:16:35 +00:00
Niels de Vos	6dc5bf2b29	util: split vaultTenantConnection from VaultTokensKMS This makes the Tenant configuration for Hashicorp Vault KMS connections more modular. Additional KMS implementations that use Hashicorp Vault with per-Tenant options can re-use the new vaultTenantConnection. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-07-13 17:16:35 +00:00
Yati Padia	69c9e5ffb1	cleanup: resolve parallel test issue This commit resolves parallel test issues and also excludes internal/util/conn_pool_test.go as those test can't run in parallel. Updates: #1586 Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-07-13 11:31:39 +00:00
Yati Padia	4a649fe17f	cleanup: resolve godot linter This commit resolves godot linter issue which says "Comment should end in a period (godot)". Updates: #1586 Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-07-13 06:50:03 +00:00
Yati Padia	f35ce3d880	cleanup: Adds t.Helper() to test helper function This commit adds t.Helper() to the test helper function. With this call go test prints correct lines of code for failed tests. Otherwise, printed lines will be inside helpers functions. For more details check: https://github.com/kulti/thelper Updates: #1586 Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-07-12 11:25:55 +00:00
Yati Padia	84c1fe52c7	cleanup: resolve exhaustive linter This commit resolves exhaustive linter error. Updates: #2240 Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-07-12 04:47:08 +00:00
Jonas Zeiger	680a7bf411	util: more generic kernel version parsing * Make kernel version parsing to support more (valid) version strings * Put version string parsing into a separate, testable function * Fixes #2248 (Kernel Subversion Parsing Failure) Signed-off-by: Jonas Zeiger <jonas.zeiger@talpidae.net>	2021-07-09 07:36:27 +00:00
Rakshith R	3352d4aabd	rbd: add user secret based metadata encryption This commit adds capability to `metadata` encryption to be able to fetch `encryptionPassphrase` from user specified secret name and namespace(if not specified, will default to namespace where PVC was created). This behavior is followed if `secretName` key is found in the encryption configuration else defaults to fetching `encryptionPassphrase` from storageclass secrets. Closes: 2107 Signed-off-by: Rakshith R <rar@redhat.com>	2021-07-08 17:06:02 +00:00
Yati Padia	ffab37f44f	cleanup: resolves gocritic linter issues This commit resolves gocritic linter errors. Updates: #2250 Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-07-08 05:19:26 +00:00
Madhu Rajanna	dd0884310f	rbd: set image metadata in isThickProvisioned Setting metadata in the isThickProvisioned method helps us to avoid checking the thick metakey and the deprecated metakey for both thick and thin provisioned images, and it will also help us to migrate the deprecated key to the new key. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-07-07 08:31:10 +00:00
Madhu Rajanna	77135599ac	rbd: make setThickProvisioned as method of rbdImage isThickProvisioned is already a method of the rbdImage; to keep similar thick-provisioning related functions common, making setThickProvisioned a method of rbdImage as well. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-07-07 08:31:10 +00:00
Madhu Rajanna	708800ddc1	rbd: set thick metadata if ThickProvision is set instead of checking the parent is thick provisioned or not we can decide based on the rbdVol generated from the request. If the request is to create a Thick Image. set metadata without checking the parent. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-07-07 08:31:10 +00:00
Madhu Rajanna	332a47a100	rbd: deprecate .rbd.csi.ceph.com/thick-provisioned metadata key As image metadata key starting with '.rbd' will not be copied when we do clone or mirroring, deprecating the old key for the same reason use 'csi.ceph.com/thick-provisioned' to set image metadata. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-07-07 08:31:10 +00:00
Madhu Rajanna	0837c05be0	rbd: set scheduling interval on snapshot mirrored image Mirror-snapshots can also be automatically created on a periodic basis if mirror-snapshot schedules are defined. The mirror-snapshot can be scheduled globally, per-pool, or per-image levels. Multiple mirror-snapshot schedules can be defined at any level. To create a mirror-snapshot schedule with rbd, specify the mirror snapshot schedule add command along with an optional pool or image name; interval; and optional start time: The interval can be specified in days, hours, or minutes using d, h, m suffix respectively. The optional start-time can be specified using the ISO 8601 time format. For example: ``` $ rbd --cluster site-a mirror snapshot schedule add --pool image-pool --image image1 24h 14:00:00-05:00 ``` Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-07-06 14:41:48 +00:00
Madhu Rajanna	b1710f4c53	util: add method to get rados connection New go-ceph admin package api's expects to pass the rados connection as argument. added new method called GetRBDAdmin to get admin connection to administrate rbd volumes. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-07-06 14:41:48 +00:00
Rakshith R	9eaa55506f	rebase: update controller-runtime package to v0.9.2 This commit updates controller-runtime to v0.9.2 and makes changes in persistentvolume.go to add context to various functions and function calls made here instead of context.TODO(). Signed-off-by: Rakshith R <rar@redhat.com>	2021-07-01 03:35:23 +00:00
Rakshith R	1b23d78113	rebase: update kubernetes to v1.21.2 Updated kubernetes packages to latest release. resizefs package has been included into k8s.io/mount-utils package. updated code to use the same. Updates: #1968 Signed-off-by: Rakshith R <rar@redhat.com>	2021-07-01 03:35:23 +00:00
Humble Chirammal	cc6d67a7d6	internal: reformat long lines in internal/util package to 120 chars We have many declarations, invocations, etc. with long lines which are very difficult to follow while doing code reading. This addresses the issues in 'internal/util' package files to restrict the line length to 120 chars. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-06-28 14:43:49 +00:00
Humble Chirammal	8f82a30c21	internal: reformat long lines in internal/rbd package to 120 chars We have many declarations, invocations, etc. with long lines which are very difficult to follow while doing code reading. This addresses the issues in the below files, and restricts the line length to 120 chars. -internal/rbd/rbd_attach.go -internal/rbd/rbd_journal.go -internal/rbd/rbd_util.go -internal/rbd/replicationcontrollerserver.go -internal/rbd/snapshot.go Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-06-28 14:43:49 +00:00
Humble Chirammal	e829308249	internal: reformat long lines in internal/rbd package to 120 chars We have many declarations, invocations, etc. with long lines which are very difficult to follow while doing code reading. This addresses the issues in 'internal/rbd/*server.go' and 'internal/rbd/driver.go' files to restrict the line length to 120 chars. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-06-28 14:43:49 +00:00
Humble Chirammal	3dc8c5b516	internal: reformat long lines in internal/journal package to 120 chars We have many declarations, invocations, etc. with long lines which are very difficult to follow while doing code reading. This addresses the issues in 'internal/journal' package to restrict the line length to 120 chars. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-06-28 14:43:49 +00:00
Humble Chirammal	a3b83fe8a7	internal: reformat long lines in internal/csi-common package to 120 chars We have many declarations, invocations, etc. with long lines which are very difficult to follow while doing code reading. This addresses the issues in 'internal/csi-common' package to restrict the line length to 120 chars. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-06-28 14:43:49 +00:00
Humble Chirammal	f526c4a5e8	internal: reformat long lines in internal/controller package to 120 chars We have many declarations, invocations, etc. with long lines which are very difficult to follow while doing code reading. This addresses the issues in 'internal/controller' package to restrict the line length to 120 chars. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-06-28 14:43:49 +00:00
Humble Chirammal	0d432be5bf	internal: reformat long lines in internal/cephfs package to 120 chars We have many declarations, invocations, etc. with long lines which are very difficult to follow while doing code reading. This addresses the issues in 'internal/cephfs' package to restrict the line length to 120 chars. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-06-28 14:43:49 +00:00
Rakshith R	404e011ae9	cleanup: added helper func isNotMountPoint Added helper func isNotMountPoint to check mountPoint, validate error and reduce complexity of NodeStageVolume. Signed-off-by: Rakshith R <rar@redhat.com>	2021-06-28 05:46:42 +00:00
Rakshith R	7fc553a3a7	rbd: removing TrimSpace from validateImageFeatures func `imageFeatures` string containing just whitespace should also be treated as an invalid feature. Signed-off-by: Rakshith R <rar@redhat.com>	2021-06-28 05:46:42 +00:00
Rakshith R	84b046d736	rbd: add check for imageFeatures parameter This commit adds checks for missing `imageFeatures` parameter in createvolumerequest and nodestagerequest(only for static PVs). Missing `imageFeatures` parameter is ignored in case of non-static PVs to ensure backwards compatibility with older versions which did not have `imageFeatures` as required parameter. Signed-off-by: Rakshith R <rar@redhat.com>	2021-06-28 05:46:42 +00:00
Yati Padia	13667c013c	cleanup: addresses paralleltest linter The Go linter paralleltest checks that the t.Parallel gets called for the test method and for the range of test cases within the test. Updates: #2025 Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-06-25 11:55:12 +00:00
Niels de Vos	0ee0c12027	cleanup: prevent panic in cleanUpSnapshot While cleaning up snapshots, not all object may exist after a partial provisioning attempt. In case objects are missing, do not try to delete them. Fixes: #2192 Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-06-25 10:01:35 +00:00
Niels de Vos	eeec4471c5	rbd: no need to create a snapshot on a thick-provisioned volume When cloning a volume from a (CSI) snapshot, we use DeepCopy() and do not need an RBD snapshot as source. Suggested-by: Madhu Rajanna <madhupr007@gmail.com> Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-06-23 14:22:28 +00:00
Niels de Vos	d2c4cacb39	rbd: restart thick-provisioned PVC snapshot restoring after aborting In case restoring a snapshot of a thick-PVC failed during DeepCopy(), the image will exist, but have partial contents. Only when the image has the thick-provisioned metadata set, it has completed DeepCopy(). When the metadata is missing, the image is deleted, and an error is returned to the caller. Kubernetes will automatically retry provisioning on the ABORTED error, and the restoring will get restarted from the beginning. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-06-23 14:22:28 +00:00
Niels de Vos	7f1bdb49d1	rbd: use DeepCopy() when restoring a thick-snapshot Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-06-23 14:22:28 +00:00
Yati Padia	847b996501	cleanup: Modifies Wrapcheck linter Wrapcheck is a simple Go linter to check that errors from external packages are wrapped during return to help identify the error source during debugging. This commit addresses the wrapcheck error Updates:#2025 Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-06-22 08:47:55 +00:00
Madhu Rajanna	591ba3f580	rbd: set thick provision metadata on clone volume The parent volume (CreateVolume) and the clone volume (CreateSnapshot) are both independent, and the parent volume can be deleted anytime. To check the thick provision during Snapshot restore (CreateVolume from snapshot) we need the thick provision metadata, so for the same reason we set the thick provision metadata on the clone image we are creating at CreateSnapshot time. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-06-18 10:57:48 +00:00
Madhu Rajanna	6d14eeee70	rbd: use RbdSnapName to check the image details RbdSnapName holds the actual RBD image name which got created during the CreateSnapshot operation. RbdImageName holds the name of the parent from which the snapshot is created. and the parent is independent of snapshot and it can be deleted any time for the same reason using the RbdSnapName to check the rbd image details. generate a temporary volume from the snapshot which replaces the rbdImageName with RbdSnapName and use it to check the image metadata. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-06-18 10:57:48 +00:00
Madhu Rajanna	7966d2e5c1	rbd: add validation for thick restore/clone added validation to allow only Restore of Thick PVC snapshot to a thick clone and creation of thick clone from thick PVC. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-06-18 10:57:48 +00:00
Madhu Rajanna	fc442221e4	rbd: make isThickProvisioned method of rbdImage isThickProvisioned can be used for both snapshot and clone validation if isThickProvisioned is method of common rbdImage structure. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-06-18 10:57:48 +00:00
Niels de Vos	57d3183cb1	rbd: restart thick-provisioned PVC cloning after aborting In case cloning a thick-PVC failed during DeepCopy(), the image will exist, but have partial contents. Only when the image has the thick-provisioned metadata set, it has completed DeepCopy(). When the metadata is missing, the image is deleted, and an error is returned to the caller. Kubernetes will automatically retry provisioning on the ABORTED error, and the cloning will get restarted from the beginning. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-06-18 06:25:56 +00:00
Niels de Vos	b1045364d9	rbd: disable FeatureDeepFlatten when doing DeepCopy() Not all Linux kernels support the deep-flatten feature. Disabling the feature makes it possible to map RBD images on older kernels (like what minikube uses). Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-06-18 06:25:56 +00:00
Niels de Vos	4908ff8743	rbd: no need to flatten thick-provisioned images Thick-provisioned images are independent, cloned images or snapshots are deep-flattened during creation. There is no need to try and flatten them again. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-06-18 06:25:56 +00:00
Niels de Vos	6cc11c15d3	rbd: use DeepCopy to create a thick-provisioned clone To create a fully-allocated RBD image from a snapshot/clone DeepCopy() can be used. This is needed when the parent of the new volume is thick-provisioned, so that the new volume is independent of the parent and thick-provisioned as well. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-06-18 06:25:56 +00:00
Niels de Vos	334f237e23	cleanup: move snapshot/clone/flatten into its own function Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-06-18 06:25:56 +00:00
Madhu Rajanna	367eb9f748	rbd: correct return error for isCompatibleEncryption isCompatibleEncryption is used to validate the requested volume and the existing volume and the destination volume name wont be generated yet and logging the destination volume prints the empty image name with pool name. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-06-17 10:12:18 +00:00
Madhu Rajanna	05b8433b89	rbd: check stdErr for does not have a parent error The actual error will be present in the stdErr, not the error, when we try to add a task to flatten the rbd image. This commit corrects the error checking when the image does not have a parent. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-06-15 11:07:34 +00:00
Yati Padia	6bfdf2feb0	cleanup: gocyclo being unused for linter This commit addresses the following issue: 'nolint:gocyclo // complexity needs to be reduced.' is unused for linter "gocyclo" (nolintlint) Updates:#2025 Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-06-15 02:54:16 +00:00
Yug	5c079894c7	doc: correct comment indentation in rbdVolume correct comment indentation in rbdvolume{} Signed-off-by: Yug <yuggupta27@gmail.com>	2021-06-15 02:34:51 +00:00
Yati Padia	095a82f37d	util: returns actual error instead of ErrPoolNotFound This commit returns the actual error returned by the go-ceph API to the function GetPoolName(..) instead of just returning ErrPoolNotFound every time there is an error getting the pool id. There is an issue reported in which the snapshot creation takes much more time to reach True state (i.e., between 2-7 mins) and keeps trying to create with the below error though the pool is present: rpc error: code = NotFound desc = pool not found: pool ID (21) not found in Ceph cluster. Since we cannot interpret the actual error for the delay in snapshot creation, it is required to return the actual error as well so that we can understand the reason. Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-06-14 14:41:32 +00:00
Humble Chirammal	17b0091cba	cleanup: fix codespell error in internal/utils package Codespell checker reports the below error: ``` Resulting CLI options --check-filenames --check-hidden --skip .git,./vendor --ignore-words-list ExtraVersion,extraversion,ba 1 Error: ./internal/util/aws_metadata.go:96: Kubenetes ==> Kubernetes ``` This commit addresses the same. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-06-11 08:04:07 +00:00
Yug	d992803e9e	rbd: Update pool name in image chain While traversing the image chain, the parent image can be present in a different pool than the one the child is in. So, updating the pool name in the next iteration to that of the parent. Co-authored-by: Madhu Rajanna <madhupr007@gmail.com> Signed-off-by: Yug <yuggupta27@gmail.com>	2021-06-10 21:46:53 +00:00
Yug	1f6a9cabfd	rbd: verify if pool name is not empty Validate Snapshot request to check if the passed pool name is not empty. Co-authored-by: Madhu Rajanna <madhupr007@gmail.com> Signed-off-by: Yug <yuggupta27@gmail.com>	2021-06-10 21:46:53 +00:00
Yug	3898ae34a7	rbd: open new ioctx connection if the parent and child clones are in different namespaces we need to open a new ioctx for pools. Co-authored-by: Madhu Rajanna <madhupr007@gmail.com> Signed-off-by: Yug <yuggupta27@gmail.com>	2021-06-10 21:46:53 +00:00
Yug	b63b0bf18d	rbd: retrieve parent pool name of child image when clones are created in different pool,we need to retrieve the parent pool to get the information of the parent image. Co-authored-by: Madhu Rajanna <madhupr007@gmail.com> Signed-off-by: Yug <yuggupta27@gmail.com>	2021-06-10 21:46:53 +00:00
Yug	e699318acc	rbd: pass parent volume to undoSnapshotCloning function as we are supporting the creation of clone to a new pool we need to pass the correct parent volume to cleanup the snapshot on parent volume. Co-authored-by: Madhu Rajanna <madhupr007@gmail.com> Signed-off-by: Yug <yuggupta27@gmail.com>	2021-06-10 21:46:53 +00:00
Yug	961c1d12fd	rbd: add support to create clone in different pool Added support to create an image in a different pool. If the snapshot/rbd image exists in one pool, we can create a clone of the rbd image in a different pool. Co-authored-by: Madhu Rajanna <madhupr007@gmail.com> Signed-off-by: Yug <yuggupta27@gmail.com>	2021-06-10 21:46:53 +00:00
Mohammed Naser	671d6a7767	rbd: Backout if image features is empty In golang world, if you split an empty string that does not contain the separator, you get an array with one empty string. This results in volumes failing to mount with "invalid feature " (note extra space because it's trying to check if 'empty string' is a valid feature). This patch checks if the string is empty, and if so, it just decides to skip the entire validation and return nothing. Signed-off-by: Mohammed Naser <mnaser@vexxhost.com>	2021-06-10 15:43:09 +00:00
Mohammed Naser	f193ebfbb1	rbd: Add failing test when no features are provided Signed-off-by: Mohammed Naser <mnaser@vexxhost.com>	2021-06-10 15:43:09 +00:00
Madhu Rajanna	7b5c78ec7c	rbd: fail fast in create volume for mismatch encryption CreateVolume will fail in below cases * If the snapshot is encrypted and requested volume is not encrypted * If the snapshot is not encrypted and requested volume is encrypted * If the parent volume is encrypted and requested volume is not encrypted * If the parent volume is not encrypted and requested volume is encrypted Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-06-07 15:05:21 +00:00
Madhu Rajanna	4e2c4ef704	cephfs: return internal server error If it is an error from the IsMountPoint function and the error is not IsNotExist, return it as an internal server error. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-06-07 07:38:48 +00:00
Madhu Rajanna	46f1ab9e99	cephfs: use IsMountPoint to check mountpoint Currently we are relying on the error output from the umount command we run on the nodes when mounting the volume, but we are not checking for all the error messages to verify whether the volume is mounted or not. This commit uses the IsMountPoint function in util to check the mountpoint. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-06-07 07:38:48 +00:00
Madhu Rajanna	b4dbffa316	util: return actual error from IsMountPoint As callers are already taking care of returning the GRPC error code, return the actual error from the IsMountPoint function. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-06-07 07:38:48 +00:00
Yati Padia	0f44c6acb7	cleanup: address wasted assign issues At places variable is reassigned without being used. Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-06-03 09:51:14 +00:00
YingshuoTao	bfe64d4aee	cephfs: pass extra volume attributes to static PV when using pre-provisioned volumes, pass these parameters: - kernelMountOptions - fuseMountOptions - subVolumeGroup in spec.csi.volumeAttributes in PV declaration Signed-off-by: YingshuoTao <frigid.blues@gmail.com>	2021-06-03 04:42:59 +00:00
Niels de Vos	7cbad9305f	rbd: repair thick-provisioned images on CreateVolume restart Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-06-01 14:42:12 +00:00
Niels de Vos	96a8ea3e88	cleanup: split repairExistingVolume() from CreateVolume() Move the repairing of a volume/snapshot from CreateVolume to its own function. This reduces the complexity of the code, and makes the procedure easier to understand. Further enhancements to repairing an existing volume can be done in the new function. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-06-01 14:42:12 +00:00
Madhu Rajanna	2e978e4211	rbd: fix typo in error message fixed typo in error message. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-06-01 10:40:07 +00:00
Madhu Rajanna	a666d452bf	cephfs: return GRPC error in NodeGetVolumeStats in case of failure return GRPC error to the caller. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-05-31 08:17:37 +00:00
Rakshith R	b891e5585d	cleanup: address ifshort linter issues This commit addresses ifshort linter issues which checks if short syntax for if-statements is possible. updates: #1586 Signed-off-by: Rakshith R <rar@redhat.com>	2021-05-26 07:04:32 +00:00
Rakshith R	6618e2012d	cleanup: remove unnecessary calling of .String() when logging This commit removes calling of .String() when logging since `%s`,`%v` or `%q` will call an existing .String() function automatically. Fixes: #2051 Signed-off-by: Rakshith R <rar@redhat.com>	2021-05-25 18:02:11 +00:00
Yati Padia	774e8e4042	util: enable golang profiling Add support for golang profiling. Standard tools like go tool pprof and curl work. example: $ go tool pprof http://localhost:8080/debug/pprof/profile $ go tool pprof http://localhost:8080/debug/pprof/heap $ curl http://localhost:8080/debug/pprof/heap?debug=1 https://golang.org/pkg/net/http/pprof/ contains more details about the pprof interface. Fixes: #1699 Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-05-25 10:41:22 +00:00
Niels de Vos	25d0a1cfc0	rbd: add support for block-devices in NodeGetVolumeStats() The NodeGetVolumeStats procedure can now be used to fetch the capacity of the RBD block-device. By default this is a thin-provisioned device, which means that the capacity is not reserved in the Ceph cluster. This makes it possible to over-provision the cluster. In order to detect the amount of storage used by the RBD block-device (when thin-provisioned), it is required to connect to the Ceph cluster. Unfortunately, the NodeGetVolumeStats CSI procedure does not provide enough parameters to connect to the Ceph cluster and fetch more details about the RBD image. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-05-25 06:41:04 +00:00
Niels de Vos	c0ab4c03e6	cephfs: move NodeGetVolumeStats() to CephFS NodeServer The CephFS NodeServer should handle the CephFS specific requests. This is not something that the NodeServer for RBD should handle. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-05-25 06:41:04 +00:00
Madhu Rajanna	0ce6ad1152	rbd: fix image details logging log only the required details of the image. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-05-07 07:57:37 +00:00
Madhu Rajanna	67d73cd6e9	rbd: flatten image if the depth is not zero Flatten the image if the deep-flatten feature is present on the images in the chain or if the depth of the image chain is not zero, as we cannot check the deep-flatten feature on the images which are in trash. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-05-07 07:57:37 +00:00
Madhu Rajanna	e15e2e5081	rbd: discard image not found error For flatten we call checkImageChainHasFeature which internally calls to getImageInfo returns the parent name even if the parent is in the trash, when we try to open the parent image to get its information it fails as the image not found. we should treat error as nil if the parent is not found. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-05-07 07:57:37 +00:00
Niels de Vos	f11a041f56	cleanup: address gosec complaint about creating a file The new gosec 2.7.0 complains like: G304 (CWE-22): Potential file inclusion via variable (Confidence: HIGH, Severity: MEDIUM) Updates: #2025 Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-05-05 16:05:23 +00:00
Madhu Rajanna	07a916b84d	rbd: mark image ready when image state is up+unknown To recover from split brain (up+error) state the image needs to be demoted and requested for resync on site-a and then the image on site-b should get demoted. The volume should be marked to ready=true when the image state on both the clusters is up+unknown because during the last snapshot syncing the data gets copied first and then image state on the site-a changes to up+unknown. If the image state on both the sites is up+unknown consider that complete data is synced as the last snapshot gets exchanged between the clusters. * create 10 GB of file and validate the data after resync * Do Failover when the site-a goes down * Force promote the image and write data in GiB * Once the site-a comes back, Demote the image and issue resync * Demote the image on site-b * The status will get reflected on the other site when the last snapshot sync happens * The image will go to up+unknown state. and complete data will be copied to site a * Promote the image on site-a and use it ```bash csi-vol-5633715e-a7eb-11eb-bebb-0242ac110006: global_id: e7f9ec55-06ab-46cb-a1ae-784be75ed96d state: up+unknown description: remote image demoted service: a on minicluster1 last_update: 2021-04-28 07:11:56 peer_sites: name: e47e29f4-96e8-44ed-b6c6-edf15c5a91d6-rook-ceph state: up+unknown description: remote image demoted last_update: 2021-04-28 07:11:41 ``` * Do Failover when the site-a goes down * Force promote the image on site-b and write data in GiB * Demote the image on site-b * Once the site-a comes back, Demote the image on site-a * The images on the both site will go to split brain state ```bash csi-vol-37effcb5-a7f1-11eb-bebb-0242ac110006: global_id: 115c3df9-3d4f-4c04-93a7-531b82155ddf state: up+error description: split-brain service: a on minicluster2 last_update: 2021-04-28 07:25:41 peer_sites: name: abbda0f0-0117-4425-8cb2-deb4c853da47-rook-ceph state: up+error description: split-brain last_update: 2021-04-28 07:25:26 ``` * Issue resync * The images cannot be resynced because when we issue resync on site a the image on site-b was in demoted state * To recover from this state (promote and then demote the image on site-b after sometime) ```bash csi-vol-37effcb5-a7f1-11eb-bebb-0242ac110006: global_id: 115c3df9-3d4f-4c04-93a7-531b82155ddf state: up+unknown description: remote image demoted service: a on minicluster1 last_update: 2021-04-28 07:32:56 peer_sites: name: e47e29f4-96e8-44ed-b6c6-edf15c5a91d6-rook-ceph state: up+unknown description: remote image demoted last_update: 2021-04-28 07:32:41 ``` * Once the data is copied we can see that the image state is moved to up+unknown on both sites * Promote the image on site-a and use it Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-05-05 13:38:29 +00:00
Madhu Rajanna	c3bae17fce	rbd: delete encryption key from KMS When a Snapshot is encrypted during a CreateSnapshot operation, the encryption key gets created in the KMS. When we delete the Snapshot, the key should also get deleted from the KMS. When we create a volume from a snapshot we copy the required information, but we missed copying the encryption information. This commit adds the missing information to delete the encryption key. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-04-30 08:05:47 +00:00
Humble Chirammal	074c937a08	cleanup: correct typo in vault_tokens.go Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-04-29 08:51:29 +00:00
Mudit Agarwal	ec105bd782	cephfs: expand clone error messages Adding "snapshot clone" in the clone error messages. Signed-off-by: Mudit Agarwal <muagarwa@redhat.com>	2021-04-26 13:38:55 +00:00
Humble Chirammal	798437d0c4	rbd: return crypt error for the rpc return At present we return the volume connect error if the clone from snapshot fails when the rbdvolume is encrypted, which is incorrect. This patch correctly returns the failed copy encryption error to the caller. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-04-21 16:10:20 +00:00
Madhu Rajanna	52290333e6	rbd: modified logic to check image watchers Before the RBD map operation, we check the watchers on the RBD image. In the case of an RWO volume, cephcsi makes sure only one client is using the RBD image. If the RBD image is mirrored, the mirroring daemon adds a watcher on the image by default, and because we use go-ceph, another watcher is added when we open the image. So there will be two watchers on an image if mirroring is enabled. This holds when the rbd mirror daemon is running. If the mirror daemon is not running, there will be only one watcher on the RBD image (placed by the go-ceph image open). We should not block the map operation if the mirroring daemon is not running, as mirroring is asynchronous. This commit adds a check to make sure there are no more than 2 watchers if the image is mirrored, or no more than 1 watcher if it is not. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-04-19 16:30:55 +00:00
Yug	6a46f381c2	cleanup: update description to generic Since rbdImage is a common struct for rbdVolume and rbdSnapshot, its description matched only snapshots. This commit makes the comments generic for both volumes and snapshots. Signed-off-by: Yug <yuggupta27@gmail.com>	2021-04-19 07:32:35 +00:00
Rakshith R	9f2cf498b6	cephfs: enable ceph-fuse big_writes by default By default, the write buffer size in libfuse2 is 2KiB; the `fuse_big_writes = true` option is used to override this limit. This commit makes `fuse_big_writes = true` the default in ceph.conf. Closes: #1928 Signed-off-by: Rakshith R <rar@redhat.com>	2021-04-19 07:08:57 +00:00
Humble Chirammal	54845b63c0	cleanup: better or corrected variable name in grpc prometheus code Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-04-16 10:22:35 +00:00
Humble Chirammal	0fae0e53b6	cleanup: various source code comment corrections Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-04-16 10:22:35 +00:00
Madhu Rajanna	eea52847bc	rbd: check volumeID in PV if image not found If the pool or a few keys are missing in the omap, the GetImageAttributes function returns a nil error and a few empty items in the imageAttributes struct. If the image is not found and the entries are missing, use the volumeID present in the PV annotation for further operations. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-04-15 17:13:06 +05:30
Madhu Rajanna	cfc88c9910	rbd: discard up+unknown state in ResyncVolume If the image is promoted and then demoted, the image state will be set to up+unknown if the image on the remote cluster is still in the demoted state. When the user changes the state from primary to secondary while the image on the remote cluster is still in the demoted (secondary) state, the image state on both clusters will be unknown. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-04-15 17:13:06 +05:30
Niels de Vos	8b8480017b	logging: report issues in rbdImage.DEKStore API with stacks It helps to get a stack trace when debugging issues. Certain things are considered bugs in the code (like missing attributes in a struct), and might cause a panic in certain occasions. In this case, a missing string will not panic, but the behaviour will also not be correct (DEKs getting encrypted, but unable to decrypt). Clearly logging this as a BUG is probably better than calling panic(). Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	b1d05a1840	rbd: repair encryption config in case it is missing It is possible that when a provisioner restarts after a snapshot was cloned, but before the newly restored image had its encryption metadata set, the new image is not marked as encrypted. This will prevent attaching/mounting the image, as the encryption key will not be fetched, or is not available in the DEKStore. By actively repairing the encryption configuration when needed, this problem should be addressed. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	1482105309	cleanup: use buildCreateVolumeResponse() to simplify CreateVolume() buildCreateVolumeResponse() exists exactly for the need to create a csi.CreateVolumeResponse based on an rbdVolume. Calling this helper reduces the code duplication in CreateVolume(). Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	52433841b4	cleanup: move copyEncryptionConfig() from CreateVolume to Exists() The rbdVolume that needs its encryption configured is constructed in the Exists() method. It is suitable to move the copyEncryptionConfig() call there as well, so that the object is completely constructed in a single place. Golang-ci:gocyclo complained about the increased complexity of the Exists() function. Moving the repairing of the ImageID into its own helper function makes the code a little easier to understand. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	596410ae60	cleanup: address "nolint" comments for RBD CreateSnapshot Introduce helper function cloneFromSnapshot() that takes care of the procedures that are needed when an existing snapshot has been found. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	b5d0524c39	cleanup: release resources for rbdImages objects after use Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	dc990037a5	rbd: move setupEncryption() from buildCreateVolumeResponse to CreateVolume Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	bea9d56117	rbd: copyEncryptionConfig in doSnapshotClone() Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	fd5f4dbafd	rbd: configureEncryption() in genSnapFromSnapID() Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	6fd3f57f40	rbd: set kmsID in reserveSnap() Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	0a046c5b6d	rbd: copy encryption configuration in CreateSnapshot Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	6b1285d38b	rbd: copy passphrase for encrypted clones When a source volume is encrypted, the passphrase needs to be copied and stored for the newly cloned volume. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	7b332a0184	rbd: add rbdImage.copyEncryptionConfig() to copy encryption metadata Cloning volumes requires copying the DEK from the source to the newly cloned volume. Introduce copyEncryptionConfig() as a helper for that. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	7e6feecc25	util: add VolumeEncryption.StoreCryptoPassphrase() The new StoreCryptoPassphrase() method makes it possible to store an unencrypted passphrase newly encrypted in the DEKStore. Cloning volumes will use this, as the passphrase from the original volume will need to get copied as part of the metadata for the volume. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	b6aa19eea5	rbd: pass secrets when creating an source rbdVolume for cloning Without this, the rbdVolume can not connect to the Ceph cluster and configure the (optional) encryption. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	92b2e08adf	rbd: improve logging in deleteImage() Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	99da92cfd7	rbd: move deletion of DEK to deleteImage() The ControllerServer should not need to care about support for encryption, ideally it is transparently handled by the rbdVolume type and its internal API. Deleting the DEK was one of the last remainders that was explicitly done inside the ControllerServer. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	151d066938	util: add logging when OpenEncryptedVolume() encounters an error Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	bd1388fb96	util: log available configs when KMS not found When the KMS configuration can not be found, it is useful to know what configurations are available. This aids troubleshooting when typos in the KMS ID are made. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	a7c261a394	logging: correct formatting when reporting error in createVolumeFromSnapshot() Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Rakshith R	ae6a52a84e	util: add nil check to default ControllerGetCapabilities() Currently default ControllerGetCapabilities function is being used which throws 'runtime error: invalid memory address or nil pointer dereference' when `--controllerServer=true` is not set in provisioner deployment args. This commit adds a check to prevent it. Fixes: 1925 Signed-off-by: Rakshith R <rar@redhat.com>	2021-04-09 10:12:48 +00:00
Rakshith R	10d539efc8	cleanup: correct nolint directive listing format The nolint directive needs to be followed by a comma-separated list of linters. This commit changes gocognit:gocyclo, which was not recognised, to the linters which show errors for the function. Signed-off-by: Rakshith R <rar@redhat.com>	2021-04-09 07:24:47 +00:00
Rakshith R	fb7389f478	cephfs: add stderr to mount function errors This commit appends stderr to error in both kernel and ceph-fuse mounter functions to better be able to debug errors. Signed-off-by: Rakshith R <rar@redhat.com>	2021-04-08 12:18:01 +00:00
Madhu Rajanna	e2fa84357a	rbd: take lock when reconciling the PV There is a chance that the same PV is reconciled in parallel, which can end up generating and deleting multiple omap keys. To be on the safer side, take a lock to process one volumeHandle at a time. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-04-07 11:46:27 +00:00
Madhu Rajanna	0f8813d89f	rbd:store/Read volumeID in/from PV annotation In the case of Async DR, the volumeID will not be the same if the clusterID or the PoolID is different. With the earlier implementation, it is expected that the new volumeID mapping is stored in the rados omap pool. In the case of a ControllerExpand or DeleteVolume request, only the volumeID is sent, so it is not possible to find the corresponding poolID in the new cluster. With this change, it works as follows: the csi-rbdplugin-controller watches for PV objects. When a PV object is created, it checks whether the omap already exists. If the omap doesn't exist, it generates the new volumeID and checks for the volumeID mapping entry in the PV annotation; if the mapping does not exist, it adds the new entry to the PV annotation. If the omap does not exist, cephcsi checks the PV annotations, and if the mapping exists in the PV annotation, it uses the new volumeID for further operations. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-04-07 11:46:27 +00:00
Rakshith R	020cded581	cleanup: refactor deeply nested if statements in internal/rbd Refactored deeply nested if statement in internal/rbd to reduce cognitive complexity. Signed-off-by: Rakshith R <rar@redhat.com>	2021-04-07 02:31:41 +00:00
Rakshith R	d4cfd7bef9	cleanup: refactor deeply nested if statement in vault_tokens.go Refactored deeply nested if statement in vault_tokens.go to reduce cognitive complexity by adding fetchTenantConfig function. Signed-off-by: Rakshith R <rar@redhat.com>	2021-04-07 02:31:41 +00:00
Rakshith R	2d1a572d11	cleanup: refactor deeply nested if statements in internal/cephfs Refactored deeply nested if statement in internal/cephfs to reduce cognitive complexity. Signed-off-by: Rakshith R <rar@redhat.com>	2021-04-07 02:31:41 +00:00
Rakshith R	0f7b653b4e	cleanup: refactor deeply nested if statements in persistentvolume.go Refactored deeply nested if statement in persistentvolume.go to reduce cognitive complexity. Signed-off-by: Rakshith R <rar@redhat.com>	2021-04-07 02:31:41 +00:00
Niels de Vos	aaeb35eceb	rbd: encrypted volumes can be of type "crypto_LUKS" too It seems that newer versions of some tools/libraries identify encrypted filesystems with `crypto_LUKS` instead of `crypt`. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-06 15:54:27 +00:00
Madhu Rajanna	d7838defcf	rbd: return FailedPrecondition error message In case of DR, the image on the primary site cannot be demoted as the cluster is down; during failover the image needs to be force promoted. RBD returns a `Device or resource busy` error message if the image cannot be promoted for the above reason. Return FailedPrecondition so that the replication operator can send a request to force promote the image. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-04-06 14:12:41 +00:00
Madhu Rajanna	403532c9a6	rbd: use force from PromoteVolume Request Instead of fetching the force option from the parameters, use the Force field available in the PromoteVolume Request. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-04-06 14:12:41 +00:00
Madhu Rajanna	385a751b8e	rebase: rename kube-storage to csi-addons as the org github.com/kube-storage is renamed to github.com/csi-addons as the name kube-storage was more generic. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-04-06 10:59:58 +00:00
Niels de Vos	1c1683ba20	util: add AmazonMetadata KMS provider The new Amazon Metadata KMS provider uses a CMK stored in AWS KMS to encrypt/decrypt the DEK which is stored in the volume metadata. Updates: #1921 Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-06 07:33:54 +00:00
Niels de Vos	f3b06d4c4a	util: pass Namespace as part of KMSInitializerArgs Amazon KMS expects a Secret with sensitive account and key information in the Kubernetes Namespace where the Ceph-CSI Pods are running. It will fetch the contents of the Secret itself. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-06 07:33:54 +00:00
Niels de Vos	523ac4b975	util: move getPodNamespace() and getKMSConfigMapName() into its own helpers These functions can now be re-used easier. The Amazon KMS needs to know the Namespace of the Pod for reading a Secret with more key/values. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-06 07:33:54 +00:00
Humble Chirammal	314fe0e23d	cleanup: correct misspelling in rbd/clone.go Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-04-05 09:34:09 +00:00
Madhu Rajanna	448be70682	rbd: early check for disabled,disabling in DisableVolumeReplication added early check for disabling and disabled image mirroring state in DisableVolumeReplication Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-04-05 08:53:40 +00:00
Madhu Rajanna	fb3f7fe202	rbd: remove todo for image not found In case of a resync, the image gets deleted and recreated, which is a time-consuming operation. It makes sense to return an aborted error instead of not found, as we have the omap data and only the image is missing in the rbd pool. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-04-05 08:53:40 +00:00
Madhu Rajanna	95387c3b5e	rbd: check for peer site status Do a resync if the image is in the unknown or error state. Check that the current image state is up+stopped or up+replaying, and that all peer site statuses are up+stopped, to confirm that resyncing is done and the image can be promoted and used. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-04-05 08:53:40 +00:00
Madhu Rajanna	233954bc10	rbd: make replication operations as rbdImage methods Added replication-related operations as methods of rbdImage, as these methods can be easily reused when we introduce volumesnapshot mirroring operations. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-04-05 08:53:40 +00:00
Madhu Rajanna	c822ad460d	rbd: add a check for image mirror disabling state The rbd mirror state can be enabled, disabled or disabling. If mirroring is not disabled yet and is still in the disabling state, we need to check for it and return an aborted error message while the mirroring is still getting disabled. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-04-05 08:53:40 +00:00

1298 Commits