ceph-csi

mirror of https://github.com/ceph/ceph-csi.git synced 2024-12-24 05:50:22 +00:00

Author	SHA1	Message	Date
Humble Chirammal	5aa1e4d225	rbd: change the configmap of HPCS/KP key names to reflect the IBM string Considering IBM has different crypto services (ex: SKLM) in place, it's good to keep the configmap key names in the format `IBM_KP_...` instead of `KP_..` so that, if we add more crypto services from IBM in the future, we can keep a similar schema specific to each service. Ex: `IBM_SKLM_...` Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2022-01-05 06:08:19 +00:00
Niels de Vos	8eaf1abbdc	util: add common logging to csi-addons gRPC Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-12-23 17:43:23 +00:00
Niels de Vos	bb5d3b7257	cleanup: refactor gRPC middleware into NewMiddlewareServerOption Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-12-23 17:43:23 +00:00
Niels de Vos	e574c807f0	rbd: expose CSI-Addons ReclaimSpace operations Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-12-23 17:43:23 +00:00
Niels de Vos	c274649b80	rbd: implement NodeReclaimSpace By calling fstrim/blkdiscard on the volume, space consumption should get reduced. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-12-23 17:43:23 +00:00
Niels de Vos	7d36c5a9d1	rbd: implement CSI-Addons ControllerReclaimSpace The CSI Controller (provisioner) can call `rbd sparsify` to reduce the space consumption of the volume. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-12-23 17:43:23 +00:00
Madhu Rajanna	e4b7943bac	rbd: add workaround for force promote Use ExecCommandWithTimeout with a timeout of 1 minute for the promote operation. If the command does not return an error or a response within 1 minute, the process will be killed and an error will be returned to the user. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-23 13:36:21 +00:00
Madhu Rajanna	95e9595c1f	util: add helper ExecCommandWithTimeout function Added the ExecCommandWithTimeout helper function to execute commands with a timeout option. If the command does not return any response within the timeout, the process will be terminated and an error will be returned to the user. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-23 13:36:21 +00:00
Madhu Rajanna	9499e73b93	rbd: correct logging in createBackingImage after creating the rbd image log the image details corresponding for the request along with the request name. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-23 03:47:00 +00:00
Madhu Rajanna	549bfedc94	rbd: remove extra logging from createBackingImage we are already logging the rbd image details and the snapshot details after creating the clone. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-23 03:47:00 +00:00
Madhu Rajanna	8c9105f09e	rbd: remove extra getImageInfo API call as getImageInfo is already called inside cloneRbdImageFromSnapshot function right after creating the clone. remove the extra API call to get the details again. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-23 03:47:00 +00:00
Madhu Rajanna	ff91b7edbd	rbd: get image details after creating clone after creating the clone get the current image details like size, creationTime, imageFeatures etc from the ceph cluster. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-23 03:47:00 +00:00
Madhu Rajanna	edcb2b529b	rbd: move core fields to rbdImage struct moved ParentName, ParentPool and ImageFeatureSet fields to the rbdImage struct as these are the first citizens on the rbdImage. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-23 03:47:00 +00:00
Madhu Rajanna	c6b288779a	rbd: correct logging for clone log the rbdVolume and the rbdSnapshot after creating the clone from snapshot. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-23 03:47:00 +00:00
Madhu Rajanna	3169c8e23a	rbd: expand filesystem during NodeStageVolume If a volume with a bigger size is created from a snapshot or from another volume, we also need to expand the filesystem in the csidriver, as a nodeExpand request is not triggered in this case. During NodeStageVolume we can expand the filesystem after checking whether it needs expansion. If it is an encrypted device, check the device size of the rbd device and the LUKS device; if required, the device will be expanded before expanding the filesystem. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-23 03:47:00 +00:00
Madhu Rajanna	69ae19e0cb	rbd: resize the volume created from snapshot If the requested volume size is greater than the snapshot size, resize the cloned volume after creating a clone from a snapshot. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-23 03:47:00 +00:00
Madhu Rajanna	a28a4a4285	rbd: resize the volume created from volume If the requested volume size is greater than the parent volume size, resize the cloned volume after creating a final clone from a parent volume. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-23 03:47:00 +00:00
Madhu Rajanna	f7f662678a	rbd: consider ErrImageNotFound during DeleteSnapshot added a check to consider ErrImageNotFound error during DeleteSnapshot operation, if the error is ErrImageNotFound we need to ensure that image is removed from the trash and also the rados OMAP data is removed. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-23 03:47:00 +00:00
Madhu Rajanna	da60d221df	rbd: update size for rbdSnapshot struct we need actual size of the rbdVolume created for the snapshot, as we are not storing the size of the snapshot in OMAP we need to fetch the size from ceph cluster and update the same on rbdSnapshot struct. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-23 03:47:00 +00:00
Madhu Rajanna	6a82baf5d3	rbd: remove SizeBytes from rbdSnapshot struct as we are moving the VolSize to rbdImage struct we should reuse the same instead of maintaining one more field in rbdSnapshot struct. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-23 03:47:00 +00:00
Madhu Rajanna	b1a0bb4714	rbd: move VolSize to rbdImage struct move the Volsize to the rbdImage struct as size is more applicable for rbdImage as rbdImage is used for both rbdVolume and rbdSnapshot. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-23 03:47:00 +00:00
Madhu Rajanna	a0829e9e93	rbd: remove json tag from rbdVolume struct as we are no longer supporting the v1.x version of cephcsi. removing the json tag used to store rbd volume details in configmap. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-23 03:47:00 +00:00
Madhu Rajanna	124281519f	rbd: add RequestedVolSize to rbdVolume struct when doing the internal operation to get the latest details the rbd image size is also getting updated and this will update the volume size also without actual requested size we cannot do the resize operation for bigger clones. This commit adds a new field called RequestedVolSize to rbdVolume struct to hold the user requested size. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-23 03:47:00 +00:00
Madhu Rajanna	22365ab77f	cleanup: add cleanup helper for incorrect thick volume added a new helper function called cleanupThickClone to cleanup the snapshot and clone if the thick provisioning is not fully completed. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-23 03:47:00 +00:00
Madhu Rajanna	ca29328554	csi: remove size check when creating volume Remove the bigger size validation when creating a volume from a snapshot or when creating a clone from a volume, as we resize the volume after cloning. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-23 03:47:00 +00:00
Humble Chirammal	b9a8d37c3d	rbd: enable expand operation for intree volumes This commit enables the resize operation[1] for in-tree volumes. A new helper has been introduced to aid the enablement and keep it clean with the existing code base. [1] https://github.com/ceph/ceph-csi/blob/devel/docs/design/proposals/intree-migrate.md?plain=1#L66 Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-12-22 19:33:05 +00:00
Madhu Rajanna	810e285c50	rbd: reset dummy image id The dummy image rbdVolume struct is derived from the actual rbdVolume of the volumeID sent in the EnableVolumeReplication request, so the dummy rbdVolume struct contains the image id of the actual volume. Because of that, when we repair the dummy image, the image is sent to trash but not deleted due to the wrong image ID. Resetting the image id makes sure the image id is fetched from the ceph cluster and the same image id is used for the manager operation. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-21 17:39:07 +00:00
Humble Chirammal	b904c446d6	rbd: add kms unit test for key protect server Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-12-21 17:09:50 +00:00
Humble Chirammal	9200bc7a00	rbd: Implement Key Protect KMS integration for Ceph CSI This commit adds the support for HPCS/Key Protect IBM KMS service to Ceph CSI service. EncryptDEK() and DecryptDEK() of RBD volumes are done with the help of key protect KMS server by wrapping and unwrapping the DEK and by using the DEKStoreMetadata. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-12-21 17:09:50 +00:00
Madhu Rajanna	12e8e46bcf	revert: remove explicit size setting of cloned volume The ceph changes are done on both the server and the client side; this change is not enough to remove setting the size of cloned volumes. It caused regressions like #2719 #2720 #2721 #2722. This reverts commit `3565a342d5`. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-21 14:15:46 +00:00
Humble Chirammal	88911eb4e9	rbd: add migration secret support to controllerserver functions This commit adds the migration secret request validation to expand, create controller functions. Ref # https://github.com/ceph/ceph-csi/issues/2509 Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-12-20 07:34:43 +00:00
Niels de Vos	30333378ef	cleanup: add IsBlockMultiNode() helper IsBlockMultiNode() is a new helper that takes a slice of VolumeCapability objects and checks if it includes multi-node access and/or block-mode support. This can then easily be used in other services that need checking for these particular capabilities, and preventing multi-node block-mode access. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-12-17 07:31:55 +00:00
Madhu Rajanna	50d6ea825c	rbd: remove retrieving volumeHandle from PV annotation we have added clusterID mapping to identify the volumes in case of a failover in Disaster recovery in #1946. with #2314 we are moving to a configuration in configmap for clusterID and poolID mapping. and with #2314 we have all the required information to identify the image mappings. This commit removes the workaround implementation done in #1946. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-17 03:38:29 +00:00
Niels de Vos	203920d8f4	rbd: move driver component into the rbd/driver package The rbd package contains several functions that can be used by CSI-Addons Service implementations. Unfortunately it is not possible to do this, as the rbd-driver needs to import the csi-addons/rbd package to provide the CSI-Addons server. This causes a circular import when services use the rbd package: - rbd/driver.go import csi-addons/rbd - csi-addons/rbd import rbd (including the driver) By moving rbd/driver.go into its own package, the circular import can be prevented. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-12-10 07:35:26 +00:00
Niels de Vos	44d69502bc	rbd: export HexStringToInteger() HexStringToInteger() used to return a uint64, but everywhere else uint is used. Having HexStringToInteger() return a uint as well makes it a little easier to use when setting it with SetGlobalInt(). Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-12-10 07:35:26 +00:00
Niels de Vos	8b531f337e	rbd: add functions for initializing global variables When the rbd-driver starts, it initializes some global (yuck!) variables in the rbd package. Because the rbd-driver is moved out into its own package, these variables can not easily be set anymore. Introduce SetGlobalInt(), SetGlobalBool() and InitJournals() so that the rbd-driver can configure the rbd package. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-12-10 07:35:26 +00:00
Niels de Vos	3eeac3d36c	rbd: export RunVolumeHealer() so that rbd/driver can start it The rbd-driver calls rbd.runVolumeHealer() which is not available outside the rbd package. By moving the rbd-driver into its own package, RunVolumeHealer() needs to be exported. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-12-10 07:35:26 +00:00
Niels de Vos	5baf9811f9	rbd: export NodeServer.mounter outside of the rbd package NodeServer.mounter is internal to the NodeServer type, but it needs to be initialized by the rbd-driver. The rbd-driver is moved to its own package, so .Mounter needs to be available from there in order to set it. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-12-10 07:35:26 +00:00
Niels de Vos	8d09134125	rbd: export GenVolFromVolID() for consumption by csi-addons genVolFromVolID() is used by the CSI Controller service to create an rbdVolume object from a CSI volume_id. This function is useful for CSI-Addons Services as well, so rename it to GenVolFromVolID(). Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-12-10 07:35:26 +00:00
Niels de Vos	e76bffe353	cleanup: import k8s.io/mount-utils instead of k8s.io/utils/mount k8s.io/utils/mount has moved to k8s.io/mount-utils, and Ceph-CSI uses that already in most locations. Only internal/util/util.go still imports the old path. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-12-09 17:58:34 +00:00
Madhu Rajanna	8081ac8251	rbd: add new image features for dummy image The dummy image will be created with 1MiB size. During the snapshot transfer operation the 1MiB will be transferred even if the dummy image does not contain any data. Adding the new image features `fast-diff,layering,object-map,exclusive-lock` on the dummy image will ensure that only the diff is transferred to the remote cluster. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-07 17:34:14 +00:00
Madhu Rajanna	9a4533e549	rbd: create 1MiB size dummy image We added a workaround for rbd scheduling by creating a dummy image in #2656. With that fix we are creating a dummy image of the size of the first actual rbd image which is sent in the EnableVolumeReplication request; if the actual rbd image size is 1TiB, we are creating a dummy image of 1TiB, which is not good. Even though it is a thin-provisioned rbd image, this is causing issues for the transfer of the snapshot during the mirroring operation. This commit recreates the rbd image with 1MiB size, which is the smallest supported size in rbd. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-07 17:34:14 +00:00
Konstantin Shalygin	7411773f73	rbd: added RBD features support for krbd Added support for `object-map, fast-diff` Signed-off-by: Konstantin Shalygin <k0ste@k0ste.ru>	2021-12-07 07:38:24 +00:00
Madhu Rajanna	64ce5e0949	rbd: check local image state during promote operation rbd mirroring CLI calls are async and do not wait for the operation to be completed. ex:- `rbd mirror image enable` will enable the mirroring on the image but it doesn't ensure that the image is mirroring enabled and a healthy primary. The same goes for promoting the volume. This commit adds a check in PromoteVolume to make sure the image is in a healthy state i.e `up+stopped`. note:- not considering any intermediate states to make sure the image is completely healthy before responding success to the RPC call. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-01 20:19:05 +00:00
Prasanna Kumar Kalever	e7d8834149	rbd: enable journal based mirroring Journal-based RADOS block device mirroring ensures point-in-time consistent replicas of all changes to an image, including reads and writes, block device resizing, snapshots, clones, and flattening. Journaling-based mirroring records all modifications to an image in the order in which they occur. This ensures that a crash-consistent mirror of an image is available. When configured in journal mode, mirroring will utilize the RBD journaling image feature to replicate the image contents. If the RBD journaling image feature is not yet enabled on the image, it will be automatically enabled. Fixes: #2018 Co-authored-by: Madhu Rajanna <madhupr007@gmail.com> Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-12-01 14:12:30 +00:00
Niels de Vos	ab76459e87	rbd: implement CSI-Addons Identity Service Depending on the way Ceph-CSI is deployed, the capabilities will be configured for the GetCapabilities procedure. The other procedures are more straight-forward. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-12-01 06:31:09 +00:00
Niels de Vos	20727bd41a	cleanup: reduce complexity of rbd.Driver.Run() After adding the new CSI-Addons Server, golang-ci complains that driver.Run() is too complex. By moving the profiling checks and starting of the go-routines in their own function, golang-ci is happy again. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-11-30 11:48:40 +00:00
Niels de Vos	b3910f2b4a	rbd: enable CSI-Addons Server and Identity Service Add a new endpoint for the CSI-Addons Service and enable the Identity Service for the RBD plugin. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-11-30 11:48:40 +00:00
Niels de Vos	0f8bbaa217	rbd: add framework for CSI-Addons Identity Service Add a new CSI-Addons Server and empty Identity Service for the RBD plugin. The implementation of the Identity Service procedure calls will be done in other PRs. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-11-30 11:48:40 +00:00
Madhu Rajanna	f0b2ea6a6d	rbd: repair imageid after resync During resync operation the local image will get deleted and a new image is recreated by the rbd mirroring. The new image will have a new imageID. Once resync is completed update the imageID in the OMAP to get the image removed from the trash during DeleteVolume. Before resyncing ``` sh-4.4# rbd info replicapool/csi-vol-0c25bdd3-485f-11ec-bd30-0242ac110004 rbd image 'csi-vol-0c25bdd3-485f-11ec-bd30-0242ac110004': size 1 GiB in 256 objects order 22 (4 MiB objects) snapshot_count: 1 id: 1efcc6b7a769 block_name_prefix: rbd_data.1efcc6b7a769 format: 2 features: layering op_features: flags: create_timestamp: Thu Nov 18 11:02:40 2021 access_timestamp: Thu Nov 18 11:02:40 2021 modify_timestamp: Thu Nov 18 11:02:40 2021 mirroring state: enabled mirroring mode: snapshot mirroring global id: 9c4c236d-8a47-4779-b4f6-94e05da70dbd mirroring primary: true ``` ``` sh-4.4# rados listomapvals csi.volume.0c25bdd3-485f-11ec-bd30-0242ac110004 --pool=replicapool csi.imageid value (12 bytes) : 00000000 31 65 66 63 63 36 62 37 61 37 36 39 |1efcc6b7a769| 0000000c csi.imagename value (44 bytes) : 00000000 63 73 69 2d 76 6f 6c 2d 30 63 32 35 62 64 64 33 |csi-vol-0c25bdd3| 00000010 2d 34 38 35 66 2d 31 31 65 63 2d 62 64 33 30 2d |-485f-11ec-bd30-| 00000020 30 32 34 32 61 63 31 31 30 30 30 34 |0242ac110004| 0000002c csi.volname value (40 bytes) : 00000000 70 76 63 2d 32 36 38 39 33 66 30 38 2d 66 66 32 |pvc-26893f08-ff2| 00000010 62 2d 34 61 30 66 2d 61 35 63 33 2d 38 38 34 62 |b-4a0f-a5c3-884b| 00000020 37 32 30 66 66 62 32 63 |720ffb2c| 00000028 csi.volume.owner value (7 bytes) : 00000000 64 65 66 61 75 6c 74 |default| 00000007 ``` After Resyncing ``` sh-4.4# rbd info replicapool/csi-vol-0c25bdd3-485f-11ec-bd30-0242ac110004 rbd image 'csi-vol-0c25bdd3-485f-11ec-bd30-0242ac110004': size 1 GiB in 256 objects order 22 (4 MiB objects) snapshot_count: 1 id: 10b183a48a97 block_name_prefix: rbd_data.10b183a48a97 format: 2 features: layering, non-primary op_features: flags: create_timestamp: Thu Nov 18 11:09:39 2021 access_timestamp: Thu Nov 18 11:09:39 2021 modify_timestamp: Thu Nov 18 11:09:39 2021 mirroring state: enabled mirroring mode: snapshot mirroring global id: 9c4c236d-8a47-4779-b4f6-94e05da70dbd mirroring primary: false sh-4.4# rados listomapvals csi.volume.0c25bdd3-485f-11ec-bd30-0242ac110004 --pool=replicapool csi.imageid value (12 bytes) : 00000000 31 30 62 31 38 33 61 34 38 61 39 37 |10b183a48a97| 0000000c csi.imagename value (44 bytes) : 00000000 63 73 69 2d 76 6f 6c 2d 30 63 32 35 62 64 64 33 |csi-vol-0c25bdd3| 00000010 2d 34 38 35 66 2d 31 31 65 63 2d 62 64 33 30 2d |-485f-11ec-bd30-| 00000020 30 32 34 32 61 63 31 31 30 30 30 34 |0242ac110004| 0000002c csi.volname value (40 bytes) : 00000000 70 76 63 2d 32 36 38 39 33 66 30 38 2d 66 66 32 |pvc-26893f08-ff2| 00000010 62 2d 34 61 30 66 2d 61 35 63 33 2d 38 38 34 62 |b-4a0f-a5c3-884b| 00000020 37 32 30 66 66 62 32 63 |720ffb2c| 00000028 csi.volume.owner value (7 bytes) : 00000000 64 65 66 61 75 6c 74 |default| 00000007 ``` Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-11-25 09:22:13 +00:00
Madhu Rajanna	027b68ab39	rbd: operate on dummy image after adding scheduling Currently we are first operating on the dummy image to refresh the pool and then we are adding the scheduling. We think the scheduling should be added first and then we should refresh the pool. If we do this, all the existing schedules will be considered by the scheduler. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-11-23 11:04:42 +00:00
Madhu Rajanna	211ca9b5a7	rbd: do deep copy for dummyVol struct With a shallow copy of rbdVol to dummyVol, the image name update of the dummyVol is getting reflected on the rbdVol, which we don't want. Do a deep copy to avoid this problem. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-11-23 11:04:42 +00:00
Prasanna Kumar Kalever	bdcf3273b5	rbd: provide a way to supply mounter specific mapOptions from sc Uses the below schema to supply mounter specific map/unmapOptions to the nodeplugin based on the discussion we all had at https://github.com/ceph/ceph-csi/pull/2636 This should specifically be really helpful with `tryOtherMounters` set to true, i.e. with fallback mechanism settings turned ON. mapOption: "krbd:v1,v2,v3;nbd:v1,v2,v3" - By omitting `krbd:` or `nbd:`, the option(s) apply to rbdDefaultMounter which is krbd. - A user can _override_ the options for a mounter by specifying `krbd:` or `nbd:`. mapOption: "v1,v2,v3;nbd:v1,v2,v3" is effectively the same as the 1st example. - Sections are split by `;`. - If users want to specify common options for both `krbd` and `nbd`, they should mention them twice. Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-11-23 08:54:37 +00:00
Shyamsundar Ranganathan	d1c21eece9	rbd: Update sequence of operations on dummy mirror image The dummy mirror image needs to be disabled and then reenabled for mirroring, to ensure a newly promoted primary is now starting to schedule snapshots. Signed-off-by: Shyamsundar Ranganathan <srangana@redhat.com>	2021-11-19 09:38:59 +05:30
Madhu Rajanna	517ad8c644	rbd: use dummy image to workaround rbd scheduling bug Currently we have a bug in the rbd mirror scheduling module. After doing failover and failback, the scheduling is not getting updated and the mirroring snapshots are not getting created periodically as per the scheduling interval. This PR works around this by doing the operations below: * Create a dummy (unique) image per cluster; this image should be easily identified. * During a Promote operation on any image, enable the mirroring on the dummy image. When we enable the mirroring on the dummy image, the pool will get updated and the scheduling will be reconfigured. * During a Demote operation on any image, disable the mirroring on the dummy image. The disable needs to be done so that mirroring can be enabled again when we get the promote request to make the image primary. * When DR is no longer needed, this image needs to be cleaned up manually for now, as we don't want to add a check in the existing DeleteVolume code path to delete the dummy image, since it impacts the performance of the DeleteVolume workflow. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-11-19 09:38:59 +05:30
Madhu Rajanna	d05fc1e8e5	util: add helper to get the cluster ID added helper function to get the cluster ID. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-11-19 09:38:59 +05:30
Madhu Rajanna	e4e0f397a6	rbd: run schedule during promote operation Moved to add scheduling to the promote operation as scheduling need to be added when the image is promoted and this is the correct method of adding the scheduling to make the scheduling take place. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-11-19 09:38:59 +05:30
Madhu Rajanna	7bbd2ea284	rbd: use small case of error message the error message should not start with the capital letter changing the case as per the standard. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-11-18 10:44:12 +00:00
Madhu Rajanna	51998a5f4a	cleanup: log the image name and pool name instead of logging the volumeID and the pool name. log the poolname and image name for better debugging. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-11-18 10:44:12 +00:00
Madhu Rajanna	0f0cda49a7	rbd: log stdError for cryptsetup command If we hit any error while running the cryptsetup commands, we are logging only the error message. With only the error message it is difficult to analyze the problem; logging the stdError will help us check what the problem is. updates: #2610 Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-11-18 02:17:15 +00:00
Niels de Vos	7e22180125	rbd: call undoStagingTransaction() when NodeStageVolume() fails On line 341 a `transaction` is created. This is passed to the deferred `undoStagingTransaction()` function when an error in the `NodeStageVolume` procedure is detected. So far, so good. However, on line 356 a new `transaction` is returned. This new `transaction` is not used for the defer call. By removing the empty `transaction` that is used in the defer call, and calling `undoStagingTransaction()` on an error of `stageTransaction()`, the code is a little simpler, and the cleanup of the transaction should be done correctly now. Updates: #2610 Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-11-17 23:58:00 +00:00
Prasanna Kumar Kalever	e6fa392df1	rbd: fix mapOptions passing with rbd-nbd mounter This was a regression introduced by: https://github.com/ceph/ceph-csi/pull/2556 Fixes: #2610 Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-11-16 10:12:46 +00:00
Prasanna Kumar Kalever	50e9dfa5c5	cleanup: fix log level This log line is seen frequently in the logs and, based on its severity, it's better to be at Warning loglevel rather than Error: E1109 08:30:45.612395 38328 util.go:247] kernel 4.19.202 does not support required features Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-11-10 10:54:29 +00:00
Prasanna Kumar Kalever	3686b6da8b	rbd: utilize cookie support from rbd for nbd Problem: On remap/attach of device (i.e. nodeplugin restart), there is no way for rbd-nbd to defend if the backend storage is matching with the initial backend storage. Say, if an initial map request for backend "pool1/image1" got mapped to /dev/nbd0 and the userspace process is terminated (on nodeplugin restart). A next remap/attach (nodeplugin start) request within reattach-timeout is allowed to use /dev/nbd0 for a different backend "pool1/image2" For example, an operation like below could be dangerous: $ sudo rbd-nbd map --try-netlink rbd-pool/ext4-image /dev/nbd0 $ sudo blkid /dev/nbd0 /dev/nbd0: UUID="bfc444b4-64b1-418f-8b36-6e0d170cfc04" TYPE="ext4" $ sudo pkill -15 rbd-nbd <-- nodeplugin terminate $ sudo rbd-nbd attach --try-netlink --device /dev/nbd0 rbd-pool/xfs-image /dev/nbd0 $ sudo blkid /dev/nbd0 /dev/nbd0: UUID="d29bf343-6570-4069-a9ea-2fa156ced908" TYPE="xfs" Solution: rbd-nbd/kernel now provides a way to keep some metadata in sysfs to identify between the device and the backend, so that when a remap/attach request is made, rbd-nbd can compare and avoid such dangerous operations. With the provided solution, as part of the initial map request, backend cookie (ceph-csi VOLID) can be stored in the sysfs per device config, so that on a remap/attach request rbd-nbd will check and validate if the backend per device cookie matches with the initial map backend with the help of cookie. At Ceph-csi we use VOLID as device cookie, which will be unique, we pass the VOLID as cookie at map and use the same at the time of attach, that way rbd-nbd can identify backends and their matching devices. Requires: https://github.com/ceph/ceph/pull/41323 https://lkml.org/lkml/2021/4/29/274 Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-11-04 03:20:59 +00:00
Prasanna Kumar Kalever	793b22cf27	rbd: check for nbd cookie support Change checkRbdNbdTools() to setRbdNbdToolFeatures() Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-11-04 03:20:59 +00:00
Prasanna Kumar Kalever	9a3170bf77	rbd: provide a way to disable the auto fallback to nbd mounter This change allows the user to choose not to fallback to NBD mounter when some ImageFeatures are absent with krbd driver, rather just fail the NodeStage call. Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-11-01 08:17:36 +00:00
Prasanna Kumar Kalever	bfc24f6f12	cleanup: generalize the parseBool function Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-11-01 08:17:36 +00:00
Prasanna Kumar Kalever	84ec797dda	rbd: detect krbd features in runtime and fallback to nbd Currently, we recognize and warn for the provided image features based on our prior intelligence at ceph-csi (i.e based on supportedFeatures map and validateImageFeatures) at image/PV creation time. It might be very much possible that the cluster is heterogeneous i.e. the PV creation and application container might both be on different nodes with different kernel versions (krbd driver versions). This PR adds a mechanism to check for the supported krbd features during mount time, if the krbd driver doesn't have the specified image feature then it will fall back to rbd-nbd mounter. Fixes: #478 Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-11-01 08:17:36 +00:00
Niels de Vos	c852f487a5	util: set defaults for Vault config before converting When using UPPER_CASE formatting for the HashiCorp Vault KMS configuration, a missing `VAULT_DESTROY_KEYS` will cause the option to be set to "false". The default for the option is intended to be "true". This is a difference in behaviour between the `vaultDestroyKeys` and `VAULT_DESTROY_KEYS` options. Both should use a default of "true" when the configuration does not set the option explicitly. By setting the default options in the `standardVault` struct before unmarshalling the configuration in it, the default values will be retained for the missing configuration options. Reported-by: Rachael George <rgeorge@redhat.com> Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-10-28 14:41:53 +00:00
Humble Chirammal	6aec858cba	rbd: parse migration secret and set fields for nodestage operations This commit makes use of the migration request secret parsing and sets the required fields for further nodestage operations. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-10-27 18:35:00 +00:00
Humble Chirammal	5621f2cfca	rbd: split the parsing and deletion logic to its own functions. parseAndDeleteMigratedVolume() previously clubbed the logic of parsing the migration volume handle with the deletion of the volume. This commit splits this logic in two: parsing is done in parseMigrationVolID() and DeleteMigratedVolume() deletes the backend volume. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-10-27 18:35:00 +00:00
Humble Chirammal	ff0911fb6a	rbd: add unittests for IsMigrationSecret and ParseAndSetSecretMapFromMigSecret This commit adds unit tests for newly introduced migration specific functions. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-10-27 18:35:00 +00:00
Humble Chirammal	b49bf4b987	rbd: parse migration secret and set it for controller server operations This commit adds a couple of helper functions to parse the migration request secret and set it for further csi driver operations. More details: The intree secret has a data field called "key" which is the base64 admin secret key. The ceph CSI driver currently expects the secret to contain data field "UserKey" for the equivalent. The CSI driver also expects the "UserID" field which is not available in the in-tree secret by default. This missing userID will be filled (if the username differs from 'admin') in the migration secret as 'adminId' field in the migration request. This commit adds the logic to parse this migration secret as below: "key" field value will be picked up from the migration secret to "UserKey" field. "adminId" field value will be picked up from the migration secret to "UserID" field. If `adminId` field is nil or not set, `UserID` field will be filled with default value ie `admin`. The above logic gets activated only when the secret is a migration secret, otherwise skipped to the normal workflow as we have today. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-10-27 18:35:00 +00:00
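The field mapping described in this commit could look roughly like the sketch below. The function name `convertMigrationSecret` and the use of plain `map[string]string` are assumptions made for illustration; only the field names `key`, `adminId`, `UserKey`, `UserID` and the `admin` default come from the commit message.

```go
package main

import (
	"errors"
	"fmt"
)

// convertMigrationSecret is a hypothetical helper that maps an in-tree
// style secret to the fields the commit message says the CSI driver expects.
func convertMigrationSecret(in map[string]string) (map[string]string, error) {
	key := in["key"]
	if key == "" {
		return nil, errors.New(`migration secret has no "key" field`)
	}
	// "adminId" is optional; fall back to "admin" when it is missing or empty.
	userID := in["adminId"]
	if userID == "" {
		userID = "admin"
	}
	return map[string]string{"UserKey": key, "UserID": userID}, nil
}

func main() {
	out, err := convertMigrationSecret(map[string]string{"key": "c2VjcmV0"})
	fmt.Println(out["UserID"], out["UserKey"], err)
}
```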
Niels de Vos	b132696e54	rbd: note that thick-provisioning is deprecated Thick-provisioning was introduced to make accounting of assigned space for volumes easier. When thick-provisioned volumes are the only consumer of the Ceph cluster, this works fine. However, it is unlikely that this is the case. Instead, accounting of the requested (thin-provisioned) size of volumes is much more practical as different types of volumes can be tracked. OpenShift already provides cluster-wide quotas, which can combine accounting of requested volumes by grouping different StorageClasses. In addition to the difficult practise of allowing only thick-provisioned RBD backed volumes, the performance makes thick-provisioning troublesome. As volumes need to be completely allocated, data needs to be written to the volume. This can take a long time, depending on the size of the volume. Provisioning, cloning and snapshotting becomes very much noticeable, and because of the additional time consumption, more prone to failures. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-10-27 06:54:07 +00:00
Madhu Rajanna	0838845c6a	cleanup: remove FIXME from ResyncVolume as the complexity of ResyncVolume is reduced removing the FIXME which is not valid anymore. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-26 12:00:36 +00:00
Madhu Rajanna	2017b8c621	rbd: log mirror daemon state for replication log the mirror daemon state in the local and remote cluster for better debugging. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-26 12:00:36 +00:00
Madhu Rajanna	7472338334	rbd: remove unwanted const for comparing the image states use the states defined in go-ceph to avoid creating duplicate consts in cephcsi. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-26 12:00:36 +00:00
Madhu Rajanna	b92a6f5ccb	rbd: log the remote site details during resync logging the remote site details during resyncing for better debugging. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-26 12:00:36 +00:00
Madhu Rajanna	1fd2f28fee	rbd: check local image state for resyncing below are the local states of the mirrored image "unknown" -> If the image is in an error state means data is completely synced "error" -> If the image is in an error state means it needs resync "syncing" "starting_replay" "replaying" "stopping_replay" "stopped" If the resync is successfully started, the image will be in the "replaying" state. We can consider the "replaying" state to report that the resync is successfully in progress. We are discarding the intermediate states like "syncing", "starting_replay" and "stopping_replay". Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-26 12:00:36 +00:00
Rakshith R	12cd05a408	rbd: add EnsureImageCleanup to snapshot deletion Signed-off-by: Rakshith R <rar@redhat.com>	2021-10-20 18:25:31 +00:00
Rakshith R	1849076aab	rbd: add EnsureImageCleanup to ensure image cleanup from trash After moving the image to trash, if the `trash remove` step fails, then external-provisioner will issue subsequent requests, in which the image will be absent in the pool (it will be in trash) and omap cleanup will be done with a stale image left in trash with no `trash remove` step on it. To avoid this scenario, list trash images and find the corresponding id for the given image name and add a task to flatten when we encounter an ErrImageNotFound. Fixes: #1728 Signed-off-by: Rakshith R <rar@redhat.com>	2021-10-20 18:25:31 +00:00
Niels de Vos	6d3e25f069	util: NodeGetVolumeStatsResponse.Usage may not contain negative values Following the CSI specification, values that are included in the VolumeUsage MUST NOT be negative. However, CephFS seems to return -1 for the number of inodes that are available. Instead of returning a negative value, set it to 0 so that it will not get included in the encoded JSON response. Updates: #2579 See-also: `5b0d454015/spec.md (L2477-L2487)` Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-10-20 07:18:48 +00:00
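A minimal Go sketch of the clamping idea from this commit is shown below. The `usage` struct and both function names are hypothetical; the real response type is generated from the CSI protobuf definitions, and `omitempty` is used here only to imitate the way a zero value is left out of the encoded output.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// usage is a hypothetical stand-in for a volume usage entry.
type usage struct {
	Available int64 `json:"available,omitempty"`
	Total     int64 `json:"total,omitempty"`
}

// nonNegative returns 0 for negative inputs and the value itself otherwise.
func nonNegative(v int64) int64 {
	if v < 0 {
		return 0
	}
	return v
}

// encodeUsage clamps both values and returns the JSON encoding.
func encodeUsage(available, total int64) (string, error) {
	b, err := json.Marshal(usage{
		Available: nonNegative(available),
		Total:     nonNegative(total),
	})
	return string(b), err
}

func main() {
	// -1 mimics the inode count that the commit says CephFS may report.
	s, err := encodeUsage(-1, 100)
	fmt.Println(s, err)
}
```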
Madhu Rajanna	0d51f6d833	rbd: check local image description for split-brain In some corner case like `re-player shutdown` the local image will not be in error state. It would be also worth considering `description` field to make sure about split-brain. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-18 11:22:03 +00:00
Humble Chirammal	c584fa20da	rbd: use clusterID from volumeContext at nodestage previously we were retrieving clusterID using the monitors field in the volume context at node stage code path. However, it is possible to retrieve or use clusterID directly from the volume context. This commit also removes the getClusterIDFromMigrationVolume() function which was used previously and its tests Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-10-11 10:06:30 +00:00
Humble Chirammal	4e61156dc4	rbd: change iteration variable name in the migration test to be specific we reuse or overload the variable name in the test execution at present. This commit use a different variable name as initialized in each run Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-10-11 10:06:30 +00:00
Madhu Rajanna	90ecd2d7e8	rbd: use go-ceph to get mirroring info use go-ceph api to get image mirroring info. closes #2558 Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-07 08:02:06 +00:00
Madhu Rajanna	8ebc0659ab	rbd: perform resize of file system for static volume For a static volume, the user manually mounts an already existing image as a volume to the application pods. As it is an rbd image, if the PVC is of type fileSystem the image will be mapped, formatted and mounted on the node. If the user resizes the image on the ceph cluster, the user cannot automatically resize the filesystem created on the rbd image. Even if the user deletes and recreates the kubernetes objects, the new size will not be visible on the node. With this change, during the NodeStageVolumeRequest the nodeplugin will check the size of the mapped rbd image on the node using the devicePath, and also the rbd image size on the ceph cluster. If the sizes do not match, it will do the file system resize on the node as part of the NodeStageVolumeRequest RPC call. The user needs to do the below operations to see the new size * Resize the rbd image in ceph cluster * Scale down all the application pods using the static PVC. * Make sure no application pods which are using the static PVC are running on a node. * Scale up all the application pods. Validate the new size in the application pod's mounted volume. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-06 13:15:00 +00:00
Madhu Rajanna	fe9020260d	rbd: move flattening to helper function in NodeStage operation we are flattening the image to support mounting on the older clients. this commits moves it to a helper function to reduce code complexity. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-06 13:15:00 +00:00
Madhu Rajanna	cda2abca5d	rbd: use NewMetricsBlock to get size instead of lsblk command use NewMetricsBlock function from the kubernetes package to get the size. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-06 13:15:00 +00:00
Rakshith R	ded75eb099	rbd: copyEncryptionConfig for thickProvisioned snap restore too This commit adds bugfix to copy encryption passphrase for thick provisioned PVC restored from snapshot. Signed-off-by: Rakshith R <rar@redhat.com>	2021-10-05 07:46:57 +00:00
Rakshith R	59b7a26175	rbd: modify copyEncryptionConfig to accept copyOnlyPassphrase arg During PVC snapshot/clone both kms config and passphrase needs to copied, while for PVC restore only passphrase needs to be copied to dest rbdvol since destination storageclass may have another kms config. Signed-off-by: Rakshith R <rar@redhat.com>	2021-10-05 07:46:57 +00:00
Humble Chirammal	3c9d7e3cd5	rbd: detect migration volID in DeleteVolume() and delete rbd image This commit adds the logic to detect a passed in volumeID is a migrated volume ID and if yes, the driver connect to the backend cluster and clean/delete the image. The logic only applied if its a migration volume ID. The migration volume ID carry the information like mons, pool and image name which is good enough for the driver to identify and connect to the backend cluster for its operations. migration volID format: <mig>_mons-<monsHash>_image-<imageUID>_<poolHash> Details on the hash values: * MonsHash: this carry a hash value (md5sum) which will be acted as the `clusterID` for the operations in this context. * ImageUID: this is the unique UUID generated by kubernetes for the created volume. * PoolHash: this is an encoded string of pool name. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-10-04 16:06:31 +00:00
Madhu Rajanna	34a21cdbe3	cleanup: move mount functions to new pkg moved fuse and kernel mount functions to a new package. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-09-23 06:39:37 +00:00
Madhu Rajanna	b1ef842640	cleanup: move core functions to core pkg as we are refactoring the cephfs code, moving all the core functions to a new folder /pkg called core. This will make things easier to implement. From now onwards all the core functionalities will be added to the core package. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-09-23 06:39:37 +00:00
Humble Chirammal	4804f47b18	e2e: Add e2e for rbd migration static pvc This commit adds e2e for rbd migration static PVCs Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-09-20 09:54:54 +00:00
Humble Chirammal	2e8e8f5e64	rbd: fill clusterID if its a migration nodestage request the migration nodestage request does not carry the 'clusterID' in it and only monitors are available with the volumeContext. The volume context flag 'migration=true' and 'static=true' flags allow us to fill 'clusterID' from the passed in monitors to the volume Context,so that rest of the static operations on nodestage can be proceeded as we do treat static volumes today. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-09-20 09:54:54 +00:00
Humble Chirammal	1f5963919f	util: get clusterID for the passed in mon string as part of migration support, the clusterID has to be fetched from passed in mon. Because the intree RBD storage class only got monitor and not `clusterID` parameter support. However, in CSI, SC has the `clusterID` parameter support but not mon. Due to that we have to fetch the clusterID from config file for the passed in mon and use it in our operations. This adds a helper function to retrieve clusterID from passed in mon string. Updates https://github.com/ceph/ceph-csi/issues/2509 Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-09-20 09:54:54 +00:00
Prasanna Kumar Kalever	c9cc36d8db	rbd: provide alternatives to preserve the ceph log files Currently, we delete the ceph client log file on unmap/detach. This patch provides additional alternatives for users who would like to persist the log files. Strategies: ----------- `remove`: delete log file on unmap/detach `compress`: compress the log file to gzip on unmap/detach `preserve`: preserve the log file in text format Note that the default strategy will be remove on unmap, and these options can be tweaked from the storage class Compression size details example: On Map: (with debug-rbd=20) --------- $ ls -lh -rw-r--r-- 1 root root 526K Sep 1 18:15 rbd-nbd-0001-0024-fed5480a-f00f-417a-a51d-31d8a8144c03-0000000000000003-d2e89c87-0b4d-11ec-8ea6-160f128e682d.log On unmap: --------- $ ls -lh -rw-r--r-- 1 root root 33K Sep 1 18:15 rbd-nbd-0001-0024-fed5480a-f00f-417a-a51d-31d8a8144c03-0000000000000003-d2e89c87-0b4d-11ec-8ea6-160f128e682d.gz Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-09-16 13:55:15 +00:00
Prasanna Kumar Kalever	10bbb049f7	cleanup: passing pointers to larger type Log: internal/rbd/rbd_attach.go:424:2: hugeParam: dArgs is heavy (88 bytes); consider passing it by pointer (gocritic) Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-09-16 13:55:15 +00:00
Prasanna Kumar Kalever	ad2c6d2851	util: add gzip helper function Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-09-16 13:55:15 +00:00
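A gzip helper of the kind this commit mentions can be written with the Go standard library alone. The sketch below is an assumption about its general shape, not the ceph-csi implementation; `gzipFile` and `roundTrip` are hypothetical names, and the file names are made up for the demo.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// gzipFile compresses the file at src into a new gzip file at dst.
func gzipFile(src, dst string) error {
	in, err := os.Open(src)
	if err != nil {
		return err
	}
	defer in.Close()

	out, err := os.Create(dst)
	if err != nil {
		return err
	}
	defer out.Close()

	zw := gzip.NewWriter(out)
	if _, err := io.Copy(zw, in); err != nil {
		zw.Close()
		return err
	}
	// Close flushes buffered data and writes the gzip footer.
	return zw.Close()
}

// roundTrip writes data to a temporary log file, compresses it with
// gzipFile and returns the decompressed content for comparison.
func roundTrip(data []byte) ([]byte, error) {
	dir, err := os.MkdirTemp("", "gzip-demo")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)

	src := filepath.Join(dir, "client.log")
	dst := filepath.Join(dir, "client.gz")
	if err := os.WriteFile(src, data, 0o600); err != nil {
		return nil, err
	}
	if err := gzipFile(src, dst); err != nil {
		return nil, err
	}

	f, err := os.Open(dst)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	zr, err := gzip.NewReader(f)
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}

func main() {
	in := bytes.Repeat([]byte("rbd-nbd debug line\n"), 100)
	out, err := roundTrip(in)
	fmt.Println(bytes.Equal(in, out), err)
}
```

Closing the gzip writer explicitly, rather than only deferring it, lets the caller see a failed flush as an error.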
Shyamsundar Ranganathan	47dc9cf28d	rbd: Report errors when a resync maybe in progress Currently we return a !ready status if an image is not found when a replication resync is issued. We also return a !ready just post issuing a resync. The change is to ensure we return errors in these cases for the caller to retry the operation till we can determine we are actually resyncing, and then return !ready with nil errors. Part of addressing: https://github.com/csi-addons/volume-replication-operator/issues/101 Signed-off-by: Shyamsundar Ranganathan <srangana@redhat.com>	2021-09-15 15:59:22 +00:00
Rakshith R	82d09d81cf	util: modify GetMonsAndClusterID() to take clusterID instead of options This commit: - modifies GetMonsAndClusterID() to take clusterID instead of options. - moves out validation of clusterID is set or not out of GetMonsAndClusterID(). - defines ErrClusterIDNotSet new error for reusability. - add GetClusterID() to obtain clusterID from options. Signed-off-by: Rakshith R <rar@redhat.com>	2021-09-14 08:39:57 +00:00
Rakshith R	9d1e98ca60	rbd: check for clusterid mapping in genVolFromVolumeOptions() This commit adds capability to genVolFromVolumeOptions() to fetch mapped cluster-id & mon ips for mirrored PVC on secondary cluster which may have different cluster-id. This is required for NodeStageVolume(). We also don't need to check for mapping during volume create requests, so it can be disabled by passing a bool checkClusterIDMapping as false. GetMonsAndClusterID() is modified to accept bool checkClusterIDMapping based on which clustermapping is checked to fetch mapped cluster-id and mon-ips. Signed-off-by: Rakshith R <rar@redhat.com>	2021-09-14 08:39:57 +00:00
Humble Chirammal	4be53a27d3	cleanup: replace parentName to snapParentName in checkReservation at present, even though checkReservation works for both volume and snapshot, the arg parentName makes sense only for snapshot cases. Renaming that arg to something more appropriate Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-09-14 05:32:54 +00:00
Humble Chirammal	1fee3ec460	cleanup: correct checkReservation return description it wrongly mention that the return is imageUUID string where actually it is the imageData struct Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-09-14 05:32:54 +00:00
Rakshith R	0a7a7f4866	util: call WriteCephConfig() in cephcsi.go This commit calls WriteCephConfig() in cephcsi.go to create ceph.conf and keyring if it is not mounted to be used by all cli calls and conn cmds. Before this change, rbd-controller/omap-generator did not create ceph.conf on startup. Signed-off-by: Rakshith R <rar@redhat.com>	2021-09-08 16:05:27 +00:00
Madhu Rajanna	8c8f34cf7a	rbd: set vaultAuthNamespace to vaultNamespace if empty When we read the csi-kms-connection-details configmap vaultAuthNamespace might not be set when we do the conversion the vaultAuthNamespace might be set to empty key and this commits check for the empty value of vaultAuthNamespace and set the vaultAuthNamespace to vaultNamespace. setting empty value for vaultAuthNamespace happened due to Marshalling at https://github.com/ceph/ceph-csi/blob/devel/ internal/kms/vault_tokens.go#L136-L139. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-09-08 11:18:03 +00:00
Rakshith R	e99dd3dea4	util: read ceph.conf by calling conn.ReadConfigFile(CephConfigPath) The configuration in ceph.conf is not picked up by the rados connection automatically, hence we need to call conn.ReadConfigFile before calling Connect(). Signed-off-by: Rakshith R <rar@redhat.com>	2021-09-07 16:50:12 +00:00
Madhu Rajanna	76f1b42498	cephfs: correct comment for validateExpandVolumeRequest corrected the function comment for validateExpandVolumeRequest. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-09-07 14:33:02 +00:00
Madhu Rajanna	9fd51d9bec	cephfs: add comment for validateCreateVolumeRequest added function comment for validateCreateVolumeRequest Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-09-07 14:33:02 +00:00
Madhu Rajanna	8caeb409bb	cephfs: add comment for validateDeleteVolumeRequest added function comment for the validateDeleteVolumeRequest function. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-09-07 14:33:02 +00:00
Madhu Rajanna	be7749c90e	cleanup: move volumeID to the volumeoptions volumeID can be moved to the volumeOptions as most of the volume related helper functions are available on the volumeoptions.go Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-09-07 14:33:02 +00:00
Madhu Rajanna	da70ed50dc	cleanup: move execCommandErr to volumemounter Moved execCommandErr to the volumemounter.go which is the only caller of this function and moving the execCommandErr helps in reducing the util file. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-09-07 14:33:02 +00:00
Madhu Rajanna	31696a6ce0	cleanup: move genSnapFromOptions to volumeoptions moved genSnapFromOptions function to volumeoptions.go which is more appropriated than util. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-09-07 14:33:02 +00:00
Madhu Rajanna	73e2ffe8b8	cleanup: move cephfs csi spec validation to validator moved the cephfs related validation like validating the input parameters sent in the GRPC request to a new file. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-09-07 14:33:02 +00:00
Humble Chirammal	4efcc5bf97	cleanup: simplify checkStaticVolume function and remove unwanted vars checkStaticVolume() in the reconcilePV function has been unwantedly introducing variables to confirm the pv spec is static or not. This patch simplify it and make a smaller footprint of the functions. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-09-07 12:51:30 +00:00
Humble Chirammal	df2d9548ae	cephfs: no need to check for zero volume size At present there is a 'todo' to check for zero volume size in the createVolume request which is unwanted, ie the pvc creation with size 0 fails from the kubernetes api validation itself: For ex: `..spec.resources[storage]: Invalid value: "0": must be greater than zero` so we don't need any extra check in the controller server Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-09-07 04:49:24 +00:00
Prasanna Kumar Kalever	9e55f015de	rbd: avoid supplying map options on unmap Thanks to the random unmap failure on my local machine: I0901 17:08:37.841890 2617035 cephcmds.go:55] ID: 11 Req-ID: 0001-0024-fed5480a-f00f-417a-a51d-31d8a8144c03-0000000000000003-024983f3-0b47-11ec-8fcb-e671f0b9f58e an error (exit status 22) occurred while running rbd args: [unmap rbd-pool/csi-vol-024983f3-0b47-11ec-8fcb-e671f0b9f58e --device-type nbd --options try-netlink --options reattach-timeout=300 --options io-timeout=0] Noticed the map args are also getting passed to/as unmap args, which is not correct. We have separate things for mapOptions and unmapOptions. This PR makes sure that the map args are not passed at the time of unmap. Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-09-06 15:59:30 +00:00
Humble Chirammal	3f31ca8a3a	cleanup: introduce populateVolOptions(), to fill rbdVol from stage req At present the nodeStageVolume() handles much of the logic of filling the rbdvol struct based on the request received and this method is complex to follow. With this patch, filling or populating volOptions has been segregated and handled separately, hence making the stage functions' job easy. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-09-06 07:49:03 +00:00
Humble Chirammal	f0b8a3f626	rbd: use String() method of MirrorImageState in return error MirrorImageState (type C.rbd_mirror_image_state_t) has a string method which can be used while returning error in the replication controller. Previously, we were using int return in the error which is not the proper usage. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-09-03 16:02:53 +00:00
Madhu Rajanna	4865061ab9	util: create ceph configuration files if not present create ceph.conf and keyring files if they are not present in the /etc/ceph/ path. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-09-03 14:14:43 +00:00
Humble Chirammal	1d94c12cd6	cleanup: add checkErrAndUndoReserve() for error check,unreserve omap all the error check scenarios of genVolFromVolID() and unreserving omap entries based on the error made the deleteVolume method complex. This patch creates a new function which handles the error check and unreserving omap entries accordingly and finally returns the response to deletevolume/caller. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-09-03 12:20:04 +00:00
Niels de Vos	60c2afbcca	util: NewK8sClient() should not panic on non-Kubernetes clusters When NewK8sClient() detects an error, it used to call FatalLogMsg() which causes a panic. There are additional features that can be used on Kubernetes clusters, but these are not a requirement for most functionalities of the driver. Instead of causing a panic, returning an error should suffice. This allows using the driver on non-Kubernetes clusters again. Fixes: #2452 Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-09-02 11:22:14 +00:00
Humble Chirammal	247795517f	cephfs: remove explicit size setting of cloned volume CephFS csi driver explicitly sets the size of the cloned volume to the size of the parent volume as cephfs mgr was lacking this functionality previously. However it has been addressed in cephfs so we don't need explicit size setting. Ref#https://tracker.ceph.com/issues/46163 Supported Ceph releases: Ceph versions equal or above - v16.0.0, v15.2.9, v14.2.12 Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-09-01 09:32:29 +00:00
Madhu Rajanna	b383af20b4	cleanup: move cephfs errors to new util package As part of the refactoring, moving the cephfs errors file to a new package. Updates: #852 Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-09-01 06:50:16 +00:00
Rakshith R	99168dc822	rbd: check for clusterid mapping in RegenerateJournal() This commit adds fetchMappedClusterIDAndMons() which returns monitors and clusterID info after checking cluster mapping info. This is required for regenerating omap entries in mirrored cluster with different clusterID. Signed-off-by: Rakshith R <rar@redhat.com>	2021-08-31 14:30:06 +00:00
Rakshith R	496bcba85c	rbd: move GetMappedID() to util package This commit moves getMappedID() from rbd to util package since it is not rbd specific and exports it from there. Signed-off-by: Rakshith R <rar@redhat.com>	2021-08-31 14:30:06 +00:00
Niels de Vos	4a3b1181ce	cleanup: move KMS functionality into its own package A new "internal/kms" package is introduced, it holds the API that can be consumed by the RBD components. The KMS providers are currently in the same package as the API. With later follow-up changes the providers will be placed in their own sub-package. Because of the name of the package "kms", the types, functions and structs inside the package should not be prefixed with KMS anymore: internal/kms/kms.go:213:6: type name will be used as kms.KMSInitializerArgs by other packages, and that stutters; consider calling this InitializerArgs (golint) Updates: #852 Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-08-30 16:31:40 +00:00
Niels de Vos	778b5e86de	cleanup: move k8s functions to the util/k8s package By placing the NewK8sClient() function in its own package, the KMS API can be split from the "internal/util" package. Some of the KMS providers use the NewK8sClient() function, and this causes circular dependencies between "internal/utils" -> "internal/kms" -> "internal/utils", which are not allowed in Go. Updates: #852 Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-08-30 16:31:40 +00:00
Humble Chirammal	8ea495ab81	rbd: skip volumeattachment processing if pv marked for deletion if the volumeattachment has been fetched but marked for deletion the nbd healer dont want to process further on this pv. This patch adds a check for pv is marked for deletion and if so, make the healer skip processing the same Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-08-26 15:04:19 +00:00
Niels de Vos	6d00b39886	cleanup: move log functions to new internal/util/log package Moving the log functions into its own internal/util/log package makes it possible to split out the humongous internal/util packages in further smaller pieces. This reduces the inter-dependencies between utility functions and components, preventing circular dependencies which are not allowed in Go. Updates: #852 Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-08-26 09:34:05 +00:00
Niels de Vos	68588dc7df	util: fix unit-test for GetClusterMappingInfo() Unit-testing often fails due to a race condition while writing the clusterMappingConfigFile from multiple go-routines at the same time. Failures from `make containerized-test` look like this: === CONT TestGetClusterMappingInfo/site2-storage_cluster-id_mapping cluster_mapping_test.go:153: GetClusterMappingInfo() = <nil>, expected data &[{map[site1-storage:site2-storage] [map[1:3]] [map[11:5]]} {map[site3-storage:site2-storage] [map[8:3]] [map[10:5]]}] === CONT TestGetClusterMappingInfo/site3-storage_cluster-id_mapping cluster_mapping_test.go:153: GetClusterMappingInfo() = <nil>, expected data &[{map[site3-storage:site2-storage] [map[8:3]] [map[10:5]]}] --- FAIL: TestGetClusterMappingInfo (0.01s) --- PASS: TestGetClusterMappingInfo/mapping_file_not_found (0.00s) --- PASS: TestGetClusterMappingInfo/mapping_file_found_with_empty_data (0.00s) --- PASS: TestGetClusterMappingInfo/cluster-id_mapping_not_found (0.00s) --- FAIL: TestGetClusterMappingInfo/site2-storage_cluster-id_mapping (0.00s) --- FAIL: TestGetClusterMappingInfo/site3-storage_cluster-id_mapping (0.00s) --- PASS: TestGetClusterMappingInfo/site1-storage_cluster-id_mapping (0.00s) By splitting the public GetClusterMappingInfo() function into an internal getClusterMappingInfo() that takes a filename, unit-testing can use different files for each go-routine, and testing becomes more predictable. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-08-25 16:08:48 +00:00
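The split described in this commit (a public function that reads a fixed file, backed by an internal function that takes the filename) might be sketched as below. All names, the default path and the flat `map[string]string` format are hypothetical simplifications; the real mapping data in ceph-csi has a richer structure.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// defaultMappingFile is a hypothetical path used only in this sketch.
const defaultMappingFile = "/tmp/example/cluster-mapping.json"

// readMapping is the internal variant. Taking the filename as an argument
// lets every test (and every goroutine) work on its own file.
func readMapping(filename string) (map[string]string, error) {
	data, err := os.ReadFile(filename)
	if errors.Is(err, os.ErrNotExist) {
		return nil, nil // in this sketch a missing file is not an error
	}
	if err != nil {
		return nil, err
	}
	mapping := map[string]string{}
	if err := json.Unmarshal(data, &mapping); err != nil {
		return nil, err
	}
	return mapping, nil
}

// ReadMapping is the public wrapper that always reads the default file.
func ReadMapping() (map[string]string, error) {
	return readMapping(defaultMappingFile)
}

// demo writes a mapping into a private temporary file and reads it back.
func demo() (map[string]string, error) {
	dir, err := os.MkdirTemp("", "mapping-demo")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)

	file := filepath.Join(dir, "site1.json")
	content := []byte(`{"site1-storage":"site2-storage"}`)
	if err := os.WriteFile(file, content, 0o600); err != nil {
		return nil, err
	}
	return readMapping(file)
}

func main() {
	m, err := demo()
	fmt.Println(m, err)
}
```

Because no two callers share a file, parallel subtests no longer race on a single path.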
Prasanna Kumar Kalever	4f40213d8e	rbd: fix rbd-nbd io-timeout to never abort With the tests at CI, it kind of looks like that the IO is timing out after 30 seconds (default with rbd-nbd). Since we have tweaked reattach-timeout to 300 seconds at ceph-csi, we need to explicitly set io-timeout on the device too, as it doesn't make any sense to keep io-timeout < reattach-timeout Hence we set io-timeout for rbd nbd to 0. Specifying io-timeout 0 tells the nbd driver to not abort the request and instead see if it can be restarted on another socket. Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com> Suggested-by: Ilya Dryomov <idryomov@redhat.com>	2021-08-24 17:09:09 +00:00
Prasanna Kumar Kalever	3bf17ade7a	doc: update code comments about available timeout options Adding some code comments to make them readable and easy to understand. Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-08-24 17:09:09 +00:00
Prasanna Kumar Kalever	ea3def0db2	rbd: remove per volume rbd-nbd logfiles on detach - Update the meta stash with logDir details - Use the same to remove logfile on unstage/unmap to be space efficient Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-08-24 07:15:30 +00:00
Prasanna Kumar Kalever	d67e88ccd0	cleanup: embed args into struct and pass it to detachRBDImageOrDeviceSpec Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-08-24 07:15:30 +00:00
Prasanna Kumar Kalever	474100c1f1	rbd: add a unit test for getCephClientLogFileName() Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-08-24 07:15:30 +00:00
Prasanna Kumar Kalever	682b3a980b	rbd: rbd-nbd logging the ceph-CSI way - One logfile per device/volume - Add ability to customize the logdir, default: /var/log/ceph Note: if user customizes the hostpath to something else other than default /var/log/ceph, then it is his responsibility to update the `cephLogDir` in storageclass to reflect the same with daemon: ``` cephLogDir: "/var/log/mynewpath" ``` Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-08-24 07:15:30 +00:00
Humble Chirammal	9ac1391d0f	util: correct interface name and remove redundancy ContollerManager had a typo in it, and if we correct it, linter will fail and suggest not to use controller.ControllerManager as the interface name and package name is redundant, keeping manager as the interface name which is the practice and also address the linter issues. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-08-19 04:19:42 +00:00
Humble Chirammal	edf511a833	cephfs: make use of subvolumeInfo.state to determine quota https://github.com/ceph/go-ceph/pull/455/ added `state` field to subvolume info struct which helps to identify the snapshot retention state in the caller. This patch make use of the same Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-08-18 04:50:46 +00:00
Humble Chirammal	66fa5891b2	cephfs: correct typos in cephfs driver code Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-08-18 04:50:46 +00:00
Humble Chirammal	5089a4ce5d	doc: correct some source code comments in rbd driver code Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-08-17 06:57:09 +00:00
Madhu Rajanna	5562e46d0f	rbd: Cleanup OMAP data for secondary image If the image is in a secondary state and its up+replaying means its an healthy secondary and the image is primary somewhere in the remote cluster and the local image is getting replayed. Delete the OMAP data generated as we cannot delete the secondary image. When the image on the primary cluster gets deleted/mirroring disabled, the image on all the remote (secondary) clusters will get auto-deleted. This helps in garbage collecting the OMAP, PVC and PV objects after failback operation. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-08-16 17:38:25 +00:00
Madhu Rajanna	fc0d6f6b8b	rbd: return success if image is healthy secondary If the image is in secondary state and it is up+replaying, it is a healthy secondary: the image is primary somewhere in the remote cluster and the local image is getting replayed. Return success for disabling mirroring as we cannot disable the mirroring in the secondary state. When the image on the remote site gets disabled, the image on all the remote (secondary) clusters will get auto deleted. This helps in garbage collecting the volume replication kubernetes artifacts Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-08-16 17:38:25 +00:00
Madhu Rajanna	35324b2e17	rbd: add helper function to get local state added helper function to check the local image state is up+replaying. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-08-16 17:38:25 +00:00
Humble Chirammal	87beaac25b	rbd: add ReadWriteOncePod in accessModeStrToInt() conversion function Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-08-12 09:55:50 +00:00
Rakshith R	f05ac2b25d	rbd: extract kmsID from volumeAttributes in RegenerateJournal() This commit adds functionality of extracting encryption kmsID, owner from volumeAttributes in RegenerateJournal() and adds utility functions ParseEncryptionOpts and FetchEncryptionKMSID. Signed-off-by: Rakshith R <rar@redhat.com>	2021-08-10 09:17:59 +00:00
Rakshith R	b960e3633a	rbd: extract volumeNamePrefix in RegenerateJournal() Signed-off-by: Rakshith R <rar@redhat.com>	2021-08-10 09:17:59 +00:00
Rakshith R	b9b4b1e34e	rbd: refactor RegenerateJournal() to take in volumeAttributes This commit refactors RegenerateJournal() to take in volumeAttributes map[string]string as an argument so it can extract the required attributes internally. Signed-off-by: Rakshith R <rar@redhat.com>	2021-08-10 09:17:59 +00:00
Rakshith R	39d6752fc1	rbd: use `CSIInstanceID` var instead of "default" in RegenerateJournal() Signed-off-by: Rakshith R <rar@redhat.com>	2021-08-10 09:17:59 +00:00
Ben Ye	9cd8326bb2	cleanup: allocate slice with known size As the input capabilities size is known, it is better to allocate slice with a specified size. Signed-off-by: Ben Ye <ben.ye@bytedance.com>	2021-08-10 05:39:44 +00:00
Madhu Rajanna	6cc37f0a17	cleanup: use different file name for testing For clusterMappingConfigFile using different file name so that multiple unit test cases can work without any data race. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-08-09 13:37:25 +00:00
Madhu Rajanna	3c85219962	rbd: consider empty mirroring mode consider the empty mirroring mode when validating the snapshot interval and the scheduling time. Even if the mirroring Mode is not set validate the snapshot scheduling details as cephcsi sets the mirroring mode to default snapshot. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-08-09 11:05:05 +00:00
Rakshith R	825211730c	rbd: fix snapshot id idempotency issue This commit fixes snapshot id idempotency issue by always returning an error when flattening is in progress and not using `readyToUse:false` response. Signed-off-by: Rakshith R <rar@redhat.com>	2021-08-09 07:28:43 +00:00
Rakshith R	859d696279	cleanup: refactor checkCloneImage to reduce nested ifs This commit refactors the checkCloneImage function to address a nestif linter issue. Signed-off-by: Rakshith R <rar@redhat.com>	2021-08-09 07:28:43 +00:00
Madhu Rajanna	a5a8952716	rbd: fix clone problem This commit fixes a bug in checkCloneImage() which was caused by checking the cloned image before checking the temp-clone image snap in a subsequent request, which led to stale images. This was solved by checking the temp-clone image snap and flattening the temp-clone if needed. This commit also fixes a comparison bug in flattenCloneImage(). Signed-off-by: Rakshith R <rar@redhat.com> Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-08-09 07:28:43 +00:00
Madhu Rajanna	916c97b4a8	rbd: copy creds when copying the connection The rbd flatten function is a CLI call that expects the creds as input, so copying the creds is required when we generate the temp clone image. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-08-09 07:28:43 +00:00
Rakshith R	08728b631b	rbd: fix vol.VolID in cloneFromSnapshot() Volume generated from snap using genrateVolFromSnap already copies volume ID correctly, therefore removing `vol.VolID = rbdVol.VolID` which wrongly copies parent Volume ID instead leading to error from copyEncryption() on parent and clone volume ID being equal. Signed-off-by: Rakshith R <rar@redhat.com>	2021-08-09 07:28:43 +00:00
Niels de Vos	b5d2321d57	cleanup: use vaultDefaultCAVerify to set default value Golang-ci complains about the following: internal/util/vault_tokens.go:99:20: string `true` has 4 occurrences, but such constant `vaultDefaultDestroyKeys` already exists (goconst) v.VaultCAVerify = "true" ^ This occurrence of "true" can be replaced by vaultDefaultCAVerify to address the warning. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-08-06 12:19:18 +00:00
Niels de Vos	f584db41e6	util: add vaultDestroyKeys option to destroy Vault kv-v2 secrets Hashicorp Vault does not completely remove the secrets in a kv-v2 backend when the keys are deleted. The metadata of the keys will be kept, and it is possible to recover the contents of the keys afterwards. With the new `vaultDestroyKeys` configuration parameter, this behaviour can now be selected. By default the parameter will be set to `true`, indicating that the keys and contents should completely be destroyed. Setting it to any other value will make it possible to recover the deleted keys. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-08-06 12:19:18 +00:00
Madhu Rajanna	2782878ea2	rbd: log LastUpdate in UTC format This Commit converts the LastUpdate from int to the UTC format and logs it for better debugging. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-08-06 10:18:51 +00:00
Madhu Rajanna	92ad2ceec9	rbd: read clusterID and PoolID from mapping Whenever Ceph-CSI receives a CSI/Replication request, it will first decode the volumeHandle and try to get the required OMAP details. If it is not able to retrieve them and receives a `Not Found` error message, Ceph-CSI will check for the clusterID mapping. If the old volumeID `0001-00013-site1-storage-0000000000000001-b0285c97-a0ce-11eb-8c66-0242ac110002` contains `site1-storage` as the clusterID, Ceph-CSI will now look for the corresponding clusterID `site2-storage` from the above configmap. If the clusterID mapping is found, Ceph-CSI will look for the poolID mapping, i.e. the mapping between `1` and `2`. Example: a pool with the same name exists on both clusters with different IDs, Replicapool with ID `1` on site1 and Replicapool with ID `2` on site2. After getting the required mapping, Ceph-CSI has the information it needs to get more details from the rados OMAP. If there are multiple clusterID mappings, it will loop through all of them and check the corresponding pool to get the OMAP data. If the clusterID mapping does not exist, Ceph-CSI will return a `Not Found` error message to the caller. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-08-05 16:07:51 +00:00
Madhu Rajanna	ac11d71e19	util: add helper function to read clusterID mapping Added a helper function to read the clusterID mapping from the mounted file. The clusterID mapping contains the mappings below: * ClusterID mappings (the cluster to which we are failing over and the cluster from which the failover happened) * RBD PoolID mapping between the clusters. * CephFS FscID mapping between the clusters. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-08-05 16:07:51 +00:00
Yug Gupta	1dc032e554	doc: update comments in voljournal Update spell errors and comments in voljournal.go Signed-off-by: Yug Gupta <yuggupta27@gmail.com>	2021-08-05 08:11:15 +00:00
Niels de Vos	4859f2dfdb	util: allow configuring VAULT_AUTH_MOUNT_PATH for Vault Tenant SA KMS The VAULT_AUTH_MOUNT_PATH is a Vault configuration parameter that allows a user to set a non default path for the Kubernetes ServiceAccount integration. This can already be configured for the Vault KMS, and is now added to the Vault Tenant SA KMS as well. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-08-05 06:02:57 +00:00
Niels de Vos	f2d5c2e0df	util: add vaultAuthNamespace option for Vault KMS The new `vaultAuthNamespace` configuration parameter can be set to the Vault Namespace where the authentication is set up in the service. Some Hashicorp Vault deployments use sub-namespaces for their users/tenants, with a 'root' namespace where the authentication is configured. This requires passing different Vault namespaces for different operations. Example: - the Kubernetes Auth mechanism is configured in the Vault Namespace called 'devops' - a user/tenant has a sub-namespace called 'devops/website' where the encryption passphrases can be placed in the key-value store The configuration for this then looks like: vaultAuthNamespace: devops vaultNamespace: devops/homepage Note that Vault Namespaces are a feature of the Hashicorp Vault Enterprise product, and not part of the Open Source version. This prevents adding e2e tests that validate the Vault Namespace configuration. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-08-04 18:20:45 +00:00
Niels de Vos	83167e2ac5	util: correct error message when connecting to Vault fails Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-08-04 18:20:45 +00:00
Alexandre Lossent	5cba04c470	cephfs: support selinux mount options - mount host's /etc/selinux in node plugins - process mount options in all code paths for cephfs volume options Signed-off-by: Alexandre Lossent <alexandre.lossent@cern.ch>	2021-08-04 12:59:34 +00:00
Artur Troian	16ec97d8f7	util: getCgroupPidsFile produces stripped path when extra : present This commit uses `strings.SplitN` instead of `strings.Split`. The path for pids.max has extra `:` symbols in it, due to which getCgroupPidsFile() splits the string into 5 tokens instead of 3, leading to loss of part of the path. As a result, the below error is reported: `Failed to get the PID limit, can not reconfigure: open /sys/fs/cgroup/pids/system.slice/containerd.service/ kubepods-besteffort-pod183b9d14_aed1_4b66_a696_da0c738bc012.slice/pids.max: no such file or directory` SplitN takes an argument n that caps the number of substrings returned, which lets us get the desired file path. Fixes: #2337 Co-authored-by: Yati Padia <ypadia@redhat.com> Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-08-03 06:03:10 +00:00
Madhu Rajanna	8f185bf7b2	rbd: use rados namespace for manager command Currently we have a bug: we are not using the rados namespace when adding the ceph manager command to remove the image from the trash. This commit adds the missing rados namespace when adding the ceph manager task. Without the fix, the image will be moved to the trash and no task will be added to remove it from the trash; it becomes Ceph's responsibility to remove the image when it cleans up the trash. Workaround: manually purge the trash. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-07-28 03:48:33 +00:00
Niels de Vos	ec6703ed58	rbd: rename encryption metadata keys to enable mirroring RBD image metadata keys that start with '.rbd' are expected to be internal to RBD itself and are not mirrored to remote sites. Renaming the keys (dropping the '.' prefix) and using the new MigrateMetadata() function now makes the keys available on remote sites too. Closes: #2219 Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-07-26 11:49:56 +00:00
Niels de Vos	607129171d	rbd: move image metadata key migration to its own function The new MigrateMetadata() function can be used to get the metadata of an image with a deprecated and new key. Renaming metadata keys can be done easily this way. A default value will be set in the image metadata when it is missing completely. But if the deprecated key was set, the data is stored under the new key and the deprecated key is removed. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-07-26 11:49:56 +00:00
Yati Padia	6691951453	rbd: use go-ceph for getImageMirroringStatus Currently, getImageMirroringStatus() is using RBD CLI. This commit converts RBD CLI to go-ceph API. Fixes: #2120 Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-07-26 06:37:40 +00:00
Prasanna Kumar Kalever	526ff95f10	rbd: add support to expand encrypted volume Previously, ControllerExpandVolume() had a check for encrypted volumes and we used to fail all expand requests on an encrypted volume. Also, for Block VolumeMode PVCs, NodeExpandVolume used to be ignored/skipped. With these changes, we add support for the expansion of encrypted volumes. Also, for raw Block VolumeMode PVCs with encryption, we call NodeExpandVolume. That said, with LUKS1 the cryptsetup utility doesn't prompt for a passphrase on resizing the crypto mapper device. This is because LUKS1 devices don't use the kernel keyring for volume keys. LUKS2 devices, on the other hand, use the kernel keyring for the volume key by default, i.e. the cryptsetup utility asks for a passphrase if it detects that the volume key was previously passed to dm-crypt via the kernel keyring service. We override this default with the --disable-keyring option during the cryptsetup open command, so that at the time of the crypto mapper device resize we will not be prompted for any passphrase. Fixes: #1469 Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-07-23 10:00:23 +00:00
Prasanna Kumar Kalever	4fa05cb3a1	util: add helper functions for resize of encrypted volume such as: ResizeEncryptedVolume() and LuksResize() Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-07-23 10:00:23 +00:00
Prasanna Kumar Kalever	572f39d656	util: fix log level in OpenEncryptedVolume() Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-07-23 10:00:23 +00:00
Prasanna Kumar Kalever	812003eb45	util: fix bug in DeviceEncryptionStatus() With Luks1 device: $ cryptsetup status /dev/mapper/crypto-rbd0 /dev/mapper/crypto-rbd0 is active and is in use. type: LUKS1 cipher: aes-xts-plain64 keysize: 512 bits key location: dm-crypt device: /dev/rbd0 sector size: 512 offset: 4096 sectors size: 4190208 sectors mode: read/write With Luks2 device: $ cryptsetup status /dev/mapper/crypto-rbd0 /dev/mapper/crypto-rbd0 is active and is in use. type: LUKS2 cipher: aes-xts-plain64 keysize: 512 bits key location: dm-crypt device: /dev/rbd0 sector size: 512 offset: 32768 sectors size: 4161536 sectors mode: read/write This could lead to failures with unmap in the NodeUnstageVolume path for the encrypted volumes. Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-07-23 10:00:23 +00:00
Yati Padia	1ae2afe208	cleanup: modifies the error caused due to merged PRs This commit modifies the error of godot, cyclop, paralleltest linter caused due to merged PRs. Updates: #1586 Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-07-22 18:15:48 +00:00
Yati Padia	172b66f73f	cleanup: resolves cyclop linter issue This commit adds `// nolint:cyclop` for the functions whose complexity is above 20. Updates: #1586 Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-07-22 18:15:48 +00:00
Humble Chirammal	abe6a6e5ac	util: remove deleteLock test as it is enforced by the controller Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-07-22 15:07:49 +00:00
Humble Chirammal	c42d4768ca	util: remove the deleteLock acquisition check for clone and snapshot At present, while acquiring the deleteLock on the volume, we check for ongoing clone and snapshot creation operations on the same volume. Considering that the snapshot and clone controllers do not allow parent volume deletion during these operations, we can drop this extra check. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-07-22 15:07:49 +00:00
Niels de Vos	82557e3f34	util: allow configuring VAULT_BACKEND for Vault connection It seems that the version of the key/value engine can not always be detected for Hashicorp Vault. In certain cases, it is required to configure the `VAULT_BACKEND` (or `vaultBackend`) option so that a successful connection to the service can be made. The `kv-v2` is the current default for development deployments of Hashicorp Vault (what we use for automated testing). Production deployments default to version 1 for now. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-07-22 13:02:47 +00:00
Rakshith R	43f753760b	cleanup: resolve nlreturn linter issues nlreturn linter requires a new line before return and branch statements except when the return is alone inside a statement group (such as an if statement) to increase code clarity. This commit addresses such issues. Updates: #1586 Signed-off-by: Rakshith R <rar@redhat.com>	2021-07-22 06:05:01 +00:00
Yati Padia	3469dfc753	cleanup: resolve errorlint issues This commit resolves errorlint issues which checks for the code that will cause problems with the error wrapping scheme. Updates: #1586 Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-07-19 13:31:29 +00:00
Yati Padia	bfda5fa57f	cleanup: resolve revive linter issue revive linter checks for var-declaration format. For example: "e2e/rbd_helper.go:441:36: var-declaration: should drop = nil from declaration of var noPVCValidation; it is the zero value (revive) var noPVCValidation validateFunc = nil" Updates: #1586 Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-07-19 08:39:32 +00:00
Humble Chirammal	bd947bbe31	util: remove deleteLock check while acquiring snapshot createLock The snapshot controller makes sure the PVC which is the source for the snapshot request won't get deleted while the snapshot is being created, so we don't need to check for any ongoing delete operation on the volume here. Relevant code path in the snapshot controller: ``` pvc, err := ctrl.getClaimFromVolumeSnapshot(snapshot) . .. pvcClone.ObjectMeta.Finalizers = append(pvcClone.ObjectMeta.Finalizers, utils.PVCFinalizer) _, err = ctrl.client.CoreV1().PersistentVolumeClaims(pvcClone.Namespace).Update(..) ``` Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-07-17 10:23:13 +00:00
Prasanna Kumar Kalever	78f740d903	rbd: improve healer to run multiple NodeStageVolume req concurrently This will bring down the healer run time by a great factor. Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-07-16 16:30:58 +00:00
Prasanna Kumar Kalever	b6a88dd728	rbd: add volume healer Problem: ------- For rbd nbd userspace mounter backends, after a restart of the nodeplugin all the mounts will start seeing IO errors. This is because rbd-nbd backends run one userspace mount daemon per volume, and after a restart of the nodeplugin pod there is no way to restore the daemons back to life. Solution: -------- The volume healer is a one-time activity that is triggered at the startup time of the rbd nodeplugin. It navigates through the list of volume attachments on the node and acts accordingly. For now, it is limited to nbd type storage only, but it is flexible and can be extended in the future for other backend types as needed. From a few feet above: This solves a severe problem for nbd backed csi volumes. While going through the list of volume attachments on the node, if the healer finds a volume that is in attached state and is of type nbd, it will attempt to fix the rbd-nbd volume by sending a NodeStageVolume request with the required volume attributes such as secrets, device name, image attributes, etc., which will finally help start the required rbd-nbd daemons in the nodeplugin csi-rbdplugin container. This will allow reattaching the backend images with the right nbd device, thus allowing the applications to perform IO without any interruptions even after a nodeplugin restart. Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-07-16 16:30:58 +00:00
Prasanna Kumar Kalever	6007fc9bfe	cleanup: move static volume check to helper function Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-07-16 16:30:58 +00:00
Prasanna Kumar Kalever	6d24080851	rbd: update per volume metadata stash-file with devicePath As part of stage transaction if the mounter is of type nbd, then capture device path after a successful rbd-nbd map. Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-07-16 16:30:58 +00:00
Prasanna Kumar Kalever	70998571aa	cleanup: change variable name from path to metaDataPath path is used by standard package. Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-07-16 16:30:58 +00:00
Humble Chirammal	94c5c5e119	util: remove deleteLock while we acquire clone operation lock The clone controller makes sure no delete operation happens on the source PVC that is referred to as the datasource of the clone PVC, so we are safe to operate without looking at the delete operation lock in this case. Relevant code in the controller: ... if claim.Spec.DataSource != nil && rc.clone { err = p.setCloneFinalizer(ctx, claim) ... } if !checkFinalizer(claim, pvcCloneFinalizer) { claim.Finalizers = append(claim.Finalizers, pvcCloneFinalizer) _, err := p.client.CoreV1().PersistentVolumeClaims(claim.Namespace).Update(..claim..) } Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-07-16 12:32:28 +00:00
Humble Chirammal	e088e8fd2e	cephfs: Get rid of locking at nodepublish Considering that kubelet makes sure the stage and publish operations are serialized, we don't need any extra locking in nodePublish. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-07-16 07:18:56 +00:00
Humble Chirammal	61bf49a4f5	rbd: Get rid of locking at nodePublish Considering that kubelet makes sure the stage and publish operations are serialized, we don't need any extra locking in nodePublish. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-07-16 07:18:56 +00:00
Humble Chirammal	ced3a0922f	cephfs: Get rid of locking at nodeUnpublish call Considering that kubelet makes sure the unstage and unpublish operations are serialized, we don't need any extra locking in nodeUnpublish. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-07-16 07:18:56 +00:00
Humble Chirammal	ef852cc93d	rbd: Get rid of locking at nodeUnpublish call Considering that kubelet makes sure the unstage and unpublish operations are serialized, we don't need any extra locking in nodeUnpublish. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-07-16 07:18:56 +00:00
Yati Padia	f36d611ef9	cleanup: resolves gofumpt issues of internal codes This PR runs gofumpt for internal folder. Updates: #1586 Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-07-14 19:50:56 +00:00
Yati Padia	299979fc14	ci: add unit test for toError() This commit adds unit test for the func converting cephFSCloneState to error. Fixes: #2259 Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-07-14 15:02:12 +00:00
Yati Padia	c66872c3c6	cleanup: ineffective assignment This commit resolves ineffective assignment of snap. Updates: #1586 Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-07-14 12:39:17 +00:00
Yati Padia	f210d5758b	cleanup: spell check getImageMirroingStatus This commit corrects the spelling for getImageMirroingStatus() -> getImageMirroringStatus Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-07-14 07:32:01 +00:00
Niels de Vos	d941e5abac	util: make parseTenantConfig() usable for modular KMSs parseTenantConfig() only allowed configuring a defined set of options, and KMSs were not able to re-use the implementation. Now, the function parses the ConfigMap from the Tenants Namespace and returns a map with options that the KMS supports. The map that parseTenantConfig() returns can be inspected by the KMS, and applied to the vaultTenantConnection type by calling parseConfig(). Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-07-13 17:16:35 +00:00
Niels de Vos	3d7d48a4aa	util: VaultTenantSA KMS implementation This new KMS uses a Kubernetes ServiceAccount from a Tenant (Namespace) to connect to Hashicorp Vault. The provisioner and node-plugin will check for the configured ServiceAccount and use the token that is located in one of the linked Secrets. Subsequently the Vault connection is configured to use the Kubernetes token from the Tenant. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-07-13 17:16:35 +00:00
Niels de Vos	6dc5bf2b29	util: split vaultTenantConnection from VaultTokensKMS This makes the Tenant configuration for Hashicorp Vault KMS connections more modular. Additional KMS implementations that use Hashicorp Vault with per-Tenant options can re-use the new vaultTenantConnection. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-07-13 17:16:35 +00:00
Yati Padia	69c9e5ffb1	cleanup: resolve parallel test issue This commit resolves parallel test issues and also excludes internal/util/conn_pool_test.go as those test can't run in parallel. Updates: #1586 Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-07-13 11:31:39 +00:00
Yati Padia	4a649fe17f	cleanup: resolve godot linter This commit resolves godot linter issue which says "Comment should end in a period (godot)". Updates: #1586 Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-07-13 06:50:03 +00:00
Yati Padia	f35ce3d880	cleanup: Adds t.Helper() to test helper function This commit adds t.Helper() to the test helper function. With this call, go test prints the correct lines of code for failed tests. Otherwise, the printed lines will be inside helper functions. For more details check: https://github.com/kulti/thelper Updates: #1586 Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-07-12 11:25:55 +00:00
Yati Padia	84c1fe52c7	cleanup: resolve exhaustive linter This commit resolves exhaustive linter error. Updates: #2240 Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-07-12 04:47:08 +00:00
Jonas Zeiger	680a7bf411	util: more generic kernel version parsing * Make kernel version parsing to support more (valid) version strings * Put version string parsing into a separate, testable function * Fixes #2248 (Kernel Subversion Parsing Failure) Signed-off-by: Jonas Zeiger <jonas.zeiger@talpidae.net>	2021-07-09 07:36:27 +00:00
Rakshith R	3352d4aabd	rbd: add user secret based metadata encryption This commit adds capability to `metadata` encryption to be able to fetch `encryptionPassphrase` from user specified secret name and namespace(if not specified, will default to namespace where PVC was created). This behavior is followed if `secretName` key is found in the encryption configuration else defaults to fetching `encryptionPassphrase` from storageclass secrets. Closes: 2107 Signed-off-by: Rakshith R <rar@redhat.com>	2021-07-08 17:06:02 +00:00
Yati Padia	ffab37f44f	cleanup: resolves gocritic linter issues This commit resolves gocritic linter errors. Updates: #2250 Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-07-08 05:19:26 +00:00
Madhu Rajanna	dd0884310f	rbd: set image metadata in isThickProvisioned Setting metadata in the isThickProvisioned method helps us avoid checking the thick metakey and the deprecated metakey for both thick and thin provisioned images, and it also makes it easy to migrate the deprecated key to the new key. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-07-07 08:31:10 +00:00
Madhu Rajanna	77135599ac	rbd: make setThickProvisioned as method of rbdImage isThickProvisioned is already a method of rbdImage. To keep similar thick-provisioning related functions together, make setThickProvisioned a method of rbdImage as well. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-07-07 08:31:10 +00:00
Madhu Rajanna	708800ddc1	rbd: set thick metadata if ThickProvision is set Instead of checking whether the parent is thick provisioned, we can decide based on the rbdVol generated from the request. If the request is to create a thick image, set the metadata without checking the parent. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-07-07 08:31:10 +00:00
Madhu Rajanna	332a47a100	rbd: deprecate .rbd.csi.ceph.com/thick-provisioned metadata key As image metadata key starting with '.rbd' will not be copied when we do clone or mirroring, deprecating the old key for the same reason use 'csi.ceph.com/thick-provisioned' to set image metadata. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-07-07 08:31:10 +00:00
Madhu Rajanna	0837c05be0	rbd: set scheduling interval on snapshot mirrored image Mirror-snapshots can also be automatically created on a periodic basis if mirror-snapshot schedules are defined. The mirror-snapshot can be scheduled globally, per-pool, or per-image levels. Multiple mirror-snapshot schedules can be defined at any level. To create a mirror-snapshot schedule with rbd, specify the mirror snapshot schedule add command along with an optional pool or image name; interval; and optional start time: The interval can be specified in days, hours, or minutes using d, h, m suffix respectively. The optional start-time can be specified using the ISO 8601 time format. For example: ``` $ rbd --cluster site-a mirror snapshot schedule add --pool image-pool --image image1 24h 14:00:00-05:00 ``` Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-07-06 14:41:48 +00:00
Madhu Rajanna	b1710f4c53	util: add method to get rados connection The new go-ceph admin package APIs expect the rados connection to be passed as an argument. Added a new method called GetRBDAdmin to get an admin connection to administrate rbd volumes. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-07-06 14:41:48 +00:00
Rakshith R	9eaa55506f	rebase: update controller-runtime package to v0.9.2 This commit updates controller-runtime to v0.9.2 and makes changes in persistentvolume.go to add context to various functions and function calls made here instead of context.TODO(). Signed-off-by: Rakshith R <rar@redhat.com>	2021-07-01 03:35:23 +00:00
Rakshith R	1b23d78113	rebase: update kubernetes to v1.21.2 Updated kubernetes packages to latest release. resizefs package has been included into k8s.io/mount-utils package. updated code to use the same. Updates: #1968 Signed-off-by: Rakshith R <rar@redhat.com>	2021-07-01 03:35:23 +00:00
Humble Chirammal	cc6d67a7d6	internal: reformat long lines in internal/util package to 120 chars We have many declarations, invocations, etc. with long lines which are very difficult to follow while reading code. This addresses the issues in 'internal/util' package files to restrict the line length to 120 chars. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-06-28 14:43:49 +00:00
Humble Chirammal	8f82a30c21	internal: reformat long lines in internal/rbd package to 120 chars We have many declarations, invocations, etc. with long lines which are very difficult to follow while reading code. This addresses the issues in the files below, and restricts the line length to 120 chars. -internal/rbd/rbd_attach.go -internal/rbd/rbd_journal.go -internal/rbd/rbd_util.go -internal/rbd/replicationcontrollerserver.go -internal/rbd/snapshot.go Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-06-28 14:43:49 +00:00
Humble Chirammal	e829308249	internal: reformat long lines in internal/rbd package to 120 chars We have many declarations, invocations, etc. with long lines which are very difficult to follow while reading code. This addresses the issues in 'internal/rbd/*server.go' and 'internal/rbd/driver.go' files to restrict the line length to 120 chars. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-06-28 14:43:49 +00:00
Humble Chirammal	3dc8c5b516	internal: reformat long lines in internal/journal package to 120 chars We have many declarations, invocations, etc. with long lines which are very difficult to follow while reading code. This addresses the issues in 'internal/journal' package to restrict the line length to 120 chars. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-06-28 14:43:49 +00:00
Humble Chirammal	a3b83fe8a7	internal: reformat long lines in internal/csi-common package to 120 chars We have many declarations, invocations, etc. with long lines which are very difficult to follow while reading code. This addresses the issues in 'internal/csi-common' package to restrict the line length to 120 chars. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-06-28 14:43:49 +00:00
Humble Chirammal	f526c4a5e8	internal: reformat long lines in internal/controller package to 120 chars We have many declarations, invocations, etc. with long lines which are very difficult to follow while reading code. This addresses the issues in 'internal/controller' package to restrict the line length to 120 chars. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-06-28 14:43:49 +00:00
Humble Chirammal	0d432be5bf	internal: reformat long lines in internal/cephfs package to 120 chars We have many declarations, invocations, etc. with long lines which are very difficult to follow while reading code. This addresses the issues in 'internal/cephfs' package to restrict the line length to 120 chars. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-06-28 14:43:49 +00:00
Rakshith R	404e011ae9	cleanup: added helper func isNotMountPoint Added helper func isNotMountPoint to check mountPoint, validate error and reduce complexity of NodeStageVolume. Signed-off-by: Rakshith R <rar@redhat.com>	2021-06-28 05:46:42 +00:00
Rakshith R	7fc553a3a7	rbd: removing TrimSpace from validateImageFeatures func `imageFeatures` string containing just whitespace should also be treated as an invalid feature. Signed-off-by: Rakshith R <rar@redhat.com>	2021-06-28 05:46:42 +00:00
Rakshith R	84b046d736	rbd: add check for imageFeatures parameter This commit adds checks for missing `imageFeatures` parameter in createvolumerequest and nodestagerequest(only for static PVs). Missing `imageFeatures` parameter is ignored in case of non-static PVs to ensure backwards compatibility with older versions which did not have `imageFeatures` as required parameter. Signed-off-by: Rakshith R <rar@redhat.com>	2021-06-28 05:46:42 +00:00
Yati Padia	13667c013c	cleanup: addresses paralleltest linter The Go linter paralleltest checks that the t.Parallel gets called for the test method and for the range of test cases within the test. Updates: #2025 Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-06-25 11:55:12 +00:00
Niels de Vos	0ee0c12027	cleanup: prevent panic in cleanUpSnapshot While cleaning up snapshots, not all objects may exist after a partial provisioning attempt. In case objects are missing, do not try to delete them. Fixes: #2192 Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-06-25 10:01:35 +00:00
Niels de Vos	eeec4471c5	rbd: no need to create a snapshot on a thick-provisioned volume When cloning a volume from a (CSI) snapshot, we use DeepCopy() and do not need an RBD snapshot as source. Suggested-by: Madhu Rajanna <madhupr007@gmail.com> Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-06-23 14:22:28 +00:00
Niels de Vos	d2c4cacb39	rbd: restart thick-provisioned PVC snapshot restoring after aborting In case restoring a snapshot of a thick-PVC failed during DeepCopy(), the image will exist, but have partial contents. Only when the image has the thick-provisioned metadata set, it has completed DeepCopy(). When the metadata is missing, the image is deleted, and an error is returned to the caller. Kubernetes will automatically retry provisioning on the ABORTED error, and the restoring will get restarted from the beginning. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-06-23 14:22:28 +00:00
Niels de Vos	7f1bdb49d1	rbd: use DeepCopy() when restoring a thick-snapshot Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-06-23 14:22:28 +00:00
Yati Padia	847b996501	cleanup: Modifies Wrapcheck linter Wrapcheck is a simple Go linter to check that errors from external packages are wrapped during return to help identify the error source during debugging. This commit addresses the wrapcheck error Updates:#2025 Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-06-22 08:47:55 +00:00
Madhu Rajanna	591ba3f580	rbd: set thick provision metadata on clone volume The parent volume (CreateVolume) and the clone volume (CreateSnapshot) are independent, and the parent volume can be deleted at any time. To check thick provisioning during snapshot restore (CreateVolume from snapshot) we need the thick-provision metadata, so this sets the thick-provision metadata on the clone image that is created at CreateSnapshot time. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-06-18 10:57:48 +00:00
Madhu Rajanna	6d14eeee70	rbd: use RbdSnapName to check the image details RbdSnapName holds the actual RBD image name which got created during the CreateSnapshot operation. RbdImageName holds the name of the parent from which the snapshot is created. The parent is independent of the snapshot and can be deleted at any time, so RbdSnapName is used to check the rbd image details. Generate a temporary volume from the snapshot which replaces the rbdImageName with RbdSnapName and use it to check the image metadata. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-06-18 10:57:48 +00:00
Madhu Rajanna	7966d2e5c1	rbd: add validation for thick restore/clone added validation to allow only Restore of Thick PVC snapshot to a thick clone and creation of thick clone from thick PVC. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-06-18 10:57:48 +00:00
Madhu Rajanna	fc442221e4	rbd: make isThickProvisioned method of rbdImage isThickProvisioned can be used for both snapshot and clone validation if isThickProvisioned is method of common rbdImage structure. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-06-18 10:57:48 +00:00
Niels de Vos	57d3183cb1	rbd: restart thick-provisioned PVC cloning after aborting In case cloning a thick-PVC failed during DeepCopy(), the image will exist, but have partial contents. Only when the image has the thick-provisioned metadata set, it has completed DeepCopy(). When the metadata is missing, the image is deleted, and an error is returned to the caller. Kubernetes will automatically retry provisioning on the ABORTED error, and the cloning will get restarted from the beginning. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-06-18 06:25:56 +00:00
Niels de Vos	b1045364d9	rbd: disable FeatureDeepFlatten when doing DeepCopy() Not all Linux kernels support the deep-flatten feature. Disabling the feature makes it possible to map RBD images on older kernels (like what minikube uses). Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-06-18 06:25:56 +00:00
Niels de Vos	4908ff8743	rbd: no need to flatten thick-provisioned images Thick-provisioned images are independent, cloned images or snapshots are deep-flattened during creation. There is no need to try and flatten them again. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-06-18 06:25:56 +00:00
Niels de Vos	6cc11c15d3	rbd: use DeepCopy to create a thick-provisioned clone To create a fully allocated RBD image from a snapshot/clone, DeepCopy() can be used. This is needed when the parent of the new volume is thick-provisioned, so that the new volume is independent of the parent and thick-provisioned as well. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-06-18 06:25:56 +00:00
Niels de Vos	334f237e23	cleanup: move snapshot/clone/flatten into its own function Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-06-18 06:25:56 +00:00
Madhu Rajanna	367eb9f748	rbd: correct return error for isCompatibleEncryption isCompatibleEncryption is used to validate the requested volume against the existing volume. At that point the destination volume name has not been generated yet, so logging the destination volume prints an empty image name with the pool name. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-06-17 10:12:18 +00:00
Madhu Rajanna	05b8433b89	rbd: check stdErr for does not have a parent error When we try to add a task to flatten the rbd image, the actual error is present in stdErr, not in the returned error. This commit corrects the error checking when the image does not have a parent. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-06-15 11:07:34 +00:00
Yati Padia	6bfdf2feb0	cleanup: gocyclo being unused for linter This commit addresses the following issue: 'nolint:gocyclo // complexity needs to be reduced.' is unused for linter "gocyclo" (nolintlint) Updates:#2025 Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-06-15 02:54:16 +00:00
Yug	5c079894c7	doc: correct comment indentation in rbdVolume correct comment indentation in rbdvolume{} Signed-off-by: Yug <yuggupta27@gmail.com>	2021-06-15 02:34:51 +00:00
Yati Padia	095a82f37d	util: returns actual error instead of ErrPoolNotFound This commit returns the actual error returned by the go-ceph API to the function GetPoolName(..) instead of just returning ErrPoolNotFound every time there is an error getting the pool id. There is an issue reported in which the snapshot creation takes much more time to reach True state (i.e., between 2-7 mins) and keeps trying to create with the below error though the pool is present: rpc error: code = NotFound desc = pool not found: pool ID (21) not found in Ceph cluster. Since we cannot interpret the actual error for the delay in snapshot creation, it is required to return the actual error as well so that we can understand the reason. Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-06-14 14:41:32 +00:00
Humble Chirammal	17b0091cba	cleanup: fix codespell error in internal/utils package Codespell checker reports the below error: ``` Resulting CLI options --check-filenames --check-hidden --skip .git,./vendor --ignore-words-list ExtraVersion,extraversion,ba 1 Error: ./internal/util/aws_metadata.go:96: Kubenetes ==> Kubernetes ``` This commit addresses the same. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-06-11 08:04:07 +00:00
Yug	d992803e9e	rbd: Update pool name in image chain While traversing the image chain, the parent image can be present in a different pool than the one the child is in. So, update the pool name in the next iteration to that of the parent. Co-authored-by: Madhu Rajanna <madhupr007@gmail.com> Signed-off-by: Yug <yuggupta27@gmail.com>	2021-06-10 21:46:53 +00:00
Yug	1f6a9cabfd	rbd: verify if pool name is not empty Validate Snapshot request to check if the passed pool name is not empty. Co-authored-by: Madhu Rajanna <madhupr007@gmail.com> Signed-off-by: Yug <yuggupta27@gmail.com>	2021-06-10 21:46:53 +00:00
Yug	3898ae34a7	rbd: open new ioctx connection if the parent and child clones are in different namespaces we need to open a new ioctx for pools. Co-authored-by: Madhu Rajanna <madhupr007@gmail.com> Signed-off-by: Yug <yuggupta27@gmail.com>	2021-06-10 21:46:53 +00:00
Yug	b63b0bf18d	rbd: retrieve parent pool name of child image when clones are created in different pool,we need to retrieve the parent pool to get the information of the parent image. Co-authored-by: Madhu Rajanna <madhupr007@gmail.com> Signed-off-by: Yug <yuggupta27@gmail.com>	2021-06-10 21:46:53 +00:00
Yug	e699318acc	rbd: pass parent volume to undoSnapshotCloning function as we are supporting the creation of clone to a new pool we need to pass the correct parent volume to cleanup the snapshot on parent volume. Co-authored-by: Madhu Rajanna <madhupr007@gmail.com> Signed-off-by: Yug <yuggupta27@gmail.com>	2021-06-10 21:46:53 +00:00
Yug	961c1d12fd	rbd: add support to create clone in different pool Added support to create an image in a different pool. If the snapshot/rbd image exists in one pool, we can create a clone of the rbd image in a different pool. Co-authored-by: Madhu Rajanna <madhupr007@gmail.com> Signed-off-by: Yug <yuggupta27@gmail.com>	2021-06-10 21:46:53 +00:00
Mohammed Naser	671d6a7767	rbd: Backout if image features is empty In golang world, if you split an empty string that does not contain the separator, you get an array with one empty string. This results in volumes failing to mount with "invalid feature " (note extra space because it's trying to check if 'empty string' is a valid feature). This patch checks if the string is empty, and if so, it just decides to skip the entire validation and return nothing. Signed-off-by: Mohammed Naser <mnaser@vexxhost.com>	2021-06-10 15:43:09 +00:00
Mohammed Naser	f193ebfbb1	rbd: Add failing test when no features are provided Signed-off-by: Mohammed Naser <mnaser@vexxhost.com>	2021-06-10 15:43:09 +00:00
Madhu Rajanna	7b5c78ec7c	rbd: fail fast in create volume for mismatched encryption CreateVolume will fail in the below cases * If the snapshot is encrypted and requested volume is not encrypted * If the snapshot is not encrypted and requested volume is encrypted * If the parent volume is encrypted and requested volume is not encrypted * If the parent volume is not encrypted and requested volume is encrypted Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-06-07 15:05:21 +00:00
Madhu Rajanna	4e2c4ef704	cephfs: return internal server error if it is an error from the IsMountPoint function and the error is not IsNotExist, return it as an internal server error. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-06-07 07:38:48 +00:00
Madhu Rajanna	46f1ab9e99	cephfs: use IsMountPoint to check mountpoint Currently we are relying on the error output from the umount command we run on the nodes when mounting the volume, but we are not checking for all the error messages to verify whether the volume is mounted or not. This commit uses the IsMountPoint function in util to check the mountpoint. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-06-07 07:38:48 +00:00
Madhu Rajanna	b4dbffa316	util: return actual error from IsMountPoint as callers are already taking care of returning the GRPC error code, return the actual error from the IsMountPoint function. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-06-07 07:38:48 +00:00
Yati Padia	0f44c6acb7	cleanup: address wasted assign issues At places variable is reassigned without being used. Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-06-03 09:51:14 +00:00
YingshuoTao	bfe64d4aee	cephfs: pass extra volume attributes to static PV when using pre-provisioned volumes, pass these parameters: - kernelMountOptions - fuseMountOptions - subVolumeGroup in spec.csi.volumeAttributes in PV declaration Signed-off-by: YingshuoTao <frigid.blues@gmail.com>	2021-06-03 04:42:59 +00:00
Niels de Vos	7cbad9305f	rbd: repair thick-provisioned images on CreateVolume restart Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-06-01 14:42:12 +00:00
Niels de Vos	96a8ea3e88	cleanup: split repairExistingVolume() from CreateVolume() Move the repairing of a volume/snapshot from CreateVolume to its own function. This reduces the complexity of the code, and makes the procedure easier to understand. Further enhancements to repairing an existing volume can be done in the new function. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-06-01 14:42:12 +00:00
Madhu Rajanna	2e978e4211	rbd: fix typo in error message fixed typo in error message. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-06-01 10:40:07 +00:00
Madhu Rajanna	a666d452bf	cephfs: return GRPC error in NodeGetVolumeStats in case of failure return GRPC error to the caller. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-05-31 08:17:37 +00:00
Rakshith R	b891e5585d	cleanup: address ifshort linter issues This commit addresses ifshort linter issues which checks if short syntax for if-statements is possible. updates: #1586 Signed-off-by: Rakshith R <rar@redhat.com>	2021-05-26 07:04:32 +00:00
Rakshith R	6618e2012d	cleanup: remove unnecessary calling of .String() when logging This commit removes calling of .String() when logging since `%s`,`%v` or `%q` will call an existing .String() function automatically. Fixes: #2051 Signed-off-by: Rakshith R <rar@redhat.com>	2021-05-25 18:02:11 +00:00
Yati Padia	774e8e4042	util: enable golang profiling Add support for golang profiling. Standard tools like go tool pprof and curl work. example: $ go tool pprof http://localhost:8080/debug/pprof/profile $ go tool pprof http://localhost:8080/debug/pprof/heap $ curl http://localhost:8080/debug/pprof/heap?debug=1 https://golang.org/pkg/net/http/pprof/ contains more details about the pprof interface. Fixes: #1699 Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-05-25 10:41:22 +00:00
Niels de Vos	25d0a1cfc0	rbd: add support for block-devices in NodeGetVolumeStats() The NodeGetVolumeStats procedure can now be used to fetch the capacity of the RBD block-device. By default this is a thin-provisioned device, which means that the capacity is not reserved in the Ceph cluster. This makes it possible to over-provision the cluster. In order to detect the amount of storage used by the RBD block-device (when thin-provisioned), it is required to connect to the Ceph cluster. Unfortunately, the NodeGetVolumeStats CSI procedure does not provide enough parameters to connect to the Ceph cluster and fetch more details about the RBD image. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-05-25 06:41:04 +00:00
Niels de Vos	c0ab4c03e6	cephfs: move NodeGetVolumeStats() to CephFS NodeServer The CephFS NodeServer should handle the CephFS specific requests. This is not something that the NodeServer for RBD should handle. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-05-25 06:41:04 +00:00
Madhu Rajanna	0ce6ad1152	rbd: fix image details logging log only the required details of the image. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-05-07 07:57:37 +00:00
Madhu Rajanna	67d73cd6e9	rbd: flatten image if the depth is not zero flatten the image if the deep-flatten feature is present on the images in the chain or if the depth of the image chain is not zero, as we cannot check the deep-flatten feature on the images which are in the trash. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-05-07 07:57:37 +00:00
Madhu Rajanna	e15e2e5081	rbd: discard image not found error For flatten we call checkImageChainHasFeature which internally calls to getImageInfo returns the parent name even if the parent is in the trash, when we try to open the parent image to get its information it fails as the image not found. we should treat error as nil if the parent is not found. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-05-07 07:57:37 +00:00
Niels de Vos	f11a041f56	cleanup: address gosec complaint about creating a file The new gosec 2.7.0 complains like: G304 (CWE-22): Potential file inclusion via variable (Confidence: HIGH, Severity: MEDIUM) Updates: #2025 Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-05-05 16:05:23 +00:00
Madhu Rajanna	07a916b84d	rbd: mark image ready when image state is up+unknown To recover from the split brain (up+error) state, the image needs to be demoted and a resync requested on site-a, and then the image on site-b should get demoted. The volume should be marked ready=true when the image state on both clusters is up+unknown, because during the last snapshot sync the data gets copied first and then the image state on site-a changes to up+unknown. If the image state on both sites is up+unknown, consider the complete data synced, as the last snapshot gets exchanged between the clusters. * create 10 GB of file and validate the data after resync * Do Failover when the site-a goes down * Force promote the image and write data in GiB * Once the site-a comes back, Demote the image and issue resync * Demote the image on site-b * The status will get reflected on the other site when the last snapshot sync happens * The image will go to up+unknown state. and complete data will be copied to site a * Promote the image on site-a and use it ```bash csi-vol-5633715e-a7eb-11eb-bebb-0242ac110006: global_id: e7f9ec55-06ab-46cb-a1ae-784be75ed96d state: up+unknown description: remote image demoted service: a on minicluster1 last_update: 2021-04-28 07:11:56 peer_sites: name: e47e29f4-96e8-44ed-b6c6-edf15c5a91d6-rook-ceph state: up+unknown description: remote image demoted last_update: 2021-04-28 07:11:41 ``` * Do Failover when the site-a goes down * Force promote the image on site-b and write data in GiB * Demote the image on site-b * Once the site-a comes back, Demote the image on site-a * The images on both sites will go to split brain state ```bash csi-vol-37effcb5-a7f1-11eb-bebb-0242ac110006: global_id: 115c3df9-3d4f-4c04-93a7-531b82155ddf state: up+error description: split-brain service: a on minicluster2 last_update: 2021-04-28 07:25:41 peer_sites: name: abbda0f0-0117-4425-8cb2-deb4c853da47-rook-ceph state: up+error description: split-brain last_update: 2021-04-28 07:25:26 ``` * Issue resync * The images cannot be resynced because when we issue resync on site a the image on site-b was in demoted state * To recover from this state (promote and then demote the image on site-b after sometime) ```bash csi-vol-37effcb5-a7f1-11eb-bebb-0242ac110006: global_id: 115c3df9-3d4f-4c04-93a7-531b82155ddf state: up+unknown description: remote image demoted service: a on minicluster1 last_update: 2021-04-28 07:32:56 peer_sites: name: e47e29f4-96e8-44ed-b6c6-edf15c5a91d6-rook-ceph state: up+unknown description: remote image demoted last_update: 2021-04-28 07:32:41 ``` * Once the data is copied we can see that the image state is moved to up+unknown on both sites * Promote the image on site-a and use it Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-05-05 13:38:29 +00:00
Madhu Rajanna	c3bae17fce	rbd: delete encryption key from KMS If a Snapshot is encrypted during a CreateSnapshot operation, the encryption key gets created in the KMS. When we delete the Snapshot, the key should also get deleted from the KMS. When we create a volume from a snapshot we copy the required information, but we missed copying the encryption information. This commit adds the missing information needed to delete the encryption key. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-04-30 08:05:47 +00:00
Humble Chirammal	074c937a08	cleanup: correct typo in vault_tokens.go Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-04-29 08:51:29 +00:00
Mudit Agarwal	ec105bd782	cephfs: expand clone error messages Adding "snapshot clone" in the clone error messages. Signed-off-by: Mudit Agarwal <muagarwa@redhat.com>	2021-04-26 13:38:55 +00:00
Humble Chirammal	798437d0c4	rbd: return crypt error for the rpc return At present we return the volume connect error if the clone from snapshot fails when rbdvolume is encrypted, which is incorrect. This patch correctly returns the failed copy encryption error to the caller. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-04-21 16:10:20 +00:00
Madhu Rajanna	52290333e6	rbd: modified logic to check image watchers Before the RBD map operation, we check the watchers on the RBD image. In the case of an RWO volume, cephcsi makes sure only one client is using the RBD image. If the rbd image is mirrored, the mirroring daemon adds a watcher on the image by default, and because we use go-ceph, another watcher is added when we open the image. So there will be two watchers on an image if mirroring is enabled. This holds when the rbd mirror daemon is running. If the mirror daemon is not running, there will be only one watcher on the rbd image (placed by the go-ceph image open); we should not block the map operation in that case, as mirroring is asynchronous. This commit adds a check to make sure there are no more than 2 watchers if the image is mirrored, or no more than 1 watcher if it is not mirrored. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-04-19 16:30:55 +00:00
Yug	6a46f381c2	cleanup: update description to generic Since rbdImage is a common struct for rbdVolume and rbdSnapshot, its description matched only snapshots. This commit makes the comments generic for both volumes and snapshots. Signed-off-by: Yug <yuggupta27@gmail.com>	2021-04-19 07:32:35 +00:00
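Note on 52290333e6: the watcher limit it describes can be written as a small decision function. This is a sketch of the stated rule only, not the ceph-csi implementation; `maxAllowedWatchers` and `imageInUse` are hypothetical names, and the watcher count is assumed to have been obtained elsewhere.

```go
package main

import "fmt"

// maxAllowedWatchers is a hypothetical helper. One watcher comes from our
// own open image handle; a mirrored image may carry one more watcher from
// the rbd mirror daemon.
func maxAllowedWatchers(mirrored bool) int {
	if mirrored {
		return 2
	}
	return 1
}

// imageInUse reports whether the observed watcher count exceeds the limit,
// which would mean another client is using the image.
func imageInUse(watchers int, mirrored bool) bool {
	return watchers > maxAllowedWatchers(mirrored)
}

func main() {
	// Mirrored image, mirror daemon running: our handle plus the daemon.
	fmt.Println(imageInUse(2, true))
	// Mirrored image, mirror daemon stopped: only our handle, map proceeds.
	fmt.Println(imageInUse(1, true))
	// Non-mirrored image with a second watcher: another client holds it.
	fmt.Println(imageInUse(2, false))
}
```

Using an upper bound rather than an exact count is what lets the map operation proceed when the mirror daemon is down.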
Rakshith R	9f2cf498b6	cephfs: enable ceph-fuse big_writes by default By default, the write buffer size in libfuse2 is 2KiB `fuse_big_writes = true` option is used to override this limit. This commit makes `fuse_big_writes = true` option as default in ceph.conf. Closes: #1928 Signed-off-by: Rakshith R <rar@redhat.com>	2021-04-19 07:08:57 +00:00
Humble Chirammal	54845b63c0	cleanup: better or corrected variable name in grpc prometheus code Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-04-16 10:22:35 +00:00
Humble Chirammal	0fae0e53b6	cleanup: various source code comment corrections Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-04-16 10:22:35 +00:00
Madhu Rajanna	eea52847bc	rbd: check volumeID in PV if image not found If the pool or a few keys are missing in the omap, the GetImageAttributes function returns a nil error and a few empty items in the imageAttributes struct. If the image is not found and the entries are missing, use the volumeID present in the PV annotation for further operations. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-04-15 17:13:06 +05:30
Madhu Rajanna	cfc88c9910	rbd: discard up+unknown state in ResyncVolume If the image is promoted and then demoted, the image state will be set to up+unknown if the image on the remote cluster is still in the demoted state. When the user changes the state from primary to secondary while the image is still demoted (secondary) in the remote cluster, the image state on both clusters will be unknown. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-04-15 17:13:06 +05:30
Niels de Vos	8b8480017b	logging: report issues in rbdImage.DEKStore API with stacks It helps to get a stack trace when debugging issues. Certain things are considered bugs in the code (like missing attributes in a struct), and might cause a panic in certain occasions. In this case, a missing string will not panic, but the behaviour will also not be correct (DEKs getting encrypted, but unable to decrypt). Clearly logging this as a BUG is probably better than calling panic(). Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	b1d05a1840	rbd: repair encryption config in case it is missing It is possible that when a provisioner restarts after a snapshot was cloned, but before the newly restored image had its encryption metadata set, the new image is not marked as encrypted. This will prevent attaching/mounting the image, as the encryption key will not be fetched, or is not available in the DEKStore. By actively repairing the encryption configuration when needed, this problem should be addressed. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	1482105309	cleanup: use buildCreateVolumeResponse() to simplify CreateVolume() buildCreateVolumeResponse() exists exactly for the need to create a csi.CreateVolumeResponse based on an rbdVolume. Calling this helper reduces the code duplication in CreateVolume(). Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	52433841b4	cleanup: move copyEncryptionConfig() from CreateVolume to Exists() The rbdVolume that needs its encryption configured is constructed in the Exists() method. It is suitable to move the copyEncryptionConfig() call there as well, so that the object is completely constructed in a single place. Golang-ci:gocyclo complained about the increased complexity of the Exists() function. Moving the repairing of the ImageID into its own helper function makes the code a little easier to understand. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	596410ae60	cleanup: address "nolint" comments for RBD CreateSnapshot Introduce helper function cloneFromSnapshot() that takes care of the procedures that are needed when an existing snapshot has been found. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	b5d0524c39	cleanup: release resources for rbdImages objects after use Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	dc990037a5	rbd: move setupEncryption() from buildCreateVolumeResponse to CreateVolume Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	bea9d56117	rbd: copyEncryptionConfig in doSnapshotClone() Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	fd5f4dbafd	rbd: configureEncryption() in genSnapFromSnapID() Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	6fd3f57f40	rbd: set kmsID in reserveSnap() Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	0a046c5b6d	rbd: copy encryption configuration in CreateSnapshot Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	6b1285d38b	rbd: copy passphrase for encrypted clones When a source volume is encrypted, the passphrase needs to be copied and stored for the newly cloned volume. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	7b332a0184	rbd: add rbdImage.copyEncryptionConfig() to copy encryption metadata Cloning volumes requires copying the DEK from the source to the newly cloned volume. Introduce copyEncryptionConfig() as a helper for that. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	7e6feecc25	util: add VolumeEncryption.StoreCryptoPassphrase() The new StoreCryptoPassphrase() method makes it possible to store an unencrypted passphrase newly encrypted in the DEKStore. Cloning volumes will use this, as the passphrase from the original volume will need to get copied as part of the metadata for the volume. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	b6aa19eea5	rbd: pass secrets when creating an source rbdVolume for cloning Without this, the rbdVolume can not connect to the Ceph cluster and configure the (optional) encryption. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	92b2e08adf	rbd: improve logging in deleteImage() Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	99da92cfd7	rbd: move deletion of DEK to deleteImage() The ControllerServer should not need to care about support for encryption, ideally it is transparently handled by the rbdVolume type and its internal API. Deleting the DEK was one of the last remainders that was explicitly done inside the ControllerServer. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	151d066938	util: add logging when OpenEncryptedVolume() encounters an error Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	bd1388fb96	util: log available configs when KMS not found When the KMS configuration can not be found, it is useful to know what configurations are available. This aids troubleshooting when typos in the KMS ID are made. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	a7c261a394	logging: correct formatting when reporting error in createVolumeFromSnapshot() Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Rakshith R	ae6a52a84e	util: add nil check to default ControllerGetCapabilities() Currently default ControllerGetCapabilities function is being used which throws 'runtime error: invalid memory address or nil pointer dereference' when `--controllerServer=true` is not set in provisioner deployment args. This commit adds a check to prevent it. Fixes: 1925 Signed-off-by: Rakshith R <rar@redhat.com>	2021-04-09 10:12:48 +00:00
Rakshith R	10d539efc8	cleanup: correct nolint directive listing format nolint directive needs to be followed by comma separated list of linters. This commit changes to gocognit:gocyclo which was not recognised to linters which show error for the function. Signed-off-by: Rakshith R <rar@redhat.com>	2021-04-09 07:24:47 +00:00
Rakshith R	fb7389f478	cephfs: add stderr to mount function errors This commit appends stderr to error in both kernel and ceph-fuse mounter functions to better be able to debug errors. Signed-off-by: Rakshith R <rar@redhat.com>	2021-04-08 12:18:01 +00:00
Madhu Rajanna	e2fa84357a	rbd: take lock when reconciling the PV There is a chance that we reconcile the same PV in parallel, and we can end up generating and deleting multiple omap keys. To be on the safer side, take a lock to process one volumeHandle at a time. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-04-07 11:46:27 +00:00
Madhu Rajanna	0f8813d89f	rbd:store/Read volumeID in/from PV annotation In the case of the Async DR, the volumeID will not be the same if the clusterID or the PoolID is different, With Earlier implementation, it is expected that the new volumeID mapping is stored in the rados omap pool. In the case of the ControllerExpand or the DeleteVolume Request, the only volumeID will be sent it's not possible to find the corresponding poolID in the new cluster. With This Change, it works as below The csi-rbdplugin-controller will watch for the PV objects, when there are any PV objects created it will check the omap already exists, If the omap doesn't exist it will generate the new volumeID and it checks for the volumeID mapping entry in the PV annotation, if the mapping does not exist, it will add the new entry to the PV annotation. The cephcsi will check for the PV annotations if the omap does not exist if the mapping exists in the PV annotation, it will use the new volumeID for further operations. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-04-07 11:46:27 +00:00
Rakshith R	020cded581	cleanup: refactor deeply nested if statements in internal/rbd Refactored deeply nested if statement in internal/rbd to reduce cognitive complexity. Signed-off-by: Rakshith R <rar@redhat.com>	2021-04-07 02:31:41 +00:00
Rakshith R	d4cfd7bef9	cleanup: refactor deeply nested if statement in vault_tokens.go Refactored deeply nested if statement in vault_tokens.go to reduce cognitive complexity by adding fetchTenantConfig function. Signed-off-by: Rakshith R <rar@redhat.com>	2021-04-07 02:31:41 +00:00
Rakshith R	2d1a572d11	cleanup: refactor deeply nested if statements in internal/cephfs Refactored deeply nested if statement in internal/cephfs to reduce cognitive complexity. Signed-off-by: Rakshith R <rar@redhat.com>	2021-04-07 02:31:41 +00:00
Rakshith R	0f7b653b4e	cleanup: refactor deeply nested if statements in persistentvolume.go Refactored deeply nested if statement in persistentvolume.go to reduce cognitive complexity. Signed-off-by: Rakshith R <rar@redhat.com>	2021-04-07 02:31:41 +00:00
Niels de Vos	aaeb35eceb	rbd: encrypted volumes can be of type "crypto_LUKS" too It seems that newer versions of some tools/libraries identify encrypted filesystems with `crypto_LUKS` instead of `crypt`. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-06 15:54:27 +00:00
Madhu Rajanna	d7838defcf	rbd: return FailedPrecondition error message In case of the DR the image on the primary site cannot be demoted as the cluster is down, during failover the image need to be force promoted. RBD returns `Device or resource busy` error message if the image cannot be promoted for above reason. Return FailedPrecondition so that replication operator can send request to force promote the image. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-04-06 14:12:41 +00:00
Madhu Rajanna	403532c9a6	rbd: use force from PromoteVolume Request instead of fetching the force option from the parameters. Use the Force field available in the PromoteVolume Request. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-04-06 14:12:41 +00:00
Madhu Rajanna	385a751b8e	rebase: rename kube-storage to csi-addons as the org github.com/kube-storage is renamed to github.com/csi-addons as the name kube-storage was more generic. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-04-06 10:59:58 +00:00
Niels de Vos	1c1683ba20	util: add AmazonMetadata KMS provider The new Amazon Metadata KMS provider uses a CMK stored in AWS KMS to encrypt/decrypt the DEK which is stored in the volume metadata. Updates: #1921 Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-06 07:33:54 +00:00
Niels de Vos	f3b06d4c4a	util: pass Namespace as part of KMSInitializerArgs Amazon KMS expects a Secret with sensitive account and key information in the Kubernetes Namespace where the Ceph-CSI Pods are running. It will fetch the contents of the Secret itself. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-06 07:33:54 +00:00
Niels de Vos	523ac4b975	util: move getPodNamespace() and getKMSConfigMapName() into its own helpers These functions can now be re-used easier. The Amazon KMS needs to know the Namespace of the Pod for reading a Secret with more key/values. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-06 07:33:54 +00:00
Humble Chirammal	314fe0e23d	cleanup: correct misspelling in rbd/clone.go Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-04-05 09:34:09 +00:00
Madhu Rajanna	448be70682	rbd: early check for disabled,disabling in DisableVolumeReplication added early check for disabling and disabled image mirroring state in DisableVolumeReplication Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-04-05 08:53:40 +00:00
Madhu Rajanna	fb3f7fe202	rbd: remove todo for image not found In case of a resync, the image gets deleted and recreated, which is a time-consuming operation. It makes sense to return an aborted error instead of not found, as we have the omap data and only the image is missing in the rbd pool. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-04-05 08:53:40 +00:00
Madhu Rajanna	95387c3b5e	rbd: check for peer site status Do a resync if the image is in unknown or in error state. Check that the current image state is up+stopped or up+replaying, and also that all peer site statuses are up+stopped, to confirm that resyncing is done and the image can be promoted and used. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-04-05 08:53:40 +00:00
Madhu Rajanna	233954bc10	rbd: make replication operations as rbdImage methods Added replication related operations as methods of rbdImage, as these methods can be easily used when we introduce volumesnapshot mirroring operations. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-04-05 08:53:40 +00:00
Madhu Rajanna	c822ad460d	rbd: add a check for image mirror disabling state The rbd mirror state can be enabled, disabled or disabling. If the mirroring is not disabled yet and is still in the disabling state, we need to check for it and return an abort error message. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-04-05 08:53:40 +00:00
Madhu Rajanna	aaf6b571b8	rbd: Add ReplicationServer struct for replication operations Added a ReplicationServer struct for the replication related operations. It also embeds the ControllerServer, which already implements helper functions like locking/unlocking etc. Removed getVolumeFromID and cleanup functions for better code readability and easier maintenance. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-04-05 08:53:40 +00:00
Madhu Rajanna	da840a70c5	util: avoid secret logging in GRPC Replication Request This commit uses the helper function to avoid the logging of secrets in Replication GRPC request. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-04-01 20:05:55 +00:00
Niels de Vos	96fcc58095	util: use transformed Vault Tokens for initialization After translating options from the ConfigMap into the common Vault parameters, the generated configuration is not used. Instead, the untranslated version of the configuration is passed on to the vaultConnection initialization function, which then can detect missing options. By passing the right configuration to the initialization function, things work as intended. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-03-29 13:56:40 +00:00
Niels de Vos	d0f054bb6c	util: use ConfigMap.Data instead of .BinaryData When using .BinaryData, the contents of the configuration are not parsed correctly, whereas the parsing works fine when .Data is used. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-03-29 13:56:40 +00:00
Rakshith R	721640178b	cephfs: fix unmountVolume function This commit fixes bug in unmount function which caused unmountVolume to fail when targetPath was already unmounted. Signed-off-by: Rakshith R <rar@redhat.com>	2021-03-25 15:15:07 +00:00
Humble Chirammal	82bc993b32	util: remove unwanted import string from module dependencies There is no need for an extra import string when the go mod package itself declared in the same. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-03-24 15:12:13 +00:00
Niels de Vos	eea97ca014	util: move GetID() from EncryptionKMS to VolumeEncryption There is no need for each EncryptionKMS to implement the same GetID() function. We have a VolumeEncryption type that is more suitable for keeping track of the KMS-ID that was used to get the configuration of the KMS. This does not change any metadata that is stored anywhere. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-03-24 12:09:04 +00:00
Niels de Vos	9317e2afb4	util: rewrite GetKMS() to use KMS provider plugin API GetKMS() is the public API that initializes the KMS providers on demand. Each provider identifies itself with a KMS-Type, and adds its own initialization function to a switch/case construct. This is not well maintainable. The new GetKMS() can be used the same way, but uses the new kmsManager interface to create and configure the KMS provider instances. All existing KMS providers are converted to use the new kmsManager plugins API. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-03-24 12:09:04 +00:00
Niels de Vos	b43d28d35b	util: add API for KMS Provider plugins The KMSProvider struct is a simple, extendable type that can be used to register KMS providers with an internal kmsManager. Helper functions for creating and configuring KMS providers will also be located in the new kms.go file. This makes things more modular and better maintainable. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-03-24 12:09:04 +00:00
Madhu Rajanna	d8f7b38d3d	rbd: add exclusive-lock and journaling image features for rbd image Current rbd plugin only supports the layering feature for rbd image. Add exclusive-lock and journaling image features for the rbd. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com> Signed-off-by: woohhan <woohyung_han@tmax.co.kr>	2021-03-24 09:48:04 +00:00
Yati Padia	0d9548c815	Cephfs: Failed to delete snapshot Failed to delete volumesnapshot when backend subvolume (pvc) and ceph fs subvolume snapshot is deleted Fixes #1647 Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-03-17 10:28:08 +00:00
Niels de Vos	bbd24e52f3	cleanup: use rbdImage.Destroy() for temporary volumes rbdVolumes can have several resources that get allocated during its usage. Only destroying the IOContext may not be sufficient and can cause resource leaks. Use rbdVolume.Destroy() when the rbdVolume is not used anymore. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-03-17 07:50:09 +00:00
Niels de Vos	5c26fbb0d7	util: use ClusterConnection.Copy() for re-using connections Connections are reference counted, so just assigning the connection to an other object for re-use is not correct. This can cause connections to be garbage collected while something else is still using it. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-03-17 07:50:09 +00:00
Madhu Rajanna	6e941539b5	rbd: implement volume replication spec implemented the volume replication spec for the rbd mirroring. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-03-16 13:06:44 +00:00
Madhu Rajanna	ee9a200fcc	rbd: implement mirroring helpers with go-ceph mirror.go exposes the helper functions to perform the mirroring operations. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-03-16 13:06:44 +00:00
Madhu Rajanna	c642637cec	util: register replication controller As RBD implements replication, we register it. CephFS does not implement replication, so we pass nil and do not register it. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-03-16 13:06:44 +00:00
Madhu Rajanna	ee0576278f	cleanup: move servers to a new struct For future enhancements like adding more servers, move the list of servers to a new structure. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-03-16 13:06:44 +00:00
Niels de Vos	10a75dd4ff	rbd: introduce rbdImage as base for rbdVolume and rbdSnapshot Because rbdVolume and rbdSnapshot are very similar, they can be based off a common struct rbdImage that contains the common attributes and functions. This makes it possible to re-use functions for snapshots, and prevents further duplication of code. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-03-15 08:10:51 +00:00
Niels de Vos	3fe714c4fa	cleanup: rename rbdSnapshot.SnapID to VolID The rbdSnapshot and rbdVolume structs have many common attributes. In order to combine these into an rbdImage struct that implements shared functionality, having the same attribute for the ID makes things much easier. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-03-15 08:10:51 +00:00
Niels de Vos	5e63743243	util: add SecretsMetadataKMS This new KMS is based on the (default) SecretsKMS, but instead of using the passphrase for all volumes, the passphrase is used to encrypt/decrypt a Data-Encryption-Key that is stored in the metadata of the volume. CC: Patrick Uiterwijk <puiterwijk@redhat.com> - for encryption guidance Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-03-12 10:11:47 +00:00
Niels de Vos	6915624380	util: add EncryptDEK DecryptDEK to EncryptionKMS interface By adding these methods, a KMS can explicitly encrypt/decrypt the DEK if there is no transparent way of doing so. Hashicorp Vault encrypts the DEK when it is stored, and decrypts it when fetched. Therefore there is no need to do any encryption in this case. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-03-12 10:11:47 +00:00
Niels de Vos	cffec0b3f3	rbd: configure the DEKStore if the configuration suggests to use metadata NewVolumeEncryption() will return an indication that an alternative DEKStore needs to be configured in case the KMS does not support it. setKMS() will also set the DEKStore if needed, so renaming it to configureEncryption() makes things clearer. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-03-12 10:11:47 +00:00
Niels de Vos	e4431edaf9	rbd: implement the DEKStore interface To accommodate storing DEKs outside a KMS, the DEK can be stored in the metadata of the volume. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-03-12 10:11:47 +00:00
Niels de Vos	9ac7f56400	util: move existing KMS implementations to the DEKStore interface Use DEKStore API for Fetching and Storing passphrases. Drop the fallback for the old KMS interface that is now provided as DEKStore. The original implementation has been re-used for the DEKStore interface. This also moves GetCryptoPassphrase/StoreNewCryptoPassphrase functions to methods of VolumeEncryption. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-03-12 10:11:47 +00:00
Niels de Vos	b60dd286c6	util: use the KMS as DEKStore if it supports it Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-03-12 10:11:47 +00:00
Niels de Vos	ee033da8e9	util: add DEKStore interface DEKStore is a new interface that will be used for Storing and Fetching DEKs. The existing implementations for KMS already function as a DEKStore, and will be updated to match the interface. By splitting KMS and DEKStore into two components, the encryption configuration for volumes becomes more modular. This makes it possible to implement a DEKStore where the encrypted DEK for a volume is stored in the metadata of the volume (RBD image). Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-03-12 10:11:47 +00:00
Niels de Vos	d4076d6216	util: introduce VolumeEncryption type Prepare for grouping encryption related functions together. The main rbdVolume object should not be cluttered with KMS or DEK procedures. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-03-12 10:11:47 +00:00
Niels de Vos	aa52afff09	cleanup: move SecretsKMS in own file Prepared for an enhanced API to communicate with a KMS and keep the DEK storage separate. The crypto.go file is already mixed with different functions, so moving the KMS part into its own file, just like we have for Hashicorp Vault KMS's. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-03-12 10:11:47 +00:00
Madhu Rajanna	cbb10fd84d	rbd: add more logging for NodeUnstageVolume NodeUnstageVolume is a two-step process: first unmount the volume and then unmap the volume. Currently, we are logging only after rbd unmapping is done. With that logging it is sometimes difficult to debug whether more time is spent in unmount or unmap. This commit adds one more debug log after unmount is done. With this we can identify where exactly more time is spent by looking at the logs. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-03-11 17:40:57 +00:00
Niels de Vos	fe0f169875	rbd: write max 1gb per WriteSame() operation It seems that writing more than 1 GiB per WriteSame() operation causes an EINVAL (22) "Invalid argument" error. Splitting the writes in blocks of maximum 1 GiB should prevent that from happening. Not all volumes are of a size that is the multiple of the stripe-size. WriteSame() needs to write full blocks of data, so in case there is a small left-over, it will be filled with WriteAt(). Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-03-11 10:57:31 +00:00
Niels de Vos	6c8bc79771	ci: add unit tests for SecretsKMS Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-03-04 12:11:07 +00:00
Niels de Vos	165a837bca	rbd: move KMS initialization into rbdVol.initKMS() Introduce initKMS() as a function of rbdVolume. KMS functionality does not need to pollute general RBD image functions. Encryption functions are now in internal/rbd/encryption.go, so move initKMS() there as well. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-02-24 13:16:11 +00:00
Niels de Vos	cf6dae86e9	rbd: move encryptDevice() to a method of rbdVolume Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-02-24 13:16:11 +00:00
Niels de Vos	fb065b0f39	rbd: move openEncryptedDevice() to a method of rbdVolume Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-02-24 13:16:11 +00:00
Madhu Rajanna	8720f4e2f5	cephfs: create subvolume with VolumeNamePrefix When the user provides the VolumeNamePrefix option, create the subvolume with that prefix, which makes it easy for the user to identify the subvolumes that belong to the storageclass. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-02-19 17:04:17 +00:00
Niels de Vos	b5020657e6	rbd: add "--options notrim" when mapping a thick-provisioned image Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-02-19 11:55:40 +00:00
Niels de Vos	cc96bdaac3	rbd: allocate extents when expanding an image When an RBD image is expanded, the additional extents need to get allocated when the image was thick-provisioned. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-02-19 11:55:40 +00:00
Niels de Vos	294a0973bd	rbd: mark images thick-provisioned in metadata When images get resized/expanded, the additional space needs to be allocated if the image was initially thick-provisioned. By marking the image with a "thick-provisioned" key in the metadata, future operations can check the need. A missing "thick-provisioned" key indicates that the image has not been thick-provisioned. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-02-19 11:55:40 +00:00
Niels de Vos	74d218df8d	rbd: disable rbd_discard_on_zeroed_write_same for thick-allocation Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-02-19 11:55:40 +00:00
Niels de Vos	5522a05f59	rbd: thick-provision images on request Write blocks of stripe-size to allocate RBD images when Thick-Provisioning is enabled in the StorageClass. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-02-19 11:55:40 +00:00
Madhu Rajanna	c417a5d0ba	rbd: add support for thick provisioning option Add an option to the StorageClass to support creating fully allocated (thick provisioned) RBD images Signed-off-by: Madhu Rajanna <madhupr007@gmail.com> Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-02-19 11:55:40 +00:00
Niels de Vos	4937e59c4d	rbd: add backwards compatible encryption in NodeStageVolume When a volume was provisioned by an old Ceph-CSI provisioner, the metadata of the RBD image will contain `requiresEncryption` to indicate a passphrase needs to be created. New Ceph-CSI provisioners create the passphrase in the CreateVolume request, and set `encryptionPrepared` instead. When a new node-plugin detects that `requiresEncryption` is set in the RBD image metadata, it will fallback to the old behaviour. In case `encryptionPrepared` is read from the RBD image metadata, the passphrase is used to cryptsetup/format the image. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-02-17 17:51:13 +00:00
Niels de Vos	ee79b22c97	rbd: move encryption function to encryption.go This adds internal/rbd/encryption.go which will be used to include other encryption functionality to support additional KMS related functions. It will work together with the shared API from internal/util/kms.go. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-02-17 17:51:13 +00:00
Niels de Vos	dc81e001cf	cleanup: remove unused MissingPassphrase error type Storing a passphrase is now done while the volume is created. There is no need to (re)generate a passphrase when it can not be found. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-02-17 17:51:13 +00:00
Niels de Vos	9b6c2117f3	rbd: set encryption passphrase on CreateVolume Have the provisioner create the passphrase for the volume, instead of doing it lazily at the time the volume is used for the 1st time. This prevents potential races where pods on different nodes try to store different passphrases at the (almost) same time. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-02-17 17:51:13 +00:00
Niels de Vos	a42c4b5855	util: convert VAULT_SKIP_VERIFY to "vaultCAVerify" KMS option "VAULT_SKIP_VERIFY" is a standard Hashicorp Vault environment variable (a string) that needs to get converted to the "vaultCAVerify" configuration option in the Ceph-CSI format. The value of "VAULT_SKIP_VERIFY" means the reverse of "vaultCAVerify", this part was missing in the original conversion too. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-02-16 13:05:47 +00:00
Niels de Vos	d534ee9ce8	rbd: include rados-namespace when calling addRbdManagerTask() It seems that calls to addRbdManagerTask() do not include the rados-namespace in the image location. Functions calling addRbdManagerTask() construct the image location themselves, but should use rbdVolume.String() to include all the attributes. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-02-12 12:02:14 +00:00
Niels de Vos	8d0b39e690	rbd: log error when scheduling flattening fails Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-02-12 12:02:14 +00:00
Madhu Rajanna	8cd901d2dd	cephfs: add subvolume path to volume context There are many usecases with adding the subvolume path to the PV object. the volume context returned in the createVolumeResponse is added to the PV object by the external provisioner. More Details about the usecases are in below link https://github.com/rook/rook/issues/5471 Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-02-11 11:31:22 +00:00
Madhu Rajanna	dd6ce7b441	rbd: fix error check when reading vaultCAFromSecret check correct error variable when reading vaultCAFromSecret. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-02-04 14:58:40 +00:00
Madhu Rajanna	e9782d86ad	rbd: fix incorrect reading of client cert key fix incorrect reading of client cert key. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-02-04 14:58:40 +00:00
Madhu Rajanna	f63ccb0cce	rbd: store VaultCAVerify as a string storing VaultCAVerify as a string. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-02-04 14:58:40 +00:00
Madhu Rajanna	bf5c36822f	rbd: set tenant in kms object the tenant/namespace is needed to read the certificates, this commit sets the tenant in kms object. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-02-04 14:58:40 +00:00
Madhu Rajanna	22ae4a0b16	rbd: change key in secret for cert and tls currently, the keys for kms certificates/keys in a secret is ca.cert, tls.cert and tls.key, this commit changes the key from ca.cert and tls.cert to cert and tls.key to key. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-02-04 14:58:40 +00:00
Madhu Rajanna	b370d9afb6	rbd: unmarshal the data read from file Only if we are reading the kms data from the file do we need to unmarshal it. If we are reading from the configmap, it already returns the unmarshalled data. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-02-04 14:58:40 +00:00
Niels de Vos	582d004ca5	util: store EnvVaultInsecure as string, not bool The configuration option `EnvVaultInsecure` is expected to be a string, not a boolean. By converting the bool back to a string (after verification), it is now possible to skip the certificate validation check by setting `vaultCAVerify: false` in the Vault configuration. Fixes: #1852 Reported-by: Bryon Nevis <bryon.nevis@intel.com> Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-02-02 11:24:28 +00:00
Niels de Vos	df81022349	rbd: add support for VAULT_SKIP_VERIFY in KMS ConfigMap When the KMS VaultTokens is configured through a Kubernetes ConfigMap, the `VAULT_SKIP_VERIFY` option was not taken into account. The option maps to the `vaultCAVerify` value in the configuration file, but has the reverse meaning. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-02-01 13:09:54 +00:00
Mudit Agarwal	d480eb4bda	cephfs: ignore BytesQuota field in case it is not set. This can happen when the subvolume is in snapshot-retained state. We should not return error for such case as it is a valid situation. Signed-off-by: Mudit Agarwal <muagarwa@redhat.com>	2021-02-01 09:20:53 +00:00
Madhu Rajanna	584a43dc2c	rbd: fix issue in ENV variable check Currently cephcsi is returning an error if the ENV variable is set, but it should not. This commit fixes the POD_NAMESPACE env variable issue as well as the KMS_CONFIG_NAME ENV variable. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-01-29 10:00:12 +00:00
Niels de Vos	0b7521162c	cleanup: rewrite ifElseChains to switch statements Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-01-27 13:03:56 +00:00
Ilya Dryomov	04644c1d58	rbd: enable mapping and unmapping from a network namespace Make rbdplugin pod work in a non-initial network namespace (i.e. with "hostNetwork: false") by skipping waiting for udev events when mapping and unmapping images. CSI use case is very simple: all that is needed is a device node which is immediately fed to mkfs, so we should be able to tolerate udev not being finished with the device just fine. Fixes: #1323 Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2021-01-07 15:34:05 +00:00
Ilya Dryomov	c2493686b7	rbd: introduce appendDeviceTypeAndOptions() Factor out --device-type and --options formatting. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2021-01-07 15:34:05 +00:00
Ilya Dryomov	d3f31187fc	rbd: rename ndbType parameter Fix "ndb" typo. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2021-01-07 15:34:05 +00:00
Ilya Dryomov	5631b83dd0	rbd: rename mapOptions and options argument slices With the new support for passing --options, referring to ExecCommand() argument slices as mapOptions and options is confusing. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2021-01-07 15:34:05 +00:00
Seena Fallah	fdec9f65b8	rbd: fix namespace json parser for xbdDeviceInfo rbd device list --format=json returns namespace as a namespace not radosNamespace Signed-off-by: Seena Fallah <seenafallah@gmail.com>	2021-01-05 11:26:09 +00:00
Yati Padia	995879d349	cephfs: use enum to check resize is supported or not Currently, we use a bool pointer to find out whether the ceph cluster supports resize. This commit replaces the bool pointer with an enum. Signed-off-by: Yati Padia <ypadia@redhat.com> Fixes #1764	2021-01-04 04:42:58 +00:00
Madhu Rajanna	9c7176dbb4	rbd: update mount packages in import path The mount package has moved from k8s.io/utils/mount to a new repository, k8s.io/mount-utils. Updated the code to use the same. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2020-12-17 16:04:54 +00:00
Madhu Rajanna	b3fbcb9c95	rbd: read configuration from the configmap If the kms encryption configmap is not mounted as a volume to the CSI pods, add the code to read the configuration from Kubernetes. Later the code to fetch the configmap will be moved to the new sidecar, which will talk to the respective CO to fetch the encryption configurations. The k8s configmap uses the standard Vault-specific names for the configuration options. These will be converted back to the CSI configurations. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2020-12-16 15:05:05 +00:00
Niels de Vos	e4b16a5c72	util: allow tenants to (re)configure VaultTokens settings A tenant can place a ConfigMap in their Kubernetes Namespace with configuration options that differ from the global (by the Storage Admin set) values. The ConfigMap needs to be located in the Tenants namespace, as described in the documentation See-also: docs/design/proposals/encryption-with-vault-tokens.md Signed-off-by: Niels de Vos <ndevos@redhat.com>	2020-12-16 13:42:52 +00:00
Madhu Rajanna	81061e9f68	util: add support for vault certificates Added an option to pass the client certificate and the client certificate key for the vault token based encryption. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2020-12-16 11:01:15 +00:00
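
Commit fe0f169875 in the log above describes splitting thick-provisioning writes into blocks of at most 1 GiB per WriteSame() call, with any left-over that is smaller than a stripe written through WriteAt(). The following is a minimal Go sketch of that planning step only, not the code from the commit. The names planThickWrites, writeOp and maxWriteSame are hypothetical, and no go-ceph calls are made; the sketch only computes which offsets and lengths would be written.

```go
package main

import "fmt"

// maxWriteSame is the assumed upper limit per WriteSame()-style call (1 GiB).
const maxWriteSame = uint64(1) << 30

// writeOp is a hypothetical description of one planned write.
type writeOp struct {
	Offset uint64
	Length uint64
	Same   bool // true: full stripes (WriteSame-style), false: left-over (WriteAt-style)
}

// planThickWrites splits an image of `size` bytes into writes that are
// multiples of stripeSize and never larger than maxWriteSame, followed by
// one plain write for a trailing partial stripe, if any.
func planThickWrites(size, stripeSize uint64) []writeOp {
	if size == 0 || stripeSize == 0 {
		return nil
	}

	// Largest multiple of stripeSize that fits in the 1 GiB limit.
	chunk := maxWriteSame - (maxWriteSame % stripeSize)
	if chunk == 0 {
		// Assumption: a stripe larger than the limit is written one stripe at a time.
		chunk = stripeSize
	}

	var ops []writeOp
	fullEnd := size - (size % stripeSize) // end of the last complete stripe
	offset := uint64(0)
	for offset < fullEnd {
		length := chunk
		if remaining := fullEnd - offset; remaining < length {
			length = remaining
		}
		ops = append(ops, writeOp{Offset: offset, Length: length, Same: true})
		offset += length
	}

	// Left-over bytes that do not fill a whole stripe.
	if offset < size {
		ops = append(ops, writeOp{Offset: offset, Length: size - offset, Same: false})
	}

	return ops
}

func main() {
	const mib = uint64(1) << 20
	// 2.5 GiB plus 100 bytes, with a 4 MiB stripe.
	for _, op := range planThickWrites(2560*mib+100, 4*mib) {
		fmt.Printf("offset=%d length=%d same=%t\n", op.Offset, op.Length, op.Same)
	}
}
```

Keeping the planning separate from the I/O makes the size limit and the left-over handling easy to check without a Ceph cluster.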

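Commits a42c4b5855, 582d004ca5 and df81022349 in the log above note that `VAULT_SKIP_VERIFY` has the reverse meaning of the `vaultCAVerify` option and that the value is kept as a string. A minimal Go sketch of such a conversion follows. The function name skipVerifyToCAVerify is hypothetical and this is not the Ceph-CSI implementation; it only illustrates parsing the string, inverting it, and returning a string again.

```go
package main

import (
	"fmt"
	"strconv"
)

// skipVerifyToCAVerify converts the string value of VAULT_SKIP_VERIFY into
// the string value for a vaultCAVerify-style option. The two options have
// opposite meanings, so the parsed boolean is inverted.
func skipVerifyToCAVerify(skipVerify string) (string, error) {
	skip, err := strconv.ParseBool(skipVerify)
	if err != nil {
		return "", fmt.Errorf("failed to parse VAULT_SKIP_VERIFY value %q: %w", skipVerify, err)
	}

	// Store the result as a string, not a bool.
	return strconv.FormatBool(!skip), nil
}

func main() {
	for _, v := range []string{"true", "false", "maybe"} {
		caVerify, err := skipVerifyToCAVerify(v)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("VAULT_SKIP_VERIFY=%s -> vaultCAVerify=%s\n", v, caVerify)
	}
}
```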
1128 Commits