ceph-csi

mirror of https://github.com/ceph/ceph-csi.git synced 2025-06-03 04:16:42 +00:00

Author	SHA1	Message	Date
Humble Chirammal	ff0911fb6a	rbd: add unittests for IsMigrationSecret and ParseAndSetSecretMapFromMigSecret This commit adds unit tests for newly introduced migration specific functions. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-10-27 18:35:00 +00:00
Humble Chirammal	b49bf4b987	rbd: parse migration secret and set it for controller server operations This commit adds a couple of helper functions to parse the migration request secret and set it for further csi driver operations. More details: The intree secret has a data field called "key" which is the base64 admin secret key. The ceph CSI driver currently expects the secret to contain the data field "UserKey" for the equivalent. The CSI driver also expects the "UserID" field, which is not available in the in-tree secret by default. This missing userID will be filled (if the username differs from 'admin') in the migration secret as the 'adminId' field in the migration request. This commit adds the logic to parse this migration secret as below: the "key" field value will be picked up from the migration secret to the "UserKey" field. The "adminId" field value will be picked up from the migration secret to the "UserID" field; if the `adminId` field is nil or not set, the `UserID` field will be filled with the default value, i.e. `admin`. The above logic gets activated only when the secret is a migration secret; otherwise it is skipped and the normal workflow is followed as today. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-10-27 18:35:00 +00:00
Niels de Vos	b132696e54	rbd: note that thick-provisioning is deprecated Thick-provisioning was introduced to make accounting of assigned space for volumes easier. When thick-provisioned volumes are the only consumer of the Ceph cluster, this works fine. However, it is unlikely that this is the case. Instead, accounting of the requested (thin-provisioned) size of volumes is much more practical as different types of volumes can be tracked. OpenShift already provides cluster-wide quotas, which can combine accounting of requested volumes by grouping different StorageClasses. In addition to the difficult practise of allowing only thick-provisioned RBD backed volumes, the performance makes thick-provisioning troublesome. As volumes need to be completely allocated, data needs to be written to the volume. This can take a long time, depending on the size of the volume. Provisioning, cloning and snapshotting becomes very much noticeable, and because of the additional time consumption, more prone to failures. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-10-27 06:54:07 +00:00
Madhu Rajanna	0838845c6a	cleanup: remove FIXME from ResyncVolume as the complexity of ResyncVolume is reduced removing the FIXME which is not valid anymore. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-26 12:00:36 +00:00
Madhu Rajanna	2017b8c621	rbd: log mirror daemon state for replication log the mirror daemon state in the local and remote cluster for better debugging. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-26 12:00:36 +00:00
Madhu Rajanna	7472338334	rbd: remove unwanted const for comparing the image states use the states defined in go-ceph and avoid creating duplicate consts in cephcsi. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-26 12:00:36 +00:00
Madhu Rajanna	b92a6f5ccb	rbd: log the remote site details during resync logging the remote site details during resyncing for better debugging. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-26 12:00:36 +00:00
Madhu Rajanna	1fd2f28fee	rbd: check local image state for resyncing below are the local states of the mirrored image "unknown" -> If the image is in an error state means data is completely synced "error" -> If the image is in an error state means it needs resync "syncing" "starting_replay" "replaying" "stopping_replay" "stopped" If the resync is successfully started, the image will be in the "replaying" state. We can consider the "replaying" state as reporting that the resync is successfully in progress. We are discarding the intermediate states like "syncing", "starting_replay" and "stopping_replay". Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-26 12:00:36 +00:00
Rakshith R	12cd05a408	rbd: add EnsureImageCleanup to snapshot deletion Signed-off-by: Rakshith R <rar@redhat.com>	2021-10-20 18:25:31 +00:00
Rakshith R	1849076aab	rbd: add EnsureImageCleanup to ensure image cleanup from trash After moving the image to trash, if the `trash remove` step fails, then external-provisioner will issue subsequent requests, in which the image will be absent from the pool (it will be in trash) and omap cleanup will be done with a stale image left in trash with no `trash remove` step on it. To avoid this scenario, list trash images, find the corresponding id for the given image name and add a task to flatten when we encounter an ErrImageNotFound. Fixes: #1728 Signed-off-by: Rakshith R <rar@redhat.com>	2021-10-20 18:25:31 +00:00
Niels de Vos	6d3e25f069	util: NodeGetVolumeStatsResponse.Usage may not contain negative values Following the CSI specification, values that are included in the VolumeUsage MUST NOT be negative. However, CephFS seems to return -1 for the number of inodes that are available. Instead of returning a negative value, set it to 0 so that it will not get included in the encoded JSON response. Updates: #2579 See-also: `5b0d454015/spec.md (L2477-L2487)` Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-10-20 07:18:48 +00:00
Madhu Rajanna	0d51f6d833	rbd: check local image description for split-brain In some corner case like `re-player shutdown` the local image will not be in error state. It would be also worth considering `description` field to make sure about split-brain. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-18 11:22:03 +00:00
Humble Chirammal	c584fa20da	rbd: use clusterID from volumeContext at nodestage previously we were retrieving clusterID using the monitors field in the volume context at node stage code path. However, it is possible to retrieve or use clusterID directly from the volume context. This commit also removes the getClusterIDFromMigrationVolume() function which was used previously, and its tests Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-10-11 10:06:30 +00:00
Humble Chirammal	4e61156dc4	rbd: change iteration variable name in the migration test to be specific we reuse or overload the variable name in the test execution at present. This commit use a different variable name as initialized in each run Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-10-11 10:06:30 +00:00
Madhu Rajanna	90ecd2d7e8	rbd: use go-ceph to get mirroring info use go-ceph api to get image mirroring info. closes #2558 Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-07 08:02:06 +00:00
Madhu Rajanna	8ebc0659ab	rbd: perform resize of file system for static volume For a static volume, the user manually mounts an already existing image as a volume to the application pods. As it is an rbd image, if the PVC is of type fileSystem the image will be mapped, formatted and mounted on the node. If the user resizes the image on the ceph cluster, the filesystem created on the rbd image is not resized automatically. Even if the user deletes and recreates the kubernetes objects, the new size will not be visible on the node. With this change, during the NodeStageVolumeRequest the nodeplugin will check the size of the mapped rbd image on the node using the devicePath, and also the rbd image size on the ceph cluster. If the sizes do not match, it will do the file system resize on the node as part of the NodeStageVolumeRequest RPC call. The user needs to do the below operations to see the new size: * Resize the rbd image in ceph cluster * Scale down all the application pods using the static PVC. * Make sure no application pods which are using the static PVC are running on a node. * Scale up all the application pods. Validate the new size in the application pod's mounted volume. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-06 13:15:00 +00:00
Madhu Rajanna	fe9020260d	rbd: move flattening to helper function in NodeStage operation we are flattening the image to support mounting on the older clients. This commit moves it to a helper function to reduce code complexity. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-06 13:15:00 +00:00
Madhu Rajanna	cda2abca5d	rbd: use NewMetricsBlock to get size instead of lsblk command use NewMetricsBlock function from the kubernetes package to get the size. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-06 13:15:00 +00:00
Rakshith R	ded75eb099	rbd: copyEncryptionConfig for thickProvisioned snap restore too This commit adds bugfix to copy encryption passphrase for thick provisioned PVC restored from snapshot. Signed-off-by: Rakshith R <rar@redhat.com>	2021-10-05 07:46:57 +00:00
Rakshith R	59b7a26175	rbd: modify copyEncryptionConfig to accept copyOnlyPassphrase arg During PVC snapshot/clone both kms config and passphrase needs to copied, while for PVC restore only passphrase needs to be copied to dest rbdvol since destination storageclass may have another kms config. Signed-off-by: Rakshith R <rar@redhat.com>	2021-10-05 07:46:57 +00:00
Humble Chirammal	3c9d7e3cd5	rbd: detect migration volID in DeleteVolume() and delete rbd image This commit adds the logic to detect a passed in volumeID is a migrated volume ID and if yes, the driver connect to the backend cluster and clean/delete the image. The logic only applied if its a migration volume ID. The migration volume ID carry the information like mons, pool and image name which is good enough for the driver to identify and connect to the backend cluster for its operations. migration volID format: <mig>_mons-<monsHash>_image-<imageUID>_<poolHash> Details on the hash values: * MonsHash: this carry a hash value (md5sum) which will be acted as the `clusterID` for the operations in this context. * ImageUID: this is the unique UUID generated by kubernetes for the created volume. * PoolHash: this is an encoded string of pool name. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-10-04 16:06:31 +00:00
Madhu Rajanna	34a21cdbe3	cleanup: move mount functions to new pkg moved fuse and kernel mount functions to a new package. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-09-23 06:39:37 +00:00
Madhu Rajanna	b1ef842640	cleanup: move core functions to core pkg as we are refactoring the cephfs code, moving all the core functions to a new folder /pkg called core. This will make things easier to implement. From now onwards all the core functionalities will be added to the core package. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-09-23 06:39:37 +00:00
Humble Chirammal	4804f47b18	e2e: Add e2e for rbd migration static pvc This commit adds e2e for rbd migration static PVCs Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-09-20 09:54:54 +00:00
Humble Chirammal	2e8e8f5e64	rbd: fill clusterID if its a migration nodestage request the migration nodestage request does not carry the 'clusterID' in it and only monitors are available with the volumeContext. The volume context flag 'migration=true' and 'static=true' flags allow us to fill 'clusterID' from the passed in monitors to the volume Context,so that rest of the static operations on nodestage can be proceeded as we do treat static volumes today. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-09-20 09:54:54 +00:00
Humble Chirammal	1f5963919f	util: get clusterID for the passed in mon string as part of migration support, the clusterID has to be fetched from passed in mon. Because the intree RBD storage class only got monitor and not `clusterID` parameter support. However, in CSI, SC has the `clusterID` parameter support but not mon. Due to that we have to fetch the clusterID from config file for the passed in mon and use it in our operations. This adds a helper function to retrieve clusterID from passed in mon string. Updates https://github.com/ceph/ceph-csi/issues/2509 Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-09-20 09:54:54 +00:00
Prasanna Kumar Kalever	c9cc36d8db	rbd: provide alternatives to preserve the ceph log files Currently, we delete the ceph client log file on unmap/detach. This patch provides additional alternatives for users who would like to persist the log files. Strategies: ----------- `remove`: delete log file on unmap/detach `compress`: compress the log file to gzip on unmap/detach `preserve`: preserve the log file in text format Note that the default strategy will be remove on unmap, and these options can be tweaked from the storage class Compression size details example: On Map: (with debug-rbd=20) --------- $ ls -lh -rw-r--r-- 1 root root 526K Sep 1 18:15 rbd-nbd-0001-0024-fed5480a-f00f-417a-a51d-31d8a8144c03-0000000000000003-d2e89c87-0b4d-11ec-8ea6-160f128e682d.log On unmap: --------- $ ls -lh -rw-r--r-- 1 root root 33K Sep 1 18:15 rbd-nbd-0001-0024-fed5480a-f00f-417a-a51d-31d8a8144c03-0000000000000003-d2e89c87-0b4d-11ec-8ea6-160f128e682d.gz Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-09-16 13:55:15 +00:00
Prasanna Kumar Kalever	10bbb049f7	cleanup: passing pointers to larger type Log: internal/rbd/rbd_attach.go:424:2: hugeParam: dArgs is heavy (88 bytes); consider passing it by pointer (gocritic) Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-09-16 13:55:15 +00:00
Prasanna Kumar Kalever	ad2c6d2851	util: add gzip helper function Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-09-16 13:55:15 +00:00
Shyamsundar Ranganathan	47dc9cf28d	rbd: Report errors when a resync maybe in progress Currently we return a !ready status if an image is not found when a replication resync is issued. We also return a !ready just post issuing a resync. The change is to ensure we return errors in these cases for the caller to retry the operation till we can determine we are actually resyncing, and then return !ready with nil errors. Part of addressing: https://github.com/csi-addons/volume-replication-operator/issues/101 Signed-off-by: Shyamsundar Ranganathan <srangana@redhat.com>	2021-09-15 15:59:22 +00:00
Rakshith R	82d09d81cf	util: modify GetMonsAndClusterID() to take clusterID instead of options This commit: - modifies GetMonsAndClusterID() to take clusterID instead of options. - moves out validation of clusterID is set or not out of GetMonsAndClusterID(). - defines ErrClusterIDNotSet new error for reusability. - add GetClusterID() to obtain clusterID from options. Signed-off-by: Rakshith R <rar@redhat.com>	2021-09-14 08:39:57 +00:00
Rakshith R	9d1e98ca60	rbd: check for clusterid mapping in genVolFromVolumeOptions() This commit adds capability to genVolFromVolumeOptions() to fetch mapped clusted-id & mon ips for mirrored PVC on secondary cluster which may have different cluster-id. This is required for NodeStageVolume(). We also don't need to check for mapping during volume create requests, so it can be disabled by passing a bool checkClusterIDMapping as false. GetMonsAndClusterID() is modified to accept bool checkClusterIDMapping based on which clustermapping is checked to fetch mapped cluster-id and mon-ips. Signed-off-by: Rakshith R <rar@redhat.com>	2021-09-14 08:39:57 +00:00
Humble Chirammal	4be53a27d3	cleanup: replace parentName to snapParentName in checkReservation at present, even though checkReservation works for both volume and snapshot, the arg parentName makes sense only for snapshot cases. Renaming that arg to something more appropriate Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-09-14 05:32:54 +00:00
Humble Chirammal	1fee3ec460	cleanup: correct checkReservation return description it wrongly mention that the return is imageUUID string where actually it is the imageData struct Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-09-14 05:32:54 +00:00
Rakshith R	0a7a7f4866	util: call WriteCephConfig() in cephcsi.go This commit calls WriteCephConfig() in cephcsi.go to create ceph.conf and keyring if it is not mounted to be used by all cli calls and conn cmds. Before this change, rbd-controller/omap-generator did not create ceph.conf on startup. Signed-off-by: Rakshith R <rar@redhat.com>	2021-09-08 16:05:27 +00:00
Madhu Rajanna	8c8f34cf7a	rbd: set vaultAuthNamespace to vaultNamespace if empty When we read the csi-kms-connection-details configmap, vaultAuthNamespace might not be set. When we do the conversion, the vaultAuthNamespace might be set to an empty key; this commit checks for the empty value of vaultAuthNamespace and sets the vaultAuthNamespace to vaultNamespace. Setting an empty value for vaultAuthNamespace happened due to Marshalling at https://github.com/ceph/ceph-csi/blob/devel/internal/kms/vault_tokens.go#L136-L139. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-09-08 11:18:03 +00:00
Rakshith R	e99dd3dea4	util: read ceph.conf by calling conn.ReadConfigFile(CephConfigPath) The configurations in ceph.conf are not picked up by the rados connection automatically, hence we need to call conn.ReadConfigFile before calling Connect(). Signed-off-by: Rakshith R <rar@redhat.com>	2021-09-07 16:50:12 +00:00
Madhu Rajanna	76f1b42498	cephfs: correct comment for validateExpandVolumeRequest corrected the function comment for validateExpandVolumeRequest. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-09-07 14:33:02 +00:00
Madhu Rajanna	9fd51d9bec	cephfs: add comment for validateCreateVolumeRequest added function comment for validateCreateVolumeRequest Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-09-07 14:33:02 +00:00
Madhu Rajanna	8caeb409bb	cephfs: add comment for validateDeleteVolumeRequest added function comment for the validateDeleteVolumeRequest function. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-09-07 14:33:02 +00:00
Madhu Rajanna	be7749c90e	cleanup: move volumeID to the volumeoptions volumeID can be moved to the volumeOptions as most of the volume related helper functions are available on the volumeoptions.go Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-09-07 14:33:02 +00:00
Madhu Rajanna	da70ed50dc	cleanup: move execCommandErr to volumemounter Moved execCommandErr to the volumemounter.go which is the only caller of this function and moving the execCommandErr helps in reducing the util file. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-09-07 14:33:02 +00:00
Madhu Rajanna	31696a6ce0	cleanup: move genSnapFromOptions to volumeoptions moved genSnapFromOptions function to volumeoptions.go which is more appropriated than util. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-09-07 14:33:02 +00:00
Madhu Rajanna	73e2ffe8b8	cleanup: move cephfs csi spec validation to validator moved the cephfs related validation like validating the input parameters sent in the GRPC request to a new file. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-09-07 14:33:02 +00:00
Humble Chirammal	4efcc5bf97	cleanup: simplify checkStaticVolume function and remove unwanted vars checkStaticVolume() in the reconcilePV function has been unwantedly introducing variables to confirm the pv spec is static or not. This patch simplify it and make a smaller footprint of the functions. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-09-07 12:51:30 +00:00
Humble Chirammal	df2d9548ae	cephfs: no need to check for zero volume size At present there is a 'todo' to check for zero volume size in the createVolume request which is unwanted, i.e. the pvc creation with size 0 fails in the kubernetes api validation itself. For ex: `..spec.resources[storage]: Invalid value: "0": must be greater than zero` so we don't need any extra check in the controller server Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-09-07 04:49:24 +00:00
Prasanna Kumar Kalever	9e55f015de	rbd: avoid supplying map options on unmap Thanks to the random unmap failure on my local machine: I0901 17:08:37.841890 2617035 cephcmds.go:55] ID: 11 Req-ID: 0001-0024-fed5480a-f00f-417a-a51d-31d8a8144c03-0000000000000003-024983f3-0b47-11ec-8fcb-e671f0b9f58e an error (exit status 22) occurred while running rbd args: [unmap rbd-pool/csi-vol-024983f3-0b47-11ec-8fcb-e671f0b9f58e --device-type nbd --options try-netlink --options reattach-timeout=300 --options io-timeout=0] Noticed the map args are also getting passed to/as unmap args, which is not correct. We have separate things for mapOptions and unmapOptions. This PR makes sure that the map args are not passed at the time of unmap. Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-09-06 15:59:30 +00:00
Humble Chirammal	3f31ca8a3a	cleanup: introduce populateVolOptions(), to fill rbdVol from stage req At present nodeStageVolume() handles much of the logic of filling the rbdvol struct based on the request received, and this method is complex to follow. With this patch, filling or populating volOptions has been segregated and handled separately, which makes the stage function's job easier. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-09-06 07:49:03 +00:00
Humble Chirammal	f0b8a3f626	rbd: use String() method of MirrorImageState in return error MirrorImageState (type C.rbd_mirror_image_state_t) has a string method which can be used while returning error in the replication controller. Previously, we were using int return in the error which is not the proper usage. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-09-03 16:02:53 +00:00
Madhu Rajanna	4865061ab9	util: create ceph configuration files if not present create ceph.conf and keyring files if they are not present in the /etc/ceph/ path. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-09-03 14:14:43 +00:00
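
The migration-secret mapping described in commit b49bf4b987 above can be sketched roughly as follows. This is a minimal illustration, not the driver's code: the function names `isMigrationSecret` and `parseMigrationSecret` are hypothetical, the field names ("key", "adminId", "UserKey", "UserID") are copied from the commit message as written, and treating "a non-empty `key` field" as the test for a migration secret is an assumption made only for this example.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	// Field names as written in the commit message.
	migKeyField     = "key"     // in-tree secret: base64 admin secret key
	migIDField      = "adminId" // optional user name in the migration secret
	csiUserKeyField = "UserKey"
	csiUserIDField  = "UserID"
	defaultUserID   = "admin"
)

// isMigrationSecret is a hypothetical check: this sketch assumes a secret
// counts as a migration secret when it carries a non-empty "key" field.
func isMigrationSecret(secret map[string]string) bool {
	return secret[migKeyField] != ""
}

// parseMigrationSecret (hypothetical name) copies "key" to "UserKey" and
// "adminId" to "UserID", falling back to "admin" when adminId is unset.
func parseMigrationSecret(secret map[string]string) (map[string]string, error) {
	key := secret[migKeyField]
	if key == "" {
		return nil, errors.New("migration secret has no \"key\" field")
	}
	user := secret[migIDField]
	if user == "" {
		user = defaultUserID
	}
	return map[string]string{
		csiUserKeyField: key,
		csiUserIDField:  user,
	}, nil
}

func main() {
	// Placeholder value, not a real key.
	inTree := map[string]string{"key": "QVFBLi4u"}
	if !isMigrationSecret(inTree) {
		fmt.Println("not a migration secret, normal workflow applies")
		return
	}
	parsed, err := parseMigrationSecret(inTree)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(parsed[csiUserIDField], parsed[csiUserKeyField])
}
```

A secret that is not recognised as a migration secret would skip this mapping entirely and go through the normal workflow, as the commit message notes.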


707 Commits