ceph-csi

mirror of https://github.com/ceph/ceph-csi.git synced 2025-06-01 19:36:41 +00:00

Author	SHA1	Message	Date
Prasanna Kumar Kalever	e6fa392df1	rbd: fix mapOptions passing with rbd-nbd mounter This was a regression introduced by: https://github.com/ceph/ceph-csi/pull/2556 Fixes: #2610 Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-11-16 10:12:46 +00:00
Prasanna Kumar Kalever	3686b6da8b	rbd: utilize cookie support from rbd for nbd Problem: On remap/attach of device (i.e. nodeplugin restart), there is no way for rbd-nbd to defend if the backend storage is matching with the initial backend storage. Say, if an initial map request for backend "pool1/image1" got mapped to /dev/nbd0 and the userspace process is terminated (on nodeplugin restart). A next remap/attach (nodeplugin start) request within reattach-timeout is allowed to use /dev/nbd0 for a different backend "pool1/image2" For example, an operation like below could be dangerous: $ sudo rbd-nbd map --try-netlink rbd-pool/ext4-image /dev/nbd0 $ sudo blkid /dev/nbd0 /dev/nbd0: UUID="bfc444b4-64b1-418f-8b36-6e0d170cfc04" TYPE="ext4" $ sudo pkill -15 rbd-nbd <-- nodeplugin terminate $ sudo rbd-nbd attach --try-netlink --device /dev/nbd0 rbd-pool/xfs-image /dev/nbd0 $ sudo blkid /dev/nbd0 /dev/nbd0: UUID="d29bf343-6570-4069-a9ea-2fa156ced908" TYPE="xfs" Solution: rbd-nbd/kernel now provides a way to keep some metadata in sysfs to identify between the device and the backend, so that when a remap/attach request is made, rbd-nbd can compare and avoid such dangerous operations. With the provided solution, as part of the initial map request, backend cookie (ceph-csi VOLID) can be stored in the sysfs per device config, so that on a remap/attach request rbd-nbd will check and validate if the backend per device cookie matches with the initial map backend with the help of cookie. At Ceph-csi we use VOLID as device cookie, which will be unique, we pass the VOLID as cookie at map and use the same at the time of attach, that way rbd-nbd can identify backends and their matching devices. Requires: https://github.com/ceph/ceph/pull/41323 https://lkml.org/lkml/2021/4/29/274 Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-11-04 03:20:59 +00:00
Prasanna Kumar Kalever	793b22cf27	rbd: check for nbd cookie support Change checkRbdNbdTools() to setRbdNbdToolFeatures() Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-11-04 03:20:59 +00:00
Prasanna Kumar Kalever	9a3170bf77	rbd: provide a way to disable the auto fallback to nbd mounter This change allows the user to choose not to fallback to NBD mounter when some ImageFeatures are absent with krbd driver, rather just fail the NodeStage call. Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-11-01 08:17:36 +00:00
Prasanna Kumar Kalever	bfc24f6f12	cleanup: generalize the parseBool function Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-11-01 08:17:36 +00:00
Prasanna Kumar Kalever	84ec797dda	rbd: detect krbd features in runtime and fallback to nbd Currently, we recognize and warn for the provided image features based on our prior intelligence at ceph-csi (i.e based on supportedFeatures map and validateImageFeatures) at image/PV creation time. It might be very much possible that the cluster is heterogeneous i.e. the PV creation and application container might both be on different nodes with different kernel versions (krbd driver versions). This PR adds a mechanism to check for the supported krbd features during mount time, if the krbd driver doesn't have the specified image feature then it will fall back to rbd-nbd mounter. Fixes: #478 Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-11-01 08:17:36 +00:00
Humble Chirammal	6aec858cba	rbd: parse migration secret and set fields for nodestage operations this commit make use of the migration request secret parsing and set the required fields for further nodestage operations Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-10-27 18:35:00 +00:00
Humble Chirammal	5621f2cfca	rbd: split the parsing and deletion logic to its own functions. parseAndDeleteMigratedVolume() prviously clubbed the logic of parsing of migration volume handle and then continued with the deletion of the volume. however this commit split this logic into two, ie parsing has been done in parseMigrationVolID() and DeleteMigratedVolume() deletes the backend volume. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-10-27 18:35:00 +00:00
Humble Chirammal	b49bf4b987	rbd: parse migration secret and set it for controller server operations This commit adds a couple of helper functions to parse the migration request secret and set it for further csi driver operations. More details: The intree secret has a data field called "key" which is the base64 admin secret key. The ceph CSI driver currently expect the secret to contain data field "UserKey" for the equivalant. The CSI driver also expect the "UserID" field which is not available in the in-tree secret by deafult. This missing userID will be filled (if the username differ than 'admin') in the migration secret as 'adminId' field in the migration request, this commit adds the logic to parse this migration secret as below: "key" field value will be picked up from the migraion secret to "UserKey" field. "adminId" field value will be picked up from the migration secret to "UserID" field if `adminId` field is nil or not set, `UserID` field will be filled with default value ie `admin`.The above logic get activated only when the secret is a migration secret, otherwise skipped to the normal workflow as we have today. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-10-27 18:35:00 +00:00
Niels de Vos	b132696e54	rbd: note that thick-provisioning is deprecated Thick-provisioning was introduced to make accounting of assigned space for volumes easier. When thick-provisioned volumes are the only consumer of the Ceph cluster, this works fine. However, it is unlikely that this is the case. Instead, accounting of the requested (thin-provisioned) size of volumes is much more practical as different types of volumes can be tracked. OpenShift already provides cluster-wide quotas, which can combine accounting of requested volumes by grouping different StorageClasses. In addition to the difficult practise of allowing only thick-provisioned RBD backed volumes, the performance makes thick-provisioning troublesome. As volumes need to be completely allocated, data needs to be written to the volume. This can take a long time, depending on the size of the volume. Provisioning, cloning and snapshotting becomes very much noticeable, and because of the additional time consumption, more prone to failures. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-10-27 06:54:07 +00:00
Madhu Rajanna	0838845c6a	cleanup: remove FIXME from ResyncVolume as the complexity of ResyncVolume is reduced removing the FIXME which is not valid anymore. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-26 12:00:36 +00:00
Madhu Rajanna	2017b8c621	rbd: log mirror daemon state for replication log the mirror deamon state in the local and remote cluster for better debugging. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-26 12:00:36 +00:00
Madhu Rajanna	7472338334	rbd: remove unwanted const for comparing the image states use the states defined in the go-ceph avoid creating of the deplicate const in cephcsi. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-26 12:00:36 +00:00
Madhu Rajanna	b92a6f5ccb	rbd: log the remote site details during resync logging the remote site details during resyncing for better debugging. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-26 12:00:36 +00:00
Madhu Rajanna	1fd2f28fee	rbd: check local image state for resyncing below are the local states of the mirrored image "unknown" -> If the image is in an error state means data is completely synced "error" -> If the image is in an error state means it needs resync "syncing" "starting_replay" "replaying" "stopping_replay" "stopped" If the resync is successfully started which means the image will be in "replaying" state. we can consider "replaying" state to report resync succesfully going on state. we are discarding the intermediate states like "syncing", "starting_replay" and "stopping_replay". Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-26 12:00:36 +00:00
Rakshith R	12cd05a408	rbd: add EnsureImageCleanup to snapshot deletion Signed-off-by: Rakshith R <rar@redhat.com>	2021-10-20 18:25:31 +00:00
Rakshith R	1849076aab	rbd: add EnsureImageCleanup to ensure image cleanup from trash After moving moving image to trash, if `trash remove` step fails, then external-provisioner will issue subsequent requests, in which image will be absent in pool( will be in trash) and omap cleanup will be done with stale image left in trash with no `trash remove` step on it. To avoid this scenario list trash images and find corresponding id for given image name and add a task to flatten when we encounter a ErrImageNotFound. Fixes: #1728 Signed-off-by: Rakshith R <rar@redhat.com>	2021-10-20 18:25:31 +00:00
Madhu Rajanna	0d51f6d833	rbd: check local image description for split-brain In some corner case like `re-player shutdown` the local image will not be in error state. It would be also worth considering `description` field to make sure about split-brain. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-18 11:22:03 +00:00
Humble Chirammal	c584fa20da	rbd: use clusterID from volumeContext at nodestage previously we were retriving clusterID using the monitors field in the volume context at node stage code path. however it is possible to retrieve or use clusterID directly from the volume context. This commit also remove the getClusterIDFromMigrationVolume() function which was used previously and its tests Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-10-11 10:06:30 +00:00
Humble Chirammal	4e61156dc4	rbd: change iteration variable name in the migration test to be specific we reuse or overload the variable name in the test execution at present. This commit use a different variable name as initialized in each run Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-10-11 10:06:30 +00:00
Madhu Rajanna	90ecd2d7e8	rbd: use go-ceph to get mirroring info use go-ceph api to get image mirroring info. closes #2558 Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-07 08:02:06 +00:00
Madhu Rajanna	8ebc0659ab	rbd: perform resize of file system for static volume For static volume, the user will manually mounts already existing image as a volume to the application pods. As its a rbd Image, if the PVC is of type fileSystem the image will be mapped, formatted and mounted on the node, If the user resizes the image on the ceph cluster. User cannot not automatically resize the filesystem created on the rbd image. Even if deletes and recreates the kubernetes objects, the new size will not be visible on the node. With this changes During the NodeStageVolumeRequest the nodeplugin will check the size of the mapped rbd image on the node using the devicePath. and also the rbd image size on the ceph cluster. If the size is not matching it will do the file system resize on the node as part of the NodeStageVolumeRequest RPC call. The user need to do below operation to see new size * Resize the rbd image in ceph cluster * Scale down all the application pods using the static PVC. * Make sure no application pods which are using the static PVC is running on a node. * Scale up all the application pods. Validate the new size in application pod mounted volume. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-06 13:15:00 +00:00
Madhu Rajanna	fe9020260d	rbd: move flattening to helper function in NodeStage operation we are flattening the image to support mounting on the older clients. this commits moves it to a helper function to reduce code complexity. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-06 13:15:00 +00:00
Madhu Rajanna	cda2abca5d	rbd: use NewMetricsBlock to get size instead of lsblk command use NewMetricsBlock function from the kubernetes package to get the size. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-06 13:15:00 +00:00
Rakshith R	ded75eb099	rbd: copyEncryptionConfig for thickProvisioned snap restore too This commit adds bugfix to copy encryption passphrase for thick provisioned PVC restored from snapshot. Signed-off-by: Rakshith R <rar@redhat.com>	2021-10-05 07:46:57 +00:00
Rakshith R	59b7a26175	rbd: modify copyEncryptionConfig to accept copyOnlyPassphrase arg During PVC snapshot/clone both kms config and passphrase needs to copied, while for PVC restore only passphrase needs to be copied to dest rbdvol since destination storageclass may have another kms config. Signed-off-by: Rakshith R <rar@redhat.com>	2021-10-05 07:46:57 +00:00
Humble Chirammal	3c9d7e3cd5	rbd: detect migration volID in DeleteVolume() and delete rbd image This commit adds the logic to detect a passed in volumeID is a migrated volume ID and if yes, the driver connect to the backend cluster and clean/delete the image. The logic only applied if its a migration volume ID. The migration volume ID carry the information like mons, pool and image name which is good enough for the driver to identify and connect to the backend cluster for its operations. migration volID format: <mig>_mons-<monsHash>_image-<imageUID>_<poolHash> Details on the hash values: * MonsHash: this carry a hash value (md5sum) which will be acted as the `clusterID` for the operations in this context. * ImageUID: this is the unique UUID generated by kubernetes for the created volume. * PoolHash: this is an encoded string of pool name. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-10-04 16:06:31 +00:00
Madhu Rajanna	b1ef842640	cleanup: move core functions to core pkg as we are refractoring the cephfs code, Moving all the core functions to a new folder /pkg called core. This will make things easier to implement. For now onwards all the core functionalities will be added to the core package. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-09-23 06:39:37 +00:00
Humble Chirammal	2e8e8f5e64	rbd: fill clusterID if its a migration nodestage request the migration nodestage request does not carry the 'clusterID' in it and only monitors are available with the volumeContext. The volume context flag 'migration=true' and 'static=true' flags allow us to fill 'clusterID' from the passed in monitors to the volume Context,so that rest of the static operations on nodestage can be proceeded as we do treat static volumes today. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-09-20 09:54:54 +00:00
Prasanna Kumar Kalever	c9cc36d8db	rbd: provide alternatives to preserve the ceph log files Currently, we delete the ceph client log file on unmap/detach. This patch provides additional alternatives for users who would like to persist the log files. Strategies: ----------- `remove`: delete log file on unmap/detach `compress`: compress the log file to gzip on unmap/detach `preserve`: preserve the log file in text format Note that the default strategy will be remove on unmap, and these options can be tweaked from the storage class Compression size details example: On Map: (with debug-rbd=20) --------- $ ls -lh -rw-r--r-- 1 root root 526K Sep 1 18:15 rbd-nbd-0001-0024-fed5480a-f00f-417a-a51d-31d8a8144c03-0000000000000003-d2e89c87-0b4d-11ec-8ea6-160f128e682d.log On unmap: --------- $ ls -lh -rw-r--r-- 1 root root 33K Sep 1 18:15 rbd-nbd-0001-0024-fed5480a-f00f-417a-a51d-31d8a8144c03-0000000000000003-d2e89c87-0b4d-11ec-8ea6-160f128e682d.gz Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-09-16 13:55:15 +00:00
Prasanna Kumar Kalever	10bbb049f7	cleanup: passing pointers to larger type Log: internal/rbd/rbd_attach.go:424:2: hugeParam: dArgs is heavy (88 bytes); consider passing it by pointer (gocritic) Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-09-16 13:55:15 +00:00
Shyamsundar Ranganathan	47dc9cf28d	rbd: Report errors when a resync maybe in progress Currently we return a !ready status if an image is not found when a replication resync is issued. We also return a !ready just post issuing a resync. The change is to ensure we return errors in these cases for the caller to retry the operation till we can determine we are actually resyncing, and then return !ready with nil errors. Part of addressing: https://github.com/csi-addons/volume-replication-operator/issues/101 Signed-off-by: Shyamsundar Ranganathan <srangana@redhat.com>	2021-09-15 15:59:22 +00:00
Rakshith R	82d09d81cf	util: modify GetMonsAndClusterID() to take clusterID instead of options This commit: - modifies GetMonsAndClusterID() to take clusterID instead of options. - moves out validation of clusterID is set or not out of GetMonsAndClusterID(). - defines ErrClusterIDNotSet new error for reusability. - add GetClusterID() to obtain clusterID from options. Signed-off-by: Rakshith R <rar@redhat.com>	2021-09-14 08:39:57 +00:00
Rakshith R	9d1e98ca60	rbd: check for clusterid mapping in genVolFromVolumeOptions() This commit adds capability to genVolFromVolumeOptions() to fetch mapped clusted-id & mon ips for mirrored PVC on secondary cluster which may have different cluster-id. This is required for NodeStageVolume(). We also don't need to check for mapping during volume create requests, so it can be disabled by passing a bool checkClusterIDMapping as false. GetMonsAndClusterID() is modified to accept bool checkClusterIDMapping based on which clustermapping is checked to fetch mapped cluster-id and mon-ips. Signed-off-by: Rakshith R <rar@redhat.com>	2021-09-14 08:39:57 +00:00
Rakshith R	0a7a7f4866	util: call WriteCephConfig() in cephcsi.go This commit calls WriteCephConfig() in cephcsi.go to create ceph.conf and keyring if it is not mounted to be used by all cli calls and conn cmds. Before this change, rbd-controller/omap-generator did not create ceph.conf on startup. Signed-off-by: Rakshith R <rar@redhat.com>	2021-09-08 16:05:27 +00:00
Prasanna Kumar Kalever	9e55f015de	rbd: avoid supplying map options on unmap Thanks to the random unmap failure on my local machine: I0901 17:08:37.841890 2617035 cephcmds.go:55] ID: 11 Req-ID: 0001-0024-fed5480a-f00f-417a-a51d-31d8a8144c03-0000000000000003-024983f3-0b47-11ec-8fcb-e671f0b9f58e an error (exit status 22) occurred while running rbd args: [unmap rbd-pool/csi-vol-024983f3-0b47-11ec-8fcb-e671f0b9f58e --device-type nbd --options try-netlink --options reattach-timeout=300 --options io-timeout=0] Noticed the map args are also getting passed to/as unmap args, which is not correct. We have separate things for mapOptions and unmapOptions. This PR makes sure that the map args are not passed at the time of unmap. Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-09-06 15:59:30 +00:00
Humble Chirammal	3f31ca8a3a	cleanup: introduce populateVolOptions(), to fill rbdVol from stage req At present the nodeStageVolume() handle many logic of filling rbdvol struct based on the request received and this method is complex to follow. with this patch, filling or populating volOptions has been segregrated and handled hence make the stage functions' job easy. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-09-06 07:49:03 +00:00
Humble Chirammal	f0b8a3f626	rbd: use String() method of MirrorImageState in return error MirrorImageState (type C.rbd_mirror_image_state_t) has a string method which can be used while returning error in the replication controller. Previously, we were using int return in the error which is not the proper usage. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-09-03 16:02:53 +00:00
Humble Chirammal	1d94c12cd6	cleanup: add checkErrAndUndoReserve() for error check,unreserve omap all the error check scenarios of genVolFromVolID() and unreserving omap entries based on the error made deleteVolume method complex, this patch create a new function which handle the error check and unrerving omap entries accordingly and finally return the response to deletevolume/caller. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-09-03 12:20:04 +00:00
Niels de Vos	60c2afbcca	util: NewK8sClient() should not panic on non-Kubernetes clusters When NewK8sClient() detects and error, it used to call FatalLogMsg() which causes a panic. There are additional features that can be used on Kubernetes clusters, but these are not a requirement for most functionalities of the driver. Instead of causing a panic, returning an error should suffice. This allows using the driver on non-Kubernetes clusters again. Fixes: #2452 Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-09-02 11:22:14 +00:00
Rakshith R	99168dc822	rbd: check for clusterid mapping in RegenerateJournal() This commit adds fetchMappedClusterIDAndMons() which returns monitors and clusterID info after checking cluster mapping info. This is required for regenerating omap entries in mirrored cluster with different clusterID. Signed-off-by: Rakshith R <rar@redhat.com>	2021-08-31 14:30:06 +00:00
Rakshith R	496bcba85c	rbd: move GetMappedID() to util package This commit moves getMappedID() from rbd to util package since it is not rbd specific and exports it from there. Signed-off-by: Rakshith R <rar@redhat.com>	2021-08-31 14:30:06 +00:00
Niels de Vos	4a3b1181ce	cleanup: move KMS functionality into its own package A new "internal/kms" package is introduced, it holds the API that can be consumed by the RBD components. The KMS providers are currently in the same package as the API. With later follow-up changes the providers will be placed in their own sub-package. Because of the name of the package "kms", the types, functions and structs inside the package should not be prefixed with KMS anymore: internal/kms/kms.go:213:6: type name will be used as kms.KMSInitializerArgs by other packages, and that stutters; consider calling this InitializerArgs (golint) Updates: #852 Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-08-30 16:31:40 +00:00
Niels de Vos	778b5e86de	cleanup: move k8s functions to the util/k8s package By placing the NewK8sClient() function in its own package, the KMS API can be split from the "internal/util" package. Some of the KMS providers use the NewK8sClient() function, and this causes circular dependencies between "internal/utils" -> "internal/kms" -> "internal/utils", which are not alowed in Go. Updates: #852 Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-08-30 16:31:40 +00:00
Humble Chirammal	8ea495ab81	rbd: skip volumeattachment processing if pv marked for deletion if the volumeattachment has been fetched but marked for deletion the nbd healer dont want to process further on this pv. This patch adds a check for pv is marked for deletion and if so, make the healer skip processing the same Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-08-26 15:04:19 +00:00
Niels de Vos	6d00b39886	cleanup: move log functions to new internal/util/log package Moving the log functions into its own internal/util/log package makes it possible to split out the humongous internal/util packages in further smaller pieces. This reduces the inter-dependencies between utility functions and components, preventing circular dependencies which are not allowed in Go. Updates: #852 Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-08-26 09:34:05 +00:00
Prasanna Kumar Kalever	4f40213d8e	rbd: fix rbd-nbd io-timeout to never abort With the tests at CI, it kind of looks like that the IO is timing out after 30 seconds (default with rbd-nbd). Since we have tweaked reattach-timeout to 300 seconds at ceph-csi, we need to explicitly set io-timeout on the device too, as it doesn't make any sense to keep io-timeout < reattach-timeout Hence we set io-timeout for rbd nbd to 0. Specifying io-timeout 0 tells the nbd driver to not abort the request and instead see if it can be restarted on another socket. Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com> Suggested-by: Ilya Dryomov <idryomov@redhat.com>	2021-08-24 17:09:09 +00:00
Prasanna Kumar Kalever	3bf17ade7a	doc: update code comments about available timeout options Adding some code comments to make them readable and easy to understand. Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-08-24 17:09:09 +00:00
Prasanna Kumar Kalever	ea3def0db2	rbd: remove per volume rbd-nbd logfiles on detach - Update the meta stash with logDir details - Use the same to remove logfile on unstage/unmap to be space efficient Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-08-24 07:15:30 +00:00
Prasanna Kumar Kalever	d67e88ccd0	cleanup: embed args into struct and pass it to detachRBDImageOrDeviceSpec Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-08-24 07:15:30 +00:00

1 2 3 4 5 ...

358 Commits