ceph-csi

mirror of https://github.com/ceph/ceph-csi.git synced 2025-06-01 03:26:40 +00:00

Author	SHA1	Message	Date
Madhu Rajanna	810e285c50	rbd: reset dummy image id The dummy image rbdVolume struct is derived from the actual rbdVolume of the volumeID sent in the EnableVolumeReplication request, so the dummy rbdVolume struct contains the image id of the actual volume. Because of that, when we repair the dummy image, the image is sent to the trash but not deleted due to the wrong image ID. Resetting the image id makes sure the image id is fetched from the ceph cluster and the same image id is used for manager operations. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-21 17:39:07 +00:00
Humble Chirammal	88911eb4e9	rbd: add migration secret support to controllerserver functions This commit adds the migration secret request validation to expand, create controller functions. Ref # https://github.com/ceph/ceph-csi/issues/2509 Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-12-20 07:34:43 +00:00
Niels de Vos	30333378ef	cleanup: add IsBlockMultiNode() helper IsBlockMultiNode() is a new helper that takes a slice of VolumeCapability objects and checks if it includes multi-node access and/or block-mode support. This can then easily be used in other services that need checking for these particular capabilities, and preventing multi-node block-mode access. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-12-17 07:31:55 +00:00
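A capability check like the one described in 30333378ef could look roughly like the sketch below. It is not the ceph-csi code: `volumeCapability`, `accessMode` and `isBlockMultiNode` are simplified, hypothetical stand-ins for the CSI spec types, and treating only the multi-node multi-writer mode as "multi-node" is an assumption made for illustration.

```go
package main

import "fmt"

// accessMode is a hypothetical stand-in for the CSI access mode enum.
type accessMode int

const (
	singleNodeWriter accessMode = iota
	multiNodeReaderOnly
	multiNodeMultiWriter
)

// volumeCapability is a simplified, hypothetical stand-in for the CSI
// VolumeCapability message: block reports raw block-mode access.
type volumeCapability struct {
	block bool
	mode  accessMode
}

// isBlockMultiNode reports whether any capability requests block mode and
// whether any requests multi-node access. Assumption: only
// multiNodeMultiWriter counts as multi-node access here.
func isBlockMultiNode(caps []volumeCapability) (isBlock, isMultiNode bool) {
	for _, c := range caps {
		if c.block {
			isBlock = true
		}
		if c.mode == multiNodeMultiWriter {
			isMultiNode = true
		}
	}
	return isBlock, isMultiNode
}

func main() {
	caps := []volumeCapability{
		{block: true, mode: singleNodeWriter},
		{block: false, mode: multiNodeMultiWriter},
	}
	isBlock, isMultiNode := isBlockMultiNode(caps)
	fmt.Println(isBlock, isMultiNode)
}
```

A caller that wants to prevent multi-node block-mode access would reject the request when both return values are true.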
Madhu Rajanna	50d6ea825c	rbd: remove retrieving volumeHandle from PV annotation we have added clusterID mapping to identify the volumes in case of a failover in Disaster recovery in #1946. with #2314 we are moving to a configuration in configmap for clusterID and poolID mapping. and with #2314 we have all the required information to identify the image mappings. This commit removes the workaround implementation done in #1946. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-17 03:38:29 +00:00
Niels de Vos	203920d8f4	rbd: move driver component into the rbd/driver package The rbd package contains several functions that can be used by CSI-Addons Service implementations. Unfortunately it is not possible to do this, as the rbd-driver needs to import the csi-addons/rbd package to provide the CSI-Addons server. This causes a circular import when services use the rbd package: - rbd/driver.go import csi-addons/rbd - csi-addons/rbd import rbd (including the driver) By moving rbd/driver.go into its own package, the circular import can be prevented. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-12-10 07:35:26 +00:00
Niels de Vos	44d69502bc	rbd: export HexStringToInteger() HexStringToInteger() used to return a uint64, but everywhere else uint is used. Having HexStringToInteger() return a uint as well makes it a little easier to use when setting it with SetGlobalInt(). Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-12-10 07:35:26 +00:00
Niels de Vos	8b531f337e	rbd: add functions for initializing global variables When the rbd-driver starts, it initializes some global (yuck!) variables in the rbd package. Because the rbd-driver is moved out into its own package, these variables can not easily be set anymore. Introduce SetGlobalInt(), SetGlobalBool() and InitJournals() so that the rbd-driver can configure the rbd package. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-12-10 07:35:26 +00:00
Niels de Vos	3eeac3d36c	rbd: export RunVolumeHealer() so that rbd/driver can start it The rbd-driver calls rbd.runVolumeHealer() which is not available outside the rbd package. By moving the rbd-driver into its own package, RunVolumeHealer() needs to be exported. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-12-10 07:35:26 +00:00
Niels de Vos	5baf9811f9	rbd: export NodeServer.mounter outside of the rbd package NodeServer.mounter is internal to the NodeServer type, but it needs to be initialized by the rbd-driver. The rbd-driver is moved to its own package, so .Mounter needs to be available from there in order to set it. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-12-10 07:35:26 +00:00
Niels de Vos	8d09134125	rbd: export GenVolFromVolID() for consumption by csi-addons genVolFromVolID() is used by the CSI Controller service to create an rbdVolume object from a CSI volume_id. This function is useful for CSI-Addons Services as well, so rename it to GenVolFromVolID(). Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-12-10 07:35:26 +00:00
Madhu Rajanna	8081ac8251	rbd: add new image features for dummy image The dummy image will be created with 1MiB size. During the snapshot transfer operation the 1MiB will be transferred even if the dummy image does not contain any data. Adding the new image features `fast-diff,layering,obj-map,exclusive-lock` on the dummy image will ensure that only the diff is transferred to the remote cluster. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-07 17:34:14 +00:00
Madhu Rajanna	9a4533e549	rbd: create 1MiB size dummy image We added a workaround for rbd scheduling by creating a dummy image in #2656. With that fix we create a dummy image of the size of the first actual rbd image sent in the EnableVolumeReplication request; if the actual rbd image size is 1TiB, we create a dummy image of 1TiB, which is not good. Even though it is a thin-provisioned rbd image, this causes issues for the transfer of the snapshot during the mirroring operation. This commit recreates the rbd image with 1MiB size, which is the smallest supported size in rbd. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-07 17:34:14 +00:00
Konstantin Shalygin	7411773f73	rbd: added RBD features support for krbd Added support for `object-map, fast-diff` Signed-off-by: Konstantin Shalygin <k0ste@k0ste.ru>	2021-12-07 07:38:24 +00:00
Madhu Rajanna	64ce5e0949	rbd: check local image state during promote operation rbd mirroring CLI calls are async and do not wait for the operation to be completed. ex:- `rbd mirror image enable` will enable the mirroring on the image but it doesn't ensure that the image is mirroring enabled and a healthy primary. The same goes for promoting a volume. This commit adds a check in PromoteVolume to make sure the image is in a healthy state i.e. `up+stopped`. note:- not considering any intermediate states to make sure the image is completely healthy before responding success to the RPC call. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-12-01 20:19:05 +00:00
Prasanna Kumar Kalever	e7d8834149	rbd: enable journal based mirroring Journal-based RADOS block device mirroring ensures point-in-time consistent replicas of all changes to an image, including reads and writes, block device resizing, snapshots, clones, and flattening. Journaling-based mirroring records all modifications to an image in the order in which they occur. This ensures that a crash-consistent mirror of an image is available. When configured in journal mode, mirroring will utilize the RBD journaling image feature to replicate the image contents. If the RBD journaling image feature is not yet enabled on the image, it will be automatically enabled. Fixes: #2018 Co-authored-by: Madhu Rajanna <madhupr007@gmail.com> Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-12-01 14:12:30 +00:00
Niels de Vos	ab76459e87	rbd: implement CSI-Addons Identity Service Depending on the way Ceph-CSI is deployed, the capabilities will be configured for the GetCapabilities procedure. The other procedures are more straight-forward. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-12-01 06:31:09 +00:00
Niels de Vos	20727bd41a	cleanup: reduce complexity of rbd.Driver.Run() After adding the new CSI-Addons Server, golang-ci complains that driver.Run() is too complex. By moving the profiling checks and starting of the go-routines in their own function, golang-ci is happy again. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-11-30 11:48:40 +00:00
Niels de Vos	b3910f2b4a	rbd: enable CSI-Addons Server and Identity Service Add a new endpoint for the CSI-Addons Service and enable the Identity Service for the RBD plugin. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-11-30 11:48:40 +00:00
Madhu Rajanna	f0b2ea6a6d	rbd: repair imageid after resync During resync operation the local image will get deleted and a new image is recreated by the rbd mirroring. The new image will have a new imageID. Once resync is completed update the imageID in the OMAP to get the image removed from the trash during DeleteVolume. Before resyncing ``` sh-4.4# rbd info replicapool/csi-vol-0c25bdd3-485f-11ec-bd30-0242ac110004 rbd image 'csi-vol-0c25bdd3-485f-11ec-bd30-0242ac110004': size 1 GiB in 256 objects order 22 (4 MiB objects) snapshot_count: 1 id: 1efcc6b7a769 block_name_prefix: rbd_data.1efcc6b7a769 format: 2 features: layering op_features: flags: create_timestamp: Thu Nov 18 11:02:40 2021 access_timestamp: Thu Nov 18 11:02:40 2021 modify_timestamp: Thu Nov 18 11:02:40 2021 mirroring state: enabled mirroring mode: snapshot mirroring global id: 9c4c236d-8a47-4779-b4f6-94e05da70dbd mirroring primary: true ``` ``` sh-4.4# rados listomapvals csi.volume.0c25bdd3-485f-11ec-bd30-0242ac110004 --pool=replicapool csi.imageid value (12 bytes) : 00000000 31 65 66 63 63 36 62 37 61 37 36 39 \|1efcc6b7a769\| 0000000c csi.imagename value (44 bytes) : 00000000 63 73 69 2d 76 6f 6c 2d 30 63 32 35 62 64 64 33 \|csi-vol-0c25bdd3\| 00000010 2d 34 38 35 66 2d 31 31 65 63 2d 62 64 33 30 2d \|-485f-11ec-bd30-\| 00000020 30 32 34 32 61 63 31 31 30 30 30 34 \|0242ac110004\| 0000002c csi.volname value (40 bytes) : 00000000 70 76 63 2d 32 36 38 39 33 66 30 38 2d 66 66 32 \|pvc-26893f08-ff2\| 00000010 62 2d 34 61 30 66 2d 61 35 63 33 2d 38 38 34 62 \|b-4a0f-a5c3-884b\| 00000020 37 32 30 66 66 62 32 63 \|720ffb2c\| 00000028 csi.volume.owner value (7 bytes) : 00000000 64 65 66 61 75 6c 74 \|default\| 00000007 ``` After Resyncing ``` sh-4.4# rbd info replicapool/csi-vol-0c25bdd3-485f-11ec-bd30-0242ac110004 rbd image 'csi-vol-0c25bdd3-485f-11ec-bd30-0242ac110004': size 1 GiB in 256 objects order 22 (4 MiB objects) snapshot_count: 1 id: 10b183a48a97 block_name_prefix: rbd_data.10b183a48a97 format: 2 
features: layering, non-primary op_features: flags: create_timestamp: Thu Nov 18 11:09:39 2021 access_timestamp: Thu Nov 18 11:09:39 2021 modify_timestamp: Thu Nov 18 11:09:39 2021 mirroring state: enabled mirroring mode: snapshot mirroring global id: 9c4c236d-8a47-4779-b4f6-94e05da70dbd mirroring primary: false sh-4.4# rados listomapvals csi.volume.0c25bdd3-485f-11ec-bd30-0242ac110004 --pool=replicapool csi.imageid value (12 bytes) : 00000000 31 30 62 31 38 33 61 34 38 61 39 37 \|10b183a48a97\| 0000000c csi.imagename value (44 bytes) : 00000000 63 73 69 2d 76 6f 6c 2d 30 63 32 35 62 64 64 33 \|csi-vol-0c25bdd3\| 00000010 2d 34 38 35 66 2d 31 31 65 63 2d 62 64 33 30 2d \|-485f-11ec-bd30-\| 00000020 30 32 34 32 61 63 31 31 30 30 30 34 \|0242ac110004\| 0000002c csi.volname value (40 bytes) : 00000000 70 76 63 2d 32 36 38 39 33 66 30 38 2d 66 66 32 \|pvc-26893f08-ff2\| 00000010 62 2d 34 61 30 66 2d 61 35 63 33 2d 38 38 34 62 \|b-4a0f-a5c3-884b\| 00000020 37 32 30 66 66 62 32 63 \|720ffb2c\| 00000028 csi.volume.owner value (7 bytes) : 00000000 64 65 66 61 75 6c 74 \|default\| 00000007 ``` Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-11-25 09:22:13 +00:00
Madhu Rajanna	027b68ab39	rbd: operate on dummy image after adding scheduling Currently we first operate on the dummy image to refresh the pool and then add the scheduling. We think the scheduling should be added first and then we should refresh the pool. If we do this, all the existing schedules will be considered by the scheduler. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-11-23 11:04:42 +00:00
Madhu Rajanna	211ca9b5a7	rbd: do deep copy for dummyVol struct with shallow copy of rbdVol to dummyVol the image name update of the dummyVol is getting reflected on the rbdVol which we dont want. do deep copy to avoid this problem. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-11-23 11:04:42 +00:00
Prasanna Kumar Kalever	bdcf3273b5	rbd: provide a way to supply mounter specific mapOptions from sc Uses the below schema to supply mounter specific map/unmapOptions to the nodeplugin based on the discussion we all had at https://github.com/ceph/ceph-csi/pull/2636 This should specifically be really helpful with the `tryOtherMounters` set to true, i.e with fallback mechanism settings turned ON. mapOption: "krbd:v1,v2,v3;nbd:v1,v2,v3" - By omitting `krbd:` or `nbd:`, the option(s) apply to rbdDefaultMounter which is krbd. - A user can _override_ the options for a mounter by specifying `krbd:` or `nbd:`. mapOption: "v1,v2,v3;nbd:v1,v2,v3" is effectively the same as the 1st example. - Sections are split by `;`. - If users want to specify common options for both `krbd` and `nbd`, they should mention them twice. Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-11-23 08:54:37 +00:00
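The mapOption schema described in bdcf3273b5 can be sketched roughly as follows. This is a minimal illustration rather than the ceph-csi implementation: `parseMapOptions` and the returned map are hypothetical, and the only rules assumed are the ones stated in the commit message (sections are split by `;`, and a section without a `krbd:` or `nbd:` prefix applies to the default krbd mounter).

```go
package main

import (
	"fmt"
	"strings"
)

// parseMapOptions is a hypothetical helper, not the ceph-csi function.
// It splits a mapOption string into per-mounter option strings keyed by
// "krbd" and "nbd".
func parseMapOptions(mapOption string) map[string]string {
	opts := map[string]string{"krbd": "", "nbd": ""}
	for _, section := range strings.Split(mapOption, ";") {
		section = strings.TrimSpace(section)
		switch {
		case section == "":
			// Skip empty sections, e.g. from an empty input or a trailing ';'.
			continue
		case strings.HasPrefix(section, "krbd:"):
			opts["krbd"] = strings.TrimPrefix(section, "krbd:")
		case strings.HasPrefix(section, "nbd:"):
			opts["nbd"] = strings.TrimPrefix(section, "nbd:")
		default:
			// No mounter prefix: the options apply to the default mounter, krbd.
			opts["krbd"] = section
		}
	}
	return opts
}

func main() {
	// Both forms from the commit message yield the same per-mounter options.
	fmt.Println(parseMapOptions("krbd:v1,v2,v3;nbd:v1,v2,v3"))
	fmt.Println(parseMapOptions("v1,v2,v3;nbd:v1,v2,v3"))
}
```

Matching on the literal `krbd:` and `nbd:` prefixes, rather than splitting on the first `:`, keeps option values that themselves contain a colon intact.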
Shyamsundar Ranganathan	d1c21eece9	rbd: Update sequence of operations on dummy mirror image The dummy mirror image needs to be disabled and then reenabled for mirroring, to ensure a newly promoted primary is now starting to schedule snapshots. Signed-off-by: Shyamsundar Ranganathan <srangana@redhat.com>	2021-11-19 09:38:59 +05:30
Madhu Rajanna	517ad8c644	rbd: use dummy image to workaround rbd scheduling bug Currently we have a bug in the rbd mirror scheduling module. After doing failover and failback the scheduling is not getting updated and the mirroring snapshots are not getting created periodically as per the scheduling interval. This PR works around it with the operations below: * Create a dummy (unique) image per cluster; this image should be easily identified. * During the Promote operation on any image, enable mirroring on the dummy image. When mirroring is enabled on the dummy image, the pool gets updated and the scheduling is reconfigured. * During the Demote operation on any image, disable mirroring on the dummy image. The disable is needed so that mirroring can be enabled again when we get the promote request to make the image primary. * When DR is no longer needed, this image needs to be cleaned up manually for now, as we don't want to add a check in the existing DeleteVolume code path to delete the dummy image, since it impacts the performance of the DeleteVolume workflow. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-11-19 09:38:59 +05:30
Madhu Rajanna	e4e0f397a6	rbd: run schedule during promote operation Moved to add scheduling to the promote operation as scheduling need to be added when the image is promoted and this is the correct method of adding the scheduling to make the scheduling take place. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-11-19 09:38:59 +05:30
Madhu Rajanna	7bbd2ea284	rbd: use small case of error message the error message should not start with the capital letter changing the case as per the standard. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-11-18 10:44:12 +00:00
Madhu Rajanna	51998a5f4a	cleanup: log the image name and pool name instead of logging the volumeID and the pool name. log the poolname and image name for better debugging. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-11-18 10:44:12 +00:00
Niels de Vos	7e22180125	rbd: call undoStagingTransaction() when NodeStageVolume() fails On line 341 a `transaction` is created. This is passed to the deferred `undoStagingTransaction()` function when an error in the `NodeStageVolume` procedure is detected. So far, so good. However, on line 356 a new `transaction` is returned. This new `transaction` is not used for the defer call. By removing the empty `transaction` that is used in the defer call, and calling `undoStagingTransaction()` on an error of `stageTransaction()`, the code is a little simpler, and the cleanup of the transaction should be done correctly now. Updates: #2610 Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-11-17 23:58:00 +00:00
Prasanna Kumar Kalever	e6fa392df1	rbd: fix mapOptions passing with rbd-nbd mounter This was a regression introduced by: https://github.com/ceph/ceph-csi/pull/2556 Fixes: #2610 Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-11-16 10:12:46 +00:00
Prasanna Kumar Kalever	3686b6da8b	rbd: utilize cookie support from rbd for nbd Problem: On remap/attach of device (i.e. nodeplugin restart), there is no way for rbd-nbd to defend if the backend storage is matching with the initial backend storage. Say, if an initial map request for backend "pool1/image1" got mapped to /dev/nbd0 and the userspace process is terminated (on nodeplugin restart). A next remap/attach (nodeplugin start) request within reattach-timeout is allowed to use /dev/nbd0 for a different backend "pool1/image2" For example, an operation like below could be dangerous: $ sudo rbd-nbd map --try-netlink rbd-pool/ext4-image /dev/nbd0 $ sudo blkid /dev/nbd0 /dev/nbd0: UUID="bfc444b4-64b1-418f-8b36-6e0d170cfc04" TYPE="ext4" $ sudo pkill -15 rbd-nbd <-- nodeplugin terminate $ sudo rbd-nbd attach --try-netlink --device /dev/nbd0 rbd-pool/xfs-image /dev/nbd0 $ sudo blkid /dev/nbd0 /dev/nbd0: UUID="d29bf343-6570-4069-a9ea-2fa156ced908" TYPE="xfs" Solution: rbd-nbd/kernel now provides a way to keep some metadata in sysfs to identify between the device and the backend, so that when a remap/attach request is made, rbd-nbd can compare and avoid such dangerous operations. With the provided solution, as part of the initial map request, backend cookie (ceph-csi VOLID) can be stored in the sysfs per device config, so that on a remap/attach request rbd-nbd will check and validate if the backend per device cookie matches with the initial map backend with the help of cookie. At Ceph-csi we use VOLID as device cookie, which will be unique, we pass the VOLID as cookie at map and use the same at the time of attach, that way rbd-nbd can identify backends and their matching devices. Requires: https://github.com/ceph/ceph/pull/41323 https://lkml.org/lkml/2021/4/29/274 Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-11-04 03:20:59 +00:00
Prasanna Kumar Kalever	793b22cf27	rbd: check for nbd cookie support Change checkRbdNbdTools() to setRbdNbdToolFeatures() Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-11-04 03:20:59 +00:00
Prasanna Kumar Kalever	9a3170bf77	rbd: provide a way to disable the auto fallback to nbd mounter This change allows the user to choose not to fallback to NBD mounter when some ImageFeatures are absent with krbd driver, rather just fail the NodeStage call. Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-11-01 08:17:36 +00:00
Prasanna Kumar Kalever	bfc24f6f12	cleanup: generalize the parseBool function Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-11-01 08:17:36 +00:00
Prasanna Kumar Kalever	84ec797dda	rbd: detect krbd features in runtime and fallback to nbd Currently, we recognize and warn for the provided image features based on our prior intelligence at ceph-csi (i.e based on supportedFeatures map and validateImageFeatures) at image/PV creation time. It might be very much possible that the cluster is heterogeneous i.e. the PV creation and application container might both be on different nodes with different kernel versions (krbd driver versions). This PR adds a mechanism to check for the supported krbd features during mount time, if the krbd driver doesn't have the specified image feature then it will fall back to rbd-nbd mounter. Fixes: #478 Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-11-01 08:17:36 +00:00
Humble Chirammal	6aec858cba	rbd: parse migration secret and set fields for nodestage operations this commit make use of the migration request secret parsing and set the required fields for further nodestage operations Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-10-27 18:35:00 +00:00
Humble Chirammal	5621f2cfca	rbd: split the parsing and deletion logic to its own functions. parseAndDeleteMigratedVolume() previously combined parsing of the migration volume handle with deletion of the volume. This commit splits that logic in two: parsing is done in parseMigrationVolID() and DeleteMigratedVolume() deletes the backend volume. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-10-27 18:35:00 +00:00
Humble Chirammal	b49bf4b987	rbd: parse migration secret and set it for controller server operations This commit adds a couple of helper functions to parse the migration request secret and set it for further csi driver operations. More details: The in-tree secret has a data field called "key" which is the base64 admin secret key. The ceph CSI driver currently expects the secret to contain the data field "UserKey" for the equivalent. The CSI driver also expects the "UserID" field, which is not available in the in-tree secret by default. This missing userID will be filled (if the username differs from 'admin') in the migration secret as the 'adminId' field in the migration request. This commit adds the logic to parse this migration secret as below: the "key" field value will be picked up from the migration secret into the "UserKey" field. The "adminId" field value will be picked up from the migration secret into the "UserID" field; if the `adminId` field is nil or not set, the `UserID` field will be filled with the default value, i.e. `admin`. The above logic gets activated only when the secret is a migration secret; otherwise it is skipped in favor of the normal workflow as we have today. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-10-27 18:35:00 +00:00
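The field mapping described in b49bf4b987 might be sketched as below. This is an illustration under stated assumptions, not the ceph-csi helper: `parseMigrationSecret` is a hypothetical name, secrets are modeled as plain `map[string]string`, and returning an error when "key" is missing is an assumption that the commit message does not spell out.

```go
package main

import (
	"errors"
	"fmt"
)

// defaultUserID is the fallback named in the commit message.
const defaultUserID = "admin"

// parseMigrationSecret is a hypothetical helper that maps in-tree secret
// fields onto the fields the CSI driver expects:
//   "key"     -> "UserKey"
//   "adminId" -> "UserID" (defaults to "admin" when unset or empty)
func parseMigrationSecret(in map[string]string) (map[string]string, error) {
	key := in["key"]
	if key == "" {
		// Assumption: a migration secret without "key" is treated as invalid.
		return nil, errors.New(`migration secret is missing the "key" field`)
	}
	out := map[string]string{
		"UserKey": key,
		"UserID":  defaultUserID,
	}
	if id := in["adminId"]; id != "" {
		out["UserID"] = id
	}
	return out, nil
}

func main() {
	secret, err := parseMigrationSecret(map[string]string{"key": "c2VjcmV0"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(secret["UserID"], secret["UserKey"])
}
```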
Niels de Vos	b132696e54	rbd: note that thick-provisioning is deprecated Thick-provisioning was introduced to make accounting of assigned space for volumes easier. When thick-provisioned volumes are the only consumer of the Ceph cluster, this works fine. However, it is unlikely that this is the case. Instead, accounting of the requested (thin-provisioned) size of volumes is much more practical as different types of volumes can be tracked. OpenShift already provides cluster-wide quotas, which can combine accounting of requested volumes by grouping different StorageClasses. In addition to the difficult practise of allowing only thick-provisioned RBD backed volumes, the performance makes thick-provisioning troublesome. As volumes need to be completely allocated, data needs to be written to the volume. This can take a long time, depending on the size of the volume. Provisioning, cloning and snapshotting becomes very much noticeable, and because of the additional time consumption, more prone to failures. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-10-27 06:54:07 +00:00
Madhu Rajanna	0838845c6a	cleanup: remove FIXME from ResyncVolume as the complexity of ResyncVolume is reduced removing the FIXME which is not valid anymore. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-26 12:00:36 +00:00
Madhu Rajanna	2017b8c621	rbd: log mirror daemon state for replication log the mirror daemon state in the local and remote cluster for better debugging. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-26 12:00:36 +00:00
Madhu Rajanna	7472338334	rbd: remove unwanted const for comparing the image states use the states defined in go-ceph and avoid creating duplicate consts in cephcsi. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-26 12:00:36 +00:00
Madhu Rajanna	b92a6f5ccb	rbd: log the remote site details during resync logging the remote site details during resyncing for better debugging. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-26 12:00:36 +00:00
Madhu Rajanna	1fd2f28fee	rbd: check local image state for resyncing below are the local states of the mirrored image "unknown" -> If the image is in an error state means data is completely synced "error" -> If the image is in an error state means it needs resync "syncing" "starting_replay" "replaying" "stopping_replay" "stopped" If the resync is successfully started, the image will be in the "replaying" state. We can treat the "replaying" state as meaning the resync is successfully in progress. We are discarding the intermediate states like "syncing", "starting_replay" and "stopping_replay". Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-26 12:00:36 +00:00
Rakshith R	12cd05a408	rbd: add EnsureImageCleanup to snapshot deletion Signed-off-by: Rakshith R <rar@redhat.com>	2021-10-20 18:25:31 +00:00
Rakshith R	1849076aab	rbd: add EnsureImageCleanup to ensure image cleanup from trash After moving the image to trash, if the `trash remove` step fails, then external-provisioner will issue subsequent requests, in which the image will be absent from the pool (it will be in trash) and omap cleanup will be done with a stale image left in trash with no `trash remove` step on it. To avoid this scenario, list trash images and find the corresponding id for the given image name and add a task to flatten when we encounter an ErrImageNotFound. Fixes: #1728 Signed-off-by: Rakshith R <rar@redhat.com>	2021-10-20 18:25:31 +00:00
Madhu Rajanna	0d51f6d833	rbd: check local image description for split-brain In some corner case like `re-player shutdown` the local image will not be in error state. It would be also worth considering `description` field to make sure about split-brain. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-18 11:22:03 +00:00
Humble Chirammal	c584fa20da	rbd: use clusterID from volumeContext at nodestage Previously we were retrieving clusterID using the monitors field in the volume context at the node stage code path. However, it is possible to retrieve or use clusterID directly from the volume context. This commit also removes the getClusterIDFromMigrationVolume() function, which was used previously, and its tests. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-10-11 10:06:30 +00:00
Humble Chirammal	4e61156dc4	rbd: change iteration variable name in the migration test to be specific we reuse or overload the variable name in the test execution at present. This commit use a different variable name as initialized in each run Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-10-11 10:06:30 +00:00
Madhu Rajanna	90ecd2d7e8	rbd: use go-ceph to get mirroring info use go-ceph api to get image mirroring info. closes #2558 Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-07 08:02:06 +00:00
Madhu Rajanna	8ebc0659ab	rbd: perform resize of file system for static volume For a static volume, the user manually mounts an already existing image as a volume to the application pods. As it is an rbd image, if the PVC is of type fileSystem the image will be mapped, formatted and mounted on the node. If the user resizes the image on the ceph cluster, the filesystem created on the rbd image is not resized automatically. Even if the user deletes and recreates the kubernetes objects, the new size will not be visible on the node. With this change, during the NodeStageVolumeRequest the nodeplugin will check the size of the mapped rbd image on the node using the devicePath, and also the rbd image size on the ceph cluster. If the sizes do not match, it will do the file system resize on the node as part of the NodeStageVolumeRequest RPC call. The user needs to do the operations below to see the new size * Resize the rbd image in the ceph cluster * Scale down all the application pods using the static PVC. * Make sure no application pod using the static PVC is running on a node. * Scale up all the application pods. Validate the new size in the application pod's mounted volume. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-10-06 13:15:00 +00:00


436 Commits