ceph-csi

mirror of https://github.com/ceph/ceph-csi.git synced 2025-05-30 18:46:41 +00:00

Author	SHA1	Message	Date
Madhu Rajanna	4bec4c1818	e2e: pvc mounting when snap and parent pvc is deleted Added an E2E test to test below case * Create PVC * Create Snapshot from PVC * Delete PVC * Create Clone from Snapshot * Delete Snapshot * Mount clone to Application * Delete Application and PVC Clone Signed-off-by: Madhu Rajanna <madhupr007@gmail.com> (cherry picked from commit fa36a4668283fe3bb836c8fb4587ab32c666d898)	2021-05-07 09:25:48 +00:00
Madhu Rajanna	5e9f007ffd	rbd: flatten image if the depth is not zero Flatten the image if the deep-flatten feature is present on the images in the chain or if the depth of the image chain is not zero, as we cannot check the deep-flatten feature on images that are in the trash. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com> (cherry picked from commit 67d73cd6e9c9cede7d89667165fde4020c77a771)	2021-05-07 09:25:48 +00:00
Madhu Rajanna	38bd4e613e	rbd: discard image not found error For flatten we call checkImageChainHasFeature, which internally calls getImageInfo. getImageInfo returns the parent name even if the parent is in the trash; when we try to open the parent image to get its information, it fails because the image is not found. We should treat the error as nil if the parent is not found. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com> (cherry picked from commit e15e2e5081975072fca6a6f4f7116b111eaaf308)	2021-05-07 09:25:48 +00:00
Madhu Rajanna	75fa1927fc	rbd: mark image ready when image state is up+unknown To recover from the split-brain (up+error) state, the image needs to be demoted and requested for resync on site-a, and then the image on site-b should get demoted. The volume should be marked ready=true when the image state on both clusters is up+unknown, because during the last snapshot sync the data gets copied first and then the image state on site-a changes to up+unknown. If the image state on both sites is up+unknown, consider the complete data synced, as the last snapshot gets exchanged between the clusters. * Create a 10 GB file and validate the data after resync * Do Failover when site-a goes down * Force promote the image and write data in GiB * Once site-a comes back, demote the image and issue resync * Demote the image on site-b * The status will get reflected on the other site when the last snapshot sync happens * The image will go to the up+unknown state, and the complete data will be copied to site-a * Promote the image on site-a and use it ```bash csi-vol-5633715e-a7eb-11eb-bebb-0242ac110006: global_id: e7f9ec55-06ab-46cb-a1ae-784be75ed96d state: up+unknown description: remote image demoted service: a on minicluster1 last_update: 2021-04-28 07:11:56 peer_sites: name: e47e29f4-96e8-44ed-b6c6-edf15c5a91d6-rook-ceph state: up+unknown description: remote image demoted last_update: 2021-04-28 07:11:41 ``` * Do Failover when site-a goes down * Force promote the image on site-b and write data in GiB * Demote the image on site-b * Once site-a comes back, demote the image on site-a * The images on both sites will go to the split-brain state ```bash csi-vol-37effcb5-a7f1-11eb-bebb-0242ac110006: global_id: 115c3df9-3d4f-4c04-93a7-531b82155ddf state: up+error description: split-brain service: a on minicluster2 last_update: 2021-04-28 07:25:41 peer_sites: name: abbda0f0-0117-4425-8cb2-deb4c853da47-rook-ceph state: up+error description: split-brain last_update: 2021-04-28 07:25:26 ``` * Issue resync * The images cannot be resynced because when we issue resync on site-a the image on site-b was in the demoted state * To recover from this state, promote and then demote the image on site-b after some time ```bash csi-vol-37effcb5-a7f1-11eb-bebb-0242ac110006: global_id: 115c3df9-3d4f-4c04-93a7-531b82155ddf state: up+unknown description: remote image demoted service: a on minicluster1 last_update: 2021-04-28 07:32:56 peer_sites: name: e47e29f4-96e8-44ed-b6c6-edf15c5a91d6-rook-ceph state: up+unknown description: remote image demoted last_update: 2021-04-28 07:32:41 ``` * Once the data is copied we can see that the image state has moved to up+unknown on both sites * Promote the image on site-a and use it Signed-off-by: Madhu Rajanna <madhupr007@gmail.com> (cherry picked from commit 07a916b84d8cb4e9b2f4e0399f62692d985e7a89)	2021-05-05 15:07:18 +00:00
Madhu Rajanna	1c59f0683e	rbd: delete encryption key from KMS When a Snapshot is encrypted during a CreateSnapshot operation, the encryption key gets created in the KMS. When we delete the Snapshot, the key should also get deleted from the KMS. When we create a volume from a snapshot we copy the required information, but we missed copying the encryption information. This commit adds the missing information needed to delete the encryption key. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com> (cherry picked from commit c3bae17fcee0413dfb0a15a3524f0e7e0fc8d5d9)	2021-04-30 09:37:23 +00:00
Madhu Rajanna	f547f76315	revert: deploy: update templates for v3.3.1 This reverts commit a07260f19153cb6fef7cb27bfb9135082630830e. which had template changes for v3.3.1 release. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-04-22 17:09:38 +05:30
Madhu Rajanna	a07260f191	deploy: update templates for v3.3.1 updated required templates for v3.3.1 release. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com> v3.3.1	2021-04-22 13:35:31 +05:30
Humble Chirammal	1367cb445f	rbd: return crypt error for the rpc return At present we return the volume connect error if the clone from snapshot fails when the rbdVolume is encrypted, which is incorrect. This patch correctly returns the failed copy encryption error to the caller. Signed-off-by: Humble Chirammal <hchiramm@redhat.com> (cherry picked from commit 798437d0c4cc970187f65ec110684b597f98cf49)	2021-04-22 12:55:50 +05:30
Madhu Rajanna	76fb7f6441	build: remove helm init from deploy.sh From helm v3.x onwards there is no helm init command. Remove the helm init call, which was causing a helm chart pushing issue in the release and devel branches. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com> (cherry picked from commit 6508726276748c5618ee1972bf14f018135be39c)	2021-04-22 12:32:05 +05:30
Madhu Rajanna	969d3796fa	build: install helm version from build.env Install the helm package based on the version specified in the build.env Signed-off-by: Madhu Rajanna <madhupr007@gmail.com> (cherry picked from commit aa77b677a32dfc5bc4a2048da10f741409b2dcb5)	2021-04-22 12:32:05 +05:30
Madhu Rajanna	599f3fd8e4	rbd: modified logic to check image watchers Before the RBD map operation, we check the watchers on the RBD image. In the case of an RWO volume, cephcsi makes sure only one client is using the RBD image. If the RBD image is mirrored, the mirroring daemon will by default add a watcher on the image, and as we are using go-ceph, a watcher will also be added because we have opened the image. So we will have two watchers on an image if mirroring is enabled. This holds when the rbd mirror daemon is running. If the mirror daemon is not running, there will be only one watcher on the RBD image (which is placed by the go-ceph image open). We should not block the map operation if the mirroring daemon is not running, as it is async mirroring. This commit adds a check to make sure there are no more than 2 watchers if the image is mirrored, or no more than 1 watcher if it is not a mirrored image. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com> (cherry picked from commit 52290333e61bf5e659ab8ff22c412366b42db126)	2021-04-20 11:54:30 +05:30
Madhu Rajanna	c0533d1b17	revert: update templates for v3.3.0 release This commit reverts back the changes done for v3.3.0 release. With this change a release canary tagged image and helm charts will get pushed. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-04-16 14:37:11 +05:30
Madhu Rajanna	8122750c58	build: update required files for release-v3.3 updated the required templates and upgrade document for release 3.3 Signed-off-by: Madhu Rajanna <madhupr007@gmail.com> v3.3.0	2021-04-15 19:06:49 +05:30
Madhu Rajanna	eea52847bc	rbd: check volumeID in PV if image not found If the pool or a few keys are missing in the omap, the GetImageAttributes function returns a nil error message and a few empty items in the imageAttributes struct. If the image is not found and the entries are missing, use the volumeID present in the PV annotation for further operations. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-04-15 17:13:06 +05:30
Madhu Rajanna	cfc88c9910	rbd: discard up+unknown state in ResyncVolume In case the image is promoted and demoted, the image state will be set to up+unknown if the image on the remote cluster is still in the demoted state. When the user changes the state from primary to secondary and the image is still in the demoted (secondary) state in the remote cluster, the image state on both clusters will be unknown. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-04-15 17:13:06 +05:30
Rakshith R	31634ede3d	cleanup: update mergify.yml to use merge_bot_account option New version of mergifyio requires the use of `merge_bot_account` instead of the `bot_account` configuration option. Signed-off-by: Rakshith R <rar@redhat.com>	2021-04-15 12:00:45 +05:30
Rakshith R	3795704340	ci: update feature gates setting from minikube.sh BlockVolume, CSIBlockVolume(GA since k8s v1.18) & VolumeSnapshotDataSource (GA since k8s v1.20) default to true and don't need to be set to true in feature gates setting. Signed-off-by: Rakshith R <rar@redhat.com>	2021-04-15 05:27:16 +00:00
Niels de Vos	8b8480017b	logging: report issues in rbdImage.DEKStore API with stacks It helps to get a stack trace when debugging issues. Certain things are considered bugs in the code (like missing attributes in a struct), and might cause a panic in certain occasions. In this case, a missing string will not panic, but the behaviour will also not be correct (DEKs getting encrypted, but unable to decrypt). Clearly logging this as a BUG is probably better than calling panic(). Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	35d58a7d5a	e2e: only test a single encrypted clone/snapshot The default number for cloning and snapshot/restore is 10 volumes. This adds to the time the test suite runs. There is no need to validate 10 copies of the encrypted volume, a single copy is sufficient. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	bb296c4f21	e2e: add verification for encrypted Snapshot/Restore operations This moves validatePVCSnapshot() into its own function, so that it follows the same format as validatePVCClone() does already. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	3fde636685	e2e: add validation for cloning encrypted volumes Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	b1d05a1840	rbd: repair encryption config in case it is missing It is possible that when a provisioner restarts after a snapshot was cloned, but before the newly restored image had its encryption metadata set, the new image is not marked as encrypted. This will prevent attaching/mounting the image, as the encryption key will not be fetched, or is not available in the DEKStore. By actively repairing the encryption configuration when needed, this problem should be addressed. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	1482105309	cleanup: use buildCreateVolumeResponse() to simplify CreateVolume() buildCreateVolumeResponse() exists exactly for the need to create a csi.CreateVolumeResponse based on an rbdVolume. Calling this helper reduces the code duplication in CreateVolume(). Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	52433841b4	cleanup: move copyEncryptionConfig() from CreateVolume to Exists() The rbdVolume that needs its encryption configured is constructed in the Exists() method. It is suitable to move the copyEncryptionConfig() call there as well, so that the object is completely constructed in a single place. Golang-ci:gocyclo complained about the increased complexity of the Exists() function. Moving the repairing of the ImageID into its own helper function makes the code a little easier to understand. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	596410ae60	cleanup: address "nolint" comments for RBD CreateSnapshot Introduce helper function cloneFromSnapshot() that takes care of the procedures that are needed when an existing snapshot has been found. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	b5d0524c39	cleanup: release resources for rbdImages objects after use Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	dc990037a5	rbd: move setupEncryption() from buildCreateVolumeResponse to CreateVolume Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	bea9d56117	rbd: copyEncryptionConfig in doSnapshotClone() Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	fd5f4dbafd	rbd: configureEncryption() in genSnapFromSnapID() Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	6fd3f57f40	rbd: set kmsID in reserveSnap() Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	0a046c5b6d	rbd: copy encryption configuration in CreateSnapshot Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	6b1285d38b	rbd: copy passphrase for encrypted clones When a source volume is encrypted, the passphrase needs to be copied and stored for the newly cloned volume. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	7b332a0184	rbd: add rbdImage.copyEncryptionConfig() to copy encryption metadata Cloning volumes requires copying the DEK from the source to the newly cloned volume. Introduce copyEncryptionConfig() as a helper for that. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	7e6feecc25	util: add VolumeEncryption.StoreCryptoPassphrase() The new StoreCryptoPassphrase() method makes it possible to store an unencrypted passphrase newly encrypted in the DEKStore. Cloning volumes will use this, as the passphrase from the original volume will need to get copied as part of the metadata for the volume. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	b6aa19eea5	rbd: pass secrets when creating an source rbdVolume for cloning Without this, the rbdVolume can not connect to the Ceph cluster and configure the (optional) encryption. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	92b2e08adf	rbd: improve logging in deleteImage() Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	99da92cfd7	rbd: move deletion of DEK to deleteImage() The ControllerServer should not need to care about support for encryption, ideally it is transparently handled by the rbdVolume type and its internal API. Deleting the DEK was one of the last remainders that was explicitly done inside the ControllerServer. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	151d066938	util: add logging when OpenEncryptedVolume() encounters an error Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	bd1388fb96	util: log available configs when KMS not found When the KMS configuration can not be found, it is useful to know what configurations are available. This aids troubleshooting when typos in the KMS ID are made. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Niels de Vos	a7c261a394	logging: correct formatting when reporting error in createVolumeFromSnapshot() Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-04-14 03:59:28 +00:00
Rakshith R	ae6a52a84e	util: add nil check to default ControllerGetCapabilities() Currently default ControllerGetCapabilities function is being used which throws 'runtime error: invalid memory address or nil pointer dereference' when `--controllerServer=true` is not set in provisioner deployment args. This commit adds a check to prevent it. Fixes: 1925 Signed-off-by: Rakshith R <rar@redhat.com>	2021-04-09 10:12:48 +00:00
Rakshith R	10d539efc8	cleanup: correct nolint directive listing format The nolint directive needs to be followed by a comma-separated list of linters. This commit replaces gocognit:gocyclo, which was not recognised, with the linters that report errors for the function. Signed-off-by: Rakshith R <rar@redhat.com>	2021-04-09 07:24:47 +00:00
Rakshith R	3f3489367c	cleanup: correct linter name mnd to gomnd Signed-off-by: Rakshith R <rar@redhat.com>	2021-04-09 07:24:47 +00:00
Rakshith R	fb7389f478	cephfs: add stderr to mount function errors This commit appends stderr to error in both kernel and ceph-fuse mounter functions to better be able to debug errors. Signed-off-by: Rakshith R <rar@redhat.com>	2021-04-08 12:18:01 +00:00
Yug	62ae17e263	doc: update dev standup timing The dev standup was moved 2 hours earlier some time back. Updating the same in upstream. Signed-off-by: Yug <yuggupta27@gmail.com>	2021-04-07 13:59:10 +00:00
Yug	2f7b733f7e	doc: update command usage Running the command specified `date -d 14:00 UTC` fails with the following error: ```date: the argument ‘UTC’ lacks a leading '+'; when using an option to specify date(s), any non-option argument must be a format string beginning with '+' ``` Add quotes to ensure expected output. Signed-off-by: Yug <yuggupta27@gmail.com>	2021-04-07 13:59:10 +00:00
Yug	f4d9fd0e89	ci: Updated mergify rules for containerized-tests Since github actions cover all the tests covered by the containerized tests, disabling them in upstream to avoid running repetitive tests and properly utilize CI instances. The test will still be available to run locally. Signed-off-by: Yug <yuggupta27@gmail.com>	2021-04-07 18:26:07 +05:30
Madhu Rajanna	e2fa84357a	rbd: take lock when reconciling the PV There is a chance that we reconcile the same PV in parallel, and we can end up generating and deleting multiple omap keys. To be on the safer side, take a lock to process one volumeHandle at a time. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-04-07 11:46:27 +00:00
Madhu Rajanna	0f8813d89f	rbd:store/Read volumeID in/from PV annotation In the case of Async DR, the volumeID will not be the same if the clusterID or the PoolID is different. With the earlier implementation, it is expected that the new volumeID mapping is stored in the rados omap pool. In the case of a ControllerExpand or DeleteVolume request, only the volumeID will be sent, so it is not possible to find the corresponding poolID in the new cluster. With this change, it works as follows: the csi-rbdplugin-controller will watch for PV objects; when any PV object is created, it will check whether the omap already exists. If the omap doesn't exist, it will generate the new volumeID and check for the volumeID mapping entry in the PV annotation; if the mapping does not exist, it will add the new entry to the PV annotation. cephcsi will check the PV annotations if the omap does not exist; if the mapping exists in the PV annotation, it will use the new volumeID for further operations. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-04-07 11:46:27 +00:00
Rakshith R	9aea701bd9	ci: enable nestif linter The nestif linter reports deeply nested if statements. This commit enables the nestif linter and sets min-complexity to 7. Closes: #1229 Signed-off-by: Rakshith R <rar@redhat.com>	2021-04-07 02:31:41 +00:00
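
The watcher rule described in commit 599f3fd8e4 (no more than 2 watchers on a mirrored image, no more than 1 otherwise) could be sketched in Go roughly as below. This is only an illustration of the rule as stated in the commit message, not the ceph-csi code: the names `maxWatchers` and `imageInUse` are hypothetical, and the watcher count is passed in as a plain integer rather than being fetched from the cluster.

```go
package main

import "fmt"

// maxWatchers returns how many watchers are tolerated before an RBD map
// of an RWO volume is considered unsafe. Hypothetical helper: one watcher
// comes from our own image open, and a mirrored image may carry one more
// from the rbd mirror daemon (when that daemon is running).
func maxWatchers(mirrored bool) int {
	if mirrored {
		return 2
	}
	return 1
}

// imageInUse reports whether the observed watcher count exceeds the
// tolerated maximum, meaning another client appears to use the image.
func imageInUse(watchers int, mirrored bool) bool {
	return watchers > maxWatchers(mirrored)
}

func main() {
	// Not mirrored: only our own open is expected.
	fmt.Println(imageInUse(1, false)) // false
	fmt.Println(imageInUse(2, false)) // true
	// Mirrored: our open plus the mirror daemon is fine; the daemon
	// being down (1 watcher) must not block the map either.
	fmt.Println(imageInUse(1, true)) // false
	fmt.Println(imageInUse(2, true)) // false
	fmt.Println(imageInUse(3, true)) // true
}
```

Using an upper bound rather than an exact count is what lets the map proceed when the mirror daemon is not running, which matches the commit's note that async mirroring should not block the operation.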

1981 Commits