ceph-csi

mirror of https://github.com/ceph/ceph-csi.git synced 2025-06-02 11:56:41 +00:00

Author	SHA1	Message	Date
Rakshith R	5189ccc13e	rbd: refactor RegenerateJournal() to take in volumeAttributes This commit refactors RegenerateJournal() to take in volumeAttributes map[string]string as argument so it can extract required attributes internally. Signed-off-by: Rakshith R <rar@redhat.com> (cherry picked from commit b9b4b1e34ef4eb72e48e408dd6e40495cfe0ae24)	2021-08-11 09:50:10 +00:00
Rakshith R	d4c84e814b	rbd: use `CSIInstanceID` var instead of "default" in RegenerateJournal() Signed-off-by: Rakshith R <rar@redhat.com> (cherry picked from commit 39d6752fc14868f315de3aaf21518b7727beeafa)	2021-08-11 09:50:10 +00:00
Madhu Rajanna	72a2b97be2	rbd: consider empty mirroring mode consider the empty mirroring mode when validating the snapshot interval and the scheduling time. Even if the mirroring Mode is not set validate the snapshot scheduling details as cephcsi sets the mirroring mode to default snapshot. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com> (cherry picked from commit 3c852199625333c8ccf8db18e592bb5627270d6b)	2021-08-10 12:55:41 +00:00
Madhu Rajanna	75ff33785b	rbd: log LastUpdate in UTC format This Commit converts the LastUpdate from int to the UTC format and logs it for better debugging. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com> (cherry picked from commit 2782878ea20c7d49f392ccdb948001eb0e1b83e0)	2021-08-10 08:56:08 +00:00
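A minimal sketch of the conversion this commit describes, assuming LastUpdate is an integer count of secon053ds since the Unix epoch (the commit only says it is an int). The function name `lastUpdateUTC` is hypothetical and is not taken from ceph-csi.

```go
package main

import (
	"fmt"
	"time"
)

// lastUpdateUTC renders an integer timestamp as a UTC time string.
// Assumption: the value is in seconds since the Unix epoch.
func lastUpdateUTC(lastUpdate int64) string {
	return time.Unix(lastUpdate, 0).UTC().Format(time.RFC3339)
}

func main() {
	// Log-friendly form of the raw integer value.
	fmt.Println(lastUpdateUTC(0))
}
```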
Rakshith R	0b43e91c77	rbd: fix snapshot id idempotency issue This commit fixes snapshot id idempotency issue by always returning an error when flattening is in progress and not using `readyToUse:false` response. Signed-off-by: Rakshith R <rar@redhat.com> (cherry picked from commit 825211730cce9c6a909e66fb9e7248ea35c17c8e)	2021-08-09 12:10:42 +00:00
Rakshith R	33234c1b51	cleanup: refactor checkCloneImage to reduce if nesting This commit refactors the checkCloneImage function to address a nestif linter issue. Signed-off-by: Rakshith R <rar@redhat.com> (cherry picked from commit 859d69627935fd526074eb494cacde8e9dd34402)	2021-08-09 12:10:42 +00:00
Madhu Rajanna	32faed322a	rbd: fix clone problem This commit fixes a bug in checkCloneImage() which was caused by checking cloned image before checking on temp-clone image snap in a subsequent request which lead to stale images. This was solved by checking temp-clone image snap and flattening temp-clone if needed. This commit also fixes comparison bug in flattenCloneImage(). Signed-off-by: Rakshith R <rar@redhat.com> Signed-off-by: Madhu Rajanna <madhupr007@gmail.com> (cherry picked from commit a5a89527165af23c12957e6fd1a9c9c7f427ecef)	2021-08-09 12:10:42 +00:00
Madhu Rajanna	a7a5a527c2	rbd: copy creds when copying the connection rbd flatten functions is a CLI call and it expects the creds as the input and copying of creds is required when we generate the temp clone image. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com> (cherry picked from commit 916c97b4a87cfc4d9cad9aaaedcc33b0de75a032)	2021-08-09 12:10:42 +00:00
Rakshith R	33509ca90a	rbd: fix vol.VolID in cloneFromSnapshot() Volume generated from snap using genrateVolFromSnap already copies volume ID correctly, therefore removing `vol.VolID = rbdVol.VolID` which wrongly copies parent Volume ID instead leading to error from copyEncryption() on parent and clone volume ID being equal. Signed-off-by: Rakshith R <rar@redhat.com> (cherry picked from commit 08728b631b753ef44b7a4bd48d3eba383c497d35)	2021-08-09 12:10:42 +00:00
Madhu Rajanna	829fc5ed95	rbd: read clusterID and PoolID from mapping Whenever Ceph-CSI receives a CSI/Replication request it will first decode the volumeHandle and try to get the required OMAP details if it is not able to retrieve, receives a `Not Found` error message and Ceph-CSI will check for the clusterID mapping. If the old volumeID `0001-00013-site1-storage-0000000000000001 -b0285c97-a0ce-11eb-8c66-0242ac110002` contains the `site1-storage` as the clusterID, now Ceph-CSI will look for the corresponding clusterID `site2-storage` from the above configmap. If the clusterID mapping is found now Ceph-CSI will look for the poolID mapping ie mapping between `1` and `2`. Example:- pool with name exists on both the clusters with different ID's Replicapool with ID `1` on site1 and Replicapool with ID `2` on site2. After getting the required mapping Ceph-CSI has the required information to get more details from the rados OMAP. If we have multiple clusterID mapping it will loop through all the mapping and checks the corresponding pool to get the OMAP data. If the clusterID mapping does not exist Ceph-CSI will return an `Not Found` error message to the caller. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com> (cherry picked from commit 92ad2ceec977f482060544c94f9228cfdcf586cb)	2021-08-09 09:24:16 +00:00
Madhu Rajanna	daea5177e5	util: add helper function to read clusterID mapping added helper function to read the clusterID mapping from the mounted file. The clusterID mapping contains below mappings * ClusterID mappings (to cluster to which we are failingover and from which cluster failover happened) * RBD PoolID mapping of between the clusters. * CephFS FscID mapping between the clusters. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com> (cherry picked from commit ac11d71e19acfd46b9b6f157902d35c9dd7f953f)	2021-08-09 09:24:16 +00:00
Niels de Vos	bc24a8c8ac	util: allow configuring VAULT_AUTH_MOUNT_PATH for Vault Tenant SA KMS The VAULT_AUTH_MOUNT_PATH is a Vault configuration parameter that allows a user to set a non default path for the Kubernetes ServiceAccount integration. This can already be configured for the Vault KMS, and is now added to the Vault Tenant SA KMS as well. Signed-off-by: Niels de Vos <ndevos@redhat.com> (cherry picked from commit 4859f2dfdb88304cc484402739787adcbea4ed5f)	2021-08-06 09:30:32 +00:00
Alexandre Lossent	7688bc3a7a	cephfs: support selinux mount options - mount host's /etc/selinux in node plugins - process mount options in all code paths for cephfs volume options Signed-off-by: Alexandre Lossent <alexandre.lossent@cern.ch> (cherry picked from commit 5cba04c470d259438f8608af9918d5d3ac338d58)	2021-08-05 08:37:52 +00:00
Niels de Vos	b866bd491c	util: add vaultAuthNamespace option for Vault KMS The new `vaultAuthNamespace` configuration parameter can be set to the Vault Namespace where the authentication is setup in the service. Some Hashicorp Vault deployments use sub-namespaces for their users/tenants, with a 'root' namespace where the authentication is configured. This requires passing of different Vault namespaces for different operations. Example: - the Kubernetes Auth mechanism is configured for in the Vault Namespace called 'devops' - a user/tenant has a sub-namespace called 'devops/website' where the encryption passphrases can be placed in the key-value store The configuration for this, then looks like: vaultAuthNamespace: devops vaultNamespace: devops/homepage Note that Vault Namespaces are a feature of the Hashicorp Vault Enterprise product, and not part of the Open Source version. This prevents adding e2e tests that validate the Vault Namespace configuration. Signed-off-by: Niels de Vos <ndevos@redhat.com> (cherry picked from commit f2d5c2e0df8e2454bccc3c290600452989ebae97)	2021-08-05 06:44:23 +00:00
Niels de Vos	a962cccd0a	util: correct error message when connecting to Vault fails Signed-off-by: Niels de Vos <ndevos@redhat.com> (cherry picked from commit 83167e2ac55068422d59f7eb0f5c53280d7735b2)	2021-08-05 06:44:23 +00:00
Artur Troian	82fd1e5248	util: getCgroupPidsFile produces stripped path when extra : present This commit uses `strings.SplitN` instead of `strings.Split`. The path for pids.max has extra `:` symbols in it, due to which getCgroupPidsFile() splits the string into 5 tokens instead of 3, leading to loss of part of the path. As a result, the below error is reported: `Failed to get the PID limit, can not reconfigure: open /sys/fs/cgroup/pids/system.slice/containerd.service/ kubepods-besteffort-pod183b9d14_aed1_4b66_a696_da0c738bc012.slice/pids.max: no such file or directory` SplitN takes an argument n and splits the string into at most n substrings, which helps us to get the desired file path. Fixes: #2337 Co-authored-by: Yati Padia <ypadia@redhat.com> Signed-off-by: Yati Padia <ypadia@redhat.com> (cherry picked from commit 16ec97d8f75ae362c1e0f243601d39487d7c092c)	2021-08-04 07:11:26 +00:00
Madhu Rajanna	8f185bf7b2	rbd: use rados namespace for manager command Currently we have a bug: we are not using the rados namespace when adding the ceph manager command to remove the image from the trash. This commit adds the missing rados namespace when adding the ceph manager task. Without the fix, the image is moved to the trash but no task is added to remove it from the trash; it becomes Ceph's responsibility to remove the image when it cleans up the trash. Workaround: manually purge the trash. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-07-28 03:48:33 +00:00
Niels de Vos	ec6703ed58	rbd: rename encryption metadata keys to enable mirroring RBD image metadata keys that start with '.rbd' are expected to be internal to RBD itself and are not mirrored to remote sites. Renaming the keys (dropping the '.' prefix) and using the new MigrateMetadata() function now makes the keys available on remote sites too. Closes: #2219 Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-07-26 11:49:56 +00:00
Niels de Vos	607129171d	rbd: move image metadata key migration to its own function The new MigrateMetadata() function can be used to get the metadata of an image with a deprecated and new key. Renaming metadata keys can be done easily this way. A default value will be set in the image metadata when it is missing completely. But if the deprecated key was set, the data is stored under the new key and the deprecated key is removed. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-07-26 11:49:56 +00:00
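The behaviour described for MigrateMetadata() can be approximated with a sketch over an in-memory map. This is not the ceph-csi signature: `metadataStore`, `migrateMetadata`, and the key names are hypothetical, and checking the new key first is an assumption that the commit message does not state.

```go
package main

import "fmt"

// metadataStore is a hypothetical stand-in for RBD image metadata.
type metadataStore map[string]string

// migrateMetadata returns the value for newKey, migrating it from
// deprecatedKey when only the deprecated key is set, and storing
// defaultValue when neither key exists.
func migrateMetadata(md metadataStore, deprecatedKey, newKey, defaultValue string) string {
	// Assumption: an existing value under the new key wins.
	if v, ok := md[newKey]; ok {
		return v
	}
	// Deprecated key set: store under the new key, remove the old one.
	if v, ok := md[deprecatedKey]; ok {
		md[newKey] = v
		delete(md, deprecatedKey)
		return v
	}
	// Missing completely: set the default in the metadata.
	md[newKey] = defaultValue
	return defaultValue
}

func main() {
	// Hypothetical key names; the deprecated one carries the '.' prefix.
	md := metadataStore{".rbd.example.key": "encrypted"}
	fmt.Println(migrateMetadata(md, ".rbd.example.key", "rbd.example.key", "unset"))
	fmt.Println(md)
}
```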
Yati Padia	6691951453	rbd: use go-ceph for getImageMirroringStatus Currently, getImageMirroringStatus() is using RBD CLI. This commit converts RBD CLI to go-ceph API. Fixes: #2120 Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-07-26 06:37:40 +00:00
Prasanna Kumar Kalever	526ff95f10	rbd: add support to expand encrypted volume Previously in ControllerExpandVolume() we had a check for encrypted volumes and we use to fail for all expand requests on an encrypted volume. Also for Block VolumeMode PVCs NodeExpandVolume used to be ignored/skipped. With these changes, we add support for the expansion of encrypted volumes. Also for raw Block VolumeMode PVCs with Encryption we call NodeExpandVolume. That said, With LUKS1, cryptsetup utility doesn't prompt for a passphrase on resizing the crypto mapper device. This is because LUKS1 devices don't use kernel keyring for volume keys. Whereas, LUKS2 devices use kernel keyring for volume key by default, i.e. cryptsetup utility asks for a passphrase if it detects volume key was previously passed to dm-crypt via kernel keyring service, we are overriding the default by --disable-keyring option during cryptsetup open command. So that at the time of crypto mapper device resize we will not be prompted for any passphrase. Fixes: #1469 Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-07-23 10:00:23 +00:00
Prasanna Kumar Kalever	4fa05cb3a1	util: add helper functions for resize of encrypted volume such as: ResizeEncryptedVolume() and LuksResize() Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-07-23 10:00:23 +00:00
Prasanna Kumar Kalever	572f39d656	util: fix log level in OpenEncryptedVolume() Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-07-23 10:00:23 +00:00
Prasanna Kumar Kalever	812003eb45	util: fix bug in DeviceEncryptionStatus() With Luks1 device: $ cryptsetup status /dev/mapper/crypto-rbd0 /dev/mapper/crypto-rbd0 is active and is in use. type: LUKS1 cipher: aes-xts-plain64 keysize: 512 bits key location: dm-crypt device: /dev/rbd0 sector size: 512 offset: 4096 sectors size: 4190208 sectors mode: read/write With Luks2 device: $ cryptsetup status /dev/mapper/crypto-rbd0 /dev/mapper/crypto-rbd0 is active and is in use. type: LUKS2 cipher: aes-xts-plain64 keysize: 512 bits key location: dm-crypt device: /dev/rbd0 sector size: 512 offset: 32768 sectors size: 4161536 sectors mode: read/write This could lead to failures with unmap in the NodeUnstageVolume path for the encrypted volumes. Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-07-23 10:00:23 +00:00
Yati Padia	1ae2afe208	cleanup: modifies the error caused due to merged PRs This commit modifies the error of godot, cyclop, paralleltest linter caused due to merged PRs. Updates: #1586 Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-07-22 18:15:48 +00:00
Yati Padia	172b66f73f	cleanup: resolves cyclop linter issue This commit adds `// nolint:cyclop` for the functions whose complexity is above 20. Updates: #1586 Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-07-22 18:15:48 +00:00
Humble Chirammal	abe6a6e5ac	util: remove deleteLock test as it is enforced by the controller Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-07-22 15:07:49 +00:00
Humble Chirammal	c42d4768ca	util: remove the deleteLock acquisition check for clone and snapshot At present, while acquiring the deleteLock on the volume, we check for ongoing clone and snapshot creation operations on the same volume. Considering that the snapshot and clone controllers do not allow parent volume deletion during these operations, we can drop this extra check. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-07-22 15:07:49 +00:00
Niels de Vos	82557e3f34	util: allow configuring VAULT_BACKEND for Vault connection It seems that the version of the key/value engine can not always be detected for Hashicorp Vault. In certain cases, it is required to configure the `VAULT_BACKEND` (or `vaultBackend`) option so that a successful connection to the service can be made. The `kv-v2` is the current default for development deployments of Hashicorp Vault (what we use for automated testing). Production deployments default to version 1 for now. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-07-22 13:02:47 +00:00
Rakshith R	43f753760b	cleanup: resolve nlreturn linter issues nlreturn linter requires a new line before return and branch statements except when the return is alone inside a statement group (such as an if statement) to increase code clarity. This commit addresses such issues. Updates: #1586 Signed-off-by: Rakshith R <rar@redhat.com>	2021-07-22 06:05:01 +00:00
Yati Padia	3469dfc753	cleanup: resolve errorlint issues This commit resolves errorlint issues which checks for the code that will cause problems with the error wrapping scheme. Updates: #1586 Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-07-19 13:31:29 +00:00
Yati Padia	bfda5fa57f	cleanup: resolve revive linter issue revive linter checks for var-declaration format. For example: "e2e/rbd_helper.go:441:36: var-declaration: should drop = nil from declaration of var noPVCValidation; it is the zero value (revive) var noPVCValidation validateFunc = nil" Updates: #1586 Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-07-19 08:39:32 +00:00
Humble Chirammal	bd947bbe31	util: remove deleteLock check while acquiring snapshot createLock snapshot controller make sure the pvc which is the source for the snapshot request wont get deleted while snapshot is getting created, so we dont need to check for any ongoing delete operation here on the volume. Subjected code path in snapshot controller: ``` pvc, err := ctrl.getClaimFromVolumeSnapshot(snapshot) . .. pvcClone.ObjectMeta.Finalizers = append(pvcClone.ObjectMeta.Finalizers, utils.PVCFinalizer) _, err = ctrl.client.CoreV1().PersistentVolumeClaims(pvcClone.Namespace).Update(..) ``` Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-07-17 10:23:13 +00:00
Prasanna Kumar Kalever	78f740d903	rbd: improve healer to run multiple NodeStageVolume req concurrently This will bring down the healer run time by a great factor. Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-07-16 16:30:58 +00:00
Prasanna Kumar Kalever	b6a88dd728	rbd: add volume healer Problem: ------- For rbd nbd userspace mounter backends, after a restart of the nodeplugin all the mounts will start seeing IO errors. This is because, for rbd-nbd backends there will be a userspace mount daemon running per volume, post restart of the nodeplugin pod, there is no way to restore the daemons back to life. Solution: -------- The volume healer is a one-time activity that is triggered at the startup time of the rbd nodeplugin. It navigates through the list of volume attachments on the node and acts accordingly. For now, it is limited to nbd type storage only, but it is flexible and can be extended in the future for other backend types as needed. From a few feets above: This solves a severe problem for nbd backed csi volumes. The healer while going through the list of volume attachments on the node, if finds the volume is in attached state and is of type nbd, then it will attempt to fix the rbd-nbd volumes by sending a NodeStageVolume request with the required volume attributes like secrets, device name, image attributes, and etc.. which will finally help start the required rbd-nbd daemons in the nodeplugin csi-rbdplugin container. This will allow reattaching the backend images with the right nbd device, thus allowing the applications to perform IO without any interruptions even after a nodeplugin restart. Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-07-16 16:30:58 +00:00
Prasanna Kumar Kalever	6007fc9bfe	cleanup: move static volume check to helper function Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-07-16 16:30:58 +00:00
Prasanna Kumar Kalever	6d24080851	rbd: update per volume metadata stash-file with devicePath As part of stage transaction if the mounter is of type nbd, then capture device path after a successful rbd-nbd map. Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-07-16 16:30:58 +00:00
Prasanna Kumar Kalever	70998571aa	cleanup: change variable name from path to metaDataPath path is used by standard package. Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2021-07-16 16:30:58 +00:00
Humble Chirammal	94c5c5e119	util: remove deleteLock while we acquire clone operation lock clone controller make sure there is no delete operation happens on the source PVC which has been referred as the datasource of clone PVC, we are safe to operate without looking at delete operation lock in this case. Subjected code in the controller: ... if claim.Spec.DataSource != nil && rc.clone { err = p.setCloneFinalizer(ctx, claim) ... } if !checkFinalizer(claim, pvcCloneFinalizer) { claim.Finalizers = append(claim.Finalizers, pvcCloneFinalizer) _, err := p.client.CoreV1().PersistentVolumeClaims(claim.Namespace).Update(..claim..) } Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-07-16 12:32:28 +00:00
Humble Chirammal	e088e8fd2e	cephfs: Get rid of locking at nodepublish Considering kubelet make sure the stage and publish operations are serialized, we dont need any extra locking in nodePublish Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-07-16 07:18:56 +00:00
Humble Chirammal	61bf49a4f5	rbd: Get rid of locking at nodePublish Considering kubelet make sure the stage and publish operations are serialized, we dont need any extra locking in nodePublish Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-07-16 07:18:56 +00:00
Humble Chirammal	ced3a0922f	cephfs: Get rid of locking at nodeUnpublish call Considering kubelet make sure the unstage and unpublish operations are serialized, we dont need any extra locking in nodeUnpublish Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-07-16 07:18:56 +00:00
Humble Chirammal	ef852cc93d	rbd: Get rid of locking at nodeUnpublish call Considering kubelet make sure the unstage and unpublish operations are serialized, we dont need any extra locking in nodeUnpublish Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-07-16 07:18:56 +00:00
Yati Padia	f36d611ef9	cleanup: resolves gofumpt issues of internal codes This PR runs gofumpt for internal folder. Updates: #1586 Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-07-14 19:50:56 +00:00
Yati Padia	299979fc14	ci: add unit test for toError() This commit adds unit test for the func converting cephFSCloneState to error. Fixes: #2259 Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-07-14 15:02:12 +00:00
Yati Padia	c66872c3c6	cleanup: ineffective assignment This commit resolves the ineffective assignment of snap. Updates: #1586 Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-07-14 12:39:17 +00:00
Yati Padia	f210d5758b	cleanup: spell check getImageMirroingStatus This commit corrects the spelling for getImageMirroingStatus() -> getImageMirroringStatus Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-07-14 07:32:01 +00:00
Niels de Vos	d941e5abac	util: make parseTenantConfig() usable for modular KMSs parseTenantConfig() only allowed configuring a defined set of options, and KMSs were not able to re-use the implementation. Now, the function parses the ConfigMap from the Tenants Namespace and returns a map with options that the KMS supports. The map that parseTenantConfig() returns can be inspected by the KMS, and applied to the vaultTenantConnection type by calling parseConfig(). Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-07-13 17:16:35 +00:00
Niels de Vos	3d7d48a4aa	util: VaultTenantSA KMS implementation This new KMS uses a Kubernetes ServiceAccount from a Tenant (Namespace) to connect to Hashicorp Vault. The provisioner and node-plugin will check for the configured ServiceAccount and use the token that is located in one of the linked Secrets. Subsequently the Vault connection is configured to use the Kubernetes token from the Tenant. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-07-13 17:16:35 +00:00
Niels de Vos	6dc5bf2b29	util: split vaultTenantConnection from VaultTokensKMS This makes the Tenant configuration for Hashicorp Vault KMS connections more modular. Additional KMS implementations that use Hashicorp Vault with per-Tenant options can re-use the new vaultTenantConnection. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-07-13 17:16:35 +00:00


625 Commits