After failover of workloads to the secondary
cluster when the primary cluster is down,
the RBD image is not marked healthy and the VR
resources are not promoted to Primary:
in VolumeReplication, the `CURRENT STATE`
remains Unknown and doesn't change to Primary.
This happens because the primary cluster went down
and we have force-promoted the image on the
secondary cluster, so the image stays in
up+stopping_replay or could be in any other state.
The assumption so far was that the image will
always be `up+stopped`. But the image will be in
`up+stopped` only for a planned failover and it
could be in any other state if it is a forced
failover. For this reason, remove
checkHealthyPrimary from the PromoteVolume RPC call.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 8c5563a9bc)
The command to disable the apache arrow repo is removed, since
it is no longer needed.
The command to disable the tcmu repo is added to make the build pass.
refer: https://github.com/ceph/ceph-container/issues/2034
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 5ed305850f)
When using `lock_on_read`, the RBD image needs to have the
`exclusive-lock` feature enabled too.
Fixes: #3221
Signed-off-by: Niels de Vos <ndevos@redhat.com>
(cherry picked from commit 2df55a55a3)
Removed the code in checkHealthyPrimary which
makes the ceph call; the result is passed in as input now.
Added a unit test for the checkHealthyPrimary function.
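A minimal sketch of what such a table-driven test could look like, assuming a hypothetical `checkHealthyPrimary(mirrorStatus)` signature that takes the mirror status as input; the actual ceph-csi types and function signature differ:
```go
// checkhealthy_test.go: illustrative only, not the ceph-csi code.
package main

import "testing"

// mirrorStatus is a stand-in for the status that is now passed in
// instead of being fetched inside checkHealthyPrimary.
type mirrorStatus struct {
	state string // e.g. "up+stopped", "up+stopping_replay", "down+unknown"
}

// checkHealthyPrimary reports whether the image looks like a healthy
// primary, based solely on the status passed in.
func checkHealthyPrimary(st mirrorStatus) bool {
	return st.state == "up+stopped"
}

func TestCheckHealthyPrimary(t *testing.T) {
	tests := []struct {
		name string
		st   mirrorStatus
		want bool
	}{
		{"healthy primary", mirrorStatus{state: "up+stopped"}, true},
		{"stopping replay", mirrorStatus{state: "up+stopping_replay"}, false},
		{"daemon down", mirrorStatus{state: "down+unknown"}, false},
	}
	for _, tc := range tests {
		if got := checkHealthyPrimary(tc.st); got != tc.want {
			t.Errorf("%s: got %v, want %v", tc.name, got, tc.want)
		}
	}
}
```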
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 8a47904e8f)
We need to check that the image is in the up+stopped state,
not just in any one of the states; for that we need to use an
OR check, not an AND check.
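An illustrative sketch of the boolean fix, using hypothetical `up`/`state` values rather than the real ceph-csi/librbd types: the image is healthy only when it is both up and stopped, so the failure branch must trigger when either condition does not hold, i.e. with OR.
```go
package main

import (
	"errors"
	"fmt"
)

// isHealthyPrimary shows the corrected condition: the failure branch
// uses OR, so any unmet condition makes the check fail.
func isHealthyPrimary(up bool, state string) error {
	// wrong (old): if !up && state != "stopped" { ... }
	// right (new): fail when either check does not hold.
	if !up || state != "stopped" {
		return errors.New("image is not in up+stopped state")
	}
	return nil
}

func main() {
	// up but still replaying: the AND version would wrongly treat this
	// as healthy, the OR version correctly reports an error.
	fmt.Println(isHealthyPrimary(true, "stopping_replay"))
	fmt.Println(isHealthyPrimary(true, "stopped"))
}
```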
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 53e76fab69)
When the image is force-promoted to primary on the
cluster, the remote image might not be in the replaying
state due to the split-brain state. This
reverts commit c3c87f2ef3, which was added
to check the remote image status.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 704cb5c941)
Kubernetes 1.24 and newer use a different path for staging the volume.
That means the CSI-driver is requested to mount the volume at another
location, compared to previous versions of Kubernetes. CSI-drivers
implementing the volumeHealer must receive the correct path, otherwise,
after a nodeplugin restart, the NBD mounts will bail out when attempting
the NodeStageVolume() call and return an error.
See-also: kubernetes/kubernetes#107065
Fixes: #3176
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
(cherry picked from commit 1da446d2f2)
During failover we demote the volume on the primary.
As the image is not yet promoted on the remote cluster,
there are spurious split-brain errors reported by RBD,
and the Ceph-CSI resync will attempt to resync from the
"known" secondary and that will cause data loss.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 3acaa018db)
This version is required to support k8s 1.24,
which is in turn required to test the new SA token
behavior.
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 7a00736b4f)
Create the token if the Kubernetes version is
1.24+ and use it for the Vault SA.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 7a2dd4c3cf)
In the case of a pre-provisioned volume the clusterID is
not set in the volume context; as the clusterID is missing,
we cannot extract the NetNamespaceFilePath from the
configuration file. For static volumes and dynamically
provisioned volumes the clusterID is set.
Note: this is a special case to support mounting a PV
without the clusterID parameter.
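A hedged sketch of the special case described above; `getNetNamespaceFilePath`, the map key, and the path are illustrative stand-ins for the actual ceph-csi helpers and configuration:
```go
package main

import "fmt"

// getNetNamespaceFilePath is a stand-in for the lookup that reads the
// NetNamespaceFilePath for a clusterID from the configuration file.
func getNetNamespaceFilePath(clusterID string) (string, error) {
	return "/plugin-dir/" + clusterID + "/net", nil // illustrative only
}

// netNamespacePath returns an empty path when the clusterID is absent
// (pre-provisioned PV without the clusterID parameter), instead of
// failing the mount.
func netNamespacePath(volumeContext map[string]string) (string, error) {
	clusterID := volumeContext["clusterID"]
	if clusterID == "" {
		// special case: no clusterID, so the NetNamespaceFilePath
		// cannot be looked up from the configuration file.
		return "", nil
	}
	return getNetNamespaceFilePath(clusterID)
}

func main() {
	p, _ := netNamespacePath(map[string]string{})
	fmt.Printf("netns path: %q\n", p)
}
```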
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit c9943320ac)
Before the change, the error msg was the following:
```
failed to set VAULT_AUTH_MOUNT_PATH in Vault config: path is empty
```
`vaultAuthPath` is the actual variable name set by the
user. The error message will now be the following:
```
failed to set "vaultAuthPath" in vault config: path is empty
```
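A minimal sketch of the formatting change, assuming a generic `setConfigString`-style helper (not the exact ceph-csi code): quoting the user-facing key with `%q` makes the error reference the variable name the user actually set.
```go
package main

import (
	"errors"
	"fmt"
)

var errConfigMissing = errors.New("path is empty")

// setConfigString is a stand-in for the Vault config helper; the point is
// only the error wording: report the user-facing key (e.g. "vaultAuthPath")
// instead of the internal VAULT_* environment variable name.
func setConfigString(key, value string) error {
	if value == "" {
		return fmt.Errorf("failed to set %q in vault config: %w", key, errConfigMissing)
	}
	return nil
}

func main() {
	fmt.Println(setConfigString("vaultAuthPath", ""))
	// prints: failed to set "vaultAuthPath" in vault config: path is empty
}
```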
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 7688306f87)
Continue running the rbd driver when the /sys/bus/rbd/supported_features
file is missing; do not bail out.
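A hedged sketch of the intended behavior, not the actual ceph-csi code: when /sys/bus/rbd/supported_features does not exist, log and continue with zero features instead of returning the error.
```go
package main

import (
	"log"
	"os"
	"strconv"
	"strings"
)

const krbdSupportedFeaturesFile = "/sys/bus/rbd/supported_features"

// readKrbdFeatures returns the krbd feature bitmask, or 0 when the
// sysfs file is missing (older kernels), without treating that as fatal.
func readKrbdFeatures() (uint64, error) {
	data, err := os.ReadFile(krbdSupportedFeaturesFile)
	if os.IsNotExist(err) {
		log.Printf("%s not found, continuing without krbd feature detection",
			krbdSupportedFeaturesFile)
		return 0, nil
	}
	if err != nil {
		return 0, err
	}
	// the file contains a value such as "0x3fffffffff"
	return strconv.ParseUint(strings.TrimSpace(string(data)), 0, 64)
}

func main() {
	features, err := readKrbdFeatures()
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("krbd supported features: 0x%x", features)
}
```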
Fixes: #2678
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
(cherry picked from commit 6470cf3343)
krbdFeatures is set to zero when the kernel version is < 3.8, i.e. in the
case where /sys/bus/rbd/supported_features is absent and we are unable to
prepare the krbd attributes based on the kernel version.
When krbdFeatures is set to zero, fall back to NBD only when autofallback
is turned ON.
Fixes: #2678
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
(cherry picked from commit 83cc1b0e58)
Upstream, /sys/bus/rbd/supported_features is part of Linux kernel v4.11.0.
Prepare the attributes and use them in case
/sys/bus/rbd/supported_features is missing.
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
(cherry picked from commit e53fd87154)
`bash -E` causes inheritance of the ERR trap into shell functions,
command substitutions, and commands executed in a subshell environment.
Because the `kubectl_retry` function depends on detecting an error of a
subshell, the ERR trap should not be executed. The trap contains
extra logging, and exits the script in the `rook.sh` case. Aborting
the script is not wanted when a retry is expected to be done.
While checking for known failures, the `grep` command may exit with 1
if there are no matches. That means the `ret` variable will be set to
0, but there will also be an error exit status. This causes `bash -E` to
abort the function and call the ERR trap.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
(cherry picked from commit 4891e534d3)
For the default mounter, the mounter option
will not be set in the storageclass, and as it is
not available in the storageclass it will not
be set in the volume context either. Because of this
the mapOptions are getting discarded. If the mounter
is not set, assume it is the rbd mounter.
Note: if the mounter is not set in the storageclass
we could set it in the volume context explicitly;
doing this check in the node server supports
existing volumes (backward compatibility), and as the
check is minimal we are not altering the volume context.
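A minimal sketch of the described node-server check, with hypothetical names: if the volume context carries no mounter key, treat it as the default rbd (krbd) mounter so the mapOptions are not discarded.
```go
package main

import "fmt"

const rbdDefaultMounter = "rbd"

// mounterFromContext picks the mounter from the volume context, falling
// back to the default krbd mounter when the storageclass (and therefore
// the volume context) did not set one.
func mounterFromContext(volumeContext map[string]string) string {
	if m, ok := volumeContext["mounter"]; ok && m != "" {
		return m
	}
	return rbdDefaultMounter
}

func main() {
	fmt.Println(mounterFromContext(map[string]string{}))                     // "rbd"
	fmt.Println(mounterFromContext(map[string]string{"mounter": "rbd-nbd"})) // "rbd-nbd"
}
```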
fixes: #3076
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 70674565df)
Still seeing the commitlint issue
below:
fatal: unsafe repository
('/go/src/github.com/ceph/ceph-csi'
is owned by someone else)
To add an exception for this directory,
call:
git config --global --add safe.directory \
/go/src/github.com/ceph/ceph-csi
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit df047ddaaf)
This commit is added to use the canary csi-provisioner image
to test the different-storageclass pvc-pvc cloning feature, which is not
yet present in released versions.
refer:
https://github.com/kubernetes-csi/external-provisioner/pull/699
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit c880061882)
# Conflicts:
# charts/ceph-csi-rbd/values.yaml
# deploy/rbd/kubernetes/csi-rbdplugin-provisioner.yaml
This commit makes modifications to allow pvc-pvc clone
with different storageclasses having different encryption
configs.
This commit also modifies `copyEncryptionConfig()` to
include a `isEncrypted()` check within the function.
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit f1ccc4eced)
Before the change, the error msg was the following:
```
failed to set VAULT_AUTH_MOUNT_PATH in Vault config: path is empty
```
`vaultAuthPath` is the actual variable name set by the
user. The error message will now be the following:
```
failed to set "vaultAuthPath" in vault config: path is empty
```
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit bd57feb26e)
In case the NFS-export has already been removed from the NFS-server, but
the CSI Controller was restarted, a retry to remove the NFS-volume will
fail with an error like:
> GRPC error: ....: response status not empty: "Export does not exist"
When this error is reported, assume the NFS-export was already removed
from the NFS-server configuration, and continue with deleting the
backend volume.
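A hedged sketch of the idempotent-delete behavior described above; the error-string match and helper names are illustrative, not the exact ceph-csi implementation.
```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// deleteExport is a stand-in for the call that removes the NFS-export
// from the NFS-server configuration.
func deleteExport(name string) error {
	return errors.New(`response status not empty: "Export does not exist"`)
}

// removeExportIdempotent treats an already-removed export as success, so
// that deleting the backend volume can continue after a controller restart.
func removeExportIdempotent(name string) error {
	err := deleteExport(name)
	if err != nil && strings.Contains(err.Error(), "Export does not exist") {
		// the export is gone already, nothing left to clean up here
		return nil
	}
	return err
}

func main() {
	fmt.Println(removeExportIdempotent("myexport")) // <nil>
}
```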
Signed-off-by: Niels de Vos <ndevos@redhat.com>
(cherry picked from commit 9d7faf850f)
Depending on the Kubernetes version, the following warning is reported
regularly:
> Warning: policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+,
> unavailable in v1.25+
The warning is written to stderr, so skipping AlreadyExists or NotFound
is not sufficient to trigger a retry. Ignoring '^Warning:' in the stderr
output should prevent unneeded failures while deploying Rook or other
components.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
(cherry picked from commit 89e2ff39f1)
Rook deployments fail quite regularly in the CI environment now. It is
not clear what the cause is; hopefully a little better logging will
guide us to the issue.
Now executing `kubectl` in a sub-shell, ensuring that the redirection of
the command lands in the right files.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
(cherry picked from commit 1e21972956)
When running the Kubernetes cluster with one single privileged
PodSecurityPolicy which allows everything, the nodeplugin
daemonset can fail to start. To be precise, the problem is the
defaultAllowPrivilegeEscalation: false configuration in the PSP.
Containers of the nodeplugin daemonset won't start when they
have privileged: true but no allowPrivilegeEscalation in their
container securityContext.
Kubernetes will not schedule if this mismatch exists:
cannot set allowPrivilegeEscalation to false and privileged to true
Signed-off-by: Silvan Loser <silvan.loser@hotmail.ch>
Signed-off-by: Silvan Loser <33911078+losil@users.noreply.github.com>
(cherry picked from commit f2e0fa28fb)
When running the Kubernetes cluster with one single privileged
PodSecurityPolicy which allows everything, the nodeplugin
daemonset can fail to start. To be precise, the problem is the
defaultAllowPrivilegeEscalation: false configuration in the PSP.
Containers of the nodeplugin daemonset won't start when they
have privileged: true but no allowPrivilegeEscalation in their
container securityContext.
Kubernetes will not schedule if this mismatch exists:
cannot set allowPrivilegeEscalation to false and privileged to true
Signed-off-by: Silvan Loser <silvan.loser@hotmail.ch>
Signed-off-by: Silvan Loser <33911078+losil@users.noreply.github.com>
(cherry picked from commit 06c4477ff9)
Updated the doc for the 3.6.1 release; this will
be backported to the release-v3.6 branch, and
we will make the deployment changes and do the release.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 5e1a074ea3)
As the same host directory is not shared between
the cephfs and the rbd plugin pods, we need
to keep the netNamespaceFilePath separately
for both cephfs and rbd. The CephFS plugin will
use this path to execute mount -t commands.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit d2bc9743f7)
Adjusted the mix of tabs and spaces and also
used a block comment for better readability.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit eb4bfb7326)
As radosNamespace is more specific to
RBD than to the general ceph configuration, we
introduced a new RBD section for RBD-specific
options. Moving the radosNamespace to the RBD section
while still keeping radosNamespace under the
global ceph-level configuration for backward
compatibility.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit b4acbd08a5)
As the netNamespaceFilePath can be separate for
cephfs and rbd, adding the netNamespaceFilePath
for RBD. This will help us keep the RBD-specific
and CephFS-specific options separate.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 766346868e)
The NFS Controller returns a non-gRPC error in case the CreateVolume
call for the CephFS volume fails. It is better to return the gRPC-error
that the CephFS Controller passed along.
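A minimal sketch of passing the wrapped gRPC error through instead of returning a plain error, using the standard google.golang.org/grpc/status and codes packages; the createCephFSVolume helper is hypothetical.
```go
package main

import (
	"fmt"

	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// createCephFSVolume is a placeholder for the CephFS CreateVolume call,
// which already returns a gRPC status error on failure.
func createCephFSVolume() error {
	return status.Error(codes.InvalidArgument, "invalid volume parameters")
}

// createNFSVolume passes a gRPC error from the CephFS controller through
// unchanged, and only wraps non-gRPC errors itself.
func createNFSVolume() error {
	err := createCephFSVolume()
	if err == nil {
		return nil
	}
	if _, ok := status.FromError(err); ok {
		return err // keep the original gRPC code and message
	}
	return status.Error(codes.Internal, err.Error())
}

func main() {
	err := createNFSVolume()
	fmt.Println(status.Code(err), err)
}
```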
Signed-off-by: Niels de Vos <ndevos@redhat.com>
(cherry picked from commit 2b71aac752)
The NFS-Admin API has been added to go-ceph v0.15.0. As the API can not
be tested in the go-ceph CI, it requires build-tag `ceph_ci_untested`.
This additional build-tag has been added to the `Makefile` and should be
removed when the API does not require the build-tag anymore.
See-also: ceph/go-ceph#655
Signed-off-by: Niels de Vos <ndevos@redhat.com>
(cherry picked from commit 282c33cb58)
Recent versions of Ceph allow calling the NFS-export management
functions over the go-ceph API.
This seems incompatible with older versions that have been tested with
the `ceph nfs` commands that this commit replaces.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
(cherry picked from commit 28369702d2)
Use leases for leader election instead
of the deprecated configmap-based leader
election.
This PR makes leases the default leader election mechanism;
refer to
https://github.com/kubernetes-sigs/controller-runtime/pull/1773.
The default switch from configmap to configmap+leases was done with
https://github.com/kubernetes-sigs/controller-runtime/pull/1144.
Release notes:
https://github.com/kubernetes-sigs/controller-runtime/releases/tag/v0.7.0
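A hedged sketch of switching controller-runtime to lease-based leader election; it assumes a controller-runtime version that exposes LeaderElectionResourceLock in manager.Options (v0.7.0 and newer), and the ID/namespace values are placeholders.
```go
package main

import (
	"k8s.io/client-go/tools/leaderelection/resourcelock"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/manager"
)

func main() {
	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), manager.Options{
		LeaderElection: true,
		// use Lease objects instead of the deprecated ConfigMap lock
		LeaderElectionResourceLock: resourcelock.LeasesResourceLock,
		LeaderElectionID:           "example-ceph-csi-controller", // placeholder
		LeaderElectionNamespace:    "default",                     // placeholder
	})
	if err != nil {
		panic(err)
	}
	if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
		panic(err)
	}
}
```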
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit d886ab0d66)
Log the nsenter command and its arguments after executing
the command with the nsenter CLI.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit c245436ec4)
setRbdNbdToolFeatures was being called inside an init
function, which gets called in main.go for both the cephfs
and rbd drivers. Instead of calling it in the init function,
call it in the rbd driver.go, as this is specific
to rbd.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit dffb6e72c2)
This commit adds the nfs provisioner & plugin SA to
scc.yaml, to be used with OpenShift.
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 784b086ea5)
To consider the image healthy during the Promote
operation, currently we check only the image
state on the primary site. If the network is flaky
or the remote site is down, the image health is
not as expected. To make sure the image is healthy
across the clusters, check the state on both the local
and the remote clusters.
Some details:
https://bugzilla.redhat.com/show_bug.cgi?id=2014495
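An illustrative sketch (hypothetical types and expected states, not the go-ceph/ceph-csi API) of checking the status reported for both the local and the remote sites before promoting:
```go
package main

import (
	"errors"
	"fmt"
)

// siteStatus is a stand-in for one mirror site status entry.
type siteStatus struct {
	site  string // "" for the local site, otherwise the remote site name
	state string // e.g. "up+stopped", "up+replaying", "down+unknown"
}

// healthyAcrossClusters fails when any site (local or remote) does not
// report the state expected for it; the expected states are supplied by
// the caller.
func healthyAcrossClusters(statuses []siteStatus, expected map[string]string) error {
	if len(statuses) == 0 {
		return errors.New("no mirror status available")
	}
	for _, s := range statuses {
		if want, ok := expected[s.site]; ok && s.state != want {
			return fmt.Errorf("site %q is in state %q, expected %q", s.site, s.state, want)
		}
	}
	return nil
}

func main() {
	// example expectations: local image stopped, remote peer replaying
	expected := map[string]string{"": "up+stopped", "site-b": "up+replaying"}
	fmt.Println(healthyAcrossClusters([]siteStatus{
		{site: "", state: "up+stopped"},
		{site: "site-b", state: "up+replaying"},
	}, expected))
}
```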
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 64a9b1fa59)