Commit Graph

2327 Commits

Author SHA1 Message Date
Rakshith R
de4e661c6f ci: pass $DISK_CONFIG properly to minikube start
Having double quotes around $DISK_CONFIG led to these args
not being properly passed to minikube start. This commit fixes it.

Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 42a6c3c006)
2021-09-09 17:33:51 +00:00
Rakshith R
e5f6cc53f0 util: call WriteCephConfig() in cephcsi.go
This commit calls WriteCephConfig() in cephcsi.go to
create ceph.conf and keyring if it is not mounted to
be used by all cli calls and conn cmds.

Before this change, rbd-controller/omap-generator did not create
ceph.conf on startup.

Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 0a7a7f4866)
2021-09-09 13:32:12 +00:00
Madhu Rajanna
c8f8272d77 rbd: set vaultAuthNamespace to vaultNamespace if empty
When we read the csi-kms-connection-details configmap
vaultAuthNamespace might not be set when we do the
conversion the vaultAuthNamespace might be set to empty
key and this commits check for the empty value of
vaultAuthNamespace and set the vaultAuthNamespace
to vaultNamespace.

setting empty value for vaultAuthNamespace happened due
to Marshalling at https://github.com/ceph/ceph-csi/blob/devel/
internal/kms/vault_tokens.go#L136-L139.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 8c8f34cf7a)
2021-09-09 08:48:47 +00:00
Rakshith R
96429384ec ci: add support to create extra disks through minikube
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 1b64a0a505)
2021-09-07 12:27:23 +00:00
Rakshith R
ff325ca0f6 rebase: update minikube to v1.23.0
See-also: https://github.com/kubernetes/minikube/releases/tag/v1.23.0

Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 08c10c9f94)
2021-09-07 12:27:23 +00:00
Niels de Vos
24f92b2255 util: NewK8sClient() should not panic on non-Kubernetes clusters
When NewK8sClient() detects and error, it used to call FatalLogMsg()
which causes a panic. There are additional features that can be used on
Kubernetes clusters, but these are not a requirement for most
functionalities of the driver.

Instead of causing a panic, returning an error should suffice. This
allows using the driver on non-Kubernetes clusters again.

Fixes: #2452
Signed-off-by: Niels de Vos <ndevos@redhat.com>
(cherry picked from commit 60c2afbcca)
2021-09-02 16:50:40 +00:00
Rakshith R
cf93951f3b rbd: check for clusterid mapping in RegenerateJournal()
This commit adds fetchMappedClusterIDAndMons() which returns
monitors and clusterID info after checking cluster mapping info.
This is required for regenerating omap entries in mirrored cluster
with different clusterID.

Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 99168dc822)
2021-09-01 09:40:24 +00:00
Rakshith R
dcd2a8c900 rbd: move GetMappedID() to util package
This commit moves getMappedID() from rbd to util
package since it is not rbd specific and exports
it from there.

Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 496bcba85c)
2021-09-01 09:40:24 +00:00
Niels de Vos
d42814dfb9 util: fix unit-test for GetClusterMappingInfo()
Unit-testing often fails due to a race condition while writing the
clusterMappingConfigFile from multiple go-routines at the same time.
Failures from `make containerized-test` look like this:

    === CONT  TestGetClusterMappingInfo/site2-storage_cluster-id_mapping
        cluster_mapping_test.go:153: GetClusterMappingInfo() = <nil>, expected data &[{map[site1-storage:site2-storage] [map[1:3]] [map[11:5]]} {map[site3-storage:site2-storage] [map[8:3]] [map[10:5]]}]
    === CONT  TestGetClusterMappingInfo/site3-storage_cluster-id_mapping
        cluster_mapping_test.go:153: GetClusterMappingInfo() = <nil>, expected data &[{map[site3-storage:site2-storage] [map[8:3]] [map[10:5]]}]
    --- FAIL: TestGetClusterMappingInfo (0.01s)
        --- PASS: TestGetClusterMappingInfo/mapping_file_not_found (0.00s)
        --- PASS: TestGetClusterMappingInfo/mapping_file_found_with_empty_data (0.00s)
        --- PASS: TestGetClusterMappingInfo/cluster-id_mapping_not_found (0.00s)
        --- FAIL: TestGetClusterMappingInfo/site2-storage_cluster-id_mapping (0.00s)
        --- FAIL: TestGetClusterMappingInfo/site3-storage_cluster-id_mapping (0.00s)
        --- PASS: TestGetClusterMappingInfo/site1-storage_cluster-id_mapping (0.00s)

By splitting the public GetClusterMappingInfo() function into an
internal getClusterMappingInfo() that takes a filename, unit-testing can
use different files for each go-routine, and testing becomes more
predictable.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
(cherry picked from commit 8b71671b42665789fac4a4aa1453b0b107f475c6)
2021-09-01 09:40:24 +00:00
Niels de Vos
82b6857688 cleanup: address pylint "consider-using-with" in tracevol.py
pylint started to report errors like the following:

    troubleshooting/tools/tracevol.py:97:10: R1732: Consider using 'with' for resource-allocating operations (consider-using-with)

There probably has been an update of Pylint in the test-container that
is more strict than previous versions.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
(cherry picked from commit 544d73759c39a08d82f20ea674896abf7857d9ef)
2021-09-01 09:40:24 +00:00
Niels de Vos
addf6407b0 build: vendor code.cloudfoundry.org/gofileutils from GitHub
There is a problem accessing the code.cloudfoundry.org web service iver
TLS. It seems to redirect to GitHub, so use the package from there:

    running: go mod verify
    go: github.com/libopenstorage/secrets@v0.0.0-20210709082113-dde442ea20ec requires
    	github.com/hashicorp/vault@v1.4.2 requires
    	github.com/hashicorp/vault-plugin-auth-cf@v0.5.4 requires
    	github.com/cloudfoundry-community/go-cfclient@v0.0.0-20190201205600-f136f9222381 requires
    	code.cloudfoundry.org/gofileutils@v0.0.0-20170111115228-4d0c80011a0f: unrecognized import path "code.cloudfoundry.org/gofileutils": https fetch: Get "https://code.cloudfoundry.org/gofileutils?go-get=1": x509: certificate signed by unknown authority

Signed-off-by: Niels de Vos <ndevos@redhat.com>
(cherry picked from commit 32da0cf888ba452288a0e7436eed91cf7ca5dd4e)
2021-09-01 09:40:24 +00:00
Humble Chirammal
eb50407eac helm: correct the groupVersion of CSIDriver in the chart
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
(cherry picked from commit 3462cd9bbd)
2021-08-17 11:53:54 +00:00
Humble Chirammal
a78d24ce88 helm: correct watch verb in topology RBAC
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
(cherry picked from commit 8e00c2c810)
2021-08-17 11:53:54 +00:00
Madhu Rajanna
7690e43bed rbd: Cleanup OMAP data for secondary image
If the image is in a secondary state and its
up+replaying means its an healthy secondary
and the image is primary somewhere in the remote cluster
and the local image is getting replayed. Delete the
OMAP data generated as we cannot delete the
secondary image. When the image on the primary
cluster gets deleted/mirroring disabled, the image on
all the remote (secondary) clusters will get
auto-deleted. This helps in garbage collecting
the OMAP, PVC and PV objects after failback operation.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 5562e46d0f)
2021-08-17 04:36:04 +00:00
Madhu Rajanna
ad0009c427 rbd: return succuss if image is healthy secondary
If the image is in secondary state and its
up+replaying means its an healthy secondary
and the image is primary somewhere in the remote
cluster and the local image is getting replayed.
Return success for the Disabling mirroring as
we cannot disable the mirroring on the secondary
state, when the image on the remote site gets
disabled the image on all the remote (secondary)
will get auto deleted. This helps in garbage
collecting the volume replication kuberentes
artifacts

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit fc0d6f6b8b)
2021-08-17 04:36:04 +00:00
Madhu Rajanna
e42552dd2f rbd: add helper function to get local state
added helper function to check the local image
state is up+replaying.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 35324b2e17)
2021-08-17 04:36:04 +00:00
Rakshith R
8997a1bbdb ci: internally create & delete cephcsi namespace in install-helm.sh
This ensures the kubectl call is retried with kubectl_retry function.

Updates: #2309

Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 7fba62dd47)
2021-08-11 15:08:48 +00:00
Rakshith R
f2c4a6409f ci: use kubectl_retry in install_helm.sh script
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit eb8c1cd5ab)
2021-08-11 15:08:48 +00:00
Rakshith R
bfd5f820c5 ci: modify kubectl_retry() to handle NotFound on delete cmd
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 2b19197e2f)
2021-08-11 15:08:48 +00:00
Rakshith R
6a4194c701 ci: move kubectl_retry() to utils.sh to be able to import it
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit a15892a87a)
2021-08-11 15:08:48 +00:00
Rakshith R
342867a197 e2e: create reusable variable vaultUserSecretPath = "user-secret.yaml"
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 1d49b6a288)
2021-08-11 09:50:10 +00:00
Rakshith R
0593071dac e2e: add modification to test encrypted PVC with rbd controller
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 2f995eada2)
2021-08-11 09:50:10 +00:00
Rakshith R
f97c3f901d e2e: use retryKubectlFile() for creating & deleting secrets
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 8ca7a35820)
2021-08-11 09:50:10 +00:00
Rakshith R
33899663e1 e2e: add prefixname to rbd controller test
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 0744ad502b)
2021-08-11 09:50:10 +00:00
Rakshith R
a797b7e200 rbd: extract kmsID from volumeAttributes in RegenerateJournal()
This commit adds functionality of extracting encryption kmsID,
owner from volumeAttributes in RegenerateJournal() and adds utility
functions ParseEncryptionOpts and FetchEncryptionKMSID.

Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit f05ac2b25d)
2021-08-11 09:50:10 +00:00
Rakshith R
2545101842 rbd: extract volumeNamePrefix in RegenerateJournal()
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit b960e3633a)
2021-08-11 09:50:10 +00:00
Rakshith R
5189ccc13e rbd: refractor RegenerateJournal() to take in volumeAttributes
This commit refractors RegenerateJournal() to take in
volumeAttributes map[string]string as argument so it
can extract required attributes internally.

Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit b9b4b1e34e)
2021-08-11 09:50:10 +00:00
Rakshith R
d4c84e814b rbd: use CSIInstanceID var instead of "default" in RegenerateJournal()
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 39d6752fc1)
2021-08-11 09:50:10 +00:00
Madhu Rajanna
fbc1e5f3d5 e2e: retry running kubectl on known errors
By using retryKubectl helper function,
a retry will be done, and the known error
messages will be skipped.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 2c66dfc3e4)
2021-08-11 07:03:05 +00:00
Madhu Rajanna
f7e150b84f e2e: pass variadic argument to kubectl helper function
this provides caller ability to pass the arguments
like ignore-not-found=true etc when executing
the kubectl commands.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 2071c535fa)
2021-08-11 07:03:05 +00:00
Madhu Rajanna
64937f1f68 e2e: add retryKubectlArgs helper for kubectl retry
added helper function retryKubectlArgs to perform
action if its a known error.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 9f0af30735)
2021-08-11 07:03:05 +00:00
Madhu Rajanna
9e84583063 e2e: add isAlreadyExistsCLIError to check known error
added isAlreadyExistsCLIError to check for known error.
if error is already exists we are considering it
as a success.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit dd9fabf747)
2021-08-11 07:03:05 +00:00
Madhu Rajanna
72a2b97be2 rbd: consider empty mirroring mode
consider the empty mirroring mode when
validating the snapshot interval and
the scheduling time.
Even if the mirroring Mode is not set
validate the snapshot scheduling details
as cephcsi sets the mirroring mode to default
snapshot.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 3c85219962)
2021-08-10 12:55:41 +00:00
Madhu Rajanna
75ff33785b rbd: log LastUpdate in UTC format
This Commit converts the LastUpdate
from int to the UTC format and logs
it for better debugging.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 2782878ea2)
2021-08-10 08:56:08 +00:00
Rakshith R
0b43e91c77 rbd: fix snapshot id idempotency issue
This commit fixes snapshot id idempotency issue by
always returning an error when flattening is in progress
and not using `readyToUse:false` response.

Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 825211730c)
2021-08-09 12:10:42 +00:00
Rakshith R
05622b87c0 e2e: log imageList in validateRBDImageCount for better debugging
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 7f6b73e71f)
2021-08-09 12:10:42 +00:00
Rakshith R
f5e73009e1 e2e: add test cases for pvc-pvcClone chain with depth 2
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 9d57717222)
2021-08-09 12:10:42 +00:00
Rakshith R
dc046d0204 e2e: add test cases for snapshot-restore chain with depth 2
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 9321b4bce4)
2021-08-09 12:10:42 +00:00
Rakshith R
33234c1b51 cleanup: refractor checkCloneImage to reducing nesting if
This commit refractors checkCloneImage function to
address nestif linter issue.

Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 859d696279)
2021-08-09 12:10:42 +00:00
Madhu Rajanna
32faed322a rbd: fix clone problem
This commit fixes a bug in checkCloneImage() which was caused
by checking cloned image before checking on temp-clone image snap
in a subsequent request which lead to stale images. This was solved
by checking temp-clone image snap and flattening temp-clone if
needed.
This commit also fixes comparison bug in flattenCloneImage().

Signed-off-by: Rakshith R <rar@redhat.com>
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit a5a8952716)
2021-08-09 12:10:42 +00:00
Madhu Rajanna
a7a5a527c2 rbd: copy creds when copying the connection
rbd flatten functions is a CLI call and it expects
the creds as the input and copying of creds is
required when we generate the temp clone image.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 916c97b4a8)
2021-08-09 12:10:42 +00:00
Rakshith R
33509ca90a rbd: fix vol.VolID in cloneFromSnapshot()
Volume generated from snap using genrateVolFromSnap
already copies volume ID correctly, therefore removing
`vol.VolID = rbdVol.VolID` which wrongly copies parent
Volume ID instead leading to error from copyEncryption()
on parent and clone volume ID being equal.

Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 08728b631b)
2021-08-09 12:10:42 +00:00
Madhu Rajanna
1470af8316 doc: change FsID to FscID for cephfs
updated the filesystem identifier from
FsId to FscID.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit fce5a181d0)
2021-08-09 09:24:16 +00:00
Madhu Rajanna
f65961d01e doc: add design doc for clusterid poolid mapping
added design doc to handle volumeID mapping in case
of the failover in the Disaster Recovery.

update #2118

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 5fc9c3a046)
2021-08-09 09:24:16 +00:00
Madhu Rajanna
cbe3ac71f3 deploy: add template changes for mapping
added template changes for the clusterID and
poolID,fsID mapping details for the pod templates.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit d321663872)
2021-08-09 09:24:16 +00:00
Madhu Rajanna
829fc5ed95 rbd: read clusterID and PoolID from mapping
Whenever Ceph-CSI receives a CSI/Replication
request it will first decode the
volumeHandle and try to get the required
OMAP details if it is not able to
retrieve, receives a `Not Found` error
message and Ceph-CSI will check for the
clusterID mapping. If the old volumeID
`0001-00013-site1-storage-0000000000000001
-b0285c97-a0ce-11eb-8c66-0242ac110002`
contains the `site1-storage` as the clusterID,
now Ceph-CSI will look for the corresponding
clusterID `site2-storage` from the above configmap.
If the clusterID mapping is found now Ceph-CSI
will look for the poolID mapping ie mapping between
`1` and `2`. Example:- pool with name exists on
both the clusters with different ID's Replicapool
with ID `1` on site1 and Replicapool with ID `2`
on site2. After getting the required mapping Ceph-CSI
has the required information to get more details
from the rados OMAP. If we have multiple clusterID mapping
it will loop through all the mapping and checks the
corresponding pool to get the OMAP data. If the clusterID
mapping does not exist Ceph-CSI will return an `Not Found`
error message to the caller.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 92ad2ceec9)
2021-08-09 09:24:16 +00:00
Madhu Rajanna
daea5177e5 util: add helper function to read clusterID mapping
added helper function to read the clusterID mapping
from the mounted file.

The clusterID mapping contains below mappings
* ClusterID mappings (to cluster to which we are failingover
and from which cluster failover happened)
* RBD PoolID mapping of between the clusters.
* CephFS FscID mapping between the clusters.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit ac11d71e19)
2021-08-09 09:24:16 +00:00
Yug Gupta
459f6eca5a helm: update cephfs provisioner updateStrategy
Update ceph-csi-cephfs.provisioner updatestrategy
to allow maxUnavailable pods at a time to be 50%

Signed-off-by: Yug Gupta <yuggupta27@gmail.com>
(cherry picked from commit 080f7538c0)
2021-08-06 12:37:33 +00:00
Yug Gupta
45e80f8952 helm: update rbd provisioner updateStrategy
Update ceph-csi-rbd.provisioner updatestrategy
to allow maxUnavailable pods at a time to be 50%

Signed-off-by: Yug Gupta <yuggupta27@gmail.com>
(cherry picked from commit ea088d40be)
2021-08-06 12:37:33 +00:00
Niels de Vos
bc24a8c8ac util: allow configuring VAULT_AUTH_MOUNT_PATH for Vault Tenant SA KMS
The VAULT_AUTH_MOUNT_PATH is a Vault configuration parameter that allows
a user to set a non default path for the Kubernetes ServiceAccount
integration. This can already be configured for the Vault KMS, and is
now added to the Vault Tenant SA KMS as well.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
(cherry picked from commit 4859f2dfdb)
2021-08-06 09:30:32 +00:00