Commit Graph

2346 Commits

Author SHA1 Message Date
Madhu Rajanna
84d7726980 ci: use 1.8.5 vault for e2e
current latest vault release is 1.9.0 but
with the latest image our E2E is broken.
reverting back the vault version to 1.8.5
till we root cause the issue.

Note:- This is to unblock PR merging

updates: #2657

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 5524b2d538)
2021-11-24 13:07:20 +00:00
Prasanna Kumar Kalever
2441fe8515 e2e: validate encrypted image mount inside the nodeplugin
currently the mountType validation of the encrypted volume is done in
the application, we should rather validate this inside the nodeplugin
pod.

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
(cherry picked from commit 0bf9db822b)
2021-11-24 13:07:20 +00:00
Niels de Vos
e18435fc63 rebase: replace vendored layeh.com/radius with GitHub source
The webserver at layeh.com seems to be misbehaving, which causes `go mod
verify` to fail. The layeh.com/radius repository is maintained on
GitHub, so the sources can be vendored/verified from there too.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
(cherry picked from commit 1f650e1204)
2021-11-23 11:48:57 +00:00
Madhu Rajanna
c893a45117 cleanup: remove FIXME from ResyncVolume
as the complexity of ResyncVolume is
reduced removing the FIXME which is not valid
anymore.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 0838845c6a)
2021-10-27 04:25:04 +00:00
Madhu Rajanna
dade373fc0 rbd: log mirror daemon state for replication
log the mirror deamon state in the local and
remote cluster for better debugging.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 2017b8c621)
2021-10-27 04:25:04 +00:00
Madhu Rajanna
ca4bf66035 rbd: log the remote site details during resync
logging the remote site details during resyncing
for better debugging.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit b92a6f5ccb)
2021-10-27 04:25:04 +00:00
Madhu Rajanna
c41bf37b95 rbd: check local image state for resyncing
below are the local states of the mirrored image

"unknown"  -> If the image is in an error state
means data is completely synced
"error" -> If the image is in an error state
means it needs resync
"syncing"
"starting_replay"
"replaying"
"stopping_replay"
"stopped"

If the resync is successfully started which
means the image will be in "replaying" state.
we can consider "replaying" state to report
resync succesfully going on state.

we are discarding the intermediate states like
"syncing", "starting_replay" and "stopping_replay".

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 1fd2f28fee)
2021-10-27 04:25:04 +00:00
Madhu Rajanna
7d163dab64 rbd: check local image description for split-brain
In some corner case like `re-player shutdown` the
local image will not be in error state. It would
be also worth considering `description` field to
make sure about split-brain.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 0d51f6d833)
2021-10-18 13:54:21 +00:00
Rakshith R
e758c0a0c2 e2e: add testcase for thick encrypted PVC restore
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit f60b097f5f)
2021-10-05 11:33:26 +00:00
Rakshith R
711fbbfdcc rbd: copyEncryptionConfig for thickProvisioned snap restore too
This commit adds bugfix to copy encryption passphrase for thick
provisioned PVC restored from snapshot.

Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit ded75eb099)
2021-10-05 11:33:26 +00:00
Rakshith R
15c2df0c39 e2e: add nolint:param to retryKubectlArgs
Currently only kubectlCreate arg is used with retryKubectlArgs(),
But it maybe used later on.

Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit b471cac6bd)
2021-10-05 11:33:26 +00:00
Rakshith R
a1774cee87 e2e: add testcase for PVC restore from vaultKMS to vaultTenantSAKMS
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit dac4e76ae1)
2021-10-05 11:33:26 +00:00
Rakshith R
a781ebb844 e2e: modify validatePVCSnapshot() to use restoreSCName & restoreKMS
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit f63ed2ca5a)
2021-10-05 11:33:26 +00:00
Rakshith R
9ae00de979 rbd: modify copyEncryptionConfig to accept copyOnlyPassphrase arg
During PVC snapshot/clone both kms config and passphrase needs to copied,
while for PVC restore only passphrase needs to be copied to dest rbdvol
since destination storageclass may have another kms config.

Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 59b7a26175)
2021-10-05 11:33:26 +00:00
Shyamsundar Ranganathan
8938ee81aa rbd: Report errors when a resync maybe in progress
Currently we return a !ready status if an image
is not found when a replication resync is issued.

We also return a !ready just post issuing a resync.

The change is to ensure we return errors in these
cases for the caller to retry the operation till
we can determine we are actually resyncing, and then
return !ready with nil errors.

Part of addressing:
  https://github.com/csi-addons/volume-replication-operator/issues/101

Signed-off-by: Shyamsundar Ranganathan <srangana@redhat.com>
(cherry picked from commit 47dc9cf28d)
2021-09-15 17:48:40 +00:00
Rakshith R
3f435f5eb2 util: modify GetMonsAndClusterID() to take clusterID instead of options
This commit:
- modifies GetMonsAndClusterID() to take clusterID instead of options.
- moves out validation of clusterID is set or not out of GetMonsAndClusterID().
- defines ErrClusterIDNotSet new error for reusability.
- add GetClusterID() to obtain clusterID from options.

Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 82d09d81cf)
2021-09-14 12:56:12 +00:00
Rakshith R
b7505c29e2 rbd: check for clusterid mapping in genVolFromVolumeOptions()
This commit adds capability to genVolFromVolumeOptions() to fetch
mapped clusted-id & mon ips for mirrored PVC on secondary cluster
which may have different cluster-id.

This is required for NodeStageVolume().

We also don't need to check for mapping during volume create requests,
so it can be disabled by passing a bool checkClusterIDMapping as false.

GetMonsAndClusterID() is modified to accept bool checkClusterIDMapping
based on which clustermapping is checked to fetch mapped cluster-id and
mon-ips.

Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 9d1e98ca60)
2021-09-14 12:56:12 +00:00
Rakshith R
f77e1a9e27 util: read ceph.conf by calling conn.ReadConfigFile(CephConfigPath)
The configurations in cpeh.conf is not picked up by rados connection
automatically, hence we need to call conn.ReadConfigFile before calling
Connect().

Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit e99dd3dea4)
2021-09-10 03:30:52 +00:00
Rakshith R
075d1bfcee ci: use 0 as default NUM_DISKS in minikube.sh
This is done to prevent conflicts with current ci setup externally
attaching disks.

Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 8f75a24cfd)
2021-09-09 17:33:51 +00:00
Rakshith R
de4e661c6f ci: pass $DISK_CONFIG properly to minikube start
Having double quotes around $DISK_CONFIG led to these args
not being properly passed to minikube start. This commit fixes it.

Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 42a6c3c006)
2021-09-09 17:33:51 +00:00
Rakshith R
e5f6cc53f0 util: call WriteCephConfig() in cephcsi.go
This commit calls WriteCephConfig() in cephcsi.go to
create ceph.conf and keyring if it is not mounted to
be used by all cli calls and conn cmds.

Before this change, rbd-controller/omap-generator did not create
ceph.conf on startup.

Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 0a7a7f4866)
2021-09-09 13:32:12 +00:00
Madhu Rajanna
c8f8272d77 rbd: set vaultAuthNamespace to vaultNamespace if empty
When we read the csi-kms-connection-details configmap
vaultAuthNamespace might not be set when we do the
conversion the vaultAuthNamespace might be set to empty
key and this commits check for the empty value of
vaultAuthNamespace and set the vaultAuthNamespace
to vaultNamespace.

setting empty value for vaultAuthNamespace happened due
to Marshalling at https://github.com/ceph/ceph-csi/blob/devel/
internal/kms/vault_tokens.go#L136-L139.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 8c8f34cf7a)
2021-09-09 08:48:47 +00:00
Rakshith R
96429384ec ci: add support to create extra disks through minikube
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 1b64a0a505)
2021-09-07 12:27:23 +00:00
Rakshith R
ff325ca0f6 rebase: update minikube to v1.23.0
See-also: https://github.com/kubernetes/minikube/releases/tag/v1.23.0

Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 08c10c9f94)
2021-09-07 12:27:23 +00:00
Niels de Vos
24f92b2255 util: NewK8sClient() should not panic on non-Kubernetes clusters
When NewK8sClient() detects and error, it used to call FatalLogMsg()
which causes a panic. There are additional features that can be used on
Kubernetes clusters, but these are not a requirement for most
functionalities of the driver.

Instead of causing a panic, returning an error should suffice. This
allows using the driver on non-Kubernetes clusters again.

Fixes: #2452
Signed-off-by: Niels de Vos <ndevos@redhat.com>
(cherry picked from commit 60c2afbcca)
2021-09-02 16:50:40 +00:00
Rakshith R
cf93951f3b rbd: check for clusterid mapping in RegenerateJournal()
This commit adds fetchMappedClusterIDAndMons() which returns
monitors and clusterID info after checking cluster mapping info.
This is required for regenerating omap entries in mirrored cluster
with different clusterID.

Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 99168dc822)
2021-09-01 09:40:24 +00:00
Rakshith R
dcd2a8c900 rbd: move GetMappedID() to util package
This commit moves getMappedID() from rbd to util
package since it is not rbd specific and exports
it from there.

Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 496bcba85c)
2021-09-01 09:40:24 +00:00
Niels de Vos
d42814dfb9 util: fix unit-test for GetClusterMappingInfo()
Unit-testing often fails due to a race condition while writing the
clusterMappingConfigFile from multiple go-routines at the same time.
Failures from `make containerized-test` look like this:

    === CONT  TestGetClusterMappingInfo/site2-storage_cluster-id_mapping
        cluster_mapping_test.go:153: GetClusterMappingInfo() = <nil>, expected data &[{map[site1-storage:site2-storage] [map[1:3]] [map[11:5]]} {map[site3-storage:site2-storage] [map[8:3]] [map[10:5]]}]
    === CONT  TestGetClusterMappingInfo/site3-storage_cluster-id_mapping
        cluster_mapping_test.go:153: GetClusterMappingInfo() = <nil>, expected data &[{map[site3-storage:site2-storage] [map[8:3]] [map[10:5]]}]
    --- FAIL: TestGetClusterMappingInfo (0.01s)
        --- PASS: TestGetClusterMappingInfo/mapping_file_not_found (0.00s)
        --- PASS: TestGetClusterMappingInfo/mapping_file_found_with_empty_data (0.00s)
        --- PASS: TestGetClusterMappingInfo/cluster-id_mapping_not_found (0.00s)
        --- FAIL: TestGetClusterMappingInfo/site2-storage_cluster-id_mapping (0.00s)
        --- FAIL: TestGetClusterMappingInfo/site3-storage_cluster-id_mapping (0.00s)
        --- PASS: TestGetClusterMappingInfo/site1-storage_cluster-id_mapping (0.00s)

By splitting the public GetClusterMappingInfo() function into an
internal getClusterMappingInfo() that takes a filename, unit-testing can
use different files for each go-routine, and testing becomes more
predictable.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
(cherry picked from commit 8b71671b42665789fac4a4aa1453b0b107f475c6)
2021-09-01 09:40:24 +00:00
Niels de Vos
82b6857688 cleanup: address pylint "consider-using-with" in tracevol.py
pylint started to report errors like the following:

    troubleshooting/tools/tracevol.py:97:10: R1732: Consider using 'with' for resource-allocating operations (consider-using-with)

There probably has been an update of Pylint in the test-container that
is more strict than previous versions.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
(cherry picked from commit 544d73759c39a08d82f20ea674896abf7857d9ef)
2021-09-01 09:40:24 +00:00
Niels de Vos
addf6407b0 build: vendor code.cloudfoundry.org/gofileutils from GitHub
There is a problem accessing the code.cloudfoundry.org web service iver
TLS. It seems to redirect to GitHub, so use the package from there:

    running: go mod verify
    go: github.com/libopenstorage/secrets@v0.0.0-20210709082113-dde442ea20ec requires
    	github.com/hashicorp/vault@v1.4.2 requires
    	github.com/hashicorp/vault-plugin-auth-cf@v0.5.4 requires
    	github.com/cloudfoundry-community/go-cfclient@v0.0.0-20190201205600-f136f9222381 requires
    	code.cloudfoundry.org/gofileutils@v0.0.0-20170111115228-4d0c80011a0f: unrecognized import path "code.cloudfoundry.org/gofileutils": https fetch: Get "https://code.cloudfoundry.org/gofileutils?go-get=1": x509: certificate signed by unknown authority

Signed-off-by: Niels de Vos <ndevos@redhat.com>
(cherry picked from commit 32da0cf888ba452288a0e7436eed91cf7ca5dd4e)
2021-09-01 09:40:24 +00:00
Humble Chirammal
eb50407eac helm: correct the groupVersion of CSIDriver in the chart
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
(cherry picked from commit 3462cd9bbd)
2021-08-17 11:53:54 +00:00
Humble Chirammal
a78d24ce88 helm: correct watch verb in topology RBAC
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
(cherry picked from commit 8e00c2c810)
2021-08-17 11:53:54 +00:00
Madhu Rajanna
7690e43bed rbd: Cleanup OMAP data for secondary image
If the image is in a secondary state and its
up+replaying means its an healthy secondary
and the image is primary somewhere in the remote cluster
and the local image is getting replayed. Delete the
OMAP data generated as we cannot delete the
secondary image. When the image on the primary
cluster gets deleted/mirroring disabled, the image on
all the remote (secondary) clusters will get
auto-deleted. This helps in garbage collecting
the OMAP, PVC and PV objects after failback operation.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 5562e46d0f)
2021-08-17 04:36:04 +00:00
Madhu Rajanna
ad0009c427 rbd: return succuss if image is healthy secondary
If the image is in secondary state and its
up+replaying means its an healthy secondary
and the image is primary somewhere in the remote
cluster and the local image is getting replayed.
Return success for the Disabling mirroring as
we cannot disable the mirroring on the secondary
state, when the image on the remote site gets
disabled the image on all the remote (secondary)
will get auto deleted. This helps in garbage
collecting the volume replication kuberentes
artifacts

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit fc0d6f6b8b)
2021-08-17 04:36:04 +00:00
Madhu Rajanna
e42552dd2f rbd: add helper function to get local state
added helper function to check the local image
state is up+replaying.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 35324b2e17)
2021-08-17 04:36:04 +00:00
Rakshith R
8997a1bbdb ci: internally create & delete cephcsi namespace in install-helm.sh
This ensures the kubectl call is retried with kubectl_retry function.

Updates: #2309

Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 7fba62dd47)
2021-08-11 15:08:48 +00:00
Rakshith R
f2c4a6409f ci: use kubectl_retry in install_helm.sh script
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit eb8c1cd5ab)
2021-08-11 15:08:48 +00:00
Rakshith R
bfd5f820c5 ci: modify kubectl_retry() to handle NotFound on delete cmd
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 2b19197e2f)
2021-08-11 15:08:48 +00:00
Rakshith R
6a4194c701 ci: move kubectl_retry() to utils.sh to be able to import it
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit a15892a87a)
2021-08-11 15:08:48 +00:00
Rakshith R
342867a197 e2e: create reusable variable vaultUserSecretPath = "user-secret.yaml"
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 1d49b6a288)
2021-08-11 09:50:10 +00:00
Rakshith R
0593071dac e2e: add modification to test encrypted PVC with rbd controller
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 2f995eada2)
2021-08-11 09:50:10 +00:00
Rakshith R
f97c3f901d e2e: use retryKubectlFile() for creating & deleting secrets
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 8ca7a35820)
2021-08-11 09:50:10 +00:00
Rakshith R
33899663e1 e2e: add prefixname to rbd controller test
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 0744ad502b)
2021-08-11 09:50:10 +00:00
Rakshith R
a797b7e200 rbd: extract kmsID from volumeAttributes in RegenerateJournal()
This commit adds functionality of extracting encryption kmsID,
owner from volumeAttributes in RegenerateJournal() and adds utility
functions ParseEncryptionOpts and FetchEncryptionKMSID.

Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit f05ac2b25d)
2021-08-11 09:50:10 +00:00
Rakshith R
2545101842 rbd: extract volumeNamePrefix in RegenerateJournal()
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit b960e3633a)
2021-08-11 09:50:10 +00:00
Rakshith R
5189ccc13e rbd: refractor RegenerateJournal() to take in volumeAttributes
This commit refractors RegenerateJournal() to take in
volumeAttributes map[string]string as argument so it
can extract required attributes internally.

Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit b9b4b1e34e)
2021-08-11 09:50:10 +00:00
Rakshith R
d4c84e814b rbd: use CSIInstanceID var instead of "default" in RegenerateJournal()
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 39d6752fc1)
2021-08-11 09:50:10 +00:00
Madhu Rajanna
fbc1e5f3d5 e2e: retry running kubectl on known errors
By using retryKubectl helper function,
a retry will be done, and the known error
messages will be skipped.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 2c66dfc3e4)
2021-08-11 07:03:05 +00:00
Madhu Rajanna
f7e150b84f e2e: pass variadic argument to kubectl helper function
this provides caller ability to pass the arguments
like ignore-not-found=true etc when executing
the kubectl commands.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 2071c535fa)
2021-08-11 07:03:05 +00:00
Madhu Rajanna
64937f1f68 e2e: add retryKubectlArgs helper for kubectl retry
added helper function retryKubectlArgs to perform
action if its a known error.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 9f0af30735)
2021-08-11 07:03:05 +00:00