2332 Commits

Author SHA1 Message Date
Shyamsundar Ranganathan
8938ee81aa rbd: Report errors when a resync may be in progress
Currently we return a !ready status if an image
is not found when a replication resync is issued.

We also return !ready immediately after issuing a resync.

The change is to ensure we return errors in these
cases for the caller to retry the operation till
we can determine we are actually resyncing, and then
return !ready with nil errors.

Part of addressing:
  https://github.com/csi-addons/volume-replication-operator/issues/101

Signed-off-by: Shyamsundar Ranganathan <srangana@redhat.com>
(cherry picked from commit 47dc9cf28d)
2021-09-15 17:48:40 +00:00
Rakshith R
3f435f5eb2 util: modify GetMonsAndClusterID() to take clusterID instead of options
This commit:
- modifies GetMonsAndClusterID() to take clusterID instead of options.
- moves the check for whether clusterID is set out of GetMonsAndClusterID().
- defines a new error, ErrClusterIDNotSet, for reusability.
- adds GetClusterID() to obtain clusterID from options.

Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 82d09d81cf)
2021-09-14 12:56:12 +00:00
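
The split described in 3f435f5eb2 can be sketched roughly as below. This is a minimal illustration, not the code from the commit: the names GetClusterID and ErrClusterIDNotSet come from the commit message, while the function signature, the "clusterID" map key, and the error text are assumptions.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrClusterIDNotSet is returned when no cluster ID is present in the options.
// The message text is an assumption made for this sketch.
var ErrClusterIDNotSet = errors.New("clusterID must be set")

// GetClusterID extracts the cluster ID from the options map.
// The "clusterID" key is an assumption made for this sketch.
func GetClusterID(options map[string]string) (string, error) {
	clusterID, ok := options["clusterID"]
	if !ok {
		return "", ErrClusterIDNotSet
	}
	return clusterID, nil
}

func main() {
	// The caller validates first, then passes the plain clusterID onwards.
	id, err := GetClusterID(map[string]string{"clusterID": "site1-storage"})
	fmt.Println(id, err)

	_, err = GetClusterID(map[string]string{})
	fmt.Println(errors.Is(err, ErrClusterIDNotSet))
}
```

Keeping the validation in a separate helper lets callers that already hold a clusterID skip the options map entirely.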
Rakshith R
b7505c29e2 rbd: check for clusterid mapping in genVolFromVolumeOptions()
This commit adds the capability to genVolFromVolumeOptions() to fetch
the mapped cluster-id and mon IPs for a mirrored PVC on a secondary
cluster, which may have a different cluster-id.

This is required for NodeStageVolume().

We also don't need to check for mapping during volume create requests,
so it can be disabled by passing a bool checkClusterIDMapping as false.

GetMonsAndClusterID() is modified to accept bool checkClusterIDMapping
based on which clustermapping is checked to fetch mapped cluster-id and
mon-ips.

Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 9d1e98ca60)
2021-09-14 12:56:12 +00:00
Rakshith R
f77e1a9e27 util: read ceph.conf by calling conn.ReadConfigFile(CephConfigPath)
The configuration in ceph.conf is not picked up by the rados connection
automatically, hence we need to call conn.ReadConfigFile before calling
Connect().

Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit e99dd3dea4)
2021-09-10 03:30:52 +00:00
Rakshith R
075d1bfcee ci: use 0 as default NUM_DISKS in minikube.sh
This is done to prevent conflicts with the current CI setup, which
attaches disks externally.

Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 8f75a24cfd)
2021-09-09 17:33:51 +00:00
Rakshith R
de4e661c6f ci: pass $DISK_CONFIG properly to minikube start
Having double quotes around $DISK_CONFIG led to these args
not being properly passed to minikube start. This commit fixes it.

Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 42a6c3c006)
2021-09-09 17:33:51 +00:00
Rakshith R
e5f6cc53f0 util: call WriteCephConfig() in cephcsi.go
This commit calls WriteCephConfig() in cephcsi.go to
create ceph.conf and the keyring, if they are not mounted,
so that all CLI calls and conn commands can use them.

Before this change, rbd-controller/omap-generator did not create
ceph.conf on startup.

Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 0a7a7f4866)
2021-09-09 13:32:12 +00:00
Madhu Rajanna
c8f8272d77 rbd: set vaultAuthNamespace to vaultNamespace if empty
When we read the csi-kms-connection-details configmap,
vaultAuthNamespace might not be set. During the
conversion, vaultAuthNamespace might then be set to an
empty value. This commit checks for an empty
vaultAuthNamespace and sets it to vaultNamespace.

The empty value for vaultAuthNamespace was caused by
marshalling at
https://github.com/ceph/ceph-csi/blob/devel/internal/kms/vault_tokens.go#L136-L139.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 8c8f34cf7a)
2021-09-09 08:48:47 +00:00
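
The fallback described in c8f8272d77 could look something like the following sketch. The struct and the helper name are hypothetical stand-ins chosen for illustration; only the two setting names, vaultNamespace and vaultAuthNamespace, come from the commit message.

```go
package main

import "fmt"

// vaultConfig is a hypothetical, trimmed-down stand-in for the KMS settings.
type vaultConfig struct {
	VaultNamespace     string
	VaultAuthNamespace string
}

// defaultAuthNamespace (hypothetical helper) copies VaultNamespace into
// VaultAuthNamespace when the latter is empty, and returns the result.
func defaultAuthNamespace(cfg vaultConfig) vaultConfig {
	if cfg.VaultAuthNamespace == "" {
		cfg.VaultAuthNamespace = cfg.VaultNamespace
	}
	return cfg
}

func main() {
	cfg := defaultAuthNamespace(vaultConfig{VaultNamespace: "tenant"})
	fmt.Println(cfg.VaultAuthNamespace)
}
```

An explicitly configured vaultAuthNamespace is left untouched; only the empty value is replaced.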
Rakshith R
96429384ec ci: add support to create extra disks through minikube
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 1b64a0a505)
2021-09-07 12:27:23 +00:00
Rakshith R
ff325ca0f6 rebase: update minikube to v1.23.0
See-also: https://github.com/kubernetes/minikube/releases/tag/v1.23.0

Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 08c10c9f94)
2021-09-07 12:27:23 +00:00
Niels de Vos
24f92b2255 util: NewK8sClient() should not panic on non-Kubernetes clusters
When NewK8sClient() detects an error, it used to call FatalLogMsg()
which causes a panic. There are additional features that can be used on
Kubernetes clusters, but these are not a requirement for most
functionalities of the driver.

Instead of causing a panic, returning an error should suffice. This
allows using the driver on non-Kubernetes clusters again.

Fixes: #2452
Signed-off-by: Niels de Vos <ndevos@redhat.com>
(cherry picked from commit 60c2afbcca)
2021-09-02 16:50:40 +00:00
Rakshith R
cf93951f3b rbd: check for clusterid mapping in RegenerateJournal()
This commit adds fetchMappedClusterIDAndMons() which returns
monitors and clusterID info after checking cluster mapping info.
This is required for regenerating omap entries in mirrored cluster
with different clusterID.

Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 99168dc822)
2021-09-01 09:40:24 +00:00
Rakshith R
dcd2a8c900 rbd: move GetMappedID() to util package
This commit moves getMappedID() from rbd to util
package since it is not rbd specific and exports
it from there.

Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 496bcba85c)
2021-09-01 09:40:24 +00:00
Niels de Vos
d42814dfb9 util: fix unit-test for GetClusterMappingInfo()
Unit-testing often fails due to a race condition while writing the
clusterMappingConfigFile from multiple go-routines at the same time.
Failures from `make containerized-test` look like this:

    === CONT  TestGetClusterMappingInfo/site2-storage_cluster-id_mapping
        cluster_mapping_test.go:153: GetClusterMappingInfo() = <nil>, expected data &[{map[site1-storage:site2-storage] [map[1:3]] [map[11:5]]} {map[site3-storage:site2-storage] [map[8:3]] [map[10:5]]}]
    === CONT  TestGetClusterMappingInfo/site3-storage_cluster-id_mapping
        cluster_mapping_test.go:153: GetClusterMappingInfo() = <nil>, expected data &[{map[site3-storage:site2-storage] [map[8:3]] [map[10:5]]}]
    --- FAIL: TestGetClusterMappingInfo (0.01s)
        --- PASS: TestGetClusterMappingInfo/mapping_file_not_found (0.00s)
        --- PASS: TestGetClusterMappingInfo/mapping_file_found_with_empty_data (0.00s)
        --- PASS: TestGetClusterMappingInfo/cluster-id_mapping_not_found (0.00s)
        --- FAIL: TestGetClusterMappingInfo/site2-storage_cluster-id_mapping (0.00s)
        --- FAIL: TestGetClusterMappingInfo/site3-storage_cluster-id_mapping (0.00s)
        --- PASS: TestGetClusterMappingInfo/site1-storage_cluster-id_mapping (0.00s)

By splitting the public GetClusterMappingInfo() function into an
internal getClusterMappingInfo() that takes a filename, unit-testing can
use different files for each go-routine, and testing becomes more
predictable.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
(cherry picked from commit 8b71671b42665789fac4a4aa1453b0b107f475c6)
2021-09-01 09:40:24 +00:00
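
The refactoring pattern described in d42814dfb9 can be sketched as follows: a public function that uses a fixed path, wrapping an internal function that takes the filename. This is a simplified illustration under stated assumptions, not the project's code. The default path, the reduced mapping type, the JSON key, and the choice to treat a missing file as "no mapping" are all assumptions, and the sketch omits any other parameters the real functions take.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"os"
)

// defaultMappingFile is a hypothetical default location for this sketch.
const defaultMappingFile = "/etc/ceph-csi-config/cluster-mapping.json"

// mapping is a simplified stand-in for one cluster mapping entry.
type mapping struct {
	ClusterIDMapping map[string]string `json:"clusterIDMapping"`
}

// getClusterMappingInfo reads and parses the given file. In this sketch a
// missing file is treated as "no mapping" rather than as an error.
func getClusterMappingInfo(filename string) ([]mapping, error) {
	content, err := os.ReadFile(filename)
	if err != nil {
		if errors.Is(err, os.ErrNotExist) {
			return nil, nil
		}
		return nil, fmt.Errorf("failed to read %q: %w", filename, err)
	}
	var info []mapping
	if err := json.Unmarshal(content, &info); err != nil {
		return nil, fmt.Errorf("failed to parse %q: %w", filename, err)
	}
	return info, nil
}

// GetClusterMappingInfo is the public entry point using the fixed path.
func GetClusterMappingInfo() ([]mapping, error) {
	return getClusterMappingInfo(defaultMappingFile)
}

func main() {
	// Each test goroutine can write and read its own temporary file.
	f, err := os.CreateTemp("", "cluster-mapping-*.json")
	if err != nil {
		panic(err)
	}
	defer os.Remove(f.Name())
	data := `[{"clusterIDMapping":{"site1-storage":"site2-storage"}}]`
	if _, err := f.WriteString(data); err != nil {
		panic(err)
	}
	f.Close()

	info, err := getClusterMappingInfo(f.Name())
	fmt.Println(info, err)
}
```

Because no two goroutines share a file, parallel subtests no longer race on a single config path.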
Niels de Vos
82b6857688 cleanup: address pylint "consider-using-with" in tracevol.py
pylint started to report errors like the following:

    troubleshooting/tools/tracevol.py:97:10: R1732: Consider using 'with' for resource-allocating operations (consider-using-with)

There probably has been an update of Pylint in the test-container that
is more strict than previous versions.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
(cherry picked from commit 544d73759c39a08d82f20ea674896abf7857d9ef)
2021-09-01 09:40:24 +00:00
Niels de Vos
addf6407b0 build: vendor code.cloudfoundry.org/gofileutils from GitHub
There is a problem accessing the code.cloudfoundry.org web service over
TLS. It seems to redirect to GitHub, so use the package from there:

    running: go mod verify
    go: github.com/libopenstorage/secrets@v0.0.0-20210709082113-dde442ea20ec requires
    	github.com/hashicorp/vault@v1.4.2 requires
    	github.com/hashicorp/vault-plugin-auth-cf@v0.5.4 requires
    	github.com/cloudfoundry-community/go-cfclient@v0.0.0-20190201205600-f136f9222381 requires
    	code.cloudfoundry.org/gofileutils@v0.0.0-20170111115228-4d0c80011a0f: unrecognized import path "code.cloudfoundry.org/gofileutils": https fetch: Get "https://code.cloudfoundry.org/gofileutils?go-get=1": x509: certificate signed by unknown authority

Signed-off-by: Niels de Vos <ndevos@redhat.com>
(cherry picked from commit 32da0cf888ba452288a0e7436eed91cf7ca5dd4e)
2021-09-01 09:40:24 +00:00
Humble Chirammal
eb50407eac helm: correct the groupVersion of CSIDriver in the chart
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
(cherry picked from commit 3462cd9bbd)
2021-08-17 11:53:54 +00:00
Humble Chirammal
a78d24ce88 helm: correct watch verb in topology RBAC
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
(cherry picked from commit 8e00c2c810)
2021-08-17 11:53:54 +00:00
Madhu Rajanna
7690e43bed rbd: Cleanup OMAP data for secondary image
If the image is in a secondary state and is up+replaying,
it is a healthy secondary: the image is primary somewhere
in the remote cluster and the local image is being replayed.
Delete the generated OMAP data, as we cannot delete the
secondary image. When the image on the primary cluster is
deleted or has mirroring disabled, the image on all the
remote (secondary) clusters will get auto-deleted. This
helps in garbage collecting the OMAP, PVC and PV objects
after a failback operation.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 5562e46d0f)
2021-08-17 04:36:04 +00:00
Madhu Rajanna
ad0009c427 rbd: return success if image is healthy secondary
If the image is in a secondary state and is up+replaying,
it is a healthy secondary: the image is primary somewhere
in the remote cluster and the local image is being replayed.
Return success for disabling mirroring, as we cannot
disable mirroring on an image in the secondary state. When
the image on the remote site gets disabled, the image on
all the remote (secondary) clusters will get auto-deleted.
This helps in garbage collecting the volume replication
Kubernetes artifacts.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit fc0d6f6b8b)
2021-08-17 04:36:04 +00:00
Madhu Rajanna
e42552dd2f rbd: add helper function to get local state
Added a helper function to check whether the local image
state is up+replaying.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 35324b2e17)
2021-08-17 04:36:04 +00:00
Rakshith R
8997a1bbdb ci: internally create & delete cephcsi namespace in install-helm.sh
This ensures the kubectl call is retried with kubectl_retry function.

Updates: #2309

Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 7fba62dd47)
2021-08-11 15:08:48 +00:00
Rakshith R
f2c4a6409f ci: use kubectl_retry in install_helm.sh script
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit eb8c1cd5ab)
2021-08-11 15:08:48 +00:00
Rakshith R
bfd5f820c5 ci: modify kubectl_retry() to handle NotFound on delete cmd
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 2b19197e2f)
2021-08-11 15:08:48 +00:00
Rakshith R
6a4194c701 ci: move kubectl_retry() to utils.sh to be able to import it
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit a15892a87a)
2021-08-11 15:08:48 +00:00
Rakshith R
342867a197 e2e: create reusable variable vaultUserSecretPath = "user-secret.yaml"
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 1d49b6a288)
2021-08-11 09:50:10 +00:00
Rakshith R
0593071dac e2e: add modification to test encrypted PVC with rbd controller
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 2f995eada2)
2021-08-11 09:50:10 +00:00
Rakshith R
f97c3f901d e2e: use retryKubectlFile() for creating & deleting secrets
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 8ca7a35820)
2021-08-11 09:50:10 +00:00
Rakshith R
33899663e1 e2e: add prefixname to rbd controller test
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 0744ad502b)
2021-08-11 09:50:10 +00:00
Rakshith R
a797b7e200 rbd: extract kmsID from volumeAttributes in RegenerateJournal()
This commit adds functionality of extracting encryption kmsID,
owner from volumeAttributes in RegenerateJournal() and adds utility
functions ParseEncryptionOpts and FetchEncryptionKMSID.

Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit f05ac2b25d)
2021-08-11 09:50:10 +00:00
Rakshith R
2545101842 rbd: extract volumeNamePrefix in RegenerateJournal()
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit b960e3633a)
2021-08-11 09:50:10 +00:00
Rakshith R
5189ccc13e rbd: refactor RegenerateJournal() to take in volumeAttributes
This commit refactors RegenerateJournal() to take in
volumeAttributes map[string]string as argument so it
can extract required attributes internally.

Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit b9b4b1e34e)
2021-08-11 09:50:10 +00:00
Rakshith R
d4c84e814b rbd: use CSIInstanceID var instead of "default" in RegenerateJournal()
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 39d6752fc1)
2021-08-11 09:50:10 +00:00
Madhu Rajanna
fbc1e5f3d5 e2e: retry running kubectl on known errors
By using retryKubectl helper function,
a retry will be done, and the known error
messages will be skipped.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 2c66dfc3e4)
2021-08-11 07:03:05 +00:00
Madhu Rajanna
f7e150b84f e2e: pass variadic argument to kubectl helper function
This gives the caller the ability to pass arguments
such as ignore-not-found=true when executing
kubectl commands.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 2071c535fa)
2021-08-11 07:03:05 +00:00
Madhu Rajanna
64937f1f68 e2e: add retryKubectlArgs helper for kubectl retry
Added helper function retryKubectlArgs to perform
the action if it is a known error.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 9f0af30735)
2021-08-11 07:03:05 +00:00
Madhu Rajanna
9e84583063 e2e: add isAlreadyExistsCLIError to check known error
Added isAlreadyExistsCLIError to check for a known error.
If the error is "already exists", we consider it
a success.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit dd9fabf747)
2021-08-11 07:03:05 +00:00
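
The idea in 9e84583063 of treating "already exists" as success inside a retry loop can be sketched as below. Only the name isAlreadyExistsCLIError comes from the commit message. The substring match, the createWithRetry helper, and the two sample errors are assumptions made for illustration, not the e2e code itself.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Sample errors used only for this sketch.
var (
	errTransient = errors.New("connection refused")
	errExists    = errors.New(`Error from server (AlreadyExists): secrets "x" already exists`)
)

// isAlreadyExistsCLIError reports whether err looks like an "AlreadyExists"
// failure from the CLI. Matching on a substring is an assumption.
func isAlreadyExistsCLIError(err error) bool {
	return err != nil && strings.Contains(err.Error(), "AlreadyExists")
}

// createWithRetry (hypothetical helper) calls create up to attempts times.
// A nil error or an "already exists" error both count as success.
func createWithRetry(attempts int, create func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		err = create()
		if err == nil || isAlreadyExistsCLIError(err) {
			return nil
		}
	}
	return err
}

func main() {
	calls := 0
	err := createWithRetry(3, func() error {
		calls++
		if calls == 1 {
			return errTransient // first attempt fails and is retried
		}
		return errExists // object was created by the earlier attempt
	})
	fmt.Println(calls, err)
}
```

This matters for retries because a first attempt may succeed on the server while the client still sees a failure, so the second attempt reports that the object already exists.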
Madhu Rajanna
72a2b97be2 rbd: consider empty mirroring mode
Consider the empty mirroring mode when
validating the snapshot interval and
the scheduling time.
Even if the mirroring mode is not set,
validate the snapshot scheduling details,
as cephcsi sets the mirroring mode to
snapshot by default.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 3c85219962)
2021-08-10 12:55:41 +00:00
Madhu Rajanna
75ff33785b rbd: log LastUpdate in UTC format
This commit converts the LastUpdate
from int to the UTC format and logs
it for better debugging.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 2782878ea2)
2021-08-10 08:56:08 +00:00
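
The conversion described in 75ff33785b can be sketched with the standard library as below. This is a minimal illustration, assuming LastUpdate holds seconds since the Unix epoch. The helper name formatLastUpdate and the RFC 3339 layout are hypothetical choices, not taken from the commit.

```go
package main

import (
	"fmt"
	"time"
)

// formatLastUpdate (hypothetical helper) interprets lastUpdate as seconds
// since the Unix epoch and renders it as a UTC timestamp.
func formatLastUpdate(lastUpdate int64) string {
	return time.Unix(lastUpdate, 0).UTC().Format(time.RFC3339)
}

func main() {
	fmt.Println(formatLastUpdate(0)) // 1970-01-01T00:00:00Z
}
```

Logging in UTC avoids ambiguity when comparing timestamps from clusters in different time zones.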
Rakshith R
0b43e91c77 rbd: fix snapshot id idempotency issue
This commit fixes snapshot id idempotency issue by
always returning an error when flattening is in progress
and not using `readyToUse:false` response.

Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 825211730c)
2021-08-09 12:10:42 +00:00
Rakshith R
05622b87c0 e2e: log imageList in validateRBDImageCount for better debugging
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 7f6b73e71f)
2021-08-09 12:10:42 +00:00
Rakshith R
f5e73009e1 e2e: add test cases for pvc-pvcClone chain with depth 2
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 9d57717222)
2021-08-09 12:10:42 +00:00
Rakshith R
dc046d0204 e2e: add test cases for snapshot-restore chain with depth 2
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 9321b4bce4)
2021-08-09 12:10:42 +00:00
Rakshith R
33234c1b51 cleanup: refactor checkCloneImage to reduce nested ifs
This commit refactors the checkCloneImage function to
address a nestif linter issue.

Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 859d696279)
2021-08-09 12:10:42 +00:00
Madhu Rajanna
32faed322a rbd: fix clone problem
This commit fixes a bug in checkCloneImage() which was caused
by checking the cloned image before checking the temp-clone image snap
in a subsequent request, which led to stale images. This was solved
by checking the temp-clone image snap and flattening the temp-clone if
needed.
This commit also fixes a comparison bug in flattenCloneImage().

Signed-off-by: Rakshith R <rar@redhat.com>
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit a5a8952716)
2021-08-09 12:10:42 +00:00
Madhu Rajanna
a7a5a527c2 rbd: copy creds when copying the connection
The rbd flatten function is a CLI call that expects
the creds as input, so copying the creds is
required when we generate the temp clone image.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 916c97b4a8)
2021-08-09 12:10:42 +00:00
Rakshith R
33509ca90a rbd: fix vol.VolID in cloneFromSnapshot()
The volume generated from a snapshot using genrateVolFromSnap
already copies the volume ID correctly. Therefore remove
`vol.VolID = rbdVol.VolID`, which wrongly copies the parent
volume ID instead, leading to an error from copyEncryption()
because the parent and clone volume IDs are equal.

Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 08728b631b)
2021-08-09 12:10:42 +00:00
Madhu Rajanna
1470af8316 doc: change FsID to FscID for cephfs
Updated the filesystem identifier from
FsID to FscID.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit fce5a181d0)
2021-08-09 09:24:16 +00:00
Madhu Rajanna
f65961d01e doc: add design doc for clusterid poolid mapping
Added a design doc to handle volumeID mapping in case
of failover in Disaster Recovery.

update #2118

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 5fc9c3a046)
2021-08-09 09:24:16 +00:00
Madhu Rajanna
cbe3ac71f3 deploy: add template changes for mapping
Added template changes for the clusterID and
poolID, fsID mapping details to the pod templates.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit d321663872)
2021-08-09 09:24:16 +00:00