The webserver at layeh.com seems to be misbehaving, which causes `go mod
verify` to fail. The layeh.com/radius repository is maintained on
GitHub, so the sources can be vendored/verified from there too.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
(cherry picked from commit 1f650e12046be14ad99ada6cdd69cda5a792e0b3)
as the complexity of ResyncVolume is
reduced removing the FIXME which is not valid
anymore.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 0838845c6a36e6bf5868c276a55fbbbcbd2835ba)
log the mirror deamon state in the local and
remote cluster for better debugging.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 2017b8c6214d05867f90db89200a3d8b0f19023b)
logging the remote site details during resyncing
for better debugging.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit b92a6f5ccb5ac9381de0672439a808ce32d060c9)
below are the local states of the mirrored image
"unknown" -> If the image is in an error state
means data is completely synced
"error" -> If the image is in an error state
means it needs resync
"syncing"
"starting_replay"
"replaying"
"stopping_replay"
"stopped"
If the resync is successfully started which
means the image will be in "replaying" state.
we can consider "replaying" state to report
resync succesfully going on state.
we are discarding the intermediate states like
"syncing", "starting_replay" and "stopping_replay".
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 1fd2f28fee704d6b6200ddd8e2670a934c97e18b)
In some corner case like `re-player shutdown` the
local image will not be in error state. It would
be also worth considering `description` field to
make sure about split-brain.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 0d51f6d833f749214377c8c9616cd68760c7de58)
This commit adds bugfix to copy encryption passphrase for thick
provisioned PVC restored from snapshot.
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit ded75eb09954c2a04b9fb45993d5ff3d23b1c457)
Currently only kubectlCreate arg is used with retryKubectlArgs(),
But it maybe used later on.
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit b471cac6bd4914201fbfd866f5d02dc184f7278d)
During PVC snapshot/clone both kms config and passphrase needs to copied,
while for PVC restore only passphrase needs to be copied to dest rbdvol
since destination storageclass may have another kms config.
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 59b7a261754d8b4af4fbf63e2794bca08329c421)
Currently we return a !ready status if an image
is not found when a replication resync is issued.
We also return a !ready just post issuing a resync.
The change is to ensure we return errors in these
cases for the caller to retry the operation till
we can determine we are actually resyncing, and then
return !ready with nil errors.
Part of addressing:
https://github.com/csi-addons/volume-replication-operator/issues/101
Signed-off-by: Shyamsundar Ranganathan <srangana@redhat.com>
(cherry picked from commit 47dc9cf28dcada5f6bd0a2416f5eb03fa2c64a6b)
This commit:
- modifies GetMonsAndClusterID() to take clusterID instead of options.
- moves out validation of clusterID is set or not out of GetMonsAndClusterID().
- defines ErrClusterIDNotSet new error for reusability.
- add GetClusterID() to obtain clusterID from options.
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 82d09d81cfba97eab88ce2dca6fdca71c69437be)
This commit adds capability to genVolFromVolumeOptions() to fetch
mapped clusted-id & mon ips for mirrored PVC on secondary cluster
which may have different cluster-id.
This is required for NodeStageVolume().
We also don't need to check for mapping during volume create requests,
so it can be disabled by passing a bool checkClusterIDMapping as false.
GetMonsAndClusterID() is modified to accept bool checkClusterIDMapping
based on which clustermapping is checked to fetch mapped cluster-id and
mon-ips.
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 9d1e98ca60911351794c398cf936ba8c5a36b3c8)
The configurations in cpeh.conf is not picked up by rados connection
automatically, hence we need to call conn.ReadConfigFile before calling
Connect().
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit e99dd3dea4759887a051eda486ef1df5149ad689)
This is done to prevent conflicts with current ci setup externally
attaching disks.
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 8f75a24cfd17d9f098cd18ccb12630d33523e370)
Having double quotes around $DISK_CONFIG led to these args
not being properly passed to minikube start. This commit fixes it.
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 42a6c3c006efc53aea57e07dd74ee2770a9128f8)
This commit calls WriteCephConfig() in cephcsi.go to
create ceph.conf and keyring if it is not mounted to
be used by all cli calls and conn cmds.
Before this change, rbd-controller/omap-generator did not create
ceph.conf on startup.
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 0a7a7f4866572f67cb9cb3fccff43b60413b7018)
When we read the csi-kms-connection-details configmap
vaultAuthNamespace might not be set when we do the
conversion the vaultAuthNamespace might be set to empty
key and this commits check for the empty value of
vaultAuthNamespace and set the vaultAuthNamespace
to vaultNamespace.
setting empty value for vaultAuthNamespace happened due
to Marshalling at https://github.com/ceph/ceph-csi/blob/devel/
internal/kms/vault_tokens.go#L136-L139.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 8c8f34cf7a98527ef6fb9324f1da7b0cf50a3445)
When NewK8sClient() detects and error, it used to call FatalLogMsg()
which causes a panic. There are additional features that can be used on
Kubernetes clusters, but these are not a requirement for most
functionalities of the driver.
Instead of causing a panic, returning an error should suffice. This
allows using the driver on non-Kubernetes clusters again.
Fixes: #2452
Signed-off-by: Niels de Vos <ndevos@redhat.com>
(cherry picked from commit 60c2afbccadf008ebcd1293f3c10f0d5596b0e62)
This commit adds fetchMappedClusterIDAndMons() which returns
monitors and clusterID info after checking cluster mapping info.
This is required for regenerating omap entries in mirrored cluster
with different clusterID.
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 99168dc822a4c81f4a12dcf00a7165e7426594ce)
This commit moves getMappedID() from rbd to util
package since it is not rbd specific and exports
it from there.
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 496bcba85c8fc1b4a00b71872cb0f560036180fe)
Unit-testing often fails due to a race condition while writing the
clusterMappingConfigFile from multiple go-routines at the same time.
Failures from `make containerized-test` look like this:
=== CONT TestGetClusterMappingInfo/site2-storage_cluster-id_mapping
cluster_mapping_test.go:153: GetClusterMappingInfo() = <nil>, expected data &[{map[site1-storage:site2-storage] [map[1:3]] [map[11:5]]} {map[site3-storage:site2-storage] [map[8:3]] [map[10:5]]}]
=== CONT TestGetClusterMappingInfo/site3-storage_cluster-id_mapping
cluster_mapping_test.go:153: GetClusterMappingInfo() = <nil>, expected data &[{map[site3-storage:site2-storage] [map[8:3]] [map[10:5]]}]
--- FAIL: TestGetClusterMappingInfo (0.01s)
--- PASS: TestGetClusterMappingInfo/mapping_file_not_found (0.00s)
--- PASS: TestGetClusterMappingInfo/mapping_file_found_with_empty_data (0.00s)
--- PASS: TestGetClusterMappingInfo/cluster-id_mapping_not_found (0.00s)
--- FAIL: TestGetClusterMappingInfo/site2-storage_cluster-id_mapping (0.00s)
--- FAIL: TestGetClusterMappingInfo/site3-storage_cluster-id_mapping (0.00s)
--- PASS: TestGetClusterMappingInfo/site1-storage_cluster-id_mapping (0.00s)
By splitting the public GetClusterMappingInfo() function into an
internal getClusterMappingInfo() that takes a filename, unit-testing can
use different files for each go-routine, and testing becomes more
predictable.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
(cherry picked from commit 8b71671b42665789fac4a4aa1453b0b107f475c6)
pylint started to report errors like the following:
troubleshooting/tools/tracevol.py:97:10: R1732: Consider using 'with' for resource-allocating operations (consider-using-with)
There probably has been an update of Pylint in the test-container that
is more strict than previous versions.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
(cherry picked from commit 544d73759c39a08d82f20ea674896abf7857d9ef)
There is a problem accessing the code.cloudfoundry.org web service iver
TLS. It seems to redirect to GitHub, so use the package from there:
running: go mod verify
go: github.com/libopenstorage/secrets@v0.0.0-20210709082113-dde442ea20ec requires
github.com/hashicorp/vault@v1.4.2 requires
github.com/hashicorp/vault-plugin-auth-cf@v0.5.4 requires
github.com/cloudfoundry-community/go-cfclient@v0.0.0-20190201205600-f136f9222381 requires
code.cloudfoundry.org/gofileutils@v0.0.0-20170111115228-4d0c80011a0f: unrecognized import path "code.cloudfoundry.org/gofileutils": https fetch: Get "https://code.cloudfoundry.org/gofileutils?go-get=1": x509: certificate signed by unknown authority
Signed-off-by: Niels de Vos <ndevos@redhat.com>
(cherry picked from commit 32da0cf888ba452288a0e7436eed91cf7ca5dd4e)
If the image is in a secondary state and its
up+replaying means its an healthy secondary
and the image is primary somewhere in the remote cluster
and the local image is getting replayed. Delete the
OMAP data generated as we cannot delete the
secondary image. When the image on the primary
cluster gets deleted/mirroring disabled, the image on
all the remote (secondary) clusters will get
auto-deleted. This helps in garbage collecting
the OMAP, PVC and PV objects after failback operation.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 5562e46d0f5ca1bb7cab492bb77f05873698bb80)
If the image is in secondary state and its
up+replaying means its an healthy secondary
and the image is primary somewhere in the remote
cluster and the local image is getting replayed.
Return success for the Disabling mirroring as
we cannot disable the mirroring on the secondary
state, when the image on the remote site gets
disabled the image on all the remote (secondary)
will get auto deleted. This helps in garbage
collecting the volume replication kuberentes
artifacts
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit fc0d6f6b8b1461ddec596a090719172224856bfe)
added helper function to check the local image
state is up+replaying.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 35324b2e1710fc6215ba7e39076b5d4372d1cb4a)
This ensures the kubectl call is retried with kubectl_retry function.
Updates: #2309
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 7fba62dd47d7573d2840c7df8ee38d13b7d7e21c)
This commit adds functionality of extracting encryption kmsID,
owner from volumeAttributes in RegenerateJournal() and adds utility
functions ParseEncryptionOpts and FetchEncryptionKMSID.
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit f05ac2b25dc0f3d81f6fd5c917aa5f1dadf60b17)
This commit refractors RegenerateJournal() to take in
volumeAttributes map[string]string as argument so it
can extract required attributes internally.
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit b9b4b1e34ef4eb72e48e408dd6e40495cfe0ae24)
By using retryKubectl helper function,
a retry will be done, and the known error
messages will be skipped.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 2c66dfc3e42382dab4f717c0fe9aeae10a79ad32)
this provides caller ability to pass the arguments
like ignore-not-found=true etc when executing
the kubectl commands.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 2071c535fa32bc83f4189ed6dce55d2a2892371f)
added helper function retryKubectlArgs to perform
action if its a known error.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 9f0af30735f34b4977e56a29e4035ce3edd8fc0c)
added isAlreadyExistsCLIError to check for known error.
if error is already exists we are considering it
as a success.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit dd9fabf747108f402f02cb3eeb7fbb39d7682c1a)
consider the empty mirroring mode when
validating the snapshot interval and
the scheduling time.
Even if the mirroring Mode is not set
validate the snapshot scheduling details
as cephcsi sets the mirroring mode to default
snapshot.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 3c852199625333c8ccf8db18e592bb5627270d6b)