This commit adds fetchMappedClusterIDAndMons(), which returns the
monitors and clusterID after checking the cluster mapping info.
This is required for regenerating omap entries in a mirrored cluster
with a different clusterID.
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 99168dc822a4c81f4a12dcf00a7165e7426594ce)
This commit moves getMappedID() from rbd to util
package since it is not rbd specific and exports
it from there.
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 496bcba85c8fc1b4a00b71872cb0f560036180fe)
Unit-testing often fails due to a race condition while writing the
clusterMappingConfigFile from multiple goroutines at the same time.
Failures from `make containerized-test` look like this:
=== CONT TestGetClusterMappingInfo/site2-storage_cluster-id_mapping
cluster_mapping_test.go:153: GetClusterMappingInfo() = <nil>, expected data &[{map[site1-storage:site2-storage] [map[1:3]] [map[11:5]]} {map[site3-storage:site2-storage] [map[8:3]] [map[10:5]]}]
=== CONT TestGetClusterMappingInfo/site3-storage_cluster-id_mapping
cluster_mapping_test.go:153: GetClusterMappingInfo() = <nil>, expected data &[{map[site3-storage:site2-storage] [map[8:3]] [map[10:5]]}]
--- FAIL: TestGetClusterMappingInfo (0.01s)
--- PASS: TestGetClusterMappingInfo/mapping_file_not_found (0.00s)
--- PASS: TestGetClusterMappingInfo/mapping_file_found_with_empty_data (0.00s)
--- PASS: TestGetClusterMappingInfo/cluster-id_mapping_not_found (0.00s)
--- FAIL: TestGetClusterMappingInfo/site2-storage_cluster-id_mapping (0.00s)
--- FAIL: TestGetClusterMappingInfo/site3-storage_cluster-id_mapping (0.00s)
--- PASS: TestGetClusterMappingInfo/site1-storage_cluster-id_mapping (0.00s)
By splitting the public GetClusterMappingInfo() function into an
internal getClusterMappingInfo() that takes a filename, unit-testing can
use a different file for each goroutine, and testing becomes more
predictable.
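A minimal sketch of the split described above, not the actual
implementation. The default file path, the simplified mappingEntry type,
the JSON key and the "missing file is not an error" behaviour are all
assumptions made for illustration:

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// Assumed default path; the real location is not stated in this commit.
const clusterMappingConfigFile = "/etc/ceph-csi-config/cluster-mapping.json"

// mappingEntry is a simplified, hypothetical stand-in for the real type.
type mappingEntry struct {
	ClusterIDMapping map[string]string `json:"clusterIDMapping"`
}

// GetClusterMappingInfo is the public entry point; it always reads the
// default file.
func GetClusterMappingInfo(clusterID string) ([]mappingEntry, error) {
	return getClusterMappingInfo(clusterID, clusterMappingConfigFile)
}

// getClusterMappingInfo reads the given file and returns the entries that
// mention clusterID as a key or a value. Tests can pass their own file.
func getClusterMappingInfo(clusterID, filename string) ([]mappingEntry, error) {
	content, err := os.ReadFile(filename)
	if err != nil {
		if errors.Is(err, os.ErrNotExist) {
			return nil, nil // assumption: a missing file is not an error
		}
		return nil, fmt.Errorf("failed to read %q: %w", filename, err)
	}
	var all []mappingEntry
	if err := json.Unmarshal(content, &all); err != nil {
		return nil, fmt.Errorf("failed to parse %q: %w", filename, err)
	}
	var matched []mappingEntry
	for _, e := range all {
		for k, v := range e.ClusterIDMapping {
			if k == clusterID || v == clusterID {
				matched = append(matched, e)
				break
			}
		}
	}
	return matched, nil
}

func main() {
	// Each caller (or test goroutine) writes to its own temporary file.
	dir, err := os.MkdirTemp("", "mapping")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)
	file := filepath.Join(dir, "mapping.json")
	data := `[{"clusterIDMapping":{"site1-storage":"site2-storage"}}]`
	if err := os.WriteFile(file, []byte(data), 0o600); err != nil {
		panic(err)
	}
	got, err := getClusterMappingInfo("site2-storage", file)
	fmt.Println(len(got), err)
}
```

Because the filename is a parameter, parallel subtests never share a
file, which removes the race without adding locking.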
Signed-off-by: Niels de Vos <ndevos@redhat.com>
(cherry picked from commit 8b71671b42665789fac4a4aa1453b0b107f475c6)
pylint started to report errors like the following:
troubleshooting/tools/tracevol.py:97:10: R1732: Consider using 'with' for resource-allocating operations (consider-using-with)
There probably has been an update of Pylint in the test-container that
is more strict than previous versions.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
(cherry picked from commit 544d73759c39a08d82f20ea674896abf7857d9ef)
There is a problem accessing the code.cloudfoundry.org web service over
TLS. It seems to redirect to GitHub, so use the package from there:
running: go mod verify
go: github.com/libopenstorage/secrets@v0.0.0-20210709082113-dde442ea20ec requires
github.com/hashicorp/vault@v1.4.2 requires
github.com/hashicorp/vault-plugin-auth-cf@v0.5.4 requires
github.com/cloudfoundry-community/go-cfclient@v0.0.0-20190201205600-f136f9222381 requires
code.cloudfoundry.org/gofileutils@v0.0.0-20170111115228-4d0c80011a0f: unrecognized import path "code.cloudfoundry.org/gofileutils": https fetch: Get "https://code.cloudfoundry.org/gofileutils?go-get=1": x509: certificate signed by unknown authority
Signed-off-by: Niels de Vos <ndevos@redhat.com>
(cherry picked from commit 32da0cf888ba452288a0e7436eed91cf7ca5dd4e)
If the image is in a secondary state and is
up+replaying, it is a healthy secondary:
the image is primary somewhere in the remote cluster
and the local image is being replayed. Delete the
generated OMAP data, as we cannot delete the
secondary image. When the image on the primary
cluster is deleted or has mirroring disabled, the image on
all the remote (secondary) clusters will get
auto-deleted. This helps in garbage collecting
the OMAP, PVC and PV objects after a failback operation.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 5562e46d0f5ca1bb7cab492bb77f05873698bb80)
If the image is in a secondary state and is
up+replaying, it is a healthy secondary:
the image is primary somewhere in the remote
cluster and the local image is being replayed.
Return success for disabling mirroring, as
we cannot disable mirroring on an image in the
secondary state. When mirroring is disabled for the
image on the remote site, the image on all the
remote (secondary) clusters will get auto-deleted.
This helps in garbage collecting the volume
replication Kubernetes artifacts.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit fc0d6f6b8b1461ddec596a090719172224856bfe)
added a helper function to check whether the local image
state is up+replaying.
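A rough sketch of such a check, assuming a simplified status type. The
mirrorStatus struct and the isUpAndReplaying name are hypothetical and
only illustrate the up+replaying condition:

```go
package main

import "fmt"

// mirrorStatus is a hypothetical, simplified view of the local site's
// mirroring status for one image.
type mirrorStatus struct {
	Up    bool   // the mirroring daemon reports the image as up
	State string // for example "replaying", "stopped" or "error"
}

// isUpAndReplaying reports whether the local image looks like a healthy
// secondary: it is up and currently replaying from the primary.
func isUpAndReplaying(s mirrorStatus) bool {
	return s.Up && s.State == "replaying"
}

func main() {
	fmt.Println(isUpAndReplaying(mirrorStatus{Up: true, State: "replaying"}))
	fmt.Println(isUpAndReplaying(mirrorStatus{Up: false, State: "replaying"}))
}
```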
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 35324b2e1710fc6215ba7e39076b5d4372d1cb4a)
This ensures the kubectl call is retried with kubectl_retry function.
Updates: #2309
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 7fba62dd47d7573d2840c7df8ee38d13b7d7e21c)
This commit adds functionality of extracting encryption kmsID,
owner from volumeAttributes in RegenerateJournal() and adds utility
functions ParseEncryptionOpts and FetchEncryptionKMSID.
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit f05ac2b25dc0f3d81f6fd5c917aa5f1dadf60b17)
This commit refactors RegenerateJournal() to take in
volumeAttributes map[string]string as an argument so it
can extract the required attributes internally.
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit b9b4b1e34ef4eb72e48e408dd6e40495cfe0ae24)
By using retryKubectl helper function,
a retry will be done, and the known error
messages will be skipped.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 2c66dfc3e42382dab4f717c0fe9aeae10a79ad32)
this gives the caller the ability to pass arguments
such as ignore-not-found=true when executing
kubectl commands.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 2071c535fa32bc83f4189ed6dce55d2a2892371f)
added helper function retryKubectlArgs to perform
the action if it is a known error.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 9f0af30735f34b4977e56a29e4035ce3edd8fc0c)
added isAlreadyExistsCLIError to check for a known error.
If the error is "already exists", we consider it
a success.
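One way such a check could look, as a sketch only. It assumes the
kubectl failure output is available in the error text and matches on the
"Error from server (AlreadyExists)" prefix that kubectl prints; the
containsAlreadyExists helper is hypothetical:

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// alreadyExistsMarker is the prefix kubectl prints when the API server
// rejects a create because the object is already present.
const alreadyExistsMarker = "Error from server (AlreadyExists)"

// containsAlreadyExists is a hypothetical helper working on raw output.
func containsAlreadyExists(output string) bool {
	return strings.Contains(output, alreadyExistsMarker)
}

// isAlreadyExistsCLIError reports whether err carries the marker, in
// which case the caller may treat the operation as a success.
func isAlreadyExistsCLIError(err error) bool {
	if err == nil {
		return false
	}
	return containsAlreadyExists(err.Error())
}

func main() {
	err := errors.New(`Error from server (AlreadyExists): pods "test" already exists`)
	fmt.Println(isAlreadyExistsCLIError(err))
}
```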
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit dd9fabf747108f402f02cb3eeb7fbb39d7682c1a)
consider the empty mirroring mode when
validating the snapshot interval and
the scheduling time.
Even if the mirroring mode is not set,
validate the snapshot scheduling details,
as cephcsi defaults the mirroring mode to
snapshot.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 3c852199625333c8ccf8db18e592bb5627270d6b)
This commit converts LastUpdate
from an int to UTC format and logs
it for better debugging.
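A small sketch of the conversion, assuming LastUpdate holds seconds since
the Unix epoch. The lastUpdateUTC name and the RFC 3339 output format are
choices made for this example:

```go
package main

import (
	"fmt"
	"time"
)

// lastUpdateUTC converts a Unix timestamp in seconds to a readable UTC
// string suitable for log messages.
func lastUpdateUTC(lastUpdate int64) string {
	return time.Unix(lastUpdate, 0).UTC().Format(time.RFC3339)
}

func main() {
	fmt.Println(lastUpdateUTC(0)) // prints "1970-01-01T00:00:00Z"
}
```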
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 2782878ea20c7d49f392ccdb948001eb0e1b83e0)
This commit fixes snapshot id idempotency issue by
always returning an error when flattening is in progress
and not using `readyToUse:false` response.
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 825211730cce9c6a909e66fb9e7248ea35c17c8e)
This commit refactors the checkCloneImage function to
address a nestif linter issue.
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 859d69627935fd526074eb494cacde8e9dd34402)
This commit fixes a bug in checkCloneImage() which was caused
by checking the cloned image before checking the temp-clone image snap
in a subsequent request, which led to stale images. This was solved
by checking the temp-clone image snap and flattening the temp-clone if
needed.
This commit also fixes a comparison bug in flattenCloneImage().
Signed-off-by: Rakshith R <rar@redhat.com>
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit a5a89527165af23c12957e6fd1a9c9c7f427ecef)
The rbd flatten function is a CLI call that expects
the creds as input, so copying the creds is
required when we generate the temp clone image.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 916c97b4a87cfc4d9cad9aaaedcc33b0de75a032)
A volume generated from a snap using generateVolFromSnap
already copies the volume ID correctly. Therefore remove
`vol.VolID = rbdVol.VolID`, which wrongly copies the parent
volume ID instead, leading to an error from copyEncryption()
because the parent and clone volume IDs are equal.
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 08728b631b753ef44b7a4bd48d3eba383c497d35)
updated the filesystem identifier from
FsId to FscID.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit fce5a181d05b4880db3ece754489403fc9b7c9e1)
added a design doc to handle volumeID mapping in case
of failover in Disaster Recovery.
update #2118
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 5fc9c3a046d445480ee6f39c9bfb53e308158561)
added template changes for the clusterID,
poolID and fsID mapping details to the pod templates.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit d321663872badfc1ef91b4214db1876edc697368)
Whenever Ceph-CSI receives a CSI/Replication request, it first
decodes the volumeHandle and tries to get the required OMAP
details. If it cannot retrieve them and receives a `Not Found`
error, Ceph-CSI checks for the clusterID mapping. If the old volumeID
`0001-00013-site1-storage-0000000000000001-b0285c97-a0ce-11eb-8c66-0242ac110002`
contains `site1-storage` as the clusterID, Ceph-CSI looks for
the corresponding clusterID `site2-storage` in the above configmap.
If the clusterID mapping is found, Ceph-CSI then looks for the
poolID mapping, i.e. the mapping between `1` and `2`. For example,
a pool with the same name exists on both clusters with different
IDs: Replicapool with ID `1` on site1 and Replicapool with ID `2`
on site2. After getting the required mapping, Ceph-CSI has the
information it needs to get more details from the rados OMAP.
If there are multiple clusterID mappings, it loops through all of
them and checks the corresponding pool to get the OMAP data.
If the clusterID mapping does not exist, Ceph-CSI returns a
`Not Found` error to the caller.
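The lookup above can be sketched roughly as follows. This is an
illustration under assumptions, not the actual code: the clusterMapping
and candidate types and the mappedCandidates function are hypothetical,
and the sketch only follows mappings from key to value:

```go
package main

import "fmt"

// clusterMapping is a hypothetical, simplified mapping entry.
type clusterMapping struct {
	ClusterIDMapping map[string]string
	RBDPoolIDMapping []map[string]string
}

// candidate is a clusterID/poolID pair to retry the OMAP lookup with.
type candidate struct {
	ClusterID string
	PoolID    string
}

// mappedCandidates returns the pairs to try after the lookup with the
// original IDs returned Not Found. An empty result means no mapping
// exists and the caller should return Not Found.
func mappedCandidates(mappings []clusterMapping, clusterID, poolID string) []candidate {
	var out []candidate
	for _, m := range mappings {
		mappedCluster, ok := m.ClusterIDMapping[clusterID]
		if !ok {
			continue // this entry does not mention the old clusterID
		}
		for _, pools := range m.RBDPoolIDMapping {
			if mappedPool, found := pools[poolID]; found {
				out = append(out, candidate{ClusterID: mappedCluster, PoolID: mappedPool})
			}
		}
	}
	return out
}

func main() {
	mappings := []clusterMapping{{
		ClusterIDMapping: map[string]string{"site1-storage": "site2-storage"},
		RBDPoolIDMapping: []map[string]string{{"1": "2"}},
	}}
	fmt.Println(mappedCandidates(mappings, "site1-storage", "1"))
}
```

With the IDs from the text, the old pair (`site1-storage`, `1`) maps to
(`site2-storage`, `2`), which is then used for the OMAP lookup.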
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 92ad2ceec977f482060544c94f9228cfdcf586cb)
added a helper function to read the clusterID mapping
from the mounted file.
The clusterID mapping contains the following mappings:
* ClusterID mapping (the cluster we are failing over to
and the cluster from which the failover happened)
* RBD PoolID mapping between the clusters.
* CephFS FscID mapping between the clusters.
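As an illustration, the three mappings listed above could be decoded
from JSON along these lines. The struct field names, the JSON keys and
the parseClusterMapping helper are assumptions for this sketch; only the
kinds of mapping come from the commit message:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ClusterMappingInfo holds one set of mappings between two clusters.
// Field names and JSON keys are assumed for illustration.
type ClusterMappingInfo struct {
	ClusterIDMapping   map[string]string   `json:"clusterIDMapping"`
	RBDPoolIDMapping   []map[string]string `json:"RBDPoolIDMapping"`
	CephFSFscIDMapping []map[string]string `json:"CephFSFscIDMapping"`
}

// parseClusterMapping decodes the content of the mounted mapping file.
func parseClusterMapping(data []byte) ([]ClusterMappingInfo, error) {
	var info []ClusterMappingInfo
	if err := json.Unmarshal(data, &info); err != nil {
		return nil, fmt.Errorf("failed to parse cluster mapping: %w", err)
	}
	return info, nil
}

func main() {
	data := []byte(`[{
		"clusterIDMapping": {"site1-storage": "site2-storage"},
		"RBDPoolIDMapping": [{"1": "3"}],
		"CephFSFscIDMapping": [{"11": "5"}]
	}]`)
	info, err := parseClusterMapping(data)
	if err != nil {
		panic(err)
	}
	fmt.Println(info[0].ClusterIDMapping["site1-storage"])
}
```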
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit ac11d71e19acfd46b9b6f157902d35c9dd7f953f)
Update ceph-csi-cephfs.provisioner updatestrategy
to allow maxUnavailable pods at a time to be 50%
Signed-off-by: Yug Gupta <yuggupta27@gmail.com>
(cherry picked from commit 080f7538c0d5ec0e94306b4e4a80f7bd887aa506)
Update ceph-csi-rbd.provisioner updatestrategy
to allow maxUnavailable pods at a time to be 50%
Signed-off-by: Yug Gupta <yuggupta27@gmail.com>
(cherry picked from commit ea088d40beba000ba4d601298e96c526c89283fa)
The VAULT_AUTH_MOUNT_PATH is a Vault configuration parameter that allows
a user to set a non-default path for the Kubernetes ServiceAccount
integration. This can already be configured for the Vault KMS, and is
now added to the Vault Tenant SA KMS as well.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
(cherry picked from commit 4859f2dfdb88304cc484402739787adcbea4ed5f)
updating commitlint to the latest, so
that the latest release is installed by
default each time.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 3805c29f36968d97b9e75f3df9d94e9781bb7e83)
updated commitlint mergify rules to
consider the commitlint status to
merge the PR.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 0b6322afda0cd173bb36be241d29c0f7719365d6)
This commit uses trailer-exists instead
of signed-off-by to verify the Signed-off-by
message.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Suggested-by: Ade Attwood
(cherry picked from commit 38ef32a496b7109b8d186e7de79b5b8d6e494647)
- mount host's /etc/selinux in node plugins
- process mount options in all code paths for cephfs volume options
Signed-off-by: Alexandre Lossent <alexandre.lossent@cern.ch>
(cherry picked from commit 5cba04c470d259438f8608af9918d5d3ac338d58)
The new `vaultAuthNamespace` configuration parameter can be set to the
Vault Namespace where the authentication is set up in the service. Some
Hashicorp Vault deployments use sub-namespaces for their users/tenants,
with a 'root' namespace where the authentication is configured. This
requires passing different Vault namespaces for different operations.
Example:
- the Kubernetes Auth mechanism is configured in the Vault
Namespace called 'devops'
- a user/tenant has a sub-namespace called 'devops/website' where the
encryption passphrases can be placed in the key-value store
The configuration for this then looks like:
vaultAuthNamespace: devops
vaultNamespace: devops/website
Note that Vault Namespaces are a feature of the Hashicorp Vault
Enterprise product, and not part of the Open Source version. This
prevents adding e2e tests that validate the Vault Namespace
configuration.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
(cherry picked from commit f2d5c2e0df8e2454bccc3c290600452989ebae97)