This commit disables the mon, mgr and mds liveness probes,
which on failure caused the `CrashLoopBackOff` state.
Updates: #2094
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit a4e4750fdc)
pylint started to report errors like the following:
troubleshooting/tools/tracevol.py:97:10: R1732: Consider using 'with' for resource-allocating operations (consider-using-with)
There probably has been an update of Pylint in the test-container that
is more strict than previous versions.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
(cherry picked from commit 8447a1feab)
This ensures the kubectl call is retried with the kubectl_retry function.
Updates: #2309
Signed-off-by: Rakshith R <rar@redhat.com>
(cherry picked from commit 7fba62dd47)
It seems that the version of the key/value engine can not always be
detected for Hashicorp Vault. In certain cases, it is required to
configure the `VAULT_BACKEND` (or `vaultBackend`) option so that a
successful connection to the service can be made.
The `kv-v2` is the current default for development deployments of
Hashicorp Vault (what we use for automated testing). Production
deployments default to version 1 for now.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
(cherry picked from commit 82557e3f34)
Testing encrypted PVCs does not work anymore since Kubernetes v1.21. It
seems that disabling the iss validation in Hashicorp Vault is a
relatively simple workaround that we can use instead of the more complex
securing of the environment that should be done in production
deployments.
Updates: #1963
See-also: external-secrets/kubernetes-external-secrets#721
Signed-off-by: Niels de Vos <ndevos@redhat.com>
(cherry picked from commit fd9fee74de)
While cleaning up snapshots, not all objects may exist after a partial
provisioning attempt. In case objects are missing, do not try to delete
them.
Fixes: #2192
Signed-off-by: Niels de Vos <ndevos@redhat.com>
(cherry picked from commit 0ee0c12027)
Getting rid of the function keyword for two reasons:
1. Defining a function without the 'function' keyword is more
portable, as it is compatible with Bourne/Korn/POSIX scripts.
2. To keep the coding style consistent across the file.
Signed-off-by: Yug <yuggupta27@gmail.com>
(cherry picked from commit e47738fa75)
The current approach uses hard-coded command line
arguments, which is not very robust.
To maintain backward compatibility, the script will
also keep working with the previous approach.
Signed-off-by: Yug <yuggupta27@gmail.com>
(cherry picked from commit cc72de4b1c)
Add an e2eArg `helmTest` to specify whether the tests are running
against a ceph-csi deployment installed via helm.
For testing in CI, StorageClass and Secret deployment
is enabled on helm installation.
Signed-off-by: Yug <yuggupta27@gmail.com>
(cherry picked from commit a4548c3983)
The parent volume (CreateVolume) and the clone volume
(CreateSnapshot) are independent of each other, and the parent
volume can be deleted at any time. To check for thick provisioning
during snapshot restore (CreateVolume from snapshot) we need
the thick provision metadata. For that reason, set the
thick provision metadata on the clone image that is created
at CreateSnapshot time.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 591ba3f580)
RbdSnapName holds the name of the actual RBD image that
was created during the CreateSnapshot operation.
RbdImageName holds the name of the parent from
which the snapshot was created. The parent
is independent of the snapshot and can be deleted
at any time, so use RbdSnapName
to check the RBD image details.
Generate a temporary volume from the snapshot, with
RbdImageName replaced by RbdSnapName, and use
it to check the image metadata.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 6d14eeee70)
Added validation to only allow restoring a thick PVC
snapshot to a thick clone, and creating a thick clone
from a thick PVC.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 7966d2e5c1)
isThickProvisioned can be used for both snapshot
and clone validation if it is a method
of the common rbdImage structure.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit fc442221e4)
When we try to add a task to flatten the rbd image, the actual
error is present in stdErr, not in the returned error. This
commit corrects the error checking for the case where the image
does not have a parent.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 05b8433b89)
The variable naming for rbd mount options has been changed
to rbdMountOptions to be consistent with the naming scheme of other variables.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
In the Go world, if you split an empty string that does not contain
the separator, you get a slice with one empty string. This results
in volumes failing to mount with "invalid feature " (note the extra space,
because it is trying to check whether the empty string is a valid feature).
This patch checks if the string is empty and, if so, skips
the entire validation and returns nothing.
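The behaviour described above can be reproduced with a minimal sketch.
The helper name splitFeatures and the "," separator are assumptions made
for illustration; this is not the function changed by the patch.

```go
package main

import (
	"fmt"
	"strings"
)

// splitFeatures is a hypothetical helper (not the patched function).
// It returns nil for an empty input instead of the single empty element
// that strings.Split would produce.
func splitFeatures(features string) []string {
	if features == "" {
		return nil
	}
	return strings.Split(features, ",")
}

func main() {
	// Splitting an empty string yields one element: the empty string.
	fmt.Println(len(strings.Split("", ","))) // 1

	// With the guard, an empty input yields no features to validate.
	fmt.Println(len(splitFeatures(""))) // 0

	fmt.Println(splitFeatures("layering,deep-flatten")) // [layering deep-flatten]
}
```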
Signed-off-by: Mohammed Naser <mnaser@vexxhost.com>
(cherry picked from commit 671d6a7767)
CreateVolume will fail in the following cases:
* If the snapshot is encrypted and requested volume
is not encrypted
* If the snapshot is not encrypted and requested
volume is encrypted
* If the parent volume is encrypted and requested volume
is not encrypted
* If the parent volume is not encrypted and requested
volume is encrypted
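All four cases above reduce to one rule: the encryption state of the
source (snapshot or parent volume) must match the requested volume. A
minimal sketch of that rule follows; validateEncryption and
errEncryptionMismatch are hypothetical names, not the ones used in the
driver.

```go
package main

import (
	"errors"
	"fmt"
)

// errEncryptionMismatch is a hypothetical sentinel error for this sketch.
var errEncryptionMismatch = errors.New("encryption mismatch between source and requested volume")

// validateEncryption is a hypothetical helper. The source is either the
// snapshot or the parent volume that the new volume is created from.
func validateEncryption(sourceEncrypted, requestedEncrypted bool) error {
	if sourceEncrypted != requestedEncrypted {
		return fmt.Errorf("%w: source encrypted=%t, requested encrypted=%t",
			errEncryptionMismatch, sourceEncrypted, requestedEncrypted)
	}
	return nil
}

func main() {
	fmt.Println(validateEncryption(true, false) != nil)  // true
	fmt.Println(validateEncryption(false, true) != nil)  // true
	fmt.Println(validateEncryption(true, true) == nil)   // true
	fmt.Println(validateEncryption(false, false) == nil) // true
}
```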
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 7b5c78ec7c)
Move the repairing of a volume/snapshot from CreateVolume to its own
function. This reduces the complexity of the code, and makes the
procedure easier to understand. Further enhancements to repairing an
existing volume can be done in the new function.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
(cherry picked from commit 96a8ea3e88)
Added an E2E test for the following case:
* Create PVC
* Create Snapshot from PVC
* Delete PVC
* Create Clone from Snapshot
* Delete Snapshot
* Mount clone to Application
* Delete Application and PVC Clone
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit fa36a46682)
Flatten the image if the deep-flatten feature
is present on the images in the chain, or if the
number of images in the chain is not zero, as we cannot check
the deep-flatten feature on images that are
in the trash.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 67d73cd6e9)
For flatten we call checkImageChainHasFeature,
which internally calls getImageInfo. getImageInfo returns
the parent name even if the parent is in the trash.
When we then try to open the parent image to get its
information, it fails with an image not found error.
We should treat the error as nil if the parent is not found.
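A minimal sketch of the idea, using an in-memory map in place of a real
pool. The types, the lookup helper and chainHasFeature are assumptions
made for illustration and do not mirror the driver's actual
implementation.

```go
package main

import (
	"errors"
	"fmt"
)

// errImageNotFound is a hypothetical sentinel error for this sketch.
var errImageNotFound = errors.New("image not found")

// imageInfo is a simplified stand-in for the information read from an image.
type imageInfo struct {
	features []string
	parent   string // empty when the image has no parent
}

// lookup simulates opening an image; images in the trash are not in the map.
func lookup(pool map[string]imageInfo, name string) (imageInfo, error) {
	info, ok := pool[name]
	if !ok {
		return imageInfo{}, fmt.Errorf("%w: %s", errImageNotFound, name)
	}
	return info, nil
}

// chainHasFeature walks from the image up through its parents. A parent
// that cannot be found (for example because it is in the trash) ends the
// walk without an error; a missing starting image is still an error.
func chainHasFeature(pool map[string]imageInfo, name, feature string) (bool, error) {
	for depth := 0; name != ""; depth++ {
		info, err := lookup(pool, name)
		if err != nil {
			if depth > 0 && errors.Is(err, errImageNotFound) {
				return false, nil
			}
			return false, err
		}
		for _, f := range info.features {
			if f == feature {
				return true, nil
			}
		}
		name = info.parent
	}
	return false, nil
}

func main() {
	pool := map[string]imageInfo{
		// The parent "trashed-parent" is intentionally absent from the map.
		"clone": {features: []string{"layering"}, parent: "trashed-parent"},
	}
	found, err := chainHasFeature(pool, "clone", "deep-flatten")
	fmt.Println(found, err) // false <nil>
}
```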
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit e15e2e5081)
To recover from the split-brain (up+error) state, the image needs to be
demoted and a resync requested on site-a, and then the image on site-b
should get demoted. The volume should be marked ready=true when the
image state on both clusters is up+unknown, because during the last
snapshot sync the data gets copied first and then the image state on
site-a changes to up+unknown.
If the image state on both sites is up+unknown, consider the
complete data to be synced, as the last snapshot
has been exchanged between the clusters.
* Create a 10 GB file and validate the data after resync
* Do Failover when the site-a goes down
* Force promote the image and write data in GiB
* Once the site-a comes back, Demote the image and issue resync
* Demote the image on site-b
* The status will get reflected on the other site when the last
snapshot sync happens
* The image will go to the up+unknown state, and the complete data will
be copied to site-a
* Promote the image on site-a and use it
```bash
csi-vol-5633715e-a7eb-11eb-bebb-0242ac110006:
global_id: e7f9ec55-06ab-46cb-a1ae-784be75ed96d
state: up+unknown
description: remote image demoted
service: a on minicluster1
last_update: 2021-04-28 07:11:56
peer_sites:
name: e47e29f4-96e8-44ed-b6c6-edf15c5a91d6-rook-ceph
state: up+unknown
description: remote image demoted
last_update: 2021-04-28 07:11:41
```
* Do Failover when the site-a goes down
* Force promote the image on site-b and write data in GiB
* Demote the image on site-b
* Once the site-a comes back, Demote the image on site-a
* The images on both sites will go to the split-brain state
```bash
csi-vol-37effcb5-a7f1-11eb-bebb-0242ac110006:
global_id: 115c3df9-3d4f-4c04-93a7-531b82155ddf
state: up+error
description: split-brain
service: a on minicluster2
last_update: 2021-04-28 07:25:41
peer_sites:
name: abbda0f0-0117-4425-8cb2-deb4c853da47-rook-ceph
state: up+error
description: split-brain
last_update: 2021-04-28 07:25:26
```
* Issue resync
* The images cannot be resynced, because when we issue the resync
on site-a the image on site-b is in the demoted state
* To recover from this state, promote and then demote the
image on site-b after some time
```bash
csi-vol-37effcb5-a7f1-11eb-bebb-0242ac110006:
global_id: 115c3df9-3d4f-4c04-93a7-531b82155ddf
state: up+unknown
description: remote image demoted
service: a on minicluster1
last_update: 2021-04-28 07:32:56
peer_sites:
name: e47e29f4-96e8-44ed-b6c6-edf15c5a91d6-rook-ceph
state: up+unknown
description: remote image demoted
last_update: 2021-04-28 07:32:41
```
* Once the data is copied we can see that the image state
is moved to up+unknown on both sites
* Promote the image on site-a and use it
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 07a916b84d)
When a snapshot is encrypted during a CreateSnapshot
operation, the encryption key gets created in the KMS.
When we delete the snapshot, the key should also get
deleted from the KMS.
When we create a volume from a snapshot we copy the
required information, but we missed copying the
encryption information. This commit adds the missing
information needed to delete the encryption key.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit c3bae17fce)
At present we return the volume connect error if the clone
from snapshot fails when rbdvolume is encrypted, which is incorrect.
This patch correctly returns the failed copy encryption error to the
caller.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
(cherry picked from commit 798437d0c4)
From helm v3.x onwards there is no helm init
command. Remove the helm init call, which was causing
helm chart push failures in the release and devel
branches.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 6508726276)
Install the helm package based on the version
specified in the build.env
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit aa77b677a3)
Before the RBD map operation, we check the
watchers on the RBD image. In the case of an
RWO volume, cephcsi makes sure only one
client is using the RBD image. If the rbd
image is mirrored, the mirroring daemon
will by default add a watcher on the image,
and as we are using go-ceph, another watcher
is added because we have opened the image. So
we will have two watchers on an image if
mirroring is enabled. This holds when the
rbd mirror daemon is running. If
the mirror daemon is not running, there will
be only one watcher on the rbd image
(the one placed by the go-ceph image open).
We should not block the map operation if
the mirroring daemon is not running, as
mirroring is asynchronous. This commit adds a
check to make sure there are no more than 2 watchers
if the image is mirrored, or no more than 1
watcher if it is not mirrored.
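The watcher limits described above can be sketched as follows. The
function name imageInUse and its parameters are assumptions made for
illustration; the real check reads the watcher list from the image
rather than taking a count.

```go
package main

import "fmt"

// imageInUse is a hypothetical helper. It reports whether the number of
// watchers exceeds what is expected before mapping the image:
//   - 1 watcher from the go-ceph image open
//   - 1 more from the rbd mirror daemon, if the image is mirrored
//
// A mirrored image with only 1 watcher (mirror daemon not running) is
// not treated as in use, so the map operation is not blocked.
func imageInUse(watchers int, mirrored bool) bool {
	allowed := 1
	if mirrored {
		allowed = 2
	}
	return watchers > allowed
}

func main() {
	fmt.Println(imageInUse(1, false)) // false
	fmt.Println(imageInUse(2, false)) // true
	fmt.Println(imageInUse(1, true))  // false
	fmt.Println(imageInUse(2, true))  // false
	fmt.Println(imageInUse(3, true))  // true
}
```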
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 52290333e6)
This commit reverts the changes done
for the v3.3.0 release. With this change a
release canary tagged image and helm charts
will get pushed.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
If the pool or a few keys are missing in the omap, the
GetImageAttributes function returns a nil error and a few
empty items in the imageAttributes struct. If the image is not
found and the entries are missing, use
the volumeId present in the PV annotation for further operations.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
If the image is promoted and then demoted, the
image state will be set to up+unknown if the image
on the remote cluster is still in the demoted state.
When the user changes the state from primary to secondary
and the image is still in the demoted (secondary) state
in the remote cluster, the image state on both clusters
will be up+unknown.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
BlockVolume, CSIBlockVolume (GA since k8s v1.18) & VolumeSnapshotDataSource
(GA since k8s v1.20) default to true and don't need to be set to true in
the feature gates setting.
Signed-off-by: Rakshith R <rar@redhat.com>