As deep-flatten has long been supported in Ceph and is
enabled by default in librbd, provide an option
to enable it in cephcsi for the RBD images we
create.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
It might take some time for the deployment to
get created; consider NotFound a valid
error and retry.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
On occasion, deploying CephFS components fails due to errors like these:
failed to delete provisioner rbac .../csi-provisioner-rbac.yaml
By using the deleteResource() helper, a retry is done in case of a
failure.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
There have been errors while CephFS tests were running, like:
failed to create storageclass: etcdserver: request timed out
By retrying the creation of the StorageClass, the e2e tests are expected to
continue and (hopefully) succeed.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
The CentOS Stream 8 base container image does not have `ps` installed.
This causes CI jobs to fail, when checking for a restarted rbd-nbd
process.
Instead of using `ps`, the `pstree` command can be used. This will add
some ASCII-tree symbols in front of the command that is logged by the
e2e tests, but that is only used for manual reviewing and does not harm
the running test.
Fixes: #2850
Signed-off-by: Niels de Vos <ndevos@redhat.com>
This commit removes the thick provisioning
code as thick provisioning is deprecated in
cephcsi 3.5.0.
fixes: #2795
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
The CephFS data pool name changed from filesystem-data0
to filesystem-replicated in Rook 1.8. Update
the cephcsi helper functions to use the new
pool name.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
As ioutil.ReadFile is deprecated and the
suggestion is to use os.ReadFile as
per https://pkg.go.dev/io/ioutil, update
the code accordingly.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
This commit makes the recreateCSIRBDPods function more general
so that it can be consumed by more clients.
Updates https://github.com/ceph/ceph-csi/issues/2509
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
Added e2e tests for the cases below:
Normal PVC clone to a bigger
size PVC (without encryption)
* Filesystem pvc clone to a bigger size
* Block pvc clone to a bigger size
Encrypted PVC clone to a bigger
size PVC
* Filesystem pvc clone to a bigger size
* Block pvc clone to a bigger size
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Added e2e tests for the cases below:
Normal PVC snapshot restore to a bigger
size PVC (without encryption)
* Filesystem pvc restore to a bigger size
* Block pvc restore to a bigger size
Encrypted PVC snapshot restore to a bigger
size PVC
* Filesystem pvc restore to a bigger size
* Block pvc restore to a bigger size
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
This commit consolidates the existing migration e2e tests into a couple
of tests that cover the scenarios. The separate filesystem and block tests
have been merged into a single one, and a couple of helper functions have
been introduced to set up and tear down the migration-specific secret,
configmap and sc. The static PV function has been renamed to a more
general name while the tests were adjusted.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
This `unparam` linter escape is no longer needed and CI is failing
if we keep it there. This commit removes it to make CI happy.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
Add an e2e test case to validate that the workflow
of PVC creation and attaching to a pod works for
new image features like fast-diff, object-map, exclusive-lock
and layering.
fixes: #2695
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Currently, we skip the generic ephemeral
tests if the Kubernetes version is less than
1.21. Because of this, the whole test suite is
skipped and e2e is marked as successful
in 2 minutes. This commit runs the ephemeral
tests only if the Kubernetes version is 1.21 or
later, so that the other tests still run on lower versions.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
The e2e tests sometimes fail to get objects like PVCs from the Kubernetes API
server, and log the following error:
Error getting pvc "rbd-6940" in namespace "rbd-694": rpc error: code = Unknown desc = OK: HTTP status code 200; transport: missing content-type field
By checking the error message, and initiating a retry on this failure,
CI jobs should fail less regularly.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Add tests for RWX and ROX accessModes for Block and FileSystem Mode
PVCs.
Fixes: #2262
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
To make the error return consistent across e2e tests, we have decided
to remove the `with error` text from the logs; this commit
does that for e2e/snapshot.go.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
To make the error return consistent across e2e tests, we have decided
to remove the `with error` text from the logs; this commit
does that for e2e/cephfs_helper.go.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
To make the error return consistent across e2e tests, we have decided
to remove the `with error` text from the logs; this commit
does that for e2e/upgrade-rbd.go.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
To make the error return consistent across e2e tests, we have decided
to remove the `with error` text from the logs; this commit
does that for e2e/upgrade-cephfs.go.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
To make the error return consistent across e2e tests, we have decided
to remove the `with error` text from the logs; this commit
does that for e2e/rbd_helper.go.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
To make the error return consistent across e2e tests, we have decided
to remove the `with error` text from the logs; this commit
does that for e2e/ceph_user.go.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
To make the error return consistent across e2e tests, we have decided
to remove the `with error` text from the logs; this commit
does that for e2e/utils.go.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
To make the error return consistent across e2e tests, we have decided
to remove the `with error` text from the logs; this commit
does that for the cephfs tests.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
To make the error return consistent across e2e tests, we have decided
to remove the `with error` text from the logs; this commit
does that for the rbd tests.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
This commit adds validation that the CSI CephFS driver works with
ephemeral volume support. With ephemeral volume support, a user can
specify ephemeral volumes in the pod spec and tie the lifecycle
of the PVC to the pod.
An example pod spec is also included in this commit.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
This commit adds validation that the CSI RBD driver works with
ephemeral volume support. With ephemeral volume support, a user can
specify ephemeral volumes in the pod spec and tie the lifecycle
of the PVC to the pod.
An example pod spec is also included in this commit.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
Considering we are well past these releases and only care about
kubernetes releases from v1.20, there is no need to have this
version check in place for the tests.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
Considering we are well past these releases and only care about
kubernetes releases from v1.20, there is no need to have this
version check in place for the tests.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
Considering we are well past these releases and only care about
kubernetes releases from v1.20, there is no need to have this
version check in place for the tests.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
Considering we are well past this release and only care about
kubernetes releases from v1.20, there is no need to have this
version check in place for the tests.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
Considering we are well past this release and only care about
kubernetes releases from v1.20, there is no need to have this
version check in place for the tests.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
Considering we are well past this release and only care about
kubernetes releases from v1.20, there is no need to have this
version check in place for the tests.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
There have been occasional CI job failures due to "transport is closing"
errors. Adding this error to the isRetryableAPIError() function should
make sure to retry the request until the connection is restored.
Fixes: #2613
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Currently, the mountType validation of the encrypted volume is done in
the application; we should rather validate this inside the nodeplugin
pod.
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Currently, in the "perform IO on rbd-nbd volume after nodeplugin restart"
test, we perform a write on the rbd-nbd based mount after the nodeplugin
restart. But due to a bug in the NBD driver, the writes are failing. Please
note that NBD zero cmd timeout handling is fixed in kernel >= 5.4, and hence
we should guard the writes based on the kernel version to avoid unnecessary
CI failures.
For more information see
https://github.com/ceph/ceph-csi/issues/2204#issuecomment-930941047
updates: #2204
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
This commit creates and makes use of a migration secret in the requests and
validates various CSI operations.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
This is to preserve the rbd-nbd logs post unmap, so that the CI can dump
the available logs from logdir.
Fixes: #2451
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
For a static volume, the user manually mounts
an already existing image as a volume in the application
pods. As it is an RBD image, if the PVC is of type
Filesystem, the image will be mapped, formatted
and mounted on the node.
If the user resizes the image on the Ceph cluster,
the filesystem created on the RBD image is not
resized automatically. Even if the user deletes and
recreates the Kubernetes objects, the new size
will not be visible on the node.
With this change, during the NodeStageVolumeRequest
the nodeplugin checks the size of the mapped RBD
image on the node using the devicePath, and also
the RBD image size on the Ceph cluster.
If the sizes do not match, it resizes the
filesystem on the node as part of the
NodeStageVolumeRequest RPC call.
The user needs to perform the operations below to see the new size:
* Resize the RBD image in the Ceph cluster.
* Scale down all the application pods using the static
PVC.
* Make sure no application pod that uses the
static PVC is running on a node.
* Scale up all the application pods.
Validate the new size of the volume mounted in the
application pod.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
This commit adds a test for the migration delete volID detection scenario
by passing a custom volID, with the entries in the configmap changed
to simulate the situation. The staticPV function now also
accepts an annotation map, which makes it more generally usable.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
createCustomConfigmap helps to create a custom cluster entry in
the configmap, however this was coupled with subvolumegroup filling
in the cluster configuration. This commit makes it more
general, and the subvolumegroup filling is now controlled with a flag.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
e2elog.Logf("waiting for kubectl (%s -f $q args %s) to finish", action, args)
changed to
e2elog.Logf("waiting for kubectl (%s -f args %s) to finish", action, args)
Signed-off-by: Rakshith R <rar@redhat.com>
We need
https://www.mail-archive.com/linux-block@vger.kernel.org/msg38060.html
in order to use `--io-timeout=0`. This patch is part of kernel 5.4.
Since minikube doesn't have a v5.4 kernel yet, let's use the io-timeout value
conditionally based on the kernel version in our e2e.
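A possible shape for such a kernel version check, as a sketch only. The
helper name kernelAtLeast is hypothetical and is not the function used by
the e2e suite; the release strings are example `uname -r` outputs.

```go
package main

import "fmt"

// kernelAtLeast reports whether a `uname -r` style release string
// (for example "5.4.0-91-generic") is at least major.minor.
// Hypothetical helper for illustration.
func kernelAtLeast(release string, major, minor int) (bool, error) {
	var gotMajor, gotMinor int
	// Only the leading "<major>.<minor>" part is parsed; the rest is ignored.
	if _, err := fmt.Sscanf(release, "%d.%d", &gotMajor, &gotMinor); err != nil {
		return false, fmt.Errorf("cannot parse kernel release %q: %w", release, err)
	}
	if gotMajor != major {
		return gotMajor > major, nil
	}

	return gotMinor >= minor, nil
}

func main() {
	for _, release := range []string{"4.18.0-305.el8.x86_64", "5.4.0-91-generic"} {
		ok, err := kernelAtLeast(release, 5, 4)
		if err != nil {
			panic(err)
		}
		// --io-timeout=0 is only safe to use when the kernel has the fix.
		fmt.Printf("kernel %s: use --io-timeout=0: %t\n", release, ok)
	}
}
```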
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Currently, we get the kernel version where the e2e (client) executable runs,
not the kernel version that is used by the csi-rbdplugin pod.
Add a function that runs the `uname -r` command in the specified container and
returns the kernel version.
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Suggested-by: Niels de Vos <ndevos@redhat.com>
Ceph’s logging levels operate on a scale of 1 to 20, where 1 is terse
and 20 is verbose.
Format:
debug-{subsystem} = {log-level}
Set the `rbd` log level to 20 in our e2e tests.
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
The rbd-nbd resize volume support with its netlink interface needs Linux
kernel version >= v5.3.0.
Hence, add a defensive check for the supported kernel version.
Fixes: #2234
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
The kmsConfig type in the e2e suite has been enhanced with two functions
that make it possible to validate the destruction of deleted keys.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
By using retryKubectl helper function,
a retry will be done, and the known error
messages will be skipped.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
This gives the caller the ability to pass arguments
like ignore-not-found=true when executing
kubectl commands.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Added isAlreadyExistsCLIError to check for a known error.
If the error is "already exists", we consider it
a success.
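The idea can be sketched roughly as follows. This is not the actual
isAlreadyExistsCLIError implementation: the function name onlyAlreadyExists
is hypothetical, and the "Error from server (AlreadyExists)" prefix is an
assumption about how kubectl reports this error on stderr.

```go
package main

import (
	"fmt"
	"strings"
)

// onlyAlreadyExists reports whether every non-empty line of the kubectl
// stderr output is an AlreadyExists error. Hypothetical sketch; the matched
// prefix is an assumption about the kubectl output format.
func onlyAlreadyExists(stderr string) bool {
	found := false
	for _, line := range strings.Split(stderr, "\n") {
		line = strings.TrimSpace(line)
		if line == "" {
			continue
		}
		if !strings.HasPrefix(line, "Error from server (AlreadyExists)") {
			// Any other error means the command really failed.
			return false
		}
		found = true
	}

	return found
}

func main() {
	stderr := `Error from server (AlreadyExists): error when creating "STDIN": already exists
Error from server: error when creating "STDIN": etcdserver: request timed out`
	fmt.Println(onlyAlreadyExists(stderr))
}
```

Treating the result as success only when all reported errors are
AlreadyExists avoids hiding a real failure that happens to share the same
output.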
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
expandPVCSize() uses the namespace of the PVC that was checked. In case
the .Get() call fails, the PVC will not have its namespace set, and
subsequent tries will fail with errors like:
Error getting pvc in namespace: '': etcdserver: request timed out
waiting for PVC (9 seconds elapsed)
Error getting pvc in namespace: '': an empty namespace may not be set when a resource name is provided
By using the original namespace of the PVC stored in a separate variable
as is done with the name of the PVC, this problem should not occur
anymore.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
In case listing the Kubernetes Services fails, the following error is
returned immediately:
failed to create configmap with error failed to list services: etcdserver: request timed out
Wrapping the listing of the Services in a PollImmediate() routine adds
a retry in case of common temporary issues.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
registry.centos.org is not officially maintained by the CentOS
infrastructure team. The container images on quay.io are the official
ones and we should use those instead.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Until we have a real fix, let's sync the data from the application
pod to avoid the filesystem occasionally going read-only on nodeplugin
restart.
Updates: #2204
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
There are reports where deploying CephFS failed with etcdserver
timeouts:
INFO: Running '/usr/bin/kubectl --server=https://192.168.39.187:8443 --kubeconfig=/root/.kube/config --namespace=cephcsi-e2e-ea434921 create --namespace=cephcsi-e2e-ea434921 -f -'
INFO: rc: 1
FAIL: failed to create CephFS provisioner rbac with error error running /usr/bin/kubectl --server=https://192.168.39.187:8443 --kubeconfig=/root/.kube/config --namespace=cephcsi-e2e-ea434921 create --namespace=cephcsi-e2e-ea434921 -f -:
Command stdout:
role.rbac.authorization.k8s.io/cephfs-external-provisioner-cfg created
rolebinding.rbac.authorization.k8s.io/cephfs-csi-provisioner-role-cfg created
stderr:
Error from server: error when creating "STDIN": etcdserver: request timed out
Error from server: error when creating "STDIN": etcdserver: request timed out
Error from server: error when creating "STDIN": etcdserver: request timed out
error:
exit status 1
By using the retryKubectlInput() helper function, a retry will be done, and
the failure should not be fatal any longer.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
RBD image metadata keys that start with '.rbd' are expected to be
internal to RBD itself and are not mirrored to remote sites. Renaming
the keys (dropping the '.' prefix) and using the new MigrateMetadata()
function now makes the keys available on remote sites too.
Closes: #2219
Signed-off-by: Niels de Vos <ndevos@redhat.com>
framework.RunKubectl() returns an error that does not end with
"etcdserver: request timed out", but contains the text somewhere in the
middle:
error running /usr/bin/kubectl --server=https://192.168.39.57:8443 --kubeconfig=/root/.kube/config --namespace=cephcsi-e2e-a44ec4b4 create -f -:
Command stdout:
stderr:
Error from server: error when creating "STDIN": etcdserver: request timed out
error:
exit status 1
isRetryableAPIError() should return `true` for this case as well, so
instead of using HasSuffix(), we'll use Contains().
Signed-off-by: Niels de Vos <ndevos@redhat.com>
This commit fixes the godot, cyclop and
paralleltest linter errors introduced by merged PRs.
Updates: #1586
Signed-off-by: Yati Padia <ypadia@redhat.com>
nlreturn linter requires a new line before return
and branch statements except when the return is alone
inside a statement group (such as an if statement) to
increase code clarity. This commit addresses such issues.
Updates: #1586
Signed-off-by: Rakshith R <rar@redhat.com>