We need
https://www.mail-archive.com/linux-block@vger.kernel.org/msg38060.html
inorder to use `--io-timeout=0`. This patch is part of kernel 5.4
Since minikube doesn't have a v5.4 kernel yet, lets use io-timeout value
conditionally based on kernel version at our e2e.
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Currently, we get the kernel version where the e2e (client) executable runs,
not the kernel version that is used by the csi-rbdplugin pod.
Add a function that run `uname -r` command from the specified container and
returns the kernel version.
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Suggested-by: Niels de Vos <ndevos@redhat.com>
Ceph’s logging levels operate on a scale of 1 to 20, where 1 is terse
and 20 is verbose.
Format:
debug-{subsystem} = {log-level}
Setting `rbd` loglevel to 20 at our e2e tests.
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
The rbd-nbd resize volume support with its netlink interface needs linux
kernel version >= v5.3.0
Hence define a defence check for the supported kernel version
Fixes: #2234
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
The kmsConfig type in the e2e suite has been enhanced with two functions
that make it possible to validate the destruction of deleted keys.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
By using retryKubectl helper function,
a retry will be done, and the known error
messages will be skipped.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
this provides caller ability to pass the arguments
like ignore-not-found=true etc when executing
the kubectl commands.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
added isAlreadyExistsCLIError to check for known error.
if error is already exists we are considering it
as a success.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
expandPVCSize() uses the namespace of the PVC that was checked. In case
the .Get() call fails, the PVC will not have its namespace set, and
subsequent tries will fail with errors like:
Error getting pvc in namespace: '': etcdserver: request timed out
waiting for PVC (9 seconds elapsed)
Error getting pvc in namespace: '': an empty namespace may not be set when a resource name is provided
By using the original namespace of the PVC stored in a separate variable
as is done with the name of the PVC, this problem should not occur
anymore.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
In case listing the Kubernetes Services fails, the following error is
returned immediately:
failed to create configmap with error failed to list services: etcdserver: request timed out
Wrapping the listing of the Services in a PollImmediate() routine, adds
a retry in case of common temporary issues.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
registry.centos.org is not officially maintained by the CentOS
infrastructure team. The container images on quay.io are the official
once and we should use those instead.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Until we have a real fix, just to avoid occasionally file system entering
into read-only on nodeplugin restart, lets sync data from the application
pod.
Updates: #2204
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
There are reports where CephFS deploying failed with etcdserver
timeouts:
INFO: Running '/usr/bin/kubectl --server=https://192.168.39.187:8443 --kubeconfig=/root/.kube/config --namespace=cephcsi-e2e-ea434921 create --namespace=cephcsi-e2e-ea434921 -f -'
INFO: rc: 1
FAIL: failed to create CephFS provisioner rbac with error error running /usr/bin/kubectl --server=https://192.168.39.187:8443 --kubeconfig=/root/.kube/config --namespace=cephcsi-e2e-ea434921 create --namespace=cephcsi-e2e-ea434921 -f -:
Command stdout:
role.rbac.authorization.k8s.io/cephfs-external-provisioner-cfg created
rolebinding.rbac.authorization.k8s.io/cephfs-csi-provisioner-role-cfg created
stderr:
Error from server: error when creating "STDIN": etcdserver: request timed out
Error from server: error when creating "STDIN": etcdserver: request timed out
Error from server: error when creating "STDIN": etcdserver: request timed out
error:
exit status 1
By using retryKubectlInput() helper function, a retry will be done, and
the failure should not be fatal any longer.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
RBD image metadata keys that start with '.rbd' are expected to be
internal to RBD itself and are not mirrored to remote sites. Renaming
the keys (dropping the '.' prefix) and using the new MigrateMetadata()
function now makes the keys available on remote sites too.
Closes: #2219
Signed-off-by: Niels de Vos <ndevos@redhat.com>
framework.RunKubectl() returns an error that does not end with
"etcdserver: request timed out", but contains the text somewhere in the
middle:
error running /usr/bin/kubectl --server=https://192.168.39.57:8443 --kubeconfig=/root/.kube/config --namespace=cephcsi-e2e-a44ec4b4 create -f -:
Command stdout:
stderr:
Error from server: error when creating "STDIN": etcdserver: request timed out
error:
exit status 1
isRetryableAPIError() should return `true` for this case as well, so
instead of using HasSuffix(), we'll use Contains().
Signed-off-by: Niels de Vos <ndevos@redhat.com>
This commit modifies the error of godot, cyclop,
paralleltest linter caused due to merged PRs.
Updates: #1586
Signed-off-by: Yati Padia <ypadia@redhat.com>
nlreturn linter requires a new line before return
and branch statements except when the return is alone
inside a statement group (such as an if statement) to
increase code clarity. This commit addresses such issues.
Updates: #1586
Signed-off-by: Rakshith R <rar@redhat.com>
Sometimes there are failures in the e2e suite when connecting to the
etcdserver fails. The following error was caught:
INFO: Error getting pvc "rbd-pvc" in namespace "rbd-1318": Get "https://192.168.39.222:8443/api/v1/namespaces/rbd-1318/persistentvolumeclaims/rbd-pvc": dial tcp 192.168.39.222:8443: connect: connection refused
FAIL: failed to create PVC with error failed to get pvc: Get "https://192.168.39.222:8443/api/v1/namespaces/rbd-1318/persistentvolumeclaims/rbd-pvc": dial tcp 192.168.39.222:8443: connect: connection refused
If etcdserver was only briefly unavailable, one or more retries might be
sufficient to have the test pass.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Sometimes it happens that the deployment of Hashicorp Vault fails.
Deployment is one of the 1st steps that are done when starting the e2e
suite, and the Kubernetes cluster may still be a little overloaded while
it is settling down. It should be possible to retry and succeed after a
while.
Fixes: #2288
Signed-off-by: Niels de Vos <ndevos@redhat.com>
This commit resolves errorlint issues
which checks for the code that will cause
problems with the error wrapping scheme.
Updates: #1586
Signed-off-by: Yati Padia <ypadia@redhat.com>
revive linter checks for var-declaration
format.
For example:
"e2e/rbd_helper.go:441:36: var-declaration:
should drop = nil from declaration of
var noPVCValidation; it is the zero value (revive)
var noPVCValidation validateFunc = nil"
Updates: #1586
Signed-off-by: Yati Padia <ypadia@redhat.com>
warnings from golangci-lint:
e2e/pod.go:207:122: directive `//nolint:unparam,lll // cn can be used
with different inputs later` is unused for linter unparam (nolintlint)
func execCommandInContainer(f *framework.Framework, c, ns, cn string,
opt *metav1.ListOptions) (string, string, error) { //nolint:unparam,lll
// cn can be used with different inputs later
e2e/pod.go:307:70: directive `//nolint:unparam // skipNotFound can be
used with different inputs later` is unused for linter unparam (nolintlint)
func deletePodWithLabel(label, ns string, skipNotFound bool) error {
//nolint:unparam // skipNotFound can be used with different inputs later
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Now that the healer functionaity for mounter processes is available,
lets start, using it.
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
When an error occurs, the pvc object is overwritten in the
PollImmediate() loop. Re-using the pvc.Namespace results in error
messages like
Error getting pvc in namespace: '': an empty namespace may not be set when a resource name is provided
and prevents the retry by PollImmediate() to never succeed. Storing the
namespace in a local variable prevents this from happening.
Reported-by: Rakshith R <rar@redhat.com>
Signed-off-by: Niels de Vos <ndevos@redhat.com>
There are regular CI failures where etcdserver times out. These errors
seem not to get caught by any of the existing error comparing. Matching
the error by string should prevent temporary etcdserver issues now too.
Updates: #2218Closes: #1969
Signed-off-by: Niels de Vos <ndevos@redhat.com>
This commit resolves godot linter issue
which says "Comment should end in a period (godot)".
Updates: #1586
Signed-off-by: Yati Padia <ypadia@redhat.com>
This adds a new `kmsConfig` interface that can be used to validate
different KMS services and setting. It makes checking for the available
support easier, and fetching the passphrase simpler.
The basicKMS mirrors the current validation of the KMS implementations
that use secrets and metadata. vaultKMS can be used to validate the
passphrase stored in a Vault service.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
This commit adds e2e for user secret based metadata encryption,
adds user-secret.yaml and makes required changes in kms-connection-details,
kms-config yamls.
Signed-off-by: Rakshith R <rar@redhat.com>