On occasion the Pods have not been (re)started before they get listed.
This can result in an empty list. It can occur during RBD testing where
Pods are restarted before `uname` is executed. In case the Pods are not
available yet, the test will fail with the "podlist is empty" error.
By adding a retry when the list of Pods is empty, the tests should
become a little more stable.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Some of the deployment artifacts refer to others (like ServiceAccount in
a Deployment). If the dependencies are not available (yet), there will
be errors reported in the logs. By deploying the components in a more
correct order, fewer errors are reported, making the logs a little
easier to understand.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Resizing is handled by the csi-resizer container, which needs to run in
the provisioner Pod. In addition to the container, the StorageClass also
needs to allow volume expansion.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
There is not much the NFS-provisioner needs to do to expand a volume,
everything is handled by the CephFS components.
NFS does not need a resize on the node, so only ControllerExpandVolume
is required.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
If the `ci/skip/multi-arch-build` label is set on a PR, the GitHub
Workflow only builds for the local architecture. This makes it possible
to merge PRs faster.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
When testing NFS-provisioning on a cluster that has an NFS-provisioner
and node-plugins deployed with a different driver-name, it is very
useful to have a commandline option to change the name of the
provisioner that is placed in the StorageClass.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
NFS testing will automatically be enabled when CephFS is enabled. This
makes sure the NFS tests run in the CI where there are different jobs
for CephFS and RBD. With a dedicated testNFS variable, it is still
possible to only run the NFS tests, when both CephFS and RBD are
disabled.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
The tests for the NFS-provisioner can be run by passing -deploy-nfs and
-test-nfs as parameters to the `go test` or `e2e.test` command.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
"nfs-ganesha" is the default pool for older Ceph versions, recent
versions use ".nfs" (which can not be changed in the CephNFS resource).
Signed-off-by: Niels de Vos <ndevos@redhat.com>
This should address the following failure when Pod Security Policies are
enabled:
> FailedCreate: Error creating: pods "csi-nfs-node-" is forbidden:
> PodSecurityPolicy: unable to admit pod: spec.containers[2].hostPort:
> Invalid value: 29653: Host port 29653 is not allowed to be used.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
For the default mounter the mounter option
will not be set in the storageclass and as it is
not available in the storageclass same will not
be set in the volume context, Because of this the
mapOptions are getting discarded. If the mounter
is not set assuming it's an rbd mounter.
Note:- If the mounter is not set in the storageclass
we can set it in the volume context explicitly,
Doing this check-in node server to support backward
existing volumes and the check is minimal we are not
altering the volume context.
fixes: #3076
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
PodSecurity featuregate is beta in kubernetes
1.23 and its causing problem for the existing
tests. This PR disables the PodSecurity featuregate
for now and will be enabled later.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Still seeing the issue of the commitlint
as below
fatal: unsafe repository
('/go/src/github.com/ceph/ceph-csi'
is owned by someone else)
To add an exception for this directory,
call:
git config --global --add safe.directory \
/go/src/github.com/ceph/ceph-csi
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Commitlint fails with errors like:
```
git fetch -v origin devel
fatal: unsafe repository ('/go/src/github.com/ceph/ceph-csi' is owned by
someone else)
To add an exception for this directory, call:
git config --global --add safe.directory /go/src/github.com/ceph/ceph-csi
make: *** [Makefile:153: commitlint] Error 128
```
By not setting the option with actions/checkout@v3, the error should not
happen anymore.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
With cgroup v2, the location of the pids.max file changed and so did the
/proc/self/cgroup file
new /proc/self/cgroup file
`
0::/user.slice/user-500.slice/session-14.scope
`
old file:
`
11:pids:/user.slice/user-500.slice/session-2.scope
10:blkio:/user.slice
9:net_cls,net_prio:/
8:perf_event:/
...
`
There is no directory per subsystem (e.g. /sys/fs/cgroup/pids) any more, all
files are now in one directory.
fixes: https://github.com/ceph/ceph-csi/issues/3085
Signed-off-by: Marcus Röder <m.roeder@yieldlab.de>
added getPersistentVolume helper function
to get the PV and also try if there is any API
error to improve the CI.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
added getPersistentVolumeClaim helper function
to get the PVC and also try if there is any API
error to improve the CI.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
This commit is added to use canary csi-provisioner image
to test different sc pvc-pvc cloning feature, which is not
yet present in released versions.
refer:
https://github.com/kubernetes-csi/external-provisioner/pull/699
Signed-off-by: Rakshith R <rar@redhat.com>
This commit makes modification so as to allow pvc-pvc clone
with different storageclass having different encryption
configs.
This commit also modifies `copyEncryptionConfig()` to
include a `isEncrypted()` check within the function.
Signed-off-by: Rakshith R <rar@redhat.com>
Before the change, the error msg was the following:
```
failed to set VAULT_AUTH_MOUNT_PATH in Vault config: path is empty
```
`vaultAuthPath` is the actual variable name set by the
user. The error message will now be the following:
```
failed to set "vaultAuthPath" in vault config: path is empty
```
Signed-off-by: Rakshith R <rar@redhat.com>
In case the NFS-export has already been removed from the NFS-server, but
the CSI Controller was restarted, a retry to remove the NFS-volume will
fail with an error like:
> GRPC error: ....: response status not empty: "Export does not exist"
When this error is reported, assume the NFS-export was already removed
from the NFS-server configuration, and continue with deleting the
backend volume.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
This commit make use of latest sidecars of livenessprobe and
node driver registrar in NFS driver deployment.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
`bash -E` causes inheritance of the ERR trap into shell functions,
command substitutions, and commands executed in a subshell environment.
Because the `kubectl_retry` function depends on detection an error of a
subshell, the ERR trap is not needed to be executed. The trap contains
extra logging, and exits the script in the `rook.sh` case. The aborting
of the script is not wanted when a retry is expected to be done.
While checking for known failures, the `grep` command may exit with 1,
if there are no matches. That means, the `ret` variable will be set to
0, but there will also be an error exit status. This causes `bash -E` to
abort the function, and call the ERR trap.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
We heavily use the service for Open Source communities from Mergify. It
is probably nice to promote them a little in our README.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
This commit remove the clusterRole and Binding of cephfs node plugin
as the node RBAC is not needed for CephFS.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
Depending on the Kubernetes version, the following warning is reported
regulary:
> Warning: policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+,
> unavailable in v1.25+
The warning is written to stderr, so skipping AlreadyExists or NotFound
is not sufficient to trigger a retry. Ignoring '^Warning:' in the stderr
output should prevent unneeded failures while deploying Rook or other
components.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Rook deployments fail quite regulary in the CI environment now. It is
not clear what the cause is, hopefully a little better logging will
guide us to the issue.
Now executing `kubectl` in a sub-shell, ensuring that the redirection of
the command lands in the right files.
Signed-off-by: Niels de Vos <ndevos@redhat.com>