Before the change, the error message was the following:
```
failed to set VAULT_AUTH_MOUNT_PATH in Vault config: path is empty
```
`vaultAuthPath` is the actual variable name set by the
user. The error message will now be the following:
```
failed to set "vaultAuthPath" in vault config: path is empty
```
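As a rough illustration (not the actual ceph-csi code), the key can be
included with `%q` so that whatever name the user configured ends up
quoted in the message; the helper name and surrounding maps are
assumptions for this sketch:
```go
package vault

import (
	"errors"
	"fmt"
)

// errConfigMissing mimics the "path is empty" failure from the example above.
var errConfigMissing = errors.New("path is empty")

// setConfigString is a hypothetical helper: it copies one string option from
// the user-provided configuration and reports the user-facing key on failure.
func setConfigString(key string, config, target map[string]interface{}) error {
	value, ok := config[key].(string)
	if !ok || value == "" {
		// %q quotes the key exactly as the user wrote it, for example
		// `failed to set "vaultAuthPath" in vault config: path is empty`
		return fmt.Errorf("failed to set %q in vault config: %w", key, errConfigMissing)
	}

	target[key] = value
	return nil
}
```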
Signed-off-by: Rakshith R <rar@redhat.com>
In case the NFS-export has already been removed from the NFS-server, but
the CSI Controller was restarted, a retry to remove the NFS-volume will
fail with an error like:
> GRPC error: ....: response status not empty: "Export does not exist"
When this error is reported, assume the NFS-export was already removed
from the NFS-server configuration, and continue with deleting the
backend volume.
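A minimal sketch of tolerating the already-removed export; the
removeExport() and deleteBackendVolume() helpers are stand-ins invented
for illustration, only the "Export does not exist" text comes from the
reported error:
```go
package nfs

import (
	"context"
	"errors"
	"strings"
)

// removeExport and deleteBackendVolume are stubs that stand in for the real
// helpers in this sketch.
func removeExport(ctx context.Context, volID string) error {
	return errors.New(`response status not empty: "Export does not exist"`)
}

func deleteBackendVolume(ctx context.Context, volID string) error { return nil }

// purgeVolume removes the NFS-export and, when that succeeds or the export
// is already gone, continues with deleting the backend volume.
func purgeVolume(ctx context.Context, volID string) error {
	err := removeExport(ctx, volID)
	if err != nil && !strings.Contains(err.Error(), "Export does not exist") {
		// a real failure: keep the backend volume and report the error
		return err
	}

	// the export was removed now, or was gone already (for example after a
	// CSI Controller restart); the backend volume can be deleted safely
	return deleteBackendVolume(ctx, volID)
}
```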
Signed-off-by: Niels de Vos <ndevos@redhat.com>
This commit makes use of the latest livenessprobe and node-driver-registrar
sidecars in the NFS driver deployment.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
`bash -E` causes inheritance of the ERR trap into shell functions,
command substitutions, and commands executed in a subshell environment.
Because the `kubectl_retry` function depends on detecting an error in a
subshell, the ERR trap should not be executed there. The trap contains
extra logging, and in the `rook.sh` case it exits the script. Aborting
the script is not wanted when a retry is expected to be done.
While checking for known failures, the `grep` command may exit with 1
if there are no matches. In that case the `ret` variable will be set to
0, but there will also be an error exit status. This causes `bash -E` to
abort the function and call the ERR trap.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
We heavily use the service that Mergify offers to Open Source communities.
It is probably nice to promote them a little in our README.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
This commit removes the ClusterRole and ClusterRoleBinding of the CephFS
node plugin, as node RBAC is not needed for CephFS.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
Depending on the Kubernetes version, the following warning is reported
regularly:
> Warning: policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+,
> unavailable in v1.25+
The warning is written to stderr, so skipping AlreadyExists or NotFound
is not sufficient to trigger a retry. Ignoring '^Warning:' in the stderr
output should prevent unneeded failures while deploying Rook or other
components.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Rook deployments fail quite regularly in the CI environment now. It is
not clear what the cause is; hopefully a little better logging will
guide us to the issue.
`kubectl` is now executed in a sub-shell, ensuring that the redirection
of the command output lands in the right files.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
The project is currently at 54% of the best practices. Hopefully this
badge creates some interest in increasing the grade.
See-also: https://bestpractices.coreinfrastructure.org/projects/5940
Signed-off-by: Niels de Vos <ndevos@redhat.com>
When running the Kubernetes cluster with one single privileged
PodSecurityPolicy which allows everything, the nodeplugin daemonset can
fail to start. To be precise, the problem is the
defaultAllowPrivilegeEscalation: false configuration in the PSP.
Containers of the nodeplugin daemonset won't start when they have
privileged: true but no allowPrivilegeEscalation in their container
securityContext. Kubernetes will not schedule them if this mismatch
exists: "cannot set allowPrivilegeEscalation to false and privileged to
true".
Signed-off-by: Silvan Loser <silvan.loser@hotmail.ch>
Signed-off-by: Silvan Loser <33911078+losil@users.noreply.github.com>
When running the Kubernetes cluster with one single privileged
PodSecurityPolicy which allows everything, the nodeplugin daemonset can
fail to start. To be precise, the problem is the
defaultAllowPrivilegeEscalation: false configuration in the PSP.
Containers of the nodeplugin daemonset won't start when they have
privileged: true but no allowPrivilegeEscalation in their container
securityContext. Kubernetes will not schedule them if this mismatch
exists: "cannot set allowPrivilegeEscalation to false and privileged to
true".
Signed-off-by: Silvan Loser <silvan.loser@hotmail.ch>
Signed-off-by: Silvan Loser <33911078+losil@users.noreply.github.com>
Updated the documentation for the 3.6.1 release. This will be backported
to the release-v3.6 branch, where we will make the deployment changes
and do the release.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
The Ceph cluster-id is usually detected with `ceph fsid`. This is not
always correct, as the Ceph cluster can also be configured by name.
If -clusterid=... is passed, its value will be used instead of trying to
detect it with `ceph fsid`.
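A minimal sketch of that fallback, assuming a plain flag and a direct
`ceph fsid` call; the e2e tests run the command through the Rook Toolbox
instead, and the helper name is made up for illustration:
```go
package e2e

import (
	"flag"
	"os/exec"
	"strings"
)

// clusterID can be passed on the command line when the Ceph cluster is
// configured by name instead of by fsid.
var clusterID = flag.String("clusterid", "", "use this cluster-id instead of detecting it with 'ceph fsid'")

// getClusterID returns the -clusterid value when it is set, and only falls
// back to running `ceph fsid` when it is empty.
func getClusterID() (string, error) {
	if *clusterID != "" {
		return *clusterID, nil
	}

	out, err := exec.Command("ceph", "fsid").Output()
	if err != nil {
		return "", err
	}

	return strings.TrimSpace(string(out)), nil
}
```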
Signed-off-by: Niels de Vos <ndevos@redhat.com>
There are many locations where the cluster-id (`ceph fsid`) is obtained
from the Rook Toolbox. Instead of duplicating the code everywhere, use a
new helper function getClusterID().
Signed-off-by: Niels de Vos <ndevos@redhat.com>
A new -filesystem=... option has been added so that the e2e tests can
run against environments that do not have a "myfs" CephFS filesystem.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
StorageClasses are cluster resources, not namespaced; there is no need
to log the namespace of a StorageClass.
When creating a StorageClass, NotFound is not an error that will be
returned, so there is no need to check for it.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
On occasion the creation of the StorageClass can fail due to an
etcdserver timeout. If that happens, the creation can be attempted after
a delay.
This has already been done for CephFS StorageClasses, but was missed for
RBD.
See-also: ceph/ceph-csi@8a0377ef02
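A rough sketch of such a retry loop, assuming client-go and an arbitrary
interval/timeout; the real e2e helper may look different:
```go
package e2e

import (
	"context"
	"strings"
	"time"

	storagev1 "k8s.io/api/storage/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/wait"
	"k8s.io/client-go/kubernetes"
)

// createStorageClassWithRetry retries the creation when the API server
// reports an etcdserver problem; interval and timeout are placeholders.
func createStorageClassWithRetry(c kubernetes.Interface, sc *storagev1.StorageClass) error {
	return wait.PollImmediate(5*time.Second, 2*time.Minute, func() (bool, error) {
		_, err := c.StorageV1().StorageClasses().Create(context.TODO(), sc, metav1.CreateOptions{})
		if err == nil {
			return true, nil
		}
		if strings.Contains(err.Error(), "etcdserver") {
			// temporary etcdserver timeout, try again after the delay
			return false, nil
		}
		return false, err
	})
}
```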
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Some parts of the Context() seem to get executed, even when BeforeEach()
did a Skip() for the test. By adding a return inside the Context(), the
tests should not get executed at all.
This was noticed in a failed test, where the upgrade was running even
though the job was executed as a normal non-upgrade one.
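A minimal ginkgo-style sketch of the early return; the suite name and
the `upgradeTesting` flag are assumptions for illustration:
```go
package e2e

import (
	. "github.com/onsi/ginkgo"
)

// upgradeTesting would normally be set from a command line flag; it is a
// plain package variable here to keep the sketch self-contained.
var upgradeTesting bool

var _ = Describe("RBD Upgrade Testing", func() {
	BeforeEach(func() {
		if !upgradeTesting {
			Skip("Skipping RBD Upgrade Testing")
		}
	})

	Context("Test Upgrade of the driver", func() {
		// Skip() in BeforeEach() does not stop this function from being
		// evaluated while the spec tree is built, so return early to avoid
		// registering the upgrade steps for non-upgrade jobs.
		if !upgradeTesting {
			return
		}

		It("should upgrade the deployment", func() {
			// ... upgrade steps ...
		})
	})
})
```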
Signed-off-by: Niels de Vos <ndevos@redhat.com>
The current version of Mergify provides a `requeue` command in addition
to `refresh`. After a CI job failed, the PR needs to be re-added to the
queue, so the `requeue` command is more appropriate.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
This commit changes the image registry URL for sidecars in the
deployment from `k8s.gcr.io` to `registry.k8s.io`, as the migration is
happening from the former to the latter. This commit also corrects the
e2e README for the change.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
This commit changes the image registry URL for sidecars in the
NFS deployment from `k8s.gcr.io` to `registry.k8s.io`, as the migration
is happening from the former to the latter.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
This commit changes the image registry URL for sidecars in the
RBD deployment from `k8s.gcr.io` to `registry.k8s.io`, as the migration
is happening from the former to the latter.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
This commit changes the image registry URL for sidecars in the
CephFS deployment from `k8s.gcr.io` to `registry.k8s.io`, as the
migration is happening from the former to the latter.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
As the same host directory is not shared between the CephFS and the RBD
plugin pods, we need to keep the netNamespaceFilePath separately for
both CephFS and RBD. The CephFS plugin will use this path to execute
mount -t commands.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
As radosNamespace is specific to RBD and not part of the general Ceph
configuration, we introduce a new RBD section for RBD-specific options,
moving radosNamespace into the RBD section while still keeping
radosNamespace under the global Ceph-level configuration for backward
compatibility.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
As the netNamespaceFilePath can be different for CephFS and RBD, add a
netNamespaceFilePath for RBD as well. This will help us to keep the
RBD- and CephFS-specific options separate.
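A rough sketch of how the per-cluster configuration could look with the
new RBD section; the Go struct and JSON keys follow the commit messages
but are otherwise assumptions, not the exact ceph-csi configuration:
```go
package util

// ClusterInfo sketches the per-cluster configuration after these changes;
// the real ceph-csi configuration has more fields, this only shows the
// parts the commit messages mention.
type ClusterInfo struct {
	ClusterID string   `json:"clusterID"`
	Monitors  []string `json:"monitors"`

	// kept at the global Ceph level only for backward compatibility,
	// new deployments should use RBD.RadosNamespace instead
	RadosNamespace string `json:"radosNamespace"`

	CephFS struct {
		// network namespace the CephFS plugin uses for mount -t commands
		NetNamespaceFilePath string `json:"netNamespaceFilePath"`
	} `json:"cephFS"`

	RBD struct {
		// RBD-specific network namespace, separate from the CephFS one as
		// the plugin pods do not share the same host directory
		NetNamespaceFilePath string `json:"netNamespaceFilePath"`
		RadosNamespace       string `json:"radosNamespace"`
	} `json:"rbd"`
}
```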
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
The NFS Controller returns a non-gRPC error in case the CreateVolume
call for the CephFS volume fails. It is better to return the gRPC-error
that the CephFS Controller passed along.
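A minimal sketch of passing the error through unchanged; the struct and
field names are assumptions, not the actual NFS controller code:
```go
package nfs

import (
	"context"

	"github.com/container-storage-interface/spec/lib/go/csi"
)

// ControllerServer is a stand-in for the NFS controller; cephFSServer is the
// CephFS controller it delegates the volume creation to.
type ControllerServer struct {
	cephFSServer csi.ControllerServer
}

// CreateVolume delegates the CephFS volume creation. On failure the
// gRPC-error is returned as-is instead of being wrapped in a plain error,
// so the caller still sees the original gRPC status code.
func (cs *ControllerServer) CreateVolume(ctx context.Context, req *csi.CreateVolumeRequest) (*csi.CreateVolumeResponse, error) {
	res, err := cs.cephFSServer.CreateVolume(ctx, req)
	if err != nil {
		// do not wrap err here; wrapping would hide the gRPC status set by
		// the CephFS Controller
		return nil, err
	}

	return res, nil
}
```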
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Kubernetes CSI now hosts the container-image for the NFS-nodeplugin in
the k8s.gcr.io registry instead of the Microsoft registry.
See-also: kubernetes-csi/csi-driver-nfs@7b5b6f344
Signed-off-by: Niels de Vos <ndevos@redhat.com>
The NFS-Admin API has been added to go-ceph v0.15.0. As the API cannot
be tested in the go-ceph CI, it requires the build-tag `ceph_ci_untested`.
This additional build-tag has been added to the `Makefile` and should be
removed when the API does not require the build-tag anymore.
See-also: ceph/go-ceph#655
Signed-off-by: Niels de Vos <ndevos@redhat.com>