The Ceph v17.2.2 container-image fails to start Ceph Mgr. This causes
issues while the e2e test suite is running. It is better to check if
Ceph Mgr is available, before continuing with the rest of the CI job.
Updates: #3259
Signed-off-by: Niels de Vos <ndevos@redhat.com>
PodSecurity featuregate is beta in kubernetes
1.23 and its causing problem for the existing
tests. This PR disables the PodSecurity featuregate
for now and will be enabled later.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Still seeing the issue of the commitlint
as below
fatal: unsafe repository
('/go/src/github.com/ceph/ceph-csi'
is owned by someone else)
To add an exception for this directory,
call:
git config --global --add safe.directory \
/go/src/github.com/ceph/ceph-csi
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
`bash -E` causes inheritance of the ERR trap into shell functions,
command substitutions, and commands executed in a subshell environment.
Because the `kubectl_retry` function depends on detection an error of a
subshell, the ERR trap is not needed to be executed. The trap contains
extra logging, and exits the script in the `rook.sh` case. The aborting
of the script is not wanted when a retry is expected to be done.
While checking for known failures, the `grep` command may exit with 1,
if there are no matches. That means, the `ret` variable will be set to
0, but there will also be an error exit status. This causes `bash -E` to
abort the function, and call the ERR trap.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Depending on the Kubernetes version, the following warning is reported
regulary:
> Warning: policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+,
> unavailable in v1.25+
The warning is written to stderr, so skipping AlreadyExists or NotFound
is not sufficient to trigger a retry. Ignoring '^Warning:' in the stderr
output should prevent unneeded failures while deploying Rook or other
components.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Rook deployments fail quite regulary in the CI environment now. It is
not clear what the cause is, hopefully a little better logging will
guide us to the issue.
Now executing `kubectl` in a sub-shell, ensuring that the redirection of
the command lands in the right files.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
This commit change the image registry URL for sidecars in the
deployment from `k8s.gcr.io` to `registry.k8s.io` as
the migration is happening from former to the latter. This commit
also correct the e2e readme for the change.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
Considering snapshot controllers have been moved to GA since
kube version 1.20, we no longer need to have a mention of beta
version of the same in our deployment.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
The sidecar images in minikube deployment will be fetched from
build.env and used/validated accordingly.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
The CentOS 8 repository for Apache Arrow has been removed. This causes
container-image builds fail with the following error:
Errors during downloading metadata for repository 'apache-arrow-centos':
- Status code: 404 for https://apache.jfrog.io/artifactory/arrow/centos/8/x86_64/repodata/repomd.xml (IP: 54.190.66.70)
Error: Failed to download metadata for repo 'apache-arrow-centos': Cannot download repomd.xml: Cannot download repodata/repomd.xml: All mirrors were tried
The Ceph base image has `arrow/centos/8` configured, maybe Apache Arrow
offers a CentOS Stream 8 repository now? Once the Ceph container-image
has been updated, the repository can be enabled again.
Ceph-CSI does not depend on Apache Arrow, so there is no functional
change by disabling the repository.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
in Rook 1.8 templates the failureDomain is
set to host for ec pools, as we are using
single node minikube cluster, setting the
failureDomain for osd.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Rook creates a detault pool with name
device_health_metrics for
device-health-metrics CephBlockPool CR.
device-health-metrics is added to cluster-test.yaml
in Rook 1.8
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
kubectl log with labels will log only
last 10 lines by default adding tail=-1
to log complete output.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
In Rook v1.8+ the path for the deployment articafts
are changed from `"cluster/examples/kubernetes/ceph`
to `deploy/examples`.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
currently, rook.sh code is not aligned properly
unside the functions, this commit does
the code alignment.
PS: this is done by vscode for me :)
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
we dont need to specify the image version
separately in minikube script, use the image
version defined in the build.env
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
The ReadWriteOncePod feature gate need to be enabled only when we
are operating on kube 1.22 or above cluster. This commit adds the
logic to parse the kubernetes cluster at time of minikube deployment
and if it is above v1.22, enable the RWOP feature gate
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
The golangci-lint install script at goreleaser.com is deprecated. Docs
now advise to install from a github link:
goreleaser/godownloader#207
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Instead of installing the amd64 on all the
platforms, install architecture specific go
version in test dockerfile.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Instead of installing the amd64 on all the
platforms, install architecture specific go
version for devel dockerfile
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
currently getting find command not found
and xargs command not found when we run
the dockerfile.test. installing findutils
to fix it.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
The `expandCSIVolumes` feature gate is beta since kubernetes 1.16
version and we no longer wanted to explictly enable it in the
deployment.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
as there are lot of spell check failures
on the vendor directory, skipping it.
skipping retest action folder as
PullRequests word is not getting ignored
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
we need to start minikube with the
PodSecurityPolicy admission controller
and the pod-security-policy addon enabled
for psp.
Ref:- https://minikube.sigs.k8s.io/docs/
tutorials/using_psp/#tutorial
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
as we are no longer using older (<1.11.1)
version of minikube, removing the
workaround to support the psp with
older minikube versions.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Having double quotes around $DISK_CONFIG led to these args
not being properly passed to minikube start. This commit fixes it.
Signed-off-by: Rakshith R <rar@redhat.com>
commitlint 13.1.0 is causing issues when
PR is backported from devel branch to release
branch
https://github.com/ceph/ceph-csi/pull/2332#issuecomment-888325775
Lets revert back to commitlint 12.1.4 where we have
not seen any issue with backports to release
branch.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
When running 'make containerized-test' the following error gets
reported:
yamllint -s -d '{extends: default, rules: {line-length: {allow-non-breakable-inline-mappings: true}},ignore: charts/*/templates/*.yaml}' ./scripts/golangci.yml
./scripts/golangci.yml
179:81 error line too long (84 > 80 characters) (line-length)
The golangci.yml.in is used to generate golangci.yml, addressing the
line-length there resolves the issue.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
This commit disables the forbidigo linter as
this linter forbids the use of fmt.Printf
but we need to use it in various part of
our codebase.
Updates: #1586
Signed-off-by: Yati Padia <ypadia@redhat.com>
This commit disables the exhaustivestruct linter
as it is meant to be used only for special cases.
We don't need to enable this for our project.
Fixes: #2224
Signed-off-by: Yati Padia <ypadia@redhat.com>
Since we parse multiple options from command line in install-helm,
script will not recognize any random string directly as namespace.
To inform script about namespace, `--namespace` flag is used before
passing namespace.
Signed-off-by: Yug <yuggupta27@gmail.com>
Update install-helm script to accept `--deploy-sc` and
`--deploy-secret` as optional parameters when deploying
ceph-csi via helm charts.
Signed-off-by: Yug <yuggupta27@gmail.com>
`lll` linter configuration was made for 180 chars till now due to
long source code signatures/comments we had in the project. This
address it and enabling linter with 120 chars check.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
Getting rid of function keyword for two reasons:
1. Defining a function without 'function' keyword is more
portable as it is compatible with Bourne/Korn/POSIX scripts
2. To ensure the coding style is same for the file.
Signed-off-by: Yug <yuggupta27@gmail.com>
The current approach uses hard-coded command line
arguments which is not very robust;
To maintain backward compatibility, script will
also keep working as the previous approach.
Signed-off-by: Yug <yuggupta27@gmail.com>
Sometimes there is an unclear error while starting the Minikube VM.
Possibly this can be worked around by passing the --delete-on-failure
option:
If set, delete the current cluster if start fails and try again. Defaults to false.
See-also: https://minikube.sigs.k8s.io/docs/commands/start/
Signed-off-by: Niels de Vos <ndevos@redhat.com>
As Travis CI `https://travis-ci.org/` is getting
shutdown date on June 15th. Either we need to move
to new place https://www.travis-ci.com/ or we can
switch to github action to push image and the helm
charts when a PR is merged.
fixes: #1781
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
minikube waits for 6 minutes for the minikube or
host to be ready, due to resource issue sometimes
the host/minikube might take longer time to start
the cluster. Increase the timeout to 10m by default
updates #1969
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Test if metrics are available at all. The actual values are a little
difficult to validate.
BlockMode volumes support metrics since Kubernetes 1.22.
See-also: kubernetes/kubernetes#97972
Signed-off-by: Niels de Vos <ndevos@redhat.com>
This commit disables mon,mgr and mds liveness probe
which on failing caused `crashLoopBackOff` state.
Updates: #2094
Signed-off-by: Rakshith R <rar@redhat.com>
BlockVolume, CSIBlockVolume(GA since k8s v1.18) & VolumeSnapshotDataSource
(GA since k8s v1.20) default to true and don't need to be set to true in
feature gates setting.
Signed-off-by: Rakshith R <rar@redhat.com>
The nestif linter reports deeply nested if statements.
This commit enables the nestif linter and sets
min-complexity to 7.
Closes: #1229
Signed-off-by: Rakshith R <rar@redhat.com>
We have a new release v4.0.0 of
https://github.com/kubernetes-csi/external-snapshotter
Adjusting SNAPSHOT_VERSION will pull the latest controller and CRDs
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Changes:
1. Add a variable in build.env for rook ceph cluster version.
2. Modify rook.sh so that it can deploy ceph cluster with
desirable version also rather than the one which rook installs
by default.
3. Remove the code which is no longer required:
a. Code which was added to test snapshot feature.
b. Code which was required because
https://github.com/rook/rook/pull/5925 was not fixed.
Signed-off-by: Mudit Agarwal <muagarwa@redhat.com>
It seems that recent minikube versions changed something in the
networking, and that prevents
$ ceph fs subvolumegroup create myfs testGroup
from working. Strangely RBD is not impacted. Possibly something is
confusing the CephMgr pod that handles the CephFS admin commands.
Using the "bridge" CNI seems to help, CephFS admin commands work with
this in minikube.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
While deploying Rook, there can be issues when the environment is not
completely settled yet. On occasion the 1st kubectl command fails with
The connection to the server ... was refused - did you specify the right host or port?
This would set the 'ret' variable to a non-zero value, before the next
retry of the kubectl command is done. In case the kubectl command
succeeds, the 'ret' variable still contains the old non-zero value, and
kubectl_retry returns the incorrect result.
By setting the 'ret' variable to 0 before calling kubectl again, this
problem is prevented.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
This PR makes the changes in csi templates and
upgrade documentation required for updating
csi sidecar images.
Signed-off-by: Mudit Agarwal <muagarwa@redhat.com>
Depending on the local changes, running 'make containerized-test' fails
with an error like:
level=error msg="Running error: gofmt: error computing diff: exec: \"diff\": executable file not found in $PATH"
Installing the diffutils package makes sure 'go fmt' finds the
executable.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
It seems that the new log_errors() function does not get triggered when
the script hits `exit 1` conditions in functions. The functions should
return a non-0 value, not cause an exit of the script.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Log a few commands that help troubleshooting Rook deployment issues.
This might need to get extended with more commands.
Updates: #1636
Signed-off-by: Niels de Vos <ndevos@redhat.com>
An rbd image can have a maximum number of
snapshots defined by maxsnapshotsonimage
On the limit is reached the cephcsi will
start flattening the older snapshots and
returns the ABORT error message, The Request
comes after this as to wait till all the
images are flattened (this will increase the
PVC creation time. Instead of waiting till
the maximum snapshots on an RBD image, we can
have a soft limit, once the limit reached
cephcsi will start flattening the task to
break the chain. With this PVC creation time
will only be affected when the hard limit
(minsnapshotsonimage) reached.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
The GitHub style for Pull Request and Issue templates add HTML tags for
some advanced usage. The MarkDown linter should not give warnings when
these are used.
Signed-off-by: Niels de Vos <ndevos@redhat.com>