While deploying Rook, there can be issues when the environment is not
completely settled yet. On occasion the 1st kubectl command fails with
The connection to the server ... was refused - did you specify the right host or port?
This would set the 'ret' variable to a non-zero value, before the next
retry of the kubectl command is done. In case the kubectl command
succeeds, the 'ret' variable still contains the old non-zero value, and
kubectl_retry returns the incorrect result.
By setting the 'ret' variable to 0 before calling kubectl again, this
problem is prevented.
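The retry logic described above can be sketched roughly as follows. This is not the actual kubectl_retry function: `run_with_retry`, `MAX_RETRIES` and `RETRY_DELAY` are hypothetical names, and the helper wraps any command rather than only kubectl.

```shell
# Hypothetical stand-in for kubectl_retry: runs any command with retries.
# MAX_RETRIES and RETRY_DELAY are assumed names, not taken from the script.
run_with_retry() {
    local retries=0 ret=0
    "$@" || ret=$?
    while [ "${ret}" -ne 0 ] && [ "${retries}" -lt "${MAX_RETRIES:-5}" ]; do
        retries=$((retries + 1))
        sleep "${RETRY_DELAY:-1}"
        # reset 'ret' so a successful retry is not reported as a failure
        ret=0
        "$@" || ret=$?
    done
    return "${ret}"
}
```

Without the `ret=0` inside the loop, the `|| ret=$?` never runs on a successful retry, so the stale non-zero value would be returned.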
Signed-off-by: Niels de Vos <ndevos@redhat.com>
This PR makes the changes in csi templates and
upgrade documentation required for updating
csi sidecar images.
Signed-off-by: Mudit Agarwal <muagarwa@redhat.com>
Depending on the local changes, running 'make containerized-test' fails
with an error like:
level=error msg="Running error: gofmt: error computing diff: exec: \"diff\": executable file not found in $PATH"
Installing the diffutils package makes sure 'go fmt' finds the
executable.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
It seems that the new log_errors() function does not get triggered when
the script hits `exit 1` conditions in functions. The functions should
return a non-0 value, not cause an exit of the script.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Log a few commands that help with troubleshooting Rook deployment issues.
This might need to get extended with more commands.
Updates: #1636
Signed-off-by: Niels de Vos <ndevos@redhat.com>
An RBD image can have a maximum number of
snapshots, defined by maxsnapshotsonimage.
Once that limit is reached, cephcsi starts
flattening the older snapshots and returns an
ABORT error message. Subsequent requests have
to wait until all the images are flattened,
which increases the PVC creation time.
Instead of waiting until the maximum number of
snapshots on an RBD image is reached, we can
have a soft limit (minsnapshotsonimage). Once
the soft limit is reached, cephcsi starts
flattening tasks to break the chain. With this,
PVC creation time is only affected when the
hard limit (maxsnapshotsonimage) is reached.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
The GitHub style for Pull Request and Issue templates adds HTML tags for
some advanced usage. The MarkDown linter should not give warnings when
these are used.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
The StorageClasses that get deployed for the Kubernetes e2e external
storage tests reference a ConfigMap that contains the connection details
for the Ceph cluster. Without this ConfigMap, Ceph-CSI will not function
correctly.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Currently the scripts/install-snapshot.sh script needs to be called
depending on the Kubernetes version. It would be much easier to use the
script if it is intelligent enough to decide itself whether k8s snapshot
controller needs to be installed or not.
Fixes: #1139
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Just like deploy-rook and teardown-rook, this patch adds
install snapshotter and cleanup snapshotter options to the minikube
script.
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
When the replica count of the provisioner is greater than 1, the added
anti-affinity rules prevent provisioner pods from being scheduled on the same
node. The Kubernetes scheduler will spread the pods across nodes to improve
availability during node failures.
Signed-off-by: Nico Berlee <nico.berlee@on2it.net>
Instead of using the Docker command to push the image to the minikube VM,
read the image from stdin over ssh and load it with the Docker command
that is available inside the VM.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Allow passing:
$ CONTAINER_CMD="sudo docker" ./scripts/minikube.sh cephcsi
or
$ CONTAINER_CMD="sudo podman" ./scripts/minikube.sh cephcsi
This is needed because the container images might only be listed by
'# sudo docker images' or '# sudo podman images' in case the Makefile
target image-cephcsi was run with sudo.
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Add a way to supply local CONTAINER_CMD option of choice via
env variable to minikube.sh
Note: we still use docker daemon env at minikube box, in the future
we can switch to podman service env '# minikube podman-env' if needed
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Problem:
-------
$ minikube version
minikube version: v1.12.2
commit: be7c19d391302656d27f1f213657d925c4e1cfc2-dirty
$ ./scripts/minikube.sh up
installed minikube version v1.12.2 is not matching requested version latest
Here v1.12.2 is the latest version of minikube, but the script simply bails out.
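One way to avoid bailing out is to treat "latest" as matching any installed version. The sketch below is only an illustration of that idea; `minikube_version_matches` is a hypothetical helper and not necessarily how the script resolves the problem.

```shell
# Hypothetical check: succeed when the installed minikube satisfies the request.
minikube_version_matches() {
    local installed requested
    installed=$1
    requested=$2
    # "latest" is not a concrete version, so it cannot be compared literally
    [ "${requested}" = "latest" ] || [ "${installed}" = "${requested}" ]
}
```

With this check, `minikube_version_matches v1.12.2 latest` succeeds, while a request for a different concrete version still fails.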
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
In this script we defined all the functions at the top and then started
with executable commands (entry points to script start).
Only this function is the odd one out: unlike the rest, it is defined
in the middle of the execution sequence, which hurts readability.
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
We cannot depend on the master branch of external-snapshotter
in cephcsi, as the master branch can change at any time. It is
better to use released tags in our E2E.
fixes: #1416
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
There can be spurious failures in the CI when running kubectl create. On
occasion, the command returns with an error, but the api-server did
receive and process the request. This causes a 2nd create action to fail
with messages like:
cephcluster.ceph.rook.io/my-cluster created
Error from server: error when creating "/tmp/tmp.Ur1ZPG85o9/cluster-test.yaml": etcdserver: request timed out
Error from server (AlreadyExists): error when creating "/tmp/tmp.Ur1ZPG85o9/cluster-test.yaml": configmaps "rook-config-override" already exists
Error from server (AlreadyExists): error when creating "/tmp/tmp.Ur1ZPG85o9/cluster-test.yaml": cephclusters.ceph.rook.io "my-cluster" already exists
Error from server (AlreadyExists): error when creating "/tmp/tmp.Ur1ZPG85o9/cluster-test.yaml": configmaps "rook-config-override" already exists
Error from server (AlreadyExists): error when creating "/tmp/tmp.Ur1ZPG85o9/cluster-test.yaml": cephclusters.ceph.rook.io "my-cluster" already exists
Error from server (AlreadyExists): error when creating "/tmp/tmp.Ur1ZPG85o9/cluster-test.yaml": configmaps "rook-config-override" already exists
Error from server (AlreadyExists): error when creating "/tmp/tmp.Ur1ZPG85o9/cluster-test.yaml": cephclusters.ceph.rook.io "my-cluster" already exists
Error from server (AlreadyExists): error when creating "/tmp/tmp.Ur1ZPG85o9/cluster-test.yaml": configmaps "rook-config-override" already exists
Error from server (AlreadyExists): error when creating "/tmp/tmp.Ur1ZPG85o9/cluster-test.yaml": cephclusters.ceph.rook.io "my-cluster" already exists
By handling the create action differently, and checking for the
AlreadyExists word in the stderr output, it is possible to detect
repeated creates that are not needed.
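A rough sketch of this approach is shown below. It is not the code from the script: `create_with_retry` and `MAX_RETRIES` are hypothetical names, and the helper wraps an arbitrary command instead of calling kubectl directly.

```shell
# Hypothetical helper: retry a create command and accept "AlreadyExists"
# errors on a retry, because an earlier attempt may have gone through.
create_with_retry() {
    local retries=0 ret=0 max_retries stderr
    max_retries=${MAX_RETRIES:-5}
    stderr=$(mktemp)
    while [ "${retries}" -le "${max_retries}" ]; do
        ret=0
        "$@" 2>"${stderr}" || ret=$?
        [ "${ret}" -eq 0 ] && break
        # only on a retry, and only if every error line is AlreadyExists
        if [ "${retries}" -gt 0 ] && grep -q 'AlreadyExists' "${stderr}" &&
            ! grep -v 'AlreadyExists' "${stderr}" | grep -q .; then
            ret=0
            break
        fi
        retries=$((retries + 1))
    done
    rm -f "${stderr}"
    return "${ret}"
}
```

An AlreadyExists error on the very first attempt is still treated as a failure, since in that case the object was not created by this invocation.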
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Add retries to prevent the CI from failing instantly.
Now, the command execution will be retried up to
5 times, to avoid failures in some runs.
Signed-off-by: Yug <yuggupta27@gmail.com>
By default the install-helm.sh script uses "latest" as version for Helm.
Unfortunately this version does not exist. The HELM_VERSION variable is
already set in build.env, so source the configuration file as one of the
first actions in install-helm.sh.
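Sourcing the configuration could look roughly like the sketch below. `load_helm_version` is a hypothetical helper; the extra check that rejects "latest" is an assumption added for illustration.

```shell
# Hypothetical helper: source build.env and verify HELM_VERSION is usable.
load_helm_version() {
    local env_file
    env_file=$1
    [ -r "${env_file}" ] || return 1
    # shellcheck disable=SC1090
    . "${env_file}"
    # refuse to continue with an empty or non-existent "latest" version
    [ -n "${HELM_VERSION:-}" ] && [ "${HELM_VERSION}" != "latest" ]
}
```

A caller would pass the path to build.env and abort the installation when the function fails.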
Signed-off-by: Niels de Vos <ndevos@redhat.com>
By default minikube uses 2 CPUs, which might be too little for some of
the tests. When not passing a CPUS environment variable, use all CPUs
available on the system (detected with 'nproc').
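The default can be expressed with a parameter expansion, as in this minimal sketch. `minikube_cpus` is a hypothetical function name; only the CPUS variable and 'nproc' come from the description above.

```shell
# Hypothetical helper: number of CPUs to give to the minikube VM.
# Uses the CPUS environment variable when set, otherwise all CPUs.
minikube_cpus() {
    echo "${CPUS:-$(nproc)}"
}
```

The result would then be handed to minikube when starting the VM, for example through its `--cpus` option.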
Signed-off-by: Niels de Vos <ndevos@redhat.com>
This keeps the standard arguments for e2e testing in a single location
instead of spread over multiple files and CI jobs.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
In test environments the default pool size is set to 1, so there is no
redundancy. This causes recent Ceph versions to complain with
HEALTH_WARN as POOL_NO_REDUNDANCY gets set.
By disabling the mon_warn_on_pool_no_redundancy option in ceph.conf, the
warning is not reported and the cluster is marked HEALTHY.
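As an illustration, the override could be generated like this. `ceph_conf_override` is a hypothetical helper, and placing the option in the `[global]` section is an assumption of this sketch.

```shell
# Print a ceph.conf override that silences POOL_NO_REDUNDANCY.
# Placing the option in [global] is an assumption of this sketch.
ceph_conf_override() {
    cat <<'EOF'
[global]
mon_warn_on_pool_no_redundancy = false
EOF
}
```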
See-also: rook/rook#5925
Signed-off-by: Niels de Vos <ndevos@redhat.com>
minikube has /sbin/losetup from Busybox, and that does not work with
raw-block PVCs. Use the losetup executable from the host in the VM
instead.
See-also: kubernetes/minikube#8284
Signed-off-by: Niels de Vos <ndevos@redhat.com>
While testing with the default 3000 MB RAM in the minikube VM, creating
an encrypted RBD volume fails because 'cryptsetup' gets killed:
[ 766.072585] Out of memory: Kill process 18497 (cryptsetup) score 1182 or sacrifice child
[ 766.072589] Killed process 18497 (cryptsetup) total-vm:863136kB, anon-rss:510336kB, file-rss:10788kB, shmem-rss:0kB
[ 766.072688] oom_reaper: reaped process 18497 (cryptsetup), now anon-rss:510336kB, file-rss:10780kB, shmem-rss:0kB
Using 4 GB RAM should prevent this from occurring.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
In case kubectl did not get installed (VM_DRIVER != none),
scripts/minikube.sh can fail when kubectl is not in the path. By running
the "kubectl cluster-info" command through minikube, the script will
succeed.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
In case there is a minikube executable in the $PATH already, use that
for all commands. If there is none, install_minikube() will place a
newly downloaded executable in /usr/local/bin which will be used by the
full pathname, so that commands as root without /usr/local/bin in the
$PATH will work.
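The selection of the executable can be sketched as follows. `find_executable` and the `minikube` variable are hypothetical names; the fallback path is the install location mentioned above.

```shell
# Print the name of an executable if it is in $PATH, else a fallback path.
find_executable() {
    local name fallback
    name=$1
    fallback=$2
    if command -v "${name}" >/dev/null 2>&1; then
        echo "${name}"
    else
        echo "${fallback}"
    fi
}

# assumed variable name; later commands would be run as "${minikube} ..."
minikube=$(find_executable minikube /usr/local/bin/minikube)
```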
Signed-off-by: Niels de Vos <ndevos@redhat.com>
The command fails when PWD=/. It is unclear what the command tries to
achieve. The next command does something more useful, although it can
maybe be removed as well.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
When starting minikube as root with --driver=kvm2, minikube complains
that this is not the right thing to do. However, in the CentOS CI we
really want to run as root, as that makes the scripts simpler.
Add the --force option while starting, so that minikube does not abort
anymore.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Storage providers and the default storage class are not needed for
Ceph-CSI testing. In order to reduce resources and potential conflicts
between storage plugins, disable them.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
As part of https://github.com/ceph/ceph-csi/pull/1237/, patching of
the deployed Ceph cluster was enabled. However, due to an error in
the version fetching logic, the patch was not applied.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
The goerr113 linter checks error handling expressions.
It recommends using wrapped static errors instead of
dynamic errors, and reports this in multiple places.
The linter is disabled for now to avoid regressions;
this needs to be handled in a separate issue.
Tracker Issue: #1227
Signed-off-by: Yug <yuggupta27@gmail.com>
The `nestif` linter reports deeply nested if statements.
`nestif` is disabled for now to avoid regressions;
this needs to be addressed separately.
Tracking Issue: #1229
Signed-off-by: Yug <yuggupta27@gmail.com>
The 'testpackage' linter recommends "black box" testing, which prevents
internal/non-exported functions from being tested. We have tests
that *do* test non-exported functions and types.
Disabling the linter allows us to test non-exported types and functions.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
The Rook version in our e2e deployment is currently 1.1.7, which brings up a
14.2.4 Ceph cluster. To support the cephfs snapshot e2e, we need a newer version
of the Ceph cluster in E2E. Rook 1.2.7 is good enough, as it brings up a Ceph
14.2.10 cluster after patching.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
With minikube versions greater than 1.6.2 and less than 1.11.1, the YAML files
in the minikube path will not be automatically applied to the cluster. We will get
errors during bootstrap of the cluster if the admission controller is enabled.
To use Pod Security Policies with these versions of minikube, first start a
cluster without the `PodSecurityPolicy` admission controller enabled.
Next, apply the PSP YAML, stop the cluster, and then restart it
with the admission controller enabled.
```
minikube start
kubectl apply -f /path/to/psp.yaml
minikube stop
minikube start --extra-config=apiserver.enable-admission-plugins=PodSecurityPolicy
```
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
As we need to test maxsnapshotsonimage, we
need to set the limit to a minimal value that
can be tested in CI. The default limit is 450,
which cannot be tested in CI.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
When building against go-ceph, the most recent version of Ceph is
assumed to be available (currently Octopus). In case an older version of
the development packages is installed, building go-ceph will fail.
Golangci-lint does not accept the `-tags nautilus` parameter like other
Golang tools. Instead, the build-constraints need to be configured in a
configuration file.
This change takes care of the following:
- move the current scripts/golangci.yml to a template
- add the @@CEPH_VERSION@@ substitute
- generate the configuration file when needed
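Generating the configuration from the template could be done with a simple substitution, as in this sketch. `generate_golangci_config` and its argument order are hypothetical; only the @@CEPH_VERSION@@ placeholder comes from the list above.

```shell
# Hypothetical generator: replace @@CEPH_VERSION@@ in a template.
# Arguments: template file, Ceph version (e.g. nautilus), output file.
generate_golangci_config() {
    local template ceph_version output
    template=$1
    ceph_version=$2
    output=$3
    sed "s/@@CEPH_VERSION@@/${ceph_version}/g" "${template}" > "${output}"
}
```

The template would carry the placeholder where golangci-lint expects its build tags, so that the generated file contains the Ceph release name instead.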
Signed-off-by: Niels de Vos <ndevos@redhat.com>
In Travis CI the e2e tests are timing
out; 30 minutes no longer seems to be enough
for E2E testing, so increase the timeout to
40 minutes.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Build `GO_TAGS` based on the `CEPH_VERSION` from build.env. In case the
version is non-empty, pass `-tags=${CEPH_VERSION}` to all go commands
and to the scripts that call go programs.
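In shell form, the construction of the tags argument might look like the sketch below. `go_tags` is a hypothetical helper; the real change may build `GO_TAGS` differently, for example directly in the Makefile.

```shell
# Print the tags argument for go commands, or nothing without CEPH_VERSION.
go_tags() {
    if [ -n "${CEPH_VERSION:-}" ]; then
        echo "-tags=${CEPH_VERSION}"
    fi
}

# usage sketch (word splitting of the unquoted expansion is intended):
#   go build $(go_tags) ./...
```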
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Add pylint to catch static issues in the python
files in the repo.
Users can now run make lint-py for a pylint
check on python files.
Signed-off-by: Yug Gupta <ygupta@redhat.com>
Without -mod=vendor running go run in this script may take more
resources than needed to execute. This also makes it consistent with how
go build (and alike) are invoked in the other scripts and the Makefile.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
The script checks for the ceph development headers.
In case required packages are not found, the script
suggests running a containerized build.
Signed-off-by: Yug Gupta <ygupta@redhat.com>
In some Linux distributions the /etc/resolv.conf file is a symlink. This
file gets included in the Kubernetes containers and will be used for
resolving hostnames. By including the symlink, it is possible that the
target file is not available in the container(s). This will cause
problems when resolving hostnames, and Kubernetes will not get deployed.
The default minikube VM provides /run/systemd/resolve/resolv.conf, with
/etc/resolv.conf being a symlink. Therefore, it is necessary to pass the
`--extra-config=kubelet.resolv-conf=..` parameter to `kubeadm`.
In case minikube is started with `--vm-driver=none` and
/run/systemd/resolve/resolv.conf does not exist, the local
/etc/resolv.conf will be used for inclusion in the Kubelet container. If
this is a symlink, the final destination should get passed with
`--extra-config=kubelet.resolv-conf=..` so that a working hostname
resolution configuration is available in the container.
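Resolving the symlink before passing the parameter could be sketched like this. `kubelet_resolv_conf_arg` is a hypothetical helper, it relies on `readlink -f` being available, and it only covers the symlink case, not the check for /run/systemd/resolve/resolv.conf.

```shell
# Hypothetical helper: print the extra-config argument for the kubelet,
# pointing at the final destination when the file is a symlink.
kubelet_resolv_conf_arg() {
    local conf
    conf=${1:-/etc/resolv.conf}
    if [ -L "${conf}" ]; then
        # follow the symlink to its final destination
        conf=$(readlink -f "${conf}")
    fi
    echo "--extra-config=kubelet.resolv-conf=${conf}"
}
```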
Updates: #1121
Signed-off-by: Niels de Vos <ndevos@redhat.com>
The `commitlint` command can be used to verify the subject of commit
messages, so it is added to the $PATH.
See-also: https://commitlint.js.org
Signed-off-by: Niels de Vos <ndevos@redhat.com>
The snapshot beta CRD won't work if the
Kubernetes version is less than 1.17.0.
As the snapshot CRD won't be installed,
we cannot test snapshots, so disable the
tests if the kube version is less than 1.17.
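A version check of this kind can be sketched as below. `kube_version_at_least` is a hypothetical helper, and it assumes plain numeric major and minor components (a minor such as "17+" would not compare correctly).

```shell
# Hypothetical helper: succeed when version $1 (e.g. v1.17.0) is at least
# major $2 and minor $3. Assumes numeric major/minor components.
kube_version_at_least() {
    local version major rest minor
    version=${1#v}
    major=${version%%.*}
    rest=${version#*.}
    minor=${rest%%.*}
    [ "${major}" -gt "$2" ] || { [ "${major}" -eq "$2" ] && [ "${minor}" -ge "$3" ]; }
}
```

The snapshot tests would then only be enabled when `kube_version_at_least "<version>" 1 17` succeeds.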
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
When we deploy rook+cephcsi for E2E, the
external-snapshotter in the cephcsi deployed by
rook creates the CRD. We need to delete
the CRD created by external-snapshotter.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Earlier we were running all the linters for non-go
files in one shot, which is not helpful for
users who want to run particular tests.
Now the Makefile has different targets to
run separate lint tests for different types
of non-go files.
Fixes: #979
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
With extra logging, there is no need to call `travis_wait` anymore. In
addition the `travis_wait` command blocks output, so the build steps are
not reported until the script finishes.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
While introducing scripts/build_step.inc.sh the tests start to fail as
the script is included and each shell script is tested separately. By
adding the option --external-sources to shellcheck, the related warnings
are not reported.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
The internal/ directory in Go has a special meaning, and indicates that
those packages are not meant for external consumption. Ceph-CSI does not
provide public APIs for other projects to consume. There is no plan to
keep the API of the internally used packages stable.
Closes: #903
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Create the namespace before the helm create to avoid the
`Error: create: failed to create: namespaces
"xxx" not found` issue.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
By setting the WORKDIR in the container image, there is no need to pass
it on the commandline in the Makefile. This makes the line for the make
target a little cleaner.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
`make containerized-test` has been added as a make target. This runs the
'make test' target in a container. All test dependencies are installed
in the container once, and the container is kept around for running
`make containerized-test` subsequently.
The test container is based on Fedora:latest so that all test tools get
easily installed and are available in a recent version. The production
container is based on the Ceph container which has CentOS as Operating
System and therefore a more stable (too old) toolset.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
This commit adds support for specifying the dataPool parameter for the
topology constrained pools in the StorageClass. It can be
used to name erasure coded pools to use for RBD
data instead of the replica pools.
Signed-off-by: ShyamsundarR <srangana@redhat.com>
- This commit adds tests only for RBD, as CephFS still needs
an enhancement in CephFS subvolume commands to effectively use
topology based provisioning
Signed-off-by: ShyamsundarR <srangana@redhat.com>
The current version of go (1.12.x) is causing issues
with some method calls in the errors package. This patch
could help to overcome them. More details about the failure
are at https://github.com/ceph/ceph-csi/pull/917#issuecomment-609998502
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
Recently resizer 0.5.0 has been released.
This PR updated the resizer container from
v0.4.0 to v0.5.0
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
This PR adds support for helm
installation, cephcsi helm chart
deployment and teardown, and also runs E2E
for the helm charts.
Add socat to provide port forwarding access for helm.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
It looks like git is not installed by default
in the v15 base image. This PR installs
git, which is required for the containerized
build.
Set GO111MODULE=on in the Dockerfile:
we need to set GO111MODULE=on to fix the
"build flag -mod=vendor only valid when using modules"
issue.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
As octopus is the latest
release base image, this PR updates the
base image in the Dockerfile.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
This makes it possible to build on any platform that supports Linux
containers. The container image used for building is created once, or on
updating the `scripts/Dockerfile.build` and is cached afterwards.
To build the executable in a container, use `make containerized-build`
and everything will be done automatically. The executable will also be
available on the usual location.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Running tests without `-mod=vendor` causes the tests to download the
dependencies if these are not available in the standard go-module
directories (parent directories of the project). All dependencies are
already included in the ./vendor directory, so passing `-mod=vendor`
prevents downloading the dependencies and speeds up testing a lot.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
This allows administrators to override the naming prefix for both volumes and snapshots
created by the rbd plugin.
Signed-off-by: Reinier Schoof <reinier@skoef.nl>
Update CI merge job to build and push Arm64 image to
quay.io/cephcsi/cephcsi:version-arm64.
Add CI PR job running on Travis Arm64 nodes to make sure cephcsi
compiles successfully on Arm64.
No CI test job is available for Arm64 now due to the issues below:
- k8s-csi sidecar images for Arm64 are not available
- Travis Arm64 CI job runs inside unprivileged LXD which blocks
launching minikube test environment
Signed-off-by: Yibo Cai <yibo.cai@arm.com>
We have the e2e test with --deploy-rook=true that sets up the whole test
environment. It works fine, but it does not seem to be the role of
the e2e test. In addition, when developing the code we need to run the full
test scenario with deploying rook every time, or we need to build
the rook environment by hand. Move the rook-deploy code to minikube.sh.
Use Deployment with leader election instead of StatefulSet
Deployment behaves better when a node gets disconnected
from the rest of the cluster - new provisioner leader
is elected in ~15 seconds, while it may take up to
5 minutes for StatefulSet to start a new replica.
Refer: kubernetes-csi/external-provisioner@52d1fbc
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Currently CephFs provisioner mounts the ceph filesystem
and creates a subdirectory as a part of provisioning the
volume. Ceph now supports commands to provision fs subvolumes,
hence modify the provisioner to use ceph mgr commands to
(de)provision fs subvolumes.
Signed-off-by: Poornima G <pgurusid@redhat.com>
RBD plugin needs only a single ID to manage images and operations against a
pool, mentioned in the storage class. The current scheme of 2 IDs is hence not
needed and removed in this commit.
Further, unlike CephFS plugin, the RBD plugin splits the user id and the key
into the storage class and the secret respectively. Also the parameter name
for the key in the secret is noted in the storageclass making it a variant and
hampers usability/comprehension. This is also fixed by moving the id and the key
to the secret and not retaining the same in the storage class, like CephFS.
Fixes: #270
Testing done:
- Basic PVC creation and mounting
Signed-off-by: ShyamsundarR <srangana@redhat.com>
* Enable all static-checks in golangci-lint
* Update golangci-lint version
* Fix issue found in golangci-lint
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Update travis and the Makefile for functional testing.
Skip docker pull if the image is already present
on the local machine.
If the image is not present locally, pull the
image from the repo.
Export kubeconfig in travis.
Build the cephcsi image in the travis job for
functional testing.
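The pull-only-when-missing step can be sketched as follows. `pull_if_missing` is a hypothetical helper, and `CONTAINER_CMD` is assumed to name a docker-compatible CLI that supports `image inspect` and `pull`.

```shell
# Pull an image only when it is not yet available locally.
# CONTAINER_CMD is assumed to name a docker-compatible CLI.
pull_if_missing() {
    local image
    image=$1
    if ${CONTAINER_CMD:-docker} image inspect "${image}" >/dev/null 2>&1; then
        # image is already present, nothing to download
        return 0
    fi
    ${CONTAINER_CMD:-docker} pull "${image}"
}
```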
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
In some cases we don't need to do
functional testing, like doc changes
or changes to the yml files related to Travis
or mergify. This PR skips functional
testing for these kinds of changes.
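Deciding whether a change needs functional testing could be sketched like this. `only_docs_or_ci_changed` is a hypothetical helper and the path patterns are illustrative assumptions, not the project's actual list.

```shell
# Read changed paths from stdin; succeed when none needs functional tests.
# The patterns are illustrative assumptions, not the project's actual list.
only_docs_or_ci_changed() {
    local path
    while read -r path; do
        case "${path}" in
            docs/*|*.md|.travis.yml|.mergify.yml) ;;
            *) return 1 ;;
        esac
    done
    return 0
}
```

The list of changed files, one path per line, would be piped into the function, and the functional tests are skipped when it succeeds.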
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>