When a failure occurs, by default the test namespace is removed. This
makes it impossible to fetch the logs of the containers where the
failure was discovered. Pass --delete-namespace-on-failure=false as an
additional argument to the `run-e2e` make target, so that the namespace
is kept.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Jenkins does not like the username being passed as a variable to the
podman_login() function. Calling the function results in an error like
Warning: A secret was passed to "sh" using Groovy String interpolation, which is insecure.
Affected argument(s) used the following variable(s): [CREDS_USER]
See https://jenkins.io/redirect/groovy-string-interpolation for details.
+ ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no root@n7.pufty.ci.centos.org 'podman login --authfile=~/.podman-auth.json --username=$CREDS_USER --password=**** registry-****.apps.ocp.ci.centos.org'
Username: Error: error getting username and password: error reading username: EOF
By single quoting the username, just like the password, it may work
better.
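The log above is consistent with $CREDS_USER being expanded by the remote
shell, where it is not set. A minimal shell sketch of that effect, with
`sh -c` standing in for the ssh session and a made-up value for the
variable (this is not the Groovy code of the Pipeline):

```shell
# Make sure the variable is a plain, non-exported shell variable.
unset CREDS_USER
CREDS_USER='bot'   # hypothetical value

# Double quotes: the calling shell expands the variable before the
# child shell (the stand-in for the remote host) runs.
expanded_by_caller=$(sh -c "echo user=$CREDS_USER")

# Single quotes: the text is passed on literally, so the child shell
# expands it and finds nothing, as the variable is not set there.
expanded_by_child=$(sh -c 'echo user=$CREDS_USER')

echo "caller: ${expanded_by_caller}"   # prints: caller: user=bot
echo "child:  ${expanded_by_child}"    # prints: child:  user=
```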
Fixes: aca3745e2 ("ci: do not use Groovy string interpolation for credentials")
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Jenkins warns in the output of CI jobs about the following:
Warning: A secret was passed to "sh" using Groovy String interpolation, which is insecure.
Affected argument(s) used the following variable(s): [CREDS_PASSWD, CREDS_USER]
See https://jenkins.io/redirect/groovy-string-interpolation for details.
Variables in 'single quotes' and without the {curly brackets} are
expected to not be affected. There is some indirection in the strings
passed to the `sh` function, so this approach might not fix it.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
It seems that it is required to re-throw the error after a catch{..}
block. Without this, and a successful execution of system-status.sh, the
CI jobs get marked as SUCCESS, even when there was a failure.
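The same pitfall exists in plain shell, where the exit status of the
last command wins. A sketch of the idea in shell (not the Groovy code of
the Pipeline), with hypothetical stand-ins for the job and the status
script:

```shell
# Hypothetical stand-ins for the failing job and the status script.
run_job() { return 1; }
system_status() { echo "collecting system status"; }

run_with_status() {
    run_job && return 0
    rc=$?            # remember why the job failed
    system_status    # succeeds, and would otherwise mask the failure
    return "${rc}"   # the shell counterpart of re-throwing the error
}

run_with_status || echo "job failed with status $?"
```

Without the final `return "${rc}"`, the function would report the
status of system_status, and the failure would be lost.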
Fixes: e36155283 "ci: run system-status.sh in case a job fails"
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Without the script on the node, it can not be executed...
Fixes: e36155283 "ci: run system-status.sh in case a job fails"
Signed-off-by: Niels de Vos <ndevos@redhat.com>
The new `system-status.sh` script logs the status of the host and the
minikube VM. This gets executed when a CI job fails, and should aid in
troubleshooting spurious failures.
Updates: #1969
Signed-off-by: Niels de Vos <ndevos@redhat.com>
In case a job has been started without a PR (manual, or timed), the
currently checked-out branch matches the original, as there are no
additional changes in the tree. There is no need to abort the jobs when
the skip-doc-change.sh script did not detect any non-doc changes, as
there are no changes at all.
Updates: #1963
Signed-off-by: Niels de Vos <ndevos@redhat.com>
When tests are started manually (through the Jenkins webui), there is no
PR associated with the job. That means the `git_since` and `ref` are
equal. Trying to create a new branch named `ref` will not work, as the
branch was already created when cloning the repository with `git_since`.
With this change, Jenkins jobs can be started manually. This makes it
possible to run regular/nightly jobs as well.
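A sketch of the check in shell, assuming the branch is created with
`git checkout -b` (the helper name and the exact git command are
assumptions, only `ref` and `git_since` come from the job):

```shell
# Create a branch for the PR, unless the job runs without a PR.
create_pr_branch() {
    ref=$1
    git_since=$2
    if [ "${ref}" = "${git_since}" ]; then
        # No PR: the clone already created this branch, nothing to do.
        return 0
    fi
    git checkout -b "${ref}"
}
```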
Signed-off-by: Niels de Vos <ndevos@redhat.com>
After the introduction of ROOK_CEPH_CLUSTER_IMAGE in build.env, the
additional image needs to get pulled from the CI registry mirror and
pushed into the minikube VM.
Without this addition, the Docker Hub pull limits may prevent deploying
Rook.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
The CI scripts pull all container images from the local CI registry. If
the image name starts with "docker.io/", the images will be pushed into
the test environment as "docker.io/docker.io/ceph/ceph:v15". This image
will not be used by the tests, so things can still fail in case Docker
Hub has reached the pull rate-limit.
By dropping the additional "docker.io/" from the BASE_IMAGE name, the
image gets pushed as "docker.io/ceph/ceph:v15" so the tests will use it
automatically.
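The CI scripts do this in Groovy (see the link below). The same prefix
removal can be sketched in shell with parameter expansion; the helper
name is hypothetical:

```shell
# Drop one leading "docker.io/" from an image name, if present.
strip_docker_io() {
    printf '%s\n' "${1#docker.io/}"
}

strip_docker_io 'docker.io/ceph/ceph:v15'   # prints ceph/ceph:v15
strip_docker_io 'ceph/ceph:v15'             # prints ceph/ceph:v15 (unchanged)
```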
Groovy-syntax: https://www.baeldung.com/groovy-remove-string-prefix#using-regex
Signed-off-by: Niels de Vos <ndevos@redhat.com>
The mirror option of the Docker Registry container is very limited and
prevents updating or manually pushing images to the registry. Instead,
it tries to push the images to docker.io, which is not what we need.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
docker.io/nginx:latest and docker.io/vault:latest are being redirected
to docker.io/library/. The redirection is not cached, and Docker Hub
might return an error during redirection when the pull rate-limit is
hit.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Unqualified container images are currently used for CI jobs. In the
future this is expected to change. By preparing the cache/mirror and
images in minikube with the qualified tags, transition to qualified
image names should become easier.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
This makes it possible to pull images from Docker Hub through the local
container image registry in the CI OpenShift deployment. The registry in
the CI is configured with the 'cephcsibot' account so that pulling
images is accounted towards the account, and not anonymous consumers
within the whole CentOS CI.
There should be no need to manually sync the images between the local
registry and Docker Hub anymore.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Functions in Groovy can not use `def ci_registry`, as the variable is
not in their scope. Pass the registry to the podman_login() and
podman_pull() functions instead.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
A typo when calling podman_log() causes CI jobs to fail.
Fixes: 1eec379 "ci: pre-pull Ceph base-image and cephcsi:devel for mini-e2e-helm jobs"
Signed-off-by: Niels de Vos <ndevos@redhat.com>
It seems that "podman pull" does not consume the authentication details
from ~/.docker/config.json, so store the results of "podman login" in
~/.podman-auth.json and use the file for all "podman pull" commands.
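A minimal shell sketch of the approach. The function names mirror the
Groovy helpers but are shell stand-ins, and reading the password with
--password-stdin is a choice made for this sketch, not necessarily what
the CI scripts do:

```shell
# Persistent location for the credentials, shared by login and pull.
AUTHFILE="${HOME}/.podman-auth.json"

# Log in once and store the result in ${AUTHFILE}.
podman_login() {
    registry=$1
    username=$2
    password=$3
    printf '%s' "${password}" | podman login --authfile="${AUTHFILE}" \
        --username="${username}" --password-stdin "${registry}"
}

# Pull an image from the registry, re-using the stored credentials.
podman_pull() {
    registry=$1
    image=$2
    podman pull --authfile="${AUTHFILE}" "${registry}/${image}"
}
```

Usage would be a single podman_login call, followed by one podman_pull
per image.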
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Once the ssh command finishes, the runtime directory is removed and the
results of "podman login" are lost. By storing the results in the
standard Docker configuration file, subsequent "podman pull" commands
will be able to re-use the authentication details.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
The ${BASE_IMAGE} variable gets expanded by running the ssh command.
This becomes an empty variable, so the "echo" part of the command does
not output anything.
By escaping the variable in the command, there is no premature
substitution, and the value of BASE_IMAGE should get stored as intended.
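A shell sketch of the difference, with `sh -c` standing in for the ssh
session. The inline assignment is an assumption for illustration; it
stands in for wherever the remote side gets BASE_IMAGE from:

```shell
# BASE_IMAGE is not set on the calling side.
unset BASE_IMAGE

# Unescaped: the calling shell substitutes an empty string too early.
unescaped=$(sh -c "BASE_IMAGE=docker.io/ceph/ceph:v15; echo ${BASE_IMAGE}")

# Escaped: the child shell sees ${BASE_IMAGE} and expands it itself.
escaped=$(sh -c "BASE_IMAGE=docker.io/ceph/ceph:v15; echo \${BASE_IMAGE}")

echo "unescaped: '${unescaped}'"   # prints: unescaped: ''
echo "escaped:   '${escaped}'"     # prints: escaped:   'docker.io/ceph/ceph:v15'
```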
Signed-off-by: Niels de Vos <ndevos@redhat.com>
The value of BASE_IMAGE was not stored in a variable that the CI job
can consume. By using sh(), this should be the case now.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
When fetching refs/pull/<pr-id>/merge from GitHub, there is no need to do
a manual rebase. This makes things easier, as the scripted rebases
sometimes cause CI jobs to fail.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
refs/pull/<id>/head might not contain the most current state of the
branch. In case other PRs got merged, the PR under test needs rebasing.
GitHub offers refs/pull/<id>/merge to check out the rebased PR; use that
in the CI jobs.
In case refs/pull/<id>/merge is not available, it means the PR can not
be rebased on its target branch. This will cause the CI job to fail, but
GitHub will also have a message about rebase conflicts.
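A sketch of fetching that ref in shell, assuming the GitHub remote is
named origin (the helper name is hypothetical):

```shell
# Fetch the merge ref of a PR and check it out (detached HEAD).
# Fails when GitHub does not provide the ref, e.g. on conflicts.
checkout_pr_merge() {
    pr_id=$1
    git fetch origin "refs/pull/${pr_id}/merge" || return 1
    git checkout FETCH_HEAD
}
```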
Signed-off-by: Niels de Vos <ndevos@redhat.com>
When the [ci/skip/e2e] label is set on PRs, the withCredentials()
statement is aborted, but the other stages still continue. This causes
the tests to run, which is not what we want when the label is added.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
It still seems that the environment is not set when the GitHub API is
called. Maybe things work better when the environment is set before
starting the cico-workspace node.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
The `credentials()` function might only work in the `environment` block
in the Pipelines. At the moment, running the 'skip ci/skip/e2e label'
stage always reports 'Error: 401 Client Error: Unauthorized'.
Fixes: e0d49908 (ci: fetch GITHUB_API_TOKEN from Jenkins credential store)
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Fetch the named credential "github-api-token" from the Jenkins
configuration. This is a "personal access token" that has been created
with the ceph-csi-bot user account.
CC: @ceph-csi-bot
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Jobs can now pass the wanted Kubernetes major version (like '1.19') to
the Jenkins Pipeline scripts. The Pipelines detect the most recent patch
release for the major version with the new get_patch_release.py script.
This causes the CI Job status context to not have the patch number (last
digit of the release) included anymore. Restarting a test will only need
the major version number, as does updating the Mergify configuration.
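The selection can be sketched in shell as below. This is not the
get_patch_release.py script; the list of releases on stdin and the
helper name are assumptions made for the example:

```shell
# Read release versions from stdin (one per line) and print the one
# with the highest patch number for the given version, like '1.19'.
latest_patch() {
    awk -v p="$1." '
        index($0, p) == 1 && substr($0, length(p) + 1) ~ /^[0-9]+$/ {
            n = substr($0, length(p) + 1) + 0
            if (!found || n > max) { max = n; found = 1 }
        }
        END { if (found) print p max; else exit 1 }'
}

printf '1.18.5\n1.19.2\n1.19.10\n1.19.0-rc.1\n' | latest_patch 1.19
# prints 1.19.10
```

Patch numbers are compared as numbers, so 1.19.10 sorts after 1.19.2,
and pre-releases like 1.19.0-rc.1 are ignored.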
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Due to a strict timeout, the job
tends to abort sometimes. Increase the
timeout to allow sufficient time for
the tests to execute.
Signed-off-by: Yug <yuggupta27@gmail.com>
Currently the stage name prints
the name of the variable instead
of substituting its value.
This fixes that issue.
Signed-off-by: Yug <yuggupta27@gmail.com>
Use double quotes, as variables are
expanded inside them.
The script currently fails as it is
unable to expand the variables.
Signed-off-by: Yug <yuggupta27@gmail.com>
Move the mini-e2e job into a template-job and generate two jobs out of
it: mini-e2e/k8s-1.17.8 and mini-e2e/k8s-1.18.5
By passing the k8s_version as variable to the job-template, and placing
it in the parameters for the mini-e2e.groovy script, all hard-coded
occurrences of the Kubernetes version can be replaced by the
{k8s_version} placeholder.
See-also: https://jenkins-job-builder.readthedocs.io/en/latest/definition.html#job-template
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Commit f5cba3aaa8 added the mini-e2e job, but still referred to the
temporary location that was used for testing the job. As everything is
available in the ceph-csi:ci/centos repository:branch, there is no need
to refer to the temporary location.
Reported-by: Yug <yuggupta27@gmail.com>
Signed-off-by: Niels de Vos <ndevos@redhat.com>
While debugging issues with the job itself, a sleep has been very
useful. PRs that have been rebased on the master branch contain all the
deployment fixes that are needed for the job to pass. There is no need
anymore to run into the long sleep when the job fails.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
The new mini-e2e job does the following:
- reserve a bare-metal machine
- checkout the git repository with the PR
- build used artifacts (container image and e2e.test executable)
- deploy k8s and Rook in a minikube VM
- run the e2e tests
With-contributions-from: Yug <yuggupta27@gmail.com>
Signed-off-by: Niels de Vos <ndevos@redhat.com>