Commit Graph

3734 Commits

Author SHA1 Message Date
Madhu Rajanna
df047ddaaf ci: fix commitlint problem
Still seeing the issue of the commitlint
as below

fatal: unsafe repository
('/go/src/github.com/ceph/ceph-csi'
is owned by someone else)
To add an exception for this directory,
call:

git config --global --add safe.directory \
/go/src/github.com/ceph/ceph-csi
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-05-09 05:57:14 +00:00
Madhu Rajanna
b4ff3884f1 ci: remove set-safe-directory from commitlint
Removed set-safe-directory option from the
commitlint.yaml as its not working as expected.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-05-09 05:57:14 +00:00
Niels de Vos
9533889b64 ci: do not set safe.directory for commitlint checkout
Commitlint fails with errors like:

```
git fetch -v origin devel
fatal: unsafe repository ('/go/src/github.com/ceph/ceph-csi' is owned by
someone else)
To add an exception for this directory, call:

	git config --global --add safe.directory /go/src/github.com/ceph/ceph-csi
make: *** [Makefile:153: commitlint] Error 128
```

By not setting the option with actions/checkout@v3, the error should not
happen anymore.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2022-05-09 03:10:04 +00:00
Marcus Röder
a95a6213eb util: support systems using the new cgroup v2 structure
With cgroup v2, the location of the pids.max file changed and so did the
/proc/self/cgroup file

new /proc/self/cgroup file
`
0::/user.slice/user-500.slice/session-14.scope
`

old file:
`
11:pids:/user.slice/user-500.slice/session-2.scope
10:blkio:/user.slice
9:net_cls,net_prio:/
8:perf_event:/
...
`

There is no directory per subsystem (e.g. /sys/fs/cgroup/pids) any more, all
files are now in one directory.

fixes: https://github.com/ceph/ceph-csi/issues/3085

Signed-off-by: Marcus Röder <m.roeder@yieldlab.de>
2022-05-07 20:38:48 +00:00
Madhu Rajanna
1197b94149 e2e: add getPersistentVolume helper function
added getPersistentVolume helper function
to get the PV and also try if there is any API
error to improve the CI.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-05-06 15:55:54 +00:00
Madhu Rajanna
89d9ec0823 e2e: add getPersistentVolumeClaim helper function
added getPersistentVolumeClaim helper function
to get the PVC and also try if there is any API
error to improve the CI.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-05-06 15:55:54 +00:00
Rakshith R
c880061882 ci: use canary csi-provisioner image to test different sc clones
This commit is added to use canary csi-provisioner image
to test different sc pvc-pvc cloning feature, which is not
yet present in released versions.
refer:
https://github.com/kubernetes-csi/external-provisioner/pull/699

Signed-off-by: Rakshith R <rar@redhat.com>
2022-05-06 10:32:21 +00:00
Rakshith R
badcac38d3 e2e: testcase for pvc-pvc clone with different SC & encryption
Signed-off-by: Rakshith R <rar@redhat.com>
2022-05-06 10:32:21 +00:00
Rakshith R
f1ccc4eced rbd: support pvc-pvc clone with different sc & encryption
This commit makes modification so as to allow pvc-pvc clone
with different storageclass having different encryption
configs.
This commit also modifies `copyEncryptionConfig()` to
include a `isEncrypted()` check within the function.

Signed-off-by: Rakshith R <rar@redhat.com>
2022-05-06 10:32:21 +00:00
naveen
2672fad90a ci: Set permissions for GitHub actions
Restrict the GitHub token permissions only to the required ones; this way,
 even if the attackers will succeed in compromising your workflow,
 they won’t be able to do much.

- Included permissions for the action.
https://github.com/ossf/scorecard/blob/main/docs/checks.md#token-permissions

https://docs.github.com/en/actions/using-jobs/assigning-permissions-to-jobs

Signed-off-by: naveen <172697+naveensrinivasan@users.noreply.github.com>
naveensrinivasan <172697+naveensrinivasan@users.noreply.github.com>
2022-05-05 20:21:15 +05:30
dependabot[bot]
b1a0f42b31 rebase: bump actions/checkout from 2 to 3
Bumps [actions/checkout](https://github.com/actions/checkout) from 2 to 3.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/v2...v3)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-05-05 12:47:46 +00:00
dependabot[bot]
194db3edd5 rebase: bump actions/stale from 3 to 5
Bumps [actions/stale](https://github.com/actions/stale) from 3 to 5.
- [Release notes](https://github.com/actions/stale/releases)
- [Changelog](https://github.com/actions/stale/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/stale/compare/v3...v5)

---
updated-dependencies:
- dependency-name: actions/stale
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-05-05 10:29:39 +00:00
Naveen
09f8ee0f3f ci: Included githubactions in the dependabot config
This should help with keeping the GitHub actions updated on new
releases. This will also help with keeping it secure.

Dependabot helps in keeping the supply chain secure:
https://docs.github.com/en/code-security/dependabot

GitHub actions up to dat: e
https://docs.github.com/en/code-security/dependabot/ \
  working-with-dependabot/keeping-your-actions-up-to-date-with-dependabot

dependency-update-tool:
https://github.com/ossf/scorecard/blob/main/docs/checks.md

Signed-off-by: Naveen <172697+naveensrinivasan@users.noreply.github.com>
2022-05-05 09:57:57 +00:00
Rakshith R
bd57feb26e rbd: use vaultAuthPath variable name in error msg
Before the change, the error msg was the following:
```
failed to set VAULT_AUTH_MOUNT_PATH in Vault config: path is empty
```
`vaultAuthPath` is the actual variable name set by the
user. The error message will now be the following:
```
failed to set "vaultAuthPath" in vault config: path is empty
```

Signed-off-by: Rakshith R <rar@redhat.com>
2022-05-05 05:49:31 +00:00
Niels de Vos
9d7faf850f nfs: delete the CephFS volume when the export is already removed
In case the NFS-export has already been removed from the NFS-server, but
the CSI Controller was restarted, a retry to remove the NFS-volume will
fail with an error like:

> GRPC error: ....: response status not empty: "Export does not exist"

When this error is reported, assume the NFS-export was already removed
from the NFS-server configuration, and continue with deleting the
backend volume.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2022-05-04 21:31:06 +00:00
Humble Chirammal
188e560ee9 nfs: use latest liveness probe and node driver registrar
This commit make use of latest sidecars of livenessprobe and
node driver registrar in NFS driver deployment.

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2022-05-04 17:38:54 +00:00
Niels de Vos
4891e534d3 ci: prevent ERR trap inheritance for kubectl_retry
`bash -E` causes inheritance of the ERR trap into shell functions,
command substitutions, and commands executed in a subshell environment.
Because the `kubectl_retry` function depends on detection an error of a
subshell, the ERR trap is not needed to be executed. The trap contains
extra logging, and exits the script in the `rook.sh` case. The aborting
of the script is not wanted when a retry is expected to be done.

While checking for known failures, the `grep` command may exit with 1,
if there are no matches. That means, the `ret` variable will be set to
0, but there will also be an error exit status. This causes `bash -E` to
abort the function, and call the ERR trap.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2022-05-04 12:27:59 +00:00
Niels de Vos
43fe5c34cd doc: add Mergify badge to README
We heavily use the service for Open Source communities from Mergify. It
is probably nice to promote them a little in our README.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2022-05-04 06:15:16 +00:00
dependabot[bot]
6a5758c346 rebase: bump github.com/aws/aws-sdk-go from 1.44.0 to 1.44.5
Bumps [github.com/aws/aws-sdk-go](https://github.com/aws/aws-sdk-go) from 1.44.0 to 1.44.5.
- [Release notes](https://github.com/aws/aws-sdk-go/releases)
- [Changelog](https://github.com/aws/aws-sdk-go/blob/main/CHANGELOG.md)
- [Commits](https://github.com/aws/aws-sdk-go/compare/v1.44.0...v1.44.5)

---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-05-03 09:45:13 +00:00
dependabot[bot]
1ca5e8efb0 rebase: bump github.com/aws/aws-sdk-go from 1.43.41 to 1.44.0
Bumps [github.com/aws/aws-sdk-go](https://github.com/aws/aws-sdk-go) from 1.43.41 to 1.44.0.
- [Release notes](https://github.com/aws/aws-sdk-go/releases)
- [Changelog](https://github.com/aws/aws-sdk-go/blob/main/CHANGELOG.md)
- [Commits](https://github.com/aws/aws-sdk-go/compare/v1.43.41...v1.44.0)

---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-04-28 09:34:08 +00:00
Humble Chirammal
b50e93e689 nfs: remove node plugin RBAC for NFS provisioner
this commit removes the node plugin RBAC for NFS plugin as it is
not needed.

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2022-04-27 15:40:48 +00:00
Humble Chirammal
f141326d51 doc: remove the clusterRole sections from the upgrade guide
As we have removed the clusterrole and binding the subjected
documentation also removed.

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2022-04-27 10:51:33 +00:00
Humble Chirammal
6558de9e08 helm: remove cephfs node plugin cluster role RBAC
This commit remove the node plugin RBAC of CephFS.

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2022-04-27 10:51:33 +00:00
Humble Chirammal
a2059d5cb2 cephfs: remove nodeplugin RBAC
This commit remove the clusterRole and Binding of cephfs node plugin
as the node RBAC is not needed for CephFS.

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2022-04-27 10:51:33 +00:00
Niels de Vos
9940ec38e3 ci: show all logs from kubectl
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2022-04-26 20:55:18 +00:00
Niels de Vos
89e2ff39f1 ci: do not count Warning: matches in kubectl_retry()
Depending on the Kubernetes version, the following warning is reported
regulary:

> Warning: policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+,
> unavailable in v1.25+

The warning is written to stderr, so skipping AlreadyExists or NotFound
is not sufficient to trigger a retry. Ignoring '^Warning:' in the stderr
output should prevent unneeded failures while deploying Rook or other
components.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2022-04-26 20:55:18 +00:00
Niels de Vos
1e21972956 ci: improve logging for kubectl_retry helper
Rook deployments fail quite regulary in the CI environment now. It is
not clear what the cause is, hopefully a little better logging will
guide us to the issue.

Now executing `kubectl` in a sub-shell, ensuring that the redirection of
the command lands in the right files.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2022-04-26 20:55:18 +00:00
dependabot[bot]
863a9a7882 rebase: bump k8s.io/kubernetes from 1.23.5 to 1.23.6
Bumps [k8s.io/kubernetes](https://github.com/kubernetes/kubernetes) from 1.23.5 to 1.23.6.
- [Release notes](https://github.com/kubernetes/kubernetes/releases)
- [Commits](https://github.com/kubernetes/kubernetes/compare/v1.23.5...v1.23.6)

---
updated-dependencies:
- dependency-name: k8s.io/kubernetes
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2022-04-26 14:45:04 +00:00
dependabot[bot]
b11951a710 rebase: bump google.golang.org/grpc from 1.45.0 to 1.46.0
Bumps [google.golang.org/grpc](https://github.com/grpc/grpc-go) from 1.45.0 to 1.46.0.
- [Release notes](https://github.com/grpc/grpc-go/releases)
- [Commits](https://github.com/grpc/grpc-go/compare/v1.45.0...v1.46.0)

---
updated-dependencies:
- dependency-name: google.golang.org/grpc
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-04-26 10:09:00 +00:00
dependabot[bot]
70fc6db2cf rebase: bump github.com/aws/aws-sdk-go-v2/service/sts
Bumps [github.com/aws/aws-sdk-go-v2/service/sts](https://github.com/aws/aws-sdk-go-v2) from 1.16.3 to 1.16.4.
- [Release notes](https://github.com/aws/aws-sdk-go-v2/releases)
- [Changelog](https://github.com/aws/aws-sdk-go-v2/blob/main/CHANGELOG.md)
- [Commits](https://github.com/aws/aws-sdk-go-v2/compare/v1.16.3...service/ivs/v1.16.4)

---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go-v2/service/sts
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-04-26 04:56:36 +00:00
Niels de Vos
b036537e1c doc: add Open Source Security Foundation (OpenSSF) best practices badge
The project is currently at 54% of the best practices. Hopefully this
badge creates some interest in increasing the grade.

See-also: https://bestpractices.coreinfrastructure.org/projects/5940
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2022-04-25 06:17:56 +00:00
Silvan Loser
f2e0fa28fb deploy: allowPrivilegeEscalation: true in containerSecurityContext
When running the kubernetes cluster with one single privileged
PodSecurityPolicy which is allowing everything the nodeplugin
daemonset can fail to start. To be precise the problem is the
defaultAllowPrivilegeEscalation: false configuration in the PSP.
 Containers of the nodeplugin daemonset won't start when they
have privileged: true but no allowPrivilegeEscalation in their
container securityContext.

Kubernetes will not schedule if this mismatch exists cannot set
allowPrivilegeEscalation to false and privileged to true:

Signed-off-by: Silvan Loser <silvan.loser@hotmail.ch>
Signed-off-by: Silvan Loser <33911078+losil@users.noreply.github.com>
2022-04-22 23:36:02 +00:00
Silvan Loser
06c4477ff9 helm: allowPrivilegeEscalation: true in containerSecurityContext
When running the kubernetes cluster with one single privileged
PodSecurityPolicy which is allowing everything the nodeplugin
daemonset can fail to start. To be precise the problem is the
defaultAllowPrivilegeEscalation: false configuration in the PSP.
 Containers of the nodeplugin daemonset won't start when they
have privileged: true but no allowPrivilegeEscalation in their
container securityContext.

Kubernetes will not schedule if this mismatch exists cannot set
allowPrivilegeEscalation to false and privileged to true

Signed-off-by: Silvan Loser <silvan.loser@hotmail.ch>
Signed-off-by: Silvan Loser <33911078+losil@users.noreply.github.com>
2022-04-22 23:36:02 +00:00
Madhu Rajanna
432787fee2 doc: add new hackmd link for meeting minutes
replaced google doc link with new hackmd
link for upstream meetings.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-04-22 12:52:23 +00:00
Madhu Rajanna
5e1a074ea3 doc: update doc for 3.6.1 release
updated doc for 3.6.1 release, this will
be backported to release-v3.6 branch and
we will make deployment changes and do release.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-04-22 09:05:09 +00:00
Niels de Vos
5e66372e31 e2e: allow RWOP tests to fail
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2022-04-21 16:22:49 +00:00
Niels de Vos
b82af7559b e2e: add -clusterid=... for selecting a Ceph cluster
The Ceph cluster-id is usually detected with `ceph fsid`. This is not
always correct, as the the Ceph cluster can also be configured by name.
If the -clusterid=... is passed, it will be used instead of trying to
detect it with `ceph fsid`.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2022-04-21 16:22:49 +00:00
Niels de Vos
e30364cb4d e2e: introduce getClusterID() helper
There are many locations where the cluster-id (`ceph fsid`) is obtained
from the Rook Toolbox. Instead of duplicating the code everywhere, use a
new helper function getClusterID().

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2022-04-21 16:22:49 +00:00
Niels de Vos
b20e3fa784 e2e: allow passing CephFS filesystem name on CLI
A new -filesystem=... option has been added so that the e2e tests can
run against environments that do not have a "myfs" CephFS filesystem.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2022-04-21 16:22:49 +00:00
Niels de Vos
fcb4701a06 e2e: cleanup StorageClass create retry
StorageClasses are cluster resources, not namespaced; there is no need
to log the namespace of a StorageClass.

When creating a StorageClass, NotFound is not an error that will be
returned, not need to check for it.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2022-04-21 10:48:04 +00:00
Niels de Vos
b235c171ba e2e: retry creation of the RBD StorageClass
On occasion the creation of the StorageClass can fail due to an
etcdserver timeout. If that happens, the creation can be attempted after
a delay.

This has already been done for CephFS StorageClasses, but was missed for
RBD.

See-also: ceph/ceph-csi@8a0377ef02
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2022-04-21 10:48:04 +00:00
Humble Chirammal
e8652ee366 helm: update helm version to v3.8.2
This commit update the helm version to latest available
v3.8.2.

Refer# https://github.com/helm/helm/releases/tag/v3.8.2

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2022-04-21 06:16:55 +00:00
Niels de Vos
e3d061075c e2e: use ResourceDeployer for RBD
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2022-04-21 00:22:53 +00:00
Niels de Vos
60442fa916 e2e: use ResourceDeployer for CephFS
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2022-04-21 00:22:53 +00:00
Niels de Vos
2b4bb63eb8 e2e: introduce ResourceDeployer interface
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2022-04-21 00:22:53 +00:00
Niels de Vos
701b5d7ecb e2e: Add early return in Context() for disabled tests
Some parts of the Context() seem to get executed, even when BeforeEach()
did a Skip() for the test. By adding a return inside the Context(), the
tests should not get executed at all.

This was noticed in a failed test, where upgrade was running, eventhough
the job was executed as a nornal non-upgrade one.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2022-04-20 17:34:00 +00:00
Niels de Vos
ef1d589caa ci: use requeue instead of refresh for re-adding PRs to the queue
The current version of Mergify provides a `requeue` command in addition
to `refresh`. After a CI job failed, the PR needs to be re-added to the
queue, so the `requeue` command is more appropriate.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2022-04-20 12:38:46 +00:00
Humble Chirammal
0dfc15121c deploy: change the image registry for sidecars
This commit change the image registry URL for sidecars in the
deployment from `k8s.gcr.io` to `registry.k8s.io` as
the migration is happening from former to the latter. This commit
also correct the e2e readme for the change.

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2022-04-20 10:05:13 +00:00
Humble Chirammal
7d3fd4f683 nfs: change the image registry for sidecars
This commit change the image registry URL for sidecars in the
NFS deployment from `k8s.gcr.io` to `registry.k8s.io` as
the migration is happening from former to the latter.

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2022-04-20 10:05:13 +00:00
Humble Chirammal
6d06698672 rbd: change the image registry for sidecars
This commit change the image registry URL for sidecars in the
RBD deployment from `k8s.gcr.io` to `registry.k8s.io` as
the migration is happening from former to the latter.

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2022-04-20 10:05:13 +00:00