Commit Graph

2047 Commits

Author SHA1 Message Date
Madhu Rajanna
2eebf6b6e0 build: use cephcsi bot token to push helm charts
GITHUB_TOKEN is auto generated for cephcsi repo
and it cannot be used to push helm charts to
different repo. added new secret CEPH_CSI_BOT_TOKEN
to push helm charts.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2021-06-10 12:56:10 +05:30
Madhu Rajanna
3e9aafd730 build: enclose shell variables in {}
enclose the shell variables in `{}` deploy.sh

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2021-06-10 10:17:19 +05:30
Madhu Rajanna
08c8272282 build: add shellcheck source=build.env in deploy.sh
added `# shellcheck source=build.env` in deploy.sh
to fix shellcheck SC1090 failure.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2021-06-10 10:17:19 +05:30
Madhu Rajanna
36db988f73 ci: pushing artifacts using github actions
As Travis CI `https://travis-ci.org/` is getting
shutdown date on June 15th. Either we need to move
to new place https://www.travis-ci.com/ or we can
switch to github action to push image and the helm
charts when a PR is merged.

fixes: #1781

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2021-06-10 10:17:19 +05:30
Yati Padia
21a400839f cleanup: No use of variable validateEncryption
In the function validatePVCSnapshot(...), we don't need
validateEncryption variable as we are passing kms value
which can help us check the value of validateEncryption.
Hence, we can avoid using that.

Signed-off-by: Yati Padia <ypadia@redhat.com>
2021-06-08 13:00:11 +00:00
Madhu Rajanna
7b5c78ec7c rbd: fail fast in create volume for missmatch encryption
CreateVolume will fail in below cases

* If the snapshot is encrypted and requested volume
is not encrypted
* If the snapshot is not encrypted and requested
volume is encrypted

* If the parent volume is encrypted and requested volume
is not encrypted
* If the parent volume is not encrypted and requested
volume is encrypted

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2021-06-07 15:05:21 +00:00
Niels de Vos
538e36f7a7 ci: pass GITHUB_BASE_REF when running commitlint
GitHub Actions include a merge commit for the PR, which will defeat the
commitlint checking of all the commits inside the PR (only the merge
commit is checked).

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2021-06-07 18:19:23 +05:30
Madhu Rajanna
4e2c4ef704 cephfs: return internal server error
if it is an error from the IsMountPoint
function and the error is not IsNotExist return
it as a internal server error.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2021-06-07 07:38:48 +00:00
Madhu Rajanna
46f1ab9e99 cephfs: use IsMountPoint to check mountpoint
Currently we are relaying on the error output from
the umount command we run on the nodes when mounting
the volume but we are not checking for all the error
message to verify the volume is mounted or not.
This commits uses IsMountPoint function in util
to check the mountpoint.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2021-06-07 07:38:48 +00:00
Madhu Rajanna
b4dbffa316 util: return actual error from IsMountPoint
as callers are already taking care of returing
the GRPC error code return the actual error
from  the IsMountPoint function.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2021-06-07 07:38:48 +00:00
Madhu Rajanna
fb7dc13dfe rebase: update packages in go.mod to latest releases
updated few packages in go.mod to latest
available release.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2021-06-04 11:52:22 +00:00
Yati Padia
0f44c6acb7 cleanup: address wasted assign issues
At places variable is reassigned without
being used.

Signed-off-by: Yati Padia <ypadia@redhat.com>
2021-06-03 09:51:14 +00:00
YingshuoTao
bfe64d4aee cephfs: pass extra volume attributes to static PV
when using pre-provisioned volumes, pass these parameters:
- kernelMountOptions
- fuseMountOptions
- subVolumeGroup
in spec.csi.volumeAttributes in PV declaration

Signed-off-by: YingshuoTao <frigid.blues@gmail.com>
2021-06-03 04:42:59 +00:00
Niels de Vos
7cbad9305f rbd: repair thick-provisioned images on CreateVolume restart
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2021-06-01 14:42:12 +00:00
Niels de Vos
96a8ea3e88 cleanup: split repairExistingVolume() from CreateVolume()
Move the repairing of a volume/snapshot from CreateVolume to its own
function. This reduces the complexity of the code, and makes the
procedure easier to understand. Further enhancements to repairing an
exsiting volume can be done in the new function.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2021-06-01 14:42:12 +00:00
Madhu Rajanna
2e978e4211 rbd: fix typo in error message
fixed typo in error message.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2021-06-01 10:40:07 +00:00
Madhu Rajanna
a666d452bf cephfs: return GRPC error in NodeGetVolumeStats
in case of failure return GRPC error to the caller.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2021-05-31 08:17:37 +00:00
Rakshith R
81809500be e2e: wait for upgraded pods to be running during upgrade-tests
This commit calls `waitForDaemonSets` and `waitForDeploymentComplete`
after upgrading to wait for csi driver pods to be in running state
for both rbd and cephfs upgrade tests.

Signed-off-by: Rakshith R <rar@redhat.com>
2021-05-26 13:15:52 +00:00
Prasanna Kumar Kalever
6984da5096 build: ignore unparam linter false positive
Ignoring below warnings:

e2e/pod.go:207:60: `execCommandInContainer` - `cn` always receives
`"csi-rbdplugin"` (unparam)
func execCommandInContainer(f *framework.Framework, c, ns, cn string,
opt *metav1.ListOptions) (string, string, error) {
                                                           ^
e2e/pod.go:308:43: `deletePodWithLabel` - `skipNotFound` always receives
`false` (unparam)
func deletePodWithLabel(label, ns string, skipNotFound bool) error {

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2021-05-26 10:41:34 +00:00
Prasanna Kumar Kalever
85e1e0370a e2e: enable an old testcase as the ndb module is available
This testcase tests journaling/exclusive-lock image-features with
rbd-nbd mounter

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2021-05-26 10:41:34 +00:00
Prasanna Kumar Kalever
819267112e e2e: restart rbd-nbd process after nodeplugin reboot
Bringup the rbd-nbd map/attach process on the rbd node plugin and expect the
IO to continue uninterrupted.

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2021-05-26 10:41:34 +00:00
Prasanna Kumar Kalever
7334c3b783 e2e: add ability to run command inside specified container
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2021-05-26 10:41:34 +00:00
Prasanna Kumar Kalever
695ec6dffe e2e: Test IO after nodeplugin reboot
This is a negative testcase to showcase as per current design
the IO will fail because of the missing mappings

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2021-05-26 10:41:34 +00:00
Prasanna Kumar Kalever
8bae8f8458 e2e: add a test case for rbd-nbd mounter
To validate the basic working of rbd-nbd

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2021-05-26 10:41:34 +00:00
Humble Chirammal
78211b694b build: update client-go and other kube dependencies to 1.20.6
client-go 1.20.6 has a fix for below CVE: This patch address this
via updating client-go and other dependencies.

CVE-2019-11250 : The MITRE CVE dictionary describes this issue as:

The Kubernetes client-go library logs request headers at verbosity
levels of 7 or higher. This can disclose credentials to unauthorized
users via logs or command output. Kubernetes components (such as
kube-apiserver) prior to v1.16.0, which make use of basic or bearer
token authentication, and run at high verbosity levels, are affected.

Ref# https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-11250

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2021-05-26 09:14:10 +00:00
Rakshith R
fa1414d98f cleanup: address ineffectual assignement linter issue
updates: #1586

Signed-off-by: Rakshith R <rar@redhat.com>
2021-05-26 07:04:32 +00:00
Rakshith R
b891e5585d cleanup: address ifshort linter issues
This commit addresses ifshort linter issues which
checks if short syntax for if-statements is possible.

updates: #1586

Signed-off-by: Rakshith R <rar@redhat.com>
2021-05-26 07:04:32 +00:00
Rakshith R
6618e2012d cleanup: remove unnecessary calling of .String() when logging
This commit removes calling of .String() when logging
since `%s`,`%v` or `%q` will call an existing .String() function
automatically.

Fixes: #2051

Signed-off-by: Rakshith R <rar@redhat.com>
2021-05-25 18:02:11 +00:00
Niels de Vos
19a4d12bec rebase: update minikube to v1.20.0
See-also: https://github.com/kubernetes/minikube/releases/tag/v1.20.0
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2021-05-25 16:14:13 +00:00
Rakshith R
d04bfe890f helm: fix k8s version string for csidriver crds semverCompare
Current implementation of semvercompare fails against
pre-release versions. This commit fixes it by using
the entire version string at which csidriver api became GA.

s|">=1.18"|">=1.18.0-beta.1"

Fixes: #2039

Signed-off-by: Rakshith R <rar@redhat.com>
2021-05-25 14:23:33 +00:00
Madhu Rajanna
13b85d09fd ci: add wait flag to minikube start
add wait flag to minikube start to wait
for all components to be ready by default.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2021-05-25 12:49:12 +00:00
Madhu Rajanna
4cc6238cc4 ci: add wait-timeout flag to minikube start
minikube waits for 6 minutes for the minikube or
host to be ready, due to resource issue sometimes
the host/minikube might take longer time to start
the cluster. Increase the timeout to 10m by default

updates #1969

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2021-05-25 12:49:12 +00:00
Yati Padia
774e8e4042 util: enable golang profiling
Add support for golang profiling.
Standard tools like go tool pprof and curl
work. example:
$ go tool pprof http://localhost:8080/debug/pprof/profile
$ go tool pprof http://localhost:8080/debug/pprof/heap
$ curl http://localhost:8080/debug/pprof/heap?debug=1

https://golang.org/pkg/net/http/pprof/ contains
more details about the pprof interface.

Fixes: #1699

Signed-off-by: Yati Padia <ypadia@redhat.com>
2021-05-25 10:41:22 +00:00
Humble Chirammal
9aa3520c9d build: update go version to 1.16 in go.mod
Make go version latest in the repo

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2021-05-25 09:03:52 +00:00
Niels de Vos
2b9f6c3598 e2e: fetch volume metrics from Kubelet
Test if metrics are available at all. The actual values are a little
difficult to validate.

BlockMode volumes support metrics since Kubernetes 1.22.

See-also: kubernetes/kubernetes#97972
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2021-05-25 06:41:04 +00:00
Niels de Vos
25d0a1cfc0 rbd: add support for block-devices in NodeGetVolumeStats()
The NodeGetVolumeStats procedure can now be used to fetch the capacity
of the RBD block-device. By default this is a thin-provisioned device,
which means that the capacity is not reserved in the Ceph cluster. This
makes it possible to over-provision the cluster.

In order to detect the amount of storage used by the RBD block-device
(when thin-provisioned), it is required to connect to the Ceph cluster.
Unfortunately, the NodeGetVolumeStats CSI procedure does not provide
enough parameters to connect to the Ceph cluster and fetch more details
about the RBD image.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2021-05-25 06:41:04 +00:00
Niels de Vos
c0ab4c03e6 cephfs: move NodeGetVolumeStats() to CephFS NodeServer
The CephFS NodeServer should handle the CephFS specific requests. This
is not something that the NodeServer for RBD should handle.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2021-05-25 06:41:04 +00:00
Rakshith R
a4e4750fdc deploy: disable mon,mgr and mds liveness probe
This commit disables mon,mgr and mds liveness probe
which on failing caused `crashLoopBackOff` state.

Updates: #2094

Signed-off-by: Rakshith R <rar@redhat.com>
2021-05-24 16:12:20 +00:00
Humble Chirammal
151b8c665f e2e: update upgrade-version to v3.3.1
making default version of upgrade test to v3.3.1

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2021-05-24 16:12:20 +00:00
Humble Chirammal
d56978739f deploy: update Rook version to v1.6.2
Rook v1.6.2 is available and this patch updates the version to the
same:

https://github.com/rook/rook/releases/tag/v1.6.2

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2021-05-24 16:12:20 +00:00
Rakshith R
5cf3b4ab80 cleanup: update mergify.yml to remove bot_account option
Mergify.io has removed bot_account from its free open source plan.
This commit removes bot_account option from comment, merge and rebase
actions default and documenting the implications going forward.

Signed-off-by: Rakshith R <rar@redhat.com>
2021-05-20 13:38:00 +05:30
Niels de Vos
150de619ee ci: have mergify set the component/build label on PRs
Quite a few PRs have the `build:` prefix. It would be good to teach
Mergify to set the label for those.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2021-05-17 09:56:56 +05:30
Madhu Rajanna
5ccdb35da0 doc: update readme for latest releases
as we have done new releases for v3.3.x and
v3.2.x updating the readme for the same.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2021-05-10 09:42:18 +00:00
Madhu Rajanna
0ce6ad1152 rbd: fix image details logging
log only the required details of
the image.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2021-05-07 07:57:37 +00:00
Madhu Rajanna
fa36a46682 e2e: pvc mounting when snap and parent pvc is deleted
Added an E2E test to test below case

* Create PVC
* Create Snapshot from PVC
* Delete PVC
* Create Clone from Snapshot
* Delete Snapshot
* Mount clone to Application
* Delete Application and PVC Clone

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2021-05-07 07:57:37 +00:00
Madhu Rajanna
67d73cd6e9 rbd: flatten image if the depth is not zero
flatten the image if the deep-flatten feature
is present on the images in the chain or if the
images in chain is not zero, as we cannot check
the deep-flatten feature the images which are
in trash.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2021-05-07 07:57:37 +00:00
Madhu Rajanna
e15e2e5081 rbd: discard image not found error
For flatten we call checkImageChainHasFeature
which internally calls to getImageInfo returns
the parent name even if the parent is in the trash,
when we try to open the parent image to get its
information it fails as the image not found.
we should treat error as nil if the parent is not found.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2021-05-07 07:57:37 +00:00
Niels de Vos
86cfc3dd0c ci: update mergify config for labelling PRs
When the configuration for Mergify itself gets updated, there is not
need to run the e2e tests.

It seems that the `(ci)` part of matching a title for ci/testing PRs
would match partial words like 'capacity'. This is not intended, so
rephrasing the regex and adding `e2e` as match too.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2021-05-07 07:43:40 +05:30
Niels de Vos
f11a041f56 cleanup: address gosec complaint about creating a file
The new gosec 2.7.0 complains like:

    G304 (CWE-22): Potential file inclusion via variable (Confidence: HIGH, Severity: MEDIUM)

Updates: #2025
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2021-05-05 16:05:23 +00:00
Madhu Rajanna
07a916b84d rbd: mark image ready when image state is up+unknown
To recover from split brain (up+error) state the image need to be
demoted and requested for resync on site-a and then the image on site-b
should gets demoted.The volume should be marked to ready=true when the
image state on both the clusters are up+unknown because during the last
snapshot syncing the data gets copied first and then image state on the
site-a changes to up+unknown.

If the image state on both the sites are up+unknown consider that
complete data is synced as the last snapshot
gets exchanged between the clusters.

* create 10 GB of file and validate the data after resync

* Do Failover when the site-a goes down
* Force promote the image and write data in GiB
* Once the site-a comes back, Demote the image and issue resync
* Demote the image on site-b
* The status will get reflected on the other site when the last
  snapshot sync happens
* The image will go to up+unknown state. and complete data will
  be copied to site a
* Promote the image on site-a and use it

```bash
csi-vol-5633715e-a7eb-11eb-bebb-0242ac110006:
  global_id:   e7f9ec55-06ab-46cb-a1ae-784be75ed96d
  state:       up+unknown
  description: remote image demoted
  service:     a on minicluster1
  last_update: 2021-04-28 07:11:56
  peer_sites:
    name: e47e29f4-96e8-44ed-b6c6-edf15c5a91d6-rook-ceph
    state: up+unknown
    description: remote image demoted
    last_update: 2021-04-28 07:11:41
 ```

* Do Failover when the site-a goes down
* Force promote the image on site-b and write data in GiB
* Demote the image on site-b
* Once the site-a comes back, Demote the image on site-a
* The images on the both site will go to split brain state

```bash
csi-vol-37effcb5-a7f1-11eb-bebb-0242ac110006:
  global_id:   115c3df9-3d4f-4c04-93a7-531b82155ddf
  state:       up+error
  description: split-brain
  service:     a on minicluster2
  last_update: 2021-04-28 07:25:41
  peer_sites:
    name: abbda0f0-0117-4425-8cb2-deb4c853da47-rook-ceph
    state: up+error
    description: split-brain
    last_update: 2021-04-28 07:25:26
```
* Issue resync
* The images cannot be resynced because when we issue resync
  on site a the image on site-b was in demoted state
* To recover from this state (promote and then demote the
  image on site-b after sometime)

```bash
csi-vol-37effcb5-a7f1-11eb-bebb-0242ac110006:
  global_id:   115c3df9-3d4f-4c04-93a7-531b82155ddf
  state:       up+unknown
  description: remote image demoted
  service:     a on minicluster1
  last_update: 2021-04-28 07:32:56
  peer_sites:
    name: e47e29f4-96e8-44ed-b6c6-edf15c5a91d6-rook-ceph
    state: up+unknown
    description: remote image demoted
    last_update: 2021-04-28 07:32:41
```
* Once the data is copied we can see that  the image state
  is moved to up+unknown on both sites
* Promote the image on site-a and use it

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2021-05-05 13:38:29 +00:00