Commit Graph

2589 Commits

Author SHA1 Message Date
Prasanna Kumar Kalever
fcaa332921 doc: improve e2e tests guide
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2020-11-11 13:18:05 +00:00
Yug
3ac6bbd87c cephfs: Add isCloneRetryError function
The function isCloneRetryError verifies
if the clone error is `pending` or
`in-progress` error.

Co-authored-by: Madhu Rajanna <madhupr007@gmail.com>
Signed-off-by: Yug <yuggupta27@gmail.com>
2020-11-09 07:29:12 +00:00
Yug
acbedc52bf cephfs: Add 'pending' state for clone status
In certain cases, clone status can be 'pending'.
In that case, abort error message should be
returned similar to that during 'in-progress'
state.

Co-authored-by: Madhu Rajanna <madhupr007@gmail.com>
Signed-off-by: Yug <yuggupta27@gmail.com>
2020-11-09 07:29:12 +00:00
Niels de Vos
565038fdfd cephfs: ignore quota when SubVolumeInfo() returns Infinite
There is a type-check on BytesQuota after calling SubVolumeInfo() to see
if the value is supported. In case no quota is configured, the value
Infinite is returned. This can not be converted to an int64, so the
original code returned an error.

It seems that attaching/mounting sometimes fails with the following
error:

    FailedMount: MountVolume.MountDevice failed for volume "pvc-0e8fdd18-873b-4420-bd27-fa6c02a49496" : rpc error: code = Internal desc = subvolume csi-vol-0d68d71a-1f5f-11eb-96d2-0242ac110012 has unsupported quota: infinite

By ignoring the quota of Infinite, and not setting a quota in the
Subvolume object, this problem should not happen again.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-11-06 14:58:26 +00:00
John Mulligan
8a41cd03a5 journal: fix reading omaps from objects with large key counts
The implementation of getOMapValues assumed that the number of key-value
pairs assigned to the object would be close to the number of keys
being requested. When the number of keys on the object exceeded the
"listExcess" value the function would fail to read additional keys
even if they existed in the omap.
This change sets a large fixed "chunk size" value and keeps reading
key-value pairs as long as the callback gets called and increments
the numKeys counter.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
2020-11-06 06:42:22 +00:00
Niels de Vos
7caca1137f e2e: do not use Failf() to abort tests in a go-routine (util)
There are several go-routines where Failf() is called, which will cause
a Golang panic inside the Ginko test framework. Instead of aborting the
go-routine, capture the error and check for failures once all
go-routines have finished.

The CephFS tests have been updated already, this changs only affects the
validatePVCClone() utility function.

Updates: #1359
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-11-06 03:31:39 +00:00
Niels de Vos
45d64ab7d0 e2e: do not use Failf() to abort tests in a go-routine (rbd)
There are several go-routines where Failf() is called, which will cause
a Golang panic inside the Ginko test framework. Instead of aborting the
go-routine, capture the error and check for failures once all
go-routines have finished.

The CephFS tests have been updated already, this changs only affects the
RBD tests.

Updates: #1359
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-11-06 03:31:39 +00:00
Niels de Vos
96eafecad8 e2e: do not use Failf() to abort tests in a go-routine
There are several go-routines where Failf() is called, which will cause
a Golang panic inside the Ginko test framework. Instead of aborting the
go-routine, capture the error and check for failures once all
go-routines have finished.

Updates: #1359
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-11-05 11:26:20 +00:00
Niels de Vos
cba466c163 ci: do not run "make mod-check" for containerized-tests
An updated CI job will run "make mod-check" in parallel with the full
containerized-test and containerized-build targets. This will hopefully
reduced the time that is needed for the whole
ci/centos/containerized-tests job.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-11-04 05:36:07 +00:00
Niels de Vos
afd6994e19 ci: allow passing USE_PULLED_IMAGE=yes to use pre-pulled images
When passing USE_PULLED_IMAGE=yes to the containerized-test or
containerized-build make targets, it is now possible to use pre-pulled
container images. This saves time as the container images will not get
created from scratch.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-11-04 05:36:07 +00:00
Mudit Agarwal
0ecfd0e72c rbd: replace go-ceph GetParentInfo() with GetParent()
GetParent() is a newer and better version of
GetParentInfo() in go-ceph.

Signed-off-by: Mudit Agarwal <muagarwa@redhat.com>
2020-11-03 08:00:12 +00:00
Niels de Vos
eefaf09ade doc: add common bot commands to GitHub PR template
By placing the common bot commands and their description in the PR
template, developers are reminded on their usage. The idea comes from
the Ceph project where this is done too.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-10-28 11:25:34 +00:00
Niels de Vos
523d813b4e doc: allow in-line HTML in MarkDown documents
The GitHub style for Pull Request and Issue templates add HTML tags for
some advanced usage. The MarkDown linter should not give warnings when
these are used.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-10-28 11:25:34 +00:00
Niels de Vos
ecc33a9f86 cleanup: no need to validate conf.Vtype twice
conf.Vtype was verified already, no need to do it a 2nd time.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-10-28 09:37:36 +00:00
Niels de Vos
4c91f07c78 cleanup: do not panic when validateMaxSnaphostFlag() detects an error
When the cephcsi executable detects an error when calling
validateMaxSnaphostFlag(), it panics due to klog.Fatalln(). The error
that validateMaxSnaphostFlag() logs should be understandable enough, so
that users know what to investigate. A Go panic on a user error is not
very userfriendly, and does not provide any additional usefil
information.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-10-28 09:37:36 +00:00
Niels de Vos
3e305970df cleanup: do not panic when validateCloneDepthFlag() detects an error
When the cephcsi executable receives an error when calling
validateCloneDepthFlag(), it panics due to klog.Fatalln(). The errors
that validateCloneDepthFlag() logs should be understandable enough, so
that users know what to investigate. A Go panic on a user error is not
very userfriendly, and does not provide any additional usefil
information.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-10-28 09:37:36 +00:00
Niels de Vos
86a8d29bd1 cleanup: do not panic when the metricspath is not a valid URL
When the cephcsi executable receives an error when calling
util.ValidateURL() on the optional "metricspath". The error that
util.ValidateURL() returns should be understandable enough, so that
users know what to investigate. A Go panic on a user error is not very
userfriendly, and does not provide any additional usefil information.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-10-28 09:37:36 +00:00
Niels de Vos
23817c1a83 cleanup: do not panic on invalid drivername
When the cephcsi executable receives an error when calling
util.ValidateDriverName(), it panics due to klog.Fatalln(). The error
that util.ValidateDriverName() returns should be understandable enough,
so that users know what to investigate. A Go panic on a user error is
not very userfriendly, and does not provide any additional usefil
information.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-10-28 09:37:36 +00:00
Niels de Vos
79e91fa894 cleanup: prevent Go panic on missing driver type
When running the 'cephcsi' executable without arguments, a Go panic is
reported:

    $ ./_output/cephcsi
    F1026 13:59:04.302740 3409054 cephcsi.go:126] driver type not specified
    goroutine 1 [running]:
    k8s.io/klog/v2.stacks(0xc000010001, 0xc0000520a0, 0x48, 0x9a)
    	/go/src/github.com/ceph/ceph-csi/vendor/k8s.io/klog/v2/klog.go:996 +0xb9
    k8s.io/klog/v2.(*loggingT).output(0x2370360, 0xc000000003, 0x0, 0x0, 0xc000194770, 0x20cb265, 0xa, 0x7e, 0x413500)
    	/go/src/github.com/ceph/ceph-csi/vendor/k8s.io/klog/v2/klog.go:945 +0x191
    k8s.io/klog/v2.(*loggingT).println(0x2370360, 0x3, 0x0, 0x0, 0xc000163e08, 0x1, 0x1)
    	/go/src/github.com/ceph/ceph-csi/vendor/k8s.io/klog/v2/klog.go:699 +0x11a
    k8s.io/klog/v2.Fatalln(...)
    	/go/src/github.com/ceph/ceph-csi/vendor/k8s.io/klog/v2/klog.go:1456
    main.main()
    	/go/src/github.com/ceph/ceph-csi/cmd/cephcsi.go:126 +0xafa

Just logging the error and exiting should be sufficient. This stack-trace
from the Go panic does not add any useful information.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-10-28 09:37:36 +00:00
Niels de Vos
7101a6dc8e cleanup: add logAndExit() for cephcsi:main() to call instead of panic
The main() function of the cephcsi executable calls klog.Fatalln() to
report certain errors. This causes the executable to panic which is not
helpful to users that only need the error message.

By introducing logAndExit(), there is no need to call klog.Fatalln()
anymore.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-10-28 09:37:36 +00:00
Niels de Vos
9732cf16a1 cephfs: drop unused Credentials from resizeVolume()
When using go-ceph and the volumeOptions.Connect() call, the credentials
are not needed once the connection is established.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-10-28 08:02:12 +00:00
Niels de Vos
5baed6190c cephfs: implement resizeVolume() with go-ceph
Reduce the number of calls to the `ceph fs` executable to improve
performance of CephFS volume resizing.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-10-28 08:02:12 +00:00
Niels de Vos
d431402101 cephfs: make resizeVolume() a method of volumeOptions
This prepares resizeVolume() so that the volumeOptions.conn can be used
for connecting with go-ceph and use the connection cache.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-10-28 08:02:12 +00:00
Niels de Vos
48108bc549 e2e: retry deploying on API-server timeouts
The upgrade-tests-cephfs fails relative regularly with the following
error during intial deployment:

    timeout waiting for deployment csi-cephfsplugin-provisioner with error error waiting for deployment "csi-cephfsplugin-provisioner" status to match expectation: etcdserver: request timed out

By detecting if the API-server returned a non-fatal error, the test does
not need to abort, but can wait for completion. PollImmediate() will
still return ErrWaitTimeout once the timeout elapsed.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-10-28 06:56:41 +00:00
Niels de Vos
b26d33b7c1 build: install git as when building from Dockerfile
When running a simple build with only the required arguments, the
following warning are reported:

    $ buildah bud --build-arg=BASE_IMAGE=ceph/ceph:v15 --build-arg=GO_ARCH=amd64 -f ./deploy/cephcsi/image/Dockerfile .
    ...
    STEP 15: COPY . ${SRC_DIR}
    STEP 16: RUN make cephcsi
    cephcsi image settings: quay.io/cephcsi/cephcsi version canary
    make: git: Command not found
    make: git: Command not found
    if [ ! -d ./vendor ]; then (go mod tidy && go mod vendor); fi
    make: git: Command not found
    ...
    STEP 23: COMMIT
    Getting image source signatures
    ...
    Writing manifest to image destination
    Storing signatures
    --> 239b19c4049

git is used to detect the current commit, and store it in the binary
that is built. Without the commit, the "Git Commit:" in the output is
empty, making it impossible to get the exact version:

    $ podman run --rm 239b19c4049 --version
    Cephcsi Version: canary
    Git Commit:
    Go Version: go1.15
    Compiler: gc
    Platform: linux/amd64
    Kernel: 5.8.4-200.fc32.x86_64

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-10-27 21:46:38 +00:00
Madhu Rajanna
fdbd487741 ci: fix shellcheck in test-go
Fixed shellcheck in test-go script

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-10-27 17:04:09 +00:00
Niels de Vos
6e3d68f575 ci: use major Kubernetes versions again for Mergify checks
With the update to minikube v1.14.1 downloading binaries for the recent
Kubernetes patch releases works again. The CI jobs have been updated to
use the major versions, and so should Mergify.

Fixes: #1588
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-10-27 19:35:02 +05:30
Niels de Vos
3c1aff2174 rebase: update minikube to v1.14.1
Minikube 1.14.1 contains a fix for downloading Kubernetes binaries with
version 1.19.3 and 1.18.10. When this version of minikube is used, we
can return to passing major versions to CI jobs (1.19 and 1.18).

Updates: #1588
See-also: kubernetes/minikube#9500
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-10-27 03:35:31 +00:00
Niels de Vos
381ea22641 ci: create ceph-csi-config ConfigMap for external storage tests
The StorageClasses that get deployed for the Kubernetes e2e external
storage tests reference a ConfigMap that contains the connection details
for the Ceph cluster. Without this ConfigMap, Ceph-CSI will not function
correctly.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-10-23 11:40:28 +00:00
Niels de Vos
6a46b8f17f cephfs: implement getSubVolumeInfo() with go-ceph
Fixes: #1551
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-10-23 10:58:35 +00:00
Niels de Vos
429f7acd8f cephfs: make getSubVolumeInfo() a method of volumeOptions
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-10-23 10:58:35 +00:00
Satoru Takeuchi
79e7c6a3e2 doc: remove the description of provisioner statefulset
Provisioners don't use StatefulSet anymore.

Signed-off-by: Satoru Takeuchi <satoru.takeuchi@gmail.com>
2020-10-23 06:48:33 +00:00
Prasanna Kumar Kalever
7f8b7ec3d3 ci: fix typo in commitlintrc.yml
fix typo at code comments in commitlintrc.yml

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2020-10-21 23:04:18 +00:00
Prasanna Kumar Kalever
08a6ab6cca ci: disable scope-empty rule in commitlint
we do not use scopes for commit messages

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2020-10-21 23:04:18 +00:00
Prasanna Kumar Kalever
2855334986 ci: improve rules for commitlint
Add few more rules, like:
* subject cannot be empty
* body shouldn't be empty
* avoid body line length over 80 characters (warning only)
* always sign-off the commit msg

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2020-10-21 23:04:18 +00:00
Prasanna Kumar Kalever
ea5264220e doc: update developer guide about retriggering CI jobs
Add instructions about how and when to retrigger the CI jobs

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2020-10-21 22:21:57 +00:00
Humble Chirammal
992d4b92bb util: NewCredentials() dont have any callers
We have below exported function in credentials.go which is not
called from anywhere in the repo. Removing it for the same reason.

```
 // NewCredentials generates new credentials when id and key
 // are provided.

 func NewCredentials(id, key string) (*Credentials, error) {
...
```

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2020-10-21 21:04:11 +00:00
Humble Chirammal
b4452f0787 e2e: validate the error return instead of object for consitency
If loadPVC() fails, it return error and we expect the PVC object
to be nil too. In many places we check on the error and exit.
However in few places we are looking at PVC object.
This commit make the condition check on `err` instead of `PVC`
object for consistency.

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2020-10-21 13:56:12 +00:00
Humble Chirammal
dc4da25a30 e2e: correct typo in encryption string
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2020-10-21 13:56:12 +00:00
Humble Chirammal
5ac3f1e29e util: change member cap of CSIDriver struct
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2020-10-21 12:30:18 +00:00
Humble Chirammal
25617929f0 rbd: use different variable instead of builtin cap function
`cap` builtin function returns the capacity of a type. Its not
good practice to use this builtin function for other variable
names, removing it here

Ref# https://golang.org/pkg/builtin/#cap

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2020-10-21 12:30:18 +00:00
Humble Chirammal
2d70e25081 util: remove unused CSI server initialization functions
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2020-10-21 11:17:22 +00:00
Niels de Vos
5a1e370433 util: drop nolint comment for execCommandJSON()
golang-ci suddenly complains about the following issue

    internal/cephfs/util.go:41:1: directive `// nolint:unparam //  todo:program values has to be revisited later` is unused for linter unparam (nolintlint)
    // nolint:unparam //  todo:program values has to be revisited later
    ^

Dropping the comment completely seems to fix it. Ideally
execCommandJSON() will get removed once the migration to go-ceph is
complete.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-10-20 15:38:44 +00:00
Niels de Vos
eb2584095b cephfs: implement getMetadataPool() with go-ceph
Fixes: #1554
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-10-20 15:38:44 +00:00
Niels de Vos
cabdac4913 cephfs: make getMetadataPool() a method of volumeOptions
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-10-20 15:38:44 +00:00
Madhu Rajanna
551a5018d0 helm: make log level configurable
instead of keeping the log level at 5, which
is required only for tracing the errors. this commit
adds an option for users to configure the log level
for all containers.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2020-10-20 13:59:11 +00:00
Humble Chirammal
70358c8eb7 rbd: volJournal.Connect() return wrongly pushed to caller
volJournal.Connect() got the error on err2 variable, however
the return was on variable err which hold the error return of
DecomposeCSIID() which is wrong. This cause the error return wrongly
parsed and pushed from the caller. From now on, we are reusing the
err variable to hold and revert the error of volJournal.Connect().

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2020-10-20 12:45:51 +00:00
Humble Chirammal
5c73f0e41b rbd: correct the code comment for ErrFlattenInProgress
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2020-10-20 08:59:25 +00:00
Humble Chirammal
9dee064b77 cephfs: fix wrong error check in CreateVolume rollback action
Previously the purgeVolume error was ignored due to wrong error variable
check in the createVolume. With this change it checks on the proper error.

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2020-10-20 04:08:51 +00:00
Niels de Vos
af922f267a e2e: use k8sVersionGreaterEquals() for rbd upgrade tests
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-10-19 16:00:39 +00:00