ceph-csi

mirror of https://github.com/ceph/ceph-csi.git synced 2025-06-14 18:53:35 +00:00

Author	SHA1	Message	Date
Madhu Rajanna	0dd152928d	e2e: add option to set retainpolicy for rbd storageclass added an option to set retain policy for rbd storageclasses. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2020-11-28 18:50:00 +00:00
Madhu Rajanna	30af703a2f	rbd: add controller to main Initialize and start the rbd controller when the driver type is controller. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2020-11-28 18:50:00 +00:00
Madhu Rajanna	68bd44beba	rbd: add new controller to regenerate omap data In the case of Disaster Recovery failover, the user is expected to create static PVCs. We have planned not to rely on the PVC name and namespace for many reasons (Kubernetes plans to support transferring a PVC to a new namespace with a different name, and new features such as data populators are coming). For now, we are planning to go with static PVCs to support async mirroring. During async mirroring only the RBD images are mirrored to the secondary site, and when the user creates the static PVCs on failover we need to regenerate the omap data. The volumeHandle in the PV spec is an encoded string which contains the clusterID, poolID and image UUID. The clusterID and poolID won't remain the same on both clusters, so cephcsi needs to generate a new volume handle and create a mapping between the new and the old volume handle. Whenever cephcsi gets a CSI request, it checks whether the mapping exists; if so, it picks up the new volume handle and continues with the other operations. The new controller watches for created PVs and checks whether the omap exists; if it doesn't, the controller regenerates the entire omap data. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2020-11-28 18:50:00 +00:00
Madhu Rajanna	5af3fe5deb	rebase: add controller runtime dependency This commit adds the controller runtime and its dependencies to the vendor directory. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2020-11-28 18:50:00 +00:00
Madhu Rajanna	14700b89d1	rbd: update inuse logic of a rbd image In case of a mirrored image, if the image is primary, a watcher will be added by the rbd mirror daemon on the rbd image. We have to consider 2 watchers when checking whether the image is in use. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2020-11-28 18:50:00 +00:00
Madhu Rajanna	ba84f14241	journal: create object with provided UUID In case of async mirroring the volume UUID is retrieved from the volume name. Instead of generating a new UUID, cephcsi should reserve the passed UUID. It will be useful when we support both metro DR and async mirroring on a Kubernetes cluster. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2020-11-28 18:50:00 +00:00
Niels de Vos	1f18e876f0	e2e: use docker.io/library as prefix for official images Docker Hub offers a way to pull official images without any project prefix, like "docker.io/vault:latest". This does a redirect to the images located under "docker.io/library". By using the fully qualified image name, a redirect gets removed while pulling the images. This reduces the likelihood of hitting Docker Hub pull rate-limits. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2020-11-26 13:51:02 +00:00
Niels de Vos	954ac97d22	ci: add more logging during Rook deployment It seems that the new log_errors() function does not get triggered when the script hits `exit 1` conditions in functions. The functions should return a non-0 value, not cause an exit of the script. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2020-11-24 18:12:29 +00:00
Niels de Vos	ed033153ea	ci: gather logs when deploying Rook fails Log a few commands that help troubleshooting Rook deployment issues. This might need to get extended with more commands. Updates: #1636 Signed-off-by: Niels de Vos <ndevos@redhat.com>	2020-11-24 14:53:38 +00:00
Niels de Vos	eaeee8ac3d	deploy: use docker.io for unqualified image names Images that have an unqualified name (no explicit registry) come from Docker Hub. This can be made explicit by adding docker.io as prefix. In addition, the default :latest tag has been added too. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2020-11-24 10:27:33 +00:00
Niels de Vos	7a69c5e238	build: add ROOK_VERSION to build.env The CentOS CI jobs use Rook v1.3.9, this version should be placed in build.env just like other versions that the CI jobs detect. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2020-11-23 11:15:23 +00:00
Niels de Vos	4fd0294eb7	e2e: pull centos image from registry.centos.org The BlockVolume PVC tests consume the example files that refer to "centos:latest" without registry. This means that the images will get pulled from Docker Hub, which has rate limits preventing CI jobs from pulling images. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2020-11-19 16:00:33 +00:00
Madhu Rajanna	7d229c2369	build: update imagepullpolicy for vault this allows the image to be reused instead of pulling it again. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2020-11-19 16:00:33 +00:00
Madhu Rajanna	168526a906	e2e: use centos as image for normal user validation Reduce the number of images that get pulled from Docker Hub. Use the official CentOS container registry instead. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2020-11-19 16:00:33 +00:00
Madhu Rajanna	8d86bac9b7	e2e: set imagePullPolicy to ifNotPresent If the imagePullPolicy is not set and the image tag is empty or latest the image is always pulled. This commit sets the policy to pull image if not present. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2020-11-18 13:43:52 +00:00
Madhu Rajanna	8d3a44d0c4	rbd: add minsnapshotsonimage flag An rbd image can have a maximum number of snapshots defined by maxsnapshotsonimage. Once the limit is reached, cephcsi will start flattening the older snapshots and return the ABORT error message. Requests that come after this have to wait until all the images are flattened (this will increase the PVC creation time). Instead of waiting until the maximum number of snapshots on an RBD image is reached, we can have a soft limit (minsnapshotsonimage); once that limit is reached, cephcsi will start the flattening task to break the chain. With this, PVC creation time will only be affected when the hard limit (maxsnapshotsonimage) is reached. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2020-11-18 05:59:20 +00:00
Niels de Vos	880b5bb427	ci: use the Fedora container registry for cephcsi:test This reduces the dependency on Docker, where image pull rate limits are seen in the CI. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2020-11-17 09:28:02 +00:00
Prasanna Kumar Kalever	817edfd1c7	cleanup: remove the use of text in markdown We do not have `text` in the new section of the MarkDown Rules. Hence dropping them. Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2020-11-11 13:18:05 +00:00
Prasanna Kumar Kalever	8475a3b97e	doc: update about a markdown rule in coding guide Update the coding guide about MD014, i.e. Dollar signs used before commands without showing output Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2020-11-11 13:18:05 +00:00
Prasanna Kumar Kalever	2945f7b669	cleanup: stick to standards when using dollar-sign in md MD014 - Dollar signs used before commands without showing output The dollar signs are unnecessary, it is easier to copy and paste and less noisy if the dollar signs are omitted. Especially when the command doesn't list the output, but if the command is followed by output we can use `$ ` (dollar+space) mainly to differentiate between the command and its output. scenario 1: when the command is not followed by output ```console cd ~/work ``` scenario 2: when the command is followed by output (use dollar+space) ```console $ ls ~/work file1 file2 dir1 dir2 ... ``` Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2020-11-11 13:18:05 +00:00
Prasanna Kumar Kalever	fcaa332921	doc: improve e2e tests guide Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>	2020-11-11 13:18:05 +00:00
Yug	3ac6bbd87c	cephfs: Add isCloneRetryError function The function isCloneRetryError verifies if the clone error is `pending` or `in-progress` error. Co-authored-by: Madhu Rajanna <madhupr007@gmail.com> Signed-off-by: Yug <yuggupta27@gmail.com>	2020-11-09 07:29:12 +00:00
Yug	acbedc52bf	cephfs: Add 'pending' state for clone status In certain cases, clone status can be 'pending'. In that case, abort error message should be returned similar to that during 'in-progress' state. Co-authored-by: Madhu Rajanna <madhupr007@gmail.com> Signed-off-by: Yug <yuggupta27@gmail.com>	2020-11-09 07:29:12 +00:00
Niels de Vos	565038fdfd	cephfs: ignore quota when SubVolumeInfo() returns Infinite There is a type-check on BytesQuota after calling SubVolumeInfo() to see if the value is supported. In case no quota is configured, the value Infinite is returned. This can not be converted to an int64, so the original code returned an error. It seems that attaching/mounting sometimes fails with the following error: FailedMount: MountVolume.MountDevice failed for volume "pvc-0e8fdd18-873b-4420-bd27-fa6c02a49496" : rpc error: code = Internal desc = subvolume csi-vol-0d68d71a-1f5f-11eb-96d2-0242ac110012 has unsupported quota: infinite By ignoring the quota of Infinite, and not setting a quota in the Subvolume object, this problem should not happen again. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2020-11-06 14:58:26 +00:00
John Mulligan	8a41cd03a5	journal: fix reading omaps from objects with large key counts The implementation of getOMapValues assumed that the number of key-value pairs assigned to the object would be close to the number of keys being requested. When the number of keys on the object exceeded the "listExcess" value the function would fail to read additional keys even if they existed in the omap. This change sets a large fixed "chunk size" value and keeps reading key-value pairs as long as the callback gets called and increments the numKeys counter. Signed-off-by: John Mulligan <jmulligan@redhat.com>	2020-11-06 06:42:22 +00:00
Niels de Vos	7caca1137f	e2e: do not use Failf() to abort tests in a go-routine (util) There are several go-routines where Failf() is called, which will cause a Golang panic inside the Ginkgo test framework. Instead of aborting the go-routine, capture the error and check for failures once all go-routines have finished. The CephFS tests have been updated already, this change only affects the validatePVCClone() utility function. Updates: #1359 Signed-off-by: Niels de Vos <ndevos@redhat.com>	2020-11-06 03:31:39 +00:00
Niels de Vos	45d64ab7d0	e2e: do not use Failf() to abort tests in a go-routine (rbd) There are several go-routines where Failf() is called, which will cause a Golang panic inside the Ginkgo test framework. Instead of aborting the go-routine, capture the error and check for failures once all go-routines have finished. The CephFS tests have been updated already, this change only affects the RBD tests. Updates: #1359 Signed-off-by: Niels de Vos <ndevos@redhat.com>	2020-11-06 03:31:39 +00:00
Niels de Vos	96eafecad8	e2e: do not use Failf() to abort tests in a go-routine There are several go-routines where Failf() is called, which will cause a Golang panic inside the Ginkgo test framework. Instead of aborting the go-routine, capture the error and check for failures once all go-routines have finished. Updates: #1359 Signed-off-by: Niels de Vos <ndevos@redhat.com>	2020-11-05 11:26:20 +00:00
Niels de Vos	cba466c163	ci: do not run "make mod-check" for containerized-tests An updated CI job will run "make mod-check" in parallel with the full containerized-test and containerized-build targets. This will hopefully reduce the time that is needed for the whole ci/centos/containerized-tests job. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2020-11-04 05:36:07 +00:00
Niels de Vos	afd6994e19	ci: allow passing USE_PULLED_IMAGE=yes to use pre-pulled images When passing USE_PULLED_IMAGE=yes to the containerized-test or containerized-build make targets, it is now possible to use pre-pulled container images. This saves time as the container images will not get created from scratch. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2020-11-04 05:36:07 +00:00
Mudit Agarwal	0ecfd0e72c	rbd: replace go-ceph GetParentInfo() with GetParent() GetParent() is a newer and better version of GetParentInfo() in go-ceph. Signed-off-by: Mudit Agarwal <muagarwa@redhat.com>	2020-11-03 08:00:12 +00:00
Niels de Vos	eefaf09ade	doc: add common bot commands to GitHub PR template By placing the common bot commands and their description in the PR template, developers are reminded of their usage. The idea comes from the Ceph project where this is done too. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2020-10-28 11:25:34 +00:00
Niels de Vos	523d813b4e	doc: allow in-line HTML in MarkDown documents The GitHub style for Pull Request and Issue templates add HTML tags for some advanced usage. The MarkDown linter should not give warnings when these are used. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2020-10-28 11:25:34 +00:00
Niels de Vos	ecc33a9f86	cleanup: no need to validate conf.Vtype twice conf.Vtype was verified already, no need to do it a 2nd time. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2020-10-28 09:37:36 +00:00
Niels de Vos	4c91f07c78	cleanup: do not panic when validateMaxSnaphostFlag() detects an error When the cephcsi executable detects an error when calling validateMaxSnaphostFlag(), it panics due to klog.Fatalln(). The error that validateMaxSnaphostFlag() logs should be understandable enough, so that users know what to investigate. A Go panic on a user error is not very user-friendly, and does not provide any additional useful information. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2020-10-28 09:37:36 +00:00
Niels de Vos	3e305970df	cleanup: do not panic when validateCloneDepthFlag() detects an error When the cephcsi executable receives an error when calling validateCloneDepthFlag(), it panics due to klog.Fatalln(). The errors that validateCloneDepthFlag() logs should be understandable enough, so that users know what to investigate. A Go panic on a user error is not very user-friendly, and does not provide any additional useful information. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2020-10-28 09:37:36 +00:00
Niels de Vos	86a8d29bd1	cleanup: do not panic when the metricspath is not a valid URL When the cephcsi executable receives an error when calling util.ValidateURL() on the optional "metricspath", it panics due to klog.Fatalln(). The error that util.ValidateURL() returns should be understandable enough, so that users know what to investigate. A Go panic on a user error is not very user-friendly, and does not provide any additional useful information. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2020-10-28 09:37:36 +00:00
Niels de Vos	23817c1a83	cleanup: do not panic on invalid drivername When the cephcsi executable receives an error when calling util.ValidateDriverName(), it panics due to klog.Fatalln(). The error that util.ValidateDriverName() returns should be understandable enough, so that users know what to investigate. A Go panic on a user error is not very user-friendly, and does not provide any additional useful information. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2020-10-28 09:37:36 +00:00
Niels de Vos	79e91fa894	cleanup: prevent Go panic on missing driver type When running the 'cephcsi' executable without arguments, a Go panic is reported: $ ./_output/cephcsi F1026 13:59:04.302740 3409054 cephcsi.go:126] driver type not specified goroutine 1 [running]: k8s.io/klog/v2.stacks(0xc000010001, 0xc0000520a0, 0x48, 0x9a) /go/src/github.com/ceph/ceph-csi/vendor/k8s.io/klog/v2/klog.go:996 +0xb9 k8s.io/klog/v2.(loggingT).output(0x2370360, 0xc000000003, 0x0, 0x0, 0xc000194770, 0x20cb265, 0xa, 0x7e, 0x413500) /go/src/github.com/ceph/ceph-csi/vendor/k8s.io/klog/v2/klog.go:945 +0x191 k8s.io/klog/v2.(loggingT).println(0x2370360, 0x3, 0x0, 0x0, 0xc000163e08, 0x1, 0x1) /go/src/github.com/ceph/ceph-csi/vendor/k8s.io/klog/v2/klog.go:699 +0x11a k8s.io/klog/v2.Fatalln(...) /go/src/github.com/ceph/ceph-csi/vendor/k8s.io/klog/v2/klog.go:1456 main.main() /go/src/github.com/ceph/ceph-csi/cmd/cephcsi.go:126 +0xafa Just logging the error and exiting should be sufficient. This stack-trace from the Go panic does not add any useful information. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2020-10-28 09:37:36 +00:00
Niels de Vos	7101a6dc8e	cleanup: add logAndExit() for cephcsi:main() to call instead of panic The main() function of the cephcsi executable calls klog.Fatalln() to report certain errors. This causes the executable to panic which is not helpful to users that only need the error message. By introducing logAndExit(), there is no need to call klog.Fatalln() anymore. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2020-10-28 09:37:36 +00:00
Niels de Vos	9732cf16a1	cephfs: drop unused Credentials from resizeVolume() When using go-ceph and the volumeOptions.Connect() call, the credentials are not needed once the connection is established. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2020-10-28 08:02:12 +00:00
Niels de Vos	5baed6190c	cephfs: implement resizeVolume() with go-ceph Reduce the number of calls to the `ceph fs` executable to improve performance of CephFS volume resizing. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2020-10-28 08:02:12 +00:00
Niels de Vos	d431402101	cephfs: make resizeVolume() a method of volumeOptions This prepares resizeVolume() so that the volumeOptions.conn can be used for connecting with go-ceph and use the connection cache. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2020-10-28 08:02:12 +00:00
Niels de Vos	48108bc549	e2e: retry deploying on API-server timeouts The upgrade-tests-cephfs fails relatively regularly with the following error during initial deployment: timeout waiting for deployment csi-cephfsplugin-provisioner with error error waiting for deployment "csi-cephfsplugin-provisioner" status to match expectation: etcdserver: request timed out By detecting if the API-server returned a non-fatal error, the test does not need to abort, but can wait for completion. PollImmediate() will still return ErrWaitTimeout once the timeout elapsed. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2020-10-28 06:56:41 +00:00
Niels de Vos	b26d33b7c1	build: install git as when building from Dockerfile When running a simple build with only the required arguments, the following warnings are reported: $ buildah bud --build-arg=BASE_IMAGE=ceph/ceph:v15 --build-arg=GO_ARCH=amd64 -f ./deploy/cephcsi/image/Dockerfile . ... STEP 15: COPY . ${SRC_DIR} STEP 16: RUN make cephcsi cephcsi image settings: quay.io/cephcsi/cephcsi version canary make: git: Command not found make: git: Command not found if [ ! -d ./vendor ]; then (go mod tidy && go mod vendor); fi make: git: Command not found ... STEP 23: COMMIT Getting image source signatures ... Writing manifest to image destination Storing signatures --> 239b19c4049 git is used to detect the current commit, and store it in the binary that is built. Without the commit, the "Git Commit:" in the output is empty, making it impossible to get the exact version: $ podman run --rm 239b19c4049 --version Cephcsi Version: canary Git Commit: Go Version: go1.15 Compiler: gc Platform: linux/amd64 Kernel: 5.8.4-200.fc32.x86_64 Signed-off-by: Niels de Vos <ndevos@redhat.com>	2020-10-27 21:46:38 +00:00
Madhu Rajanna	fdbd487741	ci: fix shellcheck in test-go Fixed shellcheck in test-go script Signed-off-by: Madhu Rajanna <madhupr007@gmail.com> Signed-off-by: Niels de Vos <ndevos@redhat.com>	2020-10-27 17:04:09 +00:00
Niels de Vos	6e3d68f575	ci: use major Kubernetes versions again for Mergify checks With the update to minikube v1.14.1 downloading binaries for the recent Kubernetes patch releases works again. The CI jobs have been updated to use the major versions, and so should Mergify. Fixes: #1588 Signed-off-by: Niels de Vos <ndevos@redhat.com>	2020-10-27 19:35:02 +05:30
Niels de Vos	3c1aff2174	rebase: update minikube to v1.14.1 Minikube 1.14.1 contains a fix for downloading Kubernetes binaries with version 1.19.3 and 1.18.10. When this version of minikube is used, we can return to passing major versions to CI jobs (1.19 and 1.18). Updates: #1588 See-also: kubernetes/minikube#9500 Signed-off-by: Niels de Vos <ndevos@redhat.com>	2020-10-27 03:35:31 +00:00
Niels de Vos	381ea22641	ci: create ceph-csi-config ConfigMap for external storage tests The StorageClasses that get deployed for the Kubernetes e2e external storage tests reference a ConfigMap that contains the connection details for the Ceph cluster. Without this ConfigMap, Ceph-CSI will not function correctly. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2020-10-23 11:40:28 +00:00
Niels de Vos	6a46b8f17f	cephfs: implement getSubVolumeInfo() with go-ceph Fixes: #1551 Signed-off-by: Niels de Vos <ndevos@redhat.com>	2020-10-23 10:58:35 +00:00
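
The three "do not use Failf() to abort tests in a go-routine" commits above describe a pattern: capture each go-routine's error and check for failures only after all go-routines have finished. The following is a minimal sketch of that pattern in plain Go, not the actual ceph-csi e2e code. The names runParallel, countFailures and failOdd are hypothetical and exist only for this illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// runParallel (hypothetical helper) starts one goroutine per task index.
// Each goroutine stores its result in its own slot of errs, so no goroutine
// has to abort the test itself and no extra locking is needed.
func runParallel(count int, task func(i int) error) []error {
	errs := make([]error, count)
	var wg sync.WaitGroup
	wg.Add(count)
	for i := 0; i < count; i++ {
		go func(n int) {
			defer wg.Done()
			errs[n] = task(n)
		}(i)
	}
	// Wait until every goroutine has finished before anyone inspects errs.
	wg.Wait()
	return errs
}

// countFailures reports how many tasks returned a non-nil error.
func countFailures(errs []error) int {
	failed := 0
	for _, err := range errs {
		if err != nil {
			failed++
		}
	}
	return failed
}

// failOdd is a stand-in task for this sketch: odd indexes fail.
func failOdd(i int) error {
	if i%2 == 1 {
		return fmt.Errorf("task %d failed", i)
	}
	return nil
}

func main() {
	errs := runParallel(4, failOdd)
	// In a real test, the single failure call would go here, on the
	// goroutine that runs the test, after all workers are done.
	fmt.Printf("%d of %d tasks failed\n", countFailures(errs), len(errs))
}
```

Writing to distinct slice elements from different goroutines is safe without a mutex, and the WaitGroup ensures the results are read only after all writes have completed.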


2059 Commits