ceph-csi

mirror of https://github.com/ceph/ceph-csi.git synced 2025-06-02 03:46:41 +00:00

Author	SHA1	Message	Date
Madhu Rajanna	8de7d9c90d	Fix volsize for cephfs and rbd Signed-off-by: Madhu Rajanna <madhupr007@gmail.com> (cherry picked from commit 7274bd09e548aee038506e1685b2d77fbc486497)	2019-10-11 09:01:49 +00:00
Madhu Rajanna	454394322a	Add a check for nil secrets Improve the error message if secrets are not provided in request Signed-off-by: Madhu Rajanna <madhupr007@gmail.com> (cherry picked from commit b8568a5bb992287259e832fd7b099a0e85614929)	2019-09-27 18:05:33 +05:30
Madhu Rajanna	1ae09be924	Change the logic of locking: if any ongoing operation is seen, we have to return an Abort error message Signed-off-by: Madhu Rajanna <madhupr007@gmail.com> (cherry picked from commit 6aac3990758c350874421488625867fe864340f6)	2019-09-23 12:38:51 +00:00
Madhu Rajanna	b90ddd7ade	Remove volumemounter flag from cephfs Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2019-09-05 13:04:44 +05:30
Niels de Vos	dd668e59f1	Address security concerns reported by 'gosec' gosec reports several issues, none of them looks very critical. With this change the following concerns have been addressed: [pkg/cephfs/nodeserver.go:229] - G302: Expect file permissions to be 0600 or less (Confidence: HIGH, Severity: MEDIUM) > os.Chmod(targetPath, 0777) [pkg/cephfs/util.go:39] - G204: Subprocess launched with variable (Confidence: HIGH, Severity: MEDIUM) > exec.Command(program, args...) [pkg/rbd/nodeserver.go:156] - G302: Expect file permissions to be 0600 or less (Confidence: HIGH, Severity: MEDIUM) > os.Chmod(stagingTargetPath, 0777) [pkg/rbd/nodeserver.go:205] - G302: Expect file permissions to be 0600 or less (Confidence: HIGH, Severity: MEDIUM) > os.OpenFile(mountPath, os.O_CREATE|os.O_RDWR, 0750) [pkg/rbd/rbd_util.go:797] - G304: Potential file inclusion via variable (Confidence: HIGH, Severity: MEDIUM) > ioutil.ReadFile(fPath) [pkg/util/cephcmds.go:35] - G204: Subprocess launched with variable (Confidence: HIGH, Severity: MEDIUM) > exec.Command(program, args...) [pkg/util/credentials.go:47] - G104: Errors unhandled. (Confidence: HIGH, Severity: LOW) > os.Remove(tmpfile.Name()) [pkg/util/credentials.go:92] - G104: Errors unhandled. (Confidence: HIGH, Severity: LOW) > os.Remove(cr.KeyFile) [pkg/util/pidlimit.go:74] - G304: Potential file inclusion via variable (Confidence: HIGH, Severity: MEDIUM) > os.Open(pidsMax) URL: https://github.com/securego/gosec Signed-off-by: Niels de Vos <ndevos@redhat.com>	2019-09-04 11:48:37 +00:00
Madhu Rajanna	a81a3bf96b	implement grpc metrics for ceph-csi Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2019-08-30 06:50:32 +00:00
Daniel-Pivonka	01a78cace5	switch cephfs, utils, and csicommon to new logging system Signed-off-by: Daniel-Pivonka <dpivonka@redhat.com>	2019-08-29 14:04:31 +00:00
Madhu Rajanna	38ca08bf65	Context based logging for rbd Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2019-08-26 06:19:24 +00:00
Daniel-Pivonka	81c28d6cb0	implement klog wrapper Signed-off-by: Daniel-Pivonka <dpivonka@redhat.com>	2019-08-21 14:36:41 +00:00
Madhu Rajanna	0da4bd5151	start controller or node server based on config. If both controller and nodeserver flags are set/unset, cephcsi will start both servers; if only one flag is set, it will start the relevant service. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2019-08-19 06:11:43 +00:00
Madhu Rajanna	89732d923f	move flag configuration variable to util remove unwanted checks remove getting drivertype from binary name Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2019-08-19 06:11:43 +00:00
Niels de Vos	31648c8feb	provisioners: add reconfiguring of PID limit The container runtime CRI-O limits the number of PIDs to 1024 by default. When many PVCs are requested at the same time, it is possible for the provisioner to start too many threads (or go routines) and executing 'rbd' commands can start to fail. In case a go routine can not get started, the process panics. The PID limit can be changed by passing an argument to kubelet, but this will affect all pids running on a host. Changing the parameters to kubelet is also not a very elegant solution. Instead, the provisioner pod can change the configuration itself. The pod is running in privileged mode and can write to /sys/fs/cgroup where the limit is configured. With this change, the limit is configured to 'max', just as if there is no limit at all. The logs of the csi-rbdplugin in the provisioner pod will reflect the change it makes when starting the service: $ oc -n rook-ceph logs -c csi-rbdplugin csi-rbdplugin-provisioner-0 .. I0726 13:59:19.737678 1 cephcsi.go:127] Initial PID limit is set to 1024 I0726 13:59:19.737746 1 cephcsi.go:136] Reconfigured PID limit to -1 (max) .. It is possible to pass a different limit on the commandline of the cephcsi executable. The following flag has been added: --pidlimit=<int> the PID limit to configure through cgroups This accepts special values -1 (max) and 0 (default, do not reconfigure). Other integers will be the limit that gets configured in cgroups. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2019-08-13 14:43:29 +00:00
ShyamsundarR	925bda2881	Move mounting staging instance to a sub-path within staging path This commit moves the mounting of a block volumes and filesystems to a sub-file (already the case) or a sub-dir within the staging path. This enables using the staging path to store any additional data regarding the mount. For example, this will be extended in the future to store the fsid of the cluster, and maybe the pool name to map unmap requests to the right image. Also, this fixes the noted hack in the code, to determine in a common manner if there is a mount on the passed in staging path. Signed-off-by: ShyamsundarR <srangana@redhat.com>	2019-08-13 14:07:52 +00:00
Madhu Rajanna	dfbdec4b6a	add validation to check if stagingPath exists It is the CO's responsibility to create the stagingPath as per the CSI spec: "The CO SHALL ensure that the path is directory and that the process serving the request has `read` and `write` permission to that directory. The CO SHALL be responsible for creating the directory if it does not exist." Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2019-07-29 12:52:10 +00:00
Humble Devassy Chirammal	c7d990a96b	Merge pull request #460 from Madhu-1/fix-pluginapath Fix pluginpath for cephfs	2019-07-29 14:02:18 +05:30
ShyamsundarR	bd204d7d45	Use --keyfile option to pass keys to all Ceph CLIs Every Ceph CLI that is invoked at present passes the key via the --key option, and hence is exposed to key being displayed on the host using a ps command or such means. This commit addresses this issue by stashing the key in a tmp file, which is again created on a tmpfs (or empty dir backed by memory). Further using such tmp files as arguments to the --keyfile option for every CLI that is invoked. This prevents the key from being visible as part of the argument list of the invoked program on the system. Fixes: #318 Signed-off-by: ShyamsundarR <srangana@redhat.com>	2019-07-25 12:46:15 +00:00
Madhu Rajanna	a5164cfa41	Avoid keyring message while logging Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2019-07-25 09:48:09 +00:00
Madhu Rajanna	778cfb3090	provide option to set pluginpath for cephfs Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2019-07-25 14:47:42 +05:30
Madhu Rajanna	f4c80dec9a	Implement NodeStage and NodeUnstage for rbd. In the NodeStage RPC call we have to map the device to the node plugin and make sure the device will be mounted to the global path. In the NodeUnstage request, unmount the device from the global path and unmap the device. If the volume mode is block, we will be creating a file inside a stageTargetPath and it will be considered as the global path. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2019-07-24 12:49:21 +00:00
Humble Devassy Chirammal	5d5a6c4d91	Merge pull request #469 from Madhu-1/driver-version Update driver version during build time	2019-07-24 14:41:45 +05:30
ShyamsundarR	e5e332eded	Use correct file descriptor to parse errors File descriptors in use to parse errors from a few command invocations were incorrect. This led to an inability to detect certain error cases and act accordingly. One of the most easily noticeable issues was when an image is deleted but its RADOS keys and maps are still intact. In such cases the DeleteVolume call always errored out, unable to find the image, rather than proceeding with cleaning up the RADOS objects and returning a success. The original method of using stdout was incorrect, as the command was tested from within a shell script and the script's STDIN/OUT/ERR was redirected to understand behavior. This is now tested using just the CLI in question, and also examining Ceph code, and further testing a couple of edge conditions by deleting backing images for PVs Signed-off-by: ShyamsundarR <srangana@redhat.com>	2019-07-16 07:51:10 +00:00
Madhu Rajanna	3f8bd3b2a6	Update driver version during build time update driver version and add git commit to the image. This will help us to identify what latest git commit image contains. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2019-07-12 15:54:52 +05:30
Poornima G	32ea550e3a	Modify CephFs provisioner to use the ceph mgr commands Currently CephFs provisioner mounts the ceph filesystem and creates a subdirectory as a part of provisioning the volume. Ceph now supports commands to provision fs subvolumes, hence modify the provisioner to use ceph mgr commands to (de)provision fs subvolumes. Signed-off-by: Poornima G <pgurusid@redhat.com>	2019-07-12 05:42:41 +00:00
ShyamsundarR	c4a3675cec	Move locks to more granular locking than CPU count based As detailed in issue #279, current lock scheme has hash buckets that are count of CPUs. This causes a lot of contention when parallel requests are made to the CSI plugin. To reduce lock contention, this commit introduces granular locks per identifier. The commit also changes the timeout for gRPC requests to Create and Delete volumes, as the current timeout is 10s (kubernetes documentation says 15s but code defaults are 10s). A virtual setup takes about 12-15s to complete a request at times, that leads to unwanted retries of the same request, hence the increased timeout to enable operation completion with minimal retries. Tests to create PVCs before and after these changes look like so, Before: Default master code + sidecar provisioner --timeout option set to 30 seconds 20 PVCs Creation: 3 runs, 396/391/400 seconds Deletion: 3 runs, 218/271/118 seconds - Once was stalled for more than 8 minutes and cancelled the run After: Current commit + sidecar provisioner --timeout option set to 30 sec 20 PVCs Creation: 3 runs, 42/59/65 seconds Deletion: 3 runs, 32/32/31 seconds Fixes: #279 Signed-off-by: ShyamsundarR <srangana@redhat.com>	2019-07-01 14:10:14 +00:00
ShyamsundarR	bc39c523b7	Fix returning success from DeleteSnapshot for stale requests Also reduced code duplication in fetching pool list from Ceph. DeleteSnapshot like DeleteVolume, should return a success when it detects that the snapshot keys are missing from the RADOS OMaps that store the snapshot UUID to request name mapping. This was missing in the code, and is now added. Signed-off-by: ShyamsundarR <srangana@redhat.com>	2019-07-01 10:54:53 +00:00
ShyamsundarR	c5762b6b5c	Modify RBD plugin to use a single ID and move the id and key into the secret RBD plugin needs only a single ID to manage images and operations against a pool, mentioned in the storage class. The current scheme of 2 IDs is hence not needed and removed in this commit. Further, unlike CephFS plugin, the RBD plugin splits the user id and the key into the storage class and the secret respectively. Also the parameter name for the key in the secret is noted in the storageclass making it a variant and hampers usability/comprehension. This is also fixed by moving the id and the key to the secret and not retaining the same in the storage class, like CephFS. Fixes #270 Testing done: - Basic PVC creation and mounting Signed-off-by: ShyamsundarR <srangana@redhat.com>	2019-06-24 13:46:14 +00:00
Madhu Rajanna	a38986fce0	Enable all static-checks in golangci-lint * Enable all static-checks in golangci-lint * Update golangci-lint version * Fix issue found in golangci-lint Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2019-06-10 15:56:17 +05:30
Madhu Rajanna	7d3a6105c7	Fix misspell words Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2019-06-10 12:52:13 +05:30
ShyamsundarR	b9cd0e18ad	Make CephFS plugin stateless reusing RADOS based journal scheme This is a part of the stateless set of commits for CephCSI. This commit removes the dependency on config maps to store cephFS provisioned volumes, and instead relies on RADOS based objects and keys, and required CSI VolumeID encoding to detect the provisioned volumes. Changes: - Provide backward compatibility to provisioned volumes by older plugin versions (1.0.0 or older) - Remove Create/Delete support for statically provisioned volumes (fixes #382) - Added namespace support to RADOS OMaps and used the same to store RADOS CSI objects and keys in the CephFS metadata pool - Added support to mention fsname for CephFS provisioning (fixes #359) - Changed field name in CSI Identifier to 'location', to denote a pool or fscid - Updated mounter cache to use new scheme - Required Helm manifests are updated - Required documentation and other manifests are updated - Made driver option 'metadatastorage' as optional, as fresh installs do not need to specify the same Testing done: - Create/Mount/Delete PVC - Create/Delete 5 PVCs - Mount version 1.0.0 PVC - Delete version 1.0.0 PV - Mount Statically defined PV/PVC/Pod - Mount Statically defined version 1.0.0 PV/PVC/Pod - Delete Statically defined version 1.0.0 PV/PVC/Pod - Node restart when mounted to test mountcache - Use InstanceID other than 'default' - RBD basic round of tests, as namespace is added to OMaps - csitest against ceph-fs plugin - NOTE: CephFS plugin still does not detect and address already created volumes but of a different size - Test not providing any value to the metadata storage parameter Signed-off-by: ShyamsundarR <srangana@redhat.com>	2019-05-30 06:20:35 -04:00
ShyamsundarR	1406f29dcd	Refactor voljournal to aid reuse with CephFS and to also improve the code reuse in rbd itself. Signed-off-by: ShyamsundarR <srangana@redhat.com>	2019-05-30 09:58:40 +00:00
ShyamsundarR	d02e50aa9b	Removed config maps and replaced with rados omaps Existing config maps are now replaced with rados omaps that help store information regarding the requested volume names and the rbd image names backing the same. Further, to detect the cluster, pool and which image a volume ID refers to, changes to volume ID encoding have been made as per the provided design specification in the stateless ceph-csi proposal. Additional changes and updates, - Updated documentation - Updated manifests - Updated Helm chart - Addressed a few csi-test failures Signed-off-by: ShyamsundarR <srangana@redhat.com>	2019-05-19 12:29:33 +00:00
wilmardo	891daa9375	Replaces the references to the Kubernetes Authors with the Ceph-CSI authors	2019-04-03 11:14:08 +02:00
Róbert Vašek	d0d5da83c9	Merge pull request #282 from huaizong/improve-remount-pv-path-when-exit-v2 remount old mount point when csi plugin exits unexpectedly	2019-04-02 08:36:07 +02:00
王怀宗	af330fe68e	1. fix mountcache race conflict 2. support user-defined cache dir 3. if not define mountcachedir disable mountcache	2019-03-27 16:04:58 +08:00
ShyamsundarR	ba2e5cff51	Address remnant subject reference and code style reviews Signed-off-by: ShyamsundarR <srangana@redhat.com>	2019-03-26 16:19:24 +00:00
ShyamsundarR	fc0cf957be	Updated code and docs to reflect correct terminology - Updated instances of fsid with clusterid - Updated instances of credentials/subject with user/key Signed-off-by: ShyamsundarR <srangana@redhat.com>	2019-03-26 16:19:24 +00:00
ShyamsundarR	c9c1c871fc	Removed a couple of debug logs Signed-off-by: ShyamsundarR <srangana@redhat.com>	2019-03-26 16:19:24 +00:00
ShyamsundarR	2064e674a4	Addressed using k8s client APIs to fetch secrets Based on the review comments addressed the following, - Moved away from having to update the pod with volumes when a new Ceph cluster is added for provisioning via the CSI driver - The above now uses k8s APIs to fetch secrets - TBD: Need to add a watch mechanism such that these secrets can be cached and updated when changed - Folded the Ceph configuration and ID/key config map and secrets into a single secret - Provided the ability to read the same config via mapped or created files within the pod Tests: - Ran PV creation/deletion/attach/use using new scheme StorageClass - Ran PV creation/deletion/attach/use using older scheme to ensure nothing is broken - Did not execute snapshot related tests Signed-off-by: ShyamsundarR <srangana@redhat.com>	2019-03-26 16:19:24 +00:00
ShyamsundarR	97f8c4b677	Provide options to pass in Ceph cluster-id This commit provides the option to pass in Ceph cluster-id instead of a MON list from the storage class. This helps in moving towards a stateless CSI implementation. Tested the following, - PV provisioning and staging using cluster-id in storage class - PV provisioning and staging using MON list in storage class Did not test, - snapshot operations in either forms of the storage class Signed-off-by: ShyamsundarR <srangana@redhat.com>	2019-03-26 16:19:24 +00:00
王怀宗	b318964af5	issue #91 issue #217 Goal: when the CSI plugin exits unexpectedly, a pod using a CephFS PV cannot recover automatically because the mount relation is lost, until the pod is killed and rescheduled to another node. This may be a problem; the CSI plugin can do more to remount the old path, so that when the pod exits and restarts, the old mount path can be used again. Non-goal: the pod should exit and restart when the CSI plugin pod exits and the mount point is lost. If the pod does not exit, it gets the error "transport endpoint is not connected". Implementation logic: csi-plugin start: 1. load all MountCachEntry from the node-local dir 2. check if volID exists in the cluster; if not, ignore this entry, if yes, continue 3. check if stagingPath exists; if yes, mount the path 4. check if all targetPath exist; if yes, bind mount to the staging path NodeServer: 1. NodeStageVolume: add MountCachEntry to the local dir, including readonly attr and ceph secret 2. NodePublishVolume: add pod bind mount path to MountCachEntry and persist to the local dir 3. NodeUnpublishVolume: remove pod bind mount path from MountCachEntry and persist to the local dir 4. NodeUnstageVolume: remove MountCachEntry from the local dir	2019-03-25 22:47:39 +08:00
Róbert Vašek	a4dd845735	Merge pull request #223 from Madhu-1/fix-222-1.0 update driver name as per csi spec	2019-03-14 06:38:13 +01:00
Madhu Rajanna	d61a87b42e	Fix driver name as per CSI spec Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2019-03-13 12:04:30 +05:30
Madhu Rajanna	16279eda78	Round up volume size to MiB for rbd Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2019-03-04 19:17:28 +05:30
Madhu Rajanna	6f4f148d3b	remove glog Signed-off-by: Madhu Rajanna <mrajanna@redhat.com>	2019-02-27 14:17:19 +05:30
gman	e5dbea15d3	util/cachepersister: check and return CacheEntryNotFound error in Get()	2019-02-25 18:05:20 +01:00
gman	0235b9c249	k8s metadata cache: delete shouldn't fail on NotFound errors	2019-02-20 20:20:44 +01:00
Madhu Rajanna	fd4c019aba	cleanup: remove duplicate code Signed-off-by: Madhu Rajanna <mrajanna@redhat.com>	2019-02-19 13:44:10 +05:30
gman	8223ae325b	addressed review comments	2019-02-14 13:55:51 +00:00
gman	892d65d387	added StripSecretInArgs in pkg/util	2019-02-14 13:55:51 +00:00
gman	6099f142f0	moved klog initialization into pkg/util package	2019-02-12 16:31:55 +01:00
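
Commits c4a3675cec and 1ae09be924 above describe moving to per-identifier locks and returning an Abort error when an operation on the same identifier is already in progress. The following is a minimal Go sketch of that idea, not the ceph-csi implementation: the names `VolumeLocks`, `NewVolumeLocks`, `TryAcquire`, and `Release` are hypothetical and chosen only for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// VolumeLocks (hypothetical name) tracks identifiers that currently have an
// operation in flight. A single mutex guards the set, but it is held only for
// the map lookup, so operations on different identifiers do not block each other.
type VolumeLocks struct {
	mu    sync.Mutex
	inUse map[string]struct{}
}

// NewVolumeLocks returns an empty lock set.
func NewVolumeLocks() *VolumeLocks {
	return &VolumeLocks{inUse: make(map[string]struct{})}
}

// TryAcquire marks id as busy and returns true. It returns false without
// waiting if another operation already holds id; a caller could map that
// result to an "aborted" error for the request.
func (vl *VolumeLocks) TryAcquire(id string) bool {
	vl.mu.Lock()
	defer vl.mu.Unlock()
	if _, busy := vl.inUse[id]; busy {
		return false
	}
	vl.inUse[id] = struct{}{}
	return true
}

// Release marks id as free again. Releasing an id that is not held is a no-op.
func (vl *VolumeLocks) Release(id string) {
	vl.mu.Lock()
	defer vl.mu.Unlock()
	delete(vl.inUse, id)
}

func main() {
	locks := NewVolumeLocks()
	fmt.Println(locks.TryAcquire("vol-1")) // true: first operation on vol-1
	fmt.Println(locks.TryAcquire("vol-1")) // false: vol-1 is already busy
	fmt.Println(locks.TryAcquire("vol-2")) // true: a different identifier is unaffected
	locks.Release("vol-1")
	fmt.Println(locks.TryAcquire("vol-1")) // true: vol-1 is free again
}
```

Compared with a fixed number of hash buckets, keying the set by identifier means two requests contend only when they target the same volume or snapshot.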


61 Commits