Commit Graph

178 Commits

Author SHA1 Message Date
Madhu Rajanna
bcd646ee55 Deprecate grpc metrics in ceph-csi
As kubernetes CSI sidecar is exposing the
GRPC mertics we can make use of the same in
ceph-csi we dont need to expose our own.

update: #881

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2020-04-01 11:59:37 +00:00
Madhu Rajanna
3c3a624d1a Log error message for cephfs
If the loading of cephfs fuse or
the mount.ceph command fails log it.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2020-04-01 09:35:40 +00:00
Humble Chirammal
b1dfcb4d7e Correct static errors and source code comments.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2020-03-30 05:13:24 +00:00
xu.chen
399f0b0d89 Audit log and follow klog standard 2020-03-27 09:24:52 +00:00
Reinier Schoof
8da7e4bbf9 removed unreachable code path 2020-03-10 11:34:53 +00:00
Madhu Rajanna
128f3fc2cf check subvolume present in backend
If a CreateVolume call is interrupted,
post creating the required CSI journal entries,
but prior to creating the backing CephFS subvolume,
then a subsequent CreateVolume call will return
a valid response with a VolumeID that has
it's backing image missing. This PR adds a check
for backend image, if image notfound it deletes the
reserved keys in omap.

fixes #839

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2020-02-25 11:46:04 +00:00
Reinier Schoof
a4532fafd0 added volumeNamePrefix and snapshotNamePrefix as parameters for storageClass
this allows administrators to override the naming prefix for both volumes and snapshots
created by the rbd plugin.

Signed-off-by: Reinier Schoof <reinier@skoef.nl>
2020-02-25 05:03:51 +00:00
Madhu Rajanna
8dcb6a6105 Handle Delete operation if pool not found
If the backend rbd or cephfs pool is already deleted
we need to return success to the  DeleteVolume RPC
call to make it idempotent.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2020-02-11 15:58:15 +00:00
Madhu Rajanna
034b123478 Remove mount cache for cephfs
PR #282 introduces the mount cache to
solve cephfs fuse mount issue when cephfs plugin pod
restarts .This is not working as intended. This PR removes
the code for maintainability.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2020-02-11 15:11:21 +00:00
Vasyl Purchel
419ad0dd8e Adds per volume encryption with Vault integration
- adds proposal document for PVC encryption from PR448
- adds per-volume encription by generating encryption passphrase
  for each volume and storing it in a KMS
- adds HashiCorp Vault integration as a KMS for encryption passphrases
- avoids encrypting volume second time if it was already encrypted but
  no file system created
- avoids unnecessary checks if volume is a mapped device when encryption
  was not requested
- prevents resizing encrypted volumes (it is not currently supported)
- prevents creating snapshots from encrypted volumes to prevent attack
  on encryption key (security guard until re-encryption of volumes
  implemented)

Signed-off-by: Vasyl Purchel vasyl.purchel@workday.com
Signed-off-by: Andrea Baglioni andrea.baglioni@workday.com

Fixes #420
Fixes #744
2020-02-05 05:18:56 +00:00
ShyamsundarR
35e8c3b3a5 CephFS: Added ENOENT checks for possible missing volumes
Added checks in DeleteVolume RPC, for image missing errors, and
taking appropriate actions to cleanup the CSI reservations.

Further removed forcing a volume purge, and instead added checks
for missing volume errors in purgeVolume.

This should now fix issues where an continuation of an interrupted
DeleteVolume call, that only deleted the backing volume, will
proceed and not error out.

Signed-off-by: ShyamsundarR <srangana@redhat.com>
2020-01-29 10:05:13 +00:00
Madhu Rajanna
881f59d142 Add _netdev as default mount options in plugin
This values will be added at both nodestage
and nodepublish for rbd, nbd and ceph kernel client.

As cephfs fuse doesnot support this value,
this is added only during the nodepublish.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2020-01-28 16:50:18 +00:00
Madhu Rajanna
85960b6571 Add ID based logging for ExpandVolume
Updated logging to log ReqID

Fixes: #732

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-12-17 14:25:34 +00:00
Madhu Rajanna
dcafdb519e discard umount error if directory is not mounted
if the directory is not mounted return nil
during umount of mountPoint

Discard error if error is os.IsNotExist

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-12-17 13:33:41 +00:00
Woohyung Han
8a16f740d6 Update golangci-lint version to v1.21.0
Signed-off-by: Woohyung Han <techhanx@gmail.com>
2019-12-12 04:57:14 +00:00
Humble Chirammal
671e2d814a Add volumesize roundoff for expandrequest
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2019-11-27 14:00:47 +00:00
Humble Chirammal
ac09c5553c Add E2E for cephfs resize functionality
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2019-11-27 14:00:47 +00:00
Humble Chirammal
b721accaf5 Resize CephFS Volumes
This feature enables CephFS Volume expansion on demand
based on the CO resizer request.

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2019-11-27 14:00:47 +00:00
Niels de Vos
290beb4dda cephfs: add kernel version detection for mounting with client
Linux kernel 4.17.0 adds support for quota with CephFS. Without quota,
it is not possible to fullfill the requirements of the CSI Spec and
guarantee sufficient space on the filesystem for a volume. With this in
mind, usage of the kernel client is only allowed with kernel 4.17.0 or
newer.

However, some Linux vendors backport features and patches to their
Enterprise products. These kernels may have an older version, but do
support quota. One of these is the kernel that comes with RHEL-7.7.

By comparing the current running version of the Linux kernel against
known versions that support quota, we can now automatically decide to
use the kernel client, or not.

Note that this does not change the 'forcekernelclient' parameter. The
parameter is still available and can be used for kernels that are not in
the 'known to support quota list'. Or users can pass the parameter to
use a CephFS kernel client that does not support quota.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-11-13 11:56:09 +00:00
Stefan Haas
6a2717ce20 Added forcecephkernelclient as startup parameter to force enabling ceph
Signed-off-by: Stefan Haas <shaas@suse.com>
2019-10-16 06:47:10 +00:00
Madhu Rajanna
7274bd09e5 Fix volsize for cephfs and rbd
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-10-11 08:22:27 +00:00
Madhu Rajanna
6aac399075 Change the logic of locking
if any on going opearation is seen,we
have to return Abort error message

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-09-20 07:37:17 +00:00
KingJ
0639e00705 Reorder kernel version checking logic 2019-09-07 11:10:27 +00:00
KingJ
197c8fcfcc Consider Kernel >=5.x as sufficent for using the Kernel mounter 2019-09-07 11:10:27 +00:00
Poornima G
060ff8d25e Add mount option for Cephfs
The storage class already takes MountOptions(MountFlags), these are the
bind mount options. Some of these options may not be recognised by the
cephfs mount. Hence added a new parameterin Storage Class for
- cephfs kernel mount options,
- ceph-fuse mount options

Ceph kernel mount options are different from ceph-fuse options, hence
added two different parameters.

Signed-off-by: Poornima G <pgurusid@redhat.com>
2019-09-06 16:32:10 +00:00
Madhu Rajanna
f4b38228ae Remove volumemounter flag from cephfs
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-09-05 07:20:50 +00:00
Poornima G
90c4d6a451 Cephfs: Use ceph kernel client if kernel version >= 4.17
Ceph kernel client is more performant than ceph fuse client.
The kernel client has Quota support only in the kernel version >=4.17.
Hence use ceph kernel client when the kernel version is >=4.17.

Signed-off-by: Poornima G <pgurusid@redhat.com>
2019-09-05 04:55:05 +00:00
Niels de Vos
dd668e59f1 Address security concerns reported by 'gosec'
gosec reports several issues, none of them looks very critical. With
this change the following concerns have been addressed:

[pkg/cephfs/nodeserver.go:229] - G302: Expect file permissions to be 0600 or less (Confidence: HIGH, Severity: MEDIUM)
  > os.Chmod(targetPath, 0777)

[pkg/cephfs/util.go:39] - G204: Subprocess launched with variable (Confidence: HIGH, Severity: MEDIUM)
  > exec.Command(program, args...)

[pkg/rbd/nodeserver.go:156] - G302: Expect file permissions to be 0600 or less (Confidence: HIGH, Severity: MEDIUM)
  > os.Chmod(stagingTargetPath, 0777)

[pkg/rbd/nodeserver.go:205] - G302: Expect file permissions to be 0600 or less (Confidence: HIGH, Severity: MEDIUM)
  > os.OpenFile(mountPath, os.O_CREATE|os.O_RDWR, 0750)

[pkg/rbd/rbd_util.go:797] - G304: Potential file inclusion via variable (Confidence: HIGH, Severity: MEDIUM)
  > ioutil.ReadFile(fPath)

[pkg/util/cephcmds.go:35] - G204: Subprocess launched with variable (Confidence: HIGH, Severity: MEDIUM)
  > exec.Command(program, args...)

[pkg/util/credentials.go:47] - G104: Errors unhandled. (Confidence: HIGH, Severity: LOW)
  > os.Remove(tmpfile.Name())

[pkg/util/credentials.go:92] - G104: Errors unhandled. (Confidence: HIGH, Severity: LOW)
  > os.Remove(cr.KeyFile)

[pkg/util/pidlimit.go:74] - G304: Potential file inclusion via variable (Confidence: HIGH, Severity: MEDIUM)
  > os.Open(pidsMax)

URL: https://github.com/securego/gosec
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2019-09-04 11:48:37 +00:00
Madhu Rajanna
a81a3bf96b implement grpc metrics for ceph-csi
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-08-30 06:50:32 +00:00
Daniel-Pivonka
01a78cace5 switch to cephfs, utils, and csicommon to new loging system
Signed-off-by: Daniel-Pivonka <dpivonka@redhat.com>
2019-08-29 14:04:31 +00:00
Madhu Rajanna
3af364e7b5 move to statand context package
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-08-26 06:19:24 +00:00
Madhu Rajanna
0da4bd5151 start controller or node server based on config
if both controller and nodeserver flags are set/unset
cephcsi will start both server,

if only one flag is set, it will start relavent
service.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-08-19 06:11:43 +00:00
Madhu Rajanna
89732d923f move flag configuration variable to util
remove unwanted checks
remove getting drivertype from binary name

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-08-19 06:11:43 +00:00
Madhu Rajanna
2ca575b99d Wrap error if failed to fetch mon
This will help user to check whats
the actual error. if the config file
is having issue or the  clusterid is
not valid.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-08-13 17:16:27 +00:00
Daniel-Pivonka
0063727199 Make parameter pool optional in CephFS storageclass
Signed-off-by: Daniel-Pivonka <dpivonka@redhat.com>
2019-08-07 13:30:38 +00:00
Humble Chirammal
0786225937 Implement metrics for RBD plugin
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2019-08-01 11:58:54 +00:00
Madhu Rajanna
dfbdec4b6a add validation to check if stagingPath exists
It's CO responsibility to create the
stagingPath as per the CSI spec.

The CO SHALL ensure
// that the path is directory and that the process serving the
// request has `read` and `write` permission to that directory. The
// CO SHALL be responsible for creating the directory if it does not
// exist.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-07-29 12:52:10 +00:00
Ramana Raja
5af29662b2 cephfs: set the mode of the FS subvolumes
... and not that of the FS subvolume group `csi`.

There is no reason for setting the mode of FS subvolume group `csi`
(a CephFS subdirectory) as 777. It's default mode is 755. It's
sufficient to set the mode of FS subvolumes within the subvolume group
to `777`.

Signed-off-by: Ramana Raja <rraja@redhat.com>
2019-07-29 10:11:48 +00:00
Ramana Raja
5932fff93e cephfs: set pool layout of the FS subvolumes
... instead of that of the `csi` subvolume group. The pool layout
specified via storage class's `pool` setting is a subvolume property
and not a subvolume group property. The `csi` subvolume group
may have subvolumes of different storage classes with different
pool layouts.

Fixes: #499
Signed-off-by: Ramana Raja <rraja@redhat.com>
2019-07-29 10:11:48 +00:00
Humble Devassy Chirammal
c7d990a96b
Merge pull request #460 from Madhu-1/fix-pluginapath
Fix pluginpath for cephfs
2019-07-29 14:02:18 +05:30
ShyamsundarR
bd204d7d45 Use --keyfile option to pass keys to all Ceph CLIs
Every Ceph CLI that is invoked at present passes the key via the
--key option, and hence is exposed to key being displayed on
the host using a ps command or such means.

This commit addresses this issue by stashing the key in a tmp
file, which is again created on a tmpfs (or empty dir backed by
memory). Further using such tmp files as arguments to the --keyfile
option for every CLI that is invoked.

This prevents the key from being visible as part of the argument list
of the invoked program on the system.

Fixes: #318

Signed-off-by: ShyamsundarR <srangana@redhat.com>
2019-07-25 12:46:15 +00:00
Poornima G
c2835183e5 Remove user creation for every volume
Currently, provisioner creates user for every volume and nodeplugin
uses this user to mount that volume. But nodeplugin and provisioner
already have admin credentials, hence using the admin credentials
to mount the volume and getting rid of user creation for each volume.

Signed-off-by: Poornima G <pgurusid@redhat.com>
2019-07-25 10:59:42 +00:00
Madhu Rajanna
778cfb3090 provide option to set pluginpath for cephfs
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-07-25 14:47:42 +05:30
Humble Chirammal
561cc26e4c Implement metrics for CephFS CSI driver
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2019-07-25 06:03:54 +00:00
Madhu Rajanna
f4c80dec9a Implement NodeStage and NodeUnstage for rbd
in NodeStage RPC call  we  have to map the
device to the node plugin and make  sure  the
the device will be mounted to  the global path

in  nodeUnstage request unmount the device from
global path and unmap the device

if the volume mode is block  we will be creating
a file inside a stageTargetPath  and it will be
considered  as the global path

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-07-24 12:49:21 +00:00
Madhu Rajanna
3f8bd3b2a6 Update driver version during build time
update driver version and add git commit
to the image. This will help us to identify
what latest git commit image contains.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-07-12 15:54:52 +05:30
Poornima G
0d566ee30c Backward compatibility for deleting and mounting old volumes
Signed-off-by: Poornima G <pgurusid@redhat.com>
2019-07-12 05:42:41 +00:00
Poornima G
32ea550e3a Modify CephFs provisioner to use the ceph mgr commands
Currently CephFs provisioner mounts the ceph filesystem
and creates a subdirectory as a part of provisioning the
volume. Ceph now supports commands to provision fs subvolumes,
hance modify the provisioner to use ceph mgr commands to
(de)provision fs subvolumes.

Signed-off-by: Poornima G <pgurusid@redhat.com>
2019-07-12 05:42:41 +00:00
Madhu Rajanna
09f126691c Add nil check for process
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-07-08 12:11:53 +00:00
ShyamsundarR
c4a3675cec Move locks to more granular locking than CPU count based
As detailed in issue #279, current lock scheme has hash
buckets that are count of CPUs. This causes a lot of contention
when parallel requests are made to the CSI plugin. To reduce
lock contention, this commit introduces granular locks per
identifier.

The commit also changes the timeout for gRPC requests to Create
and Delete volumes, as the current timeout is 10s (kubernetes
documentation says 15s but code defaults are 10s). A virtual
setup takes about 12-15s to complete a request at times, that leads
to unwanted retries of the same request, hence the increased
timeout to enable operation completion with minimal retries.

Tests to create PVCs before and after these changes look like so,

Before:
Default master code + sidecar provisioner --timeout option set
to 30 seconds

20 PVCs
Creation: 3 runs, 396/391/400 seconds
Deletion: 3 runs, 218/271/118 seconds
  - Once was stalled for more than 8 minutes and cancelled the run

After:
Current commit + sidecar provisioner --timeout option set to 30 sec
20 PVCs
Creation: 3 runs, 42/59/65 seconds
Deletion: 3 runs, 32/32/31 seconds

Fixes: #279
Signed-off-by: ShyamsundarR <srangana@redhat.com>
2019-07-01 14:10:14 +00:00