Commit Graph

939 Commits

Author SHA1 Message Date
Madhu Rajanna
41b701c98c Add support for erasure pool in rbd
Allow specifying different metadata and data pools in a
CSI RBD StorageClass

Fixes: #199
Fixes: https://github.com/rook/rook/issues/2650
Fixes: https://github.com/rook/rook/issues/3763

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-09-11 06:48:08 +00:00
Madhu Rajanna
8992e9d321 update readme for v1.2.0 release
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-09-10 13:11:53 +00:00
KingJ
0639e00705 Reorder kernel version checking logic 2019-09-07 11:10:27 +00:00
KingJ
197c8fcfcc Consider Kernel >=5.x as sufficent for using the Kernel mounter 2019-09-07 11:10:27 +00:00
Poornima G
060ff8d25e Add mount option for Cephfs
The storage class already takes MountOptions(MountFlags), these are the
bind mount options. Some of these options may not be recognised by the
cephfs mount. Hence added a new parameterin Storage Class for
- cephfs kernel mount options,
- ceph-fuse mount options

Ceph kernel mount options are different from ceph-fuse options, hence
added two different parameters.

Signed-off-by: Poornima G <pgurusid@redhat.com>
2019-09-06 16:32:10 +00:00
Daniel-Pivonka
d1952e5fd0 Fix liveness connection opening endless sockets
Signed-off-by: Daniel-Pivonka <dpivonka@redhat.com>
2019-09-06 03:15:28 +00:00
Madhu Rajanna
f4b38228ae Remove volumemounter flag from cephfs
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-09-05 07:20:50 +00:00
Poornima G
90c4d6a451 Cephfs: Use ceph kernel client if kernel version >= 4.17
Ceph kernel client is more performant than ceph fuse client.
The kernel client has Quota support only in the kernel version >=4.17.
Hence use ceph kernel client when the kernel version is >=4.17.

Signed-off-by: Poornima G <pgurusid@redhat.com>
2019-09-05 04:55:05 +00:00
Niels de Vos
d03a352abf e2e: correct log format in execCommandInPod()
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2019-09-04 12:26:55 +00:00
Niels de Vos
8f133e03b8 Add 'gosec' to the static-checks
Run static security scanning tool 'gosec' while testing.

URL: https://github.com/securego/gosec
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2019-09-04 11:48:37 +00:00
Niels de Vos
dd668e59f1 Address security concerns reported by 'gosec'
gosec reports several issues, none of them looks very critical. With
this change the following concerns have been addressed:

[pkg/cephfs/nodeserver.go:229] - G302: Expect file permissions to be 0600 or less (Confidence: HIGH, Severity: MEDIUM)
  > os.Chmod(targetPath, 0777)

[pkg/cephfs/util.go:39] - G204: Subprocess launched with variable (Confidence: HIGH, Severity: MEDIUM)
  > exec.Command(program, args...)

[pkg/rbd/nodeserver.go:156] - G302: Expect file permissions to be 0600 or less (Confidence: HIGH, Severity: MEDIUM)
  > os.Chmod(stagingTargetPath, 0777)

[pkg/rbd/nodeserver.go:205] - G302: Expect file permissions to be 0600 or less (Confidence: HIGH, Severity: MEDIUM)
  > os.OpenFile(mountPath, os.O_CREATE|os.O_RDWR, 0750)

[pkg/rbd/rbd_util.go:797] - G304: Potential file inclusion via variable (Confidence: HIGH, Severity: MEDIUM)
  > ioutil.ReadFile(fPath)

[pkg/util/cephcmds.go:35] - G204: Subprocess launched with variable (Confidence: HIGH, Severity: MEDIUM)
  > exec.Command(program, args...)

[pkg/util/credentials.go:47] - G104: Errors unhandled. (Confidence: HIGH, Severity: LOW)
  > os.Remove(tmpfile.Name())

[pkg/util/credentials.go:92] - G104: Errors unhandled. (Confidence: HIGH, Severity: LOW)
  > os.Remove(cr.KeyFile)

[pkg/util/pidlimit.go:74] - G304: Potential file inclusion via variable (Confidence: HIGH, Severity: MEDIUM)
  > os.Open(pidsMax)

URL: https://github.com/securego/gosec
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2019-09-04 11:48:37 +00:00
Madhu Rajanna
d6f1c938d8 comment yum update from dockerfile
currently we are facing issue in  building
docker image,commenting yum update it fix it

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-09-04 11:12:07 +00:00
Madhu Rajanna
64ca401a51 Fix mon endpoint issue in E2E
in toolbox mon endpoints are not
updated properly, this is causing an issue in E2E
this PR is a workaround to fix this issue.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-09-04 11:12:07 +00:00
Madhu Rajanna
677be13b11
Merge pull request #583 from nixpanic/mergify/dismiss_reviews
mergify: Remove approved review when a PR changes
2019-09-04 15:14:25 +05:30
Niels de Vos
2192d40937 mergify: Remove approved review when a PR changes
When people have reviewed a PR, and the contributor updates it, the
reviews are staying valid. This is unfortunate as the PR might need new
corrections.

Ideally a positive review (Approved) needs to be given again. And there
needs to be a confirmation that a negative review (Changes requested)
has been addressed.

Closes: #505
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2019-09-03 14:29:40 +02:00
Madhu Rajanna
ae67534a44 Add wait logic to check fsid from toolbox
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-08-30 06:50:32 +00:00
Madhu Rajanna
a81a3bf96b implement grpc metrics for ceph-csi
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-08-30 06:50:32 +00:00
Daniel-Pivonka
01a78cace5 switch to cephfs, utils, and csicommon to new loging system
Signed-off-by: Daniel-Pivonka <dpivonka@redhat.com>
2019-08-29 14:04:31 +00:00
Madhu Rajanna
7d3a18c5b7 Fix e2e failure caused due to mon connection from toolbox
i think now its take time to discover the mon IP
from svc name in tool box, this is a workaround
to fix it.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-08-29 11:17:54 +00:00
Madhu Rajanna
025aec3dd3 Trace backend volume from pvc
logs
```
+----------+------------------------------------------+----------------------------------------------+-----------------+------------------+------------------+
| PVC Name |                 PV Name                  |                  Image Name                  | PV name in omap | Image ID in omap | Image in cluster |
+----------+------------------------------------------+----------------------------------------------+-----------------+------------------+------------------+
| rbd-pvc  | pvc-f1a501dd-03f6-45c9-89f4-85eed7a13ef2 | csi-vol-1b00f5f8-b1c1-11e9-8421-9243c1f659f0 |       True      |       True       |      False       |
| rbd-pvcq | pvc-09a8bceb-0f60-4036-85b9-dc89912ae372 | csi-vol-b781b9b1-b1c5-11e9-8421-9243c1f659f0 |       True      |       True       |       True       |
+----------+------------------------------------------+----------------------------------------------+-----------------+------------------+------------------+
```

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-08-26 14:42:12 +00:00
Madhu Rajanna
3af364e7b5 move to statand context package
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-08-26 06:19:24 +00:00
Madhu Rajanna
38ca08bf65 Context based logging for rbd
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-08-26 06:19:24 +00:00
Daniel-Pivonka
81c28d6cb0 implement klog wrapper
Signed-off-by: Daniel-Pivonka <dpivonka@redhat.com>
2019-08-21 14:36:41 +00:00
Daniel-Pivonka
aa74f8c87f Implement context based logging
Signed-off-by: Daniel-Pivonka <dpivonka@redhat.com>
2019-08-21 14:36:41 +00:00
wilmardo
3111e7712a feat: Adds Ceph logo as icon for Helm charts
Signed-off-by: wilmardo <info@wilmardenouden.nl>
2019-08-20 05:34:28 +00:00
Humble Devassy Chirammal
3f32dea047
Merge pull request #551 from humblec/dockerfile
Fix the vulnarabilities in the image.
2019-08-20 10:28:33 +05:30
Madhu Rajanna
e557438f87 unmap rbd image if connection timeout.
Sometime rbd images are mapped even if the
connection timeout error occurs, this will
try to unmap if the received error message
is connection timeout.This will fix stale maps
and rbd image deletion issue

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-08-19 10:54:17 +00:00
Madhu Rajanna
0da4bd5151 start controller or node server based on config
if both controller and nodeserver flags are set/unset
cephcsi will start both server,

if only one flag is set, it will start relavent
service.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-08-19 06:11:43 +00:00
Madhu Rajanna
89732d923f move flag configuration variable to util
remove unwanted checks
remove getting drivertype from binary name

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-08-19 06:11:43 +00:00
Madhu Rajanna
2b1355061e rename rbd.go to driver.go
to keep code struct similar with cephfs
renamed rbd.go to driver.go

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-08-19 06:11:43 +00:00
Humble Chirammal
0fc7f4513b Snashotter update
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2019-08-19 05:06:42 +00:00
wilmardo
0a90762970 fix: Adds liveness sidecar to v1.14+ helm charts
Signed-off-by: wilmardo <info@wilmardenouden.nl>
2019-08-16 08:38:49 +00:00
wilmardo
30fb7de118 feat: Implement helm lint
Signed-off-by: wilmardo <info@wilmardenouden.nl>
2019-08-16 07:38:33 +00:00
Humble Chirammal
6950ad468f Fix the vulnarabilities in the image.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2019-08-14 17:37:33 +05:30
Humble Chirammal
8fddb53931 Make the kube dependency to 1.15.2
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2019-08-14 07:40:50 +00:00
Humble Chirammal
9a3a04b180 Add container image compatibility matrix.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2019-08-14 06:48:43 +00:00
Daniel-Pivonka
d621a58207 prometheus liveness probe sidecar
Signed-off-by: Daniel-Pivonka dpivonka@redhat.com
2019-08-13 17:51:41 +00:00
Madhu Rajanna
2ca575b99d Wrap error if failed to fetch mon
This will help user to check whats
the actual error. if the config file
is having issue or the  clusterid is
not valid.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-08-13 17:16:27 +00:00
wilmardo
cba6115e30 Fix 1.13 charts
Signed-off-by: wilmardo <info@wilmardenouden.nl>
2019-08-13 16:42:15 +00:00
wilmardo
c739ce9d5e Removes last reference to node-publish-secret
Signed-off-by: wilmardo <info@wilmardenouden.nl>
2019-08-13 16:42:15 +00:00
wilmardo
ca5fbc180c Rework of helm charts
Signed-off-by: wilmardo <info@wilmardenouden.nl>
2019-08-13 16:42:15 +00:00
ShyamsundarR
20d336fca3 Add support to use ceph manager rbd command to delete an image
Image deletion takes time proportional to the size of the
image. Hence, ceph manager is enhanced to support async
deletion of an image, or rather passing the task of
deleting an image to the ceph manager.

This commit leverages the ceph manager enhancement in the CSI code.

NOTE: This is tested against a ceph cluster that is running
Ceph master version of the code. Once other releases
catch up in terms of the feature, the optimization would be
available to the CSI driver as well.

Fixes: #523
Signed-off-by: ShyamsundarR <srangana@redhat.com>
2019-08-13 16:08:22 +00:00
Madhu Rajanna
4ba2d0e10b Add xfs fstype as default type in storageclass
we have see better performace in device
format and mounting by setting the fstype to xfs
from default ext4.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-08-13 15:19:36 +00:00
Niels de Vos
31648c8feb provisioners: add reconfiguring of PID limit
The container runtime CRI-O limits the number of PIDs to 1024 by
default. When many PVCs are requested at the same time, it is possible
for the provisioner to start too many threads (or go routines) and
executing 'rbd' commands can start to fail. In case a go routine can not
get started, the process panics.

The PID limit can be changed by passing an argument to kubelet, but this
will affect all pids running on a host. Changing the parameters to
kubelet is also not a very elegant solution.

Instead, the provisioner pod can change the configuration itself. The
pod is running in privileged mode and can write to /sys/fs/cgroup where
the limit is configured.

With this change, the limit is configured to 'max', just as if there is
no limit at all. The logs of the csi-rbdplugin in the provisioner pod
will reflect the change it makes when starting the service:

    $ oc -n rook-ceph logs -c csi-rbdplugin csi-rbdplugin-provisioner-0
    ..
    I0726 13:59:19.737678       1 cephcsi.go:127] Initial PID limit is set to 1024
    I0726 13:59:19.737746       1 cephcsi.go:136] Reconfigured PID limit to -1 (max)
    ..

It is possible to pass a different limit on the commandline of the
cephcsi executable. The following flag has been added:

    --pidlimit=<int>       the PID limit to configure through cgroups

This accepts special values -1 (max) and 0 (default, do not
reconfigure). Other integers will be the limit that gets configured in
cgroups.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2019-08-13 14:43:29 +00:00
ShyamsundarR
885ec7049d Update Unstage transaction to undo steps done in Stage
In unstage we now adhere to the transaction (or order of steps)
done in Stage. To enable this we stash the image meta data
into a local file on the staging path for use with unstage
request.

This helps in unmapping a stale map, in case the mount or
other steps in the transaction are complete.

Signed-off-by: ShyamsundarR <srangana@redhat.com>
2019-08-13 14:07:52 +00:00
ShyamsundarR
44f7b1fe4b Use "rbd device list" to list and find rbd images and their device paths
This change also starts mapping nbd based access using ther rbd CLI
as, it is a prerequisite to get device listing for nbd as well.

Signed-off-by: ShyamsundarR <srangana@redhat.com>
2019-08-13 14:07:52 +00:00
ShyamsundarR
925bda2881 Move mounting staging instance to a sub-path within staging path
This commit moves the mounting of a block volumes and filesystems
to a sub-file (already the case) or a sub-dir within the staging
path.

This enables using the staging path to store any additional data
regarding the mount. For example, this will be extended in the
future to store the fsid of the cluster, and maybe the pool name
to map unmap requests to the right image.

Also, this fixes the noted hack in the code, to determine in a
common manner if there is a mount on the passed in staging path.

Signed-off-by: ShyamsundarR <srangana@redhat.com>
2019-08-13 14:07:52 +00:00
Niels de Vos
b812ec26df e2e: do not fail deleting resources when "resource not found"
Sometimes the tests fail cleaning up due unavailable resources that are
listed in the .yaml files. Deleting the missing resources returns
"resource not found". By passing --ignore-not-found to kubectl, this
problem should not happen anymore (and possibly makes it more obvious
where tests do go wrong).
2019-08-13 12:46:07 +00:00
Madhu Rajanna
17eb9fb47b
Merge pull request #544 from Madhu-1/skip-snap
Skip snapshot testing  in CI
2019-08-13 16:54:20 +05:30
Madhu Rajanna
90fef919d5 Skip snapshot testing in CI
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-08-13 16:22:55 +05:30