Commit Graph

144 Commits

Author SHA1 Message Date
Humble Chirammal
1efdf14ac5 At present, the request timeout of sidecars are at the 60s and this is a request to increase
this time out value to 150s or higher. The higher timeout value can help to reduce the
load of our backend ceph cluster and also can avoid throttling issues at sidecars to an extent.

Fix# #602

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2019-10-09 05:28:40 +00:00
Daniel-Pivonka
cd52798a51 Change default csi liveness ports to ones less common
Signed-off-by: Daniel-Pivonka <dpivonka@redhat.com>
2019-10-01 15:08:58 +00:00
wilmardo
6ee381db3a refactor: Merge 1.13 and 1.14 Helm charts and improve charts
Signed-off-by: wilmardo <info@wilmardenouden.nl>
2019-09-27 05:49:18 +00:00
Madhu Rajanna
e2890a27ff connect to provisioner socket
Fixes: #619

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-09-20 08:13:19 +00:00
Madhu Rajanna
a81a3bf96b implement grpc metrics for ceph-csi
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-08-30 06:50:32 +00:00
wilmardo
3111e7712a feat: Adds Ceph logo as icon for Helm charts
Signed-off-by: wilmardo <info@wilmardenouden.nl>
2019-08-20 05:34:28 +00:00
Madhu Rajanna
0da4bd5151 start controller or node server based on config
if both controller and nodeserver flags are set/unset
cephcsi will start both server,

if only one flag is set, it will start relavent
service.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-08-19 06:11:43 +00:00
wilmardo
0a90762970 fix: Adds liveness sidecar to v1.14+ helm charts
Signed-off-by: wilmardo <info@wilmardenouden.nl>
2019-08-16 08:38:49 +00:00
wilmardo
30fb7de118 feat: Implement helm lint
Signed-off-by: wilmardo <info@wilmardenouden.nl>
2019-08-16 07:38:33 +00:00
Daniel-Pivonka
d621a58207 prometheus liveness probe sidecar
Signed-off-by: Daniel-Pivonka dpivonka@redhat.com
2019-08-13 17:51:41 +00:00
wilmardo
cba6115e30 Fix 1.13 charts
Signed-off-by: wilmardo <info@wilmardenouden.nl>
2019-08-13 16:42:15 +00:00
wilmardo
ca5fbc180c Rework of helm charts
Signed-off-by: wilmardo <info@wilmardenouden.nl>
2019-08-13 16:42:15 +00:00
Niels de Vos
31648c8feb provisioners: add reconfiguring of PID limit
The container runtime CRI-O limits the number of PIDs to 1024 by
default. When many PVCs are requested at the same time, it is possible
for the provisioner to start too many threads (or go routines) and
executing 'rbd' commands can start to fail. In case a go routine can not
get started, the process panics.

The PID limit can be changed by passing an argument to kubelet, but this
will affect all pids running on a host. Changing the parameters to
kubelet is also not a very elegant solution.

Instead, the provisioner pod can change the configuration itself. The
pod is running in privileged mode and can write to /sys/fs/cgroup where
the limit is configured.

With this change, the limit is configured to 'max', just as if there is
no limit at all. The logs of the csi-rbdplugin in the provisioner pod
will reflect the change it makes when starting the service:

    $ oc -n rook-ceph logs -c csi-rbdplugin csi-rbdplugin-provisioner-0
    ..
    I0726 13:59:19.737678       1 cephcsi.go:127] Initial PID limit is set to 1024
    I0726 13:59:19.737746       1 cephcsi.go:136] Reconfigured PID limit to -1 (max)
    ..

It is possible to pass a different limit on the commandline of the
cephcsi executable. The following flag has been added:

    --pidlimit=<int>       the PID limit to configure through cgroups

This accepts special values -1 (max) and 0 (default, do not
reconfigure). Other integers will be the limit that gets configured in
cgroups.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2019-08-13 14:43:29 +00:00
ShyamsundarR
44f7b1fe4b Use "rbd device list" to list and find rbd images and their device paths
This change also starts mapping nbd based access using ther rbd CLI
as, it is a prerequisite to get device listing for nbd as well.

Signed-off-by: ShyamsundarR <srangana@redhat.com>
2019-08-13 14:07:52 +00:00
Madhu Rajanna
02bcb5f16a Enable leader election in v1.14+
Use Deployment with leader election instead of StatefulSet

Deployment behaves better when a node gets disconnected
from the rest of the cluster - new provisioner leader
is elected in ~15 seconds, while it may take up to
5 minutes for StatefulSet to start a new replica.

Refer: kubernetes-csi/external-provisioner@52d1fbc

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-08-05 07:11:44 +00:00
ShyamsundarR
bd204d7d45 Use --keyfile option to pass keys to all Ceph CLIs
Every Ceph CLI that is invoked at present passes the key via the
--key option, and hence is exposed to key being displayed on
the host using a ps command or such means.

This commit addresses this issue by stashing the key in a tmp
file, which is again created on a tmpfs (or empty dir backed by
memory). Further using such tmp files as arguments to the --keyfile
option for every CLI that is invoked.

This prevents the key from being visible as part of the argument list
of the invoked program on the system.

Fixes: #318

Signed-off-by: ShyamsundarR <srangana@redhat.com>
2019-07-25 12:46:15 +00:00
Madhu Rajanna
f4c80dec9a Implement NodeStage and NodeUnstage for rbd
in NodeStage RPC call  we  have to map the
device to the node plugin and make  sure  the
the device will be mounted to  the global path

in  nodeUnstage request unmount the device from
global path and unmap the device

if the volume mode is block  we will be creating
a file inside a stageTargetPath  and it will be
considered  as the global path

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-07-24 12:49:21 +00:00
ShyamsundarR
c4a3675cec Move locks to more granular locking than CPU count based
As detailed in issue #279, current lock scheme has hash
buckets that are count of CPUs. This causes a lot of contention
when parallel requests are made to the CSI plugin. To reduce
lock contention, this commit introduces granular locks per
identifier.

The commit also changes the timeout for gRPC requests to Create
and Delete volumes, as the current timeout is 10s (kubernetes
documentation says 15s but code defaults are 10s). A virtual
setup takes about 12-15s to complete a request at times, that leads
to unwanted retries of the same request, hence the increased
timeout to enable operation completion with minimal retries.

Tests to create PVCs before and after these changes look like so,

Before:
Default master code + sidecar provisioner --timeout option set
to 30 seconds

20 PVCs
Creation: 3 runs, 396/391/400 seconds
Deletion: 3 runs, 218/271/118 seconds
  - Once was stalled for more than 8 minutes and cancelled the run

After:
Current commit + sidecar provisioner --timeout option set to 30 sec
20 PVCs
Creation: 3 runs, 42/59/65 seconds
Deletion: 3 runs, 32/32/31 seconds

Fixes: #279
Signed-off-by: ShyamsundarR <srangana@redhat.com>
2019-07-01 14:10:14 +00:00
Humble Chirammal
027331c186 Use sidecar which support cloning
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2019-06-28 01:11:06 +00:00
Madhu Rajanna
59d3365d3b update statefulset and daemonset api-version
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-06-25 14:00:46 +00:00
Madhu Rajanna
983f28ad2f Revert "Use Deployment with leader election instead of StatefulSet"
This reverts commit a151bec94b.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-06-14 13:39:03 +00:00
Madhu Rajanna
bccfafdfb2 update helm chart version
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-06-10 16:54:05 +05:30
Madhu Rajanna
a151bec94b Use Deployment with leader election instead of StatefulSet
Deployment behaves better when a node gets disconnected from the rest of
the cluster - new provisioner leader is elected in ~15 seconds, while
it may take up to 5 minutes for StatefulSet to start a new replica.

Refer: 52d1fbcf9d

Fixes: #335

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-06-10 09:51:22 +05:30
Humble Devassy Chirammal
95252dd9f6
Merge pull request #390 from ShyamsundarR/stateless-cephfs
Make CephFS plugin stateless reusing RADOS based journal scheme
2019-06-07 10:44:18 +05:30
Humble Chirammal
45ae1c56e4 Promote sidecars to latest available version tags.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2019-06-02 15:01:34 +05:30
ShyamsundarR
b9cd0e18ad Make CephFS plugin stateless reusing RADOS based journal scheme
This is a part of the stateless set of commits for CephCSI.

This commit removes the dependency on config maps to store cephFS provisioned
volumes, and instead relies on RADOS based objects and keys, and required
CSI VolumeID encoding to detect the provisioned volumes.

Changes:
- Provide backward compatibility to provisioned volumes by older plugin versions (1.0.0 or older)
- Remove Create/Delete support for statically provisioned volumes (fixes #382)
- Added namespace support to RADOS OMaps and used the same to store RADOS CSI objects and keys in the CephFS metadata pool
- Added support to mention fsname for CephFS provisioning (fixes #359)
- Changed field name in CSI Identifier to 'location', to denote a pool or fscid
- Updated mounter cache to use new scheme
- Required Helm manifests are updated
- Required documentation and other manifests are updated
- Made driver option 'metadatastorage' as optional, as fresh installs do not need to specify the same

Testing done:
- Create/Mount/Delete PVC
- Create/Delete 5 PVCs
- Mount version 1.0.0 PVC
- Delete version 1.0.0 PV
- Mount Statically defined PV/PVC/Pod
- Mount Statically defined version 1.0.0 PV/PVC/Pod
- Delete Statically defined version 1.0.0 PV/PVC/Pod
- Node restart when mounted to test mountcache
- Use InstanceID other than 'default'
- RBD basic round of tests, as namespace is added to OMaps
- csitest against ceph-fs plugin
  - NOTE: CephFS plugin still does not detect and address already created
  volumes but of a different size
- Test not providing any value to the metadata storage parameter

Signed-off-by: ShyamsundarR <srangana@redhat.com>
2019-05-30 06:20:35 -04:00
Madhu Rajanna
2d560ba087 update ceph-csi to build and use a single docker image
currently, we have 3 docker files(cephcsi,rbd,cephfs) in the ceph-csi repo.
[commit ](85e121ebfe)
added by John to build a single image which can act as rbd or
cephfs based on the input configuration.

This PR updates the makefile and kubernetes templates to use
the unified image and also its deletes the other two dockerfiles.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-05-28 18:10:22 +00:00
ShyamsundarR
d02e50aa9b Removed config maps and replaced with rados omaps
Existing config maps are now replaced with rados omaps that help
store information regarding the requested volume names and the rbd
image names backing the same.

Further to detect cluster, pool and which image a volume ID refers
to, changes to volume ID encoding has been done as per provided
design specification in the stateless ceph-csi proposal.

Additional changes and updates,
- Updated documentation
- Updated manifests
- Updated Helm chart
- Addressed a few csi-test failures

Signed-off-by: ShyamsundarR <srangana@redhat.com>
2019-05-19 12:29:33 +00:00
Humble Chirammal
68ff602391 Resolve merge conflict
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2019-05-07 15:27:34 +05:30
Humble Chirammal
1eff2e1490 Merge branch 'master' of http://github.com/ceph/ceph-csi into csi-v1.0
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2019-05-07 15:14:14 +05:30
Miao Zhou
00e7e29996 bump up the chart version 2019-05-06 15:52:45 +08:00
Zhou Miao
a01c01b01b fix helm value pullPolicy mismatch bug 2019-04-25 12:03:44 +08:00
Kaushal M
63d00afb28
deploy: Use aggregated ClusterRoles
The kubernetes manifests and Helm templates have been updated to use
aggregated ClusterRoles. The same change has been done in Rook as well.

Refer rook/rook#2634 and rook/rook#2975

Signed-off-by: Kaushal M <kshlmster@gmail.com>
2019-04-17 11:15:08 +05:30
Madhu Rajanna
849de000f4 updated helm chat version
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-04-04 11:13:16 +05:30
Madhu Rajanna
3767375b6a Add csidriver CRD
if attacher is not enabled, we need to
create the csidriver CRD with spec
to make attachRequired as false to
skip volume attach check in kube.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-04-04 11:11:29 +05:30
Madhu Rajanna
e4d830a2c2 remove extra node rules in provisioner
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-04-04 11:11:29 +05:30
Madhu Rajanna
c6b4e47723 add if condition for attacher
adding the condition will help us
to easily remove the attacher later.
or even we can add else condition
if we have an alternate to attacher.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-04-04 11:11:29 +05:30
Madhu Rajanna
54d52bb411 update attacher endpoint
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-04-04 11:11:29 +05:30
Madhu Rajanna
94f7ac3d4e update cephfs helm template to deploy attacher sidecar container in provisioner.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-04-04 11:11:29 +05:30
Madhu Rajanna
168468a934 deploy cssi-attacher as sidecar container in provisioner
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-04-04 11:11:29 +05:30
Róbert Vašek
d0d5da83c9
Merge pull request #282 from huaizong/improve-remount-pv-path-when-exit-v2
remount old mount point when csi plugin unexpect exit
2019-04-02 08:36:07 +02:00
王怀宗
acdc759029 bump up the chart version 2019-04-01 16:48:30 +00:00
王怀宗
bb6754fb37 csi-provisioner rbac add resources nodes get, list, watch #293 2019-04-01 16:48:30 +00:00
王怀宗
1ccbb5b6a5 cephfs driver deploy support remount volume 2019-03-29 16:12:09 +08:00
Madhu Rajanna
52397b4dc4 rename socket directory to a common name
as the socket directory will be created
inside the container no need to follow
the plugin name in for the directory
creation, this will also reduce the code
changes if we want to change driver name.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-03-22 09:58:21 +05:30
Madhu Rajanna
497411b26c update readme to delete namespace
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-03-13 12:04:30 +05:30
Madhu Rajanna
d61a87b42e Fix driver name as per CSI spec
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-03-13 12:04:30 +05:30
Madhu Rajanna
c0745486a7 add event rules for provisioner
Fixes: #https://github.com/ceph/ceph-csi/pull/234#issuecomment-468967752

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-03-04 14:34:14 +00:00
Madhu Rajanna
eb14742874 bump helm chat version from 0.4.0 to 0.5.0
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-02-28 13:41:11 +05:30
Madhu Rajanna
b629b22cf0 Add csinodeinfos rules
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-02-27 19:29:11 +05:30
Madhu Rajanna
119504c004 Add role and rolebinding for cephfs
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-02-27 16:44:46 +05:30
Madhu Rajanna
c9815e99a9 Fix rbac issue in cephfs plugin
remove unwanted rules and update
rbac to have permission to modify
endpoints and configmaps in the
current namespace.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-02-27 16:38:20 +05:30
Madhu Rajanna
55ad4924b3 update readme to deploy cephfs in namespace
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-02-25 21:24:50 +05:30
Madhu Rajanna
27b46aba08 Add helm chat for cephfs
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2019-02-21 10:38:25 +05:30
Madhu Rajanna
cee9c4f8b2 Fix yamllint issues
Signed-off-by: Madhu Rajanna <mrajanna@redhat.com>
2019-02-07 12:19:14 +00:00
Huamin Chen
6df22b38ba
Merge branch 'csi-v1.0' into fix-134 2019-02-04 10:57:56 -05:00
Madhu Rajanna
ad06507aca update sidecar containers to v1.0.1 stable release
Fixes: #134

Signed-off-by: Madhu Rajanna <mrajanna@redhat.com>
2019-02-04 15:34:12 +05:30
Huamin Chen
e4b24711f6 cope with latest changes in csi provisioner and deprecations 2019-01-23 10:58:50 -05:00
Huamin Chen
e0e764b3a1 review feedback: tune rbd provisioner rbac
Signed-off-by: Huamin Chen <hchen@redhat.com>
2019-01-23 10:05:15 -05:00
Huamin Chen
7caf03b556 review feedback: tune cephfs provisioner and driver rbac, de-escalate privilage
Signed-off-by: Huamin Chen <hchen@redhat.com>
2019-01-23 09:14:11 -05:00
Huamin Chen
c6c496ff59 switch to node registrar 2019-01-22 14:46:41 -05:00
Huamin Chen
aed7506d88 fix merge leftovers; use canary driver-registrar image, as v1.0.0 is not hosted in quay.io
Signed-off-by: Huamin Chen <hchen@redhat.com>
2019-01-15 13:31:06 -05:00
Huamin Chen
85b8415024 Merge branch 'master' into master-to-1.0 2019-01-15 16:15:30 +00:00
mickymiek
b23ee70d7f fix rbac rules for configmaps 2019-01-14 20:15:09 +00:00
mickymiek
7d47bb0698 make k8s_configmap default metadatastorage for k8s deployments 2019-01-14 20:15:09 +00:00
mickymiek
d64dc3a1b2 modified cephfs deployment 2019-01-14 20:15:09 +00:00
Huamin Chen
095044fc90 switch to centos base image 2019-01-14 20:15:09 +00:00
Mike Cronce
a0be6e27d3 deploy/cephfs/kubernetes/csi-cephfsplugin.yaml: Add /var/lib/kubelet/plugins/kubernetes.io/csi bidirectional mount into plugin container 2018-12-14 15:16:11 -05:00
Mike Cronce
5ae81821e4 deploy/cephfs/kubernetes/csi-cephfsplugin.yaml: Made volumeMounts for plugin container slightly more readable 2018-12-14 15:06:42 -05:00
Mike Cronce
82b7904542 deploy/cephfs/kubernetes: Use CSI 1.x plugin directory 2018-12-04 15:38:10 -05:00
Mike Cronce
d46dc33611 deploy/cephfs: Updated all image tags from v0.3.0 to v1.0.0 2018-11-29 13:16:19 -05:00
Huamin Chen
b2459574ee switch to centos base image 2018-11-20 14:46:29 +00:00
Huamin Chen
4453cfce5b set dns policy in csi plugin so storage class can use mons' FQDN
Signed-off-by: Huamin Chen <hchen@redhat.com>
2018-09-19 14:39:43 +00:00
Masaki Kimura
02fdf238b0 Add configurations to handle kubelet-plugin-watcher to sample yaml files
Fixes: #73
2018-09-10 19:16:17 +00:00
gman
e2910f1c18 deployment update for 0.3.0 2018-08-07 15:11:22 +02:00
Wong Hoi Sing Edison
1fbd3e69de Update CEPH_VERSION to mimic 2018-07-04 12:20:56 +08:00
gman
a6181200c1 cephfs/deploy: bump csi-provisioner to 0.2.1 2018-06-12 17:10:54 +02:00
gman
9bbabc2f5d cephfs/deploy: updates storage class, secrets 2018-04-13 15:25:13 +02:00
gman
f881bf5249 cephfs/Dockerfile: added attr package 2018-04-13 14:35:38 +02:00
gman
48b4177949 cephfs/Makefile: renamed image to quay.io/cephcsi/cephfsplugin 2018-03-26 15:02:20 +02:00
gman
a585f083ab cephfs/cephfsplugin.yaml: mount hosts's /dev into csi-cephfsplugin container 2018-03-22 16:51:39 +01:00
gman
4c5c67b8f9 cephfs: check volumeOptions.Mounter and choose ceph-fuse or mount.ceph accordingly 2018-03-22 14:14:57 +01:00
gman
e45f87632e cephfs/Dockerfile: use ceph's package repositories instead 2018-03-22 14:14:47 +01:00
gman
f7cdd5a9bd cephfs/deploy: added more convenience scripts 2018-03-20 16:40:31 +01:00
gman
e0935a9772 added cephfs/secret.yaml 2018-03-20 16:40:31 +01:00
gman
e0b8767401 cephfs/Dockerfile: ceph-common package not needed anymore 2018-03-20 16:40:31 +01:00
gman
0df8415067 cephfs: cleaning/renaming 2018-03-20 15:46:31 +01:00
gman
257a11780f cephfs/deploy/k8s: updated naming and some permissions 2018-03-18 15:08:39 +01:00
gman
66c16e35e6 cephfs: refactoring for CSI 0.2.0 part 1 2018-03-13 10:25:50 +01:00
gman
06f411bbf3 cephfs: volumes are now created for separate ceph users with limited access to fs
Uses a slightly modified version of https://github.com/kubernetes-incubator/external-storage/blob/master/ceph/cephfs/cephfs_provisioner/cephfs_provisioner.py
This should be rewritten properly in Go, but for it works for now - for demonstration purposes

TODO:
* readOnly is not taken into account
* controllerServer.DeleteVolume does nothing
2018-03-09 17:05:19 +01:00
gman
3dc810a75b cephfs: lowered permissions in cephfsplugin.yaml 2018-03-09 17:03:31 +01:00
gman
6655b87683 updated .gitignore 2018-03-09 17:01:42 +01:00
gman
aa023ea405 cephfs: set access mode to MULTI_NODE_MULTI_WRITER; controller (un)publish is not needed 2018-03-07 14:19:08 +01:00
gman
1c1b0eab1e WIP cephfs CSI plugin 2018-03-05 13:21:30 +01:00