Currently the CephFS provisioner mounts the Ceph filesystem
and creates a subdirectory as part of provisioning the
volume. Ceph now supports commands to provision fs subvolumes,
hence modify the provisioner to use ceph mgr commands to
(de)provision fs subvolumes.
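A minimal sketch of the mgr-based (de)provisioning, shelling out to the
ceph CLI; the helper names and exact argument handling are illustrative
rather than the plugin's actual code:

    package cephfs

    import (
        "os/exec"
        "strconv"
    )

    // createSubvolume asks the ceph mgr to provision a subvolume instead
    // of mounting the filesystem and creating a directory by hand.
    func createSubvolume(fsName, subvolName, group string, sizeBytes int64) error {
        return exec.Command("ceph",
            "fs", "subvolume", "create", fsName, subvolName,
            "--size", strconv.FormatInt(sizeBytes, 10),
            "--group_name", group).Run()
    }

    // deleteSubvolume removes the subvolume during DeleteVolume.
    func deleteSubvolume(fsName, subvolName, group string) error {
        return exec.Command("ceph",
            "fs", "subvolume", "rm", fsName, subvolName,
            "--group_name", group).Run()
    }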
Signed-off-by: Poornima G <pgurusid@redhat.com>
This commit adds support to mount and delete volumes provisioned by older
plugin versions (1.0.0), in order to remain backward compatible with
1.0.0 created volumes.
It adds back the metadatastorage option to the plugin, which specifies
where the older metadata was stored, and uses the provided metadata to
mount and delete the older volumes.
It also supports a variety of ways in which monitor information may have been
specified (in the storage class, or in the secret), to keep the monitor
information current.
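A sketch of the monitor lookup order described above; the parameter
names "monitors" and "monValueFromSecret" (naming the user specified
secret key) and the helper itself are illustrative assumptions, not
necessarily the plugin's exact identifiers:

    package util

    import "errors"

    // getMonitors resolves the monitor list for a pre-1.1.0 volume: from
    // the storage class parameters, from the secret under the default key
    // "monitors", or from the secret under a user specified key.
    func getMonitors(params, secrets map[string]string) (string, error) {
        if mon := params["monitors"]; mon != "" {
            return mon, nil
        }
        key := "monitors"
        if k := params["monValueFromSecret"]; k != "" {
            key = k
        }
        if mon := secrets[key]; mon != "" {
            return mon, nil
        }
        return "", errors.New("monitor information missing in storage class and secret")
    }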
Testing done:
- Mount/Delete 1.0.0 plugin created volume with monitors in the StorageClass
- Mount/Delete 1.0.0 plugin created volume with monitors in the secret with
a key "monitors"
- Mount/Delete 1.0.0 plugin created volume with monitors in the secret with
a user specified key
- PVC creation and deletion with the current version (to ensure, at a
minimum, that existing functionality is not broken)
- Tested some negative cases, where monitor information is missing in secrets
or present with a different key name, to verify that failure scenarios behave
as expected
Updates #378
Follow-up work:
- Documentation on how to upgrade to the 1.1 plugin and retain the above
functionality for older volumes
Signed-off-by: ShyamsundarR <srangana@redhat.com>
As detailed in issue #279, the current lock scheme uses hash
buckets whose count equals the number of CPUs. This causes a lot of
contention when parallel requests are made to the CSI plugin. To
reduce lock contention, this commit introduces granular locks per
identifier.
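A minimal sketch of per-identifier locking; the type and method names
are illustrative, not the plugin's actual API:

    package util

    import "sync"

    // OperationLock tracks in-flight operations per volume/snapshot ID so
    // that two requests for the same ID serialize (or fail fast) without
    // contending on a shared bucket with unrelated IDs.
    type OperationLock struct {
        mux sync.Mutex
        ids map[string]struct{}
    }

    func NewOperationLock() *OperationLock {
        return &OperationLock{ids: make(map[string]struct{})}
    }

    // TryAcquire returns false when an operation on id is already running.
    func (l *OperationLock) TryAcquire(id string) bool {
        l.mux.Lock()
        defer l.mux.Unlock()
        if _, busy := l.ids[id]; busy {
            return false
        }
        l.ids[id] = struct{}{}
        return true
    }

    func (l *OperationLock) Release(id string) {
        l.mux.Lock()
        defer l.mux.Unlock()
        delete(l.ids, id)
    }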
The commit also changes the timeout for gRPC requests to Create and
Delete volumes, as the current timeout is 10s (the Kubernetes
documentation says 15s, but the code defaults to 10s). A virtual
setup at times takes about 12-15s to complete a request, which leads
to unwanted retries of the same request; hence the timeout is
increased to enable operation completion with minimal retries.
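For illustration, a sketch of how a 30 second deadline bounds a
CreateVolume call (the provisioner sidecar's --timeout flag plays this
role); the wrapper is not the sidecar's actual code:

    package util

    import (
        "context"
        "time"

        "github.com/container-storage-interface/spec/lib/go/csi"
    )

    // createWithTimeout gives CreateVolume a deadline comfortably above
    // the ~12-15s a slow virtual setup may need, so the call can finish
    // instead of being cancelled and retried while still in flight.
    func createWithTimeout(c csi.ControllerClient, req *csi.CreateVolumeRequest) (*csi.CreateVolumeResponse, error) {
        ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
        defer cancel()
        return c.CreateVolume(ctx, req)
    }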
Tests creating PVCs before and after these changes look as follows:
Before:
Default master code + sidecar provisioner --timeout option set
to 30 seconds
20 PVCs
Creation: 3 runs, 396/391/400 seconds
Deletion: 3 runs, 218/271/118 seconds
- One run stalled for more than 8 minutes and was cancelled
After:
Current commit + sidecar provisioner --timeout option set to 30 sec
20 PVCs
Creation: 3 runs, 42/59/65 seconds
Deletion: 3 runs, 32/32/31 seconds
Fixes: #279
Signed-off-by: ShyamsundarR <srangana@redhat.com>
DeleteSnapshot, like DeleteVolume, should return success when it
detects that the snapshot keys are missing from the RADOS OMaps that
store the snapshot UUID to request name mapping.
This was missing in the code, and is now added.
Also reduced code duplication in fetching the pool list from Ceph.
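A sketch of the idempotency check under these assumptions; errKeyNotFound
and fetchUUID are illustrative stand-ins for the plugin's OMap helpers:

    package rbd

    import "errors"

    // errKeyNotFound is an illustrative sentinel for a missing OMap key.
    var errKeyNotFound = errors.New("omap key not found")

    // deleteSnapshot treats missing OMap keys (the snapshot UUID to
    // request name mapping) as "already deleted" and reports success,
    // keeping the operation idempotent.
    func deleteSnapshot(snapName string, fetchUUID func(string) (string, error)) error {
        _, err := fetchUUID(snapName)
        if errors.Is(err, errKeyNotFound) {
            return nil // keys already gone: report success
        }
        if err != nil {
            return err
        }
        // ... proceed with the actual snapshot deletion ...
        return nil
    }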
Signed-off-by: ShyamsundarR <srangana@redhat.com>
The RBD plugin needs only a single ID to manage images and operations against
a pool mentioned in the storage class. The current scheme of 2 IDs is hence
not needed and is removed in this commit.
Further, unlike the CephFS plugin, the RBD plugin splits the user ID and the
key across the storage class and the secret respectively. Also, the parameter
name for the key in the secret is noted in the storage class, making it a
variant and hampering usability/comprehension. This is also fixed by moving
both the ID and the key to the secret, and not retaining them in the storage
class, like CephFS.
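With both the ID and the key in the secret, credential handling reduces
to a single lookup; a sketch assuming the secret keys "userID" and
"userKey" (the key names are an assumption here):

    package rbd

    import "errors"

    // getCredentials reads the Ceph user ID and key from the CSI secret,
    // mirroring the CephFS plugin's scheme; no credential data remains
    // in the storage class.
    func getCredentials(secrets map[string]string) (id, key string, err error) {
        if id = secrets["userID"]; id == "" {
            return "", "", errors.New("missing user ID in secret")
        }
        if key = secrets["userKey"]; key == "" {
            return "", "", errors.New("missing user key in secret")
        }
        return id, key, nil
    }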
Fixes: #270
Testing done:
- Basic PVC creation and mounting
Signed-off-by: ShyamsundarR <srangana@redhat.com>
Update the example RBD PVC from ReadWriteMany
to ReadWriteOnce, as RBD supports ReadWriteMany
only if the PVC volume mode is Block.
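For illustration, the valid combinations expressed with the Kubernetes
API types (a sketch, not the shipped example manifest): Filesystem
volume mode pairs with ReadWriteOnce, while ReadWriteMany requires
Block mode:

    package example

    import (
        corev1 "k8s.io/api/core/v1"
        "k8s.io/apimachinery/pkg/api/resource"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    )

    // rbdPVC builds a PVC for an RBD volume: Filesystem mode with
    // ReadWriteOnce by default, Block mode when ReadWriteMany is needed.
    func rbdPVC(block bool) corev1.PersistentVolumeClaim {
        mode := corev1.PersistentVolumeFilesystem
        access := corev1.ReadWriteOnce
        if block {
            mode = corev1.PersistentVolumeBlock
            access = corev1.ReadWriteMany
        }
        return corev1.PersistentVolumeClaim{
            ObjectMeta: metav1.ObjectMeta{Name: "rbd-pvc"},
            Spec: corev1.PersistentVolumeClaimSpec{
                AccessModes: []corev1.PersistentVolumeAccessMode{access},
                VolumeMode:  &mode,
                Resources: corev1.ResourceRequirements{
                    Requests: corev1.ResourceList{
                        corev1.ResourceStorage: resource.MustParse("1Gi"),
                    },
                },
            },
        }
    }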
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
* Enable all static-checks in golangci-lint
* Update golangci-lint version
* Fix issues found by golangci-lint
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Deployment behaves better when a node gets disconnected from the rest of
the cluster - a new provisioner leader is elected in ~15 seconds, while
it may take up to 5 minutes for a StatefulSet to start a new replica.
Refer: 52d1fbcf9d
Fixes: #335
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>