This commit adds the validation of csi cephfs driver to work with
ephemeral volume support. With ephemeral volume support a user can
specify ephemeral volumes in its pod spec and tie the lifecycle
of the PVC with the POD.
An example POD spec also included in this commit.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
This commit adds the validation of csi RBD driver to work with
ephemeral volume support. With ephemeral volume support a user can
specify ephemeral volumes in its pod spec and tie the lifecycle
of the PVC with the POD.
An example pod spec is also included in this commit.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
current latest vault release is 1.9.0 but
with the latest image our E2E is broken.
reverting back the vault version to 1.8.5
till we root cause the issue.
Note:- This is to unblock PR merging
updates: #2657
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
This change allows the user to choose not to fallback to NBD mounter
when some ImageFeatures are absent with krbd driver, rather just fail
the NodeStage call.
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Thick-provisioning was introduced to make accounting of assigned space
for volumes easier. When thick-provisioned volumes are the only consumer
of the Ceph cluster, this works fine. However, it is unlikely that this
is the case. Instead, accounting of the requested (thin-provisioned)
size of volumes is much more practical as different types of volumes can
be tracked.
OpenShift already provides cluster-wide quotas, which can combine
accounting of requested volumes by grouping different StorageClasses.
In addition to the difficult practise of allowing only thick-provisioned
RBD backed volumes, the performance makes thick-provisioning
troublesome. As volumes need to be completely allocated, data needs to
be written to the volume. This can take a long time, depending on the
size of the volume. Provisioning, cloning and snapshotting becomes very
much noticeable, and because of the additional time consumption, more
prone to failures.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
The command `vault monitor` can be used to stream logging from the Vault
service. This is very helpful while debugging Vault configuration
failures.
By adding a 2nd container to the Vault deployment, it is now possible to
get the messages from the Vault service by running
$ kubectl logs -c monitor <vault-pod-0123abcd>
This will be very useful when the e2e tests do not delete the deployment
after a failure and fetch the logs from all containers.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Currently, we delete the ceph client log file on unmap/detach.
This patch provides additional alternatives for users who would like to
persist the log files.
Strategies:
-----------
`remove`: delete log file on unmap/detach
`compress`: compress the log file to gzip on unmap/detach
`preserve`: preserve the log file in text format
Note that the default strategy will be remove on unmap, and these options
can be tweaked from the storage class
Compression size details example:
On Map: (with debug-rbd=20)
---------
$ ls -lh
-rw-r--r-- 1 root root 526K Sep 1 18:15
rbd-nbd-0001-0024-fed5480a-f00f-417a-a51d-31d8a8144c03-0000000000000003-d2e89c87-0b4d-11ec-8ea6-160f128e682d.log
On unmap:
---------
$ ls -lh
-rw-r--r-- 1 root root 33K Sep 1 18:15
rbd-nbd-0001-0024-fed5480a-f00f-417a-a51d-31d8a8144c03-0000000000000003-d2e89c87-0b4d-11ec-8ea6-160f128e682d.gz
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
cephLogDir: is a storage class option that is passed to rbd-nbd daemon.
cephLogDirHostPath: is a nodeplugin daemonset level option that helps in
using the right host-path while bind-mounting
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
added an example configmap for the
ceph.conf file. This allows the users to
specify different ceph configuration when
deploying cephcsi.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
- One logfile per device/volume
- Add ability to customize the logdir, default: /var/log/ceph
Note: if user customizes the hostpath to something else other than default
/var/log/ceph, then it is his responsibility to update the `cephLogDir`
in storageclass to reflect the same with daemon:
```
cephLogDir: "/var/log/mynewpath"
```
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
The kmsConfig type in the e2e suite has been enhanced with two functions
that make it possible to validate the destruction of deleted keys.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Hashicorp Vault does not completely remove the secrets in a kv-v2
backend when the keys are deleted. The metadata of the keys will be
kept, and it is possible to recover the contents of the keys afterwards.
With the new `vaultDestroyKeys` configuration parameter, this behaviour
can now be selected. By default the parameter will be set to `true`,
indicating that the keys and contents should completely be destroyed.
Setting it to any other value will make it possible to recover the
deleted keys.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
The new `vaultAuthNamespace` configuration parameter can be set to the
Vault Namespace where the authentication is setup in the service. Some
Hashicorp Vault deployments use sub-namespaces for their users/tenants,
with a 'root' namespace where the authentication is configured. This
requires passing of different Vault namespaces for different operations.
Example:
- the Kubernetes Auth mechanism is configured for in the Vault
Namespace called 'devops'
- a user/tenant has a sub-namespace called 'devops/website' where the
encryption passphrases can be placed in the key-value store
The configuration for this, then looks like:
vaultAuthNamespace: devops
vaultNamespace: devops/homepage
Note that Vault Namespaces are a feature of the Hashicorp Vault
Enterprise product, and not part of the Open Source version. This
prevents adding e2e tests that validate the Vault Namespace
configuration.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
registry.centos.org is not officially maintained by the CentOS
infrastructure team. The container images on quay.io are the official
once and we should use those instead.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
It seems that the version of the key/value engine can not always be
detected for Hashicorp Vault. In certain cases, it is required to
configure the `VAULT_BACKEND` (or `vaultBackend`) option so that a
successful connection to the service can be made.
The `kv-v2` is the current default for development deployments of
Hashicorp Vault (what we use for automated testing). Production
deployments default to version 1 for now.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
This change resolves a typo for installing the CSIDriver
resource in Kubernetes clusters before 1.18,
where the apiVersion is incorrect.
See also:
https://kubernetes-csi.github.io/docs/csi-driver-object.html
[ndevos: replace v1betav1 in examples with v1beta1]
Signed-off-by: Thomas Kooi <t.j.kooi@avisi.nl>
parseTenantConfig() only allowed configuring a defined set of options,
and KMSs were not able to re-use the implementation. Now, the function
parses the ConfigMap from the Tenants Namespace and returns a map with
options that the KMS supports.
The map that parseTenantConfig() returns can be inspected by the KMS,
and applied to the vaultTenantConnection type by calling parseConfig().
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Kubelet sometimes reports the following error:
failed to "StartContainer" for "vault-init-job" with CreateContainerConfigError: container has runAsNonRoot and image will run as root
Setting securityContext.runAsUser resolves this.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
The ServiceAccount "ceph-csi-vault-sa" is expected to be placed in the
Namespace "tenant" so that the provisioner and node-plugin fetch the
ServiceAccount from a Namespace where Ceph-CSI is not deployed.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
This commit adds e2e for user secret based metadata encryption,
adds user-secret.yaml and makes required changes in kms-connection-details,
kms-config yamls.
Signed-off-by: Rakshith R <rar@redhat.com>
The deployment of the Vault ConfigMap for the init-scripts job contains
a List with a single Item. This can be cleaned up to just be a ConfigMap
(without the list structure around it).
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Testing encrypted PVCs does not work anymore since Kubernetes v1.21. It
seems that disabling the iss validation in Hashicorp Vault is a
relatively simple workaround that we can use instead of the more complex
securing of the environment like should be done in production
deployments.
Updates: #1963
See-also: external-secrets/kubernetes-external-secrets#721
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Current rbd plugin only supports the layering feature
for rbd image. Add exclusive-lock and journaling image
features for the rbd.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Signed-off-by: woohhan <woohyung_han@tmax.co.kr>
With v4.0.0 release of external-snapshotter, we are moving towards v1
from v1beta1 API version
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Add an option to the StorageClass to support creating fully allocated
(thick provisioned) RBD images
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Signed-off-by: Niels de Vos <ndevos@redhat.com>
A tenant can place a ConfigMap in their Kubernetes Namespace with
configuration options that differ from the global (by the Storage Admin
set) values.
The ConfigMap needs to be located in the Tenants namespace, as described
in the documentation
See-also: docs/design/proposals/encryption-with-vault-tokens.md
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Deploying Vault still fails on occasion. It seems that the
imagePullPolicy has not been configured for the container yet.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
On occasion deploying Vault fails. It seems the vault-init-job batch job
does not use a full-qualified-image-name for the "vault" container.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Add "sys/mounts" so that VaultBackendKey does not need to be set. The
libopenstorage API detects the version for the key-value store in Vault
by reading "sys/mounts". Without permissions to read this endpoint, the
VaultBackendKey is required to be configured.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Docker Hub offers a way to pull official images without any project
prefix, like "docker.io/vault:latest". This does a redirect to the
images located under "docker.io/library".
By using the full qualified image name, a redirect gets removed while
pulling the images. This reduces the likelyhood of hittin Docker Hub
pull rate-limits.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Images that have an unqualified name (no explicit registry) come from
Docker Hub. This can be made explicit by adding docker.io as prefix. In
addition, the default :latest tag has been added too.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
The BlockVolume PVC tests consume the example files that refer to
"centos:latest" without registry. This means that the images will get
pulled from Docker Hub, which has rate limits preventing CI jobs from
pulling images.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
MD014 - Dollar signs used before commands without showing output
The dollar signs are unnecessary, it is easier to copy and paste and
less noisy if the dollar signs are omitted. Especially when the
command doesn't list the output, but if the command follows output
we can use `$ ` (dollar+space) mainly to differentiate between
command and its ouput.
scenario 1: when command doesn't follow output
```console
cd ~/work
```
scenario 2: when command follow output (use dollar+space)
```console
$ ls ~/work
file1 file2 dir1 dir2 ...
```
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
This patch with touch on the varuious other fields with in the storage class
yamls and label them with optional/required.
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Obviously expecting a pool with name `rbd` or CephFS name `myfs`
will be a limitation, as the pool/fs is created by admin manually,
let them choose the name that suits their requirement and come back
edit it in the storage class.
Making the pool/fs name as required field will give more attention,
else with new users it will be mostly left unedited until one hit
the errors saying no pool/fs exists.
This patch clips-off the default pool/fs name and make it a mandatory
field.
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
The intention here is to keep the example YAMLs of CephFS
with recommended Access Mode of CephFS which is RWX instead of RWO.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
Currently, the script does not deploy the driver singlehandedly;
As the vault creation needs to be done prior to that.
The script now includes the vault creation so that
one script can be sufficient to deploy the rbd driver.
Signed-off-by: Yug <yuggupta27@gmail.com>
Added support for RBD PVC to PVC cloning, below
commands are executed to create a PVC-PVC clone from
RBD side.
* Check the depth(n) of the cloned image if n>=(hard limit -2)
or ((soft limit-2) Add a task to flatten the image and return
about (to avoid image leak) **Note** will try to flatten the
temp clone image in the chain if available
* Reserve the key and values in omap (this will help us to
avoid the leak as it's not reserved earlier as we have returned
ABORT (the request may not come back))
* Create a snapshot of rbd image
* Clone the snapshot (temp clone)
* Delete the snapshot
* Snapshot the temp clone
* Clone the snapshot (final clone)
* Delete the snapshot
```bash
1) check the image depth of the parent image if flatten required
add a task to flatten image and return ABORT to avoid leak
(hardlimit-2 and softlimit-2 check will be done)
2) Reserve omap keys
2) rbd snap create <RBD image for src k8s volume>@<random snap name>
3) rbd clone --rbd-default-clone-format 2 --image-feature
layering,deep-flatten <RBD image for src k8s volume>@<random snap>
<RBD image for temporary snap image>
4) rbd snap rm <RBD image for src k8s volume>@<random snap name>
5) rbd snap create <cloned RBD image created in snapshot process>@<random snap name>
6) rbd clone --rbd-default-clone-format 2 --image-feature <k8s dst vol config>
<RBD image for temporary snap image>@<random snap name> <RBD image for k8s dst vol>
7)rbd snap rm <RBD image for src k8s volume>@<random snap name>
```
* Delete temporary clone image created as part of clone(delete if present)
* Delete rbd image
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Updated storageclass and snapshotclass
to include the name prefix for naming
subvolumes and snapshots.
Fixes: #1087
Signed-off-by: Yug Gupta <ygupta@redhat.com>
The name of the CephFS SubvolumeGroup for the CSI volumes was hardcoded to "csi". To make permission management in multi tenancy environments easier, this commit makes it possible to configure the CSI SubvolumeGroup.
related to #798 and #931
This commit adds support to mention dataPool parameter for the
topology constrained pools in the StorageClass, that can be
leveraged to mention erasure coded pool names to use for RBD
data instead of the replica pools.
Signed-off-by: ShyamsundarR <srangana@redhat.com>
we need to provide access to the Service
account created with helm charts to access
the vault service.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
As kubernetes CSI sidecar is exposing the
GRPC mertics we can make use of the same in
ceph-csi we dont need to expose our own.
update: #881
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
While running the 'make test' target and have 'yamllint' available, the
test fails with the following exception:
yamllint -s -d {extends: default, rules: {line-length: {allow-non-breakable-inline-mappings: true}},ignore: charts/*/templates/*.yaml} ./examples/rbd/storageclass.yaml
Traceback (most recent call last):
File "/usr/local/bin/yamllint", line 11, in <module>
sys.exit(run())
File "/usr/local/lib/python3.6/site-packages/yamllint/cli.py", line 181, in run
problems = linter.run(f, conf, filepath)
File "/usr/local/lib/python3.6/site-packages/yamllint/linter.py", line 237, in run
content = input.read()
File "/usr/lib64/python3.6/encodings/ascii.py", line 26, in decode
return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 1947: ordinal not in range(128)
The quotes used in the comments seem to be non-ascii characters.
Replacing these with standard " makes the test pass again.
This problem occurred while running tests in a container based on the
Ceph image (CentOS-7) with Python 3. Travis CI might still use Python 2
for yamllint, and hide the problem.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
* moves KMS type from StorageClass into KMS configuration itself
* updates omapval used to identify KMS to only it's ID without the type
why?
1. when using multiple KMS configurations (not currently supported)
automated parsing of kms configuration will be failing because some
entries in configs won't comply with the requested type
2. less options are needed in the StorageClass and less data used to
identify the KMS
Signed-off-by: Vasyl Purchel vasyl.purchel@workday.com
Signed-off-by: Andrea Baglioni andrea.baglioni@workday.com
- adds proposal document for PVC encryption from PR448
- adds per-volume encription by generating encryption passphrase
for each volume and storing it in a KMS
- adds HashiCorp Vault integration as a KMS for encryption passphrases
- avoids encrypting volume second time if it was already encrypted but
no file system created
- avoids unnecessary checks if volume is a mapped device when encryption
was not requested
- prevents resizing encrypted volumes (it is not currently supported)
- prevents creating snapshots from encrypted volumes to prevent attack
on encryption key (security guard until re-encryption of volumes
implemented)
Signed-off-by: Vasyl Purchel vasyl.purchel@workday.com
Signed-off-by: Andrea Baglioni andrea.baglioni@workday.comFixes#420Fixes#744
Fedora 26 has been End-Of-Life for a long time already, is should not be
used anymore. Instead use the latest CentOS image that gets regular
updates. The Ceph base image used by Ceph-CSI is also based on CentOS,
so uses would be familiar with the available tools.
Also use "sleep infinity" instead of a weird 'tail -f /dev/null' hack.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Adds encryption in StorageClass as a parameter. Encryption passphrase is
stored in kubernetes secrets per StorageClass. Implements rbd volume
encryption relying on dm-crypt and cryptsetup using LUKS extension
The change is related to proposal made earlier. This is a first part of
the full feature that adds encryption with passphrase stored in secrets.
Signed-off-by: Vasyl Purchel vasyl.purchel@workday.com
Signed-off-by: Andrea Baglioni andrea.baglioni@workday.com
Signed-off-by: Ioannis Papaioannou ioannis.papaioannou@workday.com
Signed-off-by: Paul Mc Auley paul.mcauley@workday.com
Signed-off-by: Sergio de Carvalho sergio.carvalho@workday.com
However further testing shows that, ext4 should be
the default or preferred fs for RBD devices.
This patch bring that change
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
The storage class already takes MountOptions(MountFlags), these are the
bind mount options. Some of these options may not be recognised by the
cephfs mount. Hence added a new parameterin Storage Class for
- cephfs kernel mount options,
- ceph-fuse mount options
Ceph kernel mount options are different from ceph-fuse options, hence
added two different parameters.
Signed-off-by: Poornima G <pgurusid@redhat.com>
we have see better performace in device
format and mounting by setting the fstype to xfs
from default ext4.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Use Deployment with leader election instead of StatefulSet
Deployment behaves better when a node gets disconnected
from the rest of the cluster - new provisioner leader
is elected in ~15 seconds, while it may take up to
5 minutes for StatefulSet to start a new replica.
Refer: kubernetes-csi/external-provisioner@52d1fbc
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
in NodeStage RPC call we have to map the
device to the node plugin and make sure the
the device will be mounted to the global path
in nodeUnstage request unmount the device from
global path and unmap the device
if the volume mode is block we will be creating
a file inside a stageTargetPath and it will be
considered as the global path
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
RBD plugin needs only a single ID to manage images and operations against a
pool, mentioned in the storage class. The current scheme of 2 IDs is hence not
needed and removed in this commit.
Further, unlike CephFS plugin, the RBD plugin splits the user id and the key
into the storage class and the secret respectively. Also the parameter name
for the key in the secret is noted in the storageclass making it a variant and
hampers usability/comprehension. This is also fixed by moving the id and the key
to the secret and not retaining the same in the storage class, like CephFS.
Fixes#270
Testing done:
- Basic PVC creation and mounting
Signed-off-by: ShyamsundarR <srangana@redhat.com>
update example rbd PVC from ReadWriteMany
to ReadWriteOnce as rbd supports ReadWriteMany
only if pvc mode is block
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
The SnapshotClass for RBD requires a pool parameter. This is redundant
as a snapshot is not created on a different pool than the source image
of the snapshot (refer rbd man page).
Further, when a snapshot needs to be created its source CSI VolumeID
is passed to the creation call, and hence the source volumes pool
needs to be reused to create the snapshot.
Similarly to clone a snapshot, the create request would come in with a
SnapshotID to help identify the snapshot pool, and the same create
request parameters would contain the storage class based pool parameter
to create the clone into (as clones can be in different pools as
compared to their parent snapshots).
Thus, the parameter pool seems redundant in the snapshot class and
should be removed to improve ease of use.
Fixes#379
Signed-off-by: ShyamsundarR <srangana@redhat.com>
This is a part of the stateless set of commits for CephCSI.
This commit removes the dependency on config maps to store cephFS provisioned
volumes, and instead relies on RADOS based objects and keys, and required
CSI VolumeID encoding to detect the provisioned volumes.
Changes:
- Provide backward compatibility to provisioned volumes by older plugin versions (1.0.0 or older)
- Remove Create/Delete support for statically provisioned volumes (fixes#382)
- Added namespace support to RADOS OMaps and used the same to store RADOS CSI objects and keys in the CephFS metadata pool
- Added support to mention fsname for CephFS provisioning (fixes#359)
- Changed field name in CSI Identifier to 'location', to denote a pool or fscid
- Updated mounter cache to use new scheme
- Required Helm manifests are updated
- Required documentation and other manifests are updated
- Made driver option 'metadatastorage' as optional, as fresh installs do not need to specify the same
Testing done:
- Create/Mount/Delete PVC
- Create/Delete 5 PVCs
- Mount version 1.0.0 PVC
- Delete version 1.0.0 PV
- Mount Statically defined PV/PVC/Pod
- Mount Statically defined version 1.0.0 PV/PVC/Pod
- Delete Statically defined version 1.0.0 PV/PVC/Pod
- Node restart when mounted to test mountcache
- Use InstanceID other than 'default'
- RBD basic round of tests, as namespace is added to OMaps
- csitest against ceph-fs plugin
- NOTE: CephFS plugin still does not detect and address already created
volumes but of a different size
- Test not providing any value to the metadata storage parameter
Signed-off-by: ShyamsundarR <srangana@redhat.com>
Existing config maps are now replaced with rados omaps that help
store information regarding the requested volume names and the rbd
image names backing the same.
Further to detect cluster, pool and which image a volume ID refers
to, changes to volume ID encoding has been done as per provided
design specification in the stateless ceph-csi proposal.
Additional changes and updates,
- Updated documentation
- Updated manifests
- Updated Helm chart
- Addressed a few csi-test failures
Signed-off-by: ShyamsundarR <srangana@redhat.com>
Based on the review comments addressed the following,
- Moved away from having to update the pod with volumes
when a new Ceph cluster is added for provisioning via the
CSI driver
- The above now used k8s APIs to fetch secrets
- TBD: Need to add a watch mechanisim such that these
secrets can be cached and updated when changed
- Folded the Cephc configuration and ID/key config map
and secrets into a single secret
- Provided the ability to read the same config via mapped
or created files within the pod
Tests:
- Ran PV creation/deletion/attach/use using new scheme
StorageClass
- Ran PV creation/deletion/attach/use using older scheme
to ensure nothing is broken
- Did not execute snapshot related tests
Signed-off-by: ShyamsundarR <srangana@redhat.com>
This commit provides the option to pass in Ceph cluster-id instead
of a MON list from the storage class.
This helps in moving towards a stateless CSI implementation.
Tested the following,
- PV provisioning and staging using cluster-id in storage class
- PV provisioning and staging using MON list in storage class
Did not test,
- snapshot operations in either forms of the storage class
Signed-off-by: ShyamsundarR <srangana@redhat.com>