Whenever Ceph-CSI receives a CSI/Replication
request it will first decode the
volumeHandle and try to get the required
OMAP details if it is not able to
retrieve, receives a `Not Found` error
message and Ceph-CSI will check for the
clusterID mapping. If the old volumeID
`0001-00013-site1-storage-0000000000000001
-b0285c97-a0ce-11eb-8c66-0242ac110002`
contains the `site1-storage` as the clusterID,
now Ceph-CSI will look for the corresponding
clusterID `site2-storage` from the above configmap.
If the clusterID mapping is found now Ceph-CSI
will look for the poolID mapping ie mapping between
`1` and `2`. Example:- pool with name exists on
both the clusters with different ID's Replicapool
with ID `1` on site1 and Replicapool with ID `2`
on site2. After getting the required mapping Ceph-CSI
has the required information to get more details
from the rados OMAP. If we have multiple clusterID mapping
it will loop through all the mapping and checks the
corresponding pool to get the OMAP data. If the clusterID
mapping does not exist Ceph-CSI will return an `Not Found`
error message to the caller.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
added helper function to read the clusterID mapping
from the mounted file.
The clusterID mapping contains below mappings
* ClusterID mappings (to cluster to which we are failingover
and from which cluster failover happened)
* RBD PoolID mapping of between the clusters.
* CephFS FscID mapping between the clusters.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
The VAULT_AUTH_MOUNT_PATH is a Vault configuration parameter that allows
a user to set a non default path for the Kubernetes ServiceAccount
integration. This can already be configured for the Vault KMS, and is
now added to the Vault Tenant SA KMS as well.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
The new `vaultAuthNamespace` configuration parameter can be set to the
Vault Namespace where the authentication is setup in the service. Some
Hashicorp Vault deployments use sub-namespaces for their users/tenants,
with a 'root' namespace where the authentication is configured. This
requires passing of different Vault namespaces for different operations.
Example:
- the Kubernetes Auth mechanism is configured for in the Vault
Namespace called 'devops'
- a user/tenant has a sub-namespace called 'devops/website' where the
encryption passphrases can be placed in the key-value store
The configuration for this, then looks like:
vaultAuthNamespace: devops
vaultNamespace: devops/homepage
Note that Vault Namespaces are a feature of the Hashicorp Vault
Enterprise product, and not part of the Open Source version. This
prevents adding e2e tests that validate the Vault Namespace
configuration.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
- mount host's /etc/selinux in node plugins
- process mount options in all code paths for cephfs volume options
Signed-off-by: Alexandre Lossent <alexandre.lossent@cern.ch>
This commit uses `string.SplitN` instead of `string.Split`.
The path for pids.max has extra `:` symbols in it due to which
getCgroupPidsFile() splits the string into 5 tokens instead of
3 leading to loss of part of the path.
As a result, the below error is reported:
`Failed to get the PID limit, can not reconfigure: open
/sys/fs/cgroup/pids/system.slice/containerd.service/
kubepods-besteffort-pod183b9d14_aed1_4b66_a696_da0c738bc012.slice/pids.max:
no such file or directory`
SplitN takes an argument n and splits the string
accordingly which helps us to get the desired
file path.
Fixes: #2337
Co-authored-by: Yati Padia <ypadia@redhat.com>
Signed-off-by: Yati Padia <ypadia@redhat.com>
Currently we have a bug that we are not using rados
namespace when adding ceph manager command to
remove the image from the trash. This commit
adds the missing rados namespace when adding
ceph manager task.
without fix the image will be moved to trash
and no task will be added to remove from the
trash. it will become ceph responsibility to
remove the image from trash when it will cleanup
the trash.
workaroud: manually purge the trash
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
RBD image metadata keys that start with '.rbd' are expected to be
internal to RBD itself and are not mirrored to remote sites. Renaming
the keys (dropping the '.' prefix) and using the new MigrateMetadata()
function now makes the keys available on remote sites too.
Closes: #2219
Signed-off-by: Niels de Vos <ndevos@redhat.com>
The new MigrateMetadata() function can be used to get the metadata of an
image with a deprecated and new key. Renaming metadata keys can be done
easily this way.
A default value will be set in the image metadata when it is missing
completely. But if the deprecated key was set, the data is stored under
the new key and the deprecated key is removed.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Currently, getImageMirroringStatus() is using RBD CLI.
This commit converts RBD CLI to go-ceph API.
Fixes: #2120
Signed-off-by: Yati Padia <ypadia@redhat.com>
Previously in ControllerExpandVolume() we had a check for encrypted
volumes and we use to fail for all expand requests on an encrypted
volume. Also for Block VolumeMode PVCs NodeExpandVolume used to be
ignored/skipped.
With these changes, we add support for the expansion of encrypted volumes.
Also for raw Block VolumeMode PVCs with Encryption we call NodeExpandVolume.
That said,
With LUKS1, cryptsetup utility doesn't prompt for a passphrase on resizing
the crypto mapper device. This is because LUKS1 devices don't use kernel
keyring for volume keys.
Whereas, LUKS2 devices use kernel keyring for volume key by default, i.e.
cryptsetup utility asks for a passphrase if it detects volume key was
previously passed to dm-crypt via kernel keyring service, we are overriding
the default by --disable-keyring option during cryptsetup open command.
So that at the time of crypto mapper device resize we will not be
prompted for any passphrase.
Fixes: #1469
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
With Luks1 device:
$ cryptsetup status /dev/mapper/crypto-rbd0
/dev/mapper/crypto-rbd0 is active and is in use.
type: LUKS1
cipher: aes-xts-plain64
keysize: 512 bits
key location: dm-crypt
device: /dev/rbd0
sector size: 512
offset: 4096 sectors
size: 4190208 sectors
mode: read/write
With Luks2 device:
$ cryptsetup status /dev/mapper/crypto-rbd0
/dev/mapper/crypto-rbd0 is active and is in use.
type: LUKS2
cipher: aes-xts-plain64
keysize: 512 bits
key location: dm-crypt
device: /dev/rbd0
sector size: 512
offset: 32768 sectors
size: 4161536 sectors
mode: read/write
This could lead to failures with unmap in the NodeUnstageVolume path
for the encrypted volumes.
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
This commit modifies the error of godot, cyclop,
paralleltest linter caused due to merged PRs.
Updates: #1586
Signed-off-by: Yati Padia <ypadia@redhat.com>
At present while acquiring the deleteLock on the volume, we check
for ongoing clone and snapshot creation operations on the same.
Considering snapshot and clone controllers does not allow parent
volume deletion on subjected operations, we can be free from this
extra check.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
It seems that the version of the key/value engine can not always be
detected for Hashicorp Vault. In certain cases, it is required to
configure the `VAULT_BACKEND` (or `vaultBackend`) option so that a
successful connection to the service can be made.
The `kv-v2` is the current default for development deployments of
Hashicorp Vault (what we use for automated testing). Production
deployments default to version 1 for now.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
nlreturn linter requires a new line before return
and branch statements except when the return is alone
inside a statement group (such as an if statement) to
increase code clarity. This commit addresses such issues.
Updates: #1586
Signed-off-by: Rakshith R <rar@redhat.com>
This commit resolves errorlint issues
which checks for the code that will cause
problems with the error wrapping scheme.
Updates: #1586
Signed-off-by: Yati Padia <ypadia@redhat.com>
revive linter checks for var-declaration
format.
For example:
"e2e/rbd_helper.go:441:36: var-declaration:
should drop = nil from declaration of
var noPVCValidation; it is the zero value (revive)
var noPVCValidation validateFunc = nil"
Updates: #1586
Signed-off-by: Yati Padia <ypadia@redhat.com>
snapshot controller make sure the pvc which is the source for the
snapshot request wont get deleted while snapshot is getting created,
so we dont need to check for any ongoing delete operation here on the
volume.
Subjected code path in snapshot controller:
```
pvc, err := ctrl.getClaimFromVolumeSnapshot(snapshot)
.
..
pvcClone.ObjectMeta.Finalizers = append(pvcClone.ObjectMeta.Finalizers, utils.PVCFinalizer)
_, err = ctrl.client.CoreV1().PersistentVolumeClaims(pvcClone.Namespace).Update(..)
```
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
Problem:
-------
For rbd nbd userspace mounter backends, after a restart of the nodeplugin
all the mounts will start seeing IO errors. This is because, for rbd-nbd
backends there will be a userspace mount daemon running per volume, post
restart of the nodeplugin pod, there is no way to restore the daemons
back to life.
Solution:
--------
The volume healer is a one-time activity that is triggered at the startup
time of the rbd nodeplugin. It navigates through the list of volume
attachments on the node and acts accordingly.
For now, it is limited to nbd type storage only, but it is flexible and
can be extended in the future for other backend types as needed.
From a few feets above:
This solves a severe problem for nbd backed csi volumes. The healer while
going through the list of volume attachments on the node, if finds the
volume is in attached state and is of type nbd, then it will attempt to
fix the rbd-nbd volumes by sending a NodeStageVolume request with the
required volume attributes like secrets, device name, image attributes,
and etc.. which will finally help start the required rbd-nbd daemons in
the nodeplugin csi-rbdplugin container. This will allow reattaching the
backend images with the right nbd device, thus allowing the applications
to perform IO without any interruptions even after a nodeplugin restart.
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
As part of stage transaction if the mounter is of type nbd, then capture
device path after a successful rbd-nbd map.
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
clone controller make sure there is no delete operation happens
on the source PVC which has been referred as the datasource of
clone PVC, we are safe to operate without looking at delete
operation lock in this case.
Subjected code in the controller:
...
if claim.Spec.DataSource != nil && rc.clone {
err = p.setCloneFinalizer(ctx, claim)
...
}
if !checkFinalizer(claim, pvcCloneFinalizer) {
claim.Finalizers = append(claim.Finalizers, pvcCloneFinalizer)
_, err := p.client.CoreV1().PersistentVolumeClaims(claim.Namespace).Update(..claim..)
}
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
Considering kubelet make sure the stage and publish operations
are serialized, we dont need any extra locking in nodePublish
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
Considering kubelet make sure the stage and publish operations
are serialized, we dont need any extra locking in nodePublish
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
Considering kubelet make sure the unstage and unpublish operations
are serialized, we dont need any extra locking in nodeUnpublish
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
Considering kubelet make sure the unstage and unpublish operations
are serialized, we dont need any extra locking in nodeUnpublish
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
parseTenantConfig() only allowed configuring a defined set of options,
and KMSs were not able to re-use the implementation. Now, the function
parses the ConfigMap from the Tenants Namespace and returns a map with
options that the KMS supports.
The map that parseTenantConfig() returns can be inspected by the KMS,
and applied to the vaultTenantConnection type by calling parseConfig().
Signed-off-by: Niels de Vos <ndevos@redhat.com>
This new KMS uses a Kubernetes ServiceAccount from a Tenant (Namespace)
to connect to Hashicorp Vault. The provisioner and node-plugin will
check for the configured ServiceAccount and use the token that is
located in one of the linked Secrets. Subsequently the Vault connection
is configured to use the Kubernetes token from the Tenant.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
This makes the Tenant configuration for Hashicorp Vault KMS connections
more modular. Additional KMS implementations that use Hashicorp Vault
with per-Tenant options can re-use the new vaultTenantConnection.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
This commit resolves parallel test issues
and also excludes internal/util/conn_pool_test.go
as those test can't run in parallel.
Updates: #1586
Signed-off-by: Yati Padia <ypadia@redhat.com>
This commit resolves godot linter issue
which says "Comment should end in a period (godot)".
Updates: #1586
Signed-off-by: Yati Padia <ypadia@redhat.com>
This commit adds t.Helper() to the test helper
function. With this call go test prints correct
lines of code for failed tests. Otherwise,
printed lines will be inside helpers functions.
For more details check: https://github.com/kulti/thelper
Updates: #1586
Signed-off-by: Yati Padia <ypadia@redhat.com>
* Make kernel version parsing to support more (valid) version strings
* Put version string parsing into a separate, testable function
* Fixes#2248 (Kernel Subversion Parsing Failure)
Signed-off-by: Jonas Zeiger <jonas.zeiger@talpidae.net>
This commit adds capability to `metadata` encryption
to be able to fetch `encryptionPassphrase` from user
specified secret name and namespace(if not specified,
will default to namespace where PVC was created).
This behavior is followed if `secretName` key is found
in the encryption configuration else defaults to fetching
`encryptionPassphrase` from storageclass secrets.
Closes: 2107
Signed-off-by: Rakshith R <rar@redhat.com>
setting metadata in isThickProvisioned method
helps us to avoid checking thick metakey and
deprecated metakey for both thick and thin
provisioned images and also this will easily
help us to migrated the deprecated key to new key.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>