A new "internal/kms" package is introduced, it holds the API that can be
consumed by the RBD components.
The KMS providers are currently in the same package as the API. With
later follow-up changes the providers will be placed in their own
sub-package.
Because of the name of the package "kms", the types, functions and
structs inside the package should not be prefixed with KMS anymore:
internal/kms/kms.go:213:6: type name will be used as kms.KMSInitializerArgs by other packages, and that stutters; consider calling this InitializerArgs (golint)
Updates: #852
Signed-off-by: Niels de Vos <ndevos@redhat.com>
By placing the NewK8sClient() function in its own package, the KMS API
can be split from the "internal/util" package. Some of the KMS providers
use the NewK8sClient() function, and this causes circular dependencies
between "internal/utils" -> "internal/kms" -> "internal/utils", which
are not alowed in Go.
Updates: #852
Signed-off-by: Niels de Vos <ndevos@redhat.com>
if the volumeattachment has been fetched but marked for deletion
the nbd healer dont want to process further on this pv. This patch
adds a check for pv is marked for deletion and if so, make the
healer skip processing the same
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
Moving the log functions into its own internal/util/log package makes it
possible to split out the humongous internal/util packages in further
smaller pieces. This reduces the inter-dependencies between utility
functions and components, preventing circular dependencies which are not
allowed in Go.
Updates: #852
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Unit-testing often fails due to a race condition while writing the
clusterMappingConfigFile from multiple go-routines at the same time.
Failures from `make containerized-test` look like this:
=== CONT TestGetClusterMappingInfo/site2-storage_cluster-id_mapping
cluster_mapping_test.go:153: GetClusterMappingInfo() = <nil>, expected data &[{map[site1-storage:site2-storage] [map[1:3]] [map[11:5]]} {map[site3-storage:site2-storage] [map[8:3]] [map[10:5]]}]
=== CONT TestGetClusterMappingInfo/site3-storage_cluster-id_mapping
cluster_mapping_test.go:153: GetClusterMappingInfo() = <nil>, expected data &[{map[site3-storage:site2-storage] [map[8:3]] [map[10:5]]}]
--- FAIL: TestGetClusterMappingInfo (0.01s)
--- PASS: TestGetClusterMappingInfo/mapping_file_not_found (0.00s)
--- PASS: TestGetClusterMappingInfo/mapping_file_found_with_empty_data (0.00s)
--- PASS: TestGetClusterMappingInfo/cluster-id_mapping_not_found (0.00s)
--- FAIL: TestGetClusterMappingInfo/site2-storage_cluster-id_mapping (0.00s)
--- FAIL: TestGetClusterMappingInfo/site3-storage_cluster-id_mapping (0.00s)
--- PASS: TestGetClusterMappingInfo/site1-storage_cluster-id_mapping (0.00s)
By splitting the public GetClusterMappingInfo() function into an
internal getClusterMappingInfo() that takes a filename, unit-testing can
use different files for each go-routine, and testing becomes more
predictable.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
With the tests at CI, it kind of looks like that the IO is timing out after
30 seconds (default with rbd-nbd). Since we have tweaked reattach-timeout
to 300 seconds at ceph-csi, we need to explicitly set io-timeout on the
device too, as it doesn't make any sense to keep
io-timeout < reattach-timeout
Hence we set io-timeout for rbd nbd to 0. Specifying io-timeout 0 tells
the nbd driver to not abort the request and instead see if it can be
restarted on another socket.
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Suggested-by: Ilya Dryomov <idryomov@redhat.com>
- Update the meta stash with logDir details
- Use the same to remove logfile on unstage/unmap to be space efficient
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
- One logfile per device/volume
- Add ability to customize the logdir, default: /var/log/ceph
Note: if user customizes the hostpath to something else other than default
/var/log/ceph, then it is his responsibility to update the `cephLogDir`
in storageclass to reflect the same with daemon:
```
cephLogDir: "/var/log/mynewpath"
```
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
ContollerManager had a typo in it, and if we correct it,
linter will fail and suggest not to use controller.ControllerManager
as the interface name and package name is redundant, keeping manager
as the interface name which is the practice and also address the
linter issues.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
https://github.com/ceph/go-ceph/pull/455/ added `state` field
to subvolume info struct which helps to identify the snapshot
retention state in the caller. This patch make use of the same
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
If the image is in a secondary state and its
up+replaying means its an healthy secondary
and the image is primary somewhere in the remote cluster
and the local image is getting replayed. Delete the
OMAP data generated as we cannot delete the
secondary image. When the image on the primary
cluster gets deleted/mirroring disabled, the image on
all the remote (secondary) clusters will get
auto-deleted. This helps in garbage collecting
the OMAP, PVC and PV objects after failback operation.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
If the image is in secondary state and its
up+replaying means its an healthy secondary
and the image is primary somewhere in the remote
cluster and the local image is getting replayed.
Return success for the Disabling mirroring as
we cannot disable the mirroring on the secondary
state, when the image on the remote site gets
disabled the image on all the remote (secondary)
will get auto deleted. This helps in garbage
collecting the volume replication kuberentes
artifacts
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
This commit adds functionality of extracting encryption kmsID,
owner from volumeAttributes in RegenerateJournal() and adds utility
functions ParseEncryptionOpts and FetchEncryptionKMSID.
Signed-off-by: Rakshith R <rar@redhat.com>
This commit refractors RegenerateJournal() to take in
volumeAttributes map[string]string as argument so it
can extract required attributes internally.
Signed-off-by: Rakshith R <rar@redhat.com>
For clusterMappingConfigFile using different
file name so that multiple unit test cases can
work without any data race.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
consider the empty mirroring mode when
validating the snapshot interval and
the scheduling time.
Even if the mirroring Mode is not set
validate the snapshot scheduling details
as cephcsi sets the mirroring mode to default
snapshot.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
This commit fixes snapshot id idempotency issue by
always returning an error when flattening is in progress
and not using `readyToUse:false` response.
Signed-off-by: Rakshith R <rar@redhat.com>
This commit fixes a bug in checkCloneImage() which was caused
by checking cloned image before checking on temp-clone image snap
in a subsequent request which lead to stale images. This was solved
by checking temp-clone image snap and flattening temp-clone if
needed.
This commit also fixes comparison bug in flattenCloneImage().
Signed-off-by: Rakshith R <rar@redhat.com>
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
rbd flatten functions is a CLI call and it expects
the creds as the input and copying of creds is
required when we generate the temp clone image.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Volume generated from snap using genrateVolFromSnap
already copies volume ID correctly, therefore removing
`vol.VolID = rbdVol.VolID` which wrongly copies parent
Volume ID instead leading to error from copyEncryption()
on parent and clone volume ID being equal.
Signed-off-by: Rakshith R <rar@redhat.com>
Golang-ci complains about the following:
internal/util/vault_tokens.go:99:20: string `true` has 4 occurrences, but such constant `vaultDefaultDestroyKeys` already exists (goconst)
v.VaultCAVerify = "true"
^
This occurence of "true" can be replaced by vaultDefaultCAVerify so
address the warning.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Hashicorp Vault does not completely remove the secrets in a kv-v2
backend when the keys are deleted. The metadata of the keys will be
kept, and it is possible to recover the contents of the keys afterwards.
With the new `vaultDestroyKeys` configuration parameter, this behaviour
can now be selected. By default the parameter will be set to `true`,
indicating that the keys and contents should completely be destroyed.
Setting it to any other value will make it possible to recover the
deleted keys.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Whenever Ceph-CSI receives a CSI/Replication
request it will first decode the
volumeHandle and try to get the required
OMAP details if it is not able to
retrieve, receives a `Not Found` error
message and Ceph-CSI will check for the
clusterID mapping. If the old volumeID
`0001-00013-site1-storage-0000000000000001
-b0285c97-a0ce-11eb-8c66-0242ac110002`
contains the `site1-storage` as the clusterID,
now Ceph-CSI will look for the corresponding
clusterID `site2-storage` from the above configmap.
If the clusterID mapping is found now Ceph-CSI
will look for the poolID mapping ie mapping between
`1` and `2`. Example:- pool with name exists on
both the clusters with different ID's Replicapool
with ID `1` on site1 and Replicapool with ID `2`
on site2. After getting the required mapping Ceph-CSI
has the required information to get more details
from the rados OMAP. If we have multiple clusterID mapping
it will loop through all the mapping and checks the
corresponding pool to get the OMAP data. If the clusterID
mapping does not exist Ceph-CSI will return an `Not Found`
error message to the caller.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
added helper function to read the clusterID mapping
from the mounted file.
The clusterID mapping contains below mappings
* ClusterID mappings (to cluster to which we are failingover
and from which cluster failover happened)
* RBD PoolID mapping of between the clusters.
* CephFS FscID mapping between the clusters.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
The VAULT_AUTH_MOUNT_PATH is a Vault configuration parameter that allows
a user to set a non default path for the Kubernetes ServiceAccount
integration. This can already be configured for the Vault KMS, and is
now added to the Vault Tenant SA KMS as well.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
The new `vaultAuthNamespace` configuration parameter can be set to the
Vault Namespace where the authentication is setup in the service. Some
Hashicorp Vault deployments use sub-namespaces for their users/tenants,
with a 'root' namespace where the authentication is configured. This
requires passing of different Vault namespaces for different operations.
Example:
- the Kubernetes Auth mechanism is configured for in the Vault
Namespace called 'devops'
- a user/tenant has a sub-namespace called 'devops/website' where the
encryption passphrases can be placed in the key-value store
The configuration for this, then looks like:
vaultAuthNamespace: devops
vaultNamespace: devops/homepage
Note that Vault Namespaces are a feature of the Hashicorp Vault
Enterprise product, and not part of the Open Source version. This
prevents adding e2e tests that validate the Vault Namespace
configuration.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
- mount host's /etc/selinux in node plugins
- process mount options in all code paths for cephfs volume options
Signed-off-by: Alexandre Lossent <alexandre.lossent@cern.ch>
This commit uses `string.SplitN` instead of `string.Split`.
The path for pids.max has extra `:` symbols in it due to which
getCgroupPidsFile() splits the string into 5 tokens instead of
3 leading to loss of part of the path.
As a result, the below error is reported:
`Failed to get the PID limit, can not reconfigure: open
/sys/fs/cgroup/pids/system.slice/containerd.service/
kubepods-besteffort-pod183b9d14_aed1_4b66_a696_da0c738bc012.slice/pids.max:
no such file or directory`
SplitN takes an argument n and splits the string
accordingly which helps us to get the desired
file path.
Fixes: #2337
Co-authored-by: Yati Padia <ypadia@redhat.com>
Signed-off-by: Yati Padia <ypadia@redhat.com>
Currently we have a bug that we are not using rados
namespace when adding ceph manager command to
remove the image from the trash. This commit
adds the missing rados namespace when adding
ceph manager task.
without fix the image will be moved to trash
and no task will be added to remove from the
trash. it will become ceph responsibility to
remove the image from trash when it will cleanup
the trash.
workaroud: manually purge the trash
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
RBD image metadata keys that start with '.rbd' are expected to be
internal to RBD itself and are not mirrored to remote sites. Renaming
the keys (dropping the '.' prefix) and using the new MigrateMetadata()
function now makes the keys available on remote sites too.
Closes: #2219
Signed-off-by: Niels de Vos <ndevos@redhat.com>
The new MigrateMetadata() function can be used to get the metadata of an
image with a deprecated and new key. Renaming metadata keys can be done
easily this way.
A default value will be set in the image metadata when it is missing
completely. But if the deprecated key was set, the data is stored under
the new key and the deprecated key is removed.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Currently, getImageMirroringStatus() is using RBD CLI.
This commit converts RBD CLI to go-ceph API.
Fixes: #2120
Signed-off-by: Yati Padia <ypadia@redhat.com>
Previously in ControllerExpandVolume() we had a check for encrypted
volumes and we use to fail for all expand requests on an encrypted
volume. Also for Block VolumeMode PVCs NodeExpandVolume used to be
ignored/skipped.
With these changes, we add support for the expansion of encrypted volumes.
Also for raw Block VolumeMode PVCs with Encryption we call NodeExpandVolume.
That said,
With LUKS1, cryptsetup utility doesn't prompt for a passphrase on resizing
the crypto mapper device. This is because LUKS1 devices don't use kernel
keyring for volume keys.
Whereas, LUKS2 devices use kernel keyring for volume key by default, i.e.
cryptsetup utility asks for a passphrase if it detects volume key was
previously passed to dm-crypt via kernel keyring service, we are overriding
the default by --disable-keyring option during cryptsetup open command.
So that at the time of crypto mapper device resize we will not be
prompted for any passphrase.
Fixes: #1469
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
With Luks1 device:
$ cryptsetup status /dev/mapper/crypto-rbd0
/dev/mapper/crypto-rbd0 is active and is in use.
type: LUKS1
cipher: aes-xts-plain64
keysize: 512 bits
key location: dm-crypt
device: /dev/rbd0
sector size: 512
offset: 4096 sectors
size: 4190208 sectors
mode: read/write
With Luks2 device:
$ cryptsetup status /dev/mapper/crypto-rbd0
/dev/mapper/crypto-rbd0 is active and is in use.
type: LUKS2
cipher: aes-xts-plain64
keysize: 512 bits
key location: dm-crypt
device: /dev/rbd0
sector size: 512
offset: 32768 sectors
size: 4161536 sectors
mode: read/write
This could lead to failures with unmap in the NodeUnstageVolume path
for the encrypted volumes.
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>