2284 Commits

Author SHA1 Message Date
Madhu Rajanna
f65961d01e doc: add design doc for clusterid poolid mapping
Added a design doc for handling volumeID mapping
during failover in Disaster Recovery.

update #2118

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 5fc9c3a046)
2021-08-09 09:24:16 +00:00
Madhu Rajanna
cbe3ac71f3 deploy: add template changes for mapping
Added pod template changes for the clusterID,
poolID, and fsID mapping details.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit d321663872)
2021-08-09 09:24:16 +00:00
Madhu Rajanna
829fc5ed95 rbd: read clusterID and PoolID from mapping
Whenever Ceph-CSI receives a CSI/Replication request, it first
decodes the volumeHandle and tries to get the required OMAP
details. If it cannot retrieve them and receives a `Not Found`
error, Ceph-CSI checks for a clusterID mapping. For example, if
the old volumeID
`0001-00013-site1-storage-0000000000000001-b0285c97-a0ce-11eb-8c66-0242ac110002`
contains `site1-storage` as the clusterID, Ceph-CSI looks for the
corresponding clusterID `site2-storage` in the configmap.

If the clusterID mapping is found, Ceph-CSI then looks for the
poolID mapping, i.e. the mapping between `1` and `2`. Example: a
pool with the same name exists on both clusters with different
IDs, Replicapool with ID `1` on site1 and Replicapool with ID `2`
on site2. With this mapping, Ceph-CSI has the information it
needs to get more details from the rados OMAP.

If there are multiple clusterID mappings, Ceph-CSI loops through
all of them and checks the corresponding pool to get the OMAP
data. If the clusterID mapping does not exist, Ceph-CSI returns
a `Not Found` error to the caller.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 92ad2ceec9)
2021-08-09 09:24:16 +00:00
Madhu Rajanna
daea5177e5 util: add helper function to read clusterID mapping
Added a helper function to read the clusterID mapping
from the mounted file.

The clusterID mapping contains the following mappings:
* ClusterID mapping (the cluster failed over from and the
cluster failed over to)
* RBD PoolID mapping between the clusters.
* CephFS FscID mapping between the clusters.
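
As a rough sketch of what such a helper might parse, assuming the mounted file is JSON: the field names and the JSON layout below are assumptions made for illustration and are not taken from this commit.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// clusterMapping is a hypothetical shape for one entry of the mapping file.
// The JSON field names are assumed for this example.
type clusterMapping struct {
	ClusterIDMapping   map[string]string   `json:"clusterIDMapping"`
	RBDPoolIDMapping   []map[string]string `json:"RBDPoolIDMapping"`
	CephFSFscIDMapping []map[string]string `json:"CephFSFscIDMapping"`
}

// parseMappings decodes the content of the mapping file.
func parseMappings(data []byte) ([]clusterMapping, error) {
	var mappings []clusterMapping
	if err := json.Unmarshal(data, &mappings); err != nil {
		return nil, fmt.Errorf("failed to parse cluster mapping: %w", err)
	}
	return mappings, nil
}

func main() {
	// Sample content; in a pod this would be read from the mounted file.
	sample := []byte(`[
	  {
	    "clusterIDMapping": {"site1-storage": "site2-storage"},
	    "RBDPoolIDMapping": [{"1": "2"}],
	    "CephFSFscIDMapping": [{"13": "34"}]
	  }
	]`)
	mappings, err := parseMappings(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(mappings[0].ClusterIDMapping["site1-storage"])
}
```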

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit ac11d71e19)
2021-08-09 09:24:16 +00:00
Yug Gupta
459f6eca5a helm: update cephfs provisioner updateStrategy
Update the ceph-csi-cephfs.provisioner updateStrategy
so that maxUnavailable allows up to 50% of the pods
to be unavailable at a time.

Signed-off-by: Yug Gupta <yuggupta27@gmail.com>
(cherry picked from commit 080f7538c0)
2021-08-06 12:37:33 +00:00
Yug Gupta
45e80f8952 helm: update rbd provisioner updateStrategy
Update the ceph-csi-rbd.provisioner updateStrategy
so that maxUnavailable allows up to 50% of the pods
to be unavailable at a time.

Signed-off-by: Yug Gupta <yuggupta27@gmail.com>
(cherry picked from commit ea088d40be)
2021-08-06 12:37:33 +00:00
Niels de Vos
bc24a8c8ac util: allow configuring VAULT_AUTH_MOUNT_PATH for Vault Tenant SA KMS
The VAULT_AUTH_MOUNT_PATH is a Vault configuration parameter that allows
a user to set a non default path for the Kubernetes ServiceAccount
integration. This can already be configured for the Vault KMS, and is
now added to the Vault Tenant SA KMS as well.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
(cherry picked from commit 4859f2dfdb)
2021-08-06 09:30:32 +00:00
Madhu Rajanna
05c9b3b245 build: update commitlint to use latest tag
Updating commitlint to the latest tag, so that
the latest release is installed by default
each time.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 3805c29f36)
2021-08-05 14:51:03 +00:00
Madhu Rajanna
1a83027a4d ci: update mergify for commitlint
Updated the Mergify rules to take the
commitlint status into account when
merging a PR.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 0b6322afda)
2021-08-05 14:51:03 +00:00
Madhu Rajanna
e7ea1fd2d9 ci: trailer-exists to verify sign-off
This commit uses trailer-exists instead
of signed-off-by to verify the Signed-off-by
trailer.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Suggested-by: Ade Attwood
(cherry picked from commit 38ef32a496)
2021-08-05 14:51:03 +00:00
Alexandre Lossent
7688bc3a7a cephfs: support selinux mount options
- mount host's /etc/selinux in node plugins
- process mount options in all code paths for cephfs volume options

Signed-off-by: Alexandre Lossent <alexandre.lossent@cern.ch>
(cherry picked from commit 5cba04c470)
2021-08-05 08:37:52 +00:00
Niels de Vos
b866bd491c util: add vaultAuthNamespace option for Vault KMS
The new `vaultAuthNamespace` configuration parameter can be set to the
Vault Namespace where the authentication is set up in the service. Some
Hashicorp Vault deployments use sub-namespaces for their users/tenants,
with a 'root' namespace where the authentication is configured. This
requires passing different Vault namespaces for different operations.

Example:
 - the Kubernetes Auth mechanism is configured in the Vault
   Namespace called 'devops'
 - a user/tenant has a sub-namespace called 'devops/website' where the
   encryption passphrases can be placed in the key-value store

The configuration for this then looks like:

    vaultAuthNamespace: devops
    vaultNamespace: devops/website

Note that Vault Namespaces are a feature of the Hashicorp Vault
Enterprise product, and not part of the Open Source version. This
prevents adding e2e tests that validate the Vault Namespace
configuration.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
(cherry picked from commit f2d5c2e0df)
2021-08-05 06:44:23 +00:00
Niels de Vos
a962cccd0a util: correct error message when connecting to Vault fails
Signed-off-by: Niels de Vos <ndevos@redhat.com>
(cherry picked from commit 83167e2ac5)
2021-08-05 06:44:23 +00:00
rtsp
cb40ae5bca deploy: rbd kubernetes manifests
add ability to deploy ceph-csi-rbd on non-default namespace

Signed-off-by: rtsp <git@rtsp.us>
(cherry picked from commit af1f50ba04)
2021-08-04 16:24:58 +00:00
Niels de Vos
3bbcda6174 e2e: use official CentOS container location
registry.centos.org is not officially maintained by the CentOS
infrastructure team. The container images on quay.io are the official
ones and we should use those instead.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
(cherry picked from commit e0ac70f8fb)
2021-08-04 10:50:30 +00:00
Artur Troian
82fd1e5248 util: getCgroupPidsFile produces striped path when extra : present
This commit uses `strings.SplitN` instead of `strings.Split`.
The path for pids.max has extra `:` symbols in it, due to which
getCgroupPidsFile() splits the string into 5 tokens instead of
3, leading to the loss of part of the path.
As a result, the error below is reported:
`Failed to get the PID limit, can not reconfigure: open
/sys/fs/cgroup/pids/system.slice/containerd.service/
kubepods-besteffort-pod183b9d14_aed1_4b66_a696_da0c738bc012.slice/pids.max:
no such file or directory`
SplitN takes an argument n that limits the number of
substrings returned, which lets us get the desired
file path.

Fixes: #2337

Co-authored-by: Yati Padia <ypadia@redhat.com>
Signed-off-by: Yati Padia <ypadia@redhat.com>
(cherry picked from commit 16ec97d8f7)
2021-08-04 07:11:26 +00:00
Humble Chirammal
966841cafc deploy: revert changes made for 3.4.0 release
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2021-07-30 06:45:38 +00:00
Humble Chirammal
94ef181bc8 build: update build.env for 3.4.0 release
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2021-07-29 10:03:20 +00:00
Humble Chirammal
1f515404e7 deploy: change minikube image for 3.4.0 release
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2021-07-29 10:03:20 +00:00
Humble Chirammal
61aab6ddb5 helm: replace image tag to v3.4.0 from canary for the release
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2021-07-29 10:03:20 +00:00
Humble Chirammal
03ab0738a4 deploy: changes the image to v3.4.0 instead of canary for release
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2021-07-29 10:03:20 +00:00
Prasanna Kumar Kalever
0a02343f2d doc: update the upgrade documentation to reflect 3.4.0 changes
Mainly removed the rbd-nbd mounter item from the pre-upgrade
considerations that affect restarts.

Also updated the 3.3 tags to 3.4.

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
(cherry picked from commit d2def71944)
2021-07-28 21:14:35 +05:30
Niels de Vos
ce9e54e5bd ci: add Mergify backport rules for release-v3.4
The new `backport-to-release-v3.4` label can be added to PRs and Mergify
will create a backport once the PR for the devel branch has been merged.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2021-07-28 12:53:58 +05:30
Prasanna Kumar Kalever
52799da09d doc: add design doc for volume healer
Closes: #667

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2021-07-28 11:54:59 +05:30
Prasanna Kumar Kalever
ebe4e1f944 ci: ignore spell check for design proposal images
To avoid failures triggered by checking SVG image formats.

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2021-07-28 11:54:59 +05:30
Prasanna Kumar Kalever
068e44bdb1 cleanup: move rbd-mirror image to a new directory
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2021-07-28 11:54:59 +05:30
Madhu Rajanna
080b251850 e2e: validate images in trash for rados namespace
Added a validation check for stale images in the trash
to the rados namespace testing.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2021-07-28 03:48:33 +00:00
Madhu Rajanna
8f185bf7b2 rbd: use rados namespace for manager command
Currently there is a bug: the rados namespace is not
used when adding the ceph manager command that
removes the image from the trash. This commit
adds the missing rados namespace when adding
the ceph manager task.

Without this fix, the image is moved to the trash
but no task is added to remove it from the
trash. It then becomes Ceph's responsibility to
remove the image when it cleans up the trash.

Workaround: manually purge the trash.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2021-07-28 03:48:33 +00:00
Yug Gupta
d14c0afe28 doc: Add documentation for DR
Add documentation for Disaster Recovery,
covering the steps to Failover and Failback in case
of a planned migration or a Disaster.

Signed-off-by: Yug Gupta <yuggupta27@gmail.com>
2021-07-27 11:43:01 +00:00
Niels de Vos
ec6703ed58 rbd: rename encryption metadata keys to enable mirroring
RBD image metadata keys that start with '.rbd' are expected to be
internal to RBD itself and are not mirrored to remote sites. Renaming
the keys (dropping the '.' prefix) and using the new MigrateMetadata()
function now makes the keys available on remote sites too.

Closes: #2219
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2021-07-26 11:49:56 +00:00
Niels de Vos
607129171d rbd: move image metadata key migration to its own function
The new MigrateMetadata() function can be used to get the metadata of an
image with a deprecated and new key. Renaming metadata keys can be done
easily this way.

A default value will be set in the image metadata when it is missing
completely. But if the deprecated key was set, the data is stored under
the new key and the deprecated key is removed.
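
The migration rule described above can be sketched as follows. This is an illustration only: `metadataStore` is a hypothetical in-memory stand-in for RBD image metadata, and the key names in `main` are invented; it does not reproduce the signature of MigrateMetadata().

```go
package main

import "fmt"

// metadataStore is a hypothetical in-memory stand-in for RBD image metadata.
type metadataStore map[string]string

// migrate returns the value stored under newKey. If only oldKey is set, its
// value is moved to newKey and oldKey is removed. If neither key is set,
// defaultValue is stored under newKey.
func (m metadataStore) migrate(oldKey, newKey, defaultValue string) string {
	if v, ok := m[newKey]; ok {
		return v
	}
	if v, ok := m[oldKey]; ok {
		m[newKey] = v
		delete(m, oldKey)
		return v
	}
	m[newKey] = defaultValue
	return defaultValue
}

func main() {
	// Key names are made up; the old key carries the '.' prefix.
	md := metadataStore{".rbd.example.key": "encrypted"}
	fmt.Println(md.migrate(".rbd.example.key", "rbd.example.key", "unknown"))
	fmt.Println(md)
}
```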

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2021-07-26 11:49:56 +00:00
Yati Padia
6691951453 rbd: use go-ceph for getImageMirroringStatus
Currently, getImageMirroringStatus() is using RBD CLI.
This commit converts RBD CLI to go-ceph API.

Fixes: #2120

Signed-off-by: Yati Padia <ypadia@redhat.com>
2021-07-26 06:37:40 +00:00
Niels de Vos
4e6d9be826 ci: fix yamllint error in generated golangci.yml file
When running 'make containerized-test' the following error gets
reported:

    yamllint -s -d '{extends: default, rules: {line-length: {allow-non-breakable-inline-mappings: true}},ignore: charts/*/templates/*.yaml}' ./scripts/golangci.yml
    ./scripts/golangci.yml
      179:81    error    line too long (84 > 80 characters)  (line-length)

The golangci.yml.in is used to generate golangci.yml, addressing the
line-length there resolves the issue.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2021-07-26 04:05:50 +00:00
Niels de Vos
e75d308b9c e2e: isRetryableAPIError() should match any etcdserver timeout
framework.RunKubectl() returns an error that does not end with
"etcdserver: request timed out", but contains the text somewhere in the
middle:

    error running /usr/bin/kubectl --server=https://192.168.39.57:8443 --kubeconfig=/root/.kube/config --namespace=cephcsi-e2e-a44ec4b4 create -f -:
    Command stdout:

    stderr:
    Error from server: error when creating "STDIN": etcdserver: request timed out

    error:
    exit status 1

isRetryableAPIError() should return `true` for this case as well, so
instead of using HasSuffix(), we'll use Contains().

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2021-07-23 12:20:16 +00:00
Prasanna Kumar Kalever
75dda7ac0d e2e: add test for expansion of encrypted volumes
Also adds a test case to validate the default encryption type

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2021-07-23 10:00:23 +00:00
Prasanna Kumar Kalever
526ff95f10 rbd: add support to expand encrypted volume
Previously, ControllerExpandVolume() had a check for encrypted
volumes and we used to fail all expand requests on an encrypted
volume. Also, for Block VolumeMode PVCs, NodeExpandVolume used to be
ignored/skipped.

With these changes, we add support for the expansion of encrypted volumes.
Also for raw Block VolumeMode PVCs with Encryption we call NodeExpandVolume.

That said, with LUKS1 the cryptsetup utility doesn't prompt for a
passphrase when resizing the crypto mapper device. This is because LUKS1
devices don't use the kernel keyring for volume keys.

LUKS2 devices, on the other hand, use the kernel keyring for the volume
key by default, i.e. the cryptsetup utility asks for a passphrase if it
detects that the volume key was previously passed to dm-crypt via the
kernel keyring service. We override this default with the
--disable-keyring option during the cryptsetup open command, so that we
are not prompted for a passphrase when the crypto mapper device is resized.

Fixes: #1469

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2021-07-23 10:00:23 +00:00
Prasanna Kumar Kalever
4fa05cb3a1 util: add helper functions for resize of encrypted volume
such as:
ResizeEncryptedVolume() and LuksResize()

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2021-07-23 10:00:23 +00:00
Prasanna Kumar Kalever
572f39d656 util: fix log level in OpenEncryptedVolume()
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2021-07-23 10:00:23 +00:00
Prasanna Kumar Kalever
812003eb45 util: fix bug in DeviceEncryptionStatus()
With Luks1 device:
$ cryptsetup status /dev/mapper/crypto-rbd0
/dev/mapper/crypto-rbd0 is active and is in use.
  type:    LUKS1
  cipher:  aes-xts-plain64
  keysize: 512 bits
  key location: dm-crypt
  device:  /dev/rbd0
  sector size:  512
  offset:  4096 sectors
  size:    4190208 sectors
  mode:    read/write

With Luks2 device:
$ cryptsetup status /dev/mapper/crypto-rbd0
/dev/mapper/crypto-rbd0 is active and is in use.
  type:    LUKS2
  cipher:  aes-xts-plain64
  keysize: 512 bits
  key location: dm-crypt
  device:  /dev/rbd0
  sector size:  512
  offset:  32768 sectors
  size:    4161536 sectors
  mode:    read/write

This could lead to failures with unmap in the NodeUnstageVolume path
for the encrypted volumes.

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2021-07-23 10:00:23 +00:00
Yati Padia
1ae2afe208 cleanup: modifies the error caused due to merged PRs
This commit fixes the godot, cyclop, and paralleltest
linter errors introduced by merged PRs.

Updates: #1586

Signed-off-by: Yati Padia <ypadia@redhat.com>
2021-07-22 18:15:48 +00:00
Yati Padia
4e890e9daf ci: disable gci and wrapcheck linter
This commit disables wrapcheck and gci
linters.

Updates: #1586

Signed-off-by: Yati Padia <ypadia@redhat.com>
2021-07-22 18:15:48 +00:00
Yati Padia
172b66f73f cleanup: resolves cyclop linter issue
This commit adds `// nolint:cyclop` for the
functions whose complexity is above 20.

Updates: #1586

Signed-off-by: Yati Padia <ypadia@redhat.com>
2021-07-22 18:15:48 +00:00
Yati Padia
e85c0eedc4 ci: set max-complexity of cyclop as 20
This commit sets the max-complexity of the
cyclop linter to 20.

Signed-off-by: Yati Padia <ypadia@redhat.com>
2021-07-22 18:15:48 +00:00
Yati Padia
45b40661e2 ci: disable gomoddirectives linter
This commit disables gomoddirectives
linter as it bans use of replace directive.

Update: #1586

Signed-off-by: Yati Padia <ypadia@redhat.com>
2021-07-22 18:15:48 +00:00
Yati Padia
9d6ce7c5dd ci: disable forbidigo linter
This commit disables the forbidigo linter, as
it forbids the use of fmt.Printf,
which we need in various parts of
our codebase.

Updates: #1586

Signed-off-by: Yati Padia <ypadia@redhat.com>
2021-07-22 18:15:48 +00:00
Yati Padia
9414a76a86 ci: disable exhaustivestruct linter
This commit disables the exhaustivestruct linter
as it is meant to be used only for special cases.
We don't need to enable this for our project.

Fixes: #2224

Signed-off-by: Yati Padia <ypadia@redhat.com>
2021-07-22 18:15:48 +00:00
Yati Padia
c5bc3d38c4 ci: update static check tools
This PR updates the static check tools to
the latest version.
The errors reported after the update
still need to be resolved.

Updates: #1586

Signed-off-by: Yati Padia <ypadia@redhat.com>
2021-07-22 18:15:48 +00:00
Humble Chirammal
abe6a6e5ac util: remove deleteLock test as it is enforced by the controller
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2021-07-22 15:07:49 +00:00
Humble Chirammal
c42d4768ca util: remove the deleteLock acquistion check for clone and snapshot
At present, while acquiring the deleteLock on the volume, we check
for ongoing clone and snapshot creation operations on the same volume.
Since the snapshot and clone controllers do not allow parent
volume deletion during these operations, we can drop this
extra check.

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2021-07-22 15:07:49 +00:00
Niels de Vos
82557e3f34 util: allow configuring VAULT_BACKEND for Vault connection
It seems that the version of the key/value engine can not always be
detected for Hashicorp Vault. In certain cases, it is required to
configure the `VAULT_BACKEND` (or `vaultBackend`) option so that a
successful connection to the service can be made.

The `kv-v2` is the current default for development deployments of
Hashicorp Vault (what we use for automated testing). Production
deployments default to version 1 for now.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2021-07-22 13:02:47 +00:00