Commit Graph

501 Commits

Author SHA1 Message Date
Niels de Vos
a0ca713d79 rbd: repair thick-provisioned images on CreateVolume restart
Signed-off-by: Niels de Vos <ndevos@redhat.com>
(cherry picked from commit 7cbad9305f)
2021-06-01 16:08:39 +00:00
Niels de Vos
58d606ab8d cleanup: split repairExistingVolume() from CreateVolume()
Move the repairing of a volume/snapshot from CreateVolume to its own
function. This reduces the complexity of the code, and makes the
procedure easier to understand. Further enhancements to repairing an
exsiting volume can be done in the new function.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
(cherry picked from commit 96a8ea3e88)
2021-06-01 16:08:39 +00:00
Madhu Rajanna
a884b99c6e rbd: fix image details logging
log only the required details of
the image.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 0ce6ad1152)
2021-05-07 09:25:48 +00:00
Madhu Rajanna
5e9f007ffd rbd: flatten image if the depth is not zero
flatten the image if the deep-flatten feature
is present on the images in the chain or if the
images in chain is not zero, as we cannot check
the deep-flatten feature the images which are
in trash.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 67d73cd6e9)
2021-05-07 09:25:48 +00:00
Madhu Rajanna
38bd4e613e rbd: discard image not found error
For flatten we call checkImageChainHasFeature
which internally calls to getImageInfo returns
the parent name even if the parent is in the trash,
when we try to open the parent image to get its
information it fails as the image not found.
we should treat error as nil if the parent is not found.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit e15e2e5081)
2021-05-07 09:25:48 +00:00
Madhu Rajanna
75fa1927fc rbd: mark image ready when image state is up+unknown
To recover from split brain (up+error) state the image need to be
demoted and requested for resync on site-a and then the image on site-b
should gets demoted.The volume should be marked to ready=true when the
image state on both the clusters are up+unknown because during the last
snapshot syncing the data gets copied first and then image state on the
site-a changes to up+unknown.

If the image state on both the sites are up+unknown consider that
complete data is synced as the last snapshot
gets exchanged between the clusters.

* create 10 GB of file and validate the data after resync

* Do Failover when the site-a goes down
* Force promote the image and write data in GiB
* Once the site-a comes back, Demote the image and issue resync
* Demote the image on site-b
* The status will get reflected on the other site when the last
  snapshot sync happens
* The image will go to up+unknown state. and complete data will
  be copied to site a
* Promote the image on site-a and use it

```bash
csi-vol-5633715e-a7eb-11eb-bebb-0242ac110006:
  global_id:   e7f9ec55-06ab-46cb-a1ae-784be75ed96d
  state:       up+unknown
  description: remote image demoted
  service:     a on minicluster1
  last_update: 2021-04-28 07:11:56
  peer_sites:
    name: e47e29f4-96e8-44ed-b6c6-edf15c5a91d6-rook-ceph
    state: up+unknown
    description: remote image demoted
    last_update: 2021-04-28 07:11:41
 ```

* Do Failover when the site-a goes down
* Force promote the image on site-b and write data in GiB
* Demote the image on site-b
* Once the site-a comes back, Demote the image on site-a
* The images on the both site will go to split brain state

```bash
csi-vol-37effcb5-a7f1-11eb-bebb-0242ac110006:
  global_id:   115c3df9-3d4f-4c04-93a7-531b82155ddf
  state:       up+error
  description: split-brain
  service:     a on minicluster2
  last_update: 2021-04-28 07:25:41
  peer_sites:
    name: abbda0f0-0117-4425-8cb2-deb4c853da47-rook-ceph
    state: up+error
    description: split-brain
    last_update: 2021-04-28 07:25:26
```
* Issue resync
* The images cannot be resynced because when we issue resync
  on site a the image on site-b was in demoted state
* To recover from this state (promote and then demote the
  image on site-b after sometime)

```bash
csi-vol-37effcb5-a7f1-11eb-bebb-0242ac110006:
  global_id:   115c3df9-3d4f-4c04-93a7-531b82155ddf
  state:       up+unknown
  description: remote image demoted
  service:     a on minicluster1
  last_update: 2021-04-28 07:32:56
  peer_sites:
    name: e47e29f4-96e8-44ed-b6c6-edf15c5a91d6-rook-ceph
    state: up+unknown
    description: remote image demoted
    last_update: 2021-04-28 07:32:41
```
* Once the data is copied we can see that  the image state
  is moved to up+unknown on both sites
* Promote the image on site-a and use it

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 07a916b84d)
2021-05-05 15:07:18 +00:00
Madhu Rajanna
1c59f0683e rbd: delete encryption key from KMS
when a Snapshot is encrypted during a CreateSnapshot
operation, the encryption key gets created in the KMS
when we delete the Snapshot the key from the KMS
should also gets deleted.

When we create a volume from snapshot we are copying
required information but we missed to copy the
encryption information, This commit adds the missing
information to delete the encryption key.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit c3bae17fce)
2021-04-30 09:37:23 +00:00
Humble Chirammal
1367cb445f rbd: return crypt error for the rpc return
At present we return the volume connect error if the clone
from snapshot fails when rbdvolume is encrypted, which is incorrect.
This patch correctly return the failed copy encryption error to the
caller

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
(cherry picked from commit 798437d0c4)
2021-04-22 12:55:50 +05:30
Madhu Rajanna
599f3fd8e4 rbd: modified logic to check image watchers
Before RBD map operation, we do check the
watchers on the RBD image. In the case of
RWO volume. cephcsi makes sure only one
client is using the RBD image. If the rbd
image is mirrored, by default mirroring
daemon will add a watcher on the image
and as we are using go-ceph a watcher will
be added as we have opened the image So
we will have two watchers on an image if
mirroring is enabled. This holds when the
rbd mirror daemon is running, In case if
the mirror daemon is not running there will
be only one watcher on the rbd image
(which is placed by go-ceph image open)
we should not block the map operation if
the mirroring daemon is not running as
its Async mirroring. This commit adds a
check to make sure no more than 2 watchers
if the image is mirrored or no more than 1
watcher if it is not mirrored image.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 52290333e6)
2021-04-20 11:54:30 +05:30
Madhu Rajanna
eea52847bc rbd: check volumeID in PV if image not found
If the pool or few keys are missing in the omap.
GetImageAttributes function returns nil error message and few
empty items in imageAttributes struct. if the image is not
found and  the entiries are missing use
the volumeId present on the PV annotation for further operations.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2021-04-15 17:13:06 +05:30
Madhu Rajanna
cfc88c9910 rbd: discard up+unknown state in ResyncVolume
incase if the image is promoted and demoted the
image state will be set to up+unknown if the image
on the remote cluster is still in demoted state.

when user changes the state from primary to secondary
and still the image is in demoted (secondary) state
in the remote cluster. the image state on both the cluster
will be on unknown state.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2021-04-15 17:13:06 +05:30
Niels de Vos
8b8480017b logging: report issues in rbdImage.DEKStore API with stacks
It helps to get a stack trace when debugging issues. Certain things are
considered bugs in the code (like missing attributes in a struct), and
might cause a panic in certain occasions.

In this case, a missing string will not panic, but the behaviour will
also not be correct (DEKs getting encrypted, but unable to decrypt).
Clearly logging this as a BUG is probably better than calling panic().

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2021-04-14 03:59:28 +00:00
Niels de Vos
b1d05a1840 rbd: repair encryption config in case it is missing
It is possible that when a provisioner restarts after a snapshot was
cloned, but before the newly restored image had its encryption metadata
set, the new image is not marked as encrypted. This will prevent
attaching/mounting the image, as the encryption key will not be fetched,
or is not available in the DEKStore.

By actively repairing the encryption configuration when needed, this
problem should be addressed.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2021-04-14 03:59:28 +00:00
Niels de Vos
1482105309 cleanup: use buildCreateVolumeResponse() to simplify CreateVolume()
buildCreateVolumeResponse() exists exactly for the need to create a
csi.CreateVolumeResponse based on an rbdVolume. Calling this helper
reduces the code duplication in CreateVolume().

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2021-04-14 03:59:28 +00:00
Niels de Vos
52433841b4 cleanup: move copyEncryptionConfig() from CreateVolume to Exists()
The rbdVolume that needs its encryption configured is constructed in the
Exists() method. It is suitable to move the copyEncryptionConfig() call
there as well, so that the object is completely constructed in a single
place.

Golang-ci:gocyclo complained about the increased complexity of the
Exists() function. Moving the repairing of the ImageID into its own
helper function makes the code a little easier to understand.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2021-04-14 03:59:28 +00:00
Niels de Vos
596410ae60 cleanup: address "nolint" comments for RBD CreateSnapshot
Introduce helper function cloneFromSnapshot() that takes care of the
procedures that are needed when an existing snapshot has been found.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2021-04-14 03:59:28 +00:00
Niels de Vos
b5d0524c39 cleanup: release resources for rbdImages objects after use
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2021-04-14 03:59:28 +00:00
Niels de Vos
dc990037a5 rbd: move setupEncryption() from buildCreateVolumeResponse to CreateVolume
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2021-04-14 03:59:28 +00:00
Niels de Vos
bea9d56117 rbd: copyEncryptionConfig in doSnapshotClone()
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2021-04-14 03:59:28 +00:00
Niels de Vos
fd5f4dbafd rbd: configureEncryption() in genSnapFromSnapID()
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2021-04-14 03:59:28 +00:00
Niels de Vos
6fd3f57f40 rbd: set kmsID in reserveSnap()
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2021-04-14 03:59:28 +00:00
Niels de Vos
0a046c5b6d rbd: copy encryption configuration in CreateSnapshot
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2021-04-14 03:59:28 +00:00
Niels de Vos
6b1285d38b rbd: copy passphrase for encrypted clones
When a source volume is encrypted, the passphrase needs to be copied and
stored for the newly cloned volume.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2021-04-14 03:59:28 +00:00
Niels de Vos
7b332a0184 rbd: add rbdImage.copyEncryptionConfig() to copy encryption metadata
Cloning volumes requires copying the DEK from the source to the newly
cloned volume. Introduce copyEncryptionConfig() as a helper for that.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2021-04-14 03:59:28 +00:00
Niels de Vos
7e6feecc25 util: add VolumeEncryption.StoreCryptoPassphrase()
The new StoreCryptoPassphrase() method makes it possible to store an
unencrypted passphrase newly encrypted in the DEKStore.

Cloning volumes will use this, as the passphrase from the original
volume will need to get copied as part of the metadata for the volume.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2021-04-14 03:59:28 +00:00
Niels de Vos
b6aa19eea5 rbd: pass secrets when creating an source rbdVolume for cloning
Without this, the rbdVolume can not connect to the Ceph cluster and
configure the (optional) encryption.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2021-04-14 03:59:28 +00:00
Niels de Vos
92b2e08adf rbd: improve logging in deleteImage()
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2021-04-14 03:59:28 +00:00
Niels de Vos
99da92cfd7 rbd: move deletion of DEK to deleteImage()
The ControllerServer should not need to care about support for
encryption, ideally it is transparantly handled by the rbdVolume type
and its internal API.

Deleting the DEK was one of the last remainders that was explicitly done
inside the ControllerServer.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2021-04-14 03:59:28 +00:00
Niels de Vos
151d066938 util: add logging when OpenEncryptedVolume() encounters an error
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2021-04-14 03:59:28 +00:00
Niels de Vos
bd1388fb96 util: log available configs when KMS not found
When the KMS configuration can not be found, it is useful to know what
configurations are available. This aids troubleshooting when typos in
the KMS ID are made.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2021-04-14 03:59:28 +00:00
Niels de Vos
a7c261a394 logging: correct formatting when reporting error in createVolumeFromSnapshot()
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2021-04-14 03:59:28 +00:00
Rakshith R
ae6a52a84e util: add nil check to default ControllerGetCapabilities()
Currently default ControllerGetCapabilities function is being
used which throws 'runtime error: invalid memory address or
nil pointer dereference' when `--controllerServer=true` is not
set in provisioner deployment args.
This commit adds a check to prevent it.

Fixes: 1925

Signed-off-by: Rakshith R <rar@redhat.com>
2021-04-09 10:12:48 +00:00
Rakshith R
10d539efc8 cleanup: correct nolint directive listing format
nolint directive needs to be followed by comma separated
list of linters. This commit changes to gocognit:gocyclo
which was not recognised to linters which show error for
the function.

Signed-off-by: Rakshith R <rar@redhat.com>
2021-04-09 07:24:47 +00:00
Rakshith R
fb7389f478 cephfs: add stderr to mount function errors
This commit appends stderr to error in both kernel and
ceph-fuse mounter functions to better be able to debug
errors.

Signed-off-by: Rakshith R <rar@redhat.com>
2021-04-08 12:18:01 +00:00
Madhu Rajanna
e2fa84357a rbd: take lock when reconciling the PV
there can be a change we can reconcile same
PV parallelly we can endup in generating and
deleting multiple omap keys. to be on safer
side taking lock to process one volumeHandle
at a time.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2021-04-07 11:46:27 +00:00
Madhu Rajanna
0f8813d89f rbd:store/Read volumeID in/from PV annotation
In the case of the Async DR, the volumeID will
not be the same if the clusterID or the PoolID
is different, With Earlier implementation, it
is expected that the new volumeID mapping is
stored in the rados omap pool. In the case of the
ControllerExpand or the DeleteVolume Request,
the only volumeID will be sent it's not possible
to find the corresponding poolID in the new cluster.

With This Change, it works as below

The csi-rbdplugin-controller will watch for the PV
objects, when there are any PV objects created it
will check the omap already exists, If the omap doesn't
exist it will generate the new volumeID and it checks for
the volumeID mapping entry in the PV annotation, if the
mapping does not exist, it will add the new entry
to the PV annotation.

The cephcsi will check for the PV annotations if the
omap does not exist if the mapping exists in the PV
annotation, it will use the new volumeID for further
operations.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2021-04-07 11:46:27 +00:00
Rakshith R
020cded581 cleanup: refactor deeply nested if statements in internal/rbd
Refactored deeply nested if statement in internal/rbd to
reduce cognitive complexity.

Signed-off-by: Rakshith R <rar@redhat.com>
2021-04-07 02:31:41 +00:00
Rakshith R
d4cfd7bef9 cleanup: refactor deeply nested if statement in vault_tokens.go
Refactored deeply nested if statement in vault_tokens.go to
reduce cognitive complexity by adding fetchTenantConfig function.

Signed-off-by: Rakshith R <rar@redhat.com>
2021-04-07 02:31:41 +00:00
Rakshith R
2d1a572d11 cleanup: refactor deeply nested if statements in internal/cephfs
Refactored deeply nested if statement in internal/cephfs to
reduce cognitive complexity.

Signed-off-by: Rakshith R <rar@redhat.com>
2021-04-07 02:31:41 +00:00
Rakshith R
0f7b653b4e cleanup: refactor deeply nested if statements in persistentvolume.go
Refactored deeply nested if statement in persistentvolume.go to
reduce cognitive complexity.

Signed-off-by: Rakshith R <rar@redhat.com>
2021-04-07 02:31:41 +00:00
Niels de Vos
aaeb35eceb rbd: encrypted volumes can be of type "crypto_LUKS" too
It seems that newer versions of some tools/libraries identify encrypted
filesystems with `crypto_LUKS` instead of `crypt`.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2021-04-06 15:54:27 +00:00
Madhu Rajanna
d7838defcf rbd: return FailedPrecondition error message
In case of the DR the image on the primary site cannot be
demoted as the cluster is down, during failover the image need
to be force promoted. RBD returns `Device or resource busy`
error message if the image cannot be promoted for above reason.
Return FailedPrecondition so that replication operator can send
request to force promote the image.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2021-04-06 14:12:41 +00:00
Madhu Rajanna
403532c9a6 rbd: use force from PromoteVolume Request
instead of fetching the force option from the
parameters. Use the Force field available in
the PromoteVolume Request.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2021-04-06 14:12:41 +00:00
Madhu Rajanna
385a751b8e rebase: rename kube-storage to csi-addons
as the org github.com/kube-storage is renamed
to github.com/csi-addons as the name kube-storage
was more generic.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2021-04-06 10:59:58 +00:00
Niels de Vos
1c1683ba20 util: add AmazonMetadata KMS provider
The new Amazon Metadata KMS provider uses a CMK stored in AWS KMS to
encrypt/decrypt the DEK which is stored in the volume metadata.

Updates: #1921
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2021-04-06 07:33:54 +00:00
Niels de Vos
f3b06d4c4a util: pass Namespace as part of KMSInitializerArgs
Amazon KMS expects a Secret with sensitive account and key information
in the Kubernetes Namespace where the Ceph-CSI Pods are running. It will
fetch the contents of the Secret itself.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2021-04-06 07:33:54 +00:00
Niels de Vos
523ac4b975 util: move getPodNamespace() and getKMSConfigMapName() into its own helpers
These functions can now be re-used easier. The Amazon KMS needs to know
the Namespace of the Pod for reading a Secret with more key/values.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2021-04-06 07:33:54 +00:00
Humble Chirammal
314fe0e23d cleanup: correct misspelling in rbd/clone.go
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2021-04-05 09:34:09 +00:00
Madhu Rajanna
448be70682 rbd: early check for disabled,disabling in DisableVolumeReplication
added early check for disabling and disabled
image mirroring state in DisableVolumeReplication

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2021-04-05 08:53:40 +00:00
Madhu Rajanna
fb3f7fe202 rbd: remove todo for image not found
Incase of resync the image will get deleted, gets
recreated and its a a time consuming operation.
It makes sense to return aborted error instead
of not found as we have omap data only the image
is missing in rbd pool.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2021-04-05 08:53:40 +00:00