Commit Graph

637 Commits

Author SHA1 Message Date
Niels de Vos
d98516e9d8 rbd: add locking for VolumeGroupSnapshot operations
Add VolumeGroupLocks in the CSI Controller Server so that operations are
protected against concurrent requests for the same VolumeGroupSnapshot.

Signed-off-by: Niels de Vos <ndevos@ibm.com>
2024-11-12 09:28:30 +00:00
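The kind of protection described in this commit can be sketched as a set of per-ID try-locks. This is only an illustration under assumed names: `idLocks`, `TryAcquire` and `Release` are hypothetical and are not the VolumeGroupLocks implementation from the commit.

```go
package main

import (
	"fmt"
	"sync"
)

// idLocks is a hypothetical set of per-ID try-locks; it is not the
// VolumeGroupLocks type that the commit adds.
type idLocks struct {
	mu    sync.Mutex
	inUse map[string]struct{}
}

func newIDLocks() *idLocks {
	return &idLocks{inUse: make(map[string]struct{})}
}

// TryAcquire returns false if an operation for id is already in progress.
func (l *idLocks) TryAcquire(id string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()

	if _, busy := l.inUse[id]; busy {
		return false
	}
	l.inUse[id] = struct{}{}

	return true
}

// Release marks the operation for id as finished.
func (l *idLocks) Release(id string) {
	l.mu.Lock()
	defer l.mu.Unlock()

	delete(l.inUse, id)
}

func main() {
	locks := newIDLocks()

	fmt.Println(locks.TryAcquire("group-snap-1")) // first request proceeds
	fmt.Println(locks.TryAcquire("group-snap-1")) // concurrent request is rejected
	locks.Release("group-snap-1")
	fmt.Println(locks.TryAcquire("group-snap-1")) // allowed again after release
}
```

A request handler would typically return an "operation already in progress" style error when `TryAcquire` fails, and defer `Release` when it succeeds.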
Niels de Vos
f3d40f9e5a rbd: cleanup inconsistent state in reserveSnap() after a failure
`reserveSnap()` can potentially fail halfway through, in that case it
needs to undo the snapshot reservation and restore modified attributes
of the snapshot.

Fixes: #4945
Signed-off-by: Niels de Vos <ndevos@ibm.com>
2024-11-11 13:39:05 +00:00
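One common Go pattern for undoing partial work after a failure is a deferred function that inspects a named error return. The sketch below assumes a hypothetical `snapshot` struct and `reserve()` function; it does not reproduce the actual `reserveSnap()` code.

```go
package main

import (
	"errors"
	"fmt"
)

// snapshot is a hypothetical stand-in for the real snapshot struct.
type snapshot struct {
	name     string
	reserved bool
}

// reserve modifies snap in several steps; failHalfway simulates an error
// that occurs after some attributes were already changed.
func reserve(snap *snapshot, failHalfway bool) (err error) {
	original := *snap // remember the attributes before modifying them

	defer func() {
		if err != nil {
			*snap = original // restore the modified attributes on failure
		}
	}()

	snap.reserved = true
	snap.name = "csi-snap-reserved"

	if failHalfway {
		return errors.New("simulated failure after reservation")
	}

	return nil
}

func main() {
	s := &snapshot{name: "request-name"}
	err := reserve(s, true)
	fmt.Println(err != nil, s.name, s.reserved)
}
```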
Niels de Vos
cea8bf8110 rbd: set SnapshotGroupID on each Snapshot of a VolumeGroupSnapshot
Without the SnapshotGroupID in the Snapshot object, Kubernetes CSI does
not know that the Snapshot belongs to a group. In that case, it allows
the deletion of the Snapshot, which should be denied.

Signed-off-by: Niels de Vos <ndevos@ibm.com>
2024-11-06 11:37:44 +00:00
Niels de Vos
ec1e7a4ee0 rbd: expose the GroupControllerService
When the GroupSnapGetInfo go-ceph function is supported by librbd, the
Group Controller Service and VolumeGroupSnapshot capabilities can be
exposed to the Container Orchestrator.

Signed-off-by: Niels de Vos <ndevos@ibm.com>
2024-11-06 11:37:44 +00:00
Niels de Vos
e34dceff27 rbd: implement CSI Group Controller Server
Signed-off-by: Niels de Vos <ndevos@ibm.com>
2024-11-06 11:37:44 +00:00
Niels de Vos
e011e74b9d rbd: fix snapshot deletion by resolving image names correctly
When creating a Snapshot with the new NewSnapshotByID() function, the
name of the RBD-image that is created is the same as the name of the
Snapshot. The `RbdImageName` points to the name of parent image, which
causes deleting the Snapshot to delete the parent image instead.

Correcting the `RbdImageName` and setting it to the `RbdSnapName` makes
sure that upon deletion, the Snapshot RBD-image is removed, and not the
parent image.

Signed-off-by: Niels de Vos <ndevos@ibm.com>
2024-11-06 11:37:44 +00:00
Niels de Vos
fdccba1f33 rbd: add Manager.GetVolumeGroupSnapshotByName
The Group Controller Server may need to fetch a VolumeGroupSnapshot that
was statically provisioned. In that case, only the name of the
VolumeGroupSnapshot is known and should be resolved to an object.

Signed-off-by: Niels de Vos <ndevos@ibm.com>
2024-11-06 11:37:44 +00:00
Niels de Vos
ad381c4ff0 rbd: implement Manager.GetVolumeGroupSnapshotByID
The GetVolumeGroupSnapshotByID function makes it possible to get a
VolumeGroupSnapshot object from the Manager by passing a request-id.
This makes it simple for the Group Controller Server to check if a
VolumeGroupSnapshot already exists, so it is not needed to try and
re-create an existing one.

Signed-off-by: Niels de Vos <ndevos@ibm.com>
2024-11-06 11:37:44 +00:00
Niels de Vos
7563f4285d rbd: add manager.CreateVolumeGroupSnapshot()
Implement the CreateVolumeGroupSnapshot for the rbd.Manager. A Group
Controller Server can use the rbd.Manager to create VolumeGroupSnapshots
in an easy and idempotent way.

Signed-off-by: Niels de Vos <ndevos@ibm.com>
2024-11-06 11:37:44 +00:00
Niels de Vos
9bea3feff1 rbd: add manager GetSnapshotByID and SnapshotResolver interface
A (CSI) VolumeGroupSnapshot object contains references to Snapshot IDs
(or CSI Snapshot handles). In order to work with a VolumeGroupSnapshot
struct, the Snapshot IDs need to be resolved into rbdSnapshot structs.

Signed-off-by: Niels de Vos <ndevos@ibm.com>
2024-11-06 11:37:44 +00:00
Niels de Vos
455a90e9f4 rbd: add VolumeGroupSnapshot type
The VolumeGroupSnapshot type will be used by the rbd.Manager to create,
inspect and delete VolumeGroupSnapshots.

Signed-off-by: Niels de Vos <ndevos@ibm.com>
2024-11-06 11:37:44 +00:00
Niels de Vos
efb7bccaea rbd: add VolumeGroup.CreateSnapshots() implementation
When the rbd.Manager creates a VolumeGroupSnapshot, each RBD-snapshot
that is created as part of the RBD-group needs to be cloned into its own
RBD-image that will be used as a CSI Snapshot.

The VolumeGroup.CreateSnapshots() creates the RBD-group snapshot and
returns a list of the Snapshot structs.

Signed-off-by: Niels de Vos <ndevos@ibm.com>
2024-11-06 11:37:44 +00:00
Niels de Vos
20fadf2016 rbd: add rbdVolume.NewSnapshotByID to clone images by RBD snapshot-id
The NewSnapshotByID() function makes it possible to clone a new Snapshot
from an existing RBD-image and the ID of an RBD-snapshot on that image.

This will be used by the VolumeGroupSnapshot feature, where the ID of an
RBD-snapshot is obtained for the RBD-snapshot on the RBD-images.

Signed-off-by: Niels de Vos <ndevos@ibm.com>
2024-11-06 11:37:44 +00:00
Niels de Vos
9808408340 rbd: pass CSI-drivername to volume group instead of journal instance
Each object is responsible for maintaining a connection to the journal.

By sharing a single journal, cleanup of objects becomes more complex as
the journal is used in deferred functions and only the last should
destroy the journal connection resources.

Signed-off-by: Niels de Vos <ndevos@ibm.com>
2024-11-06 11:37:44 +00:00
Niels de Vos
29bf5797b0 rbd: add .requestName to the commonVolumeGroup struct
Signed-off-by: Niels de Vos <ndevos@ibm.com>
2024-11-06 11:37:44 +00:00
Niels de Vos
4b13e9132b rbd: have GetVolumeGroup() return an empty volume group if it was not found
Signed-off-by: Niels de Vos <ndevos@ibm.com>
2024-11-06 11:37:44 +00:00
Niels de Vos
6d88e0a4c7 rbd: close the RBD-image after adding it to a VolumeGroup
When the image is not closed, it keeps a watch open. This prevents the
CSI Controller from deleting the Volume, as there is still a user of it.

Fixes: f9ab14e826 "rbd: check if an image is part of a group before adding it"
Signed-off-by: Niels de Vos <ndevos@ibm.com>
2024-11-06 11:37:44 +00:00
Madhu Rajanna
fdc74973d8 rbd: register GET_CLIENTS_TO_FENCE caps
Register the Capability_NetworkFence_GET_CLIENTS_TO_FENCE capability
and start a NetworkFence controller as part of the rbd nodeplugin.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2024-11-06 09:48:45 +00:00
Niraj Yadav
1c02e69ba4 rbd: Add timeout for cryptsetup commands
This PR modifies the execCryptSetupCommand so that
the process is killed in the event of a lock timeout.

Useful in cases where the volume lock is released but
the command is still running.

Signed-off-by: Niraj Yadav <niryadav@redhat.com>
2024-11-05 11:39:59 +00:00
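In Go, killing a child process on timeout is usually done with `exec.CommandContext` and a context that carries a deadline. The following is a minimal sketch, assuming a hypothetical `runWithTimeout()` helper and using `sleep` as a stand-in for a long-running cryptsetup invocation; it is not the execCryptSetupCommand code.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"os/exec"
	"time"
)

// demoTimeout is an arbitrary value chosen for this illustration.
const demoTimeout = 100 * time.Millisecond

// runWithTimeout is a hypothetical helper. exec.CommandContext kills the
// started process once the context is done.
func runWithTimeout(timeout time.Duration, name string, args ...string) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	err := exec.CommandContext(ctx, name, args...).Run()
	if err != nil && errors.Is(ctx.Err(), context.DeadlineExceeded) {
		return fmt.Errorf("%s timed out after %s: %w",
			name, timeout, context.DeadlineExceeded)
	}

	return err
}

func main() {
	// "sleep 5" runs much longer than the timeout, so it gets killed.
	err := runWithTimeout(demoTimeout, "sleep", "5")
	fmt.Println(err)
}
```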
Praveen M
c7f41cf84b util: add GetCephFSRadosNamespace method
This commit adds `GetCephFSRadosNamespace` util method that returns
the `RadosNamespace` specified in ceph-csi-config ConfigMap under
cephFS.radosNamespace.

If not specified, the method returns the default RadosNamespace,
i.e., csi.

Signed-off-by: Praveen M <m.praveen@ibm.com>
2024-10-21 14:11:27 +00:00
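The fallback behaviour can be illustrated with a small sketch. The `cephFSConfig` struct and `radosNamespaceOrDefault()` function are hypothetical; only the default value `csi` comes from the commit message.

```go
package main

import "fmt"

// defaultRadosNamespace is the default named in the commit message.
const defaultRadosNamespace = "csi"

// cephFSConfig is a hypothetical stand-in for the cephFS section of the
// ceph-csi-config ConfigMap.
type cephFSConfig struct {
	RadosNamespace string
}

// radosNamespaceOrDefault returns the configured namespace, or the default
// when none was specified.
func radosNamespaceOrDefault(cfg cephFSConfig) string {
	if cfg.RadosNamespace == "" {
		return defaultRadosNamespace
	}

	return cfg.RadosNamespace
}

func main() {
	fmt.Println(radosNamespaceOrDefault(cephFSConfig{}))
	fmt.Println(radosNamespaceOrDefault(cephFSConfig{RadosNamespace: "tenant-a"}))
}
```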
Niels de Vos
a51a6ae43a rbd: add types.Snapshot interface
The rbdSnapshot/rbdImage object implements all functions for a useful
Snapshot interface. The rbd.Manager will be able to use this for
providing VolumeGroupSnapshot support.

Signed-off-by: Niels de Vos <ndevos@ibm.com>
2024-10-17 16:30:33 +00:00
Niels de Vos
f885c77f4e rbd: use GetCreationTime() to build the CSI-Snapshot object
Signed-off-by: Niels de Vos <ndevos@ibm.com>
2024-10-17 16:30:33 +00:00
Niels de Vos
e154eae732 cleanup: use err and target in recommended order to errors.Is()
The documentation has `errors.Is(err, target)`, so use this as the order
of the parameters.

Signed-off-by: Niels de Vos <ndevos@ibm.com>
2024-10-14 07:29:12 +00:00
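The argument order matters once errors are wrapped, because `errors.Is` unwraps only its first argument. A short illustration, assuming a hypothetical `errNotFound` sentinel:

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a hypothetical sentinel error for this illustration.
var errNotFound = errors.New("not found")

// lookup wraps the sentinel with extra context, as callers commonly do.
func lookup() error {
	return fmt.Errorf("looking up image: %w", errNotFound)
}

// isNotFound uses the documented order: the error first, the target second.
func isNotFound(err error) bool {
	return errors.Is(err, errNotFound)
}

func main() {
	err := lookup()

	fmt.Println(isNotFound(err))             // documented order matches
	fmt.Println(errors.Is(errNotFound, err)) // swapped order does not match
}
```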
Niels de Vos
3802dd2c2c rbd: add feature check to see if GroupSnapGetInfo is available
The go-ceph rbd package provides the GroupSnapGetInfo function, but it
may return ErrUnsupported when called. Returning this error after
advertising the support for VolumeGroupSnapshot seems ugly.

In order to advertise support for VolumeGroupSnapshot,
SupportsGroupSnapGetInfo() can be used, which detects the required C
function of librbd.

Signed-off-by: Niels de Vos <ndevos@ibm.com>
2024-10-10 15:45:47 +00:00
Niels de Vos
d33e6b14fe rbd: validate IOContext before getting the list of trashed images
`ensureImageCleanup()` can cause a panic when an image was deleted, but
the journal still contained a reference. By opening the IOContext before
using, an error may be returned instead of a panic when using a `nil` or
freed IOContext.

Signed-off-by: Niels de Vos <ndevos@ibm.com>
2024-10-04 11:04:22 +00:00
Niels de Vos
10076ca11f rbd: use the new go-ceph rbd.ErrExist for checking rbd.GroupCreate()
The go-ceph rbd.GroupCreate() now returns ErrExist in case the group
that is created, already exists. The previous check only ever matched
the string comparison, which is prone to errors in case the contents are
modified by go-ceph.

Signed-off-by: Niels de Vos <ndevos@ibm.com>
2024-10-04 09:00:23 +00:00
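Checking for a sentinel error instead of comparing message strings might look like the sketch below. The names `errExist`, `createGroup()` and `ensureGroup()` are hypothetical stand-ins; the real code relies on the go-ceph `rbd.ErrExist` and `rbd.GroupCreate()` mentioned in the commit.

```go
package main

import (
	"errors"
	"fmt"
)

// errExist is a hypothetical sentinel, standing in for rbd.ErrExist.
var errExist = errors.New("group already exists")

// createGroup is a hypothetical stand-in for a group creation call.
func createGroup(name string, existing map[string]bool) error {
	if existing[name] {
		return fmt.Errorf("creating group %q: %w", name, errExist)
	}
	existing[name] = true

	return nil
}

// ensureGroup treats "already exists" as success. Matching the sentinel
// keeps working even if the wording of the error message changes.
func ensureGroup(name string, existing map[string]bool) error {
	err := createGroup(name, existing)
	if err != nil && !errors.Is(err, errExist) {
		return err
	}

	return nil
}

func main() {
	groups := map[string]bool{}

	fmt.Println(ensureGroup("vg-1", groups)) // group is created
	fmt.Println(ensureGroup("vg-1", groups)) // group exists, still no error
}
```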
Madhu Rajanna
88b964fe18 rbd: consider ErrPermissionDenied for vol
In case of RDR with restricted access, the
ceph user will not have access to all the objects
or all the pools where mapping exists.

This commit adds a check to continue to get
the volume if there is a permission error.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2024-10-03 08:40:07 +00:00
Niels de Vos
2d82cebfeb rbd: move repairImageID() from rbdVolume struct to rbdImage
The `repairImageID()` function is useful for the `rbdSnapshot` objects
as well. Move it to the `rbdImage` struct that is the base for both
`rbdVolume` and `rbdSnapshot`.

Signed-off-by: Niels de Vos <ndevos@ibm.com>
2024-09-26 18:02:22 +00:00
Niels de Vos
f2bc1c674b rbd: replace Manager.DeleteVolumeGroup() by VolumeGroup.Delete()
There is no need for the `Manager.DeleteVolumeGroup()` function as
`VolumeGroup.Delete()` should cover everything too.

By moving the `.Delete()` functionality of removing the group from the
journal to the shared `commonVolumeGroup` type, a volume group snapshot
can use it as well.

Signed-off-by: Niels de Vos <ndevos@ibm.com>
2024-09-26 13:59:21 +00:00
Niels de Vos
8c252d58ea rbd: prevent re-use of destroyed resources
When `.Destroy()` is called on an rbdImage (or rbdVolume or
rbdSnapshot), the IOContext, Connection and other attributes are
invalid. When using a destroyed resource that points to an object that
was allocated through librbd, the process most likely ends with a panic.

Signed-off-by: Niels de Vos <ndevos@ibm.com>
2024-09-26 09:37:21 +00:00
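One way to turn use-after-destroy into an error rather than a panic is to clear the handle on destroy and check it before every use. The sketch below uses hypothetical types (`image`, `ioContext`) and a hypothetical `errDestroyed` error; it is not the rbdImage implementation.

```go
package main

import (
	"errors"
	"fmt"
)

// errDestroyed is a hypothetical error for this illustration.
var errDestroyed = errors.New("resource was destroyed")

// ioContext is a stand-in for a handle allocated by a C library.
type ioContext struct {
	pool string
}

// image is a hypothetical resource that owns an ioContext.
type image struct {
	ioctx *ioContext
}

// Destroy releases the handle and clears the reference, so that later use
// can be detected.
func (i *image) Destroy() {
	i.ioctx = nil
}

// getIOContext returns an error instead of handing out a nil handle.
func (i *image) getIOContext() (*ioContext, error) {
	if i.ioctx == nil {
		return nil, errDestroyed
	}

	return i.ioctx, nil
}

func main() {
	img := &image{ioctx: &ioContext{pool: "replicapool"}}
	img.Destroy()

	_, err := img.getIOContext()
	fmt.Println(err)
}
```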
Robert Vasek
7a727c2a43 util: added logs for slow gRPC calls
This commit adds a gRPC middleware that logs calls that
keep running after their deadline.

Adds --logslowopinterval cmdline argument to pass the log rate.

Signed-off-by: Robert Vasek <robert.vasek@clyso.com>
2024-09-20 08:55:17 +00:00
Niels de Vos
05d501a728 rbd: prevent panic when using rbdImage that is not connected
When an `rbdVolume` or `rbdSnapshot` is not connected with credentials
to the Ceph cluster, operations may try to get the IOContext which then
causes a panic.

Signed-off-by: Niels de Vos <ndevos@ibm.com>
2024-09-18 07:09:12 +00:00
Niels de Vos
42fc0b6bce rbd: rename setImageOptions() to constructImageOptions()
A function called `setImageOptions()` is expected to set the passed
options on the volume. However, the passed options parameter is only
filled with the options that should get set on the RBD-image at the time
of creation.

The naming of the function and its parameter is confusing. Rename the
function to `constructImageOptions()` and return the ImageOptions to
make it easier to understand.

Signed-off-by: Niels de Vos <ndevos@ibm.com>
2024-09-12 10:31:49 +00:00
Rakshith R
61c23dd4d2 rbd: fail DisableVolumeReplication() if image is not mirror disabled
This commit modifies DisableVolumeReplication() to fail
if the image is not in mirror disabled state

Signed-off-by: Rakshith R <rar@redhat.com>
2024-09-11 16:22:29 +00:00
Madhu Rajanna
4d5594acab rbd: set volume condition for block
Set the volume condition as healthy if
we don't have any errors for the block
mode as well.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2024-09-02 12:08:03 +00:00
Madhu Rajanna
54ae9f953a rbd: advertise VOLUME_CONDITION
The rbd nodeserver is already setting the
volume condition in the NodeGetVolumeStats
RPC call, but the cap is not updated
for it. This PR advertises the
VOLUME_CONDITION capability.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2024-09-02 12:08:03 +00:00
Niels de Vos
689498e66a rbd: move common functions for VolumeGroup structs into own type
Many functions that are implemented for the volumeGroup type can be
shared with the (coming) volumeGroupSnapshot type. Move these functions
into a commonVolumeGroup type, so that volumeGroup and
volumeGroupSnapshot can inherit them.

Signed-off-by: Niels de Vos <ndevos@ibm.com>
2024-08-28 11:46:29 +00:00
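Go has no inheritance in the classical sense; sharing functions between types as described here is normally done with struct embedding, which promotes the methods of the embedded type. A minimal sketch using the type names from the commit message, with hypothetical fields and methods:

```go
package main

import "fmt"

// commonVolumeGroup holds what both group types share. The fields and
// methods here are hypothetical.
type commonVolumeGroup struct {
	id   string
	name string
}

// GetID is promoted to every type that embeds commonVolumeGroup.
func (c *commonVolumeGroup) GetID() string {
	return c.id
}

// String is promoted as well, so both group types log the same way.
func (c *commonVolumeGroup) String() string {
	return fmt.Sprintf("%s (%s)", c.name, c.id)
}

type volumeGroup struct {
	commonVolumeGroup
	volumes []string
}

type volumeGroupSnapshot struct {
	commonVolumeGroup
	snapshots []string
}

func main() {
	vg := &volumeGroup{commonVolumeGroup: commonVolumeGroup{id: "1", name: "vg"}}
	vgs := &volumeGroupSnapshot{commonVolumeGroup: commonVolumeGroup{id: "2", name: "vgs"}}

	fmt.Println(vg.GetID(), vg.String())
	fmt.Println(vgs.GetID(), vgs.String())
}
```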
Madhu Rajanna
3ac596840c rbd: add a check for CSI pv
add a check for CSI as it can be
nil for non-csi PV.

fixes: #4807

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2024-08-27 17:08:44 +00:00
Madhu Rajanna
38fd27a209 rbd: add image size in toSnapshot
we need to return the rbd image size
as a snapshot size in CreateSnapshot
Response.

fixes: #4788

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2024-08-21 20:14:51 +00:00
Niels de Vos
869aaced7d rbd: convert rbdVolume to rbdSnapshot
After cloning the RBD snapshot, an rbdVolume is returned for the
CSI.Snapshot object. In order to use the rbdSnapshot.ToCSI() function,
the rbdVolume needs to be converted (back) to an rbdSnapshot.

Signed-off-by: Niels de Vos <ndevos@ibm.com>
2024-08-09 14:58:59 +00:00
Niels de Vos
6d1ab1b8d9 rbd: have GetCreationTime() return a time.Time struct
Do not use protobuf types when there is no need. Just use the standard
time.Time format instead.

Signed-off-by: Niels de Vos <ndevos@ibm.com>
2024-08-09 14:58:59 +00:00
Niels de Vos
dfb48bac17 util: add CSIDriver.GetInstanceID()
There has been some confusion about using different variables for the
InstanceID of the RBD-driver. By removing the global variable
CSIInstanceID, there should be no confusion anymore about which variable to
use.

Signed-off-by: Niels de Vos <ndevos@ibm.com>
2024-08-05 17:04:52 +00:00
Madhu Rajanna
f7c78ae4fe rbd: update group Stringer method
Updated the group stringer method
to have pool and namespace for
proper debugging/logging and to
use it with the CLI as an argument as well.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2024-07-31 12:16:42 +00:00
Madhu Rajanna
37970ae212 rbd: add context to mirror interface
adding required ctx to the mirror
interface as ctx is required for
the volumegroup operations.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2024-07-31 12:16:42 +00:00
Madhu Rajanna
e682f2cc73 rbd: add struct to error
Update the HandleParentImageExistence function
to return a more detailed error that includes
the pool/namespace/image name.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2024-07-31 12:16:42 +00:00
Madhu Rajanna
b222b773aa rbd: implement journalledObject for volumes
implement journalledObject interface to
return the journal objects of the volume.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2024-07-31 12:16:42 +00:00
Madhu Rajanna
a243cf52d4 rbd: return more descriptive error
Updated GetVolumeByID to return a more
descriptive error so that the caller does
not need to add more details in
the error message.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2024-07-31 12:16:42 +00:00
Praveen M
8fa3ac9fb3 cleanup: remove unnecessary error return type
Signed-off-by: Praveen M <m.praveen@ibm.com>
2024-07-31 06:56:32 +00:00
Praveen M
243a0fd0fb rbd: add volume locks for reclaimspace operations
This commit adds locks on reclaimspace operations to
prevent multiple processes from executing rbd sparsify/fstrim
on the same volume.

Signed-off-by: Praveen M <m.praveen@ibm.com>
2024-07-31 06:56:32 +00:00
Niraj Yadav
4445247690 rbd: use ioctx locks for key rotation
Signed-off-by: Niraj Yadav <niryadav@redhat.com>
2024-07-30 14:51:49 +00:00