Commit Graph

683 Commits

Author SHA1 Message Date
Praveen M
add4b36900 cleanup: move Destroy() method to journalledObject interface
The VolumeGroup interface has more than 10 methods, which causes
golangci-lint to fail. Move the `Destroy()` method to a base
interface, `journalledObject`.

Signed-off-by: Praveen M <m.praveen@ibm.com>
2025-03-27 09:59:12 +00:00
Praveen M
8d9f353f15 rbd: check for volume group existence
Signed-off-by: Praveen M <m.praveen@ibm.com>
2025-03-27 09:59:12 +00:00
Praveen M
5cbc14454a cleanup: move internal/rbd/errors.go to internal/rbd/errors package
Signed-off-by: Praveen M <m.praveen@ibm.com>
2025-03-27 09:59:12 +00:00
Praveen M
0ed0af120b rbd: retain intermediate RBD snapshot on temp image
Currently, Ceph-CSI deletes the intermediate RBD snapshot on
temporary cloned images (`csi-vol-xxxx-temp@csi-vol-xxxx`),
which is the parent of the final clone image.

Parent-child mirroring requires both the parent and child
images to be present (i.e., not in trash).

This commit enhances the `createRBDClone` function by
introducing a `deleteSnap` parameter. If `deleteSnap` is true,
the snapshot is deleted after the clone is created.

This is required to support mirroring of child image with its
parent image.

Signed-off-by: Praveen M <m.praveen@ibm.com>
2025-03-18 13:42:11 +00:00
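The effect of the `deleteSnap` parameter from commit 0ed0af120b can be sketched with an in-memory stand-in for an RBD pool. Everything here except the parameter name is hypothetical: the `cluster` type and `createClone` method are illustrations, not the Ceph-CSI implementation, and no real Ceph calls are made.

```go
package main

import "fmt"

// cluster is a hypothetical in-memory stand-in for an RBD pool.
type cluster struct {
	images    map[string]bool // image name -> exists
	snapshots map[string]bool // "image@snap" -> exists
}

// createClone snapshots the parent, creates the clone from that snapshot
// and removes the snapshot afterwards only when deleteSnap is true.
func (c *cluster) createClone(parent, snap, clone string, deleteSnap bool) error {
	if !c.images[parent] {
		return fmt.Errorf("parent image %q not found", parent)
	}
	key := parent + "@" + snap
	c.snapshots[key] = true
	c.images[clone] = true
	if deleteSnap {
		delete(c.snapshots, key)
	}
	return nil
}
```

Passing `deleteSnap=false` keeps `csi-vol-xxxx-temp@csi-vol-xxxx` around, which is the state the commit message says parent-child mirroring needs.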
Rakshith R
6f802589aa rbd: add one depth for softlimit of snapshot for restore PVC
Currently, while preparing a volume for a snapshot,
depthToAvoidFlatten is set to 2. This accounts for one level
for the snapshot and another because the parent of the volume is
flattened.
This commit changes the depth to 3 to also account for a
future PVC restore, since
- a snapshot alone is useless and is very likely to be restored
  at some point in time.
- this ensures the snapshot is not flattened when the restore does occur.
- flattening the snapshot in the above case would make the snapshot
  no longer eligible for the changed block tracking (snap diff)
  operation.
- this maintains similarity with the PVC-PVC clone operation, which
  currently has depthToAvoidFlatten set to 3.

Signed-off-by: Rakshith R <rar@redhat.com>
2025-03-14 15:12:27 +00:00
Niels de Vos
7f7988be0d rbd: cleanup NodeServer.createTargetMountPath()
The inverse checking and returning of is-a-mounted-path makes it
difficult to understand the function. It is easier to follow the code
when the function just returns what it says it does, hence added the
comment for the function too.

Some errors were returned directly, others were converted to gRPC
errors. This has been corrected now too, and the caller converts the
plain error to a gRPC error now.

Signed-off-by: Niels de Vos <ndevos@ibm.com>
2025-03-14 10:27:13 +00:00
Niels de Vos
79cf0321dd util: do not use mount-utils.IsLikelyNotMountPoint anymore
`IsLikelyNotMountPoint()` is an optimized version of `IsMountPoint()`
which can not detect all types of mounts (anymore). The slower
`IsMountPoint()` is safer to use. This can cause a slight
performance regression when there are many mountpoints on the
system, but correctness is more important than speed while mounting.

Fixes: #4633
Signed-off-by: Niels de Vos <ndevos@ibm.com>
2025-03-14 10:27:13 +00:00
Rakshith R
796e6b6c44 rbd: use ListChildrenAttributes() instead of ListChildren()
This commit modifies listSnapAndChildren() to make use of
ListChildrenAttributes() instead of ListChildren() which
allows us to filter out images in trash.
This commit also orders the alive images so that temp clone
images are followed by images backing volume snapshots so
that temp clone images are flattened first.

Signed-off-by: Rakshith R <rar@redhat.com>
2025-03-12 08:51:02 +00:00
Niels de Vos
15da101b1b util: move kernel version functions to pkg/util/kernel
Signed-off-by: Niels de Vos <ndevos@ibm.com>
2025-03-07 16:05:04 +00:00
Niels de Vos
542ed3de63 util: move EncryptionType(s) to pkg/util/crypto
Signed-off-by: Niels de Vos <ndevos@ibm.com>
2025-03-07 16:05:04 +00:00
Niels de Vos
43b150f14d rbd: return gRPC code Aborted when the RBD-image is in-use on delete
According to the error scheme documented in the CSI specification, the
Aborted error code should initiate retries, whereas the Internal
error code does not require this behaviour.

When an RBD-image is still in-use, it can not be removed. The
DeleteVolume procedure should be retried and will succeed once the
RBD-image is not in-use anymore.

Fixes: #5166
Signed-off-by: Niels de Vos <ndevos@ibm.com>
2025-02-24 11:19:17 +00:00
Niels de Vos
ac8cda5e37 rbd: add validation to ToCSI() for rbdVolume and rbdSnapshot
After an unfortunately timed restart of the provisioner, a volume that got
cloned did not get a `rbdVolume.VolID` set. The `.VolID` is used as the
CSI Volume Handle, and is a required attribute.

The `rbdVolume` and `rbdSnapshot` structs have a `.ToCSI()` function
that can do the validation of required attributes. This is now added,
including unit-tests.

Signed-off-by: Niels de Vos <ndevos@ibm.com>
2025-02-20 10:14:29 +00:00
Niels de Vos
b3faa04504 rbd: always include the SourceVolumeID when returning a Snapshot
`doSnapshotClone()` returns a new `rbdVolume` object from a temporary
snapshot. This conversion drops the `SourceVolumeID` attribute, as a
`rbdVolume` does not have that.

After converting the `rbdVolume` back to a `rbdSnapshot`, the
`SourceVolumeID` attribute can be set again, and the `ToCSI()` function
can create an appropriate CSI Snapshot struct.

Signed-off-by: Niels de Vos <ndevos@ibm.com>
2025-02-20 10:14:29 +00:00
ecosysbin
0907ba98c4 rbd: Update return error message
Issue: When deleting a PV fails, the error message shows '*** Directory not empty ***'

The actual failure reason is 'access denied'.

This commit ensures ceph-csi returns the right error message.

Signed-off-by: ecosysbin <14729934+ecosysbin@user.noreply.gitee.com>
2025-02-19 15:23:21 +00:00
Rakshith R
b05d467679 rbd: fix bug in rbdVol.Exists() in PVC-PVC clone case
This commit fixes a bug in rbdVol.Exists() which caused
VolID not to be set in PVC-PVC clone case.

Signed-off-by: Rakshith R <rar@redhat.com>
2025-02-18 13:05:28 +00:00
Yite Gu
7595e20969 rbd: support QoS based on capacity for rbd volume
1. QoS provides settings for RBD volume read/write IOPS
   and read/write bandwidth.
2. All QoS parameters are placed in the StorageClass (SC) and are
   sent from the SC to Ceph-CSI through the PVC create request.
3. The following QoS parameters need to be provided in the SC:
   - BaseReadIops
   - BaseWriteIops
   - BaseReadBytesPerSecond
   - BaseWriteBytesPerSecond
   - ReadIopsPerGB
   - WriteIopsPerGB
   - ReadBpsPerGB
   - WriteBpsPerGB
   - BaseVolSizeBytes
   Four of these are base QoS parameters. When a user requests
   a volume capacity equal to or less than BaseVolSizeBytes, the base
   QoS limit is used. For the portion of capacity exceeding BaseVolSizeBytes,
   the QoS limit is increased in steps set per GB. If the per-GB step
   parameter is not provided, only the base QoS limit is used, independent
   of the capacity.
4. If the PVC is resized, the QoS limit is adjusted
   according to the QoS parameters after resizing.

Signed-off-by: Yite Gu <guyite@bytedance.com>
2025-02-17 18:25:33 +00:00
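The capacity-based rule from commit 7595e20969 amounts to a small calculation. The sketch below is one possible reading of it, not the Ceph-CSI code: the function name is hypothetical, "GB" is treated as GiB, and the capacity above `BaseVolSizeBytes` is rounded up to whole GiB; the commit message specifies neither of the last two points.

```go
package main

// gib is the number of bytes in one GiB (assumed unit for the per-GB step).
const gib = int64(1) << 30

// qosLimit computes a single limit, for example read IOPS.
//   - base:             the Base* value from the StorageClass
//   - perGB:            the matching *PerGB step (0 when not provided)
//   - baseVolSizeBytes: BaseVolSizeBytes from the StorageClass
//   - volSizeBytes:     requested (or resized) volume capacity
func qosLimit(base, perGB, baseVolSizeBytes, volSizeBytes int64) int64 {
	// Without a per-GB step, or at or below the base size, only the
	// base limit applies.
	if perGB <= 0 || volSizeBytes <= baseVolSizeBytes {
		return base
	}
	extra := volSizeBytes - baseVolSizeBytes
	extraGiB := (extra + gib - 1) / gib // round up (assumption)
	return base + perGB*extraGiB
}
```

With a base of 3000 IOPS, a step of 10 IOPS per GB and a 100 GiB base size, a 150 GiB volume would get 3000 + 10 × 50 = 3500 IOPS under these assumptions. Calling the same function again with the new size covers the resize case.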
Praveen M
e4d41c42d6 rbd: get volumegroup in secondary cluster
Currently, `GetVolumeGroup()` fetches the RBD group from the
pool using the clusterID & poolID encoded in the VolumeGroupHandle.
However, this approach may fail in a secondary mirrored cluster,
where the clusterID & poolID could differ.

This commit ensures that `GetVolumeGroup` leverages the
clusterIDMapping and RBDPoolIDMapping to locate the RBD group in the
appropriate pool if it is not found in the pool corresponding
to the poolID encoded in the VolumeGroupHandle.

Signed-off-by: Praveen M <m.praveen@ibm.com>
2025-02-17 13:33:21 +00:00
Praveen M
cbd73f296d cleanup: move ShouldRetryVolumeGeneration from internal/rbd to internal/util
Signed-off-by: Praveen M <m.praveen@ibm.com>
2025-02-17 13:33:21 +00:00
Praveen M
6414e94401 cleanup: move ErrImageNotFound from rbd/errors to util/errors
Signed-off-by: Praveen M <m.praveen@ibm.com>
2025-02-17 13:33:21 +00:00
Niels de Vos
b1834552c1 cleanup: drop deprecated Rbd prefix from go-ceph rbd.ImageOption*
Signed-off-by: Niels de Vos <ndevos@ibm.com>
2025-01-30 13:27:28 +00:00
Niels de Vos
c905dd863c rbd: format log message correctly
When a `dataPool` is passed while creating a volume, there is a
`%!s(MISSING)` piece added to a debug log message. When concatenating
strings, the `%s` formatter is not needed.

Updates: #5103
Signed-off-by: Niels de Vos <ndevos@ibm.com>
2025-01-30 13:27:28 +00:00
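The `%!s(MISSING)` artifact from commit c905dd863c is easy to reproduce with the standard `fmt` package. In this sketch, `render` is a hypothetical stand-in for a logger that treats its first argument as a format string, and the message text is invented; only the stray `%s` next to a concatenated value mirrors the commit message.

```go
package main

import "fmt"

// render stands in for a logging call that formats its message. It takes
// the arguments as a slice to keep the example independent of any logger.
func render(format string, args []any) string {
	return fmt.Sprintf(format, args...)
}

// buggyMessage concatenates the value but leaves a %s verb in the text,
// so the formatter later finds a verb without a matching argument.
func buggyMessage(dataPool string) string {
	msg := "setting image options"
	msg += ", data pool %s" + dataPool
	return render(msg, nil)
}

// fixedMessage drops the verb, since the value is already concatenated.
func fixedMessage(dataPool string) string {
	msg := "setting image options"
	msg += ", data pool " + dataPool
	return render(msg, nil)
}
```

`buggyMessage("ec-pool")` yields `setting image options, data pool %!s(MISSING)ec-pool`, while `fixedMessage("ec-pool")` yields `setting image options, data pool ec-pool`.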
Praveen M
f83a9f7eb8 rbd: add RegenerateVolumeGroupJournal method for Manager interface
This commit adds `RegenerateVolumeGroupJournal` to the Manager
interface. `RegenerateVolumeGroupJournal` regenerates the omap
data for the volume group.

This performs the following operations:
  - Extracts clusterID and Mons from the cluster mapping
  - Retrieves pool and journalPool parameters from the VolumeGroupReplicationClass
  - Reserves omap data
  - Adds the volumeIDs mapping to the reserved volume group omap object
  - Generates a new volume group handle

Returns the generated volume group handle.

Signed-off-by: Praveen M <m.praveen@ibm.com>
2025-01-28 17:19:32 +00:00
Praveen M
df4d2eb915 journal: pass groupUUID to be used for omap name reserve
This commit adds groupUUID param for `ReserveName` to be used for
OMAP name reserve instead of auto-generating.
This is useful for mirroring and metro-DR ensuring that mirrored
resources have consistent OMAP names across mirrored clusters.

Signed-off-by: Praveen M <m.praveen@ibm.com>
2025-01-28 17:19:32 +00:00
Praveen M
ce767fe891 rbd: rename volumeNamePrefix to volumeGroupNamePrefix
Signed-off-by: Praveen M <m.praveen@ibm.com>
2025-01-28 17:19:32 +00:00
Niels de Vos
ecd15970de cleanup: rename csiID to driverInstance
The attribute and variable `csiID` is used for at least two different
things:

 - name of the driver instance, used for journalling metadata
 - objects of the CSIIdentifier struct, composing a volume-handle

By changing the name of the `csiID` attribute for driver instances to
`driverInstance`, any confusion should be prevented.

Signed-off-by: Niels de Vos <ndevos@ibm.com>
2025-01-28 10:19:58 +00:00
Niels de Vos
af0a223edb csiaddons: use rbd.Manager within ReclaimSpaceControllerServer
Signed-off-by: Niels de Vos <ndevos@ibm.com>
2025-01-28 10:19:58 +00:00
Niels de Vos
6560eee3d8 csiaddons: use rbd.Manager for encryption key rotation
Signed-off-by: Niels de Vos <ndevos@ibm.com>
2025-01-28 10:19:58 +00:00
Niels de Vos
2dd235849e rbd: add sub-types for large Volume type
Introduce `snapshottableVolume` and `csiAddonsVolume` types which group
related functions together.

Signed-off-by: Niels de Vos <ndevos@ibm.com>
2025-01-28 10:19:58 +00:00
Niraj Yadav
c308e096da rbd: Use assume_storage_prezeroed when formatting
Instead of passing lazy_itable_init=1 and lazy_journal_init=1 to
mkfs.ext4, pass assume_storage_prezeroed=1 which is
stronger and allows the filesystem to skip inode table zeroing
completely instead of simply doing it lazily.

The support for this flag is checked by trying to format a fake
temporary image with mkfs.ext4 and checking its STDERR.

Closes: #4948
Signed-off-by: Niraj Yadav <niryadav@redhat.com>
2025-01-24 11:58:33 +00:00
Praveen M
8a66575825 rbd: use correct radosnamespace
Issue: When an RBD image is created in a non-default namespace,
the OMAP data for the PersistentVolume fails to regenerate
because it still attempts to locate the RBD image in the default
namespace.

This commit ensures the correct radosNamespace is retrieved from
the ceph-csi-config.

Signed-off-by: Praveen M <m.praveen@ibm.com>
2025-01-21 16:12:23 +00:00
Praveen M
0cfb2b012b rbd: correct default encryption type
Problem: When the encryptionType is not specified in the StorageClass,
the default type (block) is used and stored in OMAP. However, during
OMAP regeneration in a secondary cluster, the default type is incorrectly
set to none. This discrepancy leads to errors during PVC cloning,
with the message: `cannot create encrypted volume from unencrypted volume.`

Solution: Update the default encryption type to consistently use
block instead of none.

Signed-off-by: Praveen M <m.praveen@ibm.com>
2025-01-17 11:07:26 +00:00
Praveen M
eebfd15e78 rbd: rename groupNamePrefix to volumeGroupNamePrefix
CephFS uses the parameter `volumeGroupNamePrefix` for creating VolumeGroups.
This commit renames `groupNamePrefix` to `volumeGroupNamePrefix` for RBD
VolumeGroup creation to ensure consistent naming.

Signed-off-by: Praveen M <m.praveen@ibm.com>
2025-01-09 11:59:16 +00:00
Praveen M
54a8b50957 ci: non-constant format string (govet)
Signed-off-by: Praveen M <m.praveen@ibm.com>
2025-01-08 11:56:24 +00:00
Praveen M
d46029ca1f ci: address arguments have the wrong order (staticcheck)
Signed-off-by: Praveen M <m.praveen@ibm.com>
2025-01-08 11:56:24 +00:00
Praveen M
ea205410f5 ci: update golangci-lint to v1.62.2
- gomnd is replaced by mnd in v1.58.0
- gosec exclude G115 rule (Potential integer overflow when converting between integer types)
- disable new iface linter
- disable new recvcheck linter

Signed-off-by: Praveen M <m.praveen@ibm.com>
2025-01-08 11:56:24 +00:00
Nikhil-Ladha
18a62ec9de util: return correct status code for VolumeGroupSnapshot
Fix status codes that are returned for Get/Delete RPC calls
for VolumeGroup/VolumeGroupSnapshot.

Signed-off-by: Nikhil-Ladha <nikhilladha1999@gmail.com>
2024-12-19 10:42:01 +00:00
Rakshith R
50b2a0528e rbd: add layering & deep-flatten features for groupsnapshot image
Signed-off-by: Rakshith R <rar@redhat.com>
2024-12-17 15:15:42 +00:00
Rakshith R
09d848e017 rbd: make use of both listSnapshots and listChildren
Currently, CephCSI only uses listSnaps to determine the
number of snapshots on an RBD image and uses the snapshot
names as child image names to flatten them.
But child images may have a different name (in the case of a
group snapshot) or they may be in trash
(deleted k8s VolSnapshot with an alive restored PVC).

The above problems are avoided by making use of both
the snap and child image lists.

Signed-off-by: Rakshith R <rar@redhat.com>
2024-12-17 15:15:42 +00:00
Rakshith R
9936033283 rbd: consolidate snapshot flatten logic in PrepareVolumeForSnapshot()
This commit consolidates flatten logic checks for cloneDepth
and snapshotLimit in PrepareVolumeForSnapshot. This allows
the function to be called for both CreateSnapshot and
CreateVolumeGroupSnapshot.
Clone Depth check and flattening of grand parent image
now occurs before creation of snapshot starts.
This aligns better with how PVC-PVC clone and
PVC-restore process occurs currently.
Flattening the grandparent image once prevents
flattening of every newly created snapshot.
Snapshot in above para refers to k8s VolumeSnapshot
(which is backed by a rbd image).

Signed-off-by: Rakshith R <rar@redhat.com>
2024-12-17 15:15:42 +00:00
Praveen M
51d0a08112 rbd: fix volumeGroup UndoReservation
This commit fixes the VolumeGroup UndoReservation
by using the correct RequestName of the VolumeGroup
instead of the volumeGroupHandle.

Signed-off-by: Praveen M <m.praveen@ibm.com>
2024-12-16 13:36:22 +00:00
Praveen M
797eceebb2 rbd: add rbdSnap.Delete() function
This function deletes rbd snap and rbd image
backing k8s snapshot.
The same function is used for deleting
individual snapshots in group snapshot.

Signed-off-by: Praveen M <m.praveen@ibm.com>
2024-12-16 13:36:22 +00:00
Nikhil-Ladha
c7d54ab776 rbd: return group not found error for Get,Delete RPC calls
We should return a NotFound status if the group doesn't exist
for the ControllerGetVolumeGroup RPC call,
and an empty/OK response for DeleteVolumeGroup if the group
doesn't exist.

Signed-off-by: Nikhil-Ladha <nikhilladha1999@gmail.com>
2024-12-12 22:50:10 +00:00
Madhu Rajanna
00d252e4ac rbd: use os.Remove to remove directory
Using os.RemoveAll removes everything
in the directory. After the unmount, we should
use os.Remove, which only removes an empty
directory.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2024-11-21 10:18:56 +00:00
Madhu Rajanna
b6bd8ca71a rbd: take lock on targetpath during node operation
We should not depend on the CO to
serialize requests. Instead,
we need our own internal locks to ensure
that we don't run concurrent operations for the same
request.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2024-11-21 10:18:56 +00:00
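An internal per-path lock of the kind commit b6bd8ca71a describes could be sketched as a keyed try-lock. The `pathLocks` type and its methods are hypothetical and are not the Ceph-CSI locking helper; the sketch only shows the pattern of rejecting a second operation on the same target path while the first is still running.

```go
package main

import "sync"

// pathLocks is a hypothetical set of per-path try-locks.
type pathLocks struct {
	mu    sync.Mutex
	inUse map[string]struct{}
}

func newPathLocks() *pathLocks {
	return &pathLocks{inUse: make(map[string]struct{})}
}

// TryAcquire marks path as busy and returns true, or returns false when
// another operation on the same path has not been released yet.
func (l *pathLocks) TryAcquire(path string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if _, busy := l.inUse[path]; busy {
		return false
	}
	l.inUse[path] = struct{}{}
	return true
}

// Release marks path as free again.
func (l *pathLocks) Release(path string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	delete(l.inUse, path)
}
```

A node operation would call `TryAcquire(targetPath)` first, return a retryable error when it yields false, and `defer Release(targetPath)` otherwise.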
Rakshith R
d457840d21 rbd: set depthToAvoidFlatten to 3 during PVC-PVC clone
During PVC-PVC clone creation, parent of the datasource
image is flattened after checking for clone depth.
We need to account for data source image as well since
we're calculating depth from the parent image.
depthToAvoidFlatten = 3 (datasource image + temp + final clone)

Signed-off-by: Rakshith R <rar@redhat.com>
2024-11-19 11:34:34 +00:00
Rakshith R
eea64fe1f9 rbd: remove checkFlatten() function
CephCSI should not flatten an image that can be mounted
for use by the user.
`checkFlatten()` was called in a recovery code flow
of a PVC restored from a snapshot and was missed while
refactoring in https://github.com/ceph/ceph-csi/pull/2900

refer: #2900

Signed-off-by: Rakshith R <rar@redhat.com>
2024-11-19 11:34:34 +00:00
Niels de Vos
d98516e9d8 rbd: add locking for VolumeGroupSnapshot operations
Add VolumeGroupLocks in the CSI Controller Server so that operations are
protected against concurrent requests for the same VolumeGroupSnapshot.

Signed-off-by: Niels de Vos <ndevos@ibm.com>
2024-11-12 09:28:30 +00:00
Niels de Vos
f3d40f9e5a rbd: cleanup inconsistent state in reserveSnap() after a failure
`reserveSnap()` can potentially fail halfway through, in that case it
needs to undo the snapshot reservation and restore modified attributes
of the snapshot.

Fixes: #4945
Signed-off-by: Niels de Vos <ndevos@ibm.com>
2024-11-11 13:39:05 +00:00
Niels de Vos
cea8bf8110 rbd: set SnapshotGroupID on each Snapshot of a VolumeGroupSnapshot
Without the SnapshotGroupID in the Snapshot object, Kubernetes CSI does
not know that the Snapshot belongs to a group. In that case, it allows
the deletion of the Snapshot, which should be denied.

Signed-off-by: Niels de Vos <ndevos@ibm.com>
2024-11-06 11:37:44 +00:00
Niels de Vos
ec1e7a4ee0 rbd: expose the GroupControllerService
When the GroupSnapGetInfo go-ceph function is supported by librbd, the
Group Controller Service and VolumeGroupSnapshot capabilities can be
exposed to the Container Orchestrator.

Signed-off-by: Niels de Vos <ndevos@ibm.com>
2024-11-06 11:37:44 +00:00