ceph-csi

mirror of https://github.com/ceph/ceph-csi.git synced 2025-05-31 11:06:42 +00:00

Author	SHA1	Message	Date
Nikhil-Ladha	706cd88065	rbd: improve logging for rpc calls added logging of reqID for volume group rpc calls. Also, added logs for replication rpc calls which are helpful during debugging of issues related to failover/relocate. Signed-off-by: Nikhil-Ladha <nikhilladha1999@gmail.com>	2025-05-19 12:41:09 +00:00
Niels de Vos	4db8b6222c	deploy: add `-automaxprocs` to reduce CPU and memory resources With the new `-automaxprocs` commandline parameter, Ceph-CSI will adjust the GOMAXPROCS environment variable for the Golang runtime. The values are based on the CPU quota that is given to the process. This can reduce the number of threads that the Golang runtime spawns, which affects the required amount of memory as well. Updates: #5228 Signed-off-by: Niels de Vos <ndevos@ibm.com>	2025-05-07 18:24:57 +00:00
Bart Laarhoven	fdb4002298	util: run cryptsetup with "-d -" instead of "-d /dev/stdin" Make sure that cryptsetup runs with the correct parameters to fix issues in some environments. Signed-off-by: Bart Laarhoven <bartlaarhoven@users.noreply.github.com>	2025-05-07 09:44:19 +00:00
Niels de Vos	bc8b1e792f	cleanup: address golangci 'errcheck' issues Many reports are about closing or removing files. In some cases it is possible to report an error in the logs, in other cases the error can be ignored without potential issues. Test cases have been modified to not remove the temporary files. The temporary directory that is provided by the testing package, is removed once the tests are done. Signed-off-by: Niels de Vos <ndevos@ibm.com>	2025-05-06 11:26:06 +00:00
Niels de Vos	0a22e3a186	cleanup: address golangci 'funcorder' linter problems The new 'funcorder' linter expects all public functions to be placed before private functions of a struct. Many private functions needed moving further down into their files. Some files had many issues reported. To reduce the churn in those files, they have been annotated with a `//nolint:funcorder` comment. Signed-off-by: Niels de Vos <ndevos@ibm.com>	2025-05-06 11:26:06 +00:00
Niels de Vos	0907f39d95	cleanup: address golangci 'nilnesserr' issue Inside VolumeGroupServer.ModifyVolumeGroupMembership() there is an error used in a return that could be `nil`. It seems the error that was returned isn't the correct one, and another error object should have been returned instead. Signed-off-by: Niels de Vos <ndevos@ibm.com>	2025-05-06 11:26:06 +00:00
Niels de Vos	4ffa1d6c89	cleanup: address golangci 'gosec' issues The golangci 'gosec' linter complains about permissions that could be more secure. These have been modified or annotated. Signed-off-by: Niels de Vos <ndevos@ibm.com>	2025-05-06 11:26:06 +00:00
Niels de Vos	5941371c4b	cleanup: address golangci 'nolintlint' issue There is no issue with importing "github.com/golang/protobuf/proto" anymore, the lint annotation can be removed. Signed-off-by: Niels de Vos <ndevos@ibm.com>	2025-05-06 11:26:06 +00:00
Niels de Vos	edb962dc46	cleanup: address golangci 'godot' complaints There are a few comments that do not pass through the 'godot' linter. Signed-off-by: Niels de Vos <ndevos@ibm.com>	2025-05-06 11:26:06 +00:00
Niels de Vos	300acd6fb9	cleanup: address golangci 'testifylint' issues The new 'testify' linter complains about incorrect usage of Equal() and similar calls. When comparing to an empty value, Empty() should be used instead. Signed-off-by: Niels de Vos <ndevos@ibm.com>	2025-05-06 11:26:06 +00:00
Niels de Vos	457ffe884a	cleanup: address golangci 'usetesting' linter issues When a context.Context is needed in a unit test, t.Context() should be used instead of creating a new one with context.TODO() or context.Background(). Signed-off-by: Niels de Vos <ndevos@ibm.com>	2025-05-06 11:26:06 +00:00
Niels de Vos	86576b4e11	rbd: include a delay and check for `syncing` status after resyncing It may take some time for the RBD-mirror daemon to start syncing the image. After the resync operation is executed, the status of the resync is checked with a small delay to prevent subsequent resync calls from re-starting the resync quickly after each other. Signed-off-by: Niels de Vos <ndevos@ibm.com>	2025-05-05 17:28:58 +00:00
Niels de Vos	b0994a5356	rbd: split mirror functions off into rbdMirror type Signed-off-by: Niels de Vos <ndevos@ibm.com>	2025-05-05 17:28:58 +00:00
Niels de Vos	af4431f60b	rbd: prevent calling mirror.Resync() if the mirror is syncing Signed-off-by: Niels de Vos <ndevos@ibm.com>	2025-05-05 17:28:58 +00:00
Niels de Vos	04257464bb	rbd: cleanup mirror last-sync-info parsing By introducing a SyncInfo interface, the `getLastSyncInfo` function can be removed from the csi-addons/rbd package. Getting details from an RBD mirror status, should be part of the main internal/rbd package, the CSI-Addons components should only use the internal API, and not add any parsing logic. This makes it easier to enhance the SyncInfo interface in the future. Signed-off-by: Niels de Vos <ndevos@ibm.com>	2025-05-05 17:28:58 +00:00
Niels de Vos	31da09863e	rbd: do not resize read-only volumes while staging Volumes that were requested with a read-only capability should not be resized. Reported-by: Alex Kalenyuk <akalenyu@redhat.com> Signed-off-by: Niels de Vos <ndevos@ibm.com>	2025-04-30 13:54:53 +00:00
Niels de Vos	ea7be34396	rbd: use helper functions from csi-common for VolumeCapability checking The internal/csi-common package offers helper functions like `IsReaderOnly()` and `IsBlockMultiNode()`. These should be used instead of checking the VolumeCapability that is passed in a request in different places. This also suggested that adding the "ro" mount option in `NodeServer.mountVolumeToStagePath()` is not appropriate, as the csi-common helper `ConstructMountOptions()` can take care of that already too. Signed-off-by: Niels de Vos <ndevos@ibm.com>	2025-04-30 13:54:53 +00:00
Rakshith R	86f2ad9e0d	util: fix bug in health checker This commit fixes a bug in health checker that caused shared checker to get keyed with volumeID+volumepath instead of just volumeID and the other way around for non-shared checkers. Signed-off-by: Rakshith R <rar@redhat.com>	2025-04-14 14:55:56 +00:00
Niels de Vos	437d90c84d	rbd: do not start the healer for NBD on non-Kubernetes platforms When running on Docker Swarm, the RBD-healer fails with an error like: > healer had failures, err failed to get cluster config: unable to load > in-cluster configuration, KUBERNETES_SERVICE_HOST and > KUBERNETES_SERVICE_PORT must be defined Before starting the healer, check if we're running on Kubernetes, so that non-Kubernetes platforms do not get confusing warnings. Updates: #3769 Signed-off-by: Niels de Vos <ndevos@ibm.com>	2025-04-02 13:59:11 +00:00
monoamin	71decb822d	rbd: Register FenceController only once Running cephcsi in docker swarm currently requires serving both the nodeserver and controllerserver over the same socket. This leads to errors like > FATAL: [core] grpc: Server.RegisterService found duplicate > service registration for \"fence.FenceController\"" ...since `FenceController` is registered once per server type. This commit proposes a simple fix by registering `FenceController` only once when at least one of `IsControllerServer` or `IsNodeServer` is `true`. Signed-off-by: monoamin <precision1998@gmail.com>	2025-04-01 16:21:40 +00:00
Nikhil-Ladha	23fce43925	rbd: cleanup volume info from group even if image is not part of group We should continue to clean up the volume info, like the omap data and mappings, from the group if the image is not part of the group anymore. Signed-off-by: Nikhil-Ladha <nikhilladha1999@gmail.com>	2025-04-01 12:34:03 +00:00
Niels de Vos	3f33e87e70	rbd: improve the description for GetID() and GetName() interfaces The `GetID()` and `GetName()` functions can be confusing, as names and IDs are not always distinctive enough. The name is used to reference an object that exists in a pool. The ID is formatted as the CSI handle and can be used to locate the entry for the object in the journal. Signed-off-by: Niels de Vos <ndevos@ibm.com>	2025-03-27 14:09:44 +00:00
Niels de Vos	63df17171a	rbd: use the existing VolumeGroup if contents are matching When a VolumeGroup has been created through the CSI-Addons API, the VolumeGroupSnapshot CSI API will now use the existing VolumeGroup. There are checks in place to validate that the Volumes in the VolumeGroup match the Volumes in the VolumeGroupSnapshot request. Signed-off-by: Niels de Vos <ndevos@ibm.com>	2025-03-27 14:09:44 +00:00
Niels de Vos	e489413dbd	rbd: introduce functions for comparing Volumes in a VolumeGroup CompareVolumesInGroup() verifies that all the volumes are part of the given VolumeGroup. It does so by obtaining the VolumeGroupID for each volume with GetVolumeGroupByID(). The helper VolumesInSameGroup() verifies that all volumes belong to the same (or no) VolumeGroup. It can be called by CSI(-Addons) procedures before acting on a VolumeGroup. Signed-off-by: Niels de Vos <ndevos@ibm.com>	2025-03-27 14:09:44 +00:00
Niels de Vos	32285c8365	rbd: add MakeVolumeGroupID() utility function The Manager.MakeVolumeGroupID() function can be used to build a CSI VolumeGroupID from the backend (pool and name of the RBD-group). This will be used when checking if an RBD-image belongs to a group already. It is also possible to resolve the VolumeGroup by passing the VolumeGroupID to the existing Manager.GetVolumeGroupByID() function. Signed-off-by: Niels de Vos <ndevos@ibm.com>	2025-03-27 14:09:44 +00:00
Niels de Vos	a8ee0fe304	rbd: add Manager.getVolumeGroupNamePrefix() The `prefix` is passed to several functions, but it can easily be obtained with a small helper function. This makes calling the functions a little simpler. Signed-off-by: Niels de Vos <ndevos@ibm.com>	2025-03-27 14:09:44 +00:00
Niels de Vos	45c91ab0f1	rbd: prevent panic in CreateVolumeGroup if volumeID was not found When an incorrect volumeID is passed while creating a VolumeGroup, the `.Destroy()` function caused a panic. By appending each volume to the volumes slice, the slice won't contain any `nil` volumes anymore. Signed-off-by: Niels de Vos <ndevos@ibm.com>	2025-03-27 14:09:44 +00:00
Praveen M	add4b36900	cleanup: move Destroy() method to journalledObject interface VolumeGroup interface has more than 10 methods and it causes golangci-lint to fail. Moving the `Destroy()` method to a base interface journalledObject. Signed-off-by: Praveen M <m.praveen@ibm.com>	2025-03-27 09:59:12 +00:00
Praveen M	8d9f353f15	rbd: check for volume group existence Signed-off-by: Praveen M <m.praveen@ibm.com>	2025-03-27 09:59:12 +00:00
Praveen M	5cbc14454a	cleanup: move internal/rbd/errors.go to internal/rbd/errors package Signed-off-by: Praveen M <m.praveen@ibm.com>	2025-03-27 09:59:12 +00:00
mageekchiu	0c60fd28ea	cephfs: upgrading mount syntax The old syntax is almost deprecated, and there are reasons to upgrade it - the old syntax lacks the fsid (critical for debugging and observability) - mds_namespace is deprecated, it might be inappropriate to continue using it - the kernel will try the new syntax first and then the old one, which is a waste Signed-off-by: mageekchiu <qiukang@mail.ustc.edu.cn>	2025-03-25 14:39:22 +00:00
Praveen M	0ed0af120b	rbd: retain intermediate RBD snapshot on temp image Currently, Ceph-CSI deletes intermediate RBD snapshot on temporary cloned images (`csi-vol-xxxx-temp@csi-vol-xxxx`) which is the parent of the final clone image. The parent-child mirroring requires both the parent and child images to be present (i.e, not in trash). This commit makes enhancement to `createRBDClone` function by introducing `deleteSnap` parameter. If `deleteSnap` is true, the snapshot is deleted after the clone is created. This is required to support mirroring of child image with its parent image. Signed-off-by: Praveen M <m.praveen@ibm.com>	2025-03-18 13:42:11 +00:00
Rakshith R	6f802589aa	rbd: add one depth for softlimit of snapshot for restore PVC Currently, while preparing a volume for snapshot, the depthToAvoidFlatten is set to 2. This accounts one for snapshot and another since parent of the volume is flattened. This commit modifies the depth to 3 to also account for future PVC restore since - snapshot alone is useless and it is very likely to be restored at one point in time. - this ensures snapshot is not flattened when restore does occur. - flattening of snapshot in the above case will make the snapshot no longer eligible for changed block tracking (snap diff) operation. - maintain similarity with PVC-PVC clone operation which currently has depthToAvoidFlatten set to 1. Signed-off-by: Rakshith R <rar@redhat.com>	2025-03-14 15:12:27 +00:00
Niels de Vos	7f7988be0d	rbd: cleanup NodeServer.createTargetMountPath() The inverse checking and returning of is-a-mounted-path makes it difficult to understand the function. It is easier to follow the code when the function just returns what it says it does, hence added the comment for the function too. Some errors were returned directly, others were converted to gRPC errors. This has been corrected now too, and the caller converts the plain error to a gRPC error now. Signed-off-by: Niels de Vos <ndevos@ibm.com>	2025-03-14 10:27:13 +00:00
Niels de Vos	79cf0321dd	util: do not use `mount-utils.IsLikelyNotMountPoint` anymore `IsLikelyNotMountPoint()` is an optimized version of `IsMountPoint()` which can not detect all types of mounts (anymore). The slower `IsMountPoint()` is safer to use. This can cause a slight performance regression in the case there are many mountpoints on the system, but correctness is more important than speed while mounting. Fixes: #4633 Signed-off-by: Niels de Vos <ndevos@ibm.com>	2025-03-14 10:27:13 +00:00
Rakshith R	796e6b6c44	rbd: use ListChildrenAttributes() instead of ListChildren() This commit modifies listSnapAndChildren() to make use of ListChildrenAttributes() instead of ListChildren() which allows us to filter out images in trash. This commit also orders the alive images so that temp clone images are followed by images backing volume snapshots so that temp clone images are flattened first. Signed-off-by: Rakshith R <rar@redhat.com>	2025-03-12 08:51:02 +00:00
Niels de Vos	15da101b1b	util: move kernel version functions to pkg/util/kernel Signed-off-by: Niels de Vos <ndevos@ibm.com>	2025-03-07 16:05:04 +00:00
Niels de Vos	542ed3de63	util: move EncryptionType(s) to pkg/util/crypto Signed-off-by: Niels de Vos <ndevos@ibm.com>	2025-03-07 16:05:04 +00:00
Zerotens	5b587c9484	rbd: fix encrypted PVC with metadata KMS cannot be deleted Signed-off-by: Zerotens <12968743+zerotens@users.noreply.github.com>	2025-02-25 13:51:42 +00:00
Niels de Vos	43b150f14d	rbd: return gRPC code `Aborted` when the RBD-image is in-use on delete According to the error scheme documented in the CSI specification, the Aborted error code should initiate retries, whereas the Internal error code does not require this behaviour. When an RBD-image is still in-use, it can not be removed. The DeleteVolume procedure should be retried and will succeed once the RBD-image is not in-use anymore. Fixes: #5166 Signed-off-by: Niels de Vos <ndevos@ibm.com>	2025-02-24 11:19:17 +00:00
Niels de Vos	ac8cda5e37	rbd: add validation to ToCSI() for rbdVolume and rbdSnapshot After an unfortunately timed restart of the provisioner, a volume that got cloned did not get a `rbdVolume.VolID` set. The `.VolID` is used as the CSI Volume Handle, and is a required attribute. The `rbdVolume` and `rbdSnapshot` structs have a `.ToCSI()` function that can do the validation of required attributes. This is now added, including unit-tests. Signed-off-by: Niels de Vos <ndevos@ibm.com>	2025-02-20 10:14:29 +00:00
Niels de Vos	b3faa04504	rbd: always include the `SourceVolumeID` when returning a Snapshot `doSnapshotClone()` returns a new `rbdVolume` object from a temporary snapshot. This conversion drops the `SourceVolumeID` attribute, as a `rbdVolume` does not have that. After converting the `rbdVolume` back to a `rbdSnapshot`, the `SourceVolumeID` attribute can be set again, and the `ToCSI()` function can create an appropriate CSI Snapshot struct. Signed-off-by: Niels de Vos <ndevos@ibm.com>	2025-02-20 10:14:29 +00:00
ecosysbin	0907ba98c4	rbd: Update return error message Issue: When deleting a PV failed, the error message shows '* Directory not empty *' while the actual failure reason is 'access denied'. This commit ensures ceph-csi returns the right error message. Signed-off-by: ecosysbin <14729934+ecosysbin@user.noreply.gitee.com>	2025-02-19 15:23:21 +00:00
Rakshith R	b05d467679	rbd: fix bug in rbdVol.Exists() in PVC-PVC clone case This commit fixes a bug in rbdVol.Exists() which caused VolID not to be set in PVC-PVC clone case. Signed-off-by: Rakshith R <rar@redhat.com>	2025-02-18 13:05:28 +00:00
Yite Gu	7595e20969	rbd: support QoS based on capacity for rbd volume 1. QoS provides settings for rbd volume read/write iops and read/write bandwidth. 2. All QoS parameters are placed in the SC and sent from the SC to Cephcsi through the PVC create request. 3. We need to provide QoS parameters in the SC as below: - BaseReadIops - BaseWriteIops - BaseReadBytesPerSecond - BaseWriteBytesPerSecond - ReadIopsPerGB - WriteIopsPerGB - ReadBpsPerGB - WriteBpsPerGB - BaseVolSizeBytes There are 4 base QoS parameters among them. When users apply for a volume capacity equal to or less than BaseVolSizeBytes, the base QoS limit is used. For the portion of capacity exceeding BaseVolSizeBytes, QoS will be increased in steps set per GB. If the step size parameter per GB is not provided, only the base QoS limit will be used, and it is not associated with capacity size. 4. If the PVC has a resize request, the QoS limit is adjusted according to the QoS parameters after resizing. Signed-off-by: Yite Gu <guyite@bytedance.com>	2025-02-17 18:25:33 +00:00
Praveen M	e4d41c42d6	rbd: get volumegroup in secondary cluster Currently, `GetVolumeGroup()` fetches the RBD group from the pool using the clusterID & poolID encoded in the VolumeGroupHandle. However, this approach may fail in a secondary mirrored cluster, where the clusterID & poolID could differ. This commit ensures that `GetVolumeGroup` leverages the clusterIDMapping and RBDPoolIDMapping to locate the RBD group in the appropriate pool if it is not found in the pool corresponding to the poolID encoded in the VolumeGroupHandle. Signed-off-by: Praveen M <m.praveen@ibm.com>	2025-02-17 13:33:21 +00:00
Praveen M	cbd73f296d	cleanup: move ShouldRetryVolumeGeneration from internal/rbd to internal/util Signed-off-by: Praveen M <m.praveen@ibm.com>	2025-02-17 13:33:21 +00:00
Praveen M	6414e94401	cleanup: move ErrImageNotFound from rbd/errors to util/errors Signed-off-by: Praveen M <m.praveen@ibm.com>	2025-02-17 13:33:21 +00:00
Niels de Vos	b1834552c1	cleanup: drop deprecated `Rbd` prefix from go-ceph `rbd.ImageOption*` Signed-off-by: Niels de Vos <ndevos@ibm.com>	2025-01-30 13:27:28 +00:00
Niels de Vos	c905dd863c	rbd: format log message correctly When a `dataPool` is passed while creating a volume, there is a `%!s(MISSING)` piece added to a debug log message. When concatenating strings, the `%s` formatter is not needed. Updates: #5103 Signed-off-by: Niels de Vos <ndevos@ibm.com>	2025-01-30 13:27:28 +00:00
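
The capacity-based QoS rule described in commit 7595e20969 can be illustrated with a small sketch. This is not the Ceph-CSI implementation: the helper name `scaledLimit`, the treatment of "per GB" as per GiB, and rounding the extra capacity up to whole GiB are all assumptions made for illustration. Only the rule itself (base limit up to `BaseVolSizeBytes`, a per-GB step above it, base limit only when no step is given) comes from the commit message.

```go
package main

import "fmt"

// gib is the number of bytes in one GiB. Treating "per GB" as per GiB is an
// assumption of this sketch.
const gib uint64 = 1 << 30

// scaledLimit is a hypothetical helper. It returns the base limit for volumes
// up to baseVolSizeBytes and adds perGB for every (started) GiB above that.
// When perGB is 0 (step not provided), only the base limit applies.
func scaledLimit(base, perGB, baseVolSizeBytes, volSizeBytes uint64) uint64 {
	if perGB == 0 || volSizeBytes <= baseVolSizeBytes {
		return base
	}
	extraBytes := volSizeBytes - baseVolSizeBytes
	// Rounding up to whole GiB is an assumption; the commit message does not
	// say how partial GiB are handled.
	extraGB := (extraBytes + gib - 1) / gib
	return base + extraGB*perGB
}

func main() {
	// Example values standing in for BaseReadIops=2000, ReadIopsPerGB=100 and
	// BaseVolSizeBytes=20 GiB.
	baseVolSize := 20 * gib
	for _, size := range []uint64{10 * gib, 20 * gib, 50 * gib} {
		limit := scaledLimit(2000, 100, baseVolSize, size)
		fmt.Printf("%d GiB -> %d read IOPS\n", size/gib, limit)
	}
}
```

The same function could be applied again after a resize, since the commit message says the limit is adjusted according to the QoS parameters once the volume has grown.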


1366 Commits
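
Commit 4db8b6222c describes deriving GOMAXPROCS from the CPU quota given to the process. A minimal sketch of that idea follows. It does not show how Ceph-CSI reads the quota or which library it uses; the helper `maxProcsFromQuota`, its rounding-down behaviour, and the example quota values are assumptions for illustration.

```go
package main

import (
	"fmt"
	"runtime"
)

// maxProcsFromQuota is a hypothetical helper. It converts a CPU quota
// (microseconds of CPU time allowed per period) into a GOMAXPROCS value.
// A non-positive quota or period is treated as "no limit".
func maxProcsFromQuota(quotaMicros, periodMicros int64, numCPU int) int {
	if quotaMicros <= 0 || periodMicros <= 0 {
		return numCPU
	}
	// Rounding down is an assumption of this sketch.
	procs := int(quotaMicros / periodMicros)
	if procs < 1 {
		procs = 1 // the runtime needs at least one processor
	}
	if procs > numCPU {
		procs = numCPU // never exceed the CPUs that are actually present
	}
	return procs
}

func main() {
	// Example quota: 250ms of CPU time per 100ms period, i.e. 2.5 CPUs.
	procs := maxProcsFromQuota(250_000, 100_000, runtime.NumCPU())
	runtime.GOMAXPROCS(procs)
	fmt.Println("GOMAXPROCS set to", runtime.GOMAXPROCS(0))
}
```

With a lower GOMAXPROCS the runtime schedules goroutines on fewer processors, which is the effect on threads and memory that the commit message refers to.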