The go-ceph rbd package provides the GroupSnapGetInfo function, but it
may return ErrUnsupported when called. Returning this error after
advertising the support for VolumeGroupSnapshot seems ugly.
In order to advertise support for VolumeGroupSnapshot,
SupportsGroupSnapGetInfo() can be used, which detects the required C
function of librbd.
Signed-off-by: Niels de Vos <ndevos@ibm.com>
`ensureImageCleanup()` can cause a panic when an image was deleted, but
the journal still contained a reference. By opening the IOContext before
using, an error may be returned instead of a panic when using a `nil` or
freed IOContext.
Signed-off-by: Niels de Vos <ndevos@ibm.com>
The go-ceph rbd.GroupCreate() now returns ErrExist in case the group
that is created, already exists. The previous check only ever matched
the string comparison, which is prone to errors in case the contents is
modified by go-ceph.
Signed-off-by: Niels de Vos <ndevos@ibm.com>
Incase of RDR with restricted access the
ceph user will not have access to all the objects
or all the pools where mapping exists
This commits add a check to continue to get
the volume if there is a permission error
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
The `repairImageID()` function is useful for the `rbdSnapshot` objects
as well. Move it to the `rbdImage` struct that is the base for both
`rbdVolume` and `rbdSnapshot`.
Signed-off-by: Niels de Vos <ndevos@ibm.com>
There is no need for the `Manager.DeleteVolumeGroup()` function as
`VolumeGroup.Delete()` should cover everything too.
By moving the `.Delete()` functionality of removing the group from the
journal to the shared `commonVolumeGroup` type, a volume group snaphot
can use it as well.
Signed-off-by: Niels de Vos <ndevos@ibm.com>
For core K8s API objects like Pods, Nodes, etc., we
can use protobuf encoding which reduces CPU consumption
related to (de)serialization, reduces overall latency
of the API call, reduces memory footprint, reduces the
amount of work performed by the GC and results in quicker
propagation of objects to event handlers of shared informers.
Signed-off-by: Nikhil-Ladha <nikhilladha1999@gmail.com>
When an `.Destroy()` is called on an rbdImage (or rbdVolume or
rbdSnapshot), the IOContext, Connection and other attributes are
invalid. When using a destroyed resource that points to an object that
was allocated through librbd, the process most likely ends with a panic.
Signed-off-by: Niels de Vos <ndevos@ibm.com>
This commit adds a gRPC middleware that logs calls that
keep running after their deadline.
Adds --logslowopinterval cmdline argument to pass the log rate.
Signed-off-by: Robert Vasek <robert.vasek@clyso.com>
When an `rbdVolume` or `rbdSnapshot` is not connected with credentials
to the Ceph cluster, operations may try to get the IOContext which then
causes a panic.
Signed-off-by: Niels de Vos <ndevos@ibm.com>
A function called `setImageOptions()` is expected to set the passed
options on the volume. However, the passed options parameter is only
filled with the options that should get set on the RBD-image at the time
of creation.
The naming of the function, and it's parameter is confusing. Rename the
function to `constructImageOptions()` and return the ImageOptions to
make it easier to understand.
Signed-off-by: Niels de Vos <ndevos@ibm.com>
While dealing with CephFS fencing we evict the
clients and block the IPs from the CIDR range
that do not have any active clients individually.
While Unfencing, the IP is removed via the
CIDR range which fails to remove the individual
IPs from Ceph's blacklist.
This PR fetches the blocklist from ceph and
removes the IPs in blocklist that lie inside
the CIDR range along with their unique nonces.
Signed-off-by: Niraj Yadav <niryadav@redhat.com>
rbd nodeserver is already setting
volume condition in NodeGetVolumeStats
RPC call but the cap is not updated
for it, This PR advertise the
VOLUME_CONDITION
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Many functions that are implemented for the volumeGroup type can be
shared with the (coming) volumeGroupSnapshot type. Move these functions
into a commonVolumeGroup type, so that volumeGroup and
volumeGroupSnapshot can inherit them.
Signed-off-by: Niels de Vos <ndevos@ibm.com>
This commit fixes the issue where the `csiCreationTimeKey`
field was missing during the rebuilding of the
`VolumeGroupJournalConfig` struct in the `Connect()` method,
which led to the `csi.creationtime` key not being stored in
the omap.
Signed-off-by: Praveen M <m.praveen@ibm.com>
After cloning the RBD snapshot, an rbdVolume is returned for the
CSI.Snapshot object. In order to use the rbdSnapshot.ToCSI() function,
the rbdVolume needs to be converted (back) to an rbdSnaphot.
Signed-off-by: Niels de Vos <ndevos@ibm.com>
There has been some confusion about using different variables for the
InstanceID of the RBD-driver. By removing the global variable
CSIInstanceID, there should be no confusion anymore what variable to
use.
Signed-off-by: Niels de Vos <ndevos@ibm.com>
updated the group stringer method
to have pool and namespace for
proper debugging/logging and to
use it with CLI as agrument as well.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
GetVolumeByID already returning detailed
error message, the caller just need to return
it. No need to add duplicate details to error
message.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
in ModifyVolumeGroupMembership RPC call,
flatten the required images before adding it
to the group or else if the parent is not
mirror enabled adding a child to the group
will fail.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
updating HandleParentImageExistence function
to return more details error which includes
the pool/namespace/image name
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
This commit adds support for flattenMode option
for volumegroup.
If the flattenMode is set to "force" in
volumegroupreplicationclass parameters,
cephcsi will add a task to flatten the image
if it has parent before adding it to the group.
This enable cephcsi to then mirror such images
after flattening them.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
updated GetVolumeByID to return more
descriptive error so that caller no
need to add more details in
the error message.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
This commit adds locks on reclaimspace operations to
prevent multiple process executing rbd sparsify/fstrim
on same volume.
Signed-off-by: Praveen M <m.praveen@ibm.com>
With the ControllerGetVolumeGroup operation the caller can verify that a
VolumeGroup exists, and validate the volumes that are part of it.
Signed-off-by: Niels de Vos <ndevos@ibm.com>
There was a discrepancy between the objectId
when creating the lock and when releasing the lock
this caused every lock to hang.
Signed-off-by: NymanRobin <robin.nyman@est.tech>
It seems to be possible that the UUID was found, but the name is not
set. Checking on UUID makes the CreateVolumeGroup operation more
idempotent.
Signed-off-by: Niels de Vos <ndevos@ibm.com>
The ModifyVolumeGroupMembership operation can be used to change the
volumes that are part of a VolumeGroup. Only empty VolumeGroups can be
removed, this operation is required to make that possible.
Signed-off-by: Niels de Vos <ndevos@ibm.com>
A RBD image can only be part of a single group. While an image is added
to a group, check if the image is already part of a group, and return an
error in case it is.
Signed-off-by: Niels de Vos <ndevos@ibm.com>
Add extra error checking to make sure trying to create an existing
volume group does not result in a failure. The same counts for deleting
a non-existing volume group, and adding/removing volumes to/from the
volume group.
Signed-off-by: Niels de Vos <ndevos@ibm.com>
VolumeGroupJournalConnection is not used outside the internal/journal
package. There is no need to expose the type outside of the package, it
causes only confusion about the usage of the journalling API.
Signed-off-by: Niels de Vos <ndevos@ibm.com>
This patch allows to avoid hanging mutex lock scenario when
fscrypt fails to unlock. Prevents uncessary delays
Signed-off-by: Sunnatillo <sunnat.samadov@est.tech>
This commit resolves a bug where node labels with empty values
are processed for the crush_location mount option,
leading to invalid mount options and subsequent mount failures.
Signed-off-by: Praveen M <m.praveen@ibm.com>
The way fscrypt client handles metadata and policy creation
causing errors when multiple instances start simultaneously.
This commit adds a lock to ensure the initial setup
completes correctly, preventing race conditions and
mismatches.
Signed-off-by: Sunnatillo <sunnat.samadov@est.tech>
A VolumeGroup CSI-Addons object contains a list of CSI Volumes. A
ToCSI() function makes creating such a list much simpler.
Signed-off-by: Niels de Vos <ndevos@ibm.com>
Register the volumegroup controller as part
of rbd controller server to serve the volume
group RPC spec.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
The rbd_types package was initially created with references to the rbd
package. And the rbd package references the rbd_types package. Having
rbd/types was not possible due to recursive imports. After cleaning up
the rbd_types package, it can be renamed to rbd/types.
Signed-off-by: Niels de Vos <ndevos@ibm.com>
This commit resolves the govet issue -
`copylocks: call of append copies lock value ... contains sync.Mutex`
Embedding DoNotCopy in a struct is a convention to signal and prevent
shallow copies, as recommended in Go's best practices. This does not
rely on a language feature but is instead a special case within the vet
checker.
For more details, see https://golang.org/issues/8005
Signed-off-by: Praveen M <m.praveen@ibm.com>
The DefaultIdentityServer struct embedded UnimplementedControllerServer,
but it should have been UnimplementedIdentityServer instead.
Signed-off-by: Niels de Vos <ndevos@ibm.com>
The Volume interface will make it easier to work with the rbdImage
struct, as the functions are cleaner defined. This benefits work that is
needed for VolumeGroups and other CSI-Addons procedures.
Signed-off-by: Niels de Vos <ndevos@ibm.com>
In the future we'll introduce a more standard interface for objects like
Volumes and Snapshots. It is useful to have the context passed as 1st
argument to all functions of those objects, including their Destroy()
function.
Signed-off-by: Niels de Vos <ndevos@ibm.com>
Version 0.18.0 of github.com/kubernetes-csi/csi-lib-utils
added support for structured logging.
This commit includes passing the context parameter for the
necessary function.
Signed-off-by: Praveen M <m.praveen@ibm.com>
read the volumeID from replication
source if the ID is missing read
it from req VolumeId as a fallback.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
This commit adds support for flattenMode option
for replication.
If the flattenMode is set to "force" in
volumereplicationclass parameters, cephcsi will
add a task to flatten the image if it has parent.
This enable cephcsi to then mirror such images after
flattening them.
The error message when the image's parent is
in trash or unmirrored is improved as well.
Signed-off-by: Rakshith R <rar@redhat.com>
instead of adding single volumes to the
group journal, support adding multiple
volumeID's map to the group journal
which is required for RBD as well.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Adjusted method names to not have any
specific things to volumesnapshot as
we want to reuse the same journal for
volumegroup as well.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
we need to have groupID stored and retrived
when we are doing group level operations,
we need to find out the groupID from the volumeID
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
This commit remove `VOLUME_ACCESSIBILITY_CONSTRAINTS` capabilities
from CephFS as topology based volume provisioning is not yet supported.
Signed-off-by: Praveen M <m.praveen@ibm.com>
ensure a clean and isolated environment for testing purposes.
Signed-off-by: Mayank Pal <mayankpal9654@gmail.com>
ci: Use temporary directory for unit tests
remove err = os.Mkdir('/etc/ceph-csi-config', 0o600)
Signed-off-by: Mayank Pal <mayankpal9654@gmail.com>
ci: Use temporary directory for unit tests
remove err = os.Mkdir('/etc/ceph-csi-config', 0o600)
Signed-off-by: Mayank Pal <mayankpal9654@gmail.com>
ci: Use temporary directory for unit tests
remove if err
Signed-off-by: Mayank Pal <mayankpal9654@gmail.com>
golangci-lint reports these:
The copy of the 'for' variable "kmsID" can be deleted (Go 1.22+)
(copyloopvar)
Signed-off-by: Niels de Vos <ndevos@ibm.com>
This commit modifies a test case to check creation of
PVC-PVC clone of a restored PVC when parent snapshot
is deleted.
Signed-off-by: Rakshith R <rar@redhat.com>
This commit adds ParentInTrash parameter in rbdImage struct
and makes use of it in getParent() function in order to avoid
error in case the parent is present but in trash.
Signed-off-by: Rakshith R <rar@redhat.com>
Currently we are assuming that only one
rbd mirror daemon running on the ceph cluster
but that is not true for many cases and it
can be more that one, this PR make this as a
configurable parameter.
fixes: #4312
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>