go-ceph provides a new GetFailure() method to retrieve details errors
when cloning failed. This is now included in the `cephFSCloneState`
struct, which was a simple string before.
While modifying the `cephFSCloneState` struct, the constants have been
removed, as go-ceph provides them as well.
Fixes: #3140
Signed-off-by: Niels de Vos <ndevos@redhat.com>
This commit removes the clone incase
unsetAllMetadata or copyEncryptionConfig or
expand fails for createVolumeFromSnapshot
and CreateSnapshot.
It also removes the clone in case of
any failure in createCloneFromImage.
issue: #3103
Signed-off-by: Yati Padia <ypadia@redhat.com>
When getting the PVC or PV failed, the returned object may contain empty
values. If that happens, a retry uses the empty values for Namespace and
Name, which will never be successful.
Instead, use the Namespace and Name attributes from the original object,
and not from the object returned by the Get() call.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
When using `lock_on_read`, the RBD image needs to have the
`exclusive-lock` feature enabled too.
Fixes: #3221
Signed-off-by: Niels de Vos <ndevos@redhat.com>
CI is failing very frequently hitting resource leaks issue,
until we solve the root cause for resource leaks reducing the clone
count from 10 to 3.
related: #2327
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Address: hugeParam linter
internal/controller/persistentvolume/persistentvolume.go:59:7:
hugeParam: r is heavy (80 bytes); consider passing it by pointer
(gocritic)
[...]
internal/controller/persistentvolume/persistentvolume.go:135:7:
hugeParam: r is heavy (80 bytes); consider passing it by pointer
(gocritic)
func (r ReconcilePersistentVolume) reconcilePV(ctx context.Context, obj
runtime.Object) error {}
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
As we added support to set the metadata on the rbd images created for
the PVC and volume snapshot, by default metadata is set on all the images.
As we have seen we are hitting issues#2327 a lot of times with this,
we start to leave a lot of stale images. Currently, we rely on
`--extra-create-metadata=true` to decide to set the metadata or not,
we cannot set this option to false to disable setting metadata because we
use this for encryption too.
This changes is to provide an option to disable setting the image
metadata when starting cephcsi.
Fixes: #3009
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Removed the code in checkHealthyPrimary which
makes the ceph call, passing it as input now.
Added unit test for checkHealthyPrimary function
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
we need to check for image should be in up+stopped state
not anyone of the state for that the we need to use
OR check not the AND check.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
When the image is force promoted to primary on the
cluster the remote image might not be in replaying
state because due to the split brain state. This
PR reverts back the commit
c3c87f2ef3. Which we added
to check the remote image status.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
In case a PullRequest does not have a MergeableState set, it will be
`nil`. Dereferencing the pointer will cause a Go panic, and the action
won't work as intended.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Kubernetes 1.24 and newer use a different path for staging the volume.
That means the CSI-driver is requested to mount the volume at an other
location, compared to previous versions of Kubernetes. CSI-drivers
implementing the volumeHealer, must receive the correct path, otherwise
the after a nodeplugin restart the NBD mounts will bailout attempting
to NodeStageVolume() call and return an error.
See-also: kubernetes/kubernetes#107065
Fixes: #3176
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
During failover we do demote the volume on the primary
as the image is still not promoted yet on the remote cluster,
there are spurious split-brain errors reported by RBD,
the Cephcsi resync will attempt to resync from the "known"
secondary and that will cause data loss
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
eventhough the snapshotter is of 6.0.1 the client libraries
are of 6.0.0 and this commit update the same.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
create the token if kubernetes version in
1.24+ and use it for vault sa.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Signed-off-by: Rakshith R <rar@redhat.com>
This commit implements most of
docs/design/proposals/cephfs-snapshot-shallow-ro-vol.md design document;
specifically (de-)provisioning of snapshot-backed volumes, mounting such
volumes as well as mounting pre-provisioned snapshot-backed volumes.
Signed-off-by: Robert Vasek <robert.vasek@cern.ch>
go-ceph v0.16.0 contains subvolume metadata APIs and subvolume snapshot
metadata APIs.
Please note, as the APIs can not be tested in the go-ceph CI, it requires
build-tag `ceph_ci_untested`.
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
RBD supports creating rbd images with
object size, stripe unit and stripe count
to support striping. This PR adds the support
for the same.
More details about striping at
https://docs.ceph.com/en/quincy/man/8/rbd/#stripingfixes: #3124
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
This change helps read the cluster name from the cmdline args,
the provisioner will set the same on the RBD images.
Fixes: #2973
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
As the attacher is no longer required we have to mention the same
for csidriver object parameter.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
CephFS CSI driver dont need attacher sidecar for its operations.
This commit remove the same. The RBAC has also got adjusted.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
It seems PullRequests.ListReviews() returns 30 reviews by default. This
is practical for pagination, but not so much for a GitHub action. There
is the rare occasion that the number of reviews if higher than 30. If
that is the case, the retest action does not capture the latest reviews,
which is where the APPROVED status is set.
By increasing the number of results to 100 per page, we should have
sufficient space for many more reviews and automated retest-ing.
Signed-off-by: Niels de Vos <ndevos@redhat.com>