Current code uses an !A && !B condition incorrectly to
test A:Up and B:status for a remote peer image.
This should be !A || !B as we require both conditions to
be in the specified state (Up: true, and status Unknown).
This is corrected by this commit, and further fixes:
- check and return ready only when a remote site is
found in the status output
- check if all peer sites are ready, if multiple are found
and return ready appropriately
Signed-off-by: Shyamsundar Ranganathan <srangana@redhat.com>
During ResyncVolume we check if the image
is in an error state, and we resync.
After resync, the image will move to
either the `Error` or the `Resyncing` state.
And if the image is in the above two
conditions, we will return a successful
response and Ready=false so that the
consumer can wait until the volume is
ready to use. If the image is in any
other state we return an error message
to indicate the syncing is not going on.
The whole resync and image state change
depends on the rbd mirror daemon. If the
mirror daemon is not running, the image
can be in Resyncing or Unknown state.
The Ramen marks the volume replication as
secondary, and once the resync starts, it
will delete the volume replication CR as a
cleanup process.
As we dont have a check for the rbd mirror
daemon, we are returning a resync success
response and Ready=false. Due to this false
response Ramen is assuming the resync started
and deleted the volume replication CR, and
because of this, the cluster goes into a bad
state and needs manual intervention.
fixes#3289
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
IsNotMountPoint() is deprecated and Mounter.IsMountPoint() is
recommended to be used instead.
Reported-by: golangci/staticcheck
Signed-off-by: Niels de Vos <ndevos@redhat.com>
NewWithoutSystemd() has been introduced in the k8s.io/mount-utils
package so that systemd is not called while executing functions. This
offers consumers the ability to prevent confusing and scary messages
from getting logged.
See-also: kubernetes/kubernetes#111218
Signed-off-by: Niels de Vos <ndevos@redhat.com>
previously, it was a requirement to have attacher sidecar for CSI
drivers and there had an implementation of dummy mode of operation.
However skipAttach implementation has been stabilized and the dummy
mode of operation is going to be removed from the external-attacher.
Considering this driver work on volumeattachment objects for NBD driver
use cases, we have to implement dummy controllerpublish and unpublish
and thus keep supporting our operations even in absence of dummy mode
of operation in the sidecar.
This commit make a NOOP controller publish and unpublish for RBD driver.
CephFS driver does not require attacher and it has already been made free
from the attachment operations.
Ref# https://github.com/ceph/ceph-csi/pull/3149
Ref# https://github.com/kubernetes-csi/external-attacher/issues/226
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
`--setmetadata` is false by default, honoring it
will keep the metadata disabled by default
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
`--setmetadata` is false by default, honoring it
will keep the metadata disabled by default
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Make sure to set metadata when subvolume snapshot exist, i.e. if the
provisioner pod is restarted while createSnapShot is in progress, say it
created the subvolume snapshot but didn't yet set the metadata.
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Set snapshot-name/snapshot-namespace/snapshotcontent-name details
on subvolume snapshots as metadata on create.
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
This change helps read the cluster name from the cmdline args,
the provisioner will set the same on the subvolume.
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Make sure to set metadata when subvolume exist, i.e. if the provisioner pod
is restarted while createVolume is in progress, say it created the subvolume
but didn't yet set the metadata.
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
This helps Monitoring solutions without access to Kubernetes clusters to
display the details of the PV/PVC/NameSpace in their dashboard.
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
After Failover of workloads to the secondary
cluster when the primary cluster is down,
RBD Image is not marked healthy, and VR
resources are not promoted to the Primary,
In VolumeReplication, the `CURRENT STATE`
remains Unknown and doesn't change to Primary.
This happens because the primary cluster went down,
and we have force promoted the image on the
secondary cluster. and the image stays in
up+stopping_replay or could be any other states.
Currently assumption was that the image will
always be `up+stopped`. But the image will be in
`up+stopped` only for planned failover and it
could be in any other state if its a forced
failover. For this reason, removing
checkHealthyPrimary from the PromoteVolume RPC call.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Recently the k8s.io/mount-utils package added more runtime dectection.
When creating a new Mounter, the detect is run every time. This is
unfortunate, as it logs a message like the following:
```
mount_linux.go:283] Detected umount with safe 'not mounted' behavior
```
This message might be useful, so it probably good to keep it.
In Ceph-CSI there are various locations where Mounter instances are
created. Moving that to the DefaultNodeServer type reduces it to a
single place. Some utility functions need to accept the additional
parameter too, so that has been modified as well.
See-also: kubernetes/kubernetes#109676
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Looks like cephfs snapshot size is buggy and its
getting removed in ceph fs. we cannot get the size
of the snapshot during CreateVolume call, so we cannot
do any size check at CreateVolume to check if the
restore size is smaller or not.
As we are removing this check it also fixes#3147
but we dont have any validation at CSI level for
smaller restore we need to depend on kubernetes
external-provisioner for it.
fixes: #3147
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Due to the bug in the df stat we need to round off
the subvolume size to align with 4Mib.
Note:- Minimum supported size in cephcsi is 1Mib,
we dont need to take care of Kib.
fixes#3240
More details at https://github.com/ceph/ceph/pull/46905
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
While creating subvolumes, CephFS driver set the mode to `777`
and pass it along to go ceph apis which cause the subvolume
permission to be on 777, however if we create a subvolume
directly in the ceph cluster, the default permission bits are
set which is 755 for the subvolume. This commit try to stick
to the default behaviour even while creating the subvolume.
This also means that we can work with fsgrouppolicy set to
`File` in csiDriver object which is also addressed in this commit.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
When the Ceph user is restricted to a specific namespace in the pool, it is
crucial that evey interaction with the cluster is done within that namespace.
This wasn't the case in `getCloneDepth()`.
This issue was causing snapshot creation to fail with
> Failed to check and update snapshot content: failed to take snapshot of the
> volume X: "rpc error: code = Internal desc = rbd: ret=-1, Operation not
> permitted"
Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
There are regular reports that identify a non-error as the cause of
failures. The Kubernetes mount-utils package has detection for systemd
based environments, and if systemd is unavailable, the following error
is logged:
Cannot run systemd-run, assuming non-systemd OS
systemd-run output: System has not been booted with systemd as init
system (PID 1). Can't operate.
Failed to create bus connection: Host is down, failed with: exit status 1
Because of the `failed` and `exit status 1` error message, users might
assume that the mounting failed. This does not need to be the case. The
container-images that the Ceph-CSI projects provides, do not use
systemd, so the error will get logged with each mount attempt.
By using the newer MountSensitiveWithoutSystemd() function from the
mount-utils package where we can, the number of confusing logs get
reduced.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
go-ceph provides a new GetFailure() method to retrieve details errors
when cloning failed. This is now included in the `cephFSCloneState`
struct, which was a simple string before.
While modifying the `cephFSCloneState` struct, the constants have been
removed, as go-ceph provides them as well.
Fixes: #3140
Signed-off-by: Niels de Vos <ndevos@redhat.com>
This commit removes the clone incase
unsetAllMetadata or copyEncryptionConfig or
expand fails for createVolumeFromSnapshot
and CreateSnapshot.
It also removes the clone in case of
any failure in createCloneFromImage.
issue: #3103
Signed-off-by: Yati Padia <ypadia@redhat.com>
Address: hugeParam linter
internal/controller/persistentvolume/persistentvolume.go:59:7:
hugeParam: r is heavy (80 bytes); consider passing it by pointer
(gocritic)
[...]
internal/controller/persistentvolume/persistentvolume.go:135:7:
hugeParam: r is heavy (80 bytes); consider passing it by pointer
(gocritic)
func (r ReconcilePersistentVolume) reconcilePV(ctx context.Context, obj
runtime.Object) error {}
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
As we added support to set the metadata on the rbd images created for
the PVC and volume snapshot, by default metadata is set on all the images.
As we have seen we are hitting issues#2327 a lot of times with this,
we start to leave a lot of stale images. Currently, we rely on
`--extra-create-metadata=true` to decide to set the metadata or not,
we cannot set this option to false to disable setting metadata because we
use this for encryption too.
This changes is to provide an option to disable setting the image
metadata when starting cephcsi.
Fixes: #3009
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Removed the code in checkHealthyPrimary which
makes the ceph call, passing it as input now.
Added unit test for checkHealthyPrimary function
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
we need to check for image should be in up+stopped state
not anyone of the state for that the we need to use
OR check not the AND check.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
When the image is force promoted to primary on the
cluster the remote image might not be in replaying
state because due to the split brain state. This
PR reverts back the commit
c3c87f2ef3. Which we added
to check the remote image status.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Kubernetes 1.24 and newer use a different path for staging the volume.
That means the CSI-driver is requested to mount the volume at an other
location, compared to previous versions of Kubernetes. CSI-drivers
implementing the volumeHealer, must receive the correct path, otherwise
the after a nodeplugin restart the NBD mounts will bailout attempting
to NodeStageVolume() call and return an error.
See-also: kubernetes/kubernetes#107065
Fixes: #3176
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
During failover we do demote the volume on the primary
as the image is still not promoted yet on the remote cluster,
there are spurious split-brain errors reported by RBD,
the Cephcsi resync will attempt to resync from the "known"
secondary and that will cause data loss
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
create the token if kubernetes version in
1.24+ and use it for vault sa.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Signed-off-by: Rakshith R <rar@redhat.com>
This commit implements most of
docs/design/proposals/cephfs-snapshot-shallow-ro-vol.md design document;
specifically (de-)provisioning of snapshot-backed volumes, mounting such
volumes as well as mounting pre-provisioned snapshot-backed volumes.
Signed-off-by: Robert Vasek <robert.vasek@cern.ch>
RBD supports creating rbd images with
object size, stripe unit and stripe count
to support striping. This PR adds the support
for the same.
More details about striping at
https://docs.ceph.com/en/quincy/man/8/rbd/#stripingfixes: #3124
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
This change helps read the cluster name from the cmdline args,
the provisioner will set the same on the RBD images.
Fixes: #2973
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
In case of pre-provisioned volume the clusterID is
not set in the volume context as the clusterID is missing
we cannot extract the NetNamespaceFilePath from the
configuration file. For static volume and dynamically
provisioned volume the clusterID is set.
Note:- This is a special case to support mounting PV
without clusterID parameter.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Before the change, the error msg was the following:
```
failed to set VAULT_AUTH_MOUNT_PATH in Vault config: path is empty
```
`vaultAuthPath` is the actual variable name set by the
user. The error message will now be the following:
```
failed to set "vaultAuthPath" in vault config: path is empty
```
Signed-off-by: Rakshith R <rar@redhat.com>
This commit adds support for pvc-pvc clone.
Only capability needed to be advertised, the
underlying support is already provided by cephfs
backend.
Signed-off-by: Rakshith R <rar@redhat.com>