Commit Graph

921 Commits

Author SHA1 Message Date
Prasanna Kumar Kalever
ecf03eb6ae cephfs: add set/Get/List/Remove metadata utility functions
Add utility functions to set/Get/List/Remove PV/PVC/PVCNamespace metadata
on subvolume.

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2022-07-28 04:07:52 +00:00
Madhu Rajanna
8c5563a9bc rbd: remove checkHealthyPrimary check
After Failover of workloads to the secondary
cluster when the primary cluster is down,
RBD Image is not marked healthy, and VR
resources are not promoted to the Primary,
In VolumeReplication, the `CURRENT STATE`
remains Unknown and doesn't change to Primary.

This happens because the primary cluster went down,
and we have force promoted the image on the
secondary cluster. and the image stays in
up+stopping_replay or could be any other states.
Currently assumption was that the image will
always be `up+stopped`. But the image will be in
`up+stopped` only for planned failover and it
could be in any other state if its a forced
failover. For this reason, removing
checkHealthyPrimary from the PromoteVolume RPC call.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-07-27 09:04:27 +00:00
Niels de Vos
011d4fc81c cleanup: create k8s.io/mount-utils Mounter only once
Recently the k8s.io/mount-utils package added more runtime dectection.
When creating a new Mounter, the detect is run every time. This is
unfortunate, as it logs a message like the following:

```
mount_linux.go:283] Detected umount with safe 'not mounted' behavior
```

This message might be useful, so it probably good to keep it.

In Ceph-CSI there are various locations where Mounter instances are
created. Moving that to the DefaultNodeServer type reduces it to a
single place. Some utility functions need to accept the additional
parameter too, so that has been modified as well.

See-also: kubernetes/kubernetes#109676
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2022-07-21 07:14:43 +00:00
takeaki-matsumoto
1025871021 cephfs: Support mount option on nodeplugin
add mount options on nodeplugin side

Signed-off-by: takeaki-matsumoto <takeaki.matsumoto@linecorp.com>
2022-07-18 22:04:12 +00:00
Madhu Rajanna
ceb88d6498 cephfs: remove extra check for restore size
Looks like cephfs snapshot size is buggy and its
getting removed in ceph fs. we cannot get the size
of the snapshot during CreateVolume call, so we cannot
do any size check at CreateVolume to check if the
restore size is smaller or not.

As we are removing this check it also fixes #3147
but we dont have any validation at CSI level for
smaller restore we need to depend on kubernetes
external-provisioner for it.

fixes: #3147

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-07-18 10:04:14 +00:00
Madhu Rajanna
f171143135 cephfs: round to cephfs size to multiple of 4Mib
Due to the bug in the df stat we need to round off
the subvolume size to align with 4Mib.

Note:- Minimum supported size in cephcsi is 1Mib,
we dont need to take care of Kib.

fixes #3240

More details at https://github.com/ceph/ceph/pull/46905

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-07-13 18:32:40 +00:00
Humble Chirammal
1856647506 cephfs: go with default permissions while creating subvolumes
While creating subvolumes, CephFS driver set the mode to `777`
and pass it along to go ceph apis which cause the subvolume
permission to be on 777, however if we create a subvolume
directly in the ceph cluster, the default permission bits are
set which is 755 for the subvolume. This commit try to stick
to the default behaviour even while creating the subvolume.

This also means that we can work with fsgrouppolicy set to
`File` in csiDriver object which is also addressed in this commit.

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2022-07-13 06:49:58 +00:00
Benoît Knecht
507844c9b1 rbd: Use rados namespace when getting clone depth
When the Ceph user is restricted to a specific namespace in the pool, it is
crucial that evey interaction with the cluster is done within that namespace.
This wasn't the case in `getCloneDepth()`.

This issue was causing snapshot creation to fail with

> Failed to check and update snapshot content: failed to take snapshot of the
> volume X: "rpc error: code = Internal desc = rbd: ret=-1, Operation not
> permitted"

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
2022-07-07 22:20:29 +00:00
Niels de Vos
14ba1498bf util: reduce systemd related errors while mounting
There are regular reports that identify a non-error as the cause of
failures. The Kubernetes mount-utils package has detection for systemd
based environments, and if systemd is unavailable, the following error
is logged:

    Cannot run systemd-run, assuming non-systemd OS
    systemd-run output: System has not been booted with systemd as init
    system (PID 1). Can't operate.
    Failed to create bus connection: Host is down, failed with: exit status 1

Because of the `failed` and `exit status 1` error message, users might
assume that the mounting failed. This does not need to be the case. The
container-images that the Ceph-CSI projects provides, do not use
systemd, so the error will get logged with each mount attempt.

By using the newer MountSensitiveWithoutSystemd() function from the
mount-utils package where we can, the number of confusing logs get
reduced.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2022-07-04 10:02:54 +00:00
Niels de Vos
a1ed6207f6 cephfs: report detailed error message on clone failure
go-ceph provides a new GetFailure() method to retrieve details errors
when cloning failed. This is now included in the `cephFSCloneState`
struct, which was a simple string before.

While modifying the `cephFSCloneState` struct, the constants have been
removed, as go-ceph provides them as well.

Fixes: #3140
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2022-06-30 19:33:41 +00:00
Yati Padia
5c40f1ef33 rbd: remove the clone in case of failure
This commit removes the clone incase
unsetAllMetadata or copyEncryptionConfig or
expand fails for createVolumeFromSnapshot
and CreateSnapshot.
It also removes the clone in case of
any failure in createCloneFromImage.

issue: #3103

Signed-off-by: Yati Padia <ypadia@redhat.com>
2022-06-30 05:50:16 +00:00
Prasanna Kumar Kalever
9fa3c8382b cleanup: reduce struct padding
internal/rbd/rbd_util.go:89:15: struct of size 312 bytes could be of
size 304 bytes:
``
struct{
	RbdImageName   	string,
	ImageID        	string,
	VolID          	string,
	Monitors       	string,
	JournalPool    	string,
	Pool           	string,
	RadosNamespace 	string,
	ClusterID      	string,
	RequestName    	string,
	NamePrefix     	string,
	ParentName     	string,
	ParentPool     	string,
	ClusterName    	string,
	Owner          	string,
	VolSize        	int64,
	StripeCount    	uint64,
	StripeUnit     	uint64,
	ObjectSize     	uint64,
	ImageFeatureSet	github.com/ceph/go-ceph/rbd.FeatureSet,
	encryption
*github.com/ceph/ceph-csi/internal/util.VolumeEncryption,
	CreatedAt
*google.golang.org/protobuf/types/known/timestamppb.Timestamp,
	conn
*github.com/ceph/ceph-csi/internal/util.ClusterConnection,
	ioctx          	*github.com/ceph/go-ceph/rados.IOContext,
	Primary        	bool,
	EnableMetadata 	bool,
}
`` (maligned)
type rbdImage struct {
              ^}`
make: *** [Makefile:118: go-lint] Error 1

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2022-06-28 19:12:53 +00:00
Prasanna Kumar Kalever
29a3f4acf6 cleanup: ReconcilePersistentVolume consider passing it by pointer
Address: hugeParam linter

internal/controller/persistentvolume/persistentvolume.go:59:7:
hugeParam: r is heavy (80 bytes); consider passing it by pointer
(gocritic)
[...]
internal/controller/persistentvolume/persistentvolume.go:135:7:
hugeParam: r is heavy (80 bytes); consider passing it by pointer
(gocritic)
func (r ReconcilePersistentVolume) reconcilePV(ctx context.Context, obj
runtime.Object) error {}

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2022-06-28 19:12:53 +00:00
Prasanna Kumar Kalever
caf4090657 rbd: provide option to disable setting metadata on rbd images
As we added support to set the metadata on the rbd images created for
the PVC and volume snapshot, by default metadata is set on all the images.

As we have seen we are hitting issues#2327 a lot of times with this,
we start to leave a lot of stale images. Currently, we rely on
`--extra-create-metadata=true` to decide to set the metadata or not,
we cannot set this option to false to disable setting metadata because we
use this for encryption too.

This changes is to provide an option to disable setting the image
metadata when starting cephcsi.

Fixes: #3009
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2022-06-28 19:12:53 +00:00
Madhu Rajanna
8a47904e8f rbd: add unit test for checkHealthyPrimary
Removed the code in checkHealthyPrimary which
makes the ceph call, passing it as input now.
Added unit test for checkHealthyPrimary function

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-06-28 13:17:11 +00:00
Madhu Rajanna
53e76fab69 rbd: fix checkHealthyPrimary to consider up+stopped state
we need to check for image should be in up+stopped state
not anyone of the state for that the we need to use
OR check not the AND check.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-06-28 13:17:11 +00:00
Madhu Rajanna
704cb5c941 revert: rbd: consider remote image health for primary
When the image is force promoted to primary on the
cluster the remote image might not be in replaying
state because due to the split brain state. This
PR reverts back the commit
c3c87f2ef3. Which we added
to check the remote image status.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-06-28 13:17:11 +00:00
Prasanna Kumar Kalever
1da446d2f2 rbd: healer detect Kubernetes version for right StagingTargetPath
Kubernetes 1.24 and newer use a different path for staging the volume.
That means the CSI-driver is requested to mount the volume at an other
location, compared to previous versions of Kubernetes. CSI-drivers
implementing the volumeHealer, must receive the correct path, otherwise
the after a nodeplugin restart the NBD mounts will bailout attempting
to NodeStageVolume() call and return an error.

See-also: kubernetes/kubernetes#107065

Fixes: #3176
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2022-06-24 12:23:29 +00:00
Madhu Rajanna
3acaa018db rbd: issue resync only if the force flag is set
During failover we do demote the volume on the primary
as the image is still not promoted yet on the remote cluster,
there are spurious split-brain errors reported by RBD,
the Cephcsi resync will attempt to resync from the "known"
secondary and that will cause data loss

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-06-23 13:28:18 +00:00
Madhu Rajanna
7a2dd4c3cf rbd: create token and use it for vault SA
create the token if kubernetes version in
1.24+ and use it for vault sa.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Signed-off-by: Rakshith R <rar@redhat.com>
2022-06-17 11:37:59 +00:00
Robert Vasek
fd7559a903 cephfs: added support for snapshot-backed volumes
This commit implements most of
docs/design/proposals/cephfs-snapshot-shallow-ro-vol.md design document;
specifically (de-)provisioning of snapshot-backed volumes, mounting such
volumes as well as mounting pre-provisioned snapshot-backed volumes.

Signed-off-by: Robert Vasek <robert.vasek@cern.ch>
2022-06-16 09:44:27 +00:00
Robert Vasek
0807fd2e6c journal: added csi.volume.backingsnapshotid image attribute
Signed-off-by: Robert Vasek <robert.vasek@cern.ch>
2022-06-16 09:44:27 +00:00
Madhu Rajanna
4b57cc3ec5 rbd: add support for rbd striping
RBD supports creating rbd images with
object size, stripe unit and stripe count
to support striping. This PR adds the support
for the same.

More details about striping at
https://docs.ceph.com/en/quincy/man/8/rbd/#striping

fixes: #3124

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-06-09 18:59:00 +00:00
Prasanna Kumar Kalever
09a8e5e9e6 rbd: unset cluster Name metadata
unsets the cluster name metadata key and value on the RBD image

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2022-06-08 16:23:59 +00:00
Prasanna Kumar Kalever
2880c25fd6 rbd: set cluster Name as metadata on the image
This change helps read the cluster name from the cmdline args,
the provisioner will set the same on the RBD images.

Fixes: #2973
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2022-06-08 16:23:59 +00:00
Prasanna Kumar Kalever
deb003e605 cleanup: use prefix instead of hardcoding csiParameterPrefix
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2022-06-08 16:23:59 +00:00
Madhu Rajanna
1952a9b4b3 ci: fix all linter errors found in golangci-lint
Fixing all the linter errors found in golang-ci
lint v1.46.2

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-06-03 12:55:54 +00:00
Madhu Rajanna
c9943320ac cephfs: skip NetNamespaceFilePath if the volume is pre-provisioned
In case of pre-provisioned volume the clusterID is
not set in the volume context as the clusterID is missing
we cannot extract the NetNamespaceFilePath from the
configuration file. For static volume and dynamically
provisioned volume the clusterID is set.

Note:- This is a special case to support mounting PV
without clusterID parameter.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-06-03 07:25:25 +00:00
Rakshith R
7688306f87 rbd: use vaultAuthPath variable name in error msg
Before the change, the error msg was the following:
```
failed to set VAULT_AUTH_MOUNT_PATH in Vault config: path is empty
```
`vaultAuthPath` is the actual variable name set by the
user. The error message will now be the following:
```
failed to set "vaultAuthPath" in vault config: path is empty
```

Signed-off-by: Rakshith R <rar@redhat.com>
2022-05-26 07:37:48 +00:00
Rakshith R
894c20f792 nfs: add support for pvc-pvc clone
This commit adds support for pvc-pvc clone.
Only capability needed to be advertised, the
underlying support is already provided by cephfs
backend.

Signed-off-by: Rakshith R <rar@redhat.com>
2022-05-24 18:13:02 +00:00
Rakshith R
24515b509f nfs: add support for create & delete snapshot
This commits adds support for creation and
deletion of nfs snapshots based on cephfs.

Signed-off-by: Rakshith R <rar@redhat.com>
2022-05-24 18:13:02 +00:00
Prasanna Kumar Kalever
6470cf3343 rbd: fix bug handling GetKrbdSupportedFeatures()
continue running rbd driver when /sys/bus/rbd/supported_features file is
missing, do not bailout.

Fixes: #2678
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2022-05-15 15:10:08 +00:00
Prasanna Kumar Kalever
83cc1b0e58 rbd: handle when krbdFeatures is zero
krbdFeatures is set to zero when kernel version < 3.8, i.e. in  case where
/sys/bus/rbd/supported_features is absent and we are unable to prepare
the krbd attributes based on kernel version.

When krbdFeatures is set to zero fallback to NBD only when autofallback
is turned ON.

Fixes: #2678
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2022-05-15 15:10:08 +00:00
Prasanna Kumar Kalever
e53fd87154 rbd: prepare krbd feature attrs if supported_features file is absent
Upstream /sys/bus/rbd/supported_features is part of Linux kernel v4.11.0
Prepare the attributes and use them in case if
/sys/bus/rbd/supported_features is missing.

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2022-05-15 15:10:08 +00:00
Prasanna Kumar Kalever
27f503c144 rbd: unset parent PVC metadata on CreateVolume From Volume
Unset the parent PVC metadata on the temp clone rbd image

Fixes: #2970
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2022-05-12 15:54:09 +00:00
Prasanna Kumar Kalever
e0f34a6d60 rbd: unset snapshot metadata on CreateVolume From snapshot
Unset the snapshot metadata from the rbd image created from the snapshot

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2022-05-12 15:54:09 +00:00
Prasanna Kumar Kalever
d89c5fb39f rbd: unset PVC metadata on CreateSnapshot
Unset the PVC metadata on the rbd image created for the snapshot

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2022-05-12 15:54:09 +00:00
Prasanna Kumar Kalever
bac33262ae rbd: add unset volume/snapshot metadata utility functions
Added
GetVolumeMetadataKeys()
GetSnaoshotMetadataKeys()
unsetVolumeMetadata() and
unsetSnapshotMetadata()

functions.

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2022-05-12 15:54:09 +00:00
Prasanna Kumar Kalever
1fd5277b3c cleanup: simplify setVolumeMetadata and rename it
Move k8s.GetVolumeMetadata() out of setVolumeMetadata() and rename it to
setAllMetadata() so that the same can be used for setting volume and
snapshot metadata.

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2022-05-12 15:54:09 +00:00
Niels de Vos
36e51402cb nfs: support ExpandVolume CSI procedure
There is not much the NFS-provisioner needs to do to expand a volume,
everything is handled by the CephFS components.

NFS does not need a resize on the node, so only ControllerExpandVolume
is required.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2022-05-10 17:43:59 +00:00
Madhu Rajanna
70674565df rbd: consider rbd as default mounter if not set
For the default mounter the mounter option
will not be set in the storageclass and as it is
not available in the storageclass same will not
be set in the volume context, Because of this the
mapOptions are getting discarded. If the mounter
is not set assuming it's an rbd mounter.

Note:- If the mounter is not set in the storageclass
we can set it in the volume context explicitly,
Doing this check-in node server to support backward
existing volumes and the check is minimal we are not
altering the volume context.

fixes: #3076

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-05-09 20:00:11 +00:00
Marcus Röder
a95a6213eb util: support systems using the new cgroup v2 structure
With cgroup v2, the location of the pids.max file changed and so did the
/proc/self/cgroup file

new /proc/self/cgroup file
`
0::/user.slice/user-500.slice/session-14.scope
`

old file:
`
11:pids:/user.slice/user-500.slice/session-2.scope
10:blkio:/user.slice
9:net_cls,net_prio:/
8:perf_event:/
...
`

There is no directory per subsystem (e.g. /sys/fs/cgroup/pids) any more, all
files are now in one directory.

fixes: https://github.com/ceph/ceph-csi/issues/3085

Signed-off-by: Marcus Röder <m.roeder@yieldlab.de>
2022-05-07 20:38:48 +00:00
Rakshith R
f1ccc4eced rbd: support pvc-pvc clone with different sc & encryption
This commit makes modification so as to allow pvc-pvc clone
with different storageclass having different encryption
configs.
This commit also modifies `copyEncryptionConfig()` to
include a `isEncrypted()` check within the function.

Signed-off-by: Rakshith R <rar@redhat.com>
2022-05-06 10:32:21 +00:00
Rakshith R
bd57feb26e rbd: use vaultAuthPath variable name in error msg
Before the change, the error msg was the following:
```
failed to set VAULT_AUTH_MOUNT_PATH in Vault config: path is empty
```
`vaultAuthPath` is the actual variable name set by the
user. The error message will now be the following:
```
failed to set "vaultAuthPath" in vault config: path is empty
```

Signed-off-by: Rakshith R <rar@redhat.com>
2022-05-05 05:49:31 +00:00
Niels de Vos
9d7faf850f nfs: delete the CephFS volume when the export is already removed
In case the NFS-export has already been removed from the NFS-server, but
the CSI Controller was restarted, a retry to remove the NFS-volume will
fail with an error like:

> GRPC error: ....: response status not empty: "Export does not exist"

When this error is reported, assume the NFS-export was already removed
from the NFS-server configuration, and continue with deleting the
backend volume.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2022-05-04 21:31:06 +00:00
Madhu Rajanna
d2bc9743f7 cephfs: add netNamespaceFilePath for CephFS
as same host directory is not shared between
the cephfs and the rbd plugin pod. we need
to keep the netNamespaceFilePath separately
for both cephfs and rbd. CephFS plugin will
use this path to execute mount -t commands.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-04-19 12:28:46 +00:00
Madhu Rajanna
eb4bfb7326 cleanup: use block comment for ClusterInfo example
Adjusted the mix of tabs and the spaces and also
used block comment for better readability.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-04-19 12:28:46 +00:00
Madhu Rajanna
b4acbd08a5 rbd: move radosNamespace to RBD section
As radosNamespace is more specific to
RBD not the general ceph configuration. Now
we introduced a new RBD section for RBD specific
options, Moving the radosNamespace to RBD section
and keeping the radosNamespace still under the
global ceph level configration for backward
compatibility.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-04-19 12:28:46 +00:00
Madhu Rajanna
766346868e util: Add RBD specific options in clusterInfo
As the netNamespaceFilePath can be separate for
both cephfs and rbd adding the netNamespaceFilePath
path for RBD, This will help us to keep RBD and
CephFS specific options separately.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-04-19 12:28:46 +00:00
Niels de Vos
2b71aac752 nfs: return gRPC status from CephFS CreateVolume failure
The NFS Controller returns a non-gRPC error in case the CreateVolume
call for the CephFS volume fails. It is better to return the gRPC-error
that the CephFS Controller passed along.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2022-04-19 08:23:16 +00:00