Commit Graph

270 Commits

Author SHA1 Message Date
Madhu Rajanna
44d4546480 cephfs: return abnormal in NodeGetVolumeStats
When we do stat on the targetpath, if there is
any error we can check is it due to corruption.
If yes, cephcsi can return abnormal in the
NodeGetVolumeStats so that consumer (CO/admin)
and detect and take further action.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-10-26 09:40:22 +00:00
Madhu Rajanna
302fead713 cephfs: delete subvolume if SetAllMetadata fails
To avoid subvolume leaks if the SetAllMetadata
operations fails delete the subvolume.
If any operation fails after creating the subvolume
we will remove the omap as the omap gets
removed we will need to remove the subvolume to
avoid stale resources.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-10-18 15:10:18 +00:00
Marcel Lauhoff
5a55419025 cephfs: Add placeholder journal fscrypt support
Signed-off-by: Marcel Lauhoff <marcel.lauhoff@suse.com>
2022-10-17 17:33:52 +00:00
Madhu Rajanna
b40e8894f8 cephfs: use errors.As instead of errors.Is
As we need to compare the error type instead
of the error value we need to use errors.As
to check the API is implemented or not.

fixes: #3347

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-10-17 09:11:45 +00:00
Niels de Vos
b7703faf37 util: make inode metrics optional in FilesystemNodeGetVolumeStats()
CephFS does not have a concept of "free inodes", inodes get allocated
on-demand in the filesystem.

This confuses alerting managers that expect a (high) number of free
inodes, and warnings get produced if the number of free inodes is not
high enough. This causes alerts to always get reported for CephFS.

To prevent the false-positive alerts from happening, the
NodeGetVolumeStats procedure for CephFS (and CephNFS) will not contain
inodes in the reply anymore.

See-also: https://bugzilla.redhat.com/2128263
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2022-10-13 19:02:47 +00:00
Madhu Rajanna
76064d8e34 cephfs: retry subvolumegroup creation
Incase the  subvolumegroup is deleted
and recreated we need to restart the
cephcsi provisioner pod to clear cache
that cephcsi maintains. With this PR
if cephcsi sees NotFound error duing
subvolume creation it will reset the cache
for that filesystem so that in next RPC
call cephcsi will try to create the
subvolumegroup again

Ref: https://github.com/rook/rook/issues/10623

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-09-07 18:24:30 +00:00
Madhu Rajanna
e56621cd66 cephfs: fix subvolumegroup creation for multiple fs
In a cluster we can have multiple filesystem
for that we need to have a map of
subvolumegroups to check filesystem is created
nor not.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-09-07 18:24:30 +00:00
Madhu Rajanna
038462ff43 cephfs: return success if metadata operation not supported
If the ceph cluster is of older version and doesnot
support metadata operation, Instead of failing
the request return the success if metadata
operation is not supported.

fixes #3347

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-08-29 18:37:53 +00:00
Madhu Rajanna
dde21543bd cephfs: fix staticcheck comment
getting is unused for linter "staticcheck"
(nolintlint) error message due to wrong
comment format. this the format now with
`//directive // comment`

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-08-10 17:51:26 +00:00
Prasanna Kumar Kalever
30244bf11b cephfs: snapshots honor --setmetadata option
`--setmetadata` is false by default, honoring it
will keep the metadata disabled by default

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2022-08-01 07:15:29 +00:00
Prasanna Kumar Kalever
14d6211d6d cephfs: subvolumes honor --setmetadata option
`--setmetadata` is false by default, honoring it
will keep the metadata disabled by default

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2022-08-01 07:15:29 +00:00
Prasanna Kumar Kalever
de7128b3a2 cephfs: Add clusterName as metadata on snapshots
Example:
sh-4.4$ ceph fs subvolume snapshot metadata ls myfs csi-vol-ba248f9e-0e75-11ed-b774-8e97192ff5ec \
			csi-snap-ce24e3bb-0e75-11ed-b774-8e97192ff5ec --group_name csi
{
    "csi.ceph.com/cluster/name": "\"K8s-cluster-1\"",
    "csi.storage.k8s.io/volumesnapshot/name": "cephfs-pvc-snapshot",
    "csi.storage.k8s.io/volumesnapshot/namespace": "rook-ceph",
    "csi.storage.k8s.io/volumesnapshotcontent/name": "snapcontent-2e89e1b2-e6e9-48fe-b365-edb493d7022e"
}

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2022-08-01 07:15:29 +00:00
Prasanna Kumar Kalever
856d7c264c cephfs: handle metadata op-failures with unsupported ceph versions
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2022-07-28 19:37:23 +00:00
Prasanna Kumar Kalever
5f36f7e8bd cephfs: update subvolume snapshot metadata if snapshot already exists.
Make sure to set metadata when subvolume snapshot exist, i.e. if the
provisioner pod is restarted while createSnapShot is in progress, say it
created the subvolume snapshot but didn't yet set the metadata.

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2022-07-28 19:37:23 +00:00
Prasanna Kumar Kalever
7c9259a45e cephfs: set metadata on the subvolume snapshot on create
Set snapshot-name/snapshot-namespace/snapshotcontent-name details
on subvolume snapshots as metadata on create.

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2022-07-28 19:37:23 +00:00
Prasanna Kumar Kalever
8c0dd482fa cephfs: add set/Remove subvolume snapshot metadata utility functions
Add utility functions to set/Remove
snapshot-name/snapshot-namespace/snapshotcontent-name metadata on
subvolume snapshots.

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2022-07-28 19:37:23 +00:00
Prasanna Kumar Kalever
51099d60fe cephfs: handle metadata op-failures with unsupported ceph versions
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2022-07-28 04:07:52 +00:00
Prasanna Kumar Kalever
11d51ed9b0 cephfs: unset cluster Name metadata
unsets the cluster name metadata key and value on the subvolume

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2022-07-28 04:07:52 +00:00
Prasanna Kumar Kalever
21d811096b cephfs: set cluster Name as metadata on the subvolume
This change helps read the cluster name from the cmdline args,
the provisioner will set the same on the subvolume.

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2022-07-28 04:07:52 +00:00
Prasanna Kumar Kalever
466bdf97b2 cephfs: set metadata on restart of provisioner pod
Make sure to set metadata when subvolume exist, i.e. if the provisioner pod
is restarted while createVolume is in progress, say it created the subvolume
but didn't yet set the metadata.

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2022-07-28 04:07:52 +00:00
Prasanna Kumar Kalever
6bcb8ecc68 cephfs: set PV/PVC details on the subvolume as metadata on create
This helps Monitoring solutions without access to Kubernetes clusters to
display the details of the PV/PVC/NameSpace in their dashboard.

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2022-07-28 04:07:52 +00:00
Prasanna Kumar Kalever
ecf03eb6ae cephfs: add set/Get/List/Remove metadata utility functions
Add utility functions to set/Get/List/Remove PV/PVC/PVCNamespace metadata
on subvolume.

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2022-07-28 04:07:52 +00:00
Niels de Vos
011d4fc81c cleanup: create k8s.io/mount-utils Mounter only once
Recently the k8s.io/mount-utils package added more runtime dectection.
When creating a new Mounter, the detect is run every time. This is
unfortunate, as it logs a message like the following:

```
mount_linux.go:283] Detected umount with safe 'not mounted' behavior
```

This message might be useful, so it probably good to keep it.

In Ceph-CSI there are various locations where Mounter instances are
created. Moving that to the DefaultNodeServer type reduces it to a
single place. Some utility functions need to accept the additional
parameter too, so that has been modified as well.

See-also: kubernetes/kubernetes#109676
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2022-07-21 07:14:43 +00:00
takeaki-matsumoto
1025871021 cephfs: Support mount option on nodeplugin
add mount options on nodeplugin side

Signed-off-by: takeaki-matsumoto <takeaki.matsumoto@linecorp.com>
2022-07-18 22:04:12 +00:00
Madhu Rajanna
ceb88d6498 cephfs: remove extra check for restore size
Looks like cephfs snapshot size is buggy and its
getting removed in ceph fs. we cannot get the size
of the snapshot during CreateVolume call, so we cannot
do any size check at CreateVolume to check if the
restore size is smaller or not.

As we are removing this check it also fixes #3147
but we dont have any validation at CSI level for
smaller restore we need to depend on kubernetes
external-provisioner for it.

fixes: #3147

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-07-18 10:04:14 +00:00
Madhu Rajanna
f171143135 cephfs: round to cephfs size to multiple of 4Mib
Due to the bug in the df stat we need to round off
the subvolume size to align with 4Mib.

Note:- Minimum supported size in cephcsi is 1Mib,
we dont need to take care of Kib.

fixes #3240

More details at https://github.com/ceph/ceph/pull/46905

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-07-13 18:32:40 +00:00
Humble Chirammal
1856647506 cephfs: go with default permissions while creating subvolumes
While creating subvolumes, CephFS driver set the mode to `777`
and pass it along to go ceph apis which cause the subvolume
permission to be on 777, however if we create a subvolume
directly in the ceph cluster, the default permission bits are
set which is 755 for the subvolume. This commit try to stick
to the default behaviour even while creating the subvolume.

This also means that we can work with fsgrouppolicy set to
`File` in csiDriver object which is also addressed in this commit.

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2022-07-13 06:49:58 +00:00
Niels de Vos
a1ed6207f6 cephfs: report detailed error message on clone failure
go-ceph provides a new GetFailure() method to retrieve details errors
when cloning failed. This is now included in the `cephFSCloneState`
struct, which was a simple string before.

While modifying the `cephFSCloneState` struct, the constants have been
removed, as go-ceph provides them as well.

Fixes: #3140
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2022-06-30 19:33:41 +00:00
Robert Vasek
fd7559a903 cephfs: added support for snapshot-backed volumes
This commit implements most of
docs/design/proposals/cephfs-snapshot-shallow-ro-vol.md design document;
specifically (de-)provisioning of snapshot-backed volumes, mounting such
volumes as well as mounting pre-provisioned snapshot-backed volumes.

Signed-off-by: Robert Vasek <robert.vasek@cern.ch>
2022-06-16 09:44:27 +00:00
Robert Vasek
0807fd2e6c journal: added csi.volume.backingsnapshotid image attribute
Signed-off-by: Robert Vasek <robert.vasek@cern.ch>
2022-06-16 09:44:27 +00:00
Madhu Rajanna
1952a9b4b3 ci: fix all linter errors found in golangci-lint
Fixing all the linter errors found in golang-ci
lint v1.46.2

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-06-03 12:55:54 +00:00
Madhu Rajanna
c9943320ac cephfs: skip NetNamespaceFilePath if the volume is pre-provisioned
In case of pre-provisioned volume the clusterID is
not set in the volume context as the clusterID is missing
we cannot extract the NetNamespaceFilePath from the
configuration file. For static volume and dynamically
provisioned volume the clusterID is set.

Note:- This is a special case to support mounting PV
without clusterID parameter.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-06-03 07:25:25 +00:00
Madhu Rajanna
d2bc9743f7 cephfs: add netNamespaceFilePath for CephFS
as same host directory is not shared between
the cephfs and the rbd plugin pod. we need
to keep the netNamespaceFilePath separately
for both cephfs and rbd. CephFS plugin will
use this path to execute mount -t commands.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-04-19 12:28:46 +00:00
Madhu Rajanna
766346868e util: Add RBD specific options in clusterInfo
As the netNamespaceFilePath can be separate for
both cephfs and rbd adding the netNamespaceFilePath
path for RBD, This will help us to keep RBD and
CephFS specific options separately.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-04-19 12:28:46 +00:00
Madhu Rajanna
7b2aef0d81 util: add support for the nsenter
add support to run rbd map and mount -t
commands with the nsenter.

complete design of pod/multus network
is added here https://github.com/rook/rook/
blob/master/design/ceph/multus-network.md#csi-pods

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-04-08 10:23:21 +00:00
Madhu Rajanna
f8bbd2f60f cephfs: fix omap deletion in DeleteSnapshot
The omap is stored with the requested
snapshot name not with the subvolume
snapshotname. This fix uses the correct
snapshot request name to cleanup the omap
once the subvolume snapshot is deleted.

fixes: #2974

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-03-31 13:46:03 +00:00
Madhu Rajanna
77011fbc61 cephfs: remove kubernetes csi prefixed parameters
remove kubernetes csi prefixed parameters
from the volumeContext as we dont want
to store it in the PV VolumeAttributes.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-03-21 08:54:43 +00:00
Madhu Rajanna
d357bebbc2 cephfs: disallow creating small volumes from snapshot/volume
as per the CSI standard the size is optional parameter,
as we are allowing the clone to a bigger size
today we need to block the clone to a smaller size
as its a have side effects like data corruption etc.

Note:- Even though this check is present in kubernetes
sidecar as CSI is CO independent adding the check
here.

fixes: #2718

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-03-17 05:07:26 +00:00
Madhu Rajanna
78ec859dc6 cleanup: remove unwanted print
Removing unwanted print from the code

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-03-11 05:40:32 +00:00
Robert Vasek
80dda7cc30 cephfs: detect corrupt ceph-fuse mounts and try to remount
Mounts managed by ceph-fuse may get corrupted by e.g. the ceph-fuse process
exiting abruptly, or its parent container being terminated, taking down its
child processes with it.

This commit adds checks to NodeStageVolume and NodePublishVolume procedures
to detect whether a mountpoint in staging_target_path and/or target_path is
corrupted, and remount is performed if corruption is detected.

Signed-off-by: Robert Vasek <robert.vasek@cern.ch>
2022-03-10 06:05:52 +00:00
Robert Vasek
aa6297e164 cleanup: refactor helper functions in nodeserver.go
Refactored a couple of helper functions for easier resue.

* Code for building store.VolumeOptions is factored out into a separate function.

* Changed args of getCredentailsForVolume() and NodeServer.mount() so that
  instead of passing in whole csi.NodeStageVolumeRequest, only necessary
  properties are passed explicitly. This is to allow these functions to be
  called outside of NodeStageVolume() where NodeStageVolumeRequest is not
  available.

Signed-off-by: Robert Vasek <robert.vasek@cern.ch>
2022-03-10 06:05:52 +00:00
Madhu Rajanna
e9802c4940 cephfs: refactor cephfs core functions
This commits refactors the cephfs core
functions with interfaces. This helps in
better code structuring and writing the
unit test cases.

update #852

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-02-22 20:39:23 +00:00
Humble Chirammal
8f6a7da538 cephfs: dont set explicit permissions on the volume
At present we are node staging with worldwide permissions which is
not correct. We should allow the CO to take care of it and make
the decision. This commit also remove `fuseMountOptions` and
`KernelMountOptions` as they are no longer needed

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2022-02-09 17:30:29 +00:00
Madhu Rajanna
2943555904 cephfs: fix omap deletion in DeleteSnapshot
the omap is stored with the requested
snapshot name not with the subvolume
snapshotname. This fix uses the correct
snapshot request name to cleanup the omap
once the subvolume snapshot is deleted.

fixes: #2832

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-02-08 20:37:53 +00:00
Madhu Rajanna
992d257530 cephfs: fix error logging in filesystem.go
fix error message logging in filesystem.go

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-01-27 14:31:12 +00:00
Madhu Rajanna
14c008c419 cleanup: use interface in filesystem.go
Currently, we are using methods and all the methods
makes a network call to fetch details from the ceph
clusters, its difficult to write test cases for
these functions, if we move to the interfaces
we can make use of mock to write unit testing
for the caller functions.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-01-27 14:31:12 +00:00
Madhu Rajanna
2daf2f9f0c cephfs: log error message if clone fails
During CreateVolume from snapshot/volume,
its difficult to identify if the clone is
failed and a new clone is created. In case
of clone failure logging the error message
for better debugging.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-01-17 09:43:09 +00:00
Madhu Rajanna
ef14ea7723 cephfs: resize cloned, restored volume if required
Currently, as a workaround, we are calling
the resize volume on the cloned, restore volumes
to adjust the cloned, restored volumes.
With this fix, we are calling the resize volume
only if there is a size mismatch with requested
and the volume from which the new volume needs
to be created.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-01-12 10:44:11 +00:00
Humble Chirammal
68350e8815 cephfs: add SINGLE_NODE_{SINGLE/MULTI}_WRITER capability
SINGLE_NODE_WRITER capability ambiguity has been fixed in csi spec v1.5
which allows the SP drivers to declare more granular WRITE capability.
These are not really new capabilities rather capabilities introduced to
get the desired functionality from CO side based on the capabilities SP
driver support for various CSI operations.

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2022-01-11 19:40:22 +00:00
Madhu Rajanna
12e8e46bcf revert: remove explicit size setting of cloned volume
The ceph changes  are done on the both server and the
client side this change is not enough for remove
setting the size of cloned volumes.
this caused the regression like #2719 #2720 #2721 #2722.

This reverts commit 3565a342d5.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2021-12-21 14:15:46 +00:00