Commit Graph

104 Commits

Author SHA1 Message Date
Niels de Vos
b7703faf37 util: make inode metrics optional in FilesystemNodeGetVolumeStats()
CephFS does not have a concept of "free inodes", inodes get allocated
on-demand in the filesystem.

This confuses alerting managers that expect a (high) number of free
inodes, and warnings get produced if the number of free inodes is not
high enough. This causes alerts to always get reported for CephFS.

To prevent the false-positive alerts from happening, the
NodeGetVolumeStats procedure for CephFS (and CephNFS) will not contain
inodes in the reply anymore.

See-also: https://bugzilla.redhat.com/2128263
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2022-10-13 19:02:47 +00:00
Niels de Vos
83df1eae53 rebase: k8s.io/mount-utils/IsNotMountPoint() is deprecated
IsNotMountPoint() is deprecated and Mounter.IsMountPoint() is
recommended to be used instead.

Reported-by: golangci/staticcheck
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2022-08-04 09:53:07 +00:00
Niels de Vos
3a200b6976 rbd: use IsLikelyNotMountPoint() to prevent systemd log messages
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2022-08-04 09:53:07 +00:00
Niels de Vos
011d4fc81c cleanup: create k8s.io/mount-utils Mounter only once
Recently the k8s.io/mount-utils package added more runtime dectection.
When creating a new Mounter, the detect is run every time. This is
unfortunate, as it logs a message like the following:

```
mount_linux.go:283] Detected umount with safe 'not mounted' behavior
```

This message might be useful, so it probably good to keep it.

In Ceph-CSI there are various locations where Mounter instances are
created. Moving that to the DefaultNodeServer type reduces it to a
single place. Some utility functions need to accept the additional
parameter too, so that has been modified as well.

See-also: kubernetes/kubernetes#109676
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2022-07-21 07:14:43 +00:00
Niels de Vos
14ba1498bf util: reduce systemd related errors while mounting
There are regular reports that identify a non-error as the cause of
failures. The Kubernetes mount-utils package has detection for systemd
based environments, and if systemd is unavailable, the following error
is logged:

    Cannot run systemd-run, assuming non-systemd OS
    systemd-run output: System has not been booted with systemd as init
    system (PID 1). Can't operate.
    Failed to create bus connection: Host is down, failed with: exit status 1

Because of the `failed` and `exit status 1` error message, users might
assume that the mounting failed. This does not need to be the case. The
container-images that the Ceph-CSI projects provides, do not use
systemd, so the error will get logged with each mount attempt.

By using the newer MountSensitiveWithoutSystemd() function from the
mount-utils package where we can, the number of confusing logs get
reduced.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2022-07-04 10:02:54 +00:00
Madhu Rajanna
1952a9b4b3 ci: fix all linter errors found in golangci-lint
Fixing all the linter errors found in golang-ci
lint v1.46.2

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-06-03 12:55:54 +00:00
Prasanna Kumar Kalever
83cc1b0e58 rbd: handle when krbdFeatures is zero
krbdFeatures is set to zero when kernel version < 3.8, i.e. in  case where
/sys/bus/rbd/supported_features is absent and we are unable to prepare
the krbd attributes based on kernel version.

When krbdFeatures is set to zero fallback to NBD only when autofallback
is turned ON.

Fixes: #2678
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2022-05-15 15:10:08 +00:00
Madhu Rajanna
70674565df rbd: consider rbd as default mounter if not set
For the default mounter the mounter option
will not be set in the storageclass and as it is
not available in the storageclass same will not
be set in the volume context, Because of this the
mapOptions are getting discarded. If the mounter
is not set assuming it's an rbd mounter.

Note:- If the mounter is not set in the storageclass
we can set it in the volume context explicitly,
Doing this check-in node server to support backward
existing volumes and the check is minimal we are not
altering the volume context.

fixes: #3076

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-05-09 20:00:11 +00:00
Madhu Rajanna
766346868e util: Add RBD specific options in clusterInfo
As the netNamespaceFilePath can be separate for
both cephfs and rbd adding the netNamespaceFilePath
path for RBD, This will help us to keep RBD and
CephFS specific options separately.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-04-19 12:28:46 +00:00
Madhu Rajanna
7b2aef0d81 util: add support for the nsenter
add support to run rbd map and mount -t
commands with the nsenter.

complete design of pod/multus network
is added here https://github.com/rook/rook/
blob/master/design/ceph/multus-network.md#csi-pods

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-04-08 10:23:21 +00:00
Madhu Rajanna
8c5e414d53 rbd: do not read pvc namespace from volume attributes
Below are the 3 different cases where we need
the PVC namespace for encryption

* CreateVolume:- Read the namespace from the
createVolume parameters and store it in the omap
* NodeStage:- Read the namespace from the omap
not from the volumeContext
* Regenerate:- Read the pvc namespace from the claimRef
not from the volumeAttributes.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-03-21 08:54:43 +00:00
Madhu Rajanna
d5c98f81a2 rbd: make image features as optional parameter
Makes the rbd images features in the storageclass
as optional so that default image features of librbd
can be used. and also kept the option to user
to specify the image features in the storageclass.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-02-28 13:10:03 +00:00
Madhu Rajanna
28fef9b379 cleanup: remove thick provisioning code
This commit removes the thick provisioning
code as thick provisioning is deprecated in
cephcsi 3.5.0.

fixes: #2795

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2022-01-28 11:17:15 +00:00
Humble Chirammal
7ff048bf1e e2e: add podsecuritycontext fsgroup for normal user validation
considering the pod has run as normal user, the fsgroup has also
set to the same.

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2022-01-25 16:25:11 +00:00
Humble Chirammal
bf4ba0ec84 rbd: dont attempt explicit permission mod change from the RBD driver
currently we are overriding the permission to `0o777` at time of node
stage which is not the correct action. That said, this permission
change causes an extra permission correction at time of nodestaging
by the CO while the FSGROUP change policy has been set to
`OnRootMismatch`.

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2022-01-25 16:25:11 +00:00
Humble Chirammal
3730a462f4 rbd: add SINGLE_NODE{SINGLE_MULTI}_WRITER capabilities
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2022-01-11 19:40:22 +00:00
Rakshith R
9adb25691c rbd: remove redundant util.Credentials arg from flattenRbdImage()
With introduction of go-ceph rbd admin task api, credentials are
no longer required to be passed as cli cmd is not invoked.

Signed-off-by: Rakshith R <rar@redhat.com>
2022-01-06 12:28:18 +00:00
Madhu Rajanna
3169c8e23a rbd: expand filesystem during NodeStageVolume
If the volume with a bigger size is created
from a snapshot or from another volume we
need to exapand the filesystem also in the
csidriver as nodeExpand request is not triggered
for this one, During NodeStageVolume we can
expand the filesystem by checking filesystem
needs expansion or not.

If its a encrypted device, check the device
size of rbd device and the LUKS device if required
the device will be expanded before
expanding the filesystem.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2021-12-23 03:47:00 +00:00
Humble Chirammal
88911eb4e9 rbd: add migration secret support to controllerserver functions
This commit adds the migration secret request validation to expand,
create controller functions.

Ref # https://github.com/ceph/ceph-csi/issues/2509

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2021-12-20 07:34:43 +00:00
Niels de Vos
5baf9811f9 rbd: export NodeServer.mounter outside of the rbd package
NodeServer.mounter is internal to the NodeServer type, but it needs to
be initialized by the rbd-driver. The rbd-driver is moved to its own
package, so .Mounter needs to be available from there in order to set
it.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2021-12-10 07:35:26 +00:00
Prasanna Kumar Kalever
bdcf3273b5 rbd: provide a way to supply mounter specific mapOptions from sc
Uses the below schema to supply mounter specific map/unmapOptions to the
nodeplugin based on the discussion we all had at
https://github.com/ceph/ceph-csi/pull/2636

This should specifically be really helpful with the `tryOthermonters`
set to true, i.e with fallback mechanism settings turned ON.

mapOption: "kbrd:v1,v2,v3;nbd:v1,v2,v3"

- By omitting `krbd:` or `nbd:`, the option(s) apply to
  rbdDefaultMounter which is krbd.
- A user can _override_ the options for a mounter by specifying `krbd:`
  or `nbd:`.
  mapOption: "v1,v2,v3;nbd:v1,v2,v3"
  is effectively the same as the 1st example.
- Sections are split by `;`.
- If users want to specify common options for both `krbd` and `nbd`,
  they should mention them twice.

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2021-11-23 08:54:37 +00:00
Madhu Rajanna
7bbd2ea284 rbd: use small case of error message
the error message should not start with
the capital letter changing the case as
per the standard.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2021-11-18 10:44:12 +00:00
Madhu Rajanna
51998a5f4a cleanup: log the image name and pool name
instead of logging the volumeID and the pool
name. log the poolname and image name for better
debugging.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2021-11-18 10:44:12 +00:00
Niels de Vos
7e22180125 rbd: call undoStagingTransaction() when NodeStageVolume() fails
On line 341 a `transaction` is created. This is passed to the deferred
`undoStagingTransaction()` function when an error in the
`NodeStageVolume` procedure is detected. So far, so good.

However, on line 356 a new `transaction` is returned. This new
`transaction` is not used for the defer call.

By removing the empty `transaction` that is used in the defer call, and
calling `undoStagingTransaction()` on an error of `stageTransaction()`,
the code is a little simpler, and the cleanup of the transaction should
be done correctly now.

Updates: #2610
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2021-11-17 23:58:00 +00:00
Prasanna Kumar Kalever
e6fa392df1 rbd: fix mapOptions passing with rbd-nbd mounter
This was a regression introduced by:
https://github.com/ceph/ceph-csi/pull/2556

Fixes: #2610
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2021-11-16 10:12:46 +00:00
Prasanna Kumar Kalever
9a3170bf77 rbd: provide a way to disable the auto fallback to nbd mounter
This change allows the user to choose not to fallback to NBD mounter
when some ImageFeatures are absent with krbd driver, rather just fail
the NodeStage call.

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2021-11-01 08:17:36 +00:00
Prasanna Kumar Kalever
bfc24f6f12 cleanup: generalize the parseBool function
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2021-11-01 08:17:36 +00:00
Prasanna Kumar Kalever
84ec797dda rbd: detect krbd features in runtime and fallback to nbd
Currently, we recognize and warn for the provided image features based on
our prior intelligence at ceph-csi (i.e based on supportedFeatures map
and validateImageFeatures) at image/PV creation time. It might be very
much possible that the cluster is heterogeneous i.e. the PV creation and
application container might both be on different nodes with different
kernel versions (krbd driver versions).

This PR adds a mechanism to check for the supported krbd features during
mount time, if the krbd driver doesn't have the specified image feature
then it will fall back to rbd-nbd mounter.

Fixes: #478
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2021-11-01 08:17:36 +00:00
Humble Chirammal
6aec858cba rbd: parse migration secret and set fields for nodestage operations
this commit make use of the migration request secret parsing and set
the required fields for further nodestage operations

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2021-10-27 18:35:00 +00:00
Humble Chirammal
c584fa20da rbd: use clusterID from volumeContext at nodestage
previously we were retriving clusterID using the monitors field
in the volume context at node stage code path. however it is possible to
retrieve or use clusterID directly from the volume context. This
commit also remove the getClusterIDFromMigrationVolume() function
which was used previously and its tests

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2021-10-11 10:06:30 +00:00
Madhu Rajanna
8ebc0659ab rbd: perform resize of file system for static volume
For static volume, the user will manually mounts
already existing image as a volume to the application
pods. As its a rbd Image, if the PVC is of type
fileSystem the image will be mapped, formatted
and mounted on the node,
If the user resizes the image on the ceph cluster.
User cannot not automatically resize the filesystem
created on the rbd image. Even if deletes and
recreates the kubernetes objects, the new size
will not be visible on the node.

With this changes During the NodeStageVolumeRequest
the nodeplugin will check the size of the mapped rbd
image on the node using the devicePath. and also
the rbd image size on the ceph cluster.

If the size is not matching it will do the file
system resize on the node as part of the
NodeStageVolumeRequest RPC call.

The user need to do below operation to see new size
* Resize the rbd image in ceph cluster
* Scale down all the application pods using the static
PVC.
* Make sure no application pods which are using the
static PVC is running on a node.
* Scale up all the application pods.

Validate the new size in application pod mounted
volume.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2021-10-06 13:15:00 +00:00
Madhu Rajanna
fe9020260d rbd: move flattening to helper function
in NodeStage operation we are flattening
the image to support mounting on the older
clients. this commits moves it to a helper
function to reduce code complexity.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2021-10-06 13:15:00 +00:00
Madhu Rajanna
cda2abca5d rbd: use NewMetricsBlock to get size
instead of lsblk command use NewMetricsBlock
function from the kubernetes package to get
the size.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2021-10-06 13:15:00 +00:00
Humble Chirammal
3c9d7e3cd5 rbd: detect migration volID in DeleteVolume() and delete rbd image
This commit adds the logic to detect a passed in volumeID
is a migrated volume ID and if yes, the driver connect to the
backend cluster and clean/delete the image. The logic
only applied if its a migration volume ID. The migration volume ID
carry the information like mons, pool and image name which is
good enough for the driver to identify and connect to the backend
cluster for its operations.

migration volID format:
<mig>_mons-<monsHash>_image-<imageUID>_<poolHash>

Details on the hash values:

* MonsHash: this carry a hash value (md5sum) which will be acted as the
`clusterID` for the operations in this context.

* ImageUID: this is the unique UUID generated by kubernetes for the created
volume.

* PoolHash: this is an encoded string of pool name.

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2021-10-04 16:06:31 +00:00
Humble Chirammal
2e8e8f5e64 rbd: fill clusterID if its a migration nodestage request
the migration nodestage request does not carry the 'clusterID' in it
and only monitors are available with the volumeContext. The volume
context flag 'migration=true' and 'static=true' flags allow us to
fill 'clusterID' from the passed in monitors to the volume Context,so
that rest of the static operations on nodestage can be proceeded as we
do treat static volumes today.

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2021-09-20 09:54:54 +00:00
Prasanna Kumar Kalever
c9cc36d8db rbd: provide alternatives to preserve the ceph log files
Currently, we delete the ceph client log file on unmap/detach.

This patch provides additional alternatives for users who would like to
persist the log files.

Strategies:
-----------
`remove`: delete log file on unmap/detach
`compress`: compress the log file to gzip on unmap/detach
`preserve`: preserve the log file in text format

Note that the default strategy will be remove on unmap, and these options
can be tweaked from the storage class

Compression size details example:

On Map: (with debug-rbd=20)
---------
$ ls -lh
-rw-r--r-- 1 root root 526K Sep  1 18:15
rbd-nbd-0001-0024-fed5480a-f00f-417a-a51d-31d8a8144c03-0000000000000003-d2e89c87-0b4d-11ec-8ea6-160f128e682d.log

On unmap:
---------
$ ls -lh
-rw-r--r-- 1 root root  33K Sep  1 18:15
rbd-nbd-0001-0024-fed5480a-f00f-417a-a51d-31d8a8144c03-0000000000000003-d2e89c87-0b4d-11ec-8ea6-160f128e682d.gz

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2021-09-16 13:55:15 +00:00
Prasanna Kumar Kalever
10bbb049f7 cleanup: passing pointers to larger type
Log:
internal/rbd/rbd_attach.go:424:2: hugeParam: dArgs is heavy (88 bytes);
consider passing it by pointer (gocritic)

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2021-09-16 13:55:15 +00:00
Rakshith R
9d1e98ca60 rbd: check for clusterid mapping in genVolFromVolumeOptions()
This commit adds capability to genVolFromVolumeOptions() to fetch
mapped clusted-id & mon ips for mirrored PVC on secondary cluster
which may have different cluster-id.

This is required for NodeStageVolume().

We also don't need to check for mapping during volume create requests,
so it can be disabled by passing a bool checkClusterIDMapping as false.

GetMonsAndClusterID() is modified to accept bool checkClusterIDMapping
based on which clustermapping is checked to fetch mapped cluster-id and
mon-ips.

Signed-off-by: Rakshith R <rar@redhat.com>
2021-09-14 08:39:57 +00:00
Humble Chirammal
3f31ca8a3a cleanup: introduce populateVolOptions(), to fill rbdVol from stage req
At present the nodeStageVolume() handle many logic of filling rbdvol
struct based on the request received and this method is complex to
follow. with this patch, filling or populating volOptions has been
segregrated and handled hence make the stage functions' job easy.

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2021-09-06 07:49:03 +00:00
Niels de Vos
6d00b39886 cleanup: move log functions to new internal/util/log package
Moving the log functions into its own internal/util/log package makes it
possible to split out the humongous internal/util packages in further
smaller pieces. This reduces the inter-dependencies between utility
functions and components, preventing circular dependencies which are not
allowed in Go.

Updates: #852
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2021-08-26 09:34:05 +00:00
Prasanna Kumar Kalever
ea3def0db2 rbd: remove per volume rbd-nbd logfiles on detach
- Update the meta stash with logDir details
- Use the same to remove logfile on unstage/unmap to be space efficient

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2021-08-24 07:15:30 +00:00
Prasanna Kumar Kalever
d67e88ccd0 cleanup: embed args into struct and pass it to detachRBDImageOrDeviceSpec
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2021-08-24 07:15:30 +00:00
Prasanna Kumar Kalever
682b3a980b rbd: rbd-nbd logging the ceph-CSI way
- One logfile per device/volume
- Add ability to customize the logdir, default: /var/log/ceph

Note: if user customizes the hostpath to something else other than default
/var/log/ceph, then it is his responsibility to update the `cephLogDir`
in storageclass to reflect the same with daemon:

```
cephLogDir: "/var/log/mynewpath"
```

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2021-08-24 07:15:30 +00:00
Prasanna Kumar Kalever
526ff95f10 rbd: add support to expand encrypted volume
Previously in ControllerExpandVolume() we had a check for encrypted
volumes and we use to fail for all expand requests on an encrypted
volume. Also for Block VolumeMode PVCs NodeExpandVolume used to be
ignored/skipped.

With these changes, we add support for the expansion of encrypted volumes.
Also for raw Block VolumeMode PVCs with Encryption we call NodeExpandVolume.

That said,
With LUKS1, cryptsetup utility doesn't prompt for a passphrase on resizing
the crypto mapper device. This is because LUKS1 devices don't use kernel
keyring for volume keys.

Whereas, LUKS2 devices use kernel keyring for volume key by default, i.e.
cryptsetup utility asks for a passphrase if it detects volume key was
previously passed to dm-crypt via kernel keyring service, we are overriding
the default by --disable-keyring option during cryptsetup open command.
So that at the time of crypto mapper device resize we will not be
prompted for any passphrase.

Fixes: #1469

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2021-07-23 10:00:23 +00:00
Yati Padia
1ae2afe208 cleanup: modifies the error caused due to merged PRs
This commit modifies the error of godot, cyclop,
paralleltest linter caused due to merged PRs.

Updates: #1586

Signed-off-by: Yati Padia <ypadia@redhat.com>
2021-07-22 18:15:48 +00:00
Rakshith R
43f753760b cleanup: resolve nlreturn linter issues
nlreturn linter requires a new line before return
and branch statements except when the return is alone
inside a statement group (such as an if statement) to
increase code clarity. This commit addresses such issues.

Updates: #1586

Signed-off-by: Rakshith R <rar@redhat.com>
2021-07-22 06:05:01 +00:00
Prasanna Kumar Kalever
b6a88dd728 rbd: add volume healer
Problem:
-------
For rbd nbd userspace mounter backends, after a restart of the nodeplugin
all the mounts will start seeing IO errors. This is because, for rbd-nbd
backends there will be a userspace mount daemon running per volume, post
restart of the nodeplugin pod, there is no way to restore the daemons
back to life.

Solution:
--------
The volume healer is a one-time activity that is triggered at the startup
time of the rbd nodeplugin. It navigates through the list of volume
attachments on the node and acts accordingly.

For now, it is limited to nbd type storage only, but it is flexible and
can be extended in the future for other backend types as needed.

From a few feets above:
This solves a severe problem for nbd backed csi volumes. The healer while
going through the list of volume attachments on the node, if finds the
volume is in attached state and is of type nbd, then it will attempt to
fix the rbd-nbd volumes by sending a NodeStageVolume request with the
required volume attributes like secrets, device name, image attributes,
and etc.. which will finally help start the required rbd-nbd daemons in
the nodeplugin csi-rbdplugin container. This will allow reattaching the
backend images with the right nbd device, thus allowing the applications
to perform IO without any interruptions even after a nodeplugin restart.

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2021-07-16 16:30:58 +00:00
Prasanna Kumar Kalever
6007fc9bfe cleanup: move static volume check to helper function
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2021-07-16 16:30:58 +00:00
Prasanna Kumar Kalever
6d24080851 rbd: update per volume metadata stash-file with devicePath
As part of stage transaction if the mounter is of type nbd, then capture
device path after a successful rbd-nbd map.

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2021-07-16 16:30:58 +00:00
Humble Chirammal
61bf49a4f5 rbd: Get rid of locking at nodePublish
Considering kubelet make sure the stage and publish operations
are serialized, we dont need any extra locking in nodePublish

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2021-07-16 07:18:56 +00:00