Problem:
-------
For rbd nbd userspace mounter backends, after a restart of the nodeplugin
all the mounts will start seeing IO errors. This is because, for rbd-nbd
backends there will be a userspace mount daemon running per volume, post
restart of the nodeplugin pod, there is no way to restore the daemons
back to life.
Solution:
--------
The volume healer is a one-time activity that is triggered at the startup
time of the rbd nodeplugin. It navigates through the list of volume
attachments on the node and acts accordingly.
For now, it is limited to nbd type storage only, but it is flexible and
can be extended in the future for other backend types as needed.
From a few feets above:
This solves a severe problem for nbd backed csi volumes. The healer while
going through the list of volume attachments on the node, if finds the
volume is in attached state and is of type nbd, then it will attempt to
fix the rbd-nbd volumes by sending a NodeStageVolume request with the
required volume attributes like secrets, device name, image attributes,
and etc.. which will finally help start the required rbd-nbd daemons in
the nodeplugin csi-rbdplugin container. This will allow reattaching the
backend images with the right nbd device, thus allowing the applications
to perform IO without any interruptions even after a nodeplugin restart.
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
This commit resolves godot linter issue
which says "Comment should end in a period (godot)".
Updates: #1586
Signed-off-by: Yati Padia <ypadia@redhat.com>
We have many declarations and invocations..etc with long lines which are
very difficult to follow while doing code reading. This address the issues
in 'internal/rbd/*server.go' and 'internal/rbd/driver.go' files to restrict
the line length to 120 chars.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
added ReplicationServer struct for the replication related
operation it also embed the ControllerServer which
already implements the helper functions like locking/unlocking etc.
removed getVolumeFromID and cleanup functions for better
code readability and easy maintaince.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
as RBD is implementing the replication
we are registering it. For CephFS, its
not implementing the replication we are
passing nil so we dont want to register
it.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
mount packges is moved from
k8s.io/utils/mount to a new repository
k8s.io/mount-utils. updated code to use
the same.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
An rbd image can have a maximum number of
snapshots defined by maxsnapshotsonimage
On the limit is reached the cephcsi will
start flattening the older snapshots and
returns the ABORT error message, The Request
comes after this as to wait till all the
images are flattened (this will increase the
PVC creation time. Instead of waiting till
the maximum snapshots on an RBD image, we can
have a soft limit, once the limit reached
cephcsi will start flattening the task to
break the chain. With this PVC creation time
will only be affected when the hard limit
(minsnapshotsonimage) reached.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Take operation locks on the resources before operating
on the resouces. This allows us to do parallel operations
for some RPC calls such as Clone and Restore of PVC.
This operations will only be blocked if the image is
expanding or Snapshot and RBD image is getting deleted.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
as v1.0.0 is deprecated we need to remove the support
for it in the Next coming (v3.0.0) release. This PR
removes the support for the same.
closes#882
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Added maxsnapshotsonimage flag to flatten
the older rbd images on the chain to avoid
issue in krbd.The limit is in krbd since it
only allocate 1 4KiB page to handle all the
snapshot ids for an image.
The max limit is 510 as per
https://github.com/torvalds/linux/blob/
aaa2faab4ed8e5fe0111e04d6e168c028fe2987f/drivers/block/rbd.c#L98
in cephcsi we arekeeping the default to 450 to reserve 10%
to avoid issues.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
added skipForceFlatten flag to skip
the image deptha and skip image flattening.
This will be very useful if the kernel is
not listed in cephcsi which supports deep
flatten fauture.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
added Hardlimit and Softlimit flags for cephcsi
arguments. When the Softlimit is reached cephcsi
will start a background task to flatten the rbd
image and return success and if the hardlimit
is reached it will start a background task
to flatten the rbd image and return ready
to use as false to make sure that the image
will not be used until it is flatten.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
The function SetCSIDirectorySuffix was used only one per (long-lived,
gloabl) journal object. It is simpler to construct the journal objects
with this needed parameter:
1. As it is required to function and non-optional AFAICT
2. Removes the temptation to mutate global object
3. Reduces LOC with exact same functionality
4. SetCSIDirectorySuffix would not behave correctly if called a 2nd time
anyway.
Point 4. means that if you called the function twice to change the
suffix when you previously had "csi.volumes.alice", you'd get
"csi.volumes.alice.bob" instead of "csi.volumes.bob" what one would
expect.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
This new journal package isolates journal logic from the rest of util
and helps draw bright lines between what is a generic utility function
and what is csi journal logic.
Done partly as preparation for making use of go-ceph in journal.
No functional changes are made except to update references to allow the
code to compile.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
The internal/ directory in Go has a special meaning, and indicates that
those packages are not meant for external consumption. Ceph-CSI does
provide public APIs for other projects to consume. There is no plan to
keep the API of the internally used packages stable.
Closes: #903
Signed-off-by: Niels de Vos <ndevos@redhat.com>