Added support for RBD PVC to PVC cloning, below
commands are executed to create a PVC-PVC clone from
RBD side.
* Check the depth(n) of the cloned image if n>=(hard limit -2)
or ((soft limit-2) Add a task to flatten the image and return
about (to avoid image leak) **Note** will try to flatten the
temp clone image in the chain if available
* Reserve the key and values in omap (this will help us to
avoid the leak as it's not reserved earlier as we have returned
ABORT (the request may not come back))
* Create a snapshot of rbd image
* Clone the snapshot (temp clone)
* Delete the snapshot
* Snapshot the temp clone
* Clone the snapshot (final clone)
* Delete the snapshot
```bash
1) check the image depth of the parent image if flatten required
add a task to flatten image and return ABORT to avoid leak
(hardlimit-2 and softlimit-2 check will be done)
2) Reserve omap keys
2) rbd snap create <RBD image for src k8s volume>@<random snap name>
3) rbd clone --rbd-default-clone-format 2 --image-feature
layering,deep-flatten <RBD image for src k8s volume>@<random snap>
<RBD image for temporary snap image>
4) rbd snap rm <RBD image for src k8s volume>@<random snap name>
5) rbd snap create <cloned RBD image created in snapshot process>@<random snap name>
6) rbd clone --rbd-default-clone-format 2 --image-feature <k8s dst vol config>
<RBD image for temporary snap image>@<random snap name> <RBD image for k8s dst vol>
7)rbd snap rm <RBD image for src k8s volume>@<random snap name>
```
* Delete temporary clone image created as part of clone(delete if present)
* Delete rbd image
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
during the checkSnapCloneExists we are checking
the image, if the image not found we are deleting
the snapshot on the parent image, This PR corrects
the comparasion. instead of snapshotNotFound we need
to check ImageNotFound error.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
updated unit testing for the kernel check
for deep flatten feature for both supported
upstream kernel version (5.1.0+) and RHEL
8.2 backport
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
as RHEL 8.2 supports the deep-flatten
feature, added it to the list to map
the rbd images on the node without flattening.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
as v5.1.0 supports the deep-flatten feature,lowering
the required version to map rbd images which
are having deep-flatten feature
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
we need to take lock on parent rbd image when
we are creating a snapshot from it, if the user
tries to delete/resize the rbd image when we are
taking snapshots,we may face issues. if the volume
lock is present on the rbd image, the user cannot
resize the rbd image nor delete the rbd image.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
flatten cloned images to remove the link
with the snapshot as the parent snapshot can
be removed from the trash.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Added maxsnapshotsonimage flag to flatten
the older rbd images on the chain to avoid
issue in krbd.The limit is in krbd since it
only allocate 1 4KiB page to handle all the
snapshot ids for an image.
The max limit is 510 as per
https://github.com/torvalds/linux/blob/
aaa2faab4ed8e5fe0111e04d6e168c028fe2987f/drivers/block/rbd.c#L98
in cephcsi we arekeeping the default to 450 to reserve 10%
to avoid issues.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
added listsnapshots function for an
rbd image to list all the snapshots
created from an rbd images, This will
list the snapshots which are in trash also.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Several places in the code compared errors directly with the go-ceph
sentinel errors. This change uses the errors.Is() function of go
1.13 instead. The err113 linter reported this issue as:
err113: do not compare errors directly, use errors.Is() instead
Signed-off-by: Sven Anderson <sven@redhat.com>
"github.com/pkg/errors" does not offer more functionlity than that we
need from the standard "errors" package. With Golang v1.13 errors can be
wrapped with `fmt.Errorf("... %w", err)`. `errors.Is()` and
`errors.As()` are available as well.
See-also: https://tip.golang.org/doc/go1.13#error_wrapping
Signed-off-by: Niels de Vos <ndevos@redhat.com>
By fixing the golangci-lint runs, this now gets reported as a problem.
Instead of addressing the compexity of the DeleteVolume() method here,
mark it as a TODO.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
currently, various calls to deleteImage does not
need the rbdStatus check. That is only required
when calling from DeleteVolume. This PR optimizes the
rbd image deletion by removing the status check.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
added skipForceFlatten flag to skip
the image deptha and skip image flattening.
This will be very useful if the kernel is
not listed in cephcsi which supports deep
flatten fauture.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
This Adds a support for create,delete snapshot
and creating a new rbd image from the snapshot.
* Create a snapshot
* Create a temporary snapshot from the parent volume
* Clone a new image from a temporary snapshot with options
--rbd-default-clone-format 2 --image-feature layering,deep-flatten
* Delete temporary snapshot created
* Create a new snapshot from cloned image
* Check the image chain depth, if the Softlimit is reached Add a
task Flatten the cloned image and return success. if the depth
is reached hard limit Add a task Flatten the cloned image and
return snapshot status ready as false
```bash
1) rbd snap create <RBD image for src k8s volume>@<random snap name>
2) rbd clone --rbd-default-clone-format 2 --image-feature
layering,deep-flatten <RBD image for src k8s volume>@<random snap>
<RBD image for temporary snap image>
3) rbd snap rm <RBD image for src k8s volume>@<random snap name>
4) rbd snap rm <RBD image for temporary snap image>@<random snap name>
5) check the depth, if the depth is greater than configured hard
limit add a task to flatten the cloned image return snapshot status
ready as false if the depth is greater than soft limit add a task
to flatten the image and return success
```
* Create a clone from snapshot
* Clone a new image from the snapshot with user-provided options
* Check the depth(n) of the cloned image if n>=(hard limit)
Add task to flatten the image and return ABORT (to avoid image leak)
```bash
1) rbd clone --rbd-default-clone-format 2 --image-feature
<k8s dst vol config> <RBD image for temporary snap image>@<random snap name>
<RBD image for k8s dst vol>
2) check the depth, if the depth is greater than configured hard limit
add a task to flatten the cloned image return ABORT error if the depth is
greater than soft limit add a task to flatten the image and return success
```
* Delete snapshot or pvc
* Move the temporary cloned image to the trash
* Add task to remove the image from the trash
```bash
1) rbd trash mv <cloned image>
2) ceph rbd task trash remove <cloned image>
```
With earlier implementation to delete the image, we used to add
a task to remove the image with new changes this cannot be done
as the image may contain snapshots or linking.so we will be
doing below steps to delete an image(this will be
applicable for both normal image and cloned image)
* Move the rbd image to the trash
* Add task to remove the image from the trash
```bash
1) rbd trash mv <image>
2) ceph rbd task trash remove <image>
```
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
we dont need to call the snapshot CLI functions
to get snapshot details. as these details are not
requried with new snapshot design.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
cephcsi need to store and retrieve the rbd image ID
in the omap as we need the image ID to add a
task to remove the image from the Trash.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
as with snapshot and cloning implementation
the rbd images cannot be deleted with rbd
remove or add a task to delete the rbd image
as it might contains the snapshots and clones.
we need to make use of the rbd mv trask and
add a task to remove the image from trash
once all its clones and snapshots links
are broken and there will no longer any
dependency between parent and child images.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
add a new function called getImageID to fetch
the image id of an image which need to be stored
and retrived for Delete operation.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
added Hardlimit and Softlimit flags for cephcsi
arguments. When the Softlimit is reached cephcsi
will start a background task to flatten the rbd
image and return success and if the hardlimit
is reached it will start a background task
to flatten the rbd image and return ready
to use as false to make sure that the image
will not be used until it is flatten.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
As we are using v2 cloning we dont need to
do protect the snapshot before cloning and
unprotect it before deleting.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
as we need to reuse the same code for both cephfs
and rbd moving the supported version check function
to util package, for better readability renamed
the function.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
A new version of gosec insists on handling errors returned by Close():
[/go/src/github.com/ceph/ceph-csi/internal/util/pidlimit.go:44] - G307 (CWE-): Deferring unsafe method "*os.File" on type "Close" (Confidence: HIGH, Severity: MEDIUM)
> defer cgroup.Close()
[/go/src/github.com/ceph/ceph-csi/internal/util/pidlimit.go:78] - G307 (CWE-): Deferring unsafe method "*os.File" on type "Close" (Confidence: HIGH, Severity: MEDIUM)
> defer f.Close()
[/go/src/github.com/ceph/ceph-csi/internal/util/pidlimit.go:113] - G307 (CWE-): Deferring unsafe method "*os.File" on type "Close" (Confidence: HIGH, Severity: MEDIUM)
> defer f.Close()
Signed-off-by: Niels de Vos <ndevos@redhat.com>
The generated ceph.conf does not need readable by the group, there is
only one (system) user consuming the configurations file.
This addresses the following gosec warning:
[/go/src/github.com/ceph/ceph-csi/internal/util/cephconf.go:52] - G306 (CWE-): Expect WriteFile permissions to be 0600 or less (Confidence: HIGH, Severity: MEDIUM)
> ioutil.WriteFile(CephConfigPath, cephConfig, 0640)
Signed-off-by: Niels de Vos <ndevos@redhat.com>
gosec-2.3.0 complains about the following:
[/go/src/github.com/ceph/ceph-csi/internal/util/cephcmds.go:146] - G307 (CWE-): Deferring unsafe method "*os.File" on type "Close" (Confidence: HIGH, Severity: MEDIUM)
> defer tmpFile.Close()
By logging the error from Close(), the warning is gone.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
cephcsi need to add mount the cephfs subvolume
as the readonly when the PVC type is ROX to
provide only readonly access to the users
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
With the current code base, the subvolumegroup will
be created once, and even for a different cluster,
subvolumegroup creation is not allowed again.
Added support multiple subvolumegroups creation by
validating one subvolumegroup creation per cluster.
Fixes: #1123
Signed-off-by: Yug Gupta <ygupta@redhat.com>
The previous function used to remove omap keys apparently did not
return errors when removing omap keys from a missing omap (oid).
Mimic that behavior when using the api.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
For any function that sets more than one key on a single oid setting
them as a batch will be more efficient.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
For any function that removes more than one key on a single oid removing
them as a batch will be more efficient.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
Taking this appraoch means that any function that must get more than one
key's value from the same oid can be more efficient by calling out to
ceph only once.
To be cautious and avoid missing things we always request ceph return
more keys than we actually expect to be set on the oid. If there are
unexpected keys there, we will not miss the keys we want if we first hit
an unexpected key if we were to limit ourselves to iterating only over
the number of keys we're expecting to be on the object.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
Convert the business-logic of the journal to use the new go-ceph based
omap manipulation functions.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
These new omap manipulation functions (get/set/remove) are roughly
equivalent to the previous command-line based approach but rely
on direct api calls to ceph.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
These types have private fields but we need to construct them outside of
the util package. Add New* methods for both.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
if the PVC access mode is ReadOnlyMany
or single node readonly, mounting the rbd
device path to the staging path as readonly
to avoid the write operation.
If the PVC acccess mode is readonly, mapping
rbd images as readonly.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
go-ceph v0.3 adds constants for ImageFeature values and their names.
Instead of hardcoding "layering" in several places, use the constant
given by librbd.
The rbdVolume.ImageFeatures does not seem to be used anywhere after the
conversion. Stashing the image metadata does include the ImageFeatures
as these are retrieved when getting the image information. It is safe to
drop ImageFeatures altogether and only use the imageFeatureSet instead.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
It seems that convering the release component from the unix.Utsrelease
type leaves some trailing "\x00" characters.
While splitting the string to compare kernel versions, these additional
characters might prevent converting the string to an int. Strip the
additional characters before returning the string.
Note:
"\x00" characters are not visible when printing to a file or screen.
They can be seen in hex-editors, or sending the output through 'xxd'.
Fixes: #1167
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Go 1.13 contains support for error wrapping. To support wrapping,
fmt.Errorf now has a %w verb for creating wrapped errors, and three
new functions in the errors package ( errors.Unwrap, errors.Is and
errors.As) simplify unwrapping and inspecting wrapped errors.
With this change, If we currently compare errors using ==, we have to
use errors.Is instead. Example:
if err == io.ErrUnexpectedEOF
becomes
if errors.Is(err, io.ErrUnexpectedEOF)
https://tip.golang.org/doc/go1.13#error_wrapping
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
IneffAssign warns about the two following statements:
Line 147: warning: ineffectual assignment to supported (ineffassign)
Line 148: warning: ineffectual assignment to ok (ineffassign)
Reported-by: https://goreportcard.com/report/github.com/ceph/ceph-csi
Updates: #975
Signed-off-by: Yug Gupta <ygupta@redhat.com>
This prevents the need to open the IOContext for additional operations
on the image.
It also addresses a leak of the IOContext in case `rbdVolume.open()` was
called. The method only returned the `rbd.Image` without the possibility
to close the related IOContext.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
There is a bug in current code where the devicePath
is always empty and the rbd image unmap never
happens if nodeplugin fails to mount the rbd image
to the stagingpath.
This is a fix to unmap the rbd image if some issue
occurs after rbd image is mapped.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
When mounting fails, the node-plugin should give a suggestion to check the
kernel logs so that users can report problems better.
Edited the existing log to include the message in both rbd and cephfs.
Fixes: https://github.com/ceph/ceph-csi/issues/1006
Signed-off-by: Mudit Agarwal <muagarwa@redhat.com>
util: golint warns about exported methods to have a
comment or to unexport them.
e2e: golint warns about package comment to be of the form
"Package e2e ..."
Reported-by: https://goreportcard.com/report/github.com/ceph/ceph-csi
Updates: #975
Signed-off-by: Yug Gupta <ygupta@redhat.com>
InvalidPoolID has recently been added, and can be used in other location
too. As GetPoolID is updated with this patch set, return InvalidPoolID
on errors too.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
GetPoolID() did not return ErrPoolNotFound in case the pool could not be
found. This has been addressed as well, so that looking for an existing
pool behaves the same for checking by Name or ID.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Before, the one CSIJournal type was handling both configuration and
providing methods to make changes to the journal. This created the
temptation to modify the state of the global configuration object to
enact changes through the method calls.
This change creates a new type `journal.Connection` that takes the
monitors and credentials to create a short(er)-lived object to actually
read and make changes on the journal. This also avoid mixing the
arguments needed to connect to the cluster with the arguments needed
for the various journal read & update calls.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
By using switch/case it is easier to follow the error checking of the
genVolFromVolID() function. In case a new error is added as a return of
the function, it will be simpler to add checking for it.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
The SetNamespace setter function was called only once, immediately after
the creation of a volume journal object in cephfs only.
Remove this function so that it is no longer implied that this field can
be mutated after the journal is created. In it's place, use an extended
"constructor" NewCSIVolumeJournalWithNamespace that takes a namespace
value at create-time only.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
The function SetCSIDirectorySuffix was used only one per (long-lived,
gloabl) journal object. It is simpler to construct the journal objects
with this needed parameter:
1. As it is required to function and non-optional AFAICT
2. Removes the temptation to mutate global object
3. Reduces LOC with exact same functionality
4. SetCSIDirectorySuffix would not behave correctly if called a 2nd time
anyway.
Point 4. means that if you called the function twice to change the
suffix when you previously had "csi.volumes.alice", you'd get
"csi.volumes.alice.bob" instead of "csi.volumes.bob" what one would
expect.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
The problem happens when multiple PVCs with the
same UUID are attached/mounted on a node. This
can happen after creating a PVC from a snapshot,
or cloning a PVC.
make nouuid as the default mount option if
the format type is xfs to avoid mounting
issues.
updates: #966
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
The gocyclo linter complains about the high complexity of the
CreateVolume() function:
> pkg/rbd/controllerserver.go:133:1: cyclomatic complexity 21 of func `(*ControllerServer).CreateVolume` is high (> 20) (gocyclo)
By splitting it up and separeting the creation of an exisint CSI Volume
object in buildCreateVolumeResponse(), the gocyclic linter does not
complain any longer.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
golint has a pretty struct stylechek, it down not allow different
variable names for methods on an object:
pkg/rbd/rbd_util.go:970:1: receiver name rbdVol should be consistent with previous receiver name rv for rbdVolume (golint)
func (rbdVol *rbdVolume) ensureEncryptionMetadataSet(ctx context.Context) error {
^
pkg/rbd/rbd_journal.go:166:26: ST1016: methods on the same type should have the same receiver name (seen 2x "rbdVol", 3x "rv") (stylecheck)
func (rbdVol *rbdVolume) Exists(ctx context.Context) (bool, error) {
^
Rename the 'rbdVol' variable to 'rv' to make it consistent.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
rbdVolume.open() was split from commit 5dd34732e1e while moving part of
the functionality to util.ClusterConnection. It seems that .open() is
not used anywhere at the moment, so drop it until follow-up patches
require it again.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
The shared util.ClusterConnection can be used for rbd.rbdVolume and
cephfs.volumeOptions to connect to the Ceph cluster. This will then use
the shared ConnPool, and functions for obtaining connection details will
be the same across cephfs and rbd packages.
The ClusterConnection.Creds credentials are temporarily available until
all the functions have been adapted to use go-ceph and the connection
from the ConnPool.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
The name of the CephFS SubvolumeGroup for the CSI volumes was hardcoded to "csi". To make permission management in multi tenancy environments easier, this commit makes it possible to configure the CSI SubvolumeGroup.
related to #798 and #931
golint warns about the following statements:
ceph-csi/internal/util/csiconfig.go
Line 49: warning: exported function Mons should have comment or be unexported (golint)
ceph-csi/pkg/util/volid.go :
Line 72: warning: exported method CSIIdentifier.ComposeCSIID should have comment
or be unexported (golint)
Reported-by: https://goreportcard.com/report/github.com/ceph/ceph-csi
Updates: #975
Signed-off-by: Yug Gupta <ygupta@redhat.com>
This new journal package isolates journal logic from the rest of util
and helps draw bright lines between what is a generic utility function
and what is csi journal logic.
Done partly as preparation for making use of go-ceph in journal.
No functional changes are made except to update references to allow the
code to compile.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
The NewErrSnapNameConflict will allow packages outside of "util" to
create new instances of the ErrSnapNameConflict error.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
The internal/ directory in Go has a special meaning, and indicates that
those packages are not meant for external consumption. Ceph-CSI does
provide public APIs for other projects to consume. There is no plan to
keep the API of the internally used packages stable.
Closes: #903
Signed-off-by: Niels de Vos <ndevos@redhat.com>