RBD supports creating rbd images with
object size, stripe unit and stripe count
to support striping. This PR adds the support
for the same.
More details about striping at
https://docs.ceph.com/en/quincy/man/8/rbd/#stripingfixes: #3124
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
This change helps read the cluster name from the cmdline args,
the provisioner will set the same on the RBD images.
Fixes: #2973
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Move k8s.GetVolumeMetadata() out of setVolumeMetadata() and rename it to
setAllMetadata() so that the same can be used for setting volume and
snapshot metadata.
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
This commit makes modification so as to allow pvc-pvc clone
with different storageclass having different encryption
configs.
This commit also modifies `copyEncryptionConfig()` to
include a `isEncrypted()` check within the function.
Signed-off-by: Rakshith R <rar@redhat.com>
Set snapshot-name/snapshot-namespace/snapshotcontent-name details
on RBD backend snapshot image as metadata on snapshot
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Make sure to set metadata when image exist, i.e. if the provisioner pod
is restarted while createVolume is in progress, say it created the image
but didn't yet set the metadata.
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
This helps Monitoring solutions without access to Kubernetes clusters to
display the details of the PV/PVC/NameSpace in their dashboard.
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Below are the 3 different cases where we need
the PVC namespace for encryption
* CreateVolume:- Read the namespace from the
createVolume parameters and store it in the omap
* NodeStage:- Read the namespace from the omap
not from the volumeContext
* Regenerate:- Read the pvc namespace from the claimRef
not from the volumeAttributes.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
remove kubernetes csi prefixed parameters
from the volumeContext as we dont want
to store it in the PV VolumeAttributes.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
This commit ensures that parent image is flattened before
creating volume.
- If the data source is a PVC, the underlying image's parent
is flattened(which would be a temp clone or snapshot).
hard & soft limit is reduced by 2 to account for depth that
will be added by temp & final clone.
- If the data source is a Snapshot, the underlying image is
itself flattened.
hard & soft limit is reduced by 1 to account for depth that
will be added by the clone which will be restored from the
snapshot.
Flattening step for resulting PVC image restored from snapshot is removed.
Flattening step for temp clone & final image is removed when pvc clone is
being created.
Fixes: #2190
Signed-off-by: Rakshith R <rar@redhat.com>
Makes the rbd images features in the storageclass
as optional so that default image features of librbd
can be used. and also kept the option to user
to specify the image features in the storageclass.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Return the dataPool used to create the image instead of the default one
provided by the createVolumeRequest.
In case of topologyConstrainedDataPools, they may differ.
Don't add datapool if it's not present
Signed-off-by: Sébastien Bernard <sebastien.bernard@sfr.com>
This commit removes the thick provisioning
code as thick provisioning is deprecated in
cephcsi 3.5.0.
fixes: #2795
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Currently most of the internal methods have the
rbdVolume as the received. As these methods
are completely internal and requires only
the fields of the rbdImage use rbdImage
as the receiver instead of rbdVolume.
updates #2742
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
as per the CSI standard the size is optional parameter,
as we are allowing the clone to a bigger size
today we need to block the clone to a smaller size
as its a have side effects like data corruption etc.
Note:- Even though this check is present in kubernetes
sidecar as CSI is CO independent adding the check
here.
updates: #2718
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
as per the CSI standard the size is optional parameter,
as we are allowing the restore to a bigger size
today we need to block the restore to a smaller size
as its a have side effects like data corruption.
Note:- Even though this check is present in kubernetes
sidecar as CSI is CO independent adding the check
here.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
SINGLE_NODE_WRITER capability ambiguity has been fixed in csi spec v1.5
which allows the SP drivers to declare more granular WRITE capability in form
of SINGLE_NODE_SINGLE_WRITER or SINGLE_NODE_MULTI_WRITER.
These are not really new capabilities rather capabilities introduced to
get the desired functionality from CO side based on the capabilities SP
driver support for various CSI operations, this new capabilities also help
to address new access mode RWOP (readwriteoncepod).
This commit adds a helper function which identity the request is of
multiwriter mode and also validates whether it is filesystem mode or
block mode. Based on the inspection it fails to allow multi write
requests for filesystem mode and only allow multi write request against
block mode.
This commit also adds unit tests for isMultiWriterBlock function which
validates various accesstypes and accessmodes.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
With introduction of go-ceph rbd admin task api, credentials are
no longer required to be passed as cli cmd is not invoked.
Signed-off-by: Rakshith R <rar@redhat.com>
after creating the rbd image log the image
details corresponding for the request along
with the request name.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
moved ParentName, ParentPool and ImageFeatureSet
fields to the rbdImage struct as these are the
first citizens on the rbdImage.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
If the requested volume size is greater than
the snapshot size, resize the cloned volume
after creating a clone from a snapshot.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
If the requested volume size is greater than
the parent volume size, resize the cloned volume
after creating a final clone from a parent volume.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
added a check to consider ErrImageNotFound error
during DeleteSnapshot operation, if the error
is ErrImageNotFound we need to ensure that image
is removed from the trash and also the rados
OMAP data is removed.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
as we are moving the VolSize to rbdImage struct
we should reuse the same instead of maintaining
one more field in rbdSnapshot struct.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
when doing the internal operation to get the
latest details the rbd image size is also getting
updated and this will update the volume size also
without actual requested size we cannot do the
resize operation for bigger clones. This commit
adds a new field called RequestedVolSize to rbdVolume
struct to hold the user requested size.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
added a new helper function called cleanupThickClone
to cleanup the snapshot and clone if the thick
provisioning is not fully completed.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
remove the bigger size validation when
creating a volume from a snapshot or when
creation a clone from a volume as we resized
the volume after cloning.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
IsBlockMultiNode() is a new helper that takes a slice of
VolumeCapability objects and checks if it includes multi-node access
and/or block-mode support.
This can then easily be used in other services that need checking for
these particular capabilities, and preventing multi-node block-mode
access.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
genVolFromVolID() is used by the CSI Controller service to create an
rbdVolume object from a CSI volume_id. This function is useful for
CSI-Addons Services as well, so rename it to GenVolFromVolID().
Signed-off-by: Niels de Vos <ndevos@redhat.com>
parseAndDeleteMigratedVolume() prviously clubbed the logic of
parsing of migration volume handle and then continued with the
deletion of the volume. however this commit split this
logic into two, ie parsing has been done in parseMigrationVolID()
and DeleteMigratedVolume() deletes the backend volume.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
This commit adds a couple of helper functions to parse the migration
request secret and set it for further csi driver operations.
More details:
The intree secret has a data field called "key" which is the base64
admin secret key. The ceph CSI driver currently expect the secret to
contain data field "UserKey" for the equivalant. The CSI driver also
expect the "UserID" field which is not available in the in-tree secret
by deafult. This missing userID will be filled (if the username differ
than 'admin') in the migration secret as 'adminId' field in the
migration request, this commit adds the logic to parse this migration
secret as below:
"key" field value will be picked up from the migraion secret to "UserKey"
field.
"adminId" field value will be picked up from the migration secret to "UserID"
field
if `adminId` field is nil or not set, `UserID` field will be filled with
default value ie `admin`.The above logic get activated only when the secret
is a migration secret, otherwise skipped to the normal workflow as we have
today.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
Thick-provisioning was introduced to make accounting of assigned space
for volumes easier. When thick-provisioned volumes are the only consumer
of the Ceph cluster, this works fine. However, it is unlikely that this
is the case. Instead, accounting of the requested (thin-provisioned)
size of volumes is much more practical as different types of volumes can
be tracked.
OpenShift already provides cluster-wide quotas, which can combine
accounting of requested volumes by grouping different StorageClasses.
In addition to the difficult practise of allowing only thick-provisioned
RBD backed volumes, the performance makes thick-provisioning
troublesome. As volumes need to be completely allocated, data needs to
be written to the volume. This can take a long time, depending on the
size of the volume. Provisioning, cloning and snapshotting becomes very
much noticeable, and because of the additional time consumption, more
prone to failures.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
After moving moving image to trash, if `trash remove` step fails,
then external-provisioner will issue subsequent requests, in which
image will be absent in pool( will be in trash) and omap cleanup will
be done with stale image left in trash with no `trash remove` step on it.
To avoid this scenario list trash images and find corresponding id for given
image name and add a task to flatten when we encounter a ErrImageNotFound.
Fixes: #1728
Signed-off-by: Rakshith R <rar@redhat.com>
During PVC snapshot/clone both kms config and passphrase needs to copied,
while for PVC restore only passphrase needs to be copied to dest rbdvol
since destination storageclass may have another kms config.
Signed-off-by: Rakshith R <rar@redhat.com>
This commit adds the logic to detect a passed in volumeID
is a migrated volume ID and if yes, the driver connect to the
backend cluster and clean/delete the image. The logic
only applied if its a migration volume ID. The migration volume ID
carry the information like mons, pool and image name which is
good enough for the driver to identify and connect to the backend
cluster for its operations.
migration volID format:
<mig>_mons-<monsHash>_image-<imageUID>_<poolHash>
Details on the hash values:
* MonsHash: this carry a hash value (md5sum) which will be acted as the
`clusterID` for the operations in this context.
* ImageUID: this is the unique UUID generated by kubernetes for the created
volume.
* PoolHash: this is an encoded string of pool name.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
This commit adds capability to genVolFromVolumeOptions() to fetch
mapped clusted-id & mon ips for mirrored PVC on secondary cluster
which may have different cluster-id.
This is required for NodeStageVolume().
We also don't need to check for mapping during volume create requests,
so it can be disabled by passing a bool checkClusterIDMapping as false.
GetMonsAndClusterID() is modified to accept bool checkClusterIDMapping
based on which clustermapping is checked to fetch mapped cluster-id and
mon-ips.
Signed-off-by: Rakshith R <rar@redhat.com>
all the error check scenarios of genVolFromVolID() and unreserving
omap entries based on the error made deleteVolume method complex,
this patch create a new function which handle the error check and
unrerving omap entries accordingly and finally return the response
to deletevolume/caller.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
Moving the log functions into its own internal/util/log package makes it
possible to split out the humongous internal/util packages in further
smaller pieces. This reduces the inter-dependencies between utility
functions and components, preventing circular dependencies which are not
allowed in Go.
Updates: #852
Signed-off-by: Niels de Vos <ndevos@redhat.com>
If the image is in a secondary state and its
up+replaying means its an healthy secondary
and the image is primary somewhere in the remote cluster
and the local image is getting replayed. Delete the
OMAP data generated as we cannot delete the
secondary image. When the image on the primary
cluster gets deleted/mirroring disabled, the image on
all the remote (secondary) clusters will get
auto-deleted. This helps in garbage collecting
the OMAP, PVC and PV objects after failback operation.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
This commit fixes snapshot id idempotency issue by
always returning an error when flattening is in progress
and not using `readyToUse:false` response.
Signed-off-by: Rakshith R <rar@redhat.com>