In the case of the Async DR, the volumeID will
not be the same if the clusterID or the PoolID
is different, With Earlier implementation, it
is expected that the new volumeID mapping is
stored in the rados omap pool. In the case of the
ControllerExpand or the DeleteVolume Request,
the only volumeID will be sent it's not possible
to find the corresponding poolID in the new cluster.
With This Change, it works as below
The csi-rbdplugin-controller will watch for the PV
objects, when there are any PV objects created it
will check the omap already exists, If the omap doesn't
exist it will generate the new volumeID and it checks for
the volumeID mapping entry in the PV annotation, if the
mapping does not exist, it will add the new entry
to the PV annotation.
The cephcsi will check for the PV annotations if the
omap does not exist if the mapping exists in the PV
annotation, it will use the new volumeID for further
operations.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
It seems that newer versions of some tools/libraries identify encrypted
filesystems with `crypto_LUKS` instead of `crypt`.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
In case of the DR the image on the primary site cannot be
demoted as the cluster is down, during failover the image need
to be force promoted. RBD returns `Device or resource busy`
error message if the image cannot be promoted for above reason.
Return FailedPrecondition so that replication operator can send
request to force promote the image.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
instead of fetching the force option from the
parameters. Use the Force field available in
the PromoteVolume Request.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
as the org github.com/kube-storage is renamed
to github.com/csi-addons as the name kube-storage
was more generic.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Incase of resync the image will get deleted, gets
recreated and its a a time consuming operation.
It makes sense to return aborted error instead
of not found as we have omap data only the image
is missing in rbd pool.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Do resync if the image is in unknow or in error
state.
Check for the current image state for up+stopped
or up+replaying and also all peer site status
should be un up+stopped to confirm that resyncing
is done and image can be promoted and used.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
added replication related operations as a method
of rbdImage as these methods can be easily used
when we introduce volumesnaphot mirroring operations.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
the rbd mirror state can be in enabled,disabled
or disabling state. If the mirroring is not disabled
yet and still in disabling state. we need to
check for it and return abort error message if
the mirroring is still getting disabled.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
added ReplicationServer struct for the replication related
operation it also embed the ControllerServer which
already implements the helper functions like locking/unlocking etc.
removed getVolumeFromID and cleanup functions for better
code readability and easy maintaince.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
There is no need for each EncryptionKMS to implement the same GetID()
function. We have a VolumeEncryption type that is more suitable for
keeping track of the KMS-ID that was used to get the configuration of
the KMS.
This does not change any metadata that is stored anywhere.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Current rbd plugin only supports the layering feature
for rbd image. Add exclusive-lock and journaling image
features for the rbd.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Signed-off-by: woohhan <woohyung_han@tmax.co.kr>
rbdVolumes can have several resources that get allocated during its
usage. Only destroying the IOContext may not be suffiecient and can
cause resource leaks.
Use rbdVolume.Destroy() when the rbdVolume is not used anymore.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Connections are reference counted, so just assigning the connection to
an other object for re-use is not correct. This can cause connections to
be garbage collected while something else is still using it.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
as RBD is implementing the replication
we are registering it. For CephFS, its
not implementing the replication we are
passing nil so we dont want to register
it.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Because rbdVolume and rbdSnapshot are very similar, they can be based
off a common struct rbdImage that contains the common attributes and
functions.
This makes it possible to re-use functions for snapshots, and prevents
further duplication or code.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
The rbdSnapshot and rbdVolume structs have many common attributes. In
order to combine these into an rbdImage struct that implements shared
functionality, having the same attribute for the ID makes things much
easier.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
NewVolumeEncryption() will return an indication that an alternative
DEKStore needs to be configured in case the KMS does not support it.
setKMS() will also set the DEKStore if needed, so renaming it to
configureEncryption() makes things clearer.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Use DEKStore API for Fetching and Storing passphrases.
Drop the fallback for the old KMS interface that is now provided as
DEKStore. The original implementation has been re-used for the DEKStore
interface.
This also moves GetCryptoPassphrase/StoreNewCryptoPassphrase functions
to methods of VolumeEncryption.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Prepare for grouping encryption related functions together. The main
rbdVolume object should not be cluttered with KMS or DEK procedures.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
For NodeUnstageVolume its a two step process,
first unmount the volume and than unmap the volume.
Currently, we are logging only after rbd unmapping is done.
sometimes it becomes difficult to debug with above logging
whether more time is spent in unmount or unmap.
This commits adds one more debug log after unmount is done.
with this we can identify where exactly more time is spent
by looking at the logs.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
It seems that writing more than 1 GiB per WriteSame() operation causes
an EINVAL (22) "Invalid argument" error. Splitting the writes in blocks
of maximum 1 GiB should prevent that from happening.
Not all volumes are of a size that is the multiple of the stripe-size.
WriteSame() needs to write full blocks of data, so in case there is a
small left-over, it will be filled with WriteAt().
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Introduce initKMS() as a function of rbdVolume. KMS functionality does
not need to pollute general RBD image functions. Encryption functions
are now in internal/rbd.encryption.go, so move initKMS() there as well.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
When and RBD image is expanded, the additional extents need to get
allocated when the image was thick-provisioned.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
When images get resized/expanded, the additional space needs to be
allocated if the image was initially thick-provisioned. By marking the
image with a "thick-provisioned" key in the metadata, future operations
can check the need.
A missing "thick-provisioned" key indicates that the image has not been
thick-provisioned.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Write blocks of stripe-size to allocate RBD images when
Thick-Provisioning is enabled in the StorageClass.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Add an option to the StorageClass to support creating fully allocated
(thick provisioned) RBD images
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Signed-off-by: Niels de Vos <ndevos@redhat.com>
When a volume was provisioned by an old Ceph-CSI provisioner, the
metadata of the RBD image will contain `requiresEncryption` to indicate
a passphrase needs to be created. New Ceph-CSI provisioners create the
passphrase in the CreateVolume request, and set `encryptionPrepared`
instead.
When a new node-plugin detects that `requiresEncryption` is set in the
RBD image metadata, it will fallback to the old behaviour.
In case `encryptionPrepared` is read from the RBD image metadata, the
passphrase is used to cryptsetup/format the image.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
This adds internal/rbd/encryption.go which will be used to include other
encryption functionality to support additional KMS related functions. It
will work together with the shared API from internal/util/kms.go.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Have the provisioner create the passphrase for the volume, instead of
doign it lazily at the time the volume is used for the 1st time. This
prevents potential races where pods on different nodes try to store
different passphrases at the (almost) same time.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
It seems that calls to addRbdManagerTask() do not include the
rados-namespace in the image location. Functions calling
addRbdManagerTask() construct the image location themselves, but should
use rbdVolume.String() to include all the attributes.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Make rbdplugin pod work in a non-initial network namespace (i.e. with
"hostNetwork: false") by skipping waiting for udev events when mapping
and unmapping images. CSI use case is very simple: all that is needed
is a device node which is immediately fed to mkfs, so we should be able
to tolerate udev not being finished with the device just fine.
Fixes: #1323
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
With the new support for passing --options, referring to ExecCommand()
argument slices as mapOptions and options is confusing.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
mount packges is moved from
k8s.io/utils/mount to a new repository
k8s.io/mount-utils. updated code to use
the same.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>