in toolbox mon endpoints are not
updated properly, this is causing an issue in E2E
this PR is a workaround to fix this issue.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
When people have reviewed a PR, and the contributor updates it, the
reviews are staying valid. This is unfortunate as the PR might need new
corrections.
Ideally a positive review (Approved) needs to be given again. And there
needs to be a confirmation that a negative review (Changes requested)
has been addressed.
Closes: #505
Signed-off-by: Niels de Vos <ndevos@redhat.com>
i think now its take time to discover the mon IP
from svc name in tool box, this is a workaround
to fix it.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Sometime rbd images are mapped even if the
connection timeout error occurs, this will
try to unmap if the received error message
is connection timeout.This will fix stale maps
and rbd image deletion issue
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
if both controller and nodeserver flags are set/unset
cephcsi will start both server,
if only one flag is set, it will start relavent
service.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
This will help user to check whats
the actual error. if the config file
is having issue or the clusterid is
not valid.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Image deletion takes time proportional to the size of the
image. Hence, ceph manager is enhanced to support async
deletion of an image, or rather passing the task of
deleting an image to the ceph manager.
This commit leverages the ceph manager enhancement in the CSI code.
NOTE: This is tested against a ceph cluster that is running
Ceph master version of the code. Once other releases
catch up in terms of the feature, the optimization would be
available to the CSI driver as well.
Fixes: #523
Signed-off-by: ShyamsundarR <srangana@redhat.com>
we have see better performace in device
format and mounting by setting the fstype to xfs
from default ext4.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
The container runtime CRI-O limits the number of PIDs to 1024 by
default. When many PVCs are requested at the same time, it is possible
for the provisioner to start too many threads (or go routines) and
executing 'rbd' commands can start to fail. In case a go routine can not
get started, the process panics.
The PID limit can be changed by passing an argument to kubelet, but this
will affect all pids running on a host. Changing the parameters to
kubelet is also not a very elegant solution.
Instead, the provisioner pod can change the configuration itself. The
pod is running in privileged mode and can write to /sys/fs/cgroup where
the limit is configured.
With this change, the limit is configured to 'max', just as if there is
no limit at all. The logs of the csi-rbdplugin in the provisioner pod
will reflect the change it makes when starting the service:
$ oc -n rook-ceph logs -c csi-rbdplugin csi-rbdplugin-provisioner-0
..
I0726 13:59:19.737678 1 cephcsi.go:127] Initial PID limit is set to 1024
I0726 13:59:19.737746 1 cephcsi.go:136] Reconfigured PID limit to -1 (max)
..
It is possible to pass a different limit on the commandline of the
cephcsi executable. The following flag has been added:
--pidlimit=<int> the PID limit to configure through cgroups
This accepts special values -1 (max) and 0 (default, do not
reconfigure). Other integers will be the limit that gets configured in
cgroups.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
In unstage we now adhere to the transaction (or order of steps)
done in Stage. To enable this we stash the image meta data
into a local file on the staging path for use with unstage
request.
This helps in unmapping a stale map, in case the mount or
other steps in the transaction are complete.
Signed-off-by: ShyamsundarR <srangana@redhat.com>
This change also starts mapping nbd based access using ther rbd CLI
as, it is a prerequisite to get device listing for nbd as well.
Signed-off-by: ShyamsundarR <srangana@redhat.com>
This commit moves the mounting of a block volumes and filesystems
to a sub-file (already the case) or a sub-dir within the staging
path.
This enables using the staging path to store any additional data
regarding the mount. For example, this will be extended in the
future to store the fsid of the cluster, and maybe the pool name
to map unmap requests to the right image.
Also, this fixes the noted hack in the code, to determine in a
common manner if there is a mount on the passed in staging path.
Signed-off-by: ShyamsundarR <srangana@redhat.com>
Sometimes the tests fail cleaning up due unavailable resources that are
listed in the .yaml files. Deleting the missing resources returns
"resource not found". By passing --ignore-not-found to kubectl, this
problem should not happen anymore (and possibly makes it more obvious
where tests do go wrong).
rook master deploys the ceph-csi
by default now, this will affect the
ceph-csi testing failure, This PR will
remove the ceph-csi resources created rook
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
once we map the rbd image on a node
we will get the device name its mapped
in the map output itself,no need to
check the devicepath post rbd mapping
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>