When the cephcsi executable receives an error from
validateCloneDepthFlag(), it panics due to klog.Fatalln(). The errors
that validateCloneDepthFlag() logs should be understandable enough that
users know what to investigate. A Go panic on a user error is not very
user-friendly, and does not provide any additional useful information.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
When the cephcsi executable receives an error from util.ValidateURL()
on the optional "metricspath", it panics due to klog.Fatalln(). The
error that util.ValidateURL() returns should be understandable enough
that users know what to investigate. A Go panic on a user error is not
very user-friendly, and does not provide any additional useful
information.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
When the cephcsi executable receives an error from
util.ValidateDriverName(), it panics due to klog.Fatalln(). The error
that util.ValidateDriverName() returns should be understandable enough
that users know what to investigate. A Go panic on a user error is not
very user-friendly, and does not provide any additional useful
information.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
When running the 'cephcsi' executable without arguments, a Go panic is
reported:
$ ./_output/cephcsi
F1026 13:59:04.302740 3409054 cephcsi.go:126] driver type not specified
goroutine 1 [running]:
k8s.io/klog/v2.stacks(0xc000010001, 0xc0000520a0, 0x48, 0x9a)
/go/src/github.com/ceph/ceph-csi/vendor/k8s.io/klog/v2/klog.go:996 +0xb9
k8s.io/klog/v2.(*loggingT).output(0x2370360, 0xc000000003, 0x0, 0x0, 0xc000194770, 0x20cb265, 0xa, 0x7e, 0x413500)
/go/src/github.com/ceph/ceph-csi/vendor/k8s.io/klog/v2/klog.go:945 +0x191
k8s.io/klog/v2.(*loggingT).println(0x2370360, 0x3, 0x0, 0x0, 0xc000163e08, 0x1, 0x1)
/go/src/github.com/ceph/ceph-csi/vendor/k8s.io/klog/v2/klog.go:699 +0x11a
k8s.io/klog/v2.Fatalln(...)
/go/src/github.com/ceph/ceph-csi/vendor/k8s.io/klog/v2/klog.go:1456
main.main()
/go/src/github.com/ceph/ceph-csi/cmd/cephcsi.go:126 +0xafa
Just logging the error and exiting should be sufficient. This stack
trace from the Go panic does not add any useful information.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
The main() function of the cephcsi executable calls klog.Fatalln() to
report certain errors. This causes the executable to panic which is not
helpful to users that only need the error message.
By introducing logAndExit(), there is no need to call klog.Fatalln()
anymore.
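A minimal sketch of such a helper (assuming klog/v2 and a plain
os.Exit; the exact implementation in the tree may differ):
```
import (
	"os"

	"k8s.io/klog/v2"
)

// logAndExit logs the message and exits with a non-zero status,
// instead of the panic-like stack trace that klog.Fatalln() prints.
func logAndExit(msg string) {
	klog.Errorln(msg)
	os.Exit(1)
}
```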
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Direct usage of magic numbers should be avoided.
Issue reported:
mnd: Magic number: X, in <argument> detected (gomnd)
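For illustration, the fix is to give the literal a name
(defaultBufferSize is a hypothetical example, not a name from the
codebase):
```
// Before: gomnd reports "mnd: Magic number: 8" for the bare literal:
//     buf := make([]byte, 8)

// After: name the value so its purpose is documented.
const defaultBufferSize = 8

func newBuffer() []byte {
	return make([]byte, defaultBufferSize)
}
```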
Signed-off-by: Yug <yuggupta27@gmail.com>
As v1.0.0 is deprecated, we need to remove the support for it in the
upcoming (v3.0.0) release. This PR removes the support for the same.
Closes: #882
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Added the maxsnapshotsonimage flag to flatten the older rbd images in
the chain, to avoid an issue in krbd. The limit exists in krbd since it
only allocates a single 4KiB page to handle all the snapshot ids for an
image.
The max limit is 510 as per
https://github.com/torvalds/linux/blob/aaa2faab4ed8e5fe0111e04d6e168c028fe2987f/drivers/block/rbd.c#L98
In cephcsi we are keeping the default at 450 to reserve 10% headroom
and avoid issues.
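A hedged sketch of the check (the helper and variable names are
illustrative, not the exact code in the tree):
```
// maxSnapshotsOnImage mirrors the new --maxsnapshotsonimage flag;
// the default of 450 keeps ~10% headroom below krbd's limit of 510.
var maxSnapshotsOnImage uint = 450

// needsFlatten is a hypothetical helper: it reports whether the
// older images in the clone chain should be flattened before the
// image gets yet another snapshot.
func needsFlatten(snapCount uint) bool {
	return snapCount >= maxSnapshotsOnImage
}
```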
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Added the skipForceFlatten flag to skip the image depth check and skip
image flattening. This will be very useful if the kernel is not listed
in cephcsi but does support the deep-flatten feature.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Added Hardlimit and Softlimit flags for the cephcsi arguments. When the
Softlimit is reached, cephcsi will start a background task to flatten
the rbd image and return success; if the Hardlimit is reached, it will
start a background task to flatten the rbd image and return
"ready to use" as false, to make sure that the image will not be used
until it is flattened.
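A sketch of the decision logic, with assumed names and assumed default
values:
```
// Assumed flag values; the actual flag names and defaults in the
// tree may differ.
var (
	hardMaxCloneDepth uint = 8
	softMaxCloneDepth uint = 4
)

// checkFlatten is a hypothetical helper that returns two decisions:
// whether a background flatten task must be started, and whether
// the image may be reported as ready to use right away.
func checkFlatten(depth uint) (startFlatten, readyToUse bool) {
	switch {
	case depth >= hardMaxCloneDepth:
		// flatten, and hold the image back until it is flattened
		return true, false
	case depth >= softMaxCloneDepth:
		// flatten in the background, the image is usable meanwhile
		return true, true
	}
	return false, true
}
```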
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
It is useful to have the kernel version logged while starting binaries.
Some functionality depends on the version of the kernel, so debugging
issues related to this will be easier.
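A sketch of how the kernel release can be obtained on Linux, assuming
the golang.org/x/sys/unix bindings (the helper name is illustrative):
```
import (
	"strings"

	"golang.org/x/sys/unix"
)

// getKernelVersion returns the release string from uname(2),
// e.g. "5.8.0-1.el8.x86_64".
func getKernelVersion() (string, error) {
	utsname := unix.Utsname{}
	if err := unix.Uname(&utsname); err != nil {
		return "", err
	}
	// Release is a fixed-size byte array, trim the trailing NULs.
	return strings.TrimRight(string(utsname.Release[:]), "\x00"), nil
}
```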
Signed-off-by: Niels de Vos <ndevos@redhat.com>
The internal/ directory in Go has a special meaning, and indicates that
those packages are not meant for external consumption. Ceph-CSI does
not provide public APIs for other projects to consume. There is no plan
to keep the API of the internally used packages stable.
Closes: #903
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Fix the below error in the current codebase:
File is not `goimports`-ed with -local
github.com/ceph/ceph-csi
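With -local github.com/ceph/ceph-csi, goimports keeps the project's own
packages in a separate group after the standard library and third-party
imports, e.g. (package paths are illustrative, only the grouping
matters):
```
import (
	"context"
	"fmt"

	"k8s.io/klog/v2"

	"github.com/ceph/ceph-csi/internal/util"
)
```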
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
As the kubernetes CSI sidecar is exposing the gRPC metrics, we can make
use of the same in ceph-csi; we don't need to expose our own.
Updates: #881
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
PR #282 introduced the mount cache to solve the cephfs fuse mount issue
when the cephfs plugin pod restarts. This is not working as intended.
This PR removes the code for maintainability.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
This will be helpful if someone wants to check the cephcsi version.
Output:
```
docker run quay.io/cephcsi/cephcsi:v1.2.1 --version
Cephcsi Version: v1.2.1
Git Commit: 4b871366327d63e27fc1abfb699f0faaf0fc16b9
GoVersion: go1.12.5
Compiler: gc
Platform: linux/amd64
```
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
If both the controller and nodeserver flags are set or unset, cephcsi
will start both servers; if only one flag is set, it will start the
relevant service.
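A sketch of the intended decision (the flag names follow the commit
text, the start functions are placeholders passed in by the caller):
```
import "flag"

var (
	isControllerServer = flag.Bool("controllerserver", false, "start cephcsi controller server")
	isNodeServer       = flag.Bool("nodeserver", false, "start cephcsi node server")
)

// startServers is a hypothetical sketch: one flag set starts only
// that service, both set or both unset starts both.
func startServers(startController, startNode func()) {
	switch {
	case *isControllerServer && !*isNodeServer:
		startController()
	case *isNodeServer && !*isControllerServer:
		startNode()
	default:
		startController()
		startNode()
	}
}
```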
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
The container runtime CRI-O limits the number of PIDs to 1024 by
default. When many PVCs are requested at the same time, it is possible
for the provisioner to start too many threads (or goroutines) and
executing 'rbd' commands can start to fail. In case a goroutine cannot
get started, the process panics.
The PID limit can be changed by passing an argument to kubelet, but
this will affect all pids running on a host. Changing the parameters of
kubelet is also not a very elegant solution.
Instead, the provisioner pod can change the configuration itself. The
pod is running in privileged mode and can write to /sys/fs/cgroup where
the limit is configured.
With this change, the limit is configured to 'max', just as if there is
no limit at all. The logs of the csi-rbdplugin in the provisioner pod
will reflect the change it makes when starting the service:
$ oc -n rook-ceph logs -c csi-rbdplugin csi-rbdplugin-provisioner-0
..
I0726 13:59:19.737678 1 cephcsi.go:127] Initial PID limit is set to 1024
I0726 13:59:19.737746 1 cephcsi.go:136] Reconfigured PID limit to -1 (max)
..
It is possible to pass a different limit on the commandline of the
cephcsi executable. The following flag has been added:
--pidlimit=<int> the PID limit to configure through cgroups
This accepts special values -1 (max) and 0 (default, do not
reconfigure). Other integers will be the limit that gets configured in
cgroups.
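A simplified sketch of the reconfiguration; the real code first
resolves the pids cgroup of the container through /proc/self/cgroup,
the fixed path used here is illustrative:
```
import (
	"os"
	"strconv"
)

// setPIDLimit writes the new limit to the pids controller of the
// current cgroup. A limit of -1 is translated to "max"; 0 means
// "do not reconfigure" and is handled by the caller.
func setPIDLimit(limit int) error {
	value := strconv.Itoa(limit)
	if limit == -1 {
		value = "max"
	}
	// Illustrative path; the pids cgroup of the container is
	// normally discovered at runtime first.
	return os.WriteFile("/sys/fs/cgroup/pids/pids.max", []byte(value), 0o644)
}
```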
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Update the driver version and add the git commit to the image. This
will help us identify which git commit the latest image contains.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
This commit adds support to mount and delete volumes provisioned by older
plugin versions (1.0.0) in order to support backward compatibility to 1.0.0
created volumes.
It adds back the ability to specify where the older metadata was
stored, using the metadatastorage option of the plugin, and then uses
the provided metadata to mount and delete the older volumes.
It also supports a variety of ways in which monitor information may have been
specified (in the storage class, or in the secret), to keep the monitor
information current.
Testing done:
- Mount/Delete 1.0.0 plugin created volume with monitors in the StorageClass
- Mount/Delete 1.0.0 plugin created volume with monitors in the secret with
a key "monitors"
- Mount/Delete 1.0.0 plugin created volume with monitors in the secret with
a user specified key
- PVC creation and deletion with the current version (to ensure at the minimum
no broken functionality)
- Tested some negative cases, where monitor information is missing in secrets
or present with a different key name, to understand if failure scenarios work
as expected
Updates #378
Follow-up work:
- Documentation on how to upgrade to 1.1 plugin and retain above functionality
for older volumes
Signed-off-by: ShyamsundarR <srangana@redhat.com>
* Enable all static-checks in golangci-lint
* Update golangci-lint version
* Fix issue found in golangci-lint
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
This is a part of the stateless set of commits for CephCSI.
This commit removes the dependency on config maps to store cephFS provisioned
volumes, and instead relies on RADOS based objects and keys, and required
CSI VolumeID encoding to detect the provisioned volumes.
Changes:
- Provide backward compatibility to provisioned volumes by older plugin versions (1.0.0 or older)
- Remove Create/Delete support for statically provisioned volumes (fixes #382)
- Added namespace support to RADOS OMaps and used the same to store RADOS CSI objects and keys in the CephFS metadata pool
- Added support to specify the fsname for CephFS provisioning (fixes #359)
- Changed field name in CSI Identifier to 'location', to denote a pool or fscid
- Updated mounter cache to use new scheme
- Required Helm manifests are updated
- Required documentation and other manifests are updated
- Made the driver option 'metadatastorage' optional, as fresh installs do not need to specify the same
Testing done:
- Create/Mount/Delete PVC
- Create/Delete 5 PVCs
- Mount version 1.0.0 PVC
- Delete version 1.0.0 PV
- Mount Statically defined PV/PVC/Pod
- Mount Statically defined version 1.0.0 PV/PVC/Pod
- Delete Statically defined version 1.0.0 PV/PVC/Pod
- Node restart when mounted to test mountcache
- Use InstanceID other than 'default'
- RBD basic round of tests, as namespace is added to OMaps
- csitest against ceph-fs plugin
- NOTE: The CephFS plugin still does not detect and address already
created volumes that have a different size
- Test not providing any value to the metadata storage parameter
Signed-off-by: ShyamsundarR <srangana@redhat.com>
Currently, we have 3 Dockerfiles (cephcsi, rbd, cephfs) in the ceph-csi
repo. Commit 85e121ebfe, added by John, builds a single image which can
act as rbd or cephfs based on the input configuration.
This PR updates the makefile and kubernetes templates to use the
unified image, and also deletes the other two Dockerfiles.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Existing config maps are now replaced with rados omaps that help store
information regarding the requested volume names and the rbd image
names backing the same.
Further, to detect which cluster, pool, and image a volume ID refers
to, changes to the volume ID encoding have been made as per the design
specification provided in the stateless ceph-csi proposal.
Additional changes and updates,
- Updated documentation
- Updated manifests
- Updated Helm chart
- Addressed a few csi-test failures
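For illustration, a sketch of the omap bookkeeping using the go-ceph
rados bindings (object and key names are examples of the scheme, not
the exact ones; the code at this point may drive the rados utility
instead of the bindings):
```
import "github.com/ceph/go-ceph/rados"

// storeVolumeMapping sketches the omap bookkeeping: one key per
// requested volume name, holding the backing rbd image name.
func storeVolumeMapping(ioctx *rados.IOContext, reqName, imageName string) error {
	return ioctx.SetOmap("csi.volumes.default", map[string][]byte{
		"csi.volume." + reqName: []byte(imageName),
	})
}
```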
Signed-off-by: ShyamsundarR <srangana@redhat.com>
Now that the unified binary has its own main file under "cmd/"
we no longer need the split main.go files.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
Based on the review comments, addressed the following:
- Moved away from having to update the pod with volumes
when a new Ceph cluster is added for provisioning via the
CSI driver
- The above now uses k8s APIs to fetch secrets (see the
sketch after this list)
- TBD: Need to add a watch mechanism such that these
secrets can be cached and updated when changed
- Folded the Ceph configuration and ID/key config map
and secrets into a single secret
- Provided the ability to read the same config via mapped
or created files within the pod
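A sketch of such a secret fetch with client-go (the wiring and the
context handling are illustrative; older client-go versions take no
context argument):
```
import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// getCredentials is a hypothetical helper that fetches a Ceph
// ID/key secret through the Kubernetes API instead of reading a
// mounted volume.
func getCredentials(cs kubernetes.Interface, ns, name string) (map[string][]byte, error) {
	secret, err := cs.CoreV1().Secrets(ns).Get(context.TODO(), name, metav1.GetOptions{})
	if err != nil {
		return nil, err
	}
	return secret.Data, nil
}
```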
Tests:
- Ran PV creation/deletion/attach/use using new scheme
StorageClass
- Ran PV creation/deletion/attach/use using older scheme
to ensure nothing is broken
- Did not execute snapshot related tests
Signed-off-by: ShyamsundarR <srangana@redhat.com>
This commit provides the option to pass in a Ceph cluster-id instead
of a MON list in the storage class.
This helps in moving towards a stateless CSI implementation.
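A sketch of how a cluster-id could be resolved to its MON list
(structure, field names, and the JSON layout are illustrative):
```
import (
	"encoding/json"
	"fmt"
	"strings"
)

// clusterInfo maps a cluster-id to its monitors; the layout is
// illustrative, not the exact configuration format.
type clusterInfo struct {
	ClusterID string   `json:"clusterID"`
	Monitors  []string `json:"monitors"`
}

// monsForCluster resolves the MON list for the cluster-id that the
// storage class passed in.
func monsForCluster(raw []byte, clusterID string) (string, error) {
	var clusters []clusterInfo
	if err := json.Unmarshal(raw, &clusters); err != nil {
		return "", err
	}
	for _, c := range clusters {
		if c.ClusterID == clusterID {
			return strings.Join(c.Monitors, ","), nil
		}
	}
	return "", fmt.Errorf("cluster-id %q not found", clusterID)
}
```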
Tested the following,
- PV provisioning and staging using cluster-id in storage class
- PV provisioning and staging using MON list in storage class
Did not test,
- snapshot operations in either forms of the storage class
Signed-off-by: ShyamsundarR <srangana@redhat.com>