Commit Graph

4241 Commits

Author SHA1 Message Date
Niels de Vos
36469b87e2 util: make ExecComand return stdout and stderr as string
Most consumers of util.ExecCommand() need to convert the returned []byte
format of stdout and/or stderr to string. By having util.ExecCommand()
return strings instead, the code gets a little simpler.

A few commands return JSON that needs to be parsed. These commands will
be replaced by go-ceph implementations later on. For now, convert the
strings back to []byte when needed.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-07-24 16:04:13 +00:00
Niels de Vos
ddac66d76b util: use context.Context for logging in ExecCommand
All calls to util.ExecCommand() now pass the context.Context. In some
cases this is not possible or needed, and util.ExecCommand() will not
log the command.

This should make debugging easier when command executions fail.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-07-24 16:04:13 +00:00
Niels de Vos
bb4f1c7c9d rbd: use util.ExecCommand() instead of execCommand()
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-07-24 16:04:13 +00:00
Niels de Vos
457d846241 cephfs: use util.ExecCommand() instead of execCommand()
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-07-24 16:04:13 +00:00
Niels de Vos
47d5b60af8 rbd: disable reflink while creating XFS filesystems
Current versions of the mkfs.xfs binary enable reflink support by
default. This causes problems on systems where the kernel does not
support this feature. When the kernel the feature does not support, but
the filesystem has it enabled, the following error is logged in `dmesg`:

    XFS: Superblock has unknown read-only compatible features (0x4) enabled

Introduce a check to see if mkfs.xfs supports the `-m reflink=` option.
In case it does, pass `-m reflink=0` while creating the filesystem.

The check is executed once during the first XFS filesystem creation. The
result of the check is cached until the nodeserver restarts.

Fixes: #966
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-07-24 13:37:51 +00:00
Niels de Vos
526da43b6a rbd: remove unused rbdStatus()
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-07-24 11:34:48 +00:00
Niels de Vos
7afaac9c66 rbd: implement rbdVolume.isInUse() with go-ceph
The new rbdVolume.isInUse() method will replace the rbdStatus()
function. This removes one more rbd command execution in the
DeleteVolume path.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-07-24 11:34:48 +00:00
Humble Chirammal
9e0589cf12 ci: fix rook cluster version fetching
As part of https://github.com/ceph/ceph-csi/pull/1237/ there was
a patching enabled for the ceph cluster deployed, however due to
an error in the version fetching logic, the patching was not applied

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2020-07-24 09:55:04 +00:00
Sven Anderson
92884f56f4 rbd: simplify error handling
This change replaces the sentinel errors in rbd module with
standard errors created with errors.New().

Related: #1203

Signed-off-by: Sven Anderson <sven@redhat.com>
2020-07-23 11:16:40 +00:00
Sven Anderson
dba2c27bcb cephfs: simplify error handling
This change replaces the sentinel errors in cephfs module with
standard errors created with errors.New().

Related: #1203

Signed-off-by: Sven Anderson <sven@redhat.com>
2020-07-23 11:16:40 +00:00
Sven Anderson
7c9c7c78a7 util: add tests for JoinErrors()
Signed-off-by: Sven Anderson <sven@redhat.com>
2020-07-23 11:16:40 +00:00
Sven Anderson
8393fbe40b util: simplify error handling
The sentinel error code had additional fields in the errors, that are
used nowhere.  This leads to unneccesarily complicated code.  This
change replaces the sentinel errors in utils with standard errors
created with errors.New() and adds a simple JoinErrors() function to
be able to combine sentinel errors from different code tiers.

Related: #1203

Signed-off-by: Sven Anderson <sven@redhat.com>
2020-07-23 11:16:40 +00:00
Madhu Rajanna
c277ed588d doc: correct kubernetes version for snap and clone
Corrected the required kubernetes version for rbd
snapshot and clone in README.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2020-07-23 07:01:36 +00:00
Madhu Rajanna
b18fca7ae0 doc: Remove support for mimic
As ceph mimic is deprecated in the ceph upstream,
we are removing the support for mimic from ceph-csi
also, the user need to update the latest Nautilus or
Octopus to use ceph-csi.

more info realated to ceph mimim deprecation at
https://lists.ceph.io/hyperkitty/list/dev@ceph.io/thread/X5IUICDEM4IVVWTMUTSSNEU424MB6WL7/
https://ceph.io/releases/mimic-is-retired/

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2020-07-23 04:44:37 +00:00
Madhu Rajanna
5168ad7ddf e2e: create/delete snap and clone in parallel
In rbd E2E testing,we need to create snap and clone
as parallel operation.

This helps us to insure that functionality works when
we have parallel delete and create operations and also
it helps to catch bugs when we get parallel requests.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2020-07-21 13:25:19 +00:00
Madhu Rajanna
b3a4f510e6 rbd: take operation locks before operating on resource
Take operation locks on the resources before operating
on the resouces. This allows us to do parallel operations
for some RPC calls such as Clone and Restore of PVC.
This operations will only be blocked if the image is
expanding or Snapshot and RBD image is getting deleted.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2020-07-21 13:25:19 +00:00
Madhu Rajanna
d6348545ab journal: Add additional operation based locking
As we are adding new functionalities like Create/Delete
snapshot,Clone from Snapshot and Clone from Volume.
with the current implementation, there are only serial
operations allowed for this functionalities, for some
function we can allow parallel operations like
Clone from snapshot and Clone from Volume and Create
`N` snapshots on a single volume.

Delete Volume: Need to ensure that there is no clone,
Snapshot create and  Expand volume in progress.

Expand Volume: Need to ensure that there is no clone,
snapshot create and cloning in progress

Delete Snapshot: Need to ensure that there is no
cloning in progress

Restore Volume/Snapshot: Need to ensure that there is
no Expand or delete operation in progress.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2020-07-21 13:25:19 +00:00
Yug
71ddf51544 cleanup: address gomnd warnings
Direct usage of numbers should be avoided.

Issue reported:
mnd: Magic number: X, in <argument> detected (gomnd)

Signed-off-by: Yug <yuggupta27@gmail.com>
2020-07-21 08:36:24 +00:00
Yug
e73fe64a0d cleanup: address gosec warnings
gosec warns about security problems by scanning the
Go AST.

Issues Reported:
G101 (CWE-798): Potential hardcoded credentials (Confidence: LOW, Severity: HIGH)
G204 (CWE-78): Subprocess launched with variable (Confidence: HIGH, Severity: MEDIUM)
G304 (CWE-22): Potential file inclusion via variable (Confidence: HIGH, Severity: MEDIUM)

Signed-off-by: Yug <yuggupta27@gmail.com>
2020-07-21 08:36:24 +00:00
Yug
48fa43270f cleanup: address gocritic warnings
Add explanation to nolint directives.

Issue reported:
whyNoLint: include an explanation for nolint directive (gocritic)

Signed-off-by: Yug <yuggupta27@gmail.com>
2020-07-21 08:36:24 +00:00
Yug
628ae9e982 cleanup: use wrapped static errors instead of dynamic
In Go 1.13, the fmt.Errorf function supports a new %w verb.
When this verb is present, the error returned by fmt.Errorf
will have an Unwrap method returning the argument of %w,
which must be an error. In all other ways, %w is identical to %v.

Updates: #1227

Signed-off-by: Yug <yuggupta27@gmail.com>
2020-07-21 08:36:24 +00:00
Yug
2bfe7670d2 cleanup: address govet warnings
We had "ns" as a parameter and then trying to
declare it also as a local variable, which is what
the complaint about "shadowing" refers to.

Issue reported:
shadow: declaration of "ns" shadows declaration at line 57 (govet)

Signed-off-by: Yug <yuggupta27@gmail.com>
2020-07-21 08:36:24 +00:00
Yug
94b745d573 ci: disable goerr113 linter
goerr113 linter checks the errors handling expressions.
It warns about using wrapped static errors in place
of dynamic at multiple places.

Disabled the linter as of now to avoid regression,
and this need to be handled in a seperate issue.
Tracker Issue: #1227

Signed-off-by: Yug <yuggupta27@gmail.com>
2020-07-21 08:36:24 +00:00
Yug
9e435db454 ci: disable nestif linter
The `nestif` linter reports deeply nested if statements.

Disabled `nestif` as of now to avoid regression and
needs to addressed seperately.
Tracking Issue: #1229

Signed-off-by: Yug <yuggupta27@gmail.com>
2020-07-21 08:36:24 +00:00
Yug
7f94a57908 cleanup: address godot warnings
Top level comments should end in a period

Signed-off-by: Yug <yuggupta27@gmail.com>
2020-07-21 08:36:24 +00:00
Niels de Vos
cb7ab307dd ci: disable 'testpackage' linter in golangci
The 'testpackage' linter recommends "black box" testing, which prevents
testing internal/non-exported functions from being tested. We have tests
that *do* test non-exported functions and types.

Disabling the linter allows us to test non-exported types and functions.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-07-21 08:36:24 +00:00
Humble Chirammal
3695115371 ci: update golang-ci to v1.27
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2020-07-21 08:36:24 +00:00
Madhu Rajanna
11a6f6c1dd rbd: Support data-pool when cloning rbd image
Added support to clone an image in data-pool
during CreateVolume RPC call.

updates #1188

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2020-07-20 17:29:48 +00:00
Humble Chirammal
9cb9020e2e ci: update e2e ceph cluster version to 14.2.10
Rook version is currently 1.1.7 in our e2e deployment which brings 14.2.4 version
of ceph cluster. To support cephfs snapshot e2e, we need latest version of Ceph Cluster
in E2E. Rook 1.2.7 is good enough which on patching bring up ceph 14.2.10 cluster.

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2020-07-20 15:25:49 +00:00
Madhu Rajanna
cf98442ef6 doc: add document for rbd snapshot and clone
Added a document which contains the steps
and RBD CLI commands we execute when we create
a kubernetes snapshot, delete kubernetes snapshot,
Restore a snapshot to a new PVC,Kubernetes volume
cloning and kubernetes PVC deletion.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2020-07-20 12:56:23 +00:00
Yug
be8d3b6047 ci: enable containerized test job in mergify
Since containerized-tests job is functioning
again, we can re-enable the job as a required
condition to merge PRs.

Signed-off-by: Yug <yuggupta27@gmail.com>
2020-07-20 14:15:47 +05:30
Madhu Rajanna
1e5370a1f3 cephfs: return volume not found error if volume doesnot exists
In some ceph version if the subvolume is not present, the
ceph returns doesnot exists and in some version not found
error message. This commit fixes issue for both error
checks.

By only checking Error ENOENT: for doesnot exist seems good.
even if some error message changes in ceph ceph-csi wont get
any issue.

```bash
sh-4.2# ceph version
ceph version 14.2.10 (b340acf629a010a74d90da5782a2c5fe0b54ac20) nautilus (stable)

sh-4.2# ceph fs subvolume getpath myfs csi-vol-a24a3d97-c7f4-11ea-8cfc-0242ac110012 --group_name csi
Error ENOENT: subvolume 'csi-vol-a24a3d97-c7f4-11ea-8cfc-0242ac110012' does not exist
```

```bash
sh-4.2# ceph version
ceph version 14.2.4 (75f4de193b3ea58512f204623e6c5a16e6c1e1ba) nautilus (stable)

sh-4.2# ceph fs subvolume getpath myfs testing --group_name=csi
Error ENOENT: Subvolume 'testing' not found
```

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2020-07-17 10:27:18 +00:00
Madhu Rajanna
e03c0dc4a8 build: Fix docker permission issue during manifest create
This is a workaround to fix docker permission denied issue
during manifest create in Travis CI
`docker manifest create` fails due to permission denied
on `/etc/docker/certs.d/quay.io`
more info https://github.com/docker/for-linux/issues/396.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2020-07-16 16:37:36 +00:00
Madhu Rajanna
684cb13c54 rbd: DisAllow CreateVoulume for missmatch volume size
If the requested volume size and the snapshot or the
parent volume from which the clone is to be created
is not equal cephcsi returns an error message.

updates #1188

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2020-07-16 12:33:27 +00:00
Madhu Rajanna
b949d5f9c7 build: Build and push multi architecture images to dockerhub
As quay.io doesnot support the multi architecture
images, We need to switch to dockerhub as it supports
multi-architecture images.

closes: #1003

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2020-07-16 08:57:38 +00:00
Madhu Rajanna
5208c0fc38 cleanup: replace klog with v2
This commit replaces the klog with klog/v2
in leftover place.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2020-07-16 04:10:58 +00:00
Madhu Rajanna
2c67ba1ec4 rbd: Return current depth if the image is not found
If the image in the chain is moved to trash, we
cannot get the image details. We need to return the
found depth to the caller.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2020-07-15 18:40:45 +00:00
Madhu Rajanna
76c2f3c109 cleanup: re-use flattenTemporaryClonedImages to reduce duplicate code
re-use flattenTemporaryClonedImages to avoid code duplication

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2020-07-15 16:10:38 +00:00
Madhu Rajanna
8fc9146056 rbd: flatten temp cloned images
If the snapshots on the parent image exceeds
maxSnapshotsOnImage count, we need to flatten
all the temporary cloned images to over come the
krbd issue of maximum number of snapshots on
an image.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2020-07-15 16:10:38 +00:00
Madhu Rajanna
2fe1ee5287 rbd: create temporary snapshot with name same as temporary clone
create temporary snapshot on the parent image same as
name as the temporary clone rbd image. Naming the snapshot
and the temporary cloned image helps to flatten the temporary
cloned images when the snapshots on the parent image exceeds
the configured maxSnapshotsOnImage.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2020-07-15 16:10:38 +00:00
Humble Chirammal
285028727a ci: disable centos ci checks from mergify
CentOS CI outage block PRs to get merged even when it has enough
approvals. Disabling the centos check for now as there is no ETA
on the availability of the setup atm.

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2020-07-15 20:10:34 +05:30
Madhu Rajanna
09ffaee7c3 cleanup: rename newVolumeOptionsFromVersion1Context for more clarity
rename newVolumeOptionsFromVersion1Context to newVolumeOptionsFromMonitorList
to provide more clarity to the function readers and also fixed comments.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2020-07-13 04:04:28 +00:00
Madhu Rajanna
d15ded88f5 cleanup: Remove support for Delete and Unmounting v1.1.0 PVC
as v1.0.0 is deprecated we need to remove the support
for it in the Next coming (v3.0.0) release. This PR
removes the support for the same.

closes #882

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2020-07-10 16:07:13 +00:00
Madhu Rajanna
a0fd805a8b rbd: Add support for smart cloning
Added support for RBD PVC to PVC cloning, below
commands are executed to create a PVC-PVC clone from
RBD side.

* Check the depth(n) of the cloned image if n>=(hard limit -2)
or ((soft limit-2) Add a task to flatten the image and return
about (to avoid image leak) **Note** will try to flatten the
temp clone image in the chain if available
* Reserve the key and values in omap (this will help us to
avoid the leak as it's not reserved earlier as we have returned
ABORT (the request may not come back))
* Create a snapshot of rbd image
* Clone the snapshot (temp clone)
* Delete the snapshot
* Snapshot the temp clone
* Clone the snapshot (final clone)
* Delete the snapshot

```bash
1) check the image depth of the parent image if flatten required
add a task to flatten image and return ABORT to avoid leak
(hardlimit-2 and softlimit-2 check will be done)
2) Reserve omap keys
2) rbd snap create <RBD image for src k8s volume>@<random snap name>
3) rbd clone --rbd-default-clone-format 2 --image-feature
layering,deep-flatten <RBD image for src k8s volume>@<random snap>
<RBD image for temporary snap image>
4) rbd snap rm <RBD image for src k8s volume>@<random snap name>
5) rbd snap create <cloned RBD image created in snapshot process>@<random snap name>
6) rbd clone --rbd-default-clone-format 2 --image-feature <k8s dst vol config>
 <RBD image for temporary snap image>@<random snap name> <RBD image for k8s dst vol>
7)rbd snap rm <RBD image for src k8s volume>@<random snap name>
```

* Delete temporary clone image created as part of clone(delete if present)
* Delete rbd image

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2020-07-10 14:02:12 +00:00
Madhu Rajanna
9077c25c15 e2e: add e2e to test pvc-pvc cloning
Added an e2e testcase to test pvc-pvc
cloning.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2020-07-10 14:02:12 +00:00
Madhu Rajanna
7481786c44 ci: update kubernetes v1.17 to latest patch release
kubernetes v1.17.8 is released recently,This commit
updates the kubernetes v1.17 to latest patch release

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2020-07-10 09:47:21 +00:00
Madhu Rajanna
caa1eca8f6 ci: update kubenetes version to v1.18.5 for helm E2E
updated kubernetes version to latest version
to test cephcsi helm charts

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2020-07-10 09:47:21 +00:00
Madhu Rajanna
c8e532aa98 ci: Add kube v1.18.5 to E2E testing
Updated travis CI to test cephcsi with
kubernetes v1.18.5

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2020-07-10 09:47:21 +00:00
Madhu Rajanna
6786d92246 ci: remove kube 1.16 from Travis CI
As kube v1.18 is released we dont need to test
-2 version in our CI, removing the 1.16 testing
for the same reason.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2020-07-10 09:47:21 +00:00
Madhu Rajanna
f05c9a6a93 ci: fix psp issue in minikube latest version
With minikube versions greater than 1.6.2 and less than 1.11.1, the YAML files
minikube path will not be automatically applied to the cluster. we will get
errors during bootstrap of the cluster if the admission controller is enabled.

To use Pod Security Policies with these versions of minikube, first start a
cluster without the `PodSecurityPolicy` admission controller enabled.

Next, apply the psp yaml. and stop the cluster and then restart it
with the admission controller enabled.

```
minikube start
kubectl apply -f /path/to/psp.yaml
minikube stop
minikube start --extra-config=apiserver.enable-admission-plugins=PodSecurityPolicy
```

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2020-07-10 09:47:21 +00:00