Commit Graph

2371 Commits

Author SHA1 Message Date
Yug
2e14116ed7 deploy: add vault creation to rbd driver deployment
Currently, the script cannot deploy the driver on its own,
because the vault must be created first.
The script now includes the vault creation, so
one script is sufficient to deploy the rbd driver.

Signed-off-by: Yug <yuggupta27@gmail.com>
2020-08-04 16:00:21 +00:00
Niels de Vos
b9d1f16360 ci: make number of CPUs for minikube VM configurable
By default minikube uses 2 CPUs, which might be too little for some of
the tests. When not passing a CPUS environment variable, use all CPUs
available on the system (detected with 'nproc').

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-08-04 08:46:17 +00:00
Madhu Rajanna
2458ec6573 rbd: return error if fetching cluster id fails
Return an error if we are not able to fetch the cluster-ID
from the createSnapshot request, or if we are not able to
get the monitor information from the cluster-ID, instead
of using the parent image information.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2020-08-03 14:25:06 +00:00
Mudit Agarwal
9ed0811422 rbd: implement rbdVolume.resize() with go-ceph
Replaced command execution with go-ceph Resize() function.
Volsize was being updated before resize() returned;
it is now updated only after resize() succeeds.

Signed-off-by: Mudit Agarwal <muagarwa@redhat.com>
2020-08-03 10:50:01 +00:00
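The ordering fix described in 9ed0811422 can be illustrated with a minimal sketch. This is not the actual ceph-csi code: `imageResizer`, `volume`, and `fakeImage` are hypothetical names, and the interface only mirrors the shape of a resize call that takes a size in bytes and returns an error. The point shown is that the cached size changes only after the resize call has succeeded.

```go
package main

import (
	"errors"
	"fmt"
)

// imageResizer is a hypothetical interface standing in for an opened image
// handle that can be resized. It is an assumption made for this sketch.
type imageResizer interface {
	Resize(size uint64) error
}

// volume is a simplified, hypothetical stand-in for rbdVolume.
type volume struct {
	VolSize uint64
}

// resize asks the image to resize and updates VolSize only on success,
// so a failed resize never leaves a wrong size recorded in the struct.
func (v *volume) resize(img imageResizer, newSize uint64) error {
	if err := img.Resize(newSize); err != nil {
		return fmt.Errorf("failed to resize image to %d bytes: %w", newSize, err)
	}
	v.VolSize = newSize
	return nil
}

// fakeImage simulates an image whose resize can be made to fail.
type fakeImage struct{ fail bool }

func (f fakeImage) Resize(size uint64) error {
	if f.fail {
		return errors.New("simulated resize failure")
	}
	return nil
}

func main() {
	v := &volume{VolSize: 1 << 30}

	err := v.resize(fakeImage{fail: true}, 2<<30)
	fmt.Println("after failed resize:", v.VolSize, err)

	err = v.resize(fakeImage{}, 2<<30)
	fmt.Println("after successful resize:", v.VolSize, err)
}
```

Passing the image as an interface keeps the sketch testable without a Ceph cluster; a real implementation would operate on an image handle opened through go-ceph.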
Mudit Agarwal
1eff20590d util: restructure tracevol.py function to get_subvol_group()
get_subvol_group() returned an empty string if subvolumeGroup was not defined;
it now returns "csi" as the default subvolumeGroup.

Signed-off-by: Mudit Agarwal <muagarwa@redhat.com>
2020-08-03 07:14:02 +00:00
Mudit Agarwal
5fa1be586a cephfs: fix tracevol.py to work with dynamic value of fsname
tracevol.py used 'myfs' as the default fsname; it now works with
dynamic values of fsname.

Signed-off-by: Mudit Agarwal <muagarwa@redhat.com>
2020-08-03 07:14:02 +00:00
Mudit Agarwal
6814f598ce util: fix tracevol.py to manage config map created by Rook
If the config map is created by Rook, there is no provision to
specify subvolumeGroup, so tracevol.py should skip looking for subvolumeGroup
in that case.

Signed-off-by: Mudit Agarwal <muagarwa@redhat.com>
2020-08-03 07:14:02 +00:00
Mudit Agarwal
3585b21e80 util: fix tracevol.py to take config map namespace as an option
Added a new option so that user can specify the namespace for config map.

Signed-off-by: Mudit Agarwal <muagarwa@redhat.com>
2020-08-03 07:14:02 +00:00
Mudit Agarwal
b2c9c589aa util: fix tracevol.py to use namespace argument
tracevol.py accepts a namespace argument but does not currently use it; this change fixes that.

Signed-off-by: Mudit Agarwal <muagarwa@redhat.com>
2020-08-03 07:14:02 +00:00
Niels de Vos
74ba85f87b ci: add "make run-e2e"
This keeps the standard arguments for e2e testing in a single location
instead of spread over multiple files and CI jobs.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-07-31 16:03:54 +00:00
Niels de Vos
ff94ba282c ci: deploy rook with mon_warn_on_pool_no_redundancy in ceph.conf
In test environments the default pool size is set to 1, so there is no
redundancy. This causes recent Ceph versions to complain with
HEALTH_WARN as POOL_NO_REDUNDANCY gets set.

By disabling the mon_warn_on_pool_no_redundancy option in ceph.conf, the
warning is not reported and the cluster is marked HEALTHY.

See-also: rook/rook#5925
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-07-31 16:03:54 +00:00
Niels de Vos
fb60f66178 ci: use the host /sbin/losetup in minikube VM
minikube has /sbin/losetup from Busybox, and that does not work with
raw-block PVCs. Use the losetup executable from the host in the VM
instead.

See-also: kubernetes/minikube#8284
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-07-31 16:03:54 +00:00
Niels de Vos
230dd67752 ci: increase memory in the minikube VM
While testing with the default 3000 MB RAM in the minikube VM, creating
an encrypted RBD volume fails because 'cryptsetup' gets killed:

[  766.072585] Out of memory: Kill process 18497 (cryptsetup) score 1182 or sacrifice child
[  766.072589] Killed process 18497 (cryptsetup) total-vm:863136kB, anon-rss:510336kB, file-rss:10788kB, shmem-rss:0kB
[  766.072688] oom_reaper: reaped process 18497 (cryptsetup), now anon-rss:510336kB, file-rss:10780kB, shmem-rss:0kB

Using 4 GB RAM should prevent this from occurring.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-07-31 16:03:54 +00:00
Niels de Vos
2034992607 rebase: upgrade to minikube 1.12.1
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-07-31 16:03:54 +00:00
Niels de Vos
f46fb13357 ci: run "kubectl cluster-info" through minikube
In case kubectl did not get installed (VM_DRIVER != none),
scripts/minikube.sh can fail when kubectl is not in the path. By running
the "kubectl cluster-info" command through minikube, the script will
succeed.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-07-31 16:03:54 +00:00
Niels de Vos
e774ebb7f9 ci: detect available minikube executable
In case there is a minikube executable in the $PATH already, use that
for all commands. If there is none, install_minikube() will place a
newly downloaded executable in /usr/local/bin which will be used by the
full pathname, so that commands as root without /usr/local/bin in the
$PATH will work.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-07-31 16:03:54 +00:00
Niels de Vos
fc378ac74b ci: remove weird mkdir/ln command
The command fails when PWD=/. It is unclear what the command tries to
achieve. The next command does something more useful, although it can
maybe be removed as well.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-07-31 16:03:54 +00:00
Niels de Vos
04934a40e5 ci: start minikube with --force to allow running as root
When starting minikube as root with --driver=kvm2, minikube complains
that this is not the right thing to do. However, in the CentOS CI we
really want to run as root, as that makes the scripts simpler.

Add the --force option while starting, so that minikube does not abort
anymore.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-07-31 16:03:54 +00:00
Niels de Vos
43500fd6b8 ci: disable storage addons when starting minikube
Storage providers and the default storage class are not needed for
Ceph-CSI testing. In order to reduce resources and potential conflicts
between storage plugins, disable them.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-07-31 16:03:54 +00:00
Madhu Rajanna
4937ee97e9 doc: correct upgrade doc
Fixed the missing `v` in the version in the upgrade
doc.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2020-07-27 08:10:41 +00:00
Madhu Rajanna
66a0b0953e ci: update mergify rules for v3.0
Updated mergify rules for auto backport
and merging for the release-v3.0 branch.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2020-07-27 11:29:20 +05:30
Madhu Rajanna
a21d8fad69 doc: update upgrade doc for v3.0.0
Updated the upgrade documentation for upgrading
from v2.1.x to v3.0.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2020-07-27 11:29:20 +05:30
Madhu Rajanna
423183bcfc doc: update readme for v3.0.0 release
updated readme for new v3.0.0 release.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2020-07-27 11:29:20 +05:30
Yug
02b4a7175c rbd: add upgrade testing
Upgrade testing will enable us to keep
in check the backward compatibility of
earlier releases.

Signed-off-by: Yug <yuggupta27@gmail.com>
2020-07-26 03:41:53 +00:00
Yug
9b30969594 cephfs: add upgrade testing
Upgrade testing will enable us to keep
in check the backward compatibility of
earlier releases.

Signed-off-by: Yug <yuggupta27@gmail.com>
2020-07-26 03:41:53 +00:00
Yug
9c0d5abb5a doc: Add README for upgrade-testing
Update README with upgrade testing parameters.

Signed-off-by: Yug <yuggupta27@gmail.com>
2020-07-26 03:41:53 +00:00
Yug
def44cac90 ci: Add jobs for upgrade testing
Added two jobs for upgrade testing of
cephfs and rbd, with v2.1.2 as the default
upgrade version.

Signed-off-by: Yug <yuggupta27@gmail.com>
2020-07-26 03:41:53 +00:00
Niels de Vos
6df6cbd9e0 ci: require jjb-validate job to succeed for ci/centos branch
The jjb-validate job can not be run in parallel, so it may fail when
multiple PRs for the ci/centos branch are sent. However, the ci/centos
branch is not very active, so problems should be minimal.

In case of problems, leave the following comment in the PR and the job
should restart:

    /retest ci/centos/jjb-validate

See-also: #1273
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-07-25 08:39:01 +05:30
Humble Chirammal
02b8cd0b4b dep: lift kube dependency to v0.18.6
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2020-07-24 19:21:28 +00:00
Niels de Vos
be9e7cf956 rbd: pass context.Context to rbdVolume.resize()
While adding the context.Context to the resizeRBDimage() function, it
became a little ugly. The function is therefore renamed to resize() and made
a method of the rbdVolume type.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-07-24 16:04:13 +00:00
Niels de Vos
36469b87e2 util: make ExecCommand return stdout and stderr as string
Most consumers of util.ExecCommand() need to convert the returned []byte
format of stdout and/or stderr to string. By having util.ExecCommand()
return strings instead, the code gets a little simpler.

A few commands return JSON that needs to be parsed. These commands will
be replaced by go-ceph implementations later on. For now, convert the
strings back to []byte when needed.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-07-24 16:04:13 +00:00
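A rough sketch of the idea in 36469b87e2, assuming a helper named `runCommand` (hypothetical, and not the real signature of util.ExecCommand()): the helper captures stdout and stderr and returns them as strings, and a caller that needs to parse JSON converts the string back to []byte. The `echo` command and the JSON payload are placeholders chosen only so the example runs on a typical Unix-like system.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"os/exec"
)

// runCommand is a hypothetical helper for this sketch. It runs a program and
// returns stdout and stderr as strings instead of []byte.
func runCommand(program string, args ...string) (string, string, error) {
	var stdout, stderr bytes.Buffer

	cmd := exec.Command(program, args...)
	cmd.Stdout = &stdout
	cmd.Stderr = &stderr
	err := cmd.Run()

	return stdout.String(), stderr.String(), err
}

func main() {
	// "echo" stands in for a command that prints JSON.
	out, errOut, err := runCommand("echo", `{"size": 1024}`)
	if err != nil {
		fmt.Println("command failed:", err, errOut)
		return
	}

	// JSON parsing works on []byte, so the string is converted back here.
	var info struct {
		Size uint64 `json:"size"`
	}
	if err := json.Unmarshal([]byte(out), &info); err != nil {
		fmt.Println("cannot parse output:", err)
		return
	}
	fmt.Println("size:", info.Size)
}
```

Most callers only log or compare the output, so returning strings removes a conversion at nearly every call site; the few JSON consumers pay one extra conversion instead.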
Niels de Vos
ddac66d76b util: use context.Context for logging in ExecCommand
All calls to util.ExecCommand() now pass the context.Context. In some
cases this is not possible or needed, and util.ExecCommand() will not
log the command.

This should make debugging easier when command executions fail.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-07-24 16:04:13 +00:00
Niels de Vos
bb4f1c7c9d rbd: use util.ExecCommand() instead of execCommand()
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-07-24 16:04:13 +00:00
Niels de Vos
457d846241 cephfs: use util.ExecCommand() instead of execCommand()
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-07-24 16:04:13 +00:00
Niels de Vos
47d5b60af8 rbd: disable reflink while creating XFS filesystems
Current versions of the mkfs.xfs binary enable reflink support by
default. This causes problems on systems where the kernel does not
support this feature. When the kernel does not support the feature, but
the filesystem has it enabled, the following error is logged in `dmesg`:

    XFS: Superblock has unknown read-only compatible features (0x4) enabled

Introduce a check to see if mkfs.xfs supports the `-m reflink=` option.
In case it does, pass `-m reflink=0` while creating the filesystem.

The check is executed once during the first XFS filesystem creation. The
result of the check is cached until the nodeserver restarts.

Fixes: #966
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-07-24 13:37:51 +00:00
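The "check once, cache until restart" behaviour described in 47d5b60af8 could look roughly like the sketch below. It is an illustration under assumptions, not the ceph-csi implementation: `xfsFormatter`, its `usage` probe function, and the usage text in `main` are all hypothetical. The probe is injected as a function so the sketch runs without mkfs.xfs installed; a real probe would inspect the output of the mkfs.xfs binary.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// xfsFormatter is a hypothetical type for this sketch. The usage function is
// assumed to return the help text printed by mkfs.xfs.
type xfsFormatter struct {
	usage      func() string
	once       sync.Once
	hasReflink bool
}

// mkfsArgs returns the arguments for creating an XFS filesystem on device.
// The reflink probe runs at most once; the result is cached for the lifetime
// of the process.
func (f *xfsFormatter) mkfsArgs(device string) []string {
	f.once.Do(func() {
		f.hasReflink = strings.Contains(f.usage(), "reflink=")
	})

	var args []string
	if f.hasReflink {
		// Explicitly disable reflink so older kernels can mount the filesystem.
		args = append(args, "-m", "reflink=0")
	}
	return append(args, device)
}

func main() {
	probes := 0
	f := &xfsFormatter{usage: func() string {
		probes++
		// Made-up usage text, only for demonstration.
		return "Usage: mkfs.xfs [-m crc=0|1,reflink=0|1] devicename"
	}}

	fmt.Println(f.mkfsArgs("/dev/rbd0"))
	fmt.Println(f.mkfsArgs("/dev/rbd1"))
	fmt.Println("probes run:", probes)
}
```

sync.Once keeps the probe safe when several filesystem creations start concurrently, and the cached value disappears naturally when the process restarts.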
Niels de Vos
526da43b6a rbd: remove unused rbdStatus()
Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-07-24 11:34:48 +00:00
Niels de Vos
7afaac9c66 rbd: implement rbdVolume.isInUse() with go-ceph
The new rbdVolume.isInUse() method will replace the rbdStatus()
function. This removes one more rbd command execution in the
DeleteVolume path.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
2020-07-24 11:34:48 +00:00
Humble Chirammal
9e0589cf12 ci: fix rook cluster version fetching
As part of https://github.com/ceph/ceph-csi/pull/1237/, patching was
enabled for the deployed ceph cluster. However, due to
an error in the version fetching logic, the patching was not applied.

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2020-07-24 09:55:04 +00:00
Sven Anderson
92884f56f4 rbd: simplify error handling
This change replaces the sentinel errors in rbd module with
standard errors created with errors.New().

Related: #1203

Signed-off-by: Sven Anderson <sven@redhat.com>
2020-07-23 11:16:40 +00:00
Sven Anderson
dba2c27bcb cephfs: simplify error handling
This change replaces the sentinel errors in cephfs module with
standard errors created with errors.New().

Related: #1203

Signed-off-by: Sven Anderson <sven@redhat.com>
2020-07-23 11:16:40 +00:00
Sven Anderson
7c9c7c78a7 util: add tests for JoinErrors()
Signed-off-by: Sven Anderson <sven@redhat.com>
2020-07-23 11:16:40 +00:00
Sven Anderson
8393fbe40b util: simplify error handling
The sentinel error code had additional fields in the errors, that are
used nowhere.  This leads to unnecessarily complicated code.  This
change replaces the sentinel errors in utils with standard errors
created with errors.New() and adds a simple JoinErrors() function to
be able to combine sentinel errors from different code tiers.

Related: #1203

Signed-off-by: Sven Anderson <sven@redhat.com>
2020-07-23 11:16:40 +00:00
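As an illustration of the approach in 8393fbe40b, the sketch below defines sentinel errors with errors.New() and a small join helper that lets errors.Is() match sentinels from two code tiers. The names `joinErrors`, `joinedError`, `findVolume`, and both sentinel variables are hypothetical; the real JoinErrors() in ceph-csi may be implemented differently.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical sentinel errors from two tiers of code.
var (
	ErrKeyNotFound    = errors.New("key not found")    // lower tier
	ErrVolumeNotFound = errors.New("volume not found") // higher tier
)

// joinedError combines a higher-tier error with the lower-tier error
// that caused it.
type joinedError struct {
	outer error
	inner error
}

func (e joinedError) Error() string {
	return e.outer.Error() + ": " + e.inner.Error()
}

// Is lets errors.Is() match the higher-tier sentinel.
func (e joinedError) Is(target error) bool {
	return errors.Is(e.outer, target)
}

// Unwrap lets errors.Is() continue into the lower-tier error.
func (e joinedError) Unwrap() error {
	return e.inner
}

// joinErrors is a sketch of a JoinErrors()-style helper.
func joinErrors(outer, inner error) error {
	return joinedError{outer: outer, inner: inner}
}

// findVolume is a hypothetical caller that reports a missing volume
// because a lower-level key lookup failed.
func findVolume(name string) error {
	return joinErrors(ErrVolumeNotFound, ErrKeyNotFound)
}

func main() {
	err := findVolume("pvc-1234")
	fmt.Println(err)
	fmt.Println("is volume error:", errors.Is(err, ErrVolumeNotFound))
	fmt.Println("is key error:", errors.Is(err, ErrKeyNotFound))
}
```

Callers can then test for either sentinel with errors.Is() without custom error types that carry extra fields.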
Madhu Rajanna
c277ed588d doc: correct kubernetes version for snap and clone
Corrected the required kubernetes version for rbd
snapshot and clone in README.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2020-07-23 07:01:36 +00:00
Madhu Rajanna
b18fca7ae0 doc: Remove support for mimic
As Ceph Mimic is deprecated upstream,
we are removing support for Mimic from ceph-csi
as well. Users need to update to the latest Nautilus or
Octopus release to use ceph-csi.

More info related to the Ceph Mimic deprecation at:
https://lists.ceph.io/hyperkitty/list/dev@ceph.io/thread/X5IUICDEM4IVVWTMUTSSNEU424MB6WL7/
https://ceph.io/releases/mimic-is-retired/

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2020-07-23 04:44:37 +00:00
Madhu Rajanna
5168ad7ddf e2e: create/delete snap and clone in parallel
In rbd E2E testing, we need to create snapshots and clones
in parallel.

This helps us to ensure that the functionality works when
we have parallel delete and create operations, and
it also helps to catch bugs when we get parallel requests.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2020-07-21 13:25:19 +00:00
Madhu Rajanna
b3a4f510e6 rbd: take operation locks before operating on resource
Take operation locks on the resources before operating
on the resources. This allows us to do parallel operations
for some RPC calls such as Clone and Restore of PVC.
These operations will only be blocked if the image is
expanding or the Snapshot and RBD image are being deleted.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2020-07-21 13:25:19 +00:00
Madhu Rajanna
d6348545ab journal: Add additional operation based locking
We are adding new functionality such as Create/Delete
snapshot, Clone from Snapshot and Clone from Volume.
With the current implementation, only serial
operations are allowed for this functionality. For some
functions we can allow parallel operations, such as
Clone from Snapshot, Clone from Volume, and Create
`N` snapshots on a single volume.

Delete Volume: Need to ensure that there is no clone,
Snapshot create and Expand volume in progress.

Expand Volume: Need to ensure that there is no clone,
snapshot create and cloning in progress.

Delete Snapshot: Need to ensure that there is no
cloning in progress.

Restore Volume/Snapshot: Need to ensure that there is
no Expand or delete operation in progress.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
2020-07-21 13:25:19 +00:00
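The rules listed in d6348545ab can be sketched as a small conflict table plus per-resource counters. This is a simplified illustration, not the journal code itself: `opLocks`, `tryAcquire`, `release`, and the operation names are hypothetical. The table also assumes that the conflicts are symmetric and that delete and expand exclude themselves, which the commit message does not state explicitly.

```go
package main

import (
	"fmt"
	"sync"
)

type operation int

const (
	opCreateSnapshot operation = iota
	opClone                    // clone or restore from a volume/snapshot
	opExpand
	opDelete
)

// conflicts lists, per operation, which in-progress operations block it.
// Clone and snapshot creation do not block themselves, so they may run
// in parallel on the same resource. Symmetry is an assumption.
var conflicts = map[operation][]operation{
	opCreateSnapshot: {opExpand, opDelete},
	opClone:          {opExpand, opDelete},
	opExpand:         {opClone, opCreateSnapshot, opExpand, opDelete},
	opDelete:         {opClone, opCreateSnapshot, opExpand, opDelete},
}

// opLocks tracks how many operations of each kind run per resource ID.
type opLocks struct {
	mu     sync.Mutex
	active map[string]map[operation]int
}

func newOpLocks() *opLocks {
	return &opLocks{active: make(map[string]map[operation]int)}
}

// tryAcquire registers op on the resource unless a conflicting operation
// is in progress. It reports whether the lock was taken.
func (l *opLocks) tryAcquire(id string, op operation) bool {
	l.mu.Lock()
	defer l.mu.Unlock()

	for _, c := range conflicts[op] {
		if l.active[id][c] > 0 {
			return false
		}
	}
	if l.active[id] == nil {
		l.active[id] = make(map[operation]int)
	}
	l.active[id][op]++
	return true
}

// release marks one operation of the given kind as finished.
func (l *opLocks) release(id string, op operation) {
	l.mu.Lock()
	defer l.mu.Unlock()

	if l.active[id][op] > 0 {
		l.active[id][op]--
	}
}

func main() {
	locks := newOpLocks()

	fmt.Println("first clone:", locks.tryAcquire("vol-1", opClone))
	fmt.Println("second clone:", locks.tryAcquire("vol-1", opClone))
	fmt.Println("delete during clones:", locks.tryAcquire("vol-1", opDelete))

	locks.release("vol-1", opClone)
	locks.release("vol-1", opClone)
	fmt.Println("delete after clones:", locks.tryAcquire("vol-1", opDelete))
}
```

Counting operations instead of holding a single mutex per resource is what allows several clones or snapshot creations to proceed at once while still rejecting a delete or expand that arrives in the middle.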
Yug
71ddf51544 cleanup: address gomnd warnings
Direct usage of numbers should be avoided.

Issue reported:
mnd: Magic number: X, in <argument> detected (gomnd)

Signed-off-by: Yug <yuggupta27@gmail.com>
2020-07-21 08:36:24 +00:00
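For readers unfamiliar with the gomnd check mentioned in 71ddf51544, the following sketch shows the usual kind of fix: a literal used directly in an expression is replaced by a named constant. The function `roundUpToMiB` and the constant `mib` are hypothetical examples and are not taken from the ceph-csi source.

```go
package main

import "fmt"

// mib names the number of bytes in one mebibyte, so the arithmetic below
// does not contain an unexplained literal.
const mib int64 = 1024 * 1024

// roundUpToMiB converts a size in bytes to whole mebibytes, rounding up.
func roundUpToMiB(sizeBytes int64) int64 {
	return (sizeBytes + mib - 1) / mib
}

func main() {
	fmt.Println(roundUpToMiB(1))
	fmt.Println(roundUpToMiB(mib))
	fmt.Println(roundUpToMiB(mib + 1))
}
```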
Yug
e73fe64a0d cleanup: address gosec warnings
gosec warns about security problems by scanning the
Go AST.

Issues Reported:
G101 (CWE-798): Potential hardcoded credentials (Confidence: LOW, Severity: HIGH)
G204 (CWE-78): Subprocess launched with variable (Confidence: HIGH, Severity: MEDIUM)
G304 (CWE-22): Potential file inclusion via variable (Confidence: HIGH, Severity: MEDIUM)

Signed-off-by: Yug <yuggupta27@gmail.com>
2020-07-21 08:36:24 +00:00
Yug
48fa43270f cleanup: address gocritic warnings
Add explanation to nolint directives.

Issue reported:
whyNoLint: include an explanation for nolint directive (gocritic)

Signed-off-by: Yug <yuggupta27@gmail.com>
2020-07-21 08:36:24 +00:00