ceph-csi

mirror of https://github.com/ceph/ceph-csi.git synced 2025-05-28 09:46:40 +00:00

Author	SHA1	Message	Date
Niels de Vos	eeec4471c5	rbd: no need to create a snapshot on a thick-provisioned volume When cloning a volume from a (CSI) snapshot, we use DeepCopy() and do not need an RBD snapshot as source. Suggested-by: Madhu Rajanna <madhupr007@gmail.com> Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-06-23 14:22:28 +00:00
Niels de Vos	d2c4cacb39	rbd: restart thick-provisioned PVC snapshot restoring after aborting In case restoring a snapshot of a thick-PVC failed during DeepCopy(), the image will exist, but have partial contents. Only when the image has the thick-provisioned metadata set, it has completed DeepCopy(). When the metadata is missing, the image is deleted, and an error is returned to the caller. Kubernetes will automatically retry provisioning on the ABORTED error, and the restoring will get restarted from the beginning. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-06-23 14:22:28 +00:00
Niels de Vos	7f1bdb49d1	rbd: use DeepCopy() when restoring a thick-snapshot Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-06-23 14:22:28 +00:00
Yati Padia	847b996501	cleanup: Modifies Wrapcheck linter Wrapcheck is a simple Go linter to check that errors from external packages are wrapped during return to help identify the error source during debugging. This commit addresses the wrapcheck error Updates:#2025 Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-06-22 08:47:55 +00:00
Madhu Rajanna	591ba3f580	rbd: set thick provision metadata on clone volume The parent volume (CreateVolume) and the clone volume (CreateSnapshot) are independent, and the parent volume can be deleted at any time. To check thick provisioning during snapshot restore (CreateVolume from snapshot) we need the thick-provision metadata, so this commit sets that metadata on the clone image created at CreateSnapshot time. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-06-18 10:57:48 +00:00
Madhu Rajanna	6d14eeee70	rbd: use RbdSnapName to check the image details RbdSnapName holds the actual RBD image name which got created during the CreateSnapshot operation. RbdImageName holds the name of the parent from which the snapshot is created. and the parent is independent of snapshot and it can be deleted any time for the same reason using the RbdSnapName to check the rbd image details. generate a temporary volume from the snapshot which replaces the rbdImageName with RbdSnapName and use it to check the image metadata. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-06-18 10:57:48 +00:00
Madhu Rajanna	7966d2e5c1	rbd: add validation for thick restore/clone added validation to allow only Restore of Thick PVC snapshot to a thick clone and creation of thick clone from thick PVC. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-06-18 10:57:48 +00:00
Madhu Rajanna	fc442221e4	rbd: make isThickProvisioned method of rbdImage isThickProvisioned can be used for both snapshot and clone validation if isThickProvisioned is method of common rbdImage structure. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-06-18 10:57:48 +00:00
Niels de Vos	57d3183cb1	rbd: restart thick-provisioned PVC cloning after aborting In case cloning a thick-PVC failed during DeepCopy(), the image will exist, but have partial contents. Only when the image has the thick-provisioned metadata set, it has completed DeepCopy(). When the metadata is missing, the image is deleted, and an error is returned to the caller. Kubernetes will automatically retry provisioning on the ABORTED error, and the cloning will get restarted from the beginning. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-06-18 06:25:56 +00:00
Niels de Vos	b1045364d9	rbd: disable FeatureDeepFlatten when doing DeepCopy() Not all Linux kernels support the deep-flatten feature. Disabling the feature makes it possible to map RBD images on older kernels (like what minikube uses). Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-06-18 06:25:56 +00:00
Niels de Vos	4908ff8743	rbd: no need to flatten thick-provisioned images Thick-provisioned images are independent, cloned images or snapshots are deep-flattened during creation. There is no need to try and flatten them again. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-06-18 06:25:56 +00:00
Niels de Vos	6cc11c15d3	rbd: use DeepCopy to create a thick-provisioned clone To create a fully-allocated RBD image from a snapshot/clone, DeepCopy() can be used. This is needed when the parent of the new volume is thick-provisioned, so that the new volume is independent of the parent and thick-provisioned as well. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-06-18 06:25:56 +00:00
Niels de Vos	334f237e23	cleanup: move snapshot/clone/flatten into its own function Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-06-18 06:25:56 +00:00
Madhu Rajanna	367eb9f748	rbd: correct return error for isCompatibleEncryption isCompatibleEncryption is used to validate the requested volume and the existing volume and the destination volume name wont be generated yet and logging the destination volume prints the empty image name with pool name. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-06-17 10:12:18 +00:00
Madhu Rajanna	05b8433b89	rbd: check stdErr for does not have a parent error When we try to add a task to flatten the rbd image, the actual error is present in stdErr, not in the returned error. This commit corrects the error checking for the case where the image does not have a parent. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-06-15 11:07:34 +00:00
Yati Padia	6bfdf2feb0	cleanup: gocyclo being unused for linter This commit addresses the following issue: 'nolint:gocyclo // complexity needs to be reduced.' is unused for linter "gocyclo" (nolintlint) Updates:#2025 Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-06-15 02:54:16 +00:00
Yug	5c079894c7	doc: correct comment indentation in rbdVolume correct comment indentation in rbdvolume{} Signed-off-by: Yug <yuggupta27@gmail.com>	2021-06-15 02:34:51 +00:00
Yati Padia	095a82f37d	util: returns actual error instead of ErrPoolNotFound This commit returns the actual error from the go-ceph API to the caller of GetPoolName(..) instead of returning ErrPoolNotFound every time there is an error getting the pool ID. An issue was reported in which snapshot creation takes much longer to reach the True state (between 2 and 7 minutes) and keeps retrying with the error below even though the pool is present: rpc error: code = NotFound desc = pool not found: pool ID (21) not found in Ceph cluster. Since we cannot interpret the actual cause of the delay in snapshot creation, the actual error needs to be returned as well so that we can understand the reason. Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-06-14 14:41:32 +00:00
Humble Chirammal	17b0091cba	cleanup: fix codespell error in internal/utils package Codespell checker report below error: ``` Resulting CLI options --check-filenames --check-hidden --skip .git,./vendor --ignore-words-list ExtraVersion,extraversion,ba 1 Error: ./internal/util/aws_metadata.go:96: Kubenetes ==> Kubernetes ``` This commit address the same. Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-06-11 08:04:07 +00:00
Yug	d992803e9e	rbd: Update pool name in image chain While traversing the image chain, the parent image can be in a different pool than the one the child is in. So, update the pool name in the next iteration to that of the parent. Co-authored-by: Madhu Rajanna <madhupr007@gmail.com> Signed-off-by: Yug <yuggupta27@gmail.com>	2021-06-10 21:46:53 +00:00
Yug	1f6a9cabfd	rbd: verify if pool name is not empty Validate Snapshot request to check if the passed pool name is not empty. Co-authored-by: Madhu Rajanna <madhupr007@gmail.com> Signed-off-by: Yug <yuggupta27@gmail.com>	2021-06-10 21:46:53 +00:00
Yug	3898ae34a7	rbd: open new ioctx connection if the parent and child clones are in different namespaces we need to open a new ioctx for pools. Co-authored-by: Madhu Rajanna <madhupr007@gmail.com> Signed-off-by: Yug <yuggupta27@gmail.com>	2021-06-10 21:46:53 +00:00
Yug	b63b0bf18d	rbd: retrieve parent pool name of child image when clones are created in different pool,we need to retrieve the parent pool to get the information of the parent image. Co-authored-by: Madhu Rajanna <madhupr007@gmail.com> Signed-off-by: Yug <yuggupta27@gmail.com>	2021-06-10 21:46:53 +00:00
Yug	e699318acc	rbd: pass parent volume to undoSnapshotCloning function as we are supporting the creation of clone to a new pool we need to pass the correct parent volume to cleanup the snapshot on parent volume. Co-authored-by: Madhu Rajanna <madhupr007@gmail.com> Signed-off-by: Yug <yuggupta27@gmail.com>	2021-06-10 21:46:53 +00:00
Yug	961c1d12fd	rbd: add support to create clone in different pool Added support to create an image in a different pool. If the snapshot/rbd image exists in one pool, we can create a clone of the rbd image in a different pool. Co-authored-by: Madhu Rajanna <madhupr007@gmail.com> Signed-off-by: Yug <yuggupta27@gmail.com>	2021-06-10 21:46:53 +00:00
Mohammed Naser	671d6a7767	rbd: Backout if image features is empty In golang world, if you split an empty string that does not contain the separator, you get an array with one empty string. This results in volumes failing to mount with "invalid feature " (note extra space because it's trying to check if 'empty string' is a valid feature). This patch checks if the string is empty, and if so, it skips the entire validation and returns nothing. Signed-off-by: Mohammed Naser <mnaser@vexxhost.com>	2021-06-10 15:43:09 +00:00
Mohammed Naser	f193ebfbb1	rbd: Add failing test when no features are provided Signed-off-by: Mohammed Naser <mnaser@vexxhost.com>	2021-06-10 15:43:09 +00:00
Madhu Rajanna	7b5c78ec7c	rbd: fail fast in create volume for mismatch encryption CreateVolume will fail in the cases below * If the snapshot is encrypted and requested volume is not encrypted * If the snapshot is not encrypted and requested volume is encrypted * If the parent volume is encrypted and requested volume is not encrypted * If the parent volume is not encrypted and requested volume is encrypted Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-06-07 15:05:21 +00:00
Madhu Rajanna	4e2c4ef704	cephfs: return internal server error if it is an error from the IsMountPoint function and the error is not IsNotExist return it as a internal server error. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-06-07 07:38:48 +00:00
Madhu Rajanna	46f1ab9e99	cephfs: use IsMountPoint to check mountpoint Currently we are relying on the error output from the umount command we run on the nodes when mounting the volume, but we are not checking for all the error messages to verify whether the volume is mounted or not. This commit uses the IsMountPoint function in util to check the mountpoint. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-06-07 07:38:48 +00:00
Madhu Rajanna	b4dbffa316	util: return actual error from IsMountPoint As callers are already taking care of returning the GRPC error code, return the actual error from the IsMountPoint function. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-06-07 07:38:48 +00:00
Yati Padia	0f44c6acb7	cleanup: address wasted assign issues At places variable is reassigned without being used. Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-06-03 09:51:14 +00:00
YingshuoTao	bfe64d4aee	cephfs: pass extra volume attributes to static PV when using pre-provisioned volumes, pass these parameters: - kernelMountOptions - fuseMountOptions - subVolumeGroup in spec.csi.volumeAttributes in PV declaration Signed-off-by: YingshuoTao <frigid.blues@gmail.com>	2021-06-03 04:42:59 +00:00
Niels de Vos	7cbad9305f	rbd: repair thick-provisioned images on CreateVolume restart Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-06-01 14:42:12 +00:00
Niels de Vos	96a8ea3e88	cleanup: split repairExistingVolume() from CreateVolume() Move the repairing of a volume/snapshot from CreateVolume to its own function. This reduces the complexity of the code, and makes the procedure easier to understand. Further enhancements to repairing an existing volume can be done in the new function. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-06-01 14:42:12 +00:00
Madhu Rajanna	2e978e4211	rbd: fix typo in error message fixed typo in error message. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-06-01 10:40:07 +00:00
Madhu Rajanna	a666d452bf	cephfs: return GRPC error in NodeGetVolumeStats in case of failure return GRPC error to the caller. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-05-31 08:17:37 +00:00
Rakshith R	b891e5585d	cleanup: address ifshort linter issues This commit addresses ifshort linter issues which checks if short syntax for if-statements is possible. updates: #1586 Signed-off-by: Rakshith R <rar@redhat.com>	2021-05-26 07:04:32 +00:00
Rakshith R	6618e2012d	cleanup: remove unnecessary calling of .String() when logging This commit removes calling of .String() when logging since `%s`,`%v` or `%q` will call an existing .String() function automatically. Fixes: #2051 Signed-off-by: Rakshith R <rar@redhat.com>	2021-05-25 18:02:11 +00:00
Yati Padia	774e8e4042	util: enable golang profiling Add support for golang profiling. Standard tools like go tool pprof and curl work. example: $ go tool pprof http://localhost:8080/debug/pprof/profile $ go tool pprof http://localhost:8080/debug/pprof/heap $ curl http://localhost:8080/debug/pprof/heap?debug=1 https://golang.org/pkg/net/http/pprof/ contains more details about the pprof interface. Fixes: #1699 Signed-off-by: Yati Padia <ypadia@redhat.com>	2021-05-25 10:41:22 +00:00
Niels de Vos	25d0a1cfc0	rbd: add support for block-devices in NodeGetVolumeStats() The NodeGetVolumeStats procedure can now be used to fetch the capacity of the RBD block-device. By default this is a thin-provisioned device, which means that the capacity is not reserved in the Ceph cluster. This makes it possible to over-provision the cluster. In order to detect the amount of storage used by the RBD block-device (when thin-provisioned), it is required to connect to the Ceph cluster. Unfortunately, the NodeGetVolumeStats CSI procedure does not provide enough parameters to connect to the Ceph cluster and fetch more details about the RBD image. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-05-25 06:41:04 +00:00
Niels de Vos	c0ab4c03e6	cephfs: move NodeGetVolumeStats() to CephFS NodeServer The CephFS NodeServer should handle the CephFS specific requests. This is not something that the NodeServer for RBD should handle. Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-05-25 06:41:04 +00:00
Madhu Rajanna	0ce6ad1152	rbd: fix image details logging log only the required details of the image. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-05-07 07:57:37 +00:00
Madhu Rajanna	67d73cd6e9	rbd: flatten image if the depth is not zero flatten the image if the deep-flatten feature is present on the images in the chain or if the images in chain is not zero, as we cannot check the deep-flatten feature the images which are in trash. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-05-07 07:57:37 +00:00
Madhu Rajanna	e15e2e5081	rbd: discard image not found error For flatten we call checkImageChainHasFeature which internally calls to getImageInfo returns the parent name even if the parent is in the trash, when we try to open the parent image to get its information it fails as the image not found. we should treat error as nil if the parent is not found. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-05-07 07:57:37 +00:00
Niels de Vos	f11a041f56	cleanup: address gosec complaint about creating a file The new gosec 2.7.0 complains like: G304 (CWE-22): Potential file inclusion via variable (Confidence: HIGH, Severity: MEDIUM) Updates: #2025 Signed-off-by: Niels de Vos <ndevos@redhat.com>	2021-05-05 16:05:23 +00:00
Madhu Rajanna	07a916b84d	rbd: mark image ready when image state is up+unknown To recover from split brain (up+error) state the image need to be demoted and requested for resync on site-a and then the image on site-b should gets demoted.The volume should be marked to ready=true when the image state on both the clusters are up+unknown because during the last snapshot syncing the data gets copied first and then image state on the site-a changes to up+unknown. If the image state on both the sites are up+unknown consider that complete data is synced as the last snapshot gets exchanged between the clusters. * create 10 GB of file and validate the data after resync * Do Failover when the site-a goes down * Force promote the image and write data in GiB * Once the site-a comes back, Demote the image and issue resync * Demote the image on site-b * The status will get reflected on the other site when the last snapshot sync happens * The image will go to up+unknown state. and complete data will be copied to site a * Promote the image on site-a and use it ```bash csi-vol-5633715e-a7eb-11eb-bebb-0242ac110006: global_id: e7f9ec55-06ab-46cb-a1ae-784be75ed96d state: up+unknown description: remote image demoted service: a on minicluster1 last_update: 2021-04-28 07:11:56 peer_sites: name: e47e29f4-96e8-44ed-b6c6-edf15c5a91d6-rook-ceph state: up+unknown description: remote image demoted last_update: 2021-04-28 07:11:41 ``` * Do Failover when the site-a goes down * Force promote the image on site-b and write data in GiB * Demote the image on site-b * Once the site-a comes back, Demote the image on site-a * The images on the both site will go to split brain state ```bash csi-vol-37effcb5-a7f1-11eb-bebb-0242ac110006: global_id: 115c3df9-3d4f-4c04-93a7-531b82155ddf state: up+error description: split-brain service: a on minicluster2 last_update: 2021-04-28 07:25:41 peer_sites: name: abbda0f0-0117-4425-8cb2-deb4c853da47-rook-ceph state: up+error description: split-brain last_update: 2021-04-28 07:25:26 ``` * Issue resync * The images cannot be resynced because when we issue resync on site a the image on site-b was in demoted state * To recover from this state (promote and then demote the image on site-b after sometime) ```bash csi-vol-37effcb5-a7f1-11eb-bebb-0242ac110006: global_id: 115c3df9-3d4f-4c04-93a7-531b82155ddf state: up+unknown description: remote image demoted service: a on minicluster1 last_update: 2021-04-28 07:32:56 peer_sites: name: e47e29f4-96e8-44ed-b6c6-edf15c5a91d6-rook-ceph state: up+unknown description: remote image demoted last_update: 2021-04-28 07:32:41 ``` * Once the data is copied we can see that the image state is moved to up+unknown on both sites * Promote the image on site-a and use it Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-05-05 13:38:29 +00:00
Madhu Rajanna	c3bae17fce	rbd: delete encryption key from KMS when a Snapshot is encrypted during a CreateSnapshot operation, the encryption key gets created in the KMS when we delete the Snapshot the key from the KMS should also gets deleted. When we create a volume from snapshot we are copying required information but we missed to copy the encryption information, This commit adds the missing information to delete the encryption key. Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>	2021-04-30 08:05:47 +00:00
Humble Chirammal	074c937a08	cleanup: correct typo in vault_tokens.go Signed-off-by: Humble Chirammal <hchiramm@redhat.com>	2021-04-29 08:51:29 +00:00
Mudit Agarwal	ec105bd782	cephfs: expand clone error messages Adding "snapshot clone" in the clone error messages. Signed-off-by: Mudit Agarwal <muagarwa@redhat.com>	2021-04-26 13:38:55 +00:00


548 Commits