feat: CSI volume health monitoring and failover #1816
abeowlu wants to merge 3 commits into kubernetes-sigs:master
Hi @abeowlu. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. Regular contributors should join the org to skip this step.
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: abeowlu.
PR needs rebase.
Is this a bug fix or adding new feature?
Feature enhancement
What is this PR about? / Why do we need it?
This PR should close #1675 and #1676, as well as related requests for CSI volume health monitoring, high availability, and self-healing.
This PR implements the minimum use case: monitor volume mounts and report their health and condition, along with stat info, when kubelet probes them.
As a first attempt at the best-case scenario, it also implements an asynchronous volume remount, with failover if an EFS server issue or hang is encountered.
What testing is done?
Unit tests for the health check and the async recovery attempt flow.