🐛 skip reconcile for ipaddress and ipaddressclaim when cluster is paused #3037
archerwu9425 wants to merge 8 commits into kubernetes-sigs:main
|
[APPROVALNOTIFIER] This PR is NOT APPROVED. |
|
Welcome @archerwu9425! |
|
Hi @archerwu9425. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. Regular contributors should join the org to skip this step. |
|
/ok-to-test |
|
/test pull-cluster-api-provider-openstack-e2e-test |
|
@lentzi90 @nikParasyr Could you please help review? Thanks |
lentzi90
left a comment
Thanks for the PR!
I agree that we have a bug in that we reconcile the IPAddresses and claims even when the cluster is paused. This PR seems to do more than just fix this bug though.
It also adds no tests. I think we do need some tests to verify the behavior and I have a few concerns (see below).
…ss before remove claim finalzier
|
@lentzi90 changes made based on comment and tests added. Could you please help review again? Thanks |
|
@lentzi90 Could you please help review again? Thanks |
|
@lentzi90 @nikParasyr Could you please help review? Thanks |
|
@lentzi90 @nikParasyr Could you please help review? Thanks |
|
@archerwu9425 I have it on my list, probably for the end of the week, as things are a bit hectic right now. |
Will put it on my list. Unfortunately, I cannot get to it soon. |
cluster, err := util.GetClusterFromMetadata(ctx, r.Client, claim.ObjectMeta)
if err != nil {
	log.Error(err, "Failed to get owning cluster, skipping claim", "claim", claim.Name)
	continue
Consider a scenario where there is no cluster label on the claim. In that case, GetClusterFromMetadata returns ErrNoCluster, and because of the continue here, the claim is skipped completely.
This is a problem: an IPAddressClaim without a cluster label would never be processed and never be deleted, since its finalizer would never be removed.
I think we should process the claim normally when the label is not found, which is what we currently do, to avoid this regression.
}
if cluster != nil && annotations.IsPaused(cluster, claim) {
	scope.Logger().V(4).Info("IPAddress owner IPAddressClaim or linked Cluster is paused, skipping deletion", "ipAddress", ipAddress.Name, "claim", claim.Name)
	continue
This means the deletion is skipped, so the floating IP still exists in OpenStack. The code that follows at line 367 will then mark it as available, which might cause the IP to be allocated to two different claims.
Fix: add this IP to ClaimedIPs. That way, it won't be used by any other resource.
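The double-allocation risk and the proposed fix can be illustrated with a small sketch. The set-based bookkeeping and the names `availableIPs` and `markDeletionSkipped` are assumptions for illustration, not the controller's real data structures.

```go
package main

import "sort"

// availableIPs returns the pool IPs that may be handed to a new claim.
// claimed holds every IP that is still owned, including IPs whose deletion
// was skipped because the owning claim or cluster is paused.
func availableIPs(all []string, claimed map[string]bool) []string {
	free := []string{}
	for _, ip := range all {
		if !claimed[ip] {
			free = append(free, ip)
		}
	}
	sort.Strings(free)
	return free
}

// markDeletionSkipped is a hypothetical helper: when deletion of an IP is
// skipped, the IP is recorded as claimed so it is never reported as available.
func markDeletionSkipped(claimed map[string]bool, ip string) {
	claimed[ip] = true
}
```

Without the call to `markDeletionSkipped`, an IP that still exists in OpenStack would appear in the result of `availableIPs` and could be given to a second claim.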
cluster, err := util.GetClusterFromMetadata(ctx, r.Client, claim.ObjectMeta)
if err != nil {
	log.Error(err, "Failed to get owning cluster, skipping mapping", "claim", claim.Name, "namespace", claim.Namespace)
	return nil
Same issue as above. If the claim doesn't have a cluster label, GetClusterFromMetadata returns ErrNoCluster and this returns nil. This means changes to the claim will never trigger a pool reconciliation: the controller won't even notice that the claim was created, updated, or deleted.
Fix: if there's no cluster, we should still map the event to the pool so reconciliation proceeds normally.
|
@archerwu9425: The following test failed. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. |
|
PR needs rebase. |
What this PR does / why we need it:
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #3029
Special notes for your reviewer:
This PR fixes ipaddressclaim and ipaddress continuing to reconcile while the cluster is paused.
Following the cluster-api ipam provider contract:
https://cluster-api.sigs.k8s.io/developer/providers/contracts/ipam#ipam-provider
Changes include:
openstackfloatingippool.infrastructure.cluster.x-k8s.io to ipaddressclaim
ipaddress, if the ipaddressclaim defined in spec.ClaimRef or the related cluster is paused, abort reconciliation
ipaddress is deleted, update the status of the ipaddressclaim defined in spec.ClaimRef
TODOs:
/hold
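For readers unfamiliar with the pause check this PR relies on, here is a rough sketch of the rule as it is generally understood in Cluster API: an object counts as paused when its cluster is paused or when the object carries the `cluster.x-k8s.io/paused` annotation. The types `cluster` and `object` are simplified stand-ins, and the nil-cluster tolerance mirrors the `cluster != nil` guard in the diff rather than the library helper itself.

```go
package main

// pausedAnnotation is the annotation key Cluster API uses to pause a single object.
const pausedAnnotation = "cluster.x-k8s.io/paused"

// cluster is a simplified stand-in that carries only the pause flag.
type cluster struct {
	Paused bool
}

// object is a simplified stand-in for any annotated resource, such as a claim.
type object struct {
	Annotations map[string]string
}

// isPaused reports whether reconciliation should be skipped for o.
// A nil cluster is treated as "not paused" so that objects without an
// owning cluster are still reconciled.
func isPaused(c *cluster, o object) bool {
	if c != nil && c.Paused {
		return true
	}
	// The annotation's presence is what matters, not its value.
	_, ok := o.Annotations[pausedAnnotation]
	return ok
}
```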