Skip to content

🐛 skip reconcile for ipaddress and ipaddressclaim when cluster is paused#3037

Open
archerwu9425 wants to merge 8 commits intokubernetes-sigs:mainfrom
archerwu9425:main
Open

🐛 skip reconcile for ipaddress and ipaddressclaim when cluster is paused#3037
archerwu9425 wants to merge 8 commits intokubernetes-sigs:mainfrom
archerwu9425:main

Conversation

@archerwu9425
Copy link
Copy Markdown

What this PR does / why we need it:

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #3029

Special notes for your reviewer:

This PR is to fix ipaddressclaim and ipaddress keep reconciling while cluster is paused.
Following the cluster-api ipam provider contract:
https://cluster-api.sigs.k8s.io/developer/providers/contracts/ipam#ipam-provider

Changes include:

  1. If the related Cluster is paused, abort reconciliation
  2. Add finalizer openstackfloatingippool.infrastructure.cluster.x-k8s.io to ipaddressclaim
  3. For ipaddress, if the ipaddressclaim defined in spec.ClaimRef or the related cluster is paused, abort reconciliation
  4. If ipaddress is deleted, update the status of the ipaddressclaim defined in spec.ClaimRef

TODOs:

  • squashed commits
  • if necessary:
    • includes documentation
    • adds unit tests

/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 3, 2026
@netlify
Copy link
Copy Markdown

netlify bot commented Mar 3, 2026

Deploy Preview for kubernetes-sigs-cluster-api-openstack ready!

Name Link
🔨 Latest commit b7566a1
🔍 Latest deploy log https://app.netlify.com/projects/kubernetes-sigs-cluster-api-openstack/deploys/69c33a46025701000836ea19
😎 Deploy Preview https://deploy-preview-3037--kubernetes-sigs-cluster-api-openstack.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

@k8s-ci-robot
Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign emilienm for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Mar 3, 2026
@k8s-ci-robot
Copy link
Copy Markdown
Contributor

Welcome @archerwu9425!

It looks like this is your first PR to kubernetes-sigs/cluster-api-provider-openstack 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/cluster-api-provider-openstack has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Mar 3, 2026
@k8s-ci-robot
Copy link
Copy Markdown
Contributor

Hi @archerwu9425. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work.

Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@archerwu9425 archerwu9425 changed the title skip reconcile for ipaddress and ipaddressclaim when cluster is paused (:bug:)skip reconcile for ipaddress and ipaddressclaim when cluster is paused Mar 3, 2026
@archerwu9425 archerwu9425 changed the title (:bug:)skip reconcile for ipaddress and ipaddressclaim when cluster is paused [:bug:] skip reconcile for ipaddress and ipaddressclaim when cluster is paused Mar 3, 2026
@archerwu9425 archerwu9425 changed the title [:bug:] skip reconcile for ipaddress and ipaddressclaim when cluster is paused 🐛 skip reconcile for ipaddress and ipaddressclaim when cluster is paused Mar 3, 2026
@nikParasyr
Copy link
Copy Markdown
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Mar 3, 2026
@archerwu9425
Copy link
Copy Markdown
Author

/test pull-cluster-api-provider-openstack-e2e-test

@archerwu9425
Copy link
Copy Markdown
Author

@lentzi90 @nikParasyr Could you please help review? Thanks

Copy link
Copy Markdown
Contributor

@lentzi90 lentzi90 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR!
I agree that we have a bug in that we reconcile the IPAddresses and claims even when the cluster is paused. This PR seems to do more than just fix this bug though.
It also adds no tests. I think we do need some tests to verify the behavior and I have a few concerns (see below).

Comment thread controllers/openstackfloatingippool_controller.go Outdated
Comment thread controllers/openstackfloatingippool_controller.go
Comment thread controllers/openstackfloatingippool_controller.go Outdated
Comment thread controllers/openstackfloatingippool_controller.go
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Mar 5, 2026
@archerwu9425 archerwu9425 requested a review from lentzi90 March 5, 2026 09:58
@archerwu9425
Copy link
Copy Markdown
Author

@lentzi90 changes made based on comment and tests added. Could you please help review again? Thanks

@archerwu9425
Copy link
Copy Markdown
Author

@lentzi90 Could you please help review again? Thanks

@archerwu9425
Copy link
Copy Markdown
Author

@lentzi90 @nikParasyr Could you please help review? Thanks

@archerwu9425
Copy link
Copy Markdown
Author

@lentzi90 @nikParasyr Could you please help review? Thanks

@nikParasyr
Copy link
Copy Markdown
Contributor

@archerwu9425 i have it on my list, probably end of the week as it is a bit hectic now.
@bnallapeta if you have time in the meantime, would be nice

@bnallapeta
Copy link
Copy Markdown
Contributor

@archerwu9425 i have it on my list, probably end of the week as it is a bit hectic now. @bnallapeta if you have time in the meantime, would be nice

Will put it on my list. But unfortunately, cannot get it to soon.

cluster, err := util.GetClusterFromMetadata(ctx, r.Client, claim.ObjectMeta)
if err != nil {
log.Error(err, "Failed to get owning cluster, skipping claim", "claim", claim.Name)
continue
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider a scenario where there's no cluster label no the claim. In that case, we would get ErrNoCluster and since you have a continue here, it would just skip the claim completely.

This can cause a problem since any IPAddressClaim without a cluster label will never be processed and never be deleted (since finalizer won't be removed).

I think we should process it normally when label is not found as well as this will avoid the regression. Currently, this is what we do.

}
if cluster != nil && annotations.IsPaused(cluster, claim) {
scope.Logger().V(4).Info("IPAddress owner IPAddressClaim or linked Cluster is paused, skipping deletion", "ipAddress", ipAddress.Name, "claim", claim.Name)
continue
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means the deletion will be skipped. So, the floating IP exists in OpenStack. And the following code at line 367 will mark this as available. This might cause the IP to be allocated to two different claims.

Fix: add this to ClaimedIPs. That way, this IP won't be utilized by any other resource.

cluster, err := util.GetClusterFromMetadata(ctx, r.Client, claim.ObjectMeta)
if err != nil {
log.Error(err, "Failed to get owning cluster, skipping mapping", "claim", claim.Name, "namespace", claim.Namespace)
return nil
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue as above. If the claim doesn't have a cluster label, GetClusterFromMetadata returns ErrNoCluster, and this returns nil. Meaning, changes to this claim will never trigger a pool reconciliation. The controller won't even notice the claim was created, updated, or deleted.

fix: If there's no cluster, we should still map the event to the pool so reconciliation proceeds normally.

@k8s-ci-robot
Copy link
Copy Markdown
Contributor

@archerwu9425: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-cluster-api-provider-openstack-test b7566a1 link true /test pull-cluster-api-provider-openstack-test

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 28, 2026
@k8s-ci-robot
Copy link
Copy Markdown
Contributor

PR needs rebase.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.

Projects

Status: Inbox

Development

Successfully merging this pull request may close these issues.

floating ip address conflict error during migration

5 participants