
[Multi-SLB] external->internal transition can lose current-LB hint and target sibling LB #10050

@nilo19

Description


What happened:

In multi-SLB mode, after manually flipping a Service from external to internal, the service did not stay on its current LB configuration on the next reconcile.

The service was originally associated with one LB config (for example lb1). After the external->internal flip, the controller correctly reconciled the service onto the internal sibling LB (lb1-internal). But a
later reconcile no longer considered the service to be “currently on lb1” and instead tried to migrate it to another eligible LB that did not even exist yet (for example lb2-internal).

This looks like activeServices state for the current LB config is being removed during the opposite-LB cleanup reconcile.
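If that hypothesis is right, the mechanism is the suffix trim in the multi-SLB status update (shown in the analysis further down): both LB flavors collapse to the same config-level key. A toy mirror of that trim, assuming case-insensitive suffix handling (this is a simplified stand-in, not the real helper):

```go
package main

import (
	"fmt"
	"strings"
)

// trimInternal is a toy stand-in for trimSuffixIgnoreCase with the
// "-internal" suffix; its behavior is assumed from the excerpts below.
func trimInternal(name string) string {
	if strings.HasSuffix(strings.ToLower(name), "-internal") {
		return name[:len(name)-len("-internal")]
	}
	return name
}

func main() {
	// Both flavors map to the same ActiveServices bucket key, so a
	// wantLb=false update for "lb1" also mutates state that
	// "lb1-internal" relies on.
	fmt.Println(trimInternal("lb1"))          // lb1
	fmt.Println(trimInternal("lb1-Internal")) // lb1
}
```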

What you expected to happen:

After flipping a service from external to internal in multi-SLB mode, the service should stay on the same LB configuration (lb1 -> lb1-internal) unless placement rules actually require it to move.

A follow-up reconcile should keep the service on its current LB configuration, not treat it as unplaced and attempt migration to another eligible LB.

How to reproduce it (as minimally and precisely as possible):

  1. Enable multi-SLB with at least two eligible LB configs, for example lb1 and lb2.
  2. Create an external Service of type LoadBalancer that lands on lb1.
  3. Manually flip the service to internal by setting service.beta.kubernetes.io/azure-load-balancer-internal: "true".
  4. Let the controller reconcile once so the service is moved onto lb1-internal.
  5. Trigger another reconciliation for the same service.
  6. Observe that the controller may now try to move the service to another eligible LB, especially one whose internal sibling resource does not exist yet (for example lb2-internal), instead of keeping the service on lb1-internal.

Anything else we need to know?:

I traced the failure path to ActiveServices bookkeeping in pkg/provider/azure_loadbalancer.go.

The problem appears to be:

  • reconcileService() first reconciles the desired service, then always runs a cleanup reconcile for the flipped service to remove the opposite LB flavor:

```go
updateService := updateServiceLoadBalancerIPs(service, lbIPsPrimaryPIPs)
flippedService := flipServiceInternalAnnotation(updateService)
if _, _, err := az.reconcileLoadBalancer(ctx, clusterName, flippedService, nil, false /* wantLb */); err != nil {
    logger.Error(err, "Failed to reconcile flipped LoadBalancer")
    return nil, err
}
```
  • During that cleanup reconcile, if the old frontend IP config is removed, fipChanged becomes true and the controller updates multi-SLB status with wantLb=false:

```go
if fipChanged {
    az.reconcileMultipleStandardLoadBalancerConfigurationStatus(wantLb, serviceName, lbName)
}
```

    The frontend IP config removal itself happens in this wantLb=false cleanup path:

```go
if !wantLb {
    for i := len(newConfigs) - 1; i >= 0; i-- {
        config := newConfigs[i]
        isServiceOwnsFrontendIP, _, _ := az.serviceOwnsFrontendIP(ctx, config, service)
        if isServiceOwnsFrontendIP {
            unsafe, err := az.isFrontendIPConfigUnsafeToDelete(lb, service, config.ID)
            if err != nil {
                return nil, toDeleteConfigs, false, err
            }
            // If the frontend IP configuration is not being referenced by:
            // 1. loadBalancing rules of other services with different ports;
            // 2. outbound rules;
            // 3. inbound NAT rules;
            // 4. inbound NAT pools,
            // do the deletion, or skip it.
            if !unsafe {
                var configNameToBeDeleted string
                if newConfigs[i].Name != nil {
                    configNameToBeDeleted = *newConfigs[i].Name
                    logger.V(2).Info("dropping", "service", serviceName, "wantLb", wantLb, "configNameToBeDeleted", configNameToBeDeleted)
                } else {
                    logger.V(2).Info("nil name", "service", serviceName, "wantLb", wantLb)
                }
                toDeleteConfigs = append(toDeleteConfigs, newConfigs[i])
                newConfigs = append(newConfigs[:i], newConfigs[i+1:]...)
                // ...
```
  • reconcileMultipleStandardLoadBalancerConfigurationStatus() trims the -internal suffix before updating ActiveServices, so external and internal sibling LBs share the same config-level status bucket:

```go
func (az *Cloud) reconcileMultipleStandardLoadBalancerConfigurationStatus(wantLb bool, svcName, lbName string) {
    logger := log.Background().WithName("reconcileMultipleStandardLoadBalancerConfigurationStatus")
    lbName = trimSuffixIgnoreCase(lbName, consts.InternalLoadBalancerNameSuffix)
    for i := range az.MultipleStandardLoadBalancerConfigurations {
        if strings.EqualFold(lbName, az.MultipleStandardLoadBalancerConfigurations[i].Name) {
            az.multipleStandardLoadBalancersActiveServicesLock.Lock()
            if wantLb {
                logger.V(4).Info("service is active on lb", "service", svcName, "lb", lbName)
                az.MultipleStandardLoadBalancerConfigurations[i].ActiveServices = utilsets.SafeInsert(az.MultipleStandardLoadBalancerConfigurations[i].ActiveServices, svcName)
            } else {
                logger.V(4).Info("service is not active on lb any more", "service", svcName, "lb", lbName)
                az.MultipleStandardLoadBalancerConfigurations[i].ActiveServices.Delete(svcName)
            }
            // ...
```
  • That means cleanup of the old external LB can delete the same ActiveServices["lb1"] entry that is needed to remember the service is now active on lb1-internal.
  • On the next reconcile, current LB selection relies only on that config-level ActiveServices state:

```go
currentLBName := az.getServiceCurrentLoadBalancerName(service)
lbNamePrefix = getMostEligibleLBForService(currentLBName, eligibleLBs, existingLBs, requiresInternalLoadBalancer(service))
```

```go
func (az *Cloud) getServiceCurrentLoadBalancerName(service *v1.Service) string {
    for _, multiSLBConfig := range az.MultipleStandardLoadBalancerConfigurations {
        if az.isLoadBalancerInUseByService(service, multiSLBConfig) {
            return multiSLBConfig.Name
        }
    }
    return ""
}
```
  • Once the current LB name is empty, getMostEligibleLBForService() explicitly prefers the first eligible LB that does not exist yet:

```go
func getMostEligibleLBForService(
    currentLBName string,
    eligibleLBs []string,
    existingLBs []*armnetwork.LoadBalancer,
    isInternal bool,
) string {
    logger := log.Background().WithName("getMostEligibleLBForService")

    // 1. If the LB is eligible and being used, choose it.
    if StringInSlice(currentLBName, eligibleLBs) {
        logger.V(4).Info("choose LB as it is eligible and being used", "currentLBName", currentLBName)
        return currentLBName
    }

    // 2. If the LB is eligible and not created yet, choose it because it has the fewest rules.
    for _, eligibleLB := range eligibleLBs {
        var found bool
        for i := range existingLBs {
            existingLB := existingLBs[i]
            if strings.EqualFold(trimSuffixIgnoreCase(ptr.Deref(existingLB.Name, ""), consts.InternalLoadBalancerNameSuffix), eligibleLB) &&
                isInternalLoadBalancer(existingLB) == isInternal {
                found = true
                break
            }
        }
        if !found {
            logger.V(4).Info("choose LB as it is eligible and not existing", "eligibleLB", eligibleLB)
            return eligibleLB
        }
    }
    // ...
```
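To make the failure sequence concrete, here is a self-contained sketch. The helpers are simplified mirrors of the excerpts above (trimInternal, markActive, currentLB, and mostEligibleLB are toy names, and the isInternal flavor check is folded into the existing-name list), not the real provider code:

```go
package main

import (
	"fmt"
	"strings"
)

// trimInternal is a simplified stand-in for
// trimSuffixIgnoreCase(lbName, consts.InternalLoadBalancerNameSuffix).
func trimInternal(name string) string {
	if strings.HasSuffix(strings.ToLower(name), "-internal") {
		return name[:len(name)-len("-internal")]
	}
	return name
}

// active maps config name -> set of active services, mirroring the
// config-level ActiveServices bookkeeping.
var active = map[string]map[string]bool{"lb1": {}, "lb2": {}}

// markActive mirrors reconcileMultipleStandardLoadBalancerConfigurationStatus:
// the trim makes external and internal sibling LBs share one bucket.
func markActive(wantLb bool, svc, lbName string) {
	set, ok := active[trimInternal(lbName)]
	if !ok {
		return
	}
	if wantLb {
		set[svc] = true
	} else {
		delete(set, svc)
	}
}

// currentLB mirrors getServiceCurrentLoadBalancerName.
func currentLB(svc string) string {
	for _, cfg := range []string{"lb1", "lb2"} {
		if active[cfg][svc] {
			return cfg
		}
	}
	return ""
}

// mostEligibleLB mirrors steps 1-2 of getMostEligibleLBForService.
func mostEligibleLB(current string, eligible, existing []string) string {
	for _, lb := range eligible {
		if lb == current {
			return current // eligible and in use
		}
	}
	for _, lb := range eligible {
		found := false
		for _, ex := range existing {
			if strings.EqualFold(trimInternal(ex), lb) {
				found = true
				break
			}
		}
		if !found {
			return lb // eligible and not created yet
		}
	}
	return ""
}

func main() {
	svc := "default/svc"
	markActive(true, svc, "lb1")          // external service lands on lb1
	markActive(true, svc, "lb1-internal") // flip: internal reconcile, wantLb=true
	markActive(false, svc, "lb1")         // cleanup of external flavor, wantLb=false
	// The shared bucket lost the entry, so the service looks unplaced
	// and the picker falls through to the not-yet-existing lb2.
	fmt.Printf("currentLB=%q\n", currentLB(svc))
	fmt.Println("next pick:", mostEligibleLB(currentLB(svc), []string{"lb1", "lb2"}, []string{"lb1-internal"}))
}
```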

So the issue is not just the later selection logic. The underlying bug is that opposite-LB cleanup removes the config-level ActiveServices entry even when the service is still active on the sibling LB of the same
config.

This also looks symmetric for internal->external flips.
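One possible direction for a fix (an assumption on my part, not a proposed patch, and the names below are illustrative rather than the provider's API): track activity per exact LB name instead of per trimmed config name, and answer the config-level question by checking both flavors, so cleanup of one flavor cannot erase the sibling's claim:

```go
package main

import "fmt"

// Hypothetical alternative bookkeeping: one bucket per exact LB name.
var activeByLB = map[string]map[string]bool{}

func markActive(wantLb bool, svc, lbName string) {
	if activeByLB[lbName] == nil {
		activeByLB[lbName] = map[string]bool{}
	}
	if wantLb {
		activeByLB[lbName][svc] = true
	} else {
		delete(activeByLB[lbName], svc)
	}
}

// activeOnConfig reports whether svc is active on either flavor of cfg.
func activeOnConfig(svc, cfg string) bool {
	return activeByLB[cfg][svc] || activeByLB[cfg+"-internal"][svc]
}

func main() {
	svc := "default/svc"
	markActive(true, svc, "lb1")            // external placement
	markActive(true, svc, "lb1-internal")   // flip to internal
	markActive(false, svc, "lb1")           // cleanup of the external flavor
	fmt.Println(activeOnConfig(svc, "lb1")) // true: sibling keeps the config active
}
```

An equivalent minimal change might be to skip the ActiveServices deletion during the opposite-flavor cleanup reconcile when the service is still active on the sibling LB of the same config.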

I also ran nearby provider tests locally:
go test -run 'TestGetMostEligibleLBName|TestGetServiceLoadBalancerMultiSLB|TestEnsureLoadBalancerDeleted' ./pkg/provider
Those passed, which suggests this path is missing regression coverage rather than already being covered by a failing test.

Environment:

  • Kubernetes version (use kubectl version): <fill in>
  • Cloud provider or hardware configuration: Azure, Standard LoadBalancer SKU, multi-SLB enabled
  • OS (e.g: cat /etc/os-release): <fill in>
  • Kernel (e.g. uname -a): <fill in>
  • Install tools: <fill in>
  • Network plugin and version (if this is a network-related bug): <fill in>

Labels: kind/bug