IPv6 BGP-routed Isolated network: missing ct state established,related INPUT rule on VR's IPv6 firewall
Summary
When creating a tenant network using an IPv6-only ROUTED + Filtered offering (internetprotocol=ipv6, networkmode=ROUTED, services including Firewall), the Virtual Router's nftables ip6 ip6_firewall fw_input chain has policy drop and only ICMPv6 accept rules. There is no ct state established,related accept rule on the public NIC.
Because the VR initiates BGP outbound to upstream PE peers, the return SYN-ACK is silently dropped at the v6 INPUT hook, before TCP's MD5 verification ever runs. BGP IPv6 sessions cannot reach Established.
The equivalent IPv4 INPUT chain on the same VR DOES have iifname "eth2" ct state related,established counter accept, and IPv4 BGP works correctly.
Environment
Apache CloudStack 4.22.0.0 (live install on staging mgmt host)
Source analysis cross-checked against 4.20 branch HEAD a7c2a05 — same bug visible in source on both branches
10.25.12.2 Established PfxRcd=1
10.25.12.3 Established PfxRcd=1
Diagnostic
Packet capture on the hypervisor's underlay (bond0, VLAN 258):
VR → PE: TCP SYN (port 179) with MD5
PE → VR: TCP SYN-ACK with MD5
VR → PE: TCP SYN retransmit (VR never sent ACK)
PE → VR: TCP SYN-ACK retransmit
... cycle repeats until VR's connect timeout ...
PE responds correctly. Return packet reaches the VR's eth2. But VR's nftables drops it before TCP processes it.
$ nft list table ip ip4_firewall
table ip ip4_firewall {
    chain INPUT {
        type filter hook input priority filter; policy drop;
        ...
        iifname "eth2" ct state established,related counter packets ... accept
        ...
    }
    ...
}
The IPv4 INPUT chain has the rule on eth2; the IPv6 fw_input chain does not.
Kernel TCPMD5 counters are all zero, confirming the drop happens before TCP state machine — i.e., at netfilter.
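The comparison above (rule present in the IPv4 INPUT chain, absent from the IPv6 fw_input chain) can be checked mechanically. The following is a minimal sketch, not part of CloudStack: has_conntrack_accept is a hypothetical helper, the counter values in SAMPLE_V4 are made up, and SAMPLE_V6 is an illustrative listing invented for the example, since the real v6 listing is not reproduced in this report.

```python
import re


def has_conntrack_accept(chain_listing, iface="eth2"):
    """Return True if some rule accepts established/related traffic arriving on iface."""
    pattern = re.compile(
        r'iifname\s+"?' + re.escape(iface) + r'"?\s+ct state\s+'
        r'(?:established,related|related,established)\b.*\baccept\b'
    )
    return any(pattern.search(line) for line in chain_listing.splitlines())


# Shaped like the IPv4 INPUT chain quoted above (counter values are invented).
SAMPLE_V4 = '''
chain INPUT {
    type filter hook input priority filter; policy drop;
    iifname "eth2" ct state established,related counter packets 12 bytes 960 accept
}
'''

# Illustrative only: a policy-drop chain with nothing but an ICMPv6 accept rule.
SAMPLE_V6 = '''
chain fw_input {
    type filter hook input priority filter; policy drop;
    icmpv6 type { echo-request, echo-reply } accept
}
'''

print(has_conntrack_accept(SAMPLE_V4))  # True
print(has_conntrack_accept(SAMPLE_V6))  # False
```

On a VR, the input would be the text printed by nft list chain ip6 ip6_firewall fw_input; a False result on a ROUTED + Filtered IPv6 network matches the symptom described here.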
Source code root cause
In systemvm/debian/opt/cloud/bin/cs/CsAddress.py, fw_router_routing() writes the default INPUT and FORWARD rules for IPv4 only:
def fw_router_routing(self):
    if self.config.is_vpc() or not self.config.is_routed():
        return

    # Add default rules for INPUT chain
    self.nft_ipv4_fw.append({'type': "", 'chain': 'INPUT',
                             'rule': "iifname lo counter accept"})
    self.nft_ipv4_fw.append({'type': "", 'chain': 'INPUT',
                             'rule': "iifname eth2 ct state related,established counter accept"})  # <-- this rule

    # Add default rules for FORWARD chain
    self.nft_ipv4_fw.append({'type': "", 'chain': 'FORWARD',
                             'rule': 'iifname "eth2" oifname "eth0" ct state related,established counter accept'})
    # ... more v4-only rules ...
There is no IPv6 equivalent of this function — nft_ipv6_fw is not appended-to anywhere. The IPv6 firewall's INPUT chain default rules are entirely missing for ROUTED-mode Isolated networks.
CsNetfilter.py:add_ip6_chain() adds the ct state established,related accept rule only to FORWARD-hooked chains, not INPUT:
Within seconds, both IPv6 BGP sessions reach Established, tenant /64 is advertised, VMs become reachable from IPv6 internet. Verified end-to-end with SSH from public IPv6 internet to VM inside the v6-only routed network.
Caveat: the workaround is in-memory only. Lost on:
VR reboot
Any subsequent cmk createIpv6FirewallRule / cmk deleteIpv6FirewallRule call (ACS regenerates the chain from its own config DB, wiping the manually-added rule)
Any other event that triggers a v6 firewall reconfiguration on the VR
Each tenant FW rule change wipes the workaround. The operator has to SSH back into the VR and re-apply the nft rule after every FW change. This makes the offering effectively unusable as a customer product without the upstream fix.
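Until a fix lands, the manual re-apply step could at least be made idempotent. The sketch below is one way to do that and is not part of CloudStack: workaround_commands and reapply are hypothetical names, the rule text and the FRR restart are taken from the workaround in this report, and the presence check is a plain substring match on the chain listing. It does not make the rule persistent; it only avoids adding it twice.

```python
import subprocess

# Rule text as used in the manual workaround described in this report.
RULE = 'iifname "eth2" ct state established,related counter accept'


def workaround_commands(fw_input_listing):
    """Return the commands still needed; an empty list if the rule is already present."""
    for line in fw_input_listing.splitlines():
        if ('iifname "eth2"' in line
                and "ct state established,related" in line
                and "accept" in line):
            return []
    return [
        # nft receives the whole command as one argument, as in the quoted shell form.
        ["nft", "add rule ip6 ip6_firewall fw_input " + RULE],
        ["systemctl", "restart", "frr"],
    ]


def reapply(dry_run=True):
    """Read the live chain and run (or just print) whatever is missing."""
    listing = subprocess.run(
        ["nft", "list", "chain", "ip6", "ip6_firewall", "fw_input"],
        capture_output=True, text=True, check=True,
    ).stdout
    for cmd in workaround_commands(listing):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

Calling reapply() with the default dry_run=True only prints the commands, which is the safer way to try it on a VR first.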
Proposed fix — VALIDATED on a live VR
Add a v6 equivalent of fw_router_routing() in systemvm/debian/opt/cloud/bin/cs/CsAddress.py plus expose nft_ipv6_fw on CsIP. nft_ipv6_fw already exists on CsConfig (line 43); we just need to plumb it through CsIP and write into it.
Three changes in CsAddress.py:
1. Add reference in CsIP.__init__ (around line 312):
fw_input chain now includes iifname "eth2" ct state established,related counter accept
v6 BGP sessions Established within seconds, PfxRcd=1, PfxSnt=2
Survival test (the key one): After patch, ran cmk createIpv6FirewallRule networkid=<net> traffictype=Ingress protocol=tcp startport=80 endport=80 — this pushes ipv6_firewall_rules.json to the VR and triggers the full IpTablesExecutor flush+rebuild path that previously wiped the manual nft workaround. After the FW change:
iifname "eth2" ct state established,related accept rule persists in fw_input (with active counters)
Both v6 BGP sessions still Established
End-to-end SSH from public IPv6 internet to VM in the network still works
This confirms the fix is correct and durable. The bug is in CsAddress.py / nft_ipv6_fw not being populated; the rest of the pipeline handles the v6 list correctly once it has content.
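For illustration, the population step that the validation above exercises might look like the following. This is a sketch under assumptions, not the patch itself: StubConfig and StubIP are hypothetical stand-ins for CsConfig and CsIP so the snippet runs on its own, and the rule dict shape ('type', 'chain', 'rule') and the chain name 'fw_input' are assumed by analogy with the v4 code and the nft output quoted in this report.

```python
class StubConfig:
    """Hypothetical stand-in for CsConfig with only what this sketch needs."""

    def __init__(self, vpc=False, routed=True):
        self._vpc = vpc
        self._routed = routed
        self.nft_ipv6_fw = []

    def is_vpc(self):
        return self._vpc

    def is_routed(self):
        return self._routed

    def get_ipv6_fw(self):
        return self.nft_ipv6_fw


class StubIP:
    """Hypothetical stand-in for CsIP."""

    def __init__(self, config):
        self.config = config
        # Change 1: keep a reference to the shared v6 rule list.
        self.nft_ipv6_fw = config.get_ipv6_fw()

    def fw_router_routing_v6(self):
        # Same guard as the v4 function: non-VPC, ROUTED networks only.
        if self.config.is_vpc() or not self.config.is_routed():
            return
        # Assumed dict shape and chain name; eth2 is hardcoded as in the v4 rule.
        self.nft_ipv6_fw.append({
            'type': "", 'chain': 'fw_input',
            'rule': 'iifname "eth2" ct state established,related counter accept',
        })


ip = StubIP(StubConfig(vpc=False, routed=True))
ip.fw_router_routing_v6()
print(ip.nft_ipv6_fw)
```

Because the list is shared with the config object, anything that later renders the v6 rule list sees the appended entry, which is consistent with the observation that the rest of the pipeline works once the list has content.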
VPC equivalent
The same gap likely exists in the VPC routed path (fw_vpcrouter_routing at line 674). Not tested here (our setup is non-VPC Isolated) but worth a symmetric audit.
Affected versions
Verified on Apache CloudStack 4.22.0.0 (latest LTS at time of filing). PR #10970, which added the equivalent FORWARD-chain rule, is present and active in this build — but the INPUT-chain rule was deliberately removed in the PR's second commit ("Remove rule from input chain"), leaving this regression.
Severity
High for anyone wanting to deploy IPv6-only ROUTED Isolated networks at scale. The feature appears to work (offering enables, network creates, VR provisions, BGP-v4 establishes) but tenant v6 traffic doesn't route because BGP-v6 silently fails. Diagnosis requires packet captures on the underlay; the failure is not obvious from the VR's own view.
Related
PR #10970 ("IPv6 firewall: accept packets from related and established connections"), which landed in 4.20.2 and 4.22.0.0, added the equivalent rule to the FORWARD chain only. This fixed the VM-return-traffic case (downloads, etc.) but did NOT add the rule to the INPUT chain, leaving the VR's own outbound BGP return traffic still dropped. The PR discussion mentions a second commit, "Remove rule from input chain", suggesting an earlier draft did add the INPUT rule but it was removed in review. The bug described here is the consequence of that removal: VR-originated v6 connections (BGP, but also NTP, DNS lookups, etc., that the systemvm itself initiates outbound) fail on the return.
IsolatedV6RoutedFiltered offering — affected
IsolatedV6RoutedOffering (no Firewall service) — not affected (no firewall service means no ip6_firewall table; v6 BGP works there because no nftables drop happens)
IPv4 ROUTED with same offering shape — works as expected (different code path: fw_router_routing() in CsAddress.py writes the INPUT iifname "eth2" ct state related,established rule for v4)
Environment
8.4.4
IsolatedV6RoutedFiltered (internetprotocol=ipv6, routingmode=Dynamic, networkmode=ROUTED, services [UserData, Firewall, Dhcp, Dns], egressdefaultpolicy=true)
999999 (external)
4200000001-4200000099 (32-bit private)
/48
r-276-VM ASN 4200000052, r-278-VM ASN 4200000081): identical symptom, identical fix.
Steps to reproduce
/48.
createNetwork using the offering.
port 3922, systemvm key from /root/.ssh/id_rsa.cloud).
Expected
VR advertises tenant /64 upstream; VMs in the network are reachable from the IPv6 internet.
Actual
The IPv4 sessions on the SAME VR work normally:
10.25.12.2 Established PfxRcd=1
10.25.12.3 Established PfxRcd=1
Source code root cause
CsNetfilter.py:add_ip6_chain() adds the ct state established,related accept rule only to FORWARD-hooked chains, not INPUT. So for v6 INPUT (fw_input chain), only ICMPv6 is allowed and the chain inherits policy drop. The return BGP traffic never matches any rule and is dropped.
Reproduction confirmed across multiple VRs
Tested independently on two fresh VRs in two different tenant networks. Both showed:
Workaround
On the running VR, apply the missing rule and restart FRR:
    nft 'add rule ip6 ip6_firewall fw_input iifname "eth2" ct state established,related counter accept'
    systemctl restart frr
Proposed fix (VALIDATED on a live VR)
Three changes in CsAddress.py:
1. Add reference in CsIP.__init__ (around line 312):
      self.nft_ipv4_fw = config.get_nft_ipv4_fw()
      self.nft_ipv4_acl = config.get_nft_ipv4_acl()
    + self.nft_ipv6_fw = config.get_ipv6_fw()
2. Add new fw_router_routing_v6() method (immediately before fw_vpcrouter_routing at line 674):
3. Call it from CsIP.configure() (line 756-757):
      self.fw_router_routing()
      self.fw_vpcrouter_routing()
    + self.fw_router_routing_v6()
Note: eth2 is hardcoded matching the v4 convention (and PUBLIC_INTERFACES["router"] in CsHelper.py). A more robust fix could reference that constant.
Validation
Applied this patch in-place on a running VR (r-278-VM, ACS 4.22.0.0) on 2026-05-16:
/opt/cloud/bin/configure.py cmd_line.json triggered re-process
Affected versions (by code inspection + PR #10970 history):