Feature/multiserver plugin #3421
ipspace left a comment
Super-awesome-job!!! Thanks a million.
Tons of comments (as you expected ;). Some of them are just suggestions or pointers to existing helper functions, in other cases I think we can make the whole thing a lot more streamlined with significant rewrites.
I guess about 99% of them are valid 😁
Refactor & review-response changelog
clab.yml generation
Data types
Servers as a dictionary
CLI hooks for VXLAN setup/teardown
Single-server
A few quick thoughts -- more details after I implement the hooks this needs and we rebase it.
Will cherry-pick this change into a new PR. Don't want to have this hidden in a large blob of unrelated code.
👍
👍
That's a huge can of worms. We need the absolute paths in the snapshot and in the Ansible inventory. I think it would be best to have a plugin hook executed very early in "netlab up" so it can adjust the topology data and recreate the snapshot and Ansible inventory before "netlab up" does some real work.
It's resolved from the current installation path in make_paths_absolute
Unless you install netlab as root on lab VMs and in virtual environment on your local machine.
Agreed. That whole thing has to be solved in a different way.
ipspace left a comment
A mixed bag of nits, things that should be fixed (more info on replicated nodes, default server uplink interface name...), and Another Grand Idea (😜). I'm perfectly fine if you tell me to defer the Grand Idea to a later time and just merge this thing.
| {% endfor %}
| {% endfor %}
| {% elif l.node_count == 2 %}
| {% if l.clab is defined and l.clab.vxlan is defined %}
Due to the way we implement undefined Jinja2 values, you only need to test l.clab.vxlan is defined (see line 102 in c1ccc38).
| |-----------|------|---------|
| | **vni_base** | integer | Starting VNI for cross-server links (default: `10000`) |
| | **dstport** | integer | UDP destination port for VXLAN traffic (default: `4789`) |
| | **dev** | string | Default physical interface to bind VXLAN tunnels (default: `ens33`) |
I think it would make more sense to make this eth0 (that's what you get on less-opinionated distros ;) or leave it undefined but make it a required attribute so the user is forced to define it. ens33 is oddly specific.
| site2:
|   members: [ site2-r1, site2-r2, site2-r3, site2-r4, site2-r5 ]
| sites:
|   members: [ site1-r1, site1-r2, site1-r3, site1-r4, site1-r5,
You might want to point out in a comment that it's even better to use members: [ site1, site2 ]
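A minimal sketch of why the shorter form is equivalent, assuming group members can be either node names or names of other groups. The `expand_group` helper and the dictionary layout are hypothetical illustrations, not netlab's actual implementation.

```python
# Hypothetical helper, not netlab code: resolve a group whose members
# may be node names or names of other groups.
def expand_group(name, groups, path=()):
    """Return the node names in group `name`, expanding nested groups."""
    if name in path:  # a group that (indirectly) contains itself
        raise ValueError(f"circular group reference: {' -> '.join(path + (name,))}")
    nodes = []
    for member in groups[name]["members"]:
        if member in groups:      # member is another group: recurse into it
            nodes.extend(expand_group(member, groups, path + (name,)))
        else:                     # member is a plain node name
            nodes.append(member)
    return nodes


groups = {
    "site1": {"members": ["site1-r1", "site1-r2"]},
    "site2": {"members": ["site2-r1", "site2-r2"]},
    "sites": {"members": ["site1", "site2"]},   # the shorter, nested form
}
site_nodes = expand_group("sites", groups)
```

With nested groups, adding a router to site1 automatically adds it to sites, so the two lists cannot drift apart.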
| (multiserver-replicate)=
| ### Replicated Nodes
|
| Nodes listed in **multiserver.replicate** are instantiated on every server. This is useful for infrastructure services that need local access on each physical host — for example, monitoring collectors, route reflectors, or DNS resolvers. |
I wonder how well the inevitably overlapping IP addresses work. Also, I don't think route reflectors are a good example (I can easily see how that would result in split routing).
It might be best to move this section to the end of the document and use your specific example, including an explanation of how the overlapping IP addresses are resolved as you're effectively deploying an implicit anycast service.
On second thought, maybe that's a better way to go -- require an explicit anycast service?
I'm fine with whatever you decide is best, and this is not a showstopper. It's just that I can see too many unexpected consequences, so there should be a large enough "THERE BE DRAGONS" sign attached to this concept ;)
| * **Local links** connecting nodes on the same server remain as regular containerlab veth pairs or bridges.
| * **Cross-server point-to-point links** are provisioned via containerlab's native VXLAN link endpoints (`type: vxlan` in `clab.yml`).
| * **Cross-server multi-access links** use a local Linux bridge on each server, interconnected via host-level VXLAN tunnels configured by generated setup scripts.
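The three link categories above can be sketched as follows. This is an illustration under stated assumptions, not the plugin's code: the `classify_link` and `assign_vnis` helpers, the node-to-server mapping, and the sequential VNI allocation starting at `vni_base` are all hypothetical.

```python
# Hypothetical sketch: classify links by the servers their nodes live on,
# then hand out VNIs (starting at vni_base) to the cross-server links only.
def classify_link(link_nodes, node_server):
    """Return 'local', 'cross-server-p2p', or 'cross-server-multi-access'."""
    servers = {node_server[node] for node in link_nodes}
    if len(servers) == 1:
        return "local"                       # plain veth pair or bridge
    if len(link_nodes) == 2:
        return "cross-server-p2p"            # containerlab VXLAN endpoint
    return "cross-server-multi-access"       # per-server bridge plus tunnels


def assign_vnis(links, node_server, vni_base=10000):
    """Assumed allocation scheme: one sequential VNI per cross-server link."""
    vnis = {}
    next_vni = vni_base
    for name, link_nodes in links.items():
        if classify_link(link_nodes, node_server) != "local":
            vnis[name] = next_vni
            next_vni += 1
    return vnis


node_server = {"r1": "s1", "r2": "s1", "r3": "s2", "r4": "s2"}
links = {
    "l1": ["r1", "r2"],          # both nodes on s1
    "l2": ["r1", "r3"],          # two nodes, two servers
    "l3": ["r1", "r2", "r3"],    # three nodes, two servers
    "l4": ["r3", "r4"],          # both nodes on s2
}
kinds = {name: classify_link(nodes, node_server) for name, nodes in links.items()}
vnis = assign_vnis(links, node_server)
```

Only l2 and l3 consume a VNI in this example; local links never need one.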
Here's a crazy idea: what if you implemented cross-server multi-access links with Linux bridge nodes (https://netlab.tools/node/roles/#implementing-multi-access-links-with-bridge-nodes) -- when analyzing the topology, you could add necessary bridges to nodes and the bridge attribute to links, totally removing the need for extra provisioning scripts.
OTOH, while this would make the end result simpler, you would need a very careful orchestration of steps between this plugin and the bridge code (https://github.com/ipspace/netlab/blob/dev/netsim/roles/bridge.py). You'd have to create the bridge nodes before the pre_transform hook is executed in the bridge role. However, looking at the code, it seems that the roles are the last plugins in the list, so just doing this in the pre_transform hook is probably good enough if you're OK with initializing all the required node data. However, we should not do this too early, or you'd have to crawl through VLAN and VRF links (plus there are topology components and other stuff).
Worst case, I could add another plugin hook ;)
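A rough sketch of the idea, assuming a simplified topology dictionary in which each link has an `interfaces` list and a multi-access link is handed to a bridge node through a `bridge` link attribute, as the comment suggests. The `add_bridge_nodes` function, the `mbr` naming scheme, and the bridge placement rule are hypothetical, and the real data model carries much more node data than shown here.

```python
# Hypothetical sketch, not netlab code: attach a bridge node to every
# multi-access link whose members sit on more than one server.
def add_bridge_nodes(topology, node_server):
    for index, link in enumerate(topology["links"], start=1):
        members = [intf["node"] for intf in link["interfaces"]]
        servers = {node_server[name] for name in members}
        if len(members) <= 2 or len(servers) == 1:
            continue                          # P2P or single-server link: leave as is
        bridge = f"mbr{index}"                # hypothetical naming scheme
        topology["nodes"][bridge] = {"role": "bridge"}
        node_server[bridge] = node_server[members[0]]   # arbitrary placement choice
        link["bridge"] = bridge               # attribute name taken from the comment above
    return topology


topology = {
    "nodes": {"r1": {}, "r2": {}, "r3": {}},
    "links": [
        {"interfaces": [{"node": "r1"}, {"node": "r2"}]},
        {"interfaces": [{"node": "r1"}, {"node": "r2"}, {"node": "r3"}]},
    ],
}
node_server = {"r1": "s1", "r2": "s1", "r3": "s2"}
add_bridge_nodes(topology, node_server)
```

Once the multi-access link hangs off a bridge node, every remaining segment is a point-to-point link, so only the cross-server P2P case needs VXLAN handling and no separate provisioning script is required.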
This is starting to look more and more like a core feature, not a plugin
Perhaps we should add the notion of 'servers' to the topology, and add vxlan support for containerlab links?
| vxlan:
|   vni_base: int
|   dstport: int
|   dev: str
This should be a required attribute. At the moment, it's set from the system defaults anyway, so no harm done.
| vxlan:
|   vni_base: 10000
|   dstport: 4789
|   dev: ens33
I would change this to eth0, or we could omit this default which would (together with _required on the topology attribute) force the user to specify it either in the topology or in the defaults.
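The second option (no built-in default plus a required attribute) could behave roughly like the sketch below. It assumes topology settings override user defaults, which in turn override built-in defaults; the `resolve_vxlan` helper and the error text are hypothetical, and netlab's own attribute validation would do this job in practice.

```python
# Hypothetical sketch: merge VXLAN settings and insist that "dev" is supplied
# by the user, because no sensible built-in default exists for it.
BUILT_IN_DEFAULTS = {"vni_base": 10000, "dstport": 4789}   # deliberately no "dev"


def resolve_vxlan(user_defaults, topology_settings):
    """Later sources win: built-in < user defaults < topology."""
    settings = {**BUILT_IN_DEFAULTS, **user_defaults, **topology_settings}
    if not settings.get("dev"):
        raise ValueError("vxlan.dev must be set in the topology or in the defaults")
    return settings


from_defaults = resolve_vxlan({"dev": "eth0"}, {})
from_topology = resolve_vxlan({"dev": "eth0"}, {"dev": "ens192", "vni_base": 20000})
```

Calling `resolve_vxlan({}, {})` raises `ValueError`, which is the point: the user finds out immediately instead of discovering later that tunnels were bound to an interface that does not exist on their servers.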
| # When true, 'netlab up --snapshot' auto-runs vxlan-setup.sh via a CLI hook.
| # Set false to keep cross-server tunnels inactive until you run the script
| # manually (e.g. to stage convergence or connect servers on your own schedule).
| auto_start: true
Another minor detail that would be unnecessary with the "bridge" nodes ;))
| # Register atexit handler to copy node_files, host_vars, etc. into each server
| # folder after netlab writes all output files.
| if server_folders:
|     import atexit
The "post-output" callback is there. Please use that.
| for name, s in servers.items():
|     if "id" not in s:
|         s.id = _dataplane.get_next_id("multiserver_server")
|     if "host" not in s:
If you set the "host" attribute to be required, it will eventually be checked, and if you don't use the "host" value very early, we should be OK.
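One way to picture the suggested split is a loop that only fills in missing IDs, with the check for required attributes left to a later, generic validation pass. Both helpers below are hypothetical stand-ins: `assign_server_ids` is not `_dataplane.get_next_id`, and `missing_required` only imitates what attribute validation would report.

```python
# Hypothetical sketch: the plugin loop only allocates IDs; required
# attributes such as "host" are reported by a separate validation step.
def assign_server_ids(servers):
    """Give every server without an ID the lowest ID that is still unused."""
    used = {s["id"] for s in servers.values() if "id" in s}
    candidate = 1
    for server in servers.values():
        if "id" in server:
            continue
        while candidate in used:
            candidate += 1
        server["id"] = candidate
        used.add(candidate)


def missing_required(servers, required=("host",)):
    """Stand-in for schema validation: list the missing required attributes."""
    return [
        f"{name}.{attr}"
        for name, server in servers.items()
        for attr in required
        if attr not in server
    ]


servers = {
    "alpha": {"host": "192.0.2.1"},
    "beta": {"id": 1},                 # explicit ID, but no host
    "gamma": {"host": "192.0.2.3"},
}
assign_server_ids(servers)
problems = missing_required(servers)
```

This works as long as nothing reads the "host" value before validation runs, which matches the caveat in the comment above.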
@Muddyblack -- I'm sorry for the long delay. I wanted to push the June release out before Autocon5 (and it turns out one has a limited daily amount of mental energy as one gets older), and then Autocon5 struck (just joking, it was a great conference). Anyway, I think we should implement multi-access links with bridge nodes in the long run to make your life easier. I could even make that functionality part of netlab core (in the "bridge" role) to have it available regardless of whether the user uses this plugin. Hey, maybe I could even do that by default just to simplify containerlab provisioning (no need for "manual" creation of Linux bridges). Whether we do that as part of this PR, or merge this and then work on that, is completely up to you. Just let me know ;)
Update: the "implement clab multi-access networks with bridge nodes" idea is much harder to implement than I thought if we want to support VLAN trunks or routed VLANs across multi-access networks -- we would have to split the multi-access link into P2P links very late in the transformation process. For the moment, let's fix the other stuff and move forward with this. When I implement that other idea, the multi-access part of the code in this PR will become irrelevant, at which point we can do a cleanup.
No problem, some time away from the computer to recharge is also good for me ;)
Reference: #3420
Summary
This PR adds the multiserver plugin to distribute a single Netlab topology across multiple physical servers. Sadly, for now it supports only the containerlab provider.
Key Details
netsim/extra/multiserver/ and doesn't modify any core Netlab engine logic.
clab.yml and netlab.snapshot.pickle.
sudo netlab up --snapshot -vv without needing custom CLI options.
I am not sure whether the test files make any sense, but they at least show that the plugin does not interfere with the normal netlab workflow.
Explanations on how it works can be found in docs/plugins/multiserver.md.