-
Notifications
You must be signed in to change notification settings - Fork 325
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
introduce XLAT, native IPv4 for clients #1808
Conversation
package/gluon-ddhcpd/files/lib/gluon/upgrade/317-ddhcp-init-remove
Outdated
Show resolved
Hide resolved
package/gluon-xlat464-plat/luasrc/lib/gluon/upgrade/440-gluon-xlat464-plat
Outdated
Show resolved
Hide resolved
@christf Thank you for raising this PR. I think PLAT support should be removed, because it is not usable without N2N/S2N support. Rotanid's remarks regarding indentation and double newlines are correct. I didn't bother at the time of writing it. |
yeah, I will rework it to make it mergable. For now I just wanted to share that there is such a set of packages while still commencing the tests. I am still experiencing performance issues in my test network related to MTU problems. This PR will remain WIP until those are resolved. |
@christf Indeed the MTU issue is not handled correctly. From the comments the NAT46 device is intended to be used with a big MTU (as much as the WLAN interface plus IPv6 overhead in our case). But that contradicts the RFC which says for the IPv4 network a very small MTU should be set (IPv6-MTU minus IP-header-size-difference - e.g. 1280 - 60 = 1220). Because the problem is that the resulting IPv6 packets won't get fragmented as the default for IPv6 is "don't fragment". So it needs to happen on the IPv4-side with MSS clamping enabled. The NAT46 modules does not check whether packets are too big at the moment https://github.com/ayourtch/nat46/blob/master/nat46/modules/nat46-core.c#L1843. Performance-wise the NAT46 module might be heavy as it processes each skb directly instead of using a separate worker which is correct in general, but it can take up much memory on high traffic flows, because it reallocates a bigger skb as far as I remember and it takes time until the old one gets freed. This could be optimized. I think the newer kernels have a function to put in a page at the beginning for the IPv6 header and leave the rest of the data untouched and for unlinking references to the resulting skb. So we could possibly save most of the copying process. |
Is there an accidental mixup in this sentence? It's the other way round, queuing something onto a kworker with queue_work() is a performance killer. You don't want to use that on the fast-path for your payload packet flow. The only reason you might want to queue something on a kworker is if you have some packet that needs a longer time to be processed and therefore should be processed in a sleepable/interruptable handler. But lifting onto such a handler is heavy. |
@T-X Yes, sorry I changed the sentence and didn't read it again. Performance-wise it's ordinarily better, but not memory-performance wise. The module uses skb_copy/reallocates the skbs instead of skb_cloning them and adding/removing the headers with push-pull and removing the references, but this would require a worker which checks on the references on the original one and only starts changing the headers when the old skb has been released by everything else. Do you know what I mean? Edit: Edit2: @T-X I assume pskb_expand_head won't give us a real improvement for linear skb's translation from IPv4 to IPv6 as the 60 additional bytes won't be within the margin of the IPv4 max headroom (do you remember where this value is being defined?). |
actually outgoing traffic is ok but inbound is problematic. This is not menat to be a high-performance solution. It just has to work barely enough to allow the few remaining clients that refuse to work without ipv4 use the network. When larger scale applications are required, the best performance optimization is to stop the dns server to hand out A records... |
@christf I agree. I'd still be happy to hear T-X's point, because I don't want to have the same problems with NAT426, but it's OT. |
I have this integrated now into our gluon builds. PMTU discovery works with the additional clat rule. |
ab96d5c
to
96c1df1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i noted a few additional things.
also, i noticed that your PR is using different indentations, even inside the same shell script.
Please try to use at most two, one for lua, one for shell and not mix them, maybe one can look at gluon-core stuff to see which one is preferred
documentation is still missing.
4ab9359
to
223e80d
Compare
This provides native ipv4 to clients on a network having an ipv6-only backend, relying on external plat. This allows to use clients in ipv6-only networks that would otherwise refuse to connect without a valid ipv4 route. Limitations: * External plat must be provided, for example by jool - for example by https://github.com/FreifunkMD/jool-docker.git * This implementation will break ipv4 connections when clients roam.
The following changes were performed:
|
I am using the Magdeburg firmware and therefore this PR since a couple of weeks and did not see any issues in that while. |
@@ -0,0 +1,3 @@ | |||
#!/bin/sh | |||
|
|||
uci set ddhcpd.settings.enabled="0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be merged into 316-ddhcp
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done, please check if it was done correctly
package/gluon-ddhcpd/luasrc/lib/gluon/upgrade/315-dhcp-config-firewall
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks mostly okay, a few nitpicks and then we can get this merged in time for v2020.2.
Example:: | ||
|
||
{ | ||
clat_range = 'fdff:ffff:ffff::/48', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indentation
|
||
clat_range : mandatory | ||
- infrastructure net (ULA) from which a /96 CLAT prefix will be generated. | ||
- This must be a /48 prefix. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A short subclause, explaining why a /48, e.g. how it is partitioned, would go a long way.
end | ||
end | ||
f:close() | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could reuse gluon.util.file_contains_line
.
clat_range : mandatory | ||
- infrastructure net (ULA) from which a /96 CLAT prefix will be generated. | ||
- This must be a /48 prefix. | ||
- This can be the same for each site and is pre-registered at https://wiki.freifunk.net/IP-Netze#IPv6 as part of fdff:ffff:ff00::/40 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should reference such a badly maintained and outdated wiki page. And if it's all the same we might also default to some prefix, no?
@@ -0,0 +1,2 @@ | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
empty line?
|
||
define Package/gluon-ddhcpd | ||
TITLE:=Distributed DHCP Daemon for Gluon | ||
DEPENDS:=+gluon-core +ddhcpd |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where does ddhcpd
come from?
|
||
define Package/gluon-ddhcpd | ||
TITLE:=Distributed DHCP Daemon for Gluon | ||
DEPENDS:=+gluon-core +ddhcpd |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If gluon-ddhcpd
depends on mmfd
that dependency is missing here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ddhcpd depends on a means to distribute multicast throughout the network. In babel networks this capability is provided by mmfd. in batman networks this is provided by the l2-property of batman. It could also be provided by bier or any other means. There is no way to model that currently and depending on mmfd is not the right thing to do for batman - as such a dependency was not placed.
@@ -0,0 +1,22 @@ | |||
#!/usr/bin/lua |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WIth the mmfd rule it's probably appropriate to call the file ddhcpd-firewall
.
@@ -0,0 +1,10 @@ | |||
#!/usr/bin/lua |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no need to sort this behind the firewall rules, both can be put at position 315.
@@ -1,6 +1,8 @@ | |||
need_string_match(in_domain({'node_prefix6'}), '^[%x:]+/64$') | |||
need_string_match(in_domain({'node_client_prefix6'}), '^[%x:]+/64$') | |||
|
|||
need_string_match(in_domain({'clat_range'}), '^[%x:]+/48$', false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does babel require this? And why is this check more specific that the one in gluon-464xlat-clat
? Shouldn't gluon-mesh-babel
just depend on gluon-464xlat-clat
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Babel needs to know which routes (from which IP ranges) should be redistributed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But the clat functionality is only there, when the clat package is actually included, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a node does not use the CLAT (whyever) it should still redistribute the route. I agree that the site-check fits better in the clat-package itself. There is no clear solution for this, but I guess nobody will be mixing firmwares with and without clat-support. Based on that it is fine to move it to the clat-package. I don't think gluon-mesh-babel should depend on the clat package.
shellcheck:
|
Hey @christf , your PR got a review 4 weeks ago and is waiting to be updated :-) |
The Linux kernel and Babel got support for routing IPv4 packets via IPv6 routes:
Which sounds like a neat idea and overall a lot simpler than XLAT? |
@T-X Yes, indeed it looks promising. As a sidenote: I've kind of lost interest in Babel, because there is more future potential in Batman as it is much easier to implement technologies like multipath bonding. Maybe we can have a discussion about it at the end of the year. Btw rhashtables was something to keep in mind for improving Batman performance and possibly fixing the scaling issues. |
Closed because of #3105 |
Just in case someone stumbles over this PR again, in case some renewed interest might come up: Someone at Freifunk Franken has recently implemented SIIT in eBPF, which seems to include code for handling many of the annoying parts, like checksums and fragmentation handling, too: https://git.freifunk-franken.de/jkimmel/siit-bpf |
This introduces native ipv4 support for clients in a babel-based, ipv6-only network. There is a jool instance expected somewhere on the network for the plat part. This setup is used in Magdeburg and working well.
I do not think the ipv4 connections survive roaming with the current implementation but at least android-clients do not disconnect from the network any more due to missing ipv4 connectivity.
@CodeFetch I do hope you do not disapprove raising a PR partly containing your work. Let me know if you do and we can work something out.