Sample IPv4 packet matched by several sample filters iff optimization is enabled. #430
This has nothing to do with out-of-bounds access - the BPF code generator optimizes away the "ether[1] > tcp[100]" test, so that test isn't even performed, regardless of whether there are 100 bytes of packet data following the beginning of the TCP header or not.
That doesn't happen for the expression "ether[1] > tcp[100]", but does happen for "ether[1] > tcp[100] or ip". Somehow, if you do "ether[1] > tcp[100]" without the optimizer, it tests for both IPv4 and IPv6 TCP packets where ether[1] > tcp[100], but, with the optimizer, the IPv6 test is removed. If no IPv6 test is done, "ether[1] > tcp[100] or ip" is equivalent to just "ip", as all packets matched by "ether[1] > tcp[100]" are IP packets, but if the IPv6 test is done, there might be packets matched by "ether[1] > tcp[100]" that aren't matched by "ip", because they're IPv6 packets not matched by "ip" (you need "ip6" for that).
Is "ether[1] > tcp[100] or ip" really equivalent to ip, even given that it's an ipv4 packet? My understanding is that an out-of-bounds packet access makes the whole expression fail, not just the current clause, which invalidates some optimization ideas that would work if clauses were independent.
Tangentially, other expressions show the same difference in behaviour with the optimizer on.
BTW, if anybody's curious about the history of pflang, It's All Van And Steve's Fault. See the video and slides from Steve McCanne's keynote presentation at Sharkfest '11.
Given that the behavior of which I speak can be demonstrated by using the
It's probably best to think of Steve's little 6502-flavored machine as having "hardware" bounds checking, with bounds check faults causing a failure, and of the compiler as ignoring that characteristic, so that any "as if" rules in optimization ignore the possibility of a bounds check fault.
The underlying problem is that, for Internet protocol family transport-layer protocols, their use in general expressions such as "tcp[100]" generates code that tests only for IPv4, not for IPv6. As a comment in the source acknowledges, this is a known limitation.
Thank you for the link to the keynote; the slides on libpcap's history are quite interesting, and provide some perspective. But, going back to the bug report: I see two different issues here, and I would appreciate it if they were not conflated.

a) Issues related to IPv6. As both man pcap-filter and the source say, there are a number of interesting things here, including many filters implying IPv4. I would appreciate it if you would open a separate bug if you'd like to consider these issues.

b) I have provided fairly minimal test cases: one IPv4 packet and half a dozen tiny filters that match it when optimization is enabled, and do not match it when optimization is disabled. I would appreciate it if this were confirmed and fixed, or if you would clearly explain why you do not consider it a bug and state that the behaviour I have reported is expected.

I've avoided speculating about the root cause; I've merely reported some test cases I've found that trigger this behaviour, and remarked on features they have in common. As you've pointed out, the test in the first clause is being optimized out; if that is the root cause of the behaviour I'm reporting, I would question whether that is a valid optimization, depending on the intended semantics of the language.
On the off-chance it's marginally interesting,
I have seen nothing in the tcpdump man page, the pcap-filter man page, or the USENIX paper on BPF to indicate that the "and" and "or" operators in the filter language are to be treated as C-style short-circuit operators for which the order is important, or that failure on an out-of-bounds reference is to be treated as behavior that cannot be optimized out by optimizing out the references. (I don't consider the fact that you can write "and" and "or" as "&&" and "||", using the same symbols as C-style short-circuit operators, to be an explicit indication that they are to be treated as C-style short-circuit operators.) The BPF paper speaks of the ability to re-organize the control flow graph "so [a] value is only used at nodes that follow the original computation", which implies, for example, removal of redundant tests.
I'm not assuming C-style short-circuit behaviour; it's not consistent with what tcpdump shows with the optimizer enabled, although off-hand I can't think of an unoptimized expression that's incompatible with it or with left-to-right evaluation of logical clauses.
On the other hand, like you, I have not seen documentation which answers my questions about the semantics of the language. If clause order is not significant, users of the language cannot even rely on adding explicit length checks before a packet access to avoid the divergences in behaviour in the out-of-bounds examples I've presented. And, for users, it's rather disconcerting to have basic filter expressions match or not match depending on whether optimization is enabled; it's not particularly great if developers can't tell what even a 12-character expression will match.

I think the key question here is whether what's happening is user-hostile and developer-hostile undefined behaviour, making the optimizations valid but destroying the idea of many filters having defined semantics, or whether having consistent behaviour is important and the optimizer needs to be redefined to take more things into account.
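(To make that concrete with an illustration of my own, not an expression from the thread: a user might try an explicit guard such as "(len > 134 and ether[1] > tcp[100]) or ip" - the 134 assumes an untagged Ethernet frame and a 20-byte IPv4 header, so that tcp[100] reads packet byte 134 - but such a guard only helps if the length test is guaranteed to be evaluated before the access.)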
The filter in question fails with a "division by zero" error either way. Short-circuiting has nothing to do with that - it happens regardless of the order in which things are tested.
(The fact that tcpdump doesn't print out generated filter code means that no such code was generated; "division by zero" is an error string returned by the filter compiler.)

The reason why the behavior is different when optimization is enabled and when it's not enabled is that the check for division and modulus by zero is done in the optimizer, at the time it does compile-time evaluation of expressions. I've checked in a change to detect division by a constant zero at code generation time, prior to optimization, so the checks are done regardless of whether optimization is enabled. That way, the expression is rejected whether or not the optimizer runs.

The key question here is whether run-time exceptions should be treated, by users and developers, as a mechanism for doing tests that match or don't match packets, so that code that could generate run-time bounds exceptions cannot be optimized away - or whether, if they want to do length checks, they should do so explicitly, rather than relying on the bounds exception causing the entire program to fail instead of, for example, causing only tests using the value whose fetch caused the bounds exception to fail. I vote for the latter.
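A minimal sketch of what that change means for callers (my illustration, not code from the thread; per the comment above, "division by zero" is the error string): with a constant zero divisor, pcap_compile() should now fail even with the optimizer disabled:

```c
#include <stdio.h>
#include <pcap.h>

int main(void)
{
    pcap_t *p = pcap_open_dead(DLT_EN10MB, 65535);
    struct bpf_program fp;

    /* optimize=0: the divide-by-constant-zero check now fires anyway */
    if (pcap_compile(p, &fp, "len / 0 = 1", 0, PCAP_NETMASK_UNKNOWN) == -1)
        printf("rejected: %s\n", pcap_geterr(p));
    else
        pcap_freecode(&fp);

    pcap_close(p);
    return 0;
}
```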
The semantics should be that a subexpression that makes an out-of-bounds reference or gets a {divide,modulo}-by-zero error evaluates to false, rather than causing the entire expression to return false. (On the reported 66-byte packet, that would make "ether[1] > tcp[100]" false and "ip" true, so "ether[1] > tcp[100] or ip" would match, agreeing with the optimized result.) Unfortunately, that's difficult to implement efficiently when generating code for the existing BPF machine; there are other things that are also difficult to implement, such as IPv6 protocol chasing, so the entire filtering mechanism needs a rethink.
I think that is probably too much of a breaking change, given the amount of history libpcap has, and how widely it is used. What do you think about the idea that "libpcap's optimizer should never change the accept/reject status of any given packet+filter pair, compared to unoptimized libpcap"? |
Some filters with out-of-bounds packet accesses behave differently depending on whether optimization is enabled. Take a two-clause expression in which an out-of-bounds packet access in the first clause is joined by the 'or' operator to a clause that would match a packet: with the optimizer disabled, such a filter never matches; with the optimizer enabled, the second clause is sometimes run, allowing the filter to match packets.
tcpdump version 4.5.1
libpcap version 1.5.3
The examples are run on a 66-byte TCP ACK packet, which can be downloaded from https://github.com/Igalia/pflua/raw/master/tests/data/tcp-ack-66-bytes.pcap
"ether[1] > tcp[100] or ip" triggers this inconsistency; it matches the test packet with optimizations enabled, and does not match with optimizations disabled.
Here are more test cases which show similar behavior. Note that the test packet is 66 bytes, so tcp[100], ether[100], and ip[100] are all out of bounds, and that the protocols match the packet.
Source to run the above tests:
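The original script isn't reproduced above. As a rough equivalent (my sketch, not the reporter's source; the capture filename comes from the link above), the report can be checked with documented libpcap calls alone:

```c
#include <stdio.h>
#include <pcap.h>

int main(void)
{
    char errbuf[PCAP_ERRBUF_SIZE];
    pcap_t *p = pcap_open_offline("tcp-ack-66-bytes.pcap", errbuf);
    const char *expr = "ether[1] > tcp[100] or ip";
    struct pcap_pkthdr *hdr;
    const u_char *pkt;
    struct bpf_program fp;
    int optimize;

    if (p == NULL) {
        fprintf(stderr, "%s\n", errbuf);
        return 1;
    }
    /* Read the single test packet from the capture. */
    if (pcap_next_ex(p, &hdr, &pkt) != 1) {
        fprintf(stderr, "read failed: %s\n", pcap_geterr(p));
        return 1;
    }
    /* Compile the same filter with and without optimization and
     * apply each program to the packet. */
    for (optimize = 0; optimize <= 1; optimize++) {
        if (pcap_compile(p, &fp, expr, optimize, PCAP_NETMASK_UNKNOWN) == -1) {
            fprintf(stderr, "pcap_compile: %s\n", pcap_geterr(p));
            return 1;
        }
        printf("optimize=%d: %s\n", optimize,
               pcap_offline_filter(&fp, hdr, pkt) ? "match" : "no match");
        pcap_freecode(&fp);
    }
    pcap_close(p);
    return 0;
}
```

Per the report, this should print "no match" for optimize=0 and "match" for optimize=1.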