OR'ing vlans impossible in tcpdump filter #158

Open
guyharris opened this issue Apr 15, 2013 · 12 comments

Comments

@guyharris
Member

Converted from SourceForge issue 3469486, submitted by spp2

There is a small bug I have seen while trying to filter on several VLANs;
I'm unable to specify more than one VLAN.

Obviously, some variable keeps incrementing by mistake;
the first check is made against bytes 12,13 (= Tag Type) then 14,15 (vlan id) for the first VLAN,
but the next check is made against bytes 16,17 for the tag type, then 18,19 for the next VLAN,
and so on.

Here is an example:
tcpdump -d vlan 2 or vlan 3
(000) ldh [12]
(001) jeq #0x8100 jt 2 jf 5
(002) ldh [14]
(003) and #0xfff
(004) jeq #0x2 jt 10 jf 5
(005) ldh [16]
(006) jeq #0x8100 jt 7 jf 11
(007) ldh [18]
(008) and #0xfff
(009) jeq #0x3 jt 10 jf 11
(010) ret #96
(011) ret #0
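
To make the effect of the dump above concrete, here is a minimal Python sketch that hand-translates the compiled filter for "vlan 2 or vlan 3" and runs it on synthetic frames. The helper names (make_frame, ldh, vlan2_or_vlan3) are made up for this illustration and are not part of libpcap; the frames are zero-filled apart from the tag fields.

```python
import struct

def make_frame(vlan_ids, ethertype=0x0800):
    """Build a minimal Ethernet frame with zero or more 802.1Q tags."""
    frame = bytes(12)  # destination + source MAC, zeroed for the example
    for vid in vlan_ids:
        frame += struct.pack("!HH", 0x8100, vid & 0x0FFF)
    frame += struct.pack("!H", ethertype)
    return frame + bytes(46)  # padding so every offset read stays in range

def ldh(frame, offset):
    """Equivalent of BPF 'ldh [offset]': load a big-endian 16-bit value."""
    return struct.unpack_from("!H", frame, offset)[0]

def vlan2_or_vlan3(frame):
    """Hand translation of the dump for 'vlan 2 or vlan 3'."""
    # Instructions 000-004: tag type at 12, VLAN ID at 14.
    if ldh(frame, 12) == 0x8100 and (ldh(frame, 14) & 0x0FFF) == 2:
        return True
    # Instructions 005-009: the second test has moved 4 bytes further in.
    return ldh(frame, 16) == 0x8100 and (ldh(frame, 18) & 0x0FFF) == 3

print(vlan2_or_vlan3(make_frame([2])))     # True
print(vlan2_or_vlan3(make_frame([3])))     # False: single tag, VLAN 3
print(vlan2_or_vlan3(make_frame([5, 3])))  # True: VLAN 3 inside VLAN 5
```

A singly tagged VLAN 3 frame is rejected, while a frame carrying VLAN 3 inside any outer VLAN is accepted, which is the behavior described in this report.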

... everything is correct with only one VLAN :
tcpdump -d vlan 2
(000) ldh [12]
(001) jeq #0x8100 jt 2 jf 6
(002) ldh [14]
(003) and #0xfff
(004) jeq #0x2 jt 5 jf 6
(005) ret #96
(006) ret #0

First tried with argus (of Qosient) and libpcap 0.8, same problem:

argus -X -b - vlan 2

(000) ldh [12]
(001) jeq #0x8100 jt 2 jf 6
(002) ldh [14]
(003) and #0xfff
(004) jeq #0x2 jt 5 jf 6
(005) ret #96
(006) ret #0

I have compiled libpcap-1.2.1, same behavior.

Hope this helps.

Regards,

Stephane Peters.

@guyharris
Member Author

Submitted by guy_harris

This is documented behavior. To quote the pcap-filter man page:

   vlan [vlan_id]
          True if the packet is an IEEE 802.1Q VLAN packet.  If  [vlan_id]
          is specified, only true if the packet has the specified vlan_id.
          ******Note that the  first  vlan  keyword  encountered  in  expression
          changes  the decoding offsets for the remainder of expression on
          the assumption that the packet  is  a  VLAN  packet.******   The  vlan
          [vlan_id]  expression  may  be used more than once, to filter on
          VLAN hierarchies.  Each use of that  expression  increments  the
          filter offsets by 4.

          For example:
               vlan 100 && vlan 200
          filters on VLAN 200 encapsulated within VLAN 100, and
               vlan && vlan 300 && ip
          filters  IPv4  protocols  encapsulated  in VLAN 300 encapsulated
          within any higher order VLAN.

Note the sentence emphasized in "******"; that might not be stated as clearly as it should be, but it means, for better or worse, that "vlan 2 or vlan 3" means "this packet is a VLAN-in-VLAN packet, with the outer VLAN being VLAN 2 and the inner VLAN being VLAN 3".

This was done in order to allow deeper checks, e.g. for fields in the IP or transport layer headers, to properly find those headers.

It's not the best way to do that, as:

1) it has the problem you note (with "and" it does what most would probably expect and want, but with "or" it doesn't);

2) it means that the "obvious" filters aren't sufficient on networks with VLANs - for example, "tcp port 80", by itself, won't match VLAN packets sent to TCP port 80;

but the best way to do it requires some additional capabilities in the BPF interpreter or an increase in the complexity of the BPF code generated.

It might be possible to come up with a hack so that "vlan N and vlan M" tests VLAN headers at two different levels (doing it at the same level would result in a filter that matches no packets unless N = M) and "vlan N or vlan M" tests VLAN headers at the same level.

@guyharris
Member Author

Submitted by spp2

So it seems that I have to use some kind of workaround. The following does what I need,
i.e. extracts VLANs 2 and 3 from a tapped trunk that carries several other VLANs:

tcpdump -d 'vlan and ether[14:2]&0xfff=0x2 or ether[14:2]&0xfff=0x3'
(000) ldh [12]
(001) jeq #0x8100 jt 2 jf 5
(002) ldh [14]
(003) and #0xfff
(004) jeq #0x2 jt 8 jf 5
(005) ldh [14]
(006) and #0xfff
(007) jeq #0x3 jt 8 jf 9
(008) ret #96
(009) ret #0
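
The intent of this workaround, testing the outer tag against several VLAN IDs at the same offset, can be sketched in Python as follows. The function names are hypothetical and the frames are synthetic. Note that in the dump above the comparison with 3 is reached even when the tag type at offset 12 is not 0x8100; the sketch below checks the tag type for every candidate ID, which is an assumption about what was intended.

```python
import struct

def outer_vlan_id(frame):
    """Return the outer 802.1Q VLAN ID, or None if the frame is untagged."""
    (tpid,) = struct.unpack_from("!H", frame, 12)
    if tpid != 0x8100:
        return None
    (tci,) = struct.unpack_from("!H", frame, 14)
    return tci & 0x0FFF  # drop the priority and DEI bits

def matches_any_vlan(frame, wanted):
    """True if the outer tag carries one of the wanted VLAN IDs."""
    return outer_vlan_id(frame) in wanted

# Synthetic frames: 12 zeroed MAC bytes, then tag and/or EtherType, then padding.
vlan3 = bytes(12) + struct.pack("!HHH", 0x8100, (5 << 13) | 3, 0x0800) + bytes(46)
vlan7 = bytes(12) + struct.pack("!HHH", 0x8100, 7, 0x0800) + bytes(46)
untagged = bytes(12) + struct.pack("!H", 0x0800) + bytes(46)

print(matches_any_vlan(vlan3, {2, 3}))     # True
print(matches_any_vlan(vlan7, {2, 3}))     # False
print(matches_any_vlan(untagged, {2, 3}))  # False
```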

@spellr

spellr commented Oct 11, 2013

From the code (gencode.c:7857):

[Image: screenshot of the referenced comment in gencode.c]

This comment, committed by @guyharris in 2005, explains the issue very well. yacc parses the BPF filter from left to right without saving state and doesn't provide a tree of any kind, which would allow an easy solution. @guyharris says that OR'ing VLANs is impossible with the current parsing methodology.

But there might be a solution: earlier versions of GCC used yacc to parse C code, so state can be saved. We simply want yacc to track parentheses while incrementing the offset and, at each 'or' it encounters, to reset the offset to its last saved state. Let me explain:

tcpdump -d 'vlan and (vlan or arp) or ip'
means:

  1. filter vlan with the current offset (0) and increment offset ( = 4)
  2. open parenthesis. push the offset onto a stack
  3. filter vlan with the current offset (4) and increment offset ( = 8)
  4. or. reset the offset to its state at the last open parenthesis, taken from the offset stack ( = 4)
  5. filter arp with the current offset (4)
  6. close parenthesis. pop the offset's state
  7. or. reset the offset to its state at the enclosing level, taken from the offset stack ( = 0)
  8. filter ip with the current offset (0)

As it seems to me, this will solve the issue, and would allow OR'ing vlans.
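
The steps above can be sketched in Python as a small token walk. This is only an illustration of the offset-stack idea, not libpcap code: the tokenizer is a plain split on spaces, the name trace_offsets is hypothetical, the bottom stack entry (0) stands for the top level, and a closing parenthesis just discards the saved offset as in step 6. The inner vlan is tested at the offset that is current at that point, which is 4.

```python
def trace_offsets(tokens):
    """Return (primitive, offset) pairs for a flat token list."""
    offset = 0
    saved = [0]  # offset at the start of the top-level expression
    trace = []
    for tok in tokens:
        if tok == "(":
            saved.append(offset)   # remember the offset for this group
        elif tok == ")":
            saved.pop()            # leave the group
        elif tok == "or":
            offset = saved[-1]     # alternatives restart at the saved offset
        elif tok == "and":
            continue               # 'and' keeps the running offset
        else:
            trace.append((tok, offset))
            if tok == "vlan":
                offset += 4        # each vlan test shifts later tests by 4

    return trace

tokens = "vlan and ( vlan or arp ) or ip".split()
print(trace_offsets(tokens))
# [('vlan', 0), ('vlan', 4), ('arp', 4), ('ip', 0)]
```

What the offset should be after a closing parenthesis that is followed by 'and' is left open by the steps above, and the sketch does not try to settle it.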

What do you say?

@mcr
Member

mcr commented Oct 12, 2013

------- Blind-Carbon-Copy

From: Michael Richardson mcr@sandelman.ca
To: tcpdump-workers@lists.tcpdump.org
Subject: Re: [libpcap] OR'ing vlans impossible in tcpdump filter (#158)

Please take this discussion to the tcpdump-workers list.

------- End of Blind-Carbon-Copy

@spellr

spellr commented Oct 14, 2013

I did; no one answered.
What am I to make of this?

I guess, as expected, that not many people are interested in OR'ing VLANs. It's a pretty rare use case.

@infrastation
Member

Some things require time to be done well.

@spellr

spellr commented Oct 26, 2013

And when time has passed and still no answer?..

@spp2

spp2 commented Nov 20, 2013

(Ah! There you are, my dear little issue #158!
Why did you leave SourceForge? We have been looking for you for such a long time!
I thought you were buried deep underground!
Many thanks to the one who brought you back into the sunlight!
Hope you will enjoy a new life in your new location! ;-) )

Firstly, I hope that everyone agrees that the compiled BPF doesn't match the presumed semantics of an "OR", even though I understand the yacc constraints.

Am I really the only one to be interested in those cases?
I see mainly two usages where filtering of several vlans of a trunk is of any use:

  • You could have a trunk with several vlans from you to your network provider,
    where each vlan is going from your firewall to a different partner, and would like to troubleshoot traffic by comparing two similar vlans;
  • Or you have a trunk between a distribution switch and an access switch,
    and you want to group monitoring of similar vlans or troubleshoot some of them at the same time.

Such VLAN filters are used every day with Network Instruments' Observer,
and it would be great to have the same functionality in other tools like
Wireshark, ntop, argus (Qosient), Security Onion, ...

Regards,

Stephane Peters.

@infrastation
Member

Is the point that a custom parser could handle this better than a flex+yacc parser?

@vbrozik

vbrozik commented Feb 18, 2019

I am wondering if a simple solution was considered.
If the parser is not able to correctly cope with hierarchical use of expressions with side-effects then I think that the side effect (the shift by 4 bytes) should be invoked explicitly and separately.

What about separating the two functions of the vlan primitive into two primitives:
vlanshift - only shift the rest of the expression by 4 bytes (or check the presence of the VLAN tag and shift)
vlanid - only check the presence of the VLAN tag and the VLAN ID value without any shifting

I think this approach would be much clearer. The current one is a little obscure and unintuitive.
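
A rough Python sketch of how the two proposed primitives could behave, under the assumption that vlanid only records a test at the current offsets and vlanshift only moves them. Neither primitive exists in libpcap; the class FilterBuilder and its tuple layout (tag type offset, VLAN ID offset, VLAN ID) are invented for this illustration.

```python
class FilterBuilder:
    """Collects VLAN tests; offsets move only when vlanshift() is called."""

    def __init__(self):
        self.shift = 0    # extra bytes added by earlier vlanshift() calls
        self.checks = []  # (tag type offset, VLAN ID offset, VLAN ID)

    def vlanid(self, vid):
        # Test the tag at the current level without any side effect.
        self.checks.append((12 + self.shift, 14 + self.shift, vid))
        return self

    def vlanshift(self):
        # Explicitly move all later tests past one 4-byte VLAN tag.
        self.shift += 4
        return self

# 'vlanid 2 or vlanid 3': both tests look at the same tag.
print(FilterBuilder().vlanid(2).vlanid(3).checks)
# [(12, 14, 2), (12, 14, 3)]

# 'vlanid 2 and vlanshift and vlanid 3': VLAN 3 inside VLAN 2.
print(FilterBuilder().vlanid(2).vlanshift().vlanid(3).checks)
# [(12, 14, 2), (16, 18, 3)]
```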

@guyharris
Member Author

> Is the point that a custom parser could handle this better than a flex+yacc parser?

My point is that a Flex+{Bison,BYACC} parser that generates a protocol tree in pass 1 and generates code from the protocol tree in pass 2 might handle this better, so that, for example,

(vlan and ip) or (vlan and ipx)

would mean "VLAN-encapsulated IPv4 or VLAN-encapsulated IPX" rather than "VLAN-encapsulated IPv4 or doubly-VLAN-encapsulated IPX".

It might also be possible to do so without having a two-pass parser (not counting the optimizer passes).

The question is whether doing that would change the meaning of filters for which the current semantics are useful and intended. If not, we could probably get away with that.
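
As a sketch of the tree-based idea, the following Python walks an already-built expression tree and assigns an offset to each primitive, starting both branches of an "or" from the same offset and threading the offset left to right through "and". The tuple-based tree, the function name resolve, and the rule for the offset after an "or" (use the common value if both branches agree, otherwise fall back to the starting offset) are assumptions made for this illustration, not libpcap behavior.

```python
def resolve(node, offset=0):
    """Return (placements, offset_after) for a parsed filter tree."""
    if isinstance(node, str):
        # Leaf primitive: only 'vlan' moves later tests by 4 bytes.
        after = offset + 4 if node == "vlan" else offset
        return [(node, offset)], after
    op, left, right = node
    left_out, left_after = resolve(left, offset)
    if op == "and":
        # The right side sees whatever shift the left side produced.
        right_out, after = resolve(right, left_after)
    else:
        # "or": both alternatives start from the same offset.
        right_out, right_after = resolve(right, offset)
        after = left_after if left_after == right_after else offset
    return left_out + right_out, after

tree = ("or", ("and", "vlan", "ip"), ("and", "vlan", "ipx"))
placements, _ = resolve(tree)
print(placements)
# [('vlan', 0), ('ip', 4), ('vlan', 0), ('ipx', 4)]
```

With the current left-to-right scheme the second vlan would be tested at offset 4 and ipx at offset 8; here both branches test a single level of encapsulation.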

@infrastation
Member

If vlan (2 or 3) was a valid pcap-filter syntax, it could be a possible solution to this particular problem.
