Revises network.* to cover more use cases #81
As in #79 you introduce an additional level to protocol, what about using |
@ruflin I'd prefer having them separate and delineate between application protocol and network protocol. The example we used in slack was TCP, UDP, and HTTPS DNS lookups. |
I totally agree we need two distinct fields for the transport vs application protocols. I would like to group them, however. Instead of this:
network.application.protocol: DNS
network.protocol: UDP
I would prefer this:
network.protocol.application: DNS
network.protocol.transport: UDP
The former looks like what Suricata is doing. Plus, if we ever want to support protocols at other levels (e.g. ethernet/wifi/token ring, or even "userland" protocols built on top of the application layer), having a protocol object with sub-fields will easily support this future growth. |
@robgil @webmat I agree with your points, but I'm concerned that the proposed names may confuse users due to mixing layers, and they could be shortened. Could we instead have |
+1 for network.protocol.application and network.protocol.transport. It is easy to see what is related to what layer. |
@webmat @robgil Can you share a bit on how the queries and aggregation will look on these fields? One of the aggregations I had in mind for the current pattern was "aggregation to show me all protocols" which are used. This changes to show me either application or transport protocols used. In general I'm on board with the change but would like to understand a bit better the querying / aggregation / filtering pattern that will be used or also what queries were not possible with the current implementation? |
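To make the aggregation question concrete, here is a minimal Python sketch of what "show me all protocols" could look like once transport and application protocols live in separate fields. The field names are the ones proposed in this thread (not final), the events are made up, and `count_by` is a hypothetical local stand-in for what a terms aggregation would return. `terms_agg` only builds a request body in the shape of an Elasticsearch terms aggregation; nothing is sent anywhere.

```python
from collections import Counter

# Made-up sample events using the field names proposed in this thread.
EVENTS = [
    {"network.protocol.transport": "udp", "network.protocol.application": "dns"},
    {"network.protocol.transport": "tcp", "network.protocol.application": "dns"},
    {"network.protocol.transport": "tcp", "network.protocol.application": "http"},
]


def terms_agg(field, size=10):
    """Build a terms-aggregation request body for a single field."""
    return {"size": 0, "aggs": {"values": {"terms": {"field": field, "size": size}}}}


def count_by(events, field):
    """Local stand-in for the aggregation: count events per field value."""
    return Counter(event[field] for event in events if field in event)


# One aggregation per layer instead of one aggregation over a mixed field.
transport_counts = count_by(EVENTS, "network.protocol.transport")
application_counts = count_by(EVENTS, "network.protocol.application")
```

With a single mixed field, "all protocols" is one aggregation but DNS over TCP and DNS over UDP cannot be told apart. With two fields it takes two aggregations, and filtering on "dns" and "tcp" together becomes a plain conjunction of two term filters.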
I like the shortened names suggested by @MikePaquette. And sorry for mixing up layers :-) Actually this mixing up made me wonder if
Perhaps we could follow this nomenclature?
network.internet: ipv4 or icmp
network.transport: tcp
network.application: http
And if it comes up as necessary/desirable, |
@webmat to make it more complicated the "Internet Layer" is actually Layer 3 in the OSI model which is called the "Network Layer". So it would actually be
|
382da23 to 8324db5
Updated and expands/replaces |
This PR will also need a changelog entry.
schemas/network.yml
Outdated
@@ -16,11 +16,20 @@
  type: keyword
  description: >
    Name given by operators to sections of their network.
-
- - name: protocol
+ - name: network
This would be network.network. Not sure if that makes sense. Any alternatives we could use?
Yeah, I'm not a huge fan, but it aligns with the OSI model which is a standard with regard to naming these things. Does anyone else have any thoughts? @webmat @MikePaquette ?
I think I'd rather stick with network.internet, which is also weird, but at least is not a repetition like network.network...
Or perhaps network.layer3? Not a super fan of that either because it's inconsistent vs using the actual names for all the other layers. But it's worth mentioning the option.
Why not use the network.protocol.* naming, which is not confusing?
Putting OSI aside: what will the common user look for naming-wise? What about the suggestion from @vbohata?
What is still not fully clear to me is why we can't mix them all in one field. Are there aggregations we can't do then?
One more option: what about network.layer: layer_name together with network.protocol?
Hi @dainperkins! ECS is meant to document a common event schema that would help make different event/log sources look more similar to one another. This will make it easier to find the basics in any event stream. The use cases we are mostly focusing on at the moment are operational monitoring as well as security.
By its nature, though, ECS will never encompass all use cases for all sources. Any given stream is likely to have fields that don't fit in ECS. You can use your custom fields around the documented ECS fields.
If you see existing fields that aren't quite defined correctly to fit some needs, or if you see generic fields that are missing, it's totally fair to open the discussion around it, though. So feel free to open issues or PRs with specific suggestions.
from an NSM perspective where we might have knowledge of more than one layer of protocol in a given event, this is fantastic:
network.protocol (IP / GRE / EIGRP, etc)
network.transport (tcp/udp)
network.application (http/telnet/ssh/smtp/tls)
network.application.id (L7 id - twitter, facebook, etc)
Namely, HTTP is absolutely an OSI layer 7 protocol, but it adds value if we have signatures or behavior models that are picking up the next layer application model of twitter, facebook, etc.
Thanks Derek!
Note that we can't have both network.application and network.application.id at the same time in the field mapping. The following would work, though: network.application and network.application_id.
I like @dainperkins's suggestion of adding network.application_id, but for now I think we should focus on finding sensible names for each network layer. We can add _id later. I'll try to close on this sooner rather than later. Based on the conversations here I would be inclined to go with:
network.protocol: e.g. IP, ICMP, GRE, EIGRP
network.transport: TCP, UDP
network.application: HTTP, HTTPS, SSH, TLS, SMTP
Let me know if there's any big disagreement on this.
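The point about network.application and network.application.id not coexisting can be illustrated with a small Python sketch. It assumes an Elasticsearch-style mapping in which dotted field names expand into nested objects, so a name cannot be both a leaf (e.g. a keyword) and an object with sub-fields. The `nest` helper is hypothetical and is a simplified model of that rule, not the actual mapping code.

```python
def nest(fields):
    """Expand dotted field names into nested dicts, as an object mapping would.

    Raises ValueError when a name is needed both as a leaf and as an object.
    """
    root = {}
    for name, field_type in fields.items():
        node = root
        parts = name.split(".")
        for part in parts[:-1]:
            child = node.setdefault(part, {})
            if not isinstance(child, dict):
                raise ValueError(f"{part!r} is already a leaf; cannot hold {name!r}")
            node = child
        leaf = parts[-1]
        if isinstance(node.get(leaf), dict):
            raise ValueError(f"{name!r} is already an object; cannot be a leaf")
        node[leaf] = field_type
    return root


# Works: two sibling leaves under "network".
ok = nest({"network.application": "keyword", "network.application_id": "keyword"})

# Fails: "application" would have to be a keyword and an object at once.
try:
    nest({"network.application": "keyword", "network.application.id": "keyword"})
    conflict = None
except ValueError as err:
    conflict = str(err)
```

The underscore variant keeps both names as plain leaves, which is why network.application_id is the workable spelling here.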
Pro: shorter. Con: not self-describing enough.
Example: a user wants to filter all events related to the program "named" and the "UDP" protocol. Without a deeper look at the logged data, it seems that network.application will be "named" and network.protocol will be "udp". With the longer variant, everyone immediately sees that "network.protocol.transport" is the correct field and that network.protocol.application will probably not contain a program name ...
My concern is that network.protocol in network sensor applications (Bro, Suricata, etc) is almost always the transport layer protocol (in the OSI sense). ICMP, GRE, and EIGRP, and of course TCP and UDP, are all IP-encapsulated protocols, in that they all require a valid IP header first. ARP is probably one of the common protocols that you might log that sits at the network layer.
All that said, there are certainly times where one wouldn't be using IP as the network protocol, but in general it's at least a 95% solution. I'm not inclined to accrue the cost of adding a field to simply mark all my log entries as IP. And if I do that, I have to rename all the network.protocol fields to network.protocol.transport.
I will adopt the network.application and network.application_id annotation, however. Thanks for that suggestion @webmat
*Edited for clarity
schemas/network.yml
Outdated
- name: network
  type: keyword
  description: >
    OSI Layer 3 Network Layer. Examples - IP, ICMP
Could you add a bit more detail to all descriptions? For example, add/describe a use case from which you pick the fields.
How about this (more accurate, possibly more annoying)
- network.protocol.version: 4/6 (ipv4, ipv6)
- network.protocol.number: 1/6/17
- network.protocol.name: ICMP, TCP, UDP, GRE, EIGRP, etc
- network.transport.port: 80/443/25/5061, etc. (could be network.protocol.port, but it seemed like a better option to split into transport for tcp/udp details)
- network.transport.protocol: http/sip/tls (may or may not be derived from port, e.g. you can run TLS over 8443...)
- network.application.id: Twitter, Gmail, Facebook, Sharepoint, Outlook (e.g. open slot for vendor specific IDs, e.g. Cisco NBAR, Palo AppID or whatever they call it)
/d
@dainperkins the issue here is that at the transport layer, there's always two ends of the connection, so there are two ports. If you're not logging the network connection itself, but rather just application data, you may not know the port(s) of the other end of the connections, however. Also, I think |
Updated proposal for new
We met internally to try to cover as many use cases as possible and filter/collate the comments and suggestions above. We came up with the following proposal.
network.type
Describes essentially the
network.iana_number
This is the most helpful aspect of this schema proposal. Aligning to the IANA Protocol Numbers lets us cover a great many more use cases. This is also standardized and well understood. There are situations where you need to classify specialized traffic such as IPIP, and this provides a way to do that. Other use cases such as DNS (which can be either UDP, TCP, or now TCP/HTTPS) are strong reasons to cover every layer of the OSI model.
network.transport
network.application_protocol
This brings us up to L7 protocols. It would be nice to refer to a public standard list, but obviously there are cases such as DNS over HTTPS and others. The application protocols can typically be picked up based on the known protocols in Wireshark or other full packet inspection tools.
network.application
Again at L7, but this time specific to the vendor. So for example, if a capture device captures HTTP traffic, it can potentially determine which vendor it's in relation to (e.g. Facebook, Twitter, LinkedIn requests). There was some debate as to whether this is a generic field or very specific to security, but this field could also be used to map out microservices. For example, if I have an auth service, an API service, and an upload service, I could map each of these to its own
Fields to be removed
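As a rough illustration of how network.iana_number and network.transport from the proposal above could relate, here is a minimal Python sketch. The table is a small hand-picked subset of the IANA protocol numbers registry; the lowercase names and the `transport_fields` helper are assumptions made for this example, not part of the PR.

```python
# Small subset of the IANA "Assigned Internet Protocol Numbers" registry.
# Lowercasing the names is a choice made for this sketch.
IANA_PROTOCOLS = {
    1: "icmp",
    6: "tcp",
    17: "udp",
    47: "gre",
    88: "eigrp",
}


def transport_fields(iana_number):
    """Return the proposed network.* fields for one protocol number.

    Unknown numbers keep network.iana_number and simply omit the name.
    """
    fields = {"network.iana_number": iana_number}
    name = IANA_PROTOCOLS.get(iana_number)
    if name is not None:
        fields["network.transport"] = name
    return fields


tcp = transport_fields(6)
unknown = transport_fields(253)  # 253 is reserved for experimentation
```

Keeping the number even when no name is known is what lets specialized traffic still be classified, while the keyword stays convenient for filtering on the common cases.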
Some feedback from our usage. In our company we are using the following field names, which for now cover all of our use cases (OS logs, application logs, network device logs, ...): For network.type the logical values are also WAN, LAN ... (which is a type of network, not a type of network/internet layer). |
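Ok, I like how network.type works around weird naming issues :-D
So the nitty gritty of the proposal would be the following, correct?
Field | Type | Level
network.type | keyword | core
network.iana_number | long | extended
network.transport | keyword | core
network.application_protocol | keyword | core
network.application | keyword | extended
And network.protocol would be removed.
What if instead of network.application_protocol we kept network.protocol for the L7 protocol? It wouldn't have a similar name to its sibling network.application, but on the upside the short and most canonical field name would be the home of the highest level protocol.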
@vbohata The example you're giving is less about protocol details, and more about free form field about your network topology. I maybe could see the need for this, but I wonder how it would work in practice. My understanding is: it's not a value you would determine based on each event's details, but rather a blanket field value you would put in place for every log coming out of a device meant to service the "LAN", for example, correct? In any case, I would like it if you opened an issue about this. I would prefer if we kept this PR strictly about how we describe the protocol stack details of each event. |
OK, it was just an example "from end-users point of view". |
@vbohata Just making sure we're on the same page: I do see the value in your suggestion. All I'm saying is that this PR is more about mapping the network layer details (think OSI) to ECS. Your proposition is more about mapping a user-centric view of the network on ECS. Also valuable, but we should work on one chunk at a time :-) |
FWIW - seems like the latest round is mixing up layers...
1) type does sound more like a physical layer thing (1gbt, wireless, etc)
2) iana_number seems excessively specific, and network.protocol is going to mean something to a lot of people (and a lot of logs...), e.g. the IP protocol (tcp/udp/icmp, etc), and will typically be the IANA decimal (netflow, firewall logs, sensors, etc.)
I could live with type for IP4, IP6, IPX, but network.protocol should stay. Then I get a little ambivalent on where to split the layers, e.g. TCP, HTTP:
network = IP4, IP6, IPX
network.protocol = IANA protocol number (1/6/17)
network.protocol.name = IANA protocol name (TCP)
network.transport = IANA protocol name (TCP)
network.transport.port = TCP service port (80)
network.transport.protocol = well-known services (HTTP)
network.application.protocol = HTTP (well-known services)
network.application.type = ?? WEB (DPI / service visibility required)
network.application.name = vendor / app specific (e.g. SharePoint vs WEB; DPI or service knowledge required)
Would it make more sense to potentially use an application space for network.application fields (à la source / destination potentially being mixed in with network for a full event)? Realistically, beyond the transport.protocol (tls, http, smtp) and port, we're not really talking about network anymore: DNS over TLS, SharePoint, git, Twitter, etc.
…On Mon, Oct 29, 2018 at 3:12 PM Mathieu Martin ***@***.***> wrote:
Ok, I like how network.type works around weird naming issues :-D
So the nitty gritty of the proposal would be the following, correct?
Field Type Level
network.type keyword core
network.iana_number long extended
network.transport keyword core
network.application_protocol keyword core
network.application keyword extended
And network.protocol would be removed.
What if instead of network.application_protocol we kept network.protocol
for the L7 protocol? It wouldn't have a similar name to its sibling
network.application, but on the upside the short and most canonical field
name would be the home of the highest level protocol.
@dainperkins to add some more context to some of your points (which are all valid). Keep in mind, myself and others look at this from a network engineer perspective but also from a generic naming convention that can map many types of data. Obviously it's challenging to map all security devices and network devices onto one common schema, but the goal here is to map the most common fields that are consistent between devices for the top level fields.
@dainperkins @vbohata what do you think about providing some sample log strings to map? The ones that are top of mind for me are Netflow, sFlow, AWS VPC FlowLogs, and GCP FlowLogs. We obviously need some vendor examples (Cisco ASA, Palo Alto, Fortinet, etc). We'd need IDS (snort, etc) and other TAP related device log samples as well. If we can map 90% of the common fields, I think we'll be in a good spot for v1.
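As one hedged example of mapping a sample log string, the Python sketch below parses a record in the default (version 2) AWS VPC Flow Logs field order and maps it to the fields discussed in this thread. The sample line is made up, `map_vpc_flow` is a hypothetical helper, and the `network.type` values and the `source.ip` / `destination.ip` names are assumptions for illustration. Records with no data (fields logged as "-") are not handled.

```python
# Field order of the default (version 2) VPC flow log record.
VPC_FIELDS = (
    "version account_id interface_id srcaddr dstaddr srcport dstport "
    "protocol packets bytes start end action log_status"
).split()

# Tiny subset of IANA protocol numbers, enough for this sample.
TRANSPORT_NAMES = {1: "icmp", 6: "tcp", 17: "udp"}


def map_vpc_flow(line):
    """Map one space-separated flow record to the proposed field names."""
    raw = dict(zip(VPC_FIELDS, line.split()))
    number = int(raw["protocol"])
    event = {
        "source.ip": raw["srcaddr"],
        "destination.ip": raw["dstaddr"],
        "source.port": int(raw["srcport"]),
        "destination.port": int(raw["dstport"]),
        # Assumed values: derive the address family from the address text.
        "network.type": "ipv6" if ":" in raw["srcaddr"] else "ipv4",
        "network.iana_number": number,
    }
    if number in TRANSPORT_NAMES:
        event["network.transport"] = TRANSPORT_NAMES[number]
    return event


# Made-up sample record (SSH connection over TCP).
SAMPLE = (
    "2 123456789010 eni-0123456789abcdef0 172.31.16.139 172.31.16.21 "
    "20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK"
)
event = map_vpc_flow(SAMPLE)
```

A flow record like this carries no application-layer information, so network.application_protocol is left unset rather than guessed from the port.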
I should be able to provide netflow, bro, asa, Palo pretty easily. I think that's a good way to look at it...
VPC is pretty much useless but I can add those in easily as well
/d
…On Mon, Oct 29, 2018 at 4:56 PM robgil ***@***.***> wrote:
@dainperkins to add some more context to some of your points (which are all valid). Keep in mind, myself and others look at this from a network engineer perspective but also from a generic naming convention that can map many types of data. Obviously it's challenging to map all security devices and network devices onto one common schema, but the goal here is to map the most common fields that are consistent between devices for the top level fields.
- > type does sound more like a physical layer thing (1gbt, wireless, etc)
  In this particular field the alternative would have been network.network for the Network Layer in the OSI model. This is obviously ugly. network.type came out of discussions around what sFlow and NetFlow did. So in this case we used sFlow as a reference <https://sflow.org/SFLOW-DATAGRAM5.txt> with regard to sFlow's address_type. We just shortened it to type. @vbohata this should help with your question also. LAN/WAN could be mapped in part by network.direction.
- > iana_number seems excessively specific & network.protocol is going to mean something to a lot of people (and a lot of logs...) - e.g. the ip protocol (tcp/udp/icmp, etc) and will typically be the IANA decimal. (netflow, firewall logs, sensors, etc.)
  Yes, this is very specific, however it is also a standard reference to pull from. Add to that, it is also the field returned from NetFlow. Per @webmat, this would be an *extended* field and network.transport would be the keyword name from the IANA Protocols standard. Think of this more as enumeration over a free-form field.
- > then I get a little ambivalent on where to split the layers, e.g. TCP, HTTP
  Those are two different layers in the OSI model. That's what I've been trying to delineate in this PR from the start. The prime example of this is DNS. DNS can be UDP and TCP, but also TLS and HTTPS based.
- > network.transport.port
  Ports are all moved to source.port and destination.port. I could see this being a service listening on a specific port however, but we have not identified a way to represent listening ports yet. This should be opened as a new Issue or PR to track things like listening ports with regards to inventory type use cases.
- > realistically beyond the transport.protocol (tls, http, smtp) and port, we're not really talking about network anymore - dns over TLS, sharepoint, git, twitter, etc..
  Agreed on this to the extent it goes beyond wire formats. I still think wire formats should be captured in network.*. For example, http, memcache (binary), transport protocol, lumberjack, mysql (binary), etc. I think application protocol decode is pretty common in both network probes and devices that do full packet inspection.
@dainperkins @vbohata what do you think about providing some sample log strings to map? The ones that are top of mind for me are Netflow, sFlow, AWS VPC FlowLogs, and GCP FlowLogs. We obviously need some vendor examples (Cisco ASA, Palo Alto, Fortinet, etc). We'd need IDS (snort, etc) and other TAP related device log samples as well. If we can map 90% of the *common* fields, I think we'll be in a good spot for v1.
I probably screwed a bunch of stuff up, but here are some examples to chew on (and some thoughts on some other things that might be good to add):
- VPC flow (straight from AWS definitions)
- unidirectional NetFlow 9 (Ubiquiti)
- bidirectional NetFlow 9 (QRadar from Meraki)
- Palo traffic log from definitions
- Firepower from logs (grok + kv to split)
/d
@dainperkins Haha yeah, email to GitHub doesn't support attachments, it seems :-) For this you need to attach directly from the GitHub web form. Thanks a lot for sending these sample logs over. It will be very helpful. |
doh!
Excel file attached. It's a mess (other than VPC flow, which is pretty basic). I think starting with a common subset would probably be the most effective way to go, then deciding which other additions are required...
8324db5 to 9c10f5e
Overall LGTM and simple to understand.
The only thing I want to make sure is that moving forward we will not need application, transport or protocol as an object.
Beautiful! Thanks a lot @robgil for submitting this updated PR.
I have one small nitpick: Could you tweak the examples to use a lowercase "v" in "IPv4" and "IPv6", please? :-)
Once this is in, I'm ok to merge as is.
Actually, tweaking the example is not a blocker. Merge whenever you're ready. We can fix the example afterwards.
* Make the capitalization of `IPv` consistent across examples
* Tweak the wording of the examples a bit.
* Add changelog
This is to help delineate between a network.protocol like tcp and a network.application.protocol such as http.