Revises network.* to cover more use cases #81

Merged
merged 2 commits into from
Nov 5, 2018

robgil
Contributor

@robgil robgil commented Aug 13, 2018

This is to help delineate between a network.protocol like tcp and a
network.application.protocol such as http.

@ruflin
Contributor

ruflin commented Aug 14, 2018

As in #79 you introduce an additional level to protocol, what about using network.protocol.type: application? I would prefer if all protocols stay in the same field.

@robgil
Contributor Author

robgil commented Aug 14, 2018

@ruflin I'd prefer having them separate and delineate between application protocol and network protocol. The example we used in slack was TCP, UDP, and HTTPS DNS lookups.

@webmat
Contributor

webmat commented Aug 14, 2018

I totally agree we need two distinct fields for the transport vs application protocols. I would like to group them, however.

Instead of this

network.application.protocol: DNS
network.protocol: UDP

I would prefer that

network.protocol.application: DNS
network.protocol.transport: UDP

The former looks like what Suricata is doing (app_proto & proto), but this means the fields are scattered at two ends of the same event, when you're drilling down :-) It's a minor gripe, but I think the grouping makes it easier to work with the data day to day.

Plus if we ever want to support protocols at other levels (e.g. ethernet/wifi/token ring, or even "userland" protocols built on top of application layer), having a protocol object with sub-fields will easily support this future growth.

@MikePaquette
Contributor

@robgil @webmat I agree with your points, but I'm concerned that the proposed names may confuse users due to mixing layers, and they could be shortened. Could we instead have network.protocol: "ipv4" network.transport: "udp" and network.application: "dns" as the fields under network? We already have source.port and destination.port to capture the transport port numbers.

@vbohata

vbohata commented Aug 15, 2018

+1 for network.protocol.application and network.protocol.transport. It is easy to see what is related to what layer.

@ruflin
Contributor

ruflin commented Aug 15, 2018

@webmat @robgil Can you share a bit on how the queries and aggregation will look on these fields? One of the aggregations I had in mind for the current pattern was "aggregation to show me all protocols" which are used. This changes to show me either application or transport protocols used.

In general I'm on board with the change but would like to understand a bit better the querying / aggregation / filtering pattern that will be used or also what queries were not possible with the current implementation?
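To make the question concrete, here is a minimal sketch of how the two patterns might be aggregated. It builds Elasticsearch search bodies as plain Python dictionaries using the standard `terms` aggregation. The helper name `terms_agg_body` and the aggregation names are made up for illustration, and the field names are the ones floated in this thread, not a settled schema.

```python
import json


def terms_agg_body(fields, size=10):
    """Build a search body with one terms aggregation per field.

    `fields` is a list of dotted field names. The aggregation name is
    derived from the field name (a hypothetical naming convention).
    """
    aggs = {}
    for field in fields:
        agg_name = "by_" + field.replace(".", "_")
        aggs[agg_name] = {"terms": {"field": field, "size": size}}
    # "size": 0 suppresses the hits so only the buckets are returned.
    return {"size": 0, "aggs": aggs}


# Single-field pattern: one aggregation over every protocol value.
single = terms_agg_body(["network.protocol"])

# Split pattern: one aggregation per layer.
split = terms_agg_body(
    ["network.protocol.transport", "network.protocol.application"]
)

print(json.dumps(split, indent=2))
```

With a single field, "all protocols in use" is one aggregation over mixed values. With split fields it becomes one aggregation per layer, and each bucket list only contains values from that layer.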

@webmat
Contributor

webmat commented Aug 16, 2018

I like the shortened names suggested by @MikePaquette. And sorry for mixing up layers :-)

Actually this mixing up made me wonder if network.protocol was actually the right naming for the layer where we'd find ipv4/6. I ended up on the trusty Wikipedia, where this layer is actually called the "Internet Layer". So, partially quoting the Internet Layer page:

Application Layer
..., DNS, FTP, HTTP ...
Transport Layer
TCP, UDP, ...
Internet Layer
IP (IPv4 IPv6), ICMP, ...
Link Layer
ARP, ..., MAC (Ethernet, DSL, ...)

Perhaps we could follow this nomenclature?

network.internet: ipv4 or icmp
network.transport: tcp
network.application: http

And if it comes up as necessary/desirable, network.link as well. Personally I think network.internet sounds a bit awkward, but I think it's better than network.protocol because each of these layers is actually composed of "protocols"...

@robgil
Contributor Author

robgil commented Sep 4, 2018

@webmat to make it more complicated the "Internet Layer" is actually Layer 3 in the OSI model which is called the "Network Layer". So it would actually be network.network 😭 . Or perhaps network.type but that's confusing also.

network.network: ip or ipx (L3)
network.transport: tcp or udp (L4)
network.application: http or transport protocol (L7)

@robgil robgil force-pushed the 9506-application-protocol branch from 382da23 to 8324db5 Compare September 4, 2018 20:31
@robgil
Contributor Author

robgil commented Sep 4, 2018

Updated. This expands/replaces network.protocol with network.network, network.transport, and network.application to cover Layers 3, 4 and 7 respectively.

Contributor

@ruflin ruflin left a comment

This PR will also need a changelog entry.

@@ -16,11 +16,20 @@
type: keyword
description: >
Name given by operators to sections of their network.

- name: protocol
- name: network
Contributor

This would be network.network. Not sure if that makes sense. Any alternatives we could use?

Contributor Author

Yeah, I'm not a huge fan, but it aligns with the OSI model which is a standard with regard to naming these things. Does anyone else have any thoughts? @webmat @MikePaquette ?

Contributor

@webmat webmat Sep 5, 2018

I think I'd rather stick with network.internet, which is also weird, but at least is not a repetition like network.network...

Or perhaps network.layer3? Not a super fan of that either because it's inconsistent vs using the actual names for all the other layers. But it's worth mentioning the option.


Why not use the network.protocol.* naming, which is not confusing?

Contributor

Putting OSI aside: What will the common user look for naming wise? What about the suggestion from @vbohata

What is still not fully clear to me is why we can't mix them all in one field. Are there aggregations we can't do then?

One more option: What about network.layer: layer_name together with network.protocol?

Contributor

Hi @dainperkins! ECS is meant to document a common event schema that would help make different event/log sources look more similar to one another. This will make it easier to find the basics in any event stream. The use cases we are mostly focusing on at the moment are operational monitoring and security.

By its nature, though, ECS will never encompass all use cases for all sources. Any given stream is likely to have fields that don't fit in ECS. You can use your custom fields around the documented ECS fields.

If you see existing fields that aren't quite defined correctly to fit some needs, or if you see generic fields that are missing, it's totally fair to open the discussion around it, though. So feel free to open issues or PRs with specific suggestions.

Contributor

From an NSM perspective, where we might have knowledge of more than one layer of protocol in a given event, this is fantastic:

network.protocol (IP / GRE / EIGRP, etc)
network.transport (tcp/udp)
network.application (http/telnet/ssh/smtp/tls)
network.application.id (L7 id - twitter, facebook, etc)

Namely, HTTP is absolutely an OSI layer 7 protocol, but it adds value if we have signatures or behavior models that are picking up the next layer application model of twitter, facebook, etc.

Contributor

@webmat webmat Oct 16, 2018

Thanks Derek!

Note that we can't have both network.application and network.application.id at the same time in the field mapping. The following would work, though: network.application and network.application_id.
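A simplified illustration of that constraint, assuming dotted field names are expanded into nested objects: a name cannot be both a leaf value and a parent of sub-fields. The `nest` helper below is hypothetical and only models the conflict; it is not how Elasticsearch itself implements mappings.

```python
def nest(flat_fields):
    """Expand dotted field names into nested objects.

    Raises ValueError when a name is used both as a leaf value and as
    a parent object, which is the conflict described above.
    """
    root = {}
    for dotted, value in flat_fields.items():
        node = root
        *parents, leaf = dotted.split(".")
        for part in parents:
            child = node.setdefault(part, {})
            if not isinstance(child, dict):
                raise ValueError(f"{dotted!r}: {part!r} is already a leaf field")
            node = child
        if isinstance(node.get(leaf), dict):
            raise ValueError(f"{dotted!r} is already an object with sub-fields")
        node[leaf] = value
    return root


# Works: two sibling leaf fields under `network`.
ok = nest({"network.application": "http", "network.application_id": "twitter"})

# Fails: `application` would have to be a keyword and an object at once.
try:
    nest({"network.application": "http", "network.application.id": "twitter"})
    conflict = None
except ValueError as err:
    conflict = str(err)
```

The underscore variant keeps `application` a plain leaf, which is why `network.application_id` can sit next to `network.application`.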

I like @dainperkins's suggestion of adding network.application_id, but for now I think we should focus on finding sensible names for each network layer. We can add _id later. I'll try to close on this sooner than later. Based on the conversations here I would be inclined to go with:

  • network.protocol: e.g. IP, ICMP, GRE, EIGRP
  • network.transport: TCP, UDP
  • network.application: HTTP, HTTPS, SSH, TLS, SMTP

Let me know if there's any big disagreement on this.


Pro: shorter, con: not self describing enough
Example: A user wants to filter all events related to the program "named" and the "UDP" protocol. Without a deeper look at the logged data, it seems network.application will be "named" and network.protocol will be "udp". In the longer variant everyone immediately sees that "network.protocol.transport" is the correct field and that network.protocol.application will probably not contain a program name ...

Contributor

@dcode dcode Oct 17, 2018

My concern is that network.protocol in network sensor applications (bro, suricata, etc) is almost always the transport layer protocol (in the OSI sense). ICMP, GRE, and EIGRP, and of course TCP and UDP are all IP-encapsulated protocols, in that they all require a valid IP header first. Arp is probably one of the common protocols that you might log that sits at the network layer.

All that said, there are certainly times where one wouldn't be using IP as the network protocol, but in general it's at least a 95% solution. I'm not inclined to accrue the cost of adding a field to simply mark all my log entries as IP. And if I do that, I have to rename all the network.protocol fields to network.protocol.transport.

I will adopt the network.application and network.application_id annotation, however. Thanks for that suggestion @webmat

*Edited for clarity

- name: network
type: keyword
description: >
OSI Layer 3 Network Layer. Examples - IP, ICMP
Contributor

Could you add a bit more detail to all descriptions? For example, add/describe a use case from which you pick the fields.

@webmat webmat mentioned this pull request Sep 18, 2018
26 tasks
@dainperkins
Contributor

dainperkins commented Oct 17, 2018 via email

@dcode
Contributor

dcode commented Oct 17, 2018

  • network.protocol.name: ICMP, TCP, UDP, GRE, EIGRP, etc
  • network.transport.port: 80/443/25/5061, etc . **could be
    network.protocol.port, but seemed like a better option to split into
    transport for tcp/udp details
  • network.transport.protocol: http/sip/tls (may or may not be derived
    from port - e.g. you can run TLS over 8443...)

@dainperkins the issue here is that at the transport layer, there are always two ends of the connection, so there are two ports. If you're not logging the network connection itself, but rather just application data, you may not know the port(s) of the other end of the connections, however. Also, I think network.transport.protocol may be a bit confusing, as "transport" is often used for protocols operating just above the Internet layer (e.g. TCP, UDP, SCTP, etc).

@robgil
Contributor Author

robgil commented Oct 29, 2018

Updated proposal for new network.* fields

We met internally to try to cover as many use cases as possible and filter/collate the comments and suggestions above. We came up with the following proposal.

network.type

Describes essentially the Network Layer of the OSI model. network.type seemed the most logical name over something like network.network to describe this layer of the OSI model.

network.type: ipv4, ipv6, ipx

network.iana_number

This is the most helpful aspect of this schema proposal. Aligning to the IANA Protocol Numbers lets us cover a great many more use cases. This is also standardized and well understood. There are situations where you need to classify specialized traffic such as IPIP, and this provides a way to do that. Other use cases such as DNS (which can be either UDP, TCP, or now TCP/HTTPS) are strong reasons to cover every layer of the OSI model.

network.iana_number: 1/6/17 IANA Protocol Numbers

network.transport

network.transport specifically refers to the Keywords in the IANA Protocol Numbers standard.

network.transport: TCP, UDP, IPv6-ICMP, ICMP 
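As a rough sketch of how network.iana_number and network.transport could relate, the snippet below looks up the registry keyword for a protocol number. The table is a small excerpt of the IANA Protocol Numbers registry, and the helper name `transport_fields` is hypothetical, not part of this PR.

```python
# Small excerpt of the IANA Assigned Internet Protocol Numbers registry.
IANA_PROTOCOL_KEYWORDS = {
    1: "ICMP",
    6: "TCP",
    17: "UDP",
    47: "GRE",
    58: "IPv6-ICMP",
    88: "EIGRP",
}


def transport_fields(iana_number):
    """Return the network.* fields derived from an IANA protocol number.

    network.transport is only set when the number is in the excerpt
    above; unknown numbers still keep network.iana_number.
    """
    fields = {"network.iana_number": iana_number}
    keyword = IANA_PROTOCOL_KEYWORDS.get(iana_number)
    if keyword is not None:
        fields["network.transport"] = keyword
    return fields


tcp = transport_fields(6)
unknown = transport_fields(253)
```

Sources such as NetFlow that only report the number can populate both fields this way, while the numeric field remains available for protocols without a well-known keyword.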

network.application_protocol

This brings us up to L7 protocols. It would be nice to refer to a public standard list, but obviously there are cases such as DNS over HTTPS and others. The application protocols can typically be picked up based on the known protocols in wireshark or other full packet inspection tools.

network.application_protocol: http, dns, smtp, ftp, etc

network.application

Again at L7, but this time specific to the vendor. For example, if a capture device captures HTTP traffic, it can potentially determine which vendor it relates to (e.g. Facebook, Twitter, or LinkedIn requests). There was some debate as to whether this is a generic field or very specific to security, but this field could also be used to map out microservices. For example, if I have an auth service, an api service, and an upload service, I could map each of these to its own network.application value (i.e. my-auth, my-api, my-upload) for the purposes of identifying flows based on application/service.

network.application: skype, icq, aim, dns (dns over https)
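Putting the proposed fields together, a single event might look roughly like the sketch below. The DNS-over-HTTPS values are one reading of the examples above (application protocol `http`, application `dns`); the port numbers, the function names, and the `matches` helper are made up for illustration.

```python
def dns_over_https_event():
    """A made-up event for a DNS-over-HTTPS lookup using the proposed fields."""
    return {
        "network.type": "ipv4",
        "network.iana_number": 6,
        "network.transport": "tcp",
        "network.application_protocol": "http",
        "network.application": "dns",
        "source.port": 51514,      # illustrative ephemeral port
        "destination.port": 443,
    }


def classic_dns_event():
    """A made-up event for a plain DNS lookup over UDP."""
    return {
        "network.type": "ipv4",
        "network.iana_number": 17,
        "network.transport": "udp",
        "network.application_protocol": "dns",
        "source.port": 40000,      # illustrative ephemeral port
        "destination.port": 53,
    }


def matches(event, criteria):
    """True when every field in `criteria` has the given value in `event`."""
    return all(event.get(field) == value for field, value in criteria.items())


events = [dns_over_https_event(), classic_dns_event()]
over_tcp = [e for e in events if matches(e, {"network.transport": "tcp"})]
```

Keeping one field per layer lets a filter target exactly one layer, here the transport, without caring what runs above it.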

Fields to be removed

network.protocol

@vbohata

vbohata commented Oct 29, 2018

Some feedback from our usage. In our company we are using the following field names, which for now cover all of our use cases (OS logs, application logs, network device logs, ...):
companyprefix.application.name (not related to network field, but for demonstration)
companyprefix.network.protocol.application
companyprefix.network.protocol.transport
in future maybe: companyprefix.network.protocol.internet

For network.type the logical values are also WAN, LAN ... (which is type of network, not type of network/internet layer).

@webmat
Contributor

webmat commented Oct 29, 2018

Ok, I like how network.type works around weird naming issues :-D

So the nitty gritty of the proposal would be the following, correct?

| Field | Type | Level |
|---|---|---|
| network.type | keyword | core |
| network.iana_number | long | extended |
| network.transport | keyword | core |
| network.application_protocol | keyword | core |
| network.application | keyword | extended |

And network.protocol would be removed.

What if instead of network.application_protocol we kept network.protocol for the L7 protocol? It wouldn't have a similar name to its sibling network.application, but on the upside the short and most canonical field name would be the home of the highest level protocol.

@webmat
Contributor

webmat commented Oct 29, 2018

@vbohata The example you're giving is less about protocol details, and more about a free-form field describing your network topology. I could maybe see the need for this, but I wonder how it would work in practice. My understanding is: it's not a value you would determine based on each event's details, but rather a blanket field value you would put in place for every log coming out of a device meant to service the "LAN", for example, correct?

In any case, I would like it if you opened an issue about this. I would prefer if we kept this PR strictly about how we describe the protocol stack details of each event.

@vbohata

vbohata commented Oct 29, 2018

OK, it was just an example "from end-users point of view".

@webmat
Contributor

webmat commented Oct 29, 2018

@vbohata Just making sure we're on the same page: I do see the value in your suggestion. All I'm saying is that this PR is more about mapping the network layer details (think OSI) to ECS. Your proposition is more about mapping a user-centric view of the network on ECS. Also valuable, but we should work on one chunk at a time :-)

@dainperkins
Contributor

dainperkins commented Oct 29, 2018 via email

@robgil
Contributor Author

robgil commented Oct 29, 2018

@dainperkins to add some more context to some of your points (which are all valid): keep in mind that I and others look at this from a network engineer's perspective, but also with a view to a generic naming convention that can map many types of data. Obviously it's challenging to map all security devices and network devices onto one common schema, but the goal here is to map the most common fields that are consistent between devices for the top level fields.

  • type does sound more like a physical layer thing (1gbt, wireless, etc)
    In this particular field the alternative would have been network.network for the Network Layer in the OSI model. This is obviously ugly. network.type came out of discussions around what sFlow and NetFlow did. So in this case we used sFlow as a reference with regard to sFlow's address_type. We just shortened it to type. @vbohata this should help with your question also. LAN/WAN could be mapped in part by network.direction.

  • iana_number seems excessively specific & network.protocol is going to mean something to a lot of people (and a lot of logs...) - e.g. the ip protocol (tcp/udp/icmp, etc) and will typically be the IANA decimal. (netflow, firewall logs, sensors, etc.)
    Yes, this is very specific, however it is also a standard reference to pull from. Add to that, it is also the field returned from NetFlow. Per @webmat, this would be an extended field and network.transport would be the keyword name from the IANA Protocols standard. Think of this more as enumeration over a free-form field.

  • then I get a little ambivalent on where to split the layers, e.g. TCP, HTTP
    Those are two different layers in the OSI model. That's what I've been trying to delineate in this PR from the start. The prime example of this is DNS. DNS can be UDP and TCP, but also TLS and HTTPS based.

  • network.transport.port
    Ports are all moved to source.port and destination.port. I could see this being a service listening on a specific port however, but we have not identified a way to represent listening ports yet. This should be opened as a new Issue or PR to track things like listening ports with regards to inventory type use cases.

  • realistically beyond the transport.protocol (tls, http, smtp) and port, we're not really talking about network anymore - dns over TLS, sharepoint, git, twitter, etc..
    Agreed on this to the extent it goes beyond wire formats. I still think wire formats should be captured in network.*. For example, http, memcache (binary), transport protocol, lumberjack, mysql (binary), etc. I think application protocol decode is pretty common in both network probes and devices that do full packet inspection.

@dainperkins @vbohata what do you think about providing some sample log strings to map? The ones that are top of mind for me are Netflow, sFlow, AWS VPC FlowLogs, and GCP FlowLogs. We obviously need some vendor examples (Cisco ASA, Palo Alto, Fortinet, etc). We'd need IDS (snort, etc) and other TAP related device log samples as well. If we can map 90% of the common fields, I think we'll be in a good spot for v1.

@dainperkins
Contributor

dainperkins commented Oct 29, 2018 via email

@dainperkins
Contributor

dainperkins commented Oct 30, 2018 via email

@webmat
Contributor

webmat commented Oct 30, 2018

@dainperkins Haha yeah, email to GitHub doesn't support attachments, it seems :-) For this you need to attach directly from the GitHub web form.

Thanks a lot for sending these sample logs over. It will be very helpful.

@dainperkins
Contributor

dainperkins commented Oct 30, 2018 via email

@dainperkins
Contributor

Excel file attached. It's a mess (other than VPC flow, which is pretty basic). I think starting with a common subset would probably be the most effective way to go, then deciding which other additions are required...

ecs network examples.xlsx

@robgil robgil force-pushed the 9506-application-protocol branch from 8324db5 to 9c10f5e Compare November 2, 2018 20:43
@robgil robgil changed the title from "Adds network.application.protocol" to "Revises network.* to cover more use cases" Nov 2, 2018
Contributor

@ruflin ruflin left a comment

Overall LGTM and simple to understand.

The only thing I want to make sure of is that moving forward we will not need application, transport or protocol as an object.

Contributor

@webmat webmat left a comment

Beautiful! Thanks a lot @robgil for submitting this updated PR.

I have one small nitpick: Could you tweak the examples to use a lowercase "v" in "IPv4" and "IPv6", please? :-)

Once this is in, I'm ok to merge as is.

Contributor

@webmat webmat left a comment

Actually, tweaking the example is not a blocker. Merge whenever you're ready. We can fix the example afterwards.

@webmat webmat merged commit 1accdf5 into elastic:master Nov 5, 2018
webmat added a commit that referenced this pull request Nov 6, 2018
* Make the capitalization of `IPv` consistent across examples
* Tweak the wording of the examples a bit.
* Add changelog
7 participants