[Filebeat][httpjson] httpjson chain calls #29816

Merged on Mar 29, 2022 (33 commits; this view shows changes from 19 commits)
07e006a
[filebeat][httpjson] Extend filebeat httpjson input for chained calls
kush-elastic Dec 8, 2021
d062e26
Updates based on comments
kush-elastic Dec 15, 2021
664d95b
Update on httpjson and nits
kush-elastic Dec 17, 2021
e288c0d
Resolve comments from Tiago
kush-elastic Dec 23, 2021
27e71a2
address comments and improved json_parser
kush-elastic Jan 5, 2022
24b1e87
Add entry to CHANGELOG.next.asciidoc
kush-elastic Jan 12, 2022
28daafc
Merge branch 'master' into salesforce_httpjson_chain_calls
kush-elastic Jan 12, 2022
915bc6d
mage update and mage check for ci linting
kush-elastic Jan 13, 2022
ea115bb
address comments
kush-elastic Jan 20, 2022
1dfb2b0
Addressed comments
kush-elastic Jan 28, 2022
d42d505
remove else as we have continue, close resp.Body in id collection mec…
kush-elastic Feb 1, 2022
3aa2a11
Merge branch 'master' into salesforce_httpjson_chain_calls
kush-elastic Feb 1, 2022
5b9e7fe
nits
kush-elastic Feb 2, 2022
19694db
Merge branch 'main' into salesforce_httpjson_chain_calls
kush-elastic Feb 22, 2022
6e9170d
mage check
kush-elastic Feb 22, 2022
7386e91
Merge branch 'main' into salesforce_httpjson_chain_calls
kush-elastic Feb 24, 2022
d8e5327
Changes requested from elastic team
kush-elastic Mar 7, 2022
7975a49
Merge branch 'main' into salesforce_httpjson_chain_calls
kush-elastic Mar 8, 2022
df162e9
Merge branch 'main' into salesforce_httpjson_chain_calls
kush-elastic Mar 9, 2022
f43d410
requested changes on logging level
kush-elastic Mar 10, 2022
f59e8a0
requested changes for ids collection
kush-elastic Mar 19, 2022
844bd79
Requested changes on input-httpjson.asciidoc and added tests for chai…
kush-elastic Mar 23, 2022
c64dc1d
Added regular tests with chain
kush-elastic Mar 25, 2022
60e5c14
Merge branch 'main' into salesforce_httpjson_chain_calls
kush-elastic Mar 25, 2022
1534361
Changes based on CI
kush-elastic Mar 25, 2022
d7df3c4
Add New diagram for chain feature and updates based on CI changes
kush-elastic Mar 28, 2022
879b945
CI linter problem
kush-elastic Mar 28, 2022
6563fc6
removed unnecessary ci changes
kush-elastic Mar 29, 2022
c11c048
Merge branch 'main' into salesforce_httpjson_chain_calls
kush-elastic Mar 29, 2022
616415b
golangci eventsCh->events
kush-elastic Mar 29, 2022
d55af74
ignore few CI errors
kush-elastic Mar 29, 2022
a0a5d94
update GET to http.MethodGet
kush-elastic Mar 29, 2022
6739df7
Review changes- GET->http.MethodGet and nits
kush-elastic Mar 29, 2022
1 change: 1 addition & 0 deletions CHANGELOG.next.asciidoc
@@ -106,6 +106,7 @@ https://github.com/elastic/beats/compare/v7.0.0-alpha2...main[Check the HEAD dif
- Add extraction of `related.hosts` to Microsoft 365 Defender ingest pipeline {issue}29859[29859] {pull}29863[29863]
- threatintel module: Add new Recorded Future integration. {pull}30030[30030]
- Add pipeline in FB's supported hints. {pull}30212[30212]
- Add support in httpjson input for chain calls. {pull}29816[29816]

*Auditbeat*

144 changes: 144 additions & 0 deletions x-pack/filebeat/docs/inputs/input-httpjson.asciidoc
@@ -370,6 +370,7 @@ NOTE: Only one of the credentials settings can be set at once. If none is provid
default credentials from the environment will be attempted via ADC. For more information about
how to provide Google credentials, please refer to https://cloud.google.com/docs/authentication.

[[request-parameters]]
[float]
==== `request.url`

@@ -931,6 +932,149 @@ This will output:
]
----

[[chain]]
[float]
=== `chain`
Review comment (Member):
The docs contain a sequence diagram that explains the high level flow. I think the chain feature should be added to the diagram.


Chain is a list of calls to be made after the first call.

chain list option: [`step`].

[float]
==== `chain[].step`

chain[].step will contain basic request and response configurations for chain calls.

[float]
==== `chain[].step.request`

Please refer to <<request-parameters,request parameters>>. In the URL, place the same string given in `replace` at the position where the values collected from the previous call should be substituted. Required.

Example:
Review comment (Contributor):
I'm not sure the example is clear. Is it showing that the value can, or should, be different for every chain step?

Reply (Collaborator Author):
Right, the value can be different, so the user can specify a different call for each step in this way.


first call: https://example.com/services/data/v1.0/

second call: https://example.com/services/data/v1.0/1/export_ids

third call: https://example.com/services/data/v1.0/export_ids/file_1/info

[float]
==== `chain[].step.response.split`

Please refer to <<response-split,response split parameter>>. Required.

[float]
==== `chain[].step.replace`

replace is JSONPath string to parse values from response JSONs, collected from previous chain calls. Please look at the official doc of https://goessner.net/articles/JsonPath/index.html#e2[JSONPath]. Required.
Review comment (Contributor):

Suggested change:
- replace is JSONPath string to parse values from response JSONs, collected from previous chain calls. Please look at the official doc of https://goessner.net/articles/JsonPath/index.html#e2[JSONPath]. Required.
+ A [JSONPath](https://goessner.net/articles/JsonPath/index.html#e2[JSONPath]) string to parse values from responses JSON, collected from previous chain steps. Required.

Is this always required? Can there be situations where a step does not expect a replaceable part?

Reply (Collaborator Author):
No, it's always required.


Example:

- first call: https://example.com/services/data/v1.0/
+
<<response-json1,response>>

- second call: https://example.com/services/data/v1.0/`$.records[:].id`/export_ids
+
<<response-json2,response>>

- third call: https://example.com/services/data/v1.0/export_ids/`$.file_name`/info

["source","yaml",subs="attributes"]
----
filebeat.inputs:
- type: httpjson
  enabled: true
  # first call
  request.url: https://example.com/services/data/v1.0/records
  interval: 1h
  chain:
    # second call
    - step:
        request.url: https://example.com/services/data/v1.0/$.records[:].id/export_ids
        request.method: GET
        replace: $.records[:].id
    # third call
    - step:
        request.url: https://example.com/services/data/v1.0/export_ids/$.file_name/info
        request.method: GET
        replace: $.file_name
----
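
The substitution described by this configuration can be sketched in a few lines of Go. This is only an illustration of the idea, not the code from this PR: `expandURL` is a hypothetical helper, and it assumes the values selected by `replace` have already been collected from the previous response as strings.

```go
package main

import (
	"fmt"
	"strings"
)

// expandURL is a hypothetical helper (not part of the PR). It returns one
// request URL per collected value by substituting the replace string, which
// is embedded verbatim in the step's request.url, with that value.
func expandURL(template, replace string, values []string) []string {
	urls := make([]string, 0, len(values))
	for _, v := range values {
		// Replace only the first occurrence so the rest of the URL is untouched.
		urls = append(urls, strings.Replace(template, replace, v, 1))
	}
	return urls
}

func main() {
	template := "https://example.com/services/data/v1.0/$.records[:].id/export_ids"
	for _, u := range expandURL(template, "$.records[:].id", []string{"1", "2", "3"}) {
		fmt.Println(u)
	}
}
```

Each collected id yields its own second-call URL, which matches the per-id requests listed in the example that follows.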

Example:

- First call to collect record ids

+
request_url: https://example.com/services/data/v1.0/records

+
response_json:

+
[[response-json1]]
["source","json",subs="attributes"]
----
{
    "records": [
        {
            "id": 1
        },
        {
            "id": 2
        },
        {
            "id": 3
        }
    ]
}
----

- Second call to collect `file_name` using collected ids from first call.

+
request_url using id as '1': https://example.com/services/data/v1.0/1/export_ids

+
response_json using id as '1':

+
[[response-json2]]
["source","json",subs="attributes"]
----
{
    "file_name": "file_1"
}
----

+
request_url using id as '2': https://example.com/services/data/v1.0/2/export_ids

+
response_json using id as '2':

+
["source","json",subs="attributes"]
----
{
    "file_name": "file_2"
}
----

- Third call to collect `files` using collected `file_name` from second call.

+
request_url using file_name as 'file_1': https://example.com/services/data/v1.0/export_ids/file_1/info

+
request_url using file_name as 'file_2': https://example.com/services/data/v1.0/export_ids/file_2/info

+
Collect and make events from response in any format supported by httpjson for all calls.
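
The three calls above can be traced end to end with a small Go sketch. It is a simulation under stated assumptions rather than the input's implementation: the `canned` map stands in for the HTTP server and only holds the responses shown above (ids 1 and 2), the helper names are hypothetical, and the JSONPath expressions are hard-coded as struct decoding instead of being evaluated by a JSONPath library.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// canned maps request URLs to the example response bodies (assumption: no
// real HTTP is performed; only ids 1 and 2 are included).
var canned = map[string]string{
	"https://example.com/services/data/v1.0/records":      `{"records":[{"id":1},{"id":2}]}`,
	"https://example.com/services/data/v1.0/1/export_ids": `{"file_name":"file_1"}`,
	"https://example.com/services/data/v1.0/2/export_ids": `{"file_name":"file_2"}`,
}

// recordIDs extracts what `$.records[:].id` selects from the first response.
func recordIDs(body string) ([]string, error) {
	var doc struct {
		Records []struct {
			ID int `json:"id"`
		} `json:"records"`
	}
	if err := json.Unmarshal([]byte(body), &doc); err != nil {
		return nil, err
	}
	ids := make([]string, 0, len(doc.Records))
	for _, r := range doc.Records {
		ids = append(ids, fmt.Sprint(r.ID))
	}
	return ids, nil
}

// fileName extracts what `$.file_name` selects from a second-call response.
func fileName(body string) (string, error) {
	var doc struct {
		FileName string `json:"file_name"`
	}
	err := json.Unmarshal([]byte(body), &doc)
	return doc.FileName, err
}

// thirdCallURLs walks the chain and returns the URLs of the final step.
func thirdCallURLs() ([]string, error) {
	ids, err := recordIDs(canned["https://example.com/services/data/v1.0/records"])
	if err != nil {
		return nil, err
	}
	var urls []string
	for _, id := range ids {
		second := strings.Replace("https://example.com/services/data/v1.0/$.records[:].id/export_ids", "$.records[:].id", id, 1)
		name, err := fileName(canned[second])
		if err != nil {
			return nil, err
		}
		urls = append(urls, strings.Replace("https://example.com/services/data/v1.0/export_ids/$.file_name/info", "$.file_name", name, 1))
	}
	return urls, nil
}

func main() {
	urls, err := thirdCallURLs()
	if err != nil {
		panic(err)
	}
	for _, u := range urls {
		fmt.Println(u)
	}
}
```

Each value collected at one step fans out into one request at the next step, so the number of final requests equals the number of values that survive the earlier steps.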

NOTE: httpjson chain will only create and ingest events from the last call in the chain configuration.
httpjson chain will also

[[cursor]]
[float]
==== `cursor`
74 changes: 74 additions & 0 deletions x-pack/filebeat/input/httpjson/chain.go
@@ -0,0 +1,74 @@
// Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
// or more contributor license agreements. Licensed under the Elastic License;
// you may not use this file except in compliance with the Elastic License.

// Example:
Review comment (Contributor):
It might be good to have these examples in the docs if they differ from the current ones.

Reply (Collaborator Author):
No, they are the same.

// 1. First call to collect record ids
//    request_url: https://some_url.com/services/data/v1.0/records
//    response_json:
//    {
//        "records": [
//            {
//                "id": 1
//            },
//            {
//                "id": 2
//            },
//            {
//                "id": 3
//            }
//        ]
//    }
//
// 2. Second call to collect file name using collected ids from first call.
//    request_url using id as '1': https://some_url.com/services/data/v1.0/1/export_ids
//    response_json using id as '1':
//    {
//        "file_name": "file_1"
//    }
//    request_url using id as '2': https://some_url.com/services/data/v1.0/2/export_ids
//    response_json using id as '2':
//    {
//        "file_name": "file_2"
//    }
//
// 3. Third call to collect files using collected file names from second call.
//    request_url using file_name as 'file_1': https://some_url.com/services/data/v1.0/export_ids/file_1/info
//    request_url using file_name as 'file_2': https://some_url.com/services/data/v1.0/export_ids/file_2/info
//
// Collect and make events from response in any format [csv, json, etc.] for all calls.
//
// Example configuration:
//
// - type: httpjson
//   enabled: true
//   request.url: https://some_url.com/services/data/v1.0/records (first call)
//   interval: 1h
//   chain:
//     - step:
//         request.url: https://some_url.com/services/data/v1.0/$.records[:].id/export_ids (second call)
//         request.method: GET
//         replace: $.records[:].id
//     - step:
//         request.url: https://some_url.com/services/data/v1.0/export_ids/$.file_name/info (third call)
//         request.method: GET
//         replace: $.file_name

package httpjson

// chainConfig for chain request.
// Following contains basic call structure for each call after normal httpjson
// call.
type chainConfig struct {
	Step stepConfig `config:"step" validate:"required"`
}

// stepConfig holds the basic properties of a chain step: request.url,
// request.method and the replace parameter. Each step's request.url
// contains the replace string, so the URL acts as a skeleton for the
// chained request.
type stepConfig struct {
	Request  requestConfig  `config:"request"`
	Response responseConfig `config:"response,omitempty"`
	Replace  string         `config:"replace,omitempty"`
}
}
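
Since the docs describe `replace` as required while the struct tag marks it `omitempty`, a per-step sanity check may be worth considering. The sketch below is an assumption about what such a check could look like, not something this diff contains: `step` is a simplified stand-in for `stepConfig`, and `checkStep` is a hypothetical name.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// step is a simplified stand-in for stepConfig (assumption: only the URL and
// the replace string matter for this check).
type step struct {
	URL     string
	Replace string
}

// checkStep is a hypothetical validation: a step can only be expanded if its
// replace string is set and appears in the URL skeleton.
func checkStep(s step) error {
	if s.Replace == "" {
		return errors.New("replace is empty")
	}
	if !strings.Contains(s.URL, s.Replace) {
		return fmt.Errorf("request.url %q does not contain replace string %q", s.URL, s.Replace)
	}
	return nil
}

func main() {
	ok := step{URL: "https://example.com/services/data/v1.0/export_ids/$.file_name/info", Replace: "$.file_name"}
	bad := step{URL: "https://example.com/services/data/v1.0/info", Replace: "$.file_name"}
	fmt.Println(checkStep(ok))
	fmt.Println(checkStep(bad))
}
```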
1 change: 1 addition & 0 deletions x-pack/filebeat/input/httpjson/config.go
@@ -17,6 +17,7 @@ type config struct {
	Request  *requestConfig  `config:"request" validate:"required"`
	Response *responseConfig `config:"response"`
	Cursor   cursorConfig    `config:"cursor"`
	Chain    []chainConfig   `config:"chain"`
Review comment (Contributor):

[Question]
chainConfig only contains a single field, stepConfig. Is it really needed? Are both used so that the final YAML looks like this:

   chain:
     - step:
         request.url: https://some_url.com/services/data/v1.0/$.records[:].id/export_ids (second call)
         request.method: GET
         replace: $.records[:].id
     - step:
         request.url: https://some_url.com/services/data/v1.0/export_ids/$.file_name/info (third call)
         request.method: GET
         replace: $.file_name

Maybe we could use the stepConfig directly:

Suggested change:
- Chain []chainConfig `config:"chain"`
+ Chain []stepConfig `config:"chain"`

I believe it would result in a yaml like this:

chain:
  - request.url: 'https://some_url.com/services/data/v1.0/$.records[:].id/export_ids'
    request.method: GET
    replace: '$.records[:].id'
  - request.url: 'https://some_url.com/services/data/v1.0/export_ids/$.file_name/info'
    request.method: GET
    replace: $.file_name

Reply (Contributor):
Yes, this is related to #29816 (comment). Changing the type would be better than just documenting the situation, though does this preclude extending the computation model in the future, and if it does, does that matter?

Reply (Collaborator Author):
Right, we thought of that, but decided to go with a step so that it is easy to extend in the future with additional processing between steps when required. For example:

  enabled: true
  request.url: https://example.com/
  interval: 1h
  chain:
    - step:
        request.method: GET
        request.url: "http://example.com/auth"
        response: ...

    - break:
        when: '[[eq .last_response.body.authtok ""]]'

    - step:
        request.method: POST
        request.url: 'http://example.com/api?auth=[[.last_response.body.authtok]]'
        response: ...

Hence, we were planning to update the docs to tell users to use steps.
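
The extensibility argument can be illustrated with a short Go sketch. Everything here is hypothetical: a `break` entry does not exist in this PR, and `chainEntry`, `stepEntry`, `breakEntry` and `kind` are illustrative names only. The point is that a wrapper struct with one optional field per entry type lets new entry kinds be added to the list without changing the shape of existing `step` entries.

```go
package main

import (
	"errors"
	"fmt"
)

// stepEntry and breakEntry are simplified, hypothetical entry types.
type stepEntry struct{ URL string }
type breakEntry struct{ When string }

// chainEntry mirrors the idea behind chainConfig: each list element is a
// wrapper, and exactly one of its fields is expected to be set.
type chainEntry struct {
	Step  *stepEntry
	Break *breakEntry
}

// kind reports which entry type is set, or an error if zero or both are set.
func kind(e chainEntry) (string, error) {
	switch {
	case e.Step != nil && e.Break == nil:
		return "step", nil
	case e.Break != nil && e.Step == nil:
		return "break", nil
	default:
		return "", errors.New("chain entry must set exactly one of step or break")
	}
}

func main() {
	chain := []chainEntry{
		{Step: &stepEntry{URL: "http://example.com/auth"}},
		{Break: &breakEntry{When: `[[eq .last_response.body.authtok ""]]`}},
	}
	for _, e := range chain {
		k, err := kind(e)
		fmt.Println(k, err)
	}
}
```

With `Chain []stepConfig` instead, every list element would have to be a step, so adding a second entry kind later would likely mean a breaking change to the configuration format.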

}

type cursorConfig map[string]cursorEntry
2 changes: 1 addition & 1 deletion x-pack/filebeat/input/httpjson/input.go
@@ -111,7 +111,7 @@ func run(
 		return err
 	}

-	requestFactory := newRequestFactory(config.Request, config.Auth, log)
+	requestFactory := newRequestFactory(config, log)
 	pagination := newPagination(config, httpClient, log)
 	responseProcessor := newResponseProcessor(config.Response, pagination, log)
 	requester := newRequester(httpClient, requestFactory, responseProcessor, log)