Host Monitoring Not Enabling #365
Comments
Hi Mark, Hmm, right. There are a few conditions under which a host can be disabled automatically:
There is a macro, which you can check in the UI under "TrackMe Manage and configure". Its default definition says the following:
This basically means that, by default, if a given data source has not received any data for longer than this period, the entity is disabled automatically.
Someone could set up a custom alert action using the TrackMe REST API: https://trackme.readthedocs.io/en/latest/userguide.html#alerts-tracking-trackme-alert-actions Basically, someone could have set up an action to disable the host automatically. Note that the same thing could be achieved from the outside using a REST call. In both cases, this would be recorded in the audit collection and show up in the audit changes.
Someone could also have custom logic that updates the KVstore collection records directly. In any case:
Let me know if that makes sense |
Yes, the screenshot is visible. Hmm, this looks weird; it seems to indicate that this host is being discovered over and over again. Some questions then: I would recommend being restrictive enough on the data hosts to start in good conditions. The collection tends to contain too much crap data, which makes it hard to get a clear picture. So, I generally recommend to:
What does the record look like?
Can you try to delete the host from the UI, then run the tracker a few times to see how it behaves? One possibility is that you have lots of crap in there: a very large number of hosts containing a very large number of sourcetypes, etc. |
Hi there,
Mark |
When you deleted the host, did you use permanent deletion or temporary deletion? You can check your action in the audit changes tab. As well:
Guilhem |
I did a temporary deletion and ran both the short and long term trackers several times, with the same results. I then tried a permanent delete :( With regard to the allow/block lists, all of them are at the default installation settings. I haven't added or removed anything from those. Mark. |
Hi @cidermark When you delete a host through the UI, this creates a deletion record in the audit changes, for example:
To allow the host to be re-created, you can update this record, for example:
Then, when running the tracker, the host can be re-created if the data allows it. Now, if your host is still not created, you can start from this search:
Then check what is going on. You can expand the search and go step by step to understand why the host wouldn't be created. |
I followed the advice to re-add the server and it's back in the list. I re-enabled monitoring but, sadly, it still reverts back to not monitored after 5 minutes :( Mark. |
Right, ok so now that it's back in the collection let's continue. |
In my previous message I was showing this:
Adapt this to your own case, then run this command over the last 4 hours, for instance, and expand the search. You will get quite a large search; some parts of the code deal with data_monitored_state: When comparing these with yours, do you see anything unusual?
Can you please check in /opt/splunk/etc/apps/trackme/local/ and look at any local config files you have, especially savedsearches.conf and macros.conf? Is there anything in there? |
Hi @cidermark Let me know if you have any update ;-) |
Hi @guilhemmarchand , I'll get on to this as soon as I can but I'm away from my computer this week. Hopefully I'll be able to take a look tomorrow. Mark |
No problem @cidermark just wanna make sure we don't leave that out. Guilhem |
Hi @guilhemmarchand I couldn't find anything especially notable in the local directory - just a modified macros.conf and savedsearches.conf. Does this give any better insight into what the problem may be? Again, thanks for your help,
Hi @guilhemmarchand - any thoughts on my response? Cheers, |
Hi @cidermark Thanks for the reminder ;-) One potential root cause might be the search breaking because of far too many sourcetypes for a certain number of hosts. This can happen with bad practices such as dynamic sourcetyping. Can you run: So the following would show the biggest ones from the collection:
Which could also be seen in the data:
What we want to find out is whether some hosts have a seriously large number of sourcetypes; those should be excluded from host tracking. Let me know |
Thinking about it, the easiest might be that we have a look together. I believe you have some form of exception here, and I am sure there's a reason. My email is: guilhem.marchand@gmail.com Guilhem |
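For illustration only, the check described above (which hosts carry an unusually large number of distinct sourcetypes) can be sketched outside Splunk. This is not the search from this thread and not TrackMe code: it assumes you have exported `(host, sourcetype)` pairs from your own data, and the function names and the `max_sourcetypes` threshold are hypothetical.

```python
from collections import defaultdict


def sourcetypes_per_host(events):
    """Map each host to the set of distinct sourcetypes seen for it.

    `events` is an iterable of (host, sourcetype) pairs, assumed to be
    exported from your own Splunk data (hypothetical input format).
    """
    seen = defaultdict(set)
    for host, sourcetype in events:
        seen[host].add(sourcetype)
    return dict(seen)


def hosts_over_threshold(events, max_sourcetypes):
    """Return (host, distinct_count) pairs exceeding the threshold.

    Results are sorted by count (largest first), then by host name.
    `max_sourcetypes` is an arbitrary illustrative limit, not a TrackMe setting.
    """
    counts = {h: len(s) for h, s in sourcetypes_per_host(events).items()}
    flagged = [(h, n) for h, n in counts.items() if n > max_sourcetypes]
    return sorted(flagged, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    sample = [
        ("web01", "access"),
        ("web01", "error"),
        ("app01", "st_1"),
        ("app01", "st_2"),
        ("app01", "st_3"),
        ("app01", "st_1"),  # duplicate pair, counted once
    ]
    print(hosts_over_threshold(sample, max_sourcetypes=2))
```

Hosts flagged this way would be the candidates to exclude from host tracking, as suggested above.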
Hi Guilhem,
I'm not sure if this is a bug or a misconfiguration, but I'm trying to enable monitoring on a load of hosts. Of the 112 hosts, 67 remain enabled but the other 45 revert to 'disabled' after the 5-minute refresh.
Is there some kind of log that I can look at to help diagnose the issue?
Cheers,
Mark.