Improve pattern matching #133
edoardottt started this conversation in Ideas
From @ocervell:
I came across huge matches like:
which completely destroy my terminal 😄
So we might think about either:
We could end up with a JSON format like:
Additionally, regexes have their limits. Ideally we want to go one step further and create some kind of pattern-recognition algorithm, or even use ML for this kind of task. It could be a good evolution for cariddi ;) The `type` key would be useful in that case to differentiate those matches from regex matches:
There is also room to improve the findings by filtering which ones are important and which are not, for instance:
- `licensing@<domain>` or `sales@<domain>` is very common and not very sensitive
- etc...
Those "rules" could first be hardcoded by us on a case-by-case basis and then learned by ML at some point, and a `severity` field could be set for each finding.
There might be a need to create separate issues for some of those points, since they are not directly linked to the JSON lines aggregation. Feel free to copy-paste some of my comments there.
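To make the hardcoded "rules" idea a bit more concrete, here is a rough sketch of how a `severity` field could be attached to email findings next to the `type` key. It is written in Python only for brevity and is not cariddi code: the `match` field name, the `low`/`medium` labels, and the helper names `email_severity` and `to_json_line` are all assumptions made up for illustration. Only `licensing@` and `sales@` come from the examples above.

```python
import json

# Hypothetical rule set: local parts of generic role mailboxes that are
# very common and rarely sensitive (licensing@ and sales@ are the two
# examples given in the discussion).
LOW_SEVERITY_LOCAL_PARTS = {"licensing", "sales"}


def email_severity(email: str) -> str:
    """Return an assumed severity label ("low" or "medium") for an email finding."""
    local_part = email.split("@", 1)[0].lower()
    return "low" if local_part in LOW_SEVERITY_LOCAL_PARTS else "medium"


def to_json_line(match: str, source: str = "regex") -> str:
    """Serialize one finding as a single JSON line with `type` and `severity` keys."""
    finding = {
        "match": match,  # hypothetical field name
        "type": source,  # "regex" today; could be e.g. "ml" for other detectors
        "severity": email_severity(match),
    }
    return json.dumps(finding)


print(to_json_line("sales@example.com"))
print(to_json_line("jane.doe@example.com"))
```

A plain lookup table like this is easy to extend one case at a time, and the labels it produces could later serve as training data if the rules end up being learned.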