Automated Vulnerability Scoring and Categorisation Toolset for Vulnerability Reports.
- About the Tool
- Severity Prediction Under CVSS Version 3
- Severity Prediction Under CVSS Version 2
- Threat Prediction Using CVEDetails
- Future Works
- References
Vulnerability severity scoring and categorisation using machine-learning tools. VulnerabilityClassifier is an open-source toolkit that employs machine-learning techniques to learn vulnerability labels assigned by NVD, vendors, cvedetails, and other repositories, in order to predict the labels for new vulnerability reports. Here, "labels" refers to CVSS-metric labels, threat types provided by cvedetails, weakness types provided by CWE, and attack types provided by CAPEC. The purpose is to support a higher level of automation in vulnerability assessment.
Datasets for training the CWE, CAPEC, CVSS, and threat classifiers are generated in a separate repo: NVD Data Feature Analysis
The recommended environment is Python 3. The tutorials require Jupyter Notebook (available through Anaconda Navigator).
The goal is to automatically assign a severity score, following the CVSS Version 3 standard, to any vulnerability instance that has a descriptive report. Two examples are shown below: the TestingSamples start with placeholder labels (CVSS score = 0 and all other values set to "l"), while the labels of the PredictedSamples are produced by the trained machine-learning models.
The CVSS V3 Notebook walks step by step through the severity computation pipeline, which streamlines machine-learning model training, testing, and validation.
- Machine-learning model: Logistic Regression is used to demonstrate the applicability of the proposed approach. Other machine-learning models can be substituted to try to improve performance.
- Training/Testing dataset: NVD data feeds (2002-2020).
- Validating dataset: NVD data feeds (2021).
- Step 1: Clone the repo using the following command:
git clone https://github.com/Yuni0217/VulnerabilityClassifier.git
- Step 2: Create a virtual environment.
- Step 3: Install requirements using pip:
pip install -r requirements.txt
- Step 4: Download datasets from NVD feeds.
python ./CVSSV3prediction/updateDB.py
- Step 5: Train machine-learning models for the different CVSS V3 metrics and store them.
python ./CVSSV3prediction/trainScoreCVSSV3.py
- Step 6: Use the trained machine-learning models to predict CVSS V3 scores for any vulnerability document.
python ./CVSSV3prediction/predictScoreCVSSV3.py -p './CVSSV3prediction/testData' -s -v
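Once a label has been predicted for each CVSS V3 base metric, a numeric score can be derived from those labels with the base-score equations published in the CVSS v3.1 specification. The sketch below is a minimal, standalone illustration of that last step. The function names (`parse_vector`, `base_score_v3`, `severity_rating`) are hypothetical and are not part of this repo, and whether `predictScoreCVSSV3.py` computes its score in exactly this way is an assumption.

```python
# Weights for each metric value, taken from the CVSS v3.1 specification.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# Privileges Required is weighted differently when Scope is Changed ("C").
PR_WEIGHTS = {
    "U": {"N": 0.85, "L": 0.62, "H": 0.27},
    "C": {"N": 0.85, "L": 0.68, "H": 0.5},
}


def parse_vector(vector):
    """Hypothetical helper: turn 'AV:N/AC:L/...' into {'AV': 'N', ...}."""
    return dict(part.split(":") for part in vector.split("/"))


def roundup(value):
    """Round up to one decimal place, as defined for CVSS v3.1."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (scaled // 10000 + 1) / 10.0


def base_score_v3(metrics):
    """Compute the CVSS v3.1 base score from a dict of metric labels."""
    scope = metrics["S"]
    c, i, a = (WEIGHTS[k][metrics[k]] for k in ("C", "I", "A"))
    iss = 1 - (1 - c) * (1 - i) * (1 - a)
    if scope == "C":
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    exploitability = (
        8.22
        * WEIGHTS["AV"][metrics["AV"]]
        * WEIGHTS["AC"][metrics["AC"]]
        * PR_WEIGHTS[scope][metrics["PR"]]
        * WEIGHTS["UI"][metrics["UI"]]
    )
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if scope == "C":
        total *= 1.08
    return roundup(min(total, 10))


def severity_rating(score):
    """Map a base score to the CVSS v3.x qualitative rating."""
    if score == 0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"
```

For example, `base_score_v3(parse_vector("AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H"))` evaluates to 9.8, which `severity_rating` maps to "Critical".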
Similarly, the vulnerability severity score under CVSS Version 2 can be predicted using a trained machine-learning model.
The model training, testing, and validation process is illustrated step by step in the CVSS V2 Notebook.
- Machine-learning model: Logistic Regression.
- Training/Testing dataset: NVD data feeds (2002-2020).
- Validating dataset: NVD data feeds (2021).
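As with Version 3, a CVSS Version 2 base score can be derived from predicted metric labels using the equations in the CVSS v2 specification. The following is a minimal sketch of that calculation. The names `parse_vector_v2`, `base_score_v2`, and `severity_v2` are hypothetical, and it is an assumption that the notebook computes the final score in this way. The severity bands follow the Low/Medium/High ranges that NVD uses for Version 2.

```python
# Weights for each metric value, taken from the CVSS v2 specification.
V2_WEIGHTS = {
    "AV": {"L": 0.395, "A": 0.646, "N": 1.0},
    "AC": {"H": 0.35, "M": 0.61, "L": 0.71},
    "Au": {"M": 0.45, "S": 0.56, "N": 0.704},
    "C": {"N": 0.0, "P": 0.275, "C": 0.660},
    "I": {"N": 0.0, "P": 0.275, "C": 0.660},
    "A": {"N": 0.0, "P": 0.275, "C": 0.660},
}


def parse_vector_v2(vector):
    """Hypothetical helper: turn 'AV:N/AC:L/Au:N/...' into a dict."""
    return dict(part.split(":") for part in vector.split("/"))


def base_score_v2(metrics):
    """Compute the CVSS v2 base score from a dict of metric labels."""
    w = {name: V2_WEIGHTS[name][metrics[name]] for name in V2_WEIGHTS}
    impact = 10.41 * (1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"]))
    exploitability = 20 * w["AV"] * w["AC"] * w["Au"]
    # The specification zeroes the score when there is no impact at all.
    f_impact = 0.0 if impact == 0 else 1.176
    return round((0.6 * impact + 0.4 * exploitability - 1.5) * f_impact, 1)


def severity_v2(score):
    """Map a v2 base score to the Low/Medium/High bands used by NVD."""
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    return "High"
```

For example, `base_score_v2(parse_vector_v2("AV:N/AC:L/Au:N/C:P/I:P/A:P"))` evaluates to 7.5, which falls in the "High" band.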
The threat categories that a vulnerability might be exposed to can be predicted using a trained machine-learning model. The accuracy is shown below (no optimisation has been applied yet).
The model training, testing, and validation process is illustrated in the Threat Prediction Notebook.
- Machine-learning model: LSTM Model.
- Training/Testing dataset: NVD data feeds (2002-2021); cvedetails.
Before using the Threat Prediction Notebook tutorial, you can also update the data so that it is synchronised with the latest vulnerability data feeds, and create the mappings between CVEs and cvedetails threat types, with the following scripts:
python ./threatPrediction/updateDB.py
python ./threatPrediction/cveIDcrawler_in_cveDetails.py
python ./threatPrediction/generateThreatTrainingData.py
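If you prefer to run the three scripts as a single step, a small wrapper such as the one below may help. It is only a sketch: `build_commands` and `refresh_threat_data` are hypothetical helpers that do not exist in this repo, and it assumes that the scripts are run from the repo root, in the order listed above, and that a failing script should stop the sequence.

```python
import subprocess
import sys

# Script paths as listed in this README, relative to the repo root.
THREAT_SCRIPTS = [
    "./threatPrediction/updateDB.py",
    "./threatPrediction/cveIDcrawler_in_cveDetails.py",
    "./threatPrediction/generateThreatTrainingData.py",
]


def build_commands(scripts=THREAT_SCRIPTS, python=sys.executable):
    """Return one argument list per script, using the current interpreter."""
    return [[python, script] for script in scripts]


def refresh_threat_data(run=subprocess.run):
    """Run each script in order; check=True raises if a script exits non-zero."""
    for command in build_commands():
        run(command, check=True)
```

Calling `refresh_threat_data()` from the repo root runs the scripts one after another. The `run` parameter exists only so that the sequence can be exercised without actually launching the scripts.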
- More classification work will be added, covering weakness types provided by CWE and attack types provided by CAPEC.
- Wrap the prediction models for different purposes (threat categorisation, CVSS-metric categorisation, CWE classification) into a single pipeline.
If you use this tool in your academic work, you can cite it using:
@article{jiang2022towards,
title={Towards automatic discovery and assessment of vulnerability severity in cyber--physical systems},
author={Jiang, Yuning and Atif, Yacine},
journal={Array},
volume={15},
pages={100209},
year={2022},
publisher={Elsevier}
}