tf-idf.ipynb: TF-IDF implementation on the entire phishing email dataset.
descriptive-statistics-var1.ipynb & descriptive-statistics-var2.ipynb: Descriptive statistics (mean, median, standard deviation) of the phishing data under variations 1 and 2.
mwu-var1.ipynb & mwu-var2.ipynb: Mann-Whitney U test comparing phishing emails received before and after each DDoS announcement date under variations 1 and 2.
sum-var1.ipynb & sum-var2.ipynb: Sum of phishing emails (for the respective hypothesis) received before and after each DDoS announcement date under variations 1 and 2.
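For readers unfamiliar with the TF-IDF step in tf-idf.ipynb, the following is a minimal sketch of the weighting, not the notebook's actual code. It assumes whitespace tokenization, tf as relative term frequency, and idf as log(N / df); the notebook may use a library implementation with different smoothing. The function name `tf_idf` and the sample emails are hypothetical.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    tf  = term count / number of tokens in the document
    idf = log(N / df), where df is the number of documents containing the term
    """
    # Assumption: lowercase + whitespace split stands in for real tokenization.
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy emails, not taken from the dataset.
emails = [
    "verify your account now",
    "your invoice is attached",
    "update your password now",
]
scores = tf_idf(emails)
```

With this weighting, a term that occurs in every email (here "your") gets weight 0, while terms unique to one email get the highest weights.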
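Description of variations 1 and 2 (days before and after each DDoS announcement date):
H1: +/- 3 days (variation 1) and +/- 7 days (variation 2)
H2: +/- 7 days (variation 1) and +/- 15 days (variation 2)
H3: +/- 7 days (variation 1) and +/- 15 days (variation 2)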
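The before/after comparison for a given window can be sketched as follows. This is an illustration under stated assumptions, not the notebooks' code: the announcement day itself is excluded, days with no recorded emails count as 0, and the data is a dict of daily email counts. The helper names `window_counts` and `mann_whitney_u` and all numbers are hypothetical. Only the U statistic is computed here; a library routine such as `scipy.stats.mannwhitneyu` would also report a p-value.

```python
from datetime import date, timedelta
from statistics import mean, median, stdev


def window_counts(daily_counts, event, days):
    """Split daily counts into the `days` days before and after `event`.

    Assumption: the event day is excluded and missing days count as 0.
    """
    before = [daily_counts.get(event - timedelta(days=d), 0)
              for d in range(days, 0, -1)]
    after = [daily_counts.get(event + timedelta(days=d), 0)
             for d in range(1, days + 1)]
    return before, after


def mann_whitney_u(x, y):
    """U statistic for x: each pair with x_i > y_j counts 1, each tie 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in x for b in y)


# Hypothetical daily phishing email counts around one announcement date.
event = date(2020, 1, 10)
counts = {event + timedelta(days=d): c
          for d, c in zip(range(-3, 4), [2, 3, 1, 9, 5, 6, 4])}

# A +/- 3 day window around the announcement date.
before, after = window_counts(counts, event, days=3)

summary = {
    "sum": (sum(before), sum(after)),
    "mean": (mean(before), mean(after)),
    "median": (median(before), median(after)),
    "stdev": (stdev(before), stdev(after)),
    "U_after": mann_whitney_u(after, before),
}
```

Changing `days` to 7 or 15 gives the other window sizes listed above.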
Selection of phishing emails:
H1: all English phishing emails
H2: phishing emails with security-related content that can lead to a DDoS attack
H3: phishing emails with malware attachments that take the victim to a malicious website, causing a DDoS attack