-
Notifications
You must be signed in to change notification settings - Fork 12
/
README.install
85 lines (73 loc) · 3.7 KB
/
README.install
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
/*
* Polygraph (release 0.1)
* Signature generation algorithms for polymorphic worms
*
* Copyright (c) 2004-2005, Intel Corporation
* All Rights Reserved
*
* This software is distributed under the terms of the Eclipse Public
* License, Version 1.0 which can be found in the file named LICENSE.
* ANY USE, REPRODUCTION OR DISTRIBUTION OF THIS SOFTWARE CONSTITUTES
* RECIPIENT'S ACCEPTANCE OF THIS AGREEMENT
*/
The following steps are to install the Polygraph code, and to run the
included experiments.
Step 1: Install dependencies
Make and install dependencies (see README.dependencies).
NOTE: you must patch libstree using libstree-annotation.diff
NOTE: you must patch sary using sary_next_offset.diff
Step 2: Install Polygraph code
From the polygraph subdirectory, run 'python setup.py'. Optionally,
use --home=installdir to specify where to install.
The next steps are to run the experiments in the included test bed. You can
skip these steps if you don't want to run these particular experiments, though
it could be useful to get an idea of how to use the polygraph code.
Step 3: Acquire and preprocess traces
Polygraph uses traces of innocuous traffic during signature generation to help
estimate how likely tokens are to appear in innocuous streams, to estimate
the overall false positive rate of generated signatures, and to set the
threshold in bayes signatures. The test bed uses additional traces to
measure the false positive rate of generated signatures.
To run the experiments in the included test bed, you will need four traces:
Two traces of port 53 (DNS), and two traces of port 80 (HTTP). One trace
for each port will be used for training, and the other for evaluation.
Due to privacy concerns, we are not able to distribute the traces we used
to generate our published results. Sample, toy traces are included in
experiments/traces/, but these should NOT be used for real experiments.
They may be used to verify that the Polygraph code and test bed are installed
and running correctly.
Once you have acquired the appropriate traces, they must be converted to
'streamfiles', which consist of the reconstructed network streams. To build
a streamfile, run:
reconstruct_streams [--nosary] outname pcapfile1 [pcapfile2] [pcapfile3]...
By default, this will will build a streamfile called 'outname' out of the
specified pcap trace files, and build a suffix array to allow them to be
efficiently searched. The suffix arrays are required for streamfiles used
for training, but are not necessary for evaluation traces. Note that in
the current implementation, network streams that span multiple pcap files
will be broken into separate streams in the resulting streamfile.
Step 4: Generate workloads
The next step is to generate the polymorphic worm workloads. This can be
done simply by running experiments/workloads/generate_workloads.py
Step 5: Configuration
Edit experiments/config.py to specify where the streamfiles and workloads
are located.
Step 6: Run experiments
There are several 'run.py' scripts in subdirectories of 'experiments'.
Each 'run.py' script will run a particular experiment from the paper.
Experiments included are:
single_workload: suspicious pool contains samples from 1 worm
apache_host
atphttpd
lion
single_noise: suspicious pool contains samples from 1 worm, and 'noise'
innocuous requests
apache_host
atphttpd
lion
mixed_workloads: suspicious pool contains multiple worm types, and noise
http
When each experiment is run, a subdirectory will be created named after the
current time. That directory will contain results in a format suitable for
graphing. A file will also be created, named after the current time, with
'.logfile' appended. This is a more human readable log of the results.