forked from xapian/xapian-evaluation
-
Notifications
You must be signed in to change notification settings - Fork 0
/
README
85 lines (62 loc) · 3.34 KB
/
README
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
Evaluation
==========
Evaluation module is one of essential module for a search engine to be used in academia. This module eases to calculate various metric precision, recall, precision @ rank, Precision @ recall which validates implementation and check the engine on result presented on literature. Xapian being an open source engine have implemented evaluation module to help people from academia.
Buiding Evaluation Module
-------------------------
User needs to build Evaluation module on their system after cloning the source repository. This can be done using
::
autoreconf -fiv
./configure
make
sudo make install
Using Evaluation Module
------------------------
Evaluation metric currently in xapian
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
* Mean Average precision(MAP)
* Mean Relevance Precision(MRP)
* Precision at Rank
* Precision at Recall
* Recall
Indexing Collection with evaluation Module
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
User can index a collection using
::
trec_index config
Find a sample config file in current directory names 'config'
User needs to compulsory configure textfile, db and stopsfile parameters to be able to use index a collection. User needs to set indexbigram parameter to true to be able to index bigrams while indexing to use bigram language model weighting scheme.
Searching Topics file and creating run with evaluation module
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
User can search a collection using index build to create a run with queries in topic file.
::
trec_query config
trec_search config
User need to compulsory configure queryfile, topicfile and stopsfile to search using a topic file, needs to set queryparserbigram to true to be able to add bigrams in query object formed by bigram.
Evaluating the run
^^^^^^^^^^^^^^^^^^^
User needs to run to get the evaluation result for your run.
::
trec_adhoceval config
It's compulsory to set relfile and runfile for using this option.
Configuring evaluation module
------------------------------
Configurable Parameters:
^^^^^^^^^^^^^^^^^^^^^^^^
- textfile - (INPUT) path/filename of text file to be indexed for evaluation
- language - (INPUT) corpus language to be indexed for stemming
- db - (OUTPUT) path of database which is index of above textfile
- querytype - (INPUT) type of query
- queryfile - (OUTPUT) path/filename of query file to be ran and searched
- resultsfile - (OUTPUT) path/filename of results file(run file)
- transfile - (INPUT) path/filename of transaction file
- noresults - (INPUT) no of results to save in results log file
- topicfile - (INPUT) path/filename of topic file for which relevance judgement file is available
- topicfields - (INPUT) fields of topic to use from topic file in evaluation
- relfile - (INPUT) path/filename of relevance judgements file
- runname - (INPUT) name of the evaluation run
- nterms - (INPUT) no of terms to pick from the topic
- stopsfile - (INPUT) name of the stopword file
- evaluationfiles - (OUTPUT) path/filename of the evaluation files(where evaluation result is stored).
- indexbigram - (INPUT) Index Bigram in the index.
- queryparsebigram -(INPUT) Parse and add bigrams to the Query.
- weightingscheme - (INPUT) Specifies the weighting scheme and any parameters (passed to ``Xapian::Weight::create()``)