diff --git a/.gitignore b/.gitignore
index b97ccc8d19..c6a57185cd 100644
--- a/.gitignore
+++ b/.gitignore
@@ -61,4 +61,7 @@ src/main/resources/topics-and-qrels/qrels.covid-round12.txt
 src/main/resources/topics-and-qrels/qrels.covid-round2-cumulative.txt
 # TREC 2022 NeuCLIR qrels haven't been official released yet.
-src/main/resources/topics-and-qrels/qrels.neuclir22-*.txt
\ No newline at end of file
+src/main/resources/topics-and-qrels/qrels.neuclir22-*.txt
+
+# TREC 2022 DL qrels haven't been official released yet.
+src/main/resources/topics-and-qrels/qrels.dl22-*.txt
diff --git a/README.md b/README.md
index 7383b65ace..88772a41db 100644
--- a/README.md
+++ b/README.md
@@ -58,6 +58,9 @@ For the most part, these runs are based on [_default_ parameter settings](https:
 These pages can also serve as guides to reproduce our results.
 See individual pages for details!
+<details>
+<summary>MS MARCO V1 Passage Corpus</summary>
+
 ### MS MARCO V1 Passage Corpus
 | | dev | DL19 | DL20 |
@@ -78,6 +81,10 @@ See individual pages for details!
 | SPLADEv2 | [✓](docs/regressions-msmarco-passage-distill-splade-max.md) |
 | SPLADE-distill CoCodenser-medium | [✓](docs/regressions-msmarco-passage-splade-distil-cocodenser-medium.md) | [✓](docs/regressions-dl19-passage-splade-distil-cocodenser-medium.md) | [✓](docs/regressions-dl20-passage-splade-distil-cocodenser-medium.md) |
+</details>
+<details>
+<summary>MS MARCO V1 Document Corpus</summary>
+
 ### MS MARCO V1 Document Corpus
 | | dev | DL19 | DL20 |
@@ -95,6 +102,10 @@ See individual pages for details!
 | uniCOIL noexp | [✓](docs/regressions-msmarco-doc-segmented-unicoil-noexp.md) | [✓](docs/regressions-dl19-doc-segmented-unicoil-noexp.md) | [✓](docs/regressions-dl20-doc-segmented-unicoil-noexp.md) |
 | uniCOIL with doc2query-T5 | [✓](docs/regressions-msmarco-doc-segmented-unicoil.md) | [✓](docs/regressions-dl19-doc-segmented-unicoil.md) | [✓](docs/regressions-dl20-doc-segmented-unicoil.md) |
+</details>
+<details>
+<summary>MS MARCO V2 Passage Corpus</summary>
+
 ### MS MARCO V2 Passage Corpus
 | | dev | DL21 |
@@ -109,6 +120,10 @@ See individual pages for details!
 | uniCOIL noexp zero-shot | [✓](docs/regressions-msmarco-v2-passage-unicoil-noexp-0shot.md) | [✓](docs/regressions-dl21-passage-unicoil-noexp-0shot.md) |
 | uniCOIL with doc2query-T5 zero-shot | [✓](docs/regressions-msmarco-v2-passage-unicoil-0shot.md) | [✓](docs/regressions-dl21-passage-unicoil-0shot.md) |
+</details>
+<details>
+<summary>MS MARCO V2 Document Corpus</summary>
+
 ### MS MARCO V2 Document Corpus
 | | dev | DL21 |
@@ -123,6 +138,10 @@ See individual pages for details!
 | uniCOIL noexp zero-shot | [✓](docs/regressions-msmarco-v2-doc-segmented-unicoil-noexp-0shot-v2.md) | [✓](docs/regressions-dl21-doc-segmented-unicoil-noexp-0shot-v2.md) |
 | uniCOIL with doc2query-T5 zero-shot | [✓](docs/regressions-msmarco-v2-doc-segmented-unicoil-0shot-v2.md) | [✓](docs/regressions-dl21-doc-segmented-unicoil-0shot-v2.md) |
+</details>
+<details>
+<summary>Regressions for BEIR (v1.0.0)</summary>
+
 ### Regressions for BEIR (v1.0.0)
 + F = "flat" baseline
@@ -162,10 +181,13 @@ See individual pages for details!
 | Climate-FEVER | [+](docs/regressions-beir-v1.0.0-climate-fever-flat.md) | [+](docs/regressions-beir-v1.0.0-climate-fever-flat-wp.md) | [+](docs/regressions-beir-v1.0.0-climate-fever-multifield.md) | [+](docs/regressions-beir-v1.0.0-climate-fever-unicoil-noexp.md) | [+](docs/regressions-beir-v1.0.0-climate-fever-splade-distil-cocodenser-medium.md) |
 | SciFact | [+](docs/regressions-beir-v1.0.0-scifact-flat.md) | [+](docs/regressions-beir-v1.0.0-scifact-flat-wp.md) | [+](docs/regressions-beir-v1.0.0-scifact-multifield.md) | [+](docs/regressions-beir-v1.0.0-scifact-unicoil-noexp.md) | [+](docs/regressions-beir-v1.0.0-scifact-splade-distil-cocodenser-medium.md) |
+</details>
+<details>
+<summary>Regressions for MIRACL</summary>
 ### Regressions for MIRACL
-| | dev |
+| | BM25 |
 |---|:---:|
 | Arabic | [+](docs/regressions-miracl-v1.0-ar.md) |
 | Bengali | [+](docs/regressions-miracl-v1.0-bn.md) |
@@ -184,17 +206,12 @@ See individual pages for details!
 | Thai | [+](docs/regressions-miracl-v1.0-th.md) |
 | Chinese | [+](docs/regressions-miracl-v1.0-zh.md) |
+</details>
+<details>
+<summary>Other Cross-Lingual and Multi-Lingual Regressions</summary>
-### Other Regressions
+### Other Cross-Lingual and Multi-Lingual Regressions
-+ Regressions for [Disks 1 & 2 (TREC 1-3)](docs/regressions-disk12.md), [Disks 4 & 5 (TREC 7-8, Robust04)](docs/regressions-disk45.md), [AQUAINT (Robust05)](docs/regressions-robust05.md)
-+ Regressions for [the New York Times Corpus (Core17)](docs/regressions-core17.md), [the Washington Post Corpus (Core18)](docs/regressions-core18.md)
-+ Regressions for [Wt10g](docs/regressions-wt10g.md), [Gov2](docs/regressions-gov2.md)
-+ Regressions for [ClueWeb09 (Category B)](docs/regressions-cw09b.md), [ClueWeb12-B13](docs/regressions-cw12b13.md), [ClueWeb12](docs/regressions-cw12.md)
-+ Regressions for [Tweets2011 (MB11 & MB12)](docs/regressions-mb11.md), [Tweets2013 (MB13 & MB14)](docs/regressions-mb13.md)
-+ Regressions for Complex Answer Retrieval (CAR17): [v1.5](docs/regressions-car17v1.5.md), [v2.0](docs/regressions-car17v2.0.md), [v2.0 with doc2query](docs/regressions-car17v2.0-doc2query.md)
-+ Regressions for TREC News Tracks (Background Linking Task): [2018](docs/regressions-backgroundlinking18.md), [2019](docs/regressions-backgroundlinking19.md), [2020](docs/regressions-backgroundlinking20.md)
-+ Regressions for [FEVER Fact Verification](docs/regressions-fever.md)
 + Regressions for [NTCIR-8 ACLIA (IR4QA subtask, Monolingual Chinese)](docs/regressions-ntcir8-zh.md)
 + Regressions for [CLEF 2006 Monolingual French](docs/regressions-clef06-fr.md)
 + Regressions for [TREC 2002 Monolingual Arabic](docs/regressions-trec02-ar.md)
@@ -205,11 +222,30 @@ See individual pages for details!
 + Regressions for HC4 (v1.0) baselines on translated NeuCLIR22 corpora: [Persian](docs/regressions-hc4-neuclir22-fa-en.md), [Russian](docs/regressions-hc4-neuclir22-ru-en.md), [Chinese](docs/regressions-hc4-neuclir22-zh-en.md)
 + Regressions for TREC 2022 NeuCLIR Track (query translation): [Persian](docs/regressions-neuclir22-fa-qt.md), [Russian](docs/regressions-neuclir22-ru-qt.md), [Chinese](docs/regressions-neuclir22-zh-qt.md)
 + Regressions for TREC 2022 NeuCLIR Track (document translation): [Persian](docs/regressions-neuclir22-fa-dt.md), [Russian](docs/regressions-neuclir22-ru-dt.md), [Chinese](docs/regressions-neuclir22-zh-dt.md)
+
+</details>
+<details>
+<summary>Other Regressions</summary>
+
+### Other Regressions
+
++ Regressions for [Disks 1 & 2 (TREC 1-3)](docs/regressions-disk12.md), [Disks 4 & 5 (TREC 7-8, Robust04)](docs/regressions-disk45.md), [AQUAINT (Robust05)](docs/regressions-robust05.md)
++ Regressions for [the New York Times Corpus (Core17)](docs/regressions-core17.md), [the Washington Post Corpus (Core18)](docs/regressions-core18.md)
++ Regressions for [Wt10g](docs/regressions-wt10g.md), [Gov2](docs/regressions-gov2.md)
++ Regressions for [ClueWeb09 (Category B)](docs/regressions-cw09b.md), [ClueWeb12-B13](docs/regressions-cw12b13.md), [ClueWeb12](docs/regressions-cw12.md)
++ Regressions for [Tweets2011 (MB11 & MB12)](docs/regressions-mb11.md), [Tweets2013 (MB13 & MB14)](docs/regressions-mb13.md)
++ Regressions for Complex Answer Retrieval (CAR17): [v1.5](docs/regressions-car17v1.5.md), [v2.0](docs/regressions-car17v2.0.md), [v2.0 with doc2query](docs/regressions-car17v2.0-doc2query.md)
++ Regressions for TREC News Tracks (Background Linking Task): [2018](docs/regressions-backgroundlinking18.md), [2019](docs/regressions-backgroundlinking19.md), [2020](docs/regressions-backgroundlinking20.md)
++ Regressions for [FEVER Fact Verification](docs/regressions-fever.md)
 + Regressions for DPR Wikipedia QA baselines: [100-word splits](docs/regressions-wikipedia-dpr-100w-bm25.md)
+</details>
 ### Available Corpora
+<details>
+<summary>Variants of MS MARCO V1 and V2 corpora available for download</summary>
+
 | Corpora | Size | Checksum |
 |:------------------------------------------------------------------------------------------------------------------------------------------------|-------:|:-----------------------------------|
 | [MS MARCO V1 passage: Quantized BM25](https://rgw.cs.uwaterloo.ca/JIMMYLIN-bucket0/data/msmarco-passage-bm25-b8.tar) | 1.2 GB | `0a623e2c97ac6b7e814bf1323a97b435` |
@@ -226,6 +262,8 @@ See individual pages for details!
 | [MS MARCO V2 doc: uniCOIL (noexp)](https://rgw.cs.uwaterloo.ca/JIMMYLIN-bucket0/data/msmarco_v2_doc_segmented_unicoil_noexp_0shot_v2.tar) | 55 GB | `97ba262c497164de1054f357caea0c63` |
 | [MS MARCO V2 doc: uniCOIL (d2q-T5)](https://rgw.cs.uwaterloo.ca/JIMMYLIN-bucket0/data/msmarco_v2_doc_segmented_unicoil_0shot_v2.tar) | 72 GB | `c5639748c2cbad0152e10b0ebde3b804` |
+</details>
+
 ## Additional Documentation
 The experiments described below are not associated with rigorous end-to-end regression testing and thus provide a lower standard of reproducibility.
@@ -289,6 +327,10 @@ Beyond that, there are always [open issues](https://github.com/castorini/anserin
 + v0.14.2: March 24, 2022 [[Release Notes](docs/release-notes/release-notes-v0.14.2.md)]
 + v0.14.1: February 27, 2022 [[Release Notes](docs/release-notes/release-notes-v0.14.1.md)]
 + v0.14.0: January 10, 2022 [[Release Notes](docs/release-notes/release-notes-v0.14.0.md)]
+
+<details>
+<summary>older... (and historic notes)</summary>
+
 + v0.13.5: November 2, 2021 [[Release Notes](docs/release-notes/release-notes-v0.13.5.md)]
 + v0.13.4: October 22, 2021 [[Release Notes](docs/release-notes/release-notes-v0.13.4.md)]
 + v0.13.3: August 22, 2021 [[Release Notes](docs/release-notes/release-notes-v0.13.3.md)]
@@ -327,6 +369,8 @@ Based on [preliminary experiments](docs/lucene7-vs-lucene8.md), query evaluation
 As a result of this upgrade, results of all regressions have changed slightly.
 To reproducible old results from Lucene 7.6, use [v0.5.1](https://github.com/castorini/anserini/releases).
+</details>
+
 ## References
 + Jimmy Lin, Matt Crane, Andrew Trotman, Jamie Callan, Ishan Chattopadhyaya, John Foley, Grant Ingersoll, Craig Macdonald, Sebastiano Vigna. [Toward Reproducible Baselines: The Open-Source IR Reproducibility Challenge.](https://cs.uwaterloo.ca/~jimmylin/publications/Lin_etal_ECIR2016.pdf) _ECIR 2016_.
diff --git a/tools b/tools
index 28a938134b..78cb98f944 160000
--- a/tools
+++ b/tools
@@ -1 +1 @@
-Subproject commit 28a938134b652a9153172edc0d82b7b765b66216
+Subproject commit 78cb98f94497267c5eb1676fd0b7c38ad77d70e6