Update README.md: add expando sections (#2007)
Update tools/
Update .gitignore
lintool authored Oct 27, 2022
1 parent 6d8601a commit 91ec674
Showing 3 changed files with 59 additions and 12 deletions.
5 changes: 4 additions & 1 deletion .gitignore
@@ -61,4 +61,7 @@ src/main/resources/topics-and-qrels/qrels.covid-round12.txt
src/main/resources/topics-and-qrels/qrels.covid-round2-cumulative.txt

# TREC 2022 NeuCLIR qrels haven't been official released yet.
src/main/resources/topics-and-qrels/qrels.neuclir22-*.txt
src/main/resources/topics-and-qrels/qrels.neuclir22-*.txt

# TREC 2022 DL qrels haven't been official released yet.
src/main/resources/topics-and-qrels/qrels.dl22-*.txt
64 changes: 54 additions & 10 deletions README.md
@@ -58,6 +58,9 @@ For the most part, these runs are based on [_default_ parameter settings](https:
These pages can also serve as guides to reproduce our results.
See individual pages for details!

<details>
<summary>MS MARCO V1 Passage Corpus</summary>

### MS MARCO V1 Passage Corpus

| | dev | DL19 | DL20 |
@@ -78,6 +81,10 @@ See individual pages for details!
| SPLADEv2 | [](docs/regressions-msmarco-passage-distill-splade-max.md) |
| SPLADE-distill CoCodenser-medium | [](docs/regressions-msmarco-passage-splade-distil-cocodenser-medium.md) | [](docs/regressions-dl19-passage-splade-distil-cocodenser-medium.md) | [](docs/regressions-dl20-passage-splade-distil-cocodenser-medium.md) |

</details>
<details>
<summary>MS MARCO V1 Document Corpus</summary>

### MS MARCO V1 Document Corpus

| | dev | DL19 | DL20 |
@@ -95,6 +102,10 @@ See individual pages for details!
| uniCOIL noexp | [](docs/regressions-msmarco-doc-segmented-unicoil-noexp.md) | [](docs/regressions-dl19-doc-segmented-unicoil-noexp.md) | [](docs/regressions-dl20-doc-segmented-unicoil-noexp.md) |
| uniCOIL with doc2query-T5 | [](docs/regressions-msmarco-doc-segmented-unicoil.md) | [](docs/regressions-dl19-doc-segmented-unicoil.md) | [](docs/regressions-dl20-doc-segmented-unicoil.md) |

</details>
<details>
<summary>MS MARCO V2 Passage Corpus</summary>

### MS MARCO V2 Passage Corpus

| | dev | DL21 |
@@ -109,6 +120,10 @@ See individual pages for details!
| uniCOIL noexp zero-shot | [](docs/regressions-msmarco-v2-passage-unicoil-noexp-0shot.md) | [](docs/regressions-dl21-passage-unicoil-noexp-0shot.md) |
| uniCOIL with doc2query-T5 zero-shot | [](docs/regressions-msmarco-v2-passage-unicoil-0shot.md) | [](docs/regressions-dl21-passage-unicoil-0shot.md) |

</details>
<details>
<summary>MS MARCO V2 Document Corpus</summary>

### MS MARCO V2 Document Corpus

| | dev | DL21 |
@@ -123,6 +138,10 @@ See individual pages for details!
| uniCOIL noexp zero-shot | [](docs/regressions-msmarco-v2-doc-segmented-unicoil-noexp-0shot-v2.md) | [](docs/regressions-dl21-doc-segmented-unicoil-noexp-0shot-v2.md) |
| uniCOIL with doc2query-T5 zero-shot | [](docs/regressions-msmarco-v2-doc-segmented-unicoil-0shot-v2.md) | [](docs/regressions-dl21-doc-segmented-unicoil-0shot-v2.md) |

</details>
<details>
<summary>Regressions for BEIR (v1.0.0)</summary>

### Regressions for BEIR (v1.0.0)

+ F = "flat" baseline
@@ -162,10 +181,13 @@ See individual pages for details!
| Climate-FEVER | [+](docs/regressions-beir-v1.0.0-climate-fever-flat.md) | [+](docs/regressions-beir-v1.0.0-climate-fever-flat-wp.md) | [+](docs/regressions-beir-v1.0.0-climate-fever-multifield.md) | [+](docs/regressions-beir-v1.0.0-climate-fever-unicoil-noexp.md) | [+](docs/regressions-beir-v1.0.0-climate-fever-splade-distil-cocodenser-medium.md) |
| SciFact | [+](docs/regressions-beir-v1.0.0-scifact-flat.md) | [+](docs/regressions-beir-v1.0.0-scifact-flat-wp.md) | [+](docs/regressions-beir-v1.0.0-scifact-multifield.md) | [+](docs/regressions-beir-v1.0.0-scifact-unicoil-noexp.md) | [+](docs/regressions-beir-v1.0.0-scifact-splade-distil-cocodenser-medium.md) |

</details>
<details>
<summary>Regressions for MIRACL</summary>

### Regressions for MIRACL

| | dev |
| | BM25 |
|---|:---:|
| Arabic | [+](docs/regressions-miracl-v1.0-ar.md) |
| Bengali | [+](docs/regressions-miracl-v1.0-bn.md) |
@@ -184,17 +206,12 @@ See individual pages for details!
| Thai | [+](docs/regressions-miracl-v1.0-th.md) |
| Chinese | [+](docs/regressions-miracl-v1.0-zh.md) |

</details>
<details>
<summary>Other Cross-Lingual and Multi-Lingual Regressions</summary>

### Other Regressions
### Other Cross-Lingual and Multi-Lingual Regressions

+ Regressions for [Disks 1 &amp; 2 (TREC 1-3)](docs/regressions-disk12.md), [Disks 4 &amp; 5 (TREC 7-8, Robust04)](docs/regressions-disk45.md), [AQUAINT (Robust05)](docs/regressions-robust05.md)
+ Regressions for [the New York Times Corpus (Core17)](docs/regressions-core17.md), [the Washington Post Corpus (Core18)](docs/regressions-core18.md)
+ Regressions for [Wt10g](docs/regressions-wt10g.md), [Gov2](docs/regressions-gov2.md)
+ Regressions for [ClueWeb09 (Category B)](docs/regressions-cw09b.md), [ClueWeb12-B13](docs/regressions-cw12b13.md), [ClueWeb12](docs/regressions-cw12.md)
+ Regressions for [Tweets2011 (MB11 &amp; MB12)](docs/regressions-mb11.md), [Tweets2013 (MB13 &amp; MB14)](docs/regressions-mb13.md)
+ Regressions for Complex Answer Retrieval (CAR17): [v1.5](docs/regressions-car17v1.5.md), [v2.0](docs/regressions-car17v2.0.md), [v2.0 with doc2query](docs/regressions-car17v2.0-doc2query.md)
+ Regressions for TREC News Tracks (Background Linking Task): [2018](docs/regressions-backgroundlinking18.md), [2019](docs/regressions-backgroundlinking19.md), [2020](docs/regressions-backgroundlinking20.md)
+ Regressions for [FEVER Fact Verification](docs/regressions-fever.md)
+ Regressions for [NTCIR-8 ACLIA (IR4QA subtask, Monolingual Chinese)](docs/regressions-ntcir8-zh.md)
+ Regressions for [CLEF 2006 Monolingual French](docs/regressions-clef06-fr.md)
+ Regressions for [TREC 2002 Monolingual Arabic](docs/regressions-trec02-ar.md)
@@ -205,11 +222,30 @@ See individual pages for details!
+ Regressions for HC4 (v1.0) baselines on translated NeuCLIR22 corpora: [Persian](docs/regressions-hc4-neuclir22-fa-en.md), [Russian](docs/regressions-hc4-neuclir22-ru-en.md), [Chinese](docs/regressions-hc4-neuclir22-zh-en.md)
+ Regressions for TREC 2022 NeuCLIR Track (query translation): [Persian](docs/regressions-neuclir22-fa-qt.md), [Russian](docs/regressions-neuclir22-ru-qt.md), [Chinese](docs/regressions-neuclir22-zh-qt.md)
+ Regressions for TREC 2022 NeuCLIR Track (document translation): [Persian](docs/regressions-neuclir22-fa-dt.md), [Russian](docs/regressions-neuclir22-ru-dt.md), [Chinese](docs/regressions-neuclir22-zh-dt.md)

</details>
<details>
<summary>Other Regressions</summary>

### Other Regressions

+ Regressions for [Disks 1 &amp; 2 (TREC 1-3)](docs/regressions-disk12.md), [Disks 4 &amp; 5 (TREC 7-8, Robust04)](docs/regressions-disk45.md), [AQUAINT (Robust05)](docs/regressions-robust05.md)
+ Regressions for [the New York Times Corpus (Core17)](docs/regressions-core17.md), [the Washington Post Corpus (Core18)](docs/regressions-core18.md)
+ Regressions for [Wt10g](docs/regressions-wt10g.md), [Gov2](docs/regressions-gov2.md)
+ Regressions for [ClueWeb09 (Category B)](docs/regressions-cw09b.md), [ClueWeb12-B13](docs/regressions-cw12b13.md), [ClueWeb12](docs/regressions-cw12.md)
+ Regressions for [Tweets2011 (MB11 &amp; MB12)](docs/regressions-mb11.md), [Tweets2013 (MB13 &amp; MB14)](docs/regressions-mb13.md)
+ Regressions for Complex Answer Retrieval (CAR17): [v1.5](docs/regressions-car17v1.5.md), [v2.0](docs/regressions-car17v2.0.md), [v2.0 with doc2query](docs/regressions-car17v2.0-doc2query.md)
+ Regressions for TREC News Tracks (Background Linking Task): [2018](docs/regressions-backgroundlinking18.md), [2019](docs/regressions-backgroundlinking19.md), [2020](docs/regressions-backgroundlinking20.md)
+ Regressions for [FEVER Fact Verification](docs/regressions-fever.md)
+ Regressions for DPR Wikipedia QA baselines: [100-word splits](docs/regressions-wikipedia-dpr-100w-bm25.md)

</details>

### Available Corpora

<details>
<summary>Variants of MS MARCO V1 and V2 corpora available for download</summary>

| Corpora | Size | Checksum |
|:------------------------------------------------------------------------------------------------------------------------------------------------|-------:|:-----------------------------------|
| [MS MARCO V1 passage: Quantized BM25](https://rgw.cs.uwaterloo.ca/JIMMYLIN-bucket0/data/msmarco-passage-bm25-b8.tar) | 1.2 GB | `0a623e2c97ac6b7e814bf1323a97b435` |
@@ -226,6 +262,8 @@ See individual pages for details!
| [MS MARCO V2 doc: uniCOIL (noexp)](https://rgw.cs.uwaterloo.ca/JIMMYLIN-bucket0/data/msmarco_v2_doc_segmented_unicoil_noexp_0shot_v2.tar) | 55 GB | `97ba262c497164de1054f357caea0c63` |
| [MS MARCO V2 doc: uniCOIL (d2q-T5)](https://rgw.cs.uwaterloo.ca/JIMMYLIN-bucket0/data/msmarco_v2_doc_segmented_unicoil_0shot_v2.tar) | 72 GB | `c5639748c2cbad0152e10b0ebde3b804` |

</details>
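
The Checksum column in the corpora table above can be used to confirm that a downloaded tarball arrived intact. The following is a minimal sketch, not part of this commit: it assumes the 32-hex-digit values are MD5 digests, and the function names `md5_of_file` and `matches_checksum` are hypothetical helpers introduced only for illustration.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare a file's MD5 digest with the value listed in the table.

    Assumption: the table's checksums are MD5 digests in lowercase hex.
    """
    return md5_of_file(path) == expected_hex.strip().lower()
```

For example, after downloading `msmarco-passage-bm25-b8.tar`, calling `matches_checksum("msmarco-passage-bm25-b8.tar", "0a623e2c97ac6b7e814bf1323a97b435")` should return `True` if the MD5 assumption holds and the file is complete. Reading in chunks matters here because several of the listed corpora are tens of gigabytes.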

## Additional Documentation

The experiments described below are not associated with rigorous end-to-end regression testing and thus provide a lower standard of reproducibility.
@@ -289,6 +327,10 @@ Beyond that, there are always [open issues](https://github.com/castorini/anserin
+ v0.14.2: March 24, 2022 [[Release Notes](docs/release-notes/release-notes-v0.14.2.md)]
+ v0.14.1: February 27, 2022 [[Release Notes](docs/release-notes/release-notes-v0.14.1.md)]
+ v0.14.0: January 10, 2022 [[Release Notes](docs/release-notes/release-notes-v0.14.0.md)]

<details>
<summary>older... (and historic notes)</summary>

+ v0.13.5: November 2, 2021 [[Release Notes](docs/release-notes/release-notes-v0.13.5.md)]
+ v0.13.4: October 22, 2021 [[Release Notes](docs/release-notes/release-notes-v0.13.4.md)]
+ v0.13.3: August 22, 2021 [[Release Notes](docs/release-notes/release-notes-v0.13.3.md)]
@@ -327,6 +369,8 @@ Based on [preliminary experiments](docs/lucene7-vs-lucene8.md), query evaluation
As a result of this upgrade, results of all regressions have changed slightly.
To reproducible old results from Lucene 7.6, use [v0.5.1](https://github.com/castorini/anserini/releases).

</details>

## References

+ Jimmy Lin, Matt Crane, Andrew Trotman, Jamie Callan, Ishan Chattopadhyaya, John Foley, Grant Ingersoll, Craig Macdonald, Sebastiano Vigna. [Toward Reproducible Baselines: The Open-Source IR Reproducibility Challenge.](https://cs.uwaterloo.ca/~jimmylin/publications/Lin_etal_ECIR2016.pdf) _ECIR 2016_.
2 changes: 1 addition & 1 deletion tools
Submodule tools updated 36 files
+4 −0 .gitignore
+29,197 −0 topics-and-qrels/qrels.miracl-v1.0-ar-dev.tsv
+4,206 −0 topics-and-qrels/qrels.miracl-v1.0-bn-dev.tsv
+8,350 −0 topics-and-qrels/qrels.miracl-v1.0-en-dev.tsv
+6,443 −0 topics-and-qrels/qrels.miracl-v1.0-es-dev.tsv
+6,571 −0 topics-and-qrels/qrels.miracl-v1.0-fa-dev.tsv
+12,008 −0 topics-and-qrels/qrels.miracl-v1.0-fi-dev.tsv
+3,429 −0 topics-and-qrels/qrels.miracl-v1.0-fr-dev.tsv
+3,494 −0 topics-and-qrels/qrels.miracl-v1.0-hi-dev.tsv
+9,668 −0 topics-and-qrels/qrels.miracl-v1.0-id-dev.tsv
+8,354 −0 topics-and-qrels/qrels.miracl-v1.0-ja-dev.tsv
+3,057 −0 topics-and-qrels/qrels.miracl-v1.0-ko-dev.tsv
+13,100 −0 topics-and-qrels/qrels.miracl-v1.0-ru-dev.tsv
+5,092 −0 topics-and-qrels/qrels.miracl-v1.0-sw-dev.tsv
+1,606 −0 topics-and-qrels/qrels.miracl-v1.0-te-dev.tsv
+7,573 −0 topics-and-qrels/qrels.miracl-v1.0-th-dev.tsv
+3,928 −0 topics-and-qrels/qrels.miracl-v1.0-zh-dev.tsv
+500 −0 topics-and-qrels/topics.dl22.txt
+ topics-and-qrels/topics.dl22.unicoil-noexp.0shot.tsv.gz
+ topics-and-qrels/topics.dl22.unicoil.0shot.tsv.gz
+2,896 −0 topics-and-qrels/topics.miracl-v1.0-ar-dev.tsv
+411 −0 topics-and-qrels/topics.miracl-v1.0-bn-dev.tsv
+799 −0 topics-and-qrels/topics.miracl-v1.0-en-dev.tsv
+648 −0 topics-and-qrels/topics.miracl-v1.0-es-dev.tsv
+632 −0 topics-and-qrels/topics.miracl-v1.0-fa-dev.tsv
+1,271 −0 topics-and-qrels/topics.miracl-v1.0-fi-dev.tsv
+343 −0 topics-and-qrels/topics.miracl-v1.0-fr-dev.tsv
+350 −0 topics-and-qrels/topics.miracl-v1.0-hi-dev.tsv
+960 −0 topics-and-qrels/topics.miracl-v1.0-id-dev.tsv
+860 −0 topics-and-qrels/topics.miracl-v1.0-ja-dev.tsv
+213 −0 topics-and-qrels/topics.miracl-v1.0-ko-dev.tsv
+1,252 −0 topics-and-qrels/topics.miracl-v1.0-ru-dev.tsv
+482 −0 topics-and-qrels/topics.miracl-v1.0-sw-dev.tsv
+828 −0 topics-and-qrels/topics.miracl-v1.0-te-dev.tsv
+733 −0 topics-and-qrels/topics.miracl-v1.0-th-dev.tsv
+393 −0 topics-and-qrels/topics.miracl-v1.0-zh-dev.tsv
