Build out cosDPR-distil regressions for TREC 2019 and TREC 2020 for Anserini #12

Closed
lintool opened this issue Sep 15, 2023 · 3 comments

Comments

Member

lintool commented Sep 15, 2023

Here is a concrete task. If you look at our Anserini regressions, under "MS MARCO V1 Passage Regressions", we have missing entries for TREC 2019 and TREC 2020.

[Screenshot, 2023-09-15]

The concrete task is to build these regressions.

We need to encode the queries and then convert them into Anserini's JSON format. @MXueguang might actually have them encoded already (in numpy?)... in which case we just have to convert them over.
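For illustration, the conversion step could look roughly like the sketch below. It is not the script that was actually used for this task. It assumes each query is written as one JSON object per line with the keys `qid` and `vector`; those key names, the function name `vectors_to_jsonl`, and the output path are assumptions for this example and should be checked against the existing cosDPR-distil topics files in Anserini. The function accepts any sequence of float sequences, so a 2-D numpy array of encoded queries can be passed in directly.

```python
import json
import os
import tempfile


def vectors_to_jsonl(qids, vectors, out_path):
    """Write one JSON object per query, one per line.

    The key names "qid" and "vector" are assumptions for this sketch;
    verify them against the topics files Anserini already ships.
    """
    if len(qids) != len(vectors):
        raise ValueError("qids and vectors must have the same length")
    with open(out_path, "w", encoding="utf-8") as f:
        for qid, vec in zip(qids, vectors):
            # Cast to built-in float so numpy scalars serialize cleanly.
            record = {"qid": str(qid), "vector": [float(x) for x in vec]}
            f.write(json.dumps(record) + "\n")
    return len(qids)


if __name__ == "__main__":
    # Toy data standing in for real encoded queries (hypothetical values).
    toy_qids = ["q1", "q2"]
    toy_vectors = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "toy-queries.jsonl")
        count = vectors_to_jsonl(toy_qids, toy_vectors, path)
        print(f"wrote {count} queries to {path}")
```

If the encoded queries are stored in a `.npy` file, loading them with `numpy.load` and passing the resulting array as `vectors` should work with this function unchanged, provided the query ids are kept in the same order as the rows.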

Warmup tasks:

(1) Reproduce cosDPR-distil on MS MARCO: https://github.com/castorini/anserini/blob/master/docs/regressions/regressions-msmarco-passage-cos-dpr-distil.md - make sure you can get it running on student linux env.
(2) To understand the context of what you're doing, read: https://cs.uwaterloo.ca/~jimmylin/publications/Ma_etal_CIKM2023.pdf

This is related to #3 - @pratyushpal and @mchlp you might be interested.

@pratyushpal

@lintool I'll work on it!


pratyushpal commented Sep 22, 2023

This task was solved in this PR: castorini/anserini#2204

Member Author

lintool commented Nov 28, 2023

This is done, closing.

lintool closed this as completed Nov 28, 2023