This repository contains code for the paper "Decomposing Complex Queries for Tip-of-the-tongue Retrieval".
The dataset is released on Hugging Face at: https://huggingface.co/datasets/nlpkevinl/whatsthatbook
Create a conda environment:
conda create -n whatsthatbook python=3.9 --yes
conda activate whatsthatbook
Install the requirements:
pip install -r requirements.txt
To generate subqueries, provide a prompt, a file of in-context examples, and the original query file. Our example uses OpenAI
as the LLM decomposer, so first set the required environment variable OPENAI_API_KEY.
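A quick sanity check that the key is visible to Python before running the script (illustrative only):

import os
# The openai client reads the key from this environment variable.
assert "OPENAI_API_KEY" in os.environ, "Set OPENAI_API_KEY before running clue_extraction.py"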
Example command for the cover prompt:
python clue_extraction.py --prompt_text_file prompts/cover.prompt.txt --examples_file prompts/examples/cover_clue_examples.jsonl --input_file ./data/2022-05-30_14441_gold_posts.cover_clues.jsonl --output_file ./data/debug.2022-05-30_14441_gold_posts.cover_clues.jsonl --max_examples 1
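For intuition, here is a minimal sketch of how prompt-based clue extraction with the OpenAI API can work. It is not the repo's clue_extraction.py; the model name, field names ("query", "clues"), and prompt layout are assumptions, and it uses the pre-1.0 openai Python client.

import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

def extract_clues(prompt_text, examples, query, model="gpt-3.5-turbo"):
    # Build a few-shot prompt: instruction, then worked examples, then the new query.
    demos = "\n\n".join(
        f"Query: {ex['query']}\nClues: {ex['clues']}" for ex in examples
    )
    full_prompt = f"{prompt_text}\n\n{demos}\n\nQuery: {query}\nClues:"
    response = openai.ChatCompletion.create(
        model=model,
        messages=[{"role": "user", "content": full_prompt}],
        temperature=0.0,
    )
    return response["choices"][0]["message"]["content"].strip()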
To finetune dense models on the text-based metadata, run:
python finetuning.py --model_path bert-base-uncased \
--eval_data <path to dev> \
--train_data <path to train> \
--output_dir <path to output> \
--total_steps 10000 \
--save_freq 5000 \
--per_gpu_batch_size 16
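As a conceptual illustration (not the repo's finetuning.py), dense bi-encoders of this kind are typically fine-tuned with a contrastive objective using in-batch negatives; a Contriever-style sketch with mean pooling might look like this:

import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")

def embed(texts):
    # Encode a list of strings and mean-pool token embeddings over non-padding positions.
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    hidden = encoder(**batch).last_hidden_state
    mask = batch["attention_mask"].unsqueeze(-1)
    return (hidden * mask).sum(1) / mask.sum(1)

def in_batch_loss(queries, passages, temperature=0.05):
    # The i-th passage is the positive for the i-th query; all others act as negatives.
    q, p = embed(queries), embed(passages)
    scores = q @ p.T / temperature
    labels = torch.arange(len(queries))
    return F.cross_entropy(scores, labels)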
To evaluate trained models, build indices following /baselines/contriever/README.md,
then run passage_retrieval_all.py to generate output files and metrics.
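As an illustration, recall@k over a retrieval output file can be computed along these lines; the field names ("retrieved", "id", "gold_id") are hypothetical and may differ from what passage_retrieval_all.py writes.

import json

def recall_at_k(results_path, k=10):
    # Fraction of queries whose gold book appears in the top-k retrieved candidates.
    hits, total = 0, 0
    with open(results_path) as f:
        for line in f:
            ex = json.loads(line)
            top_k_ids = [cand["id"] for cand in ex["retrieved"][:k]]
            hits += int(ex["gold_id"] in top_k_ids)
            total += 1
    return hits / total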
Please consider citing our work if you find it useful:
@misc{lin-etal:2023:arxiv,
author = {Kevin Lin and Kyle Lo and Joseph Gonzalez and Dan Klein},
title = {Decomposing Complex Queries for Tip-of-the-tongue Retrieval},
note = {arXiv:2305.15053},
year = {2023}
}