Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Running pretrained models on Other datasets #15

Open
Cosmopal opened this issue Jun 30, 2019 · 0 comments
Open

Running pretrained models on Other datasets #15

Cosmopal opened this issue Jun 30, 2019 · 0 comments

Comments

@Cosmopal
Copy link

Hi,
You some great work here! Is there a way to run your pre trained models on another dataset? I tried just replacing the train.document and train.summary files with other data, but the final-test-output-convs2s-checkpoint-best.pt results were totally unrelated, and repeated. It seems it is still trying to map my custom values to previously seen titles??

Here's what I did:
I was not sure which file is the data read from for the test, so I replaced train.document, test.document, valid.document, validation.document all with the texts (same in each) and train.summary, test.summary, valid.summary, validation.summary with the titles. (same in each). I copied he dict.document.txt and dict.summary.txt from your original tar.

Then I ran

cd XSum-ConvS2S
python generate.py ./convs2s-emnlp18/data-convs2s --path ./convs2s-emnlp18/checkpoints-convs2s/checkpoint-best.pt --batch-size 1 --beam 10 --replace-unk --source-lang document --target-lang summary > test-output-convs2s-checkpoint-best.pt
cd ..
python scripts/extract-hypothesis-fairseq.py -o XSum-ConvS2S/test-output-convs2s-checkpoint-best.pt -f final-test-output-convs2s-checkpoint-best.pt
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant