Token ids generated instead of translation #3

Open
ahmedoumar opened this issue Jul 7, 2022 · 6 comments · May be fixed by #5

Comments

@ahmedoumar

ahmedoumar commented Jul 7, 2022

Hey there, I hope you're doing fine.
When I run turj.translate, it returns the token ids instead of the actual translation.
(See the output below.)
2022-07-07 10:41:43 | INFO | turjuman.translate | Using beam search
tensor([[ 0, 6538, 2, 76, 6380, 1]])

@elmadany
Member

elmadany commented Jul 7, 2022

Hi Ahmed,
Could you please provide more details, such as your input sentence and a screenshot?
Thanks

@ahmedoumar
Author

[Screenshot from 2022-07-07 12-02-16]
As you can see, turj.translate returns output ids instead of a translation. I solved this by using the tokenizer to decode the ids back into text:
tokenizer.decode(target, skip_special_tokens=True, clean_up_tokenization_spaces=True)

@elmadany
Member

elmadany commented Jul 7, 2022

To integrate Turjuman with your python code, take a look at this notebook.
https://colab.research.google.com/github/UBC-NLP/turjuman/blob/main/examples/Integrate_turjuman_with_your_code.ipynb
Thanks

@ahmedoumar
Author

When you run that notebook, you get only the target ids, as shown in the screenshot.

@elmadany
Member

elmadany commented Jul 7, 2022

Thanks, Ahmed. We will check this soon.

@kabapy

kabapy commented Sep 11, 2022

Quick fix:
result = torj.tokenizer.batch_decode(target, skip_special_tokens=True)
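
For readers unsure what the decode step does, here is a minimal sketch of the idea. It does not use Turjuman or its tokenizer: the vocabulary, the set of special ids, and the helper name `toy_batch_decode` are all hypothetical, chosen only so the example runs on its own. It assumes that decoding amounts to looking up each id in a vocabulary and optionally dropping special tokens, which is roughly what `skip_special_tokens=True` asks for in the fix above.

```python
from typing import Dict, List, Set

# Hypothetical toy vocabulary; these are NOT Turjuman's real token mappings.
TOY_VOCAB: Dict[int, str] = {
    0: "<pad>",
    1: "</s>",
    2: "<unk>",
    76: "world",
    6380: "!",
    6538: "hello",
}
# Assumed special-token ids for this toy example only.
SPECIAL_IDS: Set[int] = {0, 1, 2}


def toy_batch_decode(batch: List[List[int]], skip_special_tokens: bool = True) -> List[str]:
    """Turn each row of token ids into one string, one string per row."""
    decoded = []
    for row in batch:
        tokens = [
            TOY_VOCAB.get(token_id, "<unk>")
            for token_id in row
            if not (skip_special_tokens and token_id in SPECIAL_IDS)
        ]
        decoded.append(" ".join(tokens))
    return decoded


# Same shape as the tensor reported in this issue, written as a nested list.
target = [[0, 6538, 2, 76, 6380, 1]]
print(toy_batch_decode(target))                             # ['hello world !']
print(toy_batch_decode(target, skip_special_tokens=False))  # ['<pad> hello <unk> world ! </s>']
```

The point of the sketch is only that the model output is a batch of id sequences, so a batch-level decode returns a list with one string per input sentence.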

@mohammad-albarham mohammad-albarham linked a pull request Apr 9, 2023 that will close this issue
3 participants