Ул/Бул #13

Open
mansayk opened this issue Dec 9, 2018 · 3 comments
Labels
disambiguation, enhancement (New feature or request)

Comments

@mansayk
Member

mansayk commented Dec 9, 2018

Is "Ул/бул" parsed correctly here:

echo "Ул ташламас сине." | apertium-destxt -n | lt-proc -z -w 'apertium-tat/tat.automorf.bin' | cg-proc -z 'apertium-tat/tat.rlx.bin' | apertium-retxt

^Ул/бул<v><tv><imp><p2><sg>/ул<prn><pers><p3><sg><nom>/бул<v><iv><imp><p2><sg>/ул<prn><dem><nom>$ ^ташламас/ташла<v><tv><neg><gpr_fut>/ташла<v><tv><neg><fut><p3><sg>$ ^сине/син<prn><pers><p2><sg><acc>$^./.<sent>$

@IlnarSelimcan
Member

The бул<v><iv> analysis was added to handle the 19th-century corpus texts I've been working on, where улмак = булмак shows up quite frequently. I think the way to go here is to mark all such archaic words with a flag and prune them at compile time, unless the user passes a compilation flag that keeps them.
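
A minimal sketch of the flag-and-prune idea, assuming archaic lexicon entries carry a marker in a trailing comment. The marker name `Use/Arch`, the `KEEP_ARCHAIC` variable, the `prune_archaic` function, and the sample entries are all hypothetical and are not taken from the apertium-tat sources:

```shell
# Filter lexicon lines on stdin before they reach the compiler.
# Lines tagged with the (hypothetical) Use/Arch marker are dropped,
# unless the caller sets KEEP_ARCHAIC=1 to keep them.
prune_archaic() {
    if [ "${KEEP_ARCHAIC:-0}" = "1" ]; then
        cat
    else
        # grep -v exits non-zero when every line is filtered out;
        # treat that as success so a pipeline does not abort.
        grep -v 'Use/Arch' || true
    fi
}

# Illustrative entries only; continuation class names are made up.
sample_lexicon() {
    printf '%s\n' \
        'ул:ул PRN-PERS ;' \
        'бул:ул V-IV ; ! Use/Arch'
}

sample_lexicon | prune_archaic                 # modern build: archaic entry dropped
sample_lexicon | KEEP_ARCHAIC=1 prune_archaic  # historical build: entry kept
```

With a filter like this in the build step, the default analyser would no longer offer the archaic reading, while a build for historical corpora could opt back in.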

@mansayk
Member Author

mansayk commented Dec 9, 2018

I understand. Unfortunately, in my case this one is even chosen after disambiguation.

@mansayk added the enhancement label Jan 16, 2019
@jonorthwash
Member

@IlnarSelimcan, it would probably be fairly straightforward to write a disambiguation rule to deal with some of these.
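
As a rough illustration of what such a rule might look like, the sketch below writes a one-rule CG-3 grammar to a scratch file and compiles it if `cg-comp` happens to be installed. The rule is an assumption, not something taken from the apertium-tat rule file: it removes the бул<v><iv> reading only when the surface form is "ул" and a pronoun reading of ул is also available. Whether the бул<v><tv> reading should be treated the same way is left open.

```shell
# Scratch file for the hypothetical rule; not part of apertium-tat.
rlx=$(mktemp)

cat > "$rlx" <<'EOF'
DELIMITERS = "<.>" "<!>" "<?>" ;

SECTION

# Drop the archaic бул<v><iv> reading on the surface form "ул"
# (case-insensitive) when a pronoun reading of ул is present.
REMOVE ("бул" v iv) IF (0 ("<ул>"i)) (0 ("ул" prn)) ;
EOF

# Compile only when the CG-3 compiler is available on this machine.
if command -v cg-comp >/dev/null 2>&1; then
    cg-comp "$rlx" "$rlx.bin"
fi
```

Because REMOVE never deletes the last remaining reading of a token, a rule of this shape would leave the бул analysis in place for texts where no other analysis applies.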

Alternatively, sometimes it can make sense to just treat things like бул-/ул- as synonyms, and deal with them as such in later stages for translation.

3 participants