Ул/Бул #13

mansayk · 2018-12-09T08:09:04Z

Is "Ул/бул" parsed correctly here:

echo "Ул ташламас сине." | apertium-destxt -n | lt-proc -z -w 'apertium-tat/tat.automorf.bin' | cg-proc -z 'apertium-tat/tat.rlx.bin' | apertium-retxt

^Ул/бул<v><tv><imp><p2><sg>/ул<prn><pers><p3><sg><nom>/бул<v><iv><imp><p2><sg>/ул<prn><dem><nom>$ ^ташламас/ташла<v><tv><neg><gpr_fut>/ташла<v><tv><neg><fut><p3><sg>$ ^сине/син<prn><pers><p2><sg><acc>$^./.<sent>$
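To make the ambiguity easier to see, here is a minimal sketch that splits the first lexical unit of that output into its surface form and readings. It assumes the usual Apertium stream layout (`^surface/reading/reading$`) and ignores escaped characters; `parse_lexical_unit` is a hypothetical helper, not part of any Apertium tool.

```python
import re

def parse_lexical_unit(unit):
    """Split one stream-format lexical unit into (surface, readings).

    Simplification: escaped '/' or '<' inside a form are not handled.
    """
    body = unit.strip()
    if not (body.startswith("^") and body.endswith("$")):
        raise ValueError("not a lexical unit: %r" % unit)
    surface, *analyses = body[1:-1].split("/")
    readings = []
    for analysis in analyses:
        lemma = analysis.split("<", 1)[0]          # text before the first tag
        tags = re.findall(r"<([^>]+)>", analysis)  # tag names without brackets
        readings.append((lemma, tags))
    return surface, readings

unit = ("^Ул/бул<v><tv><imp><p2><sg>/ул<prn><pers><p3><sg><nom>"
        "/бул<v><iv><imp><p2><sg>/ул<prn><dem><nom>$")
surface, readings = parse_lexical_unit(unit)
for lemma, tags in readings:
    print(surface, "->", lemma, " ".join(tags))
```

Two of the four readings of "Ул" are imperatives of the verb бул, alongside the personal and demonstrative pronoun readings of ул.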


IlnarSelimcan · 2018-12-09T09:32:29Z

The бул v iv analysis was added to deal with the 19th-century corpus texts I've been working on, in which улмак (= булмак) shows up quite frequently. I think the way to go here is to mark all such archaic words with some flag and prune them at compile time, unless the user passes a compilation flag that keeps them.
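The flag-and-prune idea could look roughly like the sketch below. Everything in it is an assumption made for illustration: the `! Use/Archaic` marker, the `prune_archaic` helper, and the two sample lexicon lines are invented and do not reflect how apertium-tat is actually organised or built.

```python
# Hypothetical marker placed in a comment on archaic entries.
ARCHAIC_MARK = "! Use/Archaic"

def prune_archaic(lines, keep_archaic=False):
    """Return lexicon lines, dropping marked archaic entries by default."""
    if keep_archaic:
        return list(lines)
    return [line for line in lines if ARCHAIC_MARK not in line]

# Invented sample entries, only to exercise the filter.
lexicon = [
    'бул:бул V-IV ; ! "to be"',
    'бул:ул V-IV ; ! "to be" ! Use/Archaic',
]

default_build = prune_archaic(lexicon)                     # archaic entry dropped
archaic_build = prune_archaic(lexicon, keep_archaic=True)  # everything kept
print(len(default_build), len(archaic_build))
```

A default build would then leave out the ул = бул entry, while someone working on older texts could opt back in with the equivalent of `keep_archaic=True`.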

mansayk · 2018-12-09T09:38:57Z

I understand. Unfortunately, in my case this one is even chosen after disambiguation.

jonorthwash · 2019-01-16T16:41:24Z

@IlnarSelimcan, it would probably be fairly straightforward to write a disambiguation rule to deal with some of these.

Alternatively, sometimes it can make sense to just treat things like бул-/ул- as synonyms, and deal with them as such in later stages for translation.
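As a rough illustration of what a disambiguation rule of this kind might do, the sketch below removes the verbal бул readings from a cohort whose surface form is "ул" whenever a pronoun reading is also present. It is a plain Python stand-in for the logic, not a constraint grammar rule and not a rule from apertium-tat; `drop_archaic_bul` and the `(lemma, tags)` representation are assumptions.

```python
def drop_archaic_bul(surface, readings):
    """Remove бул verb readings from an 'ул' cohort that has a pronoun reading.

    readings is a list of (lemma, tags) pairs; the list is returned unchanged
    when the rule does not apply or would delete every reading.
    """
    if surface.lower() != "ул":
        return readings
    has_pronoun = any(lemma == "ул" and "prn" in tags for lemma, tags in readings)
    if not has_pronoun:
        return readings
    kept = [(lemma, tags) for lemma, tags in readings
            if not (lemma == "бул" and "v" in tags)]
    return kept or readings

cohort = [
    ("бул", ["v", "tv", "imp", "p2", "sg"]),
    ("ул", ["prn", "pers", "p3", "sg", "nom"]),
    ("бул", ["v", "iv", "imp", "p2", "sg"]),
    ("ул", ["prn", "dem", "nom"]),
]
remaining = drop_archaic_bul("Ул", cohort)
print([lemma for lemma, _ in remaining])
```

A blanket rule like this would also discard the reading in the archaic texts where it is wanted, so a real rule would likely need contextual conditions or be tied to the build-time flag discussed earlier in the thread.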

mansayk added the enhancement New feature or request label Jan 16, 2019

mansayk added the disambiguation label Jan 16, 2019
