Esperanto Support #62

Open
Fierthraix opened this issue Feb 25, 2020 · 0 comments
Hello, I would like to add support for the Esperanto language, as several downstream projects I use depend on lunr-languages. It is a constructed language invented in 1887 by Dr. L. L. Zamenhof, and it has over 2 million speakers worldwide.

Fortunately, due to the extreme regularity of the language (its grammar has only 16 rules), implementing this should be a lot easier than for other languages.

Advice Needed:

I don't normally work with JavaScript, so I was wondering if anyone involved with the project can help me out with a few things:

  • Does the stop-words function run before the stemmer? It would greatly reduce the burden if stop-words are filtered out before they reach the stemmer. Otherwise, I will end up having to reimplement the stop-words list inside the stemmer, as most of the stop-words are prepositions and similar grammatical words that have irregular endings.

  • Many other languages have very complicated hundred-line stemmer functions, but in Esperanto, once you filter out the special grammatical words, every word ends with one of: -is, -as, -os, -us, -u, -e, -en, -a, -an, -aj, -ajn, -o, -on, -oj, or -ojn. With that said, my stemmer function can be as simple as returning the string with the ending cut off (this always results in a valid word root). I wasn't sure whether I needed to use the SnowballFunction or not.
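
To make the second point concrete, here is a minimal sketch of the kind of suffix-stripping stemmer I have in mind. It is plain JavaScript with no lunr dependency; `stemEsperanto`, `ENDINGS`, and `STOP_WORDS` are hypothetical names for illustration, the stop-word sample is deliberately tiny, and the two-character minimum root length is my own assumption rather than a rule of the language. How this would be wired into lunr's pipeline is exactly the part I'm unsure about.

```javascript
// Hypothetical sketch, not part of lunr-languages.
// Endings from the list above. No ending is a suffix of another,
// so the order in which they are tried does not change the result.
const ENDINGS = [
  "ojn", "ajn",
  "oj", "aj", "on", "an", "en", "is", "as", "os", "us",
  "o", "a", "e", "u",
];

// Tiny illustrative sample of grammatical words; a real list would be much longer.
const STOP_WORDS = new Set(["la", "kaj", "de", "en", "al", "por"]);

function stemEsperanto(word) {
  const w = word.toLowerCase();

  // Assumption: grammatical words are passed through untouched
  // (ideally they would already have been filtered out before this step).
  if (STOP_WORDS.has(w)) {
    return w;
  }

  for (const ending of ENDINGS) {
    // Assumption: keep at least two characters of root so that very
    // short words are not reduced to a single letter.
    if (w.endsWith(ending) && w.length - ending.length >= 2) {
      return w.slice(0, w.length - ending.length);
    }
  }

  // No known ending matched; return the word unchanged.
  return w;
}

console.log(stemEsperanto("hundojn")); // prints "hund"
console.log(stemEsperanto("parolas")); // prints "parol"
console.log(stemEsperanto("la"));      // prints "la"
```

If stop-words really are removed before the stemmer runs, the `STOP_WORDS` check could be dropped entirely and the function would reduce to the loop over `ENDINGS`.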

I'm currently working on Esperanto support on my fork, if anyone has any advice or wants to point out any obvious JS flaws I missed.
