
Add spellchecker runner and initial generated dictionary (ru) #2720


Open: razum2um wants to merge 1 commit into base: master

razum2um (Contributor) commented Oct 31, 2021

I suggest using yaspeller to spell-check the Markdown content.

  • Russian maintainers are expected to run it locally, so the Rakefile is changed
  • they also need to keep a shared exceptions list in sync (a dictionary, i.e. an array of words or regexps), so a new JSON file is added

Usage:

npm i -g yaspeller
rake 'check:spelling[ru]'

Default output looks like this:

www.ruby-lang.org/ru/news/_posts/2013-02-24-ruby-2-0-0-p0-is-released.md 170 ms
-----
Typos: 2
1. эксперемент (73:32, suggest: эксперимент)
2. отправлии (171:1, suggest: отправили)

www.ruby-lang.org/ru/news/_posts/2013-12-21-ruby-version-policy-changes-with-2-1-0.md 191 ms
-----
Typos: 2
1. yдаление (42:3, en: y*******, ru: *даление, suggest: удаление)
2. oбратно (43:3, en: o******, ru: *братно, suggest: обратно, обратной)
-----

All of those errors are fixed in the related PR: #2719

Note that it suggests replacements and finds locale-mismatched symbols as well.
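
To illustrate what a locale-mismatched symbol is, here is a minimal Python sketch that flags words mixing Latin and Cyrillic letters, such as the Latin "y" in front of "даление" in the output above. It is not how yaspeller works internally; the function names `scripts_in` and `is_mixed_script` are hypothetical, and the check relies only on Unicode character names from the standard library.

```python
import unicodedata


def scripts_in(word):
    """Return the scripts (LATIN and/or CYRILLIC) used by the letters of word."""
    scripts = set()
    for ch in word:
        if not ch.isalpha():
            continue  # digits, hyphens, etc. do not count
        name = unicodedata.name(ch, "")
        if name.startswith("LATIN"):
            scripts.add("LATIN")
        elif name.startswith("CYRILLIC"):
            scripts.add("CYRILLIC")
    return scripts


def is_mixed_script(word):
    """True when a single word contains both Latin and Cyrillic letters."""
    return scripts_in(word) == {"LATIN", "CYRILLIC"}
```

A word typed entirely in one alphabet passes, while a word with a single look-alike letter from the other alphabet is flagged even though it looks correct on screen.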

Update dictionary from scratch

Use this repo (yaspeller-dictionary-builder, checked out next to this one):

rm lib/spelling/ru/dictionary.json
rake 'check:spelling[ru,json]' # generates ./yaspeller_report.json which we git-ignore
cd ../yaspeller-dictionary-builder
python src/dictionary.py ../www.ruby-lang.org/yaspeller_report.json > ../www.ruby-lang.org/lib/spelling/ru/dictionary.json
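
As a rough idea of what such a rebuild involves, here is a Python sketch that turns a parsed report into a sorted JSON array of words. This is not the code of `src/dictionary.py`. The report layout used here (a list of `[error, resource]` pairs, where each resource has a `data` list of typo objects with a `word` key) is an assumption and should be checked against a real `yaspeller_report.json`; the function names are hypothetical.

```python
import json


def flagged_words(report):
    """Collect the unique flagged words from a parsed report.

    ASSUMPTION: report is a list of [error, resource] pairs and each
    resource is a dict whose "data" list holds typo dicts with a "word" key.
    """
    words = set()
    for error, resource in report:
        if error or not isinstance(resource, dict):
            continue  # skip resources that could not be checked
        for typo in resource.get("data", []):
            words.add(typo["word"])
    return words


def build_dictionary(report):
    """Serialize the flagged words as a sorted JSON array, keeping Cyrillic readable."""
    return json.dumps(sorted(flagged_words(report)), ensure_ascii=False, indent=2)
```

A dictionary generated this way accepts every word that was flagged, including real typos, so the resulting list still needs a manual review before it is committed.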

Limitations:

  • Unfortunately, it supports only Russian, Ukrainian, and English. Should we consider adding it for en as well?
  • As the CLI utility is based on a free API and has rate limits, I didn't put it into CI or the standard linter flow. A git hook might make sense, run only if a commit contains changes under ru
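
The git hook idea could look roughly like the Python sketch below, which runs the rake task only when staged files include Markdown under `ru/`. The function names are hypothetical, the `ru/` prefix assumes paths relative to the repository root, and whether the rake task exits non-zero when typos are found is also an assumption.

```python
import subprocess


def staged_files():
    """Return paths staged for commit, relative to the repository root."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()


def needs_spellcheck(paths, lang="ru"):
    """True if any path is a Markdown file under the given language directory."""
    return any(p.startswith(lang + "/") and p.endswith(".md") for p in paths)


def main():
    """Hook entry point: skip the (rate-limited) check unless ru content changed."""
    if not needs_spellcheck(staged_files()):
        return 0
    # ASSUMPTION: the rake task signals typos through its exit status.
    return subprocess.run(["rake", "check:spelling[ru]"]).returncode
```

Saved as an executable pre-commit hook that exits with the value of `main()`, this would keep most commits from touching the API at all.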

razum2um requested review from a team as code owners October 31, 2021 19:21
razum2um changed the title from "Add spellchecker runner and initial generated dictionary" to "Add spellchecker runner and initial generated dictionary (ru)" Oct 31, 2021
lex111 (Member) commented Oct 31, 2021

Thanks for the suggestion, but I would prefer to add this check to CI (GitHub Actions) as a separate workflow.

> As the CLI utility is based on a free API and has rate limits, I didn't put it into CI

Actually, it is not a big deal: I use yaspeller on many projects and have never run into its rate limits. Moreover, yaspeller can be made to run only when files related to the Russian translation have changed.

Nakilon (Contributor) commented Nov 13, 2021

Personally I used hunspell like this:

hunspell -d dict-en-20210701/en_US,dict_ru_ru-aot-0.4.5/russian-aot -p my.txt -l ru/news/_posts/202* | sort | uniq

Their README says to download the dictionaries in .oxt format, rename them to .zip, and unpack them.
The my.txt in the example is a text file of words to skip, for example:

Alexandr
CVE
Gemfile
RDoc
Savca
aycabta
bundler
lang
...
