JMNEDict support #8

Closed
aehlke opened this issue Jul 16, 2017 · 4 comments

Comments

@aehlke

aehlke commented Jul 16, 2017

Having this data for the JMNEDict/ENAMDICT dictionary of names would be fantastic!

@Doublevil
Owner

Done. Don't hesitate to report any errors you spot in the file; I only checked a few cases.

@aehlke
Author

aehlke commented Jul 18, 2017

Wow, thank you! 👯‍♂️

@fasiha

fasiha commented Jul 18, 2017

Thanks @Doublevil!

A lazy question (I should check this myself): how much overlap is there between JMDICT and the newly added JMNEDict entries? I ask because I wonder whether, when I look up a word in JmdictFurigana, I now need to add further checks to tell whether the search results are names or not.

@Doublevil
Owner

@fasiha If you access the entries by a combination of the kanji and kana readings, there should be very little to no meaningful overlap. By "meaningful overlap" I mean entries with the same kanji and kana readings that differ in how the reading is cut. The methods used to compute the "cuts" are the same for both files, with the exception that nanori readings are included for the JMNEDICT.

So if you find a word in the JmdictFurigana file (again, using both kanji and kana readings as the key), you can assume it will be either absent from the JmnedictFurigana file or exactly the same there, and vice versa.
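
To make the "kanji plus kana as a key" idea concrete, here is a minimal sketch in Python. It assumes each line of the text files has the shape `text|reading|furigana`; that layout, the helper names `parse_lines` and `conflicting_keys`, and the sample lines are illustrative assumptions, not taken from the actual files.

```python
from typing import Dict, Iterable, List, Tuple

# Lookup key: (kanji text, kana reading).
Key = Tuple[str, str]


def parse_lines(lines: Iterable[str]) -> Dict[Key, str]:
    """Index entries by (text, reading); the value is the raw furigana cut.

    Assumes a 'text|reading|furigana' line layout (an assumption for this sketch).
    """
    index: Dict[Key, str] = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        text, reading, cuts = line.split("|", 2)
        index[(text, reading)] = cuts
    return index


def conflicting_keys(a: Dict[Key, str], b: Dict[Key, str]) -> List[Key]:
    """Return keys present in both indexes whose furigana cuts differ."""
    return sorted(k for k in a.keys() & b.keys() if a[k] != b[k])


# Made-up sample lines standing in for the two files.
jmdict = parse_lines([
    "日本|にほん|0:に;1:ほん",
    "大人|おとな|0-1:おとな",
])
jmnedict = parse_lines([
    "日本|にほん|0:に;1:ほん",
    "田中|たなか|0:た;1:なか",
])

shared = sorted(jmdict.keys() & jmnedict.keys())
conflicts = conflicting_keys(jmdict, jmnedict)
print(shared)
print(conflicts)
```

If the claim above holds for the real files, `conflicts` should come back empty or nearly empty, so a lookup keyed this way can use whichever file contains the entry.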

Now, if you're looking to determine whether a word can or cannot be a name, I don't think this resource is the appropriate way to do so.
