Make Japanese Lorem sentences look more natural #1918
Hm, since the text is a sequence of random words, each word seems easier to understand when there are spaces between them.
% ruby -rfaker -e 'p Faker::VERSION'
"2.10.1"
% ruby -rfaker -e 'Faker::Config.locale = "ja"; p Faker::Lorem.sentence'
"好き きょだい 出版 超音波。"
% ruby -rfaker -e 'Faker::Config.locale = "ja"; p Faker::Lorem.sentences'
["約する 殻 書き方。", "察知 そあく 割り箸。", "いちだい むらさきいろ 太る。"]
% ruby -rfaker -e 'Faker::Config.locale = "ja"; p Faker::Lorem.paragraph'
"誘惑 しずむ 量。 騎兵 全日本 けいむしょ。 きんく こうせい 飽くまでも。"
% ruby -rfaker -e 'Faker::Config.locale = "ja"; p Faker::Lorem.paragraphs'
["長唄 かん こうおつ。 既に 頂く えきびょう。 旧姓 金星 はなはだ。", "平安 あう まもる。 とうさん しずむ れつあく 。 ひきざん 不思議 伐採。", "弥生 退く 地面。 つなひき よくげつ 乗せる。 輸出 ぶっきょう 見当たる。"]
IMHO, it seems better to keep separating words in Japanese with spaces (わかち書き).
I can see how it looks like わかち書き (wakachigaki) because of all the hiragana-only words. They were added in #900 from a Kanji learning app, which means they may have originally been furigana. Perhaps I could remove all of these furigana words if they contribute to the awkwardness of the text?
The reason I am proposing to remove the spaces is that lorem ipsum text is supposed to look like real-world text without meaningful content, so that designers can use it to design layouts. https://en.wikipedia.org/wiki/Lorem_ipsum
As far as I can tell, real-world Japanese text does not use wakachigaki unless it is targeted at non-native speakers. I do realise that most of us use Faker to generate dummy data for our tests, and that the spaces don't make much difference in that use case. I won't pursue the matter any further if you think this is YAGNI 😃
Thanks for the explanation. Actually, I don't have a strong opinion on this. I'll leave it to the maintainers.
I agree that lorem text should look closer to real Japanese gibberish.
Changes look good to me, but can you also add a Japanese locale test to ensure spaces are removed?
@@ -75,6 +75,7 @@ def test_ja_lorem_methods
 assert Faker::Lorem.words.is_a? Array
 assert Faker::Lorem.words(number: 1000)
 assert Faker::Lorem.words(number: 10_000, supplemental: true)
+assert_not_match(/ /, Faker::Lorem.paragraph)
> can you also add a Japanese locale test to ensure spaces are removed?

@Zeragamba Sorry, I assumed that this would be enough to cover the changes. Could you specify what else it needs?
Oh! Sorry, I thought that was another file. :derp:
All good then!
Thank you! ❤️
Issue #1917
Description:
This pull request makes Japanese Faker::Lorem sentences look more natural by:
This affects the following methods:
Faker::Lorem.sentence
Faker::Lorem.sentences
Faker::Lorem.paragraph
Faker::Lorem.paragraphs
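As a rough illustration of the behaviour discussed in this thread, the sketch below joins words and sentences with locale-specific separators so that Japanese output contains no spaces. It is not Faker's actual implementation: the SEPARATORS table and the build_sentence and build_paragraph helpers are hypothetical names made up for this example, and the sample words are taken from the output quoted earlier in the conversation.

```ruby
# Hypothetical sketch only; these names do not exist in Faker.
# Each locale defines how words and sentences are joined.
SEPARATORS = {
  'en' => { word: ' ', period: '.', sentence: ' ' },
  'ja' => { word: '',  period: '。', sentence: '' }
}.freeze

# Join words into one sentence and append the locale's period.
def build_sentence(words, locale)
  sep = SEPARATORS.fetch(locale)
  words.join(sep[:word]) + sep[:period]
end

# Join already-built sentences into one paragraph.
def build_paragraph(sentences, locale)
  sentences.join(SEPARATORS.fetch(locale)[:sentence])
end

ja_sentences = [
  build_sentence(%w[誘惑 しずむ 量], 'ja'),
  build_sentence(%w[騎兵 全日本 けいむしょ], 'ja')
]
ja_paragraph = build_paragraph(ja_sentences, 'ja')

puts ja_paragraph
puts build_sentence(%w[lorem ipsum dolor], 'en')
```

With this approach the same check used in the locale test, that the paragraph does not match / /, holds for the 'ja' output, while the 'en' output keeps its spaces.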