Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Make Japanese Lorem sentences look more natural #1918

Merged
merged 1 commit into from
Jan 27, 2020

Conversation

rastamhadi
Copy link
Contributor

Issue #1917

#1917

Description:

This pull request makes Japanese Faker::Lorem sentences look more natural by:

  • removing half-width spaces between Japanese sentences.
  • removing full-width spaces between Japanese words.

This affects the following methods:

  • Faker::Lorem.sentence
  • Faker::Lorem.sentences
  • Faker::Lorem.paragraph
  • Faker::Lorem.paragraphs

@rastamhadi rastamhadi requested a review from koic January 23, 2020 03:27
@koic
Copy link
Member

koic commented Jan 23, 2020

Hm, it seems that each word is easier to understand if there is space because it is a sequence of random words.

% ruby -rfaker -e 'p Faker::VERSION'
"2.10.1"
% ruby -rfaker -e 'Faker::Config.locale = "ja"; p Faker::Lorem.sentence'
"好き きょだい 出版 超音波。"
% ruby -rfaker -e 'Faker::Config.locale = "ja"; p Faker::Lorem.sentences'
["約する 殻 書き方。", "察知 そあく 割り箸。", "いちだい むらさきいろ 太る。"]
% ruby -rfaker -e 'Faker::Config.locale = "ja"; p Faker::Lorem.paragraph'
"誘惑 しずむ 量。 騎兵 全日本 けいむしょ。 きんく こうせい 飽くまでも。"
% ruby -rfaker -e 'Faker::Config.locale = "ja"; p Faker::Lorem.paragraphs'
["長唄 かん こうおつ。 既に 頂く えきびょう。 旧姓 金星 はなはだ。", "平安 あう まもる。 とうさん しずむ れつあく 。 ひきざん 不思議 伐採。", "弥生 退く 地面。 つなひき よくげつ 乗せる。 輸出 ぶっきょう 見当たる。"]

IMHO, it seems better to keep separating words in Japanese with spaces (わかち書き).

@rastamhadi
Copy link
Contributor Author

rastamhadi commented Jan 23, 2020

I can see how it looks like わかち書き (wakachigaki) because of all the hiragana-only words. They were added in #900 from a Kanji learning app, which means they may have originally been furigana. Perhaps I could remove all of these furigana words if they contribute to the awkwardness of the text?

The reason I am proposing to remove the spaces is because lorem ipsum text is supposed to look like real-world text without meaningful content, so that designers can use it to design layouts.

https://en.wikipedia.org/wiki/Lorem_ipsum

In publishing and graphic design, Lorem ipsum is a placeholder text commonly used to demonstrate the visual form of a document or a typeface without relying on meaningful content.

As far as I can tell, Japanese real-world text does not use wakachigaki unless it is targeted at non-native speakers.

I do realise that most of us use Faker to generate dummy data for our tests, and that the spaces don't make much difference in that use case. I won't pursue the matter any further if you think this is YAGNI 😃

@koic
Copy link
Member

koic commented Jan 23, 2020

Thanks for the explanation. Actually, I don't have a strong opinion on this. Leave it to a maintainers.

Copy link
Contributor

@Zeragamba Zeragamba left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that lorem text should look closer to real Japanese gibberish.

Changes look good to be, but can you also add a Japanese locale test to ensure spaces are removed?

@@ -75,6 +75,7 @@ def test_ja_lorem_methods
assert Faker::Lorem.words.is_a? Array
assert Faker::Lorem.words(number: 1000)
assert Faker::Lorem.words(number: 10_000, supplemental: true)
assert_not_match(/ /, Faker::Lorem.paragraph)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you also add a Japanese locale test to ensure spaces are removed?

@Zeragamba
Sorry, I assumed that this would be enough to cover the changes. Could you specify what else it needs?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh! Sorry, I thought that was another file. :derp:

All good then!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! ❤️

@rastamhadi rastamhadi removed the request for review from koic January 24, 2020 03:54
@vbrazo vbrazo merged commit 4a0d6c5 into faker-ruby:master Jan 27, 2020
@rastamhadi rastamhadi deleted the rastamhadi-patch-1 branch September 26, 2021 10:21
# for free to join this conversation on GitHub. Already have an account? # to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants