In HTML and CSS, how do I make Japanese text break lines correctly?

6

21

I'm writing a simple paragraph in both English and Japanese, using only HTML and CSS. The English text breaks lines normally (when a word no longer fits on a line, it's pushed to the next one).

With Japanese, though, only part of a word is pushed to the next line rather than the whole word. I've tried setting word-wrap to both break-word and normal, but nothing changes for the Japanese text.

How do I make whole Japanese words jump to the next line, as happens in English?

Miscellany answered 9/3, 2011 at 17:4 Comment(1)

This is intended behavior. —a Japanese native speaker. – Belovo 13/7 at 6:46

15

English separates words with spaces, Japanese doesn't.

Whether a run of Japanese characters forms a word depends on context. In many cases, looking for certain grammatical particles (written in kana) could help separate words, but this is nowhere near reliable.

Essentially, you'd need a Japanese dictionary / understanding of the language to identify where the words start and end - a browser won't know how to do this.

Alternatively, if you know the start and end of the words, you could perhaps wrap each one in a span - then use CSS to ensure each span wraps to a new line as a whole when it doesn't fit.
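A minimal sketch of that span approach, assuming the word boundaries have already been marked up by hand. The class name `word` is hypothetical and the markup in the comment is only an illustration:

```css
/* Assumed markup, with no whitespace between the spans:
   <p lang="ja"><span class="word">私は</span><span class="word">パンを</span><span class="word">食べました。</span></p> */

/* Each span becomes an atomic inline box: the line can wrap
   between two spans, but not inside one. */
.word {
  display: inline-block;
}
```

Because every span has to be placed manually, this is probably only practical for short, static text such as headings.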

Frederick answered 4/4, 2011 at 8:4 Comment(4)

I can't think of a better answer. – Reyreyes 29/4, 2012 at 20:51

Pick up a Japanese book or magazine and notice how the text wraps. You will see that it's normal for words to break in the middle to wrap to the next line. Trying to force it to follow English-like rules for when to wrap text wouldn't be natural. – Jotham 27/6, 2012 at 6:51

What about alignment? Say I have the text "デモンストレーションのお申し込み" and there is a line break at お and then 申し込み is centered below it. Is it more common for it to be left aligned? – Daphie 6/7, 2017 at 13:14

Budou has really high friction to get started. I need to create a GCP account, then an API key with a service account (whatever that means), then enable the Cloud Natural Language API, then enable billing, then put in my credit card. I'm ready to give up at this point. – Brno 5/12, 2018 at 13:42

9

Japanese has specific rules that are followed when breaking text. They are called 禁則処理 (kinsoku shori). Here is a link explaining the rules. The rules are mostly concerned with special characters. Have a look at any popular Japanese webpage and you will see that multi-character (kana and kanji) words are often split. I often see です split between lines.

Update: I stumbled across this tool recently. I haven't tried it out yet, but the theory is solid. If someone is looking to improve the line breaks with Japanese text this could be a good solution.

Satyriasis answered 4/10, 2016 at 0:40 Comment(2)

Budou has really high friction to get started. I need to create a GCP account, then an API key with a service account (whatever that means), then enable the Cloud Natural Language API, then enable billing, then put in my credit card. I'm ready to give up at this point. – Brno 5/12, 2018 at 13:43

@Brno Not really. You don't need to use Google's Natural Language API. It's much easier to use the MeCab backend instead. – Bistro 30/6, 2020 at 19:6

1

Try setting the CSS property

line-break:strict;

Check it out here.
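As a full rule, that might look like the sketch below. Scoping it with `:lang(ja)` is an assumption on my part and requires `lang="ja"` on the markup. Note that line-break only selects how strictly the kinsoku-style rules are applied; it doesn't tell the browser where words begin and end, so it may make no visible difference on a given paragraph.

```css
/* Assumes the Japanese content is marked up with lang="ja". */
:lang(ja) {
  /* Apply the most stringent line-breaking rules, e.g. no line break
     before small kana such as ゃ or before the prolonged sound mark ー. */
  line-break: strict;
}
```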

Situation answered 9/3, 2011 at 17:17 Comment(2)

Thank you for your answer, but unfortunately the problem persists (nothing changed). – Miscellany 10/3, 2011 at 9:14

@Rodrigo Bezerra Sorry I couldn't help you out. – Situation 10/3, 2011 at 16:2

1

I'm not an expert in Japanese specifically, so it's hard for me to tell whether things are wrapping correctly. I just had to solve this problem myself, though, and both word-break: keep-all and white-space: nowrap seemed to fix it for me, so they might be worth trying.
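A sketch of the word-break: keep-all variant, in case it helps. The class name `jp-heading` is made up. With keep-all the browser stops breaking between CJK characters, so a long run without spaces or punctuation can overflow its container; the overflow-wrap fallback is my own addition to guard against that. white-space: nowrap is left out because it disables wrapping altogether.

```css
.jp-heading {
  /* Suppress the implicit break opportunities between CJK characters. */
  word-break: keep-all;
  /* Fallback: if an unbreakable run is wider than the container,
     allow a break inside it rather than letting it overflow. */
  overflow-wrap: break-word;
}
```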

Ecclesiastical answered 30/10, 2019 at 17:47 Comment(0)

1

Until browsers are smart enough to do on-the-fly semantic analysis of the language, there are only a couple of options:

1/ Understand enough of the language to be able to group semantic elements into their own unbreakable DOM elements. Something like this (without the line breaks):

<span class="el">私は</span>
<span class="el">キッチンで</span>
<span class="el">パンを</span>
<span class="el">食べました。</span>

Then in CSS, use something like .el { display: inline-block; }. You probably want to do this only for headings and other important pieces of text, since it could affect accessibility (i.e. how screen readers interpret the text). The other drawbacks are that (a) you need to understand the text to know where to add the blocks, and (b) this obviously only works for static text (and even then, it's a manual, painstaking process).

2/ Use a tool that does the grouping for you. It could be something on the client side, like TinySegmenter (which segments a bit too aggressively for my taste), or on the server side, like Budou, which uses the Google Cloud Natural Language API and ML to analyze your sentences. The downsides (at least for Budou) are that (a) you need Python (I think I saw a Node.js port somewhere), and (b) it's not free.
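On the "browsers smart enough" point: CSS Text Level 4 defines word-break: auto-phrase, which asks the browser itself to find phrase boundaries. As far as I know it is only implemented in Chromium-based browsers (Chrome 119 and later) and only takes effect on content marked lang="ja", so treat the following as a progressive-enhancement sketch rather than a complete fix. The class name is hypothetical:

```css
/* Assumed markup: <h1 class="jp-heading" lang="ja">...</h1> */
.jp-heading {
  /* Browsers that don't recognize auto-phrase ignore this declaration
     and keep their default line breaking. */
  word-break: auto-phrase;
}
```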

Hope this helps!

Sauder answered 1/7, 2020 at 4:41 Comment(0)

0

The canonical (and easiest) way to do this is to set line-break: strict;. According to Can I Use, this property has widespread support (93%+) now.

Belovo answered 7/9, 2023 at 0:57 Comment(0)
