Unable to preserve line breaks in Google Translate response
A

6

7

The translated text I get back from the Google Translate API has lost its line breaks.

I have a raw query string like this:

RELATED WORK .

Studies of group work have shown the importance of

After URL-encoding the query string, I get this:

RELATED%20WORK%20.%0D%0A%0D%0AStudies%20of%20group%20work%20have%20shown%20the%20importance%20of
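
For reference, the encoding step can be reproduced with a short Python sketch. The question does not say which language was used, so Python and the variable names here are assumptions for illustration only.

```python
from urllib.parse import quote

# The raw query text, with Windows-style line breaks (\r\n) between the two lines.
raw = "RELATED WORK .\r\n\r\nStudies of group work have shown the importance of"

# safe="" makes quote() percent-encode every reserved character,
# so each \r\n becomes %0D%0A and each space becomes %20.
encoded = quote(raw, safe="")
print(encoded)
```

The line breaks survive encoding as %0D%0A%0D%0A, so they are present in the request; they are lost on the API side.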

The problem occurs when I submit it to the Google Translate API:

https://www.googleapis.com/language/translate/v2?key=<key>&source=en&target=ja&q=RELATED%20WORK%20.%0D%0A%0D%0AStudies%20of%20group%20work%20have%20shown%20the%20importance%20of

I only get a response in one line (no line breaks):

{
    "data": {
        "translations": [
            {
                "translatedText": "関連作業 。グループワークの研究は、"
            }
        ]
    }
}

My ultimate goal is to parse the translated text line by line for proper rendering.

I'm showing the raw URL because even opening it directly in a browser returns a response without the line breaks.

Any ideas?

Antihistamine answered 8/6, 2017 at 20:22 Comment(0)
H
20

The Translate API has a format parameter (spelled format_ in the Python client library) that you can set to text. This preserves line breaks. See the API documentation for reference.

Update: added the underscore to the format_ parameter.
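
A minimal sketch of what the request from the question might look like with this setting, assuming the REST query parameter is named format and takes the value text. The helper build_translate_url and the YOUR_API_KEY placeholder are hypothetical; the sketch only builds the URL and does not call the API.

```python
from urllib.parse import quote, urlencode

# Endpoint taken from the question.
ENDPOINT = "https://www.googleapis.com/language/translate/v2"

def build_translate_url(text, api_key, source="en", target="ja"):
    """Build a v2 translate URL that asks for plain-text handling (hypothetical helper)."""
    params = {
        "key": api_key,
        "source": source,
        "target": target,
        "format": "text",  # assumption: plain-text mode, so line breaks are kept
        "q": text,
    }
    # quote_via=quote encodes spaces as %20 rather than "+".
    return ENDPOINT + "?" + urlencode(params, quote_via=quote)

url = build_translate_url(
    "RELATED WORK .\r\n\r\nStudies of group work have shown the importance of",
    "YOUR_API_KEY",  # placeholder, not a real key
)
print(url)
```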

Heighttopaper answered 7/6, 2018 at 8:14 Comment(1)
I can verify that this does preserve line breaks, but it does not solve my problem of wanting to translate HTML and preserve line breaks. I'm translating HTML that a user enters into a text box, and I want to preserve the spacing so it looks the same as what they entered in the textarea. – Quadrangle
A
6

Got it working by replacing \r\n with <br> in the input string.
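
A rough sketch of that replacement and the reverse step, in Python. The function names are hypothetical, and the sketch assumes the translated text comes back with the <br> tags intact (possibly written as <br/> or <BR />).

```python
import re

def breaks_to_br(text):
    """Replace line breaks with <br> before sending the text for translation."""
    # Handle \r\n first so a Windows line break yields one <br>, not two.
    return text.replace("\r\n", "<br>").replace("\n", "<br>")

def br_to_breaks(translated, newline="\n"):
    """Turn <br> tags in the translated text back into line breaks."""
    # Accept <br>, <br/> and <br /> in any letter case.
    return re.sub(r"<br\s*/?>", newline, translated, flags=re.IGNORECASE)

prepared = breaks_to_br("RELATED WORK .\r\n\r\nStudies of group work")
print(prepared)

# After translation, split the restored text into lines for rendering.
lines = br_to_breaks("first<br><BR />second").split("\n")
print(lines)
```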

Antihistamine answered 28/6, 2017 at 2:20 Comment(0)
W
4

Replacing \r\n with <br> does work, but the API seems to treat each <br> as the end of a sentence. That limits how much context the translation considers and results in a less-than-optimal translation. The first character of the following line also becomes a capital letter, which was the clue for me.

What I did was to replace \r\n with <code>0</code> and then back again after translation - this gave a good translation, as it did not see the <code>0</code> as contributing to the sentence. Not ideal, but gives a better translation.
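
A small sketch of this round trip, in Python. The names PLACEHOLDER, mask_breaks and unmask_breaks are made up for illustration, and the sketch assumes the API returns the placeholder unchanged, as described above.

```python
# The inline placeholder described above.
PLACEHOLDER = "<code>0</code>"

def mask_breaks(text):
    """Swap each \\r\\n for the placeholder before translation."""
    return text.replace("\r\n", PLACEHOLDER)

def unmask_breaks(translated):
    """Swap the placeholder back to \\r\\n after translation."""
    return translated.replace(PLACEHOLDER, "\r\n")

original = "RELATED WORK .\r\n\r\nStudies of group work"
masked = mask_breaks(original)
print(masked)

# No real translation happens here; this only checks the round trip.
restored = unmask_breaks(masked)
print(restored == original)
```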

Willow answered 29/8, 2017 at 4:33 Comment(0)
N
0

If you use WordPress, the wpautop() function may be useful.

You can wrap each full text block in this function like so:

urlencode(wpautop($text))

The result will be wrapped in <p> tags, with <br> tags inside where the text has line breaks.

Nightingale answered 19/11, 2021 at 13:56 Comment(0)
C
0

My app has numerous character sequences I'd like hidden from Google Translate. The way I did it was replacing such sequences with characters from Unicode private use area (0xe001 and so on). The Google Translate API leaves them untouched so I can replace them back after a translation. It works for \n and \r\n as well.
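
A sketch of the idea in Python, under the assumption stated above that the API passes private use area characters through unchanged. The function names protect and restore are hypothetical, and the sketch assumes the input does not already contain characters from U+E001 upward.

```python
def protect(text, sequences):
    """Replace each sequence with its own private use area character."""
    mapping = {}
    # Longest first, so "\r\n" is replaced before a bare "\n".
    ordered = sorted(sequences, key=len, reverse=True)
    for offset, seq in enumerate(ordered):
        marker = chr(0xE001 + offset)  # U+E001, U+E002, ...
        mapping[marker] = seq
        text = text.replace(seq, marker)
    return text, mapping

def restore(text, mapping):
    """Put the original sequences back after translation."""
    for marker, seq in mapping.items():
        text = text.replace(marker, seq)
    return text

original = "Hello {name}\r\nSecond line\nThird line"
protected, mapping = protect(original, ["\n", "\r\n", "{name}"])

# No real translation happens here; this only checks the round trip.
print(restore(protected, mapping) == original)
```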

Clava answered 23/8 at 8:51 Comment(0)
B
-6

Low tech workaround:

Use a simple text program such as Microsoft Write, Notepad, or LibreOffice Writer. Write the text there, then copy and paste it into Google Translate. Line breaks are preserved.

Beadle answered 19/8, 2019 at 6:20 Comment(0)
