Split a string on \r\n using IFS in bash

I would like to split a string that contains \r\n in bash, but the carriage return and \n cause problems. Can anyone suggest a different IFS value? I also tried IFS=' |\'.

input:

projects.google.tests.inbox.document_01\r\nprojects.google.tests.inbox.document_02\r\nprojects.google.tests.inbox.global_02

Code:

IFS=$'\r'
inputData="projects.google.tests.inbox.document_01\r\nprojects.google.tests.inbox.document_02\r\nprojects.google.tests.inbox.global_02"
for line1 in ${inputData}; do
    line2=`echo "${line1}"`
    echo ${line2} # Expected: one entry per line
done

Expected:

projects.google.tests.inbox.document_01
projects.google.tests.inbox.document_02
projects.google.tests.inbox.global_02
Synaeresis answered 4/10, 2017 at 2:43 Comment(3)
Don't read lines with for. - Platonic
BTW, in string="foo\r\n", you don't actually have a literal CRLF sequence inside the variable (as it would be if you'd retrieved that variable's contents from a file, for example). To assign that sequence to a string, you need string=$'foo\r\n' - Platonic
Possible duplicate of How do I split a string on a delimiter in Bash? - Beneath

The following awk command may help here.

awk '{gsub(/\\r\\n/,RS)} 1'  Input_file

OR

echo "$var" | awk '{gsub(/\\r\\n/,RS)} 1'

Output will be as follows.

projects.google.tests.inbox.document_01
projects.google.tests.inbox.document_02
projects.google.tests.inbox.global_02

Explanation: awk's gsub function performs a global substitution. Its form is gsub(/regex_to_be_substituted/, new_value, target), where the target defaults to the current line. Here the regex \\r\\n is replaced with RS, the record separator, whose default value is a newline. The backslashes are escaped, so the regex matches a literal backslash followed by r and then a literal backslash followed by n; it does not match a real carriage return or newline character. The trailing 1 relies on awk's condition-action model: 1 is a condition that is always true, and since no action is given, awk runs the default action, which prints the current line.

EDIT: With a variable, you could use it as follows.

var="projects.google.tests.inbox.document_01\r\nprojects.google.tests.inbox.document_02\r\nprojects.google.tests.inbox.global_02"
echo "$var" | awk '{gsub(/\\r\\n/,RS)} 1'
projects.google.tests.inbox.document_01
projects.google.tests.inbox.document_02
projects.google.tests.inbox.global_02
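
If each entry is still needed in a shell loop afterwards, one option is to feed awk's output to while read instead of iterating with for. This is only a sketch, assuming var holds literal backslash sequences as in the question; the array name entries is made up for the illustration.

```bash
# var holds literal backslash-r backslash-n pairs (not real CR/LF), as in the question.
var='projects.google.tests.inbox.document_01\r\nprojects.google.tests.inbox.document_02\r\nprojects.google.tests.inbox.global_02'

entries=()   # hypothetical array name, used only for this illustration

# awk turns each literal \r\n into a newline; read -r then sees one entry per line.
while IFS= read -r entry; do
  entries+=("$entry")
done < <(printf '%s\n' "$var" | awk '{gsub(/\\r\\n/,RS)} 1')

# Print the collected entries, one per line.
printf '%s\n' "${entries[@]}"
```

Reading through process substitution keeps the array in the current shell, and printf '%s\n' avoids depending on how echo treats backslashes.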
Logorrhea answered 4/10, 2017 at 3:7 Comment(11)
Thanks a lot. Can you show an answer that takes a string value instead of a file? - Synaeresis
@JiteshSojitra, please check the OR solution in my edited answer and let me know if you have any questions about it. An explanation has been added too. - Logorrhea
It works. Do I now need another loop to read each entry, or is there a way to do it with IFS as in the code above? - Synaeresis
@RavinderSingh13, please don't encourage folks to echo $var (unquoted expansion). - Platonic
@JiteshSojitra, no, you don't need to run awk inside a loop; awk can read a whole file by itself. If you have some other requirement, kindly let us know in a new post; if it is a minor question, let us know here. - Logorrhea
@CharlesDuffy, apologies, I somehow missed it; it has been added now. Thank you for letting me know. - Logorrhea
@RavinderSingh13, yes, it's minor. For example, in the question the line2 variable holds each line's value in turn, whereas this prints all the rows at once. - Synaeresis
@JiteshSojitra, I'm not sure I follow. If you are using the awk solution above, is the loop still required? Kindly let me know. - Logorrhea
Done: for line2 in $(echo "$var" | awk '{gsub(/\\r\\n/,RS)} 1'); do echo ${line2}; done - Synaeresis
@JiteshSojitra, instead of doing that, you could do it with a single awk command and save its output into a variable; there is no need for a for loop here. - Logorrhea
@JiteshSojitra, ...also, echo "$line2" is better-behaved than echo ${line2}. If your line contains a *, for example, then the version without the quotes will emit a list of files in the current directory, whereas the version with the quotes will emit that *. (Curly braces make no difference to correctness in this context one way or the other, so I suggest leaving them out unless you're in a context where they're specifically helpful). - Platonic
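
The quoting point in the last comment can be seen with a small experiment. This is a sketch under assumptions: the scratch directory and the file names a.txt and b.txt are invented for the demonstration.

```bash
# Create a scratch directory with two known files so the glob result is predictable.
workdir=$(mktemp -d)
touch "$workdir/a.txt" "$workdir/b.txt"

line2='*'

# Unquoted: the shell expands * against the files in the current directory.
unquoted=$(cd "$workdir" && echo $line2)

# Quoted: the * is passed to echo unchanged.
quoted=$(cd "$workdir" && echo "$line2")

rm -r "$workdir"

printf 'unquoted: %s\n' "$unquoted"   # unquoted: a.txt b.txt
printf 'quoted:   %s\n' "$quoted"     # quoted:   *
```

The same expansion happens to the unquoted $(...) in a for loop, which is one reason the while read form is suggested for line-by-line input.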
inputData=$'projects.google.tests.inbox.document_01\r\nprojects.google.tests.inbox.document_02\r\nprojects.google.tests.inbox.global_02'
while IFS= read -r line; do
  line=${line%$'\r'}
  echo "$line"
done <<<"$inputData"

Note:

  • The string is defined as string=$'foo\r\n', not string="foo\r\n". The latter does not put an actual CRLF sequence in your variable. See ANSI C-like strings on the bash-hackers' wiki for a description of this syntax.
  • ${line%$'\r'} is a parameter expansion which strips a literal carriage return off the end of the contents of the variable line, should one exist.
  • The practice for reading an input stream line-by-line (used here) is described in detail in BashFAQ #1. Unlike iterating with for, it does not attempt to expand your data as globs.
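
The first two notes can be checked by comparing string lengths. A minimal sketch; the variable names escaped, literal, stripped and plain are chosen only for this illustration.

```bash
escaped="foo\r\n"    # double quotes: backslash, r, backslash, n stay as four ordinary characters
literal=$'foo\r\n'   # ANSI C-style quoting: one real carriage return and one real newline

echo "${#escaped}"   # 7
echo "${#literal}"   # 5

# Stripping a trailing carriage return with the % parameter expansion.
line=$'foo\r'
stripped=${line%$'\r'}
echo "${#stripped}"  # 3

# When there is no trailing carriage return, the value is left unchanged.
plain='foo'
echo "${plain%$'\r'}"   # foo
```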
Platonic answered 4/10, 2017 at 3:17 Comment(3)
Interesting, and thanks a lot! Since the value comes from an external source, do I just need to replace it with inputData="${valueFromExternalSource}"? - Synaeresis
@JiteshSojitra, yup -- if the externally-provided value contains literals rather than escape sequences, that'll work perfectly. - Platonic
@JiteshSojitra, ...by contrast, if the incoming string contains literal escape sequences, that raises a question of what that data is. If it's a JSON string, you're better off using jq to interpret it; if it has C-style printf escapes only, then you can use printf '%b' to convert them to literals, should that be your intent. - Platonic
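
For the printf '%b' case mentioned in the last comment, a rough sketch follows. It assumes the incoming value contains only C-style escapes such as \r and \n; the names raw, decoded and entries are invented for the example, and %b would also expand any other backslash escapes present in the data.

```bash
# raw holds literal backslash sequences, as in the question's double-quoted assignment.
raw='projects.google.tests.inbox.document_01\r\nprojects.google.tests.inbox.document_02\r\nprojects.google.tests.inbox.global_02'

# %b expands backslash escapes in the argument; -v stores the result in a variable (bash-specific).
printf -v decoded '%b' "$raw"

entries=()
while IFS= read -r line; do
  entries+=("${line%$'\r'}")   # drop the trailing carriage return, if any
done <<<"$decoded"

printf '%s\n' "${entries[@]}"
```

After decoding, the loop is the same one used in the answer above, so real CRLF input and escaped input end up on the same code path.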