Delete a specific string with tr
A

5

37

Is it possible to delete a specific string with the tr command in a UNIX shell? For example, if I type:

tr -d "1."

and the input is 1.1231, it would show 23 as an output, but I want it to show 1231 (notice only the first 1 has gone). How would I do that?

If you know a solution or a better way, please explain the syntax since I don't want to just copy&paste but also to learn.

I have huge problems with awk, so if you use this, please explain it even more.

Aureus answered 23/4, 2012 at 14:29 Comment(1)
The direct answer is "no"; tr substitutes individual characters, not strings. So (a) your command will remove all occurrences of "1" and "." anywhere in the input; and (b) tr is not the right command for the task you are asking about.Fare
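A quick way to see the character-set behaviour described in this comment is sketched below. It only uses the input from the question; nothing else is assumed.

```shell
# tr -d treats its argument as a set of characters, not as a string.
printf '%s\n' '1.1231' | tr -d '1.'   # deletes every '1' and every '.' -> 23
printf '%s\n' '1.1231' | tr -d '.'    # deletes only the dots           -> 11231
```

There is no way to tell tr that "1." should be matched as a unit, which is why the answers below reach for other tools.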
C
21

In your example above the cut command would suffice.

Example: echo '1.1231' | cut -d '.' -f 2 would return 1231.

For more information on cut, just type man cut.
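One caveat, as a small sketch: -f 2 selects only the second dot-separated field, so anything after a second dot would be dropped. The input 1.12.31 here is a made-up example, not from the question.

```shell
# Field 2 only: enough when the input contains exactly one dot.
printf '%s\n' '1.1231' | cut -d '.' -f 2      # -> 1231

# Fields 2 through the end: later dots are kept.
printf '%s\n' '1.12.31' | cut -d '.' -f 2-    # -> 12.31
```

Note that cut splits on the delimiter and knows nothing about the "1" in front of it, so this removes whatever precedes the first dot, not specifically the string "1.".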

Caro answered 23/4, 2012 at 14:39 Comment(0)
B
13

You would be better off using a regex-based tool such as sed.

For example, with the input 1.1231 the following produces 1231 (the g flag removes every occurrence of "1." on the line; drop it to remove only the first):

sed 's/1\.//g'

Maybe have a look here: http://tldp.org/LDP/abs/html/string-manipulation.html

Bradski answered 23/4, 2012 at 14:39 Comment(1)
"Nice use of 'tr'", I agree that this is much easier to use and easier to read, for simple expressions.Stench
I
9

You could also use sed for this kind of thing:

$ echo "1.1231" | sed -e "s/1\.//"
1231

This uses sed to run a regular-expression search and replace, replacing "1." (with the dot escaped so that it matches a literal period) with the empty string. Without the g flag, sed only deletes the first match on each line.
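To make the "first match only" behaviour visible, here is a short sketch comparing it with the g flag and with an anchored pattern. The inputs 1.1231.5 and 21.1231 are made up for illustration.

```shell
# No flag: only the first "1." on the line is removed.
printf '%s\n' '1.1231.5' | sed -e 's/1\.//'     # -> 1231.5

# g flag: every "1." on the line is removed.
printf '%s\n' '1.1231.5' | sed -e 's/1\.//g'    # -> 1235

# ^ anchors the match to the start of the line, so a "1." that
# appears later (here inside "21.") is left alone.
printf '%s\n' '21.1231' | sed -e 's/^1\.//'     # -> 21.1231
```

If the goal is strictly "remove a leading 1.", the anchored form is probably the safest of the three.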

Idolla answered 23/4, 2012 at 14:33 Comment(0)
B
4

If you are using bash (or any other POSIX shell), you can do this easily with parameter expansion:

$ a=1.1231
$ echo "${a#1.}"
1231

This will remove the leading "1." string. If you want to remove everything up to and including the first occurrence, use ${a#*1.}, and if you want to remove everything up to and including the last occurrence, use ${a##*1.}.
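The three forms are easier to tell apart on an input with more than one "1." in it. A minimal sketch, using a made-up value for a:

```shell
a='21.31.1231'    # hypothetical input containing "1." twice

# Pattern must match at the very start; a begins with "2", so nothing is removed.
echo "${a#1.}"     # -> 21.31.1231

# Shortest prefix matching *1. is "21.".
echo "${a#*1.}"    # -> 31.1231

# Longest prefix matching *1. is "21.31.".
echo "${a##*1.}"   # -> 1231
```

The pattern here is a shell glob, not a regular expression, so the dot is literal and needs no escaping.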

The TLDP page on string manipulation has further options (such as substring extraction).

Note that using the shell's built-in string manipulation for such simple transformations is generally much faster than using an external tool such as sed, awk or cut, because the shell doesn't have to create a sub-process to perform the operation. However, for more complicated things (e.g. when you need regular expressions or when the input is large), you're better off using the dedicated tools.

Bullfight answered 23/4, 2012 at 14:34 Comment(1)
This is also a great solution! Thanks.Aureus
F
1

Since you asked specifically about awk, here is another one.

awk '{ gsub(/1\./,"") }1' input.txt

As any awk tutorial will tell you, the general form of an awk program is a sequence of 'condition { actions }'. If you have no actions, the default action is to print. If you have no conditions, the actions will be taken unconditionally. This program uses both of these special cases.

The first part is an action without a condition, i.e. it will be taken for all lines. The action is to substitute all occurrences of the regular expression /1\./ with nothing. So this will trim any '1.' (regardless of context) from a line.

The second part is a condition without an action, i.e. it will print if the condition is true, and the condition is always true. This is a common idiom for "we are done -- print whatever we have now". It consists simply of the constant 1, which means "true" when used as a condition.

This could be reformulated in a number of ways. For example, you could factor the print into the first action:

awk '{ gsub(/1\./,""); print }'  input.txt

Perhaps you want to remove the integer part, i.e. any digits before a period. The regex for that would be something like /[0-9]+\./.
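As a sketch of how these variations behave, the following compares sub (first match only) with gsub (all matches) and tries the integer-part regex anchored to the start of the line. The inputs 1.1231.5 and 42.1231 are made up, and the ^ anchor is my addition to the regex above.

```shell
# sub replaces only the first match on the line.
printf '%s\n' '1.1231.5' | awk '{ sub(/1\./, "") } 1'          # -> 1231.5

# gsub replaces every match on the line.
printf '%s\n' '1.1231.5' | awk '{ gsub(/1\./, "") } 1'         # -> 1235

# Remove a leading integer part, whatever its digits are.
printf '%s\n' '42.1231' | awk '{ sub(/^[0-9]+\./, "") } 1'     # -> 1231
```

For the question as asked (only the first "1." should go), sub is the closer match than gsub.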

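gsub is not a GNU extension: both sub and gsub are specified by POSIX and are available in gawk, mawk and nawk. Only very old, pre-POSIX awk implementations lack them, so you would need some sort of loop only if you must support such a legacy awk. Use sub instead of gsub if you want to replace just the first match on each line.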

Fare answered 23/4, 2012 at 15:36 Comment(0)
