Sed replace at second occurrence

Asked 15/9, 2017 at 11:47 Answered 15/9, 2017 at 14:20

I want to remove a pattern with sed, but only at its second occurrence on each line.

What's in the file.csv:

a,Name(null)abc.csv,c,d,Name(null)abc.csv,f
a,Name(null)acb.csv,c,d,Name(null)acb.csv,f
a,Name(null)cba.csv,c,d,Name(null)cba.csv,f

Output wanted:

a,Name(null)abc.csv,c,d,Name,f
a,Name(null)acb.csv,c,d,Name,f
a,Name(null)cba.csv,c,d,Name,f

This is what I tried:

sed -r 's/(\(null)\).*csv//' file.csv

The problem here is that the regex is too greedy, and I cannot make it stop. I also tried this, to skip the first occurrence of "null":

sed -r '0,/null/! s/(\(null)\).*csv//' file.csv

I also tried the following, but the greedy regex is still the problem:

sed -r 's/(\(null)\).*csv//2' file.csv

I've read that ? can make the regex "lazy", but I cannot get it to work:

sed -r 's/(\(null)\).*?csv//' file.csv

Hiro asked 15/9, 2017 at 11:47 Comment(1)

If you may have 3 or more (null)s and you still want to only remove the 2nd occurrence, I think it would be easier to do with perl, using .*? instead of .*. – Mildred 15/9, 2017 at 11:55

The more robust awk solution:

Extended sample file input.csv:

12,Name(null)randomstuff.csv,2,3,Name(null)randomstuff.csv, false,Name(null)randomstuff.csv
12,Name(null)AotherRandomStuff.csv,2,3,Name(null)AotherRandomStuff.csv, false,Name(null)randomstuff.csv
12,Name(null)alphaNumRandom.csv,2,3,Name(null)alphaNumRandom.csv, false,Name(null)randomstuff.csv

The job:

awk -F, '{ c=0; for(i=1;i<=NF;i++) if($i~/\(null\)/ && c++==1) sub(/\(null\).*/,"",$i) }1' OFS=',' input.csv

The output:

12,Name(null)randomstuff.csv,2,3,Name, false,Name(null)randomstuff.csv
12,Name(null)AotherRandomStuff.csv,2,3,Name, false,Name(null)randomstuff.csv
12,Name(null)alphaNumRandom.csv,2,3,Name, false,Name(null)randomstuff.csv
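The same loop can be sketched with the occurrence number passed in as a parameter. This is only an illustration: the function name strip_nth_null and the variable n are hypothetical and not part of the answer above, and the sketch assumes no field contains an embedded comma.

```shell
# Hypothetical helper: strip "(null)..." from the n-th comma-separated
# field that contains "(null)". n defaults to 2.
strip_nth_null() {
  awk -F, -v OFS=, -v n="${1:-2}" '{
    c = 0                                 # matching fields seen so far on this line
    for (i = 1; i <= NF; i++)
      if ($i ~ /\(null\)/ && ++c == n)    # act only on the n-th matching field
        sub(/\(null\).*/, "", $i)         # drop "(null)" and the rest of that field
  } 1'                                    # print every line, changed or not
}

printf '%s\n' '12,Name(null)x.csv,2,3,Name(null)x.csv, false,Name(null)x.csv' |
  strip_nth_null 2
```

Because the counter is reset for every line and only incremented on fields that contain "(null)", any later occurrences on the same line are left alone.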

Bonilla answered 15/9, 2017 at 12:10 Comment(1)

Great, this is working just fine! I'll have to check out more on the awk tool! – Hiro 15/9, 2017 at 12:23

sed does provide an easy way to specify which match should be replaced: add the occurrence number after the final delimiter.

$ sed 's/(null)[^.]*\.csv//2' ip.csv
a,Name(null)abc.csv,c,d,Name,f
a,Name(null)acb.csv,c,d,Name,f
a,Name(null)cba.csv,c,d,Name,f

$ # or [^,] if there are no , within fields
$ sed 's/(null)[^,]*//2' ip.csv
a,Name(null)abc.csv,c,d,Name,f
a,Name(null)acb.csv,c,d,Name,f
a,Name(null)cba.csv,c,d,Name,f

Also, there is no need to escape () when not using extended regular expressions, since they match literally in basic regular expressions.
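To see why the number flag alone did not help in the question, here is a small side-by-side sketch. The variable names line, greedy and bounded are only for illustration, and the sample line is taken from the question's file.

```shell
line='a,Name(null)abc.csv,c,d,Name(null)abc.csv,f'

# Greedy: the first match runs from the first "(null)" to the last "csv",
# so there is no second match for the 2 flag to act on.
greedy=$(printf '%s\n' "$line" | sed 's/(null).*csv//2')

# Negated class: each match stops at the next comma, so the line has two
# separate matches and the 2 flag removes only the second one.
bounded=$(printf '%s\n' "$line" | sed 's/(null)[^,]*//2')

printf '%s\n%s\n' "$greedy" "$bounded"
```

The first printed line is the input unchanged, the second is a,Name(null)abc.csv,c,d,Name,f.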

Minneapolis answered 15/9, 2017 at 12:21 Comment(4)

I had tried that, if you look closer at my post. The problem was the greedy regex. I had to replace .* with [^,]* as in your example. Thank you. – Hiro 15/9, 2017 at 12:41

Well, I didn't notice that you had tried //1 (later edited to //2), so you were only put off by the greedy issue. That is easy to solve in this case, as there are workarounds with [^,] or [^.]. For the generic case you might need the proper csv parsers available in perl/python/etc. – Minneapolis 15/9, 2017 at 12:49

You are right, I could have done this with pyexcel, which I use in my script. Didn't think about that! – Hiro 15/9, 2017 at 12:53

ahhh, this is exactly what I needed as well, Thanks! – Bronchia 2/11, 2020 at 5:27

Execute:

awk '{sub(/.null.....csv,f/,",f")}1' file

And the output should be:

a,Name(null)abc.csv,c,d,Name,f
a,Name(null)acb.csv,c,d,Name,f
a,Name(null)cba.csv,c,d,Name,f
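Note that each . in that pattern matches exactly one character, so the command depends on the text between "null" and "csv,f" being exactly five characters long. A minimal sketch of that limitation, where the wrapper name fixed_width and the second sample line are made up for illustration:

```shell
# Hypothetical wrapper around the command above, reading from stdin.
fixed_width() { awk '{sub(/.null.....csv,f/,",f")}1'; }

# Same shape as the question's data: ")abc." is five characters, so it matches.
short=$(printf '%s\n' 'a,Name(null)abc.csv,c,d,Name(null)abc.csv,f' | fixed_width)

# A longer name no longer fits the five dots, so the line passes through unchanged.
long=$(printf '%s\n' 'a,Name(null)longer.csv,c,d,Name(null)longer.csv,f' | fixed_width)

printf '%s\n%s\n' "$short" "$long"
```

For names of varying length, a bounded pattern such as [^,]* is the safer choice.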

Ligurian answered 15/9, 2017 at 14:20 Comment(0)
