How to truncate long matching lines returned by grep or ack
B

11

128

I want to run ack or grep on HTML files that often have very long lines. I don't want to see very long lines that wrap repeatedly. But I do want to see just that portion of a long line that surrounds a string that matches the regular expression. How can I get this using any combination of Unix tools?

Brammer answered 9/1, 2010 at 20:19 Comment(5)
What's ack? Is it a command you use when you don't like something? Something like ack file_with_long_lines | grep pattern? :-) - Armstead
@Alok ack (known as ack-grep on Debian) is grep on steroids. It also has the --thpppt option (not kidding). betterthangrep.com - Spoof
While the --thpppt feature is somewhat controversial, the key advantage appears to be that you can use Perl regexes directly, not some crazy [[:space:]] and characters like {, [, etc. changing meaning with the -e and -E switches in a way that's impossible to remember. - Badalona
Similar: unix.stackexchange.com/q/163726 and https://mcmap.net/q/118467/-grep-characters-before-and-after-match - Rialto
I use grep --color=always | less -S -R. Then, type -S inside less to unfold/fold the lines. - Quimper
V
126

You could use the grep options -oE, possibly in combination with changing your pattern to ".{0,10}<original pattern>.{0,10}" in order to see some context around it:

       -o, --only-matching
              Show only the part of a matching line that matches PATTERN.

       -E, --extended-regexp
              Interpret pattern as an extended regular expression (i.e., force grep to behave as egrep).

For example (from @Renaud's comment):

grep -oE ".{0,10}mysearchstring.{0,10}" myfile.txt
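
If you use this a lot, the pattern wrapping can go into a small function. This is only a sketch: the name grep_context and its argument order are made up for illustration, and it assumes a grep that supports -o and --color (GNU and BSD grep do). The second grep runs over the already shortened output, so only the inner pattern is highlighted rather than the whole context.

```shell
# Hypothetical helper: print up to N characters of context around each match,
# then re-run grep on that short output so only PATTERN itself is highlighted.
# Usage: grep_context N PATTERN [FILE...]
grep_context() (
    n=$1
    pattern=$2
    shift 2
    grep -oE ".{0,$n}($pattern).{0,$n}" "$@" | grep --color=auto -E "$pattern"
)

# Example: prints "aaaaaNEEDLEbbbbb"
printf '%s\n' 'aaaaaaaaaaaaaaaNEEDLEbbbbbbbbbbbbbbb' | grep_context 5 NEEDLE
```

The function body is a subshell, written with ( ) instead of { }, so the helper variables do not leak into your interactive shell.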

Alternatively, you could try -c:

       -c, --count
              Suppress normal output; instead print a count of matching  lines
              for  each  input  file.  With the -v, --invert-match option (see
              below), count non-matching lines.
Vorticella answered 9/1, 2010 at 20:21 Comment(5)
an example: grep -oE ".{0,20}mysearchstring.{0,20}" myfile - Infatuation
you should change the answer to add the -E option as shown by @Infatuation (extended pattern option), or the proposed pattern for extending context won't work. - Suppuration
Not that necessary maybe but here's an example: $ echo "eeeeeeeeeeeeeeeeeeeeqqqqqqqqqqqqqqqqqqqqMYSTRINGwwwwwwwwwwwwwwwwwwwwrrrrrrrrrrrrrrrrrrrrr" > fileonelongline.txt && grep -oE ".{0,20}MYSTRING.{0,20}" ./fileonelongline.txt prints qqqqqqqqqqqqqqqqqqqqMYSTRINGwwwwwwwwwwwwwwwwwwww - Bebe
This works well; but a notable downside is that when using, e.g., -oE ".{0,20}mysearchstring.{0,20}", you lose the highlighting of the inner "original" string against the context, because the whole thing becomes the search pattern. Would love to find a way to keep some non-highlighted context around the search results, for much easier visual scanning and result interpretation. - Castaway
Oh, here's a solution to the highlighting problem caused by using the -oE ".{0,x}foo.{0,x}" approach (where x is the number of characters of context): append ` | grep foo ` to the end. Works for either ack or grep solutions. More solutions also here: unix.stackexchange.com/questions/163726/… - Castaway
A
57

Pipe your results through cut. I'm also considering adding a --cut switch to ack so you could say --cut=80 and only get 80 columns.
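
As a rough sketch of what that looks like in practice (the function name grep_cut is hypothetical, not an ack or grep feature), you can wrap the pipeline so the width is the first argument and everything else goes to grep:

```shell
# Hypothetical helper: run grep, then keep only the first WIDTH columns of each line.
# Usage: grep_cut WIDTH GREP_ARGS...
grep_cut() (
    width=$1
    shift
    grep "$@" | cut -c "1-$width"
)

# Example: prints "match: 0123456789012" (the first 20 characters of the line).
printf '%s\n' 'match: 0123456789012345678901234567890123456789' | grep_cut 20 match
```

Note that cut counts from the start of the line, so a match that sits further right than WIDTH is cut off.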

Arette answered 9/1, 2010 at 21:19 Comment(7)
What if the part that matches is not in the first 80 characters? - Vorticella
FWIW I appended | cut=c1-120 to the grep, worked for me (though don't know how to cut around matched text) - Joost
| cut=c1-120 didn't work for me, I needed to do | cut -c1-120 - Necessitarianism
I think @edib is accurate in syntax | cut -c 1-100 https://mcmap.net/q/120607/-how-to-truncate-long-matching-lines-returned-by-grep-or-ack - Khorma
@AndyLester: What about a --no-wrap option that uses $COLUMNS? - Hurd
@Hurd You can submit it as an issue at github.com/beyondgrep/ack3/issues. Remember that ack also runs on Windows, and I don't know whether $COLUMNS exists there. - Arette
Existing feature request: github.com/beyondgrep/ack3/issues/234 - Moyra
R
28

You could use less as a pager for ack and chop long lines:

ack --pager="less -S"

This retains the long line but leaves it on one line instead of wrapping. To see more of the line, scroll left/right in less with the arrow keys.

I have the following alias set up for ack to do this:

alias ick='ack -i --pager="less -R -S"' 
Rani answered 14/6, 2012 at 18:2 Comment(3)
Please note that you can put that --pager command in your ~/.ackrc file, if you always want to use it. - Arette
This sounds like the best solution by far to this problem that bugs me a lot. I wish I knew how to use ack. - Generation
@BrianPeterson ack is pretty much just like grep, only simpler in the most common cases - Castaway
S
19

grep -oE ".{0,10}error.{0,10}" mylogfile.txt

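In the unusual situation where you cannot use -E, fall back to a basic regular expression (with or without lowercase -e) and escape the braces: grep -o ".\{0,10\}error.\{0,10\}" mylogfile.txt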


Synchronism answered 30/7, 2020 at 2:6 Comment(1)
Do not use backslashes - grep -oE ".{0,10}error.{0,10}" mylogfile.txt - at least in Z shell - Keeney
A
16

To keep only characters 1 through 100 of each line:

cut -c 1-100

You might want to base the range off the current terminal, e.g.

cut -c 1-$(tput cols)
Aranda answered 23/2, 2018 at 18:24 Comment(0)
N
3

I put the following into my .bashrc:

grepl() {
    "$(which grep)" --color=always "$@" | less -RS
}

You can then use grepl on the command line with any arguments that are available for grep. Use the arrow keys to see the tail of longer lines. Use q to quit.

Explanation:

  • grepl() {: Define a new function that will be available in every (new) bash console.
  • $(which grep): Get the full path of grep. (Ubuntu defines an alias for grep that is equivalent to grep --color=auto. We don't want that alias but the original grep.)
  • --color=always: Colorize the output. (--color=auto from the alias won't work since grep detects that the output is put into a pipe and won't color it then.)
  • $@: Put all arguments given to the grepl function here.
  • less: Display the lines using less
  • -R: Show colors
  • -S: Don't break long lines
Nutgall answered 6/11, 2019 at 10:28 Comment(0)
E
2

Taken from: http://www.topbug.net/blog/2016/08/18/truncate-long-matching-lines-of-grep-a-solution-that-preserves-color/

The suggested approach ".{0,10}<original pattern>.{0,10}" is perfectly good, except that the highlighting color is often messed up. I've created a script with similar output that also preserves the color:

#!/bin/bash

# Usage:
#   grepl PATTERN [FILE]

# how many characters around the searching keyword should be shown?
context_length=10

# What is the length of the control character for the color before and after the
# matching string?
# This is mostly determined by the environmental variable GREP_COLORS.
control_length_before=$(($(echo a | grep --color=always a | cut -d a -f '1' | wc -c)-1))
control_length_after=$(($(echo a | grep --color=always a | cut -d a -f '2' | wc -c)-1))

grep -E --color=always "$1" ${2:+"$2"} |
grep --color=none -oE \
    ".{0,$(($control_length_before + $context_length))}$1.{0,$(($control_length_after + $context_length))}"

Assuming the script is saved as grepl, then grepl pattern file_with_long_lines should display the matching lines but with only 10 characters around the matching string.

Exsiccate answered 19/8, 2016 at 1:51 Comment(1)
Works, but outputs trailing junk for me, like this: ^[[?62;9;c. I haven't tried debugging because @Jonah Braun's answer satisfied me. - Rialto
B
2

The Silver Searcher (ag) supports this natively via the --width NUM option. It replaces the rest of longer lines with [...].

Example (truncate after 120 characters):

 $ ag --width 120 '@patternfly'
 ...
 1:{"version":3,"file":"react-icons.js","sources":["../../node_modules/@patternfly/ [...]

In ack3, a similar feature is planned but currently not implemented.

Biopsy answered 24/6, 2021 at 8:43 Comment(1)
But in ag, the width "begins" with the first character, so this doesn't quite work when the string is in the middle of a very long line - Lotti
K
1

Here's what I do:

function grep () {
  tput rmam;
  command grep "$@";
  tput smam;
}

In my .bash_profile, I override grep so that it automatically runs tput rmam before and tput smam after, which disables line wrapping and then re-enables it.
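
A variation on the same idea, offered only as a sketch: the wrapper name nowrap is made up, and the assumption is that you would rather leave grep itself alone and wrap whichever command you like. It skips tput when stdout is not a terminal (so the escape sequences do not end up in pipes or files) and passes the wrapped command's exit status through.

```shell
# Hypothetical wrapper: run any command with terminal line wrapping turned off.
# Usage: nowrap COMMAND [ARGS...]
nowrap() {
    # Only touch the terminal when stdout really is one, so pipes stay clean.
    if [ -t 1 ]; then tput rmam 2>/dev/null || :; fi
    "$@"
    nowrap_rc=$?
    if [ -t 1 ]; then tput smam 2>/dev/null || :; fi
    # Hand back the wrapped command's exit status, not tput's.
    return "$nowrap_rc"
}

# Example: behaves like plain grep when the output is piped or captured.
printf 'hello\n' | nowrap grep hello
```

Keeping the exit status matters if you use grep in conditions such as if nowrap grep -q pattern file; then ...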

Kinny answered 21/11, 2019 at 18:37 Comment(1)
That is a nice alternative - except if the actual match is then off screen... - Woke
A
0

ag can also take the regex trick, if you prefer it:

ag --column -o ".{0,20}error.{0,20}"
Asyut answered 12/5, 2021 at 19:10 Comment(0)
G
0

bgrep if lines don't necessarily fit into memory

grep only works if the lines fit into memory, but bgrep also works on huge lines that don't.

I keep coming back to this random repo from time to time: https://github.com/tmbinc/bgrep

Install:

curl -L 'https://github.com/tmbinc/bgrep/raw/master/bgrep.c' | gcc -O2 -x c -o $HOME/.local/bin/bgrep -

Use:

bgrep `printf %s saf | od -t x1 -An -v | tr -d '\n '` myfile.bin

Sample output:

myfile.bin: c80000003
\x02abc
myfile.bin: c80000007
dabc

I have tested it on files that don't fit into memory, and it worked just fine.
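
In case the od pipeline in the bgrep command above looks cryptic: bgrep takes its pattern as hex, and the pipeline just turns the search string into a run of hex bytes. A small sketch of that step on its own (the function name to_hex is made up for illustration):

```shell
# Hypothetical helper: print the bytes of a string as contiguous lowercase hex.
to_hex() {
    # od prints space-separated hex bytes; tr strips the spaces and newlines.
    printf %s "$1" | od -t x1 -An -v | tr -d '\n '
}

# Example: prints 736166 (s = 73, a = 61, f = 66).
to_hex saf
echo
```

With that defined, the search above could be written as bgrep "$(to_hex saf)" myfile.bin.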

I've given further details at: https://unix.stackexchange.com/questions/223078/best-way-to-grep-a-big-binary-file/758528#758528

Goodsell answered 10/10, 2023 at 7:46 Comment(0)
