Delete duplicate commands of zsh_history keeping last occurrence

I'm trying to write a shell script that deletes duplicate commands from my zsh_history file. Having no real shell script experience and given my C background I wrote this monstrosity that seems to work (only on Mac though), but takes a couple of lifetimes to end:

#!/bin/sh

history=./.zsh_history
currentLines=$(grep -c '^' $history)
wordToBeSearched=""
currentWord=""
contrastor=0
searchdex=""

echo "Currently handling a grand total of: $currentLines lines. Please stand by..."
while (( $currentLines - $contrastor > 0 ))
do
    searchdex=1
    wordToBeSearched=$(awk "NR==$currentLines - $contrastor" $history | cut -d ";" -f 2)
    echo "$wordToBeSearched A BUSCAR"
    while (( $currentLines - $contrastor - $searchdex > 0 ))
    do
        currentWord=$(awk "NR==$currentLines - $contrastor - $searchdex" $history | cut -d ";" -f 2)
        echo $currentWord
        if test "$currentWord" == "$wordToBeSearched"
        then
            sed -i .bak "$((currentLines - $contrastor - $searchdex)) d" $history
            currentLines=$(grep -c '^' $history)
            echo "Line deleted. New number of lines: $currentLines"
            let "searchdex--"
        fi
        let "searchdex++"
    done
    let "contrastor++"
done

^THIS IS HORRIBLE CODE NO ONE SHOULD USE^

I'm now looking for a less life-consuming approach using more shell-like conventions, mainly sed at this point. Thing is, zsh_history stores commands in a very specific way:

: 1652789298:0;man sed

Where the command itself is always preceded by ":0;". I'd like to find a way to delete duplicate commands while keeping the last occurrence of each command intact and in order.

Currently I'm at a point where I have a functional line that will delete strange lines that find their way into the file (newlines and such):

#sed -i '/^:/!d' $history

But that's about it. I'm not really sure how to get the expression to look for into sed without falling back into everlasting while loops, or how to delete the duplicates while keeping the last-occurring command.

Cobby answered 18/5, 2022 at 17:37 Comment(0)
L
16

The zsh option hist_ignore_all_dups should do what you want. Just add setopt hist_ignore_all_dups to your zshrc.


If you really want to remove the duplicates manually (or do more interesting things with them), here's a one-liner that produces the same results as zsh itself: only the last occurrence of a command is kept and all entries remain in the same order, regardless of the timestamp.

sed ':start; /\\$/ { N; s/\\\n/\\\x00/; b start }' .zsh_history | nl -nrz | tac | sort -t';' -u -k2 | sort | cut -d$'\t' -f2- | tr '\000' '\n' > .zsh_history_deduped

The main improvement over the other answers is proper handling of commands with multiple lines. These are stored in the history file with backslashes before the newlines, so I use sed to join the lines with null bytes and tr to restore them after deduplicating. zsh has its own internal escaping for null bytes and other special characters, so they won't occur in the file (for the curious, it uses 0x83 as an escape marker and XORs the following byte with 0x20).

Finally, some notes about what the individual commands do:

sed ':start; /\\$/ { N; s/\\\n/\\\x00/; b start }' .zsh_history |  # join multiline commands
  nl -nrz |                  # add line numbers so we can restore the original order
  tac | sort -t';' -u -k2 |  # sort and remove duplicate commands, keeping the last occurrence
  sort |                     # sort on line numbers
  cut -d$'\t' -f2- |         # remove the line numbers
  tr '\000' '\n' > .zsh_history_deduped  # restore multiline commands
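
The one-liner above leans on GNU tools (tac, and a \x00 escape in sed) that may not be available on every system, macOS included. As a rough alternative, the same keep-the-last-occurrence logic can be sketched in plain awk. The function name dedupe_history is made up for illustration, and the sketch assumes that every entry has the form ": <timestamp>:<duration>;<command>" and that only continuation lines end in a backslash.

```shell
# Hypothetical helper: print the history file given as $1 with duplicate
# commands removed, keeping only the last occurrence of each command.
dedupe_history() {
  awk '
    # Record one complete (possibly multi-line) entry.
    function store(e,    c) {
      c = substr(e, index(e, ";") + 1)  # command = text after the first ";"
      n++
      entries[n] = e
      cmds[n] = c
      last[c] = n                       # later occurrences overwrite earlier ones
    }
    {
      # Glue continuation lines onto the entry being built.
      entry = (pending ? entry "\n" $0 : $0)
      pending = ($0 ~ /\\$/)            # a trailing backslash means "more to come"
      if (!pending) store(entry)
    }
    END {
      if (pending) store(entry)         # file ended in the middle of an entry
      for (i = 1; i <= n; i++)
        if (last[cmds[i]] == i) print entries[i]
    }
  ' "$1"
}
```

Usage would look like dedupe_history .zsh_history > .zsh_history_deduped. The whole file is held in memory, which should be fine for a history file of a few thousand entries.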
Lexeme answered 18/5, 2022 at 18:5 Comment(6)
I already have that (it doesn't seem to work, but there it is). The thing is, I want to delete the duplicates currently in the file. - Cobby
Do you have inc_append_history set? It might be this issue: github.com/ohmyzsh/ohmyzsh/issues/9359. - Lexeme
I didn't have it; I just tried with it and it still doesn't work. But that's not what I'm looking for here. As I said before, I want to delete the file's duplicates. - Cobby
This is unrelated to the OP's question, which asks how to remove duplicates from the current file. Modifying the .zshrc just prevents this from happening in the future, and the linked issue is also unrelated... - Boru
In my testing in a clean shell, setting this option makes zsh remove any duplicates when it writes out the history file (at shell exit by default). - Lexeme
That long sed command would be super helpful, but it spits out some junk for me. - Heilman
B
2

I wanted something similar, but I don't care about preserving the last occurrence as you mentioned. This just finds duplicates and removes them.

I used this command, then removed my .zsh_history and replaced it with the .zhistory file that the command outputs.

So from your home folder:

cat -n .zsh_history | sort -t ';' -uk2 | sort -nk1 | cut -f2- > .zhistory

This gives you the file .zhistory containing the deduplicated list. In my case it went from 9000 lines to 3000; you can check with wc -l .zhistory to count the lines it has.

Please double check and make a backup of your zsh history before doing anything with it.

The sort command could probably be modified to sort by numerical value and somehow achieve what you want, but you will have to investigate that further.
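
One possible way to adapt the pipeline above so that it keeps the last occurrence instead: reverse the numbered lines before the unique sort, so that the first line of each run of duplicates is the most recent one. This is only a sketch. It assumes a sort that supports -s (stable) and keeps the first line of each equal run with -u, as GNU sort documents, and it does not handle multi-line commands. The function name keep_last is hypothetical.

```shell
# Hypothetical variant of the pipeline above: keep the LAST occurrence of
# each command in the file given as $1 and print the result in original order.
keep_last() {
  cat -n "$1" |                        # prefix every line with its line number
    LC_ALL=C sort -rn |                # newest lines first
    LC_ALL=C sort -s -t ';' -u -k2 |   # one line per command; stable, so the newest wins
    LC_ALL=C sort -n |                 # back to the original order
    cut -f2-                           # drop the line numbers again
}
```

It would be called as keep_last .zsh_history > .zhistory. LC_ALL=C makes sort compare raw bytes, so two different commands are never treated as equal because of locale collation rules.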

I found the script here, along with some commands to avoid saving duplicates in the future

Boru answered 19/11, 2022 at 10:27 Comment(0)
C
0

I didn't want to rename the history file.

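#!/usr/bin/env zsh
# dedupe_lines.zsh: sort a file and drop exact duplicate lines, in place.
# Note: whole lines are compared (timestamps included), and the output is
# sorted, so the original order is not preserved.

if [ $# -eq 0 ]; then
  echo "Error: No file specified" >&2
  exit 1
fi

if [ ! -f "$1" ]; then
  echo "Error: File not found" >&2
  exit 1
fi

# Write to a temporary file first, then replace the original.
sort "$1" | uniq >temp.txt
mv temp.txt "$1"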

Add dedupe_lines.zsh to your home directory, then make it executable.

chmod +x dedupe_lines.zsh

Run it.

./dedupe_lines.zsh .zsh_history
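
To overwrite the history file without juggling names by hand, one option is a small helper that writes the result to a temporary file next to the original, keeps a backup, and only then swaps the new file in. This is a sketch; rewrite_in_place and the .bak suffix are hypothetical choices, and the filter is whatever command you pass after the file name.

```shell
# Hypothetical helper: run a filter command over a file and replace the file
# with the filter's output, leaving the previous contents in <file>.bak.
# Usage: rewrite_in_place FILE COMMAND [ARGS...]
rewrite_in_place() {
  file=$1
  shift
  # Create the temporary file in the same directory so mv is a plain rename.
  tmp=$(mktemp "${file}.XXXXXX") || return 1
  if "$@" <"$file" >"$tmp" && cp -p "$file" "${file}.bak"; then
    mv "$tmp" "$file"
  else
    # The filter or the backup failed: leave the original untouched.
    rm -f "$tmp"
    return 1
  fi
}
```

For example, rewrite_in_place .zsh_history sort -u would do the same job as the script above while leaving .zsh_history.bak behind.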
Caterinacatering answered 26/12, 2022 at 23:56 Comment(0)
