How to remove duplicate commands from Bash history file?

I have configured my own history file, .bash_myhistory:

export HISTFILESIZE=
export HISTSIZE=
export HISTTIMEFORMAT="[%F %T] "

export HISTFILE=~/.bash_myhistory
PROMPT_COMMAND="history -a; history -r; $PROMPT_COMMAND"

When I run history, it shows many repeated entries:

 $ history | grep 'git rebase'
   75  [2018-05-23 16:39:39] git rebase -p dev_hypermouse 
  168  [2018-05-23 19:27:39] man git rebase 
  547  [2018-05-25 19:01:44] git rebase master 
  639  [2018-05-25 20:24:52] git rebase master 
  869  [2018-05-28 14:07:33] git rebase xxx
  921  [2018-05-28 16:12:20] git rebase dash_v2 
  922  [2018-05-28 16:12:33] man git rebase
  925  [2018-05-28 16:13:21] man git rebase
  927  [2018-05-28 16:15:42] git rebase xxx dash_v2 
  937  [2018-05-28 16:17:46] git rebase --onto dash_v2 xxx 296-ToS-component 
 2177  [2018-05-23 16:39:39] git rebase -p dev_hypermouse 
 2270  [2018-05-23 19:27:39] man git rebase 
 2649  [2018-05-25 19:01:44] git rebase master 
 2741  [2018-05-25 20:24:52] git rebase master 
 2971  [2018-05-28 14:07:33] git rebase xxx
 3023  [2018-05-28 16:12:20] git rebase dash_v2 
 3024  [2018-05-28 16:12:33] man git rebase
 3027  [2018-05-28 16:13:21] man git rebase
 3029  [2018-05-28 16:15:42] git rebase xxx dash_v2 
 3039  [2018-05-28 16:17:46] git rebase --onto dash_v2 xxx 296-ToS-component 
 4239  [2018-05-23 19:27:39] man git rebase 
 4618  [2018-05-25 19:01:44] git rebase master 
 4710  [2018-05-25 20:24:52] git rebase master 
 4940  [2018-05-28 14:07:33] git rebase xxx
 4992  [2018-05-28 16:12:20] git rebase dash_v2 
 4993  [2018-05-28 16:12:33] man git rebase
 4996  [2018-05-28 16:13:21] man git rebase
 4998  [2018-05-28 16:15:42] git rebase xxx dash_v2 
 5008  [2018-05-28 16:17:46] git rebase --onto dash_v2 xxx 296-ToS-component 
 ...

But $ cat ~/.bash_myhistory | grep 'git rebase' does not:

man git rebase 
git rebase master 
git rebase master 
git rebase xxx
git rebase dash_v2 
man git rebase
man git rebase
git rebase xxx dash_v2 
git rebase --onto dash_v2 xxx 296-ToS-component 
man git rebase
man git rebase
history | grep git rebase
history | grep 'git rebase'

How can I fix the repeated output from history?

Update:

With export HISTCONTROL=ignoreboth:erasedups, history looks much better, but duplicates still exist:

$ history | grep 'git rebase'
   34  [2018-05-23 19:27:39] man git rebase 
  413  [2018-05-25 19:01:44] git rebase master 
  505  [2018-05-25 20:24:52] git rebase master 
  735  [2018-05-28 14:07:33] git rebase xxx
  787  [2018-05-28 16:12:20] git rebase dash_v2 
  788  [2018-05-28 16:12:33] man git rebase
  791  [2018-05-28 16:13:21] man git rebase
  793  [2018-05-28 16:15:42] git rebase xxx dash_v2 
  803  [2018-05-28 16:17:46] git rebase --onto dash_v2 xxx 296-ToS-component 
 2038  [2018-06-02 14:49:31] man git rebase
 2058  [2018-06-02 14:52:33] man git rebase
 2060  [2018-06-02 15:11:08] history | grep git rebase
 2061  [2018-06-02 15:11:13] history | grep 'git rebase'
 2063  [2018-06-02 15:12:45] cat .bash_myhistory | grep 'git rebase'
 2064  [2018-06-02 15:09:58] man git rebase
 2077  [2018-06-02 15:35:41] history | grep 'git rebase'
 2111  [2018-05-23 19:27:39] man git rebase 
 2490  [2018-05-25 19:01:44] git rebase master 
 2582  [2018-05-25 20:24:52] git rebase master 
 2812  [2018-05-28 14:07:33] git rebase xxx
 2864  [2018-05-28 16:12:20] git rebase dash_v2 
 2865  [2018-05-28 16:12:33] man git rebase
 2868  [2018-05-28 16:13:21] man git rebase
 2870  [2018-05-28 16:15:42] git rebase xxx dash_v2 
 2880  [2018-05-28 16:17:46] git rebase --onto dash_v2 xxx 296-ToS-component 
 4115  [2018-06-02 14:49:31] man git rebase
 4135  [2018-06-02 14:52:33] man git rebase
 4137  [2018-06-02 15:11:08] history | grep git rebase
 4138  [2018-06-02 15:11:13] history | grep 'git rebase'
 4140  [2018-06-02 15:12:45] cat .bash_myhistory | grep 'git rebase'
 4141  [2018-06-02 15:09:58] man git rebase
 4154  [2018-06-02 15:35:41] history | grep 'git rebase'

Update:
Even after adding export HISTCONTROL=ignoreboth:erasedups, history looks like this:

25988  [2018-07-26 17:13:19] gd 1
25989  [2018-07-26 15:45:47] mc
25990  [2018-07-26 13:57:46] mc
25991  [2018-07-26 09:23:28] mc

I also notice that some commands disappear from history =(

Bullivant answered 2/6, 2018 at 12:17
export HISTCONTROL=ignoreboth:erasedups

From the bash man page:

HISTCONTROL

A colon-separated list of values controlling how commands are saved on the history list. If the list of values includes ignorespace, lines which begin with a space character are not saved in the history list. A value of ignoredups causes lines matching the previous history entry to not be saved. A value of ignoreboth is shorthand for ignorespace and ignoredups. A value of erasedups causes all previous lines matching the current line to be removed from the history list before that line is saved. Any value not in the above list is ignored. If HISTCONTROL is unset, or does not include a valid value, all lines read by the shell parser are saved on the history list, subject to the value of HISTIGNORE. The second and subsequent lines of a multi-line compound command are not tested, and are added to the history regardless of the value of HISTCONTROL.
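
Note that erasedups acts on the in-memory history list at the moment a new line is saved. The PROMPT_COMMAND in the question runs history -r at every prompt, and history -r appends the contents of the history file to the in-memory list, which would explain repeated blocks like the ones shown there. Below is a minimal sketch of one possible setup, assuming the goal is to keep the in-memory list in sync with the file. Clearing the list before re-reading is a common workaround and an assumption here, not something stated in the man page excerpt above.

```bash
# Sketch only: review before putting it in ~/.bashrc.
export HISTCONTROL=ignoreboth:erasedups

# history -a  append this session's new lines to $HISTFILE
# history -c  clear the in-memory history list
# history -r  re-read $HISTFILE into the now-empty list
# Clearing first keeps -r from stacking another copy of the file on every prompt.
PROMPT_COMMAND="history -a; history -c; history -r${PROMPT_COMMAND:+; $PROMPT_COMMAND}"
```

Because history -a only appends to the file, duplicates that are already stored in the file are still re-read on each prompt, so the file itself may need a one-time cleanup.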

Advantageous answered 2/6, 2018 at 12:20 Comment(10)
With this, history looks much better, but duplicates still exist. What else can you advise? - Bullivant
Try awk '!seen[$0]++' file.txt where file.txt is .bash_history. Make sure to take a backup first! :-) From here. - Hogfish
That works (except that it does not delete the timestamp line preceding each removed command). But it will not prevent future duplicates. - Bullivant
Setting export HISTCONTROL=ignoreboth:erasedups will avoid future duplicates! - Hogfish
I'll fire up my Linux box at home this evening and check it out. Thanks for the heads up! - Hogfish
I assume this removes/ignores adjacent duplicates, not all duplicates (right?) - Inglorious
@Advantageous It doesn't remove all duplicates. - Independence
@Independence - I thought we had been through that - it doesn't remove the existing dups - it'll stop new dups - but the awk command should remove existing dups. - Hogfish
@Advantageous I came up with a bash function that I'll publish later on gh/gl/cb. It sets HISTTIMEFORMAT at runtime for more controllable awk splitting. - Independence
Let me know - assuming it's better than mine, I'll upvote and suggest to the OP (he's about) that he change his accepted answer. - Hogfish
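
The awk one-liner from the comments leaves orphaned timestamp lines behind when HISTTIMEFORMAT is set, because bash then stores each entry as a `#<epoch>` line followed by the command. Here is a hedged sketch of a timestamp-aware cleanup. The function name dedupe_history is hypothetical, and it assumes every command occupies a single line and that no command itself looks like `#` followed only by digits. It keeps the most recent occurrence of each command together with its timestamp.

```bash
# dedupe_history FILE
# Prints FILE to stdout with duplicate commands removed, keeping the last
# occurrence of each command and the timestamp line that precedes it.
dedupe_history() {
  awk '
    /^#[0-9]+$/ { pending = $0; next }   # timestamp line: hold it for the next command
    {
      n++
      cmd[n]   = $0        # the command text
      stamp[n] = pending   # its timestamp, or empty if there was none
      last[$0] = n         # index of the latest occurrence of this command
      pending  = ""
    }
    END {
      for (i = 1; i <= n; i++) {
        if (last[cmd[i]] != i) continue  # a later copy exists, drop this one
        if (stamp[i] != "") print stamp[i]
        print cmd[i]
      }
    }
  ' "$1"
}

# Example usage (work from a backup, never in place):
#   cp ~/.bash_myhistory ~/.bash_myhistory.bak
#   dedupe_history ~/.bash_myhistory.bak > ~/.bash_myhistory
```

Lines that differ only in trailing whitespace are treated as different commands, just as with the original one-liner.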
