Rename files using regular expression in linux
W

10

92

I have a set of files named like:

Friends - 6x03 - Tow Ross' Denial.srt
Friends - 6x20 - Tow Mac and C.H.E.E.S.E..srt
Friends - 6x05 - Tow Joey's Porshe.srt

and I want to rename them like the following

S06E03.srt
S06E20.srt
S06E05.srt

What should I do to get this done in a Linux terminal? I have installed rename, but I get errors using the following:

rename -n 's/(\w+) - (\d{1})x(\d{2})*$/S0$2E$3\.srt/' *.srt
Winifield answered 4/8, 2012 at 15:13 Comment(1)
I have shared my solution in another post: https://mcmap.net/q/217146/-recursively-rename-files-using-find-and-sed .Tanana
M
109

You forgot a dot in front of the asterisk:

rename -n 's/(\w+) - (\d{1})x(\d{2}).*$/S0$2E$3\.srt/' *.srt

On OpenSUSE, RedHat, Gentoo you have to use Perl version of rename. This answer shows how to obtain it. On Arch, the package is called perl-rename.
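To see why the missing dot matters without touching any files, here is a minimal sketch that replays both patterns through sed -E on the sample names from the question. sed only stands in for Perl's rename for illustration, POSIX character classes replace \w and \d, and the variable names are made up for this example.

```shell
names="Friends - 6x03 - Tow Ross' Denial.srt
Friends - 6x20 - Tow Mac and C.H.E.E.S.E..srt
Friends - 6x05 - Tow Joey's Porshe.srt"

# Original pattern: ([0-9]{2})*$ demands that the name END right after the
# episode digits, so nothing matches and the names come back unchanged.
broken=$(printf '%s\n' "$names" |
  sed -E 's/([[:alnum:]_]+) - ([0-9])x([0-9]{2})*$/S0\2E\3.srt/')

# Fixed pattern: .*$ swallows the rest of the name after the episode digits.
fixed=$(printf '%s\n' "$names" |
  sed -E 's/([[:alnum:]_]+) - ([0-9])x([0-9]{2}).*$/S0\2E\3.srt/')

printf '%s\n' "$fixed"
```

With the real tool, rename -n gives the same kind of preview, since -n only prints what would be renamed.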

Metalliferous answered 4/8, 2012 at 15:31 Comment(5)
OpenSUSE, RedHat, Gentoo doesn't support regex in renameBetz
@mmrmartin: The rename script used here is the one written by Larry Wall. It used to be in the file /usr/bin/rename, but perhaps it has been renamed (no pun intended)? On Debian the script name is now /usr/bin/file-rename.Metalliferous
openSUSE uses rename from util-linux package, I didn't find any package providing file-rename, prename or perl-rename - only working solution was install using cpan for me.Betz
@mmrmartin Same problem on RHEL 6, which also uses rename based on util-linux. See https://mcmap.net/q/224841/-rename-files-using-regular-expression-in-linux.Decca
Update from 2012 to 2023: in Debian 12 Bookworm, the package is called rename, and the command that calls the Perl script once installed is file-rename.Overhear
D
49

Simple Approach

find + perl + xargs + mv

xargs -n2 passes two arguments to each command invocation. When combined with Perl's print $_ (which prints the original line before the substitution is applied), it makes for a powerful renaming tool.

find . -type f | perl -pe 'print $_; s/input/output/' | xargs -d "\n" -n2 mv

Results of perl -pe 'print $_; s/OldName/NewName/' | xargs -n2 end up being:

OldName1.ext    NewName1.ext
OldName2.ext    NewName2.ext
OldName3.ext    NewName3.ext
OldName4.ext    NewName4.ext

I did not have Perl's rename readily available on my system.


How does it work?

  1. find . -type f outputs file paths (or file names...you control what gets processed by regex here!)
  2. -p prints each line after the script has run on it (that is, the new path), -e executes the inline script
  3. print $_ prints the original file name first (independent of -p)
  4. -d "\n" splits the input on newlines instead of the default whitespace
  5. -n2 passes two arguments to each command invocation
  6. mv receives each pair as its source and destination

Advanced Approach

This is my preferred approach, because it is robust. The advanced part is the added complexity of using null bytes in the standard output of each executable in the pipeline, and dealing with them in the standard inputs of subsequent executables. Why do this? It avoids issues with spaces and newlines in filenames.

Let's say I want to rename all ".txt" files to be ".md" files:

find . -type f -printf '%P\0' | perl -0 -l0 -pe 'print $_; s/(.*)\.txt/$1\.md/' | xargs -0 -n 2 mv

The magic here is that each process in the pipeline supports the null byte (0x00) that is used as a delimiter as opposed to spaces or newlines. The first aforementioned method uses newlines as separators. Note that I tried to easily support find . without using subprocesses. Be careful here (you might want to check your output of find before you run it through a regular expression match, or worse, a destructive command like mv).

How it works (abridged to include only changes from above)

  1. In find: -printf '%P\0' prints each path relative to the starting point (without the leading ./), followed by a null byte. Adjust to your use case, whether matching filenames or entire paths.
  2. In perl and xargs: -0 stdin delimiter is the null byte (rather than space)
  3. In perl: -l0 stdout delimiter is the null byte (in octal 000)
Decca answered 16/1, 2018 at 11:54 Comment(7)
For me, this is the best answer - oneliner with tools available out of the boxRoadblock
this erased all my files. luckily I made a backupGabrila
The last command should be changed to xargs -d '\n' -n2 mv, otherwise xargs will treat spaces in filenames as delimiters and either cause errors, or rename files nonsensically. The -d '\n' argument specifies that newlines should be treated as the delimiter. GNU xargs has the -d argument, but for those implementations that do not (e.g. FreeBSD which I was using), this would work across most environments: find . -type f | perl -pe 'print $_; s/input/output/' | sed 's/ /\\ /g' | xargs -n2 mv by using sed to escape all spaces in the output that's piped to xargs. (Not elegant, perhaps.)Bismarck
@Bismarck A perhaps better way to treat spaces as normal chars is to use a different delimiter char. Xargs supports the 0-byte and so does find. I'd do a find -print0 followed by a xargs -0.Decca
Another improvement would be to pre-filter the results from the find through grep to minimize the no-op renames: find . -type f | grep 'input' | perl -pe 'print $_; s/input/output/' | xargs -n2 mvGravelly
"SomeFile.1400mb.mkv" ... To remove "1400mb" (e.g., file size): for f in `find -type f`; do mv -v "$f" "`echo $f | sed -r 's/[0-9]{1,}.*mb/ /'`"; done Note use of backticks: ` : unix.stackexchange.com/questions/27428/…Damico
@VictoriaStuart Seems like your comment should be an answer, but regardless, I would not recommend your approach, see stackoverflow.com/a/9612560Decca
B
17

Use mmv (mass-move?)

It's simple but useful: The * wildcard matches any string (without /) and ? matches any character in the string to be matched. In the replace string, use #N to refer to the N-th wildcard match.

In your case:

mmv 'Friends - 6x?? - Tow *.srt' 'S06E#1#2.srt'

Here, #1#2 represent the two digits which are captured by ?? (match #1 and #2).
So the following replacement is being made:

The pattern string:     'Friends - 6x?? - Tow *           .srt'
matches this file:       Friends - 6x03 - Tow Ross' Denial.srt
                                     ↓↓
will be renamed to:             S06E03.srt

Personally, I use it to pad numbers such that numbered files appear in the desired order when sorted lexicographically (e.g., 1. appears before 10.): mmv 'file_?.ext' 'file_0#1.ext'
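For systems without mmv, the same zero-padding can be sketched with a plain shell loop. This is not mmv itself, only an equivalent under the assumption that the files really are named file_N.ext; the sample names are invented.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1
touch file_1.ext file_2.ext file_10.ext

# file_?.ext matches single-character numbers only, like the mmv pattern.
for f in file_?.ext; do
  [ -e "$f" ] || continue          # the glob matched nothing
  mv -- "$f" "file_0${f#file_}"    # ${f#file_} is the part after the prefix
done

ls
```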


mmv also offers matching by [ and ] and ;.

You can not only mass rename, but also mass move, copy, append and link files.

See the man page for more!

Barely answered 2/12, 2017 at 11:34 Comment(1)
Thanks! Great tool with a simple syntaxPalatial
I
12

Edit: found a better way to list the files without using IFS and ls while still being sh compliant.

I would do a shell script for that:

#!/bin/sh
for file in *.srt; do
  if [ -e "$file" ]; then
    newname=`echo "$file" | sed 's/^.*\([0-9]\+\)x\([0-9]\+\).*$/S0\1E\2.srt/'`
    mv "$file" "$newname"
  fi
done
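If sed is not wanted at all, the same extraction can be sketched with POSIX parameter expansion alone. new_name is a hypothetical helper, and it assumes the "Title - SxEE - ..." layout from the question, a title without " - " in it, and a season number written without a leading zero.

```shell
# new_name is a hypothetical helper; it only prints the target name.
new_name() {
  rest=${1#* - }          # drop the show title: "6x03 - Tow ..."
  season=${rest%%x*}      # text before the first x: "6"
  episode=${rest#*x}      # text after the first x: "03 - Tow ..."
  episode=${episode%% *}  # keep up to the first space: "03"
  printf 'S%02dE%s.srt\n' "$season" "$episode"
}

new_name "Friends - 6x03 - Tow Ross' Denial.srt"
```

Inside the loop it would be used as mv "$file" "$(new_name "$file")". Unlike the fixed S0 prefix, %02d also copes with two-digit seasons.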

Previous script:

#!/bin/sh
IFS='
'
for file in `ls -1 *.srt`; do
  newname=`echo "$file" | sed 's/^.*\([0-9]\+\)x\([0-9]\+\).*$/S0\1E\2.srt/'`
  mv "$file" "$newname"
done
Intrusion answered 2/2, 2017 at 0:52 Comment(3)
What does IFS='\n' stand for in this example? I like it because it does not use anything special.Gymkhana
IFS: The Internal Field Separator that is used for word splitting after expansion and to split lines into words with the read builtin command. The default value is "<space><tab><newline>" -- (from man bash). Changing it to \n allows to get one file per line.Intrusion
You could extend the script to support recursive action with: for file in `find . -type f`; do (But then you need to update the sed to capture the path also)Twentyfour
P
10

Not every distro ships a rename utility that supports regexes as used in the examples above - RedHat, Gentoo and their derivatives amongst others.

Alternatives to try to use are perl-rename and mmv.

Pentothal answered 18/5, 2015 at 11:17 Comment(0)
N
9

Use regex-rename

It's super easy to install (unlike the other tools):

pip3 install regex-rename

Do the renaming with:

regex-rename "(\d{1})x(\d{2})" "S0\1E\2.srt" --rename

Try "dry-run" mode (without the --rename flag) first to check that the result looks good before the actual renaming. It shows you what was matched by each of the groups, so you can debug your regex until it's fine. It expects 2 arguments: matcher & replacement pattern, no bizarre s/.../.../ syntax. Also, I'm too lazy to do a full match, it just works with the season + episode pattern.

I made it myself as I saw there's no decent tool like this. I'd love to hear your feedback.

Nordstrom answered 14/10, 2022 at 14:47 Comment(2)
Seems very slick. Would love to be able to pass it files via find so there is more control over /not/ entering directories, etcFlemming
love that it defaults to dry run mode.Hyperesthesia
C
7

If your Linux does not offer rename, you could also use the following:

find . -type f -name "Friends*" -execdir bash -c '[[ $1 =~ ([0-9])x([0-9]+) ]] && mv -- "$1" "S0${BASH_REMATCH[1]}E${BASH_REMATCH[2]}.srt"' _ {} \;

I use this kind of snippet quite often to perform regex-based renames in my console.

I am not very good at shell stuff, but as far as I understand this code, it works like this: each search result of find is passed to a bash command (bash -c), where it arrives as $1, the source file (the _ only fills $0). The [[ $1 =~ ... ]] test matches the name against an extended regular expression and stores the capture groups in BASH_REMATCH, from which the target name is built. The {} is the placeholder that -execdir replaces with the current file.

Better explanations would be appreciated a lot :)

Please note: bash's ${1/find/replace} substitution takes glob patterns, not regexes, and has no capture groups, which is why the match is done with =~. Bash regexes are POSIX extended, so \d and \w have to be written as character classes like [0-9] or [[:alnum:]_]. Please test it first with example files.
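Testing with example files first can look like the sketch below, which recreates two of the question's file names in a scratch directory. It relies on bash's [[ =~ ]] operator and the BASH_REMATCH array for the capture groups, since ${var/pattern/replacement} works with glob patterns rather than regexes. The directory and sample files are assumptions made for the demo.

```shell
demo=$(mktemp -d)
touch "$demo/Friends - 6x03 - Tow Ross' Denial.srt" \
      "$demo/Friends - 6x20 - Tow Mac and C.H.E.E.S.E..srt"

# The inner script runs under bash, so [[ =~ ]] and BASH_REMATCH are available.
# $1 is the file found by find; the _ only fills $0.
find "$demo" -type f -name 'Friends*.srt' -execdir bash -c '
  if [[ $1 =~ ([0-9])x([0-9]+) ]]; then
    mv -- "$1" "S0${BASH_REMATCH[1]}E${BASH_REMATCH[2]}.srt"
  fi' _ {} \;

ls "$demo"
```

Because -execdir runs the command inside each file's own directory, the relative target name lands next to the original file.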

Coan answered 9/10, 2017 at 14:7 Comment(1)
As the bash manual says: "-c string If the -c option is present, then commands are read from string. If there are arguments after the string, they are assigned to the positional parameters, starting with $0.", so you can even improve your command: find . -type f -name "Friends*" -execdir bash -c 'mv "$0" "${0/\w+\s*-\s*(\d)x(\d+).*$/S0\1E\2.srt}"' {} \;Amused
P
7

I think the simplest and most universal way is a for loop with sed and mv. First, you can check your regex substitutions in a pipe:

ls *.srt | sed -E 's/.* ([0-9])x([0-9]{2}) .*(\.srt)/S0\1E\2\3/g'

If it prints the correct substitution, just put it in a for loop with mv. Loop over the glob directly and quote the variables, because the file names contain spaces:

for i in *.srt; do
    mv -- "$i" "$(echo "$i" | sed -E 's/.* ([0-9])x([0-9]{2}) .*(\.srt)/S0\1E\2\3/g')"
done
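A slightly more defensive variant of the loop, sketched in a scratch directory: it iterates over the glob with quoted variables so that spaces in names survive, pads the season with 0 to produce the S06E03 form asked for, skips names the pattern does not match, and refuses to overwrite an existing target. The sample files, including notes.srt, are invented for the demo.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1
touch "Friends - 6x03 - Tow Ross' Denial.srt" "Friends - 6x05 - Tow Joey's Porshe.srt"
touch "notes.srt"   # does not match the pattern and must be left alone

for i in *.srt; do
  new=$(printf '%s\n' "$i" | sed -E 's/.* ([0-9])x([0-9]{2}) .*(\.srt)/S0\1E\2\3/')
  [ "$new" = "$i" ] && continue                                # no match: skip
  [ -e "$new" ] && { echo "skip: $new exists" >&2; continue; } # never overwrite
  mv -- "$i" "$new"
done

ls
```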
Polychromatic answered 7/12, 2020 at 21:27 Comment(0)
V
2

You can use rnm:

rnm -rs '/\w+\s*-\s*(\d)x(\d+).*$/S0\1E\2.srt/' *.srt

Explanation:

  1. -rs : replace string of the form /search_regex/replace_part/modifier
  2. (\d) and (\d+) in (\d)x(\d+) are two captured groups (\1 and \2 respectively).

More examples here.

Volition answered 7/5, 2016 at 8:43 Comment(1)
Works like a charm, and it also shows the transformation of the file name before taking any action. <3Progestin
L
1

If you use rnr, the command would be:

rnr -f '.*(\d{1})x(\d{2}).*' 'S0${1}E${2}.srt' *.srt

rnr has the benefit of being able to undo the command.

Landlady answered 16/12, 2022 at 12:14 Comment(0)
