wget: downloaded file name
E

11

26

I'm writing a Bash script and need to get the name of the file that wget downloads and store it in $string.

For example, if I download the file below, I want to put its name, mxKL17DdgUhcr.jpg, into $string.

wget http://pics.sitename.com/images/191211/mxKL17DdgUhcr.jpg
45439 (44K) [image/jpeg]
Saving to: «mxKL17DdgUhcr.jpg»

100%[===================================================================================================>] 45 439      --.-K/s   in 0s

2011-12-20 12:25:33 (388 MB/s) - «mxKL17DdgUhcr.jpg» saved [45439/45439]
Extravaganza answered 20/12, 2011 at 10:30 Comment(1)
maybe wget --content-disposition 'url'Chalcidice
T
41

Use the basename command to extract the filename from the URL. For example:

url=http://pics.sitename.com/images/191211/mxKL17DdgUhcr.jpg
filename=$(basename "$url")
wget "$url"
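A minimal sketch of the same idea that first drops any query string or fragment, so a URL ending in something like ?raw=1 still yields a clean name. The function name url_basename and the ?size=large suffix are made up for this illustration, and the result is still only a guess from the URL: it cannot see redirects or a name suggested by the server.

```shell
#!/usr/bin/env bash
# url_basename is a hypothetical helper, not part of wget or coreutils.
url_basename() {
  local url=$1
  url=${url%%'#'*}   # drop a fragment such as #section
  url=${url%%\?*}    # drop a query string such as ?raw=1
  url=${url%/}       # drop a single trailing slash, if any
  printf '%s\n' "${url##*/}"   # keep what follows the last slash
}

# The query string here is invented for the example.
filename=$(url_basename 'http://pics.sitename.com/images/191211/mxKL17DdgUhcr.jpg?size=large')
echo "$filename"
```

Only Bash parameter expansion is used, so no external command is started for the parsing.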
Taking answered 20/12, 2011 at 10:33 Comment(2)
Warning: This will not work for URLs that involve redirects or dynamic content. Refer to est's answer for the correct solution.Outbid
I like it! But it also won't quite work if there are URL parameters. For example https://github.com/awslabs/aws-well-architected-labs/blob/master/Reliability/300_Testing_for_Resiliency_of_EC2_RDS_and_S3/Code/Python/server.py?raw=1Pt
S
51
filename=$(wget --server-response -q -O - "https://very.long/url/here" 2>&1 |
  grep "Content-Disposition:" | tail -1 |
  awk 'match($0, /filename=(.+)/, f){ print f[1] }')

This is the correct approach, because there may be several 301/302 redirects before the final response, whose Content-Disposition: header sets the file name.

Guessing the file name from the URL is not always correct.
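The three-argument match() is a gawk extension, so here is a sketch of the same header parsing done with grep and POSIX sed instead. The helper name content_disposition_name and the sample headers are invented for illustration. It only looks at a plain filename= parameter and ignores the extended filename*= form.

```shell
#!/usr/bin/env bash
# content_disposition_name is a hypothetical helper: it reads response headers
# on stdin and prints the filename from the last Content-Disposition header.
content_disposition_name() {
  tr -d '\r' |                            # headers may end in CRLF
    grep -i 'content-disposition:' |      # header names are case-insensitive
    tail -n 1 |                           # keep the header of the final response
    sed -n 's/.*filename="\{0,1\}\([^";]*\).*/\1/p'
}

# Made-up headers imitating a redirect followed by the final response.
sample_headers='  HTTP/1.1 302 Found
  Location: https://cdn.example.com/f/123
  HTTP/1.1 200 OK
  Content-Type: application/pdf
  Content-Disposition: attachment; filename="report.pdf"'

printf '%s\n' "$sample_headers" | content_disposition_name
```

With a real download, the headers could be fed in with something like wget --server-response -q -O /dev/null "$url" 2>&1 | content_disposition_name, which discards the body rather than sending it through the pipe.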

Sardella answered 14/1, 2013 at 4:31 Comment(5)
I like this approach, but unfortunately the awk in Debian derivatives (Ubuntu, e.g.) does not support the 3rd argument in match.Marquetry
while not always perfectly accurate this is the correct approach.Locomotion
In Ubuntu, you can use: wget --server-response -q -O - "https://very.long/url/here" 2>&1 | grep "Content-Disposition:" | tail -1 | awk -F"filename=" '{print $2}'Outbid
Modern easy way to achieve it: wget {link} --content-dispositionMaterials
@balbelias: this will correctly make wget use the name suggested by the server. But... how to retrieve it to assign it to a variable?Memberg
L
24

You can just specify the filename before downloading, with the -O option to wget:

wget -O myfile.html http://www.example.com/
Lanthanide answered 20/12, 2011 at 10:33 Comment(1)
While not as "clever" as the other answers, this method actually has the advantage of simplicity and predictabilityPt
D
5

As PizzaBeer mentioned, wget reports where it is going to save the file. That matters because wget avoids overwriting existing files by appending a number to the filename.

So here's my solution with grep to narrow down the good line (--line-buffered is needed because of how wget works, see here) and sed to extract the filename.

wget --content-disposition "$1" 2>&1 | grep --line-buffered "Saving to" | sed -r 's/Saving to: ‘(.*)’/\1/'

You can store this in a variable, which will be populated at the end of the download.
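The quote characters around the name vary with the wget version and locale (the question's log shows «…», for instance), so a fixed sed pattern can miss. Below is a sketch that strips several common quote styles using only Bash string operations. The helper name saved_name is hypothetical, and it assumes the log line starts with the English text "Saving to: ".

```shell
#!/usr/bin/env bash
# saved_name is a hypothetical helper: it reads wget's log on stdin and prints
# the name from the last "Saving to:" line, minus the surrounding quotes.
saved_name() {
  local line name='' q
  # The "|| [ -n ... ]" part also handles a last line without a newline.
  while IFS= read -r line || [ -n "$line" ]; do
    case $line in
      'Saving to: '*) name=${line#'Saving to: '} ;;
    esac
  done
  # Strip whichever quote characters this wget build and locale used.
  for q in '‘' '’' '«' '»' '`' "'" '"'; do
    name=${name#"$q"}
    name=${name%"$q"}
  done
  printf '%s\n' "$name"
}

# Log lines copied from the question.
log='45439 (44K) [image/jpeg]
Saving to: «mxKL17DdgUhcr.jpg»'

printf '%s\n' "$log" | saved_name
```

In a script this might be used as string=$(LC_ALL=C wget --content-disposition "$url" 2>&1 | saved_name), where LC_ALL=C is intended to keep wget's messages in English. Because the whole line after the prefix is kept, names containing spaces survive.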

Dallis answered 25/5, 2021 at 14:28 Comment(1)
I found that the quotes on the sed command were formatted incorrectly, but this worked - sed -r "s/Saving to: '(.*)'/\1/"Astound
S
3

You can be explicit about the name like this:

url='http://pics.sitename.com/images/191211/mxKL17DdgUhcr.jpg'
file=$(basename "$url")
wget "$url" -O "$file"
Segal answered 20/12, 2011 at 10:33 Comment(0)
H
2

To handle URL-encoded filenames:

URL="http://www.example.com/ESTAD%C3%8DSTICA(2012).pdf"
BASE=$(basename "${URL}")             # ESTAD%C3%8DSTICA(2012).pdf
FILE=$(printf '%b' "${BASE//%/\\x}")  # ESTADÍSTICA(2012).pdf
wget "${URL}"
Hussy answered 7/6, 2013 at 9:15 Comment(0)
W
2
#!/bin/bash
file=$(wget "$1" 2>&1 | grep Saving | cut -d ' ' -f 3 | sed -e 's/[^A-Za-z0-9._-]//g')

I like this because wget already tells you the filename it is saving to. The sed strips non-filename characters, i.e. the quotation marks.

Walburga answered 26/12, 2019 at 18:18 Comment(0)
S
1
~ $ URL='http://pics.sitename.com/images/191211/mxKL17DdgUhcr.jpg'
~ $ echo "${URL##*/}"
mxKL17DdgUhcr.jpg
~ $ wget "$URL" -O "${URL##*/}"
--18:34:26--  http://pics.sitename.com/images/191211/mxKL17DdgUhcr.jpg
           => `mxKL17DdgUhcr.jpg'
Severn answered 20/12, 2011 at 10:35 Comment(0)
M
1

An alternative to @Gowtham Gopalakrishnan's answer is simply:

wget --server-response -q "https://very.long/url/here" 2>&1 | awk -F"filename=" '{if ($2) print $2}'

Which just outputs the name of the file that is set in the content disposition.
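The value comes back exactly as the server wrote it, which often means it is still wrapped in double quotes. A small sketch for cleaning it up before storing it in a variable follows; strip_quotes is a hypothetical helper name, and the sample value is only an illustration.

```shell
#!/usr/bin/env bash
# strip_quotes is a hypothetical helper that removes a trailing carriage
# return and one pair of surrounding double quotes from its argument.
strip_quotes() {
  local value=$1
  value=${value%$'\r'}   # header lines may end in CR
  value=${value%\"}      # trailing quote, if present
  value=${value#\"}      # leading quote, if present
  printf '%s\n' "$value"
}

raw='"filename-that-i-liek.zip"'   # sample value with the header's quotes intact
filename=$(strip_quotes "$raw")
echo "$filename"
```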

Example

$ wget --server-response -q https://hostname/filename-that-i-liek.zip 2>&1 | awk -F"filename=" '{if ($2) print $2}'
"filename-that-i-liek.zip"
Mosira answered 12/10, 2019 at 13:50 Comment(0)
L
0

I guess you already have the full URL of the file somewhere in a variable. Use Bash parameter expansion to strip the prefix:

echo "${url##*/}"
Lefton answered 20/12, 2011 at 10:34 Comment(0)
H
-4

So you want to give the file / image name as a parameter.

Try this:

echo -n "Give me the name of the file in http://pics.sitename.com/images/191211/ :"

read -r string

wget "http://pics.sitename.com/images/191211/$string"

I think this could help you

Haemal answered 20/12, 2011 at 11:55 Comment(0)
