parse http response header from wget
I'm trying to extract a line from wget's output but am having trouble with it. This is my wget call:

$ wget -SO- -T 1 -t 1 http://myurl.com:15000/myhtml.html

Output:

--18:24:12--  http://xxx.xxxx.xxxx:15000/myhtml.html
           => `-'
Resolving xxx.xxxx.xxxx... xxx.xxxx.xxxx
Connecting to xxx.xxxx.xxxx|xxx.xxxx.xxxx|:15000... connected.
HTTP request sent, awaiting response...
  HTTP/1.1 302 Found
  Date: Tue, 18 Nov 2008 23:24:12 GMT
  Server: IBM_HTTP_Server
  Expires: Thu, 01 Dec 1994 16:00:00 GMT
  Location: https://xxx.xxxx.xxxx/siteminderagent/...
  Content-Length: 508
  Keep-Alive: timeout=10, max=100
  Connection: Keep-Alive
  Content-Type: text/html; charset=iso-8859-1
Location: https://xxx.xxxx.xxxx//siteminderagent/...
--18:24:13--  https://xxx.xxxx.xxxx/siteminderagent/...
           => `-'
Resolving xxx.xxxx.xxxx... failed: Name or service not known.

if I do this:

$ wget -SO- -T 1 -t 1 http://myurl.com:15000/myhtml.html | egrep -i "302"

It doesn't return the line that contains the string. I just want to check whether the site (or SiteMinder) is up.

Dancette answered 19/11, 2008 at 15:15 Comment(0)

The output of wget you are looking for is written to stderr. You must redirect it:

$ wget -SO- -T 1 -t 1 http://myurl.com:15000/myhtml.html 2>&1 | egrep -i "302" 
Accept answered 19/11, 2008 at 15:20 Comment(0)

wget prints the headers to stderr, not to stdout. You can redirect stderr to stdout as follows:

wget -SO- -T 1 -t 1 http://myurl.com:15000/myhtml.html 2>&1 | egrep -i "302"

The "2>&1" part says to redirect ('>') file descriptor 2 (stderr) to file descriptor 1 (stdout).
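The effect of the redirection can be demonstrated without any network access. A minimal sketch, using a made-up `emit` function as a stand-in for wget that writes to both streams the way `wget -S` does:

```shell
# Stand-in for wget -S: header line to stderr, document body to stdout.
emit() {
  echo "HTTP/1.1 302 Found" >&2   # wget -S prints response headers on stderr
  echo "<html>body</html>"        # the fetched page goes to stdout
}

# Without 2>&1 the pipe carries only stdout, so grep never sees the header.
# (|| true keeps a script going, since grep exits non-zero on no match.)
emit 2>/dev/null | grep -c "302" || true   # prints 0

# With 2>&1, stderr is merged into the pipe and the header line is found.
emit 2>&1 | grep -c "302"                  # prints 1
```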

Alexipharmic answered 19/11, 2008 at 15:23 Comment(1)
Good additional detail to @Piotr's answer. – Insatiate

A slightly enhanced version of the solution already provided:

wget -SO- -T 1 -t 1 http://myurl.com:15000/myhtml.html 2>&1 >/dev/null | grep -c 302

`2>&1 >/dev/null` trims off the unneeded output. This way grep parses only wget's stderr, which eliminates the possibility of matching "302" in stdout (where the HTML file itself is written, along with the download progress bar and the resulting byte counts).

`grep -c` counts the matching lines instead of printing them; knowing how many lines matched is enough here.
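For a plain up/down check, grep's exit status is enough on its own. A sketch, assuming a 200 or 302 from the server means "up" (the `check_up` name and the URL are placeholders):

```shell
# Exit 0 if the server answers with 200 or 302, non-zero otherwise.
# grep -q prints nothing; only its exit status is used.
check_up() {
  wget -SO- -T 1 -t 1 "$1" 2>&1 >/dev/null \
    | grep -qiE 'HTTP/1\.[01] (200|302)'
}

if check_up "http://myurl.com:15000/myhtml.html"; then
  echo "siteminder is up"
else
  echo "site is down"
fi
```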

Jaimiejain answered 3/2, 2011 at 16:38 Comment(0)

wget --server-response http://www.amazon.de/xyz 2>&1 | awk '$1 ~ /^HTTP\// {print $2}'
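wget's indentation of the status line varies between versions (one or two leading spaces), so matching on the first field rather than a fixed indent is a bit more robust. The same idea can be checked offline against canned `--server-response` output:

```shell
# Canned wget --server-response header lines: a redirect, then the final answer.
printf '  HTTP/1.1 302 Found\n  Location: /next\n  HTTP/1.1 200 OK\n' \
  | awk '$1 ~ /^HTTP\// {print $2}'
# prints:
# 302
# 200
```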

Vermiculite answered 20/5, 2014 at 13:26 Comment(0)

Just to explicate a bit. The -S switch in the original question is shorthand for --server-response.

Also, I know the OP specified wget, but curl is similar and writes its headers to stdout by default.

curl --head --silent $yourURL

or

curl -I -s $yourURL

The --silent (-s) switch just turns off the progress meter and is only needed for grep-ability.

Coquille answered 23/1, 2012 at 23:27 Comment(1)
Some servers don't respond to a HEAD request. – Bloc

I found this question while trying to scrape response codes for large lists of URLs, after finding curl very slow (5+ s per request).

Previously, I was using this:

curl -o /dev/null -I --silent --head --write-out %{http_code} https://example.com

Building off Piotr and Adam's answers, I came up with this:

wget -Sq -T 1 -t 1 --no-check-certificate --spider https://example.com 2>&1 | egrep 'HTTP/1.1 ' | cut -d ' ' -f 4

This has a few bugs, e.g. redirects return "302 200", but overall it is greased lightning in comparison.
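The same one-liner can be wrapped in a loop over a URL list. A sketch (the `check_urls` name is made up; `tail -n 1` keeps only the last status line so redirects don't produce "302 200"):

```shell
# check_urls: read URLs on stdin, print "URL status" per line; ERR on failure.
check_urls() {
  while IFS= read -r url; do
    code=$(wget -Sq -T 1 -t 1 --no-check-certificate --spider "$url" 2>&1 \
             | grep 'HTTP/1.1 ' | tail -n 1 | awk '{print $2}')
    echo "$url ${code:-ERR}"
  done
}

# usage (urls.txt: one URL per line):
#   check_urls < urls.txt
```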

Stake answered 1/8, 2022 at 5:55 Comment(0)
