Wget: How to bypass "hotlinking" protected image
S

2

6

Is it possible to bypass "hotlink" image protection? I'm not trying to post the image on other sites, just to download it. When I download the following image using wget:

http://comicsbook.ru/upload/%D0%9A%D0%BE%D0%BC%D0%B8%D0%BA%D1%81-Trollface-%D0%9D%D0%B0-%D0%B1%D0%BE%D1%80%D1%82%D1%83-70813.jpg

I'm getting redirected to:

http://comicsbook.ru/trollface/70813?na-bortu

I have no idea where to start. What I've tried so far:

curl "http://comicsbook.ru" -s -L -b cookie.c -c cookie.c -b "$COOKIEPAR" > index.$TEMP
wget http://comicsbook.ru/upload/%D0%9A%D0%BE%D0%BC%D0%B8%D0%BA%D1%81-Trollface-%D0%9D%D0%B0-%D0%B1%D0%BE%D1%80%D1%82%D1%83-70813.jpg
Spada answered 10/10, 2012 at 21:4 Comment(1)
Where to start: hotlink protection relies on the HTTP Referer header, not on cookies. Berardo
A
4

You can use the --referer=URL option of wget, which sets the HTTP Referer header sent with the request. Perhaps you could try:

wget --referer=http://comicsbook.ru http://comicsbook.ru/upload/%D0%9A%D0%BE%D0%BC%D0%B8%D0%BA%D1%81-Trollface-%D0%9D%D0%B0-%D0%B1%D0%BE%D1%80%D1%82%D1%83-70813.jpg
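If you need this for more than one image, the same idea can be wrapped in a small script. This is only a sketch: referer_for and fetch_image are hypothetical helper names, and it assumes the site accepts its own root URL as the Referer, which may not hold everywhere.

```shell
#!/bin/sh
# Hypothetical helper: derive "scheme://host/" from a full URL.
referer_for() {
  url=$1
  scheme=${url%%://*}   # text before "://"
  rest=${url#*://}      # text after "://"
  host=${rest%%/*}      # host is everything up to the first "/"
  printf '%s://%s/\n' "$scheme" "$host"
}

# Hypothetical wrapper: download a URL while sending its own
# site root as the Referer (assumption: the site only checks the host).
fetch_image() {
  wget --referer="$(referer_for "$1")" "$1"
}
```

Call fetch_image with the quoted image URL as its only argument. If the site checks for a specific page rather than just the host, pass that page's URL to --referer instead.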
Aguish answered 10/10, 2012 at 21:11 Comment(0)
C
1

To download this image, run the following curl command:

curl -e 'http://comicsbook.ru/trollface/70813?na-bortu' -A "Mozilla/5.0" -L -b /tmp/c -c /tmp/c -s 'http://comicsbook.ru/upload/%D0%9A%D0%BE%D0%BC%D0%B8%D0%BA%D1%81-Trollface-%D0%9D%D0%B0-%D0%B1%D0%BE%D1%80%D1%82%D1%83-70813.jpg' > image.jpg

All the magic is in the -e switch, which sets the Referer header.
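Because -L follows redirects, a rejected request can still leave an HTML page saved as image.jpg. A quick way to check the result is to look at the first bytes of the file. This is a sketch, not part of the original command: is_jpeg is a hypothetical helper name, and it assumes head -c and od are available.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the file starts with the
# JPEG magic bytes FF D8 FF.
is_jpeg() {
  # Read the first 3 bytes and print them as contiguous lowercase hex.
  magic=$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')
  [ "$magic" = "ffd8ff" ]
}

# Example use after the download:
#   is_jpeg image.jpg || echo "image.jpg is not a JPEG (probably the redirect page)" >&2
```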

Covert answered 10/10, 2012 at 21:12 Comment(1)
Yes, some of the options aren't needed here, but you can reuse them for other web-scraping use cases; the command is built to behave as much like a real browser as possible. Act
