How to send custom headers in a Scrapy Splash request?

My spider.py file looks like this:

def start_requests(self):
    for url in self.start_urls:
        yield scrapy.Request(
            url,
            self.parse,
            headers={'My-Custom-Header': 'Custom-Header-Content'},
            meta={
                'splash': {
                    'args': {
                        'html': 1,
                        'wait': 5,
                    },
                }
            },
        )

And my parse method is as follows:

def parse(self, response):
    print(response.request.headers)

When I run my spider, the following headers are printed:

{
    b'Content-Type': [b'application/json'], 
    b'Accept': [b'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'],
    b'Accept-Language': [b'en'], 
    b'User-Agent': [b'Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.2309.372 Safari/537.36'], 
    b'Accept-Encoding': [b'gzip,deflate']
}

As you can see, the output does not include the custom header I added to the Scrapy request.

Can anybody help me add custom header values to this request?

Thanks in advance.

Bricole answered 14/5, 2019 at 11:36

If you want Splash to use your headers in its request to the URL you specified, add the headers to the args dict, alongside html and wait:

meta={
    'splash': {
        'args': {
            'html': 1,
            'wait': 5,
            'headers': {
                'My-Custom-Header': 'Custom-Header-Content',
            },
        },
    }
}
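
To avoid repeating that dict on every request, it could be wrapped in a small helper. This is only a sketch: splash_meta is a hypothetical name, not part of Scrapy or scrapy-splash, and the default wait of 5 simply mirrors the value used in the question.

```python
def splash_meta(headers, wait=5):
    """Build a Scrapy meta dict that passes `headers` to Splash via args.

    Hypothetical helper for illustration; not part of Scrapy or scrapy-splash.
    """
    return {
        'splash': {
            'args': {
                'html': 1,
                'wait': wait,
                # Copy the dict so later changes by the caller do not leak in.
                'headers': dict(headers),
            },
        }
    }


# Example: the same meta dict as above, built through the helper.
meta = splash_meta({'My-Custom-Header': 'Custom-Header-Content'})
```

In start_requests, the result would then be passed as meta=splash_meta({...}) to scrapy.Request in place of the literal dict.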
Preraphaelite answered 17/5, 2019 at 14:19
