I am using Scrapy with Splash on a JavaScript-driven site. However, I can't get past a "Connection was refused by other side: 10061"
error.
I get logs like this:
[scrapy.downloadermiddlewares.retry] DEBUG: Retrying
<GET https://www2.deloitte.com/ch/en/misc/search.html#country=All#qr=accounting
via http://localhost:8050/render.html> (failed 1 times): Connection
was refused by other side: 10061: No connection could be made because
the target machine actively refused it..
and a traceback pointing to twisted:
twisted.internet.error.ConnectionRefusedError: Connection was refused
by other side: 10061: No connection could be made because the target
machine actively refused it..
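Since the log says the request goes via http://localhost:8050/render.html, the refused connection may be the one to the Splash endpoint rather than to the target site. A minimal sketch for checking whether anything accepts TCP connections on that port, assuming Splash is expected on localhost:8050 as in the log (the helper name port_is_open is hypothetical and not part of Scrapy or Splash):

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError)
        # when nothing is listening on the port or the host is unreachable.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Host and port are taken from the log line above; adjust if Splash
    # is configured to run elsewhere.
    print(port_is_open("localhost", 8050))
```

If this prints False, the refusal would be happening before the request ever reaches the target site.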
I have checked all the entries in settings and tried various USER_AGENTS and ROBOT entries, but no luck. I also tried starting Splash with --disable-private-mode, but it had no effect.
Strangely, just copy-pasting the same URL into the browser works perfectly.
I tried running Scrapy both from the normal command line and via the API. Interestingly, when using the API and clicking the target URL in the error message within PyCharm, the hash sign # is replaced by its escape code. So I am unsure whether this is a separate issue under the hood or whether the two are related.
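For reference, here is a small sketch of how Python's standard library splits this URL and what the escape code for # looks like. It only illustrates the encoding and is not meant as a diagnosis of the error:

```python
from urllib.parse import quote, urlsplit

url = "https://www2.deloitte.com/ch/en/misc/search.html#country=All#qr=accounting"

parts = urlsplit(url)

# Everything after the first '#' is treated as the fragment.
print(parts.path)      # /ch/en/misc/search.html
print(parts.fragment)  # country=All#qr=accounting

# Percent-encoding a literal '#' gives the escape code seen in PyCharm.
print(quote("#"))      # %23
```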
I even tried to inspect the packets sent using both Wireshark and Fiddler, but could not understand the results well enough, as I have never used these tools before.
Any suggestions would be greatly appreciated.