I can't access scrapyd port 6800 from browser

I have searched a lot on this; it may have a simple solution that I am missing.

I have set up scrapy + scrapyd on both my local machine and my server. Both start fine when I run them with the "scrapyd" command.

I can deploy locally without a problem, I can access localhost:6800 from the browser, and I can run spiders locally.

After starting scrapyd on the remote server, I try to deploy to http://remoteip:6800/ the same way I deploy locally, and I get:

Packing version 1500333306
Deploying to project "projectX" in http://remoteip:6800/addversion.json
Deploy failed: <urlopen error [Errno 111] Connection refused>

I also can't access http://remoteip:6800/ from my local PC, but I can access it from an ssh session on the remote machine itself (with curl).

I opened inbound and outbound connections on the remote server. What else am I missing?

Thanks

Hole answered 15/7, 2017 at 19:38 Comment(0)

First, check whether scrapyd is running at all: run curl localhost:6800 on the server where scrapyd is installed.
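
The same check can be scripted from the client side, which helps tell a refused connection (as in the Errno 111 from the question) apart from one that is silently dropped. This is a minimal sketch assuming Python 3 on the probing machine; probe_port is a hypothetical helper name, not part of scrapyd.

```python
import errno
import socket


def probe_port(host, port=6800, timeout=3.0):
    """Try a TCP connect and return 'open', 'refused', 'timeout' or 'error: ...'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but nothing accepts connections on this address/port.
        return "refused"
    except socket.timeout:
        # No answer at all within the timeout.
        return "timeout"
    except OSError as exc:
        # Anything else (DNS failure, unreachable network, ...).
        return "error: " + errno.errorcode.get(exc.errno, str(exc))


# Example call; "remoteip" is the placeholder used in the question:
# print(probe_port("remoteip", 6800))
```

As a rough guide, "refused" from outside while curl works on the server itself often points at the bind address, whereas "timeout" more often points at a firewall dropping packets. Neither result is conclusive on its own.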

Check whether the firewall is enabled:

sudo ufw status

Ideally, just allow TCP connections to port 6800 instead of disabling the firewall. To do so:

sudo ufw allow 6800/tcp
sudo ufw reload

Check your scrapyd.conf and set

bind_address=0.0.0.0

instead of

bind_address=127.x.x.x

0.0.0.0 makes scrapyd listen on all network interfaces, so it accepts incoming connections from outside the server/instance, not only from localhost.
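
To make the difference between the two kinds of address concrete, here is a small sketch using Python's standard ipaddress module. The bind_scope helper and its return strings are made up for illustration, and 192.168.1.10 is just an example address.

```python
import ipaddress


def bind_scope(address):
    """Describe which clients can reach a server bound to `address`."""
    ip = ipaddress.ip_address(address)
    if ip.is_unspecified:   # 0.0.0.0 (or :: for IPv6)
        return "all interfaces"
    if ip.is_loopback:      # anything in 127.0.0.0/8 (or ::1 for IPv6)
        return "local machine only"
    return "only via interface " + str(ip)


for addr in ("127.0.0.1", "0.0.0.0", "192.168.1.10"):
    print(addr, "->", bind_scope(addr))
```

Any 127.x.x.x value falls into the loopback case, which is why curl on the server works while the browser on another machine gets a refused connection.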

Then stop scrapyd (I use killall scrapyd for that).

Then start it again with the scrapyd command.


Note: If you want to keep scrapyd running after you disconnect from the server, run this:

nohup scrapyd >& /dev/null &

Also see my answer on setting up scrapyd as a system service.

Slating answered 16/7, 2017 at 8:8 Comment(2)
Spent the last 8 hours on this and bind_address=0.0.0.0 was the answer. Thanks! - Chromatogram
This is especially helpful for Docker deployments. I tried everything to get scrapyd to serve its admin page inside a Docker container while it worked fine on my host system. Creating a config file and using bind_address=0.0.0.0 solved it. Here's an example config setting: scrapyd.readthedocs.io/en/stable/config.html - Elk

I know this answer may be late, but I hope it can help others like me.

According to the official documentation, scrapyd searches for its config file in these places:

  • /etc/scrapyd/scrapyd.conf (Unix)
  • c:\scrapyd\scrapyd.conf (Windows)
  • /etc/scrapyd/conf.d/* (in alphabetical order, Unix)
  • scrapyd.conf
  • ~/.scrapyd.conf (user's home directory)

So you need to create a scrapyd.conf file in one of these locations and put your settings in it.
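
To see which of these files actually exist on a machine and what bind_address they end up giving, something like the following can help. It is only a sketch using Python's standard configparser, not scrapyd's own loader. It assumes files later in the list override earlier ones (which is how configparser behaves), it skips the conf.d directory for brevity, and effective_bind_address is a hypothetical helper name.

```python
import configparser
import os

# Unix locations from the list above; the conf.d directory is left out.
CANDIDATES = [
    "/etc/scrapyd/scrapyd.conf",
    "scrapyd.conf",
    os.path.expanduser("~/.scrapyd.conf"),
]


def effective_bind_address(paths=CANDIDATES, default="127.0.0.1"):
    """Return (bind_address, files_read) for the given config file paths."""
    parser = configparser.ConfigParser()
    # read() silently skips files that do not exist and returns the ones it parsed.
    files_read = parser.read(paths)
    address = parser.get("scrapyd", "bind_address", fallback=default)
    return address, files_read


# Example call on the server:
# print(effective_bind_address())
```

If the list of files read comes back empty, no config file was found in these locations and the default of 127.0.0.1 applies.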

Here is an example configuration file with all the defaults from the documentation:

[scrapyd]
eggs_dir    = eggs
logs_dir    = logs
items_dir   =
jobs_to_keep = 5
dbs_dir     = dbs
max_proc    = 0
max_proc_per_cpu = 4
finished_to_keep = 100
poll_interval = 5.0
bind_address = 127.0.0.1
http_port   = 6800
debug       = off
runner      = scrapyd.runner
application = scrapyd.app.application
launcher    = scrapyd.launcher.Launcher
webroot     = scrapyd.website.Root

[services]
schedule.json     = scrapyd.webservice.Schedule
cancel.json       = scrapyd.webservice.Cancel
addversion.json   = scrapyd.webservice.AddVersion
listprojects.json = scrapyd.webservice.ListProjects
listversions.json = scrapyd.webservice.ListVersions
listspiders.json  = scrapyd.webservice.ListSpiders
delproject.json   = scrapyd.webservice.DeleteProject
delversion.json   = scrapyd.webservice.DeleteVersion
listjobs.json     = scrapyd.webservice.ListJobs
daemonstatus.json = scrapyd.webservice.DaemonStatus

What you need to do is change bind_address to 0.0.0.0.
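
After changing the setting and restarting scrapyd, one way to confirm it is reachable from another machine is to request the daemonstatus.json service listed in the config above. This is a minimal sketch using only the Python standard library; daemon_status is a hypothetical helper name, and the exact fields in the response depend on the scrapyd version.

```python
import json
import urllib.request


def daemon_status(base_url, timeout=5.0):
    """GET <base_url>/daemonstatus.json and return the decoded JSON body."""
    url = base_url.rstrip("/") + "/daemonstatus.json"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call from the local PC; "remoteip" is the placeholder from the question:
# print(daemon_status("http://remoteip:6800"))
```

If this still raises a "Connection refused" error from the local PC while it works on the server, the bind address or firewall is still blocking access.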

Cephalad answered 5/3, 2020 at 10:15 Comment(0)
