Connection was refused by other side: 10061: No connection could be made because the target machine actively refused it

My steps:

  1. Build the image: docker build . -t scrapy
  2. Run a container: docker run -it -p 8050:8050 --rm scrapy
  3. In the container, run the scrapy project: scrapy crawl foobar -o allobjects.json

This works locally, but on my production server I get this error:

[scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.example.com via http://localhost:8050/execute> (failed 1 times): Connection was refused by other side: 10061: No connection could be made because the target machine actively refused it..

Note: I'm NOT using Docker Desktop, nor can I on this server.

Dockerfile

FROM mcr.microsoft.com/windows/servercore:ltsc2019

SHELL ["powershell", "-Command", "$ErrorActionPreference = 'Stop'; $ProgressPreference = 'SilentlyContinue';"]

RUN setx /M PATH $('C:\Users\ContainerAdministrator\miniconda3\Library\bin;C:\Users\ContainerAdministrator\miniconda3\Scripts;C:\Users\ContainerAdministrator\miniconda3;' + $Env:PATH)
RUN Invoke-WebRequest "https://repo.anaconda.com/miniconda/Miniconda3-py38_4.10.3-Windows-x86_64.exe" -OutFile miniconda3.exe -UseBasicParsing; \
    Start-Process -FilePath 'miniconda3.exe' -Wait -ArgumentList '/S', '/D=C:\Users\ContainerAdministrator\miniconda3'; \
    Remove-Item .\miniconda3.exe; \
    conda install -y -c conda-forge scrapy;

RUN pip install scrapy-splash
RUN pip install scrapy-user-agents
    
#creates root directory if not exists, then enters it
WORKDIR /root/scrapy

COPY scrapy /root/scrapy

settings.py

SPLASH_URL = 'http://localhost:8050/'

OUTPUT with command scrapy crawl foobar -o allobjects.json

2021-09-15 20:12:16 [scrapy.core.engine] INFO: Spider opened
2021-09-15 20:12:16 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2021-09-15 20:12:16 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
2021-09-15 20:12:16 [py.warnings] WARNING: C:\Users\ContainerAdministrator\miniconda3\lib\site-packages\scrapy_splash\request.py:41: ScrapyDeprecationWarning: Call to deprecated function to_native_str. Use to_unicode instead.
  url = to_native_str(url)

2021-09-15 20:12:16 [scrapy_user_agents.middlewares] DEBUG: Assigned User-Agent Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.117 Safari/537.36
2021-09-15 20:12:16 [scrapy_user_agents.middlewares] DEBUG: Assigned User-Agent Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.108 Safari/537.36
2021-09-15 20:12:17 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.example.com via http://localhost:8050/execute> (failed 1 times): Connection was refused by other side: 10061: No connection could be made because the target machine actively refused it..
2021-09-15 20:12:17 [scrapy_user_agents.middlewares] DEBUG: Assigned User-Agent Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36
2021-09-15 20:12:18 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.example.com via http://localhost:8050/execute> (failed 2 times): Connection was refused by other side: 10061: No connection could be made because the target machine actively refused it..
2021-09-15 20:12:18 [scrapy_user_agents.middlewares] DEBUG: Assigned User-Agent Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.146 Safari/537.36
2021-09-15 20:12:19 [scrapy.downloadermiddlewares.retry] ERROR: Gave up retrying <GET https://www.example.com via http://localhost:8050/execute> (failed 3 times): Connection was refused by other side: 10061: No connection could be made because the target machine actively refused it..
2021-09-15 20:12:19 [scrapy.core.scraper] ERROR: Error downloading <GET https://www.example.com via http://localhost:8050/execute>
Traceback (most recent call last):
  File "C:\Users\ContainerAdministrator\miniconda3\lib\site-packages\scrapy\core\downloader\middleware.py", line 45, in process_request
    return (yield download_func(request=request, spider=spider))
twisted.internet.error.ConnectionRefusedError: Connection was refused by other side: 10061: No connection could be made because the target machine actively refused it..
2021-09-15 20:12:19 [scrapy.core.engine] INFO: Closing spider (finished)
2021-09-15 20:12:19 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/exception_count': 3,
 'downloader/exception_type_count/twisted.internet.error.ConnectionRefusedError': 3,
 'downloader/request_bytes': 4632,
 'downloader/request_count': 3,
 'downloader/request_method_count/POST': 3,
 'elapsed_time_seconds': 3.310168,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2021, 9, 15, 18, 12, 19, 605641),
 'log_count/DEBUG': 6,
 'log_count/ERROR': 2,
 'log_count/INFO': 10,
 'log_count/WARNING': 46,
 'retry/count': 2,
 'retry/max_reached': 1,
 'retry/reason_count/twisted.internet.error.ConnectionRefusedError': 2,
 'scheduler/dequeued': 4,
 'scheduler/dequeued/memory': 4,
 'scheduler/enqueued': 4,
 'scheduler/enqueued/memory': 4,
 'splash/execute/request_count': 1,
 'start_time': datetime.datetime(2021, 9, 15, 18, 12, 16, 295473)}
2021-09-15 20:12:19 [scrapy.core.engine] INFO: Spider closed (finished)

What am I missing?

I already checked here:

UPDATE 1

I included EXPOSE 8050 in my Dockerfile, but get the same error. I tried netstat -a inside the docker container, but 8050 seems not to be in there?

C:\root\scrapy>netstat -a

Active Connections

  Proto  Local Address          Foreign Address        State
  TCP    0.0.0.0:135            c60d48724046:0         LISTENING
  TCP    0.0.0.0:5985           c60d48724046:0         LISTENING
  TCP    0.0.0.0:47001          c60d48724046:0         LISTENING
  TCP    0.0.0.0:49152          c60d48724046:0         LISTENING
  TCP    0.0.0.0:49153          c60d48724046:0         LISTENING
  TCP    0.0.0.0:49154          c60d48724046:0         LISTENING
  TCP    0.0.0.0:49155          c60d48724046:0         LISTENING
  TCP    0.0.0.0:49159          c60d48724046:0         LISTENING
  TCP    [::]:135               c60d48724046:0         LISTENING
  TCP    [::]:5985              c60d48724046:0         LISTENING
  TCP    [::]:47001             c60d48724046:0         LISTENING
  TCP    [::]:49152             c60d48724046:0         LISTENING
  TCP    [::]:49153             c60d48724046:0         LISTENING
  TCP    [::]:49154             c60d48724046:0         LISTENING
  TCP    [::]:49155             c60d48724046:0         LISTENING
  TCP    [::]:49159             c60d48724046:0         LISTENING
  UDP    0.0.0.0:5353           *:*
  UDP    0.0.0.0:5355           *:*
  UDP    127.0.0.1:51352        *:*
  UDP    [::]:5353              *:*
  UDP    [::]:5355              *:*
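
To double-check a long netstat listing like the one above without reading it by eye, a small script can extract the listening TCP ports. This is only a minimal sketch: the function name `listening_tcp_ports` and the regular expression are my own illustration, and it assumes the Windows `netstat -a` column layout shown in this post.

```python
import re

# Matches Windows `netstat -a` rows such as:
#   TCP    0.0.0.0:135            c60d48724046:0         LISTENING
# Group 2 captures the local port; the regex is an assumption based on
# the column layout shown in this post.
_LISTENING_ROW = re.compile(r"^\s*TCP\s+(\S+):(\d+)\s+\S+\s+LISTENING\s*$")


def listening_tcp_ports(netstat_output):
    """Return the set of TCP ports reported as LISTENING."""
    ports = set()
    for line in netstat_output.splitlines():
        match = _LISTENING_ROW.match(line)
        if match:
            ports.add(int(match.group(2)))
    return ports


# A few rows copied from the container output above.
sample = """\
  Proto  Local Address          Foreign Address        State
  TCP    0.0.0.0:135            c60d48724046:0         LISTENING
  TCP    0.0.0.0:5985           c60d48724046:0         LISTENING
  TCP    [::]:135               c60d48724046:0         LISTENING
  UDP    0.0.0.0:5353           *:*
"""

ports = listening_tcp_ports(sample)
print(sorted(ports))   # [135, 5985]
print(8050 in ports)   # False
```

If 8050 is missing from the result, nothing inside that network namespace is accepting connections on it, which would be consistent with the "actively refused" error.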

UPDATE 2

Commands I ran on my host OS:

docker ps output:

CONTAINER ID   IMAGE     COMMAND                    CREATED          STATUS          PORTS                    NAMES
bf615a00b74a   scrapy    "c:\\windows\\system32…"   52 seconds ago   Up 49 seconds   0.0.0.0:8050->8050/tcp   blissful_brahmagupta

netstat -a output (I changed ip/server names for anonymity):

Active Connections

  Proto  Local Address          Foreign Address        State
  TCP    0.0.0.0:21             exampleserver:0             LISTENING
  TCP    0.0.0.0:25             exampleserver:0             LISTENING
  TCP    0.0.0.0:80             exampleserver:0             LISTENING
  TCP    0.0.0.0:110            exampleserver:0             LISTENING
  TCP    0.0.0.0:135            exampleserver:0             LISTENING
  TCP    0.0.0.0:143            exampleserver:0             LISTENING
  TCP    0.0.0.0:443            exampleserver:0             LISTENING
  TCP    0.0.0.0:445            exampleserver:0             LISTENING
  TCP    0.0.0.0:587            exampleserver:0             LISTENING
  TCP    0.0.0.0:995            exampleserver:0             LISTENING
  TCP    0.0.0.0:1433           exampleserver:0             LISTENING
  TCP    0.0.0.0:2179           exampleserver:0             LISTENING
  TCP    0.0.0.0:3306           exampleserver:0             LISTENING
  TCP    0.0.0.0:3389           exampleserver:0             LISTENING
  TCP    0.0.0.0:5985           exampleserver:0             LISTENING
  TCP    0.0.0.0:8983           exampleserver:0             LISTENING
  TCP    0.0.0.0:33060          exampleserver:0             LISTENING
  TCP    0.0.0.0:47001          exampleserver:0             LISTENING
  TCP    0.0.0.0:49231          exampleserver:0             LISTENING
  TCP    0.0.0.0:49664          exampleserver:0             LISTENING
  TCP    0.0.0.0:49665          exampleserver:0             LISTENING
  TCP    0.0.0.0:49666          exampleserver:0             LISTENING
  TCP    0.0.0.0:49667          exampleserver:0             LISTENING
  TCP    0.0.0.0:49668          exampleserver:0             LISTENING
  TCP    0.0.0.0:49673          exampleserver:0             LISTENING
  TCP    0.0.0.0:49881          exampleserver:0             LISTENING
  TCP    12.12.12.12:21        103.144.31.100:ftp     SYN_RECEIVED
  TCP    12.12.12.12:25        ip245:1256             TIME_WAIT
  TCP    12.12.12.12:25        ip245:12756            TIME_WAIT
  TCP    12.12.12.12:25        ip245:25324            TIME_WAIT
  TCP    12.12.12.12:25        ip245:30624            TIME_WAIT
  TCP    12.12.12.12:25        ip245:48206            TIME_WAIT
  TCP    12.12.12.12:25        ip245:59510            TIME_WAIT
  TCP    12.12.12.12:80        ec2-52-31-126-154:1440  ESTABLISHED
  TCP    12.12.12.12:80        ec2-52-31-157-215:31240  ESTABLISHED
  TCP    12.12.12.12:80        ec2-52-31-205-57:65197  ESTABLISHED
  TCP    12.12.12.12:80        ninja-crawler92:36060  ESTABLISHED
  TCP    12.12.12.12:80        13:62786               TIME_WAIT
  TCP    12.12.12.12:80        16:22362               TIME_WAIT
  TCP    12.12.12.12:80        19:4130                TIME_WAIT
  TCP    12.12.12.12:80        22:30072               TIME_WAIT
  TCP    12.12.12.12:80        22:51362               TIME_WAIT
  TCP    12.12.12.12:80        34:9586                TIME_WAIT
  TCP    12.12.12.12:80        35:40210               TIME_WAIT
  TCP    12.12.12.12:80        35:65164               TIME_WAIT
  TCP    12.12.12.12:80        38:17882               TIME_WAIT
  TCP    12.12.12.12:80        39:17918               TIME_WAIT
  TCP    12.12.12.12:80        40:51642               TIME_WAIT
  TCP    12.12.12.12:80        40:57586               TIME_WAIT
  TCP    12.12.12.12:80        45:45800               TIME_WAIT
  TCP    12.12.12.12:139       exampleserver:0             LISTENING
  TCP    12.12.12.12:443       static:3610            TIME_WAIT
  TCP    12.12.12.12:443       static:5823            TIME_WAIT
  TCP    12.12.12.12:443       static:38855           TIME_WAIT
  TCP    12.12.12.12:443       static:53579           TIME_WAIT
  TCP    12.12.12.12:443       static:54816           TIME_WAIT
  TCP    12.12.12.12:443       static:26725           TIME_WAIT
  TCP    12.12.12.12:443       static:14749           TIME_WAIT
  TCP    12.12.12.12:443       static:8533            TIME_WAIT
  TCP    12.12.12.12:443       static:9136            TIME_WAIT
  TCP    12.12.12.12:443       static:35494           TIME_WAIT
  TCP    12.12.12.12:443       193:48688              TIME_WAIT
  TCP    12.12.12.12:443       static:3161            TIME_WAIT
  TCP    12.12.12.12:443       static:31667           TIME_WAIT
  TCP    12.12.12.12:443       ec2-52-31-126-154:25042  ESTABLISHED
  TCP    12.12.12.12:443       ec2-52-31-157-215:61630  ESTABLISHED
  TCP    12.12.12.12:443       ec2-52-31-205-57:20864  ESTABLISHED
  TCP    12.12.12.12:443       crawl-66-249-76-28:46983  ESTABLISHED
  TCP    12.12.12.12:443       crawl-66-249-76-30:47250  ESTABLISHED
  TCP    12.12.12.12:443       crawl-66-249-76-102:45115  ESTABLISHED
  TCP    12.12.12.12:443       crawl-66-249-76-104:62362  ESTABLISHED
  TCP    12.12.12.12:443       crawl-66-249-76-106:52575  ESTABLISHED
  TCP    12.12.12.12:443       crawl-66-249-76-192:51273  ESTABLISHED
  TCP    12.12.12.12:443       google-proxy-66-249-81-16:37717  ESTABLISHED
  TCP    12.12.12.12:443       rate-limited-proxy-66-249-89-97:42078  ESTABLISHED
  TCP    12.12.12.12:443       77-162-6-126:60721     ESTABLISHED
  TCP    12.12.12.12:443       77-162-6-126:60728     ESTABLISHED
  TCP    12.12.12.12:443       81-207-120-215:53600   ESTABLISHED
  TCP    12.12.12.12:443       ip-83-134-52-36:51127  ESTABLISHED
  TCP    12.12.12.12:443       host-83-232-56-99:2747  ESTABLISHED
  TCP    12.12.12.12:443       84-29-102-40:57144     ESTABLISHED
  TCP    12.12.12.12:443       84-104-10-105:57252    ESTABLISHED
  TCP    12.12.12.12:443       exampleserver:54209         ESTABLISHED
  TCP    12.12.12.12:443       static:37222           TIME_WAIT
  TCP    12.12.12.12:443       static:net-device      TIME_WAIT
  TCP    12.12.12.12:443       static:7874            TIME_WAIT
  TCP    12.12.12.12:443       static:33373           TIME_WAIT
  TCP    12.12.12.12:443       static:60446           TIME_WAIT
  TCP    12.12.12.12:443       92-111-50-210:54795    ESTABLISHED
  TCP    12.12.12.12:443       static:2841            TIME_WAIT
  TCP    12.12.12.12:443       ip-95-223-56-232:51129  ESTABLISHED
  TCP    12.12.12.12:443       petalbot-114-119-135-120:32530  TIME_WAIT
  TCP    12.12.12.12:443       petalbot-114-119-148-37:39746  TIME_WAIT
  TCP    12.12.12.12:443       petalbot-114-119-148-47:39066  TIME_WAIT
  TCP    12.12.12.12:443       petalbot-114-119-148-60:51178  SYN_RECEIVED
  TCP    12.12.12.12:443       petalbot-114-119-148-160:11516  TIME_WAIT
  TCP    12.12.12.12:443       petalbot-114-119-148-169:52484  TIME_WAIT
  TCP    12.12.12.12:443       petalbot-114-119-148-191:41470  TIME_WAIT
  TCP    12.12.12.12:443       petalbot-114-119-149-1:64570  TIME_WAIT
  TCP    12.12.12.12:443       petalbot-114-119-149-168:1456  TIME_WAIT
  TCP    12.12.12.12:443       petalbot-114-119-149-169:61436  TIME_WAIT
  TCP    12.12.12.12:443       static:47402           TIME_WAIT
  TCP    12.12.12.12:443       static:7710            TIME_WAIT
  TCP    12.12.12.12:443       static:15334           TIME_WAIT
  TCP    12.12.12.12:443       static:50492           TIME_WAIT
  TCP    12.12.12.12:443       static:3896            TIME_WAIT
  TCP    12.12.12.12:443       static:32136           TIME_WAIT
  TCP    12.12.12.12:443       ninja-crawler97:19950  ESTABLISHED
  TCP    12.12.12.12:443       static:9737            TIME_WAIT
  TCP    12.12.12.12:443       1:14850                TIME_WAIT
  TCP    12.12.12.12:443       2:9212                 TIME_WAIT
  TCP    12.12.12.12:443       2:38644                TIME_WAIT
  TCP    12.12.12.12:443       2:40354                TIME_WAIT
  TCP    12.12.12.12:443       2:61144                TIME_WAIT
  TCP    12.12.12.12:443       9:4920                 TIME_WAIT
  TCP    12.12.12.12:443       9:10744                TIME_WAIT
  TCP    12.12.12.12:443       9:41246                TIME_WAIT
  TCP    12.12.12.12:443       10:55160               TIME_WAIT
  TCP    12.12.12.12:443       12:28250               TIME_WAIT
  TCP    12.12.12.12:443       12:48182               TIME_WAIT
  TCP    12.12.12.12:443       13:6848                TIME_WAIT
  TCP    12.12.12.12:443       13:41174               TIME_WAIT
  TCP    12.12.12.12:443       14:11724               TIME_WAIT
  TCP    12.12.12.12:443       14:23780               TIME_WAIT
  TCP    12.12.12.12:443       14:35272               TIME_WAIT
  TCP    12.12.12.12:443       14:42876               TIME_WAIT
  TCP    12.12.12.12:443       15:50642               TIME_WAIT
  TCP    12.12.12.12:443       16:11382               TIME_WAIT
  TCP    12.12.12.12:443       16:43780               TIME_WAIT
  TCP    12.12.12.12:443       17:18676               TIME_WAIT
  TCP    12.12.12.12:443       18:40086               TIME_WAIT
  TCP    12.12.12.12:443       20:14698               TIME_WAIT
  TCP    12.12.12.12:443       21:8742                TIME_WAIT
  TCP    12.12.12.12:443       21:9222                TIME_WAIT
  TCP    12.12.12.12:443       21:10050               TIME_WAIT
  TCP    12.12.12.12:443       21:22212               TIME_WAIT
  TCP    12.12.12.12:443       23:20186               TIME_WAIT
  TCP    12.12.12.12:443       24:9702                TIME_WAIT
  TCP    12.12.12.12:443       24:29658               TIME_WAIT
  TCP    12.12.12.12:443       24:54316               TIME_WAIT
  TCP    12.12.12.12:443       24:54740               TIME_WAIT
  TCP    12.12.12.12:443       26:63912               TIME_WAIT
  TCP    12.12.12.12:443       34:38802               TIME_WAIT
  TCP    12.12.12.12:443       34:48344               TIME_WAIT
  TCP    12.12.12.12:443       35:19314               TIME_WAIT
  TCP    12.12.12.12:443       35:56518               TIME_WAIT
  TCP    12.12.12.12:443       36:26848               TIME_WAIT
  TCP    12.12.12.12:443       36:29840               TIME_WAIT
  TCP    12.12.12.12:443       37:22090               TIME_WAIT
  TCP    12.12.12.12:443       37:41662               TIME_WAIT
  TCP    12.12.12.12:443       37:62462               TIME_WAIT
  TCP    12.12.12.12:443       37:65246               TIME_WAIT
  TCP    12.12.12.12:443       38:3746                TIME_WAIT
  TCP    12.12.12.12:443       38:13518               TIME_WAIT
  TCP    12.12.12.12:443       38:19626               TIME_WAIT
  TCP    12.12.12.12:443       38:46588               TIME_WAIT
  TCP    12.12.12.12:443       38:55504               TIME_WAIT
  TCP    12.12.12.12:443       39:13096               TIME_WAIT
  TCP    12.12.12.12:443       40:14808               TIME_WAIT
  TCP    12.12.12.12:443       40:18046               TIME_WAIT
  TCP    12.12.12.12:443       40:19968               TIME_WAIT
  TCP    12.12.12.12:443       40:37858               TIME_WAIT
  TCP    12.12.12.12:443       40:47914               TIME_WAIT
  TCP    12.12.12.12:443       40:54890               TIME_WAIT
  TCP    12.12.12.12:443       40:58958               TIME_WAIT
  TCP    12.12.12.12:443       40:61998               TIME_WAIT
  TCP    12.12.12.12:443       41:5752                TIME_WAIT
  TCP    12.12.12.12:443       41:6420                ESTABLISHED
  TCP    12.12.12.12:443       41:6424                TIME_WAIT
  TCP    12.12.12.12:443       41:8224                TIME_WAIT
  TCP    12.12.12.12:443       41:23838               TIME_WAIT
  TCP    12.12.12.12:443       41:56540               TIME_WAIT
  TCP    12.12.12.12:443       42:44002               TIME_WAIT
  TCP    12.12.12.12:443       42:48300               TIME_WAIT
  TCP    12.12.12.12:443       45:16840               TIME_WAIT
  TCP    12.12.12.12:443       45:44966               TIME_WAIT
  TCP    12.12.12.12:443       45:45542               TIME_WAIT
  TCP    12.12.12.12:443       static:6008            TIME_WAIT
  TCP    12.12.12.12:443       static:50129           TIME_WAIT
  TCP    12.12.12.12:443       static:11337           TIME_WAIT
  TCP    12.12.12.12:443       static:57596           TIME_WAIT
  TCP    12.12.12.12:443       188:13459              ESTABLISHED
  TCP    12.12.12.12:443       193.32.169.146:40330   ESTABLISHED
  TCP    12.12.12.12:443       dD5765B1A:53034        ESTABLISHED
  TCP    12.12.12.12:443       ip-213-127-45-40:59736  ESTABLISHED
  TCP    12.12.12.12:443       ip-213-127-45-40:59772  ESTABLISHED
  TCP    12.12.12.12:3389      31.184.218.129:37280   CLOSE_WAIT
  TCP    12.12.12.12:3389      219:63027              ESTABLISHED
  TCP    12.12.12.12:3389      89.205.133.110:2255    ESTABLISHED
  TCP    12.12.12.12:3389      101.204.228.50:55310   ESTABLISHED
  TCP    12.12.12.12:3389      static:59832           ESTABLISHED
  TCP    12.12.12.12:54209     exampleserver:https         ESTABLISHED
  TCP    12.12.12.12:54217     ams17s10-in-f10:https  ESTABLISHED
  TCP    12.12.12.12:54242     lhr26s05-in-f10:https  ESTABLISHED
  TCP    12.12.12.12:54288     server-52-222-139-23:https  ESTABLISHED
  TCP    12.12.12.12:54533     40.126.31.135:https    TIME_WAIT
  TCP    12.12.12.12:54534     20.73.194.208:https    TIME_WAIT
  TCP    12.12.12.12:54535     20.67.183.221:https    TIME_WAIT
  TCP    12.12.12.12:54544     40.125.122.176:https   TIME_WAIT
  TCP    12.12.12.12:54620     ec2-54-77-149-211:https  CLOSE_WAIT
  TCP    12.12.12.12:54623     a23-79-157-152:http    TIME_WAIT
  TCP    12.12.12.12:54625     a23-79-157-152:http    TIME_WAIT
  TCP    12.12.12.12:54626     a23-79-157-152:http    TIME_WAIT
  TCP    12.12.12.12:54627     a23-79-157-152:http    TIME_WAIT
  TCP    12.12.12.12:54628     a23-79-157-152:http    TIME_WAIT
  TCP    12.12.12.12:54629     a23-79-157-152:http    TIME_WAIT
  TCP    12.12.12.12:54633     80-69-93-62:https      CLOSE_WAIT
  TCP    12.12.12.12:54637     exampleserver:ms-sql-s      TIME_WAIT
  TCP    127.0.0.1:3306         exampleserver:54634         ESTABLISHED
  TCP    127.0.0.1:49674        exampleserver:49675         ESTABLISHED
  TCP    127.0.0.1:49675        exampleserver:49674         ESTABLISHED
  TCP    127.0.0.1:49676        exampleserver:49677         ESTABLISHED
  TCP    127.0.0.1:49677        exampleserver:49676         ESTABLISHED
  TCP    127.0.0.1:54634        exampleserver:3306          ESTABLISHED
  TCP    172.25.64.1:53         exampleserver:0             LISTENING
  TCP    172.25.64.1:139        exampleserver:0             LISTENING
  TCP    172.25.64.1:54638      exampleserver:ms-sql-s      TIME_WAIT
  TCP    [::]:21                exampleserver:0             LISTENING
  TCP    [::]:25                exampleserver:0             LISTENING
  TCP    [::]:80                exampleserver:0             LISTENING
  TCP    [::]:135               exampleserver:0             LISTENING
  TCP    [::]:443               exampleserver:0             LISTENING
  TCP    [::]:445               exampleserver:0             LISTENING
  TCP    [::]:587               exampleserver:0             LISTENING
  TCP    [::]:1433              exampleserver:0             LISTENING
  TCP    [::]:2179              exampleserver:0             LISTENING
  TCP    [::]:3306              exampleserver:0             LISTENING
  TCP    [::]:3389              exampleserver:0             LISTENING
  TCP    [::]:5985              exampleserver:0             LISTENING
  TCP    [::]:8983              exampleserver:0             LISTENING
  TCP    [::]:33060             exampleserver:0             LISTENING
  TCP    [::]:47001             exampleserver:0             LISTENING
  TCP    [::]:49231             exampleserver:0             LISTENING
  TCP    [::]:49664             exampleserver:0             LISTENING
  TCP    [::]:49665             exampleserver:0             LISTENING
  TCP    [::]:49666             exampleserver:0             LISTENING
  TCP    [::]:49667             exampleserver:0             LISTENING
  TCP    [::]:49668             exampleserver:0             LISTENING
  TCP    [::]:49673             exampleserver:0             LISTENING
  TCP    [::]:49881             exampleserver:0             LISTENING
  TCP    [::1]:8983             exampleserver:53194         ESTABLISHED
  TCP    [::1]:8983             exampleserver:54283         ESTABLISHED
  TCP    [::1]:8983             exampleserver:54476         TIME_WAIT
  TCP    [::1]:8983             exampleserver:54622         ESTABLISHED
  TCP    [::1]:8983             exampleserver:54632         FIN_WAIT_2
  TCP    [::1]:8983             exampleserver:62954         ESTABLISHED
  TCP    [::1]:8983             exampleserver:62971         ESTABLISHED
  TCP    [::1]:8983             exampleserver:63272         ESTABLISHED
  TCP    [::1]:53194            exampleserver:8983          ESTABLISHED
  TCP    [::1]:54283            exampleserver:8983          ESTABLISHED
  TCP    [::1]:54622            exampleserver:8983          ESTABLISHED
  TCP    [::1]:54632            exampleserver:8983          CLOSE_WAIT
  TCP    [::1]:62954            exampleserver:8983          ESTABLISHED
  TCP    [::1]:62971            exampleserver:8983          ESTABLISHED
  TCP    [::1]:63272            exampleserver:8983          ESTABLISHED
  TCP    [fe80::67ff:cd37:2e9:65fe%14]:1433  exampleserver:53967         ESTABLISHED
  TCP    [fe80::67ff:cd37:2e9:65fe%14]:1433  exampleserver:53981         ESTABLISHED
  TCP    [fe80::67ff:cd37:2e9:65fe%14]:1433  exampleserver:54262         ESTABLISHED
  TCP    [fe80::67ff:cd37:2e9:65fe%14]:1433  exampleserver:54276         ESTABLISHED
  TCP    [fe80::67ff:cd37:2e9:65fe%14]:1433  exampleserver:54635         ESTABLISHED
  TCP    [fe80::67ff:cd37:2e9:65fe%14]:53967  exampleserver:ms-sql-s      ESTABLISHED
  TCP    [fe80::67ff:cd37:2e9:65fe%14]:53981  exampleserver:ms-sql-s      ESTABLISHED
  TCP    [fe80::67ff:cd37:2e9:65fe%14]:54262  exampleserver:ms-sql-s      ESTABLISHED
  TCP    [fe80::67ff:cd37:2e9:65fe%14]:54276  exampleserver:ms-sql-s      ESTABLISHED
  TCP    [fe80::67ff:cd37:2e9:65fe%14]:54635  exampleserver:ms-sql-s      ESTABLISHED
  TCP    [fe80::bc0d:78b4:381:b364%18]:54636  exampleserver:ms-sql-s      TIME_WAIT
  UDP    0.0.0.0:123            *:*
  UDP    0.0.0.0:500            *:*
  UDP    0.0.0.0:3389           *:*
  UDP    0.0.0.0:4500           *:*
  UDP    0.0.0.0:5353           *:*
  UDP    0.0.0.0:5355           *:*
  UDP    12.12.12.12:137       *:*
  UDP    12.12.12.12:138       *:*
  UDP    127.0.0.1:61804        *:*
  UDP    172.25.64.1:53         *:*
  UDP    172.25.64.1:137        *:*
  UDP    172.25.64.1:138        *:*
  UDP    [::]:123               *:*
  UDP    [::]:500               *:*
  UDP    [::]:3389              *:*
  UDP    [::]:4500              *:*
  UDP    [::]:5353              *:*
  UDP    [::]:5355              *:*

UPDATE 3

In Windows Firewall I created a new TCP port rule for both inbound and outbound traffic, allowing ports "80,443,8050" (for Domain/Private/Public).

Then I retried running C:\root\scrapy>scrapy crawl foobar -o allobjects.json in my container, but I still get the "Connection was refused" error.

I ran these commands:

  • In my container: C:\root\scrapy>netstat -a
  • On my host OS: D:\Programs>netstat -a

Port 8050 does not show up in either output, but I had expected it to be there?

UPDATE 4

Based on @LSerni/@Pankaj's suggestions around Splash:

I don't know what I would need to enter instead of scrapinghub/splash in your example.

I tried: PS D:\Programs\image_addons> docker run -p 8050:8050 -p 5023:5023 scrapinghub/splash

Unable to find image 'scrapinghub/splash:latest' locally
latest: Pulling from scrapinghub/splash
docker: image operating system "linux" cannot be used on this platform.

And:

PS D:\Programs\image_addons> docker run -p 8050:8050 -p 5023:5023 scrapy/splash

Unable to find image 'scrapy/splash:latest' locally
docker: Error response from daemon: pull access denied for scrapy/splash, repository does not exist or may require 'docker login': denied: requested access to the resource is denied.

Splash is installed via my Dockerfile into my scrapy image, as can be seen from my post, so how would I need to start it?

UPDATE 5

Based on @Thiago Curvelo's comment around Splash:

Locally, on Windows 10 with Docker Desktop, the above setup works for me without my (at least not explicitly) starting a separate Splash instance, but perhaps that is logic built into Docker Desktop? On my server I can't use Docker Desktop, as the server does not support Hyper-V.
If you look at my Dockerfile, you see that I install scrapy-splash into my image.
Now to your point: apparently I need to start a Splash instance separately? Does that container have to run separately and in parallel from my scrapy container? Or could (and SHOULD?) I add the Splash image to my existing scrapy image and run a container based on that? I'm new to this so not exactly sure if that's what I should be Googling for.
(also I was just attempting scrapy/splash as I thought perhaps the scrapinghub part was referring to the image name, in my case scrapy, guess I was wrong about that too ;-))

Olivine answered 15/9, 2021 at 18:29 Comment(19)
Have you tried exposing port 8050 in the Dockerfile and then rebuilding? Basically, EXPOSE 8050. - Spooner
Added that, but I get the same error; see my update 1 please. - Olivine
Can you paste the output of docker ps? - Spooner
Also, can you please check at the localhost level whether port 8050 is used by something else? If so, the 8050:8050 mapping wouldn't work either. - Spooner
I've got two questions for you. First, how did you run the command in production? Was it in the Docker container, as in your testing, or a different way? Second, what functionality is exposed on port 8050 when you run the Docker container? I don't see any command running in the Docker container on port 8050. - Rosalindarosalinde
@JeffSiver he's got the docker run in his steps with the -p mapping. I suspect his port might be in use already and the mapping isn't working for that reason. - Spooner
@Flo Where is your Splash located, and where is your production server? Are they the same? In your docker run with scrapy crawl foobar, check your outgoing IP using lumtest.com/myip.json and then debug whether your Splash server is allowed to accept requests from it. - Personification
@JeffSiver/Raj Verma: see my update 2. - Olivine
@UmairAyub please check my Dockerfile at the top of the post. I presume Splash is running inside the launched container, or am I misunderstanding your question? - Olivine
Judging by all the info provided and the docker ps output, I can assure you this is port 8050 getting blocked. As you mentioned previously, you are able to run this fine on your local server. Can you please check the ports open on the server and try to use one of them (basically map xyz:8050), and then query the URL? Could you please share the output of this from the prod server: 'netstat -an | find /i "listening"' - Spooner
@RajVerma: thank you. The command netstat -an | find /i "listening" results in the error "FIND: Parameter format not correct". The output of netstat -a -n looks the same as the output of netstat -a already listed above, and port 8050 does not show up in either. So should I perhaps open port 8050 in Windows Firewall (and what are the risks of this, if any)? - Olivine
Yes, please get the port opened and you should be good to go! Hope this has helped rectify your problem. If your application needs that port to work, then that exception can be set by the security team in your org. Have a word with them about policy and risks (just in case). - Spooner
Thanks! I added update 3, as I still have the same error after opening the port :S - Olivine
@Flo you might be mistaking Splash for scrapy-splash. The first is a headless browser engine, while the latter is a wrapper for integrating Splash with Scrapy. You'll need to start a Splash instance regardless, which by default runs in a Linux container. If you can't start a Linux container on your system, you may need to build your own image or use a VM for that. Also, you have the image name wrong: the correct one is scrapinghub/splash (there is no scrapy/splash available). - Midday
IMHO installing Splash on Windows isn't worth the trouble (if it is even possible). My advice would be to run Splash in a Linux VM (e.g. using VirtualBox) or launch it at a cloud provider. Another option would be using the Zyte (formerly Scrapinghub) platform, which offers Splash instances in their app. - Midday
@ThiagoCurvelo, thanks! The Zyte route might be something to consider when budget allows :). For now I'm going to check out the VirtualBox route. For my understanding: would I still run the scrapy container on Windows and just Splash in VirtualBox, or do Docker, scrapy-splash and all images/containers have to run in VirtualBox? (FYI: I prefer to keep as much logic/data as I can on Windows, as that's my main OS.) - Olivine
You can stick with your scrapy container and have a dedicated VM for Splash, as long as they can reach each other over the network. - Midday
Sorry for the long comment thread; I will add it to my post if relevant. Is this a potential solution to avoid VirtualBox and still run Docker on Windows Server 2019? learn.microsoft.com/en-us/windows/wsl/install-on-server (I only have my production server to test on, so I want to be careful what I try.) - Olivine
I'm no Windows expert, but I think that running Docker on WSL should work. - Midday

You need to fire up a Splash instance first and make it listen on port 8050. E.g.:

docker run -dit -p 8050:8050 --name my_splash scrapinghub/splash

Then, set the splash URL to point to that running container:

settings.py:

SPLASH_URL = 'http://my_splash:8050/'

And finally, start the Scrapy container, linking it to the Splash one:

docker run -it --link my_splash --rm scrapy

That way you'll be able to send Scrapy's requests to Splash.
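
Because the Splash address differs between setups (localhost:8050 in the question, my_splash:8050 here), it may help to read it from the environment rather than hard-coding it. A minimal sketch, assuming an environment variable named SPLASH_URL is passed to the Scrapy container; both that variable name and the helper `resolve_splash_url` are illustrative and not part of scrapy-splash:

```python
import os


def resolve_splash_url(environ=os.environ, default="http://localhost:8050/"):
    """Pick the Splash endpoint from the environment, else fall back to a default.

    The SPLASH_URL environment variable is an assumption made for this
    sketch; nothing sets it unless you pass it to the container yourself.
    """
    url = environ.get("SPLASH_URL", default).strip()
    # Keep a trailing slash so the value matches the style used in settings.py.
    return url if url.endswith("/") else url + "/"


# In settings.py this would read: SPLASH_URL = resolve_splash_url()
SPLASH_URL = resolve_splash_url({"SPLASH_URL": "http://my_splash:8050"})
print(SPLASH_URL)  # http://my_splash:8050/
```

The variable could then be supplied when starting the Scrapy container, for example with docker run's -e option, so the same image works against a local or a remote Splash.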

Midday answered 24/9, 2021 at 4:48 Comment(3)
@RajVerma it clearly isn't the firewall: there is no listening port 8050 (netstat says so). Opening the firewall may be required to allow access to port 8050, but if the port is not there because nothing is listening, allowing access to a nonexistent object will always yield nothing. - Forgiveness
Well, there is an update 3: even after the firewall change it isn't working, so that takes the firewall out of the picture. It could be some group policies at the AD level as well, but further digging is needed for sure. My assumption was based purely on the initial data, and it turned out to be wrong. - Spooner
@Thiago Curvelo: Thanks! Could you please check out my update 5? It's the last step to get where I want to be (I think). - Olivine

You say you checked out this question.

Did you start the Splash container? I quote from Pankaj's answer to the above question:

Run the splash on docker

# Start the container:
    $ sudo docker run -p 8050:8050 -p 5023:5023 scrapinghub/splash
# Splash is now available at [...] ports 8050 (http) and 5023 (telnet).
Forgiveness answered 24/9, 2021 at 17:36 Comment(2)
Thanks! Could you check out my update 5? It's the last step to get where I want (I think). - Olivine
@Flo yes, you do have to run the Splash container "separately and in parallel from the scrapy container". When you see port 8050 come up, you'll know you're ready to go. - Forgiveness
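
One way to "see port 8050 come up" without rerunning netstat by hand is to poll the port before starting the crawl. A minimal sketch using only the Python standard library; the function name `wait_for_port` and the timeout values are illustrative assumptions, and the check only confirms that something accepts TCP connections, not that it is actually Splash:

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=1.0):
    """Poll until a TCP connection to host:port succeeds or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError on refusal or timeout.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused (error 10061 on Windows) or unreachable: retry
            # until the deadline has passed.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


if __name__ == "__main__":
    # Host and port are taken from the SPLASH_URL in the question; adjust as needed.
    ready = wait_for_port("localhost", 8050, timeout=2.0, interval=0.5)
    print("port 8050 is reachable" if ready else "nothing is listening on port 8050")
```

Running this inside the Scrapy container against the host named in SPLASH_URL tests the same path the spider uses, so a failure here points at the Splash side rather than at the Scrapy project.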
