My steps:
- Build image
docker build . -t scrapy
- Run a container
docker run -it -p 8050:8050 --rm scrapy
- In container run scrapy project:
scrapy crawl foobar -o allobjects.json
This works locally, but on my production server I get the following error:
[scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.example.com via http://localhost:8050/execute> (failed 1 times): Connection was refused by other side: 10061: No connection could be made because the target machine actively refused it..
Note: I'm NOT using Docker Desktop, nor can I on this server.
Dockerfile
FROM mcr.microsoft.com/windows/servercore:ltsc2019
SHELL ["powershell", "-Command", "$ErrorActionPreference = 'Stop'; $ProgressPreference = 'SilentlyContinue';"]
RUN setx /M PATH $('C:\Users\ContainerAdministrator\miniconda3\Library\bin;C:\Users\ContainerAdministrator\miniconda3\Scripts;C:\Users\ContainerAdministrator\miniconda3;' + $Env:PATH)
RUN Invoke-WebRequest "https://repo.anaconda.com/miniconda/Miniconda3-py38_4.10.3-Windows-x86_64.exe" -OutFile miniconda3.exe -UseBasicParsing; \
Start-Process -FilePath 'miniconda3.exe' -Wait -ArgumentList '/S', '/D=C:\Users\ContainerAdministrator\miniconda3'; \
Remove-Item .\miniconda3.exe; \
conda install -y -c conda-forge scrapy;
RUN pip install scrapy-splash
RUN pip install scrapy-user-agents
#creates root directory if not exists, then enters it
WORKDIR /root/scrapy
COPY scrapy /root/scrapy
settings.py
SPLASH_URL = 'http://localhost:8050/'
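As a side note on this setting: inside a container, localhost resolves to the container itself, so this URL assumes that Splash is listening on port 8050 in the same container as Scrapy. Below is a minimal sketch of making the address configurable. It assumes an environment variable called SPLASH_URL, which is an illustrative name and not something scrapy-splash reads by itself:

```python
import os

# SPLASH_URL as an environment variable is a hypothetical convention for this sketch.
# If it is not set, fall back to the value currently used in settings.py.
SPLASH_URL = os.environ.get("SPLASH_URL", "http://localhost:8050/")
```

The variable could then be passed when starting the container with docker run -e SPLASH_URL=..., pointing at wherever a Splash instance is actually reachable.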
OUTPUT with command scrapy crawl foobar -o allobjects.json
2021-09-15 20:12:16 [scrapy.core.engine] INFO: Spider opened
2021-09-15 20:12:16 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2021-09-15 20:12:16 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
2021-09-15 20:12:16 [py.warnings] WARNING: C:\Users\ContainerAdministrator\miniconda3\lib\site-packages\scrapy_splash\request.py:41: ScrapyDeprecationWarning: Call to deprecated function to_native_str. Use to_unicode instead.
url = to_native_str(url)
2021-09-15 20:12:16 [scrapy_user_agents.middlewares] DEBUG: Assigned User-Agent Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.117 Safari/537.36
2021-09-15 20:12:16 [scrapy_user_agents.middlewares] DEBUG: Assigned User-Agent Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.108 Safari/537.36
2021-09-15 20:12:17 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.example.com via http://localhost:8050/execute> (failed 1 times): Connection was refused by other side: 10061: No connection could be made because the target machine actively refused it..
2021-09-15 20:12:17 [scrapy_user_agents.middlewares] DEBUG: Assigned User-Agent Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36
2021-09-15 20:12:18 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.example.com via http://localhost:8050/execute> (failed 2 times): Connection was refused by other side: 10061: No connection could be made because the target machine actively refused it..
2021-09-15 20:12:18 [scrapy_user_agents.middlewares] DEBUG: Assigned User-Agent Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.146 Safari/537.36
2021-09-15 20:12:19 [scrapy.downloadermiddlewares.retry] ERROR: Gave up retrying <GET https://www.example.com via http://localhost:8050/execute> (failed 3 times): Connection was refused by other side: 10061: No connection could be made because the target machine actively refused it..
2021-09-15 20:12:19 [scrapy.core.scraper] ERROR: Error downloading <GET https://www.example.com via http://localhost:8050/execute>
Traceback (most recent call last):
File "C:\Users\ContainerAdministrator\miniconda3\lib\site-packages\scrapy\core\downloader\middleware.py", line 45, in process_request
return (yield download_func(request=request, spider=spider))
twisted.internet.error.ConnectionRefusedError: Connection was refused by other side: 10061: No connection could be made because the target machine actively refused it..
2021-09-15 20:12:19 [scrapy.core.engine] INFO: Closing spider (finished)
2021-09-15 20:12:19 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/exception_count': 3,
'downloader/exception_type_count/twisted.internet.error.ConnectionRefusedError': 3,
'downloader/request_bytes': 4632,
'downloader/request_count': 3,
'downloader/request_method_count/POST': 3,
'elapsed_time_seconds': 3.310168,
'finish_reason': 'finished',
'finish_time': datetime.datetime(2021, 9, 15, 18, 12, 19, 605641),
'log_count/DEBUG': 6,
'log_count/ERROR': 2,
'log_count/INFO': 10,
'log_count/WARNING': 46,
'retry/count': 2,
'retry/max_reached': 1,
'retry/reason_count/twisted.internet.error.ConnectionRefusedError': 2,
'scheduler/dequeued': 4,
'scheduler/dequeued/memory': 4,
'scheduler/enqueued': 4,
'scheduler/enqueued/memory': 4,
'splash/execute/request_count': 1,
'start_time': datetime.datetime(2021, 9, 15, 18, 12, 16, 295473)}
2021-09-15 20:12:19 [scrapy.core.engine] INFO: Spider closed (finished)
What am I missing?
I already checked here:
- Scrapy, Splash and Connection was refused by other side: 10061
- Scrapy + Splash = Connection Refused
- How to run splash using docker toolbox
UPDATE 1
I included EXPOSE 8050 in my Dockerfile, but I get the same error. I tried netstat -a inside the Docker container, but port 8050 does not seem to be listed:
C:\root\scrapy>netstat -a
Active Connections
Proto Local Address Foreign Address State
TCP 0.0.0.0:135 c60d48724046:0 LISTENING
TCP 0.0.0.0:5985 c60d48724046:0 LISTENING
TCP 0.0.0.0:47001 c60d48724046:0 LISTENING
TCP 0.0.0.0:49152 c60d48724046:0 LISTENING
TCP 0.0.0.0:49153 c60d48724046:0 LISTENING
TCP 0.0.0.0:49154 c60d48724046:0 LISTENING
TCP 0.0.0.0:49155 c60d48724046:0 LISTENING
TCP 0.0.0.0:49159 c60d48724046:0 LISTENING
TCP [::]:135 c60d48724046:0 LISTENING
TCP [::]:5985 c60d48724046:0 LISTENING
TCP [::]:47001 c60d48724046:0 LISTENING
TCP [::]:49152 c60d48724046:0 LISTENING
TCP [::]:49153 c60d48724046:0 LISTENING
TCP [::]:49154 c60d48724046:0 LISTENING
TCP [::]:49155 c60d48724046:0 LISTENING
TCP [::]:49159 c60d48724046:0 LISTENING
UDP 0.0.0.0:5353 *:*
UDP 0.0.0.0:5355 *:*
UDP 127.0.0.1:51352 *:*
UDP [::]:5353 *:*
UDP [::]:5355 *:*
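Besides reading the netstat listing, the same question (is anything accepting connections on port 8050?) can be checked directly from Python, which is already installed in the image. This is a minimal sketch using only the standard library; the function name is_port_open is made up for illustration:

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise,
        # instead of raising on a refused connection.
        return sock.connect_ex((host, port)) == 0
```

Calling is_port_open("localhost", 8050) inside the container would be expected to return False as long as the crawl fails with the "Connection was refused" error shown above.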
UPDATE 2
Commands I ran on my host OS:
docker ps
output:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
bf615a00b74a scrapy "c:\\windows\\system32…" 52 seconds ago Up 49 seconds 0.0.0.0:8050->8050/tcp blissful_brahmagupta
netstat -a
output (I changed ip/server names for anonymity):
Active Connections
Proto Local Address Foreign Address State
TCP 0.0.0.0:21 exampleserver:0 LISTENING
TCP 0.0.0.0:25 exampleserver:0 LISTENING
TCP 0.0.0.0:80 exampleserver:0 LISTENING
TCP 0.0.0.0:110 exampleserver:0 LISTENING
TCP 0.0.0.0:135 exampleserver:0 LISTENING
TCP 0.0.0.0:143 exampleserver:0 LISTENING
TCP 0.0.0.0:443 exampleserver:0 LISTENING
TCP 0.0.0.0:445 exampleserver:0 LISTENING
TCP 0.0.0.0:587 exampleserver:0 LISTENING
TCP 0.0.0.0:995 exampleserver:0 LISTENING
TCP 0.0.0.0:1433 exampleserver:0 LISTENING
TCP 0.0.0.0:2179 exampleserver:0 LISTENING
TCP 0.0.0.0:3306 exampleserver:0 LISTENING
TCP 0.0.0.0:3389 exampleserver:0 LISTENING
TCP 0.0.0.0:5985 exampleserver:0 LISTENING
TCP 0.0.0.0:8983 exampleserver:0 LISTENING
TCP 0.0.0.0:33060 exampleserver:0 LISTENING
TCP 0.0.0.0:47001 exampleserver:0 LISTENING
TCP 0.0.0.0:49231 exampleserver:0 LISTENING
TCP 0.0.0.0:49664 exampleserver:0 LISTENING
TCP 0.0.0.0:49665 exampleserver:0 LISTENING
TCP 0.0.0.0:49666 exampleserver:0 LISTENING
TCP 0.0.0.0:49667 exampleserver:0 LISTENING
TCP 0.0.0.0:49668 exampleserver:0 LISTENING
TCP 0.0.0.0:49673 exampleserver:0 LISTENING
TCP 0.0.0.0:49881 exampleserver:0 LISTENING
TCP 12.12.12.12:21 103.144.31.100:ftp SYN_RECEIVED
TCP 12.12.12.12:25 ip245:1256 TIME_WAIT
TCP 12.12.12.12:25 ip245:12756 TIME_WAIT
TCP 12.12.12.12:25 ip245:25324 TIME_WAIT
TCP 12.12.12.12:25 ip245:30624 TIME_WAIT
TCP 12.12.12.12:25 ip245:48206 TIME_WAIT
TCP 12.12.12.12:25 ip245:59510 TIME_WAIT
TCP 12.12.12.12:80 ec2-52-31-126-154:1440 ESTABLISHED
TCP 12.12.12.12:80 ec2-52-31-157-215:31240 ESTABLISHED
TCP 12.12.12.12:80 ec2-52-31-205-57:65197 ESTABLISHED
TCP 12.12.12.12:80 ninja-crawler92:36060 ESTABLISHED
TCP 12.12.12.12:80 13:62786 TIME_WAIT
TCP 12.12.12.12:80 16:22362 TIME_WAIT
TCP 12.12.12.12:80 19:4130 TIME_WAIT
TCP 12.12.12.12:80 22:30072 TIME_WAIT
TCP 12.12.12.12:80 22:51362 TIME_WAIT
TCP 12.12.12.12:80 34:9586 TIME_WAIT
TCP 12.12.12.12:80 35:40210 TIME_WAIT
TCP 12.12.12.12:80 35:65164 TIME_WAIT
TCP 12.12.12.12:80 38:17882 TIME_WAIT
TCP 12.12.12.12:80 39:17918 TIME_WAIT
TCP 12.12.12.12:80 40:51642 TIME_WAIT
TCP 12.12.12.12:80 40:57586 TIME_WAIT
TCP 12.12.12.12:80 45:45800 TIME_WAIT
TCP 12.12.12.12:139 exampleserver:0 LISTENING
TCP 12.12.12.12:443 static:3610 TIME_WAIT
TCP 12.12.12.12:443 static:5823 TIME_WAIT
TCP 12.12.12.12:443 static:38855 TIME_WAIT
TCP 12.12.12.12:443 static:53579 TIME_WAIT
TCP 12.12.12.12:443 static:54816 TIME_WAIT
TCP 12.12.12.12:443 static:26725 TIME_WAIT
TCP 12.12.12.12:443 static:14749 TIME_WAIT
TCP 12.12.12.12:443 static:8533 TIME_WAIT
TCP 12.12.12.12:443 static:9136 TIME_WAIT
TCP 12.12.12.12:443 static:35494 TIME_WAIT
TCP 12.12.12.12:443 193:48688 TIME_WAIT
TCP 12.12.12.12:443 static:3161 TIME_WAIT
TCP 12.12.12.12:443 static:31667 TIME_WAIT
TCP 12.12.12.12:443 ec2-52-31-126-154:25042 ESTABLISHED
TCP 12.12.12.12:443 ec2-52-31-157-215:61630 ESTABLISHED
TCP 12.12.12.12:443 ec2-52-31-205-57:20864 ESTABLISHED
TCP 12.12.12.12:443 crawl-66-249-76-28:46983 ESTABLISHED
TCP 12.12.12.12:443 crawl-66-249-76-30:47250 ESTABLISHED
TCP 12.12.12.12:443 crawl-66-249-76-102:45115 ESTABLISHED
TCP 12.12.12.12:443 crawl-66-249-76-104:62362 ESTABLISHED
TCP 12.12.12.12:443 crawl-66-249-76-106:52575 ESTABLISHED
TCP 12.12.12.12:443 crawl-66-249-76-192:51273 ESTABLISHED
TCP 12.12.12.12:443 google-proxy-66-249-81-16:37717 ESTABLISHED
TCP 12.12.12.12:443 rate-limited-proxy-66-249-89-97:42078 ESTABLISHED
TCP 12.12.12.12:443 77-162-6-126:60721 ESTABLISHED
TCP 12.12.12.12:443 77-162-6-126:60728 ESTABLISHED
TCP 12.12.12.12:443 81-207-120-215:53600 ESTABLISHED
TCP 12.12.12.12:443 ip-83-134-52-36:51127 ESTABLISHED
TCP 12.12.12.12:443 host-83-232-56-99:2747 ESTABLISHED
TCP 12.12.12.12:443 84-29-102-40:57144 ESTABLISHED
TCP 12.12.12.12:443 84-104-10-105:57252 ESTABLISHED
TCP 12.12.12.12:443 exampleserver:54209 ESTABLISHED
TCP 12.12.12.12:443 static:37222 TIME_WAIT
TCP 12.12.12.12:443 static:net-device TIME_WAIT
TCP 12.12.12.12:443 static:7874 TIME_WAIT
TCP 12.12.12.12:443 static:33373 TIME_WAIT
TCP 12.12.12.12:443 static:60446 TIME_WAIT
TCP 12.12.12.12:443 92-111-50-210:54795 ESTABLISHED
TCP 12.12.12.12:443 static:2841 TIME_WAIT
TCP 12.12.12.12:443 ip-95-223-56-232:51129 ESTABLISHED
TCP 12.12.12.12:443 petalbot-114-119-135-120:32530 TIME_WAIT
TCP 12.12.12.12:443 petalbot-114-119-148-37:39746 TIME_WAIT
TCP 12.12.12.12:443 petalbot-114-119-148-47:39066 TIME_WAIT
TCP 12.12.12.12:443 petalbot-114-119-148-60:51178 SYN_RECEIVED
TCP 12.12.12.12:443 petalbot-114-119-148-160:11516 TIME_WAIT
TCP 12.12.12.12:443 petalbot-114-119-148-169:52484 TIME_WAIT
TCP 12.12.12.12:443 petalbot-114-119-148-191:41470 TIME_WAIT
TCP 12.12.12.12:443 petalbot-114-119-149-1:64570 TIME_WAIT
TCP 12.12.12.12:443 petalbot-114-119-149-168:1456 TIME_WAIT
TCP 12.12.12.12:443 petalbot-114-119-149-169:61436 TIME_WAIT
TCP 12.12.12.12:443 static:47402 TIME_WAIT
TCP 12.12.12.12:443 static:7710 TIME_WAIT
TCP 12.12.12.12:443 static:15334 TIME_WAIT
TCP 12.12.12.12:443 static:50492 TIME_WAIT
TCP 12.12.12.12:443 static:3896 TIME_WAIT
TCP 12.12.12.12:443 static:32136 TIME_WAIT
TCP 12.12.12.12:443 ninja-crawler97:19950 ESTABLISHED
TCP 12.12.12.12:443 static:9737 TIME_WAIT
TCP 12.12.12.12:443 1:14850 TIME_WAIT
TCP 12.12.12.12:443 2:9212 TIME_WAIT
TCP 12.12.12.12:443 2:38644 TIME_WAIT
TCP 12.12.12.12:443 2:40354 TIME_WAIT
TCP 12.12.12.12:443 2:61144 TIME_WAIT
TCP 12.12.12.12:443 9:4920 TIME_WAIT
TCP 12.12.12.12:443 9:10744 TIME_WAIT
TCP 12.12.12.12:443 9:41246 TIME_WAIT
TCP 12.12.12.12:443 10:55160 TIME_WAIT
TCP 12.12.12.12:443 12:28250 TIME_WAIT
TCP 12.12.12.12:443 12:48182 TIME_WAIT
TCP 12.12.12.12:443 13:6848 TIME_WAIT
TCP 12.12.12.12:443 13:41174 TIME_WAIT
TCP 12.12.12.12:443 14:11724 TIME_WAIT
TCP 12.12.12.12:443 14:23780 TIME_WAIT
TCP 12.12.12.12:443 14:35272 TIME_WAIT
TCP 12.12.12.12:443 14:42876 TIME_WAIT
TCP 12.12.12.12:443 15:50642 TIME_WAIT
TCP 12.12.12.12:443 16:11382 TIME_WAIT
TCP 12.12.12.12:443 16:43780 TIME_WAIT
TCP 12.12.12.12:443 17:18676 TIME_WAIT
TCP 12.12.12.12:443 18:40086 TIME_WAIT
TCP 12.12.12.12:443 20:14698 TIME_WAIT
TCP 12.12.12.12:443 21:8742 TIME_WAIT
TCP 12.12.12.12:443 21:9222 TIME_WAIT
TCP 12.12.12.12:443 21:10050 TIME_WAIT
TCP 12.12.12.12:443 21:22212 TIME_WAIT
TCP 12.12.12.12:443 23:20186 TIME_WAIT
TCP 12.12.12.12:443 24:9702 TIME_WAIT
TCP 12.12.12.12:443 24:29658 TIME_WAIT
TCP 12.12.12.12:443 24:54316 TIME_WAIT
TCP 12.12.12.12:443 24:54740 TIME_WAIT
TCP 12.12.12.12:443 26:63912 TIME_WAIT
TCP 12.12.12.12:443 34:38802 TIME_WAIT
TCP 12.12.12.12:443 34:48344 TIME_WAIT
TCP 12.12.12.12:443 35:19314 TIME_WAIT
TCP 12.12.12.12:443 35:56518 TIME_WAIT
TCP 12.12.12.12:443 36:26848 TIME_WAIT
TCP 12.12.12.12:443 36:29840 TIME_WAIT
TCP 12.12.12.12:443 37:22090 TIME_WAIT
TCP 12.12.12.12:443 37:41662 TIME_WAIT
TCP 12.12.12.12:443 37:62462 TIME_WAIT
TCP 12.12.12.12:443 37:65246 TIME_WAIT
TCP 12.12.12.12:443 38:3746 TIME_WAIT
TCP 12.12.12.12:443 38:13518 TIME_WAIT
TCP 12.12.12.12:443 38:19626 TIME_WAIT
TCP 12.12.12.12:443 38:46588 TIME_WAIT
TCP 12.12.12.12:443 38:55504 TIME_WAIT
TCP 12.12.12.12:443 39:13096 TIME_WAIT
TCP 12.12.12.12:443 40:14808 TIME_WAIT
TCP 12.12.12.12:443 40:18046 TIME_WAIT
TCP 12.12.12.12:443 40:19968 TIME_WAIT
TCP 12.12.12.12:443 40:37858 TIME_WAIT
TCP 12.12.12.12:443 40:47914 TIME_WAIT
TCP 12.12.12.12:443 40:54890 TIME_WAIT
TCP 12.12.12.12:443 40:58958 TIME_WAIT
TCP 12.12.12.12:443 40:61998 TIME_WAIT
TCP 12.12.12.12:443 41:5752 TIME_WAIT
TCP 12.12.12.12:443 41:6420 ESTABLISHED
TCP 12.12.12.12:443 41:6424 TIME_WAIT
TCP 12.12.12.12:443 41:8224 TIME_WAIT
TCP 12.12.12.12:443 41:23838 TIME_WAIT
TCP 12.12.12.12:443 41:56540 TIME_WAIT
TCP 12.12.12.12:443 42:44002 TIME_WAIT
TCP 12.12.12.12:443 42:48300 TIME_WAIT
TCP 12.12.12.12:443 45:16840 TIME_WAIT
TCP 12.12.12.12:443 45:44966 TIME_WAIT
TCP 12.12.12.12:443 45:45542 TIME_WAIT
TCP 12.12.12.12:443 static:6008 TIME_WAIT
TCP 12.12.12.12:443 static:50129 TIME_WAIT
TCP 12.12.12.12:443 static:11337 TIME_WAIT
TCP 12.12.12.12:443 static:57596 TIME_WAIT
TCP 12.12.12.12:443 188:13459 ESTABLISHED
TCP 12.12.12.12:443 193.32.169.146:40330 ESTABLISHED
TCP 12.12.12.12:443 dD5765B1A:53034 ESTABLISHED
TCP 12.12.12.12:443 ip-213-127-45-40:59736 ESTABLISHED
TCP 12.12.12.12:443 ip-213-127-45-40:59772 ESTABLISHED
TCP 12.12.12.12:3389 31.184.218.129:37280 CLOSE_WAIT
TCP 12.12.12.12:3389 219:63027 ESTABLISHED
TCP 12.12.12.12:3389 89.205.133.110:2255 ESTABLISHED
TCP 12.12.12.12:3389 101.204.228.50:55310 ESTABLISHED
TCP 12.12.12.12:3389 static:59832 ESTABLISHED
TCP 12.12.12.12:54209 exampleserver:https ESTABLISHED
TCP 12.12.12.12:54217 ams17s10-in-f10:https ESTABLISHED
TCP 12.12.12.12:54242 lhr26s05-in-f10:https ESTABLISHED
TCP 12.12.12.12:54288 server-52-222-139-23:https ESTABLISHED
TCP 12.12.12.12:54533 40.126.31.135:https TIME_WAIT
TCP 12.12.12.12:54534 20.73.194.208:https TIME_WAIT
TCP 12.12.12.12:54535 20.67.183.221:https TIME_WAIT
TCP 12.12.12.12:54544 40.125.122.176:https TIME_WAIT
TCP 12.12.12.12:54620 ec2-54-77-149-211:https CLOSE_WAIT
TCP 12.12.12.12:54623 a23-79-157-152:http TIME_WAIT
TCP 12.12.12.12:54625 a23-79-157-152:http TIME_WAIT
TCP 12.12.12.12:54626 a23-79-157-152:http TIME_WAIT
TCP 12.12.12.12:54627 a23-79-157-152:http TIME_WAIT
TCP 12.12.12.12:54628 a23-79-157-152:http TIME_WAIT
TCP 12.12.12.12:54629 a23-79-157-152:http TIME_WAIT
TCP 12.12.12.12:54633 80-69-93-62:https CLOSE_WAIT
TCP 12.12.12.12:54637 exampleserver:ms-sql-s TIME_WAIT
TCP 127.0.0.1:3306 exampleserver:54634 ESTABLISHED
TCP 127.0.0.1:49674 exampleserver:49675 ESTABLISHED
TCP 127.0.0.1:49675 exampleserver:49674 ESTABLISHED
TCP 127.0.0.1:49676 exampleserver:49677 ESTABLISHED
TCP 127.0.0.1:49677 exampleserver:49676 ESTABLISHED
TCP 127.0.0.1:54634 exampleserver:3306 ESTABLISHED
TCP 172.25.64.1:53 exampleserver:0 LISTENING
TCP 172.25.64.1:139 exampleserver:0 LISTENING
TCP 172.25.64.1:54638 exampleserver:ms-sql-s TIME_WAIT
TCP [::]:21 exampleserver:0 LISTENING
TCP [::]:25 exampleserver:0 LISTENING
TCP [::]:80 exampleserver:0 LISTENING
TCP [::]:135 exampleserver:0 LISTENING
TCP [::]:443 exampleserver:0 LISTENING
TCP [::]:445 exampleserver:0 LISTENING
TCP [::]:587 exampleserver:0 LISTENING
TCP [::]:1433 exampleserver:0 LISTENING
TCP [::]:2179 exampleserver:0 LISTENING
TCP [::]:3306 exampleserver:0 LISTENING
TCP [::]:3389 exampleserver:0 LISTENING
TCP [::]:5985 exampleserver:0 LISTENING
TCP [::]:8983 exampleserver:0 LISTENING
TCP [::]:33060 exampleserver:0 LISTENING
TCP [::]:47001 exampleserver:0 LISTENING
TCP [::]:49231 exampleserver:0 LISTENING
TCP [::]:49664 exampleserver:0 LISTENING
TCP [::]:49665 exampleserver:0 LISTENING
TCP [::]:49666 exampleserver:0 LISTENING
TCP [::]:49667 exampleserver:0 LISTENING
TCP [::]:49668 exampleserver:0 LISTENING
TCP [::]:49673 exampleserver:0 LISTENING
TCP [::]:49881 exampleserver:0 LISTENING
TCP [::1]:8983 exampleserver:53194 ESTABLISHED
TCP [::1]:8983 exampleserver:54283 ESTABLISHED
TCP [::1]:8983 exampleserver:54476 TIME_WAIT
TCP [::1]:8983 exampleserver:54622 ESTABLISHED
TCP [::1]:8983 exampleserver:54632 FIN_WAIT_2
TCP [::1]:8983 exampleserver:62954 ESTABLISHED
TCP [::1]:8983 exampleserver:62971 ESTABLISHED
TCP [::1]:8983 exampleserver:63272 ESTABLISHED
TCP [::1]:53194 exampleserver:8983 ESTABLISHED
TCP [::1]:54283 exampleserver:8983 ESTABLISHED
TCP [::1]:54622 exampleserver:8983 ESTABLISHED
TCP [::1]:54632 exampleserver:8983 CLOSE_WAIT
TCP [::1]:62954 exampleserver:8983 ESTABLISHED
TCP [::1]:62971 exampleserver:8983 ESTABLISHED
TCP [::1]:63272 exampleserver:8983 ESTABLISHED
TCP [fe80::67ff:cd37:2e9:65fe%14]:1433 exampleserver:53967 ESTABLISHED
TCP [fe80::67ff:cd37:2e9:65fe%14]:1433 exampleserver:53981 ESTABLISHED
TCP [fe80::67ff:cd37:2e9:65fe%14]:1433 exampleserver:54262 ESTABLISHED
TCP [fe80::67ff:cd37:2e9:65fe%14]:1433 exampleserver:54276 ESTABLISHED
TCP [fe80::67ff:cd37:2e9:65fe%14]:1433 exampleserver:54635 ESTABLISHED
TCP [fe80::67ff:cd37:2e9:65fe%14]:53967 exampleserver:ms-sql-s ESTABLISHED
TCP [fe80::67ff:cd37:2e9:65fe%14]:53981 exampleserver:ms-sql-s ESTABLISHED
TCP [fe80::67ff:cd37:2e9:65fe%14]:54262 exampleserver:ms-sql-s ESTABLISHED
TCP [fe80::67ff:cd37:2e9:65fe%14]:54276 exampleserver:ms-sql-s ESTABLISHED
TCP [fe80::67ff:cd37:2e9:65fe%14]:54635 exampleserver:ms-sql-s ESTABLISHED
TCP [fe80::bc0d:78b4:381:b364%18]:54636 exampleserver:ms-sql-s TIME_WAIT
UDP 0.0.0.0:123 *:*
UDP 0.0.0.0:500 *:*
UDP 0.0.0.0:3389 *:*
UDP 0.0.0.0:4500 *:*
UDP 0.0.0.0:5353 *:*
UDP 0.0.0.0:5355 *:*
UDP 12.12.12.12:137 *:*
UDP 12.12.12.12:138 *:*
UDP 127.0.0.1:61804 *:*
UDP 172.25.64.1:53 *:*
UDP 172.25.64.1:137 *:*
UDP 172.25.64.1:138 *:*
UDP [::]:123 *:*
UDP [::]:500 *:*
UDP [::]:3389 *:*
UDP [::]:4500 *:*
UDP [::]:5353 *:*
UDP [::]:5355 *:*
UPDATE 3
In Windows Firewall I created a new TCP port rule for both inbound and outbound traffic, allowing ports "80,443,8050" (for Domain/Private/Public).
Then I retried running C:\root\scrapy>scrapy foobar -o allobjects.json in my container, but I still get the "Connection was refused" error.
I ran these commands:
- In my container:
C:\root\scrapy>netstat -a
- On my host OS:
D:\Programs\netstat -a
Port 8050 does not show up in either output, although I had expected it to be there.
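Since the host listing is several hundred lines long, it is easy to overlook a single port. A small sketch of filtering netstat output down to the listening TCP ports follows; the helper name listening_ports is hypothetical, and it assumes the four-column layout (Proto, Local Address, Foreign Address, State) shown in the listings above:

```python
from typing import Set


def listening_ports(netstat_output: str) -> Set[int]:
    """Collect the local TCP ports that are in the LISTENING state."""
    ports: Set[int] = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Data rows have exactly four fields: proto, local, foreign, state.
        if len(fields) == 4 and fields[0] == "TCP" and fields[3] == "LISTENING":
            # rpartition copes with IPv6 addresses such as [::]:135.
            port = fields[1].rpartition(":")[2]
            if port.isdigit():
                ports.add(int(port))
    return ports
```

The text could come from subprocess.run(["netstat", "-a", "-n"], capture_output=True, text=True).stdout, after which 8050 in listening_ports(...) gives a direct yes/no answer.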
UPDATE 4
Based on @LSerni/@Pankaj's suggestions around Splash:
I don't know what I would need to enter instead of scrapinghub/splash in your example.
I tried:
PS D:\Programs\image_addons> docker run -p 8050:8050 -p 5023:5023 scrapinghub/splash
Unable to find image 'scrapinghub/splash:latest' locally
latest: Pulling from scrapinghub/splash
docker: image operating system "linux" cannot be used on this platform.
And:
PS D:\Programs\image_addons> docker run -p 8050:8050 -p 5023:5023 scrapy/splash
Unable to find image 'scrapy/splash:latest' locally
docker: Error response from daemon: pull access denied for scrapy/splash, repository does not exist or may require 'docker login': denied: requested access to the resource is denied.
Splash is installed via my Dockerfile into my scrapy image, as can be seen from my post, so how would I need to start it?
UPDATE 5
Based on @Thiago Curvelo's comment around Splash:
So locally, on Windows 10 with Docker Desktop, the above setup works for me even though I'm not (at least not explicitly) starting a separate Splash instance, but perhaps that logic is built into Docker Desktop? On my server I can't use Docker Desktop as it does not support Hyper-V.
If you look at my Dockerfile, you see that I install scrapy-splash into my image.
Now to your point: apparently I need to start a Splash instance separately?
Does that container have to run separately and in parallel with my scrapy container? Or could (and SHOULD?) I add the Splash image to my existing scrapy image and run a container based on that? I'm new to this, so I'm not exactly sure whether that's what I should be Googling for.
(Also, I only tried scrapy/splash because I thought the scrapinghub part referred to the image name, in my case scrapy; I guess I was wrong about that too ;-))
docker ps? – Spooner
8050:8050 – Spooner
scrapy crawl foobar, check your outgoing IP using lumtest.com/myip.json and then debug whether your Splash server is allowed to accept requests from it or not. – Personification
'netstat -an | find /i "listening"' – Spooner
netstat -an | find /i "listening" results in error "FIND: Parameter format not correct". Output of netstat -a -n looks the same as the output of netstat -a already listed above, and port 8050 is not showing up in either. So should I perhaps open port 8050 in Windows Firewall (and what are the risks of this, if any?) – Olivine
scrapinghub/splash (There isn't a scrapy/splash available) – Midday
scrapy container on Windows and just Splash in VirtualBox, or do Docker, scrapy-splash and all images/containers have to run in VirtualBox? (FYI: I prefer to keep as much logic/data as I can on Windows as that's my main OS) – Olivine