I am trying to run Spark + HDFS on a small cluster as a Docker Swarm stack deployment. It generally works, but I ran into an issue that is preventing Spark from taking advantage of data locality.
In an attempt to enable data locality, I made a single "worker node" container on each server that contains both the Spark worker and the HDFS datanode. The idea is that, because they run in the same container, they should both have the same IP address on the stack's overlay network. However, they do not: the container gets one IP address on the overlay network, while the service defined in the stack's compose file gets a separate virtual IP (VIP).
It turns out that the HDFS datanode process binds to the container's IP address, while the Spark worker process binds to the service's VIP (as best I can determine). As a result, Spark doesn't know that the worker and the datanode are actually on the same machine, and it only schedules tasks with ANY locality.
I am sure I am missing something, but I (of course) don't know what.
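The closest lead I have found so far is Swarm's service endpoint mode: by default a service gets a virtual IP that is separate from the task container's own address on the overlay network, and as far as I understand (I have not tested this yet), setting endpoint_mode: dnsrr drops the service VIP so the service name resolves directly to the container. A minimal sketch of what I mean, for the worker-node2 service defined below:

  worker-node2:
    ...
    deploy:
      # my assumption: with dnsrr there is no service VIP, so "worker-node2"
      # should resolve directly to the task container's IP on the overlay network
      endpoint_mode: dnsrr
      mode: replicated
      replicas: 1

I have not been able to confirm whether this actually makes both processes bind to the same address, which is part of why I am asking.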
The Docker stack compose file entry that I use for defining each worker node service looks like this:
version: '3.4'
services:
...
  worker-node2:
    image: master:5000/spark-hdfs-node:latest
    hostname: "worker-node2"
    networks:
      - cluster_network
    environment:
      - SPARK_PUBLIC_DNS=10.1.1.1
      - SPARK_LOG_DIR=/data/spark/logs
    depends_on:
      - hdfs-namenode
    volumes:
      - type: bind
        source: /mnt/data/hdfs
        target: /data/hdfs
      - type: bind
        source: /mnt/data/spark
        target: /data/spark
    deploy:
      mode: replicated
      replicas: 1
      placement:
        constraints:
          - node.hostname == slave1
      resources:
        limits:
          memory: 56g
...
networks:
  cluster_network:
    attachable: true
    ipam:
      driver: default
      config:
        - subnet: 10.20.30.0/24
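On the Spark side, the only binding-related setting I am aware of is the SPARK_LOCAL_IP / SPARK_LOCAL_HOSTNAME environment variables, which control the address the worker binds to and advertises. I have not verified that this helps here; a sketch of how it would look in the service above, with SPARK_LOCAL_HOSTNAME=worker-node2 being my own hypothetical addition:

    environment:
      - SPARK_PUBLIC_DNS=10.1.1.1
      - SPARK_LOG_DIR=/data/spark/logs
      # hypothetical: advertise the container hostname so the worker's address
      # matches the hostname the datanode registers with the namenode
      - SPARK_LOCAL_HOSTNAME=worker-node2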
The Hadoop hdfs-site.xml configuration looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/data/hdfs/datanode</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/data/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
    <description>The default replication factor of files on HDFS</description>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.block.size</name>
    <value>64m</value>
    <description>The default block size in bytes of data saved to HDFS</description>
  </property>
  <property>
    <name>dfs.namenode.datanode.registration.ip-hostname-check</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.client.use.datanode.hostname</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.datanode.use.datanode.hostname</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-bind-host</name>
    <value>0.0.0.0</value>
    <description>
      controls what IP address the NameNode binds to.
      0.0.0.0 means all available.
    </description>
  </property>
  <property>
    <name>dfs.namenode.servicerpc-bind-host</name>
    <value>0.0.0.0</value>
    <description>
      controls what IP address the NameNode binds to.
      0.0.0.0 means all available.
    </description>
  </property>
  <property>
    <name>dfs.namenode.http-bind-host</name>
    <value>0.0.0.0</value>
    <description>
      controls what IP address the NameNode binds to.
      0.0.0.0 means all available.
    </description>
  </property>
  <property>
    <name>dfs.namenode.https-bind-host</name>
    <value>0.0.0.0</value>
    <description>
      controls what IP address the NameNode binds to.
      0.0.0.0 means all available.
    </description>
  </property>
</configuration>
My full setup can be viewed here on GitHub.
Does anyone have any idea what I am doing wrong that is preventing the Spark worker and HDFS datanode processes in the same Docker container from binding to the same IP address?