How do I restart airflow webserver?
Asked Answered
I am using Airflow for my data pipeline project. I have configured my project in Airflow and started the webserver as a background process using the following command:

airflow webserver -p 8080 -D True

The server runs successfully in the background. Now I want to enable authentication in Airflow; I made the configuration changes in airflow.cfg, but the authentication functionality is not reflected in the server. When I stop and start the Airflow server on my local machine, it works.

So how can I restart my daemonized Airflow webserver process on my server?

Hither answered 22/8, 2016 at 7:19 Comment(1)
airflow webserver -p 8080 -D – Tegular
58

I advise running Airflow in a robust way, with auto-recovery, under systemd, so you can do:
- to start: systemctl start airflow
- to stop: systemctl stop airflow
- to restart: systemctl restart airflow
For this you'll need a systemd 'unit' file. As a (working) example you can use the following; put it in /lib/systemd/system/airflow.service:

[Unit]
Description=Airflow webserver daemon
After=network.target postgresql.service mysql.service redis.service rabbitmq-server.service
Wants=postgresql.service mysql.service redis.service rabbitmq-server.service
[Service]
PIDFile=/run/airflow/webserver.pid
EnvironmentFile=/home/airflow/airflow.env
User=airflow
Group=airflow
Type=simple
ExecStart=/bin/bash -c 'export AIRFLOW_HOME=/home/airflow ; airflow webserver --pid /run/airflow/webserver.pid'
ExecReload=/bin/kill -s HUP $MAINPID
ExecStop=/bin/kill -s TERM $MAINPID
Restart=on-failure
RestartSec=42s
PrivateTmp=true
[Install]
WantedBy=multi-user.target

P.S.: change AIRFLOW_HOME to point at the directory that contains your airflow folder with the config.
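Once the unit file is in place, a typical install sequence looks like the sketch below. This assumes a systemd-based distro and that you named the unit airflow.service as above:

```shell
# Reload systemd so it picks up the new unit file, then enable and start it.
sudo systemctl daemon-reload
sudo systemctl enable airflow    # start automatically at boot
sudo systemctl start airflow
sudo systemctl status airflow    # verify it came up
```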

Sukkah answered 17/5, 2017 at 11:1 Comment(6)
This is the right way to do it. There are example scripts for both upstart and systemd: github.com/apache/incubator-airflow/tree/master/scripts – Sammie
This is also discussed in the Airflow docs here: pythonhosted.org/airflow/… – Cud
If you are familiar with daemonizing Airflow, can you and/or @Cud please help me? I'm having trouble daemonizing it from within a virtualenv. Thanks! – Hurds
I got this error when I tried your solution: "Job for airflow.service failed because a configured resource limit was exceeded. See "systemctl status airflow.service" and "journalctl -xe" for details" – Jollanta
Just a question here: suppose we have apache-airflow in a virtual environment, would you have to activate the environment, or is there a way to execute the airflow webserver command with the file present in the bin folder of our virtual environment? – Catalogue
@AmartyaGaur a virtualenv is just a folder, so yes, you can execute anything in it directly. "Activating" a virtualenv simply aliases the python commands for the given shell (where the virtualenv is activated) so that any time you call python, pip, or airflow it is redirected to the bin folder of your virtualenv. To prevent confusion I think it's better to make sure you activate the virtualenv that you are currently working on. On production servers, you usually don't work with virtualenvs. – Sukkah
52

Check $AIRFLOW_HOME/airflow-webserver.pid for the process id of your webserver daemon.

Then send it a kill signal to terminate it:

cat $AIRFLOW_HOME/airflow-webserver.pid | xargs kill -9

Then clear the pid file

cat /dev/null >  $AIRFLOW_HOME/airflow-webserver.pid

Then just run

airflow webserver -p 8080 -D True

to restart the daemon.
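The three steps above fold into one small helper. A minimal sketch, assuming the default $AIRFLOW_HOME pid-file location; the function name restart_airflow_webserver is mine, not Airflow's:

```shell
restart_airflow_webserver() {
    local pid_file="${AIRFLOW_HOME:-$HOME/airflow}/airflow-webserver.pid"
    if [ -f "$pid_file" ]; then
        # SIGTERM is gentler than -9; ignore the error if the process is already gone.
        kill "$(cat "$pid_file")" 2>/dev/null || true
        : > "$pid_file"               # clear the stale pid file
    fi
    airflow webserver -p 8080 -D      # start the daemon again
}
```

SIGTERM gives the webserver a chance to shut down cleanly; reach for kill -9 only if it hangs.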

Cub answered 22/8, 2016 at 23:4 Comment(4)
Why do you need True after -D? – Cursorial
You're right. As long as you pass the flag you don't need to explicitly pass True. – Cub
This does not seem to work for me. I can still see the pid when I use cat. I am using WSL 2, btw. – Broyles
@captaincapsaicin: I have approved an edit of this answer. Please go over it to make sure it corresponds to what you intended. – Outbound
28

This worked for me (multiple times! :D )

Find the process id (assuming 8080 is the port):

lsof -i tcp:8080

Kill it:

kill <pid>
Fluorene answered 6/4, 2017 at 14:28 Comment(0)
13

Use Airflow webserver's (gunicorn) signal handling

Airflow uses gunicorn as its HTTP server, so you can send it standard POSIX-style signals. A signal commonly used by daemons to trigger a restart is HUP.

You'll need to locate the pid file for the airflow webserver daemon in order to get the right process id to send the signal to. This file could be in $AIRFLOW_HOME or in /var/run, where you'll find a lot of pid files.

Assuming the pid file is in /var/run, you could run the command:

cat /var/run/airflow-webserver.pid | xargs kill -HUP

gunicorn uses a preforking model, so it has master and worker processes. The HUP signal is sent to the master process, which performs these actions:

HUP: Reload the configuration, start the new worker processes with a new configuration and gracefully shutdown older workers. If the application is not preloaded (using the preload_app option), Gunicorn will also load the new version of it.

More information in the gunicorn signal handling docs.

This is mostly an expanded version of captaincapsaicin's answer, but using HUP (SIGHUP) instead of KILL (SIGKILL) to reload the process instead of actually killing it and restarting it.
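The reload behaviour is easy to see with a stand-in daemon. The sketch below (illustrative names, not Airflow's) traps SIGHUP the way gunicorn's master does, and receives it through the same pid-file-plus-xargs pattern:

```shell
demo_dir=$(mktemp -d)

# A stand-in "daemon" that logs each SIGHUP it receives instead of reloading.
bash -c "trap 'echo reloaded >> $demo_dir/hup.log' HUP; while :; do sleep 1; done" &
echo $! > "$demo_dir/daemon.pid"
sleep 1    # give the daemon time to install its trap

# Same pattern as for the webserver: read the pid file, send SIGHUP.
cat "$demo_dir/daemon.pid" | xargs kill -HUP
```

Afterwards the log contains "reloaded" and the loop keeps running, mirroring how gunicorn's master survives a HUP while cycling its workers.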

Coggins answered 29/3, 2017 at 18:39 Comment(1)
Thanks, this works great for me while developing an Airflow plugin! I'm using it with entr to auto-reload when I modify a file: git ls-files | entr sh -c 'cat $AIRFLOW_HOME/airflow-webserver.pid | xargs -t kill -HUP' – Elveraelves
5

In my case I wanted to kill the previous airflow process before starting a new one. For that, the following command did the magic:

killall -9 airflow
Nim answered 2/2, 2021 at 12:34 Comment(1)
Yes! I had incorrectly launched the webserver without the running service, and this did exactly what I wanted. – Correlation
3

As the question was related to webserver, this is something that worked in my case:

systemctl restart airflow-webserver

Lineberry answered 19/3, 2020 at 9:15 Comment(0)
2

To restart Airflow you need to restart both the Airflow webserver and the Airflow scheduler.

Check whether the Airflow processes are running:

ps aux | grep airflow

If you see entries like this in the list of running processes:

ubuntu     49601  0.1  1.6 266668 135520 ?       S    12:19   0:00 [ready] gunicorn: worker [airflow-webserver]

This means that Airflow webserver is running.

If you see entries like this:

ubuntu     49653  0.6  2.3 308912 187596 ?       S    12:19   0:00 airflow scheduler -- DagFileProcessorManager

That means that Airflow scheduler is running.

Stop Airflow servers (webserver and scheduler):

pkill -f "airflow scheduler"
pkill -f "airflow webserver"

Now run ps aux | grep airflow again to check that they have really shut down.

Start Airflow servers in background (daemon):

airflow webserver -D
airflow scheduler -D
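pkill -f matches against the full command line, so a distinctive substring such as "airflow scheduler" is enough to target just those daemons. A minimal demonstration with a stand-in process; sleep here is only a placeholder for the real daemon:

```shell
# Start a stand-in daemon with a recognizable command line.
sleep 300 &
sleep 1    # let it actually exec before matching on its command line

# -f matches the whole command line, not just the process name; the pattern
# is anchored here so it cannot accidentally match this script's own cmdline.
pkill -f "^sleep 300$"
```

Without -f, pkill matches only the process name, which is too coarse to distinguish the scheduler from the webserver.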
Cristoforo answered 10/2, 2023 at 12:28 Comment(0)
1

Just run:

airflow webserver -p 8080 -D 
Cabezon answered 4/11, 2019 at 16:19 Comment(0)
1

Find the pid with:

airflow webserver

which will reply: "The webserver is already running under PID 21250."

Then kill the webserver process with:

kill 21250

Cristoforo answered 23/3, 2021 at 11:10 Comment(0)
0

Create an init script and use the daemon command to run it as a service.

daemon --user="${USER}" --pidfile="${PID_FILE}" airflow webserver -p 8090 >> "${LOG_FILE}" 2>&1 &
Percolate answered 9/9, 2016 at 22:19 Comment(0)
0

None of these worked for me. I had to delete the $AIRFLOW_HOME/airflow-webserver.pid file; after that, running airflow webserver worked.
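A small guard makes that manual deletion safer: remove the pid file only when it is genuinely stale, i.e. the recorded process is no longer alive. A sketch, assuming the default pid-file path (adjust AIRFLOW_HOME to match your setup):

```shell
pid_file="${AIRFLOW_HOME:-$HOME/airflow}/airflow-webserver.pid"

# kill -0 sends no signal; it only tests whether the pid is still alive.
if [ -f "$pid_file" ] && ! kill -0 "$(cat "$pid_file")" 2>/dev/null; then
    rm "$pid_file"    # stale: safe to delete before starting the webserver
fi
```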

Antispasmodic answered 1/5, 2018 at 16:22 Comment(1)
Airflow refuses to start in daemon mode when a pid file still exists. – Recess
0

The recommended approach is to create and enable the Airflow webserver as a service. If you named the webserver service 'airflow-webserver', run the following command to restart it:

systemctl restart airflow-webserver

You can also use a ready-made AMI (namely, LightningFlow) from the AWS Marketplace, which provides the Airflow services (webserver, scheduler, worker) enabled at startup.

Note: LightningFlow comes pre-integrated with all required libraries, Livy, custom operators, and local Spark cluster.

Link for AWS Marketplace: https://aws.amazon.com/marketplace/pp/Lightning-Analytics-Inc-LightningFlow-Integrated-o/B084BSD66V

Tref answered 17/2, 2020 at 6:47 Comment(0)
0

Just kill the processes!

Assuming the default airflow home directory is ~/airflow/:

List the pid files of the 3 parent airflow processes (PID):

cat ~/airflow/airflow-scheduler.pid
cat ~/airflow/airflow-webserver.pid
cat ~/airflow/airflow-webserver-monitor.pid

Get their PGID using:

ps -xjf

And finally run a loop to kill the whole process tree of each parent (PID):

for child in $(ps x -o "%P %p %r"| awk '{ if ( $1 == $your_first_PID || $3 == $your_first_PGID) { print $2 }}'); do kill $child; done
Amias answered 11/11, 2022 at 21:52 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.