Gunicorn, supervisord, and graceful reloads
// TODO: this page is a WIP.
Background
// TODO
How gunicorn handles SIGHUP
When we send SIGHUP to gunicorn, new requests will be blocked until the new worker processes are ready to serve requests. How long requests are blocked is determined by how long it takes your application to start up.
In gunicorn’s signal handling documentation this is how SIGHUP is described:
HUP: Reload the configuration, start the new worker processes with a new configuration and gracefully shutdown older workers. If the application is not preloaded (using the --preload option), Gunicorn will also load the new version of it.
When gunicorn mentions “configuration”, it is referring to its own configuration, not your application’s. Gunicorn also has its own definition of a graceful shutdown: it means that in-flight requests will complete. It does not mean that there will be no application downtime.
To illustrate, this is what happens when gunicorn receives SIGHUP:
Step 1: Requests go to the worker processes.
Step 2: New workers are spawned. Requests start getting routed to the new workers. The new workers are not yet initialized and so can’t process requests.
Step 3: Old workers are gracefully shut down and finish processing in-flight requests. No new requests go to the old workers. New requests still go to the new workers, which are not yet initialized.
Step 4: Old workers finish shutting down. New workers are not yet initialized.
Step 5: Once the application has finished starting up (about 20 seconds in this example), the new workers are initialized. Requests are now getting processed again.
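One way to see this blocking for yourself is to probe the service in a loop from one terminal while sending SIGHUP from another. The following is a minimal sketch rather than part of the original setup: `probe_latency` is a hypothetical helper, the URL in the usage comment is an assumed local address, and the timing relies on GNU `date` for nanosecond timestamps.

```bash
# Hypothetical helper: run a command repeatedly and print how long each run took.
# Usage: probe_latency <count> <interval_seconds> <command...>
function probe_latency() {
    local count=$1 interval=$2
    shift 2
    local i start_ms end_ms rc
    for ((i = 1; i <= count; i++)); do
        start_ms=$(( $(date +%s%N) / 1000000 ))
        "$@" > /dev/null 2>&1
        rc=$?
        end_ms=$(( $(date +%s%N) / 1000000 ))
        echo "probe=$i exit=$rc elapsed_ms=$(( end_ms - start_ms ))"
        sleep "$interval"
    done
}

# Example (assumed address; adjust to wherever gunicorn is bound):
#   probe_latency 60 1 curl -s -o /dev/null --max-time 60 http://127.0.0.1:8000/
```

If the behaviour described in the steps holds for your application, the probes issued during the reload should show a much larger elapsed time than the ones before and after it.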
Graceful reloading with SIGUSR2
To reload an application gracefully without blocking requests, we can send the SIGUSR2 signal to gunicorn instead. SIGUSR2 is designed for upgrading the gunicorn binary, so we are misappropriating it a little here.
SIGUSR2 will cause gunicorn to spawn a new master process while leaving the old master process running. After a warm-up period, the new master process is up and serving requests, and we can terminate the old master process by sending SIGTERM to it.
To illustrate:
Step 1: Before receiving the signal, gunicorn has one master process.
Step 2: Sending SIGUSR2 to the master process will cause it to spawn a new one. The new master process starts to initialize its workers.
Step 3: After the workers in the new master process are initialized, requests will start getting served by both masters.
Step 4: Sending SIGTERM to the old master process will cause it to gracefully shut down its workers. The workers will finish processing the in-flight requests. The old master process will exit once all workers have exited. Only the new master remains.
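The four steps above can also be run by hand. A minimal sketch, assuming gunicorn was started with a PID file (the `--pid` flag); the function name `swap_master` and the values in the usage comment are illustrative, not something gunicorn provides:

```bash
# Hypothetical helper: fork a new master with SIGUSR2, then retire the old one.
# Usage: swap_master <pidfile> <warmup_seconds>
function swap_master() {
    local pidfile=$1 warmup=$2
    local old_pid
    old_pid=$(cat "$pidfile") || return 1

    # Step 2: ask the running master to spawn a new master.
    kill -s USR2 "$old_pid" || return 1

    # Step 3: give the new master's workers time to initialize.
    sleep "$warmup"

    # Step 4: gracefully stop the old master and wait until it is gone.
    kill -s TERM "$old_pid" || return 1
    while kill -0 "$old_pid" 2> /dev/null; do
        sleep 0.2
    done
}

# Example (assumed path and warm-up):
#   swap_master /run/gunicorn.pid 30
```

The warm-up value is the weak point of this approach: it has to be at least as long as your application takes to start.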
With supervisor
Unfortunately, the SIGUSR2 approach to gracefully reloading applications does not play well with supervisor, because the new gunicorn master process isn’t owned by supervisor. When the old master process is terminated, supervisor sees its program exit and attempts to restart gunicorn. This results in two master processes.
What we can do instead is wrap the gunicorn master processes with a script. Supervisor will manage this wrapper script, not the gunicorn master processes. The script will handle sending SIGUSR2/SIGTERM to reload gunicorn gracefully. The script will present itself to supervisor with a consistent PID to insulate supervisor from the changing PIDs of the currently-active gunicorn master process.
In the main loop of the script, we start gunicorn if it hasn’t been started yet. If the gunicorn process is externally killed or otherwise does not exist, then we exit the script. When the script exits, supervisor will handle restarting it.
gunicorn_args=("$@")
gunicorn_pidfile="/run/gunicorn.pid"

function start_gunicorn() {
    log "Starting gunicorn"
    gunicorn "${gunicorn_args[@]}" &
}

function gunicorn_exists() {
    [ -f "$gunicorn_pidfile" ] && ps -p "$(cat "$gunicorn_pidfile")" &> /dev/null
}

# Start gunicorn if not yet started
if ! gunicorn_exists; then
    start_gunicorn
fi

# Loop to keep the script alive
while true; do
    sleep 5
    if ! gunicorn_exists; then
        # If somehow gunicorn has stopped, exit this script.
        exit 0
    fi
done
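The first liveness check runs 5 seconds after gunicorn is launched, so the loop implicitly assumes the PID file exists by then. If that assumption worries you, the wrapper could wait for the file explicitly before entering the loop. A sketch with a hypothetical `wait_for_pidfile` helper, which is not part of the original script and assumes the wrapper runs as the same user as gunicorn (so that `kill -0` is permitted):

```bash
# Hypothetical helper: wait until a PID file exists and names a live process.
# Usage: wait_for_pidfile <pidfile> <timeout_seconds>
# Returns 0 once the process is found, 1 if the timeout expires first.
function wait_for_pidfile() {
    local pidfile=$1 timeout=$2
    local deadline=$(( SECONDS + timeout ))
    while (( SECONDS <= deadline )); do
        if [ -s "$pidfile" ] && kill -0 "$(cat "$pidfile")" 2> /dev/null; then
            return 0
        fi
        sleep 0.5
    done
    return 1
}

# Example, right after start_gunicorn (the 15-second budget is an assumption):
#   wait_for_pidfile "$gunicorn_pidfile" 15 || exit 1
```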
When the script receives SIGTERM, we propagate the signal to the gunicorn process and wait for it to exit, before the script itself exits.
trap shutdown SIGTERM

function log() {
    echo "[$(date --rfc-3339=seconds)] [gunicorn-wrapper] $1"
}

function shutdown() {
    if [ -f "$gunicorn_pidfile" ]; then
        local pid
        pid=$(cat "$gunicorn_pidfile")
        log "Shutting down. Sending SIGTERM to $pid"
        kill -s SIGTERM "$pid"
        wait_pid "$pid"
    fi
    exit
}

# Block until the given PID no longer exists (uses GNU tail's --pid option)
function wait_pid() {
    local pid=$1
    tail --pid="$pid" -f /dev/null
}
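One detail worth knowing: bash does not run a trap handler while it is waiting for a foreground command to finish. With a plain `sleep 5` in the main loop, the `shutdown` handler can therefore be delayed by up to 5 seconds, and by longer during the 30-second warm-up sleep used for reloads. If that delay matters, a common pattern is to run `sleep` in the background and `wait` on it, because the `wait` builtin returns as soon as a trapped signal arrives. A sketch, with `interruptible_sleep` as a hypothetical name:

```bash
# Hypothetical helper: a sleep that returns early when a trapped signal arrives.
# Usage: interruptible_sleep <seconds>
function interruptible_sleep() {
    local sleep_pid rc
    sleep "$1" &
    sleep_pid=$!
    # `wait` is interrupted by trapped signals; a foreground `sleep` is not.
    wait "$sleep_pid"
    rc=$?
    # If we were interrupted, clean up the leftover background sleep.
    kill "$sleep_pid" 2> /dev/null
    return "$rc"
}
```

Replacing `sleep 5` in the main loop with `interruptible_sleep 5` would let the SIGTERM and SIGHUP handlers run promptly. When interrupted, the function returns a status greater than 128, which the loop can ignore.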
When the script receives SIGHUP, we gracefully reload gunicorn using the SIGUSR2+SIGTERM approach. The new gunicorn master process is given 30 seconds to warm up. This warm up period depends on how long it takes your application to start.
trap queue_for_reload SIGHUP
should_reload=0

# Signal handler: only set a flag; the main loop performs the reload
function queue_for_reload() {
    should_reload=1
}

function reload_gunicorn() {
    if [ ! -f "$gunicorn_pidfile" ]; then
        return
    fi
    old_gunicorn_pid=$(cat "$gunicorn_pidfile")
    # If the recorded process no longer exists, do nothing
    if ! ps -p "$old_gunicorn_pid" &> /dev/null; then
        return
    fi
    # Signal gunicorn to fork the master process
    log "Sending SIGUSR2 to $old_gunicorn_pid"
    kill -s SIGUSR2 "$old_gunicorn_pid"
    # Give the new master process 30s to start up
    sleep 30
    # Gracefully terminate the old master process
    log "Sending SIGTERM to $old_gunicorn_pid"
    kill -s SIGTERM "$old_gunicorn_pid"
    wait_pid "$old_gunicorn_pid"
    log "Gunicorn pid $old_gunicorn_pid shutdown complete"
    # Allow time for the pid file to be updated before reading it
    sleep 2
    log "New gunicorn pid is $(cat "$gunicorn_pidfile")"
}

while true; do
    sleep 5
    if [ "$should_reload" -ne "0" ]; then
        reload_gunicorn
        should_reload=0
    fi
done
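The fixed `sleep 30` is a guess at the application's start-up time. If your service can expose some readiness signal, the reload could poll for it with a timeout instead. The sketch below is an alternative built on assumptions, not what the original script does: `wait_until` is a hypothetical helper, and the marker file in the usage comment is made up for illustration. Note that an HTTP probe against the shared listening socket is not a reliable signal here, since during the overlap it may be answered by the old workers.

```bash
# Hypothetical helper: retry a command until it succeeds or a timeout expires.
# Usage: wait_until <timeout_seconds> <command...>
# Returns 0 as soon as the command succeeds, 1 on timeout.
function wait_until() {
    local timeout=$1
    shift
    local deadline=$(( SECONDS + timeout ))
    until "$@"; do
        if (( SECONDS >= deadline )); then
            return 1
        fi
        sleep 1
    done
    return 0
}

# Possible use inside reload_gunicorn, in place of `sleep 30`.
# The marker file is an assumption: something your application would
# create once a new worker has finished loading.
#   rm -f /run/app.ready
#   kill -s SIGUSR2 "$old_gunicorn_pid"
#   wait_until 60 test -e /run/app.ready || log "Warm-up timed out; continuing"
```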
The full source of the gunicorn-wrapper script can be found in the appendices. This wrapper script is a drop-in replacement for gunicorn in your supervisor config. The path passed to --pid must match the gunicorn_pidfile value hard-coded in the script.
[program:app]
- command=gunicorn
+ command=gunicorn-wrapper
    --pid /run/gunicorn.pid
    --chdir=/opt/code
    wsgi:application
stopsignal=TERM
You can then trigger a hot reload of your application by sending SIGHUP to the program managed by supervisor.
kill -s SIGHUP "$(supervisorctl pid app)"
At Carousell, we trigger these hot reloads with consul-template whenever there is a configuration update, so that our service reloads and picks up the configuration changes.
Appendices
gunicorn-wrapper script
#!/bin/bash

trap queue_for_reload SIGHUP
trap shutdown SIGTERM

gunicorn_pidfile="/run/gunicorn.pid"
gunicorn_args=("$@")
should_reload=0

function log() {
    echo "[$(date --rfc-3339=seconds)] [gunicorn-wrapper] $1"
}

function shutdown() {
    if [ -f "$gunicorn_pidfile" ]; then
        local pid
        pid=$(cat "$gunicorn_pidfile")
        log "Shutting down. Sending SIGTERM to $pid"
        kill -s SIGTERM "$pid"
        wait_pid "$pid"
    fi
    exit
}

# Signal handler: only set a flag; the main loop performs the reload
function queue_for_reload() {
    should_reload=1
}

function reload_gunicorn() {
    if [ ! -f "$gunicorn_pidfile" ]; then
        return
    fi
    old_gunicorn_pid=$(cat "$gunicorn_pidfile")
    # If the recorded process no longer exists, do nothing
    if ! ps -p "$old_gunicorn_pid" &> /dev/null; then
        return
    fi
    # Signal gunicorn to fork the master process
    log "Sending SIGUSR2 to $old_gunicorn_pid"
    kill -s SIGUSR2 "$old_gunicorn_pid"
    # Give the new master process 30s to start up
    sleep 30
    # Gracefully terminate the old master process
    log "Sending SIGTERM to $old_gunicorn_pid"
    kill -s SIGTERM "$old_gunicorn_pid"
    wait_pid "$old_gunicorn_pid"
    log "Gunicorn pid $old_gunicorn_pid shutdown complete"
    # Allow time for the pid file to be updated before reading it
    sleep 2
    log "New gunicorn pid is $(cat "$gunicorn_pidfile")"
}

# Block until the given PID no longer exists (uses GNU tail's --pid option)
function wait_pid() {
    local pid=$1
    tail --pid="$pid" -f /dev/null
}

function start_gunicorn() {
    log "Starting gunicorn"
    gunicorn "${gunicorn_args[@]}" &
}

function gunicorn_exists() {
    [ -f "$gunicorn_pidfile" ] && ps -p "$(cat "$gunicorn_pidfile")" &> /dev/null
}

# Start gunicorn if not yet started
if ! gunicorn_exists; then
    start_gunicorn
fi

# Loop to keep the script alive
while true; do
    sleep 5
    if [ "$should_reload" -ne "0" ]; then
        reload_gunicorn
        should_reload=0
    elif ! gunicorn_exists; then
        # If somehow gunicorn has stopped, exit this script.
        exit 0
    fi
done