Scaling Shiny App using Traefik and Docker Swarm

Scaling a Shiny app is quite tricky, primarily because R is single-threaded. There are several options that we know of:

  • ShinyProxy: spawns and terminates an app container for each user, and can run in Docker or Kubernetes mode. As our app is quite big and takes quite some time to initialize, this is not our first option.
  • Shiny Server Pro: runs multiple processes. I don’t know how they achieve this, perhaps with something like the approach in the next bullet. Too expensive.
  • Instead of spawning a container for each new user, why not create several pre-warmed app containers, load-balance users across them, and scale up if there are too many users?

This is a quick note on the third approach, using Traefik with Docker as its provider.
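To make the third approach concrete, here is a toy model of what we want from the load balancer: a fixed pool of pre-warmed replicas, new users handed out round-robin, and returning users sent back to the replica named in their cookie. This is only a sketch of the idea, not how Traefik implements it; the class name and the replica names are made up for illustration.

```python
from itertools import cycle
from typing import List, Optional, Tuple


class StickyBalancer:
    """Toy round-robin balancer with cookie-based stickiness (illustration only)."""

    def __init__(self, replicas: List[str]) -> None:
        if not replicas:
            raise ValueError("need at least one pre-warmed replica")
        self._replicas = set(replicas)
        self._next = cycle(replicas)  # endless round-robin over the pool

    def route(self, cookie: Optional[str]) -> Tuple[str, str]:
        """Return (replica that serves the request, cookie value to set)."""
        if cookie in self._replicas:
            # Returning user: keep them on the replica they already use.
            return cookie, cookie
        # New user, or the cookie points at a replica that no longer exists.
        replica = next(self._next)
        return replica, replica


if __name__ == "__main__":
    lb = StickyBalancer(["shiny.1", "shiny.2", "shiny.3"])
    first, cookie = lb.route(None)   # new user
    again, _ = lb.route(cookie)      # same user comes back
    print(first, again)
```

The last branch also covers scaling down: a user whose replica has gone away simply gets a new one on the next request.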

Traefik is an edge router: it sits at the edge of the network and proxies requests to, among other things, the web apps behind it. In its parlance, our app is a ‘service’. Traefik can auto-discover routes and configuration from the service itself, instead of having them defined statically on the Traefik side. When we scale a service up or down, Traefik reconfigures its routing automatically. Besides static configuration, Traefik has several ‘providers’ that can supply that information, e.g. Docker and Kubernetes Ingress.
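With the Docker provider, that information travels as labels with dotted keys attached to the service. The sketch below shows how such labels can be read as a nested configuration tree. It is a simplified model to build intuition, not Traefik's actual parser; the function name `labels_to_tree` is made up, and the labels are the ones used for the shiny service in the Compose file in this note.

```python
from typing import Any, Dict, List

SHINY_LABELS = [
    "traefik.enable=true",
    "traefik.http.routers.shiny.rule=PathPrefix(`/`)",
    "traefik.http.services.shiny.loadbalancer.server.port=3838",
    "traefik.http.services.shiny.loadbalancer.sticky.cookie=true",
    "traefik.http.services.shiny.loadbalancer.sticky.cookie.name=stickysvr",
]


def labels_to_tree(labels: List[str], prefix: str = "traefik.") -> Dict[str, Any]:
    """Turn dotted 'key=value' labels into a nested dict (simplified model)."""
    tree: Dict[str, Any] = {}
    for label in labels:
        key, _, value = label.partition("=")  # split on the first '=' only
        if not key.startswith(prefix):
            continue  # not a Traefik label, ignore it
        parts = key[len(prefix):].split(".")
        node = tree
        for part in parts[:-1]:
            child = node.get(part)
            if not isinstance(child, dict):
                # A more specific key (cookie.name) replaces a bare flag (cookie=true).
                child = {}
                node[part] = child
            node = child
        if not isinstance(node.get(parts[-1]), dict):
            node[parts[-1]] = value
    return tree


if __name__ == "__main__":
    import json
    print(json.dumps(labels_to_tree(SHINY_LABELS), indent=2))
```

Reading the tree top-down gives the same picture as the labels: one router named shiny with a rule, and one service named shiny with a load balancer that targets port 3838 and sets a sticky cookie.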

After the initial Swarm setup (basically, install Docker and enable swarm mode on several servers, then join them together; we’ll call them nodes), we create a Docker Compose configuration like the one below. It must use file format version 3 or later; the long-form ports syntax used here needs 3.2.

# swarm.yaml
version: '3.2'
services:
  traefik:
    image: traefik:v2.4.7
    deploy:
      placement:
        constraints:
          - node.role == manager
      labels:
        - "traefik.http.services.traefik.loadbalancer.server.port=3840"
    command:
      - "--api.insecure=true"
      - "--providers.docker=true"
      - "--providers.docker.swarmMode=true"
    ports:
      - published: 3840
        target: 80
      - published: 8080
        target: 8080
    volumes: # so traefik can communicate with docker
      - /var/run/docker.sock:/var/run/docker.sock:ro
  shiny:
    image: myrepo/shinyserver:20210316
    deploy:
      replicas: 10
      labels: # stick, easier to trace for error
        - "traefik.enable=true"
        - "traefik.http.routers.shiny.rule=PathPrefix(`/`)"
        - "traefik.http.services.shiny.loadbalancer.server.port=3838"
        - "traefik.http.services.shiny.loadbalancer.sticky.cookie=true"
        - "traefik.http.services.shiny.loadbalancer.sticky.cookie.name=stickysvr"
    volumes:
      - /NFS_FOLDER/app/:/app/
      - ${PWD}/user_config.yaml:/app/user_config.yaml # secrets

That’s it. As you can see, it’s only a few labels and scaling parameters on top of a regular Compose file. The rest is putting our existing WAF and TLS-termination server in front of this cluster.
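For completeness, here is a rough sketch of the Docker CLI calls involved, from creating the Swarm to deploying swarm.yaml and scaling the shiny service by hand. It only builds the command lines so they can be reviewed before running them. The stack name `shinystack`, the IP address and the helper function names are assumptions for illustration; the `docker` subcommands themselves are the standard ones.

```python
import subprocess
from typing import List

SWARM_PORT = 2377  # Docker's default port for Swarm cluster management


def swarm_init(manager_ip: str) -> List[str]:
    """Run once on the first manager node."""
    return ["docker", "swarm", "init", "--advertise-addr", manager_ip]


def swarm_worker_token() -> List[str]:
    """Run on a manager; prints the token that workers need in order to join."""
    return ["docker", "swarm", "join-token", "-q", "worker"]


def swarm_join(token: str, manager_ip: str) -> List[str]:
    """Run on every other server to add it to the Swarm as a worker node."""
    return ["docker", "swarm", "join", "--token", token, f"{manager_ip}:{SWARM_PORT}"]


def stack_deploy(stack: str, compose_file: str = "swarm.yaml") -> List[str]:
    """Run on a manager to create or update the stack from the Compose file."""
    return ["docker", "stack", "deploy", "-c", compose_file, stack]


def service_scale(stack: str, service: str, replicas: int) -> List[str]:
    """Services deployed through a stack are named <stack>_<service>."""
    if replicas < 0:
        raise ValueError("replicas must not be negative")
    return ["docker", "service", "scale", f"{stack}_{service}={replicas}"]


def run(cmd: List[str]) -> str:
    """Execute a command and return its stdout; raises if docker reports an error."""
    done = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return done.stdout.strip()


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    plan = [
        swarm_init("10.0.0.1"),                   # hypothetical manager address
        swarm_worker_token(),
        swarm_join("<worker-token>", "10.0.0.1"),
        stack_deploy("shinystack"),               # hypothetical stack name
        service_scale("shinystack", "shiny", 15),
    ]
    for cmd in plan:
        print(" ".join(cmd))
```

Once the stack is up, scaling is a single `docker service scale` call, and Traefik picks up the new replicas through the Docker provider without a restart.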

Figure: Swarm service resource utilization. We can scale service replicas here.
Figure: Traefik dashboard, exposed on port 8080.

Several pieces of work are planned for later:

  • Create a ‘sidecar’ app that health-checks the Shiny app’s readiness to serve and monitors whether it terminates (e.g. throws a fatal error). I think in Swarm it must live in the same container, since the container is the smallest unit of a service. In Kubernetes, on the other hand, it would be a separate container in the same pod.
  • Set up observability. We have Prometheus, and Traefik handily exports metrics for it to consume. Since Shiny is the only service, we will not use Traefik’s tracing.
  • Write a script to monitor those metrics and dynamically scale the shiny replicas.
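None of this is built yet, but the two core pieces could look roughly like the sketch below: a readiness probe that treats any HTTP failure as "not ready", and a function that turns a load figure into a replica count. Everything here is an assumption for illustration: the function names, the idea of sizing by active sessions per replica, and the limits. How the session count is obtained from Prometheus is left open.

```python
import http.client
import math
import urllib.error
import urllib.request


def is_ready(url: str, timeout: float = 2.0) -> bool:
    """Return True if the app answers the URL with a 2xx status in time."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, 4xx/5xx, broken response: not ready.
        return False


def desired_replicas(
    active_sessions: int,
    sessions_per_replica: int,
    min_replicas: int = 2,
    max_replicas: int = 20,
) -> int:
    """Replicas needed for the current load, clamped to [min, max]."""
    if sessions_per_replica <= 0:
        raise ValueError("sessions_per_replica must be positive")
    if active_sessions < 0:
        raise ValueError("active_sessions must not be negative")
    needed = math.ceil(active_sessions / sessions_per_replica)
    return max(min_replicas, min(max_replicas, needed))


if __name__ == "__main__":
    # 47 sessions at 5 sessions per replica need 10 replicas.
    print(desired_replicas(active_sessions=47, sessions_per_replica=5))
```

The resulting number would then be handed to `docker service scale`. Keeping a minimum above zero means a few pre-warmed replicas are always ready, which is the whole point of this approach.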

That will have to wait ;-)
