Implementing SSO with Traefik

samdbmg/ansible-traefik-auth-proxy combines the Traefik reverse proxy with thomseddon/traefik-forward-auth to provide HTTP reverse proxying, certificate handling with LetsEncrypt and SSO login, either as an Ansible role or a Docker Compose project.

The idea is to have a subdomain for each of my servers (e.g. sofaserver.samn.co.uk) and to deploy various self-hosted HTTP services as Docker containers under subdomains of it (e.g. coolapp.sofaserver.samn.co.uk). Each service gets exposed with login and certificates automatically sorted out, just by applying labels to the container when it's launched. Setting that up requires knitting together some bits of configuration, which is why I've built an Ansible role for it.

How it works

To make configuration based on Docker container labels work, Traefik has the concept of dynamic configuration. Various providers, such as the Docker Provider, supply it with the details of what should be configured at runtime, and Traefik listens for containers launching with traefik.* labels attached so it can expose them accordingly. However, granting anything direct access to Docker via the Docker socket is a security risk: anything with access to that socket is effectively root on the host. Instead, an instance of tecnativa/docker-socket-proxy is deployed, which provides read-only access to details about Docker containers but rejects any other requests, reducing the attack surface.
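As a rough illustration of the label-driven approach, the sketch below builds the kind of traefik.* labels a container might carry. The label keys follow Traefik v2's Docker provider naming; the resolver name "letsencrypt", the middleware name "forward-auth" and port 8080 are assumptions for the example, not necessarily what the role configures.

```python
def traefik_labels(
    name: str,
    host: str,
    port: int,
    cert_resolver: str = "letsencrypt",      # assumed resolver name
    auth_middleware: str = "forward-auth",   # assumed middleware name
) -> dict[str, str]:
    """Build Docker labels asking Traefik to route `host` to this container."""
    router = f"traefik.http.routers.{name}"
    return {
        # Only needed when Traefik is set not to expose containers by default.
        "traefik.enable": "true",
        # Match requests by Host header.
        f"{router}.rule": f"Host(`{host}`)",
        # Ask the named certificate resolver for a certificate for this host.
        f"{router}.tls.certresolver": cert_resolver,
        # Run requests through the auth middleware before the service sees them.
        f"{router}.middlewares": auth_middleware,
        # Tell Traefik which container port serves HTTP.
        f"traefik.http.services.{name}.loadbalancer.server.port": str(port),
    }


labels = traefik_labels("coolapp", "coolapp.sofaserver.samn.co.uk", 8080)
for key, value in labels.items():
    print(f"{key}={value}")
```

If the middleware is defined by a different provider than the container (for example in a file), Traefik expects the reference to carry a provider suffix such as forward-auth@file.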

The first part of exposing a service is automatically issuing a TLS certificate with LetsEncrypt for the relevant subdomain. LetsEncrypt has to check that you control the domain for which you've requested a certificate, using one of a series of challenges such as HTTP-01 and DNS-01. In the former, LetsEncrypt provides a file to host at a specific location on your web server and verifies it can be accessed over unencrypted HTTP (proving you have control of the location the domain points to). In the latter, it provides details of a specific DNS TXT record to create (proving you control the domain itself). Fortunately, Traefik takes care of both challenges, either intercepting and responding to requests for the challenge file over HTTP, or creating DNS records on your DNS provider using the go-acme/lego client. In my case, I tend to use DNS-01 to avoid having to expose the HTTP port and because some of my services only accept requests from inside my network, but it makes little difference.
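To make the DNS-01 challenge a little more concrete, here is a minimal sketch of how the TXT record is derived under the ACME specification (RFC 8555): the record lives at _acme-challenge.<domain> and its value is the unpadded base64url SHA-256 digest of the key authorization. Traefik and lego do this for you; the token and thumbprint values below are made-up placeholders.

```python
import base64
import hashlib


def dns01_txt_record(domain: str, token: str, account_thumbprint: str) -> tuple[str, str]:
    """Return the (record name, record value) pair for a DNS-01 challenge."""
    # The key authorization joins the challenge token and the account key thumbprint.
    key_authorization = f"{token}.{account_thumbprint}"
    digest = hashlib.sha256(key_authorization.encode("ascii")).digest()
    # ACME uses base64url without trailing '=' padding.
    value = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"_acme-challenge.{domain}", value


# Placeholder inputs, purely for illustration.
name, value = dns01_txt_record(
    "coolapp.sofaserver.samn.co.uk",
    token="example-token",
    account_thumbprint="example-thumbprint",
)
print(name, "TXT", value)
```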

The second major piece of automatic configuration is authentication and authorization, to restrict access to my services to people who should be able to log in. Traefik supports middleware that changes how requests get processed, such as the ForwardAuth middleware, which calls some other service and only permits a request when that service responds that it is allowed. thomseddon/traefik-forward-auth is an example of one of those services, and it permits requests that pass an OAuth2 Authorization Code Grant. From a user perspective that looks like what happens when you push a "Sign in with Google" or "Log in with GitHub" button: you're sent to Google/GitHub/et al (the Identity Provider, or IdP), you log in, and the Identity Provider sends you back with a code. The forward auth service uses that code to get an Access Token and some details about you (your email address), which are forwarded on to the service you're logging in to as the X-Forwarded-User header.
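The ForwardAuth contract can be sketched as a small decision function: a 2xx answer from the auth service lets the original request through (with selected headers copied onto it), and anything else is returned to the browser as-is, which is how the redirect to the IdP reaches the user. This is a simplified model of the behaviour Traefik documents, not its implementation; the AuthResponse type and function name are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AuthResponse:
    """Hypothetical stand-in for the auth service's HTTP response."""
    status: int
    headers: dict[str, str] = field(default_factory=dict)


def apply_forward_auth(
    auth: AuthResponse,
    request_headers: dict[str, str],
    copy_headers: tuple[str, ...] = ("X-Forwarded-User",),
):
    """Return ("forward", headers) to proxy the request, or ("reply", auth)."""
    if 200 <= auth.status < 300:
        forwarded = dict(request_headers)
        # Copy the configured headers from the auth response to the backend request.
        for header in copy_headers:
            if header in auth.headers:
                forwarded[header] = auth.headers[header]
        return "forward", forwarded
    # Not permitted: the auth service's own response (e.g. a redirect) goes to the client.
    return "reply", auth


allowed = apply_forward_auth(
    AuthResponse(200, {"X-Forwarded-User": "sam@example.com"}),
    {"Host": "coolapp.sofaserver.samn.co.uk"},
)
denied = apply_forward_auth(
    AuthResponse(307, {"Location": "https://idp.example/authorize"}),
    {"Host": "coolapp.sofaserver.samn.co.uk"},
)
print(allowed[0], denied[0])
```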

Let's think about what happens in our previous example of coolapp.sofaserver.samn.co.uk, with Google as the IdP. When you browse to that site you're redirected to Google to log in, with a callback URL pointing to auth.sofaserver.samn.co.uk: the address of the forward auth service. After logging in successfully you're redirected to that callback URL with a code, and the auth service makes a request to Google behind the scenes for an access token and your email address. Finally, it sets a cookie to keep you logged in and tells Traefik that auth succeeded, so the request should be allowed and the X-Forwarded-User header passed to the backing service.
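The first redirect in that flow is just a URL with a handful of standard OAuth2 query parameters. The sketch below builds one; the Google endpoint is shown for illustration, and the client ID, the /_oauth callback path and the scope are assumptions rather than values taken from the role.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlsplit


def authorization_url(
    authorize_endpoint: str,
    client_id: str,
    redirect_uri: str,
    state: str,
    scope: str = "openid email",
) -> str:
    """Build the URL the browser is sent to at the start of an Authorization Code Grant."""
    query = urlencode(
        {
            "response_type": "code",   # ask for an authorization code
            "client_id": client_id,
            "redirect_uri": redirect_uri,
            "scope": scope,
            "state": state,            # echoed back by the IdP; guards against CSRF
        }
    )
    return f"{authorize_endpoint}?{query}"


state = secrets.token_urlsafe(16)
url = authorization_url(
    "https://accounts.google.com/o/oauth2/v2/auth",
    client_id="example-client-id",                          # hypothetical
    redirect_uri="https://auth.sofaserver.samn.co.uk/_oauth",  # assumed callback path
    state=state,
)
params = parse_qs(urlsplit(url).query)
print(params["redirect_uri"][0])
```

When the IdP redirects back to the callback, the auth service checks that the returned state matches the one it issued before exchanging the code for a token.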

Docker Compose Project

I mentioned at the beginning that the repository contained both an Ansible Role and a Docker Compose project. Ansible is ideal for automating complex infrastructure and maintaining a long-term deployment, but a lot of work for a one-off demo, which is why I built a Docker version. It starts up the same set of containers (Traefik, the auth middleware and the Docker socket proxy) as when deployed using Ansible, but also includes a container that generates all the config files. In principle, you can copy the docker-compose.yml file from the repository, set environment variables as needed (e.g. for access to your Identity Provider) and then run docker-compose run config-generator && docker-compose up reverse-proxy -d and have a working Traefik/LetsEncrypt/SSO setup.

The config-generator container runs a one-off ansible-playbook command to generate config files, using the same Ansible role, with a set of flags to disable starting the other containers.

Unit Testing & CI

The role has a set of automated tests and linting configured using Ansible Molecule to verify that it works as expected. Molecule provides a way to start a Docker container and orchestrate Ansible to connect to that container, apply the role, check that it's idempotent and run a verification process against the container afterward.

Verifying the reverse proxy role works is complicated, because the tests have to work in GitHub Actions without human interaction, but also exercise enough of the certificate issuance and auth processes to prove they work.

Testing certificates in a CI environment implies the DNS-01 challenge, because it's not practical (nor wise!) to expect your CI system to be accessible from the Internet. Instead I've delegated a subdomain to another provider (desec.io), and GitHub Actions has a token to access the deSEC API. When the test runs, it generates a random string and prepends it to the delegated domain to produce the test domain (e.g. abcd1234.ci-dns.samn.co.uk), which is saved to a file and used to issue certificates. Initially it just used the delegated domain directly, but that led to confusing test failures when tests ran in parallel!
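A minimal sketch of that per-run domain generation might look like the following. The actual tests do this as part of the Molecule scenario; the function name and the file name here are hypothetical.

```python
import secrets
import tempfile
from pathlib import Path


def make_test_domain(delegated_domain: str, path: Path) -> str:
    """Generate a unique test domain and save it so later test steps can reuse it."""
    prefix = secrets.token_hex(4)  # 8 random hex characters, e.g. "abcd1234"
    domain = f"{prefix}.{delegated_domain}"
    path.write_text(domain + "\n")
    return domain


with tempfile.TemporaryDirectory() as tmp:
    domain_file = Path(tmp) / "test_domain.txt"  # hypothetical file name
    domain = make_test_domain("ci-dns.samn.co.uk", domain_file)
    saved = domain_file.read_text().strip()

print(domain)
```

Because each run gets its own prefix, parallel jobs create and clean up TXT records under different names and no longer interfere with each other.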

For testing the ForwardAuth middleware I've used navikt/mock-oauth2-server. It behaves like an IdP that you're already logged in to, so it's just a case of checking that the various redirects to and from the IdP appear, and following them.

Further Reading