Podman + Pods + Anubis

Podman is a great alternative to docker. It appears to be getting more popular too! Some benefits include:

  • No daemon
  • A mature rootless container hosting approach (much more secure!)
  • Excellent systemd integration
  • Largely compatible with docker

If you’re on Linux, there’s little reason not to use podman. Podman’s alternative to docker-compose is quadlets: systemd unit files with a new [Container] or [Pod] section that tells systemd how to launch podman. They’re very pleasant to administer, especially if you want persistent rootless containers.

Anubis

Anubis is a recently popular proof-of-work challenge proxy that raises the cost of mass LLM crawling, which is now common for gathering LLM training data. This thread provides an excellent discussion of the benefits of using Anubis.

Anubis sits in between a reverse proxy and an application. This means every container serving an application will need a separate Anubis instance!

In this post, we’ll learn to use podman pods through quadlets by setting up an Anubis instance for redlib. This is a good read if you’re interested in deploying Anubis.

Prerequisites

You will need to install podman from your distro. A reasonably recent version of podman (at least v5) will likely work. I tested this on v5.5.2.

Create a new unprivileged user (kate here). This user should not have sudo access, as that would largely defeat the purpose of a rootless deployment.

Give this user about 100000 subuids/subgids. It’s recommended to provide at least 65536. This range cannot overlap with any other user’s range! Check the existing ranges with cat /etc/subuid and cat /etc/subgid. The lower bound of the range should be 100000 or more.

Example:

usermod --add-subuids 100000-200000 kate
usermod --add-subgids 100000-200000 kate
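
Before running the usermod commands above, it can help to confirm the chosen range is actually free. On many distros the first regular account is already assigned 100000:65536, in which case the example range would collide. The helper below is a minimal sketch (subid_overlaps is a hypothetical name, not part of podman or shadow-utils) that prints every existing entry intersecting an inclusive range:

```shell
# subid_overlaps FILE START END
# FILE uses the /etc/subuid format "name:start:count".
# Prints each entry whose range intersects START..END (inclusive)
# and returns 0 if at least one overlap was found, 1 otherwise.
subid_overlaps() {
    awk -F: -v s="$2" -v e="$3" '
        NF >= 3 {
            lo = $2 + 0
            hi = lo + $3 - 1
            if (lo <= e + 0 && hi >= s + 0) { print; found = 1 }
        }
        END { exit(found ? 0 : 1) }
    ' "$1"
}

# Example (run against the real files):
#   subid_overlaps /etc/subuid 100000 200000 && echo "range taken, pick another"
#   subid_overlaps /etc/subgid 100000 200000 && echo "range taken, pick another"
```

If anything is printed, shift both usermod ranges upward (for instance past the highest existing start plus count) and check again.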

Everything we do from now on will be using user kate. You can always switch into kate from your privileged account using sudo su -l kate.
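
One assumption worth checking for persistent services: systemd normally stops a user's services once their last session ends, unless lingering is enabled for that user. systemd-logind records lingering users as files under /var/lib/systemd/linger, so a small check (linger_enabled is a hypothetical helper name) might look like this:

```shell
# linger_enabled USER [DIR]
# Returns 0 if a linger marker exists for USER.
# DIR defaults to logind's state directory and is overridable for testing.
linger_enabled() {
    [ -e "${2:-/var/lib/systemd/linger}/$1" ]
}

# Example:
#   linger_enabled kate || echo "run from the privileged account: sudo loginctl enable-linger kate"
```

If systemctl --user later complains that it cannot connect to the user bus inside a sudo su session, logging in as kate directly or using machinectl shell kate@ is a common workaround.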

Setting up the pod

Here, we host redlib as our app. We’ll need 3 files to put Anubis as a guard:

  • redlib.pod
  • redlib.container
  • redlib-anubis.container

First, our redlib.pod looks as follows. We define the network to be in bridge mode, which is a good choice for rootless containers. We publish port 8999 to the host, which will be the entry point to Anubis (which itself proxies redlib). The AddHost line improves clarity when communicating between containers.

[Unit]
Description=Pod for redlib
Wants=network-online.target
After=network-online.target
 
[Pod]
PodName=redlib.pod
Network=bridge
PublishPort=8999:8999
AddHost=redlib:127.0.0.1
 
[Install]
WantedBy=default.target

Our main redlib container has several important lines. The most important one is Pod under [Container]. Best practice is to define BindsTo and After for the pod. Additionally, Requires and After on redlib-anubis.service make Anubis start before this container.

[Unit]
Description=Redlib
BindsTo=redlib-pod.service
After=redlib-pod.service
Requires=redlib-anubis.service
After=redlib-anubis.service
 
[Container]
Pod=redlib.pod
 
AutoUpdate=registry
ContainerName=redlib
DropCapability=ALL
EnvironmentFile=/home/kate/Documents/servers/redlib/.env
Image=quay.io/redlib/redlib:latest
NoNewPrivileges=true
ReadOnly=true
 
[Install]
WantedBy=default.target

Anubis

Since we opened port 8999 on the pod, that will have to be where Anubis is listening for traffic. We set BIND=:8999 to reflect that.

Redlib serves itself on port 8080 by default. Set the target to this port and the host name redlib we defined with AddHost in redlib.pod. Containers in a pod share one network namespace, so that name (mapped to 127.0.0.1) reaches redlib directly. Naming the host plays a role similar to host.docker.internal in docker-compose, but is much clearer.

[Unit]
Description=Redlib Anubis instance
BindsTo=redlib-pod.service
After=redlib-pod.service
Before=redlib.service
 
[Container]
Pod=redlib.pod
ContainerName=redlib-anubis
Image=ghcr.io/techarohq/anubis:latest
 
Environment=BIND=:8999
Environment=DIFFICULTY=5
Environment=TARGET=http://redlib:8080
Environment=REDIRECT_DOMAINS=redlib.mami2.moe
Environment=COOKIE_DOMAIN=redlib.mami2.moe
 
Environment=ED25519_PRIVATE_KEY_HEX_FILE=/ed25519_anubis.key
Environment=POLICY_FNAME=/policy.json
 
Volume=/home/kate/Documents/servers/redlib/data/ed25519_anubis.key:/ed25519_anubis.key
Volume=/home/kate/Documents/servers/redlib/data/policy.json:/policy.json
 
[Install]
WantedBy=default.target
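
The unit above mounts ed25519_anubis.key but the file has to exist first. As far as I know, Anubis expects a 32-byte key written as 64 hex characters, which is what openssl rand -hex 32 produces. The sketch below does the same with only coreutils; gen_anubis_key is a hypothetical helper name:

```shell
# gen_anubis_key FILE
# Writes 32 random bytes to FILE as 64 lowercase hex characters
# (no trailing newline).
gen_anubis_key() {
    head -c 32 /dev/urandom | od -An -v -tx1 | tr -d ' \n' > "$1"
}

# Example, using the path mounted in redlib-anubis.container:
#   gen_anubis_key /home/kate/Documents/servers/redlib/data/ed25519_anubis.key
```

Keep the file readable by whichever user Anubis runs as inside the container, and generate it once so that issued challenge cookies stay valid across restarts.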

The Anubis-specific configuration is largely stored in the policy.json file. The envvar DIFFICULTY is only used for connections that fall through policy.json. A good policy should define rules for everything, so nothing should fall through. I advise bind-mounting persistent configuration files like this to avoid the mess of named volumes.

A very conservative policy can look like:

{
  "bots": [
    {
      "name": "everything",
      "user_agent_regex": ".*",
      "action": "CHALLENGE",
      "challenge": {
        "difficulty": 5,
        "report_as": 5,
        "algorithm": "fast"
      }
    }
  ]
}

Please look at the docs for configuring a better policy. We typically want to serve things like robots.txt to crawlers. Some policies will be very specific to the app behind Anubis.
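
As one small step in that direction, the sketch below writes a policy that lets requests for robots.txt through and challenges everything else. It assumes rules are evaluated top to bottom with the first match winning, and that path_regex and the ALLOW action behave as in the Anubis default policy; verify both against the docs for your version. The output path is a placeholder.

```shell
# Placeholder output location; the article mounts policy.json from
# /home/kate/Documents/servers/redlib/data.
policy_file="${POLICY_FILE:-./policy.json}"

# The quoted 'EOF' keeps backslashes literal, so the JSON string
# "^/robots\\.txt$" decodes to the regex ^/robots\.txt$
cat > "$policy_file" <<'EOF'
{
  "bots": [
    {
      "name": "robots-txt",
      "path_regex": "^/robots\\.txt$",
      "action": "ALLOW"
    },
    {
      "name": "everything",
      "user_agent_regex": ".*",
      "action": "CHALLENGE",
      "challenge": {
        "difficulty": 5,
        "report_as": 5,
        "algorithm": "fast"
      }
    }
  ]
}
EOF
```

The catch-all rule stays last so that it only applies to requests no earlier rule matched.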

Caddy

Our pod serves redlib protected by Anubis on port 8999, which is not the standard port for https. Even if we wanted to, a rootless container cannot bind to a privileged (<1024) port like 443 by default.
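
The cutoff for privileged ports is a kernel setting, net.ipv4.ip_unprivileged_port_start, which defaults to 1024. A quick way to see whether a given port is available to unprivileged processes on your host (rootless_can_bind is a hypothetical helper name):

```shell
# rootless_can_bind PORT [SYSCTL_FILE]
# Returns 0 if PORT is at or above the first unprivileged port.
# SYSCTL_FILE defaults to the kernel's setting and is overridable for testing.
rootless_can_bind() {
    threshold=$(cat "${2:-/proc/sys/net/ipv4/ip_unprivileged_port_start}")
    [ "$1" -ge "$threshold" ]
}

# Example:
#   rootless_can_bind 8999 && echo "8999 is fine for rootless podman"
#   rootless_can_bind 443  || echo "443 needs a proxy on the host"
```

Lowering the sysctl is possible, but it applies to every unprivileged process on the machine, which is one reason to prefer a reverse proxy on the host.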

The best approach is to have a reverse proxy on the host. I recommend Caddy, as it does it all (Let’s Encrypt certificates, logging, reverse proxy) with very little required configuration. Here is what we’ll throw in /etc/caddy/Caddyfile:

redlib.mami2.moe {
    header X-Robots-Tag "noindex, nofollow"  # Prevents search engine indexing
    #tls kate@example.com  # Optional
 
    reverse_proxy localhost:8999 {
        header_up X-Real-Ip {remote_host}  # Anubis needs this to ratelimit
        header_up X-Http-Version {http.request.proto}
    }
}

Starting the pod

Recall we have 3 files: redlib.pod, redlib.container, redlib-anubis.container. We need systemd to see these to manage them as services. Quadlet, a systemd generator that ships with podman, converts these files to systemd unit files, which essentially just run the podman commands we specified. Going through systemd gives us the advantage of logging and managed startup.

I recommend symlinking these 3 files (ln -s) into ~/.config/containers/systemd. From there:

systemctl --user daemon-reload
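
The symlink step that precedes the reload can be scripted. This is a minimal sketch; link_quadlets is a hypothetical helper, and the source directory in the example is an assumption based on where the .env file lives in this post:

```shell
# link_quadlets SRC_DIR [DEST_DIR]
# Symlinks every .pod and .container file in SRC_DIR into DEST_DIR
# (default: the rootless quadlet directory), using absolute targets.
link_quadlets() {
    src=$(cd "$1" && pwd) || return 1
    dest="${2:-$HOME/.config/containers/systemd}"
    mkdir -p "$dest"
    for f in "$src"/*.pod "$src"/*.container; do
        [ -e "$f" ] || continue    # skip patterns that matched nothing
        ln -sf "$f" "$dest/"
    done
}

# Example:
#   link_quadlets /home/kate/Documents/servers/redlib
#   systemctl --user daemon-reload
```

Keeping the real files next to the rest of the app's data and linking them in makes it easy to version or back up the whole deployment as one directory.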

You now have redlib-pod.service, redlib.service, and redlib-anubis.service. You will start/stop these through redlib-pod.service, as the other two are controlled by the pod.

systemctl --user start redlib-pod.service

Log files are still individual for all 3 services. For instance, to check the Anubis logs we’d use journalctl --user -fu redlib-anubis.service.

Debugging

Quadlet, which is integrated with podman, converts .pod and .container files to systemd unit files (.service). You can see the conversion (and any errors during the conversion) with /usr/lib/podman/quadlet -dryrun -user. On some distros the binary lives at /usr/libexec/podman/quadlet instead.
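
Since the generator's location varies between distros, a small lookup can save some guessing. The candidate paths below are the two I'm aware of; find_quadlet is a hypothetical helper name:

```shell
# find_quadlet [CANDIDATE...]
# Prints the first candidate path that is an executable file.
# With no arguments, tries the two common install locations.
find_quadlet() {
    [ "$#" -gt 0 ] || set -- /usr/lib/podman/quadlet /usr/libexec/podman/quadlet
    for candidate in "$@"; do
        if [ -x "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Example:
#   "$(find_quadlet)" -dryrun -user
```

The dry run prints the generated units to the terminal, so a misspelled key or a missing Pod reference shows up before systemd ever sees it.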

References