So Podman is an open source container engine like Docker, with "full"[1] Docker compatibility. IMO Podman's main benefit over Docker is security. But how is it more secure? Keep reading…

Docker traditionally runs a daemon as the root user, and you need to mount that daemon's socket into various containers for them to work as intended (see: Traefik, Portainer, etc.). But if someone compromises such a container and therefore gains access to the Docker socket, it's game over for your host. That Docker socket is the keys to the root kingdom, so to speak.

Podman doesn't have a daemon by default, although you can run a very minimal one for Docker compatibility. And perhaps more importantly, Podman can run entirely as a non-root user.[2] Non-root means if someone compromises a container and somehow manages to break out of it, they don't get the keys to the kingdom. They only get access to your non-privileged Unix user. So like the keys to a little room that only contains the thing they already compromised.[2.5] Pretty neat.

Okay, now for the annoying parts of Podman. In order to achieve this rootless, daemonless nirvana, you have to give up the convenience of Unix users in your containers being the same as the users on the host. (Or at least the same UIDs.) That's because Podman typically[3] runs as a non-root user, and most containers expect to either run as root or some other specific user.

The "solution"[4] is user re-mapping. Meaning that you can configure your non-root user that Podman is running as to map into the container as the root user! Or as UID 1234. Or really any mapping you can imagine. If that makes your head spin, wait until you actually try to configure it. It's actually not so bad on containers that expect to run as root. You just map your non-root user to the container UID 0 (root)… and Bob's your uncle. But it can get more complicated and annoying when you have to do more involved UID and GID mappings, and then play the resultant permissions whack-a-mole on the host because your volumes are no longer accessed from a container running as host-root…
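To make the mapping less abstract, here is a minimal sketch of the arithmetic behind it. It assumes Podman's default rootless mapping, where container UID 0 is your own UID and container UIDs from 1 upward are taken in order from your /etc/subuid range. The helper name host_uid_for and the example numbers (UID 1000, range starting at 100000) are made up for illustration.

```shell
#!/bin/sh
# Sketch: which host UID does a container UID land on in rootless Podman?
# Assumes the default mapping: container UID 0 is your own UID, and container
# UIDs 1..count are taken in order from your /etc/subuid range.
# host_uid_for is a hypothetical helper, not a Podman command.
host_uid_for() {
    user_uid=$1      # your unprivileged UID on the host, e.g. 1000
    subuid_start=$2  # first ID of your /etc/subuid range, e.g. 100000
    container_uid=$3 # the UID the process has inside the container
    if [ "$container_uid" -eq 0 ]; then
        echo "$user_uid"
    else
        echo $((subuid_start + container_uid - 1))
    fi
}

host_uid_for 1000 100000 0     # prints 1000
host_uid_for 1000 100000 1234  # prints 101233
```

This is also where the permissions whack-a-mole comes from: files written by container UID 1234 show up on the host owned by 101233. Running podman unshare cat /proc/self/uid_map shows the mapping actually in effect, and podman unshare chown lets you fix ownership of a volume directory from inside that mapping.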

Still, it's a pretty cool feeling the first time you run a "root" container in your completely unprivileged Unix user and everything just works. (After spending hours of swearing and Duck-Ducking to get it to that point.) At least, it was pretty cool for me. If it's not when you do it, then Podman may not be for you.

The other big annoying thing about Podman is that because there's no Big Bad Daemon managing everything, there are certain things you give up. Like containers actually starting on boot. You'd think that'd be a fundamental feature of a container engine in 2023, but you'd be wrong. Podman doesn't do that. Podman adheres to the "Unix philosophy." Meaning, briefly, if Podman doesn't feel like doing something, then it doesn't. And therefore expects you to use systemd for starting your containers on boot. Which is all good and well in theory, until you realize that means Podman wants you to manage your containers entirely with systemd. So… running each container with a systemd service, using those services to stop/start/manage your containers, etc.

Which, if you ask me, is totally bananasland. I don't know about you, but I don't want to individually manage my containers with systemd. I want to use my good old trusty Docker Compose. The good news is you can use good old trusty Docker Compose with Podman! Just run a compatibility daemon (tiny and minimal and rootless… don't you worry) to present a Docker-like socket to Compose and boom, everything works. Except your containers still don't actually start on boot. You still need systemd for that. But if you make systemd run Docker Compose, problem solved!
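Roughly, the Compose-on-Podman setup looks like the sketch below. Everything specific in it is an assumption rather than something from this post: the stack living in ~/stack, the unit name compose-stack.service, the Compose plugin being invoked as /usr/bin/docker compose, and the helper function names. Adjust to taste.

```shell
#!/bin/sh
# Sketch: Docker Compose talking to rootless Podman, started at boot by systemd.
# Assumptions (not from the post): the stack lives in ~/stack, the unit is
# named compose-stack.service, and docker is installed at /usr/bin/docker.

# Print the Docker-compatible socket URL that rootless Podman exposes.
podman_docker_host() {
    echo "unix://${XDG_RUNTIME_DIR:-/run/user/$(id -u)}/podman/podman.sock"
}

# Write a user-level systemd unit that runs Compose against that socket.
# %h expands to the home directory and %t to the runtime directory.
write_compose_unit() {
    unit_dir=$1
    mkdir -p "$unit_dir"
    cat > "$unit_dir/compose-stack.service" <<'EOF'
[Unit]
Description=Compose stack on rootless Podman (hypothetical example)
Requires=podman.socket
After=podman.socket

[Service]
Type=oneshot
RemainAfterExit=yes
WorkingDirectory=%h/stack
Environment=DOCKER_HOST=unix://%t/podman/podman.sock
ExecStart=/usr/bin/docker compose up -d
ExecStop=/usr/bin/docker compose down

[Install]
WantedBy=default.target
EOF
}

# One-time setup; nothing runs until you call this function yourself.
setup_compose_on_boot() {
    systemctl --user enable --now podman.socket
    write_compose_unit "$HOME/.config/systemd/user"
    systemctl --user daemon-reload
    systemctl --user enable --now compose-stack.service
    # Lingering lets user services start at boot without a login session.
    loginctl enable-linger "$USER"
}
```

For interactive use, export DOCKER_HOST="$(podman_docker_host)" in your shell and run Compose as usual.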

This isn't the "Podman Way" though, and any real Podman user will be happy to tell you that. The Podman Way is either the aforementioned systemd-running-the-show approach or something called Quadlet or even a Kubernetes compatibility feature. Briefly, about those: Quadlet is "just" a tighter integration between systemd and Podman so that you can declaratively define Podman containers and volumes directly in a sort of systemd service file. (Well, multiple.) It's like Podman and Docker Compose and systemd and Windows 3.1 INI files all had a bastard love child, and it's about as pretty as it sounds. IMO, you'd do well to stick with Docker Compose.
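For the curious, a Quadlet file looks something like the sketch below. The container name "web", the nginx image, the published port and the helper function names are all illustrative assumptions.

```shell
#!/bin/sh
# Sketch: a minimal Quadlet unit for a rootless container.
# The name "web", the nginx image and the port are illustrative assumptions.
write_quadlet() {
    quadlet_dir=$1
    mkdir -p "$quadlet_dir"
    cat > "$quadlet_dir/web.container" <<'EOF'
[Unit]
Description=Example web container (hypothetical)

[Container]
Image=docker.io/library/nginx:latest
PublishPort=8080:80

[Service]
Restart=always

[Install]
WantedBy=default.target
EOF
}

# Rootless Quadlet files are read from ~/.config/containers/systemd/.
# Nothing runs until you call this function yourself.
install_quadlet() {
    write_quadlet "$HOME/.config/containers/systemd"
    systemctl --user daemon-reload   # Quadlet generates web.service here
    systemctl --user start web.service
}
```

The [Container] section is the Podman-specific part; the rest is ordinary systemd unit syntax, which is where the INI-file flavor comes from.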

The Kubernetes compatibility feature lets you write Kubernetes-style configuration files and run them with Podman to start/manage your containers. It doesn't actually use a Kubernetes cluster; it lets you pretend you're running a big boy cluster because your command has the word "kube" in it, but in actuality you're just running your lowly Podman containers instead. It also has the feel of being a dev toy intended for local development rather than actual production use.[5] For instance, there's no way to apply a change in-place without totally stopping and starting a container with two separate commands. What is this, 2003?
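As a rough sketch of what that workflow looks like: the pod name, image, ports and helper function names below are assumptions for illustration, and older Podman releases spell the commands podman play kube rather than podman kube play.

```shell
#!/bin/sh
# Sketch: a one-container pod described in Kubernetes YAML.
# The pod name, image and ports are illustrative assumptions.
write_pod_yaml() {
    cat > "$1" <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: web
spec:
  containers:
    - name: nginx
      image: docker.io/library/nginx:latest
      ports:
        - containerPort: 80
          hostPort: 8080
EOF
}

# The stop-then-start cycle described above, as two separate commands.
# Nothing runs until you call this function yourself.
redeploy_pod() {
    podman kube down "$1"
    podman kube play "$1"
}
```

Usage would be write_pod_yaml pod.yaml once, then redeploy_pod pod.yaml after each edit to the file.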

Lastly, there's Podman Compose. It's a third-party project (not produced by the Podman devs) that's intended to support Docker Compose configuration files while working more "natively" with Podman. My brief experience using it (with all due respect to the devs) is that it's total amateur hour and/or just not ready for prime time. Again, stick with Docker Compose, which works great with Podman.

Anyway, that's all I've got! Use Podman if you want. Don't use it if you don't want. I'm not the boss of you. But you said you wanted content on Lemmy, and now you've got content on Lemmy. This is all your fault!

[1] Where "full" is defined as: Not actually full.

[2] Newer versions of Docker also have some rootless capabilities. But they've still got that stinky ol' daemon.

[2.5] It's maybe not quite this simple in practice, because you'll probably want to run multiple containers under the same Unix account unless you're really OCD about security and/or have a hatred of the convenience of container networking.

[3] You can run Podman as root and have many of the same properties as root Docker, but then what's the point? One less daemon, I guess?

[4] Where "solution" is defined as: Something that solves the problem while creating five new ones.

[5] Spoiler: Red Hat's whole positioning with Podman is like they see it as a way for buttoned-up corporate devs to run containers locally for development while their "production" is running K8s or whatever. Personally, I don't care how they position it as long as Podman works well to run my self-hosting shit…

  • Geronimo Wenja@agora.nop.chat
    2 years ago

    Yeah sure.

    I'm going to assume you're starting from the point of having a second Linux user also set up to use rootless podman. That's just following the same steps for setting up rootless podman as any other user, so there shouldn't be too many problems there.

    If you have wireguard set up and running already - i.e. with Mullvad VPN or your own VPN to a VPS - you should be able to run ip link to see a wireguard network interface. Mine is called wg. I don't use wg-quick, which means I don't have all my traffic routing through it by default. Instead, I use a systemd unit to bring up the WG interface and set up routing.

    I'll also assume the UID you want to forward is 1001, because that's what I'm using. I'll also use enp3s0 as the default network link, because that's what mine is, but if yours is eth0, you should use that. Finally, I'll assume that 192.168.0.0/24 is your standard network subnet, with the router at 192.168.0.1 - it's useful to avoid routing local traffic through wireguard.

    #YOUR_STATIC_EXTERNAL_IP# should be whatever you get by calling curl ifconfig.me if you have a static IP - again, useful to avoid routing local traffic through wireguard. If you don't have a static IP you can drop this line.

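    [Unit]
    Description=Create wireguard interface
    # After= only orders this unit; Wants= is what pulls network-online.target in.
    Wants=network-online.target
    After=network-online.target
    
    [Service]
    # oneshot: the setup commands run to completion, then the unit stays active.
    Type=oneshot
    RemainAfterExit=yes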
    ExecStart=/usr/bin/bash -c " \
            /usr/sbin/ip link add dev wg type wireguard || true; \
            /usr/bin/wg setconf wg /etc/wireguard/wg.conf || true; \
            /usr/bin/resolvectl dns wg #PREFERRED_DNS#; \
            /usr/sbin/ip -4 address add #WG_IPV4_ADDRESS#/32 dev wg || true; \
            /usr/sbin/ip -6 address add #WG_IPV6_ADDRESS#/128 dev wg || true; \
            /usr/sbin/ip link set mtu 1420 up dev wg || true; \
            /usr/sbin/ip rule add uidrange 1001-1001 table 200 || true; \
            /usr/sbin/ip route add #VPN_ENDPOINT# via #ROUTER_IP# dev enp3s0 table 200 || true; \
            /usr/sbin/ip route add 192.168.0.0/24 via 192.168.0.1 dev enp3s0 table 200 || true; \
            /usr/sbin/ip route add #YOUR_STATIC_EXTERNAL_IP#/32 via #ROUTER_IP# dev enp3s0 table 200 || true; \
            /usr/sbin/ip route add default via #WG_IPV4_ADDRESS# dev wg table 200 || true; \
    "
    
    ExecStop=/usr/bin/bash -c " \
            /usr/sbin/ip rule del uidrange 1001-1001 table 200 || true; \
            /usr/sbin/ip route flush table 200 || true; \
            /usr/bin/wg set wg peer '#PEER_PUBLIC_KEY#' remove || true; \
            /usr/sbin/ip link del dev wg || true; \
    "
    
    [Install]
    WantedBy=multi-user.target
    

    There's a bit to go through here, so I'll take you through why it works. Most of it is just setting up WG to receive/send traffic. The bits that are relevant are:

            /usr/sbin/ip rule add uidrange 1001-1001 table 200 || true; \
            /usr/sbin/ip route add #VPN_ENDPOINT# via #ROUTER_IP# dev enp3s0 table 200 || true; \
            /usr/sbin/ip route add 192.168.0.0/24 via 192.168.0.1 dev enp3s0 table 200 || true; \
            /usr/sbin/ip route add #YOUR_STATIC_EXTERNAL_IP#/32 via #ROUTER_IP# dev enp3s0 table 200 || true; \
            /usr/sbin/ip route add default via #WG_IPV4_ADDRESS# dev wg table 200 || true; \
    

    ip rule add uidrange 1001-1001 table 200 adds a new rule where requests from UID 1001 go through table 200. A table is a subset of ip routing rules that are only relevant to certain traffic.

    ip route add #VPN_ENDPOINT# ... ensures that traffic addressed to the VPN endpoint itself - i.e. the encrypted wireguard packets - goes out through your router on the physical interface instead of being sent back into the tunnel. This is relevant for handshakes.

    ip route add 192.168.0.0/24 via 192.168.0.1 ... is just excluding local traffic, as is ip route add #YOUR_STATIC_EXTERNAL_IP# ...

    Finally, we add ip route add default via #WG_IPV4_ADDRESS# ... which routes all traffic that didn't match any of the above rules (local traffic, wireguard) to go to the wireguard interface. From there, WG handles all the rest, and passes returning traffic back.

    There's going to be some individual tweaking here, but the long and short of it is, UID 1001 will have all their external traffic routed through WG. Any internal traffic between docker containers in a docker-compose should already be handled by podman pods and never reach the routing rules. Any traffic aimed at other services in the network - i.e. sonarr calling sabnzbd or transmission - will happen with a relevant local IP of the machine it's hosted on, and so will also be skipped. Localhost is already handled by existing ip route rules, so you shouldn't have to worry about that either.
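    If your UID, interface or subnet differ, a small sketch like this can print the three core routing commands for review before they go into the unit. The helper names print_uid_routes and show_uid_routes are hypothetical, and 10.0.0.2 is only a stand-in for #WG_IPV4_ADDRESS#; nothing here changes any routes.

```shell
#!/bin/sh
# Sketch: print the policy-routing commands from the unit for any UID.
# print_uid_routes is a hypothetical helper; it only echoes commands so
# they can be reviewed before being run as root.
print_uid_routes() {
    uid=$1; table=$2; lan_cidr=$3; router=$4; lan_if=$5; wg_addr=$6; wg_if=$7
    echo "ip rule add uidrange $uid-$uid table $table"
    echo "ip route add $lan_cidr via $router dev $lan_if table $table"
    echo "ip route add default via $wg_addr dev $wg_if table $table"
}

# 10.0.0.2 is a placeholder for the wireguard address, not a real value.
print_uid_routes 1001 200 192.168.0.0/24 192.168.0.1 enp3s0 10.0.0.2 wg

# Inspect what is actually installed (read-only; call it yourself on the host).
show_uid_routes() {
    ip rule show
    ip route show table "$1"
}
```

    After the unit has started, show_uid_routes 200 should list the uidrange rule and the contents of table 200.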

    Hopefully that helps - sorry if it's a bit confusing. I learned to set up my own IP routing to avoid wg-quick so that I could have greater control over the traffic flow, so this is quite a lot of my learning that I'm attempting to distill into one place.