My Hosting Journey

For a long while, I have been struggling to find a quick, no-nonsense, low-ops way of self-hosting a range of websites, from full applications to simple statically generated sites (like this one). It's easy enough to put a website on the internet if you can build the website yourself. GitHub Pages, WordPress.com, Neocities, Netlify, Cloudflare Pages and Workers, Fly.io, etc., all give reasonable means of hosting static webpages and, in some cases, dynamic web apps.
All that is great, but I have always liked the self-sufficiency of using general-purpose compute and open-source software to achieve the goal of hosting sites on the internet. Let me tell you about my self-hosting journey and what I have found has worked best for me.

Explorations with VPSs: 2011-2015

A decade ago or so, I had a VPS that I rented with a friend. We ran nginx on it, FTP'd static sites to it, and deployed web apps, both custom and open-source. Honestly, it's pretty nice to start out this way. The interaction points between systems are very well known and reliable. There are only so many ways you can run a Node.js app as a background service on a Linux system and expose it to the network. I was just starting my Linux admin journey, so the whole system was a learning experience.

I didn't know much about systemd at the time, and systemd wasn't even the de facto standard quite yet anyway, so things were a bit shaky when it came to keeping things running. The biggest problems, though, were the real and potential ways that services sharing the same userland could interact with each other. That made upgrading, re-organizing, and experimenting on the single "production" machine very risky, and experimenting anywhere else expensive in both money and time.

I did start managing this and other servers with SaltStack and, later, Ansible. This helped a lot with host setup and running maintenance across multiple hosts. However, state drift and service management were still not solved satisfactorily. Adding new apps, or maintaining running ones, was still a lot of work. These tools mostly just allowed me to scale the size of the operation past a couple of hosts, but didn't really solve the underlying problems.

There was a tool getting popular at that time that would make developing new deployments easier than ever. Once I became aware of it, and learned when and how to use it, I started using it for nearly everything.

All-in on Docker: 2015-2018

I wasn't an early adopter of Docker by any means, but I did get pretty familiar with it once I started running into the issues it was made to address: the multi-app interaction problems, upgrades, state snapshots, and deployment experimentation and isolation. It also largely eliminated the constant care and feeding of servers needed in the more traditional admin scenario.

My deployment plan at this time was to run a VM that had Docker and some development toolchains, then build each project from source and deploy it with docker-compose on the VM. The project's repo would hold its own docker-compose file, so the question of how to deploy a project was kept very close to the project itself.
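For a typical project, the compose file in the repo looked something along these lines. This is just a sketch; the service names, ports, and credentials are made up for illustration:

    # docker-compose.yml kept in the project repo (names and ports are illustrative)
    version: "3"
    services:
      app:
        build: .                  # build from the project's own Dockerfile
        restart: unless-stopped
        ports:
          - "8080:8080"
        environment:
          - DATABASE_URL=postgres://app:app@db:5432/app
        depends_on:
          - db
      db:
        image: postgres:13
        restart: unless-stopped
        volumes:
          - db-data:/var/lib/postgresql/data
    volumes:
      db-data: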

This was worlds better than the previous iteration, but there were still problems. The first question was usually how to get my custom containers onto the production machines. I didn't want to run a private container registry, nor use public registries. Instead, I tended to manage the development toolchains on the VM and keep them in sync with other development machines. The best situation was when the build could be done entirely in a multi-stage Dockerfile if there were any compile steps (see the sketch below). I would still use Ansible or SaltStack to manage the production VM state.

Migrating hosts involved a bunch of work to get a working development environment again, because the tools, configs, and repos scattered around the OS would need to be moved over. The docker-compose file would at least document the important parts of each project, but you can't tell from a docker ps call which containers were started from which compose file. OS startup, with Docker containers expected to start at boot, can be a bit iffy. And after a while, I started to feel uneasy about using a daemon with root access just to keep these containers functioning.
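The multi-stage ideal I was aiming for looked roughly like this, sketched here for a hypothetical Go app; the image tags and paths are just illustrative:

    # Sketch of a multi-stage build for a hypothetical Go app; names and paths are illustrative.
    # Build stage: compile the binary with the full toolchain.
    FROM golang:1.21 AS build
    WORKDIR /src
    COPY . .
    RUN CGO_ENABLED=0 go build -o /out/app ./cmd/app

    # Runtime stage: copy only the compiled binary into a minimal image.
    FROM alpine:3.19
    COPY --from=build /out/app /usr/local/bin/app
    EXPOSE 8080
    ENTRYPOINT ["/usr/local/bin/app"]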

The thing that really pushed me toward the next iteration was not having a good answer for apps that needed lower-level access to the filesystem, privileged ports, or that just had a lot of configuration scattered around. Running Gitea in a container, connecting it to separate backend services, and keeping that consistent across host migrations and other maintenance tasks was a big pain. That project itself is a single binary; why is all this container stuff necessary? Why do I need to keep up with not only the host config but also the bridge to guests and the guests themselves? Can this be simplified to just one system to worry about?

Overall, Docker moved the bar, but it was never quite satisfactory. Every machine still needed to be set up manually in a lot of ways, and the system's configuration was still defined by its final state rather than by a declaration of that state, as Dockerfiles and docker-compose files promised.

Back to /home, but this time with systemd: 2018-2022

At some point I figured that while the Docker stuff was good, the problems it solves best are not actually my problems, and the problems it solves by circumstance are solvable without Docker if you just know what you are doing. So I resolved to move my apps and sites back to VMs or bare-metal servers and do sysadmin (more) correctly. I learned how to work with systemd to create unit files and tap into a lot of its management functionality. I also deployed many of my custom apps to one or more servers running Alpine Linux, with its OpenRC init system, on a bookshelf in my house.

This time around I did much better at managing app deployments and server maintenance. Most of my sites and apps contained all of their configuration, and you could almost just drop a tarball with the binary, unit file, and init script on the server and have it running without further setup. Static sites and proxy servers would all use the same Caddy server with separate server configs. Custom apps would get their own user and keep everything within their respective home directories as much as possible, so that there would be little possibility of misplacing config or state.
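A unit file for one of these apps looked roughly like the sketch below. The app name, user, and paths are made up for the example; the point is that the service runs as its own user and lives entirely in its home directory:

    # /etc/systemd/system/myapp.service -- illustrative; names and paths are made up
    [Unit]
    Description=My self-hosted app
    After=network-online.target
    Wants=network-online.target

    [Service]
    Type=simple
    User=myapp
    Group=myapp
    WorkingDirectory=/home/myapp
    ExecStart=/home/myapp/bin/myapp --config /home/myapp/config.toml
    Restart=on-failure

    [Install]
    WantedBy=multi-user.target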

This was all really fun and generally worked as long as proper diligence was applied. Apps deployed to the bare-metal Alpine server were as reliable as my home network connection would allow. However, that meant the weakest link had now become my home network and power. I started worrying about keeping my home machines running, with storage, power, network, cooling, and all the rest suddenly being a problem if anything went down. I had gotten used to running things on hosted VMs and had taken all those physical constraints for granted! I felt like I was suddenly on-call whenever I messed around with the local DNS resolver or updated the router's firmware.

I'm way more confident hosting services on my "edge data-center" now, but I've changed my strategy: services that I share publicly, or even just with friends and family, go to hosted or cloud deployments so that I'm not at risk of bringing a friend's service down when a storm rolls in. Services that are only accessible on the local network are still suitable for this simple deployment strategy, and I still run some services this way for stuff that only I, or co-habitants, use.

K8s: 2023-now

While I had leveled up my sysadmin skills by working with VMs and home servers more carefully, I still needed to address the real constraints of running the infrastructure myself. There are good reasons to keep your production environment separate from development, and I was finding that out. Going for a full homelab setup, complete with redundant power, network, and servers to minimize the impact of changes, was not really an option for me due to lack of space. So it was back to the cloud, but to make it a bit more interesting this time, why not make it work on Kubernetes?

With Kubernetes, everything is declarative. This would potentially fill the gap that docker-compose often left. I had used minikube and k3s in the past to experiment with single-host deployments, but it hadn't quite stuck yet. At this point, however, I was confident that I wanted to go in this direction, so I committed to migrating one service at a time.

I first evaluated my options and decided to get a hosted distribution of K8s from DigitalOcean. I was pretty familiar with DO, as I had been renting VMs from them for a while at this point. Their initial setup seemed simple. It has relatively limited capabilities compared to the big three public cloud vendors, but the modest feature set was actually all I would really need, and the setup simplicity was a blessing. While it would be more expensive than plain VMs, I figured I had enough workloads to use the minimum node allocation, and the otherwise free control plane would make it worth it.

So I set up a Git repository that just held the state of my Kubernetes deployment and followed DO's guides for initial setup. After getting the Traefik ingress, cert-manager, and a couple of simple workloads running, I was pretty impressed by how well it held together. I did have to get pretty comfortable with debugging via kubectl logs and kubectl exec to figure out where things were going wrong while authoring the service deployments, and getting comfortable enough with k8s to know what I needed and how to solve common service needs took a couple of weeks of free time. In the end, being able to hold the complete state in my deployment repo makes maintaining my little menagerie of sites and apps surprisingly simple. There is no need to maintain SSH access to anything, all workloads are isolated, persistent volumes are handled entirely by DO's Volume appliance, and network configuration is stated explicitly.
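For a sense of what most of those service deployments end up needing, here is a minimal sketch of a typical Ingress. The hostname, the Service name, and the ClusterIssuer named letsencrypt are assumptions for the example, not my actual config:

    # Sketch of a typical Ingress; host, issuer, and service names are illustrative.
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: example-app
      annotations:
        cert-manager.io/cluster-issuer: letsencrypt   # assumed ClusterIssuer name
    spec:
      ingressClassName: traefik
      rules:
        - host: app.example.com
          http:
            paths:
              - path: /
                pathType: Prefix
                backend:
                  service:
                    name: example-app
                    port:
                      number: 80
      tls:
        - hosts:
            - app.example.com
          secretName: example-app-tls   # cert-manager stores the issued cert here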

Before migrating to K8s, I was dreading having to keep maintaining some of my long-running group applications, like Gitea. After getting it set up in K8s, however, the infrastructure maintenance problems disappeared so completely that I was excited to keep that service running again and to add on to my hosted capabilities with it.

A small thing that took me a while to figure out, but felt like a revelation, was setting up static web pages in k8s. I had gotten used to dropping stuff in www-data, either as a host volume or in an install script, but I wasn't sure how to pull that into a k8s pod without packaging custom images, attaching external volumes, etc. Those options all had trade-offs I didn't like: managing a container registry, or paying for extra volume storage for something that is read-only. I eventually discovered initContainers, and that opened my mind to the possibilities. Now I clone the static site repo, or copy from a storage service, and write to a shared ephemeral volume in the pod. A stock nginx container then mounts that ephemeral volume, and that's it: a simple and effective static site (see the sketch below).
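The pattern looks roughly like the Deployment below. The repo URL, names, and labels are placeholders, and it assumes the repo already contains the built site at its root:

    # Sketch of the initContainer pattern; names and the repo URL are illustrative.
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: static-site
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: static-site
      template:
        metadata:
          labels:
            app: static-site
        spec:
          initContainers:
            - name: fetch-site
              image: alpine/git
              # Clone the prebuilt site into the shared ephemeral volume.
              args: ["clone", "--depth=1", "https://example.com/me/my-site.git", "/site"]
              volumeMounts:
                - name: site
                  mountPath: /site
          containers:
            - name: nginx
              image: nginx:stable
              ports:
                - containerPort: 80
              volumeMounts:
                # Serve the cloned content read-only from the shared volume.
                - name: site
                  mountPath: /usr/share/nginx/html
                  readOnly: true
          volumes:
            - name: site
              emptyDir: {}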

So far I've been pretty happy building up my little k8s cluster, and it has been both reliable and low-maintenance. I look forward to continuing to raise the sophistication of my k8s setup.

Next Steps

I've got lots of ideas for new services to try out and develop. Having a clean deployment strategy takes a large technical burden off the design and development of those ideas. It also makes it much faster to go from idea to running app, which is a big help when I only have a few hours of free time to work on my passion projects.

I'd like to get a more robust monitoring stack integrated so that I can keep track of my services individually and see problems sooner. I'm still wrapping my head around prometheus-operator and how to add Prometheus exporters to my workloads. Another area where I need to improve my process is staging changes and rollouts locally, or in a staging environment, to get better at zero-downtime changes. Routine updates are usually pretty boring, but changing config or environment settings can lead to start-up crashes that may take anywhere from a few minutes to hours to debug. That's not much of a problem for a small personal-use service, but it's one of the many gaps that keep me from looking to apply these techniques professionally.
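The direction I'm aiming for with prometheus-operator is something like the ServiceMonitor sketched below. This is only a rough guess at my eventual setup; it assumes the operator is installed, that the app's Service exposes a named "metrics" port, and that the Prometheus instance selects monitors by a release label, with all names here being placeholders:

    # Rough ServiceMonitor sketch; names, labels, and the selector convention are assumptions.
    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
      name: example-app
      labels:
        release: prometheus      # assumed to match the Prometheus serviceMonitorSelector
    spec:
      selector:
        matchLabels:
          app: example-app       # labels on the app's Service
      endpoints:
        - port: metrics          # named port on the Service
          path: /metrics
          interval: 30s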

Looking Back

I've learned a lot over the last 13-14 years of system administration. Starting out with VMs is exactly what I would recommend to anyone starting their own journey. Unix-like OSs are the language that everything in this space speaks, so getting hands-on with that environment, trying stuff, and finding out what works and what doesn't is the best way to get a fundamental education. I came in at the start of the containerization cycle and have watched in real time how it transformed system administration.

From those simple VPSs to SaltStack management, to Docker, back to Unix system administration, and then to k8s, I've found that the best thing is to be curious and be willing to just give something a shot for a while. New technologies come and go, and a few even surprise and delight. The most important thing is taking every opportunity to improve one's knowledge of the fundamentals. Those fundamentals give sure footing when doing something new.


last-modified: 2024-10-09 03:45 +0000