My first k8s build log - Talos

I rebuilt most of it from scratch
kubernetes
talos
Published

January 23, 2026

Introduction

Continuing my rebooted cluster setup series, let’s talk about how I manage Talos Linux and my base kubernetes environment.

I did cover some of this in a previous post, but some of the administrative components and configuration have changed since then, and I’m treating these new posts as a bit of a clean slate anyway.

Bootstrap secret

To encrypt all the secrets (certificates etc.) that talos requires, I still use sops and age. There’s a good post on getting those working here. I manage my age key as part of my NixOS setup, so on any machine I manage it’s already set up.

In the repo I configure sops to use that age key with two .sops.yaml files, one in the folder for my cluster and one in the parent folder. Together they tell sops what to encrypt my secrets with and which files are secrets. Then as part of the bootstrap I generated a secret file with talhelper gensecret > talsecret.sops.yaml and encrypted it in place with sops, so I can safely commit it to my repo. These secrets are then used by talhelper to generate and apply talos configs.
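A rough sketch of that flow, for illustration only. The age recipient and the path_regex are placeholders rather than values from this repo, and a single .sops.yaml is assumed instead of the two-file layout described above. The talhelper and sops commands are the ones from the talhelper getting-started flow.

```shell
#!/bin/sh
# Sketch only: the recipient and path_regex below are placeholders.

# Print a sops config that encrypts *.sops.yaml files for one age recipient.
sops_config() {
  recipient="$1"
  cat <<EOF
creation_rules:
  - path_regex: .*\.sops\.ya?ml$
    age: ${recipient}
EOF
}

# Generate the talos secret bundle, then encrypt it in place before committing.
# Not called here; run it from the cluster folder once the tools are installed.
bootstrap_secrets() {
  sops_config "$1" > .sops.yaml
  talhelper gensecret > talsecret.sops.yaml
  sops -e -i talsecret.sops.yaml
}

# Preview the config without touching any files.
sops_config "age1examplepublickeyplaceholder"
```

Only the preview runs by default. bootstrap_secrets writes files and needs talhelper, sops and a real age public key.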

Talos config

The rest of my config either lives in talconfig.yaml or is incorporated into it from patches. To be honest I’m still not thrilled with how I have my config set up. I’m getting better about not repeating configuration unnecessarily by using YAML anchors, but it still feels like things could be designed better on my side. I also still struggle to translate settings from the base talos config syntax into the form talhelper expects. Skill issue on my part, I’m sure; nothing against the developer of the tool.

Helper tasks

Because I don’t regularly do talos operations and because some of them can be verbose to type out, I make use of taskfiles. For basic operations like applying configs or updating talos or kubernetes, this is a bit overkill, but it lets me run tasks from any folder, is slightly less typing, and forces the generation of new configs before applying them, which I sometimes forget about otherwise.

I’ve also got some status tasks to do things like pull the names of disks in my machines or check the health of my cluster.

Finally, probably the most helpful tasks I have are the ones to perform a wipe and bootstrap of my cluster. I don’t do this very often, so it’s nice to have the steps and commands captured.
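The tasks in this section boil down to short command sequences. Below is a minimal shell sketch of the same idea. It is not the actual taskfiles: the node IP and config path are hypothetical, it assumes a single control-plane node and a recent talosctl (for the disks resource), and it assumes the wipe only clears the STATE and EPHEMERAL partitions so the machine reboots into maintenance mode. Commands are printed rather than executed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Hypothetical values for illustration; replace with your own.
NODE_IP="${NODE_IP:-192.168.1.10}"
NODE_CONFIG="${NODE_CONFIG:-clusterconfig/home-cp1.yaml}"

# Print commands by default; set DRY_RUN=0 to actually run them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Apply: always regenerate configs first so a stale file is never pushed.
task_apply() {
  run talhelper genconfig &&
  run talosctl apply-config --nodes "$NODE_IP" --file "$NODE_CONFIG"
}

# Status: list the disks talos sees, then run the cluster health check.
task_status() {
  run talosctl get disks --nodes "$NODE_IP" &&
  run talosctl health --nodes "$NODE_IP"
}

# Rebuild: wipe back to maintenance mode, re-apply, then bootstrap.
# In a real run, wait for the node to come back up between steps.
task_rebuild() {
  run talosctl reset --nodes "$NODE_IP" \
    --system-labels-to-wipe STATE --system-labels-to-wipe EPHEMERAL \
    --graceful=false --reboot &&
  run talhelper genconfig &&
  run talosctl apply-config --insecure --nodes "$NODE_IP" --file "$NODE_CONFIG" &&
  run talosctl bootstrap --nodes "$NODE_IP" &&
  run talosctl kubeconfig --nodes "$NODE_IP"
}

# Print the rebuild plan.
task_rebuild
```

Printing by default is deliberate: for destructive steps like reset, it is worth reading the plan before running it for real.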

A note on version management

In my previous post on talos I talked about how I got myself into a pickle by having a newer version of kubernetes installed on my cluster than my version of talos supported, and not being able to upgrade or roll back. At the time I thought the issue was related to which version of talosctl I was running. While I do still try to keep those versions close, I don’t think that was the issue. The main thing is that I misunderstood how applying a talos config affects a machine that’s in maintenance mode. Here’s my current understanding, with the caveat that I may still be missing some things.

When you wipe a talos machine back to maintenance mode, it keeps running whatever version of talos it was running before the wipe. If you then apply a talos config that specifies a different version of talos and kubernetes than what’s currently running, it will not change the version of talos that’s running, but it will change the version of kubernetes. This can then get you into the situation I found myself in. I’m pretty sure what happened was:

  • I was running an older version of talos and kubernetes
  • I decided I wanted to wipe and rebuild my cluster, and upgrade while I was at it
  • I reset my machines to maintenance mode (still on old talos and old kubernetes)
  • I updated my talos config to specify new talos and new kubernetes
  • I applied my config, expecting it to apply the specified talos and kubernetes version
  • It only applied the specified kubernetes version, resulting in a mismatch.

The correct approach is to not mix wipe/rebuilds with upgrades. Do your wipe and rebuild on the same versions of talos and kubernetes you were previously running, then upgrade talos, then upgrade kubernetes (repeating as needed if you have to go up several minor versions).
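To make that ordering concrete, here is a small sketch that prints the steps for a given pair of old and new versions. The version numbers are hypothetical, the node placeholders need filling in, and it assumes the stock ghcr.io/siderolabs/installer image rather than a custom factory image. It only prints a plan and runs nothing against a cluster.

```shell
#!/bin/sh
# Print the order of operations for a rebuild followed by an upgrade.
# Arguments: old talos, old kubernetes, new talos, new kubernetes versions.
plan_rebuild_and_upgrade() {
  old_talos="$1"; old_k8s="$2"; new_talos="$3"; new_k8s="$4"

  # 1. Rebuild on the versions the machines are already running.
  echo "wipe and rebuild on talos ${old_talos} / kubernetes ${old_k8s}"

  # 2. Upgrade talos first, only if the target differs.
  if [ "$old_talos" != "$new_talos" ]; then
    echo "talosctl upgrade --nodes <node-ip> --image ghcr.io/siderolabs/installer:${new_talos}"
  fi

  # 3. Upgrade kubernetes last, once the running talos supports the target.
  if [ "$old_k8s" != "$new_k8s" ]; then
    echo "talosctl upgrade-k8s --nodes <control-plane-ip> --to ${new_k8s}"
  fi
}

# Hypothetical versions, for illustration only.
plan_rebuild_and_upgrade v1.9.5 1.32.3 v1.10.0 1.33.0
```

For a jump across several minor versions, the same plan would be run once per hop, with the talos config updated to match after each step.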