Introduction
My Home Operations repository
... managed by Flux, Renovate and GitHub Actions 🤖
👋 Welcome to my Home Operations repository. This is a mono repository for my home infrastructure and Kubernetes cluster. I try to adhere to Infrastructure as Code (IaC) and GitOps practices using tools like Ansible, Terraform, Kubernetes, Flux, Renovate and GitHub Actions.
Support
If you like this project, please consider supporting my work through my GitHub sponsorship page.
🤝 Thanks
Thanks to all the people who donate their time to the Kubernetes @Home Discord community. A lot of inspiration for my cluster comes from the people who have shared their clusters using the k8s-at-home GitHub topic. Be sure to check out the Kubernetes @Home search for ideas on how to deploy applications, or for inspiration on what you can deploy.
License
See LICENSE
Hardware
Device | Count | OS Disk Size | Data Disk Size | RAM | Operating System | Purpose |
---|---|---|---|---|---|---|
YCSD 6LAN i211 MiniPC i3 7100U | 1 | 128GB mSATA | - | 8GB | VyOS | Router |
Intel NUC8i3BEH | 1 | 512GB SSD | 1TB NVMe (rook-ceph) | 32GB | Talos | Kubernetes Node |
Intel NUC8i5BEH | 2 | 512GB SSD | 1TB NVMe (rook-ceph) | 32GB | Talos | Kubernetes Node |
Synology DS918+ | 1 | - | 2x14TB + 1x10TB + 1x6TB (SHR) | 8GB | Synology DSM7 | NFS + Backup Server |
Raspberry Pi 4 | 1 | 128GB (SD) | - | 4GB | PiKVM | Network KVM |
Unifi USW-Lite-16-PoE | 2 | - | - | - | - | Core network switch |
Unifi USW-Flex-Mini | 1 | - | - | - | - | Secondary network switch |
Unifi UAP-AC-Pro | 4 | - | - | - | - | Wireless AP |
☁️ Cloud services
While most of my infrastructure and workloads are self-hosted, I do rely upon the cloud for certain key parts of my setup. This saves me from having to worry about two things: (1) dealing with chicken-and-egg scenarios, and (2) keeping services I critically need available whether my cluster is online or not.
The alternative solution to these two problems would be to host a Kubernetes cluster in the cloud and deploy applications like HCVault, Vaultwarden, ntfy, and Authentik. However, maintaining another cluster and monitoring another group of workloads is a lot more time and effort than I am willing to put in and only saves me roughly $10/month.
Service | Use | Cost |
---|---|---|
GitHub | Hosting this repository and continuous integration/deployments | Free |
Auth0 | Identity management and authentication | Free |
Cloudflare | Domain, DNS and proxy management | Free |
1Password | Secrets with External Secrets | ~$65/y |
Terraform Cloud | Storing Terraform state | Free |
B2 Storage | Offsite application backups | ~$5/m |
Pushover | Kubernetes Alerts and application notifications | Free |
| Total | | ~$10/m |
Kubernetes
My main cluster runs Talos, provisioned on bare metal using the official talosctl CLI tool. I render my Talos configuration with the talhelper CLI tool, which allows me to keep the Talos configuration as DRY as possible.
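For context, talhelper takes a single talconfig.yaml and renders per-node machine configs that talosctl can apply. A heavily stripped-down sketch of such a file is shown below; the cluster name, endpoint, addresses and disk path are placeholders, and the exact schema is documented by the talhelper project.

```yaml
# Hypothetical, minimal talconfig.yaml for talhelper (placeholder values).
clusterName: home-kubernetes          # placeholder cluster name
endpoint: https://192.168.1.20:6443   # placeholder control plane endpoint
nodes:
  - hostname: node-1                  # placeholder hostname
    ipAddress: 192.168.1.21           # placeholder node address
    controlPlane: true
    installDisk: /dev/sda             # placeholder install disk
```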
This is a semi-hyper-converged cluster: workloads and block storage share the same available resources on my nodes, while a separate server provides (NFS) file storage.
Core Components
- actions-runner-controller: Self-hosted GitHub runners.
- cilium: Internal Kubernetes networking plugin.
- cert-manager: Creates SSL certificates for services in my Kubernetes cluster.
- external-dns: Automatically manages DNS records from my cluster in a cloud DNS provider.
- external-secrets: Manages Kubernetes secrets using 1Password Connect (see the sketch after this list).
- ingress-nginx: Ingress controller to expose HTTP traffic to pods over DNS.
- multus: Allows multi-homing Kubernetes pods.
- rook: Distributed block storage for persistent storage.
- sops: Manages secrets for Kubernetes, Ansible and Terraform which are committed to Git.
- tf-controller: Additional Flux component used to run Terraform from within a Kubernetes cluster.
- volsync and snapscheduler: Backup and recovery of persistent volume claims.
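To give a rough idea of how external-secrets ties into 1Password Connect, a minimal ExternalSecret could look like the sketch below; the store name, item key and target Secret name are placeholders, not values from this repository.

```yaml
# Hypothetical ExternalSecret: copies fields from a 1Password item
# (via a ClusterSecretStore backed by 1Password Connect) into a Kubernetes Secret.
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: my-app                      # placeholder name
  namespace: default
spec:
  secretStoreRef:
    kind: ClusterSecretStore
    name: onepassword-connect       # assumed store name
  target:
    name: my-app-secret             # Kubernetes Secret that gets created
  dataFrom:
    - extract:
        key: my-app                 # 1Password item to pull fields from
```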
GitOps
Flux watches my kubernetes folder (see Directory structure) and applies changes to my cluster based on the YAML manifests it finds there.
Flux recursively searches the kubernetes/apps folder until it finds the top-most kustomization.yaml in each directory and then applies all the resources listed in it. That kustomization.yaml will generally only contain a namespace resource and one or more Flux kustomizations. Those Flux kustomizations will generally contain a HelmRelease or other resources related to the application underneath it, which will then be applied.
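As a rough sketch (file names and the app path are illustrative, not taken from this repository), such a top-level kustomization.yaml typically looks something like this:

```yaml
# Hypothetical top-level kustomization.yaml for one namespace: it pulls in
# the Namespace itself plus the Flux Kustomization(s) for each application.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ./namespace.yaml    # the Namespace resource
  - ./my-app/ks.yaml    # a Flux Kustomization for one application (placeholder path)
```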
Renovate watches my entire repository looking for dependency updates; when they are found, a PR is automatically created. When PRs are merged, Flux applies the changes to my cluster.
Directory structure
My home-ops repository contains the following directories under kubernetes.
📁 kubernetes      # Kubernetes clusters defined as code
├─📁 main          # My main kubernetes cluster
│ ├─📁 bootstrap   # Flux installation
│ ├─📁 flux        # Main Flux configuration of repository
│ └─📁 apps        # Apps deployed into my cluster grouped by namespace (see below)
└─📁 tools         # Manifests that come in handy every now and then
Flux resource layout
Below is a high-level look at how my directory structure works with Flux. In this brief example you can see that authelia will not run until glauth and cloudnative-pg are running. It also shows that the Cluster custom resource depends on the cloudnative-pg Helm chart, which is needed because cloudnative-pg installs the Cluster custom resource definition as part of that chart.
# Key: <kind> :: <metadata.name>
GitRepository :: home-ops-kubernetes
    Kustomization :: cluster
        Kustomization :: cluster-apps
            Kustomization :: cluster-apps-authelia
                DependsOn:
                    Kustomization :: cluster-apps-glauth
                    Kustomization :: cluster-apps-cloudnative-pg-cluster
                HelmRelease :: authelia
            Kustomization :: cluster-apps-glauth
                HelmRelease :: glauth
            Kustomization :: cluster-apps-cloudnative-pg
                HelmRelease :: cloudnative-pg
            Kustomization :: cluster-apps-cloudnative-pg-cluster
                DependsOn:
                    Kustomization :: cluster-apps-cloudnative-pg
                Cluster :: postgres
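To show how one of those dependencies is expressed, a Flux Kustomization with a dependsOn entry looks roughly like the sketch below; the namespace, path and interval are illustrative rather than copied from my manifests.

```yaml
# Hypothetical Flux Kustomization: the CloudNativePG Cluster waits for the
# cloudnative-pg Kustomization (and therefore its CRDs) to be ready first.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: cluster-apps-cloudnative-pg-cluster
  namespace: flux-system            # assumed Flux namespace
spec:
  dependsOn:
    - name: cluster-apps-cloudnative-pg
  sourceRef:
    kind: GitRepository
    name: home-ops-kubernetes
  path: ./kubernetes/main/apps/database/cloudnative-pg/cluster   # illustrative path
  prune: true
  interval: 30m                     # illustrative interval
```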
Storage
Storage in my cluster is handled in a number of ways. The in-cluster storage is provided by a rook Ceph cluster that is running on a number of my nodes.
rook-ceph block storage
The bulk of my cluster storage relies on my CephBlockPool. This ensures that my data is replicated across my storage nodes.
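For reference, a replicated CephBlockPool is defined roughly as in the sketch below; the pool name and replica count are assumptions, not values from this repository.

```yaml
# Hypothetical Rook CephBlockPool: keeps three copies of every block,
# each on a different host.
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: ceph-blockpool              # placeholder name
  namespace: rook-ceph
spec:
  failureDomain: host               # spread replicas across different nodes
  replicated:
    size: 3                         # number of data copies
```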
NFS storage
Finally, I have my NAS that exposes several exports over NFS. Given that NFS is a very bad idea for storing application data (see for example this GitHub issue), I only use it to store data at rest, such as my personal media files, Linux ISOs, backups, etc.
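When a workload does need to mount one of these exports, a statically provisioned NFS PersistentVolume is enough; the sketch below uses placeholder server, path and size values.

```yaml
# Hypothetical static NFS PersistentVolume pointing at a NAS export.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: media-nfs                   # placeholder name
spec:
  capacity:
    storage: 1Ti                    # illustrative size
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: 192.168.1.10            # placeholder NAS address
    path: /volume1/media            # placeholder export path
```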
Backups
Automation
Terraform
Ansible
How to...
Here you can find information on how to accomplish specific scenarios.
Run a Pod in a VLAN
Sometimes you'll want to give a Kubernetes Pod direct access to a VLAN. This could be for any number of reasons, but the most common one is to let the application automatically discover devices on that VLAN.
A good example of this would be Home Assistant. This application has several integrations that rely on being able to discover the hardware devices (e.g. Sonos speakers or ESPHome devices).
- Prerequisites
- NIC configuration
- Multus Configuration
- NetworkAttachmentDefinition
- Pod configuration
- App-specific configuration: Home Assistant
Prerequisites
For a Kubernetes cluster to be able to add additional network interfaces to Pods (this is also known as "multi-homing") the Multus CNI needs to be installed in your cluster.
I use the Helm chart provided by @angelnu to install Multus. The reason for using this over the official deployment method is that it has better support for upgrade/update scenarios.
NIC configuration
Make sure that the Kubernetes node has a network interface that is connected to the VLAN you wish to connect to.
My nodes only have a single NIC, so I have set them up so that the main interface gets its IP address over DHCP, with a virtual interface on top of it connecting to the VLAN. How to do this will depend on your operating system.
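On my Talos nodes this boils down to a machine configuration snippet along these lines; the interface name and VLAN ID are placeholders, and the exact schema is described in the Talos network configuration documentation.

```yaml
# Hypothetical Talos machine config fragment: DHCP on the physical NIC,
# plus a virtual interface for an IoT VLAN on top of it.
machine:
  network:
    interfaces:
      - interface: eth0             # placeholder physical interface name
        dhcp: true
        vlans:
          - vlanId: 20              # placeholder VLAN ID
            dhcp: false             # Multus hands out the addresses instead
```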
Multus Configuration
My Multus Helm configuration can be found here.
It is important to note that the paths of your CNI plugin binaries / config might differ depending on the Kubernetes distribution you are running. For my Talos setup they need to be set to /opt/cni/bin
and /etc/cni/net.d
respectively.
NetworkAttachmentDefinition
Once the Multus CNI has been installed and configured you can use the NetworkAttachmentDefinition
Custom Resource to define the virtual IP addresses that you want to hand out. These need to be free addresses within the VLAN subnet, so it's important to make sure that they do not overlap with your DHCP server range(s).
{{ #include ../../../../kubernetes/main/apps/home-automation/home-assistant/app/networkattachmentdefinition.yaml }}
Be sure to check out the official documentation for more information on how to configure the spec.config
field.
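For illustration, a macvlan NetworkAttachmentDefinition with static IPAM generally has the shape sketched below; the master interface, subnet and addresses are placeholders rather than the values from my manifest.

```yaml
# Hypothetical macvlan NetworkAttachmentDefinition handing out a static
# address on the IoT VLAN. spec.config holds a JSON-encoded CNI configuration.
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: macvlan-static-iot-hass
spec:
  config: |
    {
      "cniVersion": "0.3.1",
      "type": "macvlan",
      "master": "eth0.20",
      "mode": "bridge",
      "ipam": {
        "type": "static",
        "addresses": [
          { "address": "192.168.20.50/24", "gateway": "192.168.20.1" }
        ]
      }
    }
```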
Pod configuration
Once the NetworkAttachmentDefinition has been loaded it is possible to use it within a Pod. This can be done by setting an annotation on the Pod that references it. Staying with the Home Assistant example (full Helm values), this would be:
k8s.v1.cni.cncf.io/networks: macvlan-static-iot-hass
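In a raw Pod manifest (outside of Helm values) that annotation sits under metadata.annotations, roughly like this; the Pod name and image are illustrative.

```yaml
# Hypothetical Pod snippet: the annotation tells Multus to attach the extra
# network defined by the NetworkAttachmentDefinition above.
apiVersion: v1
kind: Pod
metadata:
  name: home-assistant              # placeholder name
  annotations:
    k8s.v1.cni.cncf.io/networks: macvlan-static-iot-hass
spec:
  containers:
    - name: home-assistant
      image: ghcr.io/home-assistant/home-assistant:stable   # illustrative image
```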
App-specific configuration: Home Assistant
In order for Home Assistant to actually use the additional network interface you will need to explicitly enable it instead of relying on automatic network detection.
To do so, navigate to Settings >> System >> Network
(this setting is only available to Home Assistant users that have "Advanced mode" enabled in their user profile) and place a checkmark next to the adapters that you wish to use with Home Assistant integrations.
Run a Service with both TCP and UDP
One example where it is really nice to have a single unified Service expose all the ports, instead of several "single-purpose" ones, is the Unifi Controller: Helm values.
Up until Kubernetes version 1.26 it was (by default) not possible to have a single LoadBalancer Service expose both TCP and UDP protocols.
Prerequisites
Since Kubernetes version 1.26 the MixedProtocolLBService feature has graduated to GA status, and no special flags should be required.
Before version 1.26 it was required to enable the MixedProtocolLBService=true feature gate in order to achieve this functionality.
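A mixed-protocol LoadBalancer Service then looks roughly like the sketch below; the name, selector and ports are illustrative, loosely modeled on the Unifi Controller use case.

```yaml
# Hypothetical LoadBalancer Service exposing both TCP and UDP ports
# from the same external address.
apiVersion: v1
kind: Service
metadata:
  name: unifi                       # placeholder name
spec:
  type: LoadBalancer
  selector:
    app.kubernetes.io/name: unifi   # placeholder selector
  ports:
    - name: https
      port: 8443
      protocol: TCP
    - name: stun
      port: 3478
      protocol: UDP
```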