There Has Been a Slight Change with My Lab: An Upgrade

🔧 My Reasoning for the Upgrade

Running Kubernetes on a single machine was a great way to get started. It was easy to manage, simple to test deployments, and perfect for learning. But over time, I ran into several limitations.

With just one node, I couldn’t truly explore:

  • 🔁 Scheduling across multiple nodes
  • 🛰️ Cluster communication and service discovery
  • ⚙️ High Availability (HA) deployments
  • 🔥 Disaster recovery scenarios
  • 📈 Horizontal scaling and load balancing

💸 The Upgrade

I decided to invest in a few low-cost Mini PCs to build out a proper cluster. I found a great option on Amazon: the OUMAX Mini PC. These are currently priced at just $141.54 USD.

They offer a solid set of specs for a home lab:

  • 🧠 CPU: Intel N150 (4 cores, 4 threads)
  • 💾 RAM: 16GB
  • 📦 Storage: 500GB NVMe SSD
  • 🌐 Networking: 2 x 2.5GbE NICs

๐Ÿ—๏ธ The New Lab Design

With three of these units, I’m now running a multi-node Kubernetes cluster where each node acts as both a control plane and a worker. The dual NICs let me physically separate:

  • 🔒 Internal traffic: cluster and storage communications
  • 🌐 External traffic: ingress/egress to the internet

This setup allows me to:

  • 🔄 Simulate node failure and recovery
  • 🛠️ Add/remove nodes dynamically
  • 📡 Test Longhorn and NFS over an isolated backend network
  • 📊 Analyze service behavior under load

๐Ÿ–ฅ๏ธ What About the Old Machine?

The original system isn’t going away. I’ll be dedicating it to GPU-related workloads. It’s perfect for testing things like:

  • 🧠 AI models with the NVIDIA toolkit
  • 🎥 Media workloads like transcoding and inference

🧪 What’s Next?

I’m considering picking up a fourth Mini PC and running Windows Server 2019/2022 on it. This would let me experiment with Windows containers and hybrid clusters.

💡 I know Microsoft recommends Azure or Azure Stack HCI for Windows-based pods, but I’m curious to see what’s possible in a pure local setup. Even if it’s not ideal, the experience alone will be valuable.

🧵 TL;DR

My lab just leveled up. Going from one node to a real multi-node Kubernetes setup opens the door to high availability, better simulation of production-grade environments, and hands-on experimentation with real-world scenarios — all without breaking the bank.

One Tiny Server, Big Plans – Kicking Off My Kubernetes Lab!

Every big project starts small — and this is mine. I’m setting the stage for a full rebuild of my production RKE2 cluster by first rolling up my sleeves and diving deep into a dedicated Kubernetes lab environment. This isn’t just tinkering for fun (though it will be fun); it’s a focused effort to test, learn, and refine the tools and workflows that will power my future production setups.

With an old Lenovo ThinkStation P320 Tiny, I’m going to break things, fix them, and — most importantly — learn why they broke in the first place. This lab is where mistakes become lessons, and lessons become skills.

๐Ÿ› ๏ธ Lab Hardware Specs

The heart of my lab is humble, but it packs enough punch for everything I need:

Lenovo ThinkStation P320 Tiny

  • Ubuntu Server 24.04.2 LTS
  • Rancher/SUSE K3s
  • Intel Core i7-7700T
  • 16GB RAM
  • 256GB NVMe SSD
  • 1GbE NIC
  • NVIDIA Quadro P600

While modest compared to a full-blown production cluster, it’s perfect for lab work, experimentation, and building a solid foundation.

🎯 Learning Goals & Focus Areas

This lab isn’t about just spinning up pods. It’s about understanding the ecosystem that makes Kubernetes production-ready. Here’s where my focus will be:

📡 Networking (CNI)

  • Calico for advanced network policies and scalability.
  • Cilium to explore eBPF-powered networking and observability.
  • Flannel for a lightweight, baseline comparison.
  • Canal to combine Calico’s policy engine with Flannel’s networking backend.
  • Multus to enable attaching multiple network interfaces to pods for complex topologies.
  • Weave Net for simple, encrypted networking with automatic peer discovery.
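Whichever CNI ends up in place, a first exercise will be standard NetworkPolicy objects, which Calico, Cilium, and Canal all enforce. A minimal sketch (namespace and labels here are hypothetical):

```yaml
# Allow ingress to app=backend pods only from app=frontend pods on
# port 8080; everything else inbound is dropped. Names are placeholders.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-only
  namespace: demo
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - protocol: TCP
          port: 8080
```

Running the same policy under each CNI should make their differences in enforcement and observability very concrete.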

โš–๏ธ Load Balancing & High Availability

  • MetalLB for simple L2/L3 load balancing.
  • Kube-VIP for virtual IPs and HA control plane setup.
  • HAProxy for flexible L4/L7 proxying, ideal for ingress or external load balancing.
  • Traefik as a dynamic reverse proxy with integrated Let’s Encrypt and metrics.
  • NGINX Ingress Controller for production-grade HTTP load balancing and Ingress routing.
  • ExternalDNS – Automatically manages DNS records in external providers (e.g., Cloudflare, Route53) based on Kubernetes service and ingress resources.
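As a taste of what the MetalLB testing will look like, here is a minimal Layer 2 setup sketch (the address range is a placeholder for whatever my lab subnet allows):

```yaml
# MetalLB L2 mode: hand out LoadBalancer IPs from a reserved range
# and announce them via ARP. Range below is illustrative.
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: lab-pool
  namespace: metallb-system
spec:
  addresses:
    - 192.168.1.240-192.168.1.250
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: lab-l2
  namespace: metallb-system
spec:
  ipAddressPools:
    - lab-pool
```

Once this is in place, any Service of type LoadBalancer should pick up an IP from the pool automatically.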

๐Ÿ—„๏ธ Storage (CSI)

  • Synology CSI Driver for volume provisioning on Synology NAS systems.
  • NFS and SMB CSI drivers for shared filesystem access over network protocols.
  • S3-compatible storage solutions like JuiceFS for cloud-native object storage access.
  • iSCSI CSI Driver for block-level persistent volumes and advanced workloads.
  • Longhorn – Distributed block storage system designed for Kubernetes, ideal for lab/home clusters.
  • Ceph / Rook – Highly scalable, replicated block/object/filesystem storage using Ceph, managed via the Rook operator.
  • GlusterFS CSI – Shared filesystem storage with high availability and horizontal scaling.
  • Local Path Provisioner – Simple hostPath-based storage for development/testing clusters.
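For the Longhorn experiments, the starting point will be a StorageClass plus a claim along these lines (replica count and size are illustrative choices, not recommendations):

```yaml
# Longhorn-backed StorageClass with 2 replicas per volume, plus a
# claim that uses it. Values are illustrative for a 3-node lab.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-replicated
provisioner: driver.longhorn.io
parameters:
  numberOfReplicas: "2"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn-replicated
  resources:
    requests:
      storage: 5Gi
```

With replicas spread across nodes, killing the node hosting a volume is exactly the kind of failure drill the new cluster makes possible.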

๐ŸŒ Connectivity & Access

  • Tailscale to create a secure mesh VPN for remote access.
  • Cloudflare Tunnels for secure, zero-trust ingress.
  • ngrok for quick, developer-friendly secure tunnels to local services (great for demos and dev, but limited for production use).
  • Teleport for secure access to Kubernetes clusters, SSH, databases, and apps โ€” with audit logging and SSO.
  • ZeroTier as an alternative mesh VPN with advanced routing and bridging features.
  • Nebula – Lightweight mesh VPN built by Slack, great for self-hosted, peer-to-peer networks.
  • Traditional VPN tunnels for more controlled scenarios.

🌀 GitOps: FluxCD & ArgoCD

Modern Kubernetes management revolves around GitOps. I’ll be diving into:

  • FluxCD for automated reconciliation and Git-driven deployments.
  • ArgoCD for application visualization and deployment management.
  • Jenkins, GitHub Actions, or Semaphore for CI/CD pipelines and integration testing.

These tools will help me build repeatable, automated workflows — making my cluster self-healing, declarative, and production-ready.
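With FluxCD, for example, the whole workflow boils down to pointing the cluster at a Git repository and reconciling a path from it. A sketch, assuming a hypothetical ./clusters/lab directory in my MyOps repo (the path and intervals are placeholders):

```yaml
# Flux watches the repo and applies manifests from a path, pruning
# anything removed from Git. Path/intervals are illustrative.
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: myops
  namespace: flux-system
spec:
  interval: 5m
  url: https://github.com/EricZarnosky/MyOps
  ref:
    branch: main
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: cluster-apps
  namespace: flux-system
spec:
  interval: 10m
  sourceRef:
    kind: GitRepository
    name: myops
  path: ./clusters/lab
  prune: true
```

Once that loop is running, a git push becomes the only deployment mechanism — which is the discipline I want before rebuilding production.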

๐Ÿ” Secrets Management: HashiCorp Vault & OpenBao

Managing secrets securely is non-negotiable in any production environment. I’ll be comparing:

  • HashiCorp Vault, the industry standard for secret management.
  • OpenBao, the open-source fork aimed at long-term community support.

Both will play key roles in managing tokens, passwords, and sensitive configs in my cluster.
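One pattern I plan to test is Vault’s Kubernetes sidecar injector, where pod annotations pull secrets in at runtime. A sketch, assuming a Vault role and secret path that don’t exist yet in my lab (all names are placeholders):

```yaml
# The Vault Agent injector watches for these annotations and mounts
# the rendered secret into the pod. Role and path are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: demo-app
  annotations:
    vault.hashicorp.com/agent-inject: "true"
    vault.hashicorp.com/role: "demo-app"
    vault.hashicorp.com/agent-inject-secret-db-creds: "secret/data/demo/db"
spec:
  containers:
    - name: app
      image: nginx:alpine
```

Since OpenBao is a fork of Vault, part of the comparison will be seeing how much of this workflow carries over unchanged.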

🔑 SSO & Authentication

  • Authentik – Lightweight, modern identity provider supporting SSO, MFA, reverse proxy authentication, and LDAP/AD integration.
  • Keycloak – Enterprise-grade open-source identity and access management with robust OIDC, SAML, and user federation support.
  • Authelia – Authentication proxy for web apps with 2FA, ideal for use with NGINX or Traefik.
  • Dex – Kubernetes-native OIDC identity service that integrates with LDAP, GitHub, and others.
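Dex, for instance, is configured with a small YAML file listing upstream connectors. A hedged sketch using a hypothetical GitHub OAuth app (issuer URL, client IDs, and secrets are all placeholders):

```yaml
# Minimal Dex config: GitHub as the upstream identity source, one
# static OIDC client. Every value here is illustrative.
issuer: https://dex.example.com
storage:
  type: kubernetes
  config:
    inCluster: true
connectors:
  - type: github
    id: github
    name: GitHub
    config:
      clientID: $GITHUB_CLIENT_ID
      clientSecret: $GITHUB_CLIENT_SECRET
      redirectURI: https://dex.example.com/callback
staticClients:
  - id: kubernetes
    name: Kubernetes
    redirectURIs:
      - http://localhost:8000
    secret: example-client-secret
```

The appeal of Dex for a lab is exactly this: one small file stands between the cluster's OIDC auth and whatever identity backend I want to swap in.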

๐Ÿ—๏ธ Infrastructure as Code: Terraform & OpenTofu

To manage resources beyond Kubernetes itself, I’ll be using:

  • HashiCorp Terraform to automate cloud resources, DNS records, and infrastructure provisioning.
  • OpenTofu, the community-driven alternative, to see how it compares and integrates into my workflows.

Infrastructure as Code will be a key skill in scaling my homelab into production-grade environments.
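As a flavor of what that looks like, a Terraform sketch managing a single DNS record via the Cloudflare provider (zone, name, and address are placeholders, and newer provider versions may use different resource/attribute names):

```hcl
terraform {
  required_providers {
    cloudflare = {
      source = "cloudflare/cloudflare"
    }
  }
}

# Illustrative only: the zone ID variable and address are placeholders.
resource "cloudflare_record" "lab" {
  zone_id = var.cloudflare_zone_id
  name    = "lab"
  type    = "A"
  value   = "203.0.113.10"
  ttl     = 300
}
```

The same file should apply cleanly under OpenTofu, which is precisely the kind of compatibility claim I want to verify for myself.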

📈 Monitoring & Observability

Capturing metrics, logs, and traces is essential for maintaining system health and debugging issues:

  • Prometheus for collecting metrics and enabling alerting.
  • Grafana for visual dashboards of infrastructure and services.
  • Loki to aggregate and query logs efficiently.
  • Fluent Bit to collect, parse, and forward logs to multiple backends.
  • Fluentd – Full-featured log collector and processor; supports filtering, buffering, and routing logs to multiple backends (Splunk, Elasticsearch, etc.).
  • Elasticsearch – Distributed search and analytics engine, often used for log indexing and full-text search.
  • Kibana – Visualization and dashboard interface for Elasticsearch, used to explore logs, metrics, and APM data in ELK/EFK stacks.
  • Tempo or Jaeger for distributed tracing.
  • Splunk for enterprise-grade log and event aggregation, especially for hybrid or security-focused environments.

Observability is more than just logs โ€” it’s about connecting metrics, traces, and logs to gain real insight.
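With the Prometheus Operator, for example, wiring a service into metrics collection is a single small ServiceMonitor object. A sketch with hypothetical names and labels:

```yaml
# Tells a Prometheus Operator instance to scrape any Service labeled
# app=demo-app on its "metrics" port. All names are illustrative, and
# the release label must match whatever selector your Prometheus uses.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: demo-app
  namespace: monitoring
  labels:
    release: kube-prometheus-stack
spec:
  selector:
    matchLabels:
      app: demo-app
  endpoints:
    - port: metrics
      interval: 30s
```

From there, the same service's logs (via Loki or Fluent Bit) and traces (via Tempo or Jaeger) can be correlated in Grafana — the connected view described above.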

โœ‰๏ธ Messaging & Queuing

Message brokers and event streaming platforms are essential for decoupling services and enabling scalable architectures:

  • RabbitMQ – Lightweight, reliable message broker supporting AMQP, MQTT, and STOMP. Great for traditional pub/sub or job queues.
  • NATS – High-performance cloud-native messaging system, ideal for microservices and IoT applications.
  • Kafka – Distributed streaming platform for large-scale data pipelines and event-driven systems.
  • Mosquitto – Lightweight MQTT broker, great for home automation and IoT telemetry.
  • Redis Streams / Valkey Streams – Simple stream processing using Redis/Valkey’s in-memory data store.

💾 Data Backups & Snapshots

Backing up your data is critical for disaster recovery, migrations, and testing:

  • Velero – Kubernetes-native backup and restore tool for volumes and cluster state.
  • CloudNativePG Backups – Built-in support for S3/NFS backups and point-in-time recovery (PITR).
  • Kasten K10 – Commercial-grade Kubernetes backup solution (free for small/home clusters).
  • Restic – Fast, efficient backup tool that can integrate with Kubernetes via Stash or Velero plugins.
  • pgBackRest – Reliable PostgreSQL backup and recovery tool, often used with CloudNativePG.
  • Percona XtraBackup – Hot backup utility for MySQL and MariaDB.
  • NFS/SMB snapshotting via Synology, or Longhorn’s native snapshot/backup tools.
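A Velero nightly backup, for example, is a single small Schedule object. A sketch (the cron expression and retention window are arbitrary choices for illustration):

```yaml
# Nightly full-cluster backup at 02:00, retained for 7 days.
# Namespace selection and TTL are illustrative.
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: nightly
  namespace: velero
spec:
  schedule: "0 2 * * *"
  template:
    includedNamespaces:
      - "*"
    ttl: 168h
```

Restoring one of these backups into a freshly wiped node is exactly the disaster-recovery drill the multi-node cluster was bought for.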

๐Ÿ› ๏ธ Tools & Utilities

A collection of essential tools I use to interact with, manage, and customize my Kubernetes environment:

  • K9s – Terminal UI for interacting with Kubernetes clusters in real time.
  • Kustomize – Customize Kubernetes manifests with patches and overlays.
  • Helm – Kubernetes package manager for deploying complex applications via charts.
  • Helmfile – Declarative manager for Helm charts, great for GitOps and multiple environments.
  • Stern – Stream logs from multiple pods with label filtering.
  • Rancher – Web-based Kubernetes management platform with multi-cluster support.
  • Lens – GUI-based Kubernetes IDE for visual insights, workload browsing, and troubleshooting.
  • Kured – the KUbernetes REboot Daemon, which safely handles node reboots after OS updates.
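Helmfile in particular makes the whole stack declarative. A minimal sketch pinning two of the charts mentioned above (namespaces and chart selection are illustrative, not my final layout):

```yaml
# helmfile.yaml — declare chart repos and releases; `helmfile apply`
# reconciles the cluster to match. Contents are illustrative.
repositories:
  - name: metallb
    url: https://metallb.github.io/metallb
  - name: longhorn
    url: https://charts.longhorn.io

releases:
  - name: metallb
    namespace: metallb-system
    chart: metallb/metallb
  - name: longhorn
    namespace: longhorn-system
    chart: longhorn/longhorn
```

Kept in Git, a file like this turns "rebuild the cluster" into a single command — the same repeatability goal driving the GitOps work above.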

🚀 Why This Lab Matters

This lab is my personal training ground — a place where I can explore, break, fix, and automate without the risk of impacting production. By the time I’m ready to rebuild my RKE2 cluster, I’ll have tested and validated these tools in realistic scenarios, with hands-on experience guiding every decision.

But this journey isn’t just for me.

I want to share every step — the successes, the failures, the lessons learned — so that others can learn alongside me. Whether you’re new to Kubernetes, experimenting with homelabs, or preparing for your own production deployments, I hope my experiences can help you avoid common pitfalls and accelerate your own learning.

All my configurations, manifests, and code from this lab will be available in my GitHub repository:

👉 EricZarnosky/MyOps

This will be a living resource as I continue to refine, document, and share what I discover.

📢 What’s Next?

Follow along as I document each step — the wins, the failures, and everything I learn while leveling up my Kubernetes skills. From lab experiments to production-ready deployments, this journey is just getting started.

๐Ÿง‘โ€๐Ÿ’ป Follow My Lab Journey

This lab is more than a personal project — it’s a learning journey I’m sharing with the community.

Mistakes, fixes, lessons — all documented. Let’s learn Kubernetes together.