Using an AI agent to migrate Kubernetes

Dawid Laszuk published on

Another great use for AI agents: Kubernetes migrations.

I used to be quite anti-Kubernetes. It is a big, convoluted thing with creeping complexity. It lures you in with "simple orchestration" and the fact that every cloud provider has managed Kubernetes. You think this is a great step on the cloud-agnostic journey. Built-in high availability. Similar manifests everywhere. A bit more painful at the start, sure, but maybe a good long-term investment. Then the scaling starts and things get hairy.

Kubernetes assumes that pods are cattle. Anything can happen to them. Most engineers don't even think something bad can happen to their processes. Their processes are immortal, obviously. Then there are network issues, resource slang, verbose manifests, storage classes, ingress controllers, secrets, certificates, and twenty different ways to spell "this should be running".

It is not a one-off set-and-forget job. Not unless your cluster is small enough that you can pretend the hard parts do not exist. So, yes, I was anti-Kubernetes until about half a year ago.

I joined a project with someone who had swallowed the LinkedIn marketing pill. Kubernetes, 5 different vector databases, LangChain, an LLM provider I had never heard of with some weird embeddings. We already had enough friction points in the project and this was not the hill I wanted to die on. Kubernetes sounds great, my friend.

A few months in, I started appreciating it. Less Terraform than I expected. Better tooling than I expected. And, best of all, Claude/GPT models are weirdly good at Kubernetes. Not magic, and they need some hand-holding, but good enough to hold a lot of YAML-shaped nonsense in their context and figure out which commands are needed.

The most impressive part came when we had to migrate the whole stack from Azure Kubernetes Service (AKS) to AWS EKS. My rough estimate was 3 weeks for the migration. There were many namespaces, secrets were stored in Azure Key Vault and had to be moved to AWS Secrets Manager, and there was some explicit Azure Blob Storage integration with no direct S3 equivalent... The 3 weeks was definitely optimistic. A year earlier I would have estimated 3 months, but an AI assistant with access to the aws and az CLIs (with read-only permissions!) and our Terraform code made it feel achievable.

The migration took 4 days. The Kubernetes part itself was one day, but verifying which secrets were actually needed and making sure data was where it should be took two extra days. The last day was just me not believing how little time all of this took. Yeah, Kubernetes with AI agents is awesome.
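The "which secrets are actually needed" question is the kind of thing that can be checked mechanically. Here is a minimal Python sketch, assuming the workload manifests have already been loaded into dicts (for example from `kubectl get deploy,sts,cronjob -A -o json`). The function names are made up for illustration, and it only looks at the common reference points: env vars, `envFrom`, secret volumes and image pull secrets. Ingress TLS secrets and projected volumes are not covered.

```python
from typing import Iterable, Set


def pod_spec(manifest: dict) -> dict:
    """Return the pod spec of a workload manifest, or {} if it has none."""
    kind = manifest.get("kind")
    spec = manifest.get("spec", {})
    if kind == "Pod":
        return spec
    if kind == "CronJob":
        # CronJob nests the pod template one level deeper than other workloads.
        spec = spec.get("jobTemplate", {}).get("spec", {})
    return spec.get("template", {}).get("spec", {})


def referenced_secrets(manifests: Iterable[dict]) -> Set[str]:
    """Collect the names of Secrets that workloads actually reference."""
    names: Set[str] = set()
    for manifest in manifests:
        spec = pod_spec(manifest)
        for pull in spec.get("imagePullSecrets", []):
            names.add(pull["name"])
        for volume in spec.get("volumes", []):
            if "secret" in volume:
                names.add(volume["secret"]["secretName"])
        containers = spec.get("containers", []) + spec.get("initContainers", [])
        for container in containers:
            for source in container.get("envFrom", []):
                if "secretRef" in source:
                    names.add(source["secretRef"]["name"])
            for env in container.get("env", []):
                ref = env.get("valueFrom", {}).get("secretKeyRef")
                if ref:
                    names.add(ref["name"])
    return names
```

Anything in the old secret store that never shows up in this set is a candidate for "do not bother migrating", which still deserves a human look before deleting.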

There are still gotchas. AI is not going to flag everything because it does not know everything. Metrics collection across multiple pods with multiple workers. IP allow-lists. Provider-specific storage behavior. The usual little traps that become obvious five minutes after they hurt you. Still, I will take kubectl over aws and az any day.

Another example: I just migrated my personal infra from DigitalOcean to OVHcloud. The whole thing started as a pricing question and ended with a live migration. Of course it did.

The motivation was not only price. I wanted to move to a non-US cloud provider, ideally one with a presence in Canada, and OVHcloud has a region in Quebec. I had almost no experience with OVHcloud before this. I found a few UI bugs. The UX is not the most intuitive. Eh.

The cluster was not huge, but it was real. Multiple namespaces. HomeHero, family-tree, ghostfolio, projects, a few others. Postgres data. PVCs. Ingress. TLS. Monitoring. Around twenty-ish public endpoints across a handful of domains. Exactly the kind of personal infra that starts as "I just need one project online" and somehow turns into a small hosting company for myself.

The nice part was that Kubernetes made the cloud switch boring in the right places. The checklist was long, but familiar: create the OVH managed cluster with Terraform, add a node pool, install ingress-nginx and cert-manager, copy namespaces, translate DigitalOcean storage classes to OVH storage classes, recreate services, deployments, stateful sets, ingresses and cron jobs, dump Postgres from the old cluster, load it into the new one, move DNS, verify endpoints.
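The storage class translation step is a good example of how regular this work is. Below is a rough Python sketch of cleaning a PVC exported from the old cluster before applying it to the new one. The class names in the mapping (`do-block-storage`, `csi-cinder-high-speed`) are my assumption of the usual defaults, so check `kubectl get storageclass` on both sides. The function name and the list of stripped fields are illustrative, not a complete export cleaner.

```python
import copy

# Assumed class names; verify with `kubectl get storageclass` on both clusters.
STORAGE_CLASS_MAP = {"do-block-storage": "csi-cinder-high-speed"}

# Metadata filled in by the old cluster's API server; meaningless on a new one.
SERVER_SIDE_METADATA = ("uid", "resourceVersion", "creationTimestamp",
                        "managedFields", "finalizers")

# Annotation prefixes written by the volume controllers of the old cluster.
BINDING_ANNOTATION_PREFIXES = ("pv.kubernetes.io/", "volume.kubernetes.io/",
                               "volume.beta.kubernetes.io/")


def translate_pvc(pvc: dict, class_map: dict = STORAGE_CLASS_MAP) -> dict:
    """Return a copy of an exported PVC that a new cluster can provision."""
    out = copy.deepcopy(pvc)
    out.pop("status", None)

    meta = out.get("metadata", {})
    for field in SERVER_SIDE_METADATA:
        meta.pop(field, None)
    if "annotations" in meta:
        meta["annotations"] = {
            key: value for key, value in meta["annotations"].items()
            if not key.startswith(BINDING_ANNOTATION_PREFIXES)
        }

    spec = out.setdefault("spec", {})
    # volumeName and selector tie the claim to PVs that only exist in the old
    # cluster; a claim with a non-empty selector is not dynamically provisioned.
    spec.pop("volumeName", None)
    spec.pop("selector", None)

    old_class = spec.get("storageClassName")
    if old_class is not None:
        if old_class not in class_map:
            raise KeyError(f"no target storage class mapped for {old_class!r}")
        spec["storageClassName"] = class_map[old_class]
    return out
```

Failing loudly on an unmapped class is deliberate: a silent fallback to the default class is exactly the kind of thing that hurts five minutes later.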

That is a lot of work, but it is also a lot of very regular work. And regular work is where an AI agent is useful.

OpenCode with GPT held my hand through the annoying parts. OVH API tokens were painful. The first credentials looked structurally fine but failed with 403 because the consumer key did not have enough permissions. Terraform created a cluster, got interrupted, and then the remote cluster had to be imported back into state. One PVC copied from DigitalOcean still had a selector from an old manual PV setup, and OVH's CSI provisioner quite reasonably said no. A GHCR secret was stale. HomeHero's first Postgres restore failed mid-stream because COPY and schema drift did not agree with each other, so the dump had to be redone with column inserts. Grafana sidecars crashlooped because monitoring RBAC was incomplete. All normal migration nonsense.
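For the Postgres redo, the relevant switch is `pg_dump --column-inserts`, which writes every row as its own `INSERT` with explicit column names instead of one `COPY` block per table. It restores much more slowly, but a bad row no longer takes the whole table with it. The Python sketch below only builds the command. The namespace, pod, database and user values are placeholders, and it assumes `pg_dump` inside the pod can connect without a password prompt.

```python
import subprocess
from typing import List


def pg_dump_command(namespace: str, pod: str, database: str, user: str,
                    data_only: bool = False) -> List[str]:
    """Build a `kubectl exec ... pg_dump` argv that dumps rows as INSERTs."""
    cmd = [
        "kubectl", "exec", "--namespace", namespace, pod, "--",
        "pg_dump",
        f"--username={user}",
        f"--dbname={database}",
        "--column-inserts",  # one INSERT per row, with explicit column names
        "--no-owner",        # skip ownership commands; roles may differ on the target
    ]
    if data_only:
        # Useful when the target schema was already created by migrations.
        cmd.append("--data-only")
    return cmd


def dump_to_file(cmd: List[str], path: str) -> None:
    """Run the dump and stream stdout into a local file."""
    with open(path, "wb") as out:
        subprocess.run(cmd, stdout=out, check=True)
```

Something like `dump_to_file(pg_dump_command("homehero", "postgres-0", "homehero", "postgres"), "homehero.sql")` would then produce a file that can be piped into `psql` on the new cluster. Those argument values are guesses at naming, not the real ones.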

The difference was that I did not have to keep the whole tree in my head. The agent could inspect both clusters, compare objects, suggest the next command, write the Terraform changes, recover from mistakes, and then verify the public endpoints after DNS moved.

HomeHero ended up live on OVH. homehero.pro, www, api, console, adminer, Grafana - all served from the OVH ingress. The forced-IP checks matched the normal DNS checks. Postgres data was there. Monitoring was there after the RBAC fix. The old DigitalOcean cluster could finally be treated as a fallback instead of the source of truth.
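A forced-IP check means sending the request straight to the new ingress IP while still presenting the real hostname, the same idea as `curl --resolve`. Here is a small Python sketch of it, standard library only. The function names are mine, and comparing only the status code is a simplification. Because the TLS handshake still validates the certificate against the hostname, it also confirms that cert-manager issued something usable on the new side.

```python
import socket
import ssl
from typing import Optional


def parse_status(status_line: bytes) -> int:
    """Extract the numeric code from a line like b'HTTP/1.1 200 OK'."""
    parts = status_line.split()
    if len(parts) < 2 or not parts[0].startswith(b"HTTP/"):
        raise ValueError(f"not an HTTP status line: {status_line!r}")
    return int(parts[1])


def https_status(host: str, ip: Optional[str] = None, path: str = "/",
                 timeout: float = 10.0) -> int:
    """GET https://host/path and return the status code.

    If `ip` is given, connect to that address instead of resolving `host`,
    while still using `host` for SNI, certificate checks and the Host header.
    """
    context = ssl.create_default_context()
    request = (f"GET {path} HTTP/1.1\r\n"
               f"Host: {host}\r\n"
               "Connection: close\r\n\r\n").encode("ascii")
    with socket.create_connection((ip or host, 443), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(request)
            with tls.makefile("rb") as reader:
                return parse_status(reader.readline())


def forced_ip_matches_dns(host: str, ingress_ip: str, path: str = "/") -> bool:
    """True when the forced-IP answer equals the answer via normal DNS."""
    return https_status(host, ip=ingress_ip, path=path) == https_status(host, path=path)
```

Before DNS moves, only the forced-IP half is meaningful. After DNS moves, both halves should agree for every public endpoint.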

This is the Kubernetes promise I used to roll my eyes at. Cloud agnostic. Portable. Same manifests, different provider.

It is still not free. You pay the complexity tax up front and then you keep paying a smaller tax forever. But with AI agents the tax feels different. Less like memorizing provider-specific CLIs. More like supervising a very patient junior engineer who has read all the docs and is willing to type YAML until the heat death of the universe. That is a good trade for me.

The reason I started using Kubernetes for personal infra was simple: I wanted to add a new project and expose it through a subdomain without thinking too much. With Kubernetes this is easy. When I used Claude Code I was even prototyping from my phone and making things public in no time. Now I use a terminal on the phone and an OpenCode session, which is objectively a little silly and also works surprisingly well.

So I am no longer anti-Kubernetes. I am anti-pretending-Kubernetes-is-simple. Different thing.