k8s-clean-cluster

$ npx mdskill add kurtosis-tech/kurtosis/k8s-clean-cluster

Force-delete all Kurtosis Kubernetes resources when cleanup hangs.

  • Removes orphaned namespaces, pods, and cluster roles.
  • Requires kubectl access to the target Kubernetes cluster.
  • Executes commands via shell scripts to purge resources.
  • Outputs command results directly to the terminal.

SKILL.md

.github/skills/k8s-clean-cluster
---
name: k8s-clean-cluster
description: Force-clean all Kurtosis resources from a Kubernetes cluster when kurtosis clean hangs or fails. Removes all kurtosis namespaces, pods, daemonsets, cluster roles, and cluster role bindings. Use when kurtosis clean -a hangs or leaves behind orphaned resources.
compatibility: Requires kubectl with cluster access.
metadata:
  author: ethpandaops
  version: "1.0"
---

# K8s Clean Cluster

Force-clean all Kurtosis resources from a Kubernetes cluster when the normal `kurtosis clean -a` command hangs or fails.

## When to use

- `kurtosis clean -a` hangs for more than a few minutes
- Orphaned kurtosis namespaces remain after a failed clean
- `remove-dir-pod-*` pods are stuck in Pending state
- Engine start fails because old resources exist

## Steps

### 1. Kill any running kurtosis processes

```bash
pkill -f "kurtosis gateway" 2>/dev/null
pkill -f "kurtosis clean" 2>/dev/null
```

### 2. Stop the engine gracefully (if possible)

```bash
kurtosis engine stop || true
```

### 3. Delete all kurtosis namespaces

```bash
# List them first
kubectl get ns | grep kurtosis

# Delete all kurtosis namespaces (engine, enclaves, logs)
kubectl get ns | grep kurtosis | awk '{print $1}' | xargs -r kubectl delete ns --force --grace-period=0
```
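
A forced namespace delete can return before the namespace is actually gone, so it may help to poll until the list is empty. The following is a minimal sketch, not part of Kurtosis: `wait_for_ns_gone` and its default timeout and interval values are assumptions chosen for illustration.

```bash
# Hypothetical helper: poll until no kurtosis namespaces remain or the timeout expires.
# Usage: wait_for_ns_gone [timeout_seconds] [interval_seconds]
wait_for_ns_gone() {
  wfng_timeout=${1:-120}   # assumed default: give up after 2 minutes
  wfng_interval=${2:-5}    # assumed default: check every 5 seconds
  wfng_elapsed=0
  while kubectl get ns --no-headers 2>/dev/null | grep -q kurtosis; do
    if [ "$wfng_elapsed" -ge "$wfng_timeout" ]; then
      echo "kurtosis namespaces still present after ${wfng_timeout}s" >&2
      return 1
    fi
    sleep "$wfng_interval"
    wfng_elapsed=$((wfng_elapsed + wfng_interval))
  done
  return 0
}
```

For example, `wait_for_ns_gone 180 10` waits up to three minutes. A non-zero return means at least one namespace is still present, typically stuck in Terminating, and needs a closer look before moving on.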

### 4. Clean up cluster-scoped resources

```bash
# Delete kurtosis cluster roles
kubectl get clusterrole | grep kurtosis | awk '{print $1}' | xargs -r kubectl delete clusterrole

# Delete kurtosis cluster role bindings
kubectl get clusterrolebinding | grep kurtosis | awk '{print $1}' | xargs -r kubectl delete clusterrolebinding
```

### 5. Clean up stuck pods

```bash
# Force-delete any remaining kurtosis pods (-r skips the delete when nothing matches)
kubectl get pods -A | grep kurtosis | awk '{print $2 " -n " $1}' | xargs -r -L1 kubectl delete pod --force --grace-period=0

# Clean up evicted pods (note: this matches evicted pods in every namespace, not only Kurtosis ones)
kubectl get pods -A | grep Evicted | awk '{print $2 " -n " $1}' | xargs -r -L1 kubectl delete pod --force --grace-period=0
```

### 6. Verify clean state

```bash
kubectl get ns | grep kurtosis
kubectl get pods -A | grep kurtosis
kubectl get ds -A | grep kurtosis
```

All three commands should return empty results.
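
For scripted use, the same checks can be wrapped in a function that returns a non-zero status when anything is left. This is a sketch under assumptions: the name `kurtosis_leftovers` is hypothetical, and it also checks cluster roles and cluster role bindings, which the commands above do not cover.

```bash
# Hypothetical helper: report any Kurtosis resources that remain.
# Returns 0 when nothing matching "kurtosis" is found, 1 otherwise.
kurtosis_leftovers() {
  kl_status=0
  for kl_kind in "ns" "pods -A" "ds -A" "clusterrole" "clusterrolebinding"; do
    # $kl_kind is intentionally unquoted so "pods -A" splits into two arguments.
    kl_found=$(kubectl get $kl_kind --no-headers 2>/dev/null | grep kurtosis || true)
    if [ -n "$kl_found" ]; then
      echo "Remaining ($kl_kind):"
      echo "$kl_found"
      kl_status=1
    fi
  done
  return "$kl_status"
}
```

A typical use would be `kurtosis_leftovers && kurtosis engine start`, so the engine is only restarted once the cluster is clean.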

### 7. Restart

```bash
kurtosis engine start
kurtosis gateway &
```

## Why clean hangs

The most common cause is the `Clean` method of the Fluent Bit logs collector, which:
1. Evicts all DaemonSet pods by adding a non-existent node selector
2. Waits for each pod to terminate (up to 5 min per pod, sequentially)
3. Creates `remove-dir-pod` cleanup pods targeted at each node
4. Leaves cleanup pods on tainted or unhealthy nodes stuck in Pending, because they can never be scheduled there

The fix in the codebase makes these operations best-effort with timeouts and detects unschedulable pods early, but if you are running an unfixed version, manual cleanup is needed.
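
To confirm that tainted or unhealthy nodes are behind a hang, the node taints and the scheduling events of a Pending pod can be inspected with standard `kubectl` commands. The two function names below are hypothetical, and the sketch assumes the usual `kubectl describe` layout in which the `Events:` section comes last.

```bash
# Hypothetical helper: print each node with the keys of its taints.
node_taints() {
  kubectl get nodes -o custom-columns='NODE:.metadata.name,TAINTS:.spec.taints[*].key'
}

# Hypothetical helper: print only the Events section for one pod.
# Usage: pending_reason <namespace> <pod-name>
pending_reason() {
  kubectl describe pod "$2" -n "$1" | sed -n '/^Events:/,$p'
}
```

A `FailedScheduling` event that mentions an untolerated taint on a `remove-dir-pod` pod would be consistent with the failure mode described above.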
