🚀 Kubernetes Services – Troubleshooting Rules
🔹 1. No Endpoints Assigned
- Service is created, but:
  - Pods don't have labels matching the Service's selector.
  - Or the Service selector itself is wrong.
- Fix: Ensure spec.selector matches the Pod labels.
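A minimal sketch of the rule, using made-up names (web, web-svc) just for illustration:
apiVersion: v1
kind: Pod
metadata:
  name: web
  labels:
    app: web            # label on the Pod
spec:
  containers:
  - name: nginx
    image: nginx
---
apiVersion: v1
kind: Service
metadata:
  name: web-svc
spec:
  selector:
    app: web            # must match the Pod label above, or Endpoints stay empty
  ports:
  - port: 80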
🔹 2. Wrong Port Configuration
- Endpoints are present, but the Service doesn't forward traffic because:
  - Wrong port number (e.g., using 8080 instead of the Pod's 80).
  - Wrong protocol (e.g., UDP instead of TCP).
- Fix: Align the Service port/targetPort with the Pod's containerPort.
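A sketch of the alignment (illustrative names, Pod container listening on 80):
# Pod side
containers:
- name: nginx
  image: nginx
  ports:
  - containerPort: 80
# Service side
ports:
- port: 80          # port clients use on the Service
  targetPort: 80    # must match the containerPort
  protocol: TCP     # must match the protocol the app actually speaks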
🔹 3. Wrong NodePort Configuration
- Sometimes the question asks you to click a button in the terminal; the URL it opens reveals the expected port, and that port is misconfigured in the Service manifest. A sketch of where each port lives is shown below.
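A minimal NodePort Service sketch (all names and values are illustrative), showing the three ports involved:
apiVersion: v1
kind: Service
metadata:
  name: web-nodeport
spec:
  type: NodePort
  selector:
    app: web
  ports:
  - port: 80          # Service (ClusterIP) port
    targetPort: 80    # Pod containerPort
    nodePort: 30080   # must be in 30000–32767 and match what the task expects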
Q1: Pod doesn't have the label the Service selector expects
controlplane:~$ kubectl port-forward svc/nginx-service 8080:80
^Ccontrolplane:~$ k describe po nginx-pod
Name: nginx-pod
Namespace: default
Priority: 0
Service Account: default
Node: node01/172.30.2.2
Start Time: Wed, 27 Aug 2025 22:25:03 +0000
Labels: <none>
controlplane:~$ k describe svc nginx-service
Name: nginx-service
Selector: app=nginx-pod
IP: 10.99.175.81
IPs: 10.99.175.81
Port: <unset> 80/TCP
TargetPort: 80/TCP
Endpoints:
controlplane:~$ k label po nginx-pod app=nginx-pod
pod/nginx-pod labeled
controlplane:~$ kubectl port-forward svc/nginx-service 8080:80
Forwarding from 127.0.0.1:8080 -> 80
Forwarding from [::1]:8080 -> 80
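To confirm the fix, you can also check that the Service now has an endpoint (the Pod IP should appear):
kubectl get endpoints nginx-service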
Q2: No endpoints assigned; the Service selector has a typo (letter O instead of digit 0)
cluster1-controlplane ~ ➜ k describe svc curlme-cka01-svcn
Name: curlme-cka01-svcn
Namespace: default
Selector: run=curlme-ckaO1-svcn # 0 not O
Type: ClusterIP
IP: 172.20.218.134
IPs: 172.20.218.134
Port: <unset> 80/TCP
TargetPort: 80/TCP
Endpoints:
Events: <none>
cluster1-controlplane ~ ➜ k get po -o wide curlme-cka01-svcn curlpod-cka01-svcn --show-labels
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES LABELS
curlme-cka01-svcn 1/1 Running 0 4m4s 172.17.1.11 cluster1-node01 <none> <none> run=curlme-cka01-svcn
curlpod-cka01-svcn 1/1 Running 0 4m4s 172.17.3.12 cluster1-node02 <none> <none> run=curlpod-cka01-svcn
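One way to fix the selector typo without opening an editor is a merge patch (kubectl edit svc curlme-cka01-svcn works just as well):
kubectl patch svc curlme-cka01-svcn -p '{"spec":{"selector":{"run":"curlme-cka01-svcn"}}}'
kubectl get endpoints curlme-cka01-svcn   # should now show the Pod IP, 172.17.1.11:80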
Q3: Endpoints are assigned, but the port is wrong
cluster1-controlplane ~ ➜ k get po purple-app-cka27-trb -o yaml | grep -i image:
- image: nginx
image: docker.io/library/nginx:latest
cluster1-controlplane ~ ➜ k get po purple-app-cka27-trb -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
purple-app-cka27-trb 1/1 Running 0 32m 172.17.1.22 cluster1-node01 <none> <none>
cluster1-controlplane ~ ➜ k describe svc purple-svc-cka27-trb
Name: purple-svc-cka27-trb
Namespace: default
Labels: <none>
Annotations: <none>
Selector: app=purple-app-cka27-trb
Type: ClusterIP
IP Family Policy: SingleStack
IP Families: IPv4
IP: 172.20.10.168
IPs: 172.20.10.168
Port: app 8080/TCP # set it to 80
TargetPort: 8080/TCP # set it to 80
Endpoints: 172.17.1.22:8080
Session Affinity: None
Internal Traffic Policy: Cluster
Events: <none>
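A sketch of the fix as a patch (kubectl edit svc purple-svc-cka27-trb and setting both values to 80 is equivalent):
kubectl patch svc purple-svc-cka27-trb --type=json \
  -p '[{"op":"replace","path":"/spec/ports/0/port","value":80},{"op":"replace","path":"/spec/ports/0/targetPort","value":80}]'
kubectl get endpoints purple-svc-cka27-trb   # should now show 172.17.1.22:80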
Q4: Wrong Protocol
A pod called pink-pod-cka16-trb is created in the default namespace in cluster4. The app runs on port tcp/5000 and must be exposed to end users via an ingress resource called pink-ing-cka16-trb, so that it becomes accessible with curl http://kodekloud-pink.app on the cluster4-controlplane host. There is an ingress.yaml file under the root folder on cluster4-controlplane; create the ingress resource from it and continue with the task.
However, even after creating the ingress resource, it does not work. Troubleshoot and fix the issue, making any necessary changes to the objects.
cluster4-controlplane ~ ➜ k get po pink-pod-cka16-trb -o yaml
apiVersion: v1
kind: Pod
metadata:
  name: pink-pod-cka16-trb
spec:
  containers:
  - ports:
    - containerPort: 80
      protocol: TCP
cluster4-controlplane ~ ➜ k describe svc pink-svc-cka16-trb
Name: pink-svc-cka16-trb
Namespace: default
Labels: <none>
Annotations: <none>
Selector: app=pink-app-cka16-trb
Type: ClusterIP
IP Family Policy: SingleStack
IP Families: IPv4
IP: 172.20.39.102
IPs: 172.20.39.102
Port: <unset> 5000/UDP # wrong protocol
TargetPort: 80/UDP
Endpoints: 172.17.1.3:80
Session Affinity: None
Internal Traffic Policy: Cluster
Events: <none>
cluster4-controlplane ~ ➜ k edit svc pink-svc-cka16-trb # just change the protocol to TCP
service/pink-svc-cka16-trb edited
cluster4-controlplane ~ ➜ cat ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: pink-ing-cka16-trb
spec:
  ingressClassName: nginx
  rules:
  - host: kodekloud-pink.app
    http:
      paths:
      - pathType: Prefix
        path: /
        backend:
          service:
            name: pink-svc-cka16-trb
            port:
              number: 5000   # the Ingress only references the Service port number; it never carries a protocol
cluster4-controlplane ~ ➜ k apply -f ingress.yaml
ingress.networking.k8s.io/pink-ing-cka16-trb created
cluster4-controlplane ~ ➜ curl http://kodekloud-pink.app
<title>Welcome to nginx!</title>
<h1>Welcome to nginx!</h1>
<p><em>Thank you for using nginx.</em></p>
🔎 What you actually have
Pod (pink-pod-cka16-trb):
  containers:
  - ports:
    - containerPort: 80      # ✅ Pod is listening on 80
Service (pink-svc-cka16-trb):
  Port:        5000/UDP      # ❌ wrong, it should be TCP
  TargetPort:  80/UDP        # points to the Pod's port (80), but also marked UDP
  Endpoints:   172.17.1.3:80
🚨 What went wrong
- The exam question said "the app runs on port 5000", which suggests the Pod's containerPort is 5000.
- But the Pod YAML actually shows containerPort: 80, so the app is really listening on 80, not 5000.
- The Service was created with the wrong protocol: port 5000/UDP and targetPort 80/UDP.
- Fix: switch the Service protocol to TCP. The Service port can stay at 5000 (the Ingress backend references it), and targetPort 80 already matches the Pod's containerPort.
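The relevant part of the corrected Service spec, as a sketch (only the protocol changed, exactly as in the transcript above):
spec:
  selector:
    app: pink-app-cka16-trb
  ports:
  - port: 5000        # what the Ingress backend references
    targetPort: 80    # the Pod's actual containerPort
    protocol: TCP     # was UDP; this was the bug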
Q5: Both containers in a Pod try to listen on the same port.
There is a deployment with two containers; one runs fine while the other keeps restarting.
k describe deployments.apps -n management collect-data # no clue
k describe po -n management collect-data-5759c5c888-gvf2z
Warning BackOff 14s (x13 over 2m35s) kubelet Back-off restarting failed container httpd in pod collect-data-5759c5c888-gvf2z_management(9d91ca38-197d-48fc-8916-d22e54cd899b)
controlplane:~$ k logs -n management deploy/collect-data -c nginx # all good
controlplane:~$ k logs -n management deploy/collect-data -c httpd
Found 2 pods, using pod/collect-data-5759c5c888-gvf2z
AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 192.168.0.7. Set the 'ServerName' directive globally to suppress this message
(98)Address in use: AH00072: make_sock: could not bind to address [::]:80
(98)Address in use: AH00072: make_sock: could not bind to address 0.0.0.0:80
no listening sockets available, shutting down
AH00015: Unable to open logs
The issue is that both containers run a process that wants to listen on port 80. Depending on container start order and timing, the first one binds the port and the other fails.
Solution: remove one of the containers (see the patch sketch after the transcript below).
controlplane:~$ k get deploy -n management
NAME READY UP-TO-DATE AVAILABLE AGE
collect-data 0/2 2 0 29m
controlplane:~$ k edit deploy -n management collect-data
deployment.apps/collect-data edited
controlplane:~$ k rollout restart deployment -n management collect-data
deployment.apps/collect-data restarted
controlplane:~$ k get deploy -n management collect-data
NAME READY UP-TO-DATE AVAILABLE AGE
collect-data 2/2 2 2 31m
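One way to do the edit non-interactively, assuming httpd is the second entry in the containers list (index 1); the patch itself triggers a new rollout:
kubectl -n management patch deployment collect-data --type=json \
  -p '[{"op":"remove","path":"/spec/template/spec/containers/1"}]'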
Q6: NodePort already allocated
The error is pretty clear:
spec.ports[0].nodePort: Invalid value: 32345: provided port is already allocated
That means another Service in your cluster is already using NodePort 32345, and Kubernetes won’t allow duplicates.
🔧 Fix options:
- Check which Service is already using that NodePort:
kubectl get svc -A | grep 32345
This will show you the service that already has 32345 assigned.
- Pick a different NodePort in the range 30000–32767. For example, edit your YAML and change:
ports:
- port: 80
  targetPort: 80
  nodePort: 32346   # change this
type: NodePort
- Reapply the Service:
kubectl apply -f /tmp/kubectl-edit-76677757.yaml
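If grep feels fragile, a jsonpath query can list every allocated NodePort across namespaces (a sketch; services without a NodePort print blank):
kubectl get svc -A -o jsonpath='{range .items[*]}{.metadata.namespace}/{.metadata.name}{"\t"}{.spec.ports[*].nodePort}{"\n"}{end}'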
Q7: Pods stuck in ContainerCreating due to a CNI (Weave Net) failure
root@controlplane ~ ➜ k describe po -n triton webapp-mysql-7bd5857746-hrnnn
Name: webapp-mysql-7bd5857746-hrnnn
Namespace: triton
Priority: 0
Service Account: default
Node: controlplane/192.168.121.159
Start Time: Tue, 07 Oct 2025 20:30:39 +0000
Labels: name=webapp-mysql
pod-template-hash=7bd5857746
Annotations: <none>
Status: Pending
IP:
IPs: <none>
Controlled By: ReplicaSet/webapp-mysql-7bd5857746
Containers:
webapp-mysql:
Container ID:
Image: mmumshad/simple-webapp-mysql
Image ID:
Port: 8080/TCP
Host Port: 0/TCP
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Environment:
DB_Host: mysql
DB_User: root
DB_Password: paswrd
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-9xjnx (ro)
Conditions:
Type Status
PodReadyToStartContainers False
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
kube-api-access-9xjnx:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
Optional: false
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 5m27s default-scheduler Successfully assigned triton/webapp-mysql-7bd5857746-hrnnn to controlplane
Warning FailedCreatePodSandBox 5m27s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "b1f2d3d14c84321dd0b5ff7041449db641159f6676d660f742bce390b1b5cd9d": plugin type="weave-net" name="weave" failed (add): unable to allocate IP address: Post "http://127.0.0.1:6784/ip/b1f2d3d14c84321dd0b5ff7041449db641159f6676d660f742bce390b1b5cd9d": dial tcp 127.0.0.1:6784: connect: connection refused
root@controlplane ~ ➜ ls /opt/cni/bin/
bandwidth dhcp firewall host-local LICENSE macvlan ptp sbr tap vlan weave-ipam weave-plugin-2.8.1
bridge dummy host-device ipvlan loopback portmap README.md static tuning vrf weave-net
root@controlplane ~ ➜ ls /etc/cni/net.d/
10-weave.conflist
root@controlplane ~ ➜ k get no
NAME STATUS ROLES AGE VERSION
controlplane Ready control-plane 32m v1.33.0
root@controlplane ~ ➜ k get po -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-674b8bbfcf-2gsl6 1/1 Running 0 34m
kube-system coredns-674b8bbfcf-57q4d 1/1 Running 0 34m
kube-system etcd-controlplane 1/1 Running 0 34m
kube-system kube-apiserver-controlplane 1/1 Running 0 34m
kube-system kube-controller-manager-controlplane 1/1 Running 0 34m
kube-system kube-proxy-g9h2x 1/1 Running 0 34m
kube-system kube-scheduler-controlplane 1/1 Running 0 34m
triton mysql 0/1 ContainerCreating 0 8m40s
triton webapp-mysql-7bd5857746-hrnnn 0/1 ContainerCreating 0 8m40s
root@controlplane ~ ➜ kubectl apply -f https://github.com/weaveworks/weave/releases/download/v2.8.1/weave-daemonset-k8s.yaml
serviceaccount/weave-net created
clusterrole.rbac.authorization.k8s.io/weave-net created
clusterrolebinding.rbac.authorization.k8s.io/weave-net created
role.rbac.authorization.k8s.io/weave-net created
rolebinding.rbac.authorization.k8s.io/weave-net created
daemonset.apps/weave-net created
root@controlplane ~ ➜ k get po -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-674b8bbfcf-2gsl6 1/1 Running 0 36m
kube-system coredns-674b8bbfcf-57q4d 1/1 Running 0 36m
kube-system etcd-controlplane 1/1 Running 0 36m
kube-system kube-apiserver-controlplane 1/1 Running 0 36m
kube-system kube-controller-manager-controlplane 1/1 Running 0 36m
kube-system kube-proxy-g9h2x 1/1 Running 0 36m
kube-system kube-scheduler-controlplane 1/1 Running 0 36m
kube-system weave-net-kv8dz 2/2 Running 0 16s
triton mysql 1/1 Running 0 11m
triton webapp-mysql-7bd5857746-hrnnn 1/1 Running 0 11m
root@controlplane ~ ➜
This is a CNI failure case, and the data gathered above is exactly what's needed.
Let's break it down and solve it step by step 👇
🧩 The core error:
Failed to create pod sandbox:
rpc error: code = Unknown desc = failed to setup network for sandbox ...
plugin type="weave-net" name="weave" failed (add):
unable to allocate IP address: Post "http://127.0.0.1:6784/ip/...": dial tcp 127.0.0.1:6784: connect: connection refused
This line tells the full story: the kubelet is trying to contact the Weave Net CNI plugin at 127.0.0.1:6784, but the Weave Net daemon isn't running on the node (or it has crashed).
🧠 Why this happens
Kubernetes relies on a CNI plugin to handle Pod networking (assigning IPs, connecting Pods, etc.). Weave Net runs as a DaemonSet, meaning one pod per node.
If the Weave pod isn't running or keeps failing, no Pod on that node can start, because the network setup step fails. That is exactly why the pods stay in ContainerCreating / Pending.
🧾 Evidence
- ✅ CNI binaries exist: /opt/cni/bin/weave-net and /opt/cni/bin/weave-ipam
- ✅ Config exists: /etc/cni/net.d/10-weave.conflist
- ❌ But the Weave Net DaemonSet isn't functioning (port 6784 unreachable).
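To confirm the diagnosis before (re)applying the manifest, check whether the DaemonSet exists at all and, if it does, what its pods are logging (the container name weave comes from the upstream Weave Net manifest; adjust if your manifest differs):
kubectl get ds -n kube-system weave-net
kubectl -n kube-system logs ds/weave-net -c weave --tail=20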