Kubernetes Network Debugging for Application Developers
Common Kubernetes networking issues that affect your applications, and how to diagnose them without deep Kubernetes knowledge.
The Developer's Dilemma
When your application works perfectly in development but fails in Kubernetes, networking issues are often the culprit. This guide helps you identify and resolve common Kubernetes networking problems without needing deep Kubernetes expertise.
Understanding Kubernetes Networking
Before diving into debugging, let's understand how Kubernetes networking works at a high level:
Kubernetes Networking Components
Pod Networking
Each pod gets its own IP address and, by default, can communicate directly with any other pod in the cluster without NAT.
Service Networking
Services give a set of pods a stable virtual IP and DNS name, and load-balance traffic across the pods behind them.
Ingress Controllers
Ingress controllers manage external access to services, typically HTTP/HTTPS routing
Network Policies
Network policies control traffic flow between pods, implementing security boundaries
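To make the Service piece concrete: inside the cluster, a Service is normally reachable at a DNS name of the form <service>.<namespace>.svc.<cluster-domain>. The sketch below builds that name. The helper name service_fqdn is made up for illustration, and cluster.local is only the common default cluster domain, so your cluster may use a different one.

```shell
# service_fqdn SERVICE [NAMESPACE] [CLUSTER_DOMAIN]
# Prints the fully qualified in-cluster DNS name for a Service.
# Assumptions: namespace defaults to "default", cluster domain to "cluster.local".
service_fqdn() {
  _svc="${1:-}"
  _ns="${2:-default}"
  _domain="${3:-cluster.local}"
  if [ -z "$_svc" ]; then
    echo "usage: service_fqdn SERVICE [NAMESPACE] [CLUSTER_DOMAIN]" >&2
    return 1
  fi
  printf '%s.%s.svc.%s\n' "$_svc" "$_ns" "$_domain"
}
```

For a hypothetical Service named orders in namespace shop, service_fqdn orders shop prints orders.shop.svc.cluster.local. Pods in the same namespace can usually use the short name orders, while pods in other namespaces need at least orders.shop.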
Common Kubernetes Networking Issues
Here are the most common networking problems developers encounter in Kubernetes:
Connectivity Issues
Pods cannot reach other pods in the same or different namespaces
Applications cannot connect to services via their DNS names
Unable to reach services from outside the cluster
Performance Issues
Network requests between services are significantly slower than expected
Connections sometimes work and sometimes fail without a clear pattern
Network throughput is lower than expected for data-intensive operations
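For slow or intermittent requests, it helps to sample the same URL several times and see where the time goes. The sketch below uses curl's --write-out timing variables to separate DNS lookup time from TCP connect time and total time, and counts failed attempts. The function name sample_requests, the default of 10 attempts, and the 5-second limit are arbitrary choices for illustration.

```shell
# sample_requests URL [COUNT]
# Sends COUNT requests (default 10) and prints, per request, the DNS lookup,
# TCP connect, and total times in seconds, followed by a failure count.
sample_requests() {
  _url="${1:-}"
  _count="${2:-10}"
  if [ -z "$_url" ]; then
    echo "usage: sample_requests URL [COUNT]" >&2
    return 2
  fi
  _failed=0
  _i=1
  while [ "$_i" -le "$_count" ]; do
    if _line=$(curl -s -o /dev/null --max-time 5 \
        -w 'dns=%{time_namelookup} connect=%{time_connect} total=%{time_total}' \
        "$_url"); then
      echo "request $_i: $_line"
    else
      # $? here is the exit status of the failed curl call
      echo "request $_i: FAILED (curl exit $?)"
      _failed=$((_failed + 1))
    fi
    _i=$((_i + 1))
  done
  echo "failed: $_failed/$_count"
}
```

Run it from a shell inside the affected pod, for example sample_requests http://<service-name>:<port> 20. A large dns value points at name resolution, while a large gap between connect and total points at the service itself.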
Real-World Case Study: The Mysterious Database Connection Failure
A development team deployed their application to Kubernetes but could not connect to their database:
Symptoms
Application Logs
- "Failed to connect to database: connection timed out"
- "Database connection pool exhausted"
- "Unable to resolve database host"
Environment Details
- Application running in Kubernetes cluster
- Database hosted outside cluster (AWS RDS)
- Database accessible from developer laptops
- Same configuration worked in previous environment
Investigation Process
Used kubectl exec to run network commands inside the application pod
Discovered that DNS resolution was failing for external domains
Found that a default-deny network policy was blocking all egress traffic from the pod, which stops both DNS lookups and connections to the external database
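The case study does not say how the team resolved the problem. One common remedy in this situation is to add a policy that allows only the egress the application needs, since allow rules are additive on top of a default-deny policy. The sketch below prints such a manifest. The function name, the policy name allow-dns-and-db-egress, and the three arguments (an app label, a database CIDR, and a database port) are all placeholders you would need to supply, not details from the case study.

```shell
# egress_policy_manifest APP_LABEL DB_CIDR DB_PORT
# Prints a NetworkPolicy that lets pods labelled app=APP_LABEL send DNS queries
# (port 53, UDP and TCP) and open TCP connections to DB_CIDR on DB_PORT.
# All three values are assumptions supplied by the caller.
egress_policy_manifest() {
  if [ "$#" -ne 3 ]; then
    echo "usage: egress_policy_manifest APP_LABEL DB_CIDR DB_PORT" >&2
    return 1
  fi
  cat <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-and-db-egress
spec:
  podSelector:
    matchLabels:
      app: $1
  policyTypes:
    - Egress
  egress:
    - ports:            # DNS to any destination
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
    - to:               # database outside the cluster
        - ipBlock:
            cidr: $2
      ports:
        - protocol: TCP
          port: $3
EOF
}
```

With made-up values, egress_policy_manifest my-app 10.0.12.0/24 5432 | kubectl apply -n <your-namespace> -f - would apply it. Because ipBlock matches IP ranges rather than hostnames, the CIDR should cover the subnets the database can live in, not a single resolved address.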
Debugging Kubernetes Networking Issues
Follow this systematic approach to identify and resolve Kubernetes networking problems:
Check Basic Pod Connectivity
Start by testing basic network connectivity from within your pods:
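# Open a shell inside the pod
kubectl exec -it <pod-name> -- /bin/sh
# Test basic connectivity (ping may be missing from minimal images, and some networks drop ICMP)
ping -c 3 8.8.8.8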
# Test DNS resolution
nslookup kubernetes.default
# Test connectivity to a service
curl -v http://<service-name>:<port>
# Test external connectivity
curl -v https://google.com
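When one of the curl tests above fails, its exit status already tells you which layer to look at. The helper below is a small sketch that maps the most common curl exit codes to a hint. The function name explain_curl_exit and the wording of the hints are illustrative, and only a handful of codes are covered.

```shell
# explain_curl_exit CODE
# Maps a curl exit status to the network layer that most likely failed.
explain_curl_exit() {
  case "${1:-}" in
    0)     echo "ok: request completed" ;;
    6)     echo "dns: could not resolve host" ;;
    7)     echo "connect: host resolved but the TCP connection failed" ;;
    28)    echo "timeout: no response in time (often silently dropped packets)" ;;
    35|60) echo "tls: handshake or certificate problem" ;;
    *)     echo "other: see the EXIT CODES section of the curl manual" ;;
  esac
}
```

A typical use is to run the request and pass on its status: curl -s -o /dev/null --max-time 5 http://<service-name>:<port>; explain_curl_exit $?. A dns result suggests checking cluster DNS and egress rules for port 53, while a timeout with working DNS often points at a network policy or firewall.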
Examine Services and Endpoints
Verify that your services are correctly configured and pointing to the right pods:
# Show the service's type, cluster IP, ports, and selector
kubectl get service <service-name> -o wide
# Check endpoints for the service
kubectl get endpoints <service-name>
# Describe service for detailed info
kubectl describe service <service-name>
# Check which pods match the service's label selector (for example app=my-app)
kubectl get pods -l <label-selector> --show-labels
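A Service with no endpoints accepts connections to its name but has nowhere to send them, so this check is worth scripting. The sketch below reads the ready pod IPs from the Endpoints object with a jsonpath query and fails when the list is empty. The function name service_has_endpoints is hypothetical, and it assumes kubectl is already configured for your cluster.

```shell
# service_has_endpoints SERVICE [NAMESPACE]
# Prints the ready pod IPs behind a Service; returns 1 if there are none
# and 2 on usage or kubectl errors.
service_has_endpoints() {
  _svc="${1:-}"
  _ns="${2:-default}"
  if [ -z "$_svc" ]; then
    echo "usage: service_has_endpoints SERVICE [NAMESPACE]" >&2
    return 2
  fi
  _ips=$(kubectl get endpoints "$_svc" -n "$_ns" \
    -o jsonpath='{.subsets[*].addresses[*].ip}') || return 2
  if [ -z "$_ips" ]; then
    echo "service $_svc has no ready endpoints: compare its selector with pod labels and check pod readiness"
    return 1
  fi
  echo "service $_svc -> $_ips"
}
```

An empty result usually means either the selector does not match any pod labels or the matching pods are failing their readiness checks.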
Review Network Policies
Check if network policies are blocking traffic:
# List network policies in every namespace
kubectl get networkpolicies --all-namespaces
# Describe a specific network policy
kubectl describe networkpolicy <policy-name> -n <namespace>
# Check policies in your namespace
kubectl get networkpolicies -n <your-namespace>
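# Temporarily remove a policy for testing (non-production only); save a copy first
kubectl get networkpolicy <policy-name> -n <namespace> -o yaml > <policy-name>.yaml
kubectl delete networkpolicy <policy-name> -n <namespace>
# Restore the policy when the test is done
kubectl apply -f <policy-name>.yaml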
Inspect Ingress Configuration
If external access is failing, examine your ingress resources:
# List ingress resources in the current namespace
kubectl get ingress
# Describe ingress for detailed info
kubectl describe ingress <ingress-name>
# Check ingress controller logs
kubectl logs -n <ingress-namespace> -l app=<ingress-controller-label>
# Test ingress endpoint
curl -H "Host: <your-host>" http://<ingress-controller-ip>
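The status code returned by that last curl test narrows down where an ingress problem sits. The sketch below wraps the request and attaches a hint to the result. The function name check_ingress is made up, and the hints reflect typical behavior of common controllers rather than a guarantee, so treat them as starting points.

```shell
# check_ingress HOST INGRESS_IP [PATH]
# Requests PATH (default /) through the ingress controller using the given Host
# header and prints the HTTP status code followed by a hint.
check_ingress() {
  _host="${1:-}"
  _ip="${2:-}"
  _path="${3:-/}"
  if [ -z "$_host" ] || [ -z "$_ip" ]; then
    echo "usage: check_ingress HOST INGRESS_IP [PATH]" >&2
    return 2
  fi
  # curl prints 000 as the status code when it gets no HTTP response at all
  _code=$(curl -s -o /dev/null --max-time 5 -w '%{http_code}' \
    -H "Host: $_host" "http://$_ip$_path") || true
  case "$_code" in
    000)         _hint="no HTTP response: controller IP unreachable or port closed" ;;
    404)         _hint="no rule may have matched: check host and path in the Ingress" ;;
    502|503|504) _hint="backend failed: check the Service and its endpoints" ;;
    *)           _hint="a response came back through the ingress" ;;
  esac
  echo "$_code $_hint"
}
```

For example, check_ingress <your-host> <ingress-controller-ip> /health. Keep in mind that a 404 can also come from the application itself, so compare it with the ingress controller logs before changing rules.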
Analyze with Network Tools
Use specialized tools for deeper network analysis:
Network Debugging Pod
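# Start a temporary debug pod (deleted automatically when the shell exits)
kubectl run debug --image=nixery.dev/shell/curl/telnet/dig/netcat/tcpdump --restart=Never --rm -it -- sh
# Run network diagnostics (each tool must be present in the debug image)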
ip route
netstat -tuln
traceroute <target>
Packet Capture
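# Capture packets inside the pod (requires tcpdump in the container image and permission to capture)
kubectl exec <pod-name> -- tcpdump -i any -w /tmp/capture.pcap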
# Copy capture file to local
kubectl cp <namespace>/<pod-name>:/tmp/capture.pcap ./capture.pcap
# Analyze with Wireshark or whisperly
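An open-ended capture keeps running until someone stops it, which is easy to forget inside a pod. The sketch below bounds the capture with tcpdump's -c packet count and then copies the file out. The function name capture_pod_traffic, the default of 500 packets, and the local file name are illustrative choices. It assumes the container image includes both tcpdump and tar, which kubectl cp relies on.

```shell
# capture_pod_traffic POD [NAMESPACE] [PACKETS]
# Captures PACKETS packets (default 500) inside the pod, then copies the
# capture file to ./POD.pcap on the local machine.
capture_pod_traffic() {
  _pod="${1:-}"
  _ns="${2:-default}"
  _packets="${3:-500}"
  _remote="/tmp/capture.pcap"
  if [ -z "$_pod" ]; then
    echo "usage: capture_pod_traffic POD [NAMESPACE] [PACKETS]" >&2
    return 2
  fi
  # -c makes tcpdump exit on its own after the given number of packets
  kubectl exec -n "$_ns" "$_pod" -- \
    tcpdump -i any -c "$_packets" -w "$_remote" || return 1
  kubectl cp "$_ns/$_pod:$_remote" "./$_pod.pcap" || return 1
  echo "saved ./$_pod.pcap"
}
```

Reproduce the failing request while the capture is running, otherwise the packet limit may be reached by unrelated traffic.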
Prevention Best Practices
Implement these practices to prevent networking issues in Kubernetes:
Configuration Management
- Use consistent service naming conventions
- Document network policies and their purposes
- Implement health checks for all services
- Use namespaces to isolate environments
Monitoring and Alerting
- Monitor service availability and response times
- Set up alerts for network policy violations
- Track ingress controller performance
- Monitor DNS resolution success rates
Testing and Validation
- Create test pods for network validation
- Implement end-to-end connectivity tests
- Validate network policies with test traffic
- Regularly audit network configurations
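An end-to-end connectivity test can be as simple as a list of endpoints your application depends on and a loop that tries each one. The sketch below reads "host port" pairs from standard input and reports which ones accept a TCP connection. The function name check_endpoints and the 3-second timeout are illustrative, and it assumes a netcat build that supports the -z and -w options is available where it runs.

```shell
# check_endpoints
# Reads "host port" pairs from standard input, one per line, and reports which
# ones accept a TCP connection. Returns non-zero if any endpoint fails.
check_endpoints() {
  _failures=0
  while read -r _host _port; do
    # Skip blank lines and lines starting with '#'
    case "$_host" in
      ''|'#'*) continue ;;
    esac
    # Redirect stdin so nc cannot consume the remaining input lines
    if nc -z -w 3 "$_host" "$_port" </dev/null >/dev/null 2>&1; then
      echo "OK   $_host:$_port"
    else
      echo "FAIL $_host:$_port"
      _failures=$((_failures + 1))
    fi
  done
  [ "$_failures" -eq 0 ]
}
```

With hypothetical endpoints, printf '%s\n' 'orders.shop.svc.cluster.local 8080' 'db.example.internal 5432' | check_endpoints prints one line per endpoint and exits non-zero on any failure, which makes it easy to run from a test pod or a scheduled job after each network policy change.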
Incident Response
- Document common networking troubleshooting steps
- Maintain a list of critical network endpoints
- Establish communication channels for network issues
- Create runbooks for common network problems
Quick Reference: Common Commands
Task | Command | Purpose |
---|---|---|
Check pod connectivity | kubectl exec -it <pod> -- ping <target> | Test basic network connectivity from pod |
Check service endpoints | kubectl get endpoints <service> | Verify service is pointing to correct pods |
List network policies | kubectl get networkpolicies -A | See all network policies in cluster |
Test DNS resolution | kubectl exec -it <pod> -- nslookup <service> | Verify DNS resolution within cluster |
Capture network traffic | kubectl exec <pod> -- tcpdump -i any -w /tmp/cap.pcap | Capture packets for detailed analysis |
Check ingress status | kubectl get ingress <name> -o wide | View ingress controller and endpoint info |