I have an on-premises Kubernetes cluster running Calico as the CNI. The cluster is configured to peer with multiple BGP ToR routers, so the pod networks are reachable from outside. The service subnet is announced as well, to make services available to external hosts. Initial tests showed that the services could be reached from nodes and pods but not from the outside world. After some debugging I finally found the reason.
I created the following deployment containing nginx, plus a service to expose its pods, in the namespace test:
apiVersion: v1
kind: Namespace
metadata:
  name: test
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  namespace: test
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 1
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: my-service
  namespace: test
spec:
  selector:
    app: nginx
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
After the deployment my service got the IP 10.97.0.1, but curl on external hosts failed with “Connection timed out”. Running the same test on a Kubernetes node worked. My first troubleshooting attempts focused on routing, the BGP sessions and firewalling, but none of these was the cause.
Next I tried to resolve internal DNS names using the Kube DNS service:
$ dig kube-dns.kube-system.svc.cluster.local @10.97.0.10
;; reply from unexpected source: 10.11.207.66#53, expected 10.97.0.10#53
;; reply from unexpected source: 10.11.207.66#53, expected 10.97.0.10#53
;; reply from unexpected source: 10.11.207.66#53, expected 10.97.0.10#53

; <<>> DiG 9.16.27-Debian <<>> kube-dns.kube-system.svc.cluster.local @10.97.0.10
;; global options: +cmd
;; connection timed out; no servers could be reached
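As dig reported, the reply did not come from the service IP (10.97.0.10) but directly from one of the Kube DNS pods (10.11.207.66). The client discards such a reply because it never sent a request to that address. This pointed to a masquerading issue: the request to the service IP was forwarded to a pod, but the pod's answer went back to the external host without being translated back to the service IP. Some searching brought me to the kube-proxy option masqueradeAll. This option enables NAT for all traffic sent to service IPs, so that replies return via the node that handled the request, which makes external access work.
To enable masquerading, run kubectl edit cm/kube-proxy -n kube-system, find masqueradeAll: false and set it to true. Then restart kube-proxy on your nodes so that it picks up the new configuration: kubectl rollout restart ds kube-proxy -n kube-system.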
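If you would rather not edit the ConfigMap interactively, the same change can be scripted. The following is only a sketch: the helper name enable_masquerade_all is made up for illustration, and it assumes the setting appears literally as masqueradeAll: false in the ConfigMap, as it did on my cluster.

```shell
# Hypothetical helper: reads a kube-proxy configuration on stdin and
# rewrites "masqueradeAll: false" to "masqueradeAll: true".
# All other lines are passed through unchanged.
enable_masquerade_all() {
  sed 's/\(masqueradeAll:[[:space:]]*\)false/\1true/'
}

# Possible usage against a live cluster (commented out on purpose,
# review the output of the first two stages before applying it):
#   kubectl get cm kube-proxy -n kube-system -o yaml \
#     | enable_masquerade_all \
#     | kubectl apply -f -
```

The kube-proxy restart is still required afterwards, because the running pods do not pick up the changed ConfigMap on their own.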
That’s it. You can watch the restart progress by checking kubectl rollout status ds kube-proxy -n kube-system.
Your services should now be reachable from external hosts.
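To confirm this from an external host, you can repeat the dig test from above and check that the "reply from unexpected source" warning is gone. A minimal sketch, where has_unexpected_source is a hypothetical helper name and the IPs are the ones from my cluster:

```shell
# Hypothetical helper: succeeds (exit code 0) if the text on stdin
# contains dig's "reply from unexpected source" warning.
has_unexpected_source() {
  grep -q 'reply from unexpected source'
}

# Possible usage on an external host (commented out, adjust the IP):
#   if dig kube-dns.kube-system.svc.cluster.local @10.97.0.10 2>&1 \
#       | has_unexpected_source; then
#     echo "replies still come from the pod IP, masquerading is not active"
#   else
#     echo "no unexpected-source warning"
#   fi
```

A plain curl against the service IP of the nginx service (10.97.0.1 in my case) is an equally good check.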