
Add yurt-tunnel-server graceful shut down #346

Merged
1 commit merged into openyurtio:master on Aug 6, 2021

Conversation

@Peeknut (Member) commented Jun 9, 2021

What type of PR is this?

Uncomment only one /kind <> line, hit enter to put that in a new line, and remove leading whitespace from that line:
/kind bug
/kind documentation
/kind enhancement
/kind good-first-issue
/kind feature
/kind question
/kind design
/sig ai
/sig iot
/sig network
/sig storage

/kind enhancement

What this PR does / why we need it:

After yurtctl revert is run, or the yurt-tunnel-server deployment is deleted, the iptables rules created by yurt-tunnel-server still exist and are never cleaned up, so kubectl exec/logs requests cannot be routed correctly. This PR adds a graceful shutdown path so that those rules are removed before the server exits (see the sketch below).
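
Roughly, the fix is to let yurt-tunnel-server catch the termination signal and tear down the iptables rules it installed before the process exits. Below is a minimal, self-contained Go sketch of that shutdown pattern only; the iptablesCleaner interface and noopManager are hypothetical placeholders, not the types actually introduced by this PR.

package main

import (
	"context"
	"log"
	"os/signal"
	"syscall"
)

// iptablesCleaner is a hypothetical stand-in for the component that owns the
// TUNNEL-PORT chains and knows how to remove them on shutdown.
type iptablesCleaner interface {
	Run(ctx context.Context)      // keeps the rules in sync until ctx is cancelled
	CleanupIptableSetting() error // deletes the TUNNEL-PORT chains and their rules
}

func runTunnelServer(mgr iptablesCleaner) {
	// Cancel ctx on SIGTERM (sent by the kubelet when the pod is deleted) or SIGINT.
	ctx, stop := signal.NotifyContext(context.Background(), syscall.SIGTERM, syscall.SIGINT)
	defer stop()

	go mgr.Run(ctx)
	<-ctx.Done() // block until a termination signal arrives

	// Graceful shutdown: remove the DNAT rules so that kubectl exec/logs
	// still work after the yurt-tunnel-server deployment is gone.
	if err := mgr.CleanupIptableSetting(); err != nil {
		log.Printf("failed to clean up iptables rules: %v", err)
		return
	}
	log.Println("cleanup iptables rules succeed")
}

// noopManager only lets this sketch compile and run on its own; the real
// manager lives in the yurt-tunnel-server iptables code touched by this PR.
type noopManager struct{}

func (noopManager) Run(ctx context.Context)      { <-ctx.Done() }
func (noopManager) CleanupIptableSetting() error { return nil }

func main() { runTunnelServer(noopManager{}) }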

Which issue(s) this PR fixes:

Fixes #337

Special notes for your reviewer:

/assign @rambohe-ch

Does this PR introduce a user-facing change?

Other note:

The yurt-tunnel-server image needs to be recompiled locally and used during testing.

@openyurt-bot (Collaborator) commented:

@Peeknut: GitHub didn't allow me to assign the following users: your_reviewer.

Note that only openyurtio members, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time.
For more information please see the contributor guide

In response to the PR description above.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@rambohe-ch (Member) commented Jun 10, 2021

@Peeknut Would you add some unit test cases for the new feature?

BTW: it would also be very welcome if you could upload some validation logs for the new feature.

@Peeknut (Member, Author) commented Jun 10, 2021

👌

@Peeknut (Member, Author) commented Jun 10, 2021

Validation logs:

Execute the following command:

./_output/bin/yurtctl convert --deploy-yurttunnel --cloud-nodes master --provider kubeadm\
 --kubeadm-conf-path /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf\
 --yurt-tunnel-server-image="openyurt/yurt-tunnel-server:ljw"

Then the yurt-tunnel-server pod is brought up in the cluster:

[root@master ~]# kubectl get pod -A
NAMESPACE     NAME                                      READY   STATUS    RESTARTS   AGE
default       test-po                                   1/1     Running   0          5d6h
kube-system   coredns-546565776c-zmrm4                  1/1     Running   0          5d21h
kube-system   coredns-546565776c-zsx4k                  1/1     Running   0          6d1h
kube-system   etcd-master                               1/1     Running   0          6d1h
kube-system   kube-apiserver-master                     1/1     Running   0          6d1h
kube-system   kube-controller-manager-master            1/1     Running   0          5d6h
kube-system   kube-flannel-ds-67kx4                     1/1     Running   0          6d1h
kube-system   kube-flannel-ds-78qqg                     1/1     Running   0          5d21h
kube-system   kube-proxy-cl5gb                          1/1     Running   0          5d21h
kube-system   kube-proxy-kvf79                          1/1     Running   0          6d1h
kube-system   kube-scheduler-master                     1/1     Running   1          6d1h
kube-system   yurt-controller-manager-6c95788bf-wb57j   1/1     Running   0          72s
kube-system   yurt-hub-n80                              1/1     Running   0          64s
kube-system   yurt-tunnel-agent-m24bh                   1/1     Running   0          54s
kube-system   yurt-tunnel-server-6db95b477b-p2f22       1/1     Running   0          71s

The nat table on the node is as follows, containing rules related to TUNNEL-PORT (chains: `OUTPUT`, `TUNNEL-PORT`, `TUNNEL-PORT-10250`, `TUNNEL-PORT-10255`); a short summary of how these chains fit together is given after the listing:

[root@master ~]# iptables -t nat -L -n
Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination         
KUBE-SERVICES  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes service portals */
DOCKER     all  --  0.0.0.0/0            0.0.0.0/0            ADDRTYPE match dst-type LOCAL

Chain INPUT (policy ACCEPT)
target     prot opt source               destination         

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination         
TUNNEL-PORT  tcp  --  0.0.0.0/0            0.0.0.0/0            /* yurttunnel server port */
KUBE-SERVICES  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes service portals */
DOCKER     all  --  0.0.0.0/0           !127.0.0.0/8          ADDRTYPE match dst-type LOCAL

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination         
MASQUERADE  all  --  172.17.0.0/16        0.0.0.0/0           
KUBE-POSTROUTING  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes postrouting rules */
RETURN     all  --  10.244.0.0/16        10.244.0.0/16       
MASQUERADE  all  --  10.244.0.0/16       !224.0.0.0/4         
RETURN     all  -- !10.244.0.0/16        10.244.0.0/24       
MASQUERADE  all  -- !10.244.0.0/16        10.244.0.0/16       

……
Chain KUBE-SVC-TCOU7JCQXEZGVUNU (1 references)
target     prot opt source               destination         
KUBE-SEP-6E7XQMQ4RAYOWTTM  all  --  0.0.0.0/0            0.0.0.0/0            /* kube-system/kube-dns:dns */ statistic mode random probability 0.50000000000
KUBE-SEP-EJJ3L23ZA35VLW6X  all  --  0.0.0.0/0            0.0.0.0/0            /* kube-system/kube-dns:dns */

Chain TUNNEL-PORT (1 references)
target     prot opt source               destination         
TUNNEL-PORT-10255  tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:10255 /* jump to port 10255 */
TUNNEL-PORT-10250  tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:10250 /* jump to port 10250 */

Chain TUNNEL-PORT-10250 (1 references)
target     prot opt source               destination         
RETURN     tcp  --  0.0.0.0/0            127.0.0.1            /* return request to access node directly */ tcp dpt:10250
RETURN     tcp  --  0.0.0.0/0            10.10.102.78         /* return request to access node directly */ tcp dpt:10250
DNAT       tcp  --  0.0.0.0/0            0.0.0.0/0            /* dnat to tunnel for access node */ tcp dpt:10250 to:10.10.102.78:10263

Chain TUNNEL-PORT-10255 (1 references)
target     prot opt source               destination         
RETURN     tcp  --  0.0.0.0/0            127.0.0.1            /* return request to access node directly */ tcp dpt:10255
RETURN     tcp  --  0.0.0.0/0            10.10.102.78         /* return request to access node directly */ tcp dpt:10255
DNAT       tcp  --  0.0.0.0/0            0.0.0.0/0            /* dnat to tunnel for access node */ tcp dpt:10255 to:10.10.102.78:10264
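
To summarize the listing: the OUTPUT chain jumps outgoing TCP traffic into TUNNEL-PORT, which dispatches by destination port into TUNNEL-PORT-10250 and TUNNEL-PORT-10255; those chains RETURN traffic addressed to localhost or to the node itself and DNAT everything else to the tunnel server's ports 10263 and 10264 on 10.10.102.78. A simplified Go sketch of how rules of this shape could be installed (this is not the PR's code, and the rule specs omit the comment matches shown above):

package main

import (
	"log"
	"os/exec"
)

// nat runs one iptables command against the nat table; errors are only logged
// because this is an illustrative sketch.
func nat(args ...string) {
	args = append([]string{"-t", "nat"}, args...)
	if out, err := exec.Command("iptables", args...).CombinedOutput(); err != nil {
		log.Printf("iptables %v failed: %v: %s", args, err, out)
	}
}

func main() {
	nodeIP := "10.10.102.78" // the cloud node running yurt-tunnel-server
	// kubelet port -> tunnel server port, matching the DNAT targets above.
	portMap := map[string]string{"10250": "10263", "10255": "10264"}

	nat("-N", "TUNNEL-PORT")
	nat("-A", "OUTPUT", "-p", "tcp", "-j", "TUNNEL-PORT")

	for dport, tunnelPort := range portMap {
		chain := "TUNNEL-PORT-" + dport
		nat("-N", chain)
		nat("-A", "TUNNEL-PORT", "-p", "tcp", "--dport", dport, "-j", chain)
		// Requests to localhost or to the tunnel-server node go out directly...
		nat("-A", chain, "-p", "tcp", "--dport", dport, "-d", "127.0.0.1", "-j", "RETURN")
		nat("-A", chain, "-p", "tcp", "--dport", dport, "-d", nodeIP, "-j", "RETURN")
		// ...everything else is DNATed to the tunnel server.
		nat("-A", chain, "-p", "tcp", "--dport", dport, "-j", "DNAT", "--to-destination", nodeIP+":"+tunnelPort)
	}
}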

Execute the following commands to continuously stream the logs of pod yurt-tunnel-server-6db95b477b-p2f22 into a local file:

[root@master ~]# docker ps
CONTAINER ID   IMAGE                                                           COMMAND                  CREATED              STATUS              PORTS     NAMES
61fa1d6b77de   openyurt/yurt-controller-manager                                "yurt-controller-man…"   About a minute ago   Up About a minute             k8s_yurt-controller-manager_yurt-controller-manager-6c95788bf-wb57j_kube-system_3a83f542-6101-4bf3-b9a3-b55992ed5009_0
e3f8d1210586   789b6c0af4c7                                                    "yurt-tunnel-server …"   About a minute ago   Up About a minute             k8s_yurt-tunnel-server_yurt-tunnel-server-6db95b477b-p2f22_kube-system_6bd1c195-26c5-40df-a290-d2c7c9609e93_0
43c182932c53   registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.2   "/pause"                 About a minute ago   Up About a minute             k8s_POD_yurt-tunnel-server-6db95b477b-p2f22_kube-system_6bd1c195-26c5-40df-a290-d2c7c9609e93_0
……

[root@master ~]# docker inspect e3f8d1210586 | grep -i logpath
        "LogPath": "/var/lib/docker/containers/e3f8d1210586915d449ccfc9e3e2bde7071237442a474234e6579415d28347bc/e3f8d1210586915d449ccfc9e3e2bde7071237442a474234e6579415d28347bc-json.log",
                "io.kubernetes.container.logpath": "/var/log/pods/kube-system_yurt-tunnel-server-6db95b477b-p2f22_6bd1c195-26c5-40df-a290-d2c7c9609e93/yurt-tunnel-server/0.log",
[root@master ~]# 
[root@master ~]# 

[root@master suit-1.18]# tail -f /var/lib/docker/containers/e3f8d1210586915d449ccfc9e3e2bde7071237442a474234e6579415d28347bc/e3f8d1210586915d449ccfc9e3e2bde7071237442a474234e6579415d28347bc-json.log >> yurt-tunnel-server.log

The log information in the file is as follows:

[root@master suit-1.18]# cat yurt-tunnel-server.log 
{"log":"I0610 09:20:18.973521       1 anpserver.go:156] start handling http request from master at 10.10.102.78:10264\n","stream":"stderr","time":"2021-06-10T09:20:18.976791039Z"}
{"log":"I0610 09:20:19.006273       1 anpserver.go:142] start handling https request from master at 10.10.102.78:10263\n","stream":"stderr","time":"2021-06-10T09:20:19.006562743Z"}
{"log":"I0610 09:20:23.977933       1 leaderelection.go:252] successfully acquired lease kube-system/tunnel-dns-controller\n","stream":"stderr","time":"2021-06-10T09:20:23.978937538Z"}
{"log":"I0610 09:20:23.979059       1 dns.go:203] starting tunnel dns controller\n","stream":"stderr","time":"2021-06-10T09:20:23.980407072Z"}
{"log":"I0610 09:20:23.979090       1 shared_informer.go:223] Waiting for caches to sync for tunnel-dns-controller\n","stream":"stderr","time":"2021-06-10T09:20:23.980426268Z"}
{"log":"I0610 09:20:23.979103       1 shared_informer.go:230] Caches are synced for tunnel-dns-controller \n","stream":"stderr","time":"2021-06-10T09:20:23.980432305Z"}
{"log":"I0610 09:21:15.658950       1 iptables.go:466] clear conntrack entries for ports [\"10250\" \"10255\"] and nodes [\"10.10.102.80\"]\n","stream":"stderr","time":"2021-06-10T09:21:15.659230528Z"}
{"log":"E0610 09:21:15.696089       1 iptables.go:483] clear conntrack for 10.10.102.80:10250 failed: \"conntrack v1.4.4 (conntrack-tools): 0 flow entries have been deleted.\\n\", error message: exit status 1\n","stream":"stderr","time":"2021-06-10T09:21:15.696373246Z"}
{"log":"E0610 09:21:15.708765       1 iptables.go:483] clear conntrack for 10.10.102.80:10255 failed: \"conntrack v1.4.4 (conntrack-tools): 0 flow entries have been deleted.\\n\", error message: exit status 1\n","stream":"stderr","time":"2021-06-10T09:21:15.708991045Z"}
{"log":"I0610 09:21:15.708804       1 iptables.go:535] directly access nodes changed, [10.10.102.78] for ports [10250 10255]\n","stream":"stderr","time":"2021-06-10T09:21:15.709035025Z"}

Execute the yurtctl revert command to delete the pod yurt-tunnel-server-6db95b477b-p2f22:

[root@master openyurt]# ./_output/bin/yurtctl revert --kubeadm-conf-path /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf

Check the log; it shows that the cleanup of the iptables rules succeeded:

[root@master suit-1.18]# cat yurt-tunnel-server.log 
{"log":"I0610 09:20:18.973521       1 anpserver.go:156] start handling http request from master at 10.10.102.78:10264\n","stream":"stderr","time":"2021-06-10T09:20:18.976791039Z"}
{"log":"I0610 09:20:19.006273       1 anpserver.go:142] start handling https request from master at 10.10.102.78:10263\n","stream":"stderr","time":"2021-06-10T09:20:19.006562743Z"}
{"log":"I0610 09:20:23.977933       1 leaderelection.go:252] successfully acquired lease kube-system/tunnel-dns-controller\n","stream":"stderr","time":"2021-06-10T09:20:23.978937538Z"}
{"log":"I0610 09:20:23.979059       1 dns.go:203] starting tunnel dns controller\n","stream":"stderr","time":"2021-06-10T09:20:23.980407072Z"}
{"log":"I0610 09:20:23.979090       1 shared_informer.go:223] Waiting for caches to sync for tunnel-dns-controller\n","stream":"stderr","time":"2021-06-10T09:20:23.980426268Z"}
{"log":"I0610 09:20:23.979103       1 shared_informer.go:230] Caches are synced for tunnel-dns-controller \n","stream":"stderr","time":"2021-06-10T09:20:23.980432305Z"}
{"log":"I0610 09:21:15.658950       1 iptables.go:466] clear conntrack entries for ports [\"10250\" \"10255\"] and nodes [\"10.10.102.80\"]\n","stream":"stderr","time":"2021-06-10T09:21:15.659230528Z"}
{"log":"E0610 09:21:15.696089       1 iptables.go:483] clear conntrack for 10.10.102.80:10250 failed: \"conntrack v1.4.4 (conntrack-tools): 0 flow entries have been deleted.\\n\", error message: exit status 1\n","stream":"stderr","time":"2021-06-10T09:21:15.696373246Z"}
{"log":"E0610 09:21:15.708765       1 iptables.go:483] clear conntrack for 10.10.102.80:10255 failed: \"conntrack v1.4.4 (conntrack-tools): 0 flow entries have been deleted.\\n\", error message: exit status 1\n","stream":"stderr","time":"2021-06-10T09:21:15.708991045Z"}
{"log":"I0610 09:21:15.708804       1 iptables.go:535] directly access nodes changed, [10.10.102.78] for ports [10250 10255]\n","stream":"stderr","time":"2021-06-10T09:21:15.709035025Z"}
{"log":"E0610 09:24:35.639364       1 server.go:649] \"stream read failure\" err=\"rpc error: code = Canceled desc = context canceled\"\n","stream":"stderr","time":"2021-06-10T09:24:35.640992505Z"}
{"log":"I0610 09:24:36.340234       1 dns.go:230] shutting down tunnel dns controller\n","stream":"stderr","time":"2021-06-10T09:24:36.344814095Z"}
{"log":"I0610 09:24:36.340297       1 iptables.go:161] stop the iptablesManager\n","stream":"stderr","time":"2021-06-10T09:24:36.344858438Z"}
{"log":"I0610 09:24:36.340440       1 csrapprover.go:65] stoping the csrapprover\n","stream":"stderr","time":"2021-06-10T09:24:36.344864832Z"}
{"log":"I0610 09:24:36.786688       1 iptables.go:205] cleanup iptables rules succeed\n","stream":"stderr","time":"2021-06-10T09:24:36.786861028Z"}
[root@master suit-1.18]# 

Check the nat table on the node again: there are no longer any rules related to TUNNEL-PORT. (A small helper that automates this check is sketched below.)
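
For reference, here is a small helper (not part of this PR) that automates the same before/after check by looking for TUNNEL-PORT entries in the nat table; it must run as root on the node:

package main

import (
	"bytes"
	"fmt"
	"os/exec"
)

// tunnelChainsPresent reports whether any TUNNEL-PORT chain or rule is still
// installed in the nat table, by scanning the output of `iptables -t nat -S`.
func tunnelChainsPresent() (bool, error) {
	out, err := exec.Command("iptables", "-t", "nat", "-S").CombinedOutput()
	if err != nil {
		return false, fmt.Errorf("iptables failed: %v: %s", err, out)
	}
	return bytes.Contains(out, []byte("TUNNEL-PORT")), nil
}

func main() {
	present, err := tunnelChainsPresent()
	if err != nil {
		panic(err)
	}
	// Expected to print "true" before yurtctl revert and "false" afterwards.
	fmt.Println("TUNNEL-PORT chains present:", present)
}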

@yanyhui (Member) commented Jun 13, 2021

@Peeknut
Have you verified a manual deployment of the tunnel as well?
Is there any error when executing kubectl exec after the tunnel is deleted? I think this cleanup also needs to be added for the case where the tunnel is deployed manually.

A review thread on the new cleanupIptableSetting code:

			return
		case <-ticker.C:
			im.syncIptableSetting()
		}
	}
}

func (im *iptablesManager) cleanupIptableSetting() {
	dnatPorts, err := util.GetConfiguredDnatPorts(im.kubeClient, im.insecurePort)

Review comment (Member):

There is no need to get the current dnat ports here; just use im.lastDnatPorts to clean up the iptables rules instead.

Reply (Member Author):

ok
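
For illustration, a rough, self-contained sketch of what the suggestion above amounts to: tear the chains down using only the remembered im.lastDnatPorts, without calling util.GetConfiguredDnatPorts again. The stripped-down iptablesManager struct and the raw iptables invocations are stand-ins for the PR's actual type and its iptables wrapper (the real rules also carry comment matches), so this shows the idea rather than the merged code.

package main

import (
	"log"
	"os/exec"
)

// iptablesManager is a stripped-down stand-in; in the real code, lastDnatPorts
// holds the ports whose TUNNEL-PORT-<port> chains were installed by the last sync.
type iptablesManager struct {
	lastDnatPorts []string
}

// run executes a single iptables invocation against the nat table and logs failures.
func (im *iptablesManager) run(args ...string) {
	args = append([]string{"-t", "nat"}, args...)
	if out, err := exec.Command("iptables", args...).CombinedOutput(); err != nil {
		log.Printf("iptables %v failed: %v: %s", args, err, out)
	}
}

// cleanupIptableSetting removes everything the manager installed, using only
// im.lastDnatPorts and no extra query to the apiserver.
func (im *iptablesManager) cleanupIptableSetting() {
	// 1. Remove the jump from OUTPUT into TUNNEL-PORT so the chain is unreferenced.
	im.run("-D", "OUTPUT", "-p", "tcp", "-j", "TUNNEL-PORT")
	// 2. Flush TUNNEL-PORT (drops the jumps to the per-port chains), then flush
	//    and delete each TUNNEL-PORT-<port> chain, and finally delete TUNNEL-PORT.
	im.run("-F", "TUNNEL-PORT")
	for _, port := range im.lastDnatPorts {
		im.run("-F", "TUNNEL-PORT-"+port)
		im.run("-X", "TUNNEL-PORT-"+port)
	}
	im.run("-X", "TUNNEL-PORT")
	log.Println("cleanup iptables rules succeed")
}

func main() {
	im := &iptablesManager{lastDnatPorts: []string{"10250", "10255"}}
	im.cleanupIptableSetting()
}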

@rambohe-ch (Member) commented:

/lgtm
/approve

openyurt-bot added the lgtm label on Aug 6, 2021
@openyurt-bot (Collaborator) commented:

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Peeknut, rambohe-ch

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openyurt-bot added the approved label on Aug 6, 2021
openyurt-bot merged commit ea2002a into openyurtio:master on Aug 6, 2021
MrGirl pushed a commit to MrGirl/openyurt that referenced this pull request Mar 29, 2022
Labels: approved, kind/enhancement, lgtm, size/L (100-499 lines changed)

Successfully merging this pull request may close these issues:

[BUG] kubectl exec error occurred: after use yurtctl revert to convert openyurt cluster with yurttunnel server

4 participants