Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

fail fast on the index request dropping while leader changed #9566

Closed
nolouch opened this issue Apr 12, 2018 · 2 comments
Closed

fail fast on the index request dropping while leader changed #9566

nolouch opened this issue Apr 12, 2018 · 2 comments

Comments

@nolouch
Copy link
Contributor

nolouch commented Apr 12, 2018

similar with #8975. the index reading message may be dropped when the leader is transferring or some other states.
https://github.com/coreos/etcd/blob/70341b161442105a211070e6806947cfdc711e22/etcdserver/v3_server.go#L630-L661
and the read request will block the following read request until the context timeout. In our test, it report like:

2018/04/11 16:49:30.208 log.go:86: [info] raft: [402ff265b80d4dc1 is starting a new election at term 3]
2018/04/11 16:49:30.208 log.go:86: [info] raft: [402ff265b80d4dc1 became candidate at term 4]
2018/04/11 16:49:30.208 log.go:86: [info] raft: [402ff265b80d4dc1 received MsgVoteResp from 402ff265b80d4dc1 at term 4]
2018/04/11 16:49:30.208 log.go:86: [info] raft: [402ff265b80d4dc1 [logterm: 3, index: 587] sent MsgVote request to b063ce376278837c at term 4]
2018/04/11 16:49:30.208 log.go:86: [info] raft: [402ff265b80d4dc1 [logterm: 3, index: 587] sent MsgVote request to daa53193d66b358f at term 4]
2018/04/11 16:49:30.208 log.go:86: [info] raft: [raft.node: 402ff265b80d4dc1 lost leader daa53193d66b358f at term 4]
2018/04/11 16:49:30.465 log.go:86: [info] raft: [402ff265b80d4dc1 [term: 4] received a MsgVote message with higher term from daa53193d66b358f [term: 76]]
2018/04/11 16:49:30.465 log.go:86: [info] raft: [402ff265b80d4dc1 became follower at term 76]
2018/04/11 16:49:30.465 log.go:86: [info] raft: [402ff265b80d4dc1 [logterm: 3, index: 587, vote: 0] cast MsgVote for daa53193d66b358f [logterm: 3, index: 587] at term 76]
2018/04/11 16:49:30.465 log.go:86: [info] raft: [raft.node: 402ff265b80d4dc1 elected leader daa53193d66b358f at term 76]
2018/04/11 16:49:36.148 etcd_kv.go:117: [error] load from etcd error: context deadline exceeded
2018/04/11 16:49:36.148 etcd_kv.go:120: [warning] kv gets too slow: key /pd/6543228748054100662/member/4625181861026287041/leader_priority cost 10.000292691s err context deadline exceeded
2018/04/11 16:49:37.148 log.go:82: [warning] etcdserver: [timed out waiting for read index response]
2018/04/11 16:49:37.149 etcd_kv.go:120: [warning] kv gets too slow: key /pd/6543228748054100662/member/4625181861026287041/leader_priority cost 5.437088366s err <nil>
2018/04/11 16:49:37.149 etcd_kv.go:120: [warning] kv gets too slow: key /pd/6543228748054100662/member/4625181861026287041/leader_priority cost 4.395910077s err <nil> 
2018/04/11 16:49:37.149 etcd_kv.go:120: [warning] kv gets too slow: key /pd/6543228748054100662/member/4625181861026287041/leader_priority cost 3.754287338s err <nil> 
2018/04/11 16:49:37.149 etcd_kv.go:120: [warning] kv gets too slow: key /pd/6543228748054100662/member/4625181861026287041/leader_priority cost 5.759414241s err <nil> 
2018/04/11 16:49:37.149 etcd_kv.go:120: [warning] kv gets too slow: key /pd/6543228748054100662/member/4625181861026287041/leader_priority cost 5.436268761s err <nil> 
2018/04/11 16:49:37.149 etcd_kv.go:120: [warning] kv gets too slow: key /pd/6543228748054100662/member/4625181861026287041/leader_priority cost 2.435955314s err <nil> 
2018/04/11 16:49:37.149 etcd_kv.go:120: [warning] kv gets too slow: key /pd/6543228748054100662/member/4625181861026287041/leader_priority cost 2.436752165s err <nil> 
2018/04/11 16:49:37.149 etcd_kv.go:120: [warning] kv gets too slow: key /pd/6543228748054100662/member/4625181861026287041/leader_priority cost 8.759771471s err <nil> 
2018/04/11 16:49:37.149 etcd_kv.go:120: [warning] kv gets too slow: key /pd/6543228748054100662/member/4625181861026287041/leader_priority cost 4.756303747s err <nil
@MikeSpreitzer
Copy link

I am seeing "timed out waiting for read index response" complaints when rebooting a non-leader. I am using etcd release 3.2.19 to support Kubernetes release 1.10.3. The Kubernetes api-servers log a burst of Trace dumps with total times in the range 4--7 seconds, when the etcd leader logs that timeout message. See https://docs.google.com/document/d/1atgspqSeWlxJi_5xhjidnPKzvGwwGWskIiu7PmSnd5Y .

@xiang90
Copy link
Contributor

xiang90 commented Sep 24, 2018

fixed by #10094

@xiang90 xiang90 closed this as completed Sep 24, 2018
# for free to join this conversation on GitHub. Already have an account? # to comment
Development

Successfully merging a pull request may close this issue.

4 participants