"Not Capable" errors after getting "3.0 version" #6029

Closed
hongchaodeng opened this issue Jul 22, 2016 · 2 comments

Comments

@hongchaodeng
Contributor

When doing

curl http://${ETCD_HOST}:${ETCD_PORT}/version

It gives output:

{"etcdserver":"3.0.3","etcdcluster":"3.0.0"}

Nonetheless, the client still gets a "not capable" error. The client succeeds after retrying, which means the etcd cluster has negotiated the 3.0 version by then.

The "/version" endpoint is useful for bootstrapping servers that start after etcd.
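For a server that bootstraps after etcd, reading the "/version" output shown above might look like the following minimal Go sketch. The `versionInfo` type and `parseVersion` helper are hypothetical names for illustration; only the two JSON field names come from the output above. As this issue shows, a reported cluster version alone did not guarantee that requests would be accepted.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// versionInfo mirrors the two fields in the /version output quoted above.
// The type name is an assumption made for this sketch.
type versionInfo struct {
	Server  string `json:"etcdserver"`
	Cluster string `json:"etcdcluster"`
}

// parseVersion decodes a /version response body (hypothetical helper).
func parseVersion(body []byte) (versionInfo, error) {
	var v versionInfo
	if err := json.Unmarshal(body, &v); err != nil {
		return versionInfo{}, fmt.Errorf("decode /version response: %w", err)
	}
	return v, nil
}

func main() {
	// The body below is the exact output reported in this issue.
	v, err := parseVersion([]byte(`{"etcdserver":"3.0.3","etcdcluster":"3.0.0"}`))
	if err != nil {
		panic(err)
	}
	fmt.Printf("server=%s cluster=%s\n", v.Server, v.Cluster)
}
```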

@hongchaodeng
Contributor Author

@heyitsanthony

@heyitsanthony
Contributor

heyitsanthony commented Jul 22, 2016

A brief analysis of what's going wrong: the version HTTP handler returns the most recent semver, but the enabled capability map is only updated periodically, every 500ms. Since the gRPC interceptor does the capability check against the enabled map, there is a window of up to 500ms in which etcd reports 3.0.0 capabilities but rejects 3.0.0 requests. In other words, the value exposed by the version endpoint should only be updated when or after the enabled capability map is updated, not before.
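The ordering described above can be illustrated with a small, simplified Go model. This is not etcd's actual code: the `capabilityState` type, its methods, and the `example-cap` capability name are all assumptions made for the sketch. The point is only that the published version changes in the same critical section as the enabled map, and after it, so a reader never sees a version whose capabilities are still disabled.

```go
package main

import (
	"fmt"
	"sync"
)

// capabilityState is a hypothetical model of the state discussed above.
type capabilityState struct {
	mu        sync.RWMutex
	decided   string          // version the cluster has agreed on
	published string          // version exposed by the version handler
	enabled   map[string]bool // capabilities checked by the request interceptor
}

func newCapabilityState() *capabilityState {
	return &capabilityState{enabled: make(map[string]bool)}
}

// decide records a newly negotiated version without exposing it yet.
func (s *capabilityState) decide(version string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.decided = version
}

// refresh models the periodic update: enable capabilities first, then
// publish the version, both under the same lock.
func (s *capabilityState) refresh(caps ...string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.decided == s.published {
		return
	}
	for _, c := range caps {
		s.enabled[c] = true
	}
	s.published = s.decided
}

// version returns what the version endpoint would report.
func (s *capabilityState) version() string {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.published
}

// capable reports whether a request needing capability c would be accepted.
func (s *capabilityState) capable(c string) bool {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.enabled[c]
}

func main() {
	s := newCapabilityState()
	s.decide("3.0.0")
	fmt.Printf("before refresh: version=%q capable=%v\n", s.version(), s.capable("example-cap"))
	s.refresh("example-cap")
	fmt.Printf("after refresh: version=%q capable=%v\n", s.version(), s.capable("example-cap"))
}
```

In this model the version endpoint lags the negotiated version by up to one refresh period, which trades a slightly stale report for never advertising a version that the interceptor would still reject.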

@heyitsanthony heyitsanthony self-assigned this Jul 26, 2016
heyitsanthony pushed a commit to heyitsanthony/etcd that referenced this issue Jul 26, 2016
heyitsanthony pushed a commit to heyitsanthony/etcd that referenced this issue Jul 27, 2016
heyitsanthony pushed a commit to heyitsanthony/etcd that referenced this issue Jul 27, 2016
starius added a commit to starius/oniongateway that referenced this issue Aug 21, 2016
Tests failed with the following error:

2016-08-21 16:25:30.299559 I | etcdserver: published {Name:default ClientURLs:[http://127.0.0.1:38202]} to cluster f12d0b8c5b4beff9
2016-08-21 16:25:30.299680 I | embed: ready to serve client requests
2016-08-21 16:25:30.299769 I | etcdserver: setting up the initial cluster version to 3.0
2016-08-21 16:25:30.301468 N | membership: set the initial cluster version to 3.0
2016-08-21 16:25:30.303932 N | embed: serving insecure client requests on 127.0.0.1:38202, this is strongly discouraged!
2016-08-21 16:25:30.320528 I | api: enabled capabilities for version 3.0
2016-08-21 16:25:30.320623 W | etcdserver: apply entries took too long [19.219094ms for 1 entries]
2016-08-21 16:25:30.320676 W | etcdserver: avoid queries with large range/delete range!
2016-08-21 16:25:30.320718 I | etcdserver: skipped leadership transfer for single member cluster
--- FAIL: TestEtcdResolver (0.73s)
        etcd_resolver_test.go:104: Failed to populate etcd with example data: Failed to put key /ipv4/127.0.0.1: etcdserver: not capable

Discussion [1] suggests that there is a race in etcd during the first 500 ms.
A sleep of one second should fix the problem.

[1] etcd-io/etcd#6029 (comment)