etcdserver/ARM: starting etcd on crashes the first time, succeeds subsequently #2308
Comments
@miekg Can you reliably reproduce this? Is it a transient bug? |
Yes. Deleting the datadir re-triggers this.
|
I tend to think this is an arm specific problem. I cannot reproduce this on x86. |
From some prints, the error seems to actually be in
Which makes it a wider Go (arm) problem. |
This
It crashes on
which is another |
ok, as starting point: the following just works.
|
This bug (from the docs) maybe:
|
This is one problem, but looking at the original post there may be two. The first is an honest-to-god nil pointer dereference: something is passing 0x4 as the address of the uint64 inside the struct, which is nil plus the offset of the field. The second problem is that, while the runtime guarantees that structures will always be heap aligned on 8 byte boundaries, if you have something like this
val will be offset 4 bytes from the start of the structure on 32 bit platforms. The 32 bit compilers do not insert the correct padding here. There are a number of ways to tackle this; the simplest is to move val to the top of the structure so it will be 8 byte aligned. You could also use build tags to provide a properly padded version of the struct depending on the platform. |
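To make the layout point above concrete, here is a minimal sketch that prints the offset of a 64-bit field in two hypothetical structs. The type and field names (`misaligned`, `aligned`, `flag`, `val`) are assumptions for illustration only and are not taken from etcd.

```go
package main

import (
	"fmt"
	"unsafe"
)

// misaligned is a hypothetical struct (not from etcd): a 32-bit field
// followed by a 64-bit field. On 32-bit platforms uint64 only needs
// 4-byte alignment, so val can land at offset 4.
type misaligned struct {
	flag uint32
	val  uint64
}

// aligned is the same data with val moved to the top, so val sits at
// offset 0 of the struct on every platform.
type aligned struct {
	val  uint64
	flag uint32
}

// valOffsets returns the byte offset of val in each layout.
func valOffsets() (bad, good uintptr) {
	return unsafe.Offsetof(misaligned{}.val), unsafe.Offsetof(aligned{}.val)
}

func main() {
	bad, good := valOffsets()
	fmt.Printf("misaligned.val offset: %d\n", bad)
	fmt.Printf("aligned.val offset:    %d\n", good)
}
```

Built for a 32-bit target (for example with `GOARCH=386` or `GOARCH=arm`), the first offset should come out as 4, whereas on a 64-bit target the compiler pads it to 8. The second offset is 0 everywhere, which is why moving the field to the top is the simplest fix.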
Thanks @davecheney. This patch makes the crash go away (fixes the alignment).
|
LGTM. What is the definition of raft.Node ?
|
@miekg I do not think you are 100% safe by only changing this. We are not in control of all the dependencies and are not sure of other sub pkgs in etcd. We probably need more effort to say we do support 32bit or arm well enough. |
@xiang90 Ack. At least I can play some more now :)
|
I just made the above changes and this is now working on an odroid u3 as well. It would be great to have this working out of the box though. |
Note that the same bug is triggered on 32 bit Intel (my lousy virtual machine, which I sometimes use for development, is only 32 bits). |
@Audumla do you have a repo/changeset I can build from for odroid? |
I did not create one for this, as I just made the change to the file mentioned above and rebuilt using the code from a master git repository. It just worked! |
I put this into a changeset and rebuilt: https://github.com/hh/etcd/tree/32bit I had to start with a clean data-dir, but it seems to be working. |
@hh I've used your version of the source code with Go 1.4.2 (built from source), and I'm getting `etcdserver: publish error: etcdserver: request timed out`. My Go skills are nonexistent; otherwise I would have a look at the code. I'm building and running it on a Raspberry Pi 2 with the Hypriot image (http://blog.hypriot.com/post/hypriotos-back-again-with-docker-on-arm/). |
I've found some patches described at http://mkaczanowski.com/building-arm-cluster-part-3-docker-fleet-etcd-distribute-containers/#install_docker that make etcd 2.0.4 run on a Raspberry Pi. Is it possible to merge these changes? They look the same as @hh's, but for some reason etcd works with those patches. Unless I'm being a noob. |
+1 for patch on raspberry pi (ARM) |
getting exact same error as @ajazam also on raspberry pi 2 and go 1.4.2 |
I looked at the three patches:
They all do the same thing: move some struct fields to the beginning of the struct to ensure 8 byte alignment. I think they could be applied without consequence. |
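As a rough illustration of the pattern those patches follow, the sketch below keeps an atomically updated 64-bit counter as the first field of a struct. The `stats` type, its fields, and the `bump` helper are hypothetical and are not etcd code. It relies on the `sync/atomic` documentation's statement that the first word of an allocated struct can be relied upon to be 64-bit aligned.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// stats is a hypothetical struct, not etcd's. The 64-bit field that is
// accessed atomically comes first, so it is 64-bit aligned even on
// 32-bit platforms such as ARM and 386.
type stats struct {
	applied uint64 // only accessed through sync/atomic
	name    string
}

// bump increments s.applied concurrently from several goroutines and
// returns the final value.
func bump(s *stats, workers, perWorker int) uint64 {
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < perWorker; j++ {
				atomic.AddUint64(&s.applied, 1)
			}
		}()
	}
	wg.Wait()
	return atomic.LoadUint64(&s.applied)
}

func main() {
	s := &stats{name: "demo"}
	fmt.Println(bump(s, 4, 1000)) // prints 4000
}
```

If `applied` were placed after a 32-bit field instead, the same atomic calls could fault on a 32-bit platform, which matches the crash reported in this issue.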
Applied the patches and etcd works fine for now. Now I should step forward to next problem... |
I've come across this as well and confirm that the patches from @mkaczanowski work. Changing struct alignment seems to be the key here, and I now have It might be worthwhile applying these on an ARM-related branch (just for archival purposes) before merging... |
Fixed via #3249 |
Title says it all. First time -> crash, second time -> works (this is raspberry pi b+)