Savedata is not endianness portable #2693
Comments
Why 0.3.0? Can this not be fixed earlier?
I see, your comment says big endian systems will break. True, but does anyone actually use toxcore on that? Is it worth delaying the fix by months/years?
save_compatibility_test was failing on big-endian systems, as it was written and tested on a little-endian system and savedata is not endianness portable[1]. [1] TokTok#2693
@iphydf well, that was before we had the discussion that it could be done without possibly breaking things and with Tox_Options flag(s). There you go, updated the milestone. Not sure if v0.2.x or v0.2.20 is more appropriate, feel free to change it if I got it wrong.
Savedata created on little-endian systems will not load on big-endian systems, and vice versa.
I have looked into `save_compatibility_test` failing on s390x and found a couple of issues that I would like to document:
Toxcore uses little-endian to host and host to little-endian conversion functions. For example, `lendian_bytes_to_host32()` in `tox_load()`:
c-toxcore/toxcore/tox.c
Lines 704 to 709 in 710eb67
These functions work such that if `WORDS_BIGENDIAN` is defined, they assume the host is big-endian; otherwise they assume it is little-endian:
c-toxcore/toxcore/state.c
Lines 127 to 136 in 710eb67
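As a rough illustration of the pattern described above (this is not the toxcore source; every `sketch_` name is hypothetical), a macro-dependent conversion might look like the first function below. The second function is one possible alternative that assembles the value from individual bytes, so its result does not depend on whether the build system defines anything:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reverse the byte order of a 32-bit value. */
uint32_t sketch_bswap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Macro-dependent variant: only correct on a big-endian host if the
 * build actually defines WORDS_BIGENDIAN there. Otherwise it silently
 * copies the bytes without any conversion. */
uint32_t sketch_le_bytes_to_host32_macro(const uint8_t *src)
{
    uint32_t v;
    memcpy(&v, src, sizeof(v));
#ifdef WORDS_BIGENDIAN
    v = sketch_bswap32(v);
#endif
    return v;
}

/* Macro-free variant: builds the value from bytes with shifts, so it
 * decodes little-endian input identically on any host. */
uint32_t sketch_le_bytes_to_host32(const uint8_t *src)
{
    return (uint32_t)src[0] |
           ((uint32_t)src[1] << 8) |
           ((uint32_t)src[2] << 16) |
           ((uint32_t)src[3] << 24);
}
```

With the shift-based variant, the bytes `44 33 22 11` decode to `0x11223344` regardless of host byte order.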
However, `WORDS_BIGENDIAN` is never defined by anything on a big-endian system, so those functions always assume the host is little-endian and just `memcpy()` the data as is, without any conversion. This results in little-endian systems storing and reading integers in little-endian order, and big-endian systems storing and reading integers in big-endian order.
Trying to load savedata created on amd64 on an s390x system will result in the `tox_load()` code linked above returning -1 on line 708, as the savedata file's magic number won't match due to the wrong endianness, with `tox_new()` returning `TOX_ERR_NEW_LOAD_BAD_FORMAT` to the user, as evidenced by `save_compatibility_test` failing on s390x with:
c-toxcore/auto_tests/save_compatibility_test.c
Lines 84 to 87 in 710eb67
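The magic number mismatch can be reproduced in isolation. The sketch below is an assumption-laden illustration, not toxcore code: `SKETCH_MAGIC` is an arbitrary value chosen for readability, and the two readers emulate what an unconverted `memcpy()` yields on a little-endian and on a big-endian host respectively.

```c
#include <stdint.h>

/* Illustrative magic value only; not the real toxcore constant. */
#define SKETCH_MAGIC 0x11223344u

/* What an unconverted memcpy() of 4 bytes yields on a little-endian host. */
uint32_t sketch_read_as_le(const uint8_t *b)
{
    return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* What the same unconverted memcpy() yields on a big-endian host. */
uint32_t sketch_read_as_be(const uint8_t *b)
{
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

/* Returns 0 if the magic matches and -1 otherwise, loosely mirroring
 * the early exit described above. */
int sketch_check_magic(uint32_t value_read, uint32_t expected)
{
    return value_read == expected ? 0 : -1;
}
```

If a little-endian host wrote `SKETCH_MAGIC` as the bytes `44 33 22 11`, `sketch_read_as_be()` sees `0x44332211` and the check returns -1, which is the shape of the failure reported on s390x.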
If we fix the previous issue by defining `WORDS_BIGENDIAN` on a big-endian system, e.g. by adding the following to `CMakeLists.txt`:
then the magic number matches and the code proceeds further in parsing the savedata; however, the s390x `save_compatibility_test` then fails with:
c-toxcore/toxcore/state.c
Lines 27 to 45 in 710eb67
The issue here is that the code does the same endianness conversion twice. On line 28 it converts little-endian to host 32-bit, which turns the little-endian representation into the big-endian one. But then on line 39 it performs a little-endian to host conversion again, mistakenly converting those 16 bits from the host endianness (which is big-endian! This is why calling `lendian_to_host16()` on it produces a non-portable result, as it's not `lendian` to begin with) to little-endian. The comparison then fails because the left side is now little-endian and the right side is big-endian. (The lower 16 bits are then converted on line 45, which is also unnecessary and will produce an incorrect result.)
The saving code does the conversion twice too:
c-toxcore/toxcore/state.c
Line 77 in 710eb67
which are no-ops on little-endian but produce unexpected results on big-endian.
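The double conversion can be shown with a small sketch that emulates a big-endian host, where a little-endian to host conversion amounts to a byte swap. All names and values here are hypothetical and chosen only for illustration; the field is assumed to be already in host order after the first, 32-bit conversion.

```c
#include <stdint.h>

/* On a big-endian host, a 16-bit little-endian to host conversion is a
 * byte swap. This helper emulates it so the effect is visible anywhere. */
uint16_t sketch_bswap16(uint16_t v)
{
    return (uint16_t)((v >> 8) | (v << 8));
}

/* Buggy check: the 32-bit field is already in host order, yet its upper
 * half is converted a second time before the comparison. */
int sketch_inner_cookie_matches_double(uint32_t cookie_type, uint16_t cookie_inner)
{
    return sketch_bswap16((uint16_t)(cookie_type >> 16)) == cookie_inner;
}

/* Single-conversion check: compare the upper half as is. */
int sketch_inner_cookie_matches_single(uint32_t cookie_type, uint16_t cookie_inner)
{
    return (uint16_t)(cookie_type >> 16) == cookie_inner;
}
```

For an illustrative field `0x01ce0003` and expected inner cookie `0x01ce`, the double-converting check compares `0xce01` against `0x01ce` and fails, while the single-conversion check succeeds. On a little-endian host the second conversion is a no-op, which is why the bug goes unnoticed there.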
Fixing this would break the savedata format on big-endian systems, changing it in a non-backwards-compatible way. There has been talk of switching savedata to msgpack, which would also be a breaking change, so it might make sense to keep the current broken behavior for now and do all the savedata-breaking changes together.
Here is a small snippet to reproduce the issue