Skip to content

UTF-8 stopped working #21

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Closed
laino opened this issue May 6, 2014 · 20 comments
Closed

UTF-8 stopped working #21

laino opened this issue May 6, 2014 · 20 comments

Comments

@laino
Copy link

laino commented May 6, 2014

utf-8 strings worked just fine before. Now they are broken.
Characters that are represented as multiple bytes now get split into individual characters.

@laino
Copy link
Author

laino commented May 6, 2014

Whoops. I am not using the latest client.

@laino
Copy link
Author

laino commented May 6, 2014

Still happens when using release 1.1.0 on both ends

@laino
Copy link
Author

laino commented May 6, 2014

When sending multi-byte strings from client to server I see this exception:

[22:31:49 Worker 2983 Critical] Error: Invalid continuation byte Error: Invalid continuation byte
    at Error (<anonymous>)
    at readContinuationByte (/var/volafile/app/node_modules/engine.io/node_modules/engine.io-parser/node_modules/utf8/utf8.js:127:9)
    at decodeSymbol (/var/volafile/app/node_modules/engine.io/node_modules/engine.io-parser/node_modules/utf8/utf8.js:167:12)
    at Object.utf8decode [as decode] (/var/volafile/app/node_modules/engine.io/node_modules/engine.io-parser/node_modules/utf8/utf8.js:201:17)
    at Object.exports.decodePacket (/var/volafile/app/node_modules/engine.io/node_modules/engine.io-parser/lib/index.js:150:17)
    at WebSocket.Transport.onData (/var/volafile/app/node_modules/engine.io/lib/transport.js:102:24)
    at WebSocket.onData (/var/volafile/app/node_modules/engine.io/lib/transports/websocket.js:75:30)
    at WebSocket.EventEmitter.emit (events.js:98:17)
    at Receiver.self._receiver.ontext (/var/volafile/app/node_modules/engine.io/node_modules/ws/lib/WebSocket.js:697:10)
    at Receiver.opcodes.1.finish (/var/volafile/app/node_modules/engine.io/node_modules/ws/lib/Receiver.js:397:14)

@laino
Copy link
Author

laino commented May 6, 2014

The other way around results in garbled text in the browser, but no exception.

@laino
Copy link
Author

laino commented May 6, 2014

It doesn't happen when using 1.0.5 (engine.io) on the server, even when using 1.1.0 on the client.

@laino
Copy link
Author

laino commented May 6, 2014

I'd suspect ws is already dealing with the encoding/decoding internally, why the sudden need to use utf8?
Node.js should correctly convert utf16 strings to utf8 when making http responses and browsers will then decode them back into utf16 javascript strings. Websocket implementations deal with it and strings with japanese characters were working just fine before the introduction of the module.

@defunctzombie
Copy link
Contributor

Please provide a failing example.
On May 6, 2014 1:49 PM, "binlain" notifications@github.com wrote:

I'd suspect ws https://www.npmjs.org/package/ws is already dealing with
the encoding/decoding internally, why the sudden need to use utf8https://www.npmjs.org/package/utf8
?
Node.js should correctly convert utf16 strings to utf8 when making http
responses and browsers will then decode them back into utf16 javascript
strings. Websocket implementations deal with it just fine too and strings
with japanese characters were working just fine before the introduction of
the module.


Reply to this email directly or view it on GitHubhttps://github.com//issues/21#issuecomment-42356950
.

@laino
Copy link
Author

laino commented May 7, 2014

It's impossible to write a working example, but here's a failing one:
https://github.com/binlain/failing-engine.io-utf8-test
Just run node fail.js, then navigate to localhost:8888.
An alert window showing "日本語" should appear, but you'll see some garbled text instead.

@laino
Copy link
Author

laino commented May 7, 2014

So my suggestion is that when sending strings, just let the underlying implementation (http/websocket) handle them. I don't get why it would be necessary to do any encoding/decoding trickery on them yourself.

@laino
Copy link
Author

laino commented May 15, 2014

bump. Just let me re-emphasize that it was working in previous versions (like 1.0.5)

@rase-
Copy link
Contributor

rase- commented May 15, 2014

@binlain have you tried with the latest releases? It sounds like either the client or server has the utf8.js stuff, but the other doesn't. Your example worked for me with the master branches just fine.

The reason we are using utf8.js is that when encoding mixed binary and string contents, we need to support multibyte strings. It is easiest to encode the multibyte characters so that we can do a transformation to a binary representation one byte at a time.

@laino
Copy link
Author

laino commented May 17, 2014

Version 1.1.1 (client & server) is still completely and utterly broken. I can observe this in production as well as in this example which I suggest you try yourself.

Engine.io was working perfectly. It got broken by trying to fix a problem that wasn't there.
It's broken starting with version 1.1.0.

@rase-
Copy link
Contributor

rase- commented May 18, 2014

@binlain it was not working perfectly for all transports. For the situation I described above, we were not handling multibyte characters correctly.

The engine.io.js build file is outdated in the current release, which is maybe why the file in your example is outdated too. @guille we should update the build.

If I use a newly built engine.io-client (that I built myself in a working copy), everything works as expected in your example with engine.io 1.1.1. Here is a screenshot of the resulting alert. Indeed if I use the file that is currently in the repo, the text is garbled.

@laino
Copy link
Author

laino commented May 18, 2014

Ah sigh. Thanks for clearing this up.

@defunctzombie
Copy link
Contributor

We should just stop having the prebuilt file for engine and publish build instructions or point to the browserify CDN.

@rase-
Copy link
Contributor

rase- commented May 18, 2014

@defunctzombie 👍

@chirag04
Copy link

just got hit by this bug. Thanks to @rase- I made a new build and it worked perfectly.

@rase-
Copy link
Contributor

rase- commented May 22, 2014

@chirag04 good that you got it solved. :)

@rauchg rauchg closed this as completed May 22, 2014
@rauchg
Copy link
Contributor

rauchg commented May 22, 2014

1.2.1 tagged with the proper build

@lewispham
Copy link

Thank you all guys. This error took my whole day. ~~

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants