Bring 'crypto' module into the modern age #3278
Comments
I was looking for a way to write a SHA1 digest directly into an existing buffer at a given offset, e.g. digest(buffer, offset), rather than receiving an intermediate buffer from digest and then copying it. I also noticed that digest('base64') is twice as slow as digest('hex') or digest('binary'):
    Benchmark(function() { crypto.createHash('sha1').update('abcdefg', 'utf8').digest('base64') }, 1000)
    Benchmark(function() { crypto.createHash('sha1').update('abcdefg', 'utf8').digest('hex') }, 1000)
    Benchmark(function() { crypto.createHash('sha1').update('abcdefg', 'utf8').digest('binary') }, 1000)
|
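For reference, the copy-based workaround described in the comment above can be sketched roughly as follows. `digestInto` is a hypothetical helper name, not part of the crypto API, and the sketch assumes a Node version in which `digest()` called without an encoding returns a Buffer.

```javascript
const crypto = require('crypto');

// Hypothetical helper (not part of Node's crypto API): computes the digest,
// then copies it into `target` starting at `offset`. The intermediate Buffer
// returned by digest() is the extra allocation and copy that a
// digest(buffer, offset) style interface would avoid.
function digestInto(algorithm, data, target, offset) {
  const digest = crypto.createHash(algorithm).update(data, 'utf8').digest();
  if (offset + digest.length > target.length) {
    throw new RangeError('target buffer too small for digest');
  }
  digest.copy(target, offset);
  return digest.length; // number of bytes written
}

// 4 zeroed header bytes followed by room for a 20-byte SHA-1 digest.
const target = Buffer.alloc(4 + 20);
const written = digestInto('sha1', 'abcdefg', target, 4);
```

The helper only hides the copy behind a function call; the proposal in the thread is for the digest to be written into the target directly.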
@isaacs What's the overall status of this, and is there anything I could do to help? |
@miksago AFAIK, after speaking to Mikeal, this will be 0.10 territory: everything should be moved to streams, and these should be the last API-breaking changes |
Regarding the API, it would be great if the interface for crypto.randomBytes(size, callback) could match that of window.crypto.getRandomValues(typedarray). The problem with Node's randomBytes is that if you already have a buffer which needs to be filled with entropy, you have to copy memory twice. This incurs unnecessary copies in exactly the kind of situation where you would want to reuse a buffer. window.crypto.getRandomValues is blocking, but for this use case it is more helpful. I would prefer Node to make randomBytes synchronous (i.e. treat entropy generation as a blocking CPU activity rather than a non-blocking IO activity). If the user wants, they can call getRandomValues multiple times to avoid blocking for long. Consider also that the hash functions in Node are already synchronous in this same sense (one can update SHA1 multiple times to avoid blocking, but the update operation itself is synchronous). |
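For readers on newer Node.js releases: `crypto.randomFillSync(buffer[, offset][, size])` was added well after this thread and covers the fill-in-place case described above. A minimal sketch, assuming a Node version that ships it (the buffer layout is purely illustrative):

```javascript
const crypto = require('crypto');

// Reusable buffer: 8 header bytes followed by 16 bytes reserved for a nonce.
const buf = Buffer.alloc(24);

// Fill only bytes 8..23 with entropy, synchronously and in place,
// without allocating an intermediate buffer.
crypto.randomFillSync(buf, 8, 16);

// Typed arrays are accepted as well, similar to getRandomValues(typedarray).
const words = new Uint32Array(4);
crypto.randomFillSync(words);
```

An asynchronous variant, `crypto.randomFill`, takes the same arguments plus a callback for cases where blocking is not acceptable.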
What I'm more interested in is whether crypto methods should return a Buffer instead of a binary string or other string (if they don't already). |
Yes they should be able to return a buffer, but they should also be able to write the digest into an existing buffer at a specified offset. If the getRandomValues interface could be improved while you're at it, then that would be much appreciated. |
hmm, I might have a look at this over the long weekend. |
Having crypto methods return buffers might make sense, but just wanted to toss in that having them support returning strings (maybe not by default) is really convenient. E.g. Would love to see this for |
Mentioning related issues that are still open: #3571, #1393, #2945. Streams make a ton of sense here. For ciphers, though, that pretty much implies creating new buffers. On the other hand, I didn't see any mention of ciphers here, but mostly of methods that don't return a lot of data. Stuff like hash output and I'm pretty much in favor of |
Those functions deal with small amounts in practice. Yet they may be called many times a second in busy servers: e.g. deduplication, hashing keys for lookup etc. It should be possible to eliminate the copy, if the interface is designed right, by having the hash write output into an existing buffer at a specified offset. For the interface to impose a copy when it's not necessary and not desired by the user is wasteful. That's how we've ended up with systems that do copies all the way down. |
My concern is that there may not be a perfect interface for this, when the trade-off is between flexibility and complexity. Assuming we're transitioning to streams, I count a total of 10 remaining methods that deal with raw data. Each of these would have to support some or all of the following means of output in their interface:
I'm also not sure what a zero-copy cipher would look like, if we want to expose them as regular node streams. |
+1 for writing to an existing buffer as an option. I think this would perform much better for the reasons @jorangreef listed and thus would be a good fit for the crypto use in the ssh2 module I'm writing. |
@mscdex Well snap, guess what I'm writing. :) |
Does this issue include making crypto streaming à la zlib, with the actual encryption calls being done in a background thread? Or should I open up another issue for that? |
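As an illustration of what a stream-shaped hash interface looks like, here is a minimal sketch assuming a later Node.js release in which objects returned by `crypto.createHash` are streams and `Readable.from` is available. `sha1OfStream` is a hypothetical helper name. The sketch only shows the interface shape and makes no claim about which thread the hashing work runs on.

```javascript
const crypto = require('crypto');
const { Readable } = require('stream');

// Hypothetical helper: pipes a readable stream through a SHA-1 hash stream
// and resolves with the hex digest once the hash stream has ended.
function sha1OfStream(source) {
  return new Promise((resolve, reject) => {
    const hash = crypto.createHash('sha1');
    hash.setEncoding('hex'); // emit the digest as a hex string

    let hex = '';
    hash.on('data', (chunk) => { hex += chunk; });
    hash.on('end', () => resolve(hex));
    hash.on('error', reject);
    source.on('error', reject);

    source.pipe(hash);
  });
}

// Input arrives in two chunks; the digest covers their concatenation.
const result = sha1OfStream(Readable.from(['abc', 'defg']));
result.then((hex) => console.log(hex));
```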
Fixed in 63ff449 |
@shlevy I don't think so, there is no stream interface yet |
The crypto module relies on many outdated Node ideas. It cannot operate on Buffers directly, and doesn't present any kind of stream interface.
It's the only reason we cannot yet deprecate the 'binary' string encoding.
Its API should be cleaned up, and ideally much more of its functionality should move into JavaScript, making it more DRY.