Skip to content
This repository has been archived by the owner on Apr 22, 2023. It is now read-only.

Bring 'crypto' module into the modern age #3278

Closed
isaacs opened this issue May 16, 2012 · 17 comments
Closed

Bring 'crypto' module into the modern age #3278

isaacs opened this issue May 16, 2012 · 17 comments
Labels
Milestone

Comments

@isaacs
Copy link

isaacs commented May 16, 2012

The crypto module relies on many outdated Node ideas. It cannot operate on Buffers directly, and doesn't present any kind of stream interface.

It's the only reason we cannot yet deprecate the 'binary' string encoding.

Its API should be cleaned up, and ideally move much more of its functionality into JavaScript, and make it more dry.

isaacs added a commit that referenced this issue May 16, 2012
@jorangreef
Copy link

Was just looking for a way to get an SHA1 hash digest piped directly into a buffer at an offset in the buffer e.g. digest(buffer, offset) rather than receiving an intermediary buffer from digest and then copying this.

Also noticed that digest('base64') is twice as slow as digest('hex') or digest('binary'):

Benchmark(function() { crypto.createHash('sha1').update('abcdefg', 'utf8').digest('base64') }, 1000)
0.00823ms

Benchmark(function() { crypto.createHash('sha1').update('abcdefg', 'utf8').digest('base64') }, 1000)
0.0076ms

Benchmark(function() { crypto.createHash('sha1').update('abcdefg', 'utf8').digest('hex') }, 1000)
0.00497ms

Benchmark(function() { crypto.createHash('sha1').update('abcdefg', 'utf8').digest('hex') }, 1000)
0.00482ms

Benchmark(function() { crypto.createHash('sha1').update('abcdefg', 'utf8').digest('binary') }, 1000)
0.00446ms

Benchmark(function() { crypto.createHash('sha1').update('abcdefg', 'utf8').digest('binary') }, 1000)
0.00444ms

@ThisIsMissEm
Copy link

@isaacs What's the overall status of this, and is there anything I could do to help?

@tanepiper
Copy link

@miksago AFAIK this stuff will be 0.10 territory after speaking to Mikeal - everything should be moved to streams and be the last API-breaking changes

@jorangreef
Copy link

Regarding API, it would be great if the interface for crypto.randomBytes(size, callback) could match that of window.crypto.getRandomValues(typedarray).

The problem with Node's randomBytes is that if you already have a buffer which you need to be filled with entropy then you have to copy memory twice. This incurs unnecessary copies in the kind of situation where you would often want to reuse a buffer.

window.crypto.getRandomValues is blocking but for the use case it is more helpful. Would prefer Node to make randomBytes synchronous (i.e. treat the entropy generation to be a blocking CPU activity rather than a non-blocking IO activity). If the user wants, they can call getRandomValues multiple times rather to avoid blocking. Consider also that the hash functions in Node are already synchronous in this same sense (one can update SHA1 multiple times to avoid blocking, but the update operation itself is synchronous).

@ThisIsMissEm
Copy link

I think more what I'm interested in, is: should crypto methods be returning a Buffer instead of binary string or other string? (if they don't already)

@jorangreef
Copy link

Yes they should be able to return a buffer, but they should also be able to write the digest into an existing buffer at a specified offset. If the getRandomValues interface could be improved while you're at it, then that would be much appreciated.

@ThisIsMissEm
Copy link

hmm, I might have a look at this over the long weekend.

@aseemk
Copy link

aseemk commented Jun 15, 2012

Having crypto methods return buffers might make sense, but just wanted to toss in that having them support returning strings (maybe not by default) is really convenient. E.g. hash.digest('hex').

Would love to see this for randomBytes() too (btw, any reason it couldn't be shortened to just random()? crypto.random() would be a nice parallel to Math.random()), e.g. crypto.random(16, 'hex').

@stephank
Copy link

stephank commented Aug 8, 2012

Mentioning related issues that are still open: #3571, #1393, #2945.

Streams make a ton of sense here. For ciphers, though, that pretty much implies creating new buffers.

On the other hand, I didn't see any mention of ciphers here, but mostly of methods that don't return a lot of data. Stuff like hash output and randomBytes is usually limited to a handful. Isn't this plenty fast with the existing buffer pool in node? Is the small copy for this data significant?

I'm pretty much in favor of Hash, Hmac, Signer and Verify all as write streams, both Cipher and Decipher as read/write streams, and the remaining functions operating on and returning plain buffers. You'll find those functions deal with small amounts in practice.

@jorangreef
Copy link

Those functions deal with small amounts in practice.

Yet they may be called many times a second in busy servers: e.g. deduplication, hashing keys for lookup etc.

It should be possible to eliminate the copy, if the interface is designed right, by having the hash write output into an existing buffer at a specified offset.

For the interface to impose a copy when it's not necessary and not desired by the user is wasteful.

That's how we've ended up with systems that do copies all the way down.

On 08 Aug 2012, at 10:24 AM, Joyent/Node reply@reply.github.com wrote:

Mentioning related issues that are still open: #3571, #1393, #2945.

Streams make a ton of sense here. For ciphers, though, that pretty much implies creating new buffers.

On the other hand, I didn't see any mention of ciphers here, but mostly of methods that don't return a lot of data. Stuff like hash output and randomBytes is usually limited to a handful. Isn't this plenty fast with the existing buffer pool in node? Is the small copy for this data significant?

I'm pretty much in favor of Hash, Hmac, Signer and Verify all as write streams, both Cipher and Decipher as read/write streams, and the remaining functions operating on and returning plain buffers. You'll find those functions deal with small amounts in practice.


Reply to this email directly or view it on GitHub.

@stephank
Copy link

stephank commented Aug 9, 2012

My concern is that there may not be a perfect interface for this, when the trade-off is between flexibility and complexity. Assuming we're transitioning to streams, I count a total of 10 remaining methods that deal with raw data. Each of these would have to support some or all of the following means of output in their interface:

  • Return a buffer.
  • Return an encoded string.
  • Output to a buffer, optionally at an offset.
  • Output an encoded representation to a buffer, optionally at an offset. (for e.g. text-only protocols?)

I'm also not sure what a zero-copy cipher would look like, if we want to expose them as regular node streams.

@mscdex
Copy link

mscdex commented Aug 9, 2012

+1 for writing to an existing buffer as an option. I think this would perform much better for the reasons @jorangreef listed and thus would be a good fit for the crypto use in the ssh2 module I'm writing.

@stephank
Copy link

stephank commented Aug 9, 2012

@mscdex Well snap, guess what I'm writing. :)

@shlevy
Copy link

shlevy commented Aug 28, 2012

Does this issue including making crypto streaming à la zlib, with the actual encryption calls being done in a background thread? Or should I open up another issue for that?

@indutny
Copy link
Member

indutny commented Sep 18, 2012

Fixed in 63ff449

@shlevy
Copy link

shlevy commented Sep 18, 2012

@indutny Did 63ff449 add a stream interface for crypto?

@langpavel
Copy link

@shlevy I don't think so, there is no stream interface yet

# for free to subscribe to this conversation on GitHub. Already have an account? #.
Labels
Projects
None yet
Development

No branches or pull requests

10 participants