speedup WAEncoder #482

Closed
GoogleCodeExporter opened this issue Mar 25, 2015 · 2 comments

The attached changeset speeds up WAEncoder by exploiting two observations:
all of its subclasses output most characters unchanged, and none of them
ever converts a character X into another single character Y.

With this change I got a 15% improvement on this test:

(s := ((1 to: 100000) collect: [ :e | Random between: 32 and: 126 ])
    asByteArray asString) size.
Time millisecondsToRun: [
    100 timesRepeat: [
        ws := String new writeStream.
        (SWAHtmlEncoder on: ws) nextPutAll: s ] ]

before: 5319 ms (best of 4 runs, restarting the VM each time)
after: 4606 ms (best of 4 runs, restarting the VM each time)

Consistent with this, GNU Smalltalk's profiler measured the per-character
cost of WAEncoder>>nextPutAll: and the (new) WASimpleEncoder>>nextPutAll:
at 24 and 21 bytecodes respectively.

The reason is that on GST #notNil is much faster than #isString.
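
To illustrate the idea, here is a minimal sketch and not the attached changeset. It assumes a hypothetical 256-entry lookup table (`table`) that holds nil for characters that pass through unchanged and a replacement String for the rest, so the per-character test is a plain #notNil check. The names `table`, `encode` and `out` are made up for the example.

```smalltalk
"Hypothetical sketch: nil in the table means 'emit the character as-is',
 a String means 'emit this replacement instead'."
| table encode out |
table := Array new: 256.
table at: $< value + 1 put: '&lt;'.
table at: $> value + 1 put: '&gt;'.
table at: $& value + 1 put: '&amp;'.

encode := [:aString :aStream |
    aString do: [:ch | | code entry |
        code := ch value.
        "Characters outside the table are treated as pass-through."
        entry := code < 256
            ifTrue: [table at: code + 1]
            ifFalse: [nil].
        entry notNil
            ifTrue: [aStream nextPutAll: entry]
            ifFalse: [aStream nextPut: ch]]].

out := WriteStream on: String new.
encode value: 'a < b & c' value: out.
out contents    "'a &lt; b &amp; c'"
```

Because a table entry is never another Character, there is no need to ask #isString: anything that is not nil is a replacement string.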

I could get an even better speedup by using SequenceableCollection's
#at:ifAbsent: method.  GST implements it as a primitive on
SequenceableCollection (with the failure code invoking the absentBlock),
which explains its speed.  But that would be a speedup only
when all characters are in the 0-255 range, so I did not do that.
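
For reference, the #at:ifAbsent: variant might look roughly like the following. This is only a sketch under the same assumption of a hypothetical 256-entry `table` with nil for pass-through characters. It is not code from the changeset.

```smalltalk
"Hypothetical sketch: the bounds check is folded into #at:ifAbsent:,
 whose absent block answers nil for characters beyond the table."
| table out |
table := Array new: 256.
table at: $< value + 1 put: '&lt;'.
table at: $& value + 1 put: '&amp;'.

out := WriteStream on: String new.
'1 < 2 & 3' do: [:ch | | entry |
    entry := table at: ch value + 1 ifAbsent: [nil].
    entry notNil
        ifTrue: [out nextPutAll: entry]
        ifFalse: [out nextPut: ch]].
out contents    "'1 &lt; 2 &amp; 3'"
```

The result is the same for characters above 255, but those take the failure path and evaluate the absent block, which is why the gain only shows up when the input stays within the table's range.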

In general, the undisputed hotspot is WriteStream>>#nextPut: (15% of
execution time), which is heavily used under GST by both Swazoo and Seaside.
I am starting to think it was not such a bad idea to make it a primitive in
the Blue Book... :-)

While unportable, even a C-coded String>>#htmlEncoded (to be used by
String>>#encodeOn:) would not be a bad idea actually.  10% of execution
time is spent there.  It would have a higher GC cost, because the C function
may return big strings, but #nextPutAll: then boils down to a single memcpy.

Paolo

Original issue reported on code.google.com by philippe...@gmail.com on 5 Oct 2009 at 4:51
