This report updates a prior report. Please refer to the prior report for background information.
KryoNet was also added in this version of the report. KryoNet is a JVM-only solution that uses its own TCP or UDP connections and the fast Kryo serializer.
As a result of Issue #1, the ping-pong benchmark was redesigned. The main changes are:
- 10,000 pings are sent per second (i.e. one ping every 100,000 nanoseconds)
- Time is measured from the moment a ping was scheduled to be sent
- Different threads are used for sending and receiving
- gRPC was changed to use a bidirectional stream (not request-response)
- Aeron message payloads were reduced to the minimum necessary for each message sent
- Aeron was configured so that its buffers do not perform bounds checking
- KryoNet was added to the benchmark
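The pacing and timing rules in the list above can be sketched as follows. This is a minimal Python illustration and not the benchmark's own code, which runs on the JVM; the function names `scheduled_send_times` and `latencies_from_schedule` and the sample timings are hypothetical. It shows why measuring from the scheduled send time matters: if the sender falls behind, the delay is charged to the measured latency instead of being silently dropped.

```python
PINGS_PER_SECOND = 10_000
NANOS_PER_SECOND = 1_000_000_000
# 1,000,000,000 ns / 10,000 pings = 100,000 ns between pings
INTERVAL_NANOS = NANOS_PER_SECOND // PINGS_PER_SECOND


def scheduled_send_times(start_nanos, count):
    """Return the time at which each ping is due, on a fixed schedule."""
    return [start_nanos + i * INTERVAL_NANOS for i in range(count)]


def latencies_from_schedule(scheduled, received):
    """Measure latency from the scheduled send time, not the actual one."""
    return [rx - due for due, rx in zip(scheduled, received)]


# Hypothetical example: three pings. The sender stalls before the second
# ping and sends it 50,000 ns late. Each pong arrives 4,000 ns after the
# ping was actually sent.
due = scheduled_send_times(0, 3)
actual_send = [due[0], due[1] + 50_000, due[2]]
received = [t + 4_000 for t in actual_send]

latencies = latencies_from_schedule(due, received)
print(INTERVAL_NANOS)  # 100000
print(latencies)       # [4000, 54000, 4000]
```

Had the second latency been measured from the actual send time, it would read 4,000 ns and the 50,000 ns stall would not appear in the results.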
The following changes were applied to the price stream benchmark:
- Aeron was configured so that its buffers do not perform bounds checking
- KryoNet was added to the benchmark
While KryoNet supports TCP and UDP, only TCP offers guarantees around message ordering and retry. The KryoNet test was therefore confined to TCP, which matches the quality of service guarantees made by Aeron (which uses UDP but automatically manages delivery order and retry) and gRPC (which uses HTTP/2 transport).
For the KryoNet server, the ping-pong responder ran in the same thread as the KryoNet server. The price stream responder ran in a different thread, because its long execution time would otherwise block the KryoNet server thread and cause the client to be disconnected.
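The threading arrangement described for the price stream responder is a common hand-off pattern, sketched below in Python. This does not use KryoNet's API and is not the benchmark's code; `start_worker`, `on_received` and `stream_prices` are hypothetical names. The callback that would run on the network thread only enqueues the request, and a separate worker thread does the long-running work.

```python
import queue
import threading


def start_worker(handle):
    """Start a worker thread that runs `handle` on each queued request."""
    requests = queue.Queue()

    def run():
        while True:
            item = requests.get()
            if item is None:  # sentinel: stop the worker
                break
            handle(item)

    worker = threading.Thread(target=run, daemon=True)
    worker.start()
    return requests, worker


sent = []


def stream_prices(count):
    # Stand-in for the long-running price stream responder.
    for i in range(count):
        sent.append(i)


requests, worker = start_worker(stream_prices)


def on_received(request):
    # Stand-in for a network-thread callback: enqueue and return at once,
    # so the network thread stays free to service the connection.
    requests.put(request)


on_received(1_000)
requests.put(None)  # ask the worker to stop once the request is done
worker.join()
print(len(sent))  # 1000
```

For a short responder such as ping-pong, replying directly inside the callback avoids the cost of the hand-off, which is consistent with the same-thread arrangement described above.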
The following changes were made to the test environment:
- Aeron was upgraded to version 1.0.2
- Protocol Buffers was upgraded to version 3.1.0
- Linux was upgraded to version 4.7.4
- KryoNet version 2.22.0-RC1 was added, together with Kryo version 4.0.0
This benchmark was run with Git revision 317a91a9156cf546b97281ba7fe850ce955533dd.
All values in the two tables below are in microseconds.
Percentile | gRPC | KryoNet | Aeron |
---|---|---|---|
0.00 | 37 | 11 | 3 |
0.10 | 48 | 17 | 4 |
0.20 | 51 | 18 | 4 |
0.30 | 54 | 18 | 4 |
0.40 | 56 | 18 | 4 |
0.50 | 59 | 19 | 4 |
0.60 | 62 | 20 | 4 |
0.70 | 65 | 21 | 5 |
0.80 | 69 | 22 | 5 |
0.90 | 76 | 24 | 6 |
0.95 | 81 | 26 | 6 |
0.99 | 130 | 33 | 8 |
0.999 | 4,452 | 61 | 13 |
1.00 | 14,532 | 2,770 | 895 |
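To help read the percentile rows above, here is a minimal sketch of how such a table can be derived from raw latency samples. It assumes the nearest-rank definition of a percentile; this report does not state which method or histogram library the benchmark used, and the sample values below are hypothetical.

```python
import math


def percentile(sorted_values, fraction):
    """Nearest-rank percentile of an ascending list of samples.

    Returns the smallest sample such that at least `fraction` of all
    samples are less than or equal to it.
    """
    if not sorted_values:
        raise ValueError("no samples")
    # Rounding guards against floating-point noise in the product.
    rank = max(1, math.ceil(round(fraction * len(sorted_values), 9)))
    return sorted_values[rank - 1]


# Hypothetical latency samples in microseconds (not benchmark data).
samples = sorted([4, 3, 5, 4, 6, 4, 8, 4, 5, 895])

for fraction in (0.0, 0.5, 0.9, 1.0):
    print(fraction, percentile(samples, fraction))
# 0.0 3
# 0.5 4
# 0.9 8
# 1.0 895
```

The 0.00 row is the fastest sample and the 1.00 row is the slowest, so a single outlier is enough to produce a large jump between the 0.999 and 1.00 rows.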
The redesigned ping-pong benchmark includes changes that should help gRPC, such as using a single bidirectional stream and throttling requests to a rate that, based on the price stream benchmark, gRPC can easily accommodate on this hardware. Even so, Aeron continues to provide much lower and more stable latency.
Percentile | gRPC | KryoNet | Aeron |
---|---|---|---|
0.00 | 5,570 | 1,589 | 136 |
0.10 | 107,374,182 | 38,990,249 | 682,622 |
0.20 | 211,392,921 | 77,510,737 | 1,214,251 |
0.30 | 319,975,063 | 115,427,246 | 1,719,664 |
0.40 | 432,717,955 | 155,155,693 | 2,222,981 |
0.50 | 546,266,152 | 192,468,221 | 2,736,783 |
0.60 | 603,442,905 | 228,707,008 | 3,244,294 |
0.70 | 658,740,609 | 265,616,883 | 3,743,416 |
0.80 | 770,409,758 | 301,184,581 | 4,236,247 |
0.90 | 992,137,445 | 338,765,545 | 4,739,563 |
0.95 | 1,048,508,891 | 355,408,543 | 4,987,027 |
0.99 | 1,094,679,789 | 370,977,800 | 5,188,354 |
0.999 | 1,104,880,336 | 374,467,461 | 5,230,297 |
1.00 | 1,105,954,078 | 374,735,896 | 5,234,491 |
Aeron delivered over 19 million messages per second, completing the transfer 211 times faster than gRPC and 71 times faster than KryoNet. This is a material improvement over the prior benchmark report and is mostly due to buffer bounds checks being disabled.
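The ratios quoted above can be reproduced from the 1.00 row of the price stream table. The sketch below assumes the table values are microseconds and that the stream carried 100 million messages; the message count is not stated in this report and is an assumption chosen because it is consistent with the quoted rate of over 19 million messages per second.

```python
# Completion times in microseconds, from the 1.00 row of the table above.
COMPLETION_MICROS = {
    "gRPC": 1_105_954_078,
    "KryoNet": 374_735_896,
    "Aeron": 5_234_491,
}

# ASSUMPTION: total messages streamed (not stated in this report).
ASSUMED_MESSAGES = 100_000_000


def speedup(slower, faster):
    """How many times sooner `faster` finished than `slower`."""
    return COMPLETION_MICROS[slower] / COMPLETION_MICROS[faster]


def messages_per_second(stack):
    """Message rate implied by the assumed message count."""
    seconds = COMPLETION_MICROS[stack] / 1_000_000
    return ASSUMED_MESSAGES / seconds


print(int(speedup("gRPC", "Aeron")))                 # 211
print(int(speedup("KryoNet", "Aeron")))              # 71
print(round(messages_per_second("Aeron") / 1e6, 1))  # 19.1
```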
Each Aeron message carried 44 bytes of user-defined content, consisting of an 8 byte user reserved value field (not used in the benchmark, but sent anyway), an 8 byte SBE header, and a 28 byte SBE price message. As such, Aeron delivered 801 megabytes of user-defined content per second (counting a megabyte as 2^20 bytes).
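The byte arithmetic above can be checked with a short calculation. As a hedged sketch, it assumes 100 million messages (a figure not stated in this report) completed in the 5,234,491 microseconds shown in Aeron's 1.00 row; under that assumption the 801 figure corresponds to binary megabytes.

```python
USER_RESERVED_BYTES = 8  # user reserved value field (unused, but sent)
SBE_HEADER_BYTES = 8     # SBE header
SBE_PRICE_BYTES = 28     # SBE price message
BYTES_PER_MESSAGE = USER_RESERVED_BYTES + SBE_HEADER_BYTES + SBE_PRICE_BYTES

# ASSUMPTION: message count is not stated in this report.
ASSUMED_MESSAGES = 100_000_000
AERON_COMPLETION_MICROS = 5_234_491  # Aeron's 1.00 row, price stream table

messages_per_second = ASSUMED_MESSAGES / (AERON_COMPLETION_MICROS / 1_000_000)
bytes_per_second = messages_per_second * BYTES_PER_MESSAGE

print(BYTES_PER_MESSAGE)              # 44
print(int(bytes_per_second / 2**20))  # 801 (megabytes of 2^20 bytes)
print(int(bytes_per_second / 10**6))  # 840 (megabytes of 10^6 bytes)
```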
As in the prior benchmark, Aeron continues to deliver exceptional performance and latency results. The prior report's qualifier about the very different focuses of the stacks still applies.
If you are interested in other low latency benchmarks on the JVM, you might like to browse our Embedded Key-Value Store Benchmark and JVM Hashing Algorithm Benchmark reports.