
Jaeger/Zipkin Exporter Performance #1274

Merged: CodeBlanch merged 12 commits into open-telemetry:master from the zipkin-perf branch on Sep 22, 2020

Conversation

@CodeBlanch CodeBlanch (Member) commented Sep 16, 2020

Changes

A round of performance tweaks for Zipkin. It now uses the Batch&lt;Activity&gt; throughout (instead of making a copy) and flushes small (4 KB) chunks to the HTTP stream as they are ready.

The new version is slightly slower. If I make the chunk size bigger, it can be faster, but I think that is cheating for the sake of benchmarking. In a real-world scenario we want to start sending bytes as soon as we can. Today we buffer everything into memory and then send it all at the end; that's why the memory usage is so high. I think smaller chunks will actually make the export finish faster in real-world usage. I did the same thing (a smaller packet size) for JaegerExporter: the size went from 65000 down to 4096. The socket's default send buffer is 8192. For Jaeger we want to get the packets to the socket and let the networking stack handle the transmission; it has enough buffering that we don't need our own 😄
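Rough sketch of the idea (hedged, with hypothetical names; `ChunkedJsonContent` and `WriteZipkinSpan` below are illustrative, not the code in this PR): spans are serialized straight onto the HTTP request stream through a custom `HttpContent`, and the JSON writer is flushed whenever roughly 4096 bytes are pending, so bytes start moving before the whole batch has been serialized.

```csharp
using System.Diagnostics;
using System.IO;
using System.Net;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text.Json;
using System.Threading.Tasks;
using OpenTelemetry;

// Hypothetical sketch of the chunked-flush approach, not the actual exporter code.
internal sealed class ChunkedJsonContent : HttpContent
{
    private const int FlushThresholdBytes = 4096; // chunk size discussed in this PR

    private readonly Batch<Activity> batch;

    public ChunkedJsonContent(Batch<Activity> batch)
    {
        this.batch = batch;
        this.Headers.ContentType = new MediaTypeHeaderValue("application/json");
    }

    protected override async Task SerializeToStreamAsync(Stream stream, TransportContext context)
    {
        using var writer = new Utf8JsonWriter(stream);

        writer.WriteStartArray();

        foreach (var activity in this.batch)
        {
            WriteZipkinSpan(writer, activity);

            if (writer.BytesPending >= FlushThresholdBytes)
            {
                // Push the pending bytes toward the socket instead of buffering the whole batch.
                await writer.FlushAsync().ConfigureAwait(false);
            }
        }

        writer.WriteEndArray();
        await writer.FlushAsync().ConfigureAwait(false);
    }

    protected override bool TryComputeLength(out long length)
    {
        // Length isn't known up front, so the request is sent chunked.
        length = -1;
        return false;
    }

    private static void WriteZipkinSpan(Utf8JsonWriter writer, Activity activity)
    {
        // Placeholder: the real exporter writes the full Zipkin span shape here.
        writer.WriteStartObject();
        writer.WriteString("traceId", activity.TraceId.ToHexString());
        writer.WriteString("id", activity.SpanId.ToHexString());
        writer.WriteString("name", activity.DisplayName);
        writer.WriteEndObject();
    }
}
```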

/cc @reyang Curious what your thoughts are on this 4096 size change?

Benchmarks

Before:

| Method | NumberOfBatches | NumberOfSpans | Mean | Error | StdDev | Completed Work Items | Lock Contentions | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|---|---|
| ZipkinExporter_Batching | 1 | 10000 | 49.77 ms | 0.539 ms | 0.478 ms | 5.1818 | - | 1181.8182 | 1090.9091 | 1000.0000 | 18.7 MB |
| ZipkinExporter_Batching | 10 | 10000 | 458.13 ms | 5.493 ms | 4.869 ms | 34.0000 | - | 11000.0000 | 9000.0000 | 9000.0000 | 186.79 MB |
| ZipkinExporter_Batching | 100 | 10000 | 4,625.93 ms | 72.792 ms | 68.090 ms | 305.0000 | - | 124000.0000 | 100000.0000 | 100000.0000 | 1867.75 MB |

After:

| Method | NumberOfBatches | NumberOfSpans | Mean | Error | StdDev | Completed Work Items | Lock Contentions | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|---|---|
| ZipkinExporter_Batching | 1 | 10000 | 56.84 ms | 1.087 ms | 1.209 ms | 4.2000 | - | 200.0000 | - | - | 1.71 MB |
| ZipkinExporter_Batching | 10 | 10000 | 511.93 ms | 5.001 ms | 4.678 ms | 25.0000 | - | 2000.0000 | - | - | 16.95 MB |
| ZipkinExporter_Batching | 100 | 10000 | 5,151.66 ms | 66.415 ms | 58.875 ms | 204.0000 | - | 21000.0000 | 1000.0000 | - | 169.3 MB |

TODOs

  • CHANGELOG.md updated for non-trivial changes
  • Changes in public API reviewed

@CodeBlanch CodeBlanch requested a review from a team September 16, 2020 06:47
```csharp
/// </summary>
public TimeSpan TimeoutSeconds { get; set; } = TimeSpan.FromSeconds(10);
```
@CodeBlanch CodeBlanch (Member, Author) commented:

TimeoutSeconds wasn't actually being used. The ActivityProcessor now owns the timeout, so I just removed it.


codecov bot commented Sep 16, 2020

Codecov Report

Merging #1274 into master will decrease coverage by 0.00%.
The diff coverage is 93.47%.


```diff
@@            Coverage Diff             @@
##           master    #1274      +/-   ##
==========================================
- Coverage   79.12%   79.12%   -0.01%
==========================================
  Files         215      215
  Lines        6176     6175       -1
==========================================
- Hits         4887     4886       -1
  Misses       1289     1289
```
| Impacted Files | Coverage Δ |
|---|---|
| ...plementation/ZipkinActivityConversionExtensions.cs | 92.07% <91.66%> (-0.08%) ⬇️ |
| ...rc/OpenTelemetry.Exporter.Zipkin/ZipkinExporter.cs | 85.89% <93.75%> (-1.29%) ⬇️ |
| ...rc/OpenTelemetry.Exporter.Jaeger/JaegerExporter.cs | 84.04% <100.00%> (ø) |
| ...Telemetry.Exporter.Jaeger/JaegerExporterOptions.cs | 100.00% <100.00%> (ø) |
| ...Telemetry.Exporter.Zipkin/ZipkinExporterOptions.cs | 100.00% <100.00%> (ø) |
| ...ter.ZPages/Implementation/ZPagesActivityTracker.cs | 100.00% <0.00%> (+2.85%) ⬆️ |


reyang commented Sep 17, 2020

> /cc @reyang Curious what your thoughts are on this 4096 size change?

It makes total sense to leverage the HTTP/TCP layer to do the buffering. For smaller events (e.g. metrics) we might consider turning off Nagle's algorithm.
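A minimal sketch of what turning off Nagle's algorithm looks like in .NET (illustrative only; this PR does not make that change, and the endpoint below is made up):

```csharp
using System.Net.Sockets;

// TCP_NODELAY disables Nagle's algorithm so small writes go out immediately
// instead of being coalesced by the OS.
using var client = new TcpClient
{
    NoDelay = true,
};

client.Connect("zipkin.example.com", 9411); // hypothetical collector endpoint
```

Whether that is worth wiring into the exporter's HTTP transport is a separate design question from the chunk-size change in this PR.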

@reyang reyang (Member) left a comment


LGTM.

@eddynaka eddynaka (Contributor) left a comment


LGTM!
nit: I would change the title to Jaeger/Zipkin Exporter Performance

@CodeBlanch CodeBlanch changed the title Zipkin Exporter Performance Jaeger/Zipkin Exporter Performance Sep 18, 2020
@CodeBlanch CodeBlanch merged commit 401c111 into open-telemetry:master Sep 22, 2020
@CodeBlanch CodeBlanch deleted the zipkin-perf branch September 22, 2020 19:41