
fix(aws-lambda): added performance improvements #1315

Draft · wants to merge 17 commits into main
Conversation

kirrg001 · Contributor

@kirrg001 kirrg001 commented Sep 5, 2024

refs https://jsw.ibm.com/browse/INSTA-13498

  • live test with lambda layer
  • live test without lambda layer
  • can the backend still connect the spans if we send them in batches, e.g. 10 at a time?
    • backend confirmed: no big deal as long as they arrive within 20s.
    • add a protection for the 20s limit! -> we cannot add logic for that; we have to tell customers to disable the feature for calls longer than 20s.
  • layer too slow with AND without span buffering - https://github.ibm.com/instana/lambda-extension/pull/18
  • Span buffer
  • fix not waiting for the requests of the current invocation to finish (problem: the Lambda gets frozen, and on the next invocation the data gets sent out but also goes into the next awaitNext iteration!) -> fixed -> reproduce and prove it is no longer happening
  • https://us-east-2.console.aws.amazon.com/lambda/home?region=us-east-2#/functions/nodejs-test-app-1?code&tab=code toggle layers and check why there is the error msg
  • create multiple PRs
  • 250-span iteration -> bug???
  • add a load test with different scenarios (automate?)
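The span-buffering and drain-before-freeze behavior from the checklist above can be sketched roughly like this. This is an illustrative sketch only, not the actual instrumentation code; `SpanBuffer` and `sendBatch` are hypothetical names:

```javascript
// Sketch: spans are collected in memory and shipped in batches; the
// handler awaits all in-flight flushes before resolving, so no request
// is left pending when the Lambda sandbox is frozen. Without drain(),
// pending requests only complete during the *next* invocation.
class SpanBuffer {
  constructor(sendBatch, batchSize = 10) {
    this.sendBatch = sendBatch; // async fn that ships one batch of spans
    this.batchSize = batchSize;
    this.spans = [];
    this.inFlight = []; // promises for batches already sent
  }

  add(span) {
    this.spans.push(span);
    if (this.spans.length >= this.batchSize) {
      this.flush();
    }
  }

  flush() {
    if (this.spans.length === 0) return;
    const batch = this.spans.splice(0, this.spans.length);
    this.inFlight.push(this.sendBatch(batch));
  }

  // Call before the handler resolves, i.e. before the freeze.
  async drain() {
    this.flush();
    await Promise.all(this.inFlight);
    this.inFlight = [];
  }
}
```

With a batch size of 10, adding 25 spans would send two batches of 10 during the invocation and a final batch of 5 on `drain()`.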

CASES

The results vary a little from run to run; factors such as AWS scheduling, the network, and backend response times play a role.
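The averages quoted in the cases below can be computed by parsing the `Billed Duration` field out of the standard Lambda `REPORT` log lines in CloudWatch. A minimal sketch (the helper name is illustrative):

```javascript
// Extracts "Billed Duration: <n> ms" from Lambda REPORT log lines and
// averages the values; non-REPORT lines are ignored.
function averageBilledDuration(logLines) {
  const durations = logLines
    .map(line => /Billed Duration:\s*([\d.]+)\s*ms/.exec(line))
    .filter(Boolean)
    .map(match => parseFloat(match[1]));
  if (durations.length === 0) return NaN;
  return durations.reduce((a, b) => a + b, 0) / durations.length;
}
```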

Heartbeat performance fix

Was extracted from this PR and released already.
0c001c8

We have seen too many random errors regarding failed heartbeat requests to the layer.

Backend down / Host not found

  • 2 spans
  • host-not-found errors trigger retries that run longer than the actual Lambda execution
  • fixed bug in extension
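The kind of fix described above can be illustrated by a retry helper that respects a deadline, so retries cannot outlive the invocation. This is a sketch of the idea only; `retryWithDeadline` and its options are hypothetical names, not the extension's actual API:

```javascript
// Retries a failing async task, but gives up once a deadline is reached,
// so e.g. repeated "host not found" errors cannot keep the runtime busy
// longer than the invocation's time budget.
async function retryWithDeadline(task, { retries = 2, delayMs = 100, deadlineMs = 1000 } = {}) {
  const deadline = Date.now() + deadlineMs;
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    if (Date.now() >= deadline) break; // deadline hit: stop retrying
    try {
      return await task();
    } catch (err) {
      lastError = err;
      await new Promise(resolve => setTimeout(resolve, delayMs));
    }
  }
  throw lastError ?? new Error('deadline exceeded');
}
```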

Current layer

Average Response Time: 367.958ms
Average Billed Duration: 1502.000ms

Local branch

Average Response Time: 350.958ms
Average Billed Duration: 295.500ms

Backend slow / timeout (8 spans, no span batching)

  • lambda fn in US -> serverless endpoint in Asia
  • layer tries twice to send the spans - both failing (this adds a delay on top)

Current layer

Average Response Time: 962.348ms
Average Billed Duration: 1567.000ms

Local branch

Average Response Time: 974.408ms
Average Billed Duration: 1356.900ms

Backend performs (8 spans, no span batching)

  • lambda fn in US -> serverless endpoint in US

Current layer

Average Response Time: 949.076ms
Average Billed Duration: 904.267ms

Local branch

Average Response Time: 957.742ms
Average Billed Duration: 882.929ms
Average Backend Response Time: 65.178ms

Very similar. However, there was a bug that deferred work into the next invocation; that is why the current layer shows similar results here.

10 spans (one /traces, one /bundle)

Backend slow / timeout (100 spans, span batching enabled)

  • lambda fn in US -> serverless endpoint in Asia

Backend performs (100 spans, span batching enabled)

Current layer

Average Response Time: 10280.682ms
Average Billed Duration: 10231.250ms

Local branch

Average Response Time: 10270.244ms
Average Billed Duration: 10195.500ms
Average Backend Response Time: 59.806ms

@kirrg001 kirrg001 changed the title fix(aws-lambda): reduced lambda runtimes for large number of spans fix(aws-lambda): reduced lambda latency for large number of spans Sep 5, 2024
@kirrg001 kirrg001 changed the title fix(aws-lambda): reduced lambda latency for large number of spans fix(serverless): reduced lambda latency for large number of spans Sep 10, 2024
@kirrg001 kirrg001 changed the title fix(serverless): reduced lambda latency for large number of spans fix(aws-lambda): reduced lambda latency for large number of spans Sep 10, 2024
@kirrg001 kirrg001 changed the title fix(aws-lambda): reduced lambda latency for large number of spans fix(aws-lambda): reduced latency for large number of spans Sep 10, 2024
@kirrg001 kirrg001 force-pushed the short-fix-lambda branch 3 times, most recently from dad39ed to 83ae371 Compare November 19, 2024 16:07
@kirrg001 kirrg001 changed the title fix(aws-lambda): reduced latency for large number of spans fix(aws-lambda): added performance improvements Nov 20, 2024