feat: readiness #181

Open
wants to merge 9 commits into
base: main

@simone-sanfratello simone-sanfratello commented Dec 16, 2022

note: to be rebased after #180

This PR aims to replace the current readiness logic, which only checks that third-party storage services such as AWS Dynamo and S3 are able to respond. That works for a low-traffic service, but it is not good enough for the intense traffic of the bitswap-peer service.

The proposed readiness logic is going to consider:

  • number of active connections
  • pending request blocks
  • event loop utilization (ELU)

Active connections and pending request blocks are checked against hard limits taken from production metrics, while ELU is an index of how busy the Node.js process is internally. When any one of them exceeds its limit, we can consider that single instance "busy" and stop sending it more requests until it works through the pending load. Note that ELU is more significant than memory and CPU usage; we can add those to the readiness logic in the future, if needed.
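To make the idea concrete, here is a minimal sketch of such a check, not the code in this PR. The limit values and the names `limits`, `isReady` and `createEluSampler` are assumptions for illustration only; the real limits come from production metrics. The only real API used is `performance.eventLoopUtilization()` from Node's `perf_hooks` module.

```javascript
const { performance } = require('node:perf_hooks')

// Hypothetical limits for illustration; real values come from production metrics.
const limits = { maxConnections: 10000, maxPendingBlocks: 50000, maxElu: 0.95 }

// The instance is ready only while every indicator is at or below its limit.
function isReady (load, limits) {
  return load.connections <= limits.maxConnections &&
    load.pendingBlocks <= limits.maxPendingBlocks &&
    load.elu <= limits.maxElu
}

// Returns a function that reports ELU for the interval since its previous call
// (0 means the event loop was idle, 1 means it was busy the whole time).
function createEluSampler () {
  let last = performance.eventLoopUtilization()
  return function sample () {
    const current = performance.eventLoopUtilization()
    // With two arguments, the call returns the delta between the two samples.
    const { utilization } = performance.eventLoopUtilization(current, last)
    last = current
    // Guard against NaN when no time has elapsed between the two samples.
    return Number.isFinite(utilization) ? utilization : 0
  }
}
```

In a service, `sample()` would be called on a timer and its result stored next to the connection and pending-block counters, so that `isReady` can be evaluated on each readiness probe.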

Open question: should we consider Dynamo and S3 access for readiness?

What we really want to avoid is this situation, where the service is busy responding (for ~10 minutes at ~15:00) but still receives requests:

[image: elu]

More info: https://www.nearform.com/blog/event-loop-utilization-with-hpa/

Also note that the next step for bitswap scalability will be to implement scaling on the k8s load balancer based on custom parameters, which will be the same ones (active connections, pending blocks, ELU) exposed on the /load endpoint.
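As a rough illustration of how these values could be exposed, here is a sketch of a small HTTP handler. Only the `/load` path comes from this PR; the `/readiness` path, the `state` and `loadLimits` objects, the response shapes and the limit values are assumptions. Returning 503 when busy relies on Kubernetes treating any status outside 200-399 as a failed HTTP probe.

```javascript
const http = require('node:http')

// Hypothetical limits and counters; the real service would update `state`
// from its connection handling, request queue and ELU sampling.
const loadLimits = { maxConnections: 10000, maxPendingBlocks: 50000, maxElu: 0.95 }
const state = { connections: 0, pendingBlocks: 0, elu: 0 }

// Pure routing function, kept separate from the server so it is easy to test.
function route (path, state, limits) {
  if (path === '/load') {
    // Raw indicators, suitable as custom metrics for scaling decisions.
    return { status: 200, body: { ...state } }
  }
  if (path === '/readiness') {
    const busy = state.connections > limits.maxConnections ||
      state.pendingBlocks > limits.maxPendingBlocks ||
      state.elu > limits.maxElu
    return { status: busy ? 503 : 200, body: { ready: !busy } }
  }
  return { status: 404, body: { error: 'not found' } }
}

const server = http.createServer((req, res) => {
  const { status, body } = route(req.url, state, loadLimits)
  res.writeHead(status, { 'content-type': 'application/json' })
  res.end(JSON.stringify(body))
})

// server.listen(3000) // hypothetical port, left commented out in this sketch
```

Keeping the decision in `route` means the same numbers drive both the probe result and the metrics, so the load balancer and the readiness check cannot disagree about what "busy" means.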

@simone-sanfratello simone-sanfratello temporarily deployed to dev January 2, 2023 15:07 — with GitHub Actions Inactive
@simone-sanfratello simone-sanfratello marked this pull request as ready for review January 5, 2023 10:36