Refingerprint available memory on HUP/reload #18327

Open
optiz0r opened this issue Aug 25, 2023 · 2 comments

@optiz0r
Contributor

optiz0r commented Aug 25, 2023

Proposal

When Nomad is HUP'd/reloaded, it should re-fingerprint the available resources and reflect any changes, for example an increase in system memory. Currently a full restart of the Nomad agent is required to pick up changes, which is disruptive in some cases. It would be very useful to update the amount of memory available on the fly (even if it's only supported for increases rather than decreases).
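
To make the request concrete, here is a minimal sketch of the proposed behavior. It is not Nomad's code: it assumes a Linux host where total memory can be read from /proc/meminfo, and every name in it (parse_mem_total_mb, MemoryFingerprint, install_hup_handler) is hypothetical. On SIGHUP the total is re-read and accepted only if it has grown, matching the "increases only" fallback suggested above.

```python
import signal


def parse_mem_total_mb(meminfo_text: str) -> int:
    """Return MemTotal in MiB from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            # Line looks like: "MemTotal:       16384256 kB"
            kib = int(line.split()[1])
            return kib // 1024
    raise ValueError("MemTotal not found")


def read_proc_meminfo() -> str:
    """Read the kernel's memory summary (Linux only)."""
    with open("/proc/meminfo") as f:
        return f.read()


class MemoryFingerprint:
    """Hypothetical holder for the memory total an agent advertises."""

    def __init__(self, read_meminfo=read_proc_meminfo):
        self._read = read_meminfo
        self.total_mb = parse_mem_total_mb(self._read())

    def refresh(self) -> bool:
        """Re-read memory; accept increases only. Return True if updated."""
        new_total = parse_mem_total_mb(self._read())
        if new_total > self.total_mb:
            self.total_mb = new_total
            return True
        # Decreases (and no change) are ignored in this sketch.
        return False


def install_hup_handler(fingerprint: MemoryFingerprint) -> None:
    """Re-fingerprint whenever the process receives SIGHUP (Unix only)."""
    signal.signal(signal.SIGHUP, lambda signum, frame: fingerprint.refresh())
```

A long-running process would create one MemoryFingerprint at startup, call install_hup_handler on it, and report total_mb to its scheduler after each successful refresh. How a real agent would propagate the new value to the servers is outside the scope of this sketch.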

Use-cases

If Nomad is running on a virtual machine that has its system memory increased at runtime, a full restart of Nomad is currently required before the scheduler can see the additional memory. Re-fingerprinting on reload would allow the additional memory to be seen by the scheduler with minimum disruption.

@shoenig
Member

shoenig commented Aug 29, 2023

Hi @optiz0r, while I can appreciate the use case of re-sizing VMs, I would note that restarting a Nomad client shouldn't be particularly disruptive. We would be curious to hear more about what problems that is causing.

@optiz0r
Contributor Author

optiz0r commented Aug 29, 2023

I routinely see allocations restarted when the Nomad agent is restarted via systemctl restart nomad. Some allocations survive an agent restart untouched, but most are restarted. I've always assumed it's related to the use of Consul or Vault blocking queries to populate templates in Nomad jobs, but I haven't dug into the exact pattern. This is on a mix of EL7+docker and EL8+podman. Given that allocation restarts have potential downstream impact, I consider Nomad agent restarts to be a disruptive operation that needs to be done either in a maintenance window or with the downstream application teams on standby to deal with potential fallout. Hence the wish for memory increases to be reflected via reload/HUP.

Projects
Status: Needs Roadmapping