topology-aware: internal error from changing containers' NUMA nodes by adjusting AvailableResources #92

Open
askervin opened this issue Jul 10, 2023 · 0 comments

@askervin
Collaborator

Assume that a container runs on CPUs of NUMA node 0.

An admin wants to reorganize server resources so that containers no longer use CPUs on NUMA node/die/socket 0, and does this by removing those CPUs from AvailableResources.

When this is done, restarting the topology-aware NRI plugin with the new configuration fails with an internal error:

E0710 07:30:57.289447       1 nri.go:784] <= Synchronize FAILED: failed to start policy topology-aware: topology-aware: failed to start:
topology-aware: failed to restore allocations from cache:
topology-aware: failed to allocate <CPU request pod0/pod0c0: exclusive: 3><Memory request: limit:95.37M, req:95.37M> from <NUMA node #1 allocatable: MemLimit: DRAM 1.85G>:
topology-aware: internal error: NUMA node #1: can't slice 3 exclusive CPUs from , 0m available

Let's discuss whether this is a bug or expected behavior, or whether we should provide a configuration option for forcing new CPU/memory pinning, even if it would lead to costly memory accesses/moves.
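To make the two behaviors under discussion concrete, here is a minimal Go sketch of restoring a cached exclusive allocation against a reduced set of available CPUs. It is not the plugin's code: `cpuSet`, `restoreExclusive`, and the `forceRepin` flag are hypothetical names invented for illustration, and the model ignores memory pinning and the pool tree entirely.

```go
package main

import (
	"fmt"
	"sort"
)

// cpuSet is a simplified, hypothetical stand-in for a real CPU set type.
type cpuSet map[int]struct{}

// newCPUSet builds a set from the given CPU ids.
func newCPUSet(ids ...int) cpuSet {
	s := cpuSet{}
	for _, id := range ids {
		s[id] = struct{}{}
	}
	return s
}

// sorted returns the CPU ids in ascending order.
func (s cpuSet) sorted() []int {
	ids := make([]int, 0, len(s))
	for id := range s {
		ids = append(ids, id)
	}
	sort.Ints(ids)
	return ids
}

// restoreExclusive re-creates a cached exclusive allocation against the
// currently available CPUs. If any cached CPU is gone and forceRepin is
// false, it returns an error (loosely modeled on the failure in this issue).
// With forceRepin set (a hypothetical option), it picks replacement CPUs.
func restoreExclusive(cached, available cpuSet, forceRepin bool) (cpuSet, error) {
	missing := 0
	for id := range cached {
		if _, ok := available[id]; !ok {
			missing++
		}
	}
	if missing == 0 {
		return cached, nil // cached placement is still valid
	}
	if !forceRepin {
		return nil, fmt.Errorf("can't restore %d exclusive CPUs: %d no longer available",
			len(cached), missing)
	}
	ids := available.sorted()
	if len(ids) < len(cached) {
		return nil, fmt.Errorf("can't slice %d exclusive CPUs from %d available",
			len(cached), len(ids))
	}
	// Re-pin to the lowest-numbered available CPUs; a real policy would
	// take topology into account when choosing replacements.
	return newCPUSet(ids[:len(cached)]...), nil
}

func main() {
	cached := newCPUSet(0, 1, 2)       // container used to run on NUMA node 0 CPUs
	available := newCPUSet(4, 5, 6, 7) // node 0 CPUs removed from the available set

	_, err := restoreExclusive(cached, available, false)
	fmt.Println("strict restore:", err)

	cpus, err := restoreExclusive(cached, available, true)
	fmt.Println("forced re-pin:", cpus.sorted(), err)
}
```

In this sketch the strict path fails as soon as any cached CPU is missing, while the forced path silently moves the container, which is the step that could cause the costly memory accesses/moves mentioned above.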

The current workaround for this error is to delete the cache, which forces reassignment of resources from scratch. Both this workaround and draining the node before changing AvailableResources are heavier operations than forcing new pinning would be.

askervin added a commit to askervin/nri-plugins that referenced this issue Jul 10, 2023
Test that a running container gets reassigned into new CPUs when the
CPUs where it used to run are not included in AvailableResources
anymore.

Tests issue containers#92.