Turn off swap on all machines #38

Closed
xksteven opened this issue Mar 30, 2023 · 5 comments

Comments

@xksteven
Contributor

xksteven commented Mar 30, 2023

sudo swapoff -a
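For anyone who wants to confirm whether a node actually has swap enabled before or after running the command above, here is a minimal sketch. It assumes a Linux node, where the kernel lists active swap areas in /proc/swaps with a header on the first line. The function name swap_device_count is made up for illustration and is not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper: count the active swap areas listed in a
# /proc/swaps-style file. Defaults to /proc/swaps when no argument is given.
swap_device_count() {
    swaps_file="${1:-/proc/swaps}"
    # The first line is a column header, so only count non-empty lines after it.
    awk 'NR > 1 && NF > 0 { n++ } END { print n + 0 }' "$swaps_file"
}
```

Calling swap_device_count with no arguments on a node should print 0 once swap is fully off. Note that swapoff -a only lasts until the next reboot if swap is still configured in /etc/fstab.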

@AndriyNovykov
Collaborator

Why would we want to turn off swap?

@xksteven
Contributor Author

Story time

I hate swap. It's so old school.

When is it used?

A program is consuming too much RAM. Okay, so what should happen now? Let's use flash storage, or God forbid a hard disk, both orders of magnitude slower than RAM, to temporarily store its memory and retrieve it later.

That doesn't sound so bad, right?

Okay, now practically speaking, when does it happen?

We have a program that has an infinite loop and is consuming all of the RAM and resources.

What happens next?

Well, the operating system steps in to save this program by giving it even more memory, this time backed by swap, because if you add a tiny bit more it'll surely work, right?

What happens next?

The entire node sits waiting on I/O because this program needs just a little more memory for its next step of execution. The node is now essentially unresponsive: the CPUs do almost nothing except move pages between RAM and swap, repeating this process for all eternity.

Meanwhile what am I trying to do?

I am just trying to kill the program, either with Ctrl+C or by typing kill -9 <pid>. The operating system won't even listen to me, though, because it's too busy trying to save this program.

Alternate universe.

Here's the alternate universe where swap is dead or nonexistent.

The program consumes all the memory and requests more.

The operating system tells the program that there is no more memory and proceeds to just kill it. This all happens within a few seconds, tops, and as an admin I'm literally none the wiser.

Conclusion

Hope that helps explain why I want it dead.

@andriy-safe-ai
Contributor

@steven-basart Has this been done yet?

@steven-safeai
Contributor

It was done manually on the cluster, but it needed to be done for all nodes and put into a playbook. Just created PR #120.
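The contents of the PR are not shown in this thread, so the following is only a rough sketch of what making the change stick on a node usually involves, not what the playbook does. Because swapoff -a does not survive a reboot while swap entries remain in /etc/fstab, those entries also need to be commented out. The function name disable_swap_in_fstab is hypothetical, and the sketch assumes a standard fstab layout where the third field is the filesystem type.

```shell
#!/bin/sh
# Hypothetical helper: comment out every active swap entry in an
# fstab-style file, keeping a backup copy next to it as <file>.bak.
# Usage (as root on a real node): disable_swap_in_fstab /etc/fstab
disable_swap_in_fstab() {
    fstab="$1"
    # Keep a backup so the change can be reverted by hand.
    cp -p "$fstab" "$fstab.bak" || return 1
    # Prefix "# " to non-comment lines whose third field is "swap";
    # every other line is passed through unchanged.
    awk '!/^[ \t]*#/ && $3 == "swap" { print "# " $0; next } { print }' \
        "$fstab.bak" > "$fstab"
}
```

Running this once per node, followed by sudo swapoff -a, covers both the running system and the next boot. Swap set up by other means (for example a systemd swap unit or zram) is outside what this sketch handles.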

@steven-safeai
Contributor

Merged. Closing Issue.
