Python script to calculate number of nodes by given pods and server configuration

mancubus77/pod-packer

Description

This is a pet project that estimates the number of nodes an OpenShift/Kubernetes cluster needs for a given set of pods/CNFs.

Installation / Setup

Python 3.8+ with the venv module is required.

install

python3 -m venv .venv 
source .venv/bin/activate
pip install -r requirements.txt

or, for the impatient:

make install
make run

The script reads a CSV file whose path is given with the -i/--input argument:

❯ python main.py -h
usage: main.py [-h] -i FILE [-d] [--csv]

Read CSV form given path.

optional arguments:
  -h, --help            show this help message and exit
  -i FILE, --input FILE
                        input file
  -d, --detail          Detailed view of pods breakdown
  --csv                 csv output, needs to be used with -d/--detail
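An argparse setup along the following lines would produce help text like the above. This is a sketch reconstructed from the help output only, not the project's actual code: the function names `build_parser` and `parse_args`, and the check that rejects `--csv` without `-d/--detail`, are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the options shown in the help output."""
    parser = argparse.ArgumentParser(description="Read CSV from given path.")
    parser.add_argument("-i", "--input", metavar="FILE", required=True,
                        help="input file")
    parser.add_argument("-d", "--detail", action="store_true",
                        help="Detailed view of pods breakdown")
    parser.add_argument("--csv", action="store_true",
                        help="csv output, needs to be used with -d/--detail")
    return parser


def parse_args(argv=None) -> argparse.Namespace:
    parser = build_parser()
    args = parser.parse_args(argv)
    # Assumed behaviour: reject --csv unless -d/--detail is also set.
    if args.csv and not args.detail:
        parser.error("--csv needs to be used with -d/--detail")
    return args
```

With this sketch, `parse_args(["-i", "data.csv", "-d", "--csv"])` succeeds, while dropping `-d` exits with a usage error.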

To change the server hardware specification, edit config.py:

# Minimum workers to start calculation
MIN_WORKERS = 3 
# vCPUs (or cores for non HT) on server
COMPUTE_CPU = 104
# Memory on server (MB)
COMPUTE_MEM = 384000
# Allowed maximum resource allocation (for high availability)
ALLOCATION_PERCENT = 70
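To make the effect of ALLOCATION_PERCENT concrete, the sketch below works out the per-node budget for the values above. It assumes the cap is applied as a simple percentage of both CPU and memory; the helper `usable_capacity` is hypothetical and not part of the project.

```python
from typing import Tuple

# Values copied from the config.py excerpt above.
COMPUTE_CPU = 104          # vCPUs per server
COMPUTE_MEM = 384000       # memory per server, MB
ALLOCATION_PERCENT = 70    # maximum share of a server that may be allocated


def usable_capacity(cpu: float, mem: float, percent: float) -> Tuple[float, float]:
    """Return the (cpu, mem) budget left after applying the allocation cap."""
    return cpu * percent / 100, mem * percent / 100


cpu_budget, mem_budget = usable_capacity(COMPUTE_CPU, COMPUTE_MEM, ALLOCATION_PERCENT)
print(f"{cpu_budget:.1f} vCPUs, {mem_budget:.0f} MB usable per node")
```

Under that assumption, a 70% cap leaves 72.8 vCPUs and 268,800 MB per node for pods, with the rest kept as headroom for failover.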

Input data format (CSV)

app,mem,cpu,count,affinity
apache,100,3,6,3
nginx,100,3,6,3

app - name of the pod group (must be unique)

mem - required memory per instance (MB)

cpu - required CPU per instance (millicores)

count - number of instances

affinity - anti-affinity rule that keeps pods of the same group from being packed onto the same server (in the detailed view, count 6 with affinity 3 gives max_per_node 2)

platform - (optional) used when apps have the same name but belong to different groups/platforms. For example, app1:nginx and app2:nginx are not the same
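The format above can be read with the standard csv module. The sketch below is illustrative rather than the project's own parser: the `PodGroup` class and `read_groups` function are hypothetical names, the optional platform column is left out, and the `max_per_node` formula (count divided by affinity, rounded up) is an assumption inferred from the sample output, where count 6 and affinity 3 give 2.

```python
import csv
import io
import math
from dataclasses import dataclass


@dataclass
class PodGroup:
    app: str
    mem_mb: int       # memory per instance, MB
    cpu_cores: float  # CPU per instance, converted from millicores to cores
    count: int
    affinity: int

    @property
    def max_per_node(self) -> int:
        # Assumption: affinity is the number of nodes to spread the group over.
        if self.affinity <= 0:
            return self.count
        return math.ceil(self.count / self.affinity)


def read_groups(stream) -> list:
    """Parse CSV rows in the documented format into PodGroup records."""
    return [
        PodGroup(
            app=row["app"],
            mem_mb=int(row["mem"]),
            cpu_cores=int(row["cpu"]) / 1000,
            count=int(row["count"]),
            affinity=int(row["affinity"]),
        )
        for row in csv.DictReader(stream)
    ]


sample = "app,mem,cpu,count,affinity\napache,100,3,6,3\nnginx,100,3,6,3\n"
groups = read_groups(io.StringIO(sample))
```

The millicore-to-core conversion matches the sample output, where an input of 3 is reported as 0.003 in the cpu column.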

Results and Demo

> python main.py -i data_sample/example.csv
 [Sat, 12 Feb 2022 20:06:27] INFO [main.py.<module>:83] Starting allocation, there are 2 apps to be allocated
 [Sat, 12 Feb 2022 20:06:27] INFO [node.py.__init__:24] Creating new node compute-0
 [Sat, 12 Feb 2022 20:06:27] INFO [node.py.__init__:24] Creating new node compute-1
 [Sat, 12 Feb 2022 20:06:27] INFO [node.py.__init__:24] Creating new node compute-2
Simulating node failure. Anit-Affinity violations will be ignored
 [Sat, 12 Feb 2022 20:06:27] INFO [main.py.run_simulation:68] Running Simulation for compute-0
 [Sat, 12 Feb 2022 20:06:27] INFO [main.py.run_simulation:68] Running Simulation for compute-1
 [Sat, 12 Feb 2022 20:06:27] INFO [main.py.run_simulation:68] Running Simulation for compute-2
NODE BREAKDOWN
SUMMARY
+-------+------+-------+-------+
| nodes | pods |  cpu  |  mem  |
+-------+------+-------+-------+
|   3   |  12  | 0.036 | 1,200 |
+-------+------+-------+-------+

Detailed view

❯ python main.py -i data_sample/example.csv -d
 [Wed, 22 Dec 2021 16:59:43] INFO [main.py.<module>:65] Starting allocation, there are 2 apps to be allocated
 [Wed, 22 Dec 2021 16:59:43] INFO [node.py.__init__:29] Creating new node compute-0
 [Wed, 22 Dec 2021 16:59:43] INFO [node.py.__init__:29] Creating new node compute-1
 [Wed, 22 Dec 2021 16:59:43] INFO [node.py.__init__:29] Creating new node compute-2
+-----------+--------+-----+-------+----------+--------------+
|   node    |  app   | mem |  cpu  | affinity | max_per_node |
+-----------+--------+-----+-------+----------+--------------+
| compute-0 | apache | 100 | 0.003 |    3     |      2       |
+-----------+--------+-----+-------+----------+--------------+
| compute-0 | apache | 100 | 0.003 |    3     |      2       |
+-----------+--------+-----+-------+----------+--------------+
| compute-0 | nginx  | 100 | 0.003 |    3     |      2       |
+-----------+--------+-----+-------+----------+--------------+
| compute-0 | nginx  | 100 | 0.003 |    3     |      2       |
+-----------+--------+-----+-------+----------+--------------+
| compute-1 | apache | 100 | 0.003 |    3     |      2       |
+-----------+--------+-----+-------+----------+--------------+
| compute-1 | apache | 100 | 0.003 |    3     |      2       |
+-----------+--------+-----+-------+----------+--------------+
| compute-1 | nginx  | 100 | 0.003 |    3     |      2       |
+-----------+--------+-----+-------+----------+--------------+
| compute-1 | nginx  | 100 | 0.003 |    3     |      2       |
+-----------+--------+-----+-------+----------+--------------+
| compute-2 | apache | 100 | 0.003 |    3     |      2       |
+-----------+--------+-----+-------+----------+--------------+
| compute-2 | apache | 100 | 0.003 |    3     |      2       |
+-----------+--------+-----+-------+----------+--------------+
| compute-2 | nginx  | 100 | 0.003 |    3     |      2       |
+-----------+--------+-----+-------+----------+--------------+
| compute-2 | nginx  | 100 | 0.003 |    3     |      2       |
+-----------+--------+-----+-------+----------+--------------+
+-----------+---+-------+-------+-----+-------+
| compute-0 | 4 | 0.012 | 0.012 | 400 | 0.104 |
+-----------+---+-------+-------+-----+-------+
| compute-1 | 4 | 0.012 | 0.012 | 400 | 0.104 |
+-----------+---+-------+-------+-----+-------+
| compute-2 | 4 | 0.012 | 0.012 | 400 | 0.104 |
+-----------+---+-------+-------+-----+-------+
SUMMARY
+-------+------+-------+-------+
| nodes | pods |  cpu  |  mem  |
+-------+------+-------+-------+
|   3   |  12  | 0.036 | 1,200 |
+-------+------+-------+-------+
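The layout in the detailed view (two apache and two nginx pods on each of three nodes) is what a first-fit placement with a per-node cap would produce. The sketch below shows that idea only; it is not the project's algorithm, the `allocate` function is hypothetical, and CPU and memory checks are left out for brevity.

```python
from collections import Counter

MIN_WORKERS = 3  # same value as in config.py


def allocate(groups, min_workers=MIN_WORKERS):
    """Place pods first-fit, honouring a per-node cap for each app.

    groups: list of (app, count, max_per_node) tuples.
    Returns a list of Counters, one per node, mapping app -> pod count.
    """
    nodes = [Counter() for _ in range(min_workers)]
    for app, count, max_per_node in groups:
        for _ in range(count):
            # First node that still has room for this app under the cap.
            target = next((n for n in nodes if n[app] < max_per_node), None)
            if target is None:
                # No existing node qualifies, so grow the cluster by one node.
                target = Counter()
                nodes.append(target)
            target[app] += 1
    return nodes


layout = allocate([("apache", 6, 2), ("nginx", 6, 2)])
for index, node in enumerate(layout):
    print(f"compute-{index}: {dict(node)}")
```

For the sample input this stays at the three starting nodes; raising a count above what the cap allows on three nodes makes the sketch add a fourth.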


Failure domain

After allocation, the application shuts down nodes one by one and checks that the pods from the failed node can be rescheduled onto the remaining nodes. Anti-affinity violations are ignored during this check. In the example below, when compute-1 fails, pod AUSF cannot find a new node with sufficient resources, so the test fails. The application does not take minimum availability or pod disruption budgets into account.

[Sat, 12 Feb 2022 16:52:46] WARNING [node_list.py.is_node_schedulable:70] Node compute-2 is full: CPU: True MEM: False 216130 : 384000.0
[Sat, 12 Feb 2022 16:52:46] ERROR [main.py.run_allocations:49] FAILED: Can not evict AUSF from failed node compute-1
Reconsider ALLOCATION_PERCENT values, it's 100% now
Allocated nodes: 4
NODE BREAKDOWN
+-----------+-----------+---------+--------+---------+--------+
|   node    | pod count |   cpu   | cpu,%  | mem, GB | mem,%  |
+-----------+-----------+---------+--------+---------+--------+
| compute-0 |    90     | 103.953 | 99.955 | 215,767 | 56.189 |
+-----------+-----------+---------+--------+---------+--------+
| compute-1 |    85     | 103.977 | 99.978 | 215,216 | 56.046 |
+-----------+-----------+---------+--------+---------+--------+
| compute-2 |    92     | 103.99  | 99.99  | 210,212 | 54.743 |
+-----------+-----------+---------+--------+---------+--------+
| compute-3 |    14     |  8.135  | 7.822  | 18,596  | 4.843  |
+-----------+-----------+---------+--------+---------+--------+
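The single-node failure check described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the project's code: pods are plain (cpu, mem) tuples, the functions `can_evict` and `survives_any_single_failure` are hypothetical, and placement is first-fit, so it may report a failure where a smarter packing would succeed.

```python
def can_evict(nodes, failed, capacity):
    """Check whether the pods on nodes[failed] fit on the surviving nodes.

    nodes: list of lists of (cpu, mem) pod requests, one list per node.
    capacity: (cpu, mem) budget available on every node.
    Anti-affinity is deliberately ignored, as in the description above.
    """
    # Remaining headroom on each surviving node.
    free = [
        [capacity[0] - sum(p[0] for p in pods),
         capacity[1] - sum(p[1] for p in pods)]
        for i, pods in enumerate(nodes) if i != failed
    ]
    for cpu, mem in nodes[failed]:
        slot = next((f for f in free if f[0] >= cpu and f[1] >= mem), None)
        if slot is None:
            return False  # this pod has nowhere to go
        slot[0] -= cpu
        slot[1] -= mem
    return True


def survives_any_single_failure(nodes, capacity):
    """True if every node can fail, one at a time, without stranding pods."""
    return all(can_evict(nodes, i, capacity) for i in range(len(nodes)))


cluster = [[(3, 100)], [(3, 100)], [(3, 100)]]
print(survives_any_single_failure(cluster, capacity=(4, 1000)))
print(survives_any_single_failure(cluster, capacity=(6, 1000)))
```

In the first call each survivor has only 1 CPU free, so the displaced pod cannot be placed; in the second, 3 CPUs of headroom per node are enough. This is the same effect the ALLOCATION_PERCENT hint in the log points at: with nodes filled to 100%, there is no room to absorb a failure.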

DEBUG

To run the script in debug mode, set the environment variable LOGLEVEL=DEBUG:

export LOGLEVEL=DEBUG
python main.py -i data_sample/example.csv

or use the make target:

make run-debug
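A common way for a Python script to honour a LOGLEVEL variable is shown below. How the project itself reads the variable is not documented here, so treat this as an assumption; `configure_logging` is a hypothetical helper.

```python
import logging
import os


def configure_logging(env=os.environ) -> int:
    """Set the root logger level from LOGLEVEL, defaulting to INFO."""
    name = env.get("LOGLEVEL", "INFO").upper()
    # Look up the numeric level; fall back to INFO for unknown names.
    level = getattr(logging, name, logging.INFO)
    if not isinstance(level, int):
        level = logging.INFO
    logging.basicConfig(level=level)
    # basicConfig is a no-op if handlers already exist, so set the level too.
    logging.getLogger().setLevel(level)
    return level
```

Calling `configure_logging()` once at start-up, before any log messages are emitted, is enough for `export LOGLEVEL=DEBUG` to take effect.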

TODO

Web UI with visualisation
