Add preflight OS, CPU, RAM, Swap, and Filesystem checks (backport #326) #329

Merged: 1 commit, Feb 26, 2025

mergify bot commented Feb 25, 2025

  • Implemented OS, NIC, and other preflight checks to validate system requirements before Ceph cluster creation.

    • Checks include:
      • OS version (RHEL 9+ required)
      • SELinux enforcing mode
      • Firewalld installation and status
      • Required package availability (rpcbind, podman, firewalld)
      • Podman version check (>= 3.3)
      • RHEL software profile validation
      • Tuned profile check
      • CPU, RAM, Swap, and Filesystem (part of other checks)
      • Whether jumbo frames are enabled
      • Whether each NIC is configured with DHCP or a static IP
      • Whether the NIC bandwidth is sufficient
      • Current NIC options (e.g., bonding, not bridged or virtual), collected and reported
      • Network latency (ping) to all hosts provided in the inventory file
      • Separate NICs for front-end and back-end networks
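
The Podman requirement (>= 3.3) in the list above is a version comparison, and comparing version strings as plain text gives the wrong answer for values such as "3.10" (which sorts before "3.3" lexically). The following is a minimal Python sketch of a numeric comparison. It is not the playbook's implementation, which this page does not show, and the function names `parse_version` and `meets_minimum` are hypothetical.

```python
from __future__ import annotations

import re


def parse_version(text: str) -> tuple[int, ...]:
    """Extract the leading dotted numeric version, e.g. '4.9.4-1.el9' -> (4, 9, 4)."""
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    if match is None:
        raise ValueError(f"no numeric version found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare two versions numerically, component by component."""
    a, b = parse_version(installed), parse_version(minimum)
    # Pad the shorter tuple with zeros so that 3.3 and 3.3.0 compare equal.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


if __name__ == "__main__":
    print(meets_minimum("3.10.1", "3.3"))  # numeric comparison
    print("3.10.1" >= "3.3")               # naive string comparison
```

The same helper could serve the "RHEL 9+" check by comparing the distribution version against "9", assuming the version is available as a dotted string.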

Enhancements:

❯ ansible-playbook -i ~/ansible-inventory/inventory.ini cephadm-preflight.yml

PLAY [insecure_registries] *******************************************************************************************************************************************************************************************************************

TASK [fail if insecure_registry is undefined] ************************************************************************************************************************************************************************************************
skipping: [rhel-ceph-admin]

PLAY [preflight] *****************************************************************************************************************************************************************************************************************************

TASK [fail when ceph_origin is custom with no repository defined] ****************************************************************************************************************************************************************************
skipping: [rhel-ceph-admin]

TASK [fail if baseurl is not defined for ceph_custom_repositories] ***************************************************************************************************************************************************************************
skipping: [rhel-ceph-admin]

PLAY [all] ***********************************************************************************************************************************************************************************************************************************

❯ ansible-playbook -i ~/ansible-inventory/inventory.ini cephadm-preflight.yml

PLAY [insecure_registries] *******************************************************************************************************************************************************************************************************************

TASK [fail if insecure_registry is undefined] ************************************************************************************************************************************************************************************************
skipping: [rhel-ceph-admin]

PLAY [preflight] *****************************************************************************************************************************************************************************************************************************

TASK [fail when ceph_origin is custom with no repository defined] ****************************************************************************************************************************************************************************
skipping: [rhel-ceph-admin]

TASK [fail if baseurl is not defined for ceph_custom_repositories] ***************************************************************************************************************************************************************************
skipping: [rhel-ceph-admin]

PLAY [Preflight Checks for Ceph Deployment] **************************************************************************************************************************************************************************************************

TASK [Initialize preflight results list] *****************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Collect installed package facts] *******************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Check if OS is RHEL 9+] ****************************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Ensure SELinux is set to Enforcing mode] ***********************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Determine SELinux Check Result] ********************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Determine SELinux Failure Reason] ******************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Determine Package Installation Check Result] *******************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Determine Package Installation Failure Reason] *****************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Fetch Firewalld status] ****************************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Extract Podman version if installed] ***************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Determine if Podman meets version requirement (>=3.3)] *********************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Validate RHEL software profile] ********************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Define RHEL Profile Check Result] ******************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Define RHEL Profile Check Reason] ******************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Get current tuned profile] *************************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Define Tuned Profile Check Result] *****************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Define Tuned Profile Check Reason] *****************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Check CPU x86-64-v2 support] ***********************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Define CPU, RAM, Swap, and Filesystem Check Variables] *********************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Ping all hosts in inventory to measure latency] ****************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin] => (item=rhel-ceph-admin)

TASK [Define networking facts] ***************************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Store all preflight check results] *****************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Generate preflight check report file] **************************************************************************************************************************************************************************************************
changed: [rhel-ceph-admin -> localhost]

TASK [Load the preflight check report] *******************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Final Check - Fail if any critical checks failed] **************************************************************************************************************************************************************************************
fatal: [rhel-ceph-admin]: FAILED! => changed=false 
  msg: 'Preflight checks failed for the following: Tuned Profile, RHEL Profile, Minimum RAM, Swap Space, /var Partition, Root Filesystem, Jumbo Frames Enabled, NIC Static IP Configuration, NIC Bandwidth. Please resolve these issues before proceeding.'

PLAY RECAP ***********************************************************************************************************************************************************************************************************************************
rhel-ceph-admin            : ok=25   changed=1    unreachable=0    failed=1    skipped=3    rescued=0    ignored=0  

=================================================================================================
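
The final task in the run above fails the play with one message that names every failed check. A rough Python sketch of that aggregation step follows. The `CheckResult` type, its field names, and `failure_message` are hypothetical and chosen for illustration; the playbook itself stores its results in Ansible facts. The sketch assumes that informational checks never count as failures.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class CheckResult:
    name: str
    status: str       # assumed values: "passed", "failed", or "info"
    reason: str = ""


def failure_message(results: Iterable[CheckResult]) -> Optional[str]:
    """Return the combined failure message, or None if nothing failed."""
    failed = [r.name for r in results if r.status == "failed"]
    if not failed:
        return None
    return (
        "Preflight checks failed for the following: "
        + ", ".join(failed)
        + ". Please resolve these issues before proceeding."
    )


if __name__ == "__main__":
    sample = [
        CheckResult("OS Version", "passed"),
        CheckResult("Tuned Profile", "failed", "Incorrect tuned profile."),
        CheckResult("NIC Configuration", "info", "Available network interfaces: ens3"),
        CheckResult("Minimum RAM", "failed", "System has only 7684 MB RAM"),
    ]
    print(failure_message(sample))
```

Collecting every result before failing, rather than stopping at the first failed check, is what lets a single run list all outstanding issues at once.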

❯ cat preflight_report.txt
==================================================
               **  Preflight Check Report **
==================================================

 System Checks
--------------------------------------------------
- OS Version: ✅ Passed

- Tuned Profile: ❌ Failed
    - Reason: Incorrect tuned profile. Expected: throughput-performance

- RHEL Profile: ❌ Failed
    - Reason: Incorrect RHEL software profile. Expected: Server with File and Storage Server.

- Firewalld Running: ✅ Passed

- Podman Installed: ✅ Passed

- SELinux: ✅ Passed

- Required Packages Installed: ✅ Passed

- Minimum RAM (8GB): ❌ Failed
    - Reason: System has only 7684 MB RAM, required: 8192MB

- Swap Space (1.5x RAM): ❌ Failed
    - Reason: System has only 5119 MB Swap, required: 11526 MB

- CPU x86-64-v2: ✅ Passed

- CPU Cores >= 4: ✅ Passed

- /var is a separate partition: ❌ Failed
    - Reason: /var is not a separate partition

- Root Filesystem >= 100GB: ❌ Failed
    - Reason: Root FS is only 43GB, required: 100GB

- NIC Configuration: ℹ️ INFO
    - Reason: Available network interfaces: ens3 | Speeds (Mbps): -1

- Jumbo Frames Enabled: ❌ Failed
    - Reason: MTU is 1500, recommended > 1500

- NIC Static IP Configuration: ❌ Failed
    - Reason: NIC is using DHCP, static IP is recommended

- NIC Bandwidth (10GbE Recommended): ❌ Failed
    - Reason: NIC speed is -1 Mbps, recommended is 10GbE

- Network Latency: ℹ️ INFO
    - Reason: Average latency (ms): ['0.111']

==================================================
** Summary **
--------------------------------------------------
❌ Critical Failures Detected:
   - Tuned Profile, RHEL Profile, Minimum RAM, Swap Space, /var Partition, Root Filesystem, Jumbo Frames Enabled, NIC Static IP Configuration, NIC Bandwidth

** Action Required: Please resolve these issues before proceeding.
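
The RAM and swap figures in the report above follow from simple arithmetic: the swap requirement is 1.5 times the detected RAM, so 7684 MB of RAM gives 1.5 × 7684 = 11526 MB. The Python sketch below reproduces the two "Reason" lines under that assumption. The function names are hypothetical, and truncating a fractional result to whole megabytes is a guess, since the report only shows a case where the product is already a whole number.

```python
from typing import Tuple


def check_ram(ram_mb: int, minimum_mb: int = 8192) -> Tuple[bool, str]:
    """Pass when the host has at least minimum_mb of RAM (8 GB by default)."""
    if ram_mb >= minimum_mb:
        return True, ""
    return False, f"System has only {ram_mb} MB RAM, required: {minimum_mb}MB"


def check_swap(ram_mb: int, swap_mb: int, factor: float = 1.5) -> Tuple[bool, str]:
    """Pass when swap is at least factor times the detected RAM."""
    required_mb = int(ram_mb * factor)  # assumption: truncate to whole MB
    if swap_mb >= required_mb:
        return True, ""
    return False, f"System has only {swap_mb} MB Swap, required: {required_mb} MB"


if __name__ == "__main__":
    # Values taken from the report above.
    print(check_ram(7684))
    print(check_swap(7684, 5119))
```

Because the swap requirement is derived from the detected RAM rather than from the 8 GB minimum, adding memory to a host also raises the amount of swap it needs to pass.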

❯ pwd
/home/kushaldeb/Github/cephadm-ansible/reports

░▒▓ ~/Github/cephadm-ansible/reports  on implement_os_preflight_checks *1 

❯ ls -l
total 4
-rw-r--r--. 1 kushaldeb kushaldeb 1872 Feb 24 22:04 rhel-ceph-admin_preflight_report.txt



This is an automatic backport of pull request #326 done by [Mergify](https://mergify.com).


mergify bot commented Feb 25, 2025

Cherry-pick of f4833f4 has failed:

On branch mergify/bp/squid/pr-326
Your branch is up to date with 'origin/squid'.

You are currently cherry-picking commit f4833f4.
  (fix conflicts and run "git cherry-pick --continue")
  (use "git cherry-pick --skip" to skip this patch)
  (use "git cherry-pick --abort" to cancel the cherry-pick operation)

Changes to be committed:
	modified:   ceph_defaults/defaults/main.yml
	new file:   checks.yml
	new file:   rhel-checks.yml
	new file:   templates/preflight_report.j2

Unmerged paths:
  (use "git add <file>..." to mark resolution)
	both modified:   cephadm-preflight.yml

To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally

mergify bot added the conflicts label on Feb 25, 2025
guits force-pushed the mergify/bp/squid/pr-326 branch from 94bfcc3 to 331269a on February 25, 2025 16:17
guits removed the conflicts label on Feb 25, 2025
- Implemented OS preflight checks to validate system requirements before Ceph cluster creation.
- Checks include:
  - OS version (RHEL 9+ required)
  - SELinux enforcing mode
  - Firewalld installation and status
  - Required package availability (rpcbind, podman, firewalld)
  - Podman version check (>= 3.3)
  - RHEL software profile validation
  - Tuned profile check
  - CPU, RAM, Swap, and Filesystem (part of other checks)
  - Check whether jumbo frames are enabled
  - Is it configured with DHCP or static IP
  - Is the bandwidth sufficient
  - Collect and output current NIC options set (e.g. Bonding, not bridged or virtual)
  - Check and report network latency (ping) with all hosts provided in the inventory file
  - Listing all NICs

Signed-off-by: Kushal Deb <Kushal.Deb@ibm.com>
(cherry picked from commit f4833f4)
guits force-pushed the mergify/bp/squid/pr-326 branch from 331269a to c07eb50 on February 25, 2025 16:18
guits merged commit 6b341e4 into squid on Feb 26, 2025
8 checks passed
guits deleted the mergify/bp/squid/pr-326 branch on February 26, 2025 07:33