Update playbooks to download SLURM RPM files to /tmp always #268

Open
andriy-safe-ai opened this issue Mar 24, 2024 · 0 comments
andriy-safe-ai commented Mar 24, 2024

The playbook that downloads the SLURM RPM files is hardcoded to download them to /data/slurm_rpms.

- hosts: all
  become: true
  vars:
    slurm_version: "23.02.1-1"
    slurm_all_packages:
      - "slurm-{{slurm_version}}.el{{ansible_distribution_major_version}}.x86_64.rpm"
      - "slurm-devel-{{slurm_version}}.el{{ansible_distribution_major_version}}.x86_64.rpm"
      - "slurm-contribs-{{slurm_version}}.el{{ansible_distribution_major_version}}.x86_64.rpm"
      - "slurm-perlapi-{{slurm_version}}.el{{ansible_distribution_major_version}}.x86_64.rpm"
      - "slurm-torque-{{slurm_version}}.el{{ansible_distribution_major_version}}.x86_64.rpm"
      - "slurm-openlava-{{slurm_version}}.el{{ansible_distribution_major_version}}.x86_64.rpm"
      - "slurm-slurmctld-{{slurm_version}}.el{{ansible_distribution_major_version}}.x86_64.rpm"
      - "slurm-slurmdbd-{{slurm_version}}.el{{ansible_distribution_major_version}}.x86_64.rpm"
      - "slurm-pam_slurm-{{slurm_version}}.el{{ansible_distribution_major_version}}.x86_64.rpm"
      - "slurm-libpmi-{{slurm_version}}.el{{ansible_distribution_major_version}}.x86_64.rpm"
      - "slurm-slurmd-{{slurm_version}}.el{{ansible_distribution_major_version}}.x86_64.rpm"
  tasks:
    - name: Download slurm .rpms
      get_url:
        url: "https://objectstorage.eu-frankfurt-1.oraclecloud.com/p/VnkLhYXOSNVilVa9d24Riz1fz4Ul-KTXeK4HCKoyqv0ghW3gry3Xz8CZqloqphLw/n/hpc/b/source/o/slurm/{{ item }}"
        dest: "/data/slurm_rpms"
      with_items: "{{slurm_all_packages}}"
      delegate_to: 127.0.0.1
      run_once: true  
    - name: manually install all of the .rpms together (fails separately)
      shell: yum install -y /data/slurm_rpms/{{slurm_all_packages[0]}} \
        /data/slurm_rpms/{{slurm_all_packages[1]}} \
        /data/slurm_rpms/{{slurm_all_packages[2]}} \
        /data/slurm_rpms/{{slurm_all_packages[3]}} \
        /data/slurm_rpms/{{slurm_all_packages[4]}} \
        /data/slurm_rpms/{{slurm_all_packages[5]}} \
        /data/slurm_rpms/{{slurm_all_packages[6]}} \
        /data/slurm_rpms/{{slurm_all_packages[7]}} \
        /data/slurm_rpms/{{slurm_all_packages[8]}} \
        /data/slurm_rpms/{{slurm_all_packages[9]}} \
        /data/slurm_rpms/{{slurm_all_packages[10]}}
      # Needed in case you wish to rerun this playbook otherwise it'll error.
      ignore_errors: true

This is a problem because the path from which our other playbooks read the SLURM RPM files depends on how the cluster is configured: the values of variables in /etc/ansible/hosts determine where those playbooks look. For example, the slurm role takes a download_path variable as the path to the RPM files, and its value changes depending on whether create_fss or cluster_nfs is set to true. If neither is set, the slurm role checks /tmp, but that can never work because we've hardcoded the download path to /data/slurm_rpms.

- hosts: bastion,slurm_backup,compute,login
  gather_facts: true
  vars:
    destroy: false
    initial: true
    download_path: "{{ nfs_target_path if create_fss | bool else ( cluster_nfs_path if cluster_nfs|bool else '/tmp')  }}"
    enroot_top_path: "{{ nvme_path }}/enroot/"
  vars_files:
    - "/opt/oci-hpc/conf/queues.conf"
  tasks:
    - include_role:
        name: slurm
      when: slurm|default(true)|bool

One way to solve this would be to make /tmp the default download path for the SLURM RPMs and have all playbooks look for the RPMs in /tmp.
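
A rough sketch of what that could look like in the download playbook is below. It is not the linked pull request. The variable name slurm_rpm_dir is hypothetical, the package list is shortened for illustration (the real playbook lists eleven packages), and the delegate_to / run_once settings are copied unchanged from the current playbook. Whether /tmp on the downloading host is visible to the hosts that run the install is not addressed here.

```yaml
- hosts: all
  become: true
  vars:
    slurm_version: "23.02.1-1"
    # Hypothetical variable: single place that controls where the RPMs land.
    slurm_rpm_dir: "/tmp"
    # Shortened for illustration; the real list has eleven entries.
    slurm_all_packages:
      - "slurm-{{slurm_version}}.el{{ansible_distribution_major_version}}.x86_64.rpm"
      - "slurm-slurmd-{{slurm_version}}.el{{ansible_distribution_major_version}}.x86_64.rpm"
  tasks:
    - name: Download slurm .rpms
      get_url:
        url: "https://objectstorage.eu-frankfurt-1.oraclecloud.com/p/VnkLhYXOSNVilVa9d24Riz1fz4Ul-KTXeK4HCKoyqv0ghW3gry3Xz8CZqloqphLw/n/hpc/b/source/o/slurm/{{ item }}"
        # Explicit file name under slurm_rpm_dir instead of a hardcoded directory.
        dest: "{{ slurm_rpm_dir }}/{{ item }}"
      with_items: "{{slurm_all_packages}}"
      delegate_to: 127.0.0.1
      run_once: true
    - name: manually install all of the .rpms together (fails separately)
      # Prefix every package name with slurm_rpm_dir and pass them to one yum call.
      shell: "yum install -y {{ slurm_all_packages | map('regex_replace', '^', slurm_rpm_dir ~ '/') | join(' ') }}"
      # Kept from the current playbook: avoids an error when rerunning.
      ignore_errors: true
```

Building the install command from the list also removes the need to index slurm_all_packages[0] through slurm_all_packages[10] by hand, so the download and install paths cannot drift apart.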

@andriy-safe-ai andriy-safe-ai self-assigned this Mar 24, 2024
@andriy-safe-ai andriy-safe-ai linked a pull request Mar 25, 2024 that will close this issue