
Installing from rpm


These are the instructions for installing from rpm. They have been tested only on RHEL7-based systems. All of the install steps have to be performed on both machines, and the configuration on both machines has to be identical. All of the configuration has to be done as the root user.

  1. If you will be installing a squid for monitoring and for caching the geoapi calls, first follow the Open Science Grid instructions for installing and configuring frontier-squid for a CVMFS stratum 1.

  2. Set up the cvmfs-contrib yum repository:

    # yum -y install https://ecsft.cern.ch/dist/cvmfs/cvmfs-contrib-release/cvmfs-contrib-release-latest.noarch.rpm
    
  3. Set up the CERN cvmfs yum repository:

    # yum -y install http://cvmrepo.web.cern.ch/cvmrepo/yum/cvmfs-release-latest.noarch.rpm
    
  4. Run this command on both machines to install the package and its dependencies, including cvmfs-server:

    # yum -y install cvmfs-hastratum1 cvmfs-config-default mod_wsgi
    

    On el8, use python3-mod_wsgi instead of mod_wsgi.

    If you will be using pacemaker, also run this command:

    # yum -y install pcs
    

    On el8 you may first need to enable the ha yum repo in /etc/yum.repos.d, or on el9 you may first need to enable the highavailability repo.
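    For example, on many el8/el9 systems the repository can be enabled with dnf config-manager; the exact repository id may differ on your distribution, so check dnf repolist --all first:

    # dnf config-manager --set-enabled ha                  # el8 (CentOS-style id)
    # dnf config-manager --set-enabled highavailability    # el9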

  5. If your site uses kerberos authentication, note that the scripts manage the kerberos credentials so they can run for a long time or from cron. For that reason you need to add two lines of the following form to ~/.k5login on both machines:

    host/HOST1NAME@YOUR.DOMAIN
    host/HOST2NAME@YOUR.DOMAIN
    

    where HOST1NAME and HOST2NAME are the fully qualified domain names of your two machines and YOUR.DOMAIN is your kerberos domain. For the purposes of manually copying files during some of the following steps, if you don't otherwise have it set up so you can copy between the machines without a password, you may want to run kinit -k to initialize the host kerberos credentials for your commands.

    If your site does not use kerberos authentication, set up the two machines to be able to connect as root between each other without a password such as with a special key in ~root/.ssh/authorized_keys.
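    Either way, before continuing it is worth verifying that root can copy files between the two machines without a password, for example with a quick scp in each direction (kinit -k is only needed in the kerberos case):

    # kinit -k
    # scp /etc/hosts HOST2NAME:/tmp/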

  6. On one of the machines, copy the following file to the corresponding file without the '.in' suffix and then edit the copy:

    /etc/cvmfs/hastratum1.conf.in
    

    Things to change:

    • HOST1NAME and HOST2NAME - fully qualified domain names of your two machines
    • HOST1IP and HOST2IP - the IP addresses of your two machines
    • /storage/cvmfs - you can change this to a different directory if your big filesystem is elsewhere, but don't use /srv/cvmfs because that is used for something else (see the next item)
    • /srv/cvmfs - this is the value (set for SRV) of the directory that pull_and_push looks in to find out the repositories to snapshot. Set this to a different directory to snapshot a different set of repositories. At Fermilab we set this to ${OVERRIDE_SRV:-/srv/cvmfs} so we can override the value from different cron jobs to snapshot repositories at different rates, putting some on a fast track.

    Parameters to set in /etc/cvmfs/hastratum1.conf when not using pacemaker:

    • IS_HA_MASTER_CMD - if not using pacemaker, set this to a shell command that can be eval'ed to return a zero exit code when run on the master machine
    • IS_HA_BACKUP_CMD - if not using pacemaker, set this to a shell command that can be eval'ed to return a zero exit code when run on the backup machine

    Optional parameters you may add to /etc/cvmfs/hastratum1.conf:

    • HTTPPORT=N where N is the TCP port of apache on the two machines (default 8081)
    • MAXPARALLELPULL=N where N is the number of parallel cvmfsha-pull-and-push operations you want to allow. One is started each time from cron (configured below), but more may be started if previous ones are still running. The default is 4.

    OSG stratum 1s should also add this to /etc/cvmfs/hastratum1.conf:

    • EXTRAKEYS=/etc/cvmfs/keys/opensciencegrid.org/opensciencegrid.org.pub

    After having made your edits on one machine, copy the file to the other machine so they're identical.
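    As an illustration only, the optional additions for an OSG stratum 1 that is not using pacemaker might end up looking like the lines below; the IS_HA_* commands shown are just one possible way to test which machine currently holds a hypothetical service address SERVICEIP, so substitute whatever check matches your failover mechanism:

    HTTPPORT=8081
    MAXPARALLELPULL=4
    EXTRAKEYS=/etc/cvmfs/keys/opensciencegrid.org/opensciencegrid.org.pub
    IS_HA_MASTER_CMD="ip addr show | grep -q SERVICEIP"
    IS_HA_BACKUP_CMD="! ip addr show | grep -q SERVICEIP"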

  7. Create /etc/httpd/conf.d/cvmfs.conf with the following contents on both machines:

    Listen 8081
    KeepAlive On
    RewriteEngine On
    # Point api URLs to the WSGI handler
    RewriteRule ^/cvmfs/([^/]+)/api/(.*)$ /var/www/wsgi-scripts/cvmfs-server/cvmfs-api.wsgi/$1/$2
    # Change /cvmfs to where the storage is
    RewriteRule ^/cvmfs/(.*)$ /storage/cvmfs/$1
    <Directory "/storage/cvmfs"> 
        Options -MultiViews +FollowSymLinks -Indexes
        AllowOverride All 
        Require all granted
    
        EnableMMAP Off 
        EnableSendFile Off
    
        <FilesMatch "^\.cvmfs">
            ForceType application/x-cvmfs
        </FilesMatch>
    
        Header unset Last-Modified
        RequestHeader unset If-Modified-Since
        FileETag None
    
        ExpiresActive On
        ExpiresDefault "access plus 3 days" 
        ExpiresByType text/html "access plus 15 minutes" 
        ExpiresByType application/x-cvmfs "access plus 61 seconds"
        ExpiresByType application/json "access plus 61 seconds"
    </Directory>
    
    # Enable the api functions
    WSGIDaemonProcess cvmfs-api threads=64 display-name=%{GROUP} \
        python-path=/usr/share/cvmfs-server/webapi
    <Directory /var/www/wsgi-scripts/cvmfs-server>
        WSGIProcessGroup cvmfs-api
        WSGIApplicationGroup cvmfs-api
        Options ExecCGI
        SetHandler wsgi-script
        Require all granted
    </Directory>
    WSGISocketPrefix /var/run/wsgi
    

    There is no RewriteRule for supporting short repository names because the add-repository command inserts symbolic links for the short names of cern.ch and opensciencegrid.org repositories for backward compatibility.

    If you are not using squid, also add Listen 8000 and Listen 8080 to the top of the above configuration file.

    Copy the file to the other machine and then run these commands on both machines on systemd systems such as EL7:

    # systemctl enable httpd
    # systemctl start httpd
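
    A quick sanity check on each machine is to ask apache for any URL on the port configured in HTTPPORT (8081 by default); before any repository has been added an error status is expected, but getting an HTTP response at all confirms httpd is up and listening:

    # curl -sI http://localhost:8081/cvmfs/ | head -1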
    
  8. Enable incoming connections through the firewall on both machines:

    # firewall-cmd --permanent --add-port=8000/tcp
    # firewall-cmd --permanent --add-port=8080/tcp
    # firewall-cmd --permanent --add-port=8081/tcp
    # firewall-cmd --reload
    
  9. Add the first small repository with this command on either machine (it sets up both machines):

    # cvmfsha-add-repository config-egi.egi.eu http://cvmfs-stratum0.gridpp.rl.ac.uk:8000

    If anything goes wrong, you will need to run cvmfsha-remove-repository config-egi.egi.eu on either machine to clean up the partially created repository.
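    If the command completes cleanly, the new replica should normally also show up in the regular cvmfs_server bookkeeping on both machines, which you can confirm with:

    # cvmfs_server list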

  10. After a repository is successfully created, try a test run of this command:

    # cvmfsha-pull-and-push

    It normally runs from cron only on the master machine, so all of its output goes to logs. Check the logs in

    /var/log/cvmfs/cvmfs-pull.log
    /var/log/cvmfs/cvmfs-push.log
    

    to verify that the repository was successfully pulled and pushed without errors.
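    Assuming the default HTTPPORT of 8081 and the config-egi.egi.eu repository from the previous step, you can also fetch the repository manifest through apache on each machine, which exercises the same path that clients and downstream caches will use:

    # curl -s http://localhost:8081/cvmfs/config-egi.egi.eu/.cvmfspublished | head -2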

  11. Create /etc/cron.d/cvmfs-hastratum1 with the following line on both machines:

    */5 * * * * root cvmfsha-is-master && /usr/sbin/cvmfsha-pull-and-push
    

    If you're maintaining a large stratum 1, instead of the "*/5" in the first column use the value for your site at https://twiki.cern.ch/twiki/bin/view/CvmFS/StratumOnes.

    Also add this cron entry for cleaning up old temporary files daily:

    0 9 * * * root   find /srv/cvmfs/*.*/data/txn -name "*.*" -mtime +2 2>/dev/null|xargs rm -f
    

    and add this entry for running garbage collection daily on all garbage-collectable repositories:

    9 1 * * * root   /usr/sbin/cvmfsha-gc-all >>/var/log/cvmfs/gc.log 2>&1
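
    Putting the three entries together, the complete /etc/cron.d/cvmfs-hastratum1 for a small stratum 1 would then contain:

    */5 * * * * root cvmfsha-is-master && /usr/sbin/cvmfsha-pull-and-push
    0 9 * * * root   find /srv/cvmfs/*.*/data/txn -name "*.*" -mtime +2 2>/dev/null|xargs rm -f
    9 1 * * * root   /usr/sbin/cvmfsha-gc-all >>/var/log/cvmfs/gc.log 2>&1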
    
  12. If you are not using pacemaker, make sure that whatever mechanism you use to switch your service IP from the master to the backup calls this command on the master, and skip the rest of this step:

    /usr/share/cvmfs-hastratum1/push-abort stop

    If you are using pacemaker, start it on both machines, like this, roughly following the clusterlabs documentation:

    # firewall-cmd --permanent --add-service=high-availability
    # firewall-cmd --reload
    # systemctl start pcsd
    # systemctl enable pcsd
    # passwd hacluster
    

    Set a common password for the hacluster login on both machines.

    Then on one machine (either one), configure the cluster, replacing HOST1NAME and HOST2NAME with the fully qualified names of your two nodes and CLUSTERNAME with a cluster name of your choice:

    # pcs host auth HOST1NAME HOST2NAME
    

    When it prompts for Username, enter hacluster and the password that you set for that login above.

    # pcs cluster setup CLUSTERNAME HOST1NAME HOST2NAME
    # pcs cluster start --all
    # pcs property set stonith-enabled=false
    # pcs resource create cvmfsha-push-abort ocf:heartbeat:cvmfsha-push-abort
    

    Locate a common IP address that both nodes can ping, such as a router, so they can verify network connectivity, and run these commands, replacing PINGADDR with that IP address:

    # pcs resource create ping ocf:pacemaker:ping host_list=PINGADDR dampen=5s multiplier=1000
    # pcs cluster enable --all
    

    Then the output of pcs status should look something like this:

    Cluster name: CLUSTERNAME
    Cluster Summary:
      * Stack: corosync (Pacemaker is running)
      * Current DC: HOST1NAME (version 2.1.6-9.1.el8_9-6fdc9deea29) - partition with quorum
      * Last updated: Thu May  9 21:36:20 2024 on HOST1NAME
      * Last change:  Thu May  9 21:35:30 2024 by root via cibadmin on HOST1NAME
      * 2 nodes configured
      * 2 resource instances configured
    
    Node List:
      * Online: [ HOST1NAME HOST2NAME ]
    
    Full List of Resources:
      * cvmfsha-push-abort	(ocf::heartbeat:cvmfsha-push-abort):	 Started HOST1NAME
      * ping	(ocf::pacemaker:ping):	 Started HOST1NAME
    
    Daemon Status:
      corosync: active/disabled
      pacemaker: active/enabled
      pcsd: active/enabled
    

    If you also have a virtual IP address to move to the master, follow the directions on creating an active/passive cluster.

    Verify it is working by trying out the commands in the Intro to working with pacemaker page.
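    One way to rehearse a failover once the cluster is healthy is to put the current master node into standby, watch the resources move to the other node with pcs status, and then bring it back (these are the el8/el9 pcs commands; on older pcs releases the equivalents are pcs cluster standby and pcs cluster unstandby):

    # pcs node standby HOST1NAME
    # pcs status
    # pcs node unstandby HOST1NAME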

  13. If you want to automatically add replicas to your stratum 1 based on what is currently installed on stratum 0s or other stratum 1s, read the documentation in /etc/cvmfs/manage-replicas.conf or online. Once you're satisfied that it is working, add the following in /etc/cron.d/cvmfs-manage-replicas to have it run once an hour on the master and backup machines:

    35 * * * * root /usr/bin/cvmfsha-is-master && PATH=$PATH:/usr/sbin /usr/share/cvmfs-hastratum1/manage-replicas-log -c
    35 * * * * root /usr/bin/cvmfsha-is-backup && PATH=$PATH:/usr/sbin /usr/share/cvmfs-hastratum1/manage-replicas-log -k
    