DrDaveD edited this page Oct 10, 2014 · 5 revisions

This is an overview of how cvmfs-hastratum1 works.

Two identical machines are kept up to date, each serving replicas of CVMFS repositories. A single service IP address is passed back and forth between the two machines and is always held by the one considered to be the "master" machine. The other machine is considered to be the "backup".

An update happens in three steps. First, the master machine pulls the entire update, with the data going into a shared data area, while the critical file that tells clients how to find the catalog listing all the current files (that is, ".cvmfspublished") is first read into a staging subdirectory that clients do not see. Second, the master pushes the same update to the backup machine's staging area. Finally, the master makes the catalog update visible on both machines. If an update is in progress when a swap of mastership happens, the update is aborted.

With this approach, the data corresponding to every catalog served to clients is always available on both machines, even if mastership moves from one machine to the other in the middle of an update. It also avoids any shared storage between the machines, which improves reliability and makes it straightforward to upgrade the two machines independently or to do maintenance operations on the backup machine.
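The ordering of the update steps can be illustrated with a small in-memory model. This is only a sketch of the idea described above, not the cvmfs-hastratum1 implementation: the `Replica` class, the `update` function, and the `abort_after` parameter are all hypothetical names invented for the illustration.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Replica:
    """Hypothetical in-memory model of one of the two machines."""
    name: str
    data: Set[str] = field(default_factory=set)  # content objects held locally
    staged: Optional[str] = None      # manifest in staging, hidden from clients
    published: Optional[str] = None   # manifest that clients can see


def update(master, backup, manifest, objects, abort_after=None):
    """Run the three update steps; abort_after simulates a mastership swap."""
    # Step 1: the master pulls the data and stages the manifest.
    master.data |= objects
    master.staged = manifest
    if abort_after == "pull":
        return False
    # Step 2: the master pushes the same update into the backup's staging area.
    backup.data |= objects
    backup.staged = manifest
    if abort_after == "push":
        return False
    # Step 3: only now does the new manifest become visible on both machines.
    for replica in (master, backup):
        replica.published = replica.staged
    return True


a, b = Replica("a"), Replica("b")
update(a, b, "rev1", {"x", "y"})
update(a, b, "rev2", {"z"}, abort_after="pull")  # swap during the update
print(a.published, b.published)  # rev1 rev1
```

In this model an aborted update leaves both machines publishing the previous manifest, and every object that manifest refers to is present on both, which is the property the design relies on.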

New repository replicas are added to both machines with the supplied add-repository command and removed with remove-repository.

The pull_and_push command pulls updates in one thread and pushes them in another thread, so pulls do not need to wait for pushes to finish. The thread that pushes is the one that makes the published catalog available on both machines after a successful push.
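The two-thread arrangement resembles a producer and consumer connected by a queue. The following is a minimal sketch under that assumption and does not reproduce the real pull_and_push code: the function signature and the `pull`, `push`, and `publish` callables are hypothetical, as are the repository names.

```python
import queue
import threading


def pull_and_push(repos, pull, push, publish):
    """Pull each repository in one thread; push and publish in another."""
    pending = queue.Queue()

    def puller():
        for repo in repos:
            pull(repo)
            pending.put(repo)   # hand off without waiting for the push
        pending.put(None)       # sentinel: nothing more to push

    def pusher():
        while (repo := pending.get()) is not None:
            if push(repo):      # publish only after a successful push
                publish(repo)

    threads = [threading.Thread(target=puller), threading.Thread(target=pusher)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


published = []
pull_and_push(
    ["repo1.example.org", "repo2.example.org"],  # hypothetical repository names
    pull=lambda repo: None,                      # stand-in for the real pull
    push=lambda repo: True,                      # stand-in: push always succeeds
    publish=published.append,
)
print(published)  # ['repo1.example.org', 'repo2.example.org']
```

Because the puller only enqueues finished work, a slow push of one repository does not delay the pull of the next one.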

The "push" is implemented using the same "cvmfs_server snapshot" command that is used to pull. It is effectively a push because the master executes the command on the backup with ssh. "cvmfs_server snapshot" is usually more efficient than rsync because it knows from catalog comparisons which files to send, rather than scanning all directories looking for modified files. The add-repository command, on the other hand, does its initial push to the other machine with rsync, because rsync is faster at copying large numbers of files.
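The advantage of comparing catalogs can be seen in a toy example. Here a catalog is modeled, purely as an assumption for illustration, as a mapping from path to content hash; the real catalog format and the logic inside "cvmfs_server snapshot" are more involved, and `objects_to_send` is a hypothetical helper.

```python
def objects_to_send(old_catalog, new_catalog):
    """Return content hashes referenced by new_catalog but not by old_catalog."""
    return set(new_catalog.values()) - set(old_catalog.values())


# Hypothetical catalogs: path -> content hash
old = {"/a": "h1", "/b": "h2"}
new = {"/a": "h1", "/b": "h3", "/c": "h4"}
print(sorted(objects_to_send(old, new)))  # ['h3', 'h4']
```

The work is proportional to the size of the catalogs rather than requiring a scan of every directory on disk, which is the contrast with rsync drawn above.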
