
Running on a Mac (macOS) with Vagrant

Chris Jackson edited this page Mar 31, 2021 · 16 revisions

Unfortunately, the Singularity port for macOS has a bug that means it can’t be used with Nextflow. However, you can easily run the native Linux version of Singularity on a Mac using a Singularity Vagrant Box. This sounds a lot more complicated than it is - you can get this up and running in about 15 minutes. The general installation process is as follows.

NOTE: the following was tested with:

  • macOS Catalina version 10.15.7
  • VirtualBox version 6.1.18
  • Vagrant version 2.2.14

  1. Install VirtualBox for Mac. This is just a standard Mac click-through install.

  2. Install Vagrant for Mac. This is just a standard Mac click-through install.

  3. Install the vagrant-disksize plugin, which lets you change the hard disk size of Vagrant machines, using the command vagrant plugin install vagrant-disksize.

  4. OPTIONAL: install Vagrant Manager for Mac. This program provides a handy GUI for managing Vagrant machines e.g. viewing those currently running, starting and stopping them, etc. This is just a standard Mac click-through install.

  5. Create a directory to be used for your Vagrant virtual machine e.g. mkdir yang-and-smith-vagrant-vm; cd yang-and-smith-vagrant-vm.

  6. Copy the provided Vagrant file into the directory you’ve created. The Vagrant file can be obtained by downloading this repo from the main page; the required file is simply called Vagrantfile. Alternatively, you can copy the content [here][7] to a text file, and save it as Vagrantfile (note the lack of file extension).

  7. Open Vagrantfile in a text editor, and change the lines vb.memory = "5120" and vb.cpus = "4" to match the amount of RAM (here 5120 MB, or 5 GB) and the number of CPUs you’d like your virtual machine to have. You’ll probably want to use fewer CPUs and less RAM than are available on your host machine, so that macOS still has enough resources to run without bogging down. Save the file.

  8. By default, the hard disk size for the Vagrant machine is set to ~20 GB. If you need more for your analysis, this can be expanded. In the Vagrantfile, change the line config.disksize.size = '20GB' to the larger size required (e.g. '30GB'). Save the file.

  9. In the directory containing Vagrantfile, type the command vagrant up and press enter. The first time you run this command it’ll download the appropriate Vagrant “box” containing Ubuntu 18.04 with Singularity installed. It’ll then install Nextflow, and launch the virtual machine.

  10. Connect to the virtual machine using the command vagrant ssh.

  11. By default, the directory (e.g. yang-and-smith-vagrant-vm) containing the Vagrantfile in your host OS (macOS) will be mounted inside the Vagrant machine at the path /vagrant/. So, you can create a subfolder called data within the directory yang-and-smith-vagrant-vm via Finder, and just drag-and-drop data (e.g. folders of single-gene .fasta files and outgroup .fasta files) into yang-and-smith-vagrant-vm/data. These will then be available in the Vagrant machine at /vagrant/data/.

  12. The pipeline script yang-and-smith-rbgv-pipeline.nf and its config file yang-and-smith-rbgv.config will already be in the /home/vagrant directory within the Vagrant machine. So you can now run the Yang-and-Smith pipeline! For example (depending on which folders you’ve copied your data to), you might use a command like:

    nextflow run yang-and-smith-rbgv-pipeline.nf -c yang-and-smith-rbgv.config --hybpiper_paralogs_directory /vagrant/data/11_paralogs/ --outgroups_file /vagrant/data/outgroups.fasta --outgroups sesame
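
If you'd rather not edit the Vagrantfile by hand, the changes in steps 7 and 8 can be scripted. The following is only a minimal sketch: it assumes the Vagrantfile contains the settings exactly as quoted above (vb.memory = "5120", vb.cpus = "4" and config.disksize.size = '20GB'), and the function name set_vm_resources is hypothetical, not something provided by the repo.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repo): rewrite the RAM, CPU and disk
# settings in a Vagrantfile. Assumes the settings appear in the form quoted
# in steps 7 and 8, e.g. vb.memory = "5120".
set_vm_resources() {
  local vagrantfile="$1"   # path to the Vagrantfile
  local memory_mb="$2"     # RAM in MB, e.g. 8192
  local cpus="$3"          # number of CPUs, e.g. 2
  local disk_size="$4"     # disk size, e.g. 30GB

  # -i.bak edits the file in place and keeps a backup copy; this form works
  # with both the BSD sed shipped with macOS and GNU sed.
  sed -i.bak \
    -e "s/vb\.memory = \"[0-9]*\"/vb.memory = \"${memory_mb}\"/" \
    -e "s/vb\.cpus = \"[0-9]*\"/vb.cpus = \"${cpus}\"/" \
    -e "s/config\.disksize\.size = '[^']*'/config.disksize.size = '${disk_size}'/" \
    "$vagrantfile"
}
```

With the function defined, running set_vm_resources Vagrantfile 8192 2 30GB in the directory from step 5, followed by vagrant up and vagrant ssh, would cover steps 7 to 10. The unmodified file is kept alongside as Vagrantfile.bak.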

NOTES:

  • The same details regarding download of the Singularity container on Linux apply here, i.e. if you don’t change the Nextflow config file, Nextflow will download the container for you automatically the first time you run the script. Alternatively, you can download it manually and change the config file to point to the full path of the image.

  • You’re free to install any other programs you want in the Vagrant machine as well. By default (i.e. specified in the Vagrantfile) I’ve installed curl, the text editor vim, as well as the terminal manager screen.

  • Once you’re done, you can view your results via the command line within the Vagrant machine, or you can copy them across to the /vagrant/data folder so that they’re accessible via macOS/Finder.

  • When done, you can exit your SSH session with the Vagrant machine using the command exit. Note that the machine will still be running in the background, so if you want to shut it down (you won’t lose your data - it’s persistent on the Vagrant machine HDD), use the command vagrant halt. You can also use Vagrant Manager (if installed) to perform these sorts of tasks using a nice simple GUI.

  • To restart the Vagrant machine, just move to the directory with the Vagrantfile and type vagrant up then vagrant ssh again. This will be much faster the second time round, as all downloads and installations have already been completed.

  • If you want to delete the Vagrant machine (and all the data it contains), use the command vagrant destroy, and follow the prompt. You can then simply generate a fresh machine using the vagrant up command.
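
The housekeeping described in the notes above can be summarised in a short shell sketch. The vagrant subcommands listed in the comments are the ones named above. The function copy_results_to_host and its arguments are hypothetical, and its default destination assumes the shared folder is available at /vagrant/data as described in step 11.

```shell
#!/usr/bin/env bash
# Hypothetical helper, intended to be run inside the Vagrant machine: copy a
# results directory into the shared folder so it is visible from macOS.
copy_results_to_host() {
  local results_dir="$1"                  # e.g. a pipeline output directory
  local shared_dir="${2:-/vagrant/data}"  # defaults to the shared data folder
  mkdir -p "$shared_dir"
  cp -R "$results_dir" "$shared_dir/"
}

# Typical session, for reference (commands as described in the notes):
#   exit              # inside the VM: end the SSH session; the VM keeps running
#   vagrant halt      # on the host: shut the VM down; data is kept
#   vagrant up        # on the host: start the VM again
#   vagrant ssh       # on the host: reconnect
#   vagrant destroy   # on the host: delete the VM and all the data it contains
```

Pass the results directory without a trailing slash, so that the directory itself (rather than just its contents) ends up inside the shared folder.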

Let me know if you run into any problems - I’m happy to help troubleshoot them with you!

Singularity for macOS bug

For those interested, Nextflow generates compound bash commands with individual commands separated by a semicolon (standard bash practice for submitting more than one bash command on a single line, e.g. cd yang_and_smith_directory; ls). When run through the macOS Singularity port, however, the bash shell drops out of the current working directory when processing commands after the first semicolon, meaning that e.g. files it needs to process can’t be found. Good times.
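
To make the expected behaviour concrete, here is a small sketch of a compound command of this kind, run with an ordinary bash shell. It is not a reproduction of the bug, and the scratch directory and the file name example.fasta are made up for illustration. In a correctly behaving shell, the ls after the semicolon runs in the directory that cd has just entered; according to the description above, under the macOS port the command after the semicolon no longer ran in that directory.

```shell
#!/usr/bin/env bash
# Create a scratch directory containing a single file (names are illustrative).
workdir=$(mktemp -d)
touch "$workdir/example.fasta"

# A compound command: two commands separated by a semicolon.
# The second command should inherit the working directory set by the first.
listing=$(bash -c "cd '$workdir'; ls")

echo "$listing"   # prints: example.fasta
```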

[7]:
