Robert J. Gifford edited this page Nov 17, 2024 · 5 revisions

Installing GLUE

There are two alternative ways to install command-line GLUE: Docker-based installation and Native installation.

Docker-based installation

Docker is a platform that lets you run software in lightweight "containers", and GLUE can be installed as a set of Docker containers. We recommend this route in most circumstances because set-up is quicker and less error-prone. GLUE is also isolated from other software on the computer, so its operation should generally be more predictable.

  1. Install Docker
  2. Familiarise yourself with some Docker concepts
  3. How Docker-based GLUE works
  4. Set up a gluetools-mysql container
  5. Set up a gluetools container
  6. Hints and tips for Docker-based GLUE
  7. Next steps

Install Docker

Docker Engine Community Edition is the software that manages Docker containers. It is available for Mac OSX, Windows and various Linux distributions. GLUE has been tested on Docker Engine version 18.06.

Familiarise yourself with some Docker concepts

If you are unfamiliar with Docker, you will need to learn a few basic concepts first. We recommend reading at least chapters 1 and 2 of the Get Started with Docker guide.

How Docker-based GLUE works

GLUE is packaged into two Docker images. The cvrbioinformatics/gluetools-mysql image provides the MySQL database which GLUE will use and some scripts for updating it. A container based on this image provides GLUE with its persistent database storage, and runs in the background as a daemon.

The cvrbioinformatics/gluetools image provides the GLUE engine software itself plus its 3rd-party dependencies such as RAxML and MAFFT. Containers based on this image will be run in a transient way, each time a GLUE interactive session is run.

Set up a gluetools-mysql container

Pull the cvrbioinformatics/gluetools-mysql image from Docker Hub:

$ docker pull cvrbioinformatics/gluetools-mysql:latest

Start a container called gluetools-mysql based on this image:

$ docker run --detach --name gluetools-mysql cvrbioinformatics/gluetools-mysql:latest

Because the container was started in detached mode, it runs in the background as a daemon.
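To confirm that the daemon container is up, you can ask Docker for its status. The following is a minimal sketch: the function name glue_mysql_status is our own invention, while the docker ps options it uses are standard Docker CLI flags.

```shell
# Hypothetical helper: print the name and status of the gluetools-mysql
# container, whether it is running or stopped (--all includes stopped ones).
glue_mysql_status() {
    docker ps --all --filter "name=gluetools-mysql" \
        --format "{{.Names}}: {{.Status}}"
}
```

Calling glue_mysql_status prints one line per matching container. No output means that no container with that name has been created yet.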

Set up a gluetools container

Pull the cvrbioinformatics/gluetools image from Docker Hub:

$ docker pull cvrbioinformatics/gluetools:latest

Start a container called gluetools based on this image, linking it to the gluetools-mysql container.

$ docker run --rm -it --name gluetools --link gluetools-mysql cvrbioinformatics/gluetools:latest

This will start an interactive GLUE session within the new container:

GLUE Version 1.1.113
Copyright (C) 2018 The University of Glasgow
This program comes with ABSOLUTELY NO WARRANTY. This is free software, and you
are welcome to redistribute it under certain conditions. For details see
GNU Affero General Public License v3: http://www.gnu.org/licenses/

Mode path: /
Option load-save-path: /opt/gluetools/projects/exampleProject
GLUE> ...

When the interactive session completes, the container will be removed (via the --rm option).
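If you start GLUE sessions often, the two steps (database container running, then a transient gluetools container) can be wrapped in a small shell function. This is only a sketch: the function name glue_session is hypothetical, and it assumes the gluetools-mysql container has already been created as described above.

```shell
# Hypothetical wrapper: make sure the database container is running, then
# open an interactive GLUE session linked to it. Extra arguments are passed
# through to the container.
glue_session() {
    # "docker start" leaves an already-running container untouched.
    docker start gluetools-mysql >/dev/null || return 1
    docker run --rm -it --name gluetools --link gluetools-mysql \
        cvrbioinformatics/gluetools:latest "$@"
}
```

With this function defined in your shell, typing glue_session starts the database container if necessary and then opens the GLUE prompt.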

Hints and tips for Docker-based GLUE

  • Starting and stopping the gluetools-mysql container

    The gluetools-mysql container contains the GLUE database which you normally want to keep in place from one GLUE interactive session to the next. When you restart your computer this container will be in a "stopped" state. To start it again use:

    $ docker start gluetools-mysql
    
    

    You can stop it with:

    $ docker stop gluetools-mysql
    
    

    If you remove this container, your database contents will be lost.

  • Install a pre-built GLUE dataset in the gluetools-mysql container

    While the gluetools-mysql container is running, you can install various pre-built GLUE projects. For example, the latest NCBI-HCV-GLUE project build can be installed using this command:

$ docker exec gluetools-mysql installGlueProject.sh ncbi_hcv_glue

Note: this command will wipe any previous data from the database.

Wipe the database in the gluetools-mysql container

You can wipe the GLUE database using this command:

$ docker exec gluetools-mysql glueWipeDatabase.sh
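Because wiping cannot be undone, you may prefer to put a confirmation prompt in front of it. The sketch below is one way to do that. The function name glue_wipe_confirmed is hypothetical; the docker exec command is the one shown above.

```shell
# Hypothetical helper: ask for confirmation before wiping the GLUE database.
# Only an explicit "yes" typed on standard input triggers the wipe.
glue_wipe_confirmed() {
    printf 'Wipe ALL data in the GLUE database? Type "yes" to continue: ' >&2
    read -r answer
    if [ "$answer" = "yes" ]; then
        docker exec gluetools-mysql glueWipeDatabase.sh
    else
        echo "Aborted; database left untouched." >&2
        return 1
    fi
}
```

Any answer other than yes, including an empty line, leaves the database alone and returns a non-zero status.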

Volume mapping

Each Docker container has its own isolated file system, so by default files outside the container cannot be accessed by GLUE. Since the gluetools container is transient, its file system will be removed at the end of the GLUE session.

If you want GLUE to read your own project data from a directory outside the container, or to save any file output, add the --volume option to the docker run command for the gluetools container. This maps a directory outside the container (i.e. on the host filesystem) to a path within the container filesystem. For example, if you use

--volume /home/fred/my_glue_project:/opt/gluetools/projects/my_glue_project

then the directory /home/fred/my_glue_project will be readable/writable inside the container at /opt/gluetools/projects/my_glue_project. Multiple directories can be mapped with this option.
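When several project directories need mapping, the --volume options can be generated rather than typed by hand. The helper below is a sketch under our own assumptions: the name glue_volume_args is hypothetical, and it maps each host directory to a directory of the same name under /opt/gluetools/projects, following the example above.

```shell
# Hypothetical helper: print one --volume option per host directory, mapping
# each directory to /opt/gluetools/projects/<directory name> in the container.
glue_volume_args() {
    for host_dir in "$@"; do
        printf '%s %s:/opt/gluetools/projects/%s\n' \
            --volume "$host_dir" "$(basename "$host_dir")"
    done
}
```

The output can be spliced into the docker run command with an unquoted command substitution, for example docker run --rm -it --name gluetools --link gluetools-mysql $(glue_volume_args /home/fred/my_glue_project) cvrbioinformatics/gluetools:latest. Because this relies on word splitting, it will not work for host paths that contain spaces.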

Working directory

The working directory for the gluetools container defaults to the example project directory. However this can be overridden by adding the following option to the docker run command:

--workdir /opt/gluetools/projects/my_glue_project

Use .gluerc and .glue_history files from the host file system

These files store your GLUE console preferences and command history. You may want to use .gluerc and .glue_history files from the host file system rather than the container file system. To do this, map your home directory using a --volume option in the docker run command:

--volume /home/fred:/home/fred

Then also add this --env option to the docker run command:

--env _JAVA_OPTIONS=-Duser.home=/home/fred 
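The two options always go together and both depend on the same home directory, so they can be produced by one small function. This is a sketch only, and the function name glue_home_args is hypothetical.

```shell
# Hypothetical helper: print the --volume and --env options that expose the
# given home directory (default: $HOME) inside the container.
glue_home_args() {
    home_dir="${1:-$HOME}"
    printf '%s %s:%s\n' --volume "$home_dir" "$home_dir"
    printf '%s _JAVA_OPTIONS=-Duser.home=%s\n' --env "$home_dir"
}
```

As with any unquoted command substitution, adding $(glue_home_args) to a docker run command only works if the home directory path contains no spaces.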

Run bash in the container

You can run an interactive bash session rather than a GLUE session in the container by simply adding /bin/bash to the end of the docker run command.

$ docker run --rm -it --name gluetools --link gluetools-mysql cvrbioinformatics/gluetools:latest /bin/bash

Next steps

If you are new to GLUE we strongly recommend building the example GLUE project as the next step. The example project directory is included within the container file system.

Native installation

You can install GLUE so that it runs directly on your system instead of in a container. This is a much more complex set-up but may be preferable in some circumstances. The Native installation offers more control over the various software packages GLUE uses, may take up less disk space, and may make some GLUE operations faster.

1.  Supported operating systems
2.  Core prerequisite: Java
3.  Core prerequisite: MySQL
4.  The GLUE install directory
5.  The GLUE engine jar
6.  The GLUE XML configuration file
7.  Running the GLUE command line
8.  Upgrading GLUE
9.  Software integration: BLAST+
10. Software integration: MAFFT
11. Software integration: RAxML
12. Next steps

Supported operating systems

You can install native GLUE on MS Windows, Linux and Mac OSX.

If you are using Windows, you must also install Cygwin, which you can download from cygwin.org.

Core prerequisite: Java

Download and install Oracle Java 1.8.0 or later, from java.com.

Make sure you can run the correct version of the "java" program from the command line:

$ java -version
java version "1.8.0_45"
Java(TM) SE Runtime Environment (build 1.8.0_45-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode)
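If you want to check the version in a script rather than by eye, the version string can be reduced to a major version number. The function below is a sketch. Its name is hypothetical, and it assumes the first line of the java -version output contains the word version followed by a quoted version string, as in the output above.

```shell
# Hypothetical helper: extract the major Java version from the first line of
# "java -version" output, e.g. 'java version "1.8.0_45"' gives 8.
java_major_version() {
    # Pull out the quoted version string, then drop the legacy "1." prefix.
    version=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
    version=${version#1.}
    # Keep only the leading part before the first ".", "_", "+" or "-".
    printf '%s\n' "${version%%[._+-]*}"
}
```

Since java -version writes to standard error, a typical call is java_major_version "$(java -version 2>&1 | head -n 1)". GLUE needs a result of 8 or higher.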

Core prerequisite: MySQL

GLUE stores all its data in a relational database system: MySQL. Download and install MySQL 5.6 or later, from mysql.com.

GLUE uses a specific named database within MySQL and accesses it with a specific username/password. You would therefore normally create a username/password and a named database specifically for GLUE. Example set-up:

-   MySQL username: gluetools
-   MySQL password: glue12345
-   Database name: GLUE_TOOLS

You can use the following MySQL commands to set this up:

mysql> create user 'gluetools'@'localhost' identified by 'glue12345';
mysql> create database GLUE_TOOLS character set UTF8;
mysql> grant all privileges on GLUE_TOOLS.* to 'gluetools'@'localhost';

Test that the new user/password works:

$ mysql -u gluetools --password=glue12345
Welcome to the MySQL monitor.
Server version: 5.6.25 MySQL Community Server (GPL)

mysql>

In this MySQL session, test that the new named database works:

mysql> use GLUE_TOOLS;

Database changed
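The same two checks can be run non-interactively, which is convenient in set-up scripts. The sketch below uses the standard mysql client options --user, --password and --execute. The function name glue_db_check is hypothetical.

```shell
# Hypothetical helper: check non-interactively that the GLUE database user
# can log in and open the named database. Arguments: user, password, database.
glue_db_check() {
    mysql --user="$1" --password="$2" --execute="SELECT 1;" "$3" >/dev/null
}
```

For the example set-up above, glue_db_check gluetools glue12345 GLUE_TOOLS && echo "database OK" reports success only if both the login and the database selection work.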

The GLUE install directory

Your installation of GLUE is contained within its own "install" directory. Download the GLUE install zip from the GLUE download page and unzip it in a convenient location (e.g. /home/fred) to create a gluetools directory. Store the path to the gluetools directory in the environment variable GLUE_HOME, and make sure that ${GLUE_HOME}/bin is on your PATH. One way to do this is to add these lines to the end of your .bash_profile file (Mac / Cygwin), or your .profile or .bashrc file (Linux):

export GLUE_HOME=/home/fred/gluetools
export PATH=${PATH}:${GLUE_HOME}/bin

Make sure the gluetools.sh script is executable:

$ chmod u+x /home/fred/gluetools/bin/gluetools.sh
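After opening a new shell, you can verify all three conditions (GLUE_HOME set, script executable, bin directory on the PATH) in one go. The following is a sketch, and the function name glue_check_install is hypothetical.

```shell
# Hypothetical helper: verify that GLUE_HOME is set, that gluetools.sh is
# executable, and that ${GLUE_HOME}/bin is on the PATH.
glue_check_install() {
    [ -n "${GLUE_HOME:-}" ] || { echo "GLUE_HOME is not set" >&2; return 1; }
    [ -x "$GLUE_HOME/bin/gluetools.sh" ] || {
        echo "$GLUE_HOME/bin/gluetools.sh is missing or not executable" >&2
        return 1
    }
    case ":$PATH:" in
        *":$GLUE_HOME/bin:"*) echo "GLUE install looks OK" ;;
        *) echo "$GLUE_HOME/bin is not on PATH" >&2; return 1 ;;
    esac
}
```

The function stops at the first problem it finds and describes it on standard error.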

The GLUE engine jar

Download the GLUE engine jar file from the GLUE download page. Place it inside the gluetools/lib directory.

The GLUE XML configuration file

GLUE reads a configuration XML file each time it runs. The role of this file is to make GLUE aware of its local installation. So, you will need to edit this file to adapt it to your local GLUE installation.

In a text editor, load the XML file gluetools/conf/gluetools-config.xml. You will need to make sure that the database section specifies the correct MySQL username, password and database name as necessary, by editing the contents of the <username>, <password> and <jdbcUrl> elements.

<gluetools>
    <database>
        <username>gluetools</username>
        <password>glue12345</password>
        <vendor>MySQL</vendor>
        <jdbcUrl>jdbc:mysql://localhost:3306/GLUE_TOOLS?characterEncoding=UTF-8</jdbcUrl>
    </database>
    <!-- .... -->
</gluetools>
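A quick way to catch a mismatch between the configuration file and your MySQL set-up is a textual check with grep. This is a rough sketch, not real XML parsing: the function name glue_config_check is hypothetical, and it assumes each element sits on a single line as in the example above.

```shell
# Hypothetical helper: rough textual check (not real XML parsing) that a GLUE
# config file names the expected MySQL user and database.
# Arguments: config file, username, database name.
glue_config_check() {
    grep -q "<username>$2</username>" "$1" &&
        grep -q "<jdbcUrl>jdbc:mysql://[^<]*/$3[?<]" "$1"
}
```

For example, glue_config_check "$GLUE_HOME/conf/gluetools-config.xml" gluetools GLUE_TOOLS succeeds only if both the username and the database name in the JDBC URL match.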

For Windows / Cygwin, you must also add a property showing GLUE where to find the sh executable:

<gluetools>
    <!-- .... -->
    <properties>
    <!-- .... -->
        <!-- Cygwin specific config -->
        <property>
            <name>gluetools.core.cygwin.sh.executable</name>
            <value>C:\cygwin\bin\sh.exe</value>
        </property>
    <!-- .... -->
    </properties>
</gluetools>

Running the GLUE command line

GLUE has an interactive command line, which is an important tool for GLUE users. You can now test that it works by running gluetools.sh:

$ gluetools.sh
GLUE version 1.1.113
Mode path: /
GLUE>
...

Use the quit command to leave the GLUE interpreter.

Upgrading GLUE

At some point in the future you may wish to upgrade your installation to a new version of GLUE. Normally this is done as follows:

-   Download a new version of the engine jar from the GLUE download page
-   Place the new version in the gluetools/lib directory.
-   Delete the old version from the gluetools/lib directory.
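The copy-then-delete order matters: the old jar should only go once the new one is in place. The sketch below follows that order. The function name glue_upgrade_jar is hypothetical, and the jar file names are supplied by you, since they depend on the GLUE versions involved.

```shell
# Hypothetical helper: install a new engine jar into ${GLUE_HOME}/lib and then
# delete the old one. Arguments: path to the new jar, file name of the old jar.
glue_upgrade_jar() {
    new_jar="$1"
    old_name="$2"
    if [ "$(basename "$new_jar")" = "$old_name" ]; then
        echo "new and old jar have the same name; nothing to do" >&2
        return 1
    fi
    cp "$new_jar" "$GLUE_HOME/lib/" || return 1
    # Remove the old jar only once the new one is in place.
    rm -f "$GLUE_HOME/lib/$old_name"
}
```

The first argument is the downloaded jar and the second is the name of the jar currently in gluetools/lib.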

Software integration: BLAST+

GLUE uses the BLAST+ suite of programs for auto-alignment and certain other features.

Download and install BLAST+ 2.2.31 from NCBI's BLAST+ FTP page. GLUE may function correctly with later versions of BLAST+ but this has not been fully tested. In the case of Mac OSX you should use the 'universal-macosx' BLAST+ distribution.

To integrate BLAST+ into GLUE, load the XML file gluetools/conf/gluetools-config.xml in a text editor.

-   Specify the location of the blast executables blastn, tblastn and makeblastdb
-   GLUE creates BLAST databases and certain other temporary files. Specify two directories where GLUE can store these files: one for temporary files and one for BLAST databases

<gluetools>
    <!-- .... -->
    <properties>
    <!-- .... -->
        <!-- BLAST specific config -->
        <property>
            <name>gluetools.core.programs.blast.blastn.executable</name>
            <value>/home/fred/blast/ncbi-blast-2.2.31+/bin/blastn</value>
        </property>
        <property>
            <name>gluetools.core.programs.blast.tblastn.executable</name>
            <value>/home/fred/blast/ncbi-blast-2.2.31+/bin/tblastn</value>
        </property>
        <property>
            <name>gluetools.core.programs.blast.makeblastdb.executable</name>
            <value>/home/fred/blast/ncbi-blast-2.2.31+/bin/makeblastdb</value>
        </property>
        <property>
            <name>gluetools.core.programs.blast.temp.dir</name>
            <value>/home/fred/gluetools/tmp/blastfiles</value>
        </property>
        <property>
            <name>gluetools.core.programs.blast.db.dir</name>
            <value>/home/fred/gluetools/tmp/blastdbs</value>
        </property>
        <property>
            <name>gluetools.core.programs.blast.search.threads</name>
            <value>4</value>
        </property>
    <!-- .... -->
    </properties>
</gluetools>
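A mistyped executable path in this file is easy to miss, so it can help to check the paths before starting GLUE. The function below is a sketch with a hypothetical name. It works equally well for the MAFFT and RAxML executables configured in the same file.

```shell
# Hypothetical helper: report any of the given paths that is not an
# executable file; returns non-zero if at least one check fails.
glue_check_executables() {
    failed=0
    for exe in "$@"; do
        if [ ! -x "$exe" ] || [ -d "$exe" ]; then
            echo "not executable: $exe" >&2
            failed=1
        fi
    done
    return "$failed"
}
```

For the example paths above, you would call it as glue_check_executables /home/fred/blast/ncbi-blast-2.2.31+/bin/blastn /home/fred/blast/ncbi-blast-2.2.31+/bin/tblastn /home/fred/blast/ncbi-blast-2.2.31+/bin/makeblastdb.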

Software integration: MAFFT

GLUE uses MAFFT as part of its maximum likelihood genotyping procedure, and other uses are possible.

Download MAFFT from the CBRC MAFFT page and install it locally.

To integrate MAFFT into GLUE, load the XML file gluetools/conf/gluetools-config.xml in a text editor.

-   Specify the location of the MAFFT executable
-   GLUE creates temporary MAFFT files. Specify a directory where GLUE can store these files

<gluetools>
    <!-- .... -->
    <properties>
    <!-- .... -->
        <!-- MAFFT-specific config -->
        <property>
            <name>gluetools.core.programs.mafft.executable</name>
            <value>/usr/local/bin/mafft</value>
        </property>
        <property>
            <name>gluetools.core.programs.mafft.cpus</name>
            <value>4</value>
        </property>
        <property>
            <name>gluetools.core.programs.mafft.temp.dir</name>
            <value>/home/fred/gluetools/tmp/mafftfiles</value>
        </property>
    <!-- .... -->
    </properties>
</gluetools>

Software integration: RAxML

GLUE uses RAxML as part of its maximum likelihood genotyping procedure, and for general phylogenetics.

We suggest RAxML be compiled locally so that it is optimised for your hardware. Instructions can be found at the Exelixis Lab RAxML page.

To integrate RAxML into GLUE, load the XML file gluetools/conf/gluetools-config.xml in a text editor.

-   Specify the location of the RAxML executable
-   GLUE creates temporary RAxML files. Specify a directory where GLUE can store these files

<gluetools>
    <!-- .... -->
    <properties>
    <!-- .... -->
        <!-- RAxML-specific config -->
        <property>
            <name>gluetools.core.programs.raxml.raxmlhpc.executable</name>
            <value>/home/fred/RAxML/bin/raxmlHPC-PTHREADS-AVX2</value>
        </property>
        <property>
            <name>gluetools.core.programs.raxml.raxmlhpc.cpus</name>
            <value>4</value>
        </property>
        <property>
            <name>gluetools.core.programs.raxml.temp.dir</name>
            <value>/home/fred/gluetools/tmp/raxmlfiles</value>
        </property>
    <!-- .... -->
    </properties>
</gluetools>

Next steps

If you are new to GLUE we strongly recommend downloading and building the example GLUE project as the next step.

