Installation
There are two alternative modes for installing command-line GLUE: Docker-based installation and native installation.
Docker-based installation
Docker is a platform which allows you to run software in the form of lightweight "containers". GLUE can be installed in the form of Docker containers. We would recommend this route in most circumstances as it offers a quick and less error-prone set-up. Also, GLUE will be isolated from other software on the computer and the operation of GLUE should generally be more predictable.
- Install Docker
- Familiarise yourself with some Docker concepts
- How Docker-based GLUE works
- Set up a gluetools-mysql container
- Set up a gluetools container
- Hints and tips for Docker-based GLUE
- Next steps
Install Docker
Docker Engine Community Edition is the software that manages Docker containers. It is available for Mac OSX, Windows and various Linux distributions. GLUE has been tested on Docker Engine version 18.06.
Familiarise yourself with some Docker concepts
If you are unfamiliar with Docker, you need to learn a few basic concepts. We would recommend at least reading through chapters 1 and 2 of the Get Started with Docker guide.
How Docker-based GLUE works
GLUE is packaged into two Docker images. The cvrbioinformatics/gluetools-mysql image provides the MySQL database which GLUE will use and some scripts for updating it. A container based on this image provides GLUE with its persistent database storage, and runs in the background as a daemon.
The cvrbioinformatics/gluetools image provides the GLUE engine software itself plus its 3rd-party dependencies such as RAxML and MAFFT. Containers based on this image will be run in a transient way, each time a GLUE interactive session is run.
Set up a gluetools-mysql container
Pull the cvrbioinformatics/gluetools-mysql image from Docker Hub:
$ docker pull cvrbioinformatics/gluetools-mysql:latest
Start a container called gluetools-mysql based on this image:
$ docker run --detach --name gluetools-mysql cvrbioinformatics/gluetools-mysql:latest
Because the container was started in detached mode, it runs in the background as a daemon.
Set up a gluetools container
Pull the cvrbioinformatics/gluetools image from Docker Hub:
$ docker pull cvrbioinformatics/gluetools:latest
Start a container called gluetools based on this image, linking it to the gluetools-mysql container:
$ docker run --rm -it --name gluetools --link gluetools-mysql cvrbioinformatics/gluetools:latest
This will start an interactive GLUE session within the new container:
GLUE Version 1.1.113
Copyright (C) 2018 The University of Glasgow
This program comes with ABSOLUTELY NO WARRANTY. This is free software, and you are welcome to redistribute it under certain conditions. For details see GNU Affero General Public License v3: http://www.gnu.org/licenses/
Mode path: /
Option load-save-path: /opt/gluetools/projects/exampleProject
GLUE> ...
When the interactive session completes, the container will be removed (via the --rm option).
Hints and tips for Docker-based GLUE
Starting and stopping the gluetools-mysql container
The gluetools-mysql container contains the GLUE database which you normally want to keep in place from one GLUE interactive session to the next. When you restart your computer this container will be in a "stopped" state. To start it again use:
$ docker start gluetools-mysql
You can stop it with:
$ docker stop gluetools-mysql
If you remove this container, your database contents will be lost.
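As a minimal sketch of the start/stop advice above, the following shell function checks whether the gluetools-mysql container is running and starts it only if it is stopped. The function name ensure_glue_db is illustrative and not part of GLUE; it relies only on the standard docker inspect and docker start commands.

```shell
# Sketch: make sure the gluetools-mysql container is running before a session.
# ensure_glue_db is a hypothetical helper name, not something GLUE provides.
ensure_glue_db() {
  local name="gluetools-mysql" running
  # docker inspect exits non-zero if no container with this name exists.
  if ! running=$(docker inspect --format '{{.State.Running}}' "$name" 2>/dev/null); then
    echo "No container named $name; create it with docker run first." >&2
    return 1
  fi
  if [ "$running" = "true" ]; then
    echo "$name is already running"
  else
    # The container exists but is stopped (e.g. after a reboot).
    docker start "$name" >/dev/null && echo "$name started"
  fi
}
```

Calling ensure_glue_db before each docker run of the gluetools container avoids link failures after a reboot. It never removes the container, so the database contents are left alone.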
Install a pre-built GLUE dataset in the gluetools-mysql container
While the gluetools-mysql container is running, you can install various pre-built GLUE projects. For example, the latest NCBI-HCV-GLUE project build can be installed using this command:
$ docker exec gluetools-mysql installGlueProject.sh ncbi_hcv_glue
Note: this command will wipe any previous data from the database.
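Because the install command wipes existing data, it can be worth wrapping it in a guard. The sketch below refuses to run unless the caller passes an explicit confirmation; the function name install_glue_project and the --yes flag are illustrative assumptions, and only the docker exec line comes from the instructions above.

```shell
# Sketch: require explicit confirmation before installing a pre-built project,
# because installGlueProject.sh wipes the existing database contents.
# install_glue_project and --yes are hypothetical names for illustration.
install_glue_project() {
  local project="${1:-}" confirm="${2:-}"
  if [ -z "$project" ]; then
    echo "usage: install_glue_project <project> --yes" >&2
    return 2
  fi
  if [ "$confirm" != "--yes" ]; then
    echo "Refusing: this wipes the GLUE database. Re-run with --yes." >&2
    return 1
  fi
  # Same command as shown above, with the project name as a parameter.
  docker exec gluetools-mysql installGlueProject.sh "$project"
}
```

For example, install_glue_project ncbi_hcv_glue --yes runs the same command as above, while omitting --yes leaves the database untouched.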
Wipe the database in the gluetools-mysql container
You can wipe the GLUE database using this command:
$ docker exec gluetools-mysql glueWipeDatabase.sh
Volume mapping
Each Docker container has its own isolated file system, so by default files outside the container cannot be accessed by GLUE. Since the gluetools container is transient, its file system will be removed at the end of the GLUE session.
If you want GLUE to read your own project data from a directory outside the container or save any file output, add the --volume option to the docker run command for the gluetools container. This maps a directory outside the container (i.e. on the host filesystem) to a path within the container filesystem. For example, if you use
--volume /home/fred/my_glue_project:/opt/gluetools/projects/my_glue_project
then the directory /home/fred/my_glue_project will be readable/writable inside the container at /opt/gluetools/projects/my_glue_project. Multiple directories can be mapped with this option.
Working directory
The working directory for the gluetools container defaults to the example project directory. However this can be overridden by adding the following option to the docker run command:
--workdir /opt/gluetools/projects/my_glue_project
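The volume mapping and working directory options are usually combined. The following sketch wraps them in one function that maps a host project directory into /opt/gluetools/projects and starts the session there. The helper name glue_session is hypothetical, and mounting the directory under its own base name is an assumption made for illustration.

```shell
# Sketch: start a GLUE session with one host directory mapped into the
# container and used as the working directory.
# glue_session is a hypothetical helper name, not part of GLUE.
glue_session() {
  local host_dir name target
  # Bind mounts need an absolute host path, so resolve the argument first.
  host_dir=$(cd "$1" 2>/dev/null && pwd) || {
    echo "Not a directory: $1" >&2
    return 1
  }
  name=$(basename "$host_dir")
  # Assumption: mount the project under the same base name in the container.
  target="/opt/gluetools/projects/$name"
  docker run --rm -it --name gluetools --link gluetools-mysql \
    --volume "$host_dir:$target" \
    --workdir "$target" \
    cvrbioinformatics/gluetools:latest
}
```

With this in place, glue_session /home/fred/my_glue_project is equivalent to the docker run command with the --volume and --workdir options from the examples above.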
Use .gluerc and .glue_history files from the host file system
These files store your GLUE console preferences and command history. You may want to use .gluerc and .glue_history files from the host file system rather than the container file system. To do this, map your home directory using a --volume option in the docker run command:
--volume /home/fred:/home/fred
Then also add this --env option to the docker run command:
--env _JAVA_OPTIONS=-Duser.home=/home/fred
Run bash in the container
You can run an interactive bash session rather than a GLUE session in the container by simply adding /bin/bash to the end of the docker run command.
$ docker run --rm -it --name gluetools --link gluetools-mysql cvrbioinformatics/gluetools:latest /bin/bash
Next steps
If you are new to GLUE we strongly recommend building the example GLUE project as the next step. The example project directory is included within the container file system.
Native installation
You can install GLUE to run directly on your system instead of in a container. This is a much more complex setup but might be preferable in some circumstances. The native installation offers more control over the various software packages GLUE uses. The GLUE installation may take up less disk space and some GLUE operations may be faster.
1. Supported operating systems
2. Core prerequisite: Java
3. Core prerequisite: MySQL
4. The GLUE install directory
5. The GLUE engine jar
6. The GLUE XML configuration file
7. Running the GLUE command line
8. Upgrading GLUE
9. Software integration: BLAST+
10. Software integration: MAFFT
11. Software integration: RAxML
12. Next steps
1. Supported operating systems
You can install native GLUE on MS Windows, Linux and Mac OSX.
If you are using Windows, you must also install Cygwin, which you can download from cygwin.org.
2. Core prerequisite: Java
Download and install Oracle Java 1.8.0 or later, from java.com.
Make sure you can run the correct version of the "java" program from the command line:
$ java -version
java version "1.8.0_45"
Java(TM) SE Runtime Environment (build 1.8.0_45-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode)
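The version check can be scripted. The sketch below tests a Java version string against the 1.8 minimum and copes with both the legacy "1.8.0_45" numbering and the newer "11.0.2" style. The function name java_version_ok is illustrative, and the sed expression in the usage comment assumes the version appears in double quotes as in the output above.

```shell
# Sketch: check that a Java version string meets the 1.8 minimum.
# java_version_ok is a hypothetical helper name.
java_version_ok() {
  local version="$1" major
  major=${version%%.*}              # text before the first dot
  if [ "$major" = "1" ]; then
    version=${version#1.}           # legacy scheme: drop the leading "1."
    major=${version%%.*}            # e.g. "8" from "8.0_45"
  fi
  major=${major%%[!0-9]*}           # strip suffixes such as "-ea"
  [ -n "$major" ] && [ "$major" -ge 8 ]
}

# Possible usage (java -version writes to stderr):
#   v=$(java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
#   java_version_ok "$v" || echo "Java 1.8.0 or later is required" >&2
```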
3. Core prerequisite: MySQL
GLUE stores all its data in a relational database system: MySQL. Download and install MySQL 5.6 or later, from mysql.com.
GLUE uses a specific named database within MySQL and accesses this with a specific username/password. So you would normally create a username/password and named database specifically for GLUE. Example set up:
- MySQL username: gluetools
- MySQL password: glue12345
- Database name: GLUE_TOOLS
You can use the following MySQL commands to set this up:
mysql> create user 'gluetools'@'localhost' identified by 'glue12345';
mysql> create database GLUE_TOOLS character set UTF8;
mysql> grant all privileges on GLUE_TOOLS.* to 'gluetools'@'localhost';
Test that the new user/password works:
$ mysql -u gluetools --password=glue12345
Welcome to the MySQL monitor.
Server version: 5.6.25 MySQL Community Server (GPL)
mysql>
In this MySQL session, test that the new named database works:
mysql> use GLUE_TOOLS;
Database changed
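If you prefer your own username, password or database name, the three setup statements can be generated rather than typed by hand. This is a minimal sketch: the function name glue_mysql_setup_sql is illustrative, and the values are inserted verbatim, so it assumes they contain no quotes or backslashes.

```shell
# Sketch: print the three MySQL setup statements for a chosen user,
# password and database name.
# glue_mysql_setup_sql is a hypothetical helper; values are not escaped.
glue_mysql_setup_sql() {
  local user="$1" password="$2" db="$3"
  cat <<EOF
create user '$user'@'localhost' identified by '$password';
create database $db character set UTF8;
grant all privileges on $db.* to '$user'@'localhost';
EOF
}
```

One possible use, assuming you administer MySQL through the root account, is to pipe the output into the client: glue_mysql_setup_sql gluetools glue12345 GLUE_TOOLS | mysql -u root -p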
4. The GLUE install directory
Your installation of GLUE will be contained within its own "install" directory. Download the GLUE install zip from the GLUE download page. Unzip the GLUE install zip file in a convenient location (e.g. /home/fred), to create a gluetools directory. Ensure that the path to the gluetools directory is stored in the environment variable GLUE_HOME. Also make sure that ${GLUE_HOME}/bin is on your bash path. This can be done for example by adding these lines to the end of your .bash_profile file (Mac / Cygwin), .profile or .bashrc file (Linux).
export GLUE_HOME=/home/fred/gluetools
export PATH=${PATH}:${GLUE_HOME}/bin
Make sure the bash script is executable:
$ chmod u+x /home/fred/gluetools/bin/gluetools.sh
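A quick way to confirm these settings took effect in a new shell is a small sanity check. The sketch below verifies the three conditions described above; check_glue_env is an illustrative name, not a GLUE command.

```shell
# Sketch: sanity-check GLUE_HOME, the launcher script and PATH.
# check_glue_env is a hypothetical helper name.
check_glue_env() {
  if [ -z "${GLUE_HOME:-}" ]; then
    echo "GLUE_HOME is not set" >&2
    return 1
  fi
  if [ ! -x "$GLUE_HOME/bin/gluetools.sh" ]; then
    echo "$GLUE_HOME/bin/gluetools.sh is missing or not executable" >&2
    return 1
  fi
  # Wrap PATH in colons so the first and last entries match the same pattern.
  case ":$PATH:" in
    *":$GLUE_HOME/bin:"*) ;;
    *) echo "$GLUE_HOME/bin is not on PATH" >&2; return 1 ;;
  esac
  echo "GLUE environment looks OK"
}
```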
5. The GLUE engine jar
Download the GLUE engine jar file from the GLUE download page. Place it inside the gluetools/lib directory.
6. The GLUE XML configuration file
GLUE reads a configuration XML file each time it runs. The role of this file is to make GLUE aware of its local installation. So, you will need to edit this file to adapt it to your local GLUE installation.
In a text editor, load the XML file gluetools/conf/gluetools-config.xml. You will need to make sure that the database section specifies the correct MySQL username, password and database name as necessary, by editing the contents of the <username>, <password> and <jdbcUrl> elements.
<gluetools>
    <database>
        <username>gluetools</username>
        <password>glue12345</password>
        <vendor>MySQL</vendor>
        <jdbcUrl>jdbc:mysql://localhost:3306/GLUE_TOOLS?characterEncoding=UTF-8</jdbcUrl>
    </database>
    <!-- .... -->
</gluetools>
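Note that the database name is not a separate element: it is the last path component of the <jdbcUrl> value, after the host and port. The sketch below builds that value from its parts, following the pattern of the example above. The function name glue_jdbc_url is illustrative, and the defaults of localhost and 3306 simply mirror the example.

```shell
# Sketch: build the <jdbcUrl> value from database name, host and port.
# glue_jdbc_url is a hypothetical helper; defaults mirror the example config.
glue_jdbc_url() {
  local db="$1" host="${2:-localhost}" port="${3:-3306}"
  printf 'jdbc:mysql://%s:%s/%s?characterEncoding=UTF-8\n' "$host" "$port" "$db"
}
```

For example, glue_jdbc_url GLUE_TOOLS prints the URL used in the configuration above.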
For Windows / Cygwin, you must also add a property showing GLUE where to find the sh executable:
<gluetools>
    <!-- .... -->
    <properties>
        <!-- .... -->
        <!-- Cygwin specific config -->
        <property>
            <name>gluetools.core.cygwin.sh.executable</name>
            <value>C:\cygwin\bin\sh.exe</value>
        </property>
        <!-- .... -->
    </properties>
</gluetools>
7. Running the GLUE command line
GLUE has an interactive command line, which is an important tool for GLUE users. We can now test that this works by running gluetools.sh:
$ gluetools.sh
GLUE version 1.1.113
Mode path: /
GLUE>
...
Use the quit command to leave the GLUE interpreter.
8. Upgrading GLUE
At some point in the future you may wish to upgrade your installation to a new version of GLUE. Normally this is done as follows:
- Download a new version of the engine jar from the GLUE download page.
- Place the new version in the gluetools/lib directory.
- Delete the old version from the gluetools/lib directory.
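The last two upgrade steps can be sketched as a small shell function. The name glue_swap_jar is hypothetical, and the jar file names are passed in as arguments because the exact names depend on the versions you downloaded.

```shell
# Sketch: place a new engine jar in the lib directory, then delete the old one.
# glue_swap_jar is a hypothetical helper; jar names are supplied by the caller.
glue_swap_jar() {
  local lib_dir="$1" old_jar="$2" new_jar="$3"
  [ -d "$lib_dir" ] || { echo "No such directory: $lib_dir" >&2; return 1; }
  [ -f "$new_jar" ] || { echo "No such file: $new_jar" >&2; return 1; }
  if [ "$(basename "$new_jar")" = "$old_jar" ]; then
    echo "New and old jar have the same name; nothing to swap" >&2
    return 1
  fi
  # Copy first, so a failed copy never leaves the lib directory without a jar.
  cp "$new_jar" "$lib_dir/" || return 1
  rm -f "$lib_dir/$old_jar"
}
```

A call takes the form glue_swap_jar "$GLUE_HOME/lib" <old jar file name> <path to downloaded jar>.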
9. Software integration: BLAST+
GLUE uses the BLAST+ suite of programs for auto-alignment and certain other features.
Download and install BLAST+ 2.2.31 from NCBI's BLAST+ FTP page. GLUE may function correctly with later versions of BLAST+ but this has not been fully tested. In the case of Mac OSX you should use the 'universal-macosx' BLAST+ distribution.
To integrate BLAST+ into GLUE, load the XML file gluetools/conf/gluetools-config.xml in a text editor.
- Specify the location of the blast executables blastn, tblastn and makeblastdb
- GLUE creates BLAST databases and certain other temporary files. Specify two directories, where GLUE can store these files
<gluetools>
    <!-- .... -->
    <properties>
        <!-- .... -->
        <!-- BLAST specific config -->
        <property>
            <name>gluetools.core.programs.blast.blastn.executable</name>
            <value>/home/fred/blast/ncbi-blast-2.2.31+/bin/blastn</value>
        </property>
        <property>
            <name>gluetools.core.programs.blast.tblastn.executable</name>
            <value>/home/fred/blast/ncbi-blast-2.2.31+/bin/tblastn</value>
        </property>
        <property>
            <name>gluetools.core.programs.blast.makeblastdb.executable</name>
            <value>/home/fred/blast/ncbi-blast-2.2.31+/bin/makeblastdb</value>
        </property>
        <property>
            <name>gluetools.core.programs.blast.temp.dir</name>
            <value>/home/fred/gluetools/tmp/blastfiles</value>
        </property>
        <property>
            <name>gluetools.core.programs.blast.db.dir</name>
            <value>/home/fred/gluetools/tmp/blastdbs</value>
        </property>
        <property>
            <name>gluetools.core.programs.blast.search.threads</name>
            <value>4</value>
        </property>
        <!-- .... -->
    </properties>
</gluetools>
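Before starting GLUE with this configuration, it may help to confirm that the paths are valid. The sketch below checks that the three BLAST+ executables exist and creates the two working directories. The function name check_blast_setup is illustrative. Whether GLUE creates these directories itself is not stated here, so creating them up front is only a precaution.

```shell
# Sketch: verify the BLAST+ executables and create the two working directories.
# check_blast_setup is a hypothetical helper name.
check_blast_setup() {
  local bin_dir="$1" temp_dir="$2" db_dir="$3" exe missing=0
  for exe in blastn tblastn makeblastdb; do
    if [ ! -x "$bin_dir/$exe" ]; then
      echo "Missing or not executable: $bin_dir/$exe" >&2
      missing=1
    fi
  done
  # mkdir -p is a no-op for directories that already exist.
  mkdir -p "$temp_dir" "$db_dir" || return 1
  return "$missing"
}
```

With the example paths above, the call would be check_blast_setup /home/fred/blast/ncbi-blast-2.2.31+/bin /home/fred/gluetools/tmp/blastfiles /home/fred/gluetools/tmp/blastdbs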
10. Software integration: MAFFT
GLUE uses MAFFT as part of its maximum likelihood genotyping procedure, and other uses are possible.
Download MAFFT from the CBRC MAFFT page and install it locally.
To integrate MAFFT into GLUE, load the XML file gluetools/conf/gluetools-config.xml in a text editor.
- Specify the location of the MAFFT executable
- GLUE creates temporary MAFFT files. Specify a directory where GLUE can store these files
<gluetools>
    <!-- .... -->
    <properties>
        <!-- .... -->
        <!-- MAFFT-specific config -->
        <property>
            <name>gluetools.core.programs.mafft.executable</name>
            <value>/usr/local/bin/mafft</value>
        </property>
        <property>
            <name>gluetools.core.programs.mafft.cpus</name>
            <value>4</value>
        </property>
        <property>
            <name>gluetools.core.programs.mafft.temp.dir</name>
            <value>/home/fred/gluetools/tmp/mafftfiles</value>
        </property>
        <!-- .... -->
    </properties>
</gluetools>
11. Software integration: RAxML
GLUE uses RAxML as part of its maximum likelihood genotyping procedure, and for general phylogenetics.
We suggest RAxML be compiled locally so that it is optimised for your hardware. Instructions can be found at the Exelixis Lab RAxML page.
To integrate RAxML into GLUE, load the XML file gluetools/conf/gluetools-config.xml in a text editor.
- Specify the location of the RAxML executable
- GLUE creates temporary RAxML files. Specify a directory where GLUE can store these files
<gluetools>
    <!-- .... -->
    <properties>
        <!-- .... -->
        <!-- RAxML-specific config -->
        <property>
            <name>gluetools.core.programs.raxml.raxmlhpc.executable</name>
            <value>/home/fred/RAxML/bin/raxmlHPC-PTHREADS-AVX2</value>
        </property>
        <property>
            <name>gluetools.core.programs.raxml.raxmlhpc.cpus</name>
            <value>4</value>
        </property>
        <property>
            <name>gluetools.core.programs.raxml.temp.dir</name>
            <value>/home/fred/gluetools/tmp/raxmlfiles</value>
        </property>
        <!-- .... -->
    </properties>
</gluetools>
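All of the integration settings for BLAST+, MAFFT and RAxML follow the same <property> pattern of a name and a value. As a small sketch, the function below prints one such element so it can be pasted into the <properties> section. The name glue_property_xml is illustrative, and the value is not XML-escaped, so it assumes plain paths and numbers.

```shell
# Sketch: print one <property> element for gluetools-config.xml.
# glue_property_xml is a hypothetical helper; values are not XML-escaped.
glue_property_xml() {
  printf '<property>\n  <name>%s</name>\n  <value>%s</value>\n</property>\n' "$1" "$2"
}
```

For example, glue_property_xml gluetools.core.programs.raxml.raxmlhpc.cpus 4 prints the CPU setting shown above.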
12. Next steps
If you are new to GLUE we strongly recommend downloading and building the example GLUE project as the next step.
GLUE by Robert J. Gifford Lab.
For questions, issues, or feedback, please open an issue on the GitHub repository.