ShenLab/mutable-sh


Self-hosting Mutable

Mutable provides tools and scripts for self-hosting a web platform for searching, analyzing, and visualizing de novo variants (DNVs).

Prerequisites

Follow the instructions to install Docker.

Clone the repository

git clone https://github.com/ShenLab/mutable-sh; cd mutable-sh

Generating the databases for extended data requires R and the following R packages:

  • bio3d
  • dplyr
  • DBI
  • stringr
  • RSQLite

Self-host Mutable

To self-host using the example datasets, download the data from here and unzip instance.zip.

Make sure all the databases (dnvs.sqlite, genes.sqlite, samples.sqlite, distance.sqlite, constraint.sqlite, plddt.sqlite, users.sqlite) are in the directory mutable-sh/instance. Inside mutable-sh/instance, create a new file config.py containing SECRET_KEY=$YOUR_KEY, replacing $YOUR_KEY with your own secret key. Flask uses this key to secure session data for the web app. If you do not have a key, you can generate a random one with python -c 'import os; print(os.urandom(12))'.
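The config.py step above can also be scripted. The following is a minimal sketch, not part of the repository: the helper name write_config is hypothetical, and it uses Python's secrets module to generate the key instead of the os.urandom one-liner. It leaves an existing config.py untouched unless asked to overwrite it, so that a running deployment keeps its session key.

```python
import secrets
from pathlib import Path


def write_config(instance_dir, overwrite=False):
    """Write config.py with a random SECRET_KEY into instance_dir.

    Hypothetical helper (not shipped with mutable-sh). Returns the path
    to config.py. An existing file is kept unless overwrite is True.
    """
    config_path = Path(instance_dir) / "config.py"
    if config_path.exists() and not overwrite:
        return config_path
    # 32 random bytes rendered as 64 hexadecimal characters.
    key = secrets.token_hex(32)
    config_path.write_text(f'SECRET_KEY = "{key}"\n')
    return config_path
```

For example, write_config("mutable-sh/instance") creates the file if it is missing and returns its path.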

We use Docker to host and deploy Mutable. To self-host Mutable, run docker build -t mutable:latest . from the mutable-sh directory to build the Docker image. Then run Mutable with docker run -v $INSTANCE_DIRECTORY:/mutable/instance -p 8000:8000 mutable:latest -b 0.0.0.0 "mutable:create_app()". Replace $INSTANCE_DIRECTORY with the absolute path to the instance folder. You can change the Docker image tag if needed.
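Because the volume mount needs an absolute path, it can help to build the docker run command programmatically. This is a small sketch under stated assumptions: the function name docker_run_command and its tag and host_port parameters are hypothetical, and the arguments simply mirror the command given above.

```python
from pathlib import Path


def docker_run_command(instance_dir, tag="mutable:latest", host_port=8000):
    """Build the `docker run` argument list for Mutable.

    Hypothetical helper: it only assembles the command from this README,
    resolving instance_dir to an absolute path for the volume mount.
    """
    instance = Path(instance_dir).resolve()
    return [
        "docker", "run",
        "-v", f"{instance}:/mutable/instance",  # mount the instance folder
        "-p", f"{host_port}:8000",              # host port -> container port 8000
        tag,
        "-b", "0.0.0.0",
        "mutable:create_app()",
    ]
```

The resulting list can be passed to subprocess.run(cmd, check=True) once the image has been built.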

Data preparation for additional user-provided data

Seven databases and one additional dataset folder are required to host Mutable. Example data can be retrieved here. To add unpublished DNV and sample data, modify dnvs.sqlite, samples.sqlite, and distance.sqlite under mutable-sh/instance accordingly.
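Before building or starting the container, a quick check can confirm that the seven database files are in place. The sketch below is illustrative only: the function name missing_databases is hypothetical, and it checks only the seven .sqlite files named in this README, not the additional dataset folder.

```python
from pathlib import Path

# The seven database files this README lists for mutable-sh/instance.
REQUIRED_DATABASES = (
    "dnvs.sqlite",
    "genes.sqlite",
    "samples.sqlite",
    "distance.sqlite",
    "constraint.sqlite",
    "plddt.sqlite",
    "users.sqlite",
)


def missing_databases(instance_dir):
    """Return the required database file names absent from instance_dir."""
    root = Path(instance_dir)
    return [name for name in REQUIRED_DATABASES if not (root / name).is_file()]
```

An empty list from missing_databases("mutable-sh/instance") means every expected file was found.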

dnvs.sqlite

Our published data for de novo variants can be retrieved here. It contains the curated, annotated data for published DNVs. See dnvs.sql for the schema required to create and import the DNV database. The schema lists the variant attributes that Mutable requires. You can add more fields if needed. We annotate the de novo variants using the annotation pipelines here.

samples.sqlite

This database contains sample-level data for the published variants. See samples.sql for the schema required to create the samples database. The schema lists the sample-level attributes that Mutable requires. You can add more fields if needed.
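When extending dnvs.sqlite or samples.sqlite, it can be useful to list the tables and columns that a database file actually contains and compare them by eye with dnvs.sql or samples.sql. The following sketch uses only Python's standard sqlite3 module. The function name describe_database is hypothetical, and it makes no assumption about what the table or column names are.

```python
import sqlite3
from contextlib import closing


def describe_database(path):
    """Return {table_name: [column_name, ...]} for a SQLite file.

    Note: sqlite3.connect creates an empty file if path does not exist,
    so check the path first when that matters.
    """
    with closing(sqlite3.connect(path)) as conn:
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        # PRAGMA table_info yields one row per column; index 1 is the column name.
        return {
            table: [col[1] for col in conn.execute(f'PRAGMA table_info("{table}")')]
            for table in tables
        }
```

Calling describe_database("samples.sqlite") from the instance directory returns a dictionary that maps each table to its column names.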

distance.sqlite

Our example pairwise distance data for the published DNVs can be found here. This database stores the pairwise 1D and 3D distances. We calculate the spatial distances using the script here. First unzip the downloaded UP000005640_9606_HUMAN_v4.zip, then run the script to generate the pairwise distances between variants. Modify the paths in the script if needed. By default the script takes four arguments, in this order: the paths to dnvs.sqlite, genes.sqlite, UP000005640_9606_HUMAN_v4, and distance.sqlite. Make sure all the required R packages are installed. Then, from the instance directory, run Rscript ../protein_link.R dnvs.sqlite genes.sqlite UP000005640_9606_HUMAN_v4 distance.sqlite. The script updates distance.sqlite, or creates it if it does not already exist.

other databases

The other databases can also be retrieved here. The users.sqlite in the example data only contains access for guest users. If you want to host Mutable and let specific users register for access to the complete data, add each user's email address as a new username in users.sqlite, following user.sql. The authorized users can then register on Mutable with that username and a password of their own. Passwords are hashed to protect credentials.

constraint.sqlite, genes.sqlite, and plddt.sqlite do not require modification, as they contain complete curated gene-level information for all human genes. You can still modify these databases if needed.

Annotation data specification

The gene-centered page displays a selected set of variant-level annotations, covering computational estimates and selection coefficients. You can modify config.json to specify which data columns appear in the table.
