Tutorial
CellBase comes with a Python program to download all the data sources required for building the databases, and with a Java command line interface (CLI) for building the database. You will need at least Java 7. After the installation you should have a directory with this structure: ... ...
You can find the Python script at...
Once the data has been downloaded, we can build the data models for MongoDB with the Java CLI.
To run the Java CLI you must execute:
java -jar libs/cellbase-build-3.1.0.jar --build
CellBase comes with a command line interface (CLI) written in Java. You will need at least Java 7 to run the CellBase CLI. After the installation you should have a cellbase/cellbase-build/installation-dir/ directory with:
/tmp/cellbase/cellbase-build/installation-dir/
├── bin
│   ├── cosmic
│   │   └── cosmic_mutations.sh
│   ├── ensembl-scripts
│   │   ├── DB_CONFIG.pm
│   │   ├── gene_extra_info_cellbase.pl
│   │   ├── protein_function_prediction_matrices.pl
. . . . . . . . .
│   ├── genome-fetcher
│   │   ├── CHECKSUMS
│   │   ├── DB_CONFIG.pm
│   │   ├── genome-fetcher.py
. . . . . . . . .
│   ├── obsolete
│   │   ├── cellbase-builder.py
│   │   └── genome_info.pl
│   └── protein
│       └── uniprot_spliter.pl
├── cellbase-installer.py
├── example
│   ├── BasicTest.gvf
│   ├── BasicTest.Json
│   ├── Escherichia coli.owl
│   ├── Homo_sapiens_incl_consequences_1000.gvf
│   └── Homo_sapiens_incl_consequences_1000.Json
├── libs
│   ├── cellbase-build-3.1.0.jar
│   ├── cellbase-core-3.1.0.jar
│   ├── cellbase-mongodb-3.1.0.jar
. . . . . . . . .
├── mongodb-scripts
│   ├── conserverd_region-indexes.js
│   ├── create-biouser.js
│   ├── drugbank-indexes.js
. . . . . . . . .
└── run_cellbase.sh
To run the CLI you must execute:
java -jar libs/cellbase-build-3.1.0.jar --build
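Since the CLI needs at least Java 7, it can help to check the installed version before running the jar. The following is a minimal Python sketch, not part of CellBase: the function names `parse_java_major` and `installed_java_major` are hypothetical, and it assumes `java` is on the PATH and prints a line of the form `... version "X.Y..."`.

```python
import re
import subprocess


def parse_java_major(version_output):
    """Return the major Java version found in `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    # Releases before Java 9 report themselves as 1.x (e.g. 1.7.0_80 is Java 7).
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    return first


def installed_java_major(java_cmd="java"):
    """Run `java -version` and parse its output; None if Java cannot be run."""
    try:
        result = subprocess.run([java_cmd, "-version"],
                                capture_output=True, text=True)
    except OSError:
        return None
    # `java -version` writes to stderr; stdout is appended just in case.
    return parse_java_major(result.stderr + result.stdout)


if __name__ == "__main__":
    major = installed_java_major()
    if major is None:
        print("Java not found or version not recognised")
    elif major < 7:
        print("Java %d found, but the CellBase CLI needs at least Java 7" % major)
    else:
        print("Java %d found" % major)
```

The parsing is kept separate from the subprocess call so it can be checked against sample version strings without Java installed.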
Go to the CellBase genome-fetcher folder and run the download script:
cd cellbase/cellbase-build/installation-dir/bin/genome-fetcher
./genome-fetcher.py -s "Homo sapiens" --sequence 1 --gene 1 --variation 1 -o /tmp
This will download the data files into the /tmp folder.
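Before building, it may be worth confirming that the download produced the folders the build commands in this tutorial read from. This is a small Python sketch under assumptions: the `/tmp/homo_sapiens` location and the `sequence`, `gene` and `variation` subfolder names are inferred from the paths used by those build commands, and `missing_download_dirs` is a hypothetical helper, not part of CellBase.

```python
from pathlib import Path

# Subfolder names inferred from the paths in this tutorial's build commands.
EXPECTED_SUBDIRS = ("sequence", "gene", "variation")


def missing_download_dirs(species_dir):
    """Return the expected subfolders that are absent or empty under species_dir."""
    root = Path(species_dir)
    missing = []
    for name in EXPECTED_SUBDIRS:
        folder = root / name
        # A folder counts as missing if it does not exist or holds no entries.
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Assumed download location for "Homo sapiens" with -o /tmp.
    missing = missing_download_dirs("/tmp/homo_sapiens")
    if missing:
        print("Missing or empty: " + ", ".join(missing))
    else:
        print("All expected download folders are present")
```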
Go to:
cd cellbase/cellbase-build/installation-dir/
and execute the CellBase CLI. For building the genome sequence collection:
java -jar libs/cellbase-build-3.1.0.jar --build genome-sequence \
--fasta-file /tmp/homo_sapiens/sequence/<file.fa.gz> -o /tmp/
For building the gene collection:
java -jar libs/cellbase-build-3.1.0.jar --build gene \
--indir /tmp/homo_sapiens/gene \
--fasta-file /tmp/homo_sapiens/sequence/<file.fa.gz> -o /tmp/
For building the variation collections:
java -jar libs/cellbase-build-3.1.0.jar --build variation \
--indir /tmp/homo_sapiens/variation -o /tmp/
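The three build steps above share the same jar and differ only in their arguments, so they can be assembled in one place. The sketch below is illustrative only: `find_fasta` and `build_commands` are hypothetical helpers, it uses just the flags shown in this tutorial, and it assumes the sequence folder holds exactly one `.fa.gz` file (the actual file name is not given here). It prints the commands rather than running them.

```python
import glob
import os

JAR = "libs/cellbase-build-3.1.0.jar"


def find_fasta(sequence_dir):
    """Return the single *.fa.gz file in sequence_dir; raise if none or several."""
    matches = sorted(glob.glob(os.path.join(sequence_dir, "*.fa.gz")))
    if len(matches) != 1:
        raise ValueError("expected exactly one .fa.gz file in %s, found %d"
                         % (sequence_dir, len(matches)))
    return matches[0]


def build_commands(data_dir, out_dir, fasta_file):
    """Return the three build command lines as argument lists."""
    base = ["java", "-jar", JAR, "--build"]
    return [
        base + ["genome-sequence", "--fasta-file", fasta_file, "-o", out_dir],
        base + ["gene", "--indir", os.path.join(data_dir, "gene"),
                "--fasta-file", fasta_file, "-o", out_dir],
        base + ["variation", "--indir", os.path.join(data_dir, "variation"),
                "-o", out_dir],
    ]


if __name__ == "__main__":
    data_dir = "/tmp/homo_sapiens"
    try:
        fasta = find_fasta(os.path.join(data_dir, "sequence"))
    except ValueError as err:
        print(err)
    else:
        # Run from cellbase/cellbase-build/installation-dir/ so libs/ resolves.
        for cmd in build_commands(data_dir, "/tmp/", fasta):
            print(" ".join(cmd))
```

Each command is kept as an argument list, so it could be passed to `subprocess.run` unchanged if you prefer to execute the steps from Python.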