Extracting gene expression datasets from the CCLE database.
Language | Packages |
---|---|
Python >= 3.7 | pandas, matplotlib, seaborn |
R | biomaRt, dplyr, edgeR |
- CCLE_RNAseq_rsem_genes_tpm_20180929.txt.gz
- Cell_lines_annotations_20181226.txt
- CCLE_RNAseq_genes_counts_20180929.gct.gz
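
The TPM file listed above is a gzip-compressed text table, so it can be inspected directly with pandas. The following is only a minimal sketch and not part of this package: `load_tpm_table` is a hypothetical helper, and the tab-separated layout with gene identifiers in the first column is an assumption. The demo writes a tiny stand-in file with made-up values so that it runs without the real data.

```python
import gzip
import os
import tempfile

import pandas as pd


def load_tpm_table(path):
    """Hypothetical helper: read a gzip-compressed, tab-separated table,
    using the first column (assumed to hold gene identifiers) as the index."""
    return pd.read_csv(path, sep="\t", compression="gzip", index_col=0)


# Build a tiny stand-in file (made-up values) so the sketch is runnable
# without downloading the real CCLE data.
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, "demo_tpm.txt.gz")
with gzip.open(demo_path, "wt") as handle:
    handle.write("gene_id\tMCF7_BREAST\tMDAMB231_BREAST\n")
    handle.write("GENE_A\t1.5\t20.0\n")
    handle.write("GENE_B\t0.0\t3.25\n")

demo = load_tpm_table(demo_path)
print(demo)
```

To try it on the real data, pass the path of the downloaded `CCLE_RNAseq_rsem_genes_tpm_20180929.txt.gz` to the helper and check the resulting column names before relying on the assumed layout.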
from ccle.database import CancerCellLineEncyclopedia as CCLE
# Set gene_names
# Set ccle_names or cell_lines
selected_CCLE_subset = CCLE(
    gene_names=['EGFR', 'ERBB2', 'ERBB3', 'ERBB4'],
    ccle_names=['MCF7_BREAST', 'MDAMB231_BREAST']
)
''' or
selected_CCLE_subset = CCLE(
    gene_names=['EGFR', 'ERBB2', 'ERBB3', 'ERBB4'],
    cell_lines=['MCF7', 'MDA-MB-231']
)
'''
# GeneCards link (https://www.genecards.org)
selected_CCLE_subset.to_gene_summary()
# TPM value
selected_CCLE_subset.to_gene_expression()
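
For readers who want to see what selecting genes and cell lines from an expression table amounts to, here is a rough pandas sketch. It is not the package's implementation: `subset_expression` is a hypothetical helper, the table layout (genes as rows, CCLE names as columns) is an assumption, and all values are made up.

```python
import pandas as pd

# Toy TPM table with made-up values: rows are genes, columns are CCLE-style names.
tpm = pd.DataFrame(
    {
        "MCF7_BREAST": [1.0, 20.0, 30.0, 0.5],
        "MDAMB231_BREAST": [50.0, 2.0, 1.5, 0.0],
        "A549_LUNG": [40.0, 3.0, 2.0, 0.1],
    },
    index=["EGFR", "ERBB2", "ERBB3", "ERBB4"],
)


def subset_expression(table, gene_names, ccle_names):
    """Hypothetical helper: return the rows for gene_names and the columns
    for ccle_names, raising KeyError if any requested name is absent."""
    missing_genes = [g for g in gene_names if g not in table.index]
    missing_cells = [c for c in ccle_names if c not in table.columns]
    if missing_genes or missing_cells:
        raise KeyError(
            f"not found: genes={missing_genes}, cell lines={missing_cells}"
        )
    return table.loc[gene_names, ccle_names]


subset = subset_expression(
    tpm,
    gene_names=["EGFR", "ERBB2"],
    ccle_names=["MCF7_BREAST", "MDAMB231_BREAST"],
)
print(subset)
```

Checking for missing names up front gives a single error message listing everything that was not found, which is easier to act on than a failure on the first bad label.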
$ git clone https://github.com/okadalabipr/ccle_extractor.git