metaboanalyst
Don Teng edited this page May 28, 2019 · 3 revisions
This is a web app with a large suite of tools for metabolomics data analysis. It's based on MetaboAnalystR v2.0, the R package.
Capabilities:
- Statistical analysis^ - Some data processing (e.g. normalization and, if applicable, log transformation), followed by t-tests, ANOVA, volcano plots, k-means, random forest, and more; too many to list.
- Enrichment analysis - Enrichment analysis aims to evaluate whether the observed genes and metabolites in a particular pathway appear more or less often than expected by random chance within the given dataset. Given an input list of compounds and (optionally) concentrations (entered in a text field or uploaded as .csv), it compares them against a metabolite set of your choice (e.g. blood, urine, CSF, or your own custom metabolite set) to see which compounds appear more or less than "usual" relative to the selected metabolite set, in terms of fold-change and p-value.
- Pathway enrichment analysis - Pathway analysis (a.k.a. topology/network analysis) aims to evaluate whether a given gene or metabolite plays an important role in a biological response based on its position within a pathway/network. Uses graph theory concepts like node centrality, betweenness, etc. Given an input list of compounds and (optionally) concentrations (entered in a text field or uploaded as .csv), it maps them onto a known network (e.g. human KEGG/SMPDB, fruitfly, yeast, etc.) and applies topological analysis algorithms that do network clustering. Outputs a heatmap of pathway impact (effect size), with associated p-values. Results can be downloaded as a `zip` file containing the results in a bunch of `csv`s, along with the associated `R` commands used.
- Joint pathway analysis - Takes as input a gene list (using official gene names, with optional fold changes) and a metabolite list (KEGG IDs, with optional fold changes). Outputs a heatmap of pathway impact (effect size), with associated p-values. Results can be downloaded as a `zip` file containing the results in a bunch of `csv`s, along with the associated `R` commands used.
- Network explorer - Projects your sample data onto a known network (e.g. KEGG global metabolic network, HMDB metabolite-disease network, STITCH metabolite-metabolite interaction network...). Works something like KEGG mapper, but has more networks available.
- "MS Peaks to Pathways" - Takes input peak list data (optionally with t-scores and p-values), and does pathway enrichment (using GSEA or `mummichog`) for compound/pathway hits, plus network mapping for network exploration in the browser. Annoyingly, GSEA and `mummichog` can give different p-values for the same dataset (even using their example datasets!).
- Biomarker analysis - Does something with ROC (receiver operating characteristic) curves; not too sure how this works yet.
- Time-series^ - coming soon
- Power analysis^ - coming soon
- Biomarker meta-analysis - coming soon
- Utilities
  - Chemical compound ID standardization - converts common chemical names to KEGG IDs, HMDB IDs, etc.
  - Batch effect correction - merges multiple datasets (separate `.csv` or `.txt` files from different batches) using ComBat.
- Spectral analysis - MetaboAnalyst suggests using `xcms` or `xcms-online` for LC-MS, and `GC-Autofit` for GC-MS.
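
To make the enrichment analysis idea from the list above a bit more concrete, here is a minimal Python sketch of an over-representation test based on the hypergeometric distribution, a common way to ask whether an input compound list overlaps a metabolite set more than chance would predict. It is not taken from MetaboAnalyst's code: the function names, the toy compound IDs, and the choice of a one-sided test are all assumptions made for illustration.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) for X ~ Hypergeometric(population N, K set members, n draws)."""
    upper = min(K, n)
    total = comb(N, n)
    # math.comb returns 0 when asked to choose more items than are available,
    # so impossible overlaps contribute nothing to the sum.
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def over_representation(hits, metabolite_set, universe):
    """Return (overlap size, expected overlap, one-sided p-value).

    hits           -- compounds in the input list (hypothetical IDs)
    metabolite_set -- compounds in the set being tested
    universe       -- all compounds that could have been observed
    """
    universe = set(universe)
    hits = set(hits) & universe
    members = set(metabolite_set) & universe
    overlap = hits & members
    expected = len(hits) * len(members) / len(universe)
    p_value = hypergeom_sf(len(overlap), len(universe), len(members), len(hits))
    return len(overlap), expected, p_value


if __name__ == "__main__":
    # Toy data: 1000 compounds, a 20-compound set, 50 hits of which 5 are in the set.
    universe = [f"C{i:04d}" for i in range(1000)]
    metabolite_set = universe[:20]
    hits = universe[:5] + universe[100:145]
    k, expected, p = over_representation(hits, metabolite_set, universe)
    print(f"overlap={k}, expected={expected:.1f}, p={p:.3g}")
```

In the toy data the expected overlap is 1 compound while 5 are observed, so the p-value comes out small. A real analysis would also correct for testing many metabolite sets at once.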
Modules marked with ^ have a fairly standardized, optional pre-processing procedure that precedes the actual analysis:
- Missing value estimation.
- Data filtering - filters out variables that are unlikely to be useful. "Highly recommended for untargeted analysis". Non-informative variables are categorized as:
  a. Variables with very small values, close to baseline or detection limit.
  b. Variables that are almost constant regardless of experimental groups/conditions, detected using standard deviation or IQR.
  c. Variables that show low repeatability, measured on QC samples using relative standard deviation (RSD), a.k.a. coefficient of variation (CV), where RSD = SD/mean.
- Quite handily, the web app also shows you the R commands that it runs.
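
The data filtering criteria in the pre-processing list above can be sketched in a few lines of Python. This is only an illustration of filtering on RSD (computed on QC samples) and on IQR (computed across study samples); the function names, the dict-of-lists data layout, and both thresholds are assumptions, not MetaboAnalyst defaults, and the "very small values" criterion is left out for brevity.

```python
from statistics import mean, quantiles, stdev


def rsd(values):
    """Relative standard deviation (a.k.a. CV): SD / mean."""
    return stdev(values) / mean(values)


def iqr(values):
    """Interquartile range: third quartile minus first quartile."""
    q1, _, q3 = quantiles(values, n=4)
    return q3 - q1


def filter_features(samples, qc, max_rsd=0.25, min_iqr=1.0):
    """Return the names of features that survive both filters.

    samples, qc -- dicts mapping feature name -> list of intensities
    max_rsd     -- hypothetical cut-off for repeatability in QC samples
    min_iqr     -- hypothetical cut-off for spread across study samples
    """
    kept = []
    for name, values in samples.items():
        if rsd(qc[name]) > max_rsd:  # low repeatability in QC samples
            continue
        if iqr(values) < min_iqr:  # almost constant across samples
            continue
        kept.append(name)
    return kept


if __name__ == "__main__":
    samples = {
        "m1": [10, 12, 30, 35],      # varies between samples, stable in QC
        "m2": [20, 20.1, 19.9, 20],  # almost constant
        "m3": [5, 9, 40, 44],        # varies, but QC is unstable
    }
    qc = {"m1": [20, 21, 19], "m2": [20, 20, 20.2], "m3": [10, 30, 50]}
    print(filter_features(samples, qc))
```

Only `m1` passes here: `m2` barely changes across samples, and `m3` has an RSD of about 0.67 in its QC injections.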