The analysis of ChIP-seq samples outputs a number of enriched regions, each indicating a protein-DNA interaction or a specific chromatin modification. Enriched regions (commonly known as "peaks") are called when the read distribution is significantly different from the background and its corresponding significance measure (p-value) is below a user-defined threshold.
When replicate samples are analysed, overlapping enriched regions are expected. This repeated evidence can therefore be used to locally lower the minimum significance required to accept a peak. Here, we propose a method for joint analysis of weak peaks.
Given a set of peaks from (biological or technical) replicates, the method combines the p-values of overlapping enriched regions: users choose a threshold on the combined significance of overlapping peaks and set the minimum number of replicates in which the overlapping peaks must be present. The method allows the "rescue" of weak peaks occurring in more than one replicate and outputs a new set of enriched regions for each replicate.
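The combination step can be illustrated with a minimal sketch. It assumes Fisher's method for combining the p-values and treats them as independent; the text above does not name the test, and the function names `fisher_combined_pvalue` and `rescue` are hypothetical, not part of the MSPC API.

```python
import math


def fisher_combined_pvalue(p_values):
    """Combine independent p-values with Fisher's method (assumed test)."""
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(p_values)
    # Fisher's statistic is X^2 = -2 * sum(ln p_i), chi-square with 2k
    # degrees of freedom. We keep X^2 / 2 because the formula below uses it.
    half_stat = -sum(math.log(p) for p in p_values)
    # Closed-form chi-square survival function for even degrees of freedom:
    # sf(x; 2k) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return min(1.0, math.exp(-half_stat) * total)


def rescue(overlapping_p_values, combined_threshold, min_replicates):
    """Hypothetical helper: accept a group of overlapping peaks when enough
    replicates support it and the combined p-value meets the threshold."""
    if len(overlapping_p_values) < min_replicates:
        return False
    return fisher_combined_pvalue(overlapping_p_values) <= combined_threshold


# Two weak peaks (p = 1e-4 each) that overlap across two replicates.
print(rescue([1e-4, 1e-4], combined_threshold=1e-6, min_replicates=2))
```

Neither peak alone meets the illustrative 1e-6 threshold, but their combined evidence does, which is the sense in which a weak peak is rescued.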
In general, the method classifies enriched regions as background, weak, or stringent based on user-defined weak and stringent p-value thresholds. It then confirms a weak or stringent enriched region if the combined significance of its overlapping peaks is at least as significant as a user-defined threshold, and discards it otherwise. Finally, it applies a multiple testing correction to the confirmed enriched regions at a user-defined false-discovery rate, labelling each as true-positive or false-positive. See the following figure as an example; for more details, refer to the MSPC publications, the slides on SlideShare, or the documentation page.
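The grouping and correction steps above can be sketched as follows. This is an illustration under stated assumptions, not MSPC's implementation: the Benjamini-Hochberg procedure is assumed for the multiple testing correction, thresholds are treated as inclusive, and the names `classify` and `benjamini_hochberg` as well as the threshold values are hypothetical.

```python
def classify(p_value, weak_threshold, stringent_threshold):
    """Label a peak 'stringent', 'weak', or 'background' by its p-value.

    Assumes stringent_threshold <= weak_threshold and inclusive bounds.
    """
    if p_value <= stringent_threshold:
        return "stringent"
    if p_value <= weak_threshold:
        return "weak"
    return "background"


def benjamini_hochberg(p_values, fdr):
    """Return booleans marking which p-values pass BH at the given FDR.

    Benjamini-Hochberg is an assumption; the text only says that a
    multiple testing correction is applied at a user-defined FDR.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up rule: find the largest 1-based rank k with p_(k) <= k/m * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    # Every p-value ranked at or below the cutoff passes.
    passed = [False] * m
    for rank, idx in enumerate(order, start=1):
        passed[idx] = rank <= cutoff_rank
    return passed


# Illustrative thresholds: weak = 1e-4, stringent = 1e-8.
labels = [classify(p, 1e-4, 1e-8) for p in (1e-10, 1e-6, 1e-2)]
print(labels)

# Confirmed peaks that pass would be reported as true-positive,
# the rest as false-positive.
print(benjamini_hochberg([0.001, 0.04, 0.03, 0.5], fdr=0.05))
```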
MSPC is distributed as a cross-platform console application, a .NET library, and a Bioconductor R package.