-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathREADME.Rmd
105 lines (76 loc) · 4.51 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# doubletrouble <img src="man/figures/logo.png" align="right" height="139" />
<!-- badges: start -->
[![GitHub issues](https://img.shields.io/github/issues/almeidasilvaf/doubletrouble)](https://github.com/almeidasilvaf/doubletrouble/issues)
[![Lifecycle: stable](https://img.shields.io/badge/lifecycle-stable-brightgreen.svg)](https://lifecycle.r-lib.org/articles/stages.html#stable)
[![R-CMD-check-bioc](https://github.com/almeidasilvaf/doubletrouble/workflows/R-CMD-check-bioc/badge.svg)](https://github.com/almeidasilvaf/doubletrouble/actions)
[![Codecov test
coverage](https://codecov.io/gh/almeidasilvaf/doubletrouble/branch/devel/graph/badge.svg)](https://codecov.io/gh/almeidasilvaf/doubletrouble?branch=devel)
<!-- badges: end -->
The major goal of __doubletrouble__ is to identify duplicated genes from
whole-genome protein sequences and classify them based on their modes
of duplication. Duplicates can be classified using four different
classification schemes, which increase the complexity and level of details
in a stepwise manner. The classification schemes and the duplication modes
they can classify are:
| Scheme | Duplication modes |
|:---------|:----------------------------|
| binary | SD, SSD |
| standard | SD, TD, PD, DD |
| extended | SD, TD, PD, TRD, DD |
| full | SD, TD, PD, rTRD, dTRD, DD |
*Legend:* **SD**, segmental duplication. **SSD**, small-scale duplication.
**TD**, tandem duplication. **PD**, proximal duplication.
**TRD**, transposon-derived duplication.
**rTRD**, retrotransposon-derived duplication.
**dTRD**, DNA transposon-derived duplication. **DD**, dispersed duplication.
Besides classifying gene pairs, users can also classify genes, so that
each gene is assigned to a unique mode of duplication.
Users can also calculate substitution rates per substitution site (i.e.,
$K_a$, $K_s$ and their ratios $\frac{K_a}{K_s}$) from duplicate pairs,
find peaks in Ks distributions with Gaussian Mixture Models (GMMs),
and classify gene pairs into age groups based on Ks peaks.
## Installation instructions
Get the latest stable `R` release from [CRAN](http://cran.r-project.org/).
Then install __doubletrouble__ from [Bioconductor](http://bioconductor.org/)
using the following code:
```{r 'install', eval = FALSE}
if (!requireNamespace("BiocManager", quietly = TRUE)) {
install.packages("BiocManager")
}
BiocManager::install("doubletrouble")
```
And the development version from [GitHub](https://github.com/almeidasilvaf/doubletrouble) with:
```{r 'install_dev', eval = FALSE}
BiocManager::install("almeidasilvaf/doubletrouble")
```
## Citation
Below is the citation output from using `citation('doubletrouble')` in R. Please
run this yourself to check for any updates on how to cite __doubletrouble__.
```{r 'citation', eval = requireNamespace('doubletrouble')}
print(citation('doubletrouble'), bibtex = TRUE)
```
Please note that the __doubletrouble__ was only made possible thanks to many other R and bioinformatics software authors, which are cited either in the vignettes and/or the paper(s) describing this package.
## Code of Conduct
Please note that the __doubletrouble__ project is released with
a [Contributor Code of Conduct](http://bioconductor.org/about/code-of-conduct/).
By contributing to this project, you agree to abide by its terms.
## Development tools
* Continuous code testing is possible thanks to [GitHub actions](https://www.tidyverse.org/blog/2020/04/usethis-1-6-0/) through `r BiocStyle::CRANpkg('usethis')`, `r BiocStyle::CRANpkg('remotes')`, and `r BiocStyle::CRANpkg('rcmdcheck')` customized to use [Bioconductor's docker containers](https://www.bioconductor.org/help/docker/) and `r BiocStyle::Biocpkg('BiocCheck')`.
* Code coverage assessment is possible thanks to [codecov](https://codecov.io/gh) and `r BiocStyle::CRANpkg('covr')`.
* The [documentation website](http://almeidasilvaf.github.io/doubletrouble) is automatically updated thanks to `r BiocStyle::CRANpkg('pkgdown')`.
* The code is styled automatically thanks to `r BiocStyle::CRANpkg('styler')`.
* The documentation is formatted thanks to `r BiocStyle::CRANpkg('devtools')` and `r BiocStyle::CRANpkg('roxygen2')`.
For more details, check the `dev` directory.
This package was developed using `r BiocStyle::Biocpkg('biocthis')`.