diff --git a/README.md b/README.md
index 2683cb6..8619d45 100644
--- a/README.md
+++ b/README.md
@@ -11,19 +11,43 @@ EternaFold is possible thanks to [CONTRAfold-SE](https://github.com/csfoo/contra
 
 Clone the repository and run `make` in `src` to compile. Multithreaded version: run `make multi` in `src`.
 
+See instructions in [README_LinearFold-E_patch.md](README_LinearFold-E_patch.md) for using EternaFold parameters with LinearFold and LinearPartition algorithms.
+
 ### Prediction
 
 #### Single-structure prediction
-Predict the MEA structure of sequence "test", using the EternaFold parameters:
+Predict the MEA structure of example test sequence (Hammerhead ribozyme), using the EternaFold parameters:
 
-`contrafold predict test.bpseq --params parameters/EternaFoldParams.v1`
+`contrafold predict test.seq --params parameters/EternaFoldParams.v1`
 
-Please see the documentation of [CONTRAfold](http://contra.stanford.edu/contrafold/manual_v2_02.pdf) for further information on parameters and usage.
+Output:
+```
+Training mode:
+Use constraints: 0
+Use evidence: 0
+Predicting using MEA estimator.
+>test.seq
+CGCUGUCUGUACUUGUAUCAGUACACUGACGAGUCCCUAAAGGACGAAACAGCG
+>structure
+(((((((((((((......))))))..)....((((.....))))...))))))
+```
 
-#### Fold change prediction
-Predict log K_MS2 values for riboswitch molecules to MS2 in the presence and absence of small molecule aptamers.
+Predict the ensemble free energy:
+
+```
+$ ./src/contrafold predict test.seq --params parameters/EternaFoldParams.v1 --partition
+```
+
+Output:
+```
+Training mode:
+Use constraints: 0
+Use evidence: 0
+Log partition coefficient for "test.seq": 13.7489
+```
+
+Please see the documentation of [CONTRAfold](http://contra.stanford.edu/contrafold/manual_v2_02.pdf) for further information on parameters and usage. See below for documented discrepancies (besides parameters) from CONTRAfold codebases.
 
-`contrafold predict-foldchange test_ms2.bpseq`
 
 ### Training
 
@@ -87,3 +111,23 @@ k1.0 2.0 99
 3 U -1 19 19
 4 C -1 18 18
 ```
+
+#### ❗️ Discrepancies from CONTRAfold-SE code
+
+This code has been modified in two ways that means its output, even using the CONTRAfold parameters, will differ from the CONTRAfold codebase here and the CONTRAfold-SE codebase here.
+
+1. A bug was fixed in the multiloop traceback `InferenceEngine.ipp` which was first identified by He Zhang (Oregon State).
+
+2. The minimum allowable hairpin size was increased from `0` to `3` to prevent structure predictions with `(())` hairpins. To revert back to the original CONTRAfold behavior, set `C_MIN_HP_LENGTH=0` in `Config.hpp` before compiling.
+
+Predictions for Hammerhead Ribozyme sequence, using default CONTRAfold parameters: `CGCUGUCUGUACUUGUAUCAGUACACUGACGAGUCCCUAAAGGACGAAACAGCG`
+
+contrafold predict hhr.bpseq --partition
+
+| Version | hhr.bpseq Log Partition Coefficient |
+| --- | ----------- |
+|CONTRAfold v2.02| 6.87394|
+|CONTRAfold-SE| 6.87394|
+|EternaFold code, no ML fix and C_MIN_HP_LENGTH=0| 6.87394|
+|EternaFold code, C_MIN_HP_LENGTH=0| 6.83585|
+|EternaFold code | 6.77285 |
diff --git a/README_LinearFold-E_patch.md b/README_LinearFold-E_patch.md
index b4dfd40..0e8a508 100644
--- a/README_LinearFold-E_patch.md
+++ b/README_LinearFold-E_patch.md
@@ -2,16 +2,11 @@
 
 The EternaFold parameters have also been adapted for use with the LinearFold and LinearPartition algorithms, described in
 
-```
-LinearFold: Linear-Time Approximate RNA Folding by 5’-to-3’ Dynamic Programming and Beam Search. Bioinformatics, Volume 35, Issue 14, July 2019, Pages i295–i304. ISMB 2019
-Liang Huang, He Zhang, Dezhong Deng, Kai Zhao, Kaibo Liu, David Hendrix, David Mathews
-```
-```
-LinearPartition: linear-time approximation of RNA folding partition function and base-pairing probabilities. Bioinformatics, Volume 36, Issue Supplement_1, July 2020, Pages i258–i267. ISMB 2020
+Liang Huang, He Zhang, Dezhong Deng, Kai Zhao, Kaibo Liu, David Hendrix, David Mathews. *LinearFold: Linear-time approximate RNA folding by 5’-to-3’ dynamic programming and beam search.* Bioinformatics, Volume 35, Issue 14, July 2019, Pages i295–i304. ISMB 2019
+
+He Zhang, Liang Zhang, David Mathews, Liang Huang. *LinearPartition: Linear-time approximation of RNA folding partition function and base-pairing probabilities.* Bioinformatics, Volume 36, Issue Supplement_1, July 2020, Pages i258–i267. ISMB 2020
 
-He Zhang, Liang Zhang, David Mathews, Liang Huang
-```
 LinearFold and LinearPartition have different licenses than EternaFold, please read the LinearFold [license](https://github.com/LinearFold/LinearFold/blob/master/LICENSE) and LinearPartition [license](https://github.com/LinearFold/LinearPartition/blob/master/LICENSE) before proceeding.
 
 1. Clone the LinearFold repository at [https://github.com/LinearFold/LinearFold](https://github.com/LinearFold/LinearFold). The most recently-tested working commit is 260c6bbb9bf8cc84b807fa7633b9cb731e639884 (June 06 2021). You can get this commit with