From 107fb7a271aa85e7fc69aff78837bbb6c77de585 Mon Sep 17 00:00:00 2001
From: H Wayment-Steele
Date: Sun, 20 Jun 2021 12:39:38 -0700
Subject: [PATCH 1/9] Update README.md

---
 README.md | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/README.md b/README.md
index 2683cb6..d549b37 100644
--- a/README.md
+++ b/README.md
@@ -11,6 +11,10 @@ EternaFold is possible thanks to [CONTRAfold-SE](https://github.com/csfoo/contra
 Clone the repository and run `make` in `src` to compile.
 Multithreaded version: run `make multi` in `src`.
 
+#### Usage with LinearFold and LinearPartition
+
+See instructions in [README_LinearFold-E_patch.md](README_LinearFold-E_patch.md) for using EternaFold parameters with LinearFold and LinearPartition algorithms.
+
 ### Prediction
 
 #### Single-structure prediction

From b175e214154decc7dfffcc4b945633c45128282c Mon Sep 17 00:00:00 2001
From: H Wayment-Steele
Date: Sun, 20 Jun 2021 12:39:59 -0700
Subject: [PATCH 2/9] Update README.md

---
 README.md | 2 --
 1 file changed, 2 deletions(-)

diff --git a/README.md b/README.md
index d549b37..a2c680f 100644
--- a/README.md
+++ b/README.md
@@ -11,8 +11,6 @@ EternaFold is possible thanks to [CONTRAfold-SE](https://github.com/csfoo/contra
 Clone the repository and run `make` in `src` to compile.
 Multithreaded version: run `make multi` in `src`.
 
-#### Usage with LinearFold and LinearPartition
-
 See instructions in [README_LinearFold-E_patch.md](README_LinearFold-E_patch.md) for using EternaFold parameters with LinearFold and LinearPartition algorithms.
 
 ### Prediction

From d788eb4a8b6555c70b6648319c9390a321b9304f Mon Sep 17 00:00:00 2001
From: H Wayment-Steele
Date: Sun, 20 Jun 2021 12:42:33 -0700
Subject: [PATCH 3/9] Update README_LinearFold-E_patch.md

---
 README_LinearFold-E_patch.md | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/README_LinearFold-E_patch.md b/README_LinearFold-E_patch.md
index b4dfd40..7ee650e 100644
--- a/README_LinearFold-E_patch.md
+++ b/README_LinearFold-E_patch.md
@@ -2,16 +2,15 @@
 
 The EternaFold parameters have also been adapted for use with the LinearFold and LinearPartition algorithms, described in
 
-```
+
 LinearFold: Linear-Time Approximate RNA Folding by 5’-to-3’ Dynamic Programming and Beam Search. Bioinformatics, Volume 35, Issue 14, July 2019, Pages i295–i304. ISMB 2019
 
 Liang Huang, He Zhang, Dezhong Deng, Kai Zhao, Kaibo Liu, David Hendrix, David Mathews
-```
-```
+
 LinearPartition: linear-time approximation of RNA folding partition function and base-pairing probabilities. Bioinformatics, Volume 36, Issue Supplement_1, July 2020, Pages i258–i267. ISMB 2020
 
 He Zhang, Liang Zhang, David Mathews, Liang Huang
-```
+
 LinearFold and LinearPartition have different licenses than EternaFold, please read the LinearFold [license](https://github.com/LinearFold/LinearFold/blob/master/LICENSE) and LinearPartition [license](https://github.com/LinearFold/LinearPartition/blob/master/LICENSE) before proceeding.
 
 1. Clone the LinearFold repository at [https://github.com/LinearFold/LinearFold](https://github.com/LinearFold/LinearFold). The most recently-tested working commit is 260c6bbb9bf8cc84b807fa7633b9cb731e639884 (June 06 2021).
 You can get this commit with

From b70298637cbbbd490a9074782eb353fb83c9e8d4 Mon Sep 17 00:00:00 2001
From: H Wayment-Steele
Date: Sun, 20 Jun 2021 12:43:40 -0700
Subject: [PATCH 4/9] Update README_LinearFold-E_patch.md

---
 README_LinearFold-E_patch.md | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/README_LinearFold-E_patch.md b/README_LinearFold-E_patch.md
index 7ee650e..0e8a508 100644
--- a/README_LinearFold-E_patch.md
+++ b/README_LinearFold-E_patch.md
@@ -3,13 +3,9 @@
 The EternaFold parameters have also been adapted for use with the LinearFold and LinearPartition algorithms, described in
 
 
-LinearFold: Linear-Time Approximate RNA Folding by 5’-to-3’ Dynamic Programming and Beam Search. Bioinformatics, Volume 35, Issue 14, July 2019, Pages i295–i304. ISMB 2019
+Liang Huang, He Zhang, Dezhong Deng, Kai Zhao, Kaibo Liu, David Hendrix, David Mathews. *LinearFold: Linear-time approximate RNA folding by 5’-to-3’ dynamic programming and beam search.* Bioinformatics, Volume 35, Issue 14, July 2019, Pages i295–i304. ISMB 2019
 
-Liang Huang, He Zhang, Dezhong Deng, Kai Zhao, Kaibo Liu, David Hendrix, David Mathews
-
-LinearPartition: linear-time approximation of RNA folding partition function and base-pairing probabilities. Bioinformatics, Volume 36, Issue Supplement_1, July 2020, Pages i258–i267. ISMB 2020
-
-He Zhang, Liang Zhang, David Mathews, Liang Huang
+He Zhang, Liang Zhang, David Mathews, Liang Huang. *LinearPartition: Linear-time approximation of RNA folding partition function and base-pairing probabilities.* Bioinformatics, Volume 36, Issue Supplement_1, July 2020, Pages i258–i267. ISMB 2020
 
 LinearFold and LinearPartition have different licenses than EternaFold, please read the LinearFold [license](https://github.com/LinearFold/LinearFold/blob/master/LICENSE) and LinearPartition [license](https://github.com/LinearFold/LinearPartition/blob/master/LICENSE) before proceeding.
 
From f2a9285fe78857ec60e36a8f0013e691a1c45d5f Mon Sep 17 00:00:00 2001
From: H Wayment-Steele
Date: Sun, 20 Jun 2021 12:48:57 -0700
Subject: [PATCH 5/9] Update README.md

---
 README.md | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/README.md b/README.md
index a2c680f..7c1b019 100644
--- a/README.md
+++ b/README.md
@@ -27,6 +27,15 @@ Predict log K_MS2 values for riboswitch molecules to MS2 in the presence and abs
 
 `contrafold predict-foldchange test_ms2.bpseq`
 
+#### Discrepancies from CONTRAfold-SE code
+
+This code has been modified in two ways that means its output will differ from the CONTRAfold codebase here and the CONTRAfold-SE codebase here.
+
+1. A bug was fixed in the multiloop traceback `InferenceEngine.ipp` which was first identified by He Zhang (Oregon State).
+
+2. The minimum allowable hairpin size was increased from `0` to `3` to prevent structure predictions with `(())` hairpins. To revert back to the original CONTRAfold behavior, set `C_MIN_HP_LENGTH=0` in `Config.hpp` before compiling.
+
+
 ### Training
 
 Training data is in `input_data` (unzip first).
From 457492a666c07f6bf65906284efa39de73ca85a9 Mon Sep 17 00:00:00 2001
From: H Wayment-Steele
Date: Sun, 20 Jun 2021 13:01:05 -0700
Subject: [PATCH 6/9] Update README.md

---
 README.md | 64 +++++++++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 55 insertions(+), 9 deletions(-)

diff --git a/README.md b/README.md
index 7c1b019..59ea8a1 100644
--- a/README.md
+++ b/README.md
@@ -16,24 +16,48 @@ See instructions in [README_LinearFold-E_patch.md](README_LinearFold-E_patch.md)
 ### Prediction
 
 #### Single-structure prediction
-Predict the MEA structure of sequence "test", using the EternaFold parameters:
+Predict the MEA structure of example test sequence (Hammerhead ribozyme), using the EternaFold parameters:
 
-`contrafold predict test.bpseq --params parameters/EternaFoldParams.v1`
+`contrafold predict test.seq --params parameters/EternaFoldParams.v1`
 
-Please see the documentation of [CONTRAfold](http://contra.stanford.edu/contrafold/manual_v2_02.pdf) for further information on parameters and usage.
+Output:
+```
+Training mode:
+Use constraints: 0
+Use evidence: 0
+Predicting using MEA estimator.
+>test.seq
+CGCUGUCUGUACUUGUAUCAGUACACUGACGAGUCCCUAAAGGACGAAACAGCG
+>structure
+(((((((((((((......))))))..)....((((.....))))...))))))
+```
+
+Predict the ensemble free energy:
+
+```
+$ ./src/contrafold predict test.seq --params parameters/EternaFoldParams.v1 --partition
+```
+
+Output:
+```
+Training mode:
+Use constraints: 0
+Use evidence: 0
+Log partition coefficient for "test.seq": 13.7489
+```
+
+Please see the documentation of [CONTRAfold](http://contra.stanford.edu/contrafold/manual_v2_02.pdf) for further information on parameters and usage. See below for documented discrepancies (besides parameters) from CONTRAfold codebases.
 
 #### Fold change prediction
 Predict log K_MS2 values for riboswitch molecules to MS2 in the presence and absence of small molecule aptamers.
 
-`contrafold predict-foldchange test_ms2.bpseq`
+`./src/contrafold predict-foldchange test_riboswitch.bpseq --params parameters/EternaFoldParams.v1`
 
-#### Discrepancies from CONTRAfold-SE code
+Output:
 
-This code has been modified in two ways that means its output will differ from the CONTRAfold codebase here and the CONTRAfold-SE codebase here.
-
-1. A bug was fixed in the multiloop traceback `InferenceEngine.ipp` which was first identified by He Zhang (Oregon State).
+```
 
-2. The minimum allowable hairpin size was increased from `0` to `3` to prevent structure predictions with `(())` hairpins. To revert back to the original CONTRAfold behavior, set `C_MIN_HP_LENGTH=0` in `Config.hpp` before compiling.
+```
 
 
 ### Training
@@ -98,3 +122,25 @@ k1.0 2.0 99
 3 U -1 19 19
 4 C -1 18 18
 ```
+
+#### ❗️ Discrepancies from CONTRAfold-SE code
+
+This code has been modified in two ways that means its output, even using the CONTRAfold parameters, will differ from the CONTRAfold codebase here and the CONTRAfold-SE codebase here.
+
+1. A bug was fixed in the multiloop traceback `InferenceEngine.ipp` which was first identified by He Zhang (Oregon State).
+
+2. The minimum allowable hairpin size was increased from `0` to `3` to prevent structure predictions with `(())` hairpins. To revert back to the original CONTRAfold behavior, set `C_MIN_HP_LENGTH=0` in `Config.hpp` before compiling.
+
+Example Hammerhead Ribozyme sequence: `CGCUGUCUGUACUUGUAUCAGUACACUGACGAGUCCCUAAAGGACGAAACAGCG`
+
+contrafold predict hhr.bpseq
+
+
+CONTRAfold v2.02:
+CONTRAfold-SE:
+EternaFold code, C_MIN_HP_LENGTH=0:
+EternaFold code, C_MIN_HP_LENGTH=3:
+
+
+
+

From 27ad4e991b81967041f1d8d11a6b44442d988774 Mon Sep 17 00:00:00 2001
From: H Wayment-Steele
Date: Sun, 20 Jun 2021 13:40:34 -0700
Subject: [PATCH 7/9] Update README.md

---
 README.md | 11 -----------
 1 file changed, 11 deletions(-)

diff --git a/README.md b/README.md
index 59ea8a1..e2edec3 100644
--- a/README.md
+++ b/README.md
@@ -48,17 +48,6 @@ Log partition coefficient for "test.seq": 13.7489
 
 Please see the documentation of [CONTRAfold](http://contra.stanford.edu/contrafold/manual_v2_02.pdf) for further information on parameters and usage. See below for documented discrepancies (besides parameters) from CONTRAfold codebases.
 
-#### Fold change prediction
-Predict log K_MS2 values for riboswitch molecules to MS2 in the presence and absence of small molecule aptamers.
-
-`./src/contrafold predict-foldchange test_riboswitch.bpseq --params parameters/EternaFoldParams.v1`
-
-Output:
-
-```
-
-```
-
 
 ### Training
 

From 72312325fbd382b4eb0cbefeac4066b1950ab56e Mon Sep 17 00:00:00 2001
From: H Wayment-Steele
Date: Sun, 20 Jun 2021 13:46:59 -0700
Subject: [PATCH 8/9] Update README.md

---
 README.md | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/README.md b/README.md
index e2edec3..480a016 100644
--- a/README.md
+++ b/README.md
@@ -120,15 +120,15 @@ This code has been modified in two ways that means its output, even using the CO
 
 2. The minimum allowable hairpin size was increased from `0` to `3` to prevent structure predictions with `(())` hairpins. To revert back to the original CONTRAfold behavior, set `C_MIN_HP_LENGTH=0` in `Config.hpp` before compiling.
 
-Example Hammerhead Ribozyme sequence: `CGCUGUCUGUACUUGUAUCAGUACACUGACGAGUCCCUAAAGGACGAAACAGCG`
+Predictions for Hammerhead Ribozyme sequence, using default CONTRAfold parameters: `CGCUGUCUGUACUUGUAUCAGUACACUGACGAGUCCCUAAAGGACGAAACAGCG`
 
-contrafold predict hhr.bpseq
+contrafold predict hhr.bpseq --partition
 
-
-CONTRAfold v2.02:
-CONTRAfold-SE:
-EternaFold code, C_MIN_HP_LENGTH=0:
-EternaFold code, C_MIN_HP_LENGTH=3:
+|CONTRAfold v2.02| 6.87394|
+|CONTRAfold-SE| 6.87394|
+|EternaFold code, no ML fix and C_MIN_HP_LENGTH=0| 6.87394|
+|EternaFold code, C_MIN_HP_LENGTH=0| 6.83585|
+|EternaFold code | 6.77285 |
 
 
 

From 646295ce3eadc7c68c7e2a48cf6a4028c65037e9 Mon Sep 17 00:00:00 2001
From: H Wayment-Steele
Date: Sun, 20 Jun 2021 13:48:12 -0700
Subject: [PATCH 9/9] Update README.md

---
 README.md | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/README.md b/README.md
index 480a016..8619d45 100644
--- a/README.md
+++ b/README.md
@@ -124,12 +124,10 @@ Predictions for Hammerhead Ribozyme sequence, using default CONTRAfold parameter
 
 contrafold predict hhr.bpseq --partition
 
+| Version | hhr.bpseq Log Partition Coefficient |
+| --- | ----------- |
 |CONTRAfold v2.02| 6.87394|
 |CONTRAfold-SE| 6.87394|
 |EternaFold code, no ML fix and C_MIN_HP_LENGTH=0| 6.87394|
 |EternaFold code, C_MIN_HP_LENGTH=0| 6.83585|
 |EternaFold code | 6.77285 |
-
-
-
-