alteRNA

Input form: a sequence in FASTA format (pasted or loaded from a file), a sigma value (>= 0), a b value (>= 0 and <= 1), and an optional email address for receiving the results.

Summary

alteRNA is an alternative thermodynamic approach to the RNA secondary structure prediction problem. It aims to minimize a linear combination of total free energy and total energy density, using the dynamic programming formulation proposed by Alkan et al. (RECOMB 2006). In contrast, the available alternatives such as Mfold, RNAscf and alifold all employ the standard thermodynamic approach.

Input and Output

alteRNA requires a sequence in FASTA format, where the sequence is represented as a string of characters from the alphabet {A,C,G,U,T} (the characters are case-insensitive). There are two user-specified parameters, sigma and b. alteRNA outputs the results in three different forms:
  1. The predicted RNA structure in dot-parenthesis format. The sequence is given from the 5' end to the 3' end, and the structure is given with matching parentheses denoting a base pair and a dot denoting an unpaired base.
  2. The predicted RNA structure in Connect (.ct) file format. The sequence length is given in the first line. For each nucleotide i, there is a line which consists of: the line number (i), the letter denoting the nucleotide, the predecessor base index (i-1), the successor base index (i+1), the paired base index (0 if unpaired) and the original base index (i).
  3. The graphics files for the predicted secondary structure in PostScript (.ps) and GIF format. These graphs are created using NAVIEW and Mfold/plt2.
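
The relationship between the first two output forms can be illustrated with a short Python sketch that converts a dot-parenthesis string into the Connect lines described above. This is not alteRNA's own code: the function names `parse_dot_paren` and `to_ct_lines` are hypothetical, the single-space column separator is an assumption, and the successor index is written literally as i+1 for every base, although some tools may write 0 for the last one.

```python
def parse_dot_paren(structure):
    """Return a list whose entry k (0-based) is the 1-based index of the
    base paired with base k+1, or 0 if that base is unpaired."""
    partner = [0] * len(structure)
    stack = []  # 0-based positions of '(' still waiting for a ')'
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i + 1}")
            j = stack.pop()
            partner[i] = j + 1
            partner[j] = i + 1
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at position {i + 1}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1] + 1}")
    return partner


def to_ct_lines(sequence, structure):
    """Build Connect-style lines: the sequence length first, then one line
    per nucleotide with i, letter, i-1, i+1, paired index, i."""
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    partner = parse_dot_paren(structure)
    lines = [str(len(sequence))]
    for i, (base, p) in enumerate(zip(sequence, partner), start=1):
        lines.append(f"{i} {base} {i - 1} {i + 1} {p} {i}")
    return lines


if __name__ == "__main__":
    # Toy hairpin: three base pairs closing a three-base loop.
    for line in to_ct_lines("GGGAAACCC", "(((...)))"):
        print(line)
```

The stack-based matching only works for properly nested structures, which is what a dot-parenthesis string can express.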

Parameters

Sigma Value

Given an RNA secondary structure, the energy density of a base pair is defined as the free energy of the substructure that starts with the base pair, normalized by the length of the underlying sequence. The energy density of an unpaired base is then defined to be the energy density of the closest base pair that encloses it. alteRNA thus aims to minimize ED(n) + Sigma * E(n), where ED(n) is the total energy density of paired and unpaired bases, E(n) is the total free energy and n is the length of the RNA sequence. Here Sigma determines the weight of the total free energy in the optimization function. As Sigma approaches infinity, the predicted secondary structure gets closer to that implied by the standard thermodynamic approach (employed by Mfold).
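
To make the objective concrete, the following Python sketch evaluates ED(n) + Sigma * E(n) for one fixed structure. It is a toy illustration under several assumptions that are not stated in the text above: the substructure free energies are supplied by the caller (the values in the example are made up), each base pair contributes its density once, unpaired bases that no base pair encloses contribute nothing, and the structure is properly nested. The function names are hypothetical, and this is not the dynamic programming algorithm itself, which searches over all structures.

```python
def total_energy_density(n, pair_energy):
    """Sum the energy densities of base pairs and enclosed unpaired bases.

    pair_energy maps a base pair (i, j), 1-based with i < j, to the free
    energy of the substructure that starts with that pair (caller-supplied).
    """
    # Density of a pair: substructure energy divided by the span length.
    density = {(i, j): e / (j - i + 1) for (i, j), e in pair_energy.items()}
    paired = {k for pair in pair_energy for k in pair}

    total = sum(density.values())  # assumption: one contribution per pair
    for k in range(1, n + 1):
        if k in paired:
            continue
        enclosing = [(i, j) for (i, j) in pair_energy if i < k < j]
        if enclosing:
            # Closest enclosing pair: the one with the smallest span.
            closest = min(enclosing, key=lambda p: p[1] - p[0])
            total += density[closest]
        # assumption: bases not enclosed by any pair add nothing
    return total


def objective(n, pair_energy, total_energy, sigma):
    """Return ED(n) + sigma * E(n) for one fixed structure."""
    return total_energy_density(n, pair_energy) + sigma * total_energy


if __name__ == "__main__":
    # Hypothetical energies for the hairpin (((...))) on 9 bases.
    energies = {(1, 9): -4.5, (2, 8): -3.5, (3, 7): -1.0}
    for sigma in (0.0, 2.0, 100.0):
        print(sigma, objective(9, energies, total_energy=-4.5, sigma=sigma))
```

With sigma set to 0 only the energy density matters, while a large sigma lets the total free energy term dominate, which mirrors the limiting behaviour described above.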

b value

The free energy of a multi-branch loop as implied by the standard thermodynamic model is not a linear function and, as a result, cannot be used in a dynamic programming formulation. Thus alteRNA uses the same approximate formulation as Mfold, which, for a given multi-branch loop L, sets:

[Equation image: thermodynamic approximation of multibranch loops]

Here a, b and c are constants in the thermodynamic model, l_s is the number of unpaired bases, l_d is the number of base pairs, and the double sum ΣΣ G_stack is the free energy of each branch in the loop. The b value sets b in this formulation, thus penalizing unpaired bases.
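
The equation itself is only available as an image. A plausible reconstruction from the definitions above is sketched here, assuming the usual affine form in which a is a fixed offset, b multiplies the number of unpaired bases and c multiplies the number of base pairs. The symbol \Delta G(L) and the pairing of c with l_d are assumptions, and the index ranges of the double sum are not recoverable from the text, so they are left unspecified.

```latex
\Delta G(L) \;=\; a \;+\; b \cdot l_s \;+\; c \cdot l_d \;+\; \sum \sum G_{\mathrm{stack}}
```

Under this reading, each additional unpaired base in the loop adds b to the loop's free energy, so a larger b value makes loops with many unpaired bases less favourable.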

Please cite

"RNA Secondary Structure Prediction via Energy Density Minimization",
C. Alkan, E. Karakoc, C. Sahinalp, P. Unrau, A. Ebhardt, K. Zhang, J. Buhler.
RECOMB'06 Research in Computational Molecular Biology, Venice, Italy (2006).