Heinrich-Heine-Universität Düsseldorf

Institut für Physikalische Biologie

Welcome to the download page of ConStruct V2.0


ABSTRACT

ConStruct is a tool for predicting the conserved secondary structure of a set of homologous single-stranded RNAs. For each RNA of the set, the structure distribution is calculated and stored in a base-pair probability matrix (see flow chart, step I). Gaps resulting from a multiple sequence alignment of the RNA set (step II) are introduced into the individual probability matrices (step III). These "aligned" probability matrices are summed to give a consensus probability matrix that emphasizes the conserved structural elements of the RNA set (step IV). Because the multiple sequence alignment is independent of any structural constraints, it may introduce gaps into the homologous probability matrices that disrupt a common consensus structure. Such misalignments are easily recognized, and the graphical user interface allows them to be removed from the individual probability matrices by optimizing the sequence alignment with respect to a structural alignment (steps VI, VII). From the consensus probability matrix a consensus structure is extracted, which can be viewed in three different graphical representations (step V).
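
Steps III and IV can be illustrated with a minimal Python sketch. It is not taken from the ConStruct sources: the function names, the gap characters, and the toy data are assumptions made for illustration only. Each per-sequence matrix is expanded to the alignment length with zero rows and columns at gap positions, and the expanded matrices are then summed.

```python
def insert_gaps(matrix, aligned_seq, gap_chars="-."):
    """Expand an n x n base-pair probability matrix to alignment length (step III).

    Rows and columns at gap positions stay zero. gap_chars is an assumption.
    """
    # Alignment columns that hold a nucleotide, in sequence order.
    columns = [i for i, ch in enumerate(aligned_seq) if ch not in gap_chars]
    if len(columns) != len(matrix):
        raise ValueError("matrix size does not match ungapped sequence length")
    length = len(aligned_seq)
    aligned = [[0.0] * length for _ in range(length)]
    for i, ci in enumerate(columns):
        for j, cj in enumerate(columns):
            aligned[ci][cj] = matrix[i][j]
    return aligned


def consensus_matrix(matrices, aligned_seqs):
    """Sum the gapped matrices of all sequences (step IV)."""
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("aligned sequences must have equal length")
    total = [[0.0] * length for _ in range(length)]
    for matrix, seq in zip(matrices, aligned_seqs):
        aligned = insert_gaps(matrix, seq)
        for i in range(length):
            for j in range(length):
                total[i][j] += aligned[i][j]
    return total


# Toy data (invented): two aligned sequences, each with one G-C pair
# between its first and last nucleotide.
aligned_seqs = ["G-AC", "GAAC"]

matrix_a = [[0.0] * 3 for _ in range(3)]   # ungapped length 3
matrix_a[0][2] = matrix_a[2][0] = 0.5

matrix_b = [[0.0] * 4 for _ in range(4)]   # ungapped length 4
matrix_b[0][3] = matrix_b[3][0] = 0.25

consensus = consensus_matrix([matrix_a, matrix_b], aligned_seqs)
```

In this toy case both pairs fall on alignment columns 0 and 3, so their probabilities add up in the consensus matrix; a misplaced gap would instead spread them over different cells.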

Flow chart of the tool ConStruct for determination of a conserved secondary structure.
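
How a structure could be read off a consensus probability matrix (step V) can be sketched as follows. The procedure ConStruct actually uses is not described on this page; the greedy selection, the threshold value, and the function name below are assumptions chosen only to make the idea concrete.

```python
def greedy_structure(prob, threshold=0.5):
    """Pick non-conflicting, non-crossing pairs in order of decreasing value.

    prob is a square matrix of summed pair probabilities; threshold is a
    hypothetical cut-off on those summed values. Returns a dot-bracket string.
    """
    n = len(prob)
    candidates = sorted(
        ((prob[i][j], i, j)
         for i in range(n) for j in range(i + 1, n)
         if prob[i][j] >= threshold),
        reverse=True,
    )
    pairs = []
    used = set()
    for _, i, j in candidates:
        # Each position may take part in at most one base pair.
        if i in used or j in used:
            continue
        # Skip pairs that would cross an already accepted pair (no pseudoknots).
        if any(i < k < j < l or k < i < l < j for k, l in pairs):
            continue
        pairs.append((i, j))
        used.update((i, j))
    brackets = ["."] * n
    for i, j in pairs:
        brackets[i], brackets[j] = "(", ")"
    return "".join(brackets)


# Toy consensus matrix (invented values) for an alignment of length 6.
toy = [[0.0] * 6 for _ in range(6)]
toy[0][5] = 1.8   # strongly conserved pair
toy[1][4] = 1.5   # second conserved pair, nested inside the first
toy[0][4] = 0.6   # weaker alternative that competes for position 0

structure = greedy_structure(toy)
```

The weaker alternative pair is dropped because both of its positions are already taken by better-supported pairs.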

 
For references see:
Lück, R., Steger, G. & Riesner, D. (1996).
Thermodynamic prediction of conserved secondary structure: Application to RRE-element of HIV, tRNA-like element of CMV, and mRNA of prion protein.
J. Mol. Biol. 258, 813-826.
and
Lück, R., Gräf, S. & Steger, G. (1999).
ConStruct: A tool for thermodynamic controlled prediction of conserved secondary structure.
Nucleic Acids Res. 27, 4208-4217.
 


Supported Systems / Download

The ConStruct package has been tested to some extent on the following systems:

Workstation  OS            Compiler           tcl    tk     compress/zcat
SGI Indy     IRIX 6.2      gcc 2.7.2.2 or cc  7.5i   4.1i   4.0
586          Linux 2.0.32  gcc 2.7.2.3        8.0p2  8.0p2  4.2.4
686          Linux 2.2.12  egcs-2.91.66       8.2.2  8.2.2  4.2.4

tk/tcl is required; for version 8.0 or later, the dashpatch is required as well.
The tk/tcl package is available for free from the Scriptics Corporation.
The dashpatch for tk8.0 or later is available for free from Jan Nijtmans' Home page.
You can also download the tk/tcl 8.2.2 source packages (incl. dashpatch) from our local server (504 kB). For detailed installation information, read INSTALL-TkTcl.

If available, compress and zcat are used to compress the binary output of csRNAfold on the fly (step I) and to uncompress these files when ConStruct reads them (step IV). Some version of compress is therefore convenient for saving disk space.
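
The "use zcat if the file is compressed" idea can be sketched in Python as below. This is not how ConStruct itself reads its files (it does so from its own C and tk/tcl code); the function name and the file name are hypothetical, and the ".Z" suffix check is an assumption based on the usual naming of compress output.

```python
import os
import shutil
import subprocess
import tempfile


def read_maybe_compressed(path):
    """Return the raw bytes of a file, piping it through zcat if it ends in .Z."""
    if path.endswith(".Z"):
        zcat = shutil.which("zcat")
        if zcat is None:
            raise RuntimeError("zcat not found; cannot read " + path)
        # zcat writes the uncompressed data to standard output.
        result = subprocess.run([zcat, path], check=True, stdout=subprocess.PIPE)
        return result.stdout
    with open(path, "rb") as handle:
        return handle.read()


# Usage with an uncompressed file, which works whether or not zcat is installed.
with tempfile.TemporaryDirectory() as tmp:
    plain = os.path.join(tmp, "example.bin")  # hypothetical file name
    with open(plain, "wb") as handle:
        handle.write(b"\x00\x01\x02")
    data = read_maybe_compressed(plain)
```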

cs_convert, which converts aligned multiple sequence files into the format used by ConStruct, is an awk script; if no awk interpreter is available, readseq has to be used instead.

The ConStruct package (ConStruct-2.0.tar.gz) (258 kB) itself is available for free (GNU license).
This tar file contains

  • The source for csRNAfold, which is mainly RNAfold V1.21 with altered output routines. This program is used for energy minimization in step I.
    For reference to the original RNAfold see:
    Hofacker, I.L., Fontana, W., Stadler, P.F., Bonhoeffer, S., Tacker, M. & Schuster, P. (1994).
    Fast folding and comparison of RNA secondary structures.
    Monatsh. Chem. 125, 167-188.
  • The source for a dotplot program (Tinoco plot) as an alternative to the energy minimization of step I. It allows all possible base pairings of the sequences to be visualized.
  • The tk/tcl files that implement the graphical user interfaces for step I (cs_make) and steps III to VII (cs_dp).
  • The tk/tcl files that produce the different graphical representations of the consensus structure; these routines are used from within cs_dp.
  • The C routines/extensions that enhance tk/tcl with the ability to read the base-pair probability matrices (step III) and to produce the structure representations (step V).
  • An awk script (cs_convert) and a modified version of readseq; both may be useful for converting the multiple sequence files needed for steps I and II to and from the format required by csRNAfold and cs_dp.
    For the original readseq by D.G. Gilbert see ftp://ftp.bio.indiana.edu/molbio/readseq/.
  • An example of results obtained with ConStruct: the input and output produced for a set of U7 RNAs.
  • The necessary make and configure files for compiling and installing the above mentioned programs.
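
The dotplot alternative listed above marks every pair of positions that could form a base pair, without any energy calculation. A minimal Python sketch of that idea follows; it is not the program shipped in the package, and the allowed pairs (Watson-Crick plus G-U wobble), the minimum loop size, and the function names are assumptions.

```python
# Pairs treated as possible: Watson-Crick plus G-U wobble (an assumption).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def possible_pairs(seq, min_loop=3):
    """List all (i, j) with i < j that could pair, leaving at least min_loop
    unpaired nucleotides between them."""
    seq = seq.upper().replace("T", "U")
    n = len(seq)
    return [
        (i, j)
        for i in range(n)
        for j in range(i + min_loop + 1, n)
        if (seq[i], seq[j]) in PAIRS
    ]


def dot_plot(seq, min_loop=3):
    """Render the possible pairs as a text matrix: '*' marks a possible pair."""
    n = len(seq)
    grid = [["."] * n for _ in range(n)]
    for i, j in possible_pairs(seq, min_loop):
        grid[i][j] = "*"
    return "\n".join("".join(row) for row in grid)


# Toy sequence (invented): three G, three A, three C.
pairs = possible_pairs("GGGAAACCC")
plot = dot_plot("GGGAAACCC")
print(plot)
```

Stems show up in such a plot as diagonal runs of marks perpendicular to the main diagonal, which is what makes the plot useful for spotting candidate helices by eye.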

Compiling and Installing ConStruct

  • After transferring the package to your system, uncompress and unpack it with gzip and/or tar.
    First uncompress the file with
    gunzip ConStruct-2.0.tar.gz
    and then unpack the tar archive with
    tar xvf ConStruct-2.0.tar
    If your tar is able to use gzip directly, substitute both steps with
    tar zxvf ConStruct-2.0.tar.gz
  • Enter the new directory 'ConStruct-2.0' and follow the instructions given in the file README. Basically, you have to perform the standard configure/make/make install cycle.
    The documentation resides (or will reside) in the 'docs' directory.
    The 'examples' directory contains an example of ConStruct's input and output.
     

Contact
Questions
Bug report

Please contact me if you have any problems or suggestions, find any bugs, or want to discuss your results.

Dr. Gerhard Steger
Institut für Physikalische Biologie
Geb. 26.12.U1
Heinrich-Heine-Universität Düsseldorf
40225 Düsseldorf
Germany
Phone: +49 211 81 14927
FAX: +49 211 81 15167
e-mail: G. Steger

G. Steger & S. Gräf    December 1999
