Pages explaining the group's research

Evolution of Selenophosphate Synthetases (SPS)

This page provides access to data files related to our research on the evolution of selenophosphate synthetase proteins (SPS) across the tree of life.

The paper was published in the September 2015 issue of Genome Research and inspired the artwork on its cover. The paper can be freely accessed here.

Other Research at Our Lab

  1. In collaboration with Robert B. Russell (EMBL, Heidelberg), we have studied the conservation of exonic structure in the absence of sequence conservation (Betts et al., 2001).
  2. In collaboration with Timothy M. Thomson (Hospital de La Vall d'Hebron, Barcelona), we discovered a gene fusion phenomenon (Thomson et al., 2000).

Knowledge Extraction from Biological Databases

This is a line of research that we are not currently pursuing, but that still interests us. Some time ago, with Temple F. Smith, we addressed the problem of finding the query that selects the database subset closest to a given arbitrary subset (a problem we now call reverse querying). We addressed the problem informally in Guigó et al. (1991), and more rigorously in Guigó and Smith (1993).
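The idea of reverse querying can be illustrated with a small sketch. This is not the method of Guigó and Smith (1993). It is a toy brute-force search that assumes queries are single attribute=value predicates and that "closest" means highest Jaccard similarity. The record identifiers, attributes, and function names are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def best_single_attribute_query(records, target_ids):
    """Return the (attribute, value) query whose result set best matches target_ids.

    records: dict mapping record id -> dict of attribute -> value.
    Assumption: only single attribute=value queries are considered.
    """
    target = set(target_ids)
    # Enumerate every attribute=value pair that occurs in the database.
    candidates = {(attr, val) for rec in records.values() for attr, val in rec.items()}
    best_query, best_score = None, -1.0
    for attr, val in sorted(candidates):
        selected = {rid for rid, rec in records.items() if rec.get(attr) == val}
        score = jaccard(selected, target)
        if score > best_score:
            best_query, best_score = (attr, val), score
    return best_query, best_score


# Hypothetical toy database of four entries.
records = {
    "P1": {"organism": "human", "family": "kinase"},
    "P2": {"organism": "human", "family": "globin"},
    "P3": {"organism": "mouse", "family": "kinase"},
    "P4": {"organism": "yeast", "family": "kinase"},
}
query, score = best_single_attribute_query(records, {"P1", "P3", "P4"})
print(query, score)
```

A realistic version would need to search over compound queries, which is where the problem becomes hard; the sketch only shows the shape of the input and output.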

Comparative Analysis of Virus Sequences

Currently more than 600 complete virus genomes are available in the public sequence repository GenBank. Sequence comparison methods allow us to identify regions that have been conserved in proteins sharing a common ancestor (homologues). These conserved regions generally represent functionally important domains. However, homology relationships between proteins are not explicitly mapped in primary databases such as GenBank, so more specialised secondary databases are required.

Visualization Tools: gff2ps and gff2aplot

We developed a graphical visualization tool, the gff2ps program (Abril and Guigó, 2000), to represent genomic annotations. This program outputs high-quality genomic plots in PostScript format and can cope with sequences of any length, for instance, eukaryotic genomes. This tool was used to generate the gene maps for the Drosophila melanogaster (Adams et al., 2000) and human (Venter et al., 2001) genomes; the figure below shows a fragment.

Computational Analysis of Splicing


A database (SpliceDB) of known mammalian splice site sequences has been developed. Weight matrices were built for the major splice groups, which can be incorporated into gene prediction programs. SpliceDB is available at the computational genomic Web server of the Sanger Center and has a mirror site at SoftBerry.
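As a rough illustration of how a weight matrix for splice sites can be built and used, the sketch below derives a log-odds matrix from a handful of aligned sites and scores candidate sequences against it. The sequences are made up for the example and are not taken from SpliceDB; the pseudocount, the uniform background frequency, and the function names are assumptions.

```python
import math


def build_weight_matrix(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds weight matrix from equal-length aligned DNA sites.

    Returns one dict per position mapping base -> log2(frequency / background).
    """
    length = len(sites[0])
    matrix = []
    for pos in range(length):
        # Start from the pseudocount so unseen bases get a finite score.
        counts = {base: pseudocount for base in "ACGT"}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        matrix.append(
            {base: math.log2((count / total) / background) for base, count in counts.items()}
        )
    return matrix


def score_site(matrix, seq):
    """Sum the per-position log-odds scores for a candidate site."""
    return sum(column[base] for column, base in zip(matrix, seq))


# Toy donor-like sites (hypothetical data, GT at positions 4-5).
donor_sites = ["CAGGTAAGT", "AAGGTGAGT", "CAGGTAAGA", "TAGGTGAGT"]
matrix = build_weight_matrix(donor_sites)
good = score_site(matrix, "CAGGTAAGT")  # matches the toy consensus
bad = score_site(matrix, "CAGCCAAGT")   # lacks the GT dinucleotide
print(good, bad)
```

A gene prediction program could use such a score to rank candidate splice sites, typically with a threshold chosen on known examples.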

Gene Prediction based on Comparative Genomics

Recently, the importance of comparing the genomes of different species to locate functional domains conserved through evolution (protein-coding regions among them) has been underscored. New bioinformatics methodologies have been developed to infer protein-coding genes from sequence comparisons between the genomes of two different species (Batzoglou et al., 2000; Bafna and Hudson, 2000; Wiehe et al., 2001; Korf et al., 2001; Novichkov et al., 2001), and these appear to lead to highly accurate predictions.

Characterization of the Eukaryotic Selenoproteome

In selenoproteins, incorporation of the amino acid selenocysteine is specified by the UGA codon, usually a stop signal. The alternative decoding of UGA is conferred by an mRNA structure, the SECIS element, located in the 3'-untranslated region of the selenoprotein mRNA (see figure). Because of the non-standard use of the UGA codon, current computational gene prediction methods are unable to identify selenoproteins in the sequence of eukaryotic genomes.
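The effect of the non-standard UGA codon on a conventional reading-frame scan can be shown with a small sketch. It assumes a toy sequence and a deliberately crude rule: when `recode_tga` is set, every in-frame TGA is read as selenocysteine. Real recoding depends on the SECIS element, which this example does not model; the function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq):
    """Split a sequence into in-frame codons, accepting RNA (U) or DNA (T)."""
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def orf_length(seq, recode_tga=False):
    """Count the codons read before the first stop codon.

    With recode_tga=True, TGA is read through (as selenocysteine) instead of
    terminating translation. This is a simplification of real recoding.
    """
    n = 0
    for codon in codons(seq):
        if codon in STOP_CODONS and not (recode_tga and codon == "TGA"):
            break
        n += 1
    return n


# Toy coding sequence: ATG GCT TGA GGT CCT TAA
mrna = "ATGGCTTGAGGTCCTTAA"
print(orf_length(mrna))                   # standard reading stops at TGA
print(orf_length(mrna, recode_tga=True))  # reads through to TAA
```

Under the standard rule the predicted protein is truncated at the selenocysteine position, which is why such genes tend to be mispredicted or missed.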

Gene Prediction Software: geneid

Evaluation of Gene Prediction Programs
