Peter Schattner*
Center for Biomolecular Science and Engineering, 227 Sinsheimer Laboratories, University of California, 1156 High Street, Santa Cruz, CA 95064, USA
* Email: schattner@cse.ucsc.edu
The hypothesis that genomic regions rich in non-protein-coding RNAs
(ncRNAs) can be identified using local variations in single-base and dinucleotide
statistics has been investigated. (G+C)%, (G–C)% difference, (A–T)% difference
and dinucleotide-frequency statistics were compared among seven classes
of ncRNAs and three genomes. Significant variations were observed in (G+C)%
and, in Methanococcus jannaschii, in the frequency of the dinucleotide
‘CG’. Screening programs based on these two base-composition statistics
were developed. With (G+C)% screening alone, a 1% fraction of the M.jannaschii
genome containing all 44 known transfer RNAs, ribosomal RNAs and signal
recognition particle RNAs could be identified. When (G+C)% screening was combined
with CG dinucleotide-frequency screening, 43 of the 44 known M.jannaschii
structural ncRNAs were again identified, while the number of presumably
false hits overlapping a known or putative protein-coding gene was reduced
from 15 to 6. In addition, 19 candidate ncRNAs were identified, including one
with significant homology to several known archaeal RNaseP RNAs.
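
The two screening statistics named in the abstract can be illustrated with a minimal sliding-window sketch. This is not the author's program: the function names, the window and step sizes, the thresholds, and the assumption that candidate regions are those with elevated (G+C)% and 'CG' frequency are all hypothetical choices made for illustration.

```python
def gc_percent(seq):
    """Return (G+C)% of a sequence, as a percentage between 0 and 100."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * sum(1 for base in seq if base in "GC") / len(seq)


def cg_dinucleotide_freq(seq):
    """Return the fraction of overlapping dinucleotides that are 'CG'."""
    seq = seq.upper()
    n_pairs = len(seq) - 1
    if n_pairs <= 0:
        return 0.0
    count = sum(1 for i in range(n_pairs) if seq[i:i + 2] == "CG")
    return count / n_pairs


def screen_windows(genome, window=100, step=50, min_gc=50.0, min_cg=0.0):
    """Slide a window along the genome and keep windows passing both cutoffs.

    window, step, min_gc and min_cg are illustrative values (assumptions),
    not parameters taken from the paper.
    Returns a list of (start, end, gc_percent, cg_freq) tuples.
    """
    hits = []
    last_start = max(len(genome) - window, 0)
    for start in range(0, last_start + 1, step):
        segment = genome[start:start + window]
        gc = gc_percent(segment)
        cg = cg_dinucleotide_freq(segment)
        if gc >= min_gc and cg >= min_cg:
            hits.append((start, start + len(segment), gc, cg))
    return hits


if __name__ == "__main__":
    # Toy sequence: an AT-rich background with a GC-rich island in the middle.
    toy = "AT" * 50 + "GC" * 50 + "AT" * 50
    for hit in screen_windows(toy, min_gc=60.0):
        print(hit)
```

Raising `min_cg` in addition to `min_gc` corresponds to the combined screen described above, which trades a small loss in sensitivity for fewer hits overlapping protein-coding genes.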