Published in Genome Research, vol. 13, no. 6b, pp. 1273-1289 (June 2, 2003).
http://www.genome.org/cgi/content/abstract/13/6b/1273
Article and publication are at: http://www.genome.org/cgi/doi/10.1101/gr.1119703


"Targeting a Complex Transcriptome: The Construction of the Mouse Full-Length cDNA Encyclopedia".

Piero Carninci 1, 2, Kazunori Waki 1, Toshiyuki Shiraki 1, Hideaki Konno 1, Kazuhiro Shibata 2, Masayoshi Itoh 2, Katsunori Aizawa 1, Takahiro Arakawa 1, Yoshiyuki Ishii 1, Daisuke Sasaki 1, Hidemasa Bono 1, Shinji Kondo 1, Yuichi Sugahara 1, Rintaro Saito 1, Naoki Osato 1, Shiro Fukuda 1, Kenjiro Sato 2, 3, Akira Watahiki 2, 3, Tomoko Hirozane-Kishikawa 1, Mari Nakamura 1, Yuko Shibata 2, 6, Ayako Yasunishi 1, Noriko Kikuchi 2, Atsushi Yoshiki 5, Moriaki Kusakabe 5, 7, Stefano Gustincich 8, Kirk Beisel 9, William Pavan 10, Vassilis Aidinis 11, Akira Nakagawara 12, William A. Held 13, Hiroo Iwata 14, Tomohiro Kono 15, Hiromitsu Nakauchi 16, Paul Lyons 17, Christine Wells 18, David A. Hume 18, Michela Fagiolini 19, Takao K. Hensch 19, Michelle Brinkmeier 20, Sally Camper 20, Junji Hirota 21, Peter Mombaerts 21, Masami Muramatsu 1, 2, 3, Yasushi Okazaki 1, 2, Jun Kawai 1, 2 and Yoshihide Hayashizaki 1, 2, 3, 4, 22

1 Laboratory for Genome Exploration Research Group, RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan;
2 Genome Science Laboratory, RIKEN, Hirosawa, Wako, Saitama 351-0198, Japan;
3 Institute of Basic Medical Sciences, University of Tsukuba, Tsukuba, Ibaraki 305-8577, Japan;
4 Japan Division of Genomic Information Resources, Science of Biological Supramolecular Systems, Graduate School of Integrated Science, Yokohama City University, Tsurumi-Ku, Yokohama 230-0045, Japan;
5 Experimental Animal Research Division, Biogenic Resources Center, RIKEN Tsukuba Institute, Tsukuba, Ibaraki 305-0074, Japan;
6 Dnaform International, Inc., Ami Town, Inashiki District, Ibaraki 300-0332, Japan;
7 Aloka Co., LTD, Kasumigaura-cho, Niihari-gun, Ibaraki 300-0134, Japan;
8 Department of Neurobiology, Harvard Medical School, Boston, Massachusetts 02115, USA;
9 Boys Town National Research Hospital, Omaha, Nebraska 68131, USA;
10 National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20892, USA;
11 Institute of Immunology, Biomedical Sciences Research Center Al. Fleming, 16672 Vari, Greece;
12 Chiba Cancer Center Research Institute, Division of Biochemistry, Chuo-ku, Chiba 260-8717, Japan;
13 Roswell Park Cancer Institute, Buffalo, New York 14263, USA;
14 Department of Reparative Materials Field of Tissue Engineering, Institute for Frontier Medical Sciences, Kyoto University, Sakyo-ku, Kyoto 606-8507, Japan;
15 Faculty of Applied Bioscience, Department of BioScience, Tokyo University of Agriculture, Setagaya-ku, Tokyo 156-8502, Japan;
16 Laboratory of Stem Cell Therapy, Center for Experimental Medicine, Institute of Medical Science, University of Tokyo, Minato-ku, Tokyo 108-8639, Japan;
17 DRF/WT Diabetes and Inflammation Laboratory, Cambridge Institute for Medical Research, Cambridge CB2 2XY, UK;
18 The Institute for Molecular Biosciences, The University of QLD, St. Lucia Brisbane, QLD 4072, Australia;
19 Neuronal Function Research, Lab for Neuronal Circuit Development, RIKEN Brain Science Institute (BSI), Wako-shi, Saitama 300-0198, Japan;
20 University of Michigan Medical, Ann Arbor, Michigan 48109, USA;
21 Developmental Biology and Neurogenetics, The Rockefeller University, New York, New York 10021, USA

22 Corresponding author:
E-MAIL: rgscerg@gsc.riken.go.jp    FAX: +81 45 503 9216
 



Abstract:

We report the construction of the mouse full-length cDNA encyclopedia, the most extensive view of a complex transcriptome, on the basis of preparing and sequencing 246 libraries. Before cloning, cDNAs were enriched in full-length by Cap-Trapper, and in most cases, aggressively subtracted/normalized. We have produced 1,442,236 successful 3'-end sequences clustered into 171,144 groups, from which 60,770 clones were fully sequenced cDNAs annotated in the FANTOM-2 annotation. We have also produced 547,149 5' end reads, which clustered into 124,258 groups. Altogether, these cDNAs were further grouped in 70,000 transcriptional units (TU), which represent the best coverage of a transcriptome so far. By monitoring the extent of normalization/subtraction, we define the tentative equivalent coverage (TEC), which was estimated to be equivalent to >12,000,000 ESTs derived from standard libraries. High coverage explains discrepancies between the very large numbers of clusters (and TUs) of this project, which also include non-protein-coding RNAs, and the lower gene number estimation of genome annotations. Altogether, 5'-end clusters identify regions that are potential promoters for 8637 known genes and 5'-end clusters suggest the presence of almost 63,000 transcriptional starting points. An estimate of the frequency of polyadenylation signals suggests that at least half of the singletons in the EST set represent real mRNAs. Clones accounting for about half of the predicted TUs await further sequencing. The continued high-discovery rate suggests that the task of transcriptome discovery is not yet complete.
[Supplemental material available online at: http://www.genome.org ]



References:

0. Special Issue of Genome Research, vol. 13, no. 6b, pp. 1265-1561 (June 2, 2003).
Report of "RIKEN Mouse Genome Encyclopedia" project: the whole system from mouse house to database.

1. Numata K, Kanai A, Saito R, Kondo S, Adachi J, Wilming LG, Hume DA, RIKEN GER Group, GSL Members, Hayashizaki Y, and Tomita M, "Identification of Putative Noncoding RNAs Among the RIKEN Mouse Full-Length cDNA Collection", Genome Research, vol. 13, no. 6b, pp. 1301-1306 (June 2, 2003).

2. Bono H, Yagi K, Kasukawa T, Nikaido I, Tominaga N, Miki R, Mizuno Y, Tomaru Y, Goto H, Nitanda H, Shimizu D, Makino H, Morita T, Fujiyama J, Sakai T, Shimoji T, Hume DA, RIKEN GER Group, Arakawa T, Carninci P, Kawai J, Hayashizaki Y, and Okazaki Y, "Systematic Expression Profiling of the Mouse Transcriptome Using RIKEN cDNA Microarrays", Genome Research, vol. 13, no. 6b, pp. 1318-1323 (June 2, 2003).


Additional References:

1. Saha S, Ansari AZ, Jarell KA, and Ptashne M, "RNA Sequences that Work as Transcriptional Activating Regions".

2. Lee JM, and Sonnhammer ELL, "Genomic Gene Clustering Analysis of Pathways in Eukaryotes".

3. Blumenthal T, Evans D, Link CD, Guffanti A, Lawson D, Thierry-Mieg J, Thierry-Mieg D, Chiu WL, Duke K, Kiraly M, and Kim SK, "A Global Analysis of Caenorhabditis elegans Operons".

4. Storz G, "An Expanding Universe of Noncoding RNAs".

5. Eddy SR, "Non-Coding RNA Genes and the Modern RNA World".

6. Huttenhofer A, Kiefmann M, Meier-Ewert S, O'Brien J, Lehrach H, Bachellerie J-P, and Brosius J, "RNomics: An Experimental Approach that Identifies 201 Candidates for Novel, Small, Non-Messenger RNAs in Mouse".

7. Hovsepian JA, and Frenster JH, "RNA-Induced Melting of DNA during Selective Gene Transcription".

8. Frenster JH, "Ultrastructural Probes of Active DNA Sites, and the RNA Activators of DNA".
 




For Further Information or Feedback:
e-mail:   frenster@euchromatin.net
Phone:   +1 650 367 6483
Fax:   +1 650 364 1773

euchromatin:  "the most active portion of the genome within the cell nucleus".