Published in Genome Research, vol. 13, no. 6b, pp. 1318-1323 (June 2, 2003).
http://www.genome.org/cgi/content/abstract/13/6b/1318
Article and publication are at:   http://www.genome.org/cgi/doi/10.1101/gr.1075103



"Systematic Expression Profiling of the Mouse Transcriptome Using RIKEN cDNA Microarrays".

Hidemasa Bono 1, Ken Yagi 1, Takeya Kasukawa 1, 2, Itoshi Nikaido 1, 3, Naoko Tominaga 1, Rika Miki 1, Yosuke Mizuno 1, Yasuhiro Tomaru 1, Hitoshi Goto 1, Hiroyuki Nitanda 1, Daisuke Shimizu 1, Hirochika Makino 1, Tomoyuki Morita 1, Junshin Fujiyama 1, Takehito Sakai 1, Takashi Shimoji 1, David A. Hume 4, RIKEN GER Group 1, GSL Members 5, 6, Yoshihide Hayashizaki 1, 3, and Yasushi Okazaki 1, 7

1 Laboratory for Genome Exploration Research Group, RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan;
2 Multimedia Development Center, Advanced Technology Development Department, NTT Software Corporation, Naka-ku, Yokohama, Kanagawa 231-8554, Japan;
3 Division of Genomic Information Resource Exploration, Science of Biological Supramolecular Systems, Yokohama City University, Graduate School of Integrated Science, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan;
4 Institute for Molecule Bioscience and ARC Special Research Centre for Functional and Applied Genomics, University of Queensland, Q4072, Australia;
5 Genome Science Laboratory, RIKEN, Hirosawa, Wako, Saitama 351-0198, Japan
6 Takahiro Arakawa, Piero Carninci, and Jun Kawai.

7 Corresponding author:
E-MAIL:    rgscerg@gsc.riken.go.jp     FAX:  +1- 81-45-503-9216



Abstract:
Introduction:
Results and Discussion:
Table 1: Number of Clones or Clusters Included in RIKEN Mouse cDNA Microarray:
Table 2: Number of Spots on cDNA Microarrays Judged to be Expressed or Not Expressed:
Methods:
Acknowledgements:
Supplemental Data:
References:
New References:
Additional References:
Other Sites:
Further Information and Feedback:

Abstract:

The number of known mRNA transcripts in the mouse has been greatly expanded by the RIKEN Mouse Gene
Encyclopedia project. Validation of their reproducible expression in a tissue is an important contribution to the study of functional genomics. In this report, we determine the expression profile of 57,931 clones on 20 mouse tissues using cDNA microarrays. Of these 57,931 clones, 22,928 clones correspond to the FANTOM2 clone set. The set represents 20,234 transcriptional units (TUs) out of 33,409 TUs in the FANTOM2 set. We identified 7206 separate clones that satisfied stringent criteria for tissue-specific expression. Gene Ontology terms were assigned for these 7206 clones, and the proportion of `molecular function' ontology for each tissue-specific clone was examined. These data will provide insights into the function of each tissue. Tissue-specific gene expression profiles obtained using our cDNA microarrays were also compared with the data extracted from the GNF Expression Atlas based on Affymetrix microarrays. One major outcome of the RIKEN transcriptome analysis is the identification of numerous nonprotein-coding mRNAs. The expression profile was also used to obtain evidence of expression for putative noncoding RNAs. In addition, 1926 clones (70%) of 2768 clones that were categorized as "unknown EST," and 1969 (58%) clones of 3388 clones that were categorized as "unclassifiable" were also shown to be reproducibly expressed.



Introduction:

DNA microarray technology revolutionized gene expression analysis (DeRisi et al. 1997). DNA microarrays containing virtually all yeast open reading frames (ORFs) have been applied to explore gene expression profiles for various physiological conditions (Eisen et al. 1998). In a recent report (Spellman and Rubin 2002), a striking set of experiments using cDNA microarray profiling in Drosophila revealed that co-expressed genes are clustered in the genome, suggesting long-range coordination of transcriptional control. Although there have been many notable successes in the application of cDNA microarrays to mammalian gene regulation (Alizadeh et al. 2000), the sets of transcripts analyzed have been far from comprehensive, because the mammalian transcriptome has been incomplete. The RIKEN Mouse Encyclopedia project aims to make a library of all transcribed sequences as cDNA clones (The RIKEN Genome Exploration Research Group Phase II Team and the FANTOM Consortium 2001). Analysis of the expression pattern for these cDNAs is a major resource for functional annotation. In particular, many of the transcripts within the RIKEN cDNA clone set do not code for protein, or code for hypothetical proteins. Evidence of expression, particularly tissue-specific expression, can provide an indication that the transcript is likely to be functionally significant. Conversely, lack of any evidence of expression in any tissue might indicate that a transcript is an artifact, or unprocessed nuclear RNA. Expression in a particular tissue may also give insights into likely function for annotated proteins in which the only information available is the presence of a conserved domain or motif.

Following the acquisition of RIKEN mouse full-length cDNAs, we produced our first microarray set, called the RIKEN 19K mouse microarray, which contained a subset of the FANTOM1 full-length cDNAs as well as a large selection of cDNAs from known genes. These arrays were used in producing expression profiling of 49 distinct mouse tissues, and the results were released in the RIKEN Expression Array Database (READ; Miki et al. 2001; Bono et al. 2002). After that effort, we continued characterizing gene expression profiles for mouse tissues using newly sequenced mouse cDNAs as they were acquired. The second and third sets of mouse cDNA microarrays, in each of which 19,584 unique cDNA clones were spotted, were prepared and then used for gene expression profiling for 20 tissues. The number of tissues analyzed was reduced by focusing mainly on the adult tissues. The set of cDNAs on these arrays, combined with the earlier 19K set, comprises approximately 60% of the representative transcript set produced in the FANTOM2 annotation process (The FANTOM Consortium and the RIKEN Genome Exploration Research Group Phase I and II Team 2002). Here we present some highlights of this extended analysis.

Results and Discussion:

High Coverage of RIKEN Mouse cDNA Microarray Set in Mouse Transcriptome

The first 19K set (called RIKEN 19k set; 18,763 unique cDNA clones on the array) and newly developed second and third sets of RIKEN mouse cDNA microarrays (called RIKEN 20k chip-2 and chip-3, respectively; containing 19,584 unique cDNA clones each) contain a total of 57,931 unique cDNA clones (denoted as the RIKEN 60K microarray set) and are spotted on three glass slides. We observed that 22,928 clones (~40%) overlapped with the 60,770 FANTOM2 cDNA clone set (Table 1). cDNA clones used for the cDNA microarrays were not identical to those chosen for full-length sequencing, because novel sequences not in the public database at that time were preferably taken for full-length sequencing, whereas known genes identified from phase 1, 3' end sequencing were preferably chosen for cDNA microarrays, to ensure that all transcripts of known function were on the arrays.

Table 1. Number of Clones or Clusters that are Included in RIKEN Mouse cDNA Microarray and FANTOM2 Clone Set

                    cDNA clones in      Clusters in         Clusters in
                    FANTOM2 set (a)     FANTOM2 set (b)     RTS (c)
60k                                     20,234              22,217
    19k             6,333
    20k-2           7,397
    20k-3           9,198
Not on chip         37,842              13,175              15,869
Total               60,770              33,409              37,086

RIKEN mouse 19k set, 20k chip-2, and 20k chip-3 are labeled as 19k, 20k-2, and 20k-3, respectively. Cluster counts are given for the combined 60k set.

a Number of clones of the FANTOM2 set that overlap with the RIKEN cDNA microarray

b Number of clusters from the FANTOM2 set that overlap with the RIKEN cDNA microarray

c Number of clusters from the RTS that overlap with the RIKEN cDNA microarray set
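
The coverage figures quoted in the text can be recomputed from the counts in Table 1. The short sketch below does only that arithmetic; the variable names are illustrative and are not taken from the original analysis. With these inputs the three fractions come out at about 38%, 61%, and 60%, in line with the "~40%" clone overlap and the "approximately 60%" transcript coverage stated in the text.

```python
# Counts transcribed from Table 1 (variable names are illustrative only).
fantom2_clones_on_chip = {"19k": 6_333, "20k-2": 7_397, "20k-3": 9_198}
fantom2_clones_total = 60_770

# (clusters on the 60k set, total clusters) for the two cluster columns.
fantom2_tus = (20_234, 33_409)
rts_clusters = (22_217, 37_086)

# FANTOM2 clones found on any of the three chips.
clones_on_60k = sum(fantom2_clones_on_chip.values())

coverage = {
    "FANTOM2 clones": clones_on_60k / fantom2_clones_total,
    "FANTOM2 TUs": fantom2_tus[0] / fantom2_tus[1],
    "RTS clusters": rts_clusters[0] / rts_clusters[1],
}

print(f"FANTOM2 clones on the 60k set: {clones_on_60k:,}")
for name, value in coverage.items():
    print(f"{name}: {value:.1%}")
```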


To further assign correct correspondence between the microarray clone set and the FANTOM2 clone set, we performed a systematic analysis of cDNA sequences on the arrays against the representative transcript set (RTS) used to assess the FANTOM2 sequence set and thought to reflect the mouse transcriptome. The comparison was carried out using NCBI BLASTN with a high-stringency cutoff (E < 1e-100; Marra et al. 1999). We found that 20,234 transcriptional units (TUs) of the 33,409 TUs in the FANTOM2 set were contained in the RIKEN 60K microarray set, and 22,217 clusters of the 37,086 clusters were in the RTS (Table 1). Although it seems there are redundancies in the clone set from the clustering results based on the TUs, it should be noted that because these are not fully sequenced, a subset will certainly be redundant with the RTS, and will probably represent alternative 3' UTRs which are common in the mammalian transcriptome (The FANTOM Consortium and the RIKEN Genome Exploration Research Group Phase I and II Team 2002). By analogy, despite the fact that the sequencing of the 60,770 FANTOM2 clones was prioritized based on novel 3' and 5' ends, the set collapsed by almost 50% (i.e., there is twofold redundancy) upon clustering of the full-length sequences.
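
As a rough illustration of the high-stringency matching described above, the sketch below filters BLASTN hits at E < 1e-100 and keeps the best-scoring subject per array clone. It assumes hits are available in the 12-column tab-separated layout produced by present-day BLAST+ (`-outfmt 6`), which is not necessarily what was used for the original analysis; the function name and the clone and RTS identifiers are hypothetical.

```python
import csv
import io

E_VALUE_CUTOFF = 1e-100  # high-stringency cutoff quoted in the text


def best_hits(tabular_text, cutoff=E_VALUE_CUTOFF):
    """Map each query clone to its best-scoring subject below the cutoff.

    Assumes the 12-column tabular layout (query, subject, ..., evalue,
    bitscore); this layout is an assumption, not taken from the paper.
    """
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue >= cutoff:
            continue  # hit is not stringent enough
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


# Made-up example rows: cloneA has two strong hits, cloneB only a weak one.
example = "\n".join([
    "cloneA\tRTS_0001\t99.5\t800\t4\t0\t1\t800\t1\t800\t0.0\t1450",
    "cloneA\tRTS_0002\t98.0\t600\t12\t0\t1\t600\t1\t600\t1e-150\t1010",
    "cloneB\tRTS_0003\t92.0\t150\t12\t0\t1\t150\t1\t150\t3e-40\t170",
])
hits = best_hits(example)
covered_clusters = set(hits.values())
print(hits, len(covered_clusters))
```

Counting the distinct subjects in `covered_clusters` corresponds to counting how many RTS clusters are represented on the arrays.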

Microarray Analysis for Clones

In addition to the previously reported microarray data for 49 mouse tissues using the RIKEN 19K mouse cDNA microarray (the first 19K set), new microarray data were produced for profiling tissues in mouse. Gene expression profiles for adipose tissue were newly added to the set produced with the original 19K set. The 20 tissues selected for analysis using chip 2 and chip 3 were selected mainly from the major adult organs (spleen, thymus, kidney, heart, lung, liver, brain, cerebellum, 10-day-neonate cerebellum, placenta, testis, uterus, pancreas, small intestine, stomach, colon, bone, adipose, muscle, and 10-day-neonate skin). In total, 57,931 gene expression profiles for 20 tissues were included for the analyses.

The log-transformed ratio using the RNA extracted from Day 17.5 embryo whole-body as control was stored in READ (RIKEN Expression Array Database, http://READ.gsc.riken.go.jp/fantom2/; Bono et al. 2002). Where the target on the array is contained within the FANTOM2 set, the expression profiles described here are integrated with the functional annotations of cDNA clones (The FANTOM Consortium and the RIKEN Genome Exploration Research Group Phase I and II Team 2002). Prominent features for this large gene expression profile are described below.

Tissue Profiling by Gene Ontology

We explored the functional category of Gene Ontology (GO) terms assigned to cDNA clones whose gene expression pattern was restricted to a subset of tissues on the microarrays. The genes that are expressed in a tissue-specific manner were extracted by the criteria described in the Methods section. As we are focused on the function of genes, we used GO Slim terms (http://www.ebi.ac.uk/proteome/goslimterms.html) for the molecular function ontology in the Gene Ontology project. GO Slim was constructed by selecting a set of high-level GO terms to cover most aspects of the functional classification.

At a glance, NA (Not Assigned) terms are prevalent even in tissue-specific genes (Fig. 1), indicating the current limitations of our knowledge of the functions of mammalian genes. Relatively well characterized tissues, such as heart, liver, stomach, and kidney showed the highest percentage of GO assigned genes, perhaps reflecting a relatively low level of transcriptional complexity and highly defined function (Fig. 1). Placenta has a high proportion of genes assigned a signal transduction function, in large measure because of the inclusion of the numerous small secreted growth factors (placental lactogen 2, placental growth factor, prolactin-like protein A, B, C, F, G, etc.) in this class.

Figure 1 Pie charts for tissue profiling by Gene Ontology

Comparison With the Data From Affymetrix GeneChip

The tissue expression gene ontology diagram was also constructed for the data in GNF Gene Expression Atlas (http://expression.gnf.org/; Su et al. 2002), which uses the Affymetrix Chip (Suppl. Fig. 1; http://READ.gsc.riken.go.jp/fantom2/supplement/tissue_profiling/GNF/). There has been no previous comparison of the two technologies (full-length cDNAs vs. printed oligonucleotide arrays) and the data provide important cross-validation. There were 15 tissues that were common between the two sets of array experiments. For these 15 tissues, the gene ontology molecular function diagram was also constructed and compared with that of RIKEN cDNA microarrays (Suppl. Fig. 2; http://READ.gsc.riken.go.jp/fantom2/supplement/tissue_profiling/compara/). As shown, the pattern of each corresponding tissue of the GO diagram is very similar.

Gene Expression of cDNA Clones Categorized as `unknown EST' or `Unclassifiable'

For cDNA clones that were assigned no functional descriptions from sequence similarity searches, cDNA microarray analysis can at least provide an indication of tissue-specific expression that might suggest possible function. cDNA clones in two categories, `unknown EST hit' and `Unclassifiable', were examined in detail to determine the gene expression profiles in the 20 tissues examined. cDNA clones in the category `unknown EST hit' are those without any sequence hits to existing proteins, but which have sequence similarity to archived ESTs in the public database. Conversely, clones in the category `Unclassifiable' are those without any sequence hits to existing proteins or ESTs. We found that 1926 (70%) of the 2768 clones that were categorized as `unknown EST', and 1969 (58%) of the 3388 clones that were categorized as `unclassifiable', were confirmed to be expressed in the microarray according to stringent cut-off criteria (Table 2). The genes that were evaluated as expressed are listed in Supplemental Table 1. Hierarchical clustering of gene expression data for cDNA clones in the `Unclassifiable' category reveals that several genes in this category show tissue-specific expression, even in log-transformed ratio data (Suppl. Fig. 3; http://READ.gsc.riken.go.jp/fantom2/supplement/3/). It should be noted that absence of detectable expression does not necessarily imply that the transcript is not expressed or is nonfunctional. Many noncoding RNAs are expressed at very low levels, and may fall below the detection limits of microarrays in either the target tissue or the 17.5-day-embryo reference control.

Table 2. Number of Spots on cDNA Microarrays Judged to be Expressed or Not Expressed

                                    unknown EST hit     Unclassifiable
20k-1
    Total spots of this category    516                 360
    Expressed                       454                 276
    Not expressed
    Marginally expressed            62                  82
20k-2
    Total spots of this category    795                 1,152
    Expressed                       608                 727
    Not expressed                   42                  74
    Marginally expressed            145                 351
20k-3
    Total spots of this category    1,457               1,876
    Expressed                       864                 966
    Not expressed                   81                  134
    Marginally expressed            512                 776
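
The totals and percentages quoted in the text follow from the per-chip counts in Table 2. The sketch below simply sums the "Total spots" and "Expressed" rows across the three chips and computes the expressed fraction for each category; the data structure and variable names are illustrative only.

```python
# (total spots, expressed spots) per chip, transcribed from Table 2.
table2 = {
    "20k-1": {"unknown EST hit": (516, 454), "Unclassifiable": (360, 276)},
    "20k-2": {"unknown EST hit": (795, 608), "Unclassifiable": (1_152, 727)},
    "20k-3": {"unknown EST hit": (1_457, 864), "Unclassifiable": (1_876, 966)},
}

summary = {}
for category in ("unknown EST hit", "Unclassifiable"):
    total = sum(chip[category][0] for chip in table2.values())
    expressed = sum(chip[category][1] for chip in table2.values())
    summary[category] = (total, expressed, expressed / total)

for category, (total, expressed, fraction) in summary.items():
    print(f"{category}: {expressed} of {total} expressed ({fraction:.0%})")
```

The sums reproduce the 1926 of 2768 (`unknown EST') and 1969 of 3388 (`unclassifiable') figures given in the text.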

Other Applications of Microarray Ratio Data

The major purpose of this short paper is to announce the availability of these data, and the corresponding expanded Web interface. There are numerous applications, some of which are described in other reports in this special issue of Genome Research. For example, the evidence of tissue-specific expression was used for the analyses of small secreted proteins in the global analysis of the secretome (Grimmond et al. 2003).

`Search multiple clones' in the READ Web interface (http://read.gsc.riken.go.jp/fantom2/) allows researchers to easily retrieve a set of gene expression patterns for cDNA clones of interest. For example, gene expression profiles for genes in a specific metabolic pathway are available only by a `copy and paste' operation from the table in the Metabolomapper Web site (http://fantom2.gsc.riken.go.jp/metabolome/; Bono et al. 2003). The search interface is designed to permit visualization of the tissue expression profiling of a subset of genes.

In conclusion, the RIKEN Expression Array Database now represents a major resource for functional genomics in the mouse. We have reported the expression profiling of 57,931 clones for 20 tissues. Comparative analysis with other types of resources emerging in the public domain, such as the GNF Expression Array resource, will provide extensive validation to enable robust analyses of transcriptional networks in the mouse.

Methods:

RNA Extraction

The 20 adult mouse tissues for exploring genes with tissue-specific expression patterns were as follows: spleen, thymus, kidney, heart, lung, liver, brain, cerebellum, 10-day-neonate cerebellum, placenta, testis, uterus, pancreas, small intestine, stomach, colon, 10-day-neonate skin, bone, muscle, and adipose. RNA extraction was performed by the AGPC method (Miki et al. 2001; Ichikawa et al. 2002; Mizuno et al. 2002).

Preparation of Target DNAs

The target DNAs were collected from RIKEN mouse cDNA libraries, which were constructed using the CAP trapper method to enrich for full-length inserts. The cDNAs were amplified using M13 forward and reverse primers in a 100-µL PCR reaction with 0.2 µM final concentration (each) of forward (F1224; 5'-cgccagggttttcccagtcacga-3') and reverse (R1233; 5'-agcggataacaatttcacacagga-3') primers, 250 µM dNTPs, and 1.25 U Ex Taq in 1x Ex Taq buffer (TAKARA). The PCR product was precipitated by using isopropanol and resuspended in 15 µL 3x SSC. The DNA solution was spotted on poly-L-lysine-coated slides by using a DNA arrayer (http://cmgm.stanford.edu/pbrown/mguide/index.html) with 16 tips (SMP3, TeleChem International). The diameter of the spots was 100–150 µm. Mouse β-actin and G3PDH cDNAs were used as positive controls, and Arabidopsis cDNAs were used as negative controls (Accession nos. X98108, X13611, X90769, Z99707, AF004393, Z49777, Q03943, U58284).

Preparation of Probes

One µg of mRNA extracted from each of the 20 tissues was labeled by incorporating Cy3 during random-primed reverse transcription. cDNA derived from entire E17.5 embryos, which we labeled with Cy5, was used as the expression reference for all tissues. The labeling was carried out at 42°C for 1 h in a total volume of 30 µL containing 400 U SuperScript II (Gibco BRL), 0.1 mM Cy3-dUTP (or Cy5-dUTP), 0.5 mM each dATP, dCTP, and dGTP, 0.2 mM dTTP, 10 mM DTT, 6 µL 5x first-strand buffer, and 6 µg random primers. To remove unincorporated nucleotide, labeled cDNA was mixed with 500 µL binding buffer (5 M guanidine-SCN, 10 mM Tris pH 7.0, 0.1 mM EDTA, 0.03% gelatin, and 2 ng/µL tRNA) and 50 µL silica matrix buffer (10% matrix, 3.5 M guanidine chloride, 20% glycerol, 0.1 mM EDTA, and 200 mM NaOAc pH 4.8–5.0), transferred to a GFX column (Amersham Pharmacia), and centrifuged at 15,000 rpm for 30 sec. The flow-through was discarded, and the column was washed with 500 µL wash buffer. The adsorbed probe was eluted into a final volume of 17 µL distilled water. This labeled probe was mixed with blocking solution containing 3 µL of 10 µg/µL oligo-dA, 3 µL of 20 µg/µL yeast tRNA, 1 µL of 20 µg/µL mouse Cot1 DNA, 5.1 µL 20x SSC, and 0.9 µL 10% SDS.

Array Hybridization and Data Analysis

The RIKEN full-length mouse cDNA that comprised the target was hybridized in a final volume of 30 µL; the entire array consists of three multi-blocks, and each multi-block required 10 µL hybridization solution. Prior to hybridization, probe aliquots were heated at 95°C for 1 min and cooled at room temperature. Cover slips were hybridized overnight at 65°C in a hybricasette (obtained from ArrayIt.com). After hybridization, slides were washed in 2x SSC, 0.1% SDS until the cover slips dropped off; the slides were then transferred into 1x SSC, shaken gently for 2 min, and rinsed with 0.1x SSC for 2 min. After washing, slides were spun at 800 rpm using a SORVALL (RC-3B plus; rotor, H6000A/HBB6) centrifuge. These slides were scanned on a ScanArray 5000 confocal laser scanner, and the images were analyzed by using ImaGene (BioDiscovery).

Analysis of the Data

To improve the accuracy of the data, we did the experiment twice, labeling the same RNA template in two separate reactions. Data were normalized to the reference standard by subtracting (in log space) the median observed value if it were other than zero. We only used data points that were reproducible. To this end, we developed a filtering program, PRIM (Preprocessing Implementation for Microarray; Kadota et al. 2001). Briefly, this program (1) deletes the results with "flags" added manually to corrupted spots, (2) eliminates spots with signal intensities less than the mean + 3 x standard deviation (S.D.) of the background signal intensity in either Cy3 or Cy5, and (3) eliminates spots located outside the least-mean-squares line ± 2 x S.D. After the filtering was finished, we compared the results of the two experiments by calculating a Pearson's correlation coefficient. If the coefficient were equal to or greater than 0.7, we used the data in subsequent analyses. If not, we repeated the labeling, hybridization, and scanning up to six times. In this way, we could generate high-quality data for most tissues. Before the clustering, ratio values from duplicate experiments were averaged, log-transformed (base 2), and stored in a table. We applied hierarchical clustering to both axes using the weighted pair-group method with a centroid average as implemented by the program Cluster (http://www.microarrays.org/software; Eisen et al. 1998). The distance matrices we used were the Pearson correlation for clustering the arrays and the inner product of vectors normalized to magnitude 1 for the genes (this is a slight variation of the Pearson correlation). The results were analyzed using TreeView (http://www.microarrays.org/software; Eisen et al. 1998).
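
The normalization, filtering, and reproducibility check described above can be sketched roughly as follows. This is not the PRIM program itself, only a minimal NumPy illustration under several assumptions: the function names are hypothetical, S.D. is taken as NumPy's default population standard deviation, and the least-squares line of step (3) is assumed to be fitted between the log ratios of the two replicate experiments, with the ± 2 S.D. band taken from the residuals (the text does not specify either choice). Manually flagged spots (step 1) are assumed to have been dropped beforehand.

```python
import numpy as np


def normalized_log_ratio(cy3, cy5):
    """log2(Cy3/Cy5), shifted so that the median log ratio is zero."""
    log_ratio = np.log2(cy3 / cy5)
    return log_ratio - np.median(log_ratio)


def above_background(signal, background, n_sd=3.0):
    """Step (2): True where a spot reaches mean + n_sd * S.D. of background."""
    return signal >= background.mean() + n_sd * background.std()


def reproducible_mask(rep1, rep2, n_sd=2.0):
    """Step (3), as assumed here: keep spots within n_sd residual S.D.
    of the least-squares line fitted through the replicate log ratios."""
    slope, intercept = np.polyfit(rep1, rep2, 1)
    residual = rep2 - (slope * rep1 + intercept)
    return np.abs(residual) <= n_sd * residual.std()


def combine_replicates(rep1, rep2, min_r=0.7):
    """Return (Pearson r, log2 of the averaged ratio), or (r, None) if r < min_r."""
    r = np.corrcoef(rep1, rep2)[0, 1]
    if r < min_r:
        return r, None  # per the text, the experiment would be repeated
    mean_ratio = (2.0 ** rep1 + 2.0 ** rep2) / 2.0  # average ratios, then log
    return r, np.log2(mean_ratio)


# Toy replicate log ratios with one discordant spot at index 5.
rep1 = np.arange(10.0)
rep2 = rep1.copy()
rep2[5] += 5.0
keep = reproducible_mask(rep1, rep2)
r, averaged = combine_replicates(rep1[keep], rep2[keep])
print(keep, r)
```

A spot would be kept only if it passes the background test in both channels and the reproducibility test; the 0.7 correlation threshold is then applied to the surviving spots.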

Data Processing
Arrays were scanned using a ScanArray 5000 confocal scanning laser microscope (PerkinElmer Life Sciences), and then TIFF image data were extracted using DigitalGENOME software (MolecularWare), and finally reproducible spots were identified using the PRIM filtration program (Kadota et al. 2001).

Extracting Tissue-Specific Expressed Genes
Log-transformed ratio data, processed and normalized by PRIM, were used to find genes expressed in a tissue-specific manner. The log-transformed ratio values for one cDNA clone were normalized, and the clone was denoted as `tissue-specific' if the normalized ratio value exceeded mean + 3 S.D. for our cDNA microarray and mean + 2 S.D. for Affymetrix chips.
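
One way to read this criterion is as a per-clone z-score across tissues: a clone is called tissue-specific in a tissue when its log ratio there lies more than 3 S.D. above that clone's mean over all tissues. The sketch below implements this reading; the function name, the use of the sample standard deviation, and the toy matrix are assumptions made for illustration rather than details taken from the Methods.

```python
import numpy as np


def tissue_specific_calls(log_ratios, n_sd=3.0):
    """Flag (clone, tissue) pairs whose log ratio exceeds mean + n_sd * S.D.

    log_ratios: 2-D array, rows = clones, columns = tissues.
    The mean and S.D. are computed per clone across tissues (an assumption).
    """
    mean = log_ratios.mean(axis=1, keepdims=True)
    sd = log_ratios.std(axis=1, ddof=1, keepdims=True)
    with np.errstate(invalid="ignore", divide="ignore"):
        z = (log_ratios - mean) / sd  # NaN for clones with zero variance
        return z > n_sd               # NaN compares as False


# Toy data for 3 clones across 20 tissues.
profiles = np.zeros((3, 20))
profiles[0, 3] = 5.0                     # strongly enriched in one tissue
profiles[1] = np.tile([-1.0, 1.0], 10)   # varies, but no single outlier
calls = tissue_specific_calls(profiles)  # clone 2 is flat everywhere
print(np.argwhere(calls))
```

Passing `n_sd=2.0` gives the looser threshold that the text applies to the Affymetrix data.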

Finally, the GO terms for these clones were extracted, and 14 representative terms in molecular_function ontology (http://www.geneontology.org/ontology/function.ontology) were assigned to all cDNA clones. If there was no GO annotation in molecular_function, code `NA' was assigned.

Gene Expression for cDNA Clones in the Functional Category `unknown EST' or `Unclassifiable'
To check whether the gene is expressed, the intensity of the corresponding spot was evaluated. The background intensity was used to test this by checking whether (1) the intensity of the spot was more than 10 S.D. of all normalized background intensity values, and (2) this condition was met in the duplicated experiments. If these criteria were satisfied under any experimental condition, the corresponding gene was regarded as `expressed'. cDNA clones whose FANTOM2 functional category was either `unknown EST' or `Unclassifiable' were extracted, and their gene expression was examined using the method mentioned above.
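
The expression call described above can be sketched as follows. The threshold is taken literally as 10 x S.D. of the background values of each hybridization; the text could also be read as mean + 10 S.D., so this is one interpretation only. The function name, array layout, and toy numbers are hypothetical.

```python
import numpy as np


def expressed_calls(intensity, background, n_sd=10.0):
    """Call a clone expressed if, in at least one tissue, its spot exceeds
    the background threshold in both duplicate hybridizations.

    intensity:  array of shape (clones, tissues, 2 replicates)
    background: array of shape (tissues, 2 replicates, background values)
    """
    threshold = n_sd * background.std(axis=-1)  # shape (tissues, 2)
    above = intensity > threshold               # broadcasts over clones
    in_both_replicates = above.all(axis=-1)     # shape (clones, tissues)
    return in_both_replicates.any(axis=-1)      # shape (clones,)


# Toy example: 2 tissues, background S.D. of 1 in every hybridization.
background = np.tile([0.0, 2.0, 0.0, 2.0], (2, 2, 1))
intensity = np.array([
    [[50.0, 60.0], [1.0, 1.0]],   # above threshold in both replicates of tissue 0
    [[50.0, 5.0], [5.0, 50.0]],   # never above threshold in both replicates
    [[1.0, 1.0], [1.0, 1.0]],     # at background level everywhere
])
print(expressed_calls(intensity, background))
```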



Acknowledgements:

We thank M.C. Nakao for technical assistance with the figure; H. Matsuda, H. Kawaji, F. Collins, and S. Batalov for valuable discussion and comments; and Y. Tsujimura, C. Saito, S. Watanabe, T. Kobayashi, G. Matsuda, E. Nakayama, A. Wakamoto, S. Suyama, M. Yahata, H. Arai, T. Shinauchi, S. Arai, K. Kadota, and M. Kadomura for technical assistance and helpful discussions. This study was supported by a Research Grant for the RIKEN Genome Exploration Research Project from the Ministry of Education, Culture, Sports, Science and Technology of the Japanese Government (MEXT) to Y.H., and Grant-in-Aid for Scientific Research on Priority Areas (C) "Genome Information Science" from the Ministry of Education, Culture, Sports, Science and Technology of the Japanese Government (MEXT) to H.B.



Supplemental material is available online at:  http://www.genome.org


Experimental Design

1. type of experiment ; expression profile of each tissue
2. experimental factors ; normal
3. the number of hybridizations performed experiment ; duplicate
4. the type of reference used for the hybridizations ; RNA from whole body of embryo 17.5 days
5. hybridization design ; cy3 : each tissue, cy5 : embryo 17.5 days
6. quality control steps taken ; duplicate, positive controls and negative controls
7. URL of any supplemental websites or database accession number ; http://READ.gsc.riken.go.jp/fantom2/

Sample used, extract preparation and labeling

1. the origin of the biological sample ; mouse (C57BL/6J), male, at 8 weeks (adult). The female-specific reproductive organs were prepared from female mice at 8 weeks (adult).
2. manipulation of biological samples and protocols used ; Mice used in this study were bred under SPF conditions. This experiment was approved by IACUC of RIKEN.
3. protocol for preparing the hybridization extract ; Total RNAs were extracted using the acid guanidine phenol chloroform (AGPC) method (Carninci, P. and Y. Hayashizaki. 1999. High-efficiency full-length cDNA cloning. Methods Enzymol 303: 19-44)
4. labeling protocol ; aminoallyl method
5. external controls ; positive controls (G3PDH, beta actin, elongation factor 2), negative controls (clones of Arabidopsis thaliana).

Hybridization procedures and parameters

1. the protocol and conditions used during hybridization, blocking and washing ; hybridization : 65°C, overnight, blocking : no blocking, washing : 2*SSC, 0.1% SDS -> 1*SSC -> 0.2*SSC

Measurement data and specifications

1. the quantifications based on the image ; DigitalGENOME (MolecularWare, Inc., Cambridge, MA, USA)
2. type of scanning hardware and software used ; ScanArray 5000 (GSI Lumonics Inc., Billerica, MA, USA)
3. type of image analysis software used ; DigitalGENOME (MolecularWare, Inc., Cambridge, MA, USA)
4. a description of the measurements produced by image-analysis software and a description of which measurements were used in the analysis ;  Mean value of the pixels in the circled area (provided by the Manufacturer: MolecularWare, Inc., Cambridge, MA, USA)
5. the complete output of the image analysis before data selection and transformation ; available on request
6. data selection and transformation procedures ; Valid data were selected by the PRIM method (Kadota, K., R. Miki, H. Bono, K. Shimizu, Y. Okazaki, and Y. Hayashizaki. 2001. Preprocessing implementation for microarray (PRIM): an efficient method for processing cDNA microarray data. Physiol Genomics 4: 183-188.) The value of the ratio (Cy3/Cy5) was log base2 transformed and used.
7. final gene expression data tables used by the authors to make their conclusions after data selection and transformation ; available on request

Array design

1. general array design ; spotted glass array, gamma-amino-propyl-silane coated slides (CMT-GAPS coated slides, Corning, Inc., Corning, NY, USA)
2. For each feature on the array, the ID of its respective reporter (molecule present on each spot) should be given. ; available at:   http://READ.gsc.riken.go.jp/fantom2/
3. For each reporter, its type should be given ; RIKEN full-length enriched cDNA clones
4. along with information that characterizes the reporter molecule unambiguously, in the form of appropriate database references and sequence; All the sequences are available from the public database. The correlation of accession number is available at:   http://READ.gsc.riken.go.jp/fantom2/
5. For non-commercial arrays, the following details should be provided:
a. the source of the reporter molecules ; RIKEN full-length enriched clones
b. the method of reporter preparation ; CAP trapper method (Carninci, P. and Y. Hayashizaki. 1999. High-efficiency full-length cDNA cloning. Methods Enzymol 303: 19-44)
c. the spotting protocols ; the array substrate : PCR products, the spotting buffer : 3*SSC, post-printing processing : rehydration -> UV cross-linking
d. any additional treatment performed prior to hybridization ; blocking with 1-methyl-2-pyrrolidone and succinic anhydride


References:

Alizadeh, A.A., Eisen, M.B., Davis, R.E., Ma, C., Lossos, I.S., Rosenwald, A., Boldrick, J.C., Sabet, H., Tran, T., Yu, X., et al. 2000. Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature 403: 503-511.

Bono, H., Kasukawa, T., Hayashizaki, Y., and Okazaki, Y. 2002. READ: RIKEN Expression Array Database. Nucleic Acids Res. 30: 211-213.

Bono, H., Nikaido, I., Kasukawa, T., Hayashizaki, Y., RIKEN GER Group and GSL Members, and Okazaki, Y. 2003. Comprehensive analysis of the mouse metabolome based on the transcriptome. Genome Res. 13: 1345-1349.

DeRisi, J.L., Iyer, V.R., and Brown, P.O. 1997. Exploring the metabolic and genetic control of gene expression on a genomic scale. Science 278: 680-686.

Eisen, M.B., Spellman, P.T., Brown, P.O., and Botstein, D. 1998. Cluster analysis and display of genome-wide expression patterns. Proc. Natl. Acad. Sci. 95: 14863-14868.

The FANTOM Consortium and the RIKEN Genome Exploration Research Group Phase I and II Team. 2002. Analysis of the mouse transcriptome based on functional annotation of 60,770 full-length cDNAs. Nature 420: 563-573.

The Gene Ontology Consortium. 2001. Creating the gene ontology resource: Design and implementation. Genome Res. 11: 1425-1433.

Grimmond, S.M., Miranda, K.C., Yuan, Z., Davis, M.J., Hume, D.A., Yagi, K., Tominaga, N., Bono, H., Hayashizaki, Y., Okazaki, Y., et al. 2003. The Mouse Secretome: Functional classification of the proteins secreted into the extracellular environment. Genome Res. 13: 1350-1359.

Ichikawa, Y., Ishikawa, T., Takahashi, S., Hamaguchi, Y., Morita, T., Nishizuka, I., Yamaguchi, S., Endo, I., Ike, H., Togo, S., et al. 2002. Identification of genes regulating colorectal carcinogenesis by using the algorithm for diagnosing malignant state method. Biochem. Biophys. Res. Commun. 296: 497.

Kadota, K., Miki, R., Bono, H., Shimizu, K., Okazaki, Y., and Hayashizaki, Y. 2001. Preprocessing implementation for microarray (PRIM): An efficient method for processing cDNA microarray data. Physiol. Genomics 4: 183-188.

Marra, M., Hillier, L., Kucaba, T., Allen, M., Barstead, R., Beck, C., Blistain, A., Bonaldo, M., Bowers, Y., Bowles, L., et al. 1999. An encyclopedia of mouse genes. Nat. Genet. 21: 191-194.

Miki, R., Kadota, K., Bono, H., Mizuno, Y., Tomaru, Y., Carninci, P., Itoh, M., Shibata, K., Kawai, J., Konno, H., et al. 2001. Delineating developmental and metabolic pathways in vivo by expression profiling using the RIKEN set of 18,816 full-length enriched mouse cDNA arrays. Proc. Natl. Acad. Sci. 98: 2199-2204.

Mizuno, Y., Sotomaru, Y., Katsuzawa, Y., Kono, T., Meguro, M., Oshimura, M., Kawai, J., Tomaru, Y., Kiyosawa, H., Nikaido, I., et al. 2002. Asb4, Ata3, and Dcn are novel imprinted genes identified by high-throughput screening using RIKEN cDNA microarray. Biochem. Biophys. Res. Commun. 290: 1499-1505.

The RIKEN Genome Exploration Research Group Phase II Team and the FANTOM Consortium. 2001. Functional annotation of a full-length mouse cDNA collection. Nature 409: 685-690.

Spellman, P.T. and Rubin, G.M. 2002. Evidence for large domains of similarly expressed genes in the Drosophila genome. J. Biol. 1: 5.

Su, A.I., Cooke, M.P., Ching, K.A., Hakak, Y., Walker, J.R., Wiltshire, T., Orth, A.P., Vega, R.G., Sapinoso, L.M., Moqrich, A., et al. 2002. Large-scale analysis of the human and mouse transcriptomes. Proc. Natl. Acad. Sci. 99: 4465-4470.


New References:

0. Special Issue of Genome Research, vol. 13, no. 6b, pp. 1265-1561 (June 2, 2003).
Report of "RIKEN Mouse Genome Encyclopedia" project: the whole system from mouse house to database.

1. Carninci P, et al, "Targeting a Complex Transcriptome: The Construction of the Mouse Full-Length cDNA Encyclopedia", Genome Research, vol. 13, no. 6b, pp. 1273-1289 (June 2, 2003).

2. Numata K, Kanai A, Saito R, Kondo S, Adachi J, Wilming LG, Hume DA, RIKEN GER Group, GSL Members, Hayashizaki Y, and Tomita M, "Identification of Putative Noncoding RNAs Among the RIKEN Mouse Full-Length cDNA Collection", Genome Research, vol. 13, no. 6b, pp. 1301-1306 (June 2, 2003).
 



Additional References:

1. Saha S, Ansari AZ, Jarell KA, and Ptashne M, "RNA Sequences that Work as Transcriptional Activating Regions".

2. Lee JM, and Sonnhammer ELL, "Genomic Gene Clustering Analysis of Pathways in Eukaryotes".

3. Blumenthal T, Evans D, Link CD, Guffanti A, Lawson D, Thierry-Mieg J, Thierry-Mieg D, Chiu WL, Duke K, Kiraly M, and Kim SK, "A Global Analysis of Caenorhabditis elegans Operons".

4. Storz G, "An Expanding Universe of Noncoding RNAs".

5. Eddy SR, "Non-Coding RNA Genes and the Modern RNA World".

6. Huttenhofer A, Kiefmann M, Meier-Ewert S, O'Brien J, Lehrach H, Bachellerie J-P, and Brosius J, "RNomics: An Experimental Approach that Identifies 201 Candidates for Novel, Small, Non-Messenger RNAs in Mouse".

7. Hovsepian JA, and Frenster JH, "RNA-Induced Melting of DNA during Selective Gene Transcription".

8. Frenster JH, "Ultrastructural Probes of Active DNA Sites, and the RNA Activators of DNA".




For Further Information or Feedback:
e-mail:   frenster@euchromatin.net
Phone:   +1 650 367 6483
Fax:   +1 650 364 1773

euchromatin:  "the most active portion of the genome within the cell nucleus".