The Friedlander Laboratory, Department of Cell Biology
The Scripps Research Institute, La Jolla, CA
Statistical analysis is critical for judging the validity of the microarray data presented and the relevance of the expression profiles in the various clusters. To ensure the reliability of the data, we performed multiple statistical analyses on the subset of 'expressed genes' before clustering and other data analysis. An experimental gene tree, an analysis of the coefficient of variation (COV), and the correlation between variance and expression level for each timepoint analyzed are presented below. Validation of select genes by real-time RT-PCR is demonstrated in the publication associated with these data.
The Pearson correlation was used to define the relative similarity between the gene expression experiments (3 cRNA replicates per timepoint). As shown, replicates of a given timepoint are significantly more similar to one another than to samples from different timepoints of postnatal retinal development.
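The comparison described above can be sketched as follows. This is a minimal illustration on synthetic data, not the pipeline used for the published analysis: the function name `replicate_correlations`, the timepoint labels, and the simulated expression values are all assumptions made for the example.

```python
import numpy as np

def replicate_correlations(expr, labels):
    """Pearson correlation between samples (hypothetical helper).

    expr   : genes x samples array of expression values
    labels : one timepoint label per sample column
    Returns the sample-by-sample correlation matrix, the mean correlation
    among replicates of the same timepoint, and the mean correlation
    between samples from different timepoints.
    """
    corr = np.corrcoef(expr, rowvar=False)        # columns are samples
    labels = np.asarray(labels)
    same = labels[:, None] == labels[None, :]     # same-timepoint pairs
    off_diag = ~np.eye(len(labels), dtype=bool)   # exclude self-correlation
    within = corr[same & off_diag].mean()
    between = corr[~same].mean()
    return corr, within, between

# Synthetic example: two made-up timepoints, three replicates each.
rng = np.random.default_rng(0)
n_genes = 500
profile_a = rng.normal(8.0, 2.0, n_genes)         # simulated profile, timepoint A
profile_b = rng.normal(8.0, 2.0, n_genes)         # simulated profile, timepoint B
columns = [p + rng.normal(0.0, 0.2, n_genes)      # small replicate-level noise
           for p in (profile_a, profile_b) for _ in range(3)]
expr = np.column_stack(columns)
labels = ["A", "A", "A", "B", "B", "B"]

corr, within, between = replicate_correlations(expr, labels)
print(f"mean within-timepoint r:  {within:.3f}")
print(f"mean between-timepoint r: {between:.3f}")
```

A gene tree like the one presented here would typically be built by hierarchical clustering on a distance derived from this matrix (for example, 1 - r); that step is omitted from the sketch.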
The coefficient of variation (COV) was calculated for each timepoint by determining the standard deviation of the expression values for each cRNA sequence (each timepoint was analyzed in triplicate) and dividing it by the mean expression value. The median COVs for each timepoint ranged from 5.2% to 7.9%, within the field's acceptable limits.
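As a rough illustration of this calculation, the sketch below computes a per-gene COV from triplicate values and reports the median as a percentage. The function name `median_cov_percent` and the example numbers are hypothetical, and the use of the sample standard deviation (`ddof=1`) is an assumption, since the text does not say which estimator was used.

```python
import numpy as np

def median_cov_percent(replicates):
    """Median coefficient of variation (in %) for one timepoint.

    replicates : genes x replicates array of expression values.
    Assumes every gene has a positive mean expression value.
    """
    replicates = np.asarray(replicates, dtype=float)
    mean = replicates.mean(axis=1)
    sd = replicates.std(axis=1, ddof=1)   # sample SD (assumption)
    cov = sd / mean                       # per-gene COV
    return float(np.median(cov) * 100.0)

# Made-up triplicate values for three genes at a single timepoint.
example = [
    [100.0, 110.0, 90.0],    # SD 10, mean 100 -> COV 0.10
    [200.0, 200.0, 200.0],   # no variation    -> COV 0.00
    [50.0, 55.0, 45.0],      # SD 5, mean 50   -> COV 0.10
]
result = median_cov_percent(example)
print(f"median COV: {result:.1f}%")
```

The median is used rather than the mean so that a handful of genes with very high variation does not dominate the summary for a timepoint.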
Part of the COV data can be misleading. Although each of the genes tested in the COV analysis is considered expressed in the postnatal developing mouse retina, many of these genes are expressed at low levels, or not at all, at certain developmental timepoints. The variance for these genes is likely to be high, artificially raising the median COV for each timepoint. It has been shown that variance is tightly correlated with the intensity of expression. To determine whether this was the case for our data, the standard deviation of LN(expression) was graphed against the mean LN(expression) for each gene within each timepoint. Again, our data fall within acceptable limits as determined by rigorous experimentation validating microarray data ( Tu et al. "Quantitative noise analysis for gene expression microarray experiments" PNAS 99(22):14031 ).
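The relationship between noise and expression intensity can be examined with a sketch like the one below, which computes the two quantities plotted for each gene. The helper names `log_noise_profile` and `binned_median_sd`, the noise model, and all simulated values are assumptions made for illustration; they are not taken from the analysis described here or from Tu et al.

```python
import numpy as np

def log_noise_profile(replicates):
    """Per-gene mean and standard deviation of ln(expression).

    replicates : genes x replicates array of strictly positive values.
    """
    logs = np.log(np.asarray(replicates, dtype=float))
    mean_ln = logs.mean(axis=1)
    sd_ln = logs.std(axis=1, ddof=1)      # sample SD (assumption)
    return mean_ln, sd_ln

def binned_median_sd(mean_ln, sd_ln, n_bins=4):
    """Median SD of ln(expression) in equal-sized bins of rising intensity."""
    order = np.argsort(mean_ln)
    chunks = np.array_split(sd_ln[order], n_bins)
    return np.array([np.median(chunk) for chunk in chunks])

# Synthetic triplicates: 3% multiplicative noise plus a fixed additive
# background, so weakly expressed genes look proportionally noisier.
rng = np.random.default_rng(1)
n_genes = 2000
true_level = np.exp(rng.uniform(5.0, 10.0, n_genes))
measured = (true_level[:, None] * (1.0 + 0.03 * rng.normal(size=(n_genes, 3)))
            + 15.0 * rng.normal(size=(n_genes, 3)))

mean_ln, sd_ln = log_noise_profile(measured)
bins = binned_median_sd(mean_ln, sd_ln)
print("median SD of ln(expression), lowest to highest intensity bin:")
print(np.round(bins, 3))
```

Plotting `sd_ln` against `mean_ln` gives the kind of scatter described above; the binned medians are just a compact way to see whether noise rises toward the low-intensity end.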