Several problems in the design of high-throughput sequencing experiments that use RNA-Seq or Ribo-Seq technologies are reviewed. The ENCODE guidelines (2011) and the recommendations of other experts on experimental design for studying animal and plant transcriptomes are also summarized. An optimal limit of sequencing depth exists for the identification of most actively transcribed genes. This limit depends on the transcriptome size of the biological object under study. Additional sequencing beyond this limit does not provide substantial new information about the complexity of the transcriptome. For mammals, the optimal sequencing depth for identifying actively transcribed genes is ~2 × 10⁹ bp per biological sample. For other species, the optimal sequencing depth per biological sample can be estimated from the mammalian value by recalculating it for the target species according to its transcriptome size and specific RNA amount per cell. Detection of differentially expressed genes, as well as identification of splice junctions in mRNA, can be improved by increasing the number of biological samples analyzed per experimental group. At least two biological replicates per experimental group should be sequenced. Optimal results require at least five to eight biological replicates per experimental group (similar to qRT-PCR quantification of single-gene expression). For transcriptome studies, sequencing technologies with a per-base accuracy of ≥0.999 are recommended. For RNA-Seq, sequencing platforms that produce reads of ≥75 bp are optimal for minimizing sequencing cost. The relative cost of sequencing control groups can be reduced by increasing the number of experimental groups, either by combining several similar experiments or by making the initial experiment more elaborate.
These recommendations can be helpful in designing transcriptome experiments in functional genomics.
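The depth recalculation described above can be illustrated with a minimal sketch. It assumes simple linear scaling of the mammalian value (~2 × 10⁹ bp per biological sample) by the ratio of transcriptome sizes; the text does not give the exact formula, and the adjustment for specific RNA amount per cell is not modeled here. The function names, the `size_ratio` parameter, and the example ratio are hypothetical and serve only as illustration.

```python
import math

# Optimal depth per biological sample for mammals, as stated in the text.
MAMMAL_DEPTH_BP = 2e9


def scaled_depth_bp(size_ratio, reference_depth_bp=MAMMAL_DEPTH_BP):
    """Estimate the depth for a target species.

    size_ratio: target transcriptome size divided by the mammalian
    transcriptome size (assumption: depth scales linearly with this ratio).
    """
    if size_ratio <= 0:
        raise ValueError("size_ratio must be positive")
    return reference_depth_bp * size_ratio


def reads_needed(depth_bp, read_length_bp=75):
    """Number of single reads of a given length needed to reach depth_bp."""
    if read_length_bp <= 0:
        raise ValueError("read_length_bp must be positive")
    return math.ceil(depth_bp / read_length_bp)


def total_depth_bp(depth_per_sample_bp, replicates, groups):
    """Total sequencing output for a design with equal-sized groups."""
    return depth_per_sample_bp * replicates * groups


if __name__ == "__main__":
    # Hypothetical species whose transcriptome is half the mammalian size.
    depth = scaled_depth_bp(0.5)
    print(depth, reads_needed(depth), total_depth_bp(depth, 5, 2))
```

With a ratio of 1.0 the sketch reproduces the mammalian value, which corresponds to roughly 27 million reads of 75 bp per biological sample.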
- design of experiment
- high-throughput sequencing