RNA plays an important role in the life of the cell and of the organism as a whole. Besides the well-established protein-coding RNAs (messenger RNAs, mRNAs), long non-coding RNAs (lncRNAs) have recently attracted researchers' attention. Although lncRNAs are classified as non-coding, some authors have reported the presence of corresponding sequences in ribosome profiling (Ribo-seq) data. Ribo-seq is a powerful experimental technique for characterizing RNA translation in cells, with a focus on initiation (using harringtonine or lactimidomycin) and elongation (using cycloheximide). By exploiting translation starts obtained from Ribo-seq data, we developed a novel position weight matrix model for the prediction of translation starts. This model allowed us to achieve 96% accuracy in discriminating between human mRNAs and lncRNAs. When the same model was used to predict putative ORFs in RNAs, we found that, in contrast to mRNAs, the majority of lncRNAs contained only small ORFs (≤300 nt).
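As a rough illustration of how a position weight matrix can score candidate translation starts, the following minimal Python sketch builds a log-odds matrix from aligned start-site windows and picks the best-scoring ATG in a sequence. It is not the authors' model: the function names (`build_pwm`, `score_window`, `best_start`), the uniform background, the pseudocount, the window width, the DNA alphabet (T in place of U), and the toy training sites are all assumptions made for this example.

```python
import math

BASES = "ACGT"  # assumption: sequences use the DNA alphabet (T instead of U)


def build_pwm(aligned_sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM from equal-length windows aligned on the start codon."""
    length = len(aligned_sites[0])
    if any(len(site) != length for site in aligned_sites):
        raise ValueError("all training sites must have the same length")
    pwm = []
    for pos in range(length):
        # Start every count at the pseudocount so unseen bases get a finite score.
        counts = {base: pseudocount for base in BASES}
        for site in aligned_sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        # Log2 ratio of observed frequency to a uniform background frequency.
        pwm.append({b: math.log2(counts[b] / total / background) for b in BASES})
    return pwm


def score_window(pwm, window):
    """Sum the per-position log-odds scores for one window."""
    return sum(column[base] for column, base in zip(pwm, window))


def best_start(pwm, rna, upstream):
    """Return (score, index of the A in ATG) for the best-scoring ATG, or None."""
    width = len(pwm)
    best = None
    # Only consider ATGs whose full window fits inside the sequence.
    for i in range(upstream, len(rna) - (width - upstream) + 1):
        if rna[i:i + 3] != "ATG":
            continue
        window = rna[i - upstream:i - upstream + width]
        score = score_window(pwm, window)
        if best is None or score > best[0]:
            best = (score, i)
    return best


# Toy, made-up training windows: 3 nt upstream + ATG + 1 nt downstream.
toy_sites = ["ACCATGG", "GCCATGG", "ACCATGG", "AACATGG"]
toy_pwm = build_pwm(toy_sites)
print(best_start(toy_pwm, "TTTATGCACCATGGTT", upstream=3))
```

In practice the training windows would come from experimentally supported translation starts, such as those observed in Ribo-seq data, and would usually be wider than the seven positions used here.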
|Journal||Journal of Bioinformatics and Computational Biology|
|Status||Published - 1 Apr 2018|