Argo_CUDA: Exhaustive GPU based approach for motif discovery in large DNA datasets

Research output: Contribution to journal › Article › peer-review

2 Citations (Scopus)

Abstract

The development of chromatin immunoprecipitation sequencing (ChIP-Seq) technology has revolutionized the genetic analysis of the basic mechanisms underlying transcription regulation and has led to the accumulation of a huge number of DNA sequences. Many web services are currently available for de novo motif discovery in datasets containing information about DNA/protein binding. The enormous diversity of motifs makes finding them challenging. To sidestep this difficulty, researchers use various stochastic approaches. Unfortunately, the efficiency of motif discovery programs declines dramatically as the query set grows. As a result, only a fraction of the top “peak” ChIP-Seq segments can be analyzed, or the area of analysis must be narrowed. Motif discovery in massive datasets thus remains a challenging issue. The Argo_CUDA (Compute Unified Device Architecture) web service is designed to process massive DNA data. It is a program for detecting degenerate oligonucleotide motifs of fixed length written in the 15-letter IUPAC code. Argo_CUDA is a fully exhaustive approach based on high-performance GPU technologies. Compared with existing motif discovery web services, Argo_CUDA shows good prediction quality on simulated sets. Analysis of ChIP-Seq sequences revealed motifs that correspond to known transcription factor binding sites.
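To make the idea of a degenerate motif in the 15-letter IUPAC code concrete, the following minimal Python sketch matches a fixed-length IUPAC motif against plain A/C/G/T sequences and enumerates the exhaustive candidate space of 15^k motifs. It is an illustration only and is not taken from Argo_CUDA: the function names (`matches_at`, `count_hits`, `enumerate_motifs`), the bitmask encoding, the forward-strand-only search, and the "sequences with at least one hit" count are all assumptions made for this example. The abstract does not describe the scoring or the GPU kernels actually used.

```python
from itertools import product

# Illustrative sketch only; NOT the Argo_CUDA implementation.
# Each base is one bit, so a degenerate IUPAC letter is a 4-bit mask.
BASE = {"A": 1, "C": 2, "G": 4, "T": 8}

IUPAC = {
    "A": 1, "C": 2, "G": 4, "T": 8,
    "R": 1 | 4,      # A or G
    "Y": 2 | 8,      # C or T
    "S": 2 | 4,      # C or G
    "W": 1 | 8,      # A or T
    "K": 4 | 8,      # G or T
    "M": 1 | 2,      # A or C
    "B": 2 | 4 | 8,  # any base except A
    "D": 1 | 4 | 8,  # any base except C
    "H": 1 | 2 | 8,  # any base except G
    "V": 1 | 2 | 4,  # any base except T
    "N": 15,         # any base
}


def matches_at(motif, seq, pos):
    """True if the IUPAC motif matches seq starting at index pos.

    Sequence characters other than A/C/G/T never match (mask 0).
    """
    return all(
        IUPAC[m] & BASE.get(seq[pos + i], 0)
        for i, m in enumerate(motif)
    )


def count_hits(motif, sequences):
    """Count sequences with at least one forward-strand match (assumed statistic)."""
    k = len(motif)
    return sum(
        any(matches_at(motif, seq, p) for p in range(len(seq) - k + 1))
        for seq in sequences
    )


def enumerate_motifs(k):
    """Yield every IUPAC motif of length k: the exhaustive space of 15**k candidates."""
    for letters in product(IUPAC, repeat=k):
        yield "".join(letters)
```

Because the candidate space grows as 15^k, checking every motif against every sequence quickly becomes expensive on a CPU. Each candidate can be evaluated independently of the others, which is the kind of workload that maps naturally onto GPU threads.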

Original language: English
Article number: 1740012
Number of pages: 23
Journal: Journal of Bioinformatics and Computational Biology
Volume: 16
Issue number: 1
DOIs
Publication status: Published - 1 Feb 2018

Keywords

  • ChIP-Seq
  • Motif discovery
  • oligonucleotide motif
  • transcription regulation
  • SEQUENCE MOTIFS
  • INFORMATION-CONTENT
  • FACTOR-BINDING PROFILES
  • IDENTIFICATION
  • NUCLEOTIDE
  • CHIP-SEQ DATA
  • ELEMENTS
  • OPEN-ACCESS DATABASE
  • SITES
  • TOOL
  • Chromatin Immunoprecipitation/methods
  • Databases, Genetic
  • Computational Biology/methods
  • Algorithms
  • Animals
  • Nucleotide Motifs
  • Transcription Factors/metabolism
  • Hepatocyte Nuclear Factor 3-beta/genetics
  • Mice
  • Binding Sites
