Метод поиска структурной гетерогенности сайтов связывания транскрипционных факторов с использованием альтернативных de novo моделей на примере FOXA2

Translated title of the contribution: Application of alternative de novo motif recognition models for analysis of structural heterogeneity of transcription factor binding sites: A case study of FOXA2 binding sites

Research output: Contribution to journal › Article › peer-review

Abstract

The most popular model for searching ChIP-seq data for transcription factor binding sites (TFBS) is the positional weight matrix (PWM). However, this model does not take into account dependencies between nucleotide occurrences at different site positions. Two recently proposed models, BaMM and InMoDe, do account for such dependencies. However, their application has usually been limited to comparing their recognition accuracy with that of PWMs, and no analysis of the co-prediction and relative positioning of hits of different models within peaks has yet been performed. To close this gap, we propose a pipeline called MultiDeNA. The pipeline includes the stages of training the models, assessing their recognition accuracy, scanning ChIP-seq peaks, and classifying the peaks based on the scan results. We applied the pipeline to 22 ChIP-seq datasets of the TF FOXA2 and considered the PWM, dinucleotide PWM (diPWM), BaMM and InMoDe models. Combining these four models allowed a significant increase in the fraction of recognized peaks compared to the PWM model alone: the increase was 26.3%. The BaMM model made the main contribution to site recognition. Although the major fraction of predicted peaks contained TFBS of different models at coinciding positions, the median fractions of peaks containing predictions of only a single model were 1.08, 0.49, 4.15 and 1.73% for PWM, diPWM, BaMM and InMoDe, respectively. Thus, FOXA2 binding sites were not fully described by any single model, which indicates their heterogeneity. We assume that the BaMM model is the most successful in describing the structure of FOXA2 binding sites in the ChIP-seq datasets under study.
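The peak classification step described in the abstract can be illustrated with a minimal sketch. It is not the MultiDeNA implementation: the function names, the input layout (a set of peak indices per model) and the toy data are all assumptions made for illustration, and the sketch ignores the relative positions of hits within a peak.

```python
from collections import Counter

# Model names taken from the abstract; everything else is hypothetical.
MODELS = ("PWM", "diPWM", "BaMM", "InMoDe")


def classify_peaks(hits_by_model, n_peaks):
    """Label each peak with the set of models that have at least one hit in it.

    hits_by_model maps a model name to the set of peak indices in which
    that model predicted a site (assumed input format, toy data only).
    """
    labels = []
    for peak in range(n_peaks):
        found = frozenset(
            m for m in MODELS if peak in hits_by_model.get(m, set())
        )
        labels.append(found)
    return labels


def summarize(labels):
    """Return fractions of peaks recognized by any model, by PWM,
    and by exactly one model (per model)."""
    n = len(labels)
    counts = Counter(labels)
    recognized_any = sum(1 for s in labels if s) / n
    recognized_pwm = sum(1 for s in labels if "PWM" in s) / n
    sole = {m: counts[frozenset([m])] / n for m in MODELS}
    return recognized_any, recognized_pwm, sole


# Hypothetical toy data: 10 peaks with indices 0..9 (not real results).
toy_hits = {
    "PWM": {0, 1, 2, 3},
    "diPWM": {0, 1, 2},
    "BaMM": {0, 1, 2, 4, 5},
    "InMoDe": {0, 1, 6},
}
labels = classify_peaks(toy_hits, n_peaks=10)
any_frac, pwm_frac, sole_frac = summarize(labels)
```

In this toy input, the difference between `any_frac` and `pwm_frac` corresponds to the kind of gain in recognized peaks that the abstract reports for the combined models, and `sole_frac` corresponds to the per-model fractions of peaks predicted by only one model.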

Original language: Russian
Article number: 1
Pages (from-to): 7-17
Number of pages: 11
Journal: Vavilovskii Zhurnal Genetiki i Selektsii
Volume: 25
Issue number: 1
Publication status: Published - Jan 2021

Keywords

  • ChIP-seq
  • Heterogeneity of TFBS
  • TFBS de novo searching
  • Transcription factor binding sites (TFBS)

OECD FOS+WOS

  • 3 MEDICAL AND HEALTH SCIENCES
  • 1.06 BIOLOGICAL SCIENCES
  • 1.06.KM GENETICS & HEREDITY

State classification of scientific and technological information

  • 34 BIOLOGY
