Complex approach towards algorithm learning for anaphora resolution in Russian language

O. A. Gureenkova, T. V. Batura, A. A. Kozlova, A. N. Svischev

Research output: Contribution to journal, Article, peer-review

Abstract

The paper considers the application of an ensemble algorithm, based on rules and machine learning, to anaphora resolution in Russian. The ensemble combines formal rules, the Extra Trees machine learning algorithm, and Balance Cascade, an algorithm for learning from imbalanced training sets. The approach is complex in that it generates composite features from the rules and vectorizes the syntactic context, with context data obtained from mystem (Yandex), SyntaxNet (Google), and Word2Vec.
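As a rough illustration of how formal rules can be turned into features for a classifier such as Extra Trees, the sketch below encodes a few agreement and position checks for a pronoun and a candidate antecedent. The `Mention` class, the `rule_features` function, and the specific rules are hypothetical and are not taken from the paper; the morphological tags are assumed to come from a tagger such as mystem.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Mention:
    """A token with morphological tags (assumed to be supplied by a tagger)."""
    text: str
    position: int  # token index in the document
    gender: str    # "m", "f", "n", or "" if unknown
    number: str    # "sg" or "pl"


def rule_features(pronoun: Mention, candidate: Mention) -> List[int]:
    """Encode hypothetical rules as a numeric feature vector for one pair."""
    gender_known = bool(pronoun.gender and candidate.gender)
    return [
        # Rule 1: gender agreement (only when both genders are known)
        int(gender_known and pronoun.gender == candidate.gender),
        # Rule 2: number agreement
        int(pronoun.number == candidate.number),
        # Rule 3: candidate precedes the pronoun (anaphora rather than cataphora)
        int(candidate.position < pronoun.position),
        # Distance between the two mentions, in tokens
        abs(pronoun.position - candidate.position),
    ]


# Toy example: the pronoun "она" with two candidate antecedents.
pronoun = Mention("она", 5, "f", "sg")
candidates = [
    Mention("книга", 1, "f", "sg"),
    Mention("стол", 3, "m", "sg"),
]
vectors = [rule_features(pronoun, c) for c in candidates]
```

Vectors of this kind could then be concatenated with vectorized syntactic context (for example, Word2Vec embeddings of neighboring tokens) before training; that step is omitted here.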

Original language: English
Pages (from-to): 89-97
Number of pages: 9
Journal: Komp'juternaja Lingvistika i Intellektual'nye Tehnologii
Volume: 1
Issue number: 16
Publication status: Published - 2017

Keywords

  • Anaphora
  • Antecedent
  • Balance Cascade
  • Cataphora
  • Extra Trees
  • Imbalanced set
  • Machine learning
  • Random forest
  • SyntaxNet
  • Word2Vec

OECD FOS+WOS

  • 6.02 LANGUAGES AND LITERATURE
