Abstract
This paper considers the application of an ensemble algorithm based on rules and machine learning to anaphora resolution in Russian. The ensemble combines formal rules, the Extra Trees machine learning algorithm, and Balance Cascade, an algorithm for learning from imbalanced training sets. The complexity of the approach lies in generating complex features from the rules and in vectorizing the syntactic context, with context data obtained from mystem (Yandex), SyntaxNet (Google), and Word2Vec.
Original language | English |
---|---|
Pages (from-to) | 89-97 |
Number of pages | 9 |
Journal | Komp'juternaja Lingvistika i Intellektual'nye Tehnologii |
Volume | 1 |
Issue number | 16 |
Status | Published - 2017 |