Combined approach to problem of part-of-speech homonymy resolution in Russian texts

Tatiana Batura, Elena Bruches

Research output: Chapter in book, report, or conference proceedings › Conference contribution › Research › Peer-reviewed

2 Citations (Scopus)

Abstract

The Russian language has an inflectional structure and does not have a strict word order. This causes processing difficulties, such as part-of-speech homonymy, which is the subject of this article. Existing approaches to resolving morphological homonymy can be divided into the following groups: rule-based approaches, statistical approaches, machine learning approaches, and combined methods. In the paper, we show that each approach has its advantages and disadvantages; however, combining several approaches can significantly increase the precision of the algorithm. Moreover, the article analyzes the influence of certain features on morphological homonymy resolution. The precision of the proposed algorithm is sufficient for its use in intelligent text processing tasks, for example, in machine translation and summarization systems. The proposed method is successfully used in the geographic location system. The main remaining problem is distinguishing between function words (conjunctions, particles, prepositions, interjections). Solving this problem is one of the priorities for further work. We also plan to implement a system without a dictionary in order to better determine morphological features for unknown words.
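
As a rough illustration of what combining approaches can mean in practice, the sketch below applies a hand-written context rule first and falls back to a frequency-based choice when no rule fires. The lexicon, the counts, the rule, and all function names are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Optional

# Hypothetical toy lexicon: each ambiguous word form maps to its candidate
# part-of-speech tags with made-up counts (illustrative only, not real data).
LEXICON: Dict[str, Dict[str, int]] = {
    "печь": {"NOUN": 40, "VERB": 60},    # "stove" vs. "to bake"
    "мой": {"PRON": 90, "VERB": 10},     # "my" vs. "wash!" (imperative)
    "стекло": {"NOUN": 80, "VERB": 20},  # "glass" vs. "flowed down"
}
PREPOSITIONS = {"в", "на", "под", "за", "через"}


def rule_based(prev_word: Optional[str], candidates: Dict[str, int]) -> Optional[str]:
    """Assumed context rule: prefer the noun reading right after a preposition."""
    if prev_word in PREPOSITIONS and "NOUN" in candidates:
        return "NOUN"
    return None  # the rule does not apply


def statistical(candidates: Dict[str, int]) -> str:
    """Fallback: pick the tag with the highest count for this word form."""
    return max(candidates, key=candidates.get)


def disambiguate(tokens: List[str]) -> List[Optional[str]]:
    """Tag each token; words outside the toy lexicon get None."""
    tags: List[Optional[str]] = []
    prev: Optional[str] = None
    for token in tokens:
        word = token.lower()
        candidates = LEXICON.get(word)
        if candidates is None:
            tags.append(None)
        else:
            # Rule first, statistics only when the rule is silent.
            tags.append(rule_based(prev, candidates) or statistical(candidates))
        prev = word
    return tags


print(disambiguate(["печь"]))       # statistics alone decide
print(disambiguate(["в", "печь"]))  # the context rule overrides the counts
```

In this toy setup the rule and the statistics disagree on "печь" after a preposition, which is the kind of case where a combined method can outperform either component alone.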

Original language: English
Title of host publication: 2018 International Russian Automation Conference, RusAutoCon 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (electronic): 9781538649381
DOI
Publication status: Published - 19 Oct 2018
Event: 2018 International Russian Automation Conference, RusAutoCon 2018 - Sochi, Russian Federation
Duration: 9 Sep 2018 → 16 Sep 2018

Conference

Conference: 2018 International Russian Automation Conference, RusAutoCon 2018
Country/Territory: Russian Federation
City: Sochi
Period: 09.09.2018 → 16.09.2018
