Named Entity Extraction from Semi-structured Data Using Machine Learning Algorithms

Madina Mansurova, Vladimir Barakhnin, Yerzhan Khibatkhanuly, Ilya Pastushkov

Research output: Publications in books, reports, collections, conference proceedings › Article in conference proceedings › Scientific › Peer-reviewed

Abstract

Modern society has witnessed how the intensive development of Internet technologies has led to an information explosion over recent decades. This explosion has manifested as exponential growth in data volume amid low-quality information. This paper provides detailed information about some intelligent tools that support decision making through automatic knowledge extraction. In the first part of the paper, we consider preprocessing that includes morphological analysis of texts. We then consider a model of text documents in the form of a hypergraph and an implementation of the random walk method to extract semantically close word pairs, in other words, pairs of words that often appear together. The result of the calculations is a matrix of word affinity coefficients relating each component of the vocabulary vector to every other component. In the second part, we describe the training of a neural network for extracting linguistic constructions. These include possible values of the descriptors of named entities in the text. The neural network makes it possible to retrieve information on one preselected descriptor, for example, location, with the final result being the names of geographical objects. In the general case, the neural network can retrieve information on several descriptors simultaneously.
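The hypergraph random walk mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the authors' exact formulation, so everything here is an assumption: each document is treated as one unweighted hyperedge over its distinct words, a walker on a word picks one of its documents uniformly and then one word of that document uniformly (possibly the same word), and the one-step transition probabilities serve as the affinity coefficients. The function names `build_hyperedges` and `affinity_matrix` and the sample documents are hypothetical.

```python
def build_hyperedges(docs):
    """Map tokenized documents to hyperedges over a sorted vocabulary.

    Returns (vocab, edges), where edges[e] lists the word indices that
    occur in document e. Assumption: one hyperedge per document.
    """
    vocab = sorted({word for doc in docs for word in doc})
    index = {word: i for i, word in enumerate(vocab)}
    edges = [sorted({index[word] for word in doc}) for doc in docs]
    return vocab, edges


def affinity_matrix(docs):
    """One-step random-walk transition matrix between vocabulary words.

    P[v][u] is the probability of moving from word v to word u by first
    choosing a document containing v uniformly at random, then choosing
    a word of that document uniformly at random. This unweighted scheme
    is an illustrative assumption, not the paper's stated method.
    """
    vocab, edges = build_hyperedges(docs)
    n = len(vocab)

    # For every word, collect the hyperedges (documents) that contain it.
    incident = [[] for _ in range(n)]
    for e, members in enumerate(edges):
        for v in members:
            incident[v].append(e)

    P = [[0.0] * n for _ in range(n)]
    for v in range(n):
        p_edge = 1.0 / len(incident[v])      # uniform choice of document
        for e in incident[v]:
            p_word = 1.0 / len(edges[e])     # uniform choice of word in it
            for u in edges[e]:
                P[v][u] += p_edge * p_word
    return vocab, P


# Hypothetical, already lemmatized documents.
docs = [
    ["almaty", "city", "kazakhstan"],
    ["almaty", "mountain"],
    ["city", "kazakhstan"],
]
vocab, P = affinity_matrix(docs)
```

In this toy corpus, "city" and "kazakhstan" share two documents while "city" and "mountain" share none, so the row for "city" gives a higher coefficient to "kazakhstan" than to "almaty" and zero to "mountain". Each row sums to 1 because it is a probability distribution over the next word.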

Original language: English
Title of host publication: Computational Collective Intelligence - 11th International Conference, ICCCI 2019, Proceedings
Editors: Ngoc Thanh Nguyen, Richard Chbeir, Ernesto Exposito, Philippe Aniorté, Bogdan Trawinski, Ngoc Thanh Nguyen
Publisher: Springer-Verlag GmbH and Co. KG
Pages: 58-69
Number of pages: 12
ISBN (print): 9783030283735
DOI: https://doi.org/10.1007/978-3-030-28374-2_6
Status: Published - 1 Jan 2019
Event: 11th International Conference on Computational Collective Intelligence, ICCCI 2019 - Hendaye, France
Duration: 4 Sep 2019 to 6 Sep 2019

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11684 LNAI
ISSN (print): 0302-9743
ISSN (electronic): 1611-3349

Conference

Conference: 11th International Conference on Computational Collective Intelligence, ICCCI 2019
Country: France
City: Hendaye
Period: 04.09.2019 to 06.09.2019

  • Cite this

    Mansurova, M., Barakhnin, V., Khibatkhanuly, Y., & Pastushkov, I. (2019). Named Entity Extraction from Semi-structured Data Using Machine Learning Algorithms. In N. T. Nguyen, R. Chbeir, E. Exposito, P. Aniorté, B. Trawinski, & N. T. Nguyen (Eds.), Computational Collective Intelligence - 11th International Conference, ICCCI 2019, Proceedings (pp. 58-69). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11684 LNAI). Springer-Verlag GmbH and Co. KG. https://doi.org/10.1007/978-3-030-28374-2_6