Named Entity Extraction from Semi-structured Data Using Machine Learning Algorithms

Madina Mansurova, Vladimir Barakhnin, Yerzhan Khibatkhanuly, Ilya Pastushkov

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Research › peer-review

2 Citations (Scopus)


Modern society has witnessed an information explosion over recent decades, driven by the intensive development of Internet technologies. This explosion has taken the form of exponential growth in data volume amid low-quality information. This paper describes intelligent tools that support decision making through automatic knowledge extraction. In the first part of the paper, we consider preprocessing, which includes morphological analysis of texts. We then consider a model of text documents in the form of a hypergraph and an implementation of the random walk method to extract semantically close word pairs, in other words, pairs that often appear together. The result of the calculations is a matrix of affinity coefficients between each word and every other component of the vocabulary vector. In the second part, we describe the training of a neural network for extracting linguistic constructions, which include possible values of the descriptors of named entities in text. The neural network can retrieve information on one preselected descriptor, for example location, yielding the names of geographical objects as the final result. In the general case, the neural network can retrieve information on several descriptors simultaneously.
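The hypergraph random walk mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the construction, so everything here is an assumption made for illustration: each document is treated as one hyperedge over its distinct words, a step of the walk picks a document containing the current word uniformly at random and then a word from that document uniformly at random, and the affinity coefficient is taken to be the probability of reaching one word from another after a fixed number of steps. The function name `hypergraph_affinity` and the `steps` parameter are hypothetical.

```python
import numpy as np


def hypergraph_affinity(docs, steps=2):
    """Return (vocab, A), where A[i, j] is the probability that a random
    walk starting at word i sits at word j after `steps` steps.

    Assumption: each non-empty document is one hyperedge over its words.
    """
    docs = [set(doc) for doc in docs if doc]  # drop empty documents
    vocab = sorted({w for doc in docs for w in doc})
    index = {w: i for i, w in enumerate(vocab)}

    # Incidence matrix H: rows are words (vertices), columns are
    # documents (hyperedges); H[i, j] = 1 if word i occurs in document j.
    H = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(docs):
        for w in doc:
            H[index[w], j] = 1.0

    # One step of the walk: choose a hyperedge containing the current
    # word uniformly, then choose a word inside that hyperedge uniformly
    # (the walk may stay on the same word).
    word_to_edge = H / H.sum(axis=1, keepdims=True)      # |V| x |E|
    edge_to_word = H.T / H.sum(axis=0, keepdims=True).T  # |E| x |V|
    P = word_to_edge @ edge_to_word                      # row-stochastic

    return vocab, np.linalg.matrix_power(P, steps)


if __name__ == "__main__":
    # Toy documents, already tokenized and lemmatized (illustrative only).
    docs = [
        ["almaty", "city", "kazakhstan"],
        ["astana", "city", "kazakhstan"],
        ["river", "irtysh"],
    ]
    vocab, A = hypergraph_affinity(docs)
    i = vocab.index("almaty")
    for j in np.argsort(-A[i]):
        print(f"{vocab[j]:12s} {A[i, j]:.3f}")
```

In this sketch, words that share documents receive non-zero affinity, while words in disconnected parts of the hypergraph receive zero. A real pipeline would first apply the morphological preprocessing the abstract describes, and might weight hyperedges or exclude self-transitions.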

Original language: English
Title of host publication: Computational Collective Intelligence - 11th International Conference, ICCCI 2019, Proceedings
Editors: Ngoc Thanh Nguyen, Richard Chbeir, Ernesto Exposito, Philippe Aniorté, Bogdan Trawinski
Publisher: Springer-Verlag GmbH and Co. KG
Number of pages: 12
ISBN (Print): 9783030283735
Publication status: Published - 1 Jan 2019
Event: 11th International Conference on Computational Collective Intelligence, ICCCI 2019 - Hendaye, France
Duration: 4 Sep 2019 to 6 Sep 2019

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11684 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: 11th International Conference on Computational Collective Intelligence, ICCCI 2019


  • Entity extraction
  • Machine learning algorithms
  • Neural networks
  • Random walk method
  • Semi-structured data

