О ГЕОГРАФИЧЕСКОЙ ПРИВЯЗКЕ КОНТЕНТА ТЕКСТОВЫХ ДОКУМЕНТОВ

Translated title of the contribution: On geographical binding of the content of text documents

Oleg L. Zhizhimov, Yulia V. Leonova

Research output: Contribution to journal, Conference article, peer-review

Abstract

Extracting geographical names from arbitrary text documents is important when processing large document collections and linking their content to a specific geographic region. In its simplest form, a model for extracting geographical names from text is a sequence of operations on the text, with each stage solving its own task. These tasks include parsing the text, analyzing text elements, processing synonyms and abbreviations, reducing text elements to their normal form from the possible word forms and grammar rules, comparing text elements with entries in dictionaries of geographical names, and adding special tags to the text so that geographical names are identified unambiguously. This paper describes a technology that implements these tasks on top of the freely distributed PostgreSQL DBMS. A standard configuration is used, and all server-side settings are made through documented procedures. The GeoNames Gazetteer database, OpenStreetMap (OSM) databases, and the OKATO and KLADR classifiers serve as the authoritative sources of geographical names.
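The stages listed in the abstract can be illustrated with a minimal sketch. It is not the authors' PostgreSQL-based implementation: it is a plain Python stand-in, and the toy gazetteer, the synonym table, the `<geo>` tag format, and the function names are all hypothetical. Normalization is reduced to lowercasing, where a real system would apply morphological rules.

```python
import re

# Hypothetical toy gazetteer: normalized name -> identifier.
# The identifiers are made up; they are not real GeoNames/OSM/OKATO/KLADR codes.
GAZETTEER = {
    "novosibirsk": "geo:1",
    "berdsk": "geo:2",
    "saint petersburg": "geo:3",
}

# Hypothetical synonym/abbreviation table, keyed by normalized token sequences.
SYNONYMS = {
    "st petersburg": "saint petersburg",
    "spb": "saint petersburg",
}

# Longest gazetteer or synonym entry, in words, bounds the n-gram search.
MAX_WORDS = max(len(k.split()) for k in list(GAZETTEER) + list(SYNONYMS))


def normalize(token: str) -> str:
    """Stand-in for morphological normalization: lowercase only."""
    return token.lower()


def tag_place_names(text: str) -> str:
    """Wrap gazetteer matches in <geo id="..."> tags (assumed tag format)."""
    # Stage 1: parse the text into word tokens, keeping character offsets.
    tokens = [(m.start(), m.end(), normalize(m.group()))
              for m in re.finditer(r"\w+", text)]
    out = []
    pos = 0  # end of the part of `text` already copied to `out`
    i = 0
    while i < len(tokens):
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(MAX_WORDS, len(tokens) - i), 0, -1):
            candidate = " ".join(t[2] for t in tokens[i:i + n])
            # Stage 2: resolve synonyms and abbreviations.
            candidate = SYNONYMS.get(candidate, candidate)
            # Stage 3: compare with the dictionary of geographical names.
            geo_id = GAZETTEER.get(candidate)
            if geo_id is not None:
                start, end = tokens[i][0], tokens[i + n - 1][1]
                # Stage 4: add a tag around the original surface form.
                out.append(text[pos:start])
                out.append(f'<geo id="{geo_id}">{text[start:end]}</geo>')
                pos = end
                i += n
                break
        else:
            i += 1  # no match starts at this token
    out.append(text[pos:])
    return "".join(out)


print(tag_place_names("The conference was held in Berdsk, near Novosibirsk."))
```

Matching the longest phrase first keeps multi-word names such as "Saint Petersburg" from being split into shorter partial matches. In a database-backed setup the dictionary lookup would be a query against gazetteer tables rather than an in-memory dict.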

Original language: Russian
Pages (from-to): 241-247
Number of pages: 7
Journal: CEUR Workshop Proceedings
Volume: 2534
Publication status: Published - 1 Jan 2019
Event: 2019 All-Russian Conference "Spatial Data Processing for Monitoring of Natural and Anthropogenic Processes", SDM 2019 - Berdsk, Russian Federation
Duration: 26 Aug 2019 to 30 Aug 2019
