TY - GEN

T1 - Exact Linear-Time Algorithm for Parameterized K-Means Problem with Optimized Number of Clusters in the 1D Case

AU - Kel’manov, Alexander

AU - Khandeev, Vladimir

N1 - Publisher Copyright:
© 2020, Springer Nature Switzerland AG.
Copyright:
Copyright 2020 Elsevier B.V., All rights reserved.

PY - 2020/1/1

Y1 - 2020/1/1

N2 - We consider a well-known strongly NP-hard K-Means problem. In this problem, one needs to partition a finite set of N points in Euclidean space into K non-empty clusters minimizing the sum over all clusters of the intracluster sums of the squared distances between the elements of each cluster and its centers. The cluster’s center is defined as the centroid (geometrical center). We analyze the polynomial-solvable one-dimensional case of the problem and propose a novel parameterized approach to this case. Within the framework of this approach, we, firstly, introduce a new parameterized formulation of the problem for this case and, secondly, we show that our approach and proposed algorithm allows one to find an optimal input data partition and, contrary to existing approaches and algorithms, simultaneously find an optimal clusters number in (formula presented) time.

AB - We consider a well-known strongly NP-hard K-Means problem. In this problem, one needs to partition a finite set of N points in Euclidean space into K non-empty clusters minimizing the sum over all clusters of the intracluster sums of the squared distances between the elements of each cluster and its centers. The cluster’s center is defined as the centroid (geometrical center). We analyze the polynomial-solvable one-dimensional case of the problem and propose a novel parameterized approach to this case. Within the framework of this approach, we, firstly, introduce a new parameterized formulation of the problem for this case and, secondly, we show that our approach and proposed algorithm allows one to find an optimal input data partition and, contrary to existing approaches and algorithms, simultaneously find an optimal clusters number in (formula presented) time.

KW - K-Means

KW - Linear-time algorithm

KW - One-dimensional case

KW - Parameterized approach

UR - http://www.scopus.com/inward/record.url?scp=85080871796&partnerID=8YFLogxK

U2 - 10.1007/978-3-030-40616-5_35

DO - 10.1007/978-3-030-40616-5_35

M3 - Conference contribution

AN - SCOPUS:85080871796

SN - 9783030406158

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 394

EP - 399

BT - Numerical Computations

A2 - Sergeyev, Yaroslav D.

A2 - Kvasov, Dmitri E.

PB - Springer Gabler

T2 - 3rd Triennial International Conference and Summer School on Numerical Computations: Theory and Algorithms, NUMTA 2019

Y2 - 15 June 2019 through 21 June 2019

ER -