## Abstract

The following two strongly NP-hard problems are considered. In the first problem, we need to find in the given finite set of points in Euclidean space the subset of largest size. The sum of squared distances between the elements of this subset and its unknown centroid (geometrical center) must not exceed a given value. This value is defined as percentage of the sum of squared distances between the elements of the input set and its centroid. In the second problem, the input is a sequence (not a set) and we have some additional constraints on the indices of the elements of the chosen subsequence. The restriction on the sum of squared distances is the same as in the first problem. Both problems can be treated as data editing problems aimed to find similar elements and removal of extraneous (dissimilar) elements. We propose exact algorithms for the cases of both problems in which the input points have integer-valued coordinates. If the space dimension is bounded by some constant, our algorithms run in a pseudopolynomial time. Some results of numerical experiments illustrating the performance of the algorithms are presented.

Original language | English |
---|---|

Pages (from-to) | 157-168 |

Number of pages | 12 |

Journal | Annals of Mathematics and Artificial Intelligence |

Volume | 88 |

Issue number | 1-3 |

DOIs | |

Publication status | Published - 1 Mar 2020 |

## Keywords

- Euclidean space
- Exact algorithm
- Largest subset
- Longest subsequence
- Pseudopolynomial time
- Quadratic variation