TY - GEN
T1 - Regression analysis with cluster ensemble and kernel function
AU - Berikov, Vladimir
AU - Vinogradova, Taisiya
PY - 2018/1/1
Y1 - 2018/1/1
N2 - In this paper, we consider the semi-supervised regression problem. The proposed method consists of two steps. In the first step, a number of clustering partitions are obtained by running a clustering algorithm on both labeled and unlabeled data. A weighted co-association matrix is then calculated from the resulting partitions. This matrix is known to satisfy Mercer’s condition, so it can be used as a kernel in a kernel-based regression algorithm. In the second step, we use the obtained matrix as a kernel to construct the decision function from the labeled data. Using a probabilistic model, we prove that the probability that the error is significant converges to its minimum possible value as the number of elements in the cluster ensemble tends to infinity. The output of the method on a real data set is compared with the results of popular regression methods that use a standard kernel and have all the data labeled. Under noisy conditions, the proposed method showed higher quality than the support vector regression algorithm with a standard kernel.
AB - In this paper, we consider the semi-supervised regression problem. The proposed method consists of two steps. In the first step, a number of clustering partitions are obtained by running a clustering algorithm on both labeled and unlabeled data. A weighted co-association matrix is then calculated from the resulting partitions. This matrix is known to satisfy Mercer’s condition, so it can be used as a kernel in a kernel-based regression algorithm. In the second step, we use the obtained matrix as a kernel to construct the decision function from the labeled data. Using a probabilistic model, we prove that the probability that the error is significant converges to its minimum possible value as the number of elements in the cluster ensemble tends to infinity. The output of the method on a real data set is compared with the results of popular regression methods that use a standard kernel and have all the data labeled. Under noisy conditions, the proposed method showed higher quality than the support vector regression algorithm with a standard kernel.
KW - Cluster analysis
KW - Ensemble clustering
KW - Kernel methods
KW - Regression analysis
UR - http://www.scopus.com/inward/record.url?scp=85059931171&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-11027-7_21
DO - 10.1007/978-3-030-11027-7_21
M3 - Conference contribution
AN - SCOPUS:85059931171
SN - 9783030110260
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 211
EP - 220
BT - Analysis of Images, Social Networks and Texts - 7th International Conference, AIST 2018, Revised Selected Papers
A2 - Panchenko, Alexander
A2 - van der Aalst, Wil M.
A2 - Khachay, Michael
A2 - Pardalos, Panos M.
A2 - Batagelj, Vladimir
A2 - Loukachevitch, Natalia
A2 - Glavaš, Goran
A2 - Ignatov, Dmitry I.
A2 - Kuznetsov, Sergei O.
A2 - Koltsova, Olessia
A2 - Lomazova, Irina A.
A2 - Savchenko, Andrey V.
A2 - Napoli, Amedeo
A2 - Pelillo, Marcello
PB - Springer-Verlag GmbH and Co. KG
T2 - 7th International Conference on Analysis of Images, Social Networks and Texts, AIST 2018
Y2 - 5 July 2018 through 7 July 2018
ER -