Reinforcement Learning for Long-Term Reward Optimization in Recommender Systems

Anton Dorozhko

Research output: Publications in books, reports, collections, conference proceedings; article in conference proceedings; scientific; peer-reviewed

Abstract

Recommender systems help users navigate the vast space of goods, services, and events. A user interacts with the recommender engine through a sequence of exchanges of recommendations and user feedback. The idea that previous interactions influence later ones, and that the order of interactions matters, can be modeled using Markov decision processes and addressed with reinforcement learning. Several recent articles applying reinforcement learning to recommender systems have demonstrated the viability of this direction, but it is still difficult to compare different approaches. We propose an environment with a unified interface that makes it possible to compare different models of the recommendation process and different algorithms on the same underlying sequential data. We also performed an extensive parameter study of deep deterministic policy gradient methods on the well-known MovieLens dataset.

Original language: English
Title of host publication: SIBIRCON 2019 - International Multi-Conference on Engineering, Computer and Information Sciences, Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 862-867
Number of pages: 6
ISBN (electronic): 9781728144016
ISBN (print): 978-1-7281-4402-3
DOI
Status: Published - Oct 2019
Event: 2019 International Multi-Conference on Engineering, Computer and Information Sciences, SIBIRCON 2019 - Novosibirsk, Russian Federation
Duration: 21 Oct 2019 - 27 Oct 2019

Publication series

Name: SIBIRCON 2019 - International Multi-Conference on Engineering, Computer and Information Sciences, Proceedings

Conference

Conference: 2019 International Multi-Conference on Engineering, Computer and Information Sciences, SIBIRCON 2019
Country: Russian Federation
City: Novosibirsk
Period: 21.10.2019 - 27.10.2019
