Reinforcement Learning for Long-term Reward Optimization in Recommender Systems

A. Dorozhko

Research output: Publications in books, reports, collections, conference proceedings › Conference paper › Research › Peer-reviewed

Abstract

Recommender systems help users navigate the vast space of goods, services, and events. A user interacts with the recommender engine through a sequence of exchanges of recommendations and user feedback. The idea that earlier interactions influence later ones, and that the order of interactions matters, can be modeled using Markov decision processes and solved by reinforcement learning. Several recent articles applying reinforcement learning to recommender systems have demonstrated the viability of this direction, but it is still difficult to compare different approaches. We propose an environment with a unified interface that makes it possible to compare different formulations of the recommendation process and different algorithms on the same underlying sequential data. We also performed an extensive parameter study for deep deterministic policy gradient methods on the well-known MovieLens dataset.
Original language: English
Title of host publication: SIBIRCON 2019 - International Multi-Conference on Engineering, Computer and Information Sciences, Proceedings
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Pages: 862-867
Number of pages: 6
ISBN (electronic): 978-1-7281-4401-6
ISBN (print): 978-1-7281-4402-3
DOI: https://doi.org/10.1109/SIBIRCON48586.2019.8958202
Publication status: Published - Oct 2019
Event: SIBIRCON 2019 International Multi-Conference - Novosibirsk, Russian Federation
Duration: 21 Oct 2019 to 27 Oct 2019
Conference number: 8
https://sibircon.ieeesiberia.org/

Publication series

Name: SIBIRCON 2019 - International Multi-Conference on Engineering, Computer and Information Sciences, Proceedings

Conference

Conference: SIBIRCON 2019 International Multi-Conference
Abbreviated title: SIBIRCON 2019
Country: Russian Federation
City: Novosibirsk
Period: 21.10.2019 to 27.10.2019
Internet address: https://sibircon.ieeesiberia.org/

Keywords

  • recommender systems
  • reinforcement learning
  • long-term value
  • deep reinforcement learning (DRL)
  • DDPG

  • Cite this

    Dorozhko, A. (2019). Reinforcement Learning for Long-term Reward Optimization in Recommender Systems. In SIBIRCON 2019 - International Multi-Conference on Engineering, Computer and Information Sciences, Proceedings (pp. 862-867). [8958202] (SIBIRCON 2019 - International Multi-Conference on Engineering, Computer and Information Sciences, Proceedings). IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC. https://doi.org/10.1109/SIBIRCON48586.2019.8958202