Deep Reinforcement Learning with a Combinatorial Action Space for Predicting Popular Reddit Threads

Ji He¹, Mari Ostendorf¹, Xiaodong He², Jianshu Chen², Jianfeng Gao², Lihong Li², Li Deng²
¹University of Washington, ²Microsoft Research


Abstract

We introduce an online popularity prediction and tracking task as a benchmark for reinforcement learning with a combinatorial, natural language action space. A specified number of discussion threads predicted to be popular are recommended for tracking, chosen from a fixed window of recent comments. We study novel deep reinforcement learning architectures for effectively modeling the value function associated with actions composed of interdependent sub-actions. The proposed model, which represents the dependence between sub-actions through a bi-directional LSTM, gives the best performance across different experimental configurations and domains, and it also generalizes well to varying numbers of recommendation requests.
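To give a sense of why the action space is called combinatorial, the following minimal sketch counts the ways of recommending K threads from a window of N candidates. It assumes that an action is an unordered choice of K distinct candidates. The candidate labels, the function name `enumerate_actions`, and the values of N and K are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import comb


def enumerate_actions(candidates, k):
    """Return every unordered choice of k sub-actions from the candidates.

    Assumption for illustration: an action is a set of k distinct candidates.
    """
    return list(combinations(candidates, k))


# Hypothetical window of N = 5 candidate comments, recommending K = 2 of them.
candidates = ["c1", "c2", "c3", "c4", "c5"]
actions = enumerate_actions(candidates, 2)

print(len(actions))  # 10 actions for N = 5, K = 2
print(comb(10, 3))   # 120 actions for N = 10, K = 3
```

Under this assumption the number of actions is the binomial coefficient C(N, K), which grows quickly with N and K, so treating every combination as a separate action soon becomes impractical.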