Cascading Linear Submodular Bandits: Accounting for Position Bias and Diversity in Online Learning to Rank

Conference on Uncertainty in Artificial Intelligence (UAI)

Published July 22, 2019

Gaurush Hiranandani, Harvineet Singh, Prakhar Gupta, Iftikhar Ahamath Burhanuddin, Zheng Wen, Branislav Kveton

Online learning, position bias, and diversified retrieval are three crucial aspects in designing ranking systems based on user clicks. One simple click model which explains the position bias is the cascade model. Many online learning variants of the cascade model have been proposed, but none so far captures diversity. Similarly, online learning to rank methods which capture diversity do not handle position bias. Motivated by these limitations, we propose a novel click model, and the associated online learning variant to address both position bias and diversified retrieval in a unified framework. Despite the objective function not being a standard submodular set function, we prove that a greedy-approach is a reasonable approximation. We then propose, CascadeLSB, an algorithm whose goal is to recommend K most attractive items from a large set of L items with the attractiveness of items being modeled as a submodular utility function. We analyze the algorithm and prove a gap-free upper bound on its n-step regret. We comprehensively evaluate CascadeLSB on synthetic and real datasets, where it outperforms all the baselines.

Learn More