**Branislav Kveton** is a machine learning scientist at Adobe Research in San Jose. He was at Technicolor’s Research Center from 2011 to 2014, and at Intel Research from 2006 to 2011. Before 2006, he was a graduate student in the Intelligent Systems Program at the University of Pittsburgh. His advisor was Milos Hauskrecht.

He proposes, analyzes, and applies algorithms that learn incrementally, run in real time, and converge to near-optimal solutions as the number of training examples increases. Most of his recent work focuses on online learning in structured problems, such as those involving graphs, submodularity, matroids, polymatroids, and reinforcement learning.

Practical problems are often so massive that even low-order polynomial-time solutions are impractical. Fortunately, many optimization problems can be solved greedily, either optimally or suboptimally with guarantees. Two popular examples are finding the maximum of a modular function on a matroid and finding the maximum of a submodular function subject to a cardinality constraint. Recently, he proposed several algorithms for solving problems of this kind when the model of the problem is initially unknown or imperfect and is learned by interacting repeatedly with the environment. These algorithms can solve many interesting real-world problems, such as learning near-optimal preference elicitation policies by repeatedly eliciting preferences, and learning optimal policies for network routing from repeated rerouting.
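
As a rough illustration of the greedy approach mentioned above, the following sketch maximizes a submodular function subject to a cardinality constraint in the offline case, where the function is fully known. It is not taken from any of the papers listed here: the function name `greedy_max`, the coverage objective, and the toy data are all assumptions chosen for the example. For a monotone submodular objective such as coverage, this kind of greedy selection is known to reach at least a (1 - 1/e) fraction of the optimal value.

```python
from typing import Callable, FrozenSet, Iterable, List


def greedy_max(
    ground_set: Iterable[str],
    f: Callable[[FrozenSet[str]], float],
    k: int,
) -> List[str]:
    """Greedily pick at most k items, each time adding the item with the
    largest marginal gain in f. Hypothetical helper for illustration."""
    chosen: List[str] = []
    remaining = set(ground_set)
    for _ in range(min(k, len(remaining))):
        current = f(frozenset(chosen))
        best_item, best_gain = None, 0.0
        # Iterate in sorted order so that ties are broken deterministically.
        for item in sorted(remaining):
            gain = f(frozenset(chosen) | {item}) - current
            if gain > best_gain:
                best_item, best_gain = item, gain
        if best_item is None:
            break  # no remaining item improves the objective
        chosen.append(best_item)
        remaining.remove(best_item)
    return chosen


# Toy coverage problem (assumed data): each item covers a set of elements,
# and the objective is the number of distinct elements covered.
COVERS = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {4, 5, 6, 7},
    "d": {1, 2},
}


def coverage(items: FrozenSet[str]) -> float:
    covered = set()
    for item in items:
        covered |= COVERS[item]
    return float(len(covered))


selected = greedy_max(COVERS.keys(), coverage, k=2)
print(selected, coverage(frozenset(selected)))
```

In the online setting described above, the objective is not known in advance, so a learning algorithm would replace `f` with estimates that are refined after each interaction with the environment. The sketch only shows the greedy step that such an algorithm would build on.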

Bernoulli Rank-1 Bandits for Click Feedback

Model-Independent Online Learning for Influence Maximization

Online Learning to Rank in Stochastic Click Models

Get to the Bottom: Causal Analysis for User Modeling

Does Weather Matter? Causal Analysis of TV Logs

Minimal Interaction Content Discovery in Recommender Systems

Practical Linear Models for Large-Scale One-Class Collaborative Filtering

Cascading Bandits for Large-Scale Recommendation Problems

DCM Bandits: Learning to Rank with Multiple Clicks

Combinatorial Cascading Bandits

Efficient Thompson Sampling for Online Matrix-Factorization Recommendation

Optimal Greedy Diversity for Recommendation

Cascading Bandits: Learning to Rank in the Cascade Model

Efficient Learning in Large-Scale Combinatorial Semi-Bandits

Tight Regret Bounds for Stochastic Combinatorial Semi-Bandits

Minimal Interaction Search in Recommender Systems

Structured Kernel-Based Reinforcement Learning

Kernel-Based Reinforcement Learning on Representative States