

Virtual presentation / poster accept

Learning Kernelized Contextual Bandits in a Distributed and Asynchronous Environment

Chuanhao Li · Huazheng Wang · Mengdi Wang · Hongning Wang

Keywords: [ Theory ] [ contextual bandit ] [ kernelized method ] [ communication efficiency ] [ asynchronous distributed learning ]


Abstract:

Despite the recent advances in communication-efficient distributed bandit learning, most existing solutions are restricted to parametric models, e.g., linear bandits and generalized linear bandits (GLB). In comparison, kernel bandits, which search for non-parametric functions in a reproducing kernel Hilbert space (RKHS), offer higher modeling capacity. However, the only existing work on distributed kernel bandits adopts a synchronous communication protocol, which greatly limits its practical use (e.g., every synchronization step requires all clients to participate and wait for data exchange). In this paper, to improve robustness against the delays and client unavailability that are common in practice, we propose the first asynchronous solution based on approximated kernel regression for distributed kernel bandit learning. A set of effective treatments is developed to ensure approximation quality and communication efficiency. Rigorous theoretical analysis of the regret and communication cost is provided, and extensive empirical evaluations demonstrate the effectiveness of our solution.
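
As a rough illustration of what "approximated kernel regression" can look like, the sketch below compares exact kernel ridge regression with a Nyström-style approximation that projects data onto a small dictionary of points. This is an assumption made for illustration only: the abstract does not name the approximation scheme, and the function names, the RBF kernel, and the parameters `lam`, `gamma`, and `jitter` are all hypothetical choices, not the authors' algorithm.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Pairwise RBF kernel exp(-gamma * ||a - b||^2) between rows of A and B.
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def exact_krr_predict(X, y, X_new, lam=0.1, gamma=1.0):
    # Exact kernel ridge regression: solve an n x n system in the dual.
    K = rbf_kernel(X, X, gamma)
    alpha = np.linalg.solve(K + lam * np.eye(len(X)), y)
    return rbf_kernel(X_new, X, gamma) @ alpha

def nystrom_features(X, Z, gamma=1.0, jitter=1e-8):
    # Map X to m-dimensional features using dictionary Z (m points), so that
    # features @ features.T approximates K(X, Z) K(Z, Z)^{-1} K(Z, X).
    K_zz = rbf_kernel(Z, Z, gamma) + jitter * np.eye(len(Z))
    w, V = np.linalg.eigh(K_zz)
    w = np.maximum(w, jitter)  # guard against tiny or negative eigenvalues
    return rbf_kernel(X, Z, gamma) @ (V / np.sqrt(w))

def approx_krr_predict(X, y, X_new, Z, lam=0.1, gamma=1.0):
    # Ridge regression in the m-dimensional Nystrom feature space:
    # only an m x m system is solved, and only m-dimensional statistics
    # (Phi^T Phi and Phi^T y) would need to be shared between parties.
    Phi = nystrom_features(X, Z, gamma)
    A = Phi.T @ Phi + lam * np.eye(Phi.shape[1])
    theta = np.linalg.solve(A, Phi.T @ y)
    return nystrom_features(X_new, Z, gamma) @ theta
```

When the dictionary `Z` contains every training point, the approximation coincides with exact kernel ridge regression on the training inputs up to the small `jitter` term; a smaller dictionary trades accuracy for lower computation and, in a distributed setting, smaller messages.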
