ICLR Poster signSGD via Zeroth-Order Oracle

Poster

signSGD via Zeroth-Order Oracle

Sijia Liu · Pin-Yu Chen · Xiangyi Chen · Mingyi Hong

Great Hall BC #18

Keywords: [ black-box adversarial attack ] [ zeroth-order algorithm ] [ nonconvex optimization ]

[ Abstract ]

Abstract: In this paper, we design and analyze a new zeroth-order (ZO) stochastic optimization algorithm, ZO-signSGD, which enjoys dual advantages of gradient-free operations and signSGD. The latter requires only the sign information of gradient estimates but is able to achieve a comparable or even better convergence speed than SGD-type algorithms. Our study shows that ZO signSGD requires $\sqrt{d}$ times more iterations than signSGD, leading to a convergence rate of $O(\sqrt{d}/\sqrt{T})$ under mild conditions, where $d$ is the number of optimization variables, and $T$ is the number of iterations. In addition, we analyze the effects of different types of gradient estimators on the convergence of ZO-signSGD, and propose two variants of ZO-signSGD that at least achieve $O(\sqrt{d}/\sqrt{T})$ convergence rate. On the application side we explore the connection between ZO-signSGD and black-box adversarial attacks in robust deep learning. Our empirical evaluations on image classification datasets MNIST and CIFAR-10 demonstrate the superior performance of ZO-signSGD on the generation of adversarial examples from black-box neural networks.

Live content is unavailable. Log in and register to view live content