Data di Pubblicazione:
2021
Citazione:
Beyond Bandit Feedback in Online Multiclass Classification / D. van der Hoeven, F. Fusco, N. Cesa Bianchi (ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS). - In: Advances in Neural Information Processing Systems / [a cura di] M. Ranzato, A. Beygelzimer, Y. Dauphin, P.S. Liang, J. Wortman Vaughan. - [s.l] : Curran Associates, 2021. - ISBN 9781713845393. - pp. 13280-13291 (( Intervento presentato al 34. convegno Neural Information Processing Systems tenutosi a virtual nel 2021.
Abstract:
We study the problem of online multiclass classification in a setting where the
learner’s feedback is determined by an arbitrary directed graph. While including
bandit feedback as a special case, feedback graphs allow a much richer set of
applications, including filtering and label efficient classification. We introduce
GAPPLETRON, the first online multiclass algorithm that works with arbitrary feed-
back graphs. For this new algorithm, we prove surrogate regret bounds that hold,
both in expectation and with high probability, for a large class of surrogate losses.
Our bounds are of order B√ρKT , where B is the diameter of the prediction space,
K is the number of classes, T is the time horizon, and ρ is the domination number
(a graph-theoretic parameter affecting the amount of exploration). In the full in-
formation case, we show that GAPPLETRON achieves a constant surrogate regret
of order B2K. We also prove a general lower bound of order max {B2K, √T }
showing that our upper bounds are not significantly improvable. Experiments on
synthetic data show that for various feedback graphs our algorithm is competitive
against known baselines.
learner’s feedback is determined by an arbitrary directed graph. While including
bandit feedback as a special case, feedback graphs allow a much richer set of
applications, including filtering and label efficient classification. We introduce
GAPPLETRON, the first online multiclass algorithm that works with arbitrary feed-
back graphs. For this new algorithm, we prove surrogate regret bounds that hold,
both in expectation and with high probability, for a large class of surrogate losses.
Our bounds are of order B√ρKT , where B is the diameter of the prediction space,
K is the number of classes, T is the time horizon, and ρ is the domination number
(a graph-theoretic parameter affecting the amount of exploration). In the full in-
formation case, we show that GAPPLETRON achieves a constant surrogate regret
of order B2K. We also prove a general lower bound of order max {B2K, √T }
showing that our upper bounds are not significantly improvable. Experiments on
synthetic data show that for various feedback graphs our algorithm is competitive
against known baselines.
Tipologia IRIS:
03 - Contributo in volume
Elenco autori:
D. van der Hoeven, F. Fusco, N. Cesa Bianchi
Link alla scheda completa:
Link al Full Text:
Titolo del libro:
Advances in Neural Information Processing Systems