Expertise & Skills

Best-of-Both-Worlds Algorithms for Linear Contextual Bandits

Conference proceedings contribution
Publication date:
2024
Citation:
Best-of-Both-Worlds Algorithms for Linear Contextual Bandits / Y. Kuroki, A. Rumi, T. Tsuchiya, F. Vitale, N. Cesa Bianchi (PROCEEDINGS OF MACHINE LEARNING RESEARCH). - In: Proceedings of The International Conference on Artificial Intelligence and Statistics / [edited by] S. Dasgupta, S. Mandt, Y. Li. - [s.l] : ML Research Press, 2024. - pp. 1216-1224 (Paper presented at the 27th International Conference on Artificial Intelligence and Statistics, held in Valencia in 2024).
Abstract:
We study best-of-both-worlds algorithms for $K$-armed linear contextual bandits. Our algorithms deliver near-optimal regret bounds in both the adversarial and stochastic regimes, without prior knowledge about the environment. In the stochastic regime, we achieve the polylogarithmic rate $\frac{(dK)^2\mathrm{poly}\!\log(dKT)}{\Delta_{\min}}$, where $\Delta_{\min}$ is the minimum suboptimality gap over the $d$-dimensional context space. In the adversarial regime, we obtain either the first-order $\widetilde{\mathcal{O}}(dK\sqrt{L^*})$ bound, or the second-order $\widetilde{\mathcal{O}}(dK\sqrt{\Lambda^*})$ bound, where $L^*$ is the cumulative loss of the best action and $\Lambda^*$ is a notion of the cumulative second moment for the losses incurred by the algorithm. Moreover, we develop an algorithm based on FTRL with Shannon entropy regularizer that does not require the knowledge of the inverse of the covariance matrix, and achieves a polylogarithmic regret in the stochastic regime while obtaining $\widetilde{\mathcal{O}}\big(dK\sqrt{T}\big)$ regret bounds in the adversarial regime.
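For readers unfamiliar with the FTRL with Shannon entropy regularizer mentioned in the abstract, the following is a minimal sketch of the generic textbook update over the probability simplex, where minimizing $\langle p, \hat{L} \rangle + \frac{1}{\eta} \sum_i p_i \log p_i$ has the closed form $p_i \propto \exp(-\eta \hat{L}_i)$. It is not the paper's algorithm: the function name, the fixed learning rate, and the example loss values are all illustrative assumptions, and the context-dependent loss estimators and learning-rate schedule used by the authors are not reproduced here.

```python
import math


def ftrl_shannon_entropy(cum_loss_estimates, learning_rate):
    """Hypothetical helper: FTRL step with a negative Shannon entropy regularizer.

    Returns the minimizer over the simplex of
        <p, L> + (1 / learning_rate) * sum_i p_i * log(p_i),
    which is the exponential-weights distribution p_i ~ exp(-learning_rate * L_i).
    """
    # Shift by the minimum loss for numerical stability; the normalized
    # distribution is unchanged by adding a constant to every loss.
    smallest = min(cum_loss_estimates)
    weights = [
        math.exp(-learning_rate * (loss - smallest))
        for loss in cum_loss_estimates
    ]
    total = sum(weights)
    return [w / total for w in weights]


# Illustrative cumulative loss estimates for K = 3 arms (made-up values).
probs = ftrl_shannon_entropy([3.0, 1.0, 2.0], learning_rate=0.5)
uniform = ftrl_shannon_entropy([1.0, 1.0, 1.0], learning_rate=0.5)
```

In this sketch the arm with the smallest cumulative loss estimate receives the largest probability, and equal loss estimates give the uniform distribution.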
IRIS type:
03 - Contribution in volume
Author list:
Y. Kuroki, A. Rumi, T. Tsuchiya, F. Vitale, N. Cesa Bianchi
University authors:
CESA BIANCHI NICOLO' ANTONIO (author)
RUMI ALBERTO (author)
Link to the full record:
https://air.unimi.it/handle/2434/1122718
Link to the full text:
https://air.unimi.it/retrieve/handle/2434/1122718/2603185/kuroki24a.pdf
Book title:
Proceedings of The International Conference on Artificial Intelligence and Statistics
Project:
Learning in Markets and Society
Research Areas

Sectors

Sector INFO-01/A - Computer Science