Skip to Main Content (Press Enter)

Logo UNIMI
  • ×
  • Home
  • People
  • Projects
  • Fields
  • Units
  • Outputs
  • Third Mission

Expertise & Skills
Logo UNIMI

|

Expertise & Skills

unimi.it
  • ×
  • Home
  • People
  • Projects
  • Fields
  • Units
  • Outputs
  • Third Mission
  1. Outputs

Linear Bandits with Memory

Academic Article
Publication Date:
2024
Citation:
Linear Bandits with Memory / G. Clerici, P. Laforgue, N. Cesa Bianchi. - In: TRANSACTIONS ON MACHINE LEARNING RESEARCH. - ISSN 2835-8856. - 5(2024), pp. 1-26.
abstract:
Nonstationary phenomena, such as satiation effects in recommendations, have mostly been modeled using bandits with finitely many arms. However, the richer action space provided by linear bandits is often preferred in practice. In this work, we introduce a novel nonstationary linear bandit model, where current rewards are influenced by the learner’s past actions in a fixed-size window. Our model, which recovers stationary linear bandits as a special case, leverages two parameters: the window size m ≥ 0, and an exponent γ that captures the rotting (γ < 0) or rising (γ > 0) nature of the phenomenon. When both m and γ are known, we propose and analyze a variant of OFUL which minimizes regret against cyclic policies. By choosing the cycle length so as to trade-off approximation and estimation errors, we then prove a bound of order √d (m + 1) 1 2 +max{γ,0} T 3/4 (ignoring log factors) on the regret against the optimal sequence of actions, where T is the horizon and d is the dimension of the linear action space. Through a bandit model selection approach, our results are then extended to the case where both m and γ are unknown. Finally, we complement our theoretical results with experiments comparing our approach to natural baselines
IRIS type:
01 - Articolo su periodico
List of contributors:
G. Clerici, P. Laforgue, N. Cesa Bianchi
Authors of the University:
CESA BIANCHI NICOLO' ANTONIO ( author )
Link to information sheet:
https://air.unimi.it/handle/2434/1087071
Full Text:
https://air.unimi.it/retrieve/handle/2434/1087071/2504165/2053_Linear_Bandits_with_Memor.pdf
Project:
Learning in Markets and Society
  • Research Areas

Research Areas

Concepts (2)


Settore INF/01 - Informatica

Settore INFO-01/A - Informatica
  • Guide
  • Help
  • Accessibility
  • Privacy
  • Use of cookies
  • Legal notices

Powered by VIVO | Designed by Cineca | 26.4.3.0