Advanced Optimization (Fall 2023), Lecture 12. Stochastic Bandits

Regret Decomposition

• For stochastic MAB, a natural characterization of the arms:
  (i) Suboptimality gap: $\Delta_a = \mu(a^*) - \mu(a)$;
  (ii) Number of times arm $a$ is pulled in $T$ rounds: $n_T(a) = \sum_{t=1}^{T} \mathbb{1}\{a_t = a\}$.

• Regret can be reformulated as

$$\mathbb{E}[\mathrm{Regret}_T] = \max_{a \in [K]} \mathbb{E}\left[\sum_{t=1}^{T} \mu(a) - \sum_{t=1}^{T} \mu(a_t)\right] = \sum_{a \in [K]} \big(\mu(a^*) - \mu(a)\big)\, n_T(a) = \sum_{a \in [K]} \Delta_a\, n_T(a)$$
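The decomposition can be checked numerically. The sketch below is illustrative only and not part of the lecture: it assumes a hypothetical three-armed instance with made-up means and a uniformly random policy (the helper names `simulate_pulls`, `regret_per_round`, and `regret_by_decomposition` are invented here). It computes the regret once round by round, as $\sum_t (\mu(a^*) - \mu(a_t))$, and once arm by arm, as $\sum_a \Delta_a n_T(a)$, and confirms that the two agree for the same sequence of pulls.

```python
import math
import random


def simulate_pulls(means, horizon, seed=0):
    """Pick arms uniformly at random (an assumed placeholder policy)."""
    rng = random.Random(seed)
    return [rng.randrange(len(means)) for _ in range(horizon)]


def regret_per_round(means, pulls):
    """Sum over rounds t of mu(a*) - mu(a_t)."""
    best = max(means)
    return sum(best - means[a] for a in pulls)


def regret_by_decomposition(means, pulls):
    """Sum over arms a of Delta_a * n_T(a)."""
    best = max(means)
    counts = [0] * len(means)  # n_T(a): number of pulls of each arm
    for a in pulls:
        counts[a] += 1
    return sum((best - mu) * n for mu, n in zip(means, counts))


# Hypothetical instance: arm 0 is optimal, gaps are 0, 0.4 and 0.7.
means = [0.9, 0.5, 0.2]
pulls = simulate_pulls(means, horizon=1000)

lhs = regret_per_round(means, pulls)
rhs = regret_by_decomposition(means, pulls)
print(math.isclose(lhs, rhs))
```

The two quantities match for any sequence of pulls, because grouping the rounds by the arm chosen only reorders the same sum. This is why bounding the regret reduces to bounding how often each suboptimal arm is pulled.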