Advanced Optimization (Fall 2023), Lecture 11. Adversarial Bandits

Online Convex Optimization

At each round $t = 1, 2, \ldots$
(1) the player first picks a model $x_t$ from a convex set $\mathcal{X}$;
(2) the environment picks an online convex function $f_t : \mathcal{X} \to \mathbb{R}$;
(3) the player suffers loss $f_t(x_t)$, observes some information about $f_t$, and updates the model.

Problems are distinguished by the feedback information:
- full information: the player observes the entire $f_t$ (or at least the gradient $\nabla f_t(x_t)$). Example: horse racing.
- partial information (bandits): the player observes only the function value $f_t(x_t)$, which is less information. Example: multi-armed bandits.
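The three-step protocol and the two feedback models can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the quadratic loss $f_t(x) = \|x - \theta_t\|^2$, the unit ball as the convex set, and the names `project_to_ball`, `play_round`, and `run_full_information` are all hypothetical and do not come from the slide. The update rule used here is plain projected online gradient descent, chosen as a familiar full-information method; the slide itself does not prescribe an algorithm.

```python
import numpy as np


def project_to_ball(x, radius=1.0):
    """Euclidean projection onto the convex set {x : ||x|| <= radius}."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)


def play_round(x_t, theta_t, feedback):
    """One round with the assumed loss f_t(x) = ||x - theta_t||^2.

    Returns the loss suffered and the information revealed to the player.
    """
    loss = float(np.sum((x_t - theta_t) ** 2))  # step (3): suffer f_t(x_t)
    if feedback == "full":
        # Full information: at least the gradient of f_t at x_t is revealed.
        return loss, {"value": loss, "gradient": 2.0 * (x_t - theta_t)}
    if feedback == "bandit":
        # Partial information: only the scalar f_t(x_t) is revealed.
        return loss, {"value": loss}
    raise ValueError("feedback must be 'full' or 'bandit'")


def run_full_information(thetas, step_size=0.1):
    """Projected online gradient descent under full-information feedback."""
    x = np.zeros(thetas.shape[1])  # step (1): initial model in the unit ball
    total_loss = 0.0
    for theta in thetas:  # step (2): the environment fixes f_t via theta_t
        loss, info = play_round(x, theta, "full")
        total_loss += loss
        # Update the model with the revealed gradient, then project back.
        x = project_to_ball(x - step_size * info["gradient"])
    return total_loss, x


if __name__ == "__main__":
    thetas = np.tile(np.array([0.5, 0.0]), (200, 1))
    total, x_final = run_full_information(thetas)
    print("cumulative loss:", total)
    print("final model:", x_final)
```

In the bandit setting `play_round` returns no gradient, so the update inside `run_full_information` cannot be applied as written; the player has to build a gradient or loss estimate from the single observed value $f_t(x_t)$.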