正在加载图片...
9 Thus we have which further implies that 「T川 T川 ∑∑lp()-0e(训+∑∑©(),9()-p(t》 1k= o-a晚=2芝l-m-ik-L ≤2L乏》a0-A6L. 1 t=1k=1 w(t) +可三s,e-儿.+a,- For the third term in the right hand side of Equation(27), (34) recalling that (t)EF-1.we get Taking the expectation on both sides of Inequality (34)and E[gk(t)-9k(t),0x(t)-8*】 using the fact that Ellg (t)ll.<L.we have =E[(E(g()-9k(tF-1,0(t)-0*】=0. (30) Combining these equations,we obtain the running sum bound a0-%0l2--1l+红 (35) ∑f(p(t)-f(8*〗≤w(T)(8*)+ 2T1 i= 2 where ej represents the j-th standard basis vector in the -dimensional Euclidean space.To further bound 2L子兴EI④-44L (t)-u(t),we break the sum in Inequality (35)into 「可台台 w(t)T two terms,with a cutoff point t.From the Perron-Frobenius (31) theory presented in [42].we have Applying Lemma 4 to (31),it gives that Pts,-l ≤Vk+1(P) ∑f(0s(),r)-f(0',y1≤w(T)p(8)+ L2 T 1 From the above inequality,it follows that if t=1 又心E0-4L+L∑I@-y‖ 1≤k≤t- +1,then In2(P) P(.ke,- ≤VCr. w(t)T t= +万a0-4 Specifically,setting Cerr=l/TV☑,forl:1≤k≤ t=1 w(t)T t1,we have Dividing both sides of the inequality by T,we can obtain the theorem based on convexity of f. ■ Fork>t-品牺+l,we have C.Proof of Theorem 5 Proof:The proof concentrates on deriving the bound of the networkro in Theorem4,∑l兴pi-“l.We P-l ≤IPte,l+奇IL,=2 1 w(t) (36 first define the matrix P(t)=p-1i,where(t is the element in the i-th row and j-th column of the matrix The above clearly suggests that the cutoff point is =nC In2(P) P(t,1).From Equation (7),based on the record at time l,i.e. Since there are at most T steps in the summation,we have ()we can obtain the record at time t+1 as follows: 叭 Ea0-ol.≤LP-1A%-l 4t+1)=∑Pt,小4) k=t-i i=1 (32) t-t-1 t +∑∑P6,9.(k-1)+g,). + P(t-1,k)e- +2L k=l+1=1 If t =l,this iteration will be terminated in our algorithm. ≤2ui+t-i-)+2L From the definition of (t)in Lemma 2,it follows: ≤2Lt+L+2L(byt≤T) t-1l -4,=∑Σ (-e-1)- -2h(Y+3L In(2(P)) k=1i= =2L n(T所) +3L +丙.t-少-,t-以. Ina2(P) (33) s2LI(TVi) 1-02(P) +3L9 Thus we have E   ∑ T t=1 ∑ |V| k=1 ∥φ(t) − θk(t)∥ + ∑ T t=1 ∑ |V| k=1 ⟨gˆk (t), θk(t) − φ(t)⟩   ≤ 2L ∑ T t=1 ∑ |V| k=1 E ∥µ¯(t) − µk (t)∥∗ ω(t) . For the third term in the right hand side of Equation (27), recalling that θk(t) ∈ Ft−1, we get E[⟨gk (t) − gˆk (t), θk(t) − θ ∗ ⟩] = E[⟨E(gk (t)) − gˆk (t)|Ft−1, θk(t) − θ ∗ ⟩] = 0. (30) Combining these equations, we obtain the running sum bound ∑ T t=1 [f(φ(t)) − f(θ ∗ )] ≤ ω(T)ϕ(θ ∗ ) + L 2 2τ ∑ T t=1 1 ω(t) + 2L |V| ∑ T t=1 ∑ |V| k=1 E ∥µ¯(t) − µk (t)∥∗ ω(t)τ . (31) Applying Lemma 4 to (31), it gives that ∑ T t=1 [f(θk(t), yˆk ) − f(θ ∗ , y ∗ )] ≤ ω(T)ϕ(θ ∗ ) + L 2 2τ ∑ T t=1 1 ω(t) + 2L |V| ∑ T t=1 ∑ |V| k=1 E ∥µ¯(t) − µk (t)∥∗ ω(t)τ + L ∑ T t=1 ∥yˆk − y ∗ ∥ + ∑ T t=1 ∥µ¯(t) − µk (t)∥ ω(t)τ . Dividing both sides of the inequality by T, we can obtain the theorem based on convexity of f. C. Proof of Theorem 5 Proof: The proof concentrates on deriving the bound of the network error in Theorem 4, ∑|V| j=1 E∥µ¯ (t)−µj (t)∥∗ ω(t) . We first define the matrix P(t, l) = P t−l+1, where [P(t, l)]ij is the element in the i-th row and j-th column of the matrix P(t, l). From Equation (7), based on the record at time l, i.e. µj (l), we can obtain the record at time t + 1 as follows: µj (t + 1) = ∑ |V| i=1 [P(t, l)]ijµj (l) + ∑t k=l+1 ∑ |V| i=1 [P(t, k)]ijgˆi (k − 1) + gˆj (t). (32) If t = l, this iteration will be terminated in our algorithm. From the definition of µ¯(t) in Lemma 2, it follows: µ¯(t) − µj (t) = ∑t−1 k=1 ∑ |V| i=1 ( 1 |V| − [P(t − 1, k)]ij) gˆi (k − 1) + 1 |V| ∑ |V| i=1 [gˆi (t − 1) − gˆj (t − 1)], (33) which further implies that µ¯(t) − µj (t) ∗ ≤ ∑t−1 k=1 ∑ |V| i=1 1 |V| − [P(t − 1, k)]ij ∥gˆi (k − 1)∥∗ + 1 |V| ∑ |V| i=1 [∥gˆi (t − 1)∥∗ + gˆj (t − 1) ∗ ]. (34) Taking the expectation on both sides of Inequality (34) and using the fact that E ∥gˆi (t)∥∗ ≤ L, we have E µ¯(t) − µj (t) ∗ ≤ ∑t−1 k=1 L 1|V| |V| − P(t − 1, k)ej 1 + 2L, (35) where ej represents the j-th standard basis vector in the |V|-dimensional Euclidean space. To further bound µ¯(t) − µj (t) ∗ , we break the sum in Inequality (35) into two terms, with a cutoff point t˜. From the Perron-Frobenius theory presented in [42], we have P(t, k)ej − 1|V| |V| 1 ≤ √ |V|σ t−k+1 2 (P). From the above inequality, it follows that if 1 ≤ k ≤ t− ln Cerr ln σ2(P) +1, then P(t, k)ej − 1|V| |V| 1 ≤ √ |V|Cerr. Specifically, setting Cerr = 1/T√ |V|, for ∀l : 1 ≤ k ≤ t − ln Cerr ln σ2(P ) + 1, we have P(t, k)ej − 1|V| |V| 1 ≤ 1 T . For k > t − ln Cerr ln σ2(P ) + 1, we have P(t, k)ej − 1|V| |V| 1 ≤ ∥P(t, k)ej∥1 + 1 |V| 1|V| 1 = 2. (36) The above clearly suggests that the cutoff point is t˜= ln Cerr ln σ2(P ) . Since there are at most T steps in the summation, we have E µ¯(t) − µj (t) ∗ ≤ L ∑t−1 k=t−t˜ P(t − 1, k)ej − 1|V| |V| 1 + L t−∑t˜−1 k=1 P(t − 1, k)ej − 1|V| |V| 1 + 2L ≤ 2Lt˜+ L T (t − t˜− 1) + 2L ≤ 2Lt˜+ L + 2L (by t ≤ T) = 2L ln(T √ |V|) −1 ln(σ2(P)) + 3L = 2L ln(T √ |V|) ln σ −1 2 (P) + 3L ≤ 2L ln(T √ |V|) 1 − σ2(P) + 3L
<<向上翻页向下翻页>>
©2008-现在 cucdc.com 高等教育资讯网 版权所有