No. 1 WANG Hongyuan, et al: Efficient tracker based on sparse coding with Euclidean local structure-based constraint · 141 ·

Table 2 Euclidean local structure-based tracking (ELSS-tracker)
Input: a video stream for tracking; the location $l_1$ of the target in frame #1
Output: the tracking result for each frame
1: Set $s = 1$, select 10 template regions very close to the target in frame #1, then resize and stretch them to form $T \in \mathbb{R}^{1600 \times 10}$
2: While the end of the video sequence has not been reached, $s \leftarrow s + 1$
3: Pick $N = 200$ candidate regions around the latest target location $l_{s-1}$ in frame #s, and stretch them to form $Y \in \mathbb{R}^{1600 \times 200}$
4: Construct $D$ from $T$ and the positive and negative identity matrices, as in Fig. 1 and Fig. 2
5: Solve the ELSSC-optimization in Tab. 1 with $Y$ and $D$, and denote the optimization result as $c_i^{(s)}$
6: Compute the reconstruction errors $e_i^{(s)} = \| x_i^{(s)} - V c_i^{(s)} \|_2^2$ and the normalized weights $w_i^{(s)} = w_i^{(s)} / \sum_i w_i^{(s)}$, where $w_i^{(s)} = \exp(-e_i^{(s)} / \alpha)$
7: Locate the tracked object in frame #s as the weighted sum of all 200 candidate regions with weights $w_i^{(s)}$
8: Select the 10 regions nearest to the object as the new target templates $T$
9: End

3 Experiments

3.1 Experimental setting

To evaluate the proposed tracker, experiments on 12 video sequences were conducted, including Surfer, Dudek, Faceocc2, Animal, Girl, Stone, Car, Cup, Face, Juice, Singer, Sunshade, Bike, Car Dark, and Jumping [17-19]. These sequences cover almost all challenges in tracking, including occlusion (even heavy occlusion), motion blur, rotation, scale variation, illumination variation, and complex background. For comparison, we used four state-of-the-art algorithms with the same initial positions and the same representations of the targets.
They were the incremental learning-based tracker (IVT, a common discriminant tracker) [14], the covariance-based tracker (CovTrack, a generative tracker on Lie-group) [15], the $l_1$-tracker (a generative tracking method) [8-9], and the NLSSST [11]. All the experiments were run on a computer with a 2.67 GHz CPU and 2 GB of memory.

The main parameters used in our experiments were set as follows: the number of candidate regions is $N = 200$, the number of template regions is $n = 10$, and the candidates and targets are resized to 40 × 40.

3.2 Experimental results for sparsity and stability

The stability and sparsity of the original sparse coding, the NLSSSC, and the original and improved ELSSC were verified. The experiments were designed with the Face sequence in the VOT 2013 benchmark dataset [18] as follows: six similar regions ($\mathrm{CR}_1, \ldots, \mathrm{CR}_6$; their means and standard deviations illustrate the similarity) were represented sparsely with the template $T = [T_1, \ldots, T_{10}] \in \mathbb{R}^{1600 \times 10}$ taken from two regions far apart from each other (the red region and the green one). Evidently, $T$ is over-complete, and the entire dictionary $D \in \mathbb{R}^{1600 \times 3210}$ is constructed as in Figs. 1 and 2.

The sparse coefficients of $\mathrm{CR}_1, \ldots, \mathrm{CR}_6$ generated with the $l_1$-, the NLSSSC-, the original ELSSC-, and the improved ELSSC-optimization are plotted in Fig. 3. In particular, the six similar regions have very different representation coefficients under the original $l_1$-optimization problem, which ignores the structural information between regions. The results of the other three algorithms are much more stable because they preserve the structural information: if two regions are similar to each other, they also have similar sparse coefficients. This improves the robustness of tracking; otherwise, the tracker may degenerate or even fail to track.
For example, with $l_1$-optimization, $\mathrm{CR}_4$ can be represented by $T_2$, $T_8$, $T_6$, $T_7$, and $T_1$, and the tracker may fail to track the top of the book. Meanwhile, the experimental results show that the NLSSSC and our two ELSSC variants yield sparser coefficients than the original $l_1$-optimization problem.
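
Steps 4, 6, and 7 of Table 2 can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. It assumes that $D$ is the column-wise concatenation $[T, I, -I]$ (consistent with $D \in \mathbb{R}^{1600 \times 3210}$ for 10 templates of dimension 1600), that the basis $V$ in the reconstruction error is passed in by the caller, and that "weighted sum of candidate regions" means a weighted average of candidate positions. All function names are hypothetical, and the sparse coefficients from step 5 are taken as given.

```python
import math


def build_dictionary(templates):
    """Return D = [T, I, -I] as a list of rows (assumed layout).

    `templates` is T given as d rows of n values (one column per template).
    """
    d = len(templates)
    dictionary = []
    for r, t_row in enumerate(templates):
        pos = [1.0 if c == r else 0.0 for c in range(d)]   # row r of I
        neg = [-1.0 if c == r else 0.0 for c in range(d)]  # row r of -I
        dictionary.append(list(t_row) + pos + neg)
    return dictionary


def reconstruction_error(x, basis, coeffs):
    """Squared l2 error ||x - basis * coeffs||_2^2 for one candidate."""
    error = 0.0
    for x_k, row in zip(x, basis):
        approx = sum(b * c for b, c in zip(row, coeffs))
        error += (x_k - approx) ** 2
    return error


def normalized_weights(errors, alpha):
    """Weights exp(-e_i / alpha), normalized to sum to 1."""
    # Shifting by the smallest error avoids underflow and cancels out
    # in the normalization, so the result is unchanged.
    e_min = min(errors)
    raw = [math.exp(-(e - e_min) / alpha) for e in errors]
    total = sum(raw)
    return [w / total for w in raw]


def locate_target(positions, weights):
    """Weighted sum of candidate (x, y) positions (assumed interpretation)."""
    x = sum(w * p[0] for p, w in zip(positions, weights))
    y = sum(w * p[1] for p, w in zip(positions, weights))
    return (x, y)
```

With $d$ rows and $n$ templates, `build_dictionary` returns $n + 2d$ columns, which gives 3210 for $d = 1600$ and $n = 10$. A smaller $\alpha$ concentrates the weight on the candidates with the lowest reconstruction error.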