正在加载图片...
FU et al:MULTI-VIEW VIDEO SUMMARIZATION 723 View 1 Video View 1 Video View 2 Video View 2 Video View N Video View N Video Fig.4.Final video summary resulting from optimization.Dashed lines connect Fig.3.Graph partition by random walks.Shot clusters generated by random the shots that are reserved in the same shot cluster. walks are enclosed by dashed circles. We formulate the summarization as a graph labeling problem. We then define the combinatorial Laplacian matrix for graph For the shot cluster Cs with ns shots,the decision whether as follows: or not the shots should be in the summary is denoted by (1,2,...,ns),V EX.X is the 0/1 solution space in which ∑W(S,S),ifi=j Ti=1 stands for reserved shot and 0 stands for unreserved one Li= -W(Si,Sj), if Si,Si are linked by edge (10) (F1g.4). 、02 otherwise. The multi-objective optimization function is given by L is an Nn X Nn dimensional sparse,symmetric,and positive max{-f(x),-f2(x),f3(x),fa(x)}s.t. g(c)≤Dmat definite matrix,where N is the number of nodes in graph. h(c)≥Rmim 13) We further decompose L into blocks corresponding to nodes in Vs and Vu separately as where f(a)=1,f2()=∑1D:·x,D:>0, f3() 1·,andf4(c)=(1/2)· (11) =l,jW(S,S):· f1,f2,f3,and f4 denote the total shot number,summary length,information coverage,and shot correlation within For each unseeded node,the final determination to which seeded cluster,respectively.Di and Ri are length and importance of cluster it belongs to is made by solving shot i separately.g(r)and h()are defined in forms of fuzzy LUXU =-BTXs (12) set [51] where Xu represents the probabilities that unseeded nodes be- g(x)=u(2(c),h(x)=u(f3(c) long to seeded nodes'clusters.X.s denotes the matrix that marks the cluster category of seeded nodes.We use a Conjugate Gra- with f(x)=[f(x)-inff(r]/supf(x)-inff(x小: dient algorithm [49]to solve the linear formulation of random Dma is the maximum allocated length of one cluster.Rmin walks,which runs very fast. is the minimum information entropy of Cs.They are defined as In the end,to favor important events with long duration, Dmam=入1·D,Rmim=2·R we filter out trivial shot clusters with low entropy values. Furthermore,the two clusters whose similarity exceeds a given where D and R are the total length of shots in C's and the sum threshold are merged together.The remainder shot clusters of importance values,respectively.A1 and A2 are the parameters are used as candidates for summarization in multi-objective that control summary granularity.The two constraints mean that optimization (Fig.3). the total length of shots in Cs after optimization should be less B.Multi-Objective Optimization than Dmas,whereas the entropy should be greater than Rmin We will show in experiments,by flexibly configuring A1 and A2, Users normally have various requirements over summariza multi-level summarization can be easily achieved. tion,according to different kinds of application scenarios.In We further define the minimum function general,a good summary should achieve the following goals si- multaneously.1)Minimize shot number.The retrieval applica- u(F())=min,tniu(fi(} (14) 2<4 tion of summary requires that a small number of shots should be generated.2)Minimize summary length.The minimum length of summary would be of great help to video storage.3)Maxi- in which F()=((f()),u(f2()),uf3()),(fa(xT i,=1,.4 are coefficients that control the weights of objective mize information coverage.To achieve enough information cov- functions satisfying=1 and ni0.They can be erage,the sum of resulting shots'entropy value in each cluster configured according to different user requirements. must exceed a certain threshold.4)Maximize shot correlation.It would be much better if shots in every resulting cluster strongly By employing the Max-Min method,the multi-objective op- timization is transformed into the following 0-1 mixed integer correlate with each other.This yields the most representative programming problem: shots for the interesting event. To meet the above requirements,we design a multi-objec- tive optimization model to generate final summary.The opti- x*=argmax u(F(x))s.t.A·F≤ -Rmin (15) mization follows the complexity incompatibility principle [50]. 石FU et al.: MULTI-VIEW VIDEO SUMMARIZATION 723 Fig. 3. Graph partition by random walks. Shot clusters generated by random walks are enclosed by dashed circles. We then define the combinatorial Laplacian matrix for graph as follows: if if are linked by edge otherwise. (10) is an dimensional sparse, symmetric, and positive definite matrix, where is the number of nodes in graph. We further decompose into blocks corresponding to nodes in and separately as (11) For each unseeded node, the final determination to which seeded cluster it belongs to is made by solving (12) where represents the probabilities that unseeded nodes be￾long to seeded nodes’ clusters. denotes the matrix that marks the cluster category of seeded nodes. We use a Conjugate Gra￾dient algorithm [49] to solve the linear formulation of random walks, which runs very fast. In the end, to favor important events with long duration, we filter out trivial shot clusters with low entropy values. Furthermore, the two clusters whose similarity exceeds a given threshold are merged together. The remainder shot clusters are used as candidates for summarization in multi-objective optimization (Fig. 3). B. Multi-Objective Optimization Users normally have various requirements over summariza￾tion, according to different kinds of application scenarios. In general, a good summary should achieve the following goals si￾multaneously. 1) Minimize shot number. The retrieval applica￾tion of summary requires that a small number of shots should be generated. 2) Minimize summary length. The minimum length of summary would be of great help to video storage. 3) Maxi￾mize information coverage. To achieve enough information cov￾erage, the sum of resulting shots’ entropy value in each cluster must exceed a certain threshold. 4) Maximize shot correlation. It would be much better if shots in every resulting cluster strongly correlate with each other. This yields the most representative shots for the interesting event. To meet the above requirements, we design a multi-objec￾tive optimization model to generate final summary. The opti￾mization follows the complexity incompatibility principle [50]. Fig. 4. Final video summary resulting from optimization. Dashed lines connect the shots that are reserved in the same shot cluster. We formulate the summarization as a graph labeling problem. For the shot cluster with shots, the decision whether or not the shots should be in the summary is denoted by . is the 0/1 solution space in which stands for reserved shot and 0 stands for unreserved one (Fig. 4). The multi-objective optimization function is given by (13) where , , , and . , , , and denote the total shot number, summary length, information coverage, and shot correlation within cluster, respectively. and are length and importance of shot separately. and are defined in forms of fuzzy set [51] with . is the maximum allocated length of one cluster. is the minimum information entropy of . They are defined as where and are the total length of shots in and the sum of importance values, respectively. and are the parameters that control summary granularity. The two constraints mean that the total length of shots in after optimization should be less than , whereas the entropy should be greater than . We will show in experiments, by flexibly configuring and , multi-level summarization can be easily achieved. We further define the minimum function (14) in which . are coefficients that control the weights of objective functions satisfying and . They can be configured according to different user requirements. By employing the Max-Min method, the multi-objective op￾timization is transformed into the following 0-1 mixed integer programming problem: (15)
<<向上翻页向下翻页>>
©2008-现在 cucdc.com 高等教育资讯网 版权所有