• sudden change in illumination, as might occur in an indoor environment by switching the lights on or off, or in an outdoor environment by a change between cloudy and sunny conditions;
• shadows cast on the background by objects in the background itself (e.g., buildings and trees) or by moving foreground objects.
Motion changes:
• image changes due to small camera displacements (these are common in outdoor situations due to wind load or other sources of motion which cause global motion in the images);
• motion in parts of the background, for example, tree branches moving with the wind or rippling water.

Changes introduced to the background: These include any change in the geometry or the appearance of the background of the scene introduced by targets. Such changes typically occur when something relatively permanent is introduced into the scene background (for example, if somebody moves (introduces) something from (to) the background, or if a car is parked in the scene or moves out of the scene, or if a person stays stationary in the scene for an extended period).

2) Practice: Many researchers have proposed methods to address some of the issues regarding background modeling, and we provide a brief review of the relevant work here.

Pixel intensity is the most commonly used feature in background modeling. If we monitor the intensity value of a pixel over time in a completely static scene, then the pixel intensity can be reasonably modeled with a Gaussian distribution $N(\mu, \sigma^2)$, given that the image noise over time can be modeled by a zero-mean Gaussian distribution $N(0, \sigma^2)$. This Gaussian distribution model for the intensity value of a pixel is the underlying model for many background subtraction techniques. For example, one of the simplest background subtraction techniques is to calculate an average image of the scene, subtract each new frame from this image, and threshold the result. This basic Gaussian model can adapt to slow changes in the scene (for example, gradual illumination changes) by recursively updating the model using a simple adaptive filter. This basic adaptive model is used in [5]; also, Kalman filtering for adaptation is used in [6]–[8].

Typically, in outdoor environments with moving trees and bushes, the scene background is not completely static.
For example, one pixel can be the image of the sky in one frame, a tree leaf in another frame, a tree branch in a third frame, and some mixture subsequently. In each situation, the pixel will have a different intensity (color), so a single Gaussian assumption for the pdf of the pixel intensity will not hold. Instead, a generalization based on a mixture of Gaussians has been used in [9]–[11] to model such variations. In [9] and [10], the pixel intensity was modeled by a mixture of $K$ Gaussian distributions ($K$ is a small number from 3 to 5). The mixture is weighted by the frequency with which each of the Gaussians explains the background. In [11], a mixture of three Gaussian distributions was used to model the pixel value for traffic surveillance applications. The pixel intensity was modeled as a weighted mixture of three Gaussian distributions corresponding to the road, shadow, and vehicle distributions. Adaptation of the Gaussian mixture models can be achieved using an incremental version of the EM algorithm.

In [12], linear prediction using the Wiener filter is used to predict pixel intensity given a recent history of values. The prediction coefficients are recomputed each frame from the sample covariance to achieve adaptivity. Linear prediction using the Kalman filter was also used in [6]–[8].

All of the previously mentioned models are based on statistical modeling of pixel intensity with the ability to adapt the model. While pixel intensity is not invariant to illumination changes, model adaptation makes it possible for such techniques to adapt to gradual changes in illumination. On the other hand, a sudden change in illumination presents a challenge to such models.

Another approach to modeling a wide range of variations in the pixel intensity is to represent these variations as discrete states corresponding to modes of the environment, e.g., lights on/off or cloudy/sunny skies. Hidden Markov models (HMMs) have been used for this purpose in [13] and [14].
In [13], a three-state HMM has been used to model the intensity of a pixel for a traffic-monitoring application where the three states correspond to the background, shadow, and foreground. The use of HMMs imposes a temporal continuity constraint on the pixel intensity, i.e., if the pixel is detected as a part of the foreground, then it is expected to remain part of the foreground for a period of time before switching back to be part of the background. In [14], the topology of the HMM representing global image intensity is learned while learning the background. At each global intensity state, the pixel intensity is modeled using a single Gaussian. It was shown that the model is able to learn simple scenarios like switching the lights on and off.

Alternatively, edge features have also been used to model the background. The use of edge features to model the background is motivated by the desire to have a representation of the scene background that is invariant to illumination changes. In [15], foreground edges are detected by comparing the edges in each new frame with an edge map of the background, which is called the background “primal sketch.” The major drawback of using edge features to model the background is that it would only be possible to detect edges of foreground objects instead of the dense connected regions that result from pixel-intensity-based approaches. A fusion of intensity and edge information was used in [16].

Block-based approaches have also been used for modeling the background. Block matching has been extensively used for change detection between consecutive frames. In [17], each image block is fit to a second-order bivariate polynomial and the remaining variations are assumed to be noise. A statistical likelihood test is then used to detect blocks with significant change. In [18], each block was represented with its median template over the background learning period and its block standard deviation.
Subsequently, at each new frame, each block is correlated with its corresponding template, and blocks with too much deviation relative to the measured standard deviation are considered to be foreground. The major drawback with block-based approaches is that the detection unit is a whole image block and therefore they are only suitable for coarse detection.

ELGAMMAL et al.: MODELING USING NONPARAMETRIC KERNEL DENSITY ESTIMATION FOR VISUAL SURVEILLANCE 1153
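
As an illustration of the simplest pixel-intensity technique reviewed above (an average background image, subtraction of each new frame, thresholding, and recursive updating with a simple adaptive filter), the following is a minimal sketch and not the implementation of any cited work. The function names, the learning rate `alpha`, and the value of `threshold` are assumptions chosen for illustration; frames are represented as plain nested lists of gray-level values.

```python
# Minimal sketch of adaptive background subtraction.
# Assumptions (not from the source): alpha, threshold, and all names.

def foreground_mask(background, frame, threshold=25.0):
    """Flag pixels whose absolute difference from the background exceeds threshold."""
    return [
        [abs(f - b) > threshold for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


def update_background(background, frame, alpha=0.05):
    """Recursive update: new_bg = (1 - alpha) * old_bg + alpha * frame."""
    return [
        [(1.0 - alpha) * b + alpha * f for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


# A 2x2 background estimate and one incoming frame (hypothetical values).
background = [[100.0, 100.0], [100.0, 100.0]]
frame = [[102.0, 180.0], [99.0, 100.0]]

mask = foreground_mask(background, frame)           # detect first
background = update_background(background, frame)   # then adapt the model
```

In this sketch every pixel is blended into the background, including pixels flagged as foreground. A small `alpha` lets the model follow gradual illumination changes, but, as noted in the text, a sudden illumination change would still be flagged as foreground until the model catches up.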