JULY/AUGUST 1998 25

• Histogram equalization: Our process performs a histogram equalization over the patterns to compensate for differences in illumination brightness, different cameras' response curves, and so on.

Once the process obtains a decision surface through training, it runs the runtime system over images that do not contain faces, storing misclassifications so that they can be used as negative examples in subsequent training phases. Images of landscapes, trees, buildings, and rocks, for example, are good sources of false positives because of the many different textured patterns they contain. This bootstrapping step, which Sung and Poggio⁶ successfully used, is very important in the context of a face detector that learns from examples:

• Although negative examples are abundant, negative examples that are useful from a learning standpoint are very difficult to characterize and define.
• When object detection (in this case, face detection) is approached as binary pattern classification, the two classes, object and nonobject, are not equally complex. The nonobject class is broader and richer, and therefore needs more examples to get an accurate definition that separates it from the object class. Figure 7 shows an image used for bootstrapping with some misclassifications that later served as negative examples.

After training the SVM, using an implementation of the algorithm my colleagues and I describe elsewhere,⁸ we incorporate it as the classifier in a runtime system very similar to the one used by Sung and Poggio.⁵,⁶ It performs the following operations:

• Rescale the input image several times;
• Cut 19×19 window patterns out of the scaled image;
• Preprocess the window using masking, light correction, and histogram equalization;
• Classify the pattern using the SVM; and
• If the class corresponds to a face, draw a rectangle around the face in the output image.
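The runtime operations listed above can be illustrated with a short Python sketch. It is not the C implementation described in the article: the scale factors, the scan step, the nearest-neighbour rescaling, and the `classify` callback (a stand-in for the trained SVM) are all assumptions made for illustration. Masking and light correction are omitted, and the sketch returns face boxes instead of drawing rectangles.

```python
from typing import Callable, List, Sequence, Tuple

Image = List[List[int]]  # gray-level image: rows of integers in 0..255
WINDOW = 19              # window side in pixels, as stated in the text


def rescale(img: Image, factor: float) -> Image:
    """Nearest-neighbour rescale (assumed; the article does not name a method)."""
    h, w = len(img), len(img[0])
    nh, nw = max(1, int(h * factor)), max(1, int(w * factor))
    return [[img[min(h - 1, int(y / factor))][min(w - 1, int(x / factor))]
             for x in range(nw)] for y in range(nh)]


def equalize(win: Image) -> Image:
    """Histogram-equalize a window by mapping pixels through the scaled CDF."""
    flat = [p for row in win for p in row]
    hist = [0] * 256
    for p in flat:
        hist[p] += 1
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(flat)
    if n == cdf_min:  # constant window: there is no contrast to spread
        return [row[:] for row in win]
    return [[round((cdf[p] - cdf_min) / (n - cdf_min) * 255) for p in row]
            for row in win]


def detect(img: Image,
           classify: Callable[[Image], bool],
           scales: Sequence[float] = (1.0, 0.8, 0.64),  # hypothetical pyramid
           step: int = 2) -> List[Tuple[int, int, int]]:
    """Scan every scale with a 19x19 window; return (x, y, side) face boxes."""
    boxes = []
    for s in scales:
        scaled = rescale(img, s)
        h, w = len(scaled), len(scaled[0])
        for y in range(0, h - WINDOW + 1, step):
            for x in range(0, w - WINDOW + 1, step):
                win = [row[x:x + WINDOW] for row in scaled[y:y + WINDOW]]
                if classify(equalize(win)):
                    # Map the hit back to original-image coordinates.
                    boxes.append((int(x / s), int(y / s), int(WINDOW / s)))
    return boxes
```

Scanning several scales with a fixed 19×19 window is what lets a single classifier find faces of different sizes: a large face in the original image shrinks to roughly window size at one of the coarser scales.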
Figure 8 reflects the system's architecture at runtime.

Experimental results on static images

To test the runtime system, we used two sets of images. Set A contained 313 high-quality images with the same number of faces. Set B contained 23 images of mixed quality, with a total of 155 faces. We tested both sets, first using our system and then the one by Sung and Poggio.⁵,⁶ To give true meaning to the number of false positives obtained, note that set A involved 4,669,960 pattern windows, while set B involved 5,383,682. Table 2 compares the two systems.

Figure 9 presents some output images of our system, which were not used during the training phase of the system.

Extension to a real-time system

The system I've discussed so far spends approximately 6 seconds (SparcStation 20) on a 320×240-pixel gray-level image. Although this is faster than most previous systems, it is not fast enough for use as a runtime system. To build a runtime version of the system, we took the following steps:

• We ported the C code developed on the Sun environment to a Windows NT Pentium 200-MHz computer and added a Matrox RGB frame grabber and a Hitachi three-chip color camera. We used no special hardware to reduce the computational burden of the system.
• We collected several color images with faces, from which we extracted areas with skin and nonskin pixels. We collected a dataset of 6,000 examples.

Figure 8. System architecture at runtime. (Used with permission.²) [Diagram labels: Input image pyramid; Subsampling; Extracted window (19×19 pixels); Preprocessing: light correction, histogram equalization; Classification using support vector machines: SVM quick discard (possible face/nonface); if possible face, SVM complete classifier (face/nonface).]

Figure 9. Results from our face-detection system.

Table 2. Performance of the SVM face-detection system.

          TEST SET A                       TEST SET B
          Detect rate (%)   False alarms   Detect rate (%)   False alarms
SVM       97.1              4              74.2              20
Sung      94.6              2              74.2              11
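The window counts quoted in the text allow the false-alarm counts in Table 2 to be read as per-window rates. The following Python sketch does that arithmetic using only the figures reported above. The dictionary layout and the `false_alarm_rate` helper are illustrative names, not part of the original system.

```python
# Pattern windows examined per test set, as quoted in the text.
windows = {"A": 4_669_960, "B": 5_383_682}

# False-alarm counts from Table 2.
false_alarms = {
    "SVM": {"A": 4, "B": 20},
    "Sung": {"A": 2, "B": 11},
}


def false_alarm_rate(system: str, test_set: str) -> float:
    """False alarms divided by the number of windows classified."""
    return false_alarms[system][test_set] / windows[test_set]


for system in false_alarms:
    for test_set in windows:
        rate = false_alarm_rate(system, test_set)
        print(f"{system:5s} set {test_set}: {rate * 1e6:.2f} false alarms per million windows")
```

Every rate works out to fewer than four false alarms per million windows, which is why the raw counts in Table 2 are small relative to the number of classifications performed.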