[Figure labels: O, x, y, dl, dr; Left hand, Right hand; Finger number 1-10]

Fig. 2. Frames during two consecutive keystrokes: (a) Frame 1, (b) Frame 2, (c) Frame 3, (d) Frame 4, (e) Frame 5.

1) Keyboard detection: We use the Canny edge detection algorithm [14] to obtain the edges of the keyboard. Fig. 4(b) shows the edge detection result for Fig. 4(a). However, interfering edges (e.g., the paper's edge, which is the longest edge in Fig. 4(b)) should be removed. As Fig. 4(b) shows, the edges of the keyboard should be close to the edges of the keys. We use this feature to remove the pitfall edges; the result is shown in Fig. 4(c).
Additionally, we adopt the dilation operation [15] to join dispersed edge points that are close to each other, in order to get better edges/boundaries of the keyboard. After that, we use the Hough transform [12] to detect the lines in Fig. 4(c). Then, we use the uppermost line and the bottom line to describe the position range of the keyboard, as shown in Fig. 4(d). Similarly, we can use the Hough transform [12] to detect the left/right edge of the keyboard. If the Hough transform detects no suitable edges, it is usually because the keyboard is not perfectly located in the camera's view. In this case, we simply use the left/right boundary of the image to represent the left/right edge of the keyboard. As shown in Fig. 4(e), we extend the four edges (lines) to get four intersections $P_1(x_1, y_1)$, $P_2(x_2, y_2)$, $P_3(x_3, y_3)$, $P_4(x_4, y_4)$, which are used to describe the boundary of the keyboard.

Fig. 4. Keyboard detection and key extraction: (a) An input image, (b) Canny edge detection result, (c) Optimization for edges, (d) Position range of keyboard, (e) Keyboard boundary, with corners $P_1(x_1, y_1)$, $P_2(x_2, y_2)$, $P_3(x_3, y_3)$, $P_4(x_4, y_4)$, (f) Key segmentation result.

2) Key segmentation: With the known location of the keyboard, we can extract the keys based on color segmentation. In YCrCb space, the color coordinate (Y, Cr, Cb) of a white pixel is (255, 128, 128), while that of a black pixel is (0, 128, 128). Thus, we only need the difference in the Y value between pixels to distinguish the white keys from the black background. If a pixel is located in the keyboard and satisfies $255 - \varepsilon_y \le Y \le 255$, the pixel belongs to a key. The offset $\varepsilon_y \in \mathbb{N}$ of Y is mainly caused by lighting conditions. $\varepsilon_y$ can be estimated in the initial training (see Section IV-A). The initial/default value is $\varepsilon_y = 50$.

Once we obtain the white pixels, we need to get the contours of the keys and separate the keys from one another.
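The Y-threshold rule for key pixels can be illustrated with a minimal sketch. It assumes the Y (luma) values are already available as a 2-D grid of integers cropped to the keyboard region. The function names and the sample patch are hypothetical and not part of CamK; only the rule itself and the default offset of 50 come from the text.

```python
from typing import List

DEFAULT_EPSILON_Y = 50  # initial/default offset stated in the text


def is_key_pixel(y: int, epsilon_y: int = DEFAULT_EPSILON_Y) -> bool:
    """Return True if the luma value y satisfies 255 - epsilon_y <= y <= 255."""
    return 255 - epsilon_y <= y <= 255


def key_mask(luma: List[List[int]],
             epsilon_y: int = DEFAULT_EPSILON_Y) -> List[List[int]]:
    """Build a binary mask (1 = candidate key pixel) from a grid of Y values.

    Assumption: the grid has already been cropped to the keyboard region,
    so every pixel here is "located in the keyboard".
    """
    return [[1 if is_key_pixel(y, epsilon_y) else 0 for y in row]
            for row in luma]


# Hypothetical 3x4 patch of Y values: bright key pixels on a dark background.
patch = [
    [12, 230, 240, 15],
    [10, 205, 255, 20],
    [8, 204, 199, 9],
]
mask = key_mask(patch)
```

With the default offset, the lower bound is 205, so the values 204 and 199 in the last row are rejected. A larger offset, such as one estimated during initial training under dim lighting, would admit them.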
To account for pitfall areas, such as small white areas that do not belong to any key, we first estimate the area of a key. Based on Fig. 4(e), we use $P_1, P_2, P_3, P_4$ to calculate the area $S_b$ of the keyboard as $S_b = \frac{1}{2} \cdot \left( \left| \overrightarrow{P_1P_2} \times \overrightarrow{P_1P_4} \right| + \left| \overrightarrow{P_3P_4} \times \overrightarrow{P_3P_2} \right| \right)$. Then, we calculate the area of each key. We use $N$ to represent the number of keys in the keyboard. Considering the size difference between keys, we treat larger keys (e.g., the space key) as multiple regular keys (e.g., A-Z, 0-9). For example, the space key is treated as five regular keys. In this way, we change $N$ to $N_{avg}$. Then, we can estimate the average area of a regular key as $S_b / N_{avg}$. In addition to the size difference between keys, different distances between the camera and the keys can also affect the area of a key in the image. Therefore, we introduce $\alpha_l$, $\alpha_h$ to describe the range of a valid area $S_k$ of a key as $S_k \in \left[ \alpha_l \cdot \frac{S_b}{N_{avg}}, \alpha_h \cdot \frac{S_b}{N_{avg}} \right]$. We set $\alpha_l = 0.15$, $\alpha_h = 5$ in CamK, based on extensive experiments. The key segmentation result of Fig. 4(e) is shown in Fig. 4(f). Then, we use the location of the space key (the biggest key) to locate the other keys, based on the relative locations between keys.

C. Fingertip Detection

In order to detect keystrokes, CamK needs to detect the fingertips and track their movements. Fingertip detection consists of hand segmentation and fingertip discovery.

1) Hand segmentation: Skin segmentation [15] is a common method used for hand detection. In YCrCb color space, a pixel (Y, Cr, Cb) is determined to be a skin pixel if it satisfies Cr ∈ [133, 173] and Cb ∈ [77, 127]. However, the threshold values of Cr and Cb can be affected by the surroundings, such as lighting conditions, so it is difficult to choose suitable threshold values for Cr and Cb. Therefore, we combine Otsu's method [16] and the red channel in YCrCb color space for skin segmentation. In YCrCb color space, the red channel Cr is essential to human skin coloration.
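As a rough illustration of combining Otsu's method with the Cr channel, the sketch below computes an Otsu threshold over a flat list of Cr values and labels the pixels above it as skin. It is a generic textbook formulation of Otsu's method, not CamK's implementation. The function names and sample values are hypothetical, and treating the higher-Cr class as skin is an assumption.

```python
from typing import List, Sequence


def otsu_threshold(values: Sequence[int], levels: int = 256) -> int:
    """Return the threshold t that maximizes the between-class variance.

    Values <= t form the lower class and values > t the upper class.
    When several thresholds tie, the smallest one is returned.
    """
    hist = [0] * levels
    for v in values:
        hist[v] += 1

    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of values in the lower class
    sum0 = 0  # sum of the values in the lower class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, so the variance is undefined
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor of 1 / total**2.
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


def skin_mask(cr_values: Sequence[int]) -> List[int]:
    """Label each Cr value as skin (1) or background (0).

    Assumption: skin pixels fall in the higher-Cr class.
    """
    t = otsu_threshold(cr_values)
    return [1 if v > t else 0 for v in cr_values]


# Hypothetical Cr samples: four background pixels, then four skin pixels.
cr = [120, 122, 125, 121, 150, 152, 155, 151]
threshold = otsu_threshold(cr)
mask = skin_mask(cr)
```

Because the threshold is derived from the histogram of each captured image rather than fixed in advance, the split point moves with the lighting. This is the property that the fixed Cr and Cb ranges lack.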
Therefore, for a captured image, we use the grayscale image obtained by splitting out the Cr channel as the input to Otsu's method. Otsu's method [16] automatically performs clustering-based image thresholding, i.e., it calculates the optimal threshold to separate the foreground and background. As a result, this skin segmentation approach can tolerate environmental effects such as lighting conditions. For the input image in Fig. 5(a), the hand segmentation result is shown in Fig. 5(b), where the white regions represent the hand regions, while the black regions