This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/JIOT.2021.3114224, IEEE Internet of Things Journal.
IEEE INTERNET OF THINGS JOURNAL, VOL. XX, NO. XX, XX 2021
2327-4662 (c) 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Fig. 11. Sampled frames in the process of pressing 'U' with fingertip 7: (a) Fm 1: moving to 'U'; (b) Fm 4: moving to 'U'; (c) Fm 7: pressing 'U'; (d) Fm 10: pressing 'U'; (e) Fm 13: leaving 'U'; (f) Fm 16: leaving 'U'. The yellow and red points are the locations of fingertip 7 in the previous and current sampled frames, respectively. The green arrow indicates the trend of the fingertip's movement. The blue circles and points are the locations of the other fingertips in the previous and current sampled frames, respectively. ('Fm' is short for frame.)

the hand, to further remove the pitfall points with distances smaller than $\Delta r$. Finally, for each cluster of points related to a fingertip, we choose the middle point to represent the final detected fingertip, as shown in Fig. 10(f). Unless otherwise specified, we set $k = 50$ and $\Delta r = 100$ pixels by default.

E. Keystroke Detection and Localization

After obtaining the coordinates of keys and detecting fingertips, we detect and locate keystrokes. Specifically, we first determine whether a typing operation occurs, i.e., keystroke detection, and then determine which fingertip is pressing the key, i.e., keystroke localization, as shown in Alg. 2.

1) Keystroke Detection: According to Observation 4, the depth information of fingertips can hardly be obtained from a single image, so we detect a keystroke from multiple consecutive frames. Specifically, a keystroke operation involves several steps: first the fingertip moves towards the key, then it stays on the key for a short duration, and finally it moves away from the key. An example is shown in Fig. 11 (i.e., the seventh fingertip). Therefore, the coordinate changes of fingertips can be used to detect possible keystrokes. Additionally, to reduce the processing cost, we introduce a frame-skipping scheme for keystroke detection, instead of detecting the coordinates of fingertips in every image frame.

To capture enough information for keystroke detection and localization, we first set the frame rate of the camera to 30 fps, which is the maximum/default frame rate of off-the-shelf mobile devices. According to [9], a keystroke usually lasts about 185 ms, which is about the time needed to capture 5 or 6 frames. Therefore, we first process every 5th image frame and compare each fingertip's coordinates. For convenience, we use $T_i^{(j)}(x_i^{(j)}, y_i^{(j)})$ and $T_i^{(k)}(x_i^{(k)}, y_i^{(k)})$ to represent the $i$th fingertip's coordinates in the $j$th and the $k$th frames, where $k = j + 5$. Considering that camera movement may happen between the $j$th frame and the $k$th frame, $T_i^{(j)}$ is transformed to the coordinate system of the $k$th frame as $T_i^{(j)'}(x_i^{(j)'}, y_i^{(j)'})$, based on perspective transformation. If the coordinate change $\delta_d = \sqrt{(x_i^{(j)'} - x_i^{(k)})^2 + (y_i^{(j)'} - y_i^{(k)})^2}$ is less than $\epsilon_d$, the fingertip is considered unchanged; otherwise it is moving. We set $\epsilon_d = 15$ pixels by default.

After obtaining the coordinate changes of a fingertip every 5 frames, we need to determine whether the fingertip is pressing a key. As mentioned before, a keystroke usually lasts about 185 ms.
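As a rough illustration of the frame-skipping arithmetic and the $\delta_d$ test described above, the following Python sketch may help. It is not the authors' implementation: the function names are hypothetical, and the 3x3 perspective matrix `h_j_to_k` that maps frame-$j$ coordinates into frame $k$ is simply assumed to be given, since this section does not say how it is obtained.

```python
import math

FRAME_RATE_FPS = 30   # camera frame rate stated in the text
KEYSTROKE_MS = 185    # typical keystroke duration, per [9] as cited in the text
EPSILON_D = 15.0      # default threshold (pixels) stated in the text


def frames_per_keystroke(fps=FRAME_RATE_FPS, duration_ms=KEYSTROKE_MS):
    """Number of frames captured during one keystroke."""
    return duration_ms * fps / 1000.0


def apply_homography(h, point):
    """Map (x, y) through a 3x3 perspective matrix h (a list of rows)."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    return (
        (h[0][0] * x + h[0][1] * y + h[0][2]) / w,
        (h[1][0] * x + h[1][1] * y + h[1][2]) / w,
    )


def coordinate_change(h_j_to_k, tip_j, tip_k):
    """delta_d: distance between the transformed frame-j tip and the frame-k tip."""
    xj, yj = apply_homography(h_j_to_k, tip_j)
    return math.hypot(xj - tip_k[0], yj - tip_k[1])


def is_unchanged(h_j_to_k, tip_j, tip_k, epsilon_d=EPSILON_D):
    """True if the fingertip is considered unchanged between frames j and k."""
    return coordinate_change(h_j_to_k, tip_j, tip_k) < epsilon_d
```

With the default values, `frames_per_keystroke()` evaluates to about 5.55, which matches the "5 or 6 frames" estimate and motivates sampling every 5th frame.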
If the coordinates of a fingertip, $T_i^{(j)'}$ and $T_i^{(k)}$, remain unchanged from the $j$th to the $k$th ($k = j + 5$) frame, the fingertip has stayed on the pressed key during the last five frames, because pressing a key in the $j$th frame, moving away, and coming back to the same key usually takes longer than 185 ms (i.e., more than the duration of 5 frames). In this case, if we have already processed a keystroke in the $j$th frame, we do not process it again in the $k$th frame. Otherwise, we detect a new possible keystroke in the $k$th frame.

In contrast, if the coordinates of the fingertip, $T_i^{(j)'}$ and $T_i^{(k)}$, change from the $j$th to the $k$th frame, we need to further determine whether there is a keystroke in the $k$th frame. To do so, we introduce the $(k-1)$th frame, detect the coordinate of the fingertip as $T_i^{(k-1)}$, and transform $T_i^{(k-1)}$ to the coordinate system of the $k$th frame as $T_i^{(k-1)'}$. Then, we calculate the coordinate change of the fingertip $\delta_d' = \sqrt{(x_i^{(k-1)'} - x_i^{(k)})^2 + (y_i^{(k-1)'} - y_i^{(k)})^2}$ between the $(k-1)$th and the $k$th frames. If $\delta_d' > \epsilon_r$, the fingertip is still moving and there is no keystroke. Otherwise, we detect a possible keystroke in the $k$th frame; keystroke localization is described in the following subsection.

Fig. 12. Coordinate changes in y-axis of ten fingertips. (Horizontal axis: frames, 1 to 6; vertical axis: coordinate changes in y-axis in pixels, 0 to 45; one curve per fingertip, tip 1 to tip 10.)

2) Keystroke Localization: Keystroke localization detects which finger is typing. As shown in Fig. 11, although all fingertips move together during a keystroke, the fingertip $T_k$ pressing a key often has the largest coordinate changes, especially along the y-axis. This is because $T_k$ needs to move towards the target key, stay on the key, and then move away, while the other fingertips often keep hovering or stay on the keyboard without large variation in their coordinates. As shown in Fig. 12, 'fingertip 7', which is pressing a key, has the largest variation of coordinates along the y-axis. For the detected fingertip pressing a key, we further match the coordinate of the fingertip with the location of a key to locate the keystroke.

3) Adaptive Calibration: Considering the possible errors in keystroke detection and localization, we introduce an adaptive calibration scheme for a better typing experience. First, in the user interface, we keep the 'ADD' and 'DELETE' operations. If a typing operation is not detected, the user can use the 'ADD' button in the top right corner of the user interface to input the character on the screen. If the typing operation is wrongly detected/located, the 'DELETE' button in
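The detection and localization rules of subsections 1) and 2) can be sketched in Python as follows. This is only one possible reading of the text, under several assumptions: all function names are hypothetical; the value of $\epsilon_r$ is not given in this section, so it is a required argument; "largest coordinate changes in the y-axis" is interpreted as the largest summed frame-to-frame change in y; and keys are modeled as axis-aligned boxes `(x0, y0, x1, y1)`, which may differ from the key layout the paper actually extracts.

```python
def detect_possible_keystroke(delta_d, delta_d_prev, epsilon_d, epsilon_r,
                              processed_in_frame_j):
    """Decide whether frame k holds a new possible keystroke for one fingertip.

    delta_d      -- change between frame j (transformed) and frame k
    delta_d_prev -- change between frame k-1 (transformed) and frame k
    """
    if delta_d < epsilon_d:
        # Fingertip stayed put since frame j: report it only once.
        return not processed_in_frame_j
    # Fingertip moved since frame j: it must have settled by frame k-1.
    return delta_d_prev <= epsilon_r


def total_y_change(ys):
    """Sum of absolute frame-to-frame changes in y (assumed measure)."""
    return sum(abs(b - a) for a, b in zip(ys, ys[1:]))


def pressing_fingertip(y_tracks):
    """Return the id of the fingertip with the largest y-axis change.

    y_tracks maps a fingertip id to its y coordinates over sampled frames.
    """
    return max(y_tracks, key=lambda tip_id: total_y_change(y_tracks[tip_id]))


def locate_key(tip, key_boxes):
    """Return the label of the key box containing the fingertip, or None."""
    x, y = tip
    for label, (x0, y0, x1, y1) in key_boxes.items():
        if x0 <= x <= x1 and y0 <= y <= y1:
            return label
    return None
```

For example, with tracks for fingertips 6, 7 and 8 in which only fingertip 7 travels tens of pixels in y, `pressing_fingertip` selects 7, and `locate_key` then maps its coordinate to the key whose box contains it.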