This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/JIOT.2021.3114224, IEEE Internet of Things Journal
IEEE INTERNET OF THINGS JOURNAL, VOL. XX, NO. XX, XX 2021
2327-4662 (c) 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

[Figure 14: four panels, each plotting pixel deviation and IoU. (a) Key tracking vs. range of jitters (stationary, slight, obvious, large). (b) Key tracking vs. speed of jitters (stationary, low, medium, high). (c) Key tracking vs. frame sizes (320×240, 480×320, 640×360, 800×480, 1280×720). (d) Key tracking vs. sampling interval (1 to 10).]
Fig. 14. Performance of key tracking under different ranges and speeds of jitters, frame sizes and rates.

[Figure 15: four bar charts. Values are percentages, listed as localization accuracy / localization error / false positive rate / false negative rate.]
(a) Keystroke localization vs. range of jitters.
  stationary: 97.4 / 1.1 / 2 / 1.5
  slight: 95.5 / 1.9 / 2.1 / 2.6
  obvious: 93.7 / 2.9 / 2.5 / 3.4
  large: 89.1 / 3.6 / 4.3 / 7.3
(b) Keystroke localization vs. speed of jitters.
  stationary: 97.4 / 1.1 / 2 / 1.5
  low: 94.8 / 2.3 / 2.5 / 2.9
  medium: 93.3 / 3.4 / 2.7 / 3.3
  high: 89.9 / 2.6 / 4 / 7.5
(c) Keystroke localization vs. different frame sizes.
  320×240: 73.2 / 12.6 / 11.8 / 14.2
  480×320: 81.7 / 7.3 / 3.1 / 11
  640×360: 88.5 / 5.2 / 2.5 / 6.3
  800×480: 95.5 / 1.9 / 2.1 / 2.6
  1280×720: 85.9 / 5.2 / 2.6 / 8.9
(d) Keystroke localization vs. different sample intervals (the duration Nd).
  1: 95.1 / 2.7 / 2.2 / 2.2
  2: 95.7 / 1.9 / 3.1 / 2.4
  3: 94.6 / 2 / 1.5 / 3.4
  4: 95.2 / 2.7 / 2.2 / 2.1
  5: 95.5 / 1.9 / 2.1 / 2.6
  6: 94.8 / 2.9 / 3.4 / 2.3
  7: 93.5 / 3.2 / 1.6 / 3.3
  8: 91.2 / 1.5 / 1.4 / 7.3
  9: 87 / 4.9 / 1.7 / 8.1
  10: 85.9 / 5.8 / 1.4 / 8.4
Fig. 15. Performance of keystroke localization under different ranges and speeds of jitters, frame sizes and rates.

C. Effect of Camera Jitters
In this experiment, we evaluate the performance of key tracking and keystroke localization under different camera jitters. First, we vary the range of camera jitters: stationary, slight (1.28° ± 0.26°), obvious (6.8° ± 4.4°), and large (18.4° ± 3.7°). Here, stationary means the device remains still during typing, while the other levels correspond to different ranges of camera movement, which are controlled by attaching the device to a motor. The performance of key tracking and keystroke localization is shown in Fig. 14(a) and Fig. 15(a), respectively. The results show that key tracking and keystroke localization perform well under slight and obvious jitters. When the camera jitter is obvious, the average pixel deviation is less than 3 pixels, the average IoU reaches 92.3%, and the localization accuracy reaches 93.7%. When the camera suffers from large jitters, key tracking performance degrades noticeably and the localization accuracy drops to 89.1%. This may be caused by a mismatch between the detected fingertip and the key's coordinates during large jitters.
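The two key-tracking metrics reported above, pixel deviation and IoU, can be illustrated with a minimal sketch. This section does not define them precisely, so the sketch makes two assumptions: each key is represented by an axis-aligned bounding box (x_min, y_min, x_max, y_max), and pixel deviation is the Euclidean distance between the centers of the tracked and ground-truth boxes. The function names are hypothetical and are not taken from DynaKey.

```python
from math import hypot
from typing import List, Tuple

# Assumption: a key is an axis-aligned box (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def pixel_deviation(a: Box, b: Box) -> float:
    """Assumed definition: distance in pixels between the two box centers."""
    ax, ay = (a[0] + a[2]) / 2, (a[1] + a[3]) / 2
    bx, by = (b[0] + b[2]) / 2, (b[1] + b[3]) / 2
    return hypot(ax - bx, ay - by)


def mean_tracking_metrics(tracked: List[Box], truth: List[Box]) -> Tuple[float, float]:
    """Average pixel deviation and average IoU over paired key boxes."""
    if not tracked or len(tracked) != len(truth):
        raise ValueError("tracked and truth must be non-empty and of equal length")
    deviations = [pixel_deviation(t, g) for t, g in zip(tracked, truth)]
    ious = [iou(t, g) for t, g in zip(tracked, truth)]
    return sum(deviations) / len(deviations), sum(ious) / len(ious)
```

Under these assumptions, a tracked key that drifts a few pixels from its true position while jitter grows shows up as a larger center distance and a smaller overlap ratio, which is the trend the paragraph describes for large jitters.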
In addition, we evaluate the performance of DynaKey by changing the frequency of jitters, i.e., keeping the camera stationary or moving it at low (0.04° ± 0.03°/s), medium (0.09° ± 0.06°/s), and high (0.2° ± 0.15°/s) speed. The subject types the same text as in the above experiment. As shown in Fig. 14(b) and Fig. 15(b), DynaKey tolerates low and medium camera jitters well. When the camera moves at medium speed, the average pixel deviation is 3.2 pixels, the IoU is 92.2%, and the keystroke localization accuracy reaches 93.3%. In the case of high-frequency jitters, it is hard to guarantee that the updated key coordinates match the fingertip pressing the key, so the tracking and localization performance decreases. The pixel deviation increases to 4.5 pixels, the IoU decreases to 88.9%, and the false negative rate for keystroke localization increases to 7.5%. However, given the normal or unconscious head/camera movements during typing, large or high-frequency jitters are rare, so DynaKey performs well in typical cases. Besides, using a camera with a higher frame rate to capture fine-grained camera movements could mitigate the effect of large or high-frequency jitters.

D. Effect of Frame Sizes and Frame Rates
In this experiment, we evaluate how image sizes affect the performance of DynaKey. When the frame size is small, e.g., 480 × 320 pixels, the keyboard occupies too few pixels in the captured frame to be extracted accurately, leading to poor performance (81.7% localization accuracy in Fig. 15(c)). When the frame size increases to 800 × 480 pixels, the performance is good (95.5% localization accuracy). When the frame size increases further to 1280 × 720 pixels, the performance decreases (localization accuracy falls to 85.9%). This may be because, at a higher image resolution, the keyboard contains more pixels, resulting in a larger pixel deviation for key tracking.
Besides, the higher image resolution also incurs a higher image processing cost, which may make processing too slow to handle each keystroke and lead to a higher false negative rate. In practice, to minimize latency and power consumption while guaranteeing keystroke localization performance, the frame size is set to 800 × 480 pixels.
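The frame-size trade-off discussed above can be made concrete with a short sketch that uses the bar values printed in Fig. 15(c). Two points are assumptions rather than statements from the paper: image processing cost is taken to grow roughly with the number of pixels per frame, and the helper names are hypothetical. The sketch also relies on an observation from the printed values, not a definition given in the text: accuracy, error, and false negative rate sum to about 100% in each group.

```python
# Fig. 15(c) values in percent, per frame size (width, height):
# (localization accuracy, localization error, false positive rate, false negative rate)
FIG_15C = {
    (320, 240): (73.2, 12.6, 11.8, 14.2),
    (480, 320): (81.7, 7.3, 3.1, 11.0),
    (640, 360): (88.5, 5.2, 2.5, 6.3),
    (800, 480): (95.5, 1.9, 2.1, 2.6),
    (1280, 720): (85.9, 5.2, 2.6, 8.9),
}


def best_frame_size(results):
    """Frame size with the highest localization accuracy."""
    return max(results, key=lambda size: results[size][0])


def pixel_ratio(size, reference):
    """Pixels per frame relative to a reference size (assumed proxy for processing cost)."""
    return (size[0] * size[1]) / (reference[0] * reference[1])


def is_consistent(row, tolerance=0.2):
    """Observed pattern in the figure: accuracy + error + false negative rate is about 100%."""
    accuracy, error, _false_positive, false_negative = row
    return abs(accuracy + error + false_negative - 100.0) <= tolerance


if __name__ == "__main__":
    best = best_frame_size(FIG_15C)
    print("highest accuracy at", best)
    print("1280x720 pixels relative to 800x480:", pixel_ratio((1280, 720), (800, 480)))
```

Under the pixel-count assumption, a 1280 × 720 frame carries 2.4 times as many pixels as an 800 × 480 frame, which is in line with the higher processing cost and higher false negative rate described for the largest frame size.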