[Figure 6: The difference of phase change between the pushing hand and the tapping finger. (a) I/Q waveforms; (b) complex I/Q traces.]

In order to mitigate the effect of static multipaths, we take two factors into consideration. First, we use the magnitude of the phase change caused by the reflection from the moving body part to remove large movements. As shown in Figure 6, the magnitude of the signal change caused by hand movement is about 10 times larger than that caused by tapping a single finger. As a result, we set a threshold, called "FingerInterval", on the magnitude gap between adjacent peaks and valleys to isolate finger movements from other movements. Second, there are many fake extreme points, as shown in Figure 7, which are caused by noise in the static vector. We use the speed of finger tappings to exclude these fake extreme points. As shown in Figure 8(d), a finger tap lasts only 150 ms on average, which lets us bound the speed of the path length change during a tap. Since the ultrasound phase changes by 2π whenever the movement causes a path length change equal to one ultrasound wavelength, we set a threshold, called "SpeedInterval" in PVE, on the time duration of a π/2 phase change. Using this model, we can exclude fake extreme points in the signal: if the interval between two consecutive extreme points in the I or Q component falls outside the range of "SpeedInterval", we treat the latter as a fake extreme point. Note that this approach only measures phase changes in integer multiples of π/2, so it estimates distance with a granularity of about 5 mm. To further reduce the measurement error, we use the peak and valley nearest to the beginning and the end of the movement to estimate the starting and ending phases, taking the sum of the last valley and peak of each component as the static vector for this estimate. To mitigate dynamic multipaths, we also combine the results measured at different frequencies using linear regression.
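To make the peak and valley filtering concrete, the sketch below shows one plausible way to apply the two thresholds; it is an illustration under our own assumptions, not the paper's implementation. The names find_extrema, FINGER_INTERVAL, and SPEED_MIN_MS/SPEED_MAX_MS, the numeric thresholds, and the exact comparison logic are all placeholders. As a sanity check on the stated granularity: a π/2 phase step corresponds to a quarter-wavelength path change, and for a carrier near 18 kHz (the exact carrier may differ), λ = c/f ≈ 343/18000 ≈ 19 mm, so λ/4 ≈ 4.8 mm, consistent with the roughly 5 mm figure above.

import numpy as np

# Placeholder thresholds; the paper does not give numeric values here.
FINGER_INTERVAL = 400.0                  # gaps above this look like a whole-hand move
SPEED_MIN_MS, SPEED_MAX_MS = 5.0, 120.0  # plausible duration of a pi/2 phase change

def find_extrema(x):
    """Return indices of local peaks and valleys in a 1-D I or Q waveform."""
    turns = np.diff(np.sign(np.diff(x)))
    return np.where(turns != 0)[0] + 1

def filter_extrema(x, t_ms):
    """Keep only extreme points consistent with a real finger tap.

    x:    one demodulated baseband component (I or Q), as a numpy array
    t_ms: sample timestamps in milliseconds, same length as x
    """
    idx = find_extrema(x)
    kept = []
    for i, j in zip(idx[:-1], idx[1:]):
        mag = abs(x[j] - x[i])   # peak-to-valley magnitude gap
        dt = t_ms[j] - t_ms[i]   # time taken by one pi/2 phase change
        if mag > FINGER_INTERVAL:
            continue             # ~10x larger swing: hand movement, not a finger tap
        if not (SPEED_MIN_MS <= dt <= SPEED_MAX_MS):
            continue             # too fast or too slow for a tap: fake extreme point
        kept.append(j)
    # Each surviving extreme point marks a pi/2 phase step, i.e. roughly a
    # quarter-wavelength (~5 mm) change in the reflection path length.
    return kept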
[Figure 7: Peak and valley estimate, with the extreme points and fake extreme points marked on the I/Q waveform.]

6 FINGER TAPPING DETECTION
In this section, we present the finger tapping detection algorithm, which combines the information captured by the camera and the microphones to achieve better accuracy.

6.1 Finger Motion Pattern
Tapping in the air is slightly different from tapping on physical devices. Due to the absence of haptic feedback from physical keys [9], it is hard for the user to perform concurrent finger tappings in the air and to resolve the typing sequence using visual feedback. Furthermore, on virtual keypads, the user must first move the hand to locate the key and then tap from the top of the key. As a result, we mainly focus on supporting one-finger/one-hand typing in this work and leave two-hand typing as future work.

We divide the finger movement during the tapping-in-the-air process into three states. The first is the "moving state", during which the user moves their finger to the key that they want to press. In this state, the movement pattern of the fingers and hands is quite complex, owing to the various ways of pressing different keys on virtual displays, and it is difficult to build a model for the video and ultrasound signals. Therefore, we only detect the state, without wasting computational resources and energy on analyzing the complex pattern. The second state is the "locating state", where the user keeps the finger on the target key position briefly before tapping it. Although this pause can hardly be perceived by humans, it can be clearly detected by the ultrasound or the 120 fps video; its average duration is 386.2 ms, as shown in Figure 8. During this state, both the video and the audio signals remain static for a short interval, because the finger is almost still. The third state is the "tapping state", where the user slightly moves the finger up and down to press the key. To detect the finger tap, we further divide the "tapping state" into two sub-states, the "tapping down state" and the "tapping up state".
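To illustrate the three-state model (with the tapping state split into its two sub-states), a minimal state-machine sketch follows. The state names, the transition predicates (is_static, moving_down, moving_up), and the reset-to-moving transition are our own illustrative assumptions about how per-frame motion features from the ultrasound and the 120 fps video could drive the detector; they are not the paper's implementation.

from enum import Enum, auto

class TapState(Enum):
    MOVING = auto()        # finger travels toward the target key
    LOCATING = auto()      # brief pause over the key (386.2 ms on average)
    TAPPING_DOWN = auto()  # finger presses down (whole tap lasts ~150 ms)
    TAPPING_UP = auto()    # finger lifts back up

def next_state(state, is_static, moving_down, moving_up):
    """Advance the detector by one frame of motion features.

    The three boolean features are hypothetical per-frame observations
    derived from the ultrasound phase and the video.
    """
    if state is TapState.MOVING and is_static:
        return TapState.LOCATING        # short pause over the target key
    if state is TapState.LOCATING and moving_down:
        return TapState.TAPPING_DOWN    # key press begins
    if state is TapState.TAPPING_DOWN and moving_up:
        return TapState.TAPPING_UP      # finger rebounds
    if state is TapState.TAPPING_UP and not moving_up:
        return TapState.MOVING          # tap complete: a keystroke is reported here
    return state                        # otherwise remain in the current state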
We use RM-ANOVA to analyze the motion pattern of in-air finger tappings. Five volunteers participated in our user study. Each user tapped on a virtual QWERTY keyboard in an AR environment with a single index finger for five minutes. The virtual keyboard is rendered on the screen of the smartphone. Since the resolution of the smartphone used in our experiments is 1920 × 1080, we set the size of the virtual keys to 132 × 132 pixels.
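The repeated-measures analysis itself could be run as in the sketch below, using the AnovaRM class from statsmodels. The long-format table layout, the column names (subject, state, duration_ms), and all numeric values are placeholders invented for illustration; only the overall shape (five subjects, one mean duration per state) mirrors the study described above.

import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Hypothetical table: one row per (volunteer, state), holding the mean
# duration of that state over the five-minute session. Values are
# placeholders, not measured data.
data = pd.DataFrame({
    "subject":     [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5],
    "state":       ["moving", "locating", "tapping"] * 5,
    "duration_ms": [420, 391, 152, 455, 380, 148, 398, 402, 149,
                    430, 375, 155, 441, 383, 147],
})

# Within-subjects ANOVA: do mean durations differ across the three states,
# with each volunteer serving as their own control?
result = AnovaRM(data, depvar="duration_ms", subject="subject",
                 within=["state"]).fit()
print(result.summary())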
We use a 120 fps video camera to capture the in-the-air tapping procedure and perform offline computer vision processing to analyze the users' behavior. The offline analysis is manually verified to remove incorrect state segments. The statistical results of the user study are shown in Figure 8. In general, tapping a single key on the virtual display goes through all three states. However, we still find three different types of patterns. The first pattern corresponds to the