QGesture: Quantifying Gesture Distance and Direction with WiFi Signals

NAN YU, State Key Laboratory for Novel Software Technology, Nanjing University, China
WEI WANG, State Key Laboratory for Novel Software Technology, Nanjing University, China
ALEX X. LIU, Department of Computer Science and Engineering, Michigan State University, USA
LINGTAO KONG, State Key Laboratory for Novel Software Technology, Nanjing University, China

Many HCI applications, such as volume adjustment in a gaming system, require quantitative gesture measurement for metrics such as movement distance and direction. In this paper, we propose QGesture, a gesture recognition system that uses CSI values provided by COTS WiFi devices to measure the movement distance and direction of human hands. To achieve high accuracy in measurements, we first use a phase correction algorithm to remove the phase noise in CSI measurements. We then propose a robust estimation algorithm, called LEVD, to estimate and remove the impact of environmental dynamics. To separate gesture movements from daily activities, we design simple gestures with unique characteristics as preambles to determine the start of the gesture. Our experimental results show that QGesture achieves an average accuracy of 3 cm in the measurement of movement distance and more than 95% accuracy in movement direction detection in the one-dimensional case. Furthermore, it achieves an average absolute direction error of 15 degrees and an average accuracy of 3.7 cm in the measurement of movement distance in the two-dimensional case.

CCS Concepts: • Human-centered computing → Ubiquitous and mobile computing systems and tools;

Additional Key Words and Phrases: Gesture Recognition, WiFi Signals, Wireless Sensing

ACM Reference Format: Nan Yu, Wei Wang, Alex X. Liu, and Lingtao Kong. 2018. QGesture: Quantifying Gesture Distance and Direction with WiFi Signals. Proc. ACM Hum.-Comput. Interact. 1, 4, Article 39 (March 2018), 22 pages.
https://doi.org/0000001.0000001

1 INTRODUCTION

Recently, a number of interesting WiFi-based gesture recognition schemes have been proposed [1, 8, 19, 24, 29]. As human bodies are mostly made of water, they reflect WiFi signals and introduce distortions in the received signal when they move. Different gestures cause different types of distortions in WiFi signals. Thus, by analyzing the changes in WiFi signals, we can recognize the corresponding gesture. WiFi-based gesture recognition has many advantages over traditional approaches that use cameras [4] or wearable sensors [10, 28, 36]. For example, WiFi-based gesture recognition requires neither lighting nor carrying any devices. It also provides better coverage as WiFi signals can penetrate through walls.

Authors' addresses: Nan Yu, State Key Laboratory for Novel Software Technology, Nanjing University, State Key Laboratory for Novel Software Technology, Nanjing, Jiangsu, China; Wei Wang, State Key Laboratory for Novel Software Technology, Nanjing University, State Key Laboratory for Novel Software Technology, Nanjing, Jiangsu, China; Alex X. Liu, Department of Computer Science and Engineering, Michigan State University, Computer Science and Engineering, USA; Lingtao Kong, State Key Laboratory for Novel Software Technology, Nanjing University, State Key Laboratory for Novel Software Technology, Nanjing, Jiangsu, China.

ACM acknowledges that this contribution was authored or co-authored by an employee, contractor, or affiliate of the United States government. As such, the United States government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for government purposes only.

© 2018 Association for Computing Machinery. 2573-0142/2018/3-ART39 $15.00 https://doi.org/0000001.0000001

Proceedings of the ACM on Human-Computer Interaction, Vol. 1, No. 4, Article 39. Publication date: March 2018.
One of the most important applications of WiFi-based gesture recognition is to interact with smart home devices. Existing home appliances use physical interfaces, such as knobs and levers, to provide quantitative inputs, including volume adjustment for TVs and brightness adjustment for lights. These physical inputs allow the user to fine-tune the input value based on immediate feedback. It is difficult to emulate these physical inputs using popular voice-based interactions provided by Amazon Echo or Google Home. However, WiFi-based gesture control can enable such fine-grained quantitative control. For example, the user can push his hand forward to increase the volume of the TV set, where the magnitude of the volume increase is proportional to the pushing distance. To enable this, we need not only to recognize different predefined gestures, but also to quantify the gesture movement distance at a granularity of a few centimeters, so that the system can adjust the volume according to the distance that the user pushes his hand while providing audio feedback on the current volume setting throughout the pushing process. In this way, the user can quantitatively adjust the volume to the desired value using a single action rather than repeating the gesture to increase or decrease the volume by a small amount each time.

The task of using Radio Frequency (RF) signals obtained from commercial hardware to measure the gesture movement distance and direction is difficult. Prior systems that use WiFi measurements from commercial devices often require whole-body movements, such as walking, to track movement speeds and directions [3, 24, 33, 37]. With the coarse measurements from commercial WiFi devices, existing schemes cannot trace hand/finger movements, which introduce weaker WiFi signal distortions than whole-body movements, at a fine granularity. They can only recognize hand/finger movements by matching them to predefined gesture patterns [1, 18].
Using customized ASIC chips based on 60 GHz radar technology, Google's recent Soli system can quantify micro gestures so that those gestures can serve as human input for small wearable devices (such as smart watches) whose touch screens are too small for humans to conveniently provide input [19]. However, due to the fast decay of 60 GHz signals in the air, 60 GHz systems require the gesture to be performed within tens of centimeters [31]. This limited operational range makes them unsuitable to serve as remote control interfaces for home appliances.

In this paper, we propose QGesture, a Quantifying Gesture distance and direction system, which uses Commercial Off-The-Shelf (COTS) WiFi devices to measure the movement distance and direction of human hands. Figure 1 shows the basic system structure of QGesture. When the user pushes towards the target device, the device collects Channel State Information (CSI), which is perturbed by the WiFi signal reflected by the moving hand. The signal reflected by the hand appears as a dynamic vector component in the CSI values, which causes the complex-valued CSI measurement to rotate. The movement distance can be calculated from the phase change of the complex-valued CSI measurement, and the movement direction can be determined from the rotation direction. Therefore, the user can push forward to increase the volume and pull away to reduce it, with the amount of increase/decrease determined by the movement distance. As the perturbation of WiFi signals can be captured at a long distance, QGesture can work at a distance as far as 2 meters. QGesture is the first step towards quantitative remote control for home appliances. It shows the feasibility of fine-grained distance/direction measurement of hand movement over a few meters using COTS WiFi devices. Note that currently only a limited number of WiFi network card models can provide CSI measurements [12] and CSI is not available on smartphones.
We envision that more commercial WiFi devices will open their CSI information so that our approach can be deployed on smartphones in the near future.

There are four key challenges that need to be addressed in designing QGesture.

• Reconstruct the phase of CSI measurements: The phase of CSI measurements is important for determining the movement directions [29]. However, due to hardware imperfections in COTS WiFi Network Interface Cards (NICs), there are Carrier Frequency Offsets (CFO) and Sampling Frequency Offsets (SFO) between the transmitter and the receiver [17, 34]. Both the CFO and SFO introduce high variations in the phase of CSI, and these variations are sensitive to temperature and hardware conditions. Therefore, it is difficult to predict and remove such phase
variations without disturbing the small phase changes caused by hand movements. To address this challenge, we carefully analyze the phase offsets in different antenna pairs and design our phase correction algorithm so that phase changes of hand movements can be preserved. Hence, we can determine the movement direction with an accuracy of more than 95%.
• Separate the channel state changes caused by the moving hands from the mixture of changes caused by other body parts: This is particularly important for a gesture recognition system to operate over a long distance, i.e., several meters, because such a system captures both the gesture movements and the environmental dynamics. When the user performs the gesture, their torso and arms also move at the same time, which significantly perturbs the measurements of the wireless channel. To address this challenge, we analyze the CSI signals and find the typical signal frequencies generated by gestures, which are different from those generated by movements of other body parts. We then design a robust estimation algorithm, called LEVD, to remove the impact of environmental dynamics.
• Separate gesture movements from daily activities: Daily activities, such as walking and sitting down, also distort the wireless channel state information. To ensure that QGesture only responds to the channel distortion caused by specific gestures, we design simple gestures with unique characteristics as preambles to determine the start of the gesture. Our experimental results show that QGesture can efficiently recognize the preamble with an accuracy of 92.5% and a low False Positive Rate (FPR) of 3.2%.
• Accommodate arbitrary pushing angles: The phase changes of CSI measurements are determined by the changes in path length, which depend on the movement angle and the position of the hand with respect to the sender and receiver.
When the hand moves along the line connecting the sender and receiver, the path length changes by twice the movement distance. However, when the movement is in other directions, we may get a smaller path length change for the same movement distance. To allow pushing along an arbitrary angle, we need to perform 2D tracking of the hand. To address this challenge, we propose to use multiple receivers to track the path length changes of different paths at the same time. By doing this, we can triangulate the position of the hand and measure both the pushing angle and the movement distance. Our experimental results show that we can measure the movement angle with an accuracy of 15 degrees and the movement distance with an accuracy of 3.7 cm.

We implemented QGesture using COTS WiFi routers and laptops. Our experimental results show that QGesture can measure the gesture movement distance with an accuracy of 3 cm within a distance of 1 meter in normal indoor environments. QGesture can also reliably detect the hand movement direction with an accuracy of more than 95% in the one-dimensional case. Furthermore, it achieves an average absolute direction error of 15 degrees and an average accuracy of 3.7 cm in the measurement of movement distance in the two-dimensional case.

2 RELATED WORK

We classify existing related gesture systems into two groups: RF-based recognition/tracking and non-RF-based recognition/tracking. Considering the way the RF signal is collected, we further classify RF-based systems into two categories: RF-based recognition/tracking using COTS hardware and RF-based recognition/tracking using specialized devices.

RF-based Recognition/Tracking Using COTS Hardware: Most recognition and tracking systems based on COTS hardware use the Received Signal Strength Indicator (RSSI) or CSI obtained from WiFi NICs to capture gesture signals [1, 5, 7, 13, 18, 22, 25, 26]. The WiKey scheme proposed to use CSI dynamics to recognize micro human activities such as keystrokes [6].
The WiFinger scheme used CSI to recognize a set of eight gestures with an accuracy of 93% [26]. The WiGest scheme used three wireless links to recognize a special set of gestures, where user hands blocked the signal and thus introduced significant RSSI changes, and achieved a recognition
accuracy of 96% [1]. However, most of these systems only recognized a predefined set of gestures without considering movement distance/direction measurements. WiDir used WiFi CSI to estimate the direction of whole-body movements, such as walking, with an error of 10 degrees [33]. For small hand movements, WiDraw used the Angle-Of-Arrival (AOA) measurement to achieve a tracking accuracy of 5 cm [25]. However, the AOA-based approach also had a limited working range of less than 2 feet, so it cannot be used as a remote control in HCI applications. QGesture is inspired by previous WiFi CSI processing technologies, including the noise removal algorithm, basic phase correction algorithm, and preamble gesture design. QGesture advances the state-of-the-art design by capturing the small phase variations caused by hand movements at a long distance. In addition to WiFi-based schemes, existing schemes also use COTS RFID readers and tags to track gestures [9, 28]. However, these systems require users to wear RFIDs or operate close to the RFID array, which makes them inconvenient to use.

Fig. 1. QGesture system overview. (Figure labels: target device, WiFi router, WiFi signal, CSI values, volume increase.)

RF-based Recognition/Tracking Using Specialized Devices: RF signals can also be captured by specialized devices such as software radio systems. Software radio systems, such as USRP or WARP, have access to the fine-grained baseband signal so that they can quantify hand/finger movement distance and speed [2, 3, 14, 31, 35]. WiSee used USRP software radio to identify and classify nine whole-body gestures with an accuracy of 94% [24]. WiTrack used specially designed Frequency-Modulated Continuous-Wave (FMCW) radar with a high bandwidth of 1.79 GHz to track human movements behind the wall with a resolution of about 11 cm to 20 cm [2, 3]. WiDeo used the WARP hardware to enable a tracking accuracy of 7 cm for multiple objects [14].
AllSee used a specially designed analog circuit to extract the envelopes of the received signals and recognize gestures within a short distance of 2.5 feet [15]. While these systems provided valuable insights into the dynamics of the wireless signal, tracking with coarse-grained CSI measurements requires a different set of signal processing algorithms.

Non-RF-based Recognition/Tracking: Gesture recognition can be enabled by non-RF-based technologies, including computer vision, wearable devices, and sound waves. Computer-vision-based gesture recognition uses cameras and infrared sensors to reconstruct the depth information from videos. The distance measurement accuracy of computer-vision-based solutions can be a few millimeters when the target is within one meter [32], but the depth accuracy degrades to a few centimeters for an operational range of 5 meters [16]. The key limitation of computer-vision-based solutions is that the accuracy is highly dependent on the viewing angle and lighting conditions. Moreover, users may also have privacy concerns about video-camera-based solutions. Sound waves can be used to measure moving distances [23, 38, 39] or moving speeds [11]. When the user is holding the
device, sound wave based solutions can provide a distance measurement accuracy of a few centimeters [23, 38]. Due to the weakness of the sound energy reflected by the hand, device-free gesture recognition solutions mostly use the Doppler effect, which only provides low-resolution speed measurements that cannot be used for fine-grained control over a long distance [11]. Recent fine-grained tracking solutions only work within a short distance of 50 cm [21, 30]. QGesture uses a phase-based distance measurement algorithm similar to that of LLAP [30]. However, our long-range WiFi gesture tracking system needs to handle phase noise and interference from nearby movements, which can be ignored in short-range sound-based systems.

3 SYSTEM MODEL

In this section, we first present the theoretical model that quantifies the gesture movement distance and direction. We then discuss the noise sources that make CSI measurements from COTS devices deviate from the theoretical model. Finally, we present methods to remove the CFO and SFO in CSI measurements so that we can measure the movement distance and direction using the theoretical model.

3.1 Modeling Phase-Distance Relationship

The magnitude and phase changes in CSI are closely related to the distance and direction of gesture movements. For simplicity, we first consider signals traveling through only two paths, i.e., the Line-Of-Sight (LOS) path (path A) and the hand-reflected path (path B), between a transmitter/receiver pair, as shown in Figure 2. In theory, the resulting Channel Frequency Response (CFR) $H(f,t)$ in the CSI measured at time $t$ can be represented as [29, 34]:

$$H(f,t) = a_A(f,t)\,e^{-j2\pi f \tau_A(t)} + a_B(f,t)\,e^{-j2\pi f \tau_B(t)}, \tag{1}$$

where $j$ is the imaginary unit with $j^2 = -1$, $f$ is the frequency of the WiFi signal, and $a_A(f,t)$ and $a_B(f,t)$ represent the magnitude attenuation and the initial phase of paths A and B.
As the lengths of paths A and B differ, their propagation delays $\tau_A(t)$ and $\tau_B(t)$ also differ, since $\tau_A(t) = l_A(t)/c$, where $l_A(t)$ is the length of path A and $c$ is the speed of light (and likewise for path B).

The CFR $H(f,t)$ contains two components: one static component for path A and one dynamic component for path B, as shown in Figure 3. Furthermore, the magnitude of the static component differs across antenna pairs and subcarriers as a result of different propagation delays and carrier frequencies, as shown in Section 5.5. Note that the CFR $H(f,t)$ is a complex value, whose real and imaginary parts are called the In-phase (I) part and the Quadrature (Q) part, respectively. Therefore, when we plot the CFR in the complex plane, the CFR value at each time instance is a vector, and the end of the vector draws an I/Q trace as time evolves. When the hand pushes towards the transmitter/receiver, the I/Q trace for a single subcarrier is an arc, as shown in Figure 3. This is because when the hand moves, the vector for path A, which is $a_A(f,t)\,e^{-j2\pi f \tau_A(t)}$, does not change, as both the transmitter and the receiver remain static. The vector for path A is therefore a static component. However, the vector for path B, which is $a_B(f,t)\,e^{-j2\pi f \tau_B(t)}$, changes significantly when the path length $l_B(t)$ changes. When $l_B(t)$ decreases, the attenuation $a_B(f,t)$ changes only slowly, while the phase $\varphi_B(t) = -2\pi f \tau_B(t) = -2\pi f\, l_B(t)/c$ increases significantly. For WiFi signals at 5 GHz, the radio wavelength $\lambda = c/f$ is about 6 cm. Therefore, the phase of the vector corresponding to path B, which is $\varphi_B(t) = -2\pi l_B(t)/\lambda$, increases by $2\pi$ when $l_B(t)$ is reduced by one radio wavelength of 6 cm. By measuring the phase change $\Delta\varphi_B$ of the dynamic component, we can get the movement distance $d$ as:

$$d = -\frac{\Delta\varphi_B\,\lambda}{2a\pi}, \tag{2}$$
where $a$ is the ratio of the path length change to the movement distance. For example, we have $a = 2$ for the scenario in Figure 2. Thus, we can measure the movement distance with an accuracy of a few centimeters. Furthermore, we can determine the movement direction from the sign of $\Delta\varphi_B$.

Fig. 2. Scenarios with multiple paths.

3.2 Practical Issues

We need to consider three practical issues before we can fit the above theoretical model to real CSI measurements.

Static Multipath between Transmitter and Receiver: In indoor environments, the wireless signal is reflected by surrounding objects such as walls, furniture, or doors. When the reflectors are static, e.g., paths A and C in Figure 2, we can treat the combination of these static paths as a single static component. Consider the case where there are $n$ paths in total, where the $n$-th path is the dynamic path and the other paths are static. Eq. (1) can be rewritten as:

$$H(f,t) = \underbrace{\sum_{i=1}^{n-1} a_i(f,t)\,e^{-j2\pi l_i(t)/\lambda}}_{\text{static component}} + \underbrace{a_n(f,t)\,e^{-j2\pi l_n(t)/\lambda}}_{\text{dynamic component}}. \tag{3}$$

Because the signal strength $a_i(f,t)$ and path length $l_i(t)$ of the static paths are constant in Eq. (3), all static multipaths can be treated as a single path whose I/Q vector is the sum of the vectors of all static paths.

Dynamic Multipath through Moving Reflector: The signal reflected by the moving reflector could also travel through multiple paths, such as paths B and D in Figure 2. In this case, all these paths have time-varying path lengths when the reflector moves. These time-varying paths generate multiple time-varying components in the received signal. In practice, these components do not significantly affect the measurement accuracy, since the strength of the received reflected signal is dominated by the direct reflection path, e.g., path B in Figure 2.
Human hands have small areas, and the signals reflected by hands are weaker than the signals traveling through the static multipaths. Therefore, if the signal is reflected more than once, e.g., by both the hand and the wall as in path D, the signal strength is further attenuated. These multipath components are small compared to the direct reflection path, so we can ignore them. As we will show in our experiments, the measurement error of QGesture increases only slightly when there are strong dynamic multipaths, e.g., when pushing close to a wall.

Noises in CSI Measurements: Commercial CSI measurements contain various types of noise [29, 34]. The phase of CSI measurements is especially noisy due to the existence of Carrier Frequency Offset (CFO) and Sampling Frequency Offset (SFO). The hand movement only changes the phase of the CSI by a small amount, e.g., $\theta_1$ in Figure 3. Therefore, without accurate phase information, we can only obtain a rough distance measurement
using the CSI magnitude as in [29], where the movement direction information is lost. Furthermore, the magnitude of CSI measurements is also noisy due to measurement errors. So, we need to first remove the phase and magnitude noise in CSI measurements before applying our theoretical model.

Fig. 3. I/Q Phasor Representation.

3.3 CSI Noise Sources

There are three types of noise in CSI measurements. The first two are phase noise and the last one is magnitude noise.

Carrier Frequency Offset (CFO): Due to the small frequency difference between the carriers at the transmitter and receiver, the received CSI $H'(f,t)$ contains an extra phase shift, $H'(f,t) = e^{-j2\pi \delta f\, t} H(f,t)$, where $\delta f$ is the CFO between the sender and the receiver [27, 29, 34]. As the IEEE 802.11n standard allows the CFO between the transmitter and receiver to be as large as 100 kHz, the CFO introduces large phase uncertainties in CSI measurements. For example, when the CSI is measured at a rate of 4,000 frames per second, the phase change caused by CFO could be $50\pi$ between two consecutive frames that are separated by a small time interval of just 0.25 ms. Thus, the phase change caused by CFO is much larger than the phase changes caused by hand movements, which are less than 1 radian, as shown in Figure 3. Even a small measurement error in the time interval between consecutive frames leads to a large phase offset, which makes the phase of CSI appear random.

Sampling Frequency Offset (SFO) and Packet Boundary Detection (PBD) Error: These two error sources have similar effects, which add another phase offset to the CSI, $H''(f,t) = e^{-j2\pi k\phi} H'(f,t)$, where $k$ is the index of the OFDM subcarrier and $\phi$ is the phase offset [34].
Unlike the phase offset caused by CFO, which accumulates over time, the phase offsets of SFO and PBD vary linearly across OFDM subcarriers. Moreover, the slope $\phi$ of the SFO/PBD offset differs from frame to frame.

Magnitude Variations: Due to variations in transmission power and environmental noise, the magnitude of the CSI measurements also has large variations. These magnitude variations often contain high-energy impulses that could bury the small magnitude changes caused by hand movements [29].

3.4 Denoising CSI Measurements

We denoise CSI measurements in two steps:

1. Phase Correction: As discussed in Section 3.3, there are CFO, SFO, and PBD phase offsets in CSI measurements. Fortunately, CSI values are simultaneously measured on 30 OFDM subcarriers for each antenna pair of the transmitter/receiver. For a transmitter with 2 antennas and a receiver with 3 antennas, we obtain 2 × 3 × 30 = 180 CSI values for each WiFi frame. We can utilize the redundancy in CSI measurements to perform phase correction.
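As a quick numerical check of the noise model in Section 3.3, the following Python sketch reproduces the $50\pi$ figure and applies the CFO and SFO/PBD rotations to an ideal CSI vector. The function names, the subcarrier index grid, and the slope value of 0.01 are illustrative assumptions and are not taken from QGesture.

```python
import numpy as np

def cfo_phase_step(cfo_hz, frame_rate_hz):
    """Phase (radians) accumulated by a CFO of cfo_hz between consecutive frames."""
    return 2 * np.pi * cfo_hz / frame_rate_hz

def apply_offsets(H, k, cfo_hz, t, phi):
    """Return H'' = exp(-j*2*pi*k*phi) * exp(-j*2*pi*cfo_hz*t) * H."""
    return np.exp(-2j * np.pi * k * phi) * np.exp(-2j * np.pi * cfo_hz * t) * H

# 100 kHz CFO at 4,000 frames per second (0.25 ms between frames).
step = cfo_phase_step(100e3, 4000)
print(step / np.pi)  # approximately 50

# Assumed 30-subcarrier index grid (illustrative only).
k = np.concatenate([np.arange(-28, 0, 2), [-1, 1], np.arange(3, 28, 2), [28]])
H = np.ones(k.size, dtype=complex)  # ideal, noise-free CSI

# With CFO switched off, the remaining SFO/PBD phase is linear in k.
noisy = apply_offsets(H, k, cfo_hz=0.0, t=0.0, phi=0.01)
slope = np.polyfit(k, np.angle(noisy), 1)[0]
print(slope / (-2 * np.pi))  # approximately 0.01, the assumed slope
```

The first result shows why the raw phase looks random from frame to frame, while the second shows that the SFO/PBD term can be recovered as the slope of a straight line fitted across subcarriers.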
Fig. 4. The effect of phase correction on frames at different time instances: (a) raw CSI phase; (b) denoised CSI phase. Both panels plot the phase (in radians) against the OFDM subcarrier index for t = 0 ms, 1 ms, and 2 ms.

We first remove the highly dynamic phase offset introduced by CFO. To estimate the value of CFO, we first observe that the phase of CSI measurements is mainly determined by three factors: CFO, SFO/PBD, and hand movements. As shown in [27], the phase offset of SFO and PBD is zero on the subcarrier with an index of $k = 0$. Therefore, the phase of subcarrier 0 for each antenna pair only contains the phase of CFO and the impact of hand movements. However, removing the phase of a randomly selected antenna pair could distort the small phase changes caused by hand movements. To preserve the impact of hand movements, we observe that the phase changes caused by the hand movement differ across subcarriers, as shown in Section 5.5. Consider the two subcarriers in Figure 3. Although the magnitudes of the dynamic components are similar for the two subcarriers, the phase change $\theta_2$ in subcarrier 2 is much smaller because the magnitude of the static component for subcarrier 2 is much larger than that for subcarrier 1. In real CSI measurements, we observe that there are subcarriers where the magnitude of the static component is more than ten times higher than that of the dynamic component, as shown in Section 5.5. In such cases, the phase change caused by hand movements in these subcarriers (e.g., $\theta_2$ in subcarrier 2) is smaller than 0.1 rad, which is negligible compared with the phase change in other subcarriers (e.g., $\theta_1$ in subcarrier 1). Therefore, we can pick the CSI phase in subcarrier 0 of the antenna pair that has the largest static-component magnitude to serve as the CFO reference.
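The 0.1 rad bound can be checked with a small geometric argument: if a dynamic vector of magnitude $|D|$ rotates around a static vector of magnitude $|S| > |D|$, the phase of their sum deviates from the phase of the static vector by at most $\arcsin(|D|/|S|)$. The Python sketch below compares this closed form with a brute-force sweep; the function names and the magnitudes used are illustrative assumptions.

```python
import numpy as np

def max_phase_swing(static_mag, dynamic_mag):
    """Largest phase deviation of S + D*exp(j*phi) from the phase of S (|D| < |S|)."""
    return np.arcsin(dynamic_mag / static_mag)

def simulated_phase_swing(static_mag, dynamic_mag, n=3600):
    """Brute-force the same quantity by rotating the dynamic vector through a full turn."""
    phi = np.linspace(0.0, 2 * np.pi, n, endpoint=False)
    total = static_mag + dynamic_mag * np.exp(1j * phi)
    return np.max(np.abs(np.angle(total)))

# Static component ten times stronger than the dynamic one (like subcarrier 2).
print(max_phase_swing(10.0, 1.0), simulated_phase_swing(10.0, 1.0))

# Static component only twice as strong (closer to subcarrier 1).
print(max_phase_swing(2.0, 1.0))
```

With a 10:1 ratio the swing is about 0.1 rad, so a subcarrier with a dominant static component carries almost pure CFO phase, which is the property the reference selection relies on.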
Based on these observations, we remove the CFO offset as follows. We first estimate the magnitude of the static component by taking the long-term average of the CSI magnitude of each antenna pair. We then select the antenna pair with the largest CSI magnitude as the reference. Since the CSI is not measured on subcarrier 0, we interpolate the phases of subcarriers $-1$ and $1$ to get the CSI phase of subcarrier 0 of the selected antenna pair, which serves as the CFO reference. We then subtract the calculated CFO from the phase of all subcarriers of the other antenna pairs.

After removing the CFO, we correct the SFO/PBD offset, $-2\pi k\phi$, for the remaining subcarriers based on a standard algorithm [17]. Note that the slope $\phi$ is different for every WiFi frame. Therefore, we perform a linear regression on the 30 subcarriers of each antenna pair to estimate the slope $\phi$. We then remove the SFO/PBD offset for each subcarrier using the estimated $\phi$.

Figure 4 shows the CSI phase for one antenna pair before and after the phase correction. We observe that the raw CSI phase changes significantly over a short duration of 1 ms, as shown in Figure 4(a). The phase of the same subcarrier can differ by as much as $\pi$, and the slope $\phi$ also differs at different time instances. The corrected phase in Figure 4(b) is more consistent over time. Furthermore, we can observe the small phase changes over different subcarriers after phase correction.

2. Magnitude Correction: After phase correction, we reduce the magnitude noise using oversampling. We measure the CSI value at a high sampling rate of 2,500 samples per second. Note that normal human movements introduce CSI magnitude variations with frequencies in the range of 1 to 100 Hz [29]. Therefore, we can use a low
pass moving average filter with a window size of 80 samples to smooth the CSI magnitude as well as the residual phase noise.

After the phase correction and magnitude correction, the CSI values of highly noisy subcarriers could still be corrupted. So we use subcarrier selection [20] and linear regression to reduce the error.

4 SYSTEM DESIGN

After removing the noise in CSI measurements, QGesture first uses the LEVD algorithm to remove the static component in the complex plane. We then measure the hand movement distance and direction based on the phase-distance relationship introduced in Section 3.1.

4.1 Real-world CSI Measurements

To better understand the CSI measurements provided by COTS devices, Figure 5 plots the CSI values captured by an Intel 5300 network card. After denoising, the I/Q waveforms from COTS devices fit our theoretical model quite well. As the user pulls the hand back by 40 cm, we observe around 16 peaks in the waveform, which indicates that the phase changes by $32\pi$. Using the wavelength of 5.15 cm at 5.825 GHz, we get a path length change of 82.4 cm, which is very close to the 80 cm predicted by our model, as we have $a = 2$ in this case. Furthermore, we observe that the phase is decreasing (the CSI values circle clockwise, as in Figure 5(b)), which indicates that the user is pulling back.

We further observe that the real-world CSI values deviate from the theoretical model in two aspects. First, the static component is not constant: the center of the circles in Figure 5(b) slowly changes. This is mainly due to the slow movements of other body parts of the user, such as the arms or the torso. Second, the magnitude of the dynamic component also changes. This is due to the reduction in the strength of the reflected signal when the hand moves away from the transmitter/receiver.
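The peak-counting arithmetic in the pull-back example of this subsection can be reproduced with Eq. (2). The Python sketch below is only an illustration: the function name, the synthetic path lengths, and the amplitude of 0.5 are assumptions, and the second half uses a noise-free synthetic dynamic component rather than real CSI.

```python
import numpy as np

C = 299_792_458.0  # speed of light in m/s

def distance_from_phase(delta_phi, wavelength, a=2.0):
    """Eq. (2): d = -delta_phi * wavelength / (2 * a * pi)."""
    return -delta_phi * wavelength / (2 * a * np.pi)

lam = C / 5.825e9  # about 5.15 cm

# 16 full cycles with decreasing phase (clockwise rotation, hand pulling back).
d_peaks = distance_from_phase(-32 * np.pi, lam)
print(d_peaks * 100)  # hand displacement in cm, about 41.2

# Synthetic dynamic component for a reflected path that lengthens by 0.8 m.
path = np.linspace(1.0, 1.8, 4000)
dynamic = 0.5 * np.exp(-2j * np.pi * path / lam)
phase = np.unwrap(np.angle(dynamic))  # undo the 2*pi wrapping of the phase
d_sim = distance_from_phase(phase[-1] - phase[0], lam)
print(d_sim)  # about 0.4 m
```

A positive result corresponds to the hand moving away from the transmitter/receiver, and a negative result to pushing towards it, which is how the sign of the phase change encodes direction.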
The slowly changing static component and reflected signal strength make it challenging to measure the phase of the dynamic component. For example, if we use a constant static component estimation, e.g., with the I component equal to 2, the last few small peaks in Figure 5(a) could be ignored and we would underestimate the movement distance.

4.2 Removing Static Components

QGesture uses a Local Extreme Value Detection (LEVD) algorithm to trace the slowly changing static component. The LEVD algorithm first initializes the static component estimation S(t) as the long-term average, e.g., the average value over 2 seconds, of the CSI real part or imaginary part. As the channel coherence time for our WiFi scenario is about 10 ms, averaging over a time period of 2 seconds is enough to smooth out the channel variations. The algorithm uses an empirical threshold T, which is determined by experiments as shown in Section 5.5, to detect local maxima and minima. Once the CSI value deviates from its mean value by more than T, the algorithm starts detecting local maxima and minima. The local maxima and minima must satisfy the following two properties:

1) A local maximum must be larger than the current static component estimation S(t) by at least T, i.e., larger than S(t) + T. Similarly, a local minimum must be smaller than S(t) − T.
2) Local maxima and minima must appear alternately. If there are two consecutive local maxima/minima, we only retain the larger/smaller one.

While detecting the local extrema, we update the static component estimation S(t) dynamically by setting it to the average of the last pair of local maximum and minimum values. In this way, LEVD is able to trace the slowly changing static component. Figure 6(a) shows the local extrema detected by LEVD, which precisely indicate the cycles of the waveform. After removing the estimated static component, LEVD gives a good estimation of the dynamic component of the CSI waveform, as shown in Figure 6(a).
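The LEVD steps described above can be sketched as follows. This is a minimal illustration under our own assumptions, not the authors' implementation: the function name `levd`, its parameters, the neighbor-comparison test for candidate extrema (which presumes an already smoothed waveform), and the synthetic test signal are all hypothetical.

```python
import math

def levd(samples, threshold, init_len):
    """Sketch of LEVD on one real-valued waveform (I or Q part), assumed non-empty.

    Returns (extrema, static): extrema is a list of (index, value, kind) and
    static[i] is the static-component estimate S(t) at sample i.
    """
    n0 = min(init_len, len(samples))
    s = sum(samples[:n0]) / n0           # initial S(t): long-term average
    extrema = []
    static = [s] * len(samples)
    for i in range(1, len(samples) - 1):
        x = samples[i]
        kind = None
        # Property 1: extrema must clear the current estimate by the threshold T.
        if x > samples[i - 1] and x >= samples[i + 1] and x > s + threshold:
            kind = "max"
        elif x < samples[i - 1] and x <= samples[i + 1] and x < s - threshold:
            kind = "min"
        if kind is not None:
            if extrema and extrema[-1][2] == kind:
                # Property 2: of two consecutive maxima/minima keep the larger/smaller.
                prev = extrema[-1][1]
                if (kind == "max" and x > prev) or (kind == "min" and x < prev):
                    extrema[-1] = (i, x, kind)
            else:
                extrema.append((i, x, kind))
            if len(extrema) >= 2:
                # Update S(t) to the average of the last max/min pair.
                s = (extrema[-1][1] + extrema[-2][1]) / 2.0
        static[i] = s
    static[-1] = s
    return extrema, static

# Synthetic check; every number below is made up for illustration.
RATE = 1000                                               # samples per second
t = [i / RATE for i in range(2500)]
true_static = [2.0 + ti for ti in t]                      # slowly drifting center
signal = [c + (3.0 - 0.8 * ti) * math.cos(2 * math.pi * 6 * ti)
          for c, ti in zip(true_static, t)]               # fading 6 Hz oscillation

extrema, static = levd(signal, threshold=0.5, init_len=2 * RATE)
dynamic = [x - s for x, s in zip(signal, static)]         # static component removed

# Worst-case tracking error after the first half second: LEVD vs. plain mean removal.
mean = sum(signal) / len(signal)
levd_err = max(abs(s - c) for s, c in zip(static[500:], true_static[500:]))
mean_err = max(abs(mean - c) for c in true_static[500:])
print(len(extrema), round(levd_err, 2), round(mean_err, 2))
```

On this synthetic drifting waveform, the piecewise-constant LEVD estimate stays much closer to the true center than a single global average does, which is the behavior the text attributes to the algorithm.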
The result of LEVD is better than simply removing the average of the CSI waveform. Figure 6(b) shows the distance estimation of LEVD and the simple average-removal algorithm for a path length change of 80 cm. Using Eq. (2), we find that the pushing distance estimation of LEVD is 40.17 cm, while the distance estimation of simple average-removal is 33.6 cm. We observe that the
simple average-removal algorithm underestimates the distance as it ignores the last few small peaks in Figure 5(a).

Fig. 5. Denoised CSI waveforms of pulling 40 cm backward. (a) I/Q waveforms (I and Q components versus time in seconds). (b) I/Q trace in complex plane (Q component versus I component, with the static component marked).

Fig. 6. Static component removal and phase measurements. (a) Real part after LEVD (I component versus time in seconds). (b) Phase of two algorithms (phase value in radians versus time in seconds, for LEVD and average-removal).

4.3 Measuring Movement Distance and Direction

After removing the static component, we use two different strategies to fuse the CSI values in different subcarriers and antenna pairs for movement distance/direction measurement. The first measurement method is called Principal Component Identification (PCI) [26]. In this method, we select the subcarrier that is most sensitive to the hand movement, i.e., the one with the largest change in phase during the movement, and use this subcarrier for measurements. The second measurement method uses Principal Component Analysis (PCA) on the magnitude of the CSI values over different subcarriers, in a similar way as in [29]. We select the second-largest PCA component, use the Hilbert transform to recover the phase information from the magnitude, and then use the recovered phase information to calculate the movement distance.

These two algorithms have different advantages. The PCI algorithm retains the phase of the CSI values, so we can determine the movement direction using this method. It works well when the hand is close to the sender/receiver. The PCA algorithm works better when the hand is more than 1.5 meters away from the sender/receiver. This is because the PCA algorithm "amplifies" the small changes caused by hand movement using the correlations in multiple subcarriers. However, as the PCA algorithm removes the phase information before processing, we cannot directly use it to detect the movement direction, which depends on whether the phase is increasing or decreasing.
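To make the PCI idea concrete, the sketch below selects the subcarrier with the largest accumulated phase change and converts that phase into a distance and a direction. It is a rough illustration under stated assumptions rather than the method of [26]: the function names, the `"pull"`/`"push"` labels, and the synthetic traces are hypothetical; the distance conversion (one wavelength of path change per 2π of phase, divided by a) follows the worked example in Section 4.1; and the sign convention only assumes, as the text states, that a decreasing phase corresponds to pulling back. The PCA and Hilbert transform path is not covered here.

```python
import cmath
import math

def unwrapped_phase_change(trace):
    """Total phase change (radians) of a complex trace, with 2*pi jumps removed."""
    total = 0.0
    for prev, cur in zip(trace, trace[1:]):
        # Phase of cur relative to prev; valid while each step is well below pi.
        total += cmath.phase(cur * prev.conjugate())
    return total

def pci_measure(dynamic_csi, wavelength_cm, a=2.0):
    """dynamic_csi: one complex trace per subcarrier, static component removed."""
    changes = [unwrapped_phase_change(trace) for trace in dynamic_csi]
    # Pick the subcarrier whose phase changes the most during the movement.
    best = max(range(len(changes)), key=lambda k: abs(changes[k]))
    dphi = changes[best]
    distance_cm = wavelength_cm * abs(dphi) / (2.0 * math.pi) / a
    direction = "pull" if dphi < 0 else "push"   # decreasing phase: pulling back
    return best, distance_cm, direction

# Synthetic check; all values are made up for illustration.
WAVELENGTH_CM = 5.15
N = 2000
path_cm = [80.0 * i / (N - 1) for i in range(N)]          # path grows by 80 cm
phase = [-2 * math.pi * p / WAVELENGTH_CM for p in path_cm]
strong = [cmath.exp(1j * ph) for ph in phase]             # trace circles the origin
weak = [3.0 + cmath.exp(1j * ph) for ph in phase]         # leftover static offset dominates

best, distance_cm, direction = pci_measure([weak, strong, weak], WAVELENGTH_CM)
print(best, round(distance_cm, 2), direction)
```

The `weak` traces model subcarriers in which a residual static offset is larger than the hand reflection, so their phase never completes a full cycle. This also illustrates why the static component has to be removed accurately before the phase is measured.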