Multi-Touch in the Air: Device-Free Finger Tracking and Gesture Recognition via COTS RFID

Chuyu Wang†, Jian Liu‡, Yingying Chen‡, Hongbo Liu*, Lei Xie†, Wei Wang†, Bingbing He†, Sanglu Lu†
†State Key Laboratory for Novel Software Technology, Nanjing University, China
Email: {wangcyu217, hebb}@dislab.nju.edu.cn, {lxie, ww, sanglu}@nju.edu.cn
‡WINLAB, Rutgers University, New Brunswick, NJ, USA
Email: jianliu@winlab.rutgers.edu, yingche@scarletmail.rutgers.edu
*Indiana University-Purdue University, Indianapolis, IN, USA
Email: hl45@iupui.edu

Abstract—Recently, gesture recognition has gained considerable attention in emerging applications (e.g., AR/VR systems) to provide a better user experience for human-computer interaction. Existing solutions usually recognize gestures based on wearable sensors or specialized signals (e.g., WiFi, acoustic and visible light), but they either incur high energy consumption or are susceptible to the ambient environment, which prevents them from efficiently sensing fine-grained finger movements. In this paper, we present RF-finger, a device-free system based on Commercial-Off-The-Shelf (COTS) RFID, which leverages a tag array on a letter-size paper to sense the fine-grained finger movements performed in front of the paper. Particularly, we focus on two kinds of sensing modes: finger tracking recovers the moving trace of finger writings; multi-touch gesture recognition identifies the multi-touch gestures involving multiple fingers. Specifically, we build a theoretical model to extract the fine-grained reflection feature from the raw RF signal, which describes the finger's influence on the tag array at cm-level resolution.
For the finger tracking, we leverage K-Nearest Neighbors (KNN) to pinpoint the finger position relying on the fine-grained reflection features, and obtain a smoothed trace via a Kalman filter. Additionally, we construct the reflection image of each multi-touch gesture from the reflection features by regarding the multiple fingers as a whole. Finally, we use a Convolutional Neural Network (CNN) to identify the multi-touch gestures based on the images. Extensive experiments validate that RF-finger can achieve accuracy as high as 88% and 92% for finger tracking and multi-touch gesture recognition, respectively.

I. INTRODUCTION

With the flourishing of ubiquitous sensing techniques, human-computer interaction is undergoing a reform: natural human gestures, e.g., finger movements in the air, are progressively replacing traditional typing-based input devices such as keyboards to provide a better user experience. Such gesture-based interactions have promoted the development of both Virtual Reality (VR) and Augmented Reality (AR) systems, where users can directly control virtual objects by performing gestures in the air, e.g., writing words, manipulating a tellurion or playing VR games. Toward this end, gesture-based interaction can further enable operations on smart devices in Internet-of-Things (IoT) environments, e.g., drawing the curtains or controlling smart TVs.

Yingying Chen and Lei Xie are the co-corresponding authors.

Fig. 1. Illustrations of applications of RF-finger: tracking the finger on smart devices, recognizing multi-touch gestures, and manipulating objects in VR gaming via an RFID tag array.

Therefore, accurately recognizing gestures in the air, especially fine-grained finger movements, has great potential to provide a better user experience in emerging VR applications and IoT manipulations, which are projected to have a market value of USD 48.56 billion by 2024 [2].
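As a rough illustration of the tracking pipeline described in the abstract (a KNN lookup over reflection features, followed by Kalman smoothing), the sketch below uses a synthetic fingerprint database in place of measured reflection features. This is a toy example, not the authors' implementation: the tag geometry, feature model, and noise parameters are all invented for illustration.

```python
# Illustrative sketch (not the authors' code): KNN position lookup over a
# fingerprint grid of reflection features, followed by simple Kalman smoothing.
import math

# Hypothetical fingerprint database: grid position (x, y) in cm -> feature
# vector. In RF-finger the features would come from per-tag reflections; here
# we fake them with a smooth function of position to keep the example
# self-contained.
def fake_feature(x, y):
    return [math.exp(-((x - tx) ** 2 + (y - ty) ** 2) / 50.0)
            for tx in range(0, 21, 5) for ty in range(0, 28, 7)]

FINGERPRINTS = {(x, y): fake_feature(x, y)
                for x in range(0, 21, 2) for y in range(0, 28, 2)}

def knn_locate(feature, k=3):
    """Return the average of the k nearest fingerprint positions."""
    ranked = sorted(FINGERPRINTS,
                    key=lambda p: sum((a - b) ** 2
                                      for a, b in zip(FINGERPRINTS[p], feature)))
    nearest = ranked[:k]
    return (sum(p[0] for p in nearest) / k, sum(p[1] for p in nearest) / k)

class Kalman1D:
    """Constant-position Kalman filter for one coordinate of the trace."""
    def __init__(self, q=0.5, r=4.0):
        self.x, self.p = None, 1.0   # state estimate and its variance
        self.q, self.r = q, r        # process and measurement noise

    def update(self, z):
        if self.x is None:           # initialize on the first measurement
            self.x = z
            return self.x
        self.p += self.q                   # predict: variance grows
        k = self.p / (self.p + self.r)     # Kalman gain
        self.x += k * (z - self.x)         # correct toward the measurement
        self.p *= (1 - k)
        return self.x

# Usage: locate a noisy horizontal stroke and smooth it.
kfx, kfy = Kalman1D(), Kalman1D()
trace = []
for true_x in range(2, 18, 2):
    feat = fake_feature(true_x + 0.3, 10.0)   # measurement near y = 10 cm
    x, y = knn_locate(feat)
    trace.append((kfx.update(x), kfy.update(y)))
```

Averaging the k nearest fingerprints gives a sub-grid position estimate, and the per-axis Kalman filters suppress jitter between consecutive estimates at the cost of a small lag.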
Existing gesture recognition solutions can be divided into two categories: (i) Device-based approaches usually require the user to wear sensors, e.g., an RFID tag or a smartwatch, and track the motion of the sensors to recognize gestures [15, 17]. These studies usually derive the gestures by building theoretical models that depict the signal changes received from the sensors. However, device-based approaches suffer from either an uncomfortable user experience (e.g., attaching an RFID tag to the finger) or short life cycles due to high energy consumption. (ii) Device-free approaches recognize gestures from ambient signals through different kinds of techniques, without requiring the user to wear any devices. As the most popular solutions, camera-based systems such as Kinect and LeapMotion construct the body or finger structure from video streams for accurate gesture recognition. Nevertheless, they usually involve high computation and may raise privacy concerns for users. More recent works try to recognize gestures based on WiFi [16], acoustic signals [18] and visible light [9]. However, these solutions are either easily affected by environmental noise or incapable of sensing fine-grained gestures at the finger level. In this work, we are in search of a new device-free mechanism that can recognize finger-level gestures to facilitate the growing
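As a rough illustration of the "reflection image" idea from the abstract, the sketch below reshapes per-tag reflection features into a 2-D intensity grid of the kind a CNN classifier could consume. The 4 × 7 tag-array layout and the feature values are hypothetical, chosen only to make the example self-contained.

```python
# Illustrative sketch (hypothetical tag-array geometry, not the authors' code):
# arranging per-tag reflection features into a 2-D "reflection image".
ROWS, COLS = 4, 7  # assumed tag-array layout on the letter-size paper

def reflection_image(features):
    """Reshape per-tag reflection features (row-major list of length
    ROWS * COLS) into a 2-D grid of pixel intensities normalized to [0, 1]."""
    if len(features) != ROWS * COLS:
        raise ValueError("expected one feature per tag")
    lo, hi = min(features), max(features)
    span = (hi - lo) or 1.0                      # avoid division by zero
    norm = [(f - lo) / span for f in features]   # normalize to [0, 1]
    return [norm[r * COLS:(r + 1) * COLS] for r in range(ROWS)]

# Usage: a two-finger touch shows up as two bright regions in the image.
img = reflection_image([0.1] * 9 + [0.9] + [0.1] * 7 + [0.8] + [0.1] * 10)
```

Treating all fingers as a whole in one image, rather than localizing each finger separately, is what lets a CNN classify multi-touch gestures directly from the spatial pattern of reflections.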