AirContour: Building Contour-based Model for in-Air Writing Gesture Recognition

YAFENG YIN, State Key Laboratory for Novel Software Technology, Nanjing University, China
LEI XIE (corresponding author), State Key Laboratory for Novel Software Technology, Nanjing University, China
TAO GU, RMIT University, Australia
YIJIA LU, State Key Laboratory for Novel Software Technology, Nanjing University, China
SANGLU LU, State Key Laboratory for Novel Software Technology, Nanjing University, China

Recognizing in-air hand gestures will benefit a wide range of applications such as sign language recognition, remote control with hand gestures, and “writing” in the air as a new way of text input. This paper presents AirContour, which focuses on in-air writing gesture recognition with a wrist-worn device. We propose a novel contour-based gesture model which converts human gestures to contours in 3D space, and then recognizes the contours as characters. Different from 2D contours, 3D contours may suffer from problems such as contour distortion caused by different viewing angles, contour difference caused by different writing directions, and contour distribution across different planes. To address the above problems, we introduce Principal Component Analysis (PCA) to detect the principal/writing plane in 3D space, and then tune the projected 2D contour in the principal plane through reversing, rotating and normalizing operations, so that the 2D contour has the right orientation and a normalized size under a uniform view. After that, we propose both an online approach, AC-Vec, and an offline approach, AC-CNN, for character recognition. The experimental results show that AC-Vec achieves an accuracy of 91.6% and AC-CNN achieves an accuracy of 94.3% for gesture/character recognition, both outperforming the existing approaches.

CCS Concepts: • Human-centered computing → Ubiquitous and mobile computing design and evaluation methods; Empirical studies in ubiquitous and mobile computing.
Additional Key Words and Phrases: AirContour, in-air writing, contour-based gesture model, principal component analysis (PCA), gesture recognition ACM Reference Format: Yafeng Yin, Lei Xie (corresponding author), Tao Gu, Yijia Lu, and Sanglu Lu. 2019. AirContour: Building Contour-based Model for in-Air Writing Gesture Recognition. ACM Trans. Sensor Netw. 1, 1, Article 1 (January 2019), 26 pages. https://doi.org/10.1145/3343855 Authors’ addresses: Yafeng Yin, State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China, yafeng@nju.edu.cn; Lei Xie (corresponding author), State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China, lxie@nju.edu.cn; Tao Gu, RMIT University, Melbourne, Australia, tao.gu@rmit.edu.au; Yijia Lu, State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China, lyj@smail.nju.edu.cn; Sanglu Lu, State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China, sanglu@nju.edu.cn. Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org. © 2019 Association for Computing Machinery. 1550-4859/2019/1-ART1 $15.00 https://doi.org/10.1145/3343855 ACM Trans. Sensor Netw., Vol. 1, No. 1, Article 1. Publication date: January 2019
1 INTRODUCTION
With the advancement of rich embedded sensors, mobile or wearable devices (e.g., smartphone, smartwatch) have been widely used in activity recognition [37][21][23][45][26][31][41], and benefit many human-computer interactions, e.g., motion-sensing games [25], sign language recognition [12], in-air writing [1], etc. As a typical interaction mode, writing in the air has attracted wide attention [9][10][39][36][6]. It allows users to write characters freely in the air with the arm and hand, without focusing attention on the small screen or tiny keys of a device [2]. As shown in Fig. 1, a user carrying/wearing a sensor-embedded device writes in the air, and the gesture will be recognized as a character. Recognizing in-air writing gestures is a key technology to facilitate writing gesture-based interactions in the air, and it can be used in many scenarios. For example, a user can “write” commands in the air to control an unmanned aerial vehicle (UAV) while looking at the scene transmitted from the UAV in a virtual reality (VR) headset, thus avoiding taking off the VR headset to input commands with a controller. Another example is replacing traditional on-screen text input by “writing” the text message in the air, thus allowing users to interact with mobile or wearable devices that have tiny or no screens. Besides, when one of the user's hands is occupied and typing on a keyboard becomes inconvenient, sensor-assisted in-air input can be used to capture hand gestures and lay them out as text or images [1]. Compared with existing handwriting, voice, or camera-based input, in-air writing with inertial sensors tolerates limited screen size, environmental noise, and poor lighting conditions. In this paper, we focus on recognizing in-air writing gestures as characters.
Fig. 1. AirContour: in-air writing gesture recognition based on contours
In inertial sensor based gesture recognition, many approaches have been proposed.
Some data-driven approaches [10][2][7][35][15] tend to extract features from sensor data to train classifiers for gesture recognition, while paying little attention to human activity analysis. If the user performs gestures with more degrees of freedom, i.e., the gestures may have large variations in speed, size, or orientation, this type of approach may fail to recognize them with high accuracy. In contrast, some pattern-driven approaches [1][32][13] try to capture the moving patterns of gestures for activity recognition. For example, Agrawal et al. [1] utilize segmented strokes and a grammar tree to recognize capital letters in a 2D plane. However, due to the complexity of analyzing human activities, this type of approach may redefine the gesture patterns or constrain the gestures to a limited area (e.g., a limited 2D plane), which may degrade the user experience. To track continuous in-air gestures, Shen et al. [29] utilize a 5-DoF arm model and HMM to track the 3D posture of the arm. However, in 3D space, tracking is not directly linked to recognition, especially when the
trajectory (e.g., handwriting trajectory) lies in different planes. Therefore, it is still a challenging task to apply the existing approaches to recognize in-air writing gestures, which occur in 3D space with more degrees of freedom, while guaranteeing the user experience.
To address the aforementioned issues, in this paper we explore contours to represent in-air writing gestures, and propose a novel contour-based gesture model, where the ‘contour’ is represented as a sequence of coordinate points over time. We use an off-the-shelf wrist-worn device (e.g., smartwatch) to collect sensor data, and our basic idea is to build a 3D contour model for each gesture and utilize the contour feature to recognize gestures as characters, as illustrated in Fig. 1. Since the gesture contour keeps the essential movement patterns of in-air gestures, it can tolerate the intra-class variability of gestures. It is worth noting that while the proposed contour-based gesture model is applied to in-air writing gesture recognition in this work, it can also be used in sign language recognition and remote control with hand gestures [40]. However, different from 2D contours, building 3D contours presents several challenges, i.e., contour distortion caused by different viewing angles, contour difference caused by different writing directions, and contour distribution across different planes, which make it difficult to recognize 3D contours as 2D characters. To solve this problem, we first describe the range of viewing angles based on the way the device is worn, which indicates the possible writing directions. We then apply Principal Component Analysis (PCA) to detect the principal/writing plane, i.e., the plane in which most of the contour is located or to which it is close. After that, we calibrate the projected 2D contour in the principal plane for gesture/character recognition, while considering the distortion caused by dimensionality reduction and the difference in gesture sizes.
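The plane-detection and projection steps can be sketched with PCA as follows. This is a minimal illustrative sketch in NumPy, not the paper's implementation; the function and variable names are our own assumptions:

```python
import numpy as np

def principal_plane_projection(contour):
    """Project a 3D gesture contour onto its principal (writing) plane.

    contour: (N, 3) array of 3D coordinate points over time.
    Returns the (N, 2) projected contour and the plane's normal vector.
    """
    centered = contour - contour.mean(axis=0)
    # SVD of the centered points: the rows of Vt are the principal
    # directions, sorted by decreasing variance of the data along them.
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    basis = Vt[:2]   # two directions spanning the principal plane
    normal = Vt[2]   # direction of least variance, normal to the plane
    contour_2d = centered @ basis.T
    return contour_2d, normal
```

Using the SVD of the centered points avoids explicitly forming the covariance matrix; for a writing gesture, most of the motion variance lies in the first two principal directions, so the third direction serves as the normal of the detected writing plane.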
We make the following contributions in this paper.
∙ To the best of our knowledge, we are the first to propose a contour-based gesture model to recognize in-air writing gestures. The model is designed to solve the new challenges in 3D gesture contours, e.g., observation ambiguity and the uncertain orientation and distribution of 3D contours, and to tolerate the intra-class variability of gestures. The contour-based gesture model can be applied not only to in-air writing gesture recognition, but also to many other scenarios such as sign language recognition, motion-sensing games and remote control with hand gestures.
∙ To recognize gesture contours in 3D space as characters in a 2D plane, we introduce PCA for dimensionality reduction and a series of calibrations for 2D contours. Specifically, we first utilize PCA to detect the principal/writing plane, and then project the 3D contour onto the principal plane for dimensionality reduction. After that, we calibrate the 2D contour in the principal plane through reversing, rotating and normalizing operations, so that it has the right orientation and a normalized size under a uniform view, i.e., so that the 2D contour is suitable for character recognition.
∙ We conduct extensive experiments to verify the efficiency of the proposed contour-based gesture model. In addition, based on the model, we propose an online approach, AC-Vec, and an offline approach, AC-CNN, to recognize 2D contours as characters. The experimental results show that AC-Vec and AC-CNN achieve an accuracy of 91.6% and 94.3%, respectively, for gesture/character recognition, and both outperform the existing approaches.
2 RELATED WORK
In this section, we describe and analyze the state of the art related to in-air gesture recognition, tracking, writing in the air, and handwritten character recognition, with a particular focus on inertial sensor based techniques.
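Before surveying related work in detail, the calibration operations above (reversing, rotating, and normalizing a projected 2D contour) can be sketched as follows. This is a minimal NumPy sketch under our own assumptions, not the paper's implementation: the function name is illustrative, the rotation angle is assumed to be already known, and the [-1, 1] target range is our choice:

```python
import numpy as np

def calibrate_contour(contour_2d, angle_rad=0.0, mirror=False):
    """Reverse (mirror), rotate, and normalize a projected 2D contour.

    contour_2d: (N, 2) array of points in the principal plane.
    angle_rad: rotation bringing the contour upright (assumed given).
    mirror: flip the x-axis when projection reversed the writing order.
    Returns the contour centered at the origin and scaled into [-1, 1].
    """
    pts = contour_2d.astype(float).copy()
    if mirror:
        pts[:, 0] = -pts[:, 0]              # reversing operation
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    R = np.array([[c, -s], [s, c]])
    pts = pts @ R.T                         # rotating operation
    pts -= pts.mean(axis=0)                 # center at the origin
    scale = np.abs(pts).max()
    if scale > 0:
        pts /= scale                        # normalizing operation
    return pts
```

After these steps, contours written at different sizes and orientations share a uniform view, which is the precondition for matching them against 2D character templates.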
In-air gesture recognition: Parate et al. [26] design a mobile solution called RisQ to detect smoking gestures and sessions with a wristband, using a machine learning pipeline to process sensor data. Blank et al. [7] present a system for table tennis stroke detection and classification by attaching inertial sensors to table tennis rackets. Thomaz et al. [31] describe the implementation and evaluation of an approach to infer eating moments using a 3-axis accelerometer in a smartwatch. Xu et al. [35] build a classifier to identify a user's hand and finger gestures utilizing the essential features of accelerometer and gyroscope data measured from a smartwatch. Huang et al. [18] build a system to monitor brushing quality using a manual toothbrush modified by attaching small magnets to the handle, together with an off-the-shelf smartwatch. These approaches typically extract features from sensor data and apply machine learning techniques for gesture recognition.
In-air gesture tracking: Zhou et al. [42][44][43] utilize a kinematic chain to track human upper limb motion by placing multiple devices on the arm. Cutti et al. [11] utilize joint angles to track the movements of upper limbs by placing sensors on the chest, shoulder, arm, and wrist. Chen et al. [8] design a wearable system consisting of a pair of magnetometers on the fingers and a permanent magnet affixed to the thumb, and introduce uTrack to convert the thumb and fingers into a continuous input system (e.g., 3D pointing). Shen et al. [29] utilize a 5-DoF arm model and HMM to track the 3D posture of the arm, using both motion and magnetic sensors in a smartwatch. In fact, accurate in-air gesture tracking in real time can be very challenging. Besides, obtaining the 3D moving trajectory does not mean recognizing the in-air gesture. In this paper, we do not require accurate trajectory tracking; instead, we aim to obtain the gesture contour and recognize it as a character.
Writing in the air: Zhang et al.
[39] quantify data into small integral vectors based on acceleration orientation, and then use HMM to recognize 10 Arabic numerals. Wang et al. [32] present IMUPEN to reconstruct the motion trajectory and recognize handwritten digits. Bashir et al. [6] use a pen equipped with inertial sensors and apply DTW to recognize handwritten characters. Agrawal et al. [1] recognize handwritten capital letters and Arabic numerals in a 2D plane based on strokes and a grammar tree, using the built-in accelerometer of a smartphone. Amma et al. [2] design a glove equipped with inertial sensors, and use SVM, HMM and a statistical language model to recognize capital letters, sentences, etc. Deselaers et al. [13] present GyroPen to reconstruct the writing path for pen-like interaction. Xu et al. [36] utilize a continuous density HMM and the Viterbi algorithm to recognize handwritten digits and letters using inertial sensors. In this paper, we focus on single in-air character recognition without the assistance of a language model. For a character, we do not define specific strokes or require a pen-up for stroke segmentation, while tolerating the intra-class variability caused by writing speeds, gesture sizes, and writing directions, as well as the observation ambiguity caused by viewing angles, in 3D space.
Handwritten character recognition: In addition to inertial sensor-based approaches, many image processing techniques [3][14][16] are also adopted for recognizing handwritten characters in a 2D plane (i.e., an image). Bahlmann et al. [4] combine DTW and SVMs to establish a Gaussian DTW (GDTW) kernel for on-line recognition of UNIPEN handwriting data. Rayar et al. [28] propose a preselection method for CNN-based classification and evaluate it on handwritten character recognition in images. Rao et al. [27] propose a newly designed network structure based on an extended nonlinear kernel residual network to recognize handwritten characters on the MNIST and SVHN datasets.
These approaches focus on recognizing hand-moving trajectories in a 2D plane, while our paper focuses on transforming the 3D gesture into a proper 2D contour, and then utilizes the contour's space-time feature to recognize contours as characters.
AirContour 1:5

3 TECHNICAL CHALLENGES AND DEFINITIONS IN IN-AIR GESTURE RECOGNITION

3.1 Intra-class Variability in Sensor Data

As shown in Fig. 2, even when the user performs the same type of gesture (e.g., writes 't'), the sensor data can be quite different, due to the variation of writing speeds (Fig. 2(a)), gesture sizes (Fig. 2(b)), writing directions (Fig. 2(c)), etc. Directly using features extracted from the sensor data may therefore fail to recognize in-air gestures accurately.

To handle the intra-class variability of in-air gestures, e.g., the variation of speed, amplitude, and orientation, we present the contour-based gesture model, which utilizes contours to correlate sensor data with human gestures. The 'contour' is represented as a sequence of coordinate points over time. Additionally, to avoid the differences caused by facing directions, we transform the sensor data from the device coordinate system to the human coordinate system shown in Fig. 5(a), i.e., we analyze the 3D contours in the human coordinate system. In this paper, we take the instance of writing characters in the air to illustrate the contour-based gesture model. The characters refer to the lowercase alphabet, i.e., 'a' ∼ 'z', and we use the terms "character" and "letter" interchangeably throughout the paper. It is worth mentioning that in-air written letters can differ from printed letters, due to joined-up writing. In particular, we remove the dot of 'i' and 'j', and use 'ι' to represent the letter 'l' for simplification.

(a) Different speeds (b) Different sizes (c) Different directions
Fig. 2.
Linear acceleration of writing the same character 't'

3.2 Difference between 2D Contours and 3D Contours

Usually, people are used to recognizing and reading handwritten characters in a 2D plane, e.g., on a piece of paper. Therefore, we can map a 2D gesture contour to a 2D character for recognition. However, based on extensive observations and experimental study, we find that 3D contour recognition is quite different from 2D contour recognition. In fact, recognizing 3D contours as 2D characters can be very challenging, due to the contour distortion caused by viewing angles, the contour difference caused by writing directions, and the contour distribution across different planes, as described below.

3.2.1 Viewing Angles. There is a uniform viewing angle for a 2D character contour, while there are multiple viewing angles for a 3D character contour. In a predefined plane-coordinate system,
the 2D gesture contour is discriminative and can be used for character recognition; this is consistent with people's cognitive habits for handwritten letters. However, in 3D space, even in a predefined coordinate system, we can look at the 3D contour from different viewing angles, and thus the observed 3D contour can be quite different. As shown in Fig. 3, when we look at the 3D contour of 't' from left to right, the shape and orientation of the character contour change a lot, as shown by the contour located in the red circle in Fig. 3(a), Fig. 3(b) and Fig. 3(c). For a character, its contour consists of one or several strokes in a sequential order and the right orientation. If the character contour changes, it can lead to the misrecognition of characters. For example, when we look at the contours of 'b' and 'q' from different viewing angles, we may obtain similar contours and find it difficult to distinguish them. Therefore, a proper viewing angle should be selected to mitigate the confusion about character contours.

(a) From left (b) From center (c) From right
Fig. 3. Observed 3D contours from different viewing angles

3.2.2 Writing Directions. From a uniform view, 2D contours of the same character are similar, while the 3D contours can be quite different, due to uncertain writing directions. On a 2D plane, the contours of the same character keep the essential shape feature. Even if the orientation of a 2D contour changes, e.g., the 2D contour rotates in the plane, it still keeps the shape feature of the contour. However, in 3D space, even if we look at the contours of the same character from the same viewing angle, the observed contours can be quite different, as the contours in the red circles shown in Fig. 4. This is because the user can write in-air gestures towards different directions.
Intuitively, if we can adaptively project the 3D contour into a corresponding coordinate plane (e.g., the 𝑥ℎ − 𝑧ℎ plane, 𝑦ℎ − 𝑧ℎ plane, or 𝑥ℎ − 𝑦ℎ plane), we may mitigate the contour distortion caused by writing directions.

(a) Writing towards 𝑥ℎ − 𝑧ℎ plane (b) Writing towards 𝑦ℎ − 𝑧ℎ plane (c) Writing towards 𝑥ℎ − 𝑦ℎ plane
Fig. 4. Different contours from the same viewing angle
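To make this intuition concrete, the following sketch (with illustrative synthetic data, not from the paper) projects a 3D contour onto each coordinate plane by simply dropping one axis. A projection whose extent collapses to a near-line has lost the contour's shape feature, which is why the plane must be chosen according to the writing direction:

```python
import numpy as np

# A hypothetical 3D contour written facing the x_h - y_h plane:
# the hand moves mostly along x and y, with little motion along z.
contour = np.array([[0.00, 0.20, 0.01],
                    [0.00, 0.10, 0.00],
                    [0.00, 0.00, 0.01],
                    [0.10, 0.00, 0.00]])

# Naive projections: drop one coordinate to land in each coordinate plane.
xy = contour[:, [0, 1]]   # x_h - y_h plane
xz = contour[:, [0, 2]]   # x_h - z_h plane
yz = contour[:, [1, 2]]   # y_h - z_h plane

# The per-axis spread of each projection hints at which plane keeps the
# shape: here only the x-y projection preserves both dimensions of motion.
for name, p in [("x-y", xy), ("x-z", xz), ("y-z", yz)]:
    extent = p.max(axis=0) - p.min(axis=0)
    print(name, extent)
```

For a contour written towards an arbitrary direction, none of the three fixed coordinate planes may work well, which motivates the adaptive principal-plane detection of Section 4.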
3.2.3 Contour Distribution. A 2D contour lies in a plane, while a 3D contour can be distributed across different planes. In Fig. 5(a), we show the human coordinate system (human-frame for short) 𝑥ℎ − 𝑦ℎ − 𝑧ℎ. When the user writes in the air, her/his hand can move left and right, up and down; thus the in-air gesture generates a 3D contour across different planes. At this time, the in-air contour may be mainly located in or close to the plane 𝐴3𝐵3𝐶3𝐷3 while not being parallel to any coordinate plane, and we cannot directly project the in-air contour into a coordinate plane, e.g., the 𝑥ℎ − 𝑧ℎ plane, for dimensionality reduction. As shown in Fig. 5(b), the 3D contour of 'k' is distributed across different planes, and is mainly located in or close to the red plane, instead of any coordinate plane (e.g., the blue plane in Fig. 5(a)). Here, the red plane is called the principal plane or writing plane, which contains or is close to most of the points in the 3D contour, i.e., the projected contour in the principal plane keeps the essential feature of the 3D contour. Therefore, we aim to adaptively project the 3D contour into the principal plane and obtain the essential contour feature of the handwritten character, as the contour 'k' shown in the red circle in Fig. 5(b).

(a) Hand movements (b) 3D contour across different planes
Fig. 5. In-air gesture across different planes

3.3 Some Definitions about In-air Gestures

According to Section 3.2, an improper viewing angle will lead to distortion of the observed gesture contour. To mitigate the confusion or misrecognition of gesture contours caused by viewing angles, we first define the appropriate range of viewing angles, based on people's writing habits, i.e., when the user writes in the air, her/his eyes track the movement of the hand naturally. As shown in Fig.
6(a), when the user writes with the left hand, she/he tends to write in front, to the left side or below; the corresponding viewing angle comes from behind, the right side or above. Accordingly, we select a reference coordinate plane for each viewing angle, i.e., the 𝑥ℎ − 𝑧ℎ plane, 𝑦ℎ − 𝑧ℎ plane, and 𝑥ℎ − 𝑦ℎ plane, respectively. Similarly, as shown in Fig. 6(b), when the user writes with the right hand in front, to the right side or below, the corresponding viewing angle comes from behind, the left side or above. The reference coordinate planes selected under these viewing angles are the 𝑥ℎ − 𝑧ℎ plane, (−𝑦ℎ) − 𝑧ℎ plane, and 𝑥ℎ − 𝑦ℎ plane, respectively. Therefore, there is a mapping relationship between a reference coordinate plane and a viewing angle. With the selected reference coordinate plane, the user will view a character contour in the right orientation rather than as a reversed contour (referring to Fig. 3(a) and Fig. 3(c)). It is worth mentioning that the selected reference coordinate plane is used to indicate the possible orientation of the projected contour in the principal plane, as described in Section 4.2. It does not mean that the user can only write on the 𝑥ℎ − 𝑧ℎ, 𝑦ℎ − 𝑧ℎ or 𝑥ℎ − 𝑦ℎ planes. In fact, the user can write towards arbitrary directions in 3D space.
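The mapping just described can be written down as a small lookup table. The sketch below encodes it directly from the text above; the function and key names are our own illustrative choices, not identifiers from the paper:

```python
# (hand, writing direction) -> reference coordinate plane, named by its two
# axes in the human-frame. Encodes the mapping stated in Section 3.3.
REFERENCE_PLANE = {
    ("left",  "front"): ("x_h",  "z_h"),
    ("left",  "left"):  ("y_h",  "z_h"),
    ("left",  "below"): ("x_h",  "y_h"),
    ("right", "front"): ("x_h",  "z_h"),
    ("right", "right"): ("-y_h", "z_h"),
    ("right", "below"): ("x_h",  "y_h"),
}

def reference_plane(hand: str, direction: str) -> tuple:
    """Return the reference coordinate plane for a hand/direction pair."""
    return REFERENCE_PLANE[(hand, direction)]

print(reference_plane("right", "right"))  # ('-y_h', 'z_h')
```

Note the asymmetry: writing to the right side with the right hand maps to the (−𝑦ℎ) − 𝑧ℎ plane, which is what later allows the projected contour to come out in the right orientation rather than mirrored.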
(a) Writing with left hand (b) Writing with right hand
Fig. 6. Viewing angles for writing with different hands

Here, the hand (i.e., left hand or right hand) and the writing direction, i.e., in front, to the left side, the right side or below, determine the viewing angle. To detect which hand writes in the air, we introduce an initial gesture before writing, i.e., the user stands with the hands down and then opens up the arm wearing the device until the arm is parallel to the floor. In the human coordinate system, if the hand moves left, then the user writes with the left hand; otherwise, the user writes with the right hand. The human coordinate system itself will be described later in the System Design section. To detect the writing direction and project the 3D contour into a 2D plane properly, we introduce the 3D contour-based gesture model, as described below.

4 3D CONTOUR-BASED GESTURE MODEL

Based on the accelerometer, gyroscope and magnetometer of the wrist-worn device, we can obtain the 3D contour of the in-air gesture. However, according to Section 3.2, due to the uncertainty of the viewing angle, writing direction and contour distribution, it is essential to find a plane in which to obtain a proper projection of the 3D contour for character recognition. To solve this issue, we first introduce Principal Component Analysis (PCA) to adaptively detect the principal/writing plane. Then, we detect the reference coordinate plane and determine the viewing angle. After that, we tune the 2D contour in the principal plane to get the character contour in the right orientation and normalized size.

4.1 Principal Plane Detection with PCA

As mentioned before, to get a proper projected 2D contour for character recognition, we need to detect the principal/writing plane, which contains or is close to most of the points in the 3D contour, as the red plane in Fig. 7(a), Fig. 7(b) and Fig. 7(c).
It is worth noting that the principal plane may not be parallel to any coordinate plane, as shown in Fig. 7. In this paper, we utilize Principal Component Analysis (PCA) [30] to reduce the dimensionality of the 3D contour and detect the principal plane adaptively, as described below.

For convenience, we use $x_i = (x_{i1}, x_{i2}, x_{i3})^T$, $i \in [1, n]$ to represent the contour (i.e., the point sequence) along the $x_h$-axis, $y_h$-axis and $z_h$-axis of the human coordinate system. At first, we introduce the centralization operation to update the coordinates $x_i$ of the contour, i.e., $x_{i1} = x_{i1} - \frac{1}{n}\sum_{j=1}^{n} x_{j1}$, $x_{i2} = x_{i2} - \frac{1}{n}\sum_{j=1}^{n} x_{j2}$, $x_{i3} = x_{i3} - \frac{1}{n}\sum_{j=1}^{n} x_{j3}$. Then, we use $\omega_i = (\omega_{i1}, \omega_{i2}, \omega_{i3})^T$, $i \in [1, 2]$ to represent the orthonormal basis vectors of the principal plane. Here, $\|\omega_i\|_2 = 1$ and $\omega_i^T \omega_j = 0$ for $i \neq j$.

As shown in Fig. 8, for the point $x_i$ in the human-frame, its projection point in the principal plane is $y_i = (y_{i1}, y_{i2})^T = \Omega^T x_i$, where $\Omega = (\omega_1, \omega_2)$. Then, we can use $y_i$ to reconstruct the
(a) (b) (c)
Fig. 7. Different principal planes

coordinate of $x_i$ as $\hat{x}_i$, as shown in Eq. (1). The distance between $x_i$ and $\hat{x}_i$ is $d_i = \|x_i - \hat{x}_i\|^2$.

$$\hat{x}_i = \sum_{j=1}^{2} y_{ij}\,\omega_j = \Omega(\Omega^T x_i) \qquad (1)$$

When the average distance $\bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i$ reaches its minimal value, the plane represented by the orthonormal basis vectors $\Omega = (\omega_1, \omega_2)$ is the principal/writing plane, as shown in Eq. (2).

$$\arg\min_{\Omega}\ \frac{1}{n}\sum_{i=1}^{n} \|x_i - \hat{x}_i\|^2 \quad s.t.\ \Omega^T\Omega = I. \qquad (2)$$

By combining Eq. (1) and Eq. (2), we can transform the objective in Eq. (2) to Eq. (3), where $X = (x_1, x_2, \ldots, x_n)$, while $tr$ means the trace of a matrix, i.e., the sum of the elements on the main diagonal of the matrix.

$$\arg\max_{\Omega}\ tr(\Omega^T X X^T \Omega) \quad s.t.\ \Omega^T\Omega = I. \qquad (3)$$

After that, we use the Lagrange multiplier method to obtain the orthonormal basis vectors $\{\omega_1, \omega_2\}$, based on the eigenvalue decomposition of $XX^T$, as shown in Eq. (4). The eigenvector corresponding to the largest eigenvalue is $\omega_1$, while that corresponding to the second largest eigenvalue is $\omega_2$. In the principal plane, we use $\omega_1$ and $\omega_2$ to represent the $x_p$-axis and $y_p$-axis of the principal plane, respectively.

$$XX^T \omega_i = \lambda_i\,\omega_i. \qquad (4)$$

Fig. 8. The principle of writing plane detection with PCA
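The PCA step of Section 4.1 can be sketched in a few lines of numpy. This is a minimal sketch under our own conventions (contour points stored as rows, so the 3×3 scatter matrix is $X^T X$ rather than the paper's $XX^T$ with column points; the eigenvectors are the same); the function name is illustrative:

```python
import numpy as np

def principal_plane(points):
    """Detect the principal/writing plane of a 3D contour with PCA.

    points: (n, 3) array of contour points in the human-frame.
    Returns (omega, projected): the (3, 2) orthonormal basis (omega_1,
    omega_2) of the principal plane and the (n, 2) projected 2D contour.
    """
    # Centralization: subtract the per-axis mean, as in Section 4.1.
    X = points - points.mean(axis=0)
    # Eigen-decomposition of the 3x3 scatter matrix (Eq. (4), transposed
    # convention). np.linalg.eigh returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(X.T @ X)
    # omega_1, omega_2: eigenvectors of the two largest eigenvalues;
    # they serve as the x_p-axis and y_p-axis of the principal plane.
    omega = eigvecs[:, [2, 1]]
    # y_i = Omega^T x_i for every point, i.e., the projected 2D contour.
    projected = X @ omega
    return omega, projected
```

For a contour that actually lies in a plane, reconstructing each point as $\hat{x}_i = \Omega(\Omega^T x_i)$ recovers it exactly, which matches the minimization in Eq. (2).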
As shown in Fig. 9(a), the black line and the green line denote the first basis vector 𝜔1 and the second basis vector 𝜔2, respectively, while the red plane containing 𝜔1 and 𝜔2 is the detected principal plane. In the principal plane, we can obtain the projected 2D contour, as shown in Fig. 9(b). However, due to the information loss of dimensionality reduction, the projected contour may suffer from problems such as reversal and skew, which require further calibration. It is worth mentioning that, unlike capital letters, the shapes of different lowercase letters observed from different viewing angles can be similar, e.g., 'b' and 'q', or 'd' and 'p'; thus the orientation and the writing order of a character are both important for lowercase letter recognition. It is essential to calibrate the projected 2D contour into the right orientation and normalized size under a uniform view for character recognition.

(a) 3D contour (b) Projected contour
Fig. 9. Relationship between contours and principal plane

4.2 Reference Coordinate Plane Detection

According to Section 4.1 and Fig. 9, the projected 2D contour in the principal plane has a high probability of keeping the shape feature of the in-air contour, while still suffering from problems such as reversal and skew, i.e., the orientation of the contour is changed. Thus we need to calibrate the 2D contour in the principal plane. To achieve this goal, we first detect the reference coordinate plane and determine the viewing angle. Here, the reference coordinate plane is used to indicate the viewing angle and the possible orientation of the projected contour in the principal plane. The user can perform the gesture towards arbitrary directions, so the writing plane may not be parallel to any coordinate plane.

(a) 3D contour (b) 2D contour by PCA (c) 2D contour with reversal (d) 2D contour with rotation
Fig. 10.
Contour of character 'm' written with the right hand; the $x'_h$-, $y'_h$-, $z'_h$-axes denote the projections of the $x_h$-, $y_h$-, $z_h$-axes, respectively
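Section 4 names three tuning operations on the projected 2D contour: reversing, rotating and normalizing. The paper's rules for when to apply each are developed in Section 4.2 onward; the elementary operations themselves can be sketched as follows (function names and conventions are our own, e.g., reversal as a horizontal mirror and rotation about the centroid):

```python
import numpy as np

def reverse(contour):
    """Mirror the 2D contour horizontally (undo a reversal from projection)."""
    out = contour.copy()
    out[:, 0] = -out[:, 0]
    return out

def rotate(contour, theta):
    """Rotate the 2D contour by angle theta (radians) about its centroid."""
    c = contour.mean(axis=0)
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    return (contour - c) @ R.T + c

def normalize(contour):
    """Scale the 2D contour into a unit bounding box, keeping aspect ratio."""
    shifted = contour - contour.min(axis=0)
    scale = shifted.max()
    return shifted / scale if scale > 0 else shifted
```

Each operation preserves the contour's shape feature (reversing and rotating are isometries up to orientation; normalizing is a uniform scaling), which is what makes the calibrated contour comparable across users and writing sizes.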