UbiComp/ISWC '20 Adjunct, September 12–16, 2020, Virtual Event, Mexico. Lingyu Zhang and Yafeng Yin, et al.
Figure 2: System overview (image preprocessing: gray processing, bilateral filtering, image binarization; character extraction: expression detection, expression segmentation, character segmentation; character recognition: separation of printed and handwritten characters, printed character recognition, handwritten character recognition; arithmetic expression calculation: rule-based calculation)

However, there are some challenges in this problem. Usually, a picture taken by a smartphone is not as clear as a scanned image, and the characters may be deformed. Besides, printed and handwritten characters usually have different styles, even for the same numeral. In addition, the computing power of a smartphone is limited. Therefore, to achieve the goal, we first introduce image preprocessing to distinguish the characters from the background. Then, we extract each character based on horizontal projection and vertical projection.
After that, we separate the arithmetic expression and recognize printed characters and handwritten ones individually. We also choose suitable image sizes to make the system work on a smartphone.

2 RELATED WORK

Four arithmetic operations, including numerals and operators, often appear in the homework of pupils in lower grades. To recognize and calculate arithmetic operations automatically, Meng et al. [5] used a BP neural network and template matching to recognize printed numerals and operators. To recognize handwritten-style characters, Jiang et al. [2] used end-to-end learning to recognize arithmetic operations in fixed-size or varied-size images, where the characters were CAPTCHA-style. Khalighi et al. [3] proposed a novel OCR system to recognize and calculate handwritten Persian arithmetic expressions. Instead of only recognizing numerals and operators, Li et al. [4] proposed BAGS, an automatic homework grading system based on pictures taken by smartphones. BAGS detected answer areas in the answer sheets and recognized the handwritten characters (e.g., words). In BAGS, the images were processed on other computers or servers instead of on smartphones.

Different from the existing work, we provide a homework auto-checking system based on images taken by smartphones, aiming to recognize both printed characters and handwritten characters in arithmetic operations. Besides, our system runs locally on an easy-to-get smartphone, without transmitting images or processing images offline.

3 SYSTEM DESIGN

To check homework consisting of four arithmetic operations, it is necessary to detect the characters, recognize the characters, and verify the calculation of each arithmetic expression. Therefore, the proposed system HmwkCheck consists of four components, i.e., image preprocessing, character extraction, character recognition, and arithmetic expression calculation, as shown in Fig. 2.

3.1 Image Preprocessing

As shown in Fig. 1, the four arithmetic operations are printed on white paper, and the picture taken by the smartphone is an RGB image. To remove noise and distinguish the characters from the background in the image, we first preprocess the image. As shown in Fig. 2, we first convert the raw image, i.e., the picture taken by the smartphone, to a gray-scale image. Then, we use a bilateral filter to remove noise while keeping the edges in the image. After that, we perform image binarization to separate the foreground (i.e., characters) from the background, so that the characters are in black while the background is in white. The preprocessed image is used for the following character extraction.

3.2 Character Extraction

To extract the characters, we need to detect the arithmetic expressions and separate each character. Specifically, with the binarized image, we use the horizontal projection of pixels in each row to detect the arithmetic expressions. Suppose the coordinates of a pixel in an image are $(x_i, y_j)$, $i \in [1, w]$, $j \in [1, h]$, where $w$ and $h$ represent the width and height of the image, respectively. If the number $n_p$ of black pixels $(x_i, y_p)$, $i \in [1, w]$, in the $p$th row satisfies $n_p > \epsilon_p$, the row is treated as an 'Expression-Row', where $\epsilon_p$ is set to 2 by default. By connecting consecutive 'Expression-Rows', we can get the arithmetic expressions, as shown in Fig. 3(a). To further segment the arithmetic expressions in the same row, e.g., “28 × 14 = 392” and “38 × 40 = 1520”, we introduce the vertical projection of the pixels in each column, where the principle of vertical projection is similar to that of horizontal projection. After that, we can get each arithmetic expression, as shown in Fig. 3(b). In addition, to segment each character from the arithmetic expression, we repeatedly use the vertical projection. Finally, we can extract each arithmetic expression and its corresponding characters, as shown in Fig. 3(c).

Figure 3: An example of character segmentation
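The projection-based segmentation of Section 3.2 can be illustrated with a minimal pure-Python sketch. It is not the HmwkCheck implementation: the function names (`find_runs`, `expression_rows`, `column_runs`, `merge_runs`) are hypothetical, the binarized image is assumed to be a list of rows with 1 for a black (character) pixel and 0 for white, and the paper does not state how expression gaps are told apart from character gaps, so the `min_gap` rule and the default column threshold of 0 are assumptions made only for illustration.

```python
from typing import List, Tuple

Image = List[List[int]]  # assumed encoding: 1 = black (character), 0 = white
Span = Tuple[int, int]   # half-open interval [start, end)


def find_runs(counts: List[int], eps: int) -> List[Span]:
    """Return maximal runs of consecutive indices whose count exceeds eps."""
    runs: List[Span] = []
    start = None
    for idx, n in enumerate(counts):
        if n > eps and start is None:
            start = idx                  # a run begins
        elif n <= eps and start is not None:
            runs.append((start, idx))    # the run ended just before idx
            start = None
    if start is not None:                # run reaches the last index
        runs.append((start, len(counts)))
    return runs


def expression_rows(img: Image, eps: int = 2) -> List[Span]:
    """Horizontal projection: bands of consecutive 'Expression-Rows' (n_p > eps)."""
    return find_runs([sum(row) for row in img], eps)


def column_runs(img: Image, band: Span, eps: int = 0) -> List[Span]:
    """Vertical projection restricted to one band of rows.

    The default eps=0 is an assumption; the paper only says the principle
    is similar to that of the horizontal projection.
    """
    top, bottom = band
    width = len(img[0]) if img else 0
    counts = [sum(img[r][c] for r in range(top, bottom)) for c in range(width)]
    return find_runs(counts, eps)


def merge_runs(runs: List[Span], min_gap: int) -> List[Span]:
    """Join runs separated by fewer than min_gap blank columns (assumed rule)."""
    merged: List[Span] = []
    for start, end in runs:
        if merged and start - merged[-1][1] < min_gap:
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return merged


if __name__ == "__main__":
    # Toy 5 x 12 image: three "characters" in rows 1..3.
    demo = [[0] * 12 for _ in range(5)]
    for r in range(1, 4):
        for c in (0, 1, 3, 8, 9):
            demo[r][c] = 1
    for band in expression_rows(demo):
        chars = column_runs(demo, band)            # character-level spans
        exprs = merge_runs(chars, min_gap=3)       # expression-level spans
        print(band, exprs, chars)
```

Under these assumptions, `expression_rows` corresponds to the step of Fig. 3(a), `merge_runs` applied to the column spans plays the role of Fig. 3(b), and the unmerged column spans correspond to the characters of Fig. 3(c).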