A Deep-Learning Based Semi-Interactive Method for Re-colorization

Tengfei Zheng
PB20000296

July 12, 2023
Contents

Contents I
Abstract III
1 Introduction 1
1.1 Colorizing Grayscale Images . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1.2 Image Representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Two Views of Colorization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2.1 From Certain Color Style . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2.2 From Given Points . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2 Style Transferring Methods 4
2.1 Pixel-wise LUT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.1.1 Description of Content and Style . . . . . . . . . . . . . . . . . . . . . 4
2.1.2 Look-Up Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.1.3 RGB Matching Results . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.1.4 YUV Matching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.2 Wavelet Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2.1 Frequency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2.2 Wavelet Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2.3 Soften the Result . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.3 Edge Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.3.1 Convolution Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.3.2 Transforming with Edge Orientation Information . . . . . . . . . . . . 14
3 Colorization by Optimizing 15
3.1 Continuity Preserving Methods . . . . . . . . . . . . . . . . . . . . . . . . . . 15
3.1.1 Mask . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
3.1.2 RGB Optimizing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.1.3 Poisson Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
3.2 Optimizing on YUV Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.2.1 Loss Function on YUV . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.2.2 YUV Optimization Results . . . . . . . . . . . . . . . . . . . . . . . . . 20
4 Introduction to CNN 22
4.1 Colorization Nets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
4.1.1 Basic Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
4.1.2 Plain Network Constructions . . . . . . . . . . . . . . . . . . . . . . . . 24
4.1.3 Colorize by GAN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
4.2 VGG-19 and Gram Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
4.2.1 VGG-19 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
4.2.2 Representation of Metrics . . . . . . . . . . . . . . . . . . . . . . . . . 26
4.2.3 Result Generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
4.3 Implementation of Transferring . . . . . . . . . . . . . . . . . . . . . . . . . . 28
4.3.1 Inserting Loss Layers . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
4.3.2 Some Improvements . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
4.4 Results and Semi-Interactive Methods . . . . . . . . . . . . . . . . . . . . . . . 31
4.4.1 Content Weights . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
4.4.2 Style Weights . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
4.4.3 Semi-interactive Colorization . . . . . . . . . . . . . . . . . . . . . . . . 34
5 Conclusion and Discussion 35
5.1 More Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
5.1.1 Large Datasets and Large Models . . . . . . . . . . . . . . . . . . . . . 35
5.1.2 Judging Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
5.2 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
5.2.1 Comparison . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
5.2.2 Future Enhancements . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
References 38
Appendix 39
Abstract

Aiming at the problem of colorizing grayscale images, this paper surveys commonly used style-transferring algorithms and traditional colorization methods, identifying two different views of re-colorization: colorizing by a similar image or by given colored points, both of which give rise to an interactive method.

Ways to solve the style-transferring problem include pixel-wise color transforms, frequency methods and edge-detection methods. Given a grayscale image (the target image) and an RGB image (the source image), these methods transfer the color of the source onto the target. Alternatively, if the color of the target image is already known in some parts, we can introduce optimization to fill in the remaining parts, following the simple principle that points close in position and brightness should also be close in color. However, both views need prior information about the image, especially its semantic information. Convolutional Neural Networks (CNNs) excel at identifying such information, so using neural networks becomes a natural choice. By introducing pretrained deep learning models, this paper discusses the limitations of conventional methods while synthesizing both views into a full semi-interactive approach. To extend the method to automatic settings, this paper studies the structure of different colorization networks and applies some integrations and improvements to them, leading to more appealing results. In conclusion, this paper compares the features of the different views and methods, analyzing how to apply them automatically by modifying the network construction.
1 Introduction

1.1 Colorizing Grayscale Images

1.1.1 Background

Before color photographic technology became widespread, all photographs were in black and white. In restoring old photos, filling in color therefore forms an important part of the task. What is more, re-colorization technology can correct distortion caused by different lighting conditions, restoring the true color of the image. As figure 1 shows, the three columns are the original images, the grayscale images, and the images re-colorized by a certain algorithm.

Figure 1: Sample re-colorize results [1]

Nevertheless, colorization is a challenging problem, for we have to find a fixed procedure that can cover varying image conditions. Although scene semantics are usually helpful (for instance, grass is green and clouds are white), recognizing such high-level semantics is always difficult. Before taking a look at traditional or deep learning methods, we must first describe the problem in the language of mathematics and programming.

1.1.2 Image Representation

A grayscale image is often represented as a matrix, each pixel of which is stored as an integer value (called its intensity) between 0 and 255, indicating the brightness at that point; see figure 2.

Figure 2: Grayscale representation [2]
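As a minimal sketch added here for illustration (the file name flower.png is hypothetical, and rgb2gray comes from MATLAB's Image Processing Toolbox), such an intensity matrix can be produced and inspected directly:

% Read a color image and inspect its grayscale intensity matrix.
% rgb2gray applies the standard luminance weights
% 0.2989 R + 0.5870 G + 0.1140 B.
img  = imread('flower.png');   % H-by-W-by-3 uint8 array for a color image
gray = rgb2gray(img);          % H-by-W uint8 matrix, intensities in [0, 255]
disp(gray(1:3, 1:4))           % top-left intensity values, as in figure 2
imshow(gray)                   % 0 is displayed as black, 255 as white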
Higher intensity means brighter: a black pixel has intensity 0, while a white pixel has intensity 255. A pixel-wise operation usually means a transform applied entrywise to this matrix.

Representing a color image is a more complicated question. The simplest way is to separate the image into three channels: red (R), green (G) and blue (B). Physically, all light colors can be composed from RGB, so a color image can be represented as the combination of three grayscale images; see figure 3.

Figure 3: RGB representation [2]

Though the RGB representation is intuitive and easy for a computer to display, it does not fit our visual perception of colors. Our sensitivity to luminance is much stronger than our sensitivity to color intensity, so we apply a linear transform to the RGB space to extract luminance:

\[
\begin{pmatrix} Y \\ U \\ V \end{pmatrix}
=
\begin{pmatrix}
0.299 & 0.587 & 0.114 \\
-0.1687 & -0.3313 & 0.5 \\
0.5 & -0.4187 & -0.0813
\end{pmatrix}
\begin{pmatrix} R \\ G \\ B \end{pmatrix}
+
\begin{pmatrix} 0 \\ 128 \\ 128 \end{pmatrix}
\tag{1}
\]

Y is the luminance of the image or, in a simpler word, its "lightness". Different weightings of R, G and B yield different representations of luminance. U and V are called "chrominance", the color of the pixel, still ranging over [0, 255]. This representation is called YUV.

By separating luminance from chrominance, it becomes easier to deal with problems involving grayscale. The linearity of the transform also makes it fast to restore the RGB image. Still, U and V do not conform to our perception; a better but much more complex representation is HSI: Hue, Saturation and Intensity.

\[
I = \frac{1}{3}(R+G+B), \qquad
S = 1 - \frac{3\min\{R,G,B\}}{R+G+B},
\]
\[
H = \begin{cases} \theta & B \le G \\ 2\pi - \theta & B > G \end{cases},
\qquad
\theta = \arccos\frac{(R-G)+(R-B)}{2\left[(R-G)^2+(R-B)(G-B)\right]^{1/2}}
\tag{2}
\]

Here the luminance is represented as the arithmetic average of R, G and B. Unlike the RGB or YUV forms, the HSI representation cannot easily be discretized into the uint8 data type (integers between 0 and 255).
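As a sketch of equation (1) in code (a hypothetical helper added for illustration; the offset addition relies on MATLAB's implicit expansion, available since R2016b):

function yuv = rgb2yuv(img)
    % Apply the linear transform of equation (1) to every pixel.
    A = [ 0.299    0.587    0.114;
         -0.1687  -0.3313   0.5;
          0.5     -0.4187  -0.0813];
    rgb = reshape(double(img), [], 3);   % one pixel per row, columns R, G, B
    yuv = rgb * A' + [0 128 128];        % matrix product plus chrominance offset
    yuv = reshape(yuv, size(img));       % back to H-by-W-by-3, channels Y, U, V
end

The inverse map back to RGB (equation (4) in section 2.1.2) can be implemented in exactly the same way.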
With the YUV or HSI representation, the re-colorization problem can be summarized as follows: given the matrix Y (or I), find the most likely matrices U and V (or S and H). Obviously this cannot be done without prior information, for we are trying to construct three dimensions from one. Different kinds of prior information lead to different views of the problem.

1.2 Two Views of Colorization

1.2.1 From Certain Color Style

A common situation is that we already know the color style of a grayscale image. For example, an image of a flower might be re-colorized to any color, but if we have a color image of the same flower, we can fill the grayscale image with similar colors; this is called transferring the color style. As figure 4 shows, if we know the light condition in the left image is the same as in the middle one, we can transfer it to the right one.

Figure 4: Style-transferring

Different ways of transferring lead to different effects, but the principle is clear: letting S be the source image that contains the color style and T the target grayscale image, style transferring is a problem of the form

\[
I = \operatorname*{arg\,min}_X \{\, \alpha\, d_c(X, T) + \beta\, d_s(X, S) \,\}
\]

Here d_c and d_s denote distances on content and style, while α and β are weight parameters. A simple thought is to let d_c(X, T) = ‖Y_X − Y_T‖_F², judging content distance by the Frobenius norm; for d_s, however, a proper form is harder to find: color style should be independent of location while still conveying the overall impression.

In the sections that follow, we discuss some empirical representations of the metrics d_c and d_s, together with the methods they elicit. It is worth noting that style transferring is not necessarily an explicit optimization problem; in many cases d_c and d_s are independent and can be minimized separately.

With style transferring and a large enough image gallery, we can form a straightforward semi-interactive method: compare the target image with every image in the gallery, find the one with the nearest content, and transfer its style to the grayscale image. Written as a formula, that is to choose S by (G denotes the gallery):

\[
S = \operatorname*{arg\,min}_{X \in G} d_c(T, X)
\]

Semi-interactive means the user need not designate a particular image, but the range of the gallery is still important.
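A minimal sketch of this gallery search, added for illustration (here gallery is a hypothetical cell array of color images, each assumed to be resized to the target's dimensions):

function S = nearestContent(T, gallery)
    % Choose the gallery image whose luminance is closest to the
    % grayscale target T under the Frobenius-norm content distance d_c.
    best = inf;
    for k = 1:numel(gallery)
        Yk = double(rgb2gray(gallery{k}));    % luminance of the k-th candidate
        d  = norm(double(T) - Yk, 'fro')^2;   % d_c(T, X) = ||Y_T - Y_X||_F^2
        if d < best
            best = d;
            S = gallery{k};                   % current nearest-content image
        end
    end
end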
1.2.2 From Given Points

Another view comes from cases where part of the image has already been colored. Using the same image as an example, figure 5 shows a partially colored image together with a mask indicating which pixels have been colored (the white ones).

Figure 5: Given mask and color

Re-colorization is thereby converted into the problem of coloring the pixels that are black in the mask so as to minimize a function measuring the conformity of the result. Because the white pixels in the mask may be distributed arbitrarily, no direct algorithm yields the result, which makes the use of optimization essential.

One attribute is particularly important in this kind of optimization: the gradient of the image. By spatial continuity, the matrix can be regarded as a uniform sampling of some smooth function, so its gradient can be estimated. Since the step size is 1, define the difference operators (here i and j cannot lie on the edge):

\[
\partial_x a_{ij} = \frac{a_{i+1,j} - a_{i-1,j}}{2}, \qquad
\partial_y a_{ij} = \frac{a_{i,j+1} - a_{i,j-1}}{2},
\]
\[
\partial_x^2 a_{ij} = a_{i+1,j} + a_{i-1,j} - 2a_{ij}, \qquad
\partial_y^2 a_{ij} = a_{i,j+1} + a_{i,j-1} - 2a_{ij},
\]
\[
\Delta a_{ij} = a_{i-1,j} + a_{i+1,j} + a_{i,j-1} + a_{i,j+1} - 4a_{ij}
\tag{3}
\]

These operators portray local intensity differences. Because they are all linear, combining them with the 2-norm turns the optimization into a least-squares problem, which can be solved exactly.
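Since each operator in equation (3) is a fixed stencil, it can be evaluated as a small 2-D convolution. A minimal sketch, added for illustration (gray is the intensity matrix of section 1.1.2; border values are not meaningful, matching the caveat that i and j cannot be on the edge):

A   = double(gray);                             % intensity matrix as doubles
dx  = conv2(A, [1; 0; -1] / 2, 'same');         % central difference along i
dy  = conv2(A, [1, 0, -1] / 2, 'same');         % central difference along j
lap = conv2(A, [0 1 0; 1 -4 1; 0 1 0], 'same'); % discrete Laplacian of eq. (3)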
2 Style Transferring Methods

2.1 Pixel-wise LUT

2.1.1 Description of Content and Style

Both RGB and YUV take discrete values, so the problem can be regarded as a combinatorial optimization problem. For two pixels (g, h), (i, j), we can define a content distance between two matrices of the same dimension:

\[
d_{gh,ij}(X, Y) =
\begin{cases}
1 & (x_{gh} - x_{ij})(y_{gh} - y_{ij}) \le 0 \ \text{and}\ y_{gh} \ne y_{ij} \\
0 & \text{otherwise}
\end{cases}
\]
\[
d_c(X, Y) = \sum_{(g,h),(i,j)} d_{gh,ij}(X, Y)
\]

This "distance" is not symmetric: for it to vanish, pixels that are equal in X need only be equal in Y, not the reverse. In addition, it requires X and Y to share the same partial-order relationship. Visually speaking, this distance says that we recognize content by the order of pixel intensities; as long as that order is preserved, we see the same content.

For style, the metric is described as follows (I_l denotes the indicator function, which is 1 when l is true and 0 otherwise):

\[
h(X) = (h_k(X)), \qquad
h_k(X) = \frac{1}{\operatorname{size}(X)} \sum_{ij} I_{x_{ij} \le k},
\qquad
d_s(X, Y) = \| h(X) - h(Y) \|
\]

Instead of comparing every pair of points as the content metric does, the style metric merely counts how many pixels lie at or below each intensity and compares the resulting intensity distributions. From a probabilistic view, h is just the distribution function of a pixel sampled uniformly from X, called the (cumulative) histogram of the matrix.

2.1.2 Look-Up Table

Letting α = ∞ and β = 1, the optimization problem turns into the case in which the content must be preserved. Since equal pixels in X must map to equal pixels in Y, we need a monotonically increasing function f with y_ij = f(x_ij). Because x and y take discrete values, f can be recorded as an array, called a look-up table. The table can be constructed by the following code:

% g: cumulative histogram of the source (style) image
% h: cumulative histogram of the target image
j = 0;
for i = 0:255
    % advance j to the smallest level at which the source distribution
    % reaches the target distribution at level i
    while (g(j+1) < h(i+1)) && (j < 255)
        j = j + 1;
    end
    LUT(i+1) = j;   % map target intensity i to style intensity j
end

Here g denotes the cumulative histogram of the source image and h that of the target image: g must be the distribution we map into, namely the style to be matched. Applying the resulting table to the target's pixels transfers the color style of the source image to the target image.

By transforming independently on R, G, B or on Y, U, V, we obtain three look-up tables. From equation (1), the inverse transform is also immediate:

\[
\begin{pmatrix} R \\ G \\ B \end{pmatrix}
=
\begin{pmatrix}
1 & 0 & 1.402 \\
1 & -0.34414 & -0.71414 \\
1 & 1.772 & 0
\end{pmatrix}
\begin{pmatrix} Y \\ U - 128 \\ V - 128 \end{pmatrix}
\tag{4}
\]
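As a hypothetical end-to-end sketch (not the paper's exact implementation), the cumulative histogram h_k of section 2.1.1 and the LUT loop above combine into a single-channel matching routine:

function h = cumHist(X)
    % Cumulative histogram h_k(X) of section 2.1.1: fraction of pixels <= k.
    counts = histcounts(double(X(:)), 0:256);   % pixel counts per level 0..255
    h = cumsum(counts) / numel(X);
end

function out = matchChannel(target, source)
    % Remap 'target' so its cumulative histogram matches that of 'source'.
    g = cumHist(source);                        % style (source) distribution
    h = cumHist(target);                        % content (target) distribution
    LUT = zeros(1, 256);
    j = 0;
    for i = 0:255
        while (g(j+1) < h(i+1)) && (j < 255)
            j = j + 1;
        end
        LUT(i+1) = j;
    end
    out = uint8(LUT(double(target) + 1));       % look up every pixel at once
end

For the RGB matching of the next subsection, the routine is run once per channel, e.g. result(:,:,c) = matchChannel(gray, S(:,:,c)) for c = 1, 2, 3, with gray the target grayscale matrix and S the source color image.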
2.1.3 RGB Matching Results

The matching result produced by the algorithm above is shown in figure 6.

Figure 6: Pixel-wise LUT result

By checking the histograms in figure 7, we can see how the result's histogram has been reshaped to follow the source image.

Figure 7: Histogram of result and source image

Though the result colors the rising sun plausibly, two main problems remain. First, transforming on RGB can reduce the sharpness of the colors and make the image look "dirty"; second, the method entirely ignores spatial information, which is often useful when coloring similar images. To address the first problem, we next transform on the YUV space.