Localized Content-Based Image Retrieval Through Evidence Region Identification

Wu-Jun Li & Dit-Yan Yeung
Department of Computer Science and Engineering
Hong Kong University of Science and Technology, Hong Kong, China
{liwujun,dyyeung}@cse.ust.hk

Abstract

Over the past decade, multiple-instance learning (MIL) has been successfully utilized to model the localized content-based image retrieval (CBIR) problem, in which a bag corresponds to an image and an instance corresponds to a region in the image.
However, existing feature representation schemes are not effective enough to describe the bags in MIL, which hinders the adaptation of sophisticated single-instance learning (SIL) methods for MIL problems. In this paper, we first propose an evidence region (or evidence instance) identification method to identify the evidence regions supporting the labels of the images (i.e., bags). Then, based on the identified evidence regions, a very effective feature representation scheme, which is also very computationally efficient and robust to labeling noise, is proposed to describe the bags. As a result, the MIL problem is converted into a standard SIL problem and a support vector machine (SVM) can be easily adapted for localized CBIR. Experimental results on two challenging data sets show that our method, called EC-SVM, can outperform the state-of-the-art methods in terms of accuracy, robustness and efficiency.

1. Introduction

1.1. Background

According to the low-level image features used in the retrieval process, existing content-based image retrieval (CBIR) methods can be categorized into two major classes, namely, global methods and localized methods (a.k.a. localized CBIR [11, 12]). Global methods exploit features characterizing the global view of an image, such as color histograms, to compute the similarity between images. These methods have been widely used by traditional CBIR systems. Although global features can be extracted easily, in many cases, only a small part or several small parts of the image are useful for characterizing the visual content of the image. If features from the whole image area are used to represent an image, the useful information may be overridden by noisy information from irrelevant regions. For example, in Figure 3, if the interest of the user is in the object “FabricSoftenerBox”, the two images with label “FabricSoftenerBox” should have higher similarity than the first two images in the upper row.
However, the first two images in the upper row are expected to give higher similarity than the two images in the leftmost column if global methods are used. On the contrary, localized CBIR [11, 12, 13], which describes the task where the user is only interested in a portion of the image with the rest being irrelevant, is more natural and is in line with human perception. For example, in Figure 3, a user may only be interested in the apple in the image with label “Apple”.

A new learning paradigm called multiple-instance learning (MIL) [6]¹ was proposed to model learning problems where the class labels are only associated with sets of examples rather than individual examples. In MIL, an individual example is called an instance and a bag contains a set of instances. Training labels are associated with bags rather than instances. A bag is labeled positive if at least one of its instances is positive; otherwise, the bag is negative. In this paper, we use the term single-instance learning (SIL) to refer to the traditional supervised learning paradigm in which each individual example has a class label.

In the existing localized CBIR work, the region of interest can be either at a fixed location or marked by the user. The first case does not conform to the general image retrieval task and the second case requires too much effort from the user, making it unappealing in practice. Hence, the focus of this paper is to design a general automatic localized CBIR system that does not necessarily require the user to mark the region of interest. Specifically, we require that multiple labeled images be provided for the system to auto-

¹ Due to the page limit constraint, in this paper, we can only cite the most related references from the computer vision community or those focused on vision applications. Many other references, especially those from the machine learning community, can be found in [7].
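
The MIL labeling rule stated in Section 1.1 (a bag is positive if at least one of its instances is positive) can be illustrated with a minimal Python sketch. The function name `bag_label`, the image names, and the per-region labels below are hypothetical and serve only to show the rule; in the actual MIL setting the instance labels are not observed during training, and only the bag labels are given.

```python
from typing import Dict, List, Sequence


def bag_label(instance_labels: Sequence[bool]) -> bool:
    """Apply the standard MIL assumption to one bag.

    A bag (image) is positive if at least one of its instances
    (regions) is positive; otherwise the bag is negative.
    """
    return any(instance_labels)


# Hypothetical toy data: each image is a bag of region-level labels.
# These instance labels are assumed here for illustration only.
bags: Dict[str, List[bool]] = {
    "image_a": [False, True, False],  # one region shows the object
    "image_b": [False, False],        # no region shows the object
}

# Derive the bag-level label of every image from its regions.
labels = {name: bag_label(regions) for name, regions in bags.items()}
```

Under this rule a single positive region is enough to make the whole image positive, which is why identifying the regions that support an image's label matters for localized CBIR.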