
WO2002067143A1 - Image analysis for image database indexing - Google Patents


Info

Publication number
WO2002067143A1
WO2002067143A1 (PCT/GB2002/000675)
Authority
WO
WIPO (PCT)
Prior art keywords
image
images
primitive
database
query
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/GB2002/000675
Other languages
French (fr)
Inventor
Guoping Qiu
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Nottingham
Original Assignee
University of Nottingham
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Nottingham filed Critical University of Nottingham
Publication of WO2002067143A1 publication Critical patent/WO2002067143A1/en
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/60Analysis of geometric attributes
    • G06T7/66Analysis of geometric attributes of image moments or centre of gravity
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/5838Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using colour
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/5862Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using texture
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/40Analysis of texture
    • G06T7/41Analysis of texture based on statistical description of texture

Definitions

  • This invention relates to image analysis, and in particular image processing and database management and to image analysis especially for the indexing, categorisation, and content-based retrieval of electronically stored images.
  • the invention further relates to image representation in general, and to image representation particularly for the matching, indexing, categorisation or retrieval of images based on the image content.
  • the problem therefore arises of developing an image analysis technique whereby the images held in a database may be analysed according to their content and the database may be instructed to supply all images which look similar to a target or query image.
  • US 5,852,823 claims a system, responsive to a query image set, for retrieving, from a stored database, a desired image that is visually similar to the image set, said set containing at least one query image, said system comprising: an image database having stored images therein; a signature generator, responsive to an input image and having three successive convolution filters, for producing a signature for the input image, the signature containing a numeric measure of each of a plurality of pre-defined characteristics of the input image; wherein the signature generator, in response to the query image set, generates a corresponding signature for each image in the query image set, and, in response to a test image accessed from the stored database, generates a corresponding test signature; a statistics generator, responsive to the signature for each image in the query image set, for providing a separate pre-defined statistical measure, for each of the pre-defined characteristics, that represents variability of said each pre-defined characteristic across all images in the query image set; an image database manager, operative in conjunction with the image database, for retrieving successive ones
  • a signature is computed for each image in a set using multi-level iterative convolution filtering, with pixel values supplied as input to each filtering level being separately convolved with each one of a set of predefined Gaussian kernels.
  • Average and variance vectors are separately computed across corresponding elements in all the image signatures for the set.
  • a linguistic term, semantically descriptive of all the images in the set is associated with the numeric descriptors of this set and, with this set, is stored in a database.
  • the descriptor for any set is accessed by a textual search through the database using the appropriate linguistic term.
  • the descriptor is then compared against accessed signatures for other images in the database in order to retrieve an image, among those stored in the database, that is the most similar to those in the set associated with the descriptor.
  • US 5,893,095 claims a search engine, comprising: a function container capable of storing primitive functions; a registration interface storing functions to the function container; and a primitive supplying primitive functions to the registration interface, wherein the primitive functions include an analysis function capable of extracting features from an object, together with a method of comparison using such a search engine, and it goes on to describe a system and method for content-based search and retrieval of visual objects.
  • a base visual information retrieval (VIR) engine utilises a set of universal primitives to operate on the visual objects.
  • An extensible VIR engine allows custom, modular primitives to be defined and registered.
  • a custom primitive addresses domain specific problems and can utilize any image understanding technique.
  • Object attributes can be extracted over the entire image or over only a portion of the object.
  • a schema is defined as a specific collection of primitives.
  • a specific schema implies a specific set of visual features to be processed and a corresponding feature vector to be used for content-based similarity scoring.
  • a primitive registration interface registers custom primitives and facilitates storing of an analysis function and a comparison function to a schema table.
  • a heterogeneous comparison allows objects analysed by different schemas to be compared if at least one primitive is in common between the schemas.
  • a threshold-based comparison is utilised to improve performance of the VIR engine.
  • a distance between two feature vectors is computed in any of the comparison processes so as to generate a similar
  • the present invention provides an image analysis system, which may be a processing or representation system, in which an input image is deconstructed into primitive images corresponding to palette features of the input image and in which the geometric properties of the binary images are measured and populations of different geometrical measures of the respective palette features are tabulated.
  • the present invention also provides an image indexing system in which the image is deconstructed into a plurality of primitive images each reflecting a palette feature of the original image and then tabulating the populations of distinct geometrical features of each such primitive image.
  • the invention includes an image retrieval system for retrieving images similar to a query image from an image database in which an algorithm is applied to the query image to deconstruct the query image into a plurality of primitive query images each reflecting a palette feature of the original query image and then tabulate the populations of distinct geometrical features of each such primitive query image to form a query table, and then comparing the query table with image tables created by applying said algorithm to images held in said database.
  • the invention includes an image database which contains image tables created by applying an algorithm to each of a set of images to deconstruct each of those images into a plurality of primitive images each reflecting a palette feature of the original image and then tabulating the populations of distinct geometrical features of each such primitive image to form a lookup table.
  • the invention also includes a workstation programmed with an algorithm adapted to be applied to an image to deconstruct that image into a plurality of primitive images each reflecting a palette feature of the original image and then to tabulate the populations of distinct geometrical features of each such primitive image to form a digital table.
  • Such workstation may be remote from or wired to a database storing images optionally operated on by said algorithm.
  • the workstation is suitably programmed to apply the algorithm to a query image to form a query table and then to compare the query table with image tables created by applying said algorithm to images held in said database, in order to retrieve images similar to the query image.
  • the invention extends to a method of retrieving images similar to a query image from an image database in which an algorithm is applied to the query image to deconstruct the query image into a plurality of primitive query images each reflecting a palette feature of the original query image and then tabulate the populations of distinct geometrical features of each such primitive query image to form a query table, and then comparing the query table with image tables created by applying said algorithm to images held in said database.
  • the invention also extends to a method of indexing images in which an algorithm is applied to the image to deconstruct it into a plurality of primitive images each reflecting a palette feature of the original image and then to tabulate the populations of distinct geometrical features of each such primitive image.
  • the invention further extends to an object recognition system in which an image is formed of the object to be recognised, an algorithm is applied to the object image to deconstruct the object image into a plurality of primitive object images each reflecting a palette feature of the original object image and then tabulate the populations of distinct geometrical features of each such primitive object image to form a query table, and then comparing that query table with image tables created by applying said algorithm to images held in an image database.
  • the invention includes a method of creating an image content table which comprises: defining palette features of an image and deconstructing the image into a plurality of binary primitive images each reflecting a said palette feature;
  • the invention also includes an image content table which comprises a tabulated list of populations of different geometrical features of each of a plurality of primitive binary images.
  • the primitive images are binary images having a plurality of pixels each having one of two values associated with it.
  • each pixel in the binary image is associated with a respective part of the deconstructed image from which it is formed.
  • Each said part of the deconstructed image most conveniently comprises a pixel of the deconstructed image.
  • the primitive images may each indicate which of said parts have the respective palette feature and which do not.
  • each palette feature corresponds to a respective one of the classifications .
  • each palette feature has a palette feature vector associated with it and each pixel of the image is classified by defining a pixel feature vector for the pixel and determining which of the palette feature vectors it is closest to.
  • the at least one image feature may include surface texture, or colour.
  • each image feature has a plurality of possible categories, and each defined palette feature corresponds to a plurality of said categories.
  • geometric features of each primitive image are recorded in an array, for example a one-dimensional array, and the arrays for the primitive images are combined to form a table.
  • the geometric properties of the primitive images include geometric properties of connected elements within the primitive images.
  • the geometrical features tabulated may be of various kinds. Generally, the tabulation is based on an analysis of the rows and columns of pixels making up the primitive binary image, though other analytical systems are possible, for example by referring to blocks of pixels, for example connected blocks of pixels.
  • the tabulation comprises a table of the populations of rows and columns of pixels of each palette feature of a given length or height.
  • the table may reflect the presence of R1 rows made up of one pixel, R2 rows made up of two pixels and so on, and of C1 columns one pixel high, C2 columns which are two pixels high, and so on.
  • the geometrical features may thus be considered as rows and columns of a given pixel length or height.
  • the tabulation comprises the total number of pixels containing the selected palette feature in each row and column.
  • the geometrical feature is thus simply the active pixel content of each row or column.
  • the geometrical feature maps to the total size of each continuous array of pixels of a given palette feature.
  • the palette feature analysed may be a colour tone, a pattern or texture, or any other tonal value contributing to the appearance of the image which is being subjected to the algorithm.
  • at least one said palette feature represents a collocation of a plurality of visually distinguishable tonal values of the respective image.
  • the algorithm may be adapted to reduce the colour palette prior to tabulation. For example the image may be reduced to a colour palette of 128 colours, or of 64 or 32 or even fewer colours. This of course greatly reduces the memory space required for storing the tabular information required if it is desired to maintain an entire database in tabular form.
  • the algorithm may be applied in such a way that, for example, all blue tones are combined to a single blue palette feature and the image is tabulated on that basis.
  • the invention is not of course confined to the amalgamation of like colours into a single palette feature. But the algorithm applied to the stored images and the query image must be compatible.
  • the present invention takes the previously known techniques a step further by providing a method of representing image contents based on the co-occurrence of geometric features and surface appearance features and summarising the co-occurrence statistics in a tabular form.
  • a number of innovative features of the current invention make this invention advantageous over prior art technologies.
  • the use of the co-occurrence of the surface appearance features (such as colour tone, texture characteristics etc) and the geometric structure (shape, size, and other parameters) of image areas of homogeneous appearance makes the representation more discriminative, thus enabling more accurate image retrieval.
  • the decomposition of the image into binary primitive form also allows fast and flexible processing.
  • the present invention is not only applicable to image indexing and retrieval but also has more general application in general image matching and object recognition.
  • Figure 1 is a general schematic diagram illustrating image content table generation
  • Figure 2 illustrates a palette feature table relating to different colours
  • Figure 3 illustrates a palette feature table relating to different texture features
  • Figure 4 illustrates a classifier
  • Figure 5 shows a binary image used to illustrate the construction of horizontal and vertical pixel arrays
  • Figure 6 shows an 8-connected pixel array and a 4-connected pixel array
  • FIG. 7 illustrates connected component labelling
  • Figure 8 illustrates how a primitive binary image may be divided into horizontal and vertical patches
  • Figure 9 is an example illustrating the calculation of vertical and horizontal projections
  • Figure 10 is an image content table (ICT) based on Horizontal and Vertical Line Measurement
  • Figure 11 is an image content table (ICT) based on Connected Component Measurement
  • Figure 12 is an image content table (ICT) based on Horizontal and Vertical Projections;
  • Figure 13 illustrates an application of Image Content Table to Image Database Indexing and Retrieval;
  • Figure 14 illustrates the construction of object class prototypes/models and a recognition model
  • Figure 15 illustrates steps involved in the recognition of an unknown object.
  • the image analysis method of the invention uses a novel concept of image content table (ICT) .
  • Fig. 1 shows the schematic of a method for generating an ICT.
  • the input image is first passed through a classifier, which classifies the input image pixels into a number of classes (N) according to the palette features (PFs) which are to be analysed.
  • These palette features result in N feature vectors characterising image tonal value properties such as colour tones or surface pattern or texture properties. It is not essential that each individual visually or otherwise distinguishable colour or other tonal value should be assigned to a palette feature giving rise to a feature vector. Indeed, it is preferred that the various colour tones of the input image should be assigned to a reduced colour palette in which a single palette feature represents a plurality of tonal values of the input image. Thus the various colour tones of the input image are preferably assigned to one of, for example, 128 different palette features. In a simpler system requiring even less memory, the palette used for analysing the input image contains 64 or 32 or 16 or 8 colours or features.
  • N primitive binary images are formed, each of the same dimensions as the input image.
  • Each PBI corresponds to a palette feature (vector), and the pixel values of the PBI indicate the presence or absence of the palette feature in the input image at the pixel locations.
  • a pixel value of 1 in the PBI indicates the presence of the feature and a pixel value of 0 indicates the absence of that feature. (However, since it is a binary image, assigning the pixel values the other way round is also possible.)
  • the primitive binary images are then analysed by the binary image analyser, whose primary function is to characterise the geometric features of the 1-valued pixels in the PBIs.
  • a two-dimensional table, the image content table (ICT) which characterises the frequencies of the co-occurrence of the palette features and the geometric features is then generated.
  • Palette Feature Vectors: the palette features are contained in a pre-determined table consisting of N entries, each pointing to an image feature vector.
  • There are various properties of the input image which may be considered as image features: these include (1) colour and (2) surface texture properties. Other features are possible.
  • the palette feature table for the case where colour tones are used as image features is shown in Fig. 2.
  • each palette feature is characterised by a 3-dimensional vector in an appropriate colour space, and for illustration purposes, an RGB colour space is used here. It is to be noted that the use of other colour spaces is also possible.
  • palette features that characterise the surface texture or pattern or other properties of the image.
  • the palette table for the case when texture properties are used as image features is shown in Fig.3, where it is assumed the texture property of a pixel location is characterised by an m-dimensional texture feature vector, obtained for example by a filtering based approach.
  • the texture features can be used to describe the texture surface of grey-scale as well as colour images (see A. Jain and G. Healey, "A multiscale representation including opponent color features for texture recognition", IEEE Transactions on Image Processing, vol. 7, no. 1, pp. 124-128, January 1998).
  • the N feature vectors should be chosen as statistically representative vectors in the chosen feature space.
  • one way of generating the palette feature vectors is by vector quantization [Reference 4], using as many example images as possible to generate training samples.
  • the process of obtaining the palette feature vectors need only be performed once; after the feature vectors are generated, they are fixed and stored for use by the classifier. (Generating the palette feature table can easily be done by those skilled in the art; see e.g. Gervautz, M. & Purgathofer, W.)
  • Let X denote the PFV. When colour is the image feature, X = (r_x, g_x, b_x).
  • When texture is the image feature, the PFV is the m-dimensional texture characteristic feature vector, X = (T_x(1), T_x(2), ..., T_x(m)).
  • Each pixel's PFV in the original input image is compared with the N palette feature vectors in the Euclidean space.
  • the classifier outputs the index of the palette feature vector that is the closest to the PFV, X.
  • each pixel in the input image will have been assigned an index, the pixel feature index (PFI), ranging from 1 to N, indicating the assignment of the pixel into one of the N palette features.
  • a pixel in the i-th primitive binary image will have a value of 1 if the pixel in the corresponding location in the input original image is associated with the i-th palette feature in the palette feature table, and a value of 0 if the pixel in the corresponding location in the input original image is assigned to any other palette feature.
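The classification and deconstruction steps described above can be sketched as follows. This is an illustrative reconstruction, not the patent's own code: the function name `deconstruct` and the use of NumPy are assumptions, but the logic (assign each pixel to its Euclidean-nearest palette feature vector, then form one binary image per palette feature) follows the text.

```python
import numpy as np

def deconstruct(image, palette):
    """Classify each pixel to its nearest palette feature vector
    (Euclidean distance) and form one primitive binary image (PBI)
    per palette feature.

    image:   (H, W, d) array of per-pixel feature vectors (e.g. RGB)
    palette: (N, d) array of palette feature vectors
    returns: (N, H, W) array of 0/1 PBIs
    """
    h, w, d = image.shape
    pixels = image.reshape(-1, d).astype(float)
    # Squared Euclidean distance from every pixel to every palette vector
    dists = ((pixels[:, None, :] - palette[None, :, :]) ** 2).sum(axis=2)
    pfi = dists.argmin(axis=1).reshape(h, w)   # pixel feature index, 0..N-1
    pbis = np.zeros((len(palette), h, w), dtype=np.uint8)
    for i in range(len(palette)):
        pbis[i] = (pfi == i).astype(np.uint8)  # 1 where feature i is present
    return pbis
```

Each pixel is assigned to exactly one palette feature, so summing the PBIs over the feature axis yields an all-ones image.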
  • a set of line length values, L1, L2, ..., LK, is pre-set.
  • a vertical measurement array GV = {GV(1), GV(2), ..., GV(K)} is formed, whose elements are the numbers of connected vertical lines of lengths L1, L2, ..., LK. Note that the number of line length values, and indeed the line length values themselves, for the horizontal and vertical directions do not have to be the same.
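A minimal sketch of this line measurement, under one assumption the text leaves open: here a run of 1-valued pixels is counted only if its length exactly equals one of the preset values (near-misses could instead be binned to the nearest preset length).

```python
import numpy as np

def run_lengths(line):
    """Lengths of maximal runs of 1-valued pixels in a 1-D sequence."""
    runs, n = [], 0
    for v in line:
        if v:
            n += 1
        elif n:
            runs.append(n)
            n = 0
    if n:
        runs.append(n)
    return runs

def line_measurement(pbi, lengths):
    """GH/GV arrays: counts of horizontal/vertical connected lines of
    1-valued pixels whose length equals each preset value in `lengths`."""
    gh = [0] * len(lengths)
    gv = [0] * len(lengths)
    for row in np.asarray(pbi):            # horizontal lines
        for r in run_lengths(row):
            if r in lengths:
                gh[lengths.index(r)] += 1
    for col in np.asarray(pbi).T:          # vertical lines
        for r in run_lengths(col):
            if r in lengths:
                gv[lengths.index(r)] += 1
    return gh, gv
```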
  • connected 1-valued pixels are given the same label.
  • Two pixels can be 8-connected or 4-connected as illustrated in Figs. 6a and 6b respectively.
  • In Fig. 6a, pixel 1 and pixels 2, 3, 4, 5, 6, 7, 8, and 9 are 8-connected.
  • In Fig. 6b, pixel 1 and pixels 2, 3, 4, and 5 are 4-connected.
  • 8-connected pixels are considered here.
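The 8-connected component labelling can be sketched with a simple breadth-first flood fill; the function name and data layout are illustrative, not taken from the patent.

```python
from collections import deque

def label_components(pbi):
    """8-connected component labelling of 1-valued pixels via BFS.
    Returns a label map (0 = background) and the component count."""
    h, w = len(pbi), len(pbi[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for sy in range(h):
        for sx in range(w):
            if pbi[sy][sx] and not labels[sy][sx]:
                count += 1                      # new component found
                labels[sy][sx] = count
                q = deque([(sy, sx)])
                while q:
                    y, x = q.popleft()
                    # visit all 8 neighbours of (y, x)
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < h and 0 <= nx < w
                                    and pbi[ny][nx] and not labels[ny][nx]):
                                labels[ny][nx] = count
                                q.append((ny, nx))
    return labels, count
```

Under 8-connectivity, diagonally adjacent 1-pixels belong to the same component, which they would not under 4-connectivity.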
  • Another method of analysing the primitive binary images is based on the projection of the 1-valued pixels along the horizontal and vertical directions. Assuming the original input image consists of M (row) x N (column) pixels, we divide the binary image into horizontal and vertical patches, each consisting of an equal number of rows or columns of pixels, as illustrated in Fig. 8. Assuming we divide the image into V (1 ≤ V ≤ N) vertical and H (1 ≤ H ≤ M) horizontal patches, then each vertical patch will consist of INT(N/V) columns of pixels, and each horizontal patch will consist of INT(M/H) rows of pixels, where INT(x) denotes the integer part of x.
  • Fig. 9 shows an example calculating the vertical and horizontal projection.
  • the original image is 12 rows x 13 columns (pixels).
  • We want to divide it into 6 horizontal patches and 7 vertical patches. Since 13/7 ≈ 1.86 is not an integer, we duplicate the last column of pixels to make the image 12 rows x 14 columns (pixels). Therefore, each vertical patch consists of two columns of pixels, and each horizontal patch consists of two rows of pixels.
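The projection with edge duplication can be sketched as below. One assumption: rows/columns are duplicated repeatedly until the dimension divides evenly, generalising the single duplication in the Fig. 9 example; each projection element is taken to be the count of 1-pixels in a patch.

```python
import numpy as np

def patch_projections(pbi, n_h, n_v):
    """Vertical and horizontal projections of 1-valued pixels.

    The PBI is split into n_v vertical patches (groups of columns) and
    n_h horizontal patches (groups of rows); the last column/row is
    duplicated until the dimensions divide evenly, as in the Fig. 9
    example. Each projection element is the 1-pixel count of a patch.
    """
    a = np.asarray(pbi)
    while a.shape[1] % n_v:
        a = np.hstack([a, a[:, -1:]])      # duplicate last column
    while a.shape[0] % n_h:
        a = np.vstack([a, a[-1:, :]])      # duplicate last row
    cols = a.shape[1] // n_v               # columns per vertical patch
    rows = a.shape[0] // n_h               # rows per horizontal patch
    gpv = [int(a[:, i*cols:(i+1)*cols].sum()) for i in range(n_v)]
    gph = [int(a[i*rows:(i+1)*rows, :].sum()) for i in range(n_h)]
    return gpv, gph
```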
  • an image content table (ICT) is generated.
  • the ICT is a two-dimensional array formed by arranging the 1-dimensional geometric measurement arrays produced by the image analyser.
  • the table will be different and the construction of the ICT for each of the three measurements described above is illustrated in Figures 10 to 12 respectively.
  • the image content table is formed by stacking GHL(i) and GVL(i) to form the i-th column of the 2-dimensional ICT as shown in Fig. 10.
  • ICT based on Connected Component Measurement (Fig. 11): let GS(i) = {GS(i, 1), GS(i, 2), ..., GS(i, K)} be the geometric property measurement array generated by the image analyser based on the method of Connected Component Measurement for the i-th primitive binary image.
  • the image content table, ICT, is formed by arranging GS(i) into the i-th column of the ICT as shown in Fig. 11.
  • the image content table is formed by stacking GPV(i) and GPH(i) to form the i-th column of the 2-dimensional ICT as shown in Fig. 12.
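The stacking step is mechanical and can be sketched as follows for the line-measurement case (the helper name `build_ict` is an assumption): column i of the table is GH(i) stacked on GV(i).

```python
import numpy as np

def build_ict(gh_arrays, gv_arrays):
    """Image content table based on line measurement: the i-th column
    is the i-th PBI's horizontal array GH(i) stacked on its vertical
    array GV(i), giving a 2-D table with one column per palette feature."""
    return np.column_stack([np.concatenate([gh, gv])
                            for gh, gv in zip(gh_arrays, gv_arrays)])
```

The connected-component and projection variants differ only in which 1-D measurement arrays are placed in each column.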
  • Image Content Table for Image Database Applications: a principal application of the image content table is in image database management, where it can be used in content-based image indexing and retrieval. The scenario is illustrated in Fig. 13.
  • an image content table (ICT) is constructed and stored on the disk as part of the image database.
  • Such a database constructed using our ICT can be used to implement a content-based image query, or a query by pictorial example.
  • a user has an image and wants to find from the database images that are similar to the query image.
  • An image content table, ICTQ is first constructed for the query image. Then this ICTQ is compared to the ICTs of the database.
  • a similarity scorer calculates a metric measurement between the query image's ICT and the ICTs of the database images.
  • One possible implementation of the Similarity Scorer is as follows.
  • the similarity score between the two images can be calculated:
  • each image in the database can be given a similarity score.
  • All the scores may be sorted in an increasing order (the smaller the score, the more similar the images) and returned to the user. From this sorted list, the user can identify the images from the database that are most similar to the query.
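The patent leaves the exact metric open; the sketch below uses the L1 distance between two ICTs as one plausible choice (an assumption), together with the sort-ascending ranking described above.

```python
import numpy as np

def similarity_score(ict_query, ict_db):
    """L1 distance between two image content tables of equal shape.
    Smaller score means more similar images. (The L1 metric is one
    plausible choice; the text does not fix the formula.)"""
    return float(np.abs(np.asarray(ict_query, dtype=float)
                        - np.asarray(ict_db, dtype=float)).sum())

def rank_database(ict_query, db_icts):
    """Score every database ICT against the query ICT and return
    (image_id, score) pairs sorted in increasing score order,
    i.e. most similar first."""
    scores = [(img_id, similarity_score(ict_query, ict))
              for img_id, ict in db_icts.items()]
    return sorted(scores, key=lambda s: s[1])
```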
  • the image content table has more general applications in machine vision based automatic object recognition.
  • an object is first imaged using a sensor, e.g. a CCD camera.
  • the image of the object is deconstructed, e.g. using a pre-designed colour palette table, into a set of primitive binary images, and an image content table is constructed to represent the image (object).
  • general automatic object recognition machine systems can be constructed.
  • Fig. 14 illustrates how the ICT can be used in constructing object class prototype/model and recognition models.
  • Each sample object of known class identity is imaged and represented using an ICT, which can be regarded as a feature vector (pattern).
  • pattern can be used to construct object class prototypes/models using an appropriate recognition strategy.
  • Fig. 15 illustrates the use of the object class prototype/model ICT to recognise unknown objects.
  • An unknown object is first imaged and then the image of the unknown object is represented in image content table form and presented to the same pattern recognition algorithms which will use the object prototype/model information to identify the class identity of the unknown object automatically.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Geometry (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A method of image analysis for the classification, indexing and searching of images includes forming a plurality of primitive binary images, each representing the presence or absence of a respective palette feature for each pixel of the image, analysing geometric properties of each of the primitive binary images to form a one-dimensional array, and combining the arrays to form an image content table for the image.

Description

IMAGE ANALYSIS FOR IMAGE DATABASE INDEXING
This invention relates to image analysis, and in particular image processing and database management and to image analysis especially for the indexing, categorisation, and content-based retrieval of electronically stored images. The invention further relates to image representation in general, and to image representation particularly for the matching, indexing, categorisation or retrieval of images based on the image content.
In present times, literally millions of images are created each day and there is accordingly a huge body of image information available which may be stored as files in an image database. Plainly, there is a requirement that any image file can be retrieved from such a database and printed or otherwise displayed on demand.
It is a relatively straightforward matter to search a database of text files for the presence of a given alphanumeric string because a limited linguistic or mathematical vocabulary is associated with the text file, and even in large databases, searching can be carried out quite rapidly, using keywords as required. It would of course be possible to attach one or more keywords as an alphanumeric string to any image file and to search the database on that basis, but this is not satisfactory since different people might ascribe different keywords to any given image, or the same keywords to a wide variety of different images. People tend to prefer to interrogate an image database in accordance with what the image itself contains rather than in accordance with some label that is put on that image.
The problem therefore arises of developing an image analysis technique whereby the images held in a database may be analysed according to their content and the database may be instructed to supply all images which look similar to a target or query image.
Useful reviews of the recent development of systems for image retrieval appear in sections of US Patents 5,852,823 (Microsoft) and 5,893,095 (Virage, Inc.), in particular in the sections respectively headed "Background of the Disclosure" and "Background of the Invention".
In particular, US 5,852,823 claims a system, responsive to a query image set, for retrieving, from a stored database, a desired image that is visually similar to the image set, said set containing at least one query image, said system comprising: an image database having stored images therein; a signature generator, responsive to an input image and having three successive convolution filters, for producing a signature for the input image, the signature containing a numeric measure of each of a plurality of pre-defined characteristics of the input image; wherein the signature generator, in response to the query image set, generates a corresponding signature for each image in the query image set, and, in response to a test image accessed from the stored database, generates a corresponding test signature; a statistics generator, responsive to the signature for each image in the query image set, for providing a separate pre-defined statistical measure, for each of the pre-defined characteristics, that represents variability of said each pre-defined characteristic across all images in the query image set; a image database manager, operative in conjunction with the image database, for retrieving successive ones of a plurality of stored images from the database and for routing each one of the plurality of successive images as the test image to the signature generator so as to generate a plurality of corresponding test signatures; a comparator for comparing, across all of the pre-defined characteristics, the statistical measure for each of the pre-defined characteristics against a corresponding numeric measure in each one of the test signatures so as to yield a similarity measure, the similarity measure for said one of the test images being reflective of a degree of visual similarity between the one test image and the query image set; and a manager, responsive to the similarity measure for each of the test images and operative in conjunction with the image database 
manager, for selecting at least one of the test images having a highest relative similarity measure as a retrieved image and providing the retrieved image as output, and it describes an apparatus and an accompanying method for generating a semantically based, linguistically searchable, numeric descriptor of a pre-defined group of input images and which is particularly useful in a system for automatically classifying individual images, on a numerical basis, in, e.g. , an image database, and, through a query-by-example paradigm, retrieving a desired image (s) therefrom. Specifically, a signature is computed for each image in a set using multi-level iterative convolution filtering, with pixel values supplied as input to each filtering level being separately convolved with each one of a set of predefined Gaussian kernels. Average and variance vectors, as collective numeric descriptors of all the images in the set, are separately computed across corresponding elements in all the image signatures for the set. A linguistic term, semantically descriptive of all the images in the set, is associated with the numeric descriptors of this set and, with this set, is stored in a database. For image retrieval, the descriptor for any set is accessed by a textual search through the database using the appropriate linguistic term. The descriptor is then compared against accessed signatures for other images in the database in order to retrieve a image, among those stored in the database, that is the most similar to those in the set associated with the descriptor. 
US 5,893,095 claims a search engine, comprising: a function container capable of storing primitive functions; a registration interface storing functions to the function container; and a primitive supplying primitive functions to the registration interface, wherein the primitive functions include an analysis function capable of extracting features from an object, together with a method of comparison using such a search engine, and it goes on to describe a system and method for content-based search and retrieval of visual objects. A base visual information retrieval (VIR) engine utilises a set of universal primitives to operate on the visual objects. An extensible VIR engine allows custom, modular primitives to be defined and registered. A custom primitive addresses domain specific problems and can utilize any image understanding technique. Object attributes can be extracted over the entire image or over only a portion of the object. A schema is defined as a specific collection of primitives. A specific schema implies a specific set of visual features to be processed and a corresponding feature vector to be used for content-based similarity scoring. A primitive registration interface registers custom primitives and facilitates storing of an analysis function and a comparison function to a schema table. A heterogeneous comparison allows objects analysed by different schemas to be compared if at least one primitive is in common between the schemas. A threshold-based comparison is utilised to improve performance of the VIR engine. A distance between two feature vectors is computed in any of the comparison processes so as to generate a similarity score.
One particularly promising approach to image database indexing and retrieval is the query-by-image-content (QBIC) method (W. Niblack et al, "Querying images by content using colour, texture and shape", Proc. SPIE Conference on Storage and Retrieval for Image and Video Databases, vol. 1908, pp. 173-187, 1993) whereby the visual contents of the images, such as colour distribution (represented by a colour histogram), texture attributes and other image features are extracted from the image using computer vision/image processing techniques and used as indexing keys. In an image database, these visual keys are stored along with the actual image data, and image retrieval from the database is based on the matching of the model's visual keys with those of the query image(s).
Early approaches used colour distribution alone for content-based image indexing and retrieval (M. Swain and D. Ballard, "Colour Indexing", International Journal of Computer Vision, Vol. 7, pp. 11-32, 1991). Because a colour histogram is a global measure and contains no information about the relative spatial distribution of the colours, the success of the colour histogram method is quite limited. Recognising the importance of relative spatial information, researchers have proposed various methods to incorporate such information in the indexing and retrieval for image databases. Although it is not possible to be exhaustive, representative approaches to incorporating the spatial information into content-based image indexing and retrieval include: classical texture descriptors (W.Y. Ma and B.S. Manjunath, "NeTra: A toolbox for navigating large image databases", Multimedia Systems, 7, 184-198, 1999), image segmentation ("blobworld") (C. Carson et al, "Blobworld: A system for region-based image indexing and retrieval", Proc. Int. Conf. Visual Information Systems, Springer-Verlag, 1999), joint histograms of colours and edges (G. Pass and R. Zabih,
"Comparing images using joint histograms", Multimedia Systems, vol. 7, pp. 234-240, Springer-Verlag, 1999), and the colour correlogram (J. Huang et al, "Image indexing using colour correlogram", IEEE CVPR, pp. 762-768, 1997). Recent work has also proposed the use of a binary space partitioning tree as an efficient representation for capturing the colours and their spatial distribution information (G. Qiu and S. Sudirman, "Representation and retrieval of colour images using binary space partitioning tree", 8th Colour Imaging Conference, Scottsdale, Arizona, USA, Nov. 2000, pp. 195-201).
A survey of various techniques for indexing and retrieval can be found in (Y. Rui, T.S. Huang and S.F. Chang, "Image Retrieval: Current Techniques, Promising Directions, and Open Issues", Journal of Visual Communication and Image Representation, Vol. 10, pp. 39-62, 1999).
It is an object of this invention to provide a new and alternative system for analysing and representing the contents of an image which analysis or representation may then be applied in various ways relating to image database management including the indexing, comparison, retrieval and recognition of images based on the image content, and which has certain advantages over previously known techniques as will be referred to later in this specification.
The present invention provides an image analysis system, which may be a processing or representation system, in which an input image is deconstructed into primitive images corresponding to palette features of the input image and in which the geometric properties of the binary images are measured and populations of different geometrical measures of the respective palette features are tabulated.
The present invention also provides an image indexing system in which the image is deconstructed into a plurality of primitive images each reflecting a palette feature of the original image and then tabulating the populations of distinct geometrical features of each such primitive image. The invention includes an image retrieval system for retrieving images similar to a query image from an image database in which an algorithm is applied to the query image to deconstruct the query image into a plurality of primitive query images each reflecting a palette feature of the original query image and then tabulate the populations of distinct geometrical features of each such primitive query image to form a query table, and then comparing the query table with image tables created by applying said algorithm to images held in said database.
The invention includes an image database which contains image tables created by applying an algorithm to each of a set of images to deconstruct each of those images into a plurality of primitive images each reflecting a palette feature of the original image and then tabulating the populations of distinct geometrical features of each such primitive image to form a lookup table.
The invention also includes a workstation programmed with an algorithm adapted to be applied to an image to deconstruct that image into a plurality of primitive images each reflecting a palette feature of the original image and then to tabulate the populations of distinct geometrical features of each such primitive image to form a digital table.
Such a workstation may be remote from or wired to a database storing images optionally operated on by said algorithm. The workstation is suitably programmed to apply the algorithm to a query image to form a query table and then to compare the query table with image tables created by applying said algorithm to images held in said database, in order to retrieve images similar to the query image.
The invention extends to a method of retrieving images similar to a query image from an image database in which an algorithm is applied to the query image to deconstruct the query image into a plurality of primitive query images each reflecting a palette feature of the original query image and then to tabulate the populations of distinct geometrical features of each such primitive query image to form a query table, and then comparing the query table with image tables created by applying said algorithm to images held in said database.
The invention also extends to a method of indexing images in which an algorithm is applied to the image to deconstruct it into a plurality of primitive images each reflecting a palette feature of the original image and then to tabulate the populations of distinct geometrical features of each such primitive image.
The invention further extends to an object recognition system in which an image is formed of the object to be recognised, an algorithm is applied to the object image to deconstruct the object image into a plurality of primitive object images each reflecting a palette feature of the original object image and then tabulate the populations of distinct geometrical features of each such primitive object image to form a query table, and then comparing that query table with image tables created by applying said algorithm to images held in an image database.
The invention includes a method of creating an image content table which comprises: defining palette features of an image and deconstructing the image into a plurality of binary primitive images each reflecting a said palette feature;
analysing geometrical features of each such primitive image; and tabulating populations of different geometrical features of each such primitive image. The invention also includes an image content table which comprises a tabulated list of populations of different geometrical features of each of a plurality of primitive binary images.
Preferably the primitive images are binary images having a plurality of pixels each having one of two values associated with it.
Preferably each pixel in the binary image is associated with a respective part of the deconstructed image from which it is formed. Each said part of the deconstructed image most conveniently comprises a pixel of the deconstructed image.
The primitive images may each indicate which of said parts have the respective palette feature and which do not.
Preferably there are defined a plurality of classifications of at least one image feature of an image, and each palette feature corresponds to a respective one of the classifications .
Preferably each palette feature has a palette feature vector associated with it and each pixel of the image is classified by defining a pixel feature vector for the pixel and determining which of the palette feature vectors it is closest to.
The at least one image feature may include surface texture, or colour.
Preferably each image feature has a plurality of possible categories, and each defined palette feature corresponds to a plurality of said categories. Preferably geometric features of each primitive image are recorded in an array, for example a one-dimensional array, and the arrays for the primitive images are combined to form a table.
Preferably the geometric properties of the primitive images include geometric properties of connected elements within the primitive images.
The geometrical features tabulated may be of various kinds. Generally, the tabulation is based on an analysis of the rows and columns of pixels making up the primitive binary image, though other analytical systems are possible, for example by referring to blocks of pixels, for example connected blocks of pixels .
In some embodiments of the invention, the tabulation comprises a table of the populations of rows and columns of pixels of each palette feature of a given length or height. For example the table may reflect the presence of R1 rows made up of one pixel, R2 rows made up of two pixels and so on, and of C1 columns one pixel high, C2 columns which are two pixels high, and so on. There would be a table for each palette feature selected for analysis. The geometrical features may thus be considered as rows and columns of a given pixel length or height.
In other embodiments of the invention, the tabulation comprises the total number of pixels containing the selected palette feature in each row and column. The geometrical feature is thus simply the active pixel content of each row or column.
In yet other embodiments of the invention, the geometrical feature maps to the total size of each continuous array of pixels of a given palette feature. The palette feature analysed may be a colour tone, a pattern or texture, or any other tonal value contributing to the appearance of the image which is being subjected to the algorithm. In particularly preferred embodiments of the invention, at least one said palette feature represents a collocation of a plurality of visually distinguishable tonal values of the respective image. Thus the algorithm may be adapted to reduce the colour palette prior to tabulation. For example the image may be reduced to a colour palette of 128 colours, or of 64 or 32 or even fewer colours. This of course greatly reduces the memory space required for storing the tabular information required if it is desired to maintain an entire database in tabular form.
The algorithm may be applied in such a way that, for example, all blue tones are combined to a single blue palette feature and the image is tabulated on that basis. The invention is not of course confined to the amalgamation of like colours into a single palette feature. But the algorithm applied to the stored images and the query image must be compatible.
The present invention takes the previously known techniques a step further by providing a method of representing image contents based on the co-occurrence of geometric features and surface appearance features and summarising the co-occurrence statistics in a tabular form. A number of innovative features of the current invention make this invention advantageous over prior art technologies. The use of the co-occurrence of the surface appearance features (such as colour tone, texture characteristics etc) and the geometric structure (shape, size, and other parameters) of image areas of homogeneous appearance makes the representation more discriminative, thus enabling more accurate image retrieval. The decomposition of the image into binary primitive form also allows fast and flexible processing. The present invention is not only applicable to image indexing and retrieval but also has more general application in general image matching and object recognition.
Preferred embodiments of the invention will now be described in greater detail with reference to the accompanying diagrammatic drawings in which:
Figure 1 is a general schematic diagram illustrating image content table generation;
Figure 2 illustrates a palette feature table relating to different colours;
Figure 3 illustrates a palette feature table relating to different texture features;
Figure 4 illustrates a classifier;
Figure 5 shows a binary image used to illustrate the construction of horizontal and vertical pixel arrays;
Figure 6 shows an 8-connected pixel array and a 4-connected pixel array;
Figure 7 illustrates connected component labelling;
Figure 8 illustrates how a primitive binary image may be divided into horizontal and vertical patches;
Figure 9 is an example illustrating the calculation of vertical and horizontal projections;
Figure 10 is an image content table (ICT) based on Horizontal and Vertical Line Measurement;
Figure 11 is an image content table (ICT) based on Connected Component Measurement;
Figure 12 is an image content table (ICT) based on Horizontal and Vertical Projections;
Figure 13 illustrates an application of an Image Content Table to Image Database Indexing and Retrieval;
Figure 14 illustrates the construction of object class prototypes/models and a recognition model; and
Figure 15 illustrates steps involved in the recognition of an unknown object.
Image Content Table
The image analysis method of the invention uses the novel concept of an image content table (ICT). Fig. 1 shows the schematic of a method for generating an ICT.
The input image is first passed through a classifier, which classifies the input image pixels into a number of classes (N) according to the palette features (PFs) which are to be analysed. These palette features result in N feature vectors characterising image tonal value properties such as colour tones or surface pattern or texture properties. It is not essential that each individual visually or otherwise distinguishable colour or other tonal value should be assigned to a palette feature giving rise to a feature vector. Indeed, it is preferred that the various colour tones of the input image should be assigned to a reduced colour palette in which a single palette feature represents a plurality of tonal values of the input image. Thus the various colour tones of the input image are preferably assigned to one of, for example, 128 different palette features. In a simpler system requiring even less memory, the palette used for analysing the input image contains 64 or 32 or 16 or 8 colours or features.
According to the result of the classifier, N primitive binary images (PBI) are formed, each of the same dimensions as the input image. Each PBI corresponds to a palette feature (vector), and the pixel values of the PBI indicate the presence or absence of the palette feature in the input image at the pixel locations. Generally, a pixel value of 1 in the PBI indicates the presence of the feature and a pixel value of 0 indicates the absence of that feature. (However, since it is a binary image, assigning the pixel values the other way round is also possible.) The primitive binary images are then analysed by the binary image analyser, whose primary function is to characterise the geometric features of the 1-valued pixels in the PBIs. A two-dimensional table, the image content table (ICT), which characterises the frequencies of the co-occurrence of the palette features and the geometric features is then generated.
These concepts, the palette feature vector, the classifier, the primitive binary images, the binary image analyser and the image content table generator will now be described in detail.
Palette Feature Vectors
The palette features are contained in a pre-determined table consisting of N entries, each pointing to an image feature vector. There are various properties of the input image which may be considered as image features: these include (1) colour, and (2) surface texture properties. Other features are possible. Schematically, the palette feature table for the case where colour tones are used as image features is shown in Fig. 2. As can be seen, each palette feature is characterised by a 3-dimensional vector in an appropriate colour space, and for illustration purposes, an RGB colour space is used here. It is to be noted that the use of other colour spaces is also possible. Instead of, or in addition to, the use of colour tone as a palette feature, we can also use palette features that characterise the surface texture or pattern or other properties of the image. In this case, there are also many methods for generating the characteristic palette feature vector. For example, one particular type of method (see T. Randen and J. H. Husøy, "Filtering for texture classification: A comparative study", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 21, no. 4, pp. 291-310, April 1999) involves the use of image filtering, where a bank of filters is applied to the image and the texture property at each pixel location is characterised by a texture feature vector formed from the output of the filter bank. Schematically, the palette table for the case when texture properties are used as image features is shown in Fig. 3, where it is assumed the texture property of a pixel location is characterised by an m-dimensional texture feature vector, obtained for example by a filtering-based approach. The texture features can be used to describe the texture surface of grey-scale as well as colour images (see A. Jain and G. Healey, "A multiscale representation including opponent color features for texture recognition", IEEE Transactions on Image Processing, vol. 7, no. 1, pp. 124-128, January 1998).
Generally, there will be only one palette feature table in the system, either based on colour or based on texture. A combination of the two palettes is also possible. In any case, the N feature vectors should be chosen as statistically representative vectors in the chosen feature space. There are many well-established methods that can be used, almost all based on some form of vector quantization [Reference 4], using as many example images as possible to generate training samples. The process of obtaining the palette feature vectors needs only be performed once, and after the feature vectors are generated they are fixed and stored for use by the classifier. (Generating the palette feature table can easily be done by those skilled in the art, e.g. Gervautz, M. & Purgathofer, W., "A Simple Method for Color Quantization: Octree Quantization", Graphics Gems, Academic Press, 1990, and A. Gersho, R. M. Gray, Vector quantization and signal compression, Kluwer Academic Publishers, Boston, 1992.)
Classifier
There are well-established methods of building the classifier. As an illustration, a minimum distance classifier based on the Euclidean distance measure is described here. The schematic of the classifier is illustrated in Fig. 4. Each pixel, P, in the input image is described by its pixel feature vector (PFV). In the case where colours are used as palette features, this PFV is simply the 3-dimensional colour vector (e.g., R, G, and B). Let X denote the PFV, and X = (rx, gx, bx). In the case where surface texture features are used as palette features, this PFV is the m-dimensional texture characteristic feature vector. Again, using X to denote the PFV, X = (Tx(1), Tx(2), ..., Tx(m)). Each pixel's PFV in the original input image is compared with the N palette feature vectors in the Euclidean space. The classifier outputs the index of the palette feature vector that is the closest to the PFV, X.
Formally, the classification process illustrated in Fig. 4 can be described as follows:
A) For each pixel P in the input image, compute the PFV, X
B) Calculate the Euclidean distances between X and the N palette feature vectors, PFi (i = 1, 2, ..., N), according to: d(x, i) = ||X - PFi||, for all i = 1, 2, ..., N
In the case where colours are used as palette features, the distances may be calculated as d(x, i) = √((rx - ri)² + (gx - gi)² + (bx - bi)²), for all i = 1, 2, ..., N, where (ri, gi, bi) is the i-th colour palette feature vector in the palette feature table.
Similarly, in the case where texture features are used as the palette features, the distances are calculated as d(x, i) = √(Σ (j = 1 to m) (Tx(j) - Ti(j))²), for all i = 1, 2, ..., N,
where (Ti(1), Ti(2), ..., Ti(m)) is the i-th texture palette feature vector in the palette feature table.
C) Find the minimum distance, and output its index as the pixel feature index PFI, i.e.
PFI(P) = k, if d(x, k) < d(x, i), for all i = 1, 2, ..., N and i ≠ k
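As an illustrative sketch only, the minimum-distance classification above, together with the construction of the primitive binary images described in the next section, might look like the following in Python (the patent prescribes no particular language; all function names are illustrative, and palette indices here are 0-based rather than 1-based):

```python
def classify_pixel(pfv, palette):
    # Index of the palette feature vector closest to pfv.  Squared
    # Euclidean distance is used: omitting the square root never
    # changes which index is the minimum.
    d2 = [sum((a - b) ** 2 for a, b in zip(pfv, pf)) for pf in palette]
    return d2.index(min(d2))

def primitive_binary_images(image, palette):
    """image: a list of rows of pixel feature vectors (e.g. RGB tuples).
    Returns N binary images, one per palette feature; image i holds 1
    at (x, y) iff the pixel at (x, y) was assigned to palette feature i."""
    rows, cols = len(image), len(image[0])
    pfi = [[classify_pixel(p, palette) for p in row] for row in image]
    return [[[1 if pfi[x][y] == i else 0 for y in range(cols)]
             for x in range(rows)] for i in range(len(palette))]

# Two-colour palette and a 2 x 2 image: dark pixels map to feature 0,
# light pixels to feature 1.
palette = [(0, 0, 0), (255, 255, 255)]
image = [[(10, 10, 10), (250, 250, 250)], [(0, 0, 0), (240, 240, 240)]]
pbis = primitive_binary_images(image, palette)
# pbis[0] == [[1, 0], [1, 0]] and pbis[1] == [[0, 1], [0, 1]]
```

Any metric could be substituted for the Euclidean distance in `classify_pixel` without changing the rest of the pipeline.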
Creating Primitive Binary Images
After classification, each pixel in the input image will have been assigned an index, the pixel feature index (PFI), ranging from 1 to N, indicating the assignment of the pixel to one of the N palette features. From the association of the pixels and the palette features, N binary images, each of the same dimensions as the input image, are constructed. Let P(x, y), x = 1, 2, ..., L; y = 1, 2, ..., M; be the pixels of an L x M image (grey scale or colour) at location (x, y). Let PBIi(x, y), i = 1, 2, ..., N; x = 1, 2, ..., L; y = 1, 2, ..., M; be the pixels of the i-th primitive binary image at location (x, y). Since after classification each P(x, y) will have been assigned a label associating it with a palette feature, it is a simple matter to assign the pixel values in the primitive binary images. Formally we have
PBIi(x, y) = 1 if PFI(P(x, y)) = i, and PBIi(x, y) = 0 otherwise.
That is, a pixel in the i-th primitive binary image will have a value of 1 if the pixel in the corresponding location in the original input image is associated with the i-th palette feature in the palette feature table, and a value of 0 if the pixel in the corresponding location in the original input image is assigned to any other palette feature.
Binary Image Analyser
After representing the image in a series of primitive binary images, we analyse the binary images to extract geometric measurements of the connected 1-valued pixels. There are many possibilities in measuring the geometric structure of connected components in a binary image; three possible measurements are illustrated.
Horizontal and Vertical Line Measurement
In this method, we measure the length of connected 1-valued pixels within the PBIs along the horizontal and vertical directions. A set of line length values, L1, L2, ..., LK, is pre-set. The image is first scanned along the horizontal direction to form a 1 x K array, GHL = {GHL(1), GHL(2), ..., GHL(K)}, whose elements are the numbers of connected horizontal lines of lengths L1, L2, ..., LK. Then the image is scanned along the vertical direction to form another 1 x K array, GVL = {GVL(1), GVL(2), ..., GVL(K)}, whose elements are the numbers of connected vertical lines of lengths L1, L2, ..., LK. Note that the number of line length values, and indeed the values of the line lengths for the horizontal and vertical directions, do not have to be the same. To illustrate how GHL and GVL are formed we use the binary image shown in Fig. 5. For illustration convenience, we set K = 4 and L1 = 1, L2 = 2, L3 = 3, L4 = 4. (Other values of K and Li, i = 1, 2, ..., K, are possible.) Clearly we have GHL = {9, 1, 2, 3} and GVL = {16, 3, 0, 4}. Notice that if a connected line is of length L and L > LK, then it is counted as INT(L/LK) lines of length LK, and one line of length (L - LK*INT(L/LK)), where INT(x) denotes the integer part of the value x.
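A minimal sketch of this measurement, assuming (as in the illustration above) that the pre-set line lengths are the consecutive values Li = i, is given below. The function names are illustrative, not part of the patent, and a run longer than LK is split exactly as the text describes:

```python
def run_lengths(bits):
    """Lengths of the maximal runs of 1-valued pixels in a 1-D sequence."""
    runs, n = [], 0
    for b in bits:
        if b:
            n += 1
        elif n:
            runs.append(n)
            n = 0
    if n:
        runs.append(n)
    return runs

def line_measurement(pbi, K):
    """GHL and GVL for one primitive binary image, assuming line
    lengths L_i = i for i = 1..K.  A run of length L > K counts as
    INT(L/K) lines of length K plus, when the remainder is non-zero,
    one line of that remaining length."""
    def bucket(runs):
        g = [0] * K
        for L in runs:
            g[K - 1] += L // K if L > K else 0
            r = L if L <= K else L % K
            if r:
                g[r - 1] += 1
        return g
    ghl = bucket([L for row in pbi for L in run_lengths(row)])
    gvl = bucket([L for col in zip(*pbi) for L in run_lengths(col)])
    return ghl, gvl

# A 3 x 4 primitive binary image: horizontal runs of lengths 2, 1, 4;
# vertical runs of lengths 2, 2, 1, 2.
pbi = [[1, 1, 0, 1],
       [1, 1, 1, 1],
       [0, 0, 0, 0]]
# line_measurement(pbi, 4) == ([1, 1, 0, 1], [1, 3, 0, 0])
```

For non-consecutive line length values the bucketing rule would need adjusting to match whatever convention the implementer chooses.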
Connected Component Measurement
In this method, connected 1-valued pixels are given the same label. Two pixels can be 8-connected or 4-connected, as illustrated in Figs. 6a and 6b respectively. In Fig. 6a, pixel 1 and pixels 2, 3, 4, 5, 6, 7, 8, and 9 are 8-connected, and in Fig. 6b, pixel 1 and pixels 2, 3, 4, 5 are 4-connected. For illustration purposes, 8-connected pixels are considered here.
There are well established algorithms to label connected pixels [Reference 5] . As an illustration, we use the binary image in Fig. 7 as an example in which all pixels in a group of connected pixels have been given the same label.
All pixels with the same label form a connected component. We measure the geometric properties of the connected components. There are many parameters of a connected component which can be used to characterise its geometric property. As an illustration, we could use the size of a connected component as a measure of its geometric property. Other measures such as the shape, the second order moments, the centroid etc. can be obtained easily and used instead (M. Sonka, V. Hlavac and R. Boyle, Image Processing, Analysis and Machine Vision, Chapman & Hall, London, 1994).
First, we divide the size of connected components into K categories, each denoting a range of sizes. Let S = {S1, S2, ..., SK} be the size category vector, where Si denotes a size of connected component. Then, we form a 1 x K array, GS = {GS(1), GS(2), ..., GS(K)}, whose elements are the numbers of connected components whose sizes fall within a certain category. GS(1) is the number of components whose size is less than or equal to S1. GS(i), for i = 2, 3, ..., K-1, is the number of components whose size is greater than Si-1 and less than or equal to Si. GS(K) is the number of components whose size is greater than SK.
As an illustration, we again use Fig. 7 as an example. If we set S = {S1, S2, ..., SK} = {1, 2, 3, 4, 5}, it is easy to see that GS = {GS(1), GS(2), ..., GS(K)} = {2, 1, 0, 2, 2}. If we set S = {S1, S2, ..., SK} = {2, 4, 6, 8}, it is easy to see that GS = {GS(1), GS(2), ..., GS(K)} = {3, 2, 0, 1, 1}.
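The labelling and size histogram can be sketched as follows. This is an assumption-laden illustration, not the patent's prescribed implementation: breadth-first labelling is one of many well-established algorithms, the function names are invented, and the size bins are taken here as (≤ S1], (S1, S2], ..., plus one explicit overflow bin (> SK) — one plausible reading of the bin-edge convention above:

```python
from collections import deque

def component_sizes(pbi):
    """Sizes of the 8-connected components of 1-valued pixels,
    found by breadth-first labelling."""
    rows, cols = len(pbi), len(pbi[0])
    seen = [[False] * cols for _ in range(rows)]
    sizes = []
    for x in range(rows):
        for y in range(cols):
            if pbi[x][y] and not seen[x][y]:
                seen[x][y] = True
                queue, size = deque([(x, y)]), 0
                while queue:
                    i, j = queue.popleft()
                    size += 1
                    for di in (-1, 0, 1):        # all 8 neighbours
                        for dj in (-1, 0, 1):
                            a, b = i + di, j + dj
                            if (0 <= a < rows and 0 <= b < cols
                                    and pbi[a][b] and not seen[a][b]):
                                seen[a][b] = True
                                queue.append((a, b))
                sizes.append(size)
    return sizes

def size_measurement(pbi, S):
    """Histogram of component sizes over the bins
    (<= S[0]], (S[0], S[1]], ..., with an overflow bin (> S[-1])."""
    g = [0] * (len(S) + 1)
    for size in component_sizes(pbi):
        for i, s in enumerate(S):
            if size <= s:
                g[i] += 1
                break
        else:
            g[-1] += 1
    return g

# Three components: two of size 3 and one isolated pixel.
pbi = [[1, 1, 0, 0],
       [0, 1, 0, 1],
       [0, 0, 0, 1],
       [1, 0, 0, 1]]
# size_measurement(pbi, [1, 2, 4]) == [1, 0, 2, 0]
```

Restricting the neighbour offsets to (-1, 0), (1, 0), (0, -1), (0, 1) would give the 4-connected variant instead.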
Projection Measurements
Another method of analysing the primitive binary images is based on the projection of the 1-valued pixels along the horizontal and vertical directions. Assuming the original input image consists of M (row) x N (column) pixels, we divide the binary image into horizontal and vertical patches, each consisting of an equal number of rows or columns of pixels, as illustrated in Fig. 8. Assuming we are dividing the image into V (1 < V < N) vertical and H (1 < H < M) horizontal patches, then each vertical patch will consist of INT(N/V) columns of pixels, and each horizontal patch will consist of INT(M/H) rows of pixels. Notice that INT(x) denotes the integer part of x. In the cases where M ≠ iH and N ≠ jV, where i and j are integers, the image is expanded by duplicating the last row and last column of pixels as many times as necessary to satisfy the conditions M = iH and N = jV. (This technique can easily be performed by those skilled in the art.)
To describe the geometric property of the primitive binary image, two 1-dimensional arrays are constructed: a 1 x V vertical projection array, GPV = {GPV(1), GPV(2), ..., GPV(V)}, whose elements are the numbers of 1-valued pixels in each of the V vertical patches; and a 1 x H horizontal projection array, GPH = {GPH(1), GPH(2), ..., GPH(H)}, whose elements are the numbers of 1-valued pixels in each of the H horizontal patches. Let PBI(x, y), x = 1, 2, ..., M, y = 1, 2, ..., N, be a binary image divided into V vertical and H horizontal patches; then the elements of the vertical and horizontal projection arrays can be calculated as follows:
GPV(i) = Σ(x = 1 to M) Σ(y = 1 + (i-1)*INT(N/V) to i*INT(N/V)) PBI(x, y),  for i = 1, 2, ..., V

GPH(j) = Σ(y = 1 to N) Σ(x = 1 + (j-1)*INT(M/H) to j*INT(M/H)) PBI(x, y),  for j = 1, 2, ..., H
Fig. 9 shows an example of calculating the vertical and horizontal projections. The original image is 12 rows x 13 columns (pixels). We want to divide it into 6 horizontal patches and 7 vertical patches. Since 13/7 = 1.86 is not an integer, we duplicate the last column of pixels to make the image 12 rows x 14 columns (pixels). Therefore, each vertical patch consists of two columns of pixels, and each horizontal patch consists of two rows of pixels. It can easily be worked out that GPV = {GPV(1), GPV(2), ..., GPV(7)} = {8, 4, 4, 7, 2, 3, 18}, and GPH = {GPH(1), GPH(2), ..., GPH(6)} = {13, 6, 11, 2, 3, 2}.
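For illustration, the projection measurement, including the row/column duplication step, can be sketched in Python as follows. Since the 12 x 13 image of Fig. 9 is not reproduced here, the usage example below applies the same procedure to a small hypothetical binary image.

```python
def projections(img, V, H):
    """Compute the vertical (GPV) and horizontal (GPH) projection arrays of a
    binary image for V vertical and H horizontal patches, first duplicating
    the last column/row until the dimensions divide evenly."""
    img = [row[:] for row in img]          # work on a copy
    while len(img[0]) % V:                 # expand columns until N = jV
        for row in img:
            row.append(row[-1])
    while len(img) % H:                    # expand rows until M = iH
        img.append(img[-1][:])
    M, N = len(img), len(img[0])
    cw, rh = N // V, M // H                # columns per vertical patch, rows per horizontal patch
    GPV = [sum(img[x][y] for x in range(M) for y in range(i * cw, (i + 1) * cw))
           for i in range(V)]
    GPH = [sum(img[x][y] for y in range(N) for x in range(j * rh, (j + 1) * rh))
           for j in range(H)]
    return GPV, GPH
```

For a hypothetical 3 x 3 image [[1,0,1],[0,1,0],[1,1,1]] with V = 2 and H = 3, the last column is duplicated once (3 is not divisible by 2), giving GPV = {4, 4} and GPH = {3, 1, 4}; note that both arrays sum to the same total, the number of 1-valued pixels in the expanded image.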
Image Content Table Generator
After obtaining the geometric property measurements on each of the N primitive binary images, an image content table (ICT) is generated. The ICT is a two-dimensional array formed by arranging the 1-dimensional geometric measurement arrays produced by the image analyser. For different implementations (embodiments) of the image analyser the table will be different, and the construction of the ICT for each of the three measurements described above is illustrated in Figures 10 to 12 respectively.
ICT based on Horizontal and Vertical Line Measurement (Fig. 10)
Let GHL(i) = {GHL(i, 1), GHL(i, 2), ..., GHL(i, K)} and GVL(i) = {GVL(i, 1), GVL(i, 2), ..., GVL(i, K)} be the horizontal and vertical line measurement arrays for the i-th primitive binary image, where i = 1, 2, ..., N, generated by the image analyser based on the method of horizontal and vertical line measurement. The image content table is formed by stacking GHL(i) and GVL(i) to form the i-th column of the 2-dimensional ICT as shown in Fig. 10.

ICT based on Connected Component Measurement (Fig. 11)
Let Gs(i) = {Gs(i, 1), Gs(i, 2) , ..., Gs(i, K)} be the geometric property measurement array generated by the image analyser based on the method of Connected Component Measurement for the i-th primitive binary image. The image content table, ICT, is formed by arranging Gs(i) into the i-th column of the ICT as shown in Fig. 11.
ICT based on Projection Measurement (Fig. 12)
Let GPV(i) = {GPV(i, 1), GPV(i, 2), ..., GPV(i, V)} and GPH(i) = {GPH(i, 1), GPH(i, 2), ..., GPH(i, H)} be the vertical and horizontal projection arrays generated by the image analyser based on the projection measurement method for the i-th primitive binary image. The image content table is formed by stacking GPV(i) and GPH(i) to form the i-th column of the 2-dimensional ICT as shown in Fig. 12.
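The construction common to Figs. 10 to 12, arranging the per-primitive 1-dimensional measurement arrays as the columns of a 2-dimensional table, can be sketched as follows. For the line-measurement variant of Fig. 10, each input array would be the concatenation of GHL(i) and GVL(i); for the projection variant of Fig. 12, the concatenation of GPV(i) and GPH(i).

```python
def build_ict(measurement_arrays):
    """Arrange the N per-primitive 1-D measurement arrays (each of length K)
    as the columns of a K x N image content table:
    ICT[k][i] is element k of the i-th primitive image's array."""
    K = len(measurement_arrays[0])
    assert all(len(a) == K for a in measurement_arrays), "arrays must have equal length"
    return [[a[k] for a in measurement_arrays] for k in range(K)]
```

For example, three primitive images with measurement arrays [1, 2], [3, 4] and [5, 6] give the 2 x 3 table [[1, 3, 5], [2, 4, 6]].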
Image Content Table for Image Database Applications

A principal application of the image content table is in image database management. It can be used in content-based image indexing and retrieval. The scenario is illustrated in Fig. 13.
Construction of Image Database based on Image Content Table
Assume that we have a large collection of Z digital images. For each image, an image content table (ICT) is constructed and stored on disk as part of the image database.
Image Database Query based on Pictorial Example
Such a database constructed using our ICT can be used to implement content-based image query, or query by pictorial example. Suppose that a user has an image and wants to find from the database images that are similar to this query image. An image content table, ICTQ, is first constructed for the query image. Then this ICTQ is compared with the ICTs of the database images. A similarity scorer calculates a metric measurement between the query image's ICT and the ICTs of the database images. One possible implementation of the similarity scorer is as follows.
Let ICTQ = {X(i, j): i = 1, 2, ..., K, j = 1, 2, ..., N} be the image content table of the query image, and ICTD = {Y(i, j): i = 1, 2, ..., K, j = 1, 2, ..., N} be the image content table of a database image. The similarity score between the two images can be calculated as follows.
Normalise the ICTs as follows:
NX(i, j) = X(i, j) / (Σi Σj X(i, j)),    NY(i, j) = Y(i, j) / (Σi Σj Y(i, j))
Calculate the score S as follows:
S = Σi Σj |NX(i, j) - NY(i, j)|
In this way, each image in the database can be given a similarity score. All the scores may be sorted in increasing order (the smaller the score, the more similar the images) and the sorted list returned to the user. From this list, the user can identify the images in the database that are most similar to the query.
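A sketch of one possible similarity scorer and query loop is given below. It assumes the city-block (L1) distance between normalised tables as the metric; this is one plausible choice consistent with "the smaller the score, the more similar", and other metrics could be substituted.

```python
def normalise(ict):
    """Divide every element of the table by the table's total,
    so tables of differently sized images become comparable.
    Assumes the table has a non-zero total."""
    total = sum(sum(row) for row in ict)
    return [[v / total for v in row] for row in ict]

def similarity_score(ict_q, ict_d):
    """City-block (L1) distance between two normalised K x N tables;
    0 means identical, larger means less similar."""
    nq, nd = normalise(ict_q), normalise(ict_d)
    return sum(abs(a - b) for rq, rd in zip(nq, nd) for a, b in zip(rq, rd))

def query(ict_q, database):
    """Score every (image_id, ict) pair in the database against the query
    table and return (score, image_id) pairs, most similar first."""
    return sorted((similarity_score(ict_q, ict), img_id) for img_id, ict in database)
```

For two identical tables the score is 0; for the maximally different normalised tables [[2,0],[0,2]] and [[0,2],[2,0]] the score is 2, the largest value the L1 distance between two normalised tables can take.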
Application of Image Content Table to General Object Recognition
The image content table has more general applications in machine vision based automatic object recognition. In this case, an object is first imaged using a sensor, e.g. a CCD camera. The image of the object is deconstructed, e.g. using a pre-designed colour palette table, into a set of primitive binary images, and an image content table is constructed to represent the image (object). Using this representation, general automatic object recognition systems can be constructed.
Fig. 14 illustrates how the ICT can be used in constructing object class prototype/model and recognition models. Each sample object of known class identity is imaged and represented using an ICT, which can be regarded as a feature vector (pattern) . Then well known pattern recognition techniques can be used to construct object class prototypes/models using an appropriate recognition strategy.
Fig. 15 illustrates the use of the object class prototype/model ICT to recognise unknown objects. An unknown object is first imaged and then the image of the unknown object is represented in image content table form and presented to the same pattern recognition algorithms which will use the object prototype/model information to identify the class identity of the unknown object automatically.
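One simple way to realise the prototype construction of Fig. 14 and the recognition step of Fig. 15 is a nearest-prototype classifier: each class prototype is the element-wise mean of that class's training ICTs, and an unknown object's ICT is assigned to the class whose prototype it is closest to under some distance function. This is only one of the well-known pattern recognition techniques alluded to above, sketched here for illustration.

```python
from collections import defaultdict

def train_prototypes(labelled_icts):
    """Build a class prototype as the element-wise mean of the training
    ICTs of each class. Input: iterable of (class_label, K x N table)."""
    groups = defaultdict(list)
    for label, ict in labelled_icts:
        groups[label].append(ict)
    protos = {}
    for label, icts in groups.items():
        K, N = len(icts[0]), len(icts[0][0])
        protos[label] = [[sum(t[k][n] for t in icts) / len(icts) for n in range(N)]
                         for k in range(K)]
    return protos

def classify(ict, protos, score_fn):
    """Assign the unknown ICT to the class whose prototype gives the
    lowest score (distance) under the supplied score function."""
    return min(protos, key=lambda label: score_fn(ict, protos[label]))
```

Here `score_fn` can be any table distance, for instance the similarity scorer described earlier for database query.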
Other Imaging Applications
Other applications can also be envisaged involving the use of a representation that uses a feature palette to decompose the input image into primitive binary images, measures the geometric properties of the primitive binary images, and tabulates the frequencies of the co-occurrence of the geometric measures and the palette features.

Claims

1. An image analysis system in which an input image is deconstructed into a plurality of primitive images each corresponding to a palette feature of the input image and in which the geometric properties of the respective binary images are measured and populations of different geometrical measures of the respective palette features are tabulated.
2. An image indexing, classification or categorisation system in which an algorithm is applied to the image to deconstruct it into a plurality of primitive images each reflecting a palette feature of the original image and then to tabulate the populations of distinct geometrical features of each such primitive image.
3. An image retrieval system for retrieving images similar to a query image from an image database in which an algorithm is applied to the query image to deconstruct the query image into a plurality of primitive query images each reflecting a palette feature of the original query image and then tabulate the populations of distinct geometrical features of each such primitive query image to form a query table, and then comparing the query table with image tables created by applying said algorithm to images held in said database.
4. An image database which contains image tables created by applying an algorithm to each of a set of images to deconstruct each of those images into a plurality of primitive images each reflecting a palette feature of the original image and then tabulating the populations of distinct geometrical features of each such primitive image to form a lookup table.
5. A workstation programmed with an algorithm adapted to be applied to an image to deconstruct that image into a plurality of primitive images each reflecting a palette feature of the original image and then to tabulate the populations of distinct geometrical features of each such primitive image to form a digital table.
6. A workstation according to claim 5 programmed to apply the algorithm to a query image to form a query table and then to compare the query table with image tables created by applying said algorithm to images held in said database, in order to retrieve images similar to the query image.
7. A data bearing medium carrying an algorithm adapted to operate on a digitally stored image to deconstruct that image into a plurality of primitive images each reflecting a palette feature of the original image and then to tabulate the populations of distinct geometrical features of each such primitive image to form a digital table.
8. A method of retrieving images similar to a query image from an image database in which an algorithm is applied to the query image to deconstruct the query image into a plurality of primitive query images each reflecting a palette feature of the original query image and then tabulate the populations of distinct geometrical features of each such primitive query image to form a query table, and then comparing the query table with image tables created by applying said algorithm to images held in said database.
9. A method of indexing images in which an algorithm is applied to the image to deconstruct it into a plurality of primitive images each reflecting a palette feature of the original image and then to tabulate the populations of distinct geometrical features of each such primitive image.
10. An object recognition system in which an image is formed of the object to be recognised, an algorithm is applied to the object image to deconstruct the object image into a plurality of primitive object images each reflecting a palette feature of the original object image and then tabulate the populations of distinct geometrical features of each such primitive object image to form a query table, and then comparing that query table with image tables created by applying said algorithm to images held in an image database.
11. A method of creating an image content table which comprises: defining palette features of an image and deconstructing the image into a plurality of primitive images each reflecting a said palette feature; analysing geometrical features of each such primitive image; and tabulating populations of different geometrical features of each such primitive image.
12. An image content table which comprises a tabulated list of populations of different geometrical features of each of a plurality of primitive images.
13. The invention of any preceding claim in which at least one said palette feature represents a collocation of a plurality of visually distinguishable tonal values of the respective image.
14. A system, database, workstation, medium or method according to any foregoing claim wherein the primitive images are binary images having a plurality of pixels each having one of two values associated with it.
15. A system, database, workstation, medium or method according to claim 14 wherein each pixel in the binary image is associated with a respective part of the deconstructed image from which it is formed.
16. A system, database, workstation, medium or method according to claim 15 wherein each said part of the deconstructed image comprises a pixel of the deconstructed image.
17. A system, database, workstation, medium or method according to claim 15 or claim 16 wherein the primitive images each indicate which of said parts have the respective palette feature and which do not.
18. A system, database, workstation, medium or method according to any foregoing claim wherein there are defined a plurality of classifications of at least one image feature of an image, and each palette feature corresponds to a respective one of the classifications.
19. A system, database, workstation, medium or method according to claim 18 wherein each palette feature has a palette feature vector associated with it and each pixel of the image is classified by defining a pixel feature vector for the pixel and determining which of the palette feature vectors it is closest to.
20. A system, database, workstation, medium or method according to claim 18 or claim 19 wherein the at least one image feature includes surface texture.
21. A system, database, workstation, medium or method according to any of claims 18 to 20 wherein the at least one image feature includes colour.
22. A system, database, workstation, medium or method according to any of claims 18 to 21 wherein each image feature has a plurality of possible categories, and each defined palette feature corresponds to a plurality of said categories.
23. A system, database, workstation, medium or method according to any foregoing claim wherein geometric features of each primitive image are recorded in an array, and the arrays for the primitive images are combined to form a table.
24. A system, database, workstation, medium or method according to claim 23 wherein the arrays are one-dimensional.
25. A system, database, workstation, medium or method according to any foregoing claim wherein the geometric properties of the primitive images include geometric properties of connected elements within the primitive images.
PCT/GB2002/000675 2001-02-17 2002-02-15 Image analysis for image database indexing Ceased WO2002067143A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GBGB0103965.0A GB0103965D0 (en) 2001-02-17 2001-02-17 Image and image content processing,representation and analysis for image matching,indexing or retrieval and database management
GB0103965.0 2001-02-17

Publications (1)

Publication Number Publication Date
WO2002067143A1 true WO2002067143A1 (en) 2002-08-29

Family

ID=9908976

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB2002/000675 Ceased WO2002067143A1 (en) 2001-02-17 2002-02-15 Image analysis for image database indexing

Country Status (2)

Country Link
GB (1) GB0103965D0 (en)
WO (1) WO2002067143A1 (en)


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0866409A1 (en) * 1997-03-19 1998-09-23 Canon Kabushiki Kaisha Image retrieval apparatus and method
EP1045313A2 (en) * 1999-04-14 2000-10-18 Eastman Kodak Company Perceptually significant feature-based image archival and retrieval
GB2349460A (en) * 1999-04-29 2000-11-01 Mitsubishi Electric Inf Tech Representing and searching for colour images

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
GONG Y: "ADVANCING CONTENT-BASED IMAGE RETRIEVAL BY EXPLOITING IMAGE COLOR AND REGION FEATURES", MULTIMEDIA SYSTEMS, ASSOCIATION FOR COMPUTING MACHINERY, NEW YORK, US, vol. 7, no. 6, pages 449 - 457, XP000912566, ISSN: 0942-4962 *
KANKANHALLI M S ET AL: "Color and spatial feature for content-based image retrieval", PATTERN RECOGNITION LETTERS, NORTH-HOLLAND PUBL. AMSTERDAM, NL, vol. 20, no. 1, January 1999 (1999-01-01), pages 109 - 118, XP004154223, ISSN: 0167-8655 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012033460A1 (en) * 2010-09-10 2012-03-15 Choros Cognition Ab Method for automatically classifying a two- or higher-dimensional image
US9036924B2 (en) 2010-09-10 2015-05-19 Choros Cognition Ab Method for automatically classifying a two-or higher-dimensional image
CN103164436A (en) * 2011-12-13 2013-06-19 阿里巴巴集团控股有限公司 Image search method and device
CN103164436B (en) * 2011-12-13 2017-06-16 阿里巴巴集团控股有限公司 A kind of image search method and device
CN115457167A (en) * 2022-09-21 2022-12-09 山东大学 Color Palette Design System Based on Color Sorting

Also Published As

Publication number Publication date
GB0103965D0 (en) 2001-04-04

Similar Documents

Publication Publication Date Title
Chen et al. A region-based fuzzy feature matching approach to content-based image retrieval
Schettini et al. A survey of methods for colour image indexing and retrieval in image databases
US6584221B1 (en) Method for image retrieval with multiple regions of interest
Fournier et al. Retin: A content-based image indexing and retrieval system
Smeulders et al. Content-based image retrieval at the end of the early years
Liu et al. A survey of content-based image retrieval with high-level semantics
Iqbal et al. Applying perceptual grouping to content-based image retrieval: Building images
CN100357944C (en) Image retrieving system, image classifying system, image retrieving program, image classifying program, image retrieving method and image classifying method
US8150216B2 (en) Methods and apparatus for automated true object-based image analysis and retrieval
US20030147558A1 (en) Method for image region classification using unsupervised and supervised learning
US7545980B2 (en) Method of and apparatus for classifying an image
Moghaddam et al. Defining image content with multiple regions-of-interest
CN103377376A (en) Method and system for image classification, and method and system for image retrieval
Shih et al. An intelligent content-based image retrieval system based on color, shape and spatial relations
Chiang et al. Region-based image retrieval using color-size features of watershed regions
Khotanzad et al. Color image retrieval using multispectral random field texture model and color content features
Jaswal et al. Content based image retrieval using color space approaches
White et al. ImageGREP: Fast visual pattern matching in image databases
Zhang et al. Modelling traditional Chinese paintings for content-based image classification and retrieval
WO2002067143A1 (en) Image analysis for image database indexing
Prakash et al. Combining novel features for content based image retrieval
Goswami et al. RISE: a robust image search engine
Kurniasari et al. Content-Dependent Image Search System for Aggregation of Color, Shape and Texture Features
Aparna Retrieval of digital images based on multi-feature similarity using genetic algorithm
Elhady et al. Weighted feature voting technique for content-based image retrieval

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SD SE SG SI SK SL TJ TM TN TR TT TZ UA UG US UZ VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase

Ref country code: JP

WWW Wipo information: withdrawn in national office

Country of ref document: JP