WO2001027865A1 - Cartoon recognition

Info

Publication number: WO2001027865A1
Application number: PCT/GB2000/003839
Authority: WO
Inventors: Mark Pawleski
Original assignee: British Telecommunications PLC
Current assignee: British Telecommunications PLC
Priority date: 1999-10-08
Filing date: 2000-10-05
Publication date: 2001-04-19
Anticipated expiration: 2002-04-08
Also published as: EP1224610A1; AU7545700A; CA2385714A1

Abstract

A method of analysing an image, for example from a video signal, for the presence of outlines to determine whether the image is likely to represent a cartoon. The method is based on the concept that cartoons are often characterised by the presence of dark outlines around objects. In an improved version the signal is separated into signals representing smaller areas of the image before each signal is analysed for the presence of dark outlines.

Description

Cartoon Recognition

This invention relates to a method of and apparatus for determining whether an image, for example a frame of a video signal, represents a cartoon.

With the growing availability of online data, provision of hundreds or even thousands of data channels by an information provider causes problems of content management and verification, as manual checking of every piece of data becomes infeasible. For image data, there is increasing interest in techniques for automated image interpretation and classification. Automated image interpretation and classification could help with indexing, cataloging and searching of still image or moving image databases.

Image interpretation and classification can be done either by the service provider or by the service receiver. For example, if it is possible to determine whether a signal represents a cartoon or not then it is possible for parents to stop children from downloading pictures from the Internet or from watching TV programs other than cartoons. Other types of classifiers could prove useful, for example, classification of pornographic images or recognition of particular people.

According to the present invention there is provided a method for classifying whether an image represents a cartoon, comprising the step of generating a likelihood in dependence on the presence of low luminosity outlines in the image.

According to the present invention there is also provided a data carrier loadable into a computer and carrying instructions for causing the computer to carry out said method.

In a preferred embodiment the method further comprises the step of analysing the image to provide one or more parameters wherein one parameter relates to the luminosity of the image; and the generating step determines the generated likelihood in dependence upon the value of said one parameter.

Advantageously the image comprises a plurality of pixels and the analysing step includes the sub-step of vector quantising the image so that each pixel corresponds to one of a plurality of codes. Preferably the analysing step further comprises the sub-step of calculating the percentage of pixels corresponding to a one of the plurality of codes and preferably said one of the plurality of codes is a code which corresponds to pixels of low luminosity.

In a preferred embodiment the vector quantising sub-step comprises sub-steps of dividing the image into a plurality of blocks, each block comprising a subset of pixels in the image; and independently vector quantising each block.

Advantageously the generating step comprises the sub-step of generating a block likelihood value for each of a plurality of blocks, the block likelihood representing the probability that the pixels in that block represent an image comprising one or more outlines and preferably the generating step comprises the sub-step of combining a plurality of block likelihood values to provide the likelihood value for the image.

According to another aspect of the present invention there is provided apparatus for classifying whether an image represents a cartoon said apparatus comprising generating means for generating a likelihood in dependence on the presence of low luminosity outlines in the image.

According to the present invention there is also provided a data carrier loadable into a computer and carrying instructions for enabling the computer to provide said apparatus.

In a preferred embodiment the apparatus further comprises means for analysing the image to provide one or more parameters wherein one parameter relates to the luminosity of the image; and the generating means receives in operation said one parameter and determines the generated likelihood in dependence upon the value of said one parameter.

Advantageously the analysing means comprises a vector quantiser which receives in operation a plurality of pixels comprising the image and outputs a plurality of codes each output code corresponding to each of the received pixels.

Preferably the analysing means further comprises means for calculating the percentage of pixels corresponding to a one of the plurality of codes and said one of the plurality of codes is a code which corresponds to pixels of low luminosity. In a preferred embodiment the vector quantiser further comprises means for dividing the image into a plurality of blocks, each block comprising a subset of pixels in the image; and means for independently vector quantising each block.

Advantageously the generating means comprises means for generating a block likelihood value for each of a plurality of blocks, the block likelihood value representing the probability that the pixels in that block represent an image comprising one or more outlines and the generating means further comprises means for combining a plurality of block likelihood values to provide a likelihood value for the image.

An embodiment of the invention will now be described, by way of example only, with reference to the accompanying drawings in which

Figure 1 is a schematic representation of a computer loaded with software embodying the present invention;

Figure 2 shows red, blue, green and luminance components for a cartoon;

Figure 3 shows red, blue, green and luminance components for a photograph;

Figure 4 shows red, blue, green and luminance components for a complex cartoon;

Figure 5 is a functional block diagram of the program elements that comprise the software indicated in Figure 1;

Figure 6 is a flow chart showing the method steps performed in one embodiment of the invention by the software illustrated in Figure 5;

Figure 7 is a flow chart showing the vector quantising step of the method illustrated in Figure 6;

Figure 8 is a flow chart showing the production of a low luminosity signal;

Figure 9 shows images for each level of vector quantisation for a cartoon and a photograph; and

Figure 10 is a flow chart showing the determination step of the method illustrated in Figure 6.

Figure 1 illustrates a conventional computer 101, such as a Personal Computer, generally referred to as a PC, running a conventional operating system 103, such as Windows (a Registered Trade Mark of Microsoft Corporation), and having a number of resident application programs 105 such as a word processing program, a network browser and e-mail program or a database management program. The computer 101 also includes an image classification program 109 that enables a signal representing an image to be classified according to whether the image represents a cartoon. The computer 101 is also connected to a conventional disc storage unit 111 for storing data and programs, a keyboard 113 and mouse 115 for allowing user input and a printer 117 and display unit 119 for providing output from the computer 101. The computer 101 also has access to external networks (not shown) via a network card 121.

A simple image classification program for determining whether an image represents a cartoon can be implemented which analyses the colour distribution within the image. Figure 2a shows a cartoon (represented in a grey scale in the figure). Figure 2b is a histogram showing the number of pixels with particular values for the red component. Figures 2c, 2d and 2e show similar histograms for the green, blue and luminance components. For this cartoon there are prominent spikes in each component. Figure 3 shows a similar set of histograms for an image which is not a cartoon. The distribution for each component shows no such spikes. However, for a more complicated cartoon, such as that shown in Figure 4, it is difficult to see any obvious distinction between the histograms of Figures 4b - 4e and those of Figures 3b - 3e.
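The histogram comparison described above can be illustrated with a short Python sketch. The function names and the spike measure (the fraction of pixels falling in the few most populated histogram bins) are assumptions introduced here for illustration only; the description does not specify how spikes would be measured.

```python
from collections import Counter


def channel_histogram(values):
    """Count how many pixels take each value 0..255 for one colour component."""
    counts = Counter(values)
    return [counts.get(v, 0) for v in range(256)]


def spike_fraction(hist, top=5):
    """Fraction of all pixels that fall in the `top` most populated bins.

    Hypothetical measure: a value near 1 suggests a few prominent spikes,
    a small value suggests a spread-out distribution.
    """
    total = sum(hist)
    return sum(sorted(hist, reverse=True)[:top]) / total
```

A flat-coloured image would give a spike fraction close to 1, whereas a smoothly varying image would give a much smaller value. As the description notes, a measure of this kind may not separate complex cartoons from photographs.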

As shown in Figure 6, in accordance with a method of the present invention, at step 10 an input signal representing an image, for example a frame of video data, comprising a plurality of pixels is received. At step 20 the received signal is converted into a luminosity signal, which represents a grey scale version of the image, by calculating a luminosity value (L) for each pixel. The received signal has components representing a value in the range 0 to 255 for a red component (R), a blue component (B) and a green component (G) for each of the plurality of pixels which comprise the frame of video data. The luminosity value is calculated at step 20 using the equation

L = 0.299R + 0.587G + 0.114B

It is not necessary to use a calculated luminosity value; the invention works equally well if a vector comprising, for example, the R, G and B values is used.

At step 30 the image represented by the luminosity signal is split into a plurality of block signals, each signal representing an area of the original image. In the embodiment of the invention described here, each block signal represents an area of the same size as the area represented by each other block signal (although the size may differ slightly due to quantisation effects). However, the areas represented by the block signals could equally well be different sizes from each other. At step 40 each block signal is vector quantised into a predetermined number of levels, a code (for example an integer in the range 1 to the predetermined number of levels) being used to represent each level. At step 50 the vector quantised signals are used to provide a low luminosity signal comprising the vector quantised signals which represent the darkest level for each block. Finally, at step 60 the low luminosity signal is used to determine whether the received signal represents a cartoon.

It is not necessary to split the luminosity signal into a plurality of block signals prior to vector quantisation. However, the determination at step 60 is more accurate if the received signal is split into signals representing smaller blocks of the frame.
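By way of illustration only, steps 20 and 30 might be sketched in Python as follows. The function names, the representation of the image as a list of rows of (R, G, B) tuples, and the way block edges are rounded to whole pixels are assumptions made for this sketch and are not taken from the description.

```python
def luminosity(r, g, b):
    """Luminosity of one pixel from R, G, B values in the range 0 to 255 (step 20)."""
    return 0.299 * r + 0.587 * g + 0.114 * b


def to_luminosity(image):
    """Convert rows of (R, G, B) tuples into rows of luminosity values."""
    return [[luminosity(r, g, b) for (r, g, b) in row] for row in image]


def split_into_blocks(lum, rows, cols):
    """Split a luminosity grid into rows x cols blocks (step 30).

    Block edges are rounded to whole pixels, so block sizes may differ
    slightly from one another. Each block is returned as a flat list of
    luminosity values, in row-major block order.
    """
    height, width = len(lum), len(lum[0])
    blocks = []
    for i in range(rows):
        top, bottom = i * height // rows, (i + 1) * height // rows
        for j in range(cols):
            left, right = j * width // cols, (j + 1) * width // cols
            blocks.append([lum[y][x]
                           for y in range(top, bottom)
                           for x in range(left, right)])
    return blocks
```

For a 5 x 5 image split into 2 x 2 blocks, this sketch yields blocks of 4, 6, 6 and 9 pixels, which is one way in which the slight size differences mentioned above can arise.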

The algorithm now described with reference to Figure 7 is used to determine the level represented by each code and to assign each pixel value to a code. It is similar to the well-known LBG algorithm (as described in Linde, Y., Buzo, A. and Gray, R. M., "An algorithm for vector quantizer design", IEEE Trans. Comm., vol. COM-28, Jan 1980, pp. 84-95). In Figure 7 at step 42 each pixel value is assigned to a code. Initially there is a single code used to represent each pixel value. At step 43, for the or each code, the mean and the standard deviation of the pixel values which the or each code currently represents are calculated. The mean for the or each code is then associated with that code. At step 44, the code which represents pixel values having the greatest standard deviation is determined. A new value to be associated with that code is then calculated at step 45 as the mean for that code minus half the standard deviation for that code. If the new value is calculated to be less than zero then the new value is set to zero. At step 46 a new value to be associated with a new code is calculated as the mean plus half said standard deviation.

At step 47 one of the plurality of codes is assigned to each pixel value. A code is used to represent a pixel value if the value which that code is associated with is 'closer' to the pixel value than any of the other codes. In this embodiment of the invention, a luminosity value is used, so it is a simple matter to measure the distance between the luminosity value and the value associated with a code, by calculating the difference between the two values. In embodiments which use a vector of values to represent each pixel, such as R, G, and B values, a distance may be calculated using, for example, the 'city-block' distance or the least squares distance.
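The two distance measures mentioned for vector-valued pixels might be sketched in Python as below. The function names are hypothetical, and the 'least squares distance' is interpreted here as the sum of squared component differences, which is an assumption.

```python
def city_block(p, q):
    """City-block distance: sum of absolute component differences."""
    return sum(abs(a - b) for a, b in zip(p, q))


def squared_euclidean(p, q):
    """Sum of squared component differences (assumed meaning of 'least squares')."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest_code(pixel, code_vectors, distance=city_block):
    """Index of the code whose associated vector is closest to the pixel (step 47)."""
    return min(range(len(code_vectors)),
               key=lambda k: distance(pixel, code_vectors[k]))
```

For a scalar luminosity value the same selection reduces to comparing absolute differences, as described above.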

At step 48 a check is performed as to whether the predetermined number of codes, corresponding to the predetermined number of levels, has been created. If not, steps 43 to 47 are repeated. In the embodiment of the invention described here four codes are created for each block, although the number of codes (and hence the predetermined number of levels) does not need to be the same for each block. In embodiments of the invention using a vector of values for each pixel the vector quantising step operates in an analogous manner to that described above.
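Steps 42 to 48 of Figure 7, for scalar luminosity values, might be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the implementation of the embodiment: the function name, the use of the population standard deviation, zero-based code numbering, and the handling of a code that currently represents no pixels are all choices made here for illustration.

```python
from statistics import mean, pstdev


def quantise_block(values, levels=4):
    """Quantise one block of luminosity values into `levels` codes.

    Returns (code_values, codes): code_values[k] is the luminosity
    associated with code k, and codes[i] is the code assigned to values[i].
    """
    code_values = [mean(values)]      # step 42: a single code for every pixel
    codes = [0] * len(values)
    while len(code_values) < levels:  # step 48: stop once enough codes exist
        # Step 43: mean and standard deviation of the pixels behind each code.
        spreads = []
        for k in range(len(code_values)):
            members = [v for v, c in zip(values, codes) if c == k]
            if members:
                code_values[k] = mean(members)
                spreads.append(pstdev(members))
            else:
                spreads.append(0.0)   # assumption: an unused code has no spread
        # Step 44: the code whose pixels have the greatest standard deviation.
        widest = max(range(len(spreads)), key=spreads.__getitem__)
        m, s = code_values[widest], spreads[widest]
        code_values[widest] = max(0.0, m - s / 2)  # step 45, clamped at zero
        code_values.append(m + s / 2)              # step 46: the new code
        # Step 47: assign each pixel to the code with the closest value.
        codes = [min(range(len(code_values)),
                     key=lambda k: abs(v - code_values[k]))
                 for v in values]
    return code_values, codes
```

For a block holding eight pixels of luminosity 10 and eight of luminosity 200, a call with `levels=2` splits the single initial code (mean 105, standard deviation 95) into values 57.5 and 152.5, and assigns the dark pixels to the first code and the light pixels to the second.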

In Figure 8 at step 51 each block is taken in turn. At step 52 the luminance value for each pixel is set to be equal to the value associated with the code which is used to represent that pixel. At step 53 a signal is generated with the luminance value for each pixel set to white for each pixel which is not represented by the code associated with the lowest luminosity value for that block. In the embodiment of the invention described here a similar signal is generated (for display purposes) for each one of the codes, in order to generate images for each vector quantisation level. The image generated from the signals for the lowest luminosity value will be referred to as a level 0 image, the image generated from the signals for the next highest luminosity value will be referred to as a level 1 image, etc.
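The generation of a level image for one block (steps 52 and 53) might be sketched in Python as below. The function name, the use of 255 for white, and the convention that the inputs are a list of code values and a list of per-pixel codes are assumptions made for this sketch.

```python
WHITE = 255  # assumed luminance value used for white


def level_image(code_values, codes, level):
    """Pixel values for one block showing only one quantisation level.

    Level 0 is the code with the lowest associated luminosity, level 1 the
    next highest, and so on. Pixels represented by any other code are set
    to white (step 53); the remaining pixels take the value associated
    with their code (step 52).
    """
    order = sorted(range(len(code_values)), key=code_values.__getitem__)
    wanted = order[level]
    return [code_values[c] if c == wanted else WHITE for c in codes]
```

Calling the function with `level=0` for every block and reassembling the blocks would give the low luminosity signal from which the level 0 image is formed.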

Figure 9 shows cartoon image 71 and photographic image 81, together with level 0 images 72 and 82, level 1 images 73 and 83, level 2 images 74 and 84, and level 3 images 75 and 85. It can be seen that the level 0 image 72 generated from the lowest luminosity signal for each block for the cartoon image 71 differs from the corresponding level 0 image 82 generated from a signal representing the photographic image 81. The level 0 image 72 clearly comprises a plurality of outlines whereas the level 0 image 82 does not. This is because even complex cartoons have outlines delineating the areas of one colour from the areas of another colour, even when the areas of colour are carefully shaded. Photographic images do not have such outlines. Small areas of the level 0 image 82 may be mistakenly judged to contain outlines; however, the majority of the image 82 does not contain outlines.

At step 61 of Figure 10 the signal representing a frame of video data is separated into block signals representing smaller areas of the frame. These need not be the same size areas as were produced at step 30 of Figure 6. Again, each area represented by a signal need not necessarily be the same size as each other area represented by a signal. At step 62 the number of dark pixels for each block is determined. Then at step 63 the number of blocks which are likely to contain outlines is determined by testing whether the percentage of dark pixels in a block is less than a predetermined dark-threshold. At step 64 a test is performed as to whether the number of blocks which are likely to contain outlines divided by the total number of blocks is greater than a predetermined outline-threshold. If the calculated ratio is greater than the outline-threshold then the signal is deemed to represent a cartoon, otherwise the signal is deemed not to represent a cartoon. In the embodiment described, the test for whether a block contains outlines is fairly simple. It would be possible to replace steps 62 and 63 by a more sophisticated algorithm which detects, for example, narrow bands of dark pixels, or an algorithm for detecting substantially parallel edges where pixels change from dark to light, or vice versa, or to implement a classifier using a neural network.
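The determination of steps 62 to 64 might be sketched in Python as follows. The function names and the default threshold values are hypothetical, since the description does not give values for the dark-threshold or the outline-threshold. Each block is assumed to be supplied as a list of booleans marking which pixels are dark, that is, which pixels belong to the level 0 image.

```python
def block_has_outlines(dark_flags, dark_threshold):
    """Steps 62 and 63: is the percentage of dark pixels below the dark-threshold?

    Only the upper-bound test stated in the description is applied here.
    """
    percentage = 100.0 * sum(dark_flags) / len(dark_flags)
    return percentage < dark_threshold


def is_cartoon(blocks, dark_threshold=20.0, outline_threshold=0.5):
    """Step 64: compare the fraction of outline blocks with the outline-threshold.

    The default thresholds are illustrative assumptions only.
    """
    outlined = sum(block_has_outlines(b, dark_threshold) for b in blocks)
    return outlined / len(blocks) > outline_threshold
```

With these illustrative thresholds, a frame in which three of four blocks each contain 10% dark pixels would be deemed a cartoon, whereas a frame whose blocks are entirely dark would not.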

As shown in Figure 5 an image classification program 109 according to the invention comprises a grey scale converter 130 which performs steps 10 and 20 of Figure 6, an analysing means 140 which performs steps 30, 40 and 50 of Figure 6, and a likelihood generator 150 which performs step 60 of Figure 6. The analysing means 140 comprises a vector quantiser 142 which performs steps 30 and 40 of Figure 6 and a luminosity parameter generator 144 which performs step 50 of Figure 6. The likelihood generator 150 comprises a low luminosity block signal generator 152, a block likelihood generator 154 and a likelihood combiner 156. The vector quantiser 142 comprises a block signal generator 146 and a block signal vector quantiser 148. The luminosity parameter generator 144 comprises a low luminosity signal generator 132 and a percentage of low luminosity signals calculator 134.

As will be understood by those skilled in the art, the image classification program 109 can be contained on various transmission and/or storage mediums such as a floppy disc, CD-ROM, or magnetic tape so that the program can be loaded onto one or more general purpose computers or could be downloaded over a computer network using a suitable transmission medium.

Whilst the invention has been described with reference to a signal representing an image comprising a plurality of pixels, it will be appreciated that the detection of low luminosity outlines may equally well be performed on images for which the original source of the image does not represent the image as a plurality of pixels.

Unless the context clearly requires otherwise, throughout the description and the claims, the words "comprise", "comprising" and the like are to be construed in an inclusive as opposed to an exclusive or exhaustive sense; that is to say, in the sense of "including, but not limited to".

Claims

1. A method for classifying whether an image represents a cartoon, comprising the step of generating a likelihood in dependence on the presence of low luminosity outlines in the image.

2. A method according to claim 1, and further comprising the steps of analysing the image to provide one or more parameters wherein one parameter relates to the luminosity of the image; and the generating step generates the likelihood in dependence upon the value of said one parameter.

3. A method according to claim 2, in which the image comprises a plurality of pixels and the analysing step includes the sub-step of vector quantising the image so that each pixel corresponds to one of a plurality of codes.

4. A method according to claim 3, in which the analysing step further comprises the sub-step of calculating the percentage of pixels corresponding to a one of the plurality of codes.

5. A method according to claim 4, in which said one of the plurality of codes is a code which corresponds to pixels of low luminosity.

6. A method according to any one of claims 3 to 5, in which the vector quantising sub-step comprises sub-steps of dividing the image into a plurality of blocks, each block comprising a subset of pixels in the image; and independently vector quantising each block.

7. A method according to claim 6, in which the generating step comprises the sub-step of generating a block likelihood value for each of a plurality of blocks, the block likelihood representing the probability that the pixels in that block represent an image comprising one or more outlines.

8. A method according to claim 7, in which the generating step further comprises the sub-step of combining a plurality of block likelihood values to provide the likelihood value for the image.

9. An apparatus for classifying whether an image represents a cartoon, the apparatus comprising generating means for generating a likelihood in dependence on the presence of low luminosity outlines in the image.

10. An apparatus according to claim 9, and further comprising means for analysing the image to provide one or more parameters wherein one parameter relates to the luminosity of the image; and the generating means receives in operation said one parameter and generates the likelihood in dependence upon the value of said one parameter.

11. An apparatus according to claim 10, in which the analysing means comprises a vector quantiser which receives in operation a plurality of pixels comprising the image and outputs a plurality of codes each output code corresponding to each of the received pixels.

12. An apparatus according to claim 11, in which the analysing means further comprises means for calculating the percentage of pixels corresponding to a one of the plurality of codes.

13. An apparatus according to claim 12, in which said one of the plurality of codes is a code which corresponds to pixels of low luminosity.

14. An apparatus according to any one of claims 11 to 13, in which the vector quantiser further comprises means for dividing the image into a plurality of blocks, each block comprising a subset of pixels in the image; and means for independently vector quantising each block.

15. An apparatus according to claim 14, in which the generating means comprises means for generating a block likelihood value for each of a plurality of blocks, the block likelihood representing the probability that the pixels in that block represent an image comprising one or more outlines.

16. An apparatus according to claim 15, in which the generating means further comprises means for combining a plurality of block likelihood values to provide a likelihood value for the image.

17. A data carrier loadable into a computer and carrying instructions for causing the computer to carry out the method according to any one of claims 1 to 8.

18. A data carrier loadable into a computer and carrying instructions for enabling the computer to provide the apparatus according to any one of claims 9 to 16.

19. A method for classifying whether an image represents a cartoon substantially as described herein with reference to the accompanying drawings.

20. An apparatus for classifying whether an image represents a cartoon substantially as described herein with reference to the accompanying drawings.