Video processing
FIELD OF THE INVENTION
The present invention relates to a video processing apparatus and method, and in particular, to a video processing apparatus and method involving cartoonizing.
BACKGROUND OF THE INVENTION
Video communication is increasingly being used in numerous applications such as video telephones, video conferencing, tele-collaboration, shared virtual table environments, and so on. In such systems, face detection and recognition are being actively researched in order to enhance the services provided by applications such as video conferencing. For example, face detection is used in video conferencing systems to create a virtual conference room, whereby participants of the meeting are seated around a virtual table. Numerous approaches have been used to assist in face detection, including techniques such as feature invariant approaches, appearance-based approaches, and wavelet analysis.
The majority of face detection research aims to find structural features that exist even when the light and viewpoint vary. Feature extraction methods utilize various properties of the face and skin to isolate and extract data, such as "eye" data. Popular methods include skin color segmentation, principal component analysis, eigenspace modeling, histogram analysis and texture analysis.
As mentioned above, face detection and skin detection methods are currently used in applications such as creating virtual video conferencing systems, or face recognition systems for security applications.
SUMMARY OF THE INVENTION
The aim of the present invention is to provide a video processing apparatus and method that utilizes information received from sources such as face and/or skin detection for cartoon applications.
The invention is defined by the independent claims. The dependent claims define advantageous embodiments.
According to a first aspect of the invention, there is provided an apparatus for cartoonizing an image signal having an object of interest. The apparatus comprises detecting means for detecting a feature of the object, and receiving means for receiving an input signal, the input signal relating to a characteristic of the object. The apparatus further comprises image processing means that is configured to automatically adapt the image signal based on the received input signal and/or the detected feature.
The invention has the advantage of being able to automatically adapt an image signal based on an input signal and/or detected feature of the object being viewed.
According to another aspect of the invention, there is provided a method of cartoonizing an image signal having an object of interest. The method comprises the steps of detecting a feature of the object, and receiving an input signal relating to a characteristic of the object. The image signal is automatically adapted based on the received input signal and/or the detected feature.
BRIEF DESCRIPTION OF THE DRAWINGS
For a better understanding of the invention, and to show more clearly how it can be carried into effect, reference will now be made, by way of example, to the following drawings in which:
Fig. 1 shows a first embodiment of the present invention;
Fig. 2 shows a second embodiment of the present invention; and
Fig. 3 shows a third embodiment of the present invention.
DETAILED DESCRIPTION OF A PREFERRED EMBODIMENT OF THE PRESENT INVENTION
Referring to Fig. 1, a first embodiment of the invention is disclosed in which a skin detection unit 1 receives an image signal from a sensor or camera 3. A video processor 5 receives the image signal from the camera 3, and a skin detection signal from the skin detection unit 1. The video processor 5 processes the image data to produce an output video signal 9 for display on a display means (not shown). According to the first embodiment, the video processor 5 is configured to change the skin color based on an input signal 7. The input signal 7 relates to a characteristic of the image being viewed. For example, the input signal 7 may relate to an emotional characteristic of the person being viewed. The emotion of the person being viewed can be detected from the tone of voice of that person (e.g. the average pitch of
the voice). Alternatively, the emotion can be detected by means of a separate infrared camera (not shown), which detects heat from the face of the person being viewed.
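By way of illustration only, the following Python sketch (using only NumPy) shows one way such a voice-based emotion signal could be derived: the average pitch is estimated by autocorrelation over short audio frames, and a raised average pitch is mapped to an "angry" condition. The 60-400 Hz search range and the 220 Hz threshold are illustrative assumptions and not part of the invention.

    import numpy as np

    def estimate_average_pitch(frames, sample_rate):
        """Estimate the average voice pitch (Hz) by autocorrelation over audio frames."""
        pitches = []
        for frame in frames:
            frame = frame - np.mean(frame)
            if np.max(np.abs(frame)) < 1e-3:
                continue  # skip silent (unvoiced) frames
            corr = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
            # Search the plausible speech pitch range of roughly 60-400 Hz (assumption).
            lo, hi = int(sample_rate / 400), int(sample_rate / 60)
            if hi < len(corr):
                lag = lo + int(np.argmax(corr[lo:hi]))
                pitches.append(sample_rate / lag)
        return float(np.mean(pitches)) if pitches else 0.0

    def classify_emotion(avg_pitch_hz, angry_threshold_hz=220.0):
        """Crude illustrative mapping: a raised average pitch is treated as 'angry'."""
        return "angry" if avg_pitch_hz > angry_threshold_hz else "calm"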
Based on the input signal 7 representing a characteristic of the object being viewed, the video processor 5 is configured to automatically adapt the image signal accordingly. In one example, the skin color of the person is changed according to the emotion of the person. For example, the skin color of the person could be changed to red when an angry tone is detected, or grey when a calm tone is detected. Likewise, the skin color could be changed to red when the infrared camera detects an increase in heat dissipation, or grey when less heat is detected. Preferably, a user can configure the system such that the adaptation carried out by the video processor 5 is programmable. For example, the user can configure a settings table stored in a memory, to select the input condition that triggers an adaptation by the video processor 5, and a corresponding output condition for each input signal 7. In other words, the settings table maps a received input signal with an adaptation process to be performed by the video processor 5.
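As a minimal sketch, such a settings table may be a simple lookup from an (input source, detected condition) pair to adaptation parameters; the names and the colour values (in BGR order) below are illustrative assumptions.

    # Illustrative settings table: (input source, condition) -> adaptation parameters.
    SETTINGS_TABLE = {
        ("voice", "angry"):   {"skin_color": (0, 0, 255)},      # red (BGR)
        ("voice", "calm"):    {"skin_color": (128, 128, 128)},  # grey
        ("infrared", "hot"):  {"skin_color": (0, 0, 255)},
        ("infrared", "cool"): {"skin_color": (128, 128, 128)},
    }

    def lookup_adaptation(source, condition):
        """Return the adaptation configured for this input signal, or None."""
        return SETTINGS_TABLE.get((source, condition))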
In addition to changing the skin color as described above, it is noted that the adaptation carried out by the video processor 5 may comprise other forms of video processing; for example, facial texturing may also be applied. Thus, according to the first embodiment, the image signal can be automatically changed in accordance with an input signal relating to a characteristic of the person being viewed. In this embodiment, the cartoonizing therefore involves a form of emotional conditioning.
Fig. 2 shows a second embodiment of the invention. According to the second embodiment, a feature extraction unit 21 is provided for detecting a feature in the object being viewed by a sensor or camera 23. For example, the feature extraction unit 21 may be configured to detect a feature in the face of a person being viewed. A video processor 25 receives the image signal from the camera 23, and a feature extraction signal from the feature extraction unit 21. The feature may be, for example, a left eye, a right eye, a left cheek, a right cheek, a chin, a left ear, a right ear, the top of the head, a left eyebrow, a right eyebrow, a beard, a nose or a mouth, etc. Having detected a particular feature in the image signal, the video processor 25 is configured to alter or adapt the image signal, by superimposing a secondary feature onto the image. The secondary feature is preferably positioned in a predetermined relationship to the feature originally detected, for example on or next to the feature originally detected.
The object being superimposed may be, for example, sunglasses, a hat, a beard, a tattoo, or any other feature chosen by a user.
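The superimposing step can be sketched as follows, assuming the secondary feature is available as an RGBA image and the detected feature position is given in pixel coordinates; the alpha-blended paste and the predetermined offset are illustrative choices, not a definitive implementation.

    import numpy as np

    def superimpose(frame, overlay_rgba, anchor_xy, offset_xy=(0, 0)):
        """Paste an RGBA secondary feature (e.g. sunglasses) onto the frame,
        centred at a predetermined offset from the detected feature position."""
        h, w = overlay_rgba.shape[:2]
        x = anchor_xy[0] + offset_xy[0] - w // 2
        y = anchor_xy[1] + offset_xy[1] - h // 2
        # Clip the paste region to the frame boundaries.
        x0, y0 = max(x, 0), max(y, 0)
        x1, y1 = min(x + w, frame.shape[1]), min(y + h, frame.shape[0])
        if x0 >= x1 or y0 >= y1:
            return frame  # overlay falls entirely outside the frame
        patch = overlay_rgba[y0 - y:y1 - y, x0 - x:x1 - x].astype(np.float32)
        alpha = patch[..., 3:4] / 255.0
        region = frame[y0:y1, x0:x1].astype(np.float32)
        frame[y0:y1, x0:x1] = (alpha * patch[..., :3] + (1.0 - alpha) * region).astype(np.uint8)
        return frame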
Also, as described above in the first embodiment, a user may select in advance which secondary feature is to be automatically superimposed onto the image signal, for example by configuring a settings table to map detected features with secondary features. For example, the system could be configured to automatically superimpose a pair of spectacles onto the eyes of the person being viewed, or a beard onto the person's chin.
The second embodiment can also be configured to automatically superimpose a secondary feature according to an emotional characteristic of the person being viewed. For example, the emotion of the person could be determined from the voice of the person, or heat detected from a separate infrared camera. In one example, if an angry emotion is detected, a set of horns could be placed on the head of the person being viewed, or smoke arranged to appear from the person's ears or forehead.
Alternatively, the background of a scene could be automatically changed according to the emotional characteristic of the person being viewed.
Fig. 3 shows a further embodiment of the invention, comprising a visible light camera 33 and an infrared (IR) or near infrared (nIR) camera 34. A face and/or skin detection unit 31 receives the signals from the visible light camera 33 and the IR camera 34, and based on the two received signals, an improved face/skin tone detection is carried out. A video processor 35 receives an image signal from the visible light camera 33, plus a face/skin detection signal from the face/skin detection unit 31. As with the first embodiment, the video processor 35 is configured to change the skin color based on an input signal 37. The input signal 37 relates to a characteristic of the image being viewed. For example, the input signal 37 may relate to an emotional characteristic of the person being viewed. The emotion of the person being viewed can be detected from the tone of voice of that person (e.g. the average pitch of the voice). Alternatively, the emotion can be detected using the infrared camera 34, which detects heat from the face of the person being viewed.
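One illustrative way for the face/skin detection unit 31 to combine the two camera signals, assuming the visible-light and IR images are co-registered, is to intersect a colour-based skin mask with a heat mask. The RGB rule below is a well-known skin-colour heuristic, and the relative IR threshold is an assumption.

    import numpy as np

    def detect_skin(visible_bgr, ir_frame, ir_threshold=0.5):
        """Accept a pixel as skin only if its colour lies in a simple skin-tone
        range AND the co-registered IR camera reports body heat there."""
        b = visible_bgr[..., 0].astype(int)
        g = visible_bgr[..., 1].astype(int)
        r = visible_bgr[..., 2].astype(int)
        # Well-known RGB skin heuristic; suppresses many non-skin colours.
        color_mask = (r > 95) & (g > 40) & (b > 20) & (r > g) & (r > b) & ((r - g) > 15)
        # Relative heat threshold (assumption); face pixels are warmer than background.
        heat_mask = ir_frame > ir_threshold * ir_frame.max()
        return color_mask & heat_mask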
Based on the input signal 37 representing a characteristic of the object being viewed, the video processor 35 is configured to adapt the image signal accordingly. In one example, the skin color of the person is changed according to the emotion of the person. For example, the skin color of the person could be changed to red when an angry tone is detected, or grey when a calm tone is detected. Likewise, the skin color could be changed to red when the infrared camera detects an increase in heat dissipation, or grey when less heat is detected.
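The colour change itself may be realised, for example, by blending the detected skin pixels towards a target colour; a minimal sketch follows, in which the blending strength is an illustrative parameter.

    import numpy as np

    def tint_skin(frame_bgr, skin_mask, target_bgr, strength=0.6):
        """Blend the masked skin pixels towards the target colour (e.g. red or grey)."""
        out = frame_bgr.astype(np.float32)
        target = np.asarray(target_bgr, dtype=np.float32)
        out[skin_mask] = (1.0 - strength) * out[skin_mask] + strength * target
        return out.astype(np.uint8)

    # e.g. tint_skin(frame, mask, (0, 0, 255)) when an angry tone is detected.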
There is therefore provided a cartoonizing apparatus that automatically adapts an image signal in accordance with an input signal relating to a characteristic of an image being viewed, and/or a feature detected in the image being viewed.
While the preferred embodiments have referred to changing the skin color, for example, it will be appreciated that other features of an image could also be changed, for example hair color, eye color, etc. In addition, although the preferred embodiments are described in relation to a person being the main object in the image signal, it will be appreciated that the invention is equally applicable to any other object or objects.
Thus, it should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design many alternative embodiments without departing from the scope of the appended claims. The word 'comprising' does not exclude the presence of elements or steps other than those listed in a claim.
In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the device claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.