HK1132871B - Methods and apparatus for detecting on-screen media sources - Google Patents
- Publication number: HK1132871B
- Authority: HK (Hong Kong)
- Prior art keywords: image, video signal, region, interest, mode
Description
Technical Field
The present invention relates generally to audience measurement systems, and more particularly to methods and apparatus for detecting on-screen media sources.
Background
Television program producers typically use the demographics of television viewers to improve the marketability of their television programming and to determine appropriate pricing for advertising during such programming. In addition, accurate television viewing demographics enable advertisers to target commercial content to desired groups of people.
To determine these demographics, audience measurement companies may recruit multiple television viewers as survey subjects. The viewing habits of the recruited viewers and demographic data related to the recruited viewers are collected and used to statistically determine the demographics of the television viewers.
One aspect of audience measurement relates to determining the identity of content displayed on a television. Conventional audience measurement systems utilize channel detection techniques to identify channels (i.e., channel information) to which a television is tuned. This channel information may then be combined with programming information (e.g., a program schedule) to determine the programs that are displayed on the television during the time that the television is tuned to the detected channel.
In addition, audience members may use televisions to consume media from one or more alternative sources, such as digital versatile disc (DVD) players, video cassette recorders (VCRs), game consoles, personal computers, and the like. For example, audience members may use a television to play games, gamble, shop, watch video on demand, and so forth. While media from such alternative sources (e.g., DVD players, VCRs, video game consoles, etc.) may not originate from broadcasters, measuring these alternative sources may enhance the measurement of the demographics and viewing habits of the recruited audience.
In addition to detecting the channel to which the television is tuned, the channel detection system may also be used to detect a channel change event in which the television stops displaying programs associated with a first channel and begins displaying programs associated with a second channel. Because the channel change event corresponds to a change in the program displayed to the audience of the program, the channel change event may serve as a trigger for initiating one or more methods of obtaining identification data related to a new program displayed to the audience.
Furthermore, a channel change event and/or a switch to one or more alternative media sources (hereinafter "media change event") often corresponds to a change in audience membership and may therefore also be used to trigger a method of collecting information related to an audience. For example, an audience measurement system may include a device (e.g., a people meter) having a set of input keys, where each input key is assigned to represent a single audience member. The people meter may be adapted to collect information about the audience by prompting audience members to indicate that they are in the audience by, for example, pressing the appropriate input key provided on the people meter. Using a media change event as a trigger for the people meter prompt enables accurate recording of changes in audience size and/or membership that may be caused by, or may cause, the media change event.
Drawings
Fig. 1 is a block diagram of an exemplary audience measurement system connected to an exemplary television system.
Fig. 2 is a block diagram of a first exemplary on-screen media source detector (OSMSD) used in the exemplary audience measurement system of fig. 1.
Fig. 3 is a block diagram of a second example OSMSD used in the example audience measurement system of fig. 1.
Fig. 4A illustrates an exemplary region of interest of the exemplary television of fig. 1.
Figs. 4B-4E illustrate a series of screens that may be displayed on the television of figs. 1 and 4A, illustrating television viewing, banner browsing, video on demand, and television guide features.
Fig. 5 illustrates a flow diagram of an exemplary process for determining viewer activity.
Fig. 6A illustrates a flow chart of a first exemplary process for determining whether a channel change event has occurred.
Fig. 6B illustrates a flow chart of a second exemplary process for determining whether a channel change event has occurred.
FIG. 7 illustrates exemplary content, captured by the frame grabber of FIG. 1, that may be displayed in an exemplary region of interest.
Fig. 8 illustrates the exemplary content of fig. 7 after conversion from a grayscale image to a black-and-white image.
FIG. 9 illustrates an exemplary set of characteristics of the exemplary content shown in FIG. 8.
FIG. 10 illustrates an exemplary set of digit components arranged to form a set of digits.
Figs. 11A-11C illustrate a flow diagram of an exemplary process for identifying a displayed channel number.
Fig. 12A-12E depict a series of screens illustrating an exemplary sequence of regions of interest detected by an exemplary OSMSD.
Fig. 13 illustrates an exemplary video-on-demand screen.
Fig. 14A illustrates an exemplary state diagram of a media device.
FIG. 14B illustrates a flow chart of an exemplary process for determining viewer activity status based on screen order.
FIG. 15 depicts an exemplary screen illustrating a video game with a recognizable area of interest.
Fig. 16 depicts an exemplary multi-screen display.
FIG. 17 depicts a series of exemplary screens showing a resume feature.
Fig. 18A is a block diagram of an exemplary prior art on-screen display reader (OSDR) system.
FIG. 18B is a block diagram of an exemplary OSDR system employing an OSMSD front-end filter.
Fig. 19 illustrates an exemplary set of processing timelines corresponding to the exemplary OSDR and OSMSD systems of fig. 18A-18B.
Detailed Description
While the following description discloses exemplary systems including, among other components, software executed on hardware, it should be noted that such systems are merely illustrative and should not be considered as limiting. For example, it is contemplated that any or all of the disclosed hardware and software components could be embodied exclusively in dedicated hardware, exclusively in software, exclusively in firmware or in any combination of hardware, firmware and/or software.
Further, while the following disclosure is made with respect to an exemplary television system, it should be understood that the disclosed system may be readily applied to many other types of media systems, including but not limited to television systems, media recording devices, and/or video game devices. Thus, while the following description describes certain exemplary systems and processes, persons of ordinary skill in the art will readily appreciate that the disclosed examples are not the only way to implement such systems.
Media environment
A block diagram of an exemplary audience measurement system 100 capable of monitoring television viewers is illustrated in the exemplary environment of use of fig. 1. Further details of an exemplary audience measurement system 100 are given in PCT application No. PCT/US2004/012272, filed on April 19, 2004, the specification of which is hereby incorporated by reference in its entirety. The exemplary audience measurement system 100 includes a signal splitter 106, a frame grabber 108, and an on-screen media source detector (OSMSD) 112. In the illustrated example, the audience measurement system 100 is connected to an exemplary media center 101, which includes a set-top box 104 for receiving signals from a television service provider 102, a video game console 105, and a television 110 connected to the set-top box 104. These components of the media center 101 may be connected in any manner, including the manner shown in fig. 1.
The television 110 may be any type of television or television display device. For example, the television 110 may be a television and/or display device that supports the National Television System Committee (NTSC) standard, the Phase Alternating Line (PAL) standard, the sequential color with memory (SECAM) standard, a standard developed by the Advanced Television Systems Committee (ATSC), such as High Definition Television (HDTV), or a standard developed by the Digital Video Broadcasting (DVB) project, or it may be a multimedia computer system, etc.
The television service provider 102 may be any television service provider such as a cable television service provider, a telecommunications television service provider, a satellite television service provider, and/or a Radio Frequency (RF) television service provider. Television service provider 102 may provide analog and/or digital television signals to media center 101, for example, via a coaxial cable or any wireless connection.
The set-top box 104 may be any set-top box, such as a cable television converter, a telecommunications television converter, a digital radio broadcast converter, a Direct Broadcast Satellite (DBS) decoder, a personal digital video recorder (e.g., TiVo), a Digital Video Disc (DVD) player, a Video Cassette Recorder (VCR), etc. The set-top box 104 receives a plurality of television channels from the television service provider 102. Generally, the set-top box 104 selects one of the plurality of television channels based on user input and outputs a video signal transmitted on the selected television channel. In the case of an analog signal, the set-top box 104 tunes to a particular channel to obtain the programming transmitted on that channel. For digital signals, the set-top box 104 decodes certain data packets to obtain the program transmitted on the selected television channel.
The video game console 105 may be any video game system, such as any version of the Microsoft® Xbox®, the Nintendo® GameCube® and/or Wii®, and/or any version of the Sony® PlayStation®. Without limitation, video games may also be provided to the user through the set-top box 104 (e.g., over the internet) and displayed on the television 110. The user may access video game content stored in the set-top box 104 and/or receive video game content from, for example, a television service provider.
Television service providers may also provide other forms of media that are not generally referred to as broadcast content, such as gambling, internet content, weather information, and/or shopping. The set-top box 104 may also allow the user to view multi-screen displays for multiple channels, program guide screens, and/or Video On Demand (VOD) services, all of which, if detected, provide the audience measurement company with a rich source of viewing habit information.
The output of the set-top box 104 and/or the video game console 105 is fed to the signal splitter 106 of the audience measurement system 100. The signal splitter 106 may be an analog Y splitter. In the exemplary audience measurement system 100, the signal splitter 106 produces two signals indicative of the output of the set-top box 104. Of course, one of ordinary skill in the art will readily appreciate that the signal splitter 106 may produce any number of signals and/or signals of any resolution (e.g., high definition television signals). One of these two signals is fed to the television 110, while the other signal is passed to the frame grabber 108.
The frame grabber 108 can be implemented using any type of commercially available frame grabber and is used to convert standard television signals to digital data. For example, the frame grabber 108 may convert an NTSC signal to a digital bitmap. NTSC television images have 525 horizontal lines per frame. The lines are scanned from left to right and top to bottom, with every other line skipped (i.e., interlaced). As a result of the interlacing, two field scans are required to complete an entire frame. Each field scan takes approximately 1/60 of a second, so one complete frame is scanned approximately every 1/30 of a second. Alternatively or additionally, the frame grabber 108 may convert PAL (phase alternating line) and/or SECAM (sequential color with memory) signals to a digital bitmap. The resulting bitmap may be of any size and may encode any number of shades and/or colors, regardless of the television broadcast standard in which the signal was originally formatted. For example, the frame grabber 108 may grab 320x240 grayscale pixels, 640x480 grayscale pixels, high definition pixels, color images, and so on.
The frame grabber 108 periodically (e.g., every 120ms) captures a "screenshot" corresponding to an image displayed on the television 110 at a particular time and sends the resulting digital image data to an on-screen media source detector (OSMSD) 112. The OSMSD112 scans a series of acquired images to determine if a channel change event and/or media source change event (both referred to hereinafter as a "media change event") has occurred. The OSMSD112 will be discussed in more detail below in conjunction with the description of fig. 2 and 3.
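The timing figures in the two preceding paragraphs can be related with a little arithmetic. The sketch below uses the approximate 1/60 second field period and the 120 ms example capture interval from the text; the variable names are illustrative only.

```python
from fractions import Fraction

# Approximate NTSC timing as stated in the description.
FIELD_PERIOD = Fraction(1, 60)        # seconds per field scan (approximate)
FIELDS_PER_FRAME = 2                  # interlacing: two fields make one frame
FRAME_PERIOD = FIELD_PERIOD * FIELDS_PER_FRAME

# Example capture interval from the description (120 ms).
CAPTURE_INTERVAL = Fraction(120, 1000)

# Number of displayed frames that elapse between two consecutive screenshots.
frames_per_capture = CAPTURE_INTERVAL / FRAME_PERIOD

print(FRAME_PERIOD)               # 1/30
print(float(frames_per_capture))  # 3.6
```

Under these figures, the frame grabber samples roughly one of every 3.6 displayed frames, so an on-screen channel number only needs to persist for a few tenths of a second to appear in several consecutive screenshots.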
The exemplary media center 101 may also include a remote control 114 for transmitting control information, which may be received by any or all of the set-top box 104, the video game console 105, the television 110, the frame grabber 108, and/or the OSMSD 112. Those of ordinary skill in the art will recognize that the remote control device 114 may transmit this information using any type of technology, including but not limited to Infrared (IR) transmission, radio frequency transmission, wired/cable connections, etc.
The example audience measurement system 100 may also include a people meter 116 for collecting information about the audience. The example people meter 116 may have a set of input keys, each assigned to represent a corresponding audience member, and may prompt audience members to indicate whether they are among the audience by pressing the appropriate input key. The people meter 116 may also receive an instruction or a set of instructions from the OSMSD112 that cause the people meter 116 to prompt an audience member. The prompting of audience members may include, but is not limited to, presenting one or more graphical overlay images on a screen directed to one or more audience members. For example, the people meter 116 may generate a text overlay that asks audience members to press the button that identifies them. Such a graphical overlay image may be detected in one or more ROIs so that activity of the people meter 116 may be determined. Additionally, the OSMSD112 may receive information from the people meter 116 to modify operation of the OSMSD112, for example, to cause the OSMSD112 to report information to the central processing device 118. As will be appreciated by one of ordinary skill in the art, the people meter 116 may receive and/or transmit information using any of a variety of techniques including, but not limited to, Infrared (IR) transmission, radio frequency transmission, wired/cable connections, and the like.
Fig. 2 is a block diagram of an example OSMSD200 that may be used to implement the OSMSD112 of fig. 1. The example OSMSD200 receives images and/or image data corresponding to a screenshot or portion thereof associated with the television 110 from, for example, the frame grabber 108. As discussed in further detail below, OSMSD200 stores the received image data in memory 202 if certain image conditions are met. Memory 202 may also be used to store one or more extracted regions of the intermediate image and/or one or more images generated by processing image data from frame grabber 108.
To process the received image, the example OSMSD200 includes a region of interest (ROI) extractor 204. The ROI extractor 204 reads the image stored in the memory 202 and extracts one or more regions of interest (ROIs) from the image being processed. As discussed in more detail below, the extracted ROI may correspond to a region displayed on the television 110 in association with the display of a channel number during a channel change event. After extracting the ROI from the image being processed, the ROI extractor may store the resulting ROI image back into memory 202.
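As a minimal sketch of what ROI extraction amounts to, the following crops a rectangular region out of a captured image held as a list of pixel rows. The function name `extract_roi`, its parameters, and the tiny test frame are hypothetical and are not taken from the description.

```python
def extract_roi(image, top, left, height, width):
    """Return the rectangular sub-image starting at (top, left).

    `image` is a list of rows, each row a list of pixel values.
    The rectangle layout (top, left, height, width) is an assumption
    made for this illustration.
    """
    if top < 0 or left < 0 or height <= 0 or width <= 0:
        raise ValueError("ROI origin must be non-negative and size positive")
    if top + height > len(image) or left + width > len(image[0]):
        raise ValueError("ROI falls outside the image")
    return [row[left:left + width] for row in image[top:top + height]]


# A 4x6 test frame whose pixel value encodes its position (row * 10 + column).
frame = [[r * 10 + c for c in range(6)] for r in range(4)]

# Extract a 2x2 region near the upper right corner of the frame.
roi = extract_roi(frame, top=0, left=4, height=2, width=2)
print(roi)  # [[4, 5], [14, 15]]
```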
Image management/processing
Prior to storing the ROI image, the ROI extractor 204 may pass the ROI image to the image quantizer 206. Quantizing the ROI image with the image quantizer 206 can significantly reduce the amount of memory and/or the amount of processing required by subsequent blocks/functions in the OSMSD200. For example, the memory required to store a two-level black-and-white (i.e., binary) ROI image may be as little as 1/8 of that required for a 256-level grayscale ROI image.
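The 1/8 figure follows from the bits needed per pixel: 256 gray levels need 8 bits, two levels need 1 bit. The short calculation below illustrates this for a hypothetical 64x32 pixel ROI, assuming tightly packed pixels with no header or padding overhead.

```python
def roi_bytes(width, height, bits_per_pixel):
    """Bytes needed to store a tightly packed image (no headers, no row padding)."""
    total_bits = width * height * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes


# Hypothetical ROI size chosen only for illustration.
WIDTH, HEIGHT = 64, 32

gray_bytes = roi_bytes(WIDTH, HEIGHT, bits_per_pixel=8)    # 256 gray levels
binary_bytes = roi_bytes(WIDTH, HEIGHT, bits_per_pixel=1)  # black and white

print(gray_bytes, binary_bytes, binary_bytes / gray_bytes)  # 2048 256 0.125
```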
Referring to fig. 2, the image quantizer 206 takes as input a first image having a plurality of color and/or intensity levels. Based on one or more quantizer thresholds 208, the image quantizer 206 then converts the first image into a second image having fewer colors (e.g., using a chroma threshold), different intensity levels (e.g., using a pixel brightness/luminosity threshold), and/or a higher or lower pixel density than the original first image. For example, the image quantizer 206 may be implemented to convert a first image having 256 gray levels into a second, black-and-white image having two gray levels based on a single threshold 208. In this example, pixels in the first image having a grayscale value greater than the threshold 208 may be converted to white in the second image. Similarly, pixels in the first image having a grayscale value less than the threshold 208 may be converted to black in the second image. Additionally, the quantizer threshold 208 may be adjusted (e.g., manually, or automatically based on the results of subsequent processing performed on the image) to improve the sharpness of the second image. After conversion, the image quantizer 206 may store the resulting quantized image (i.e., the second image) in the memory 202 for use in subsequent processing.
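The single-threshold case described above can be sketched as follows. The function name `quantize` and the sample values are illustrative. The description does not say how a pixel exactly equal to the threshold is handled, so mapping it to black here is an assumption.

```python
WHITE, BLACK = 255, 0


def quantize(image, threshold):
    """Convert a grayscale image (values 0..255) to black and white.

    Pixels strictly greater than `threshold` become white; all others
    become black. Treating a pixel equal to the threshold as black is an
    assumption, since the description leaves that case open.
    """
    return [[WHITE if pixel > threshold else BLACK for pixel in row]
            for row in image]


gray_roi = [
    [12, 200, 130],
    [128, 90, 255],
]
print(quantize(gray_roi, threshold=128))
# [[0, 255, 255], [0, 0, 255]]
```

Raising or lowering `threshold` changes which strokes survive quantization, which is one way to picture the threshold adjustment mentioned in the paragraph above.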
To process an image from the memory 202 (e.g., a quantized ROI image from the image quantizer 206), the example OSMSD200 includes an object detector 210 and an edge detector 212. The object detector 210 may be used to detect one or more objects in the ROI image based on a set of object features stored in the memory 202. For example, the object detector 210 may be configured to detect digits in the ROI image based on a set of features (e.g., height, width, spacing between digits, etc.) associated with the digits. Additionally, the object detector 210 may be configured to detect symbols, such as channel icons (e.g., the NBC "peacock") and/or service icons, that may appear on the television 110 when the user enters the broadcast guide mode of the set-top box 104. As discussed in further detail below, the media source may be determined based on images, numbers, and/or symbols appearing on the television screen. The media source may include, but is not limited to, live television viewing, delayed viewing, video on demand viewing, pay per view, multi-view display, internet browsing, DVD and/or VCR recording and/or playback, Time Warner® StartOver® services, and/or interactive applications (e.g., online shopping, customer service, guide mode, and/or instant messaging). For example, once activated, the video game console 105 (e.g., any version of the Microsoft Xbox) prominently displays the word "Xbox" as a splash image in the middle area of the screen of the television 110 and the word "Microsoft" in the lower middle area. The edge detector 212 may be used to detect one or more edges (e.g., transitions/differences between intensity levels) in the ROI image. For example, the edge detector 212 may be used to determine whether the ROI image corresponds to a static area (also called a quiet area), i.e., an area with a constant background color/intensity.
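One simple way to test whether an ROI is a static (quiet) area is to count intensity transitions between neighboring pixels. The sketch below is an assumed approach, not the detector described in the source; `count_edges`, `is_static`, `min_step`, and `max_edges` are hypothetical names and parameters.

```python
def count_edges(roi, min_step=1):
    """Count neighboring pixel pairs whose intensity differs by at least `min_step`.

    Each pixel is compared with its right and lower neighbor, so every
    horizontal and vertical transition is counted exactly once.
    """
    edges = 0
    for r, row in enumerate(roi):
        for c, pixel in enumerate(row):
            if c + 1 < len(row) and abs(pixel - row[c + 1]) >= min_step:
                edges += 1
            if r + 1 < len(roi) and abs(pixel - roi[r + 1][c]) >= min_step:
                edges += 1
    return edges


def is_static(roi, max_edges=0):
    """Treat the ROI as a static area when it has at most `max_edges` transitions."""
    return count_edges(roi) <= max_edges


flat = [[0, 0, 0, 0] for _ in range(3)]        # constant background
busy = [[0, 255, 0, 255], [255, 0, 255, 0]]    # checkerboard-like content

print(is_static(flat), count_edges(busy))  # True 10
```

A nonzero `max_edges` would tolerate a small amount of noise in an otherwise uniform background.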
In the exemplary OSMSD200, the object detector 210 increments the counter 214 each time a desired object (e.g., channel number, symbol, trademark, channel icon) or group of objects is detected. Conversely, the edge detector 212 may, for example, reset the counter 214 if an edge is detected in an ROI image that corresponds to a static region. The behavior and use of the object detector 210 and the edge detector 212 will become more apparent from the subsequent description of FIGS. 4A-11C provided below.
Continuing with the description of FIG. 2, the counter 214 may be used to indicate the number of times the object detector 210 detects a desired object or group of objects in one or more images stored in the memory 202. The counter comparator 216 may be used to compare the value of the counter 214 to a comparator threshold 218 to determine, for example, whether a desired object or group of objects is detected in a sufficient number of images to indicate that a monitored event may have occurred. For example, the counter comparator 216 may determine whether a channel number has been detected in a sufficient number of captured images to indicate that a channel change may have occurred, and/or whether "Xbox" and "Microsoft" images have been detected in a sufficient number of captured images to indicate that the user has just turned on the game console 105. The behavior and use of the counter 214 and the counter comparator 216 will become more apparent from the subsequent description of fig. 4A-11C provided below.
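The interplay of the counter, the edge-triggered reset, and the comparator threshold can be sketched as a small class. The class name, method signature, and the choice of "greater than or equal to" for the comparison are assumptions for illustration; the description only says the counter value is compared with a threshold.

```python
class DetectionCounter:
    """Counts detections across captured images and compares against a threshold."""

    def __init__(self, threshold):
        self.threshold = threshold  # plays the role of the comparator threshold
        self.value = 0              # plays the role of the counter

    def update(self, object_detected, edge_in_static_region):
        """Process one captured image and report whether the threshold is reached.

        An edge found where a static region is expected resets the count;
        otherwise a detected object increments it. Using >= for the
        comparison is an assumption.
        """
        if edge_in_static_region:
            self.value = 0
        elif object_detected:
            self.value += 1
        return self.value >= self.threshold


counter = DetectionCounter(threshold=3)

# (object_detected, edge_in_static_region) for six consecutive screenshots.
observations = [(True, False), (True, False), (True, True),
                (True, False), (True, False), (True, False)]

results = [counter.update(obj, edge) for obj, edge in observations]
print(results)  # [False, False, False, False, False, True]
```

The third screenshot resets the count, so three further clean detections are needed before the monitored event is flagged.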
To compare two or more images, the example OSMSD200 includes an image comparator 220. The image comparator 220 may be used to compare two images, e.g., an ROI image and a stored reference image/template, to determine whether the two images (or their contents) are substantially similar or different. For example, in the case of detecting a channel change event, the image comparator 220 may compare an ROI image corresponding to a currently displayed channel number with a reference image/template corresponding to a previously displayed channel number. If the image comparator 220 determines that the two images are substantially identical, then a channel change event may not have occurred. However, if the two images are substantially different, a channel change event may have occurred. The image comparator 220 may be enabled based on, for example, a trigger signal from the counter comparator 216 indicating that a sufficient number of objects have been detected and/or a trigger signal from the object detector 210 indicating that a desired object or group of objects has been detected. The behavior and use of the image comparator 220 will become more apparent from the subsequent description of fig. 4A-11C provided below.
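The description does not define "substantially identical". One plausible reading, sketched below, is that the fraction of differing pixels between two equally sized quantized images stays under a tolerance. The function names and the 10% tolerance are assumptions.

```python
def differing_fraction(image_a, image_b):
    """Fraction of pixel positions at which two equally sized images differ."""
    if len(image_a) != len(image_b) or any(
            len(ra) != len(rb) for ra, rb in zip(image_a, image_b)):
        raise ValueError("images must have the same dimensions")
    total = sum(len(row) for row in image_a)
    if total == 0:
        raise ValueError("images must not be empty")
    differing = sum(pa != pb
                    for ra, rb in zip(image_a, image_b)
                    for pa, pb in zip(ra, rb))
    return differing / total


def substantially_same(image_a, image_b, max_fraction=0.1):
    """True when at most `max_fraction` of the pixels differ (assumed tolerance)."""
    return differing_fraction(image_a, image_b) <= max_fraction


reference = [[0, 255, 0, 0], [0, 255, 0, 0], [0, 255, 0, 0]]      # stored template
noisy = [[0, 255, 0, 0], [0, 255, 0, 255], [0, 255, 0, 0]]        # one pixel of noise
other = [[255, 255, 255, 0], [0, 0, 255, 0], [255, 255, 255, 0]]  # different glyph

print(substantially_same(reference, noisy))  # True
print(substantially_same(reference, other))  # False
```

In the channel change case, a `False` result for the current ROI against the previous channel's template would suggest that the displayed channel number has changed.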
Further, the example OSMSD200 may include an object identifier 222 for identifying one or more objects present in the image being processed (e.g., ROI image). For example, the object identifier 222 may identify one or more objects in the ROI image based on a set of object features stored in the memory 202. As described above, the objects may include, but are not limited to, symbols, channel icons, and/or trademarks. In another example, the object identifier 222 may utilize the results of the image comparator 220 to identify one or more objects in the ROI image. In the latter example, the image comparator 220 may compare the ROI image to a reference image/template corresponding to a known object. If it is determined that the ROI image is substantially identical to the reference image/template, the object identifier 222 may conclude that the ROI image contains a known object. The behavior and use of the object identifier 222 will become more apparent from the subsequent description of FIGS. 4A-17 provided below.
The exemplary OSMSD200 may also include a remote control signal detector 224 for detecting signals from a remote control device (e.g., the remote control device 114 of fig. 1). The OSMSD200 may use the remote control signal detector 224 to determine whether the remote control device 114 transmitted one or more signals within the time window in which the frame grabber 108 acquired the image currently being processed. The output of the remote control signal detector 224 may be used to determine whether the image being processed may correspond to a media change event. As described above, a media change event may include, but is not limited to, a channel change or the activation of a VCR, a DVD player, and/or a game console. Additionally, the media change event may include the use of an electronic program guide, VOD, home shopping, gambling, and/or the viewing of a multi-channel multi-screen display. For example, if the remote control signal detector 224 determines that the remote control device 114 is not transmitting a signal (indicating that no user input has been applied to the media center 101), the remote control signal detector 224 may reset the counter 214 to indicate, for example, that none of the digits currently detected correspond to a channel change event. Similarly, when a logo such as the TiVo® logo appears on the screen, the absence of a corresponding remote control signal may suggest that the logo appears as part of an advertisement rather than as a result of a user/viewer invoking the TiVo® DVR. The behavior and use of the remote control signal detector 224 will become more apparent from the subsequent description of fig. 4A-17 provided below.
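The gating role of the remote control signal can be sketched as a time window check followed by a conditional counter reset. The description does not specify how the time window is defined, so the fixed look-back window of 2.0 seconds and the function names below are assumptions.

```python
def remote_active_in_window(signal_times, capture_time, window):
    """True if any remote control signal arrived within `window` seconds
    before (or at) the time the image was captured."""
    return any(capture_time - window <= t <= capture_time for t in signal_times)


def gate_counter(counter_value, signal_times, capture_time, window=2.0):
    """Keep the detection count only when remote activity accompanies the image.

    Without remote activity the count is reset to zero, reflecting that
    on-screen digits or logos were probably part of the program content.
    The 2.0 second default window is a hypothetical value.
    """
    if remote_active_in_window(signal_times, capture_time, window):
        return counter_value
    return 0


print(gate_counter(3, signal_times=[10.2], capture_time=11.0))  # 3
print(gate_counter(3, signal_times=[4.0], capture_time=11.0))   # 0
```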
The images identified by the object identifier 222 may be stored in the memory 202. In addition, such identified images may also be forwarded to the central processing device 118, as shown in FIG. 1. The central processing device 118 may further process the images to determine, for example, whether a particular image detected in a particular ROI corresponds to a channel change or to a particular media device or mode (e.g., set-top box VOD mode, PPV mode, VCR, DVD player, etc.). Additionally or alternatively, the example OSMSD200 may further process the identified images in the memory 202 to determine which media device and/or which media device mode the viewer is using, as discussed in further detail below.
Fig. 3 is a block diagram of an example OSMSD250 that may be used to implement the OSMSD112 of fig. 1 or the example OSMSD200 of fig. 2. The exemplary OSMSD250 may be a Personal Computer (PC) or any other processor or computing device. Specifically, the exemplary OSMSD250 includes a main processing unit 252 powered by a power supply 253. The main processing unit 252 may include a processor 254 electrically connected to a main memory device 258 and/or one or more interface circuits 260 via a system interconnect 256. For example, the system interconnect 256 may be an address/data bus. Of course, those skilled in the art will readily appreciate that interconnects other than buses may be used to connect the processor 254 to the main memory device 258. For example, one or more dedicated lines and/or a crossbar may be used to connect the processor 254 to the main memory device 258.
The processor 254 may comprise any type of Central Processing Unit (CPU), such as a microprocessor from the Intel Pentium® microprocessor family, the Intel Itanium® microprocessor family, the Intel Centrino® processor family, and/or the Intel XScale® processor family. The processor 254 may include any type of cache memory, such as Static Random Access Memory (SRAM). The main memory device 258 may include Dynamic Random Access Memory (DRAM), but may also include non-volatile memory. The main memory device 258 may be used to store software programs that are executed by the processor 254 in any suitable manner.
The interface circuit 260 may be implemented using any type of interface standard, such as an analog cable interface, a digital cable interface, a satellite signal interface, an ethernet interface, and/or a Universal Serial Bus (USB) interface. One or more input devices 262 may be connected to the interface circuit 260 for inputting data and/or commands to the main processing unit 252. For example, the input device 262 may be a keyboard, mouse, touch screen, track pad, track ball, isopoint, and/or a voice recognition system. In addition, the interface circuit 260 may process digital data inputs supplied by the frame grabber 108, inputs supplied by the remote control detector 263, and inputs and/or outputs corresponding to the people meter interface 264. The remote control detector 263 detects signals transmitted from the remote control device 114 of fig. 1. For example, the remote control detector 263 may be designed to detect a signaling event (e.g., detect that a signal is being transmitted by the remote control device 114) or to decode a signal transmitted by the remote control device 114 to determine the control information being transmitted to a receiving device (e.g., the set-top box 104). Similarly, the people meter interface 264 can be designed to receive signals from and/or transmit signals to the people meter 116 of fig. 1. For example, the people meter interface 264 may be used to send a signal to the people meter 116 if a channel change is detected. One of ordinary skill in the art will recognize that all or a portion of the frame grabber 108, the remote control detector 263, and/or the people meter interface 264 may be incorporated into the OSMSD250.
One or more displays, printers, speakers, and/or other output devices 265 may also be connected to the main processing unit 252 through one or more of the interface circuits 260. For example, the output device may be used to observe the collected data or for diagnostic purposes. The display 265 may be a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), or any other type of display. The display 265 may provide visual indications of data generated during operation of the main processing unit 252. Further, the visual display may include prompts for human operator input, calculated values, detected data, and the like.
The OSMSD250 may also include one or more storage devices 266. For example, the OSMSD250 may include one or more hard disk drives, Compact Disk (CD) drives, DVD drives, and/or other computer media input/output (I/O) devices.
The OSMSD250 can also exchange data with other devices over a connection 267 to the network 268. The network connection 267 may be any type of network connection such as an ethernet connection, a Digital Subscriber Line (DSL), a telephone line, a coaxial cable, etc., and the network 268 may be any type of network such as the internet, a telephone network, a wired network, and/or any type of wireless network. For example, OSMSD250 may exchange data with central processing device 270 over network 268. The central processing device 270 may receive data from the OSMSD250 from which the central processing device 270 may determine a set of viewing statistics or identify a set of viewing behaviors. Additionally, the OSMSD250 may receive information from the central processing device 270 to modify the operation of the OSMSD250, such as the frequency at which the OSMSD250 provides data to the central processing device 270.
Fig. 4A illustrates types of images and ROIs that may be displayed on an exemplary television 300 (e.g., the exemplary television 110 of fig. 1). The exemplary television 300 includes a display device 302, such as a CRT, plasma, or LCD, capable of displaying video/image content within a display area 304. When the channel received by the television 300 changes (e.g., by way of the set-top box 104 of fig. 1), the new channel number is displayed in region 306, which is located, for example, near the upper right corner of the display area 304. Region 306 includes a background region 308 that may be, for example, solid colored (e.g., blue), transparent, translucent, or opaque. A set of digits 310 representing the new channel number may be superimposed on the background region 308. Region 306 is also referred to hereinafter as the first region of interest (ROI) 306 and may be associated with a channel number display. Similarly, when the viewer invokes a different media device, such as a TiVo® DVR, a TiVo® logo 311 may be displayed in the middle left region 313 of the display area 304.
In some examples, the program content displayed in the first ROI306 may include features substantially similar to channel numbers (e.g., in the case of a sporting event or contest show where scores or other numeric information may be displayed). To reduce the likelihood of erroneously determining such display content to be a channel number, a second cue (indicator) may be employed, such as checking other areas of the display area 304 to confirm whether a channel number is displayed. For example, many exemplary televisions 300 and/or set top boxes 104 may also display a title guide in a title region 312 in synchronization with displaying a new channel number within the first ROI306, the title region 312 being disposed, for example, along the bottom of the display region 304. The title guide may include information such as the name and/or call sign of the current channel, the current time, a description of the program content being broadcast over the selected channel, the duration of the program content, and the like. In general, in the case where a title guide is displayed, the title area 312 is mainly a pure background color (e.g., blue) on which display information is superimposed. Thus, a known static area 314 (i.e., an area that typically has a constant display color/intensity if a title is displayed) may be defined in the title area 312, and the static area 314 may be used to indicate whether a title guide is currently being displayed. The static region 314 is also referred to hereinafter as the second ROI314 and may be associated with the display of a title guide.
For many exemplary televisions 300 and/or set top boxes 104, detection of the channel number in the first ROI306 and the title guide in the title region 312 (e.g., by detecting a static region in the second ROI 314) may be sufficient to indicate that the detection of the channel number is not erroneous. However, in other exemplary televisions 300 and/or set-top boxes 104, the title guide display in title area 312 may correspond to events other than channel changes. In these cases, other areas of the display area 304 may be examined to confirm the presence of a display channel number corresponding to a channel change event or other event. For example, the user may request that the set-top box 104 provide a title guide containing information corresponding to the current channel being viewed. The request may also cause the current channel number to be displayed simultaneously in the first ROI 306. Alternatively, the user may, for example, request that the set-top box 104 generate a title guide containing information corresponding to channels other than the channel currently being viewed. In the latter case, the user may enter a "title browsing" mode in which the user may browse through information content (e.g., title guide information) corresponding to a set of available channels other than the current channel while program content for the current channel is still displayed in the display area 304. In the title browsing process, the channel number of the channel whose title is browsed (instead of the current channel or a new channel after a channel change event) may be displayed in the first ROI 306. At the same time, the current channel number may be displayed in region 316, with region 316 located, for example, near the upper left corner of display region 304. Region 316 is also referred to hereinafter as the third ROI 316 and may be associated with the display of the current channel number during one or more title guide modes of operation. 
Thus, for example, the display content in the third ROI 316 may be examined to confirm that the detected channel number in the first ROI306 corresponds to a channel change event, or to determine that the channel number displayed in the first ROI306 corresponds to a title browsing event.
Fig. 4B includes an exemplary screen 320 illustrating title browsing. Different set-top boxes may employ different ROIs to display the channel 322 currently being viewed in the viewing channel ROI 324 and the title channel 326 in the title channel ROI 328 with associated descriptive information 328. Fig. 4B-4E also illustrate four exemplary viewing modes that a user may invoke with the set-top box 104. As shown in fig. 4B, a television ("TV") viewing mode 320 provides a viewer with a title 322, the title 322 including a network icon "HBO" 324, a set of digits 326 representing a currently viewed channel, and descriptive information 328 associated with the currently viewed channel 326. Fig. 4C illustrates a program guide mode 330 including a "television guide" icon 332 and a program information type 334. Fig. 4D illustrates an exemplary screenshot of a title browsing mode 336 that includes a set of digits 338 representing a currently viewed channel. The title browsing mode 336 also includes a title 340, a network icon 342, a channel number 344, and descriptive information 346, each of which relates to program content other than the currently viewed content. FIG. 4E illustrates a video-on-demand (VOD) mode 348 including an exemplary title 349 and an exemplary VOD icon 350.
Each of the television viewing mode 320, guide mode 330, title browsing mode 336, and/or VOD mode 348 includes a unique ROI, icon, symbol, and/or combination thereof. For example, while both the television viewing mode 320 and the title browsing mode 336 include titles (322 and 340, respectively), the presence of a set of digits 338 indicates that the user is browsing for titles rather than just watching television. Each viewing mode may be identified by training the OSMSD112 to monitor a particular ROI and identify the presence and/or absence of one or more icons, symbols, and/or numbers. As a result, viewer behavior may be determined through video and/or remote control usage, including determining that the viewer is using one or more media devices (e.g., DVD player, VCR, video game console, TiVo (registered trademark), etc.).
One of ordinary skill in the art will appreciate that the regions of interest (e.g., ROIs 306, 314, and/or 316) described above may be adapted to the particular set-top box 104 and/or television 110 employed in the television system 101. For example, another exemplary set top box 104 may have the channel number displayed in a first ROI306 located in the upper left corner of the display area 304 (rather than the upper right corner depicted in the exemplary television 300 of fig. 4A). Alternatively or additionally, the title region 312 may be disposed along the top of the display region 304 (rather than along the bottom as depicted in the exemplary television 300). Also during title browsing, some set top boxes 104 and/or televisions 110 may display the channel number to which the title is browsed in the title region 312 rather than in a separate location in the display region 304 (relative to the title browsing channel number displayed in the first ROI306 in the exemplary television 300). Typically, the number and characteristics of one or more regions of interest are calibrated during initial configuration of the audience measurement system 100. The corresponding calibration settings may be updated, for example, when a new component (e.g., a new set-top box 104) is installed in the television system 101. The calibration settings may also be updated by any or all of the central processing device 222, OSMSD112, and/or frame grabber 108 to improve the reliability of detecting a channel change event.
Fig. 5 shows a flow diagram illustrating an exemplary process 500 for determining the activity state of an audience. Process 500 may be performed by an OSMSD (e.g., the example OSMSD112 or 200) and may be implemented as one or more software programs stored in one or more memories and executed in any manner by one or more processors. However, some or all of the blocks of process 500 may be performed manually and/or by one or more hardware devices.
In general, the example process 500 scans one or more ROIs (e.g., the ROIs 306, 314, and 316 of fig. 4A or the ROIs 322, 324, 326, 332, and 338 of fig. 4B) associated with a predetermined number of consecutively acquired video frames to determine whether the user is viewing the broadcast media 320, viewing the guide device 330, conducting title browsing 336, and/or consuming the VOD service 348. In particular, when the user invokes any of the above viewing modes, the particular ROI combination will contain numbers, symbols and/or icons.
Exemplary process 500 begins at block 502, where frame grabber 108 of fig. 1 captures a screenshot corresponding to a frame of a video signal being delivered from, for example, set top box 104 to television 110. The captured screenshot may correspond to the entire video frame, or only a portion of the frame. The OSMSD112 extracts at least one ROI from the captured screenshot (block 504). For example, the OSMSD112 may extract and store ROIs related to titles, such as regions 322, 340, and 349, and/or extract ROIs related to Guide features 332. As discussed in further detail below, the numbers, symbols, and/or icons extracted from one or more ROIs can be modified to facilitate detection and/or save memory.
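By way of illustration only, the ROI extraction of block 504 may be sketched as a simple crop of the captured screenshot. The frame representation (a list of pixel rows), the `Roi` type, and the coordinates used here are assumptions made for this sketch and are not taken from the described system.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Roi:
    """Hypothetical rectangle describing a region of interest, in pixels."""
    top: int
    left: int
    height: int
    width: int

def extract_roi(frame, roi):
    """Return the sub-image of `frame` covered by `roi`.

    `frame` is assumed to be a list of rows, each row a list of pixel values.
    """
    rows = frame[roi.top:roi.top + roi.height]
    return [row[roi.left:roi.left + roi.width] for row in rows]

# Toy 3x4 "screenshot"; the ROI is its upper right 2x2 corner.
frame = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
]
upper_right = extract_roi(frame, Roi(top=0, left=2, height=2, width=2))
```

In practice the rectangle for each ROI would come from the calibration settings described above for the particular set-top box and/or television.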
If the OSMSD112 does not detect a feature of the on-screen title (e.g., 322, 340, and 349) (block 506), the OSMSD112 checks if a "TV guide" symbol 332 is present (block 508). If Guide symbol 332 is detected, then the current media device being used is set top box 104 and the current mode is Guide mode 330 (block 510). To reduce the likelihood of misinterpreting the media device being used and/or the viewer activity, the OSMSD112 may evaluate other ROIs. For example, if a television advertisement is being shown that includes a television guide symbol 332 and such a symbol happens to stay in the same screen position as shown in FIG. 4C, it may be erroneously detected as guide mode 330, when in fact the user is only watching broadcast television. Such problems may be avoided by analyzing one or more other ROIs indicative of one or more features of guide mode 330. In particular, the OSMSD112 may check for the program information 334 in a grid format, as shown in fig. 4C, to verify that guide mode is actually active. If a title is not present (block 506) and a television guide symbol is not present in its predetermined ROI (block 508), the current media source and/or viewing mode may not be known (block 512). Acquisition of further video frames may continue in order to look for viewing modes and/or media devices being used (block 502).
If a title is present on the screen (block 506), any of three exemplary modes may be active in the illustrated example. Specifically, in the case where the title (322, 340, 349) exists, the television viewing mode 320, the title browsing mode 336, or the VOD mode 348 may be active. The OSMSD112 may determine whether a set of digits 338 is present in the upper right corner of the screen, as shown in fig. 4D (block 514), and if so, determine that the viewer is browsing titles (block 516). While different set-top boxes may present title browsing in different ways, the current channel being viewed typically remains on-screen while the user reviews the alternate channel number 344 and corresponding program information 346. However, if the upper right corner of the screen does not include a set of digits 338 (block 514), the OSMSD112 checks whether a VOD symbol, such as VOD symbol 350 shown in fig. 4E, is present in the captured picture (block 518). In the illustrated example, the presence of a VOD symbol 350 and a title 349, together with the absence of a set of digits 338 in the upper right corner of the television screen, indicates that the viewer has invoked VOD (block 520). On the other hand, if the VOD symbol 350 is not present, the viewer may be watching a broadcast program or may have pressed an information button to obtain the title screen 322 containing the program information 328 (block 522). Process 500 may repeat (block 502) to continue monitoring for audience and/or media device usage.
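The decision sequence of blocks 506 through 522 may be summarized, purely as a non-limiting sketch, by the following function. The `RoiFlags` structure and the mode labels are hypothetical names introduced here; each flag is assumed to be the result of examining the corresponding ROI in the captured frame.

```python
from dataclasses import dataclass

@dataclass
class RoiFlags:
    """Hypothetical per-frame detection results, one flag per ROI check."""
    title_present: bool       # a title (322, 340, or 349) was detected (block 506)
    guide_symbol: bool        # the "TV guide" symbol 332 was detected (block 508)
    upper_right_digits: bool  # the set of digits 338 was detected (block 514)
    vod_symbol: bool          # the VOD symbol 350 was detected (block 518)

def classify_mode(flags: RoiFlags) -> str:
    """Map ROI detections to a viewing mode, following the order of process 500."""
    if not flags.title_present:
        # Blocks 508-512: guide mode if the guide symbol is present, else unknown.
        return "guide" if flags.guide_symbol else "unknown"
    if flags.upper_right_digits:
        return "title_browsing"  # block 516
    if flags.vod_symbol:
        return "vod"             # block 520
    return "tv_viewing"          # block 522

mode = classify_mode(RoiFlags(True, False, True, False))
```

The sketch omits the additional confirmation checks (such as verifying the program information grid 334 before accepting guide mode), which would be added as further flags.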
One or more viewer activities and corresponding media assets, including but not limited to channel changes initiated by the viewer with set-top box 104, title browsing, VOD, television guide services, video games played through set-top box 104 and/or a separate video game console, and/or multi-view display viewing, may be detected using frame grabber 108 and OSMSD 112.
OSMSD detection of channel change
The use of frame grabber 108 and OSMSD112 to detect a television channel change by monitoring on-screen activity of one or more channel change indicators will now be discussed based on the previous description of fig. 1 through 4E. Specifically, frame grabber 108, which receives substantially the same electrical signal as television 110, periodically grabs a "screen shot" of the image displayed on television 110. A screenshot is a digital representation of the image displayed on the television 110 at a particular moment in time. These screenshots include one or more regions of interest, such as ROIs 306, 314, and 316 of FIG. 4A, where on-screen activity (or lack thereof) may be used to determine whether a television channel change has occurred. As described above, many televisions 110, which may be controlled by set top box 104, are adapted to temporarily display a channel number in the upper right corner (e.g., ROI 306) (or some other predetermined region) of television display region 304 when a channel change occurs. Thus, in this example, the ROI306 may be used as the first ROI, and the ROI306 may be scanned horizontally and vertically and converted to a black and white image. The resulting image data may then be examined to determine whether one or more digits representing the channel number are displayed in the region. To determine whether the content displayed in the ROI306 represents a channel number and thus may indicate that a channel change has occurred, one or more features of the displayed content (including, for example, height, width, and content center) are detected and measured. These features are preprogrammed into the television 110 or set-top box 104 during manufacture and are used by the television 110 or set-top box 104 to control the manner in which channel numbers are displayed on the screen. 
Thus, displaying content with preprogrammed channel number characteristics in the first ROI306 may be used to indicate that a channel change may have occurred and to trigger a prompt on the people meter 116, for example.
Depending on the operating state of the television 110 and/or set-top box 104, displaying the channel number in the first ROI306 may not be sufficient to justify that a channel change has occurred. For example, the detected numbers displayed in the first ROI306 may be related to the program content being displayed, but not to a channel change. In this case, the displayed numbers may be closely similar to those associated with the actual channel number display. For example, a television advertisement may display on-screen a contact phone number that stays in the same ROI used to display tuned channels. In another example, a different television 110 and/or set-top box 104 may be capable of an activity called "title browsing" that allows a viewer to continue watching a first channel while causing the television (e.g., in title area 312 of fig. 4A) to display a title guide that contains information about programs available on one or more other channels at that time or at other times. The information contained in the title guide may include channel number information and/or program identification information (e.g., program name). The term "browse" is often used to describe this behavior because many television systems 101 allow viewers to quickly navigate through information about each of a plurality of channels by repeatedly pressing a designated button on remote control 114 or by holding a designated button in a pressed position for a predetermined minimum period of time. In addition, some televisions 110 and/or set-top boxes 104 that are capable of title browsing will also display the channel number whose title is being browsed (the channel for which the title guide is providing information) at the same screen location (e.g., ROI 306) where the channel number is displayed, in response to the channel change. 
In addition, the channel number whose title is browsed may be displayed using the same characteristics as those stored for displaying the channel number related to the actual channel change. Thus, detection of a digit in the first ROI306 may not correspond to a channel change, but may instead represent a channel whose information is being obtained through title browsing.
A false detection channel change determination is defined hereinafter as the detection of a digit in the primary ROI306 that does not correspond to an actual channel change event but is erroneously identified as representing a channel change. To reduce the likelihood of such a false detection, other regions of interest may be examined and/or an empty screen may be monitored. For example, a second ROI, such as a background region or a static region (e.g., ROI314 of FIG. 4A) may be examined. The static region ROI314 may constitute a portion of the television screen in which a solid color is displayed when a channel number is displayed in the first ROI 306. Thus, if the static region ROI314 does not display a solid background when a channel change is detected, the detected channel change may be a false detection. However, as described above, some televisions 110 and/or set-top boxes 104 display channel numbers in the primary ROI306 when the viewer is browsing titles, in which case the static region ROI314 also displays a solid background. The channel number displayed in the primary ROI306 during such title browsing does not indicate the channel number of the channel currently being viewed, nor does it indicate that a channel change has occurred. Instead, the channel number displayed represents a channel other than the channel currently being viewed, for which the viewer seeks information.
To ensure that the title browsing number is not mistaken for a channel number representing an actual channel change event, a third ROI (e.g., ROI 316 of fig. 4A) may also be checked. This third region of interest may be located, for example, in the upper left corner of the television screen because many televisions 110 and/or set top boxes 104 are configured to display the actual channel number in the upper left corner of the display region 304 (e.g., ROI 316) when the title browsing channel number is being displayed in the upper right corner of the display region 304 (e.g., ROI 306). Thus, if a channel number is detected in both the primary ROI306 and the tertiary ROI 316, the channel number detected in the primary ROI306 does not represent a channel change event.
Once a channel number is detected in the first ROI306, it may be compared to a channel number displayed prior to the currently displayed channel number. If the previously displayed channel number is the same as the channel number currently being displayed, a channel change has not occurred. It is in fact most likely that the channel number currently being viewed is displayed for the purpose of obtaining information. In contrast, a channel change may occur if the previously displayed channel number is different from the channel number currently being displayed.
Fig. 6A provides a flow chart illustrating an exemplary process 600 for determining whether a channel change event has occurred. Process 600 may be performed by an OSMSD, such as exemplary OSMSD112 or 200, and may be implemented as one or more software programs stored in one or more memories and executed in any manner by one or more processors. However, some or all of the blocks of process 600 may be performed manually and/or by one or more hardware devices. Although process 600 is described with reference to the flowchart illustrated in FIG. 6A, persons of ordinary skill in the art will readily appreciate that many other methods of performing process 600 may be used. For example, the order of many of the blocks may be changed, the performance of one or more blocks may be combined, and/or one or more blocks may be eliminated.
In general, the example process 600 scans one or more ROIs (e.g., ROIs 306, 314, and 316 of fig. 4A) associated with a predetermined number of consecutively acquired video frames to determine whether the consecutively acquired frames all display digits representing a channel change. In the illustrated example, the predetermined number of frames corresponds to the number of consecutive frames in which the channel number is displayed when a channel change occurs, but other numbers of frames may be employed. In particular, when a channel change occurs, the television 110 and/or set-top box 104 typically displays the reselected channel number for a predetermined length of time sufficient to enable the channel number to be viewed and identified by the viewer. The predetermined length of time will control the number of consecutive frames in which the channel number is to be displayed and will vary depending on the settings of the television 110 and/or set top box 104 being monitored. If such a digit is detected in a first ROI (e.g., ROI 306), one or more other regions of interest associated with the same video frame may also be examined to reduce the likelihood that the detected digit will result in a false detection determination.
Exemplary process 600 begins at block 602, where frame grabber 108 of FIG. 1 grabs a screenshot corresponding to a frame of a video signal delivered to television 110 from, for example, set top box 104. The captured screenshot may correspond to the entire video frame, or only a portion of the frame. Next, the OSMSD112 extracts one or more ROIs from the captured screenshots (block 604). For example, the OSMSD112 may extract three separate ROIs (e.g., ROIs 306, 314, and 316 of fig. 4A) and store in memory. In the extraction process, the ROI is converted into a grayscale image by any type of method. The resulting grayscale image may be further converted into sets of digital data representing the grayscale image in any format type. Control then proceeds to block 606 where the OSMSD112 scans the first ROI306 horizontally and vertically, determining whether digits are present (block 606). Specifically, the horizontal scan of the first ROI306 is performed by reading the corresponding screen capture data from left to right (or vice versa) pixel by pixel, and the vertical scan is performed by reading the frame data from top to bottom (or vice versa) pixel by pixel. During the scan, the grayscale image corresponding to the ROI306 is converted to a black and white (e.g., binary) image based on one or more quantization thresholds. This conversion process is discussed in more detail below in conjunction with fig. 7 and 8.
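As a minimal sketch of the conversion and scanning described for block 606, the following example quantizes a grayscale ROI to a binary image with a single threshold and then reads it row by row and column by column. It assumes, for illustration only, that digits are brighter than the background and that pixel values lie in the range 0 to 255; the function names are hypothetical.

```python
def binarize(gray, threshold):
    """Quantize a grayscale image to 0/1: 1 where the pixel is at or above the threshold."""
    return [[1 if px >= threshold else 0 for px in row] for row in gray]

def row_profile(binary):
    """Horizontal scan: count of foreground pixels in each row, read left to right."""
    return [sum(row) for row in binary]

def column_profile(binary):
    """Vertical scan: count of foreground pixels in each column, read top to bottom."""
    return [sum(col) for col in zip(*binary)]

# Toy ROI containing one bright vertical stroke on a dark background.
gray_roi = [
    [10, 200, 10],
    [10, 200, 10],
]
bw = binarize(gray_roi, threshold=128)
rows = row_profile(bw)
cols = column_profile(bw)
```

The row and column profiles give a compact view of where foreground pixels lie, which later checks (height, width, gaps) can operate on.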
The black and white image data is checked for features indicating the presence of one or more digits. Specifically, referring to fig. 9, each digit may be divided into one or more parts (components). By detecting one or more of the portions and measuring/recording one or more characteristics of the portions, the OSMSD112 can detect whether one or more digits formed from such portions are present. For example, depending on one or more parameters/characteristics stored in the television 110 and/or set-top box 104, the channel numbers and portions thereof displayed in the first ROI306 will have a predetermined width/height or range of widths/heights, such that any content displayed with a width/height not equal to the predetermined width/height or with a width/height outside the predetermined range will not be considered a potential portion and thus may be discarded. Similarly, the Arabic numerals (i.e., 1, 2, 3, 4, etc.) have no horizontal gaps, and thus any content with such gaps is not considered a potential digit and can be discarded. A horizontal gap is defined as an area comprising a horizontally arranged row of pixels extending over the entire width of the digit and located between the top and bottom edges of the digit. The values of the pixels within the horizontal gap are substantially equal to each other, but substantially different from the values of the pixels corresponding to the displayed digits.
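The size and horizontal-gap tests described above may be illustrated by the following sketch, which operates on a binary image in which foreground pixels are 1. The height and width ranges stand in for the stored display characteristics and are hypothetical values; the function names are likewise introduced only for this example.

```python
def bounding_box(binary):
    """Return (top, bottom, left, right) of the foreground pixels, or None if empty."""
    rows = [r for r, row in enumerate(binary) if any(row)]
    cols = [c for c, col in enumerate(zip(*binary)) if any(col)]
    if not rows:
        return None
    return rows[0], rows[-1], cols[0], cols[-1]

def has_horizontal_gap(binary):
    """True if some row between the top and bottom of the content is entirely background."""
    box = bounding_box(binary)
    if box is None:
        return False
    top, bottom, left, right = box
    return any(not any(binary[r][left:right + 1]) for r in range(top, bottom + 1))

def is_candidate_digit(binary, height_range, width_range):
    """Accept content whose size is within the given ranges and that has no horizontal gap."""
    box = bounding_box(binary)
    if box is None:
        return False
    top, bottom, left, right = box
    height, width = bottom - top + 1, right - left + 1
    in_range = (height_range[0] <= height <= height_range[1]
                and width_range[0] <= width <= width_range[1])
    return in_range and not has_horizontal_gap(binary)

stroke = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]   # resembles the digit "1"
equals = [[1, 1], [0, 0], [1, 1]]            # resembles "=", which has a horizontal gap
```

Here `stroke` passes the test for suitable ranges, while `equals` is rejected because its middle row is empty across the full width of the content.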
Similarly, the manner in which the content is centered in the first ROI306 can be compared to the stored character centering features. Content may represent a digit if the content is centered according to a stored character centering feature. After performing the horizontal scan or the vertical scan, or both, and checking the mapping of the resulting digital data, the OSMSD112 uses the check results to increment the digit detection counter (in the case where a digit is detected) or clear the digit detection counter (in the case where a digit has not been detected).
As will be appreciated by those of ordinary skill in the art, the number of features that must be detected before the displayed content is confirmed as a digit (thereby causing the digit detection counter to increment) may vary depending on the certainty required by the system performing process 600. For example, as the number of detected features consistent with a digit display increases, the certainty that the displayed content is actually a digit increases. Processing complexity may also increase as the examination of the region of interest becomes more comprehensive. For example, process 600 may require that the mapping of the digital data contain content consistent with stored height, width, spacing, and/or character centering features.
As will be appreciated by one of ordinary skill in the art, the first ROI306, and for that matter any ROI, may be scanned/examined by any type of method capable of monitoring one or more digit portions and/or any characteristics of digit portions to determine if digits are present. Such methods may include examining the location/position of any edge of the portions, the height of any portion, the width of any portion, the number and/or width of any vertical gaps separating the portions (used to determine whether these gaps are equal to or within an expected and predetermined distance between two digits representing a multi-digit number), the absence of any horizontal gaps, etc. One of ordinary skill in the art will readily appreciate that other features/parameters specific to set-top box 104 and/or television 110 may be used to determine whether a digit displayed in a channel number display region (e.g., ROI 306) should be considered a channel number.
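One of the techniques listed above, the examination of vertical gaps between portions, may be sketched as follows. The example splits a binary ROI into column spans separated by empty columns and checks that each gap lies within an expected range; the range limits are hypothetical values standing in for the stored inter-digit spacing.

```python
def split_on_vertical_gaps(binary):
    """Return (start_col, end_col) spans of consecutive columns that contain foreground."""
    spans = []
    start = None
    for c, col in enumerate(zip(*binary)):
        if any(col):
            if start is None:
                start = c          # a new portion begins at this column
        elif start is not None:
            spans.append((start, c - 1))  # an empty column closes the portion
            start = None
    if start is not None:
        spans.append((start, len(binary[0]) - 1))
    return spans

def gaps_within(spans, min_gap, max_gap):
    """True if every gap between neighboring spans is within [min_gap, max_gap] columns."""
    gaps = [right[0] - left[1] - 1 for left, right in zip(spans, spans[1:])]
    return all(min_gap <= g <= max_gap for g in gaps)

two_parts = [
    [1, 1, 0, 0, 1],
    [1, 0, 0, 0, 1],
]
spans = split_on_vertical_gaps(two_parts)
```

For `two_parts`, the two portions are separated by a two-column gap, so `gaps_within(spans, 1, 3)` accepts the spacing while `gaps_within(spans, 3, 4)` rejects it.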
In the example of fig. 6A, if the OSMSD112 determines that the content displayed in the first ROI306 is unlikely to be a digit because the features of the detected portion (if any) are not consistent with the stored features (block 608), the OSMSD112 checks whether another quantization threshold can be used to quantize the ROI image to black and white format. As discussed in more detail below, different quantization thresholds may produce a black and white ROI with better edge definition. If another threshold can be used (block 609), then control returns to block 606, where the OSMSD112 reconverts the grayscale ROI image to a black and white image and again scans the image to determine whether digits are present. Control then proceeds as described above. However, if no other threshold is available (block 609), the OSMSD112 clears the digit detection counter (block 610), and control returns to block 602. Conversely, if the OSMSD112 determines that the characteristics of the displayed portion are consistent with the stored characteristics for displaying the digits of the channel number (block 608), the OSMSD112 may increment a continuity check counter (block 612) to indicate that the channel number is being displayed in the currently acquired frame.
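The retry loop of blocks 606 through 609 may be illustrated, without limitation, by trying each available quantization threshold in turn until the digit test succeeds. The `looks_like_digits` callback is a hypothetical stand-in for the feature checks of block 608, and the threshold values are arbitrary.

```python
def detect_with_thresholds(gray, thresholds, looks_like_digits):
    """Try each threshold in order; return the first one for which digits are detected.

    Returns None if no threshold yields a binary image that passes the digit test,
    in which case the caller would clear the digit detection counter (block 610).
    """
    for threshold in thresholds:
        binary = [[1 if px >= threshold else 0 for px in row] for row in gray]
        if looks_like_digits(binary):
            return threshold
    return None

# Toy stand-in for block 608: accept images with exactly two foreground pixels.
def two_foreground_pixels(binary):
    return sum(sum(row) for row in binary) == 2

gray_roi = [[50, 120], [120, 50]]
chosen = detect_with_thresholds(gray_roi, [200, 100], two_foreground_pixels)
```

In this toy case the first threshold (200) leaves no foreground pixels, and the second (100) produces an image that passes the test.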
In the event that a digit is detected, which in turn increments a digit detection counter, the OSMSD112 determines whether a second ROI, such as the background region 314 of fig. 4A, can be scanned (block 614). For example, the OSMSD112 may read a location in the memory 208 to determine whether the second ROI314 can be scanned. In the example of fig. 6A, if digits are detected in the primary ROI306 at block 608, control passes to block 614. Block 614 may be used to initiate a procedure for detecting a false positive by checking whether information is displayed in the secondary ROI314 that would cause the digits detected at block 608 to be valid or invalid. When the digits representing the channel number are actually displayed in the first ROI306, the second ROI314 may be defined to include a portion of the title region 312 in which the title guide is programmed to appear in response to a user prompt entered via, for example, the remote control 114. As described above, the title guide may be configured to provide information about programs available on the currently displayed channel or available on other channels. To indicate whether a title guide is present, for example as a result of a channel change event, the second ROI314 may be defined as a region within the title region 312 that would remain stationary (i.e., pure background color) if the title guide were active.
Thus, since in the illustrated example the secondary ROI314 will only contain a solid background when the primary ROI306 contains a channel number, the OSMSD112 scans the secondary ROI314 for edges (block 616). The presence of edges indicates that the secondary ROI314 is not filled with only background color, and thus that the digits detected in the primary ROI306 represent a false detection. Similar to the processing of the first ROI306, in the scanning process, the grayscale image corresponding to the second ROI314 is converted to a black and white image based on one or more quantization thresholds. If no edge is present in the second ROI314 (block 618), thereby indicating that the background region is stationary (as expected when the channel number is displayed), the OSMSD112 may confirm/validate the decision made at block 608 (i.e., that the channel number is being displayed). Conversely, if an edge is detected at block 618, control proceeds to block 610 and blocks thereafter (as described above), where the digit detection counter is cleared to indicate that a channel number has not been detected in the current frame and another video frame is acquired for checking (block 602). As one of ordinary skill in the art will appreciate, there are many methods to perform edge detection, and the OSMSD112 may use any of these methods to determine whether the second ROI314 contains an edge. Further similar to the processing of the first ROI306, in the absence of detected edges, if multiple quantization thresholds are available, the OSMSD112 may attempt to convert and/or scan again the black and white image corresponding to the second ROI314 before determining that there are no edges in the region.
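Because any edge detection method may be used, the following is only one simple possibility, chosen for this sketch rather than taken from the described system: in a binary image, an edge is treated as any pair of horizontally or vertically adjacent pixels with different values.

```python
def has_edge(binary):
    """True if any two horizontally or vertically adjacent pixels differ."""
    for row in binary:
        if any(a != b for a, b in zip(row, row[1:])):
            return True
    for upper, lower in zip(binary, binary[1:]):
        if any(a != b for a, b in zip(upper, lower)):
            return True
    return False

solid_background = [[0, 0, 0], [0, 0, 0]]   # static region, as expected with a title guide
program_content = [[0, 0, 0], [0, 1, 0]]    # varying content, so an edge is present
static_ok = not has_edge(solid_background)
```

Under this definition a uniformly filled region has no edges regardless of whether it quantizes to all 0 or all 1, which matches the expectation of a solid background in the static region.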
If no edge is detected at block 618, the OSMSD112 determines whether a third ROI, such as ROI 316 of fig. 4A, can be scanned (block 620). The third ROI 316 may be scanned, for example, to further verify that the digits detected in the first ROI306 represent a channel number and correspond to a channel change event. A user typically browses the title guide to obtain information about available programs without actually changing the currently tuned channel. While the title guide is displayed, the television 110 and/or set-top box 104 may be configured to display the channel number being browsed (for which information is being sought) in the first ROI306 and the current channel number being viewed in the third ROI 316. Thus, the presence of digits in the third ROI 316 may indicate that digits detected in the first ROI306 do not represent a channel change, but are instead related to a title browsing event.
If the OSMSD112 determines at block 618 that there are no edges in the second ROI314 (i.e., the background region is stationary), or if the second ROI314 cannot be scanned, the OSMSD112 may determine whether the third ROI 316 can be scanned (block 620). The OSMSD112 may read a location in the memory 208 again to determine if such a scan is possible. If so, the OSMSD112 scans the third ROI 316 to determine if digits are being displayed. Similar to the processing of the first ROI306, in the scanning process, the grayscale image corresponding to the third ROI 316 is converted to a black-and-white image based on one or more quantization thresholds. As described above, some televisions 110 and/or set-top boxes 104 are configured to display a channel number in the third ROI 316 (e.g., the upper left corner of the display region 304) when the browsing guide channel is displayed in the first ROI306 (e.g., the upper right corner of the display region 304). Thus, if digits having channel number display characteristics are displayed in the third ROI 316, the digits detected in the first ROI306 may represent a title browsing channel number rather than the actual channel number (block 622). If the first ROI306 contains a channel number and the third ROI 316 contains a channel number, the decision made for the first ROI306 may indicate a false positive. To detect the display of digits in the third ROI 316, any of the methods/tests described above for examining the first ROI306 may be used.
Thus, if one or more digits are detected in the third ROI 316 (block 624), the OSMSD112 clears the consecutive detection counter (block 610). If no digits are detected (block 624), or the third ROI 316 cannot be scanned (block 620), then the channel numbers detected in the first ROI306 are not considered false detections (i.e., they are considered valid detections). (It should be noted that, similar to the processing of the first ROI306, in the event that no digits are detected, and if multiple quantization thresholds are available, the OSMSD112 may attempt to re-convert and/or re-scan the black and white image corresponding to the third ROI 316 before determining that no digits are present in the region.) In this case, the OSMSD112 determines whether the channel number displayed in the first ROI306 has been displayed for at least a minimum number of consecutive frames by determining whether the consecutive detection counter exceeds a predetermined threshold (block 626).
If the consecutive detection counter does not exceed the predetermined threshold (block 626), then the channel number detected in the first ROI306 has not yet been displayed for a sufficient number of video frames, so control returns to block 602 and subsequent blocks to capture another video frame as described above. If the consecutive detection counter exceeds the predetermined threshold (block 626), the channel number detected in the first ROI306 has been displayed for a sufficient number of frames, and control proceeds to block 628. At block 628, the OSMSD112 checks whether the channel number currently displayed in the first ROI306 matches the channel number acquired in connection with the most recent channel change. The OSMSD112 performs this operation by comparing the currently acquired ROI image containing the current channel number with a reference image or template containing the most recently displayed channel number. If the currently captured channel number image matches the previously captured reference channel number image/template, no channel change has occurred. Rather, the viewer has most likely caused the channel number of the currently viewed channel to be displayed for informational purposes. Accordingly, the OSMSD112 clears the consecutive detection counter (block 610) and control returns to capturing another video frame (block 602). If the currently displayed channel number does not match the previously displayed channel number, a channel change has occurred. In this case, the OSMSD112 stores the current channel number ROI image for use as a reference image/template for comparison with subsequent channel number displays (block 630). The OSMSD112 then reports a channel change event (block 632). Further, as described above, the reported channel change event (block 632) may cause the OSMSD112 to send information to the people meter 116 of fig. 1, causing the people meter 116 to prompt the audience members to indicate their presence in the audience.
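The counter and reference-image logic of blocks 610, 612 and 626-632 can be sketched as follows. This is a minimal illustration, not the implementation described in the specification: the class name, the `min_consecutive_frames` default, and the use of exact byte equality in place of the image comparison are all assumptions made for brevity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChannelChangeDetector:
    """Hypothetical helper tracking consecutive channel-number detections."""
    min_consecutive_frames: int = 3      # assumed value of the predetermined threshold
    counter: int = 0                     # consecutive detection counter
    reference: Optional[bytes] = None    # ROI image stored at the last channel change

    def process_frame(self, roi_image: Optional[bytes]) -> bool:
        """Return True when a channel change event should be reported.

        `roi_image` is the quantized ROI for this frame, or None when no
        valid channel number was detected (including false detections).
        """
        if roi_image is None:
            self.counter = 0             # block 610: clear the counter
            return False
        self.counter += 1                # block 612: one more consecutive frame
        if self.counter <= self.min_consecutive_frames:
            return False                 # block 626: not displayed long enough yet
        if roi_image == self.reference:  # block 628 (exact equality is a simplification)
            self.counter = 0             # same channel redisplayed: no change
            return False
        self.reference = roi_image       # block 630: store the new reference
        return True                      # block 632: report the channel change
```

A caller would feed one ROI per captured frame and act on a `True` return, for example by prompting the people meter.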
In addition to reporting a channel change at block 632, process 600 may also save and/or report the channel number being displayed, for example, for channel identification purposes. One of ordinary skill in the art will appreciate that there are many ways to save and/or report channel numbers. For example, digitized images of the channel number may be collected and stored in memory 208 for later transmission to the central processing device 222 over the network 218. As another example, the OSMSD112 may convert digitized images of channel numbers to numerical values via any image recognition technique. The value may be stored in the memory 208 for later transmission to the central processing device 222 via the network 218. In another example, the OSMSD112 may compare the captured channel number image to a series of stored templates representing images of all possible channel numbers until a match is detected. The digital information identifying the channel number represented by the matching template may then be stored in memory 208 and/or transmitted over network 218 and used to report the current channel number to central processing device 222. The central processing device 222 may use the reported channel numbers to determine viewing statistics related to, for example, monitored viewers, viewed channels, viewed programs, etc. Channel number identification is discussed in more detail below in conjunction with the description of fig. 11A, 11B, and 11C.
Another exemplary process 650 for determining whether a channel change has occurred is shown in the flowchart of fig. 6B. The exemplary process 650 is similar to the exemplary process 600, but also includes blocks that use information from the remote control device 114 to reduce the likelihood of a false positive determination. Because there is much overlap between the flow diagrams in fig. 6A and 6B, substantially similar blocks that appear in both diagrams will not be described again here for the sake of brevity. Instead, the interested reader is referred to the above description of fig. 6A for a complete description of the corresponding blocks. To assist the reader, substantially similar blocks are labeled with the same reference numerals in the figures. Further, to reduce the complexity of fig. 6B, the retry block 609 in fig. 6A is omitted from the exemplary process 650. However, one of ordinary skill in the art will recognize that the same retry procedure may be inserted into process 650 at a location similar to its location in process 600 in fig. 6A, or at any other suitable location.
Comparing fig. 6B with fig. 6A, in the event that a digit is detected that causes the digit detection counter to increment (block 612), the OSMSD112 executing the process 650 of fig. 6B determines whether the remote control detector 213 is enabled (block 652). For example, the OSMSD112 may read a location in the memory 208 to determine whether signals detected from the remote control device 114 should be used in deciding whether a channel change event has occurred. If the remote control detector 213 is available (block 652), the OSMSD112 determines whether the remote control detector 213 received and/or decoded a signal from the remote control device 114 in a time window before the digit was detected (block 654). If the remote control detector 213 received and/or decoded a signal (block 654), or the remote control detector 213 is unavailable (block 652), then control passes to block 614 and blocks thereafter, as described above. If the remote control detector 213 did not receive a signal (block 654), control proceeds to block 610 and blocks thereafter (as described above), where the digit detection counter is cleared to indicate that a channel number has not been detected within the current frame, and another video frame is acquired for examination.
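The remote control gate of blocks 652 and 654 can be sketched as a simple time-window test. The function names, the timestamp representation (seconds as floats) and the 5-second default window are assumptions for illustration; the specification does not give a window length.

```python
from typing import Iterable


def remote_signal_in_window(signal_times: Iterable[float],
                            detection_time: float,
                            window_seconds: float = 5.0) -> bool:
    """True if any remote control signal arrived within the window
    immediately before the digits were detected (block 654)."""
    return any(0.0 <= detection_time - t <= window_seconds for t in signal_times)


def continue_to_block_614(detector_enabled: bool,
                          signal_times: Iterable[float],
                          detection_time: float,
                          window_seconds: float = 5.0) -> bool:
    """True: continue validating the detection (block 614).
    False: treat it as a false detection and clear the counter (block 610)."""
    if not detector_enabled:             # block 652: detector unavailable
        return True
    return remote_signal_in_window(signal_times, detection_time, window_seconds)
```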
Those of ordinary skill in the art will appreciate that information received from a remote control device may be used anywhere in the exemplary processes 600 and 650 to assist in identifying a channel change event. For example, the remote detection procedure of blocks 652 and 654 of fig. 6B may be used after a digit is detected in the first ROI306 (e.g., after block 612). Alternatively, the remote detection procedure may be used after the second or third ROI (314 or 316, respectively) is examined (e.g., blocks 618 or 624), or as an initial step in the process 600 or 650 prior to examining the first ROI 306. Similarly, detection of remote control information may be used as a trigger to perform one or more blocks of process 600 or 650.
Image quantization
Although the above examples and discussion refer to the detection of channel numbers, the OSMSD112 is not limited thereto and may also detect symbols and/or icons. In particular, the OSMSD112 may detect a symbol, such as the "tv guide" symbol 332 shown in fig. 4B, and/or detect a network broadcast icon, such as the "HBO" icon 324 shown in fig. 4B. An exemplary ROI 660 representative of the ROIs 306 and/or 316 of fig. 4A is illustrated in fig. 7. As described above, the ROI 660 may be extracted by the OSMSD112 from the screen shots acquired by the frame grabber 108. To reduce memory storage and/or processing requirements, the grayscale ROI 660 can be quantized to a black and white (e.g., binary) image 700 as shown in fig. 8. In the example of fig. 7, the ROI 660 includes a channel number 664, a solid background (static region) 670, and a region 672 located between two digits. In fig. 8, the same exemplary ROI, after conversion by the OSMSD112 from grayscale to black and white, is illustrated as ROI 700. As can be seen in fig. 7 and 8, the lightly shaded channel number 664 is converted to a white channel number 704. The dark shaded background 670 and the region 672 between the digits are converted to black regions 710 and 712, respectively. To perform the conversion from the ROI 660 to the ROI 700, light and/or dark thresholds may be used to set intensity levels above which all pixels are converted to white and below which all pixels are converted to black. Using such a quantization threshold to convert a grayscale (or color) image to a black and white image simplifies the edge detection and/or channel number comparison procedures.
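The grayscale-to-binary conversion just described can be sketched in a few lines. The representation is an assumption: the ROI is taken to be a list of rows of 8-bit intensities, white is encoded as +1 and black as -1 (the convention used later for template comparison), and a pixel exactly at the threshold is treated as black.

```python
from typing import List

Gray = List[List[int]]     # rows of grayscale intensities, assumed 0-255
Binary = List[List[int]]   # rows of +1 (white) / -1 (black)


def quantize(gray: Gray, threshold: float) -> Binary:
    """Convert a grayscale ROI to black and white using one threshold."""
    return [[1 if pixel > threshold else -1 for pixel in row] for row in gray]


# Light digits on a dark background become +1 on -1.
roi = [[20, 220, 30],
       [25, 230, 35]]
binary = quantize(roi, threshold=128)
```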
For edge detection (for channel number detection in ROIs 306 and/or 316 or feature detection in static region ROI 314), the corresponding procedure identifies columns and/or rows of pixels that change from black to white (or from white to black, depending on the display polarity used by the particular set top box 104) as the image is scanned vertically and/or horizontally. In addition, an optional polarity parameter may be provided to the OSMSD112 to initialize the edge detection procedure with the expected dominant background color (e.g., white or black) of a particular set top box 104. The corresponding procedure for channel number comparison is discussed in more detail below.
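One simple way to realize the scan described above is to look for neighbouring pixels of different color along every row and every column. The sketch below assumes a non-empty rectangular binary image encoded as +1/-1; the function names are illustrative, and the optional polarity parameter is omitted because a transition in either direction is counted.

```python
from typing import List, Sequence

Binary = List[List[int]]   # rows of +1 (white) / -1 (black)


def count_transitions(line: Sequence[int]) -> int:
    """Number of black/white changes between neighbouring pixels in a line."""
    return sum(1 for a, b in zip(line, line[1:]) if a != b)


def has_edges(binary: Binary) -> bool:
    """True if any row (horizontal scan) or column (vertical scan) has a transition."""
    in_rows = any(count_transitions(row) > 0 for row in binary)
    in_cols = any(count_transitions(col) > 0 for col in zip(*binary))
    return in_rows or in_cols


def edge_columns(binary: Binary) -> List[int]:
    """Column indices at which at least one row changes color (vertical edges)."""
    cols = set()
    for row in binary:
        for c in range(1, len(row)):
            if row[c] != row[c - 1]:
                cols.add(c)
    return sorted(cols)
```

For a static-background ROI such as ROI 314, `has_edges` returning `False` corresponds to the "no edge" outcome at block 618.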
Depending on the characteristics of the background region in one or more regions of interest (e.g., ROIs 306 or 316) and the program content displayed in display region 304, a single quantization threshold may not be sufficient to produce robust performance. For example, if the background region 308 in the first ROI306 is transparent or translucent, the intensity difference between the displayed channel number 310 and the background 308 will depend on the intensity of the displayed program content that coincides with the first ROI 306. For example, if the channel number 310 is represented by a light color (e.g., white), then overlapping program content having a relatively low intensity in the first ROI306 may require a different quantization threshold than content having a higher intensity in that region. The exemplary process 600 of fig. 6A thus supports the use of multiple quantization thresholds, as represented by block 609.
In the exemplary process 600, the quantization threshold may be selected and used as follows. First, the baseline threshold is calibrated based on display characteristics of the set-top box 104 and/or the television 110. Many set top boxes 104 and televisions 110 may be interrogated to provide a baseline (reference) gray scale value. Next, at each point in the process 600 where the image is quantized to black and white format, a first quantization attempt may be performed with an initial quantization threshold. If a positive result is obtained based on the quantized image (e.g., a number is detected in the case of ROIs 306 and/or 316 or an edge is detected in the case of ROI 314), control proceeds to an appropriate subsequent point in the procedure. Conversely, if a negative result is obtained, another threshold is tried. For example, other thresholds (e.g., thresholds that are +/-4%, +/-10%, etc. from the initial threshold) may be determined relative to the initial threshold. Multiple attempts may be made to quantize the image until a set of quantization thresholds are exhausted, at which point control will follow a path corresponding to a negative result for a corresponding point in process 600. One of ordinary skill in the art will recognize that many techniques may be used to determine the one or more quantization thresholds, of which the foregoing is one example.
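The retry scheme can be sketched as a loop over candidate thresholds. The offsets below follow the +/-4% and +/-10% example in the text, interpreted here as fractions of the baseline (an assumption); the `detector` callback stands in for whichever digit or edge test applies at that point in the process, and all names are hypothetical.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Gray = List[List[int]]
Binary = List[List[int]]


def candidate_thresholds(baseline: float,
                         offsets: Sequence[float] = (0.0, 0.04, -0.04, 0.10, -0.10)
                         ) -> List[float]:
    """Initial threshold first, then thresholds offset relative to it."""
    return [baseline * (1.0 + offset) for offset in offsets]


def detect_with_retries(gray: Gray,
                        baseline: float,
                        detector: Callable[[Binary], bool]
                        ) -> Tuple[bool, Optional[float]]:
    """Quantize with each candidate threshold until the detector succeeds.

    Returns (True, threshold) on the first positive result, or (False, None)
    once the set of thresholds is exhausted.
    """
    for threshold in candidate_thresholds(baseline):
        binary = [[1 if p > threshold else -1 for p in row] for row in gray]
        if detector(binary):
            return True, threshold
    return False, None


def row_has_edge(binary: Binary) -> bool:
    """Toy detector: any horizontal black/white transition."""
    return any(a != b for row in binary for a, b in zip(row, row[1:]))
```

With a dim digit on a dark background, the initial threshold may quantize the whole ROI to black, and only a lower candidate exposes the digit.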
Image features
Fig. 9 illustrates an exemplary set of features associated with the ROI 700 of fig. 8 that may be detected and/or examined to determine whether the displayed content constitutes a channel number display. For example, the distance between the upper horizontal edge 802 and the lower horizontal edge 804 defines a character height 818, and the character height 818 may be compared to upper and/or lower desired character height thresholds in determining whether an on-screen channel number is present. The distances between opposing vertical edges 805, 806, 808, 810, 812, 814 define character widths 824, 828, 832 that can be compared to upper and/or lower desired character width thresholds in determining whether an on-screen channel number is present. Further, the distances between the opposing vertical edges 805, 806, 808, 810, 812, 814 can be used to define intervals 826, 830 between characters that can be compared to upper and/or lower desired character interval thresholds in determining whether an on-screen channel number is present. Further, the distances between the appropriate edges may be used to define the centers 816, 820, 822, 834 of the displayed characters, which may be compared to upper and/or lower desired character center thresholds in determining whether an on-screen channel number is present. As will be appreciated by those of ordinary skill in the art, one or more edge detection techniques may be used to identify one or more edge locations, and then the distances between these edges may be used to define a digit height, width, etc. Alternatively, a group of pixels having the same color feature, which are disposed adjacent to each other and extend in a specific direction (e.g., a horizontal direction), may be counted to determine the width of the digit. Similarly, adjacent pixels of a common color extending in a vertical direction may be counted to determine the height of the digit.
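The pixel-counting alternative mentioned above can be sketched by projecting the binary ROI onto its columns and rows and measuring runs of "ink". The sketch assumes a non-empty rectangular image with white (+1) digits on a black (-1) background; the function names and the `(low, high)` range arguments are illustrative, not taken from the specification.

```python
from typing import Dict, List, Sequence, Tuple

Binary = List[List[int]]


def runs(flags: Sequence[bool]) -> List[Tuple[int, int]]:
    """Return (start, length) for each run of consecutive True values."""
    out, start = [], None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            out.append((start, i - start))
            start = None
    if start is not None:
        out.append((start, len(flags) - start))
    return out


def character_metrics(binary: Binary) -> Dict[str, object]:
    """Character widths, gaps between characters, and overall character height."""
    cols_with_ink = [any(row[c] == 1 for row in binary) for c in range(len(binary[0]))]
    rows_with_ink = [any(p == 1 for p in row) for row in binary]
    chars = runs(cols_with_ink)
    widths = [length for _, length in chars]
    gaps = [chars[i + 1][0] - (chars[i][0] + chars[i][1]) for i in range(len(chars) - 1)]
    ink_rows = runs(rows_with_ink)
    height = (ink_rows[-1][0] + ink_rows[-1][1] - ink_rows[0][0]) if ink_rows else 0
    return {"widths": widths, "gaps": gaps, "height": height}


def looks_like_channel_number(m, width_range, height_range, gap_range) -> bool:
    """Compare each metric to lower/upper thresholds, given as (low, high) pairs."""
    def within(value, bounds):
        return bounds[0] <= value <= bounds[1]
    return (bool(m["widths"])
            and all(within(w, width_range) for w in m["widths"])
            and within(m["height"], height_range)
            and all(within(g, gap_range) for g in m["gaps"]))
```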
After the channel change detection process (e.g., process 600 or 650 described previously) detects a channel number in the appropriate ROI (e.g., ROI 306) based on, for example, the above-described features, and determines that the displayed number does not correspond to a headline navigation event, the process compares the detected channel number to the previous channel number (e.g., block 628 of fig. 6A). Although any technique may be used to perform this comparison, the preferred method is based on computing the dot product between the quantized image of the ROI containing the current channel number and the stored reference image/template corresponding to the previous channel number. As described above, the ROI containing the current channel number is captured by the frame grabber 108 and converted (quantized) to a black and white image. Similarly, the stored reference image/template corresponding to the previous channel number is also a black and white image that may correspond to, for example, a previously acquired ROI or an ideal representation of the stored number based on characteristics of the set top box 104 and/or television 110. In either case, the dot product is calculated by multiplying the corresponding pixels of the quantized ROI image with the stored reference image/template to form a set of intermediate inner products at each pixel location. These intermediate inner products are then added together to form a dot product. Mathematically, the dot product is given by the following expression:
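$$\text{dot product} = \sum_{r=1}^{R} \sum_{c=1}^{C} \text{image}(r, c) \times \text{template}(r, c)$$
where image(r, c) represents the value of the acquired ROI image at pixel location (r, c), template(r, c) represents the value of the stored reference image/template at pixel location (r, c), the index (r, c) refers to the pixel location at the r-th row and c-th column of the corresponding image, and the sums run over all R rows and C columns of the image.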
For black and white images, white pixels may be represented by a value equal to 1 and black pixels may be represented by a value equal to -1 (note: the opposite assignment or any other equivalent representation may be used). Thus, the inner product of two identical pixel values results in a value of 1, while the inner product of two different pixel values results in a value of -1, i.e.:
white × white = 1 × 1 = 1;
black × black = (-1) × (-1) = 1;
white × black = 1 × (-1) = -1;
black × white = (-1) × 1 = -1.
Thus, if the acquired ROI image is similar to the stored reference image/template, the inner products will sum constructively and the resulting dot product will be a large positive value. Conversely, if the two images are dissimilar, the inner products will sum destructively, and the resulting dot product will fall within a range from a small positive value to a large negative value.
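The dot product described above is straightforward to compute directly. The following sketch assumes both images are equally sized lists of rows with pixel values of +1 (white) and -1 (black); the function name is illustrative.

```python
from typing import List

Binary = List[List[int]]   # rows of +1 (white) / -1 (black)


def dot_product(image: Binary, template: Binary) -> int:
    """Sum of per-pixel products of two equally sized +1/-1 images."""
    if len(image) != len(template) or any(
            len(a) != len(b) for a, b in zip(image, template)):
        raise ValueError("image and template must have the same dimensions")
    return sum(p * q
               for image_row, template_row in zip(image, template)
               for p, q in zip(image_row, template_row))


reference = [[1, -1],
             [-1, 1]]
inverted = [[-p for p in row] for row in reference]
one_pixel_off = [[1, -1],
                 [1, 1]]
```

Identical images give the maximum value (the pixel count), an inverted image gives the minimum, and each differing pixel lowers the sum by 2.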
Based on the preceding definition of the dot product and the representation of the corresponding images, a number of useful criteria can be used to compare the channel number of the acquired ROI with the previous channel number represented by the stored reference image/template. One preferred such criterion is the number-equivalent criterion, which is defined as the ratio between the dot product and the total number of pixels in the image, i.e.:
$$\text{criterion} = \frac{\text{dot product}}{\text{numpixels}}$$
where numpixels is the total number of pixels in the image, which is equal to the product of the number of rows R and the number of columns C. Assuming pixel values of 1 or -1 for a black and white image, equivalent expressions of the number-equivalent criterion defined above include:
$$\text{criterion} = \frac{\text{num\_same\_pixels} - \text{num\_diff\_pixels}}{\text{numpixels}}$$
or
$$\text{criterion} = \text{ratio\_same\_pixels} - \text{ratio\_diff\_pixels}$$
where num_same_pixels is the number of pixel positions at which the acquired ROI image has the same value as the stored template, and num_diff_pixels is the number of pixel positions at which the values differ. The quantity ratio_same_pixels is equal to num_same_pixels divided by the total number of pixel locations (numpixels), and the quantity ratio_diff_pixels is equal to num_diff_pixels divided by numpixels. Depending on the implementation, one of these expressions may be preferable to the others.
The channel change detection process (e.g., process 600 or 650) may use the number-equivalent criterion defined above to determine whether the channel number represented by the captured ROI image is substantially similar to or substantially different from the previous channel number represented by the stored reference image/template. For example, if the criterion has a value of 95% or greater, the process may determine that the channel numbers are substantially similar and that a channel change has not occurred. If the value of the criterion is less than 95%, the process may determine that the channel numbers are substantially different, and thus determine that a channel change has occurred.
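As an illustration of the 95% decision, the sketch below computes the criterion as (same pixels minus different pixels) divided by the total pixel count, which for +1/-1 images equals the dot product divided by the pixel count. It assumes non-empty, equally sized images; the function names and the helper that flips pixels for the example are hypothetical.

```python
from typing import List

Binary = List[List[int]]   # rows of +1 (white) / -1 (black)


def equivalence_criterion(image: Binary, template: Binary) -> float:
    """(same - different) / total pixels, in the range [-1.0, 1.0]."""
    pairs = [(p, q)
             for image_row, template_row in zip(image, template)
             for p, q in zip(image_row, template_row)]
    same = sum(1 for p, q in pairs if p == q)
    diff = len(pairs) - same
    return (same - diff) / len(pairs)


def channel_changed(image: Binary, template: Binary, threshold: float = 0.95) -> bool:
    """True when the images are substantially different (criterion below threshold)."""
    return equivalence_criterion(image, template) < threshold


def flip_first(image: Binary, count: int) -> Binary:
    """Example helper: copy of `image` with its first `count` pixels inverted."""
    flat = [p for row in image for p in row]
    flat[:count] = [-p for p in flat[:count]]
    width = len(image[0])
    return [flat[i:i + width] for i in range(0, len(flat), width)]


template = [[1] * 10 for _ in range(10)]   # 100-pixel reference image
```

With 100 pixels, two differing pixels give a criterion of 0.96 (no change reported), while three give 0.94 (change reported).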
Referring to fig. 10, to reduce the complexity of any algorithm for determining whether the displayed content represents a digit, the displayed content may be examined to determine whether it contains portions (segments) representing a digit. For example, the number 1 may be considered a digit having a single portion that includes a group of adjacent pixels 950 having a common color/shade, the pixels 950 extending vertically for a predetermined number of pixel locations, wherein the predetermined number of pixel locations represents a desired digit height based on stored height characteristics of the associated television or set-top box. By way of further example, the number 3 may be considered to have three horizontally disposed portions 954, 956, 958 and a single vertically disposed portion 952. If the detected portions have the correct characteristics (i.e., characteristics that match the display characteristics of the television 110 and/or set-top box 104 controlling the digit display), then the OSMSD112 may indicate that an on-screen channel number display has been detected. For example, the OSMSD112 may compare the locations of one or more of a set of detected portions to a set of stored portion configurations corresponding to displayed digits. If a match is determined, the OSMSD112 may indicate that an on-screen channel number display has been detected. The OSMSD112 may further use information related to the matching configuration to identify the value of the channel number being displayed. In another example, the OSMSD112 may not compare the arrangement of the plurality of portions to each other and/or to portion configuration information stored in memory. Instead, the OSMSD112 may only examine the characteristics of a single portion, and if such a portion has the set of characteristics expected for a portion of a displayed digit, the OSMSD112 may identify such a portion as representing an on-screen digit display.
The channel change detection process (e.g., process 600 or 650) may be extended to not only detect channel change events but also to identify the newly selected channels. An exemplary process 1000 for identifying displayed channel numbers is illustrated in fig. 11A. Given the additional processing resources and time that may be required to identify a channel number, the exemplary process 1000 may execute as a background process in parallel with the foreground channel change detection process 600 or 650. To enable this type of execution, the exemplary channel identification process 1000 employs a queue to store pending channel number ROI images awaiting identification. Referring to fig. 11A, the exemplary process 1000 includes two sub-processes 1002 and 1004 that are executed in parallel (although sub-processes 1002 and 1004 may also be configured to be executed in a serial fashion). The purpose of the sub-process 1002 (queue management) is to add channel number ROI images to the image queue. The purpose of the sub-process 1004 (queue processing) is to process the ROI images stored in the image queue.
The sub-process 1002 is shown in fig. 11B, where control begins at block 1005, where the OSMSD112 determines whether there is a new channel number ROI image (e.g., output from the exemplary process 600 or 650). If there are no new images, control returns to block 1005 and the OSMSD112 waits for a new image to appear. Conversely, if there is a new image, the OSMSD112 checks the status of the image queue. If the image queue is not full (block 1006), the OSMSD112 adds the new channel number ROI image to the queue (block 1008). After adding the image to the queue, control returns to block 1005 and the OSMSD112 waits for a new channel number ROI image to appear. However, if the OSMSD112 determines at block 1006 that the queue is full, an error condition may be raised, causing the exemplary sub-process 1002 to exit.
A sub-process 1004 is shown in fig. 11C, where control begins at block 1010 where the OSMSD112 determines whether the image queue contains a channel number ROI image. If the queue is empty, control returns to block 1010 where the OSMSD112 waits for an image to be added to the queue. Conversely, if the queue is not empty, control proceeds to block 1012, where the OSMSD112 fetches the oldest image stored in the queue with the queue in a first-in-first-out (FIFO) configuration. A Last In First Out (LIFO) configuration may also be used, depending on the particular application of the channel identification procedure. In either case, after the OSMSD112 obtains the next image to be identified (block 1012), control proceeds to block 1014 where the OSMSD112 retrieves the channel number image template from memory. The channel number image template is one of a set of templates corresponding to a set of known channel numbers. The OSMSD112 then compares the acquired channel number ROI image to the template image (block 1016). If the two images do not match, control proceeds to block 1018 where the OSMSD112 determines whether there is another stored template. If there is another template, control returns to block 1014 where the OSMSD112 retrieves the next template from memory and control proceeds as described above. If there is no other template (block 1018), the OSMSD112 generates a report indicating that the channel identification was unsuccessful. Control then returns to block 1010 and proceeds as described above.
However, if the OSMSD112 determines at block 1016 that the acquired channel number ROI image matches the stored channel number image template (e.g., based on the dot product calculation and the number-equivalent criterion described above), then control proceeds to block 1022. At block 1022, the OSMSD112 generates a report identifying the channel number in the captured ROI image as corresponding to the channel number of the matching template. Control may then return to block 1010 and proceed as described above, or control may optionally proceed to a set of blocks contained in block 1024.
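The queue-and-template scheme of sub-processes 1002 and 1004 can be sketched serially as follows. This is only an illustration under several assumptions: a bounded FIFO built on `collections.deque`, templates held in a dictionary keyed by channel number, a 95% match threshold borrowed from the comparison example above, and hypothetical function names. A real deployment would run the two sub-processes concurrently.

```python
from collections import deque
from typing import Deque, Dict, List, Optional

Binary = List[List[int]]   # rows of +1 (white) / -1 (black)


def criterion(image: Binary, template: Binary) -> float:
    """Dot product of two equally sized +1/-1 images divided by the pixel count."""
    total = sum(len(row) for row in image)
    dot = sum(p * q for a, b in zip(image, template) for p, q in zip(a, b))
    return dot / total


def enqueue(queue: Deque[Binary], image: Binary, capacity: int) -> None:
    """Sub-process 1002: add an ROI image, raising an error if the queue is full."""
    if len(queue) >= capacity:            # block 1006
        raise OverflowError("image queue is full")
    queue.append(image)                   # block 1008


def identify(image: Binary, templates: Dict[str, Binary],
             threshold: float = 0.95) -> Optional[str]:
    """Blocks 1014-1022: try each stored template until one matches."""
    for channel, template in templates.items():
        if criterion(image, template) >= threshold:
            return channel
    return None                           # identification unsuccessful


def drain(queue: Deque[Binary], templates: Dict[str, Binary]) -> List[Optional[str]]:
    """Sub-process 1004: process queued images oldest first (FIFO)."""
    results = []
    while queue:
        results.append(identify(queue.popleft(), templates))
    return results
```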
The processing performed at block 1024 may be used to modify the channel change detection process (e.g., processes 600 or 650) by replacing a previously captured channel number image (e.g., used at block 628 of fig. 6A) with a stored template having known characteristics. If block 1024 is enabled, control proceeds from block 1022 as follows. After identifying a channel number based on the selected template (block 1022), control proceeds to block 1026, where the OSMSD112 determines whether the identified channel number is still in use as the reference channel number for the channel change event detection process (e.g., process 600 or 650). For example, the OSMSD112 may examine the captured ROI image to determine whether it corresponds to the captured reference channel number image that is used to determine whether a channel change event has occurred (e.g., based on tag and/or header information included in the digital representation of each image). If the identified channel number is no longer in use (block 1026), control returns to block 1010 and proceeds as described above. Conversely, if the identified channel number is still in use, control proceeds to block 1028, where the OSMSD112 determines whether the captured reference image or a stored template is being used as the reference channel number image. If a stored template is already in use (block 1028), control returns to block 1010 and proceeds as described above. However, if the captured image is being used as the reference channel number image (block 1028), the OSMSD112 may replace the captured reference image with the corresponding template determined at block 1022. Control then returns to block 1010 and proceeds as described above.
The channel identification process may be offloaded for processing to, for example, a companion processor or even the central processing device 222. In such a configuration, the OSMSD112 may send the captured ROI images to the companion processor (e.g., via the interface circuitry 210) and/or to the central processing device 222 (e.g., via the network 218).
The exemplary methods and apparatus described herein may be applied to other applications in addition to channel change event detection and channel number identification. For example, one or more of the ROI, stored template, and/or reference features may be modified to enable OSMSD (e.g., OSMSD200) to detect changes in displayed letters/text and/or to also recognize such letters/text. In another example, a particular shape/object may be detected and/or identified based on an appropriate selection of the ROI, stored template, and reference features.
OSMSDs (e.g., OSMSDs 200, 250) may also be used as front-end filters or gates for any number of image processing and/or recognition algorithms (e.g., an on-screen display reader for processing captured images corresponding to received video frames). In such an application, the OSMSD200, 250 (and/or a process similar to the channel change detection process 600 or 650) may be used to select only images containing desired content (e.g., particular numbers, letters, objects, etc.) for processing. As described above, the OSMSD200, 250 (or similar process) may be configured to detect desired numbers, letters, objects, etc. based on appropriate selection of ROIs, stored templates, and/or reference features. In this way, the OSMSDs 200, 250 (or similar processing) may significantly reduce the processor and memory requirements of the associated image processing and/or recognition algorithms.
The viewing mode and media device source may also be determined from ROI sequences. Different viewing features may exhibit a uniquely identifiable sequence of images, numbers, symbols, and/or icons after and/or during the invocation of the feature by the user. For example, if a user invokes a pay-per-view (PPV) feature, the viewer may be presented with one or more confirmation screens, one or more screens asking the viewer to wait while PPV selections are retrieved, and a list of PPV selections, corresponding viewing times, screenshots, movie trailers, program descriptions, and/or viewing prices. Because each screen appears in a predetermined order, the OSMSD112 can monitor whether one or more ROIs appear in the predetermined order to identify viewer behavior. While the following example describes ROI sequence detection for a set-top box PPV feature, this is not limiting, and similar processing can be applied to VCRs, DVD players and/or game consoles. Specifically, manufacturers of different media devices typically display different sequences of screens when the media device is activated. For example, a game console may display the name of the manufacturer and/or the trademark of the game console (e.g., Xbox (registered trademark) of Microsoft (registered trademark)) before displaying any screen specific to a particular video game. In addition, a VCR may display a unique sequence of screens to the user when the user schedules the VCR to record a broadcast program.
Figs. 12A-12E illustrate exemplary screen sequences displayed after a user invokes an exemplary PPV feature. Fig. 12A illustrates an initial screen displayed after a viewer presses, for example, a PPV button on a remote control or on a set-top box. The OSMSD112 may monitor the title ROI 1202, the message frame ROI1204, and/or one or more sub-ROIs within the title ROI 1202 and/or the message frame ROI 1204. The OSMSD112 may specifically monitor the title ROI 1202 for the PPV symbol 1206, the brand name of the PPV feature (e.g., "itrl" 1208), and/or the specific channel number 1210. Additionally, the OSMSD112 may monitor the message frame ROI 1204 for the same brand name 1212 and/or the phrase "please wait for a moment" 1214. The exemplary PPV feature may next display a screen in which only the message frame 1204 is retained, without the title 1202, as shown in fig. 12B. While the PPV feature continues to initialize and/or obtain a list of available PPV programs, a simple text-based "please wait" screen 1216 may appear, as shown in fig. 12C. Because PPV program data can be large, providing information to a user can be delayed for a certain amount of time, especially if the user is a satellite television user and weather conditions interfere with satellite signal reception. The received information may include another "please wait" screen 1218, the screen 1218 including a richer set of graphics, as shown in fig. 12D. Finally, the exemplary PPV screen sequence may end with a program listing, one or more screenshots 1222 of available PPV programs, price information 1224, and/or a narrative description 1226 of the PPV programs.
Additionally or alternatively, the OSMSD112 may recognize the viewing mode by monitoring whether a particular symbol is present. Broadcasters and/or set-top box manufacturers typically implement viewing functionality in a predictable manner. Thus, a state diagram may indicate the appropriate ROIs that should be monitored to determine whether subsequent viewer activity has occurred. The OSMSD112 need not scan and process all ROIs to determine viewer activity; selective ROI scanning requires less OSMSD112 processing power and improves response time. Fig. 13 illustrates an exemplary screen 1303 in which the viewer has selected the VOD feature. The exemplary screen includes a title 1304 and a symbol 1306, the symbol 1306 appearing on the screen only during a limited number of viewing modes of the exemplary set-top box 104. As a result, once the OSMSD112 detects the symbol 1306, knowing which limited number of states can occur next allows it to monitor only particular ROIs.
Fig. 14A illustrates an exemplary state diagram of an exemplary set-top box 104. Set-top boxes from other manufacturers may have substantially different viewing mode configurations. Although the exemplary state diagram of fig. 14A is shown graphically, that representation is for illustration purposes, and the state diagram may instead be stored in a machine-readable format. The "select VOD" state 1408 corresponds to the VOD screen 1302 in fig. 13. Because the "select VOD" state 1408 is known in the state diagram of fig. 14A, any subsequent state can be detected by monitoring a limited number of ROIs and/or possible symbols. For example, in the "select VOD" state 1408, the OSMSD112 need not be concerned with detecting the symbol "B" 1410, the symbol "C" 1412, or any ROI to which these symbols are assigned. Because the next possible states are limited to "guide/info" 1414, "watch TV" 1416, or "watch VOD" 1418, the OSMSD112 need only detect the simultaneous disappearance of the title 1304 and the symbol "A" 1306 (representing the "guide/info" state 1414), the disappearance of the symbol "A" 1306 while the title 1304 remains on-screen (representing the "watch TV" state 1416), or the disappearance of the title 1304 while the symbol "A" 1306 remains on-screen (representing the "watch VOD" state 1418).
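One possible machine-readable form of the transitions out of the "select VOD" state is a lookup table keyed on which of the two monitored items is still on-screen. The following is a minimal sketch under that assumption; the `TRANSITIONS` table and the `next_mode` function are hypothetical and cover only the three transitions described above, not the full diagram of fig. 14A.

```python
from typing import Dict, Tuple

# Key: (current mode, title 1304 present, symbol 1306 present).
# Value: the mode implied by that observation. Hypothetical encoding.
TRANSITIONS: Dict[Tuple[str, bool, bool], str] = {
    ("select VOD", False, False): "guide/info",  # both disappear together
    ("select VOD", True, False): "watch TV",     # symbol gone, title remains
    ("select VOD", False, True): "watch VOD",    # title gone, symbol remains
}


def next_mode(current: str, title_present: bool, symbol_present: bool) -> str:
    """Return the mode implied by the observation.

    If the observation matches no known transition, the current mode is
    kept, i.e. no state change is assumed.
    """
    return TRANSITIONS.get((current, title_present, symbol_present), current)


if __name__ == "__main__":
    print(next_mode("select VOD", title_present=False, symbol_present=True))
```

Because only the title and one symbol are consulted, nothing else needs to be extracted from the frame while the device is in this state, which is the saving the paragraph above describes.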
A flowchart illustrating an exemplary process 1450 for determining viewer activity from a sequence of screens is shown in FIG. 14B. Exemplary process 1450 begins at block 1452, where frame grabber 108 of FIG. 1 captures a screenshot corresponding to a video frame signal delivered to television 110 from, for example, set-top box 104. If the OSMSD112 has not yet performed the exemplary process 1450 at least once to determine the viewer's current mode (block 1454), the OSMSD112 must extract all ROIs from the video frame for analysis (block 1456). On the other hand, if the next possible states of the media device (e.g., the exemplary set-top box 104) are known, the OSMSD112 need only extract a subset of the ROIs from the video frame (block 1458). The OSMSD112 may know the current state and all possible subsequent states by means of a known state diagram, such as the one shown in fig. 14A. The state diagram of a media device is typically static and is a function of the firmware on the media device. Such a state diagram may be published in a user manual, may be determined empirically prior to installation in the audience member's home, and/or may be stored in the OSMSD 112.
If no known symbol is found within the captured video frame (block 1460), the process repeats (block 1452) in an attempt to determine the mode of the exemplary set-top box 104. On the other hand, if a known symbol is detected, the OSMSD112 determines whether the presence of that symbol is consistent with any of the next possible modes in the state diagram. For example, if the previous mode was "watch TV" 1416 and the symbol "A" 1306 is detected without the title 1304, then a candidate next mode exists (block 1462) and the candidate mode is identified as "watch VOD" (block 1464). However, if an expected symbol is detected but does not match the state diagram (e.g., the state diagram of fig. 14A), the exemplary process 1450 repeats (block 1452). This result may occur if the OSMSD112 does not have complete knowledge of all possible states of the media device. Alternatively, this result may occur if a television commercial containing a known symbol is presented. If, however, the known symbol does not appear in the expected ROI, the state change is deemed not to have occurred.
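The decisions of blocks 1454 through 1464 can be sketched as two small functions: one that chooses which ROIs to extract, and one that maps the detected ROIs to a candidate mode. This is a rough sketch under stated assumptions. The ROI names, both tables, and both function names are hypothetical; a real state diagram would be populated from the device's firmware behavior.

```python
from typing import Dict, Optional, Set

# Hypothetical ROI names, loosely tied to the reference numerals in the text.
ALL_ROIS: Set[str] = {"title_1304", "symbol_1306", "symbol_1410", "symbol_1412"}

# Assumed: ROIs that matter for the transitions out of each known mode.
ROIS_BY_MODE: Dict[str, Set[str]] = {
    "select VOD": {"title_1304", "symbol_1306"},
    "watch TV": {"title_1304", "symbol_1306"},
}

# Assumed: for each previous mode, the exact set of ROIs in which a known
# symbol must be detected for each candidate next mode.
CANDIDATES: Dict[str, Dict[str, Set[str]]] = {
    "watch TV": {"watch VOD": {"symbol_1306"}},
}


def rois_to_extract(current_mode: Optional[str]) -> Set[str]:
    """Blocks 1454-1458: all ROIs if the mode is unknown, else a subset."""
    if current_mode is None:
        return set(ALL_ROIS)
    return set(ROIS_BY_MODE.get(current_mode, ALL_ROIS))


def candidate_mode(previous_mode: str, detected: Set[str]) -> Optional[str]:
    """Blocks 1460-1464: return a candidate next mode, or None to recapture."""
    if not detected:
        return None  # no known symbol in the frame; capture another frame
    for mode, expected in CANDIDATES.get(previous_mode, {}).items():
        if detected == expected:
            return mode
    return None  # symbol found but inconsistent with the state diagram


if __name__ == "__main__":
    print(sorted(rois_to_extract("select VOD")))
    print(candidate_mode("watch TV", {"symbol_1306"}))
```

Returning `None` in both failure cases mirrors the text: whether no symbol is found or a found symbol fails to match the diagram, the process simply captures another frame.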
As described above, the media device or devices used to determine viewer activity and/or identify a viewer are not limited to changing channels and/or set-top boxes. FIG. 15 illustrates exemplary game screens 1500 that may be generated by a separate game console or as part of a set-top box function. After capturing the video frames of the television, the OSMSD112 may identify a "select games" screen 1502 based on the detection of particular game icons 1504, 1506, 1508, and 1510. Additionally or alternatively, the "select games" screen 1502 may be identified by other screen symbols (e.g., a top horizontal bar 1512 and/or a bottom horizontal bar 1514 in a particular ROI 1516). Each individual game corresponding to game icons 1504, 1506, 1508, and 1510 can include additional screen layers related to game play. While the OSMSD memories 216, 252 may be populated with exhaustive details regarding the graphical layout of each game, certain features common to the games may be exploited to reduce memory requirements and/or processing power. For example, each of game "A" 1518, game "B" 1520, and game "C" 1522 includes a common menu graphic 1524 located in the lower right corner ROI. Thus, the OSMSD112 does not need to extract ROI information from the entire viewing screen, thereby saving considerable processing resources and/or memory for other purposes. If the common menu graphic 1524 disappears from the expected ROI, the OSMSD may analyze alternative ROIs of interest with reference to a state diagram similar to the state diagram discussed in connection with FIG. 14A.
The OSMSD112 may also detect user activity via interactive content (e.g., multi-screen displays, weather, horoscope, lottery, and/or help screens). An interactive screen typically involves certain user inputs, such as the channels to view preferentially in a single multi-screen display, the area of interest for weather data, birthday information entered to obtain horoscope results, local lottery numbers, and/or help topics entered to obtain results from the help screen. Each interactive screen typically includes recognizable symbols, text, and/or icons that are unique to the particular feature. For example, fig. 16 illustrates an exemplary multi-screen display screen 1602 containing seven news broadcast programs that are displayed to a user simultaneously. The OSMSD112 can identify this multi-screen display 1604 by the combination of letters spelling "NewsMix" in the ROI near the top center of the screen 1602.
The OSMSD112 can detect other features and/or modes of operation of one or more media devices without limitation. While some media devices are loaded with a predetermined plurality of features and/or states (e.g., embedded in firmware) when shipped from a manufacturer, other media devices (e.g., set-top boxes) may present a wide range of services and/or features to viewers when they become available. Broadcasters not only provide users with a rich set of multimedia content, but can also offer diverse services including home shopping, multi-view displays, gambling, gaming, VOD, and/or other media enhancement features. As new features and/or state diagrams of the media device (e.g., new DVD players, VCRs, set-top boxes, video game consoles, and games played thereon) become available, such state diagrams may be uploaded into the memory 208 and/or other storage device 216 of the OSMSD 112.
An additional feature is a restart service (SOS). Viewers who miss the beginning of a broadcast program may invoke the SOS to restart, rewind, and/or pause the program. SOS is particularly useful for viewers who suffer unexpected delays that prevent them from viewing a particular broadcast at its scheduled time. Fig. 17 illustrates an exemplary screenshot 1700 of an SOS that includes symbols, icons, and/or titles in predetermined ROIs for monitoring and recognition by the OSMSD 112. For example, the user may select the SOS feature from a button panel on a remote control and/or a set-top box during the live program 1702. Selection of the SOS feature causes an SOS feature confirmation title 1704 to appear in the upper left ROI of the live program 1702. In addition, a subscript title 1706 is displayed to the viewer to provide a program name 1708, a scheduled broadcast time 1710 (e.g., 8:00-8:30pm), and a current time 1712 (e.g., 8:15 pm). From this information, the viewer may determine, before invoking the SOS, how much of the broadcast program was missed; in this example, half of the program has already been broadcast. If the viewer responds affirmatively to the SOS feature confirmation title 1704 and restarts the program, an SOS highlight display 1714 is presented to the user. The SOS highlight display 1714 includes an SOS symbol 1716 located in the center ROI, which may be detected by the OSMSD 112. The SOS highlight display 1714 confirms to the user that the SOS has started before the beginning of program 1718 is displayed.
The OSMSD112 may detect SOS in a manner similar to that discussed in connection with FIGS. 12A-12E, where the OSMSD112 detects PPV through a sequence of ROIs. Much like the modes in the exemplary state diagram of fig. 14A, the SOS may also exhibit a predictable ROI sequence that may be detected by the OSMSD 112.
A block diagram of an exemplary on-screen display reader (OSDR) system 1800 is illustrated in fig. 18A. Exemplary OSDR system 1800 includes a frame grabber 1804, which may be substantially similar or identical to frame grabber 108 of fig. 1. A video signal having a video frame rate of 30 frames/second is provided as an input to the frame grabber 1804. The frame grabber 1804 grabs screenshots of the input video signal at a predetermined rate (e.g., once every 120 milliseconds or equivalently at a rate of 8.3 per second). The captured screenshots are provided to OSDR 1808 for processing. In this example, OSDR 1808 analyzes each screenshot at the aforementioned screenshot rate to determine, for example, whether a given screenshot contains content of interest, and if so, processes the screenshot containing such content. Thus, the processing of the first screenshot by OSDR 1808 must be completed before the second screenshot arrives. OSDR 1808 may require significant processor resources to meet such real-time processing constraints.
Fig. 18B illustrates a block diagram of an exemplary OSDR system 1850 that employs OSMSD1854 as a front-end filter. The OSMSD1854 may be substantially similar or identical to the OSMSD112 in fig. 1. Exemplary OSDR system 1850 includes the frame grabber 1804 and OSDR 1808 discussed in connection with exemplary OSDR system 1800 in fig. 18A. However, in this example, OSMSD1854 processes the screenshots received from frame grabber 1804 at the predetermined screenshot rate (e.g., one screenshot every 120 milliseconds) before such screenshots are provided to OSDR 1808. Accordingly, OSMSD1854 may be configured to analyze each input screenshot for content of interest and to provide only screenshots containing such content to OSDR 1808. In this way, the average rate at which screenshots are provided to OSDR 1808 in exemplary system 1850 may be significantly lower than the predetermined screenshot capture rate (e.g., significantly lower than one screenshot every 120 milliseconds). By reducing the average input screenshot rate, the processor demand on OSDR 1808 can be significantly reduced, as shown in fig. 19 below. Further, OSMSD1854 may be configured to provide the selected screenshots to OSDR 1808 in the raw format and/or a quantized format (e.g., black and white), based on the requirements of a given application.
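The front-end filtering arrangement can be illustrated with a short generator that forwards only the screenshots flagged as containing content of interest, together with a helper that computes the resulting average input rate. This is a sketch only: `front_end_filter`, `average_forward_rate`, and the predicate passed in are hypothetical names, and the 120 millisecond interval is the example value from the text.

```python
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")

CAPTURE_INTERVAL_S = 0.120  # example capture interval from the text


def front_end_filter(screenshots: Iterable[T],
                     has_content_of_interest: Callable[[T], bool]) -> Iterator[T]:
    """Yield only the screenshots that the downstream reader should process."""
    for shot in screenshots:
        if has_content_of_interest(shot):
            yield shot


def average_forward_rate(flags: Iterable[bool]) -> float:
    """Average rate, in screenshots per second, of forwarded screenshots.

    `flags` holds one boolean per captured screenshot: True if forwarded.
    """
    flags = list(flags)
    if not flags:
        return 0.0
    return sum(flags) / (len(flags) * CAPTURE_INTERVAL_S)


if __name__ == "__main__":
    shots = ["menu", "movie", "movie"]
    forwarded = list(front_end_filter(shots, lambda s: s == "menu"))
    print(forwarded)
    print(round(average_forward_rate([True, False, False]), 2))
```

With one screenshot in three forwarded, the downstream reader sees roughly 2.8 screenshots per second instead of roughly 8.3, which is the reduction in average input rate that the paragraph above refers to.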
FIG. 19 illustrates an exemplary processing timeline for the exemplary system 1800 of FIG. 18A and the exemplary system 1850 of FIG. 18B. For each of the processing timelines 1904, 1908, 1912 in fig. 19, the horizontal axis represents time, while the vertical axis represents the processor speed (e.g., measured in millions of instructions per second, or MIPS) required to perform the respective processing in the amount of time represented. Thus, the area under each processing timeline 1904, 1908, 1912 represents the number of instructions required to perform the corresponding processing (e.g., MIPS × seconds = millions of instructions).
A sequence of four (4) screenshots 1916, 1920, 1924, 1928 acquired by frame grabber 1804 of fig. 18A or 18B is shown in fig. 19. For the exemplary system 1800 in FIG. 18A, the processing timeline corresponding to OSDR 1808 is represented by timeline 1904. OSDR 1808 in fig. 18A is configured to process each input screenshot, and timeline 1904 accordingly shows three processing regions 1932, 1936, and 1940 corresponding to screenshots 1916, 1920, and 1924, respectively. Because of the real-time processing constraints on OSDR 1808 of fig. 18A (e.g., processing of one screenshot, such as screenshot 1916, must be completed before the next screenshot, such as screenshot 1920, arrives), OSDR 1808 may require a significant amount of processor resources (e.g., as represented by the height of processing regions 1932, 1936, and 1940).
The processing timeline for the processing of each screenshot 1916, 1920, 1924 by an OSMSD (e.g., OSMSD1854 of fig. 18B) is represented by timeline 1908. As discussed above, OSMSD1854 may be configured to search only black and white (binary) versions of screenshots 1916, 1920, 1924 for a particular object (e.g., a channel number, ROI, symbol, or icon). Thus, the processing requirements of OSMSD1854 may be significantly lower than the processing requirements of OSDR 1808 (e.g., as represented by the reduced height and width of OSMSD processing areas 1944, 1948, 1952 as compared to OSDR processing areas 1932, 1936, 1940, respectively).
To illustrate the potential benefits of using OSMSD1854 (e.g., with the lower processor demand shown in timeline 1908) together with OSDR 1808 (e.g., with the higher processor demand shown in timeline 1904), a processing timeline corresponding to the exemplary system 1850 of fig. 18B is represented by timeline 1912. In exemplary system 1850, OSMSD1854 is configured to process each input screenshot 1916, 1920, 1924, etc., as shown by processing areas 1956, 1960, and 1964, respectively. In the exemplary scenario of FIG. 19, screenshot 1916 contains content of interest, while screenshots 1920 and 1924 do not. Accordingly, OSMSD1854 may select screenshot 1916 for processing by OSDR 1808 and discard screenshots 1920 and 1924. Thus, the time available for OSDR 1808 to process screenshot 1916 may now extend into the time associated with screenshots 1920 and 1924. In other words, OSDR 1808 in system 1850 can still meet its real-time constraints if the original processing area 1932 required by OSDR 1808 in system 1800 is spread across processing areas 1968, 1972, and 1976. Thus, the processor requirements of OSDR 1808 when used in system 1850 with OSMSD1854 as a front-end filter are significantly lower than those of OSDR 1808 when used in exemplary system 1800 (e.g., as represented by the reduced heights of the processing regions 1968, 1972, 1976).
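The relationship between area, height, and width in the timelines of fig. 19 can be made concrete with a small calculation. The following sketch assumes a fixed workload per screenshot (the area under a processing region, in millions of instructions) and computes the processor speed needed to finish that workload within a given number of capture intervals. The workload figure of 12 million instructions is purely hypothetical, the function name `required_mips` is illustrative, and the sketch ignores the front-end filter's own processing cost.

```python
CAPTURE_INTERVAL_S = 0.120  # example capture interval from the text


def required_mips(million_instructions: float, intervals: int,
                  interval_s: float = CAPTURE_INTERVAL_S) -> float:
    """Processor speed (MIPS) needed to finish the workload in the given time.

    The area under a processing region is MIPS x seconds, so for a fixed
    area the required height is area / width.
    """
    if intervals < 1:
        raise ValueError("intervals must be at least 1")
    return million_instructions / (intervals * interval_s)


if __name__ == "__main__":
    workload = 12.0  # hypothetical: million instructions per screenshot
    # Without filtering: each screenshot must finish within one interval.
    print(round(required_mips(workload, 1), 1))
    # With filtering: the same work may be spread over three intervals.
    print(round(required_mips(workload, 3), 1))
```

Under these assumptions, spreading the same work over three intervals divides the required processor speed by three, which corresponds to the reduced heights of processing regions 1968, 1972, and 1976 relative to region 1932.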
Although certain methods, apparatus, and articles of manufacture have been described herein, the scope of coverage of this patent is not limited thereto. On the contrary, this patent covers all apparatus, methods, and articles of manufacture fairly falling within the scope of the appended claims either literally or under the doctrine of equivalents.
This patent application claims priority from U.S. provisional patent application No. 60/815,122, entitled "Methods and Apparatus for Detecting On-Screen Media Sources," filed on June 20, 2006, which is hereby incorporated by reference in its entirety.
Claims (33)
1. A method of identifying an active home media source device, the method comprising the steps of:
receiving a video signal;
capturing a portion of the video signal at a region of interest; and
comparing the captured portion to a designed state diagram to identify the active home media source device from a plurality of home media source devices that are not providing the video signal to the display apparatus, wherein the designed state diagram includes a plurality of device modes, each device mode corresponding to an image and associated with a candidate next mode.
2. The method of claim 1, wherein the state diagram comprises a desired video image located at the region of interest.
3. The method of claim 2, further comprising the steps of: associating the active home media source device with the video signal when a desired video image is detected within the area of interest.
4. The method of claim 2, wherein the desired video image comprises at least one of a digital image, an alpha image, a symbol, or a broadcast station identifier.
5. The method of claim 1, wherein the active home media source device comprises at least one of a set top box, a video tape recorder, a digital video disk player, a game console, a television, a delayed viewing device, a video on demand viewing device, a multi-view display viewing device, an internet browsing device, or a people meter.
6. The method of claim 1, wherein capturing a portion of the video signal comprises quantizing the portion of the video signal to generate a quantized video signal.
7. The method of claim 6, wherein the step of quantizing the portion of the video signal comprises at least one of converting a color video signal to a grayscale video signal, converting a color video signal to a black and white video signal, or converting a grayscale video signal to a black and white video signal.
8. The method of claim 6, wherein the quantized video signal has a lower memory requirement than the video signal.
9. The method of claim 6, further comprising the steps of: applying a quantization threshold specifying at least one of chroma, pixel intensity, or pixel density to generate the quantized video signal.
10. The method of claim 1, wherein capturing a portion of the video signal comprises detecting an edge of the portion of the video signal.
11. The method of claim 10, wherein the edge of the portion of the video signal comprises a static area surrounding the portion of the video signal.
12. The method of claim 1, further comprising the steps of: capturing a second portion of the video signal at a second region of interest, the second region of interest being a static region; and ignoring the portion of the video signal if the static region is not present.
13. A method of identifying a change in a device of a home media source, the method comprising the steps of:
monitoring whether a first image exists in a region of interest of a video signal;
comparing the first image to a designed state diagram, wherein the designed state diagram includes a plurality of device modes, each device mode corresponding to an image and associated with a candidate next mode, the first image associated with one of the plurality of home media source devices; and
when the image associated with the candidate next mode is different from the first image and associated with a second criterion, confirming the home media source device change to identify an alternate home media source device from among a plurality of home media source devices that do not provide the video signal to the display apparatus.
14. The method of claim 13, wherein the second criteria comprises at least one of a second image located in a second region of interest, a static region, the presence of a third region of interest, the absence of a fourth region of interest, or a remote control signal.
15. The method of claim 13, wherein the image associated with the candidate next mode comprises a first total pixel value and the first image comprises a second total pixel value.
16. The method of claim 15, further comprising the step of calculating the dot product of the first total pixel value and the second total pixel value.
17. The method of claim 16, wherein the dot product produces a positive output when an image associated with the candidate next mode is similar to the first image.
18. The method of claim 16, wherein the dot product produces a negative output when the image associated with the candidate next mode is different from the first image.
19. The method of claim 16, wherein the dot products add constructively when the image associated with the candidate next mode is similar to the first image and add destructively when the image associated with the candidate next mode is different from the first image.
20. The method of claim 13, further comprising the step of calculating a criterion for determining whether the first image is different from the image associated with the candidate next mode.
21. The method of claim 20, wherein the criterion comprises a ratio between a dot product of the first image and the image associated with the candidate next mode and a number of pixels in the first image.
22. The method of claim 20, wherein the criterion comprises a difference between a number of similar pixels and a number of different pixels, divided by a total number of pixels in the first image.
23. The method of claim 20, wherein the criterion comprises a difference between a ratio of similar pixels and a ratio of different pixels.
24. The method of claim 20, further comprising the step of comparing the criterion to a threshold to determine whether the first image is different from the image associated with the candidate next mode.
25. A method for identifying a media device mode, the method comprising:
monitoring a region of interest in a video signal at a first time;
comparing the monitored region of interest to a state diagram comprising a plurality of device modes to determine media device mode candidates, each device mode corresponding to an image and associated with at least one of the media device mode candidates;
monitoring the region of interest at a second time; and
comparing the monitored area of interest with the state diagram to confirm the media device mode candidate as a media device mode and to identify an active home media source device from among a plurality of home media source devices that do not provide the video signal to a display apparatus.
26. The method of claim 25, wherein the state diagram includes a plurality of device modes of at least one of a set-top box, a television, a video tape recorder, a digital video disk player, a people meter, or a game console.
27. The method of claim 25, wherein the device mode comprises a channel change mode, a banner browsing mode, a video on demand mode, a pay-per-view mode, or a guide mode.
28. The method of claim 25, further comprising the step of detecting a second media device mode at a third time.
29. An apparatus for monitoring a video signal, the apparatus comprising:
a region-of-interest extractor for extracting an image signal from a region of the video signal;
an image quantizer for quantizing the extracted image signal; and
an object comparator for comparing the quantized image signal with a pattern sequence state diagram of the home media device to identify an available viewing pattern of the home media device and to identify an active home media source device from among a plurality of home media source devices that do not provide the video signal to a display apparatus.
30. The apparatus of claim 29, wherein the home media device comprises at least one of a set-top box, a television, a video tape recorder, a digital video disk player, a people meter, or a game console.
31. The apparatus of claim 29, further comprising: a quantizer threshold module to provide a quantization threshold to the image quantizer to achieve at least one of reducing pixel density, changing pixel brightness, or removing pixel color.
32. An apparatus for processing a video signal, the apparatus comprising:
a frame grabber for grabbing a video signal at a first rate;
a media source detector for quantizing the video signal to improve image detection of the first region of interest and the second region of interest; and
an on-screen display reader for comparing the images detected in the first and second regions of interest with a pattern sequence state diagram to identify a home media source device from among a plurality of home media source devices that do not provide the video signal to the display apparatus.
33. The apparatus of claim 32, wherein the media source detector further comprises: a quantizer threshold for adjusting at least one of pixel density, pixel color, or pixel brightness.
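The quantization recited in claims 6 to 9 and the dot-product criterion recited in claims 15 to 24 can be sketched together as follows. This is a minimal illustration, not the claimed implementation. It assumes that black and white pixels are encoded as -1 and +1, so that matching pixels contribute +1 to the dot product and differing pixels contribute -1; the claims do not state this encoding. The threshold values and all function names are likewise hypothetical.

```python
from typing import List, Sequence


def quantize(gray: Sequence[int], threshold: int = 128) -> List[int]:
    """Quantize grayscale intensities (0-255) to +1 (bright) or -1 (dark)."""
    return [1 if pixel >= threshold else -1 for pixel in gray]


def similarity(first: Sequence[int], reference: Sequence[int]) -> float:
    """Dot product of two +/-1 images divided by the number of pixels.

    With the +/-1 encoding this equals
    (similar pixels - different pixels) / total pixels,
    ranging from -1.0 (all pixels differ) to 1.0 (identical).
    """
    if not first or len(first) != len(reference):
        raise ValueError("images must be non-empty and the same size")
    dot = sum(a * b for a, b in zip(first, reference))
    return dot / len(first)


def images_differ(first: Sequence[int], reference: Sequence[int],
                  threshold: float = 0.8) -> bool:
    """Compare the criterion to a threshold (hypothetical value of 0.8)."""
    return similarity(first, reference) < threshold


if __name__ == "__main__":
    captured = quantize([0, 200, 255, 10])
    stored = quantize([255, 200, 255, 10])
    print(similarity(captured, stored))
    print(images_differ(captured, stored))
```

In this example three of four pixels match, so the criterion is (3 - 1) / 4 = 0.5, which falls below the assumed threshold and the images are treated as different.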
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US81512206P | 2006-06-20 | 2006-06-20 | |
| US60/815,122 | 2006-06-20 | | |
| PCT/US2007/014317 WO2007149458A2 (en) | 2006-06-20 | 2007-06-20 | Methods and apparatus for detecting on-screen media sources |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| HK1132871A1 (en) | 2010-03-05 |
| HK1132871B (en) | 2012-07-06 |