HK40010744A

HK40010744A - Systems and methods for addressing a media database using distance associative hashing

Info

Publication number: HK40010744A
Application number: HK42019000395.4A
Authority: HK
Inventors: 泽夫·纽梅尔; 布莱恩·里德
Original assignee: 构造数据有限责任公司
Priority date: 2013-03-15
Filing date: 2019-12-25
Publication date: 2020-07-10

Description

System and method for addressing a media database using distance associative hashing

The application is a divisional application of an invention patent application with the application date of 2014, 3, 17, international application numbers of PCT/US2014/030782, national application number of 201480017043.9 and the invented name of 'system and method for addressing a media database by using distance correlation hashing'.

Priority requirement

The present application constitutes a continuation-in-part application of U.S. patent application No. 12/788,721 entitled "METHODS FOR IDENTIFYING VIDEO segments and displaying contextual TARGETED CONTENT ON a CONNECTED TELEVISION" filed ON 27/5/2010 and published ON 6/11/2013 as U.S. patent No. 8,595,781, that application is a non-provisional application claiming the benefit of U.S. provisional patent application No. 61/182,334 entitled "SYSTEM FOR processing INFORMATION IN television video SIGNALs" filed ON 29.5.2009 and is a non-provisional application claiming the benefit of U.S. provisional patent application No. 61/290,714 entitled "content acquisition BASED DATA RECEIVED FROM television programs" (determined BASED ON contextual targeting of data received FROM a television SYSTEM) filed ON 29.12.2009; the present application further constitutes a continuation-in-part application entitled "METHODS FOR DISPLAYING CONTEXTUALLY targeted content ON a connected television" U.S. patent application No. 12/788,748 filed ON 27/5/2010 and issued ON 1/7/2014 as U.S. patent No. 8,769,584; the present application further constitutes a continuation-in-part application of U.S. patent application No. 14/089,003 entitled "METHODS for identifying VIDEO segments and displaying context-related target CONTENT ON connected TELEVISIONs" filed 2013 ON month 11 and 25, and issued 2014 ON month 11 and 25 as U.S. patent No. 8,898,714; the present application further constitutes a continuation-in-part application entitled "system and method FOR IDENTIFYING VIDEO SEGMENTS FOR displaying CONTEXTUALLY relevant content" U.S. patent application No. 14/217,075, filed 3, month 17, 2014; the present application further constitutes a continuation-in-part application of U.S. patent application No. 14/217,094 entitled "system and method FOR REAL-TIME TELEVISION commercial DETECTION USING AUTOMATED content recognition DATABASE" filed 3, 17, 2014, and issued 1, 6, 2015 as U.S. patent No. 8,930,980; the present application further constitutes a continuation-in-part application entitled "SYSTEMS AND METHODS FOR ON-SCREEN GRAPHICS DETECTION" U.S. patent application No. 14/217,375 filed 3, month 17, 2014; the present application further constitutes a continuation-in-part application entitled "united states patent application No. 14/217,425 FOR SYSTEMS AND METHODS FOR IMPROVING SERVER AND client METHODS INFINGERPRINTACR SYSTEMS (systems and METHODS FOR IMPROVING server and client performance in a fingerprint ACR system)" filed 3, 17.2014; the present application further forms a continuation-in-part application of U.S. patent application No. 14/217,435 entitled "SYSTEMS AND METHODS FOR MULTI-BROADCAST discrimination" filed 3, month 17, 2014; and this application further constitutes a continuation-in-part application entitled "System and method FOR identifying video SEGMENTS DISPLAYED ON REMOTELY LOCATED television" U.S. patent application No. 61/791,578 filed ON 15/3/2013 and entitled "SYSTEMS AND METHODS FOR identifying video SEGMENTS. The aforementioned application is either a currently co-pending or a currently co-pending application entitled to benefit of the filing date.

Technical Field

The present invention relates generally to matching unknown media data (e.g., video or audio clips) to a mass database of reference media files.

Background

Systems for Automated Content Recognition (ACR) of audio or video media are well known to those skilled in the art. However, such ACR systems present a number of technical challenges, including managing a potentially very large database of encoded audio or video information and managing the large indices needed to address the information in the database.

It is also well known to those skilled in the art that large database indices (as may be used in the present invention) may be generated using some hash function. Another way to address a database may be by applying a binary tree structure, also known as a b-tree. Both methods are commonly used in data management systems.

Regardless of the method employed to index large databases, the index is often too large to reside in its entirety in the main memory of a computer server as used in typical ACR systems. When the database cannot fit completely in the memory of a computer system, it is typically stored on disk storage, and then portions of the database are read into memory in blocks corresponding to index values providing addresses. This means of recalling partial database information is also known to those skilled in the art as "paging," a process common to many different computer software systems.

The present invention is an extension of the above-referenced invention and is a system and method for matching unknown digital media (e.g., television programs) to a database of known media using signal processing means that employs a modified path tracing algorithm (as described in the first invention).

Another novel aspect of the systems and methods as disclosed herein is the distance associative hash indexing means which may be subdivided into a plurality of independently addressable segments, wherein each of the segments may address a portion of a database, and each of the segments may reside in its entirety in a main memory of a server device. The resulting server clusters of the indexing means each host sectors of an index addressing associated data of a large database of searchable audio or video information. This indexing approach of the present invention results in a significant improvement in the speed and accuracy of ACR systems that are enabled so as to identify unknown media even when the television viewer is showing content that the user is changing channels, fast rewinding, fast forwarding, or even pausing video from a digital video recorder.

SUMMARY

In some embodiments, an exemplary method involving addressing a media database using distance associative hashing may include receiving one or more indications of samples of a video segment; determining an algorithmically derived value for at least one tile of a video segment comprising the one or more pixels of the at least one tile; subtracting the established midpoint value for each slice from the average value for each slice; transforming the values resulting from the subtraction using a pre-derived function to evenly distribute the values; constructing a hash value from the transformed values; referencing a plurality of most significant bits of the constructed hash value to determine a database sector; and storing at least the hash value on the determined database sector.

In some embodiments, at least one of receiving, determining, subtracting, transforming, constructing, referencing, or storing of the foregoing exemplary methods is implemented, at least in part, using one or more processing devices. In some embodiments of the foregoing exemplary method, receiving one or more indications of samples of a video segment may include receiving one or more indications of at least one of a frame or a still image. In some embodiments of the foregoing exemplary method, receiving one or more indications of samples of a video segment may include receiving one or more indications of samples of a video segment associated with at least one indication of a channel, at least one indication of a video segment, and at least one indication of a time code offset from a beginning of the video segment.

In some embodiments of the foregoing exemplary method, determining the algorithmically-derived value of the one or more pixels of each tile for the at least one tile of the sample of the video segment that includes at least one or more pixels of at least one tile comprises determining an average value of the one or more pixels of each tile for the at least one tile of the sample of the video segment that includes at least one or more pixels of at least one tile. In some embodiments of the foregoing exemplary method, subtracting the established midpoint value for each segment from the average value for each segment may comprise subtracting the established midpoint value for each segment from the average value for each segment, the established midpoint value for each segment having been previously determined using data from each segment over at least one time period for a plurality of frequency channels.

In some embodiments of the foregoing exemplary method, transforming the values resulting from the subtraction using a pre-derived function to evenly distribute the values may include forming a variable matrix including at least the values resulting from the subtraction; obtaining a static matrix that will more evenly distribute the transformed values when intersecting the variable matrix; and calculating a dot product of the variable matrix and the static matrix, the dot product including at least the more evenly distributed transformed values. In some embodiments of the foregoing example method, obtaining a static matrix that will more evenly distribute the transformed values when intersected by the variable matrix may include determining a static matrix that will more evenly distribute the transformed values of the variable matrix when intersected by the variable matrix based at least in part on one or more previously obtained hash values using location-sensitive hashing.

In some embodiments of the foregoing exemplary method, constructing the hash value from the transformed values may include constructing the hash value from the transformed values, including reducing the fidelity of the transformed values at least by reducing each transformed value to a binary representation. In some embodiments of the foregoing exemplary method, reducing the fidelity of the transformed values by reducing each transformed value to a binary representation may comprise determining, for each transformed value, whether the transformed value is a positive number, and assigning a one to the hash value if the transformed value is a positive number, and assigning a zero to the hash value otherwise.

In some embodiments of the foregoing exemplary method, referencing the constructed plurality of most significant bits of the hash value to determine a database sector may include referencing the constructed plurality of most significant bits of the hash value to determine a database server, wherein the plurality of most significant bits are predetermined to address a plurality of database servers, wherein the plurality of database servers associated with the plurality of most significant bits are established such that at least one index associated with a database sector can reside entirely in a memory of the respective database server. In some embodiments of the foregoing exemplary method, storing at least the hash value on the determined database sector may include storing at least the hash value on the determined database sector, including storing at least one indication of a channel, at least one indication of a video segment, and at least one indication of a time code offset from a start of the video segment at a database location based at least in part on the hash value.

In one or more alternative embodiments of the foregoing exemplary method, a number of related systems include, but are not limited to, circuitry and/or programming for implementing the method embodiments referenced herein; the circuitry and/or programming can be virtually any combination of hardware, software, and/or firmware configured to effect the herein-referenced method aspects depending upon the design choices of the system designer.

In various embodiments, an exemplary method involving addressing a media database using distance associative hashing may include receiving a hint, the hint constructed by one or more operations associated with a media storage operation; referencing a plurality of most significant bits of the received hint to determine a database sector; and returning at least one indication of at least one candidate from the database sector based at least in part on the received hint.

In some embodiments of the foregoing exemplary method, receiving a prompt, the prompt being constructed by one or more operations associated with a media storage operation, may include receiving a prompt associated with a sample of a video buffer of a client system, including receiving at least one or more indications related to a time of day associated with the sample of the video buffer of the client system. In some embodiments of the foregoing exemplary method, receiving a hint (the hint constructed by one or more operations associated with the media storage operation) may include receiving a hint, the hint associated with a sample of a video buffer of the client system, the hint determined at least in part by hashing at least some of the values associated with the video buffer.

In some embodiments of the foregoing example methods, receiving a hint (the hint being associated with a sample of a video buffer of the client system, the hint determined at least in part by hashing at least some values associated with the video buffer) may include receiving a hint, the hint being associated with a sample of a video buffer of the client system, the hint determined at least in part by hashing at least some values associated with the video buffer, the hashing based at least in part on one or more of at least one operand or at least one algorithm also used in the associated media storage operation. In some embodiments of the foregoing exemplary method, receiving a hint (the hint being constructed by one or more operations associated with the media storage operation) may include receiving a hint, the hint determined by one or more operations including at least: receiving one or more indications of at least one item of content of a video buffer of a client system; determining an algorithmically derived value of one or more pixels of each tile for at least one tile of the at least one item of content of the video buffer comprising the one or more pixels of each tile; subtracting the midpoint value from the average value for each slice; transforming the values resulting from the subtraction; constructing a hash value from the transformed values; and associating the hint at least in part with the constructed hash value, wherein at least one of the determining, subtracting, transforming, or constructing operations utilizes one or more of at least one operand or at least one algorithm also used in the associated media storage operation.

In some embodiments of the foregoing exemplary method, returning at least one indication of at least one candidate item from the database sector based, at least in part, on the received hint may comprise returning at least one indication of at least one candidate item from the database sector based, at least in part, on a probability point location in the equator ("PPLEB") algorithm as a function of the received hint. In some embodiments of the foregoing exemplary method, returning at least one indication of at least one candidate from the database sector based, at least in part, on the received hint may comprise returning at least one indication of at least one candidate from the database sector based, at least in part, on the received hint, the at least one candidate being within a predetermined inverse percentile distribution radius of the received hint.

In various embodiments, an exemplary method involving addressing a media database using distance relevance hashing may include receiving at least one indication of at least one candidate item and at least one indication of at least one prompt; adding a token to a bin associated with at least one received candidate; and determining whether a number of tokens within the bin exceeds a value associated with a probability that the client system is displaying a particular video segment associated with at least one prompt, and returning at least some data associated with the particular video segment based at least in part on the bin if the number of tokens within the bin exceeds a value associated with a probability that the client system is displaying the particular video segment associated with at least one prompt.

In some embodiments of the foregoing exemplary method, adding the token to the bin associated with the at least one received candidate item may include adding the token to a time bin associated with the at least one received candidate item. In some embodiments of the foregoing exemplary method, adding the token to the bin associated with the at least one received candidate item may include determining a relative time including at least subtracting a candidate item time associated with the at least one candidate item from any time associated with the at least one prompt; and adding a token to a time bin associated with the candidate based at least in part on the determined relative time. In some embodiments of the foregoing exemplary method, the method may include removing one or more tokens from the time bin based at least in part on the elapsed time period.

In various embodiments, an exemplary system involving addressing a media database using distance associative hashing may include, but is not limited to: one or more computing devices; and one or more instructions that when executed on at least some of the one or more computing devices cause the at least some of the one or more computing devices to at least: receiving at least one rasterized video stream; creating at least one hash value associated with at least one sample of at least one received rasterized video stream; determining at least one database sector for storing the created at least one hash value; and storing the created at least one hash value on the determined at least one database sector.

In various embodiments, an exemplary system involving addressing a media database using distance associative hashing may include, but is not limited to: one or more computing devices; and one or more instructions that when executed on at least some of the one or more computing devices cause the at least some of the one or more computing devices to at least: receiving one or more instructions associated with at least one video buffer of at least one client system; determining a hint based at least in part on the at least one video buffer and at least one time instance associated with the at least one video buffer, wherein one or more of at least one operand or at least one function associated with determining the hint is also used in an associated media storage operation; referencing a plurality of most significant bits of the determined hint to determine a database sector; and returning at least one indication of at least one candidate from the determined database sector based at least in part on the determined hint.

In various embodiments, an exemplary system involving addressing a media database using distance associative hashing may include, but is not limited to: one or more computing devices; and one or more instructions that when executed on at least some of the one or more computing devices cause the at least some of the one or more computing devices to at least: receiving at least one indication of at least one candidate item and at least one indication of at least one prompt; adding a token to a bin associated with at least one received candidate; and determining whether a number of tokens within the bin exceeds a value associated with a probability that the client system is receiving a particular video segment associated with the received at least one prompt, and returning at least some data associated with the particular video segment based at least in part on the bin if the number of tokens within the bin exceeds a value associated with a probability that the client system is receiving the particular video segment associated with the received at least one prompt.

In addition to the foregoing, various other method, system, and/or program product embodiments are set forth and described in the text (e.g., claims, drawings, and/or detailed description) and/or in the teaching of the drawings as this disclosure.

The foregoing is a summary and thus contains, by necessity, simplifications, generalizations, and omissions of detail; consequently, those skilled in the art will appreciate that the summary is illustrative only and is not intended to be in any way limiting. Other aspects, embodiments, features, and advantages of the devices and/or processes and/or other subject matter described herein will become apparent in the teachings set forth herein.

Brief description of the drawings

Certain embodiments of the invention are described in more detail below with reference to the following drawings.

Fig. 1 illustrates the construction of a sectorized video matching database as taught by the present invention, which begins with an initial video ingest or capture process that is then continuously updated. A television display system 101 and its corresponding television display memory buffer 103 are shown for a potential embodiment of the system. The assignment of pixel tiles 102 and the calculation of values 105 using some algorithmic means known to those skilled in the art is done for each pixel tile and the resulting data structure is created and then time stamped to produce a "hint" 106, which may also have additional metadata associated with it.

FIG. 2 illustrates the use of a distance associative hashing process to process hint data 201 and generate a hash index 202, further illustrating a sectorized addressing scheme 203 to store data in related groups (buckets) 206.

Fig. 3 illustrates the real-time capture of unknown television content for identification from a connected television monitor or the like 301. A pixel tile is generally defined as a square pixel area of video buffer 303 having a size of approximately ten pixels by ten rows of pixels 304, however, any reasonable shape and size may be used. The number of pixel tile positions may be any number between ten and fifty positions within the video buffer and is processed 305 for sending hint data 306 to a central server device.

Fig. 4 illustrates the extraction of a plurality of candidate hint values 401 from a reference (matching) database bucket 404 and the application of the hint values 403 to a path tracking content matching process 402 as taught in the first invention cited above.

FIG. 5 illustrates a data structure of bins that hold tokens for scoring candidate values from the matching database. As the search process progresses through time, the bin is "leaky" and tokens expire over time.

FIG. 6 illustrates a typical memory paging scheme for accessing large databases as taught in the prior art.

Fig. 7 shows the creation of a hash value involving several steps, starting with computing the median value for each of a large number of points that make up those samples from a video frame.

Fig. 8 shows how the hash value is calculated.

FIG. 9 demonstrates the beneficial result of using the median value of the pixel locations as part of the process of calculating the hash value.

FIG. 9a illustrates the problem of not using median values when partitioning a multi-dimensional dataset.

Figure 9b demonstrates the benefit of finding the median value of the data set.

FIG. 10 sets forth an operational flow representing a number of exemplary operations pertaining to addressing a media database using distance associative hashing.

FIG. 11 illustrates an alternative embodiment of the operational procedure of FIG. 10.

FIG. 12 illustrates an alternative embodiment of the operational procedure of FIG. 10.

FIG. 13 illustrates an alternative embodiment of the operational procedure of FIG. 10.

FIG. 14 illustrates an alternative embodiment of the operational procedure of FIG. 10.

FIG. 15 illustrates an alternative embodiment of the operational procedure of FIG. 10.

FIG. 16 illustrates an alternative embodiment of the operational procedure of FIG. 10.

FIG. 17 illustrates an alternative embodiment of the operational procedure of FIG. 10.

FIG. 18 illustrates an alternative embodiment of the operational procedure of FIG. 10.

FIG. 19 illustrates an alternative embodiment of the operational procedure of FIG. 10.

FIG. 20 sets forth a different operational flow representing a number of exemplary operations pertaining to addressing a media database using distance associative hashing.

FIG. 21 illustrates an alternative embodiment of the operational procedure of FIG. 20.

FIG. 22 illustrates an alternative embodiment of the operational procedure of FIG. 20.

FIG. 23 illustrates an alternative embodiment of the operational procedure of FIG. 20.

FIG. 24 sets forth another operational flow representing a number of exemplary operations pertaining to addressing a media database using distance associative hashing.

FIG. 25 illustrates an alternative embodiment of the operational procedure of FIG. 24.

FIG. 26 illustrates an alternative embodiment of the operational procedure of FIG. 24.

FIG. 27 illustrates a system for addressing a media database using distance associative hashing.

FIG. 28 illustrates another system for addressing a media database using distance associative hashing.

Fig. 29 illustrates yet another system for addressing a media database using distance associative hashing.

Detailed Description

As described in the foregoing disclosure, the first invention related to the present invention is a system and method for matching unknown video with a database of known videos using novel signal processing means employing a modified path-tracking algorithm, as well as other means.

The novel approach of the new invention is its Distance Associated Hashing, accompanied by providing database access using a sectorized index. The indexing means provides a computationally efficient means for matching unknown media segments to a reference database of known media (e.g., audio or video content).

This indexing approach of the present invention results in a significant improvement in the speed and accuracy of ACR systems that are enabled to track the identification of media even when a television viewer is showing content that the user is changing channels, fast rewinding, fast forwarding, or even pausing video from a digital video recorder.

Both the creation, updating and subsequent access to the media matching database will describe a system that is capable of generating and addressing sectorized databases such that these database sectors may each reside in the main memory of a corresponding large number of server devices without the aid of paging devices within each of the corresponding server devices. This common approach to addressing sectorized databases through location sensitive hashing provides a significant improvement in operating efficiency.

The construction of a sectorized video matching database begins with the process shown in fig. 1. The television system 101 decodes the television signal and places the contents of each video frame into a video frame buffer in preparation for display or further processing of the pixel information for that video frame. The television system may be any television decoding system that can decode a television signal from either baseband or from a modulated television source and fill a video frame buffer with decoded RGB values at the corresponding frame size specified by the video signal. Such systems are well known to those skilled in the art.

The system of the present invention first builds a reference database of television program fingerprints, described in the original application as cues or cue values, and then continuously updates them. For the purpose of building a reference database of the video cues, the present invention performs acquisition of one or more video slices 102 read from video frame buffer 103. The video slices may be of any shape or pattern, but for purposes of this example, should be 10 pixels in the horizontal direction by 10 pixels in the vertical direction. Also for purposes of this example, it is assumed that the 25 pixel tile locations in the video frame buffer are evenly distributed within the boundaries of the buffer, although they are not necessarily evenly distributed. Each pixel should consist of a red value, a green value and a blue value 104, which values are typically represented by a total of 24 bits for an eight-bit binary value for each color or three bytes per tile position.

This composite data structure is filled with average pixel values from multiple pixel tile locations of the video buffer. A pixel tile is defined as a generally square pixel area of a video buffer having a size of approximately ten pixels by ten rows of pixels 304. The number of pixel tile positions may typically lie between ten and fifty positions within the video buffer.

These average pixel values 305 are combined with a time code 306 that references the "time of day" from the processor means of the television system. Time of day is defined as the time of one second with a number of small parts in the past since 1 month 1 midnight of 1970, which is a well-established practice in computing systems, especially Unix (or Linux) based systems.

Further, as taught in the original patent application, metadata may be included and defined with the data structure 106, referred to as marker fingerprints, "hints," or "points. Such metadata attributes may be derived from closed caption data from a video program currently being displayed, or it may be keywords extracted by a speech recognition system running within a processor device of the television system that converts video from the corresponding television program into textual information. The text information may then be retrieved for the relevant keywords or sent in its entirety as part of a prompt data structure to a central server device for further processing.

Those hint records 201 are passed to a hash function 202 in fig. 2 that generates a hash value 203 using a location-sensitive hash algorithm based on a probability point location algorithm ("PPLEB") in an equator. This hash value is calculated from the average pixel value from the hint record (fingerprint) 207 and the process associates 206 data with similar values into groups called buckets.

The ten by ten pixel tile 302 shown in this particular example will have one hundred pixels and mathematically average to produce an average pixel value 305 for the red, green and blue values, respectively. Alternatively, any averaging function may be used instead of a simple average.

A plurality of such pixel slices are extracted from the video frame. For example, if 25 such pixel slices were extracted from a video frame, the result would be a point representing a location within a 75-dimensional space. The skilled person will know that such a large search space would require a large amount of computational resources to locate (even approximately) the values in combination with the other 74 values representing one video frame.

The system and method of distance relevance hashing described herein has the advantage of reducing the computational load and improving the accuracy of matching unknown video frames to a database of known video frames.

Creating a hash value involves several steps, starting with calculating an algorithmically derived value for each point, 701-775 shown in fig. 7. A useful means of algorithmically deriving the median value by summing each point of each frame of each program stream or video channel maintained by the matching database over a period of approximately 24 hours was found. The median value for each point is found from this summation. The next step in deriving the final hash value is to subtract the average from the point value of each corresponding point, line 801 minus line 802 equals line 803. The result is a positive or negative value to which a pre-derived hash function is applied. Typically, the result of the point value minus the average value of the corresponding point is arranged in a matrix, to which the dot product is calculated using a similar matrix constituting a pre-derived hash value (or hash key). The result of the dot product of the two matrices is then further transformed to one or zero based on the sign of the product matrix elements. Typically, the technician will set positive values to one and negative values to zero.

The generated hash values point to more or fewer values that are evenly distributed across the data storage area. Hash value 203 may be further partitioned (fig. 2) such that 'n' most significant bits 205 address one of the 2n (2^ n) sectors of the database. The remaining bits 206 address the various 'buckets' of the addressed sector of the database, as will be described in more detail below.

The partition points of the hash values defining the respective sector address spaces are calculated such that the index of the data of the database sector fits within the memory bounds of those processor systems of the memory sector. Otherwise, the database will be subject to paging, which will reduce the effectiveness of this process.

Comparing the systems and methods taught by the present invention with those known to those skilled in the relevant art, FIG. 6 illustrates a typical paging scheme. In FIG. 6, it is assumed that the example system is attempting to match unknown data with servers of known data. The index 602 is used to address only a portion of the data 605 that can fit in the main memory of the CPU 606. This data is searched and if the result is negative, another piece of data is fetched into the main memory 603 and the search continues.

This access approach using paging is common but significantly reduces the efficiency of the computer system. In fact, this approach cannot be used with ACR systems that search large media databases because the read/write speed of the hard disk drive is not sufficient to keep up with the task. Many different algorithmic approaches have been developed over the years to address the problem of splitting a search into smaller parts and distributing smaller searches to multiple computer server systems.

A well-known example might be the fairly large Google (Google) search engine. The skilled person knows that this system is one of the largest computer systems established to date. The speed and accuracy of the google search process is excellent. However, google search approaches are significantly different and simply not suitable for matching unknown media with a database of known media, even if the two databases are of the same very large size. This is because google search approaches employ a map-reduce algorithm designed to search large databases of substantially unrelated data. Although advanced over paged systems, map-reduce is a computationally intensive process that also requires significant data communication bandwidth between the participating computer systems. In contrast, the present invention is efficient in the use of processing resources and communication resources.

In the present invention, the distance associative hash function provides a means of addressing the database by sectors, so that the data of the addressing means fit into the main memory of the individual server devices of the server group. The grouping is accomplished by grouping the distance related data in the multi-dimensional array into the same sector using a distance associative hashing step as a means of implementing the grouping. The sector identification for addressing the data element is calculated from a hash index generated from the process by extracting a subset of the total bits of the hash function and using the subset to address the desired sector in which to store the data in a reference database.

In this way, the hash index subset is an address of a sector (referred to as a bucket in the first invention) containing distances associated with a plurality of hash values. The remainder of the hash address is then used to address the bucket of sectors that is used to store new data. Alternatively, the sector address may be found by hashing the first hash value again.

Such a system and method of database addressing through multiple hash indexing steps results in an efficient database access scheme with significant performance benefits and improved efficiency over the conventional database access methods described above.

Distance associative hashing provides a means to quickly address very complex (multidimensional) databases by finding data that is not a perfect match but is within a predetermined radius (distance associative) of the value sought. Importantly, sometimes this means of addressing will result in no match at all. In the event that the business-oriented database cannot tolerate inaccuracies, the media matching system can easily tolerate a missing match and will simply continue the matching process once the next data received and taught in the first patent arrives. Of course, the arrival of data from unknown sources to be determined by the ACR system is periodic, but may be commanded by the system of the present invention to arrive at different intervals based on requirements for accuracy or by requirements imposed by the state of the system (e.g., when the system may be approaching overload), and then these sending clients are commanded to send lower sampling rates. For example, a typical data reception rate may be an interval of 1/10 seconds.

For a reference media database, a set of pixel values is derived from each video frame from each video source that is to be part of the reference database. Then, the broadcast time of the video program and certain metadata, which is information of the program such as a content Identification (ID), a program title, cast names, a broadcast time, a brief synopsis, etc., are attached to the pixel value group. The metadata is typically obtained from commercial electronic program guide sources.

The array of processed pixel values with the added time code plus metadata is then stored in a reference database, and the address of the stored data is then added to the hash index at the corresponding hash value and sector ID value. In addition, a second database index is created and maintained using the content ID from the metadata as another means for addressing the reference database.

The process of building and continually updating the database is continuous, and the number of days of data maintained by the database is based on the needs of the user but may range, for example, from one day to one month.

The process of identifying unknown video segments from data received from a large number of client devices begins with a procedure similar to that used above for building a reference database. In fig. 3, this program relates to a television monitor 301, such as a popular flat screen HDTV of the type commonly referred to as a smart TV, wherein the TV contains processing means with memory capable of executing applications similar to the type found on smart phones commonly used today. The system of the present invention samples the area 302 of the video frame buffer 301 in a generally large number of locations. The samples have exactly the same size, shape and position as those pixel tiles used in the process of building the reference database. Each of these collected pixel tiles is then algorithmically processed to produce calculated values for the red, green and blue values of each tile in exactly the same way as the method used to create the reference database.

The system of the present invention then calculates the exact same distance relevance hash index of the collected averages as the content ingestion function described above. Also exactly the same as the ingestion system described above, the resulting sector Identification (ID) value is extracted as a subset of the total bits of the hash index. The remainder of the hash index is used to address the desired sector to search for all candidate (potential) matches therein that belong to the same bucket as the unknown data point.

Optionally, if a good guess for a match (a successful match) is available from the above process, the system of the present invention will also use the content ID index to collect candidates from a database responsible for the sectors created during the ingestion process as described above belonging to the potential content ID to address multiple reference cues around the time radius r' of the timestamp (of the successfully matched candidate). Next, as taught in the first patent, duplicate candidates are removed as well as candidates that are too far (radius r) from the unknown point.

To test for a match of an unknown video segment with a reference database of known video data, a list of candidates from the last step is assumed, where each candidate (i.e. each possible match) has associated it with the following data item: content ID, media time, inverse percentage distribution radius calculated as distance from the current unknown point (from the unknown video stream), where 100% represents the exact value of the unknown point and 0% is a value beyond the radius r (distribution) from the unknown point.

Each match candidate 501 is assigned a data structure 502 in the memory of the matching system of the present invention. The data structure consists of, among other things, arbitrary time bins (e.g., about one second) grouped by some arbitrary amount. For purposes of example, assume that the data structure consists of one hundred boxes representing ten second video cue points. The bins are typically not equally spaced in time.

For each candidate found in the reference (matching) database: first, the relative time is calculated by subtracting the candidate time from the arbitrary time of the unknown video. The candidate time is the play time of each video cue associated with the candidate during transmission of the reference program.

Any time of unknown video comes from the moment of the television monitor generated internally from the application of the invention running in the memory of the television or in the set-top box attached to the television and is sent by the application to the central server device of the invention together with the sampled video cue points. Time of day is well known to the skilled person and is commonly used in computer systems. The time is taken as the current number of time units since 1 month 1 day 1970.

For example, if the time difference between any time from the television (in the home) and the real media time is 100 seconds, then the relative time of those actually matching candidates should be close to that value. Also, a candidate that is not a good match is unlikely to have a relative time close to 100 seconds for this example.

In the candidate data structure, when the cue point of the unknown video matches the reference cue point, the system of the present invention adds the token to the corresponding bin of the candidate data structure. The system then repeats the process as described in the preceding paragraph for the next candidate.

Another and important step for scoring the results is to apply a time discount to all bins. This is a relatively simple process that decrements the values in all bins by a small amount for each time period. The skilled person will recognise that this is a "leaky bucket" scoring method. By definition, over multiple cycles of the process, bins that are no longer filled by matching hint points will eventually decrement to zero. Also, bins that slowly fill through random noise in the system will likewise be decremented. Therefore, the time discount eventually empties the bin filled by false positive matches and random noise. The skilled person will also clearly see that without the time counting binning, all bins will eventually fill to capacity and no results can be obtained from the process.

The time discount also decrements to zero any bins (e.g., 503) that have a level above the match threshold 504 when the video stream from the client television monitor is changed in any way by any of the following: change channel, fast rewind, fast forward, pause video, etc.

If any bin of the candidate data structure (e.g., bin 503) is above some threshold 504, then the content is declared a match. A further means of defining a match may include testing for a consistent match of the candidate segment in greater than a determined number of seconds (e.g., three seconds).

Fig. 8 shows how the hash value is calculated. First, the median value for each pixel location contributing to the video fingerprint is found by summing those values for the location over a number of days of time at the location from the acquisition values in a plurality of television channels representing typical television programs to be identified by the present invention. Once the median value is determined, it can be used as a constant indefinitely without further calculation or adjustment. The pixel values sent from the client to the server matching system are first processed by subtracting the median of the pixel locations. The resulting results are stored in a matrix with other pixel locations of the video frame and an appropriate hash function is applied to the matrix. A plurality of hash values are then derived from the generated dot products.

FIG. 9 demonstrates the beneficial result of using the median value of the pixel locations as part of the process of calculating the hash value. Graph 901 shows the resulting curve of the output of a typical unoptimized hash function having a relatively small number of hash values occupying a relatively narrow range on the left side of the curve. The resulting median 902 is lower. Graph 903 shows the advantageous redistribution of hash values resulting from computing the median value for each pixel location participating in the matching process and applying the median value as part of the hash function. The distribution of hash values is more spread out, with a rise in the median of all hash keys 904.

FIG. 9a illustrates what happens to a data set when no median is found before partitioning the data set. If the system samples sixteen pixel locations per video frame and if each pixel location has a red pixel value, a green pixel value, and a blue pixel value, there will be 64 dimensions (or axes) to the graph. For illustration purposes, in this example, the data set includes only two sample points 906 and 908 of a single video frame. Further, this example assumes that only a single luminance value is obtained at each pixel point. By dividing the data set in the diagonal direction 907 into clockwise 907c and counter-clockwise 907cc sectors and the vertical 908 and horizontal 906 axes intersect at zero 905, there are only two of the eight sectors 910 and 911 between the two said pixel locations.

Figure 9b demonstrates the benefit of finding the median value for each pixel location. The present example continues using the assumption that the pixel values are a single luminance value from zero to 255, although the absolute value is not important to the present method. This example demonstrates a simplified assumption that the median value is 128 for both pixel locations. Now, by shifting the partition point to 905 ', the vertical and horizontal axes are shifted to 908 ' and 906 ', respectively. Diagonal slice 909 moves to 909'. It is clear from the diagram that all eight sectors now contain data.

When the data set is partitioned in this manner, the calculated median value is neither necessarily nor necessarily in the middle of the data set. The desired result is to spread the data so that when the data is partitioned and assigned to various servers, the system more uniformly accesses the servers. In contrast, the unoptimized data of FIG. 9 would only see two of the eight accessed servers if partitioned between eight servers as shown. In the method illustrated in fig. 9b, the actual calculation results in the application of 48 median values calculated as a 48-dimensional map, with those color values at each pixel location and with an example of 16 pixel locations. Further, more than one data stitching may be performed around each intermediate point of the 48-dimensional map as needed, thereby allowing the data set produced by the slice to fit within the operating constraints of the individual computer servers of the system. In any case, data will be found most of the time on the clockwise and counterclockwise sides of each partition slice.

FIG. 10 illustrates an operational flow 1000 representing a number of example operations relating to addressing a media database using distance associative hashing. In fig. 10 and in the following figures that include various examples of operational flows, discussion and explanation will be provided with respect to the above examples of fig. 1-9 and/or with respect to other examples and contexts. However, it should be understood that these operational flows may be performed in many other environments and contexts and/or in modified versions of fig. 1-9. Moreover, while the various operational flows are presented in the illustrated sequences, it should be understood that the various operations may be performed in other sequences than the illustrated sequences, or the operations may be performed concurrently.

After the start operation, operational flow 1000 moves to operation 1002. Operation 1002 depicts receiving one or more indications of samples of a video segment. For example, as shown in and/or described with respect to fig. 1-9, these indications may be associated with one or more pixel slices from the ingestion system.

Operation 1004 then depicts determining an algorithmically derived value for at least one tile of the sample of the video segment that includes the one or more pixels of each tile for the at least one tile. For example, as shown in and/or described with respect to fig. 1-9, an average of those red pixels in each tile, those green pixels in each tile, and those blue pixels in each tile may be calculated.

Operation 1006 then depicts subtracting the established midpoint value for each patch from the average value for each patch. For example, as shown in and/or described with respect to fig. 1-9, a median value for each pixel location contributing to a video fingerprint may be found by summing those values for the location over a number of days of acquisition values from multiple television channels at the location.

Operation 1008 then depicts transforming the values resulting from the subtraction using a pre-derived function to evenly distribute the values. For example, as shown in and/or described with respect to fig. 1-9, the values resulting from the subtraction populate the matrix. The dot product of that matrix with a pre-derived static matrix can be calculated. This pre-derived static matrix may be determined prior to instantiating the operational flow 1000 and may be algorithmically optimized based on data ingested in the past such that multiple matrices intersected therewith will produce results that are more evenly distributed than results directly from subtraction operations.

Operation 1010 then depicts constructing a hash value from these transformed values. For example, as shown in and/or described with respect to fig. 1-9, values that can hold RGB values are reduced to a bit form so that the hash value may be a bit string.

Operation 1012 then depicts referencing the most significant bits of the hash value as constructed to determine a database sector. For example, as shown in and/or described with respect to fig. 1-9, a plurality of bits may be predetermined such that the predetermined plurality of bits of the hash value are used to address one or more database sectors.

Operation 1014 then depicts storing at least the hash value on the determined database sector. For example, as shown in and/or described with respect to fig. 1-9, the hash value may be stored in a bucket that includes other hash values that are mathematically proximate, where the hash values are associated with at least a particular video segment and offset.

Fig. 11 illustrates an alternative embodiment of the example operational flow 1000 of fig. 10. FIG. 11 illustrates an example embodiment in which operational flow 1000 may include at least one additional operation. Additional operations may include operation 1102.

Operation 1002 illustrates using one or more processing devices to implement, at least in part, at least one of receiving operation 1002, determining operation 1004, subtracting operation 1006, transforming operation 1008, constructing operation 1010, referencing operation 1012, or storing operation 1014. In some instances, one or more computer processors may be used to implement, at least in part, one of the aforementioned operations. The other processing means may include an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), or any other circuitry configured to achieve the results of at least one of the foregoing operations.

Fig. 12 illustrates an alternative embodiment of the example operational flow 1000 of fig. 10. FIG. 12 illustrates an example embodiment, wherein operation 1002 may include at least one additional operation. Additional operations may include operation 1202 and/or operation 1204.

Operation 1202 shows receiving one or more indications of at least one of a frame or a still image. For example, as shown in and/or described with respect to fig. 1-9, samples of a video clip may comprise individual frames of a video stream. Such a frame may be a 30fps video frame. In different embodiments, the sample of the video clip may be a still image or a portion of the video clip that may be imaged at a rate other than 30 times per second.

Further, operation 1204 illustrates receiving one or more indications of samples of a video segment, the one or more indications of samples of a video segment associated with at least one indication of a channel, at least one indication of a video segment, and at least one indication of a time code offset from a start of the video segment. For example, as shown in and/or described with respect to fig. 1-9, data associated with a video clip (which may be a program title and/or other metadata associated with the video clip), a channel from which the program was ingested, and a time offset from the beginning of the program may be received from, for example, a channel guide associated with the channel being monitored by the ingestion system.

Fig. 13 illustrates an alternative embodiment of the example operational flow 1000 of fig. 10. FIG. 13 illustrates an example embodiment wherein operation 1004 may include at least one additional operation 1302.

Operation 1302 illustrates determining an average value of the one or more pixels of each tile for at least one tile of the sample of a video segment that includes the one or more pixels of the at least one tile. For example, as shown in and/or described with respect to fig. 1-9, the algorithmic operation used to reduce the one or more pixels in a tile to a single value may be, for example, an arithmetic average.

Fig. 14 illustrates an alternative embodiment of the example operational flow 1000 of fig. 10. FIG. 14 illustrates an example embodiment, wherein operation 1006 may include at least one additional operation 1402.

Operation 1402 illustrates subtracting the established midpoint value for each tile, which has been previously determined using data from each tile over at least one time period for a plurality of frequency channels, from the average value for each tile. For example, as shown in and/or described with respect to fig. 1-9, a median value may be determined for each shard, where the median value is established for the same shard at ingestion as in the operation of determining the segments on the client system as a constant value derived from monitoring the same shard across many channels over a long period of time (one month, one year, etc.).

Fig. 15 illustrates an alternative embodiment of the example operational flow 1000 of fig. 10. FIG. 15 illustrates an example embodiment, wherein operation 1008 may include at least one additional operation. Additional operations may include operation 1502, operation 1504, and/or operation 1506.

Operation 1502 illustrates forming a variable matrix that includes at least the values resulting from the subtraction. For example, as shown in and/or described with respect to fig. 1-9, the values are arranged in a matrix, the values resulting from a subtraction operation that subtracts a median value established over time for each slice from the average of the instant frames being ingested.

Operation 1504 shows obtaining a static matrix that, when intersected by the variable matrix, will more evenly distribute the transformed values. For example, as shown in and/or described with respect to fig. 1-9, the matrix may be determined based on a mathematical analysis of a previously obtained data set with respect to hash values. The matrix may be mathematically optimized such that when used as an operand in a dot product operation with a continuous variable matrix, the corresponding continuous result matrix will include a plurality of values that are more evenly distributed along the distribution curve than the variable matrix prior to the dot product operation.

Operation 1506 illustrates calculating a dot product of the variable matrix and the static matrix, the dot product including at least the more evenly distributed transformed values. For example, as shown in and/or described with respect to fig. 1-9, a variable matrix containing a plurality of values resulting from the subtraction operation may be intersected with a static matrix that has been predetermined to more evenly distribute the data represented by the variable matrix, such that the resulting matrix is more spread apart rather than bunched around a particular portion of the distribution.

Fig. 16 illustrates an alternative embodiment of the example operational flow 1000 of fig. 10. FIG. 16 illustrates an example embodiment where operation 1504 may include at least one additional operation 1602.

Operation 1602 shows determining a static matrix that will more evenly distribute the transformed values of the variable matrix when intersecting the variable matrix, based at least in part on one or more previously obtained hash values using location-sensitive hashing. For example, as shown in and/or described with respect to fig. 1-9, previously captured video samples may be analyzed using a location-sensitive hashing technique to generate a matrix such that, when used as an operand in a dot product operation with a continuous variable matrix, the corresponding continuous result matrix will include a plurality of values that are more evenly distributed along the distribution curve than the variable matrix prior to the dot product operation.

Fig. 17 illustrates an alternative embodiment of the example operational flow 1000 of fig. 10. FIG. 17 illustrates an example embodiment, wherein operation 1010 may include at least one additional operation. Additional operations may include operation 1702 and/or operation 1704.

Operation 1702 illustrates constructing hash values from the transformed values, including reducing the fidelity of the transformed values by at least reducing each transformed value to a binary representation. For example, as shown in and/or described with respect to fig. 1-9, each value of the result matrix from the dot-product operation may be reduced from an 8-bit value, e.g., from 0 to 255 (or from-127 to 128), to a single bit, either a one or a zero.

Operation 1702 may include operation 1704. Operation 1704 illustrates determining, for each transformed value, whether the transformed value is a positive number and assigning a one to the hash value if the transformed value is a positive number and assigning a zero to the hash value otherwise. For example, as shown in and/or described with respect to fig. 1-9, each value between 1 and 128 of the result matrix from the dot product operation may be reduced to a bit value of 1, and each value between-127 and 0 of the result matrix from the dot product operation may be reduced to a bit value of 0.

Fig. 18 illustrates an alternative embodiment of the example operational flow 1000 of fig. 10. FIG. 18 illustrates an example embodiment, wherein operation 1012 may include at least one additional operation 1802.

Operation 1802 illustrates referencing a plurality of most significant bits of the constructed hash value to determine a database server, wherein the plurality of most significant bits are predetermined to address a plurality of database servers, wherein the plurality of database servers associated with the plurality of most significant bits are established such that at least one index associated with a database sector can reside entirely in a memory of the respective database server. For example, as shown in and/or described with respect to fig. 1-9, a number of most significant bits of the 2 bits may be selected, whereby the 2 bits may provide four different values (00, 01, 10, and 11), each of which may be assigned to a different database sector. The plurality of most significant bits of the hash value may be established to provide enough servers so that the contents associated with the plurality of hash values may fit completely within the memory of a particular database sector, which may be a database server, a cluster partner, a virtual machine, and/or another type of database node. The number of bits does not necessarily, but may exactly, represent the maximum number of database sectors at any given time (i.e., the system may operate with fewer servers (e.g., 60 sectors) or with a maximum of 64 sectors when 6 bits may be selected to provide addressing of up to 64 database sectors).

Fig. 19 illustrates an alternative embodiment of the example operational flow 1000 of fig. 10. Fig. 19 illustrates an example embodiment, wherein operation 1014 may include at least one additional operation 1902.

Operation 1902 illustrates storing at least the hash value on the determined database sector, including storing at least one indication of a channel, at least one indication of a video segment, and at least one indication of a time code offset from a start of the video segment at a database location based at least in part on the hash value. For example, as shown in and/or described with respect to fig. 1-9, data associated with a video clip (which may be a program title and/or other metadata associated with the video clip), a channel from which the program was ingested, and a time offset from the start of the program may be stored or stored with the hash value in a location associated with and/or referenceable by the hash value, which storage may be in the same or different sector, server, or database as the hash value.

Fig. 20 illustrates an operational flow 2000 representative of a number of example operations relating to addressing a media database using distance associative hashing. In fig. 20 and in the following figures that include various examples of operational flows, discussion and explanation will be provided with respect to the above examples of fig. 1-9 and/or with respect to other examples and contexts. However, it should be understood that these operational flows may be performed in many other environments and contexts and/or in modified versions of fig. 1-9. Moreover, while the various operational flows are presented in the illustrated sequences, it should be understood that the various operations may be performed in other sequences than the illustrated sequences, or the operations may be performed concurrently.

After the start operation, the operational flow 2000 moves to operation 2002. Operation 2002 depicts: a hint is received, the hint being constructed by one or more operations associated with the media storage operation. For example, as shown in and/or described with respect to fig. 1-9, at least some data associated with a sample from video data taken from a particular client system is received. This data may be associated with just the same shard of the client system as defined by the ingestion operation. The data may be processed algorithmically to obtain hash values using the same operations as the ingest operation. Accordingly, if a particular frame associated with a particular time offset for a particular program on a particular channel is ingested and hashed, resulting in a hash value associated with that particular frame, if that particular frame is also sampled while displayed on the client system, the same hash operation as applied to the ingested frame will result in the same hash value as produced by the hash operation on the ingested frame. But in contrast to the hash value prepared during ingestion, the prompt of operation 2002 represents data associated with a sample of video data from a particular client system. The prompt may be received via, for example, an HTTP request.

Operation 2004 then depicts referencing the most significant bits of the hint received to determine a database sector. For example, as shown in and/or described with respect to fig. 1-9, the same hint bits defined by the plurality of most significant bits used to reference the database sector during ingestion are checked. For example, if the first two bits of the hash value at ingestion are used to store the hash value at a particular database sector, the same first two bits of the hint associated with the sample of video data from the client system are used to address the particular database sector.

Operation 2006 then depicts returning at least one indication of at least one candidate item from the database sector based at least in part on the received hint. For example, as shown in and/or described with respect to fig. 1-9, hash values that exactly match or are near the hint are returned as one or more of suspicion terms or candidates. Candidates may be returned within a specific percentage radius. The candidates may be returned according to a nearest neighbor algorithm or a modified nearest neighbor algorithm.

Fig. 21 illustrates an alternative embodiment of the example operational flow 2000 of fig. 20. FIG. 21 illustrates an example embodiment where operation 2002 may include at least one additional operation. Additional operations may include operation 2102, operation 2104, and/or operation 2106.

Operation 2102 illustrates receiving a prompt associated with a sample of a video buffer of a client system, including receiving at least one or more indications related to a time of day associated with the sample of the video buffer of the client system. For example, as shown in and/or described with respect to fig. 1-9, the cue may include or be associated with a time offset from any time. For example, the time offset may be calculated from 1/1970.

Operation 2104 illustrates: a hint is received, the hint associated with a sample of a video buffer of a client system, the hint determined at least in part by hashing at least some of the values associated with the video buffer. For example, as shown in and/or described with respect to fig. 1-9, multiple slices associated with a video buffer may be reduced to a bit string by one or more mathematical operations or algorithms using one or more operands as constants that are pre-derived by, for example, the operations described elsewhere herein with respect to hashing.

Operation 2016 shows: receiving a hint, the hint associated with a sample of a video buffer of a client system, the hint determined at least in part by hashing at least some values associated with the video buffer, the hashing based at least in part on one or more of at least one operand or at least one algorithm also used in an associated media storage operation. For example, as shown in and/or described with respect to fig. 1-9, at least some of the data associated with the samples of the video buffer representing what the television screen displays at a particular time quantum is processed by operations utilized by the ingest operations and/or utilized in conjunction with data locations common to the ingest process and/or involving constant values for operands utilized by the ingest process. For example, the number of slices analyzed at ingestion may also be used to provide hints associated with a particular client system. The size of the pixel tiles analyzed at the time of ingestion may also be used to provide hints associated with a particular client system. The same pre-derived static matrix used to more evenly distribute hash values at ingestion may also be used during hashing of data associated with a particular client system.

Fig. 22 illustrates an alternative embodiment of the example operational flow 2000 of fig. 20. FIG. 22 illustrates an example embodiment where operation 2002 may include at least one additional operation. Additional operations may include operation 2202, operation 2204, operation 2206, operation 2208, operation 2210, operation 2212, and/or operation 2214.

Operation 2202 illustrates receiving one or more indications of at least one item of content of a video buffer of a client system. For example, as shown in and/or described with respect to fig. 1-9, the pixel values of the red, green, and blue pixels at each pixel location at each predefined tile of the video buffer for the client system may be read for each frame, or for every three frames, or for every ten frames, or for every second, or at some other interval. These indications (pixel values or other data) may be received by a control on the television, by control logic on the television, by a system coupled with the media server, or elsewhere.

Operation 2204 illustrates determining an algorithmically derived value of at least one pixel of each slice for at least one slice of the at least one content of the video buffer comprising the at least one pixel of the at least one slice. For example, as shown in and/or described with respect to fig. 1-9, the pixel values of the red, green, and blue pixels at each pixel location at each predefined tile of the video buffer for the client system may be averaged.

Operation 2206 illustrates subtracting the midpoint value from the average value for each slice. For example, as shown in and/or described with respect to fig. 1-9, a midpoint value at each shard established through analysis of ingested content is determined. Once determined by the system associated with the media database and the ingestion system, the intermediate point value for each shard may be provided, for example, to the client system. These intermediate point values may be updated from time to time (hourly, daily, monthly, yearly). These intermediate point values provided for hashing data associated with the video buffer of the client system may be the same intermediate point value used to hash the incoming content at ingestion.

Operation 2208 illustrates transforming the values resulting from the subtraction. For example, as shown in and/or described with respect to fig. 1-9, the values resulting from the subtraction are populated into a matrix and are interleaved with a predefined static matrix. The dot product operation of intersecting the two matrices may be performed at the client system during the process of converting pixel tile data associated with frames within the video buffer into hints, such that hints are sent in HTTP requests instead of actual pixel tile data, resulting in compact HTTP messages. The predefined static matrix may be provided to the client system prior to transformation and may be the same matrix that is generated to more evenly distribute the plurality of hashed values upon ingestion. The predefined static matrix may be updated at the client system from time to time. Alternatively, the sharded data (with or without other metadata) may be sent from the client system (e.g., television) to a different system for processing and/or hashing.

Operation 2210 shows constructing a hash value from the transformed values. For example, as shown in and/or described with respect to fig. 1-9, those values in a matrix resulting from crossing a matrix having values associated with a video buffer with a pre-derived static matrix may be reduced to a plurality of bits, with a single bit replacing each 8-bit value in the matrix. In other embodiments, the constructed hash value may include a different number of bits for each value in the matrix. In different embodiments, the constructed hash value may have the same number of bits as those in the matrix, or may be a direct representation of those values in the matrix.

Operation 2212 illustrates associating the hint at least in part with the constructed hash value. For example, as shown in and/or described with respect to fig. 1-9, the constructed bit string from the transformed matrix may be a prompt, or the constructed bit string may be associated with a time (e.g., a time of day) to form a prompt, or other data (e.g., an IP address or other identifier associated with the client television or a control of the client television) may be associated to form a prompt. Alternatively, the prompt may include or otherwise be associated with any other metadata associated with the audiovisual content at the client system.

Operation 2214 illustrates that at least one of the determining operation 2204, the subtracting operation 2206, the transforming operation 2208, or the constructing operation 2210 utilizes one or more of at least one operand or at least one algorithm also used in the associated media storage operation. For example, as shown in and/or described with respect to fig. 1-9, one or more parameters including one or more of a definition of the number of pixel slices, a definition of the size of a pixel slice, a predefined median associated with a pixel slice, or a predefined static matrix may be provided to the client TV, which one or more parameters are also utilized by the ingest process such that multiple operations applied to samples from the video buffer will result in the same hash value as that produced when that frame (e.g., the same video segment and time offset) is ingested and hashed.

Fig. 23 illustrates an alternative embodiment of the example operational flow 2000 of fig. 20. FIG. 23 illustrates an example embodiment where operation 2006 may include at least one additional operation. Additional operations may include operation 2302 and/or operation 2304.

Operation 2302 demonstrates returning at least one indication of at least one candidate from the database sector based, at least in part, on a probability point location in an equator ("PPLEB") algorithm as a function of the received hint. For example, as shown in and/or described with respect to fig. 1-9, at least one of the candidate items or suspect items representing waypoints proximate to the cue (e.g., neighbors, nearest neighbors, within a radius, from within the same locker, belonging to the same ring, etc.) are returned from the media database constructed and/or modified by the ingestion process.

Operation 2304 illustrates: returning, based at least in part on the received hint, at least one indication of at least one candidate from the database sector, the at least one candidate being within a predetermined inverse percentage distribution radius of the received hint. For example, as shown in and/or described with respect to fig. 1-9, at least one candidate or suspect item associated with location sensitive hashing with respect to at least one of a hint and a hash value is returned.

Fig. 24 illustrates an operational flow 2400 representative of a number of example operations relating to addressing a media database using distance associative hashing. In fig. 24 and in the following figures that include various examples of operational flows, discussion and explanation will be provided with respect to the above examples of fig. 1-9 and/or with respect to other examples and contexts. However, it should be understood that these operational flows may be performed in many other environments and contexts and/or in modified versions of fig. 1-9. Moreover, while the various operational flows are presented in the illustrated sequences, it should be understood that the various operations may be performed in other sequences than the illustrated sequences, or the operations may be performed concurrently.

After beginning operations, operational flow 2400 moves to operation 2402. Operation 2402 depicts receiving at least one indication of at least one candidate item and at least one indication of at least one prompt. For example, as shown in and/or described with respect to fig. 1-9, a hash value relating to a video buffer of a client system is determined along with one or more associated candidate or suspect terms.

Operation 2404 then depicts adding a token to the bin associated with the at least one received candidate. For example, as shown in and/or described with respect to fig. 1-9, scoring the candidates is performed by adding a token, e.g., a value that increments each time the token is added by one, to the bin corresponding to the candidate/suspect item.

Then, operation 2406 depicts: determining whether a number of tokens within a bin exceeds a value associated with a probability that the client system is displaying a particular video segment associated with at least one prompt, and returning at least some data associated with the particular video segment associated with the at least one prompt based at least in part on the bin if the number of tokens within the bin exceeds the value associated with the probability that the client system is displaying the particular video segment. For example, as shown in and/or described with respect to fig. 1-9, the determination of a particular video segment and a particular offset for the video segment is probabilistically determined by the score associated with the bins.

Fig. 25 illustrates an alternative embodiment of the example operational flow 2400 of fig. 24. Fig. 25 illustrates an example embodiment, wherein operation 2404 may include at least one additional operation 2502.

Operation 2502 illustrates adding a token to a time box associated with at least one received candidate item. For example, as shown in and/or described with respect to fig. 1-9, the data structure associated with the candidate/suspect item may include arbitrary time bins grouped by arbitrary time.

Fig. 26 illustrates an alternative embodiment of the example operational flow 2400 of fig. 20. FIG. 26 illustrates an example embodiment, wherein operation 2404 may include at least one additional operation. Additional operations may include operation 2602 and/or operation 2604. Further, the operational flow 2400 may include at least one additional operation 2606.

Operation 2602 illustrates: determining a relative time includes subtracting a candidate time associated with the at least one candidate from at least an arbitrary time associated with the at least one prompt. For example, as shown in and/or described with respect to fig. 1-9, the time offset of the video clip associated with the candidate is subtracted from any time associated with the moment of the prompt received from the client system (television, set-top box, or item, machine, or composition displaying and/or providing and/or receiving video content).

Operation 2604 illustrates adding a token to the time bin associated with the candidate item based at least in part on the determined relative time. For example, as shown in and/or described with respect to fig. 1-9, when a cue point associated with a client system matches or nearly matches a reference cue point associated with a media database, a token may be added to the bin, which may include adding one to a value associated with the bin or other means of tracking the operation of the bin.

Operation 2606 illustrates removing one or more tokens from the time bin based at least in part on the elapsed time period. For example, as shown in and/or described with respect to fig. 1-9, a bin may be compromised such that data and/or tokens associated with old suspect items/candidates may be released from the bin, which may include decrementing a value associated with the bin or other means of tracking bin operation by one.

In different embodiments, a pixel location may relate to one or many colors and/or color spaces/models (e.g., red, blue, green; red, blue, green, and yellow; cyan, magenta, yellow, and black; a single pixel value uniquely identifies a color, e.g., a 24-bit value associated with the pixel location; hue, saturation, brightness; etc.). Different numbers of pixels in a tile may be used and the tile is not necessarily a square tile. Further, the resolution of the video buffer of the client system may vary. The resolution and/or color density at the client system and the ingestion system may vary. The system may operate at different raster resolutions including, but not limited to, 1920 by 1080, 3840 by 2160, 1440 by 1080, 1366 by 768, or other resolutions. It is expected that in the next twenty years, an increase in pixel resolution of common programs, televisions, and/or client systems will occur; although the number, size, sampling rate, or other aspects of pixel tiles may vary, the same basic operation may be utilized. Further, upconversion, downconversion, or other conversion operations associated with resolution and/or color density may occur and/or be interposed between other operations described herein.

Fig. 27 illustrates an example system 2700 in which embodiments can be implemented. System 2700 includes one or more computing devices 2702. The system 2700 also illustrates a framework 2704 for facilitating communication between the one or more computing devices and the one or more client devices 2706. The system 2700 includes one or more client devices 2706 that are also illustrated. In some embodiments, the one or more client devices may be between the one or more computing devices. System 2700 also illustrates at least one non-transitory computer-readable medium 2708. In some embodiments 2708 may include one or more instructions 2710 that, when executed on at least some of the one or more computing devices, cause the at least some of the one or more computing devices to at least: receiving at least one rasterized video stream; creating at least one hash value associated with at least one sample of at least one received rasterized video stream; determining at least one database sector for storing the created at least one hash value; and storing the created at least one hash value on the determined at least one database sector. In various embodiments, the one or more instructions may be executed on a single computing device. In other embodiments, some portions of the one or more instructions may be executed by a first plurality of the one or more computing devices, while other portions of the one or more instructions may be executed by a second plurality of the one or more computing devices.

FIG. 28 illustrates an example system 2800 in which embodiments may be implemented. System 2800 includes one or more computing devices 2802. The system 2800 also illustrates a framework 2804 for facilitating communication between one or more computing devices and one or more client devices 2806. The system 2800 includes one or more client devices 2806 that are also illustrated. In some embodiments, the one or more client devices may be between the one or more computing devices. System 2800 also illustrates at least one non-transitory computer-readable medium 2808. In some embodiments, 2808 may include one or more instructions 2810 that, when executed on at least some of the one or more computing devices, cause at least some of the one or more computing devices to at least: receiving one or more instructions associated with at least one video buffer of at least one client system; determining a hint based at least in part on the at least one video buffer and at least one time instance associated with the at least one video buffer, wherein one or more of at least one operand or at least one function associated with determining the hint is also used in an associated media storage operation; referencing a plurality of most significant bits of the determined hint to determine a database sector; and returning at least one indication of at least one candidate from the determined database sector based at least in part on the determined hint. In various embodiments, the one or more instructions may be executed on a single computing device. In other embodiments, some portions of the one or more instructions may be executed by a first plurality of the one or more computing devices, while other portions of the one or more instructions may be executed by a second plurality of the one or more computing devices.

Fig. 29 illustrates an example system 2900 in which embodiments may be implemented. System 2900 includes one or more computing devices 2902. System 2900 also presents a framework 2904 for facilitating communication between one or more computing devices and one or more client devices 2906. System 2900 includes one or more client devices 2906 are also illustrated. In some embodiments, the one or more client devices may be between the one or more computing devices. System 2900 also illustrates at least one non-transitory computer-readable medium 2908. In some embodiments, 2908 may include one or more instructions 2910 that, when executed on at least some of the one or more computing devices, cause at least some of the one or more computing devices to at least: receiving at least one indication of at least one candidate item and at least one indication of at least one prompt; adding a token to a bin associated with at least one received candidate; and determining whether a number of tokens within the bin exceeds a value associated with a probability that the client system is receiving a particular video segment associated with the received at least one prompt, and returning at least some data associated with the particular video segment based at least in part on the bin if the number of tokens within the bin exceeds a value associated with a probability that the client system is receiving the particular video segment associated with the received at least one prompt. In various embodiments, the one or more instructions may be executed on a single computing device. In other embodiments, some portions of the one or more instructions may be executed by a first plurality of the one or more computing devices, while other portions of the one or more instructions may be executed by a second plurality of the one or more computing devices.

Certain aspects of the present invention include process steps and instructions described herein in the form of an algorithm. It should be noted that the process steps and instructions of the present invention could be embodied in software, firmware or hardware, and when embodied in software, could be downloaded to reside on and be operated from different platforms used by real time network operating systems.

The present invention also relates to an apparatus for performing the operations herein. This apparatus may be specially constructed for the required purposes, or it may comprise a general-purpose computer selectively activated or reconfigured by a computer program stored in the computer. Such a computer program may be stored in a computer readable storage medium, such as, but is not limited to, any type of disk including floppy disks, optical disks, CD-ROMs, magnetic-optical disks, read-only memories (ROMs), Random Access Memories (RAMs), EPROMs, EEPROMs, magnetic or optical cards, Application Specific Integrated Circuits (ASICs), or any type of media suitable for storing electronic instructions, and each coupled to a computer system bus.

In addition, the computers or computing devices referred to in this specification may include a single processor or may employ a multi-processor design for increased computing capability.

The algorithms and displays presented herein are not inherently related to any particular computer or other apparatus. Various general-purpose systems may also be used with programs in accordance with the teachings herein, or it may prove convenient to construct more specialized apparatus to perform the required method steps. The required structure for a variety of these systems will appear from the description above. In addition, the present invention is not described with reference to any particular programming language or operating system. It will be appreciated that a variety of programming languages and operating systems may be used to implement the teachings of the invention as described herein.

The systems and methods, flow charts and block diagrams described in this specification may be implemented in a computer processing system that includes program code comprising program instructions that are executable by the computer processing system. Other implementations may also be used. Further, the flow diagrams and structural block diagrams described herein describe specific methods and/or corresponding acts in support of various steps and corresponding functions in support of the disclosed structural means and can also be used to implement corresponding software structures and algorithms, and equivalents thereof.

Embodiments of the subject matter disclosed in this specification can be implemented as one or more computer program products, i.e., one or more modules of computer program instructions encoded on a tangible program carrier for execution by, or to control the operation of, data processing apparatus. The computer storage medium may be a machine-readable storage device, a machine-readable storage substrate, a memory device, or a combination of one or more of them.

A computer program (also known as a program, software application, script, or code) can be written in any form of programming language, including compiled or interpreted languages, or declarative or procedural languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment. A computer program does not necessarily correspond to a file in a file system. A program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document), in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub programs, or portions of code). A computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a suitable communication network.

The processes or logic flows described in this specification can be performed by one or more programmable processors executing one or more computer programs to perform functions that operate on input data and generate output. These processes or logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application-specific integrated circuit).

The essential elements of a computer are a processor for executing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more storage devices for storing data, e.g., magnetic, magneto-optical disks, or optical disks. However, a computer need not have such a device. Processors suitable for the execution of a computer program include, by way of example and not limitation, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and data from a read-only memory or a random access memory or both.

To provide for user or administrator interaction with the systems described herein, embodiments of the subject matter described in this specification can be implemented on a computer having a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user and a keyboard (e.g., a mouse or a trackball) and a pointing device by which the user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user. For example, feedback provided to the user can be any form of sensory feedback, such as visual feedback, auditory feedback, or tactile feedback; and input from the user may be received in any form, including acoustic, speech, or tactile input.

Embodiments of the subject matter described in this specification can be implemented in a computing system that includes a back-end component, which includes one or more data servers, or that includes a front-end component, such as a client computer, that includes one or more middleware components, such as an application server, or that includes a graphical user interface or a Web browser through which a user or administrator can interact with some implementations of the subject matter described in this specification, or any combination of one or more such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication, e.g., a communication network. The computing system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.

While this specification contains many specific implementation details, these should not be construed as limitations on the scope of any inventions or of what may be claimed, but rather as descriptions of features that may be specific to particular embodiments of particular inventions. Certain features that are described in this specification in the context of separate embodiments can also be implemented in combination in a single embodiment.

Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or variation of a subcombination.

Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In some cases, multitasking and parallel processing may be advantageous. Moreover, the separation of various system components in the embodiments described above should not be understood as requiring such separation in all embodiments, and it should be understood that the described program components and systems can generally be integrated together in a single software product or packaged into multiple software products.

This written description sets forth the best mode of the invention and provides examples to describe the invention and to enable a person of ordinary skill in the art to make and use the invention. This written description does not limit the invention to the precise terms set forth. Thus, while the invention has been described in detail with reference to the examples set forth above, those of ordinary skill in the art may effect alterations, modifications and variations to these examples without departing from the scope of the invention.

Claims

1. A system, comprising:

one or more processors;

one or more non-transitory machine-readable storage media embodying instructions that, when executed on the one or more processors, cause the one or more processors to:

receiving a rasterized video stream;

creating a hash value associated with the rasterized video stream, wherein the hash value is created by performing a hash function on pixel values associated with samples of the rasterized video stream, wherein the hash value comprises a first plurality of bits and a second plurality of bits, wherein the first plurality of bits are predetermined to correspond to a plurality of media database sectors of a media database, and wherein the second plurality of bits are predetermined to correspond to a plurality of buckets in one or more of the plurality of media database sectors;

determining a media database sector of the media database for storing the created hash value, wherein the database media sector is determined by referencing the first plurality of bits of the created hash value; and

the created hash value is stored in the determined media database sector.

2. The system of claim 1, further comprising instructions that when executed on the one or more processors cause the one or more processors to:

facilitating identification of an unknown video segment client device, wherein the identification is performed using pixel data received from the client device and the created hash value stored in the determined media database sector.

3. The system of claim 1, wherein creating the hash value comprises:

for a tile of the sample of the rasterized video stream, determining algorithmically-derived values associated with one or more pixels of the tile; and

creating the hash value using the determined algorithmically derived value.

4. The system of claim 1, wherein creating the hash value comprises:

subtracting the midpoint value established for the tile from the algorithmically derived values associated with the tile of the samples of the rasterized video stream; and

creating the hash value using the value resulting from the subtraction.

5. The system of claim 4, wherein creating the hash value comprises:

transforming the value resulting from the subtraction using a pre-derived function to evenly distribute the value resulting from the subtraction and one or more other values; and

creating the hash value from the transformed value.

6. The system of claim 1, wherein creating the hash value comprises:

transforming a value resulting from subtracting a midpoint value established for a slice from an average value of the slice, the transforming using a pre-derived function to evenly distribute the resulting value and one or more other values; and

creating the hash value from the transformed value.

7. The system of claim 1, wherein the hash values are created using evenly distributed values of tiles derived from the samples of the rasterized video stream.

8. The system of claim 7, wherein the hash values are created using the evenly distributed values of the shards and data related to the shard values over time.

9. The system of claim 1, wherein creating the hash value comprises:

for a tile of the samples of the rasterized video stream, determining an algorithmically-derived value for one or more pixels of the tile;

subtracting the established midpoint values for the slices from the algorithmically derived values;

transforming the value resulting from the subtraction using a pre-derived function, thereby evenly distributing the value resulting from the subtraction and one or more other values; and

creating the hash value from the transformed value.

10. The system of claim 1, wherein the hash value is created using the following elements: one or more mean values associated with one or more pixels associated with a tile of the sample, one or more midpoint values associated with pixel data associated with the tile over time, and a pre-derived transformation matrix.

11. The system of claim 1, wherein the hash function comprises distance correlation hashing.

12. The system of claim 1, wherein referencing the first plurality of bits of the created hash value comprises: a number of most significant bits of the created hash value are referenced.

13. The system of claim 1, wherein the plurality of media database sectors associated with the first plurality of bits are established such that an index associated with a database sector can reside entirely in memory of the media database sector, such that paging is not required when accessing the media database.

14. A computer-implemented method, comprising:

receiving a rasterized video stream;

the created hash value is stored in the determined media database sector.

15. The method of claim 14, further comprising:

facilitating identification of an unknown video segment associated with a client device, wherein the identification is performed using pixel data received from the client device and the created hash value stored in the determined media database sector.

16. The method of claim 14, wherein creating the hash value comprises:

creating the hash value using the determined algorithmically derived value.

17. The method of claim 14, wherein creating the hash value comprises:

creating the hash value using the value resulting from the subtraction;

creating the hash value from the transformed value.

18. A computer program product, tangibly embodied in a non-transitory machine-readable storage medium of a computing device, comprising instructions configured to cause one or more data processors to:

receiving a rasterized video stream;

the created hash value is stored in the determined media database sector.

19. The computer program product of claim 18, the instructions further configured to cause one or more data processors to:

20. The computer program product of claim 18, wherein creating the hash value comprises:

creating the hash value using the value resulting from the subtraction;

creating the hash value from the transformed value.