WO2017149447A1 - A system and method for providing real time media recommendations based on audio-visual analytics - Google Patents
A system and method for providing real time media recommendations based on audio-visual analytics
- Publication number
- WO2017149447A1 (application PCT/IB2017/051160)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- user
- real time
- audio
- metadata
- group
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/81—Monomedia components thereof
- H04N21/8106—Monomedia components thereof involving special audio data, e.g. different tracks for different languages
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/783—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04H—BROADCAST COMMUNICATION
- H04H60/00—Arrangements for broadcast applications with a direct linking to broadcast information or broadcast space-time; Broadcast-related systems
- H04H60/56—Arrangements characterised by components specially adapted for monitoring, identification or recognition covered by groups H04H60/29-H04H60/54
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04H—BROADCAST COMMUNICATION
- H04H60/00—Arrangements for broadcast applications with a direct linking to broadcast information or broadcast space-time; Broadcast-related systems
- H04H60/61—Arrangements for services using the result of monitoring, identification or recognition covered by groups H04H60/29-H04H60/54
- H04H60/66—Arrangements for services using the result of monitoring, identification or recognition covered by groups H04H60/29-H04H60/54 for using the result on distributors' side
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/41—Structure of client; Structure of client peripherals
- H04N21/422—Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
- H04N21/42203—Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS] sound input device, e.g. microphone
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/45—Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
- H04N21/466—Learning process for intelligent management, e.g. learning user preferences for recommending movies
- H04N21/4668—Learning process for intelligent management, e.g. learning user preferences for recommending movies for recommending content, e.g. movies
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/47—End-user applications
- H04N21/482—End-user interface for program selection
- H04N21/4825—End-user interface for program selection using a list of items to be played back in a given order, e.g. playlists
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04H—BROADCAST COMMUNICATION
- H04H60/00—Arrangements for broadcast applications with a direct linking to broadcast information or broadcast space-time; Broadcast-related systems
- H04H60/35—Arrangements for identifying or recognising characteristics with a direct linkage to broadcast information or to broadcast space-time, e.g. for identifying broadcast stations or for identifying users
- H04H60/46—Arrangements for identifying or recognising characteristics with a direct linkage to broadcast information or to broadcast space-time, e.g. for identifying broadcast stations or for identifying users for recognising users' preferences
Definitions
- the present invention relates to a system and method for providing real time media recommendations and more specifically relates to a system for providing real time media recommendations using audio-visual analytics conducted at frame level optionally enhanced based on user profile/user viewing history and/or other metrics.
- a system and method for providing dynamic video segment recommendation based on the current playback location of the video was introduced.
- the term dynamic as disclosed in '295 means that the media recommendation changes as the scene changes.
- the system as disclosed in '295 generates recommendations only after analyzing the video content completely. Further, generating the recommendation is a one-time process and may not necessarily be repeated for each and every user. Further, the recommendations appear either as an advertisement in session or as a separate list of interactive options.
- the system as disclosed in '295 does not provide effective media recommendations for the video content which has no prior recorded information.
- the present invention relates to a system for providing real time media recommendations to the user for a video at frame level that is being watched by a user.
- the system of the present invention comprises a real time recommendation engine to generate metadata based on the currently broadcasted or streamed video content to the user.
- the real time recommendation engine uses the generated metadata to identify media content that is relevant to the current video frame watched by the user so as to provide real time dynamic media recommendations to a video consumption device.
- the system saves the metadata generated in a data store for future references.
- the real time recommendation engine generates metadata based on the video content currently broadcast or streamed to the user, using an audio analytics module and a visual analytics module. Further, the real time recommendation engine uses the generated metadata to search for similar media content so as to provide real time contextual media recommendations to the user.
- In accordance with one embodiment of the present invention, the real time recommendation engine checks whether the video feed has been previously broadcast before generating metadata, and if so uses the saved metadata to make recommendations for the previously broadcast video feed. Further, the recommendations may also appear as a part of the currently viewed audio/video feed.
- the real time recommendation engine provides real time contextual media recommendations based on the user's past viewing history, user preferences and/or metadata generated using the audio analytics module and visual analytics module.
- the method for providing real time media recommendations to the user or a group of users for a video at frame level comprises the steps of receiving a video feed indicating video content currently broadcast or streamed to the user or the group of users, and generating metadata based on the video frame currently viewed by the user, using audio-visual analytics techniques. The method then uses the generated metadata to identify relevant media that is similar to the current video frame viewed by the user so as to provide real time dynamic media recommendations to the user/group of users.
- the system and method of the present invention overcome the drawbacks of the prior art by continuously providing dynamic real time media recommendations for both recorded and live video feeds for the user/group of users. Further, the system may also provide recommendations as a part of the currently viewed audio/video feed.
- FIG 1 illustrates a system for providing real time media recommendations to the user, in accordance with one or more embodiments of the present invention.
- FIG 2 illustrates a method for providing real time media recommendations to the user, in accordance with one or more embodiments of the present invention.
- the present invention provides a system and method for generating real time media recommendations for a video feed indicating a currently broadcasted or streamed video content to the user.
- the system generates metadata based on the currently viewed video frame using audio-visual analytics techniques.
- the system uses the generated metadata to identify relevant media for the current video frame viewed by the user and provides real time dynamic media recommendations to the user/group of users.
- FIG 1 illustrates a system for providing real time media recommendations to the user, in accordance with one or more embodiments of the present invention.
- the system (100) comprises a real time recommendation engine (101) to generate metadata based on the currently broadcasted or streamed video content to the user, using an audio analytics module (101a) and a visual analytics module (101b).
- the real time recommendation engine (101) uses the generated metadata to identify media content that is relevant to the current video frame watched by the user, and then provides real time dynamic media recommendations (104) to a video consumption device (106).
- the video consumption device (106) may either pull or poll real time media recommendations for the current video frame watched by the user.
- the recommendations might be embedded into the audio/video stream either at the video consumption device (106) or in the broadcasted feed.
- the real time recommendation engine (101) uses the audio analytics module (101a) to generate metadata based upon the received audio stream for the current video frame watched by the user.
- the audio analytics module (101a) is configured to separate the audio stream from the received video feed and convert the received audio stream into audio text data.
- the audio analytics module (101a) uses the converted audio text data to identify metadata associated with the audio stream.
- the audio analytics module (101a) further analyzes the audio text data to identify domains associated with metadata.
- the system (100) then stores the identified audio metadata along with its associated domains in a database, for example an analytics database (102).
- possible metadata could be names such as celebrity names, brand names and landmarks, or topics such as weather forecasts, news, art, entertainment, real estate, technologies, etc.
- the audio analytics module (101a), upon identifying metadata in the audio text, looks for the possible domains that may be associated with the identified metadata. For instance, if the metadata is a celebrity name, then the audio analytics module (101a) further analyses the audio text to find the domain, such as the film, sports, music or dance industry, associated with that celebrity name.
- the audio analytics module (101a) is also capable of providing a list of similar celebrity names in the same domain and storing it in the analytics database (102).
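The audio analytics steps above (audio text, then metadata, then associated domains) can be illustrated with a minimal sketch. It assumes the audio stream has already been converted to text. The lookup tables `KNOWN_ENTITIES` and `DOMAIN_KEYWORDS`, the `AudioMetadata` record and the function `extract_audio_metadata` are hypothetical names introduced here for illustration; the source does not specify how entities or domains are recognized, and a practical system would likely use a trained recognizer rather than keyword tables.

```python
import re
from dataclasses import dataclass

# Hypothetical lookup tables for illustration only; the source does not
# specify how entities or domains are recognized.
KNOWN_ENTITIES = {"sachin": "celebrity", "eiffel tower": "landmark"}
DOMAIN_KEYWORDS = {
    "cricket": {"cricket", "batsman", "innings", "wicket"},
    "film": {"film", "movie", "actor", "director"},
    "music": {"music", "song", "album", "concert"},
}


@dataclass(frozen=True)
class AudioMetadata:
    name: str       # entity found in the audio text
    kind: str       # e.g. celebrity, landmark
    domains: tuple  # domains suggested by the surrounding text


def extract_audio_metadata(audio_text):
    """Return known entities mentioned in audio_text with candidate domains."""
    text = audio_text.lower()
    words = set(re.findall(r"[a-z]+", text))
    # Domains whose keywords appear anywhere in the transcript.
    domains = tuple(
        sorted(d for d, kws in DOMAIN_KEYWORDS.items() if words & kws)
    )
    found = []
    for name, kind in KNOWN_ENTITIES.items():
        # Whole-word match so that short names do not match inside other words.
        if re.search(r"\b" + re.escape(name) + r"\b", text):
            found.append(AudioMetadata(name, kind, domains))
    return found
```

Each returned record pairs an entity with its candidate domains, which is the shape the description stores in the analytics database (102).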
- the visual analytics module (101b) of the real time recommendation engine (101) is configured to recognize faces, landmarks and objects present in the current video frame, using face detection and object detection techniques. This visual recognition process is accelerated by identifying possible matches of people, landmarks and objects based on the metadata obtained from the audio analytics module (101a). For example, if the metadata identified from the audio analytics module (101a) is a celebrity name such as 'Sachin' and the associated domain is cricket, then the visual analytics module (101b) would narrow the set of possible face matches to the entities related to 'Sachin', thereby speeding up the visual recognition process. Thus, the real time recommendation engine (101) provides real time recommendations using audio-visual analytics techniques.
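The narrowing step in the 'Sachin' example can be sketched as a filter over a face gallery. This is only an illustration under stated assumptions: the gallery layout (a list of dictionaries with `name` and `domain` keys) and the function `narrow_candidates` are hypothetical, and the actual face matching, which the source does not detail, is left out.

```python
def narrow_candidates(gallery, audio_names, audio_domains):
    """Keep gallery entries that match a name or domain heard in the audio.

    gallery: list of dicts with 'name' and 'domain' keys (assumed layout).
    Falls back to the full gallery when the audio gives no usable hint.
    """
    names = {n.lower() for n in audio_names}
    domains = {d.lower() for d in audio_domains}
    narrowed = [
        person
        for person in gallery
        if person["name"].lower() in names or person["domain"].lower() in domains
    ]
    return narrowed or list(gallery)
```

The face matcher then compares the detected face only against the narrowed list, so the cost of recognition scales with the number of plausible candidates rather than the whole gallery.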
- the real time recommendation engine (101) searches for similar media content using the metadata obtained from the audio analytics module (101a) and visual analytics module (101b) for providing real time dynamic media recommendations (104) with or without using a recommendation box (105) or embedding recommendations as a part of the audio/video feed.
- the recommendation box (105) may provide a list of recommendations to the user in a collapsed form, and the list may be expanded by a simple action over the recommendation box (105), such as a mouse click, touch, voice command or remote key press. In the case of embedded recommendations, the user or user group would view or hear the recommendation as part of the viewing process.
- the recommendations present in the recommendation list may be based on the currently broadcasted or streamed video content to the user.
- the recommendations may also be based on the content that has been streamed/broadcasted overall.
- the real time recommendation engine (101) is capable of providing recommendations based on the overall content or the current video content streamed to the user/group of users.
- the recommendations provided using audio visual analytics techniques may be further personalized based on the user profile, group profile, user past viewing history, group past viewing history, user preferences and/or group preferences.
- the real time recommendation engine (101) also provides real time contextual based media recommendations based on the user/group past viewing history, user/group preferences and/or metadata generated using the audio analytics module (101a) and visual analytics module (101b).
- the real time recommendation engine (101) is also capable of providing real time contextual media recommendations at a predetermined time interval, say every second, and/or based on the user/group settings for the frequency/granularity of recommendations.
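One way to picture how frame metadata and user preferences might combine is a simple tag-overlap score. The sketch below is an assumption, not the scoring used by the invention: the catalog layout (title mapped to a set of tags), the weight `pref_weight` and the function `rank_recommendations` are all hypothetical.

```python
def rank_recommendations(frame_tags, catalog, user_prefs=frozenset(),
                         pref_weight=0.5, top_k=3):
    """Rank catalog titles by overlap with the current frame's metadata.

    catalog: dict mapping title -> set of tags (assumed layout).
    Items with no overlap with the frame are never recommended; user
    preferences only re-order items that are already relevant.
    """
    frame_tags = set(frame_tags)
    prefs = set(user_prefs)
    scored = []
    for title, tags in catalog.items():
        relevance = len(frame_tags & tags)
        if relevance == 0:
            continue
        score = relevance + pref_weight * len(prefs & tags)
        scored.append((score, title))
    # Highest score first; ties broken alphabetically for stable output.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [title for _, title in scored[:top_k]]
```

Calling the function once per interval (for example every second) with the latest frame tags would give the periodic, personalized list described above.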
- the system (100) may also use a user identification system to provide more specific personalized real time dynamic media recommendations to the user.
- the system (100) may use one or more cameras, a radio-frequency identification (RFID) reader, a fingerprint scanner, a deoxyribonucleic acid (DNA) sequence based identifier, a voice recognition system, or a retina based identification system, whether known in the art or developed in the future, for electronically identifying/recognizing the user of the system (100).
- the system (100), upon identifying the user/group using the above user/group identification system, provides personalized media recommendations to the user based on the user profile, user past viewing history and/or user preferences.
- the system (100) is also capable of providing real time contextual media recommendations for the currently broadcasted or streamed television content such as news, serials, sports or movies.
- the system (100) may provide real time contextual based media recommendations such as interactive media timeline, statistics, and profiles for events such as a news story related to a terrorist attack, ongoing war, murder mystery, election constituency, actor, sports/sportsman, presenter, politicians, animals, birds, location etc.
- the real time user/group tailored recommendations may appear as a part of the currently broadcast or streamed video content itself, in either visual or audio form, so that consuming the recommendations is seamless for the user/group of users as part of the natural viewing process.
- the system (100) is also capable of providing real time medical health recommendations to the user based on the currently broadcasted or streamed medical health related video content to the user.
- the possible medical health recommendations may be related to medical health conditions, symptoms or diagnosis related to the video feed currently viewed by the user.
- the system (100) is capable of providing real time medical health recommendations taking into consideration the currently broadcasted or streamed video content to the user, user health history, geographic location, and/or socio-economic status of the user.
- the system (100) of the present invention is also capable of providing real time media recommendations tailored to the education field by providing its users individualized education content specific to the aptitude of the student/learner and relevant to the video frame/segment/topic that is being consumed, so as to enhance the rate of learning and enable more just-in-time learning for people of all age groups and fields.
- the video consumption device (106) used herein includes a smart phone, a cellular phone, a personal digital assistant (PDA), a personal computer, a set top box, a streaming media player, a smart TV, a laptop or any similar computing or video consumption device that can be used by the user.
- the real time recommendation engine (101) saves the metadata generated using the audio analytics module (101a) and the visual analytics module (101b).
- the real time recommendation engine (101) checks whether the received video feed has been previously broadcast before generating metadata using audio-visual analytics techniques. Further, upon identifying that the video feed has been previously broadcast, the real time recommendation engine (101) uses the saved metadata to make recommendations, if found optimal.
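The check-before-generate behaviour described above resembles a cache lookup keyed on the feed. The sketch below assumes an exact SHA-256 hash of the segment bytes as the key, which is a simplification chosen for illustration: the source does not say how a previously broadcast feed is recognized, and a re-encoded broadcast would need a more robust content fingerprint. The class `MetadataStore` and the `analyze` callback are hypothetical names.

```python
import hashlib


class MetadataStore:
    """In-memory stand-in for saved metadata, keyed by a feed fingerprint."""

    def __init__(self):
        self._by_feed = {}

    @staticmethod
    def feed_key(segment_bytes):
        # Assumption: identical bytes identify a previously broadcast segment.
        return hashlib.sha256(segment_bytes).hexdigest()

    def get_or_generate(self, segment_bytes, analyze):
        """Reuse saved metadata if present; otherwise run the analytics once."""
        key = self.feed_key(segment_bytes)
        if key not in self._by_feed:
            self._by_feed[key] = analyze(segment_bytes)
        return self._by_feed[key]
```

With this arrangement the audio and visual analytics run only the first time a segment is seen, and later broadcasts of the same segment are served from the saved metadata.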
- FIG 2 illustrates a method for providing real time media recommendations to the user, in accordance with one or more embodiments of the present invention.
- the method for providing real time dynamic media recommendations to the user comprises the steps of receiving a video feed indicating a currently broadcasted or streamed video content to the user/group of users at step 201.
- the method generates metadata based on the currently viewed video frame by the user/group of users, using audio-visual analytics techniques.
- the method further uses the generated metadata to identify video frames or relevant media that is similar to the current video frame viewed by the user/group of users at step 203, and provides dynamic real time media recommendations based on the identified relevant media that is similar to the current video frame watched by the user/group of users at step 204.
- the method may continuously push recommendations to the video consumption device (106) commonly used today such as a television, laptop, streaming device, or a mobile phone.
- the video consumption device (106) may pull or poll real time media recommendations for the current video frame watched by the user/group of users.
- the recommendations may be alternatively embedded into the audio/video stream during transmission/streaming of video or dynamically inserted real time into the stream at the video consumption device (106) so that the recommendations are part of the video and its consumption would be seamless for the user/group of users.
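The method steps above (receive the feed, generate metadata, identify relevant media, provide recommendations) can be sketched as a generator over frames. The function `recommend_stream` and its callbacks `generate_metadata` and `find_similar` are hypothetical placeholders for the analytics and search stages, and the `every_n` parameter is an assumed way of expressing the configurable recommendation frequency.

```python
def recommend_stream(frames, generate_metadata, find_similar, every_n=1):
    """Yield (frame_index, recommendations) pairs for a video feed.

    frames: iterable of frames from the received feed (step 201).
    generate_metadata: callable frame -> metadata (audio-visual analytics).
    find_similar: callable metadata -> list of relevant media (step 203).
    every_n: only every n-th frame is analyzed, to limit the update rate.
    """
    for index, frame in enumerate(frames):
        if index % every_n:
            continue  # skip frames between recommendation updates
        metadata = generate_metadata(frame)
        relevant = find_similar(metadata)
        yield index, relevant  # provided to the consumption device (step 204)
```

A push design would send each yielded pair to the video consumption device (106) as it is produced, while a pull or poll design would have the device request the most recent pair.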
- the present invention overcomes the drawbacks of the prior art by providing a system (100) and method for generating real time media recommendations for a video at frame level that is being watched by the user.
- the system (100) of the present invention is capable of continuously providing real time media recommendations for both recorded and live video feeds.
Landscapes
- Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- Multimedia (AREA)
- Databases & Information Systems (AREA)
- Library & Information Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Human Computer Interaction (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
IN201641006867 | 2016-02-29 | ||
IN201641006867 | 2016-02-29 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2017149447A1 (en) | 2017-09-08 |
Family
ID=59743594
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/IB2017/051160 WO2017149447A1 (en) | 2016-02-29 | 2017-02-28 | A system and method for providing real time media recommendations based on audio-visual analytics |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2017149447A1 (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10652592B2 (en) | 2017-07-02 | 2020-05-12 | Comigo Ltd. | Named entity disambiguation for providing TV content enrichment |
US10939146B2 (en) | 2018-01-17 | 2021-03-02 | Comigo Ltd. | Devices, systems and methods for dynamically selecting or generating textual titles for enrichment data of video content items |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100199295A1 (en) * | 2009-02-02 | 2010-08-05 | Napo Enterprises | Dynamic video segment recommendation based on video playback location |
US20100195975A1 (en) * | 2009-02-02 | 2010-08-05 | Porto Technology, Llc | System and method for semantic trick play |
US20150156530A1 (en) * | 2013-11-29 | 2015-06-04 | International Business Machines Corporation | Media selection based on content of broadcast information |
US20150186510A1 (en) * | 2007-12-21 | 2015-07-02 | Lemi Technology, Llc | System For Generating Media Recommendations In A Distributed Environment Based On Seed Information |
-
2017
- 2017-02-28 WO PCT/IB2017/051160 patent/WO2017149447A1/en active Application Filing
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9888279B2 (en) | Content based video content segmentation | |
US11197036B2 (en) | Multimedia stream analysis and retrieval | |
US10271098B2 (en) | Methods for identifying video segments and displaying contextually targeted content on a connected television | |
US9202523B2 (en) | Method and apparatus for providing information related to broadcast programs | |
US12425698B2 (en) | Detection of common media segments | |
US9471936B2 (en) | Web identity to social media identity correlation | |
EP2541963B1 (en) | Method for identifying video segments and displaying contextually targeted content on a connected television | |
US11080749B2 (en) | Synchronising advertisements | |
US10652592B2 (en) | Named entity disambiguation for providing TV content enrichment | |
US11057457B2 (en) | Television key phrase detection | |
US20140075465A1 (en) | Time varying evaluation of multimedia content | |
CN108509611B (en) | Method and device for pushing information | |
ES2648368A1 (en) | Video recommendation based on content (Machine-translation by Google Translate, not legally binding) | |
Vettehen et al. | Competitive pressure and arousing television news: A cross-cultural study | |
WO2017149447A1 (en) | A system and method for providing real time media recommendations based on audio-visual analytics | |
KR101674310B1 (en) | System and method for matching advertisement for providing advertisement associated with video contents | |
EP3044728A1 (en) | Content based video content segmentation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| NENP | Non-entry into the national phase | Ref country code: DE |
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 17759349; Country of ref document: EP; Kind code of ref document: A1 |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 17759349; Country of ref document: EP; Kind code of ref document: A1 |
| 32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 15.02.2019) |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 17759349; Country of ref document: EP; Kind code of ref document: A1 |