
HK1171304A - Simulated group interaction with multimedia content - Google Patents

Simulated group interaction with multimedia content

Info

Publication number
HK1171304A
Authority
HK (Hong Kong)
Prior art keywords
viewer, stream, multimedia content, time, synchronized
Prior art date
Application number
HK12111904.8A
Other languages
Chinese (zh)
Inventor
K.S. Perez
A. Bar-Zeev
Original Assignee
Microsoft Technology Licensing, LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Microsoft Technology Licensing, LLC
Publication of HK1171304A


Description

Simulated group interaction with multimedia content
Technical Field
The present invention relates to multimedia technology, and more particularly, to simulated group interaction with multimedia content.
Background
Video-on-demand (VOD) systems allow users to select and view multimedia content as desired, streamed to set-top boxes, computers, or other devices. Video-on-demand systems typically give users the flexibility to view multimedia content at any time. However, users may not feel that they are part of a live event or experience when viewing recorded video content, video-on-demand content, or other on-demand media content, as the content is typically streamed to the user after the original broadcast. In addition, users may lack a sense of community and connection when viewing multimedia content on demand, as they may not view the content live with their friends and family.
Disclosure of Invention
Disclosed herein are a method and system for enhancing a viewer's experience when viewing recorded video content, video-on-demand content, or other on-demand media content by recreating for the viewer the experience of viewing the multimedia content live with other users, such as the viewer's friends and family. In one embodiment, the disclosed technology generates a time-synchronized data stream while a viewer is viewing a multimedia content stream, the time-synchronized data stream including comments provided by the viewer and by other users, such as the viewer's friends and family. Comments may include text messages, audio messages, video feeds, gestures, or facial expressions provided by the viewer and the other users. The time-synchronized data stream is presented to the viewer via an audiovisual device while the viewer is viewing the multimedia content stream, thereby recreating for the viewer the experience of viewing the multimedia content live with other users. In one embodiment, multiple viewers view the multimedia content at a single location, and the multiple viewers' interactions with the multimedia content stream are recorded.
In another implementation, a method for generating a time-synchronized stream of commentary data based on viewer interaction with a stream of multimedia content is disclosed. A multimedia content stream associated with a current broadcast is received. A viewer is identified within a field of view of a capture device connected to a computing device. A viewer's interaction with a multimedia content stream being viewed by the viewer is recorded. A time-synchronized stream of commentary data is generated based on the viewer's interaction. A request is received from a viewer to view one or more time-synchronized streams of commentary data related to a stream of multimedia content being viewed by the viewer. The time-synchronized stream of commentary data is displayed to the viewer via the viewer's audiovisual device in response to the viewer's request.
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter. Furthermore, the claimed subject matter is not limited to implementations that solve any or all disadvantages noted in any part of this disclosure.
Drawings
FIG. 1 illustrates one embodiment of a target recognition, analysis, and tracking system for performing the operations of the disclosed technology.
FIG. 2 illustrates one embodiment of a capture device that may be used as part of a tracking system.
FIG. 3 illustrates an embodiment of an environment for implementing the present technology.
FIG. 4 illustrates an example of a computing device that may be used to implement the computing device of FIGS. 1-2.
FIG. 5 illustrates a general purpose computing device that can be used to implement another embodiment of the computing device of FIGS. 1-2.
FIG. 6 is a flow chart describing one embodiment of a process for generating a time-synchronized stream of commentary data based on viewer interaction with a stream of multimedia content.
FIG. 6A is a flow chart describing one embodiment of a process for receiving a stream of commentary data generated by other users upon request by a viewer to view the stream of commentary data.
FIG. 6B is a flow chart describing one embodiment of a process for generating a time-synchronized commentary data stream.
FIG. 7 is a flow diagram describing one embodiment of a process for generating a report of a time-synchronized stream of commentary data related to a particular stream of multimedia content viewed by one or more users.
FIG. 8 is a flow diagram describing one embodiment of a process for providing a viewer with streams of commentary data generated by other users based on the viewer's comment viewing eligibility.
FIG. 9A illustrates an exemplary user interface screen for obtaining viewer preference information prior to recording the viewer's interaction with a multimedia content stream.
FIG. 9B illustrates an exemplary user interface screen for obtaining input by a viewer to view comments from other users.
FIG. 10 illustrates an exemplary user interface screen displaying to a viewer one or more options for viewing a time-synchronized stream of commentary data associated with a stream of multimedia content.
FIGS. 11A, 11B, and 11C illustrate exemplary user interface screens in which one or more time-synchronized streams of commentary data related to a multimedia content stream are displayed to a viewer.
Detailed Description
Techniques are disclosed to enhance a user's experience when viewing recorded video content, video-on-demand content, or other on-demand media content. The viewer views the multimedia content stream associated with the current broadcast via an audiovisual device. The viewer's interaction with the multimedia content stream is recorded. In one approach, a viewer's interaction with a multimedia content stream may include comments in the form of text messages, audio messages, or video feeds provided by the viewer while the viewer views the multimedia content stream. In another approach, the viewer's interaction with the multimedia content stream may include a gesture, posture or facial expression made by the viewer while the viewer is viewing the multimedia content stream. A time-synchronized stream of commentary data is generated based on the viewer's interaction. The time-synchronized commentary data stream is generated by time-synchronizing a data stream containing the viewer's interactions with respect to the actual start of the multimedia content stream. In one embodiment, a time-synchronized stream of commentary data is presented to a viewer via an audiovisual device while the viewer's interactions with the multimedia content stream are recorded. In another embodiment, one or more time-synchronized streams of commentary data generated by other users are presented to a viewer via an audiovisual device at the viewer's request while the viewer's interactions with the multimedia content stream are recorded.
Multiple data streams may be synchronized with one multimedia content stream and identified by user comments. In this way, groups may be defined based on data streams associated with multimedia content streams. Viewers and users who provide their reactions and comments at different viewing times and locations are thereby brought together upon subsequent viewing of the multimedia content because data associated with the content is added during each viewing in accordance with the present techniques. Groups may extend from friends of the viewer to the viewer's social graph and to a wider scope.
FIG. 1 illustrates one embodiment of a target recognition, analysis, and tracking system 10 (hereinafter collectively referred to as a motion tracking system) for performing the operations of the disclosed technology. The tracking system 10 may be used to identify, analyze, and/or track one or more human targets, such as users 18 and 19. As shown in FIG. 1, the tracking system 10 may include a computing device 12. In one embodiment, computing device 12 may be implemented as any one or combination of a wired and/or wireless device, as any form of television client device (e.g., a television set-top box, Digital Video Recorder (DVR), etc.), personal computer, portable computer device, mobile computing device, media device, communication device, video processing and/or rendering device, appliance device, gaming device, electronic device, and/or as any other type of device that may be implemented to receive media content in any form of audio, video, and/or image data. According to one embodiment, computing device 12 may include hardware components and/or software components such that computing device 12 may be used to execute applications such as gaming applications, non-gaming applications, and the like. In one embodiment, computing device 12 may include a processor, such as a standardized processor, a specialized processor, a microprocessor, or the like that may execute instructions stored on a processor readable storage device for performing the processes described herein.
As shown in FIG. 1, the tracking system 10 may also include a capture device 20. The capture device 20 may be, for example, a camera that may be used to visually monitor one or more users, such as users 18 and 19, so that movements, gestures, and postures made by those users may be captured and tracked by the capture device 20 within its field of view 6. Lines 2 and 4 represent the boundaries of the field of view 6.
According to one embodiment, computing device 12 may be connected to an audiovisual device 16 such as a television, a monitor, a high-definition television (HDTV), or the like that may provide visual and/or audio to human targets 18 and 19. For example, computing device 12 may include a video adapter such as a graphics card and/or an audio adapter such as a sound card that may provide audiovisual signals to a user. The audiovisual device 16 may receive the audiovisual signals from the computing device 12 and may then output the visuals and/or audio associated with the audiovisual signals to the users 18 and 19. According to one embodiment, the audiovisual device 16 may be connected to the computing device 12 via, for example, an S-video cable, a coaxial cable, an HDMI cable, a DVI cable, a VGA cable, or the like.
In one set of operations performed by the disclosed technology, users 18, 19 view a multimedia content stream related to a current broadcast via the audiovisual device 16, and computing device 12 records the users' interactions with the multimedia content stream. In one approach, a viewer, such as user 18 or 19, may interact with the multimedia content stream by providing a text message, an audio message, or a video feed while viewing the multimedia content stream. The text message may comprise an email message, an SMS message, an MMS message, or a Twitter message. In one example, the viewer may provide text messages, audio messages, and video feeds wirelessly (e.g., via WiFi, Bluetooth, infrared, or other wireless communication means) or over a wired connection, via a remote control device or mobile computing device that communicates with computing device 12. In one embodiment, the remote control device or mobile computing device is synchronized to computing device 12, which streams the multimedia content stream to the viewer so that the viewer can provide a text message, an audio message, or a video feed while viewing the multimedia content stream. In another example, the viewer may also interact with the multimedia content stream by making movements, gestures, or facial expressions while viewing it. As the viewer views the multimedia content stream via the audiovisual device 16, the viewer's movements, gestures, postures, and facial expressions may be tracked by the capture device 20 and recorded by the computing device 12.
As described herein, a multimedia content stream may include recorded video content, video-on-demand content, television programs, advertisements, commercials, music, movies, video clips, and other on-demand media content. Other multimedia content streams may include interactive games, network-based applications, and any other content or data (e.g., including program guide application data, user interface data, advertising content, closed captioning, content metadata, search results and/or recommendations, etc.).
In another set of operations performed by the disclosed technology, computing device 12 generates a time-synchronized stream of commentary data based on viewer interaction with the stream of multimedia content. The time-synchronized data stream is generated by synchronizing the data stream containing the viewer's interaction with respect to the actual start time of the multimedia content stream. In one embodiment, computing device 12 presents the viewer's stream of commentary data via audiovisual device 16 while recording the viewer's interactions with the stream of multimedia content. In another embodiment, at the viewer's request, computing device 12 presents the stream of commentary data generated by other users via audiovisual device 16 while recording the viewer's interactions with the stream of multimedia content. The operations performed by the computing device 12 and the capture device 20 are discussed in detail below.
FIG. 2 illustrates one embodiment of the capture device 20 and computing device 12 that may be used in the system of FIG. 1 to perform one or more operations of the disclosed technology. According to one embodiment, the capture device 20 may be configured to capture video having depth information including a depth image that may include depth values via any suitable technique including, for example, time-of-flight, structured light, stereo image, or the like. According to one embodiment, the capture device 20 may organize the calculated depth information into "Z layers" or layers that may be perpendicular to a Z axis extending from the depth camera along its line of sight.
As shown in FIG. 2, the capture device 20 may include an image camera component 32. According to one embodiment, the image camera component 32 may be a depth camera that may capture a depth image of a scene. The depth image may include a two-dimensional (2-D) pixel area of the captured scene where each pixel in the 2-D pixel area may represent a depth value, such as a distance in, for example, centimeters, millimeters, or the like of an object in the captured scene from the camera.
As shown in FIG. 2, the image camera component 32 may include an IR light component 34, a three-dimensional (3-D) camera 36, and an RGB camera 38 that may be used to capture the depth image of the capture area. For example, in time-of-flight analysis, the IR light component 34 of the capture device 20 may emit an infrared light onto the capture area and may then use sensors to detect the backscattered light from the surface of one or more targets and objects in the capture area with, for example, the 3-D camera 36 and/or the RGB camera 38. In some embodiments, pulsed infrared light may be used so that the time difference between an outgoing light pulse and a corresponding incoming light pulse may be measured and used to determine a physical distance from the capture device 20 to a particular location on a target or object in the capture area. Furthermore, the phase of the outgoing light wave may be compared to the phase of the incoming light wave to determine the phase shift. The phase shift may then be used to determine a physical distance from the capture device to a particular location on the targets or objects.
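The phase-shift relationship described above can be sketched numerically. This is a minimal illustration, not part of the disclosure: the light travels a round trip of twice the target distance, so the distance follows from the measured phase shift and the modulation frequency of the emitted light (the 30 MHz frequency in the example comment is an assumed value).

```python
import math

C = 299_792_458.0  # speed of light in m/s

def distance_from_phase_shift(phase_shift_rad: float, modulation_freq_hz: float) -> float:
    """Estimate the distance to a target from the phase shift between the
    outgoing and incoming modulated light waves. The light travels a round
    trip of 2*d, so d = c * phi / (4 * pi * f)."""
    return C * phase_shift_rad / (4.0 * math.pi * modulation_freq_hz)

# For example, a phase shift of pi radians at an assumed 30 MHz modulation
# frequency corresponds to a target roughly 2.5 m from the capture device.
```

Note that phase measurements wrap around 2π, so this relation is unambiguous only up to c / (2f); practical time-of-flight cameras typically resolve the ambiguity by combining measurements at multiple modulation frequencies.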
According to one embodiment, time-of-flight analysis may be used to indirectly determine a physical distance from the capture device 20 to a particular location on the targets or objects by analyzing the intensity of the reflected beam of light over time via various techniques including, for example, shuttered light pulse imaging.
In another example, the capture device 20 may use a structured light to capture depth information. In this analysis, patterned light (i.e., light displayed as a known pattern such as a grid pattern or a stripe pattern) may be projected onto the capture area via, for example, the IR light component 34. Upon striking the surface of one or more targets or objects in the capture area, the pattern may deform in response. Such a deformation of the pattern may be captured by, for example, the 3-D camera 36 and/or the RGB camera 38, and may then be analyzed to determine a physical distance from the capture device to a particular location on the targets or objects.
According to one embodiment, the capture device 20 may include two or more physically separated cameras that may view the capture area from different angles to obtain visual stereo data that may be resolved to generate depth information. Other types of depth image sensors may also be used to create the depth image.
The capture device 20 may also include a microphone 40. Microphone 40 may include a transducer or sensor that may receive sound and convert it into an electrical signal. According to one embodiment, the microphone 40 may be used to reduce feedback between the capture device 20 and the computing device 12 in the target recognition, analysis, and tracking system 10. Additionally, microphone 40 may be used to receive audio signals that may also be provided by a user when interacting with the multimedia content stream, or to control applications such as gaming applications, non-gaming applications, or the like that may be executed by computing device 12.
In one embodiment, the capture device 20 may also include a processor 42 that may be in operative communication with the image camera component 32. Processor 42 may include a standard processor, a special purpose processor, a microprocessor, etc. that may execute instructions, which may include instructions for storing a profile, receiving a depth image, determining whether a suitable target is included in a depth image, converting a suitable target into a skeletal representation or model of the target, or any other suitable instructions.
The capture device 20 may also include a memory component 44 that may store instructions executable by the processor 42, images or frames of images captured by the 3-D camera or RGB camera, user profiles, or any other suitable information, images, and so forth. According to one example, the memory component 44 may include Random Access Memory (RAM), Read Only Memory (ROM), cache, flash memory, a hard disk, or any other suitable storage component. As shown in FIG. 2, the memory component 44 may be a separate component in communication with the image capture component 32 and the processor 42. In another embodiment, the memory component 44 may be integrated into the processor 42 and/or the image capture component 32. In one embodiment, some or all of the components 32, 34, 36, 38, 40, 42, and 44 of the capture device 20 shown in FIG. 2 are housed in a single housing.
The capture device 20 may communicate with the computing device 12 via a communication link 46. The communication link 46 may be a wired connection including, for example, a USB connection, a firewire connection, an ethernet cable connection, and/or a wireless connection such as a wireless 802.11b, 802.11g, 802.11a, or 802.11n connection. The computing device 12 may provide a clock to the capture device 20 via the communication link 46 that may be used to determine when to capture, for example, a scene.
The capture device 20 may provide depth information and images captured by, for example, the 3-D (or depth) camera 36 and/or the RGB camera 38 to the computing device 12 via the communication link 46. As discussed in detail below, computing device 12 may then use the depth information and the captured image to perform one or more operations of the disclosed techniques.
In one embodiment, the capture device 20 captures one or more users viewing the multimedia content stream within its field of view 6. Capture device 20 provides the captured visual image of the user to computing device 12, which performs identification of the user captured by the capture device 20. In one embodiment, computing device 12 includes a facial recognition engine 192 to perform identification of the user. The facial recognition engine 192 may correlate the user's face from the visual image received from the capture device 20 with a reference visual image to determine the user's identity. In another example, the user's identity may also be determined by receiving input from the user identifying themselves. In one embodiment, users may be required to identify themselves by standing in front of computing device 12 so that capture device 20 may capture depth images and visual images of each user. For example, a user may be asked to stand in front of the capture device 20, turn around, and make various poses. After computing device 12 obtains the data necessary to identify the user, the user is provided with a unique identifier that identifies the user. More information about identifying users can be found in U.S. patent application Ser. No. 12/696,282, "Visual Based Identity Tracking," and U.S. patent application Ser. No. 12/475,308, "Device for Identifying and Tracking Multiple Humans Over Time," the entire contents of both of which are incorporated herein by reference. In another implementation, the user's identity may already be known to the computing device when the user logs into it, such as, for example, when the computing device is a mobile computing device such as the user's cellular telephone. In another embodiment, the user's voiceprint can also be used to determine the user's identity.
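A minimal sketch of the correlation step the facial recognition engine 192 might perform, assuming captured faces have already been reduced to feature vectors. The embedding representation, the cosine-similarity measure, and the threshold are illustrative assumptions, not details from the disclosure:

```python
import math

def identify_user(face_embedding, reference_embeddings, threshold=0.75):
    """Correlate a captured face feature vector against stored reference
    vectors; return the unique identifier of the best match above the
    threshold, or None if the user is not recognized."""
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        return dot / (norm_a * norm_b)

    best_id, best_sim = None, threshold
    for user_id, ref in reference_embeddings.items():
        sim = cosine(face_embedding, ref)
        if sim > best_sim:
            best_id, best_sim = user_id, sim
    return best_id
```

When no reference vector clears the threshold, the system could fall back to the other identification paths described above, such as explicit user input or a voiceprint.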
In one embodiment, the identification information of the user may be stored in a user profile database 207 in computing device 12. In one example, the user profile database 207 may include information about the user such as: a unique identifier associated with the user, the user's name, and other demographic information related to the user, such as the user's age group, gender, and geographic location. The user profile database 207 also includes information about the user's program viewing history, such as a list of programs viewed by the user and a list of user preferences. The user preferences may include information about: the user's social graph, the user's friends, friend identities, friend preferences, activities (of the user and of the user's friends), photos, images, recorded videos, and the like. In one example, the user's social graph may include the user's preferences regarding the group of users to whom the user wishes to make his or her comments available when those users view the multimedia content stream.
In one set of operations performed by the disclosed technology, the capture device 20 tracks movements, gestures, poses, and facial expressions made by a user as the user views a multimedia content stream via the audiovisual device 16. For example, the facial expression tracked by capture device 20 may include detecting a smile, laugh, cry, frown, yawning, or applause from the user while the user is viewing the multimedia content stream.
In one embodiment, computing device 12 also includes a gesture library 196 and a gesture recognition engine 190. The gesture library 196 includes a collection of gesture filters, each comprising information related to a movement, gesture, or posture that may be made by a user. In one implementation, the gesture recognition engine 190 may compare the skeletal model captured by the cameras 36, 38 of capture device 20, and the movements associated with it, to the gesture filters in the gesture library 196 to identify when the user (as represented by the skeletal model) has made one or more gestures or postures. Computing device 12 may use the gesture library 196 to interpret movements of the skeletal model to perform one or more operations of the disclosed techniques. For more information on the gesture recognition engine 190, see U.S. patent application 12/422,661, "Gesture Recognition System Architecture," filed April 13, 2009, which is incorporated herein by reference in its entirety. For more information on recognizing gestures and postures, see U.S. patent application 12/391,150, "Standard Gestures," filed February 23, 2009, and U.S. patent application 12/474,655, "Gesture Tool," filed May 29, 2009, both of which are incorporated herein by reference in their entirety. More information on motion detection and tracking can be found in U.S. patent application 12/641,788, "Motion Detection Using Depth Images," filed December 18, 2009, and U.S. patent application 12/475,308, "Device for Identifying and Tracking Multiple Humans Over Time," both of which are incorporated herein by reference in their entirety.
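The filter-matching approach described above can be sketched as follows, assuming each gesture filter scores how well a sequence of skeletal-model frames matches its gesture. The filter interface, the 0.0-1.0 score range, and the confidence threshold are hypothetical simplifications for illustration:

```python
def match_gesture(skeletal_frames, gesture_filters, threshold=0.8):
    """Compare a sequence of skeletal-model frames against each gesture
    filter and return the name of the best-scoring gesture, or None if no
    filter clears the confidence threshold."""
    best_name, best_score = None, threshold
    for name, score_fn in gesture_filters.items():
        score = score_fn(skeletal_frames)  # each filter scores the frames 0.0-1.0
        if score > best_score:
            best_name, best_score = name, score
    return best_name
```

A recognized gesture name could then be recorded, alongside its capture time, as one interaction entry in the viewer's commentary data stream.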
The facial recognition engine 192 in the computing device 12 may include a facial expression library 198. The facial expression library 198 includes a collection of facial expression filters, each of which includes information about a facial expression a user may make. In one example, the facial recognition engine 192 may compare data captured by the cameras 36, 38 in the capture device 20 to the facial expression filters in the facial expression library 198 to identify the user's facial expression. In another example, the facial recognition engine 192 may also compare data captured by the microphone 40 in the capture device 20 to the facial expression filters in the facial expression library 198 to identify one or more sounds or audio responses, such as, for example, a user's laughter or clapping.
In another embodiment, the user's movements, gestures, postures and facial expressions may also be tracked using one or more additional sensors located in a room in which the user views the multimedia content stream via the audiovisual device or placed on a physical surface in the room, such as a desktop. The sensors may include, for example, one or more active beacon sensors that emit structured light, pulsed infrared light, or visible light onto the physical surface, detect light backscattered from the surface of one or more objects on the physical surface, and detect movements, gestures, and facial expressions made by the user. The sensors may also include biometric monitoring sensors, user wearable sensors, or sensors that can track movements, gestures, and facial expressions made by the user.
In one set of operations performed by the disclosed technology, computing device 12 receives a multimedia content stream associated with a current broadcast from a media provider 52. Media provider 52 may comprise, for example, any entity such as a content provider, a broadband provider, or a third party provider that may create a structure and stream multimedia content to computing device 12. The multimedia content stream may be received over various networks 50. Suitable types of networks that may be configured to support service providers in providing multimedia content services may include, for example, telephone-based networks, coaxial cable-based networks, and satellite-based networks. In one embodiment, the multimedia content stream is displayed to the user via an audiovisual device 16. As described above, the multimedia content stream may include recorded video content, video-on-demand content, television programs, advertisements, commercials, music, movies, video clips, and other on-demand media content.
In another set of operations performed by the disclosed technology, computing device 12 identifies program information related to a multimedia content stream being viewed by a viewer, such as users 18, 19. In one example, the multimedia content stream may be identified as a television program, a movie, a live performance, or a sporting event. For example, program information associated with a television program may include the program name, the current season of the program, the episode number, and the broadcast date and time of the program.
In one embodiment, computing device 12 includes a comment data stream generation module 56. The comment data stream generation module 56 records the viewer's interactions with the multimedia content stream as the viewer views the multimedia content stream. In one approach, a viewer's interaction with a multimedia content stream may include comments in the form of text messages, audio messages, or video feeds provided by the viewer while the viewer is viewing the multimedia content stream. In another approach, the viewer's interaction with the multimedia content stream may include gestures, poses, and facial expressions performed by the viewer while the viewer is viewing the multimedia content stream.
The comment data stream generation module 56 generates a time-synchronized data stream based on the viewer's interaction. Comment data stream generation module 56 provides the time-synchronized commentary data streams and program information related to the multimedia content streams to a centralized data server 306 (shown in FIG. 3) for provision to other viewers. In one embodiment, the time-synchronized stream of commentary data includes time stamps of the viewer's interactions with the multimedia content stream, synchronized with respect to the actual start time of the multimedia content stream. The operations performed by computing device 12 to generate a time-synchronized stream of commentary data are discussed in detail in FIG. 6.
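The time-synchronization step can be sketched as follows, assuming each recorded interaction carries a wall-clock capture time. The data layout and field names are illustrative assumptions, not the disclosed implementation:

```python
from dataclasses import dataclass

@dataclass
class Interaction:
    user_id: str
    kind: str          # e.g. "text", "audio", "video", "gesture", "expression"
    payload: str
    wall_clock: float  # seconds since epoch at which the interaction was captured

def time_synchronize(interactions, content_start_wall_clock):
    """Convert wall-clock capture times into offsets from the actual start
    time of the multimedia content stream, yielding a time-synchronized
    commentary data stream ordered by offset."""
    stream = [
        {
            "offset": i.wall_clock - content_start_wall_clock,
            "user_id": i.user_id,
            "kind": i.kind,
            "payload": i.payload,
        }
        for i in interactions
    ]
    return sorted(stream, key=lambda entry: entry["offset"])
```

Because each entry is keyed by an offset from the content's start rather than by wall-clock time, streams recorded by different users at different viewing times can be replayed together against the same multimedia content stream.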
A display module 82 in the computing device 12 presents the viewer-generated time-synchronized stream of commentary data via the audiovisual device 16. In one embodiment, the viewer may also select one or more options for viewing the stream of commentary data generated by other users via a user interface in the audiovisual device 16. The manner in which a viewer may interact with the user interface in the audiovisual device 16 is discussed in detail in fig. 9-11.
FIG. 3 illustrates an environment for implementing the present technology. FIG. 3 shows a plurality of client devices 300A, 300B, ..., 300X coupled to a network 304 and in communication with a centralized data server 306. The centralized data server 306 sends information to and receives information from the client devices 300A, 300B, ..., 300X and provides a collection of services that can be invoked and utilized by applications running on those devices. The client devices 300A, 300B, ..., 300X may include the computing device 12 discussed in FIG. 1, or may be implemented as any of the devices described in FIGS. 4-5. For example, the client devices 300A, 300B, ..., 300X may include game and media consoles, personal computers, or mobile devices such as cellular phones, web-enabled smart phones, personal digital assistants, palmtop computers, or laptop computers. Network 304 may include the Internet, although other networks, such as LANs or WANs, are contemplated.
In one embodiment, centralized data server 306 includes a comment data stream aggregation module 312. In one implementation, the comment data stream aggregation module 312 receives one or more time-synchronized comment data streams from one or more users at the client devices 300A, 300B, …, 300X, receives program information related to the multimedia content streams and preference information related to the one or more users from the client devices 300A, 300B, …, 300X, and generates a report of the time-synchronized comment data streams related to a particular multimedia content stream viewed by the one or more users. In one example, the report may be implemented as a table having fields identifying: the one or more users providing comments on a particular multimedia content stream, the broadcast date/time at which each user viewed the multimedia content stream, the time-synchronized comment data stream generated by each user, and the comment viewing qualifications set by each user for the particular multimedia content stream. An exemplary illustration of this report is shown in Table 1 below:
TABLE 1 - Report of time-synchronized commentary data streams related to a particular multimedia content stream
As shown in Table 1, the "time-synchronized commentary data stream" includes user interactions that are time-stamped relative to the actual start time of the presentation of the multimedia content stream to the user. The process of generating a time-synchronized commentary data stream is discussed in FIG. 6. "Comment viewing qualification" refers to the group of users for whom a user wishes to make his or her comments available for viewing. In one example, the group of users may include the user's friends, family, or the entire world. In one example, the comment viewing qualifications may be obtained from the user's social graph stored in the user profile database 207. In another example, the comment viewing qualifications may also be determined by obtaining, directly from the user via a user interface in the user's computing device, the user's preferences regarding the group of users to whom the user wishes to make his or her comments available, prior to recording the user's interactions with the multimedia content stream.
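For concreteness, the report record described above can be sketched as a simple data structure. All class and field names below are hypothetical illustrations modeled on the Table 1 fields, not identifiers from the patent:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class TimestampedComment:
    offset_seconds: float  # offset from the actual start time of the content stream
    content_type: str      # e.g. "text", "audio", "video", "gesture"
    payload: str

@dataclass
class ReportEntry:
    user_id: str                  # user providing comments on the content stream
    broadcast_datetime: str       # date/time at which the user viewed the stream
    comments: List[TimestampedComment] = field(default_factory=list)
    viewing_qualification: str = "friends"  # "friends", "family", or "everyone"

# One report per multimedia content stream, keyed by a program identifier.
report = {"program-123": [
    ReportEntry("Sally", "2011-05-01T21:00",
                [TimestampedComment(720.0, "text", "Great scene!")]),
]}
```

The aggregation module would append one such entry per user per viewing session of a given program.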
In another embodiment, centralized data server 306 also provides the comment data streams generated by other users to a viewer watching the multimedia content stream at a client device, at the viewer's request and based on the viewer's comment viewing eligibility. Centralized data server 306 includes a comment database 308. The comment database 308 stores one or more time-synchronized comment data streams generated by users at the client devices 300A, 300B, …, 300X. For example, the media provider 52 may comprise any entity that creates and delivers multimedia content streams directly to the client devices 300A, 300B, …, 300X or to the centralized data server 306, such as a content provider, a broadband provider, or a third-party provider. For example, centralized data server 306 may receive a multimedia content stream associated with a current broadcast (which may be a live, on-demand, or pre-recorded broadcast) from the media provider 52 and provide the multimedia content stream to one or more users at the client devices 300A, 300B, …, 300X.
In one embodiment, the media provider may operate a centralized data server, or the centralized data server may be provided as a separate service by a party not associated with the media provider 52.
In another embodiment, the centralized data server 306 may include a data aggregation service 315 with other input sources. For example, server 306 may also receive information from one or more third-party information sources 54, such as feeds, updates, or voice messages provided by one or more users from a social network or other communication service. The aggregation service 315 may include authentication of the third-party communication service 54 by the data server 306 and receiving updates for the third-party service directly from the client devices 300A-300C. In one embodiment, the aggregation service 315 may collect real-time data updates from the third-party information sources 54 and provide updates to the viewing applications on the devices 300A-300C. In one example, the real-time data updates may be stored in the comment database 308. One example of such an aggregation service is the Microsoft Live service, which provides social updates to a search application executing in a viewer's mobile computing device. The viewer's mobile computing device is synchronized to the viewer's computing device to enable the viewer to view the real-time data updates via an audiovisual display connected to the viewer's computing device.
In the case of third-party aggregation services, these services may automatically filter for any real-time data updates related to the multimedia content being viewed by the user and then provide the filtered real-time data updates to the viewer via the audiovisual display 16 as the viewer views the multimedia content stream. In another example, the application may automatically filter the information updates provided to the viewer so that, while the viewer is watching a live broadcast, the viewer does not receive real-time data updates about the multimedia content that would reveal events the viewer has not yet seen. For example, when a user is viewing a selected media stream, social updates provided by the user regarding the media stream may be stored for "playback" with the stream when later viewers view the data stream.
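A minimal sketch of the spoiler-avoiding filter described above, assuming hypothetical update records keyed by a program identifier and a time offset (none of these names come from the patent):

```python
def filter_updates(updates, current_program_id, viewer_offset_seconds, live=True):
    """Drop third-party updates about the program the viewer is watching
    that refer to a point ahead of the viewer's playback position."""
    filtered = []
    for update in updates:
        if (live and update["program_id"] == current_program_id
                and update["offset_seconds"] > viewer_offset_seconds):
            continue  # would reveal events the viewer has not yet seen
        filtered.append(update)
    return filtered

updates = [
    {"program_id": "p1", "offset_seconds": 300, "text": "What an opening!"},
    {"program_id": "p1", "offset_seconds": 2400, "text": "Unbelievable ending!"},
]
# A viewer ten minutes (600 s) into "p1" sees only the first update.
safe = filter_updates(updates, "p1", viewer_offset_seconds=600)
```

Stored updates could likewise be replayed against a recording by comparing stream offsets rather than wall-clock times.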
FIG. 4 illustrates an example of a computing device 100 that may be used to implement the computing device 12 of FIGS. 1-2. The computing device 100 of FIG. 4 may be a multimedia console 100, such as a gaming console. As shown in FIG. 4, the multimedia console 100 has a Central Processing Unit (CPU) 200 and a memory controller 202 that facilitates processor access to various types of memory, including a flash Read Only Memory (ROM) 204, a Random Access Memory (RAM) 206, a hard disk drive 208, and the portable media drive 106. In one implementation, CPU 200 includes a level 1 cache 210 and a level 2 cache 212 to temporarily store data and thus reduce the number of memory access cycles made to hard drive 208, thereby improving processing speed and throughput.
The CPU 200, memory controller 202, and various memory devices are interconnected via one or more buses (not shown). The details of the bus used in this implementation are not particularly relevant to understanding the subject matter discussed herein. It should be understood, however, that such a bus may include one or more of serial and parallel buses, a memory bus, a peripheral bus, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures can include an Industry Standard Architecture (ISA) bus, a Micro Channel Architecture (MCA) bus, an Enhanced ISA (EISA) bus, a Video Electronics Standards Association (VESA) local bus, and a Peripheral Component Interconnect (PCI) bus, also known as a mezzanine bus.
In one implementation, the CPU 200, memory controller 202, ROM 204, and RAM 206 are integrated onto a common module 214. In this implementation, ROM 204 is configured as a flash ROM that is connected to memory controller 202 via a PCI bus and a ROM bus (neither of which are shown). RAM 206 is configured as multiple double data Rate synchronous dynamic RAM (DDR SDRAM) modules that are independently controlled by memory controller 202 via separate buses (not shown). Hard disk drive 208 and portable media drive 106 are shown connected to memory controller 202 via the PCI bus and an AT attachment (ATA) bus 216. However, in other implementations, different types of dedicated data bus structures may alternatively be applied.
Graphics processing unit 220 and video encoder 222 form a video processing pipeline for high speed and high resolution (e.g., high definition) graphics processing. Data is transmitted from the graphics processing unit 220 to the video encoder 222 via a digital video bus (not shown). An audio processing unit 224 and an audio codec (coder/decoder) 226 form a corresponding audio processing pipeline for multi-channel audio processing of various digital audio formats. Audio data is transmitted between the audio processing unit 224 and the audio codec 226 via a communication link (not shown). The video and audio processing pipelines output data to an A/V (audio/video) port 228 for transmission to a television or other display. In the illustrated implementation, the video and audio processing components 220-228 are mounted on the module 214.
FIG. 4 shows module 214 including a USB host controller 230 and a network interface 232. USB host controller 230 is shown in communication with CPU 200 and memory controller 202 via a bus (e.g., a PCI bus) and serves as a host for peripheral controllers 104(1)-104(4). The network interface 232 provides access to a network (e.g., the Internet, a home network, etc.) and may be any of a wide variety of wired or wireless interface components, including an Ethernet card, a modem, a wireless access card, a Bluetooth module, a cable modem, and the like.
In the implementation depicted in fig. 4, the console 102 includes a controller support subassembly 240 for supporting four controllers 104(1) -104 (4). The controller support subassembly 240 includes any hardware and software components necessary to support wired and wireless operation with external control devices such as, for example, media and game controllers. The front panel I/O subassembly 242 supports the multiple functions of the power button 112, the eject button 114, as well as any LEDs (light emitting diodes) or other indicators exposed on the outer surface of the console 102. The sub-assemblies 240 and 242 communicate with the module 214 through one or more cable assemblies 244. In other implementations, the console 102 may include additional controller subcomponents. The illustrated implementation also shows an optical I/O interface 235 configured to send and receive signals that may be passed to the module 214.
Memory Units (MUs) 140(1) and 140(2) are shown as being connectable to MU ports "a" 130(1) and "B" 130(2), respectively. Additional MUs (e.g., MUs 140(3) -140(6)) are shown connectable to controllers 104(1) and 104(3), i.e., two MUs per controller. The controllers 104(2) and 104(4) may also be configured to receive MUs (not shown). Each MU 140 provides additional storage on which games, game parameters, and other data may be stored. In some implementations, the other data can include any of a digital game component, an executable gaming application, an instruction set for expanding a gaming application, and a media file. When inserted into console 102 or a controller, MU 140 can be accessed by memory controller 202. The system power supply module 250 supplies power to the components of the gaming system 100. A fan 252 cools the circuitry within console 102.
An application 260 comprising machine instructions is stored on hard disk drive 208. When console 102 is powered on, various portions of application 260 are loaded into RAM 206 and/or caches 210 and 212 for execution on CPU 200. Various applications may be stored on hard disk drive 208 for execution on CPU 200; application 260 is one such example.
By simply connecting gaming and media system 100 to monitor 150 (fig. 1), a television, a video projector, or other display device, system 100 may operate as a standalone system. In this standalone mode, gaming and media system 100 allows one or more players to play games or enjoy digital media, such as watching movies or enjoying music. However, with the integration of broadband connectivity made possible through network interface 232, gaming and media system 100 may also be operated as a participant in a larger network gaming community.
FIG. 5 illustrates a general purpose computing device that can be used to implement another embodiment of computing device 12. With reference to FIG. 5, an exemplary system for implementing the disclosed technology includes a general purpose computing device in the form of a computer 310. Components of computer 310 may include, but are not limited to, a processing unit 320, a system memory 330, and a system bus 321 that couples various system components including the system memory to the processing unit 320. The system bus 321 may be any of several types of bus structures including a memory bus or memory controller, a peripheral bus, and a local bus using any of a variety of bus architectures. By way of example, and not limitation, such architectures include Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus also known as mezzanine bus.
Computer 310 typically includes a variety of computer readable media. Computer readable media can be any available media that can be accessed by computer 310 and includes both volatile and nonvolatile media, removable and non-removable media. By way of example, and not limitation, computer readable media may comprise computer storage media and communication media. Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by computer 310. Communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. The term "modulated data signal" means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared and other wireless media. Combinations of any of the above should also be included within the scope of computer readable media.
The system memory 330 includes computer storage media in the form of volatile and/or nonvolatile memory such as Read Only Memory (ROM) 331 and Random Access Memory (RAM) 332. A basic input/output system 333 (BIOS), containing the basic routines that help to transfer information between elements within computer 310, such as during start-up, is typically stored in ROM 331. RAM 332 typically contains data and/or program modules that are immediately accessible to and/or presently being operated on by processing unit 320. By way of example, and not limitation, FIG. 5 illustrates operating system 334, application programs 335, other program modules 336, and program data 337.
The computer 310 may also include other removable/non-removable, volatile/nonvolatile computer storage media. By way of example only, FIG. 5 illustrates a hard disk drive 341 that reads from or writes to non-removable, nonvolatile magnetic media, a magnetic disk drive 351 that reads from or writes to a removable, nonvolatile magnetic disk 352, and an optical disk drive 355 that reads from or writes to a removable, nonvolatile optical disk 356 such as a CD ROM or other optical media. Other removable/non-removable, volatile/nonvolatile computer storage media that can be used in the exemplary operating environment include, but are not limited to, magnetic tape cassettes, flash memory cards, digital versatile disks, digital video tape, solid state RAM, solid state ROM, and the like. The hard disk drive 341 is typically connected to the system bus 321 through a non-removable memory interface such as interface 340, and magnetic disk drive 351 and optical disk drive 355 are typically connected to the system bus 321 by a removable memory interface, such as interface 350.
The drives and their associated computer storage media discussed above and illustrated in FIG. 5 provide storage of computer readable instructions, data structures, program modules and other data for the computer 310. In FIG. 5, for example, hard disk drive 341 is illustrated as storing operating system 344, application programs 345, other program modules 346, and program data 347. Note that these components can either be the same as or different from operating system 334, application programs 335, other program modules 336, and program data 337. Operating system 344, application programs 345, other program modules 346, and program data 347 are given different numbers here to illustrate that, at a minimum, they are different copies. A user may enter commands and information into the computer 310 through input devices such as a keyboard 362 and pointing device 361, commonly referred to as a mouse, trackball or touch pad. Other input devices (not shown) may include a microphone, joystick, game pad, satellite dish, scanner, or the like. These and other input devices are often connected to the processing unit 320 through a user input interface 360 that is coupled to the system bus, but may be connected by other interface and bus structures, such as a parallel port, game port or a Universal Serial Bus (USB). A monitor 391 or other type of display device is also connected to the system bus 321 via an interface, such as a video interface 390. In addition to the monitor, computers may also include other peripheral output devices such as speakers 397 and printer 396, which may be connected through an output peripheral interface 395.
The computer 310 may operate in a networked environment using logical connections to one or more remote computers, such as a remote computer 380. The remote computer 380 may be a personal computer, a server, a router, a network PC, a peer device or other common network node, and typically includes many or all of the elements described above relative to the computer 310, although only a memory storage device 381 has been illustrated in FIG. 5. The logical connections depicted in FIG. 5 include a Local Area Network (LAN)371 and a Wide Area Network (WAN)373, but may also include other networks. Such networking environments are commonplace in offices, enterprise-wide computer networks, intranets and the Internet.
When used in a LAN networking environment, the computer 310 is connected to the LAN 371 through a network interface or adapter 370. When used in a WAN networking environment, the computer 310 typically includes a modem 372 or other means for establishing communications over the WAN 373, such as the Internet. The modem 372, which may be internal or external, may be connected to the system bus 321 via the user input interface 360, or other appropriate mechanism. In a networked environment, program modules depicted relative to the computer 310, or portions thereof, may be stored in the remote memory storage device. By way of example, and not limitation, FIG. 5 illustrates remote application programs 385 as residing on memory device 381. It will be appreciated that the network connections shown are exemplary and other means of establishing a communications link between the computers may be used.
The hardware devices of fig. 1-5 discussed above may be used to implement a system that generates one or more time-synchronized streams of commentary data based on the interaction of one or more users with the multimedia content streams viewed by those users.
FIG. 6 is a flow chart describing one embodiment of a process for generating a time-synchronized stream of commentary data based on viewer interaction with a stream of multimedia content. In one embodiment, the steps of FIG. 6 may be performed by computing device 12 under the control of software.
In step 600, the identity of one or more viewers within the field of view of the computing device is determined. In one embodiment and as discussed in fig. 2, the identity of the viewer may be determined by receiving input from the viewer identifying the identity of the viewer. In another implementation, the facial recognition engine 192 in the computing device 12 may also perform identification of the viewer.
In step 602, preference information of a viewer is acquired. The viewer's preference information includes one or more user groups in the viewer's social graph for which the viewer wishes to make his or her comments available when viewing the multimedia content stream. In one approach, the viewer's preference information may be obtained from a social graph of viewers stored in the user profile database 207. In another approach, the viewer's preference information may be obtained directly from the viewer via the audiovisual display 16. Fig. 9A illustrates an exemplary user interface screen for obtaining preference information of a viewer. In one example, the viewer's preference information may be obtained from the viewer each time the viewer watches a multimedia content stream, such as a movie or program. In another example, the viewer's preference information may be obtained during initial setup of the viewer's system, each time the viewer logs into the system, or during a particular session, such as just before the viewer starts watching a movie or program. In step 604, the viewer's preference information is provided to the centralized data server 306.
In step 606, the viewer selects multimedia content to view. In step 608, the user-selected multimedia content stream is displayed to the user via the audiovisual device 16. In step 610, it is determined whether the user-selected multimedia content stream includes previous comments from other users. If the multimedia content stream includes previous comments from other users, then in step 612, the steps of the process described in FIG. 6A are performed (630-640).
If the multimedia content stream being viewed by the viewer does not include any previous comments, then in step 614, the viewer's interaction with the multimedia content stream is recorded. As shown in FIG. 2, in one approach, a viewer's interaction with a multimedia content stream may be recorded based on a text message, an audio message, or a video feed provided by the viewer while the viewer is viewing the multimedia content stream. In another approach, a viewer's interaction with the multimedia content stream may also be recorded based on gestures, postures, or facial expressions made by the viewer while the viewer is viewing the multimedia content stream.
In step 616, a time-synchronized stream of commentary data is generated based on the viewer's interaction. The process for generating a time-synchronized review data stream is described in FIG. 6B.
In step 618, the time-synchronized commentary data stream and the program information associated with the multimedia content stream are provided to a centralized data server for analysis. In step 620, the time-synchronized stream of commentary data may optionally be displayed to the viewer via the audiovisual device 16 while the viewer is viewing the stream of multimedia content.
FIG. 6A is a flow chart describing one embodiment of a process for receiving a stream of commentary data generated by other users upon request by a viewer to view the stream of commentary data. In one embodiment, the steps of FIG. 6A are performed when it is determined that the multimedia content stream being viewed by the user includes previous comments from other users (e.g., step 610 of FIG. 6). In step 627, it is determined whether the viewer wishes to view comments from other users. For example, a viewer may have selected a multimedia content stream with previous comments, but may wish to view the multimedia content stream without comments. FIG. 9B illustrates an exemplary user interface screen for obtaining a viewer's request to view comments from other users. If the viewer does not wish to view comments from other users, the multimedia content stream is presented to the viewer via the audiovisual device 16 without displaying comments from other users in step 629.
If the viewer wishes to view comments from other users, program information associated with the multimedia content stream is provided to centralized data server 306 in step 628. In step 630, one or more time-synchronized comment data streams, generated by users whose comments the viewer is eligible to view, are received from centralized data server 306. In one example, the comment viewing qualifications associated with the multimedia content stream being viewed by the viewer may be obtained from the report (e.g., as shown in Table 1) generated by centralized data server 306 of time-synchronized comment data streams related to the particular multimedia content stream viewed by one or more users.
In step 632, the viewer is presented with one or more options to view the time-synchronized commentary data streams from the respective users via a user interface in the audiovisual display 16. In one example, the options include displaying to the viewer a stream of commentary data from one or more particular users. In another example, the options include displaying a stream of commentary data for a particular content type to a viewer. The content types may include text messages, audio messages, and video feeds provided by individual users. The content types may also include gestures and facial expressions provided by the respective users. FIG. 10 illustrates an exemplary user interface screen displaying one or more options for viewing one or more streams of comment data from one or more users.
In step 634, the viewer's selection of one or more options is obtained via the user interface. For example, in one embodiment, the viewer may choose to view all of the text messages and audio messages of users Sally and Bob. In step 636, the time-synchronized streams of commentary data are displayed to the viewer, based on the viewer's selection, via the audiovisual device 16. In step 638, the viewer's own interactions with the multimedia content stream that the viewer is viewing are also recorded simultaneously. This provides other users with the option to re-view the stream and allows other viewers to view multiple sets of comments at a later time.
In step 640, a time-synchronized stream of commentary data is generated based on the viewer's interactions. In step 642, the time-synchronized commentary data stream and the program information associated with the multimedia content stream are provided to a centralized data server for analysis.
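The option-based selection of steps 632-636 amounts to filtering the available streams by user and content type and merging the survivors into time order. A minimal sketch with made-up stream data (the names and tuple layout are illustrative assumptions):

```python
def select_comment_streams(streams, selected_users=None, selected_types=None):
    """streams maps user name -> list of (offset_seconds, content_type, payload).
    Returns the matching comments merged into playback (time) order."""
    result = []
    for user, comments in streams.items():
        if selected_users is not None and user not in selected_users:
            continue
        for offset, ctype, payload in comments:
            if selected_types is None or ctype in selected_types:
                result.append((offset, user, ctype, payload))
    return sorted(result)  # tuples sort by offset first

streams = {
    "Sally": [(120, "text", "Here we go!"), (340, "audio", "sally_0340.wav")],
    "Bob":   [(90, "video", "bob_cam.mp4"), (200, "text", "Nice!")],
}
# As in step 634: all text and audio messages of users Sally and Bob.
picked = select_comment_streams(streams, {"Sally", "Bob"}, {"text", "audio"})
# -> offsets 120 (Sally), 200 (Bob), 340 (Sally); Bob's video feed is excluded.
```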
FIG. 6B is a flow chart describing one embodiment of a process for generating a time-synchronized comment data stream (describing, e.g., step 616 of FIG. 6 and step 640 of FIG. 6A in more detail). In step 650, the actual start time of the presentation of the multimedia content stream to the viewer is determined. For example, if a multimedia content stream, such as a television program, is broadcast to viewers at 9:00 PM (Pacific Standard Time), the actual start time of the multimedia content stream is determined to be 0 hours, 0 minutes, and 0 seconds in one embodiment. In step 652, a timestamp of each interaction by the viewer relative to the actual start time is determined. For example, if a viewer's interaction with a television program that the viewer is watching is recorded at 9:12 PM (Pacific Standard Time), the timestamp of the viewer's interaction relative to the actual start time is determined to be 0 hours, 12 minutes, and 0 seconds. In step 654, a time-synchronized comment data stream is generated. The time-synchronized comment data stream includes the viewer's interactions, time-stamped relative to the actual start time of the presentation of the multimedia content stream to the viewer.
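The timestamp arithmetic of steps 650-654 can be sketched directly; this is a minimal illustration under assumed names, not the patent's implementation:

```python
from datetime import datetime

def generate_time_synchronized_stream(actual_start, interactions):
    """actual_start: datetime at which presentation of the stream began.
    interactions: list of (datetime, payload) pairs recorded for the viewer.
    Returns (offset_seconds, payload) pairs relative to the actual start."""
    return [((t - actual_start).total_seconds(), payload)
            for t, payload in interactions]

start = datetime(2011, 5, 1, 21, 0, 0)                # program starts at 9:00 PM
events = [(datetime(2011, 5, 1, 21, 12, 0), "Wow!")]  # interaction at 9:12 PM
stream = generate_time_synchronized_stream(start, events)
# -> [(720.0, 'Wow!')], i.e. 0 hours, 12 minutes, 0 seconds into the program
```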
FIG. 7 is a flow diagram describing one embodiment of a process for generating a report of the time-synchronized commentary data streams related to a particular multimedia content stream viewed by one or more users. In one embodiment, the steps of FIG. 7 may be performed by the comment data stream aggregation module 312 in centralized data server 306 under the control of software. In step 700, one or more time-synchronized commentary data streams, program information associated with a multimedia content stream, and preference information associated with one or more users are received from one or more client devices 300A, 300B, …, 300X. In step 702, a report of the time-synchronized commentary data streams related to the particular multimedia content stream viewed by the users is generated. An exemplary illustration of such a report is shown in Table 1 above. In one embodiment and as discussed in FIG. 2, the preference information of the one or more users is used to determine the comment viewing qualifications, i.e., the group of users to whom each user wishes to make his or her comments available for viewing.
FIG. 8 is a flow diagram describing one embodiment of a process for providing a viewer with the commentary data streams generated by other users, based on the viewer's comment viewing eligibility. In one embodiment, the steps of FIG. 8 may be performed by the centralized data server 306 under the control of software. In step 704, a request to view one or more previous commentary data streams related to the multimedia content being viewed by the viewer is received from one or more client devices 300A, 300B, …, 300X. Step 704 may be performed, for example, when a request is received from one or more client devices at step 628 of FIG. 6A. In step 706, one or more users providing comments related to the multimedia content stream are identified. In one example, the one or more users may be identified by referencing the report (e.g., as shown in Table 1) of time-synchronized commentary data streams related to the particular multimedia content stream viewed by the one or more users. In step 708, the subset of users whose comments the viewer is eligible to view is identified. In one example, the subset of users may be identified by reference to the "comment viewing qualification" field shown in Table 1. For example, if the viewer is among the users in the user group listed in the "comment viewing qualification" field provided by a particular user, the viewer is eligible to view the comments provided by that user. In step 710, the time-synchronized commentary data streams of the subset of users are provided to the viewer at the viewer's client device.
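Steps 706-710 reduce to a membership test against each commenting user's qualification group. A sketch, assuming a hypothetical social-graph mapping (none of these names come from the patent):

```python
def eligible_comment_streams(report_entries, viewer_id, social_graph):
    """Return the report entries whose comments the viewer may view.
    report_entries: dicts with "user_id", "qualification", "stream".
    social_graph: maps a user to {"friends": set(...), "family": set(...)}."""
    eligible = []
    for entry in report_entries:
        qualification = entry["qualification"]
        if qualification == "everyone":
            eligible.append(entry)
        elif viewer_id in social_graph.get(entry["user_id"], {}).get(qualification, set()):
            eligible.append(entry)
    return eligible

graph = {"Sally": {"friends": {"Bob", "Ann"}, "family": {"Tom"}}}
entries = [
    {"user_id": "Sally", "qualification": "friends", "stream": []},
    {"user_id": "Sally", "qualification": "family", "stream": []},
]
# Bob is in Sally's friends group, so he may view only the first stream.
visible = eligible_comment_streams(entries, "Bob", graph)
```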
FIG. 9A illustrates an exemplary user interface screen for obtaining the viewer's preference information prior to recording the viewer's interaction with a multimedia content stream. As discussed above, the viewer's preference information includes the group of users to whom the viewer wishes to make his or her comments available for viewing. In one example, the user groups may include the viewer's friends, family, or the entire world. In the exemplary illustration of FIG. 9A, the viewer may be presented with text such as "Select the group of users with whom you want to share your comments!". In one example, the viewer may check one or more of boxes 902, 904, or 906 to select one or more user groups.
FIG. 9B illustrates an exemplary user interface screen for obtaining a viewer's request to view comments from other users. In the exemplary illustration of FIG. 9B, the viewer may be presented with text such as "Do you wish to view the comments of other users?". In one example, the viewer's request may be obtained when the viewer selects the "Yes" or "No" box.
FIG. 10 illustrates an exemplary user interface screen displaying to the viewer one or more options for viewing the time-synchronized commentary data streams associated with a multimedia content stream. In one example, the viewer may view data from one or more particular commentary data streams by selecting one or more of boxes 910, 912, or 914. As further illustrated, in one example, the time-synchronized commentary data streams from various users may be categorized as "live" or "offline". As used herein, a user's commentary data stream is classified as "live" if the user provided the comments during the live broadcast of a program, and as "offline" if the user provided the comments while viewing a recording of the program. The "live" or "offline" classification of a commentary data stream may be derived from the broadcast time/date of the program in the report of time-synchronized commentary data streams generated by the centralized data server 306. It will be appreciated that the "live" and "offline" categories of time-synchronized commentary data streams provide the viewer with the option of viewing only comments from users who watched the program live, or only comments from users who watched a recording of the program. As further shown in FIG. 10, the viewer may select one or more of boxes 910, 912, 914, or 916 to select one or more user groups. In another example, the viewer may also view time-synchronized commentary data streams of a particular content type, such as text messages, audio messages, video feeds, gestures, or facial expressions provided by one or more users. The viewer may select one or more of boxes 918, 920, 922, or 924 to view the time-synchronized commentary data streams of a particular content type from one or more users. In another example, while viewing the multimedia content stream, the viewer may also view, via the audiovisual device 16, real-time data updates provided to the viewer's mobile computing device from the third-party information source 54.
In another example, the viewer may also choose not to view any commentary data streams from any of the users.
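The "live"/"offline" classification described above can be sketched as a comparison against the program's broadcast window. The function and field names below are illustrative assumptions; the patent only specifies that the classification is derived from the reported broadcast time/date.

```python
# Illustrative sketch (assumed names) of the "live"/"offline" classification:
# a commentary data stream is "live" when the user commented during the
# program's original broadcast window, and "offline" otherwise.
from datetime import datetime, timedelta

def classify_stream(commented_at, broadcast_start, duration_minutes):
    """Classify a commentary data stream as 'live' or 'offline'."""
    broadcast_end = broadcast_start + timedelta(minutes=duration_minutes)
    if broadcast_start <= commented_at <= broadcast_end:
        return "live"      # comments made during the live broadcast
    return "offline"       # comments made while viewing a recording

start = datetime(2010, 12, 16, 20, 0)  # broadcast date/time from the report
print(classify_stream(datetime(2010, 12, 16, 20, 15), start, 60))  # live
print(classify_stream(datetime(2010, 12, 17, 9, 0), start, 60))    # offline
```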
FIGS. 11A, 11B, and 11C illustrate exemplary user interface screens in which one or more time-synchronized commentary data streams related to a multimedia content stream are displayed to the viewer. In the exemplary illustration, time-synchronized commentary data streams 930, 932 include the comments of users Sally and Bob, respectively. As further shown, the time-synchronized commentary data streams 930, 932 are time-synchronized with respect to the actual start of the multimedia content stream. As the viewer views the multimedia content stream, the commentary data streams 930, 932 recreate for the viewer the experience of viewing the multimedia content live with other users.
FIG. 11A illustrates one embodiment of the technology in which a text message appears at time point 10:02 in the data stream. The text message may have been sent by Sally at that point while she viewed the content, and it is thus recreated as text on the screen of the viewing user. FIG. 11B shows a voice message or voice comment played through the audio output. It should be understood that the audio output need not have any visual indicator, or it may include a small indicator, as shown in FIG. 11B, to indicate that the audio is not part of the multimedia content stream.
FIG. 11C shows a user avatar 1102 and a recorded video clip 1104 representing Sally and Bob, respectively. Where the capture device and tracking system discussed above are utilized, an avatar that mimics the user's movements and audio may be generated by the system. Additionally, a video clip 1104 of the user may be recorded by the capture device and tracking system. All or a portion of the commenting user may be displayed. For example, a full-body image is displayed in Sally's avatar 1102, but only Bob's face is displayed in Bob's video clip to show that Bob is saddened by that portion of the content. Either or both of these user representations may be provided in the viewing user's interface. Whether a commenting user is displayed as an avatar having the likeness of the user, as an avatar representing something other than the user, or as a video showing the commenting user may be configured by either the commenting user or the viewing user. Further, although only two commenting users are shown, any number of commenting users may be presented in the user interface. In addition, the rendered size of the avatar or video may be varied, from a relatively small display portion to a larger display portion. The avatar or video may be presented in a separate window or overlaid on the multimedia content.
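Because each commentary data stream is time-synchronized with respect to the actual start of the multimedia content stream, playback of comments (such as Sally's text message at 10:02) reduces to checking which timestamps fall within the player's current position. A minimal sketch, assuming comments are stored as offsets in seconds from the actual start; the names and tuple layout are illustrative, not from the patent.

```python
# Minimal playback-alignment sketch: comments carry offsets (in seconds)
# relative to the actual start of the multimedia content stream, and the
# player surfaces each comment as its offset is reached.
def due_comments(stream, last_pos, current_pos):
    """Return comments whose timestamp falls within (last_pos, current_pos]."""
    return [(t, who, msg) for (t, who, msg) in stream if last_pos < t <= current_pos]

stream = [
    (602, "Sally", "text: Wow, did you see that?"),  # 10:02 into the program
    (615, "Bob", "audio: <voice comment>"),          # 10:15 into the program
]
# Simulated playback tick: the player advanced from 10:00 to 10:05, so
# Sally's 10:02 text message becomes due for display.
print(due_comments(stream, 600, 605))
```

Whether a due comment is rendered as on-screen text, played as audio, or shown through an avatar or video clip is a separate presentation decision, as described for FIGS. 11A through 11C.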
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims. The scope of the present technology is defined by the appended claims.

Claims (10)

1. A computer-implemented method for generating a time-synchronized stream of commentary data based on viewer interaction with a stream of multimedia content, comprising the computer-implemented steps of:
identifying (600) a viewer within a field of view of a capture device connected to a computing device;
receiving (606), via the computing device, a selection of a multimedia content stream to view from the viewer;
recording (614) the viewer's interactions with the multimedia content stream being viewed by the viewer;
generating (616) a time-synchronized commentary data stream based on the viewer's interaction; and
displaying (636), via an audiovisual device connected to the computing device and in response to a request (634) from the viewer, one or more time-synchronized commentary data streams related to the multimedia content stream being viewed by the viewer.
2. The computer-implemented method of claim 1, wherein the viewer's interactions include one or more of text messages, audio messages, and video feeds provided by the viewer, or gestures and facial expressions made by the viewer, while the viewer is viewing the multimedia content stream.
3. The computer-implemented method of claim 1, further comprising obtaining preference information related to the multimedia content stream, the preference information including one or more user groups in the viewer's social graph that are eligible to view the viewer's interactions with the multimedia content stream.
4. The computer-implemented method of claim 1, wherein generating the time-synchronized commentary data stream further comprises:
determining an actual start time for presentation of the multimedia content stream to the viewer;
determining a timestamp of the viewer interaction relative to the actual start time; and
generating a time-synchronized commentary data stream that includes viewer interactions that are time-stamped relative to an actual start time of presentation of the multimedia content stream to the viewer.
5. The computer-implemented method of claim 1, wherein displaying the one or more time-synchronized comment data streams further comprises:
obtaining the viewer's comment viewing eligibility related to the multimedia content stream being viewed by the viewer;
presenting, via a user interface, one or more options for viewing the one or more time-synchronized commentary data streams based on the comment viewing eligibility;
obtaining a selection of the one or more options from the viewer via the user interface; and
displaying the one or more time-synchronized comment data streams to a viewer based on the viewer's selection.
6. The computer-implemented method of claim 1, wherein displaying the one or more time-synchronized comment data streams further comprises:
recording the viewer's interactions with the multimedia content stream while presenting the one or more time-synchronized commentary data streams to the viewer.
7. A system for generating a time-synchronized stream of commentary data based on viewer interaction with a stream of multimedia content, comprising:
one or more client devices (300) in communication with a centralized data server (306) via a communication network (304), the one or more client devices including instructions that cause a processing device in the client devices to:
recording (614) interactions from one or more viewers with a multimedia content stream being viewed by the one or more viewers;
generating (616) one or more time-synchronized comment data streams based on the interactions;
providing (618) the time-synchronized commentary data streams to the centralized data server;
receiving (634) a selection from the one or more viewers to view multimedia content;
determining (702) whether a stream of commentary data exists for the multimedia content; and
presenting (636) the multimedia content with the commentary data stream when the one or more viewers select to view the commentary data stream.
8. The system of claim 7, wherein:
the one or more client devices record visual and audio interactions of the one or more viewers in the stream of commentary data.
9. The system of claim 8, further comprising:
an audiovisual device connected to the one or more client devices, the audiovisual device displaying visual and audio interactions of other users in a stream of commentary data with a stream of multimedia content being viewed by the viewer.
10. The system of claim 9, further comprising at least one of:
a depth camera connected to the one or more client devices, the depth camera tracking interactions from the one or more viewers based on movements, gestures, poses, and facial expressions made by the one or more viewers within a field of view of the one or more client devices; or
A mobile computing device connected to the one or more client devices, the mobile computing device receiving the interactions from the one or more viewers, and the mobile computing device being synchronized to the one or more client devices streaming the multimedia content stream to the viewers.
HK12111904.8A 2010-12-16 2012-11-21 Simulated group interaction with multimedia content HK1171304A (en)

Applications Claiming Priority (1)

Application Number: US12/970,855; Priority Date: 2010-12-16

Publications (1)

Publication Number: HK1171304A; Publication Date: 2013-03-22
