WO2018191169A1 - Synchronous capture and playback of 4d content - Google Patents
- Publication number
- WO2018191169A1 (PCT/US2018/026717)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- audio
- content data
- content
- pseudo
- captured
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/47—End-user applications
- H04N21/472—End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
- H04N21/4722—End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for requesting additional data associated with the content
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/4302—Content synchronisation processes, e.g. decoder synchronisation
Definitions
- the present invention relates to the field of electronic capture of content, in particular to synchronous capture and playback of 4D content.
- a program code is developed which, when executed by one or more computing devices, generates audio-visual content such as VR or AR content.
- game developers develop specific programs that project VR content into the VR goggles based on the movement of the person (as recorded by handheld joysticks or motion-detection cameras, for example).
- the surround sound audio is generated by software tools analyzing, processing and generating different audio channels (for voice, ambience, etc.) to be outputted to different speakers to create the surround sound effect.
- audio-visual content refers herein to audio content and/or video content and therefore extends to media content that is limited to only audio, only visual representations (such as images or video content), or both.
- the problem is further exacerbated for 4D effects, that is, the non-audio-visual effects that can be perceived by a person, such as multi-directional movement, speed, acceleration, smell and touch, particularly because of the challenges in reproduction.
- the 4D effect is programmatically generated. For example, using specialized equipment such as a 4D chair (such as a motion simulating chair), a gamer may experience movement because the game's program controls the coupled 4D chair.
- the game program's complex logic directs the chair to move according to the game program's calculations based on the user's game inputs and other variables.
- FIG. 1A-D are block diagrams that depict 4D Synchronous Capture System (4D-SCS) 100, in one or more embodiments.
- FIG. 2 is a flow diagram that depicts a process for generating real-time 4D content and synchronizing the 4D content with captured audio-visual content, in an embodiment.
- FIG. 3 is a graph that depicts 4D content data generated over time, in an embodiment.
- FIG. 4A is a graph that depicts a pseudo-audio signal for roll and pitch angles, in an embodiment.
- FIG. 4B is a graph that depicts a frequency spectrum of a pseudo-audio signal for roll and pitch angles, in an embodiment.
- FIG. 5 is a block diagram that illustrates a computer system upon which an embodiment of the invention may be implemented.
- a 4D synchronous capture system comprises one or more sensors.
- the system continuously captures and records the sensor data.
- the generation of 4D content data by the system is performed concurrently with the capture of video content and/or audio content by a media capturing device internal or external to the system (such as a non-professional, semi- professional, or professional (video) camera system).
- FIG. 1A-D are block diagrams that depict an example of 4D Synchronous Capture System (4D-SCS), 4D-SCS 100, in one or more embodiments.
- the 4D-SCS uses one or more "synchronization timestamps" to synchronize the 4D content being captured with the simultaneously captured audio-visual content, and may record the synchronization timestamps along with the corresponding 4D content data values in storage of the system.
- the "synchronization timestamp" term refers herein to a reference timestamp from a sequence of reference timestamps generated by the system at a regular interval.
- a non-limiting example of a synchronization timestamp is a timecode.
- the system may generate a synchronization timestamp for
- FIG. 1B depicts external media capturing device 150 that is configured to receive an external synchronization signal, which includes synchronization timestamps at a regular interval.
- the external device may record the external synchronization signal within the concurrently captured audio-visual content and may store the audio-visual content in a media content item on its internal storage. Accordingly, the 4D content captured by 4D-SCS 100 and the audio-visual content captured by media capturing device 150 are associated with the same timing information.
- the 4D-SCS encodes 4D content into an audio signal in an unused audio channel in real time.
- the "pseudo-audio channel" term refers herein to an audio channel of an audio recording into which 4D content is encoded.
- the "pseudo-audio signal" term refers herein to the encoded audio signal in the pseudo-audio channel that contains captured 4D content.
- FIGS. 1C and ID depict 4D-SCS 100 generating a pseudo-audio signal, in one or more embodiments.
- when one or more new sensor readings of 4D content are received, 4D-SCS 100 generates the corresponding pseudo-audio signal according to the techniques described herein.
- 4D-SCS 100 may directly transmit the pseudo-audio signal into an audio input of external media capturing device 150.
- external media capturing device 150 captures audio-visual content using one or more other audio channels of the audio recording for the audio content. Because the media capturing device records the audio-visual content's audio in parallel in those other channels, the concurrently captured 4D content in the pseudo-channel is automatically synchronized with the audio-visual content.
- another channel of the same audio recording that includes a pseudo-audio channel is used to capture sound from a microphone of the 4D-SCS (such as microphone 132 of 4D-SCS 100).
- the pseudo-channel captured by the 4D-SCS can be readily synchronized (e.g., in postproduction) with the external device's recording using the audio channels that carry the recorded sound. This is possible because the sound captured by the external media capturing device (such as external media device 150) is the same as the sound captured by the 4D-SCS's microphone in the audio recording that also contains the already synchronized pseudo-audio channel.
- FIG. 1 is a block diagram that depicts 4D Synchronous Capture System (4D-SCS) 100, in an embodiment.
- 4D-SCS 100 comprises one or more of processing unit(s) 105, sensor unit(s) 120 and storage unit 140.
- Processing unit(s) 105 may periodically request sensor data from sensor unit(s) 120, or sensor unit(s) 120 may be configured by processing unit(s) 105 to periodically provide processing unit(s) 105 with sensor data.
- the frequency used for receiving sensor data from sensor unit(s) 120 may vary based on the processing capabilities of 4D playback system(s).
- a 10 Hz frequency may be used to retrieve sensor data (every 0.1 seconds) because the playback device of 4D content may not be sensitive enough to adjust to a new 4D content data value faster than every 0.1 second.
- the sensor read frequency also determines the interval for the time series sensor data when sensor data is processed and stored in storage unit 140, in an embodiment.
- Processing unit(s) 105 may process raw sensor data readings from sensor unit(s) 120 to generate 4D content. Processing unit(s) 105 store the 4D content in 4D content data store 142 of storage unit 140, and/or encode the 4D content into a pseudo-audio channel of media content items 144 according to techniques described herein, and/or transmit it directly to external media capturing device 150. For example, a GPS receiver may provide raw coordinates of 4D-SCS 100's location. Processing unit(s) 105 may use previously received coordinates and the sensor read frequency to determine the velocity, and store/encode 4D content for velocity instead of storing/encoding the raw coordinates.
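The velocity derivation described above can be sketched as follows. The helper names are hypothetical (not from the patent); the key idea is that the distance between two consecutive fixes divided by the read interval (1/f) yields speed:

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two GPS fixes."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def velocity_mps(prev_fix, curr_fix, read_freq_hz):
    """Speed in m/s from two consecutive fixes taken 1/read_freq_hz seconds apart."""
    dist = haversine_m(*prev_fix, *curr_fix)
    return dist * read_freq_hz  # distance / (1 / f) == distance * f
```

This is a minimal sketch; a production system would also handle GPS jitter and missed fixes.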
- 4D-SCS 100 may include media capturing unit 140.
- Media capturing unit captures audio-visual content using camera 134 and microphone 132 and stores the audiovisual content in media content items 144 on storage unit 140.
- one or more audio channels (pseudo-audio channel(s)) of the audio recording are used to record the 4D content.
- the audio channels, including the pseudo-channel(s), may be stored in media content items 144 on storage unit 140 or may be streamed to an external playback device or an external media capturing device such as external media capturing device 150.
- External media capturing device 150 captures audio-visual content using its own camera and microphone among other media capturing components.
- Media capturing device 150 may be communicatively coupled with 4D-SCS 100 to receive an audio stream from 4D-SCS 100.
- the audio stream comprises one or more audio and pseudo-audio channels streamed by 4D-SCS 100.
- the external media capturing device 150 captures its own audio-visual content as well as the audio content streamed by 4D-SCS that includes 4D content.
- External capturing device 150 may store both streams of audio-visual content in the same one or more media content items, on external capturing device 150 or elsewhere.
- external media device 150 is communicatively coupled with 4D-SCS 100 to receive a synchronization signal generated by 4D-SCS 100. Based on the synchronization signal, external media device 150 captures synchronization timestamps generated by 4D-SCS 100 - the same one or more synchronization timestamps that are used in timestamping 4D content in 4D content data store 142. Accordingly, the 4D content in 4D content store 142 can be synchronized with the audio-visual content captured by external media capturing device 150.
- FIG. 2 is a flow diagram that depicts a process for generating real-time 4D content and synchronizing the 4D content with captured audio-visual content, in an embodiment.
- a 4D-SCS initiates the 4D content capture.
- the initiation may be automatically triggered by the initiation of audio-visual content capture.
- a media device (internal or external to the 4D-SCS) receives an input to initiate recording of audio-visual content.
- the media device requests the 4D-SCS to initiate the capture of 4D content in sync with the audio-visual content.
- the 4D-SCS receives a user input to initiate the capture of 4D content in sync with separately initiated capture of audio-visual content, or the 4D-SCS initiates the capture of 4D content as well as the audio-visual content.
- one or more sensor unit readings are initialized before the capture of 4D content.
- the sensor reading(s) at the time of the initialization may be denoted as the baseline sensor reading(s).
- At least a portion of 4D content may be generated based on the difference between the next sensor reading(s) and the respective baseline sensor reading(s).
- the initial reading(s) from the accelerometer and/or gyroscope before the 4D content capture initiation are denoted as the initial (zero) position of the 4D-SCS.
- the subsequent roll, pitch and yaw angles are calculated relative to the initial sensor reading(s) denoting the initial position of the 4D-SCS.
- the coordinates received during the initialization are denoted as the initial location coordinates of the 4D-SCS.
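The baseline mechanism above can be sketched as a small helper (hypothetical, not named in the patent) that records the reading(s) at initialization and reports subsequent 4D content as differences from that baseline:

```python
class BaselineSensor:
    """Reports sensor readings relative to a baseline captured at
    initialization, per the zero-position scheme described above."""

    def __init__(self, initial_reading):
        # initial_reading: dict of sensor name -> value at capture init
        self.baseline = dict(initial_reading)

    def delta(self, reading):
        """4D content value(s): current reading minus the baseline."""
        return {k: reading[k] - self.baseline[k] for k in self.baseline}
```

For example, a baseline roll of 2° and a later reading of 5° yields a 3° roll for the 4D content data set.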
- the 4D-SCS retrieves one or more readings from sensor unit(s), in response to the initiation of the 4D content capture.
- a sensor reading may describe environmental parameters at the point in time when the sensor reading is captured; examples include an angular velocity, a directional acceleration, an angular position, a location coordinate, an ambient temperature and humidity.
- the point of time of the capture can be determined based on the sensor read frequency or by associating a synchronization timestamp indicating the point of time.
- the sensor reading(s) are further transformed to generate 4D content. For example, angular velocities around each axis, collected by a gyroscope, and/or the directional accelerations along each axis, collected by an accelerometer, are used to calculate the current angle of rotation about each axis.
- Table 1 below depicts examples of raw sensor data from a gyroscope and an accelerometer that is used to calculate the corresponding angles of rotation, in an embodiment.
- gx, gy and gz are example raw sensor readings from the gyroscope; ax, ay, az are example raw sensor readings from the accelerometer. Each row of the readings may correspond to a particular time at which both the gyroscope readings and accelerometer readings are retrieved.
- the 4D-SCS retrieves sensor data at a predetermined interval (using a pre-determined sensor read frequency), thus generating a time series of sensor data.
- the time interval between the sensor readings in the first row and the second row of Table 1 may be the same as the time interval between the sensor readings in the second row and the third row.
- the corresponding 4D content data is generated, at step 215.
- the gx, gy, gz, ax, ay and az sensor data is used to generate 4D content data for pitch angle 310 and roll angle 320. Since each row of sensor readings corresponds to a particular time, each of the generated 4D content data values corresponds to the same respective time.
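The patent does not specify the fusion algorithm; a complementary filter is one common way to derive roll and pitch from raw gyroscope and accelerometer rows like those of Table 1. The following is a sketch under that assumption, with the column order following Table 1 and units assumed to be deg/s (gyroscope) and g (accelerometer):

```python
import math

def fuse_angles(rows, dt=0.1, alpha=0.98):
    """Complementary-filter sketch: fuse gyroscope rates with the
    accelerometer's gravity direction into (roll, pitch) in degrees.
    rows: iterable of (gx, gy, gz, ax, ay, az) sampled every dt seconds."""
    roll = pitch = 0.0  # initial (zero) position of the 4D-SCS
    series = []
    for gx, gy, gz, ax, ay, az in rows:
        # Accelerometer-only angle estimate from the gravity vector.
        acc_roll = math.degrees(math.atan2(ay, az))
        acc_pitch = math.degrees(math.atan2(-ax, math.hypot(ay, az)))
        # Blend the integrated gyro rate with the accelerometer estimate.
        roll = alpha * (roll + gx * dt) + (1 - alpha) * acc_roll
        pitch = alpha * (pitch + gy * dt) + (1 - alpha) * acc_pitch
        series.append((roll, pitch))
    return series
```

At the 10 Hz read frequency discussed above, `dt=0.1` produces one (roll, pitch) pair per sensor row, matching the time series of FIG. 3.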
- FIG. 3 is a graph that depicts 4D content data generated over time, in an embodiment.
- Pitch angle 310 and roll angle 320 are generated based on periodic sensor readings at a sensor read frequency of 10 Hz. Accordingly, each of pitch angle 310's and roll angle 320's data points on the vertical axis of FIG. 3 is spaced 1/10 of a second apart on the horizontal time axis. As new raw sensor data is retrieved, new pitch angle 310 and roll angle 320 points may be generated in real time.
- the 4D content data is generated without transformation of retrieved raw sensor data.
- temperature sensor readings may represent 4D content data for temperature without any need for further transformation and can be readily recorded or encoded as 4D content data.
- the 4D-SCS synchronizes the captured 4D content with concurrently captured audio-visual content, according to embodiments.
- the synchronization causes the generated 4D content data and the concurrently captured audiovisual content data to be associated with the same timing information, i.e. data values of the 4D content are time-wise aligned with concurrent elements of the audio-visual content.
- the 4D-SCS encodes the generated 4D content into a real-time signal that is made part of the concurrently captured audio-visual content.
- the 4D-SCS may transform the 4D content data into one or more pseudo-audio signals recorded into one or more audio channels (pseudo-audio channel(s)) of an audio recording.
- the initiation of 4D-SCS capture may cause the initiation of capture of an audio signal in at least one different audio channel of the audio-channels of the same audio recording. For example, for a stereo audio recording that contains a right channel and a left channel, the left channel may be used for an audio signal recording sound through a microphone, as an example, while the right channel may be used for a pseudo-audio signal, or vice versa.
- 4D content data may be encoded as an audio signal of different frequencies that correspond to different values of 4D content data.
- the 4D-SCS calculates the corresponding frequency value and generates the pseudo-audio signal of the corresponding frequency.
- 4D content data of multiple different sensors is encoded simultaneously into a pseudo-audio signal.
- the audio spectrum may be divided into frequency ranges and each of the ranges may be used for a particular type of 4D content.
- roll angle data and pitch angle data may be encoded into a pseudo-audio signal. Similar techniques for encoding can be used for other 4D content such as velocity, temperature and humidity.
- a low frequency spectrum of sound is used to encode roll angles: from 550 Hz to 5,050 Hz. In the selected sound spectrum, a 50 Hz band is reserved for each integer value of the angle to lower the risk of noise distortion in transmission. Accordingly, a frequency for a value of a roll angle may be calculated using the following sample equation:
- F_roll = (a_roll + 46°) * 50 Hz + 500 Hz
- F_roll is the frequency of the pseudo-audio signal to be generated to encode a roll angle value a_roll of 4D content
- 46° is a constant offset to transform the roll angles (from -45° to +45°) into positive integer values from 1 to 91
- 50 Hz is the reserved spectrum for each integer angle value
- 500 Hz is the offset to ensure the pseudo-audio signal is in the audible audio spectrum range that starts from 550 Hz.
- a high frequency spectrum of sound may be used to encode pitch angles: from 5,550 Hz to 10,050 Hz. Accordingly, a frequency for a value of a pitch angle may be calculated using the following sample equation:
- F_pitch = (a_pitch + 46°) * 50 Hz + 5,500 Hz
- F_pitch is the frequency of the pseudo-audio signal to be generated to encode a pitch angle value a_pitch of 4D content
- 46° is an offset constant to transform the pitch angles into positive integer values from 1 to 91
- 50 Hz is the selected reserved spectrum for each integer angle value
- 5,500 Hz is the offset to ensure the pseudo-audio signal is in the high range of the audible audio spectrum, and thus doesn't overlap with the pseudo-audio signal spectrum used to encode roll angles.
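The two sample equations for roll and pitch can be sketched directly in code. The helper names, the 44.1 kHz sample rate, and the rounding of angles to integers are assumptions (the patent only gives the frequency mapping); the 50 Hz banding and the 500 Hz / 5,500 Hz offsets follow the equations:

```python
import math

SAMPLE_RATE = 44100  # assumed audio sample rate

def roll_freq(angle_deg):
    """Roll angles (-45 deg .. +45 deg) map to 550-5,050 Hz in 50 Hz steps."""
    return (round(angle_deg) + 46) * 50 + 500

def pitch_freq(angle_deg):
    """Pitch angles map to the non-overlapping 5,550-10,050 Hz band."""
    return (round(angle_deg) + 46) * 50 + 5500

def pseudo_audio_chunk(roll_deg, pitch_deg, duration_s=0.1):
    """One sensor interval's worth of samples: the sum of the two tones."""
    n = int(SAMPLE_RATE * duration_s)
    fr, fp = roll_freq(roll_deg), pitch_freq(pitch_deg)
    return [0.5 * math.sin(2 * math.pi * fr * i / SAMPLE_RATE)
            + 0.5 * math.sin(2 * math.pi * fp * i / SAMPLE_RATE)
            for i in range(n)]
```

At the 10 Hz read frequency, each chunk covers 0.1 s, so the tone frequencies change at each sensor interval as depicted in FIG. 4A.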
- FIG. 4A is a graph that depicts a pseudo-audio signal for roll and pitch angles, in an embodiment.
- FIG. 4B is a graph that depicts a frequency spectrum of a pseudo-audio signal for roll and pitch angles, in an embodiment.
- the frequency of the pseudo-audio signal at each 0.1 s corresponds to different roll angle and pitch angle values.
- the higher frequency dotted line corresponds to the pitch angle values and the lower frequency dotted line corresponds to the roll angle values.
- 4D content data values are encoded into the pseudo channel of an audio recording concurrently with audio signal from a microphone being encoded into another audio channel of the same audio recording. Because of the concurrent capture into the same audio recording, the captured audio content from the microphone and the simultaneously captured 4D content are associated with the same timing information.
- the 4D-SCS stores the encoded pseudo-audio signal in a pseudo-audio channel of an audio content file, in an embodiment.
- the same audio content file also stores the concurrently captured audio-visual content.
- the 4D-SCS transmits the encoded pseudo-audio signal in the pseudo-audio channel to an external media device.
- the external media device may have one or more audio inputs for an external microphone or another audio capturing device to capture audio for the video content being captured by the media device.
- Such an audio input can be readily used to receive audio recordings that include one or more pseudo-audio channels from the 4D-SCS.
- the aggregated audio-visual content is streamed, at step 235, in one embodiment.
- a 4D playback device may receive such a stream to synchronously playback 4D content of the streamed one or more pseudo-channels as well as the audio-visual content streamed together.
- the 4D-SCS may generate one or more synchronization timestamps for associating the synchronization timestamps with the generated 4D content at step 215, in an embodiment.
- the 4D-SCS may generate a synchronization signal.
- the synchronization signal may comprise a sequence of synchronization timestamps, such as timecodes.
- One example of a synchronization signal is a Linear (or Longitudinal) Timecode (LTC) signal, which is an encoding of synchronization timestamps, such as timecodes, into an audio signal.
- Other examples of synchronization signals may be used, such as MIDI Timecode (MTC), Control Track Longitudinal (CTL), AES3, and Rewriteable Consumer Timecode (RCTC or RC).
- the 4D-SCS stores the generated 4D content data value in 4D content data set in an association with a synchronization timestamp.
- the synchronization timestamp represents the time at which the data value is stored in the 4D content data set; or at which the 4D content data value was generated; or at which the sensor reading(s) for the data value were retrieved/captured.
- the 4D-SCS may explicitly or implicitly associate the synchronization timestamp with a 4D content data value in the 4D content data set.
- the 4D-SCS may explicitly associate the synchronization timestamp with the data value by storing the synchronization timestamp in association with the data value.
- the 4D-SCS may implicitly associate the synchronization timestamp with the data value by storing 4D content data set as time series and associating the synchronization timestamp only with the initial 4D content data value(s).
- the rest of the data values in the 4D content data set can be associated with their respective synchronization timestamps, which are determined using the initial synchronization timestamp, the sensor read frequency, and the index of the data value in the time series.
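The implicit association described above reduces to simple arithmetic. A sketch (hypothetical function name) that recovers the timestamp, in seconds, of any value in the time series from the initial synchronization timestamp alone:

```python
def implicit_timestamp(initial_ts_s, read_freq_hz, index):
    """Timestamp of the index-th value in a time-series 4D content data
    set, given only the initial synchronization timestamp (in seconds)
    and the sensor read frequency. index 0 is the initial value."""
    return initial_ts_s + index / read_freq_hz
```

For a 10 Hz read frequency and an initial timestamp of 12.0 s, the fifth subsequent value falls at 12.5 s, so only the first value needs an explicitly stored timestamp.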
- the 4D-SCS may send a synchronization signal to a media capturing device for the media capturing device to synchronize the concurrently captured audio-visual content.
- the media capturing device, upon the receipt of the synchronization timestamp, associates the newly captured audio-visual content with the received synchronization timestamp, which has also already been associated with the 4D content captured at the same time. Thereby, the nearly simultaneously captured 4D content and audio-visual content are associated with the same timing information, i.e. data values of the 4D content are time-wise aligned with audio-visual content elements.
- a 4D content playback device comprising one or more components of a computer system, may receive 4D content in a media content item that is accessible as a file or in any other storage format.
- the 4D content may be encoded into one or more pseudo-audio signals of pseudo-audio channel(s) along with audiovisual content data.
- the 4D content may be stored in meta-data associated with the media content item as a 4D content data set timestamped with one or more synchronization timestamps.
- the 4D content playback device may receive 4D content as a media stream.
- the received stream may include 4D content as one or more pseudo-audio signals of pseudo-audio channel(s) or as data values of a 4D content data set timestamped with one or more synchronization timestamps.
- one or more other audio channels that do not carry a pseudo-audio signal may be used to synchronize with separate audio-visual content. Since the separate audio-visual content contains the recording of the same audio as the audio recording containing the pseudo-channel, the non-pseudo audio channels are synchronized with the audio content of the separate audio-visual content. By virtue of the pseudo-channel being synchronized with the non-pseudo audio channel, the pseudo-audio channel containing 4D content is thereby synchronized with the separate audio-visual content.
- the 4D content playback device decodes the one or more pseudo-audio signals into a 4D content data set, in an embodiment.
- the decoded 4D content data set may be further transformed using, as an example, a Bezier curve model.
- the decoding may further generate synchronization timestamps to be associated with the decoded 4D content data value(s).
- the generated synchronization timestamps correspond to the synchronization timestamps of simultaneously captured audio-visual content.
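The patent does not specify the decoding method; one common way to decode a pseudo-audio signal back into angle values is to measure the signal power at each reserved 50 Hz bin and pick the strongest. The Goertzel algorithm does this without a full FFT. A sketch for the roll band, assuming a 44.1 kHz sample rate and the sample encoding equations given earlier:

```python
import math

def goertzel_power(samples, freq_hz, sample_rate=44100):
    """Signal power at a single target frequency (Goertzel algorithm)."""
    w = 2 * math.pi * freq_hz / sample_rate
    coeff = 2 * math.cos(w)
    s_prev = s_prev2 = 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev2 ** 2 + s_prev ** 2 - coeff * s_prev * s_prev2

def decode_roll(samples, sample_rate=44100):
    """Find the strongest 50 Hz bin in 550-5,050 Hz and map it back to
    an integer roll angle, inverting F_roll = (a + 46) * 50 + 500."""
    return max(range(-45, 46),
               key=lambda a: goertzel_power(samples,
                                            (a + 46) * 50 + 500,
                                            sample_rate))
```

Running `decode_roll` on each 0.1 s window of the pseudo-audio channel reconstructs the roll-angle time series; the pitch band decodes the same way with its 5,500 Hz offset.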
- the initiation of a playback of captured audio-visual content may cause the transmission of 4D content to a 4D reproduction device.
- the 4D content data value(s), corresponding to the particular synchronization timestamp in the 4D content data set may be transmitted to the 4D reproduction device.
- the decoding may generate 4D content data set arranged in time series of a regular interval.
- the 4D playback device may also receive the 4D content data set in time series.
- One or more data values in the 4D content data set that correspond to the start of the captured audio-visual content may be indicated in the decoded or received 4D content data set.
- the initiation of a playback of captured audio-visual content may cause the transmission of 4D content to a 4D reproduction device.
- the 4D content data value(s) that are indicated to correspond to the start of the audio- visual content may be transmitted to a 4D reproduction device.
- the 4D content playback device determines based on a time period that has lapsed from the start of the playback (or the last transmission of 4D content data value(s)), whether to transmit the next 4D content data values.
- the 4D playback device may transmit the 4D content data value(s) to the 4D reproduction device at a regular interval, as the appropriate time period for the interval lapses.
- 4D content data value(s) may be further transformed before transmission to a 4D reproduction device.
- Each 4D reproduction device may require a particular format of 4D content values.
- either all 4D content data values in the 4D content data set are transformed ahead of the transmission, or each 4D content data value is transformed before its transmission.
- Upon receipt of a 4D content data value, a 4D reproduction device reproduces the 4D effect corresponding to the 4D content data value.
- a non-limiting example of 4D reproduction device is a motion simulating chair.
- a motion simulating chair upon the receipt of a roll and/or pitch angle value, performs the motion to adjust the chair to the corresponding roll/pitch angle thereby imitating the motion depicted in the corresponding audio-visual content.
- Another example may be a variable speed fan to emulate the air flow that is experienced at different velocities. As a new velocity value is received, the fan adjusts the speed of rotation to simulate the headwind corresponding to the velocity.
- the techniques described herein are implemented by one or more special-purpose computing devices.
- the special-purpose computing devices may be hard-wired to perform the techniques, or may include digital electronic devices such as one or more application-specific integrated circuits (ASICs) or field programmable gate arrays (FPGAs) that are persistently programmed to perform the techniques, or may include one or more general purpose hardware processors programmed to perform the techniques pursuant to program instructions in firmware, memory, other storage, or a combination.
- Such special-purpose computing devices may also combine custom hard-wired logic, ASICs, or FPGAs with custom programming to accomplish the techniques.
- the special-purpose computing devices may be desktop computer systems, portable computer systems, handheld devices, networking devices or any other device that incorporates hard-wired and/or program logic to implement the techniques.
- FIG. 5 is a block diagram that illustrates a computer system 500 upon which an embodiment of the invention may be implemented.
- Computer system 500 includes a bus 502 or other communication mechanism for communicating information, and a hardware processor 504 coupled with bus 502 for processing information.
- Hardware processor 504 may be, for example, a general purpose microprocessor.
- Computer system 500 also includes a main memory 506, such as a random access memory (RAM) or other dynamic storage device, coupled to bus 502 for storing information and instructions to be executed by processor 504.
- Main memory 506 also may be used for storing temporary variables or other intermediate information during execution of instructions to be executed by processor 504.
- Such instructions when stored in non-transitory storage media accessible to processor 504, render computer system 500 into a special-purpose machine that is customized to perform the operations specified in the instructions.
- Computer system 500 further includes a read only memory (ROM) 508 or other static storage device coupled to bus 502 for storing static information and instructions for processor 504.
- a storage device 510, such as a magnetic disk or optical disk, is provided and coupled to bus 502 for storing information and instructions.
- Computer system 500 may be coupled via bus 502 to a display 512, such as a cathode ray tube (CRT), for displaying information to a computer user.
- An input device 514 is coupled to bus 502 for communicating information and command selections to processor 504.
- another type of user input device is cursor control 516, such as a mouse, a trackball, or cursor direction keys for communicating direction information and command selections to processor 504 and for controlling cursor movement on display 512.
- This input device typically has two degrees of freedom in two axes, a first axis (e.g., x) and a second axis (e.g., y), that allows the device to specify positions in a plane.
- Computer system 500 may implement the techniques described herein using customized hard-wired logic, one or more ASICs or FPGAs, firmware and/or program logic which in combination with the computer system causes or programs computer system 500 to be a special-purpose machine. According to one embodiment, the techniques herein are performed by computer system 500 in response to processor 504 executing one or more sequences of one or more instructions contained in main memory 506. Such instructions may be read into main memory 506 from another storage medium, such as storage device 510. Execution of the sequences of instructions contained in main memory 506 causes processor 504 to perform the process steps described herein. In alternative embodiments, hard-wired circuitry may be used in place of or in combination with software instructions.
- Non-volatile media includes, for example, optical or magnetic disks, such as storage device 510.
- Volatile media includes dynamic memory, such as main memory 506.
- storage media include, for example, a floppy disk, a flexible disk, a hard disk, a solid state drive, magnetic tape, or any other magnetic data storage medium, a CD-ROM, any other optical data storage medium, any physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, NVRAM, or any other memory chip or cartridge.
- Storage media is distinct from but may be used in conjunction with transmission media.
- Transmission media participates in transferring information between storage media.
- For example, transmission media includes coaxial cables, copper wire and fiber optics, including the wires that comprise bus 502.
- Transmission media can also take the form of acoustic or light waves, such as those generated during radio-wave and infra-red data communications.
- Various forms of media may be involved in carrying one or more sequences of one or more instructions to processor 504 for execution.
- For example, the instructions may initially be carried on a magnetic disk or solid state drive of a remote computer.
- The remote computer can load the instructions into its dynamic memory and send the instructions over a telephone line using a modem.
- A modem local to computer system 500 can receive the data on the telephone line and use an infra-red transmitter to convert the data to an infra-red signal.
- An infra-red detector can receive the data carried in the infra-red signal and appropriate circuitry can place the data on bus 502.
- Bus 502 carries the data to main memory 506, from which processor 504 retrieves and executes the instructions.
- The instructions received by main memory 506 may optionally be stored on storage device 510 either before or after execution by processor 504.
- Computer system 500 also includes a communication interface 518 coupled to bus 502.
- Communication interface 518 provides a two-way data communication coupling to a network link 520 that is connected to a local network 522.
- For example, communication interface 518 may be an integrated services digital network (ISDN) card, cable modem, satellite modem, or a modem to provide a data communication connection to a corresponding type of telephone line.
- As another example, communication interface 518 may be a local area network (LAN) card to provide a data communication connection to a compatible LAN.
- Wireless links may also be implemented.
- In any such implementation, communication interface 518 sends and receives electrical, electromagnetic or optical signals that carry digital data streams representing various types of information.
- Network link 520 typically provides data communication through one or more networks to other data devices.
- For example, network link 520 may provide a connection through local network 522 to a host computer 524 or to data equipment operated by an Internet Service Provider (ISP) 526.
- ISP 526 in turn provides data communication services through the world wide packet data communication network now commonly referred to as the "Internet" 528.
- Internet 528 uses electrical, electromagnetic or optical signals that carry digital data streams.
- The signals through the various networks and the signals on network link 520 and through communication interface 518, which carry the digital data to and from computer system 500, are example forms of transmission media.
- Computer system 500 can send messages and receive data, including program code, through the network(s), network link 520 and communication interface 518.
- For example, a server 530 might transmit a requested code for an application program through Internet 528, ISP 526, local network 522 and communication interface 518.
- The received code may be executed by processor 504 as it is received, and/or stored in storage device 510 or other non-volatile storage for later execution.
Abstract
Techniques are described for synchronous capture and playback of 4D content. A 4D synchronous capture system retrieves sensor readings that describe environmental parameters at the point of time at which the sensor readings were captured. Based on the sensor readings, the system generates 4D content data, which digitally represents the environmental parameters captured by the sensor readings. The system synchronizes the 4D content data with concurrently captured audio-visual content data, causing the concurrently captured 4D content data and the audio-visual content data to be associated with the same timing information. Based on the time synchronization, the 4D content data reproduces the environmental parameters in synch with the playback of the audio-visual content, in an embodiment.
Description
SYNCHRONOUS CAPTURE AND PLAYBACK OF 4D CONTENT
FIELD OF THE TECHNOLOGY
[0001] The present invention relates to the field of electronic capture of content, in particular to synchronous capture and playback of 4D content.
BACKGROUND
[0002] The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.
[0003] Technological advances have been made in realistic audio-visual reproduction of the world in a digital domain. Surround sound, virtual reality (VR) and augmented reality (AR) are examples of current technologies that enable users to transport themselves visually and audibly into a digital world of reality.
[0004] However, rather than capturing the audio-visual content for such a reproduction, in many instances program code is developed which, when executed by one or more computing devices, generates the audio-visual content such as VR or AR. For example, game developers develop specific programs that project VR content into the VR goggles based on the movement of the person (as recorded by handheld joysticks or motion-detection cameras, as examples). Similarly, surround sound audio is generated by software tools analyzing, processing and generating different audio channels (for voice, ambiance, etc.) to be outputted to different speakers to create the surround sound effect.
[0005] As an alternative to programmatically generated audio-visual content, highly specialized and expensive equipment can be used to capture such audio-visual content in real time. For example, to shoot IMAX or 3D movies, specialized and very expensive cameras are used, and in many cases significant and specialized post-processing still has to occur for the captured content to be played in IMAX theaters or in movie theaters with 3D glasses.
[0006] Even with the specialized equipment, the reproduction of reality has been generally limited to audio-visual content. The "audio-visual content" term refers herein to audio content
and/or video content and therefore, extends to media content that is limited only to audio or only to visual representations (such as images or video content) or both.
[0007] The problem is further exacerbated for 4D effects, the non-audio-visual effects that can be apprehended by a person, such as multi-directional movement, speed, acceleration, smell and touch, particularly because of the challenges in their reproduction. In the rare case that a 4D effect can indeed be reproduced, the 4D effect is programmatically generated. For example, using specialized equipment such as a 4D chair (such as a motion-simulating chair), a gamer may experience movement because the game's program controls the coupled 4D chair. The game program's complex logic directs the chair to move according to the program's calculations based on the user's game inputs and other variables.
[0008] The lack of real-time capture of 4D effects hampers the reproduction of the full experience alongside audio-visual content. It is indeed exciting to audio-visually re-live a parachute jump, the world's fastest roller coaster ride, or a loop around the race track, but the experience is substantially incomplete without 4D effects, such as motion.
BRIEF DESCRIPTION OF THE DRAWINGS
[0009] In the drawings of certain embodiments in which like reference numerals refer to corresponding parts throughout the figures:
[0010] FIGS. 1A-1D are block diagrams that depict 4D Synchronous Capture System (4D-SCS) 100, in one or more embodiments.
[0011] FIG. 2 is a flow diagram that depicts a process for generating real-time 4D content and synchronizing the 4D content with captured audio-visual content, in an embodiment.
[0012] FIG. 3 is a graph that depicts 4D content data generated over time, in an embodiment.
[0013] FIG. 4A is a graph that depicts a pseudo-audio signal for roll and pitch angles, in an embodiment.
[0014] FIG. 4B is a graph that depicts a frequency spectrum of a pseudo-audio signal for roll and pitch angles, in an embodiment.
[0015] FIG. 5 is a block diagram that illustrates a computer system upon which an embodiment of the invention may be implemented.
DETAILED DESCRIPTION
[0016] In the following description, for the purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will
be apparent, however, that the present invention may be practiced without these specific details. In other instances, structures and devices are shown in block diagram form in order to avoid unnecessarily obscuring the present invention.
GENERAL OVERVIEW
[0017] The approaches herein describe synchronized capture and playback of 4D content. The term "4D content" refers herein to non-visual and non-audio content that contains the information necessary to reproduce a 4D effect such as motion, speed or acceleration. In an embodiment, a 4D synchronous capture system (4D-SCS) comprises one or more sensors. The system continuously captures and records the sensor data. The generation of 4D content data by the system is performed concurrently with the capture of video content and/or audio content by a media capturing device internal or external to the system (such as a non-professional, semi-professional, or professional (video) camera system). The 4D-SCS synchronizes the generated 4D content with the captured audio-visual content by associating each 4D content data value with synchronously captured audio-visual content elements (image frames, audio signal). FIGS. 1A-1D are block diagrams that depict an example of a 4D Synchronous Capture System (4D-SCS), 4D-SCS 100, in one or more embodiments.
[0018] In an embodiment, the 4D-SCS uses one or more "synchronization timestamps" to synchronize the 4D content being captured with the audio-visual content that is simultaneously captured, and may record the synchronization timestamps along with the corresponding 4D content data values in storage of the system. The "synchronization timestamp" term refers herein to a reference timestamp from a sequence of reference timestamps generated by the system at a regular interval. A non-limiting example of a synchronization timestamp is a timecode. The system may generate a synchronization timestamp for timestamping at least the initial 4D content data values in a time series of a 4D content data set. The initial audio-visual content elements, which are captured at the same time as the 4D content, are also associated with the same synchronization timestamp. Thereby, the nearly simultaneously captured 4D content and audio-visual content are associated with the same timing information. The synchronization of 4D content with audio-visual content using synchronization timestamps may be performed in postproduction (after the content is fully captured).
[0019] As an example of synchronous capture of 4D content using synchronization timestamps, FIG. 1B depicts external media capturing device 150, which is configured to receive an external synchronization signal that includes synchronization timestamps at a regular interval. The external device may record the external synchronization signal within the concurrently captured audio-visual content and may store the audio-visual content in a media content item on its internal storage. Accordingly, the 4D content captured by 4D-SCS 100 and the audio-visual content captured by media capturing device 150 are associated with the same timing information, the synchronization signal.
[0020] In another embodiment, the 4D-SCS encodes 4D content into an audio signal in an unused audio channel in real time. The "pseudo-audio channel" term refers herein to an audio channel of an audio recording into which 4D content is encoded. The "pseudo-audio signal" term refers herein to the encoded audio signal in a pseudo-audio channel that contains captured 4D content.
[0021] FIGS. 1C and 1D depict 4D-SCS 100 generating a pseudo-audio signal, in one or more embodiments. In such embodiments, as depicted in FIGS. 1C and 1D, when one or more new sensor readings of 4D content are received, 4D-SCS 100 generates the corresponding pseudo-audio signal according to the techniques described herein. 4D-SCS 100 may directly transmit the pseudo-audio signal into an audio input of external media capturing device 150. Simultaneously with pseudo-audio signal generation, external media capturing device 150 captures audio-visual content using one or more other audio channels of the audio recording for the audio content. Because the media capturing device is recording the audio-visual content's audio content in parallel in the one or more other channels, the concurrently captured 4D content in the pseudo-audio channel is automatically synchronized with the audio-visual content.
[0022] In one embodiment, as depicted in the example system of FIG. 1D, another channel of the same audio recording that includes a pseudo-audio channel is used to capture sound from a microphone of the 4D-SCS (such as microphone 132 of 4D-SCS 100). Even if the audio-visual content is captured separately by an external media capturing device, the pseudo-audio channel captured by the 4D-SCS can be readily synchronized (e.g., in postproduction) with the external device's content using the audio channels that carry the recorded sound. This is possible because the sound captured by the external media capturing device (such as external media device 150) is the same as the sound captured by the 4D-SCS's microphone in the audio recording that also contains the already synchronized pseudo-audio channel.
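The postproduction alignment described above relies on both recordings containing the same microphone audio. As a minimal sketch (the patent does not prescribe an alignment algorithm, and all names here are hypothetical), the sample offset between two recordings can be estimated by brute-force cross-correlation of the shared audio:

```python
def align_offset(reference, clip):
    """Estimate where `clip` (a shorter excerpt of the shared microphone
    audio) begins inside `reference`, by brute-force cross-correlation.

    O(n*m) for clarity; a real system would use FFT-based correlation.
    Returns the lag (in samples) with the highest correlation score.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(len(reference) - len(clip) + 1):
        score = sum(reference[lag + i] * clip[i] for i in range(len(clip)))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

Once the lag is known, the pseudo-audio channel (already aligned with the 4D-SCS's own microphone channel) can be shifted by the same lag to line up with the externally captured audio-visual content.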
SYSTEM OVERVIEW
[0023] FIG. 1 is a block diagram that depicts 4D Synchronous Capture System (4D-SCS) 100, in an embodiment. 4D-SCS 100 comprises one or more of processing unit(s) 105, sensor unit(s) 120 and storage unit 140. Processing unit(s) 105 may periodically request sensor data from sensor unit(s) 120, or sensor unit(s) 120 may be configured by processing unit(s) 105 to periodically provide processing unit(s) 105 with sensor data. The frequency used for receiving sensor data from sensor unit(s) 120 may vary based on the processing capabilities of 4D playback system(s). For example, a 10 Hz frequency may be used to retrieve sensor data (every 0.1 seconds) because the playback device of 4D content may not be sensitive enough to adjust to a new 4D content data value faster than every 0.1 seconds. The sensor read frequency also determines the interval of the time series sensor data when sensor data is processed and stored in storage unit 140, in an embodiment.
[0024] Processing unit(s) 105 may process raw sensor data readings from sensor unit(s) 120 to generate 4D content. Processing unit(s) 105 store the 4D content in 4D content data store 142 of storage unit 140, and/or encode the 4D content into a pseudo-audio channel of media content items 144 according to techniques described herein, and/or transmit it directly to external media capturing device 150. For example, a GPS receiver may provide raw coordinates of 4D-SCS 100's location. Processing unit(s) 105 may use previously received coordinates and the sensor read frequency to determine the velocity, and store/encode 4D content for the velocity instead of storing/encoding the raw coordinates.
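The velocity derivation described above can be sketched as follows. The haversine great-circle distance and all names are illustrative assumptions; the patent does not prescribe a particular computation:

```python
import math

def gps_velocity(lat1, lon1, lat2, lon2, read_interval_s):
    """Estimate velocity (m/s) from two consecutive GPS fixes.

    Coordinates are in degrees; `read_interval_s` is the sensor read
    interval (e.g. 0.1 s at a 10 Hz sensor read frequency). Distance is
    computed with the haversine great-circle formula.
    """
    r_earth = 6_371_000.0  # mean Earth radius in meters
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    distance = 2 * r_earth * math.asin(math.sqrt(a))
    return distance / read_interval_s
```

For example, two fixes 0.1 s apart that differ by 0.00001° of latitude (about 1.1 m along a meridian) yield a speed of roughly 11 m/s.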
[0025] Additionally, 4D-SCS 100 may include media capturing unit 130. The media capturing unit captures audio-visual content using camera 134 and microphone 132 and stores the audio-visual content in media content items 144 on storage unit 140. In an embodiment, concurrently with the capture of the audio content in one or more audio channels of an audio recording, one or more other audio channels (pseudo-audio channel(s)) of the audio recording are used to record the 4D content. The audio channels, including the pseudo-audio channel(s), may be stored in media content items 144 on storage unit 140 or may be streamed to an external playback device or an external media capturing device such as external media capturing device 150.
[0026] External media capturing device 150 captures audio-visual content using its own camera and microphone, among other media capturing components. Media capturing device 150 may be communicatively coupled with 4D-SCS 100 to receive an audio stream from 4D-SCS 100, the audio stream comprising one or more audio and pseudo-audio channels streamed by 4D-SCS 100. Thus, external media capturing device 150 captures its own audio-visual content as well as the audio content streamed by the 4D-SCS that includes 4D content. External media capturing device 150 may store both audio-visual content streams in the same one or more media content items on external media capturing device 150 or elsewhere.
[0027] Additionally or alternatively, external media device 150 is communicatively coupled with 4D-SCS 100 to receive a synchronization signal generated by 4D-SCS 100. Based on the synchronization signal, external media device 150 captures the synchronization timestamps generated by 4D-SCS 100, the same one or more synchronization timestamps that are used in timestamping 4D content in 4D content data store 142. Accordingly, the 4D content in 4D content data store 142 can be synchronized with the audio-visual content captured by external media capturing device 150.
GENERATING REAL-TIME 4D CONTENT
[0028] FIG. 2 is a flow diagram that depicts a process for generating real-time 4D content and synchronizing the 4D content with captured audio-visual content, in an embodiment. At step 205, a 4D-SCS initiates the 4D content capture. The initiation may be automatically triggered by the initiation of audio-visual content capture. For example, a media device (internal or external to the 4D-SCS) receives an input to initiate recording of audio-visual content. The media device requests the 4D-SCS to initiate the capture of 4D content in synch with the audio-visual content. Alternatively, the 4D-SCS receives a user input to initiate the capture of 4D content in synch with separately initiated capture of audio-visual content, or the 4D-SCS initiates the capture of 4D content as well as the audio-visual content.
[0029] In an embodiment, one or more sensor unit readings are initialized before the capture of 4D content. The sensor reading(s) at the time of the initialization may be denoted as the baseline sensor reading(s). At least a portion of 4D content may be generated based on the difference between the next sensor reading(s) and the respective baseline sensor reading(s).
[0030] For example, the initial reading(s) from the accelerometer and/or gyroscope before the 4D content capture initiation are denoted as the initial (zero) position of the 4D-SCS. The subsequent roll, pitch and yaw angles are calculated relative to the initial sensor reading(s) of the initial position of the 4D-SCS. Similarly, for a GPS receiver in such an embodiment, the coordinates received during the initialization are denoted as the initial location coordinates of the 4D-SCS. When, for example, the first 4D content velocity is calculated, the initial location is used as a starting point to determine the distance covered until the next reading of the GPS coordinates is received, and the velocity is calculated based on the determined distance.
[0031] At step 210, the 4D-SCS retrieves one or more readings from sensor unit(s), in response to the initiation of the 4D content capture. A sensor reading may describe environmental parameters at a point in time when the sensor reading is captured, in an embodiment. Non-limiting examples of environmental parameters include an angular velocity, a directional acceleration, an angular position, a location coordinate, an ambient temperature and humidity. The point of time of the capture can be determined based on the sensor read frequency or by associating a synchronization timestamp indicating the point of time.
[0032] In one embodiment, the sensor reading(s) are further transformed to generate 4D content. For example, angular velocities around each axis, collected by a gyroscope, and/or the directional accelerations along each axis, collected by an accelerometer, are used to calculate the current angle of rotation around each axis. Table 1 below depicts examples of raw sensor data from a gyroscope and an accelerometer that are used to calculate the corresponding angles of rotation, in an embodiment.
Table 1 - Raw Sensor Data and Generated 4D Content
[0033] In Table 1, gx, gy and gz are example raw sensor readings from the gyroscope; ax, ay, az are example raw sensor readings from the accelerometer. Each row of the readings may correspond to a particular time at which both the gyroscope readings and the accelerometer readings are retrieved.
[0034] In an embodiment, at step 210, the 4D-SCS retrieves sensor data at a predetermined interval (using a pre-determined sensor read frequency), thus generating a time series of sensor data. For example, the time interval between the sensor readings in the first row and the second row of Table 1 may be the same as the time interval between the sensor readings in the second row and the third row.
[0035] For each cycle of reading of sensor data, the corresponding 4D content data is generated, at step 215. For example, in Table 1, the gx, gy, gz, ax, ay and az sensor data is used to generate 4D content data for pitch angle 310 and roll angle 320. Since each row of sensor readings corresponds to a particular time, each generated 4D content data value corresponds to the same respective time.
[0036] FIG. 3 is a graph that depicts 4D content data generated over time, in an embodiment. Pitch angle 310 and roll angle 320 are generated based on periodic sensor readings at a 10 Hz sensor read frequency. Accordingly, pitch angle 310's and roll angle 320's data points on the vertical axis of FIG. 3 are spaced 1/10 of a second apart on the horizontal time axis. As new raw sensor data is retrieved, new pitch angle 310 and roll angle 320 points may be generated in real time.
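The transformation of gyroscope and accelerometer readings into pitch and roll angles, as in Table 1 and FIG. 3, can be done in several ways. One common approach, shown here as an illustrative assumption rather than the patent's specified method (all names are hypothetical), is a complementary filter that integrates the gyroscope's angular rates and corrects drift with the accelerometer's gravity estimate:

```python
import math

def accel_angles(ax, ay, az):
    """Roll and pitch (degrees) implied by the gravity vector alone."""
    roll = math.degrees(math.atan2(ay, az))
    pitch = math.degrees(math.atan2(-ax, math.hypot(ay, az)))
    return roll, pitch

def complementary_update(roll, pitch, gx, gy, ax, ay, az,
                         dt=0.1, alpha=0.98):
    """One update per sensor read interval `dt` (0.1 s at 10 Hz).

    gx, gy are gyroscope rates in deg/s; ax, ay, az are accelerometer
    readings in g. The gyroscope term tracks fast motion; the small
    accelerometer term slowly pulls the angles back toward the gravity
    estimate to cancel gyroscope drift.
    """
    acc_roll, acc_pitch = accel_angles(ax, ay, az)
    roll = alpha * (roll + gx * dt) + (1 - alpha) * acc_roll
    pitch = alpha * (pitch + gy * dt) + (1 - alpha) * acc_pitch
    return roll, pitch
```

Calling `complementary_update` once per 0.1 s sensor read cycle yields the kind of pitch and roll time series plotted in FIG. 3.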
[0037] In another embodiment, at step 215, the 4D content data is generated without transformation of retrieved raw sensor data. For example, temperature sensor readings may represent 4D content data for temperature without any need for further transformation and can be readily recorded or encoded as 4D content data.
SYNCHRONIZING 4D CONTENT WITH AUDIO-VISUAL CONTENT
[0038] Continuing with FIG. 2, in one or more of steps 220-260, the 4D-SCS synchronizes the captured 4D content with concurrently captured audio-visual content, according to embodiments. The synchronization causes the generated 4D content data and the concurrently captured audio-visual content data to be associated with the same timing information, i.e. data values of the 4D content are time-wise aligned with concurrent elements of the audio-visual content.
ENCODING 4D CONTENT
[0039] At step 220, the 4D-SCS encodes the generated 4D content into a real-time signal that is made part of the concurrently captured audio-visual content. The 4D-SCS may transform the 4D content data into one or more pseudo-audio signals recorded into one or more audio channels (pseudo-audio channel(s)) of an audio recording. In an embodiment, the initiation of 4D-SCS capture may cause the initiation of capture of an audio signal in at least one other audio channel of the same audio recording. For example, for a stereo audio recording that contains a right channel and a left channel, the left channel may be used for an audio signal recording sound through a microphone while the right channel may be used for a pseudo-audio signal, or vice versa.
[0040] In one embodiment, 4D content data may be encoded as an audio signal of different frequencies that correspond to different values of 4D content data. For a generated 4D content data value, the 4D-SCS calculates the corresponding frequency value and generates the pseudo-audio signal of the corresponding frequency.
[0041] In an embodiment, 4D content data of multiple different sensors is encoded simultaneously into a pseudo-audio signal. The audio spectrum may be divided into frequency ranges and each of the ranges may be used for a particular type of 4D content.
[0042] As a non-limiting example of encoding of 4D content into a pseudo-audio signal, roll angle data and pitch angle data may be encoded into a pseudo-audio signal. Similar techniques for encoding can be used for other 4D content such as velocity, temperature and humidity. For example, a low frequency spectrum of sound, from 550 Hz to 5,050 Hz, is used to encode roll angles. In the selected sound spectrum, a 50 Hz band is reserved for each integer value of the angle to lower the risk of noise distortion in transmission. Accordingly, a frequency for a value of a roll angle may be calculated using the following sample equation:
[0043] F_roll = (a_roll + 46°) * 50 Hz + 500 Hz
[0044] in which F_roll is the frequency of a pseudo-audio signal to be generated to encode a roll angle value a_roll of 4D content, 46° is a constant offset to transform the roll angles into positive integer values from 1 to 91, 50 Hz is the reserved spectrum for each integer angle value, and 500 Hz is the offset to ensure the pseudo-audio signal is in the audible audio spectrum range that starts from 550 Hz.
[0045] Similarly, a high frequency spectrum of sound, from 5,550 Hz to 10,050 Hz, may be used to encode pitch angles. Accordingly, a frequency for a value of a pitch angle may be calculated using the following sample equation:
[0046] F_pitch = (a_pitch + 46°) * 50 Hz + 5,500 Hz
[0047] in which F_pitch is the frequency of a pseudo-audio signal to be generated to encode a pitch angle value a_pitch of 4D content, 46° is an offset constant to transform the pitch angles into positive integer values from 1 to 91, 50 Hz is the reserved spectrum for each integer angle value, and 5,500 Hz is the offset to ensure the pseudo-audio signal is in the high range of the audible audio spectrum and thus does not overlap with the pseudo-audio signal spectrum used to encode roll angles.
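The two encoding mappings above can be sketched directly in code. The tone-synthesis step and all names are illustrative assumptions (the patent does not specify a waveform):

```python
import math

def roll_to_freq(angle_deg):
    """Encode a roll angle in [-45°, +45°] into the 550-5,050 Hz band."""
    a = round(angle_deg)
    if not -45 <= a <= 45:
        raise ValueError("roll angle out of encodable range")
    return (a + 46) * 50 + 500  # Hz

def pitch_to_freq(angle_deg):
    """Encode a pitch angle in [-45°, +45°] into the 5,550-10,050 Hz band."""
    a = round(angle_deg)
    if not -45 <= a <= 45:
        raise ValueError("pitch angle out of encodable range")
    return (a + 46) * 50 + 5500  # Hz

def pseudo_audio_chunk(roll_deg, pitch_deg, sample_rate=44100,
                       duration=0.1):
    """One 0.1 s pseudo-audio chunk: the sum of the two encoding tones,
    covering one sensor read interval at a 10 Hz read frequency."""
    n = int(sample_rate * duration)
    fr, fp = roll_to_freq(roll_deg), pitch_to_freq(pitch_deg)
    return [0.5 * math.sin(2 * math.pi * fr * i / sample_rate)
            + 0.5 * math.sin(2 * math.pi * fp * i / sample_rate)
            for i in range(n)]
```

Because the two bands do not overlap, both angles can be carried in a single pseudo-audio channel at once, as shown in FIGS. 4A and 4B.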
[0048] FIG. 4A is a graph that depicts a pseudo-audio signal for roll and pitch angles, in an embodiment. FIG. 4B is a graph that depicts a frequency spectrum of a pseudo-audio signal for roll and pitch angles, in an embodiment. In FIGS. 4A and 4B, due to the 10 Hz sensor read frequency, the frequency of the pseudo-audio signal at each 0.1 s corresponds to different roll angle and pitch angle values. In FIG. 4B, the higher frequency dotted lines correspond to the pitch angle values and the lower frequency dotted lines correspond to the roll angle values.
[0049] In an embodiment, 4D content data values are encoded into the pseudo-audio channel of an audio recording concurrently with an audio signal from a microphone being encoded into another audio channel of the same audio recording. Because of the concurrent capture into the same audio recording, the captured audio content from the microphone and the simultaneously captured 4D content are associated with the same timing information.
[0050] Continuing with FIG. 2, at step 225, the 4D-SCS stores the encoded pseudo-audio signal in a pseudo-audio channel of an audio content file, in an embodiment. The same audio content file also stores the concurrently captured audio-visual content.
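Storing a pseudo-audio channel alongside a conventional microphone channel in one audio content file, as in step 225, can be sketched with Python's standard wave module. The channel assignment (left = microphone, right = pseudo-audio) and all names are illustrative assumptions:

```python
import struct
import wave

def write_stereo(path, mic_samples, pseudo_samples, sample_rate=44100):
    """Interleave microphone audio (left) and the pseudo-audio signal
    (right) into one stereo 16-bit PCM WAV file.

    Both inputs are float samples in [-1.0, 1.0] of equal length.
    """
    assert len(mic_samples) == len(pseudo_samples)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(2)   # left = microphone, right = pseudo-audio
        wav.setsampwidth(2)   # 16-bit PCM
        wav.setframerate(sample_rate)
        frames = bytearray()
        for left, right in zip(mic_samples, pseudo_samples):
            frames += struct.pack(
                "<hh",
                int(max(-1.0, min(1.0, left)) * 32767),
                int(max(-1.0, min(1.0, right)) * 32767))
        wav.writeframes(bytes(frames))
```

Because both channels share one frame clock in the file, the 4D content in the right channel is inherently time-aligned with the microphone audio in the left channel.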
[0051] Additionally or alternatively, at step 230, the 4D-SCS transmits the encoded pseudo-audio signal in a pseudo-audio channel to an external media device. The external media device may have one or more audio inputs for an external microphone or another audio capturing device to capture audio for the video content being captured by the media device. Such an audio input can be readily used to receive audio recordings that include one or more pseudo-audio channels from the 4D-SCS.
[0052] Regardless of whether the pseudo-audio channel is aggregated with audio-visual content at the 4D-SCS or at the external media device, the aggregated audio-visual content is streamed, at step 235, in one embodiment. Multiple techniques exist for streaming audio-visual content over a network to one or more remote client devices. Since the pseudo-audio channel is seamlessly part of the captured audio-visual content, the same techniques can be used to stream the pseudo-audio channel along with the rest of the audio-visual content to the remote client devices. A 4D playback device may receive such a stream to synchronously play back the 4D content of the streamed one or more pseudo-audio channels as well as the audio-visual content streamed together.
SYNCHRONIZING 4D CONTENT DATA SET
[0053] Continuing with FIG. 2, at step 250, the 4D-SCS may generate one or more synchronization timestamps for associating with the 4D content generated at step 215, in an embodiment. Concurrently with capturing and generating 4D content, the 4D-SCS may generate a synchronization signal. The synchronization signal may comprise a sequence of synchronization timestamps, such as timecodes. One example of a synchronization signal is a Linear (or Longitudinal) Timecode (LTC) signal, which is an encoding of synchronization timestamps, such as timecodes, into an audio signal. Other examples of synchronization signals may be used, such as MIDI timecode (MTC), control track longitudinal (CTL), AES3, and Rewritable Consumer Timecode (RCTC or RC).
[0054] At step 255, the 4D-SCS stores the generated 4D content data value in a 4D content data set in association with a synchronization timestamp. The synchronization timestamp represents the time at which the data value is stored in the 4D content data set; or at which the 4D content data value was generated; or at which the sensor reading(s) for the data value were retrieved/captured.
[0055] The 4D-SCS may explicitly or implicitly associate the synchronization timestamp with a 4D content data value in the 4D content data set. The 4D-SCS may explicitly associate the synchronization timestamp with the data value by storing the synchronization timestamp in association with the data value. The 4D-SCS may implicitly associate the synchronization timestamp with the data value by storing the 4D content data set as a time series and associating the synchronization timestamp only with the initial 4D content data value(s). The rest of the data values in the 4D content data set can then be associated with their respective synchronization timestamps using the initial synchronization timestamp, the sensor read frequency, and the index of the data value in the time series.
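The implicit association described above reduces to simple arithmetic over the initial synchronization timestamp, the sensor read frequency, and the data value's index. A minimal sketch (the HH:MM:SS:FF timecode rendering is an illustrative assumption, not the patent's specified format):

```python
def implicit_timestamp(initial_ts_s, read_freq_hz, index):
    """Timestamp (seconds) of the index-th data value in a time series
    that stores a synchronization timestamp only for the initial value."""
    return initial_ts_s + index / read_freq_hz

def as_timecode(ts_s, fps=30):
    """Render seconds as an HH:MM:SS:FF timecode at `fps` frames/second."""
    total_frames = int(round(ts_s * fps))
    ff = total_frames % fps
    total_s = total_frames // fps
    return "%02d:%02d:%02d:%02d" % (
        total_s // 3600, (total_s % 3600) // 60, total_s % 60, ff)
```

For example, with an initial timestamp of 10.0 s and a 10 Hz sensor read frequency, the fifth data value is implicitly timestamped 10.5 s.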
[0056] While capturing 4D content, at step 260, the 4D-SCS may send a synchronization signal to a media capturing device for the media capturing device to synchronize the simultaneously captured audio-visual content with the respective synchronization timestamps of the synchronization signal. The media capturing device, upon receipt of a synchronization timestamp, associates the newly captured audio-visual content with the received synchronization timestamp, which has also already been associated with the 4D content captured at the same time. Thereby, the nearly simultaneously captured 4D content and audio-visual content are associated with the same timing information, i.e. data values of the 4D content are time-wise aligned with audio-visual content elements.
4D CONTENT PLAYBACK
[0057] In an embodiment, a 4D content playback device, comprising one or more components of a computer system, may receive 4D content in a media content item that is accessible as a file or in any other storage format. In the media content item, the 4D content may be encoded into one or more pseudo-audio signals of pseudo-audio channel(s) along with audio-visual content data. Alternatively, the 4D content may be stored in metadata associated with the media content item as a 4D content data set timestamped with one or more synchronization timestamps.
[0058] In another embodiment, the 4D content playback device may receive 4D content as a media stream. The received stream may include 4D content as one or more pseudo-audio signals of pseudo-audio channel(s) or as data values of a 4D content data set timestamped with one or more synchronization timestamps.
[0059] In an embodiment in which 4D content is represented as a pseudo-audio signal of a pseudo-audio channel, one or more other audio channels that do not carry a pseudo-audio signal may be used to synchronize with separate audio-visual content. Since the separate audio-visual content contains a recording of the same audio as the recording containing the pseudo-audio channel, the non-pseudo audio channels can be synchronized with the audio content of the separate audio-visual content. By virtue of the pseudo-audio channel being synchronized with the non-pseudo audio channel, the pseudo-audio channel containing 4D content is thereby synchronized with the separate audio-visual content.
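The specification does not prescribe how the shared audio is aligned; one common approach is cross-correlation of the two recordings' shared audio channel. The following sketch (Python with NumPy; all names are assumptions for illustration) estimates the sample offset between them:

```python
import numpy as np

def estimate_offset(reference_audio, other_audio):
    """Estimate how many samples `other_audio` lags `reference_audio`
    by locating the peak of their full cross-correlation."""
    corr = np.correlate(other_audio, reference_audio, mode="full")
    # Index len(reference_audio) - 1 corresponds to zero lag.
    return int(np.argmax(corr)) - (len(reference_audio) - 1)

# Toy check: the same click, delayed by 5 samples in the other recording.
ref = np.zeros(32)
ref[3] = 1.0
other = np.zeros(32)
other[8] = 1.0
lag = estimate_offset(ref, other)
```

Once the lag between the two shared audio tracks is known, the same shift aligns the pseudo-audio channel (and hence the 4D content) with the separate audio-visual content.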
[0060] The 4D content playback device decodes the one or more pseudo-audio signals into a 4D content data set, in an embodiment. To reduce vibration and other sources of noise, the decoded 4D content data set may be further transformed using, as an example, a Bézier curve model. The decoding may further generate synchronization timestamps to be associated with the decoded 4D content data value(s). The generated synchronization timestamps correspond to the synchronization timestamps of the simultaneously captured audio-visual content.
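The decoding scheme is not prescribed here; one plausible sketch, assuming the encoder mapped data values linearly onto a frequency band (consistent with the frequency-based encoding recited in the claims), locates the dominant frequency of each pseudo-audio frame and maps it back to a value. The band limits, value range, and frame parameters below are assumptions for illustration:

```python
import numpy as np

def decode_frame(frame, sample_rate, f_min, f_max, v_min, v_max):
    """Map the dominant frequency of one pseudo-audio frame back to a
    data value, assuming the encoder mapped [v_min, v_max] linearly
    onto the frequency band [f_min, f_max]."""
    spectrum = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    peak = freqs[int(np.argmax(spectrum))]
    return v_min + (peak - f_min) / (f_max - f_min) * (v_max - v_min)

# A 100 ms frame containing a 1750 Hz tone; with roll angles in
# [-45, 45] degrees mapped onto 1000-2000 Hz, this decodes to 22.5 deg.
sr, n = 8000, 800
t = np.arange(n) / sr
tone = np.sin(2 * np.pi * 1750.0 * t)
angle = decode_frame(tone, sr, f_min=1000.0, f_max=2000.0,
                     v_min=-45.0, v_max=45.0)
```

The frame index times the frame duration then yields the synchronization timestamp for each decoded value, relative to the start of the recording.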
[0061] The initiation of a playback of captured audio-visual content may cause the transmission of 4D content to a 4D reproduction device. At the same time as the audio-visual content corresponding to a particular synchronization timestamp is played, the 4D content data value(s) corresponding to that synchronization timestamp in the 4D content data set may be transmitted to the 4D reproduction device.
[0062] Alternatively or additionally, the decoding may generate a 4D content data set arranged as a time series at a regular interval. The 4D playback device may also receive the 4D content data set as a time series. One or more data values in the 4D content data set that correspond to the start of the captured audio-visual content may be indicated in the decoded or received 4D content data set.
[0063] The initiation of a playback of captured audio-visual content may cause the transmission of 4D content to a 4D reproduction device. The 4D content data value(s) that are indicated to correspond to the start of the audio-visual content may be transmitted to the 4D reproduction device. In one embodiment, the 4D content playback device determines, based on the time period that has elapsed since the start of the playback (or since the last transmission of 4D content data value(s)), whether to transmit the next 4D content data value(s). In an embodiment in which the 4D content data set is a time series, the 4D content playback device may transmit the 4D content data value(s) at a regular interval as the appropriate time period for each interval elapses.
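The elapsed-time decision above can be illustrated with a hypothetical helper (names and scheduling convention are assumptions): the value at index i of the time series is scheduled i intervals after playback start, and the playback device transmits every value whose scheduled time has arrived but has not yet been sent.

```python
def due_indices(elapsed_s, interval_s, already_sent):
    """Return the indices of time-series values whose scheduled playback
    time (index * interval_s after start) has arrived but which have not
    yet been transmitted."""
    due = int(elapsed_s // interval_s) + 1  # values 0..floor(t/dt) are due
    return list(range(already_sent, due))

# With a 0.1 s series interval, 0.35 s into playback values 0-3 are due;
# if values 0-1 were already sent, values 2 and 3 should be sent now.
pending = due_indices(0.35, 0.1, already_sent=2)
```

Computing the due set from elapsed time, rather than sleeping for fixed intervals, also lets the device catch up after a delayed iteration without drifting out of alignment.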
[0064] 4D content data value(s) may be further transformed before transmission to a 4D reproduction device. Each 4D reproduction device may require a particular format of 4D content data values. In such an embodiment, either all 4D content data values in the 4D content data set are transformed ahead of the transmission, or each 4D content data value is transformed before its transmission.
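As a hypothetical example of such a per-device transformation (the chair's units, range, and integer format below are assumptions, not specified), captured roll/pitch angles in degrees might be clamped to the chair's mechanical range and scaled to its integer command units:

```python
def to_chair_units(roll_deg, pitch_deg, max_deg=20.0, scale=1000):
    """Transform roll/pitch angles (degrees) into a hypothetical
    motion-chair format: integers in [-scale, scale], clamped to the
    chair's +/- max_deg mechanical range."""
    def convert(deg):
        clamped = max(-max_deg, min(max_deg, deg))
        return round(clamped / max_deg * scale)
    return convert(roll_deg), convert(pitch_deg)
```

Transforming the whole data set up front trades memory for lower per-value latency at playback time; transforming each value just before transmission does the opposite.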
[0065] Upon receipt of a 4D content data value, a 4D reproduction device reproduces the 4D effect corresponding to the 4D content data value. A non-limiting example of a 4D reproduction device is a motion-simulating chair. A motion-simulating chair, upon receipt of a roll and/or pitch angle value, performs the motion to adjust the chair to the corresponding roll/pitch angle, thereby imitating the motion depicted in the corresponding audio-visual content. Another example is a variable-speed fan that emulates the airflow experienced at different velocities. As a new velocity value is received, the fan adjusts its speed of rotation to simulate the headwind corresponding to that velocity.
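The fan example may be sketched as a simple linear mapping from captured velocity to fan output (the maximum velocity and the PWM duty-cycle interface are assumptions for illustration):

```python
def fan_duty_cycle(velocity_mps, v_max=30.0):
    """Map a captured velocity (m/s) to a PWM duty cycle in [0, 100]
    for a variable-speed fan; velocities past v_max saturate at full
    speed (illustrative linear mapping)."""
    return max(0.0, min(100.0, velocity_mps / v_max * 100.0))
```

A real device driver might additionally smooth successive values to avoid audible speed jumps, in the spirit of the Bézier-curve smoothing mentioned in paragraph [0060].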
HARDWARE OVERVIEW
[0066] According to one embodiment, the techniques described herein are implemented by one or more special-purpose computing devices. The special-purpose computing devices may be hard-wired to perform the techniques, or may include digital electronic devices such as one or more application-specific integrated circuits (ASICs) or field programmable gate arrays (FPGAs) that are persistently programmed to perform the techniques, or may include one or more general purpose hardware processors programmed to perform the techniques pursuant to program instructions in firmware, memory, other storage, or a combination. Such special-purpose computing devices may also combine custom hard-wired logic, ASICs, or FPGAs with custom programming to accomplish the techniques. The special-purpose computing devices may be desktop computer systems, portable computer systems, handheld devices, networking devices or any other device that incorporates hard-wired and/or program logic to implement the techniques.
[0067] For example, FIG. 5 is a block diagram that illustrates a computer system 500 upon which an embodiment of the invention may be implemented. Computer system 500 includes a bus 502 or other communication mechanism for communicating information, and a hardware processor 504 coupled with bus 502 for processing information. Hardware processor 504 may be, for example, a general purpose microprocessor.
[0068] Computer system 500 also includes a main memory 506, such as a random access memory (RAM) or other dynamic storage device, coupled to bus 502 for storing information and instructions to be executed by processor 504. Main memory 506 also may be used for storing temporary variables or other intermediate information during execution of instructions to be executed by processor 504. Such instructions, when stored in non-transitory storage media accessible to processor 504, render computer system 500 into a special-purpose machine that is customized to perform the operations specified in the instructions.
[0069] Computer system 500 further includes a read only memory (ROM) 508 or other static storage device coupled to bus 502 for storing static information and instructions for processor 504. A storage device 510, such as a magnetic disk or optical disk, is provided and coupled to bus 502 for storing information and instructions.
[0070] Computer system 500 may be coupled via bus 502 to a display 512, such as a cathode ray tube (CRT), for displaying information to a computer user. An input device 514, including alphanumeric and other keys, is coupled to bus 502 for communicating information and command selections to processor 504. Another type of user input device is cursor control 516, such as a mouse, a trackball, or cursor direction keys for communicating direction information and command selections to processor 504 and for controlling cursor movement on display 512. This input device typically has two degrees of freedom in two axes, a first axis (e.g., x) and a second axis (e.g., y), that allows the device to specify positions in a plane.
[0071] Computer system 500 may implement the techniques described herein using customized hard-wired logic, one or more ASICs or FPGAs, firmware and/or program logic which in combination with the computer system causes or programs computer system 500 to be a special-purpose machine. According to one embodiment, the techniques herein are performed by computer system 500 in response to processor 504 executing one or more sequences of one or more instructions contained in main memory 506. Such instructions may be read into main memory 506 from another storage medium, such as storage device 510. Execution of the sequences of instructions contained in main memory 506 causes processor 504 to perform the process steps described herein. In alternative embodiments, hard-wired circuitry may be used in place of or in combination with software instructions.
[0072] The term "storage media" as used herein refers to any non-transitory media that store data and/or instructions that cause a machine to operate in a specific fashion. Such storage media may comprise non-volatile media and/or volatile media. Non-volatile media includes, for example, optical or magnetic disks, such as storage device 510. Volatile media includes dynamic memory, such as main memory 506. Common forms of storage media include, for example, a floppy disk, a flexible disk, hard disk, solid state drive, magnetic tape, or any other magnetic data storage medium, a CD-ROM, any other optical data storage medium, any physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, NVRAM, any other memory chip or cartridge.
[0073] Storage media is distinct from but may be used in conjunction with transmission media. Transmission media participates in transferring information between storage media. For example, transmission media includes coaxial cables, copper wire and fiber optics, including the
wires that comprise bus 502. Transmission media can also take the form of acoustic or light waves, such as those generated during radio-wave and infra-red data communications.
[0074] Various forms of media may be involved in carrying one or more sequences of one or more instructions to processor 504 for execution. For example, the instructions may initially be carried on a magnetic disk or solid state drive of a remote computer. The remote computer can load the instructions into its dynamic memory and send the instructions over a telephone line using a modem. A modem local to computer system 500 can receive the data on the telephone line and use an infra-red transmitter to convert the data to an infra-red signal. An infra-red detector can receive the data carried in the infra-red signal and appropriate circuitry can place the data on bus 502. Bus 502 carries the data to main memory 506, from which processor 504 retrieves and executes the instructions. The instructions received by main memory 506 may optionally be stored on storage device 510 either before or after execution by processor 504.
[0075] Computer system 500 also includes a communication interface 518 coupled to bus 502. Communication interface 518 provides a two-way data communication coupling to a network link 520 that is connected to a local network 522. For example, communication interface 518 may be an integrated services digital network (ISDN) card, cable modem, satellite modem, or a modem to provide a data communication connection to a corresponding type of telephone line. As another example, communication interface 518 may be a local area network (LAN) card to provide a data communication connection to a compatible LAN. Wireless links may also be implemented. In any such implementation, communication interface 518 sends and receives electrical, electromagnetic or optical signals that carry digital data streams representing various types of information.
[0076] Network link 520 typically provides data communication through one or more networks to other data devices. For example, network link 520 may provide a connection through local network 522 to a host computer 524 or to data equipment operated by an Internet Service Provider (ISP) 526. ISP 526 in turn provides data communication services through the world wide packet data communication network now commonly referred to as the "Internet" 528. Local network 522 and Internet 528 both use electrical, electromagnetic or optical signals that carry digital data streams. The signals through the various networks and the signals on network link 520 and through communication interface 518, which carry the digital data to and from computer system 500, are example forms of transmission media.
[0077] Computer system 500 can send messages and receive data, including program code, through the network(s), network link 520 and communication interface 518. In the Internet example, a server 530 might transmit a requested code for an application program through Internet 528, ISP 526, local network 522 and communication interface 518.
[0078] The received code may be executed by processor 504 as it is received, and/or stored in storage device 510, or other non-volatile storage for later execution.
[0079] In the foregoing specification, embodiments of the invention have been described with reference to numerous specific details that may vary from implementation to
implementation. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense. The sole and exclusive indicator of the scope of the invention, and what is intended by the applicants to be the scope of the invention, is the literal and equivalent scope of the set of claims that issue from this application, in the specific form in which such claims issue, including any subsequent correction.
Claims
1. A computer-implemented method comprising:
retrieving one or more sensor readings that describe one or more environmental
parameters at a point of time at which the one or more sensor readings have been captured;
based on the one or more sensor readings, generating 4D content data, the 4D content data digitally representing the one or more environmental parameters captured by the one or more sensor readings;
synchronizing the 4D content data with concurrently captured audio-visual content data, the synchronizing causing the 4D content data and the audio-visual content data to be associated with the same timing information;
wherein a playback of the 4D content data causes a reproduction of the one or more environmental parameters at the point of time at which the one or more sensor readings have been captured.
2. The method of Claim 1, further comprising:
encoding the 4D content data into an audio signal and recording the audio signal into a first audio channel of a recording thereby generating a pseudo-audio signal in a pseudo-audio channel of the recording.
3. The method of Claim 2, wherein generating the pseudo-audio signal in the pseudo-audio channel comprises:
for each data value of the 4D content data, calculating a corresponding frequency value; and
generating a respective audio signal using the corresponding frequency value.
4. The method of Claim 3, wherein a frequency spectrum for the pseudo-audio signal is divided into a plurality of frequency spectrums, each of the plurality of frequency spectrums corresponding to a type of 4D content that is generated.
5. The method of Claim 2, further comprising:
providing the pseudo-audio channel as an input to a media capture device that captures the audio-visual content data;
causing the media capture device to record the pseudo-audio channel into a particular audio channel of the audio-visual content data, thereby causing the 4D content data and the audio-visual content data to be associated with the same timing information.
6. The method of Claim 2, further comprising:
concurrently with the generating of the pseudo-audio signal, capturing an audio signal from a microphone into a second audio channel of the recording;
using the audio signal and a particular audio signal from the audio-visual content data to synchronize the pseudo-audio signal of the pseudo-audio channel with the audiovisual content data thereby causing the 4D content data and the audio-visual content data to be associated with the same timing information.
7. The method of Claim 2, further comprising:
streaming the audio-visual content data with the 4D content data to a client device by streaming the recording that includes the pseudo-audio channel with the pseudo- audio signal.
8. The method of Claim 1, wherein synchronizing the 4D content data with the concurrently captured audio-visual content data comprises:
generating a synchronization signal by generating one or more synchronization
timestamps at a regular interval;
sending the synchronization signal to a media capturing device that is capturing the
audio-visual content data;
causing the media capturing device to associate at least one synchronization timestamp of the synchronization signal with an element of the audio-visual content data that is captured at a receipt of the at least one synchronization timestamp.
9. The method of Claim 8, further comprising:
associating a synchronization timestamp of the synchronization signal with one or more data values of the 4D content data which are stored in a data storage at the time the synchronization timestamp is generated, or which are generated at the time the synchronization timestamp is generated, or which are generated based on one or more particular sensor readings that are retrieved or captured at the time the synchronization timestamp is generated.
10. The method of Claim 1, wherein a frequency for retrieving one or more sensor readings is based on a device that reproduces the one or more environmental parameters.
11. The method of Claim 1, wherein generating the 4D content data comprises calculating a 4D content data value of the 4D content data based on at least one of: a frequency for retrieving the one or more sensor readings and one or more previous sensor readings, which are retrieved before the one or more sensor readings.
12. The method of Claim 1, wherein retrieving the one or more sensor readings and generating the 4D content data are initiated by a user input to start a capture of the audio-visual content data.
13. A system comprising one or more processing units and memory, the memory storing program instructions, which when executed by the one or more processing units, cause performance of a method as recited in any one of claims 1-12.
14. One or more non-transitory computer-readable media storing one or more programs which, when executed by one or more processors of a computer, cause performance of a method as recited in any one of claims 1-12.
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US201762483451P | 2017-04-09 | 2017-04-09 | |
| US62/483,451 | 2017-04-09 | ||
| US201815946570A | 2018-04-05 | 2018-04-05 | |
| US15/946,570 | 2018-04-05 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2018191169A1 true WO2018191169A1 (en) | 2018-10-18 |
Family
ID=63793555
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/US2018/026717 Ceased WO2018191169A1 (en) | 2017-04-09 | 2018-04-09 | Synchronous capture and playback of 4d content |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2018191169A1 (en) |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2012134016A1 (en) * | 2011-03-25 | 2012-10-04 | 주식회사 포리얼 | Apparatus for synchronizing special effects and system for playing 4d image |
| US20140205260A1 (en) * | 2013-01-24 | 2014-07-24 | Immersion Corporation | Haptic sensation recording and playback |
| US20150109220A1 (en) * | 2012-03-15 | 2015-04-23 | Nokia Corporation | Tactile apparatus link |
| US20160086637A1 (en) * | 2013-05-15 | 2016-03-24 | Cj 4Dplex Co., Ltd. | Method and system for providing 4d content production service and content production apparatus therefor |
| US20160231720A1 (en) * | 2015-02-06 | 2016-08-11 | Electronics And Telecommunications Research Institute | Controller for scent diffusing device and a server for supporting the controller |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US10679676B2 (en) | Automatic generation of video and directional audio from spherical content | |
| US12436728B2 (en) | Guided collaborative viewing of navigable image content | |
| AU2021250981B2 (en) | System and method for real-time synchronization of media content via multiple devices and speaker systems | |
| CN108200445B (en) | Virtual image studio system and method | |
| JP6664071B2 (en) | System and method for recording haptic data for use with multimedia data | |
| JP6585358B2 (en) | System and method for converting sensory data into haptic effects | |
| US12347194B2 (en) | Automated generation of haptic effects based on haptics data | |
| EP2457181A1 (en) | Improved audio/video methods and systems | |
| US20190130644A1 (en) | Provision of Virtual Reality Content | |
| US20150078723A1 (en) | Method and apparatus for smart video rendering | |
| WO2018191169A1 (en) | Synchronous capture and playback of 4d content | |
| CN207851764U (en) | Content reproduction apparatus, the processing system with the content reproduction apparatus | |
| KR20240100999A (en) | Method and device for operating a device for obtaining skeleton information | |
| CN118451475A (en) | Advanced multimedia system for analysis and accurate simulation of live events | |
| CN114071323A (en) | Control method and control device of TWS (two-way motion system) sound based on panoramic playing | |
| CN115567735A (en) | Transmission system dynamic display method and system adaptive to individual requirements | |
| HK40084126A (en) | System and method for real-time synchronization of media content via multiple devices and speaker systems | |
| CN107132921A (en) | Content reproduction apparatus, processing system and method with the content reproduction apparatus | |
| KR20110119166A (en) | Device for generating video file including sound source location related information, method and recording medium |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 18784569 Country of ref document: EP Kind code of ref document: A1 |
|
| NENP | Non-entry into the national phase |
Ref country code: DE |
|
| 122 | Ep: pct application non-entry in european phase |
Ref document number: 18784569 Country of ref document: EP Kind code of ref document: A1 |