WO2014076823A1 - Data processing device, data processing system, data processing program, and method for transmitting and receiving video data
- Publication number
- WO2014076823A1 (PCT/JP2012/079866)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- frame
- video data
- video
- deleted
- information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/234—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
- H04N21/2343—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
- H04N21/234381—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by altering the temporal resolution, e.g. decreasing the frame rate by frame skipping
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/233—Processing of audio elementary streams
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/24—Monitoring of processes or resources, e.g. monitoring of server load, available bandwidth, upstream requests
- H04N21/2402—Monitoring of the downstream path of the transmission network, e.g. bandwidth available
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/4402—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
- H04N21/440281—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by altering the temporal resolution, e.g. by frame skipping
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/60—Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
- H04N21/65—Transmission of management data between client and server
- H04N21/654—Transmission by server directed to the client
- H04N21/6547—Transmission by server directed to the client comprising parameters, e.g. for client setup
Definitions
- the present invention relates to video data transmission / reception technology.
- mobile terminal devices such as smartphones (hereinafter referred to as mobile terminals) that can be taken out and driven by a battery can shoot videos with a built-in camera and view them on the mobile terminal. Some have graphic performance that can be done.
- a mobile terminal has a video playback function and an Internet connection function, and can view the content of a video site on the Internet.
- mobile terminals that can view personal photos and videos stored in an auxiliary storage device of a personal computer (hereinafter referred to as a personal computer) in the home via the Internet are provided.
- as a first technique, there is a receiving apparatus that absorbs a difference in clock frequency between the encoder side and the decoder side and lip-syncs the audio frame output timing with the video frame output timing.
- the receiving apparatus includes decoding means for decoding a plurality of encoded video frames to which video time stamps based on a reference clock on the encoder side are sequentially attached, and a plurality of encoded audio frames to which audio time stamps based on the reference clock are sequentially attached.
- the receiving apparatus has storage means for storing a plurality of video frames and a plurality of audio frames obtained as a result of decoding the encoded video frame and the encoded audio frame by the decoding means.
- the receiving device also has a calculation means for calculating a time difference caused by a difference between the clock frequency of the reference clock on the encoder side and the clock frequency of the system time clock on the decoder side. Further, the receiving apparatus adjusts video frame output timing when sequentially outputting a plurality of video frames in units of frames based on an audio frame output timing when sequentially outputting a plurality of audio frames in units of frames according to a time difference.
- in a second technique, a lip sync control device includes first input means for inputting an encoded audio signal that includes an audio reference signal inserted at a predetermined timing, and second input means for inputting an encoded video signal that includes a video reference signal inserted at the same timing as the audio reference signal. The lip sync control device also includes first decoding means for decoding the audio signal input by the first input means, and second decoding means for decoding the video signal input by the second input means. The lip sync control device further includes time shift detection means for detecting the amount of time shift between the audio reference signal included in the audio signal decoded by the first decoding means and the video reference signal included in the video signal decoded by the second decoding means. Further, the lip sync control device includes control means for controlling whichever of the audio signal and the video signal is earlier in time to be output after being delayed by the detected amount of time shift, based on the detection result of the time shift detection means.
- a third technique is an image communication apparatus that communicates video and audio via a network and includes a control unit that controls the transmission rate from the application layer according to the reception state on the video and audio receiving side. The image communication apparatus further includes transmission means for transmitting video and audio at the transmission rate controlled by the control unit.
- in a fourth technique, the same time information is inserted as a digital watermark into each of the synchronized digital video and digital audio, and after synchronization has been lost, the time information is extracted from each of the digital video and the digital audio. Then, the time information of the digital video and the time information of the digital audio are compared, and at least one of the digital video and the digital audio is delayed so that they match.
- JP 2005-102192 A; JP 2008-131591 A; JP 10-164533 A; JP 4-362932 A; JP 2003-259314 A
- in each of the first to fifth techniques, video frames and audio frames are individually encoded, transmitted as separate streams, and reproduced at the transmission destination.
- in a network environment where the transmission rate is lower than the reproduction rate of the moving picture, the image quality at the time of reproduction deteriorates compared with the original image quality of the data before transmission.
- an object of the present invention is to improve the reproduction quality of moving image data transmitted from a storage destination via a network with a narrow bandwidth.
- the first information processing apparatus includes a storage unit, a monitoring unit, a deletion unit, a frame information generation unit, and a transmission unit.
- the storage unit stores first video data and audio data including synchronization information associated with the first video data.
- the monitoring unit monitors the state of the communication network.
- according to the monitoring result, the deletion unit generates second video data having a second frame rate lower than the first frame rate, which indicates the first number of frames per unit time of the first video data, by deleting consecutive video frames whose corresponding audio data level is equal to or less than a predetermined threshold.
- the frame information generation unit generates frame information regarding the deleted frame.
- the transmission unit transmits the second video data and frame information.
- the second information processing apparatus includes a reception unit, a complementary image generation unit, and a video data generation unit. The receiving unit receives second video data, in which consecutive video frames of the first video data whose corresponding audio data level is equal to or less than a predetermined threshold have been deleted so that the frame rate is reduced from the first frame rate, indicating the first number of frames per unit time, to a smaller second frame rate, together with frame information related to the deleted frames.
- the complementary image generation unit generates a complementary image that complements the deleted frame image using the frame information.
- the video data generation unit generates video data of the first frame rate using the complementary image and the second video data.
- the information processing system can improve the reproduction quality of moving image data transmitted from the storage destination via the network.
- FIG. 1 shows an example of the configuration of an information processing system according to the present embodiment.
- An example of the structure of the meta-information of video data is shown.
- a diagram for explaining how video frames are deleted and reconstructed is shown.
- an example of a complementary frame generation process is shown.
- An example of a method for creating a complementary frame by discriminating an object having a large movement amount will be described.
- a diagram for explaining adjustment of the video frame rate of moving image data at the transmitting terminal is shown.
- a flowchart of a frame reduction process is shown.
- details of an operation flow for determining frames to be deleted based on the audio level of each frame in the moving image data stored in the work buffer are shown.
- a diagram for explaining the decoding process of the receiving terminal is shown.
- a flowchart of frame reconstruction by the receiving terminal is shown.
- an operation flow of the complementary frame generation process for deleted frames in the receiving terminal is shown.
- An example of a sequence diagram (part 1) of the information processing system according to the present embodiment is shown.
- An example of a sequence diagram (part 2) of the information processing system according to the present embodiment is shown.
- An example of the structure of the server in the present embodiment is shown.
- An example of the hardware configuration of the server or the personal computer according to the present embodiment is shown.
- An example of the hardware configuration of the mobile terminal according to the present embodiment is shown.
- An example of the structure of the information processing system in this embodiment (modification) is shown.
- FIG. 1 is an example of a functional block diagram of the information processing system according to the present embodiment.
- the first information processing apparatus 1 includes a storage unit 2, a monitoring unit 3, a deletion unit 4, a frame information generation unit 5, and a transmission unit 6.
- the storage unit 2 stores first video data and audio data including synchronization information associated with the first video data.
- the monitoring unit 3 monitors the state of the communication network.
- according to the monitoring result, the deletion unit 4 generates second video data having a second frame rate lower than the first frame rate, which indicates the first number of frames per unit time of the first video data, by deleting consecutive video frames whose corresponding audio data level is equal to or lower than a predetermined threshold. In addition, the deletion unit 4 deletes one of any consecutive frames whose mutual similarity is equal to or greater than a predetermined threshold.
- the frame information generation unit 5 generates frame information regarding the deleted frame.
- the transmission unit 6 transmits the second video data, audio data, and frame information.
- the second information processing apparatus 7 includes a receiving unit 8, a complementary image generating unit 9, and a video data generating unit 10.
- the receiving unit 8 receives second video data, in which consecutive video frames of the first video data whose corresponding audio data level is equal to or lower than a predetermined threshold have been deleted so that the frame rate is reduced from the first frame rate to a smaller second frame rate, together with frame information related to the deleted frames.
- the receiving unit 8 further receives audio data including synchronization information associated with the first video data.
- the complementary image generation unit 9 generates a complementary image that complements the deleted frame image using the frame information. Further, the complementary image generation unit 9 generates a complementary image by duplicating the frame immediately before the deleted frame. In addition, the complementary image generation unit 9 uses the frames before and after the deleted frame to determine an object whose movement amount included in the deleted frame is equal to or greater than a predetermined threshold, and an area for displaying the object is The complementary image is generated by duplicating the area indicating the object of the non-deleted frame immediately after the deleted frame.
- the video data generation unit 10 generates video data of the first frame rate using the complementary image and the second video data. Further, the video data generation unit 10 uses the frame information to insert a complementary image at the position of the deleted frame of the second video data to generate video data of the first frame rate. In addition, the video data generation unit 10 uses the synchronization information to synchronize the complementary image corresponding to the deleted frame and the audio data corresponding to the deleted frame to generate video data at the first frame rate. To do.
- the transmission rate or frame rate of moving image data can be changed according to the network bandwidth. Therefore, the data amount per unit time of streaming data flowing through the network can be reduced. Further, it is possible to prevent a delay of the video frame with respect to the audio frame. In addition, since the transmission amount of the video frame is larger than that of the audio frame, it is possible to prevent the loss of the video frame that occurs when the video frame does not reach within a certain time during the video decoding process of the mobile terminal.
- a viewer's sense of incongruity can be suppressed.
- the amount of computation for generating complementary frames can be reduced, and a discrepancy between video and audio caused by the time taken to create complementary frames can be prevented.
- the time stamp of the generated complementary frame is synchronized with the time stamp of the audio frame corresponding to the video frame, so that the video data and Audio data can be synchronized.
- the memory area for lip sync adjustment can be reduced, and the calculation load can be reduced.
- lip sync adjustment need not be performed by delaying the video data or the audio data. Therefore, the memory area for lip sync and the GPU (Graphics Processing Unit) capacity required to withstand the video processing can be reduced.
- FIG. 2 shows an example of the configuration of the information processing system according to the present embodiment.
- the information processing system includes a personal computer 33, a server 31, and a mobile terminal 35.
- the personal computer 33 and the server 31, and the server 31 and the mobile terminal 35 are connected via a communication carrier 34 via a network.
- the server 31 is an example of the first information processing apparatus 1.
- the mobile terminal 35 is an example of the second information processing device 7.
- the personal computer 33 stores moving image data.
- the server 31 receives the moving image data from the personal computer 33, and deletes and transfers a plurality of video frames from the received moving image data.
- the mobile terminal 35 can reproduce the moving image data received from the server 31.
- the personal computer 33 stores moving image data accessed from the mobile terminal 35 via the server 31.
- the personal computer 33 is disposed in the home, for example, and is connected to the server 31 via the Internet. Access from the server 31 to the personal computer 33 is restricted by an authentication function. Further, the personal computer 33 holds, for each server 31, a file including a list of moving image files that can be provided to that server 31. Alternatively, the personal computer 33 may hold such a list for each mobile terminal 35.
- the personal computer 33 may be a host computer having the function of the server 31, or may be a storage device that stores a moving image file and is connected to a network.
- the server 31 receives a viewing request for a moving image file stored in the personal computer 33 from the mobile terminal 35.
- the server 31 also monitors the bandwidth status of the server 31 and the mobile terminal 35. Then, the server 31 acquires the video data to be viewed from the personal computer 33, deletes a plurality of video frames from the acquired video data according to the network bandwidth condition, and transfers the video data to the mobile terminal 35. . Details of network bandwidth monitoring and video frame deletion will be described later.
- the server 31 has an authentication function for establishing a connection with the mobile terminal 35.
- the mobile terminal 35 receives a request for a video to be played from the user. Then, it instructs the personal computer 33 to transfer the moving image.
- the mobile terminal 35 receives the moving image data from the server 31 in a streaming format. If the received moving image data contains video frames that were deleted at the time of transmission, the mobile terminal 35 generates complementary frames corresponding to those video frames and inserts them into the moving image data. Thereby, the moving image data is restored to the state it was in before the video frames were deleted. Then, the mobile terminal 35 reproduces the restored moving image data. By performing this restoration process, the mobile terminal 35 reproduces the moving image data at the same frame rate as the moving image data stored in the personal computer 33.
- transmitting data from the personal computer 33 to the server 31 may be referred to as “upload”, and transmitting data from the server 31 to the mobile terminal 35 may be referred to as “downlink” (download).
- the personal computer 33 or the server 31 that is the transmission source of the moving image data may be referred to as a transmission terminal
- the server 31 or the mobile terminal 35 that is the reception side of the moving image data may be referred to as a reception terminal.
- a video frame deleted in the personal computer 33 may be referred to as a deleted frame.
- the meta information of the video data is sent from the transmitting terminal to the receiving terminal together with the video data.
- the transmitting terminal adds the information of the deleted video frame to the meta information, and the receiving terminal restores the moving image data by generating a complementary frame using the meta information.
- FIG. 3 shows an example of the structure of meta information of video data.
- Meta information is information about moving image data and is associated with moving image data.
- the meta information includes a format size (content resolution) 43, a video title 44, a video time 45, a creation date 46, contents 47, a deletion frame start number 41, and a deletion frame period (number of frames) 42.
- the format size (content resolution) 43, the video title 44, the video time 45, the creation date 46, and the content 47 are included in the moving image data stored in the personal computer 33.
- the format size (content resolution) 43, video title 44, video time 45, creation date 46, and contents 47 are, respectively, the format size (content resolution), video title, video time, creation date, and contents of the corresponding moving image data.
- the deletion frame start number 41 and the deletion frame period (the number of frames) 42 are data items added to the metadata when the video frame is deleted at the transmission terminal.
- the deletion frame start number 41 is a frame number (frame identification number) of a video frame deleted by the transmission terminal.
- the deletion frame period (the number of frames) 42 indicates a period when the video frames to be deleted are continuous, and is represented by, for example, the continuous number of deletion frames.
- the deletion frame start number 41 and the deletion frame period (the number of frames) 42 are information on the deletion frame every predetermined period (for example, every 1 second). Therefore, when there are a plurality of deletion frames within a predetermined period, data items of the deletion frame start number 41 and the deletion frame period 42 associated with each deletion frame are added to the meta information.
- the deletion frame start number 41 and the deletion frame period (number of frames) 42 may be the frame number of the restart frame and the period (number of frames) of the restart frame.
- with these items, the mobile terminal 35 can recognize in advance, during video playback, the start number of each deleted frame and its duration. Thereby, the mobile terminal 35 can complement the deleted frames accordingly.
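- the meta information described above can be modeled as a simple record. The sketch below is a minimal illustration with hypothetical field names (the patent does not prescribe a concrete format), assuming one (start number, period) pair is appended per run of deleted frames within each predetermined period.

```python
# Minimal sketch of the meta information structure described above.
# Field names are illustrative; the document does not define a concrete format.
from dataclasses import dataclass, field
from typing import List

@dataclass
class DeletedRun:
    start_frame_number: int   # deletion frame start number (41)
    period_in_frames: int     # deletion frame period / number of frames (42)

@dataclass
class MetaInfo:
    format_size: str          # content resolution (43), e.g. "1280x720"
    video_title: str          # (44)
    video_time_sec: float     # (45)
    creation_date: str        # (46)
    contents: str             # (47)
    deleted_runs: List[DeletedRun] = field(default_factory=list)

# Example: video frames E to G (3 consecutive frames) were deleted.
meta = MetaInfo("1280x720", "holiday", 120.0, "2012-11-16", "home video")
meta.deleted_runs.append(DeletedRun(start_frame_number=5, period_in_frames=3))
```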
- a transmission terminal deletes a video frame from video data and transmits the video data
- a reception terminal generates a complementary frame and restores the video data.
- the determination of which video frame is deleted at the transmitting terminal is made based on the audio level of the audio frame corresponding to each video frame.
- lip sync deviation is most noticeable as a mismatch between the movement of a person's mouth and the voice in a scene where the person speaks.
- when the volume level is low, it is considered that the mouth is not moving even if a person's face is displayed in the video. Therefore, it is possible to suppress a viewer's uncomfortable feeling when reproducing a moving image by deleting consecutive video frames having a low audio level and restoring them by copying the video frame immediately before the deleted frame.
- FIG. 4 is a diagram for explaining how video frames are deleted and restored.
- the video frame deletion process is performed by the server 31, but may be performed by the personal computer 33.
- the sound level may be, for example, a numerical value of a sound volume, a sound volume belonging to a specific frequency band belonging to a human audible range, or the like, but is not limited thereto.
- for example, as shown in FIG. 4, the audio level of frame A is “60”, that of frame B is “70”, that of frame C is “90”, and so on.
- the server 31 discriminates a frame whose audio level is equal to or less than a predetermined threshold among the frames. Of the frames whose audio level is equal to or lower than a predetermined threshold, a video frame whose audio level of the immediately preceding frame is equal to or lower than the predetermined threshold is recognized as a deletion target frame.
- the predetermined threshold value is 20.
- the frames whose audio level is 20 or less are D to G.
- for frame D, the audio level of the immediately preceding frame (frame C, audio level 90) is equal to or higher than the threshold, so the video frame D is not recognized as a deletion target frame.
- the frames E to G are all recognized as frames to be deleted because the sound level of the immediately preceding frame is below the threshold value. Note that video data and audio data are associated with each frame, and the audio level of the video frame is recognized using information of the audio frame corresponding to the video frame.
- the server 31 deletes the recognized deletion target frame from the moving image data. Then, the server 31 adds information regarding the deleted frame to the meta information. In the case of the example of FIG. 4, information of the video frames E to G is added to the meta information. Specifically, the server 31 stores the frame number of the video frame E in the deletion frame start number 41 and stores the integer value “3” in the deletion frame period (number of frames) 42.
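- a minimal sketch of this selection rule follows, using the audio levels of frames A to G from the example above and the threshold 20: a frame is marked for deletion only when both its own audio level and that of the immediately preceding frame are at or below the threshold, so frame D is kept while frames E to G are deleted. The helper name and data layout are illustrative, not part of the patent.

```python
# Sketch: select deletion-target frames from per-frame audio levels.
# A frame is a deletion target only if its audio level and the audio level
# of the immediately preceding frame are both at or below the threshold.
def select_deletion_targets(audio_levels, threshold=20):
    targets = []
    for i in range(1, len(audio_levels)):
        if audio_levels[i] <= threshold and audio_levels[i - 1] <= threshold:
            targets.append(i)
    return targets

# Audio levels of frames A..G as in the example (D..G are at or below 20).
levels = {"A": 60, "B": 70, "C": 90, "D": 15, "E": 10, "F": 5, "G": 10}
names = list(levels)
deleted = select_deletion_targets([levels[n] for n in names])
print([names[i] for i in deleted])  # ['E', 'F', 'G']  (D is kept)
```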
- the number of video frames to be deleted is determined such that the transmission rate of the moving image data to be transmitted is smaller than the network bandwidth between the server 31 and the mobile terminal 35.
- the server 31 adds the information on the deleted video frames to the meta information and transmits the meta information together with the moving image data to the mobile terminal 35. The mobile terminal 35 thus receives the video data in a state where the video frames E to G have been deleted. The mobile terminal 35 receives the meta information at the same time as the video data, and recognizes that the deleted video frames are the video frames E to G. Specifically, the mobile terminal 35 reads the deletion frame start number 41 and the deletion frame period (number of frames) 42 of the meta information, and specifies the deleted frames.
- the mobile terminal 35 generates a complementary frame of the deleted video frames E to G, and inserts the complementary frame at the position of the deleted video frame of the moving image data, thereby performing moving image data restoration processing. Specifically, the mobile terminal 35 inserts a complementary frame at the position of the deleted frame specified from the deleted frame start number 41 and the deleted frame period (number of frames) 42 of the corresponding meta information.
- Complementary frame generation processing is executed by duplicating the video frames before and after the deleted frame.
- FIG. 5 is a diagram illustrating an example of a complementary frame generation process.
- the example of FIG. 5 shows an example of complementation when the video frames D to F are deleted at the transmission terminal based on the audio level of the frame.
- the frame corresponding to the video frame D is generated by duplicating the video frame C, which is the previous video frame not deleted.
- the video frames E and F are generated by duplicating the video frame G, which is a video frame that has not been deleted immediately after.
- the copy source video frame may be changed in consideration of the amount of movement of the object in the video frame.
- the mobile terminal 35 compares the video frames that have not been deleted before and after the deletion frame, and discriminates an object having a large movement amount and an object that is not. Then, the mobile terminal 35 sets the pixel value at the position corresponding to the object whose movement amount is not large as a value obtained by duplicating the pixel value of the video frame not deleted immediately before.
- for a pixel value at a position corresponding to an object having a large movement amount, the copy source video frame is switched at a predetermined point in time within the run of consecutive deleted frames. The pixel value of a deleted frame up to the predetermined time is a value duplicated from the immediately preceding non-deleted video frame, and the pixel value of a deleted frame after the predetermined time is a value duplicated from the immediately following non-deleted video frame.
- the amount of movement can be measured by various methods such as a method using a motion vector search for estimating the amount of motion of an image as used in motion compensation of MPEG (Moving / Picture / Experts / Group).
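- as a rough illustration of how a motion amount could be estimated between the non-deleted frames surrounding a deleted run, the sketch below performs an exhaustive block-matching search of the kind used in MPEG-style motion compensation. It is only one possible implementation under stated assumptions; the block size, search range, and NumPy representation are not taken from the patent.

```python
# Sketch: estimate the motion of a block between the frame immediately before
# and the frame immediately after a run of deleted frames, using an exhaustive
# block-matching (motion vector) search. Parameters are assumptions.
import numpy as np

def block_motion(prev_frame, next_frame, top, left, block=16, search=8):
    """Return (dy, dx) minimizing the SSD between the block in prev_frame
    and a displaced block in next_frame."""
    ref = prev_frame[top:top + block, left:left + block].astype(np.int64)
    best, best_vec = None, (0, 0)
    h, w = next_frame.shape
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + block > h or x + block > w:
                continue
            cand = next_frame[y:y + block, x:x + block].astype(np.int64)
            ssd = int(((ref - cand) ** 2).sum())
            if best is None or ssd < best:
                best, best_vec = ssd, (dy, dx)
    return best_vec

# A block whose motion vector magnitude exceeds a threshold would be treated
# as belonging to an object with a large movement amount.
prev = np.zeros((64, 64), dtype=np.uint8); prev[8:24, 8:24] = 255
nxt = np.zeros((64, 64), dtype=np.uint8); nxt[8:24, 12:28] = 255
print(block_motion(prev, nxt, 8, 8))  # (0, 4): the block moved 4 pixels right
```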
- FIG. 6 shows an example of a method in which an object having a large movement amount and an object that is not so are discriminated and a complementary frame is generated.
- the example of FIG. 6 shows an example of complementation when the video frames D to F are deleted at the transmission terminal based on the audio level of the frame.
- the pixel value at the position indicating the object with the large moving amount of the video frames D and E is the pixel value at the corresponding position of the video frame C that is the previous video frame that has not been deleted.
- the pixel value at the position indicating the object with the large moving amount of the video frame F is set as the pixel value at the corresponding position of the video frame G which is the video frame not deleted immediately after.
- a pixel value at a position indicating an object with a small moving amount of the video frames D, E, and F is a pixel value at a corresponding position in the video frame C.
- the method of determining the deletion target frame based on the audio level of the frame has been described.
- in addition, a method that further considers the similarity, that is, the closeness of a video frame to the video frames before and after it in the time series of the video data, may be used.
- as the similarity, a value calculated by comparing a video frame with the immediately preceding video frame may be used, and the similarity may be calculated using, for example, the sum of squared differences (SSD) of pixel values.
- the transmitting terminal determines the video frames whose similarity is equal to or greater than a predetermined threshold, and recognizes, among them, the video frames whose audio level is equal to or lower than the predetermined threshold as deletion target frames.
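- a brief sketch of an SSD-based similarity check follows; the concrete measure and threshold are not specified in the text, so treat both as assumptions. A smaller sum of squared differences to the previous frame means higher similarity.

```python
# Sketch: sum of squared differences (SSD) between a frame and the previous
# frame. A small SSD means the frames are nearly identical, i.e. highly
# similar, so one of them is a candidate for deletion.
import numpy as np

def ssd(frame_a, frame_b):
    diff = frame_a.astype(np.int64) - frame_b.astype(np.int64)
    return int((diff ** 2).sum())

def is_similar(frame_a, frame_b, ssd_threshold=1_000_000):
    return ssd(frame_a, frame_b) <= ssd_threshold  # threshold is an assumption

a = np.full((720, 1280), 128, dtype=np.uint8)
b = a.copy(); b[0, 0] = 130
print(is_similar(a, b))  # True: the frames differ in a single pixel
```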
- Distribution quality monitoring determines an adjustment amount for the video frame rate of the moving image data to be transmitted so that the bandwidth used for moving image transfer is equal to or less than the network bandwidth between the server 31 and the mobile terminal 35.
- the server 31 dynamically monitors the status of the network bandwidth between the server 31 and the mobile terminal 35, and adjusts the transmission rate by deleting video frames from the video data according to the monitoring result.
- the server 31 deletes the video frame so that the transmission rate of the moving image data does not exceed the network bandwidth.
- the server 31 compares the data amount of the moving image to be transmitted with the data amount that can be stably transmitted and received, and determines the video frame rate of the moving image data so that the data amount of the moving image to be transmitted falls within the range that can be stably transmitted and received.
- the video frame rate of the moving image data determined here is referred to as a transmission frame rate in the following description.
- the server 31 determines the available bandwidth between the server 31 and the mobile terminal 35 from the result of bandwidth monitoring between the server 31 and the mobile terminal 35. Alternatively, the server 31 determines an available bandwidth from the result of bandwidth monitoring and the transmission path capacity of the line. Then, the server 31 calculates the transmission bit rate of the moving image data from the resolution, bit color, compression rate, activity level, and frame rate of the moving image data. Then, the server 31 derives the deletion amount of the video frame rate of the moving image data so that the calculated transmission bit rate is within the usable bandwidth.
- the activity level is the frequency of packet transmission with respect to the stream, and varies depending on, for example, the content of the moving image data. For example, in the case of silent video data, there is no need to transmit an audio stream, and the activity level of the audio stream is zero.
- the communication standard between the server 31 and the mobile terminal 35 is assumed to be LTE (Long Term Evolution).
- the bandwidth in this case is assumed to be 75 Mbps at the maximum.
- moving image data is encoded at a fixed bit rate in the codec.
- the moving image data has, for example, a resolution of 1280 × 720 pixels (HD, high definition), a bit color depth of 8, and a video frame rate of 24 fps (frames per second).
- a compression method such as MPEG includes a full I picture (Intra picture), a 1/2 size P picture (Predictive picture), and a 1/4 size B picture (Bi-directional predictive picture).
- the I picture, P picture, and B picture are configured in a ratio of 1 : 4 : 10. Therefore, the compression rate of the moving image data is about (1 × 1 + 4 × 1/2 + 10 × 1/4) / 15 = 5.5 / 15 = 11/30.
- the original video data is content intended to be displayed on a device with a high resolution, whereas the video data is reproduced on the mobile terminal 35 at a lower resolution.
- therefore, the appearance hardly changes even if the resolution of the video data is reduced to the resolution of the mobile terminal 35.
- moreover, the performance of the video playback chip of the mobile terminal 35 often cannot use all of the information included in the moving image data, and cannot reproduce fine changes in the video.
- for example, let A be the resolution determined based on the resolution of the mobile terminal 35 recognized by the server 31, and let B be the number of video frames.
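- using the figures of this example (1280 × 720 pixels, 8-bit color, 24 fps, compression rate of about 11/30, up to 75 Mbps on LTE), a rough calculation of the transmission bit rate and of a transmission frame rate that fits the available bandwidth might look like the sketch below. The 8 bits-per-pixel figure and the 40 Mbps available-bandwidth value are assumptions for illustration, not values taken from the patent.

```python
# Sketch: estimate the transmission bit rate of the moving image data and
# derive a transmission frame rate that fits within the available bandwidth.
def transmission_bit_rate(width, height, bits_per_pixel, fps, compression):
    raw_bits_per_frame = width * height * bits_per_pixel
    return raw_bits_per_frame * fps * compression  # bits per second

def transmission_frame_rate(available_bps, width, height, bits_per_pixel,
                            fps, compression):
    rate = transmission_bit_rate(width, height, bits_per_pixel, fps, compression)
    if rate <= available_bps:
        return fps                               # no frames need to be deleted
    return int(fps * available_bps / rate)       # reduce fps to fit the bandwidth

compression = (1 * 1 + 4 * 0.5 + 10 * 0.25) / 15   # = 11/30, from the I:P:B ratio
full_rate = transmission_bit_rate(1280, 720, 8, 24, compression)
print(round(full_rate / 1e6, 1))                   # ~64.9 Mbps at the original 24 fps
print(transmission_frame_rate(40e6, 1280, 720, 8, 24, compression))  # e.g. 14 fps
```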
- FIG. 7 is a diagram for explaining adjustment of the video frame rate of the moving image data in the transmission terminal.
- the file decoder 51 separates the streaming data into video data and audio data. Then, the file decoder 51 outputs the separated video data to the video encoder 53 and outputs the separated audio data to the audio encoder 55. Further, the file decoder 51 transmits the meta information of the streaming data to the frame control unit 52.
- the separated video data and audio data are associated with each frame.
- video frames and audio frames having the same frame number are associated with each other.
- the same value is set for the time stamp of the corresponding video frame and audio frame.
- the time stamp is defined by adding the number of seconds from the start of playback to each frame, and the frame is played back at the time of the time stamp.
- Each video frame and each audio frame holds time stamp information.
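- a small sketch of the timestamp convention just described (the frame-rate value is assumed): corresponding video and audio frames carry the same frame number and therefore the same timestamp, which is the playback offset in seconds from the start.

```python
# Sketch: corresponding video and audio frames share a frame number, and the
# time stamp is the number of seconds from the start of playback.
def time_stamp(frame_number, frame_rate):
    return frame_number / frame_rate

frame_rate = 24.0  # assumed
for n in range(3):
    video_ts = time_stamp(n, frame_rate)
    audio_ts = time_stamp(n, frame_rate)   # same value: the frames stay in sync
    print(n, video_ts, audio_ts)
```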
- the frame control unit 52 determines a video frame to be deleted so that the frame rate of the video data is equal to or lower than the transmission frame rate. Then, the frame control unit 52 adds information on the deleted video frame to the meta information. That is, the frame control unit 52 stores the information of the deleted video frame in the deleted frame start number 41 and the deleted frame period (number of frames) 42 of the meta information.
- the video data output from the file decoder 51 is input to the video encoder 53.
- the video encoder 53 deletes the deleted frame determined by the frame control unit 52 from the video data, and constructs the video frame again. Then, the video encoder 53 encodes the reconstructed video frame into a transmission format.
- the encoded video data is divided or aggregated into, for example, RTMP (Real Time Messaging Protocol) format packets.
- the video encoder 53 outputs the encoded video data to the video memory 54, and the encoded video data is transmitted from the video memory 54 to the receiving terminal.
- the audio data separated by the file decoder 51 is input to the audio encoder 55.
- the audio encoder 55 converts the received audio data into a sending format and outputs it to the audio memory 56.
- the encoded audio data is transmitted from the audio memory 56 to the receiving terminal.
- the audio frame corresponding to the deleted video frame is not deleted, and the meta information of the audio data is transmitted without being changed.
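- a tiny sketch of this point (names illustrative): the video frames selected for deletion are dropped, while the corresponding audio frames and the audio meta information pass through untouched, so the receiving terminal can later re-synchronize complementary frames to them.

```python
# Sketch: at the transmitting terminal, only the selected video frames are
# deleted; every audio frame is kept and transmitted unchanged.
def split_and_reduce(video_frames, audio_frames, deletion_targets):
    kept_video = [f for i, f in enumerate(video_frames) if i not in deletion_targets]
    return kept_video, list(audio_frames)   # audio passes through unchanged

video = ["vA", "vB", "vC", "vD", "vE", "vF", "vG"]
audio = ["aA", "aB", "aC", "aD", "aE", "aF", "aG"]
kept_video, kept_audio = split_and_reduce(video, audio, {4, 5, 6})
print(kept_video)   # ['vA', 'vB', 'vC', 'vD']
print(kept_audio)   # all 7 audio frames are still transmitted
```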
- FIG. 8 shows a flowchart of video frame reduction processing.
- the transmission terminal determines the frames to be deleted among the video frames of a predetermined period stored in the work buffer, and by deleting those frames, the video frame rate of the moving image data stored in the work buffer is adjusted so that it is equal to or below the transmission frame rate.
- the flow in FIG. 8 is periodically executed for each video frame of a predetermined period stored in the work buffer.
- the transmitting terminal reads (buffers) moving image data for a predetermined period into the work buffer (S61).
- the transmission terminal confirms the result of the distribution quality monitoring and recognizes the transmission frame rate (S62).
- the transmitting terminal confirms the number of video frames that must be deleted from the work buffer so that the video frame rate of the moving image data currently stored in the work buffer becomes the transmission frame rate recognized in S62 (S63).
- the transmission terminal determines a deletion target frame to be deleted based on the audio level corresponding to each frame among the video frames stored in the work buffer (S64). Then, the transmitting terminal determines whether or not the number of deletion target frames determined in S64 is equal to or greater than the number of deleted frames confirmed in S63 (S65).
- if the number of deletion target frames determined in S64 is less than the number of frames to be deleted (No in S65), the transmitting terminal increases the period of frames to be stored in the work buffer (S66) and performs the determination of deletion target frames again (S64).
- if it is equal to or greater than the number of frames to be deleted (Yes in S65), the transmitting terminal deletes the deletion target frames determined in S64 and adds the information on the deleted frames to the meta information (S66). Then, the transmission terminal encodes the moving image data for delivery and distributes it (S67).
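- a condensed sketch of this loop (S61 to S67) follows, reusing the audio-level selection rule sketched earlier; buffer handling, encoding, and distribution are reduced to placeholders, and all names and the initial buffer period are illustrative assumptions.

```python
# Sketch of the frame reduction loop (S61 to S67): buffer a predetermined
# period of frames, work out how many must be removed to reach the
# transmission frame rate, pick deletion targets by audio level, and widen
# the buffered period when too few deletable frames are found.
def deletion_targets(levels, threshold=20):
    # a frame is deletable when its level and the previous level are both low
    return [i for i in range(1, len(levels))
            if levels[i] <= threshold and levels[i - 1] <= threshold]

def reduce_frames(frames, audio_levels, source_fps, transmission_fps, period=4):
    while True:
        buffered, levels = frames[:period], audio_levels[:period]        # S61
        need = max(0, period - period * transmission_fps // source_fps)  # S63
        targets = deletion_targets(levels)                                # S64
        if len(targets) >= need or period >= len(frames):                 # S65
            drop = set(targets[:need])
            kept = [f for i, f in enumerate(buffered) if i not in drop]   # S66
            return kept, sorted(drop)           # deleted-frame info -> meta info
        period = min(len(frames), period * 2)   # extend the buffered period

frames = list("ABCDEFG")
levels = [60, 70, 90, 15, 10, 5, 10]
kept, deleted = reduce_frames(frames, levels, source_fps=24, transmission_fps=14)
print(kept, deleted)   # ['A', 'B', 'C', 'D'] [4, 5, 6]
```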
- FIG. 9 shows the details of the flow of the operation (S64) in which the transmitting terminal discriminates the frame to be deleted based on the audio level of each frame in the moving image data stored in the work buffer.
- the frame number of the discrimination target video frame is assumed to be n (hereinafter, the video frame having the frame number n is referred to as n frame).
- the transmitting terminal determines whether or not the audio level of the video frame with frame number n-1 (the n-1 frame) is equal to or higher than the predetermined threshold (S71). If the audio level of the n-1 frame is equal to or higher than the predetermined threshold, the transmitting terminal determines that the n frame is not a deletion target frame (S74). Then, the transmitting terminal increments the value of n (S75).
- if the audio level of the n-1 frame is below the predetermined threshold, the transmitting terminal determines whether or not the audio level of the n frame is equal to or higher than the predetermined threshold (S72).
- if the audio level of the n frame is equal to or higher than the predetermined threshold, the transmitting terminal determines that the n frame is not a deletion target frame (S74). Then, the transmitting terminal increments the value of n (S75).
- if the audio level of the n frame is below the predetermined threshold, the transmitting terminal determines that the n frame is a deletion target frame (S73). Then, the transmitting terminal increments the value of n (S75).
- the transmitting terminal determines whether or not the discrimination process has been performed on the video frames in all the work buffers (S76). If there is a video frame in the work buffer that has not been subjected to the discrimination process (No in S76), the process returns to S71. If discrimination processing has been performed on all the video frames in the work buffer (Yes in S76), the processing ends.
- the work buffer does not have to store all the video frames of the moving image data; it only needs an area large enough that video frames can be deleted and the video frame rate can be made equal to or lower than the transmission frame rate determined by the distribution quality monitoring.
- the period of moving image data that is compared with the transmission frame rate at a time in the transmitting terminal is referred to as a moving image data discrimination target period.
- the transmission terminal divides the moving image data into predetermined periods, determines the deletion targets for each period, and determines whether or not the video frame rate of the moving image in that predetermined period is smaller than the transmission frame rate.
- if the video frame rate cannot be made equal to or lower than the transmission frame rate within that period, the transmitting terminal increases the determination target period of the moving image data.
- that is, the transmission terminal increases the period of the moving image data stored in the work buffer, i.e., the set of video frames for which deletion is determined at the same time.
- the transmission terminal deletes the video frame so that the video frame rate of the moving image data during the period of the video frame that can be stored in the work buffer is equal to or less than the transmission frame rate determined by the distribution quality monitoring.
- the video frames L to V are further buffered in the work buffer. Among them, the video frames whose similarity is equal to or greater than the threshold are deleted. Then, it is confirmed whether or not the video frame rate in the period of the video frames A to V is equal to or lower than the transmission frame rate determined by the distribution quality monitoring.
- FIG. 10 is a diagram for explaining the decoding process of the receiving terminal.
- the decoding processing unit includes a file decoder 91, a frame control unit 92, a video decoder 93, a video memory 94, an audio decoder 95, and an audio memory 96.
- the file decoder 91 separates the streaming data received from the transmission terminal into video data and audio data. Then, the file decoder 91 outputs the separated video data to the video decoder 93 and outputs the audio data to the audio decoder 95. In addition, the file decoder 91 extracts meta information from the received streaming data and transmits it to the frame control unit 92.
- the frame control unit 92 recognizes the deleted frame from the meta information received from the file decoder 91, and outputs the deleted frame information to the video decoder 93.
- the deleted frame information includes, for example, a deleted frame start number 41 and a deleted frame period (number of frames) 42. Further, the frame control unit 92 recognizes the time stamp of the audio frame corresponding to the deleted frame using the meta information, and outputs the time stamp information of the audio frame to the video decoder 93 as a control signal.
- the video decoder 93 receives the video data from the file decoder 91 and receives the deletion frame information from the frame control unit 92.
- the video decoder 93 decodes the video data.
- the video decoder 93 generates a complementary frame of the deleted frame from the video data and the deleted frame information, and reconstructs the video frame.
- the time stamp of the complementary frame generated here is set to the same value as the time stamp of the audio frame corresponding to the deleted frame by the control signal received from the frame control unit 92. Thereby, lip sync adjustment is performed.
- the video decoder 93 outputs the reconstructed video data to the video memory 94.
- the audio decoder 95 receives the audio data from the file decoder 91 and performs a decoding process. Then, the audio data is output to the audio memory 96.
- the lip sync adjustment is not performed when data is stored in the video memory 94 and the audio memory 96, but is completed in the video decoder 93. Thereby, the memory capacity for lip sync adjustment at the receiving terminal can be reduced.
- FIG. 11 is a flowchart of frame reconstruction by the receiving terminal.
- the video decoder 93 of the receiving terminal executes a deletion frame complementary frame generation process (S101).
- the video decoder 93 receives information regarding the deleted frame included in the meta information from the frame control unit 92.
- the information regarding the deleted frame includes, for example, a deleted frame start number 41 and a deleted frame period (number of frames) 42. Then, the video decoder 93 recognizes the deletion frame number and the period of the deletion frame from the information regarding the deletion frame (S102). Note that the restart frame number may be recognized instead of the deletion frame number.
- the video decoder 93 assigns a frame number to the generated complementary frame based on the recognized deletion frame number or restart frame number (S103).
- This frame number assignment can be said to be an operation of inserting a complementary frame at the position of the deleted frame.
- when the video decoder 93 detects a frame of the predetermined period (S104), it matches the time stamp of the generated complementary frame with the time stamp value of the audio frame corresponding to the frame number assigned in S103 (S105). As a result, the lip sync is adjusted, and the video and audio can be synchronized.
- the video decoder 93 outputs the video data for which the lip sync adjustment has been completed to the video memory 94, and outputs the audio data to the audio memory 96. Then, the moving image reproduction application is notified that the video data and the audio data have been output to the video memory 94 and the audio memory 96, respectively (S106). Using this data, the mobile terminal 35 reproduces the moving image. Note that the flow of FIG. 11 is periodically executed for each frame of a predetermined period.
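- a small sketch of S103 to S105 follows: each complementary frame is given the frame number of the deleted frame it replaces, and its timestamp is set to the timestamp of the audio frame with the same number, so no separate delay buffer is needed for lip sync. Data structures and names are illustrative assumptions.

```python
# Sketch: give each complementary frame the frame number of the deleted frame
# it replaces, then copy the time stamp of the audio frame with that number,
# so the reconstructed video stays lip-synced with the audio.
def assign_timestamps(complementary_frames, deleted_numbers, audio_timestamps):
    """complementary_frames: list of frame payloads, one per deleted frame.
    deleted_numbers: frame numbers of the deleted frames (from the meta info).
    audio_timestamps: dict mapping frame number -> audio time stamp."""
    out = []
    for payload, number in zip(complementary_frames, deleted_numbers):
        out.append({"frame_number": number,
                    "time_stamp": audio_timestamps[number],
                    "payload": payload})
    return out

audio_ts = {n: n / 24.0 for n in range(10)}          # audio frames kept intact
restored = assign_timestamps(["dupC", "dupG", "dupG"], [4, 5, 6], audio_ts)
print([f["time_stamp"] for f in restored])           # timestamps of frames 4..6
```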
- FIG. 12 is an operation flowchart of a complementary frame generation process for a deleted frame in the receiving terminal.
- for the sake of explanation, a case will be described in which the value of the deletion frame start number 41 in the meta information is n + 1 and the value of the deletion frame period (number of frames) 42 is T.
- let x be the threshold value used for switching whether the value of a pixel indicating the object is duplicated from the immediately preceding video frame or from the immediately following video frame.
- the threshold value x is an integer representing the number of frames, but is not limited thereto.
- the receiving terminal confirms from the metadata information that there is a video frame deleted at the transmitting terminal (S121). That is, the receiving terminal recognizes the deletion frame start number 41 and the deletion frame period (number of frames) 42 included in the metadata.
- the receiving terminal buffers a video frame (n frame) having a frame number immediately before the deleted frame start number 41 among frames not deleted (S122).
- the receiving terminal buffers a frame ((n + 1 + T) frame) having a frame number immediately after the deletion frame start number 41 among the frames not deleted (S123).
- the receiving terminal uses the n frame and the (n + 1 + T) frame to determine an object having a large movement amount in the frame (S124).
- the frame to be complemented is first set to the n + 1 frame (S125). Then, the receiving terminal determines whether or not the frame number of the complement processing target frame is n + x or more (S126). When the frame number of the complement processing target frame is less than n + x (No in S126), the complement processing target frame is generated by duplicating the n frame (S127). When it is n + x or more (Yes in S126), the complement processing target frame is generated by duplicating the (n + 1 + T) frame.
- the receiving terminal increments the frame number to be complemented (S129), and determines whether the frame number is greater than n + T (S130). When the complement processing target frame number is n + T or less (No in S130), the process returns to S126. If the frame number to be complemented is greater than n + T (Yes in S130), the process ends.
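- a sketch of this generation loop for a deleted run (deletion start number n + 1, period T, switch-over threshold x) follows: complementary frames earlier than n + x are duplicated from the n frame, the rest from the (n + 1 + T) frame. Names are illustrative, and for simplicity whole frames are duplicated here rather than only the region occupied by the fast-moving object.

```python
# Sketch: generate complementary frames for deleted frames n+1 .. n+T.
# Frames with numbers below n + x duplicate the last frame before the run
# (the n frame); the remaining frames duplicate the first frame after the
# run (the n+1+T frame).
def generate_complements(prev_frame, next_frame, n, T, x):
    complements = {}
    for number in range(n + 1, n + 1 + T):     # S125, S129, S130
        if number < n + x:                     # S126
            complements[number] = prev_frame   # duplicate the n frame (S127)
        else:
            complements[number] = next_frame   # duplicate the (n+1+T) frame
    return complements

# Deleted frames 5..7 (n = 4, T = 3); switch from "previous" to "next" at x = 2.
print(generate_complements("frame_4", "frame_8", n=4, T=3, x=2))
# {5: 'frame_4', 6: 'frame_8', 7: 'frame_8'}
```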
- FIG. 13A and 13B are sequence diagrams of the information processing system according to the present embodiment.
- FIG. 13 shows an operational relationship among the personal computer 33, the server 31, and the mobile terminal 35.
- the mobile terminal 35 connects to the server 31 that relays between the mobile terminal 35 and the personal computer 33 in order to acquire the moving image data stored in the personal computer 33 (S401).
- the server 31 performs authentication in order to confirm the validity of the mobile terminal 35 that requested the connection. Further, the server 31 recognizes the resolution at the time of moving image reproduction on the mobile terminal 35 simultaneously with the authentication of the mobile terminal 35 (S402). If the authentication is successful, the server 31 sends a device approval response to the mobile terminal 35 (S403).
- the server 31 connects to the personal computer 33 in which the moving image data requested from the mobile terminal 35 is stored (S404). Even when the connection between the personal computer 33 and the server 31 is established, authentication for confirming the validity of the connection is performed. If the authentication is successful, the personal computer 33 transmits a connection response to the server 31 (S405).
- the personal computer 33 notifies the mobile terminal 35 of list information of videos that can be provided via the server 31 (S406).
- a list of videos that can be provided may be preset for each mobile terminal 35 or for each server 31 and stored as a file in the personal computer 33.
- the resolution information of the mobile terminal 35 may be received from the server 31, and the personal computer 33 may determine a video that the mobile terminal 35 can reproduce from the video data according to the information.
- when the notified list is displayed on the screen of the mobile terminal 35, the user operates the mobile terminal 35 to select a video to be played back.
- the mobile terminal 35 makes a delivery request for the selected playback video (hereinafter referred to as a playback video request) to the server 31 (S407).
- the server 31 performs information collection work for monitoring the network bandwidth between the mobile terminal 35 and the server 31 (S408).
- the server 31 transmits the playback video request received from the mobile terminal 35 to the personal computer 33 (S409).
- the personal computer 33 notifies the server 31 of meta information related to the video specified by the playback video request (S410).
- the server 31 monitors the distribution quality of the video transmission between the server 31 and the mobile terminal 35 (S411). Specifically, the server 31 adjusts the video frame rate of the moving image data to be transmitted so that the bandwidth used for moving image transfer is equal to or less than the network bandwidth between the server 31 and the mobile terminal 35. Determine the amount. At this time, the server 31 notifies the mobile terminal 35 of the video data change information as a reproduction information change request so that the mobile terminal 35 can recognize the video data change information (S412).
- when the mobile terminal 35 receives the reproduction information change request from the server 31, it recognizes the change of the distribution video and sends a reproduction setting response to the server 31 (S413).
- the server 31 notifies the personal computer 33 of a streaming start request (S414).
- the personal computer 33 sends a streaming start response to the mobile terminal 35 via the server 31 (S415).
- the personal computer 33 distributes the streaming data to the server 31 (S416).
- when the server 31 receives the streaming data, it recognizes the audio level of the moving image data (S417). Then, the server 31 determines and selects the video frames to be deleted, based on the recognized audio level, so that the video frame rate becomes the transmission frame rate derived by the distribution quality monitoring. Here, the server 31 may change the resolution of the moving image data to the resolution recognized in S402. Then, the server 31 deletes the selected deletion target frames from the moving image data (S418).
- the server 31 adds the deleted video frame information to the meta information (S419). Then, the server 31 encodes the deleted moving image data into a format for sending (S420), and streams the encoded data to the mobile terminal 35 (S421).
- when the mobile terminal 35 receives the streaming data, it decodes the received data (S422). Then, the mobile terminal 35 restores the moving image data by generating complementary frames corresponding to the video frames deleted in S418 and inserting them into the moving image data (S423). Then, the mobile terminal 35 reproduces the restored moving image data (S424).
- the connection operations (S401, S404) between the mobile terminal 35 and the server 31 and between the server 31 and the personal computer 33 will now be described in detail.
- the mobile terminal 35 is installed with an application program (hereinafter referred to as an application) for reproducing a moving image, and is connected to the target server 31 by the application.
- the designation of the target server 31 may be selected by the user, or may be set in advance in the application.
- when the mobile terminal 35 is connected over a line such as a 3G (3rd generation) line, it can specify that line and connect to the Internet.
- the mobile terminal 35 and the personal computer 33 can thus establish a P2P (Peer to Peer) connection via the Internet.
- Authentication is performed when a connection is established between the server 31 and the mobile terminal 35 and between the server 31 and the personal computer 33.
- the authentication information used here is, for example, unique device information such as the IP address and MAC address of each device.
- the authentication information is managed on the server 31; the IP address and unique device information are stored for each mobile terminal 35, in association with the personal computer 33 to which that mobile terminal 35 can be connected.
- one personal computer 33 may be associated with a single mobile terminal 35, or with a plurality of them.
- the server 31 collates the IP address or unique device information of the mobile terminal 35 that made the connection request with the IP address or unique device information stored in the server 31. If they match, the server 31 regards the authentication as successful and establishes the connection.
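- a minimal sketch of this collation, assuming the authentication information is held as a list of records with an IP address, unique device information, and the associated personal computer(s) (the record layout and names are assumptions for illustration):

```python
def authenticate_terminal(request_ip, request_device_id, registered_terminals):
    """Collate the requesting terminal's IP address or unique device
    information with the entries stored on the server; on success, return
    the personal computer(s) associated with that terminal."""
    for entry in registered_terminals:
        if request_ip == entry["ip"] or request_device_id == entry["device_id"]:
            return entry["associated_pcs"]
    return None  # collation failed: authentication is unsuccessful
```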
- Various authentication techniques may be used for authentication, and a password authentication method, an electronic certificate authentication method, or the like may be used.
- after establishing the connection with the mobile terminal 35, the server 31 connects to the personal computer 33 corresponding to the mobile terminal 35. Specifically, when one personal computer 33 is associated with one mobile terminal 35 in the authentication information stored in the server 31, the server 31 refers to the authentication information and connects to the personal computer 33 corresponding to that mobile terminal 35. Alternatively, when the mobile terminal 35 establishes a connection with the server 31, the mobile terminal 35 may designate a personal computer 33 as the connection destination, and the server 31 may connect to the designated personal computer 33.
- the personal computer 33 that has received the connection request from the server 31 authenticates whether or not the server 31 that requested the access is legitimate.
- Various authentication techniques are used for the authentication, as in the authentication performed by the server 31 for the mobile terminal 35.
- access control may be performed by granting access authority to a moving image file held by the personal computer 33 or a directory (folder) of the personal computer 33.
- the personal computer 33 and the server 31, and the server 31 and the mobile terminal 35 may be connected using a network technology with high security such as VPN (Virtual Private Network). Furthermore, in data transmission between the personal computer 33 and the server 31, and between the server 31 and the mobile terminal 35, the transmitted data may be encrypted by various encryption techniques.
- the server 31 and the personal computer 33 may be arranged in the same intranet.
- the server 31 monitors the bandwidth between the mobile terminal 35 and the server 31.
- the server 31 transmits a band detection packet to the mobile terminal 35 in order to detect the bandwidth of the network connected to the mobile terminal 35.
- the time at which the server 31 transmits the packet is recorded in the packet.
- when the mobile terminal 35 receives the packet, it measures the reception time and compares it with the transmission time recorded in the packet. The mobile terminal 35 can thereby calculate the time required for a certain amount of packets to reach it from the server 31.
- the mobile terminal 35 transfers the bandwidth monitoring information obtained here to the server 31.
- alternatively, the server 31 may divide a predetermined amount of data into a plurality of packets and acquire the bandwidth monitoring information by measuring the time from when the server 31 transmits the first packet until the mobile terminal 35 receives the last packet.
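- a minimal sketch of deriving the bandwidth from the measured times (an illustrative assumption; the embodiment only requires that the time for a known amount of data to reach the mobile terminal 35 is measured):

```python
def estimate_bandwidth_bps(total_bytes, first_packet_sent_at, last_packet_received_at):
    """Estimate the usable bandwidth from the time a known amount of data,
    split over several packets, takes to travel from the server to the
    terminal (times in seconds, e.g. epoch timestamps)."""
    elapsed = last_packet_received_at - first_packet_sent_at
    if elapsed <= 0:
        return 0.0
    return total_bytes * 8 / elapsed  # bits per second
```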
- packet transmission for bandwidth monitoring is performed within a range where there is no problem in reproduction of actual moving image data.
- the bandwidth may be monitored by sending a command for monitoring the bandwidth, or by sending a command such as ping and measuring the response time.
- the bandwidth between the personal computer 33 and the server 31 and between the server 31 and the mobile terminal 35 may be monitored at a time by sending a packet from the personal computer 33 to the mobile terminal 35 via the server 31.
- in addition to bandwidth monitoring, line traffic monitoring may be performed based on transmission delay.
- the server 31 determines the data amount (usage amount) that can be stably transmitted / received per unit time based on the prescribed information amount and the bandwidth monitoring information.
- the server 31 may transmit the monitoring result to the mobile terminal 35, and the user may confirm the bandwidth monitoring result on the mobile terminal 35 and designate the resolution of the moving image to be reproduced. In that case, the designated resolution information is used in distribution quality monitoring.
- FIG. 14 shows an example of the configuration of the server 31 in this embodiment.
- the server 31 includes a decoding processing unit 131, an arithmetic processing unit 132, a storage 133, a content server 134, and a streaming server 135.
- the decode processing unit 131 decodes the moving image data uploaded from a terminal device such as the personal computer 33; this decoding processing restores the moving image data.
- the arithmetic processing unit 132 performs bandwidth monitoring and distribution quality management, and performs video frame deletion processing and meta information change processing according to the result.
- the storage 133 stores an operating system, middleware, and applications, which are read into the memory by the arithmetic processing unit 132 and executed.
- the content server 134 manages content prepared for streaming reproduction, and the mobile terminal 35 can select a moving image to be reproduced from the managed content.
- the streaming server 135 distributes the moving image data to the mobile terminal 35.
- the moving image data from which the video frames have been deleted is received from the arithmetic processing unit 132.
- the streaming server 135 is divided according to the protocols used for distribution, such as HTTP (HyperText Transfer Protocol) and RTMP.
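- as an illustration of how these components cooperate, the following sketch strings together decoding, frame deletion, meta information change, and streaming (the class and function names are assumptions; the embodiment does not prescribe an implementation):

```python
class StreamingPipeline:
    """Illustrative stand-in for the decode processing unit 131, the
    arithmetic processing unit 132, and the streaming server 135."""

    def __init__(self, decode_fn, select_fn, encode_fn, send_fn):
        self.decode_fn = decode_fn    # decode uploaded data into frames and audio
        self.select_fn = select_fn    # choose deletion-target frames (e.g. by audio level)
        self.encode_fn = encode_fn    # encode the thinned-out video for sending
        self.send_fn = send_fn        # stream the result to the mobile terminal

    def handle(self, uploaded_data, meta_info, target_fps):
        frames, audio = self.decode_fn(uploaded_data)
        deleted = set(self.select_fn(frames, audio, target_fps))
        kept = [f for i, f in enumerate(frames) if i not in deleted]
        meta_info["deleted_frames"] = sorted(deleted)   # deleted video frame information
        self.send_fn(self.encode_fn(kept, audio), meta_info)
```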
- FIG. 15 shows an example of a hardware configuration of the server 31 or the personal computer 33 according to the present embodiment.
- the server 31 or the personal computer 33 includes a CPU (Central Processing Unit) 161, an SDRAM (Synchronous Dynamic Random Access Memory) 162, a serial port 163, a flash memory 164, a digital I / O, and an analog I / O 165.
- the server 31 or the personal computer 33 includes a storage 166, a chip set 167, a communication card 168, a CF (Compact Flash (registered trademark)) interface card 169, and a real time clock 170.
- the CPU 161 uses the SDRAM 162 or the flash memory 164 to execute a program describing the procedure of the above-described flowchart stored in the storage 166. Further, the CPU 161 exchanges data with the communication card 168, the CF interface card 169, and the real time clock 170 via the chip set 167.
- the server 31 or the personal computer 33 inputs and outputs moving image data from the serial port 163, digital I / O, or analog I / O 165 via a communication card.
- the CPU 161 provides some or all of the functions of the monitoring unit 3, the deletion unit 4, the frame information generation unit 5, the transmission unit 6, the reception unit 8, the complementary image generation unit 9, and the video data generation unit 10.
- the CPU 161 performs encoding/decoding of the moving image data, lip synchronization, the authentication operation for establishing a connection, and reproduction of the moving image data.
- the storage 166 provides a part or all of the functions of the storage unit 2.
- the CPU 161 can use the SDRAM 162 as a temporary data storage area (working buffer) for performing frame deletion processing of moving image data and restoration processing of moving image data.
- the SDRAM 162 is not limited to this and can be various RAMs (Random Access Memory).
- the flash memory 164 stores a kernel, applications in the server 31, setting files, and the like.
- the flash memory has an expansion area and can be used as a temporary data storage area (working buffer) for performing frame deletion processing of moving image data and restoration processing of moving image data.
- the CF interface card 169 is used as an auxiliary function for maintenance of the server 31 and the like. Since storage is built into the card, it can also be used for data processing when many personal computers 33 and mobile terminals 35 are handled.
- the real-time clock 170 is a dedicated chip having a function as a computer clock.
- the time stamp of the frame is set according to the clock of the real time clock 170.
- a part of the first information processing apparatus 1 and the second information processing apparatus 7 of the embodiment may be realized by hardware.
- the first information processing device 1 and the second information processing device 7 of the embodiment may be realized by a combination of software and hardware.
- FIG. 16 shows an example of the hardware configuration of the mobile terminal 35 according to the present embodiment.
- the mobile terminal 35 includes a CPU 201, a memory 202, a storage unit 203, a reading unit 204, a communication interface 206, an input / output unit 207, and a display unit 208.
- the CPU 201, the memory 202, the storage unit 203, the reading unit 204, the communication interface 206, the input / output unit 207, and the display unit 208 are connected to each other via a bus 209, for example.
- the CPU 201 uses the memory 202 to execute a program describing the above-described flowchart procedure.
- the CPU 201 provides some or all of the functions of the reception unit 8, the complementary image generation unit 9, and the video data generation unit 10.
- the CPU 201 also performs decoding of the moving image data, lip synchronization, and reproduction of the moving image data.
- the memory 202 is, for example, a semiconductor memory, and includes a RAM area and a ROM area.
- the storage unit 203 is a hard disk, for example. Note that the storage unit 203 may be a semiconductor memory such as a flash memory.
- An application for reproducing moving image data is stored in the memory 202 or the storage unit 203 and executed by the CPU 201.
- the mobile terminal 35 may be configured without the storage unit 203.
- the reading unit 204 accesses the removable recording medium 205 in accordance with an instruction from the CPU 201.
- the removable recording medium 205 is, for example, a semiconductor device (such as a USB memory), a medium to and from which information is input and output by magnetic action (such as a magnetic disk), or a medium to and from which information is input and output by optical action (for example, a CD-ROM or a DVD).
- the communication interface 206 transmits and receives data via the network in accordance with instructions from the CPU 201.
- the communication interface 206 receives moving image data.
- the input / output unit 207 corresponds to, for example, a device that receives an instruction from the user. The user can use the input / output unit 207 to specify the moving image data to be reproduced and the resolution of the moving image data.
- the display unit 208 displays the reproduced moving image data.
- An information processing program for realizing the embodiment is provided to the mobile terminal 35 in the following form, for example. (1) Installed in advance in the storage unit 203. (2) Provided by the removable recording medium 205. (3) Provided via a network.
- FIG. 17 shows an example of the configuration of the information processing system in the present embodiment (modification).
- in this modification, the receiving terminal is a terminal device 37 such as a personal computer 33, for example. Even with the same personal computer 33, image quality deterioration may be observed depending on the bandwidth of the Internet connection. In such a case, the configuration described in the embodiment is applied.
- the modification is an example in which the personal computer 33 performs the frame deletion process.
- the personal computer 33 may perform network bandwidth monitoring, distribution quality monitoring, frame deletion processing, and processing corresponding thereto performed by the server 31 in the present embodiment. Network bandwidth monitoring and distribution quality monitoring may also be performed between the server 31 and the personal computer 33.
- the moving image data is stored in the personal computer 33, but the moving image data may be stored in the server 31 and provided from the server 31 to the mobile terminal 35.
- the mobile terminal 35 may be a thin client terminal.
- the decoding in the present embodiment may be performed in accordance with a standard such as MPEG2.
- the moving image is distributed from the server 31 to the mobile terminal 35 in the streaming format.
- the distribution method is not limited to the streaming format.
- this embodiment is not limited to the embodiment described above, and can take various configurations or embodiments without departing from the gist of the present embodiment.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
Abstract
The objective of the present invention is to improve the reproduction quality of video data transmitted from a storage location over a network with a narrow bandwidth. To achieve this objective, the present invention relates to a first data processing device that: monitors the state of a communication network; and, based on the monitoring results, generates, from first video data having a first frame rate indicating a first number of frames per unit time, second video data at a second frame rate lower than the first frame rate, by deleting successive video frames whose corresponding audio data has an audio level not exceeding a predetermined threshold. The first data processing device further generates frame data relating to the deleted frames, and transmits the second video data together with the audio data and the frame data. A second data processing device generates a complementary image that complements the image of the deleted frames by means of the frame data, and generates video data at the first frame rate by means of the complementary image and the second video data.
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/JP2012/079866 WO2014076823A1 (fr) | 2012-11-16 | 2012-11-16 | Dispositif de traitement de données, système de traitement de données, programme de traitement de données, et procédé pour la transmission et la réception de données vidéo |
| JP2014546811A JP6119765B2 (ja) | 2012-11-16 | 2012-11-16 | 情報処理装置、情報処理システム、情報処理プログラム、及び動画データ送受信方法 |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/JP2012/079866 WO2014076823A1 (fr) | 2012-11-16 | 2012-11-16 | Dispositif de traitement de données, système de traitement de données, programme de traitement de données, et procédé pour la transmission et la réception de données vidéo |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2014076823A1 true WO2014076823A1 (fr) | 2014-05-22 |
Family
ID=50730763
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/JP2012/079866 Ceased WO2014076823A1 (fr) | 2012-11-16 | 2012-11-16 | Dispositif de traitement de données, système de traitement de données, programme de traitement de données, et procédé pour la transmission et la réception de données vidéo |
Country Status (2)
| Country | Link |
|---|---|
| JP (1) | JP6119765B2 (fr) |
| WO (1) | WO2014076823A1 (fr) |
- 2012
- 2012-11-16 WO PCT/JP2012/079866 patent/WO2014076823A1/fr not_active Ceased
- 2012-11-16 JP JP2014546811A patent/JP6119765B2/ja not_active Expired - Fee Related
Patent Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JPH11355380A (ja) * | 1998-06-10 | 1999-12-24 | Matsushita Electric Ind Co Ltd | 無線動画伝送装置 |
| JP2006229618A (ja) * | 2005-02-17 | 2006-08-31 | Ntt Communications Kk | 映像通信システム、映像通信装置、プログラム、及び映像通信方法 |
| JP2010239389A (ja) * | 2009-03-31 | 2010-10-21 | Kddi R & D Laboratories Inc | 映像伝送システム |
Cited By (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2016091425A (ja) * | 2014-11-07 | 2016-05-23 | セイコーエプソン株式会社 | 画像供給装置、画像供給方法およびプログラム |
| CN107659463A (zh) * | 2017-08-24 | 2018-02-02 | 中国科学院计算机网络信息中心 | 流量回放方法、装置及存储介质 |
| JPWO2021172578A1 (fr) * | 2020-02-27 | 2021-09-02 | ||
| JP7627503B2 (ja) | 2020-02-27 | 2025-02-06 | アトモフ株式会社 | 画像表示装置、システム及び方法 |
| JPWO2022085491A1 (fr) * | 2020-10-19 | 2022-04-28 |
Also Published As
| Publication number | Publication date |
|---|---|
| JP6119765B2 (ja) | 2017-04-26 |
| JPWO2014076823A1 (ja) | 2017-01-05 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| EP2908547B1 (fr) | Dispositif de traitement d'informations, système de traitement d'informations, programme de traitement d'informations et procédé de transmission / réception de données d'images animées adaptant le taux de trame de données d'images animées en fonction de la similarité desdites trames | |
| JP7471725B2 (ja) | 画像処理方法、装置、電子機器及びコンピュータプログラム | |
| US11570226B2 (en) | Protocol conversion of a video stream | |
| JP4444358B1 (ja) | プログレッシブダウンロード再生用プログラム | |
| JP6239472B2 (ja) | エンコード装置、デコード装置、ストリーミングシステム、および、ストリーミング方法 | |
| JP6119765B2 (ja) | 情報処理装置、情報処理システム、情報処理プログラム、及び動画データ送受信方法 | |
| JP4526294B2 (ja) | ストリームデータ送信装置、受信装置、プログラムを記録した記録媒体、およびシステム | |
| CN108307248A (zh) | 视频播放方法、装置、计算设备及存储介质 | |
| CN110214448A (zh) | 信息处理装置和方法 | |
| JP5940999B2 (ja) | 映像再生装置、映像配信装置、映像再生方法、映像配信方法及びプログラム | |
| WO2010103963A1 (fr) | Dispositif, procédé et système de traitement d'informations | |
| JP7365212B2 (ja) | 動画再生装置、動画再生システム、および動画再生方法 | |
| US11409415B1 (en) | Frame interpolation for media streaming | |
| US10356159B1 (en) | Enabling playback and request of partial media fragments | |
| JP2010011287A (ja) | 映像伝送方法および端末装置 | |
| JP2012137900A (ja) | 映像出力システム、映像出力方法及びサーバ装置 | |
| KR101603976B1 (ko) | 동영상 파일 결합 방법 및 그 장치 | |
| CN114257771A (zh) | 一种多路音视频的录像回放方法、装置、存储介质和电子设备 | |
| CN118573955B (zh) | 一种基于hls协议的加密方法及系统 | |
| JP2004349743A (ja) | 映像ストリーム切替システム、方法、映像ストリーム切替システムを含む映像監視、映像配信システム | |
| JP7292901B2 (ja) | 送信装置、送信方法、及びプログラム | |
| JPWO2010134479A1 (ja) | 動画表示装置 | |
| JP2005176068A (ja) | 動画像配信システム及びその方法 | |
| JP2016192658A (ja) | 通信システム、通信装置、通信方法および通信制御方法 | |
| JP2016019140A (ja) | コンテンツ転送方法、コンテンツ転送装置、コンテンツ受信装置およびコンテンツ転送プログラム |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 12888476 Country of ref document: EP Kind code of ref document: A1 |
|
| ENP | Entry into the national phase |
Ref document number: 2014546811 Country of ref document: JP Kind code of ref document: A |
|
| NENP | Non-entry into the national phase |
Ref country code: DE |
|
| 122 | Ep: pct application non-entry in european phase |
Ref document number: 12888476 Country of ref document: EP Kind code of ref document: A1 |