CN104410868B - Method for fast aggregation and reading of multiple files in a shared file system - Google Patents
- Publication number
- CN104410868B (application CN201410600003.9A)
- Authority
- CN
- China
- Prior art keywords
- file
- polymerization
- subfile
- metadata
- client
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/239—Interfacing the upstream path of the transmission network, e.g. prioritizing client content requests
- H04N21/2393—Interfacing the upstream path of the transmission network, e.g. prioritizing client content requests involving handling client requests
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/231—Content storage operation, e.g. caching movies for short term storage, replicating data over plural servers, prioritizing data for deletion
- H04N21/23113—Content storage operation, e.g. caching movies for short term storage, replicating data over plural servers, prioritizing data for deletion involving housekeeping operations for stored content, e.g. prioritizing content for deletion because of storage space restrictions
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/234—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
- H04N21/2343—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
- H04N21/234336—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by media transcoding, e.g. video is transformed into a slideshow of still pictures or audio is converted into text
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/433—Content storage operation, e.g. storage operation in response to a pause request, caching operations
- H04N21/4335—Housekeeping operations, e.g. prioritizing content for deletion because of storage space restrictions
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/4402—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
- H04N21/440236—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by media transcoding, e.g. video is transformed into a slideshow of still pictures, audio is converted into text
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Computer And Data Communications (AREA)
Abstract
The invention relates to a method for fast aggregation and reading of multiple files in a shared file system, comprising: fast multi-file aggregation; opening a file; obtaining metadata; reading or modifying the file; and closing the file. "Fast multi-file aggregation" comprises: checking the validity of the subfiles to be aggregated; checking whether the aggregation is an append aggregation; creating the aggregate file; establishing the mapping between the aggregate file and its subfiles; and processing the aggregated subfiles. "Obtaining metadata" comprises: allocating storage space; selecting the processing mode; checking whether the file whose metadata is requested is an aggregate file; obtaining the mapping between the aggregate file and its subfiles; and obtaining the metadata. Using this fast aggregation method, the invention aggregates the files produced by multiple clients into one large file in a short time, or quickly appends multiple files to the end of an existing large file. No file copying takes place during aggregation, and the resulting file is identical to the file that physical aggregation would produce.
Description
Technical field
The present invention relates to a method for fast aggregation and reading of multiple files in a shared file system. It applies to the rapid aggregation of the multiple files produced by cluster computing on a shared file system, such as distributed transcoding and distributed packaging in the audio and video field, and is suited to multi-file aggregation under shared file systems in broadcast and television applications.
Background art
At present, a typical shared file system consists of a metadata server (MDS), shared storage and multiple clients, connected by a LAN and a SAN. The metadata server and the clients access the shared storage directly over the FC or iSCSI protocol, giving the FC-SAN and IP-SAN architectures respectively. Because the SAN uses optical fiber transmission and offers high bandwidth, large capacity and high speed, it is commonly used to transfer files with very large data volumes, such as video files.
Metadata in a SAN shared file system is the data structure that describes how data is organized. It mainly records how a file is divided across the block devices, where it is stored, and some related attributes of the file in the SAN shared file system. The SAN shared file system uses metadata to organize contiguous block-device storage into a file structure. Metadata is very small compared with file data and does not need high transmission bandwidth, so it is carried over the LAN. Metadata in a SAN shared file system is managed centrally by the metadata server. Clients connect to and communicate with the metadata server over TCP/IP on the LAN.
SAN file systems are widely used in the post-production of broadcast television programs. One important scenario is that multiple clients access files in the SAN shared file system at the same time, as in transcoding and packaging systems. Such systems are a mandatory stage in program production and are also highly compute-intensive. With today's high-definition production in particular, they consume a great deal of computing time and reduce production efficiency. Distributed packaging and transcoding technology based on distributed computing emerged in response: it runs cluster computing on the SAN shared file system and greatly shortens processing time compared with single-machine computing. The technology nevertheless has a common shortcoming. When cluster computing finishes, each client has generated a new file in shared storage. One client must then read every file to local storage, physically aggregate the files there into a new large file, and finally write that file back to shared storage for other clients to access. This extra physical aggregation, a by-product of distributed computing, takes a long time, and the more nodes take part in a cluster computation, the more files must be aggregated and the longer the aggregation takes. The long aggregation also slows the whole SAN and reduces, for an extended period, the storage bandwidth that the metadata server can offer for external service. Moreover, physical aggregation reads every file once and writes the new aggregate file once to shared storage (usually a disk array), and these reads and writes consume a large share of the disk array's I/O capacity and of the SAN's transmission bandwidth.
Summary of the invention
To overcome the problems of the prior art, the purpose of the invention is to provide a method for fast aggregation and reading of multiple files in a shared file system. After cluster computing finishes, the method uses a specific aggregation procedure to aggregate the files produced by multiple clients into one large file in a short time, or to quickly append multiple files to the end of an existing large file. The actual file content is not copied during aggregation. For clients, reading and writing the fast-aggregated file works exactly as it does for a file produced by physical aggregation, and the time taken by file aggregation is greatly reduced.
The purpose of the invention is achieved as follows:
A method for fast multi-file aggregation in a shared file system. The hardware system used by the method comprises multiple clients connected to a metadata server and a disk array by a high-speed network that carries large files; the clients are also connected to the metadata server by a network that carries metadata, control and interaction information. The steps of the method are as follows:
Subfile generation step: according to preset aggregation rules, the clients each execute a different part of the same processing task, and each part is generated as an independent subfile;
Fast multi-file aggregation step: the shared file system checks the validity of the subfiles to be aggregated and quickly performs a "logical aggregation", aggregating or appending the subfiles into one large aggregate file.
Further, the "fast multi-file aggregation step" above comprises the following sub-steps:
Sub-step of checking subfile validity: when performing fast aggregation, the file system checks whether every subfile to be aggregated is valid and satisfies the aggregation rules; if not, it exits the fast multi-file aggregation step; if so, it proceeds to the next sub-step;
Sub-step of checking whether the aggregation is an append aggregation: when performing fast aggregation, the file system checks whether the aggregation appends to an existing aggregate file. For an append aggregation it also checks whether the existing aggregate file satisfies the append rules: if it does, the method proceeds directly to the sub-step of establishing the mapping between the aggregate file and its subfiles; if it does not, the method exits the fast multi-file aggregation step. If the aggregation is not an append aggregation, the method proceeds to the next sub-step;
Sub-step of creating the aggregate file: the aggregate file is created from the aggregate file information in the client request and the attribute information of each subfile, and the attributes of the aggregate file are calculated;
Sub-step of establishing the mapping between the aggregate file and its subfiles: the attributes of the aggregate file are updated from the attributes of each subfile to be aggregated, and the mapping between the aggregate file and each subfile is established.
Further, the fast multi-file aggregation step also comprises a sub-step of processing the aggregated subfiles: after aggregation completes, the subfiles are processed so that clients can no longer see their information.
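The aggregation sub-steps above can be sketched in Python. This is a minimal illustration under stated assumptions, not the patented implementation: the `FileMeta` record, the extent-list representation, the 4096-byte block size and the append rule used here (the existing aggregate file must end on a block boundary) are all hypothetical.

```python
from dataclasses import dataclass

BLOCK_SIZE = 4096  # assumed file system block size in bytes


@dataclass
class FileMeta:
    """Hypothetical metadata record kept by the metadata server."""
    name: str
    size: int              # logical length in bytes
    extents: list          # [(start_block, block_count), ...] on shared storage
    hidden: bool = False   # True once absorbed into an aggregate file


def aggregate(catalog, target_name, subfile_names):
    """Logically aggregate subfiles into target_name; no file data is copied.

    Returns the aggregate's metadata, or None if a check fails.
    """
    subs = [catalog.get(name) for name in subfile_names]
    # Sub-step 1: every subfile must exist, be visible, and satisfy the rule
    # that all subfiles except the last are a whole number of blocks.
    if any(s is None or s.hidden for s in subs):
        return None
    if any(s.size % BLOCK_SIZE for s in subs[:-1]):
        return None
    # Sub-step 2: append aggregation if the target already exists.
    target = catalog.get(target_name)
    if target is not None:
        if target.size % BLOCK_SIZE:   # assumed append rule
            return None
    else:
        # Sub-step 3: create an empty aggregate file.
        target = FileMeta(target_name, 0, [])
        catalog[target_name] = target
    # Sub-step 4: update attributes and extend the mapping in subfile order.
    for s in subs:
        target.extents.extend(s.extents)
        target.size += s.size
        s.hidden = True                # Sub-step 5: hide the subfile from clients
    return target
```

Only metadata changes hands in this sketch, which is why the cost does not depend on how large the subfiles are.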
Further, the "subfile generation step" above comprises the following sub-steps:
Sub-step of distributing the processing task: the processing task is split into multiple sub-tasks according to the aggregation rules, and the sub-tasks are assigned to different clients for execution;
Sub-step of computing the subfiles: the clients perform distributed computing on their assigned sub-tasks and each generates its corresponding subfile; if, while a subfile is being wrapped, its length is not an integer multiple of the file system block size, blank data is added to pad the shortfall where necessary.
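The padding rule in the last sub-step amounts to rounding the subfile length up to the next block boundary. A minimal Python sketch follows; the 4096-byte block size and the use of zero bytes as "blank data" are assumptions, since the text leaves the actual filler to the container format.

```python
BLOCK_SIZE = 4096  # assumed file system block size in bytes


def padded_length(length, block_size=BLOCK_SIZE):
    """Round length up to the nearest multiple of block_size."""
    return -(-length // block_size) * block_size


def pad_to_block(data, block_size=BLOCK_SIZE):
    """Append zero bytes (assumed filler) until len(data) is block-aligned."""
    missing = padded_length(len(data), block_size) - len(data)
    return data + b"\x00" * missing
```

A subfile that is already aligned is returned unchanged, so applying the padding twice is harmless.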
Further, in the above method for fast multi-file aggregation in a shared file system, the aggregation rules include that the size of each subfile is an integer multiple of the file system block size;
Further, in the above method for fast multi-file aggregation in a shared file system, the computation includes encoding format conversion and/or rendering and compositing;
Further, in the above method for fast multi-file aggregation in a shared file system, the subfiles are generated separately by means of distributed computing;
Further, in the above method for fast multi-file aggregation in a shared file system, if the processing task is a transcoding task or a packaging task, the video compression coding of the generated target file uses constant bit rate (CBR) encoding;
A method for reading a file produced by the above fast multi-file aggregation in a shared file system, comprising the following steps:
File opening step: the client sends a request to the metadata server to open the file to be read on the disk array;
Metadata processing step: according to the content to be read, the client requests from the metadata server the metadata of the file to be read; the client obtains that metadata and accepts the opportunistic lock assigned to it;
File reading step: the client obtains a metadata segment and, based on it, issues block data read requests to the disk array to read the block data corresponding to that segment; it repeats requesting metadata segments and reading the corresponding block data until all required data has been read;
File closing step: the client sends a request to the metadata server to close the open file handle, completing the read.
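The four read steps can be traced with a small in-memory simulation in Python. Everything here is a hypothetical stand-in: the `MetadataServer` class, its method names, the dictionary used as a disk array and the 4-byte demo block size are illustrative only, and the opportunistic lock is omitted.

```python
class MetadataServer:
    """Toy metadata server: maps a file name to (size, ordered block numbers)."""

    def __init__(self, files):
        self.files = files
        self.open_handles = {}
        self.next_handle = 1

    def open(self, name):
        handle = self.next_handle
        self.next_handle += 1
        self.open_handles[handle] = name
        return handle

    def size(self, handle):
        return self.files[self.open_handles[handle]][0]

    def get_segment(self, handle, first, count):
        """Return up to count block numbers starting at file block index first."""
        blocks = self.files[self.open_handles[handle]][1]
        return blocks[first:first + count]

    def close(self, handle):
        del self.open_handles[handle]


def read_file(mds, disk, name, segment_blocks=2):
    handle = mds.open(name)                      # open step (control path)
    try:
        size = mds.size(handle)
        data = bytearray()
        index = 0
        while len(data) < size:
            # metadata step: fetch the next metadata segment
            segment = mds.get_segment(handle, index, segment_blocks)
            if not segment:
                break
            for block_no in segment:             # read step: blocks from the array
                data += disk[block_no]
            index += len(segment)
        return bytes(data[:size])                # drop padding in the last block
    finally:
        mds.close(handle)                        # close step
```

The client never needs to know whether the blocks belonged to one physical file or to several subfiles; it only follows the block numbers it is given.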
Further, in the "metadata processing step" above, in addition to its normal handling of a client's request for the metadata of the file being operated on, the metadata server also checks whether the file whose metadata is requested is an aggregate file. The step comprises the following sub-steps:
Sub-step of allocating storage space: storage space for metadata is allocated for the client in local memory or on a local hard disk;
Sub-step of selecting the processing mode: the client's metadata request is either placed in a background queue to wait or processed in real time; if it is to wait in the background, the request is placed in the metadata request queue; if it is to be processed in real time, the method proceeds to the next sub-step;
Sub-step of checking whether the file is an aggregate file: the metadata server checks whether the file for which the client requested metadata is an aggregate file; if not, the method goes to the "sub-step of obtaining metadata"; if so, it proceeds to the next sub-step;
Sub-step of obtaining the mapping between the aggregate file and its subfiles: the mapping to the corresponding subfiles is obtained from the metadata information of the aggregate file requested by the client;
Sub-step of obtaining metadata: the metadata of the requested file is obtained and returned to the client.
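A rough Python sketch of the server-side branching in these sub-steps follows. The data layout is assumed for illustration only: `catalog` maps a plain file name to its extent list, `mapping` maps an aggregate file name to its ordered subfile names, and `pending` stands in for the metadata request queue.

```python
from collections import deque


def handle_metadata_request(catalog, mapping, pending, name, realtime=True):
    """Return the extent list for name, or None if the request was queued."""
    if not realtime:
        # Background mode: park the request in the metadata request queue.
        pending.append(name)
        return None
    if name in mapping:
        # Aggregate file: resolve the mapping and concatenate the subfiles'
        # extents in aggregation order.
        extents = []
        for sub in mapping[name]:
            extents.extend(catalog[sub])
        return extents
    # Ordinary file: return its own metadata.
    return list(catalog[name])
```

For example, with `pending = deque()`, a real-time request for an aggregate file returns the subfiles' extents back to back, while a background request only grows the queue.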
The beneficial effects of the invention are as follows. During cluster computing, the fast aggregation method aggregates the files produced by multiple clients into one large file in a short time, or quickly appends multiple files to the end of an existing large file. No file copying takes place during aggregation, and for clients the fast-aggregated file is identical to a physically aggregated one. The invention effectively shortens the time needed to aggregate multiple files when a shared file system is used for computing, and the more files are aggregated, the more pronounced the effect. For transcoding or packaging and compositing in typical audio and video processing, it substantially improves the efficiency of distributed transcoding and packaging. It also speeds up cluster computing and reduces the storage bandwidth consumed. Because distributed transcoding and distributed packaging are used frequently in the broadcast television industry, improving their efficiency improves the efficiency of the whole program production workflow, which is particularly significant for current high-definition program production. Once the sub-tasks of a distributed transcoding or packaging job finish, there is no need to wait for a lengthy physical merge: the logical merge completes almost instantly, and the merged file can be submitted directly for review or broadcast, meeting modern media organizations' need to deliver information as early as possible.
Brief description of the drawings
The invention is further described below with reference to the drawings and embodiments.
Fig. 1 is a schematic diagram of the hardware system used by the shared file system fast file aggregation method of Embodiment 1;
Fig. 2 is a flowchart of the shared file system fast file aggregation method of Embodiment 1;
Fig. 3 is a flowchart of the method for reading an aggregate file in Embodiment 1;
Fig. 4 is a flowchart of the metadata processing step in Embodiment 1.
Detailed description
Embodiment 1:
This embodiment is a method for fast aggregation of multiple files in a shared file system and for reading the aggregate file. The hardware system used by the method comprises multiple clients (Fig. 1 shows only three; in practice there may be more) connected to the metadata server and the disk array by a SAN that carries video files (heavy solid lines and double lines in Fig. 1); the clients are connected to the metadata server by a LAN that carries metadata (thin solid lines in Fig. 1), as shown in Fig. 1.
A client may be an ordinary PC workstation or a server that can connect to the SAN and can process large files such as high-definition video files. The SAN in this embodiment is an optical network built from fiber switches and optical cable, a broadband network with more than 1 Gbit/s of bandwidth that can carry high-definition video files; the SAN may also be built on Gigabit or 10 Gigabit Ethernet. The LAN in this embodiment is an Ethernet built from Ethernet switches and using TCP/IP as its communication protocol; with several hundred Mbit/s of bandwidth it can quickly transfer metadata as well as control and task interaction information. To keep a metadata server failure from disrupting the whole SAN system, a standby metadata server is usually added, so that two metadata servers back each other up and provide redundancy on the basis of synchronized data. The shared storage device in the system is usually a disk array, connected by the SAN to the clients and the metadata server, which access it over the FC or iSCSI protocol.
The basic idea of this embodiment is as follows. During distributed computing, multiple clients generate multiple files according to preset aggregation rules. A specific fast aggregation method then aggregates these files into one large file in a short time, or quickly appends multiple files to the end of an existing large file. No file copying takes place during aggregation, and for clients the resulting file is identical to one produced by physical aggregation. The method is particularly suitable where large volumes of content from distributed transcoding or packaging are generated by distributed computing, merged into one file, and then shared among multiple clients.
The method can be described as follows. After cluster computing, the clients have generated multiple files in shared storage, file1, file2, file3, ..., fileN, of sizes M1, M2, M3, ..., MN. The SAN file system uses the aggregation method to aggregate file1, file2, file3, ..., fileN in a short time into a new large file named mergefile, whose size is M1+M2+M3+...+MN, the sum of the sizes of the aggregated files. The file produced by fast aggregation is identical, in file size and all other attributes, to the file that physical aggregation would produce. This greatly reduces the time needed to aggregate multiple files and also reduces storage bandwidth usage.
In an ordinary SAN system, physical aggregation requires one client to read file1, file2, file3, ..., fileN to the client, physically aggregate them locally into the file mergefile, and finally write mergefile back to shared storage for other clients to access. This embodiment instead performs a "logical aggregation" of file1, file2, file3, ..., fileN within shared storage. Throughout the logical aggregation, all files remain in shared storage and no file is copied or read. This effectively shortens the aggregation time for multiple files when a SAN shared system is used for cluster computing, and the more files are aggregated, the greater the advantage. It also speeds up cluster computing and reduces the load that aggregation places on the disk array's storage bandwidth and on the SAN's transmission bandwidth, so the same disk array and SAN can support more clients. Most importantly, because logical aggregation copies no actual file data, it completes almost instantly, saving valuable time so that television programs can be reviewed and broadcast quickly. With high-definition program files being so large, the time saved in this background processing is especially significant.
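Because mergefile is simply file1 through fileN laid end to end, a byte offset in mergefile can be resolved to a subfile and an offset inside it from the running totals of M1, M2, ..., MN. The Python sketch below illustrates that arithmetic; the function names are hypothetical and the patent does not prescribe this lookup.

```python
from bisect import bisect_right
from itertools import accumulate


def build_index(sizes):
    """Cumulative end offsets: [M1, M1+M2, ..., M1+...+MN]."""
    return list(accumulate(sizes))


def locate(ends, offset):
    """Map an offset in the aggregate file to (subfile index, offset within it)."""
    if not 0 <= offset < ends[-1]:
        raise ValueError("offset outside the aggregate file")
    i = bisect_right(ends, offset)          # first subfile ending after offset
    start = ends[i - 1] if i else 0
    return i, offset - start


ends = build_index([8192, 4096, 1000])      # example sizes M1, M2, M3
total_size = ends[-1]                       # M1 + M2 + M3
```

With the example sizes above, `locate(ends, 8192)` falls on the first byte of the second subfile, since the first subfile occupies offsets 0 to 8191.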
The detailed steps of the fast multi-file aggregation method in a shared file system described in this embodiment are as follows:
1. Subfile generation step: a client splits the processing task into multiple sub-tasks according to the aggregation rules and assigns the sub-tasks to different clients for execution; the clients perform distributed computing on their assigned sub-tasks and each generates its corresponding subfile; if, while a subfile is being wrapped, its length is not an integer multiple of the file system block size, blank data is added to pad the shortfall where necessary. The generated subfiles are written to shared storage over the SAN. The specific implementation comprises:
(1) A client splits the processing task into multiple sub-tasks according to the aggregation rules and assigns them to different clients for execution. Normally a client with management authority divides a large computing task into sub-tasks and hands them to several clients to compute in parallel. Distributing the task requires splitting it in advance according to the aggregation requirements and rules, so that the subfile each client finally generates meets the requirements for fast aggregation. For example, to transcode a program 2 hours long, the management client assigns different parts of the program to different clients according to the number of currently idle clients; the clients transcode simultaneously, each generating the subfile for its part.
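The 2-hour example can be made concrete with a short Python sketch that divides a program into contiguous frame ranges, one per idle client. The frame rate of 25 frames per second and the choice of four clients are assumptions for illustration; block alignment of the resulting subfiles is a separate concern and is not handled here.

```python
def split_program(total_frames, clients):
    """Split [0, total_frames) into `clients` contiguous, near-equal ranges."""
    base, extra = divmod(total_frames, clients)
    ranges, start = [], 0
    for i in range(clients):
        length = base + (1 if i < extra else 0)   # spread the remainder
        ranges.append((start, start + length))
        start += length
    return ranges


FPS = 25                               # assumed frame rate
total_frames = 2 * 60 * 60 * FPS       # a 2-hour program
parts = split_program(total_frames, 4) # four idle clients (assumed)
```

Each tuple is a half-open frame range, so consecutive ranges share a boundary and no frame is transcoded twice.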
The split should give every client roughly the same amount of computation, so that all clients finish the transcoding task at the same time and in the shortest possible time, and the result can be handed to the next stage of the workflow. To satisfy rapid aggregation, the size of every subfile except the last must also be an integer multiple of the file system block size. Normally the video of the transcoding target file is encoded in a constant bit rate (CBR) compression format, so the file size of each transcoded segment is easy to determine and the task is comparatively easy to split. Some audio/video container formats allow blank padding data to be added at the end of the actual audio/video data of each frame, which makes it possible to bring the subfile size up to an integer multiple of the file system block size. Therefore, for a transcoding task or a packaging and synthesis task whose final file merge is to be accelerated by rapid file aggregation, a CBR compression format can be selected for the video coding of the target file, with AVI or MXF OP1A as the audio/video container format. For container formats that do not allow blank padding data to be inserted between the audio and video data, the management client must calculate the exact file size at the end of each frame and find in and out points at which the subfile size is exactly an integer multiple of the file system block size; these points define the sub-computing tasks given to the other clients.
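The splitting rule above can be illustrated for the CBR case, where the output size is known in advance. This is a minimal sketch, not the patented implementation: the function name and the idea of splitting by whole blocks are assumptions made for illustration, and it ignores frame boundaries, which a real splitter would also have to respect.

```python
def split_into_block_aligned_sizes(total_bytes: int, block_size: int, n_clients: int) -> list[int]:
    """Split total_bytes into per-client subfile sizes (hypothetical helper).

    Every size except the last is a whole number of blocks, and the
    workload is spread as evenly as block granularity allows.
    """
    if total_bytes <= 0 or block_size <= 0 or n_clients <= 0:
        raise ValueError("all arguments must be positive")
    total_blocks = -(-total_bytes // block_size)  # ceiling division
    parts = min(n_clients, total_blocks)          # never create an empty subfile
    base, extra = divmod(total_blocks, parts)
    # The first `extra` parts carry one more block than the rest.
    blocks = [base + 1 if i < extra else base for i in range(parts)]
    sizes = [b * block_size for b in blocks]
    # Only the last subfile may fall short of a whole number of blocks.
    sizes[-1] -= total_blocks * block_size - total_bytes
    return sizes
```

With a 4096-byte block, `split_into_block_aligned_sizes(10_000, 4096, 2)` gives one subfile of two full blocks and a shorter final subfile, and the sizes add up to the original total.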
When N clients are available to execute subtasks, the computing task can be divided into N parts, each client executing one part, so that N subfiles are generated and rapidly merged; in this case the management client usually requests rapid file aggregation from the file system once all clients have completed their computing tasks. Alternatively, M subtasks can be generated, with M much greater than N; the clients execute the subtasks in order from front to back, while the management client monitors the generation of subfiles and submits rapid file aggregation requests to the file system at any time. In this case the subfiles are usually aggregated by appending, and the final rapidly aggregated file can be handed to the next stage only when all subtasks have been completed.
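For the case of M subtasks, the management client has to decide which finished subfiles can be appended at a given moment. The text does not spell out this decision; the sketch below assumes that subfiles must be appended in program order, so only a contiguous run of finished subfiles can be submitted. The function name and parameters are hypothetical.

```python
def next_appendable(done: set[int], already_aggregated: int) -> list[int]:
    """Indices of finished subfiles that can be appended now, in order.

    `already_aggregated` is how many leading subfiles the aggregate file
    already holds. Only a contiguous run starting at that index is
    returned, because the aggregate keeps the subfiles in program order.
    """
    batch = []
    index = already_aggregated
    while index in done:
        batch.append(index)
        index += 1
    return batch
```

If subtasks 0, 1 and 3 are finished and nothing has been aggregated yet, only 0 and 1 are submitted; subfile 3 waits until subfile 2 exists.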
(2) Subfile calculation sub-step: multiple clients perform distributed computation on the divided segments. The coding format is a constant bit rate video compression coding; each file is generated as an integer multiple of the file system block size, with blank data padded into the shortfall. For a generated subfile to reach its preset size, the size of the generated data must be determined during encoding: if it is already an integer multiple of the file system block size, it is packaged directly into a file; if not, blank data is appended after the data until the requirement is met. Because the requirements of the rapid aggregation rules were already taken into account when the computing task was split, a client only needs to execute its subtask and add blank data where necessary to generate a subfile that meets the requirements.
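The padding rule described above amounts to a small calculation. The sketch below is illustrative only: the function name is hypothetical, and zero bytes are assumed as the blank data, whereas the bytes a real container format accepts as padding depend on that format.

```python
def pad_to_block_multiple(payload: bytes, block_size: int, fill: bytes = b"\x00") -> bytes:
    """Append blank data so that len(result) is a multiple of block_size."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    if len(fill) != 1:
        raise ValueError("fill must be a single byte")
    shortfall = -len(payload) % block_size  # 0 when already aligned
    return payload + fill * shortfall
```

Data that already ends on a block boundary is returned unchanged, which matches the case where the subfile can be packaged directly.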
2. Step of preparing multi-file rapid aggregation: the information of each subfile that is to be aggregated is extracted.
3. Step of multi-file rapid aggregation: the file system checks the validity of the subfiles to be aggregated and quickly performs "logical aggregation", rapidly aggregating the subfiles into one large aggregate file or rapidly appending them to one.
This step differs from traditional physical aggregation as follows. Traditional physical aggregation is done by copying: the first file to be aggregated is read first, and the other files are then read one after another in aggregation order until all files have been aggregated. This step does not copy; it performs a quick "logical aggregation" by establishing a mapping between the aggregate file and the aggregated subfiles. The multi-file rapid aggregation step therefore includes the following sub-steps:
(1) Sub-step of checking the validity of the aggregated subfiles: when performing rapid file aggregation, the file system checks whether all subfiles to be aggregated are valid and satisfy the aggregation rules. If not, the multi-file rapid aggregation step is exited; if so, the next sub-step is entered.
This sub-step is a judgment step that determines whether all subfiles to be aggregated are valid and whether they satisfy the aggregation rules. All subfiles to be aggregated must satisfy certain rules of the file system: except for the last aggregated subfile, the size of every aggregated subfile must be an integer multiple of the file system block size, and no other client may be operating on these subfiles. Only then can the aggregated file be accessed correctly by clients. Because "logical aggregation" is used, file data is neither moved nor copied during aggregation, which is why every aggregated subfile except the last must have a size that is an integer multiple of the file system block size; otherwise "holes" appear in the aggregated file and cause problems when the file is read. If the aggregation rules are not satisfied, the rapid file aggregation step is exited.
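The validity check above can be sketched as follows. The `Subfile` record and its `in_use` flag are hypothetical stand-ins for whatever state the file system actually keeps; treating an empty list or an empty subfile as invalid is also an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subfile:
    name: str
    size: int             # size in bytes
    in_use: bool = False  # True if another client is operating on it


def subfiles_are_valid(subfiles: list[Subfile], block_size: int) -> bool:
    """Check the aggregation rules for a list given in aggregation order."""
    if not subfiles or block_size <= 0:
        return False
    last = len(subfiles) - 1
    for i, sub in enumerate(subfiles):
        if sub.size <= 0 or sub.in_use:
            return False
        # Only the last subfile may end in the middle of a block.
        if i != last and sub.size % block_size != 0:
            return False
    return True
```

A short subfile anywhere but at the end fails the check, since it would leave a hole in the aggregate file.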
(2) Sub-step of detecting whether the aggregation is an append aggregation: when performing rapid file aggregation, the file system detects whether the aggregation appends to an existing aggregate file. If it is an append aggregation, the file system checks whether the existing aggregate file satisfies the append aggregation rules: if it does, the relevant attribute information of the aggregate file is updated and the sub-step of establishing the mapping between the aggregate file and the aggregated subfiles is entered; otherwise multi-file rapid aggregation is exited. If it is not an append aggregation, the next sub-step is entered.
In append aggregation, the size of the last aggregated subfile already in the aggregate file must be an integer multiple of the file system block size; otherwise "holes" appear in the file after the append and cause problems when the file is read. This sub-step is a judgment step that determines whether the aggregation is an append aggregation. If it is, the existing large aggregate file must satisfy certain rules of the SAN file system, namely that the size of the last aggregated subfile in the existing aggregate file is an integer multiple of the file system block size; only then can the other subfiles be appended to the existing large aggregate file. If these conditions are met, all subfiles to be aggregated can be correctly aggregated or appended; otherwise multi-file rapid aggregation is exited.
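The branch taken in this sub-step can be written as a three-way decision. This is a sketch under assumptions: the function name and the string results are invented for illustration, and `None` is used to mean that no aggregate file exists yet.

```python
from typing import Optional


def plan_aggregation(existing_sizes: Optional[list[int]], block_size: int) -> str:
    """Decide how to proceed, given the subfile sizes already in the aggregate.

    Returns "create" for a new aggregate file, "append" when the existing
    one can take more subfiles, and "reject" when appending would leave a hole.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    if existing_sizes is None:
        return "create"   # not an append aggregation
    if existing_sizes and existing_sizes[-1] % block_size != 0:
        return "reject"   # last existing subfile ends mid-block
    return "append"
```

One consequence of this rule is that a subfile with a short tail can only ever be the final append: once it is in the aggregate file, no further subfile can follow it.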
(3) Sub-step of creating the aggregate file: the aggregate file is created according to the aggregate file information requested by the client and the attribute information of each subfile, and the relevant attribute information of the aggregate file is calculated. The attribute information here mainly means the size of the aggregate file, which is obtained as the sum of the sizes of the subfiles.
(4) Sub-step of establishing the mapping between the aggregate file and the aggregated subfiles: the mapping between the aggregate file and each subfile is established according to the relevant attribute information of each subfile to be aggregated.
This sub-step is the key step of multi-file rapid aggregation; whether the aggregate file can be accessed correctly depends on it. Because the aggregate file is produced by "logical aggregation", each subfile actually exists separately and independently in the file system and has no relation to the other subfiles or to the aggregate file. To access the aggregate file, a relation must therefore be established between the aggregate file and all of its subfiles before the whole aggregate file can be accessed normally. Here the mapping between the aggregate file and the subfiles is established by means of an index. The index file first stores the aggregate file information, including the name and size of the aggregate file and the number of files aggregated, and then stores, for each subfile in turn, its name, its offset within the aggregate file and its size. When the aggregate file is accessed, the offset being accessed in the aggregate file quickly and accurately locates the corresponding subfile, and the access is completed by reading the corresponding information of that subfile.
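The index described above, and the lookup from an aggregate offset to a subfile, can be sketched as follows. The class and function names are hypothetical, and binary search is only one reasonable way to do the lookup; the text does not specify the on-disk layout of the index file or the search method.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    name: str
    offset: int  # where the subfile starts inside the aggregate file
    size: int


@dataclass(frozen=True)
class AggregateIndex:
    name: str
    size: int    # sum of all subfile sizes
    entries: tuple[IndexEntry, ...]


def build_index(aggregate_name: str, subfiles: list[tuple[str, int]]) -> AggregateIndex:
    """Build the index from (subfile name, size) pairs in aggregation order."""
    entries = []
    offset = 0
    for name, size in subfiles:
        entries.append(IndexEntry(name, offset, size))
        offset += size
    return AggregateIndex(aggregate_name, offset, tuple(entries))


def locate(index: AggregateIndex, aggregate_offset: int) -> tuple[str, int]:
    """Map an offset in the aggregate file to (subfile name, offset in subfile)."""
    if not 0 <= aggregate_offset < index.size:
        raise ValueError("offset outside the aggregate file")
    starts = [entry.offset for entry in index.entries]
    entry = index.entries[bisect_right(starts, aggregate_offset) - 1]
    return entry.name, aggregate_offset - entry.offset
```

No subfile data is touched when the index is built, which is the point of logical aggregation: only names, offsets and sizes are recorded.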
(5) Sub-step of processing the aggregated subfiles: after aggregation is complete, the aggregated subfiles are processed so that clients can no longer see their information. Specifically, the aggregated subfiles are given a special mark: the metadata server of the file system hides them, so that clients cannot access these aggregated subfiles directly.
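The effect of hiding the aggregated subfiles can be modeled with a toy file table. Everything here is hypothetical: the class stands in for the metadata server's namespace, and a per-file boolean stands in for the "special mark", whose real form the text does not describe.

```python
class MetadataNamespace:
    """Toy stand-in for the metadata server's file table."""

    def __init__(self) -> None:
        self._hidden: dict[str, bool] = {}

    def create(self, name: str) -> None:
        self._hidden[name] = False

    def hide(self, names: list[str]) -> None:
        """Mark aggregated subfiles so clients no longer see them."""
        for name in names:
            if name not in self._hidden:
                raise FileNotFoundError(name)
        for name in names:
            self._hidden[name] = True

    def list_for_client(self) -> list[str]:
        return sorted(n for n, hidden in self._hidden.items() if not hidden)

    def open_for_client(self, name: str) -> str:
        # Hidden and unknown files look the same to a client.
        if self._hidden.get(name, True):
            raise FileNotFoundError(name)
        return name
```

The subfiles stay in the table, because the aggregate file still depends on their data; they are only removed from the client's view.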
This embodiment also includes a method for reading the aggregate file, which comprises the following steps:
1. Step of opening the file: the client sends a request to the metadata server to open the file to be read in the disk array. This is a basic step: when a user needs to read a file, the user opens a handle to the file on the client, and the client sends the related operation requests to the metadata server according to that handle.
2. Step of processing metadata: according to the content of the file to be read, the client applies to the metadata server for the metadata of the file to be operated on; the client obtains the corresponding metadata information and at the same time receives the opportunistic lock assigned to it.
This step differs from metadata acquisition in a traditional SAN shared file system as follows: a traditional SAN file system obtains the metadata for the requested file content directly, whereas this step must additionally determine whether the file to be operated on is an aggregate file before obtaining its metadata in the normal way. The client metadata processing in this step therefore includes the following sub-steps:
(1) Sub-step of allocating storage space: the client allocates storage space for metadata in local memory or on the local hard disk. Simultaneous failure of both metadata servers is uncommon, so the storage space can be allocated on the hard disk as needed.
(2) Sub-step of selecting the processing mode: the client chooses whether a metadata application request waits in a background queue or is processed in real time. If it is to wait in the background, the request is put into the metadata request queue; if it is to be processed in real time, the next sub-step is entered. A client can handle several file reads and writes at once, and several file read/write threads can be created to improve efficiency, so there is an operation queue served by multiple threads, from which requests are taken out and processed. For background operation, the following steps are added to the request queue and handed to the previously created processing thread; otherwise they are processed directly in the current thread.
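The choice between background queuing and real-time processing can be sketched with the standard `queue` and `threading` modules. The class name and its methods are hypothetical, and a single worker thread is used for brevity where the text allows several.

```python
import queue
import threading


class MetadataRequestDispatcher:
    """Run a request in the calling thread or hand it to a background worker."""

    def __init__(self, handler) -> None:
        self._handler = handler
        self._pending: queue.Queue = queue.Queue()
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def _drain(self) -> None:
        while True:
            request = self._pending.get()
            if request is None:  # sentinel: stop the worker
                return
            self._handler(request)

    def submit(self, request, background: bool) -> None:
        if background:
            self._pending.put(request)  # wait in the metadata request queue
        else:
            self._handler(request)      # process in real time, in this thread

    def close(self) -> None:
        """Process everything still queued, then stop the worker."""
        self._pending.put(None)
        self._worker.join()
```

A request submitted with `background=False` has been handled by the time `submit` returns; a queued request is only guaranteed to be handled once `close` has returned.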
(3) Sub-step of detecting whether the file whose metadata is requested is an aggregate file: the metadata server detects whether the file for which the client has requested metadata is an aggregate file. If not, the "sub-step of obtaining metadata" is entered; if so, the next sub-step is entered. This is a detection and judgment step. If the request is for the metadata of an aggregate file, the metadata of the corresponding subfiles must be obtained through the mapping between the aggregate file and the aggregated subfiles, so the next sub-step is entered. If the request is for the metadata of a non-aggregate file, the metadata is obtained directly according to the request, and the sub-step of obtaining metadata is entered.
(4) Sub-step of obtaining the mapping between the aggregate file and the aggregated subfiles: the mapping to the corresponding aggregated subfiles is obtained according to the metadata of the aggregate file requested by the client.
(5) Sub-step of obtaining metadata: the metadata server obtains the metadata of the file to be operated on according to the communication rules and returns the metadata information to the client. Obtaining the metadata of an aggregate file follows the same process as for an ordinary file, except that the subfile to which the requested metadata belongs must first be determined; the corresponding metadata of that subfile is then obtained and returned to the client, and this is repeated until the metadata for all required data has been obtained. With this sub-step the whole "step of processing metadata" ends.
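Sub-steps (3) to (5) can be summarized as resolving a requested byte range into per-subfile segments. The sketch below is an assumption-laden model: the two dictionaries stand in for the metadata server's tables, and a "(name, offset, length)" tuple stands in for real block-level metadata.

```python
def resolve_range(name, offset, length, aggregates, plain_files):
    """Return (file name, offset in that file, length) segments for a read.

    aggregates:  aggregate name -> list of (subfile name, size) in order
    plain_files: ordinary file name -> size
    """
    if offset < 0 or length < 0:
        raise ValueError("offset and length must be non-negative")
    end = offset + length
    if name not in aggregates:
        # Ordinary file: the request maps onto the file itself.
        stop = min(end, plain_files[name])
        return [(name, offset, stop - offset)] if stop > offset else []
    segments = []
    start = 0  # offset of the current subfile inside the aggregate
    for sub_name, sub_size in aggregates[name]:
        lo = max(offset, start)
        hi = min(end, start + sub_size)
        if lo < hi:  # the request overlaps this subfile
            segments.append((sub_name, lo - start, hi - lo))
        start += sub_size
    return segments
```

A request that crosses a subfile boundary comes back as two segments, which corresponds to the loop in sub-step (5) that continues until the metadata for all required data has been obtained.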
3. Step of reading the file: the client obtains a metadata segment and, according to it, sends the disk array a request for the block data of the file, completing the read of the block data that corresponds to that metadata segment. The client repeatedly applies for a metadata segment and reads the corresponding block data until all required data has been read.
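The read loop above alternates between asking for a metadata segment and reading the blocks it describes. In this sketch the two callables are hypothetical stand-ins: `get_segments` plays the metadata server and `read_blocks` plays the disk array, and `segment_bytes` is an assumed limit on how much metadata is requested per round.

```python
def read_aggregate(get_segments, read_blocks, offset, length, segment_bytes):
    """Read `length` bytes starting at `offset`, one metadata segment at a time.

    get_segments(offset, length) -> list of (file name, file offset, length)
    read_blocks(file name, file offset, length) -> bytes
    """
    if segment_bytes <= 0:
        raise ValueError("segment_bytes must be positive")
    data = bytearray()
    position = offset
    end = offset + length
    while position < end:
        want = min(segment_bytes, end - position)
        segments = get_segments(position, want)  # apply for a metadata segment
        if not segments:
            break  # nothing left to read: end of file
        for file_name, file_offset, seg_len in segments:
            if seg_len <= 0:
                raise ValueError("metadata segment must cover at least one byte")
            data += read_blocks(file_name, file_offset, seg_len)
            position += seg_len
    return bytes(data)
```

The caller never needs to know whether the file is an aggregate; that distinction is resolved entirely on the metadata side.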
4. Step of closing the file: the client sends a request to the metadata server to close the open file handle, completing the read of this file.
Finally, it should be noted that the above merely illustrates the technical solution of the present invention and does not limit it. Although the present invention has been described in detail with reference to a preferred arrangement, those of ordinary skill in the art will understand that the technical solution of the present invention (for example the way metadata is obtained, the way files are read, or the order of the steps) may be modified or equivalently replaced without departing from the spirit and scope of the technical solution. The method of the present invention can be written as a program for a computer system and run in the computer network system of the present invention.
Claims (8)
1. A method for rapid multi-file aggregation in a shared file system, wherein the hardware system used by the method includes multiple clients connected to a metadata server and a disk array through a high-speed network that transmits large files, the multiple clients also being connected to the metadata server through a network that transmits metadata and control and interaction information, the steps of the method being as follows:
Step of generating subfiles: multiple clients each execute a different part of the same processing task according to preset aggregation rules, each part being generated as an independent subfile;
Step of multi-file rapid aggregation: the shared file system checks the validity of the multiple subfiles to be aggregated and quickly performs "logical aggregation", rapidly aggregating the subfiles into one aggregate file or rapidly appending them to one, where "logical aggregation" means that during aggregation all files remain in the shared storage and no file is copied or read;
characterized in that
the "step of multi-file rapid aggregation" includes the following sub-steps:
Sub-step of checking the validity of the aggregated subfiles: when performing rapid file aggregation, the file system checks whether all subfiles to be aggregated are valid and satisfy the aggregation rules; if not, the multi-file rapid aggregation step is exited, and if so, the next sub-step is entered;
Sub-step of detecting whether the aggregation is an append aggregation: when performing rapid file aggregation, the file system detects whether the aggregation appends to an existing aggregate file; if it is an append aggregation, the file system checks whether the existing aggregate file satisfies the append aggregation rules; if it is an append aggregation and the append aggregation rules are satisfied, the sub-step of establishing the mapping between the aggregate file and the aggregated subfiles is entered; if it is an append aggregation but the append aggregation rules are not satisfied, the multi-file rapid aggregation step is exited; if it is not an append aggregation, the next sub-step is entered;
Sub-step of creating the aggregate file: the aggregate file is created according to the aggregate file information requested by the client and the attribute information of each subfile, and the relevant attribute information of the aggregate file is calculated;
Sub-step of establishing the mapping between the aggregate file and the aggregated subfiles: according to the relevant attribute information of each subfile to be aggregated, the relevant attribute information of the aggregate file is updated and the mapping between the aggregate file and each subfile is established.
2. The method for rapid multi-file aggregation in a shared file system as claimed in claim 1, characterized in that the step of multi-file rapid aggregation further includes a sub-step of processing the aggregated subfiles: after aggregation is complete, the aggregated subfiles are processed so that clients can no longer see the aggregated subfile information.
3. The method for rapid multi-file aggregation in a shared file system as claimed in claim 1, characterized in that the "step of generating subfiles" includes the following sub-steps:
Sub-step of processing task distribution: the same processing task is split into multiple sub-processing tasks according to the requirements of the aggregation rules, and the subtasks are dispatched to different clients for execution;
Sub-step of calculating subfiles: multiple clients perform distributed computation on the distributed sub-processing tasks, each generating its corresponding subfile; when a subfile is packaged, if its length is not an integer multiple of the file system block size, blank data is padded into the shortfall as necessary.
4. The method for rapid multi-file aggregation in a shared file system as claimed in claim 1, characterized in that the aggregation rules include that the size of a subfile is an integer multiple of the file system block size.
5. The method for rapid multi-file aggregation in a shared file system as claimed in claim 1, characterized in that the computation includes coding format conversion and/or rendering and synthesis.
6. The method for rapid multi-file aggregation in a shared file system as claimed in claim 1, characterized in that the separate generation is performed by distributed computing.
7. The method for rapid multi-file aggregation in a shared file system as claimed in claim 1, characterized in that, if the processing task is a transcoding task or a packaging task, the video compression coding of the generated target file uses a constant bit rate coding mode.
8. A method for reading the aggregate file as claimed in claim 1, the steps of the method including:
Step of opening the file: the client sends a request to the metadata server to open the file to be read in the disk array;
Step of processing metadata: according to the content of the file to be read, the client applies to the metadata server for the metadata of the file to be read; the client obtains the corresponding metadata and at the same time receives the opportunistic lock assigned to it;
Step of reading the file: the client obtains a metadata segment and, according to it, sends the disk array a request for the block data of the file, completing the read of the block data that corresponds to that metadata segment; the client repeatedly applies for a metadata segment and reads the corresponding block data until all required data has been read;
Step of closing the file: the client sends a request to the metadata server to close the open read file handle, completing the read of this file;
characterized in that, in the "step of processing metadata", in addition to normally processing the client's request for the metadata of the file to be operated on, the metadata server also detects whether the file whose metadata is requested is an aggregate file, the step including the following sub-steps:
Sub-step of allocating storage space: the client allocates storage space for metadata in local memory or on the local hard disk;
Sub-step of selecting the processing mode: the client chooses whether a metadata application request waits in a background queue or is processed in real time; if it is to wait in the background, the request is put into the metadata request queue, and if it is to be processed in real time, the next sub-step is entered;
Sub-step of detecting whether the file whose metadata is requested is an aggregate file: the metadata server detects whether the file for which the client has requested metadata is an aggregate file; if not, the "sub-step of obtaining metadata" is entered, and if so, the next sub-step is entered;
Sub-step of obtaining the mapping between the aggregate file and the aggregated subfiles: the mapping to the corresponding aggregated subfiles is obtained according to the metadata information of the aggregate file requested by the client;
Sub-step of obtaining metadata: the metadata of the corresponding file is obtained according to the request and returned to the client.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201410600003.9A CN104410868B (en) | 2014-10-31 | 2014-10-31 | A kind of shared-file system multifile rapid polymerization and the method read |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201410600003.9A CN104410868B (en) | 2014-10-31 | 2014-10-31 | A kind of shared-file system multifile rapid polymerization and the method read |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN104410868A CN104410868A (en) | 2015-03-11 |
| CN104410868B true CN104410868B (en) | 2017-11-17 |
Family
ID=52648452
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201410600003.9A Expired - Fee Related CN104410868B (en) | 2014-10-31 | 2014-10-31 | A kind of shared-file system multifile rapid polymerization and the method read |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN104410868B (en) |
Families Citing this family (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN104837071A (en) * | 2015-05-11 | 2015-08-12 | 华为软件技术有限公司 | Method and device for pushing file |
| CN109391787A (en) * | 2018-09-30 | 2019-02-26 | 武汉中科通达高新技术股份有限公司 | File format, image polymerization and read method |
| CN111176578B (en) * | 2019-12-29 | 2022-03-22 | 浪潮电子信息产业股份有限公司 | Object aggregation method, apparatus, device and readable storage medium |
| WO2021223174A1 (en) | 2020-05-07 | 2021-11-11 | Citrix Systems, Inc. | Task shifting between computing devices |
| CN114756173B (en) * | 2022-04-15 | 2025-08-19 | 京东科技信息技术有限公司 | Method, system, apparatus and computer readable medium for file merging |
| CN116340266A (en) * | 2023-03-28 | 2023-06-27 | 上海科技大学 | Fine-grained file system and file read-write method |
Citations (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN102123279A (en) * | 2010-12-28 | 2011-07-13 | 乐视网信息技术(北京)股份有限公司 | Distributed real-time transcoding method and system |
| US8055623B2 (en) * | 2005-12-01 | 2011-11-08 | International Business Machines Corporation | Article of manufacture and system for merging metadata on files in a backup storage |
| CN102546776A (en) * | 2011-12-27 | 2012-07-04 | 北京中科大洋科技发展股份有限公司 | Method for realizing off-line reading files in SAN (Storage Area Networking) shared file system |
| CN103237037A (en) * | 2013-05-08 | 2013-08-07 | 华迪计算机集团有限公司 | Media format conversion method and system based on cloud computing architecture |
| CN103297807A (en) * | 2013-06-21 | 2013-09-11 | 哈尔滨工业大学深圳研究生院 | Hadoop-platform-based method for improving video transcoding efficiency |
| CN103312803A (en) * | 2013-06-17 | 2013-09-18 | 杭州华三通信技术有限公司 | Method and device for optimizing Web access experience |
| CN103716413A (en) * | 2014-01-13 | 2014-04-09 | 浪潮(北京)电子信息产业有限公司 | Acceleration method for mass small document IO operation transmission in distribution type document system |
Non-Patent Citations (1)
| Title |
|---|
| "基于集群运算的在线制作协同环境设计研究";孙思慧;《中国优秀硕士学位论文全文数据库信息科技辑》;20101015;I136-377 * |
Also Published As
| Publication number | Publication date |
|---|---|
| CN104410868A (en) | 2015-03-11 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | C06 | Publication | |
| | PB01 | Publication | |
| | C10 | Entry into substantive examination | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |
| | CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20171117 Termination date: 20201031 |