Summary of the invention
The purpose of this invention is to provide a heuristic data partitioning method that not only roughly balances the load across the processors but also minimizes the overlapping data that must be stored, thereby improving the performance of the whole system.
To achieve the above object, the shape-adaptive heuristic data partitioning method for MPEG-4 parallel video encoding comprises the steps of:
selecting heuristic data partitioning algorithm 1 or heuristic data partitioning algorithm 2 according to the interconnection mode of the processors;
and heuristically dividing the VOP data, according to the rules of the selected algorithm, into as many sub-regions as there are processors.
Based on the characteristics of shape-based coding in MPEG-4, the present invention adopts a heuristic data partitioning method to optimize the distribution of VOP data across the processors, so that the load on the processors is relatively balanced, the overlapping data that must be stored is minimized, and the data transfer time is reduced, thereby improving the efficiency of the whole parallel video encoding system.
Embodiment
The technical solution adopted by the present invention is to use different heuristic data partitioning algorithms according to the interconnection mode of the processors.
First, suppose that T processors are available for VOP encoding. The region whose data is to be divided is then split, using a macroblock row or macroblock column of the region as the smallest unit of division, into two rectangular regions such that the ratio of the numbers of macroblocks to be coded in them is as close as possible to T/2 : T/2+1 (T odd, with T/2 rounded down) or T/2 : T/2 (T even). The same procedure is then applied to each resulting region until every processor has been assigned one region for parallel encoding. In this way, the numbers of macroblocks to be coded in the resulting regions are essentially the same, and the overlapping data that the whole system must store is minimized.
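The recursive division described above can be sketched as follows. This is only an illustrative reading of the text, not the claimed implementation: the function names (count_coded, best_cut, partition), the half-open region tuples (r0, r1, c0, c1) and the tie-breaking rule (the first cut with the smallest deviation wins) are assumptions, and the blank-column rule and tight-rectangle step of the specific algorithms are omitted.

```python
def count_coded(mb, region):
    """Number of macroblocks to be coded (value 1) inside a region.

    region = (r0, r1, c0, c1), rows r0..r1-1 and columns c0..c1-1.
    """
    r0, r1, c0, c1 = region
    return sum(mb[r][c] for r in range(r0, r1) for c in range(c0, c1))


def best_cut(mb, region, t_left, t_right):
    """Cut the region along a macroblock row or column boundary so that
    the coded-macroblock counts are as close as possible to t_left : t_right."""
    r0, r1, c0, c1 = region
    target = count_coded(mb, region) * t_left / (t_left + t_right)
    # Candidate cuts between rows, then between columns.
    cuts = [((r0, k, c0, c1), (k, r1, c0, c1)) for k in range(r0 + 1, r1)]
    cuts += [((r0, r1, c0, k), (r0, r1, k, c1)) for k in range(c0 + 1, c1)]
    if not cuts:
        raise ValueError("region is a single macroblock and cannot be cut")
    return min(cuts, key=lambda ab: abs(count_coded(mb, ab[0]) - target))


def partition(mb, region, t):
    """Divide the region into t rectangles, one per processor."""
    if t == 1:
        return [region]
    t_left = t // 2          # T/2 rounded down
    t_right = t - t_left     # T/2 (T even) or T/2 + 1 (T odd)
    a, b = best_cut(mb, region, t_left, t_right)
    return partition(mb, a, t_left) + partition(mb, b, t_right)
```

For a hypothetical fully opaque 4 x 4 VOP and T = 4, partition(mb, (0, 4, 0, 4), 4) returns four rectangles that each contain four macroblocks to be coded.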
If the processors are connected by a bus, algorithm 1 is used. The result is shown in Fig. 1, and the detailed process is as follows:
1. According to the alpha-plane information, the array MB[0..6][0..8] stores the distribution of standard macroblocks and boundary macroblocks in this VOP, where the value 1 denotes a macroblock to be coded (a standard macroblock or a boundary macroblock) and the value 0 denotes a transparent macroblock. As Fig. 1 shows, the total number of macroblocks to be coded is 32. The available processors are P[0..T], with T = 8.
2. Because columns 2, 3 and 4 are blank columns and the number of consecutive blank columns exceeds the specified threshold, the division is made at the blank columns: the whole VOP is divided, with columns 2, 3 and 4 as the boundary, into two parts A and B. The ratio of the numbers of macroblocks to be coded on the two sides is 7:9, so the 9 processors are assigned to sub-regions A and B in the ratio 4:5. P[0..3] process sub-region A and P[4..8] process sub-region B.
3. Regions A and B are shrunk to tight rectangles.
4. Region A is divided in the same way until each resulting region has been assigned to one processor.
5. Region B is divided in the same way until each resulting region has been assigned to one processor.
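Steps 2 and 3 of algorithm 1 can be illustrated with the sketch below. It is written under stated assumptions: the helper names, the half-open region tuples (r0, r1, c0, c1) and the rounding rule used to share out the processors are hypothetical, and the threshold is left as a parameter because the text does not give its value.

```python
def longest_blank_column_run(mb):
    """Return (start, end) of the longest run of all-transparent columns
    (end exclusive), or None if the VOP has no blank column."""
    n_cols = len(mb[0])
    best, start = None, None
    for c in range(n_cols + 1):
        blank = c < n_cols and all(row[c] == 0 for row in mb)
        if blank and start is None:
            start = c
        elif not blank and start is not None:
            if best is None or c - start > best[1] - best[0]:
                best = (start, c)
            start = None
    return best


def tight_rectangle(mb, region):
    """Shrink a region to the smallest rectangle that still contains
    all of its macroblocks to be coded; None if it contains none."""
    r0, r1, c0, c1 = region
    rows = [r for r in range(r0, r1) if any(mb[r][c0:c1])]
    cols = [c for c in range(c0, c1) if any(mb[r][c] for r in range(r0, r1))]
    if not rows:
        return None
    return (rows[0], rows[-1] + 1, cols[0], cols[-1] + 1)


def split_at_blank_columns(mb, threshold):
    """Divide the VOP at its longest blank-column run if that run is
    longer than the threshold; otherwise return None."""
    run = longest_blank_column_run(mb)
    if run is None or run[1] - run[0] <= threshold:
        return None
    n_rows, n_cols = len(mb), len(mb[0])
    a = tight_rectangle(mb, (0, n_rows, 0, run[0]))
    b = tight_rectangle(mb, (0, n_rows, run[1], n_cols))
    return a, b


def split_processors(t, n_a, n_b):
    """Share t processors between two regions in proportion to their
    numbers of macroblocks to be coded, giving each region at least one."""
    t_a = round(t * n_a / (n_a + n_b))
    t_a = max(1, min(t - 1, t_a))
    return t_a, t - t_a
```

With the 7:9 ratio quoted in step 2, split_processors(9, 7, 9) gives (4, 5), matching the assignment of P[0..3] to sub-region A and P[4..8] to sub-region B.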
If the processors are interconnected as a two-dimensional grid, algorithm 2 is used, as shown in Fig. 2. The process is as follows:
1. Same as step 1 of algorithm 1, where tm = 3, tn = 3, and the available processors are P[0..tm][0..tn].
2. Because the ratio of the number of macroblocks to be coded in columns 0 and 1 to that in the remaining columns is 7:9, which is the closest to 1:2, the whole region is divided between column 1 and column 2 into two sub-regions A and B (note that the number of consecutive blank columns does not exceed its threshold here, so the division is not made according to the blank columns). P[0][0..2] process sub-region A and P[1..2][0..2] process sub-region B.
3. Regions A and B are shrunk to tight rectangles.
4. With tm = 0 and tn = 0..2, region A is divided horizontally:
A) Within this region, the ratio of the number of macroblocks to be coded in rows 0 and 1 to that in the remaining rows is 1:2, so region A is divided between row 1 and row 2 into regions C and D.
B) For region C, tm = 0 and tn = 0, so the division is complete and region C is processed by processor P[0][0].
C) For region D, tm = 0 and tn = 1..2. Because tn > tm, a horizontal division is performed. Since the ratio of the number of macroblocks to be coded in rows 2 and 3 to that in the remaining rows is closest to 1:1, region D is divided between row 3 and row 4 into sub-regions E and F.
D) For sub-region E, tm = 0 and tn = 1, so it is processed by P[0][1]; for sub-region F, tm = 0 and tn = 2, so it is processed by P[0][2]. This completes the division of region A.
5. With tm = 1..2 and tn = 0..2, region B is divided horizontally. The process is the same as above.
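The grid-based division of algorithm 2 can be sketched as below. This is one possible reading of the steps above, not the definitive procedure: it assumes that the first processor index is split by cutting between columns of the VOP and the second index by cutting between rows, that a horizontal division is chosen whenever the block of processors has more entries along its second index than along its first, and that the processor block is halved in the same T/2 manner as before. The names count_coded, cut_along and partition_grid are hypothetical, index ranges are half-open, and the blank-column and tight-rectangle steps are omitted.

```python
def count_coded(mb, region):
    """Number of macroblocks to be coded inside region (r0, r1, c0, c1)."""
    r0, r1, c0, c1 = region
    return sum(mb[r][c] for r in range(r0, r1) for c in range(c0, c1))


def cut_along(mb, region, axis, t_left, t_right):
    """Cut between rows (axis='row') or columns (axis='col') so that the
    coded-macroblock counts are as close as possible to t_left : t_right."""
    r0, r1, c0, c1 = region
    target = count_coded(mb, region) * t_left / (t_left + t_right)
    if axis == "row":
        cuts = [((r0, k, c0, c1), (k, r1, c0, c1)) for k in range(r0 + 1, r1)]
    else:
        cuts = [((r0, r1, c0, k), (r0, r1, k, c1)) for k in range(c0 + 1, c1)]
    if not cuts:
        raise ValueError("region cannot be cut along this axis")
    return min(cuts, key=lambda ab: abs(count_coded(mb, ab[0]) - target))


def partition_grid(mb, region, m_range, n_range):
    """Assign a rectangle to every processor P[m][n] with m in m_range
    and n in n_range; returns a dict {(m, n): region}."""
    (m0, m1), (n0, n1) = m_range, n_range
    tm, tn = m1 - m0, n1 - n0
    if tm == 1 and tn == 1:
        return {(m0, n0): region}
    if tn > tm:
        # Horizontal division: cut between rows, split the second index.
        half = tn // 2
        a, b = cut_along(mb, region, "row", half, tn - half)
        out = partition_grid(mb, a, m_range, (n0, n0 + half))
        out.update(partition_grid(mb, b, m_range, (n0 + half, n1)))
    else:
        # Vertical division: cut between columns, split the first index.
        half = tm // 2
        a, b = cut_along(mb, region, "col", half, tm - half)
        out = partition_grid(mb, a, (m0, m0 + half), n_range)
        out.update(partition_grid(mb, b, (m0 + half, m1), n_range))
    return out
```

For a hypothetical fully opaque 6 x 6 VOP and a 3 x 3 grid, partition_grid(mb, (0, 6, 0, 6), (0, 3), (0, 3)) first cuts between columns in the ratio 1:2, then divides the left part horizontally among P[0][0], P[0][1] and P[0][2], mirroring the sequence of steps 2 to 4.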
Table 1 compares the shape-adaptive heuristic data partitioning algorithm with the original partitioning algorithm.
Table 1. Comparison of the partitioning algorithms