CN104008134A - Efficient storage method and system based on Hbase - Google Patents
Efficient storage method and system based on HBase
- Publication number
- CN104008134A (application CN201410188339.9A)
- Authority
- CN
- China
- Prior art keywords
- combination
- hbase
- bytes
- value
- unit
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/221—Column-oriented storage; Management thereof
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses an efficient storage method and system based on HBase and relates to the field of big data. The method comprises the following steps: a row of user data to be stored in a target table is input; it is determined whether initialization information for the byte-combination encoding of the target table exists in memory; if not, the metadata database is accessed, the byte-combination encoding of the target table is initialized, and the initialization information is written into memory; if so, according to the initialization information, the primary key column values are parsed out, byte-encoded and combined into one byte sequence, which serves as the row key of the key-value storage format; according to the initialization information, the non-primary-key column values are parsed out, byte-encoded and combined into one byte sequence, which serves as the value field content of the key-value storage format; and the row key byte sequence and the value field byte sequence are assembled into a key-value pair and written to HBase. The invention saves HBase storage space and improves the input/output performance of HBase.
Description
Technical field
The present invention relates to the field of big data, and in particular to an efficient storage method and system based on HBase.
Background art
With the spread of the mobile Internet, smart terminals, the Internet of Things, cloud computing and smart cities, people have gradually entered the era of "big data". The US Internet Data Center points out that data on the Internet grows by 50% every year and doubles every two years, and that more than 90% of the world's data was generated in the last few years. Besides the information published on the Internet, countless digital sensors on industrial equipment, automobiles and electricity meters around the world continuously measure and transmit changes in position, motion, vibration, temperature, humidity and even chemical substances in the air, producing massive amounts of data as well.
Big data refers to data sets that are very large and complex. Once the data volume reaches the petabyte, exabyte or zettabyte level, traditional database management tools face many problems in handling it, such as acquisition, storage, retrieval and analysis. Big data raises requirements such as highly concurrent database reads and writes, efficient storage of and access to massive data, and high scalability and high availability of databases, which traditional database and data warehouse technologies struggle to meet.
Hadoop is a software framework maintained by the Apache Software Foundation for the distributed processing of massive data, and it has made big data processing affordable. Hadoop is a large ecosystem that provides various tools and platforms for processing big data. Within the Hadoop ecosystem, HBase is a highly reliable, high-performance, column-oriented, scalable, distributed non-relational database system that can quickly locate the required results in massive data.
HBase is a column-oriented database. Every column in a table belongs to a column family. Column families are part of the table schema and must be defined before the table is used, whereas the number and types of columns need not be defined, because columns are not part of the table schema. A column family can contain multiple columns, each named with its column family as prefix, in the format <column family>:<column>. In HBase, data is read and written at the column family level.
User data in HBase is stored in the Hadoop file system in the HFile format. The HFile format, which is specific to HBase, is shown in Fig. 1. The data blocks in an HFile are where user data is actually stored. Each data block begins with a magic number that guards against data corruption, followed immediately by a number of concatenated key-value pairs. The key-value pair is the smallest, indivisible unit of data in HBase.
As shown in Fig. 2, a key-value pair is a byte array that contains many fields and has a fixed structure, as follows. It begins with two fixed-length numbers, giving the length of the key part and the length of the value part. Next comes the key part: first a fixed 2-byte number giving the row key length; then the row key; then a fixed 1-byte number giving the length of the column family name; then the column family name and the column name; then a fixed 8-byte timestamp and a fixed 1-byte type, which marks whether the key-value pair is an update or a delete operation. Finally, the value part stores the value of the column.
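For illustration only, the layout described above can be sketched in Python. The sketch assumes that the two leading length fields are 4 bytes each (which is what the 22-byte total given in the next paragraph implies), that all numbers are big-endian, and that the type code PUT_TYPE is a placeholder value; the function name build_key_value is hypothetical and is not part of HBase.

```python
import struct

PUT_TYPE = 4  # placeholder type code for an "update" pair (assumption for this sketch)

def build_key_value(row_key: bytes, family: bytes, column: bytes,
                    timestamp: int, key_type: int, value: bytes) -> bytes:
    """Pack one key-value pair following the field order described in the text."""
    key = (
        struct.pack(">H", len(row_key)) + row_key   # 2-byte row key length, then row key
        + struct.pack(">B", len(family)) + family   # 1-byte family name length, then family name
        + column                                    # column name
        + struct.pack(">q", timestamp)              # 8-byte timestamp
        + struct.pack(">B", key_type)               # 1-byte type (update or delete)
    )
    # Two fixed-length numbers (assumed 4 bytes each): key length and value length.
    return struct.pack(">ii", len(key), len(value)) + key + value

# A 10-byte row key with 1-byte family and column names and an empty value.
kv = build_key_value(b"r" * 10, b"F", b"C", 0, PUT_TYPE, b"")
print(len(kv))
```

With a 10-byte row key and an empty value, every byte of the resulting array is structural overhead rather than user data.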
From Fig. 2, the fixed part of each key-value pair amounts to 22 bytes (assuming the column family name and the column name are 1 byte each). With a row key that is typically at least 10 bytes long, a key-value pair occupies at least 32 bytes, yet its value part stores the value of only one column. In other words, in HFile storage each column of a row is stored as a separate key-value pair. If a table has N columns, where N is a positive integer, storing one row of the table requires N such key-value pairs, with a fixed overhead of about (32*N) bytes, and the key parts of these pairs are largely identical. This wastes a great deal of storage space and also affects the throughput of HBase.
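The arithmetic behind these figures can be checked with a short Python sketch. It assumes the two leading length fields are 4 bytes each, which is consistent with the 22-byte total stated above; the function names are hypothetical.

```python
# Fixed-width fields of one key-value pair, in bytes:
# key length (4, assumed) + value length (4, assumed) + row key length (2)
# + family name length (1) + timestamp (8) + type (1)
FIXED_FIELDS = 4 + 4 + 2 + 1 + 8 + 1

def key_value_overhead(row_key_len: int = 10, family_len: int = 1, column_len: int = 1) -> int:
    """Bytes of one key-value pair that are not the column value itself."""
    return FIXED_FIELDS + family_len + column_len + row_key_len

def row_overhead(pairs_per_row: int) -> int:
    """Overhead of one row stored as `pairs_per_row` key-value pairs."""
    return pairs_per_row * key_value_overhead()

print(key_value_overhead(row_key_len=0))  # fixed part only
print(key_value_overhead())               # with a 10-byte row key
print(row_overhead(8))                    # a row stored as 8 key-value pairs
```

The overhead grows linearly with the number of key-value pairs per row, which is the cost the invention sets out to remove.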
Take a table structure defined by the following structural model as an example:
{ column 1 (primary key): column 1 data type; column 2: column 2 data type; column 3: column 3 data type; column 4: column 4 data type }
This table has 4 columns, of which the first is the primary key.
If this table is stored in HBase's native way, with a column family F1 defined to hold all the columns, the mapping between the structured table in the example and HBase storage is:
Column 1 corresponds to the row key;
Column 2 corresponds to column <F1:C2>;
Column 3 corresponds to column <F1:C3>;
Column 4 corresponds to column <F1:C4>.
Thus, storing one row of this structured table in HBase requires 3 key-value pairs; the storage structure is shown in Fig. 3. The fixed-length overhead of the row in the example is about (32*3) bytes. When massive numbers of records are stored, this fixed-length overhead becomes very large, which both wastes storage space and reduces storage throughput.
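The native mapping in this example can be sketched as follows. This is an illustration only: the function to_native_cells, the column labels C1 to C4 and the use of UTF-8 text for values are assumptions of the sketch, not part of HBase or of the patent.

```python
def to_native_cells(row: dict, primary_key: str, family: str = "F1") -> list:
    """Map one structured row to native-style cells: one cell per non-key column."""
    row_key = str(row[primary_key]).encode("utf-8")
    return [
        (row_key, f"{family}:{column}", str(value).encode("utf-8"))
        for column, value in row.items()
        if column != primary_key
    ]

# Column C1 is the primary key; C2..C4 each become a separate cell.
cells = to_native_cells({"C1": "k1", "C2": 1, "C3": 2, "C4": 3}, primary_key="C1")
for cell in cells:
    print(cell)
```

Each of the three cells repeats the same row key, which is why the key parts of the stored pairs are largely identical.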
Summary of the invention
The object of the present invention is to overcome the deficiencies of the background art described above by providing an efficient storage method and system based on HBase that effectively saves HBase storage space and improves HBase throughput.
The invention provides an efficient storage method based on HBase, comprising the following steps:
Step 101, input a row of user data to be stored in the target table;
Step 102, determine whether initialization information for the byte-combination encoding of the target table exists in memory; if not, go to step 103; if so, go to step 104;
Step 103, access the metadata database, initialize the byte-combination encoding of the target table, write the initialization information into memory, and return to step 102;
Step 104, according to the initialization information for the row key byte-combination encoding of the target table in memory, parse the primary key column values out of the user data to be stored as the objects of byte-combination encoding, byte-encode and combine them one by one into 1 byte sequence, which serves as the row key of the key-value storage format, and go to step 105;
Step 105, according to the initialization information for the value field byte-combination encoding of the target table in memory, parse the non-primary-key column values out of the user data to be stored as the objects of byte-combination encoding, byte-encode and combine them one by one into 1 byte sequence, which serves as the value field content of the key-value storage format, and go to step 106;
Step 106, store to HBase: assemble the row key byte sequence obtained in step 104 and the value field byte sequence obtained in step 105 into a key-value pair, and call the data insertion method of the HBase client application programming interface (API) to complete the write to HBase.
On the basis of the above technical solution, after step 106 the method further comprises the following steps for querying HBase data:
Step 201, input a query request on the target table;
Step 202, determine whether initialization information for the byte-combination encoding of the target table exists in memory; if not, go to step 203; if so, go to step 204;
Step 203, access the metadata database, initialize the byte-combination encoding of the target table, write the initialization information into memory, and return to step 202;
Step 204, according to the initialization information of the target table in memory, apply byte-combination encoding one by one to the primary keys within the range of the query request, obtaining the row key range in HBase that corresponds to the result set;
Step 205, scan HBase according to the row key range from step 204, obtain the value range in HBase that corresponds to the result set, and go to step 206;
Step 206, decode the value range in HBase corresponding to the result set from step 205 according to the value field byte-combination encoding scheme and its target pointer, obtain the result of the query request, and go to step 207;
Step 207, return the query result.
On the basis of the above technical solution, the initialization information comprises the column name list, column data type list and primary key column list of the target table, the row key byte-combination encoding scheme and its target pointer, and the value field byte-combination encoding scheme and its target pointer.
The present invention also provides an efficient storage system based on HBase. The system comprises an HBase writing device, which comprises a first input unit, a first initialization unit, a primary key column byte-combination encoding unit, a non-primary-key column byte-combination encoding unit and an HBase writing unit, wherein:
The first input unit is configured to: input a row of user data to be stored in the target table;
The first initialization unit is configured to: determine whether initialization information for the byte-combination encoding of the target table exists in memory; if not, access the metadata database, initialize the byte-combination encoding of the target table, and write the initialization information into memory; if so, generate a primary key column byte-combination encoding trigger signal and send it to the primary key column byte-combination encoding unit;
The primary key column byte-combination encoding unit is configured to: according to the initialization information for the row key byte-combination encoding of the target table in memory, parse the primary key column values out of the user data to be stored as the objects of byte-combination encoding, and byte-encode and combine them one by one into 1 byte sequence, which serves as the row key of the key-value storage format; and generate a non-primary-key column byte-combination encoding trigger signal and send it to the non-primary-key column byte-combination encoding unit;
The non-primary-key column byte-combination encoding unit is configured to: according to the initialization information for the value field byte-combination encoding of the target table in memory, parse the non-primary-key column values out of the user data to be stored as the objects of byte-combination encoding, and byte-encode and combine them one by one into 1 byte sequence, which serves as the value field content of the key-value storage format;
The HBase writing unit is configured to: assemble the row key byte sequence obtained by the primary key column byte-combination encoding unit and the value field byte sequence obtained by the non-primary-key column byte-combination encoding unit into a key-value pair, and call the data insertion method of the HBase client application programming interface (API) to complete the write to HBase.
On the basis of the above technical solution, the system further comprises an HBase query device, which comprises a second input unit, a second initialization unit, a primary key byte-combination encoding unit, a scanning unit and a decoding unit, wherein:
The second input unit is configured to: input a query request on the target table;
The second initialization unit is configured to: determine whether initialization information for the byte-combination encoding of the target table exists in memory; if not, access the metadata database, initialize the byte-combination encoding of the target table, and write the initialization information into memory; if so, generate a primary key byte-combination encoding trigger signal and send it to the primary key byte-combination encoding unit;
The primary key byte-combination encoding unit is configured to: according to the initialization information of the target table in memory, apply byte-combination encoding one by one to the primary keys within the range of the query request, obtaining the row key range in HBase that corresponds to the result set;
The scanning unit is configured to: scan HBase according to the row key range, obtain the value range in HBase that corresponds to the result set, and send it to the decoding unit;
The decoding unit is configured to: receive the value range in HBase corresponding to the result set sent by the scanning unit, decode it according to the value field byte-combination encoding scheme and its target pointer, obtain the result of the query request, and return the query result.
On the basis of the above technical solution, the initialization information comprises the column name list, column data type list and primary key column list of the target table, the row key byte-combination encoding scheme and its target pointer, and the value field byte-combination encoding scheme and its target pointer.
Compared with the prior art, the advantages of the present invention are as follows:
The present invention does not store column information in the HFile. Each row is stored as a single key-value pair, column information is kept in a metadata database, and only user data is stored in the HFile. This effectively saves HBase storage space and improves HBase throughput.
Brief description of the drawings
Fig. 1 is a schematic diagram of the HBase-specific storage format.
Fig. 2 is a schematic diagram of the storage format of a key-value pair.
Fig. 3 is a schematic diagram of how one row is stored using the existing HBase storage method.
Fig. 4 is a flowchart of the efficient storage method based on HBase in the embodiment of the present invention.
Fig. 5 is a flowchart of data query.
Fig. 6 is a schematic diagram of how one row is stored using the storage method of the embodiment of the present invention.
Fig. 7 is a flowchart of loading metadata into the metadata database.
Embodiment
The present invention is described in further detail below with reference to the drawings and specific embodiments.
As shown in Fig. 4, an embodiment of the present invention provides an efficient storage method based on HBase, comprising the following steps:
Step 101, input a row of user data to be stored in the target table;
Step 102, determine whether initialization information for the byte-combination encoding of the target table exists in memory; if not, go to step 103; if so, go to step 104;
Step 103, access the metadata database, initialize the byte-combination encoding of the target table, write the initialization information into memory, and return to step 102; the initialization information comprises the column name list, column data type list and primary key column list of the target table, the row key byte-combination encoding scheme and its target pointer, and the value field byte-combination encoding scheme and its target pointer;
Step 104, according to the initialization information for the row key byte-combination encoding of the target table in memory, parse the primary key column values out of the user data to be stored as the objects of byte-combination encoding, byte-encode and combine them one by one into 1 byte sequence, which serves as the row key of the key-value storage format, and go to step 105;
Step 105, according to the initialization information for the value field byte-combination encoding of the target table in memory, parse the non-primary-key column values out of the user data to be stored as the objects of byte-combination encoding, byte-encode and combine them one by one into 1 byte sequence, which serves as the value field content of the key-value storage format, and go to step 106;
Step 106, store to HBase: assemble the row key byte sequence obtained in step 104 and the value field byte sequence obtained in step 105 into a key-value pair, and call the data insertion method of the HBase client application programming interface (API) to complete the write to HBase.
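As an illustration only, the write flow of steps 101 to 106 can be sketched in Python. The patent does not fix a concrete byte encoding, so this sketch assumes 8-byte big-endian integers and length-prefixed UTF-8 strings. The dictionaries METADATA_DB, INIT_CACHE and FAKE_HBASE are hypothetical in-memory stand-ins for the metadata database, the in-memory initialization information and an HBase table; no real HBase client call is made.

```python
import struct

# Hypothetical stand-ins for the metadata database, the in-memory cache and HBase.
METADATA_DB = {
    "t1": {
        "columns": ["id", "name", "age"],
        "types": {"id": "int", "name": "str", "age": "int"},
        "primary_key": ["id"],
    },
}
INIT_CACHE = {}
FAKE_HBASE = {}

def encode_value(value, type_name):
    """Byte-encode one column value (encoding scheme assumed for this sketch)."""
    if type_name == "int":
        return struct.pack(">q", value)            # 8-byte big-endian integer
    data = value.encode("utf-8")
    return struct.pack(">H", len(data)) + data     # 2-byte length prefix + UTF-8 bytes

def get_init_info(table):
    """Steps 102-103: use cached initialization information, loading it on a miss."""
    if table not in INIT_CACHE:
        INIT_CACHE[table] = METADATA_DB[table]
    return INIT_CACHE[table]

def write_row(table, row):
    """Steps 104-106: build the row key and value field, then store one pair."""
    info = get_init_info(table)
    types, pk = info["types"], info["primary_key"]
    row_key = b"".join(encode_value(row[c], types[c]) for c in pk)
    value = b"".join(
        encode_value(row[c], types[c]) for c in info["columns"] if c not in pk
    )
    FAKE_HBASE.setdefault(table, {})[row_key] = value  # stand-in for the client insert call
    return row_key, value

row_key, value = write_row("t1", {"id": 7, "name": "ab", "age": 30})
print(row_key.hex(), value.hex())
```

However many non-primary-key columns the table has, the row ends up as exactly one key-value pair.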
As shown in Fig. 5, after data has been stored with the above method, the process of querying (reading) HBase data is as follows:
Step 201, input a query (read) request on the target table;
Step 202, determine whether initialization information for the byte-combination encoding of the target table exists in memory; if not, go to step 203; if so, go to step 204;
Step 203, access the metadata database, initialize the byte-combination encoding of the target table, write the initialization information into memory, and return to step 202; the initialization information comprises the column name list, column data type list and primary key column list of the target table, the row key byte-combination encoding scheme and its target pointer, and the value field byte-combination encoding scheme and its target pointer;
Step 204, according to the initialization information of the target table in memory, apply byte-combination encoding one by one to the primary keys within the range of the query request, obtaining the row key range in HBase that corresponds to the result set;
Step 205, scan HBase according to the row key range from step 204, obtain the value range in HBase that corresponds to the result set, and go to step 206;
Step 206, decode the result set from step 205 according to the value field byte-combination encoding scheme and its target pointer, obtain the result of the query request, and go to step 207;
Step 207, return the query result.
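The query flow of steps 204 to 207 can be sketched in the same spirit. This is a minimal illustration under stated assumptions: integers are 8-byte big-endian and non-negative (so that byte order matches numeric order), strings are length-prefixed UTF-8, the scan treats the start key as inclusive and the stop key as exclusive, and a plain dictionary stands in for the HBase table. INFO, encode, decode and query_range are hypothetical names.

```python
import struct

# Hypothetical initialization information for a table whose primary key is "id".
INFO = {
    "columns": ["id", "name", "age"],
    "types": {"id": "int", "name": "str", "age": "int"},
    "primary_key": ["id"],
}

def encode(values, type_names):
    """Combine several column values into one byte sequence (assumed encoding)."""
    out = b""
    for value, type_name in zip(values, type_names):
        if type_name == "int":
            out += struct.pack(">q", value)
        else:
            data = value.encode("utf-8")
            out += struct.pack(">H", len(data)) + data
    return out

def decode(data, columns, types):
    """Decode the listed columns, in order, from one byte sequence."""
    record, pos = {}, 0
    for col in columns:
        if types[col] == "int":
            (record[col],) = struct.unpack_from(">q", data, pos)
            pos += 8
        else:
            (n,) = struct.unpack_from(">H", data, pos)
            pos += 2
            record[col] = data[pos:pos + n].decode("utf-8")
            pos += n
    return record

def query_range(store, info, low, high):
    """Steps 204-207: encode the key range, scan it, decode each hit."""
    pk, types = info["primary_key"], info["types"]
    non_pk = [c for c in info["columns"] if c not in pk]
    pk_types = [types[c] for c in pk]
    start, stop = encode(low, pk_types), encode(high, pk_types)   # step 204
    results = []
    for row_key in sorted(store):                                 # step 205 (stand-in scan)
        if start <= row_key < stop:
            record = decode(row_key, pk, types)                   # step 206
            record.update(decode(store[row_key], non_pk, types))
            results.append(record)
    return results                                                # step 207

store = {encode((i,), ["int"]): encode((f"u{i}", 20 + i), ["str", "int"]) for i in (1, 2, 3)}
print(query_range(store, INFO, (2,), (4,)))
```

Because the row key is itself a byte-combination encoding of the primary key, the primary key columns are recovered from the row key and the remaining columns from the value field.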
The embodiment of the present invention also provides an efficient storage system based on HBase, comprising an HBase writing device and an HBase query device. The HBase writing device comprises a first input unit, a first initialization unit, a primary key column byte-combination encoding unit, a non-primary-key column byte-combination encoding unit and an HBase writing unit, wherein:
The first input unit is configured to: input a row of user data to be stored in the target table;
The first initialization unit is configured to: determine whether initialization information for the byte-combination encoding of the target table exists in memory; if not, access the metadata database, initialize the byte-combination encoding of the target table, and write the initialization information into memory; the initialization information comprises the column name list, column data type list and primary key column list of the target table, the row key byte-combination encoding scheme and its target pointer, and the value field byte-combination encoding scheme and its target pointer; if so, generate a primary key column byte-combination encoding trigger signal and send it to the primary key column byte-combination encoding unit;
The primary key column byte-combination encoding unit is configured to: according to the initialization information for the row key byte-combination encoding of the target table in memory, parse the primary key column values out of the user data to be stored as the objects of byte-combination encoding, and byte-encode and combine them one by one into 1 byte sequence, which serves as the row key of the key-value storage format; and generate a non-primary-key column byte-combination encoding trigger signal and send it to the non-primary-key column byte-combination encoding unit;
The non-primary-key column byte-combination encoding unit is configured to: according to the initialization information for the value field byte-combination encoding of the target table in memory, parse the non-primary-key column values out of the user data to be stored as the objects of byte-combination encoding, and byte-encode and combine them one by one into 1 byte sequence, which serves as the value field content of the key-value storage format;
The HBase writing unit is configured to: assemble the row key byte sequence obtained by the primary key column byte-combination encoding unit and the value field byte sequence obtained by the non-primary-key column byte-combination encoding unit into a key-value pair, and call the data insertion method of the HBase client application programming interface (API) to complete the write to HBase;
The HBase query device comprises a second input unit, a second initialization unit, a primary key byte-combination encoding unit, a scanning unit and a decoding unit, wherein:
The second input unit is configured to: input a query (read) request on the target table;
The second initialization unit is configured to: determine whether initialization information for the byte-combination encoding of the target table exists in memory; if not, access the metadata database, initialize the byte-combination encoding of the target table, and write the initialization information into memory; the initialization information comprises the column name list, column data type list and primary key column list of the target table, the row key byte-combination encoding scheme and its target pointer, and the value field byte-combination encoding scheme and its target pointer; if so, generate a primary key byte-combination encoding trigger signal and send it to the primary key byte-combination encoding unit;
The primary key byte-combination encoding unit is configured to: according to the initialization information of the target table in memory, apply byte-combination encoding one by one to the primary keys within the range of the query request, obtaining the row key range in HBase that corresponds to the result set;
The scanning unit is configured to: scan HBase according to the row key range, obtain the value range in HBase that corresponds to the result set, and send it to the decoding unit;
The decoding unit is configured to: receive the value range in HBase corresponding to the result set sent by the scanning unit, decode it according to the value field byte-combination encoding scheme and its target pointer, obtain the result of the query request, and return the query result.
The principle of the embodiment of the present invention is explained in detail as follows:
When a row is stored in HBase, it occupies only 1 key-value pair, with a fixed overhead of about 32 bytes. When massive data needs to be stored, this effectively reduces both the number of key-value pairs and the storage overhead of the fixed-length fields. In the new key-value format, the row key still corresponds to the primary key column. In Fig. 3, where a row is stored in HBase's original way, storing 4 columns requires 3 key-value pairs. In the embodiment of the present invention, the values of columns 2 to 4 are merged into 1 byte sequence by byte-combination encoding and stored in the value field of the key-value pair; the storage structure is shown in Fig. 6.
In a specific implementation, when data is stored, the primary key part and the non-primary-key part of the row are each byte-combination encoded according to the table's metadata, forming one row key byte sequence and one value byte sequence; the HBase client application programming interface (API) is then called to store the record. The metadata comes from parsing performed when the table is created: the metadata of the table structure is parsed out of the table creation statement and loaded into the metadata database, as shown in Fig. 7. The metadata comprises the table name, column names, column data types, primary key, and the byte-combination encoding that the table needs to use.
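The metadata loading flow of Fig. 7 might look roughly like the following Python sketch. The patent does not specify the syntax of the table creation statement, so a simple SQL-like form is assumed here; parse_create_table and METADATA_DB are hypothetical names, and the parser deliberately ignores anything beyond plain "name TYPE [PRIMARY KEY]" column definitions (for example, types containing commas).

```python
import re

METADATA_DB = {}  # hypothetical stand-in for the metadata database

def parse_create_table(statement: str) -> dict:
    """Extract table name, column names, column types and primary key columns."""
    match = re.match(
        r"\s*CREATE\s+TABLE\s+(\w+)\s*\((.*)\)\s*;?\s*$",
        statement,
        re.IGNORECASE | re.DOTALL,
    )
    if not match:
        raise ValueError("unsupported table creation statement")
    table, body = match.group(1), match.group(2)
    columns, types, primary_key = [], {}, []
    for definition in body.split(","):
        tokens = definition.split()
        name, type_name = tokens[0], tokens[1].upper()
        columns.append(name)
        types[name] = type_name
        if [t.upper() for t in tokens[2:4]] == ["PRIMARY", "KEY"]:
            primary_key.append(name)
    return {
        "table": table,
        "columns": columns,
        "types": types,
        "primary_key": primary_key,
        "encoding": "byte-combination",  # label for the encoding this table uses
    }

meta = parse_create_table("CREATE TABLE t1 (id INT PRIMARY KEY, name VARCHAR, age INT)")
METADATA_DB[meta["table"]] = meta  # load the parsed metadata into the metadata database
print(meta)
```

Keeping this information outside the HFile is what allows the stored key-value pair to carry only user data.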
Byte-combination encoding is the core of the embodiment of the present invention. It is implemented as follows:
During efficient storage to HBase: the values of all primary key columns are converted to byte encodings according to their data types and merged into 1 byte sequence, which serves as the row key field of the key-value pair; the values of all other, non-primary-key columns are likewise converted to byte encodings according to their data types and merged into 1 byte sequence, which serves as the value field of the key-value pair.
During HBase data reads: the column values to be returned are decoded, according to their data types, from the byte sequences of the row key field and the value field.
In addition, so that multiple column values can be combined and encoded into 1 byte sequence and the value of a specified column can be decoded from that byte sequence, the byte-combination encoding should also implement a target pointer, which describes the position of each column within the byte sequence together with the corresponding column name and column type. In this way, byte-combination encoding enables encoding and decoding between a byte sequence and the column values it contains.
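One possible reading of the target pointer is a per-table map from column name to byte offset and type. The sketch below illustrates that reading only; it assumes fixed-width column types so that offsets are the same for every row, and the names FORMATS, build_target_pointer, encode_columns and decode_column are hypothetical.

```python
import struct

# Assumed fixed-width encodings (big-endian) for the column types in this sketch.
FORMATS = {"int32": ">i", "int64": ">q", "float64": ">d"}

def build_target_pointer(columns):
    """Map each column name to (byte offset, type) within the combined sequence."""
    pointer, offset = {}, 0
    for name, type_name in columns:
        pointer[name] = (offset, type_name)
        offset += struct.calcsize(FORMATS[type_name])
    return pointer

def encode_columns(columns, row):
    """Combine the column values of one row into a single byte sequence."""
    return b"".join(struct.pack(FORMATS[t], row[name]) for name, t in columns)

def decode_column(data, pointer, name):
    """Decode only the requested column, using the pointer to locate it."""
    offset, type_name = pointer[name]
    return struct.unpack_from(FORMATS[type_name], data, offset)[0]

columns = [("c2", "int32"), ("c3", "int64"), ("c4", "float64")]
pointer = build_target_pointer(columns)
data = encode_columns(columns, {"c2": 1, "c3": 2, "c4": 0.5})
print(pointer)
print(decode_column(data, pointer, "c4"))
```

With variable-length types such as strings, the pointer would instead have to record lengths per row or rely on length prefixes; the fixed-width assumption simply keeps the example short.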
The HBase storage method introduced in the embodiment of the present invention mainly proposes an idea, and an implementation of it, for storing user data efficiently in HBase. Its use is therefore not limited to loading data by calling the HBase client application programming interface (API) as in this patent; any other way of loading data into HBase can adopt the method proposed in this patent to transform how user data is stored in HBase.
Comparison of measured data:
The comparison test between the traditional HBase storage method and the HBase storage method of the embodiment of the present invention uses the same table structure: each row has 8 columns, the average row length is about 80 bytes, and the same compression encoding is used when loading data in both scenarios.
The comparison of two metrics for the two scenarios, load time and the compression ratio before and after loading, is shown in Table 1.
Table 1. Comparison of load time and compression ratio before and after loading for the two scenarios
As the comparison shows, the HBase storage method of the embodiment of the present invention reduces the consumption of fixed fields and therefore achieves a better compression ratio and saves storage space, which in turn improves load throughput and reduces the time spent loading.
Those skilled in the art can make various modifications and variations to the embodiments of the present invention. Provided that these modifications and variations fall within the scope of the claims of the present invention and their technical equivalents, they also fall within the scope of protection of the present invention.
Content not described in detail in this specification belongs to the prior art known to those skilled in the art.
Claims (6)
1. An efficient storage method based on HBase, characterized in that it comprises the following steps:
Step 101, input a row of user data to be stored in the target table;
Step 102, determine whether initialization information for the byte-combination encoding of the target table exists in memory; if not, go to step 103; if so, go to step 104;
Step 103, access the metadata database, initialize the byte-combination encoding of the target table, write the initialization information into memory, and return to step 102;
Step 104, according to the initialization information for the row key byte-combination encoding of the target table in memory, parse the primary key column values out of the user data to be stored as the objects of byte-combination encoding, byte-encode and combine them one by one into 1 byte sequence, which serves as the row key of the key-value storage format, and go to step 105;
Step 105, according to the initialization information for the value field byte-combination encoding of the target table in memory, parse the non-primary-key column values out of the user data to be stored as the objects of byte-combination encoding, byte-encode and combine them one by one into 1 byte sequence, which serves as the value field content of the key-value storage format, and go to step 106;
Step 106, store to HBase: assemble the row key byte sequence obtained in step 104 and the value field byte sequence obtained in step 105 into a key-value pair, and call the data insertion method of the HBase client application programming interface (API) to complete the write to HBase.
2. The efficient storage method based on HBase according to claim 1, characterized in that, after step 106, the method further comprises the following steps for querying HBase data:
Step 201: input a query request for the target table;
Step 202: determine whether initialization information for the byte-combination encoding of the target table exists in memory; if not, go to step 203; if so, go to step 204;
Step 203: access the metadata database, initialize the byte-combination encoding of the target table, write the initialization information into memory, and return to step 202;
Step 204: according to the initialization information of the target table in memory, apply byte-combination encoding one by one to the primary keys that fall within the range of the query request, obtaining the row-key range in HBase that corresponds to the result set;
Step 205: scan HBase according to the row-key range from step 204 to obtain the value range in HBase that corresponds to the result set, and go to step 206;
Step 206: decode the value range obtained in step 205 according to the value-field byte-combination encoding scheme and its target pointer to obtain the result of the query request, and go to step 207;
Step 207: return the query result.
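The query path of steps 204 to 206 can be sketched in the same way. This is an illustration under stated assumptions, not the patented implementation: `SortedTable` is a hypothetical in-memory stand-in for an HBase table (keys held in sorted byte order, scan with an inclusive start and an exclusive stop), and the key and value layouts are assumed fixed-width big-endian so that byte order matches numeric order for unsigned integers.

```python
import bisect
import struct


class SortedTable:
    """Hypothetical stand-in for an HBase table: row keys sorted as raw bytes."""

    def __init__(self):
        self._keys = []
        self._values = {}

    def put(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def scan(self, start, stop):
        # Start row inclusive, stop row exclusive.
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, stop)
        return [(k, self._values[k]) for k in self._keys[lo:hi]]


def encode_key(device_id, ts):
    # Assumed row-key layout: uint32 device_id followed by uint64 timestamp.
    return struct.pack(">IQ", device_id, ts)


def query(table, device_id, ts_from, ts_to):
    # Step 204: encode the primary-key bounds into a row-key range.
    start = encode_key(device_id, ts_from)
    stop = encode_key(device_id, ts_to)
    results = []
    # Step 205: scan the row-key range.
    for key, value in table.scan(start, stop):
        # Step 206: decode key and value back into column values.
        dev, ts = struct.unpack(">IQ", key)
        (temp,) = struct.unpack(">d", value)
        results.append({"device_id": dev, "ts": ts, "temp": temp})
    return results


table = SortedTable()
for dev, ts, temp in [(7, 100, 20.0), (7, 200, 21.5), (7, 300, 22.0), (8, 150, 30.0)]:
    table.put(encode_key(dev, ts), struct.pack(">d", temp))

rows = query(table, 7, 150, 301)
```

The range scan only works because the assumed encoding preserves ordering; signed integers or variable-length fields inside the row key would need an order-preserving encoding of their own.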
3. The efficient storage method based on HBase according to claim 1 or 2, characterized in that the initialization information comprises the column name list, the column data type list, the primary key column list, the row-key byte-combination encoding scheme and its target pointer, and the value-field byte-combination encoding scheme and its target pointer of the target table.
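One way to picture this initialization information, together with the memory check of steps 102 and 103, is the following Python sketch. All names are hypothetical (`TableInitInfo`, `METADATA_DB`, `get_init_info`), and the "target pointer" is modeled as the tuple of column indexes each encoding applies to, which is only one possible reading of the claim.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class TableInitInfo:
    column_names: Tuple[str, ...]
    column_types: Tuple[str, ...]
    primary_key_columns: Tuple[str, ...]
    row_key_encoding: str              # name of the assumed encoding scheme
    row_key_targets: Tuple[int, ...]   # assumed meaning of "target pointer"
    value_encoding: str
    value_targets: Tuple[int, ...]


# Hypothetical stand-in for the metadata database.
METADATA_DB = {
    "sensor": {
        "columns": [("device_id", "uint32"), ("ts", "uint64"), ("temp", "float64")],
        "primary_key": ["device_id", "ts"],
    }
}

_cache: Dict[str, TableInitInfo] = {}
metadata_reads = 0  # counts metadata database accesses, for illustration only


def load_init_info(table_name):
    """Step 103: build the initialization information from the metadata database."""
    global metadata_reads
    metadata_reads += 1
    meta = METADATA_DB[table_name]
    names = tuple(n for n, _ in meta["columns"])
    types = tuple(t for _, t in meta["columns"])
    pk = tuple(meta["primary_key"])
    return TableInitInfo(
        column_names=names,
        column_types=types,
        primary_key_columns=pk,
        row_key_encoding="fixed-width-big-endian",
        row_key_targets=tuple(names.index(c) for c in pk),
        value_encoding="fixed-width-big-endian",
        value_targets=tuple(i for i, n in enumerate(names) if n not in pk),
    )


def get_init_info(table_name):
    """Step 102: use the in-memory copy if present, otherwise load and cache it."""
    info = _cache.get(table_name)
    if info is None:
        info = load_init_info(table_name)
        _cache[table_name] = info
    return info


first = get_init_info("sensor")
second = get_init_info("sensor")
```

The second call is served from memory, so the metadata database is read only once per table.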
4. An efficient storage system based on HBase, characterized in that the system comprises an HBase writing device, the HBase writing device comprising a first input unit, a first initialization unit, a primary-key-column byte-combination encoding unit, a non-primary-key-column byte-combination encoding unit, and an HBase writing unit, wherein:
The first input unit is configured to: input a row of user data to be stored in a target table;
The first initialization unit is configured to: determine whether initialization information for the byte-combination encoding of the target table exists in memory; if no initialization information exists, access the metadata database, initialize the byte-combination encoding of the target table, and write the initialization information into memory; if the initialization information exists, generate a primary-key-column byte-combination encoding trigger signal and send it to the primary-key-column byte-combination encoding unit;
The primary-key-column byte-combination encoding unit is configured to: according to the initialization information for the row-key byte-combination encoding of the target table in memory, parse the primary key column values out of the user data to be stored as the objects of byte-combination encoding, byte-encode and combine them one by one to form one byte sequence that serves as the row key of the key-value storage format, generate a non-primary-key-column byte-combination encoding trigger signal, and send it to the non-primary-key-column byte-combination encoding unit;
The non-primary-key-column byte-combination encoding unit is configured to: according to the initialization information for the value-field byte-combination encoding of the target table in memory, parse the non-primary-key column values out of the user data to be stored as the objects of byte-combination encoding, and byte-encode and combine them one by one to form one byte sequence that serves as the value-field content of the key-value storage format;
The HBase writing unit is configured to: assemble the row-key byte sequence obtained by the primary-key-column byte-combination encoding unit and the value-field byte sequence obtained by the non-primary-key-column byte-combination encoding unit into a key-value pair, and call the data insertion method of the HBase client application programming interface to complete the HBase write.
5. The efficient storage system based on HBase according to claim 4, characterized in that the system further comprises an HBase query device, the HBase query device comprising a second input unit, a second initialization unit, a primary-key byte-combination encoding unit, a scanning unit, and a decoding unit, wherein:
The second input unit is configured to: input a query request for the target table;
The second initialization unit is configured to: determine whether initialization information for the byte-combination encoding of the target table exists in memory; if no initialization information exists, access the metadata database, initialize the byte-combination encoding of the target table, and write the initialization information into memory; if the initialization information exists, generate a primary-key byte-combination encoding trigger signal and send it to the primary-key byte-combination encoding unit;
The primary-key byte-combination encoding unit is configured to: according to the initialization information of the target table in memory, apply byte-combination encoding one by one to the primary keys that fall within the range of the query request, obtaining the row-key range in HBase that corresponds to the result set;
The scanning unit is configured to: scan HBase according to the row-key range, obtain the value range in HBase that corresponds to the result set, and send it to the decoding unit;
The decoding unit is configured to: receive the value range in HBase corresponding to the result set sent by the scanning unit, decode it according to the value-field byte-combination encoding scheme and its target pointer to obtain the result of the query request, and return the query result.
6. The efficient storage system based on HBase according to claim 4 or 5, characterized in that the initialization information comprises the column name list, the column data type list, the primary key column list, the row-key byte-combination encoding scheme and its target pointer, and the value-field byte-combination encoding scheme and its target pointer of the target table.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201410188339.9A CN104008134B (en) | 2014-05-06 | 2014-05-06 | Efficient storage method and system based on Hbase |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN104008134A (en) | 2014-08-27 |
| CN104008134B (en) | 2017-02-15 |
Family
ID=51368791
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201410188339.9A Active CN104008134B (en) | 2014-05-06 | 2014-05-06 | Efficient storage method and system based on Hbase |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN104008134B (en) |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN102332030A (en) * | 2011-10-17 | 2012-01-25 | 中国科学院计算技术研究所 | Data storage, management and query method and system for distributed key-value storage system |
| US20130103658A1 (en) * | 2011-10-19 | 2013-04-25 | Vmware, Inc. | Time series data mapping into a key-value database |
| CN103294678A (en) * | 2012-02-24 | 2013-09-11 | 深圳市腾讯计算机系统有限公司 | Data storage and access method and system based on LV (length-value) format |
Non-Patent Citations (2)
| Title |
|---|
| 冯亚丽,等: ""多格式海量数据统一存取的索引结构"", 《计算机应用研究》 * |
| 李业丽,等: ""基于多维数据模型的数据仓库的构建"", 《北京印刷学院学报》 * |
Cited By (12)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN104778097A (en) * | 2015-03-27 | 2015-07-15 | 新浪网技术(中国)有限公司 | Data recovery method and data recovery device |
| CN106326361A (en) * | 2016-08-10 | 2017-01-11 | 中国农业银行股份有限公司 | HBase database-based data inquiry method and device |
| CN106528674A (en) * | 2016-10-31 | 2017-03-22 | 厦门服云信息科技有限公司 | Method and device for high-performance query based on Hbase row keys |
| CN106528674B (en) * | 2016-10-31 | 2019-10-01 | 厦门服云信息科技有限公司 | Method and device for high-performance query based on Hbase row keys |
| CN112925836A (en) * | 2019-12-06 | 2021-06-08 | 腾讯科技(深圳)有限公司 | Data conversion method and equipment |
| CN112925836B (en) * | 2019-12-06 | 2024-05-28 | 腾讯科技(深圳)有限公司 | Data conversion method and device |
| CN111078753A (en) * | 2019-12-17 | 2020-04-28 | 联想(北京)有限公司 | HBase database-based time sequence data storage method and device |
| CN111078753B (en) * | 2019-12-17 | 2024-02-27 | 联想(北京)有限公司 | Time sequence data storage method and device based on HBase database |
| CN113094292A (en) * | 2020-01-09 | 2021-07-09 | 上海宝存信息科技有限公司 | Data storage device and non-volatile memory control method |
| US11520698B2 (en) | 2020-01-09 | 2022-12-06 | Shannon Systems Ltd. | Data storage device in a key-value storage architecture with data compression, and non-volatile memory control method |
| CN115146070A (en) * | 2022-06-28 | 2022-10-04 | 北京百度网讯科技有限公司 | Key value generation method, knowledge graph generation method, device, equipment and medium |
| CN116049197A (en) * | 2023-03-07 | 2023-05-02 | 中船重工奥蓝托无锡软件技术有限公司 | HBase-based data equilibrium storage method |
Also Published As
| Publication number | Publication date |
|---|---|
| CN104008134B (en) | 2017-02-15 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | C06 | Publication | |
| | PB01 | Publication | |
| | C10 | Entry into substantive examination | |
| | SE01 | Entry into force of request for substantive examination | |
| | C14 | Grant of patent or utility model | |
| | GR01 | Patent grant | |
| | CP01 | Change in the name or title of a patent holder | Address after: 430074, No. 88, postal academy road, Hongshan District, Hubei, Wuhan; Patentee after: Wuhan post and Telecommunications Science Research Institute Co., Ltd.; Address before: 430074, No. 88, postal academy road, Hongshan District, Hubei, Wuhan; Patentee before: Wuhan Inst. of Post & Telecom Science |