
CN113541694B - Data compression method, device and electronic equipment - Google Patents


Info

Publication number
CN113541694B
Authority
CN
China
Prior art keywords
parameter
event
data
type
identification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010296259.0A
Other languages
Chinese (zh)
Other versions
CN113541694A (en)
Inventor
胡霞
姚满海
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN202010296259.0A priority Critical patent/CN113541694B/en
Publication of CN113541694A publication Critical patent/CN113541694A/en
Application granted granted Critical
Publication of CN113541694B publication Critical patent/CN113541694B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Debugging And Monitoring (AREA)

Abstract

Embodiments of the application provide a data compression method, a data compression apparatus, and an electronic device. The method comprises: obtaining a first event record; invoking a first event definition that includes the parameter identifiers and preset values of the first event; comparing the first event definition with the first event record; and generating first event record compressed data according to the comparison result. The first event record compressed data includes first type parameter data, which includes the parameter identifier and record value of each first type parameter, and either second type parameter data or third type parameter data, each of which includes parameter identifiers only. A first type parameter is a parameter whose record value differs from its preset value; a second type parameter is a parameter whose record value is the same as its preset value; a third type parameter is a parameter whose record value is not recorded in the first event record. By reducing the parameter record values and parameter identifiers stored for an event record, the method effectively reduces the data volume of log data.

Description

Data compression method and device and electronic equipment
Technical Field
The present application relates to the field of intelligent terminals, and in particular, to a data compression method, apparatus and electronic device.
Background
For a personal terminal (for example, a mobile phone or a personal computer), the volume of log data grows as the number of application functions that the terminal can implement increases.
In some application scenarios, log data of a personal terminal needs to be uploaded to a server for comprehensive analysis (e.g., for failure analysis of the personal terminal). As the volume of log data grows, the storage space needed to hold the log data on the personal terminal and the data traffic consumed by uploading it both increase, which seriously affects the normal operation of the personal terminal and degrades the user experience.
Disclosure of Invention
Aiming at the problem of overlarge data volume of log data, the application provides a data compression method, a data compression device, electronic equipment and a computer readable storage medium.
The embodiment of the application adopts the following technical scheme:
In a first aspect, the present application provides a data compression method, including:
acquiring a first event record corresponding to a first event, wherein the first event record comprises a parameter identifier and a record value of one or more parameters defined in the first event;
invoking a first event definition corresponding to the first event, wherein the first event definition comprises the parameter identifiers and preset values of all parameters defined in the first event;
comparing the first event definition with the first event record, and generating first event record compressed data according to the comparison result, wherein the first event record compressed data comprises:
event identification data, wherein the event identification data comprises an event identification of the first event;
first type parameter data, which comprises the parameter identifier and record value of each first type parameter, wherein a first type parameter is a parameter, among the parameters contained in the first event record, whose record value differs from its preset value;
second type parameter data or third type parameter data, wherein:
the second type parameter data comprises the parameter identifier of each second type parameter, wherein a second type parameter is a parameter, among the parameters contained in the first event record, whose record value is the same as its preset value;
the third type parameter data comprises the parameter identifier of each third type parameter, wherein a third type parameter is a parameter, among the parameters contained in the first event definition, whose record value is not recorded in the first event record.
In a possible implementation manner based on the first aspect, the first event record compressed data includes:
whichever of the second type parameter data and the third type parameter data has the smaller data volume.
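The selection described in this implementation can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name choose_mode is hypothetical, the data volume is approximated by the number of identifiers to be stored, and the tie-breaking rule (preferring the second type) is an arbitrary choice that the application does not specify.

```python
def choose_mode(second_type_ids, third_type_ids):
    """Pick which identifier list to store alongside the first type data.

    Hypothetical cost model: every stored identifier costs the same,
    so the shorter list is treated as the smaller data volume.
    """
    if len(second_type_ids) <= len(third_type_ids):
        return "second"   # store identifiers of parameters equal to their preset value
    return "third"        # store identifiers of parameters missing from the record


# Two parameters equal their preset value, one parameter was not recorded:
print(choose_mode([0, 2], [1]))
```

With a real encoding, the comparison would be made on the encoded byte lengths rather than on the number of identifiers.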
In a possible implementation manner based on the first aspect, the first type of parameter data further includes a first parameter class identifier, the second type of parameter data further includes a second parameter class identifier, and the third type of parameter data further includes a third parameter class identifier.
In a possible implementation manner based on the first aspect, the event identification data further includes a first compression mode identification or a second compression mode identification, where:
when the first event record compressed data comprises second type parameter data, the event identification data comprises a first compression mode identification;
When the first event record compression data includes third type parameter data, the event identification data includes a second compression mode identification.
In a possible implementation manner based on the first aspect, the binary code of the event identification data is a code generated by replacing the last bit of the event identification offset code with the binary code of the first compression mode identification or the second compression mode identification, wherein the event identification offset code is a code generated by shifting the binary code of the event identification by 1 bit.
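The bit manipulation described in this implementation can be sketched in Python. The sketch assumes that the shift is a left shift (so that the last bit becomes free), and that the first and second compression mode identifications are encoded as 0 and 1 respectively; both are assumptions, as are the function names.

```python
FIRST_MODE = 0    # assumed bit value of the first compression mode identification
SECOND_MODE = 1   # assumed bit value of the second compression mode identification


def encode_event_id(event_id, mode_bit):
    """Shift the event identification by 1 bit and put the mode bit in the last bit."""
    if mode_bit not in (FIRST_MODE, SECOND_MODE):
        raise ValueError("mode_bit must be 0 or 1")
    offset_code = event_id << 1          # event identification offset code
    return offset_code | mode_bit        # last bit replaced by the mode identification


def decode_event_id(code):
    """Recover (event_id, mode_bit) from the event identification data."""
    return code >> 1, code & 1


code = encode_event_id(913000007, SECOND_MODE)
print(code, decode_event_id(code))
```

Packing the mode into the identifier in this way avoids spending a separate byte on the compression mode.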
In a possible implementation manner based on the first aspect, the first type of parameter data further includes a first parameter class identifier, the second type of parameter data further includes a second parameter class identifier, and the third type of parameter data further includes a second parameter class identifier.
In a possible implementation manner based on the first aspect, the method includes:
the binary code of the first type parameter data comprises a first type parameter identification code, wherein the first type parameter identification code is generated by replacing the last bit of a first type parameter offset code with the binary code of the first parameter class identifier, and the first type parameter offset code is generated by shifting the binary code of the parameter identifier of the first type parameter by 1 bit to the left;
the binary code of the second type parameter data comprises a second type parameter identification code, wherein the second type parameter identification code is generated by replacing the last bit of a second type parameter offset code with the binary code of the second parameter class identifier, and the second type parameter offset code is generated by shifting the binary code of the parameter identifier of the second type parameter by 1 bit to the left;
the binary code of the third type parameter data comprises a third type parameter identification code, wherein the third type parameter identification code is generated by replacing the last bit of a third type parameter offset code with the binary code of the second parameter class identifier, and the third type parameter offset code is generated by shifting the binary code of the parameter identifier of the third type parameter by 1 bit.
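As an illustration of how a parser might use the class bit, the following Python sketch encodes a parameter identifier together with a one-bit class identifier and then tests whether a record value follows the identifier. The bit values 0 and 1 and all function names are assumptions made for this example only.

```python
FIRST_CLASS = 0    # assumed bit value of the first parameter class identifier
SECOND_CLASS = 1   # assumed bit value of the second parameter class identifier


def encode_param_id(param_id, class_bit):
    """Shift the parameter identifier left by 1 bit and store the class bit last."""
    if class_bit not in (FIRST_CLASS, SECOND_CLASS):
        raise ValueError("class_bit must be 0 or 1")
    return (param_id << 1) | class_bit


def param_id_of(code):
    """Recover the parameter identifier from an identification code."""
    return code >> 1


def value_follows(code):
    """Only first type parameters carry a record value after the identifier."""
    return (code & 1) == FIRST_CLASS


with_value = encode_param_id(3, FIRST_CLASS)      # identifier followed by a record value
id_only = encode_param_id(3, SECOND_CLASS)        # identifier stored on its own
print(with_value, id_only, value_follows(with_value), value_follows(id_only))
```

Because the second type and third type parameter data never appear together, one shared class bit is enough to separate "identifier plus value" entries from "identifier only" entries.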
In a second aspect, the present application provides a data parsing method, including:
Acquiring first event record compression data, wherein the first event record compression data is generated by compressing a first event record according to the method of the first aspect, and the first event record corresponds to the first event;
invoking a first event definition corresponding to the first event, wherein the first event definition comprises parameter identifiers and preset values of all parameters defined in the first event;
parsing the first event record compressed data according to the first event definition, comprising:
when the first event record compressed data includes second type parameter data, for each parameter whose parameter identifier is recorded in the first event record compressed data without a parameter value, taking the preset value in the first event definition as the parameter value;
or
when the first event record compressed data includes third type parameter data, for each parameter whose parameter identifier is not recorded in the first event record compressed data, taking the preset value in the first event definition as the parameter value.
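The two parsing branches can be sketched in Python at the logical level (without any binary encoding). The dictionary layout of the compressed data, the key names "mode", "first" and "ids", and the function name decompress_event_record are all hypothetical; the sketch also assumes that every parameter in the definition has a preset value.

```python
def decompress_event_record(definition, compressed):
    """Rebuild an event record from compressed data and the event definition.

    definition: dict mapping parameter identifier -> preset value
    compressed: dict with keys
        "mode":  "second" or "third"
        "first": dict of identifier -> record value (first type parameters)
        "ids":   list of identifiers stored without a value
    """
    record = dict(compressed["first"])
    if compressed["mode"] == "second":
        # Identifier present without a value: the value equals the preset value.
        for pid in compressed["ids"]:
            record[pid] = definition[pid]
    else:
        # Identifiers in "ids" were never recorded; every other parameter
        # that is missing from the compressed data takes its preset value.
        unrecorded = set(compressed["ids"])
        for pid, preset in definition.items():
            if pid not in record and pid not in unrecorded:
                record[pid] = preset
    return record


definition = {0: 10, 1: 1.0, 2: "ABC"}
by_second = {"mode": "second", "first": {0: 25}, "ids": [2]}
by_third = {"mode": "third", "first": {0: 25}, "ids": [1]}
print(decompress_event_record(definition, by_second))
print(decompress_event_record(definition, by_third))
```

Both calls describe the same original record, in which parameter 0 was recorded as 25, parameter 2 was recorded with its preset value, and parameter 1 was not recorded.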
In a third aspect, the present application provides a data compression apparatus, comprising:
an event record obtaining module, configured to obtain a first event record corresponding to a first event, wherein the first event record comprises a parameter identifier and a record value of one or more parameters defined in the first event;
an event definition calling module, configured to invoke a first event definition corresponding to the first event, wherein the first event definition comprises the parameter identifiers and preset values of all parameters defined in the first event;
a compression module, configured to compare the first event definition with the first event record and generate first event record compressed data according to the comparison result, wherein the first event record compressed data comprises:
event identification data, wherein the event identification data comprises an event identification of the first event;
first type parameter data, which comprises the parameter identifier and record value of each first type parameter, wherein a first type parameter is a parameter, among the parameters contained in the first event record, whose record value differs from its preset value;
second type parameter data or third type parameter data, wherein:
the second type parameter data comprises the parameter identifier of each second type parameter, wherein a second type parameter is a parameter, among the parameters contained in the first event record, whose record value is the same as its preset value;
the third type parameter data comprises the parameter identifier of each third type parameter, wherein a third type parameter is a parameter, among the parameters contained in the first event definition, whose record value is not recorded in the first event record.
In a fourth aspect, the present application provides a data parsing apparatus, comprising:
a data acquisition module, configured to acquire first event record compressed data, wherein the first event record compressed data is data generated by compressing a first event record according to the method of the first aspect, and the first event record corresponds to a first event;
an event definition calling module, configured to invoke a first event definition corresponding to the first event, wherein the first event definition comprises the parameter identifiers and preset values of all parameters defined in the first event;
a data parsing module, configured to parse the first event record compressed data according to the first event definition, comprising:
when the first event record compressed data includes second type parameter data, for each parameter whose parameter identifier is recorded in the first event record compressed data without a parameter value, taking the preset value in the first event definition as the parameter value;
or
when the first event record compressed data includes third type parameter data, for each parameter whose parameter identifier is not recorded in the first event record compressed data, taking the preset value in the first event definition as the parameter value.
In a fifth aspect, the present application provides an electronic device comprising a memory for storing computer program instructions and a processor for executing the program instructions, wherein the computer program instructions, when executed by the processor, trigger the electronic device to perform the method steps of an embodiment of the present application.
In a sixth aspect, the present application provides a computer readable storage medium having a computer program stored therein, which when run on a computer causes the computer to perform the method of the embodiments of the present application.
According to the technical scheme provided by the embodiment of the application, at least the following technical effects can be realized:
According to the method of the embodiment of the application, the data volume of the log data is effectively reduced by reducing the parameter record value and the parameter identifier recorded in the event record, so that the storage space occupation of the log data and the flow consumption in the transmission process are reduced;
Furthermore, according to the method of an embodiment of the present application, for files with various data structures, corresponding data compression operations can be performed without modifying the code generating the data structure, so that risks caused by modifying programs are avoided;
Furthermore, according to the method of an embodiment of the present application, the compression operation depends on the event definition, and the compression result cannot be parsed without the corresponding event definition, so that data security is greatly improved.
Drawings
FIG. 1 is a flow chart of an embodiment of a data compression method according to the present application;
FIG. 2 is a flow chart of an embodiment of a data parsing method according to the present application;
FIG. 3 is a schematic diagram of an embodiment of a data compression device according to the present application;
FIG. 4 is a schematic structural diagram of an embodiment of a data parsing apparatus according to the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be clearly and completely described below with reference to specific embodiments of the present application and corresponding drawings. It will be apparent that the described embodiments are only some, but not all, embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
The terminology used in the description of the embodiments of the application herein is for the purpose of describing particular embodiments of the application only and is not intended to be limiting of the application.
Aiming at the problem that the data volume of the log data is overlarge, the application provides a data compression method to reduce unnecessary data codes in the log data, thereby reducing the data volume of the log data.
For log data, one possible compression method is to store and transmit the data as JavaScript Object Notation (JSON) key-value pairs. However, storing key-value pairs means storing a large number of key strings, which occupy a large amount of storage space, and the key strings are transmitted in plaintext during data transmission, which creates an information security risk.
Another possible compression method is to use Protocol Buffers (Protobuf). This method proceeds as follows:
the structured data is defined as a message, where a message corresponds to a class in Java, and the message definitions are stored in a data structure definition file;
a code file is generated for each message according to the data structure definition file, for example, one class file per message;
each member of the message is numbered, and the data is stored in Tag-Value form: the number of the member and the encoding type of the member are stored in the Tag, and the corresponding data value is stored in the Value.
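For reference, the Tag described above is formed in the Protocol Buffers wire format by combining the field number with a 3-bit wire type. The following Python sketch shows that arithmetic only; the function name is chosen for illustration, and the sketch does not perform the varint encoding that Protobuf applies to the tag itself.

```python
WIRE_VARINT = 0             # wire type used for int32, int64, bool, enum, ...
WIRE_LENGTH_DELIMITED = 2   # wire type used for string, bytes, embedded messages


def protobuf_tag(field_number, wire_type):
    """Combine a field number and a wire type into a Protobuf tag value."""
    return (field_number << 3) | wire_type


print(hex(protobuf_tag(1, WIRE_VARINT)))             # field 1 as a varint
print(hex(protobuf_tag(2, WIRE_LENGTH_DELIMITED)))   # field 2 as a string
```

The first call yields 0x8 and the second yields 0x12, which are the tag bytes that precede the values of such fields in an encoded message.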
In the above scheme, a separate class file must be generated for each message, which produces auxiliary code. When there are a large number of messages, a large amount of auxiliary code is generated, and the data compression effect is not ideal. Moreover, implementing the scheme requires modifying the program source code, which greatly increases the implementation difficulty.
Another possible compression method is to use the data serialization system Avro. Avro is similar to Protobuf in its overall implementation concept. In an Avro-based compression scheme, the member data is numbered and encoded in the order of the numbers, but no Tag is stored; encoding and decoding rely on the order given in the event definition file. Because an Avro-based compression scheme also needs to generate corresponding code files, such as class files, a large amount of auxiliary code is still generated, and the program source code still needs to be modified to implement the scheme.
Based on the above analysis of different compression schemes, an embodiment of the present application proposes a new data compression method to implement better data compression on log data. To put forward the method of the embodiments of the present application, the inventors first analyze the data structure of log data. Generally, log data is used to record events of a personal terminal. For each event, the event may contain one or more parameters describing the event, the parameters describing different event states by different parameter values. For an event record of an event, the event record is used to record parameter values of one or more of all parameters included in the event (hereinafter, the parameter values recorded in the event record are simply referred to as record values).
Taking mobile phone log data as an example, the mobile phone can generate log data when in operation, and the log data is used for recording the operation state of the mobile phone. The log data is recorded in the form of individual events (events). One event may have one or more different event records for different event generation times and/or event generation sources. Each event record contains one or more records of parameters (Param), each record of parameters contains a parameter identification name corresponding to the parameter and a record value corresponding to the parameter, and the event record describes the state information of the event through the records of the parameters.
Log data is a collection of event records. Therefore, if the data volume of each event record can be reduced, the data volume of the log data is reduced, and data compression of the log data is achieved.
Generally, for an event, a parameter included in the event has a preset parameter value (hereinafter, the parameter value preset for the parameter is simply referred to as a preset value). The preset value of the parameter is the parameter value of the parameter when the event is in the preset state. The value of the parameter at the time of normal state of the event is generally set to a preset value, so that if the event is abnormal, the parameter value of one or more parameters included in the event is not the preset value.
Because a preset value is a specific value determined in advance, if a parameter is known to be equal to its preset value, the parameter value can be recovered by retrieving the preset value, even if the specific parameter value is not recorded.
Further, in an actual application scenario, in order to facilitate interpretation of log data, event definitions are generally predefined, and the data format of the log data is normalized by the event definitions. The event definition includes definitions of all parameters included by the event, e.g., parameter identification names of the parameters, data types of parameter values of the parameters. The event definition is saved by the generating side (e.g., personal terminal) and the parsing side (e.g., server that needs to analyze the log data) of the log data. When generating log data, if an event record of a certain event is to be generated, calling a pre-stored event definition of the event, and generating the event record meeting the data format of the event definition through the definition of parameters in the event definition. When analyzing the log data, if the event record of a certain event is to be analyzed, calling the pre-stored event definition of the event, and analyzing the event record through the definition of the parameters in the event definition.
Therefore, given that event definitions exist, if the preset value of each parameter is written into the event definition, then when an event record is compressed, the compression result does not need to store a parameter value for any parameter whose value equals the preset value. When the compression result of the event record is parsed, the preset value in the event definition is used directly for any parameter found without a recorded parameter value.
Further, in an actual application scenario, an event record may contain parameters for which no parameter value is recorded, or parameters may be defined in the event definition but not appear in the event record at all. In these cases, if the parameters that have a recorded parameter value in the event record can be identified, the parameters without a recorded parameter value can be identified by comparison with all the parameters defined in the event definition.
To sum up, in one embodiment of the present application, when the event record is compressed, in the compression result:
for a parameter whose parameter value is not the preset value, both the parameter identifier and the parameter value are retained;
for a parameter whose parameter value is the preset value, the parameter value is not retained and only the parameter identifier is retained;
for a parameter for which no parameter value is recorded in the event record, or a parameter that is defined in the event definition but does not appear in the event record at all, neither the parameter value nor the parameter identifier is retained.
Conversely, if the parameters for which no parameter value is recorded in the event record can be identified, the parameters that do have a recorded parameter value can also be identified by comparison with all the parameters defined in the event definition.
Thus, in another embodiment of the present application, when the event record is compressed, in the compression result:
for a parameter whose parameter value is not the preset value, both the parameter identifier and the parameter value are retained;
for a parameter for which no parameter value is recorded in the event record, or a parameter that is defined in the event definition but does not appear in the event record at all, the parameter value is not retained and only the parameter identifier is retained;
for a parameter whose parameter value is the preset value, neither the parameter value nor the parameter identifier is retained.
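The three categories of parameters that both variants rely on can be illustrated with a minimal Python sketch. The function name classify_params and the dictionary-based representation of the event definition and the event record are assumptions made for illustration and are not part of the application.

```python
def classify_params(definition, record):
    """Split the parameters of an event into the three categories.

    definition: dict mapping parameter identifier -> preset value
    record:     dict mapping parameter identifier -> record value
    """
    # Recorded with a value that differs from the preset value.
    differs = {pid: val for pid, val in record.items() if val != definition[pid]}
    # Recorded with a value equal to the preset value.
    equals_preset = [pid for pid, val in record.items() if val == definition[pid]]
    # Defined in the event definition but absent from the event record.
    not_recorded = [pid for pid in definition if pid not in record]
    return differs, equals_preset, not_recorded


definition = {"Age": 10, "Money": 1.0, "Name": "ABC"}
record = {"Age": 25, "Name": "ABC"}   # "Money" was not recorded
print(classify_params(definition, record))
```

Here Age differs from its preset value, Name equals its preset value, and Money is missing from the record, so the call prints ({'Age': 25}, ['Name'], ['Money']). The first variant stores the identifier Name and drops Money; the second variant stores the identifier Money and drops Name.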
The following describes in detail the technical solutions provided by the embodiments of the present application with reference to the accompanying drawings.
FIG. 1 is a flow chart of an embodiment of a data compression method according to the present application. In one embodiment of the present application, as shown in fig. 1, the process of compressing the first event record includes:
Step 110, acquiring a first event record corresponding to a first event, wherein the first event record comprises parameter identifiers and record values of one or more parameters defined in the first event;
Step 120, calling a first event definition corresponding to the first event, wherein the first event definition comprises parameter identifiers and preset values of all parameters defined in the first event;
step 130, comparing the first event definition with the first event record, and generating compressed data of the first event record according to the comparison result.
Specifically, in step 130, the generated first event record compressed data includes event identification data and first type parameter data, where:
Event identification data includes an event identification of the first event;
the first type of parameter data comprises a parameter identifier and a recorded value of a first type of parameter, wherein the first type of parameter is a parameter of which the recorded value is different from the preset value in parameters contained in the first event record.
Further, in step 130, the generated first event record compressed data also includes either second type parameter data or third type parameter data (that is, it includes exactly one of the two and never both at the same time), wherein:
The second type of parameter data comprises a parameter identifier of a second type of parameter, wherein the second type of parameter is a parameter of which the recording value of the parameter is the same as the preset value in the parameters contained in the first event record;
The third type of parameter data includes a parameter identifier of a third type of parameter, and the third type of parameter is a parameter in which a recorded value of the parameter is not recorded in the first event record, among parameters included in the first event definition.
That is, step 130 includes at least a first implementation and a second implementation. In the practical application process, one implementation manner of the first implementation manner and the second implementation manner is selected to implement step 130.
Specifically, in a first implementation of step 130, the first event record compression data includes event identification data, first type parameter data, and second type parameter data.
Specifically, in a second implementation of step 130, the first event record compression data includes event identification data, first type parameter data, and third type parameter data.
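Steps 110 to 130 and the two implementations of step 130 can be sketched in Python at the logical level. This is a sketch under stated assumptions rather than the implementation of the application: the function name, the "second"/"third" mode labels, and the dictionary layout of the result are hypothetical, and no binary encoding is performed.

```python
def compress_event_record(event_id, definition, record, mode):
    """Compress one event record against its event definition.

    definition: dict mapping parameter identifier -> preset value (step 120)
    record:     dict mapping parameter identifier -> record value (step 110)
    mode:       "second" for the first implementation of step 130,
                "third" for the second implementation
    """
    if mode not in ("second", "third"):
        raise ValueError("mode must be 'second' or 'third'")
    # First type parameter data: identifier and value of every parameter
    # whose record value differs from the preset value.
    first = {pid: val for pid, val in record.items() if val != definition[pid]}
    if mode == "second":
        # Second type parameter data: identifiers of parameters equal to the preset value.
        ids = [pid for pid, val in record.items() if val == definition[pid]]
    else:
        # Third type parameter data: identifiers of parameters that were not recorded.
        ids = [pid for pid in definition if pid not in record]
    return {"event_id": event_id, "mode": mode, "first": first, "ids": ids}


definition = {0: 10, 1: 1.0, 2: "ABC"}
record = {0: 25, 2: "ABC"}            # parameter 1 was not recorded
print(compress_event_record(913000007, definition, record, "second"))
print(compress_event_record(913000007, definition, record, "third"))
```

In the first call the identifier list is [2] and in the second it is [1]; in both cases the preset value "ABC" of parameter 2 is never stored in the compressed data.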
According to the method of the embodiment of the application, the data volume of the log data is effectively reduced by reducing the parameter record value and the parameter identification recorded in the event record, so that the storage space occupation of the log data and the flow consumption in the transmission process are reduced.
Furthermore, according to the method of an embodiment of the present application, for files with various data structures, the corresponding data compression operation can be performed without modifying the code generating the data structure, thereby avoiding the risk caused by modifying the program.
Furthermore, according to the method of an embodiment of the present application, the compression operation depends on the event definition, and the compression result cannot be parsed without the corresponding event definition, so that data security is greatly improved.
Specifically, in the embodiment of the present application, the parameter identifier may be a parameter number (ID) or a parameter name. In an actual application scenario, the type of parameter identifier can be chosen according to the application requirements. For example, in one application scenario, the parameter number is used as the parameter identifier because its code length is shorter than that of the parameter name. In another application scenario, the parameter name is used as the parameter identifier because parameter numbers may repeat.
It should be noted that, in the embodiments and application scenarios of the present disclosure, the object of data compression is an event record of log data. However, the data compression method provided by the embodiments of the application is not limited to event records of log data; it can be applied to any data whose record structure is similar to that of an event record of log data.
The specific analysis process and execution process of the embodiment shown in fig. 1 are described below in a specific application scenario.
In an application scenario, each event has a corresponding event definition, which specifies the event identification (ID) and the parameters of the event. The event definition defines the number of event parameters and the data types of the parameter values, and is in effect a configuration file. For example, assume that event A has the following event (Event) definition:
<Event id=913000007>
  <Param id=0 name="Age" type="int32" default=10/>
  <Param id=1 name="Money" type="double" default=1.0/>
  <Param id=2 name="Name" type="string" default="ABC"/>
  <Param id=3 name="Data" type="event" classID=913009001/>
</Event>
In event definition:
Event ID: the unique identifier of each event, comparable to a person's identity card number. In the event definition of event A, Event id=913000007, so 913000007 is the event ID of event A;
Parameters: an event has one or more parameters for recording information about the event. In the parameter definition, each parameter has its own parameter number (ID), parameter name, parameter type, and default value. In the application scenario descriptions of the following embodiments, a specific parameter is referred to by its parameter name for convenience.
Specifically, the parameter definition includes:
Parameter ID: for example, in id=0, 0 is the parameter ID. The parameter ID is unique within one event, that is, one event contains only one parameter whose parameter ID is 0. Parameter IDs in different events are independent of each other: one event may contain a parameter with parameter ID 0 and another event may also contain a parameter with parameter ID 0, and the two are unrelated;
Parameter name: the name of the parameter. For example, in name="Age", Age is the parameter name;
Parameter type: the data type of the parameter value. For example, in type="double", double is the parameter type;
Default value: the preset value of the parameter in the event parameter definition. For example, in <Param id=0 name="Age" type="int32" default=10/>, default=10 means that 10 is the default value of Age. In the event definition, a parameter whose parameter type is event has no default value.
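As an illustration only, the event definition of event A can be held in memory as a small data structure. The following Python sketch is not part of the embodiment; the class names ParamDef and EventDef, their field names, and the EVENT_DEFS lookup table are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple


@dataclass(frozen=True)
class ParamDef:
    """One <Param .../> entry of an event definition (hypothetical layout)."""
    param_id: int                    # unique within one event
    name: str
    type: str
    default: Optional[Any] = None    # parameters of type "event" have no default
    class_id: Optional[int] = None   # only used when type == "event"


@dataclass(frozen=True)
class EventDef:
    """One <Event>...</Event> entry of the event definition file."""
    event_id: int
    params: Tuple[ParamDef, ...]


EVENT_A = EventDef(
    event_id=913000007,
    params=(
        ParamDef(0, "Age", "int32", 10),
        ParamDef(1, "Money", "double", 1.0),
        ParamDef(2, "Name", "string", "ABC"),
        ParamDef(3, "Data", "event", class_id=913009001),
    ),
)

# An event definition file holds many definitions; here they are indexed by event ID.
EVENT_DEFS = {EVENT_A.event_id: EVENT_A}
```

Indexing by event ID mirrors the way the parsing end later looks up a definition from the event ID stored in the compressed data.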
Further, in some application scenarios there are multiple events, so the event definition file is composed of multiple event definitions. The definition of event A above is only one event definition in the event definition file. For example, in a specific application scenario, the event definition file is as follows:
Based on the event definition of event A, in a specific application scenario, an event record A for event A may, by way of example, have the following format:
{"EventID":913000007;"Age":23;"Money":1.0;"Name":"ABC"}.
In event record A, the recorded parameter data falls into the following three classes:
Class A data (the first type of parameter data of the embodiment shown in fig. 1): the recorded value of the parameter in the event record differs from the preset value. In mobile phone log data, such data generally represents abnormal data, that is, a place where the program went wrong. Take Age in the example: the event definition gives default=10, so the preset value of Age is 10, while event record A contains "Age":23, so the recorded value of Age is 23. In the mobile phone log scenario, the preset value is the correct value of the parameter when the program runs, so 10 is normal and 23 is abnormal. Class A data is therefore data that differs from the preset value;
Class B data (the second type of parameter data of the embodiment shown in fig. 1): the recorded value of the parameter in the event record is the same as the preset value. In mobile phone log data, such data is normal data produced while the program runs, or a preset value used to mark normal data, such as Money and Name;
Class C data (the third type of parameter data of the embodiment shown in fig. 1): parameters for which no value is recorded in the event record, such as the parameter Data in the event above. Data is defined in the event definition, but the event record contains no recorded value for it. In mobile phone log data, this corresponds to the following situation: the event definition has 4 parameters, Age, Money, Name and Data, and the program assigns them during operation in the order Money, Name, Age, Data. The assignments Money=1.0 and Name="ABC" are normal. Age is then assigned 23, and while the code between the assignment of Age and the assignment of Data is running, the program crashes because of the erroneous value of Age, so Data never receives a value. Therefore, when the log is reported, only the values of the three parameters Money, Name and Age can be reported, and no value can be reported for Data.
Taking a mobile phone log as an example, among the three classes of data, class A data represents fault information in the log; it is key information and must be stored completely. Class B data has a recorded value identical to the preset value of the parameter in the event definition, so its value need not be stored at compression time and is restored from the preset value in the event definition at parsing time. Class C data has no assigned value, so there is no value to store.
Therefore, the data compression result stores no parameter values for class B and class C data. To let the parsing end tell whether a parameter without a stored value is class B data or class C data, the two are distinguished by whether the parameter identifier is stored in the compression result.
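The three-way classification described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the embodiment's implementation: the event definition is reduced to a hypothetical name-to-preset dictionary (DEFAULTS), the event record to a name-to-value dictionary (RECORD_A), and the function name classify is invented for the example.

```python
# Preset values from the event definition of event A (Data has no default).
DEFAULTS = {"Age": 10, "Money": 1.0, "Name": "ABC", "Data": None}

# Event record A: Data was never assigned, so it is absent from the record.
RECORD_A = {"Age": 23, "Money": 1.0, "Name": "ABC"}


def classify(defaults, record):
    """Return a mapping from parameter name to its class: "A", "B" or "C"."""
    classes = {}
    for name, preset in defaults.items():
        if name not in record:
            classes[name] = "C"      # defined, but no recorded value
        elif record[name] != preset:
            classes[name] = "A"      # recorded value differs from the preset
        else:
            classes[name] = "B"      # recorded value equals the preset
    return classes


print(classify(DEFAULTS, RECORD_A))
# {'Age': 'A', 'Money': 'B', 'Name': 'B', 'Data': 'C'}
```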
Specifically, the first implementation of step 130 may be described as follows:
For class A data, complete information is stored, that is, parameter identifier (parameter name or parameter ID) + parameter value;
For class B data, only the parameter identifier (parameter name or parameter ID) is stored, which marks the parameter as class B data;
For class C data, nothing is stored.
Take the event definition of event A and event record A above as an example. In event record A, Age is class A data, Money and Name are class B data, and Data is class C data. After data compression based on the first implementation of step 130, the following event record compressed data A1 is obtained:
{913000007;Age:23;Money;Name}.
In the event record compressed data A1 described above:
Event ID 913000007 is used to find the event definition with event ID 913000007 in the event definition file;
Age:23 is class A data, which must be stored completely, so both the parameter identifier Age and the parameter value 23 are stored;
Money and Name are class B data. To compress the data, only the parameter names Money and Name are stored; they mark that the values of Money and Name equal the preset values, and at parsing time the values are restored from the preset values in the event definition file;
Data has no assigned value and is not stored at all. At parsing time, because the stored information {913000007;Age:23;Money;Name} contains nothing about Data, the parsing end knows that Data is class C data: it has no value and is not assigned the preset value.
Compared with event record A, event record compressed data A1 stores nothing for class C data, which saves the space that class C data would otherwise occupy, and stores only the parameter name of class B data as a marker, without the parameter value, which also saves space.
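The first implementation of step 130 can be sketched as follows. The textual output format simply imitates the example A1 above; the function name compress_first and the dictionary-based inputs are assumptions made for this illustration and are not prescribed by the embodiment.

```python
DEFAULTS = {"Age": 10, "Money": 1.0, "Name": "ABC", "Data": None}
RECORD_A = {"Age": 23, "Money": 1.0, "Name": "ABC"}


def compress_first(event_id, defaults, record):
    """First implementation: store A as name:value, B as name only, C not at all."""
    fields = [str(event_id)]
    for name, preset in defaults.items():
        if name not in record:
            continue                                 # class C: nothing stored
        if record[name] != preset:
            fields.append(f"{name}:{record[name]}")  # class A: identifier + value
        else:
            fields.append(name)                      # class B: identifier only
    return "{" + ";".join(fields) + "}"


print(compress_first(913000007, DEFAULTS, RECORD_A))
# {913000007;Age:23;Money;Name}
```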
Specifically, the second implementation of step 130 may be described as follows:
For class A data, complete information is stored, that is, parameter identifier (parameter name or parameter ID) + actual value of the parameter;
For class B data, nothing is stored;
For class C data, only the parameter identifier (parameter name or parameter ID) is stored, which marks the parameter as class C data.
Take the event definition of event A and event record A above as an example. After data compression based on the second implementation of step 130, the following event record compressed data A2 is obtained:
{913000007;Age:23;Data}.
In the event record compressed data A2 described above:
As in event record compressed data A1, the event ID 913000007 and the class A data Age:23 must be stored completely and remain unchanged;
For class B data, event record compressed data A2 stores no information. At parsing time, because {913000007;Age:23;Data} contains nothing about Money and Name, their values are restored from the preset values in the event definition;
For class C data, event record compressed data A2 stores only the parameter name, which marks Data as unassigned class C data. At parsing time Data is given no value, and the preset value in the event definition is not used.
Compared with event record A, event record compressed data A2 stores nothing for class B data, which saves the space that class B data would otherwise occupy, and stores only the parameter name of class C data as a marker, without a parameter value, which also saves space.
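The second implementation of step 130 differs only in which class keeps its identifier. The sketch below reproduces the example A2; as before, the function name compress_second and the dictionary inputs are hypothetical and serve only to illustrate the rule.

```python
DEFAULTS = {"Age": 10, "Money": 1.0, "Name": "ABC", "Data": None}
RECORD_A = {"Age": 23, "Money": 1.0, "Name": "ABC"}


def compress_second(event_id, defaults, record):
    """Second implementation: store A as name:value, C as name only, B not at all."""
    fields = [str(event_id)]
    for name, preset in defaults.items():
        if name not in record:
            fields.append(name)                      # class C: identifier only
        elif record[name] != preset:
            fields.append(f"{name}:{record[name]}")  # class A: identifier + value
        # class B: nothing stored, restored from the preset at parsing time
    return "{" + ";".join(fields) + "}"


print(compress_second(913000007, DEFAULTS, RECORD_A))
# {913000007;Age:23;Data}
```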
Further, based on the data compression method of the embodiment shown in fig. 1, an embodiment of the present application further provides a data parsing method for compressed data.
FIG. 2 is a flow chart of an embodiment of a data parsing method according to the present application. In an embodiment of the present application, as shown in fig. 2, the process of parsing compressed data for a first event record includes:
Step 210, obtaining compressed data of a first event record, wherein the compressed data of the first event record is generated by compressing the first event record according to any one of the methods of claims 1 to 7, and the first event record corresponds to the first event;
Step 220, calling a first event definition corresponding to the first event, wherein the first event definition comprises the parameter identifiers and preset values of all parameters defined in the first event;
Step 230, parsing the first event record compressed data according to the first event definition, including:
When the first event record compressed data includes second type parameter data, for each parameter whose parameter identifier is recorded in the first event record compressed data but whose parameter value is not, taking the preset value in the first event definition as the parameter value;
or
when the first event record compressed data includes third type parameter data, for each parameter whose parameter identifier is not recorded in the first event record compressed data, taking the preset value in the first event definition as the parameter value.
Specifically, the implementation of step 230 depends on the implementation of step 130.
In one implementation of step 230, when step 130 adopts the first implementation, for each parameter whose parameter identifier is recorded in the first event record compressed data but whose parameter value is not, the preset value in the first event definition is taken as the parameter value;
In another implementation of step 230, when step 130 adopts the second implementation, for each parameter whose parameter identifier is not recorded in the first event record compressed data, the preset value in the first event definition is taken as the parameter value.
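The two parsing rules of step 230 can be sketched together. In this illustration the compressed data is assumed to have already been split into a dictionary of parameters stored with values (valued) and a list of parameters stored as bare identifiers (bare); that split, the function name parse, and the boolean flag are assumptions of the sketch rather than details given in the embodiment.

```python
DEFAULTS = {"Age": 10, "Money": 1.0, "Name": "ABC", "Data": None}


def parse(defaults, valued, bare, second_implementation):
    """Rebuild an event record from compressed data.

    valued: parameters stored with identifier and value (class A).
    bare:   parameters stored as identifier only (class B in the first
            implementation, class C in the second).
    """
    record = dict(valued)
    for name, preset in defaults.items():
        if name in valued:
            continue
        listed = name in bare
        # First implementation:  listed -> class B (restore preset),
        #                        absent -> class C (leave unassigned).
        # Second implementation: listed -> class C (leave unassigned),
        #                        absent -> class B (restore preset).
        if listed != second_implementation:
            record[name] = preset
    return record


# A1 = {913000007;Age:23;Money;Name}, produced by the first implementation
print(parse(DEFAULTS, {"Age": 23}, ["Money", "Name"], False))
# A2 = {913000007;Age:23;Data}, produced by the second implementation
print(parse(DEFAULTS, {"Age": 23}, ["Data"], True))
```

Both calls rebuild the same record, in which Age is 23, Money and Name take their preset values, and Data stays unassigned.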
It should be noted that, in the embodiments and application scenarios of the present specification, the object of data parsing is the compression result of an event record of log data. However, the data parsing method provided by the embodiment of the present application is not limited to compressed event records of log data; it can be applied to any data compressed with the method of the embodiment of the present application.
Further, since the implementation of step 230 depends on the implementation of step 130, in one embodiment the data format of the event record compressed data must be unified between the data compression end (e.g., a mobile phone uploading log data) and the data parsing end (e.g., a server analyzing log data). That is, both ends need to agree on whether the event record compressed data contains the second type of parameter data or the third type of parameter data.
Further, in an actual application scenario, in an embodiment, the implementation of step 130 is determined according to the actual situation. Specifically, in one implementation of step 130, whichever of the second type of parameter data and the third type of parameter data has the smaller data amount is selected as the data stored in the first event record compressed data. That is, the first event record compressed data includes the type with the smaller data amount among the second type of parameter data and the third type of parameter data.
For example, event record A above contains 1 class A parameter, 2 class B parameters and 1 class C parameter: class B data accounts for 50% of the parameters, and class A and class C data for 25% each. Step 130 therefore achieves a better compression effect with the second implementation (event record compressed data A2 is shorter than event record compressed data A1).
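The selection rule can be sketched as a simple count comparison. The function name choose_implementation is hypothetical, and two points are assumptions of this sketch: "data amount" is approximated by the number of parameters in each class, and a tie is resolved in favor of the first implementation.

```python
DEFAULTS = {"Age": 10, "Money": 1.0, "Name": "ABC", "Data": None}
RECORD_A = {"Age": 23, "Money": 1.0, "Name": "ABC"}


def choose_implementation(defaults, record):
    """Return 1 or 2: the implementation of step 130 that stores fewer identifiers."""
    n_b = sum(1 for k, v in defaults.items() if k in record and record[k] == v)
    n_c = sum(1 for k in defaults if k not in record)
    # The first implementation stores class B identifiers, the second stores
    # class C identifiers, so pick the one whose stored class is smaller.
    # Tie-break toward the first implementation (an assumption of this sketch).
    return 1 if n_b <= n_c else 2


print(choose_implementation(DEFAULTS, RECORD_A))  # 2: one class C vs. two class B
```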
Further, in the event record compressed data A1 and the event record compressed data A2, the parameter identifier is a parameter name, such as Age, name, money, data. Since the parameter name is a character string and occupies a large space, in a more preferable solution, the parameter ID is used for the parameter identification.
Further, in one embodiment, in one implementation of step 130, description information is written in the event record compression data, where the description information is used to describe whether step 130 employs the first implementation or the second implementation. Thus, the data parsing end can determine the implementation to be adopted in step 230 according to the description information in the event record compressed data.
Specifically, in one implementation of step 130, the event identification data further includes a first compression mode identifier or a second compression mode identifier, where:
when the first event record compressed data comprises second type parameter data, the event identification data comprises a first compression mode identification;
When the first event record compression data includes the third type of parameter data, the event identification data includes a second compression mode identification.
Furthermore, in the practical application scenario, the event record compressed data is stored and transmitted in a binary coding form. In one implementation of step 130, the binary encoding of the event identification data is an encoding generated by replacing the last bit of the event identification offset encoding with the binary encoding of the first compression type identification or the second compression type identification, wherein the event identification offset encoding is an encoding generated by shifting the binary encoding of the event identification by 1 bit to the left.
For example, the binary encoding of the event identification data is denoted event Tag1, which is defined as follows:
Event Tag1 = (event ID << 1) | compression_type. (1)
The meaning of formula (1) is that the binary code of the event ID is shifted left by 1 bit, and the last bit after the left shift is replaced with the compression mode identifier (compression_type).
The value of compression_type is 0 (first compression mode identifier) or 1 (second compression mode identifier), corresponding to the first implementation and the second implementation of step 130 respectively.
Take event ID 913000007 as an example. The binary encoding of 913000007 is:
0011 0110 0110 1011 0100 0110 0100 0111.
After shifting one bit to the left:
0110 1100 1101 0110 1000 1100 1000 1110.
When step 130 adopts the second implementation, the value of compression_type is 1, and replacing the last bit 0 with 1 gives 0110 1100 1101 0110 1000 1100 1000 1111.
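The worked example above can be checked with a short Python sketch of formula (1). The function names event_tag1 and split_event_tag1 are hypothetical; the decoding helper is an addition for illustration and shows how a parsing end could recover both fields.

```python
def event_tag1(event_id, compression_type):
    """Formula (1): shift the event ID left by 1 bit and put the mode in the last bit."""
    if compression_type not in (0, 1):
        raise ValueError("compression_type must be 0 or 1")
    return (event_id << 1) | compression_type


def split_event_tag1(tag):
    """Inverse of event_tag1: recover (event_id, compression_type)."""
    return tag >> 1, tag & 1


tag = event_tag1(913000007, 1)
print(f"{tag:032b}")           # 01101100110101101000110010001111
print(split_event_tag1(tag))   # (913000007, 1)
```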
By the compression mode identification, the analysis end can confirm whether the parameters without parameter values in the data contained in the event record compression data are the second type parameter data or the third type parameter data. But before that, the parsing end needs to confirm the first type of parameter data by determining whether the parameters included in the event record compression data have parameter values.
To make the first type of parameter data easier to distinguish, in one implementation of step 130 the compression mode identifier is not used; instead, an identifier is defined for each type of parameter data. Three parameter class identifiers are defined, corresponding to the first, second and third types of parameter data respectively, and the parsing end determines which type a piece of parameter data belongs to from the parameter class identifier it contains. Specifically, in one implementation of step 130, the first type of parameter data further includes a first parameter class identifier, the second type of parameter data further includes a second parameter class identifier, and the third type of parameter data further includes a third parameter class identifier.
For example, in an application scenario, the parameter class identifier is written into the binary encoding of the parameter identifier to generate parameter Tag1.
The first parameter class identifier is defined as 00, the second parameter class identifier as 01, and the third parameter class identifier as 10.
The binary coding of the first type of parameter data comprises binary coding of parameter Tag1 and parameter values of the first type of parameter data;
The binary code of the second type parameter data comprises a parameter Tag1 of the second type parameter data;
The binary encoding of the third type of parameter data includes a parameter Tag1 of the third type of parameter data.
The parameter Tag1 of each type of parameter data is generated according to the parameter identifier corresponding to the type of parameter data.
The parameter Tag1 is defined as follows:
Parameter Tag1 = (parameter ID << 2) | parameter class identifier. (2)
The meaning of formula (2) is that the binary code of the parameter ID is shifted left by 2 bits, and the last 2 bits after the left shift are replaced with the parameter class identifier.
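The definition of parameter Tag1 can be sketched as follows, assuming the class identifiers 00, 01 and 10 for the first, second and third types of parameter data respectively. The names CLASS_ID, param_tag1 and split_param_tag1 are hypothetical, and the letters "A", "B", "C" stand for the three types as in the earlier examples.

```python
# 2-bit parameter class identifiers: first (A), second (B), third (C) type.
CLASS_ID = {"A": 0b00, "B": 0b01, "C": 0b10}
CLASS_NAME = {v: k for k, v in CLASS_ID.items()}


def param_tag1(param_id, param_class):
    """Shift the parameter ID left by 2 bits and write the class into the low 2 bits."""
    return (param_id << 2) | CLASS_ID[param_class]


def split_param_tag1(tag):
    """Recover (param_id, param_class) from a parameter Tag1."""
    return tag >> 2, CLASS_NAME[tag & 0b11]


tag = param_tag1(3, "C")       # parameter Data (ID 3) as third type data
print(f"{tag:08b}")            # 00001110
print(split_param_tag1(tag))   # (3, 'C')
```

With a per-parameter class identifier, the parsing end does not need a compression mode identifier in the event identification data.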
Further, in one implementation of step 130, the parameter class identification is introduced based on the use of the compression mode identification. Specifically, a first parameter type identifier and a second parameter type identifier are defined, in event record compression data, the first parameter type identifier is included in the first parameter type data, the second parameter type identifier is included in the second parameter type data, and the second parameter type identifier is included in the third parameter type data. Thus, the data analysis end can confirm the first type of parameter data based on the parameter type identifier, and the parameter data is the first type of parameter data when the parameter data comprises the first parameter type identifier. And further, confirming second type parameter data or third type parameter data according to the compression mode identification, wherein when the event identification data comprises the first compression mode identification, the parameter data comprising the second parameter type identification is the second type parameter data, and when the event identification data comprises the second compression mode identification, the parameter data comprising the second parameter type identification is the third type parameter data.
Furthermore, in the practical application scenario, the event record compressed data is stored and transmitted in binary coded form. In one implementation of step 130, the parameter class identifier is written into the binary encoding of the parameter identifier to generate a parameter identification code. Specifically:
The binary code of the first type of parameter data comprises a first type parameter identification code, which is generated by replacing the last bit of the first type parameter offset code with the binary code of the first parameter class identifier, wherein the first type parameter offset code is generated by shifting the binary code of the parameter identifier of the first type of parameter left by 1 bit;
The binary code of the second type of parameter data comprises a second type parameter identification code, which is generated by replacing the last bit of the second type parameter offset code with the binary code of the second parameter class identifier, wherein the second type parameter offset code is generated by shifting the binary code of the parameter identifier of the second type of parameter left by 1 bit;
The binary code of the third type of parameter data comprises a third type parameter identification code, which is generated by replacing the last bit of the third type parameter offset code with the binary code of the second parameter class identifier, wherein the third type parameter offset code is generated by shifting the binary code of the parameter identifier of the third type of parameter left by 1 bit.
For example, in an application scenario, a parameter identification code is defined as parameter Tag2, and a parameter class identification is written into a binary code of the parameter identification to generate parameter Tag2. The idea of parameter Tag2 is similar to that of event Tag1, in that the last bit after the parameter ID has been shifted one bit to the left is used for data discrimination. The parameter Tag2 is defined as follows:
Parameter Tag2 = (parameter ID << 1) | data_type. (3)
The meaning of formula (3) is that the binary code of the parameter ID is shifted left by 1 bit, and the last bit after the left shift is replaced with the parameter class identifier data_type. data_type is used to distinguish the types of parameter data; it is a one-bit binary number, 0 or 1, with the following meaning:
When the parameter data is the first type of parameter data, the corresponding data_type takes 0;
When the parameter data is the second type parameter data or the third type parameter data, the corresponding data_type takes 1.
The following relationship can be obtained in combination with compression type identifier (compression_type):
TABLE 1
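The relationship between compression_type and data_type follows from the definitions given in the text: data_type 0 always marks first type (class A) data, while data_type 1 marks second type (class B) data under the first compression mode and third type (class C) data under the second. The Python sketch below encodes that relationship as a lookup. It is reconstructed from the surrounding description, and the names PARAM_CLASS, param_tag2 and classify_tag2 are hypothetical.

```python
# (compression_type, data_type) -> class of the stored parameter data.
PARAM_CLASS = {
    (0, 0): "A",   # first implementation, identifier + value
    (0, 1): "B",   # first implementation, identifier only (class C is not stored)
    (1, 0): "A",   # second implementation, identifier + value
    (1, 1): "C",   # second implementation, identifier only (class B is not stored)
}


def param_tag2(param_id, data_type):
    """Shift the parameter ID left by 1 bit and write data_type into the last bit."""
    if data_type not in (0, 1):
        raise ValueError("data_type must be 0 or 1")
    return (param_id << 1) | data_type


def classify_tag2(compression_type, tag2):
    """Return (param_id, class) for a parameter Tag2 read by the parsing end."""
    param_id, data_type = tag2 >> 1, tag2 & 1
    return param_id, PARAM_CLASS[(compression_type, data_type)]


print(classify_tag2(1, param_tag2(3, 1)))   # (3, 'C')
print(classify_tag2(0, param_tag2(2, 1)))   # (2, 'B')
```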
Taking event record A as an example, the second implementation of step 130 is used to compress event record A:
For the class A data Age:23, the parameter ID of Age is 0, whose binary code is 0000 0000;
After shifting one bit to the left, the code is still 0000 0000;
Since Age is class A data, the value of data_type is 0; replacing the last bit 0 with 0 gives 0000 0000 (parameter Tag2 of Age);
For class B data, nothing is stored;
For the class C data Data, the parameter ID of Data is 3, whose binary code is 0000 0011;
After shifting one bit to the left, the code is 0000 0110;
Since Data is class C data, the value of data_type is 1; replacing the last bit 0 with 1 gives 0000 0111 (parameter Tag2 of Data).
Eventually all data is stored in binary form.
Class A data is stored completely in Tag-Value form. The parameter Tag2 of Age is 0000 0000, and the binary encoding of the parameter value 23 of Age is 0001 0111. The complete storage of Age is therefore:
0000 0000 0001 0111.
Thus, the final compression result of event record A is:
0110 1100 1101 0110 1000 1100 1000 1111 0000 0000 0001 0111 0000 0111.
wherein:
0110 1100 1101 0110 1000 1100 1000 1111 is event Tag1;
0000 0000 is the parameter Tag2 of Age;
0001 0111 is the actual value 23 of Age;
0000 0111 is the parameter Tag2 of Data.
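The complete binary result above can be reproduced with the following sketch of the second implementation. Several details are assumptions made only so that the example runs: event Tag1 is written as 4 big-endian bytes, each parameter Tag2 as 1 byte, and a class A value as a single unsigned byte (sufficient for Age = 23). The embodiment does not specify these widths, and the function name encode_second is hypothetical.

```python
# (parameter ID, name, preset value) taken from the event definition of event A.
PARAMS_A = [(0, "Age", 10), (1, "Money", 1.0), (2, "Name", "ABC"), (3, "Data", None)]
RECORD_A = {"Age": 23, "Money": 1.0, "Name": "ABC"}


def encode_second(event_id, params, record):
    """Binary compression with the second implementation (compression_type = 1)."""
    out = bytearray(((event_id << 1) | 1).to_bytes(4, "big"))   # event Tag1
    for param_id, name, preset in params:
        if name not in record:
            out.append((param_id << 1) | 1)    # class C: Tag2 only, data_type = 1
        elif record[name] != preset:
            out.append((param_id << 1) | 0)    # class A: Tag2 with data_type = 0
            out.append(record[name])           # assumption: value fits in one byte
        # class B: nothing is stored
    return bytes(out)


encoded = encode_second(913000007, PARAMS_A, RECORD_A)
print(" ".join(f"{b:08b}" for b in encoded))
# 01101100 11010110 10001100 10001111 00000000 00010111 00000111
```

The seven bytes correspond, in order, to event Tag1, the parameter Tag2 of Age, the value 23, and the parameter Tag2 of Data.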
It is to be understood that some or all of the steps or operations in the above embodiments are merely examples, and that embodiments of the present application may also perform other operations or variations of the various operations. Furthermore, the steps may be performed in an order different from that presented in the above embodiments, and not all of the operations in the above embodiments necessarily need to be performed.
Furthermore, based on the data compression method provided in the embodiment of the present application, the embodiment of the present application further provides a data compression device. Fig. 3 is a schematic diagram of an embodiment of a data compression device according to the present application. As shown in fig. 3, the data compression apparatus 300 includes:
An event record obtaining module 310, configured to obtain a first event record corresponding to a first event, where the first event record includes a parameter name and a record value of one or more parameters defined in the first event;
an event definition calling module 320, configured to call a first event definition corresponding to the first event, where the first event definition includes parameter names of all parameters defined in the first event and preset values;
The compression module 330 is configured to compare the first event definition with the first event record, and generate compressed data of the first event record according to the comparison result, where the compressed data of the first event record includes:
Event identification data, wherein the event identification data comprises an event identification of a first event;
The first type of parameter data comprises parameter names and recorded values of the first type of parameters, wherein the first type of parameters are parameters with different recorded values from preset values in parameters contained in the first event record;
and one type of data in the second type of parameter data or the third type of parameter data, wherein:
The second type parameter data comprises parameter names of second type parameters, wherein the second type parameters are parameters, the recorded value of which is the same as a preset value, in parameters contained in the first event record;
The third type of parameter data includes a parameter name of a third type of parameter, which is a parameter in which a recorded value of the parameter is not recorded in the first event record, among parameters included in the first event definition.
Further, based on the data analysis method provided in the embodiment of the present application, the embodiment of the present application also provides a data analysis device. Fig. 4 is a schematic structural diagram of an embodiment of the data analysis device according to the present application. As shown in fig. 4, the data analysis device 400 includes:
a data acquisition module 410, configured to acquire compressed data of a first event record, where the first event record corresponds to a first event, the compressed data being data generated by a data compression device according to an embodiment of the present application;
An event definition calling module 420, configured to call a first event definition corresponding to the first event, where the first event definition includes parameter names of all parameters defined in the first event and preset values;
the data parsing module 430, configured to parse the first event record compressed data according to the first event definition, includes:
When the first event record compressed data includes second type parameter data, for each parameter whose parameter identifier is recorded in the first event record compressed data but whose parameter value is not, taking the preset value in the first event definition as the parameter value;
or
when the first event record compressed data includes third type parameter data, for each parameter whose parameter identifier is not recorded in the first event record compressed data, taking the preset value in the first event definition as the parameter value.
Further, in the 1990s, an improvement to a technology could be clearly distinguished as an improvement in hardware (for example, an improvement to a circuit structure such as a diode, a transistor or a switch) or an improvement in software (an improvement to a method flow). However, with the development of technology, many improvements to method flows today can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. Therefore, it cannot be said that an improvement to a method flow cannot be implemented by a hardware entity module. For example, a programmable logic device (Programmable Logic Device, PLD), such as a field programmable gate array (Field Programmable Gate Array, FPGA), is an integrated circuit whose logic function is determined by the user's programming of the device. Designers program by themselves to "integrate" a digital device onto a single PLD, without asking a chip manufacturer to design and fabricate an application-specific integrated circuit chip. Moreover, instead of manually fabricating integrated circuit chips, such programming is nowadays mostly implemented with "logic compiler" software, which is similar to the software compiler used in program development; the source code to be compiled must also be written in a specific programming language, called a hardware description language (Hardware Description Language, HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM and RHDL (Ruby Hardware Description Language); VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are currently the most commonly used.
It will also be apparent to those skilled in the art that a hardware circuit implementing the logical method flow can be readily obtained by describing the method flow with a little logic programming in one of the hardware description languages above and programming it into an integrated circuit.
The controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer readable medium storing computer readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a programmable logic controller, or an embedded microcontroller. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20 and Silicon Labs C8051F320. A memory controller may also be implemented as part of the control logic of the memory. Those skilled in the art will also appreciate that, in addition to implementing the controller purely as computer readable program code, it is entirely possible to implement the same functionality by logically programming the method steps so that the controller takes the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller may thus be regarded as a hardware component, and the means included in it for performing various functions may also be regarded as structures within the hardware component. Or even, the means for performing the various functions may be regarded both as software modules implementing the method and as structures within the hardware component.
In the description of the embodiments of the present application, for convenience of description, the apparatus is described as being functionally divided into various modules/units, where the division of each module/unit is merely a division of logic functions, and the functions of each module/unit may be implemented in one or more pieces of software and/or hardware when the embodiments of the present application are implemented.
In particular, the apparatus according to the embodiments of the present application may, in actual implementation, be fully or partially integrated into one physical entity or be physically separated. The modules may all be implemented in the form of software invoked by a processing element, or all in the form of hardware; alternatively, some modules may be implemented in the form of software invoked by a processing element and others in the form of hardware. For example, the detection module may be a separately established processing element, or may be integrated in a chip of the electronic device. The implementation of the other modules is similar. In addition, all or part of the modules may be integrated together or implemented independently. In implementation, each step of the above method or each module above may be completed by an integrated logic circuit of hardware in a processor element or by instructions in the form of software.
For example, the modules above may be one or more integrated circuits configured to implement the methods above, such as one or more application specific integrated circuits (Application Specific Integrated Circuit, ASIC), one or more digital signal processors (Digital Signal Processor, DSP), or one or more field programmable gate arrays (Field Programmable Gate Array, FPGA). For another example, the modules may be integrated together and implemented in the form of a system-on-a-chip (System-On-a-Chip, SOC).
An embodiment of the application also proposes an electronic device comprising a memory for storing computer program instructions and a processor for executing the program instructions, wherein the computer program instructions, when executed by the processor, trigger the electronic device to perform the method steps according to the embodiments of the application.
In particular, in one embodiment of the present application, the one or more computer programs are stored in the memory, where the one or more computer programs include instructions that, when executed by the apparatus, cause the apparatus to perform the method steps described in the embodiments of the present application.
Specifically, in an embodiment of the present application, the processor of the electronic device may be a system on chip (SOC), and the processor may include a central processing unit (Central Processing Unit, CPU) and may further include other types of processors. Specifically, in an embodiment of the present application, the processor of the electronic device may be a PWM control chip.
Specifically, in an embodiment of the present application, the processor may include, for example, a CPU, a microcontroller, or a digital signal processor (DSP), and may further include a GPU, an embedded neural-network processing unit (Neural-network Processing Unit, NPU), and an image signal processor (Image Signal Processor, ISP). The processor may further include a necessary hardware accelerator or logic processing hardware circuit, such as an ASIC, or one or more integrated circuits for controlling the execution of the program of the technical solution of the present application. Further, the processor may be capable of running one or more software programs, which may be stored in a storage medium.
In particular, in an embodiment of the present application, the memory of the electronic device may be a read-only memory (read-only memory, ROM) or another type of static storage device capable of storing static information and instructions, a random access memory (random access memory, RAM) or another type of dynamic storage device capable of storing information and instructions, an electrically erasable programmable read-only memory (electrically erasable programmable read-only memory, EEPROM), a compact disc read-only memory (compact disc read-only memory, CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, and the like), a magnetic disk storage medium or other magnetic storage device, or any other computer readable medium that can carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer.
In particular, in an embodiment of the present application, the processor and the memory may be combined into one processing device or, more commonly, may be components separate from each other; the processor is configured to execute the program code stored in the memory to implement the method according to the embodiments of the present application. In particular, the memory may be integrated into the processor or be separate from the processor.
Further, the apparatus, device, module or unit illustrated in the embodiments of the present application may be implemented by a computer chip or entity, or by a product having a certain function.
It will be apparent to those skilled in the art that embodiments of the present application may be provided as a method, apparatus, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media having computer-usable program code embodied therein.
In several embodiments provided by the present application, any of the functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution, in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present application.
In particular, in one embodiment of the present application, there is further provided a computer readable storage medium having a computer program stored therein, which when run on a computer, causes the computer to perform the method provided by the embodiment of the present application.
An embodiment of the application also provides a computer program product comprising a computer program which, when run on a computer, causes the computer to perform the method provided by the embodiment of the application.
The description of embodiments of the present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (means) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In the embodiments of the present application, the term "at least one" refers to one or more, and the term "a plurality" refers to two or more. "And/or" describes an association between associated objects and indicates that three relationships may exist; for example, "A and/or B" may indicate that A exists alone, that A and B both exist, or that B exists alone, where A and B may each be singular or plural. The character "/" generally indicates an "or" relationship between the objects before and after it. "At least one of the following" and similar expressions mean any combination of the listed items, including any combination of single or plural items. For example, at least one of a, b, and c may represent: a; b; c; a and b; a and c; b and c; or a, b, and c, where a, b, and c may each be single or plural.
In embodiments of the present application, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.
The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
The embodiments of the present application are described in a progressive manner, and the same and similar parts of the embodiments are all referred to each other, and each embodiment is mainly described in the differences from the other embodiments. In particular, for the device embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and reference is made to the description of the method embodiments in part.
Those of ordinary skill in the art will appreciate that the various units and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or as a combination of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and the design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It will be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working procedures of the apparatuses and units described above may refer to the corresponding procedures in the foregoing method embodiments, which are not repeated herein.
The foregoing is merely exemplary embodiments of the present application, and any person skilled in the art may easily conceive of changes or substitutions within the technical scope of the present application, which should be covered by the present application. The protection scope of the present application shall be subject to the protection scope of the claims.

Claims (12)

1. A method of data compression, comprising:
acquiring a first event record corresponding to a first event, wherein the first event record comprises parameter identifiers and record values of one or more parameters defined in the first event;
invoking a first event definition corresponding to the first event, wherein the first event definition comprises parameter identifiers and preset values of all parameters defined in the first event;
comparing the first event definition with the first event record, and generating first event record compressed data according to a comparison result, wherein the first event record compressed data comprises:
event identification data, wherein the event identification data comprises an event identification of the first event;
first type parameter data, comprising parameter identifiers and record values of first type parameters, wherein a first type parameter is a parameter included in the first event record whose record value differs from the preset value;
second type parameter data or third type parameter data, wherein:
the second type parameter data comprises a parameter identifier of a second type parameter, wherein a second type parameter is a parameter included in the first event record whose record value is the same as the preset value;
the third type parameter data comprises a parameter identifier of a third type parameter, wherein a third type parameter is a parameter included in the first event definition whose record value is not recorded in the first event record.
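The comparison step of claim 1 can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that an event definition and an event record are both dictionaries keyed by integer parameter identifiers; the function name `classify_parameters` and the sample values are hypothetical and do not come from the application.

```python
from typing import Any, Dict, List, Tuple


def classify_parameters(
    definition: Dict[int, Any], record: Dict[int, Any]
) -> Tuple[Dict[int, Any], List[int], List[int]]:
    """Split an event's parameters into the three types named in claim 1."""
    # First type: recorded, and the record value differs from the preset value.
    first = {pid: val for pid, val in record.items() if val != definition[pid]}
    # Second type: recorded, and the record value equals the preset value.
    second = [pid for pid, val in record.items() if val == definition[pid]]
    # Third type: defined for the event but absent from the record.
    third = [pid for pid in definition if pid not in record]
    return first, second, third


# Hypothetical event definition (parameter identifier -> preset value).
definition = {1: 0, 2: "idle", 3: 100, 4: False}
# Hypothetical event record (parameter identifier -> record value).
record = {1: 7, 2: "idle", 3: 100}

first, second, third = classify_parameters(definition, record)
print(first, second, third)
```

Only the first type parameters carry values into the compressed data; the second and third types are represented by identifiers alone.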
2. The method of claim 1, wherein the first event record compressed data comprises:
whichever of the second type parameter data and the third type parameter data has the smaller data amount.
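The selection in claim 2 might be sketched as follows. The sketch assumes that the data amount can be approximated by the number of parameter identifiers and that a tie is resolved in favor of the second type; both are illustrative assumptions, as are the name `compress_event` and the dictionary layout of the result.

```python
from typing import Any, Dict


def compress_event(
    event_id: int, definition: Dict[int, Any], record: Dict[int, Any]
) -> Dict[str, Any]:
    """Keep first type data plus the smaller of the second and third type data."""
    first = {pid: val for pid, val in record.items() if val != definition[pid]}
    second = [pid for pid, val in record.items() if val == definition[pid]]
    third = [pid for pid in definition if pid not in record]
    # Assumption: data amount is approximated by the identifier count.
    if len(second) <= len(third):
        mode, ids = "second", second
    else:
        mode, ids = "third", third
    return {"event_id": event_id, "first": first, "mode": mode, "ids": ids}


# Hypothetical event with six parameters, all preset to 0.
definition = {pid: 0 for pid in range(1, 7)}
# Five parameters are recorded; only parameter 1 differs from its preset value.
record = {1: 7, 2: 0, 3: 0, 4: 0, 5: 0}

compressed = compress_event(9, definition, record)
print(compressed)
```

Here four identifiers would be needed for the second type data but only one for the third type data, so the sketch keeps the third type data.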
3. The method according to claim 1 or 2, wherein the first class of parameter data further comprises a first parameter class identifier, the second class of parameter data further comprises a second parameter class identifier, and the third class of parameter data further comprises a third parameter class identifier.
4. The method of claim 1 or 2, wherein the event identification data further comprises a first compression mode identification or a second compression mode identification, wherein:
when the first event record compressed data comprises the second type parameter data, the event identification data comprises the first compression mode identification;
when the first event record compressed data comprises the third type parameter data, the event identification data comprises the second compression mode identification.
5. The method of claim 4, wherein the binary encoding of the event identification data is an encoding generated by replacing the last bit of an event identification offset encoding with the binary encoding of the first compression mode identification or the second compression mode identification, wherein the event identification offset encoding is an encoding generated by shifting the binary encoding of the event identification left by 1 bit.
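The encoding of claim 5 can be illustrated with a short Python sketch. The bit values chosen for the two compression mode identifications (0 and 1) and the function names are assumptions made for illustration; the claim only requires that the identification occupy the last bit.

```python
from typing import Tuple

MODE_SECOND = 0  # assumed bit value for the first compression mode identification
MODE_THIRD = 1   # assumed bit value for the second compression mode identification


def encode_event_id(event_id: int, mode_bit: int) -> int:
    """Shift the event identification left by 1 bit and put the mode bit last."""
    if mode_bit not in (0, 1):
        raise ValueError("mode_bit must be 0 or 1")
    # After the shift the last bit is 0, so OR-ing is the same as replacing it.
    return (event_id << 1) | mode_bit


def decode_event_id(code: int) -> Tuple[int, int]:
    """Recover (event identification, mode bit) from the combined encoding."""
    return code >> 1, code & 1


code = encode_event_id(0b101, MODE_THIRD)
print(bin(code), decode_event_id(code))
```

Packing the mode into the last bit means that no separate field is needed to tell the parser which of the two compression modes was used.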
6. The method of claim 4 or 5, wherein the first type of parameter data further comprises a first parameter class identifier, the second type of parameter data further comprises a second parameter class identifier, and the third type of parameter data further comprises a second parameter class identifier.
7. The method according to claim 6, wherein:
the binary code of the first type parameter data comprises a first type parameter identification code, wherein the first type parameter identification code is generated by replacing the last bit of a first type parameter offset code with the binary code of the first parameter class identifier, and the first type parameter offset code is generated by shifting the binary code of the parameter identifier of the first type parameter left by 1 bit;
the binary code of the second type parameter data comprises a second type parameter identification code, wherein the second type parameter identification code is generated by replacing the last bit of a second type parameter offset code with the binary code of the second parameter class identifier, and the second type parameter offset code is generated by shifting the binary code of the parameter identifier of the second type parameter left by 1 bit;
the binary code of the third type parameter data comprises a third type parameter identification code, wherein the third type parameter identification code is generated by replacing the last bit of a third type parameter offset code with the binary code of the second parameter class identifier, and the third type parameter offset code is generated by shifting the binary code of the parameter identifier of the third type parameter left by 1 bit.
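One way to picture how the class bit of claim 7 lets a reader of the compressed data tell whether a value follows an identifier is the sketch below. It assumes integer record values and represents the compressed parameters as a flat list of integers; the bit values, the list layout, and the name `encode_parameters` are illustrative assumptions rather than the encoding defined by the application.

```python
from typing import Dict, List

CLASS_FIRST = 0  # assumed bit: a record value follows the identifier
CLASS_OTHER = 1  # assumed bit: identifier only (second or third type parameter)


def encode_parameters(first: Dict[int, int], others: List[int]) -> List[int]:
    """Emit shifted identifiers with a class bit; only first type entries carry a value."""
    tokens: List[int] = []
    for pid, value in first.items():
        tokens.append((pid << 1) | CLASS_FIRST)  # identifier with class bit 0
        tokens.append(value)                     # record value follows
    for pid in others:
        tokens.append((pid << 1) | CLASS_OTHER)  # identifier with class bit 1, no value
    return tokens


# Parameter 1 has record value 7; parameter 6 is listed by identifier only.
tokens = encode_parameters({1: 7}, [6])
print(tokens)
```

Because the second type and third type parameter data never appear together and the compression mode is already carried in the event identification data, a single class bit suffices in this sketch.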
8. A data parsing method, comprising:
acquiring first event record compressed data, wherein the first event record compressed data is generated by compressing a first event record according to the method of any one of claims 1-7, and the first event record corresponds to a first event;
invoking a first event definition corresponding to the first event, wherein the first event definition comprises parameter identifiers and preset values of all parameters defined in the first event;
parsing the first event record compressed data according to the first event definition, including:
when the first event record compressed data comprises second type parameter data, for each parameter whose parameter identifier is recorded in the first event record compressed data but whose parameter value is not, taking the preset value in the first event definition as the parameter value;
or,
when the first event record compressed data comprises third type parameter data, for each parameter whose parameter identifier is not recorded in the first event record compressed data, taking the preset value in the first event definition as the parameter value.
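The two parsing branches of claim 8 can be sketched in Python as follows. The sketch assumes the compressed data has already been decoded into a dictionary of first type parameters, a mode string, and a list of identifiers; this intermediate form and the name `parse_event` are hypothetical.

```python
from typing import Any, Dict, List


def parse_event(
    definition: Dict[int, Any], first: Dict[int, Any], mode: str, ids: List[int]
) -> Dict[int, Any]:
    """Rebuild an event record from first type data plus second or third type data."""
    values = dict(first)
    if mode == "second":
        # Identifiers recorded without a value take the preset value.
        for pid in ids:
            values[pid] = definition[pid]
    elif mode == "third":
        # Listed identifiers were absent from the original record; every other
        # parameter not recorded in the compressed data takes the preset value.
        for pid in definition:
            if pid not in values and pid not in ids:
                values[pid] = definition[pid]
    else:
        raise ValueError("unknown mode: " + mode)
    return values


definition = {pid: 0 for pid in range(1, 7)}
restored = parse_event(definition, {1: 7}, "third", [6])
print(restored)
```

In both branches the parameters that were missing from the original record stay missing after parsing, so the restored record matches the one that was compressed.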
9. A data compression apparatus, comprising:
an event record obtaining module, configured to obtain a first event record corresponding to a first event, wherein the first event record comprises parameter identifiers and record values of one or more parameters defined in the first event;
an event definition calling module, configured to invoke a first event definition corresponding to the first event, wherein the first event definition comprises parameter identifiers and preset values of all parameters defined in the first event;
a compression module, configured to compare the first event definition with the first event record and generate first event record compressed data according to a comparison result, wherein the first event record compressed data comprises:
event identification data, wherein the event identification data comprises an event identification of the first event;
first type parameter data, comprising parameter identifiers and record values of first type parameters, wherein a first type parameter is a parameter included in the first event record whose record value differs from the preset value;
second type parameter data or third type parameter data, wherein:
the second type parameter data comprises a parameter identifier of a second type parameter, wherein a second type parameter is a parameter included in the first event record whose record value is the same as the preset value;
the third type parameter data comprises a parameter identifier of a third type parameter, wherein a third type parameter is a parameter included in the first event definition whose record value is not recorded in the first event record.
10. A data parsing apparatus, comprising:
a data acquisition module, configured to acquire first event record compressed data, wherein the first event record compressed data is generated by compressing a first event record according to the method of any one of claims 1-7, and the first event record corresponds to a first event;
an event definition calling module, configured to invoke a first event definition corresponding to the first event, wherein the first event definition comprises parameter identifiers and preset values of all parameters defined in the first event;
a data parsing module, configured to parse the first event record compressed data according to the first event definition, including:
when the first event record compressed data comprises second type parameter data, for each parameter whose parameter identifier is recorded in the first event record compressed data but whose parameter value is not, taking the preset value in the first event definition as the parameter value;
or,
when the first event record compressed data comprises third type parameter data, for each parameter whose parameter identifier is not recorded in the first event record compressed data, taking the preset value in the first event definition as the parameter value.
11. An electronic device comprising a memory for storing computer program instructions and a processor for executing the program instructions, wherein the computer program instructions, when executed by the processor, trigger the electronic device to perform the method steps of any one of claims 1-8.
12. A computer readable storage medium, characterized in that the computer readable storage medium has stored therein a computer program which, when run on a computer, causes the computer to perform the method according to any of claims 1-8.
CN202010296259.0A 2020-04-15 2020-04-15 Data compression method, device and electronic equipment Active CN113541694B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010296259.0A CN113541694B (en) 2020-04-15 2020-04-15 Data compression method, device and electronic equipment

Publications (2)

Publication Number Publication Date
CN113541694A CN113541694A (en) 2021-10-22
CN113541694B true CN113541694B (en) 2025-02-21

Family

ID=78120096

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010296259.0A Active CN113541694B (en) 2020-04-15 2020-04-15 Data compression method, device and electronic equipment

Country Status (1)

Country Link
CN (1) CN113541694B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105846825A (en) * 2015-01-30 2016-08-10 富士通株式会社 Compression method, decompression method, compression device and decompresssion device
CN109787638A (en) * 2019-01-10 2019-05-21 杭州幻方科技有限公司 A kind of compression storing data processing unit and method

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2001282858A (en) * 2000-03-31 2001-10-12 Mitsubishi Electric Corp Data collection device using log data compression method.
US7062681B2 (en) * 2002-12-03 2006-06-13 Microsoft Corporation Method and system for generically reporting events occurring within a computer system
CN110362547B (en) * 2018-04-02 2023-10-03 杭州阿里巴巴智融数字技术有限公司 Method and device for encoding, analyzing and storing log file
CN110958212B (en) * 2018-09-27 2022-04-12 阿里巴巴集团控股有限公司 Data compression method, data decompression method, device and equipment
CN110263224A (en) * 2019-05-07 2019-09-20 南京智慧图谱信息技术有限公司 A kind of event mode link data compression method based on ELP model

Also Published As

Publication number Publication date
CN113541694A (en) 2021-10-22

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant