Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 is a flowchart of a digital character-oriented data encoding method according to an embodiment of the present invention, and as shown in fig. 1, the digital character-oriented data encoding method according to an embodiment of the present invention includes:
101. adding control characters to be arranged between printable characters in data to be stored or to be sent, wherein the control characters are used for controlling a data structure and data analysis;
102. according to each character with 4 bits, carrying out binary coding on all characters of the data added with the control character;
the data to be stored or to be sent is composed of printable characters, the printable characters comprise digital characters and special characters, and the total number of the printable characters, the special characters and the control characters is not more than 16; and when the binary coding is carried out, the coding is carried out according to a preset coding rule.
Before step 101, the data to be stored or to be sent is obtained.
The data to be stored or to be transmitted is, for example, the signal data acquired by sensor devices such as MEMS sensors and RFID tags, which needs to be stored in a storage medium or transmitted to other terminals or servers. For example, a temperature sensor uniquely identified by the OID number 1.2.156.101818.30.56.123456 collects a temperature reading of 25.6 ℃.
In step 101, control characters are inserted between the printable characters in the data to be stored or to be transmitted, so that the data can be encoded and then stored in a storage medium or transmitted.
Printable characters may include numeric characters and special characters. The numeric characters are the characters that carry the data information in the data to be stored, such as the digits 0-9. Special characters are other characters used to separate numeric characters, such as "." and "/"; the former may represent a decimal point, and the latter may separate the year, month and day of a date. The control characters mark off the printable characters and the key-value pairs stored in the data, so that when the data is parsed, the required printable characters can be accurately delimited according to the control characters.
Device objects such as sensors and RFID tags store only digital data. Since there are only 10 numeric character symbols, "0" to "9", they are represented with 4-bit nibbles: 4 bits can encode 2^4 = 16 characters. With nibbles of 4-bit length, the 10 numeric characters "0"-"9" can be defined, together with a small number (no more than 6) of special symbols or control symbols, such as A (or "."), B (or "/"), C, D, E and F. The following embodiments are all described by taking the 4-bit encoding method as an example.
In this embodiment, each printable character and each control character is encoded as one 4-bit binary number. The digits 0-9 together with up to 6 special and control characters (represented by A, B, C, D, E and F) total no more than 16, so a four-bit binary code can represent all of them.
The digital character-oriented data encoding method provided by the embodiment of the present invention encodes each printable character and each control character of the data to be stored or to be sent as one 4-bit binary number. Four bits per character are enough to cover the total number of printable characters and control characters, and no shorter fixed-length code can do so, so the data to be stored occupies the minimum storage space in the storage medium, storage efficiency is improved, and the cost can be greatly reduced for objects with limited storage. For data to be sent, the minimum bandwidth is occupied during transmission, and transmission efficiency is improved.
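The storage saving can be illustrated with a minimal sketch. It assumes the BASE16 assignment used in this description (with "." written as A) and a hypothetical helper `pack_nibbles`. Padding an odd-length string with the terminator F is an assumption of this sketch, not something the embodiment specifies.

```python
# Hypothetical helper: pack a string of 4-bit symbols into bytes, two per byte.
SYMBOLS = "0123456789ABCDEF"  # the symbol at index i is encoded as the 4-bit value i


def pack_nibbles(text: str) -> bytes:
    """Pack each symbol into 4 bits; odd lengths are padded with F (an assumption)."""
    if len(text) % 2:
        text += "F"
    out = bytearray()
    for high, low in zip(text[0::2], text[1::2]):
        out.append((SYMBOLS.index(high) << 4) | SYMBOLS.index(low))
    return bytes(out)


message = "023C0425A6"  # the example string "023C0425.6" with "." written as A
packed = pack_nibbles(message)
ascii_size = len(message.encode("ascii"))
print(len(packed), ascii_size)  # 5 bytes packed versus 10 bytes as ASCII
```

Under these assumptions the ten-character message occupies five bytes, half of what one byte per character would need.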
Based on the content of the above-mentioned embodiment, as an alternative embodiment, the special character includes at least one of a first special character and a second special character; the first special character is used for identifying a decimal point of a floating-point number; the second special character is used to identify a separator in the code.
Printable characters include numeric characters, which are the digits 0 to 9 above, and special characters, which may be ".", "/" and so on (encoded as the characters "A" and "B", respectively). Here, "." can be used as the decimal point of a floating-point number and also as a separator in OID and Handle codes; "/" can be used as a separator in the unique identification code of a Handle. The printable characters constitute the set V, V = { "0", "1", "2", "3", "4", "5", "6", "7", "8", "9", ".", "/" }. The printable characters are mainly used for the string IDs that represent the unique identifiers of various metadata, and for the assigned string values of the metadata. It should be noted that the functions of the first special character and the second special character can also be realized by the same character. For example, "." is used to identify both the decimal point of a floating-point number and the separator in the code.
Based on the content of the above-mentioned embodiments, as an alternative embodiment, the control characters include at least 2 of the following control characters: a first control character, a second control character, a third control character, and a fourth control character, wherein: the first control character is used to separate two adjacent elementary data units; the second control character is used for identifying the beginning of a data block, and the third control character is used for identifying the end of the data block; the fourth control character identifies the end of the entire data structure.
There are mainly four control characters, of which at least two can be selected for use. For convenience of description, the first, second, third and fourth control characters are represented by "C", "D", "E" and "F", respectively. The control characters constitute the set W, W = { "C", "D", "E", "F" }. The control characters are mainly used to control data blocks: "C" may be defined as the separator between ID-Value pairs; "D" and "E" appear in pairs, representing the start identifier and end identifier of a data segment, respectively, and act like the braces "{" and "}" in JSON; "F" serves as the terminator of the entire data structure. In the embodiments of the present invention, the key refers to the identifier of the stored value, and the value refers to the stored data.
Of course, according to the requirement, some characters selected from the control character set W can be defined as special characters to be added into the printable character set V, so that the expression capability of the numeric character string is stronger. However, this results in a reduction in the number of control characters, which may make the expression capability of the data block weak or the expression efficiency low, and may be set according to specific needs.
According to the data coding method for the digital characters, which is provided by the embodiment of the invention, the data coding method can be flexibly customized through the four control characters, and the method can be aligned with the data representation capability of semi-structured XML or JSON to a certain extent.
Based on the content of the foregoing embodiment, as an optional embodiment, the elementary data unit is a key-value pair, where: the key is a metadata identifier and is composed of 4-bit printable characters; the value is the assigned value corresponding to the metadata and is represented by 4-bit printable characters.
Metadata is data describing data (data about data). It is mainly information describing the properties of data, and is used to support functions such as indicating storage location, history data, resource search, and file records.
The printable character set V and the control character set W are each defined with 4-bit codes. Characters are then taken from the printable character set V to construct the set ID-Set of metadata identifiers (IDs) and the assigned Value strings of the metadata, and an ID string together with its assigned value forms a key-value pair ID-Value. Different ID-Value pairs are delimited and combined by using the control characters in the control character set W, so that a flexible data structure is constructed.
Based on the content of the above embodiments, as an alternative embodiment, the preset encoding rule, i.e. the character encoding scheme, may adopt the BASE16 encoding method of RFC 4648, as shown in Table 1 below.
TABLE 1
As required, a customized 4-bit encoding may also be adopted, that is, the correspondence between the representation symbols and the binary values may be adjusted freely; besides the symbols "0" to "9", 6 special symbols can be customized according to the application requirement to replace "A" to "F".
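The correspondence between symbols and 4-bit patterns can be sketched as a lookup table. This is a minimal sketch: the function name `make_code_table` and its `custom` parameter are hypothetical, and the default assignment is the BASE16 alphabet, in which the symbol at position i of "0123456789ABCDEF" has the value i.

```python
BASE16 = "0123456789ABCDEF"  # default assignment: value i <-> BASE16[i]


def make_code_table(custom=None):
    """Map each application symbol to its 4-bit pattern.

    `custom` (hypothetical parameter) renames letter codes,
    for example {"A": ".", "B": "/"}.
    """
    custom = custom or {}
    table = {}
    for value, letter in enumerate(BASE16):
        symbol = custom.get(letter, letter)
        table[symbol] = format(value, "04b")
    if len(table) != 16:
        raise ValueError("custom symbols must be distinct from all other symbols")
    return table


table = make_code_table({"A": ".", "B": "/"})
print(table["."], table["/"], table["C"], table["9"])  # 1010 1011 1100 1001
```

Renaming only changes which symbol is shown for a code; the 16 binary patterns themselves stay fixed.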
To make the data structure representation based on digital characters flexible, the method mainly uses key-value pairs (namely ID-Value pairs) as the basic building units and uses control characters to control the various customized data structures. The key ID is the unique identifier string of a metadata item; it needs to be predefined and to form a uniform industry specification. The assignment corresponding to the metadata, namely the Value of the key ID, can be constructed flexibly as a numeric string according to actual conditions. It should be noted that all the characters in the key ID string and the Value string come from the printable character set V, and the control characters used to control the structure come from the control character set W.
Based on the above description of the embodiments, as an alternative embodiment, the data block is composed of a plurality of key value pairs, control characters and small data blocks, and all the characters in the data block are composed of 4-bit printable characters and control characters. For example, there may be multiple samples of data for a sensor, which are stored or transmitted as a block of data. A plurality of sampled values, each for a small data block, may be formed into a data block in the form of key-value pairs in combination with control characters.
The following description will be given by way of specific examples in conjunction with the above-described embodiments.
(1) Definition of metadata identifier Set ID-Set
The metadata identifier set needs to be predefined; it mainly defines the real-world meaning of each metadata item and the corresponding metadata identifier. Since these metadata identifiers appear in the data structure, the identifier strings should be as short as possible, typically 1-3 characters, and the lengths of the metadata identifiers in the set may differ.
For example, for sensor data collection, a Set of metadata identifiers ID-Set may be defined, as shown in Table 2 below.
TABLE 2
Of course, many similar metadata identifiers IDs may also be defined to express different data description requirements of the sensors.
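Separating a key-value string into its ID and its value by means of the ID-Set can be sketched as follows. This is a minimal sketch: the dictionary holds only the two identifiers that appear in the worked example of this description (02 for sensor type, 04 for sensor sample value), and matching the longest known prefix is an assumption of the sketch rather than a rule stated in the embodiment.

```python
# Hypothetical ID-Set holding only the identifiers used in the worked example.
ID_SET = {
    "02": "sensor type",
    "04": "sensor sample value",
}


def split_pair(pair: str):
    """Split an ID-Value string by matching the longest known ID prefix (1-3 characters)."""
    for length in (3, 2, 1):
        key = pair[:length]
        if key in ID_SET and len(pair) > length:
            return key, pair[length:]
    raise ValueError(f"no known metadata identifier at the start of {pair!r}")


print(split_pair("023"))     # ('02', '3')
print(split_pair("0425.6"))  # ('04', '25.6')
```

Because the ID and the value are concatenated without a separator, identifiers of different lengths are easiest to tell apart when no identifier is a prefix of another.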
(2) Binary representation of metadata identifiers and assignment strings
In order to distinguish the different types of characters (strings), all metadata identifier ID (key) strings and their binary symbol strings are underlined; the control characters C, D, E and F and their binary symbol strings are shown in bold; the binary symbol strings of the special characters are marked with strikethrough; the assigned strings of the metadata and their binary symbol strings are shown in normal font. A space is placed between the binary strings of two adjacent characters for ease of reading; it is not present in the actual data representation.
According to the 4-bit BASE16 character encoding table above, the binary representation of the metadata identifier ID strings and their assigned Value strings is shown in Table 3 below.
TABLE 3
(3) Control character set W definition
In the control character set, the element "C" is used for a separator between ID-Value pairs; the elements "D" and "E" appear in pairs, representing the start identifier and end identifier of the data segment, respectively, which act as braces "{" and "}" in JSON; the element "F" serves as the terminator for the entire data structure. Four control characters are shown in bold and their definitions are shown in table 4 below.
TABLE 4
(4) Representation and partitioning of ID-Value pairs
The key-value pair ID-Value is the basic unit of the data structure. In the present encoded data format, the ID numeric string and the Value numeric string are concatenated directly, and two adjacent key-value pairs are separated by the control character C (1100).
For example, consider describing a temperature sensor whose sampled value is 25.6 degrees.
First, the key-value pair for the sensor type is defined as 023, where 02 is the metadata ID for the sensor type and 3 indicates that the sensor is a temperature sensor. Next, the key-value pair for the sampled value is defined as 0425.6, where 04 is the metadata ID for the sensor sample value and 25.6 indicates that the sampled temperature is 25.6 degrees. Finally, the two key-value pairs are separated by the control character C (1100).
Thus, the whole piece of descriptive data is 023C0425.6, and the corresponding binary number is 0000 0010 0011 1100 0000 0100 0010 0101 1010 0110.
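The worked example can be reproduced with a minimal sketch. It assumes the BASE16 assignment with "." written as A and "/" as B; the helper name `encode` is hypothetical, and the spaces in its output are for readability only.

```python
CODES = "0123456789ABCDEF"        # the symbol at index i has the 4-bit value i
ALIASES = {".": "A", "/": "B"}    # special characters and their BASE16 letters


def encode(text: str, sep: str = " ") -> str:
    """Return the 4-bit binary pattern of every character, joined by `sep`."""
    nibbles = []
    for ch in text:
        letter = ALIASES.get(ch, ch)
        # CODES.index raises ValueError for characters outside the 16 symbols.
        nibbles.append(format(CODES.index(letter), "04b"))
    return sep.join(nibbles)


pairs = ["023", "0425.6"]      # sensor type pair and sample value pair
message = "C".join(pairs)      # the control character C separates the pairs
print(message)                 # 023C0425.6
print(encode(message))         # 0000 0010 0011 1100 0000 0100 0010 0101 1010 0110
```

Calling `encode(message, sep="")` gives the same bits without the reading spaces.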
(5) Representation of the entire data structure
The expression of the entire data structure starts with control character C (1100) and ends with F (1111). The middle of the data structure block is composed of a plurality of data blocks, key value pairs and control characters. Where the data block may in turn be composed of several small data blocks and several key-value pairs, control characters, the beginning and end of each data block being identified by control characters D (1101) and E (1110), respectively. The key-value pairs are separated by C (1100), and the data blocks and the key-value pairs are separated by D (1101) and E (1110).
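The framing rules of this paragraph can be sketched as a small builder. This is a minimal sketch under stated assumptions: the function names are hypothetical, a nested Python list stands for a data block, and reusing the identifier 04 for every sampled value is an illustrative choice, not the assignment of any table in this description.

```python
def join_items(items) -> str:
    """Join key-value strings with C and wrap nested lists in D ... E."""
    out = ""
    previous_was_pair = False
    for item in items:
        if isinstance(item, str):
            if previous_was_pair:
                out += "C"          # C separates two adjacent key-value pairs
            out += item
            previous_was_pair = True
        else:
            out += "D" + join_items(item) + "E"   # D and E delimit a data block
            previous_was_pair = False
    return out


def build_structure(items) -> str:
    """Frame the whole structure: it starts with C and ends with F."""
    return "C" + join_items(items) + "F"


# Hypothetical content: a sensor type pair, then a block of three sampled values.
structure = build_structure(["023", ["0425.6", "0428", "0431.5"]])
print(structure)  # C023D0425.6C0428C0431.5EF
```

No extra C is written next to D or E, since the block delimiters already separate a block from its neighbouring key-value pairs.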
For example, the temperature sensor with the unique identification OID number 1.2.156.101818.30.56.123456 begins sampling at 18:25 on October 1 of year 19, with a sampling interval of 30 minutes, and three consecutive sampled values are 25.6, 28 and 31.5.
The key-value pairs formed from the data in this example are shown in Table 5 below.
TABLE 5
In the above data, the key value pairs of three consecutive sample values may be combined into one data block, and the three key value pairs are packed by control characters D and E.
The key value pairs and the data blocks are added with control characters (shown in bold) as follows:
the corresponding binary representation is as follows:
Fig. 2 is a flowchart of a digital character-oriented data parsing method according to an embodiment of the present invention. As shown in Fig. 2, the digital character-oriented data parsing method according to an embodiment of the present invention includes:
201. decoding the received or read data according to a 4-bit character and a decoding rule corresponding to the coding rule used in coding to respectively obtain printable characters and control characters;
202. analyzing the meaning represented by the control character to obtain data consisting of printable characters;
wherein the printable characters include numeric characters and special characters, and a total number of the printable characters, the special characters, and the control characters is not greater than 16.
The data analysis method is the reverse process of the storage method, and the specific flow can be referred to the embodiment of the storage method.
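The decoding step can be sketched as the inverse of nibble packing. This is a minimal sketch, assuming the data arrives as bytes holding two 4-bit characters each and that A and B stand for "." and "/"; the helper name `decode` is hypothetical.

```python
CODES = "0123456789ABCDEF"         # the 4-bit value i decodes to CODES[i]
SPECIAL = {"A": ".", "B": "/"}     # BASE16 letters mapped back to special characters


def decode(data: bytes) -> str:
    """Split every byte into two nibbles and map each nibble back to its character."""
    chars = []
    for byte in data:
        for value in (byte >> 4, byte & 0x0F):
            letter = CODES[value]
            chars.append(SPECIAL.get(letter, letter))
    return "".join(chars)


print(decode(bytes.fromhex("023C0425A6")))  # 023C0425.6
```

The control characters C, D, E and F are left in the output string so that a later step can interpret the structure.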
The parsing method is the inverse of the preceding process of constructing the data structure, and a specific implementation may be as follows. The data in the data structure is read one character at a time, 4 bits per character, that is, aligned on nibble boundaries. The parsing process is as follows:
(1) Read the first control character C, continue searching for the next control character, and take out the key-value pair between the two control characters.
(2) Parse the key-value pair: for example, restore the numeric string according to the BASE16 encoding method, and then separate the metadata identifier ID from its assigned string according to the metadata identifier set ID-Set.
(3) If the next control character is F, take out the key-value pair (if any) between the two control characters, go to step (2) to parse it, and exit when parsing is finished;
if the next control character is C, take out the key-value pair (if any) between the two control characters and go to step (2) to parse it; then continue searching for the next control character and repeat step (3);
if the next control character is D, push D onto the stack, take out the key-value pair (if any) between the two control characters, and go to step (2) to parse it; then continue searching for the next control character and repeat step (3);
if the next control character is E, pop D from the stack, take out the key-value pair (if any) between the two control characters, and go to step (2) to parse it; then continue searching for the next control character and repeat step (3).
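The control-character scan of steps (1) and (3) can be sketched as a single pass with a stack. This is a minimal sketch: it works on the already decoded character string, the function name `parse` is hypothetical, nested blocks are returned as nested Python lists, and splitting each key-value string into ID and value (step (2)) is left out. The input string is a hypothetical structure, not one taken from the tables of this description.

```python
CONTROL = set("CDEF")


def parse(text: str) -> list:
    """Return key-value strings and nested blocks found between control characters."""
    if not text or text[0] != "C":
        raise ValueError("the structure must start with control character C")
    root = []
    stack = [root]            # open blocks: D pushes a block, E pops it
    token = ""
    for ch in text[1:]:
        if ch not in CONTROL:
            token += ch       # still inside a key-value pair
            continue
        if token:             # a control character closes the pending pair
            stack[-1].append(token)
            token = ""
        if ch == "D":
            block = []
            stack[-1].append(block)
            stack.append(block)
        elif ch == "E":
            if len(stack) == 1:
                raise ValueError("E without a matching D")
            stack.pop()
        elif ch == "F":
            if len(stack) != 1:
                raise ValueError("data block not closed before F")
            return root
    raise ValueError("missing terminator F")


print(parse("C023D0425.6C0428C0431.5EF"))
# ['023', ['0425.6', '0428', '0431.5']]
```

The stack depth doubles as a consistency check: an E with no open block, or an F while a block is still open, is reported as an error.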
Fig. 3 is a structural diagram of a digital character-oriented data encoding system according to an embodiment of the present invention, and as shown in fig. 3, the digital character-oriented data encoding system includes: a processing module 301 and an encoding module 302. The processing module 301 is configured to add a control character to data to be stored or to be sent, where the control character is arranged between printable characters, and is used to control a data structure and data analysis; the encoding module 302 is configured to perform binary encoding on all characters of the data to which the control character is added according to each 4-bit character; the data to be stored or to be sent is composed of printable characters, the printable characters comprise digital characters and special characters, and the total number of the printable characters, the special characters and the control characters is not more than 16; and when the binary coding is carried out, the coding is carried out according to a preset coding rule.
Fig. 4 is a structural diagram of a digital character-oriented data parsing system according to an embodiment of the present invention. As shown in Fig. 4, the digital character-oriented data parsing system includes a parsing module 401 and a processing module 402. The parsing module 401 is configured to decode the received or read data, one character per 4 bits, according to a decoding rule corresponding to the encoding rule used in encoding, to obtain printable characters and control characters, respectively; the processing module 402 is configured to analyze the meaning represented by the control characters to obtain data composed of printable characters; wherein the printable characters include numeric characters and special characters, and a total number of the printable characters, the special characters, and the control characters is not greater than 16.
The system embodiment provided in the embodiments of the present invention is for implementing the above method embodiments, and for details of the process and the details, reference is made to the above method embodiments, which are not described herein again.
The digital character-oriented data coding system or the analysis system provided by the embodiment of the invention respectively codes or decodes the printable characters and the control characters according to the binary number of one character with 4 bits, and one character with 4 bits can not only meet the total number of the printable characters and the control characters, but also code and correspondingly decode all the characters with the minimum number of bits, so that the data to be stored can occupy the minimum storage space in a storage medium, the storage efficiency is improved, and the cost can be greatly reduced for the object with limited storage. For data to be sent, the minimum bandwidth can be occupied in the transmission process, and the transmission efficiency is improved.
Fig. 5 is a schematic entity structure diagram of an electronic device according to an embodiment of the present invention, and as shown in fig. 5, the electronic device may include: a processor (processor)501, a communication Interface (Communications Interface)502, a memory (memory)503, and a bus 504, wherein the processor 501, the communication Interface 502, and the memory 503 are configured to communicate with each other via the bus 504. The communication interface 502 may be used for information transfer of an electronic device. The processor 501 may call logic instructions in the memory 503 to perform a method comprising: adding control characters to be arranged between printable characters in data to be stored or to be sent, wherein the control characters are used for controlling a data structure and data analysis; according to each character with 4 bits, carrying out binary coding on all characters of the data added with the control character; the data to be stored or to be sent is composed of printable characters, the printable characters comprise digital characters and special characters, and the total number of the printable characters, the special characters and the control characters is not more than 16; and when the binary coding is carried out, the coding is carried out according to a preset coding rule.
In addition, the logic instructions in the memory 503 may be implemented in the form of software functional units and stored in a computer readable storage medium when the logic instructions are sold or used as independent products. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the above-described method embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
In another aspect, an embodiment of the present invention further provides a non-transitory computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, performs the encoding method provided in the foregoing embodiments. For example, the method includes: adding control characters to be arranged between printable characters in data to be stored or to be sent, wherein the control characters are used for controlling a data structure and data analysis; according to each character with 4 bits, carrying out binary coding on all characters of the data added with the control character; the data to be stored or to be sent is composed of printable characters, the printable characters comprise digital characters and special characters, and the total number of the printable characters, the special characters and the control characters is not more than 16; and when the binary coding is carried out, the coding is carried out according to a preset coding rule.
The above-described system embodiments are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.