
CN111178008A - Digital character-oriented data encoding method, digital character-oriented data analyzing method and digital character-oriented data encoding system - Google Patents


Info

Publication number
CN111178008A
Authority
CN
China
Prior art keywords
characters
character
data
control
printable
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911320614.7A
Other languages
Chinese (zh)
Inventor
刘云浩
王继良
罗五明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University
Priority to CN201911320614.7A
Publication of CN111178008A

Landscapes

  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

Embodiments of the present invention provide a digital character-oriented data encoding method, parsing method, and system. The encoding method includes: adding control characters, arranged between the printable characters, to the data to be stored or to be sent, the control characters being used to control the data structure and data parsing; and binary-encoding all characters of the resulting data at 4 bits per character. The data to be stored or to be sent is composed of printable characters, the printable characters include numeric characters and special characters, and the total number of printable characters, special characters, and control characters is not more than 16; the binary encoding follows a preset encoding rule. Four bits per character is enough to cover the total number of printable and control characters while encoding and decoding every character with the minimum number of bits, so that stored data occupies the minimum space on the storage medium, improving storage efficiency, and transmitted data occupies the minimum bandwidth, improving transmission efficiency.

Description

Digital character-oriented data encoding method, digital character-oriented data analyzing method and digital character-oriented data encoding system
Technical Field
The invention relates to the field of data management, in particular to a digital character-oriented data encoding method, a digital character-oriented data analyzing method and a digital character-oriented data encoding system.
Background
At present, some industries represent information entirely with numeric characters. For example, the OID (object identifier) unique identifier is mainly represented by the numeric characters "0" to "9" and "."; the Handle system mainly uses the numeric characters 0-9 together with ".", "/" and the like; the sampled data of MEMS sensors is also typically represented by numeric characters; and communication protocols in sensor networks can likewise define their data formats based on numeric characters. Such data representation, storage, or communication protocols have two main characteristics. First, resources are limited: the numeric character data generally has to be stored in resource-constrained electronic storage devices such as RFID tags and MEMS sensors. Second, the forms of expression are flexible and varied, are difficult to express in a fixed data format, and are usually expressed as semi-structured data blocks.
Currently, on resource-constrained storage objects, numeric characters, alphabetic characters and the like are generally stored mixed together and encoded with the ASCII character set, using one 8-bit byte per character, which makes the representation of numeric characters inefficient. Meanwhile, data blocks are generally delimited with relatively complex character strings or special characters, which also lowers storage efficiency.
Disclosure of Invention
In order to solve the above problems, embodiments of the present invention provide a digital character-oriented data encoding method, parsing method, and system.
In a first aspect, an embodiment of the present invention provides a digital character-oriented data encoding method, including: adding control characters, arranged between the printable characters, to the data to be stored or to be sent, wherein the control characters are used for controlling the data structure and data parsing; and binary-encoding all characters of the data to which the control characters have been added, at 4 bits per character; the data to be stored or to be sent is composed of printable characters, the printable characters comprise numeric characters and special characters, and the total number of the printable characters, the special characters and the control characters is not more than 16; and the binary encoding is performed according to a preset encoding rule.
Further, the control characters include at least two of the following control characters: a first control character, a second control character, a third control character, and a fourth control character, wherein: the first control character is used to separate two adjacent basic data units; the second control character is used for identifying the beginning of a data block, and the third control character is used for identifying the end of a data block; the fourth control character identifies the end of the entire data structure.
Further, the basic data unit is a key-value pair, wherein: the key is a metadata identifier and is composed of 4-bit printable characters; the value is the value assigned to the metadata and is likewise composed of 4-bit printable characters.
Furthermore, a data block is composed of a plurality of key-value pairs, control characters and smaller data blocks, and all the characters in the data block are 4-bit printable characters and control characters.
Further, the special character includes at least one of a first special character and a second special character; the first special character is used for identifying a decimal point of a floating point number; the second special character is used to identify a separator in the code.
In a second aspect, an embodiment of the present invention provides a digital character-oriented data parsing method, including: decoding the received or read data at 4 bits per character, according to the decoding rule corresponding to the encoding rule used in encoding, to obtain the printable characters and the control characters; and interpreting the meaning represented by the control characters to obtain the data consisting of printable characters; wherein the printable characters include numeric characters and special characters, and a total number of the printable characters, the special characters, and the control characters is not greater than 16.
In a third aspect, an embodiment of the present invention provides a digital character-oriented data encoding system, including: the processing module is used for adding control characters to be arranged among printable characters in data to be stored or to be sent and is used for controlling a data structure and data analysis; the encoding module is used for carrying out binary encoding on all characters of the data added with the control characters according to each 4-bit character; the data to be stored or to be sent is composed of printable characters, the printable characters comprise digital characters and special characters, and the total number of the printable characters, the special characters and the control characters is not more than 16; and when the binary coding is carried out, the coding is carried out according to a preset coding rule.
In a fourth aspect, an embodiment of the present invention provides a digital character-oriented data parsing system, including: the analysis module is used for decoding the received or read data according to a 4-bit character and a decoding rule corresponding to the coding rule used in coding to respectively obtain printable characters and control characters; the processing module is used for analyzing the meaning represented by the control character to obtain data consisting of printable characters; wherein the printable characters include numeric characters and special characters, and a total number of the printable characters, the special characters, and the control characters is not greater than 16.
In a fifth aspect, an embodiment of the present invention provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor executes the computer program to implement the steps of the digital character oriented data encoding method according to the first aspect or the digital character oriented data parsing method according to the second aspect of the present invention.
In a sixth aspect, the present invention provides a non-transitory computer readable storage medium, on which a computer program is stored, the computer program, when being executed by a processor, implementing the steps of the digital character oriented data encoding method of the first aspect or the digital character oriented data parsing method of the second aspect of the present invention.
In the digital character-oriented data encoding method, parsing method, and system provided by the embodiments of the invention, control characters are added to the printable characters, and encoding and decoding are performed with 4-bit binary numbers, one per character. Four bits per character is enough to cover the total number of printable and control characters while encoding and decoding every character with the minimum number of bits. As a result, data to be stored occupies the minimum space on the storage medium, which improves storage efficiency and can greatly reduce cost for storage-constrained objects. Data to be sent occupies the minimum bandwidth during transmission, which improves transmission efficiency.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a flow chart of a method for encoding digital character-oriented data according to an embodiment of the present invention;
FIG. 2 is a flow chart of a data parsing method for numeric characters according to an embodiment of the present invention;
FIG. 3 is a block diagram of a numeric character-oriented data encoding system according to an embodiment of the present invention;
FIG. 4 is a block diagram of a data parsing system for numeric characters according to an embodiment of the present invention;
FIG. 5 is a schematic physical structure diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 is a flowchart of a digital character-oriented data encoding method according to an embodiment of the present invention, and as shown in fig. 1, the digital character-oriented data encoding method according to an embodiment of the present invention includes:
101. adding control characters, arranged between the printable characters, to the data to be stored or to be sent, wherein the control characters are used for controlling the data structure and data parsing;
102. binary-encoding all characters of the data to which the control characters have been added, at 4 bits per character;
the data to be stored or to be sent is composed of printable characters, the printable characters comprise numeric characters and special characters, and the total number of the printable characters, the special characters and the control characters is not more than 16; and the binary encoding is performed according to a preset encoding rule.
Before step 101, the data to be stored or to be sent is obtained.
The data to be stored or to be sent is, for example, data collected by sensor devices such as MEMS sensors and RFID tags, which needs to be stored in a storage medium or transmitted to other terminals or servers. For example, a temperature sensor whose unique identification OID is 1.2.156.101818.30.56.123456 collects a temperature reading of 25.6 ℃.
In 101, control characters are added to printable characters in data to be stored or to be transmitted, so that the data are stored in a storage medium or transmitted after being encoded.
Printable characters may include numeric characters and special characters. The numeric characters are the characters that carry data information in the data to be stored, such as the digits 0-9. Special characters are other characters used to separate numeric characters, such as "." and "/"; the former can represent a decimal point and the latter can separate the year, month and day of a date. The control characters identify the printable characters and key-value pairs stored in the data, so that, when the data is parsed, the required printable characters can be accurately identified according to the control characters.
Device objects such as sensors and RFID tags store only numeric data. Since there are only ten numeric characters, "0" to "9", a 4-bit nibble is used to represent each character; 4 bits can encode 2^4 = 16 characters. With 4-bit nibbles, the ten numeric characters "0" to "9" can be defined together with a small number (no more than 6) of special or control symbols, such as A (or "."), B (or "/"), C, D, E, and F. The following embodiments are all described using the 4-bit encoding as an example.
In this embodiment, the printable characters and control characters are encoded as binary numbers at 4 bits per character. The digits 0-9 plus up to six special and control characters (represented by A to F) total no more than 16, so a 4-bit binary code can represent every character.
The data coding method facing the digital characters, provided by the embodiment of the invention, codes printable characters and control characters of data to be stored or to be sent according to binary numbers of 4 bits and one character respectively, wherein one character with 4 bits can meet the total number of the printable characters and the control characters, and can code all the characters with the minimum number of bits, so that the data to be stored can occupy the minimum storage space in a storage medium, the storage efficiency is improved, and the cost can be greatly reduced for objects with limited storage. For data to be sent, the minimum bandwidth can be occupied in the transmission process, and the transmission efficiency is improved.
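The storage saving can be illustrated with a minimal Python sketch that packs two 4-bit characters into each byte and compares the result with one-byte-per-character ASCII. The 16-symbol alphabet below follows the assignment described in this document (digits, then "." and "/" in the positions of A and B, then C to F); the name `pack_nibbles` and the choice to pad an odd-length message with F are assumptions made for illustration and are not specified by the source.

```python
# Assumed 16-symbol alphabet: position in the string = 4-bit code.
# Digits 0-9, "." in place of A, "/" in place of B, control characters C-F.
ALPHABET = "0123456789./CDEF"


def pack_nibbles(text: str) -> bytes:
    """Pack each character into 4 bits, two characters per byte."""
    codes = [ALPHABET.index(ch) for ch in text]  # ValueError on unknown symbols
    if len(codes) % 2:
        # Assumption: pad an odd-length message with the terminator F (1111).
        codes.append(0xF)
    return bytes((hi << 4) | lo for hi, lo in zip(codes[0::2], codes[1::2]))


message = "023C0425.6"
packed = pack_nibbles(message)
print(len(message.encode("ascii")), "bytes as ASCII")  # 10 bytes as ASCII
print(len(packed), "bytes as packed nibbles")          # 5 bytes as packed nibbles
print(packed.hex())                                    # 023c0425a6
```

For a message made only of these 16 symbols, the packed form takes half the bytes of the ASCII form, rounded up.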
Based on the content of the above-mentioned embodiment, as an alternative embodiment, the special character includes at least one of a first special character and a second special character; the first special character is used for identifying a decimal point of a floating-point number; the second special character is used to identify a separator in the code.
Printable characters include numeric characters, namely the digits 0 to 9 above, and special characters, which may be "." and "/" (assigned the codes of the "A" and "B" characters, respectively), etc. Here "." can serve as the decimal point of a floating-point number and also as a separator in OID and Handle codes; "/" can serve as the separator in the unique identification code of a Handle. The printable characters constitute the set V, V = {"0", "1", "2", "3", "4", "5", "6", "7", "8", "9", ".", "/"}. The printable characters are mainly used for the string IDs that uniquely identify the various metadata, and for the assigned string values of the metadata. It should be noted that the functions of the first special character and the second special character can also be realized by the same character. For example, "." identifies both the decimal point of a floating-point number and the separator in a code.
Based on the content of the above-mentioned embodiments, as an alternative embodiment, the control characters include at least 2 of the following control characters: a first control character, a second control character, a third control character, and a fourth control character, wherein: the first control character is used to separate two adjacent basic data units; the second control character is used for identifying the beginning of a data block, and the third control character is used for identifying the end of a data block; the fourth control character identifies the end of the entire data structure.
There are mainly four control characters, of which at least two can be selected for use. For convenience of description, the first, second, third and fourth control characters are represented by "C", "D", "E" and "F", respectively. The control characters constitute the set W, W = {"C", "D", "E", "F"}. The control characters are mainly used to control data blocks: "C" is defined as the separator between ID-Value pairs; "D" and "E" appear in pairs and represent the start identifier and end identifier of a data block, respectively, acting like the braces "{" and "}" in JSON; "F" serves as the terminator of the entire data structure. In the embodiments of the invention, the key (ID) is the identifier of the stored value, and the value is the stored data.
Of course, according to the requirement, some characters selected from the control character set W can be defined as special characters to be added into the printable character set V, so that the expression capability of the numeric character string is stronger. However, this results in a reduction in the number of control characters, which may make the expression capability of the data block weak or the expression efficiency low, and may be set according to specific needs.
With the digital character-oriented data encoding method provided by the embodiment of the invention, the data structure can be flexibly customized through the four control characters, and the method is to a certain extent comparable in representation capability to semi-structured XML or JSON.
Based on the content of the foregoing embodiment, as an optional embodiment, the basic data unit is a key-value pair, where: the key is a metadata identifier and is composed of 4-bit printable characters; the value is the value assigned to the metadata and is likewise composed of 4-bit printable characters.
Metadata is data describing data (data about data). It is mainly information describing data properties and is used to support functions such as indicating storage location, historical data, resource search, and file records.
The printable character set V and the control character set W are first defined on 4 bits. Characters are then taken from the printable character set V to construct the set ID-Set of metadata identifier IDs and the assigned Value strings of the metadata, and each ID string together with its assigned value forms a key-value pair ID-Value. Different ID-Value pairs are delimited and combined using the control characters in the control character set W, so that a flexible data structure is constructed.
Based on the content of the above embodiments, as an alternative embodiment, the preset encoding rule, i.e. the character encoding scheme, may adopt the BASE16 encoding method in RFC 4648, as shown in Table 1 below.
TABLE 1
[Table image]
According to the requirement, a customized 4-bit coding mode can be adopted, that is, the corresponding relation between the representation symbols and the binary system is adjusted at will, and besides the represented symbols from "0" to "9", 6 special symbols can be customized according to the application requirement to replace "a" to "F".
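As a rough illustration of a preset encoding rule and a customized one, the following Python sketch builds a 4-bit code table from any alphabet of at most 16 distinct symbols. The helper names `make_codec`, `encode` and `decode` are hypothetical; the standard alphabet is the Base16 alphabet "0123456789ABCDEF", and the custom alphabet puts "." and "/" in the positions of A and B as described in this document.

```python
def make_codec(symbols: str):
    """Build encode/decode tables mapping each symbol to a 4-bit string."""
    if len(symbols) > 16 or len(set(symbols)) != len(symbols):
        raise ValueError("need at most 16 distinct symbols for a 4-bit code")
    enc = {s: format(i, "04b") for i, s in enumerate(symbols)}
    dec = {bits: s for s, bits in enc.items()}
    return enc, dec


def encode(text: str, enc: dict) -> str:
    """Encode text as space-separated nibbles (spaces are for reading only)."""
    return " ".join(enc[ch] for ch in text)


def decode(bits: str, dec: dict) -> str:
    """Decode space-separated nibbles back into characters."""
    return "".join(dec[nibble] for nibble in bits.split())


# Standard Base16 alphabet: "." has to be written as the symbol A.
std_enc, std_dec = make_codec("0123456789ABCDEF")
print(encode("25A6", std_enc))        # 0010 0101 1010 0110

# Customized alphabet: same binary codes, but A and B are shown as "." and "/".
cus_enc, cus_dec = make_codec("0123456789./CDEF")
print(encode("25.6", cus_enc))        # 0010 0101 1010 0110
print(decode("0010 0101 1010 0110", cus_dec))  # 25.6
```

Only the symbol-to-code table changes between the two alphabets; the stored bits are identical.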
To make the numeric-character-based data structure representation flexible, the method is built mainly on key-value pairs (i.e., ID-Value pairs) as basic units, and uses control characters to control various customized data structures. The key ID is the character string that uniquely identifies a piece of metadata; it needs to be predefined and to form a uniform industry specification. The value assigned to the metadata, i.e., the Value of the key ID, can be flexibly constructed as a numeric character string according to actual conditions. It should be noted that all the characters in the key ID string and the Value string come from the printable character set V, while the control characters used to control the structure come from the control character set W.
Based on the above description of the embodiments, as an alternative embodiment, the data block is composed of a plurality of key value pairs, control characters and small data blocks, and all the characters in the data block are composed of 4-bit printable characters and control characters. For example, there may be multiple samples of data for a sensor, which are stored or transmitted as a block of data. A plurality of sampled values, each for a small data block, may be formed into a data block in the form of key-value pairs in combination with control characters.
The following description will be given by way of specific examples in conjunction with the above-described embodiments.
(1) Definition of metadata identifier Set ID-Set
The metadata identifier set needs to be predefined; it mainly defines the real-world meaning of each piece of metadata and the corresponding metadata identifier. Since these identifiers appear in the data structure, the identifier strings should be as short as possible, typically 1-3 characters, and the identifiers in the set may differ in length.
For example, for sensor data collection, a Set of metadata identifiers ID-Set may be defined, as shown in Table 2 below.
TABLE 2
[Table image]
Of course, many similar metadata identifiers IDs may also be defined to express different data description requirements of the sensors.
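Because the key and the value are concatenated with no separator and identifiers may have different lengths, a parser has to use the predefined ID-Set to find where the key ends. The Python sketch below shows one possible way to do this by longest-prefix matching. The two identifiers, "02" (sensor type) and "04" (sensor sample value), are the ones used in the worked example in this description; the function name `split_pair` and the longest-match policy are assumptions, since the source does not state how ambiguous prefixes are resolved.

```python
# Hypothetical ID-Set; only these two identifiers appear in the text of
# this description (the full Table 2 is an image in the original).
ID_SET = {
    "02": "sensor type",
    "04": "sensor sample value",
}


def split_pair(pair: str, id_set=ID_SET):
    """Split a concatenated ID-Value string into (id, value).

    Assumption: when several identifiers could match, the longest one wins.
    """
    for length in sorted({len(key) for key in id_set}, reverse=True):
        if pair[:length] in id_set:
            return pair[:length], pair[length:]
    raise ValueError(f"no known metadata identifier at start of {pair!r}")


print(split_pair("023"))      # ('02', '3')
print(split_pair("0425.6"))   # ('04', '25.6')
```

A prefix-free ID-Set (no identifier being the prefix of another) would make the split unambiguous regardless of the matching policy.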
(2) Binary representation of metadata identifiers and assignment strings
To distinguish the different types of characters (strings), all metadata identifier ID (key) strings and their binary symbol strings are underlined; the control characters C, D, E and F and their binary symbol strings are shown in bold; the binary symbol strings of the special characters are shown in strikethrough; and the assigned value strings of the metadata and their binary symbol strings are shown in normal font. A space is placed between the binary strings of two characters for easy reading; it is not present in the actual data representation.
Based on the 4-bit BASE16 character encoding table above, the binary representation of the metadata identifier ID strings and their assigned Value strings is shown in Table 3 below.
TABLE 3
[Table image]
(3) Control character set W definition
In the control character set, the element "C" is used for a separator between ID-Value pairs; the elements "D" and "E" appear in pairs, representing the start identifier and end identifier of the data segment, respectively, which act as braces "{" and "}" in JSON; the element "F" serves as the terminator for the entire data structure. Four control characters are shown in bold and their definitions are shown in table 4 below.
TABLE 4
[Table image]
(4) Representation and partitioning of ID-Value pairs
The key-value pair ID-Value is the basic unit of the data structure. In this encoded data format, the ID numeric string and the Value numeric string are concatenated directly, and two adjacent key-value pairs are separated by the control character C (1100).
For example, consider a temperature sensor with a sample value of 25.6 degrees.
First, the sensor-type key-value pair is defined as 023, where 02 is the ID of the metadata "sensor type" and 3 indicates that the sensor is a temperature sensor. Next, the sample-value key-value pair is defined as 0425.6, where 04 is the ID of the metadata "sensor sample value" and 25.6 indicates that the sampled temperature is 25.6 degrees. Finally, the two key-value pairs are separated by the control character C (1100).
Thus, the complete data describing this information is 023C0425.6, and the corresponding binary number is 0000 0010 0011 1100 0000 0100 0010 0101 1010 0110.
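The example above can be checked mechanically. The following Python sketch decodes the ten nibbles back into characters and splits the result at the separator C. The alphabet string `SYMBOLS` is an assumption that follows the assignment described in this document ("." in the position of A, "/" in the position of B).

```python
# Assumed alphabet: index = 4-bit value, "." in place of A, "/" in place of B.
SYMBOLS = "0123456789./CDEF"

bits = "0000 0010 0011 1100 0000 0100 0010 0101 1010 0110"

# Decode each nibble to its symbol.
chars = "".join(SYMBOLS[int(nibble, 2)] for nibble in bits.split())
print(chars)   # 023C0425.6

# The control character C separates the two ID-Value pairs.
pairs = chars.split("C")
print(pairs)   # ['023', '0425.6']
```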
(5) Representation of the entire data structure
The representation of the entire data structure starts with the control character C (1100) and ends with F (1111). In between, the data structure consists of a number of data blocks, key-value pairs and control characters. A data block may in turn consist of several smaller data blocks, several key-value pairs and control characters; the beginning and end of each data block are identified by the control characters D (1101) and E (1110), respectively. Key-value pairs are separated from each other by C (1100), and data blocks are separated from key-value pairs by D (1101) and E (1110).
For example, the temperature sensor with the unique identification OID 1.2.156.101818.30.56.123456 begins sampling at 18:25 on October 1, 2019, with a sampling interval of 30 minutes, and three consecutive samples are 25.6, 28, and 31.5.
The key-value pairs formed from the data in this example are shown in Table 5 below.
TABLE 5
[Table image]
In the above data, the key value pairs of three consecutive sample values may be combined into one data block, and the three key value pairs are packed by control characters D and E.
The key value pairs and the data blocks are added with control characters (shown in bold) as follows:
[Image: key-value pairs and data blocks with control characters]
the corresponding binary representation is as follows:
[Image: corresponding binary representation]
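The construction rules for the entire data structure can be sketched in Python as follows. This is an illustration under assumptions, not the exact string shown in the original figures: only the identifiers 02 (sensor type) and 04 (sample value) are known from the text, so the OID, start time and interval entries of Table 5 are left out, and the names `body` and `build` are hypothetical. A key-value pair is a plain string and a data block is a nested list.

```python
def body(items) -> str:
    """Serialize pairs (str) and blocks (list) without the outer C ... F."""
    out = []
    prev_was_pair = False
    for item in items:
        if isinstance(item, str):
            if prev_was_pair:
                out.append("C")  # C separates two adjacent key-value pairs
            out.append(item)
            prev_was_pair = True
        else:
            # D and E wrap a block and also act as separators around it
            out.append("D" + body(item) + "E")
            prev_was_pair = False
    return "".join(out)


def build(items) -> str:
    """Whole structure: starts with C and ends with the terminator F."""
    return "C" + body(items) + "F"


# Sensor type pair followed by one block holding the three sample values.
structure = ["023", ["0425.6", "0428", "0431.5"]]
print(build(structure))   # C023D0425.6C0428C0431.5EF
```

Nesting a list inside a block produces a smaller data block inside a larger one, wrapped in its own D and E.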
fig. 2 is a flowchart of a digital character-oriented data parsing method according to an embodiment of the present invention, and as shown in fig. 2, a digital character-oriented data parsing method according to an embodiment of the present invention includes:
201. decoding the received or read data according to a 4-bit character and a decoding rule corresponding to the coding rule used in coding to respectively obtain printable characters and control characters;
202. analyzing the meaning represented by the control character to obtain data consisting of printable characters;
wherein the printable characters include numeric characters and special characters, and a total number of the printable characters, the special characters, and the control characters is not greater than 16.
The data parsing method is the reverse of the encoding method; for the specific flow, refer to the embodiments of the encoding method.
Parsing is the inverse of the process of constructing the data structure described above, and one specific implementation is as follows. Data in the data structure is read 4 bits (one character) at a time, i.e., aligned on nibbles. The parsing process is as follows:
(1) Read the first control character C, continue searching for the next control character, and take out the key-value pair between the two control characters.
(2) Parse the key-value pair: for example, restore the numeric string according to the BASE16 encoding method, and then separate the metadata identifier ID from its assigned value string according to the metadata identifier set ID-Set.
(3) If the next control character is F, take out the key-value pair (if any) between the two control characters, go to step (2) to parse it, and exit once parsing is finished;
if the next control character is C, take out the key-value pair (if any) between the two control characters and go to step (2) to parse it; then continue searching for the next control character and return to (3);
if the next control character is D, push D onto the stack, take out the key-value pair (if any) between the two control characters and go to step (2) to parse it; then continue searching for the next control character and return to (3);
if the next control character is E, pop D off the stack, take out the key-value pair (if any) between the two control characters and go to step (2) to parse it; then continue searching for the next control character and return to (3).
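The parsing steps above can be sketched as a small stack-based parser in Python. It works on the already decoded character string and returns key-value pairs as strings and data blocks as nested lists. The function name `parse`, the error handling, and the choice to keep the currently open block (rather than the character D itself) on the stack are assumptions made for this sketch; splitting each pair into ID and value via the ID-Set is left out.

```python
CONTROL = set("CDEF")


def parse(data: str) -> list:
    """Parse a decoded character string into nested lists of ID-Value pairs."""
    if not data.startswith("C"):
        raise ValueError("data structure must start with control character C")
    root = []
    stack = [root]   # stack[-1] is the block currently being filled
    token = ""       # printable characters read since the last control character
    for ch in data[1:]:
        if ch not in CONTROL:
            token += ch
            continue
        if token:                    # take out the pair between two control characters
            stack[-1].append(token)
            token = ""
        if ch == "D":                # block start: push a new block
            block = []
            stack[-1].append(block)
            stack.append(block)
        elif ch == "E":              # block end: pop the current block
            if len(stack) == 1:
                raise ValueError("E without matching D")
            stack.pop()
        elif ch == "F":              # terminator: parsing is finished
            if len(stack) != 1:
                raise ValueError("F reached inside an unclosed block")
            return root
        # ch == "C": plain separator, nothing more to do
    raise ValueError("missing terminator F")


print(parse("C023D0425.6C0428C0431.5EF"))
# ['023', ['0425.6', '0428', '0431.5']]
```

Keeping the stack makes it possible to detect unbalanced D and E characters before the terminator F is accepted.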
Fig. 3 is a structural diagram of a digital character-oriented data encoding system according to an embodiment of the present invention, and as shown in fig. 3, the digital character-oriented data encoding system includes: a processing module 301 and an encoding module 302. The processing module 301 is configured to add a control character to data to be stored or to be sent, where the control character is arranged between printable characters, and is used to control a data structure and data analysis; the encoding module 302 is configured to perform binary encoding on all characters of the data to which the control character is added according to each 4-bit character; the data to be stored or to be sent is composed of printable characters, the printable characters comprise digital characters and special characters, and the total number of the printable characters, the special characters and the control characters is not more than 16; and when the binary coding is carried out, the coding is carried out according to a preset coding rule.
Fig. 4 is a structural diagram of a digital character-oriented data parsing system according to an embodiment of the present invention. As shown in Fig. 4, the system includes a parsing module 401 and a processing module 402. The parsing module 401 is configured to decode received or read data, treating every 4 bits as one character and applying the decoding rule that corresponds to the encoding rule used during encoding, to obtain the printable characters and the control characters respectively. The processing module 402 is configured to parse the meaning represented by the control characters to obtain data composed of printable characters. The printable characters include numeric characters and special characters, and the total number of printable characters, special characters and control characters is not greater than 16.
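The decoding step of the parsing module can be sketched in the same spirit. The symbol table below is an illustrative assumption (digits, then A and B for the special characters, then C to F for the control characters), as is the choice to discard everything after the first F on the grounds that F marks the end of the whole data structure.

```python
# Assumed decoding rule: the 4-bit value i decodes to the symbol at index i.
ALPHABET = "0123456789ABCDEF"


def decode(data: bytes) -> str:
    """Unpack bytes into symbols, 4 bits per symbol, stopping after the first F."""
    out = []
    for byte in data:
        out.append(ALPHABET[byte >> 4])    # high 4 bits
        out.append(ALPHABET[byte & 0x0F])  # low 4 bits
    text = "".join(out)
    end = text.find("F")                   # assumed end-of-structure marker
    return text[: end + 1] if end != -1 else text


print(decode(bytes.fromhex("120c27ff")))
```

The resulting symbol string is what the processing module would then scan for control characters.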
The system embodiments provided by the embodiments of the present invention are used to implement the method embodiments above; for the specific processes and details, refer to those method embodiments, which are not repeated here.
The digital character-oriented data encoding system and parsing system provided by the embodiments of the present invention respectively encode and decode the printable characters and control characters as binary numbers of 4 bits per character. Four bits per character are enough to cover the total number of printable characters and control characters, and they allow all characters to be encoded and correspondingly decoded with the minimum number of bits. Data to be stored therefore occupies the minimum storage space in the storage medium, which improves storage efficiency and can substantially reduce cost for objects with limited storage. Data to be sent occupies the minimum bandwidth during transmission, which improves transmission efficiency.
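The bit count behind this statement can be written out as a short calculation. It assumes a fixed-length code and compares against an 8-bit-per-character representation, which is an illustrative baseline not named in the text; $N$ is the alphabet size and $n$ the number of characters in a message.

```latex
% Bits per character for a fixed-length code over N <= 16 symbols
b = \lceil \log_2 N \rceil \le \lceil \log_2 16 \rceil = 4

% Size of an n-character message: 4-bit packing versus 8 bits per character
S_{4} = \left\lceil \frac{4n}{8} \right\rceil = \left\lceil \frac{n}{2} \right\rceil \text{ bytes},
\qquad
S_{8} = n \text{ bytes}
```

For example, a 20-character message would take 10 bytes at 4 bits per character and 20 bytes at 8 bits per character.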
Fig. 5 is a schematic diagram of the physical structure of an electronic device according to an embodiment of the present invention. As shown in Fig. 5, the electronic device may include a processor 501, a communication interface 502, a memory 503 and a bus 504, where the processor 501, the communication interface 502 and the memory 503 communicate with each other via the bus 504. The communication interface 502 may be used for information transfer of the electronic device. The processor 501 may call logic instructions in the memory 503 to perform a method comprising: adding control characters to the data to be stored or sent, the control characters being arranged between printable characters and used to control the data structure and data parsing; and binary-encoding all characters of the data after the control characters are added, using 4 bits per character. The data to be stored or sent consists of printable characters, the printable characters include numeric characters and special characters, and the total number of printable characters, special characters and control characters is not greater than 16. The binary encoding follows a preset encoding rule.
In addition, the logic instructions in the memory 503 may be implemented in the form of software functional units and, when sold or used as an independent product, stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention may be embodied in the form of a software product that is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method embodiments of the present invention. The storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other media capable of storing program code.
In another aspect, an embodiment of the present invention further provides a non-transitory computer-readable storage medium on which a computer program is stored. When executed by a processor, the computer program performs the method provided in the foregoing embodiments, which includes, for example: adding control characters to the data to be stored or sent, the control characters being arranged between printable characters and used to control the data structure and data parsing; and binary-encoding all characters of the data after the control characters are added, using 4 bits per character. The data to be stored or sent consists of printable characters, the printable characters include numeric characters and special characters, and the total number of printable characters, special characters and control characters is not greater than 16. The binary encoding follows a preset encoding rule.
The system embodiments described above are merely illustrative. The units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. A person of ordinary skill in the art can understand and implement it without inventive effort.
From the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general-purpose hardware platform, or alternatively by hardware. Based on this understanding, the above technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, a magnetic disk, or an optical disk, and which includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute the methods described in the embodiments or in parts of the embodiments.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solution of the present invention, not to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced, and that such modifications or substitutions do not cause the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A digital character-oriented data encoding method, characterized by comprising: adding control characters to the data to be stored or sent, the control characters being arranged between printable characters and used to control the data structure and data parsing; and binary-encoding all characters of the data after the control characters are added, using 4 bits per character; wherein the data to be stored or sent consists of printable characters, the printable characters include numeric characters and special characters, and the total number of printable characters, special characters and control characters is not greater than 16; and wherein the binary encoding is performed according to a preset encoding rule.
2. The digital character-oriented data encoding method according to claim 1, characterized in that the control characters comprise at least two of the following: a first control character, a second control character, a third control character and a fourth control character, wherein: the first control character is used to separate two adjacent basic data units; the second control character is used to mark the start of a data block, and the third control character is used to mark the end of a data block; and the fourth control character marks the end of the entire data structure.
3. The digital character-oriented data encoding method according to claim 2, characterized in that the basic data unit is a key-value pair, wherein: the key is a metadata identifier composed of 4-bit printable characters; and the value is the value assigned to the metadata, represented by 4-bit printable characters.
4. The digital character-oriented data encoding method according to claim 3, characterized in that the data block consists of a number of key-value pairs, control characters and smaller data blocks, and all characters in the data block are 4-bit printable characters and control characters.
5. The digital character-oriented data encoding method according to claim 1, characterized in that the special characters comprise at least one of a first special character and a second special character; the first special character is used to mark the decimal point of a floating-point number; and the second special character is used to mark a delimiter in the encoding.
6. A digital character-oriented data parsing method, characterized by comprising: decoding received or read data, treating every 4 bits as one character and applying the decoding rule corresponding to the encoding rule used during encoding, to obtain printable characters and control characters respectively; and parsing the meaning represented by the control characters to obtain data composed of printable characters; wherein the printable characters include numeric characters and special characters, and the total number of printable characters, special characters and control characters is not greater than 16.
7. A digital character-oriented data encoding system, characterized by comprising: a processing module, configured to add control characters to the data to be stored or sent, the control characters being arranged between printable characters and used to control the data structure and data parsing; and an encoding module, configured to binary-encode all characters of the data after the control characters are added, using 4 bits per character; wherein the data to be stored or sent consists of printable characters, the printable characters include numeric characters and special characters, and the total number of printable characters, special characters and control characters is not greater than 16; and wherein the binary encoding is performed according to a preset encoding rule.
8. A digital character-oriented data parsing system, characterized by comprising: a parsing module, configured to decode received or read data, treating every 4 bits as one character and applying the decoding rule corresponding to the encoding rule used during encoding, to obtain printable characters and control characters respectively; and a processing module, configured to parse the meaning represented by the control characters to obtain data composed of printable characters; wherein the printable characters include numeric characters and special characters, and the total number of printable characters, special characters and control characters is not greater than 16.
9. An electronic device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the program, implements the steps of the digital character-oriented data encoding method or parsing method according to any one of claims 1 to 6.
10. A non-transitory computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the digital character-oriented data encoding method or parsing method according to any one of claims 1 to 6.
CN201911320614.7A 2019-12-19 2019-12-19 Digital character-oriented data encoding method, digital character-oriented data analyzing method and digital character-oriented data encoding system Pending CN111178008A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911320614.7A CN111178008A (en) 2019-12-19 2019-12-19 Digital character-oriented data encoding method, digital character-oriented data analyzing method and digital character-oriented data encoding system

Publications (1)

Publication Number Publication Date
CN111178008A true CN111178008A (en) 2020-05-19

Family

ID=70653962

Country Status (1)

Country Link
CN (1) CN111178008A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113162628A (en) * 2021-04-26 2021-07-23 深圳希施玛数据科技有限公司 Data encoding method, data decoding method, terminal and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1038536A (en) * 1987-12-29 1990-01-03 索尼公司 digital data transfer method
CN101076945A (en) * 2004-03-12 2007-11-21 国际商业机器公司 DC balance 6B/8B transmitted code with local parity check
CN101527614A (en) * 2008-12-31 2009-09-09 世纪中网科技(深圳)有限公司 Encoding method for transmitting GPS data
WO2012082936A2 (en) * 2010-12-14 2012-06-21 Ngmoco, Llc A communication protocol between a high-level language and a native language
CN107276719A (en) * 2017-06-05 2017-10-20 武汉虹信通信技术有限责任公司 A kind of number decimal number odd even number recognition methods for communication system
CN108509397A (en) * 2018-03-21 2018-09-07 清华大学 Storage, analytic method and the system of hierarchical structure data based on identifier technology
CN109858231A (en) * 2019-01-22 2019-06-07 武汉极意网络科技有限公司 Action trail lossless compression-encoding method, user equipment, storage medium and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
刘立斌: "数据在计算机中的表示与存储", 《民营科技》 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200519