
CN104504342B - Method using invisible character hiding information is encoded based on Unicode - Google Patents

Info

Publication number
CN104504342B
Authority
CN
China
Prior art keywords
secret
carrier
message
unicode
encoding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410733815.0A
Other languages
Chinese (zh)
Other versions
CN104504342A (en)
Inventor
吴槟
易小伟
赵险峰
冯凯
何晓磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Information Engineering of CAS
Original Assignee
Institute of Information Engineering of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Information Engineering of CAS filed Critical Institute of Information Engineering of CAS
Priority to CN201410733815.0A priority Critical patent/CN104504342B/en
Publication of CN104504342A publication Critical patent/CN104504342A/en
Application granted granted Critical
Publication of CN104504342B publication Critical patent/CN104504342B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G06F21/6263 Protecting personal data, e.g. for financial or medical purposes during internet communication, e.g. revealing personal data from cookies

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Databases & Information Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Medical Informatics (AREA)
  • Document Processing Apparatus (AREA)
  • Storage Device Security (AREA)

Abstract

The invention proposes a method for hiding information in invisible characters based on Unicode encoding, consisting mainly of a message embedding algorithm and a message extraction algorithm. Secret information is hidden by exploiting the Unicode encoding characteristics of invisible characters, which increases the embedding capacity while preserving security. With the invention, secret information can be hidden in Unicode-encoded text carriers that contain invisible characters such as spaces; the embedded form of the secret information can be changed flexibly through the encoding table to keep the information secure; and the embedding capacity for secret information is effectively increased.

Description

A Method of Hiding Information Using Invisible Characters Based on Unicode Encoding

Technical field

The invention relates to a method for hiding information in invisible characters based on Unicode encoding, and belongs to the technical field of information hiding.

Background

With the spread of computer applications and the rapid development of the Internet, the requirements for network communication security have risen. Unlike encryption, information hiding protects communication by concealing the fact that a message is being transmitted at all. As text documents are used ever more widely, hiding secret messages in text documents has become a research direction in information security. Unicode is a universal character set encoding standard that is widely used in modern operating systems, and almost all word processing software supports encoding, parsing and storing text data in Unicode.

Invisible characters are the characters that are not visible when a document is printed, including spaces, tabs and line breaks. Hiding secret messages in invisible characters is generally hard to notice, simple to perform, and well concealed. WbStego4 is an open source tool that hides information in invisible characters and supports several data formats such as TXT, HTML and PDF. However, it does not support information hiding in Chinese text carriers, and its embedding capacity is limited, which severely restricts its practical use.

Summary of the invention

The technical problem addressed by the invention is to overcome the shortcomings of the existing WbStego4 software by providing a method for hiding information in invisible characters based on Unicode encoding. The method hides messages in Unicode-encoded Chinese and English text carriers that contain invisible characters, increases the hiding capacity while preserving security, and meets the requirements of covert communication.

The technical solution of the invention is a method for hiding information in invisible characters based on Unicode encoding. It consists of two main parts, a message embedding algorithm and a message extraction algorithm:

The message embedding algorithm re-encodes the invisible characters in the carrier object according to the encoding table, thereby embedding the secret message into the carrier object. The carrier object is Unicode-encoded Chinese or English text data, and the invisible characters are the half-width space, the full-width space and the tab, whose 2-byte Unicode representations are 0x20 00, 0x00 30 and 0x09 00 respectively (that is, code points U+0020, U+3000 and U+0009 written low byte first). The inputs of the embedding algorithm are the carrier object, the secret message and the encoding table; the output is the stego carrier.
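The byte pairs quoted above match the UTF-16 little-endian serialization of the three characters. The text does not name the serialization, so reading it as UTF-16LE is an assumption; the short check below only illustrates that reading, and the helper name `utf16le_hex` is made up for this example.

```python
# Assumption: the carrier text is stored as UTF-16 little-endian (low byte first).
INVISIBLE_CHARS = {
    "half-width space": "\u0020",
    "full-width space": "\u3000",
    "tab": "\u0009",
}

def utf16le_hex(ch: str) -> str:
    """Return the UTF-16LE bytes of one character as spaced hex, e.g. '20 00'."""
    return " ".join(f"{b:02X}" for b in ch.encode("utf-16-le"))

for name, ch in INVISIBLE_CHARS.items():
    print(f"{name}: {utf16le_hex(ch)}")
```

Under this reading, every 2-byte value in the rest of the description is a byte pair in file order, not a code point.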

The message extraction algorithm is the inverse of the embedding algorithm: it recovers the secret message from the stego carrier according to the encoding table. Its inputs are the stego carrier and the encoding table; its output is the secret message.

The method of the invention hides secret messages in a text carrier by exploiting the encoding characteristics of invisible characters, and comprises the following steps:

(1) The sender and the receiver agree on a key, and each constructs the encoding table from the key;

(2) The sender selects a carrier object and, using the encoding table generated in step (1), embeds the secret message into the carrier object to obtain the stego carrier;

(3) The sender transmits the stego carrier obtained in step (2) to the receiver over the communication channel;

(4) The receiver, using the encoding table generated in step (1), applies the extraction algorithm to the stego carrier received in step (3) and obtains the secret message.

Compared with the prior art, the invention has the following advantages:

(1) The choice of carrier objects is broader. Any Unicode-encoded text data can serve as the carrier object for hiding secret messages, which meets the needs of practical applications.

(2) The security of the algorithm is ensured by a key produced by a key generator. The key controls the generation of the encoding table, so message embedding/extraction is combined with message encryption/decryption while the hiding algorithm remains secure, which reduces operational complexity and energy consumption in practical use.

(3) The encoding table increases the embedding capacity. The table is built from redundant invisible-character codewords in the Unicode code and consists of 256 one-to-one mappings, which are controlled by the key. The encoding table raises the amount of information embedded per invisible character from 1 bit to 8 bits.
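The capacity figure in item (3) follows from counting codewords. As a worked check (the symbol n, the number of invisible characters in the carrier, is introduced here only for illustration, and reading the 1-bit baseline as a two-codeword scheme is an assumption):

```latex
\[
\log_2 256 = 8 \ \text{bits per invisible character},
\qquad
\log_2 2 = 1 \ \text{bit per invisible character (two-codeword baseline)},
\]
\[
\text{capacity of a carrier with } n \text{ invisible characters} = 8n \ \text{bits} = n \ \text{bytes}.
\]
```

For example, a carrier containing 1000 spaces and tabs could hold at most 1000 message bytes under this scheme.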

Description of drawings

Fig. 1 is a flowchart of an embodiment of the method of the invention;

Fig. 2 is a flowchart of the message embedding algorithm of the method;

Fig. 3 is a flowchart of the message extraction algorithm of the method.

Detailed description

To make the above objects, features and advantages of the invention clearer, the invention is further described below through specific embodiments and the accompanying drawings.

Fig. 1 shows the implementation flow of the invention. The information hiding method can be expressed as a six-tuple $\Sigma = \langle C, S, T_k, C', E_k, D_k \rangle$, where $C$ is the set of carrier objects, $S$ is the set of secret messages (written $M$ in the mappings below), $T_k$ is the encoding table constructed from the key $k$, $C'$ is the set of stego carriers obtained after hiding secret messages in carrier objects, $E_k$ is the message embedding algorithm, and $D_k$ is the message extraction algorithm. The method $\Sigma$ contains two main algorithm modules, the message embedding module and the message extraction module, which are called by the sender and the receiver respectively. The modules work as follows:

1. Message embedding algorithm

The sender constructs the encoding table $T_k$ from the key $k$, embeds the secret message $M$ into the selected carrier object $C$, and outputs the stego carrier $C'$. The sender then transmits $C'$ to the receiver over the communication channel. The message embedding algorithm $E_k$ can be expressed as:

$$E_k : C \times M \times T_k \to C'$$

2. Message extraction algorithm

The receiver constructs the encoding table $T_k$ from the extraction key $k$ and applies the extraction algorithm to the stego carrier $C'$ received from the sender to recover the secret message $M$. The message extraction algorithm $D_k$ can be expressed as:

$$D_k : C' \times T_k \to M$$

As shown in Fig. 1, the invention is implemented as follows:

1. Encoding table construction

The key $k$ shared by the sender and the receiver controls the generation of the encoding table $T_k$, which increases the hiding capacity while keeping the encoding secure and so meets the requirements of practical applications. The encoding table is constructed in the following steps:

(1) The key $k$ must be a 2048-bit binary string, which can be written in decimal form as

$$k = (n_0, n_1, \ldots, n_i, \ldots, n_{255}), \quad n_i \in \mathbb{N} \cap [0, 255], \quad i = 0, 1, \ldots, 255$$

In addition, the components of the key $k$ must satisfy the following condition:

$$\forall i, j \in \{0, 1, \ldots, 255\},\ i \neq j \Rightarrow n_i \neq n_j$$

In other words, the key $k$ is a permutation of the integer sequence 0 to 255:

$$k = \mathrm{perms}(0, 1, 2, \ldots, 255)$$

where perms() is the permutation function.
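The key constraints in step (1) can be checked mechanically. The sketch below is illustrative only: `is_valid_key` and `key_from_seed` are hypothetical helpers, and deriving the permutation from a shared seed with Python's `random` module is an assumption made for the example (the text only mentions a key generator, and `random` is not a cryptographic generator).

```python
import random

def is_valid_key(key) -> bool:
    """True if key is a permutation of 0..255, i.e. 256 distinct byte values."""
    return len(key) == 256 and sorted(key) == list(range(256))

def key_from_seed(seed: int) -> list[int]:
    """Hypothetical helper: shuffle 0..255 deterministically from a shared seed.

    random.Random is NOT cryptographically secure; it is used here only so
    that both parties in the example derive the same permutation.
    """
    key = list(range(256))
    random.Random(seed).shuffle(key)
    return key

key = key_from_seed(2014)
# 256 components of 8 bits each give the 2048-bit key length.
print(is_valid_key(key), len(key) * 8)
```

Because of the distinctness condition, only 256! of the 2^2048 possible bit strings are valid keys.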

(2) Based on the characteristics of the Unicode code, the sender and the receiver select in advance 256 candidate codewords for invisible characters, which form the basis of the encoding table shared by both parties (each codeword must be a 2-byte Unicode encoding). Table 1 gives an example of 256 candidate codewords in hexadecimal.

(3) Using steps (1) and (2), build the encoding table $T_k$ that maps the key to the candidate codewords. Table 1 is the encoding table constructed for the key $k = (0, 1, 2, \ldots, 255)$.

Table 1. Encoding table (each entry is m: codeword, with the message byte m and the 2-byte codeword in hexadecimal)

00: 00 D8    25: 01 E4    4A: 02 F8    6F: 04 E3    94: 05 F7    B9: 07 E2    DE: 08 F6
01: 00 D9    26: 01 E5    4B: 03 D8    70: 04 E4    95: 05 F8    BA: 07 E3    DF: 08 F7
02: 00 DA    27: 01 E6    4C: 03 D9    71: 04 E5    96: 06 D8    BB: 07 E4    E0: 08 F8
03: 00 DB    28: 01 E7    4D: 03 DA    72: 04 E6    97: 06 D9    BC: 07 E5    E1: 09 D8
04: 00 DC    29: 01 F0    4E: 03 DB    73: 04 E7    98: 06 DA    BD: 07 E6    E2: 09 D9
05: 00 DD    2A: 01 F1    4F: 03 DC    74: 04 F0    99: 06 DB    BE: 07 E7    E3: 09 DA
06: 00 DE    2B: 01 F2    50: 03 DD    75: 04 F1    9A: 06 DC    BF: 07 F0    E4: 09 DB
07: 00 DF    2C: 01 F3    51: 03 DE    76: 04 F2    9B: 06 DD    C0: 07 F1    E5: 09 DC
08: 00 E0    2D: 01 F4    52: 03 DF    77: 04 F3    9C: 06 DE    C1: 07 F2    E6: 09 DD
09: 00 E1    2E: 01 F5    53: 03 E0    78: 04 F4    9D: 06 DF    C2: 07 F3    E7: 09 DE
0A: 00 E2    2F: 01 F6    54: 03 E1    79: 04 F5    9E: 06 E0    C3: 07 F4    E8: 09 DF
0B: 00 E3    30: 01 F7    55: 03 E2    7A: 04 F6    9F: 06 E1    C4: 07 F5    E9: 09 E0
0C: 00 E4    31: 01 F8    56: 03 E3    7B: 04 F7    A0: 06 E2    C5: 07 F6    EA: 09 E1
0D: 00 E5    32: 02 D8    57: 03 E4    7C: 04 F8    A1: 06 E3    C6: 07 F7    EB: 09 E2
0E: 00 E6    33: 02 D9    58: 03 E5    7D: 05 D8    A2: 06 E4    C7: 07 F8    EC: 09 E3
0F: 00 E7    34: 02 DA    59: 03 E6    7E: 05 D9    A3: 06 E5    C8: 08 D8    ED: 09 E4
10: 00 F0    35: 02 DB    5A: 03 E7    7F: 05 DA    A4: 06 E6    C9: 08 D9    EE: 09 E5
11: 00 F1    36: 02 DC    5B: 03 F0    80: 05 DB    A5: 06 E7    CA: 08 DA    EF: 09 E6
12: 00 F2    37: 02 DD    5C: 03 F1    81: 05 DC    A6: 06 F0    CB: 08 DB    F0: 09 E7
13: 00 F3    38: 02 DE    5D: 03 F2    82: 05 DD    A7: 06 F1    CC: 08 DC    F1: 09 F0
14: 00 F4    39: 02 DF    5E: 03 F3    83: 05 DE    A8: 06 F2    CD: 08 DD    F2: 09 F1
15: 00 F5    3A: 02 E0    5F: 03 F4    84: 05 DF    A9: 06 F3    CE: 08 DE    F3: 09 F2
16: 00 F6    3B: 02 E1    60: 03 F5    85: 05 E0    AA: 06 F4    CF: 08 DF    F4: 09 F3
17: 00 F7    3C: 02 E2    61: 03 F6    86: 05 E1    AB: 06 F5    D0: 08 E0    F5: 09 F4
18: 00 F8    3D: 02 E3    62: 03 F7    87: 05 E2    AC: 06 F6    D1: 08 E1    F6: 09 F5
19: 01 D8    3E: 02 E4    63: 03 F8    88: 05 E3    AD: 06 F7    D2: 08 E2    F7: 09 F6
1A: 01 D9    3F: 02 E5    64: 04 D8    89: 05 E4    AE: 06 F8    D3: 08 E3    F8: 09 F7
1B: 01 DA    40: 02 E6    65: 04 D9    8A: 05 E5    AF: 07 D8    D4: 08 E4    F9: 09 F8
1C: 01 DB    41: 02 E7    66: 04 DA    8B: 05 E6    B0: 07 D9    D5: 08 E5    FA: 0A D8
1D: 01 DC    42: 02 F0    67: 04 DB    8C: 05 E7    B1: 07 DA    D6: 08 E6    FB: 0A D9
1E: 01 DD    43: 02 F1    68: 04 DC    8D: 05 F0    B2: 07 DB    D7: 08 E7    FC: 0A DA
1F: 01 DE    44: 02 F2    69: 04 DD    8E: 05 F1    B3: 07 DC    D8: 08 F0    FD: 0A DB
20: 01 DF    45: 02 F3    6A: 04 DE    8F: 05 F2    B4: 07 DD    D9: 08 F1    FE: 0A DC
21: 01 E0    46: 02 F4    6B: 04 DF    90: 05 F3    B5: 07 DE    DA: 08 F2    FF: 0A DD
22: 01 E1    47: 02 F5    6C: 04 E0    91: 05 F4    B6: 07 DF    DB: 08 F3
23: 01 E2    48: 02 F6    6D: 04 E1    92: 05 F5    B7: 07 E0    DC: 08 F4
24: 01 E3    49: 02 F7    6E: 04 E2    93: 05 F6    B8: 07 E1    DD: 08 F5

In the invention, the encoding table is an input to both the message embedding module and the message extraction module. Because the sender and the receiver construct it from the same key $k$, the encoding table $T_k$ given to the embedding module is identical to the one given to the extraction module.
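The regular pattern of Table 1 makes the table easy to regenerate. The sketch below rebuilds the 256 candidate codewords and applies a key to them. The names `candidate_codewords` and `build_table` are hypothetical, and the rule "message byte m maps to candidate number key[m]" is an assumption: the description only states that the identity key yields Table 1, which this rule satisfies.

```python
# Second-byte values used in Table 1: D8-DF, E0-E7, F0-F8 (8 + 8 + 9 = 25 values).
SECOND_BYTES = [*range(0xD8, 0xE0), *range(0xE0, 0xE8), *range(0xF0, 0xF9)]

def candidate_codewords() -> list[bytes]:
    """Return the 256 two-byte candidate codewords in Table 1 order."""
    words = [bytes([first, second])
             for first in range(0x0B)        # first byte 00..0A
             for second in SECOND_BYTES]
    return words[:256]                       # 11 * 25 = 275 pairs, keep the first 256

def build_table(key: list[int]) -> list[bytes]:
    """Assumed mapping: message byte m is encoded as candidate number key[m]."""
    if sorted(key) != list(range(256)):
        raise ValueError("key must be a permutation of 0..255")
    words = candidate_codewords()
    return [words[key[m]] for m in range(256)]

table = build_table(list(range(256)))        # identity key reproduces Table 1
print(table[0x00].hex(), table[0x25].hex(), table[0xFF].hex())
```

Since both parties run the same deterministic construction on the same key, they obtain identical tables without exchanging the table itself.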

2. Message embedding

The message embedding module executes the embedding algorithm E(), which can be expressed as

$$c' = E(c, m, T_k)$$

where the inputs are the carrier object $c$, the secret message $m$ to be embedded, and the encoding table $T_k$ generated from the embedding key $k$, and the output is the stego carrier $c'$ containing the secret message. The carrier $c$ must be Unicode-encoded text data, including plain TXT text and the text data in Word, PDF, XML and HTML compound documents. The embedding module processes $m$ as a byte stream, so $m$ can be, for example, TXT text data or JPEG image data.

The flowchart of the embedding algorithm is shown in Fig. 2. The steps are:

(1) Preprocess the input carrier object c by replacing every codeword of the encoding table T_k that occurs in c with 0x20 00;

(2) Read the next 2 bytes xx of c in sequence and check whether xx is 0x00 00;

(3) If the result of step (2) is no, go to step (4); otherwise go to step (12a);

(4) Check whether xx is an invisible character, that is, a half-width space, a full-width space or a tab, whose Unicode representations are 0x20 00, 0x00 30 and 0x09 00 respectively;

(5) If the result of step (4) is no, go back to step (2); otherwise go to step (6);

(6) Read 1 byte y from the input secret message m;

(7) Check whether y is EOF;

(8) If the result of step (7) is no, go to step (9); otherwise go to step (12b);

(9) Look up the codeword zz that corresponds to y in the input encoding table T_k;

(10) Replace the xx found in step (5) in c with the zz from step (9);

(11) Go back to step (2);

(12a) Output a message that the embedding capacity of the carrier is insufficient.

(12b) Output the modified c, which is the stego carrier c′.
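The embedding steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the patented implementation: the carrier is taken to be UTF-16LE bytes whose end is the end of the buffer (instead of a 0x00 00 terminator), the table is the identity-key Table 1 layout, and the capacity error is raised only when message bytes are left over. The name `embed` is made up for the example.

```python
# Identity-key encoding table following the Table 1 pattern (assumed layout).
SECOND_BYTES = [*range(0xD8, 0xE0), *range(0xE0, 0xE8), *range(0xF0, 0xF9)]
TABLE = [bytes([a, b]) for a in range(0x0B) for b in SECOND_BYTES][:256]

SPACE = b"\x20\x00"
INVISIBLE = {SPACE, b"\x00\x30", b"\x09\x00"}    # space, full-width space, tab

def embed(carrier: bytes, message: bytes, table: list[bytes]) -> bytes:
    """Hide one message byte in each invisible character of a UTF-16LE carrier."""
    codewords = set(table)
    units = [carrier[i:i + 2] for i in range(0, len(carrier), 2)]
    # Step (1): replace any unit that already equals a codeword with a space.
    units = [SPACE if u in codewords else u for u in units]
    pos = 0                                      # index of the next message byte
    for i, unit in enumerate(units):             # steps (2)-(11)
        if pos == len(message):                  # message exhausted, step (12b)
            break
        if unit in INVISIBLE:
            units[i] = table[message[pos]]       # steps (9)-(10)
            pos += 1
    if pos < len(message):                       # carrier exhausted, step (12a)
        raise ValueError("carrier embedding capacity is insufficient")
    return b"".join(units)

carrier = "a b\tc\u3000d".encode("utf-16-le")    # three invisible characters
stego = embed(carrier, b"\x00\x25\xff", TABLE)
print(stego.hex())
```

The stego carrier has the same length as the carrier, since each 2-byte invisible character is swapped for a 2-byte codeword. Note that step (1) rewrites any carrier unit that happens to coincide with a codeword, so a carrier containing such units (for example private-use characters or halves of surrogate pairs) would be altered.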

3. Message extraction

The message extraction module executes the extraction algorithm D(), which can be expressed as

$$m = D(c', T_{k'})$$

where the inputs are the stego carrier $c'$ containing the secret message and the encoding table $T_{k'}$ generated from the extraction key $k'$, and the output is the extracted secret message $m$. To keep the encoding tables $T_k$ and $T_{k'}$ identical, the embedding key $k$ and the extraction key $k'$ are the same.

The flowchart of the extraction algorithm is shown in Fig. 3. The steps are:

(1) Read the next 2 bytes yy from the input stego carrier c′;

(2) Check whether yy is EOF;

(3) If the result of step (2) is no, go to step (4); otherwise go to step (8);

(4) Check whether the yy from step (2) is a codeword in the input encoding table T_k′;

(5) If the result of step (4) is no, go back to step (1); otherwise go to step (6);

(6) Look up T_k′ to obtain the message byte m that corresponds to yy;

(7) Go back to step (1);

(8) Output the secret message m.
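The extraction steps above amount to scanning the stego carrier in 2-byte units and inverting the encoding table. A minimal sketch under the same assumptions as before (UTF-16LE bytes, identity-key Table 1 layout); the name `extract` and the sample stego bytes are made up for illustration.

```python
# Identity-key encoding table following the Table 1 pattern (assumed layout).
SECOND_BYTES = [*range(0xD8, 0xE0), *range(0xE0, 0xE8), *range(0xF0, 0xF9)]
TABLE = [bytes([a, b]) for a in range(0x0B) for b in SECOND_BYTES][:256]

def extract(stego: bytes, table: list[bytes]) -> bytes:
    """Recover the hidden bytes by mapping every codeword unit back to its value."""
    inverse = {codeword: m for m, codeword in enumerate(table)}
    message = bytearray()
    for i in range(0, len(stego) - 1, 2):        # steps (1)-(3): read 2 bytes until the end
        unit = stego[i:i + 2]
        if unit in inverse:                      # step (4): is the unit a codeword?
            message.append(inverse[unit])        # step (6): look up the message byte
    return bytes(message)                        # step (8)

# "a", codeword 00 D8, "b", codeword 01 E4, "c", codeword 0A DD, "d" in UTF-16LE.
stego = bytes.fromhex("610000d8620001e463000add6400")
recovered = extract(stego, TABLE)
print(recovered.hex())
```

The message length is not transmitted in this scheme as described, so the sketch simply returns every codeword it finds, in order.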

The above embodiments only illustrate the technical solution of the invention and do not limit it. A person of ordinary skill in the art may modify the technical solution or replace parts of it with equivalents without departing from the spirit and scope of the invention. The scope of protection is defined by the claims.

Claims (6)

1. A method for hiding information in invisible characters based on Unicode encoding, suitable for hiding messages in Unicode-encoded text carriers that contain invisible characters, comprising the following steps:

(1) The sender and the receiver agree on a key, the key controls the generation of the encoding table, and each party constructs the encoding table from the key using invisible characters in the Unicode code; the key $k$ shared by the sender and the receiver controls the generation of the encoding table $T_k$, which is constructed as follows:

(1-1) The key $k$ must be a 2048-bit binary string, written in decimal form as

$$n_0 n_1 \ldots n_i \ldots n_{255}, \quad n_i \in \mathbb{N} \cap [0, 255], \quad i = 0, 1, \ldots, 255,$$

and the components $n_i$ and $n_j$ of the key $k$ satisfy the condition

$$\forall i, j \in \{0, 1, \ldots, 255\},\ i \neq j \Rightarrow n_i \neq n_j,$$

so that the key $k$ is a permutation of the integer sequence 0 to 255:

$$k = \mathrm{perms}(0, 1, 2, \ldots, 255),$$

where perms() is the permutation function;

(1-2) Based on the characteristics of the Unicode code, the sender and the receiver select in advance 256 candidate codewords for invisible characters, which form the basis of the encoding table shared by both parties;

(1-3) Using steps (1-1) and (1-2), build the encoding table $T_k$ that maps the key to the candidate codewords;

(2) The sender selects Unicode-encoded text data as the carrier object, re-encodes the invisible characters in the carrier object according to the encoding table generated in step (1), and thereby embeds the secret message into the carrier object to obtain the stego carrier;

(3) The sender transmits the stego carrier obtained in step (2) to the receiver over the communication channel;

(4) The receiver, using the encoding table generated in step (1), extracts the secret message from the stego carrier received in step (3).

2. The method of claim 1, characterized in that step (2) embeds the secret message into the carrier object by executing the embedding algorithm E(), expressed as

$$c' = E(c, m, T_k),$$

where the inputs are the carrier object $c$, the secret message $m$ to be embedded and the encoding table $T_k$ generated from the embedding key $k$, and the output is the stego carrier $c'$ containing the secret message.

3. The method of claim 2, characterized in that the carrier object c is Unicode-encoded text data, including plain TXT text and the text data in Word, PDF, XML and HTML compound documents; the secret message m is processed as a byte stream, and m is TXT text data or JPEG image data.

4. The method of claim 2, characterized in that the embedding algorithm is implemented in the following steps:

(2-1) Preprocess the input carrier object c by replacing every codeword of the encoding table T_k that occurs in c with 0x20 00;

(2-2) Read the next 2 bytes xx of c in sequence and check whether xx is 0x00 00;

(2-3) If the result of step (2-2) is no, go to step (2-4); otherwise go to step (2-12a);

(2-4) Check whether xx is an invisible character, that is, a half-width space, a full-width space or a tab, whose Unicode representations are 0x20 00, 0x00 30 and 0x09 00 respectively;

(2-5) If the result of step (2-4) is no, go back to step (2-2); otherwise go to step (2-6);

(2-6) Read 1 byte y from the input secret message m;

(2-7) Check whether y is EOF;

(2-8) If the result of step (2-7) is no, go to step (2-9); otherwise go to step (2-12b);

(2-9) Look up the codeword zz that corresponds to y in the input encoding table T_k;

(2-10) Replace the xx found in step (2-5) in c with the zz from step (2-9);

(2-11) Go back to step (2-2);

(2-12a) Output a message that the embedding capacity of the carrier is insufficient;

(2-12b) Output the modified c, which is the stego carrier c′.

5. The method of claim 4, characterized in that step (4) extracts the secret message from the stego carrier by executing the extraction algorithm D(), expressed as

$$m = D(c', T_{k'}),$$

where the inputs are the stego carrier $c'$ containing the secret message and the encoding table $T_{k'}$ generated from the extraction key $k'$, and the output is the extracted secret message $m$; to keep the encoding tables $T_k$ and $T_{k'}$ identical, the embedding key $k$ is the same as the extraction key $k'$.

6. The method of claim 5, characterized in that the extraction algorithm is implemented in the following steps:

(4-1) Read the next 2 bytes yy from the input stego carrier c′;

(4-2) Check whether yy is EOF;

(4-3) If the result of step (4-2) is no, go to step (4-4); otherwise go to step (4-8);

(4-4) Check whether the yy from step (4-2) is a codeword in the input encoding table T_k′;

(4-5) If the result of step (4-4) is no, go back to step (4-1); otherwise go to step (4-6);

(4-6) Look up T_k′ to obtain the message byte m that corresponds to yy;

(4-7) Go back to step (4-1);

(4-8) Output the secret message m.
CN201410733815.0A 2014-12-04 2014-12-04 Method using invisible character hiding information is encoded based on Unicode Active CN104504342B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410733815.0A CN104504342B (en) 2014-12-04 2014-12-04 Method using invisible character hiding information is encoded based on Unicode

Publications (2)

Publication Number Publication Date
CN104504342A CN104504342A (en) 2015-04-08
CN104504342B true CN104504342B (en) 2018-04-03

Family

ID=52945738

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410733815.0A Active CN104504342B (en) 2014-12-04 2014-12-04 Method using invisible character hiding information is encoded based on Unicode

Country Status (1)

Country Link
CN (1) CN104504342B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106301790B (en) * 2016-08-18 2019-11-15 中国联合网络通信集团有限公司 Confidential information interaction method, mobile terminal
CN107885704A (en) * 2016-09-29 2018-04-06 厦门雅迅网络股份有限公司 Text information hiding method and its system
CN106570356B (en) * 2016-11-01 2020-01-31 南京理工大学 Embedding and Extracting Method of Text Watermark Based on Unicode Encoding
CN107103630B (en) * 2017-03-03 2019-11-26 中国科学院信息工程研究所 A kind of carrier-free concealed communication method based on GIF attribute interval division mapping code
CN109657769B (en) * 2018-12-29 2021-11-19 安徽大学 Two-dimensional code information hiding method based on run length coding
CN111027080B (en) * 2019-11-26 2021-11-19 中国人民解放军战略支援部队信息工程大学 Information hiding method and system based on OOXML composite document source file data area position arrangement sequence

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1504044A (en) * 2001-06-12 2004-06-09 International Business Machines Corporation Method for invisibly embedding and hiding data into soft copy text documents
CN1674055A (en) * 2004-07-26 2005-09-28 刘�东 Text digital water mark technology based on symbol redundancy encoding
CN101645061A (en) * 2009-09-03 2010-02-10 张�浩 Information hiding method taking text information as carrier

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6769061B1 (en) * 2000-01-19 2004-07-27 Koninklijke Philips Electronics N.V. Invisible encoding of meta-information
WO2002101522A2 (en) * 2001-06-12 2002-12-19 International Business Machines Corporation Method of authenticating a plurality of files linked to a text document
CN1599405A (en) * 2004-07-26 2005-03-23 刘�东 Text digital watermark technology of carried hidden information by symbolic redundancy encoding
US20090285402A1 (en) * 2008-05-16 2009-11-19 Stuart Owen Goldman Service induced privacy with synchronized noise insertion
CN103761459B (en) * 2014-01-24 2016-08-17 中国科学院信息工程研究所 A kind of document multiple digital watermarking embedding, extracting method and device

Also Published As

Publication number Publication date
CN104504342A (en) 2015-04-08

Similar Documents

Publication Publication Date Title
CN104504342B (en) Method using invisible character hiding information is encoded based on Unicode
CN107561564B (en) A kind of compression implementation method of big-dipper satellite information transmission
CN103049682B (en) Character pitch encoding-based dual-watermark embedded text watermarking method
CN102508824B (en) Compression coding and decoding method and device for microblog information
CN102724668A (en) Method and system for sharing WIFI (wireless fidelity) network information on basis of two-dimensional code graphs
CN106815544A (en) A kind of information concealing method based on Quick Response Code
CN105426709A (en) JPEG image information hiding based private information communication method and system
CN102096787A (en) Method and device for hiding information based on word2007 text segmentation
CN103400173A (en) Generating method and reading method of two-dimensional code containing private information
CN104753540A (en) Data compression method, data decompression method and device
Kumar et al. A high capacity email based text steganography scheme using Huffman compression
JP4168946B2 (en) Document data encoding or decoding method and program thereof
Kumar et al. An efficient text steganography scheme using Unicode Space Characters
CN103731154B (en) Data compression algorithm based on semantic analysis
CN104376236A (en) Scheme self-adaptive digital watermark embedding and extracting method based on camouflage technology
Chou et al. A Webpage Data Hiding Method by Using Tag and CSS Attribute Setting
CN116664123A (en) Digital wallet design method based on blockchain technology
CN108536860A (en) Encrypting web, decryption method, terminal device and computer readable storage medium
CN107071455B (en) Jpeg image information concealing method based on data flow
CN101419589A (en) Method and system for protecting computer document content
CN106484661A (en) A kind of method of EBCDIC coding extension
CN105183750B (en) Close-coupled XML resolution system
CN105871542A (en) Encryption and decryption method of ciphertext
CN101840483A (en) Method and system for protecting computer document contents
CN102439589A (en) Method and apparatus for encoding and decoding xml documents using path code

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant