[go: up one dir, main page]

CN111814009B - Mode matching method based on search engine retrieval information - Google Patents

Mode matching method based on search engine retrieval information Download PDF

Info

Publication number
CN111814009B
CN111814009B CN202010598366.9A CN202010598366A CN111814009B CN 111814009 B CN111814009 B CN 111814009B CN 202010598366 A CN202010598366 A CN 202010598366A CN 111814009 B CN111814009 B CN 111814009B
Authority
CN
China
Prior art keywords
character
string
characters
pattern
pattern string
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010598366.9A
Other languages
Chinese (zh)
Other versions
CN111814009A (en
Inventor
汪浩
徐媛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan Changhong Electric Co Ltd
Original Assignee
Sichuan Changhong Electric Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan Changhong Electric Co Ltd filed Critical Sichuan Changhong Electric Co Ltd
Priority to CN202010598366.9A priority Critical patent/CN111814009B/en
Publication of CN111814009A publication Critical patent/CN111814009A/en
Application granted granted Critical
Publication of CN111814009B publication Critical patent/CN111814009B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/903Querying
    • G06F16/90335Query processing
    • G06F16/90344Query processing by using string matching techniques
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A pattern matching method based on search engine search information includes picking up character or character string from pattern string to be matched to form new pattern string, comparing said new pattern string with pattern string to be matched, recording character number of interval between said character or character string, forming object by character number of character string and character number of character interval, carrying out multiple comparison between character in new pattern string and character in corresponding position of target string, comparing even index in object until character in new pattern string is matched with character in corresponding position of target string completely, comparing odd index in object until pattern string to be matched is found out in target string. By optimizing the comparison structure and the comparison sequence of the pattern strings, the first character of the pattern strings does not need to be compared with the corresponding character in the target string one by one in each round of matching, the comparison times can be reduced, and the pattern matching efficiency is improved.

Description

Mode matching method based on search engine retrieval information
Technical Field
The invention relates to the field of pattern matching, in particular to a BF improved algorithm based on search engine retrieval information pattern matching.
Background
With the rapid development of computer network technology, the amount of information generated at every moment is explosively increased, including unstructured data such as characters, pictures, audio, video, geographical positions, and the like. Computer networks offer convenience and rapidity to people, but are also inundated with such massive amounts of data information. The unordered and huge information world and the various link modes of the forms make people feel unpreferable when searching for the required information. The data information amount is huge and is mostly in an unstructured data format, which puts higher requirements on searching engine retrieval information, and the search engine does not only help people to retrieve more information from massive data, but also queries matched information more quickly, efficiently and accurately, and filters information or harmful information which is irrelevant to retrieval.
Pattern matching is required for search engines to perform information retrieval and content filtering operations from large amounts of data. Pattern matching is a basic operation of character strings in a data structure, and aims to find out all pattern strings which are the same as a pattern string from a certain target string after the pattern string is given. The specific operation process is that, assuming that P is a given pattern string and T is a target string to be searched, all pattern strings identical to P are required to be found from T. If one or more pattern strings with P exist in T, the position of the pattern string in T is given, and the matching is called successful, otherwise, the matching fails. The BF algorithm is one of pattern matching common algorithms, and the idea is that the first character of a target string T is matched with the first character of a pattern string P, and if the first character of the target string T is equal to the first character of the pattern string P, the second character of the target string T is continuously compared with the second character of the pattern string P; and if not, comparing the second character of the target string T with the first character of the pattern string P, and sequentially comparing one round by one round until each character in the pattern string P is sequentially equal to one continuous character sequence in the target string T.
To better understand the principle of the conventional BF algorithm, the process of the BF algorithm is described as an example. It should be noted that the example is only for illustrating the algorithm principle, so the target string T in this example is small in length and only 25 bits, and the data format includes only letters and special characters, but in practice, the length of the target string T stored in the database to be traversed and matched is far more than 25, and the data format is more complicated. Given a target string T ═ xyzabxyzaz% yyxyza% xyyy, the length is 25, and the pattern string P ═ xyza% xyyy, the length is 9. Here, the target string T represents mass data stored in a search engine database, and the pattern string P represents a keyword that a user wants to match a search result in a search engine. As shown in fig. 1, the conventional BF algorithm example matching process has the following steps:
(1) 1 st round of matching
Comparing the 1 st character of the pattern string P with the 1 st character of the target string T, P0Is equal to T0Comparison continues down in sequence, P1Is equal to T1,P2Is equal to T2,P3Is equal to T3At comparison of 5, P4Is not equal to T4The match fails. In the 1 st round of matching, the number of comparisons was 5 in total.
(2) 2 nd to 5 th round of matching
Comparing the 1 st character of the pattern string P with the 2 nd character of the target string T, P0Is not equal to T1The match fails. In the 2 nd round of matching, the number of comparisons was 1 in total. According to the idea of BF algorithm, the matching of the 3 rd round, the 4 th round and the 5 th round is continued, and P is found after the matching is respectively compared once0Is not equal to T2,P0Is not equal to T3,P0Is not equal to T4The match failed, so the number of comparisons at the 3 rd round match was 1 in total, the number of comparisons at the 4 th round match was 1 in total, and the number of comparisons at the 5 th round match was 1 in total.
(3) 6 th round of matching
Comparing the 1 st character of the pattern string P with the 6 th character of the target string T, P0Is equal to T5Comparison continues down in sequence, P1Is equal to T6At comparison of 3, P2Is not equal to T7The match fails. In the 6 th round of matching, the number of comparisons was 3 in total.
(4) 7 th round of matching
Comparing the 1 st character of the pattern string P with the 7 th character of the target string T, P0Is not equal to T6The match fails. In the 7 th round of matching, the number of comparisons was 1 in total.
(5) 8 th round of matching
Comparing the 1 st character of the pattern string P with the 8 th character of the target string T, P0Is equal to T7Comparison continues down in sequence, P1Is equal to T8,P2Is equal to T9,P3Is equal to T10At comparison of 5, P4Is not equal to T11The match fails. In the 8 th round of matching, the number of comparisons was 5 in total.
(6) 9 th to 16 th round of matching
Comparing the 1 st character of the pattern string P with the 9 th character of the target string T, P0Is not equal to T8The match fails. In the 9 th round of matching, the number of comparisons was 1 in total. According to the BF algorithm idea, continuously carrying out matching from the 10 th round to the 16 th round, respectively comparing once to find P0Is not equal to T9,P0Is not equal to T10,P0Is not equal to T11,P0Is not equal to T12,P0Is not equal to T13,P0Is not equal to T14,P0Is not equal to T15The matching fails, so the total number of comparisons for each round of matching is 1 in the 10 th to 16 th rounds of matching.
(7) 17 th round of matching
Comparing the 1 st character of the pattern string P with the 17 th character of the target string T, P0Is equal to T16And sequentially continuing to compare downwards to find that each character comparison of the pattern string and the target string is equal, and the matching is successful after 9 times of comparison.
The total number of comparisons from round 1 to round 17 can be added, in this example 35 comparisons are required to achieve a successful match using the conventional BF algorithm.
It can be seen from the above BF conventional algorithm flow that each matching is performed from the first character of the pattern string one by one, and after one matching is completed, the pattern string is moved backward by one character distance with respect to the target string and is continuously matched one by one, and in the whole process, once a certain character fails to be matched, the matching is performed from the beginning, that is, from the next character at the starting point of the current target string, the matching is repeated one by one, so that the comparison times are large, and the matching efficiency is low.
Disclosure of Invention
The invention provides an improved algorithm aiming at the defects that each character of a BF traditional algorithm needs to be compared one by one in each round of matching, the comparison sequence is fixed and single, the comparison times are multiple, the matching efficiency is low and the like, overcomes the defects of the BF algorithm, optimizes the comparison structure and the comparison sequence of a pattern string in the process of finding the pattern string from a target string, reduces the comparison times and improves the matching efficiency.
In order to solve the technical problem, one embodiment of the present invention adopts the following technical solutions:
a pattern matching method based on search engine retrieval information adopts BF improved algorithm, and the specific technical scheme is as follows:
step 1: and extracting some characters or character strings from the pattern string to be matched according to the structural characteristics and the content characteristics of the pattern string to be matched to form a new pattern string. The basic requirement for a new pattern string is that it needs to include the first character and the last character of the pattern string to be matched; the further technical scheme is that the new pattern string also needs to include special characters which are different from other characters in the pattern string to be matched, or the new pattern string also needs to include continuous regular character strings in the pattern string to be matched. Therefore, the new pattern string can be a new pattern string consisting of the first character and the last character in the pattern string to be matched; or a new mode string consisting of a first character, a special character different from other characters and a last character; or a new mode string consisting of a first character, a continuous regular character string and a last character; it can also be a new pattern string composed of the first character, a special character distinguished from other characters, a continuous regular character string and the last character. It is also possible that the first character or the last character of the string of patterns to be matched is a special character or a continuous regular string of characters that is distinguished from other characters. Because the probability of the existence of special characters, regular character strings, continuous character strings and the combination of the special characters, the regular character strings and the continuous character strings in the target string is low, the new mode strings with optimized structures are utilized to match with the target string, the probability of character matching failure can be improved, and the next round of matching can be rapidly carried out. The special characters referred to herein are relative, and usually have a small probability of appearing in the pattern string to be matched, and there may be differences in types, for example, when there are several english letters and a punctuation mark or a number in the pattern string to be matched, then the punctuation mark or the number may be used as a special character, when there are several numbers and an english letter in the pattern string to be matched, then the english letter may be used as a special character, there may be more than one special character in the pattern string to be matched, and here, the selection pattern of the special character is merely exemplified, and not limited. The continuous regular character string may be a character string obtained by repeating one character, or a character string obtained by arranging a plurality of characters in sequence and repeating them, such as XX, XYXY, R1! R1! R1! And the like.
Step 2: and comparing the new mode string with the mode string to be matched, recording the number of characters at intervals among the characters or the character strings, and forming an object by the number of characters corresponding to the characters or the character strings forming the new mode string and the number of characters at intervals among the characters or the character strings.
There is provided a specific operation method, the composition mode of the object being: { N0,N1,N2Either { N } or { N }0,N1,N2,N3,N4……NXX is a positive integer, N0,N1,N2,N3,N4……NXThe values of (A) are natural numbers which can be equal or different from each other, N represents the index of the object, and the subscript is used for distinguishing whether the index is an odd index or an even index; the index with even subscript is the number of the first character in the pattern string to be matched, or the number of special characters different from other characters, or continuous regular wordsThe number of characters of the string, or the number of characters of the last character; the index with the index odd is the number of characters spaced between two adjacent indexes. Generally, the characters corresponding to even indexes are arranged left and right in sequence to form the characters of the new mode string.
And step 3: firstly, comparing the characters in the new pattern string with the characters at the corresponding positions of the target string, wherein the following comparison sequence is: the even-numbered index in the object controls the number of characters needing to be compared, and the odd-numbered index controls the number of characters needing to be compared in a skipping or interval mode; and if a certain character of the new pattern string is not matched with the target string in the comparison process from left to right, performing the next round of matching according to the same comparison sequence. The left-to-right comparison indicated here is not a single comparison, and may be a right-to-left comparison.
And 4, step 4: if each character in the new pattern string is compared and is equal to the character at the corresponding position in the target string, the comparison is started from the first character which is not compared in the pattern string to be matched, at the moment, the characters of the new pattern string do not need to be compared, and the following comparison sequence is that: the even-numbered index in the object controls the number of characters which are not needed to be compared in a skipping or interval mode, and the odd-numbered index controls the number of characters which are needed to be compared;
and 5: and (4) repeating the steps 3 and 4 until the pattern string to be matched is found in the target string, and terminating the algorithm.
Through the steps, the pattern matching method provided by the invention constructs a new pattern string by utilizing the structural characteristics and the content characteristics of the pattern string, the characters in the new pattern string are selected from the whole pattern string (namely, the pattern string to be matched), the content characteristics of the pattern string are reflected, only the characters at the characteristic positions are compared at the beginning, and other characters are skipped.
The pattern matching method (BF improved algorithm) optimizes the comparison structure and the comparison sequence of the pattern strings, and each round of matching does not need to compare the first character of the pattern strings with the corresponding character in the target string one by one, so that the comparison times can be reduced, and the pattern matching efficiency can be improved.
Drawings
Fig. 1 is a flow chart of a BF conventional algorithm.
Fig. 2 is a flow chart of the BF improvement algorithm of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Fig. 2 is a flowchart of an implementation of the BF improvement algorithm of the present invention, which employs the same target string T ═ xyz bxxyxzaz% yyxza% xyyyy and pattern string P ═ xyza% xyyy as in the conventional BF algorithm, and the BF improvement algorithm includes the following steps:
according to the above-mentioned invention, the new pattern string N is formed by observing that the pattern string P is xyz% xyyy, which is different from the special characters of other characters, and 3 continuous and regular characters yyy connected together, and adding the first character x, the character forming the new pattern string N is x% yy, compared with the original pattern string xyza% xyyy, the number of the 3 characters or the interval characters of the string is 3, 1, respectively, so that the structure of the new pattern string N can be expressed as {1, 3, 1, 1, 3} by the object, wherein the 0 th, 2, 4 even bit indexes represent the number of characters to be compared, the 1 st, 3 rd bit indexes represent the number of characters to be skipped without comparison, that is, during the whole pattern matching process, firstly, pattern matching is performed according to x (comparison) yza% (comparison) x (comparison) yyy (comparison), when the new pattern string N sequentially has mismatch from left to right, the next round of matching is continued until all the characters in the new pattern string N are compared and equal, and then the comparison is started from the first character which is skipped before and has no comparison, and at this time, each round of matching is performed according to x (no comparison) yza (comparison)% (no comparison) x (comparison) yyy (no comparison).
(1) 1 st round of matching
Pattern matching is performed in the order of x (compare) yza (not compare)% (compare) x (not compare) yyy (compare), comparing the 1 st character of the pattern string P with the 1 st character of the target string T, P0Is equal to T0Skipping 3 character comparisons, P4Is not equal to T4The match fails. In the 1 st round of matching, the number of comparisons was 2 in total.
(2) 2 nd to 5 th round of matching
Comparing the 1 st character of the pattern string P with the 2 nd character of the target string T, P0Is not equal to T1The match fails. In the 2 nd round of matching, the number of comparisons was 1 in total. According to the idea of BF algorithm, the matching of the 3 rd round, the 4 th round and the 5 th round is continued, and P is found after the matching is respectively compared once0Is not equal to T2,P0Is not equal to T3,P0Is not equal to T4The match failed, so the number of comparisons at the 3 rd round match was 1 in total, the number of comparisons at the 4 th round match was 1 in total, and the number of comparisons at the 5 th round match was 1 in total.
(3) 6 th round of matching
Comparing the 1 st character of the pattern string P with the 6 th character of the target string T, P0Is equal to T5Then skip 3 characters and continue to compare down, P4Is not equal to T9The match fails. In the 6 th round of matching, the number of comparisons was 2 in total.
(4) 7 th round of matching
Comparing the 1 st character of the pattern string P with the 7 th character of the target string T, P0Is not equal to T6The match fails. In the 7 th round of matching, the number of comparisons was 1 in total.
(5) 8 th round of matching
Comparing the 1 st character of the pattern string P with the 8 th character of the target string T, P0Is equal to T7Then skip 3 character comparisons, P4Is not equal to T11The match fails. In the 8 th round of matching, the number of comparisons was 2 in total.
(6) 9 th to 16 th round of matching
Comparing the 1 st character of the pattern string P with the 9 th character of the target string T, P0Is not equal to T8The match fails. The 10 th to 16 th matching was continued, and all the results were found to fail after 1 comparison.
(7) 17 th round of matching
Comparing the 1 st character of the pattern string P with the 17 th character of the target string T, P0Is equal to T17And continuing to compare sequentially downwards to find that each character x% yyy in the new pattern string N is equal to each character at the corresponding position in the target string, and then, starting to compare the characters which are not compared before the first character in the pattern string P is skipped, wherein the comparison sequence is x (not compared) yza% (compared)% (not compared) x (compared) yyy (not compared), the rest characters in the pattern string P are also equal, and the matching is successful and is compared for 9 times in total.
The total number of comparisons from round 1 to round 17 can be added, and in this example, 28 comparisons are required for successful matching using the BF improvement algorithm, which is 7 times less than that of the BF conventional algorithm. For target strings and pattern strings with longer lengths, the advantages of the improved algorithm are more obvious, and the comparison times are more reduced. Characters in the new pattern string N are selected from the whole pattern string P, the content characteristics of the pattern string are reflected, only the characters on the characteristic positions are compared at the beginning, other characters are skipped, compared with the BF traditional algorithm, the method has the advantages that the speed of character matching failure can be increased by sequentially matching every bit from the beginning, the number of the characters of the new pattern string is a fraction of the whole pattern string or even less, the new pattern string can cause matching failure in most cases, and if no failure occurs and the characters in the new pattern string are matched with other characters skipped previously, the characters in the new pattern string do not need to be compared again. By optimizing the comparison structure and the comparison sequence of the pattern strings, the defect that the BF traditional algorithm is mechanically compared with the target strings one by one from the first character of the pattern strings is overcome, the comparison times are reduced, and the matching efficiency is improved.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims (5)

1. A pattern matching method based on search engine retrieval information adopts BF improved algorithm, and is characterized by comprising the following steps:
step 1: extracting some characters or character strings from the pattern string to be matched according to the structural characteristics and the content characteristics of the pattern string to be matched to form a new pattern string;
step 2: comparing the new pattern string with the pattern string to be matched, recording the number of characters at intervals among the characters or the character strings, and forming an object by the number of characters corresponding to the characters or the character strings forming the new pattern string and the number of characters at intervals among the characters or the character strings;
and step 3: firstly, comparing the characters in the new pattern string with the characters at the corresponding positions of the target string, wherein the following comparison sequence is: the even-numbered index in the object controls the number of characters needing to be compared, and the odd-numbered index controls the number of characters needing to be compared in a skipping or interval mode; if a certain character of the new pattern string is not matched with the target string in the comparison process from left to right, performing the next round of matching according to the same comparison sequence; after the index-referenced natural numbers are sorted, the even-numbered index and the odd-numbered index are distinguished;
and 4, step 4: if each character in the new pattern string is compared and is equal to the character at the corresponding position in the target string, the comparison is started from the first character which is not compared in the pattern string to be matched, at the moment, the characters of the new pattern string do not need to be compared, and the following comparison sequence is that: the even-numbered index in the object controls the number of characters which are not needed to be compared in a skipping or interval mode, and the odd-numbered index controls the number of characters which are needed to be compared;
and 5: and (4) repeating the steps 3 and 4 until the pattern string to be matched is found in the target string, and terminating the algorithm.
2. The method of claim 1, wherein the new pattern string includes a first character and a last character of the pattern string to be matched.
3. The pattern matching method based on search engine retrieval information as claimed in claim 2, wherein the new pattern string needs to include a special character distinguished from other characters in the pattern string to be matched.
4. A pattern matching method based on search engine retrieval information according to claim 2 or 3, characterized in that the new pattern string needs to include continuous regular character strings in the pattern string to be matched.
5. The pattern matching method for retrieving information based on a search engine according to claim 4, wherein the object is composed of a pattern of: { N0,N1,N2Either { N } or { N }0,N1,N2,N3,N4……NXX is a positive integer, N0,N1,N2,N3,N4……NXThe values of (A) are natural numbers, which may be equal or different; the index with even subscript is the character number of the first character in the pattern string to be matched, or the character number of special characters different from other characters, or the character number of continuous regular character strings, or the character number of the last character; the index with the index odd is the number of characters spaced between two adjacent indexes.
CN202010598366.9A 2020-06-28 2020-06-28 Mode matching method based on search engine retrieval information Active CN111814009B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010598366.9A CN111814009B (en) 2020-06-28 2020-06-28 Mode matching method based on search engine retrieval information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010598366.9A CN111814009B (en) 2020-06-28 2020-06-28 Mode matching method based on search engine retrieval information

Publications (2)

Publication Number Publication Date
CN111814009A CN111814009A (en) 2020-10-23
CN111814009B true CN111814009B (en) 2022-03-01

Family

ID=72855468

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010598366.9A Active CN111814009B (en) 2020-06-28 2020-06-28 Mode matching method based on search engine retrieval information

Country Status (1)

Country Link
CN (1) CN111814009B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112580747B (en) * 2020-12-29 2024-08-23 珠海金山数字网络科技有限公司 Matching method and device

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH05274184A (en) * 1992-03-24 1993-10-22 Nec Corp Execution result comparing device
CN102799600A (en) * 2012-04-10 2012-11-28 成都网安科技发展有限公司 Multi-mode matching algorithm and system based on encoding association
CN103500178A (en) * 2013-09-09 2014-01-08 中国科学院计算机网络信息中心 Quick multi-mode matching method on worst-case scenario of FS algorithm
CN104081669A (en) * 2012-04-28 2014-10-01 华为技术有限公司 Method for repairing and decoding air interface voice frame, and signal source side information acquisition method and device
CN104462266A (en) * 2014-11-21 2015-03-25 北京京东尚科信息技术有限公司 Method and system for improving string matching
CN104519056A (en) * 2014-12-15 2015-04-15 广东科学技术职业学院 Double-jump-based single mode matching method
CN107220333A (en) * 2017-05-24 2017-09-29 电子科技大学 A kind of chracter search method based on Sunday algorithms
CN107660283A (en) * 2015-04-03 2018-02-02 甲骨文国际公司 Method and system for implementing a log parser in a log analysis system
CN108886367A (en) * 2016-01-29 2018-11-23 零点科技公司 Method, apparatus and system for compression and decompression data

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080022403A1 (en) * 2006-07-22 2008-01-24 Tien-Fu Chen Method and apparatus for a pattern matcher using a multiple skip structure
US8615804B2 (en) * 2010-02-18 2013-12-24 Polytechnic Institute Of New York University Complementary character encoding for preventing input injection in web applications
CN102750379B (en) * 2012-06-25 2014-07-02 华南理工大学 Fast character string matching method based on filtering type
CN109977276B (en) * 2019-03-22 2020-12-22 华南理工大学 An Improved Single Pattern Matching Method Based on Sunday Algorithm

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH05274184A (en) * 1992-03-24 1993-10-22 Nec Corp Execution result comparing device
CN102799600A (en) * 2012-04-10 2012-11-28 成都网安科技发展有限公司 Multi-mode matching algorithm and system based on encoding association
CN104081669A (en) * 2012-04-28 2014-10-01 华为技术有限公司 Method for repairing and decoding air interface voice frame, and signal source side information acquisition method and device
CN103500178A (en) * 2013-09-09 2014-01-08 中国科学院计算机网络信息中心 Quick multi-mode matching method on worst-case scenario of FS algorithm
CN104462266A (en) * 2014-11-21 2015-03-25 北京京东尚科信息技术有限公司 Method and system for improving string matching
CN104519056A (en) * 2014-12-15 2015-04-15 广东科学技术职业学院 Double-jump-based single mode matching method
CN107660283A (en) * 2015-04-03 2018-02-02 甲骨文国际公司 Method and system for implementing a log parser in a log analysis system
CN108886367A (en) * 2016-01-29 2018-11-23 零点科技公司 Method, apparatus and system for compression and decompression data
CN107220333A (en) * 2017-05-24 2017-09-29 电子科技大学 A kind of chracter search method based on Sunday algorithms

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
A new string matching algorithm based on logical indexing;Daniar Heri Kurniawan 等;《2015 International Conference on Electrical Engineering and Informatics (ICEEI)》;20151217;394-399 *
Novel approach for string searching and matching using American standard code for information interchange value;C. Vamsi Krishna 等;《2016 International Conference on Recent Trends in Information Technology (ICRTIT)》;20160919;1-5 *
一种改进的Sunday字符串匹配算法;刘雨心 等;《太原理工大学学报》;20130915;第44卷(第5期);604-607 *
一种改进的字符串匹配算法;王成 等;《计算机工程》;20060120;第32卷(第2期);62-64 *
基于k-匿名的位置隐私保护方法研究;邢凯;《中国优秀硕士学位论文全文数据库 信息科技辑》;20190115(第01(2019)期);I138-236 *
基于函数属性的二进制文件快速比对;俞昕;《中国优秀硕士学位论文全文数据库 信息科技辑》;20190115(第01(2019)期);I138-4 *

Also Published As

Publication number Publication date
CN111814009A (en) 2020-10-23

Similar Documents

Publication Publication Date Title
US8200646B2 (en) Efficient retrieval of variable-length character string data
CN102867040B (en) Chinese search engine mixed speech-oriented query error correction method and system
JP4538449B2 (en) String search method and equipment
US7912818B2 (en) Web graph compression through scalable pattern mining
US7882109B2 (en) Computer representation of a data tree structure and the associated encoding/decoding methods
JPH11212980A (en) Production of index and retrieval method
WO2008053583A1 (en) Bit sequence searching method and program
CN114238257B (en) Log processing methods, log processing devices and electronic equipment
JP6072922B2 (en) Character string search device, character string search method, and character string search program
CN111814009B (en) Mode matching method based on search engine retrieval information
KR101089722B1 (en) Prefix tree-based indexing method and apparatus, recording medium thereof
CN113065419B (en) Pattern matching algorithm and system based on flow high-frequency content
CN109460495B (en) Redundant field filtering method based on improved BM algorithm and suffix array
JP2012018592A (en) Sequence analysis apparatus, sequence analysis method and computer program
JPH11203315A (en) Method for retrieving symbol column and device therefor and recording medium for recording symbol column retrieval program
CN117914824A (en) Active IPv6 address detection method and device based on multi-level association strategy
CN116483959A (en) Method, device and medium for retrieving question-related subgraphs based on evidence graph schema
CN109657108B (en) Domain name asset data storage and query method and system
JP4510041B2 (en) Document search system and program
CN113010882A (en) Self-defined position sequence pattern matching algorithm suitable for cache loss attack
CN119003835A (en) Efficient binary character string matching method based on KMP algorithm
CN113609342A (en) Data storage method
CN113609341A (en) Method for generating data dictionary
CN113691352A (en) Data segmentation method
KR970066950A (en) Information retrieval device for searching sentences to search for strings matching keywords

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant