CN111814009A - BF improved algorithm based on search engine retrieval information pattern matching - Google Patents
BF improved algorithm based on search engine retrieval information pattern matching Download PDFInfo
- Publication number
- CN111814009A CN111814009A CN202010598366.9A CN202010598366A CN111814009A CN 111814009 A CN111814009 A CN 111814009A CN 202010598366 A CN202010598366 A CN 202010598366A CN 111814009 A CN111814009 A CN 111814009A
- Authority
- CN
- China
- Prior art keywords
- characters
- string
- character
- pattern
- pattern string
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/903—Querying
- G06F16/90335—Query processing
- G06F16/90344—Query processing by using string matching techniques
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a BF improved algorithm based on search engine retrieval information pattern matching, which comprises the steps of firstly extracting characters or character strings from a pattern string to be matched to form a new pattern string, comparing the new pattern string with the pattern string to be matched, recording the number of characters at intervals among the characters or character strings, forming an object by the number of the characters forming the new pattern string and the number of characters at intervals of the characters, then carrying out multi-round comparison by using the characters in the new pattern string and the characters at the corresponding positions of a target string, firstly comparing even indexes in the object until the characters at the corresponding positions of the new pattern string and the target string are completely matched, and then comparing odd indexes in the object until the pattern string to be matched is found in the target string. The BF improved algorithm optimizes the comparison structure and the comparison sequence of the pattern strings, and each round of matching does not need to compare the first character of the pattern strings with the corresponding character in the target string one by one, so that the comparison times can be reduced, and the pattern matching efficiency can be improved.
Description
Technical Field
The invention relates to the field of pattern matching, in particular to a BF improved algorithm based on search engine retrieval information pattern matching.
Background
With the rapid development of computer network technology, the amount of information generated at every moment is explosively increased, including unstructured data such as characters, pictures, audio, video, geographical positions, and the like. Computer networks offer convenience and rapidity to people, but are also inundated with such massive amounts of data information. The unordered and huge information world and the various link modes of the forms make people feel unpreferable when searching for the required information. The data information amount is huge and is mostly in an unstructured data format, which puts higher requirements on searching engine retrieval information, and the search engine does not only help people to retrieve more information from massive data, but also queries matched information more quickly, efficiently and accurately, and filters information or harmful information which is irrelevant to retrieval.
Pattern matching is required for search engines to perform information retrieval and content filtering operations from large amounts of data. Pattern matching is a basic operation of character strings in a data structure, and aims to find out all pattern strings which are the same as a pattern string from a certain target string after the pattern string is given. The specific operation process is that, assuming that P is a given pattern string and T is a target string to be searched, all pattern strings identical to P are required to be found from T. If one or more pattern strings with P exist in T, the position of the pattern string in T is given, and the matching is called successful, otherwise, the matching fails. The BF algorithm is one of pattern matching common algorithms, and the idea is that the first character of a target string T is matched with the first character of a pattern string P, and if the first character of the target string T is equal to the first character of the pattern string P, the second character of the target string T is continuously compared with the second character of the pattern string P; and if not, comparing the second character of the target string T with the first character of the pattern string P, and sequentially comparing one round by one round until each character in the pattern string P is sequentially equal to one continuous character sequence in the target string T.
To better understand the principle of the conventional BF algorithm, the process of the BF algorithm is described as an example. It should be noted that the example is only for illustrating the algorithm principle, so the target string T in this example is small in length and only 25 bits, and the data format includes only letters and special characters, but in practice, the length of the target string T stored in the database to be traversed and matched is far more than 25, and the data format is more complicated. Given a target string T ═ xyzabxyzaz% yyxyza% xyyy, the length is 25, and the pattern string P ═ xyza% xyyy, the length is 9. Here, the target string T represents mass data stored in a search engine database, and the pattern string P represents a keyword that a user wants to match a search result in a search engine. As shown in fig. 1, the conventional BF algorithm example matching process has the following steps:
(1) 1 st round of matching
Comparing the 1 st character of the pattern string P with the 1 st character of the target string T, P0Is equal to T0Comparison continues down in sequence, P1Is equal to T1,P2Is equal to T2,P3Is equal to T3At comparison of 5, P4Is not equal to T4The match fails. In the 1 st round of matching, the number of comparisons was 5 in total.
(2) 2 nd to 5 th round of matching
Comparing the 1 st character of the pattern string P with the 2 nd character of the target string T, P0Is not equal to T1The match fails. In the 2 nd round of matching, the number of comparisons was 1 in total. According to the idea of BF algorithm, the matching of the 3 rd round, the 4 th round and the 5 th round is continued, and P is found after the matching is respectively compared once0Is not equal to T2,P0Is not equal to T3,P0Is not equal to T4The match failed, so the number of comparisons at the 3 rd round match was 1 in total, the number of comparisons at the 4 th round match was 1 in total, and the number of comparisons at the 5 th round match was 1 in total.
(3) 6 th round of matching
Comparing the 1 st character of the pattern string P with the 6 th character of the target string T, P0Is equal to T5Comparison continues down in sequence, P1Is equal to T6At comparison of 3, P2Is not equal to T7The match fails. In the 6 th round of matching, the number of comparisons was 3 in total.
(4) 7 th round of matching
Comparing the 1 st character of the pattern string P with the 7 th character of the target string T, P0Is not equal to T6The match fails. In the 7 th round of matching, the number of comparisons was 1 in total.
(5) 8 th round of matching
Will pattern the first of the string P1 character is compared with the 8 th character of the target string T, P0Is equal to T7Comparison continues down in sequence, P1Is equal to T8,P2Is equal to T9,P3Is equal to T10At comparison of 5, P4Is not equal to T11The match fails. In the 8 th round of matching, the number of comparisons was 5 in total.
(6) 9 th to 16 th round of matching
Comparing the 1 st character of the pattern string P with the 9 th character of the target string T, P0Is not equal to T8The match fails. In the 9 th round of matching, the number of comparisons was 1 in total. According to the BF algorithm idea, continuously carrying out matching from the 10 th round to the 16 th round, respectively comparing once to find P0Is not equal to T9,P0Is not equal to T10,P0Is not equal to T11,P0Is not equal to T12,P0Is not equal to T13,P0Is not equal to T14,P0Is not equal to T15The matching fails, so the total number of comparisons for each round of matching is 1 in the 10 th to 16 th rounds of matching.
(7) 17 th round of matching
Comparing the 1 st character of the pattern string P with the 17 th character of the target string T, P0Is equal to T16And sequentially continuing to compare downwards to find that each character comparison of the pattern string and the target string is equal, and the matching is successful after 9 times of comparison.
The total number of comparisons from round 1 to round 17 can be added, in this example 35 comparisons are required to achieve a successful match using the conventional BF algorithm.
It can be seen from the above BF conventional algorithm flow that each matching is performed from the first character of the pattern string one by one, and after one matching is completed, the pattern string is moved backward by one character distance with respect to the target string and is continuously matched one by one, and in the whole process, once a certain character fails to be matched, the matching is performed from the beginning, that is, from the next character at the starting point of the current target string, the matching is repeated one by one, so that the comparison times are large, and the matching efficiency is low.
Disclosure of Invention
The invention provides an improved algorithm aiming at the defects that each character of a BF traditional algorithm needs to be compared one by one in each round of matching, the comparison sequence is fixed and single, the comparison times are multiple, the matching efficiency is low and the like, overcomes the defects of the BF algorithm, optimizes the comparison structure and the comparison sequence of a pattern string in the process of finding the pattern string from a target string, reduces the comparison times and improves the matching efficiency.
In order to solve the technical problem, one embodiment of the present invention adopts the following technical solutions:
a specific technical scheme of BF improved algorithm based on search engine retrieval information pattern matching is as follows:
step 1: and extracting some characters or character strings from the pattern string to be matched according to the structural characteristics and the content characteristics of the pattern string to be matched to form a new pattern string. The basic requirement for a new pattern string is that it needs to include the first character and the last character of the pattern string to be matched; the further technical scheme is that the new pattern string also needs to include special characters which are different from other characters in the pattern string to be matched, or the new pattern string also needs to include continuous regular character strings in the pattern string to be matched. Therefore, the new pattern string can be a new pattern string consisting of the first character and the last character in the pattern string to be matched; or a new mode string consisting of a first character, a special character different from other characters and a last character; or a new mode string consisting of a first character, a continuous regular character string and a last character; it can also be a new pattern string composed of the first character, a special character distinguished from other characters, a continuous regular character string and the last character. It is also possible that the first character or the last character of the string of patterns to be matched is a special character or a continuous regular string of characters that is distinguished from other characters. Because the probability of the existence of special characters, regular character strings, continuous character strings and the combination of the special characters, the regular character strings and the continuous character strings in the target string is low, the new mode strings with optimized structures are utilized to match with the target string, the probability of character matching failure can be improved, and the next round of matching can be rapidly carried out. The special characters referred to herein are relative, and usually have a small probability of appearing in the pattern string to be matched, and there may be differences in types, for example, when there are several english letters and a punctuation mark or a number in the pattern string to be matched, then the punctuation mark or the number may be used as a special character, when there are several numbers and an english letter in the pattern string to be matched, then the english letter may be used as a special character, there may be more than one special character in the pattern string to be matched, and here, the selection pattern of the special character is merely exemplified, and not limited. The continuous regular character string may be a character string obtained by repeating one character, or a character string obtained by arranging a plurality of characters in sequence and repeating them, such as XX, XYXY, R1! R1! R1! And the like.
Step 2: and comparing the new mode string with the mode string to be matched, recording the number of characters at intervals among the characters or the character strings, and forming an object by the number of characters corresponding to the characters or the character strings forming the new mode string and the number of characters at intervals among the characters or the character strings.
There is provided a specific operation method, the composition mode of the object being: { N0,N1,N2Either { N } or { N }0,N1,N2,N3,N4……NXX is a positive integer, N0,N1,N2,N3,N4……NXThe values of (A) are natural numbers which can be equal or different from each other, N represents the index of the object, and the subscript is used for distinguishing whether the index is an odd index or an even index; the index with even subscript is the character number of the first character in the pattern string to be matched, or the character number of special characters different from other characters, or the character number of continuous regular character strings, or the character number of the last character; the index with the index odd is the number of characters spaced between two adjacent indexes. Generally, the character corresponding to the even index is pressed left and rightThe characters of the new mode string are arranged together in sequence.
And step 3: firstly, comparing the characters in the new pattern string with the characters at the corresponding positions of the target string, wherein the following comparison sequence is: the even-numbered index in the object controls the number of characters needing to be compared, and the odd-numbered index controls the number of characters needing to be compared in a skipping or interval mode; and if a certain character of the new pattern string is not matched with the target string in the comparison process from left to right, performing the next round of matching according to the same comparison sequence. The left-to-right comparison indicated here is not a single comparison, and may be a right-to-left comparison.
And 4, step 4: if each character in the new pattern string is compared and is equal to the character at the corresponding position in the target string, the comparison is started from the first character which is not compared in the pattern string to be matched, at the moment, the characters of the new pattern string do not need to be compared, and the following comparison sequence is that: the even-numbered index in the object controls the number of characters which are not needed to be compared in a skipping or interval mode, and the odd-numbered index controls the number of characters which are needed to be compared;
and 5: and (4) repeating the steps 3 and 4 until the pattern string to be matched is found in the target string, and terminating the algorithm.
Through the steps, the BF improved algorithm constructs a new pattern string by utilizing the structural characteristics and the content characteristics of the pattern string, the characters in the new pattern string are selected from the whole pattern string (namely, the pattern string to be matched), the BF improved algorithm embodies the content characteristics of the pattern string, only the characters on the characteristic positions are compared at the beginning, other characters are skipped, compared with the BF traditional algorithm, the method can accelerate the failure speed of character matching, generally, the number of the characters of the new pattern string is a fraction or even less of the number of the characters of the whole pattern string, the new pattern string can cause the matching failure in most cases, and if the characters in the new pattern string are matched from other characters skipped before without failure, the characters in the new pattern string can not be compared again.
The BF improved algorithm optimizes the comparison structure and the comparison sequence of the pattern strings, and each round of matching does not need to compare the first character of the pattern strings with the corresponding character in the target string one by one, so that the comparison times can be reduced, and the pattern matching efficiency can be improved.
Drawings
Fig. 1 is a flow chart of a BF conventional algorithm.
Fig. 2 is a flow chart of the BF improvement algorithm of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Fig. 2 is a flowchart of an implementation of the BF improvement algorithm of the present invention, which employs the same target string T ═ xyz bxxyxzaz% yyxza% xyyyy and pattern string P ═ xyza% xyyy as in the conventional BF algorithm, and the BF improvement algorithm includes the following steps:
according to the above-mentioned invention, the new pattern string N is formed by observing that the pattern string P is xyz% xyyy, which is different from the special characters of other characters, and 3 continuous and regular characters yyy connected together, and adding the first character x, the character forming the new pattern string N is x% yy, compared with the original pattern string xyza% xyyy, the number of the 3 characters or the interval characters of the string is 3, 1, respectively, so that the structure of the new pattern string N can be expressed as {1, 3, 1, 1, 3} by the object, wherein the 0 th, 2, 4 even bit indexes represent the number of characters to be compared, the 1 st, 3 rd bit indexes represent the number of characters to be skipped without comparison, that is, during the whole pattern matching process, firstly, pattern matching is performed according to x (comparison) yza% (comparison) x (comparison) yyy (comparison), when the new pattern string N sequentially has mismatch from left to right, the next round of matching is continued until all the characters in the new pattern string N are compared and equal, and then the comparison is started from the first character which is skipped before and has no comparison, and at this time, each round of matching is performed according to x (no comparison) yza (comparison)% (no comparison) x (comparison) yyy (no comparison).
(1) 1 st round of matching
According to x(COMPARATIVE) yza (COMPARATIVE)% (COMPARATIVE) x (COMPARATIVE) yyy (COMPARATIVE) in that order to perform pattern matching, comparing the 1 st character of the pattern string P with the 1 st character of the target string T, P0Is equal to T0Skipping 3 character comparisons, P4Is not equal to T4The match fails. In the 1 st round of matching, the number of comparisons was 2 in total.
(2) 2 nd to 5 th round of matching
Comparing the 1 st character of the pattern string P with the 2 nd character of the target string T, P0Is not equal to T1The match fails. In the 2 nd round of matching, the number of comparisons was 1 in total. According to the idea of BF algorithm, the matching of the 3 rd round, the 4 th round and the 5 th round is continued, and P is found after the matching is respectively compared once0Is not equal to T2,P0Is not equal to T3,P0Is not equal to T4The match failed, so the number of comparisons at the 3 rd round match was 1 in total, the number of comparisons at the 4 th round match was 1 in total, and the number of comparisons at the 5 th round match was 1 in total.
(3) 6 th round of matching
Comparing the 1 st character of the pattern string P with the 6 th character of the target string T, P0Is equal to T5Then skip 3 characters and continue to compare down, P4Is not equal to T9The match fails. In the 6 th round of matching, the number of comparisons was 2 in total.
(4) 7 th round of matching
Comparing the 1 st character of the pattern string P with the 7 th character of the target string T, P0Is not equal to T6The match fails. In the 7 th round of matching, the number of comparisons was 1 in total.
(5) 8 th round of matching
Comparing the 1 st character of the pattern string P with the 8 th character of the target string T, P0Is equal to T7Then skip 3 character comparisons, P4Is not equal to T11The match fails. In the 8 th round of matching, the number of comparisons was 2 in total.
(6) 9 th to 16 th round of matching
The 1 st character of the pattern string P is related to the targetThe 9 th character of string T is compared, P0Is not equal to T8The match fails. The 10 th to 16 th matching was continued, and all the results were found to fail after 1 comparison.
(7) 17 th round of matching
Comparing the 1 st character of the pattern string P with the 17 th character of the target string T, P0Is equal to T17And continuing to compare sequentially downwards to find that each character x% yyy in the new pattern string N is equal to each character at the corresponding position in the target string, and then, starting to compare the characters which are not compared before the first character in the pattern string P is skipped, wherein the comparison sequence is x (not compared) yza% (compared)% (not compared) x (compared) yyy (not compared), the rest characters in the pattern string P are also equal, and the matching is successful and is compared for 9 times in total.
The total number of comparisons from round 1 to round 17 can be added, and in this example, 28 comparisons are required for successful matching using the BF improvement algorithm, which is 7 times less than that of the BF conventional algorithm. For target strings and pattern strings with longer lengths, the advantages of the improved algorithm are more obvious, and the comparison times are more reduced. Characters in the new pattern string N are selected from the whole pattern string P, the content characteristics of the pattern string are reflected, only the characters on the characteristic positions are compared at the beginning, other characters are skipped, compared with the BF traditional algorithm, the method has the advantages that the speed of character matching failure can be increased by sequentially matching every bit from the beginning, the number of the characters of the new pattern string is a fraction of the whole pattern string or even less, the new pattern string can cause matching failure in most cases, and if no failure occurs and the characters in the new pattern string are matched with other characters skipped previously, the characters in the new pattern string do not need to be compared again. By optimizing the comparison structure and the comparison sequence of the pattern strings, the defect that the BF traditional algorithm is mechanically compared with the target strings one by one from the first character of the pattern strings is overcome, the comparison times are reduced, and the matching efficiency is improved.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.
Claims (5)
1. A BF improved algorithm based on search engine retrieval information pattern matching is characterized by comprising the following steps:
step 1: extracting some characters or character strings from the pattern string to be matched according to the structural characteristics and the content characteristics of the pattern string to be matched to form a new pattern string;
step 2: comparing the new pattern string with the pattern string to be matched, recording the number of characters at intervals among the characters or the character strings, and forming an object by the number of characters corresponding to the characters or the character strings forming the new pattern string and the number of characters at intervals among the characters or the character strings;
and step 3: firstly, comparing the characters in the new pattern string with the characters at the corresponding positions of the target string, wherein the following comparison sequence is: the even-numbered index in the object controls the number of characters needing to be compared, and the odd-numbered index controls the number of characters needing to be compared in a skipping or interval mode; if a certain character of the new pattern string is not matched with the target string in the comparison process from left to right, performing the next round of matching according to the same comparison sequence;
and 4, step 4: if each character in the new pattern string is compared and is equal to the character at the corresponding position in the target string, the comparison is started from the first character which is not compared in the pattern string to be matched, at the moment, the characters of the new pattern string do not need to be compared, and the following comparison sequence is that: the even-numbered index in the object controls the number of characters which are not needed to be compared in a skipping or interval mode, and the odd-numbered index controls the number of characters which are needed to be compared;
and 5: and (4) repeating the steps 3 and 4 until the pattern string to be matched is found in the target string, and terminating the algorithm.
2. The BF improvement algorithm based on search engine retrieval information pattern matching as claimed in claim 1, characterized in that the new pattern string needs to include the first character and the last character of the pattern string to be matched.
3. The BF improvement algorithm based on search engine retrieval information pattern matching as claimed in claim 2, characterized in that the new pattern string needs to include special characters distinguished from other characters in the pattern string to be matched.
4. The BF improving algorithm based on search engine retrieval information pattern matching as claimed in claim 2 or 3, wherein the new pattern string is required to include a continuous regular character string in the pattern string to be matched.
5. The BF-improving algorithm based on search engine retrieval information pattern matching according to claim 4, wherein the object's constituent pattern is: { N0,N1,N2Either { N } or { N }0,N1,N2,N3,N4……NXX is a positive integer, N0,N1,N2,N3,N4……NXThe values of (A) are natural numbers, which may be equal or different; the index with even subscript is the character number of the first character in the pattern string to be matched, or the character number of special characters different from other characters, or the character number of continuous regular character strings, or the character number of the last character; the index with the index odd is the number of characters spaced between two adjacent indexes.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010598366.9A CN111814009B (en) | 2020-06-28 | 2020-06-28 | Mode matching method based on search engine retrieval information |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010598366.9A CN111814009B (en) | 2020-06-28 | 2020-06-28 | Mode matching method based on search engine retrieval information |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN111814009A true CN111814009A (en) | 2020-10-23 |
| CN111814009B CN111814009B (en) | 2022-03-01 |
Family
ID=72855468
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202010598366.9A Active CN111814009B (en) | 2020-06-28 | 2020-06-28 | Mode matching method based on search engine retrieval information |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN111814009B (en) |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN112580747A (en) * | 2020-12-29 | 2021-03-30 | 珠海金山网络游戏科技有限公司 | Matching method and device |
Citations (13)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JPH05274184A (en) * | 1992-03-24 | 1993-10-22 | Nec Corp | Execution result comparing device |
| US20080022403A1 (en) * | 2006-07-22 | 2008-01-24 | Tien-Fu Chen | Method and apparatus for a pattern matcher using a multiple skip structure |
| US20110252475A1 (en) * | 2010-02-18 | 2011-10-13 | Raymond Mui | Complementary Character Encoding for Preventing Input Injection in Web Applications |
| CN102750379A (en) * | 2012-06-25 | 2012-10-24 | 华南理工大学 | Fast character string matching method based on filtering type |
| CN102799600A (en) * | 2012-04-10 | 2012-11-28 | 成都网安科技发展有限公司 | Multi-mode matching algorithm and system based on encoding association |
| CN103500178A (en) * | 2013-09-09 | 2014-01-08 | 中国科学院计算机网络信息中心 | Quick multi-mode matching method on worst-case scenario of FS algorithm |
| CN104081669A (en) * | 2012-04-28 | 2014-10-01 | 华为技术有限公司 | Method for repairing and decoding air interface voice frame, and signal source side information acquisition method and device |
| CN104462266A (en) * | 2014-11-21 | 2015-03-25 | 北京京东尚科信息技术有限公司 | Method and system for improving string matching |
| CN104519056A (en) * | 2014-12-15 | 2015-04-15 | 广东科学技术职业学院 | Double-jump-based single mode matching method |
| CN107220333A (en) * | 2017-05-24 | 2017-09-29 | 电子科技大学 | A kind of chracter search method based on Sunday algorithms |
| CN107660283A (en) * | 2015-04-03 | 2018-02-02 | 甲骨文国际公司 | Method and system for implementing a log parser in a log analysis system |
| CN108886367A (en) * | 2016-01-29 | 2018-11-23 | 零点科技公司 | Method, apparatus and system for compression and decompression data |
| CN109977276A (en) * | 2019-03-22 | 2019-07-05 | 华南理工大学 | A kind of single pattern matching method based on Sunday algorithm improvement |
-
2020
- 2020-06-28 CN CN202010598366.9A patent/CN111814009B/en active Active
Patent Citations (13)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JPH05274184A (en) * | 1992-03-24 | 1993-10-22 | Nec Corp | Execution result comparing device |
| US20080022403A1 (en) * | 2006-07-22 | 2008-01-24 | Tien-Fu Chen | Method and apparatus for a pattern matcher using a multiple skip structure |
| US20110252475A1 (en) * | 2010-02-18 | 2011-10-13 | Raymond Mui | Complementary Character Encoding for Preventing Input Injection in Web Applications |
| CN102799600A (en) * | 2012-04-10 | 2012-11-28 | 成都网安科技发展有限公司 | Multi-mode matching algorithm and system based on encoding association |
| CN104081669A (en) * | 2012-04-28 | 2014-10-01 | 华为技术有限公司 | Method for repairing and decoding air interface voice frame, and signal source side information acquisition method and device |
| CN102750379A (en) * | 2012-06-25 | 2012-10-24 | 华南理工大学 | Fast character string matching method based on filtering type |
| CN103500178A (en) * | 2013-09-09 | 2014-01-08 | 中国科学院计算机网络信息中心 | Quick multi-mode matching method on worst-case scenario of FS algorithm |
| CN104462266A (en) * | 2014-11-21 | 2015-03-25 | 北京京东尚科信息技术有限公司 | Method and system for improving string matching |
| CN104519056A (en) * | 2014-12-15 | 2015-04-15 | 广东科学技术职业学院 | Double-jump-based single mode matching method |
| CN107660283A (en) * | 2015-04-03 | 2018-02-02 | 甲骨文国际公司 | Method and system for implementing a log parser in a log analysis system |
| CN108886367A (en) * | 2016-01-29 | 2018-11-23 | 零点科技公司 | Method, apparatus and system for compression and decompression data |
| CN107220333A (en) * | 2017-05-24 | 2017-09-29 | 电子科技大学 | A kind of chracter search method based on Sunday algorithms |
| CN109977276A (en) * | 2019-03-22 | 2019-07-05 | 华南理工大学 | A kind of single pattern matching method based on Sunday algorithm improvement |
Non-Patent Citations (6)
| Title |
|---|
| C. VAMSI KRISHNA 等: "Novel approach for string searching and matching using American standard code for information interchange value", 《2016 INTERNATIONAL CONFERENCE ON RECENT TRENDS IN INFORMATION TECHNOLOGY (ICRTIT)》 * |
| DANIAR HERI KURNIAWAN 等: "A new string matching algorithm based on logical indexing", 《2015 INTERNATIONAL CONFERENCE ON ELECTRICAL ENGINEERING AND INFORMATICS (ICEEI)》 * |
| 俞昕: "基于函数属性的二进制文件快速比对", 《中国优秀硕士学位论文全文数据库 信息科技辑》 * |
| 刘雨心 等: "一种改进的Sunday字符串匹配算法", 《太原理工大学学报》 * |
| 王成 等: "一种改进的字符串匹配算法", 《计算机工程》 * |
| 邢凯: "基于k-匿名的位置隐私保护方法研究", 《中国优秀硕士学位论文全文数据库 信息科技辑》 * |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN112580747A (en) * | 2020-12-29 | 2021-03-30 | 珠海金山网络游戏科技有限公司 | Matching method and device |
Also Published As
| Publication number | Publication date |
|---|---|
| CN111814009B (en) | 2022-03-01 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US8200646B2 (en) | Efficient retrieval of variable-length character string data | |
| CN102867040B (en) | Chinese search engine mixed speech-oriented query error correction method and system | |
| US7912818B2 (en) | Web graph compression through scalable pattern mining | |
| US7756847B2 (en) | Method and arrangement for searching for strings | |
| US7882109B2 (en) | Computer representation of a data tree structure and the associated encoding/decoding methods | |
| JPH11212980A (en) | Production of index and retrieval method | |
| WO2008053583A1 (en) | Bit sequence searching method and program | |
| CN114238257B (en) | Log processing methods, log processing devices and electronic equipment | |
| JP6072922B2 (en) | Character string search device, character string search method, and character string search program | |
| CN111814009B (en) | Mode matching method based on search engine retrieval information | |
| JP2009512099A (en) | Method and apparatus for restartable hashing in a try | |
| KR101089722B1 (en) | Prefix tree-based indexing method and apparatus, recording medium thereof | |
| CN113065419B (en) | Pattern matching algorithm and system based on flow high-frequency content | |
| CN117914824A (en) | Active IPv6 address detection method and device based on multi-level association strategy | |
| JP2012018592A (en) | Sequence analysis apparatus, sequence analysis method and computer program | |
| JPH11203315A (en) | Method for retrieving symbol column and device therefor and recording medium for recording symbol column retrieval program | |
| CN113010882B (en) | Custom position sequence pattern matching method suitable for cache loss attack | |
| CN116483959A (en) | Method, device and medium for retrieving question-related subgraphs based on evidence graph schema | |
| JP4510041B2 (en) | Document search system and program | |
| CN109657108B (en) | Domain name asset data storage and query method and system | |
| CN113609342A (en) | Data storage method | |
| CN119003835A (en) | Efficient binary character string matching method based on KMP algorithm | |
| CN113609341A (en) | Method for generating data dictionary | |
| CN113691352A (en) | Data segmentation method | |
| KR970066950A (en) | Information retrieval device for searching sentences to search for strings matching keywords |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| SE01 | Entry into force of request for substantive examination | ||
| GR01 | Patent grant | ||
| GR01 | Patent grant |