CN116186073A - A Method of Improving Data Retrieval Speed - Google Patents
A Method of Improving Data Retrieval Speed Download PDFInfo
- Publication number
- CN116186073A CN116186073A CN202310000018.0A CN202310000018A CN116186073A CN 116186073 A CN116186073 A CN 116186073A CN 202310000018 A CN202310000018 A CN 202310000018A CN 116186073 A CN116186073 A CN 116186073A
- Authority
- CN
- China
- Prior art keywords
- business
- data
- physical view
- query
- field
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2453—Query optimisation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24552—Database cache management
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2468—Fuzzy queries
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/248—Presentation of query results
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Computational Linguistics (AREA)
- Fuzzy Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Automation & Control Theory (AREA)
- Probability & Statistics with Applications (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
本发明属于数据检索技术领域,具体涉及一种提高数据检索速度的方法,包括以下步骤:S1、生成多表模糊查询请求;S2、搭建业务物理视图服务;S3、降维多表模糊查询请求,对业务物理视图进行单表查询;其中,业务物理视图为对同一类业务功能归类为单一物理视图。本发明通过建立一个业务物理视图服务,实时的创建和更新待查询的业务物理视图,并对业务物理视图进行全文索引,为业务查询提供单物理视图快速模糊查询,使模糊查询不受左右模糊查询的限制,查询效率从几十秒上百秒的查询速度下降到几十毫秒,查询效率极大地提高,大大的提高了用户体验,特别是针对多表联合查询,多字段同时查询更体现出查询的优势。
The invention belongs to the technical field of data retrieval, and specifically relates to a method for increasing data retrieval speed, comprising the following steps: S1, generating a multi-table fuzzy query request; S2, building a business physical view service; S3, reducing dimensionality and multi-table fuzzy query requests, Perform single-table query on the business physical view; among them, the business physical view is to classify the same type of business function as a single physical view. The present invention establishes a business physical view service, creates and updates the business physical view to be queried in real time, and performs full-text indexing on the business physical view, and provides fast fuzzy query of a single physical view for business query, so that the fuzzy query is not subject to left and right fuzzy query The query efficiency drops from tens of seconds to hundreds of seconds to tens of milliseconds, which greatly improves the query efficiency and greatly improves the user experience, especially for multi-table joint query, multi-field simultaneous query reflects the query The advantages.
Description
技术领域technical field
本发明属于数据检索技术领域,具体涉及一种提高数据检索速度的方法。The invention belongs to the technical field of data retrieval, and in particular relates to a method for improving data retrieval speed.
背景技术Background technique
在使用MySQL关系型数据库的TMS系统或CRM系统等开发过程中,客户经常需要对某些信息进行模糊查询,例如,小程序的运单列表界面有个查询输入框,需要在这个查询输入框查找包含“21”字符的运单号或手机号或车牌号或地址所对应的所有运单,这种查询通常会涉及到多个业务表:包含运单号的运单表,包含手机号的帐号表,包含车牌号的运单运力表,包含地址信息的货源表等数据表,需要多表联合进行多字段的模糊查询。当各种表数据量比较大的时候,进行这样的查询会非常缓慢,经常因数据查询时间过长引起接口超时,导致功能无法正常使用,同时在高并发情况下查询会导致数据库性能急剧下降。During the development of TMS systems or CRM systems using MySQL relational databases, customers often need to perform fuzzy queries on certain information. For example, there is a query input box on the waybill list interface of the applet. For all the waybills corresponding to the waybill number or mobile phone number or license plate number or address of the "21" character, this query usually involves multiple business tables: the waybill table containing the waybill number, the account table containing the mobile phone number, including Data tables such as the waybill capacity table of the license plate number and the supply table containing address information require multiple tables to be combined to perform multi-field fuzzy queries. When the amount of data in various tables is relatively large, it will be very slow to perform such a query. The interface often times out due to the long data query time, resulting in the function not being used normally. At the same time, querying under high concurrency will lead to a sharp drop in database performance.
为了避免出现上述查询缓慢的情况,目前大部分方法都是利用MySQL特点,为了保证能够利用索引在开发中禁止左模糊查询,只能进行右模糊查询;或者是利用搜索引擎系统,例如ES进行全文搜索。但这两种方法都有弊端,禁止左模糊查询会导致实际场景中不知道开头字符导致无法查询;而ES的全文搜索对于像TMS或CRM系统并非全部字段数据都是纯文本的情况不太适用。In order to avoid the above-mentioned slow query situation, most of the current methods use the characteristics of MySQL. In order to ensure that the index can be used to prohibit left fuzzy query during development, only right fuzzy query can be performed; or use search engine systems, such as ES for full text search. However, these two methods have disadvantages. Prohibition of left fuzzy query will lead to the inability to know the beginning character in the actual scene and make the query impossible; and the full-text search of ES is not suitable for situations where not all field data in TMS or CRM systems are plain text .
因此,针对上述技术问题,有必要提供一种提高数据检索速度的方法。Therefore, in view of the above technical problems, it is necessary to provide a method for improving the speed of data retrieval.
发明内容Contents of the invention
本发明的目的在于提供一种提高数据检索速度的方法,以解决上述的多表联合模糊查询数据速度慢、效率低的问题。The purpose of the present invention is to provide a method for improving data retrieval speed, so as to solve the above-mentioned problems of slow data speed and low efficiency of multi-table joint fuzzy query.
为了实现上述目的,本发明一实施例提供的技术方案如下:In order to achieve the above object, the technical solution provided by an embodiment of the present invention is as follows:
一种提高数据检索速度的方法,包括以下步骤:A method of increasing data retrieval speed, comprising the steps of:
S1、生成多表模糊查询请求;S1. Generate a multi-table fuzzy query request;
S2、搭建业务物理视图服务;S2. Building a business physical view service;
S3、降维多表模糊查询请求,对业务物理视图进行单表查询;S3, Dimensionality reduction and multi-table fuzzy query request, perform single-table query on the business physical view;
其中,业务物理视图为对同一类业务功能归类为单一物理视图,存储到业务物理视图中的数据是在存储前先对业务表数据进行业务逻辑处理,用于获得可供直接查询的数据。Among them, the business physical view classifies the same type of business functions into a single physical view, and the data stored in the business physical view is to perform business logic processing on the business table data before storage to obtain data that can be directly queried.
进一步地,所述S2中的业务物理视图服务包括探针服务、Kafka缓存服务和物理视图处理逻辑,所述探针服务用于作为一个MySQL数据库的从库,实时监控主库数据的变化,并将收集到的变化数据发往所述Kafka服务进行缓存,探针服务可以很好的解决业务物理视图数据与业务表数据同步,避免业务表数据变化了,而业务物理视图表数据没有及时更新的问题,同时无需修改原业务系统的逻辑,使业务系统与业务物理视图服务进行解藕,方便开发和扩展;Further, the business physical view service in S2 includes probe service, Kafka cache service and physical view processing logic, and the probe service is used as a slave library of a MySQL database to monitor changes in master database data in real time, and Send the collected change data to the Kafka service for caching. The probe service can well solve the problem of synchronizing the business physical view data with the business table data, and avoid the business table data changing, but the business physical view table data is not updated in time At the same time, there is no need to modify the logic of the original business system, so that the business system and the business physical view service can be decoupled to facilitate development and expansion;
所述Kafka缓存服务用于存储通过所述探针服务获取的业务表变化数据,所述Kafka缓存服务的高吞吐量及高缓存可以保证数据不丢失,作为消费端的物理视图处理逻辑服务如果重启,也不会丢失缓存的数据,重启后可以继续从重启前的消费点继续消费,保证业务表的变化数据能完整提供给物理视图处理逻辑进行处理。The Kafka cache service is used to store the business table change data acquired through the probe service. The high throughput and high cache of the Kafka cache service can ensure that data will not be lost. If the physical view processing logic service of the consumer is restarted, The cached data will not be lost, and consumption can continue from the consumption point before the restart after restarting, ensuring that the changed data of the business table can be completely provided to the physical view processing logic for processing.
进一步地,所述物理视图处理逻辑包括以下操作步骤:Further, the physical view processing logic includes the following steps:
S201、对业务查询结果进行分类,对每一类查询结果建立一个业务物理视图表,用于确定建立业务物理视图表的种类;S201. Classify the business query results, and establish a business physical view table for each type of query results, to determine the type of business physical view table to be established;
S202、经过业务分类后,确定每张业务物理视图表对应的字段;S202. After the business classification, determine the fields corresponding to each business physical view table;
S203、若模糊查询同时包含多个字段的结果,则在业务物理视图表增加一个汇总字段,并存储多个字段的汇总值;S203. If the fuzzy query contains the results of multiple fields at the same time, add a summary field to the business physical view table, and store the summary values of the multiple fields;
S204、对业务物理视图的字段及汇总字段分别建立全文索引;S204. Establish full-text indexes for the fields and summary fields of the business physical view;
S205、实时更新业务物理视图数据;S205. Update the business physical view data in real time;
S206、业务处理逻辑进行数据处理加工;S206. The business processing logic performs data processing;
S207、加工后的数据通过业务物理视图更新模块保存到业务物理视图表中。S207. The processed data is saved in the business physical view table through the business physical view update module.
进一步地,所述S202中的业务物理视图表是在多个业务数据表基础上建立业务物理视图表,是从各业务表根据过滤条件涉及的字段抽离而成,是各业务表的数据项子集。Further, the business physical view table in S202 is a business physical view table based on multiple business data tables, which is extracted from the fields involved in each business table according to the filter conditions, and is the data item of each business table Subset.
进一步地,所述S202中业务物理视图表的设计原则如下:Further, the design principle of the service physical view table in S202 is as follows:
S2021、同类业务确定一个主Key,建立一张对应的业务物理视图表;S2021. Determine a master key for similar services, and create a corresponding service physical view table;
S2022、同类业务各过滤条件对应的字段,从对应的业务表抽出对应字段作为业务物理视图表的字段。S2022. For the fields corresponding to the filter conditions of the same kind of business, extract the corresponding fields from the corresponding business table as the fields of the business physical view table.
进一步地,所述S204中MySQL默认支持2个字的字段级全文索引,要求在用户查询界面输入至少2个字才能进行查询,如果需要支持单字的模糊查询,则需要修改MySQL里面的配置,以支持单字模糊查询。Further, in the above-mentioned S204, MySQL supports a field-level full-text index of 2 characters by default, and it is required to input at least 2 characters in the user query interface to perform a query. If fuzzy query of a single character needs to be supported, the configuration in MySQL needs to be modified to Support single word fuzzy query.
进一步地,因为汇总字段是由多个字段数据合并产生的数据,为了对汇总字段能使用全文索引,必须对汇总字段内容的组成做特殊处理,即所述S204中汇总字段值=字段1值+空格+字段2值+空格+…+空格+字段N值,在各字段值之间添加空格,用于防止在进行汇总字段全文索引时前后两个字段值拼在一起导致查询结果值分属2个字段;Further, because the summary field is data generated by merging multiple field data, in order to use the full-text index for the summary field, special processing must be done to the composition of the summary field content, that is, the summary field value in S204 =
例如:需要查找车牌号或手机号包含“31”字符的运单,汇总字段里面数据由车牌“粤Bxxxx3”、承运人手机号“132456yyyyy”拼接而成,如果不加空格时汇总字段值为“粤Bxxxx3132456yyyyy”,如果进行查询时会将该数据查出,但实际上它是不符合条件的数据,如果加上空格后,汇总字段值变为“粤Bxxxx3 132456yyyyy”,这样就不会查询出来。For example: you need to find the waybill whose license plate number or mobile phone number contains the character "31". The data in the summary field is composed of the license plate "粤Bxxxx3" and the carrier's mobile phone number "132456yyyyy". If no spaces are added, the summary field value is "粤Bxxxx3132456yyyyy", if the query is performed, the data will be found out, but in fact it is not eligible data, if a space is added, the summary field value will become "Guangdong Bxxxx3 132456yyyyy", so it will not be queried.
进一步地,所述S205具体包括以下操作步骤:Further, the S205 specifically includes the following steps:
S2051、更新数据调度模块消费Kafka缓存服务缓存的业务表的更新数据;S2051. Update the data scheduling module to consume the update data of the service table cached by the Kafka cache service;
S2052、调度程序将更新数据交由对应的业务处理逻辑进行数据处理。S2052. The scheduler transfers the updated data to the corresponding business processing logic for data processing.
进一步地,所述S206中的业务处理逻辑是根据与主Key对应的本业务表的数据进行关联,获取业务物理视图中本表对应字段的数据,再按照查询条件对数据进行业务逻辑加工。Further, the business processing logic in S206 is to associate according to the data of the business table corresponding to the main key, obtain the data of the corresponding fields of the table in the business physical view, and then process the data according to the query conditions.
进一步地,所述S206中的数据处理加工模式包括:Further, the data processing mode in S206 includes:
S2061、提取主Key对应业务字段的值即为目标数据,无需进一步加工;S2061, extracting the value of the business field corresponding to the main Key is the target data, without further processing;
S2062、提取主Key对应业务字段的值,需要根据业务逻辑的上下文进行运算,得到目标数据;S2062, extracting the value of the business field corresponding to the main Key needs to be calculated according to the context of the business logic to obtain the target data;
S2063、提取主Key对应多业务表字段值进行汇总拼接,拼接模式按照拼接公式,得到目标数据;S2063, extracting the field values corresponding to the main Key to perform summary splicing, and the splicing mode is according to the splicing formula to obtain the target data;
对数据进行业务逻辑加工是为了在存入业务物理视图表前就生成可供直接查询的数据,像按照上面的汇总字段值处理逻辑,通过数据拼接方式,将多个字段值拼在一起,通过一个字段就可以查询,不用多个字段去分别查询,极大的提高查询速度。The business logic processing of data is to generate data that can be directly queried before it is stored in the business physical view table. For example, according to the above summary field value processing logic, multiple field values are stitched together through data splicing. One field can be queried, instead of multiple fields to be queried separately, which greatly improves the query speed.
与现有技术相比,本发明具有以下优点:Compared with the prior art, the present invention has the following advantages:
本发明通过建立一个业务物理视图服务,实时的创建和更新待查询的业务物理视图,并对业务物理视图进行全文索引,为业务查询提供单物理视图快速模糊查询,使模糊查询不受左右模糊查询的限制,查询效率从几十秒上百秒的查询速度下降到几十毫秒,查询效率极大地提高,大大的提高了用户体验,特别是针对多表联合查询,多字段同时查询更体现出查询的优势。The present invention establishes a business physical view service, creates and updates the business physical view to be queried in real time, and performs full-text indexing on the business physical view, and provides fast fuzzy query of a single physical view for business query, so that fuzzy query is not affected by left and right fuzzy query The query efficiency drops from tens of seconds to hundreds of seconds to tens of milliseconds, which greatly improves the query efficiency and greatly improves the user experience, especially for multi-table joint query, multi-field simultaneous query reflects the query The advantages.
附图说明Description of drawings
为了更清楚地说明本发明实施例或现有技术中的技术方案,下面将对实施例或现有技术描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图仅仅是本发明中记载的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。In order to more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the following will briefly introduce the drawings that need to be used in the description of the embodiments or the prior art. Obviously, the accompanying drawings in the following description are only These are some embodiments described in the present invention. Those skilled in the art can also obtain other drawings based on these drawings without creative work.
图1为本发明一实施例中一种提高数据检索速度的方法的工作原理图;Fig. 1 is a working principle diagram of a method for improving data retrieval speed in an embodiment of the present invention;
图2为本发明一实施例中物理视图处理逻辑的部分结构示意图。FIG. 2 is a schematic diagram of a partial structure of physical view processing logic in an embodiment of the present invention.
具体实施方式Detailed ways
以下将结合附图所示的各实施方式对本发明进行详细描述。但该等实施方式并不限制本发明,本领域的普通技术人员根据该等实施方式所做出的结构、方法或功能上的变换均包含在本发明的保护范围内。The present invention will be described in detail below in conjunction with various embodiments shown in the drawings. However, these embodiments do not limit the present invention, and any structural, method or functional changes made by those skilled in the art according to these embodiments are included in the protection scope of the present invention.
本发明公开了一种提高数据检索速度的方法,参考图1-图2所示,包括以下步骤:The invention discloses a method for improving data retrieval speed, as shown in Fig. 1-Fig. 2, comprising the following steps:
S1、生成多表模糊查询请求;S1. Generate a multi-table fuzzy query request;
S2、搭建业务物理视图服务;S2. Building a business physical view service;
S3、降维多表模糊查询请求,对业务物理视图进行单表查询;S3, Dimensionality reduction and multi-table fuzzy query request, perform single-table query on the business physical view;
其中,业务物理视图为对同一类业务功能归类为单一物理视图,存储到业务物理视图中的数据是在存储前先对业务表数据进行业务逻辑处理,用于获得可供直接查询的数据。Among them, the business physical view classifies the same type of business functions into a single physical view, and the data stored in the business physical view is to perform business logic processing on the business table data before storage to obtain data that can be directly queried.
通过采用探针服务与业务物理视图结合,将多表联合、多字段模糊查询降维为业务物理视图的单表字段查询,查询集中在业务物理视图表,可有效避免关系型数据库的多表联表,多字段的模糊匹配引起索引失效的缺陷。By combining the probe service with the business physical view, the multi-table union and multi-field fuzzy query are reduced to the single-table field query of the business physical view, and the query is concentrated in the business physical view table, which can effectively avoid the multi-table join of the relational database. Table, the defect that the fuzzy matching of multiple fields causes the index to fail.
例如,在TMS系统中,通过各种条件(运单号,承运人手机号、车牌号、装卸货地等)模糊查找运单列表,归为一类,都是要找出对应运单号列表,只不过查询条件不同,所以建立以运单号为主Key、查询条件对应字段组成其他字段的物理视图表,业务逻辑仅对业务物理视图表进行查询即可获得所需数据。For example, in the TMS system, through various conditions (waybill number, carrier’s mobile phone number, license plate number, loading and unloading location, etc.) to fuzzily search the list of waybills, and classify them into one category, it is necessary to find the list of corresponding waybill numbers. However, the query conditions are different, so a physical view table with the waybill number as the main key and other fields composed of fields corresponding to the query conditions is established. The business logic can only query the business physical view table to obtain the required data.
其中,所述S2中的业务物理视图服务包括探针服务、Kafka缓存服务和物理视图处理逻辑,所述探针服务用于作为一个MySQL数据库的从库,实时监控主库数据的变化,并将收集到的变化数据发往所述Kafka服务进行缓存,探针服务可以很好的解决业务物理视图数据与业务表数据同步,避免业务表数据变化了,而业务物理视图表数据没有及时更新的问题,同时无需修改原业务系统的逻辑,使业务系统与业务物理视图服务进行解藕,方便开发和扩展。Wherein, the business physical view service in S2 includes a probe service, a Kafka cache service, and physical view processing logic, and the probe service is used as a slave database of a MySQL database to monitor changes in master database data in real time, and The collected change data is sent to the Kafka service for caching. The probe service can well solve the problem of synchronizing the business physical view data with the business table data, avoiding the problem that the business table data changes, but the business physical view table data is not updated in time At the same time, there is no need to modify the logic of the original business system, so that the business system and the business physical view service are decoupled, which is convenient for development and expansion.
另外,所述Kafka缓存服务用于存储通过所述探针服务获取的业务表变化数据,所述Kafka缓存服务的高吞吐量及高缓存可以保证数据不丢失,作为消费端的物理视图处理逻辑服务如果重启,也不会丢失缓存的数据,重启后可以继续从重启前的消费点继续消费,保证业务表的变化数据能完整提供给物理视图处理逻辑进行处理。In addition, the Kafka cache service is used to store the business table change data obtained through the probe service. The high throughput and high cache of the Kafka cache service can ensure that data will not be lost. Restarting will not lose the cached data. After restarting, you can continue to consume from the consumption point before restarting, ensuring that the changed data of the business table can be completely provided to the physical view processing logic for processing.
参考图1-图2所示,物理视图处理逻辑是整个服务的核心,所述物理视图处理逻辑包括以下操作步骤:Referring to Figures 1-2, the physical view processing logic is the core of the entire service, and the physical view processing logic includes the following steps:
S201、对业务查询结果进行分类,对每一类查询结果建立一个业务物理视图表,用于确定建立业务物理视图表的种类;S201. Classify the business query results, and establish a business physical view table for each type of query results, to determine the type of business physical view table to be established;
例如,如果TMS系统需要对货源列表和运单列表两项功能提供模糊查询功能,对运单列表如果需要对运单编号、承运人手机号、车牌号、货名等进行模糊查询,不管哪个条件进行查询,共性都将返回运单号列表,所以建立一个以运单号为主Key的业务物理视图表,货源列表中如果需要对详细地址、收货人姓名、货物名称进行模糊查询,需要建立一个以货源编号作为主Key的业务物理视图表,如此类推。For example, if the TMS system needs to provide a fuzzy query function for the two functions of the source list and the waybill list, and for the waybill list, if it needs to perform fuzzy query on the waybill number, carrier mobile phone number, license plate number, cargo name, etc., no matter which condition is used for query, The commonality will return the waybill number list, so create a business physical view table with the waybill number as the main key. If you need to fuzzy query the detailed address, consignee name, and cargo name in the source list, you need to create a source of goods The number is used as the business physical view table of the main key, and so on.
S202、经过业务分类后,确定每张业务物理视图表对应的字段;S202. After the business classification, determine the fields corresponding to each business physical view table;
S203、若模糊查询同时包含多个字段的结果,则在业务物理视图表增加一个汇总字段,并存储多个字段的汇总值;S203. If the fuzzy query contains the results of multiple fields at the same time, add a summary field to the business physical view table, and store the summary values of the multiple fields;
S204、对业务物理视图的字段及汇总字段分别建立全文索引;S204. Establish full-text indexes for the fields and summary fields of the business physical view;
S205、实时更新业务物理视图数据;S205. Update the business physical view data in real time;
S206、业务处理逻辑进行数据处理加工;S206. The business processing logic performs data processing;
S207、加工后的数据通过业务物理视图更新模块保存到业务物理视图表中。S207. The processed data is saved in the business physical view table through the business physical view update module.
其中,所述S202中的业务物理视图表是在多个业务数据表基础上建立业务物理视图表,是从各业务表根据过滤条件涉及的字段抽离而成,是各业务表的数据项子集。Wherein, the business physical view table in S202 is a business physical view table established on the basis of multiple business data tables, which is extracted from the fields involved in each business table according to the filter conditions, and is the data item of each business table. set.
另外,所述S202中业务物理视图表的设计原则如下:In addition, the design principle of the business physical view table in S202 is as follows:
S2021、同类业务确定一个主Key,建立一张对应的业务物理视图表;S2021. Determine a master key for similar services, and create a corresponding service physical view table;
S2022、同类业务各过滤条件对应的字段,从对应的业务表抽出对应字段作为业务物理视图表的字段。S2022. For the fields corresponding to the filter conditions of the same kind of business, extract the corresponding fields from the corresponding business table as the fields of the business physical view table.
具体地,所述S204中MySQL默认支持2个字的字段级全文索引,要求在用户查询界面输入至少2个字才能进行查询,如果需要支持单字的模糊查询,则需要修改MySQL里面的配置,以支持单字模糊查询。Specifically, in the above-mentioned S204, MySQL supports field-level full-text indexing of 2 characters by default, and it is required to input at least 2 characters in the user query interface to perform a query. If fuzzy query of a single character needs to be supported, the configuration in MySQL needs to be modified to Support single word fuzzy query.
因为汇总字段是由多个字段数据合并产生的数据,为了对汇总字段能使用全文索引,必须对汇总字段内容的组成做特殊处理,即所述S204中汇总字段值=字段1值+空格+字段2值+空格+…+空格+字段N值,在各字段值之间添加空格,用于防止在进行汇总字段全文索引时前后两个字段值拼在一起导致查询结果值分属2个字段。Because the summary field is the data generated by merging multiple field data, in order to use the full-text index for the summary field, special processing must be done to the composition of the summary field content, that is, the summary field value in S204 =
例如:需要查找车牌号或手机号包含“31”字符的运单,汇总字段里面数据由车牌“粤Bxxxx3”、承运人手机号“132456yyyyy”拼接而成,如果不加空格时汇总字段值为“粤Bxxxx3132456yyyyy”,如果进行查询时会将该数据查出,但实际上它是不符合条件的数据,如果加上空格后,汇总字段值变为“粤Bxxxx3132456yyyyy”,这样就不会查询出来。For example: you need to find the waybill whose license plate number or mobile phone number contains the character "31". The data in the summary field is composed of the license plate "粤Bxxxx3" and the carrier's mobile phone number "132456yyyyy". If no spaces are added, the summary field value is "粤Bxxxx3132456yyyyy", if the query is performed, the data will be found out, but in fact it is not eligible data, if a space is added, the summary field value will become "Guangdong Bxxxx3132456yyyyy", so it will not be queried.
此外,所述S205具体包括以下操作步骤:In addition, the S205 specifically includes the following steps:
S2051、更新数据调度模块消费Kafka缓存服务缓存的业务表的更新数据;S2051. Update the data scheduling module to consume the update data of the service table cached by the Kafka cache service;
S2052、调度程序将更新数据交由对应的业务处理逻辑进行数据处理。S2052. The scheduler transfers the updated data to the corresponding business processing logic for data processing.
再者,所述S206中的业务处理逻辑是根据与主Key对应的本业务表的数据进行关联,获取业务物理视图中本表对应字段的数据,再按照查询条件对数据进行业务逻辑加工。Furthermore, the business processing logic in S206 is to associate according to the data of the business table corresponding to the main Key, obtain the data of the corresponding field of the table in the business physical view, and then perform business logic processing on the data according to the query conditions.
进一步地,所述S206中的数据处理加工模式包括:Further, the data processing mode in S206 includes:
S2061、提取主Key对应业务字段的值即为目标数据,无需进一步加工;S2061, extracting the value of the business field corresponding to the main Key is the target data, without further processing;
S2062、提取主Key对应业务字段的值,需要根据业务逻辑的上下文进行运算,得到目标数据;S2062, extracting the value of the business field corresponding to the main Key needs to be calculated according to the context of the business logic to obtain the target data;
S2063、提取主Key对应多业务表字段值进行汇总拼接,拼接模式按照拼接公式,得到目标数据。S2063, extracting the field values corresponding to the main Key to perform summary splicing, and the splicing mode is according to the splicing formula to obtain the target data.
对数据进行业务逻辑加工是为了在存入业务物理视图表前就生成可供直接查询的数据,像按照上面的汇总字段值处理逻辑,通过数据拼接方式,将多个字段值拼在一起,通过一个字段就可以查询,不用多个字段去分别查询,极大的提高查询速度。The business logic processing of data is to generate data that can be directly queried before it is stored in the business physical view table. For example, according to the above summary field value processing logic, multiple field values are stitched together through data splicing. One field can be queried, instead of multiple fields to be queried separately, which greatly improves the query speed.
一种提高数据检索速度的方法,具体包括以下步骤:A method for improving data retrieval speed, specifically comprising the following steps:
步骤1、分析需要进行单/多表查询的业务,进行分类汇总,整理出需要建立的业务物理视图表及相关的字段,确定每一类业务物理视图表的主Key;
步骤2、在MySQL中创建业务物理视图表,对需要进行模糊查询的字段及汇总字段设置带有ngram解析器的全文索引,可以根据业务要求对查询输入最少字数进行查询限制,也可相应修改MySQL的全文索引配置以适应业务的需求;
步骤3、搭建探针服务,以MySQL从库方式运行,可针对需要监控的业务表进行相应字段监控,包括监控插入、删除、更新等数据库的操作;
步骤4、配置探针服务与Kafka的连接,探针服务在监控到业务表发生数据变化时,将变化的数据快速推送到Kafka;
步骤5、更新数据调度逻辑从Kafka消费更新数据,根据数据表信息,交由表对应的业务处理逻辑进行处理;
步骤6、表业务处理逻辑负责提取的根据主Key关联本数据表对应字段的值,结合查询条件的业务逻辑要求进行数据的加工处理,包括汇总拼接数据处理,形成用于可供直接查询的数据,提交给业务物理视图更新模块;
步骤7、业务物理视图更新模块将上述处理好的数据存储到物理视图表中,由于对字段及汇总字段设置了全文索引,MySQL自动会对内容进行分词索引;
步骤8、业务系统在进行多表模糊查询时,需要改变原有的用LIKE“%xxx%”的进行模糊查询方式,改用SELECT*FROM业务物理视图表WHERE MATCH(物理视图字段)AGAINST(“查询内容”IN BOOLEAN MODE)模式进行查询。
本发明通过采用探针服务与业务物理视图结合,充分利用物理视图和MySQL全文索引的优势,支持业务系统进行多表联合多字段查询;利用探针服务可以使业务物理视图的数据能得到实时更新,通过业务处理逻辑将最终结果数据以物理视图模式存储,使业务系统在查询时不用涉及到多表的联合操作,只需进行物理视图表的单表操作;对需要进行查询的字段及汇总字段等进行全文索引可有效利用MySQL的数据索引功能,防止用LIKE左右模糊查询导致索引失效而引发全表扫描的弊端,最大限度的提升系统性能及效率,大大提高了用户的查询体验,本发明也可以用于模糊查询的其他场景。The present invention combines the probe service with the business physical view, fully utilizes the advantages of the physical view and MySQL full-text index, and supports the business system to perform multi-table joint multi-field query; the data of the business physical view can be updated in real time by using the probe service , the final result data is stored in the physical view mode through the business processing logic, so that the business system does not need to involve the joint operation of multiple tables when querying, but only needs to perform single-table operations on the physical view table; the fields that need to be queried and the summary fields Full-text indexing can effectively utilize the data indexing function of MySQL, prevent the disadvantages of full-table scanning caused by fuzzy queries caused by LIKE left and right, improve system performance and efficiency to the greatest extent, and greatly improve user query experience. Other scenarios that can be used for fuzzy queries.
对于本领域技术人员而言,显然本发明不限于上述示范性实施例的细节,而且在不背离本发明的精神或基本特征的情况下,能够以其他的具体形式实现本发明。因此,无论从哪一点来看,均应将实施例看作是示范性的,而且是非限制性的,本发明的范围由所附权利要求而不是上述说明限定,因此旨在将落在权利要求的等同要件的含义和范围内的所有变化囊括在本发明内。不应将权利要求中的任何附图标记视为限制所涉及的权利要求。It will be apparent to those skilled in the art that the invention is not limited to the details of the above-described exemplary embodiments, but that the invention can be embodied in other specific forms without departing from the spirit or essential characteristics of the invention. Accordingly, the embodiments should be regarded in all points of view as exemplary and not restrictive, the scope of the invention being defined by the appended claims rather than the foregoing description, and it is therefore intended that the scope of the invention be defined by the appended claims rather than by the foregoing description. All changes within the meaning and range of equivalents of the elements are embraced in the present invention. Any reference sign in a claim should not be construed as limiting the claim concerned.
此外,应当理解,虽然本说明书按照实施例加以描述,但并非每个实施例仅包含一个独立的技术方案,说明书的这种叙述方式仅仅是为清楚起见,本领域技术人员应当将说明书作为一个整体,各实施例中的技术方案也可以经适当组合,形成本领域技术人员可以理解的其他实施方式。In addition, it should be understood that although the specification is described according to the embodiments, not each embodiment only includes an independent technical solution, and this description in the specification is only for clarity, and those skilled in the art should take the specification as a whole , the technical solutions in the various embodiments can also be properly combined to form other implementations that can be understood by those skilled in the art.
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310000018.0A CN116186073A (en) | 2023-01-01 | 2023-01-01 | A Method of Improving Data Retrieval Speed |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310000018.0A CN116186073A (en) | 2023-01-01 | 2023-01-01 | A Method of Improving Data Retrieval Speed |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116186073A true CN116186073A (en) | 2023-05-30 |
Family
ID=86441596
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310000018.0A Pending CN116186073A (en) | 2023-01-01 | 2023-01-01 | A Method of Improving Data Retrieval Speed |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116186073A (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6697799B1 (en) * | 1999-09-10 | 2004-02-24 | Requisite Technology, Inc. | Automated classification of items using cascade searches |
CN107273506A (en) * | 2017-06-19 | 2017-10-20 | 西安电子科技大学 | A kind of method of database multi-list conjunctive query |
CN108334534A (en) * | 2017-10-27 | 2018-07-27 | 平安普惠企业管理有限公司 | Operation system field configuration method, apparatus, server and readable storage medium storing program for executing |
US20180268065A1 (en) * | 2017-03-20 | 2018-09-20 | Facebook, Inc. | Search Result Ranking Based on Post Classifiers on Online Social Networks |
CN110825604A (en) * | 2019-11-05 | 2020-02-21 | 北京博睿宏远数据科技股份有限公司 | Method, device, equipment and medium for monitoring user track and performance of application |
CN113127518A (en) * | 2021-04-29 | 2021-07-16 | 成都康赛信息技术有限公司 | Method for flexibly inquiring data |
CN113837703A (en) * | 2021-10-27 | 2021-12-24 | 江苏物润船联网络股份有限公司 | Logistics waybill carrying information quantification anti-duplication real-time automatic verification method |
-
2023
- 2023-01-01 CN CN202310000018.0A patent/CN116186073A/en active Pending
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6697799B1 (en) * | 1999-09-10 | 2004-02-24 | Requisite Technology, Inc. | Automated classification of items using cascade searches |
US20180268065A1 (en) * | 2017-03-20 | 2018-09-20 | Facebook, Inc. | Search Result Ranking Based on Post Classifiers on Online Social Networks |
CN107273506A (en) * | 2017-06-19 | 2017-10-20 | 西安电子科技大学 | A kind of method of database multi-list conjunctive query |
CN108334534A (en) * | 2017-10-27 | 2018-07-27 | 平安普惠企业管理有限公司 | Operation system field configuration method, apparatus, server and readable storage medium storing program for executing |
CN110825604A (en) * | 2019-11-05 | 2020-02-21 | 北京博睿宏远数据科技股份有限公司 | Method, device, equipment and medium for monitoring user track and performance of application |
CN113127518A (en) * | 2021-04-29 | 2021-07-16 | 成都康赛信息技术有限公司 | Method for flexibly inquiring data |
CN113837703A (en) * | 2021-10-27 | 2021-12-24 | 江苏物润船联网络股份有限公司 | Logistics waybill carrying information quantification anti-duplication real-time automatic verification method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110489445B (en) | Rapid mass data query method based on polymorphic composition | |
US9336301B2 (en) | Merging semantically similar clusters based on cluster labels | |
US11599535B2 (en) | Query translation for searching complex structures of objects | |
CN111506621B (en) | Data statistical method and device | |
US20200293545A1 (en) | Data storage method and apparatus, server, and storage medium | |
CN108959538B (en) | Full text retrieval system and method | |
CN100483411C (en) | Information searching method and device in relation ship data bank | |
CN112231321B (en) | A kind of Oracle secondary index and index real-time synchronization method | |
CN107526746B (en) | Method and apparatus for managing document index | |
CN106161193B (en) | Mail processing method, device and system | |
CN106547828B (en) | database caching system and method based on neural network | |
CN102955792A (en) | Method for implementing transaction processing for real-time full-text search engine | |
CN114443699B (en) | Information query method, device, computer equipment and computer readable storage medium | |
CN113312181A (en) | High-concurrency workflow approval method based on activiti custom form | |
CN109815240A (en) | Method, apparatus, device and storage medium for managing indexes | |
WO2020147334A1 (en) | Method and system for data query based on ignite cache architecture | |
CN111625570A (en) | List data resource retrieval method and device | |
CN116186073A (en) | A Method of Improving Data Retrieval Speed | |
US8805820B1 (en) | Systems and methods for facilitating searches involving multiple indexes | |
US8688695B2 (en) | Computerized searchable document repository using separate metadata and content stores and full text indexes | |
CN117472854A (en) | Acceleration batch file search model | |
CN116049193A (en) | Data storage method and device | |
CN110309369B (en) | Tree structure data realization method and terminal | |
CN115438032A (en) | A cloud computing-based data retrieval method, device and system | |
CN100367203C (en) | A string reference method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination |