CN103064933B - Data query method and system - Google Patents

Info

Publication number
CN103064933B
Authority
CN
China
Prior art keywords
data
field
query
query word
acquisition unit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201210566137.4A
Other languages
Chinese (zh)
Other versions
CN103064933A (en)
Inventor
谢永方
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Gaohang Intellectual Property Operation Co ltd
Handan Duofei Fastener Manufacturing Co ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN201210566137.4A
Publication of CN103064933A
Priority to PCT/CN2013/082130 (WO2014101445A1)
Application granted
Publication of CN103064933B
Legal status: Expired - Fee Related
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/93: Document management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • General Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to a data query method and system. The method includes: receiving an input query request that carries a field to be queried and a query word in that field; querying the centralized index table corresponding to the field to obtain the collector identifier corresponding to the query word; generating, according to the query request, a query command that carries the field and the query word, and sending the query command to the data collector corresponding to the collector identifier, so that the data collector uses the local index table corresponding to the field carried in the query command to find the data matching the query word carried in the query command; and receiving the data returned by the data collector, forming the query result of the query request from the received data, and outputting it. The invention can increase the processing speed of data queries and reduce both the system resource usage of the data collectors and the load on the data query server.

Description

Data query method and system

Technical field

The present invention relates to the field of computer network technology, and in particular to a data query method, a data query server, a data collector, and a data query system.

Background

In today's era of a highly developed Internet, data collection and query systems are widely used. Information technology (IT) systems, network devices, and security devices all generate large volumes of logs and other data, and much of this log data must be archived long term and used for audits and queries.

A system for collecting and querying massive data can be built on either distributed storage or centralized storage. Either way, it must store and query massive log data quickly.

One existing distributed data query system includes one data query server and multiple data collectors. The data collectors collect logs (receiving, formatting, merging), store them, and build indexes; the data query server is the single entry point for log queries. To query specified logs, the data query server sends a query command to all data collectors and, after receiving their query results, aggregates them into the final query result. If there are many data collectors, the logs sought by a given query exist on only a few of them, and queries are frequent, this scheme burdens every data collector, consuming power and central processing unit (CPU) resources. Besides serving queries, a data collector must also receive data and write it to storage, so frequent queries degrade its collection performance and reduce the overall processing capacity of the system.

Another existing distributed data query system stores the raw log data centrally. Each data collector only collects (receives, formats, merges) and reports logs; after processing, the log content is not kept locally but is reported to the data query server for storage. On receiving the logs reported by the data collectors, the data query server stores them centrally in a database and builds indexes, so log queries run directly against the database of the data query server. With centralized storage, log queries execute only in the data query server's database and do not affect the data collectors. However, because the log data is stored centrally in that database, the data collectors must report large amounts of log data. This greatly increases the load on the data query server and consumes a large share of the bandwidth between the data collectors and the data query server, which limits the number of data collectors one data query server can support, so the processing capacity of the whole system cannot be very high.

Summary of the invention

The present invention provides a data query method, a data query server, a data collector, and a data query system that can increase the processing speed of data queries, reduce the system resource usage of the data collectors and the load on the data query server, and improve the processing capacity of the whole system.

To achieve the above object, a first aspect of the present invention provides a data query method, the method comprising:

receiving an input query request, the query request carrying a field to be queried and a query word in the field;

querying the centralized index table corresponding to the field to obtain the collector identifier corresponding to the query word, the centralized index table storing the correspondence between query words in the field and collector identifiers;

generating, according to the query request, a query command carrying the field and the query word, and sending the query command to the data collector corresponding to the collector identifier, so that the data collector queries the local index table corresponding to the field carried in the query command to obtain the data matching the query word carried in the query command;

receiving the data returned by the data collector, forming the query result of the query request from the received data, and outputting the query result.
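
The four steps of the first aspect can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Collector` class, the `run_query` function, the identifiers `c1` and `c2`, and the sample records are all hypothetical, and a linear scan stands in for the collector's local index lookup. The sketch only shows that the query command reaches the collectors named in the centralized index table and no others.

```python
class Collector:
    """Hypothetical in-memory stand-in for a data collector."""

    def __init__(self, collector_id, records):
        self.collector_id = collector_id
        self.records = records      # list of dicts: field name -> content
        self.queries_served = 0     # counts query commands received

    def query(self, field, word):
        # A linear scan stands in for the local index table lookup.
        self.queries_served += 1
        return [r for r in self.records if r.get(field) == word]


def run_query(central_index, collectors, field, word):
    """Send the query command only to collectors listed for (field, word)."""
    collector_ids = central_index.get(field, {}).get(word, set())
    results = []
    for cid in sorted(collector_ids):
        results.extend(collectors[cid].query(field, word))
    return results


# Assumed sample data: two collectors, one centralized index table per field.
collectors = {
    "c1": Collector("c1", [{"field1": "aaa", "field2": "ccc"}]),
    "c2": Collector("c2", [{"field1": "bbb", "field2": "fff"}]),
}
central_index = {"field1": {"aaa": {"c1"}, "bbb": {"c2"}}}

hits = run_query(central_index, collectors, "field1", "aaa")
```

In this sketch collector `c2` never receives a query command, which is the resource saving the method aims for.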

With reference to the first aspect, in a first possible implementation of the first aspect, before querying the centralized index table corresponding to the field to obtain the collector identifier corresponding to the query word, the method further comprises:

for the field, establishing the centralized index table corresponding to the field;

wherein establishing the centralized index table corresponding to the field comprises:

receiving the report index table of the field sent by each data collector, the report index table containing the query words of the field that correspond to the data held in the data collector that sent it;

storing, in the centralized index table of the field, the correspondence between the identifier of each data collector and the query words of the field in the report index table reported by that data collector.
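
As a rough sketch of this implementation, the centralized index table of one field can be modeled as a mapping from query word to the set of collector identifiers that reported it. The function name `build_central_index` and the sample reports are assumptions made for illustration.

```python
def build_central_index(reports):
    """Build the centralized index table of one field.

    reports: iterable of (collector_id, query_words) pairs, one pair per
    report index table received from a data collector (assumed shape).
    Returns a dict mapping each query word to a set of collector identifiers.
    """
    index = {}
    for collector_id, words in reports:
        for word in words:
            index.setdefault(word, set()).add(collector_id)
    return index


# Assumed sample: collector c1 reported aaa and bbb, collector c2 reported aaa.
central = build_central_index([("c1", ["aaa", "bbb"]), ("c2", ["aaa"])])
```

Using a set per query word keeps the table correct even if a collector reports the same word more than once.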

With reference to the first aspect, in a second possible implementation of the first aspect, querying the centralized index table corresponding to the field to obtain the collector identifier corresponding to the query word comprises:

if the query request carries at least two fields to be queried, obtaining the query word of each field in the query request and recording the logical relationship between the query words of the fields;

querying the centralized index table corresponding to each field to obtain the collector identifiers corresponding to the query word of that field;

according to the logical relationship between the query words of the fields, filtering the collector identifiers obtained by the queries to keep those that satisfy the logical relationship.
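
One way to read the filtering step is as set algebra over the collector identifiers returned for each field. The sketch below assumes only two logical relationships, "and" and "or", and applies a single operator to all conditions; the source does not enumerate the supported relationships, and the function and parameter names are hypothetical.

```python
def candidate_collectors(central_indexes, conditions, operator="and"):
    """Filter collector identifiers by the logical relationship of the conditions.

    central_indexes: dict field -> (dict query word -> set of collector ids)
    conditions: list of (field, query_word) pairs
    operator: "and" or "or" (assumed; applied across all conditions)
    """
    id_sets = [central_indexes.get(f, {}).get(w, set()) for f, w in conditions]
    if not id_sets:
        return set()
    if operator == "and":
        return set.intersection(*id_sets)
    if operator == "or":
        return set.union(*id_sets)
    raise ValueError(f"unsupported operator: {operator}")


# Assumed sample centralized index tables for two fields.
indexes = {
    "field1": {"aaa": {"c1", "c2"}},
    "field2": {"ccc": {"c2", "c3"}},
}
both = candidate_collectors(indexes, [("field1", "aaa"), ("field2", "ccc")], "and")
either = candidate_collectors(indexes, [("field1", "aaa"), ("field2", "ccc")], "or")
```

For "and", the intersection only identifies collectors that may hold a matching record: a collector can contain both query words in different records, so the collector still has to apply the logical relationship to individual records.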

In a second aspect, the present invention further provides a data query method, the method comprising:

receiving a query command sent by a data query server, the query command including the field to be queried and the query word in the field that were carried in the query request received by the data query server;

querying the local index table corresponding to the field to obtain the storage location of the data matching the query word in the query command, the local index table storing the correspondence between query words in the field and storage locations of the data;

obtaining the data according to its storage location and sending it to the data query server.
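
On the collector side, the method reduces to two lookups: query word to storage locations, then storage locations to data. The following sketch assumes the local index tables and the stored records are plain dictionaries; `handle_query_command` and the sample values are illustrative only.

```python
def handle_query_command(local_index, storage, field, word):
    """Resolve a query command to stored records via the local index table.

    local_index: dict field -> (dict query word -> list of storage locations)
    storage: dict storage location -> record (assumed representation)
    """
    locations = local_index.get(field, {}).get(word, [])
    return [storage[location] for location in locations]


# Assumed sample records held by one collector, keyed by storage location.
storage = {
    1: {"field1": "bbb", "field2": "fff"},
    2: {"field1": "aaa", "field2": "ccc"},
    3: {"field1": "aaa", "field2": "ggg"},
}
local_index = {"field1": {"aaa": [2, 3], "bbb": [1]}}

rows = handle_query_command(local_index, storage, "field1", "aaa")
```

The returned `rows` are what the collector would send back to the data query server.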

With reference to the second aspect, in a first possible implementation of the second aspect, before querying the local index table corresponding to the field to obtain the storage location of the data matching the query word in the query command, the method further comprises:

for the field, establishing the local index table corresponding to the field;

wherein establishing the local index table corresponding to the field comprises:

obtaining the data in the current data collector and the storage location of the data, the data including the content of at least one field;

for each field, taking the content of the data in that field as the query word of the data and establishing a mapping between the query word and the storage location, thereby forming the local index table of that field in the current data collector.

With reference to the first possible implementation of the second aspect, in a second possible implementation of the second aspect, after storing, in the local index table of the field in the current data collector, the correspondence between the query word of the data and the storage location of the data, the method further comprises:

extracting the query words from the local index table of the field and deduplicating them to form the report index table of the field for the current data collector;

sending the report index table of the field to the data query server, so that the data query server establishes the centralized index table corresponding to the field.

With reference to the second aspect, in a third possible implementation of the second aspect, querying the local index table corresponding to the field to obtain the storage location of the data matching the query word in the query command comprises:

if the query command carries at least two fields to be queried, obtaining the query word of each field in the query command and recording the logical relationship between the query words of the fields;

querying the local index table corresponding to each field to obtain the storage locations of the data matching the query word of that field in the query command;

according to the logical relationship between the query words of the fields, filtering the storage locations obtained by the queries to keep the storage locations of the data that satisfies the logical relationship.

In a third aspect, the present invention further provides a data query server, the data query server comprising:

a first receiving unit, configured to receive an input query request, the query request carrying a field to be queried and a query word in the field;

a first query unit, configured to query the centralized index table corresponding to the field to obtain the collector identifier corresponding to the query word carried in the query request received by the first receiving unit, the centralized index table storing the correspondence between query words in the field and collector identifiers;

a first processing unit, configured to generate, according to the query request, a query command carrying the field and the query word, and to send the query command to the data collector corresponding to the collector identifier obtained by the first query unit, so that the data collector queries the local index table corresponding to the field carried in the query command to obtain the data matching the query word carried in the query command;

a first output unit, configured to receive the data returned by the data collector, form the query result of the query request from the received data, and output the query result.

With reference to the third aspect, in a first possible implementation of the third aspect, the data query server further comprises:

a first indexing unit, configured to establish, for the field, the centralized index table corresponding to the field;

the first indexing unit comprising:

a first receiving subunit, configured to receive the report index table of the field sent by each data collector, the report index table containing the query words of the field that correspond to the data held in the data collector that sent it;

a first indexing subunit, configured to store, in the centralized index table of the field, the correspondence between the identifier of each data collector and the query words of the field in the report index table reported by that data collector.

With reference to the third aspect, in a second possible implementation of the third aspect, the first query unit comprises:

a first parsing subunit, configured to, if the query request received by the first receiving unit carries at least two fields to be queried, obtain the query word of each field in the query request and record the logical relationship between the query words of the fields;

a first query subunit, configured to query the centralized index table corresponding to each field to obtain the collector identifiers corresponding to the query word of that field obtained by the first parsing subunit;

a first filtering subunit, configured to filter, according to the logical relationship between the query words of the fields obtained by the first parsing subunit, the collector identifiers obtained by the first query subunit to keep those that satisfy the logical relationship.

In a fourth aspect, the present invention further provides a data collector, the data collector comprising:

a second receiving unit, configured to receive a query command sent by a data query server, the query command carrying a field to be queried and a query word in the field;

a second query unit, configured to query the local index table corresponding to the field to obtain the storage location of the data matching the query word in the query command received by the second receiving unit, the local index table storing the correspondence between query words in the field and storage locations of the data;

a second processing unit, configured to obtain the data according to the storage location obtained by the second query unit and send the data to the data query server.

With reference to the fourth aspect, in a first possible implementation of the fourth aspect, the data collector further comprises:

a second indexing unit, configured to establish, for the field, the local index table corresponding to the field;

the second indexing unit comprising:

an obtaining subunit, configured to obtain the data in the current data collector and the storage location of the data, the data including the content of at least one field;

a second indexing subunit, configured to, for each field obtained by the obtaining subunit, take the content of the data in that field as the query word of the data and store, in the local index table of that field in the data collector, the correspondence between the query word of the data and the storage location of the data.

With reference to the first possible implementation of the fourth aspect, in a second possible implementation of the fourth aspect, the second indexing unit further comprises:

a third indexing subunit, configured to extract the query words from the local index table of the field obtained by the second indexing subunit and deduplicate them to form the report index table of the field for the current data collector;

a sending subunit, configured to send the report index table of the field formed by the third indexing subunit to the data query server, so that the data query server establishes the centralized index table of the field.

With reference to the fourth aspect, in a third possible implementation of the fourth aspect, the second query unit comprises:

a second parsing subunit, configured to, if the query command received by the second receiving unit carries at least two fields to be queried, obtain the query word of each field in the query command and record the logical relationship between the query words of the fields;

a second query subunit, configured to query the local index table corresponding to each field to obtain the storage locations of the data matching the query word of that field obtained by the second parsing subunit;

a second filtering subunit, configured to filter, according to the logical relationship between the query words of the fields obtained by the second parsing subunit, the storage locations obtained by the second query subunit to keep the storage locations of the data that satisfies the logical relationship.

In a fifth aspect, the present invention further provides a data query system, the system comprising:

the data query server provided in the third aspect and the data collector provided in the fourth aspect.

The data query method, data query server, data collector, and data query system provided by the embodiments of the present invention establish a local index table in each data collector and a centralized index table in the data query server. This effectively reduces the system resource usage of the data collectors, leaving them more resources for collection, which improves the overall processing capacity of the system and increases the processing speed of data queries.

Description of drawings

FIG. 1 is an architecture diagram of the data query system provided by an embodiment of the present invention;

FIG. 2 is a signaling diagram of the index establishment process provided by Embodiment 1 of the present invention;

FIG. 3 is a flowchart of a data query method provided by Embodiment 1 of the present invention;

FIG. 4 is a flowchart of another data query method provided by Embodiment 1 of the present invention;

FIG. 5 is a schematic diagram of the data query system provided by Embodiment 2 of the present invention;

FIG. 6 is a schematic diagram of the data query server and the data collector provided by Embodiment 2 of the present invention;

FIG. 7 is a schematic diagram of the data query server provided by Embodiment 3 of the present invention;

FIG. 8 is a schematic diagram of the data collector provided by Embodiment 3 of the present invention.

Detailed description

The technical solutions of the present invention are described in further detail below with reference to the accompanying drawings and embodiments.

FIG. 1 is an architecture diagram of the data query system provided by an embodiment of the present invention. As shown in FIG. 1, the invention adopts a distributed architecture that includes one data query server 10 and multiple data collectors 20. The data collectors 20 collect (receive, format, merge), store, and index the massive logs and other data reported by the log sources 30, and the data query server 10 is the single entry point for data queries.

The data query method provided by the present invention can be used for fast queries over massive data. The following embodiments use log data as an example.

Embodiment 1

Before log data can be queried, the data already stored in the system must be indexed, usually at the time the data is stored, so that the system can query the data using the resulting index tables.

In this embodiment, a local index table is established in each data collector and a centralized index table is established in the data query server. The local index table stores the index of the log data in the current data collector: given a query condition, it yields the specific storage locations of all logs in the local data that meet the condition. The centralized index table stores the index from the query words of each field to collector identifiers: given a query condition, it yields the data collectors on which the data to be queried may be stored, because it holds the identifiers of the data collectors that store that data.

FIG. 2 is a signaling diagram of the index establishment process provided by this embodiment. As shown in FIG. 2, the process includes:

Step S101: The data collector obtains the data in the current data collector and the storage location of the data.

Optionally, the data stored in the data collector is the raw log data reported by the log sources. After a log source reports raw log data to the data collector, the data collector also needs to build a local index for the raw log data.

The data collector formats and merges the raw log data, turning it into records of a log table (one record per row of the log table). Each log table may have multiple fields. As shown in Table 1 below, the log table includes fields such as field 1 and field 2, and the serial number indicates the storage location of the data.

Table 1

Serial number    Field 1 content    Field 2 content    ...
1                bbb                fff                ...
2                aaa                ccc                ...
3                aaa                ggg                ...

Step S102: For each field, the data collector takes the content of the data in that field as the query word of the data and establishes a mapping between the query word and the storage location, forming the local index table of that field in the current data collector.

In the local index of the data collector, a local index table is built for each field of the records in the log table. Each index table maps the content of one field in the log table to the storage locations, within the log table, of the data containing that content. The field content serves as the query word of the corresponding data. The mapping between query words and storage locations may be, but is not limited to being, represented as a table. Table 2 below shows the local index table of field 1 in the data collector:

Table 2

Field 1 content    Location
aaa                2
aaa                3
bbb                1

The local index table of a field makes it possible to quickly find the specific location of data on the data collector. For example, to find the specific location of a given content of field 1, look it up in the local index table of field 1.
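
Step S102 can be sketched as building one inverted index per field. The sketch below uses the rows of Table 1 as input and reproduces the content of Table 2 for field 1; the dictionary layout and the name `build_local_indexes` are assumptions, not part of the source.

```python
def build_local_indexes(log_table):
    """Build one local index table per field.

    log_table: dict storage location -> (dict field -> content)
    Returns dict field -> (dict query word -> list of storage locations).
    """
    indexes = {}
    for location, record in sorted(log_table.items()):
        for field, content in record.items():
            # The field content is used directly as the query word.
            indexes.setdefault(field, {}).setdefault(content, []).append(location)
    return indexes


# The three rows of Table 1, keyed by serial number (storage location).
table1 = {
    1: {"field1": "bbb", "field2": "fff"},
    2: {"field1": "aaa", "field2": "ccc"},
    3: {"field1": "aaa", "field2": "ggg"},
}
local = build_local_indexes(table1)
```

Here `local["field1"]` groups locations 2 and 3 under the query word aaa, which carries the same information as the two aaa rows of Table 2.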

Step S103: The data collector extracts the query words from the local index table of the field and deduplicates them to form the report index table of that field for the current data collector.

The data collector derives the report index table to be uploaded from the local index table of the field. The report index table no longer contains the original log location for each index entry; it contains only the content of the field for each piece of data, that is, the query words. Before reporting, the query words are usually deduplicated so that the reported content of each field contains no repeats. For example, field 1 in Table 2 contains two aaa entries and one bbb entry; after deduplication, the report index table contains only one aaa and one bbb, as shown in Table 3 below.

Table 3

Field 1 content
aaa
bbb
...

Query words extracted from newly added local index entries are compared with the report index table that has already been reported. A query word that is already present is not added to the report index table; only the new, non-duplicate entries of the report index table are uploaded.
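Deduplication and incremental reporting can be sketched as below. The function names and the `reported` set are hypothetical; the sketch assumes the collector remembers which query words it has already uploaded for a field.

```python
def build_report_index(local_index):
    """Keep only the distinct query words; storage locations are dropped."""
    return set(local_index)


def incremental_report(local_index, already_reported):
    """Return query words not yet reported and mark them as reported."""
    new_words = build_report_index(local_index) - already_reported
    already_reported |= new_words  # in-place update of the caller's set
    return new_words


# Local index of field 1, as in Table 2.
local_index_field1 = {"aaa": [2, 3], "bbb": [1]}
reported = set()

first = incremental_report(local_index_field1, reported)
print(sorted(first))  # ['aaa', 'bbb']

# New log records arrive: one known word, one new word (assumed values).
local_index_field1["aaa"].append(4)
local_index_field1["fff"] = [5]

second = incremental_report(local_index_field1, reported)
print(sorted(second))  # ['fff']
```

Because only distinct, previously unseen query words are uploaded, the volume sent to the data query server grows with the number of distinct field values rather than with the number of log records.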

Step S104: The data collector sends the report index table of the field to the data query server.

The data collector reports the report index table shown in Table 3 to the data query server. When the local index table has been updated, only the content newly added to the report index table generated from the updated local index table is uploaded.

The data query server aggregates the report index tables sent by multiple data collectors and builds a centralized index table for each field, which specifically includes:

Step S105: The data query server receives the report index table of the field sent by each data collector.

The report index table of a data collector contains the query words of that field for the data in the data collector.

Step S106: A mapping between the query words and the collector identifiers is established to form the centralized index table of the field.

The data query server builds a separate centralized index table for each field. For example, for field 1, the centralized index table of the data query server stores the query words held on each data collector together with the corresponding collector identifiers, as shown in Table 4 below:

Table 4

Field 1 content | Collector ID
aaa | Collector 1
aaa | Collector 2
bbb | Collector 1

As another example, for field 2, the centralized index table built by the data query server is shown in Table 5 below:

Table 5

Field 2 content | Collector ID
ccc | Collector 1
ddd | Collector 2
eee | Collector 3

The data query server holds one centralized index table per field, and the report index tables reported by the data collectors are aggregated into it. That is, the centralized index table of a field on the data query server stores the correspondence between the identifier of each data collector and the query words of that field in the report index table reported by that data collector.
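The aggregation on the data query server can be sketched as follows. The names `REPORTS` and `build_central_index` are hypothetical, and the report contents are chosen so that the result reproduces Tables 4 and 5.

```python
from collections import defaultdict

# Hypothetical report index tables, keyed by collector identifier.
REPORTS = {
    "collector1": {"field1": {"aaa", "bbb"}, "field2": {"ccc"}},
    "collector2": {"field1": {"aaa"}, "field2": {"ddd"}},
    "collector3": {"field2": {"eee"}},
}


def build_central_index(reports):
    """Return {field: {query_word: set of collector identifiers}}."""
    central = defaultdict(lambda: defaultdict(set))
    for collector_id, report in reports.items():
        for field, words in report.items():
            for word in words:
                central[field][word].add(collector_id)
    return central


central_index = build_central_index(REPORTS)
print(sorted(central_index["field1"]["aaa"]))  # ['collector1', 'collector2']
print(sorted(central_index["field2"]["ccc"]))  # ['collector1']
```

The centralized index stores no storage locations, only which collectors hold a given query word, so it stays small compared with the local indexes.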

Once the index tables have been built, the data to be queried can be found through them when a query request is received.

FIG. 3 is a flowchart of the data query method provided by this embodiment. As shown in FIG. 3, the data query method of the present invention includes:

Step S201: The data query server receives an input query request.

The user enters a query request by form search, expression search, or a similar method. With form search, the user interface presents fixed fields; the user enters query words in the input boxes of one or more fields and submits the query request to the data query server with the submit button. With expression search, the user directly enters the field to be queried and the query word for that field, and submits them to the data query server.

The query request received by the data query server carries the field to be queried and the query word of that field.

When the query request contains multiple fields, it also carries the logical relationship between the query words of the fields. For example, the user enters query words in different search fields, and the data query server receives the query words of those search fields together with the logical relationship between them.

For example, the input query request is (field 1 = aaa) AND (field 2 = ccc). This query request contains two fields, field 1 and field 2, whose query words are aaa and ccc respectively; AND indicates that the logical relationship between the two query words is a logical "and".
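An expression-style request such as the one above could be parsed as sketched below. The source does not specify a request syntax beyond the example, so the grammar here (optional parentheses, a single `AND` or `OR` operator, field names written as `field1`) and the name `parse_request` are assumptions.

```python
import re

# One condition: <field>=<query word>, parentheses optional.
CONDITION = re.compile(r"([^\s=()]+)\s*=\s*([^\s()]+)")
OPERATOR = re.compile(r"\b(AND|OR)\b")


def parse_request(expression):
    """Return ([(field, query_word), ...], operator or None)."""
    conditions = CONDITION.findall(expression)
    operators = set(OPERATOR.findall(expression))
    if len(operators) > 1:
        # Mixed AND/OR would need a real expression parser.
        raise ValueError("mixed operators are not supported in this sketch")
    operator = operators.pop() if operators else None
    return conditions, operator


print(parse_request("(field1=aaa)AND(field2=ccc)"))
# ([('field1', 'aaa'), ('field2', 'ccc')], 'AND')
print(parse_request("field1=aaa"))
# ([('field1', 'aaa')], None)
```

The parsed form, a list of (field, query word) pairs plus the logical relationship, is all that the later lookup steps need.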

Step S202: The data query server looks up the centralized index table corresponding to the field to obtain the collector identifier corresponding to the query word.

When the query request includes field 1, the query word is matched against the centralized index table of field 1 to obtain the corresponding collector identifier, which shows on which data collector the query word is held.

For example, if the query request is field 1 = aaa, a lookup in the centralized index table shown in Table 4 returns collector 1 and collector 2 as the collector identifiers corresponding to aaa.

When the query request includes multiple fields, this step specifically includes:

Step S2021: Obtain the query words of each field in the query request, and record the logical relationship between the query words of the fields carried in the query request.

Step S2022: Look up the centralized index table corresponding to each field to obtain the collector identifiers corresponding to the query word of each field.

Step S2023: According to the logical relationship between the query words of the fields, filter the collector identifiers obtained in step S2022 to obtain the collector identifiers that satisfy the logical relationship.

For example, if the query request is (field 1 = aaa) AND (field 2 = ccc), a lookup in the centralized index table shown in Table 4 returns collector 1 and collector 2 for aaa, and a lookup in the centralized index table shown in Table 5 returns collector 1 for ccc. Because the logical relationship between aaa and ccc is AND, the only collector identifier that satisfies the logical relationship is collector 1.
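Steps S2021 to S2023 amount to set operations on collector identifiers. The sketch below assumes the centralized index contents of Tables 4 and 5; `CENTRAL_INDEX` and `find_collectors` are hypothetical names, and only a single AND or OR relationship is handled.

```python
# Centralized index tables with the contents of Tables 4 and 5.
CENTRAL_INDEX = {
    "field1": {"aaa": {"collector1", "collector2"}, "bbb": {"collector1"}},
    "field2": {"ccc": {"collector1"}, "ddd": {"collector2"}, "eee": {"collector3"}},
}


def find_collectors(conditions, operator, central_index):
    """conditions: [(field, query_word)]; operator: 'AND', 'OR', or None."""
    matches = [
        central_index.get(field, {}).get(word, set())
        for field, word in conditions
    ]
    if not matches:
        return set()
    if operator == "AND":
        # A collector qualifies only if it holds every query word.
        return set.intersection(*matches)
    # OR, or a single condition: any collector holding a query word qualifies.
    return set.union(*matches)


conditions = [("field1", "aaa"), ("field2", "ccc")]
print(sorted(find_collectors(conditions, "AND", CENTRAL_INDEX)))  # ['collector1']
print(sorted(find_collectors(conditions, "OR", CENTRAL_INDEX)))
# ['collector1', 'collector2']
```

Filtering at this level only narrows the set of candidate collectors; whether a single record satisfies the whole condition is decided on the collector from its local index tables.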

Step S203: The data query server generates, from the query request, a query command carrying the field and the query word, and sends the query command to the data collector corresponding to the collector identifier.

The query command generated by the data query server may be identical to the input query request, or it may include only the query words held by the destination data collector. For example, if the query request is (field 1 = aaa) OR (field 2 = ccc), a lookup in the centralized index table shown in Table 4 returns collector 1 and collector 2 for aaa, and a lookup in the centralized index table shown in Table 5 returns collector 1 for ccc. Because the logical relationship between aaa and ccc is OR, the query command sent to collector 1 is (field 1 = aaa) OR (field 2 = ccc), and the query command sent to collector 2 is field 1 = aaa.
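Generating a trimmed query command per collector can be sketched as follows, using the contents of Tables 4 and 5. The command layout (a dictionary with `conditions` and `operator`) and the name `build_commands` are assumptions made for illustration.

```python
# Centralized index tables with the contents of Tables 4 and 5.
CENTRAL_INDEX = {
    "field1": {"aaa": {"collector1", "collector2"}, "bbb": {"collector1"}},
    "field2": {"ccc": {"collector1"}, "ddd": {"collector2"}, "eee": {"collector3"}},
}


def build_commands(conditions, operator, central_index):
    """Return {collector_id: command} with only the conditions that collector holds."""
    per_collector = {}
    for field, word in conditions:
        for collector_id in sorted(central_index.get(field, {}).get(word, set())):
            per_collector.setdefault(collector_id, []).append((field, word))
    if operator == "AND":
        # Under AND, a collector must hold every query word to be contacted.
        per_collector = {
            cid: conds
            for cid, conds in per_collector.items()
            if len(conds) == len(conditions)
        }
    return {
        cid: {"conditions": conds, "operator": operator if len(conds) > 1 else None}
        for cid, conds in per_collector.items()
    }


commands = build_commands([("field1", "aaa"), ("field2", "ccc")], "OR", CENTRAL_INDEX)
print(commands["collector1"])
# {'conditions': [('field1', 'aaa'), ('field2', 'ccc')], 'operator': 'OR'}
print(commands["collector2"])
# {'conditions': [('field1', 'aaa')], 'operator': None}
```

Collector 3 receives no command at all in this example, because it holds neither query word.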

The data collector queries its local index table to obtain the data matching the query word; the specific process is described in detail below with reference to FIG. 4.

Step S204: The data query server receives the data returned by the data collector, forms the query result of the query request from the received data, and outputs it.

The data query server aggregates the received data; the result may be, but is not limited to being, output as a table.

For example, for the query request (field 1 = aaa) AND (field 2 = ccc), the final query result is shown in Table 6 below:

Table 6

Serial number | Field 1 | Field 2 | Field 3
2 | aaa | ccc | iii

FIG. 4 is a flowchart of another data query method provided by this embodiment. As shown in FIG. 4, the data query method of the present invention includes:

Step S301: The data collector receives the query command sent by the data query server.

The query command includes the field to be queried and the query word of that field, as carried in the query request received by the data query server. Optionally, it may also include query words of multiple fields and the logical relationship between those query words.

Step S302: The data collector looks up the local index table corresponding to the field to obtain the storage location of the data matching the query command.

When the query command contains query words of multiple fields, this step specifically includes:

Step S3021: The data collector obtains the query words of each field in the query command received in step S301, and records the logical relationship between the query words of the fields carried in the query command.

Step S3022: The data collector looks up the local index table corresponding to each field to obtain the storage locations of the data matching the query word of each field.

For example, if the query command is (field 1 = aaa) OR (field 2 = ccc), the lookup in the local index table of field 1 returns storage location 2, and the lookup in the local index table of field 2 also returns storage location 2.

Step S3023: According to the logical relationship between the query words of the fields, the data collector filters the storage locations obtained by the lookup to obtain the storage locations of the data that satisfy the logical relationship.

That is, the data collector filters the storage locations of the matched data by the logical relationship between the query words, keeping only the storage locations of data that satisfy it.
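On the collector, steps S3021 to S3023 can be sketched with set operations on storage locations. The field 1 index below follows Table 2; the field 2 index and the names `LOCAL_INDEXES` and `find_locations` are assumptions for illustration.

```python
# Local index tables of one collector: field 1 as in Table 2, field 2 assumed.
LOCAL_INDEXES = {
    "field1": {"aaa": [2, 3], "bbb": [1]},
    "field2": {"ccc": [2]},
}


def find_locations(conditions, operator, local_indexes):
    """Return the sorted storage locations that satisfy the query command."""
    matches = [
        set(local_indexes.get(field, {}).get(word, []))
        for field, word in conditions
    ]
    if not matches:
        return []
    if operator == "AND":
        # Keep only records that match every field condition.
        result = set.intersection(*matches)
    else:
        # OR, or a single condition: a match on any field is enough.
        result = set.union(*matches)
    return sorted(result)


conditions = [("field1", "aaa"), ("field2", "ccc")]
print(find_locations(conditions, "AND", LOCAL_INDEXES))  # [2]
print(find_locations(conditions, "OR", LOCAL_INDEXES))   # [2, 3]
```

Only the index tables are consulted at this stage; the log records themselves are read afterwards, and only at the surviving storage locations.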

Step S303: The data collector retrieves the data according to its storage location and sends it to the data query server.

The data collector retrieves the corresponding data according to the storage locations obtained in step S302. For example, the data satisfying the query command may be:

Serial number | Field 1 | Field 2 | Field 3
2 | aaa | ccc | iii

The data collector sends the data to the data query server, which aggregates it and outputs the query result of the query request.
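Retrieving the records on the collector and aggregating them on the data query server can be sketched as below. The record layout and the names `fetch_records` and `merge_results` are hypothetical; the single record matches the example row above.

```python
# Hypothetical log table of collector 1; location 2 holds the example record.
LOG_TABLE_COLLECTOR1 = {
    2: {"field1": "aaa", "field2": "ccc", "field3": "iii"},
}


def fetch_records(log_table, locations):
    """Collector side: read the records stored at the given locations."""
    return [
        {"serial": location, **log_table[location]}
        for location in locations
        if location in log_table
    ]


def merge_results(responses):
    """Server side: concatenate the records returned by each collector."""
    merged = []
    for collector_id in sorted(responses):
        merged.extend(responses[collector_id])
    return merged


responses = {"collector1": fetch_records(LOG_TABLE_COLLECTOR1, [2])}
result = merge_results(responses)
print(result)
# [{'serial': 2, 'field1': 'aaa', 'field2': 'ccc', 'field3': 'iii'}]
```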

In the data query method provided by this embodiment, a local index table and a centralized index table are built in the data collector and the data query server respectively. When data is queried, the collector identifier corresponding to the query word is found through the centralized index table of the relevant field, and the data is then retrieved from the data collector with that identifier. This effectively reduces the system resources consumed on the data query server and the data collectors, leaves the data collectors more resources for improving collection performance, and speeds up query processing.

The above is a detailed description of the data query method provided by the present invention. The data query system provided by the present invention is described in detail below.

Embodiment Two

FIG. 5 is a schematic diagram of the data query system provided by this embodiment. As shown in FIG. 5, the data query system of this embodiment of the present invention includes a data query server 10 and a data collector 20. The data collector 20 is responsible for collecting data (receiving, formatting, and merging), storing the data, and indexing the stored data. The data query server 10 manages the content stored on multiple data collectors 20 in a unified way and serves as the single entry point for data queries.

FIG. 6 is a schematic diagram of the data query server 10 and the data collector 20 provided by this embodiment. As shown in FIG. 6, the data query server 10 includes a first indexing unit 100, a first receiving unit 101, a first query unit 102, a first processing unit 103, and a first output unit 104.

The data collector 20 includes a second indexing unit 200, a second receiving unit 201, a second query unit 202, and a second processing unit 203.

Before log data is queried, the data collector 20 and the data query server 10 must index the data already stored in the system, usually at the time the data is stored, so that the system can query data using the index tables. The data query server 10 uses the first indexing unit 100 to build a centralized index table for each field. The data collector 20 uses the second indexing unit 200 to build the local index tables of the fields.

The local index table stores the index of the log data in the current data collector: given a query condition, it locates the specific storage locations of all matching logs in the local data. The centralized index table stores the index from the data to be queried to collector identifiers: given a query condition, it shows on which data collectors the data to be queried may be stored, by providing the identifiers of the data collectors that store that data.

The second indexing unit 200 includes an acquiring subunit 2001, a second indexing subunit 2002, a third indexing subunit 2003, and a sending subunit 2004.

The acquiring subunit 2001 is configured to acquire the data in the current data collector and the storage location of the data, where the data includes the content of at least one field.

Optionally, the data stored in the data collector is raw log data reported by a log source. After the log source reports the raw log data to the data collector, the data collector also needs to build a local index for it.

The data collector formats and merges the raw log data into the records of a log table (one record per row of the log table). Each log table may have multiple fields. As shown in Table 1, the log table includes fields such as field 1 and field 2, and the serial number indicates the storage location of the data.

The second indexing subunit 2002 is configured to, for each field, use the content of the data in that field as the query word of the data and establish the mapping between the query word and the storage location, forming the local index table of that field in the current data collector.

In the local index of the data collector 20, a separate local index table is built for each field of the records in the log table. Each index table maps the content of one field in the specified log table to the storage locations, in the log table, of the data containing that field content. The field content serves as the query word for the corresponding data. The mapping between query words and storage locations may be, but is not limited to being, represented as a table, as shown in Table 2.

The local index table of a field makes it possible to quickly find the specific location of data on the data collector. For example, to find the specific location of a given content of field 1, look it up in the local index table of field 1.

The third indexing subunit 2003 is configured to extract the query words from the local index table of the field and deduplicate them to form the report index table of that field for the current data collector.

The data collector derives the report index table to be uploaded from the local index table of the field. The report index table no longer contains the original log location for each index entry; it contains only the content of the field for each piece of data, that is, the query words. Before reporting, the query words are usually deduplicated so that the reported content of each field contains no repeats. For example, field 1 in Table 2 contains two aaa entries and one bbb entry; after deduplication, the report index table contains only one aaa and one bbb, as shown in Table 3.

Query words extracted from newly added local index entries are compared with the report index table that has already been reported. A query word that is already present is not added to the report index table; only the new, non-duplicate entries of the report index table are uploaded.

The sending subunit 2004 is configured to send the report index table of the field to the first indexing unit 100 of the data query server 10, so that the data query server can build the centralized index table of the field.

The first indexing unit 100 includes a first receiving subunit 1001 and a first indexing subunit 1002.

The first receiving subunit 1001 is configured to receive the report index table of the field sent by each data collector. The report index table contains the query words of that field for the data in the data collector that sent it.

The first indexing subunit 1002 is configured to establish the mapping between the query words and the collector identifiers, forming the centralized index table of the field.

The first indexing subunit 1002 builds a separate centralized index table for each field. For example, for field 1, the centralized index table stores the query words held on each data collector together with the corresponding collector identifiers, as shown in Table 4.

The first indexing subunit 1002 builds one centralized index table per field, and the report index tables reported by the data collectors are aggregated into the centralized index tables of the data query server.

After the first indexing unit 100 and the second indexing unit 200 have built the index tables in the data query server 10 and the data collector 20 respectively, the data to be queried can be found through the index tables when a query request is received.

The first receiving unit 101 is configured to receive an input query request.

The user enters a query request by form search, expression search, or a similar method. With form search, the user interface presents fixed fields; the user enters query words in the input boxes of one or more fields and submits the query request to the data query server with the submit button. With expression search, the user directly enters the field to be queried and the query word for that field, and submits them to the data query server.

The query request received by the first receiving unit 101 carries the field to be queried and the query word of that field.

When the query request contains multiple fields, the query request received by the first receiving unit 101 also includes the logical relationship between the query words of the fields. For example, the user enters query words in different search fields, and the first receiving unit 101 receives the query words of those search fields together with the logical relationship between them.

The first query unit 102 is configured to look up, in the centralized index table of the field, the collector identifier corresponding to the query word carried in the query request received by the first receiving unit 101.

When the query request includes field 1, the first query unit 102 matches the query word against the centralized index table of field 1 to obtain the corresponding collector identifier, which shows on which data collector the query word is held.

For the case where the query request includes multiple fields, the first query unit 102 includes a first parsing subunit, a first query subunit, and a first filtering subunit (not shown in the figure).

The first parsing subunit is configured to obtain the query words of each field in the query request and record the logical relationship between the query words of the fields.

The first query subunit is configured to look up, in the centralized index table corresponding to each field, the collector identifiers corresponding to the query words of the fields obtained by the first parsing subunit.

The first filtering subunit is configured to filter, according to the logical relationship between the query words of the fields obtained by the first parsing subunit, the collector identifiers found by the first query subunit, to obtain the collector identifiers that satisfy the logical relationship.

The first processing unit 103 is configured to generate, from the query request, a query command carrying the field and the query word, and to send the query command to the data collector 20 corresponding to the collector identifier found by the first query unit 102.

The query command generated by the first processing unit 103 may be identical to the input query request, or it may include only the query words held by the destination data collector. For example, if the query request is (field 1 = aaa) OR (field 2 = ccc), the first query unit 102 obtains collector 1 and collector 2 for aaa from the centralized index table shown in Table 4, and collector 1 for ccc from the centralized index table shown in Table 5. Because the logical relationship between aaa and ccc is OR, the query command that the first processing unit 103 generates for collector 1 is (field 1 = aaa) OR (field 2 = ccc), and the query command it generates for collector 2 is field 1 = aaa.

The second receiving unit 201 of the data collector 20 is configured to receive the query command sent by the data query server 10.

The query command includes the field to be queried and the query word of that field, as carried in the query request received by the data query server 10; it may include query words of multiple fields and the logical relationship between those query words.

The second query unit 202 is configured to look up, in the local index table corresponding to the field, the storage location of the data matching the query word in the query command received by the second receiving unit 201.

Optionally, for the case where the query command contains query words of multiple fields, the second query unit 202 includes a second parsing subunit, a second query subunit, and a second filtering subunit (not shown in the figure).

The second parsing subunit is configured to, when the query command received by the second receiving unit 201 carries multiple fields to be queried, obtain the query words of each field in the query command and record the logical relationship between the query words of the fields carried in the query command.

The second query subunit is configured to look up, in the local index table corresponding to each field, the storage locations of the data matching the query words of the fields obtained by the second parsing subunit.

The second filtering subunit is configured to filter, according to the logical relationship between the query words of the fields obtained by the second parsing subunit, the storage locations found by the second query subunit, to obtain the storage locations of the data that satisfy the logical relationship.

The second processing unit 203 is configured to retrieve the data according to the storage locations found by the second query unit 202 and send the data to the data query server.

The second processing unit 203 sends the data found through the second query unit 202 to the first output unit 104 of the data query server 10, so that the query result of the query request can be output.

The first output unit 104 is configured to receive the data returned by the second processing unit 203 of the data collector 20, form the query result of the query request from the received data, and output it.

In the data query server, data collector, and system provided by the embodiments of the present invention, the first indexing unit builds the centralized index table in the data query server and the second indexing unit builds the local index table in the data collector, which speeds up query processing. When the data to be queried exists on only a few data collectors, the present invention does not need to query every data collector, which reduces the load on the data collectors. For a log system that is queried frequently, the solution provided by the embodiments of the present invention effectively reduces the system resources consumed on the data collectors and leaves them more resources for improving collection performance, thereby improving the overall processing capacity of the system.

Embodiment Three

FIG. 7 is a schematic diagram of the data query server 10 provided in this embodiment. As shown in FIG. 7, the data query server 10 includes a network interface 71, a processor 72, and a memory 73. A system bus 74 connects the network interface 71, the processor 72, and the memory 73.

The network interface 71 is used to communicate with the data collector 20.

The memory 73 may be persistent storage, such as a hard disk drive or flash memory, and holds software modules and device drivers. The software modules implement the various functional modules of the above method of the present invention; the device drivers may be network and interface drivers.

At startup, these software components are loaded into the memory 73 and then accessed by the processor 72, which executes the following instructions:

receiving an input query request, the query request carrying a field to be queried and a query word in the field;

querying the centralized index table corresponding to the field to obtain the collector identifier corresponding to the query word, the centralized index table storing the correspondence between query words in the field and collector identifiers;

generating, according to the query request, a query command carrying the field and the query word, and sending the query command to the data collector corresponding to the collector identifier, so that the data collector queries its local index table corresponding to the field carried in the query command to obtain data matching the query word carried in the query command;

receiving the data returned by the data collector, forming the query result of the query request from the received data, and outputting it.

The data query server of this embodiment uses the centralized index table of a field to find the collector identifier corresponding to the query word, and then obtains the corresponding data from the data collector with that identifier. This can effectively reduce the system resources consumed on the data query server and increase the processing speed of data queries.
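
The query flow on the server side can be sketched as follows. This is only an illustrative sketch, not the patented implementation: the nested-dictionary layout of the centralized index table, the `handle_query` name, and the `send_command` callback (standing in for communication over the network interface 71) are all assumptions introduced for the example.

```python
from typing import Callable, Dict, List, Set

# Assumed layout: field name -> query word -> identifiers of collectors holding it.
CentralIndex = Dict[str, Dict[str, Set[str]]]


def handle_query(
    central_index: CentralIndex,
    send_command: Callable[[str, dict], List[str]],
    field: str,
    word: str,
) -> List[str]:
    """Look up the collectors for (field, word) and query only those collectors."""
    collector_ids = central_index.get(field, {}).get(word, set())
    results: List[str] = []
    # Sorted only to make the dispatch order deterministic in this sketch.
    for collector_id in sorted(collector_ids):
        command = {"field": field, "word": word}  # hypothetical query command
        results.extend(send_command(collector_id, command))
    return results
```

Collectors whose identifiers are not in the centralized index table for the query word are never contacted, which is the source of the load reduction described above.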

Further, after accessing the software components in the memory 73, the processor executes instructions for the following process:

establishing, for the field, the centralized index table corresponding to the field;

Establishing the centralized index table corresponding to the field includes:

receiving the reporting index table of the field sent by each data collector, the reporting index table including the query words, for the field, of the data held in the data collector that sent it;

establishing a mapping between the query words and the collector identifiers to form the centralized index table of the field.

The above instructions describe how the data query server builds the centralized index table: it establishes a mapping between query words and collector identifiers so that, at query time, it can find the collector identifier corresponding to a query word and obtain the matching data from the corresponding data collector.
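
As a rough illustration of this step, the sketch below merges the reporting index tables of several collectors into a centralized index table for one field. The function name and the representation of a reporting index table as a plain list of query words are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def build_central_index(reports: Dict[str, Iterable[str]]) -> Dict[str, Set[str]]:
    """Merge per-collector reporting index tables for one field.

    reports maps a collector identifier to the query words that collector reported.
    The result maps each query word to the set of collectors that hold it.
    """
    table: Dict[str, Set[str]] = defaultdict(set)
    for collector_id, words in reports.items():
        for word in words:
            table[word].add(collector_id)
    return dict(table)
```

For example, `build_central_index({"c1": ["a", "b"], "c2": ["b"]})` maps `"a"` to collector `c1` only and `"b"` to both collectors.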

Further, after accessing the software components in the memory 73, the processor executes instructions for the following process:

when the query request carries at least two fields to be queried, obtaining the query word of each field in the query request and recording the logical relationship between the query words of the fields;

querying the centralized index table corresponding to each field to obtain the collector identifiers corresponding to the query word of that field;

filtering the collector identifiers obtained by the query, according to the logical relationship between the query words of the fields, to obtain the collector identifiers that satisfy the logical relationship.

The above instructions describe how the data query server finds the collector identifiers for query words in multiple fields, which avoids accessing data collectors that cannot fully satisfy the query request.
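
A minimal sketch of the multi-field filtering follows. The text does not enumerate the logical relationships it supports, so the example assumes just two, `"AND"` and `"OR"`, applied uniformly across all fields; the function name and index layout are likewise hypothetical.

```python
from typing import Dict, List, Set, Tuple


def select_collectors(
    central_index: Dict[str, Dict[str, Set[str]]],
    conditions: List[Tuple[str, str]],
    relation: str = "AND",
) -> Set[str]:
    """Return the collector identifiers that satisfy all (AND) or any (OR) conditions.

    conditions is a list of (field, query word) pairs taken from the query request.
    """
    id_sets = [
        set(central_index.get(field, {}).get(word, ()))
        for field, word in conditions
    ]
    if not id_sets:
        return set()
    if relation == "AND":
        return set.intersection(*id_sets)
    if relation == "OR":
        return set.union(*id_sets)
    raise ValueError(f"unsupported logical relationship: {relation}")
```

Under `"AND"`, a collector that holds only one of the query words is dropped before any query command is sent. A collector that survives this filter may still hold the words in different records, so in this sketch the per-record check is left to the collector.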

FIG. 8 is a schematic diagram of the data collector 20 provided in this embodiment. As shown in FIG. 8, the data collector 20 includes a network interface 81, a processor 82, and a memory 83. A system bus 84 connects the network interface 81, the processor 82, and the memory 83.

The network interface 81 is used to communicate with the data query server 10.

The memory 83 may be persistent storage, such as a hard disk drive or flash memory, and holds software modules and device drivers. The software modules implement the various functional modules of the above method of the present invention; the device drivers may be network and interface drivers.

At startup, these software components are loaded into the memory 83 and then accessed by the processor 82, which executes the following instructions:

receiving a query command sent by the data query server, the query command including the field to be queried and the query word in the field that were carried in the query request received by the data query server;

querying the local index table corresponding to the field to obtain the storage location of the data matching the query command, the local index table storing the correspondence between query words in the field and the storage locations of the data;

obtaining the data according to its storage location and sending it to the data query server.

The data collector of this embodiment uses the local index table of a field to find the data corresponding to the query word and provides it to the data query server. This can effectively reduce the system resources consumed on the data collector, leaving it more resources for collection, and increases the processing speed of data queries.
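
The collector-side lookup can be sketched as below. The example assumes, purely for illustration, that storage locations are integer keys into an in-memory dictionary of records; a real collector would more likely use file offsets or similar, which the text does not specify.

```python
from typing import Dict, List

# Assumed layout: field name -> query word -> storage locations of matching records.
LocalIndex = Dict[str, Dict[str, List[int]]]


def handle_command(
    local_index: LocalIndex,
    storage: Dict[int, str],
    command: dict,
) -> List[str]:
    """Resolve a query command to storage locations, then fetch the records there."""
    locations = local_index.get(command["field"], {}).get(command["word"], [])
    # Only the indexed locations are read; the rest of the storage is never scanned.
    return [storage[location] for location in locations]
```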

Further, after accessing the software components in the memory 83, the processor executes instructions for the following process:

establishing, for the field, the local index table corresponding to the field;

Establishing the local index table corresponding to the field includes:

obtaining the data in the current data collector and the storage location of the data, the data including the content of at least one field;

for each field, taking the content of the data in that field as the query word of the data, and establishing a mapping between the query word and the storage location to form the local index table of that field in the current data collector.

The above instructions describe how the data collector builds the local index table: it establishes a mapping between query words and the storage locations of data so that data can be retrieved from the storage location corresponding to a query word.
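
One way to picture this step is the sketch below, which builds one local index table per field from records that are already split into fields. The record representation (a dictionary of field name to content, keyed by an integer storage location) is an assumption for the example.

```python
from typing import Dict, List


def build_local_indexes(records: Dict[int, Dict[str, str]]) -> Dict[str, Dict[str, List[int]]]:
    """Build a local index table for every field found in the records.

    records maps a storage location to the record stored there, where each record
    maps a field name to that field's content. The content is used as the query word.
    """
    indexes: Dict[str, Dict[str, List[int]]] = {}
    for location, record in records.items():
        for field, content in record.items():
            indexes.setdefault(field, {}).setdefault(content, []).append(location)
    return indexes
```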

Further, after accessing the software components in the memory 83, the processor executes instructions for the following process:

extracting the query words from the local index table of the field and deduplicating them to form the reporting index table of the field for the current data collector;

sending the reporting index table of the field to the data query server, so that the data query server can establish the centralized index table corresponding to the field.

The above instructions describe how the data collector builds a reporting index table from the local index table and sends it to the data query server, so that the data query server can build the centralized index table.
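
The deduplication step might look like the following sketch, which assumes the local index table of a field is stored as a flat list of (query word, storage location) rows, so the same query word can appear many times. The function name and row format are hypothetical.

```python
from typing import Iterable, List, Tuple


def build_report_table(local_rows: Iterable[Tuple[str, int]]) -> List[str]:
    """Extract the query words from local index rows, keeping each word once.

    Storage locations are dropped: the server only needs to know which words
    this collector holds, not where the collector stores them.
    """
    seen = set()
    report: List[str] = []
    for word, _location in local_rows:
        if word not in seen:
            seen.add(word)
            report.append(word)
    return report
```

Because the reporting index table carries only distinct query words, its size grows with the number of distinct values in a field rather than with the number of records on the collector.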

Further, after accessing the software components in the memory 83, the processor executes instructions for the following process:

when the query command carries at least two fields to be queried, obtaining the query word of each field in the query command and recording the logical relationship between the query words of the fields;

querying the local index table corresponding to each field to obtain the storage locations of the data matching the query word of that field;

filtering the storage locations obtained by the query, according to the logical relationship between the query words of the fields, to obtain the storage locations of the data that satisfies the logical relationship.

The above instructions describe how the data collector finds the storage locations for query words in multiple fields, which avoids retrieving data that cannot fully satisfy the query command.

Those skilled in the art should further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of their functions. Whether these functions are performed in hardware or software depends on the specific application and the design constraints of the technical solution. Those skilled in the art may implement the described functions in different ways for each specific application, but such implementations should not be considered beyond the scope of the present invention.

The steps of the methods or algorithms described in connection with the embodiments disclosed herein may be implemented in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), main memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.

The specific embodiments described above further explain the objectives, technical solutions, and beneficial effects of the present invention in detail. It should be understood that the foregoing describes only specific embodiments of the present invention and is not intended to limit its scope of protection; any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (11)

1. A data query method, characterized in that the method comprises:
a data query server receives a query request input by a user, the query request carrying a field to be queried and a query word in the field;
the data query server queries the centralized index table corresponding to the field to obtain the collector identifier corresponding to the query word, the centralized index table storing the correspondence between query words in the field and collector identifiers;
the data query server generates, according to the query request, a query command carrying the field and the query word, and sends the query command to the data collector corresponding to the collector identifier, so that the data collector queries the local index table, in the data collector, corresponding to the field carried in the query command to obtain data matching the query word carried in the query command;
the data query server receives the data returned by the data collector, forms the query result of the query request from the received data, and outputs it;
wherein, before the data query server queries the centralized index table corresponding to the field to obtain the collector identifier corresponding to the query word, the method further comprises:
the data query server establishes, for the field, the centralized index table corresponding to the field;
establishing the centralized index table corresponding to the field comprises:
the data query server receives the reporting index table of the field sent by each data collector, the reporting index table including the query words, for the field, of the data in the data collector that sent the reporting index table;
the data query server stores, in the centralized index table of the field, the correspondence between the identifier of each data collector and the query words of the field in the reporting index table reported by that data collector.
2. The data query method according to claim 1, characterized in that the data query server querying the centralized index table corresponding to the field to obtain the collector identifier corresponding to the query word comprises:
if the query request carries at least two fields to be queried, the data query server obtains the query word of each field in the query request and records the logical relationship between the query words of the fields carried in the query request;
the data query server queries the centralized index table corresponding to each field to obtain the collector identifiers corresponding to the query word of that field;
the data query server filters the collector identifiers obtained by the query, according to the logical relationship between the query words of the fields, to obtain the collector identifiers that satisfy the logical relationship.
3. A data query method, characterized in that the method comprises:
a data collector receives a query command sent by a data query server, the query command carrying a field to be queried and a query word in the field;
the data collector queries the local index table corresponding to the field to obtain the storage location of the data matching the query word in the query command, the local index table storing the correspondence between query words in the field and the storage locations of the data;
the data collector obtains the data according to the storage location of the data and sends the data to the data query server;
wherein, before the data collector queries the local index table corresponding to the field to obtain the storage location of the data matching the query word in the query command, the method further comprises:
the data collector establishes, for the field, the local index table corresponding to the field;
establishing the local index table corresponding to the field comprises:
the data collector obtains the data in the current data collector and the storage location of the data, the data including the content of at least one field;
the data collector, for each field, takes the content of the data in that field as the query word of the data and stores, in the local index table of the field in the current data collector, the correspondence between the query word of the data and the storage location of the data.
4. The data query method according to claim 3, characterized in that, after the correspondence between the query word of the data and the storage location of the data is stored in the local index table of the field in the current data collector, the method further comprises:
the data collector extracts the query words from the local index table of the field and deduplicates them to form the reporting index table of the field for the current data collector;
the data collector sends the reporting index table of the field to the data query server, so that the data query server establishes the centralized index table corresponding to the field.
5. The data query method according to claim 3, characterized in that the data collector querying the local index table corresponding to the field to obtain the storage location of the data matching the query word in the query command comprises:
if the query command carries at least two fields to be queried, the data collector obtains the query word of each field in the query command and records the logical relationship between the query words of the fields carried in the query command;
the data collector queries the local index table corresponding to each field to obtain the storage locations of the data matching the query word of that field in the query command;
the data collector filters the storage locations obtained by the query, according to the logical relationship between the query words of the fields, to obtain the storage locations of the data that satisfies the logical relationship.
6. A data query server, characterized in that the data query server comprises:
a first receiving unit, configured to receive a query request input by a user, the query request carrying a field to be queried and a query word in the field;
a first query unit, configured to query the centralized index table corresponding to the field to obtain the collector identifier corresponding to the query word carried in the query request received by the first receiving unit, the centralized index table storing the correspondence between query words in the field and collector identifiers;
a first processing unit, configured to generate, according to the query request, a query command carrying the field and the query word, and to send the query command to the data collector corresponding to the collector identifier obtained by the first query unit, so that the data collector queries the local index table, in the data collector, corresponding to the field carried in the query command to obtain data matching the query word carried in the query command;
a first output unit, configured to receive the data returned by the data collector, form the query result of the query request from the received data, and output it;
the data query server further comprises:
a first index unit, configured to establish, for the field, the centralized index table corresponding to the field;
the first index unit comprises:
a first receiving subunit, configured to receive the reporting index table of the field sent by each data collector, the reporting index table including the query words, for the field, of the data in the data collector that sent the reporting index table;
a first index subunit, configured to store, in the centralized index table of the field, the correspondence between the identifier of each data collector and the query words of the field in the reporting index table reported by that data collector.
7. The data query server according to claim 6, characterized in that the first query unit comprises:
a first parsing subunit, configured to, if the query request received by the first receiving unit carries at least two fields to be queried, obtain the query word of each field in the query request and record the logical relationship between the query words of the fields carried in the query request;
a first query subunit, configured to query the centralized index table corresponding to each field to obtain the collector identifiers corresponding to the query word of each field obtained by the first parsing subunit;
a first filtering subunit, configured to filter the collector identifiers obtained by the first query subunit, according to the logical relationship between the query words of the fields obtained by the first parsing subunit, to obtain the collector identifiers that satisfy the logical relationship.
8. A data collector, characterized in that it comprises:
a second receiving unit, configured to receive a query command sent by a data query server, the query command carrying a field to be queried and a query word in the field;
a second query unit, configured to query the local index table corresponding to the field to obtain the storage location of the data matching the query word in the query command received by the second receiving unit, the local index table storing the correspondence between query words in the field and the storage locations of the data;
a second processing unit, configured to obtain the data according to the storage location of the data obtained by the second query unit and to send the data to the data query server;
the data collector further comprises:
a second index unit, configured to establish, for the field, the local index table corresponding to the field;
the second index unit comprises:
an obtaining subunit, configured to obtain the data in the current data collector and the storage location of the data, the data including the content of at least one field;
a second index subunit, configured to, for each field obtained by the obtaining subunit, take the content of the data in that field as the query word of the data and store, in the local index table of the field in the data collector, the correspondence between the query word of the data and the storage location of the data.
9. The data collector according to claim 8, characterized in that the second index unit further comprises:
a third index subunit, configured to extract the query words from the local index table of the field obtained by the second index subunit and deduplicate them to form the reporting index table of the field for the data collector;
a sending subunit, configured to send the reporting index table of the field formed by the third index subunit to the data query server, so that the data query server establishes the centralized index table of the field.
10. The data collector according to claim 8, characterized in that the second query unit comprises:
a second parsing subunit, configured to, if the query command received by the second receiving unit carries at least two fields to be queried, obtain the query word of each field in the query command and record the logical relationship between the query words of the fields carried in the query command;
a second query subunit, configured to query the local index table corresponding to each field to obtain the storage locations of the data matching the query word of each field in the query command obtained by the second parsing subunit;
a second filtering subunit, configured to filter the storage locations obtained by the second query subunit, according to the logical relationship between the query words of the fields obtained by the second parsing subunit, to obtain the storage locations of the data that satisfies the logical relationship.
11. A data query system, characterized in that the system comprises:
the data query server according to claim 6 or 7 and the data collector according to any one of claims 8 to 10.
CN201210566137.4A 2012-12-24 2012-12-24 Data query method and system Expired - Fee Related CN103064933B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201210566137.4A CN103064933B (en) 2012-12-24 2012-12-24 Data query method and system
PCT/CN2013/082130 WO2014101445A1 (en) 2012-12-24 2013-08-23 Data query method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210566137.4A CN103064933B (en) 2012-12-24 2012-12-24 Data query method and system

Publications (2)

Publication Number Publication Date
CN103064933A CN103064933A (en) 2013-04-24
CN103064933B true CN103064933B (en) 2016-06-29

Family

ID=48107563

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210566137.4A Expired - Fee Related CN103064933B (en) 2012-12-24 2012-12-24 Data query method and system

Country Status (2)

Country Link
CN (1) CN103064933B (en)
WO (1) WO2014101445A1 (en)

Families Citing this family (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103064933B (en) * 2012-12-24 2016-06-29 华为技术有限公司 Data query method and system
CN105099735B (en) * 2014-05-07 2018-05-22 中国移动通信集团福建有限公司 A kind of method and system for obtaining magnanimity more detailed logging
CN105302827B (en) * 2014-06-30 2018-11-20 华为技术有限公司 A kind of searching method and equipment of event
CN104216957A (en) * 2014-08-20 2014-12-17 北京奇艺世纪科技有限公司 Query system and query method for video metadata
CN104317924A (en) * 2014-10-30 2015-01-28 中国银行股份有限公司 Data query method and device in local clearings
CN105871951A (en) * 2015-01-21 2016-08-17 上海可鲁系统软件有限公司 Industrial internet of things distributed business voucher processing method
CN107015990B (en) * 2016-01-27 2020-06-09 阿里巴巴集团控股有限公司 Data searching method and device
CN105930441B (en) * 2016-04-18 2019-04-26 华信咨询设计研究院有限公司 A kind of radio monitoring data query method
CN106354823A (en) * 2016-08-30 2017-01-25 北京旷视科技有限公司 Method, device and system for summarizing face matching system operation data
CN107784050A (en) * 2016-12-14 2018-03-09 平安科技(深圳)有限公司 Log information lookup method and device
CN107066610A (en) * 2017-05-02 2017-08-18 中国联合网络通信集团有限公司 A kind of price queries method and apparatus
CN107577506B (en) * 2017-08-07 2021-03-19 台州市吉吉知识产权运营有限公司 Data preloading method and system
CN109299219B (en) * 2018-08-31 2022-08-12 北京奥星贝斯科技有限公司 Data query method and device, electronic equipment and computer readable storage medium
CN109308305B (en) * 2018-09-30 2021-06-08 广州圣亚科技有限公司 Monitoring data query method and device and computer equipment
CN109299348B (en) * 2018-11-28 2021-09-28 北京字节跳动网络技术有限公司 Data query method and device, electronic equipment and storage medium
CN109885548A (en) * 2019-02-22 2019-06-14 网易(杭州)网络有限公司 Log inquiring method, device, storage medium and electronic device
CN110502915B (en) * 2019-08-30 2021-07-30 恩亿科(北京)数据科技有限公司 Data processing method, device and system
CN110674369A (en) * 2019-09-23 2020-01-10 杭州迪普科技股份有限公司 Data query method and device
CN111062193B (en) * 2019-12-16 2023-04-25 医渡云(北京)技术有限公司 Medical data labeling method and device, storage medium and electronic equipment
CN113486048A (en) * 2021-07-13 2021-10-08 广西电力职业技术学院 Data retrieval system and data retrieval method
CN117425887A (en) * 2022-05-18 2024-01-19 京东方科技集团股份有限公司 Data processing method, device, electronic equipment and computer readable storage medium
CN116541354A (en) * 2023-05-10 2023-08-04 北京自如信息科技有限公司 Log retrieval method, device, computer equipment and storage medium
CN117271562B (en) * 2023-11-21 2024-01-19 成都凌亚科技有限公司 Data acquisition processing method and system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6144957A (en) * 1997-02-28 2000-11-07 Oracle Corporation Method and apparatus for using incompatible types of indexes to process a single query
CN102193917A (en) * 2010-03-01 2011-09-21 中国移动通信集团公司 Method and device for processing and querying data
CN102375853A (en) * 2010-08-24 2012-03-14 中国移动通信集团公司 Distributed database system, method for building index therein and query method
CN102789487A (en) * 2012-06-29 2012-11-21 用友软件股份有限公司 Data query and retrieval processing device and data query and retrieval processing method

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1968132B (en) * 2006-10-16 2010-04-21 华为技术有限公司 Method for establishing call log association between network entities and searching associated call log
CN103064933B (en) * 2012-12-24 2016-06-29 华为技术有限公司 Data query method and system

Also Published As

Publication number Publication date
WO2014101445A1 (en) 2014-07-03
CN103064933A (en) 2013-04-24

Similar Documents

Publication Publication Date Title
CN103064933B (en) Data query method and system
CN111241177B (en) Data collection method, system and network equipment
CN111258978B (en) Data storage method
CN106960037B (en) Distributed index resource integration and sharing method across intranet and extranet
CN104376053B (en) Storage and retrieval method based on massive meteorological data
CN113900810B (en) Distributed graph processing method, system and storage medium
CN101727465A (en) Methods for establishing and inquiring index of distributed column storage database, device and system thereof
CN111382146A (en) Data storage control method, device, equipment and storage medium
WO2014015488A1 (en) Method and apparatus for data storage and query
CN104408159A (en) Data correlating, loading and querying method and device
CN114398520B (en) Data retrieval method, system, device, electronic equipment and storage medium
CN110727727A (en) Statistical method and device for a database
CN105677903A (en) Data acquisition method and device as well as computer device
CN107622064A (en) A data reading method and system
CN105681327A (en) Automatic query method and system for firewall policy
CN109388651B (en) A data processing method and device
CN111694793A (en) Log storage method and device and log query method and device
CN114238473A (en) Metadata management method, device and electronic device
US9846703B2 (en) Page-based metadata system for distributed filesystem
CN102685222B (en) Cloud SRM device for electric power system
CN103810197A (en) Hadoop-based data processing method and system
CN111881086A (en) Big data storage method, big data query method, electronic device and storage medium
CN104424316A (en) Data storage method, data searching method, related device and system
CN106649678B (en) Data processing method and system
CN107800738A (en) Method and apparatus for data update

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20191210

Address after: 056000 Liu Han Xiang Liu Han Cun Xi, Yongnian District, Handan City, Hebei Province

Patentee after: Handan Duofei Fastener Manufacturing Co.,Ltd.

Address before: 510000 unit 2414-2416, building, No. five, No. 371, Tianhe District, Guangdong, China

Patentee before: GUANGDONG GAOHANG INTELLECTUAL PROPERTY OPERATION Co.,Ltd.

Effective date of registration: 20191210

Address after: 510000 unit 2414-2416, building, No. five, No. 371, Tianhe District, Guangdong, China

Patentee after: GUANGDONG GAOHANG INTELLECTUAL PROPERTY OPERATION Co.,Ltd.

Address before: 518129 Bantian HUAWEI headquarters office building, Longgang District, Guangdong, Shenzhen

Patentee before: HUAWEI TECHNOLOGIES Co.,Ltd.

TR01 Transfer of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20160629

CF01 Termination of patent right due to non-payment of annual fee