CN118567856A

CN118567856A - Asynchronous management method and device for database query task and computer equipment

Info

Publication number: CN118567856A
Application number: CN202410708441.0A
Authority: CN
Inventors: 王春龙; 杨波; 杨丰; 胡微微
Original assignee: Shenzhen Magic Competition Technology Co ltd
Current assignee: Shenzhen Magic Competition Technology Co ltd
Priority date: 2024-06-03
Filing date: 2024-06-03
Publication date: 2024-08-30
Anticipated expiration: 2044-06-03
Also published as: CN118567856B

Abstract

The present application relates to a method, device, computer equipment, computer-readable storage medium, and computer program product for asynchronous management of database query tasks. The method comprises: determining idle threads and, if the number of idle threads exceeds a preset thread threshold, selecting multiple target database query tasks from a preparatory task queue according to the number of idle threads; splitting each target database query task into multiple subtasks; for each of the multiple subtasks, treating that subtask as a shared subtask if other subtasks depend on it or if other subtasks are identical to it; and assigning the multiple target database query tasks and the shared subtasks to the idle threads for processing. The method completes all query tasks faster, improves query efficiency, and reduces the pressure on any single database node by distributing the query load.

Description

Database query task asynchronous management method, device and computer equipment

Technical Field

The present application relates to the field of computer technology, and in particular to a method, device, computer equipment, computer-readable storage medium, and computer program product for asynchronous management of database query tasks.

Background Art

With the rapid development of information technology, databases have become a key tool for enterprises and individuals to store and query data. As the data stored in databases grows larger and more complex, retrieving it efficiently and flexibly becomes a major challenge.

Application data is conventionally obtained by exposing a specific API with restricted input parameters. This approach has limitations. First, it offers little flexibility and cannot satisfy complex query requirements. Second, when large amounts of data and complex queries are involved, traditional synchronous querying often leads to low query efficiency and may even block the database.

Summary of the Invention

In view of the above technical problems, it is necessary to provide a method, device, computer equipment, computer-readable storage medium, and computer program product for asynchronous management of database query tasks that can improve query efficiency and relieve database pressure.

In a first aspect, the present application provides a method for asynchronous management of database query tasks, comprising:

determining idle threads and, if the number of idle threads exceeds a preset thread threshold, selecting multiple target database query tasks from a preparatory task queue according to the number of idle threads;

splitting each target database query task into multiple subtasks;

for each of the multiple subtasks, treating the targeted subtask as a shared subtask if it is determined that other subtasks depend on the targeted subtask, or that other subtasks are identical to the targeted subtask; and

assigning the multiple target database query tasks and the shared subtasks to the idle threads for processing.

In a second aspect, the present application further provides a device for asynchronous management of database query tasks, comprising:

a first determination module, configured to determine idle threads and, if the number of idle threads exceeds a preset thread threshold, select multiple target database query tasks from a preparatory task queue according to the number of idle threads;

a splitting module, configured to split each target database query task into multiple subtasks;

a second determination module, configured to, for each of the multiple subtasks, treat the targeted subtask as a shared subtask if it is determined that other subtasks depend on the targeted subtask, or that other subtasks are identical to the targeted subtask; and

an allocation module, configured to assign the multiple target database query tasks and the shared subtasks to the idle threads for processing.

In a third aspect, the present application further provides a computer device comprising a memory and a processor, wherein the memory stores a computer program, and the processor, when executing the computer program, implements the steps of the method of the first aspect.

In a fourth aspect, the present application further provides a computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the steps of the method of the first aspect.

In a fifth aspect, the present application further provides a computer program product comprising a computer program, wherein the computer program, when executed by a processor, implements the steps of the method of the first aspect.

In the above method, device, computer equipment, computer-readable storage medium, and computer program product for asynchronous management of database query tasks, idle threads are first determined and, if their number exceeds the preset thread threshold, multiple target database query tasks are selected from the preparatory task queue according to the number of idle threads. This makes full use of thread resources and avoids leaving them idle. Each target database query task is then split into multiple subtasks, which facilitates parallel processing: each subtask can run concurrently in a separate thread, further improving query efficiency. Next, for each subtask, if other subtasks depend on it or are identical to it, the subtask is treated as a shared subtask, which avoids unnecessary repeated computation and lowers the overall computation cost. Finally, the target database query tasks and the shared subtasks are assigned to the idle threads for processing, so that all threads are fully utilized and the parallel processing capability of the system is maximized. As a result, all query tasks complete faster and query efficiency improves. In addition, distributing the query load reduces the pressure on any single database node, allowing the database to serve multiple concurrent requests more smoothly and improving the overall stability of the system.

Brief Description of the Drawings

To illustrate the technical solutions in the embodiments of the present application or in the related art more clearly, the drawings used in describing the embodiments or the related art are briefly introduced below. The drawings described below show only some embodiments of the present application; a person of ordinary skill in the art can derive other related drawings from them without creative effort.

FIG. 1 is an application environment diagram of a method for asynchronous management of database query tasks in one embodiment;

FIG. 2 is a schematic flowchart of a method for asynchronous management of database query tasks in one embodiment;

FIG. 3 is a schematic flowchart of a method for asynchronous management of database query tasks in another embodiment;

FIG. 4 is a structural block diagram of a device for asynchronous management of database query tasks in one embodiment;

FIG. 5 is a structural block diagram of a device for asynchronous management of database query tasks in another embodiment;

FIG. 6 is a diagram of the internal structure of a computer device in one embodiment.

Detailed Description

To make the purpose, technical solutions, and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein only explain the present application and do not limit it.

The method for asynchronous management of database query tasks provided in the embodiments of the present application can be applied in the application environment shown in FIG. 1. The terminal 102 communicates with the server 104 over a network. A data storage system can store the data that the server 104 needs to process; it can be integrated on the server 104, or placed on the cloud or on another network server. The terminal 102 generates a database query task asynchronous management request and sends it to the server 104, so that the server 104 assigns multiple target database query tasks and shared subtasks to idle threads for processing. The terminal 102 can be, but is not limited to, a personal computer, laptop, smartphone, tablet computer, Internet of Things device, or portable wearable device. The Internet of Things device can be a smart speaker, smart TV, smart air conditioner, smart in-vehicle device, projection device, or the like. The portable wearable device can be a smart watch, smart bracelet, head-mounted device, or the like. The head-mounted device can be a virtual reality (VR) device, an augmented reality (AR) device, smart glasses, or the like. The server 104 can be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server that provides cloud computing services.

In an exemplary embodiment, as shown in FIG. 2, a method for asynchronous management of database query tasks is provided. The method is described by taking its application to the server 104 in FIG. 1 as an example, and includes the following steps 202 to 208:

Step 202: determine idle threads and, if the number of idle threads exceeds a preset thread threshold, select multiple target database query tasks from the preparatory task queue according to the number of idle threads.

Here, idle threads are threads in a multithreaded environment that have completed their previous task and are waiting for a new one. They carry no workload and can accept and start a new task at any time. The preset thread threshold is a preconfigured value that indicates how many idle threads are allowed to exist. When the number of idle threads is detected to exceed this threshold, a corresponding action is triggered, such as taking tasks from the preparatory task queue for processing. The preparatory task queue is a first-in-first-out (FIFO) queue or a priority queue that stores tasks waiting to be executed. These tasks have not yet been assigned to any thread; they wait in line, according to some policy (such as first come, first served, or priority), for a thread to take and execute them. Target database query tasks are pending database operation tasks that, for various reasons, have not yet been assigned to a thread, and that are selected for further processing only when certain conditions are met (such as enough idle threads being available). Optionally, a target database query task consists of multiple SQL scripts and queries data by way of temporary tables.

Specifically, the status of all threads in the multithreaded environment is first monitored in real time to identify which threads have completed their previously assigned tasks and are currently idle, that is, have no work in progress. When the number of idle threads is detected to exceed the preset threshold, thread resources are in surplus, and the next stage of task scheduling can be started to use them more efficiently. Finally, a corresponding number of target database query tasks is selected from the preparatory task queue according to the current number of idle threads. The preparatory task queue stores the database query tasks waiting to be executed, ordered according to a given policy (for example, first in, first out, or priority).
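Step 202 can be sketched as follows. This is a minimal illustration, not the implementation of the present application: the function name `select_target_tasks`, the representation of idle threads as a list, and the rule of taking one task per idle thread in FIFO order are all assumptions made for the example.

```python
from collections import deque


def select_target_tasks(idle_threads, threshold, pending):
    """Return tasks taken from `pending` when enough threads are idle.

    Assumption: one task is taken per idle thread, in FIFO order.
    """
    if len(idle_threads) <= threshold:
        return []  # not enough spare capacity; leave the queue untouched
    count = min(len(idle_threads), len(pending))
    return [pending.popleft() for _ in range(count)]


pending = deque(["q1", "q2", "q3", "q4", "q5"])
selected = select_target_tasks(["t1", "t2", "t3"], threshold=2, pending=pending)
print(selected)       # ['q1', 'q2', 'q3']
print(list(pending))  # ['q4', 'q5']
```

With three idle threads and a threshold of two, three tasks leave the queue; with two or fewer idle threads nothing is selected.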

Step 204: split each target database query task into multiple subtasks.

Here, a subtask is a part of a target database query task that represents one specific, independently executable operation of the original task. For example, in a complex database query task, the subtasks may include querying a particular table, preliminary filtering of data, and statistical calculation. Splitting turns a single large task into multiple smaller subtasks with clear logic.

Specifically, each target database query task selected from the preparatory task queue is divided into multiple finer-grained, logically independent subtasks. The reason is that a single large query task may contain multiple steps, some of which can be executed in parallel, and different large query tasks may contain identical subtasks or subtasks that depend on one another.
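One simple way to picture the split, assuming a task is a multi-statement SQL script, is to treat each statement as a subtask. The source does not specify the splitting rule; the function `split_into_subtasks`, the dictionary layout, and the naive split on `;` are assumptions for illustration only.

```python
def split_into_subtasks(task_id, script):
    """Split a multi-statement SQL script into one subtask per statement.

    Naive assumption: statements are separated by ';' and no ';' occurs
    inside string literals or comments.
    """
    statements = [part.strip() for part in script.split(";")]
    return [
        {"task": task_id, "index": i, "sql": sql}
        for i, sql in enumerate(s for s in statements if s)
    ]


script = """
CREATE TEMPORARY TABLE tmp_active AS SELECT id FROM users WHERE active = 1;
SELECT COUNT(*) FROM tmp_active;
"""
subtasks = split_into_subtasks("report-1", script)
for st in subtasks:
    print(st["index"], st["sql"])
```

A production splitter would need a real SQL parser; the sketch only shows the shape of the result, one record per independently executable statement.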

Step 206: for each of the multiple subtasks, if it is determined that other subtasks depend on the targeted subtask, or that other subtasks are identical to the targeted subtask, treat the targeted subtask as a shared subtask.

Here, shared subtasks are subtasks that, after the target database query tasks have been split, are identified as having a particular property: either they are prerequisites for the execution of other subtasks, or the operations they perform are wholly or partly the same as those of other subtasks. Such subtasks are defined as "shared subtasks".

Specifically, after the target database query tasks have been split into subtasks, the relationships among the subtasks are checked. If the execution of another subtask must rely on the result of the subtask currently being checked (for example, the data obtained by one subtask is the base data required by another subtask's calculation), a dependency exists between the two. At the same time, it is determined whether there are subtasks with the same or similar execution content, that is, multiple subtasks that need to query the same database records or execute the same calculation logic. In that case, even without a direct dependency, the result of executing one of them can be reused by the others, avoiding repeated computation and wasted resources. When either of these two situations is confirmed, the targeted subtask is marked as a "shared subtask". The result of a shared subtask therefore serves not only the subtask itself: it is also provided to the subtasks that depend on it, or used in place of subtasks with the same execution content, which improves execution efficiency and reduces computing overhead.
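The two conditions of step 206 can be sketched as a single pass over the subtasks. The data layout (a dictionary from subtask id to its SQL text and the ids it depends on) and the helper names `normalize` and `find_shared` are hypothetical, and treating two subtasks as "identical" when their SQL matches after whitespace normalization is an assumption for the example.

```python
from collections import Counter


def normalize(sql):
    """Collapse whitespace so trivially different spellings compare equal."""
    return " ".join(sql.split())


def find_shared(subtasks):
    """Return ids of subtasks that others depend on or that have duplicates."""
    depended_on = set()
    for st in subtasks.values():
        depended_on |= st["depends_on"]
    counts = Counter(normalize(st["sql"]) for st in subtasks.values())
    return {
        sid
        for sid, st in subtasks.items()
        if sid in depended_on or counts[normalize(st["sql"])] > 1
    }


subtasks = {
    "a1": {"sql": "SELECT id FROM users WHERE active = 1", "depends_on": set()},
    "a2": {"sql": "SELECT COUNT(*) FROM tmp_a1", "depends_on": {"a1"}},
    "b1": {"sql": "SELECT region,  SUM(amount) FROM orders GROUP BY region", "depends_on": set()},
    "c1": {"sql": "SELECT region, SUM(amount) FROM orders GROUP BY region", "depends_on": set()},
    "c2": {"sql": "SELECT name FROM products", "depends_on": set()},
}
shared = find_shared(subtasks)
print(sorted(shared))  # ['a1', 'b1', 'c1']
```

Here `a1` is shared because `a2` depends on it, while `b1` and `c1` are shared because they run the same query.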

Step 208: assign the multiple target database query tasks and the shared subtasks to the idle threads for processing.

Specifically, both independent subtasks and shared subtasks are distributed evenly and efficiently among the idle threads according to their execution dependencies and resource requirements. This exploits parallel computing so that multiple query requests are processed simultaneously and queries complete as quickly as possible. Reasonable task allocation and sharing of computed results also relieve pressure on the database and maintain stability and performance under high concurrency.
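A minimal sketch of sharing results across threads, assuming Python's standard `concurrent.futures.ThreadPoolExecutor` as the thread pool: the first request for a shared subtask submits it, and later requests receive the same future instead of running the query again. The class `SharedResultCache` and the stand-in function `run_query` are hypothetical names introduced for illustration; the source does not describe the allocation mechanism at this level.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SharedResultCache:
    """Run each distinct key at most once and share its future."""

    def __init__(self, executor):
        self._executor = executor
        self._futures = {}
        self._lock = threading.Lock()

    def submit(self, key, fn, *args):
        with self._lock:
            if key not in self._futures:
                self._futures[key] = self._executor.submit(fn, *args)
            return self._futures[key]


calls = []


def run_query(sql):
    calls.append(sql)  # stand-in for a real database call
    return f"rows for {sql}"


with ThreadPoolExecutor(max_workers=4) as pool:
    cache = SharedResultCache(pool)
    futures = [cache.submit(sql, run_query, sql) for sql in ["Q1", "Q2", "Q1"]]
    results = [f.result() for f in futures]

print(results)     # ['rows for Q1', 'rows for Q2', 'rows for Q1']
print(len(calls))  # 2
```

Three requests are answered with only two executions, which is the effect the shared-subtask mechanism aims for.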

In one embodiment, a target screening number of database query tasks is determined according to the number of idle threads; the execution status information of each database query task in the preparatory task queue is obtained, and the database query tasks whose execution status is "not yet executed" are determined to be candidate database query tasks; and the target screening number of target database query tasks is selected from the candidate database query tasks.

Here, execution status information indicates the working stage or completion state that a database query task is currently in. The present application does not limit the content of the execution status information, which can be set according to actual needs. Optionally, the execution status includes, but is not limited to, not yet executed, executing, completed, failed, and canceled.

Specifically, the number of idle threads in the current environment is first monitored in real time. Based on a preset thread management policy, the current number of idle threads determines how many database query tasks can be processed simultaneously. For example, if the policy requires that some idle threads be kept in reserve, or that query tasks be allocated at a fixed ratio (such as 70% of the number of idle threads), the target screening number is calculated according to that rule.
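The calculation can be sketched as below, using the 70% ratio mentioned above as the default. The function name `target_screening_number`, the `reserve` parameter, rounding down, and the way the reserve and the ratio are combined are assumptions; integer arithmetic is used to avoid floating-point rounding surprises.

```python
def target_screening_number(idle_count, percent=70, reserve=0):
    """Number of tasks to select, given the idle thread count.

    Assumptions: `reserve` idle threads are held back first, the remainder
    is scaled by `percent`, and the result is rounded down.
    """
    usable = max(idle_count - reserve, 0)
    return usable * percent // 100


print(target_screening_number(10))             # 7
print(target_screening_number(10, reserve=2))  # 5
print(target_screening_number(1))              # 0
```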

Next, the preparatory task queue, which stores the list of database query tasks to be processed, is traversed, and the execution status information of each query task is obtained one by one. The execution status information includes, but is not limited to, whether execution has started, whether it has completed, and whether it has failed.

Then, once the execution status of each query task has been obtained, the tasks whose status is "not yet executed" are filtered out. These tasks are the candidate database query tasks: they are not yet occupied by any thread and can be assigned to idle threads for processing.

Finally, the calculated target screening number of tasks is selected from the candidate database query tasks. The selection can follow the order in which tasks arrived in the queue (FIFO, first in, first out), or a ranking based on factors such as task priority and estimated execution time. Once selected, these tasks are marked as about to be executed and are ready to be taken over by idle threads.
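The filtering and marking described above can be sketched as follows. The `Status` enum mirrors the optional states listed in this embodiment; the extra `SCHEDULED` state used to mark selected tasks, the dictionary layout, and the function name `screen_targets` are assumptions added for the example.

```python
from enum import Enum


class Status(Enum):
    NOT_EXECUTED = "not yet executed"
    SCHEDULED = "scheduled"  # hypothetical marker for selected tasks
    EXECUTING = "executing"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELED = "canceled"


def screen_targets(queue, target_count):
    """Pick up to `target_count` not-yet-executed tasks in queue order."""
    candidates = [t for t in queue if t["status"] is Status.NOT_EXECUTED]
    targets = candidates[:target_count]
    for task in targets:
        task["status"] = Status.SCHEDULED  # mark as about to be executed
    return targets


queue = [
    {"id": "q1", "status": Status.EXECUTING},
    {"id": "q2", "status": Status.NOT_EXECUTED},
    {"id": "q3", "status": Status.FAILED},
    {"id": "q4", "status": Status.NOT_EXECUTED},
    {"id": "q5", "status": Status.NOT_EXECUTED},
]
targets = screen_targets(queue, 2)
print([t["id"] for t in targets])  # ['q2', 'q4']
```

Marking the selected tasks prevents a later scheduling pass from picking them again before a thread has taken them over.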

Because the execution status of each task is explicit and controllable, adjustments and optimizations are easier to make when capacity is expanded or the resource allocation policy changes, which benefits long-term operation, maintenance, and performance improvement. More tasks are processed when idle thread resources are plentiful, and fewer when resources are tight, so the method scales elastically and adapts to different load scenarios.

In one embodiment, the enqueue time of each candidate database query task in the preparatory task queue is obtained, and the target screening number of target database query tasks is selected from the candidate database query tasks in order of enqueue time from earliest to latest; or the query priority of each candidate database query task in the preparatory task queue is obtained, and the target screening number of target database query tasks is selected from the candidate database query tasks in order of query priority from highest to lowest.

Here, the enqueue time is the specific point in time at which a candidate database query task entered the preparatory task queue. Selecting tasks from the earliest enqueue time to the latest follows the first-in-first-out principle: tasks that joined the queue earlier are assigned to idle threads first. This treats all tasks fairly and ensures that tasks are processed in the order in which they requested service. The query priority is a weight assigned to a database query task to indicate its importance or urgency. When tasks are selected from the highest priority to the lowest, high-priority tasks are assigned to idle threads first. This strategy allows flexible handling and lets more urgent or important query requests be resolved first, which helps optimize overall service quality and resource utilization.

Specifically, the present application includes at least two task scheduling strategies. In the first strategy, the time at which each candidate database query task entered the preparatory task queue (its enqueue time) is recorded. All candidate tasks are then sorted by enqueue time, earliest first and latest last. Next, the number of target database query tasks to be selected in this round is determined from the current number of idle threads and the configured screening rule. Finally, tasks are taken from the front of the sorted candidate list until the target screening number is reached. The tasks that entered the queue earliest are therefore selected first, following the first-in-first-out principle.

In the second strategy, a query priority attribute reflecting urgency or importance is set for each candidate database query task in the preparatory task queue. All candidate tasks are then sorted by query priority, highest first and lowest last. Next, the number of target database query tasks to be selected is likewise determined from the idle thread resources and the configured screening rule. Finally, tasks are taken from the priority-sorted candidate list from highest to lowest until the target screening number is reached. The highest-priority query tasks are therefore selected first and assigned to idle threads.

Either strategy ensures that, whenever idle threads are available, database query tasks are allocated reasonably and efficiently according to a defined rule. The first strategy emphasizes fair queuing; the second emphasizes urgency and importance.
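The two strategies differ only in the sort key, as the following sketch shows. The field names `enqueued_at` and `priority`, the convention that a larger number means a higher priority, and breaking priority ties by enqueue time are assumptions for the example.

```python
def select_fifo(candidates, n):
    """First strategy: earliest enqueue time first."""
    return sorted(candidates, key=lambda t: t["enqueued_at"])[:n]


def select_by_priority(candidates, n):
    """Second strategy: highest priority first, ties broken by enqueue time."""
    return sorted(candidates, key=lambda t: (-t["priority"], t["enqueued_at"]))[:n]


candidates = [
    {"id": "q1", "enqueued_at": 30, "priority": 1},
    {"id": "q2", "enqueued_at": 10, "priority": 5},
    {"id": "q3", "enqueued_at": 20, "priority": 9},
]
fifo_ids = [t["id"] for t in select_fifo(candidates, 2)]
priority_ids = [t["id"] for t in select_by_priority(candidates, 2)]
print(fifo_ids)      # ['q2', 'q3']
print(priority_ids)  # ['q3', 'q2']
```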

In one embodiment, the targeted subtask is matched against each of the other subtasks on statement parameters to obtain a parameter matching result; the targeted subtask is matched against each of the other subtasks on data range to obtain a range matching result; the targeted subtask is matched against each of the other subtasks on output format to obtain a format matching result; and when it is determined from the parameter matching result, the range matching result, and the format matching result that other subtasks depend on the targeted subtask, or that other subtasks are identical to the targeted subtask, the targeted subtask is treated as a shared subtask.

Here, statement parameters are the variables, placeholders, or dynamic values used in a query statement. They make queries more flexible and general, so that the same query can adapt to different query conditions or data filtering requirements. Optionally, statement parameters include, but are not limited to, bind variables, query parameters, and placeholders in query templates.

The data range of a query task is the set of restrictions on the data involved in the query operation, which determines which records are retrieved from the database. Optionally, the data range includes, but is not limited to, logical conditions as well as paging and offset.

The output format is the presentation and structure of the query result. Optionally, the output format includes, but is not limited to, column selection and order, data types and formats, sorting rules, grouping and aggregation, and the number of rows and columns in the result set.

Specifically, the targeted subtask to be analyzed is identified first. It is a specific task unit in the overall process, with its own execution logic, input parameters, data processing range, and output format.

The statement parameters of the targeted subtask are then collected, including but not limited to variable names, parameter values, parameter types, and parameter attributes. All other subtasks are traversed and their statement parameters are collected as well. The statement parameters of the targeted subtask are compared one by one with those of each other subtask to determine how well they match in name, value, type, and attribute. If the parameters are completely consistent, or are compatible (that is, they can be made to match through conversion), the two subtasks are considered to match on statement parameters. The parameter matching result for each other subtask against the targeted subtask is recorded, such as whether the match succeeded and the degree of match.
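A minimal sketch of the parameter comparison, assuming each subtask's statement parameters are available as a name-to-value dictionary. The function `match_parameters`, the three result labels, and the choice of string conversion as the "compatible through conversion" test are assumptions; the source does not define the conversion rules.

```python
def match_parameters(target, other):
    """Compare two parameter dictionaries.

    Returns "exact" when names, values, and types agree, "compatible" when
    names agree and values agree after conversion to str, else "mismatch".
    """
    if target.keys() != other.keys():
        return "mismatch"
    if all(
        type(target[k]) is type(other[k]) and target[k] == other[k]
        for k in target
    ):
        return "exact"
    if all(str(target[k]) == str(other[k]) for k in target):
        return "compatible"
    return "mismatch"


target = {"user_id": 10, "status": "paid"}
print(match_parameters(target, {"user_id": 10, "status": "paid"}))    # exact
print(match_parameters(target, {"user_id": "10", "status": "paid"}))  # compatible
print(match_parameters(target, {"user_id": 11, "status": "paid"}))    # mismatch
```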

Second, the data processing range of the targeted subtask is determined, including but not limited to data set size, record count, start and end indexes, numerical range, time range, and spatial range. All other subtasks are likewise traversed and their data processing ranges are collected. The data range of the targeted subtask is compared with that of each other subtask to determine whether the two overlap or are identical. If the data range of a subtask is entirely contained within that of the targeted subtask, or the two process the same range of data, the two subtasks are considered to match in data range. The data range matching result is recorded for each other subtask.
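As a sketch of the range comparison, the following assumes that a data range can be reduced to a half-open interval of record indexes; `Range`, `range_matches`, and `overlaps` are illustrative names introduced here.

```python
from typing import NamedTuple


class Range(NamedTuple):
    start: int  # inclusive
    end: int    # exclusive


def range_matches(target: Range, other: Range) -> bool:
    """True when `other` is identical to, or fully contained in, `target`."""
    return target.start <= other.start and other.end <= target.end


def overlaps(a: Range, b: Range) -> bool:
    """True when the two half-open ranges share at least one element."""
    return a.start < b.end and b.start < a.end
```

Time ranges or ID lists would need their own comparison, but the containment test keeps the same shape.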

Third, the output format of the targeted subtask is defined, including but not limited to structured data formats (such as CSV, JSON, and XML), unstructured data formats (such as text, images, audio, and video), metadata specifications, and API return formats. The output format information of all other subtasks is collected. The output format of the targeted subtask is compared with that of each other subtask to determine whether the two are the same or can be made compatible through a standardized conversion. If two subtasks produce results that follow the same data model, encoding rules, and file standards, or results that can substitute for each other, the two subtasks are considered to match in output format. The output format matching result is recorded for each other subtask.
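A minimal sketch of the format comparison follows. It assumes an output format can be described by its columns, sort order, and serialization, and that two formats differing only in serialization count as "compatible through standardized conversion"; both the `OutputFormat` fields and that rule are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OutputFormat:
    columns: tuple[str, ...]   # selected columns, in order
    order_by: tuple[str, ...]  # sorting rule
    encoding: str = "json"     # serialization of the result set


def format_matches(target: OutputFormat, other: OutputFormat) -> str:
    """Return 'identical', 'compatible', or 'different'."""
    if target == other:
        return "identical"
    # Assumed rule: same columns and sort order, only the encoding differs.
    if (target.columns, target.order_by) == (other.columns, other.order_by):
        return "compatible"
    return "different"
```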

Next, each other subtask is evaluated comprehensively on the basis of all the matching results recorded in the three preceding comparisons (statement parameters, data range, and output format). If another subtask satisfies the requirements for parameter matching, range matching, and format matching (that is, it matches the targeted subtask closely or exactly), or if its execution logic clearly depends on the output of the targeted subtask (that is, its input parameters, data range, and output format match the output of the targeted subtask), that other subtask is considered to depend on the targeted subtask or to be the same as it. All other subtasks that meet these conditions are marked as "shared subtasks".
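The comprehensive evaluation can be sketched as a simple predicate over the recorded results. The `MatchRecord` structure and the `consumes_target_output` flag are hypothetical; the text does not specify how dependence on the targeted subtask's output is detected.

```python
from dataclasses import dataclass


@dataclass
class MatchRecord:
    params: bool          # statement parameters match
    data_range: bool      # data ranges match
    output_format: bool   # output formats match
    consumes_target_output: bool = False  # other subtask reads the target's output


def is_related(record: MatchRecord) -> bool:
    """Same subtask (all three match) or dependent on the targeted subtask."""
    same = record.params and record.data_range and record.output_format
    return same or record.consumes_target_output


def find_related(records: dict[str, MatchRecord]) -> list[str]:
    """Return the ids of the other subtasks that qualify, in input order."""
    return [sid for sid, rec in records.items() if is_related(rec)]
```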

Finally, based on the identified shared subtasks, operations such as task optimization, resource sharing, code reuse, and logic merging are performed to improve efficiency, reduce redundant computation, simplify maintenance, and maximize task coordination and resource utilization.

Once shared subtasks have been identified, the common code logic can be abstracted into independent modules or services that multiple tasks reuse. This reduces the amount and complexity of code, simplifies maintenance, and improves code quality and readability. Unified processing logic also helps ensure data consistency and reduces the potential errors caused by duplicated code.

In one embodiment, a first target subtask set is screened out from the other subtasks, the first target subtask set including the other subtasks that, according to the parameter matching result, are consistent with the targeted subtask in statement parameters; a second target subtask set is screened out from the other subtasks, the second target subtask set including the other subtasks that, according to the range matching result, are consistent with the targeted subtask in data range; a third target subtask set is screened out from the other subtasks, the third target subtask set including the other subtasks that, according to the format matching result, are consistent with the targeted subtask in output format; and when there is another subtask that is present in the first, second, and third target subtask sets at the same time, the targeted subtask is taken as a shared subtask.

Specifically, the task instance that is currently being analyzed as a possible shared subtask is identified first. All related subtasks other than the targeted subtask are obtained to form an "other subtasks" set. The key statement parameters that affect subtask execution are identified, such as query conditions, operation parameters, and function call parameters. The statement parameters of the targeted subtask are compared item by item with those of each subtask in the "other subtasks" set to determine whether they are fully consistent. The other subtasks whose statement parameters are fully consistent with those of the targeted subtask are selected to form the "first target subtask set".

Second, the factors that affect the data range processed by a subtask are identified, such as time interval, geographic region, and lists of specific record IDs. The data range of the targeted subtask is compared item by item with that of each subtask in the "first target subtask set" to determine whether they are fully consistent. The subtasks whose data range is fully consistent with that of the targeted subtask are selected from the "first target subtask set" to form the "second target subtask set".

Third, the format requirements of the subtask output are identified, such as data structure, file type, encoding specification, and report style. The output format of the targeted subtask is compared item by item with that of each subtask in the "second target subtask set" to determine whether they are fully consistent. The subtasks whose output format is fully consistent with that of the targeted subtask are selected from the "second target subtask set" to form the "third target subtask set".

Finally, the subtasks in the "third target subtask set" are checked to confirm that they are present in the "first target subtask set", the "second target subtask set", and the "third target subtask set" at the same time. If such subtasks exist, that is, subtasks consistent with the targeted subtask in statement parameters, data range, and output format, the targeted subtask is determined to be a shared subtask.

This embodiment identifies the other subtasks that are consistent with the targeted subtask in all three dimensions (statement parameters, data range, and output format) and uses them to decide whether the targeted subtask qualifies as a shared subtask. Because a subtask is treated as shared only when these key attributes are fully consistent, the method accurately identifies the parts that can genuinely be reused and processed collaboratively, which makes effective use of resources and improves performance.
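The three-set test of this embodiment can be sketched as a set intersection. The `Subtask` fields below are a simplified, hypothetical representation; each set is built independently from the other subtasks, as in the embodiment's own wording.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subtask:
    sid: str
    params: tuple      # statement parameters
    data_range: tuple  # e.g. (start, end)
    out_format: str


def is_shared_by_identity(target: Subtask, others: list[Subtask]) -> bool:
    """True when some other subtask agrees with `target` on all three attributes."""
    first = {o.sid for o in others if o.params == target.params}
    second = {o.sid for o in others if o.data_range == target.data_range}
    third = {o.sid for o in others if o.out_format == target.out_format}
    return bool(first & second & third)
```

Filtering successively (second set drawn from the first, third from the second) gives the same final answer, since the last set then already equals the intersection.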

In one embodiment, a fourth target subtask set is screened out from the other subtasks, the fourth target subtask set including the other subtasks that, according to the parameter matching result, are partially consistent with the targeted subtask in statement parameters; a fifth target subtask set is screened out from the fourth target subtask set, the fifth target subtask set including the other subtasks that, according to the range matching result, are partially consistent with the targeted subtask in data range; a sixth target subtask set is screened out from the fifth target subtask set, the sixth target subtask set including the other subtasks that, according to the format matching result, are partially consistent with the targeted subtask in output format; and the other subtasks in the sixth target subtask set are determined to depend on the targeted subtask, and the targeted subtask is taken as a shared subtask.

Specifically, the task instance that is currently being analyzed as a possible shared subtask is identified first. All related subtasks other than the targeted subtask are obtained to form an "other subtasks" set. The key statement parameters that affect subtask execution are identified, such as query conditions, operation parameters, and function call parameters. The statement parameters of the targeted subtask are compared item by item with those of each subtask in the "other subtasks" set to determine whether any parameters are partially consistent (that is, at least some parameters are the same, though not necessarily all of them). The other subtasks whose statement parameters are partially consistent with those of the targeted subtask are selected to form the "fourth target subtask set".

Second, the factors that affect the data range processed by a subtask are identified, such as time interval, geographic region, and lists of specific record IDs. The data range of the targeted subtask is compared item by item with that of each subtask in the "fourth target subtask set" to determine whether the ranges are partially consistent (that is, at least part of the data range is the same, though not necessarily all of it). The subtasks whose data range is partially consistent with that of the targeted subtask are selected from the "fourth target subtask set" to form the "fifth target subtask set".

Third, the format requirements of the subtask output are identified, such as data structure, file type, encoding specification, and report style. The output format of the targeted subtask is compared item by item with that of each subtask in the "fifth target subtask set" to determine whether the formats are partially consistent (that is, at least some format attributes are the same, though not necessarily all of them). The subtasks whose output format is partially consistent with that of the targeted subtask are selected from the "fifth target subtask set" to form the "sixth target subtask set".

Finally, the other subtasks in the "sixth target subtask set" are determined to depend on the targeted subtask: their partial consistency with the targeted subtask in statement parameters, data range, and output format indicates that they rely to some extent on its processing result or on part of its logic. Because other subtasks depend on it, the targeted subtask is taken as a shared subtask. Its processing result or part of its logic can then be reused by those other subtasks, which reduces repeated computation and improves resource utilization.

This embodiment screens out the other subtasks that are partially consistent with the targeted subtask in statement parameters, data range, and output format, determines that they depend on the targeted subtask, and accordingly takes the targeted subtask as a shared subtask. The method allows a degree of flexibility: potential reuse opportunities can be identified even when subtasks are not fully consistent, which helps to uncover further optimization potential in the system.
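The cascaded partial-match filter can be sketched as three successive list filters. The dictionary layout and the two helper predicates are assumptions made for illustration: "partially consistent" is read as "at least one equal key-value pair" for parameters and formats, and as interval overlap for data ranges.

```python
def partially_equal(a: dict, b: dict) -> bool:
    """True when at least one key has the same value in both mappings."""
    return any(k in b and b[k] == v for k, v in a.items())


def ranges_overlap(a: tuple[int, int], b: tuple[int, int]) -> bool:
    """True when two half-open (start, end) ranges share at least one element."""
    return a[0] < b[1] and b[0] < a[1]


def find_dependents(target: dict, others: list[dict]) -> list[dict]:
    """Return the 'sixth target subtask set' for `target`."""
    fourth = [o for o in others if partially_equal(target["params"], o["params"])]
    fifth = [o for o in fourth if ranges_overlap(target["range"], o["range"])]
    sixth = [o for o in fifth if partially_equal(target["format"], o["format"])]
    return sixth
```

If `find_dependents` returns a non-empty list, the targeted subtask would be taken as a shared subtask under this embodiment.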

In one embodiment, for each shared subtask, the target database query task to which the shared subtask belonged before being split out is determined; the shared subtask is marked within that target database query task to obtain a marked task; and for each marked database query task, the marked task and the shared subtasks that belonged to it before being split out are assigned to the same idle thread.

Here, task marking refers to the operation of identifying and distinguishing a part of the original database query task, namely a shared subtask. When a complex database query task is split into multiple subtasks that execute in parallel, managing and scheduling those subtasks later requires first establishing which original query task each subtask belonged to before the split, as well as its position and role in that task.

Specifically, for each shared subtask to be processed, the target database query task to which it belonged before being split out is determined first. For example, if a large query task is split into several subqueries, the position and role of each subquery in the original task must be known. Once the original query task of the shared subtask has been determined, the subtask is marked to form a "marked task". This step may include setting a specific identifier, tag, or attribute on the subtask to record its relationship to the original query task, which facilitates later scheduling and management. After all shared subtasks have been marked, the result is a series of subtasks carrying specific marks (that is, marked tasks). These marked tasks contain not only the content of the subtasks themselves but also their context within the original query task. Finally, for each marked database query task (that is, each marked subtask), an idle thread is found, and the marked task is assigned to that thread together with the other shared subtasks of the original query task to which it belonged before the split.
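The marking and same-thread assignment can be sketched as follows. `QueryTask`, the `shared` mapping, and the round-robin choice of thread are assumptions for illustration; the text only requires that a marked task and its shared subtasks land on the same idle thread.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class QueryTask:
    task_id: str
    subtask_ids: list[str]
    marked: set[str] = field(default_factory=set)  # shared subtasks flagged in this task


def mark_and_assign(tasks: list[QueryTask], shared: dict[str, str], threads: list[str]):
    """`shared` maps a shared subtask id to the id of the task it was split from."""
    by_id = {t.task_id: t for t in tasks}
    for sub_id, parent_id in shared.items():
        by_id[parent_id].marked.add(sub_id)       # turns the parent into a marked task

    plan = defaultdict(list)
    marked_tasks = [t for t in tasks if t.marked]
    for i, task in enumerate(marked_tasks):
        thread = threads[i % len(threads)]        # round-robin is an assumption
        plan[thread].append(task.task_id)
        plan[thread].extend(sorted(task.marked))  # shared subtasks follow their parent
    return dict(plan)
```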

Through reasonable task splitting, marking, and thread scheduling, this embodiment improves query efficiency and enhances the stability and scalability of the system. Each subtask can be optimized independently according to its characteristics and, when necessary, recombined according to the marking information, so that changes in query requirements across different scenarios can be handled flexibly.

In one embodiment, for each idle thread, the assigned shared subtask is processed by the idle thread to obtain the shared subresult corresponding to that shared subtask; the assigned marked task is processed by the idle thread, and when the idle thread reaches a shared subtask within the assigned marked task, the shared subresult of the shared subtask that belonged to the marked task before being split out is used as the processing result of that shared subtask within the marked task, after which the idle thread continues processing the marked database query task.

Here, a shared subresult is the intermediate result produced by a shared subtask after it has been processed independently. Because the overall database query task is split into multiple shared subtasks, each of which runs on its own idle thread and completes part of the computation, the result produced at that point is the "shared subresult".

Specifically, all available idle thread resources are detected first, and a large or complex database query task is decomposed into multiple smaller shared subtasks that can be processed independently. Second, each identified idle thread is assigned one or more shared subtasks. The idle thread begins executing its assigned shared subtasks, for example by retrieving or computing over a specific range of data. When processing is complete, each idle thread has produced the result of the shared subtask it processed, that is, the "shared subresult". In some cases there are also marked tasks, which themselves contain a series of related shared subtasks. When an idle thread encounters such a marked task, it processes the shared subtasks the task contains. When the idle thread reaches a shared subtask inside the marked task, and that subtask has already been split out and processed, a computed shared subresult already exists for it. The idle thread then fetches the earlier result of that shared subtask, which was split out of the current marked task, and uses it directly as the processing result of the current subtask. Finally, having obtained the result of the shared subtask, the idle thread continues with the remainder of the marked task, including but not limited to other unprocessed shared subtasks and any integration or aggregation work, until the entire marked database query task is complete.
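The reuse of shared subresults on one thread can be sketched as a lookup before execution. The function below is a simplified illustration; `execute` stands in for whatever actually runs a subtask against the database and is a hypothetical callable.

```python
def run_on_thread(marked_task, shared_subtasks, execute):
    """Process one idle thread's assignment.

    marked_task: ordered list of subtask ids making up the marked task
    shared_subtasks: ids of the shared subtasks assigned to the same thread
    execute: callable that runs a single subtask and returns its result
    """
    # Run the shared subtasks first and keep their shared subresults.
    shared_results = {sid: execute(sid) for sid in shared_subtasks}

    results = []
    for sid in marked_task:
        if sid in shared_results:
            results.append(shared_results[sid])  # reuse, do not recompute
        else:
            results.append(execute(sid))
    return results
```

Because the marked task and its shared subtasks run on the same thread, the lookup table needs no locking in this sketch; sharing results across threads would require a synchronized store.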

Even though subtasks execute independently on different threads, marking and shared subresults ensure that the results of the individual subtasks are integrated correctly into a complete query result. The approach is well suited to queries over large data volumes and of high complexity, and can provide significant performance gains for real-time applications that need fast responses and for large-scale data analysis.

In one embodiment, after the idle threads have finished processing the multiple target database query tasks and the shared subtasks, the execution status of those target database query tasks in the preparatory task queue is switched from "not yet executed" to "cancelled"; a supplementary task count is determined according to the number of database query tasks with the cancelled status in the preparatory task queue; and the received database query tasks, equal in number to the supplementary task count, are stored in the preparatory task queue.

Here, the supplementary task count is the number of new database query tasks that need to be added to the preparatory task queue once certain conditions are met.

Specifically, idle threads are detected first. The idle threads take multiple target database query tasks and the related shared subtasks from the preparatory task queue for processing, and execute them one by one until all target database query tasks and their shared subtasks have been processed successfully.

Then, once all tasks assigned to the idle threads have finished executing, the status of those completed tasks in the preparatory task queue is switched from "not yet executed" to "cancelled", indicating that they do not need to be executed again. For target database query tasks and shared subtasks that have finished executing, a data management period can be set so that they are destroyed automatically on expiry to release memory. For subtasks that have never been executed, a data management period can likewise be set so that they are cancelled automatically on expiry; the dependent tasks associated with such a subtask are cancelled as well, and an alert is sent through a webhook (an HTTP-based real-time notification mechanism).

Next, the number of tasks in the "cancelled" state in the preparatory task queue is checked. This number, together with a preset rule or algorithm, determines how many new database query tasks need to be added so that the queue holds enough tasks for subsequent idle threads to process.

Finally, database query tasks equal in number to the supplementary task count are received from external input or other task generation channels. These new tasks are added to the preparatory task queue one by one, ready to be executed by the next batch of idle threads.

Updating task status in real time (for example, from "not yet executed" to "cancelled") makes it possible to track the execution of each task clearly, which facilitates monitoring and troubleshooting. Cancelling tasks that have already been executed prevents repeated execution and saves computing resources and database access overhead. Determining the supplementary task count dynamically from the cancelled tasks in the preparatory task queue ensures that the queue always holds a reserve of tasks, which keeps the system running continuously and responsively. Automated task replenishment also strengthens the system's ability to recover and scale, and lets it adapt quickly when facing a large number of short-lived tasks.
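The status switch and replenishment can be sketched as below. The rule "one new task per cancelled entry" is an assumption; the text leaves the rule to a preset policy. `Status`, `QueuedTask`, `finish_and_count_refill`, and `refill` are hypothetical names.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PENDING = "not yet executed"
    CANCELLED = "cancelled"


@dataclass
class QueuedTask:
    task_id: str
    status: Status = Status.PENDING


def finish_and_count_refill(queue: list[QueuedTask], finished_ids: set[str]) -> int:
    """Mark finished tasks as cancelled and return the supplementary task count."""
    for task in queue:
        if task.task_id in finished_ids:
            task.status = Status.CANCELLED
    # Assumed rule: one new task per cancelled entry in the queue.
    return sum(1 for t in queue if t.status is Status.CANCELLED)


def refill(queue: list[QueuedTask], incoming: list[QueuedTask], count: int) -> list[QueuedTask]:
    """Append at most `count` incoming tasks to the queue and return those accepted."""
    accepted = incoming[:count]
    queue.extend(accepted)
    return accepted
```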

In an exemplary embodiment, as shown in FIG. 3, the method includes steps 302 to 312, wherein:

Step 302: determine the idle threads; if the number of idle threads exceeds a preset thread threshold, determine the target screening number of database query tasks according to the number of idle threads; obtain the execution status information of each database query task in the preparatory task queue, and take the database query tasks whose execution status is "not yet executed" as candidate database query tasks; obtain the enqueue time of each candidate database query task in the preparatory task queue, and, in order of enqueue time from earliest to latest, select from the candidate database query tasks in the preparatory task queue a number of target database query tasks equal to the target screening number; or obtain the query priority of each candidate database query task in the preparatory task queue, and, in order of query priority from highest to lowest, select from the candidate database query tasks in the preparatory task queue a number of target database query tasks equal to the target screening number;

Step 304: split each target database query task into multiple subtasks;

Step 306: match the statement parameters of the targeted subtask against those of each other subtask to obtain a parameter matching result; match the data range of the targeted subtask against that of each other subtask to obtain a range matching result; match the output format of the targeted subtask against that of each other subtask to obtain a format matching result; screen out a first target subtask set from the other subtasks, the first target subtask set including the other subtasks that, according to the parameter matching result, are consistent with the targeted subtask in statement parameters; screen out a second target subtask set from the other subtasks, the second target subtask set including the other subtasks that, according to the range matching result, are consistent with the targeted subtask in data range; screen out a third target subtask set from the other subtasks, the third target subtask set including the other subtasks that, according to the format matching result, are consistent with the targeted subtask in output format; and when there is another subtask that is present in the first, second, and third target subtask sets at the same time, take the targeted subtask as a shared subtask;

or

match the statement parameters of the targeted subtask against those of each other subtask to obtain a parameter matching result; match the data range of the targeted subtask against that of each other subtask to obtain a range matching result; match the output format of the targeted subtask against that of each other subtask to obtain a format matching result; screen out a fourth target subtask set from the other subtasks, the fourth target subtask set including the other subtasks that, according to the parameter matching result, are partially consistent with the targeted subtask in statement parameters; screen out a fifth target subtask set from the fourth target subtask set, the fifth target subtask set including the other subtasks that, according to the range matching result, are partially consistent with the targeted subtask in data range; screen out a sixth target subtask set from the fifth target subtask set, the sixth target subtask set including the other subtasks that, according to the format matching result, are partially consistent with the targeted subtask in output format; and determine that the other subtasks in the sixth target subtask set depend on the targeted subtask, and take the targeted subtask as a shared subtask;

Step 308: for each shared subtask, determine the target database query task to which the shared subtask belonged before being split out; mark the shared subtask within that target database query task to obtain a marked task; and for each marked database query task, assign the marked task and the shared subtasks that belonged to it before being split out to the same idle thread;

Step 310: for each idle thread, process the assigned shared subtask with the idle thread to obtain the shared subresult corresponding to that shared subtask; process the assigned marked task with the idle thread, and when the idle thread reaches a shared subtask within the assigned marked task, use the shared subresult of the shared subtask that belonged to the marked task before being split out as the processing result of that shared subtask within the marked task, and continue processing the marked database query task with the idle thread;

Step 312: after the idle threads have finished processing the multiple target database query tasks and the shared subtasks, switch the execution status of those target database query tasks in the preparatory task queue from "not yet executed" to "cancelled"; determine the supplementary task count according to the number of database query tasks with the cancelled status in the preparatory task queue; and store the received database query tasks, equal in number to the supplementary task count, in the preparatory task queue.
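The task selection of step 302 can be sketched as follows. The sketch assumes that the target screening number equals the number of idle threads, which the text does not fix, and that a larger `priority` value means a higher priority; `Candidate` and `select_targets` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    task_id: str
    enqueued_at: float    # enqueue timestamp
    priority: int         # larger value = higher priority (assumption)
    pending: bool = True  # execution status is "not yet executed"


def select_targets(queue, idle_threads, threshold, by="time"):
    """Pick target tasks from the preparatory task queue, or none below the threshold."""
    if idle_threads <= threshold:
        return []
    target_count = idle_threads  # assumed: one target task per idle thread
    candidates = [c for c in queue if c.pending]
    if by == "time":
        candidates.sort(key=lambda c: c.enqueued_at)  # earliest enqueued first
    else:
        candidates.sort(key=lambda c: -c.priority)    # highest priority first
    return candidates[:target_count]
```

Steps 304 to 312 would then operate on the returned tasks; the two orderings correspond to the two alternatives named in step 302.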

It should be understood that, although the steps in the flowcharts of the above embodiments are shown in sequence as indicated by the arrows, they are not necessarily executed in that order. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they may be executed in other orders. Moreover, at least some of the steps in the flowcharts of the above embodiments may include multiple sub-steps or stages. These sub-steps or stages are not necessarily completed at the same moment but may be executed at different times, and they need not be executed sequentially; they may be executed in turn or alternately with other steps, or with at least some of the sub-steps or stages of other steps.

Based on the same inventive concept, an embodiment of the present application further provides a database query task asynchronous management device for implementing the database query task asynchronous management method described above. The solution provided by the device is similar to that described for the method, so for the specific limitations in the one or more device embodiments below, reference may be made to the limitations of the method above, which are not repeated here.

In an exemplary embodiment, as shown in FIG. 4, a database query task asynchronous management device 400 is provided, comprising a first determination module 402, a splitting module 404, a second determination module 406 and an allocation module 408, wherein:

The first determination module 402 is configured to determine idle threads and, if the number of idle threads exceeds a preset thread threshold, to screen out multiple target database query tasks from the preparation task queue according to the number of idle threads;

The splitting module 404 is configured to split each target database query task into multiple subtasks;

The second determination module 406 is configured to, for each of the multiple subtasks, take the targeted subtask as a shared subtask if it is determined that other subtasks depend on the targeted subtask, or that there are other subtasks identical to the targeted subtask;

The allocation module 408 is configured to allocate the multiple target database query tasks and the shared subtasks to the idle threads for processing.

In one embodiment, the first determination module 402 is configured to determine the target screening number of database query tasks according to the number of idle threads; obtain the execution status information of each database query task in the preparation task queue, and determine the database query tasks whose execution status information is "not yet executed" as candidate database query tasks; and screen out, from the candidate database query tasks, a number of target database query tasks equal to the target screening number.
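The screening described above can be sketched as follows. This is an illustrative sketch only: the source does not say how the target screening number is derived from the idle-thread count, so the one-task-per-idle-thread policy, the `QueryTask` class, the status label and the function names are assumptions.

```python
from dataclasses import dataclass

NOT_EXECUTED = "not_executed"  # hypothetical status label

@dataclass
class QueryTask:
    task_id: int
    status: str

def target_screening_number(idle_threads: int, threshold: int) -> int:
    # Assumed policy: one target task per idle thread, and only when the
    # idle-thread count exceeds the preset thread threshold.
    return idle_threads if idle_threads > threshold else 0

def candidate_tasks(queue):
    # Candidates are the queued tasks that have not been executed yet.
    return [t for t in queue if t.status == NOT_EXECUTED]

def screen_targets(queue, idle_threads, threshold):
    n = target_screening_number(idle_threads, threshold)
    return candidate_tasks(queue)[:n]
```

Here the first `n` candidates are taken in queue order; ordering by enqueue time or priority is a separate refinement.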

In one embodiment, the first determination module 402 is configured to obtain the enqueue time of each candidate database query task in the preparation task queue and, in order of enqueue time from earliest to latest, sequentially screen out a number of target database query tasks equal to the target screening number from the multiple candidate database query tasks in the preparation task queue; or to obtain the query priority of each candidate database query task in the preparation task queue and, in order of query priority from high to low, sequentially screen out a number of target database query tasks equal to the target screening number from the multiple candidate database query tasks in the preparation task queue.
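The two orderings can be illustrated with a short sketch. The `Candidate` class, its field names, and the convention that a larger `priority` value means higher priority are assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    task_id: int
    enqueue_time: float  # smaller value means enqueued earlier
    priority: int        # assumption: larger value means higher priority

def select_by_enqueue_time(candidates, n):
    # Earliest-enqueued candidates first.
    return sorted(candidates, key=lambda c: c.enqueue_time)[:n]

def select_by_priority(candidates, n):
    # Highest-priority candidates first.
    return sorted(candidates, key=lambda c: c.priority, reverse=True)[:n]
```

Because Python's `sorted` is stable, candidates with equal keys keep their original queue order in both variants.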

In one embodiment, the second determination module 406 is configured to match the statement parameters of the targeted subtask against each of the other subtasks to obtain a parameter matching result; match the data range of the targeted subtask against each of the other subtasks to obtain a range matching result; match the output format of the targeted subtask against each of the other subtasks to obtain a format matching result; and, when it is determined from the parameter matching result, the range matching result and the format matching result that other subtasks depend on the targeted subtask, or that there are other subtasks identical to the targeted subtask, take the targeted subtask as a shared subtask.

In one embodiment, the second determination module 406 is configured to screen out a first target subtask set from the other subtasks, the first target subtask set including the other subtasks that, according to the parameter matching result, are consistent with the targeted subtask in statement parameters; screen out a second target subtask set from the other subtasks, the second target subtask set including the other subtasks that, according to the range matching result, are consistent with the targeted subtask in data range; screen out a third target subtask set from the other subtasks, the third target subtask set including the other subtasks that, according to the format matching result, are consistent with the targeted subtask in output format; and, when there is another subtask present in the first, second and third target subtask sets simultaneously, take the targeted subtask as a shared subtask.
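The three-set test for identical subtasks amounts to a set intersection, as the following sketch shows. The `Subtask` representation (parameters and data range as tuples, output format as a string) and the use of plain equality as the "consistent" test are assumptions; the source does not define the comparison.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Subtask:
    name: str
    params: tuple      # statement parameters, e.g. (table, column)
    data_range: tuple  # e.g. (start, end)
    out_format: str    # e.g. "json"

def is_shared_by_identity(target, others):
    # First set: other subtasks with the same statement parameters.
    first = {o.name for o in others if o.params == target.params}
    # Second set: other subtasks with the same data range.
    second = {o.name for o in others if o.data_range == target.data_range}
    # Third set: other subtasks with the same output format.
    third = {o.name for o in others if o.out_format == target.out_format}
    # Shared if at least one other subtask appears in all three sets.
    return bool(first & second & third)
```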

In one embodiment, the second determination module 406 is configured to screen out a fourth target subtask set from the other subtasks, the fourth target subtask set including the other subtasks that, according to the parameter matching result, are partially consistent with the targeted subtask in statement parameters; screen out a fifth target subtask set from the subtasks in the fourth target subtask set, the fifth target subtask set including the subtasks that, according to the range matching result, are partially consistent with the targeted subtask in data range; screen out a sixth target subtask set from the subtasks in the fifth target subtask set, the sixth target subtask set including the subtasks that, according to the format matching result, are partially consistent with the targeted subtask in output format; and determine the subtasks in the sixth target subtask set to be dependent on the targeted subtask, and take the targeted subtask as a shared subtask.
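Unlike the identity test, the dependency test narrows one set successively. The sketch below illustrates that cascade under loudly stated assumptions: the source does not define "partially consistent", so here it is taken to mean a non-empty overlap (shared parameters, overlapping ranges, shared output fields), and the `Subtask` fields and function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Subtask:
    name: str
    params: frozenset      # statement parameters
    data_range: tuple      # (start, end), inclusive
    out_fields: frozenset  # fields in the output format

def ranges_overlap(a, b):
    # Two inclusive ranges overlap if each starts before the other ends.
    return a[0] <= b[1] and b[0] <= a[1]

def dependents_of(target, others):
    # Fourth set: partial agreement on statement parameters.
    fourth = [o for o in others if o.params & target.params]
    # Fifth set: drawn from the fourth, partial agreement on data range.
    fifth = [o for o in fourth if ranges_overlap(o.data_range, target.data_range)]
    # Sixth set: drawn from the fifth, partial agreement on output format.
    sixth = [o for o in fifth if o.out_fields & target.out_fields]
    return sixth

def is_shared_by_dependency(target, others):
    # The targeted subtask is shared if any other subtask depends on it.
    return bool(dependents_of(target, others))
```

Filtering each stage from the previous set, rather than from all other subtasks, means later comparisons run only on subtasks that already passed the earlier ones.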

In one embodiment, the allocation module 408 is configured to, for each shared subtask, determine the target database query task to which the targeted shared subtask belonged before being split; mark the targeted shared subtask within that target database query task to obtain a marked task; and, for each marked task, allocate the targeted marked task and the shared subtasks that belonged to it before being split to the same idle thread.
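A minimal sketch of the marking and co-location follows. The `TargetTask` class, the `shared_parent` mapping from a shared subtask to its parent task, and the round-robin choice of thread are assumptions; the source only requires that a marked task and its shared subtasks land on the same idle thread.

```python
from dataclasses import dataclass, field

@dataclass
class TargetTask:
    task_id: int
    subtasks: list
    marked: set = field(default_factory=set)  # shared subtasks marked in this task

def mark_and_assign(tasks, shared_parent, idle_threads):
    """shared_parent maps each shared subtask name to the id of the target
    task it belonged to before splitting. Returns a mapping from thread
    name to a list of (marked task id, its shared subtasks) pairs."""
    by_id = {t.task_id: t for t in tasks}
    for sub, parent_id in shared_parent.items():
        by_id[parent_id].marked.add(sub)

    assignment = {}
    marked_tasks = [t for t in tasks if t.marked]
    for i, task in enumerate(marked_tasks):
        # Assumed policy: distribute marked tasks round-robin over threads.
        thread = idle_threads[i % len(idle_threads)]
        # The marked task and its shared subtasks go to the same thread.
        assignment.setdefault(thread, []).append((task.task_id, sorted(task.marked)))
    return assignment
```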

In one embodiment, the database query task asynchronous management device further includes a processing module 410, configured to, for each idle thread, process the assigned shared subtask on that thread to obtain the shared sub-result corresponding to the assigned shared subtask; and process the assigned marked task on the same thread, wherein, when the thread reaches a shared subtask within the assigned marked task, the shared sub-result of the shared subtask that belonged to that marked task before splitting is used as the processing result of that shared subtask, and the thread continues processing the marked database query task.
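The reuse of shared sub-results within one thread can be sketched as below. The function name `run_on_thread`, the representation of a marked task as a list of subtask names, and the `execute` callback standing in for the actual database call are all hypothetical.

```python
def run_on_thread(marked_task, shared_names, execute):
    """Work done by a single idle thread, sketched sequentially.

    marked_task:  ordered list of subtask names making up the marked task
    shared_names: shared subtasks assigned to this thread
    execute:      callable that runs one subtask and returns its result
    """
    # First process the assigned shared subtasks and keep their sub-results.
    shared_results = {name: execute(name) for name in shared_names}

    # Then process the marked task, reusing a shared sub-result whenever
    # a shared subtask is reached instead of executing it again.
    results = []
    for sub in marked_task:
        if sub in shared_results:
            results.append(shared_results[sub])
        else:
            results.append(execute(sub))
    return results
```

In this sketch each shared subtask is executed once per thread, however many times it appears in the marked task.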

In one embodiment, the database query task asynchronous management device further includes a supplement module 412, configured to, after the idle threads have finished processing the multiple target database query tasks and the shared subtasks, switch the execution status of those target database query tasks in the preparation queue from not yet executed to canceled; determine the number of supplementary tasks according to the number of database query tasks in the preparation queue that have the canceled status; and store received database query tasks, equal in number to the number of supplementary tasks, in the preparation queue.

In another embodiment, FIG. 5 is a structural block diagram of the database query task asynchronous management device, which includes the first determination module 402, the splitting module 404, the second determination module 406 and the allocation module 408. The database query task asynchronous management device 400 further includes a processing module 410 and a supplement module 412. The processing module 410 is configured to, for each idle thread, process the assigned shared subtask on that thread to obtain the shared sub-result corresponding to the assigned shared subtask; and process the assigned marked task on the same thread, wherein, when the thread reaches a shared subtask within the assigned marked task, the shared sub-result of the shared subtask that belonged to that marked task before splitting is used as the processing result of that shared subtask, and the thread continues processing the marked database query task. The supplement module 412 is configured to, after the idle threads have finished processing the multiple target database query tasks and the shared subtasks, switch the execution status of those target database query tasks in the preparation queue from not yet executed to canceled; determine the number of supplementary tasks according to the number of database query tasks in the preparation queue that have the canceled status; and store received database query tasks, equal in number to the number of supplementary tasks, in the preparation queue.

Each module in the above database query task asynchronous management device may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor of the computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can invoke and execute the operations corresponding to each module.

In an exemplary embodiment, a computer device is provided. The computer device may be a server, and its internal structure may be as shown in FIG. 6. The computer device includes a processor, a memory, an input/output (I/O) interface and a communication interface. The processor, the memory and the input/output interface are connected via a bus, and the communication interface is connected to the bus via the input/output interface. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The database of the computer device stores data related to the asynchronous management of database query tasks. The input/output interface of the computer device is used to exchange information between the processor and external devices. The communication interface of the computer device is used to communicate with external terminals over a network connection. When executed by the processor, the computer program implements a database query task asynchronous management method.

Those skilled in the art will understand that the structure shown in FIG. 6 is merely a block diagram of the part of the structure relevant to the solution of the present application and does not limit the computer device to which the solution is applied. A specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.

In one embodiment, a computer device is further provided, including a memory and a processor, wherein the memory stores a computer program and the processor implements the steps of the above method embodiments when executing the computer program.

In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored; when executed by a processor, the computer program implements the steps of the above method embodiments.

In one embodiment, a computer program product is provided, including a computer program that, when executed by a processor, implements the steps of the above method embodiments.

It should be noted that the user information (including but not limited to user device information and user personal information) and data (including but not limited to data used for analysis, stored data and displayed data) involved in this application are all information and data authorized by the user or fully authorized by all parties, and the collection, use and processing of the relevant data must comply with the relevant regulations.

Those of ordinary skill in the art will understand that all or part of the processes in the above method embodiments may be completed by a computer program instructing the relevant hardware. The computer program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the above method embodiments. Any reference to a memory, database or other medium used in the embodiments provided in this application may include at least one of non-volatile memory and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetoresistive random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, and the like. Volatile memory may include random access memory (RAM), external cache memory, and the like. By way of illustration and not limitation, RAM may take various forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM). The database involved in the embodiments provided in this application may include at least one of a relational database and a non-relational database. Non-relational databases may include, without limitation, blockchain-based distributed databases. The processor involved in the embodiments provided in this application may be, without limitation, a general-purpose processor, a central processing unit, a graphics processor, a digital signal processor, a programmable logic device, a data processing logic device based on quantum computing, an artificial intelligence (AI) processor, and the like.

The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of these technical features involves no contradiction, it should be considered to fall within the scope of this application.

The above embodiments express only several implementations of the present application, and although their description is relatively specific and detailed, they should not be construed as limiting the patent scope of the present application. It should be noted that a person of ordinary skill in the art may make several variations and improvements without departing from the concept of the present application, all of which fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the appended claims.

Claims

1. An asynchronous management method for database query tasks, the method comprising:

determining idle threads, and if the number of idle threads exceeds a preset thread threshold, screening out a plurality of target database query tasks from a preparation task queue according to the number of idle threads;

splitting each target database query task into a plurality of subtasks;

for each of the plurality of subtasks, if it is determined that other subtasks depend on the targeted subtask, or that there are other subtasks identical to the targeted subtask, taking the targeted subtask as a shared subtask; and

distributing the plurality of target database query tasks and the shared subtasks to the idle threads for processing.

2. The method of claim 1, wherein screening out a plurality of target database query tasks from the preparation task queue according to the number of idle threads comprises:

determining a target screening number of database query tasks according to the number of idle threads;

acquiring execution status information of each database query task in the preparation task queue, and determining the database query tasks whose execution status information is not yet executed as candidate database query tasks; and

screening out, from the candidate database query tasks, a number of target database query tasks equal to the target screening number.

3. The method of claim 2, wherein screening out, from the candidate database query tasks, a number of target database query tasks equal to the target screening number comprises:

acquiring the enqueue time of each candidate database query task in the preparation task queue;

sequentially screening out, in order of enqueue time from earliest to latest, a number of target database query tasks equal to the target screening number from the plurality of candidate database query tasks in the preparation task queue;

or acquiring the query priority of each candidate database query task in the preparation task queue;

and sequentially screening out, in order of query priority from high to low, a number of target database query tasks equal to the target screening number from the plurality of candidate database query tasks in the preparation task queue.

4. The method of claim 1, wherein, for each of the plurality of subtasks, if it is determined that other subtasks depend on the targeted subtask, or that there are other subtasks identical to the targeted subtask, taking the targeted subtask as a shared subtask comprises:

performing statement parameter matching between the targeted subtask and each of the other subtasks to obtain a parameter matching result;

performing data range matching between the targeted subtask and each of the other subtasks to obtain a range matching result;

performing output format matching between the targeted subtask and each of the other subtasks to obtain a format matching result; and

when it is determined, according to the parameter matching result, the range matching result and the format matching result, that other subtasks depend on the targeted subtask, or that there are other subtasks identical to the targeted subtask, taking the targeted subtask as a shared subtask.

5. The method of claim 4, wherein, when it is determined according to the parameter matching result, the range matching result and the format matching result that there are other subtasks identical to the targeted subtask, taking the targeted subtask as a shared subtask comprises:

screening out a first target subtask set from the other subtasks, wherein the first target subtask set comprises the other subtasks that, according to the parameter matching result, are consistent with the targeted subtask in statement parameters;

screening out a second target subtask set from the other subtasks, wherein the second target subtask set comprises the other subtasks that, according to the range matching result, are consistent with the targeted subtask in data range;

screening out a third target subtask set from the other subtasks, wherein the third target subtask set comprises the other subtasks that, according to the format matching result, are consistent with the targeted subtask in output format; and

when there is another subtask present in the first target subtask set, the second target subtask set and the third target subtask set simultaneously, taking the targeted subtask as a shared subtask.

6. The method of claim 4, wherein, when it is determined according to the parameter matching result, the range matching result and the format matching result that other subtasks depend on the targeted subtask, taking the targeted subtask as a shared subtask comprises:

screening out a fourth target subtask set from the other subtasks, wherein the fourth target subtask set comprises the other subtasks that, according to the parameter matching result, are partially consistent with the targeted subtask in statement parameters;

screening out a fifth target subtask set from the subtasks of the fourth target subtask set, wherein the fifth target subtask set comprises the subtasks that, according to the range matching result, are partially consistent with the targeted subtask in data range;

screening out a sixth target subtask set from the subtasks of the fifth target subtask set, wherein the sixth target subtask set comprises the subtasks that, according to the format matching result, are partially consistent with the targeted subtask in output format; and

determining the subtasks in the sixth target subtask set to be dependent on the targeted subtask, and taking the targeted subtask as a shared subtask.

7. The method of claim 1, wherein distributing the plurality of target database query tasks and the shared subtasks to the idle threads for processing comprises:

for each shared subtask, determining the target database query task to which the targeted shared subtask belonged before being split;

marking the targeted shared subtask within the target database query task to which it belonged before being split, to obtain a marked task; and

for each marked task, distributing the targeted marked task and the shared subtasks that belonged to the targeted marked task before being split to the same idle thread.

8. The method of claim 7, wherein the method further comprises:

for each idle thread, processing the assigned shared subtask through the targeted idle thread to obtain a shared sub-result corresponding to the assigned shared subtask; and

processing the assigned marked task through the targeted idle thread, and when the targeted idle thread reaches a shared subtask in the assigned marked task, taking the shared sub-result of the shared subtask that belonged to the assigned marked task before being split as the processing result of the shared subtask in the assigned marked task, and continuing to process the marked database query task through the idle thread.

9. The method according to claim 1, wherein the method further comprises:

after the idle threads finish processing the plurality of target database query tasks and the shared subtasks, switching the execution status of the plurality of target database query tasks in the preparation queue from not yet executed to canceled;

determining the number of supplementary tasks according to the number of database query tasks with the canceled status in the preparation queue; and

storing received database query tasks, equal in number to the number of supplementary tasks, in the preparation queue.

10. An asynchronous management device for database query tasks, the device comprising:

a first determination module, configured to determine idle threads and, if the number of idle threads exceeds a preset thread threshold, screen out a plurality of target database query tasks from a preparation task queue according to the number of idle threads;

a splitting module, configured to split each target database query task into a plurality of subtasks;

a second determination module, configured to, for each of the plurality of subtasks, take the targeted subtask as a shared subtask if it is determined that other subtasks depend on the targeted subtask, or that there are other subtasks identical to the targeted subtask; and

an allocation module, configured to distribute the plurality of target database query tasks and the shared subtasks to the idle threads for processing.

11. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any one of claims 1 to 9 when executing the computer program.

12. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the method of any one of claims 1 to 9.

13. A computer program product comprising a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the method of any one of claims 1 to 9.