Detailed Description
The following description of the embodiments of the present invention is made clearly and completely with reference to the accompanying drawings. It is apparent that the described embodiments are only some, and not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the invention without inventive effort fall within the scope of the invention.
Referring to fig. 1, fig. 1 is a schematic diagram of a task synchronization waiting system according to an embodiment of the present invention, where, as shown in fig. 1, the task synchronization waiting system includes an application layer, a stream scheduling layer, and a driving layer, where the application layer includes one or more application threads, the stream scheduling layer includes one or more stream scheduling threads, and the driving layer includes a plurality of drivers.
In the embodiment of the invention, the application thread comprises an application module and an interface module CLAPI, where the interface module CLAPI comprises a CLSignal (Computing Language Signal) thread management interface unit, a CLStream (Computing Language Stream) stream management interface unit, and a CLEvent (Computing Language Event) event management interface unit. The CLSignal thread management interface unit is used for synchronous waiting among multiple threads, the CLStream stream management interface unit provides interfaces for creating and destroying stream task queues, and the CLEvent event management interface unit is used for synchronous waiting among parallel computations. The CLEvent event management interface unit includes SCHEDEVENT (SCHEDULER EVENT) schedule events, which are used for synchronizing the states of the stream scheduling units; one CLEvent event management unit can correspond to a plurality of SCHEDEVENT schedule events.
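To make the division of responsibilities concrete, the following is a minimal C sketch of the three interface units. All function and type names here are illustrative assumptions modeled on the operations the text names later (e1.record, e1.wait, e1.sync, e1.reset, s1.sync, sig1.wait, sig1.notify), not an actual API of the system.

```c
/* Opaque handles managed by the CLAPI interface module (hypothetical). */
typedef struct CLStream CLStream;  /* a stream task queue                  */
typedef struct CLEvent  CLEvent;   /* an event; owns a SCHEDEVENT list     */
typedef struct CLSignal CLSignal;  /* a mark event object for thread sync  */

/* CLStream stream management interface unit: create/destroy/sync queues. */
CLStream *clStreamCreate(void);
int       clStreamDestroy(CLStream *s);
int       clStreamSync(CLStream *s);              /* wait for all tasks in s */

/* CLEvent event management interface unit: sync among parallel computations. */
CLEvent  *clEventCreate(void);
int       clEventRecord(CLEvent *e, CLStream *s); /* record point in s       */
int       clEventWait(CLEvent *e, CLStream *s);   /* make s wait for e       */
int       clEventSync(CLEvent *e);                /* block caller until e    */
int       clEventReset(CLEvent *e);               /* clear SCHEDEVENT list   */

/* CLSignal thread management interface unit: sync among application threads. */
CLSignal *clSignalCreate(void);
void      clSignalWait(CLSignal *sig);            /* block calling thread    */
void      clSignalNotify(CLSignal *sig);          /* wake blocked thread     */
void      clSignalDestroy(CLSignal *sig);
```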
The application layer generates corresponding driving tasks and driving task results according to the service to be processed, and sends corresponding management instructions to the stream scheduling layer through the interface module. The driving layer provides drivers to compute the driving tasks and returns driving response tasks to the task distribution unit. The stream scheduling layer acquires driving tasks from the application layer, the driving tasks comprising control tasks and driving request tasks; converts the driving request tasks into driving task streams based on the control tasks and adds them to the stream task queues; performs driver distribution for the driving task streams in the stream task queues based on the driving response tasks; and matches the distributed drivers from the driving layer to process the driving task streams, the drivers returning driving response tasks to the stream scheduling layer.
Specifically, the stream scheduling thread comprises a task queue, a task distribution unit, a stream control management unit, stream task queues, a stream scheduling unit, a driver distribution unit, a driver information storage unit, a driver information registration unit, an event database unit, and an event management unit. The driver information storage unit may take the form of a driver information table, and the event database unit may take the form of an event information table.
The input of the task queue is connected to the output of the application thread, and the task distribution unit distributes each task in the task queue to the stream control management unit, the stream task queues, the stream scheduling unit, or the event management unit. The output of the stream scheduling unit is connected to the input of the driver distribution unit; the driver distribution unit queries the information in the driver information storage unit, and its output is connected to the input of the driving layer. The driver information registration unit registers the drivers in the driving layer and stores the registration information of each driver in the driver information storage unit.
The CLSignal thread management interface unit and the CLEvent event management interface unit generate corresponding event tasks and event control instructions. The event tasks are force-converted into stream data and added to the task queue, from which the task distribution unit distributes them into the stream task queues; the event control instructions are distributed to the event management unit through the task distribution unit.
Further, the task distribution unit issues tasks according to the task type in the task list. Specifically, the task types include a driving task type, a driving response task type, an event task type, and an event control instruction type, where the driving task type includes control tasks and driving request tasks (a control task may also be referred to as a flow control instruction). The task distribution unit distributes driving request tasks and event tasks to the stream task queues, distributes control tasks to the stream control management unit, distributes driving response tasks to the stream scheduling unit, and sends event control instructions to the event management unit.
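This routing rule can be expressed as a simple dispatch over the task type. The sketch below is a hedged illustration in C; the type names, helper functions, and Task layout are assumptions for readability, not the actual implementation.

```c
typedef enum {
    TASK_CONTROL,          /* stream create/destroy (flow control instruction) */
    TASK_DRIVE_REQUEST,    /* computation requested from a driver              */
    TASK_DRIVE_RESPONSE,   /* completion notification returned by a driver     */
    TASK_EVENT,            /* event record / event wait task                   */
    TASK_EVENT_CONTROL     /* event create / reset instruction                 */
} TaskType;

typedef struct { TaskType type; /* payload omitted */ } Task;

/* Assumed destinations, one per unit named in the text. */
void enqueue_stream_task_queue(const Task *t);
void send_to_stream_control_mgmt(const Task *t);
void send_to_stream_scheduler(const Task *t);
void send_to_event_manager(const Task *t);

/* Task distribution unit: route one task by its type. */
void distribute(const Task *t) {
    switch (t->type) {
    case TASK_DRIVE_REQUEST:
    case TASK_EVENT:           enqueue_stream_task_queue(t);   break;
    case TASK_CONTROL:         send_to_stream_control_mgmt(t); break;
    case TASK_DRIVE_RESPONSE:  send_to_stream_scheduler(t);    break;
    case TASK_EVENT_CONTROL:   send_to_event_manager(t);       break;
    }
}
```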
The control task is mainly a stream creation or destruction command and is forwarded by the task distribution unit to the stream control management unit for processing. The stream control management unit then performs stream creation and destruction according to the control task.
Optionally, the stream control management unit may check the maximum number of streams when performing stream creation: if the number of created stream task queues exceeds a preset creation number threshold, a failure is returned to the application layer. When destroying a stream task queue, the stream control management unit may check whether there is an unfinished driving task stream; if there is, a failure is returned to the application layer. Meanwhile, the stream control management unit may also detect in real time whether the number of driving task streams in a stream task queue exceeds a preset task number threshold, and if so, trigger upper-layer flow control to block the application threads of the application layer.
The above-mentioned driving request task is mainly a computation task for which the upper-layer user requests driver processing through the stream scheduling unit; according to the stream task queue in which the task is located, the driving request task is written to the tail of that stream task queue, yielding a first-in first-out stream.
Optionally, the stream scheduling unit triggers stream task scheduling after receiving a driving request task, an event task, or a driving response task. The stream scheduling unit can be implemented as a state machine; specifically, during stream scheduling the state machine includes an executable state, a waiting-driver-completion state, a waiting-driver-resource-availability state, and a waiting-event-completion state.
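As a hedged illustration, the four states of this state machine can be written as a C enumeration; the identifier names are assumptions chosen to mirror the state names in the text.

```c
/* The four stream-scheduling states named above (illustrative names). */
typedef enum {
    STREAM_EXECUTABLE,            /* ready to be dispatched to a driver       */
    STREAM_WAIT_DRIVER_DONE,      /* waiting for driver completion            */
    STREAM_WAIT_DRIVER_RESOURCE,  /* waiting for driver resource availability */
    STREAM_WAIT_EVENT_DONE        /* waiting for event completion             */
} StreamState;
```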
The driving response task mainly means that, after a driver finishes, auxiliary information on the completion state of the driving task is sent to the stream scheduling unit, and the stream scheduling unit then triggers the subsequent flow of driving request task processing.
The driver distribution unit receives the driving request task from the stream scheduling unit, parses the driver id of the driving request task, and searches the driver information storage unit for matching driver registration information according to the driver id.
In a possible embodiment, there is a single stream scheduling thread; that is, stream scheduling is implemented by an independent thread. Using an independent stream scheduling thread decouples the application threads from the driver threads, so the task synchronization waiting system of the embodiment of the invention can be extended by configuring stream task queues and drivers.
Furthermore, when an independent thread implements the stream scheduling, a preset number of stream task queues can be scheduled in parallel, for example 1024 stream task queues. Because any time-consuming processing in the stream scheduling thread would seriously affect scheduling efficiency, the stream scheduling thread may be prohibited from executing any other computation task. The driving layer can be designed as driver threads, and a general driver processing function (which may also be called a driving task synchronous waiting function) registered in driver_register_table forwards a driving request task in the stream scheduling to the driver thread for processing.
Optionally, referring to fig. 2, fig. 2 is a schematic state diagram of stream scheduling provided in an embodiment of the present invention. As shown in fig. 2, when the stream scheduling unit performs scheduling, a driving task stream may be taken from the head of a stream task queue and handed to the driver distribution unit for processing. The driver distribution unit receives the driving task stream from the stream scheduling unit, parses the driver id of the driving request task, searches the driver information storage unit for matching driver registration information according to the driver id, and invokes the driver load query function to query the load state of the driver corresponding to that driver id.
If the load of the driver corresponding to the driver id is high, the driver state is returned to the stream scheduling unit marked as unavailable (failed); if the load is low, the driver state is marked as available, and the stream scheduling unit marks the driving task stream as being in the waiting-driver-completion state and records the driving task completion state auxiliary information that the driving task stream is waiting for. When driving task completion state auxiliary information is received, the driving task stream waiting for that auxiliary information is found and its state is switched to the executable state. Further, all driving task streams in the waiting-driver-resource state can be traversed: the driver load query function is called through the driver distribution unit, according to the driver id each stream is waiting for, to query the load state of the corresponding driver, and if the queried load is low, the driving task stream is switched from the waiting-driver-resource state to the executable state.
The driver distribution unit may allocate a driving task completion state auxiliary information resource to the driving task stream and call the driver processing function to have the driver process the driving task stream.
Optionally, referring to fig. 3, fig. 3 is a schematic state diagram of another stream scheduling provided in the embodiment of the present invention. As shown in fig. 3, when the stream scheduling unit performs scheduling, the state machine extends fig. 2 with a waiting-event-completion state: if the event task corresponding to a driving task is not ready, the driving task is marked as being in the waiting-event-completion state, and once the event task corresponding to the driving task is ready, the driving task is marked as being in the executable state.
In the embodiment of the invention, the task synchronization waiting mainly comprises three parts: 1) synchronization among multiple stream task queues, which resolves task dependencies among the stream task queues in the stream scheduling unit; 2) CLStream synchronization, which waits for all tasks in a designated stream task queue to complete; and 3) CLEvent synchronization, which waits for a designated stream task queue to schedule up to and finish executing a designated CLEvent.
In the embodiment of the invention, the service to be processed can be split into a first number of driving tasks and added to a second number of stream task queues, where the second number of stream task queues are in a parallel relationship; event dependencies, which include parallel synchronous waiting events among different stream task queues, are added to the driving tasks, and the event dependencies are executed so that the stream task queues realize parallel synchronous waiting. Referring to fig. 4, fig. 4 is a task splitting schematic diagram of a graph service. As shown in fig. 4, a graph service is split into 8 driving tasks a1, b1, c1, a2, b2, c2, a3, c3, where b1 and c1 are parallel, and a3 and c2 are parallel. For adding event dependencies, refer to fig. 5; fig. 5 is a schematic diagram of parallel stream task queues for the graph service provided in an embodiment of the present invention. As shown in fig. 5, the parallel stream task queues include stream task queues StreamA, StreamB, and StreamC, which are synchronized with one another through event tasks e1 and e2; a record-event operation and a wait-event operation are used for synchronization between streams. For example, StreamA starts with driving task a1, immediately followed by a record of event task e1, while StreamB and StreamC both begin by waiting for event task e1 through its wait-event operation, followed by their respective driving tasks b1 and c1. Scheduling of StreamB and StreamC (waiting for e1) is triggered only when StreamA has executed up to the e1 record task (i.e., task a1 has completed) and the state of e1 has been updated.
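Using the hypothetical C bindings sketched earlier, the fig. 5 setup could be expressed roughly as follows. The launch helper and the placement of the e2 record after a2 are assumptions for illustration, since the text only fixes the positions of a1, e1, b1, and c1.

```c
/* a1 ... c3: the eight split driving tasks (bodies defined elsewhere). */
void a1(void), a2(void), a3(void);
void b1(void), b2(void);
void c1(void), c2(void), c3(void);

/* Hypothetical helper: append one driving task to a stream task queue. */
void launch(CLStream *s, void (*drive_task)(void));

void submit_graph_service(void) {
    CLStream *sA = clStreamCreate();
    CLStream *sB = clStreamCreate();
    CLStream *sC = clStreamCreate();
    CLEvent  *e1 = clEventCreate();
    CLEvent  *e2 = clEventCreate();

    launch(sA, a1);           /* StreamA: a1, then record e1 */
    clEventRecord(e1, sA);
    launch(sA, a2);
    clEventRecord(e2, sA);    /* assumed position of the e2 record */
    launch(sA, a3);

    clEventWait(e1, sB);      /* StreamB is scheduled only after a1 completes */
    launch(sB, b1);
    launch(sB, b2);

    clEventWait(e1, sC);      /* StreamC likewise waits for a1 */
    launch(sC, c1);
    clEventWait(e2, sC);      /* assumed: c2 depends on StreamA via e2 */
    launch(sC, c2);
    launch(sC, c3);
}
```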
In the embodiment of the invention, the task synchronization waiting system splits the service to be processed into a plurality of driving tasks, converts the driving tasks into driving task streams, and adds the driving task streams to a plurality of stream task queues, so that scheduling among multiple threads is converted into scheduling among streams; event dependencies are used to realize parallel waiting of the multiple stream task queues. Task synchronization waiting based on stream task queues is thereby realized, reducing system scheduling overhead compared with direct scheduling among multiple threads.
Referring to fig. 6, fig. 6 is a flowchart of a task synchronization waiting method according to an embodiment of the present invention, where the task synchronization waiting method may be applied to the task synchronization waiting system according to the embodiment of fig. 1, and as shown in fig. 6, the task synchronization waiting method includes the following steps:
601. Split the service to be processed into a first number of driving tasks and add them to a second number of stream task queues.
In the embodiment of the present invention, the second number of stream task queues are in a parallel relationship. The service to be processed may be hardware acceleration processing, neural network processing, CPU computation processing, DSP computation task processing, and the like.
Further, the second number of stream task queues can be determined according to the total data volume of the service to be processed; the service to be processed is split into a first number of driving tasks, the parallel and serial relationships among the driving tasks are configured through the stream task queues, parallel driving tasks are correspondingly added to parallel stream task queues, and serial driving tasks are configured within a stream task queue.
Specifically, the data throughput of each stream task queue and the data volume of each driving task may be determined first, and the second number of stream task queues is determined by dividing the total data volume of the service to be processed by the data throughput of each stream task queue over the total processing period. In one possible embodiment, the number of driving tasks per stream queue in one round of the processing cycle is determined by dividing the data throughput of each stream task queue by the data volume of each driving task, and the total number of driving tasks over the processing period is calculated as the first number.
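The sizing arithmetic just described can be sketched as below; this is a hedged reading of the text, with the ceiling division and the rounds parameter added as assumptions.

```c
#include <stddef.h>

/* Second number: stream task queues needed to cover the total data volume,
 * given each queue's throughput over the total processing period. */
size_t second_number(size_t total_data, size_t throughput_per_queue) {
    return (total_data + throughput_per_queue - 1) / throughput_per_queue;
}

/* Driving tasks one queue carries in a single round of the processing cycle. */
size_t tasks_per_queue_per_round(size_t queue_throughput, size_t task_data) {
    return queue_throughput / task_data;
}

/* First number: total driving tasks across all queues and rounds (assumed
 * to be the product; the text only says it is "calculated" from the above). */
size_t first_number(size_t per_round, size_t num_queues, size_t rounds) {
    return per_round * num_queues * rounds;
}
```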
602. Add event dependencies to the driving tasks.
In the embodiment of the invention, the event dependencies include parallel synchronous waiting events among different stream task queues. Further, a record event and a wait event of the current driving task can be created, the record event of the current driving task is added to the current stream task queue, and the wait event of the current driving task is added to a parallel queue of the current stream task queue, where the current driving task is located in the current stream task queue.
Specifically, a record event may be added after the current driving task, and a wait event may be added before the parallel driving task, so that the driving task in the parallel stream task queue is processed only after the current driving task in the current stream task queue has completed. Please refer to fig. 5 again.
Optionally, a target event object may be found from the event database, a time scheduling object is added at the tail of the schedule event list in the target event object with an initial state of not-ready, and an event record task of the current stream task queue is generated. The event record task includes a first time scheduling object pointer pointing to the time scheduling object at the tail of the schedule event list, and the event record task is added as a record event to the tail of the current stream task queue. Adding the time scheduling object at the tail of the schedule event list indicates the latest one in the target event object's schedule event list, so the state of the target event object can be found quickly.
Optionally, a target event object may be found from the event database, and an event wait task of the parallel stream task queue is generated. The event wait task includes a second time scheduling object pointer pointing to the time scheduling object at the tail of the schedule event list, and the event wait task is added as a wait event to the tail of the parallel stream task queue. Adding the event wait task to the parallel stream task queue makes the parallel stream task queue wait, through the event wait task, for the current stream task queue to finish processing; the second time scheduling object pointer indicates the latest wait target, so the state of the event wait task can be found quickly.
Specifically, referring to fig. 7, fig. 7 is a schematic diagram of creating event dependencies for parallel stream task queues according to an embodiment of the present invention. As shown in fig. 7, the driving tasks include a1, b1, a2, b2, and so on.
First, an application in the task synchronization waiting system creates 2 streams through the CLStream stream management interface unit, denoted s1 and s2, and adds driving tasks a1, a2... to s1 and b1, b2... to s2. In the stream scheduling layer, the stream control management unit correspondingly creates two empty stream task queues s1 and s2 in stream_task_queues, completing the creation of s1 and s2 by the CLStream stream management interface unit.
The application creates 1 event task event through the CLEvent event management interface unit, denoted e1. The event management unit of the stream scheduling layer receives the command to create event task event e1 and creates a new empty event object e1 in the event database event_table. Each event object contains: a) a record counter maintaining the event, which may be called the event counter, denoted e1.record_count, with initial value 0; and b) a SCHEDEVENT schedule event list, denoted e1.sched_events_state[], initially empty. This completes the creation of CLEvent e1.
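A minimal C sketch of this event object, using the field names the text itself gives (record_count, sched_events_state[]) and the two SCHEDEVENT states named later (SCHED_EVENT_NOT_READY, SCHED_EVENT_COMPLETE); the fixed-capacity array is an assumption for simplicity.

```c
#include <stddef.h>

typedef enum {
    SCHED_EVENT_NOT_READY,   /* initial state of a new SCHEDEVENT          */
    SCHED_EVENT_COMPLETE     /* set when the matching record task executes */
} SchedEventState;

typedef struct {
    SchedEventState state;
} SchedEvent;                /* one SCHEDEVENT time scheduling object */

#define MAX_SCHED_EVENTS 64  /* capacity is an assumption */

/* Event object stored in event_table, e.g. e1. */
typedef struct {
    unsigned int record_count;                         /* e1.record_count, starts at 0 */
    SchedEvent   sched_events_state[MAX_SCHED_EVENTS]; /* e1.sched_events_state[]      */
    size_t       num_sched_events;                     /* list initially empty         */
} EventObject;
```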
The application then issues the driving tasks in s1 and s2, which it may do in the order of the driving tasks. The event task event mainly comprises two steps, event recording and event waiting. S2 is a parallel stream task queue of s1.
Taking the event task e1 record of s1 as an example, the process of event recording includes: a) the application invokes the e1.record operation of the CLEvent event management interface unit to issue an event control command; b) the event management unit of the stream scheduling layer receives the e1.record control command, finds the target event object e1 from the event database unit event_table, and adds a time scheduling object SCHEDEVENT with initial state SCHED_EVENT_NOT_READY at the tail of the schedule event list e1.sched_events_state[]; c) inside the event management unit, an event record task of stream s1 is generated, denoted e1.record_task, which includes a pointer to the latest time scheduling object SCHEDEVENT of the target event object e1, denoted the first time scheduling object pointer e1.record_task.sched_event_ptr, pointing to the last time scheduling object at the tail of the schedule event list e1.sched_events_state[]; and d) e1.record_task is placed at the tail of stream s1.
Taking the event task e1 wait of s2 as an example, the process of event waiting includes: a) the application invokes the e1.wait operation of the CLEvent event management interface unit to issue an event control command; b) the event management unit of the stream scheduling layer receives the e1.wait control command and finds the target event object e1 from the event database unit event_table; c) inside the event management unit, an event wait task of stream s2 is generated, denoted e1.wait_task, which includes a pointer to the latest time scheduling object SCHEDEVENT of the target event object e1, denoted the second time scheduling object pointer e1.wait_task.sched_event_ptr, pointing to the last time scheduling object SCHEDEVENT at the tail of the schedule event list e1.sched_events_state[]; and d) e1.wait_task is placed at the tail of stream s2.
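In C, the two generation paths might look like the following sketch; the Stream type, the task structs, and the helper names are assumptions layered on the EventObject defined above.

```c
#include <stdlib.h>

typedef struct Stream Stream;                   /* stream task queue (opaque) */
void stream_push_tail(Stream *s, void *task);   /* assumed enqueue helper     */

typedef struct {
    SchedEvent *sched_event_ptr;  /* first time scheduling object pointer  */
} RecordTask;

typedef struct {
    SchedEvent *sched_event_ptr;  /* second time scheduling object pointer */
} WaitTask;

/* e1.record on s1: append a new NOT_READY SCHEDEVENT, then queue a record
 * task whose pointer targets that tail object. */
void event_record(EventObject *e, Stream *s) {
    SchedEvent *tail = &e->sched_events_state[e->num_sched_events++];
    tail->state = SCHED_EVENT_NOT_READY;

    RecordTask *rt = malloc(sizeof *rt);
    rt->sched_event_ptr = tail;
    stream_push_tail(s, rt);
}

/* e1.wait on s2: queue a wait task pointing at the current tail SCHEDEVENT. */
void event_wait(EventObject *e, Stream *s) {
    WaitTask *wt = malloc(sizeof *wt);
    wt->sched_event_ptr = &e->sched_events_state[e->num_sched_events - 1];
    stream_push_tail(s, wt);
}
```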
Through the two steps of event recording and event waiting, event dependence is added to the driving task.
603. Execute the event dependencies so that the stream task queues realize parallel synchronous waiting.
In the embodiment of the invention, when the current stream task queue executes the record event of the current driving task, the state of the time scheduling object pointed to by the first time scheduling object pointer is changed to ready-complete. When the state of the time scheduling object pointed to by the second time scheduling object pointer is not ready, the state of the parallel stream task queue is switched to the waiting-event-completion state; when that state is ready-complete, the state of the parallel stream task queue is switched to the executable state. The parallel stream task queues in the waiting-event-completion state are traversed and the corresponding time scheduling object states are checked; if a corresponding time scheduling object state has changed to ready-complete, the state of that parallel stream task queue is switched to the executable state. The record event state reached by the current stream task queue is determined through the first time scheduling object, and processing of the wait events in the parallel stream task queues starts once the record event state of the current stream task queue is ready-complete, so that the parallel stream task queues realize parallel synchronous waiting according to the completion of the wait events.
Specifically, the tasks of s1 and s2 are scheduled by the stream scheduling layer, in the order of the driving tasks and event tasks within each stream task queue. Scheduling of event tasks is shown in fig. 3. Specifically, a wait_task type task is added to the parallel stream task queue, and if the time scheduling object SCHEDEVENT corresponding to the wait event is not ready (the second time scheduling object pointer e1.wait_task.sched_event_ptr indicates SCHED_EVENT_NOT_READY), the parallel stream task queue enters the waiting-event-completion state. The stream task queues in the waiting-event-completion state are traversed and the wait event flag is checked; if the second time scheduling object pointer wait_task.sched_event_ptr reads ready-complete (SCHED_EVENT_COMPLETE), the stream task queue state is switched to the executable state.
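A hedged sketch of this polling pass over waiting streams, reusing the StreamState, SchedEvent, and WaitTask types assumed earlier; the Stream layout is illustrative.

```c
#include <stddef.h>

struct Stream {
    StreamState state;
    WaitTask   *wait_task;   /* the wait task the stream is blocked on */
};

/* Traverse streams in the waiting-event-completion state and wake any whose
 * target SCHEDEVENT has become SCHED_EVENT_COMPLETE. */
void poll_waiting_streams(struct Stream *streams, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        struct Stream *s = &streams[i];
        if (s->state != STREAM_WAIT_EVENT_DONE)
            continue;
        if (s->wait_task->sched_event_ptr->state == SCHED_EVENT_COMPLETE)
            s->state = STREAM_EXECUTABLE;   /* back to executable */
    }
}
```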
Further, in the processing flow of e1.record_task, when the stream task queue schedules execution of the e1.record_task task, the state of that task's time scheduling object SCHEDEVENT is marked as completed: the state pointed to by the first time scheduling object pointer record_task.sched_event_ptr is changed to ready-complete (SCHED_EVENT_COMPLETE).
Further, in the processing flow of e1.wait_task, when the stream task queue schedules execution of the e1.wait_task task, the state of that task's time scheduling object SCHEDEVENT is read; if it is not completed, the stream state machine switches to the waiting-event-completion state, and if it is completed, the next driving task continues to execute. Regarding application reset of the CLEvent event management interface unit: one CLEvent event management interface unit corresponds to multiple time scheduling objects SCHEDEVENT in the event database event_table of the stream scheduling layer. Each time the application invokes an e1.record operation, a new time scheduling object SCHEDEVENT is appended at the tail of the schedule event list e1.sched_events_state[]. The application therefore needs to call the e1.reset operation after e1.wait completes, clearing the schedule event list e1.sched_events_state[].
Optionally, when the number of driving tasks added to the current stream task queue reaches a preset value, a mark event object is created and the application thread is blocked; a stream mark task of the current stream task queue is generated, the stream mark task including a third time scheduling object pointer pointing to the mark event object; the stream mark task is added as a stream mark event to the tail of the current stream task queue; and when the current stream task queue executes up to the stream mark event, the blocked application thread is woken and the mark event object is released. When the driving task processing is completed, the third time scheduling object pointer pointing to the mark event object prompts waking the application thread, ensuring the smooth flow of task processing.
Further, referring to fig. 8, fig. 8 is a schematic diagram of CLStream synchronous waiting provided in an embodiment of the present invention. As shown in fig. 8, after N tasks are issued on stream s1, s1.sync is invoked to synchronously wait for all of the preceding N tasks to finish executing.
Specifically: 1) the application calls the s1.sync operation to wait for stream synchronization, applies for a mark event object sig1 from the CLSignal thread management interface unit, and creates a new stream mark task s1.sync_task, in which the third time scheduling object pointer s1.sync_task.signal_ptr points to the mark event object sig1; the task is sent to the stream scheduling layer for processing, and the application thread calls the thread wait sig1.wait to enter a blocked state; 2) the task distribution unit of the stream scheduling layer receives s1.sync_task and places it at the tail of s1; 3) when the stream scheduling unit of the stream scheduling layer executes up to s1.sync_task, which indicates that the tasks of s1 are complete, it executes the notify operation of the third time scheduling object pointer s1.sync_task.signal_ptr.notify; and 4) the operating system wakes the blocked application thread, which exits from the thread wait sig1.wait and releases the mark event object sig1.
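Reusing the hypothetical bindings from earlier, the application-side half of this handshake could be sketched as follows; the sync_task layout and helper names remain assumptions.

```c
#include <stdlib.h>

typedef struct {
    CLSignal *signal_ptr;    /* third time scheduling object pointer -> sig1 */
} SyncTask;

void send_to_stream_scheduling_layer(Stream *s, SyncTask *t);  /* assumed */

/* s1.sync as seen from the application thread. */
void stream_sync(Stream *s1) {
    CLSignal *sig1 = clSignalCreate();       /* apply for mark event object  */

    SyncTask *st = malloc(sizeof *st);
    st->signal_ptr = sig1;
    send_to_stream_scheduling_layer(s1, st); /* lands at the tail of s1      */

    clSignalWait(sig1);                      /* block until the scheduler    */
                                             /* reaches st and calls notify  */
    clSignalDestroy(sig1);                   /* release the mark event object */
}
```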
Optionally, the event database includes a dependency event list, and the dependency event list includes a mark task list of dependency events. A target event object can be found from the event database and an event mark task generated, the event mark task including a fourth time scheduling object pointer and a fifth time scheduling object pointer, where the fourth time scheduling object pointer points to the mark event object and the fifth time scheduling object pointer points to a time scheduling object in the schedule event list. When the schedule event list is empty or the state of the time scheduling object is ready, the fourth time scheduling object pointer is invoked; when the schedule event list is not empty or the state of the time scheduling object is not ready, the event mark task is added at the tail of the mark task list of the dependency events. The dependency event list is traversed and the record events or wait events are read in turn; when the state of the time scheduling object pointed to by the fifth time scheduling object pointer is ready, the fourth time scheduling object pointer is invoked, and the corresponding record event or wait event is deleted from the dependency event list and the mark event object released. After waking the blocked application thread, the corresponding dependency event is deleted from the dependency event list and the mark event object is released, freeing dependency event list resources for new dependency events.
Further, referring to fig. 9, fig. 9 is a schematic diagram of CLEvent synchronous waiting provided in an embodiment of the present invention. As shown in fig. 9, thread 1 issues tasks a1, a2 on stream s1 and performs the e1.sync operation at time t1; since the scheduling of s1 has not yet reached e1, thread 1 is blocked until s1 executes up to e1 at time t3, when thread 1 is woken and continues working. Thread 2 continues to issue tasks a3, a4... on stream s1 and performs the e1.sync operation at time t2, which amounts to waiting for completion of the preceding a1 to a4 tasks; since the scheduling of s1 has not yet completed a1 to a4, thread 2 is blocked, and at time t4, when s1 executes up to e1, thread 2 is woken and continues working.
Specifically: 1. After the CLEvent event management interface unit creates the target event object e1, the application calls the e1.sync operation to wait for event synchronization: it applies for a mark event object sig1 from the CLSignal thread management interface unit and creates a new event mark task e1.sync_task, in which the fourth time scheduling object pointer e1.sync_task.signal_ptr points to the mark event object sig1; the task is sent to the stream scheduling layer for processing, and the application thread calls the thread wait sig1.wait to enter a blocked state. 2. The e1.sync_task processing of the event management unit in the stream scheduling layer comprises: 1) finding the target event object e1 in the event database; 2) if the schedule event list e1.sched_events_state[] is empty, or the state of the last time scheduling object SCHEDEVENT at its tail is ready-complete (SCHED_EVENT_COMPLETE), calling the notify operation of the fourth time scheduling object pointer e1.sync_task.signal_ptr.notify and exiting processing, otherwise continuing with the remaining steps; 3) the dependency event list wait_event_task[] of the event database stores the mark task list (sync task list) of dependency events; the event mark task e1.sync_task is added at the tail of this list, recording the time scheduling object SCHEDEVENT information corresponding to the dependency event, with the fifth time scheduling object pointer e1.sync_task.sched_event_ptr pointing to the last time scheduling object SCHEDEVENT at the tail of the schedule event list e1.sched_events_state[]. 3. The stream scheduling is extended so that on each scheduling pass the dependency event list wait_event_task[] of the event database is traversed and the dependency events of the tasks are read in turn; when the state of the fifth time scheduling object pointer sync_task.sched_event_ptr is ready-complete (SCHED_EVENT_COMPLETE), the notify operation of the fourth time scheduling object pointer sync_task.signal_ptr.notify is invoked, and the completed dependency event is deleted from the dependency event list wait_event_task[]. 4. The operating system wakes the blocked application thread, which exits from the thread wait sig1.wait and releases the mark event object sig1.
In the embodiment of the invention, the service to be processed is split into a first number of driving tasks and added to a second number of stream task queues; event dependencies, comprising parallel synchronous waiting events among different stream task queues, are added to the driving tasks; and the event dependencies are executed so that the stream task queues realize parallel synchronous waiting. Splitting the service into multiple driving tasks, converting them into driving task streams, and adding them to multiple stream task queues converts scheduling among multiple threads into scheduling among streams, and event dependencies realize parallel waiting of the multiple stream task queues. Task synchronization waiting based on stream task queues is thereby realized, reducing system scheduling overhead compared with direct scheduling among multiple threads.
Alternatively, for scheduling in the stream scheduling layer, the driving task may be obtained from the application layer.
In the embodiment of the present invention, the driving task may be a task that needs parallel computation, where the driving task includes a control task and a driving request task.
The application layer comprises one or more application threads, and the application threads generate corresponding driving tasks through user interaction. The control task is mainly a command to create or destroy a driving task stream and is forwarded by the task distribution unit to the stream control management unit for processing. The driving request task is mainly a computation task for which the upper-layer user requests driver processing through the stream scheduling unit; according to the stream task queue in which the driving request task is located, the driving task stream is written to the tail of the corresponding stream task queue, yielding a first-in first-out stream. The driving response task mainly means that, after a driver finishes, auxiliary information on the completion state of the driving task is passed to the stream scheduling unit, and the stream scheduling unit triggers the subsequent flow of driving request task processing.
Further, the driving request task includes the driver ID corresponding to the task; the stream scheduling layer includes one or more stream scheduling threads, the driving layer includes a plurality of driver threads (a driver thread may also be referred to as a driver), and the driving request task is transferred in the form of stream data among the application threads, the stream scheduling threads, and the driver threads.
Based on the control task, the driving request task is converted into a driving task stream and added to the stream task queue.
In the embodiment of the invention, the control task is processed by the stream scheduling layer: the stream scheduling layer creates and destroys streams according to the control task and converts the driving request task into a driving task stream. The driving request task can be converted into a driving task stream by force conversion, converting the data format of the application thread into the stream data format to obtain the driving task stream.
Further, when a stream task queue is created based on the control task, it is determined whether the number of stream task queues to be created exceeds a first number threshold; if not, a new stream task queue is created, the driving request task is force-converted into a driving task stream according to a preset conversion rule, and the driving task stream is added to the new stream task queue according to the first-in first-out rule.
Further, when a stream task queue is destroyed based on the control task, it can be determined whether an unfinished driving task stream exists in the stream task queue to be destroyed. If an unfinished driving task stream exists in the stream task queue to be destroyed, and/or if the number of stream task queues to be created exceeds the first number threshold, first failure information is returned to the application layer. It is also determined, based on the control task, whether the number of driving task streams in a stream task queue exceeds a second number threshold; if so, the tasks of the application layer are blocked and second failure information is returned to the application layer.
Specifically, an unfinished driving task stream in the stream task queue to be destroyed indicates that earlier tasks have not finished executing and must continue to run, while a number of stream task queues to be created exceeding the first number threshold indicates that the parallel stream task queues in the stream scheduling layer have reached their carrying limit.
The first failure information includes creation failure information and destruction failure information. The creation failure information may be used to prompt the user that creation of a stream task queue failed, and the destruction failure information may be used to prompt the user that destruction of a stream task queue failed.
The first number threshold may be understood as the maximum number of parallel stream task queues; in the embodiment of the present invention, the first number threshold is preferably 1024. When creating a stream task queue, it is determined whether the number of stream task queues to be created exceeds 1024; if not, a new stream task queue is created, and if so, creation failure information is returned. When destroying a stream task queue, if a driving task stream is detected in the stream task queue, indicating an unfinished driving task stream, destruction failure information is returned.
The second number threshold may be understood as the maximum stream data volume of driving task streams in one stream task queue; when the data volume of driving task streams in a stream task queue exceeds this maximum, the tasks of the application layer are blocked and second failure information is returned to the application layer.
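A hedged sketch of these three checks; the 1024 limit comes from the text, while the per-queue task limit, error codes, and structure layouts are assumptions.

```c
#include <stddef.h>

enum {
    MAX_STREAM_QUEUES   = 1024,  /* first number threshold (from the text) */
    MAX_TASKS_PER_QUEUE = 256    /* second number threshold (assumed)      */
};

typedef struct {
    size_t num_queues;           /* currently created stream task queues */
} StreamControlMgmt;

typedef struct {
    size_t pending_tasks;        /* driving task streams still queued */
} StreamQueue;

int create_queue(StreamControlMgmt *m) {
    if (m->num_queues >= MAX_STREAM_QUEUES)
        return -1;               /* creation failure information */
    m->num_queues++;
    return 0;
}

int destroy_queue(StreamControlMgmt *m, StreamQueue *q) {
    if (q->pending_tasks > 0)
        return -1;               /* destruction failure information */
    m->num_queues--;
    return 0;
}

int admit_task(StreamQueue *q) {
    if (q->pending_tasks >= MAX_TASKS_PER_QUEUE)
        return -2;               /* block application layer; second failure info */
    q->pending_tasks++;
    return 0;
}
```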
Based on the driving response task, driver distribution is performed for the driving task streams in the stream task queue.
In the embodiment of the invention, after a driver finishes, the stream scheduling unit can be notified through the driving task completion state auxiliary information, so that the stream scheduling unit triggers driving request task processing.
Optionally, the stream scheduling unit may take the driving task stream out of the stream task queue and, according to the driving response task, call the driver load query function and the application and release functions of the driving task completion state, which return the state of the driver required by the driving task stream. If the driver state is available, the driving task stream is marked as entering the waiting-driver-completion state, and when the returned driver state is executable, the driving task stream is switched to the executable state; if the driver state is unavailable, the driving task stream is marked as being in the waiting-driver-resource-availability state. The driving task streams in the waiting-driver-resource-availability state are traversed periodically or in real time, the driver load query function is called to query the corresponding driver load state, and if the load state meets a preset executable condition, the corresponding driving task stream is switched from the waiting-driver-resource-availability state to the executable state.
In the embodiment of the invention, all driving task streams in the executable state can be traversed, taken from the heads of the corresponding stream task queues, and handed to the driver distribution unit for processing. The driver distribution unit receives the driving task stream from the stream scheduling unit, parses the driver id of the driving request task, searches the driver information storage unit for matching driver registration information according to the driver id, and invokes the driver load query function to query the load state of the driver corresponding to the driver id.
If the load of the driver corresponding to the driver id is high, the driver state is returned to the stream scheduling unit marked as unavailable (failed); if the load is low, the driver state is marked as available, and the stream scheduling unit marks the driving task stream as being in the waiting-driver-completion state and records the driving task completion state auxiliary information that the driving task stream is waiting for. When driving task completion state auxiliary information is received, the driving task stream waiting for that auxiliary information is found and its state is switched to the executable state. Further, all driving task streams in the waiting-driver-resource state can be traversed: the driver load query function is called through the driver distribution unit, according to the driver id each stream is waiting for, to query the load state of the corresponding driver, and if the queried load is low, the driving task stream is switched from the waiting-driver-resource state to the executable state.
The distributed drivers are matched from the driving layer to process the driving task streams.
In the embodiment of the invention, the executable driver required for executing a driving task stream in the executable state is matched in the driving layer, and the driver processing function is called to forward the executable-state driving task stream to that driver for processing.
Further, the driver threads in the driving layer may be registered by the driver information registration unit and then stored in the driver information storage unit. In the initialization stage, each driver's driver ID, driver load query function, driver processing function, and application and release functions of the driving task completion state are obtained; the driver is registered according to these to obtain driver information, the driver information is stored in the driver information table, and the driver information in the driver information table is updated in real time according to the driver load query function, the driver processing function, and the application and release functions of the driving task completion state.
The application and release functions of the driving task completion state are used to process the driving task completion state auxiliary information. The driver load query function is used to query the load condition of each driver thread in the driving layer, and the driver processing function is used to call the corresponding driver thread in the driving layer to process the driving task stream. The driving task completion state auxiliary information further includes a resource release request; after receiving a driving response task, if the driving task completion state auxiliary information corresponding to the driving response task is a resource release request, the driver distribution unit releases the resources of the driving task stream corresponding to that driving response task.
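A possible C shape for one entry of this driver information table, registered into driver_register_table; the struct layout and names are assumptions mirroring the four functions just listed.

```c
/* One driver's registration information (illustrative layout). */
typedef struct {
    int    driver_id;
    int  (*load_query)(int driver_id);    /* driver load query function        */
    int  (*process)(void *task);          /* driver processing function; takes
                                           * the general driving message       */
    void *(*aux_apply)(void);             /* apply driving task completion-state
                                           * auxiliary information resource    */
    void (*aux_release)(void *aux);       /* release that resource             */
} DriverRegistration;

#define MAX_DRIVERS 16                    /* capacity is an assumption */
DriverRegistration driver_register_table[MAX_DRIVERS];
```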
Specifically, when each driving request task is issued, a corresponding driving task completion state auxiliary information resource is allocated. After the driving request task is completed, the stream scheduling unit is notified through the driving task completion state auxiliary information, the stream scheduling unit performs scheduling according to that auxiliary information, and the corresponding auxiliary information resource is released after completion.
In the embodiment of the invention, converting the driving request task into a driving task stream converts scheduling among multiple threads into scheduling among streams, and scheduling among driving task streams is realized using the stream task queues and driver distribution; compared with direct scheduling among multiple threads, stream task queue scheduling reduces system scheduling overhead. In addition, the application layer and the driving layer are decoupled through the stream task queues, and the system can be extended by configuring stream task queues and drivers.
Alternatively, in one possible embodiment, there is a single stream scheduling thread; that is, stream scheduling is implemented by an independent thread. Using an independent stream scheduling thread decouples the application threads from the driver threads, so the task synchronization waiting system of the embodiment of the present invention can be extended by configuring stream task queues and drivers.
Furthermore, when an independent thread implements the stream scheduling, a preset number of stream task queues can be scheduled in parallel, for example 1024 stream task queues. Because any time-consuming processing in the stream scheduling thread would seriously affect scheduling efficiency, the stream scheduling thread may be prohibited from executing any other computation task. The driving layer can be designed as driver threads, and a general driver processing function (which may also be called a driving task synchronous waiting function) registered in driver_register_table forwards a driving request task in the stream scheduling to the driver thread for processing.
In one possible embodiment, the driving request task is transferred among the application thread, the stream scheduling thread, and the driver thread. To reduce the storage management overhead of driving request tasks, the related data structures are defined as follows: a general driving message DrvCommonTask is defined, which contains the driver id, the DRIVER EVENT, and other information; and a separate driving message is defined per driver id, such as the DrvTaskA1 message of driver A, whose message header is of type DrvCommonTask and whose subsequent body stores the specific parameter configuration message of driver A.
Specifically, taking DrvTaskA1 as an example of the storage of related data: 1) the upper-layer user creates a DrvTaskA1 object in the application thread and sends it to the stream scheduling thread for processing; 2) after receiving the driving request task, the stream scheduling thread force-casts it to the general driving message DrvCommonTask for processing, does not delete the driving request task object during this processing, and forwards it directly to the driver A thread for processing; 3) after receiving the driving request task, the driver A thread force-casts it to the DrvTaskA1 type for processing and releases the driving request task when processing is finished. In this way, the storage management overhead of driving request tasks can be reduced.
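A minimal C sketch of this header-first message layout and the two force-casts; the driver A parameter fields are placeholders, and only DrvCommonTask, DrvTaskA1, the driver id, and the DRIVER EVENT come from the text.

```c
#include <stdlib.h>

/* General driving message: common header shared by every per-driver message. */
typedef struct {
    int   driver_id;       /* selects the target driver               */
    void *driver_event;    /* DRIVER EVENT: completion-state aux info */
} DrvCommonTask;

/* Per-driver message for driver A: header first, then A-specific parameters. */
typedef struct {
    DrvCommonTask header;  /* must be the first member so the casts are valid */
    int           param0;  /* placeholder parameter configuration of driver A */
    int           param1;
} DrvTaskA1;

void forward_to_driver(int driver_id, DrvCommonTask *task);  /* assumed */

/* 2) Stream scheduling thread: handle any message via the common header. */
void scheduler_forward(void *msg) {
    DrvCommonTask *common = (DrvCommonTask *)msg;   /* force-cast, no copy */
    forward_to_driver(common->driver_id, common);   /* object not deleted here */
}

/* 3) Driver A thread: recover the concrete type, process, then release. */
void driver_a_handle(void *msg) {
    DrvTaskA1 *task = (DrvTaskA1 *)msg;             /* force-cast back */
    /* ...process task->param0, task->param1...  */
    free(task);                                     /* released when finished */
}
```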
In one possible embodiment, the obtained driving task may be retained in a preset task queue, and it is determined whether the driving task stream corresponding to the driving task has finished processing: if it has, the driving task is released from the preset task queue, and if it has not, the driving task is retained in the preset task queue. In this way, during processing of a driving task stream, the driving task can be reused when a problem occurs, without having to search the application layer for the corresponding driving task.
It should be noted that, the task synchronization waiting method provided by the embodiment of the invention can be applied to devices such as a smart phone, a computer, a server and the like which can perform task synchronization waiting.
Optionally, referring to fig. 10, fig. 10 is a schematic structural diagram of a task synchronization waiting device according to an embodiment of the present invention, as shown in fig. 10, where the device includes:
The splitting module 1001 is configured to split a service to be processed into a first number of driving tasks, and add the first number of driving tasks to a second number of flow task queues, where the second number of flow task queues are in a parallel relationship;
An adding module 1002, configured to add an event dependency to the driving task, where the event dependency includes parallel synchronous waiting events between different flow task queues;
and an execution module 1003, configured to execute the event dependency, and enable the streaming task queue to implement parallel synchronous waiting.
Optionally, the splitting module 1001 is further configured to determine a second number of flow task queues according to a total data amount of the service to be processed, split the service to be processed into a first number of driving tasks, configure parallel and serial relationships between the driving tasks through the flow task queues, add the parallel driving tasks to the parallel flow task queues correspondingly, and configure serial driving tasks in the flow task queues.
Optionally, the event dependencies include dependencies of record events and dependencies of wait events, and the adding module 1002 is further configured to create record events and wait events of a current driving task, add record events of the current driving task in a current stream task queue, and add wait events of the current driving task in a parallel queue of the current stream task queue, where the current driving task is located in the current stream queue.
Optionally, the adding module 1002 is further configured to create an event object for the current driving task in a preset event database, where the event object includes an event counter and a scheduled event list, the initial value of the event counter is 0, and the scheduled event list is initially empty, and create a recorded event and a waiting event for the current driving task based on the event object.
Optionally, the adding module 1002 is further configured to find a target event object from the event database, add a time scheduling object to a tail of a scheduling event list in the target event object, where an initial state of the time scheduling object is not prepared, generate an event recording task of the current stream task queue, where the event recording task includes a first time scheduling object pointer, where the first time scheduling object pointer points to a time scheduling object at the tail of the scheduling event list, and add the event recording task as a recording event to the tail of the current stream task queue.
Optionally, the adding module 1002 is further configured to find a target event object from the event database, generate an event waiting task of the parallel flow task queue, where the event waiting task includes a second time scheduling object pointer, where the second time scheduling object pointer points to a time scheduling object at the tail of the scheduling event list, and add the event waiting task as a waiting event to the tail of the parallel flow task queue.
Optionally, the executing module 1003 is further configured to: when the current stream task queue executes the record event of the current driving task, change the state of the time scheduling object pointed to by the first time scheduling object pointer to ready-complete; when the state of the time scheduling object pointed to by the second time scheduling object pointer is not ready, switch the state of the corresponding parallel stream task queue to the waiting-event-completion state; when the state of the time scheduling object pointed to by the second time scheduling object pointer is ready-complete, switch the state of the corresponding parallel stream task queue to the executable state; traverse the parallel stream task queues in the waiting-event-completion state and check the corresponding time scheduling object states; and if a time scheduling object state corresponding to a parallel stream task queue in the waiting-event-completion state has changed to ready-complete, switch the state of that parallel stream task queue to the executable state.
Optionally, the executing module 1003 is further configured to create a marking event object and block an application thread when the number of driving tasks added in the current flow task queue reaches a preset value, generate a flow marking task of the current flow task queue, where the flow marking task includes a third time scheduling object pointer, and the third time scheduling object pointer points to the marking event object, add the flow marking task as a flow marking event to a tail of the current flow task queue, and wake up the blocked application thread and release the marking event object when the current flow task queue executes to the flow marking event.
Optionally, the event database includes a dependency event list, the dependency event list includes a mark task list of dependency events, and a dependency event is a record event or a wait event. The execution module 1003 is further configured to: find a target event object from the event database; generate an event mark task including a fourth time scheduling object pointer pointing to the mark event object and a fifth time scheduling object pointer pointing to a time scheduling object in the schedule event list; invoke the fourth time scheduling object pointer when the schedule event list is empty or the state of the time scheduling object is ready; add the event mark task at the tail of the mark task list of the dependency events when the schedule event list is not empty or the state of the time scheduling object is not ready; traverse the dependency event list and sequentially read the record events or wait events; invoke the fourth time scheduling object pointer when the state of the time scheduling object pointed to by the fifth time scheduling object pointer is ready; and delete the corresponding record event or wait event from the dependency event list and release the mark event object.
It should be noted that the task synchronization waiting device provided by the embodiment of the invention can be applied to devices such as a smart phone, a computer, a server and the like which can perform task synchronization waiting.
The task synchronous waiting device provided by the embodiment of the invention can realize each process realized by the task synchronous waiting method in the method embodiment, and can achieve the same beneficial effects. In order to avoid repetition, a description thereof is omitted.
Referring to fig. 11, fig. 11 is a schematic structural diagram of an electronic device according to an embodiment of the present invention. As shown in fig. 11, the electronic device includes a memory 1102, a processor 1101, and a computer program stored in the memory 1102 and executable on the processor 1101 to perform a task synchronization waiting method, wherein:
The processor 1101 is configured to call a computer program stored in the memory 1102, and perform the following steps:
Splitting the service to be processed into a first number of driving tasks, and adding the first number of driving tasks to a second number of flow task queues, wherein the second number of flow task queues are in a parallel relationship with one another;
Adding event dependencies to the driving task, wherein the event dependencies comprise parallel synchronous waiting events among different flow task queues;
And executing the event dependencies to enable the flow task queues to realize parallel synchronous waiting (a simplified sketch of these three steps follows).
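By way of illustration only, the following minimal C++ sketch shows the three steps above at a high level. All names (DrivingTask, FlowTaskQueue, the counts, and the round-robin placement policy) are hypothetical and not prescribed by the embodiment; the event dependencies are only indicated by comments here and are detailed in the sketches that follow.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical driving task and flow task queue types, for illustration only.
struct DrivingTask { int id; };
struct FlowTaskQueue { std::vector<DrivingTask> tasks; };

int main() {
    const std::size_t first_number  = 8;  // first number: driving tasks
    const std::size_t second_number = 2;  // second number: parallel flow task queues

    // Step 1: split the service to be processed into driving tasks and
    // distribute them over the parallel flow task queues (round-robin here).
    std::vector<FlowTaskQueue> queues(second_number);
    for (std::size_t i = 0; i < first_number; ++i)
        queues[i % second_number].tasks.push_back(DrivingTask{static_cast<int>(i)});

    // Step 2: add event dependencies (record events and wait events).
    // Step 3: execute the dependencies so the queues wait on one another.
}
```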
Optionally, the splitting, by the processor 1101, the service to be processed into a first number of driving tasks and adding the driving tasks to a second number of flow task queues includes the following steps (see the sketch after these steps):
determining the second number of flow task queues according to the total data volume of the service to be processed;
splitting the service to be processed into the first number of driving tasks, and configuring parallel and serial relations among the driving tasks through the flow task queues;
and adding the parallel driving tasks to parallel flow task queues correspondingly, and arranging the serial driving tasks in order within a flow task queue.
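A possible queue-count policy is sketched below; the chunk size and upper bound are assumed values chosen only to make the example concrete, and determine_queue_count is a hypothetical name.

```cpp
#include <algorithm>
#include <cstddef>

// Illustrative policy: derive the second number (parallel flow task queues)
// from the total data volume of the service to be processed.
std::size_t determine_queue_count(std::size_t total_data_volume) {
    const std::size_t chunk_size = 4096;  // assumed workload per queue
    const std::size_t max_queues = 16;    // assumed upper bound on queue count
    const std::size_t n = (total_data_volume + chunk_size - 1) / chunk_size;
    return std::clamp<std::size_t>(n, 1, max_queues);
}
```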
Optionally, the event dependencies include record events and wait events, and the adding, by the processor 1101, an event dependency to the driving task includes:
Creating a record event and a wait event for the current driving task;
Adding the record event of the current driving task to the current flow task queue, and adding the wait event of the current driving task to a parallel queue of the current flow task queue, wherein the current driving task is located in the current flow task queue (see the sketch below).
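The placement of the two dependency tasks can be sketched as follows; TaskKind, QueuedTask, and add_event_dependency are hypothetical names, and the event is identified by a plain integer only for simplicity.

```cpp
#include <vector>

// A record event is appended to the current flow task queue, and a matching
// wait event is appended to a parallel flow task queue.
enum class TaskKind { Driving, RecordEvent, WaitEvent };
struct QueuedTask { TaskKind kind; int event_id; };

void add_event_dependency(std::vector<QueuedTask>& current_queue,
                          std::vector<QueuedTask>& parallel_queue,
                          int event_id) {
    current_queue.push_back({TaskKind::RecordEvent, event_id});
    parallel_queue.push_back({TaskKind::WaitEvent, event_id});
}
```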
Optionally, the creating, by the processor 1101, the record event and the wait event of the current driving task includes:
Creating an event object for the current driving task in a preset event database, wherein the event object comprises an event counter and a scheduling event list, the initial value of the event counter is 0, and the scheduling event list is initially empty;
and creating the record event and the wait event of the current driving task based on the event object (the sketch below shows an assumed layout of the event object).
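A minimal sketch of the event object as just described, with the time scheduling object reduced to a ready flag; the actual structures are implementation-defined, and all names here are assumptions.

```cpp
#include <deque>
#include <unordered_map>

// The time scheduling object is modeled here as a simple ready flag.
struct TimeSchedulingObject { bool ready = false; };

// Event object: an event counter with initial value 0 and an initially
// empty scheduling event list.
struct EventObject {
    int event_counter = 0;
    std::deque<TimeSchedulingObject> scheduling_event_list;
};

// Hypothetical event database keyed by an integer event id.
using EventDatabase = std::unordered_map<int, EventObject>;
```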
Optionally, the creating, by the processor 1101, the record event of the current driving task based on the event object includes (see the sketch after these steps):
finding a target event object from the event database;
Adding a time scheduling object at the tail of the scheduling event list in the target event object, wherein the initial state of the time scheduling object is not ready;
Generating an event record task of the current flow task queue, wherein the event record task comprises a first time scheduling object pointer pointing to the time scheduling object at the tail of the scheduling event list;
And adding the event record task to the tail of the current flow task queue as the record event.
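A sketch of these steps, restating the assumed types above so the fragment is self-contained; std::deque is used so pointers to existing elements stay valid as the list grows, and create_record_event is a hypothetical name.

```cpp
#include <deque>
#include <unordered_map>

struct TimeSchedulingObject { bool ready = false; };
struct EventObject { int event_counter = 0; std::deque<TimeSchedulingObject> scheduling_event_list; };
struct EventRecordTask { TimeSchedulingObject* first_ptr; };  // first time scheduling object pointer
using EventDatabase = std::unordered_map<int, EventObject>;

EventRecordTask create_record_event(EventDatabase& db, int event_id,
                                    std::deque<EventRecordTask>& current_queue) {
    EventObject& target = db[event_id];          // target event object (created on first use here)
    target.scheduling_event_list.push_back({});  // append; initial state is not ready
    EventRecordTask task{&target.scheduling_event_list.back()};  // pointer to the list tail
    current_queue.push_back(task);               // record event at the queue tail
    return task;
}
```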
Optionally, the creating, by the processor 1101, the wait event of the current driving task based on the event object includes (see the sketch after these steps):
finding a target event object from the event database;
generating an event wait task of the parallel flow task queue, wherein the event wait task comprises a second time scheduling object pointer pointing to the time scheduling object at the tail of the scheduling event list;
And adding the event wait task to the tail of the parallel flow task queue as the wait event.
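A companion sketch under the same assumed types; it presumes the record event was created first, so the scheduling event list is non-empty and its tail is the object the wait event must observe.

```cpp
#include <deque>
#include <unordered_map>

struct TimeSchedulingObject { bool ready = false; };
struct EventObject { int event_counter = 0; std::deque<TimeSchedulingObject> scheduling_event_list; };
struct EventWaitTask { const TimeSchedulingObject* second_ptr; };  // second time scheduling object pointer
using EventDatabase = std::unordered_map<int, EventObject>;

EventWaitTask create_wait_event(EventDatabase& db, int event_id,
                                std::deque<EventWaitTask>& parallel_queue) {
    EventObject& target = db.at(event_id);                     // find the target event object
    EventWaitTask task{&target.scheduling_event_list.back()};  // pointer to the list tail
    parallel_queue.push_back(task);                            // wait event at the queue tail
    return task;
}
```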
Optionally, the executing, by the processor 1101, the event dependencies to enable the flow task queues to realize parallel synchronous waiting includes (see the sketch after these steps):
when the current flow task queue executes to the record event of the current driving task, changing the state of the time scheduling object pointed to by the first time scheduling object pointer to ready;
when the state of the time scheduling object pointed to by the second time scheduling object pointer is not ready, switching the state of the corresponding parallel flow task queue to a waiting event completion state;
When the state of the time scheduling object pointed to by the second time scheduling object pointer is ready, switching the state of the corresponding parallel flow task queue to an executable state;
Traversing the parallel flow task queues in the waiting event completion state, and checking the states of the time scheduling objects corresponding to those queues;
and, if the state of the time scheduling object corresponding to a parallel flow task queue in the waiting event completion state has changed to ready, switching the state of that parallel flow task queue to the executable state.
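A sketch of the state machine these steps describe; QueueState, FlowQueue, and the two handlers are hypothetical names, and a single waited-on object per queue is assumed for brevity.

```cpp
#include <vector>

struct TimeSchedulingObject { bool ready = false; };

enum class QueueState { Executable, WaitingEventCompletion };
struct FlowQueue {
    QueueState state = QueueState::Executable;
    const TimeSchedulingObject* waiting_on = nullptr;  // object of the pending wait event
};

// Executing a wait event: park the queue unless the object is already ready.
void on_wait_event(FlowQueue& q, const TimeSchedulingObject& obj) {
    q.waiting_on = &obj;
    q.state = obj.ready ? QueueState::Executable
                        : QueueState::WaitingEventCompletion;
}

// Executing a record event: mark the object ready, then traverse the parked
// queues and release those whose waited-on object has become ready.
void on_record_event(TimeSchedulingObject& obj, std::vector<FlowQueue>& queues) {
    obj.ready = true;
    for (FlowQueue& q : queues)
        if (q.state == QueueState::WaitingEventCompletion &&
            q.waiting_on != nullptr && q.waiting_on->ready)
            q.state = QueueState::Executable;
}
```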
Optionally, the executing, by the processor 1101, the event dependencies to enable the flow task queues to realize parallel synchronous waiting further includes (see the sketch after these steps):
When the number of driving tasks added to the current flow task queue reaches a preset value, creating a marking event object and blocking the application thread;
generating a flow marking task of the current flow task queue, wherein the flow marking task comprises a third time scheduling object pointer pointing to the marking event object;
adding the flow marking task to the tail of the current flow task queue as a flow marking event;
When the current flow task queue executes to the flow marking event, waking up the blocked application thread and releasing the marking event object.
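A sketch of the blocking and wake-up just described, using a condition variable as the assumed blocking primitive; the embodiment does not prescribe one, and all names here are hypothetical.

```cpp
#include <condition_variable>
#include <mutex>

// Marking event object: the application thread blocks on it until the flow
// marking event is reached by the scheduler.
struct MarkingEventObject {
    std::mutex m;
    std::condition_variable cv;
    bool signaled = false;
};

// Application-thread side: called once the preset task count is reached.
void block_application_thread(MarkingEventObject& ev) {
    std::unique_lock<std::mutex> lk(ev.m);
    ev.cv.wait(lk, [&] { return ev.signaled; });
}

// Scheduler side: called when the queue executes to the flow marking event.
void on_flow_marking_event(MarkingEventObject& ev) {
    { std::lock_guard<std::mutex> lk(ev.m); ev.signaled = true; }
    ev.cv.notify_one();  // wake the blocked application thread
}
```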
Optionally, the event database includes a dependency event list, wherein the dependency event list is used to record dependency events, a dependency event is a record event or a wait event, and each dependency event has a marked task list. The waking up, by the processor 1101, the blocked application thread and releasing the marking event object when the current flow task queue executes to the flow marking event includes (see the sketch after these steps):
finding the target event object from the event database;
Generating an event marking task, wherein the event marking task comprises a fourth time scheduling object pointer and a fifth time scheduling object pointer, the fourth time scheduling object pointer points to the marking event object, and the fifth time scheduling object pointer points to a time scheduling object in the scheduling event list;
When the scheduling event list is empty or the state of the time scheduling object is ready, calling the fourth time scheduling object pointer;
When the scheduling event list is not empty and the state of the time scheduling object is not ready, adding the event marking task to the tail of the marked task list of the dependency event;
traversing the dependency event list, reading each record event or wait event in turn, and calling the fourth time scheduling object pointer when the state of the time scheduling object pointed to by the fifth time scheduling object pointer is ready;
and deleting the corresponding record event or wait event from the dependency event list, and releasing the marking event object.
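A sketch of the deferral logic: EventMarkingTask carries the fourth pointer (to the marking event object) and the fifth pointer (to the time scheduling object it depends on); if the dependency is absent or ready the marking fires at once, otherwise the task is parked on the marked task list and drained later. All names are hypothetical.

```cpp
#include <deque>

struct TimeSchedulingObject { bool ready = false; };
struct MarkingEventObject { bool signaled = false; };

struct EventMarkingTask {
    MarkingEventObject* fourth_ptr;         // marking event object
    const TimeSchedulingObject* fifth_ptr;  // dependency; null if the list is empty
};

// Submit: fire immediately, or defer onto the marked task list.
void submit_marking_task(EventMarkingTask t,
                         std::deque<EventMarkingTask>& marked_task_list) {
    if (t.fifth_ptr == nullptr || t.fifth_ptr->ready)
        t.fourth_ptr->signaled = true;      // "call" the fourth pointer
    else
        marked_task_list.push_back(t);
}

// Drain: called while traversing the dependency event list; fires and removes
// every deferred task whose dependency has become ready.
void drain_marked_tasks(std::deque<EventMarkingTask>& marked_task_list) {
    for (auto it = marked_task_list.begin(); it != marked_task_list.end();) {
        if (it->fifth_ptr->ready) {
            it->fourth_ptr->signaled = true;
            it = marked_task_list.erase(it);
        } else {
            ++it;
        }
    }
}
```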
It should be noted that the electronic device provided by the embodiment of the invention can be applied to devices capable of performing task synchronization waiting, such as smart phones, computers, and servers.
The electronic device provided by the embodiment of the invention can implement each process implemented by the task synchronization waiting method in the foregoing method embodiment and achieve the same beneficial effects; to avoid repetition, the description is not repeated here.
The embodiment of the invention further provides a computer-readable storage medium on which a computer program is stored. When the computer program is executed by a processor, the task synchronization waiting method or the application-side task synchronization waiting method provided by the embodiments of the invention is implemented, and the same technical effects can be achieved; to avoid repetition, the description is not repeated here.
Those skilled in the art will appreciate that all or part of the flows of the above method embodiments may be implemented by a computer program instructing related hardware, the computer program being stored in a computer-readable storage medium and, when executed, including the flows of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a random access memory (RAM), or the like.
The foregoing disclosure is illustrative of the present invention and is not to be construed as limiting the scope of the invention, which is defined by the appended claims.