
CN121138812A - Drilling auxiliary decision-making method, device and machine-readable storage medium - Google Patents

Drilling auxiliary decision-making method, device and machine-readable storage medium

Info

Publication number
CN121138812A
Authority
CN
China
Prior art keywords
decision
real
drilling
historical
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202511669106.5A
Other languages
Chinese (zh)
Inventor
祝兆鹏
李明伟
宋先知
李根生
张诚恺
周蒙蒙
朱林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China University of Petroleum Beijing
Original Assignee
China University of Petroleum Beijing
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China University of Petroleum Beijing
Priority to CN202511669106.5A
Publication of CN121138812A
Legal status: Pending

Landscapes

  • Earth Drilling (AREA)

Abstract

The application relates to the technical field of oil and gas exploration, and discloses a drilling auxiliary decision-making method, a device and a machine-readable storage medium. In the method, first real-time operation data, comprising first real-time working parameters and first real-time drilling data, are processed by an initial decision model obtained through behavior-cloning training to obtain an initial decision. A driller determines an actual operation with reference to the initial decision and applies it, generating second real-time operation data comprising second real-time working parameters of the drilling equipment and second real-time drilling data. A first intention label representing the engineering target of the actual operation is determined from the second real-time operation data. The second real-time operation data and the first intention label are then input into a decision optimization model obtained through reinforcement learning, so that the drilling decision is finally obtained efficiently and accurately.

Description

Drilling auxiliary decision-making method, device and machine-readable storage medium
Technical Field
The application relates to the technical field of oil and gas exploration, in particular to a drilling auxiliary decision-making method, a device and a machine-readable storage medium.
Background
Drillers are the core operators in drilling operations and are generally responsible for specific job commands and equipment operations at the drilling site. Thus, driller decision quality is directly related to drilling efficiency, drilling quality, and operational safety.
In existing drilling operations, the working environment is complex and highly uncertain, while traditional decision support systems usually depend on static rules; as a result, their decision efficiency and accuracy are low and cannot meet the requirements of the operation site.
Disclosure of Invention
The embodiments of the application aim to provide a drilling auxiliary decision-making method to solve the problems in the prior art that decision-making is slow, decision accuracy is low, and decisions cannot be correctly matched to operational requirements.
To achieve the above object, a first aspect of the present application provides a drilling aid decision making method, comprising:
acquiring first real-time operation data, wherein the first real-time operation data comprises first real-time working parameters of drilling equipment and first real-time drilling data;
inputting the first real-time operation data into an initial decision model to obtain an initial decision, wherein the initial decision model is obtained based on behavior cloning;
acquiring second real-time operation data of the drilling equipment, wherein the second real-time operation data is obtained after a driller determines an actual operation based on the initial decision and applies the actual operation to the drilling equipment;
determining a first intention label according to the second real-time operation data, wherein the first intention label is used for representing an engineering target of the actual operation;
inputting the second real-time operation data and the first intention label into a decision optimization model to obtain an optimized decision, wherein the decision optimization model is obtained based on reinforcement learning.
In the embodiment of the application, the training process of the initial decision model comprises the following steps:
Acquiring historical operation data and corresponding historical decision information, wherein the historical operation data comprises historical working parameters of drilling equipment and historical drilling data;
determining a second intent tag based on the historical drilling data, wherein the second intent tag is used to characterize motivations for the historical decision information;
inputting the historical operation data, the corresponding historical decision information and the second intention label into a preset neural network to obtain the predictive decision information corresponding to the historical operation data;
calculating a loss value corresponding to the historical operation data based on the first loss function according to the historical decision information and the predictive decision information;
iteratively training the preset neural network with minimization of the loss value as the target, and ending training when a preset first training termination condition is met, so as to obtain the initial decision model.
In the embodiment of the application, the prediction decision information comprises whether a driller makes a decision, the type of the working parameter to be adjusted corresponding to the decision and the adjustment value of each working parameter type.
In the embodiment of the application, the first loss function is obtained by weighted summation of decision occurrence judgment loss, decision type loss and decision parameter adjustment value loss, wherein the decision occurrence judgment loss is a cross entropy loss function, the decision type loss is a multi-label cross entropy loss function, and the decision parameter adjustment value loss is a mean square error loss function.
In an embodiment of the application, the historical decision information comprises a plurality of historical decisions within a predetermined distance of movement of the drilling apparatus.
In the embodiment of the application, inputting the second real-time operation data and the first intention label into the reinforcement-learning-based decision optimization model to obtain an optimized decision comprises the following steps:
taking the second real-time operation data as a state space and the first intention label as a state enhancement of the state space, inputting them into the decision optimization model, and determining a target action in an action space, wherein the action space comprises change states of target working parameters;
determining an expected total reward through a reward function and a second loss function according to the action space and the first intention label;
determining the optimized decision with the goal of maximizing the expected total reward.
In the embodiment of the application, the reward function is obtained by weighted summation of a consistency reward between the initial decision and the actual operation and a consistency reward between the actual operation and the first intention label.
In an embodiment of the present application, the second loss function includes:
L_RL = E_{s,a}[ ( R + γ · max_{a'} Q(s', a') − Q(s, a) )² ]
where L_RL is the second loss function; Q(s, a) represents the expected total reward that can be achieved by taking action a in state s; E_{s,a} denotes the expectation of the squared difference computed over all states s and actions a; R is the actual reward obtained by taking the action in the current state; γ is the discount factor; and max_{a'} Q(s', a') is the maximum Q value the agent can obtain in the next state s'.
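As one way to make the temporal-difference objective above concrete, the following tabular sketch computes the mean squared TD error. The tabular Q representation, the state and action encoding, and the γ value are illustrative assumptions; the application does not disclose the network architecture behind Q.

```python
import numpy as np

def td_loss(q, states, actions, rewards, next_states, gamma=0.99):
    """Mean squared temporal-difference error over a batch of transitions,
    one common realization of the second loss function described above."""
    # Q value currently assigned to each taken (state, action) pair
    q_sa = q[states, actions]
    # Bootstrapped target: actual reward R plus discounted best next-state value
    target = rewards + gamma * q[next_states].max(axis=1)
    return np.mean((target - q_sa) ** 2)

# Toy tabular example: 3 states, 2 actions, all Q values initialized to zero
q_table = np.zeros((3, 2))
loss = td_loss(q_table,
               states=np.array([0, 1]), actions=np.array([1, 0]),
               rewards=np.array([1.0, 0.0]), next_states=np.array([1, 2]))
```

With a zero-initialized table the bootstrapped term vanishes, so the loss reduces to the mean squared reward of the batch.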
A second aspect of the application provides a drilling aid decision making apparatus comprising:
a memory configured to store instructions, and
a processor configured to invoke the instructions from the memory and, when executing the instructions, implement the drilling auxiliary decision-making method of any one of the above embodiments.
A third aspect of the application provides a machine-readable storage medium having stored thereon instructions for causing a machine to perform the drilling auxiliary decision-making method of any one of the above embodiments.
According to the technical scheme, the first real-time operation data, comprising the first real-time working parameters and the first real-time drilling data, are processed by the initial decision model obtained through behavior-cloning training to obtain an initial decision. The driller determines an actual operation according to the initial decision and applies it, generating second real-time operation data comprising the second real-time working parameters of the drilling equipment and the second real-time drilling data. A first intention label representing the engineering target of the actual operation is determined according to the second real-time operation data. The second real-time operation data and the first intention label are input into the decision optimization model determined based on reinforcement learning, and the optimized drilling decision is finally obtained efficiently and accurately.
Additional features and advantages of embodiments of the application will be set forth in the detailed description which follows.
Drawings
The accompanying drawings are included to provide a further understanding of embodiments of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain, without limitation, the embodiments of the application. In the drawings:
FIG. 1 schematically illustrates a flow diagram of a method of drilling assistance decision making in accordance with an embodiment of the application;
FIG. 2 schematically illustrates a training process diagram of an initial decision model according to an embodiment of the application;
fig. 3 schematically shows a schematic diagram of an apparatus for drilling aid decision making according to an embodiment of the application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it should be understood that the detailed description described herein is merely for illustrating and explaining the embodiments of the present application, and is not intended to limit the embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
It should be noted that, in the technical scheme of the application, the acquisition, transmission, storage, use, and processing of data all conform to the relevant laws and regulations. In the embodiments of the present application, some software, components, models, etc. known in the industry may be mentioned; they should be regarded as exemplary only, serving to illustrate the feasibility of implementing the technical solution of the present application, and this does not imply that the applicant has used or must use those solutions.
It should be noted that, if directional indications (such as up, down, left, right, front, and rear) are referred to in the embodiments of the present application, the directional indications are merely used to explain the relative positional relationship, movement conditions, and the like between the components in a specific posture (as shown in the drawings); if the specific posture changes, the directional indications change correspondingly.
In addition, if there is a description of "first", "second", etc. in the embodiments of the present application, such description is for descriptive purposes only and is not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features. Thus, a feature defined by "first" or "second" may explicitly or implicitly include at least one such feature. In addition, the technical solutions of the embodiments may be combined with each other, provided that the combination can be realized by those skilled in the art; when the technical solutions are contradictory or cannot be realized, the combination should be considered not to exist and not within the scope of protection claimed in the present application.
Fig. 1 schematically shows a flow diagram of a method of drilling assistance decision making according to an embodiment of the application. As shown in fig. 1, an embodiment of the present application provides a method of drilling assistance decision making, which may include the following steps S110-S150.
Step S110, first real-time operation data are acquired, wherein the first real-time operation data comprise first real-time working parameters of drilling equipment and first real-time drilling data.
In the embodiment of the application, first real-time operation data, including first real-time working parameters of the drilling equipment and first real-time drilling data generated during the drilling operation, are collected by automated equipment and then preprocessed.
Specifically, in an alternative embodiment, the collected first real-time working parameters of the drilling equipment may include set weight on bit, set rotational speed, set displacement, drilling fluid parameters, drill bit wear, equipment operating conditions, etc., while the collected first real-time drilling data may include downhole weight on bit, downhole rotational speed, downhole displacement, formation information, well depth, geological conditions, etc. The preprocessing operations may include outlier/missing-value processing, noise reduction, normalization/standardization, etc., so as to convert the collected data into an input data set for the initial decision model.
By collecting these data, the real-time state of the driller's actual operation of the drilling equipment and of the downhole working environment can be reflected, providing a basis for the initial decision model to determine an initial decision.
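A minimal sketch of the preprocessing chain named above (outlier/missing-value handling, noise reduction, min-max normalization) follows. The physical limits, the 3-point smoothing window, and the use of set weight on bit as the example channel are illustrative assumptions, not values from the application:

```python
import numpy as np

def preprocess(series, lo, hi):
    """Clean one real-time channel: reject out-of-range outliers,
    interpolate missing samples, smooth, and scale to [0, 1]."""
    x = np.asarray(series, dtype=float)
    # Outlier handling: values outside the assumed physical limits become missing
    x = np.where((x < lo) | (x > hi), np.nan, x)
    # Missing-value handling: linear interpolation over the valid samples
    idx = np.arange(len(x))
    good = ~np.isnan(x)
    x = np.interp(idx, idx[good], x[good])
    # Noise reduction: 3-point moving average (edges reuse the boundary value)
    padded = np.pad(x, 1, mode="edge")
    x = np.convolve(padded, np.ones(3) / 3.0, mode="valid")
    # Min-max normalization against the physical limits
    return (x - lo) / (hi - lo)

# Hypothetical weight-on-bit samples with one gap and one sensor spike
wob = preprocess([80.0, np.nan, 82.0, 500.0, 81.0], lo=0.0, hi=300.0)
```

The same function would be applied channel by channel before assembling the model's input data set.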
Step S120, inputting the first real-time operation data into an initial decision model to obtain an initial decision, wherein the initial decision model is obtained based on behavior cloning.
In the embodiment of the application, the initial decision model, trained on historical data through behavior cloning, processes the first real-time operation data to obtain an initial decision for the driller's reference.
Specifically, in an alternative embodiment, the initial decision model may provide the driller with adjustment decisions including the type of parameters of weight on bit, rotational speed, displacement, etc. of the drilling equipment.
The initial decision determined by the initial decision model provides the driller with a basic decision reference and assistance, helping the driller better achieve the potential engineering target inferred by the initial decision model at this stage.
Step S130, acquiring second real-time operation data of the drilling equipment, wherein the second real-time operation data is obtained after the driller determines an actual operation based on the initial decision and applies it to the drilling equipment.
In the embodiment of the application, after knowing the initial decision recommended by the initial decision model, the driller can determine the actual operation applied to the drilling equipment according to the engineering experience and the actual engineering target of the driller based on the initial decision, so as to obtain second real-time operation data comprising second real-time working parameters and second real-time drilling data of the drilling equipment. The second real-time operating parameters of the drilling apparatus may include set weight on bit, set rotational speed, set displacement, drilling fluid parameters, bit wear, apparatus operating conditions, etc., and the second real-time drilling data may include downhole weight on bit, downhole rotational speed, downhole displacement, formation information, well depth, geological conditions, etc.
Specifically, in an alternative embodiment, the driller may choose to adopt or not adopt the initial decision, may adopt one or more of the parameter types recommended for adjustment in the initial decision, or may adopt a specific adjustment value for one or more of the recommended parameter types. The actual operation applied by the driller to the drilling equipment is accurately reflected in the second real-time operation data.
By analyzing the second real-time operation data, including the second real-time working parameters of the drilling equipment and the second real-time drilling data, the decision optimization model can more accurately learn the decisions drillers make in actual drilling operations under different downhole environments and different engineering targets, which in turn improves the processing speed and accuracy of the decision optimization model through reinforcement learning.
Step S140, determining a first intention label according to the second real-time operation data, wherein the first intention label is used for representing the engineering target of the actual operation.
In the embodiment of the application, the first intention label used for representing the engineering target of the actual operation is further determined through the second real-time operation data generated by the actual operation of the driller.
Specifically, in an alternative embodiment, the change records of weight on bit, rotational speed, and displacement in the second real-time operation data are examined by an algorithm to find mutation points of parameter change; a mutation vector is determined from the parameter change values, and the first intention label is then determined from the mutation vector. One possible approach is to detect the mutation points using the CUSUM (cumulative sum) algorithm: when the algorithm determines that a mutation exists in a parameter, the driller is considered to have made a decision.
Through the first intention label, the decision optimization model can more accurately acquire engineering targets corresponding to actual operations performed by the driller, and further provides assistance for determining an optimization decision.
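The mutation-point detection described above might be sketched as follows. This is a simplified two-sided CUSUM variant; the threshold and drift values are illustrative tuning assumptions rather than values disclosed in the application:

```python
def cusum_changepoints(values, threshold=5.0, drift=0.5):
    """Flag indices where a parameter series (e.g. set rotational speed)
    mutates abruptly; each flagged index is treated as a driller decision."""
    mean = values[0]          # running reference level
    s_pos = s_neg = 0.0
    changepoints = []
    for i, x in enumerate(values):
        # Accumulate deviations from the reference, minus a drift allowance
        # so slow trends are not flagged as mutations
        s_pos = max(0.0, s_pos + (x - mean) - drift)
        s_neg = max(0.0, s_neg + (mean - x) - drift)
        if s_pos > threshold or s_neg > threshold:
            changepoints.append(i)   # mutation point found
            mean = x                 # restart from the new level
            s_pos = s_neg = 0.0
    return changepoints

# A step change in set rotational speed at index 5
series = [60.0] * 5 + [90.0] * 5
cps = cusum_changepoints(series)
```

The signs of the parameter changes at the flagged indices would then form the mutation vector from which the intention label is derived.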
Step S150, inputting the second real-time operation data and the first intention label into a decision optimization model to obtain an optimized decision, wherein the decision optimization model is obtained based on reinforcement learning.
In the embodiment of the application, the decision optimization model, obtained based on reinforcement learning (RL) and built on the initial decision model, takes the second real-time operation data and the first intention label as inputs to determine an optimized decision.
By inputting the second real-time operation data reflecting the current actual drilling operation state and the first intention label reflecting the driller actual operation and the target into the decision optimization model, the determined optimization decision can more accord with the actual operation requirement than the initial decision.
According to the technical scheme, an initial decision is first obtained from the first real-time operation data, comprising the first real-time working parameters of the drilling equipment and the first real-time drilling data, by using the initial decision model obtained through behavior cloning. The driller then determines and applies an actual operation with reference to the initial decision, and a first intention label representing the engineering target of the actual operation is determined from the resulting second real-time operation data. Finally, based on the second real-time operation data and the first intention label, the reinforcement-learning decision optimization model efficiently determines an optimized decision that meets the actual operation requirements with adaptability and intelligence.
FIG. 2 schematically illustrates a training process diagram of an initial decision model according to an embodiment of the application. As shown in FIG. 2, an embodiment of the present application provides a training process for an initial decision model, which may include the following steps S210-S250.
Step S210, acquiring historical operation data and corresponding historical decision information, wherein the historical operation data comprises historical working parameters of drilling equipment and historical drilling data.
In the embodiment of the application, various data generated in historical actual operations can be collected, preprocessed, and converted into a training data set used to train a preset neural network through behavior cloning. The historical operation data comprise historical working-parameter records of the drilling equipment, such as set weight on bit, set rotational speed, set displacement, drilling fluid parameters, drill bit wear, and equipment running state, and the historical drilling data comprise records such as downhole weight on bit, downhole rotational speed, downhole displacement, formation information, well depth, and geological conditions. Meanwhile, manual parameter adjustments made by drillers in the historical operation data correspond to historical decisions, and their engineering targets are labeled, such as "avoiding equipment wear" or "improving the rate of penetration".
In an embodiment of the application, the historical decision information comprises a plurality of historical decisions within a predetermined distance of movement of the drilling apparatus.
Specifically, in the historical drilling data, due to formation heterogeneity, hydraulic system delay, automatic control compensation, and the like, successive changes of multiple parameters within a short period may represent a single historical decision of the same driller while the drilling process continues and the equipment keeps moving. Therefore, in the embodiment of the application, parameter changes occurring within a preset movement distance of the drilling equipment are treated as one historical decision of the driller; one possible value of this distance is 3 meters.
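The merging of successive parameter changes within the preset movement distance into one historical decision could look like this sketch; the event structure, field names, and example depths are hypothetical:

```python
def merge_decisions(events, window_m=3.0):
    """Merge parameter-change events occurring within a preset movement
    distance (3 m here, as one possible value) into single decisions."""
    events = sorted(events, key=lambda e: e["depth_m"])
    merged = []
    for e in events:
        if merged and e["depth_m"] - merged[-1]["depth_m"] <= window_m:
            # Same decision: successive changes caused by formation
            # heterogeneity, hydraulic delay, or control compensation
            merged[-1]["params"].update(e["params"])
        else:
            merged.append({"depth_m": e["depth_m"], "params": dict(e["params"])})
    return merged

# Hypothetical change events from a historical log
raw_events = [
    {"depth_m": 1500.0, "params": {"weight_on_bit": +10}},
    {"depth_m": 1501.2, "params": {"rotational_speed": -5}},
    {"depth_m": 1520.0, "params": {"displacement": +2}},
]
merged = merge_decisions(raw_events)
```

Here the first two changes fall within 3 m of each other and collapse into one labeled decision, while the third stands alone.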
Step S220, determining a second intention label according to the historical drilling data, wherein the second intention label is used for representing motivation of the historical decision information.
In the embodiment of the application, the change records of downhole weight on bit, downhole rotational speed, and downhole displacement in the historical drilling data are examined by an algorithm to find mutation points of parameter change; a mutation vector is determined from the parameter change values, and the second intention label is determined from the mutation vector. One possible approach is to detect the mutation points using the CUSUM algorithm: when the algorithm determines that a mutation exists in a parameter, the driller is considered to have made a decision.
Step S230, inputting the historical operation data, the corresponding historical decision information and the second intention label into a preset neural network to obtain the predictive decision information corresponding to the historical operation data.
In the embodiment of the application, within imitation learning, behavior cloning makes the preset neural network imitate the decision-making behavior of a driller in a supervised-learning manner. Specifically, by inputting historical operation data and the corresponding historical decision information and analyzing the driller's decisions (such as adjusting weight on bit, rotational speed, and displacement), the network can learn the driller's operation rules. The training targets may be to judge whether the driller makes a decision at the next moment, which decision the driller makes, and what the corresponding parameter change values are.
In one possible embodiment, the predictive decision information includes whether the driller makes a decision, the type of operating parameter that the decision corresponds to be adjusted, and the adjustment value for each operating parameter type.
The first step is a binary classification problem: the historical drilling data are processed to output a probability value between 0 and 1, indicating the probability that the driller makes a decision at the next step (which may be 1 meter further along the well). When the probability is greater than a set threshold, an operation is considered to occur; in one embodiment the threshold may be 0.8.
The second step is a multi-label classification problem that shares part of the feature extraction layer with the first step. Since multiple parameters of the drilling equipment may need to be adjusted simultaneously during actual drilling operations, whether each parameter is adjusted is judged through three independent, non-mutually-exclusive Sigmoid output layers.
The third step is a multi-objective regression problem that also shares the feature extraction layer. Once the decision parameter types have been determined, this step outputs the parameter change values of the drilling equipment. This step may include a gating mechanism: if the second step determines that a certain parameter is not adjusted, the third step forces the change value of that parameter to zero.
For the second and third steps, physical constraints based on drilling engineering experience are introduced to prevent the output parameter adjustments from creating safety problems. For example, when the formation data indicate a "salt bed", reducing the displacement is forcibly prohibited.
Therefore, by inputting the historical operation data, the corresponding historical decision information and the second intention label, the prediction decision information corresponding to the historical operation data can be obtained.
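The three-stage output structure described above (an occurrence probability, three non-mutually-exclusive Sigmoid type heads, and gated regression of adjustment values) can be sketched as a toy forward pass. The layer sizes, random weights, and the 0.5 gating threshold are illustrative assumptions, not details from the application:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class ThreeHeadSketch:
    """Shared feature extractor feeding three heads: decision occurrence,
    parameter-type selection, and gated adjustment-value regression."""
    def __init__(self, n_features, hidden=16):
        self.w_shared = rng.normal(size=(n_features, hidden))
        self.w_occur = rng.normal(size=(hidden, 1))
        self.w_type = rng.normal(size=(hidden, 3))   # WOB, RPM, displacement
        self.w_value = rng.normal(size=(hidden, 3))

    def forward(self, x, type_threshold=0.5):
        h = np.tanh(x @ self.w_shared)               # shared feature extraction
        p_occur = sigmoid(h @ self.w_occur)[..., 0]  # step 1: occurrence prob.
        p_type = sigmoid(h @ self.w_type)            # step 2: non-exclusive labels
        values = h @ self.w_value                    # step 3: raw adjustments
        # Gating mechanism: zero the adjustment of any unselected parameter
        values = np.where(p_type > type_threshold, values, 0.0)
        return p_occur, p_type, values

model = ThreeHeadSketch(n_features=5)
p, t, v = model.forward(np.ones(5))
```

The hard physical constraints (e.g. the salt-bed displacement rule) would be applied as a further mask on `values` after this forward pass.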
Step S240, according to the historical decision information and the predictive decision information, calculating a loss value corresponding to the historical operation data based on the first loss function.
In the embodiment of the application, it can be understood that the predicted decision information during training does not necessarily coincide with the historical decision information. Therefore, after each training pass, the predicted decision information is compared with the historical decision information, and the accuracy of the model is evaluated through the first loss function.
In the embodiment of the application, the first loss function is obtained by weighted summation of decision occurrence judgment loss, decision type loss and decision parameter adjustment value loss, wherein the decision occurrence judgment loss is a cross entropy loss function (binary cross entropy), the decision type loss is a multi-label cross entropy loss function (multi-label cross entropy), and the decision parameter adjustment value loss is a mean square error loss function (MSE).
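The weighted first loss function can be sketched directly from its three named components. The component weights are illustrative, as the application does not disclose their values:

```python
import numpy as np

def first_loss(p_occur, y_occur, p_type, y_type, v_pred, v_true,
               w=(1.0, 1.0, 1.0), eps=1e-7):
    """Weighted sum of the three component losses: occurrence BCE,
    multi-label (per-label) cross entropy, and adjustment-value MSE."""
    p_occur = np.clip(p_occur, eps, 1 - eps)
    p_type = np.clip(p_type, eps, 1 - eps)
    # Decision-occurrence judgment loss: binary cross entropy
    l_occur = -np.mean(y_occur * np.log(p_occur)
                       + (1 - y_occur) * np.log(1 - p_occur))
    # Decision-type loss: multi-label cross entropy (independent labels)
    l_type = -np.mean(y_type * np.log(p_type)
                      + (1 - y_type) * np.log(1 - p_type))
    # Decision parameter adjustment-value loss: mean squared error
    l_value = np.mean((v_pred - v_true) ** 2)
    return w[0] * l_occur + w[1] * l_type + w[2] * l_value

# Perfect prediction on one sample: loss collapses to (near) zero
perfect = first_loss(np.array([1.0]), np.array([1.0]),
                     np.array([[1.0, 0.0, 1.0]]), np.array([[1.0, 0.0, 1.0]]),
                     np.zeros((1, 3)), np.zeros((1, 3)))
```

In practice the MSE term would be masked by the gating mechanism so that unadjusted parameters contribute no regression loss.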
Step S250, iteratively training the preset neural network with minimization of the loss value as the target, and ending training when a preset first training termination condition is met, so as to obtain the initial decision model.
In the embodiment of the application, with minimization of the loss value as the target, the training process is verified and tested, and the training parameters and network structure are adjusted in continuous iterations until the preset first training termination condition is met, so as to obtain the initial decision model. The first training termination condition may be that the loss value is smaller than a preset loss threshold, or that the number of training iterations reaches a preset threshold.
Through the above technical scheme, behavior-cloning training is performed on the preset neural network using the historical operation data and the corresponding historical decision information, imitating driller behavior through supervised learning, and the first loss function is then used for verification and evaluation, finally yielding an initial decision model with higher accuracy.
In an alternative embodiment, step S150 may include the following steps S151-S153.
Step S151, taking the second real-time operation data as a state space and the first intention label as a state enhancement of the state space, inputting them into the decision optimization model, and determining a target action in an action space, wherein the action space comprises the change states of the target working parameters.
Those skilled in the art will appreciate that in reinforcement learning, a state is a description of the agent's interaction with the environment, and an action is an operation the agent may take in a particular state. In the embodiment of the application, the second real-time operation data are first preprocessed (outlier/missing-value processing, noise reduction, normalization/standardization, etc.), then modeled and quantized as the state space input; meanwhile, the first intention label, representing the engineering target behind the actual operation, is introduced as a state enhancement of the state space and input to the decision optimization model.
The decision optimization model is built on the initial decision model and based on reinforcement learning; according to the input state space, it determines and outputs a target action in the action space, wherein the action space comprises the change states of the target working parameters.
Step S152, determining the expected total reward through the reward function and the second loss function according to the action space and the first intention label.
In one possible embodiment of the present application, the action space includes, but is not limited to: keeping unchanged, increasing, or decreasing the weight on bit of the drilling equipment; keeping unchanged, increasing, or decreasing the rotational speed of the drill bit; and keeping unchanged, increasing, or decreasing the displacement of the mud pump. Under this action-space definition and in combination with the first intention label, the expected total reward is calculated through the reward function and the second loss function.
In the embodiment of the application, a dual-track reward function design is innovatively provided: the reward function is obtained as a weighted sum of a consistency reward between the initial decision and the actual operation, and a consistency reward between the actual operation and the first intention label. The first track measures whether the actual operation the driller applies to the drilling equipment, after referring to the initial decision (obtained through the initial decision model from the first real-time operation data) and the actual engineering target, is consistent with that initial decision: the reward is 1 if consistent and 0 otherwise. The second track measures whether the driller's actual operation is consistent with the first intention label determined from the second real-time operation data: again 1 if consistent and 0 otherwise. Through this dual-track design, the reinforcement learning model is encouraged both to learn the driller's decision tendencies and to optimize decisions toward the final engineering goal more efficiently.
It may be appreciated that the weights of the two consistency rewards may be set according to actual requirements, which is not limited in the embodiment of the present application. Illustratively, the weight of the consistency reward between the initial decision and the actual operation may be 0.3, and that of the consistency reward between the actual operation and the first intention label may be 0.7.
Through this reward-function design, the model can balance the various parameters and objectives of the operation during training while improving adaptability, optimizing the overall efficiency and safety of the operation.
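The dual-track reward described above reduces to a weighted sum of two binary consistency checks. A minimal sketch, using the illustrative 0.3/0.7 weights from the description (the comparison of operations and intents by equality is a simplifying assumption):

```python
def dual_track_reward(initial_decision, actual_op, intent_of_op, intent_label,
                      w_mimic=0.3, w_intent=0.7):
    """Weighted sum of two binary consistency rewards:
    - mimic track: 1 if the driller's actual operation matches the
      initial model decision, else 0;
    - intent track: 1 if the actual operation's engineering intent
      matches the first intention label, else 0."""
    r_mimic = 1.0 if actual_op == initial_decision else 0.0
    r_intent = 1.0 if intent_of_op == intent_label else 0.0
    return w_mimic * r_mimic + w_intent * r_intent
```

With these weights, an operation consistent with both tracks scores 1.0, one consistent only with the intent label scores 0.7, and one matching only the initial decision scores 0.3, so the intent track dominates, as the description intends.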
In an embodiment of the present application, the second loss function includes:

L_RL = E_{s,a}[ (R + γ max_{a'} Q(s', a') − Q(s, a))² ]

where L_RL is the second loss function; Q(s, a) is the Q value of the agent taking action a in state s, i.e. the expected total reward obtainable by taking that action in that state; E_{s,a}[·] denotes the expectation of the Q-value difference over all states s and actions a; R is the actual reward obtained by taking the action in the current state; γ is the discount factor; and max_{a'} Q(s', a') is the maximum Q value the agent can obtain in the next state s'.
Through the second loss function, the model can continuously improve its prediction accuracy, so that the accuracy and adaptability of the optimized decision gradually improve.
Step S153: determining an optimized decision with the aim of maximizing the expected total reward.
In the embodiment of the application, the temporal difference (TD) error is calculated to measure the gap between the expected total reward the decision optimization model predicts for taking an action in a state and the reward actually obtained; the decision optimization model is updated accordingly, thereby determining the optimized decision.
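The TD-error computation and model update can be illustrated with a tabular Q-learning step (a simplified stand-in for the embodiment's learned model; the learning rate and discount factor are illustrative):

```python
import numpy as np

def td_loss_and_update(Q, s, a, r, s_next, gamma=0.99, lr=0.1):
    """One tabular Q-learning step: compute the TD error
    delta = R + gamma * max_a' Q(s', a') - Q(s, a),
    the squared-error sample of L_RL, and move Q(s, a) toward the target."""
    td_target = r + gamma * np.max(Q[s_next])   # R + gamma * max_a' Q(s', a')
    td_error = td_target - Q[s, a]              # the TD error
    loss = td_error ** 2                        # one sample of the second loss
    Q[s, a] += lr * td_error                    # update toward the TD target
    return loss, Q
```

Starting from a zero-initialized table, a transition with reward 1.0 produces a TD error of 1.0, a squared loss of 1.0, and shifts Q(s, a) by the learning rate toward the target; iterating this update is what drives the expected total reward toward its maximum in step S153.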
According to the above technical scheme, the second real-time operation data and the first intention label are input together into the reinforcement-learning-based decision optimization model, and through joint training with the reward function and the second loss function, a highly intelligent and highly adaptive decision optimization model is finally formed, providing the driller with decision support that is accurate, adaptive, and able to explain the purpose of each decision.
Fig. 3 schematically shows an apparatus for drilling auxiliary decision making according to an embodiment of the application. As shown in Fig. 3, an embodiment of the present application provides a controller 300, which may include:
a memory 310 configured to store instructions; and
a processor 320 configured to invoke the instructions from the memory 310 and, when executing the instructions, implement the above method for drilling auxiliary decision making.
Embodiments of the present application also provide a machine-readable storage medium having stored thereon instructions for causing a machine to perform the above-described method of drilling assistance decision making.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, Random Access Memory (RAM), and/or nonvolatile memory, such as Read Only Memory (ROM) or flash RAM. Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of storage media for a computer include, but are not limited to, Phase-change Memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, Compact Disc Read Only Memory (CD-ROM), Digital Versatile Disks (DVD) or other optical storage, magnetic cassettes, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium which can be used to store information accessible by a computing device. Computer-readable media, as defined herein, do not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and variations of the present application will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. which come within the spirit and principles of the application are to be included in the scope of the claims of the present application.

Claims (10)

1. A drilling auxiliary decision-making method, characterized by comprising: acquiring first real-time operation data, wherein the first real-time operation data comprises first real-time working parameters of drilling equipment and first real-time drilling data; inputting the first real-time operation data into an initial decision model to obtain an initial decision, wherein the initial decision model is obtained based on behavior cloning; acquiring second real-time operation data of the drilling equipment, wherein the second real-time operation data is obtained after a driller determines an actual operation based on the initial decision and applies the actual operation to the drilling equipment; determining a first intention label according to the second real-time operation data, wherein the first intention label is used to characterize the engineering target of the actual operation; and inputting the second real-time operation data and the first intention label into a decision optimization model based on reinforcement learning to obtain an optimized decision.

2. The method according to claim 1, characterized in that the training process of the initial decision model comprises: acquiring historical operation data and corresponding historical decision information, wherein the historical operation data comprises historical working parameters of the drilling equipment and historical drilling data; determining a second intention label according to the historical drilling data, wherein the second intention label is used to characterize the motivation of the historical decision information; inputting the historical operation data, the corresponding historical decision information, and the second intention label into a preset neural network to obtain predicted decision information corresponding to the historical operation data; calculating, based on a first loss function, a loss value corresponding to the historical operation data according to the historical decision information and the predicted decision information; and iteratively training the preset neural network with the goal of minimizing the loss value, terminating the training when a preset first training termination condition is met to obtain the initial decision model.

3. The method according to claim 2, characterized in that the predicted decision information comprises whether the driller makes a decision, the types of working parameters to be adjusted corresponding to the decision, and the adjustment value of each working parameter type.

4. The method according to claim 2, characterized in that the first loss function is obtained by a weighted sum of a decision-occurrence judgment loss, a decision-type loss, and a decision-parameter-adjustment-value loss, wherein the decision-occurrence judgment loss is a cross-entropy loss function, the decision-type loss is a multi-label cross-entropy loss function, and the decision-parameter-adjustment-value loss is a mean squared error loss function.

5. The method according to claim 2, characterized in that the historical decision information comprises multiple historical decisions made within a preset distance moved by the drilling equipment.

6. The method according to claim 1, characterized in that inputting the second real-time operation data and the first intention label into the decision optimization model based on reinforcement learning to obtain the optimized decision comprises: taking the second real-time operation data as a state space and the first intention label as state enhancement of the state space, inputting both into the decision optimization model, and determining a target action from an action space, wherein the action space comprises the change states of target working parameters; determining an expected total reward through a reward function and a second loss function according to the action space and the first intention label; and determining the optimized decision with the goal of maximizing the expected total reward.

7. The method according to claim 6, characterized in that the reward function is obtained by a weighted sum of a consistency reward between the initial decision and the actual operation and a consistency reward between the actual operation and the first intention label.

8. The method according to claim 6, characterized in that the second loss function comprises:

L_RL = E_{s,a}[ (R + γ max_{a'} Q(s', a') − Q(s, a))² ]

where L_RL is the second loss function; Q represents the expected total reward obtainable by taking an action in a given state; E_{s,a}[·] denotes the expectation of the Q-value difference over all states s and actions a; R is the actual reward obtained by taking an action in the current state; γ is the discount factor; max_{a'} Q(s', a') is the maximum Q value the agent can obtain in the next state s'; and Q(s, a) is the Q value of the agent taking action a in state s.

9. A drilling auxiliary decision-making device, characterized by comprising: a memory configured to store instructions; and a processor configured to invoke the instructions from the memory and, when executing the instructions, implement the drilling auxiliary decision-making method according to any one of claims 1 to 8.

10. A machine-readable storage medium, characterized in that instructions are stored on the machine-readable storage medium, the instructions being used to cause a machine to execute the drilling auxiliary decision-making method according to any one of claims 1 to 8.
CN202511669106.5A 2025-11-14 2025-11-14 Drilling auxiliary decision-making method, device and machine-readable storage medium Pending CN121138812A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202511669106.5A CN121138812A (en) 2025-11-14 2025-11-14 Drilling auxiliary decision-making method, device and machine-readable storage medium


Publications (1)

Publication Number Publication Date
CN121138812A true CN121138812A (en) 2025-12-16

Family

ID=97984914

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202511669106.5A Pending CN121138812A (en) 2025-11-14 2025-11-14 Drilling auxiliary decision-making method, device and machine-readable storage medium

Country Status (1)

Country Link
CN (1) CN121138812A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20230116456A1 (en) * 2019-08-23 2023-04-13 Landmark Graphics Corporation Active reinforcement learning for drilling optimization and automation
CN116848313A (en) * 2021-02-02 2023-10-03 沙特阿拉伯石油公司 Methods and systems for autonomous flow control in hydraulic stimulation operations
CN119195731A (en) * 2024-10-25 2024-12-27 西南石油大学 A drilling parameter adaptive control method based on reinforcement learning
CN119937305A (en) * 2025-01-03 2025-05-06 西安石油大学 An automated control method for key drilling parameters based on improved DDPG algorithm
CN120235181A (en) * 2025-05-29 2025-07-01 北京中数睿智科技有限公司 An automated construction method for end-to-end intelligent agents based on graph structure semantic fusion
CN120487037A (en) * 2025-07-21 2025-08-15 中国石油大学(北京) Downhole tool face dynamic control method and system based on reinforcement learning
CN120597735A (en) * 2025-08-08 2025-09-05 吉林大学 A deep reinforcement learning-driven intelligent real-time optimization method for drilling parameters

Similar Documents

Publication Publication Date Title
US11308413B2 (en) Intelligent optimization of flow control devices
US11480039B2 (en) Distributed machine learning control of electric submersible pumps
CN104024572A (en) Method and system for predicting drill string stuck pipe event
US20210071509A1 (en) Deep intelligence for electric submersible pumping systems
NO20211401A1 (en) Automated concurrent path planning and drilling parameter optimization using robotics
CN114370264B (en) Mechanical penetration rate determination, drilling parameter optimization methods, devices and electronic equipment
CN113553356A (en) Drilling parameter prediction method and system
CN114519291A (en) Method for establishing working condition monitoring and control model and application method and device thereof
CN116644844A (en) Stratum pressure prediction method based on neural network time sequence
CN113627639A (en) Well testing productivity prediction method and system for carbonate fracture-cave reservoir
CN112926805A (en) Intelligent vertical well testing interpretation method and device based on deep reinforcement learning
CN116894213B (en) Method and device for identifying working condition of excavator
CN117313278A (en) An intelligent matching method and system for operation control parameters of large hydraulic pile hammers
CN120068619A (en) Targeting positioning grouting high-efficiency plugging and reinforcing method and system based on deep learning
Wang et al. Application of Recurrent Neural Network Long Short-Term Memory Model on Early Kick Detection
CN114398817A (en) Method and device for dynamically estimating production operation condition of natural gas shaft
KR102612959B1 (en) System and method for predicting shale gas production based on deep learning
CN121138812A (en) Drilling auxiliary decision-making method, device and machine-readable storage medium
CN119191099B (en) Control method, device, equipment and storage medium for running track
CN118793554B (en) A method and system for adjusting the main shaft axis of a generator set
CN120162661A (en) ROP prediction method based on LSTM and ensemble learning algorithm
US20250036837A1 (en) Machine and systems for identifying wells priority for corrosion log utilizing machine learning model
CN117035549A (en) A cost algorithm method for evaluating urban water supply pipeline network plans
CN118442041A (en) Optimization control method and system for deep well lifting system
CN119918160B (en) Automatic gesture correction method and system based on data driving

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination