CN117406972A - RPA high-value process instance discovery method and system based on fit analysis - Google Patents

Info

Publication number
CN117406972A
Authority
CN
China
Prior art keywords
flow
instance
value
rpa
flow instance
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202311714610.3A
Other languages
Chinese (zh)
Other versions
CN117406972B (en
Inventor
裴学良
邓逸
郑超
袁水平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui Sigao Intelligent Technology Co ltd
Original Assignee
Anhui Sigao Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui Sigao Intelligent Technology Co ltd filed Critical Anhui Sigao Intelligent Technology Co ltd
Priority to CN202311714610.3A priority Critical patent/CN117406972B/en
Publication of CN117406972A publication Critical patent/CN117406972A/en
Application granted granted Critical
Publication of CN117406972B publication Critical patent/CN117406972B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/30Creation or generation of source code
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02PCLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/02Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a method for discovering high-value RPA flow instances based on fitness analysis, comprising the following steps: S1: acquire an interaction log L and preprocess it to obtain a new interaction log L; S2: cluster the new interaction log L to obtain a clustering result set, and obtain a flow model set from the clustering result set; S3: simulate the running state of each flow model in the flow model set with a flow executor to obtain a flow instance set, and compute a high-value flow instance set from the flow instance set; S4: convert the high-value flow instance set into an RPA executable script. The method clusters the interaction logs to obtain the flow model set, obtains flow instances by simulating the running state of the flow models, and determines the high-value flow instances by matching the flow instances against the clusters and computing the fitness of each match, so that flows with high potential value can be identified automatically.

Description

RPA high-value flow instance discovery method and system based on fitness analysis
Technical Field
The invention relates to the field of process mining and automation, in particular to an RPA high-value process instance discovery method and system based on fitness analysis.
Background
Robotic Process Automation (RPA) is software process automation: it simulates manual operation and, by interacting with the interface of an application program or system, completes operations such as data entry, data processing, and system integration. This reduces the time consumed by manual operation, reduces the human errors caused by repetitive work, and improves enterprise benefits.
Flow discovery is the most challenging task in the field of process mining. It takes a serialized event log as input, discovers structured flow traces from that input, and outputs a business flow model. Before implementing RPA, an organization can use flow discovery to gain a comprehensive understanding of a business flow, including its steps, activities, inputs and outputs, and the rules and constraints associated with it. This provides a basis for developing and configuring RPA robots that can accurately simulate and automatically execute specific business flows.
Conformance checking (consistency checking) is one of the important techniques in the field of process mining. Based on the flow model produced by flow discovery, it checks whether the real flow execution is consistent with the standard flow model, infers from this whether the business is compliant, and through analysis finds deviating and non-compliant business events, their routes, and their causes.
Existing methods for discovering high-value RPA flows mainly judge the importance of a flow within the overall scenario on the basis of business logic or subjective experience, in particular by identifying high-frequency flow events and error-prone flow events, so as to exploit the advantages of RPA automation. However, such methods depend too heavily on subjective assumptions, and the high-value strategy must be readjusted whenever the business scenario changes.
Disclosure of Invention
In order to solve the technical problems, the invention provides an RPA high-value flow instance discovery method based on fitness analysis, which comprises the following steps:
S1: Acquire an interaction log L and preprocess it to obtain a new interaction log L;
S2: Cluster the new interaction log L to obtain a clustering result set, and obtain a flow model set from the clustering result set;
S3: Simulate the running state of each flow model in the flow model set with a flow executor to obtain a flow instance set, and compute a high-value flow instance set from the flow instance set;
S4: Convert the high-value flow instance set into an RPA executable script.
Preferably, step S1 specifically includes:
S11: Obtain the interaction log $L = \{\sigma_1, \sigma_2, \sigma_3, \ldots, \sigma_u\}$ generated by operations in an application program, where $\sigma_u = \langle e_1, \ldots, e_t \rangle$ is a flow path and $u$ is the number of the flow path; each event in the interaction log L is $e_t = (\zeta, a, t)$, where $\zeta$ is the unique identifier of the flow path to which the event belongs, $a$ is the activity type of the event, and $t$ is the time;
S12: Set the frequency range $(0, g)$ of low-frequency events and count the number of occurrences of each event; if an event occurs fewer than $g$ times, delete from the interaction log L the flow path to which an event of that activity type belongs;
S13: Traverse the events in the remaining flow paths; if adjacent events have the same activity type, delete the later event from the flow path, thereby obtaining the new interaction log L.
Preferably, step S2 specifically includes:
S21: Obtain the flow paths of the new interaction log L, set the number of flow path types to $n$, obtain the feature vectors of all flow paths, compute the similarity between the feature vectors of all flow paths, and use a clustering algorithm with this similarity to group flow paths with similar feature vectors into the same cluster $S_x$, where $x$ is the number of the cluster and $1 \le x \le n$, obtaining the clustering result set $S_1, \ldots, S_n$;
S22: Input the clustering result set into a heuristic process discovery algorithm and output the flow model set $P_1, \ldots, P_n$.
Preferably, step S21 specifically includes:
S211: For a flow path $\sigma_l$, take the frequency of the activity types of the events occurring in the flow path as the type feature $\alpha$, take the adjacency relations between the events in the flow path as the transition feature $\beta$, and use $\gamma = [\alpha, \beta]$ as the feature vector of the flow path $\sigma_l$, where $l$ is the number of the flow path;
S212: Randomly select $n$ flow paths as the cluster centers $O = \{o_1, o_2, \ldots, o_n\}$, where $o_n$ is the $n$-th cluster center element; compute the feature-vector similarity between the other flow paths and the cluster centers; assign each non-center flow path to the cluster $S_x$ of the cluster center with which it has the greatest similarity, obtaining an iterative clustering result set; and compute the metric value of that iterative clustering result set;
S213: Repeat step S212 and select the iterative clustering result set with the smallest metric value as the output clustering result set $S_1, \ldots, S_n$.
Preferably, the step S3 specifically includes:
S31: Build a flow executor and input the flow model set $P_1, \ldots, P_n$ into it to obtain the flow instance set corresponding to the flow model set, which contains a subset of flow instances for the $i$-th flow model $P_i$; $I_j^i$ denotes the $j$-th flow instance generated by the $i$-th flow model $P_i$, and $tr_\lambda = (j, a', \lambda)$ denotes a node of that flow instance, where $j$ is the unique identifier of the flow instance and $a'$ is the activity type of the node;
S32: In the flow instance subset, find the flow instances that match the flow paths of cluster $S_x$, compute the number of matches, and compute the fitness of each match;
S33: Set the frequency range $(0, \mu)$ of low-frequency flow instances and, for each flow instance in the flow instance subset whose number of matches is greater than $\mu$, compute the mean of its fitness values to obtain the value of the flow instance;
S34: Set a value threshold $\theta$, take the flow instances whose value is greater than $\theta$ as high-value flow instances, and obtain the high-value flow instance set $Q_x$ of cluster $S_x$;
S35: Repeat steps S32 to S34 to obtain the high-value flow instance sets $Q_1, \ldots, Q_n$ of all clusters.
Preferably, step S32 specifically includes:
S321: For a node of activity type $a'$, set a cost of being skipped and a cost of being inserted; the sets $A_{skip}$ and $A_{insert}$ record the nodes skipped and inserted during each traversal, and both are initially empty;
S322: To obtain $A_{skip}$ and $A_{insert}$ during matching, select a flow path $\sigma_u^x = \langle e_1, \ldots, e_t \rangle$ in cluster $S_x$ and traverse the events of $\sigma_u^x$ in time order; a sequence is maintained that represents the flow instance nodes matched by the traversal up to time $m$, where $t$ is the time, $m$ is the number of traversal steps, and temp is the length of that sequence;
S323: When $t = m+1$, convert the event $e_{m+1} = (k, a, m+1)$ of $\sigma_u^x$ into a node of a flow instance and append it to the sequence to generate a candidate sequence, using an operator that converts an input event into a node of a flow instance;
Determine whether the candidate sequence exists in the flow model $P_x$; if it does, update the sequence to the candidate sequence, let $m = m+1$, and return to step S323;
Otherwise, consider the candidate sequence invalid and proceed to step S324;
S324: Consider inserting several nodes into the sequence, or skipping the latest node, until a sequence that is valid for the flow model $P_x$ is obtained;
If both inserting nodes and skipping the node can make the sequence valid, compute the first cost and the second cost of the two operations, select the operation with the lower cost, update the set $A_{insert}$ or $A_{skip}$ accordingly, let $m = m+1$, and return to step S323; here the inserted nodes form a set of nodes, $\pi$ is the number of a node, and the sets are updated by a merging (union) operation;
S325: Repeat steps S323 to S324 until all events in the flow path $\sigma_u^x$ have been traversed; at this point the resulting sequence is valid for the flow model $P_x$, that is, there is a flow instance $I_j^x$ that matches the sequence;
S326: Based on $A_{skip}$ and $A_{insert}$, compute the fitness of the matched flow instance $I_j^x$; the fitness is calculated as follows:
where $z_j$ is the number of times flow instance $I_j^x$ is matched by the flow paths in cluster $S_x$, $str.a'$ is the activity type of a node in the set $A_{skip}$, $itr.a'$ is the activity type of a node in the set $A_{insert}$, and $tr.a'$ is the activity type of a node in flow instance $I_j^x$;
Update $z_j = z_j + 1$ and reset the sets $A_{skip}$ and $A_{insert}$ to empty;
S327: Traverse cluster $S_x$, repeating steps S322 to S326, to compute the number of times each flow instance of $P_x$ is matched by the flow paths of cluster $S_x$ and the fitness of each matched flow instance.
Preferably, the value of a flow instance is calculated as follows:
where $Score(I_j^x)$ is the value of flow instance $I_j^x$, $I_j^x$ is the $j$-th flow instance in the flow instance subset, $z_j$ is the number of times flow instance $I_j^x$ is matched by the flow paths in cluster $S_x$, and each fitness value of flow instance $I_j^x$ is greater than or equal to 0 and is an element of $V_j^x$.
Preferably, step S4 specifically includes:
S41: For every high-value flow instance in the high-value flow instance set: if parallel flow instance branches exist, merge all nodes on the parallel branches into one new node; if a node's functional complexity exceeds that of a single RPA flow step, split the node into several child nodes; this yields the processed high-value flow instance set;
S42: Convert the processed high-value flow instance set into an RPA executable script through an RPA instruction converter.
An RPA high-value flow instance discovery system based on fitness analysis comprises the following modules:
a preprocessing module, used to acquire an interaction log L and preprocess it to obtain a new interaction log L;
a flow model set acquisition module, used to cluster the new interaction log L to obtain a clustering result set and to obtain a flow model set from the clustering result set;
a high-value flow instance set acquisition module, used to simulate the running state of each flow model in the flow model set with the flow executor to obtain a flow instance set and to compute the high-value flow instance set from the flow instance set;
an executable script generation module, used to convert the high-value flow instance set into an RPA executable script.
The invention has the following beneficial effects:
The method identifies flows with high potential value automatically, reduces the need for manual intervention and subjective judgment, and improves the benefit of flow automation.
Drawings
FIG. 1 is a flow chart of a method according to an embodiment of the present invention;
The achievement of the objects, the functional features, and the advantages of the present invention are further described below with reference to the accompanying drawings and in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Referring to FIG. 1, the application provides an RPA high-value flow instance discovery method based on fitness analysis, which determines high-value flow instances by comparing the fitness between actual flow instances and the standard flow, so that flows with high potential value can be identified more easily.
The method can be carried out automatically, which reduces the need for manual intervention and subjective judgment. It can also be customized to different business requirements and flow characteristics; for example, an enterprise can adjust the costs assigned to different activity types and the fitness threshold according to actual conditions, so as to adapt to different business scenarios and flow changes.
In addition, the fitness-based method lets an enterprise gain insight into its flow instances. Having identified high-value flow instances, the enterprise can further analyze and evaluate their impact and potential, which supports decision making and optimization.
The method comprises the following steps:
S1: Acquire an interaction log L and preprocess it to obtain a new interaction log L;
S2: Cluster the new interaction log L to obtain a clustering result set, and obtain a flow model set from the clustering result set;
S3: Simulate the running state of each flow model in the flow model set with a flow executor to obtain a flow instance set, and compute a high-value flow instance set from the flow instance set;
S4: Convert the high-value flow instance set into an RPA executable script.
Further, the step S1 specifically includes:
S11: Obtain the interaction log $L = \{\sigma_1, \sigma_2, \sigma_3, \ldots, \sigma_u\}$ generated by operations in an application program, where $\sigma_u = \langle e_1, \ldots, e_t \rangle$ is a flow path and $u$ is the number of the flow path; each event in the interaction log L is $e_t = (\zeta, a, t)$, where $\zeta$ is the unique identifier of the flow path to which the event belongs, $a$ is the activity type of the event, and $t$ is the time;
S12: Set the frequency range $(0, g)$ of low-frequency events and count the number of occurrences of each event; if an event occurs fewer than $g$ times, delete from the interaction log L the flow path to which an event of that activity type belongs;
S13: Traverse the events in the remaining flow paths; if adjacent events have the same activity type, delete the later event from the flow path, thereby obtaining the new interaction log L;
Specifically, if adjacent events $e_t = (\zeta, a, t)$ and $e_{t+1} = (\zeta, a, t+1)$ have the same activity type, the repeated event is removed from the flow path and only the first occurrence $e_t$ is kept.
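Steps S12 and S13 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `preprocess`, the dictionary layout of the log, and the reading of S12 as "count occurrences per activity type across the whole log" are all assumptions made for the example.

```python
from collections import Counter
from typing import Dict, List, Tuple

# An event is (path_id, activity_type, time), mirroring e_t = (zeta, a, t).
Event = Tuple[str, str, int]


def preprocess(log: Dict[str, List[Event]], g: int) -> Dict[str, List[Event]]:
    """Return a cleaned copy of the log (hypothetical helper)."""
    # S12 (assumed reading): count how often each activity type occurs in the log.
    counts = Counter(a for path in log.values() for (_, a, _) in path)
    rare = {a for a, c in counts.items() if c < g}

    cleaned: Dict[str, List[Event]] = {}
    for path_id, path in log.items():
        # Drop every flow path that contains a low-frequency activity type.
        if any(a in rare for (_, a, _) in path):
            continue
        # S13: collapse runs of adjacent events with the same activity type,
        # keeping only the first event of each run.
        deduped: List[Event] = []
        for event in sorted(path, key=lambda e: e[2]):
            if deduped and deduped[-1][1] == event[1]:
                continue
            deduped.append(event)
        cleaned[path_id] = deduped
    return cleaned
```

With `g = 2`, a path containing an activity that appears only once in the whole log is removed, and a path such as open, open, save is reduced to open, save.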
Further, the step S2 specifically includes:
S21: Obtain the flow paths of the new interaction log L, set the number of flow path types to $n$, obtain the feature vectors of all flow paths, compute the similarity between the feature vectors of all flow paths, and use a clustering algorithm with this similarity to group flow paths with similar feature vectors into the same cluster $S_x$, where $x$ is the number of the cluster and $1 \le x \le n$, obtaining the clustering result set $S_1, \ldots, S_n$;
Specifically, the flow paths are encoded as features under an event-type view and an event-transition view, the number $n$ of flow path types is set, the feature-vector similarity between all flow paths is computed, and a clustering algorithm groups flow paths with similar feature values into the same cluster;
S22: Input the clustering result set into a heuristic process discovery algorithm and output the flow model set $P_1, \ldots, P_n$.
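The text names only "a heuristic process discovery algorithm" for step S22. As a hedged illustration of what such an algorithm computes, the sketch below uses the dependency measure from the well-known Heuristics Miner family, $(|a>b| - |b>a|) / (|a>b| + |b>a| + 1)$, on the activity sequences of one cluster. The function name `dependency_graph` and the `threshold` parameter are assumptions for the example and do not come from the source.

```python
from collections import Counter
from typing import Dict, List, Tuple


def dependency_graph(traces: List[List[str]],
                     threshold: float = 0.5) -> Dict[Tuple[str, str], float]:
    """Keep the directly-follows edges whose dependency measure is high enough."""
    # Count how often activity a is directly followed by activity b.
    follows: Counter = Counter()
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            follows[(a, b)] += 1

    edges: Dict[Tuple[str, str], float] = {}
    for (a, b), ab in follows.items():
        if a == b:
            # Adjacent repeats were already removed in step S13, so skip them.
            continue
        ba = follows.get((b, a), 0)
        dependency = (ab - ba) / (ab + ba + 1)
        if dependency >= threshold:
            edges[(a, b)] = dependency
    return edges
```

The retained edges form a simple flow model for the cluster; a real discovery tool would add handling for loops, parallel branches, and start and end activities.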
Further, step S21 specifically includes:
S211: For a flow path $\sigma_l$, take the frequency of the activity types of the events occurring in the flow path as the type feature $\alpha$, take the adjacency relations between the events in the flow path as the transition feature $\beta$, and use $\gamma = [\alpha, \beta]$ as the feature vector of the flow path $\sigma_l$, where $l$ is the number of the flow path;
S212: Randomly select $n$ flow paths as the cluster centers $O = \{o_1, o_2, \ldots, o_n\}$, where $o_n$ is the $n$-th cluster center element; compute the feature-vector similarity between the other flow paths and the cluster centers; assign each non-center flow path to the cluster $S_x$ of the cluster center with which it has the greatest similarity, obtaining an iterative clustering result set; and compute the metric value of that iterative clustering result set;
S213: Repeat step S212 and select the iterative clustering result set with the smallest metric value as the output clustering result set $S_1, \ldots, S_n$.
Specifically, the feature similarity value between any flow path and a cluster center is calculated as follows:
where the calculation uses the two-norm of the feature vectors, and the cluster center is one of the selected cluster centers;
For each non-center flow path, the cluster center with the greatest feature similarity is selected and the flow path joins that center's cluster;
A function evaluates the average similarity between the flow paths and their center points after a given clustering, computed over the clustering result obtained with the selected cluster centers;
The metric value is obtained by iterative calculation; when the metric value converges or the maximum number of iterations is exceeded, the cluster division corresponding to the retained metric value is taken as the result of clustering the flow paths of the interaction log by their features.
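Steps S211 to S213 can be sketched as below. The similarity and metric formulas are not legible in this text, so the sketch makes two loudly labeled assumptions: similarity is taken to be cosine similarity (suggested by the mention of the two-norm), and the metric is taken to be one minus the average similarity of non-center paths to their centers, so that a smaller metric is better. The names `features`, `cosine`, `cluster`, `rounds`, and `seed` are hypothetical.

```python
import math
import random
from collections import Counter
from typing import Dict, List, Tuple


def features(trace: List[str]) -> Counter:
    """gamma = [alpha, beta], stored sparsely: type counts plus adjacent-pair counts."""
    gamma: Counter = Counter(("type", a) for a in trace)
    gamma.update(("trans", a, b) for a, b in zip(trace, trace[1:]))
    return gamma


def cosine(u: Counter, v: Counter) -> float:
    """Cosine similarity (an assumed stand-in for the source's similarity formula)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = math.sqrt(sum(x * x for x in u.values())) * \
        math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def cluster(traces: List[List[str]], n: int, rounds: int = 20,
            seed: int = 0) -> Tuple[float, Dict[int, List[int]]]:
    """Repeat random-center assignment and keep the round with the smallest metric."""
    rng = random.Random(seed)
    vecs = [features(t) for t in traces]
    best = None
    for _ in range(rounds):
        centers = rng.sample(range(len(traces)), n)
        clusters: Dict[int, List[int]] = {c: [c] for c in centers}
        sims: List[float] = []
        for i in range(len(traces)):
            if i in clusters:
                continue  # cluster centers stay in their own cluster
            nearest = max(centers, key=lambda c: cosine(vecs[i], vecs[c]))
            clusters[nearest].append(i)
            sims.append(cosine(vecs[i], vecs[nearest]))
        # Assumed metric: 1 - average similarity, so smaller is better.
        metric = 1.0 - (sum(sims) / len(sims) if sims else 1.0)
        if best is None or metric < best[0]:
            best = (metric, clusters)
    return best
```

The function returns the best metric together with a mapping from each center's index to the indices of the flow paths in its cluster.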
Further, the step S3 specifically includes:
S31: Build a flow executor and input the flow model set $P_1, \ldots, P_n$ into it to obtain the flow instance set corresponding to the flow model set, which contains a subset of flow instances for the $i$-th flow model $P_i$; $I_j^i$ denotes the $j$-th flow instance generated by the $i$-th flow model $P_i$, and $tr_\lambda = (j, a', \lambda)$ denotes a node of that flow instance, where $j$ is the unique identifier of the flow instance and $a'$ is the activity type of the node;
S32: In the flow instance subset, find the flow instances that match the flow paths of cluster $S_x$, compute the number of matches, and compute the fitness of each match;
S33: Set the frequency range $(0, \mu)$ of low-frequency flow instances and, for each flow instance in the flow instance subset whose number of matches is greater than $\mu$, compute the mean of its fitness values to obtain the value of the flow instance;
S34: Set a value threshold $\theta$, take the flow instances whose value is greater than $\theta$ as high-value flow instances, and obtain the high-value flow instance set $Q_x$ of cluster $S_x$;
S35: Repeat steps S32 to S34 to obtain the high-value flow instance sets $Q_1, \ldots, Q_n$ of all clusters.
Further, the step S32 specifically includes:
S321: For a node of activity type $a'$, set a cost of being skipped and a cost of being inserted; the sets $A_{skip}$ and $A_{insert}$ record the nodes skipped and inserted during each traversal, and both are initially empty;
S322: To obtain $A_{skip}$ and $A_{insert}$ during matching, select a flow path $\sigma_u^x = \langle e_1, \ldots, e_t \rangle$ in cluster $S_x$ and traverse the events of $\sigma_u^x$ in time order; a sequence is maintained that represents the flow instance nodes matched by the traversal up to time $m$, where $t$ is the time, $m$ is the number of traversal steps, and temp is the length of that sequence;
S323: When $t = m+1$, convert the event $e_{m+1} = (k, a, m+1)$ of $\sigma_u^x$ into a node of a flow instance and append it to the sequence to generate a candidate sequence, using an operator that converts an input event into a node of a flow instance;
Determine whether the candidate sequence exists in the flow model $P_x$; if it does, update the sequence to the candidate sequence, let $m = m+1$, and return to step S323;
Otherwise, consider the candidate sequence invalid and proceed to step S324;
S324: Consider inserting several nodes into the sequence, or skipping the latest node, until a sequence that is valid for the flow model $P_x$ is obtained;
If both inserting nodes and skipping the node can make the sequence valid, compute the first cost and the second cost of the two operations, select the operation with the lower cost, update the set $A_{insert}$ or $A_{skip}$ accordingly, let $m = m+1$, and return to step S323; here the inserted nodes form a set of nodes, $\pi$ is the number of a node, and the sets are updated by a merging (union) operation;
S325: Repeat steps S323 to S324 until all events in the flow path $\sigma_u^x$ have been traversed; at this point the resulting sequence is valid for the flow model $P_x$, that is, there is a flow instance $I_j^x$ that matches the sequence;
S326: Based on $A_{skip}$ and $A_{insert}$, compute the fitness of the matched flow instance $I_j^x$; the fitness is calculated as follows:
where $z_j$ is the number of times flow instance $I_j^x$ is matched by the flow paths in cluster $S_x$, $str.a'$ is the activity type of a node in the set $A_{skip}$, $itr.a'$ is the activity type of a node in the set $A_{insert}$, and $tr.a'$ is the activity type of a node in flow instance $I_j^x$;
Update $z_j = z_j + 1$ and reset the sets $A_{skip}$ and $A_{insert}$ to empty;
S327: Traverse cluster $S_x$, repeating steps S322 to S326, to compute the number of times each flow instance of $P_x$ is matched by the flow paths of cluster $S_x$ and the fitness of each matched flow instance.
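The skip and insert bookkeeping of steps S321 to S325 can be illustrated with a simplified sketch. Note the substitution: instead of the incremental procedure against a flow model described above, the sketch aligns one flow path against one already-generated flow instance using a standard dynamic-programming alignment, and it stops at the total cost and the two sets because the fitness formula is not legible in this text. The function name `align` and the default unit costs are assumptions.

```python
from typing import Callable, List, Tuple


def align(path: List[str], instance: List[str],
          skip_cost: Callable[[str], float] = lambda a: 1.0,
          insert_cost: Callable[[str], float] = lambda a: 1.0
          ) -> Tuple[float, List[str], List[str]]:
    """Return (total cost, skipped activities A_skip, inserted activities A_insert)."""
    n, m = len(path), len(instance)
    inf = float("inf")
    # cost[i][j] is the cheapest way to align path[:i] with instance[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0:  # skip the path event path[i-1]
                cost[i][j] = min(cost[i][j], cost[i - 1][j] + skip_cost(path[i - 1]))
            if j > 0:  # insert the instance node instance[j-1]
                cost[i][j] = min(cost[i][j], cost[i][j - 1] + insert_cost(instance[j - 1]))
            if i > 0 and j > 0 and path[i - 1] == instance[j - 1]:
                cost[i][j] = min(cost[i][j], cost[i - 1][j - 1])  # free match

    # Walk back through the table to recover which nodes were skipped or inserted.
    skipped: List[str] = []
    inserted: List[str] = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and path[i - 1] == instance[j - 1]
                and cost[i][j] == cost[i - 1][j - 1]):
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + skip_cost(path[i - 1]):
            skipped.append(path[i - 1])
            i -= 1
        else:
            inserted.append(instance[j - 1])
            j -= 1
    return cost[n][m], skipped[::-1], inserted[::-1]
```

Aligning the path open, edit, save with the instance open, validate, save under unit costs skips `edit` and inserts `validate`; per-activity costs can be supplied through `skip_cost` and `insert_cost`.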
Further, the value of a flow instance is calculated as follows:
where $Score(I_j^x)$ is the value of flow instance $I_j^x$, $I_j^x$ is the $j$-th flow instance in the flow instance subset, $z_j$ is the number of times flow instance $I_j^x$ is matched by the flow paths in cluster $S_x$, and each fitness value of flow instance $I_j^x$ is greater than or equal to 0 and is an element of $V_j^x$.
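Steps S33 and S34 can be sketched as follows. Because the value formula is not legible in this text, the sketch assumes, following the wording of step S33, that the value of a flow instance is the arithmetic mean of its recorded fitness values and that instances matched no more than $\mu$ times are ignored. The function name `high_value_instances` and the dictionary layout are hypothetical.

```python
from statistics import mean
from typing import Dict, List


def high_value_instances(fitness_by_instance: Dict[str, List[float]],
                         mu: int, theta: float) -> Dict[str, float]:
    """Map each high-value instance id to its value (assumed: mean fitness)."""
    scores: Dict[str, float] = {}
    for instance_id, fitness_values in fitness_by_instance.items():
        z = len(fitness_values)  # z_j: number of times the instance was matched
        if z <= mu:
            continue  # S33: low-frequency flow instances are not scored
        score = mean(fitness_values)
        if score > theta:  # S34: keep only values above the threshold theta
            scores[instance_id] = score
    return scores
```

An instance matched once is dropped when `mu = 1`, even if its single fitness value is high, which reflects the frequency filter of step S33.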
Further, the step S4 specifically includes:
S41: For every high-value flow instance in the high-value flow instance set: if parallel flow instance branches exist, merge all nodes on the parallel branches into one new node; if a node's functional complexity exceeds that of a single RPA flow step, split the node into several child nodes; this yields the processed high-value flow instance set;
S42: Convert the processed high-value flow instance set into an RPA executable script through an RPA instruction converter.
Specifically, each flow node is converted into a Script-Action tag in the script, and the operation attributes and operation targets required by the flow node are converted into Script-Command tags; each Script-Command contains the complete interaction information needed to direct one basic operation, and the Script-Commands are arranged in the order of the operations.
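The conversion in step S42 can be sketched as a small XML generator. Only the tag names Script-Action and Script-Command come from the text; the root element `Script`, the attribute names `name`, `attribute`, and `target`, the input dictionary layout, and the function name `to_script` are assumptions made for illustration, since the actual script schema is not given.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def to_script(nodes: List[Dict]) -> str:
    """Serialize flow nodes, in order, as Script-Action / Script-Command elements."""
    root = ET.Element("Script")  # hypothetical root element
    for node in nodes:
        # One Script-Action per flow node.
        action = ET.SubElement(root, "Script-Action", name=node["activity"])
        # One Script-Command per basic operation, kept in operation order.
        for command in node["commands"]:
            ET.SubElement(action, "Script-Command",
                          attribute=command["attribute"],
                          target=command["target"])
    return ET.tostring(root, encoding="unicode")
```

For example, a `login` node with a single click command on a target `#submit` produces one Script-Action element containing one Script-Command element.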
An RPA high-value flow instance discovery system based on fitness analysis comprises the following modules:
a preprocessing module, used to acquire an interaction log L and preprocess it to obtain a new interaction log L;
a flow model set acquisition module, used to cluster the new interaction log L to obtain a clustering result set and to obtain a flow model set from the clustering result set;
a high-value flow instance set acquisition module, used to simulate the running state of each flow model in the flow model set with the flow executor to obtain a flow instance set and to compute the high-value flow instance set from the flow instance set;
an executable script generation module, used to convert the high-value flow instance set into an RPA executable script.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
The embodiment numbers above are for description only and do not indicate the relative merits of the embodiments. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The terms first, second, third, and so on do not denote any order and are to be interpreted merely as labels.
The foregoing describes only preferred embodiments of the present invention and does not limit the scope of the invention; any equivalent structure or equivalent process derived from the disclosure herein, whether applied directly or indirectly in other related technical fields, likewise falls within the scope of protection of the invention.

Claims (9)

1. The RPA high-value flow instance discovery method based on fitting degree analysis is characterized by comprising the following steps:
S1: Acquire an interaction log L and preprocess it to obtain a new interaction log L;
S2: Cluster the new interaction log L to obtain a clustering result set, and obtain a flow model set from the clustering result set;
S3: Simulate the running state of each flow model in the flow model set with a flow executor to obtain a flow instance set, and compute a high-value flow instance set from the flow instance set;
S4: Convert the high-value flow instance set into an RPA executable script.
2. The RPA high-value flow instance discovery method based on fitness analysis according to claim 1, wherein step S1 specifically comprises:
S11: Obtain the interaction log $L = \{\sigma_1, \sigma_2, \sigma_3, \ldots, \sigma_u\}$ generated by operations in an application program, where $\sigma_u = \langle e_1, \ldots, e_t \rangle$ is a flow path and $u$ is the number of the flow path; each event in the interaction log L is $e_t = (\zeta, a, t)$, where $\zeta$ is the unique identifier of the flow path to which the event belongs, $a$ is the activity type of the event, and $t$ is the time;
S12: Set the frequency range $(0, g)$ of low-frequency events and count the number of occurrences of each event; if an event occurs fewer than $g$ times, delete from the interaction log L the flow path to which an event of that activity type belongs;
S13: Traverse the events in the remaining flow paths; if adjacent events have the same activity type, delete the later event from the flow path, thereby obtaining the new interaction log L.
3. The RPA high-value flow instance discovery method based on fitness analysis according to claim 2, wherein step S2 specifically comprises:
S21: obtaining the flow paths of the new interaction log L, where the number of flow path types is n; obtaining the feature vector of each flow path; calculating the similarity between the feature vectors of the flow paths; and clustering flow paths with similar feature vectors into the same cluster $S_x$ by means of a clustering algorithm and the similarity, where x is the number of the cluster and $1 \le x \le n$, thereby obtaining a clustering result set $S_1, \ldots, S_n$;
S22: inputting the clustering result set into a heuristic process discovery algorithm, and outputting a flow model set $P_1, \ldots, P_n$.
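The claim does not say which heuristic process discovery algorithm is used in S22. As one possible reading, the sketch below applies the dependency measure of the well-known Heuristics Miner to a single cluster and keeps the edges above a threshold. The function names, the threshold of 0.6, and the representation of a model as a set of edges are assumptions made for illustration.

```python
from collections import Counter

def dependency_measures(cluster):
    """Return {(a, b): dependency} for the activity sequences of one cluster."""
    follows = Counter()
    for trace in cluster:
        for a, b in zip(trace, trace[1:]):
            follows[(a, b)] += 1
    dep = {}
    for (a, b), ab in follows.items():
        ba = follows.get((b, a), 0)
        if a == b:
            # Loop of length one.
            dep[(a, b)] = ab / (ab + 1)
        else:
            # Heuristics Miner dependency measure.
            dep[(a, b)] = (ab - ba) / (ab + ba + 1)
    return dep

def discover_model(cluster, threshold=0.6):
    """Keep the edges whose dependency measure reaches the (assumed) threshold."""
    deps = dependency_measures(cluster)
    return {edge for edge, d in deps.items() if d >= threshold}

cluster = [["open", "copy", "paste"], ["open", "copy", "paste"], ["open", "paste"]]
model = discover_model(cluster)
```

Here `open -> copy` and `copy -> paste` each score 2/3 and are kept, while the single `open -> paste` observation scores 1/2 and is filtered out.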
4. The RPA high-value flow instance discovery method based on fitness analysis according to claim 3, wherein step S21 specifically comprises:
S211: taking the frequency with which each event activity type occurs in flow path $\sigma_l$ as the type feature α, taking the adjacency relationships among the events in the flow path as the transition feature β, and taking $\gamma = [\alpha, \beta]$ as the feature vector of flow path $\sigma_l$, where l is the number of the flow path;
S212: randomly selecting n flow paths as clustering centers $O = \{o_1, o_2, \ldots, o_n\}$, where $o_n$ is the n-th clustering center; calculating the feature-vector similarity between every other flow path and each clustering center; assigning each non-center flow path to the cluster $S_x$ of the clustering center with which it has the greatest similarity, thereby obtaining an iterative clustering result set; and calculating a metric value for the iterative clustering result set;
S213: repeating step S212 and selecting the iterative clustering result set with the smallest metric value as the output clustering result set $S_1, \ldots, S_n$.
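Steps S211 to S213 can be sketched as follows. The claim fixes neither the similarity function nor the metric, so this sketch assumes cosine similarity and uses the summed dissimilarity (1 minus similarity) of each flow path to its center as the metric. All function names, the number of rounds, and the seed are illustrative.

```python
import math
import random
from collections import Counter

def feature_vector(trace):
    """gamma = [alpha, beta]: activity counts plus directly-follows pair counts."""
    gamma = Counter(("act", a) for a in trace)
    gamma.update(("pair", a, b) for a, b in zip(trace, trace[1:]))
    return gamma

def cosine(u, v):
    """Cosine similarity of two sparse count vectors (assumed similarity)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def cluster_once(traces, n, rng):
    """S212: pick random centers, assign each path to its most similar center."""
    vecs = [feature_vector(t) for t in traces]
    centers = rng.sample(range(len(traces)), n)
    clusters = {c: [c] for c in centers}
    metric = 0.0
    for i, v in enumerate(vecs):
        if i in clusters:
            continue
        best = max(centers, key=lambda c: cosine(v, vecs[c]))
        clusters[best].append(i)
        metric += 1.0 - cosine(v, vecs[best])  # assumed metric
    return list(clusters.values()), metric

def cluster_paths(traces, n, rounds=20, seed=0):
    """S213: repeat S212 and keep the result with the smallest metric value."""
    rng = random.Random(seed)
    results = [cluster_once(traces, n, rng) for _ in range(rounds)]
    clusters, _ = min(results, key=lambda r: r[1])
    return clusters

traces = [
    ["open", "copy", "paste"], ["open", "copy", "paste"],
    ["login", "search", "export"], ["login", "search", "export"],
]
clusters = cluster_paths(traces, n=2)
partition = sorted(sorted(c) for c in clusters)
```

Each cluster is returned as a list of indices into `traces`; with the two groups of identical flow paths above, the lowest-metric result separates them.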
5. The RPA high-value flow instance discovery method based on fitness analysis according to claim 3, wherein step S3 specifically comprises:
S31: building a flow executor and inputting the flow model set $P_1, \ldots, P_n$ into the flow executor to obtain the flow instance set corresponding to the flow model set, the flow instance set comprising, for the i-th flow model $P_i$, a flow instance subset; $I_j^i$ denotes the j-th flow instance generated by the i-th flow model $P_i$, and $tr_\lambda = (j, a', \lambda)$ denotes a node of flow instance $I_j^i$, where j is the unique identifier of the flow instance and a' is the activity type of the node;
S32: in the flow instance subset, finding the flow instances that match the flow paths of cluster $S_x$, counting the number of matches for each flow instance, and calculating the fitness of each match;
S33: setting a frequency range (0, μ) for low-frequency flow instances and, for each flow instance in the flow instance subset whose number of matches is greater than μ, calculating the mean fitness to obtain the value of the flow instance;
S34: setting a value threshold θ, taking each flow instance whose value is greater than θ as a high-value flow instance, and obtaining the high-value flow instance set $Q_x$ of cluster $S_x$;
S35: repeating steps S32 to S34 to obtain the high-value flow instance sets $Q_1, \ldots, Q_n$ of all clusters.
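The filtering in S33 and S34 can be sketched as below, assuming the fitness values of each flow instance have already been collected per match. The dictionary layout and the names `high_value_instances`, `mu`, and `theta` are illustrative assumptions.

```python
from statistics import mean

def high_value_instances(fitness_by_instance, mu, theta):
    """Select high-value flow instances for one cluster.

    fitness_by_instance maps an instance id to the list of fitness values
    recorded for its matches (one value per matched flow path).
    """
    selected = {}
    for instance_id, fits in fitness_by_instance.items():
        if len(fits) <= mu:      # S33: ignore low-frequency flow instances
            continue
        score = mean(fits)       # S33: value = mean fitness over all matches
        if score > theta:        # S34: keep instances above the value threshold
            selected[instance_id] = score
    return selected

matches = {"I1": [1.0, 0.75, 0.5], "I2": [1.0], "I3": [0.5, 0.25, 0.0]}
q = high_value_instances(matches, mu=2, theta=0.5)
```

`I2` is matched only once and is discarded as low-frequency, `I3` has a mean fitness of 0.25 and falls below the threshold, so only `I1` remains.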
6. The RPA high-value flow instance discovery method based on fitness analysis according to claim 5, wherein step S32 specifically comprises:
S321: setting a skip cost and an insertion cost for nodes of activity type a'; setting sets $A_{skip}$ and $A_{insert}$ to record the nodes skipped and inserted during each traversal, $A_{skip}$ and $A_{insert}$ being initially empty;
S322: obtaining $A_{skip}$ and $A_{insert}$ during matching: selecting a flow path $\sigma_u^x = \langle e_1, \ldots, e_t \rangle$ in cluster $S_x$ and traversing the events of $\sigma_u^x$ in chronological order; defining a sequence of the flow instance nodes matched by the traversal up to time m, where t is the time, m is the current time step, and temp is the length of the sequence;
S323: when t = m+1, converting event $e_{m+1} = (k, a, m+1)$ of $\sigma_u^x$ into a node of a flow instance by means of an operator that converts an input event into a flow instance node, and inserting the node into the sequence to generate a new sequence;
determining whether the generated sequence exists in flow model $P_x$; if it does, updating the sequence, letting m = m+1, and returning to step S323;
otherwise, considering the sequence invalid and proceeding to step S324;
S324: considering inserting several nodes into the sequence or skipping the latest node until a sequence that is valid for flow model $P_x$ is obtained;
if both inserting nodes and skipping the node can make the sequence valid, calculating a first cost and a second cost, selecting the operation with the lower cost, and updating set $A_{insert}$ or $A_{skip}$ accordingly; letting m = m+1 and returning to step S323; wherein the inserted nodes form a set, π is the number of a node, and the update is a merging operation on the sets;
S325: repeating steps S323 and S324 until all events in flow path $\sigma_u^x$ have been traversed, at which point the resulting sequence is valid for flow model $P_x$, that is, there exists a flow instance $I_j^x$ that matches the sequence;
S326: based on $A_{skip}$ and $A_{insert}$, calculating the fitness of the matched flow instance $I_j^x$ with the fitness calculation formula, which is defined over the following quantities:
$z_j$ is the number of times flow instance $I_j^x$ has been matched by flow paths in cluster $S_x$, str.a' is the activity type of a node in set $A_{skip}$, itr.a' is the activity type of a node in set $A_{insert}$, and tr.a' is the activity type of a node in flow instance $I_j^x$;
updating $z_j = z_j + 1$ and resetting sets $A_{skip}$ and $A_{insert}$ to empty;
S327: traversing cluster $S_x$ and repeating steps S322 to S326 to calculate the number of matches between each flow instance in $P_x$ and the flow paths of cluster $S_x$, and the fitness of each matched flow instance.
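The matching procedure of S321 to S326 can be illustrated with a simplified, greedy sketch. Several things here are assumptions rather than the patented method: the flow model is represented as a successor map with a hypothetical `START` sentinel, inserted nodes are found by breadth-first search, and, because the fitness formula is not available in the text, fitness is normalised as one minus the incurred cost divided by a worst-case cost (skipping every event and inserting every node).

```python
from collections import deque

def bridge(model, src, dst):
    """Shortest list of intermediate nodes leading from src to dst, or None."""
    queue, seen = deque([(src, [])]), {src}
    while queue:
        node, path = queue.popleft()
        for nxt in sorted(model.get(node, ())):
            if nxt == dst:
                return path
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [nxt]))
    return None

def align(trace, model, skip_cost, insert_cost):
    """Greedy S322-S325; returns (sequence, A_skip, A_insert)."""
    seq, a_skip, a_insert = ["START"], [], []
    for activity in trace:
        if activity in model.get(seq[-1], ()):        # sequence is still valid
            seq.append(activity)
            continue
        inserted = bridge(model, seq[-1], activity)    # option 1: insert nodes
        cost_insert = (sum(insert_cost[a] for a in inserted)
                       if inserted is not None else float("inf"))
        cost_skip = skip_cost[activity]                # option 2: skip the node
        if cost_insert < cost_skip:
            a_insert += inserted
            seq += inserted + [activity]
        else:
            a_skip.append(activity)
    return seq[1:], a_skip, a_insert

def fitness(trace, seq, a_skip, a_insert, skip_cost, insert_cost):
    """Assumed normalisation: 1 - incurred cost / worst-case cost."""
    incurred = (sum(skip_cost[a] for a in a_skip)
                + sum(insert_cost[a] for a in a_insert))
    worst = (sum(skip_cost[a] for a in trace)
             + sum(insert_cost[a] for a in seq))
    return 1.0 - incurred / worst if worst else 1.0

activities = ["open", "copy", "paste", "save", "scroll"]
skip_cost = {a: 2 for a in activities}
insert_cost = {a: 1 for a in activities}
model = {"START": {"open"}, "open": {"copy"}, "copy": {"paste"}, "paste": {"save"}}
trace = ["open", "paste", "scroll", "save"]

seq, a_skip, a_insert = align(trace, model, skip_cost, insert_cost)
fit = fitness(trace, seq, a_skip, a_insert, skip_cost, insert_cost)
```

In this example the missing `copy` is inserted (cost 1 against a skip cost of 2), while `scroll` cannot be reached in the model and is skipped. A greedy walk does not guarantee a globally cheapest alignment; it only mirrors the step-by-step choice described in S324.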
7. The RPA high-value process instance discovery method based on fitness analysis according to claim 6, wherein a calculation formula of a value of a process instance is:
wherein $\mathrm{Score}(I_j^x)$ is the value of flow instance $I_j^x$; $I_j^x$ is the j-th flow instance in the flow instance subset; $z_j$ is the number of times flow instance $I_j^x$ is matched by flow paths in cluster $S_x$; and each fitness value of flow instance $I_j^x$ is an element of $V_j^x$.
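The formula itself is not present in this text. One reading that is consistent with step S33 (the value is the mean fitness of a flow instance whose number of matches exceeds μ) is sketched below. Treating $V_j^x$ as the collection of the $z_j$ fitness values recorded for $I_j^x$, and writing its elements as $v$, are assumptions.

```latex
\mathrm{Score}(I_j^x) \;=\; \frac{1}{z_j} \sum_{v \in V_j^x} v ,
\qquad z_j > \mu
```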
8. The RPA high-value flow instance discovery method based on fitness analysis according to claim 1, wherein step S4 specifically comprises:
S41: for every high-value flow instance in the high-value flow instance set, if parallel flow instance branches exist, merging all nodes on the parallel branches into a new node; if a node's functional complexity exceeds that of a single RPA flow step, splitting the node into a plurality of child nodes, thereby obtaining a processed high-value flow instance set;
S42: converting the processed high-value flow instance set into an RPA executable script by means of an RPA instruction converter.
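A toy sketch of S41 and S42 follows. The data layout is entirely hypothetical: a flow instance is a list whose items are either a node (a dict with an `activity` name and a list of `actions`) or a list of parallel branches, "functional complexity" is approximated by the number of actions, and the "script" is a list of text lines. Marking merged nodes as atomic so that they are not split again is also an assumption.

```python
def merge_parallel(branches):
    """Merge all nodes on parallel branches into one new node."""
    nodes = [n for branch in branches for n in branch]
    return {
        "activity": "+".join(n["activity"] for n in nodes),
        "actions": [a for n in nodes for a in n["actions"]],
        "atomic": True,  # assumption: a merged node is not split again
    }

def split_node(node):
    """Split a node carrying several actions into one child node per action."""
    if node.get("atomic") or len(node["actions"]) <= 1:
        return [node]
    return [
        {"activity": f"{node['activity']}.{i}", "actions": [action]}
        for i, action in enumerate(node["actions"], start=1)
    ]

def to_script(instance):
    """S41 then S42: normalise the instance and emit one line per node."""
    nodes = []
    for item in instance:
        node = merge_parallel(item) if isinstance(item, list) else item
        nodes.extend(split_node(node))
    lines = []
    for n in nodes:
        joined = "; ".join(n["actions"])
        lines.append(f"{n['activity']}: {joined}")
    return lines

instance = [
    {"activity": "login", "actions": ["type user", "type password", "click submit"]},
    [
        [{"activity": "fetch_a", "actions": ["download a"]}],
        [{"activity": "fetch_b", "actions": ["download b"]}],
    ],
    {"activity": "export", "actions": ["click export"]},
]
script = to_script(instance)
```

A real RPA instruction converter would map each node to vendor-specific commands; the text lines here only show the shape of the transformation.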
9. An RPA high value flow instance discovery system based on fitness analysis, comprising:
a preprocessing module, configured to acquire an interaction log L and preprocess the interaction log L to obtain a new interaction log L;
a flow model set acquisition module, configured to cluster the new interaction log L to obtain a clustering result set, and to obtain a flow model set from the clustering result set;
a high-value flow instance set acquisition module, configured to simulate the running state of each flow model in the flow model set with a flow executor to obtain a flow instance set, and to calculate a high-value flow instance set from the flow instance set;
and an executable script generation module, configured to convert the high-value flow instance set into an RPA executable script.
CN202311714610.3A 2023-12-14 2023-12-14 RPA high-value flow instance discovery method and system based on fitness analysis Active CN117406972B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311714610.3A CN117406972B (en) 2023-12-14 2023-12-14 RPA high-value flow instance discovery method and system based on fitness analysis

Publications (2)

Publication Number Publication Date
CN117406972A true CN117406972A (en) 2024-01-16
CN117406972B CN117406972B (en) 2024-02-13

Family

ID=89492855

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311714610.3A Active CN117406972B (en) 2023-12-14 2023-12-14 RPA high-value flow instance discovery method and system based on fitness analysis

Country Status (1)

Country Link
CN (1) CN117406972B (en)

Also Published As

Publication number Publication date
CN117406972B (en) 2024-02-13

Similar Documents

Publication Publication Date Title
van der Aalst et al. Process equivalence: Comparing two process models based on observed behavior
CN109861844B (en) Cloud service problem fine-grained intelligent tracing method based on logs
Ferreira et al. Approaching process mining with sequence clustering: Experiments and findings
CN113779272A (en) Data processing method, device and equipment based on knowledge graph and storage medium
CN110427298B (en) An automatic feature extraction method for distributed logs
Naderifar et al. A review on conformance checking technique for the evaluation of process mining algorithms
CN112287603A (en) A machine learning-based predictive model construction method, device and electronic device
CN107909344A (en) Workflow logs iterative task recognition methods based on relational matrix
CN119006165B (en) Data asset assessment method and system based on big data
CN112765031A (en) Decomposition method of crowd-sourcing vulnerability mining task
CN114638234B (en) Big data mining method and system applied to online business handling
CN118690292A (en) A method and system for detecting abnormal edges in dynamic graph data based on graph neural network
CN117406972B (en) RPA high-value flow instance discovery method and system based on fitness analysis
CN116628351A (en) Flowchart branch recommendation method, device and storage device based on node dependency
CN114418120B (en) Data processing method, device, equipment and storage medium of federated tree model
Krismayer et al. A constraint mining approach to support monitoring cyber-physical systems
CN114841504A (en) Business process optimization method, device, terminal device and storage medium
CN112949778A (en) Intelligent contract classification method and system based on locality sensitive hashing and electronic equipment
Fang et al. Online incremental mining based on trusted behavior interval
CN119719059A (en) Educational resource sharing system and method based on cloud platform
CN112434831A (en) Troubleshooting method and device, storage medium and computer equipment
JP2016520220A (en) Hidden attribute model estimation device, method and program
Huang et al. Act-sagan: Automatic configuration tuning for kafka with self-attention generative adversarial networks
CN115617652B (en) Test case processing method and device, computing equipment and computer storage medium
CN116089608B (en) A multi-task user interaction log segmentation method based on graph embedding

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20240116

Assignee: HUBEI THINGO TECHNOLOGY DEVELOPMENT Co.,Ltd.

Assignor: Anhui Sigao Intelligent Technology Co.,Ltd.

Contract record no.: X2025980022502

Denomination of invention: A method and system for discovering high-value RPA process instances based on fit analysis

Granted publication date: 20240213

License type: Exclusive License

Record date: 20250917
