
CN110503208A - Resource scheduling method and resource scheduling device in multi-model exploration - Google Patents

Resource scheduling method and resource scheduling device in multi-model exploration

Info

Publication number
CN110503208A
CN110503208A (application CN201910791358.3A); granted publication CN110503208B
Authority
CN
China
Prior art keywords
machine learning
learning model
resource
model
hyperparameter
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910791358.3A
Other languages
Chinese (zh)
Other versions
CN110503208B (en)
Inventor
赵庆
李瀚�
桂权力
郝玥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
4Paradigm Beijing Technology Co Ltd
Original Assignee
4Paradigm Beijing Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 4Paradigm Beijing Technology Co Ltd filed Critical 4Paradigm Beijing Technology Co Ltd
Priority to CN201910791358.3A priority Critical patent/CN110503208B/en
Publication of CN110503208A publication Critical patent/CN110503208A/en
Application granted granted Critical
Publication of CN110503208B publication Critical patent/CN110503208B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46: Multiprogramming arrangements
    • G06F9/50: Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005: Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027: Allocation of resources to service a request, the resource being a machine, e.g. CPUs, servers, terminals
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning
    • G06N20/20: Ensemble learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Debugging And Monitoring (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A resource scheduling method and a resource scheduling device in multi-model exploration are provided. The resource scheduling method in multi-model exploration includes: performing one round of hyperparameter exploration training on multiple machine learning models based on the same target data set, wherein each machine learning model explores at least M groups of hyperparameters in this round, M being a positive integer greater than 1; calculating a current-round performance score and a future potential score of each machine learning model based on the model evaluation indices corresponding to the groups of hyperparameters that the multiple machine learning models respectively explored in this round; determining, by combining the current-round performance score and the future potential score of each machine learning model, a resource allocation scheme that allocates the available resources to the machine learning models; and performing the corresponding resource scheduling according to the resource allocation scheme in the next round of hyperparameter exploration training.

Description

Resource scheduling method and resource scheduling device in multi-model exploration
Technical field
The present invention relates generally to the field of artificial intelligence and, more particularly, to a resource scheduling method and a resource scheduling device in multi-model exploration.
Background technique
With the emergence of massive data, artificial intelligence technology has developed rapidly. Machine learning is the inevitable product of the development of artificial intelligence to a certain stage: it is devoted to mining valuable latent information from massive data by computational means.
At present, AutoML (automated machine learning) is a very popular direction in machine learning, devoted to automatically determining optimal parameters and network structures for a given problem. Implementations of AutoML are numerous and can be roughly divided into two types. One type explores the best model with a single model (that is, a single machine learning model); a representative example is Google's Efficient Neural Architecture Search (ENAS). The other type is multi-model exploration, that is, finding the best model among multiple models. Here, the multiple models are models whose structures are fixed: everything about each model is determined except its hyperparameters. Multi-model exploration means finally finding the best model by continuously performing hyperparameter tuning on the multiple models.
However, in current exploration methods each model usually performs hyperparameter tuning independently, and resources are not planned and scheduled across models. Moreover, every model keeps running regardless of how it performs, which undoubtedly wastes a great deal of computation and time. An effective management mechanism for resources and exploration efficiency is lacking.
Summary of the invention
An object of the present invention is to provide a resource scheduling method and a resource scheduling device in multi-model exploration.
One aspect of the present invention provides a resource scheduling method in multi-model exploration. The resource scheduling method includes: performing one round of hyperparameter exploration training on multiple machine learning models based on the same target data set, wherein each machine learning model explores at least M groups of hyperparameters in this round, M being a positive integer greater than 1; calculating a current-round performance score and a future potential score of each machine learning model based on the model evaluation indices corresponding to the groups of hyperparameters that the multiple machine learning models respectively explored in this round; determining, by combining the current-round performance score and the future potential score of each machine learning model, a resource allocation scheme that allocates the available resources to the machine learning models; and performing the corresponding resource scheduling according to the resource allocation scheme in the next round of hyperparameter exploration training.
Optionally, calculating the current-round performance score of each machine learning model includes: determining the K best model evaluation indices among the model evaluation indices corresponding to the groups of hyperparameters that the multiple machine learning models respectively explored in this round, wherein K is a positive integer; and, for each machine learning model, taking the proportion of the K best model evaluation indices that the machine learning model accounts for as the current-round performance score of that machine learning model.
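The top-K ratio just described can be sketched as follows. This is an illustrative reading of the claim, not code from the patent: the function name `round_performance_scores`, the dictionary layout, and the assumption that a higher evaluation index is better are all ours.

```python
from collections import Counter

def round_performance_scores(results, K):
    """results: model name -> evaluation indices explored this round
    (higher assumed better). Returns each model's share of the K best
    indices across all models, i.e. its current-round performance score."""
    pool = [(value, name) for name, values in results.items() for value in values]
    top_k = sorted(pool, reverse=True)[:K]           # K best indices overall
    counts = Counter(name for _, name in top_k)      # how many each model contributed
    return {name: counts.get(name, 0) / K for name in results}
```

A model that contributed none of the K best results this round scores zero, so under this reading the score directly rewards models that currently dominate the leaderboard.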
Optionally, calculating the future potential score of each machine learning model includes: storing, in order, the model evaluation indices corresponding to the groups of hyperparameters that each machine learning model explored in this round into an array, obtaining multiple arrays respectively corresponding to the multiple machine learning models; and, for each machine learning model, extracting a monotonically increasing array from the array corresponding to that machine learning model and taking the ratio of the length of the monotonically increasing array to the length of the array corresponding to that machine learning model as the future potential score of that machine learning model.
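A minimal sketch of this monotonically increasing extraction (keep the first index, then append any index that beats the current maximum) might look as follows; the function name and the assumption that "better" means numerically larger are illustrative, not from the patent text.

```python
def future_potential_score(indices):
    """Length of the running-best (monotonically increasing) subsequence
    divided by the length of the full array of evaluation indices."""
    increasing = []
    for value in indices:
        if not increasing or value > increasing[-1]:
            increasing.append(value)
    return len(increasing) / len(indices)
```

A model whose evaluation indices keep improving yields a ratio near 1, signaling potential; a model that peaked early yields a small ratio.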
Optionally, the multiple machine learning models include at least two of a logistic regression machine learning model with a hyperparameter selection mechanism, a naive Bayes machine learning model with a hyperparameter selection mechanism, an ensemble learning model with a hyperparameter selection mechanism, and a regression-related machine learning model with a hyperparameter selection mechanism.
Optionally, the resource includes at least one of central processing units, storage space, and threads.
Optionally, the step of performing one round of hyperparameter exploration training on multiple machine learning models based on the same target data set further includes: judging whether at least one of the multiple machine learning models satisfies an early-stopping condition, wherein, when at least one machine learning model is determined to satisfy the early-stopping condition, the training of the at least one machine learning model is stopped, and the steps of calculating the current-round performance score and the future potential score are not executed for the at least one machine learning model.
Optionally, the early-stopping condition includes: when the model evaluation indices corresponding to the hyperparameters explored by a machine learning model in this round fail to set a new best I consecutive times, that machine learning model satisfies the early-stopping condition; and/or, when J of the model evaluation indices corresponding to the hyperparameters explored by one machine learning model in this round are higher than the best evaluation index of another machine learning model in this round, the other machine learning model satisfies the early-stopping condition.
Optionally, the array corresponding to a machine learning model sequentially includes a first model evaluation index to an X-th model evaluation index, wherein X is an integer greater than or equal to M. The step of extracting the monotonically increasing array from the array corresponding to the machine learning model includes: taking the first model evaluation index as the first value of the monotonically increasing array; and, for any model evaluation index from the second model evaluation index to the X-th model evaluation index, if that model evaluation index is better than the current maximum value in the monotonically increasing array, taking that model evaluation index as a new value of the monotonically increasing array.
Optionally, the step of determining the resource allocation scheme includes: calculating a composite score of each machine learning model based on its current-round performance score and future potential score; calculating, as the resource allocation coefficient of each machine learning model, the ratio of the composite score of that machine learning model to the sum of all the composite scores; and determining the resource allocation scheme as follows: the resource corresponding to the product of the resource allocation coefficient of each machine learning model and the total resources to be allocated is determined as the resource to be allocated to that machine learning model.
Optionally, the step of determining, as the resource to be allocated to each machine learning model, the resource corresponding to the product of its resource allocation coefficient and the total resources to be allocated includes: for all machine learning models other than the machine learning model with the highest resource allocation coefficient, starting from the machine learning model with the lowest resource allocation coefficient, rounding the product of the resource allocation coefficient of the machine learning model and the total resources to be allocated down, and determining the rounded-down value as the quantity of resources to be allocated to that machine learning model; and determining the resources in the total resources to be allocated that have not yet been assigned to any machine learning model as the resource to be allocated to the machine learning model with the highest resource allocation coefficient.
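The floor-and-remainder scheme of the two preceding paragraphs can be sketched as below. The function name and inputs are assumptions for illustration; since flooring each product is order-independent, the sketch floors all non-top models at once rather than iterating from the lowest coefficient upward.

```python
import math

def allocate_resources(composite_scores, total):
    """Every model except the one with the highest allocation coefficient
    gets floor(coefficient * total); the leftover goes to the highest-
    coefficient model, so the whole budget is always handed out."""
    score_sum = sum(composite_scores.values())
    coef = {m: s / score_sum for m, s in composite_scores.items()}
    top = max(coef, key=coef.get)                        # highest coefficient
    alloc = {m: math.floor(coef[m] * total) for m in coef if m != top}
    alloc[top] = total - sum(alloc.values())             # remainder of the budget
    return alloc
```

Giving the remainder to the top-coefficient model guarantees the allocated quantities sum exactly to the total, which plain proportional rounding would not.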
Optionally, the step of determining, as the resource to be allocated to each machine learning model, the resource corresponding to the product of its resource allocation coefficient and the total resources to be allocated further includes: when, among the quantities of resources allocated to the machine learning models, there are both values of zero and values greater than one, sorting the machine learning models whose allocated quantity is greater than one in increasing order of that quantity; then, among the machine learning models sorted in increasing order, starting from the machine learning model with the fewest resources, reducing that model's resources one unit at a time down to at most one, and allocating each released unit to a machine learning model whose allocated quantity is zero, so that the quantity allocated to that model becomes one. Only when the current machine learning model's resources have been reduced to one does the reduction move on to the next machine learning model, and this continues until every machine learning model whose quantity was zero has been allocated one resource, or until the current machine learning model is the last machine learning model in the increasing order.
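One possible reading of this compensation mechanism in code, under stated assumptions (donors are models holding more than one unit, smallest donor first, each donor drained one unit at a time down to one before the next donor is touched):

```python
def compensate(alloc):
    """Give every zero-allocation model one resource unit, taken one at a
    time from models holding more than one unit, smallest such donor
    first; a donor is drained down to one before the next is touched."""
    starved = [m for m, n in alloc.items() if n == 0]
    donors = sorted((m for m, n in alloc.items() if n > 1), key=alloc.get)
    for donor in donors:
        while alloc[donor] > 1 and starved:
            alloc[donor] -= 1                 # release one unit
            alloc[starved.pop()] = 1          # hand it to a starved model
        if not starved:
            break
    return alloc
```

Because units are only moved, never created, the total allocation is preserved while every model is kept alive with at least one unit (budget permitting).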
Optionally, the resource scheduling method further includes: in response to a stop request from a user, the total training time reaching a predetermined total training time, or the total number of training rounds reaching a predetermined total number of training rounds, stopping allocating resources to the machine learning models.
Optionally, the step of performing one round of hyperparameter exploration training on multiple machine learning models based on the same target data set includes: allocating the same quantity of resources to each of the multiple machine learning models, and using those equal resources to perform one round of hyperparameter exploration training on the multiple machine learning models based on the same target data set.
One aspect of the present invention provides a resource scheduling device for multi-model exploration. The resource scheduling device includes: a hyperparameter exploration training unit configured to perform one round of hyperparameter exploration training on multiple machine learning models based on the same target data set, wherein each machine learning model explores at least M groups of hyperparameters in this round, M being a positive integer greater than 1; a score calculation unit configured to calculate the current-round performance score and the future potential score of each machine learning model based on the model evaluation indices corresponding to the groups of hyperparameters that the multiple machine learning models respectively explored in this round; a resource allocation scheme determination unit configured to determine, by combining the current-round performance score and the future potential score of each machine learning model, a resource allocation scheme that allocates the available resources to the machine learning models; and a resource scheduling unit configured to perform the corresponding resource scheduling according to the resource allocation scheme in the next round of hyperparameter exploration training.
Optionally, the score calculation unit is configured to: determine the K best model evaluation indices among the model evaluation indices corresponding to the groups of hyperparameters that the multiple machine learning models respectively explored in this round, wherein K is a positive integer; and, for each machine learning model, take the proportion of the K best model evaluation indices that the machine learning model accounts for as the current-round performance score of that machine learning model.
Optionally, the score calculation unit is configured to: store, in order, the model evaluation indices corresponding to the groups of hyperparameters that each machine learning model explored in this round into an array, obtaining multiple arrays respectively corresponding to the multiple machine learning models; and, for each machine learning model, extract a monotonically increasing array from the array corresponding to that machine learning model and take the ratio of the length of the monotonically increasing array to the length of the corresponding array as the future potential score of that machine learning model.
Optionally, the multiple machine learning models include at least two of a logistic regression machine learning model with a hyperparameter selection mechanism, a naive Bayes machine learning model with a hyperparameter selection mechanism, an ensemble learning model with a hyperparameter selection mechanism, and a regression-related machine learning model with a hyperparameter selection mechanism.
Optionally, the resource includes at least one of central processing units, storage space, and threads.
Optionally, the hyperparameter exploration training unit is further configured to judge whether at least one of the multiple machine learning models satisfies an early-stopping condition, wherein, when at least one machine learning model is determined to satisfy the early-stopping condition, the hyperparameter exploration training unit stops the training of the at least one machine learning model, and the steps of calculating the current-round performance score and the future potential score are not executed for the at least one machine learning model.
Optionally, the early-stopping condition includes: when the model evaluation indices corresponding to the hyperparameters explored by a machine learning model in this round fail to set a new best I consecutive times, that machine learning model satisfies the early-stopping condition; and/or, when J of the model evaluation indices corresponding to the hyperparameters explored by one machine learning model in this round are higher than the best evaluation index of another machine learning model in this round, the other machine learning model satisfies the early-stopping condition.
Optionally, the array corresponding to a machine learning model sequentially includes a first model evaluation index to an X-th model evaluation index, wherein X is an integer greater than or equal to M, and the score calculation unit is configured to: take the first model evaluation index as the first value of the monotonically increasing array; and, for any model evaluation index from the second model evaluation index to the X-th model evaluation index, if that model evaluation index is better than the current maximum value in the monotonically increasing array, take that model evaluation index as a new value of the monotonically increasing array.
Optionally, the resource allocation scheme determination unit is configured to: calculate a composite score of each machine learning model based on its current-round performance score and future potential score; calculate, as the resource allocation coefficient of each machine learning model, the ratio of its composite score to the sum of all the composite scores; and determine the resource allocation scheme as follows: the resource corresponding to the product of the resource allocation coefficient of each machine learning model and the total resources to be allocated is determined as the resource to be allocated to that machine learning model.
Optionally, the resource allocation scheme determination unit is configured to: for all machine learning models other than the machine learning model with the highest resource allocation coefficient, starting from the machine learning model with the lowest resource allocation coefficient, round the product of the resource allocation coefficient of the machine learning model and the total resources to be allocated down, and determine the rounded-down value as the quantity of resources to be allocated to that machine learning model; and determine the resources in the total resources to be allocated that have not yet been assigned to any machine learning model as the resource to be allocated to the machine learning model with the highest resource allocation coefficient.
Optionally, the resource allocation scheme determination unit is further configured to: when, among the quantities of resources allocated to the machine learning models, there are both values of zero and values greater than one, sort the machine learning models whose allocated quantity is greater than one in increasing order of that quantity; then, starting from the machine learning model with the fewest resources among them, reduce that model's resources by one unit and allocate the released unit to a machine learning model whose allocated quantity is zero, returning to the sorting step until the resources of all machine learning models are nonzero.
Optionally, the resource scheduling unit is further configured to stop allocating resources to the machine learning models in response to a stop request from a user, the total training time reaching a predetermined total training time, or the total number of training rounds reaching a predetermined total number of training rounds.
Optionally, the hyperparameter exploration training unit is configured to allocate the same quantity of resources to each of the multiple machine learning models, and to use those equal resources to perform one round of hyperparameter exploration training on the multiple machine learning models based on the same target data set.
One aspect of the present invention provides a computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by one or more computing devices, causes the one or more computing devices to implement any of the resource scheduling methods described above.
One aspect of the present invention provides a resource scheduling system including one or more computing devices and one or more storage devices, wherein a computer program is recorded on the one or more storage devices, and the computer program, when executed by the one or more computing devices, causes the one or more computing devices to implement any of the resource scheduling methods described above.
The technical scheme of the present invention, which performs resource scheduling using the current-round performance score and the future potential score of each machine learning model, on the one hand makes full use of the exploration results to evaluate the current performance of each machine learning model, thereby allocating resources effectively and improving resource utilization and exploration efficiency; on the other hand, it also uses the exploration results to evaluate the future performance of each machine learning model, thereby allocating resources more reasonably and further improving resource utilization and exploration efficiency.
Other aspects and/or advantages of the present general inventive concept will be partly set forth in the following description; part will be apparent from the description, or may be learned through practice of the present general inventive concept.
Detailed description of the invention
The above and other objects and features of the present invention will become apparent from the following description made with reference to the accompanying drawings, which exemplarily illustrate examples, in which:
Fig. 1 shows a flowchart of a resource scheduling method in multi-model exploration according to the present invention;
Fig. 2 shows a flowchart of calculating the current-round performance score of each machine learning model according to an embodiment of the present invention;
Fig. 3 shows a flowchart of calculating the future potential score of each machine learning model according to an embodiment of the present invention;
Fig. 4 shows a flowchart of determining the resource allocation scheme according to an embodiment of the present invention;
Fig. 5 shows a flowchart of the compensation allocation mechanism according to an embodiment of the present invention;
Fig. 6 shows a resource scheduling device for multi-model exploration according to an embodiment of the present invention.
Specific embodiment
The following description is provided with reference to the accompanying drawings to help comprehensive understanding of the exemplary embodiments of the invention defined by the claims and their equivalents. The description includes various specific details to aid understanding, but these details are to be regarded as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications may be made to the embodiments described herein without departing from the scope and spirit of the present invention. In addition, descriptions of well-known functions and constructions may be omitted for clarity and conciseness.
Fig. 1 shows a flowchart of a resource scheduling method in multi-model exploration according to the present invention.
Referring to Fig. 1, in step S110, one round of hyperparameter exploration training is performed on multiple machine learning models based on the same target data set, wherein each machine learning model explores at least M groups of hyperparameters in this round, M being a positive integer greater than 1.
Here, the multiple machine learning models may include at least two of a logistic regression machine learning model with a hyperparameter selection mechanism, a naive Bayes machine learning model with a hyperparameter selection mechanism, an ensemble learning model with a hyperparameter selection mechanism, and a regression-related machine learning model with a hyperparameter selection mechanism. However, the invention is not limited thereto: the multiple machine learning models of the invention may include any other machine learning model.
In the present invention, during one round of hyperparameter exploration, each machine learning model among the multiple machine learning models explores multiple groups of hyperparameters. For example, a machine learning model A among the multiple machine learning models explores M groups of hyperparameters; that is, machine learning model A is trained with the first group of hyperparameters, with the second group of hyperparameters, ..., and with the M-th group of hyperparameters. For example, when the machine learning model is a logistic regression machine learning model and M is 2, then in one round of exploration, a logistic regression machine learning model with the first group of hyperparameters and a logistic regression machine learning model with the second group of hyperparameters are trained based on the same target data set.
Furthermore, it should be understood that, in one round of exploration of the multiple machine learning models, the number of groups of hyperparameters each machine learning model explores may or may not be the same, because the time each machine learning model takes to complete training with one group of hyperparameters differs.
Optionally, step S110 may further include: judging whether at least one of the multiple machine learning models satisfies an early-stopping condition, wherein, when at least one machine learning model is determined to satisfy the early-stopping condition, the training of the at least one machine learning model is stopped, and the subsequent steps of calculating the current-round performance score and the future potential score are not executed for the at least one machine learning model. According to an embodiment of the present invention, because both the training of a machine learning model that satisfies the early-stopping condition and the calculation of its current-round performance score and future potential score are stopped, resources are saved and training efficiency is improved.
As an example, the early-stopping condition may include: when the model evaluation indices corresponding to the hyperparameters explored by a machine learning model in this round fail to set a new best I consecutive times, that machine learning model satisfies the early-stopping condition; and/or, when J of the model evaluation indices corresponding to the hyperparameters explored by one machine learning model in this round are higher than the best evaluation index of another machine learning model in this round, the other machine learning model satisfies the early-stopping condition. Here, I and J may be predetermined values. Note that, in one example, the value of I may differ from machine learning model to machine learning model.
Here, a model evaluation index may indicate the training effect of the machine learning model with one group of hyperparameters. In one example, the training effect may indicate validation-set accuracy. In another example, the training effect may indicate mean squared error. However, the above examples are merely exemplary, and the model evaluation index of the invention may include any parameter and/or value indicating the training effect of a machine learning model.
More specifically, for example, when the training effect indicates validation-set accuracy, the model evaluation index failing to set a new best I consecutive times means that the models corresponding to the I consecutively explored groups of hyperparameters did not reach a new high in validation-set accuracy; when the training effect indicates mean squared error, it means that the mean squared errors of the models corresponding to the I consecutively explored groups of hyperparameters did not reach a new low.
Excellent understanding is not innovated in order to enhance I times continuous to model-evaluation index, and citing is illustrated below, but the present invention It is without being limited thereto, for example, in some instances, do not innovate excellent can refer to whether representation model evaluation index is greater than predetermined threshold.Assuming that A table Show the array of the corresponding model-evaluation index of multiple hyper parameters continuously explored including machine learning model, I 5, W (x, x+4) indicate A in xth to (x+4) 5 elements maximum value, model-evaluation index instruction verifying collection accuracy rate, len (A) whether the length for indicating array A, then can not be innovated by following pseudocode come judgment models evaluation index continuous I times excellent (i.e., if meet early stop condition):
x = 1
While x + 4 < len(A):
    If W(x, x+4) < W(0, x-1):
        trigger early stop
    x = x + 1
As an example (for ease of description, referred to as the first example), the multiple machine learning models include a logistic regression machine learning model lr, a gradient boosting decision tree machine learning model gbdt, and a deep sparse network machine learning model dsn. In this illustrative first example, one round of hyper parameter exploration training is performed on each of the logistic regression machine learning model lr, the gradient boosting decision tree machine learning model gbdt and the deep sparse network machine learning model dsn based on the same target data set. In this first example, lr, gbdt and dsn each explore at least 5 groups of hyper parameters. It should be understood that the first example is merely illustrative; more specifically, any specific numerical value shown in the first example is exemplary, and the numerical values of the present invention are not limited to any specific numerical value shown in the first example, that is, according to an embodiment of the present invention, any other numerical value is also feasible.
In the illustrative first example, the model-evaluation indices corresponding to the multiple groups of hyper parameters explored in the present round by the logistic regression machine learning model lr, the gradient boosting decision tree machine learning model gbdt and the deep sparse network machine learning model dsn, respectively, may be expressed as follows:
lr:[0.2,0.4,0.5,0.3,0.6,0.1,0.7,0.3]
gbdt:[0.5,0.2,0.1,0.4,0.2,0.6]
dsn:[0.61,0.67,0.63,0.72,0.8]
Here, a single value in a single array may indicate the training effect of the machine learning model under one group of hyper parameters. For example, a single value in the arrays here (for example, 0.2) may indicate validation-set accuracy. In addition, in the first example, the logistic regression machine learning model lr has been trained on eight groups of hyper parameters, the gradient boosting decision tree machine learning model gbdt has been trained on six groups of hyper parameters, and the deep sparse network machine learning model dsn has been trained on five groups of hyper parameters.
In conjunction with the above pseudocode, for the array [0.2, 0.4, 0.5, 0.3, 0.6, 0.1, 0.7, 0.3] of model-evaluation indices of the logistic regression machine learning model lr, when x in the pseudocode is 2, W(2,6) = 0.7 and W(0,1) = 0.4, so W(0,1) < W(2,6) and, in this case, the early stop condition is not triggered. Similarly, the early stop condition is not triggered for any other value of x. Therefore, in the first example, the logistic regression machine learning model lr does not trigger the early stop condition. By similar calculation, the gradient boosting decision tree machine learning model gbdt and the deep sparse network machine learning model dsn also do not meet this early stop condition.
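The check described by the pseudocode above can be sketched as runnable Python (a minimal illustration only; the function name meets_early_stop is chosen here and does not appear in the original):

```python
def meets_early_stop(A, I=5):
    """True if some window of I consecutive model-evaluation indices
    never exceeds the best index seen before that window."""
    x = 1
    while x + I - 1 < len(A):
        if max(A[x:x + I]) < max(A[:x]):  # W(x, x+I-1) < W(0, x-1)
            return True
        x += 1
    return False

lr   = [0.2, 0.4, 0.5, 0.3, 0.6, 0.1, 0.7, 0.3]
gbdt = [0.5, 0.2, 0.1, 0.4, 0.2, 0.6]
dsn  = [0.61, 0.67, 0.63, 0.72, 0.8]

# None of the three arrays in the first example triggers early stop.
print([meets_early_stop(a) for a in (lr, gbdt, dsn)])  # [False, False, False]
```

Note that an array shorter than I + 1 elements can never trigger this condition, which is why dsn, with only five values, is trivially safe here.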
Optionally, in addition, as described above, the early stop condition may include: when the J-th best model-evaluation index among the hyper parameters explored by one machine learning model in the present round is higher than the best evaluation index of another machine learning model in the present round, the other machine learning model meets the early stop condition.
In the illustrative first example, the best model-evaluation index and the 5th-best model-evaluation index (in this example, J is 5, although the present invention is not limited thereto) of the logistic regression machine learning model lr are 0.7 and 0.3, respectively; the best model-evaluation index and the 5th-best model-evaluation index of the gradient boosting decision tree machine learning model gbdt are 0.6 and 0.2, respectively; and the best model-evaluation index and the 5th-best model-evaluation index of the deep sparse network machine learning model dsn are 0.8 and 0.61, respectively. Since the 5th-best model-evaluation index 0.61 of the deep sparse network machine learning model dsn is greater than the best model-evaluation index 0.6 of the gradient boosting decision tree machine learning model gbdt, the gradient boosting decision tree machine learning model gbdt is determined to meet the early stop condition. Therefore, the gradient boosting decision tree machine learning model gbdt no longer participates in model exploration. Accordingly, by judging whether a machine learning model meets the early stop condition and stopping exploration of a machine learning model that meets the early stop condition, the waste of resources can be reduced and the exploration efficiency can be improved.
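Under the stated assumptions (J = 5, a higher evaluation index is better), the comparison in this second early stop condition can be sketched in Python as follows (the helper name jth_best is illustrative and not part of the original):

```python
def jth_best(A, J):
    # J-th best model-evaluation index (1-indexed), higher is better
    return sorted(A, reverse=True)[J - 1]

lr   = [0.2, 0.4, 0.5, 0.3, 0.6, 0.1, 0.7, 0.3]
gbdt = [0.5, 0.2, 0.1, 0.4, 0.2, 0.6]
dsn  = [0.61, 0.67, 0.63, 0.72, 0.8]

# dsn's 5th-best index (0.61) exceeds gbdt's best index (0.6),
# so gbdt meets the early stop condition.
print(jth_best(dsn, 5), max(gbdt))  # 0.61 0.6
```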
In one embodiment of the present invention, at the beginning, resources of the same quantity may be allocated to the multiple machine learning models, and one round of hyper parameter exploration training may be performed on each of the multiple machine learning models based on the same target data set using resources of the same quantity. Here, a resource may indicate a computing resource used to explore a machine learning model. In one example, the resource may include at least one of a central processing unit, storage space, and a thread.
In step S120, the present-round performance score of each machine learning model is calculated based on the model-evaluation indices corresponding to the multiple groups of hyper parameters explored in the present round by the multiple machine learning models, respectively, and the future potential score of each machine learning model is calculated.
In the present invention, the present-round performance score of a machine learning model may relate to the best one or more results explored by the machine learning model in the present round. The step of calculating the present-round performance score of each machine learning model will be explained in more detail later in conjunction with Fig. 2.
In addition, in the present invention, the future potential score of a machine learning model may indicate the ability of the machine learning model to explore better results if it continues exploring. The step of calculating the future potential score of each machine learning model will be described in detail later with reference to Fig. 3.
In step S130, the present-round performance score and the future potential score of each machine learning model are combined to determine a resource allocation scheme for allocating available resources to each machine learning model.
That is, in the present invention, the resource allocation scheme for allocating available resources to each machine learning model may be determined based on the present-round performance score and the future potential score of each machine learning model. Since the resource allocation scheme is determined in consideration of the present-round performance score and the future potential score of each machine learning model, resources are coordinated and scheduled among the machine learning models, so that resources are allocated to different machine learning models in a targeted manner. This avoids the situation in which every machine learning model keeps running regardless of its performance, saves a large amount of computing and time resources, and thereby achieves effective management of both resource efficiency and exploration.
In step S140, corresponding resource scheduling is performed in the hyper parameter exploration training of the next round according to the resource allocation scheme.
For example, in the hyper parameter exploration training of the next round, each machine learning model may perform exploration training using the resources scheduled for it according to the resource allocation scheme.
Optionally, in addition, the resource scheduling method may further include: in response to a stopping request of a user, the total training time reaching a predetermined total training time, or the total number of training rounds reaching a predetermined total number of training rounds, stopping allocating resources to the machine learning models.
Fig. 2 shows a flowchart of calculating the present-round performance score of each machine learning model according to an embodiment of the present invention.
Referring to Fig. 2, in step S210, the K best model-evaluation indices are determined from the model-evaluation indices corresponding to the multiple groups of hyper parameters explored in the present round by the multiple machine learning models, respectively, where K is a positive integer.
As described above, in the illustrative first example, the model-evaluation indices corresponding to the multiple groups of hyper parameters explored in the present round by the logistic regression machine learning model lr, the gradient boosting decision tree machine learning model gbdt and the deep sparse network machine learning model dsn, respectively, may be expressed as the following arrays: lr: [0.2, 0.4, 0.5, 0.3, 0.6, 0.1, 0.7, 0.3], gbdt: [0.5, 0.2, 0.1, 0.4, 0.2, 0.6], dsn: [0.61, 0.67, 0.63, 0.72, 0.8]. Here, since the gradient boosting decision tree machine learning model gbdt meets the early stop condition as described above, gbdt no longer participates in the subsequent exploration training. In this case, the 5 (here, K is 5, although the present invention is not limited thereto) best model-evaluation indices among all the model-evaluation indices of the logistic regression machine learning model lr and the deep sparse network machine learning model dsn are: 0.7, 0.67, 0.63, 0.72 and 0.8.
In step S220, for each machine learning model, the proportion of the K best model-evaluation indices belonging to that machine learning model is taken as the present-round performance score of the machine learning model.
In the illustrative first example, among the 5 best model-evaluation indices "0.7, 0.67, 0.63, 0.72 and 0.8", "0.7" is a model-evaluation index of the logistic regression machine learning model lr; therefore, the proportion of the 5 best model-evaluation indices "0.7, 0.67, 0.63, 0.72 and 0.8" belonging to lr is 1/5. In contrast, in the illustrative first example, "0.67", "0.63", "0.72" and "0.8" among the 5 best model-evaluation indices are model-evaluation indices of the deep sparse network machine learning model dsn; therefore, the proportion of the 5 best model-evaluation indices belonging to dsn is 4/5. Accordingly, in the illustrative first example, the present-round performance score of the logistic regression machine learning model lr may correspond to 1/5, and the present-round performance score of the deep sparse network machine learning model dsn may correspond to 4/5.
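The top-K share computed in steps S210 and S220 can be sketched as follows (an illustrative helper only; exact fractions are used so the result matches the 1/5 and 4/5 of the first example):

```python
from fractions import Fraction

def round_performance_scores(results, K=5):
    # results maps model name -> its model-evaluation indices this round
    pool = sorted(((v, name) for name, vals in results.items() for v in vals),
                  reverse=True)[:K]
    return {name: Fraction(sum(1 for _, n in pool if n == name), K)
            for name in results}

# gbdt is excluded because it was stopped early in the first example
scores = round_performance_scores({
    "lr":  [0.2, 0.4, 0.5, 0.3, 0.6, 0.1, 0.7, 0.3],
    "dsn": [0.61, 0.67, 0.63, 0.72, 0.8],
})
print(scores)  # {'lr': Fraction(1, 5), 'dsn': Fraction(4, 5)}
```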
Fig. 3 shows a flowchart of calculating the future potential score of each machine learning model according to an embodiment of the present invention.
Referring to Fig. 3, in step S310, the model-evaluation indices corresponding to the multiple groups of hyper parameters explored in the present round by each machine learning model are sequentially stored in an array, so as to obtain multiple arrays respectively corresponding to the multiple machine learning models.
In the illustrative first example, as described above, the array corresponding to the model-evaluation indices of the logistic regression machine learning model lr is [0.2, 0.4, 0.5, 0.3, 0.6, 0.1, 0.7, 0.3], and the array corresponding to the model-evaluation indices of the deep sparse network machine learning model dsn is [0.61, 0.67, 0.63, 0.72, 0.8].
In step S320, for each machine learning model, a monotonically improving array is extracted from the array corresponding to the machine learning model, and the ratio of the length of the monotonically improving array to the length of the array corresponding to the machine learning model is taken as the future potential score of the machine learning model.
Here, the monotonically improving array does not necessarily mean a monotonically increasing array. In one example, when the training effect indicates validation-set accuracy, the monotonically improving array may indicate a monotonically increasing array. In another example, when the training effect indicates a mean squared error, the monotonically improving array may indicate a monotonically decreasing array. In other words, the improvement of the values in the monotonically improving array may indicate the improvement or optimization of the training effect.
For ease of description, it is assumed below that the array corresponding to the machine learning model sequentially includes a first model-evaluation index to an X-th model-evaluation index, where X is an integer greater than or equal to M.
The step of extracting the monotonically improving array from the array corresponding to the machine learning model may include: extracting the first model-evaluation index as the first value in the monotonically improving array.
For example, in the illustrative first example, the array corresponding to the model-evaluation indices of the logistic regression machine learning model lr is [0.2, 0.4, 0.5, 0.3, 0.6, 0.1, 0.7, 0.3]; therefore, 0.2 is extracted as the first value in the monotonically improving array.
In addition, the step of extracting the monotonically improving array from the array corresponding to the machine learning model may further include: for any model-evaluation index among the second model-evaluation index to the X-th model-evaluation index, if that model-evaluation index is better than the maximum value in the current monotonically improving array, extracting that model-evaluation index as a new value in the monotonically improving array.
For example, in the illustrative first example, for the second model-evaluation index 0.4 of the logistic regression machine learning model lr, since the second model-evaluation index 0.4 is greater than the maximum value (that is, 0.2) in the current monotonically improving array (at this point, the monotonically improving array including only the first value), 0.4 is extracted as a new value (that is, the second value) in the monotonically improving array. At this point, the monotonically improving array becomes [0.2, 0.4]. Next, for the third model-evaluation index 0.5 of the logistic regression machine learning model lr, since the third model-evaluation index 0.5 is greater than the maximum value (that is, 0.4) in the current monotonically improving array (at this point, the monotonically improving array including the first value and the second value), 0.5 is extracted as a new value (that is, the third value) in the monotonically improving array. At this point, the monotonically improving array becomes [0.2, 0.4, 0.5]. Next, for the fourth model-evaluation index 0.3 of the logistic regression machine learning model lr, since the fourth model-evaluation index 0.3 is less than the maximum value (that is, 0.5) in the current monotonically improving array (at this point, the monotonically improving array including the first value, the second value and the third value), 0.3 is not extracted as a new value in the monotonically improving array. At this point, the monotonically improving array is still [0.2, 0.4, 0.5]. Subsequently, the fifth model-evaluation index 0.6 to the eighth model-evaluation index 0.3 are processed similarly to the second model-evaluation index 0.4 to the fourth model-evaluation index 0.3, and the finally obtained monotonically improving array is [0.2, 0.4, 0.5, 0.6, 0.7].
In the present invention, the length of an array may indicate the quantity of numerical values included in the array. In the illustrative first example, the length of the finally obtained monotonically improving array of the logistic regression machine learning model lr is 5, and the length of the array [0.2, 0.4, 0.5, 0.3, 0.6, 0.1, 0.7, 0.3] corresponding to lr is 8; therefore, the future potential score of the logistic regression machine learning model lr is 5/8.
In the illustrative first example, based on a method similar to the method of calculating the future potential score of the logistic regression machine learning model lr, the length of the finally obtained monotonically improving array [0.61, 0.67, 0.72, 0.8] of the deep sparse network machine learning model dsn is 4, and the length of the array [0.61, 0.67, 0.63, 0.72, 0.8] corresponding to dsn is 5; therefore, the future potential score of the deep sparse network machine learning model dsn is 4/5.
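The extraction procedure and the resulting potential score can be sketched as follows (assuming higher-is-better indices, as in the first example; the function name is illustrative):

```python
from fractions import Fraction

def future_potential(A):
    mono = [A[0]]            # the first index always enters the array
    for v in A[1:]:
        if v > mono[-1]:     # better than the current maximum
            mono.append(v)
    return mono, Fraction(len(mono), len(A))

mono_lr, score_lr = future_potential([0.2, 0.4, 0.5, 0.3, 0.6, 0.1, 0.7, 0.3])
mono_dsn, score_dsn = future_potential([0.61, 0.67, 0.63, 0.72, 0.8])
print(mono_lr, score_lr)    # [0.2, 0.4, 0.5, 0.6, 0.7] 5/8
print(mono_dsn, score_dsn)  # [0.61, 0.67, 0.72, 0.8] 4/5
```

Because only values larger than everything seen so far are appended, comparing against mono[-1] is equivalent to comparing against the maximum of the current monotonically improving array.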
Fig. 4 shows a flowchart of determining the resource allocation scheme according to an embodiment of the present invention.
Referring to Fig. 4, in step S410, the composite score of each machine learning model is calculated based on the present-round performance score and the future potential score of each machine learning model.
In one embodiment, the composite score of each machine learning model may be calculated by performing a weighted sum of the present-round performance score and the future potential score of each machine learning model. Merely as an example, the same or different weights may be assigned to the present-round performance score and the future potential score of a machine learning model to calculate the composite score of each machine learning model. In one example, a weight of "1" may be assigned to both the present-round performance score and the future potential score of a machine learning model to calculate the composite score of each machine learning model, that is, the composite score of each machine learning model is the sum of the present-round performance score and the future potential score of the machine learning model. However, the above example is merely illustrative, and the present invention does not limit the range of the weights. In some examples, when exploration efficiency is of greater concern, the weight assigned to the present-round performance score of a machine learning model may be set greater than the weight assigned to the future potential score of the machine learning model. In addition, in some examples, when the final exploration result is of greater concern, the weight assigned to the present-round performance score of a machine learning model may be set smaller than the weight assigned to the future potential score of the machine learning model.
For example, in the illustrative first example, the composite score of the logistic regression machine learning model lr may be: 1/5 + 5/8 = 33/40; the composite score of the deep sparse network machine learning model dsn may be: 4/5 + 4/5 = 8/5.
In step S420, the ratio of the composite score of each machine learning model to the sum of all the composite scores is calculated as the resource allocation coefficient of each machine learning model.
For example, in the illustrative first example, the resource allocation coefficient of the logistic regression machine learning model lr may be calculated as: (33/40) ÷ (33/40 + 8/5) = 33/97, and the resource allocation coefficient of the deep sparse network machine learning model dsn may be calculated as: (8/5) ÷ (33/40 + 8/5) = 64/97.
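Using exact fractions, the composite scores and resource allocation coefficients of the first example (with both weights set to 1, as in the example above) can be checked as:

```python
from fractions import Fraction

perf      = {"lr": Fraction(1, 5), "dsn": Fraction(4, 5)}
potential = {"lr": Fraction(5, 8), "dsn": Fraction(4, 5)}

# weighted sum with both weights equal to 1
composite = {m: perf[m] + potential[m] for m in perf}
total = sum(composite.values())
coeff = {m: s / total for m, s in composite.items()}

print(composite)  # {'lr': Fraction(33, 40), 'dsn': Fraction(8, 5)}
print(coeff)      # {'lr': Fraction(33, 97), 'dsn': Fraction(64, 97)}
```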
In step S430, the resource allocation scheme is determined as the following resource allocation scheme: the resources corresponding to the product of the resource allocation coefficient of each machine learning model and the total resources to be allocated are determined as the resources to be allocated to that machine learning model.
For example, in the illustrative first example, the resource allocation scheme of the logistic regression machine learning model lr may be determined as the following resource allocation scheme: the resources corresponding to the product of the resource allocation coefficient 33/97 of lr and the total resources to be allocated are determined as the resources to be allocated to lr; the resource allocation scheme of the deep sparse network machine learning model dsn may be determined as the following resource allocation scheme: the resources corresponding to the product of the resource allocation coefficient 64/97 of dsn and the total resources to be allocated are determined as the resources to be allocated to dsn.
Here, the total resources to be allocated may indicate a predetermined quantity of resources. As described above, a resource may include at least one of a central processing unit, storage space, and a thread.
In one example, when the resource indicates central processing units, the quantity of the total resources to be allocated may indicate the quantity of central processing units to be allocated.
In another example, when the resource indicates storage space, the quantity of the total resources to be allocated may indicate the quantity or size of the storage space to be allocated.
In yet another example, when the resource indicates threads (also referred to as tasks), the total resources to be allocated may indicate a number of tasks or a number of threads. For example, in the illustrative first example, when the total number of tasks to be allocated is 4, the quantity of resources to be allocated to the logistic regression machine learning model lr is: 33/97 × 4 = 132/97, and the quantity of resources to be allocated to the deep sparse network machine learning model dsn is: 64/97 × 4 = 256/97.
Optionally, the step of determining the resources corresponding to the product of the resource allocation coefficient of each machine learning model and the total resources to be allocated as the resources to be allocated to each machine learning model may include: among all the machine learning models except the machine learning model with the highest resource allocation coefficient, starting from the machine learning model with the lowest resource allocation coefficient, rounding down the product of the resource allocation coefficient of the machine learning model and the total resources to be allocated and determining the rounded-down value as the quantity of resources to be allocated to that machine learning model; and determining the resources in the total resources to be allocated that have not yet been allocated to any machine learning model as the resources to be allocated to the machine learning model with the highest resource allocation coefficient. Here, since some resources (merely as an example, tasks) operate in integer units, in this case the resources corresponding to the product of the resource allocation coefficient of each machine learning model and the total resources to be allocated need to be rounded down.
As an example, as described above, in the illustrative first example, when the total number of tasks to be allocated is 4, the quantity of resources to be allocated to the logistic regression machine learning model lr is: 33/97 × 4 = 132/97, and the quantity of resources to be allocated to the deep sparse network machine learning model dsn is: 64/97 × 4 = 256/97. In this example, among all the machine learning models (that is, the logistic regression machine learning model lr whose resource allocation coefficient is 33/97) except the machine learning model with the highest resource allocation coefficient (that is, the deep sparse network machine learning model dsn whose resource allocation coefficient is 64/97), starting from the logistic regression machine learning model lr whose resource allocation coefficient is 33/97, the product (that is, 132/97) of the resource allocation coefficient of lr and the total resources to be allocated is rounded down and the rounded-down value (that is, the value 1 obtained by rounding 132/97 down) is determined as the quantity (that is, 1) of resources (that is, tasks) to be allocated to lr, and the resources (tasks: 4 - 1 = 3) in the total resources to be allocated (that is, the total tasks) that have not yet been allocated to any machine learning model are determined as the resources to be allocated to the machine learning model with the highest resource allocation coefficient (that is, the deep sparse network machine learning model dsn whose resource allocation coefficient is 64/97). That is, the number of tasks to be allocated to the logistic regression machine learning model lr is 1, and the number of tasks to be allocated to the deep sparse network machine learning model dsn is 3.
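The floor-and-remainder allocation of step S430 can be sketched as follows (an illustrative helper; resource counts are assumed to be integer task counts):

```python
import math
from fractions import Fraction

def allocate(coeff, total):
    # Round every model's share down except the model with the highest
    # coefficient, which receives all the resources left over.
    top = max(coeff, key=coeff.get)
    alloc = {m: math.floor(c * total) for m, c in coeff.items() if m != top}
    alloc[top] = total - sum(alloc.values())
    return alloc

result = allocate({"lr": Fraction(33, 97), "dsn": Fraction(64, 97)}, 4)
print(result)  # {'lr': 1, 'dsn': 3}
```

Giving the remainder to the highest-coefficient model keeps the total exactly equal to the number of tasks available, which simple per-model rounding would not guarantee.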
In addition, in the above rounding down, there may be a case in which the quantity of resources rounds to 0. Therefore, in order to avoid such a case, a compensation allocation mechanism is adopted. By using the compensation allocation mechanism, it can be guaranteed that the strong (that is, the machine learning models allocated more resources) remain strong, while the weak (that is, the machine learning models allocated fewer resources) still have a chance. The compensation allocation mechanism in the step of determining the resources corresponding to the product of the resource allocation coefficient of each machine learning model and the total resources to be allocated as the resources to be allocated to each machine learning model will be described in detail with reference to Fig. 5.
Fig. 5 shows a flowchart of the compensation allocation mechanism according to an embodiment of the present invention.
Referring to Fig. 5, in step S510, when both values of zero and values greater than one exist among the quantities of resources to be allocated to the machine learning models, the quantities of resources of the machine learning models whose quantity of resources to be allocated is greater than one are sorted in increasing order.
For ease of description, the following description takes as a second example six machine learning models a, b, c, d, e, f whose allocated numbers of tasks are [1, 0, 0, 0, 2, 7]; however, the present invention is not limited thereto, and the quantity of machine learning models and the specific quantities of allocated resources (for example, numbers of tasks) may be any other quantities.
In the illustrative second example, since the number of tasks allocated to at least one machine learning model (that is, machine learning models b to d) is 0, the compensation allocation mechanism is triggered. Here, the quantities of resources of the machine learning models whose quantity of resources to be allocated is greater than one are sorted in increasing order. That is, in the illustrative second example, the quantities of resources of the machine learning models among a to f whose allocated number of tasks is greater than one (that is, machine learning models e and f) are sorted in increasing order as [2, 7]. In the present invention, a resource whose quantity is 1 does not necessarily mean a single resource; it may indicate one unit of resources, where one unit of resources corresponds to a predetermined quantity of resources.
In step S520, among the resources of the machine learning models sorted in increasing order, starting from the machine learning model with the fewest resources, the resources of that machine learning model are reduced by one unit, and the reduced unit is allocated to one of the machine learning models whose quantity of resources is zero; the process then returns to step S510 until the resources of all models are not 0. A more detailed description is given below with reference to the illustrative second example; however, the present invention is not limited to the illustrative second example.
In the illustrative second example, the quantity 2 of resources of machine learning model e is reduced by 1, and the reduced resource of quantity 1 is allocated to a machine learning model whose quantity of resources is 0 (for example, machine learning model b). Since the quantity of resources of machine learning model e becomes 1 after the reduction, the quantity of resources of machine learning model e is subsequently kept at 1, that is, no more resources are allocated from the resources of machine learning model e to other machine learning models. At this point, since the quantities of resources of two machine learning models (that is, machine learning models c and d) are still 0, it may be considered to continue allocating resources from other machine learning models to the machine learning models whose quantity of resources is 0. Since the quantity of resources of machine learning model e has become 1, the next machine learning model (that is, machine learning model f) is reduced, at most down to 1. The quantity of resources of machine learning model f may be reduced from 7 to 5, and the reduced resources are respectively allocated to machine learning model c and machine learning model d, so that the quantities of resources of machine learning models c and d are each 1.
Through the compensation allocation mechanism, the quantities of resources of machine learning models a to f finally become [1, 1, 1, 1, 1, 5]. Therefore, after the compensation allocation mechanism is applied, every one of machine learning models a to f is allocated resources, which guarantees that the strong remain strong while the weak still have a chance, and avoids stopping the exploration of a machine learning model merely because it performs poorly for the moment, thereby further improving the accuracy of the exploration.
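The mechanism of steps S510 and S520 can be sketched as follows (illustrative only; it assumes the total resources are at least the number of models, so a donor with more than one unit always exists while any model still has zero):

```python
def compensate(alloc):
    # alloc maps model -> allocated task count; one unit at a time is moved
    # from the smallest holder with more than one unit to a zero-unit model.
    while any(v == 0 for v in alloc.values()):
        donor = min((m for m, v in alloc.items() if v > 1), key=alloc.get)
        receiver = next(m for m, v in alloc.items() if v == 0)
        alloc[donor] -= 1
        alloc[receiver] += 1
    return alloc

result = compensate(dict(zip("abcdef", [1, 0, 0, 0, 2, 7])))
print(result)  # {'a': 1, 'b': 1, 'c': 1, 'd': 1, 'e': 1, 'f': 5}
```

As described in the second example, model e donates one unit (dropping to 1 and then staying there), after which model f donates the remaining two units to models c and d.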
Fig. 6 shows the resource scheduling device of embodiment according to the present invention explored for multi-model.
Referring to Fig. 6, the resource scheduling device 600 explored for multi-model may include hyper parameter exploration training unit 610, obtain Divide computing unit 620, Resource Allocation Formula determination unit 630 and scheduling of resource unit 640.Here, it is explored for multi-model Resource scheduling device 600 is executable referring to figs. 1 to either Fig. 5 description method and/or step.
Hyper parameter is explored training unit 610 and be can be configured to based on same target data set to multiple machine learning models point Not carry out one wheel hyper parameter explore training, wherein this wheel exploration in each machine learning model at least explores M group surpass join Number, M are the positive integer greater than 1.In other words, hyper parameter explores the executable step S110 described referring to Fig.1 of training unit 610. Therefore, it is had been described in detail for simplicity, no longer exploring the step S110 that training unit 610 executes to hyper parameter here, and And hyper parameter is equally applicable to the description of step S110 referring to Fig.1 and explores training unit 610.
Score calculation unit 620 is configured as: being explored respectively in this wheel based on the multiple machine learning model The corresponding model-evaluation index of multiple groups hyper parameter, calculates the epicycle performance score of each machine learning model, and calculates The future potential score of each machine learning model.In other words, score calculation unit 620 can be configured to execute and retouch referring to Fig.1 The step S120 stated.Therefore, for simplicity, the step S120 no longer executed here to score calculation unit 620 has been carried out specifically Description, and score calculation unit 620 is equally applicable to the description of step S120 referring to Fig.1.In addition, as an example, score Computing unit 620 can also carry out the calculating and/or reference of the epicycle performance score referring to each machine learning model of Fig. 2 description The calculating of the future potential score of each machine learning model of Fig. 3 description.
The resource allocation scheme determination unit 630 is configured to combine the current-round performance score and the future potential score of each machine learning model to determine a resource allocation scheme for allocating available resources to each machine learning model. In other words, the resource allocation scheme determination unit 630 may be configured to execute step S130 described with reference to Fig. 1. Therefore, for brevity, step S130 executed by the resource allocation scheme determination unit 630 is not described in detail again here, and the description of step S130 given with reference to Fig. 1 applies equally to the resource allocation scheme determination unit 630. In addition, as an example, the resource allocation scheme determination unit 630 may also perform the resource allocation scheme determination described with reference to Fig. 4 and/or the compensation allocation mechanism described with reference to Fig. 5.
The resource scheduling unit 640 may be configured to perform corresponding resource scheduling in the next round of hyperparameter exploration training according to the resource allocation scheme. Optionally, the resource scheduling unit 640 may also be configured to stop allocating resources to the machine learning models in response to a stop request from a user, the total training time reaching a predetermined total training time, or the total number of training rounds reaching a predetermined total number of training rounds.
The resource scheduling method and resource scheduling device in multi-model exploration according to exemplary embodiments of the present invention have been described above with reference to Figs. 1 to 6. It should be understood, however, that the devices, systems, units, etc. used in Figs. 1 to 6 may each be configured as software, hardware, firmware, or any combination thereof that performs a specific function. For example, these systems, devices, or units may correspond to dedicated integrated circuits, to pure software code, or to units combining software and hardware. In addition, one or more of the functions implemented by these systems, devices, or units may also be performed uniformly by components in a physical entity device (for example, a processor, a client, or a server).
In addition, the above method may be implemented by a computer program recorded on a computer-readable storage medium. For example, according to an exemplary embodiment of the present invention, a computer-readable storage medium may be provided, on which a computer program is stored, and the computer program, when executed by one or more computing devices, causes the one or more computing devices to implement any of the methods disclosed herein.
For example, the computer program, when executed by one or more computing devices, causes the one or more computing devices to execute the following steps: performing one round of hyperparameter exploration training on each of multiple machine learning models based on the same target data set, wherein each machine learning model explores at least M groups of hyperparameters in this round of exploration, M being a positive integer greater than 1; calculating a current-round performance score of each machine learning model and calculating a future potential score of each machine learning model, based on the model evaluation indexes respectively corresponding to the groups of hyperparameters explored by the multiple machine learning models in this round; combining the current-round performance score and the future potential score of each machine learning model to determine a resource allocation scheme for allocating available resources to each machine learning model; and performing corresponding resource scheduling in the next round of hyperparameter exploration training according to the resource allocation scheme.
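The third of the steps above, combining the two scores into a resource allocation scheme, can be sketched as follows. This is only an illustrative reading: the weighted-sum combination, the proportional split, and the names (`determine_allocation`, `weight`) are assumptions, since the patent defers the concrete allocation rule to the description of Fig. 4.

```python
def determine_allocation(perf, potential, total_resources, weight=0.5):
    """perf, potential: {model_name: score in [0, 1]} for the current round.
    Returns {model_name: share of total_resources for the next round}."""
    # Combine the current-round performance score and the future potential
    # score; a weighted sum is one plausible choice of "combining".
    combined = {m: weight * perf[m] + (1 - weight) * potential[m] for m in perf}
    total = sum(combined.values())
    if total == 0:  # no model scored anything: split the resources evenly
        return {m: total_resources / len(perf) for m in perf}
    # Allocate the available resources in proportion to the combined score.
    return {m: total_resources * combined[m] / total for m in combined}
```

For example, with performance scores {"a": 0.6, "b": 0.4} and potential scores {"a": 0.8, "b": 0.2}, ten units of resource would be split 7:3 in favor of model a.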
The computer program in the above computer-readable storage medium may be run in an environment deployed in computer equipment such as a client, a host, an agent device, or a server. It should be noted that the computer program, when run, may also be used to execute additional steps beyond the steps described above, or to perform more specific processing when executing the above steps. These additional steps and further processing have already been mentioned in the description of the related methods and devices given with reference to Figs. 1 to 6, and are therefore not repeated here.
It should be noted that the resource scheduling method and resource scheduling device in multi-model exploration according to exemplary embodiments of the present invention may rely entirely on the running of a computer program to implement the corresponding functions, wherein each unit of the device or system corresponds to a step in the functional architecture of the computer program, so that the entire device or system is called through a special software package (for example, a lib library) to implement the corresponding functions.
For example, according to an embodiment of the present invention, a resource scheduling system including one or more computing devices and one or more storage devices is provided, wherein a computer program is stored in the one or more storage devices, and the computer program, when executed by the one or more computing devices, causes the one or more computing devices to implement any of the methods disclosed herein, for example, to execute the following steps: performing one round of hyperparameter exploration training on each of multiple machine learning models based on the same target data set, wherein each machine learning model explores at least M groups of hyperparameters in this round of exploration, M being a positive integer greater than 1; calculating a current-round performance score of each machine learning model and calculating a future potential score of each machine learning model, based on the model evaluation indexes respectively corresponding to the groups of hyperparameters explored by the multiple machine learning models in this round; combining the current-round performance score and the future potential score of each machine learning model to determine a resource allocation scheme for allocating available resources to each machine learning model; and performing corresponding resource scheduling in the next round of hyperparameter exploration training according to the resource allocation scheme.
In particular, the above computing device may be deployed in a server, or may be deployed on a node apparatus in a distributed network environment. In addition, the computing device may also include a video display (such as a liquid crystal display) and a user interaction interface (such as a keyboard, a mouse, or a touch input device). All components of the computing device may be connected to each other via a bus and/or a network.
Here, the computing device need not be a single device, and may be any aggregate of devices or circuits capable of executing the above instructions (or instruction sets) alone or in combination. The computing device may also be part of an integrated control computing device or computing device manager, or may be a portable electronic device configured to interconnect with an interface locally or remotely (for example, via wireless transmission).
The computing device for executing the resource scheduling method of an exemplary embodiment of the present invention may be a processor, and such a processor may include a central processing unit (CPU), a graphics processing unit (GPU), a programmable logic device, a special-purpose processor, a microcontroller, or a microprocessor. By way of example and not limitation, the processor may also include an analog processor, a digital processor, a microprocessor, a multi-core processor, a processor array, a network processor, and the like. The processor may run instructions or code stored in one of the storage devices, wherein the storage device may also store data. Instructions and data may also be sent and received over a network via a network interface device, which may employ any known transport protocol.
The storage device may be integrated with the processor, for example, with RAM or flash memory arranged within an integrated circuit microprocessor. In addition, the storage device may comprise an independent device, such as an external disk drive, a storage array, or any other storage device usable by a database computing device. The storage device and the processor may be operatively coupled, or may communicate with each other, for example, through an I/O port or a network connection, enabling the processor to read files stored in the storage device.
It should be noted that the exemplary embodiments of the present invention focus on solving the problems of low resource utilization and low exploration efficiency in current multi-machine-learning-model exploration. In particular, the technical solution of the present invention, which performs resource scheduling using the current-round performance score and the future potential score of each machine learning model, can, on the one hand, make full use of the explored results to evaluate the current performance of each machine learning model, allocate resources effectively, and improve resource utilization and exploration efficiency; on the other hand, it also uses the explored results to evaluate the future performance of each machine learning model, allocate resources more reasonably, and further improve resource utilization and exploration efficiency.
The exemplary embodiments of the present application have been described above. It should be understood that the foregoing description is merely exemplary and not exhaustive, and that the present application is not limited to the disclosed exemplary embodiments. Many modifications and changes will be obvious to those of ordinary skill in the art without departing from the scope and spirit of the present application. Therefore, the protection scope of the present application should be subject to the scope of the claims.

Claims (10)

1. A resource scheduling method in multi-model exploration, the method comprising:
performing one round of hyperparameter exploration training on each of multiple machine learning models based on the same target data set, wherein each machine learning model explores at least M groups of hyperparameters in this round of exploration, M being a positive integer greater than 1;
calculating a current-round performance score of each machine learning model and calculating a future potential score of each machine learning model, based on model evaluation indexes respectively corresponding to the groups of hyperparameters explored by the multiple machine learning models in this round;
combining the current-round performance score and the future potential score of each machine learning model to determine a resource allocation scheme for allocating available resources to each machine learning model;
performing corresponding resource scheduling in the next round of hyperparameter exploration training according to the resource allocation scheme.
2. The resource scheduling method of claim 1, wherein calculating the current-round performance score of each machine learning model comprises:
determining the K best model evaluation indexes from the model evaluation indexes respectively corresponding to the groups of hyperparameters explored by the multiple machine learning models in this round, wherein K is a positive integer;
for each machine learning model, taking the proportion of the K best model evaluation indexes that belong to the machine learning model as the current-round performance score of the machine learning model.
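As an illustrative sketch only (the function name and data layout are assumptions, not part of the claim), the score of claim 2 can be computed by ranking every explored evaluation index across all models and counting how many of the K best belong to each model:

```python
def current_round_scores(round_indexes, k):
    """round_indexes: {model_name: list of model evaluation indexes obtained
    for the hyperparameter groups that model explored this round}.
    Returns {model_name: current-round performance score}, i.e. the share
    of the K best indexes across all models that belong to each model."""
    # Rank every (index, model) pair, best first; higher index = better here.
    ranked = sorted(
        ((idx, name) for name, idxs in round_indexes.items() for idx in idxs),
        key=lambda pair: pair[0], reverse=True)
    counts = dict.fromkeys(round_indexes, 0)
    for _, name in ranked[:k]:
        counts[name] += 1
    return {name: counts[name] / k for name in round_indexes}
```

For example, with AUCs {"lr": [0.81, 0.78, 0.90], "nb": [0.70, 0.85, 0.60]} and K = 3, the three best indexes are 0.90, 0.85, and 0.81, giving the lr model a score of 2/3 and the nb model a score of 1/3.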
3. The resource scheduling method of claim 1, wherein calculating the future potential score of each machine learning model comprises:
storing, in chronological order, the model evaluation indexes corresponding to the groups of hyperparameters explored by each machine learning model in this round into an array, to obtain multiple arrays respectively corresponding to the multiple machine learning models;
for each machine learning model, extracting a monotonically increasing subsequence from the array corresponding to the machine learning model, and taking the ratio of the length of the monotonically increasing subsequence to the length of the array corresponding to the machine learning model as the future potential score of the machine learning model.
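Read as the longest monotonically increasing subsequence, which is one plausible interpretation of "extracting a monotonically increasing subsequence" (the claim does not fix which subsequence is meant), the future potential score of claim 3 might be sketched as:

```python
def future_potential_score(indexes):
    """indexes: evaluation indexes of one model, in the chronological order
    in which its hyperparameter groups were explored this round.
    Returns the length of the longest strictly increasing subsequence
    divided by the length of the whole array."""
    if not indexes:
        return 0.0
    # Classic O(n^2) dynamic program; fine for one round's worth of indexes.
    best = [1] * len(indexes)
    for i in range(len(indexes)):
        for j in range(i):
            if indexes[j] < indexes[i]:
                best[i] = max(best[i], best[j] + 1)
    return max(best) / len(indexes)
```

A model whose indexes are still trending upward (for example [0.70, 0.72, 0.68, 0.75], which scores 3/4) is thus rated as having more future potential than one that has plateaued or declined.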
4. The resource scheduling method of claim 1, wherein the multiple machine learning models include at least two of: a logistic regression machine learning model with a hyperparameter selection mechanism, a naive Bayes machine learning model with a hyperparameter selection mechanism, an ensemble learning model with a hyperparameter selection mechanism, and a regression-related machine learning model with a hyperparameter selection mechanism.
5. The resource scheduling method of claim 1, wherein the resources include at least one of central processing units, storage space, and threads.
6. The resource scheduling method of claim 1, wherein the step of performing one round of hyperparameter exploration training on each of the multiple machine learning models based on the same target data set further comprises:
judging whether at least one machine learning model among the multiple machine learning models satisfies an early stopping condition,
wherein, when at least one machine learning model is determined to satisfy the early stopping condition, the training of the at least one machine learning model is stopped, and the steps of calculating the current-round performance score and the future potential score are not executed for the at least one machine learning model.
7. The resource scheduling method of claim 6, wherein the early stopping condition comprises:
when the model evaluation indexes corresponding to the hyperparameters explored by a machine learning model in the current round fail to set a new best for I consecutive times, the machine learning model satisfies the early stopping condition;
and/or
when the model evaluation indexes corresponding to J hyperparameters explored by one machine learning model in the current round are all higher than the best evaluation index of another machine learning model in the current round, the other machine learning model satisfies the early stopping condition.
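The two early-stopping conditions of claim 7 could be checked as follows; the helper names (`stalled`, `dominated`) and the exact comparison conventions are illustrative assumptions rather than the patent's implementation:

```python
def stalled(indexes, i):
    """First condition: True if the model's last `i` evaluation indexes
    never improved on the best index seen before them."""
    if len(indexes) <= i:
        return False
    best_before = max(indexes[:-i])
    return all(idx <= best_before for idx in indexes[-i:])

def dominated(indexes_a, indexes_b, j):
    """Second condition: True if model B's last `j` indexes all exceed
    model A's best index this round, so model A may be stopped early."""
    if len(indexes_b) < j or not indexes_a:
        return False
    return min(indexes_b[-j:]) > max(indexes_a)
```

For instance, a model whose indexes have drifted from 0.80 down to 0.77 over three evaluations is stalled, and a model whose best index is 0.65 is dominated by one whose last three indexes are all above 0.70.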
8. A resource scheduling device for multi-model exploration, the resource scheduling device comprising:
a hyperparameter exploration training unit configured to perform one round of hyperparameter exploration training on each of multiple machine learning models based on the same target data set, wherein each machine learning model explores at least M groups of hyperparameters in this round of exploration, M being a positive integer greater than 1;
a score calculation unit configured to calculate a current-round performance score of each machine learning model and calculate a future potential score of each machine learning model, based on model evaluation indexes respectively corresponding to the groups of hyperparameters explored by the multiple machine learning models in this round;
a resource allocation scheme determination unit configured to combine the current-round performance score and the future potential score of each machine learning model to determine a resource allocation scheme for allocating available resources to each machine learning model;
a resource scheduling unit configured to perform corresponding resource scheduling in the next round of hyperparameter exploration training according to the resource allocation scheme.
9. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by one or more computing devices, causes the one or more computing devices to implement the resource scheduling method of any one of claims 1-7.
10. A resource scheduling system comprising one or more computing devices and one or more storage devices, wherein a computer program is recorded on the one or more storage devices, and the computer program, when executed by the one or more computing devices, causes the one or more computing devices to implement the resource scheduling method of any one of claims 1-7.
CN201910791358.3A 2019-08-26 2019-08-26 Resource scheduling method and resource scheduling device in multi-model exploration Active CN110503208B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910791358.3A CN110503208B (en) 2019-08-26 2019-08-26 Resource scheduling method and resource scheduling device in multi-model exploration


Publications (2)

Publication Number Publication Date
CN110503208A true CN110503208A (en) 2019-11-26
CN110503208B CN110503208B (en) 2022-05-17

Family

ID=68589639

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910791358.3A Active CN110503208B (en) 2019-08-26 2019-08-26 Resource scheduling method and resource scheduling device in multi-model exploration

Country Status (1)

Country Link
CN (1) CN110503208B (en)


Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150186176A1 (en) * 2010-11-01 2015-07-02 Microsoft Corporation Dynamic allocation and assignment of virtual environment
CN108108228A (en) * 2018-01-05 2018-06-01 安徽师范大学 A kind of resource allocation methods based on differential evolution algorithm
US20180365065A1 (en) * 2017-07-31 2018-12-20 Seematics Systems Ltd System and method for estimating required processing resources for machine learning tasks
CN109144724A (en) * 2018-07-27 2019-01-04 众安信息技术服务有限公司 A kind of micro services resource scheduling system and method
CN109711548A (en) * 2018-12-26 2019-05-03 歌尔股份有限公司 Hyperparameter selection method, usage method, device and electronic device
CN109816116A (en) * 2019-01-17 2019-05-28 腾讯科技(深圳)有限公司 The optimization method and device of hyper parameter in machine learning model


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
DORIAN MINAROLLI et al.: "Cross-Correlation Prediction of Resource Demand for Virtual Machine Resource", IEEE *
CHU Ya et al.: "Cloud computing resource scheduling: strategies and algorithms", Computer Science (《计算机科学》) *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111340240A (en) * 2020-03-25 2020-06-26 第四范式(北京)技术有限公司 Method and device for realizing automatic machine learning
CN112149838A (en) * 2020-09-03 2020-12-29 第四范式(北京)技术有限公司 Method, device, electronic equipment and storage medium for realizing automatic model building
WO2022048648A1 (en) * 2020-09-03 2022-03-10 第四范式(北京)技术有限公司 Method and apparatus for achieving automatic model construction, electronic device, and storage medium
CN112116104A (en) * 2020-09-17 2020-12-22 京东数字科技控股股份有限公司 Method, apparatus, medium, and electronic device for automatically integrating machine learning
CN112667397A (en) * 2020-12-09 2021-04-16 财团法人工业技术研究院 Machine learning system and resource allocation method thereof
CN112667397B (en) * 2020-12-09 2023-11-28 财团法人工业技术研究院 Machine learning system and resource allocation method thereof
WO2022188575A1 (en) * 2021-03-11 2022-09-15 山东英信计算机技术有限公司 Hyperparameter tuning method and apparatus, and storage medium
WO2022269448A1 (en) * 2021-06-25 2022-12-29 International Business Machines Corporation Selection of machine learning model
CN113780287A (en) * 2021-07-30 2021-12-10 武汉中海庭数据技术有限公司 Optimal selection method and system for multi-depth learning model
CN114003393A (en) * 2021-12-30 2022-02-01 南京大学 Method and system for improving integrated automatic machine learning operation performance
CN114003393B (en) * 2021-12-30 2022-06-14 南京大学 A method and system for improving the performance of integrated automatic machine learning



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant