FI20235432A1

FI20235432A1 - Computer-implemented method for training at least one model for controlling asset allocation in a distributed energy system

Info

Publication number: FI20235432A1
Application number: FI20235432A
Authority: FI
Inventors: Jukka-Pekka Salmenkaita; Esko Heinonen; Simon Holmbacka
Original assignee: Elisa Oyj
Priority date: 2023-04-17
Filing date: 2023-04-17
Publication date: 2024-10-18
Also published as: WO2024218432A1

Abstract

According to an embodiment, a computer-implemented method for training at least one model (100, 104) for controlling asset allocation in a distributed energy system comprises obtaining historical data about the distributed energy system (101); training a data simulator to generate future data for future distributed energy system conditions using the historical data (102); generating a plurality of new data for new distributed energy system conditions using the trained data simulator (103); and training at least one model for controlling asset allocation in the distributed energy system in the new distributed energy system conditions using the plurality of new data (104).

Description

COMPUTER-IMPLEMENTED METHOD FOR TRAINING AT LEAST ONE MODEL FOR CONTROLLING ASSET ALLOCATION IN A DISTRIBUTED ENERGY SYSTEM

TECHNICAL FIELD

[0001] The present disclosure relates to distributed energy systems, and more particularly to a computer-implemented method for training at least one model for controlling asset allocation in a distributed energy system, a computing device, and a computer program product.

BACKGROUND

[0002] A distributed energy storage (DES) can comprise a large number of nodes, and each node can be powered by, for example, the power grid or by a battery system connected to the node. Many asset allocation decisions, such as how much capacity to offer for an automatic frequency restoration reserve (aFRR) market, need to be done in advance, typically approximately one day earlier than the operations. Further, asset allocation optimization is not deterministic, since not all needed information is known in advance and activation of allocated capacity is stochastic.

SUMMARY

[0003] This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.

[0004] It is an objective to provide a computer-implemented method for training at least one model for controlling asset allocation in a distributed energy system, a computing device, and a computer program product. The foregoing and other objectives are achieved by the features of the independent claims. Further implementation forms are apparent from the dependent claims, the description and the figures.

[0005] According to a first aspect, a computer-implemented method for training at least one model for controlling asset allocation in a distributed energy system comprises: obtaining historical data about the distributed energy system; training a data simulator to generate future data for future distributed energy system conditions using the historical data; generating a plurality of new data for new distributed energy system conditions using the trained data simulator; and training at least one model for controlling asset allocation in the distributed energy system in the new distributed energy system conditions using the plurality of new data. The method can, for example, efficiently train the at least one model for controlling asset allocation in the distributed energy system.

[0006] In an implementation form of the first aspect, the training the at least one model for controlling asset allocation in the distributed energy system in the new distributed energy system conditions using the plurality of new data comprises: obtaining at least one seed model from a database based at least on the new distributed energy system conditions; and using the at least one seed model as a starting point for training the at least one model for controlling asset allocation in the distributed energy system in the new distributed energy system conditions. The method can, for example, efficiently train the at least one model for controlling asset allocation in the distributed energy system using the at least one seed model.

[0007] In another implementation form of the first aspect, each seed model of the at least one seed model comprises at least one asset allocation plan for the distributed energy system, energy and/or power characteristics of at least one asset of the at least one asset allocation plan, statistical information about distributed energy system conditions at a time of execution of the at least one asset allocation plan, and/or an outcome probability distribution of the at least one asset allocation plan. The method can, for example, efficiently select an appropriate at least one seed model for the new distributed energy system conditions.

[0008] In another implementation form of the first aspect, the statistical information about the distributed energy system conditions at the time of execution of the at least one asset allocation plan comprises at least one of: distributed energy system conditions before the time of execution of the allocation plan, prices before and/or during the time of execution of the at least one asset allocation plan, and/or frequency restoration reserves activation request signals before and/or during the time of execution of the at least one asset allocation plan. The method can, for example, efficiently select an appropriate at least one seed model based on the statistical information about the distributed energy system conditions.

[0009] In another implementation form of the first aspect, the outcome probability distribution of the at least one asset allocation plan is based on a plurality of simulations and/or an observed outcome probability distribution of the at least one asset allocation plan. The method can, for example, more accurately select an appropriate at least one seed model since the outcome probability distribution of the at least one asset allocation plan can more accurately reflect the suitability of the at least one asset allocation plan.

[0010] In another implementation form of the first aspect, the method further comprises scaling the at least one seed model according to a capacity characteristic of the at least one seed model and a capacity characteristic of the new distributed energy system conditions. The method can, for example, make the at least one seed model more suitable for the new distributed energy system conditions.

[0011] In another implementation form of the first aspect, the training the at least one model for controlling asset allocation in the distributed energy system in the new distributed energy system conditions using the plurality of new data further comprises: training a plurality of models for the new distributed energy system conditions by using the at least one seed model as a starting point and by simulating each model in the plurality of models a plurality of times, thus obtaining a plurality of trained models; recording an outcome probability distribution of each trained model in the plurality of trained models; and selecting a trained model, from the plurality of trained models, for execution based at least on the outcome probability distribution of each trained model in the plurality of trained models and at least one selection criteria. The method can, for example, efficiently choose the trained model for execution.

[0012] In another implementation form of the first aspect, the method further comprises storing at least the selected trained model in the database. The method can, for example, improve training of future models by storing at least the selected trained model in the database as a seed model.

[0013] In another implementation form of the first aspect, the method further comprises storing at least one non-selected trained model, from the plurality of trained models, in the database based at least on the outcome probability distribution of each trained model in the plurality of trained models and the at least one selection criteria. The method can, for example, improve training of future models by storing at least one non-selected trained model in the database as a seed model.

[0014] In another implementation form of the first aspect, the training the at least one model for controlling asset allocation in the distributed energy system in the new distributed energy system conditions using the plurality of new data comprises training the at least one model for controlling asset allocation in the distributed energy system in a plurality of distributed energy system conditions. The method can, for example, efficiently train a model that can function in a plurality of distributed energy system conditions.

[0015] In another implementation form of the first aspect, the training the at least one model for controlling asset allocation in the distributed energy system in the new distributed energy system conditions allows a human to adjust the training. The method can, for example, efficiently adjust the training via the human adjustment.

[0016] In another implementation form of the first aspect, the training the at least one model for controlling asset allocation in the distributed energy system in the new distributed energy system conditions comprises reinforcement learning. The method can, for example, efficiently train the at least one model using reinforcement learning.

[0017] According to a second aspect, a computing device comprises at least one processor and at least one memory including computer program code, the at least one memory and the computer program code being configured to, with the at least one processor, cause the computing device to perform the method according to the first aspect.

[0018] According to a third aspect, a computer program product comprises program code configured to perform the method according to the first aspect when the computer program product is executed on a computer.

[0019] Many of the attendant features will be more readily appreciated as they become better understood by reference to the following detailed description considered in connection with the accompanying drawings.

DESCRIPTION OF THE DRAWINGS

[0020] In the following, example embodiments are described in more detail with reference to the attached figures and drawings, in which:

[0021] Fig. 1 illustrates a flow chart representation of a method according to an embodiment;

[0022] Fig. 2 illustrates a schematic representation of model training according to an embodiment;

[0023] Fig. 3 illustrates a plot representation of an outcome probability distribution according to an embodiment;

[0024] Fig. 4 illustrates a plot representation of an outcome probability distribution according to another embodiment; and

[0025] Fig. 5 illustrates a schematic representation of a computing device according to an embodiment.

[0026] In the following, like reference numerals are used to designate like parts in the accompanying drawings.

DETAILED DESCRIPTION

[0027] In the following description, reference is made to the accompanying drawings, which form part of the disclosure, and in which are shown, by way of illustration, specific aspects in which the present disclosure may be placed. It is understood that other aspects may be utilised, and structural or logical changes may be made without departing from the scope of the present disclosure. The following detailed description, therefore, is not to be taken in a limiting sense, as the scope of the present disclosure is defined by the appended claims.

[0028] For instance, it is understood that a disclosure in connection with a described method may also hold true for a corresponding device or system configured to perform the method and vice versa. For example, if a specific method step is described, a corresponding device may include a unit to perform the described method step, even if such unit is not explicitly described or illustrated in the figures. On the other hand, for example, if a specific apparatus is described based on functional units, a corresponding method may include a step performing the described functionality, even if such step is not explicitly described or illustrated in the figures. Further, it is understood that the features of the various example aspects described herein may be combined with each other, unless specifically noted otherwise.

[0029] Fig. 1 illustrates a flow chart representation of a method according to an embodiment.

[0030] According to an embodiment, a computer-implemented method 100 for training at least one model for controlling asset allocation in a distributed energy system comprises obtaining 101 historical data about the distributed energy system.

[0031] Herein, an asset may refer to any part/component of the distributed energy system that can be used for, for example, power grid frequency balancing. For example, the distributed energy system may comprise a plurality of nodes, each node comprising a battery that can be used to provide power to the power grid during up regulation and/or to take power from the power grid during down regulation. Allocating an asset may comprise, for example, providing an asset for power grid frequency balancing.

[0032] The historical data about the distributed energy system may comprise, for example, historical distributed energy system conditions at various points of time, such as days, historical information about prices, such as electricity prices, such as day-ahead and/or intraday market prices, the capacity prices on the reserve markets, such as for automatic frequency restoration reserve (aFRR) up regulation and aFRR down regulation, and/or energy price in the aFRR market, and/or historical frequency restoration reserves activation request signals.

[0033] The method 100 may further comprise training 102 a data simulator to generate future data for future distributed energy system conditions using the historical data.

[0034] The data simulator may also be referred to as data generator or similar.

[0035] Herein, distributed energy system conditions may comprise any information that can affect the distributed energy system. Examples of such information are disclosed herein.

[0036] Distributed energy system conditions may also be referred to as market conditions, environmental conditions, or similar.

[0037] The future distributed energy system conditions may correspond to, for example, distributed energy system conditions during any period of time, such as a day, in the future. For example, the data simulator may be configured to output the future data based on input parameters describing the future point of time, such as a day. For example, after the data simulator has been trained, the data simulator may take as an input descriptive parameters about the future time instance, such as “a Monday in October”, and generate data about the distributed energy system for that future time instance.

[0038] The history of activated aFRR signals with fine time resolution may not be easily available. However, the activated energies for both up and down regulation and the procured capacities on an hourly level can be. As the activations may be done pro rata, the activated energy can be determined for a given bid size.
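
The pro rata reasoning above can be illustrated with a minimal sketch. It assumes that a bid receives a share of the hourly activated energy equal to its share of the procured capacity; the function name, parameter names, and units are hypothetical and not taken from the disclosure.

```python
def pro_rata_activated_energy(bid_mw, procured_capacity_mw, activated_energy_mwh):
    """Estimate the activated energy attributed to one bid for one hour.

    Assumption: activations are split pro rata, so the bid receives the
    fraction bid_mw / procured_capacity_mw of the total activated energy.
    """
    if procured_capacity_mw <= 0:
        return 0.0
    # A bid cannot receive more than the whole activated energy.
    share = min(bid_mw / procured_capacity_mw, 1.0)
    return share * activated_energy_mwh
```

For instance, under this assumption a 2 MW bid in an hour with 100 MW of procured capacity and 50 MWh of activated energy would be attributed about 1 MWh.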

[0039] The historical data can be used to simulate the performance with different models. For example, it can be observed what kind of activations there can be with the chosen model hour by hour. For example, at the beginning of a day there may be some known state of charge in the batteries of the assets. Then, hour by hour, it can be observed how the activations would affect the state of charge of the batteries. At the same time, the revenues and possible sanctions caused by lack/excess of energy can be computed. Thus, the models can be evaluated in terms of, for example, revenue and performance in general.

[0040] The historical data can be used to, for example, simulate full daily profiles by using historical days and observing hour by hour the likelihood of various events with certain models. However, there may only be a certain amount of historical data available.

[0041] Alternatively or additionally, the historical data can be used by, for example, taking the historical data and grouping the historical data into hourly datasets, such as data for each of the 24 hours of a day. Then, simulation can be performed hour by hour by sampling from the historical data for each hour of the day. This approach can be refined by fitting distributions to the hourly data and that way increasing the amount of data available. However, this can break the correlation between consecutive hours.
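
A minimal sketch of the hourly grouping and sampling described above is given below. The record layout (pairs of hour of day and observed value) and the function names are assumptions made for illustration.

```python
import random
from collections import defaultdict


def group_by_hour(records):
    """Group (hour_of_day, value) records into one dataset per hour."""
    by_hour = defaultdict(list)
    for hour, value in records:
        by_hour[hour].append(value)
    return by_hour


def sample_day(by_hour, rng):
    """Build one synthetic day by sampling each hour independently.

    Because every hour is drawn on its own, any correlation between
    consecutive hours in the historical data is lost.
    """
    return [rng.choice(by_hour[hour]) for hour in range(24)]
```

Calling `sample_day(group_by_hour(records), random.Random(0))` repeatedly with different seeds yields as many synthetic days as needed, at the cost of the broken hour-to-hour correlation noted above.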

[0042] The data simulator can generate future data for future distributed energy system conditions using the historical data by, for example, utilising the hourly sampling above. Alternatively or additionally, the data simulator can, for example, create Markov chains based on the historical data and generate the future data using the Markov chains. This can, for example, preserve some of the correlation between hours.
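
The Markov chain idea can be sketched as a first-order chain over discretised hourly states. The state labels and the choice of a first-order chain are assumptions for illustration; the disclosure does not specify the chain's structure.

```python
import random
from collections import defaultdict


def fit_markov_chain(states):
    """Count transitions between consecutive historical states."""
    counts = defaultdict(lambda: defaultdict(int))
    for current, following in zip(states, states[1:]):
        counts[current][following] += 1
    return {state: dict(successors) for state, successors in counts.items()}


def generate_sequence(chain, start, length, rng):
    """Generate a state sequence by sampling successors by observed frequency."""
    sequence = [start]
    while len(sequence) < length:
        successors = chain.get(sequence[-1])
        if not successors:
            break  # no observed successor for this state
        options, weights = zip(*successors.items())
        sequence.append(rng.choices(options, weights=weights, k=1)[0])
    return sequence
```

Since each generated hour depends on the previous one, a sequence produced this way retains the hour-to-hour dependence that independent hourly sampling discards.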

[0043] Alternatively or additionally, the data simulator can generate future data for future distributed energy system conditions using other approaches, such as generative adversarial networks (GAN), variational autoencoder (VAE) approaches, and/or deep learning. These models can be trained on the historical data and then used to generate future data for future distributed energy system conditions.

[0044] The method 100 may further comprise generating 103 a plurality of new data for new distributed energy system conditions using the trained data simulator.

[0045] The new distributed energy system conditions may correspond to, for example, energy system conditions during a specific day. For example, the method 100 may be used each day to train at least one model for controlling asset allocation in the distributed energy system for that day.

[0046] The method 100 may further comprise training 104 at least one model for controlling asset allocation in the distributed energy system in the new distributed energy system conditions using the plurality of new data.

[0047] In order to simulate new data for the distributed energy system conditions, the trained data simulator can be provided with summary statistical information about the new distributed energy system conditions. The data used in training the at least one model should come from the same distribution as indicated by the summary statistical information.

[0048] The method 100 can utilise learnings from the generated plurality of new data for the training of the at least one model.

[0049] Distributed energy storage (DES) and/or another virtual power plant solution can control thousands of battery units at different sites/nodes. Many asset allocation decisions, such as how much capacity to offer for the aFRR market, need to be done in advance, typically approximately one day before the operations.

[0050] For example, an asset allocation can state the following and similar bids for all hours of the day:

- 12.00: aFRR up regulation bid 1 megawatt (MW), load shift up bid 1MW
- 13.00: aFRR down regulation bid 2MW, load shift up bid 2MW
- 14.00: aFRR down regulation bid 3MW, load shift down bid 1MW
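
One possible in-memory representation of such an allocation is sketched below. The class and field names are hypothetical; only the three example bids come from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HourlyBid:
    hour: int                  # hour of day at which the bid applies
    afrr_direction: str        # "up" or "down" regulation
    afrr_mw: float             # aFRR bid size in MW
    load_shift_direction: str  # "up" or "down"
    load_shift_mw: float       # load shift bid size in MW


# The three example hours from the list above.
plan = [
    HourlyBid(12, "up", 1.0, "up", 1.0),
    HourlyBid(13, "down", 2.0, "up", 2.0),
    HourlyBid(14, "down", 3.0, "down", 1.0),
]


def total_afrr_mw(bids, direction):
    """Sum the aFRR bid sizes in one regulation direction."""
    return sum(b.afrr_mw for b in bids if b.afrr_direction == direction)
```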

[0051] Asset allocation optimization is not deterministic, since, for example, not all price points are known in advance. Further, activation of allocated capacity can be stochastic, since the amount of activated assets can be controlled by the transmission system operator, not the asset owner/operator.

[0052] For example, aFRR activations can be based on the grid conditions at that point in time, which cannot be precisely forecasted.

[0053] One approach to handle this stochasticity could be to simulate a large number of outcomes for each asset allocation plan and then select a plan that has desirable characteristics. However, if this is done from scratch for every round of asset allocation, it can become computationally inefficient.

[0054] The method 100 can, for example, cope with the stochasticity of asset allocation in a distributed energy system in an effective manner.

[0055] The method 100 may be able to efficiently adapt to new circumstances, such as new expected price points or different patterns of asset activation.

[0056] Training a model by simply using historical data as-is may not be optimal, as the controller may overlearn specific historical patterns that are not generalizable for future events.

[0057] The method 100 can use historical data to provide examples of distributions of data in given market/environment conditions. This historical distribution can be used to sample data for the data simulator. The sampling may comprise, for example, Monte Carlo sampling of one or multiple variables. The data simulator may comprise, for example, a synthetic data generator which recreates the stochastic elements.
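
As a rough illustration of Monte Carlo sampling of a single variable, the sketch below fits a normal distribution to historical values and draws new scenarios from it. The normal distribution is purely an assumption for the example; the disclosure does not state which distribution is fitted, and the function name is hypothetical.

```python
import random
import statistics


def monte_carlo_scenarios(historical_values, n_scenarios, rng):
    """Draw synthetic values from a normal distribution fitted to history.

    Assumption: the variable is approximately normally distributed.
    """
    mean = statistics.mean(historical_values)
    stdev = statistics.stdev(historical_values)
    return [rng.gauss(mean, stdev) for _ in range(n_scenarios)]
```

With a seeded generator such as `random.Random(0)`, the same set of scenarios can be reproduced across training runs.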

[0058] Using the method 100, the historical data can be used to provide a large simulation capacity of future scenarios with generalization capability.

[0059] For example, the method 100 may, instead of using the historical data as-is, use the historical data to recreate, for example, one market day many times using the trained data simulator. The at least one model can be trained based on the simulated output rather than from data from a single historical data point.

[0060] An activation signal may be provided by, for example, a grid operator. When working in national frequency reserve markets, the grid operator can require each participant to deliver a selected amount of frequency balancing capacity for the market during the time of resource activation. The activated frequency balancing capacity is usually not allowed to fluctuate significantly from its intended setpoint, and a participant can be sanctioned if it is not able to deliver steady frequency balancing capacity for the market.

[0061] Fig. 2 illustrates a schematic representation of model training according to an embodiment.

[0062] According to an embodiment, the training 104 the at least one model for controlling asset allocation in the distributed energy system in the new distributed energy system conditions using the plurality of new data comprises: obtaining at least one seed model from a database 201 based at least on the new distributed energy system conditions; and using the at least one seed model as a starting point for training the at least one model for controlling asset allocation in the distributed energy system in the new distributed energy system conditions.

[0063] The database 201 may comprise, for example, known good models as seed models. The at least one seed model can be used as a baseline for training the at least one model. Training the at least one model can comprise using at least one known good model as a seed for iteration. The iteration can be performed by, for example, a reinforcement learning (RL) agent 202. The RL agent 202 can evolve further when new models are created.

[0064] For example, a record of "winning" asset allocation strategies and their environmental conditions can be stored in the database 201, and training for new environmental conditions, such as a new market day, can start by being seeded by known good solutions to similar conditions.

[0065] The obtaining at least one seed model from the database 201 based at least on the new distributed energy system conditions may comprise, for example, finding at least one seed model in the database 201 that matches the new distributed energy system conditions as closely as possible.
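
One simple way to interpret "matches as closely as possible" is a nearest-neighbour search over numeric condition statistics. The sketch below assumes that each stored seed model carries a vector of such statistics and that Euclidean distance is a reasonable similarity measure; both are assumptions, as is the function name.

```python
import math


def closest_seed_models(database, new_conditions, k=1):
    """Return the ids of the k seed models with the most similar conditions.

    database: list of (model_id, condition_vector) pairs.
    new_conditions: condition vector of the same length.
    Assumption: the features are already on comparable scales.
    """
    def distance(entry):
        _, conditions = entry
        return math.dist(conditions, new_conditions)

    ranked = sorted(database, key=distance)
    return [model_id for model_id, _ in ranked[:k]]
```

In practice the features would likely need normalising first, since, for example, a price in EUR/MWh and a capacity in MW are not directly comparable.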

[0066] According to an embodiment, the training 104 the at least one model for controlling asset allocation in the distributed energy system in the new distributed energy system conditions comprises reinforcement learning.

[0067] In reinforcement learning, an RL agent/controller 202 can learn to achieve objectives in simulated, including stochastic, environments.

[0068] The obtaining at least one seed model from a database based at least on the new distributed energy system conditions may comprise, for example, analysing the new distributed energy system conditions and searching the database 201 for models applicable for earlier conditions similar to the new distributed energy system conditions. For example, the searching may comprise searching for Mondays in October. The analysing the new distributed energy system conditions may comprise, for example, forming statistical information about the new distributed energy system conditions.

[0069] According to an embodiment, each seed model of the at least one seed model comprises at least one asset allocation plan for the distributed energy system, energy and/or power characteristics of at least one asset of the at least one asset allocation plan, statistical information about distributed energy system conditions at a time of execution of the at least one asset allocation plan, and/or an outcome probability distribution of the at least one asset allocation plan.

[0070] The at least one asset allocation plan for the distributed energy system may comprise, for example, a plurality of actions and at which times these actions are to be performed.

[0071] For example, an asset allocation plan can indicate that at 8.00 on a specific day, a 1MW aFRR up/down regulation bid and a 2MW load shift up/down bid should be made.

[0072] The energy and/or power characteristics of at least one asset of the at least one asset allocation plan may comprise, for example, aggregate characteristics of the overall capacity being allocated. For example, 30 MWh of usable energy storage, 10 MW average adjustable power, 30%:70% average up:down regulation power split.

[0073] An outcome probability distribution can describe how likely different outcomes are for a specific asset allocation plan. For example, an outcome can comprise whether the allocation plan failed to deliver promised capacity. This may occur due to, for example, empty/full batteries at an allocated node or nodes. The training 104 of the at least one model can try to find the optimal balance between operational hours and failure occurrences by, for example, maximizing operational hours while minimizing the failures. This can also maximize other variables, such as revenue.

[0074] According to an embodiment, the statistical information about the distributed energy system conditions at the time of execution of the at least one asset allocation plan comprises at least one of: distributed energy system conditions before the time of execution of the allocation plan, prices before and/or during the time of execution of the at least one asset allocation plan, and/or frequency restoration reserves activation request signals before and/or during the time of execution of the at least one asset allocation plan.

[0075] The prices before and/or during the time of execution of the at least one asset allocation plan may comprise, for example, electricity prices, such as day-ahead and/or intraday market prices, the capacity prices on the reserve markets, such as for aFRR up regulation and aFRR down regulation, and/or energy price in the aFRR market.

[0076] The distributed energy system conditions before the time of execution of the allocation plan may comprise, for example, statistical information of the prevailing market conditions at the time of plan execution and/or past market conditions.

[0077] The statistical information about the distributed energy system conditions may be available in various time granularities. For example, the market may be operated hourly or with a 15-min resolution. Longer trends in the data may also be considered.

[0078] According to an embodiment, the outcome probability distribution of the at least one asset allocation plan is based on a plurality of simulations and/or an observed outcome probability distribution of the at least one asset allocation plan.

[0079] The outcome probability distribution of the at least one asset allocation plan may be based on a plurality of simulations by, for example, simulating the at least one asset allocation plan a plurality of times and observing the resulting outcome probability distributions.

[0080] The observed outcome probability distribution of the at least one asset allocation plan may comprise a probability distribution that is observed when the at least one asset allocation plan is executed.

[0081] In some embodiments, the at least one asset allocation plan may comprise a plurality of asset allocation plans. The outcome probability distribution of each asset allocation plan can be based on a plurality of simulations and/or an observed outcome probability distribution of the asset allocation plan.

[0082] The outcome of an asset allocation plan may not be a single value but an outcome probability distribution. An asset allocation plan may be simulated a plurality of times with the same input parameters. The stochastic nature of the simulation can result in different results for every simulation. The method 100 can analyse the distribution of an outcome, such as revenue, and determine, using heuristics, which distribution is the best one. The best distribution may not be only the one with the maximal expected outcome; the probability of getting a rather high outcome or a low risk of getting a very small outcome may also be taken into account.
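
A heuristic of this kind could, for example, combine expected outcome with a cap on the probability of a very small outcome. The sketch below is one such heuristic under stated assumptions: the threshold parameters, the fallback rule, and the function names are hypothetical and are not the disclosed selection criteria.

```python
import statistics


def summarize(outcomes, low_threshold):
    """Summarise simulated outcomes of one plan."""
    return {
        "expected": statistics.mean(outcomes),
        # Fraction of simulations that ended below the threshold.
        "p_low": sum(1 for x in outcomes if x < low_threshold) / len(outcomes),
    }


def select_plan(simulated, low_threshold, max_p_low):
    """Pick the plan with the highest expected outcome among acceptable ones.

    simulated: dict mapping plan id to a list of simulated outcomes.
    A plan is acceptable if its probability of a low outcome does not
    exceed max_p_low; if none is acceptable, all plans are considered.
    """
    summaries = {pid: summarize(o, low_threshold) for pid, o in simulated.items()}
    acceptable = {pid: s for pid, s in summaries.items() if s["p_low"] <= max_p_low}
    candidates = acceptable or summaries
    return max(candidates, key=lambda pid: candidates[pid]["expected"])
```

With such a rule, a plan with a higher mean but frequent near-zero outcomes can lose to a steadier plan, which mirrors the point that the best distribution is not necessarily the one with the maximal expected outcome.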

[0083] The method 100 may further comprise reducing a number of models in the at least one model. For example, the number of models in the at least one model can be reduced while maximizing the diversity of remaining models. For example, the method 100 may fetch models that are very different while the conditions are the same. This can enable the RL agent 202 to explore a larger state space more easily.
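
One common way to reduce a set while keeping it diverse is greedy farthest-point selection. The sketch below uses that technique as a stand-in; the disclosure does not say how diversity is maximized, and representing each model as a numeric vector is an assumption.

```python
import math


def select_diverse(models, k):
    """Greedily keep k models that are far apart from each other.

    models: dict mapping model id to a numeric vector describing the model.
    Starts from the first model and repeatedly adds the model whose
    distance to its nearest already-chosen model is largest.
    """
    ids = list(models)
    if k <= 0:
        return []
    if k >= len(ids):
        return ids
    chosen = [ids[0]]
    while len(chosen) < k:
        best = max(
            (m for m in ids if m not in chosen),
            key=lambda m: min(math.dist(models[m], models[c]) for c in chosen),
        )
        chosen.append(best)
    return chosen
```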

[0084] According to an embodiment, the method 100 further comprises scaling the at least one seed model according to a capacity characteristic of the at least one seed model and a capacity characteristic of the new distributed energy system conditions.

[0085] The capacity characteristic may refer to energy and/or power capacity of one or more assets. The capacity characteristic may comprise, for example, at least one of the following: average up regulation power (W), average down regulation power (W) and/or average energy storage capacity (Wh). For example, a seed model in the at least one seed model may comprise an asset allocation plan for assets having a specific amount of power capacity and/or energy capacity to be used for power grid frequency balancing. On the other hand, the new distributed energy system conditions may require a different amount of power capacity and/or energy capacity. Thus, it may be beneficial to scale the seed model to better match the new distributed energy system conditions.

[0086] For example, the remaining models, after reducing the number of models in the at least one model, can be scaled according to capacity characteristics to approximately match the capacity characteristics of the system for which the new model will be trained.

[0087] Scaling can be advantageous if, for example, models from significantly different systems are combined. For example, one system could be a 50 MWh, 20 MW system, and another could be a 10 MWh, 2 MW system. With scaling, they can be made comparable. If a third system were a 10 MWh, 30 MW system, it would probably be so different that models relevant for that system are not useful seeds for the first two systems. Selecting the optimal seed model may be based on detecting which available model, when scaled, is closest to the new system in at least one capacity characteristic. For example, a seed model of 50 MWh/20 MW can be an optimal seed model for a new system of 100 MWh/40 MW, whereas a seed model of 10 MWh/2 MW can be an optimal seed model for a new system of 20 MWh/4 MW.
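As a non-limiting illustration, the scaling and seed selection described above could be sketched as follows. The sketch assumes linear scaling on power capacity followed by a comparison of the scaled energy capacity; the `Seed` data structure, its `plan_mw` field and the function names are hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass

@dataclass
class Seed:
    name: str
    energy_mwh: float
    power_mw: float
    plan_mw: list  # hypothetical: capacity offered per hour

def scale_seed(seed, target_power_mw):
    """Scale a seed linearly so that its power capacity matches the target."""
    f = target_power_mw / seed.power_mw
    return Seed(seed.name, seed.energy_mwh * f, target_power_mw,
                [p * f for p in seed.plan_mw])

def pick_seed(seeds, target_energy_mwh, target_power_mw):
    """Pick the seed whose scaled energy capacity is closest to the target."""
    scaled = [scale_seed(s, target_power_mw) for s in seeds]
    return min(scaled, key=lambda s: abs(s.energy_mwh - target_energy_mwh))

seeds = [
    Seed("50MWh/20MW", 50.0, 20.0, [10.0, 20.0]),
    Seed("10MWh/2MW", 10.0, 2.0, [1.0, 2.0]),
]
print(pick_seed(seeds, 100.0, 40.0).name)  # 50MWh/20MW
print(pick_seed(seeds, 20.0, 4.0).name)    # 10MWh/2MW
```

Under these assumptions, a 10 MWh/30 MW system scaled to 40 MW would have only about 13 MWh of energy capacity, which is far from both targets above and thus reflects why such a system may not provide useful seeds.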

[0088] The training of new models can start from the at least one seed model, after possible scaling.

[0089] According to an embodiment, the training 104 the at least one model for controlling asset allocation in the distributed energy system in the new distributed energy system conditions using the plurality of new data further comprises: training a plurality of models for the new distributed energy system conditions by using the at least one seed model as a starting point and by simulating each model in plurality of models a plurality of times, thus obtaining a plurality of trained models; recording an outcome probability distribution of each trained model in the plurality of trained models; and selecting a trained model, from the plurality of trained models, for execution based at least on the outcome probability distribution of each trained model in the plurality of trained models and at least one selection criteria.

[0090] During the training, the new model candidates can be simulated many times. The outcome probability distribution of each model can be recorded. The outcome probability distributions can be compared to select one or several models that have superior characteristics in some desirability criteria, such as the highest expected value or the least likelihood of unacceptably low performance.

[0091] The outcome of the executed plan can also be recorded.
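As a non-limiting illustration, the steps of training a plurality of models from a seed, simulating each model a plurality of times, recording the outcome probability distributions and selecting a model could be sketched as follows. The actual training is replaced here by a random perturbation of the seed plan, which is only a placeholder and not reinforcement learning; the simulator, the penalty, the risk limit and all names are hypothetical assumptions made for this example.

```python
import random
import statistics

def train_candidate(seed_plan, rng):
    """Placeholder for training: perturb the seed plan (not actual RL)."""
    return [max(0.0, p + rng.gauss(0.0, 1.0)) for p in seed_plan]

def simulate(plan, rng):
    """Hypothetical stochastic outcome: revenue with a penalty for over-offering."""
    revenue = 0.0
    for offered in plan:
        available = rng.uniform(3.0, 7.0)  # capacity actually deliverable
        price = rng.uniform(8.0, 12.0)
        revenue += price * min(offered, available)
        revenue -= 30.0 * max(0.0, offered - available)  # shortfall penalty
    return revenue

def select_model(seed_plan, n_models=20, n_runs=200, floor=0.0,
                 max_risk=0.05, seed=0):
    rng = random.Random(seed)
    records = []
    for _ in range(n_models):
        plan = train_candidate(seed_plan, rng)
        # Recorded outcome distribution of this candidate.
        outcomes = [simulate(plan, rng) for _ in range(n_runs)]
        risk = sum(o < floor for o in outcomes) / n_runs
        records.append({"plan": plan, "outcomes": outcomes,
                        "mean": statistics.fmean(outcomes), "risk": risk})
    # Selection criteria: acceptable risk first, then highest expected value.
    acceptable = [r for r in records if r["risk"] <= max_risk] or records
    return max(acceptable, key=lambda r: r["mean"]), records

best, records = select_model([4.0, 4.0, 4.0])
print(best["mean"], best["risk"])
```

The full list `records` is returned in addition to the selected model, so that non-selected candidates and their recorded distributions remain available, for example for storage as future seeds.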

[0092] According to an embodiment, the method 100 further comprises storing at least the selected trained model in the database 201.

[0093] According to an embodiment, the method 100 further comprises storing at least one non-selected trained model, from the plurality of trained models, in the database 201 based at least on the outcome probability distribution of each trained model in the plurality of trained models and the at least one selection criteria.

[0094] One of the models can be selected for execution. The selected model, and optionally other possibly superior but different models, can be recorded in the database 201 to be used as future seeds.

[0095] In some embodiments, the storing at least one non-selected trained model may comprise storing at least one bad model, i.e. a model that does not comply with the at least one selection criteria. The information comprised in the at least one bad model can be used to ensure that bad models do not influence the seed model. For example, an extreme corner case of a bad model may comprise only discharging or only charging assets for a whole day. By storing such bad models, it can be checked that a seed model does not resemble such a known bad model.
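As a non-limiting illustration, a check that a seed model does not resemble a known bad model could be sketched as follows. The representation of a plan as 24 hourly power values (positive for discharging, negative for charging), the use of cosine similarity and the threshold of 0.9 are assumptions made for this example only.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length plans; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def resembles_bad(plan, bad_plans, threshold=0.9):
    """True if the plan is too similar to any stored bad plan."""
    return any(cosine_similarity(plan, bad) >= threshold for bad in bad_plans)

HOURS = 24
# Extreme corner cases: only discharging or only charging for a whole day.
bad_plans = [[1.0] * HOURS, [-1.0] * HOURS]

balanced = [1.0 if h % 2 == 0 else -1.0 for h in range(HOURS)]
mostly_discharge = [1.0] * 23 + [-0.5]

print(resembles_bad(balanced, bad_plans))          # False
print(resembles_bad(mostly_discharge, bad_plans))  # True
```

A seed candidate for which `resembles_bad` returns True could then be excluded before it is used as a starting point for training.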

[0096] According to an embodiment, the training 104 the at least one model for controlling asset allocation in the distributed energy system in the new distributed energy system conditions allows a human to adjust the training.

[0097] The human can, for example, guide the training of the model with various types of interaction. For example, the human can rate the models generated during the training and steer the learning in some direction with those ratings. Alternatively or additionally, the user may guide the training with "rather good plans" as starting points. This may also be referred to as reinforcement learning with human advice or reinforcement learning with human feedback. For example, reinforcement learning with human advice/feedback may comprise utilising a known good plan, made by, for example, a human or some other method, to guide the RL agent in a certain direction in the training. Alternatively or additionally, the known good plan could be used as a starting point to speed up the training process.

[0098] Further, a human may adjust the trained models after the training by the method 100.

[0099] According to an embodiment, the training 104 the at least one model for controlling asset allocation in the distributed energy system in the new distributed energy system conditions using the plurality of new data comprises training the at least one model for controlling asset allocation in the distributed energy system in a plurality of distributed energy system conditions.

[0100] The at least one model trained for controlling asset allocation in the distributed energy system in a plurality of distributed energy system conditions can learn a strategy for all conditions in the plurality of distributed energy system conditions and be supported to do so by getting statistical information about the conditions.

[0101] If the state space is small enough, it can be practical to train a model for the plurality of distributed energy system conditions, since only a small number of combinations needs to be tried and a local or the global optimum can be found. In very large state spaces, the global optimum may be challenging to find, but the training may still find good enough gradients to quickly find a rather good solution.

[0102] Fig. 3 illustrates a plot representation of an outcome probability distribution according to an embodiment.

[0103] The embodiment of Fig. 3 illustrates two outcome probability distributions of a model. In one of the outcome probability distributions, the outcome corresponds to revenue, and in the other outcome probability distribution, the outcome corresponds to a number of state of charge (SOC) violations. A SOC violation refers to a situation where the allocated assets do not have sufficient charge to deliver the promised capacity.
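As a non-limiting illustration, counting SOC violations in one simulated run could be sketched as follows. The sketch is a simplification under stated assumptions: only up regulation (discharging) is modelled, activations are given per time step, and a violating step is assumed to deliver whatever energy is left. The function and parameter names are hypothetical.

```python
def count_soc_violations(stored_mwh, promised_mw, activated, step_h=1.0):
    """Count steps where stored energy cannot cover an activated promise.

    Assumptions: discharge only; a violating step empties the storage.
    """
    violations = 0
    for promised, is_active in zip(promised_mw, activated):
        if not is_active:
            continue
        needed_mwh = promised * step_h
        if stored_mwh < needed_mwh:
            violations += 1
            stored_mwh = 0.0
        else:
            stored_mwh -= needed_mwh
    return violations

# 5 MWh stored, 2 MW promised for four hours, activated in hours 1, 2 and 4.
print(count_soc_violations(5.0, [2.0, 2.0, 2.0, 2.0],
                           [True, True, False, True]))  # 1
```

Repeating such a run over many simulated activation sequences would yield a distribution of the number of SOC violations for a given model.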

[0104] In some embodiments, the outcome probability distribution may comprise a plurality of outcome probability distributions. For example, in the embodiment of Fig. 3, two outcome probability distributions in terms of two different outcomes are illustrated. The model for execution can be chosen based on any number of outcome probability distributions.

[0105] The model corresponding to the outcome probability distributions illustrated in the embodiment of Fig. 3 may be considered a well performing model. The average revenue is high and SOC violations are unlikely.

[0106] The distribution illustrated in the embodiment of Fig. 3 may represent a desirable outcome probability distribution. There is a good probability that the real outcome is quite close to the maximum outcome.

[0107] Fig. 4 illustrates a plot representation of an outcome probability distribution according to another embodiment.

[0108] The embodiment of Fig. 4 illustrates two outcome probability distributions of a model. In one of the outcome probability distributions, the outcome corresponds to revenue, and in the other outcome probability distribution, the outcome corresponds to a number of state of charge (SOC) violations.

[0109] The model corresponding to the outcome probability distributions illustrated in the embodiment of Fig. 4 may be considered a less well performing model. The average revenue is lower and SOC violations are more likely than, for example, with the model of Fig. 3.

[0110] Out of the embodiments of Fig. 3 and Fig. 4, it may be better to pick the model resulting in the embodiment of Fig. 3 because it will give a good outcome with a high probability. The model of the embodiment of Fig. 4 can give an even higher outcome, but only on rare occasions.
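As a non-limiting illustration, a comparison of two models based on two outcome probability distributions each (revenue and number of SOC violations) could be sketched as follows. The sample values are invented for this example and are not read from Fig. 3 or Fig. 4; the violation limit and the use of the median revenue are likewise assumptions.

```python
import statistics

def profile(revenues, soc_violations):
    """Summarise two outcome distributions of one model."""
    return {
        "median_revenue": statistics.median(revenues),
        "max_revenue": max(revenues),
        "p_violation": sum(v > 0 for v in soc_violations) / len(soc_violations),
    }

def prefer(a, b, max_p_violation=0.2):
    """Prefer the model that meets the violation limit; else compare median revenue."""
    a_ok = a["p_violation"] <= max_p_violation
    b_ok = b["p_violation"] <= max_p_violation
    if a_ok != b_ok:
        return "a" if a_ok else "b"
    return "a" if a["median_revenue"] >= b["median_revenue"] else "b"

# Invented samples for illustration only.
steady = profile([90, 95, 96, 97, 98, 99, 100, 100], [0, 0, 0, 0, 0, 0, 0, 1])
spiky = profile([20, 30, 35, 40, 45, 50, 60, 130], [0, 2, 3, 0, 1, 4, 0, 2])

print(prefer(steady, spiky))  # a
```

In this example the second model reaches the highest single revenue, but the first model is preferred because it gives a good outcome with a high probability and rarely violates the SOC limits.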

[0111] Alternatively or additionally, a good model can also be chosen using various other criteria. For example, a good model may leave the SOC in a proper state for the next day. Even if the outcome for one day is good, the model should not leave the SOC at, for example, 99% or 1% for the next day, since there would then be insufficient capacity for the next day. For example, the model should aim to keep the SOC in a reasonable range, such as 40%-80%, for the next day.
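As a non-limiting illustration, a check of the SOC left for the next day could be sketched as follows. The sketch assumes one-hour time steps, ignores losses and clips the SOC to the range 0 to 1; the sign convention (positive for discharging) and the function names are assumptions made for this example, while the 40%-80% range is the example range mentioned above.

```python
def end_of_day_soc(initial_soc, hourly_power_mw, energy_capacity_mwh):
    """SOC (0..1) after applying hourly power; positive = discharge, negative = charge."""
    soc = initial_soc
    for p in hourly_power_mw:
        soc -= p * 1.0 / energy_capacity_mwh  # 1 h steps, losses ignored
        soc = min(1.0, max(0.0, soc))         # clip to physical limits
    return soc

def leaves_soc_in_range(soc, low=0.40, high=0.80):
    """True if the SOC left for the next day is within the desired range."""
    return low <= soc <= high

# Hypothetical 10 MWh system starting the day at 60 % SOC.
ok_plan = [1.0, -1.0, 1.0, -1.0]       # ends near 60 %
draining_plan = [2.0, 2.0, 1.5, 0.4]   # ends near 1 %

print(leaves_soc_in_range(end_of_day_soc(0.6, ok_plan, 10.0)))        # True
print(leaves_soc_in_range(end_of_day_soc(0.6, draining_plan, 10.0)))  # False
```

Such a check could be applied as an additional selection criterion alongside the outcome probability distributions.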

[0112] Fig. 5 illustrates a schematic representation of a computing device according to an embodiment.

[0113] According to an embodiment, a computing device 500 comprises at least one processor 501 and at least one memory 502 including computer program code, the at least one memory 502 and the computer program code configured to, with the at least one processor 501, cause the computing device 500 to perform the method 100.

[0114] The computing device 500 may comprise at least one processor 501. The at least one processor 501 may comprise, for example, one or more of various processing devices, such as a co-processor, a microprocessor, a digital signal processor (DSP), a processing circuitry with or without an accompanying DSP, or various other processing devices including integrated circuits such as, for example, an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), a microcontroller unit (MCU), a hardware accelerator, a special-purpose computer chip, or the like.

[0115] The computing device 500 may further comprise a memory 502. The memory 502 may be configured to store, for example, computer programs and the like. The memory 502 may comprise one or more volatile memory devices, one or more non-volatile memory devices, and/or a combination of one or more volatile memory devices and non-volatile memory devices. For example, the memory 502 may be embodied as magnetic storage devices (such as hard disk drives, magnetic tapes, etc.), optical magnetic storage devices, and semiconductor memories (such as mask ROM, PROM (programmable ROM), EPROM (erasable PROM), flash ROM, RAM (random access memory), etc.).

[0116] The computing device 500 may further comprise other components not illustrated in the embodiment of Fig. 5. The computing device 500 may comprise, for example, an input/output bus for connecting the computing device 500 to other devices.

[0117] When the computing device 500 is configured to implement some functionality, some component and/or components of the computing device 500, such as the at least one processor 501 and/or the memory 502, may be configured to implement this functionality. Furthermore, when the at least one processor 501 is configured to implement some functionality, this functionality may be implemented using program code comprised, for example, in the memory.

[0118] The computing device 500 may be implemented at least partially using, for example, a computer, some other computing device, or similar.

[0119] Any range or device value given herein may be extended or altered without losing the effect sought. Also any embodiment may be combined with another embodiment unless explicitly disallowed.

[0120] Although the subject matter has been described in language specific to structural features and/or acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as examples of implementing the claims and other equivalent features and acts are intended to be within the scope of the claims.

[0121] It will be understood that the benefits and advantages described above may relate to one embodiment or may relate to several embodiments. The embodiments are not limited to those that solve any or all of the stated problems or those that have any or all of the stated benefits and advantages. It will further be understood that reference to 'an' item may refer to one or more of those items.

[0122] The steps of the methods described herein may be carried out in any suitable order, or simultaneously where appropriate. Additionally, individual blocks may be deleted from any of the methods without departing from the spirit and scope of the subject matter described herein. Aspects of any of the embodiments described above may be combined with aspects of any of the other embodiments described to form further embodiments without losing the effect sought.

[0123] The term 'comprising' is used herein to mean including the method, blocks or elements identified, but that such blocks or elements do not comprise an exclusive list and a method or apparatus may contain additional blocks or elements.


[0124] It will be understood that the above description is given by way of example only and that various modifications may be made by those skilled in the art. The above specification, examples and data provide a complete description of the structure and use of exemplary embodiments. Although various embodiments have been described above with a certain degree of particularity, or with reference to one or more individual embodiments, those skilled in the art could make numerous alterations to the disclosed embodiments without departing from the spirit or scope of this specification.


Claims

CLAIMS:

1. A computer-implemented method (100) for training at least one model for controlling asset allocation in a distributed energy system, the method (100) comprising: obtaining (101) historical data about the distributed energy system; training (102) a data simulator to generate future data for future distributed energy system conditions using the historical data; generating (103) a plurality of new data for new distributed energy system conditions using the trained data simulator; and training (104) at least one model for controlling asset allocation in the distributed energy system in the new distributed energy system conditions using the plurality of new data.

2. The computer-implemented method (100) according to claim 1, wherein the training (104) the at least one model for controlling asset allocation in the distributed energy system in the new distributed energy system conditions using the plurality of new data comprises: obtaining at least one seed model from a database based at least on the new distributed energy system conditions; and using the at least one seed model as a starting point for training the at least one model for controlling asset allocation in the distributed energy system in the new distributed energy system conditions.

3. The computer-implemented method (100) according to claim 2, wherein each seed model of the at least one seed model comprises at least one asset allocation plan for the distributed energy system, energy and/or power characteristics of at least one asset of the at least one asset allocation plan, statistical information about distributed energy system conditions at a time of execution of the at least one asset allocation plan, and/or an outcome probability distribution of the at least one asset allocation plan.

4. The computer-implemented method (100) according to claim 3, wherein the statistical information about the distributed energy system conditions at the time of execution of the at least one asset allocation plan comprises at least one of: distributed energy system conditions before the time of execution of the allocation plan, prices before and/or during the time of execution of the at least one asset allocation plan, and/or frequency restoration reserves activation request signals before and/or during the time of execution of the at least one asset allocation plan.

5. The computer-implemented method (100) according to claim 3 or claim 4, wherein the outcome probability distribution of the at least one asset allocation plan is based on a plurality of simulations and/or an observed outcome probability distribution of the at least one asset allocation plan.

6. The computer-implemented method (100) according to any of claims 2 - 5, the method further comprising scaling the at least one seed model according to a capacity characteristic of the at least one seed model and a capacity characteristic of the new distributed energy system conditions.

7. The computer-implemented method (100) according to any of claims 2 - 6, wherein the training the at least one model for controlling asset allocation in the distributed energy system in the new distributed energy system conditions using the plurality of new data further comprises: training a plurality of models for the new distributed energy system conditions by using the at least one seed model as a starting point and by simulating each model in plurality of models a plurality of times, thus obtaining a plurality of trained models; recording an outcome probability distribution of each trained model in the plurality of trained models; and selecting a trained model, from the plurality of trained models, for execution based at least on the outcome probability distribution of each trained model in the plurality of trained models and at least one selection criteria.

8. The computer-implemented method (100) according to claim 7, the method further comprising storing at least the selected trained model in the database.

9. The computer-implemented method (100) according to claim 8, the method further comprising storing at least one non-selected trained model, from the plurality of trained models, in the database based at least on the outcome probability distribution of each trained model in the plurality of trained models and the at least one selection criteria.

10. The computer-implemented method (100) according to claim 1, wherein the training the at least one model for controlling asset allocation in the distributed energy system in the new distributed energy system conditions using the plurality of new data comprises training the at least one model for controlling asset allocation in the distributed energy system in a plurality of distributed energy system conditions.

11. The computer-implemented method (100) according to any preceding claim, wherein the training the at least one model for controlling asset allocation in the distributed energy system in the new distributed energy system conditions allows a human to adjust the training.

12. The computer-implemented method (100) according to any preceding claim, wherein the training the at least one model for controlling asset allocation in the distributed energy system in the new distributed energy system conditions comprises reinforcement learning.

13. A computing device, comprising at least one processor and at least one memory including computer program code, the at least one memory and the computer program code configured to, with the at least one processor, cause the computing device to perform the method according to any preceding claim.

14. A computer program product comprising program code configured to perform the method according to any of claims 1 - 12 when the computer program product is executed on a computer.