Ikemoto et al., 2021 - Google Patents
Continuous deep Q-learning with a simulator for stabilization of uncertain discrete-time systems
- Document ID
- 8182512532394342536
- Author
- Ikemoto J
- Ushio T
- Publication year
- 2021
- Publication venue
- Nonlinear Theory and Its Applications, IEICE
Snippet
A simulator that predicts the behavior of a real system is useful for reinforcement learning (RL) because we can collect experiences more efficiently than through interactions with the real system. However, in the case where there is an identification error, the experiences …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding or deleting nodes or connections, pruning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/04—Architectures, e.g. interconnection topology
- G06N3/0472—Architectures, e.g. interconnection topology using probabilistic elements, e.g. p-rams, stochastic processors
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/0265—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion
- G05B13/027—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion using neural networks only
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/0265—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion
- G05B13/0275—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion using fuzzy logic only
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/12—Computer systems based on biological models using genetic models
- G06N3/126—Genetic algorithms, i.e. information processing using digital simulations of the genetic system
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/04—Inference methods or devices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/02—Knowledge representation
- G06N5/022—Knowledge engineering, knowledge acquisition
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/04—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
- G05B13/042—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators in which a parameter or coefficient is automatically adjusted to optimise the performance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computer systems based on specific mathematical models
- G06N7/02—Computer systems based on specific mathematical models using fuzzy logic
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computer systems based on specific mathematical models
- G06N7/005—Probabilistic networks
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Li et al. | | Exponential stability analysis for delayed semi-Markovian recurrent neural networks: A homogeneous polynomial approach |
Moerland et al. | | A0c: Alpha zero in continuous action space |
Er et al. | | Online tuning of fuzzy inference systems using dynamic fuzzy Q-learning |
Lin et al. | | Reinforcement structure/parameter learning for neural-network-based fuzzy logic control systems |
Chiang et al. | | A self-learning fuzzy logic controller using genetic algorithms with reinforcements |
Yang et al. | | A novel self-constructing radial basis function neural-fuzzy system |
Kumaraswamy | | Neural networks for data classification |
Wang et al. | | A boosting-based deep neural networks algorithm for reinforcement learning |
Hein et al. | | Generating interpretable fuzzy controllers using particle swarm optimization and genetic programming |
Ikemoto et al. | | Continuous deep Q-learning with a simulator for stabilization of uncertain discrete-time systems |
Peng et al. | | Compensatory neural fuzzy network with symbiotic particle swarm optimization for temperature control |
Aoun et al. | | Particle swarm optimisation with population size and acceleration coefficients adaptation using hidden Markov model state classification |
Eqra et al. | | A novel adaptive multi-critic based separated-states neuro-fuzzy controller: Architecture and application to chaos control |
Mazumdar et al. | | A multi-armed bandit approach for online expert selection in markov decision processes |
Rubio | | Stability Analysis for an Online Evolving Neuro‐Fuzzy Recurrent Network |
Mac Parthaláin et al. | | Fuzzy-rough feature selection using flock of starlings optimisation |
Akbari et al. | | Riccati updates for online linear quadratic control |
Chen et al. | | Incremental Reinforcement Learning---a New Continuous Reinforcement Learning Frame Based on Stochastic Differential Equation methods |
Grimble et al. | | Non-linear predictive control for manufacturing and robotic applications |
Sbarbaro et al. | | Multivariable generalized minimum variance control based on artificial neural networks and Gaussian process models |
Ghorrati | | A New Adaptive Learning algorithm to train Feed-Forward Multi-layer Neural Networks, Applied on Function Approximation Problem |
Haghrah et al. | | An incremental learning-based fuzzy control scheme for a class of uncertain Euler-Lagrange systems |
Ikemoto et al. | | Networked control of nonlinear systems under partial observation using continuous deep Q-learning |
Herzallah et al. | | Bayesian adaptive control of nonlinear systems with functional uncertainty |
Chaturvedi | | Factors affecting the performance of artificial neural network models |