Lee et al., 2025 - Google Patents
Constrained Optimization Formulation of Bellman Optimality Equation for Online Reinforcement Learning
- Document ID
- 12631321522424602759
- Author
- Lee H
- Choi K
- Publication year
- 2025
- Publication venue
- Authorea Preprints
Snippet
This paper proposes an online reinforcement learning algorithm that directly solves the Bellman optimality equation by casting it as a constrained optimization problem. Unlike policy or value iteration, which incrementally approximate the Bellman (optimality) equation …
Classifications
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/04—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
- G05B13/042—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators in which a parameter or coefficient is automatically adjusted to optimise the performance
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/0265—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion
- G05B13/027—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion using neural networks only
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| Grandia et al. | | Nonlinear model predictive control of robotic systems with control Lyapunov functions |
| Hsu et al. | | ISAACS: Iterative soft adversarial actor-critic for safety |
| Jost et al. | | Optimal and suboptimal event-triggering in linear model predictive control |
| Wang et al. | | Chance constraint robust control with control barrier functions |
| D'Jorge et al. | | Stochastic model predictive control for tracking linear systems |
| Mittal et al. | | Neural Lyapunov model predictive control |
| Nair et al. | | Stochastic MPC with dual control for autonomous driving with multi-modal interaction-aware predictions |
| Cisneros et al. | | A dissipativity formulation for stability analysis of nonlinear and parameter dependent MPC |
| Curi et al. | | Safe reinforcement learning via confidence-based filters |
| Hu et al. | | Provable sim-to-real transfer in continuous domain with partial observations |
| Lee et al. | | Constrained Optimization Formulation of Bellman Optimality Equation for Online Reinforcement Learning |
| Beckenbach et al. | | Addressing infinite-horizon optimization in MPC via Q-learning |
| Ławryńczuk | | Neural networks in model predictive control |
| Peydayesh et al. | | Neuro‐adaptive distributed output‐feedback containment control for multiagent systems with nonstrict‐feedback nonlinear dynamics and input constraints |
| Zhang et al. | | MARL with general utilities via decentralized shadow reward actor-critic |
| Patel et al. | | Conformal robust control of linear systems |
| Lu et al. | | MPC-inspired reinforcement learning for verifiable model-free control |
| Osinenko et al. | | Stacked adaptive dynamic programming with unknown system model |
| Kögel et al. | | Cooperative distributed MPC using the alternating direction multiplier method |
| Wang et al. | | Safe Navigation in Uncertain Crowded Environments Using Risk Adaptive CVaR Barrier Functions |
| Pan et al. | | Composite learning from model reference adaptive fuzzy control |
| Ou et al. | | Model predictive control of parabolic PDE systems with Dirichlet boundary conditions via Galerkin model reduction |
| Lu et al. | | Bridging the gaps: Learning verifiable model-free quadratic programming controllers inspired by model predictive control |
| Han | | Control system based on affine TS fuzzy model with uncertainty |
| Yu et al. | | A Convex Optimization Approach to Model-Free Inverse Optimal Control with Provable Convergence |