A journal of IEEE and CAA that publishes high-quality papers in English on original theoretical and experimental research and development in all areas of automation
Volume 7 Issue 2
Mar. 2020

IEEE/CAA Journal of Automatica Sinica

  • JCR Impact Factor: 11.8, Top 4% (SCI Q1)
    CiteScore: 17.6, Top 3% (Q1)
    Google Scholar h5-index: 77, TOP 5
Xiong Yang and Bo Zhao, "Optimal Neuro-Control Strategy for Nonlinear Systems With Asymmetric Input Constraints," IEEE/CAA J. Autom. Sinica, vol. 7, no. 2, pp. 575-583, Mar. 2020. doi: 10.1109/JAS.2020.1003063

Optimal Neuro-Control Strategy for Nonlinear Systems With Asymmetric Input Constraints

doi: 10.1109/JAS.2020.1003063
Funds: This work was supported by the National Natural Science Foundation of China (61973228, 61973330).
  • This paper presents an optimal neuro-control scheme for continuous-time (CT) nonlinear systems with asymmetric input constraints. First, we introduce a discounted cost function for the CT nonlinear systems to handle the asymmetric input constraints. We then derive the Hamilton-Jacobi-Bellman equation (HJBE) that arises in the discounted-cost optimal control problem. To obtain the optimal neurocontroller, we use a critic neural network (CNN) to solve the HJBE within a reinforcement learning framework, tuning the CNN’s weight vector by gradient descent. Using the Lyapunov method, we prove that the CNN’s weight vector and the closed-loop system are uniformly ultimately bounded. Finally, simulations of two examples verify the effectiveness of the proposed optimal neuro-control strategy.
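
The abstract names a discounted cost and an HJBE without stating them. The block below is a minimal sketch of one standard way to write such a problem, not necessarily the formulation used in the paper. Everything in it is an assumption made for illustration: a scalar input confined to [u_min, u_max], dynamics of the form ẋ = f(x) + g(x)u, a state penalty Q(x), a discount rate α > 0, and a tanh-based control penalty W(u) centered at the midpoint c of the input range.

```latex
% Assumed setting (not given in the abstract): scalar input u in [u_min, u_max],
% dynamics \dot{x} = f(x) + g(x) u, discount rate \alpha > 0.
\begin{align*}
c &= \tfrac{1}{2}(u_{\max} + u_{\min}), \qquad \beta = \tfrac{1}{2}(u_{\max} - u_{\min}), \\
V(x(t)) &= \int_t^{\infty} e^{-\alpha (s - t)} \bigl[ Q(x(s)) + W(u(s)) \bigr] \, ds, \\
W(u) &= 2\beta \int_{c}^{u} \tanh^{-1}\!\Bigl( \frac{v - c}{\beta} \Bigr) \, dv, \\
0 &= \nabla V^{*}(x)^{\top} \bigl( f(x) + g(x) u^{*} \bigr) - \alpha V^{*}(x) + Q(x) + W(u^{*}), \\
u^{*}(x) &= c - \beta \tanh\!\Bigl( \frac{1}{2\beta} \, g(x)^{\top} \nabla V^{*}(x) \Bigr).
\end{align*}
```

Under these assumptions, u* stays inside (u_min, u_max) because tanh takes values in (-1, 1). When the bounds are asymmetric, c ≠ 0 and W(0) > 0, so the running cost does not vanish at the origin under zero control. The factor e^{-α(s-t)} keeps the integral finite whenever the running cost stays bounded, which is one reason a discounted cost fits this setting.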

     



    Highlights

    • An optimal neuro-control scheme is proposed for nonlinear systems with asymmetric input constraints.
    • A discounted cost function is introduced to handle the asymmetric input constraints.
    • Only a critic neural network is used to implement the optimal neuro-control scheme.
    • Uniform ultimate boundedness of all signals in the closed-loop system is proved.
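
To make the critic-only idea concrete, here is a minimal Python sketch of one gradient-descent step on a critic's weight vector. It is an illustration under stated assumptions, not the paper's algorithm. The scalar plant `f`, `g`, the polynomial basis `sigma`, the tanh-based control map, the penalty `control_penalty`, the normalization of the gradient, and all numeric values are hypothetical choices. The control is held fixed during the step, so the HJB residual is linear in the weights.

```python
import numpy as np

def bounded_control(g_grad_v, u_min, u_max):
    """Map g(x) * dV/dx to a control in [u_min, u_max] (assumed tanh form)."""
    center = 0.5 * (u_max + u_min)
    half_width = 0.5 * (u_max - u_min)
    return center - half_width * np.tanh(g_grad_v / (2.0 * half_width))

def control_penalty(u, u_min, u_max):
    """Closed form of 2*beta * integral_c^u atanh((v - c)/beta) dv, for u strictly inside the bounds."""
    center = 0.5 * (u_max + u_min)
    half_width = 0.5 * (u_max - u_min)
    z = (u - center) / half_width
    return 2.0 * half_width**2 * (z * np.arctanh(z) + 0.5 * np.log(1.0 - z**2))

def critic_step(w, phi, running_cost, lr):
    """One normalized gradient step on E = 0.5 * e**2, where e = w @ phi + running_cost."""
    e = w @ phi + running_cost
    grad = e * phi / (1.0 + phi @ phi) ** 2
    return w - lr * grad, e

# Hypothetical scalar plant and critic basis, chosen only for this sketch.
f = lambda x: -x                          # drift term
g = 1.0                                   # input gain
sigma = lambda x: np.array([x, x**2])     # critic basis, V_hat(x) = w @ sigma(x)
dsigma = lambda x: np.array([1.0, 2.0 * x])

alpha, u_min, u_max = 0.5, -1.0, 3.0      # discount rate and asymmetric bounds
x, w = 0.8, np.array([0.1, 0.5])          # sample state and initial weights

u = bounded_control(g * (w @ dsigma(x)), u_min, u_max)
phi = dsigma(x) * (f(x) + g * u) - alpha * sigma(x)   # regressor of the HJB residual
cost = x**2 + control_penalty(u, u_min, u_max)        # Q(x) + W(u) with Q(x) = x**2

w_new, e_old = critic_step(w, phi, cost, lr=0.5)
e_new = w_new @ phi + cost                # residual after the step, control held fixed
print(abs(e_old), abs(e_new))
```

With the control frozen, the step shrinks the residual magnitude at the sampled state. A full scheme would repeat this along the closed-loop trajectory and recompute the control from the updated weights.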
