A journal of IEEE and CAA, publishing high-quality papers in English on original theoretical/experimental research and development in all areas of automation
Volume 4 Issue 2
Apr. 2017

IEEE/CAA Journal of Automatica Sinica

  • JCR Impact Factor: 11.8, Top 4% (SCI Q1)
    CiteScore: 17.6, Top 3% (Q1)
    Google Scholar h5-index: 77, TOP 5
Qinglai Wei, Derong Liu, Yu Liu and Ruizhuo Song, "Optimal Constrained Self-learning Battery Sequential Management in Microgrid Via Adaptive Dynamic Programming," IEEE/CAA J. Autom. Sinica, vol. 4, no. 2, pp. 168-176, Apr. 2017. doi: 10.1109/JAS.2016.7510262

Optimal Constrained Self-learning Battery Sequential Management in Microgrid Via Adaptive Dynamic Programming

doi: 10.1109/JAS.2016.7510262
Funds: This work was supported in part by the National Natural Science Foundation of China (61533017, 61273140, 61304079, 61374105, 61379099, 61233001), the Fundamental Research Funds for the Central Universities (FRF-TP-15-056A3), and the Open Research Project from SKLMCCS (20150104).

  • This paper presents a novel optimal self-learning battery sequential control scheme for smart home energy systems. The main idea is to use the adaptive dynamic programming (ADP) technique to obtain the optimal battery sequential control iteratively. First, a model of the battery energy management system is established that accounts for the power efficiency of the battery. Next, to respect the power constraints of the battery, a new non-quadratic performance index function is established, which guarantees that the iterative control law never exceeds the maximum charging/discharging power of the battery and thereby extends its service life. Then, the convergence properties of the iterative ADP algorithm are analyzed, showing that the iterative value function and the iterative control law both converge to their optima. Finally, simulation and comparison results illustrate the performance of the presented method.
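The iterative idea summarized in the abstract can be illustrated with a small value-iteration sketch. This is not the authors' algorithm: it assumes a discretized battery with integer energy levels, a periodic price and load profile, a discount factor `gamma`, and it enforces the power limit by restricting the action set rather than through the paper's non-quadratic performance index. All names (`value_iteration`, `capacity`, `p_max`, `eta`) and all numeric values are hypothetical.

```python
import numpy as np

def value_iteration(price, load, capacity=4, p_max=1, eta=0.9, gamma=0.95,
                    tol=1e-8, max_iter=2000):
    """Discounted value iteration for a toy battery schedule (illustrative only)."""
    T = len(price)
    levels = np.arange(capacity + 1)          # stored energy, integer units
    actions = np.arange(-p_max, p_max + 1)    # u > 0 charges, u < 0 discharges
    V = np.zeros((T, capacity + 1))           # initial value function V_0 = 0
    history = []                              # sup-norm change per iteration
    for _ in range(max_iter):
        Q = np.full((T, capacity + 1, len(actions)), np.inf)
        for t in range(T):
            for x in levels:
                for k, u in enumerate(actions):
                    x_next = x + u
                    if x_next < 0 or x_next > capacity:
                        continue              # infeasible: keep cost at +inf
                    # Grid-side battery power: charging draws more than is
                    # stored, discharging delivers less than is released.
                    batt = u / eta if u > 0 else u * eta
                    stage_cost = price[t] * (load[t] + batt)
                    Q[t, x, k] = stage_cost + gamma * V[(t + 1) % T, x_next]
        V_new = Q.min(axis=2)                 # Bellman update
        policy = actions[Q.argmin(axis=2)]    # greedy control for each (t, x)
        gap = float(np.max(np.abs(V_new - V)))
        history.append(gap)
        V = V_new
        if gap < tol:
            break
    return V, policy, history

# Hypothetical 24-hour profile: sinusoidal price, constant load.
hours = np.arange(24)
price = 0.1 + 0.05 * (1 + np.sin(2 * np.pi * hours / 24))
load = np.full(24, 2.0)
V, policy, history = value_iteration(price, load)
```

Because the discount factor is below one, the update is a contraction and the change between successive value functions shrinks toward zero; the returned `policy` holds the charge/discharge decision for each hour and energy level. The paper's convergence analysis concerns its own iterative ADP algorithm and performance index, which this sketch does not reproduce.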

     

