A journal of IEEE and CAA that publishes high-quality papers in English on original theoretical and experimental research and development in all areas of automation
Volume 5, Issue 4
Jul. 2018

IEEE/CAA Journal of Automatica Sinica

Teng Liu, Bin Tian, Yunfeng Ai, Li Li, Dongpu Cao and Fei-Yue Wang, "Parallel Reinforcement Learning: A Framework and Case Study," IEEE/CAA J. Autom. Sinica, vol. 5, no. 4, pp. 827-835, July 2018. doi: 10.1109/JAS.2018.7511144

Parallel Reinforcement Learning: A Framework and Case Study

doi: 10.1109/JAS.2018.7511144
Funds: the National Natural Science Foundation of China (61503380) and the Natural Science Foundation of Guangdong Province, China (2015A030310187)

  • This paper develops a new machine learning framework for complex system control, called parallel reinforcement learning. To overcome the data deficiency of current data-driven algorithms, a parallel system is built that improves the complex learning system through self-guidance. Based on Markov chain (MC) theory, we combine transfer learning, predictive learning, deep learning, and reinforcement learning to handle the data and action processes and to express knowledge. The parallel reinforcement learning framework is formulated, and several case studies on real-world problems are then introduced.
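
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, under stated assumptions rather than the authors' implementation. It fits a Markov chain to a short "real" state record, samples synthetic trajectories from that chain to stand in for an artificial system, and runs tabular Q-learning on both data sources. The three-state chain, the `toy_reward` function, the hyperparameters, and every function name are hypothetical, and tabular Q-learning is used here in place of the deep, transfer, and predictive learning components the abstract mentions.

```python
import random

def estimate_transition_matrix(states, n_states):
    """Maximum-likelihood transition matrix from one observed state sequence."""
    counts = [[0] * n_states for _ in range(n_states)]
    for s, s_next in zip(states, states[1:]):
        counts[s][s_next] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Unvisited states fall back to a uniform row (an assumption of this sketch).
        matrix.append([c / total for c in row] if total else [1.0 / n_states] * n_states)
    return matrix

def sample_trajectory(matrix, start, length, rng):
    """Generate a synthetic state sequence from the estimated chain."""
    traj = [start]
    for _ in range(length - 1):
        traj.append(rng.choices(range(len(matrix)), weights=matrix[traj[-1]])[0])
    return traj

def toy_reward(state, action):
    """Hypothetical reward: action (state % 2) is the rewarded one in each state."""
    return 1.0 if action == state % 2 else 0.0

def q_learning(trajectories, n_states, n_actions, rng,
               alpha=0.1, gamma=0.9, epsilon=0.2):
    """Tabular Q-learning; the state evolves exogenously along each trajectory."""
    q = [[0.0] * n_actions for _ in range(n_states)]
    for traj in trajectories:
        for s, s_next in zip(traj, traj[1:]):
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)  # explore
            else:
                a = max(range(n_actions), key=lambda i: q[s][i])  # exploit
            target = toy_reward(s, a) + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
    return q

rng = random.Random(0)
# Short "real" record, too small on its own for reliable learning.
real = [0, 1, 2, 1, 0, 1, 2, 2, 1, 0, 0, 1, 2, 1, 0]
P = estimate_transition_matrix(real, n_states=3)
# Synthetic data from the fitted chain plays the role of the artificial system.
synthetic = [sample_trajectory(P, start=0, length=200, rng=rng) for _ in range(20)]
q_table = q_learning([real] + synthetic, n_states=3, n_actions=2, rng=rng)
policy = [max(range(2), key=lambda a: q_table[s][a]) for s in range(3)]
print(P)
print(policy)
```

In this toy setting the action affects only the reward, not the state transition, which keeps the example short; a controlled system would need transitions conditioned on the action as well.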

     

