A journal of IEEE and CAA that publishes high-quality papers in English on original theoretical and experimental research and development in all areas of automation
Volume 6 Issue 5
Sep.  2019

IEEE/CAA Journal of Automatica Sinica

Citation: Yifan Xia, Hui Yu and Fei-Yue Wang, "Accurate and Robust Eye Center Localization via Fully Convolutional Networks," IEEE/CAA J. Autom. Sinica, vol. 6, no. 5, pp. 1127-1138, Sept. 2019. doi: 10.1109/JAS.2019.1911684

Accurate and Robust Eye Center Localization via Fully Convolutional Networks

doi: 10.1109/JAS.2019.1911684
Funds: This work was supported by the National Natural Science Foundation of China (61533019, U1811463), the Open Fund of the State Key Laboratory for Management and Control of Complex Systems, Institute of Automation, Chinese Academy of Sciences (Y6S9011F51), and in part by the EPSRC Project (EP/N025849/1).
  • Eye center localization is a crucial, basic requirement for human-computer interaction applications such as eye gaze estimation and eye tracking. Although a large body of work has addressed this topic in recent years, accuracy still needs to be improved because of challenges in appearance, such as the high variability of shapes, lighting conditions, and viewing angles, as well as possible occlusions. To address these problems, we propose a novel approach to eye center localization based on a fully convolutional network (FCN), an end-to-end, pixels-to-pixels network that can locate the eye center accurately. The key idea is to apply the FCN from the semantic segmentation task to eye center localization, since eye center localization can be regarded as a special semantic segmentation problem. We adapt a contemporary FCN into a shallow structure with a large-kernel convolutional block and transfer its performance from semantic segmentation to eye center localization by fine-tuning. Extensive experiments show that the proposed method outperforms state-of-the-art methods in both accuracy and reliability of eye center localization. It achieves a large performance improvement on the most challenging database and thus provides a promising solution for some challenging applications.
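To make the "special semantic segmentation problem" framing concrete, the following is a minimal sketch of one way a per-pixel output map could be reduced to a single eye center. The abstract does not say how the center is read out of the network output, so the function name `eye_center_from_map`, the 0.5 threshold, and the probability-weighted centroid rule are all illustrative assumptions rather than the authors' procedure.

```python
from typing import List, Optional, Tuple


def eye_center_from_map(
    prob_map: List[List[float]], threshold: float = 0.5
) -> Optional[Tuple[float, float]]:
    """Reduce a per-pixel "eye center" probability map to one (row, col) point.

    Assumption (not from the paper): the center is taken as the
    probability-weighted centroid of all pixels at or above `threshold`.
    Returns None when no pixel reaches the threshold.
    """
    total = 0.0
    row_sum = 0.0
    col_sum = 0.0
    for r, row in enumerate(prob_map):
        for c, p in enumerate(row):
            if p >= threshold:
                total += p
                row_sum += r * p
                col_sum += c * p
    if total == 0.0:
        return None
    return (row_sum / total, col_sum / total)


# Hypothetical 4x4 map standing in for a segmentation-style network output.
toy_map = [
    [0.0, 0.125, 0.0, 0.0],
    [0.125, 0.75, 0.75, 0.0],
    [0.0, 0.75, 0.75, 0.125],
    [0.0, 0.0, 0.125, 0.0],
]
center = eye_center_from_map(toy_map)
print(center)
```

For this toy map, the four high-probability pixels form a 2x2 block, so the recovered center falls between them at sub-pixel coordinates. A centroid was chosen over a plain argmax only because it gives sub-pixel estimates; either readout is compatible with the segmentation view described in the abstract.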

     

