A journal of IEEE and CAA that publishes high-quality papers in English on original theoretical/experimental research and development in all areas of automation
Volume 8 Issue 2
Feb.  2021

IEEE/CAA Journal of Automatica Sinica

Xin Luo, Wen Qin, Ani Dong, Khaled Sedraoui and MengChu Zhou, "Efficient and High-quality Recommendations via Momentum-incorporated Parallel Stochastic Gradient Descent-Based Learning," IEEE/CAA J. Autom. Sinica, vol. 8, no. 2, pp. 402-411, Feb. 2021. doi: 10.1109/JAS.2020.1003396

Efficient and High-quality Recommendations via Momentum-incorporated Parallel Stochastic Gradient Descent-Based Learning

doi: 10.1109/JAS.2020.1003396
Funds:  This work was supported in part by the National Natural Science Foundation of China (61772493), the Deanship of Scientific Research (DSR) at King Abdulaziz University (RG-48-135-40), Guangdong Province Universities and College Pearl River Scholar Funded Scheme (2019), and the Natural Science Foundation of Chongqing (cstc2019jcyjjqX0013)
  • A recommender system (RS) relying on latent factor analysis usually adopts stochastic gradient descent (SGD) as its learning algorithm. However, owing to its serial mechanism, an SGD algorithm suffers from low efficiency and poor scalability when handling large-scale industrial problems. To address this issue, this study proposes a momentum-incorporated parallel stochastic gradient descent (MPSGD) algorithm, whose main idea is two-fold: a) implementing parallelization via a novel data-splitting strategy, and b) accelerating convergence by integrating momentum effects into its training process. With it, an MPSGD-based latent factor (MLF) model is achieved, which is capable of performing efficient and high-quality recommendations. Experimental results on four high-dimensional and sparse matrices generated by industrial RSs indicate that, owing to the MPSGD algorithm, an MLF model outperforms existing state-of-the-art models in both computational efficiency and scalability.
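
    The momentum idea mentioned in the abstract can be illustrated with a minimal serial sketch. This is not the MPSGD algorithm itself: it omits the data-splitting and parallelization entirely, and it uses the classical (heavy-ball) momentum update on a regularized squared-error objective as an assumption. The function names `rmse` and `train_lf_momentum`, the velocity tables `VP` and `VQ`, and all hyperparameter values (`rank`, `lr`, `reg`, `gamma`, `epochs`) are hypothetical choices for illustration only.

    ```python
    import math
    import random


    def rmse(ratings, P, Q):
        """Root mean squared error of the model on the known entries."""
        se = 0.0
        for u, i, r in ratings:
            pred = sum(pk * qk for pk, qk in zip(P[u], Q[i]))
            se += (r - pred) ** 2
        return math.sqrt(se / len(ratings))


    def train_lf_momentum(ratings, n_users, n_items, rank=2, lr=0.005,
                          reg=0.05, gamma=0.9, epochs=300, seed=0):
        """Serial SGD with classical momentum for a latent factor model.

        ratings: list of (user, item, value) triples, i.e. only the known
        entries of a sparse user-item matrix.
        Returns (P, Q, history); history[0] is the RMSE before training and
        history[t] the training RMSE after epoch t.
        """
        rng = random.Random(seed)
        # Small positive random initial latent factors (an assumption).
        P = [[rng.uniform(0.0, 0.1) for _ in range(rank)] for _ in range(n_users)]
        Q = [[rng.uniform(0.0, 0.1) for _ in range(rank)] for _ in range(n_items)]
        # One velocity entry per latent factor, initially zero.
        VP = [[0.0] * rank for _ in range(n_users)]
        VQ = [[0.0] * rank for _ in range(n_items)]
        order = list(ratings)
        history = [rmse(ratings, P, Q)]
        for _ in range(epochs):
            rng.shuffle(order)
            for u, i, r in order:
                # Prediction error on this single known entry.
                err = r - sum(pk * qk for pk, qk in zip(P[u], Q[i]))
                for k in range(rank):
                    pu, qi = P[u][k], Q[i][k]
                    # Gradients of 0.5*err^2 + 0.5*reg*(pu^2 + qi^2).
                    grad_p = -err * qi + reg * pu
                    grad_q = -err * pu + reg * qi
                    # Heavy-ball update: keep a fraction gamma of the old step.
                    VP[u][k] = gamma * VP[u][k] + lr * grad_p
                    VQ[i][k] = gamma * VQ[i][k] + lr * grad_q
                    P[u][k] -= VP[u][k]
                    Q[i][k] -= VQ[i][k]
            history.append(rmse(ratings, P, Q))
        return P, Q, history
    ```

    Calling `train_lf_momentum(ratings, n_users, n_items)` on a small list of `(user, item, value)` triples returns the two factor tables plus the per-epoch training RMSE, which makes it easy to compare `gamma=0.9` against `gamma=0.0` (plain SGD) on the same data.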

     

  • Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/JAS.2020.1003396


    Figures(6)  / Tables(5)


    Highlights

    • Proposing an MPSGD algorithm that converges faster than existing parallel SGD algorithms when building a latent factor (LF) model for an RS.
    • Performing theoretical analysis and algorithm design for the proposed MPSGD-based LF model.
    • Conducting empirical studies on four high-dimensional and sparse (HiDS) matrices generated by industrial applications to evaluate the proposed model.
