# A Recommendation Model Based on Deep Neural Network

IEEE Access, Volume 6, 2018, Pages 9454-9463.

Abstract:

In recent years, recommendation systems have been widely used in various commercial platforms to provide recommendations for users. Collaborative filtering algorithms are one of the main algorithms used in recommendation systems. Such algorithms are simple and efficient; however, the sparsity of the data and the scalability of the method …


Introduction

- With the development of artificial intelligence technology, increasingly more intelligent products are being applied in daily life and provide convenience for people in various aspects.
- Collaborative filtering algorithms are the most widely used algorithms in recommendation systems; unlike content-based methods, they do not require information about users or items, and they make accurate recommendations based only on interaction information between users and items, such as clicks, browsing, and ratings.
- Although this method is simple and effective, with the rapid development of the Internet the high sparsity of the data limits the performance of the algorithm, so researchers have begun to look for other methods of improving recommendation performance.

Highlights

- Considering that the above-mentioned information may be difficult to obtain for most recommendation systems, in this paper, we propose a recommendation model based on a deep neural network (DNN) that does not need any extra information
- The remainder of this paper is organized as follows: In Section 2, we introduce the CF methods and some recommendation algorithms based on DNNs
- Considering the above-mentioned disadvantages of the RaF, NCF, Singular Value Decomposition (SVD) and Probabilistic Matrix Factorization (PMF) methods, this paper proposes a new feature representation method based on Quadric Polynomial Regression (QPR), which avoids the issue whereby the preprocessing of missing values may lead to inaccurate results in feature learning and can consider the correlations between features
- Because too many hidden layers may cause overfitting given the low feature dimension of the input, which comes from the feature representation model, we propose to set the number of hidden layers in the DNN model to 2
- The experimental results show that the proposed model achieves good prediction performance, which demonstrates that applying a deep learning model in a recommender system is a successful attempt
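As a sketch of the pipeline the highlights describe, the snippet below runs a low-dimensional latent-feature vector through a small network with two hidden layers and takes the most probable rating score as the prediction. The layer sizes, ReLU activations, and random weights are illustrative assumptions, not the paper's trained parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0.0, x)

def softmax(x):
    # Numerically stable softmax over the last axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def predict_rating(features, params):
    """Forward pass: two hidden layers, then a softmax over the five
    possible rating scores; the most probable score is the prediction."""
    h1 = relu(features @ params["W1"] + params["b1"])
    h2 = relu(h1 @ params["W2"] + params["b2"])
    probs = softmax(h2 @ params["W3"] + params["b3"])
    return int(np.argmax(probs)) + 1, probs  # scores are 1..5

# Hypothetical sizes: 16-dim latent input, two 32-unit hidden layers.
dims = [16, 32, 32, 5]
params = {f"W{i}": rng.normal(scale=0.1, size=(dims[i - 1], dims[i])) for i in (1, 2, 3)}
params.update({f"b{i}": np.zeros(dims[i]) for i in (1, 2, 3)})

score, probs = predict_rating(rng.normal(size=16), params)
```

Treating rating prediction as classification over discrete scores, rather than regression, is what lets the model pick "the score with the highest probability" as its output.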

Results

- The authors randomly selected 20% of the dataset as the test set and the remaining 80% as the training set.
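The split protocol above can be sketched as follows; the `split_ratings` helper and the toy triples are hypothetical, assuming only that rating records are held out uniformly at random.

```python
import numpy as np

def split_ratings(ratings, test_frac=0.2, seed=42):
    """Randomly hold out test_frac of the rating records as the test
    set and keep the rest for training."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(ratings))
    n_test = int(len(ratings) * test_frac)
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return [ratings[i] for i in train_idx], [ratings[i] for i in test_idx]

# Toy (user, item, score) triples; the real data would be e.g. MovieLens-100K.
ratings = [(u, i, (u + i) % 5 + 1) for u in range(10) for i in range(10)]
train, test = split_ratings(ratings)
# With 100 records: 80 for training, 20 for testing.
```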

Conclusion

- The authors discussed the effectiveness and implementation details of applying the DNN model to non-content-based recommendation systems.
- The experimental results show that the proposed model achieves good prediction performance, which demonstrates that applying a deep learning model in a recommender system is a successful attempt.
- One can regard the framework as a guideline for developing deep learning methods for recommendation systems.
- This paper is a preliminary attempt to apply deep learning methods to recommendation systems, so there are many possibilities for improvement, such as building more complex models or using other deep learning methods.

Summary


- Table1: The statistics of the three datasets
- Table2: Evaluation results with different recommendation algorithms on the three datasets

Related work

- Breese et al. [8] divide CF algorithms into two classes: memory-based methods and model-based methods. Memory-based CF uses the similarities between users [9] or items [10] to make recommendations. This method is widely used because it is effective and easy to implement, but as the scale of the recommendation system grows, the calculation of the similarity becomes increasingly difficult; in addition, high data sparsity also limits the performance of this method.
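A minimal sketch of the user-based, memory-based approach described above, assuming cosine similarity on rating vectors with missing ratings stored as 0 (a common simplification); the helper names and the toy rating matrix are illustrative.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between two users' rating vectors."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

def predict_user_based(R, user, item):
    """Similarity-weighted average of other users' ratings for `item`."""
    sims = np.array([cosine_sim(R[user], R[v]) if v != user else 0.0
                     for v in range(R.shape[0])])
    mask = R[:, item] > 0                     # only users who rated the item
    w = sims * mask
    return float(w @ R[:, item] / w.sum()) if w.sum() > 0 else 0.0

# Toy user-by-item rating matrix; 0 means "not rated".
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [1, 0, 0, 4]], dtype=float)
pred = predict_user_based(R, user=1, item=1)  # estimate user 1's rating of item 1
```

Every prediction requires similarities against all other users, which is why the text notes that the similarity computation becomes increasingly difficult as the system scales.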

To solve the above-mentioned problems, many model-based recommendation algorithms have been proposed, such as latent semantic models [11], Bayesian models [12], regression-based models [13], clustering models [14], and matrix factorization models [15]. Among the various CF technologies, matrix factorization is the most popular. It maps both users and items to vectors of the same dimension, which represent the latent features of the users or items. Representative works include Nonparametric Probabilistic Principal Component Analysis (NPCA) [16], Singular Value Decomposition (SVD) [17], and Probabilistic Matrix Factorization (PMF) [18]. However, the latent features learned by matrix factorization methods are often not sufficiently effective, especially when the rating matrix is very sparse.
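A minimal sketch of the matrix-factorization idea: users and items are mapped to k-dimensional latent vectors whose inner product approximates the observed ratings. The plain SGD update, the hyperparameters, and the toy data are assumptions for illustration, not the cited NPCA/SVD/PMF implementations.

```python
import numpy as np

def factorize(ratings, n_users, n_items, k=8, lr=0.01, reg=0.02,
              epochs=500, seed=0):
    """Learn k-dim latent vectors so that P[u] . Q[i] approximates r_ui,
    training only on observed ratings (no missing-value imputation)."""
    rng = np.random.default_rng(seed)
    P = rng.normal(scale=0.1, size=(n_users, k))   # user latent features
    Q = rng.normal(scale=0.1, size=(n_items, k))   # item latent features
    for _ in range(epochs):
        for u, i, r in ratings:
            p_u = P[u].copy()                       # use pre-update value for both steps
            err = r - p_u @ Q[i]
            P[u] += lr * (err * Q[i] - reg * p_u)   # gradient step with L2 regularization
            Q[i] += lr * (err * p_u - reg * Q[i])
    return P, Q

# Toy observed (user, item, score) ratings.
obs = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1), (2, 1, 1), (2, 2, 5)]
P, Q = factorize(obs, n_users=3, n_items=3)
rmse = np.sqrt(np.mean([(r - P[u] @ Q[i]) ** 2 for u, i, r in obs]))
```

Training only on observed entries is what distinguishes this family from methods that must first impute the missing values, the weakness the paper's QPR model also targets.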

Funding

- This work was supported by the National Education Information Technology Research under Grant 171140001

Study subjects and analysis

public datasets: 3

The latent features learned by the feature representation model are regarded as the input data of the deep neural network model, which is the second part of the proposed model and is used to predict the rating scores; the score with the highest probability is used as the prediction result. By comparing with some commonly used and state-of-the-art algorithms on three public datasets, the authors verify that the proposed model effectively improves recommendation accuracy.

INDEX TERMS: Recommendation system, collaborative filtering, quadric polynomial regression, deep neural network (DNN)

real datasets: 3

DATA DESCRIPTION. In the experiments, three real datasets are used to test the performance of the model: MovieLens-100K, MovieLens-1M, and Epinions.

rating records: 100000

The MovieLens-100K dataset contains nearly 100,000 rating records of 943 users on 1,682 items; the dataset comes from the MovieLens website, and all rating scores are positive and not greater than 5.

rating records: 1000209

The MovieLens-1M dataset also comes from the MovieLens website, but it contains 1,000,209 rating records from 6,040 users for 3,952 movies. It was released later than the previous dataset, and each user has rated at least 20 movies.

rating records: 354857

Before the experiments, users who had rated fewer than 10 items were removed from the Epinions dataset, as were items that had been rated fewer than 10 times. Ultimately, 354,857 rating records of 15,687 users on 11,657 items remain. The statistics of the three datasets are shown in Table 1.

EXPERIMENTAL RESULTS. The experiments were carried out about 20 times on each of the three datasets; one of the best and one of the worst results were removed, and the average of the rest was taken as the experimental result. In the first part of the experiment, the authors tested the effectiveness of the proposed feature representation model using only the MAE measure; the results, shown in Figure 4, support the following conclusions:

1) The RaF method performs worst on all three datasets. The likely reason is that the input data required by the neural network model must be continuous and comparable, whereas, under the authors' assumptions, missing values in the rating matrix are replaced by 0.
2) The NCF method performs well on the MovieLens-100K dataset but poorly on the other two. Analyzing the statistics of the three datasets in Table 1, the authors attribute this to the small size of MovieLens-100K: one-hot encoding is reasonable there, but on larger datasets the encoded vectors become very sparse, which makes it difficult for the neural network to learn effective features, so the performance is worse.
3) The performance of the SVD algorithm on the three datasets is not ideal. The likely reason is that the method must preprocess the missing values, and the general approach of replacing them with averages or modes makes the decomposed features insufficiently reliable.

In the second part, evaluation results with different recommendation algorithms on the three datasets are shown in Table 2, from which the authors draw the following conclusions:

1) The proposed method achieves better performance under both the MAE and RMSE metrics on all three datasets, which means the model achieves higher prediction accuracy and shows that the combination of the QPR model and the DNN model is effective.
2) The RMSE value is larger than the MAE value under the same conditions, which makes RMSE a better indicator of the performance of the algorithm; the proposed model also outperforms the other algorithms with respect to RMSE, which means it has high prediction stability.
3) The prediction performance of the model differs considerably across the three datasets; given the statistical information in Table 1, the authors believe this difference is caused by the number of ratings per user.

Finally, with increasing feature dimension, the performance of the model gradually improves and ultimately becomes stable; when the best performance is achieved, the values of a and b on the three datasets are consistent with the settings in Section 4.4. When the feature dimension is low, the features learned by the model are not sufficiently accurate, leaving significant room for improvement.
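The MAE and RMSE metrics and the run-averaging protocol described above can be sketched as follows; the data values are invented for illustration.

```python
import math

def mae(actual, predicted):
    """Mean absolute error over held-out ratings."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error; penalizes large errors more heavily,
    so RMSE >= MAE always holds for the same predictions."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def trimmed_mean(results):
    """The paper's averaging protocol: run the experiment repeatedly,
    drop one best and one worst result, and average the rest."""
    s = sorted(results)
    return sum(s[1:-1]) / len(s[1:-1])

actual = [5, 3, 4, 1, 2]
predicted = [4.5, 3.5, 3.0, 1.0, 3.0]
# mae(...) == 0.6; rmse(...) ~= 0.707, larger than the MAE as expected.

run_maes = [0.72, 0.70, 0.71, 0.69, 0.95]
avg_mae = trimmed_mean(run_maes)  # drops 0.69 and 0.95, averages the rest
```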

Reference

- M. J. Pazzani and D. Billsus, ‘‘Content-based recommendation systems,’’ in The Adaptive Web. Berlin, Germany: Springer, 2007, pp. 325–341.
- G. Adomavicius and A. Tuzhilin, ‘‘Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions,’’ IEEE Trans. Knowl. Data Eng., vol. 17, no. 6, pp. 734–749, Jun. 2005.
- Z. Huang, D. Zeng, and H. Chen, ‘‘A comparison of collaborative-filtering recommendation algorithms for e-commerce,’’ IEEE Intell. Syst., vol. 22, no. 5, pp. 68–78, Sep./Oct. 2007.
- X. Su and T. M. Khoshgoftaar, ‘‘A survey of collaborative filtering techniques,’’ Adv. Artif. Intell., vol. 2009, Aug. 2009, Art. no. 421425.
- D. Ciregan, U. Meier, and J. Schmidhuber, ‘‘Multi-column deep neural networks for image classification,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2012, pp. 3642–3649.
- F. Richardson, D. Reynolds, and N. Dehak, ‘‘Deep neural network approaches to speaker and language recognition,’’ IEEE Signal Process. Lett., vol. 22, no. 10, pp. 1671–1675, Oct. 2015.
- E. Arisoy, T. N. Sainath, B. Kingsbury, and B. Ramabhadran, ‘‘Deep neural network language models,’’ in Proc. NAACL-HLT Workshop, Will Ever Really Replace N-Gram Model? 2012, pp. 20–28.
- J. S. Breese, D. Heckerman, and C. Kadie, ‘‘Empirical analysis of predictive algorithms for collaborative filtering,’’ in Proc. 14th Conf. Uncertainty Artif. Intell., 1998, pp. 43–52.
- J. L. Herlocker, J. A. Konstan, A. Borchers, and J. Riedl, ‘‘An algorithmic framework for performing collaborative filtering,’’ in Proc. 22nd Annu. Int. ACM SIGIR Conf. Res. Develop. Inf. Retr., 1999, pp. 230–237.
- B. Sarwar, G. Karypis, J. Konstan, and J. Riedl, ‘‘Item-based collaborative filtering recommendation algorithms,’’ in Proc. 10th Int. Conf. World Wide Web, 2001, pp. 285–295.
- T. Hofmann and J. Puzicha, ‘‘Latent class models for collaborative filtering,’’ in Proc. IJCAI, 1999, pp. 1–6.
- K. Miyahara and M. J. Pazzani, ‘‘Collaborative filtering with the simple Bayesian classifier,’’ in Proc. Topics Artif. Intell. (PRICAI), 2000, pp. 679–689.
- S. Vucetic and Z. Obradovic, ‘‘Collaborative filtering using a regressionbased approach,’’ Knowl. Inf. Syst., vol. 7, no. 1, pp. 1–22, 2005.
- J. Liu, Y. Jiang, Z. Li, X. Zhang, and H. Lu, ‘‘Domain-sensitive recommendation with user-item subgroup analysis,’’ IEEE Trans. Knowl. Data Eng., vol. 28, no. 4, pp. 939–950, Apr. 2016.
- Y. Koren, R. Bell, and C. Volinsky, ‘‘Matrix factorization techniques for recommender systems,’’ Computer, vol. 42, no. 8, pp. 30–37, 2009.
- K. Yu, S. Zhu, J. Lafferty, and Y. Gong, ‘‘Fast nonparametric matrix factorization for large-scale collaborative filtering,’’ in Proc. 32nd Int. ACM SIGIR Conf. Res. Develop. Inf. Retr., 2009, pp. 211–218.
- B. Sarwar, G. Karypis, J. Konstan, and J. Riedl, ‘‘Application of dimensionality reduction in recommender system—A case study,’’ Dept. Comput. Sci. Eng. Univ. Minnesota, Minneapolis, MN, USA, Tech. Rep. TR-00-043, 2000.
- A. Mnih and R. R. Salakhutdinov, ‘‘Probabilistic matrix factorization,’’ in Proc. Adv. Neural Inf. Process. Syst., 2008, pp. 1257–1264.
- R. Salakhutdinov, A. Mnih, and G. Hinton, ‘‘Restricted Boltzmann machines for collaborative filtering,’’ in Proc. 24th Int. Conf. Mach. Learn., 2007, pp. 791–798.
- K. Georgiev and P. Nakov, ‘‘A non-IID framework for collaborative filtering with restricted boltzmann machines,’’ in Proc. Int. Conf. Mach. Learn., 2013, pp. 1148–1156.
- A. van den Oord, S. Dieleman, and B. Schrauwen, ‘‘Deep content-based music recommendation,’’ in Proc. Adv. Neural Inf. Process. Syst., 2013, pp. 2643–2651.
- X. Wang and Y. Wang, ‘‘Improving content-based and hybrid music recommendation using deep learning,’’ in Proc. 22nd ACM Int. Conf. Multimedia, 2014, pp. 627–636.
- H. Wang, N. Wang, and D.-Y. Yeung, ‘‘Collaborative deep learning for recommender systems,’’ in Proc. 21th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, 2015, pp. 1235–1244.
- X. He, L. Liao, H. Zhang, L. Nie, X. Hu, and T.-S. Chua, ‘‘Neural collaborative filtering,’’ in Proc. 26th Int. Conf. World Wide Web, 2017, pp. 173–182.
- K. Jarrett, K. Kavukcuoglu, M. Ranzato, and Y. LeCun, ‘‘What is the best multi-stage architecture for object recognition?’’ in Proc. IEEE 12th Int. Conf. Comput. Vis., Sep. 2009, pp. 2146–2153.
- X. Glorot, A. Bordes, and Y. Bengio, ‘‘Deep sparse rectifier neural networks,’’ in Proc. 14th Int. Conf. Artif. Intell. Statist., 2011, pp. 315–323.
- B. Xu, J. Bu, C. Chen, and D. Cai, ‘‘An exploration of improving collaborative recommender systems via user-item subgroups,’’ in Proc. 21st Int. Conf. World Wide Web, 2012, pp. 21–30.
- V. Kumar, A. K. Pujari, S. K. Sahu, V. R. Kagita, and V. Padmanabhan, ‘‘Proximal maximum margin matrix factorization for collaborative filtering,’’ Pattern Recognit. Lett., vol. 86, pp. 62–67, Jan. 2017.
- A. Hernando, J. Bobadilla, and F. Ortega, ‘‘A non negative matrix factorization for collaborative filtering recommender systems based on a Bayesian probabilistic model,’’ Knowl.-Based Syst., vol. 97, pp. 188–202, Apr. 2016.
- C.-L. Liao and S.-J. Lee, ‘‘A clustering based approach to improving the efficiency of collaborative filtering recommendation,’’ Electron. Commerce Res. Appl., vol. 18, pp. 1–9, Jul./Aug. 2016.
- Y. Cai, H.-F. Leung, Q. Li, H. Min, J. Tang, and J. Li, ‘‘Typicality-based collaborative filtering recommendation,’’ IEEE Trans. Knowl. Data Eng., vol. 26, no. 3, pp. 766–779, Mar. 2014.

Authors

- TIEJIAN LUO was born in 1962. He received the Ph.D. degree. He is currently a Professor and the Director of the Information Dynamic and Engineering Applications Laboratory. His research interests include web mining, large-scale web performance optimization, and distributed storage systems.
- YANJUN WU was born in 1979. He received the Ph.D. degree from the Institute of Software, Chinese Academy of Sciences (ISCAS). He is currently a Research Professor with ISCAS. His research interests include operating systems and system security.
