Effects of Similarity Measures on The Quality of Predictions



Providing accurate predictions efficiently is vital for the success of recommender systems. Various factors can affect the quality of predictions and online performance. The similarity metric used to determine neighbors is one such factor. Therefore, given a set of metrics, determining and using the best one is critical for the overall success of collaborative filtering schemes. We scrutinize several binary similarity measures in terms of accuracy and performance. We conduct experiments on real data to determine the best similarity measure. Our empirical outcomes show that the Yule and Kulczynski metrics provide the best results.
Keywords: Accuracy, performance, binary similarity metric, prediction, collaborative filtering


