Sampling for Approximate Maximum Search in Factorized Tensor

Zhi Lu, Yang Hu, Bing Zeng

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence
Main track. Pages 2400-2406. https://doi.org/10.24963/ijcai.2017/334

Factorization models have been extensively used for recovering the missing entries of a matrix or tensor. However, directly computing all of the entries with a learned factorization model is prohibitive when the matrix/tensor is large. On the other hand, in many applications, such as collaborative filtering, we are only interested in the few largest entries. In this work, we propose a sampling-based approach for finding the top entries of a tensor that has been decomposed with the CANDECOMP/PARAFAC model. We develop an algorithm to sample entries with probabilities proportional to their values. We further extend it to make the sampling proportional to the $k$-th power of the values, amplifying the focus on the top ones. We provide a theoretical analysis of the sampling algorithm and evaluate its performance on several real-world data sets. Experimental results indicate that the proposed approach is orders of magnitude faster than exhaustive computation. When applied to the special case of searching in a matrix, it also requires fewer samples than the existing state-of-the-art method.
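To make the core idea concrete, the sketch below illustrates one standard way to sample index tuples of a CP-factorized tensor with probability proportional to the entry values, under the simplifying assumption that all factor matrices are entrywise nonnegative: first pick a rank-1 component proportional to the product of its column sums, then pick one index per mode proportional to that component's column. This is only an illustrative sketch of the sampling principle described in the abstract; the function name `sample_cp_entries`, the nonnegativity assumption, and the toy data are not taken from the paper itself.

```python
import numpy as np

def sample_cp_entries(factors, num_samples, rng=None):
    """Draw index tuples (i_1, ..., i_D) with probability proportional to the
    CP tensor entry sum_r prod_d U_d[i_d, r].

    Assumes every factor matrix in `factors` is entrywise nonnegative
    (an assumption made here for illustration, not stated in the abstract).
    """
    rng = np.random.default_rng() if rng is None else rng
    # Weight of each rank-1 component: product of its column sums across modes.
    col_sums = np.vstack([U.sum(axis=0) for U in factors])   # shape (D, R)
    comp_weights = np.prod(col_sums, axis=0)                  # shape (R,)
    comp_probs = comp_weights / comp_weights.sum()
    samples = []
    for _ in range(num_samples):
        # Pick a component r, then one index per mode proportional to U[:, r];
        # marginalizing over r gives P(i_1, ..., i_D) proportional to the entry value.
        r = rng.choice(len(comp_probs), p=comp_probs)
        idx = tuple(int(rng.choice(U.shape[0], p=U[:, r] / U[:, r].sum()))
                    for U in factors)
        samples.append(idx)
    return samples

# Toy usage: a 3-way CP model of rank 4; frequently drawn tuples are
# candidates for the largest entries of the (never fully formed) tensor.
rng = np.random.default_rng(0)
A, B, C = (rng.random((20, 4)) for _ in range(3))
hits = sample_cp_entries([A, B, C], num_samples=10_000, rng=rng)
```

In a top-entry search, one would tally how often each index tuple is drawn and keep the most frequent tuples as candidates, since entries with larger values are sampled more often; the paper's $k$-th power extension sharpens this concentration on the largest entries.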
Keywords:
Machine Learning: Data Mining
Multidisciplinary Topics and Applications: Personalization and User Modeling
Combinatorial & Heuristic Search: Combinatorial search/optimisation