I know why you like this movie: Interpretable Efficient Multimodal Recommender

Abstract

The recently introduced Efficient Manifold Density Estimator (EMDE) model combines Locality-Sensitive Hashing and Count-Min Sketch algorithms with a neural network to achieve state-of-the-art results on multiple recommender datasets. However, the model ingests a compressed joint representation of all input items for each user or session, so computing attributions for individual items with gradient-based methods does not appear to be applicable. We prove that interpreting this model in a white-box setting is possible thanks to the properties of the EMDE item retrieval method. By exploiting the multimodal flexibility of this model, we obtain meaningful results showing the influence of multiple modalities (text, categorical features, and images) on the movie recommendation output.
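
To make the phrase "compressed joint representation" concrete, here is a minimal sketch of how a set of item embeddings might be folded into a single sketch. It is an illustration under stated assumptions, not the EMDE implementation: it uses plain random-hyperplane hashing, and the names `make_hyperplanes`, `bucket_ids`, `item_sketch`, and `session_sketch` are hypothetical helpers introduced only for this example.

```python
import numpy as np


def make_hyperplanes(dim, n_bits, n_tables, seed=0):
    """Draw random hyperplanes: n_tables independent partitionings,
    each defined by n_bits hyperplanes in a dim-dimensional space."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((n_tables, n_bits, dim))


def bucket_ids(embedding, hyperplanes):
    """Hash one embedding to one bucket index per partitioning.
    Each hyperplane contributes one sign bit of the bucket index."""
    bits = (hyperplanes @ embedding) > 0          # shape: (n_tables, n_bits)
    weights = 1 << np.arange(bits.shape[1])       # 1, 2, 4, ...
    return (bits * weights).sum(axis=1)           # shape: (n_tables,)


def item_sketch(embedding, hyperplanes):
    """One-hot bucket indicator per partitioning for a single item."""
    n_tables, n_bits, _ = hyperplanes.shape
    sketch = np.zeros((n_tables, 2 ** n_bits))
    sketch[np.arange(n_tables), bucket_ids(embedding, hyperplanes)] = 1.0
    return sketch


def session_sketch(embeddings, hyperplanes):
    """Joint representation of a session: bucket counts summed over items
    (assumption: unweighted sum, in the spirit of a Count-Min Sketch)."""
    n_tables, n_bits, _ = hyperplanes.shape
    total = np.zeros((n_tables, 2 ** n_bits))
    for embedding in embeddings:
        total += item_sketch(embedding, hyperplanes)
    return total


# Usage: five random 8-dimensional "items" folded into one 4 x 8 sketch.
hyperplanes = make_hyperplanes(dim=8, n_bits=3, n_tables=4, seed=1)
items = np.random.default_rng(2).standard_normal((5, 8))
joint = session_sketch(items, hyperplanes)
```

In this toy version the network would only see `joint`, where items falling into the same bucket are merged into a single count. That is the setting in which the abstract notes that per-item gradient attributions do not apply directly.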

Publication
ML4MD ICML Workshop
Michal Daniluk
Research Scientist

My research interests include graph representation learning, recommendation systems, behavioral user representations, and NLP.