
Topic: Similarity Recommendation Based on the Attention Mechanism
 
Speaker: Dr. Wen-Yu Hua (Amazon Machine Learning Scientist)

Date & Time: Thursday, Oct 29, 2020, 11:00 AM - 11:50 AM
 
Place: 4F-427, Assembly Building I
 

Abstract
 
Item-to-item similarity has long been used to build recommender systems in industrial settings, owing to its interpretability and real-time computational efficiency. In this work, we develop a new embedding representation for similarity-based recommendation. The proposed solution enriches both the text embedding and the image embedding. First, we improve the text embedding in two ways: 1) we add the item description and bullet points, along with some key attributes, on top of the title to enlarge the text information; 2) we apply topic modeling to the description and bullet points to extract key topics and keywords, and compare the performance of a Word2Vec model against a pre-trained, fine-tuned Bidirectional Encoder Representations from Transformers (BERT) model on these text attributes. Moreover, we test product image embeddings under two settings: 1) applying max-pooling to a ResNet50 model trained with a triplet loss to obtain 205-dimensional embeddings; 2) applying PCA to the same ResNet50 features to reduce the dimension. Based on the experimental results with the different text and image embeddings, we propose a solution that outperforms the baseline [1] with a 20% increase in precision at a fixed recall of 0.05. The contributions of this work are: 1) the most comprehensive ASIN catalog information is fed to the text model; 2) the best combination of text and image embeddings is identified, yielding smaller k-nearest-neighbor (KNN) Euclidean distances and a significant precision increase on a downstream click-and-purchase task; 3) the framework is not limited to a specific use case and can be easily adapted to different product categories and marketplaces.
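
The sketch below is a minimal illustration of the general recipe described in the abstract, not the speaker's implementation: combine a text embedding and a PCA-reduced image embedding into one item representation, then retrieve similar items with k-nearest neighbors under Euclidean distance. The embedding dimensions, the PCA target dimension, the normalization step, and the synthetic data are all assumptions made for the example.

```python
# Illustrative sketch only; dimensions, PCA size, and data are hypothetical.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
n_items = 1000

# Stand-ins for precomputed embeddings per catalog item:
#   text_emb  - e.g. pooled Word2Vec or BERT vectors over title, description,
#               and bullet points (300 dims is an assumption)
#   image_emb - e.g. max-pooled ResNet50 features trained with a triplet loss
#               (2048 dims is an assumption)
text_emb = rng.normal(size=(n_items, 300))
image_emb = rng.normal(size=(n_items, 2048))

# One of the tested settings: reduce the image embedding dimension with PCA.
image_emb_reduced = PCA(n_components=128).fit_transform(image_emb)

def l2_normalize(x):
    """L2-normalize rows so neither modality dominates the Euclidean distance."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)

# Concatenate the two modalities into a single item representation.
item_emb = np.hstack([l2_normalize(text_emb), l2_normalize(image_emb_reduced)])

# Item-to-item similarity: k nearest neighbors under Euclidean distance.
knn = NearestNeighbors(n_neighbors=6, metric="euclidean").fit(item_emb)
distances, indices = knn.kneighbors(item_emb[:1])
print("Top-5 items most similar to item 0:", indices[0][1:])  # skip the query itself
```

In a production setting the synthetic arrays would be replaced by the actual catalog embeddings, and the retrieved neighbors would feed the downstream click-and-purchase evaluation mentioned above.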