
Amazon Personalize can now unlock intrinsic signals in your catalog to recommend similar items

October 14, 2021


Today, we’re excited to announce a new similar items recommendation recipe (aws-similar-items) in Amazon Personalize that helps you leverage your users’ interaction histories and what you know about the items in your catalog to deliver relevant recommendations.

Across Amazon, we provide personalized experiences for each of our users, and based on a user’s interests, we change their experiences and the items they see. Visitors are often recommended items that users with similar histories have interacted with. These recommendations are called similar items, and they help users discover items relevant to what they’re watching or purchasing. By taking into account the item a user is engaged with, we can improve engagement and conversion. This new recipe uses co-occurrence in interactions data (how often these items appear together across user histories) and thematic similarity (what is similar about the items in your catalog) when making recommendations to better quantify similarity for less popular or new items in your catalog.

This post shows you how to use our new recipe (aws-similar-items) and illustrates the difference compared to our collaborative filtering-based recipe (SIMS).

With this new recipe, similar item recommendations in Amazon Personalize are no longer limited to co-occurrence, that is, how often items appear together across users’ interaction histories. Co-occurrence is not the only way to define what is similar; thematic similarity takes advantage of what you know about your items. The metadata and detailed descriptions of your items contain valuable information about features relevant to your users. Items with similar features are similar whether customers have interacted with them or not. For example, video content set in the same time period, news articles covering common events, or retail items in the same shopping category are thematically similar independent of how users have interacted with them. Our new recipe unlocks these signals for Amazon Personalize to learn from. You can use the investments made to create rich and concise narratives about items to more effectively engage your users.
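To make thematic similarity concrete, the following is a minimal sketch that scores items by the metadata features they share, using Jaccard overlap. This is only an illustration of the idea, not how the aws-similar-items recipe computes similarity (the recipe uses deep learning); the catalog, item IDs, and function names are hypothetical.

```python
def jaccard(a: set, b: set) -> float:
    """Fraction of features two items share (0.0 = none, 1.0 = identical)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical catalog: item ID -> metadata features known about the item.
catalog = {
    "sparkling-water-lime": {"beverage", "sparkling", "lime"},
    "sparkling-water-lemon": {"beverage", "sparkling", "lemon"},
    "dish-soap": {"household", "cleaning"},
}


def most_similar(seed: str, items: dict, k: int = 2) -> list:
    """Rank the other items by feature overlap with the seed item."""
    scores = [
        (other, jaccard(items[seed], features))
        for other, features in items.items()
        if other != seed
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:k]


ranked = most_similar("sparkling-water-lime", catalog)
print(ranked)
```

Note that no interaction data is involved: the two sparkling waters score as similar even if no user has ever interacted with both.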

In Amazon Personalize, the aws-similar-items recipe uses deep learning-based techniques and the knowledge you have about your items to identify similarity. This recipe makes sure that customers are exposed to a wider variety of relevant items, which drives better outcomes.

Amazon Personalize enables developers to build applications using machine learning (ML) technology to deliver personalized user experiences with no ML expertise required. We make it easy for developers to build applications capable of delivering a wide array of experiences. Amazon Personalize is a fully managed ML service that goes beyond static rule-based recommendation systems and allows customers across industries to create custom recommenders that provide highly personalized user experiences. You receive results via an Application Programming Interface (API) and only pay for what you use, with no minimum fees or upfront commitments. All data is encrypted to be private and secure, and is only used to create recommendations for your users.

Solution architecture

The notebook that accompanies this post demonstrates how using item metadata for item-to-item similarity improves the variety of recommendations. We use one Amazon Personalize dataset group with user-item interaction data and item metadata. We create two solutions, one for each of our related-items recipes: aws-similar-items and SIMS. The aws-similar-items recipe uses both user-item interaction history and item metadata to identify similar items in your catalog. SIMS uses only the user-item interaction history. We then recommend items from each solution based on a common seed item.
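As a rough sketch of this setup, the following code prepares one create_solution request per recipe against a single dataset group. The dataset group ARN, the name prefix, and the helper names are placeholders we introduce for illustration; the accompanying notebook may organize this differently. The boto3 call is wrapped in a function that is not invoked here, so the request-building part runs without AWS credentials.

```python
# Recipe ARNs for the two related-items recipes compared in this post.
RECIPE_ARNS = {
    "aws-similar-items": "arn:aws:personalize:::recipe/aws-similar-items",
    "SIMS": "arn:aws:personalize:::recipe/aws-sims",
}


def build_solution_requests(dataset_group_arn: str,
                            name_prefix: str = "similar-items-demo") -> list:
    """Build one create_solution request per recipe for the same dataset group."""
    return [
        {
            "name": f"{name_prefix}-{label.lower()}",
            "datasetGroupArn": dataset_group_arn,
            "recipeArn": recipe_arn,
        }
        for label, recipe_arn in RECIPE_ARNS.items()
    ]


def create_solutions(solution_requests: list) -> list:
    """Submit the requests to Amazon Personalize (needs boto3 and credentials)."""
    import boto3  # imported lazily so the helper above works without boto3

    personalize = boto3.client("personalize")
    return [
        personalize.create_solution(**request)["solutionArn"]
        for request in solution_requests
    ]


# Placeholder ARN; replace with your own dataset group.
solution_requests = build_solution_requests(
    "arn:aws:personalize:us-east-1:111122223333:dataset-group/demo"
)
for request in solution_requests:
    print(request["name"], "->", request["recipeArn"])
```

Each solution still needs a solution version (the training step) and a campaign before it can serve recommendations.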

The following diagram illustrates the architecture we use for the examples and comparisons throughout this post.


To demonstrate the difference in recommendations from two solution versions, we compare the results generated using each recipe. This allows us to evaluate how the inclusion of item metadata changes recommendations based on additional dimensions of similarity.
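One simple way to make this comparison is to look at which items the two recipes agree on for the same seed item and which items only one of them returns. The sketch below assumes responses shaped like those returned by the Amazon Personalize runtime GetRecommendations API (an itemList of entries with an itemId); the sample responses and item IDs are made up, and the fetch helper is defined but not called so the example runs offline.

```python
def item_ids(response: dict) -> list:
    """Extract the ordered item IDs from a GetRecommendations-style response."""
    return [entry["itemId"] for entry in response.get("itemList", [])]


def compare_recommendations(sims_response: dict,
                            similar_items_response: dict) -> dict:
    """Split two recommendation lists into shared and recipe-specific items."""
    sims_ids = item_ids(sims_response)
    similar_ids = item_ids(similar_items_response)
    sims_set, similar_set = set(sims_ids), set(similar_ids)
    return {
        "shared": [i for i in sims_ids if i in similar_set],
        "only_sims": [i for i in sims_ids if i not in similar_set],
        "only_similar_items": [i for i in similar_ids if i not in sims_set],
    }


def fetch_recommendations(campaign_arn: str, seed_item_id: str,
                          num_results: int = 10) -> dict:
    """Query a deployed campaign (needs boto3, credentials, and a campaign)."""
    import boto3

    runtime = boto3.client("personalize-runtime")
    return runtime.get_recommendations(
        campaignArn=campaign_arn, itemId=seed_item_id, numResults=num_results
    )


# Made-up responses for one seed item, standing in for real campaign output.
sims_response = {"itemList": [{"itemId": "item-1"}, {"itemId": "item-2"},
                              {"itemId": "item-3"}]}
similar_items_response = {"itemList": [{"itemId": "item-2"},
                                       {"itemId": "item-4"},
                                       {"itemId": "item-5"}]}

comparison = compare_recommendations(sims_response, similar_items_response)
print(comparison)
```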

Comparing similar items’ inference results using Amazon Prime Pantry’s dataset

SIMS uses collaborative filtering, a technique that is widely used across item-to-item recommender systems. The recipe is based on an item’s co-occurrence statistics derived from user interaction data. Because the predictions are driven purely by these statistics about user behavior, it works well when you have a large set of interaction data. This approach is fast and reliable during training and inference, but lacks support for intrinsic content. This means that it doesn’t use valuable information about thematic similarities across the items or services in a catalog.
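The following sketch illustrates the co-occurrence idea in its simplest form: count how often two items appear together in the same user’s history, then rank items by how often they co-occur with a seed item. It is not the SIMS implementation, and the histories and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical interaction histories: user ID -> items the user interacted with.
histories = {
    "user_a": ["coffee", "filters", "creamer"],
    "user_b": ["coffee", "filters"],
    "user_c": ["coffee", "tea"],
    "user_d": ["oat-bar"],
}


def cooccurrence_counts(user_histories: dict) -> Counter:
    """Count, for every item pair, how many users interacted with both."""
    counts = Counter()
    for items in user_histories.values():
        # Deduplicate and sort so each pair is counted once per user
        # and always stored in the same order.
        for left, right in combinations(sorted(set(items)), 2):
            counts[(left, right)] += 1
    return counts


def similar_by_cooccurrence(seed: str, counts: Counter) -> list:
    """Rank items by how often they co-occur with the seed item."""
    related = Counter()
    for (left, right), n in counts.items():
        if left == seed:
            related[right] += n
        elif right == seed:
            related[left] += n
    return related.most_common()


counts = cooccurrence_counts(histories)
related = similar_by_cooccurrence("coffee", counts)
print(related)
print(similar_by_cooccurrence("oat-bar", counts))
```

The second call returns an empty list: an item that never appears alongside other items has no co-occurrence signal at all, which is the gap that item metadata can help fill for new or less popular items.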

Our new Amazon Personalize recipe (aws-similar-items) uses a deep learning architecture that supports item metadata along with user-item interactions. You can use this enhancement to return richer recommendations that better reflect how similar items are to one another.

The screenshots and examples that follow are derived from the notebook hosted in the Amazon Personalize Samples GitHub repository. For this example, we used the Amazon Prime Pantry reviews dataset.

First, let’s look at the steps we took in this experiment:

1. Transform reviews into interactions.
2. Select the most relevant item features to use as metadata:
   - Brand
   - Price
   - Description, analyzed as unstructured text using our unstructured text feature
3. Train two Amazon Personalize solution versions using the SIMS and aws-similar-items recipes.
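The first two steps can be sketched as follows, under some assumptions. We assume review records with reviewerID, asin, and unixReviewTime fields, as in the public Amazon review data; the sample records, the exact column choices, and the schema details below are illustrative and may differ from what the notebook does. The items schema marks the description as a textual field, which is how Amazon Personalize is told to treat a string field as unstructured text.

```python
import csv
import io
import json

# Made-up review records; field names assumed from the public review data.
reviews = [
    {"reviewerID": "U1", "asin": "B0001", "unixReviewTime": 1420070400, "overall": 5.0},
    {"reviewerID": "U2", "asin": "B0001", "unixReviewTime": 1420156800, "overall": 4.0},
    {"reviewerID": "U1", "asin": "B0002", "unixReviewTime": 1420243200, "overall": 3.0},
]


def reviews_to_interactions_csv(review_records: list) -> str:
    """Turn each review into one interaction row: who, which item, when."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["USER_ID", "ITEM_ID", "TIMESTAMP"])
    for review in review_records:
        writer.writerow(
            [review["reviewerID"], review["asin"], int(review["unixReviewTime"])]
        )
    return buffer.getvalue()


# Items schema sketch with the three metadata features listed above.
ITEMS_SCHEMA = {
    "type": "record",
    "name": "Items",
    "namespace": "com.amazonaws.personalize.schema",
    "fields": [
        {"name": "ITEM_ID", "type": "string"},
        {"name": "BRAND", "type": ["null", "string"], "categorical": True},
        {"name": "PRICE", "type": "float"},
        # "textual" flags the description as unstructured text.
        {"name": "DESCRIPTION", "type": ["null", "string"], "textual": True},
    ],
    "version": "1.0",
}

interactions_csv = reviews_to_interactions_csv(reviews)
items_schema_json = json.dumps(ITEMS_SCHEMA, indent=2)
print(interactions_csv)
```

The JSON string in items_schema_json is the form you would pass as the schema when creating the items dataset schema in Amazon Personalize.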

Before we look at our recommendations, we consider how many user-item interactions exist for each of our item IDs.

The following screenshots show the five items with the most and the fewest interactions in our dataset (left table), as well as metrics about the distribution of the number of user-item interactions per item (right table).
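A summary like this can be computed directly from the interactions data. The following is a minimal sketch using made-up (user, item) pairs and hypothetical helper names; it does not reproduce the figures from the notebook.

```python
from collections import Counter
from statistics import mean, median

# Made-up (user ID, item ID) interaction pairs.
interactions = [
    ("U1", "A"), ("U2", "A"), ("U3", "A"), ("U4", "A"),
    ("U1", "B"), ("U2", "B"),
    ("U3", "C"),
]


def interaction_counts(pairs: list) -> Counter:
    """Count the number of interactions per item ID."""
    return Counter(item for _, item in pairs)


def top_and_bottom(counts: Counter, n: int = 5) -> tuple:
    """Return the n most and n least interacted-with items."""
    ranked = counts.most_common()  # sorted from most to least interactions
    most = ranked[:n]
    least = ranked[-n:][::-1]      # reversed so the least popular comes first
    return most, least


def distribution_summary(counts: Counter) -> dict:
    """Basic statistics on interactions per item."""
    values = list(counts.values())
    return {
        "items": len(values),
        "min": min(values),
        "median": median(values),
        "mean": mean(values),
        "max": max(values),
    }


counts = interaction_counts(interactions)
most, least = top_and_bottom(counts, n=2)
summary = distribution_summary(counts)
print(most, least, summary)
```

A large gap between the median and the maximum indicates a long tail of items with few interactions, which is where co-occurrence alone has the least to work with.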