Clarify Requirements

Design a homepage video recommendation system. The business objective is to increase user engagement. Each time a user loads the homepage, the system recommends the most engaging videos. Users are located wordwide and videos can be in different languages. The are approximately 10 billion videos on the platform, and recommendations should be served quickly.

Frame the Problem as an ML Task

Defining the ML objective

Maximize the number of user clicks.
Maximize the number of completed videos.
Maximize the total watch time.
Maximize the number of relevant videos.
- Define relevance based implicit or explicit rules.

Specifying the system’s input and output

Takes a user as input, and a list of recommended videos as output.

Choosing the right ML category

Content based filtering

Use video features to recommand similar videos.

Pros
- Ability to recommend new videos.
- Ability to capture the unique interests of users.
Cons
- Difficult to discover a user’s new interest.
- Require domain knowledge and feature engineering.

Collaborative filtering (CF)

Use user-user similarities or video-video similarities to recommend new videos.

Pros
- No domain knowledge required.
- Easy to discover user’s new areas of interest.
- Faster and less compute-intensive.
Cons
- Cold-start problem.
- Cannot handle niche (special) interests.

	Content based filtering	Collaborative filtering
Handle new vidoes	Yes	No
Discover new interest areas	No	Yes
No domain knowledge	No	Yes
Efficiency	No	Yes

Hynord filtering.

Parallel: CF based + content based -> Combiner.
Sequential: CF based -> content based

Data Preparation

Data Engineering

Videos
- ID, length, tags, titles, likes, views, language, etc.
Users
- ID, username, age, gender, city, country, language, timezone, etc.
User-video interactions
- impression/click/watch/like/…, timestamps

Feature Engineering

Video features

ID: Categorical data -> Embedding layer.
Duration: Direct.
Language: Embedding layer.
Tile: Context-aware model model. BERT.
Tags: Lightweight model. CBOW.

User features

User demographics
- ID/Langugage/City/Country: Embedding
- Age: Bucketize + One-hot.
- Gender: One-hot.
Contextual information
- Day of week: One-hot.
- Time of day: Bucketize + One-hot.
- Device: One-hot.
User historical interactions
- Search/Like/Watched and Impressions.

Model Development

Matrix factorization

Feedback matrix

Explicit feedback. Sparse.
Implicit feedback. Noisy.
Combination.

Matrix factorization model

Embedding model. Dot product of a user embedding matrix and a video embedding matrix, all as trainable.

Matrix factorization training

Squared distance over observed pairs
- Ignore unobserved.
Squared distance over observed and unobserved pairs
- Treat unobserved as negative.
A weght combination of both.

Matrix factorization optimization

SGD: Minimize loss.
Weights Alternating Least Squares (WALS): Specific to matrix factorization. Fix one and optimize the other. Repeat.

Matrix factorization inference

Dot product and TopK scores. (Commonly used in collabrative filtering).

Pros
- Training speed: Two embedding matrices to learn.
- Serving speed: Fast. Embeddings are static.
Cons
- Only relies on user-video interactions.
- Handle new users are difficult.

Two-tower Neural Network

Feed video/user features to two DNN towers. Run dot product.

Constructing dataset

1 for positive. 0 for negative.

Choosing loss function

Train with cross entropy loss function.

Inference

Approximate nearest neighbot methods to find TopK similar video embeddings.

Pros
- Use video and user features.
- Handles new users.
Cons
- Slower serving.
- Expensive training.

Model Evaluation

StackeOverflow

Offline metrics

Precision@k: Measures the propagation of relevant videos among the top k recommended videos. Multiple k values (e.g. 1, 5, 10) can be used.
mAP: Measures the ranking quality of recommended video. A good fit because the relevant score are binary.
Diversity: Measures how dissimilar recommendation videos are to each other.

Online metrics

Click-through rate (CTR): $CTR = \frac{number\ of\ clicked\ videos}{total\ number\ of\ recommended\ videos}$ .
The number of completed videos.
Total watch time.
Explicit user feedback.

Serving

Candidate generation

Narrow down the videos from potentially billions to thousands. Prioritize efficiency over accuracy.
A model doesn’t relies on video features and can handle new users. A two-tower neural network is a good fit.
Use multiple candidate generation it could improve performance due to different reasons of viewing videos.

Ranking

Prioritize accuracy over efficiency.
Content-based filtering and pick a model based on video features. Larger and heavier two tower neural network.

Re-ranking

Re-ranks the videos by adding additional criteria or constraints.
Clickbait. Region restriction. Misinformation. Duplication. Faireness and bias.

Challenges

Serving speed
- 2 stage.
Precision
- More powerful model based on scoring component.
Diversity
- Multiple candidate generators.
Code-start Problem
- New users: Basic information like age, gender, etc. Recommend and refine.
- New videos: Recommend to random users and retune.
Training scalability
- The model should be easily fine-tuned.

levendlee

Chapter 6: Video Recommendation System

Clarify Requirements

Frame the Problem as an ML Task

Defining the ML objective

Specifying the system’s input and output

Choosing the right ML category

Content based filtering

Collaborative filtering (CF)

Hynord filtering.

Data Preparation

Data Engineering

Feature Engineering

Video features

User features

Model Development

Matrix factorization

Feedback matrix

Matrix factorization model

Matrix factorization training

Matrix factorization optimization

Matrix factorization inference

Two-tower Neural Network

Constructing dataset

Choosing loss function

Inference

Model Evaluation

Offline metrics

Online metrics

Serving

Candidate generation

Ranking

Re-ranking

Challenges

Leave a comment Cancel reply

Chapter 6: Video Recommendation System

Clarify Requirements

Frame the Problem as an ML Task

Defining the ML objective

Specifying the system’s input and output

Choosing the right ML category

Content based filtering

Collaborative filtering (CF)

Hynord filtering.

Data Preparation

Data Engineering

Feature Engineering

Video features

User features

Model Development

Matrix factorization

Feedback matrix

Matrix factorization model

Matrix factorization training

Matrix factorization optimization

Matrix factorization inference

Two-tower Neural Network

Constructing dataset

Choosing loss function

Inference

Model Evaluation

Offline metrics

Online metrics

Serving

Candidate generation

Ranking

Re-ranking

Challenges

Share this:

Leave a comment Cancel reply