Chapter 11: Design A News Feed System

Step 1 – Understand the problem and establish design scope

Interface
- Mobile and Web.
Features
- Publish a post and see friends' posts.
Order
- Sorted in reverse chronological order.
How many friends
- 5k
Traffic volume
- 10M DAU
Content Type
- Media (video/images).

Step 2 – Propose high-level design and get buy-in

Flows:

Feed publishing
Newsfeed building

Newsfeed APIs

Feed publishing API

HTTP POST /v1/me/feed

content
auth_token

Newsfeed retrieval API

HTTP GET /v1/me/feed

auth_token
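
As a rough illustration of the two endpoints, the sketch below models them as plain Python functions over in-memory dictionaries. The names `TOKENS`, `authenticate`, `post_feed`, and `get_feed`, the status codes, and the response shapes are all assumptions made for this example; the notes only specify the paths and the `content` and `auth_token` parameters. For brevity the post lands only in the author's own feed; fanout to friends is covered later.

```python
from itertools import count

# Hypothetical token table standing in for a real authentication service.
TOKENS = {"token-alice": "alice"}

_post_ids = count(1)
POSTS = {}  # post_id -> {"author": ..., "content": ...}
FEEDS = {}  # user_id -> list of post IDs, newest first


def authenticate(auth_token):
    """Return the user ID for a token, or None if the token is unknown."""
    return TOKENS.get(auth_token)


def post_feed(auth_token, content):
    """Sketch of POST /v1/me/feed: store the post, return (status, body)."""
    user = authenticate(auth_token)
    if user is None:
        return 401, {"error": "invalid auth_token"}
    post_id = next(_post_ids)
    POSTS[post_id] = {"author": user, "content": content}
    # Simplification: only the author's own feed is updated here.
    FEEDS.setdefault(user, []).insert(0, post_id)
    return 201, {"post_id": post_id}


def get_feed(auth_token):
    """Sketch of GET /v1/me/feed: return the caller's feed, newest first."""
    user = authenticate(auth_token)
    if user is None:
        return 401, {"error": "invalid auth_token"}
    return 200, {"posts": [POSTS[pid] for pid in FEEDS.get(user, [])]}
```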

Feed publishing

User: Web/Mobile. Post.
Load balancer: Distribute traffic.
Web servers: Redirect traffic.
Internal services
- Post service: Persist post in database and cache.
- Fanout service: Push new content to friends' news feeds. Newsfeed data is in cache for fast retrieval.
- Notification service: Inform friends that new content is available.

Newsfeed building

User: Web/Mobile. View.
Load balancer.
Web servers.
Internal service:
- Newsfeed service: Fetch news feed from the cache.
- Newsfeed cache: Store news feed IDs needed to render the news feed.

Step 3 – Design deep dive

Feed publishing deep dive

Web servers

Authentication and rate-limiting.
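
The notes do not say which rate-limiting algorithm the web servers use. One common option is a token bucket, sketched below as an assumption rather than the chapter's method. The class name `TokenBucket` and its parameters are hypothetical, and the clock is injectable so the behaviour can be checked deterministically.

```python
import time


class TokenBucket:
    """Allow up to `capacity` requests in a burst, refilled at a steady rate."""

    def __init__(self, capacity, refill_per_sec, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.clock = clock
        self.tokens = float(capacity)  # start with a full bucket
        self.last = clock()

    def allow(self):
        """Return True and consume a token if the request may proceed."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        # Add tokens earned since the last call, capped at the capacity.
        self.tokens = min(self.capacity,
                          self.tokens + elapsed * self.refill_per_sec)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

In practice a web server would keep one bucket per user or per auth token and reject requests (for example with HTTP 429) when `allow()` returns False.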

Fanout service

Fanout on write (Push model)

Newsfeed is pre-computed during write time. Delivered to friends’ cache immediately after it is published.

Pros:

Realtime. Pushed to friends immediately.
Fetching newsfeed is fast.

Cons:

Hotkey problem: For users with many friends, fetching the friend list and generating news feeds for all of them is slow.
Inactive users: Waste of compute resources.
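
A minimal sketch of the push model, assuming an in-memory friend graph and feed cache (`FRIENDS`, `FEED_CACHE`, `publish`, and `read_feed` are hypothetical names for this example). All the work happens in `publish`, which is why the cost grows with the author's friend count.

```python
from collections import defaultdict

# Hypothetical friend graph: author -> friends who should see the post.
FRIENDS = {"alice": ["bob", "carol"], "bob": ["alice"]}
FEED_CACHE = defaultdict(list)  # user_id -> post IDs, newest first


def publish(author, post_id):
    """Push the new post ID into every friend's feed at write time."""
    for friend in FRIENDS.get(author, []):
        FEED_CACHE[friend].insert(0, post_id)


def read_feed(user):
    """Reading is a single cache lookup; nothing is computed here."""
    return FEED_CACHE.get(user, [])
```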

Fanout on read (Pull model)

Newsfeed is generated on-demand during read time. Recent posts are pulled when a user loads her home page.

Pros:

Inactive users: No waste.
Not pushed so no hotkey problem.

Cons:

Fetching the newsfeed is slow, since it is not pre-computed.
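
For contrast, a sketch of the pull model under the same kind of in-memory assumptions (`FRIENDS`, `POSTS_BY_AUTHOR`, and `read_feed` are hypothetical). Publishing is just an append to the author's own timeline; the merge across friends happens on every read, which is where the latency comes from.

```python
import heapq
from itertools import islice

FRIENDS = {"bob": ["alice", "carol"]}
# author -> list of (timestamp, post_id), already sorted newest first
POSTS_BY_AUTHOR = {
    "alice": [(30, "a2"), (10, "a1")],
    "carol": [(20, "c1")],
}


def read_feed(user, limit=10):
    """Merge friends' timelines at read time, newest first."""
    timelines = [POSTS_BY_AUTHOR.get(f, []) for f in FRIENDS.get(user, [])]
    # reverse=True expects each input sorted from largest to smallest.
    merged = heapq.merge(*timelines, reverse=True)
    return [post_id for _, post_id in islice(merged, limit)]
```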

Hybrid approach

Use a push model for the majority of users.
Use a pull model for posts from celebrities and for inactive users' feeds.
Consistent hashing to distribute requests/data more evenly.
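
The push/pull split can be sketched as a decision made at write time. The function name `fanout_targets`, the `threshold` parameter, and its default value are assumptions for illustration; the notes do not give a cutoff for what counts as a celebrity.

```python
def fanout_targets(friend_ids, active_users, threshold=10_000):
    """Return the friends whose feeds should be updated at write time.

    An author with more friends than `threshold` (a hypothetical cutoff)
    is treated as a celebrity: nothing is pushed and readers pull instead.
    Inactive friends are skipped; their feed is built when they return.
    """
    if len(friend_ids) > threshold:
        return []
    return [f for f in friend_ids if f in active_users]
```

Anything not returned here has to be covered by the read path, which merges in celebrity posts and builds feeds for returning inactive users on demand.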

Fanout service workflow

1. Fetch friend IDs from the graph database.
2. Get friend info from the user cache. Filter out friends based on settings (muted, close friends, etc.).
3. Send the friends list and new post ID to the message queue.
4. Fanout workers fetch data from the message queue and store news feed data in the cache.
- Store IDs in the cache, not the original content.
- Set a configurable limit to evict old items from the cache.
5. Store the IDs in the newsfeed cache.
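
The workflow above can be sketched end to end with in-memory stand-ins. Everything named here (`GRAPH`, `USER_CACHE`, `QUEUE`, `FEED_CACHE`, `FEED_LIMIT`, and the `muted` setting layout) is a hypothetical placeholder for the graph database, user cache, message queue, and newsfeed cache; a real system would use separate services for each.

```python
from collections import deque

GRAPH = {"alice": ["bob", "carol", "dave"]}  # stand-in for the graph database
USER_CACHE = {                               # stand-in for the user cache
    "bob": {"muted": set()},
    "carol": {"muted": {"alice"}},           # carol has muted alice
    "dave": {"muted": set()},
}
QUEUE = deque()                              # stand-in for the message queue
FEED_CACHE = {}                              # user_id -> deque of post IDs
FEED_LIMIT = 3                               # hypothetical per-user cap


def enqueue_fanout(author, post_id):
    """Fetch friends, filter by settings, and enqueue the fanout job."""
    friend_ids = GRAPH.get(author, [])
    allowed = [
        f for f in friend_ids
        if author not in USER_CACHE.get(f, {}).get("muted", set())
    ]
    QUEUE.append((allowed, post_id))


def run_worker():
    """Drain the queue and write post IDs into each friend's feed."""
    while QUEUE:
        friend_ids, post_id = QUEUE.popleft()
        for f in friend_ids:
            feed = FEED_CACHE.setdefault(f, deque(maxlen=FEED_LIMIT))
            # Only the ID is stored; maxlen drops the oldest entry when full.
            feed.appendleft(post_id)
```

Using a bounded `deque` is one simple way to model the configurable limit: the newest ID goes on the left and the oldest falls off the right.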

Newsfeed retrieval deep dive

CDN to store media content.

Workflow

User: Send a retrieval request. GET /v1/me/feed.
Load balancer.
Web servers.
Newsfeed service
- Get a list of post IDs from the newsfeed cache.
- Fetch the complete user and post objects from the user cache and post cache to construct the fully hydrated newsfeed.
- Return the fully hydrated newsfeed in JSON format.
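
A sketch of the hydration step, assuming three in-memory caches. The names `FEED_CACHE`, `POST_CACHE`, `USER_CACHE`, and `get_hydrated_feed`, as well as the JSON field names, are hypothetical; the notes only say that IDs are expanded into full user and post objects.

```python
import json

FEED_CACHE = {"bob": [2, 1]}  # user_id -> post IDs, newest first
POST_CACHE = {
    1: {"author": "alice", "content": "hi"},
    2: {"author": "carol", "content": "yo"},
}
USER_CACHE = {"alice": {"name": "Alice"}, "carol": {"name": "Carol"}}


def get_hydrated_feed(user):
    """Expand cached post IDs into full objects and return them as JSON."""
    items = []
    for post_id in FEED_CACHE.get(user, []):
        post = POST_CACHE.get(post_id)
        if post is None:
            # Cache miss: a real service would fall back to the database.
            continue
        author = USER_CACHE.get(post["author"], {})
        items.append({
            "post_id": post_id,
            "content": post["content"],
            "author": {"id": post["author"], "name": author.get("name")},
        })
    return json.dumps({"posts": items})
```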

Cache architecture

Newsfeed: news feed cache.
Content: hot cache. normal.
Social graph: follower. following.
Action: liked. replied. others.
Counters: likes. replies. others.

Step 4 – Wrap Up

Scalability

Vertical scaling vs horizontal scaling.
SQL vs NoSQL.
Master-slave replication.
Consistency models.
Database sharding.

Other talking points

Web tier stateless
Cache as much as possible
Multiple datacenters
Loosely couple components with message queues
Monitor key metrics: QPS during peak hours and latency while refreshing newsfeed.

levendlee
