0. Overview
- Web scaling
- Data centers
- Users/servers bound into groups
- Load Balancer
- Running multiple servers
- Stateless
- Servers detached from sessions
- Message Queue
- Task server detached from web servers
- Data centers
- Database scaling
- Replication
- Master/slave to handle store/read
- Sharding
- Distributes workload to different dbs
- Replication
- Cache Improvements
- Generic cache.
- Expiration, evictions, synchronizations.
- CDN
- Cache static data w/o web servers.
- Generic cache.
- Logging/Metrics/Automation
1. Single Server Setup
Request Flow
- Users -> domain name -> DNS (Domain Name System) Server.
- DNS -> IP (Internet Protocol) address -> Users.
- Users -> HTTP (Hypertext Transfer Protocol) Request -> Web Server.
- Web Server -> HTML / JSON -> Users.
Traffic Source
- Web Application
- Server side (Java, Python, etc.) to handle business logic, and storage.
- Client side (HTML, JavaScript, etc.) for presentation.
- Mobile Application
- HTTP is a communication protocol.
- JSON(JavaScript Object Notation) is response format.
2. Database
Separate web/mobile traffic (web tier) and database (data tier).
Relational Database
- Relational database management system (RDBMS) or SQL database.
- MySQL, Oracle database, PostgreSQL, etc.
- Presents and stores data in tables and rows. Perform join operations across different database tables via SQL.
Non-Relational Database
- NoSQL database.
- CouchDB, Neo4j, Cassandra, HBase, Amazon DynamoDB.
- Categories:
- Key-value stores.
- Graph stores.
- Column stores.
- Dodument stores.
- Join operations are not supported.
What to use
Choose relational by default. Use non-relational if:
- Application requires super-low latency.
- Unstructured data.
- Serialize and deserialize.
- Store massive amounts of data.
3. Vertical scaling vs Horizontal scaling
- Vertical scaling
- Scale up, adding more power (CPU, RAM, etc) to your servers.
- Pros:
- Simplicity. Suitable when traffic is low.
- Cons:
- Hard limit. Cannot add unlimited power to a server.
- No failover and redundancy. One server goes down, and the web/app goes done.
- Horizontal scaling
- Scale out, adding more servers to your pool of resources.
4. Load Balancer (Web tier scaling)
A load balancer evenly distributes incoming traffic among web servers that are defined in a load-balanced set.
- Users connect to the load balancer through public IP.
- Load balancer communicates with web servers through private IPs.
How it resolves problems of failover and availability:
- One server is offline, traffic goes to other servers, while a new server is being added.
- Too much traffic, add new servers, load balancer automatically sends requests to them.
5. Database Replication (Data tier scaling)
Master (original) / salve (copies) relationship.
- Master only supports write.
- Data modifying operations send to master, e.g. insert, delete, update, etc.
- Slaves only supports read operations.
- More slaves than master as higher ratio of reads than writes.
Advantages:
- Better performance: Parallel query.
- Reliability: Data is replicated across all locations (masters/slaves).
- Availability: Access data stored in another database if offline.
- Slave goes offline.
- One slave. Reads redirected to master.
- Multiple slaves. Reads redirected to healthy slaves.
- Adds new slave.
- Master goes offline.
- One slave was promoted to a new master. Add a new slave.
- Slave data is not up to date.
- Recover missing data by running data recovery scripts.
- Or multi-master and circular replication.
- Slave data is not up to date.
- One slave was promoted to a new master. Add a new slave.
- Slave goes offline.
6. Cache (Performance Improvement)
Cache tier
- Web server -> Request -> Cache hit -> Web server
- Web server -> Request -> Cache miss -> Database -> Cache store -> Web server (Read through)
Pros:
- Better performance,
- Reduce database workloads.
- Scale cache tier independently.
Considerations:
- Use when data is read frequently but modified infrequently. Cache data is stored in volatile memory, not ideal for persisting data.
- Consistency. Keeping data store and cache in sync. Inconsistency when not transactional.
- Expiration policy. Too long -> data become stale. Too short -> reload from database frequently.
- Evication policy. LRU (least recently used). LFU (least frequently used). FIFO.
- Migrating failures. Multiple cache servers across different data centers to avoid a single point of failure. (SPOF) or overprovision the required memory.
7. Content delivery network (CDN. Cache w/o web tier)
A network of geographically dispersed servers used to deliver static content (image, videos, CSS, JavaScript. etc). (Skipping web servers)
Workflow:
- User access image by URL.
- CDN server requires image form origin.
- The origin returns the image to CDN server, including HTTP header Time-to-Live(TTL).
- CDN caches the image until TTL and returns.
- Following user requests get image from CDN server.
Considerations:
- Cost: CDN are run by third-party providers.
- Expiration policy.
- CDN fallback: If CDN outage, users detect problem and request origin.
- Invalidating files:
- Using APIs from CDN vendors.
- Using object versioning, by adding a parameter to URL.
8. Stateless web tier (Web tier scaling)
To scale web tier horizontally, move the state out of web tier by storing session data in persistent storage (SQL/NoSQL).
Stateful architecture
All requests from the same client must be routed to the same server. Can be done with sticky sessions in load balancers, but add overhead on adding/removing servers.
Stateless architecture
HTTPs requests from users can be sent to any web server. Fetch data from a shared data storage. Simpler, robust and scalable.
Use NoSQL as it is easy to scale.
9. Data Ceneters (Web/Data/Cache tier scaling)
Users are geoDNS-routed, or geo-routed, to the closest data ceneter. In the event of significant data center outage, redirect to a healthy data center.
- Server, database, Cache, are spread across data centers.
- Session data (NoSQL), is shared across data centers.
Challenges:
- Traffic redirection: Direct users to correct data center. GeoDNS.
- Data synchronization:
- Healthy: Different users from different regions use different local databases or caches.
- Failover: Traffic is migrated where data is unavailable. Replicate data.
- Test and deployment: Test on different locations. Automated deployment tools.
10. Message Queue (Web tier horizontal scaling and communication)
Durable, in memory component, to support asynchronous communication. Buffer to distributes asynchronous requests.
- Producers/Publishers: Create messages, publish to queue.
- Consumers/Subscribers: Listen to queue, perform actions defined by the messages.
Decoupling:
- Producer (web server) post message when consumer is unavailable.
- Consumer (dedicated worker) read message when producer is unavailable.
- Producer and consumer can be scaled independently.
11. Logging, Metrics, Automation
-
Logging: Mointor error.
- Per server level.
- Use tools to aggregate them to a centralized service.
-
Metrics: Gain business insights and understand health status.
- Host level metrics: CPU/Memory/Disk
- Aggregated level metrics: Database tier, cache tier, etc.
- Key business metrics: Daily active users, revenue, etc.
-
Automation: CICD. Build, test, deployment.
12. Database Scaling
Vertical Scaling
Scaling up. Adding more power to an existing machine. Example: Amazon Relational Database Service (RDS) 24TB of RAM. Cons:
- Hardware limits.
- SPOF.
- Cost.
Horizontal Scaling
Scaling out/Sharding. Separates large databases into smaller, easily managed parts called shards, sharing schema but unique data.
Choose sharding key: Help to route data to correct database. Choose one that can evenly distribute data.
Challenges:
- Resharding data: Updating sharing and moving data. -> Consistent sharding.
- A single shard could no longer hold more data due to rapid growth.
- Certian shards exprience shard exhaustion faster due to uneven distribution.
- Celebrity problem: Excessive access to a specific shard cause server overload. -> Allocate shard for each celebrity, even further partition.
- Join and de-normalization: Hard to perform join operations across database shards. -> De-normalize so queries can be performed in a single table.
13. Millions of users and beyond
- Keep web tier stateless
- Build redudancy at every tier
- Cache data as much as you can
- Support multiple data centers
- Host static assets in CDN
- Scale your data tier by sharding
- Spit tiers into individual services
- Monitor your system and use automation tools
Leave a comment