
Design Location-Based Service

How to design Uber

  1. Uber Techniques
    1. RingPop: application-layer sharding via consistent hashing and gossip, so each node serves requests and owns its slice of data (web server and database colocated).
    2. TChannel: network multiplexing and framing protocol for RPC.
    3. Google S2: Location-based storage and query
    4. Riak: open-source Dynamo-style key-value store.
  2. Load Balancing (web servers)
    1. No need for consistent hashing; the web servers are stateless, so pick:
      1. Randomly (round-robin).
      2. Based on the web server's current workload.
    2. Implementation
      1. Hardware load balancer.
      2. Software: Nginx (sketched below).
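
A minimal sketch of the software option, assuming hypothetical backend addresses: the default (no directive) gives round-robin spreading, while `least_conn` picks by current workload.

```nginx
# Hypothetical upstream pool of stateless web servers.
upstream web_servers {
    least_conn;              # pick the server with the fewest active connections
    server 10.0.0.1:8080;
    server 10.0.0.2:8080;
    server 10.0.0.3:8080;
}

server {
    listen 80;
    location / {
        proxy_pass http://web_servers;
    }
}
```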
  3. Scenario
    1. Features
      1.  Basic
        1. Driver reports location: heartbeat, one report every 4 seconds.
        2. Rider requests a ride; match a driver with the rider.
      2. Improve:
        1. Deny / Accept
        2. Cancel
        3. Pick up / Drop off
      3. Add-On
        1. Pool / Eats
    2. Requirement
      1. Ask about the number of drivers, the number of riders, and the service area.
      2. Driver: assume 200K online. Avg QPS = 200K / 4 = 50K. Peak QPS = 50K * 3 = 150K. (DB writes; worked below.)
      3. Rider: negligible by comparison.
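
The back-of-envelope estimate above as a tiny sketch, with the numbers taken from these notes:

```python
# Driver location writes: 200K drivers online, one report every 4 s,
# peak assumed to be 3x the average.
drivers = 200_000
interval_s = 4
avg_qps = drivers / interval_s   # 50,000 writes/s
peak_qps = avg_qps * 3           # 150,000 writes/s
print(avg_qps, peak_qps)
```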
  4. Service
    1. Basic Structure: both services talk to users
      1. Driver <-> Geo Service
      2. Rider  <-> Dispatch Service
    2. Improved Structure: only the Dispatch Service talks to users
      1. Driver <-> Dispatch Service <-> Geo Service (driver saves location)
      2. Rider  <-> Dispatch Service <-> Geo Service (rider finds drivers nearby)
  5. Storage
    1. Data
      1. Dispatch Service = Dispatch Web Server + Trip Table Database (Read > Write)
      2. Geo Service = Geo Web Server + Location Table (Write > Read) + Driver Table
    2. Schema
      1. Trip Table: trip_id; driver_id; rider_id; latitude; longitude; status; created_at;
      2. Location Table: driver_id; latitude; longitude; updated_at;
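
A minimal SQL sketch of the two schemas (column types are illustrative assumptions), plus the naive range query discussed next:

```sql
CREATE TABLE trip (
    trip_id    BIGINT PRIMARY KEY,
    driver_id  BIGINT,
    rider_id   BIGINT,
    latitude   DOUBLE,
    longitude  DOUBLE,
    status     TINYINT,       -- e.g. created / matched / accepted / finished
    created_at TIMESTAMP
);

CREATE TABLE location (
    driver_id  BIGINT PRIMARY KEY,
    latitude   DOUBLE,
    longitude  DOUBLE,
    updated_at TIMESTAMP
);

-- Naive nearby-drivers query: a 2D range scan, slow without a spatial index.
SELECT driver_id
FROM location
WHERE latitude  BETWEEN 37.74 AND 37.80
  AND longitude BETWEEN -122.45 AND -122.39;
```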
    3. Query Location:
      1. Naive Solution: SQL range query on latitude and longitude. Slow without a spatial index (see the query in the schema sketch above).
      2. Geo Service:
        1. Google S2.
          1. Hilbert Curve 
          2. Maps 2D space to 1D space (the globe is divided into cells addressed by 64-bit IDs, i.e. up to 2^64 values).
          3. If two points are close on the 1D curve, they are close in 2D as well; closeness in 2D is mostly, though not always, preserved in 1D.
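
A small sketch with the s2sphere package (an assumed-available Python port of Google S2): a lat/lng maps to a 64-bit cell ID, i.e. a position on the Hilbert curve.

```python
import s2sphere  # assumed: pip install s2sphere

lat, lng = 37.7749, -122.4194                    # San Francisco
leaf = s2sphere.CellId.from_lat_lng(
    s2sphere.LatLng.from_degrees(lat, lng))      # level-30 leaf cell
cell = leaf.parent(12)                           # coarser cell for indexing

print(leaf.id())   # 64-bit integer: position on the Hilbert curve
print(cell.id())   # nearby points share coarse ancestor cells
```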
        2. Geohash
          1. Z-order (Peano-type) curve: repeated two-way division.
          2. Base32: 0-9 plus b-z (a, i, l, o removed).
          3. If two geohash strings share a common prefix, the points are close; the converse can fail at cell boundaries.
          4. Details:
            1. The longer the common prefix, the smaller the maximum distance.
            2. Calculate
              1. First cut on longitude: left (west) half 0, right (east) half 1.
              2. Then cut on latitude: bottom (south) half 0, top (north) half 1.
              3. Alternate longitude and latitude cuts.
              4. Every 5 bits form one base32 character (encoder sketched below).
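
A minimal encoder following exactly these steps, to make the bit-interleaving concrete:

```python
# Geohash alphabet: 0-9 plus b-z with a, i, l, o removed.
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def geohash_encode(lat: float, lng: float, precision: int = 6) -> str:
    lat_lo, lat_hi = -90.0, 90.0
    lng_lo, lng_hi = -180.0, 180.0
    bits, bit_count, result = 0, 0, []
    even = True  # start with a longitude cut
    while len(result) < precision:
        if even:
            mid = (lng_lo + lng_hi) / 2
            if lng >= mid:              # right (east) half -> 1
                bits, lng_lo = (bits << 1) | 1, mid
            else:                       # left (west) half -> 0
                bits, lng_hi = bits << 1, mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:              # top (north) half -> 1
                bits, lat_lo = (bits << 1) | 1, mid
            else:                       # bottom (south) half -> 0
                bits, lat_hi = bits << 1, mid
        even = not even
        bit_count += 1
        if bit_count == 5:              # every 5 bits -> one base32 character
            result.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(result)

print(geohash_encode(37.7749, -122.4194))  # "9q8yyk" for San Francisco
```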
          5. Implementation
            1. Method
              1. SQL: index a geohash column; prefix LIKE query.
              2. NoSQL (Cassandra): geohash as the column key; range query.
              3. NoSQL (Redis / Memcached): store driver sets under 4/5/6-character prefixes (cells roughly 600 m – 20 km across); on each location update, delete the driver from the old cell and add it to the new one.
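
The SQL variant in two statements, assuming a geohash column has been added to the location table; '9q8yy' is an illustrative 5-character prefix:

```sql
CREATE INDEX idx_location_geohash ON location (geohash);

-- All drivers whose geohash starts with the rider's 5-character prefix.
SELECT driver_id
FROM location
WHERE geohash LIKE '9q8yy%';
```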
            2. Choice
              1. SQL: workable but slow
                1. LIKE queries are slow.
                2. Drivers update locations constantly; the indexed column keeps changing, forcing B+ tree updates. Low efficiency.
              2. NoSQL (Cassandra): workable but slow
                1. The column key is indexed and shouldn't change frequently.
              3. NoSQL (Memcached): bad
                1. Not persistent.
                2. Doesn't support sets.
              4. NoSQL: Redis
                1. QPS > 100K 
                2. Persistent
                3. Supports list, set, etc. (a set of drivers per cell).
            3. Redis Implementation
              1. Rider: Location -> GeoHash -> Nearby Drivers
                1. Location Table (Frequent write)
                  1. Key: geohash
                  2. Value: driver ids
                2. Driver Table (Frequent write)
                  1. Key: driver_id
                  2. Value: lat, lng, status, updated_at, trip_id
              2. Driver:
                1. Report Location: compute the geohash -> update the Location Table and Driver Table if the geohash changed.
                2. Accept Booking: change the trip status; mark the driver unavailable in the Driver Table.
                3. Finish Trip: change the trip status; mark the driver available again.
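
A sketch of the two Redis tables using redis-py (assumed installed); key names are illustrative, and `geohash_encode` is the encoder sketched earlier:

```python
import time
import redis  # assumed: pip install redis

r = redis.Redis()

def report_location(driver_id: str, lat: float, lng: float) -> None:
    """Driver heartbeat: move the driver between geohash cells if needed."""
    new_cell = geohash_encode(lat, lng, precision=5)
    old_cell = r.hget(f"driver:{driver_id}", "geohash")
    if old_cell is not None and old_cell.decode() != new_cell:
        r.srem(f"geo:{old_cell.decode()}", driver_id)   # leave the old cell
    r.sadd(f"geo:{new_cell}", driver_id)                # join the new cell
    r.hset(f"driver:{driver_id}", mapping={
        "lat": lat, "lng": lng, "geohash": new_cell,
        "updated_at": time.time(),
    })

def nearby_drivers(lat: float, lng: float) -> set:
    """Rider query: all driver ids in the rider's geohash cell."""
    return r.smembers(f"geo:{geohash_encode(lat, lng, precision=5)}")

def set_driver_status(driver_id: str, status: str, trip_id: str = "") -> None:
    """Accept booking / finish trip: flip availability in the driver table."""
    r.hset(f"driver:{driver_id}", mapping={"status": status, "trip_id": trip_id})
```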
        3. Overall
          1. Trip table (SQL)
            1. Query by trip_id, rider_id, driver_id, etc.
          2. Location table / Driver table (Redis)
  6. Overall Solution
    1. The rider sends a request; the server creates a trip.
      1. Return the trip id to the rider.
      2. The rider periodically polls the server for the match result (or keeps a socket open).
    2. The server finds a matching driver and writes the driver id into the trip.
      1. Mark the driver unavailable.
    3. The driver reports his location.
      1. The server sees an allocated trip id in the driver table.
      2. It looks up the trip in the trip table and returns it to the driver.
    4. The driver accepts.
      1. Update the driver table and the trip table.
      2. The rider's next poll returns the matched driver's information.
    5. The driver declines.
      1. Update the driver table and the trip table.
      2. Go back to step 2 and rematch. (Whole flow sketched below.)
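
A self-contained, in-memory sketch of the five steps above; real storage would be the trip table (SQL) and the Redis driver/location tables, and matching would use the nearby-drivers lookup instead of scanning all drivers:

```python
import itertools

trips, drivers = {}, {}            # trip_id -> trip, driver_id -> driver
_trip_ids = itertools.count(1)

def request_ride(rider_id, lat, lng):
    # 1. Create the trip and return the trip_id; the rider polls
    #    (or holds a socket) for the match result.
    trip_id = next(_trip_ids)
    trips[trip_id] = {"rider_id": rider_id, "driver_id": None,
                      "lat": lat, "lng": lng, "status": "created"}
    return trip_id

def match(trip_id):
    # 2. Pick an available driver, link them to the trip, mark unavailable.
    trip = trips[trip_id]
    for driver_id, d in drivers.items():
        if d["status"] == "available":
            trip["driver_id"], trip["status"] = driver_id, "matched"
            d["status"], d["trip_id"] = "dispatched", trip_id
            return driver_id

def on_driver_heartbeat(driver_id, lat, lng):
    # 3. The location report doubles as the channel for trip offers.
    d = drivers.setdefault(driver_id, {"status": "available", "trip_id": None})
    d["lat"], d["lng"] = lat, lng
    return trips.get(d["trip_id"]) if d["trip_id"] else None

def on_driver_reply(driver_id, trip_id, accepted):
    # 4. Accept: update both tables; the rider's next poll sees the match.
    # 5. Decline: free the driver, reset the trip, and rematch.
    if accepted:
        trips[trip_id]["status"] = "accepted"
    else:
        drivers[driver_id].update(status="available", trip_id=None)
        trips[trip_id].update(driver_id=None, status="created")
        match(trip_id)
```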
  7. Scale
    1. DB Sharding
      1. A single Redis server can go down
        1. Run multiple Redis / Cassandra nodes.
      2. Purpose
        1. Spread the QPS across shards.
        2. Avoid a single point of failure.
      3. Sharding Key
        1. City ID.
        2. Decide which city a location belongs to via geo-fence.
        3. Include nearby cities in queries for riders near a border (sketched below).
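
A sketch of city-keyed sharding; `NEARBY` is a hypothetical adjacency map so that queries near a border also hit neighboring cities' shards:

```python
import hashlib

NUM_SHARDS = 16
NEARBY = {"san_francisco": ["oakland", "daly_city"]}   # illustrative

def shard_for(city_id: str) -> int:
    # Deterministic hash of the city id picks the shard.
    digest = hashlib.md5(city_id.encode()).digest()
    return digest[0] % NUM_SHARDS

def shards_to_query(city_id: str) -> set:
    # Include nearby cities so riders near a border see their drivers too.
    return {shard_for(c) for c in [city_id, *NEARBY.get(city_id, [])]}
```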
    2. Check if the rider is at an airport
      1. Geo-fence: represent the airport as a polygon.
      2. First determine the city, then check the airport fences within it (point-in-polygon sketch below).
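
The core of a geo-fence check is a point-in-polygon test; a standard ray-casting sketch:

```python
def in_fence(lat: float, lng: float, polygon: list) -> bool:
    """polygon: list of (lat, lng) vertices, e.g. an airport boundary."""
    inside = False
    n = len(polygon)
    for i in range(n):
        (lat1, lng1), (lat2, lng2) = polygon[i], polygon[(i + 1) % n]
        # Does a ray cast from the point cross this edge?
        if (lng1 > lng) != (lng2 > lng):
            cross_lat = lat1 + (lng - lng1) / (lng2 - lng1) * (lat2 - lat1)
            if lat < cross_lat:
                inside = not inside
    return inside
```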
    3. How to reduce the impact of a database crash
      1. Replication
        1. Replication by Redis: master/slave (one line of config, below).
        2. Replication by yourself: virtual sharding.
        3. Replication by NoSQL: Riak / Cassandra.
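
The Redis option is a single directive on each replica (master address illustrative):

```
# redis.conf on the replica
replicaof 10.0.0.1 6379
```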
