
System Design — Design Uber / Ride Sharing (Time-Scale Tiers and Spatial Real-Time Matching)

The fun thing about “design Uber” as an interview question is that it’s the first system design problem most candidates encounter where real-time location is central. WhatsApp is about delivering bytes from one user to another. Instagram is about choosing what to show. Uber is about spatially matching two moving users in real time — a fundamentally different problem class, and the architecture reflects that.

The trap candidates fall into is treating Uber as “a messaging app with maps.” It isn’t. The dominant constraints are different: you’re joining geospatial indices in real time, you’re streaming location updates at high frequency, you’re matching supply and demand on a moving surface, and consistency requirements vary wildly across subsystems. Understand those constraints, and the architecture falls out naturally.

This post is the full walk-through, organized around the system’s time-scale tiers rather than a flat list of services. It has a different shape from my WhatsApp and Instagram posts because Uber’s system has a natural layered structure: subsystems operating at hundreds of milliseconds, seconds, minutes, and hours all coexist, each with different consistency and durability requirements. Get those tiers right, and the rest writes itself.


Step 1: Scope — What Are We Designing?

Uber the product is enormous: rides, food delivery, freight, scooters, transit, ads, payments. For a 45–60 minute interview, scope to the core ride-hailing flow.

You: “Should I focus on ride-hailing or also include Uber Eats and other products?”

Interviewer: “Just ride-hailing.”

You: “What about pricing — design surge pricing or assume a fixed-rate model?”

Interviewer: “Skip dynamic pricing. Assume a simple fare calculation.”

Functional scope: rider opens app, sees nearby drivers on a map, requests a ride to a destination, gets matched with a driver, both see each other’s location during pickup, ride happens, both see the ride in progress, payment processes at the end. Skip: surge pricing, multiple ride types beyond “UberX,” carpooling, scheduled rides, driver onboarding.

Non-functional: matching latency under 5 seconds (the rider sees nearby drivers within ~1s and gets matched within ~5s of requesting), location updates rendering at >1Hz on each side during pickup and the ride, and exactly-once payment: no fare charged twice, none missed. The system tolerates eventual consistency in some places (analytics, completed-trip history) but demands strong consistency in others (trip state, payment).


Step 2: Numbers — The Scale Pressure Points

  • ~150M monthly active riders, ~6M active drivers globally
  • ~25M rides/day at peak globally
  • Active drivers are streaming location every ~4 seconds while online
  • That’s 6M × 0.25 = 1.5M location writes/sec at peak (drivers, not riders)
  • Riders send location at lower frequency (every 10s) when looking for rides
  • Active trips at any moment: ~1M globally during peak hours
  • Each active trip generates ~2 location updates/sec across the two parties

The pressure points to flag:

Geospatial query rate. Every rider opening the app does “find nearby drivers within 5km.” That’s a spatial query against a moving index of millions of drivers. Neither Cassandra nor a vanilla Postgres table can serve this efficiently at this rate — you need a purpose-built spatial index.

Location write rate. 1.5M writes/sec is large but not extreme. The problem isn’t volume — it’s that each write needs to update a spatial index the system queries against in real time. Index update + read latency together is what matters.

Trip-state strong consistency. A trip lifecycle (requested → matched → picked up → in progress → completed → paid) must be linearizable. You can’t have rider and driver disagreeing on whether the trip is still active — that’s a billing nightmare.

Geographic distribution. Drivers in Mumbai don’t need to be matched against riders in São Paulo. The system can be heavily regionalized, which simplifies scaling.

You don’t need exact numbers. You need to flag that spatial queries and real-time location streaming are the dominant scaling problems — not request volume, not storage. That’s the framing that earns the senior tick.


Step 3: The Time-Scale Tiers

Here’s where this design gets interesting. Subsystems operate at fundamentally different timescales:

┌────────────────────────────────────────────────────────────────────────┐
│ Tier              │ Frequency        │ Consistency      │ Storage      │
├───────────────────┼──────────────────┼──────────────────┼──────────────┤
│ Real-time loc.    │ 100ms–4s         │ Eventual,        │ In-memory    │
│ (driver/rider     │                  │ best-effort      │ + spatial    │
│  position)        │                  │                  │ index        │
├───────────────────┼──────────────────┼──────────────────┼──────────────┤
│ Matching          │ 1–5s             │ Strong (within   │ In-memory    │
│ (find driver,     │                  │ the match), but  │ + transient  │
│  offer)           │                  │ retry-tolerant   │ state        │
├───────────────────┼──────────────────┼──────────────────┼──────────────┤
│ Trip lifecycle    │ Seconds–minutes  │ Strongly         │ Postgres /   │
│ (state machine,   │                  │ consistent       │ Spanner      │
│  ETAs)            │                  │ per trip         │              │
├───────────────────┼──────────────────┼──────────────────┼──────────────┤
│ Payment           │ End of trip      │ Exactly-once     │ Transactional│
│                   │                  │                  │ DB           │
├───────────────────┼──────────────────┼──────────────────┼──────────────┤
│ Trip history /    │ Async after trip │ Eventually       │ Cassandra /  │
│ analytics         │                  │ consistent, OK   │ data lake    │
│                   │                  │ to be minutes    │              │
│                   │                  │ behind           │              │
└────────────────────────────────────────────────────────────────────────┘

Each tier wants a different storage technology, different replication model, different durability guarantees. Trying to do all of this with one database is the classic mistake.

The senior signal: organizing the architecture conversation around these tiers, instead of around “here are 8 services I’ll explain,” demonstrates you understand why the system has the shape it does. Spatial latency demands in-memory; trip state needs strong consistency; payment needs transactions; analytics needs durability and queryability but tolerates lag.


Step 4: The Real-Time Tier — Spatial Indexing

The single hardest piece of Uber’s system. Drivers stream location at 4-second intervals; the system maintains an index of where every active driver is right now; riders query that index every time they open the app or refresh the map.

The naive approach: each location write updates a row in Postgres, and riders run SELECT * FROM drivers WHERE ST_DWithin(...). That falls over at hundreds of concurrent queries and is impossible at production scale.

The right approach: a geohash-based or H3-based spatial index in memory.

Geohash divides the world into a hierarchical grid of cells. A geohash string like 9q8yy identifies a specific cell at a specific resolution; 9q8yyk is a sub-cell. To find drivers near a rider:

// Conceptual flow
1. Compute the rider’s geohash at, say, resolution 6 (~1.2km cell)
2. Look up that cell + the 8 neighboring cells in the spatial index
3. The index returns: list of drivers currently in those 9 cells
4. Filter to drivers within 5km Euclidean distance, sort by distance
5. Return the top N

The index itself is in memory (Redis with the GEO commands, or a custom in-memory store sharded by geohash prefix). Every driver location write becomes:

1. Compute new geohash for the driver
2. If the geohash changed (driver moved across a cell boundary):
   a. Remove driver from old cell’s set
   b. Add driver to new cell’s set
3. If geohash didn’t change, just update the lat/lng in the cell
4. Eventual durability: async write to a durable store (S3 / Cassandra)
   for replay/recovery if the in-memory store loses state
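
To make both paths concrete, here’s a minimal single-node sketch in Kotlin. A flat lat/lng grid stands in for geohash/H3 cells, and the class and field names are illustrative, not Uber’s:

import java.util.concurrent.ConcurrentHashMap
import kotlin.math.*

// A flat lat/lng grid stands in for geohash/H3 cells; real systems key cells
// by geohash prefix or H3 index instead.
data class Cell(val row: Int, val col: Int)
data class DriverPos(val driverId: String, val lat: Double, val lng: Double)

class SpatialIndex(private val cellDegrees: Double = 0.05) {  // ~5.5km north-south
    private val cells = ConcurrentHashMap<Cell, ConcurrentHashMap<String, DriverPos>>()
    private val driverCell = ConcurrentHashMap<String, Cell>()

    private fun cellFor(lat: Double, lng: Double) =
        Cell(floor(lat / cellDegrees).toInt(), floor(lng / cellDegrees).toInt())

    // Write path: cell membership changes only on a boundary crossing.
    fun update(pos: DriverPos) {
        val newCell = cellFor(pos.lat, pos.lng)
        val oldCell = driverCell.put(pos.driverId, newCell)
        if (oldCell != null && oldCell != newCell) {
            cells[oldCell]?.remove(pos.driverId)  // left the old cell
        }
        cells.computeIfAbsent(newCell) { ConcurrentHashMap() }[pos.driverId] = pos
    }

    // Read path: scan the rider's cell plus its 8 neighbors, filter, rank.
    fun nearby(lat: Double, lng: Double, radiusKm: Double = 5.0, limit: Int = 10): List<DriverPos> {
        val c = cellFor(lat, lng)
        return (-1..1).flatMap { dr -> (-1..1).flatMap { dc ->
            cells[Cell(c.row + dr, c.col + dc)]?.values.orEmpty()
        } }
            .filter { haversineKm(lat, lng, it.lat, it.lng) <= radiusKm }
            .sortedBy { haversineKm(lat, lng, it.lat, it.lng) }
            .take(limit)
    }

    private fun haversineKm(lat1: Double, lng1: Double, lat2: Double, lng2: Double): Double {
        val dLat = Math.toRadians(lat2 - lat1)
        val dLng = Math.toRadians(lng2 - lng1)
        val a = sin(dLat / 2).pow(2) +
                cos(Math.toRadians(lat1)) * cos(Math.toRadians(lat2)) * sin(dLng / 2).pow(2)
        return 6371.0 * 2 * asin(sqrt(a))
    }
}

The cell size here is chosen so the 3×3 neighborhood covers the query radius; with finer cells you’d scan a wider ring of neighbors.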

Real Uber uses H3 (Uber’s open-source hexagonal hierarchical spatial index) instead of geohash for better distance properties — geohash cells aren’t equidistant from their neighbors, which matters when computing nearby drivers. Knowing this exists earns a bonus point; you don’t need to design H3 from scratch.

Sharding: the spatial index shards by geohash prefix. Drivers in Mumbai live on one set of nodes; drivers in San Francisco live on another. Cross-region queries don’t happen. This is one of the key insights that makes the system scalable — geographic locality is built into the data model.
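
The routing itself can be a few lines. A sketch assuming geohash-string cell IDs and a fixed shard count (a real deployment would resolve shards through a shard map behind the mesh):

// Same prefix → same shard, so a city's drivers and riders stay colocated.
// A 3-character geohash prefix covers roughly a 156km × 156km cell.
fun shardFor(geohash: String, shardCount: Int): Int =
    Math.floorMod(geohash.take(3).hashCode(), shardCount)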


Step 5: The Matching Tier — A Distributed State Machine

When a rider requests a trip, the system needs to:

  1. Find candidate drivers (spatial query against the real-time tier)
  2. Score them (proximity, ETA, ratings, recency of last trip)
  3. Offer the trip to the top candidate
  4. If the driver accepts within ~10 seconds, the match is locked
  5. If they decline or time out, offer the next candidate
  6. If no driver accepts after several rounds, return failure to rider

The hardest part: this is a distributed transaction across multiple parties (rider, driver, possibly multiple candidate drivers offered in succession), and the state needs to be consistent. If two riders both get matched to the same driver simultaneously, that’s a billing crisis.

The mechanism: the matching service holds an in-memory lock on each driver while they’re being offered a ride. The lock has a 10-second TTL. When the driver accepts, the lock converts into a confirmed match (persisted to the strong-consistency trip state DB). When they decline, the lock releases; another rider can now match them.

The matching service is regional — each region has its own matching cluster. Within a region, matching is centralized enough that the locking is straightforward; cross-region rides are rare enough to handle as a special case.

Senior signal: recognizing that matching is a distributed lock with TTL, not a database transaction, is the senior insight. Trying to do matching as a Postgres transaction means “take a row lock for 10 seconds while we wait for the driver app to respond” — which doesn’t scale. The right tool is Redis or a similar in-memory key-value store with TTL semantics.
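
As a minimal sketch, assuming a store that exposes set-if-absent-with-TTL and compare-and-delete (in Redis terms, SET key value NX PX for acquire and a Lua compare-and-delete for release):

// Generic TTL-lock surface; interface and names are illustrative.
interface TtlLockStore {
    fun setIfAbsent(key: String, value: String, ttlMillis: Long): Boolean
    fun deleteIfValueMatches(key: String, expectedValue: String): Boolean
}

class OfferLocks(private val locks: TtlLockStore) {
    // Reserve the driver for one rider's offer window. Fails fast if another
    // rider's offer is already in flight for this driver.
    fun tryOffer(driverId: String, rideRequestId: String): Boolean =
        locks.setIfAbsent(
            key = "driver-offer:$driverId",
            value = rideRequestId,  // who holds the lock
            ttlMillis = 10_000      // the offer window; auto-expires on crash
        )

    // Decline / timeout / cancel: release only if we still hold the lock, so
    // we never free a lock that expired and was re-acquired for another rider.
    fun releaseOffer(driverId: String, rideRequestId: String): Boolean =
        locks.deleteIfValueMatches("driver-offer:$driverId", rideRequestId)

    // On accept: persist the confirmed match to the trip DB before the TTL
    // expires; from then on the durable trip row, not the lock, is the truth.
}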


Step 6: The Trip Lifecycle Tier — The State Machine

Once a match is confirmed, the trip lifecycle takes over. This subsystem manages the canonical state of every active trip:

Trip states:
REQUESTED   → rider asked for a ride
MATCHED     → driver accepted, en route to pickup
ARRIVED     → driver at pickup location
IN_PROGRESS → rider in the car, driving to destination
COMPLETED   → arrived at destination
CANCELLED   → either party cancelled
PAYMENT_PENDING / PAID / FAILED → payment states

Transitions are strict. Can’t go from REQUESTED directly to COMPLETED. Can’t skip ARRIVED. Can’t un-cancel.
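
The legality check itself is tiny. A sketch of the lookup table the trip service might enforce, using the state names above (exactly which states allow cancellation is a product decision; this table is an assumption):

// Target states each state may legally move to.
val allowedTransitions: Map<String, Set<String>> = mapOf(
    "REQUESTED"   to setOf("MATCHED", "CANCELLED"),
    "MATCHED"     to setOf("ARRIVED", "CANCELLED"),
    "ARRIVED"     to setOf("IN_PROGRESS", "CANCELLED"),
    "IN_PROGRESS" to setOf("COMPLETED"),
    "COMPLETED"   to emptySet(),  // terminal; payment tracked on its own column
    "CANCELLED"   to emptySet()   // terminal; can't un-cancel
)

fun isLegal(from: String, to: String): Boolean =
    to in allowedTransitions[from].orEmpty()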

The storage: Postgres or Spanner, with each trip row having strong-consistency guarantees. Both rider and driver apps read this same row to know the current state; both write transitions through the trip service. The trip service serializes transitions to prevent races (driver “arrived” and rider “cancelled” happening simultaneously).

CREATE TABLE trips (
    trip_id          UUID PRIMARY KEY,
    rider_id         UUID NOT NULL,
    driver_id        UUID NOT NULL,
    state            TEXT NOT NULL CHECK (state IN (
        'REQUESTED', 'MATCHED', 'ARRIVED', 'IN_PROGRESS',
        'COMPLETED', 'CANCELLED'
    )),
    payment_status   TEXT,  -- PAYMENT_PENDING / PAID / FAILED; tracked
                            -- independently of trip state (see Step 8)
    pickup_lat       DOUBLE PRECISION,
    pickup_lng       DOUBLE PRECISION,
    dropoff_lat      DOUBLE PRECISION,
    dropoff_lng      DOUBLE PRECISION,
    fare_cents       INTEGER,
    requested_at     TIMESTAMPTZ NOT NULL,
    state_updated_at TIMESTAMPTZ NOT NULL,
    version          INTEGER NOT NULL  -- For optimistic concurrency
);

Optimistic concurrency: every transition reads the current version, writes with WHERE version = ?, fails the update if someone else got there first. Retry on conflict. This is how the trip service prevents two services from concurrently transitioning the same trip.
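
In JDBC-flavored Kotlin, the guarded write might look like this, reusing isLegal from the sketch above (connection management and the retry loop are elided; a sketch, not Uber’s code):

import java.sql.Connection

// Returns true if this caller won the transition. False means another writer
// got there first: re-read the row, re-validate, and retry if still legal.
fun tryTransition(conn: Connection, tripId: String, from: String, to: String,
                  expectedVersion: Int): Boolean {
    require(isLegal(from, to)) { "illegal transition $from -> $to" }
    conn.prepareStatement(
        """UPDATE trips
           SET state = ?, state_updated_at = now(), version = version + 1
           WHERE trip_id = ?::uuid AND state = ? AND version = ?"""
    ).use { stmt ->
        stmt.setString(1, to)
        stmt.setString(2, tripId)
        stmt.setString(3, from)
        stmt.setInt(4, expectedVersion)
        return stmt.executeUpdate() == 1  // 0 rows = lost the race
    }
}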

State changes also publish events to a Kafka topic: “trip 123 transitioned to ARRIVED.” Other services subscribe — the rider app gets a push notification, the driver app updates UI, the analytics pipeline records the event, the ETA service updates predictions.

This event-sourcing pattern is what lets the system stay consistent across many subsystems without each one doing direct trip-DB lookups.
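
The publish step is small with the standard Kafka producer client; the topic name and JSON payload shape here are assumptions:

import org.apache.kafka.clients.producer.KafkaProducer
import org.apache.kafka.clients.producer.ProducerRecord

// Keyed by trip_id so every event for one trip lands on the same partition,
// preserving per-trip ordering for downstream consumers.
fun publishTransition(producer: KafkaProducer<String, String>,
                      tripId: String, newState: String) {
    val payload = """{"trip_id":"$tripId","state":"$newState"}"""
    producer.send(ProducerRecord("trip-events", tripId, payload))
}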


Step 7: Real-Time Location Streaming — The Pickup Phase

Once matched, both rider and driver need to see each other’s location in real time. This is its own architectural challenge.

The flow:

  1. Driver app streams location to a Location Service (the same one feeding the spatial index)
  2. The Location Service publishes the driver’s update to a per-trip channel (e.g., trip:123:driver-loc)
  3. The rider’s app subscribes to that channel via a long-lived WebSocket or server-sent events connection
  4. Updates arrive at the rider’s phone, animate on the map
  5. Same in reverse: rider’s location flows to driver

The pub/sub system: typically Redis pub/sub or a custom WebSocket router. The key insight: only matched parties subscribe to each other’s location streams. A driver streaming location to the global spatial index doesn’t mean every nearby rider sees them in real-time — only the matched rider does. The fan-out is bounded.
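
As a single-process sketch, kotlinx.coroutines’ SharedFlow can stand in for the per-trip channel (production would be Redis pub/sub or a WebSocket router, as above):

import java.util.concurrent.ConcurrentHashMap
import kotlinx.coroutines.channels.BufferOverflow
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.MutableSharedFlow

data class Loc(val lat: Double, val lng: Double, val atMillis: Long)

// One hot stream per channel name, e.g. "trip:123:driver-loc". Bounded buffer
// with drop-oldest: a slow phone skips stale positions rather than backing up
// the publisher.
class LocationFanout {
    private val channels = ConcurrentHashMap<String, MutableSharedFlow<Loc>>()

    private fun channel(name: String): MutableSharedFlow<Loc> =
        channels.computeIfAbsent(name) {
            MutableSharedFlow(
                replay = 1, extraBufferCapacity = 16,
                onBufferOverflow = BufferOverflow.DROP_OLDEST
            )
        }

    // The Location Service calls this for each update on an active trip.
    fun publish(name: String, loc: Loc) { channel(name).tryEmit(loc) }

    // The matched party's WebSocket session collects this flow, forwards frames.
    fun subscribe(name: String): Flow<Loc> = channel(name)
}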

For the rider seeing nearby drivers before requesting (the “ant farm” map view), polling is sufficient at lower frequency. The rider app fetches “nearby drivers” every 5–10 seconds when the map is open. No pub/sub needed for this case.

The scale benefit: only ~1M concurrent active trips at peak, so only ~2M concurrent WebSocket subscriptions for in-trip location streaming. That’s a manageable load on the gateway tier (same kind of gateway tier as in the WhatsApp design — persistent connection management is a separate scaling problem from application logic).


Step 8: Payment — The Exactly-Once Tier

When the trip transitions to COMPLETED, payment processing kicks off. This is the one place in the entire system where strong correctness is non-negotiable: charging a customer twice or missing a charge are both unacceptable.

The pattern: idempotent payment intent.

  1. Trip service emits a “trip 123 completed, fare $X” event with the trip ID as the idempotency key
  2. Payment service receives the event, checks: have I already processed payment for trip 123? If yes, no-op. If no, proceed.
  3. Payment service calls the payment processor (Stripe, Adyen, etc.) with the idempotency key. If the call has been made before with the same key, the processor returns the existing result.
  4. On success, write the payment result to the database with the trip_id as a unique constraint. If a duplicate event arrives later, the unique constraint catches it.
  5. On failure, retry with backoff. The payment processor’s idempotency means retries don’t double-charge.

The architectural primitive: idempotency keys. Every step of the chain — from event publish to processor call to DB write — uses the trip ID as the deduplication key. This is what enables the at-least-once messaging semantics of Kafka to coexist with the exactly-once semantics required for billing.
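
A sketch of the consumer side, with a hypothetical payments table and a chargeProcessor lambda standing in for the Stripe/Adyen call (both processors accept caller-supplied idempotency keys):

import java.sql.Connection
import java.sql.SQLException

// Handles a "trip completed" event. Safe to call any number of times for the
// same trip: the trip ID doubles as the idempotency key at every layer.
fun processPayment(conn: Connection, tripId: String, fareCents: Int,
                   chargeProcessor: (idempotencyKey: String, cents: Int) -> String) {
    // Layer 1: skip if a result for this trip is already recorded.
    conn.prepareStatement("SELECT 1 FROM payments WHERE trip_id = ?::uuid").use {
        it.setString(1, tripId)
        if (it.executeQuery().next()) return
    }

    // Layer 2: the processor dedupes on the key; a retry with the same key
    // returns the original charge instead of charging again.
    val chargeId = chargeProcessor(tripId, fareCents)

    // Layer 3: a unique constraint on trip_id catches a racing duplicate event.
    try {
        conn.prepareStatement(
            "INSERT INTO payments (trip_id, charge_id, amount_cents) VALUES (?::uuid, ?, ?)"
        ).use {
            it.setString(1, tripId); it.setString(2, chargeId); it.setInt(3, fareCents)
            it.executeUpdate()
        }
    } catch (e: SQLException) {
        if (e.sqlState != "23505") throw e  // 23505 = unique_violation in Postgres
        // Duplicate: a racing worker already recorded this payment. No-op.
    }
}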

The trip service doesn’t wait for payment confirmation to mark the trip done. The trip is “completed” the moment the rider arrives at destination; payment is an async follow-up that may take seconds or, in failure cases, hours. The payment state is tracked separately on the trip row.

Senior follow-up: “What if payment fails permanently — declined card, fraud detection?” — the trip stays in COMPLETED with payment_status = FAILED. The user gets a notification, can update their card, retry happens out-of-band. The driver still gets paid out of an internal balance Uber maintains; recovering the user’s missing payment is Uber’s problem, not the driver’s.


Step 9: The Mobile Client — Where the Real-Time Stack Lives

Mobile interviewers especially care about this section. Riders and drivers each have their own app with different requirements.

Driver app (more demanding). Always-on location streaming, foreground service when on duty, low-power location subscription, network-resilient updates (queue location updates when offline, batch-send on reconnect).

// Conceptual driver-side location streaming
class DriverLocationService : LifecycleService() {
    // Foreground service with FOREGROUND_SERVICE_TYPE_LOCATION
    private val locationClient = LocationServices.getFusedLocationProviderClient(this)
    private val locationQueue = mutableListOf<LocationUpdate>()

    override fun onCreate() {
        super.onCreate()
        startForeground(NOTIFICATION_ID, buildOnDutyNotification())

        val request = LocationRequest.Builder(Priority.PRIORITY_HIGH_ACCURACY, 4_000L)
            .setMinUpdateDistanceMeters(10f)  // Don’t emit if hasn’t moved 10m
            .build()

        locationClient.requestLocationUpdates(request, callback, mainLooper)
    }

    private val callback = object : LocationCallback() {
        override fun onLocationResult(result: LocationResult) {
            result.lastLocation?.let { location ->
                queueAndSend(LocationUpdate.from(location))
            }
        }
    }

    private fun queueAndSend(update: LocationUpdate) {
        locationQueue.add(update)
        lifecycleScope.launch {
            try {
                api.streamLocation(update)
                locationQueue.remove(update)
            } catch (e: IOException) {
                // Network error — keep in queue, retry on reconnect
            }
        }
    }
}

Several Android-specific details:

(1) Foreground Service of type FOREGROUND_SERVICE_TYPE_LOCATION is mandatory for drivers on duty. The user sees the persistent notification; the OS knows to keep the service alive. Without this, Android 12+ kills the location streaming.

(2) FusedLocationProviderClient for power efficiency — combines GPS, Wi-Fi, cell tower data and picks the best source. Direct GPS access drains battery 3× faster.

(3) Local queue for offline tolerance. Drivers go through tunnels, parking garages, areas with poor signal. The location updates are queued locally and sent in batches when connectivity returns (a sketch of the flush path follows this list). The server side handles slightly-late updates gracefully.

(4) Doze mode awareness. If the driver app is in the background (driver pulled over to check a message), Doze can throttle the location callback. The foreground-service status is what protects against this.
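
A sketch of that flush path, extending the service above; api.streamLocationBatch is an assumed endpoint, and a real implementation would trigger this from a network-available callback:

// Called when connectivity returns (e.g. from a ConnectivityManager network
// callback). Drains the queue oldest-first in one batched request; the server
// orders updates by their embedded timestamps, not by arrival time.
private fun flushQueue() {
    if (locationQueue.isEmpty()) return
    val batch = locationQueue.toList()
    lifecycleScope.launch {
        try {
            api.streamLocationBatch(batch)   // assumed batch endpoint
            locationQueue.removeAll(batch)
        } catch (e: IOException) {
            // Still offline: keep everything queued for the next attempt.
        }
    }
}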

Rider app (less demanding). Polls nearby drivers when on the home screen; subscribes to driver location only after match is confirmed. No always-on location needed except during pickup-and-trip phases.

For both apps, local Room database serves as the source of truth for the UI — the same offline-first pattern from the WhatsApp post. Trip state, location history, fare calculation all live locally; the network is a sync layer.


Step 10: The Architecture — In One Picture

                     ┌─────────────┐         ┌─────────────┐
                     │  Rider App  │         │ Driver App  │
                     │ (Room cache,│         │ (Foreground │
                     │  WebSocket) │         │  Service,   │
                     │             │         │  Loc.       │
                     │             │         │  Streaming) │
                     └──────┬──────┘         └──────┬──────┘
                            │                       │
                ┌───────────┴───────────────────────┴───────────┐
                │           Gateway Tier (regional)             │
                │   (auth, WebSocket termination, routing)      │
                └─┬──────────┬──────────┬──────────┬────────────┘
                  │          │          │          │
                  ▼          ▼          ▼          ▼
           ┌──────────┐ ┌─────────┐ ┌─────────┐ ┌──────────┐
           │ Location │ │Matching │ │  Trip   │ │ Payment  │
           │ Service  │ │ Service │ │ Service │ │ Service  │
           │          │ │         │ │         │ │          │
           └────┬─────┘ └────┬────┘ └────┬────┘ └────┬─────┘
                │            │           │           │
                ▼            ▼           ▼           ▼
       ┌──────────────┐ ┌───────┐ ┌─────────┐ ┌──────────┐
       │ Spatial Idx. │ │ Redis │ │Postgres │ │Stripe /  │
       │ (Redis GEO,  │ │ (locks│ │ /Spanner│ │ Adyen    │
       │  H3-sharded) │ │  TTL) │ │ (trips) │ │ (with    │
       └──────┬───────┘ └───────┘ └────┬────┘ │idempot.) │
              │                        │      └──────────┘
              ▼                        ▼
       ┌──────────────┐         ┌─────────────┐
       │  Cassandra   │         │   Kafka     │
        │(loc. history)│         │(trip events │
        │              │         │to consumers)│
       └──────────────┘         └──────┬──────┘
                                       │
                       ┌───────────────┼─────────────────┐
                       ▼               ▼                 ▼
                 ┌──────────┐   ┌──────────┐     ┌──────────┐
                 │ Push Svc │   │Analytics │     │ ETA Svc  │
                 │ (notifs) │   │ Pipeline │     │ (ML)     │
                 └──────────┘   └──────────┘     └──────────┘

Five tiers of services, four storage technologies, one event bus. Each service has one responsibility. The shape of the architecture maps directly to the shape of the problem.


Step 11: The Deep-Dive Questions Senior Interviewers Ask

The setup steps are above. These are the prompts that separate hire from no-hire:

“What if the matching service crashes mid-match? The driver accepted but the trip never confirmed.” — The lock in Redis has a TTL; it expires automatically. The driver app sees the timeout and shows “match failed.” The rider app sees the same and re-requests. Worst case: 10 seconds of confusion, no data corruption.

“How do you handle a driver going into a tunnel during a trip?” — Local queue on driver side; rider app shows “driver location unavailable” with last-known position; ETA degrades but trip state remains IN_PROGRESS; updates flush when connection returns. The client architecture is what tolerates this; the server just sees a brief location-streaming gap.

“Two riders request a ride simultaneously, both nearest to the same driver. How does the system pick?” — The matching service serializes match offers per driver: only one rider can be in the “currently offering this driver” state at a time, enforced by the Redis lock. The second rider waits for the first match to resolve (succeed or timeout) before being offered the same driver.

“ETAs to the rider—how do you compute them and how often do you update?” — Separate ETA service that uses driver location, destination, road network, and historical traffic patterns. The service is ML-driven (route + ETA prediction). It runs on each location update for the active trip, publishes ETA changes to the rider via the same WebSocket channel. Caching: don’t recompute if location hasn’t materially changed.

“The rider cancels right after the driver accepts. How is that handled?” — Trip service receives cancel event, transitions trip to CANCELLED. Driver app receives push notification (“rider cancelled, you’re available again”). The driver’s lock in the matching service is released. If the cancel happened >1 minute after match (configurable), a cancellation fee may apply — processed through the same payment infrastructure.

“How do you scale the spatial index across regions?” — Sharded by geohash prefix. Cells in Mumbai live on the Mumbai shards; cells in São Paulo on São Paulo shards. Cross-region queries don’t happen because riders and drivers are colocated. Service mesh routes a rider’s nearby-drivers query to the shard owning that geohash prefix.

Each of these is a 90-second answer. Hit four out of six and you’ve had a strong interview.


Common Failures in This Interview

Picking Postgres for the spatial index. PostGIS works for moderate-scale spatial queries but doesn’t handle 1.5M writes/sec with sub-second read latency. Naming the right tool (Redis GEO, H3-based custom store, or even Elasticsearch geo queries for some use cases) demonstrates you know spatial databases are a category, not just “put coordinates in Postgres.”

Treating matching as a database transaction. Locks held for 10 seconds inside a Postgres row don’t scale. Recognizing that matching is a distributed lock with TTL, naturally implemented in Redis, is the senior insight.

Ignoring the time-scale tiers. Trying to make every subsystem strongly consistent is over-engineering. Recognizing that location streaming is best-effort, matching is short-window strong, trip state is long-window strong, and analytics is eventually consistent is the architectural taste interviewers grade for.

Skipping the mobile complexity. The driver app is a real engineering challenge — foreground service, location batching, offline queue, battery optimization. Mobile interviewers will absolutely probe this. “The driver streams location and the server stores it” isn’t enough.

Conflating payment with trip state. Trip state and payment state are independent. Payment can be retrying for hours after the trip is complete. Modeling them as one state machine is a real design mistake.

Not addressing geographic distribution. Uber works because the system is regionalized; a global “match every driver to every rider” problem doesn’t exist. Calling out that the architecture shards by region simplifies many of the apparent scale problems.


What Pairing This With WhatsApp and Instagram Teaches

If you’ve read all three system design posts, the pattern is the lesson. WhatsApp is about delivery (move bytes from A to B with guarantees). Instagram is about selection (choose what to show out of many options). Uber is about spatial real-time matching (join two moving entities under latency constraints).

Three different problem classes, three different architectures. The same patterns recur — gateway tier separating connection management from business logic, eventual consistency in some places, strong consistency in others, mobile clients as full local replicas — but the storage choices, the indexing strategy, the fan-out model differ wildly because the underlying problem differs.

If you can pattern-match a new system design question (“design DoorDash,” “design Lime scooters,” “design Tinder”) onto one of these three categories — or recognize when it’s a hybrid — the architecture writes itself. DoorDash is a hybrid of Uber (real-time matching) and an e-commerce checkout flow. Tinder is mostly Instagram (selection from a candidate pool) with some Uber elements (spatial filtering). The reasoning transfers; the specifics fall out.


Closing

The reason “design Uber” is such a useful interview question is that it forces you to engage with constraints most candidates haven’t encountered: real-time location at scale, geospatial indexing, distributed matching with TTL locks, time-scale-tier separation. Get the time-scale tier framing right at the start, and the rest is just elaboration.

The framework: scope first, identify the time-scale tiers, pick the right storage per tier, walk the matching/trip-lifecycle/payment flows, address mobile complexity, anticipate the deep-dive prompts. That framework transfers to any spatial real-time system. The specifics change; the shape of the reasoning doesn’t.

That’s the System Design trilogy: WhatsApp (delivery), Instagram (selection), Uber (spatial real-time matching). Three problem classes that cover most of what senior system design interviews actually ask.

Happy coding!
