Behind the “Play” Button: How Netflix Powers 270 Million Homes with Modern Java

Netflix Engineering

Look, we’ve all been there. You’re sitting on your couch, you hit the Netflix icon, and the home page pops up in milliseconds. It feels like magic, but as engineers, we know “magic” is just a word for a very complex, highly optimized system working exactly as intended.

At Netflix, that system is powered by Java. And I’m not talking about the “legacy” Java 8 your bank might still be running. I’m talking about a cutting-edge, high-performance ecosystem using JDK 21, Virtual Threads, and a federated GraphQL architecture.

Let’s pull back the curtain on what happens from the moment you hit that home page to the moment the pixels start moving.

1. The Entry Point: The Federated GraphQL “Brain”

When your device requests the Netflix home page, it doesn’t just hit a single database. It hits the API Gateway (historically powered by Zuul).

Today, Netflix uses a Federated GraphQL architecture. Think of the Gateway as a smart router. It takes your single request for “Home Page Data” and realizes that “Home Page” is actually a composite of dozens of different data points: your profile name, your “Continue Watching” list, and your personalized recommendations.

The Gateway hands the query to services built on the Domain Graph Service (DGS) framework, a specialized extension of Spring Boot 3, which “fan out” this single request to the microservices that own each piece of data, not to every one of the thousands in the fleet.
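To make the “smart router” idea concrete, here is a minimal sketch in plain Java. It is not the DGS framework API: the class `HomePageGateway`, the field names, and the stubbed resolvers are all hypothetical, and each resolver simply stands in for a remote call to the service that owns that field. Resolution is sequential here so the routing logic stays easy to follow.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: a gateway that maps each requested field to the
// service that owns it. Plain Java for illustration, not the DGS API.
class HomePageGateway {

    // Field name -> resolver standing in for a remote call to the owning service.
    private final Map<String, Function<String, Object>> owners = new LinkedHashMap<>();

    HomePageGateway() {
        owners.put("profileName", userId -> "Profile of " + userId);
        owners.put("continueWatching", userId -> List.of("Show A", "Show B"));
        owners.put("recommendations", userId -> List.of("Movie X", "Movie Y"));
    }

    // Resolve only the fields the client asked for, in request order.
    Map<String, Object> resolve(String userId, List<String> requestedFields) {
        Map<String, Object> response = new LinkedHashMap<>();
        for (String field : requestedFields) {
            Function<String, Object> owner = owners.get(field);
            if (owner == null) {
                throw new IllegalArgumentException("No service owns field: " + field);
            }
            response.put(field, owner.apply(userId));
        }
        return response;
    }

    public static void main(String[] args) {
        HomePageGateway gateway = new HomePageGateway();
        System.out.println(gateway.resolve("user-1", List.of("profileName", "recommendations")));
    }
}
```

The point of the sketch is that the client asks one question and the gateway decides which owners to call; fields the client did not request are never resolved.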

High-Level Architectural Flow

This diagram illustrates how the request travels from your remote control through the Java ecosystem and finally hands off to the hardware that moves the video bits.

graph TD
    User[User Device]
    Gateway(API Gateway / Zuul)
    
    subgraph "Federated GraphQL Layer (Spring Boot)"
        DGS["GraphQL DGS Framework"]
        VT["Java Virtual Threads <br/>(Parallelization)"]
    end
    
    subgraph "Java Microservices (Scale Independently)"
        Auth(Authentication Service)
        Meta(Metadata Service)
        Play(Playback Service)
        Rec(Recommendation Service)
    end
    
    subgraph "Platform & Infrastructure"
        SpringBoot["Spring Boot Netflix <br/>(Security, Observability)"]
        ZGC["Generational ZGC <br/>(Low Latency)"]
        Eureka[Netflix Eureka]
    end
    
    subgraph "Content Delivery Network"
        OpenConnect(Open Connect CDN)
        EdgeServer(Nearest Edge Server)
    end

    User -->|GraphQL Query| Gateway
    Gateway --> DGS
    DGS --> VT
    VT --> Auth
    VT --> Meta
    VT --> Play
    VT --> Rec
    
    Play -->|Streaming URL| User
    User -->|Video Request| OpenConnect
    OpenConnect --> EdgeServer
    EdgeServer -->|Video Stream| User
    
    %% Infrastructure links
    Auth -.-> SpringBoot
    Meta -.-> ZGC
    Play -.-> Eureka

2. Concurrency: Moving from RxJava to Virtual Threads

Netflix actually pioneered RxJava (Reactive programming) to handle high-concurrency “fan-outs.” It worked, but let’s be honest: it was a nightmare to debug. Stack traces rarely pointed back to the code that kicked off the work, and the code looked like functional spaghetti.

With the move to JDK 21, we’ve embraced Virtual Threads (Project Loom). In this model, we’ve ditched complex reactive code for simple, synchronous-style code. Virtual threads are lightweight threads managed by the JVM, not the OS. When one blocks on I/O, the JVM parks it and frees the underlying platform thread to run something else, so you can spin up millions of them without the memory overhead of traditional platform threads.
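Here is a minimal sketch of what “lightweight” means in practice, assuming JDK 21 or later. The class `ManyVirtualThreads` and the 50 ms sleep (standing in for a blocking network call) are illustrative choices, not anything taken from Netflix’s codebase.

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

class ManyVirtualThreads {

    // Starts `count` virtual threads that each block briefly,
    // and returns how many of them ran to completion.
    static int runBlockingTasks(int count) {
        AtomicInteger completed = new AtomicInteger();
        // Creates one new virtual thread per submitted task (JDK 21+).
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < count; i++) {
                executor.submit(() -> {
                    try {
                        // Stands in for a blocking network call.
                        Thread.sleep(Duration.ofMillis(50));
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                    completed.incrementAndGet();
                });
            }
        } // close() waits for all submitted tasks to finish
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("Completed: " + runBlockingTasks(10_000));
    }
}
```

The code is ordinary blocking Java. Trying the same thing with 10,000 platform threads would cost far more memory, which is the gap virtual threads are meant to close.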

The Fan-Out Sequence

Here is how the Virtual Thread scheduler handles the parallel “heavy lifting” within the Java backend.

sequenceDiagram
    autonumber
    actor U as User Device
    participant G as API Gateway
    participant DGS as GraphQL DGS (Spring Boot)
    participant VTS as VT Scheduler (JVM)
    participant AS as Auth Service
    participant MS as Metadata Service
    participant PS as Playback Service
    
    U->>G: User requests Home Page Data
    G->>DGS: Passes request
    note right of DGS: DGS decomposes query <br/>for parallel resolution
    DGS->>VTS: Submits one task per downstream call
    par Parallel Execution using Virtual Threads
        VTS->>AS: Virtual Thread 1: Authenticate User
        VTS->>MS: Virtual Thread 2: Get Movie Metadata
        VTS->>PS: Virtual Thread 3: Initialize Playback
        AS-->>VTS: User authenticated
        MS-->>VTS: Movie metadata returned
        PS-->>VTS: Playback session ready + CDN URL
    end
    VTS-->>DGS: Returns completed results
    DGS->>G: Aggregate all responses
    G->>U: Final JSON response with CDN URL
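The sequence above can be sketched in a few lines of JDK 21 code. Treat this as an illustration under assumptions: `HomePageFanOut` and its three stub methods are hypothetical stand-ins for blocking calls to the Auth, Metadata, and Playback services, and the 100 ms sleeps simulate network latency.

```java
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class HomePageFanOut {

    // Hypothetical stand-ins for blocking calls to downstream services.
    static String authenticate(String userId) throws InterruptedException {
        Thread.sleep(100);
        return "authenticated:" + userId;
    }

    static String fetchMetadata(String userId) throws InterruptedException {
        Thread.sleep(100);
        return "metadata-for:" + userId;
    }

    static String initPlayback(String userId) throws InterruptedException {
        Thread.sleep(100);
        return "playback-session-for:" + userId;
    }

    // Runs the three calls in parallel, one virtual thread each, then aggregates.
    static Map<String, String> loadHomePage(String userId) {
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            Future<String> auth = executor.submit(() -> authenticate(userId));
            Future<String> meta = executor.submit(() -> fetchMetadata(userId));
            Future<String> play = executor.submit(() -> initPlayback(userId));
            // get() blocks this thread until each result is ready.
            return Map.of("auth", auth.get(), "metadata", meta.get(), "playback", play.get());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while loading home page", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A downstream call failed", e.getCause());
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        Map<String, String> page = loadHomePage("user-1");
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(page + " in ~" + elapsedMs + " ms");
    }
}
```

Because the calls overlap, the total time should be close to the slowest single call, not the sum of all three, and every line still reads top to bottom like synchronous code.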

3. Reliability: The “Chaos” Philosophy

At Netflix, we don’t try to prevent failure; we assume it’s happening right now. This is where Chaos Engineering and the Circuit Breaker pattern come in.

If the Recommendation Service becomes slow or starts failing, we don’t want it to bring down the whole home page. We use a Circuit Breaker. If the failure threshold is hit, the circuit “trips.” Instead of a spinning loading wheel, the user gets a “fallback”: maybe a generic “Trending Now” list instead of personalized picks.

The Circuit Breaker State Machine

This logic prevents a single failing service from causing a “cascading failure” across the entire 3,000-service mesh.

stateDiagram-v2
    [*] --> Closed: Initial State (All requests pass)
    Closed --> Open: Failures > Threshold (Circuit Trips)
    Open --> HalfOpen: Sleep Window Expires (Testing recovery)
    HalfOpen --> Closed: Success (Service recovered)
    HalfOpen --> Open: Failure (Service still down)
    
    note right of Open
        Requests bypass the service and return cached or generic fallback data.
        Prevents cascading failures.
    end note
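The state machine can be sketched in plain Java. This is a teaching sketch, not the implementation of any particular library: `SimpleCircuitBreaker`, its constructor parameters, and the “consecutive failures” trip rule are assumptions chosen for brevity (production breakers usually track error rates over a rolling window).

```java
import java.util.function.Supplier;

// Hypothetical minimal circuit breaker: trips after N consecutive failures.
class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;
    private final long sleepWindowMillis;
    private State state = State.CLOSED;
    private int consecutiveFailures = 0;
    private long openedAtMillis = 0;

    SimpleCircuitBreaker(int failureThreshold, long sleepWindowMillis) {
        this.failureThreshold = failureThreshold;
        this.sleepWindowMillis = sleepWindowMillis;
    }

    synchronized State state() {
        return state;
    }

    // Calls the primary supplier unless the circuit is open; falls back on failure.
    <T> T call(Supplier<T> primary, Supplier<T> fallback) {
        if (shouldShortCircuit()) {
            return fallback.get(); // Open: skip the failing service entirely
        }
        try {
            T result = primary.get();
            onSuccess();
            return result;
        } catch (RuntimeException e) {
            onFailure();
            return fallback.get();
        }
    }

    private synchronized boolean shouldShortCircuit() {
        if (state != State.OPEN) {
            return false;
        }
        if (System.currentTimeMillis() - openedAtMillis >= sleepWindowMillis) {
            state = State.HALF_OPEN; // sleep window expired: allow a trial call
            return false;
        }
        return true;
    }

    private synchronized void onSuccess() {
        consecutiveFailures = 0;
        state = State.CLOSED;
    }

    private synchronized void onFailure() {
        consecutiveFailures++;
        // A failed trial call re-opens immediately; otherwise trip at the threshold.
        if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
            state = State.OPEN;
            openedAtMillis = System.currentTimeMillis();
        }
    }
}
```

A caller would wrap the risky call and supply the degraded answer, for example `breaker.call(() -> fetchPersonalizedPicks(), () -> trendingNow())`, where both method names are placeholders. One simplification to be aware of: this sketch lets several concurrent trial calls through in the half-open state, whereas a stricter breaker would allow only one.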

4. The Hand-off: Java Doesn’t Stream the Video

Here’s the plot twist: The Java backend doesn’t actually touch the video bits. Java’s job is the Control Plane. It handles the logic, the security, and the “handshake.” Once the Playback Service determines you’re authorized, it sends your device a specific URL pointing to Open Connect, Netflix’s custom Content Delivery Network (CDN).
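As a rough sketch of that hand-off, the control plane’s whole job can be modeled as “check authorization, return a URL.” Everything named here is hypothetical: the class `PlaybackHandoff`, the in-memory set of authorized users, the placeholder host `cdn.example.com`, and the URL path layout are illustrative and do not reflect Netflix’s actual API or URL scheme.

```java
import java.net.URI;
import java.util.Optional;
import java.util.Set;

// Hypothetical control-plane sketch: authorize, then hand back a CDN URL.
// No video bytes ever pass through this code.
class PlaybackHandoff {

    private final Set<String> authorizedUsers; // stand-in for a real entitlement check
    private final String cdnHost;              // placeholder for the chosen CDN server

    PlaybackHandoff(Set<String> authorizedUsers, String cdnHost) {
        this.authorizedUsers = authorizedUsers;
        this.cdnHost = cdnHost;
    }

    // Returns the URL the device should stream from, or empty if not authorized.
    Optional<URI> startPlayback(String userId, String titleId) {
        if (!authorizedUsers.contains(userId)) {
            return Optional.empty();
        }
        return Optional.of(URI.create("https://" + cdnHost + "/titles/" + titleId + "/manifest"));
    }

    public static void main(String[] args) {
        PlaybackHandoff playback = new PlaybackHandoff(Set.of("user-1"), "cdn.example.com");
        System.out.println(playback.startPlayback("user-1", "title-42"));
        System.out.println(playback.startPlayback("user-2", "title-42"));
    }
}
```

Once the device has the URL, the Java service is out of the loop: the request for the actual video goes to the CDN host, not back to the backend.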

Your movie is streamed from a physical hardware box that usually sits inside your ISP’s network or at a nearby internet exchange, while our Java services move on to the next user.

The Bottom Line

We use Java because it has evolved. By combining Spring Boot 3, GraphQL, and JDK 21, we’ve built a system that is both developer-friendly and incredibly “hard to kill.”

As a principal engineer, my advice is simple: Don’t get distracted by the “language of the month.” If you want to scale to 270 million users, focus on your platform internals—upgrade your JDK, optimize your GC, and embrace the simplicity of virtual threads.

