System Design

Welcome to the Master Guide for System Design. These questions are curated for high-level technical rounds at companies like Amazon, Google, Adobe, and startups. Master the art of scalability, availability, and reliability.

1. Scaling & Load Balancing

Q1: What is the difference between Vertical and Horizontal Scaling? Easy
Vertical Scaling: Also known as "Scaling Up," it involves adding more power (RAM, CPU) to your existing server.
Horizontal Scaling: Also known as "Scaling Out," it involves adding more servers to your pool to share the workload.

[Image of Vertical vs Horizontal Scaling]
Pro-Tip: Horizontal scaling is preferred for modern distributed systems because it offers High Availability and avoids a "Single Point of Failure."
Q2: How does a Round Robin Load Balancer work? Medium
Round Robin is the simplest load balancing algorithm. It sends each new incoming request to the next server in a rotating list.


Limitations: It assumes all servers have equal processing power. For servers with different capacities, we use Weighted Round Robin.
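As a minimal sketch (the class and server names here are hypothetical), round robin can be implemented with a single atomic counter that rotates over the server list:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Minimal round-robin selector; server names are made up for illustration.
public class RoundRobinBalancer {
    private final List<String> servers;
    private final AtomicInteger counter = new AtomicInteger(0);

    public RoundRobinBalancer(List<String> servers) {
        this.servers = servers;
    }

    // Each call hands out the next server in rotation, wrapping around.
    public String nextServer() {
        int index = Math.floorMod(counter.getAndIncrement(), servers.size());
        return servers.get(index);
    }

    public static void main(String[] args) {
        RoundRobinBalancer lb = new RoundRobinBalancer(List.of("s1", "s2", "s3"));
        for (int i = 0; i < 4; i++) {
            System.out.println(lb.nextServer()); // s1, s2, s3, s1
        }
    }
}
```

A Weighted Round Robin variant would simply repeat each server in the list in proportion to its capacity.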
Q3: What is "Sticky Session" in Load Balancing? Medium
Sticky Sessions (Session Affinity) ensure that all requests from a specific user during a session are always sent to the same server.

This is useful when the server stores user session data locally, but it can lead to unbalanced loads if one user performs heavy tasks.
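A toy sketch of session affinity, assuming routing purely by a hash of the session id (real load balancers usually stick sessions via cookies or client-IP hashing):

```java
import java.util.List;

// Sticky routing sketch: the same session id always maps to the same server.
public class StickyRouter {
    private final List<String> servers;

    public StickyRouter(List<String> servers) {
        this.servers = servers;
    }

    public String route(String sessionId) {
        // Deterministic hash -> the user "sticks" to one server for the session
        return servers.get(Math.floorMod(sessionId.hashCode(), servers.size()));
    }

    public static void main(String[] args) {
        StickyRouter router = new StickyRouter(List.of("s1", "s2", "s3"));
        System.out.println(router.route("user-42"));
        System.out.println(router.route("user-42")); // same server both times
    }
}
```

Note the trade-off this makes concrete: if the server list changes, every session remaps, which is one reason stateless services (session data in Redis, not on the server) are preferred.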
How do you make a Singleton class thread-safe in Java? Hard
In a multi-threaded environment, a simple "Lazy Initialization" fails because two threads can enter the if (instance == null) block simultaneously.

1. Double-Checked Locking (Recommended for Performance)

This is the most efficient way. We use a volatile keyword to ensure visibility and a synchronized block to handle the lock.

public class DatabaseConnection {
    private static volatile DatabaseConnection instance;

    private DatabaseConnection() {} // Private constructor

    public static DatabaseConnection getInstance() {
        if (instance == null) {                           // First check (no lock)
            synchronized (DatabaseConnection.class) {
                if (instance == null) {                   // Second check (under lock)
                    instance = new DatabaseConnection();
                }
            }
        }
        return instance;
    }
}
    
2. Bill Pugh Singleton (The Industry Standard)

This uses a "Static Inner Helper Class." The JVM only loads the inner class into memory when getInstance() is called, making it thread-safe and lazy-loaded without needing synchronization blocks.

public class Singleton {
    private Singleton() {}

    private static class SingletonHelper {
        private static final Singleton INSTANCE = new Singleton();
    }

    public static Singleton getInstance() {
        return SingletonHelper.INSTANCE;
    }
}
    
3. Enum Singleton (The Safest Way)

As per Joshua Bloch (Effective Java), an Enum is the best way to implement a Singleton. It provides 100% thread safety and prevents Reflection or Serialization from creating a second instance.

public enum Logger {
    INSTANCE;

    public void log(String msg) {
        System.out.println(msg);
    }
}
    

Interview Pro-Tip: Always mention the volatile keyword in Double-Checked Locking. Without it, a thread might see a half-initialized object due to "Instruction Reordering" by the JVM.

How do you process a 10GB file or millions of database records in Java without crashing the RAM? Hard
The key to handling large data is Lazy Evaluation. Traditional Collections load all data into memory, but Streams process data on-demand.

1. Reading Large Files (Files.lines)

Never use Files.readAllLines() for large files. Instead, use Files.lines(), which returns a Stream and reads the file line-by-line.

try (Stream<String> lines = Files.lines(Paths.get("large_log_file.txt"))) {
    lines.filter(line -> line.contains("ERROR"))
         .map(String::toUpperCase)
         .forEach(System.out::println);
} catch (IOException e) {
    e.printStackTrace();
}
2. Database Streaming (Spring Data JPA)

When fetching millions of rows, avoid List<Entity>. Use the Stream return type in your Repository. This uses a Database Cursor to fetch rows one by one.

@Query("select u from User u")
Stream<User> getAllUsersStream();

// Inside Service (Must be @Transactional)
try (Stream<User> userStream = userRepository.getAllUsersStream()) {
    userStream.forEach(user -> process(user));
}
3. Parallel Streams for Performance

If the task is CPU-intensive (like calculating a hash for every record), use .parallelStream(). This splits the data across multiple CPU cores using the ForkJoinPool.
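A small sketch of the idea (the record contents are made up for illustration):

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class ParallelHashDemo {
    // CPU-bound work: hash every record. parallelStream() spreads the work
    // across the cores of the common ForkJoinPool.
    static long countDistinctHashes(List<String> records) {
        return records.parallelStream()
                .mapToInt(String::hashCode)
                .distinct()
                .count();
    }

    public static void main(String[] args) {
        List<String> records = IntStream.rangeClosed(1, 1_000)
                .mapToObj(i -> "record-" + i)
                .collect(Collectors.toList());
        System.out.println("distinct hashes: " + countDistinctHashes(records));
    }
}
```

Caveat: parallel streams only pay off for CPU-bound work on large data; for I/O-bound tasks they can starve the shared ForkJoinPool.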


Why this is the "Senior" Approach:
  • Memory Efficiency: Only 1 record is in memory at a time.
  • Pipelining: Multiple operations (filter, map, sort) are combined into a single pass over the data.
  • Short-circuiting: Operations like findFirst() stop the processing immediately once the result is found, saving CPU cycles.
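The short-circuiting point can be made visible with a small sketch: peek() counts how many elements are actually pulled before findFirst() stops, even though the source stream is infinite.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

public class ShortCircuitDemo {
    public static void main(String[] args) {
        AtomicInteger evaluated = new AtomicInteger();
        // Infinite stream, but findFirst() stops pulling after the first match.
        int first = Stream.iterate(1, n -> n + 1)
                .peek(n -> evaluated.incrementAndGet())
                .filter(n -> n % 7 == 0)
                .findFirst()
                .orElseThrow();
        System.out.println(first + " found after evaluating "
                + evaluated.get() + " elements");
    }
}
```

Only the first 7 elements are ever evaluated; the rest of the (infinite) stream is never touched.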

⚠️ Crucial Warning: Always use a try-with-resources block when streaming from files or databases to ensure the underlying resources (file handles/connections) are closed.

How do you design a 100% Immutable Class in Java? Hard
To ensure a class is immutable, you must prevent any modification to its state and prevent subclasses from overriding its behavior.

The 5 Strict Rules:
  1. Declare the class as final: This prevents other classes from extending it and overriding its methods.
  2. Make all fields private and final: Private ensures encapsulation, and final ensures they are initialized only once.
  3. No Setter methods: Do not provide any methods that can change the state of the fields.
  4. Initialize all fields via Constructor: Perform a Deep Copy for mutable objects (like Lists or Dates) during initialization.
  5. Return Deep Copies in Getters: Never return the actual reference of a mutable object; return a copy instead.
Example Implementation:
import java.util.ArrayList;
import java.util.List;

public final class Student {
    private final String name;
    private final List<String> courses;

    public Student(String name, List<String> courses) {
        this.name = name;
        // Deep Copy: Don't just do this.courses = courses;
        this.courses = new ArrayList<>(courses);
    }

    public String getName() {
        return name;
    }

    public List<String> getCourses() {
        // Return a copy to prevent the caller from modifying the list
        return new ArrayList<>(courses);
    }
}
Why use Immutability in System Design?
  • Thread Safety: Since the state never changes, multiple threads can access it without synchronization.
  • Caching: Immutable objects are perfect keys for HashMap because their hashCode never changes.
  • Reliability: It prevents "Side Effects" where one part of the code accidentally changes data used by another part.

Pro-Tip for Java 17: You can use Records to create immutable data carriers instantly: public record Student(String name, List<String> courses) {}. However, note that a Record stores its constructor arguments as-is (it makes no defensive copy), so a mutable List passed in can still be modified by the caller; add a compact constructor that copies the list for 100% immutability.
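A sketch of handling that mutable-list gap with a compact constructor (assuming Java 16+; the field names are illustrative):

```java
import java.util.ArrayList;
import java.util.List;

public class RecordDemo {
    // The compact constructor snapshots the list with List.copyOf, so later
    // changes to the caller's list cannot leak into the record.
    public record Student(String name, List<String> courses) {
        public Student {
            courses = List.copyOf(courses);
        }
    }

    public static void main(String[] args) {
        List<String> input = new ArrayList<>(List.of("Math"));
        Student s = new Student("Asha", input);
        input.add("Physics");              // mutate the original list
        System.out.println(s.courses());   // record still holds only [Math]
    }
}
```

List.copyOf also returns an unmodifiable list, so the getter needs no defensive copy either.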

Scenario: Users are reporting 500 Errors in Production. What is your step-by-step debugging process? Hard
Production debugging follows a "Detect -> Isolate -> Fix -> Prevent" cycle. The priority is always MTTR (Mean Time To Recovery).

Step 1: Detection & Verification (The "What")
  • Check Monitoring Tools (Prometheus, Grafana, Datadog) to see the scale. Is it 1% of users or 100%?
  • Identify the Impact Area: Is it a specific region (Indore/Mumbai) or a specific feature (Payment/Login)?
Step 2: Isolation (The "Where")

Use the ELK Stack (Elasticsearch, Logstash, Kibana) or Splunk to search for Error Logs.

  • Correlation IDs: Trace a single failed request across multiple Microservices using tools like Zipkin or Jaeger.
  • Recent Changes: Check the CI/CD pipeline. Was a new deployment made in the last 30 minutes? (If yes, Rollback immediately).
Step 3: Immediate Mitigation (The "Fix")

Do not try to "fix the code" while the site is down. Restore service first:

  • Rollback: Revert to the last stable version.
  • Restart/Scale: If it's a memory leak or high CPU, restart the pods or add more nodes.
  • Circuit Breaking: If a 3rd party API (like a Payment Gateway) is down, enable the circuit breaker to show a "Service Unavailable" message instead of crashing the app.
Step 4: Root Cause Analysis - RCA (The "Why")

Once the site is stable, perform a deep dive:

# Example: capture a heap dump to check for a memory leak
jmap -dump:live,format=b,file=heap.bin [pid]
# Analyze heap.bin in Eclipse MAT to find the leaking objects.
Step 5: Post-Mortem (The "Never Again")
  • Update Unit Tests to catch this specific bug in the future.
  • Add Alerting Rules so you get notified before the users do.

Interview Tip: Mention that "Rollback is always the first choice" if a recent deployment happened. Don't try to "fix-forward" in the middle of a major outage.

Q: Your Java service is slow, but CPU usage is under 10%. What do you investigate first? Hard
If the CPU is low, the application is not busy processing logic. Instead, threads are likely stuck in a WAITING or BLOCKED state.

1. Database Connection Pool Exhaustion

This is the #1 cause. If your connection pool (e.g., HikariCP) is sized at 10 but you have 100 concurrent users, 90 requests wait for a connection to be released. The CPU stays idle because those threads are just parked.

  • Check: Hikari metrics or ActiveConnections count.
  • Fix: Increase pool size or optimize slow SQL queries.
2. Thread Locking & Contention

Multiple threads might be fighting for the same synchronized block or a ReentrantLock. One thread holds the lock for too long, and others form a queue.

  • Check: Take a Thread Dump using jstack [pid]. Look for threads in BLOCKED status.
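The same check can be done programmatically via the JMX ThreadMXBean, as a sketch of what a jstack dump would show:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class BlockedThreadCheck {
    // Counts threads currently BLOCKED on a monitor, i.e. the same threads
    // a jstack dump would show as "waiting to lock <0x...>".
    public static long countBlocked() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long blocked = 0;
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info != null && info.getThreadState() == Thread.State.BLOCKED) {
                blocked++;
            }
        }
        return blocked;
    }

    public static void main(String[] args) {
        System.out.println("BLOCKED threads right now: " + countBlocked());
    }
}
```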
3. External API Latency (I/O Wait)

If your service calls a third-party API (like a Payment Gateway or SMS service) and that service is slow, your Java thread stays open waiting for the HTTP response. Since it's waiting for the network, it uses 0% CPU.

  • Check: Distributed Tracing (Zipkin/Jaeger) to see which downstream service is slow.
  • Fix: Implement Timeouts and Circuit Breakers (Resilience4j).
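A minimal timeout sketch with the JDK's built-in HttpClient (the endpoint URL is hypothetical; a circuit breaker such as Resilience4j would wrap this call):

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.time.Duration;

public class TimeoutDemo {
    // A hard per-request deadline makes a slow downstream fail fast
    // instead of pinning the calling thread indefinitely.
    static HttpRequest buildRequest(String url) {
        return HttpRequest.newBuilder()
                .uri(URI.create(url))
                .timeout(Duration.ofSeconds(3))   // per-request deadline
                .build();
    }

    public static void main(String[] args) {
        // connectTimeout bounds how long we wait to even open the connection
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(2))
                .build();
        HttpRequest request = buildRequest("https://payments.example.com/charge");
        System.out.println("request timeout: " + request.timeout().orElseThrow());
    }
}
```

(The demo only builds the request; nothing is sent over the network.)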
4. Garbage Collection "Stop-the-World" Pauses

In some cases, specific GC phases (like a Full GC in the Old Generation) pause all application threads. While the threads are paused, they don't use CPU for application logic.

  • Check: GC logs using -XX:+PrintGCDetails or tools like GCEasy.io.
5. Memory Leaks (Low CPU, High RAM)

If the Heap is almost full, the JVM spends all its time trying to find a tiny bit of space to allocate new objects. This constant "GC Thrashing" slows down the app without heavy CPU computation.

Pro-Tip: Use VisualVM or JConsole to see the Live Thread Count. If you see many threads in a "Yellow" (Waiting) state, you have found your bottleneck.

Q: A Thread Pool is configured correctly (e.g., 20 core threads), but tasks are still delayed. Why? Hard
If your Thread Pool size is optimized but latency is high, you are likely facing one of these five "Hidden Bottlenecks":

1. The "Unbounded Queue" Problem

If you use a LinkedBlockingQueue without a capacity limit, tasks will sit in the queue indefinitely if they arrive faster than they can be processed. The threads are working fine, but the Queue Wait Time is killing your performance.

  • Check: Monitor getQueue().size().
  • Fix: Use a bounded queue and implement a RejectedExecutionHandler.
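A sketch of that fix (the pool and queue sizes here are illustrative, not a recommendation):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BoundedPoolDemo {
    static ThreadPoolExecutor newBoundedPool() {
        return new ThreadPoolExecutor(
                4, 4,                                      // fixed pool of 4 threads
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(100),             // bounded: max 100 queued tasks
                new ThreadPoolExecutor.CallerRunsPolicy()  // overflow: the submitter runs
        );                                                 // the task, slowing producers down
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadPoolExecutor pool = newBoundedPool();
        pool.submit(() -> System.out.println("task ran on "
                + Thread.currentThread().getName()));
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

CallerRunsPolicy is one of several built-in handlers; AbortPolicy (throw) or a custom handler that sheds load are also common choices.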
2. CPU Context Switching (Thrashing)

If your "correct configuration" actually exceeds the number of available CPU cores (e.g., 200 threads on an 8-core machine), the OS spends more time switching between threads than actually executing code.

  • Check: Use vmstat or top to look for high "cs" (context switches).
  • Rule: For CPU-bound tasks, Pool Size should be N + 1 (where N = cores).
3. Shared Resource Contention (Locking)

Your 20 threads might all be trying to access the same Synchronized block, a Database connection, or a shared File handle. 1 thread works while 19 threads wait in a "BLOCKED" state.

  • Check: Thread Dump (jstack). Look for "waiting to lock <0x000...>"
4. Garbage Collection (STW) Pauses

If the JVM is performing a Full GC, it triggers a "Stop-The-World" event. Every single thread in your "correctly configured" pool will be paused. To the user, it looks like a delay, even though the pool is technically fine.

  • Check: GC Logs or jstat -gcutil.
5. Thread Starvation (Priority Issues)

If other processes on the same server (like a heavy backup script or another service) are consuming all the CPU/IO, your threads won't get enough CPU time slices from the OS scheduler.


The "Senior" Troubleshooting Formula:
Task Latency = Queue Wait Time + Execution Time + Blocked Time

If Execution Time is low, focus 100% on Queue Wait Time and Blocked Time.

Q: Your application throws OutOfMemoryError, but the Heap is only 40% full. What is happening? Hard
In Java, java.lang.OutOfMemoryError does not always mean the Heap is full. It means the JVM cannot allocate memory somewhere.

1. Metaspace Exhaustion (java.lang.OutOfMemoryError: Metaspace)

The Metaspace stores class metadata (method names, field types, etc.). If your app uses heavy Reflection or dynamic class loading (like Spring, Hibernate, or CGLIB), the Metaspace can fill up.

  • Check: jstat -gc [pid] or check Metaspace charts in VisualVM.
  • Fix: Increase -XX:MaxMetaspaceSize.
2. Native Thread Exhaustion (Unable to Create New Native Thread)

Every time you start a new Thread(), the JVM requests memory from the OS for the Thread Stack (usually 1MB per thread). If the OS runs out of RAM or hits the "max user processes" limit (ulimit), the JVM throws OOM even if the Heap is empty.

  • Check: ulimit -u on Linux or look for "Native Thread" in the error log.
  • Fix: Use a Thread Pool instead of creating new threads manually.
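Sketch of that fix: reuse a fixed set of workers so native thread stacks are allocated once, not per request (the task body is a placeholder).

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class PoolVsRawThreads {
    public static void main(String[] args) throws InterruptedException {
        // 8 worker threads total, no matter how many tasks arrive;
        // new Thread() per task would allocate ~1MB of native stack each time.
        ExecutorService pool = Executors.newFixedThreadPool(8);
        for (int i = 0; i < 10_000; i++) {
            pool.submit(() -> { /* handle one request (placeholder) */ });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println("10,000 tasks handled by 8 reused threads");
    }
}
```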
3. Direct Buffer / Native Memory Leak

High-performance libraries like Netty or NIO use "Direct Buffers" which sit outside the JVM Heap to avoid copying data. If these aren't cleared, you get a "Direct buffer memory" OOM.

  • Check: Use -XX:NativeMemoryTracking=detail and jcmd to track native allocations.
4. GC Overhead Limit Exceeded

This happens when the JVM spends 98% of its time doing Garbage Collection but recovers less than 2% of the Heap. Even if the Heap isn't technically "Full," the JVM gives up because it can't make progress.

  • Fix: Identify the "GC Thrashing" cause using a Heap Dump.
5. Compressed Class Space OOM

In 64-bit JVMs, a special part of Metaspace called "Compressed Class Space" has a default limit of 1GB. If you load too many classes, this can fail before the main Metaspace is full.


The "Senior" Troubleshooting Step:
Always check the exact message following the OutOfMemoryError.
Is it Java heap space? Metaspace? Direct buffer memory? or Unable to create new native thread?

🚀 Mastered System Design?

Join our community for daily architecture deep-dives and placement resources!
