Introduction to Networks
Networking plays a critical role in system design, as it enables communication across distributed systems. Understanding networking fundamentals is essential for designing scalable, reliable, and efficient architectures, as these concepts underpin the performance and functionality of systems built across the web, cloud, and internal networks.
In system design interviews, knowledge of networking concepts such as protocols, IP addressing, routing, and load balancing can help you reason about communication between different services and justify design decisions.
Key Networking Concepts
- Node: Any device in a network, such as a server, computer, or smartphone, that sends or receives data. In system design, nodes could represent backend services, databases, or microservices that interact across the network.
- Link: A connection between nodes, either wired or wireless, through which data flows. Links may have bandwidth limitations, which can impact data transfer rates and system performance.
- Packet: A unit of data formatted for network transmission, containing the payload (actual data) and metadata (like source and destination addresses). In a distributed system, packets carry requests and responses between services.
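To make the packet concept concrete, the minimal sketch below models a packet as a payload plus addressing metadata. The field names are illustrative only and are not taken from any particular protocol.

```python
from dataclasses import dataclass

@dataclass
class Packet:
    """Illustrative packet: addressing metadata plus the payload it carries."""
    src_addr: str   # where the packet came from
    dst_addr: str   # where the packet is going
    seq_num: int    # ordering information used by reliable protocols
    payload: bytes  # the actual application data

# A request from one service to another, reduced to its essentials.
request = Packet(src_addr="10.0.0.5", dst_addr="10.0.0.9", seq_num=1,
                 payload=b"GET /orders/42")
print(request)
```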
Network Models: OSI and TCP/IP
Network models provide a framework to standardize communication between devices. The two main models, OSI and TCP/IP, organize communication into different layers, each serving a specific function in data transmission.
OSI Model
The OSI model is a conceptual seven-layer model that defines the functions required for data transfer in a network. The seven layers, from top (Layer 7) to bottom (Layer 1), are:
- Application Layer: Interfaces directly with user applications, providing network services to users (e.g., HTTP for web browsing, FTP for file transfers).
- Presentation Layer: Ensures that data is in the appropriate format, managing encryption and compression as needed.
- Session Layer: Establishes, manages, and terminates communication sessions between applications.
- Transport Layer: Ensures end-to-end communication, handling data segmentation, flow control, and error recovery (e.g., TCP for reliable data transfer, UDP for low-latency transfer).
- Network Layer: Routes data packets across networks using logical addressing (e.g., IP addresses) and manages packet forwarding.
- Data Link Layer: Ensures reliable node-to-node data transfer within a single network segment, handling MAC addressing and error detection.
- Physical Layer: Transmits raw bitstreams over the physical network medium, such as cables or radio frequencies.
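As a quick reference, the sketch below maps each OSI layer to a few protocols or technologies commonly associated with it; the groupings are illustrative rather than exhaustive.

```python
# OSI layer number -> (layer name, example protocols/technologies)
OSI_LAYERS = {
    7: ("Application",  ["HTTP", "FTP", "DNS"]),
    6: ("Presentation", ["TLS encryption", "compression", "serialization"]),
    5: ("Session",      ["session establishment and teardown"]),
    4: ("Transport",    ["TCP", "UDP"]),
    3: ("Network",      ["IP", "ICMP", "routing"]),
    2: ("Data Link",    ["Ethernet", "MAC addressing"]),
    1: ("Physical",     ["cables", "radio", "fiber"]),
}

for number in sorted(OSI_LAYERS, reverse=True):
    name, examples = OSI_LAYERS[number]
    print(f"Layer {number} ({name}): {', '.join(examples)}")
```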
TCP/IP Model
The TCP/IP model is a simplified four-layer model widely used for internet-based communications. The layers are:
- Application Layer: Combines the functions of the OSI application, presentation, and session layers; hosts protocols such as HTTP, DNS, and FTP.
- Transport Layer: Provides end-to-end communication using TCP or UDP.
- Internet Layer: Handles logical addressing and routing of packets using IP.
- Network Access (Link) Layer: Covers the data link and physical functions needed to move frames over the local medium.
While the OSI model provides a theoretical framework, the TCP/IP model is more commonly applied in practical scenarios, especially for web-based systems and applications that communicate over the internet.
IP Addressing and DNS
IP Addresses
IP (Internet Protocol) addresses are unique identifiers assigned to devices on a network. They play a central role in routing data to the correct destination. In system design, IP addresses are crucial for service discovery, network configuration, and client-server communication.
- Public IP Address: An address accessible over the internet, enabling devices or services to be reachable globally.
- Private IP Address: Used within internal networks (e.g., a company’s intranet); not directly reachable from the internet without NAT (Network Address Translation).
- APIPA (Automatic Private IP Addressing): A link-local address in the 169.254.0.0/16 range that a device assigns to itself when it cannot obtain an address from a DHCP server.
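A quick way to see these categories in practice is Python's standard ipaddress module, which can classify an address as private, link-local (the APIPA range), or globally routable. The sample addresses below are arbitrary.

```python
import ipaddress

# Arbitrary sample addresses, one for each category discussed above.
samples = ["8.8.8.8",        # public (globally routable)
           "192.168.1.10",   # private (RFC 1918)
           "169.254.7.23",   # APIPA / link-local
           "2001:db8::1"]    # IPv6 documentation prefix

for text in samples:
    addr = ipaddress.ip_address(text)
    kind = ("link-local (APIPA)" if addr.is_link_local
            else "private" if addr.is_private
            else "public")
    print(f"{text:>15} -> IPv{addr.version}, {kind}")
```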
IPv4 vs IPv6
IPv4 uses 32-bit addresses, supporting about 4.3 billion (2^32) unique addresses, while IPv6 uses 128-bit addresses, offering a vastly larger address space (2^128 addresses). IPv6 is becoming increasingly necessary as IoT and mobile devices proliferate.
DNS (Domain Name System)
DNS translates human-readable domain names (like www.example.com) into IP addresses. It allows users to access services by name rather than by IP address. In a system design interview, understanding DNS is essential for designing scalable and user-friendly services, as it supports load balancing, high availability, and resilience.
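The sketch below shows DNS resolution from an application's point of view, using Python's standard socket module to look up the IP addresses behind a hostname; example.com is just a placeholder, and the lookup requires network access.

```python
import socket

# Resolve a hostname to the IP addresses DNS returns for it.
hostname = "example.com"
results = socket.getaddrinfo(hostname, 443, proto=socket.IPPROTO_TCP)

for family, _type, _proto, _canon, sockaddr in results:
    ip = sockaddr[0]
    version = "IPv6" if family == socket.AF_INET6 else "IPv4"
    print(f"{hostname} -> {ip} ({version})")
```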
Core Protocols in System Design
TCP vs UDP
Both TCP and UDP are protocols at the transport layer, but they offer different approaches to data transmission:
- TCP (Transmission Control Protocol): Reliable, connection-oriented protocol that ensures data integrity and order. TCP establishes a connection via a 3-way handshake and retransmits lost packets. It’s suitable for applications where data accuracy is essential, such as web browsing (HTTP/HTTPS), file transfers, and emails.
- UDP (User Datagram Protocol): Connectionless, faster protocol without guaranteed delivery. UDP is used in scenarios where low latency is critical, such as real-time video streaming, gaming, or voice-over-IP (VoIP). If a packet is lost, UDP does not retransmit it, keeping latency low.
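The contrast shows up directly in the sockets API: TCP requires a connection before any data moves, while UDP simply sends a datagram. A minimal sketch, assuming a hypothetical service listening on localhost port 9000:

```python
import socket

HOST, PORT = "127.0.0.1", 9000   # hypothetical local service

# TCP: connect first (3-way handshake), then send over the reliable stream.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as tcp_sock:
    tcp_sock.connect((HOST, PORT))        # fails if nothing is listening
    tcp_sock.sendall(b"order confirmation")

# UDP: no connection; the datagram is sent whether or not anyone is listening,
# and delivery is not guaranteed.
with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as udp_sock:
    udp_sock.sendto(b"player position update", (HOST, PORT))
```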
When to Use TCP vs UDP in System Design
In a system design context, selecting TCP or UDP depends on the application requirements:
- TCP for Consistency: For an e-commerce website, where data integrity is critical (such as payment transactions or order confirmations), TCP ensures that messages are reliably delivered and in the correct order.
- UDP for Real-Time Data: In a multiplayer game, UDP is preferred because real-time performance is more important than guaranteed delivery. Minor packet loss is acceptable if it reduces lag.
HTTP and HTTPS
HTTP (HyperText Transfer Protocol) is a stateless request-response protocol that forms the basis of web communication (HTTP/1.x is text-based, while HTTP/2 and HTTP/3 use binary framing). HTTPS (HTTP Secure) runs HTTP over SSL/TLS encryption, providing secure, encrypted connections. HTTPS is essential for handling sensitive information, such as login credentials or payment data, protecting users from man-in-the-middle attacks.
Use Case: HTTPS in Secure Applications
In a system design interview, using HTTPS would be crucial for a financial application, where user data and transactions require confidentiality and integrity. HTTPS would ensure that all data exchanged between the client and server is encrypted.
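As a concrete illustration, the standard-library sketch below makes an HTTPS request; urllib verifies the server's TLS certificate by default, so the connection fails rather than silently falling back to an insecure channel. The URL is a placeholder.

```python
import urllib.request

# Placeholder URL; any HTTPS endpoint works the same way.
url = "https://example.com/"

# urllib uses the platform's trusted CA certificates and verifies the server
# certificate by default, so an attacker presenting an invalid certificate
# causes an SSL error instead of a successful (but compromised) request.
with urllib.request.urlopen(url, timeout=10) as response:
    print(response.status)                      # e.g. 200
    print(response.headers.get("Content-Type"))
```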
Load Balancing and Redundancy
Load Balancers
A load balancer distributes incoming requests across multiple servers, ensuring no single server becomes overloaded. Load balancing improves system reliability, scalability, and performance by enabling horizontal scaling.
- Server-side Load Balancing: A dedicated intermediary (such as a reverse proxy or a cloud load balancer) routes client requests to different backend servers based on load and availability. This is common in cloud services and microservices architectures.
- Client-side Load Balancing: The client has a list of server IPs and selects one based on an algorithm. This is often used with DNS-based load balancing and in edge-based architectures.
Example: Load Balancer in a High-Traffic Application
In a social media application, a load balancer can distribute user requests across multiple servers. This prevents any single server from being overwhelmed and allows the system to scale horizontally by adding more servers to handle additional users.
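A minimal sketch of client-side load balancing, assuming the client already knows the backend addresses (the server list below is hypothetical): it simply rotates through the servers round-robin.

```python
import itertools

# Hypothetical backend servers known to the client (e.g., from DNS or config).
BACKENDS = ["10.0.1.10:8080", "10.0.1.11:8080", "10.0.1.12:8080"]

# Round-robin: each request goes to the next server in the list, wrapping around.
_rotation = itertools.cycle(BACKENDS)

def pick_backend() -> str:
    """Return the backend that should receive the next request."""
    return next(_rotation)

for i in range(6):
    print(f"request {i} -> {pick_backend()}")
```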
Security and Encryption
Firewalls
Firewalls filter incoming and outgoing network traffic based on predetermined security rules. Firewalls protect networks by blocking unauthorized access, thus forming a critical component of network security in system designs that handle sensitive or private data.
3-Way Handshake (TCP)
The 3-way handshake is a sequence of steps that establish a TCP connection. It ensures that both client and server are ready to communicate and synchronizes their connection parameters. The steps are:
- SYN: The client initiates a connection by sending a SYN packet to the server.
- SYN-ACK: The server acknowledges the request with a SYN-ACK packet.
- ACK: The client responds with an ACK, establishing the connection.
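Application code rarely performs these steps explicitly: calling connect() on a TCP socket triggers the handshake, and the call returns only once the connection is established. A minimal sketch against a placeholder host and port:

```python
import socket

# Placeholder endpoint; connect() raises an error if nothing is listening there.
HOST, PORT = "example.com", 80

# create_connection() performs the SYN / SYN-ACK / ACK exchange under the hood;
# when it returns, the TCP connection is established and ready for data.
with socket.create_connection((HOST, PORT), timeout=10) as conn:
    print("connected:", conn.getpeername())
```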
SSL/TLS Encryption
SSL (Secure Sockets Layer) and its successor TLS (Transport Layer Security) encrypt data between the client and server, protecting sensitive information in transit; all SSL versions are now deprecated, and modern systems use TLS 1.2 or 1.3. In a system design interview, proposing TLS for any system dealing with user data (e.g., a login service) is essential to ensure security.
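Below the level of HTTPS libraries, TLS can be added to an ordinary TCP socket. The sketch wraps a connection with Python's ssl module using default certificate verification; the hostname is a placeholder.

```python
import socket
import ssl

hostname = "example.com"   # placeholder server

# Default context: verifies the server certificate against trusted CAs
# and checks that the certificate matches the hostname.
context = ssl.create_default_context()

with socket.create_connection((hostname, 443), timeout=10) as raw_sock:
    with context.wrap_socket(raw_sock, server_hostname=hostname) as tls_sock:
        print("TLS version:", tls_sock.version())    # e.g. TLSv1.3
        print("Cipher suite:", tls_sock.cipher()[0])
```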
RSA Algorithm
The RSA algorithm is a widely used public-key (asymmetric) cryptosystem. In SSL/TLS, RSA has traditionally been used for key exchange and remains common for the digital signatures in server certificates, enabling secure data exchange over untrusted networks.
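To illustrate the underlying idea (not for real use), the toy sketch below runs RSA with tiny primes: encryption raises the message to the public exponent mod n, and decryption reverses it with the private exponent. Production systems rely on vetted libraries and key sizes of 2048 bits or more.

```python
# Toy RSA with tiny primes -- for intuition only, never for real security.
p, q = 61, 53            # two (small) primes
n = p * q                # modulus, part of both keys
phi = (p - 1) * (q - 1)  # Euler's totient of n
e = 17                   # public exponent (coprime with phi)
d = pow(e, -1, phi)      # private exponent: modular inverse of e mod phi

message = 42                         # must be smaller than n
ciphertext = pow(message, e, n)      # encrypt with the public key (e, n)
recovered = pow(ciphertext, d, n)    # decrypt with the private key (d, n)

print(f"public key:  (e={e}, n={n})")
print(f"private key: (d={d}, n={n})")
print(f"message={message}, ciphertext={ciphertext}, recovered={recovered}")
assert recovered == message
```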
Network Performance Considerations
Bandwidth and Latency
- Bandwidth: The maximum rate at which data can be transmitted, typically measured in Mbps or Gbps. For high-traffic systems, sufficient bandwidth is necessary to avoid bottlenecks and ensure responsiveness.
- Latency: The delay in data transmission between sender and receiver. Low latency is essential for real-time applications, while higher latency may be acceptable in applications that prioritize reliability over speed.
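A back-of-the-envelope calculation helps separate the two: total transfer time is roughly the propagation latency plus the payload size divided by bandwidth. The numbers below are illustrative, not measurements.

```python
# Rough transfer-time estimate: latency + size / bandwidth.
latency_s = 0.050                # 50 ms one-way latency
bandwidth_bps = 100e6            # 100 Mbps link
payload_bytes = 5 * 1024 * 1024  # 5 MiB response

transmission_s = (payload_bytes * 8) / bandwidth_bps
transfer_s = latency_s + transmission_s
print(f"~{transfer_s * 1000:.0f} ms total "
      f"({latency_s * 1000:.0f} ms latency + "
      f"{transmission_s * 1000:.0f} ms transmission)")
```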
Conclusion
Networking is a fundamental aspect of system design. Understanding networking protocols, performance factors, and security principles is essential for designing scalable, efficient, and secure distributed systems. Mastery of these concepts will enable you to make informed design choices during system design interviews and create robust architectures that handle real-world challenges.