System/Networking Paper Reading List

Index

Middleboxes and NFV
Network Abstractions
eBPF and XDP
Transport Protocol
Microservice and Service Mesh
Network Stack
Workload Interference
Internet Architecture
Container Networking

Reading List

Middleboxes and NFV

The Click Modular Router, TOCS '00
- A modular software architecture for building routers, where packet processing is composed from fine-grained elements connected in a directed graph.
Middleboxes No Longer Considered Harmful, OSDI '04
- Argues that middleboxes are a permanent part of the Internet and proposes a Delegation-Oriented Architecture (DOA) that enables end-hosts to explicitly authorize middlebox processing.
Making Middleboxes Someone Else's Problem: Network Processing as a Cloud Service, SIGCOMM '12
- Proposes APLOMB, which outsources enterprise middlebox processing (e.g., firewalls, WAN optimizers) to the cloud, reducing cost and management complexity.
Design and Implementation of a Consolidated Middlebox Architecture, NSDI '12
- Today's middleboxes are independent, specialized boxes. CoMb consolidates them on shared hardware to exploit multiplexing, module reuse, and spatial distribution.
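Split/Merge: System Support for Elastic Execution in Virtual Middleboxes, NSDI '13
- Supports elastic scaling of stateful virtual middleboxes: per-flow state is split across replicas when scaling out and merged back when scaling in, with flows rerouted to follow their state.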
ClickOS and the Art of Network Function Virtualization, NSDI '14
- Builds tiny, specialized VMs (as small as 5MB) on top of Xen and Click, achieving boot times under 30ms and near-line-rate throughput for middlebox processing.
Enforcing Network-Wide Policies in the Presence of Dynamic Middlebox Actions using FlowTags, NSDI '14
- Middleboxes dynamically modify packets (e.g., NATs rewriting addresses), breaking SDN policy enforcement. FlowTags lets middleboxes tag packets with metadata so downstream switches can correctly apply policies.
BlindBox: Deep Packet Inspection over Encrypted Traffic, SIGCOMM '15
- Enables middleboxes to perform deep packet inspection on encrypted traffic without decryption, using garbled circuits and tokenization for oblivious keyword search.
Rollback-Recovery for Middleboxes, SIGCOMM '15
- Provides fault tolerance for middleboxes via rollback-recovery with output commit, ensuring no externally visible side effects are lost upon failure.
NetBricks: Taking the V out of NFV, OSDI '16
- Uses Rust's type and memory safety to isolate NFs within a single process instead of separate VMs/containers, enabling zero-copy packet passing between chained NFs.
OpenBox: A Software-Defined Framework for Developing, Deploying, and Managing Network Functions, SIGCOMM '16
- Decouples NF logic from data-plane execution by decomposing NFs into reusable processing blocks, allowing multiple NF applications to share common blocks and reducing redundant computation.
Paving the Way for NFV: Simplifying Middlebox Modifications Using StateAlyzr, NSDI '16
- Uses static analysis to automatically identify and categorize internal state variables in middlebox code, generating scaffolding for state export/import to support migration, scaling, and fault tolerance.
Stateless Network Functions: Breaking the Tight Coupling of State and Processing, NSDI '17
- Decouples NF state from processing by storing all state in a remote low-latency data store, making NF instances stateless and simplifying scaling, migration, and fault tolerance.
NFP: Enabling Network Function Parallelism in NFV, SIGCOMM '17
- Automatically constructs a parallel execution DAG from a sequential NF chain by analyzing read/write dependencies, enabling intra-chain parallelism with up to 2.5x throughput improvement.
NFVnice: Dynamic Backpressure and Scheduling for NFV Service Chains, SIGCOMM '17
- Introduces a rate-aware backpressure mechanism that signals upstream NFs to slow down when downstream NFs are congested, combined with dynamic CPU scheduling to avoid wasted work in chains.
Metron: NFV Service Chains at the True Speed of the Underlying Hardware, NSDI '18
- Eliminates inter-core communication overhead by leveraging hardware classification (NIC RSS, Flow Director) to pin entire service chains to a single CPU core.
Elastic Scaling of Stateful Network Functions, NSDI '18
- Proposes S6, a framework with a distributed shared state abstraction that allows NF instances to transparently access and migrate per-flow state, enabling elastic scaling with minimal disruption.
ResQ: Enabling SLOs in Network Function Virtualization, NSDI '18
- Uses hardware performance counters to profile NF sensitivity to shared resource contention, then makes interference-aware placement decisions to satisfy per-NF latency and throughput SLOs.
Microboxes: High Performance NFV with Customizable, Asynchronous TCP Stacks and Dynamic Subscriptions, SIGCOMM '18
- Provides customizable, per-NF TCP stacks where each NF subscribes only to the TCP events it needs, avoiding the cost of a full TCP implementation while enabling flow-level visibility.
ClickNF: a Modular Stack for Custom Network Functions, ATC '18
- Extends Click with a full, modular TCP/IP stack and POSIX-compatible socket API, enabling transport-layer NFs to be built as compositions of reusable Click elements.
FlowBlaze: Stateful Packet Processing in Hardware, NSDI '19
- Implements stateful NFs directly in programmable switch hardware using extended finite state machines (EFSMs), overcoming the limitation of stateless match-action tables.
Performance Contracts for Software Network Functions, NSDI '19
- Uses symbolic execution to derive formal, human-readable performance contracts that map each packet's execution path through an NF to its precise latency in cycles.
Correctness and Performance for Stateful Chained Network Functions, NSDI '19
- Provides per-flow transactional semantics across chains of stateful NFs by co-designing the state management layer with the packet processing pipeline for minimal overhead.
Verifying software network functions with no verification expertise, SOSP '19
- Enables push-button formal verification of NFs without expertise in formal methods, combining symbolic execution with carefully designed data structure abstractions.
Gallium: Automated Software Middlebox Offloading to Programmable Switches, SIGCOMM '20
- Automatically compiles software NFs (e.g., Click) to programmable switches (P4), partitioning logic between the switch data plane and CPU based on hardware constraints.
TEA: Enabling State-Intensive Network Functions on Programmable Switches, SIGCOMM '20
- Extends switch match-action tables with off-chip DRAM to support NFs with large state (e.g., large flow tables), achieving near-line-rate performance despite limited on-chip memory.
Contention-Aware Performance Prediction For Virtualized Network Functions, SIGCOMM '20
- Predicts NF performance under co-location by profiling NFs in isolation and composing profiles with lightweight micro-benchmarks that characterize resource sensitivity.
SNF: serverless network functions, SoCC '20
- Applies the serverless paradigm to NFs, deploying them as auto-scaled, event-triggered functions, addressing challenges of persistent state across ephemeral invocations.
Performance Interfaces for Network Functions, NSDI '22
- Extends performance contracts into composable, modular performance interfaces that can be composed to predict the performance of NF chains without re-analyzing the full system.
Quadrant: A Cloud-Deployable NF Virtualization Platform, SoCC '22
- Designs an NFV platform for public cloud environments where hardware-level optimizations (DPDK/SR-IOV) may be unavailable, bridging the gap between NFV research and real cloud deployments.
A High-Speed Stateful Packet Processing Approach for Tbps Programmable Switches, NSDI '23
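- Proposes Ribosome, which extends programmable switches with external memory and external packet processors (e.g., CPUs or FPGAs): packet payloads are parked over RDMA on directly connected servers while only the relevant header bits are sent to the devices running stateful NFs.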
ExoPlane: An Operating System for On-Rack Switch Resource Augmentation, NSDI '23
- An exokernel-inspired OS abstraction for augmenting programmable switches with compute and memory from co-located rack servers, exposing resource heterogeneity to NF developers.
LemonNFV: Consolidating Heterogeneous Network Functions at Line Speed, NSDI '23
- Consolidates unmodified NFs that are implemented on different platforms (e.g., Snort, Click, and NetBricks).
Disaggregating Stateful Network Functions, NSDI '23
- Disaggregates NF state into a shared remote store while running stateless processing instances, enabling independent scaling of compute and state with careful caching and batching to minimize overhead.
Automatic Parallelization of Software Network Functions, NSDI '24
- Automatically parallelizes single-threaded NF code across multiple cores by analyzing state access dependencies and partitioning/replicating state for safe concurrent execution.
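The read/write dependency analysis mentioned in the NFP entry above can be illustrated with a small sketch. This is a simplified illustration, not NFP's actual algorithm: the `NF` class, the field names, and the greedy staging rule are all assumptions made for the example, and effects such as packet drops or packet copying are ignored. Two NFs are placed in the same parallel stage only if neither writes a field the other reads or writes.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class NF:
    """Hypothetical NF description: the packet fields it reads and writes."""
    name: str
    reads: frozenset = field(default_factory=frozenset)
    writes: frozenset = field(default_factory=frozenset)


def conflicts(earlier: NF, later: NF) -> bool:
    """True if running `later` alongside `earlier` could change the result."""
    return bool(
        earlier.writes & later.reads      # read-after-write
        or earlier.reads & later.writes   # write-after-read
        or earlier.writes & later.writes  # write-after-write
    )


def parallel_stages(chain: list[NF]) -> list[list[str]]:
    """Place each NF in the earliest stage after every NF it conflicts with."""
    stage_of: dict[str, int] = {}
    for i, nf in enumerate(chain):
        stage = 0
        for prev in chain[:i]:
            if conflicts(prev, nf):
                stage = max(stage, stage_of[prev.name] + 1)
        stage_of[nf.name] = stage
    stages = [[] for _ in range(max(stage_of.values(), default=-1) + 1)]
    for nf in chain:
        stages[stage_of[nf.name]].append(nf.name)
    return stages


# Illustrative chain; the read/write sets are made up for the example.
chain = [
    NF("firewall", reads=frozenset({"src_ip", "dst_ip"})),
    NF("monitor", reads=frozenset({"src_ip", "len"})),
    NF("nat", reads=frozenset({"src_ip"}), writes=frozenset({"src_ip"})),
    NF("load_balancer", reads=frozenset({"dst_ip"}), writes=frozenset({"dst_ip"})),
]
print(parallel_stages(chain))
# [['firewall', 'monitor'], ['nat', 'load_balancer']]
```

Here the two read-only NFs share a stage, and the two rewriting NFs share the next stage because they touch disjoint fields.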

Network Abstraction / Language

Chimera: A Declarative Language for Streaming Network Traffic Analysis, Security '12
- A declarative query language for expressing complex, stateful network traffic analysis policies (e.g., multi-step attack detection) over streaming packet data, compiled into efficient automata for real-time execution.
- Enables analysts to specify "what" to detect rather than "how," supporting composition of temporal and cross-flow correlations that are error-prone to implement imperatively.
Abstractions for network update, SIGCOMM '12
- Introduces consistent network update abstractions (per-packet and per-flow consistency) that guarantee network-wide policy invariants are maintained during SDN rule transitions, preventing transient violations.
- Uses a two-phase update mechanism: new rules are installed across all switches before traffic is shifted, ensuring every packet sees either the old or new policy, never a mix.
Compiling Path Queries, NSDI '16
- A regular-expression-based query language for monitoring network paths taken by packets, compiled into switch-level rules that encode path history via packet tagging.
- The compiler uses determinization and tag minimization to generate efficient forwarding rules for runtime path-level monitoring with low overhead.
SNAP: Stateful Network-Wide Abstractions for Packet Processing, SIGCOMM '16
- Provides a "one-big-switch" programming abstraction with mutable per-flow state, letting programmers write stateful packet-processing programs as if the entire network were a single switch.
- The compiler handles state placement and program partitioning across physical switches, bridging the global abstraction and the distributed reality of limited-resource devices.
mOS: A Reusable Networking Stack for Flow Monitoring Middleboxes, NSDI '17
- A reusable, event-driven monitoring stack exposing flow-level abstractions (TCP state events, reassembled bytestreams) so middlebox developers need not re-implement TCP reconstruction.
- Provides a monitoring socket API with per-flow event callbacks, enabling IDS, proxy, and load balancer applications to be written concisely on a common substrate.
Quantitative Network Monitoring with NetQRE, SIGCOMM '17
- Introduces Quantitative Regular Expressions for networking (NetQRE), a declarative language for quantitative monitoring queries (e.g., traffic entropy, SYN-flood ratios) that go beyond boolean pattern matching.
- The compiler generates streaming algorithms from NetQRE programs with formal worst-case performance guarantees, bridging expressive queries and line-rate processing.
Language-Directed Hardware Design for Network Performance Monitoring, SIGCOMM '17
- Proposes Marple, a SQL-like query language for performance monitoring (e.g., per-flow latency, TCP incast detection) that compiles to programmable switch hardware with key-value store augmented pipelines.
- Key insight: co-designing the language and hardware -- the language restricts queries to those efficiently implementable in hardware, while the hardware is designed to support the language's linear-in-state operations.
Sonata: query-driven streaming network telemetry, SIGCOMM '18
- A declarative query interface for telemetry that automatically partitions execution between programmable switches (for early data reduction) and streaming processors (for complex analysis), reducing data volume sent to the backend.
Lyra: A Cross-Platform Language and Compiler for Data Plane Programming on Heterogeneous ASICs, SIGCOMM '20
- A hardware-independent language and compiler for data plane programming that abstracts away differences between heterogeneous switch ASICs, enabling a single program to be compiled to multiple backend targets.
Lucid: a language for control in the data plane, SIGCOMM '21
- A DSL for writing event-driven control programs that execute entirely within the switch data plane, enabling reactive control logic (e.g., congestion response, failure detection) at data-plane speed without controller round-trips.
Programming Network Stack for Middleboxes with Rubik, NSDI '21
- A domain-specific language for programming middlebox network stacks, with an emphasis on supporting diverse transport protocols and flexible protocol-layer hierarchies.
SwiSh: Distributed Shared State Abstractions for Programmable Switches, NSDI '22
- Provides shared state abstractions (registers, counters, tables) across a network of programmable switches with configurable consistency models, letting developers write programs as if operating on a single switch.
NetRPC: Enabling In-Network Computation in Remote Procedure Calls, NSDI '23
- Enables programmable switches to intercept and process RPC messages (e.g., aggregation, caching) by handling the mismatch between variable-length RPC serialization formats and fixed-pipeline switch hardware.
ClickINC: In-network Computing as a Service in Heterogeneous Programmable Data-center Networks, SIGCOMM '23
- A Click-inspired modular framework for in-network computing that provides a unified abstraction across heterogeneous programmable devices (SmartNICs, switches, FPGAs) with automatic partitioning and placement.
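The two-phase update mechanism described in the "Abstractions for network update" entry above can be sketched as a toy simulation. This is a minimal sketch under stated assumptions: the `Network` class, its method names, and the topology are hypothetical, and real switches would match on a version tag in a packet header field. The point it shows is per-packet consistency: a packet stamped at ingress with one version is forwarded by that version's rules at every hop.

```python
class Network:
    """Toy model: every switch keeps one rule table per policy version."""

    def __init__(self):
        self.rules = {}            # (switch, version) -> {dst: next_hop}
        self.ingress_version = 1   # tag stamped on packets entering the network

    def install(self, version, config):
        """Install rules guarded by `version` without touching other versions."""
        for switch, table in config.items():
            self.rules[(switch, version)] = dict(table)

    def stamp(self):
        """Ingress tags each arriving packet with the current version."""
        return self.ingress_version

    def path(self, start, dst, version):
        """Follow the rules of a single version from `start` to `dst`."""
        hops, node = [start], start
        while node != dst:
            node = self.rules[(node, version)][dst]
            hops.append(node)
        return hops

    def two_phase_update(self, new_version, config):
        old_version = self.ingress_version
        self.install(new_version, config)      # phase 1: new rules everywhere
        self.ingress_version = new_version     # phase 2: flip ingress stamping
        return old_version                     # caller cleans up once drained

    def garbage_collect(self, version):
        """Remove a version's rules after its in-flight packets have left."""
        self.rules = {k: v for k, v in self.rules.items() if k[1] != version}


net = Network()
net.install(1, {"A": {"D": "B"}, "B": {"D": "D"}, "C": {"D": "D"}})
in_flight_tag = net.stamp()                    # packet enters before the update
old = net.two_phase_update(2, {"A": {"D": "C"}, "B": {"D": "D"}, "C": {"D": "D"}})
print(net.path("A", "D", in_flight_tag))       # uses only version-1 rules
print(net.path("A", "D", net.stamp()))         # uses only version-2 rules
net.garbage_collect(old)
```

Old rules are removed only after the update returns, which stands in for waiting until all packets carrying the old tag have drained.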

eBPF and XDP (See Also awesome-ebpf)

The eXpress data path: fast programmable packet processing in the operating system kernel, CoNEXT '18
- Introduces XDP, a framework for running eBPF programs at the earliest point in the Linux network stack (in the driver, before socket buffer allocation). Unlike kernel-bypass frameworks, XDP keeps processing inside the kernel, so programs can drop, redirect, or modify packets at high speed and pass the rest up the regular stack.
hXDP: Efficient Software Packet Processing on FPGA NICs, OSDI '20
- Compiles XDP/eBPF programs to run on FPGA-based SmartNICs, offloading packet processing from the host CPU while maintaining the familiar eBPF programming model.
Specification and verification in the field: Applying formal methods to BPF just-in-time compilers in the Linux kernel, OSDI '20
- Applies automated, SMT-backed formal verification to Linux's BPF JIT compilers, finding and fixing bugs in production code while demonstrating that formal methods can be practical for kernel development.
BPF for storage: an exokernel-inspired approach, HotOS '21
- Proposes extending eBPF to storage I/O paths, allowing applications to inject custom logic (e.g., filtering, aggregation) into the kernel's storage stack to reduce data movement and kernel crossings.
BMC: Accelerating Memcached using Safe In-kernel Caching and Pre-stack Processing, NSDI '21
- Uses eBPF/XDP to implement an in-kernel cache for Memcached that intercepts GET requests before the network stack, achieving significant throughput improvements by avoiding user-kernel transitions for cache hits.
Synthesizing Safe and Efficient Kernel Extensions for Packet Processing, SIGCOMM '21
- Presents K2, a program-synthesis-based compiler that automatically optimizes eBPF bytecode, searching for smaller and faster programs that are formally checked to be equivalent to the original and that still pass the kernel verifier.
ghOSt: Fast & Flexible User-Space Delegation of Linux Scheduling, SOSP '21
- Enables user-space implementation of CPU scheduling policies via a kernel agent that delegates scheduling decisions, allowing rapid iteration on scheduling algorithms without kernel modifications.
Syrup: User-Defined Scheduling Across the Stack, SOSP '21
- Provides a unified framework for user-defined scheduling policies across CPU, network, and storage using eBPF, enabling application-specific cross-layer scheduling optimizations.
LiteFlow: towards high-performance adaptive neural networks for kernel datapath, SIGCOMM '22
- Splits adaptive neural networks into a kernel-space fast path for inference and a user-space slow path for model tuning, so that ML-driven decisions (e.g., congestion control) run directly in the kernel datapath without a user-kernel crossing per decision.
XRP: In-Kernel Storage Functions with eBPF, OSDI '22
- Allows applications to push storage operations (e.g., B-tree lookups) into the kernel via eBPF, reducing I/O round-trips by chaining dependent reads within the NVMe driver.
Electrode: Accelerating Distributed Protocols with eBPF, NSDI '23
- Offloads performance-critical paths of distributed protocols (e.g., Paxos, chain replication) to eBPF in the kernel, reducing latency by avoiding user-space context switches on the critical path.
Tigger: A Database Proxy That Bounces With User-Bypass, VLDB '23
- Uses eBPF to implement a high-performance database proxy that bypasses user-space for common-case query routing, falling back to user-space only for complex cases.
Automatic Kernel Offload Using BPF, HotOS '23
- Proposes automatically identifying and offloading performance-critical application code fragments to eBPF in the kernel, treating eBPF as a general-purpose kernel acceleration substrate.
EPF: Evil Packet Filter, ATC '23
- Shows how the Linux kernel's BPF infrastructure can be abused as an exploitation aid: BPF code supplied by unprivileged users is reused as an attack payload to bypass deployed and proposed kernel isolation defenses.
DINT: Fast In-Kernel Distributed Transactions with eBPF, NSDI '24
- Implements distributed transaction coordination (2PC, OCC) in eBPF to minimize latency by keeping the critical path entirely in kernel space, bypassing user-space transaction managers.
FetchBPF: Customizable Prefetching Policies in Linux with eBPF, ATC '24
- Allows applications to define custom file prefetching policies via eBPF hooks in the Linux page cache, enabling workload-specific prefetching without kernel modifications.
eTran: Extensible Kernel Transport with eBPF, NSDI '25
- Enables extensible transport-layer protocols by allowing eBPF programs to customize TCP/UDP processing (e.g., congestion control, header parsing) within the kernel network stack.

Transport Protocol

Data Center TCP (DCTCP), SIGCOMM '10
- Uses ECN marks from switches to achieve fine-grained congestion control in datacenters, reacting proportionally to the extent of congestion rather than treating any congestion signal as severe.
pFabric: minimal near-optimal datacenter transport, SIGCOMM '13
- Achieves near-optimal flow completion times by decoupling flow scheduling from rate control: switches prioritize packets by remaining flow size, while senders transmit at line rate with minimal state.
TIMELY: RTT-based Congestion Control for the Datacenter, SIGCOMM '15
- Uses precise RTT measurements (enabled by NIC hardware timestamps) as the primary congestion signal, avoiding the deployment complexity of ECN while achieving low latency in datacenters.
The QUIC Transport Protocol: Design and Internet-Scale Deployment, SIGCOMM '17
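- A UDP-based transport protocol with built-in encryption (its own cryptographic handshake in the deployment described, with TLS 1.3 adopted in the later IETF standard), 0-RTT connection establishment, multiplexed streams without head-of-line blocking, and connection migration across network changes.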
Credit-Scheduled Delay-Bounded Congestion Control for Datacenters, SIGCOMM '17
- Introduces ExpressPass, a credit-based congestion control where receivers explicitly schedule sender transmissions, providing bounded queuing delay and near-zero packet loss in datacenters.
Re-architecting datacenter networks and stacks for low latency and high performance, SIGCOMM '17
- Proposes NDP, combining per-packet multipath spraying with receiver-driven flow control and switch trimming (headers only on congestion) to achieve ultra-low latency and high throughput.
Homa: A Receiver-Driven Low-Latency Transport Protocol Using Network Priorities, SIGCOMM '18
- A connectionless, receiver-driven protocol that uses in-network priority queues to schedule packets by remaining message size (SRPT), achieving low tail latency for short messages.
HPCC: high precision congestion control, SIGCOMM '19
- Leverages in-band network telemetry (INT) to obtain precise link utilization and queue information, enabling congestion control that converges quickly to high utilization with near-zero queuing.
R2P2: Making RPCs first-class datacenter citizens, ATC '19
- A transport protocol designed specifically for RPCs, with request-level (not connection-level) load balancing and a policy-based scheduling abstraction for implementing various scheduling disciplines.
Swift: Delay is Simple and Effective for Congestion Control in the Datacenter, SIGCOMM '20
- A delay-based congestion control for datacenters that uses NIC and host timestamps to split the measured RTT into fabric and endpoint components, holding each to a target delay so that fabric and host congestion get separate, targeted responses.
Aeolus: A Building Block for Proactive Transport in Datacenters, SIGCOMM '20
- A building block for proactive (credit-based) transports that addresses the first-RTT problem: new flows send unscheduled packets at line rate before credits arrive, while selective dropping at switches protects scheduled packets from that pre-credit traffic.
PowerTCP: Pushing the Performance Limits of Datacenter Networks, NSDI '22
- Uses a power function (throughput × delay) as the congestion signal, achieving both high throughput and low latency by balancing these competing objectives more effectively than prior approaches.
TCP is Harmful to In-Network Computing: Designing a Message Transport Protocol (MTP), HotNets '21
- Argues that TCP's byte-stream abstraction and reliability semantics are mismatched for in-network computing (e.g., aggregation at switches), proposing a message-oriented transport with relaxed ordering.
Towards Domain-Specific Network Transport for Distributed DNN Training, NSDI '24
- Designs a transport protocol tailored for distributed DNN training traffic patterns (e.g., all-reduce), exploiting predictable communication patterns for better scheduling and reduced tail latency.
MTP: A Transport for In-Network Computing, NSDI '25
- A message transport protocol enabling in-network compute operations (e.g., aggregation) by providing message-level reliability and allowing switches to process and transform messages in transit.
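The proportional reaction described in the DCTCP entry above can be made concrete with the sender-side update from the DCTCP paper: the sender keeps a running estimate `alpha` of the fraction of ECN-marked packets and cuts its window by `alpha / 2` rather than by half. The sketch below is a simplified per-window model, not a full TCP implementation; the function name and the one-packet-per-window additive increase are assumptions for illustration, and `g = 1/16` is the gain used in the paper's evaluation.

```python
def dctcp_update(cwnd, alpha, acked, marked, g=1 / 16):
    """Apply one window's worth of DCTCP sender logic.

    cwnd   -- congestion window in packets
    alpha  -- running estimate of the fraction of marked packets
    acked  -- packets acknowledged in the last window
    marked -- how many of those carried an ECN echo
    """
    frac = marked / acked if acked else 0.0
    alpha = (1 - g) * alpha + g * frac          # EWMA of the marked fraction
    if marked:
        cwnd = max(1.0, cwnd * (1 - alpha / 2))  # cut in proportion to congestion
    else:
        cwnd += 1.0                              # standard additive increase
    return cwnd, alpha


# Light congestion barely reduces the window; persistent full marking
# (alpha == 1) degenerates to the classic TCP halving.
print(dctcp_update(100.0, 0.0, acked=100, marked=10))
print(dctcp_update(100.0, 1.0, acked=100, marked=100))
```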

Microservice and Service Mesh

Microservices: yesterday, today, and tomorrow, Springer '17
- One of the first academic surveys of microservices, tracing their evolution from earlier architectural styles and outlining open challenges.
Verification in the Age of Microservices, HotOS '17
- Argues that microservice architectures shift verification challenges from intra-application correctness to inter-service protocol and contract verification, requiring new tools for checking composition-level properties.
Service Fabric: A Distributed Platform for Building Microservices in the Cloud, EuroSys '18
- Describes the design of Azure Service Fabric, with a focus on how it addresses hard consistency and distributed systems problems.
Overload Control for Scaling WeChat Microservices, SoCC '18
- Presents DAGOR, WeChat's production overload control system that uses business-priority-based admission control at each service, with cooperative admission across the call graph to shed low-priority requests early and prevent cascading overload.
µTune: Auto-Tuned Threading for OLDI Microservices, OSDI '18
- Shows that the optimal threading model (inline, synchronous, or asynchronous) for microservices varies with load and microservice characteristics, and proposes an online system that automatically selects and tunes the threading configuration to minimize tail latency.
An Open-Source Benchmark Suite for Microservices and Their Hardware-Software Implications for Cloud & Edge Systems, ASPLOS '19
- Introduces DeathStarBench, an open-source suite of representative end-to-end microservice applications (social network, hotel reservation, etc.) for studying microservice performance, revealing that microservices have distinct hardware implications (e.g., deep call graphs amplify tail latency, high OS/network overhead).
Seer: Leveraging Big Data to Navigate the Complexity of Performance Debugging in Cloud Microservices, ASPLOS '19
- Uses deep learning on distributed tracing data and hardware-level metrics to proactively predict QoS violations in microservice systems before they occur, and identifies the culprit microservice causing the violation.
PARTIES: QoS-Aware Resource Partitioning for Multiple Interactive Services, ASPLOS '19
- Dynamically partitions shared hardware resources (cores, cache, memory bandwidth) among co-located latency-sensitive services using a gradient-descent-inspired controller that detects QoS violations and reallocates resources in near-real-time.
E3: Energy-Efficient Microservices on SmartNIC-Accelerated Servers, ATC '19
- Offloads lightweight microservices (e.g., proxies, load balancers) to SmartNIC ARM cores, freeing host CPU for compute-intensive services while significantly reducing energy consumption per request.
Autopilot: workload autoscaling at Google, EuroSys '20
- Google's production autoscaler that uses ML (time-series forecasting) to recommend CPU and memory limits for jobs, reducing resource slack and out-of-memory events at scale.
FIRM: An Intelligent Fine-grained Resource Management Framework for SLO-Oriented Microservices, OSDI '20
- Combines online telemetry with ML models to detect SLO violations, pinpoint the responsible microservice via resource-usage anomaly detection, and apply per-microservice resource adjustments (vertical/horizontal scaling, traffic routing) to restore SLO compliance.
Accelerometer: Understanding Acceleration Opportunities for Data Center Overheads at Hyperscale, ASPLOS '20
- A study of how microservices spend their CPU cycles. It shows that, within Facebook, microservices spend only a small fraction of their execution time serving core application logic and significant cycles on orchestration work (e.g., compression, serialization, and I/O processing).
Nightcore: Efficient and Scalable Serverless Computing for Latency-Sensitive, Interactive Microservices, ASPLOS '21
- A serverless runtime optimized for microsecond-scale internal function calls in microservice applications, using shared-memory message channels and a concurrency-aware scheduler to minimize inter-function invocation overhead.
Sage: Practical and Scalable ML-Driven Performance Debugging in Microservices, ASPLOS '21
- Applies causal Bayesian networks over per-microservice metrics to identify root causes of QoS violations, scaling to large deployments by decomposing the global dependency graph into per-service local models.
Sinan: ML-Based and QoS-Aware Resource Management for Cloud Microservices, ASPLOS '21
- Uses a hybrid ML model (a CNN for short-term end-to-end latency and boosted trees for the longer-term probability of a QoS violation) to predict the effect of per-microservice resource allocations, then selects allocations that meet SLOs while minimizing total resource usage.
Automatic Policy Generation for Inter-Service Access Control of Microservices, Security '21
- Uses static analysis to extract each service's invocation logic and a graph-based policy abstraction to automatically generate and maintain inter-service access control policies.
Characterizing Microservice Dependency and Performance: Alibaba Trace Analysis, SoCC '21
- Large-scale analysis of Alibaba's production microservice traces, revealing structural properties of dependency graphs (e.g., heavy fan-out, long call chains) and their impact on tail latency amplification.
SHOWAR: Right-Sizing And Efficient Scheduling of Microservices, SoCC '21
- Right-sizes microservices using the empirical variance of historical resource usage for vertical scaling and a control-theoretic controller for horizontal scaling, and generates affinity rules for the scheduler to reduce resource waste while meeting SLOs.
Service-Level Fault Injection Testing, SoCC '21
- Introduces a systematic approach to fault injection testing at the service level (rather than individual API calls), automatically generating fault injection campaigns that explore how failures in one service propagate through the microservice dependency graph.
Leveraging Service Meshes as a New Network Layer, HotNets '21
- Frames the service mesh as a new network-layer abstraction and discusses its use cases and open challenges.
DeepRest: Deep Resource Estimation for Interactive Microservices, EuroSys '22
- Estimates per-request resource consumption by analyzing request content (e.g., API parameters, payload size) using deep learning, enabling more accurate resource provisioning than load-agnostic approaches.
CRISP: Critical Path Analysis of Large-Scale Microservice Architectures, ATC '22
- Uber's production-grade microservice tracing system for critical path analysis (CPA), built on top of Jaeger.
- Section 7.2 has some interesting data on Uber's microservices in production.
SPRIGHT: Extracting the Server from Serverless Computing! High-performance eBPF-based Event-driven, Shared-memory Processing, SIGCOMM '22
- Accelerates the service mesh data path in serverless deployments using eBPF and shared memory.
DeepScaling: Microservices AutoScaling for Stable CPU Utilization in Large Scale Cloud Systems, SoCC '22
- Uses deep learning to predict future workload and autoscale microservice replicas to maintain stable CPU utilization targets, deployed at scale in production at Ant Group to reduce resource waste from reactive scaling oscillations.
The Power of Prediction: Microservice Auto Scaling via Workload Learning, SoCC '22
- Combines workload prediction with a queuing-theory model to proactively determine the number of microservice replicas needed, avoiding the lag and oscillation of reactive autoscalers.
Executing Microservice Applications on Serverless, Correctly, POPL '23
- Provides a formally verified compilation framework (mu2sls) that automatically transforms microservice applications to run on serverless platforms while preserving exactly-once semantics and fault tolerance guarantees.
The Benefit of Hindsight: Tracing Edge-Cases in Distributed Systems, NSDI '23
- Proposes hindsight logging that retroactively captures detailed traces only when anomalies are detected, enabling root-cause analysis of rare edge cases without the overhead of always-on verbose tracing.
Nodens: Enabling Resource Efficient and Fast QoS Recovery of Dynamic Microservice Applications in Datacenters, ATC '23
- Uses lightweight probing and learned models to quickly detect and recover from QoS degradations caused by dynamic microservice behaviors (e.g., version updates, traffic shifts).
Lifting the veil on Meta’s microservice architecture: Analyses of topology and request workflows, ATC '23
- A large-scale empirical study of Meta's microservice topology and request workflows, revealing patterns such as highly skewed fanout distributions and long critical paths.
ServiceRouter: Hyperscale and Minimal Cost Service Mesh at Meta, OSDI '23
- Meta's production service mesh that achieves low overhead by embedding routing logic into client libraries rather than sidecars, with centralized control for policy and load balancing.
Network-Centric Distributed Tracing with DeepFlow: Troubleshooting Your Microservices in Zero Code, SIGCOMM '23
- Uses eBPF to capture network-level tracing data (TCP flows, latency) without application instrumentation, correlating network events with application traces for root-cause analysis.
Dissecting Overheads of Service Mesh Sidecars, SoCC '23
- A detailed performance analysis of service mesh sidecar proxies (e.g., Envoy), identifying sources of latency and CPU overhead and proposing optimizations.
LatenSeer: Causal Modeling of End-to-End Latency Distributions by Harnessing Distributed Tracing, SoCC '23
- Builds causal models from distributed traces to predict how changes (e.g., scaling, code optimizations) will affect end-to-end latency distributions across the microservice graph.
Expressive Policies For Microservice Networks, HotNets '23
- Language and system support for complex safety properties that reason about the flow of requests across the whole microservice network (not just between adjacent hops).
Application Defined Networks, HotNets '23
- Proposes letting applications define their own network abstractions and policies within service meshes, rather than relying on fixed infrastructure-level primitives.
Cilantro: Performance-Aware Resource Allocation for General Objectives via Online Feedback, OSDI '23
- A general-purpose online resource allocator that uses Bayesian optimization to learn application performance models and allocate resources to meet diverse objectives (latency, throughput, fairness).
Blueprint: A Toolchain for Highly-Reconfigurable Microservice Applications, SOSP '23
- A compilation framework that separates microservice application logic from infrastructure concerns, enabling the same application code to be deployed across different backends (serverless, containers, etc.).
MuCache: a General Framework for Caching in Microservice Graphs, NSDI '24
- A caching framework that automatically identifies caching opportunities in microservice call graphs and places caches at optimal points to reduce redundant computation and latency.
Autothrottle: A Practical Bi-Level Approach to Resource Management for SLO-Targeted Microservices, NSDI '24
- Combines coarse-grained cluster-level scheduling with fine-grained per-service throttling to efficiently meet SLOs while maximizing resource utilization in microservice deployments.
TraceWeaver: Distributed Request Tracing for Microservices Without Application Modification, SIGCOMM '24
- Reconstructs distributed traces from network-level observations (packet timing, connection patterns) without requiring application-level instrumentation or trace ID propagation.
TopFull: An Adaptive Top-Down Overload Control for SLO-Oriented Microservices, SIGCOMM '24
- An overload control mechanism that sheds load at entry points based on downstream capacity signals, preventing overload from propagating deep into the microservice graph.
Canal Mesh: A Cloud-Scale Sidecar-Free Multi-Tenant Service Mesh Architecture, SIGCOMM '24
- Eliminates per-pod sidecars by using per-node proxies with kernel-level traffic interception, reducing resource overhead while maintaining service mesh functionality at cloud scale.
Derm: SLA-aware Resource Management for Highly Dynamic Microservices, ISCA '24
- A hardware-software co-designed resource manager that uses hardware performance counters and ML models to rapidly detect and mitigate SLA violations in dynamic microservice environments.
Copper and Wire: Bridging Expressiveness and Performance for Service Mesh Policies, ASPLOS '25
- Proposes a two-tier policy architecture where expressive high-level policies (Copper) are compiled into fast data-plane rules (Wire), balancing policy expressiveness with enforcement performance.
Embracing Imbalance: Dynamic Load Shifting among Microservice Containers in Shared Clusters, ASPLOS '25
- Instead of load balancing to equalize load, dynamically shifts load between microservice replicas to exploit resource availability variations in shared clusters.
Rajomon: Decentralized and Coordinated Overload Control for Latency-Sensitive Microservices, NSDI '25
- A decentralized overload control system where each microservice independently makes admission decisions while coordinating through request metadata to achieve system-wide overload protection.
High-level Programming for Application Networks, NSDI '25
- Provides a high-level programming model for defining application-specific network policies and behaviors in service meshes, abstracting away low-level proxy configuration.
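
Several overload-control entries above (e.g., TopFull) shed load at the entry point based on downstream capacity. The sketch below is a generic illustration of that idea only, not the algorithm of any listed paper. It assumes the entry point already knows an offered rate and a downstream capacity estimate; the names `admit_probability` and `admit` are hypothetical.

```python
import random


def admit_probability(offered_rps, downstream_capacity_rps):
    """Fraction of incoming requests the entry point should admit.

    Assumption: both rates are measured elsewhere and passed in.
    """
    if offered_rps <= 0:
        return 1.0  # nothing offered, nothing to shed
    # Admit everything while under capacity, otherwise admit proportionally.
    return min(1.0, downstream_capacity_rps / offered_rps)


def admit(offered_rps, downstream_capacity_rps, rng=random.random):
    """Probabilistically admit one request; rng is injectable for testing."""
    return rng() < admit_probability(offered_rps, downstream_capacity_rps)
```

Shedding at the entry point means rejected requests never consume resources deeper in the call graph.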

Network Stack and RPC

netmap: A Novel Framework for Fast Packet I/O, ATC '12
- Provides a high-performance packet I/O framework that maps NIC rings directly into user space, eliminating per-packet system calls and achieving line-rate packet processing.
Chronos: Predictable Low Latency for Data Center Applications, SoCC '12
- Provides predictable low latency for datacenter applications by using deadline-aware scheduling and admission control to bound tail latency under load.
Improving Network Connection Locality on Multicore Systems, EuroSys '12
- Improves network performance on multicore systems by ensuring that connection processing happens on the same core that handles the application, reducing cache misses and cross-core communication.
MegaPipe: A New Programming Interface for Scalable Network I/O, OSDI '12
- A scalable network I/O API that batches system calls and partitions the listening socket across cores to eliminate contention, achieving significantly higher connection rates than BSD sockets.
mTCP: a Highly Scalable User-level TCP Stack for Multicore Systems, NSDI '14
- A user-level TCP stack that achieves high scalability on multicore systems by eliminating kernel crossing overhead and using per-core data structures to avoid locking.
Network stack specialization for performance, SIGCOMM '14
- Argues for generating application-specific network stacks that include only the features needed, reducing code complexity and improving performance for specific workloads.
IX: A Protected Dataplane Operating System for High Throughput and Low Latency, OSDI '14
- A dataplane OS that provides low-latency, high-throughput networking by separating the control plane (Linux) from a specialized, run-to-completion dataplane with zero-copy I/O.
Arrakis: The Operating System is the Control Plane, OSDI '14
- Removes the OS from the I/O data path by using hardware virtualization (SR-IOV) to give applications direct access to network and storage devices, while the OS manages only resource allocation.
StackMap: Low-Latency Networking with the OS Stack and Dedicated NICs, ATC '16
- Enables low-latency networking while retaining the standard OS TCP/IP stack by using dedicated NICs with memory-mapped ring buffers, combining kernel stack compatibility with user-space performance.
ModNet: A Modular Approach to Network Stack Extension, NSDI '15
- Provides a framework for composable network stack extensions, allowing modules (e.g., WAN optimizers, traffic shapers) to be dynamically inserted into the stack without kernel modifications.
RSS++: load and state-aware receive side scaling, CoNEXT '19
- Extends hardware RSS to be aware of CPU load and connection state, dynamically redistributing flows to balance load while maintaining flow affinity for stateful processing.
TAS: TCP Acceleration as an OS Service, EuroSys '19
- Accelerates TCP processing by splitting the stack into a "fast" data path (for data transport on established connections) and a control plane (for connection and context management, congestion control, etc.).
Snap: a Microkernel Approach to Host Networking, SOSP '19
- Google's user-space networking stack that runs as a microkernel-style service, enabling rapid iteration on networking features while providing isolation between applications and the network stack.
SocksDirect: Datacenter Sockets can be Fast and Compatible, SIGCOMM '19
- Provides a high-performance socket implementation that is fully compatible with existing applications by using RDMA for data transfer while maintaining the POSIX socket API.
Understanding Host Network Stack Overheads, SIGCOMM '20
- A detailed measurement study breaking down where CPU cycles are spent in the Linux network stack, identifying key bottlenecks (e.g., memory allocation, locking) and quantifying their impact.
The nanoPU: A Nanosecond Network Stack for Datacenters, OSDI '21
- A hardware-software co-designed network stack that moves protocol processing into the NIC and wakes threads directly from network events, achieving sub-microsecond RPC latency.
How to diagnose nanosecond network latencies in rich end-host stacks, NSDI '22
- Presents methodology and tools for pinpointing sources of latency at nanosecond granularity in complex end-host network stacks, using hardware timestamps and careful instrumentation.
Remote Procedure Call as a Managed System Service, NSDI '23
- Proposes managing RPC as an OS-level service that handles serialization, transport, and load balancing, decoupling applications from RPC implementation details.
NetClone: Fast, Scalable, and Dynamic Request Cloning for Microsecond-Scale RPCs, SIGCOMM '23
- Implements request cloning for microsecond-scale RPCs in programmable switches, reducing tail latency by speculatively sending each request to multiple replicas and filtering redundant responses in the network, without software overhead.
Fathom: Understanding Datacenter Application Network Performance, SIGCOMM '23
- A measurement and analysis framework for understanding network performance of datacenter applications, correlating application-level metrics with network-level behavior.
A Cloud-Scale Characterization of Remote Procedure Calls, SOSP '23
- A large-scale study of RPC characteristics at Google, revealing patterns in message sizes, call rates, and latency distributions that inform RPC system design.
HydraRPC: RPC in the CXL Era, ATC '24
- Leverages CXL memory pooling to implement zero-copy RPC, where caller and callee share memory regions directly, eliminating serialization and network transfer overhead.
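
Request cloning, as summarized for NetClone above, can be illustrated in plain software. This is a minimal sketch assuming each replica is a Python callable. It is not how NetClone is implemented, and `cloned_call`, `slow_replica`, and `fast_replica` are hypothetical names.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def cloned_call(replicas, request):
    """Send the same request to every replica and return the first response.

    If the first replica to finish raised, that exception propagates.
    """
    pool = ThreadPoolExecutor(max_workers=len(replicas))
    futures = [pool.submit(replica, request) for replica in replicas]
    done, _ = wait(futures, return_when=FIRST_COMPLETED)
    pool.shutdown(wait=False)  # do not block on the slower replicas
    return next(iter(done)).result()


# Hypothetical replicas that differ only in service time.
def slow_replica(request):
    time.sleep(0.3)
    return ("slow", request)


def fast_replica(request):
    time.sleep(0.01)
    return ("fast", request)


if __name__ == "__main__":
    print(cloned_call([slow_replica, fast_replica], "ping"))
```

The caller's latency is bounded by the fastest replica, at the cost of redundant work on the others.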

Workload Interference

Q-Clouds: Managing Performance Interference Effects for QoS-Aware Clouds, EuroSys '10
- Profiles each application's performance in standalone mode and uses it as the baseline target when consolidating applications onto a shared host.
Cuanta: quantifying effects of shared on-chip resource interference for consolidated virtual machines, SoCC '11
- Introduces a cache loader micro-benchmark that profiles application performance under varying cache-usage pressure, then uses the profile to predict the impact of cache interference among consolidated workloads.
Bubble-up: Increasing utilization in modern warehouse scale computers via sensible co-locations, MICRO '11
- Each application is profiled twice: 1) against a memory antagonist to obtain its (memory) sensitivity curve, and 2) to measure the memory pressure it generates itself.
Toward Predictable Performance in Software Packet-Processing Platforms, NSDI '12
- Profiles each NF's cache references per second when running alone and its performance-drop curve when co-located with a synthetic antagonist, then predicts the performance drop from these profiles.
DeepDive: Transparently Identifying and Managing Performance Interference in Virtualized Environments, ATC '13
- Detects interference via differential low-level metrics (see Table 1 of the paper), validates the interference and identifies the interfering resource by running the victim in isolation, and mitigates interference via migration.
Bobtail: Avoiding Long Tails in the Cloud, NSDI '13
- Identifies that network latency long tails in public clouds stem from OS/virtualization-layer interference (not the network fabric) and proposes VM-level techniques (e.g., careful placement and outbound pacing) to mitigate tail latency for latency-sensitive cloud applications.
Paragon: QoS-Aware Scheduling for Heterogeneous Datacenters, ASPLOS '13
- Uses collaborative filtering (similar to recommendation systems) to classify incoming workloads with minimal profiling and predict their sensitivity to interference and hardware heterogeneity, enabling QoS-aware placement without exhaustive benchmarking.
CPI²: CPU performance isolation for shared compute clusters, EuroSys '13
- Uses cycles-per-instruction (CPI) as the metric to detect workload interference and identify perpetrators, then addresses the interference by throttling them. Key takeaway: CPI correlates with application performance and is a stable metric.
Reconciling High Server Utilization and Sub-millisecond Quality-of-Service, EuroSys '14
- Co-location leads to increases in queuing delay, scheduling delay, and thread load imbalance. Addresses interference online via re-provisioning and scheduling.
Heracles: Improving resource efficiency at scale, ISCA '15
- Manages colocation of latency-critical (LC) and best-effort (BE) workloads via an online controller that monitors latency and resource usage and adjusts the isolation mechanisms for different resources.
PerfIso: Performance Isolation for Commercial Latency-Sensitive Services, ATC '18
- Describes a production system for performance isolation deployed in Microsoft Bing.
PARTIES: QoS-Aware Resource Partitioning for Multiple Interactive Services, ASPLOS '19
- Online monitoring that detects QoS violations in O(100ms) and boosts the resource allocation of victims.
PicNIC: predictable virtualized NIC, SIGCOMM '19
- Characterizes how performance isolation can break in a virtualized network stack, in terms of both network bandwidth and network stack processing rate. Provides an abstraction and construct based on bandwidth, latency, and loss rate to detect isolation breakdown and enforce isolation.
Shenango: Achieving High CPU Efficiency for Latency-sensitive Datacenter Workloads, NSDI '19
- Achieves both high CPU efficiency and low tail latency by rapidly reallocating cores between applications at microsecond timescales using a centralized core arbiter.
Caladan: Mitigating Interference at Microsecond Timescales, OSDI '20
- Uses a set of control signals and corresponding actions to detect and respond to interference over microsecond timescales.
FIRM: An Intelligent Fine-Grained Resource Management Framework for SLO-Oriented Microservices, OSDI '20
- Uses online telemetry data (resource usage and latency) and offline-learned models to detect and localize the microservices that cause SLO violations, and mitigates violations via dynamic re-provisioning.
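
The CPI-based detection summarized for CPI² above can be sketched as a simple outlier test. This is a minimal illustration, not the paper's implementation: the function names are hypothetical, and the threshold of `k = 2.0` standard deviations above the baseline mean is an assumed default.

```python
from statistics import mean, stdev


def cpi(cycles, instructions):
    """Cycles per instruction for one sampling window."""
    return cycles / instructions


def find_cpi_outliers(samples, baseline, k=2.0):
    """Return tasks whose CPI exceeds mean(baseline) + k * stdev(baseline).

    samples:  mapping of task name -> current CPI
    baseline: list of historical CPI values for the same job
    k:        assumed threshold multiplier (illustrative only)
    """
    threshold = mean(baseline) + k * stdev(baseline)
    return [task for task, value in samples.items() if value > threshold]
```

A task flagged this way is a likely victim of interference; finding the perpetrator is a separate step that correlates the victim's CPI with co-located tasks' CPU usage.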

Network Architecture

Architectural considerations for a new generation of protocols, SIGCOMM CCR '90
- Clark and Tennenhouse's seminal paper on protocol architecture, introducing Application Level Framing (ALF) and Integrated Layer Processing (ILP) as key design principles for structuring protocols.
A Data-Oriented (and Beyond) Network Architecture, SIGCOMM '07
- Proposes DONA, a clean-slate architecture that replaces DNS with flat, self-certifying names and in-network resolution, enabling data-centric rather than host-centric networking.
Networking named content, CoNEXT '09
- Introduces Content-Centric Networking (CCN/NDN), where content is addressed by name rather than location, with in-network caching and request aggregation as first-class primitives.
XIA: Efficient Support for Evolvable Internetworking, NSDI '12
- An expressive internet architecture that supports multiple principal types (hosts, services, content) with fallback paths, enabling incremental deployment of new network abstractions.
Serval: An End-Host Stack for Service-Centric Networking, NSDI '12
- A network stack that introduces a service-level abstraction between transport and application layers, enabling service discovery, migration, and load balancing independently of IP addresses.
Enabling a Permanent Revolution in Internet Architecture, SIGCOMM '19
- Argues for architectural pluralism: designing the Internet to support multiple co-existing architectures rather than a single universal design, with mechanisms for graceful evolution.

Container

Slipstream: Automatic Interprocess Communication Optimization, ATC '15
- Automatically optimizes IPC between containers by detecting communication patterns and replacing socket-based IPC with shared memory when processes are co-located.
Slacker: Fast Distribution with Lazy Docker Containers, FAST '16
- Reduces container startup time by lazily fetching image layers on-demand rather than downloading entire images upfront, exploiting the observation that containers use only a small fraction of their image data.
Improving Docker Registry Design Based on Production Workload Analysis, FAST '18
- Analyzes production Docker registry workloads at IBM, identifying inefficiencies in layer deduplication and proposing optimizations for storage and distribution.
Cntr: Lightweight OS Containers, ATC '18
- Enables slim containers by separating application binaries from debugging/development tools, which can be dynamically attached when needed without bloating the container image.
Iron: Isolating Network-based CPU in Container Environments, NSDI '18
- Addresses the problem of network processing consuming CPU cycles charged to the wrong container, providing accurate accounting and isolation of network-induced CPU usage.
FreeFlow: Software-based Virtual RDMA Networking for Containerized Clouds, NSDI '19
- Provides RDMA networking for containers by virtualizing RDMA in software, enabling container migration and multi-tenancy while preserving near-native RDMA performance.
Slim: OS Kernel Support for a Low-Overhead Container Overlay Network, NSDI '19
- Reduces container overlay network overhead by moving encapsulation and routing logic into the kernel, eliminating user-space proxy overhead while maintaining container network abstractions.
Houdini's Escape: Breaking the Resource Rein of Linux Control Groups, CCS '19
- Identifies vulnerabilities in Linux cgroups that allow containers to escape resource limits, demonstrating attacks that consume unbounded CPU, memory, or I/O despite cgroup restrictions.
Particle: Ephemeral Endpoints for Serverless Networking, SoCC '20
- Addresses the overhead of creating network endpoints for short-lived serverless functions by pre-provisioning connection state and using lightweight endpoint assignment.
Parallelizing packet processing in container overlay networks, EuroSys '21
- Improves container overlay network performance by parallelizing packet processing across multiple cores, addressing the bottleneck of single-threaded encapsulation/decapsulation.
MigrOS: Transparent Live-Migration Support for Containerised RDMA Applications, ATC '21
- Enables live migration of containers using RDMA by transparently checkpointing and restoring RDMA connection state, allowing memory-intensive applications to be migrated without modification.
Starlight: Fast Container Provisioning on the Edge and over the WAN, NSDI '22
- Accelerates container provisioning at edge locations by using delta compression and optimized layer transfer protocols designed for high-latency WAN links.
Transparent GPU Sharing in Container Clouds for Deep Learning Workloads, NSDI '23
- Enables multiple containers to share GPUs transparently by intercepting CUDA calls and implementing fair scheduling and memory isolation without requiring application modifications.
ONCache: A Cache-Based Low-Overhead Container Overlay Network, NSDI '25
- Reduces container overlay network overhead by caching network state and forwarding decisions, minimizing per-packet processing while maintaining the flexibility of overlay networks.
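
Caching forwarding decisions, as summarized for ONCache above, follows a fast-path/slow-path pattern. The sketch below illustrates that pattern in general, not ONCache's actual design. The `ForwardingCache` class and the flow-tuple key are hypothetical.

```python
class ForwardingCache:
    """Cache per-flow forwarding decisions computed by a slow path."""

    def __init__(self, slow_path):
        self._slow_path = slow_path  # callable: flow key -> decision
        self._cache = {}
        self.hits = 0
        self.misses = 0

    def lookup(self, flow):
        """Return the decision for a flow, consulting the slow path on a miss."""
        if flow in self._cache:
            self.hits += 1
            return self._cache[flow]
        self.misses += 1
        decision = self._slow_path(flow)
        self._cache[flow] = decision
        return decision

    def invalidate(self, flow):
        """Drop a cached decision, e.g. after a container moves."""
        self._cache.pop(flow, None)
```

Only the first packet of a flow pays the full overlay processing cost; later packets reuse the cached decision until it is invalidated.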
