unist-n2sl/LatScope

LatScope: End-to-End Latency Decomposition Across the Cloud Network Stack

Intro

  • In large-scale cloud and virtualized systems, improving performance requires understanding not just total latency, but where delays actually occur across the network stack.
  • Latency can arise at multiple layers (socket, TCP, IP, device, and virtual interfaces), and these delays often change with workload behavior, network conditions, and system configuration.
  • Existing tools mainly measure RTT or single-layer metrics, making it difficult to perform cross-layer latency analysis in complex environments.
  • LatScope fills this gap by using eBPF to match packets across layers and compute accurate inter-layer delays, while XDP-based time synchronization enables precise inter-server latency breakdowns.
  • With low-overhead, fine-grained latency decomposition, LatScope helps engineers detect bottlenecks, troubleshoot anomalies, and make informed tuning decisions in real deployments.
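
The inter-layer breakdown described above can be illustrated with a minimal Python sketch. It assumes a packet has already been matched across layers and carries one timestamp per layer. The layer order, the `decompose` helper, and the sample values are hypothetical and are not taken from the LatScope code.

```python
from typing import Dict, List, Tuple

# Hypothetical layer order for a received packet, lowest layer first.
LAYERS: List[str] = ["device", "ip", "tcp", "socket"]


def decompose(timestamps_ns: Dict[str, int]) -> List[Tuple[str, int]]:
    """Return the delay between each pair of adjacent observed layers (ns)."""
    # Keep only the layers at which this packet was actually observed.
    seen = [layer for layer in LAYERS if layer in timestamps_ns]
    delays = []
    for lower, upper in zip(seen, seen[1:]):
        delta = timestamps_ns[upper] - timestamps_ns[lower]
        delays.append((f"{lower}->{upper}", delta))
    return delays


# Illustrative timestamps only, not measured data.
sample = {"device": 1_000, "ip": 1_400, "tcp": 2_100, "socket": 2_600}
breakdown = decompose(sample)
total_ns = sample["socket"] - sample["device"]
```

Because the per-layer delays sum to the end-to-end delay, a single large entry in `breakdown` points at the layer that dominates the total.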

Background

  • This project leverages eBPF and XDP to enable fast, low-overhead network performance monitoring in large-scale systems.

eBPF (extended Berkeley Packet Filter)

  • A kernel technology that allows user-defined programs to run safely in a sandboxed environment inside the kernel, without modifying kernel code.
  • It provides rich visibility into networking, system, and application events while maintaining high performance, making it suitable even for 10-Gbps-class per-core environments.

XDP (eXpress Data Path)

  • A high-performance, eBPF-based packet processing framework that runs at the earliest point in the network stack, enabling packets to be handled or redirected before entering the kernel networking path.
  • It can process traffic at multi-million packets-per-second (Mpps) rates, making it ideal for latency observation and lightweight data collection.

Architecture

(Figure: LatScope architecture diagram)

Requirements

  • Redis
    • Used for communication between servers
  • MySQL
    • Stores the data collected by the Observer
  • paramiko
    • Python package used for SSH connections
  • BCC (BPF Compiler Collection)
    • A toolkit for building eBPF programs and loading them from user space, used to collect kernel- and network-level events

Code Structure

  • Metric_Collector
    • ebpf_program_vm
      • metric_measure_vm
        • ebpf_code.py
        • ebpf_conf.py
        • ebpf_database.py
        • ebpf_main.py
        • ebpf_python.py
      • time_sync
        • ebpf_code.py
        • ebpf_conf.py
        • ebpf_main.py
        • ebpf_python.py
    • time_sync_manage
      • ebpf_code.py
      • ebpf_python.py
    • ebpf_preprocess.py
    • ebpf_terminal.py
    • ebpf_database.py
    • ebpf_conf.py
    • ebpf_analyzer.py
    • ebpf_main.py
    • conf
      • connect_info.yaml
      • function_info.yaml
      • ping_info.yaml
      • sampling_info.yaml
      • database_info.yaml
      • management_server_info.yaml
      • redis_info.yaml

Code File

  • time_sync_manage
    • Used by the management server for time synchronization across servers
    • Builds UDP packets and sends them to the other servers
  • time_sync
    • Used by the Observer for time synchronization
    • Installs the XDP program
  • ebpf_program_vm
    • Observer code
    • Collects network metrics and stores them in the central database
  • ebpf_analyzer
    • Calculates network performance
  • ebpf_preprocess
    • Installs the Observer code on each server
    • Creates the tables in the relational database
  • ebpf_mainprocess
    • Runs the Observer on each server
  • ebpf_main
    • The entry point
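
This README does not spell out how a clock offset is derived from the UDP probes exchanged between the management server and the Observers. As an illustration only, here is a minimal sketch of one common approach, the NTP-style four-timestamp exchange. The function name and timestamps are hypothetical, and LatScope's XDP program may compute the offset differently.

```python
from typing import Tuple


def estimate_offset(t1: int, t2: int, t3: int, t4: int) -> Tuple[float, int]:
    """Estimate the Observer's clock offset relative to the manager.

    t1: manager clock when the probe is sent
    t2: Observer clock when the probe is received
    t3: Observer clock when the reply is sent
    t4: manager clock when the reply is received
    Assumes the forward and return paths have equal delay.
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2
    round_trip = (t4 - t1) - (t3 - t2)  # excludes Observer processing time
    return offset, round_trip


# Illustrative values: Observer clock runs 500 units ahead,
# one-way delay is 100 units, Observer processing takes 20 units.
offset, round_trip = estimate_offset(t1=1000, t2=1600, t3=1620, t4=1220)
```

With an offset estimate like this, timestamps taken on different servers can be shifted onto a common timeline before inter-server delays are computed.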

Configuration File

  • conf
    • connect_info.yaml (server connection info: address, port, virtual machine, etc.)

      variable | meaning | example
      address | address (used by SSH) | 10.1.1.1
      port | port (used by SSH) | 5000
      username | user name (used by SSH) | sonic
      hostname | server name (used by the analyzer) | node1
      metadata_key | unique key per server | 1
      novm | the server is bare metal | metadata_key
      isvm | host of the VM | metadata_key
      eth | interfaces to attach to | interface names
      other_address | other address (used by the server) | other address
      iscontainer | host of the container | metadata_key
    • function_info.yaml (which functions are probed; not yet activated)

    • ping_info.yaml (information for time synchronization)

      variable | meaning | example
      address | address (where to ping) | 10.1.1.1
      port | port (where to ping) | 5000
      eth | interface (used by XDP) | enp1s0
    • sampling_info.yaml (information for sampling rate, address, port)

      variable | meaning | example
      size | sampling rate (payload size) | 72400 (bytes)
      interval | sampling rate (time interval) | 1 (sec)
      ports | sampling port (port of interest) | 5000
      filter_retrans | whether to filter retransmitted packets | 1
    • database_info.yaml (information for the database: address, port, password, etc.)

      variable | meaning | example
      user | database user | xxxx
      passwd | database password | xxxx
      host | database address | xxxx
      db | database name | xxxx
    • management_server_info.yaml (information for the management server: reporting, communication, etc.)

      variable | meaning | example
      address | management server's address | 10.1.1.1
      port | management server's port | 5000
      username | management server's user name | xxxx
      password | management server's password | xxxx
      hostname | management server's hostname | xxxx
      eth | management server's interface (used by XDP) | xxxx
    • redis_info.yaml (information for the KV store: address, port, password, etc.)

      variable | meaning | example
      address | Redis address | 10.1.1.1
      port | Redis port | 5000
    • time_sync.yaml (cycle of time synchronization)

      variable | meaning | example
      time_interval | interval | 64 (s)
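
To make the relationship between `metadata_key`, `isvm`, and `iscontainer` concrete, the following is a minimal Python sketch of how entries shaped like `connect_info.yaml` could be checked and linked to their hosts. The dictionary layout, the `REQUIRED` tuple, and the helper names are assumptions made for illustration; the actual YAML structure and the loader in `ebpf_conf.py` may differ.

```python
from typing import Dict, List, Optional

# Hypothetical entries mirroring the connect_info.yaml fields listed above.
servers: List[dict] = [
    {"address": "10.1.1.1", "port": 5000, "username": "sonic",
     "hostname": "node1", "metadata_key": 1, "novm": 1},
    {"address": "10.1.1.2", "port": 5000, "username": "sonic",
     "hostname": "vm1", "metadata_key": 2, "isvm": 1},
]

# Assumed minimum set of fields needed to reach a server over SSH.
REQUIRED = ("address", "port", "username", "hostname", "metadata_key")


def missing_fields(entry: dict) -> List[str]:
    """List the required fields that an entry does not define."""
    return [key for key in REQUIRED if key not in entry]


def host_of(entry: dict, by_key: Dict[int, dict]) -> Optional[str]:
    """Return the hostname of the machine hosting a VM or container, if any."""
    host_key = entry.get("isvm", entry.get("iscontainer"))
    if host_key is None:
        return None  # bare-metal server: no host to resolve
    return by_key[host_key]["hostname"]


# Index servers by their unique metadata_key for host lookups.
by_key = {entry["metadata_key"]: entry for entry in servers}
vm_host = host_of(servers[1], by_key)
```

Here `vm1` resolves to `node1` because its `isvm` value matches the `metadata_key` of `node1`.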

Environment

  • OS: Ubuntu 22.04.5 LTS
  • Kernel: Linux 5.14.0

Workflow

  1. Run LatScope:
       cd LatScope
       sudo python3 ebpf_main.py
  2. Generate graph data:
       cd LatScope/graph
       sudo python3 ebpf_graph.py
  3. Draw graphs:
       cd LatScope/graph/result
       ./make.sh

Example

  • Result Example
    • Result Example 1 (LAN)
    • Result Example 2 (LTE)
    • Result Example 3 (WIFI)
